regex delete duplicate chars from string.

Get help using Construct 2

Post » Fri Jun 26, 2015 6:01 pm

Does anyone know how to delete duplicate numbers and letters from a string using regex?

For example, if my string looks like this

"112, 113, 112, 114, 113, 112,"

it would be turned into this

"112, 113, 114".

I've been looking into regex but it's over my head.

Thanks.

Post » Fri Jun 26, 2015 6:18 pm

I don't think regex is suited for that. A better way would be to add each number to a dictionary as a key, which eliminates duplicates because a key can only exist once. For example:

global string list1 = "112, 113, 112, 114, 113, 112,"
global string list2 = ""
repeat tokencount(list1, ",") times
--- Dictionary: add key trim(tokenat(list1, loopindex, ",")) with value 0

Dictionary: for each key
--- add Dictionary.CurrentKey & "," to list2
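
For readers more comfortable with script than with event sheets, the same idea can be sketched in TypeScript. This is only an illustration under assumptions: `dedupeTokens` is a hypothetical helper name, not part of Construct 2, and a plain object stands in for the Dictionary. Tokens are trimmed, and empty tokens (such as the one after the trailing comma) are skipped.

```typescript
// Hypothetical helper: removes duplicate separator-delimited tokens,
// keeping the first occurrence of each one.
function dedupeTokens(list: string, separator: string = ","): string {
  // Plain object used as a dictionary; no prototype, so any token is a safe key.
  const seen: { [key: string]: boolean } = Object.create(null);
  const result: string[] = [];

  for (const raw of list.split(separator)) {
    const token = raw.trim();          // drop the space that follows each comma
    if (token.length === 0) continue;  // skip the empty token after a trailing comma
    if (!seen[token]) {
      seen[token] = true;              // remember the token the first time it appears
      result.push(token);
    }
  }
  return result.join(", ");
}

console.log(dedupeTokens("112, 113, 112, 114, 113, 112,")); // 112, 113, 114
```

Unlike iterating over dictionary keys, this sketch keeps the tokens in the order of their first appearance.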

Post » Fri Jun 26, 2015 6:21 pm

OK, I'll try that. Thanks for your help.

Post » Tue Jul 07, 2015 8:58 pm

I just remembered this topic... and... I really do not like the Dictionary... :)

example.capx

Post » Tue Jul 07, 2015 9:57 pm

Since OP asked for a Regex, here's how to do it:
Matching pattern: (?:(\w)(?:\1)*)
Flags: g
Substitution pattern: $1

See the witchcraft in action in this capx.

Note that the (\w) in the matching pattern can be changed to something else to accommodate accented letters and other symbols. Replacing it with (.) will collapse consecutive repeats of any character except the newline character. (You'd have to use ([\s\S]) to include it.)
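
As a rough illustration outside Construct 2, the same pattern can be tried in TypeScript, whose regex engine is the JavaScript one. `collapseRuns` is a hypothetical name chosen for this sketch. It shows that the pattern only collapses runs of the same character that sit next to each other.

```typescript
// Pattern from the post above: capture one word character, then consume
// any immediate repeats of it; the replacement keeps only the captured one.
const consecutiveRepeats = /(?:(\w)(?:\1)*)/g;

// Hypothetical helper wrapping the replace call.
function collapseRuns(text: string): string {
  return text.replace(consecutiveRepeats, "$1");
}

console.log(collapseRuns("aaabbbccd")); // abcd
console.log(collapseRuns("112, 113, 112, 114, 113, 112,")); // 12, 13, 12, 14, 13, 12,
```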

Post » Wed Jul 08, 2015 6:43 am

@Magistross

Your example doesn't work as the OP requires.

Using his string as an example, feeding the following into your .capx - "112, 113, 112, 114, 113, 112," - would produce "12, 13, 12, 14, 13, 12," which is obviously not correct.

Unfortunately, I'm useless at RegEx, so I'm unable to fix it.

Post » Wed Jul 08, 2015 3:31 pm

It seems I misread what he needs: I stopped at the first sentence, whereas what he wants, as explained further down, is to remove duplicate "words" from a string. That's definitely not the same thing, and I fear a single RegexReplace won't quite cut it. What could work is to use a "lookaround" so that a word creates a match only if it doesn't appear again later in the string. Then you can concatenate all the matches in a loop.

Since JavaScript only supports lookahead, this has the drawback of losing the original order in which the words appear: only the last occurrence of each word creates a match. If lookbehind were supported, only the first occurrence would be matched instead... that's too bad if the original order is needed. For the sake of showcasing the power of Regex, here's how to do it:

Matching pattern: (\b\w+\b)(?!.*\b\1\b)
Flags: g

capx
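
A minimal TypeScript sketch of the match-and-concatenate loop described above, assuming the JavaScript regex engine. `uniqueWords` is a hypothetical name, and joining with ", " is a choice made for this example only.

```typescript
// Hypothetical helper: collects every word that does not appear again
// later in the text, then joins the matches into one string.
function uniqueWords(text: string): string {
  // Whole word, followed by a negative lookahead that rejects the match
  // if the same whole word occurs anywhere later on the line.
  const pattern = /(\b\w+\b)(?!.*\b\1\b)/g;
  const words: string[] = [];

  let match: RegExpExecArray | null;
  while ((match = pattern.exec(text)) !== null) {
    words.push(match[1]); // only the last occurrence of each word gets here
  }
  return words.join(", ");
}

console.log(uniqueWords("112, 113, 112, 114, 113, 112,")); // 114, 113, 112
```

The output order follows the last occurrence of each word, which is the drawback mentioned above. Because `.` does not match a newline, the lookahead only checks the rest of the current line.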

Post » Thu Jul 09, 2015 7:50 pm

Thanks everyone.

I used R0j0's method shortly after he suggested it, and it works fine, but I'll certainly look at yours, @korbaach. @Magistross, thanks for taking the time to show how to do it with Regex as originally requested. It's really impressive, and it might be a quicker and neater alternative. The reason I wanted to use Regex is that I'm used to using tokens to search strings, and I could have done it that way if the replace function could target a particular token index instead of replacing all occurrences.

