Regular Expression

Kevin, being able to access captured groups is not a personal need of mine; it is something any serious regexp user needs.
I will test your new version.
Cordially

It still lacks a ReplaceAll that supports groups.

EX1
We want to replace all vowels with @
str="Where have all the flowers gone?"
str=RegexpReplace(str, "[aeiou]", "@")
---> "Wh@r@ h@v@ @ll th@ fl@w@rs g@n@?"

EX2
We want to change the format of a date
str="date: 14:27 28/03/2023"
str=RegexpReplace(str, "(\d{2})/(\d{2})/(\d{4})", "$2.$1.$3")
---> "date: 14:27 03.28.2023"

$0 is full match: "28/03/2023"
$1 is captured group number 1: "28"
$2 is captured group number 2: "03"
$3 is captured group number 3: "2023"
The "/" is not captured; it is replaced by ".".
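
For comparison, here is a minimal sketch of EX1 and EX2 in plain Java. It uses Java's own String.replaceAll, not the extension's RegexpReplace, so it only illustrates the $1/$2/$3 convention; whether the extension follows exactly the same convention is an assumption. The class and method names are made up for this example.

```java
class GroupReplaceDemo {
    // EX1: replace every lowercase vowel with '@'.
    static String maskVowels(String s) {
        return s.replaceAll("[aeiou]", "@");
    }

    // EX2: swap day and month and use '.' as the separator.
    // $1, $2, $3 refer to the three captured groups.
    static String swapDayMonth(String s) {
        return s.replaceAll("(\\d{2})/(\\d{2})/(\\d{4})", "$2.$1.$3");
    }

    public static void main(String[] args) {
        System.out.println(maskVowels("Where have all the flowers gone?"));
        System.out.println(swapDayMonth("date: 14:27 28/03/2023"));
    }
}
```

The time "14:27" is left alone because the pattern requires the two "/" separators.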

Cordially

PROJECT:

In defense of AI2's power, here are block solutions:




EX2_sample


There will always be solutions with blocks, but mastering regular expressions very often makes the code shorter and faster. (translated with Google!!)

I measure "faster" by the time it takes another programmer to read and understand the code.
For that purpose, I push functional decomposition.

To each his own.

The regular expression language is a programming language with its vocabulary and grammar. Once integrated into AI2, users will gradually learn to master the concept, over several years if necessary.
[pcrepattern specification]

Five options OK with KevinkunRegEx-version3

PCRE patterns may contain options, which are enclosed in (? ) sequences. Options can be grouped together: "(?imx)". Options following a hyphen are negated: "(?im-sx)".

Options appearing outside a group affect the remainder of the pattern from that point onwards. Options appearing inside a group affect that group only. Options lose their special meaning inside a character class, where they are treated literally.

(?i) Caseless: matching becomes case-insensitive from that point on. By default, matching is case-sensitive.
regex="red|black" matches "the red king" but not "the Red king"
regex="(?i)red|black" matches "the red king" and "the Red king"
(?m) Multiline: ^ and $ match at newline sequences within the data. By default, multiline is off: ^ matches at the start of the text and $ matches at the end of the text.
regex="^A" does not match "Good\nAfternoon"
regex="(?m)^A" matches "Good\nAfternoon"
(?s) Single-line or DotAll: . matches anything, including a newline sequence. By default, DotAll is off, hence . does not match a newline sequence.
regex="oo.*?oo" does not match "Good\nAfternoon"
regex="(?s)oo.*?oo" matches "Good\nAfternoon"
(?x) eXtended: whitespace outside character classes is ignored, and # starts a comment that runs up to the next newline in the pattern. Spaces between components that carry no meaning make regular expressions much more readable. By default, whitespace matches itself and # is a literal character.
regex="oo .*? oo" does not match "Good Afternoon" because of the spaces in the regex
regex="(?x)oo .*? oo #match" matches "Good Afternoon"
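
The eight examples above can be checked with a small Java sketch. Note the assumption: Java's java.util.regex is not PCRE and is not the extension itself, but it accepts the same four inline options (?i), (?m), (?s) and (?x). The helper name found is made up for this example, and "matches" here means "is found somewhere in the text".

```java
import java.util.regex.Pattern;

class InlineFlagsDemo {
    // True if the regex is found anywhere in the text (a search, not a full match).
    static boolean found(String regex, String text) {
        return Pattern.compile(regex).matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(found("red|black", "the Red king"));              // false
        System.out.println(found("(?i)red|black", "the Red king"));          // true
        System.out.println(found("^A", "Good\nAfternoon"));                  // false
        System.out.println(found("(?m)^A", "Good\nAfternoon"));              // true
        System.out.println(found("oo.*?oo", "Good\nAfternoon"));             // false
        System.out.println(found("(?s)oo.*?oo", "Good\nAfternoon"));         // true
        System.out.println(found("oo .*? oo", "Good Afternoon"));            // false
        System.out.println(found("(?x)oo .*? oo #match", "Good Afternoon")); // true
    }
}
```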

Hey Kevin! New problem.
I'm looking in a long text for words ending in "ing" to capitalize them.
I thought of:
Kevin.ReplaceAll("Parking: it's good, camping is better!", "\b\w*ing\b", upcase("$0"))
DOESN'T WORK!
Probably because upcase("$0") is evaluated before ReplaceAll is called: it upcases the literal text "$0", which contains no letters, so "$0" is passed through unchanged and each match is simply replaced by itself.
In Java there is:
Matcher.replaceAll(Function<MatchResult,String> replacer)
Is it possible to port this to AI2?

Honestly, I am not a regex expert. Maybe try another pattern.

The error does not come from the regex but from the use of a function in the third parameter. This is not possible with ReplaceAll.
Too bad!
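
To illustrate why a function call in the third parameter has no effect, here is a rough Java analogue. It assumes that AI2 blocks, like Java, evaluate an argument before the call is made; String.replaceAll stands in for the extension's ReplaceAll, and the class and method names are made up.

```java
class EagerArgumentDemo {
    static final String REGEX = "\\b\\w*ing\\b";

    // The replacement is upcased BEFORE replaceAll ever sees it.
    // "$0" contains no letters, so the upcased value is still "$0".
    static String upcaseTemplate(String text) {
        String replacement = "$0".toUpperCase();
        return text.replaceAll(REGEX, replacement);
    }

    public static void main(String[] args) {
        String text = "Parking: it's good, camping is better!";
        System.out.println("$0".toUpperCase().equals("$0"));   // true
        System.out.println(upcaseTemplate(text).equals(text)); // true: nothing changed
    }
}
```

Each match is replaced by "$0", that is, by itself, so the text comes back unchanged.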

Is upcase a regex-derived command/function?

No, upcase is the ai2 string (block) function used here to upcase the match.
See replaceAll(Function) in java.

Have you considered using the terminal extension and building your regex commands with the bash / Linux command line?

// Java code illustrating Matcher.replaceAll() with a replacer function

import java.util.regex.*;

public class regexp {
    public static void main(String[] args) {
        // The regex, written in extended mode (?x) so it can carry comments
        String regex =
            "(?x)     # Ignore whitespace \n" +
            "\\b      # Word boundary \n" +
            "\\w*     # Any number of word characters \n" +
            "ing      # Letters i n g \n" +
            "\\b      # Word boundary";

        // Equivalent compact form:
        // String regex = "\\b\\w*ing\\b";

        // Create a pattern from the regex
        Pattern pattern = Pattern.compile(regex);

        // The string to be searched
        String text = "Parking is good; camping, is better.";

        // Create a matcher for the input string
        Matcher matcher = pattern.matcher(text);

        // Replace every match; toUpperCase() is just an example.
        text = matcher.replaceAll(x -> x.group().toUpperCase());
        System.out.println(text);
    }
}

This code WORKS in Java: "PARKING is good; CAMPING, is better."

Will this work in Java 8?

My version: "18.0.2.1" 2022-08-18

To build an extension, we need Java 1.8.

It's a Java 8 expression, but the ant compiler won't recognize it. I think Rush should compile it, though, since it supports desugaring.

desugaring = deprecated?
No, it's a very important function in regex.
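
On the Java 1.8 constraint mentioned above: the lambda syntax is Java 8, but as far as I know the Matcher.replaceAll(Function) overload itself only appeared in Java 9. A possible workaround, sketched here with made-up class and method names, is the older appendReplacement / appendTail loop, which exists in Java 1.8. This is not the extension's code, only an illustration of the technique.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UpcaseMatchesJava8 {
    // Upper-cases every match of regex in text, using only Java 1.8 APIs.
    static String upcaseMatches(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        StringBuffer out = new StringBuffer(); // Java 8's appendReplacement takes a StringBuffer
        while (m.find()) {
            // quoteReplacement stops '$' or '\' in the matched text
            // from being read as group references.
            m.appendReplacement(out, Matcher.quoteReplacement(m.group().toUpperCase()));
        }
        m.appendTail(out); // copy whatever follows the last match
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(upcaseMatches("\\b\\w*ing\\b", "Parking is good; camping, is better."));
    }
}
```

Any other per-match transformation could replace toUpperCase() inside the loop.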