Archive for October, 2022

Regex expressions quick reference

Hi. In this post I am sharing my notes on regular expressions. I recently worked on some code that used them, and this is an attempt to write down what I learned; it may be useful to others.

Some notes:
.* = matches any run of characters (letters, digits, special characters and symbols) up to the end of the line. It is greedy, and it won't match a newline.
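
As a quick check of the note above, here is a small sketch in JavaScript (the regex flavor used throughout these notes). The two-line sample text is made up for illustration, and the s (dotAll) flag is shown only for contrast.

```javascript
// Hypothetical two-line sample, not taken from the post.
const sampleText = "first line; with = and (symbols)\nsecond line";

// .* is greedy, so from the start of the text it runs to the end of the
// first line. It stops there because . does not match a newline by default.
const firstLine = sampleText.match(/.*/)[0];
console.log(firstLine); // first line; with = and (symbols)

// With the s (dotAll) flag, . also matches newlines, so .* spans both lines.
const everything = sampleText.match(/.*/s)[0];
console.log(everything === sampleText); // true
```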

Source string:
x = y + z - func(a, b);
v = func(x, z)

Regex to identify the function name in the above expressions:

(?<=.*=.*)(func)(?=\(.*)

https://regexr.com/

Here, (?<= starts a positive lookbehind: the text before the main match group must match the pattern inside it, but that text is not included in the match. It is placed before the main match group.
Inside the lookbehind, the first .* matches any characters except newline up to the ‘=’ sign. If a line has no ‘=’ sign, nothing on it matches (this is why the third line in the figure is not highlighted).
The second .* matches everything between the ‘=’ sign and ‘func’.
(?= starts a positive lookahead: the text after the main match group must match the pattern inside it, again without being included in the result. It is placed after the main match group.
\( matches a literal ‘(’, where \ is the escape character. Finally, .* matches everything up to the end of the line.
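
The walkthrough above can be tried outside regexr with a short JavaScript sketch. It assumes a JavaScript engine, because a variable-length lookbehind such as (?<=.*=.*) is rejected by some other engines (Python's re module, for example, requires fixed-width lookbehind). The third input line, a bare func(v, d) with no ‘=’ sign, is added to show the non-matching case.

```javascript
// The two source lines from the post, plus a call with no '=' on its line.
const sourceLines = [
  "x = y + z - func(a, b);",
  "v = func(x, z)",
  "func(v, d)",
];

// Lookbehind: an '=' must appear earlier on the same line.
// Lookahead: an opening parenthesis must follow the name.
const funcAfterEquals = /(?<=.*=.*)(func)(?=\(.*)/;

const funcResults = sourceLines.map((line) => {
  const m = funcAfterEquals.exec(line);
  // m[0] is only the name: lookbehind and lookahead text is not consumed.
  return m ? m[0] : null;
});

console.log(funcResults); // [ 'func', 'func', null ]
```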

Modifying the above expression to capture the function name when we don't know it in advance:

(?<=.*=.*)([\w]+)(?=\(.*,.*\))

The above regex is more useful: it can match any function call with at least two arguments (the lookahead requires a comma inside the parentheses) in the format below:

[variable] = [variable] [operator + - *] functioncall(parama, paramb);
For example:
c = a + somefunc(p1,p2);
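
A minimal JavaScript sketch of this pattern follows. The first two inputs come from this post; the third (a one-argument call to a made-up function scale) is a hypothetical line added to show that the comma in the lookahead is required.

```javascript
// Name-agnostic version: capture any word followed by "(...,...)".
const anyTwoArgCall = /(?<=.*=.*)([\w]+)(?=\(.*,.*\))/;

const twoArgSamples = [
  "c = a + somefunc(p1,p2);", // from the post
  "x = y + z - func(a, b);",  // from the post
  "total = 2 * scale(w);",    // hypothetical: one argument, so no comma
];

const twoArgNames = twoArgSamples.map((line) => {
  const m = anyTwoArgCall.exec(line);
  return m ? m[1] : null; // group 1 holds the function name
});

console.log(twoArgNames); // [ 'somefunc', 'func', null ]
```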

To be more generic and match a function call with any number of arguments:

(?<=.*=.*)([\w]+)(?=\(.*\))
But it won't match a call whose return value is not assigned, because the lookbehind needs an ‘=’ sign before the function name; see the last call, func(v, d), in the image.
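
To see both the wider coverage and the limitation, here is a small JavaScript sketch. The function names count and area are hypothetical, added only to cover the zero-argument and three-argument cases.

```javascript
// Any number of arguments: the lookahead only needs "(" ... ")".
const anyAssignedCall = /(?<=.*=.*)([\w]+)(?=\(.*\))/;

const assignedSamples = [
  "v = func(x, z)",         // from the post
  "n = a - count();",       // hypothetical: no arguments
  "r = a * area(w, h, d);", // hypothetical: three arguments
  "func(v, d)",             // from the post: no '=' before the call
];

const assignedNames = assignedSamples.map((line) => {
  const m = anyAssignedCall.exec(line);
  return m ? m[1] : null;
});

console.log(assignedNames); // [ 'func', 'count', 'area', null ]
```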

Let’s make our regex very strict, so that it only allows the format:

c = a + function(x,y)

(?<=[\w\s\_]+=[\w\s\_]+\+[\w\s\_]+)([\w]+)(?=\(.*\))
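[\w\s_]+ = this matches one or more word characters (letters, digits, underscore) or whitespace; no other special characters are allowed. Since \w already includes the underscore, the explicit \_ in the regex above is redundant.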

You can see in the image that it now matches only one function call.
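
The strict pattern can be checked against the lines used earlier in this post with a short JavaScript sketch. The regex is copied as written, without the u flag, since the \_ escape is only tolerated in non-Unicode mode.

```javascript
// Strict version: only words/whitespace, then '=', then words/whitespace,
// then '+', then words/whitespace may precede the function name.
const strictPlusCall = /(?<=[\w\s\_]+=[\w\s\_]+\+[\w\s\_]+)([\w]+)(?=\(.*\))/;

const strictSamples = [
  "c = a + function(x,y)",   // the allowed format
  "x = y + z - func(a, b);", // a '-' sits between the '+' and the call
  "v = func(x, z)",          // no '+' at all
  "func(v, d)",              // no '=' at all
];

const strictNames = strictSamples.map((line) => {
  const m = strictPlusCall.exec(line);
  return m ? m[1] : null;
});

console.log(strictNames); // [ 'function', null, null, null ]
```

Note that \s also matches a newline, so on multi-line input this lookbehind is not limited to a single line the way the .* versions are.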