Hi, in this post I am sharing my notes about regular expressions. I recently worked on some code that uses them, and this is an attempt to write down what I learned. It may be useful for others.
Some notes:
.* = matches all characters up to the end of the line, including special characters and symbols, but it won't match a newline.
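A quick way to check this behaviour is the small TypeScript sketch below. The sample text and variable names are made up for illustration, and it assumes a JavaScript engine, where "." skips line terminators unless the "s" (dotAll) flag is set.

```typescript
// Hypothetical two-line sample text, made up for this illustration.
const sampleText: string = "first line; with $ymbols!\nsecond line";

// Without the "s" flag, "." does not match "\n",
// so ".*" stops at the end of the first line.
const firstLineOnly: string = (sampleText.match(new RegExp(".*")) ?? [""])[0];

// With the "s" flag, "." also matches "\n" and ".*" runs to the end of the text.
const wholeText: string = (sampleText.match(new RegExp(".*", "s")) ?? [""])[0];

console.log(firstLineOnly);            // first line; with $ymbols!
console.log(wholeText === sampleText); // true
```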
Source string:
x = y + z - func(a, b);
v = func(x, z)
Regex to identify the function name in the above expression:
(?<=.*=.*)(func)(?=\(.*)
Here, (?<= starts a positive lookbehind: the pattern inside it must match the text just before the main match group, but that text is not included in the match. It is used before the main match group.
Inside it, the first .* matches any characters except newline up to the '=' sign (see what happens if there is no '=' sign: the third line in the figure below is not highlighted).
The second .* then matches everything between '=' and 'func'.
(?= starts a positive lookahead, the counterpart of the lookbehind: the pattern inside it must match the text right after the main match group, again without including it in the result. It is used after the main match group.
\( matches a literal '(', where \ is the escape character. Finally, .* matches everything up to the newline.
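To try the lookbehind and lookahead out, here is a minimal TypeScript sketch. It assumes an engine that accepts .* inside a lookbehind (modern JavaScript and .NET do; Python's built-in re module, for example, does not). The third input line and the variable names are my own assumptions, chosen to show a line without an '=' sign.

```typescript
// Sample input: the two lines from the post plus a third line without "=",
// assumed here to look like the func(v, d) call mentioned later in the post.
const inputLines: string[] = [
  "x = y + z - func(a, b);",
  "v = func(x, z)",
  "func(v, d)",
];

// Same pattern as in the post; String.raw keeps the backslashes as written.
const funcCall: RegExp = new RegExp(String.raw`(?<=.*=.*)(func)(?=\(.*)`, "g");

// With the "g" flag, match() returns every full match, or null if there is none.
// The lookarounds consume nothing, so each full match is just "func".
const found: string[][] = inputLines.map((line) => line.match(funcCall) ?? []);

inputLines.forEach((line, i) => console.log(`${line}  ->  ${JSON.stringify(found[i])}`));
// x = y + z - func(a, b);  ->  ["func"]
// v = func(x, z)  ->  ["func"]
// func(v, d)  ->  []
```

The last line produces no match because the lookbehind cannot find an '=' before 'func' on that line.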
Modifying the above expression to capture the function name when we don't know it in advance:
(?<=.*=.*)([\w]+)(?=\(.*,.*\))
The above regex is more useful; it can match any function call in the format below:
[variable] = [variable] [operator + - *] functioncall(parama, paramb);
For example:
c = a + somefunc(p1,p2);
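As a rough check of the pattern above, the TypeScript sketch below runs it against the example line and two other lines. The helper name findCalls and the last sample line are hypothetical, added only to show that a call without a comma is skipped.

```typescript
// Pattern from the post: any word before "(", with a comma between the parentheses.
const twoArgCall: RegExp = new RegExp(String.raw`(?<=.*=.*)([\w]+)(?=\(.*,.*\))`, "g");

// Hypothetical helper: returns the function names found in one line.
function findCalls(line: string): string[] {
  return line.match(twoArgCall) ?? [];
}

console.log(findCalls("c = a + somefunc(p1,p2);")); // finds "somefunc"
console.log(findCalls("v = func(x, z)"));           // finds "func"
console.log(findCalls("r = single(p1);"));          // finds nothing: no comma in the call
```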
If you want to be more generic and match a function call with any number of arguments:
(?<=.*=.*)([\w]+)(?=\(.*\))
But it won't work for a function call whose return value is not assigned (there is no '=' on the line); see the last call, func(v, d), in the image.
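The generic pattern and its limitation can be sketched in TypeScript as follows. Apart from the somefunc example and the func(v, d) call from the post, the sample lines and names are invented for illustration.

```typescript
// Generic pattern from the post: any word followed by "(...)", with "=" earlier on the line.
const anyArgsCall: RegExp = new RegExp(String.raw`(?<=.*=.*)([\w]+)(?=\(.*\))`, "g");

const callSamples: string[] = [
  "c = a + somefunc(p1,p2);", // two arguments
  "t = one(a)",               // one argument (hypothetical line)
  "n = now()",                // no arguments (hypothetical line)
  "func(v, d)",               // result not assigned, so no "=" before the call
];

// One list of matched function names per sample line.
const callNames: string[][] = callSamples.map((line) => line.match(anyArgsCall) ?? []);

console.log(JSON.stringify(callNames));
// [["somefunc"],["one"],["now"],[]]
```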
Let's make our regex very strict, so that it only allows the format:
c = a + function(x,y)
(?<=[\w\s\_]+=[\w\s\_]+\+[\w\s\_]+)([\w]+)(?=\(.*\))
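[\w\s_]+
= this matches a run of one or more word characters (letters, digits, underscore) and whitespace; no other special characters are allowed. Since \w already includes the underscore, the explicit _ is redundant but harmless.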
You can see in the image that it now matches only one function call.
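The same result can be reproduced with a short TypeScript sketch. The list of sample lines is an assumption on my part, put together from the calls that appear earlier in the post, and the variable names are hypothetical.

```typescript
// Strict pattern from the post, copied as written.
const strictCall: RegExp = new RegExp(
  String.raw`(?<=[\w\s\_]+=[\w\s\_]+\+[\w\s\_]+)([\w]+)(?=\(.*\))`,
  "g",
);

// Assumed sample lines; only the first one has the shape "c = a + function(x,y)".
const strictSamples: string[] = [
  "c = a + function(x,y)",
  "x = y + z - func(a, b);", // the "-" is not allowed by [\w\s\_]+
  "v = func(x, z)",          // no "+" between "=" and the call
  "func(v, d)",              // no "=" at all
];

const strictNames: string[][] = strictSamples.map((line) => line.match(strictCall) ?? []);

console.log(JSON.stringify(strictNames));
// [["function"],[],[],[]]
```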