python-re – Tarik Billa

re.findall not returning full match?

August 19, 2023 by Tarik

The problem is that when the regex passed to re.findall contains capturing groups (i.e. the portions of the regex that are enclosed in parentheses), it is the groups that are returned rather than the full matched string. One way to solve this issue is to use non-capturing groups (prefixed with ?:). >>> …
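The excerpt's own example is cut off, so here is a minimal sketch of the behaviour it describes. The sample text and patterns are invented for illustration and are not taken from the original answer.

```python
import re

text = "foo.bar baz.qux"  # hypothetical sample input

# One capturing group: findall returns only what the group captured.
captured = re.findall(r"\w+\.(\w+)", text)

# Non-capturing group (?:...): findall returns the whole match.
full = re.findall(r"\w+\.(?:\w+)", text)

# finditer yields match objects, so group(0) is always the full match,
# even when the pattern contains capturing groups.
via_iter = [m.group(0) for m in re.finditer(r"\w+\.(\w+)", text)]

print(captured)  # ['bar', 'qux']
print(full)      # ['foo.bar', 'baz.qux']
print(via_iter)  # ['foo.bar', 'baz.qux']
```

If the groups are still needed elsewhere, re.finditer is probably the less intrusive option, since the pattern itself does not have to change.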

The result list contains single spaces when splitting a string with re.split("( )+") – is there a better way?

June 18, 2023 by Tarik

By wrapping the space in parentheses, you create a capturing group, and re.split includes the captured text in its result. If you simply remove the parentheses, you will not have this problem.

>>> str1 = "a b c d"
>>> re.split(" +", str1)
['a', 'b', 'c', 'd']

However, there is no need for a regex here: str.split without any delimiter specified will split this by whitespace for you. This would …
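To see the three variants side by side, here is a small sketch. The input string with runs of several spaces is an assumption, chosen so the difference is visible.

```python
import re

str1 = "a  b   c d"  # hypothetical input with runs of spaces

# Capturing group: re.split keeps the captured text in the result.
# With ( )+ the group holds the last space of each run.
with_group = re.split(r"( )+", str1)

# No group: the separators are discarded.
without_group = re.split(r" +", str1)

# str.split() with no argument splits on any run of whitespace.
plain = str1.split()

print(with_group)     # ['a', ' ', 'b', ' ', 'c', ' ', 'd']
print(without_group)  # ['a', 'b', 'c', 'd']
print(plain)          # ['a', 'b', 'c', 'd']
```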

How to match a newline character in a raw string?

April 25, 2023 by Tarik

In a regular expression, you need to specify that you're in multiline mode:

>>> import re
>>> s = """cat
... dog"""
>>>
>>> re.match(r'cat\ndog',s,re.M)
<_sre.SRE_Match object at 0xcb7c8>

Notice that re translates the \n (raw string) into newline. As you indicated in your comments, you don't actually need re.M for it to match, but …
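As a rough illustration of the point about re.M, the sketch below matches the same two-line string without any flag, then shows what the flag does change. The ^\w+$ pattern is an added example, not part of the original answer.

```python
import re

s = "cat\ndog"  # the same two-line string as above

# In a raw string, r"\n" is a backslash followed by n. The regex engine
# itself interprets that escape as a newline, so no flag is needed.
m = re.match(r"cat\ndog", s)

# re.M only changes ^ and $ so that they also match at line boundaries.
without_m = re.findall(r"^\w+$", s)
with_m = re.findall(r"^\w+$", s, re.M)

print(m is not None)  # True
print(without_m)      # []
print(with_m)         # ['cat', 'dog']
```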

Using more than one flag in python re.findall

April 3, 2023 by Tarik

Yes, but you have to OR them together:

x = re.findall(pattern=r'CAT.+?END', string='Cat \n eND', flags=re.I | re.DOTALL)
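For comparison, here is a short sketch of the same call next to two variants: the inline-flag spelling (?is), and re.I on its own. The variable names are illustrative.

```python
import re

text = "Cat \n eND"

# Flags are bit flags, so | combines them into one value.
combined = re.findall(r"CAT.+?END", text, flags=re.I | re.DOTALL)

# Equivalent inline form: (?is) at the very start of the pattern.
inline = re.findall(r"(?is)CAT.+?END", text)

# With re.I alone, "." cannot cross the newline, so nothing matches.
only_i = re.findall(r"CAT.+?END", text, flags=re.I)

print(combined)  # ['Cat \n eND']
print(inline)    # ['Cat \n eND']
print(only_i)    # []
```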

re.sub replace with matched content

February 8, 2023 by Tarik

Simply use \1 instead of $1:

In [1]: import re
In [2]: method = 'images/:id/huge'
In [3]: re.sub(r'(:[a-z]+)', r'<span>\1</span>', method)
Out[3]: 'images/<span>:id</span>/huge'

Also note the use of raw strings (r'…') for regular expressions. It is not mandatory but removes the need to escape backslashes, arguably making the code slightly more readable.
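One related detail, sketched here as an aside: the replacement string also accepts \g<1>, which helps when the group reference is followed directly by a digit. The second substitution is a made-up case to show the difference.

```python
import re

method = "images/:id/huge"

# \g<1> is an unambiguous spelling of \1.
wrapped = re.sub(r"(:[a-z]+)", r"<span>\g<1></span>", method)

# \g<1>0 means "group 1, then a literal 0". Writing r"\10" instead
# would be read as a reference to group 10.
suffixed = re.sub(r"(:[a-z]+)", r"\g<1>0", method)

print(wrapped)   # images/<span>:id</span>/huge
print(suffixed)  # images/:id0/huge
```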