Can you create a Python list from a string, while keeping characters in specific keywords together?

Use re.findall. List your keywords first in the alternation, then add a single-character fallback.

>>> import re
>>> s = "xyzcarbusabccar"
>>> re.findall('car|bus|[a-z]', s)
['x', 'y', 'z', 'car', 'bus', 'a', 'b', 'c', 'car']

If your keywords overlap, note that this solution matches whichever keyword it encounters first in the string. Here 'car' matches and consumes the 'r', so 'rat' is never found:

>>> s="abcaratab"
>>> re.findall('car|rat|[a-z]', s)
['a', 'b', 'car', 'a', 't', 'a', 'b']

You can make the solution more general by replacing the [a-z] part with whatever you like: \w, for example, or a simple . to match any character (note that . does not match a newline unless you pass re.DOTALL).
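If the keywords come from a list rather than being typed into the pattern by hand, the same idea can be sketched as a small function. This is only an illustration: the name split_keeping_keywords is made up, and sorting the keywords longest-first and escaping them with re.escape are my additions, not part of the answer above.

```python
import re

def split_keeping_keywords(text, keywords):
    # Hypothetical helper, not part of the original answer.
    # Drop empty keywords, then try longer ones first so that
    # e.g. 'carpet' is attempted before its prefix 'car'.
    ordered = sorted((k for k in keywords if k), key=len, reverse=True)
    # re.escape protects keywords that contain regex metacharacters.
    alternatives = [re.escape(k) for k in ordered]
    # '.' is the single-character fallback; re.DOTALL lets it match newlines too.
    pattern = '|'.join(alternatives + ['.'])
    return re.findall(pattern, text, flags=re.DOTALL)

print(split_keeping_keywords("xyzcarbusabccar", ["car", "bus"]))
```

The fallback still has to come last in the alternation, for the reason explained below.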

A short explanation of why this works, and why the regex '[a-z]|car|bus' would not:
The regular expression engine tries the alternating options from left to right and is “eager” to return a match. That means it considers the whole alternation to match as soon as one of the options has been fully matched. At this point, it will not try any of the remaining options but stop processing and report a match immediately. With '[a-z]|car|bus', the engine will report a match when it sees any character in the character class [a-z] and never go on to check if ‘car’ or ‘bus’ could also be matched.
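To see the effect of the ordering yourself, you can run both patterns on the same string. The variable names here are only for illustration:

```python
import re

s = "xyzcarbusabccar"

# Keywords first: 'car' and 'bus' are tried before the single-letter fallback.
keywords_first = re.findall('car|bus|[a-z]', s)

# Fallback first: [a-z] succeeds on every letter, so 'car' and 'bus' are never tried.
fallback_first = re.findall('[a-z]|car|bus', s)

print(keywords_first)
print(fallback_first)
```

The first call keeps 'car' and 'bus' together, while the second returns every letter of s as its own element.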
