How to delete the words between two delimiters?

Use regular expressions:

>>> import re
>>> s="<@ """@$ FSDF >something something <more noise>"
>>> re.sub('<[^>]+>', '', s)
'something something '

[Update]

If you tried a pattern like <.+>, where the dot means any character and the plus sign means one or more, you know it does not work.

>>> re.sub(r'<.+>', s, '')
''

Why!?! It happens because regular expressions are “greedy” by default. The expression will match anything until the end of the string, including the > – and this is not what we want. We want to match < and stop on the next >, so we use the [^x] pattern which means “any character but x” (x being >).

The ? operator turns the match “non-greedy”, so this has the same effect:

>>> re.sub(r'<.+?>', '', s)
'something something '

The previous is more explicit, this one is less typing; be aware that x? means zero or one occurrence of x.

Leave a Comment

Hata!: SQLSTATE[HY000] [1045] Access denied for user 'divattrend_liink'@'localhost' (using password: YES)