lookbehind – Tarik Billa

Is there a bug in Ruby lookbehind assertions (1.9/2.0)?

December 7, 2023 by Tarik

This has been officially classified as a bug and subsequently fixed, together with another problem concerning \Z anchors in multiline strings.

How to match the first word after an expression with regex?

September 19, 2023 by Tarik

This sounds like a job for lookbehinds, though you should be aware that not all regex flavors support them. In your example: (?<=\bipsum\s)(\w+) This will match any sequence of word characters (letters, digits, and underscores) that follows “ipsum” as a whole word followed by a single whitespace character. It does not match “ipsum” itself, so you don’t need to worry about reinserting … Read more
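As a minimal sketch of the pattern above, here is how it behaves in Python's re module. The sample sentence and the variable names are assumptions made for illustration, not part of the original answer.

```python
import re

# Sample text is an assumption chosen to mirror the "ipsum" example.
text = "Lorem ipsum dolor sit amet"

# (?<=\bipsum\s) asserts that the whole word "ipsum" plus one whitespace
# character sits immediately before the match, without being part of it.
pattern = re.compile(r"(?<=\bipsum\s)(\w+)")

match = pattern.search(text)
first_word_after = match.group(1) if match else None
print(first_word_after)  # dolor
```

Because the lookbehind is zero-width, a substitution with this pattern would replace only the word after “ipsum” and leave “ipsum” in place.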

Regular Expression Lookbehind doesn’t work with quantifiers (‘+’ or ‘*’)

July 13, 2023 by Tarik

Many regular expression libraries allow only restricted expressions in lookbehind assertions, for example:
- only strings of the same fixed length: (?<=foo|bar|\s,\s) (three characters each)
- only strings of fixed lengths: (?<=foobar|\r\n) (each branch with its own fixed length)
- only strings with an upper bound on their length: (?<=\s{,4}) (up to four repetitions)
The … Read more
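To see which of these categories a given library falls into, one can simply try to compile each example. The sketch below does this for Python's re module, which as far as I know belongs to the strictest group; the helper name compiles and the trailing x in each pattern are assumptions added for illustration.

```python
import re

def compiles(pattern: str) -> bool:
    """Hypothetical helper: report whether re accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

# Every branch is exactly three characters wide.
same_length = compiles(r"(?<=foo|bar|\s,\s)x")

# Branches are fixed-width individually, but six vs. two characters.
mixed_length = compiles(r"(?<=foobar|\r\n)x")

# Bounded but variable width: zero to four whitespace characters.
bounded_length = compiles(r"(?<=\s{,4})x")

print(same_length, mixed_length, bounded_length)  # True False False
```

Other engines may accept the second or third form, so the result is specific to the library being tested.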

What’s the technical reason for “lookbehind assertion MUST be fixed length” in regex?

July 9, 2023 by Tarik

Lookahead and lookbehind aren’t nearly as similar as their names imply. The lookahead expression works exactly the same as it would if it were a standalone regex, except it’s anchored at the current match position and it doesn’t consume what it matches. Lookbehind is a whole different story. Starting at the current match position, it … Read more
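The lookahead half of this description can be illustrated with a short sketch in Python's re module; the sample string and variable names are assumptions. The lookahead holds an ordinary variable-length subpattern, is tried at the current position, and leaves the match end where it was.

```python
import re

# "foo" must be followed by one or more digits, but the digits are only
# inspected by the lookahead and are not part of the match.
m = re.match(r"foo(?=\d+)", "foo123")

matched_text = m.group(0)   # only "foo" was consumed
end_position = m.end()      # the match ends right after "foo"

# The remaining input is still available to whatever comes next.
rest = "foo123"[end_position:]

print(matched_text, end_position, rest)  # foo 3 123
```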

Does lookbehind work in sed?

May 21, 2023 by Tarik

GNU sed does not have support for lookaround assertions. You could use a more powerful language such as Perl or possibly experiment with ssed which supports Perl-style regular expressions. perl -pe 's/(?<=foo)bar/test/g' file.txt
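For comparison, here is a hedged Python sketch of the same substitution, once with a lookbehind and once with a capture group that is written back into the replacement. The capture-group form is a different technique from the one in the answer; it is shown because it needs no lookaround, so the same idea carries over to sed's own group and backreference syntax. The sample line is an assumption.

```python
import re

# Sample input is an assumption; the middle "bar" has no "foo" before it.
line = "foobar bar foobar"

# Lookbehind: replace "bar" only where "foo" comes directly before it.
with_lookbehind = re.sub(r"(?<=foo)bar", "test", line)

# No lookaround: match "foo" as well, then put it back via group 1.
with_group = re.sub(r"(foo)bar", r"\g<1>test", line)

print(with_lookbehind)  # footest bar footest
print(with_group)       # footest bar footest
```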

Python Regex Engine – “look-behind requires fixed-width pattern” Error

April 16, 2023 by Tarik

Lookbehinds in Python’s re module must be fixed-width. When the alternatives in a lookbehind pattern have different lengths, there are several ways to handle the situation: Rewrite the pattern so that you do not have to use alternation (e.g. Tim’s above answer using a word boundary, or you might also use … Read more
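One commonly used workaround, which may or may not be among those the full answer lists, is to give each alternative its own fixed-width lookbehind and alternate between the lookbehinds instead. The sketch below assumes a made-up titles example; all names and sample text are illustrative.

```python
import re

text = "Mr. Smith and Mrs. Jones"

# A single lookbehind with branches of width 4 and 5 is rejected by re.
try:
    re.compile(r"(?<=Mr\. |Mrs\. )\w+")
    single_lookbehind_ok = True
except re.error:
    single_lookbehind_ok = False

# Each lookbehind below is fixed-width on its own, so re accepts the
# alternation between them.
split_pattern = re.compile(r"(?:(?<=Mr\. )|(?<=Mrs\. ))\w+")
names = split_pattern.findall(text)

print(single_lookbehind_ok)  # False
print(names)                 # ['Smith', 'Jones']
```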

Does lookaround affect which languages can be matched by regular expressions?

February 5, 2023 by Tarik

The answer to the question you ask, which is whether a larger class of languages than the regular languages can be recognised with regular expressions augmented by lookaround, is no. A proof is relatively straightforward, but an algorithm to translate a regular expression containing lookarounds into one without is messy. First: note that you can … Read more