Regular expression to select all whitespace that isn’t in quotes?

Here’s a single regex-replace that works:

\s+(?=([^"]*"[^"]*")*[^"]*$)

which will replace:

(this is a test "sentence for the regex" foo bar)

with:

(thisisatest"sentence for the regex"foobar)
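
The lookahead succeeds only when an even number of quotes follows the whitespace, which (assuming the quotes are balanced) is exactly the case when the whitespace lies outside a quoted segment. A minimal Java sketch of this replacement; the class and method names are hypothetical, chosen only for illustration:

```java
final class SimpleQuoteRegexDemo {

    // Whitespace followed by an even number of quotes up to the end of the input,
    // i.e. whitespace that is not inside a "..." segment (no escape handling).
    private static final String OUTSIDE_QUOTES = "\\s+(?=([^\"]*\"[^\"]*\")*[^\"]*$)";

    // Hypothetical helper: removes every whitespace run found outside quotes.
    static String stripOutsideQuotes(String text) {
        return text.replaceAll(OUTSIDE_QUOTES, "");
    }

    public static void main(String[] args) {
        String text = "(this is a test \"sentence for the regex\" foo bar)";
        System.out.println(stripOutsideQuotes(text));
        // prints: (thisisatest"sentence for the regex"foobar)
    }
}
```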

Note that if the quotes can be escaped, this even more verbose regex will do the trick:

\s+(?=((\\[\\"]|[^\\"])*"(\\[\\"]|[^\\"])*")*(\\[\\"]|[^\\"])*$)

which replaces the input:

(this is a test "sentence \"for the regex" foo bar)

with:

(thisisatest"sentence \"for the regex"foobar)

(Note that it also works with escaped backslashes: (this is a test "sentence \\\"for the regex" foo bar) becomes (thisisatest"sentence \\\"for the regex"foobar).)

Needless to say (?), this really shouldn’t be used to perform such a task: it makes one’s eyes bleed, and it runs in quadratic time, because the lookahead rescans the rest of the input for every whitespace match, while a simple linear solution exists.
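
One possible shape for such a linear solution is a single left-to-right scan that tracks whether the current position is inside quotes. This is only a sketch under a few assumptions: the class and method names are hypothetical, a backslash is taken to escape whatever character follows it, and `Character.isWhitespace` is used, which covers somewhat more characters than the regex `\s`. Its behaviour may also differ from the regex on unbalanced quotes.

```java
final class QuoteAwareStripper {

    // Removes whitespace outside double-quoted segments in one pass over the input.
    static String stripUnquotedWhitespace(String text) {
        StringBuilder out = new StringBuilder(text.length());
        boolean inQuotes = false; // true while between an opening and a closing quote
        boolean escaped = false;  // true when the previous character was an unescaped backslash
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (escaped) {
                escaped = false;      // escaped character: copy it verbatim
                out.append(c);
            } else if (c == '\\') {
                escaped = true;       // the next character is escaped
                out.append(c);
            } else if (c == '"') {
                inQuotes = !inQuotes; // unescaped quote toggles the state
                out.append(c);
            } else if (inQuotes || !Character.isWhitespace(c)) {
                out.append(c);
            }
            // otherwise: whitespace outside quotes, which is dropped
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String text = "(this is a test \"sentence \\\"for the regex\" foo bar)";
        System.out.println(stripUnquotedWhitespace(text));
        // prints: (thisisatest"sentence \"for the regex"foobar)
    }
}
```

Each character is examined once, so the running time grows linearly with the input length.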

EDIT

A quick demo:

String text = "(this is a test \"sentence \\\"for the regex\" foo bar)";
String regex = "\\s+(?=((\\\\[\\\\\"]|[^\\\\\"])*\"(\\\\[\\\\\"]|[^\\\\\"])*\")*(\\\\[\\\\\"]|[^\\\\\"])*$)";
System.out.println(text.replaceAll(regex, ""));

// output: (thisisatest"sentence \"for the regex"foobar)
