For reasonably modern versions of sed, edit the standard input to yield the standard output with
$ echo 'τέχνη βιβλίο γη κήπος' | sed -E -e 's/[[:blank:]]+/\n/g'
τέχνη
βιβλίο
γη
κήπος
If your vocabulary words are in files named lesson1 and lesson2, redirect sed’s standard output to the file all-vocab with
sed -E -e 's/[[:blank:]]+/\n/g' lesson1 lesson2 > all-vocab
What it means:
- The character class [[:blank:]] matches either a single space character or a single tab character.
  - Use [[:space:]] instead to match any single whitespace character (commonly space, tab, newline, carriage return, form-feed, and vertical tab).
- The + quantifier means match one or more of the previous pattern.
- So [[:blank:]]+ is a sequence of one or more characters that are all space or tab.
- The \n in the replacement is the newline that you want.
- The /g modifier on the end means perform the substitution as many times as possible rather than just once.
- The -E option tells sed to use POSIX extended regex syntax and, in particular for this case, the + quantifier. Without -E, your sed command becomes sed -e 's/[[:blank:]]\+/\n/g'. (Note the use of \+ rather than simple +.)
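To watch the pieces interact, here is a small sketch that runs the same substitution with and without the g flag on input containing mixed runs of spaces and tabs. It assumes a sed that accepts -E and understands \n in the replacement (GNU sed does); the function names split_words and split_first are made up for this illustration.

```shell
# Hypothetical helpers; assume a sed that supports -E and \n in the replacement.
split_words() {
  sed -E -e 's/[[:blank:]]+/\n/g'
}

# Same substitution without the g flag: only the first run of blanks changes.
split_first() {
  sed -E -e 's/[[:blank:]]+/\n/'
}

# The input has a two-space run and a tab-plus-space run between the words.
printf 'one  two\t three\n' | split_words
printf 'one  two\t three\n' | split_first
```

The first pipeline puts each word on its own line, because every run of blanks collapses into a single newline. The second breaks the line only after "one" and leaves the tab and space before "three" untouched.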
Perl Compatible Regexes
For those familiar with Perl-compatible regexes, a sed that supports the \s shorthand (GNU sed, for example) lets you use \s+ to match runs of at least one whitespace character, as in
sed -E -e 's/\s+/\n/g' old > new
or
sed -e 's/\s\+/\n/g' old > new
These commands read input from the file old and write the result to a file named new in the current directory.
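If you are not sure whether your sed understands \s, one possible approach is to probe for it and fall back to [[:space:]] otherwise. This is only a sketch: the function name sed_supports_backslash_s and the variable ws are invented here, and the final command still assumes that -E and \n in the replacement are available.

```shell
# Hypothetical probe: succeeds only if this sed treats \s as "whitespace".
sed_supports_backslash_s() {
  [ "$(printf 'a b\n' | sed -e 's/\s/_/' 2>/dev/null)" = 'a_b' ]
}

# Prefer the shorthand when it works, else use the POSIX character class.
if sed_supports_backslash_s; then
  ws='\s'
else
  ws='[[:space:]]'
fi

# Inside double quotes, \\n reaches sed as \n.
printf 'one two\tthree\n' | sed -E -e "s/${ws}+/\\n/g"
```

Either branch should print one, two, and three on separate lines.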
Maximum portability, maximum cruftiness
To support almost any version of sed going back to Version 7 Unix, the command invocation becomes a bit more baroque.
$ echo 'τέχνη βιβλίο γη κήπος' | sed -e 's/[ \t][ \t]*/\
/g'
τέχνη
βιβλίο
γη
κήπος
Notes:
- Here we do not even assume the existence of the humble + quantifier and simulate it with a single space-or-tab ([ \t]) followed by zero or more of them ([ \t]*).
  - Caveat: \t inside a bracket expression is itself an extension. A strictly POSIX sed reads [ \t] as space, backslash, or the letter t, so on such a sed type a literal tab character between the brackets instead.
- Similarly, assuming sed does not understand \n for newline, we have to include it on the command line verbatim.
  - The \ at the end of the first line of the command is a continuation marker that escapes the immediately following newline, and the remainder of the command is on the next line.
    - Note: There must be no whitespace between the backslash and the newline. That is, the end of the first line must be exactly backslash followed by end-of-line.
  - This error-prone process helps one appreciate why the world moved to visible characters, and you will want to exercise some care in trying out the command with copy-and-paste.
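Not every sed understands \t inside brackets or \n in the replacement, and typing the invisible characters by hand is fragile. One way around both problems is to let printf build them and splice them into the script. This is a sketch, not the only way to do it; the names tab, nl, and split_portable are invented for this example.

```shell
# Build the invisible characters once, by name.
tab=$(printf '\t')
# Command substitution strips trailing newlines, so append a sentinel
# character and trim it off again to keep the newline.
nl=$(printf '\nx')
nl=${nl%x}

# Hypothetical helper. Inside double quotes, \\ becomes one backslash, so sed
# sees: s/[ <tab>][ <tab>]*/\<newline>/g
split_portable() {
  sed -e "s/[ ${tab}][ ${tab}]*/\\${nl}/g"
}

printf 'cat  hat\tmat\n' | split_portable
```

Because the brackets now hold a real tab, the letter t in the input words is left alone, and the backslash is guaranteed to sit directly before the newline.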
Note on backslashes and quoting
The commands above all used single quotes ('') rather than double quotes (""). Consider:
$ echo '\\\\' "\\\\"
\\\\ \\
That is, the shell applies different escaping rules to single-quoted strings as compared with double-quoted strings. You typically want to protect all the backslashes common in regexes with single quotes.
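The same effect can be checked on an actual sed script before running it. The sketch below uses printf '%s\n', which prints its argument exactly as the shell delivered it, to show what sed would receive under each quoting style. The helper name show_arg is made up for illustration.

```shell
# Hypothetical helper: print the first argument exactly as the shell passed it.
show_arg() {
  printf '%s\n' "$1"
}

show_arg 's/a\\b/X/'   # single quotes: prints s/a\\b/X/
show_arg "s/a\\b/X/"   # double quotes: prints s/a\b/X/
```

With single quotes, both backslashes survive and sed sees an escaped literal backslash. With double quotes, the shell has already collapsed the pair into one, so sed sees a different regex.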