Bash command to convert an HTML page to a text file

The easiest way is to use a text-mode browser's dump option, which prints the rendered page (in short, the text version of the viewable HTML).

Remote file:

lynx -dump www.google.com > file.txt
links -dump www.google.com > file.txt

Local file:

lynx -dump ./1.htm > file.txt
links -dump ./1.htm > file.txt

With charset conversion to UTF-8:

lynx -dump -display_charset UTF-8 ./1.htm
links -dump -codepage UTF-8 ./1.htm
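
To convert several local files in one go, the commands above can be wrapped in a small loop. This is only a sketch: the function names html2txt_name and convert_dir are made up for illustration, and it assumes that either lynx or links is installed and that writing each result next to its source as <name>.txt is acceptable.

```shell
#!/usr/bin/env bash

# Hypothetical helper: derive the output path by appending .txt,
# so 1.html and 1.htm never overwrite each other's output.
html2txt_name() {
    printf '%s.txt\n' "$1"
}

# Hypothetical helper: dump every .html/.htm file in a directory to text,
# using whichever of lynx or links is available.
convert_dir() {
    local dir=${1:-.} f
    for f in "$dir"/*.html "$dir"/*.htm; do
        [ -e "$f" ] || continue   # skip glob patterns that matched nothing
        if command -v lynx >/dev/null 2>&1; then
            lynx -dump -display_charset UTF-8 "$f" > "$(html2txt_name "$f")"
        elif command -v links >/dev/null 2>&1; then
            links -dump -codepage UTF-8 "$f" > "$(html2txt_name "$f")"
        else
            echo "neither lynx nor links found" >&2
            return 1
        fi
    done
}
```

Calling convert_dir ./pages would then leave a .txt file beside each HTML file in ./pages (the directory name is just an example).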
