PEG for Python style indentation

Pure PEG cannot parse indentation, but peg.js can, because its grammars may embed JavaScript that keeps state between rules. I did a quick-and-dirty experiment (inspired by Ira Baxter’s comment about cheating) and wrote a simple tokenizer. For a more complete solution (a complete parser), please see this question: Parse indentation level with PEG.js /* Initializations */ { function start(first, tail) { var done = [first[1]]; …
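The tokenizer idea can be illustrated outside peg.js as well. What follows is a minimal Python sketch, not the code from the answer: the function name `tokenize_indentation` and the token kinds `INDENT`, `DEDENT`, and `LINE` are hypothetical, and it assumes indentation is written with spaces only. It keeps a stack of open indentation widths, which is exactly the kind of state a pure PEG has no place for.

```python
def tokenize_indentation(source):
    """Return a list of (kind, value) tokens; leading spaces become INDENT/DEDENT."""
    stack = [0]  # indentation widths of the blocks that are currently open
    tokens = []
    for line in source.splitlines():
        if not line.strip():
            continue  # blank lines do not affect block structure
        width = len(line) - len(line.lstrip(" "))
        if width > stack[-1]:
            # deeper than the current block: open a new one
            stack.append(width)
            tokens.append(("INDENT", width))
        while width < stack[-1]:
            # shallower: close blocks until we are back at a known width
            stack.pop()
            tokens.append(("DEDENT", stack[-1]))
        if width != stack[-1]:
            raise ValueError(f"inconsistent dedent to column {width}")
        tokens.append(("LINE", line.strip()))
    # close every block that is still open at the end of the input
    while len(stack) > 1:
        stack.pop()
        tokens.append(("DEDENT", stack[-1]))
    return tokens
```

A grammar that consumes this token stream can treat `INDENT` and `DEDENT` like ordinary brackets, so the grammar itself no longer needs to count spaces.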

Lexing and parsing concurrently in F#

First of all, in real-world cases lexing and parsing are time-critical, especially if you need to process tokens before parsing: for example, filtering and collecting comments or resolving context-dependent conflicts. In such cases the parser often waits for the lexer. To answer the question: you can run lexing and parsing concurrently …
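The producer/consumer shape of concurrent lexing and parsing can be sketched as follows. This is a Python illustration under stated assumptions, not the F# approach from the answer: the toy grammar (numbers joined by `+`, with `#` comments) and the names `lex`, `parse_sum`, and `run` are all hypothetical. The lexer runs in its own thread and filters comments before the parser sees them; the parser blocks only when the queue is empty. In CPython, threads will not speed up CPU-bound work like this, so the sketch shows the structure only.

```python
import queue
import re
import threading

TOKEN_RE = re.compile(r"\s*(?:(\d+)|(\+)|(#[^\n]*))")
END = object()  # sentinel marking the end of the token stream

def lex(text, out):
    """Producer: push (kind, value) tokens into the queue, dropping comments."""
    text = text.rstrip()
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            out.put(("ERROR", text[pos]))
            break
        number, plus, _comment = m.groups()
        if number is not None:
            out.put(("NUM", int(number)))
        elif plus is not None:
            out.put(("PLUS", "+"))
        # a comment produces no token: it is filtered out at this stage
        pos = m.end()
    out.put(END)

def parse_sum(tokens):
    """Consumer: parse NUM (PLUS NUM)* from the queue and return the sum."""
    total = 0
    expect_number = True
    while True:
        tok = tokens.get()  # blocks until the lexer has produced a token
        if tok is END:
            break
        kind, value = tok
        if expect_number and kind == "NUM":
            total += value
        elif not expect_number and kind == "PLUS":
            pass
        else:
            raise SyntaxError(f"unexpected token {tok!r}")
        expect_number = not expect_number
    if expect_number:
        raise SyntaxError("input ended where a number was expected")
    return total

def run(text):
    tokens = queue.Queue(maxsize=64)  # bounded, so the lexer cannot run far ahead
    threading.Thread(target=lex, args=(text, tokens), daemon=True).start()
    return parse_sum(tokens)
```

The bounded queue is a design choice: it caps memory use and makes the lexer wait if the parser falls behind, instead of buffering the whole token stream.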

How to understand an EDI file?

Several of these other answers are very good. I’ll try to fill in some things they haven’t mentioned. EDI is a set of standards, the most common of which are ANSI X12 (popular in the United States) and EDIFACT (popular in Europe). Sounds like you’re looking at X12 version 4010. That’s the most widely used (in my …
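As a first step toward reading an X12 file, it helps to split it into segments and elements. The Python sketch below assumes `*` as the element separator and `~` as the segment terminator; these are common choices, but a real interchange declares its own delimiters in the ISA header, so treat both defaults as assumptions. The function name `parse_x12` and the sample fragment are made up for illustration.

```python
def parse_x12(raw, element_sep="*", segment_term="~"):
    """Split X12 text into segments, each returned as a list of elements."""
    segments = []
    for chunk in raw.split(segment_term):
        chunk = chunk.strip()  # tolerate line breaks after each terminator
        if chunk:
            segments.append(chunk.split(element_sep))
    return segments

# Hypothetical fragment; the first element of each segment is its identifier.
sample = "ST*850*0001~BEG*00*SA*PO123**20240101~SE*3*0001~"

for segment in parse_x12(sample):
    print(segment[0], segment[1:])
```

Two adjacent separators, as in `PO123**20240101`, yield an empty string, so element positions stay aligned with the segment definition.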

Use Scala parser combinator to parse CSV files

What you missed is whitespace. I threw in a couple bonus improvements. import scala.util.parsing.combinator._ object CSV extends RegexParsers { override protected val whiteSpace = """[ \t]""".r def COMMA = "," def DQUOTE = "\"" def DQUOTE2 = "\"\"" ^^ { case _ => "\"" } def CR = "\r" def LF = "\n" def CRLF …
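The quoting rule visible in the excerpt, where a doubled quote inside a quoted field stands for one literal quote, can be illustrated with a small hand-written scanner. This Python sketch is not the parser-combinator approach from the answer: the function name `parse_csv_record` is hypothetical, it handles a single record without line breaks, and it keeps whitespace in unquoted fields verbatim.

```python
def parse_csv_record(line):
    """Split one CSV record into fields, honoring quotes and "" escapes."""
    fields, i, n = [], 0, len(line)
    while True:
        if i < n and line[i] == '"':
            # quoted field: read up to a quote that is not doubled
            i += 1
            buf = []
            while True:
                if i >= n:
                    raise ValueError("unterminated quoted field")
                if line[i] == '"':
                    if i + 1 < n and line[i + 1] == '"':
                        buf.append('"')  # "" inside quotes is a literal quote
                        i += 2
                        continue
                    i += 1  # closing quote
                    break
                buf.append(line[i])
                i += 1
            fields.append("".join(buf))
        else:
            # bare field: everything up to the next comma, kept verbatim
            j = line.find(",", i)
            if j == -1:
                j = n
            fields.append(line[i:j])
            i = j
        if i >= n:
            return fields
        if line[i] != ",":
            raise ValueError(f"expected ',' at column {i}")
        i += 1  # skip the separator and read the next field
```

For everyday work in Python the standard `csv` module is the better choice; the scanner above only makes the grammar's rules explicit.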
