parsing
Why is bottom-up parsing more common than top-down parsing?
If you choose a powerful parser generator, you can code your grammar without worrying about peculiar properties. (LA)LR means you don’t have to worry about left recursion, one less headache. GLR means you don’t have to worry about local ambiguity or lookahead. And the bottom-up parsers tend to be pretty efficient. So, once you’ve paid …
Practical difference between parser rules and lexer rules in ANTLR?
… what are the practical differences between these two statements in ANTLR … MY_RULE will be used to tokenize your input source. It represents a fundamental building block of your language. my_rule is called from the parser; it consists of zero or more other parser rules or tokens produced by the lexer. That’s the difference. …
Writing a parser from scratch in Haskell
It’s actually surprisingly easy to build Parsec-from-scratch. The actual library code itself is heavily generalized and optimized, which contorts the core abstraction, but if you’re just building things from scratch to understand more about what’s going on, you can write it in just a few lines of code. I’ll build a slightly weaker Applicative parser …
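To give a feel for how small the core abstraction can be, here is a minimal sketch in Python rather than Haskell. It is not the code from the original answer: every name (satisfy, pure, ap, fmap, many, some) is chosen for illustration, and a parser is assumed to be a plain function from the remaining input to either (value, rest) or None.

```python
from typing import Any, Callable, Optional, Tuple

# Assumption: a parser maps the remaining input to (value, rest), or None on failure.
Parser = Callable[[str], Optional[Tuple[Any, str]]]

def satisfy(pred):
    """Consume one character if it satisfies pred."""
    def run(s):
        if s and pred(s[0]):
            return s[0], s[1:]
        return None
    return run

def pure(value):
    """Succeed without consuming input."""
    return lambda s: (value, s)

def ap(pf, px):
    """Applicative apply: run pf, then px, and apply the resulting function."""
    def run(s):
        first = pf(s)
        if first is None:
            return None
        f, rest = first
        second = px(rest)
        if second is None:
            return None
        x, rest = second
        return f(x), rest
    return run

def fmap(f, p):
    """Map a function over a parser's result."""
    return ap(pure(f), p)

def many(p):
    """Zero or more repetitions of p (p must consume input, or this loops forever)."""
    def run(s):
        values = []
        while True:
            result = p(s)
            if result is None:
                return values, s
            value, s = result
            values.append(value)
    return run

def some(p):
    """One or more repetitions of p."""
    return ap(fmap(lambda x: lambda xs: [x] + xs, p), many(p))

digit = satisfy(lambda c: c in "0123456789")
number = fmap(lambda ds: int("".join(ds)), some(digit))
```

With these definitions, number("42abc") yields (42, "abc"), and number("abc") yields None because no digit can be consumed.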
Are there such a thing as LL(0) parsers?
LL(0) parsers do look at the tokens, but they don’t decide which productions to apply upon them. They just determine if the sequence belongs to the language or not. This means that every non-terminal symbol must have a single right-hand side and that there may be no recursion. G == ID name lastname name == …
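A small Python sketch may make the consequence concrete. The grammar below is made up for illustration (it is not the grammar from the answer): with one right-hand side per non-terminal and no recursion, the start symbol expands to exactly one token sequence, so recognition reduces to comparing the input against that sequence.

```python
# Hypothetical LL(0) grammar: one right-hand side per non-terminal, no recursion.
GRAMMAR = {
    "S": ["greeting", "target"],
    "greeting": ["HELLO"],
    "target": ["WORLD", "BANG"],
}

def expand(symbol, grammar):
    """Expand a symbol into the single terminal sequence it derives."""
    if symbol not in grammar:          # terminal symbol
        return [symbol]
    tokens = []
    for part in grammar[symbol]:       # the only production; no choice to make
        tokens.extend(expand(part, grammar))
    return tokens

def accepts(tokens, grammar, start="S"):
    """Check membership without ever choosing between productions."""
    return list(tokens) == expand(start, grammar)
```

Here accepts(["HELLO", "WORLD", "BANG"], GRAMMAR) is True, and any other token sequence is rejected.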
How to parse Markdown in PHP?
You should have a look at Parsedown. It parses Markdown text the way people do. First, it divides texts into lines. Then it looks at how these lines start and relate to each other. Finally, it looks for special characters to identify inline elements.
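The line-oriented approach described above can be sketched as follows. This is a Python illustration of the general idea only, not Parsedown’s actual code or API: the function names are hypothetical, it handles just headings, single-level list items, paragraphs, and emphasis, and it skips the step of relating neighbouring lines to each other (for example, wrapping consecutive list items in one list).

```python
import html
import re

def render_inline(text):
    """Look for special characters that mark inline elements."""
    text = html.escape(text, quote=False)
    text = re.sub(r"\*\*(.+?)\*\*", r"<strong>\1</strong>", text)
    text = re.sub(r"\*(.+?)\*", r"<em>\1</em>", text)
    return text

def render(markdown):
    """Split the text into lines, then classify each line by how it starts."""
    out = []
    for line in markdown.splitlines():
        if not line.strip():
            continue                                   # blank lines separate blocks
        heading = re.match(r"(#{1,6})\s+(.*)", line)
        if heading:
            level = len(heading.group(1))
            out.append(f"<h{level}>{render_inline(heading.group(2))}</h{level}>")
        elif line.startswith("- "):
            out.append(f"<li>{render_inline(line[2:])}</li>")
        else:
            out.append(f"<p>{render_inline(line)}</p>")
    return "\n".join(out)
```

For example, render("# Title\n\nSome *text*") produces an h1 line followed by a paragraph containing an em element.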
Choosing a Haskell parser
You have several good options. For lightweight parsing of String types: parsec, polyparse. For packed bytestring parsing, e.g. of HTTP headers: attoparsec. For actual binary data, most people use either binary (for lazy binary parsing) or cereal (for strict binary parsing). The main question to ask yourself is: what is the underlying string type? …
Difference between compilers and parsers?
A compiler is often made up of several components, one of which is a parser. A common set of components in a compiler is: Lexer – break the program up into words. Parser – check that the syntax of the sentences is correct. Semantic Analysis – check that the sentences make sense. Optimizer – edit …
Building a parser (Part I)
Generally, you want to separate the functions of the tokeniser (also called a lexer) from other stages of your compiler or interpreter. The reason for this is basic modularity: each pass consumes one kind of thing (e.g., characters) and produces another one (e.g., tokens). So you’ve converted your characters to tokens. Now you want to …
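As a sketch of the first pass, the tokeniser below consumes characters and produces tokens, and nothing else. It is written in Python under assumed token kinds (NUMBER, IDENT, OP) that do not come from the original answer; later stages would consume the Token stream without ever seeing raw characters.

```python
import re
from typing import Iterator, NamedTuple

class Token(NamedTuple):
    kind: str
    text: str

# Hypothetical token kinds for a tiny expression language.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[-+*/=()]"),
    ("SKIP", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(source: str) -> Iterator[Token]:
    """Consume characters, produce tokens; whitespace is dropped."""
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        pos = match.end()
        if match.lastgroup != "SKIP":
            yield Token(match.lastgroup, match.group())
```

For instance, list(tokenize("x = 42")) gives an IDENT token, an OP token, and a NUMBER token, in that order.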
Parse URL in shell script
[EDIT 2019] This answer is not meant to be a catch-all, works-for-everything solution; it was intended to provide a simple alternative to the Python-based version, and it ended up having more features than the original. It answered the basic question in a bash-only way and then was modified multiple times by myself …