parsing – Tarik Billa

How can I use NLP to parse recipe ingredients?

April 9, 2024 by Tarik

Parsing a .json column in Power BI

April 9, 2024 by Tarik

There is an easier way to do it in the Query Editor. On the column you want to read as JSON, right-click the column and select Transform > JSON. The column then becomes a Record, which you can split into one column per JSON property using the button in the top-right corner.

How parse 2013-03-13T20:59:31+0000 date string to Date

April 4, 2024 by Tarik

Parsing assembly qualified name?

January 8, 2024 by Tarik

The AssemblyName class can parse the assembly name for you; just pass the string to its constructor. If you have an assembly-qualified type name, I think you’ll have to strip off the type part of the string first (i.e. everything up to the first comma).
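As a rough illustration of the "strip off the type part" step, here is a minimal Python sketch of the string handling only. The helper name `split_assembly_qualified_name` is hypothetical, and the sketch assumes a non-generic type name: generic type names carry commas inside their square brackets, so a plain first-comma split would not be enough for them.

```python
def split_assembly_qualified_name(qualified_name: str) -> tuple[str, str]:
    """Split 'Type, Assembly, Version=...' into (type name, assembly display name).

    Assumption: the type name itself contains no commas (i.e. it is not generic).
    """
    type_name, separator, assembly_name = qualified_name.partition(",")
    if not separator:
        raise ValueError("no assembly part found after the type name")
    return type_name.strip(), assembly_name.strip()


example = (
    "System.String, mscorlib, Version=4.0.0.0, "
    "Culture=neutral, PublicKeyToken=b77a5c561934e089"
)
type_name, assembly_name = split_assembly_qualified_name(example)
print(type_name)
print(assembly_name)
```

The second value is the part you would then hand to the AssemblyName constructor on the .NET side.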

Why is bottom-up parsing more common than top-down parsing?

January 7, 2024 by Tarik

If you choose a powerful parser generator, you can code your grammar without worrying about peculiar properties. (LA)LR means you don’t have to worry about left recursion, one less headache. GLR means you don’t have to worry about local ambiguity or lookahead. And the bottom-up parsers tend to be pretty efficient. So, once you’ve paid …

Practical difference between parser rules and lexer rules in ANTLR?

January 7, 2024 by Tarik

… what are the practical differences between these two statements in ANTLR … MY_RULE will be used to tokenize your input source. It represents a fundamental building block of your language. my_rule is called from the parser, it consists of zero or more other parser rules or tokens produced by the lexer. That’s the difference. …
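The same split can be sketched outside ANTLR. The following is a hand-written Python analogy, not ANTLR-generated code: the token names `INT`, `PLUS`, and `WS`, and the functions `tokenize` and `parse_sum`, are made up for illustration. The upper-case rules turn characters into tokens, and the lower-case rule only ever sees those tokens.

```python
import re

# Lexer-level rules (upper-case names in ANTLR): characters -> tokens.
TOKEN_RULES = [("INT", r"\d+"), ("PLUS", r"\+"), ("WS", r"\s+")]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_RULES))


def tokenize(text):
    """Return a list of (token_type, text) pairs, dropping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
        if match.lastgroup != "WS":  # whitespace is recognised but not emitted
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


# Parser-level rule (lower-case name in ANTLR), roughly: sum : INT (PLUS INT)* ;
def parse_sum(tokens):
    """Consume tokens produced by the lexer and return the evaluated sum."""
    pos = 0

    def expect(kind):
        nonlocal pos
        if pos >= len(tokens) or tokens[pos][0] != kind:
            raise SyntaxError(f"expected {kind} at token index {pos}")
        value = tokens[pos][1]
        pos += 1
        return value

    total = int(expect("INT"))
    while pos < len(tokens):
        expect("PLUS")
        total += int(expect("INT"))
    return total


print(tokenize("1 + 22"))
print(parse_sum(tokenize("1 + 22 + 3")))
```

Note that `parse_sum` never looks at raw characters; it works purely on what `tokenize` produced, which mirrors the parser-rule and lexer-rule relationship described above.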

Parsing YAML, return with line number

January 5, 2024 by Tarik

Here’s an improved version of puzzlet’s answer:

    import yaml
    from yaml.loader import SafeLoader

    class SafeLineLoader(SafeLoader):
        def construct_mapping(self, node, deep=False):
            mapping = super(SafeLineLoader, self).construct_mapping(node, deep=deep)
            # Add 1 so line numbering starts at 1
            mapping['__line__'] = node.start_mark.line + 1
            return mapping

You can use it like this:

    data = yaml.load(whatever, Loader=SafeLineLoader)
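To see what the loader produces on nested input, here is a small self-contained run. It assumes PyYAML is installed, and the sample document and its keys (`name`, `server`, `host`, `port`) are invented for the demonstration.

```python
import yaml
from yaml.loader import SafeLoader


class SafeLineLoader(SafeLoader):
    def construct_mapping(self, node, deep=False):
        mapping = super().construct_mapping(node, deep=deep)
        # start_mark.line is zero-based; add 1 for human-friendly numbering
        mapping["__line__"] = node.start_mark.line + 1
        return mapping


# Hypothetical sample input with one nested mapping
document = "name: demo\nserver:\n  host: localhost\n  port: 8080\n"

data = yaml.load(document, Loader=SafeLineLoader)
print(data["__line__"])            # line where the top-level mapping starts
print(data["server"]["__line__"])  # line where the nested mapping starts
```

Every mapping gets its own `__line__` entry, and a nested mapping reports the line of its first key rather than the line of the parent key. Keep in mind that `__line__` becomes an ordinary key in the result, so code that iterates over the mapping will see it.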

Writing a parser from scratch in Haskell

January 5, 2024 by Tarik

It’s actually surprisingly easy to build Parsec-from-scratch. The actual library code itself is heavily generalized and optimized which contorts the core abstraction, but if you’re just building things from scratch to understand more about what’s going on you can write it in just a few lines of code. I’ll build a slightly weaker Applicative parser …

how do I parse an iso 8601 date (with optional milliseconds) to a struct tm in C++?

January 1, 2024 by Tarik

New answer for old question. Rationale: updated tools. Using this free, open source library, one can parse into a std::chrono::time_point<system_clock, milliseconds>, which has the advantage over a tm of being able to hold millisecond precision. And if you really need to, you can continue on to the C API via system_clock::to_time_t (losing the milliseconds along …

How to get all html data after all scripts and page loading is done? (puppeteer)

January 1, 2024 by Tarik

If you want the full HTML, the same as what you see in the inspector, here it is:

    const puppeteer = require('puppeteer');

    (async function main() {
      try {
        const browser = await puppeteer.launch();
        const [page] = await browser.pages();

        // Wait until the network has gone idle so scripts have finished loading
        await page.goto('https://example.org/', { waitUntil: 'networkidle0' });

        // Serialize the root element of the rendered DOM
        const data = await page.evaluate(() => document.querySelector('*').outerHTML);

        console.log(data);

        await browser.close();
      } catch (err) {
        console.error(err);
      }
    })();