large-data-volumes
Is it possible to change argv or do I need to create an adjusted copy of it?
The C99 standard says this about modifying argv (and argc): "The parameters argc and argv and the strings pointed to by the argv array shall be modifiable by the program, and retain their last-stored values between program startup and program termination."
Using Hibernate’s ScrollableResults to slowly read 90 million records
Using setFirstResult and setMaxResults is the only option that I’m aware of. Traditionally, a scrollable result set would transfer rows to the client only on an as-required basis. Unfortunately, MySQL Connector/J fakes it: it executes the entire query and transports the results to the client, so the driver actually has the entire result set …
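The paging idea behind setFirstResult and setMaxResults can be sketched outside Hibernate. The following is a minimal illustration, not Hibernate code: it uses Python's built-in sqlite3 module as a stand-in database, and the table name `records`, its columns, and the helper names `iter_in_pages` and `make_demo_db` are all assumptions made for the example. `OFFSET` plays the role of setFirstResult and `LIMIT` the role of setMaxResults.

```python
import sqlite3


def iter_in_pages(conn, page_size):
    """Yield rows one page at a time so only a single page is held in memory."""
    first_result = 0  # analogous to setFirstResult (hypothetical mapping)
    while True:
        rows = conn.execute(
            # ORDER BY keeps the page boundaries stable between queries.
            "SELECT id, payload FROM records ORDER BY id LIMIT ? OFFSET ?",
            (page_size, first_result),  # page_size is analogous to setMaxResults
        ).fetchall()
        if not rows:
            break
        yield from rows
        first_result += len(rows)


def make_demo_db(n):
    """Build a small in-memory table (hypothetical schema) to page through."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, payload TEXT)")
    conn.executemany(
        "INSERT INTO records (id, payload) VALUES (?, ?)",
        [(i, f"row-{i}") for i in range(1, n + 1)],
    )
    return conn


if __name__ == "__main__":
    demo = make_demo_db(25)
    for row in iter_in_pages(demo, page_size=10):
        print(row)
```

Each query fetches at most `page_size` rows, so memory use stays bounded regardless of the table size. With very large offsets the database may still have to skip over all preceding rows, so later pages can be noticeably slower than early ones.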
Designing a web crawler
If you want a detailed answer, take a look at section 3.8 of this paper, which describes the URL-seen test of a modern scraper: In the course of extracting links, any Web crawler will encounter multiple links to the same document. To avoid downloading and processing a document multiple times, a URL-seen test must …
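A URL-seen test can be approximated with a set of URL fingerprints. The excerpt does not show the paper's actual design, so the sketch below is only an in-memory illustration under assumed choices: the class name `UrlSeenTest`, the simplified `canonicalize` rules, and the 8-byte SHA-1 fingerprint are all hypothetical.

```python
import hashlib
from urllib.parse import urlsplit, urlunsplit


def canonicalize(url):
    """Simplified normalization (an assumption): lowercase scheme and host, drop the fragment."""
    parts = urlsplit(url.strip())
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), parts.path or "/", parts.query, "")
    )


class UrlSeenTest:
    """Remembers compact fingerprints of URLs instead of the full strings."""

    def __init__(self):
        self._fingerprints = set()

    def check_and_add(self, url):
        """Return True if the URL was seen before; otherwise record it and return False."""
        digest = hashlib.sha1(canonicalize(url).encode("utf-8")).digest()
        fingerprint = digest[:8]  # truncated to save memory; collisions are possible but rare
        if fingerprint in self._fingerprints:
            return True
        self._fingerprints.add(fingerprint)
        return False


if __name__ == "__main__":
    seen = UrlSeenTest()
    for link in ["HTTP://Example.com/a#top", "http://example.com/a", "http://example.com/b"]:
        print(link, "already seen" if seen.check_and_add(link) else "new")
```

Storing fixed-size fingerprints rather than full URLs keeps memory per entry small. A crawler at real scale would need a structure that does not live entirely in RAM.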