When should I use mmap for file access?

mmap is great if you have multiple processes accessing data in a read-only fashion from the same file, which is common in the kind of server systems I write. mmap allows all those processes to share the same physical memory pages, saving a lot of memory. mmap also allows the operating system to optimize … Read more
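
For illustration, a minimal sketch of that shared, read-only mapping using POSIX open()/mmap(); the file name and the brevity of the error handling are assumptions, not part of the original answer:

#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <cstdio>

int main()
{
    int fd = open("data.bin", O_RDONLY);            // placeholder file name
    if (fd < 0) { perror("open"); return 1; }

    struct stat st;
    if (fstat(fd, &st) < 0) { perror("fstat"); return 1; }

    // PROT_READ + MAP_SHARED: every process that maps this file this way
    // reads the same physical pages from the page cache.
    void *p = mmap(nullptr, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) { perror("mmap"); return 1; }

    const char *bytes = static_cast<const char *>(p);
    // Read bytes[0 .. st.st_size - 1] directly; no read() copies are made.
    std::printf("first byte: %d\n", st.st_size > 0 ? bytes[0] : -1);

    munmap(p, st.st_size);
    close(fd);
    return 0;
}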

Parsing CSV files in C#, with header

A CSV parser is now part of the .NET Framework. Add a reference to Microsoft.VisualBasic.dll (it works fine from C#, don't mind the name):

using (TextFieldParser parser = new TextFieldParser(@"c:\temp\test.csv"))
{
    parser.TextFieldType = FieldType.Delimited;
    parser.SetDelimiters(",");
    while (!parser.EndOfData)
    {
        // Process row
        string[] fields = parser.ReadFields();
        foreach (string field in fields)
        {
            // TODO: Process field
        }
    }
}

… Read more

Copy a file in a sane, safe and efficient way

Copy a file in a sane way:

#include <fstream>

int main()
{
    std::ifstream src("from.ogv", std::ios::binary);
    std::ofstream dst("to.ogv", std::ios::binary);
    dst << src.rdbuf();
}

This is so simple and intuitive to read that it is worth the extra cost. If we were doing it a lot, it would be better to fall back on OS calls to the file system (see the sketch below). I … Read more
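
As a hedged sketch of the "OS calls" route mentioned above: since C++17, std::filesystem::copy_file hands the copy to the platform's native facilities. The overwrite policy and the reuse of the file names above are assumptions:

#include <filesystem>
#include <iostream>

int main()
{
    std::error_code ec;
    // Delegates the copy to the OS; overwrite_existing is an assumed policy.
    std::filesystem::copy_file("from.ogv", "to.ogv",
                               std::filesystem::copy_options::overwrite_existing, ec);
    if (ec)
    {
        std::cerr << "copy failed: " << ec.message() << '\n';
        return 1;
    }
    return 0;
}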

Lazy Method for Reading Big File in Python?

To write a lazy function, just use yield:

def read_in_chunks(file_object, chunk_size=1024):
    """Lazy function (generator) to read a file piece by piece.
    Default chunk size: 1k."""
    while True:
        data = file_object.read(chunk_size)
        if not data:
            break
        yield data

with open('really_big_file.dat') as f:
    for piece in read_in_chunks(f):
        process_data(piece)

Another option would be to use iter and a helper … Read more

Error!: SQLSTATE[HY000] [1045] Access denied for user 'divattrend_liink'@'localhost' (using password: YES)