What is the Cost of an L1 Cache Miss?

Here is an attempt to provide insight into the relative cost of cache misses by analogy with baking chocolate chip cookies … Your hands are your registers. It takes you 1 second to drop chocolate chips into the dough. The kitchen counter is your L1 cache, twelve times slower than registers. It takes 12 x … Read more

In CUDA, what is memory coalescing, and how is it achieved?

It’s likely that this information applies only to compute capability 1.x, or CUDA 2.0. More recent architectures and CUDA 3.0 have more sophisticated global memory access, and in fact “coalesced global loads” are not even profiled for these chips. Also, this logic can be applied to shared memory to avoid bank conflicts. A coalesced memory … Read more
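
To make the idea of coalescing a little more concrete, here is a rough host-side sketch in plain C (not CUDA, and not taken from the post). It models a warp of 32 threads, each reading one 4-byte float, and counts how many aligned memory segments the warp touches. The 128-byte segment size, the function name `segments_touched`, and its parameters are assumptions made for illustration; real hardware rules differ between compute capabilities.

```c
#include <stddef.h>
#include <stdint.h>

#define WARP_SIZE 32        /* threads per warp on NVIDIA GPUs */
#define SEGMENT_BYTES 128   /* assumed segment size; varies by architecture */

/* Hypothetical helper: count the distinct aligned segments touched when
   lane i of a warp reads the float at element index base + i * stride.
   Fewer segments means the accesses coalesce into fewer transactions. */
int segments_touched(size_t base, size_t stride)
{
    uint64_t seen[WARP_SIZE];
    int count = 0;

    for (int lane = 0; lane < WARP_SIZE; ++lane) {
        /* Byte address of the element this lane reads. */
        uint64_t byte_addr =
            (uint64_t)(base + (size_t)lane * stride) * sizeof(float);
        uint64_t segment = byte_addr / SEGMENT_BYTES;

        /* Linear search is fine: at most 32 entries. */
        int found = 0;
        for (int i = 0; i < count; ++i) {
            if (seen[i] == segment) {
                found = 1;
                break;
            }
        }
        if (!found) {
            seen[count++] = segment;
        }
    }
    return count;
}
```

Under this simplified model, consecutive aligned accesses (`segments_touched(0, 1)`) fall into a single segment, while a stride of 32 elements (`segments_touched(0, 32)`) puts every lane in its own segment, which is the pattern coalescing is meant to avoid.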

Can I set a breakpoint on ‘memory access’ in GDB?

watch only breaks on write, rwatch lets you break on read, and awatch lets you break on read/write. You can set read watchpoints on memory locations: gdb$ rwatch *0xfeedface Hardware read watchpoint 2: *0xfeedface but one limitation applies to the rwatch and awatch commands; you can’t use gdb variables in expressions: gdb$ rwatch $ebx+0xec1a04f Expression … Read more
