What are _mm_prefetch() locality hints?

Sometimes intrinsics are better understood in terms of the instructions they represent than through the abstract semantics given in their descriptions. The full set of locality constants, as of today, is:
#define _MM_HINT_T0 1
#define _MM_HINT_T1 2
#define _MM_HINT_T2 3
#define _MM_HINT_NTA 0
#define _MM_HINT_ENTA 4
#define _MM_HINT_ET0 5
#define _MM_HINT_ET1 6
#define _MM_HINT_ET2 … Read more

Extract target from Tensorflow PrefetchDataset

You can convert it to a list with list(ds) and then recompile it as a normal Dataset with tf.data.Dataset.from_tensor_slices(list(ds)). From there your nightmare begins again but at least it’s a nightmare that other people have had before. Note that for more complex datasets (e.g. nested dictionaries) you will need more preprocessing after calling list(ds), but … Read more

How to prefetch data using a custom python function in tensorflow

This is a common use case, and most implementations use TensorFlow’s queues to decouple the preprocessing code from the training code. There is a tutorial on how to use queues, but the main steps are as follows: Define a queue, q, that will buffer the preprocessed data. TensorFlow supports the simple tf.FIFOQueue that produces elements … Read more
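The decoupling idea described above can be sketched with Python's standard library alone. This is not TensorFlow's queue API: it swaps in `queue.Queue` and a background thread to show the same producer/consumer pattern. The names `preprocess`, `producer`, and `prefetch`, and the buffer size, are hypothetical and chosen only for illustration.

```python
import queue
import threading


def preprocess(item):
    """Hypothetical stand-in for a custom Python preprocessing function."""
    return item * 2


def producer(raw_items, q, sentinel):
    """Preprocess items and push them into the bounded FIFO buffer."""
    for item in raw_items:
        q.put(preprocess(item))  # blocks while the buffer is full
    q.put(sentinel)  # signal that no more data will arrive


def prefetch(raw_items, buffer_size=4):
    """Yield preprocessed items while a background thread fills the buffer."""
    q = queue.Queue(maxsize=buffer_size)
    sentinel = object()
    worker = threading.Thread(
        target=producer, args=(raw_items, q, sentinel), daemon=True
    )
    worker.start()
    while True:
        item = q.get()  # blocks until the producer has something ready
        if item is sentinel:
            break
        yield item
    worker.join()


if __name__ == "__main__":
    # The "training loop" consumes items while preprocessing runs ahead of it.
    for batch in prefetch(range(5)):
        print(batch)
```

The bounded queue is the design choice that matters here: the producer can run ahead of the consumer by at most `buffer_size` items, so preprocessing overlaps with consumption without unbounded memory growth.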

How do I programmatically disable hardware prefetching?

You can enable or disable the hardware prefetchers using msr-tools: http://www.kernel.org/pub/linux/utils/cpu/msr-tools/
The following enables the hardware prefetcher (by clearing bit 9):
[root@… msr-tools-1.2]# ./wrmsr -p 0 0x1a0 0x60628e2089
[root@… msr-tools-1.2]# ./rdmsr 0x1a0
60628e2089
The following disables the hardware prefetcher (by setting bit 9):
[root@… msr-tools-1.2]# ./wrmsr -p 0 0x1a0 0x60628e2289
[root@… msr-tools-1.2]# ./rdmsr 0x1a0
60628e2289 … Read more
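As a quick arithmetic check, the two register values quoted above differ only in bit 9. The sketch below does not touch any MSR; it only manipulates the two integers from the transcript, and the helper names (`bit_is_set`, `set_bit`, `clear_bit`) are hypothetical.

```python
PREFETCH_DISABLE_BIT = 9  # bit position named in the excerpt above

ENABLED_VALUE = 0x60628E2089   # value written when enabling the prefetcher
DISABLED_VALUE = 0x60628E2289  # value written when disabling the prefetcher


def bit_is_set(value, bit):
    """Return True if the given bit position is 1 in value."""
    return (value >> bit) & 1 == 1


def set_bit(value, bit):
    """Return value with the given bit forced to 1."""
    return value | (1 << bit)


def clear_bit(value, bit):
    """Return value with the given bit forced to 0."""
    return value & ~(1 << bit)


if __name__ == "__main__":
    # XOR isolates the bits that differ between the two values.
    print(hex(ENABLED_VALUE ^ DISABLED_VALUE))
    print(bit_is_set(ENABLED_VALUE, PREFETCH_DISABLE_BIT))
    print(bit_is_set(DISABLED_VALUE, PREFETCH_DISABLE_BIT))
```

In practice you would read the current register value first and change only the one bit, rather than writing a value copied from another machine, since the remaining bits vary between systems.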

Do current x86 architectures support non-temporal loads (from “normal” memory)?

To answer specifically the headline question: Yes, recent mainstream Intel CPUs support non-temporal loads on normal memory – but only “indirectly” via non-temporal prefetch instructions, rather than directly using non-temporal load instructions like movntdqa. This is in contrast to non-temporal stores where you can just use the corresponding non-temporal store instructions directly. The basic … Read more

Prefetching Examples?

Here’s an actual piece of code that I’ve pulled out of a larger project. (Sorry, it’s the shortest one I can find that had a noticeable speedup from prefetching.) This code performs a very large data transpose. This example uses the SSE prefetch instructions, which may be the same as the ones that GCC emits. … Read more

Why does django’s prefetch_related() only work with all() and not filter()?

In Django 1.6 and earlier, it is not possible to avoid the extra queries. The prefetch_related call effectively caches the results of a.photoset.all() for every album in the queryset. However, a.photoset.filter(format=1) is a different queryset, so you will generate an extra query for every album. This is explained in the prefetch_related docs. The filter(format=1) is … Read more
