Running Cython in Windows x64 – fatal error C1083: Cannot open include file: ‘basetsd.h’: No such file or directory

In case anyone is currently (2017) facing the same error with the Visual C++ 2015 tools: launch the setup again and also select the Windows 8.1 or Windows 10 SDK, depending on your OS. This will fix the basetsd.h error. If it is still not working, try launching the build tools from C:\Program Files (x86)\Microsoft Visual C++ Build Tools. Another alternative would … Read more

Cython: cimport and import numpy as (both) np

cimport my_module gives access to C functions, attributes, or even sub-modules under my_module. import my_module gives access to Python functions, attributes, or sub-modules under my_module. In your case, cimport numpy as np gives you access to the NumPy C API, where you can declare array buffers, variable types and so on. And import numpy … Read more

Numpy vs Cython speed

With slight modification, version 3 becomes twice as fast:

    @cython.boundscheck(False)
    @cython.wraparound(False)
    @cython.nonecheck(False)
    def process2(np.ndarray[DTYPE_t, ndim=2] array):
        cdef unsigned int rows = array.shape[0]
        cdef unsigned int cols = array.shape[1]
        cdef unsigned int row, col, row2
        cdef np.ndarray[DTYPE_t, ndim=2] out = np.empty((rows, cols))
        for row in range(rows):
            for row2 in range(rows):
                for col in range(cols):
                    out[row, col] … Read more

Cython: (Why / When) Is it preferable to use Py_ssize_t for indexing?

Py_ssize_t is signed. See PEP 353, where it says “A new type Py_ssize_t is introduced, which has the same size as the compiler’s size_t type, but is signed. It will be a typedef for ssize_t where available.” You should use Py_ssize_t for indexing. I didn’t find a definitive statement of this in the Cython docs, … Read more
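The signedness point can be checked from plain Python. This is only a small sketch, assuming that ctypes.c_ssize_t and ctypes.c_size_t mirror the platform's ssize_t and size_t; the variable names are illustrative and not from the original answer.

```python
import ctypes
import sys

# Width of the platform's size_t in bits (typically 64 on a 64-bit build).
size_bits = 8 * ctypes.sizeof(ctypes.c_size_t)

# PEP 353: Py_ssize_t has the same size as size_t, but is signed.
same_width = ctypes.sizeof(ctypes.c_ssize_t) == ctypes.sizeof(ctypes.c_size_t)

# ctypes does no overflow checking, so -1 survives in the signed type
# and wraps around to the largest value in the unsigned type.
signed_minus_one = ctypes.c_ssize_t(-1).value
unsigned_minus_one = ctypes.c_size_t(-1).value

# sys.maxsize is the largest value a Py_ssize_t variable can hold.
largest_index = sys.maxsize

print(same_width, signed_minus_one, unsigned_minus_one == 2**size_bits - 1)
```

The wrap-around in the unsigned type is one reason a signed index type is the safer default: a negative index stays negative instead of silently becoming a huge positive offset.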

Make distutils look for numpy header files in the correct place

Use numpy.get_include():

    from distutils.core import setup
    from distutils.extension import Extension
    from Cython.Distutils import build_ext
    import numpy as np  # <---- New line

    ext_modules = [Extension("hello", ["hello.pyx"],
                             include_dirs=[np.get_include()])]  # <---- New argument

    setup(
        name="Hello world app",
        cmdclass={'build_ext': build_ext},
        ext_modules=ext_modules
    )
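To see what numpy.get_include() actually returns before wiring it into a build script, something like the following may help. It is a sketch that assumes NumPy is installed and that its header directory contains numpy/arrayobject.h, which is the header Cython-generated code typically includes.

```python
import os
import numpy as np

# Directory that should be passed to the compiler via include_dirs.
include_dir = np.get_include()

# Assumption: the main array header lives under <include_dir>/numpy/.
header = os.path.join(include_dir, "numpy", "arrayobject.h")

print(include_dir)
print(os.path.exists(header))
```

If the second line prints False, the compiler would most likely fail with a missing-header error even with include_dirs set, which points to a broken NumPy installation rather than a problem in the setup script.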

Improve Pandas Merge performance

Using set_index on the merging column does indeed speed this up. Below is a slightly more realistic version of julien-marrec’s answer.

    import pandas as pd
    import numpy as np

    myids = np.random.choice(np.arange(10000000), size=1000000, replace=False)
    df1 = pd.DataFrame(myids, columns=['A'])
    df1['B'] = np.random.randint(0, 1000, (1000000))
    df2 = pd.DataFrame(np.random.permutation(myids), columns=['A2'])
    df2['B2'] = np.random.randint(0, 1000, (1000000))

    %%timeit
    x = df1.merge(df2, how='left', left_on='A', right_on='A2')
    #1 loop, best of … Read more
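The excerpt is cut off before the index-based variant appears, so here is a minimal sketch of what such a variant could look like. It assumes the approach is set_index on the key columns followed by DataFrame.join, uses a much smaller frame than the original, and makes no timing claims; names such as df1_idx and joined are illustrative only.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 1000  # far smaller than the original benchmark, for illustration only
ids = rng.choice(np.arange(10 * n), size=n, replace=False)

df1 = pd.DataFrame({"A": ids, "B": rng.integers(0, 1000, n)})
df2 = pd.DataFrame({"A2": rng.permutation(ids), "B2": rng.integers(0, 1000, n)})

# Column-based left merge, as in the excerpt.
merged = df1.merge(df2, how="left", left_on="A", right_on="A2")

# Index-based alternative: make the key the index once, then join on it.
df1_idx = df1.set_index("A")
df2_idx = df2.set_index("A2")
joined = df1_idx.join(df2_idx, how="left")

# Both routes should pair each key with the same B2 value.
merged_sorted = merged.sort_values("A")
joined_sorted = joined.sort_index()
same = bool((merged_sorted["B2"].to_numpy() == joined_sorted["B2"].to_numpy()).all())
print(same)
```

In a notebook, each of the two approaches could be timed in its own %%timeit cell to compare them on realistic data sizes.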
