numba – Tarik Billa

Performance of various numpy fancy indexing methods, also with numba

December 28, 2023 by Tarik

Your summary isn't completely correct. You already ran tests with differently sized arrays, but one thing you didn't do was change the number of elements indexed. I restricted it to pure indexing and omitted take (which is effectively integer array indexing) as well as compress and extract (which are effectively boolean array indexing). The … Read more
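To make the equivalences mentioned above concrete, here is a minimal sketch (the array contents and the names arr, idx, and mask are made up for illustration). It only shows that the calls select the same elements; it says nothing about their relative speed.

```python
import numpy as np

arr = np.arange(10) * 10             # 0, 10, ..., 90
idx = np.array([1, 4, 7])            # integer positions to select
mask = np.zeros(arr.size, dtype=bool)
mask[idx] = True                     # boolean mask selecting the same positions

by_int = arr[idx]                    # integer array indexing
by_take = np.take(arr, idx)          # take: same selection as arr[idx]

by_bool = arr[mask]                  # boolean array indexing
by_compress = np.compress(mask, arr) # compress: same selection as arr[mask]
by_extract = np.extract(mask, arr)   # extract: same selection as arr[mask]
```

Benchmarks of these variants should vary both the array size and the number of elements selected, as the post points out.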

Why is numba faster than numpy here?

December 9, 2023 by Tarik

I think this question highlights (somewhat) the limitations of calling out to precompiled functions from a higher level language. Suppose in C++ you write something like:

    for (int i = 0; i != N; ++i)
        a[i] = b[i] + c[i] + 2 * d[i];

The compiler sees all this at compile time, the whole expression. … Read more
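A rough Python counterpart of that contrast, as a sketch only: add_numpy and add_fused are hypothetical names, and the fallback decorator exists just so the snippet still runs (slowly) when Numba is not installed.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Assumption for illustration: without Numba, run the loop as plain Python.
    def njit(func):
        return func


def add_numpy(b, c, d):
    # Each operator is a separate call into precompiled code, so the
    # intermediate results (b + c and 2 * d) are materialized as temporaries.
    return b + c + 2 * d


@njit
def add_fused(b, c, d):
    # The whole expression is visible to the compiler, so it can be
    # evaluated in a single pass without temporary arrays.
    a = np.empty_like(b)
    for i in range(b.shape[0]):
        a[i] = b[i] + c[i] + 2 * d[i]
    return a


b = np.linspace(0.0, 1.0, 1000)
c = np.linspace(1.0, 2.0, 1000)
d = np.linspace(2.0, 3.0, 1000)
result_numpy = add_numpy(b, c, d)
result_fused = add_fused(b, c, d)
```

Both functions compute the same values; any speed difference depends on array size, hardware, and the Numba version, so it should be measured rather than assumed.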

Matrix inversion without Numpy

August 12, 2023 by Tarik

Here is a more elegant and scalable solution, imo. It'll work for any n×n matrix, and you may find use for the other methods. Note that getMatrixInverse(m) takes an array of arrays (a list of lists) as input. Please feel free to ask any questions.

    def transposeMatrix(m):
        # list(...) is needed on Python 3, where map returns a lazy iterator
        return list(map(list, zip(*m)))

    def getMatrixMinor(m,i,j):
        return [row[:j] + row[j+1:] for row in … Read more

Multiple output and numba signatures

July 16, 2023 by Tarik

You can use either explicit declarations or string declarations.

Tuple with homogeneous types:

    @nb.jit(nb.types.UniTuple(nb.float64[:], 2)(nb.float64[:]), nopython=True)
    def f(a):
        return a, a

    @nb.jit('UniTuple(float64[:], 2)(float64[:])', nopython=True)
    def f(a):
        return a, a

Tuple with heterogeneous types:

    @nb.jit(nb.types.Tuple((nb.float64[:], nb.float64[:,:]))(nb.float64[:], nb.float64[:,:]), nopython=True)
    def f(a, b):
        return a, b

    @nb.jit('Tuple((float64[:], float64[:,:]))(float64[:], float64[:,:])', nopython=True)
    def f(a, b):
        return a, b

Source: … Read more

How do I use numba on a member function of a class?

July 13, 2023 by Tarik

I was in a very similar situation and I found a way to use a Numba-JITed function inside a class. The trick is to use a static method, since static methods are called without prepending the object instance to the argument list. The downside of not having access to self is that … Read more
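A minimal sketch of that static-method pattern follows. The class Accumulator and its methods are hypothetical, and the fallback decorator is only there so the snippet also runs without Numba installed.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Assumption for illustration: without Numba, fall back to plain Python.
    def njit(func):
        return func


class Accumulator:
    """Hypothetical class that keeps its data in a NumPy array."""

    def __init__(self, values):
        self.values = np.asarray(values, dtype=np.float64)

    @staticmethod
    @njit
    def _sum_of_squares(values):
        # No `self` here: the compiled function only receives plain arrays.
        total = 0.0
        for v in values:
            total += v * v
        return total

    def sum_of_squares(self):
        # An ordinary method unpacks the instance state and passes it along.
        return self._sum_of_squares(self.values)


acc = Accumulator([1.0, 2.0, 3.0])
result = acc.sum_of_squares()
```

The thin wrapper method keeps the usual object-oriented call style while the compiled code stays free of any reference to the instance.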

Why is np.dot so much faster than np.sum?

May 28, 2023 by Tarik

numpy.dot delegates to a BLAS vector-vector multiply here, while numpy.sum uses a pairwise summation routine, switching over to an 8x unrolled summation loop at a block size of 128 elements. I don’t know what BLAS library your NumPy is using, but a good BLAS will generally take advantage of SIMD operations, while numpy.sum doesn’t do … Read more
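To illustrate what pairwise summation means, here is a simplified pure-Python sketch. It is not NumPy's actual C routine: it borrows the block size of 128 from the text, leaves out the 8x unrolling, and pairwise_sum is a hypothetical name.

```python
import math

BLOCK = 128  # block size mentioned in the text


def pairwise_sum(values, lo=0, hi=None):
    """Sum values[lo:hi] by recursively splitting the range in half."""
    if hi is None:
        hi = len(values)
    n = hi - lo
    if n <= BLOCK:
        # Small block: plain left-to-right accumulation.
        total = 0.0
        for i in range(lo, hi):
            total += values[i]
        return total
    mid = lo + n // 2
    # Larger range: sum both halves independently and combine them.
    return pairwise_sum(values, lo, mid) + pairwise_sum(values, mid, hi)


data = [0.1] * 10_000
approx = pairwise_sum(data)
reference = math.fsum(data)  # correctly rounded sum, for comparison
```

Splitting the range this way limits how rounding errors accumulate, but the recursion and block bookkeeping are extra work that a BLAS dot product does not have to do.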

Python numpy: cannot convert datetime64[ns] to datetime64[D] (to use with Numba)

May 23, 2023 by Tarik

Series.astype converts all date-like objects to datetime64[ns]. To convert to datetime64[D], use values to obtain a NumPy array before calling astype:

    dates_input = df["month_15"].values.astype('datetime64[D]')

Note that NDFrames (such as Series and DataFrames) can only hold datetime-like objects as objects of dtype datetime64[ns] (pandas 2.0 and later also accept second, millisecond, and microsecond resolutions, but not day resolution). The automatic conversion of all datetime-likes to a common dtype simplifies subsequent date … Read more
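The NumPy half of that conversion can be tried without pandas. In this small sketch the timestamps are invented, and the array stamps stands in for what .values would return from a datetime column.

```python
import numpy as np

# Stand-in for df["month_15"].values: a plain NumPy array of nanosecond timestamps.
stamps = np.array(
    ["2023-05-15T13:45:00", "2023-06-15T08:00:00"],
    dtype="datetime64[ns]",
)

# On a NumPy array, astype can change the unit; the time of day is dropped.
days = stamps.astype("datetime64[D]")
```

The resulting array has dtype datetime64[D], which is the form the post's title asks for when passing dates to Numba.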

Improve Pandas Merge performance

May 7, 2023 by Tarik

set_index on the merging column does indeed speed this up. Below is a slightly more realistic version of julien-marrec's answer.

    import pandas as pd
    import numpy as np

    myids = np.random.choice(np.arange(10000000), size=1000000, replace=False)
    df1 = pd.DataFrame(myids, columns=['A'])
    df1['B'] = np.random.randint(0, 1000, (1000000))
    df2 = pd.DataFrame(np.random.permutation(myids), columns=['A2'])
    df2['B2'] = np.random.randint(0, 1000, (1000000))

    %%timeit
    x = df1.merge(df2, how='left', left_on='A', right_on='A2')
    #1 loop, best of … Read more
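For reference, the index-based variant could look like the sketch below. It uses much smaller, made-up data than the post and checks only that both approaches give the same result; timings are left out because they depend on the machine and the pandas version.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
ids = rng.choice(np.arange(10_000), size=1_000, replace=False)

df1 = pd.DataFrame({"A": ids, "B": rng.integers(0, 1000, size=ids.size)})
df2 = pd.DataFrame({"A2": rng.permutation(ids),
                    "B2": rng.integers(0, 1000, size=ids.size)})

# Column-based merge, as in the post.
by_column = df1.merge(df2, how="left", left_on="A", right_on="A2")

# Index-based variant: move the key columns into the index, then join on it.
by_index = df1.set_index("A").join(df2.set_index("A2"), how="left")
```

If the same frames are merged repeatedly, setting the index once up front avoids paying for set_index on every merge.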

Filtering (reducing) a NumPy Array

April 8, 2023 by Tarik

Summary: Using a loop-based approach with a single pass and copying, accelerated with Numba, offers the best overall trade-off in terms of speed, memory efficiency and flexibility. If the execution of the condition function is sufficiently fast, a two-pass approach (filter2_nb()) may be faster, and it is more memory efficient regardless. Also, for sufficiently large inputs, resizing instead … Read more
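The two strategies named in the summary can be sketched as follows. These are not the post's implementations: keep, filter_single_pass, and filter_two_pass are hypothetical names, the condition (keep even numbers) is arbitrary, and the fallback decorator only lets the snippet run without Numba.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Assumption for illustration: without Numba, fall back to plain Python.
    def njit(func):
        return func


@njit
def keep(x):
    # Hypothetical condition function: keep even values.
    return x % 2 == 0


@njit
def filter_single_pass(arr):
    # One pass: write matches into a full-size buffer, then copy the used part.
    result = np.empty_like(arr)
    j = 0
    for i in range(arr.size):
        if keep(arr[i]):
            result[j] = arr[i]
            j += 1
    return result[:j].copy()


@njit
def filter_two_pass(arr):
    # First pass: count matches so the output can be allocated exactly.
    n = 0
    for i in range(arr.size):
        if keep(arr[i]):
            n += 1
    # Second pass: fill the exactly sized output.
    result = np.empty(n, dtype=arr.dtype)
    j = 0
    for i in range(arr.size):
        if keep(arr[i]):
            result[j] = arr[i]
            j += 1
    return result


data = np.arange(10)
single = filter_single_pass(data)
double = filter_two_pass(data)
```

The two-pass version evaluates the condition twice per element but never allocates more than the output needs, which is the memory trade-off the summary refers to.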