Performance of various numpy fancy indexing methods, also with numba

Your summary isn’t completely correct. You already ran tests with differently sized arrays, but one thing you didn’t do was vary the number of elements indexed. I restricted it to pure indexing and omitted take (which is effectively integer array indexing) as well as compress and extract (which are effectively boolean array indexing). The … Read more
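A minimal sketch of the equivalences mentioned above, assuming a small 1-D array; the names a, idx and mask are illustrative and do not come from the original benchmark:

```python
import numpy as np

a = np.arange(10) * 10          # sample data: 0, 10, ..., 90
idx = np.array([1, 4, 7])       # integer positions to pick
mask = a % 20 == 0              # boolean mask with the same length as a

# Integer array indexing and np.take select the same elements.
by_index = a[idx]
by_take = np.take(a, idx)

# Boolean array indexing, np.compress and np.extract agree for 1-D input.
by_mask = a[mask]
by_compress = np.compress(mask, a)
by_extract = np.extract(mask, a)
```

This only shows that the results match; it says nothing about their relative speed, which is what the answer goes on to measure.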

Why is numba faster than numpy here?

I think this question highlights (somewhat) the limitations of calling out to precompiled functions from a higher level language. Suppose in C++ you write something like: for (int i = 0; i != N; ++i) a[i] = b[i] + c[i] + 2 * d[i]; The compiler sees all this at compile time, the whole expression. … Read more
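The same contrast can be sketched in Python. This is an illustration under assumptions: the function names are hypothetical, and the loop is left undecorated so that it runs without Numba installed (with Numba, one would typically apply its njit decorator to the loop version).

```python
import numpy as np

def numpy_expression(b, c, d):
    # Each operator is a separate call into precompiled code, and the
    # intermediate results (2 * d, b + c) are stored in temporary arrays.
    return b + c + 2 * d

def fused_loop(b, c, d):
    # A single pass over the data with no temporaries. A JIT compiler
    # sees this whole loop at once, like the C++ compiler in the text.
    a = np.empty_like(b)
    for i in range(b.shape[0]):
        a[i] = b[i] + c[i] + 2 * d[i]
    return a

b = np.arange(5, dtype=np.float64)
c = np.ones(5)
d = np.full(5, 0.5)
```

In plain Python the loop is of course much slower than the NumPy expression; the point is only that both compute the same values while the loop form exposes the whole expression to a compiler.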

Matrix inversion without Numpy

Here is a more elegant and scalable solution, in my opinion. It works for any n×n matrix, and you may find the other methods useful too. Note that getMatrixInverse(m) takes an array of arrays as input. Please feel free to ask any questions. def transposeMatrix(m): return list(map(list, zip(*m))) def getMatrixMinor(m,i,j): return [row[:j] + row[j+1:] for row in … Read more
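Since the excerpt is cut off, here is a separate, self-contained sketch of inverting a list-of-lists matrix without NumPy. It uses Gauss-Jordan elimination with exact fractions, which is a different technique from the minor-based functions above; the name invert_gauss_jordan is hypothetical.

```python
from fractions import Fraction

def invert_gauss_jordan(m):
    """Invert a square list-of-lists matrix by Gauss-Jordan elimination."""
    n = len(m)
    # Build the augmented matrix [m | I] using exact rational arithmetic.
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        # Find a row at or below the diagonal with a non-zero pivot.
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Scale the pivot row so that the pivot becomes 1.
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    # The right half of the augmented matrix now holds the inverse.
    return [row[n:] for row in aug]

inverse = invert_gauss_jordan([[4, 7], [2, 6]])
```

Exact fractions avoid rounding error on small integer matrices; for floating-point input one would usually also pick the largest available pivot for numerical stability.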

Multiple output and numba signatures

You can use either explicit declarations or string declarations: Tuple with homogeneous types: @nb.jit(nb.types.UniTuple(nb.float64[:],2)(nb.float64[:]),nopython=True) def f(a) : return a,a @nb.jit('UniTuple(float64[:], 2)(float64[:])',nopython=True) def f(a) : return a,a Tuple with heterogeneous types: @nb.jit(nb.types.Tuple((nb.float64[:], nb.float64[:,:]))(nb.float64[:], nb.float64[:,:]),nopython=True) def f(a, b) : return a, b @nb.jit('Tuple((float64[:], float64[:,:]))(float64[:], float64[:,:])',nopython=True) def f(a, b) : return a, b Source : … Read more

Python numpy: cannot convert datetime64[ns] to datetime64[D] (to use with Numba)

Series.astype converts all date-like objects to datetime64[ns]. To convert to datetime64[D], use values to obtain a NumPy array before calling astype: dates_input = df["month_15"].values.astype('datetime64[D]') Note that NDFrames (such as Series and DataFrames) can only hold datetime-like objects as objects of dtype datetime64[ns]. The automatic conversion of all datetime-likes to a common dtype simplifies subsequent date … Read more
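The conversion step can be tried with NumPy alone. In this sketch, dates_ns is a hypothetical stand-in for the array that .values returns, with made-up sample dates:

```python
import numpy as np

# Stand-in for the nanosecond-precision array obtained from Series.values.
dates_ns = np.array(["2021-03-15T00:00:00", "2021-04-15T12:30:00"],
                    dtype="datetime64[ns]")

# Calling astype on the NumPy array changes the unit to days and drops
# the time-of-day component.
dates_d = dates_ns.astype("datetime64[D]")
```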

Improve Pandas Merge performance

set_index on merging column does indeed speed this up. Below is a slightly more realistic version of julien-marrec’s Answer. import pandas as pd import numpy as np myids=np.random.choice(np.arange(10000000), size=1000000, replace=False) df1 = pd.DataFrame(myids, columns=['A']) df1['B'] = np.random.randint(0,1000,(1000000)) df2 = pd.DataFrame(np.random.permutation(myids), columns=['A2']) df2['B2'] = np.random.randint(0,1000,(1000000)) %%timeit x = df1.merge(df2, how='left', left_on='A', right_on='A2') #1 loop, best of … Read more
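A scaled-down sketch of the two approaches being compared, assuming the index-based variant uses set_index followed by join (the excerpt is cut off before showing it). Sizes are reduced and no timings are claimed here:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
ids = rng.choice(np.arange(10_000), size=1_000, replace=False)

df1 = pd.DataFrame({"A": ids, "B": rng.integers(0, 1000, size=1_000)})
df2 = pd.DataFrame({"A2": rng.permutation(ids),
                    "B2": rng.integers(0, 1000, size=1_000)})

# Column-based merge, as in the timing above.
merged = df1.merge(df2, how="left", left_on="A", right_on="A2")

# Index-based join: make the key column the index on both sides first.
joined = df1.set_index("A").join(df2.set_index("A2"), how="left")
```

Both results contain the same rows in the same order; whether and by how much the index-based form is faster depends on data size and pandas version, so it is worth timing on your own data.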

Filtering (reducing) a NumPy Array

Summary: Using a loop-based approach with a single pass and copying, accelerated with Numba, offers the best overall trade-off in terms of speed, memory efficiency and flexibility. If the condition function executes sufficiently fast, a two-pass approach (filter2_nb()) may be faster, and it is more memory efficient regardless. Also, for sufficiently large inputs, resizing instead … Read more
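The two loop strategies can be sketched in plain Python as follows. The function names are hypothetical and are not the answer's filter2_nb(); the Numba acceleration is omitted so that the sketch runs with NumPy alone.

```python
import numpy as np

def filter_two_pass(arr, cond):
    # First pass: count the matches so the result can be allocated exactly.
    count = 0
    for x in arr:
        if cond(x):
            count += 1
    result = np.empty(count, dtype=arr.dtype)
    # Second pass: copy the matching elements.
    j = 0
    for x in arr:
        if cond(x):
            result[j] = x
            j += 1
    return result

def filter_single_pass(arr, cond):
    # Allocate for the worst case, fill in one pass, then copy the used
    # prefix so that the oversized buffer can be released.
    result = np.empty_like(arr)
    j = 0
    for x in arr:
        if cond(x):
            result[j] = x
            j += 1
    return result[:j].copy()

data = np.arange(10)
evens_two_pass = filter_two_pass(data, lambda x: x % 2 == 0)
evens_single_pass = filter_single_pass(data, lambda x: x % 2 == 0)
```

The two-pass version evaluates the condition twice per element but never allocates more than it needs, which matches the trade-off described in the summary.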
