Sum array by number in numpy

The NumPy function bincount was made for exactly this purpose, and it should be much faster than the other methods for all input sizes:

    data = [1, 2, 3, 4, 5, 6]
    ids = [0, 0, 1, 2, 2, 1]
    np.bincount(ids, weights=data)  # returns [3, 9, 9] as a float64 array

The i-th element of the output is the sum of all the data elements …
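As a self-contained sketch of the bincount approach described above, the snippet below sums data by group id. The `minlength` call is an extra illustration that is not part of the original answer; it assumes you want a fixed-length result even when some ids never occur.

```python
import numpy as np

data = np.array([1, 2, 3, 4, 5, 6])
ids = np.array([0, 0, 1, 2, 2, 1])  # group label for each element of data

# Sum the entries of data that share the same id.
sums = np.bincount(ids, weights=data)
print(sums)  # [3. 9. 9.]

# minlength (illustrative value) pads the result when some ids never occur.
padded = np.bincount(ids, weights=data, minlength=5)
print(padded)  # [3. 9. 9. 0. 0.]
```

Note that bincount expects non-negative integer ids, so arbitrary labels would first need to be mapped to 0, 1, 2, and so on.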

Converting numpy solution into dask (numpy indexing doesn’t work in dask)

Chunk random_days_panel instead of historical_data and use da.map_blocks:

    def dask_way(sim_count, sim_days, hist_days):
        # shared historical data
        # on a cluster you'd load this on each worker, e.g. from an NPZ file
        historical_data = np.random.normal(111.51, 10, size=hist_days)
        random_days_panel = da.random.randint(
            1, hist_days, size=(1, 1, sim_count, sim_days)
        )
        future_panel = da.map_blocks(
            lambda chunk: historical_data[chunk], random_days_panel, dtype=float
        )
        …
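The idea above can be tried on a toy problem. The following is only a sketch under assumptions: the array sizes, the 2-D panel shape, and the chunking are made up for illustration and do not come from the original answer, which is truncated.

```python
import numpy as np
import dask.array as da

hist_days = 250              # illustrative sizes, not from the original
sim_count, sim_days = 8, 5

# Small in-memory lookup table shared by every block.
historical_data = np.random.normal(111.51, 10, size=hist_days)

# Lazily generated indices, chunked along the simulation axis.
random_days_panel = da.random.randint(
    0, hist_days, size=(sim_count, sim_days), chunks=(4, sim_days)
)

# Each block arrives as a plain NumPy array, so fancy indexing works.
future_panel = da.map_blocks(
    lambda chunk: historical_data[chunk], random_days_panel, dtype=float
)

result = future_panel.compute()
print(result.shape)  # (8, 5)
```

The design choice is to keep the small lookup table as an ordinary NumPy array and chunk only the large index array, so the indexing happens inside each block rather than on a dask array.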

Embedding Python in MATLAB

Try approaching the problem from the Python side. Python is a great glue language, so I would suggest having Python run your MATLAB and C programs. Python has NumPy, PyLab, Matplotlib, and IPython; together they are a good alternative to almost any existing MATLAB module.
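One way to use Python as glue is to launch external programs with the standard subprocess module. This is a minimal sketch: `run_tool` is a hypothetical helper, and the Python interpreter stands in for a compiled C program or a MATLAB call so the example runs anywhere.

```python
import subprocess
import sys

def run_tool(command):
    """Run an external program and return its standard output as text."""
    completed = subprocess.run(
        command, capture_output=True, text=True, check=True
    )
    return completed.stdout.strip()

# Stand-in for an external program: the Python interpreter itself.
# With MATLAB on the PATH, the command might instead look like
# ["matlab", "-batch", "disp(6*7)"] (an assumption about the local install).
output = run_tool([sys.executable, "-c", "print(6 * 7)"])
print(output)  # 42
```

Because `check=True` raises an exception on a non-zero exit code, failures in the external tool surface in the Python driver instead of passing silently.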

Making Int64 the default integer dtype instead of standard int64 in pandas

You could use a function like this:

    def nan_ints(df, convert_strings=False, subset=None):
        types = ["int64", "float64"]
        if subset is None:
            subset = list(df)
        if convert_strings:
            types.append("object")
        for col in subset:
            if df[col].dtype in types:
                df[col] = (
                    df[col].astype(float, errors="ignore").astype("Int64", errors="ignore")
                )
        return df

It iterates through each column and converts it to an Int64 if it …
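For comparison, here is a small, runnable variation on the same idea. It is a sketch and not the original function: `to_nullable_ints` is a hypothetical name, it works on a copy, and it assumes you only want to convert numeric columns whose non-missing values are all whole numbers.

```python
import numpy as np
import pandas as pd

def to_nullable_ints(df, subset=None):
    """Return a copy with whole-number numeric columns cast to nullable Int64."""
    out = df.copy()
    columns = list(out) if subset is None else subset
    for col in columns:
        series = out[col]
        is_numeric = (
            pd.api.types.is_integer_dtype(series)
            or pd.api.types.is_float_dtype(series)
        )
        if is_numeric:
            non_missing = series.dropna()
            # Only cast when every non-missing value is a whole number.
            if (non_missing == non_missing.round()).all():
                out[col] = series.astype("Int64")
    return out

df = pd.DataFrame({
    "a": [1, 2, 3],
    "b": [1.0, np.nan, 3.0],   # whole numbers with a gap
    "c": [0.5, 1.5, 2.0],      # genuine floats, left alone
    "d": ["x", "y", "z"],      # strings, left alone
})
converted = to_nullable_ints(df)
print(converted.dtypes)
```

Checking for whole numbers up front avoids relying on `errors="ignore"` to silently skip columns that cannot be converted.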

Custom transformer for sklearn Pipeline that alters both X and y

Modifying the sample axis, e.g. removing samples, does not (yet?) comply with the scikit-learn transformer API. So if you need to do this, you should do it outside any calls to scikit-learn, as preprocessing. As it stands, the transformer API is used to transform the features of a given sample into something new. …
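The advice to handle sample removal as preprocessing might look like the sketch below. The toy data and the choice of dropping rows with missing features are assumptions made purely for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X = np.array([[1.0], [2.0], [np.nan], [4.0], [5.0]])
y = np.array([2.0, 4.0, 6.0, 8.0, 10.0])

# Preprocessing outside scikit-learn: drop rows with a missing feature
# from X and y together so they stay aligned.
keep = ~np.isnan(X).any(axis=1)
X_clean, y_clean = X[keep], y[keep]

# The pipeline itself only transforms features; the sample count is unchanged.
model = Pipeline([("scale", StandardScaler()), ("reg", LinearRegression())])
model.fit(X_clean, y_clean)

print(X_clean.shape, y_clean.shape)  # (4, 1) (4,)
```

Applying one boolean mask to both arrays is what keeps X and y consistent, which is exactly the step a transformer inside the pipeline cannot perform.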

Select N evenly spaced out elements in array, including first and last

To get a list of evenly spaced indices, use np.linspace:

    idx = np.round(np.linspace(0, len(arr) - 1, numElems)).astype(int)

Next, index back into arr to get the corresponding values:

    arr[idx]

Always use rounding before casting to integers. Internally, linspace calls astype when the dtype argument is provided. Therefore, this method is NOT equivalent to:

    # this simply …
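A small worked example makes the difference between rounding and plain casting visible. The array contents and the value of `numElems` are illustrative choices, not taken from the original answer.

```python
import numpy as np

arr = np.arange(100, 111)   # 11 elements: 100, 101, ..., 110
numElems = 4

positions = np.linspace(0, len(arr) - 1, numElems)  # 0, 3.33..., 6.66..., 10

# Round first, then cast: picks the nearest index to each position.
idx_rounded = np.round(positions).astype(int)
print(idx_rounded)       # [ 0  3  7 10]
print(arr[idx_rounded])  # [100 103 107 110]

# Casting without rounding truncates, so 6.66... becomes 6 instead of 7.
idx_truncated = positions.astype(int)
print(idx_truncated)     # [ 0  3  6 10]
```

Both variants include the first and last elements; they differ only in which interior indices are chosen.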
