binning
Resize with averaging or rebin a NumPy 2D array
Here’s an example based on the answer you’ve linked (for clarity):

    >>> import numpy as np
    >>> a = np.arange(24).reshape((4,6))
    >>> a
    array([[ 0,  1,  2,  3,  4,  5],
           [ 6,  7,  8,  9, 10, 11],
           [12, 13, 14, 15, 16, 17],
           [18, 19, 20, 21, 22, 23]])
    >>> a.reshape((2,a.shape[0]//2,3,-1)).mean(axis=3).mean(1)
    array([[ 3.5, 5.5, 7.5],
           [ … Read more
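The reshape-and-average idea in this excerpt can be wrapped in a small helper. The following is a minimal sketch, not code from the linked answer: the function name `rebin` and its `new_shape` parameter are hypothetical, and it assumes each dimension of the input is an exact multiple of the target dimension.

```python
import numpy as np

def rebin(a, new_shape):
    """Downsample a 2D array to new_shape by averaging equal-sized blocks.

    Hypothetical helper; assumes each input dimension is an exact
    multiple of the corresponding output dimension.
    """
    rows, cols = new_shape
    if a.shape[0] % rows or a.shape[1] % cols:
        raise ValueError("input shape must be divisible by new_shape")
    # Split each axis into (number of blocks, block size), then average
    # over the two block-size axes in one call.
    blocks = a.reshape(rows, a.shape[0] // rows, cols, a.shape[1] // cols)
    return blocks.mean(axis=(1, 3))

a = np.arange(24).reshape((4, 6))
result = rebin(a, (2, 3))
```

For the 4-by-6 array above, each output cell is the mean of one 2-by-2 block, so the first row of `result` is 3.5, 5.5, 7.5, which matches the output shown in the excerpt.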
Getting data for histogram plot
This is a post about a super quick-and-dirty way to create a histogram in MySQL for numeric values. There are multiple other ways to create histograms that are better and more flexible, using CASE statements and other types of complex logic. This method wins me over time and time again since it’s just so easy … Read more
Pandas: convert categories to numbers
First, change the type of the column:

    df.cc = pd.Categorical(df.cc)

Now the data look similar but are stored categorically. To capture the category codes:

    df['code'] = df.cc.cat.codes

Now you have:

       cc  temp  code
    0  US  37.0     2
    1  CA  12.0     1
    2  US  35.0     2
    3  AU  20.0     0

If you don’t want to modify … Read more
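As a self-contained sketch of the two steps above, the snippet below rebuilds the small table from the excerpt and derives the codes. The variable name `code_to_category` is an illustrative addition, not part of the original answer; it simply records which integer stands for which category.

```python
import pandas as pd

# Sample data taken from the table in the excerpt.
df = pd.DataFrame({"cc": ["US", "CA", "US", "AU"],
                   "temp": [37.0, 12.0, 35.0, 20.0]})

# Store the column as a categorical; string categories are sorted,
# so AU -> 0, CA -> 1, US -> 2.
df["cc"] = pd.Categorical(df["cc"])
df["code"] = df["cc"].cat.codes

# Illustrative lookup from integer code back to category label.
code_to_category = dict(enumerate(df["cc"].cat.categories))
```

Keeping a mapping like this around makes it possible to translate the integer codes back into labels later on.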
Binning a column with pandas
You can use pandas.cut:

    bins = [0, 1, 5, 10, 25, 50, 100]
    df['binned'] = pd.cut(df['percentage'], bins)
    print(df)

       percentage     binned
    0       46.50   (25, 50]
    1       44.20   (25, 50]
    2      100.00  (50, 100]
    3       42.12   (25, 50]

    bins = [0, 1, 5, 10, 25, 50, 100]
    labels = [1,2,3,4,5,6]
    df['binned'] = pd.cut(df['percentage'], bins=bins, labels=labels)
    print … Read more
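A runnable sketch of the labelled variant, using the four percentages from the excerpt. The `edge` series is an added illustration rather than part of the original answer: it shows that, with the default settings, intervals are closed on the right, so a value equal to the lowest edge falls outside every bin.

```python
import pandas as pd

# Sample values taken from the excerpt.
df = pd.DataFrame({"percentage": [46.50, 44.20, 100.00, 42.12]})

bins = [0, 1, 5, 10, 25, 50, 100]
labels = [1, 2, 3, 4, 5, 6]

# Each value gets the label of the interval (left, right] it falls into.
df["binned"] = pd.cut(df["percentage"], bins=bins, labels=labels)

# Illustration of the default edge behaviour: 0.0 is not inside (0, 1],
# so it becomes NaN, while 1.0 lands in the first bin.
edge = pd.cut(pd.Series([0.0, 1.0]), bins=bins, labels=labels)
```

If values equal to the lowest edge should be kept, `pd.cut` accepts `include_lowest=True`.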
Histogram using gnuplot?
Yes, and it's quick and simple, though very hidden:

    binwidth=5
    bin(x,width)=width*floor(x/width)
    plot 'datafile' using (bin($1,binwidth)):(1.0) smooth freq with boxes

Check out help smooth freq to see why the above makes a histogram. To deal with ranges, just set the xrange variable.
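To see what the gnuplot one-liner computes, here is a rough Python equivalent, offered as a sketch and not as part of the original answer. The function `bin_left_edge` mirrors the `bin(x,width)` definition, and the `Counter` stands in for `smooth freq`, which sums the y values (here always 1.0) of points that share the same x. The sample `values` are made up for illustration.

```python
import math
from collections import Counter

def bin_left_edge(x, width):
    """Map x to the left edge of its bin, like bin(x,width) in gnuplot."""
    return width * math.floor(x / width)

# Made-up sample data, for illustration only.
values = [1.0, 3.5, 5.0, 7.2, 9.9, 12.0]
binwidth = 5

# Count how many values fall into each bin; this plays the role of
# summing 1.0 per data point with "smooth freq".
counts = Counter(bin_left_edge(v, binwidth) for v in values)
```

Each key in `counts` is a bin's left edge, so boxes drawn at these x positions sit at the left edge of each bin and not at its centre.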