How to group a Series by values in pandas?
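Pass the Series itself as the grouping key: grouped = s.groupby(s) Or group by a function of the index labels that looks up each value: grouped = s.groupby(lambda x: s[x])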
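As a minimal sketch of grouping a Series by its own values (the sample data and the names `s`, `grouped`, `sizes`, and `sizes_fn` are assumptions made up for illustration):

```python
import pandas as pd

# Hypothetical sample data; string labels keep s[x] an unambiguous label lookup.
s = pd.Series([1, 2, 1, 3, 2, 1], index=list("abcdef"))

# Group the Series by its own values.
grouped = s.groupby(s)
sizes = grouped.size()

# Same grouping, expressed as a function applied to each index label.
grouped_fn = s.groupby(lambda x: s[x])
sizes_fn = grouped_fn.size()

print(sizes)
```

Both forms should produce the same groups; passing the Series directly avoids calling a Python function once per label.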
If you don’t want to show the series names in the legend, you can disable them by setting showInLegend: false. Example: series: [{ showInLegend: false, name: "<b><?php echo $title; ?></b>", data: [<?php echo $yaxis; ?>] }]
Try this: dataframe[column].value_counts().index.tolist() This returns the distinct values, most frequent first, for example: ['apple', 'sausage', 'banana', 'cheese']
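A small sketch of the `value_counts().index.tolist()` idiom, assuming a made-up DataFrame with a hypothetical column named `food` (the counts are chosen to be distinct so the ordering is unambiguous):

```python
import pandas as pd

# Hypothetical data: apple x4, sausage x3, banana x2, cheese x1.
dataframe = pd.DataFrame({
    "food": ["apple"] * 4 + ["sausage"] * 3 + ["banana"] * 2 + ["cheese"],
})
column = "food"

# value_counts() sorts by frequency (descending); its index holds the distinct values.
values_by_frequency = dataframe[column].value_counts().index.tolist()
print(values_by_frequency)
```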
You can use NumPy’s built-in functions to do this: np.ceil(series) or np.floor(series). Both return a Series (not an array), so the index information is preserved.
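To illustrate that the index survives, here is a short sketch with made-up values and labels (the names `series`, `ceiled`, and `floored` are assumptions for this example):

```python
import numpy as np
import pandas as pd

# Hypothetical Series with a non-default index.
series = pd.Series([1.2, 2.5, 3.7], index=["a", "b", "c"])

# NumPy ufuncs applied to a Series return a Series with the same index.
ceiled = np.ceil(series)
floored = np.floor(series)

print(ceiled)
print(floored)
```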
Explicit is better than implicit. df[boolean_mask] selects rows where boolean_mask is True, but there is a corner case when you might not want it to: when df has boolean-valued column labels: In [229]: df = pd.DataFrame({True:[1,2,3],False:[3,4,5]}); df Out[229]: False True 0 3 1 1 4 2 2 5 3 You might want to use df[[True]] …
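As a sketch of the explicit form of row selection, the following uses `df.loc[mask]` on a made-up DataFrame with ordinary string column labels (the data and the names `mask`, `implicit`, and `explicit` are assumptions; the boolean-label corner case itself is not reproduced here):

```python
import pandas as pd

# Hypothetical data with plain string column labels.
df = pd.DataFrame({"a": [1, 2, 3], "b": [3, 4, 5]})

# A boolean Series aligned with the rows of df.
mask = df["a"] > 1

implicit = df[mask]      # row selection inferred from the boolean mask
explicit = df.loc[mask]  # row selection stated explicitly

print(explicit)
```

With `loc`, the intent to select rows is spelled out rather than inferred from the type of the key.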
Use iloc to access by position (rather than label): In [11]: df = pd.DataFrame([[1, 2], [3, 4]], ['a', 'b'], ['A', 'B']) In [12]: df Out[12]: A B a 1 2 b 3 4 In [13]: df.iloc[0] # first row in a DataFrame Out[13]: A 1 B 2 Name: a, dtype: int64 In [14]: df['A'].iloc[0] # …
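The positional lookups can be sketched as a script, reusing the small DataFrame from the excerpt (the variable names `first_row` and `first_value` are assumptions added for illustration):

```python
import pandas as pd

# Same shape of data as the excerpt: rows labelled 'a', 'b'; columns 'A', 'B'.
df = pd.DataFrame([[1, 2], [3, 4]], index=["a", "b"], columns=["A", "B"])

# Position 0 along the rows, regardless of the row labels.
first_row = df.iloc[0]

# Position 0 within a single column.
first_value = df["A"].iloc[0]

print(first_row)
print(first_value)
```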
First, change the type of the column: df.cc = pd.Categorical(df.cc) Now the data look similar but are stored categorically. To capture the category codes: df['code'] = df.cc.cat.codes Now you have: cc temp code 0 US 37.0 2 1 CA 12.0 1 2 US 35.0 2 3 AU 20.0 0 If you don’t want to modify …
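The two steps can be sketched end to end as follows; the input DataFrame is reconstructed from the table shown in the excerpt, and bracket access is used in place of attribute access purely as a stylistic assumption:

```python
import pandas as pd

# Data reconstructed from the table in the excerpt.
df = pd.DataFrame({
    "cc": ["US", "CA", "US", "AU"],
    "temp": [37.0, 12.0, 35.0, 20.0],
})

# Step 1: store the column as a categorical.
df["cc"] = pd.Categorical(df["cc"])

# Step 2: capture the integer code of each category.
df["code"] = df["cc"].cat.codes

print(df)
```

The categories are inferred in sorted order here (AU, CA, US), which is why US maps to 2 and AU to 0.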
Granted, the behavior is inconsistent, but I think it’s easy to imagine cases where it is convenient. Anyway, to get a DataFrame every time, just pass a list to loc. There are other ways, but in my opinion this is the cleanest. In [2]: type(df.loc[[3]]) Out[2]: pandas.core.frame.DataFrame In [3]: type(df.loc[[1]]) Out[3]: pandas.core.frame.DataFrame
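A minimal sketch of the inconsistency and the list-based fix, assuming a made-up DataFrame whose index contains a duplicated label (the data and variable names are hypothetical):

```python
import pandas as pd

# Hypothetical frame: label 1 is unique, label 3 appears twice.
df = pd.DataFrame({"x": [10, 20, 30]}, index=[1, 3, 3])

scalar_unique = df.loc[1]     # unique label -> Series
scalar_duplicate = df.loc[3]  # duplicated label -> DataFrame

# Passing a list returns a DataFrame in both cases.
list_unique = df.loc[[1]]
list_duplicate = df.loc[[3]]

print(type(scalar_unique), type(scalar_duplicate))
print(type(list_unique), type(list_duplicate))
```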
You can squeeze the single-row DataFrame along axis 0 to get a Series (the inverse of to_frame); no transpose is needed. >>> df = pd.DataFrame([list(range(5))], columns=["a{}".format(i) for i in range(5)]) >>> df.squeeze(axis=0) a0 0 a1 1 a2 2 a3 3 a4 4 Name: 0, dtype: int64 Note: To accommodate the point raised by …
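The same conversion as a runnable sketch, using the single-row DataFrame from the excerpt (the variable name `row` is an assumption added for illustration):

```python
import pandas as pd

# Single-row DataFrame with columns a0..a4, as in the excerpt.
df = pd.DataFrame([list(range(5))], columns=["a{}".format(i) for i in range(5)])

# Collapse the length-1 row axis; the column labels become the Series index.
row = df.squeeze(axis=0)

print(row)
```

Passing `axis=0` restricts the squeeze to the row axis, so the result stays a Series even if the frame also happens to have only one column.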
The dtype object comes from NumPy; it describes the type of the elements in an ndarray. Every element in an ndarray must have the same size in bytes. For int64 and float64, that is 8 bytes. But for strings, the length of the string is not fixed. So instead of saving the bytes of strings in …
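The fixed-size point can be checked with a short sketch; the arrays are made up, and `dtype=object` is passed explicitly as an assumption so the result does not depend on the default string dtype of the installed pandas version:

```python
import numpy as np
import pandas as pd

# Fixed-width numeric dtypes: every element occupies the same number of bytes.
ints = np.array([1, 2, 3], dtype=np.int64)
floats = np.array([1.0, 2.0], dtype=np.float64)

# Strings of different lengths held as generic Python objects.
words = np.array(["a", "pandas", "ndarray"], dtype=object)
s = pd.Series(["a", "pandas"], dtype=object)

print(ints.itemsize, floats.itemsize)  # bytes per element
print(words.dtype, s.dtype)
```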