How can I group time by hour or by 10 minutes?
Finally done with:

```sql
GROUP BY DATEPART(YEAR, DT.[Date]),
         DATEPART(MONTH, DT.[Date]),
         DATEPART(DAY, DT.[Date]),
         DATEPART(HOUR, DT.[Date]),
         (DATEPART(MINUTE, DT.[Date]) / 10)
```
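The same hour-or-10-minute bucketing can be sketched in pandas with `pd.Grouper` (a minimal sketch with made-up sample data; the column name `dt` is an assumption):

```python
import pandas as pd

# Hypothetical timestamps spanning about half an hour.
df = pd.DataFrame({
    "dt": pd.to_datetime([
        "2024-01-01 10:03", "2024-01-01 10:07",
        "2024-01-01 10:14", "2024-01-01 10:31",
    ]),
    "value": [1, 2, 3, 4],
})

# Group into 10-minute buckets; freq="h" would bucket by hour instead.
counts = df.groupby(pd.Grouper(key="dt", freq="10min"))["value"].count()
```

Like the SQL integer division on `DATEPART(MINUTE, ...)`, each timestamp is floored to the start of its 10-minute window; empty windows appear with a count of 0.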
Try this:

```r
mtcars %>%
  group_by(am, gear) %>%
  summarise(n = n()) %>%
  mutate(freq = n / sum(n))

#   am gear  n      freq
# 1  0    3 15 0.7894737
# 2  0    4  4 0.2105263
# 3  1    4  8 0.6153846
# 4  1    5  5 0.3846154
```

From the dplyr vignette: when you group by multiple variables, …
Use the HAVING clause and GROUP BY the fields that make the row unique. The query below finds all users that have more than one payment per day with the same account number:

```sql
SELECT user_id, COUNT(*) AS count
FROM PAYMENT
GROUP BY account, user_id, date
HAVING COUNT(*) > 1
```

Update: If you want to …
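The same HAVING-style filter can be sketched in pandas (hypothetical column names mirroring the SQL above):

```python
import pandas as pd

# Hypothetical payments table: one row per payment.
payments = pd.DataFrame({
    "user_id": [1, 1, 2, 3, 3],
    "account": ["A", "A", "B", "C", "D"],
    "date":    ["2024-01-01"] * 5,
})

# Count rows per (account, user_id, date) group ...
counts = payments.groupby(["account", "user_id", "date"]).size()

# ... and keep only groups with more than one payment (the HAVING step).
dupes = counts[counts > 1]
```

`size()` plays the role of `COUNT(*)`, and the boolean filter on the result is the pandas analogue of `HAVING COUNT(*) > 1`.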
Did you try

```python
df.groupby('id').head(2)
```

Output generated:

```
       id  value
id
1  0    1      1
   1    1      2
2  3    2      1
   4    2      2
3  7    3      1
4  8    4      1
```

(Keep in mind that you might need to order/sort before, depending on your data.)

EDIT: As mentioned by the questioner, use `df.groupby('id').head(2).reset_index(drop=True)` to remove …
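A self-contained version of the idea above, with made-up data (the frame shape is an assumption):

```python
import pandas as pd

# Hypothetical frame with several rows per id.
df = pd.DataFrame({
    "id":    [1, 1, 1, 2, 2, 3, 4],
    "value": [1, 2, 3, 1, 2, 1, 1],
})

# First two rows of each id group, with a flat 0..n-1 index.
top2 = df.groupby("id").head(2).reset_index(drop=True)
```

`head(2)` keeps at most two rows per group in their original order, so sorting first controls which rows survive.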
```python
>>> df.groupby('id').first()
     value
id
1    first
2    first
3    first
4   second
5    first
6    first
7   fourth
```

If you need id as column:

```python
>>> df.groupby('id').first().reset_index()
   id   value
0   1   first
1   2   first
2   3   first
3   4  second
4   5   first
5   6   first
6   7  fourth
```

To get n first records, you …
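A runnable sketch of `first()` on hypothetical data shaped like the output above (the exact rows are assumptions chosen to reproduce it):

```python
import pandas as pd

# Hypothetical frame: ids 4 and 7 start at a later "value" than the others.
df = pd.DataFrame({
    "id":    [1, 2, 3, 4, 4, 5, 6, 7, 7],
    "value": ["first", "first", "first", "second", "third",
              "first", "first", "fourth", "fifth"],
})

# First non-null value per group, with id restored as a regular column.
firsts = df.groupby("id").first().reset_index()
```

Note that `first()` returns the first non-null value per column within each group, which can differ from `nth(0)` when a group's first row contains NaN.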
pandas >= 1.1

From pandas 1.1 you have better control over this behavior: NA values are now allowed in the grouper using dropna=False:

```python
pd.__version__
# '1.1.0.dev0+2004.g8d10bfb6f'

# Example from the docs
df

   a    b  c
0  1  2.0  3
1  1  NaN  4
2  2  1.0  3
3  1  2.0  2

# without NA (the …
```
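A runnable comparison of the two behaviors, using the docs example above:

```python
import numpy as np
import pandas as pd

# Same frame as the docs example above.
df = pd.DataFrame({"a": [1, 1, 2, 1],
                   "b": [2.0, np.nan, 1.0, 3.0],
                   "c": [3, 4, 3, 2]})

# Default dropna=True: the row with b == NaN is silently excluded,
# so only the keys 1.0 and 3.0 appear (plus 2.0).
with_drop = df.groupby("b").sum()

# dropna=False (pandas >= 1.1): NaN becomes its own group.
without_drop = df.groupby("b", dropna=False).sum()
```

The second result has one extra row, keyed by NaN, carrying the sums of the otherwise-dropped rows.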
```sql
SELECT t1.ks, t1.[# Tasks], COALESCE(t2.[# Late], 0) AS [# Late]
FROM (SELECT ks, COUNT(*) AS '# Tasks'
      FROM Table
      GROUP BY ks) t1
LEFT JOIN (SELECT ks, COUNT(*) AS '# Late'
           FROM Table
           WHERE Age > Palt
           GROUP BY ks) t2
  ON (t1.ks = t2.ks);
```
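The same total-count vs. filtered-count pattern can be sketched in pandas (hypothetical data; the column names mirror the SQL above):

```python
import pandas as pd

# Hypothetical task table: one row per task.
tasks = pd.DataFrame({
    "ks":   ["A", "A", "B", "B", "B"],
    "Age":  [5, 9, 3, 4, 7],
    "Palt": [6, 6, 4, 5, 9],
})

# Total tasks per ks, and late tasks (Age > Palt) per ks.
total = tasks.groupby("ks").size().rename("# Tasks")
late = tasks[tasks["Age"] > tasks["Palt"]].groupby("ks").size().rename("# Late")

# Outer alignment plays the LEFT JOIN role; fillna is the COALESCE,
# so groups with no late tasks get 0 instead of dropping out.
result = pd.concat([total, late], axis=1).fillna({"# Late": 0})
```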
Update 2022-03

This answer by caner using transform looks much better than my original answer!

```python
df['sales'] / df.groupby('state')['sales'].transform('sum')
```

Thanks to this comment by Paul Rougieux for surfacing it.

Original Answer (2014)

Paul H's answer is right that you will have to make a second groupby object, but you can calculate the percentage in a simpler …
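A self-contained sketch of the `transform` one-liner (the `state`/`sales` data is made up for illustration):

```python
import pandas as pd

# Hypothetical sales-by-state frame.
df = pd.DataFrame({
    "state": ["CA", "CA", "NY", "NY"],
    "sales": [10, 30, 20, 20],
})

# transform('sum') broadcasts each group's total back to every row,
# so the division is index-aligned and needs no merge.
df["pct"] = df["sales"] / df.groupby("state")["sales"].transform("sum")
```

Within each state the `pct` column sums to 1, which is a quick sanity check for this pattern.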
```sql
GROUP BY col1, col2, col3
```
You could also just do it in one go, by doing the sort first and using head to take the first 3 of each group:

```python
In [34]: df.sort_values(['job', 'count'], ascending=False).groupby('job').head(3)
Out[34]:
   count     job source
4      7   sales      E
2      6   sales      C
1      4   sales      B
5      5  market      A
8      4  market      D
6      3  market      B
```
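A self-contained version of the one-liner above, with sample data chosen to reproduce its output (the exact rows are an assumption):

```python
import pandas as pd

# Hypothetical frame matching the columns in the answer above.
df = pd.DataFrame({
    "count":  [2, 4, 6, 3, 7, 5, 3, 2, 4],
    "job":    ["sales", "sales", "sales", "sales", "sales",
               "market", "market", "market", "market"],
    "source": ["A", "B", "C", "D", "E", "A", "B", "C", "D"],
})

# Sort descending on (job, count), then keep the first 3 rows per job;
# head() preserves the sorted order within each group.
top3 = df.sort_values(["job", "count"], ascending=False).groupby("job").head(3)
```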