Mongodb aggregation pipeline how to limit a group push

Suppose the bottom left coordinates and the upper right coordinates are respectively [0, 0] and [100, 100]. From MongoDB 3.2 you can use the $slice operator to return a subset of an array which is what you want. db.collection.aggregate([ { “$match”: { “loc”: { “$geoWithin”: { “$box”: [ [0, 0], [100, 100] ] } }} … Read more

Grouping Python tuple list

itertools.groupby can do what you want: import itertools import operator L = [(‘grape’, 100), (‘grape’, 3), (‘apple’, 15), (‘apple’, 10), (‘apple’, 4), (‘banana’, 3)] def accumulate(l): it = itertools.groupby(l, operator.itemgetter(0)) for key, subiter in it: yield key, sum(item[1] for item in subiter) print(list(accumulate(L))) # [(‘grape’, 103), (‘apple’, 29), (‘banana’, 3)]

Oracle: How to count null and non-null rows

COUNT(expr) will count the number of rows where expr is not null, thus you can count the number of nulls with expressions like these: SELECT count(a) nb_a_not_null, count(b) nb_b_not_null, count(*) – count(a) nb_a_null, count(*) – count(b) nb_b_null, count(case when a is not null and b is not null then 1 end)nb_a_b_not_null count(case when a is … Read more

How to group dates by week?

The fundamental question here is how to project a DateTime instance into a week of year value. This can be done using by calling Calendar.GetWeekOfYear. So define the projection: Func<DateTime, int> weekProjector = d => CultureInfo.CurrentCulture.Calendar.GetWeekOfYear( d, CalendarWeekRule.FirstFourDayWeek, DayOfWeek.Sunday); You can configure exactly how the “week number” is determined by tweaking the parameters in the … Read more

How to generate a train-test-split based on a group id?

I figured out the answer. This seems to work: from sklearn.model_selection import GroupShuffleSplit splitter = GroupShuffleSplit(test_size=.20, n_splits=2, random_state = 7) split = splitter.split(df, groups=df[‘Group_Id’]) train_inds, test_inds = next(split) train = df.iloc[train_inds] test = df.iloc[test_inds]

tech