Here’s one solution (assuming the input is a stream of valid JSON objects, and that you invoke jq with the -s option):
map({ItemId: .Properties.ItmId}) # extract the ItmId values
| group_by(.ItemId) # group by "ItemId"
| map({ItemId: .[0].ItemId, Count: length}) # store the counts
| .[] # convert to a stream
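A minimal end-to-end run of the above, using made-up sample input (the ItmId values here are purely illustrative):

```shell
# Hypothetical sample input: three objects, two sharing one ItmId.
cat > /tmp/items.json <<'EOF'
{"Properties": {"ItmId": "1694738780"}}
{"Properties": {"ItmId": "1694738780"}}
{"Properties": {"ItmId": "1347809133"}}
EOF

# -s slurps the stream into an array; -c prints compact output.
jq -c -s '
  map({ItemId: .Properties.ItmId})
  | group_by(.ItemId)
  | map({ItemId: .[0].ItemId, Count: length})
  | .[]
' /tmp/items.json
```

Note that group_by sorts the groups by the grouping key, so the output stream is ordered by ItemId rather than by first appearance.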
A slightly more memory-efficient approach would be to use inputs, if your jq has it; in that case, use -n instead of -s, and replace the first line above with: [inputs | {ItemId: .Properties.ItmId}]
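The -n/inputs variant in full, again with made-up sample input fed on stdin:

```shell
# Feed a stream of objects on stdin; -n prevents jq from consuming
# the first input itself, so inputs sees the whole stream.
printf '%s\n' \
  '{"Properties":{"ItmId":"a1"}}' \
  '{"Properties":{"ItmId":"a1"}}' \
  '{"Properties":{"ItmId":"b2"}}' |
jq -n -c '
  [inputs | {ItemId: .Properties.ItmId}]
  | group_by(.ItemId)
  | map({ItemId: .[0].ItemId, Count: length})
  | .[]
'
```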
Efficient solution
The above solutions use the built-in group_by, which is convenient but leads to easily avoided inefficiencies. Using the following counter makes it easy to write a very efficient solution:
def counter(stream):
  reduce stream as $s ({}; .[$s|tostring] += 1);
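A quick sanity check of counter on its own, applied to a hypothetical stream of strings:

```shell
# counter/1 builds a dictionary in a single pass over the stream;
# keys are the tostring-ed values, values are the occurrence counts.
echo '"a" "b" "a"' | jq -n -c '
  def counter(stream): reduce stream as $s ({}; .[$s|tostring] += 1);
  counter(inputs)
'
```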
Using the -n command-line option, apply it as follows:
counter(inputs | .Properties.ItmId)
this leads to a dictionary of counts:
{
"1694738780": 1,
"1347809133": 1
}
Such a dictionary is probably more useful than a stream of singleton objects as envisioned by the OP, but if such a stream is needed, one can modify the above as follows:
counter(inputs | .Properties.ItmId)
| to_entries[]
| {ItemId: (.key), Count: .value}
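Putting it all together as one invocation, again on made-up sample input. Note that, unlike the group_by solutions, this preserves the order in which the ItmId values were first seen:

```shell
# Hypothetical sample input.
cat > /tmp/items2.json <<'EOF'
{"Properties": {"ItmId": "1694738780"}}
{"Properties": {"ItmId": "1347809133"}}
{"Properties": {"ItmId": "1347809133"}}
EOF

jq -n -c '
  def counter(stream): reduce stream as $s ({}; .[$s|tostring] += 1);
  counter(inputs | .Properties.ItmId)
  | to_entries[]
  | {ItemId: .key, Count: .value}
' /tmp/items2.json
```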