Getting per-category counts of list items with jq

I'm currently learning how to use jq in Linux shell scripts, since I'm developing custom checks for Check_MK (a monitoring system that originated as a Nagios extension) and my application (qBittorrent, through its WebUI API) returns JSON strings.

Currently, I can already count the total number of torrents with a simple jq length. Now I would like to count the number of torrents that are currently downloading, seeding or paused. I'm only interested in the state, so if I have 6 torrents, my JSON could look like this:

[
  {
    "state": "uploading"
  },
  {
    "state": "downloading"
  },
  {
    "state": "downloading"
  },
  {
    "state": "downloading"
  },
  {
    "state": "pauseDL"
  },
  {
    "state": "pauseUP"
  }
]

Here, jq length returns 6. What do I need to do to get the details such as 3 are downloading, 1 is uploading, 2 are paused and 0 are in error?

Here is my actual script:

#!/bin/sh
curl -s http://localhost:8080/query/torrents -o /tmp/torrents.json
count=$(jq length /tmp/torrents.json)
echo "0 qbt_Nb_torrents - $count"

The syntax for the echo is required by Check_MK (as explained here).

I've read multiple examples on filters but they all seem to be working when we're filtering through the top-level attributes. Here, my top level is basically just [0], ..., [5], so it doesn't work with the examples I've found in the manual.

Additional information

The WebUI API says there are 12 different possible states. This is how I intend to group them:

downloading: queuedDL, checkingDL, downloading 
uploading: queuedUP, checkingUP, uploading 
pause: pausedUP, pausedDL 
error: error 
stalled: stalledUP, stalledDL, metaDL

As per the CheckMK syntax, I need to basically output something like:

0 qbt_Nb_torrents - 6 total, 3 downloading, 1 seeding, 2 on pause, 0 stalled, 0 error

The first 0 at the beginning means an OK status for CheckMK. If there are any stalled torrents, I want that status to become 1, and if there is any torrent in error, the status becomes 2. Example:

2 qbt_Nb_torrents - 8 total, 3 downloading, 1 seeding, 2 on pause, 1 stalled, 1 error

For others with related questions, but not sharing the OP's specific requirements: See edit history! There are several other relevant proposals, including group_by use, in prior iterations of this answer.
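
For readers who only need the narrow grouped count mentioned above, here is a minimal sketch of the group_by idea. It is not a copy of the earlier revisions of this answer. The helper name state_counts and the compacted sample array are assumptions made for illustration, and the sample uses the API spellings pausedDL/pausedUP. Note that states with no torrents are simply absent from the output.

```shell
# Sample input modelled on the question (six torrents, state only).
json='[{"state":"uploading"},{"state":"downloading"},{"state":"downloading"},{"state":"downloading"},{"state":"pausedDL"},{"state":"pausedUP"}]'

# Hypothetical helper: group_by(.state) sorts the array by state and splits it
# into one sub-array per distinct state; each group is printed as "<state> <count>".
state_counts() {
  jq -r 'group_by(.state)[] | "\(.[0].state) \(length)"'
}

printf '%s\n' "$json" | state_counts
```

Because absent states produce no line at all, this form suits ad-hoc inspection better than a check that must always report every category.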


If you need entries for all values, even ones which have no occurrence, you might consider:

jq -r '
  def filterStates($stateMap):
    if $stateMap[.] then $stateMap[.] else . end;

  def errorLevel:
    if (.["error"] > 0) then 2 else
      if (.["stalled"] > 0) then 1 else
        0
      end
    end;

  {"queuedDL": "downloading", 
   "checkingDL": "downloading",
   "queuedUP": "uploading", 
   "checkingUP": "uploading",
   "pausedUP": "pause", 
   "pausedDL": "pause",
   "stalledUP": "stalled", 
   "stalledDL": "stalled", 
   "metaDL": "stalled"} as $stateMap |

  # initialize an output object since we want 0 outputs for everything
  {"pause": 0,  "stalled": 0, "error": 0, "downloading": 0, "uploading": 0} as $counts |

  # count number of items which filter to each value
  reduce (.[].state | filterStates($stateMap)) as $state ($counts; .[$state]+=1) |

  # actually format an output string
  "\(. | errorLevel) qbt_Nb_torrents - \(values | add) total, \(.["downloading"]) downloading, \(.["uploading"]) seeding, \(.["pause"]) on pause, \(.["stalled"]) stalled, \(.["error"]) error"
' /tmp/torrents.json

Here is a slightly more modularized version of @CharlesDuffy's solution. The main point of interest is perhaps the generic "bag of words" filter:

# bag of words
def bow(init; s): reduce s as $word (init; .[$word] += 1) ;
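
To see the filter in isolation, here is a small sketch that feeds it a literal stream of strings. The wrapper name bow_demo and the sample words are made up for illustration; -n runs jq without input, and -S sorts the keys so the printed object has a predictable order.

```shell
# bow(init; s): start from the object `init` and add 1 to the key named by
# each string that the stream `s` produces. In jq, null + 1 is 1, so a key
# that has not been seen yet starts counting at 1.
bow_demo() {
  jq -n -c -S '
    def bow(init; s): reduce s as $word (init; .[$word] += 1);
    bow({}; "a", "b", "a")
  '
}

bow_demo
```

Starting from {} this prints {"a":2,"b":1}; starting from an object that already holds zeros keeps the categories that never occur.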

Note also the initialization function:

# initialize an output object since we minimally want 0s
def init:
  {} | {pause,stalled,error,downloading,uploading} | map_values(0);
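
As a quick check of what init produces, the sketch below runs it on its own. The wrapper name init_demo is hypothetical; -S is only there to sort the keys for a stable printout.

```shell
# {} | {pause,...} looks up each key in the empty object, which yields null
# for every one of them; map_values(0) then replaces every value with 0.
init_demo() {
  jq -n -c -S '
    def init: {} | {pause,stalled,error,downloading,uploading} | map_values(0);
    init
  '
}

init_demo
```

With sorted keys the result is {"downloading":0,"error":0,"pause":0,"stalled":0,"uploading":0}.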

With these additional abstractions, the "main" program becomes just two lines of code.

  def filterStates($stateMap):
    if $stateMap[.] then $stateMap[.] else . end ;

  def errorLevel:
    if .error > 0 then 2
    elif .stalled > 0 then 1
    else 0
    end ;

  def stateMap:
    {"queuedDL": "downloading",
     "checkingDL": "downloading",
     "queuedUP": "uploading",
     "checkingUP": "uploading",
     "pausedUP": "pause",
     "pausedDL": "pause",
     "stalledUP": "stalled",
     "stalledDL": "stalled",
     "metaDL": "stalled"} ;

  # "main" program
  # count number of items which map to each value
  bow(init; .[].state | filterStates(stateMap))
  # format an output string
  | "\(errorLevel) qbt_Nb_torrents - \(values | add) total, \(.downloading) downloading, \(.uploading) seeding, \(.pause) on pause, \(.stalled) stalled, \(.error) error"
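
Since only the jq program is shown above, here is one way the pieces might be assembled into a single invocation that reads the torrent list on stdin. The shell function name summarize and the sample array are assumptions for illustration; the jq definitions are the ones from this answer.

```shell
# Hypothetical wrapper: reads the JSON array on stdin, prints one Check_MK line.
summarize() {
  jq -r '
    def bow(init; s): reduce s as $word (init; .[$word] += 1);
    def init: {} | {pause,stalled,error,downloading,uploading} | map_values(0);
    def filterStates($stateMap):
      if $stateMap[.] then $stateMap[.] else . end;
    def errorLevel:
      if .error > 0 then 2 elif .stalled > 0 then 1 else 0 end;
    def stateMap:
      {"queuedDL": "downloading", "checkingDL": "downloading",
       "queuedUP": "uploading",   "checkingUP": "uploading",
       "pausedUP": "pause",       "pausedDL": "pause",
       "stalledUP": "stalled",    "stalledDL": "stalled",
       "metaDL": "stalled"};

    # count the items that map to each category, then format the line
    bow(init; .[].state | filterStates(stateMap))
    | "\(errorLevel) qbt_Nb_torrents - \(values | add) total, \(.downloading) downloading, \(.uploading) seeding, \(.pause) on pause, \(.stalled) stalled, \(.error) error"
  '
}

# Sample modelled on the question, using the API spellings pausedDL/pausedUP.
printf '%s\n' '[{"state":"uploading"},{"state":"downloading"},{"state":"downloading"},{"state":"downloading"},{"state":"pausedDL"},{"state":"pausedUP"}]' | summarize
```

A state that is missing from both stateMap and init (for example a misspelled one) still gets counted under its own name and is included in the total, but it does not appear in any of the named fields.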

In case anyone is wondering exactly what I ended up using after Charles Duffy's excellent answer, here is the full /usr/lib/check_mk_agent/local/qbittorrent shell script. It lets Check_MK (1.5.0 raw) collect what I think is the most relevant information about my qBittorrent application (qBittorrent v3.3.7 Web UI), which runs in a dedicated VM on my server:

#!/bin/bash
# bash rather than sh: the bc calls below use <<< here-strings, which POSIX sh does not provide
curl -s http://localhost:8080/query/transferInfo -o /tmp/transferInfo.json
curl -s http://localhost:8080/query/torrents -o /tmp/torrents.json

if [ -e /tmp/transferInfo.json ]
then
 dwl=$(jq .dl_info_speed /tmp/transferInfo.json)
 dwl_MB=$(bc <<< "scale=2;$dwl/1048576")
 upl=$(jq .up_info_speed /tmp/transferInfo.json)
 upl_MB=$(bc <<< "scale=2;$upl/1048576")
 echo "0 qbt_Global_speed download=$dwl_MB|upload=$upl_MB Download: $dwl_MB MB/s, Upload: $upl_MB MB/s"
 rm -f /tmp/transferInfo.json
else
 echo "3 qbt_Global_speed download=0|upload=0 Can't get the information from qBittorrent WebUI API"
fi

if [ -e /tmp/torrents.json ]
then    
 jq -r '
   def filterStates($stateMap):
    if $stateMap[.] then $stateMap[.] else . end;

   {"queuedDL": "downloading",
   "checkingDL": "downloading",
   "queuedUP": "uploading",
   "checkingUP": "uploading",
   "pausedUP": "pause",
   "pausedDL": "pause",
   "stalledUP": "stalled",
   "stalledDL": "stalled",
   "metaDL": "stalled"} as $stateMap |

   # initialize an output object since we want 0 outputs for everything
   {"pause": 0,  "stalled": 0, "error": 0, "downloading": 0, "uploading": 0} as $counts |

   # count number of items which filter to each value
   reduce (.[].state | filterStates($stateMap)) as $state ($counts; .[$state]+=1) |

   # output string
   "P qbt_Nb_torrents total=\(values|add)|downloading=\(.["downloading"])|seeding=\(.["uploading"])|pause=\(.["pause"])|stalled=\(.["stalled"]);0|error=\(.["error"]);0;0 total is \(values|add), downloading is \(.["downloading"]), seeding is \(.["uploading"]), pause is \(.["pause"])"
' /tmp/torrents.json

 rm -f /tmp/torrents.json
else
 echo "3 qbt_Nb_torrents total=0|downloading=0|seeding=0|pause=0|stalled=0;0|error=0;0;0 Can't get the information from qBittorrent WebUI API"
fi

Here is the output with 1 stalled torrent:

0 qbt_Global_speed download=0|upload=0 Download: 0 MB/s, Upload: 0 MB/s
P qbt_Nb_torrents total=1|downloading=0|seeding=0|pause=0|stalled=1;0|error=0;0;0 total is 1, downloading is 0, seeding is 0, pause is 0

The errorLevel that I thought I needed (see Charles's answer) isn't required because of how Check_MK works; it handles the thresholds itself through the metric parameter when warning and critical values are specified.

Here is what it looks like in Check_MK:

Notice how Check_MK automatically added stalled and errored torrents. This is because of the warning and/or critical thresholds being specified. The metrics in general (with or without thresholds) are important to have the detailed relevant charts available.

Comments
  • No need for a temporary file, btw. curl | jq will work perfectly well, as will s=$(curl ...), and then jq ... <<<"$s" (though you'll need to switch from #!/bin/sh to #!/bin/bash for the latter).
  • (Temporary files create security risks when neither in a private location unwritable by other users, nor created with mktemp or other tools which generate a random, unique name; someone who runs ln -s /etc/passwd /tmp/torrents.json before your script is invoked could cause it to overwrite /etc/passwd when run as root, even if they themselves only had permission to write to /tmp).
  • BTW, your stated desired output includes a "seeding" state, but that's not in your list of the possible states from the API.
  • Also, your sample data has pauseDL, but your statement about possible API states has pausedDL, with an extra d.
  • ...just realized I'd left out the "total". (And btw, as a point of clarity: Please don't take this level of attention to requirements beyond a narrow and specific question as appropriate to the site; I'm going beyond what the rules call for -- and maybe even harming the knowledgebase's usefulness, making the answer less useful to people who aren't you, by diverting focus from the grouped-counts narrow, immediate question -- because you managed to come up with a problem that's a somewhat fun exercise to answer).
  • Thank you, I tested several of your edits and they all work one way or another. I will need to adapt them, since I still need the 0 outputs (if there are 0 torrents on pause, for example) to prevent the Check_MK check from crashing, but that is pretty simple to do from there (I can basically just set default values first).
  • Pays to be explicit about all your requirements up-front; if I'd known that was what you needed, I probably would have stuck with and expanded the reducer approach.
  • Yes, according to the linked documentation of the WebUI API, there are 12 different states. I will definitely regroup them into just 4 or 5 categories, basically downloading, seeding, on pause or error. That last state will trigger a critical warning in CheckMK (so instead of being 0, the status will be 3 as per CheckMK linked documentation). Besides, I will probably have it return everything in 1 row, like 0 qbt_Nb_torrents - 6 total, 3 downloading, 1 seeding, 2 on pause, 0 in error.
  • Well -- after you have a correct answer is the wrong time to change the requirements; that generally gets us into "ask a new question" territory. See the discussion of "chameleon questions" at meta.stackoverflow.com/questions/332820/… on Meta Stack Overflow.
  • I don't mind posting a new question with the additional info, although it would be almost close enough to be a duplicate. But I will work with your answer first and see where it gets me. Either way, I added the details and accepted your answer.
  • So you don't use jq at all or is it implicit in your example?
  • Only the jq program (aka filter) is shown. (Obviously?)
  • Just tested it, it does work. It basically saves 2-3 lines of code. Considering the checks are run every 60 sec and Check_MK can execute hundreds of checks at the same time, I'm wondering if it makes a difference in terms of performance.
  • The main difference comes from the use of bow().
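
Regarding the first comment about avoiding the temporary file, a minimal sketch of the piped form could look like the following. The function name torrent_total is hypothetical, and a literal array stands in for the API response so the snippet runs without qBittorrent; it stays within plain sh.

```shell
# Hypothetical helper: reads the torrent list on stdin and prints the
# Check_MK line from the question's original script, with no file in /tmp.
torrent_total() {
  count=$(jq length) || return 1
  echo "0 qbt_Nb_torrents - $count"
}

# In the real check the input would come straight from the WebUI API:
#   curl -s http://localhost:8080/query/torrents | torrent_total
# Here a literal array stands in for the API response.
printf '%s\n' '[{"state":"downloading"},{"state":"uploading"}]' | torrent_total
```

Piping keeps the script POSIX-compatible; the s=$(curl ...) plus <<<"$s" variant from the comment needs bash instead.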