Processing JSON documents with jq - an example-based approach


Introduction

jq is a flexible command-line JSON processor that can filter, slice, map and transform JSON documents. It works well in conjunction with Linux pipes and output redirection.
jq is to JSON what sed, awk and grep are to line-oriented text. It reads input either from JSON files given as arguments (which xargs can supply) or from stdin through a pipe.
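
Both input modes can be sketched without touching the network. The sample document and the temporary file below are made up for illustration; -c just asks jq for compact one-line output.

```shell
# Hypothetical sample document, written to a temporary file for illustration.
tmp=$(mktemp)
printf '%s' '{"name":"Leanne Graham","id":1}' > "$tmp"

# 1) Input from a file passed as an argument
from_file=$(jq -c '.' "$tmp")

# 2) Input from stdin through a pipe
from_stdin=$(cat "$tmp" | jq -c '.')

echo "$from_file"
echo "$from_stdin"
rm -f "$tmp"
```

Both commands should print the same document, so which form to use is mostly a matter of where the JSON comes from.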

The simplest invocation of jq just pretty-prints the JSON input:

$ curl 'https://jsonplaceholder.typicode.com/users' | jq '.'

As you can see, these examples use the JSONPlaceholder testing API. It returns a JSON array containing 10 different users, each with this structure:

[
  {
    "id": 1,
    "name": "Leanne Graham",
    "username": "Bret",
    "email": "Sincere@april.biz",
    "address": {
      "street": "Kulas Light",
      "suite": "Apt. 556",
      "city": "Gwenborough",
      "zipcode": "92998-3874",
      "geo": {
        "lat": "-37.3159",
        "lng": "81.1496"
      }
    },
    "phone": "1-770-736-8031 x56442",
    "website": "hildegard.org",
    "company": {
      "name": "Romaguera-Crona",
      "catchPhrase": "Multi-layered client-server neural-net",
      "bs": "harness real-time e-markets"
    }
  }
]

Extracting just a subset of users

# Extracts just the first user
$ curl 'https://jsonplaceholder.typicode.com/users' | jq '.[0]'

# Extracts just the last user
$ curl 'https://jsonplaceholder.typicode.com/users' | jq '.[-1]'

# Array slicing (like in Python)
$ curl 'https://jsonplaceholder.typicode.com/users' | jq '.[2:4]'
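
To see exactly which elements an index or a slice picks, it helps to try them on a tiny array. A minimal sketch, assuming a made-up five-element input rather than the users API:

```shell
# Illustrative input, not taken from the API: a five-element array.
nums='[10,20,30,40,50]'

first=$(echo "$nums" | jq '.[0]')       # first element
last=$(echo "$nums" | jq '.[-1]')       # last element
slice=$(echo "$nums" | jq -c '.[2:4]')  # indices 2 and 3; the end index is exclusive
tail2=$(echo "$nums" | jq -c '.[-2:]')  # last two elements

echo "$first $last $slice $tail2"       # 10 50 [30,40] [40,50]
```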

Selecting just some fields for each user

$ curl 'https://jsonplaceholder.typicode.com/users' | jq '[.[] | { id: .id, name: .name, company: .company.name }]'

.[] returns each element of the array as a stream of sub-documents, which are fed one by one into { id: .id, name: .name, company: .company.name }. The opening [ and the closing ] tell jq to collect the results into an array. It prints something like this:

[
  {
    "id": 1,
    "name": "Leanne Graham",
    "company": "Romaguera-Crona"
  },
  {
    "id": 2,
    "name": "Ervin Howell",
    "company": "Deckow-Crist"
  }
  ...
]
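
The same projection can be written more compactly. As a sketch on a two-user excerpt of the data (inlined here so it runs offline): map(f) is equivalent to [.[] | f], and { id, name } is shorthand for { id: .id, name: .name }.

```shell
# Two-user excerpt, inlined for illustration.
users='[{"id":1,"name":"Leanne Graham","company":{"name":"Romaguera-Crona"}},{"id":2,"name":"Ervin Howell","company":{"name":"Deckow-Crist"}}]'

# Compact form: map() plus the object shorthand
short=$(echo "$users" | jq -c 'map({id, name, company: .company.name})')

# Explicit form, as used above
long=$(echo "$users" | jq -c '[.[] | {id: .id, name: .name, company: .company.name}]')

echo "$short"
[ "$short" = "$long" ] && echo "both forms give the same result"
```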

Pipelining multiple JSON operations

# Returns the last element of the already processed JSON
$ curl 'https://jsonplaceholder.typicode.com/users' | jq '[.[] | { id: .id, name: .name, company: .company.name }] | .[-1]'
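
A pipeline can have as many stages as needed, since each | feeds the output of the filter on its left into the filter on its right. A short sketch on inlined sample data, with -r added so the final string is printed without JSON quotes:

```shell
# Two-user excerpt, inlined for illustration.
users='[{"id":1,"name":"Leanne Graham"},{"id":2,"name":"Ervin Howell"}]'

# Stage 1 builds a reduced array, stage 2 takes its last element,
# stage 3 extracts a single field from that element.
last_name=$(echo "$users" | jq -r '[.[] | {id, name}] | .[-1] | .name')

echo "$last_name"   # Ervin Howell
```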

Filtering JSON array contents based on specific fields

# Filter by numeric field
$ curl 'https://jsonplaceholder.typicode.com/users' | jq '.[] | select(.id == 3)'

# Filter by text field
$ curl 'https://jsonplaceholder.typicode.com/users' | jq '.[] | select(.company.catchPhrase | contains("neural-net"))'

# Filter by multiple fields
$ curl 'https://jsonplaceholder.typicode.com/users' | jq '.[] | select(.id == 1 and .name == "Leanne Graham")'

# Filter with a regular expression (keep only the users whose website ends in .com)
$ curl 'https://jsonplaceholder.typicode.com/users' | jq '.[] | select(.website | test("\\.com$"))'
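
An unescaped dot in a regular expression matches any character, so a pattern meant to match a .com suffix needs an escaped dot and an end anchor. A small sketch on made-up data (the website values below are illustrative, chosen to show the difference):

```shell
# Made-up input for illustration.
sites='[{"website":"hildegard.org"},{"website":"anastasia.net"},{"website":"example.com"},{"website":"telecom.org"}]'

# Escaped dot and end anchor: only websites that end in ".com"
strict=$(echo "$sites" | jq -c 'map(select(.website | test("\\.com$")) | .website)')

# Unescaped, unanchored pattern: also matches "telecom.org"
loose=$(echo "$sites" | jq -c 'map(select(.website | test("^.*.com")) | .website)')

echo "$strict"   # ["example.com"]
echo "$loose"    # ["example.com","telecom.org"]
```

Inside the jq string the backslash itself has to be escaped, which is why the pattern is written as "\\.com$".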

Deleting a field or a sub-tree from the JSON document

$ curl 'https://jsonplaceholder.typicode.com/users' | jq '.[] | del(.company)'
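
Note that .[] | del(...) prints one document per user rather than an array. If the array should be preserved, or several fields should go at once, something like the following sketch may help (the input is a trimmed-down excerpt inlined for illustration):

```shell
# Trimmed-down excerpt, inlined for illustration.
users='[{"id":1,"phone":"1-770-736-8031 x56442","company":{"name":"Romaguera-Crona"}},{"id":2,"phone":"010-692-6593 x09125","company":{"name":"Deckow-Crist"}}]'

# map(del(...)) keeps the surrounding array;
# del() accepts several paths separated by commas.
trimmed=$(echo "$users" | jq -c 'map(del(.company, .phone))')

echo "$trimmed"   # [{"id":1},{"id":2}]
```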

Sorting the array contents by a given field

$ curl 'https://jsonplaceholder.typicode.com/users' | jq 'sort_by(.name)'
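
sort_by sorts in ascending order. For descending order, one option is to pipe the result through reverse. A minimal sketch on three inlined users, keeping only the names so the order is easy to read:

```shell
# Three users, inlined for illustration.
users='[{"id":1,"name":"Leanne Graham"},{"id":3,"name":"Clementine Bauch"},{"id":2,"name":"Ervin Howell"}]'

asc=$(echo "$users" | jq -c 'sort_by(.name) | map(.name)')
desc=$(echo "$users" | jq -c 'sort_by(.name) | reverse | map(.name)')

echo "$asc"    # ["Clementine Bauch","Ervin Howell","Leanne Graham"]
echo "$desc"   # ["Leanne Graham","Ervin Howell","Clementine Bauch"]
```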

Conclusion

These are the most common uses of the jq command. For more advanced use cases, check the jq Cookbook. You can also use this website to test your jq queries on custom input data.
