
Mastering jq — Command-Line JSON Processing

Every modern API, CI/CD tool, and cloud platform speaks JSON. jq lets you slice, filter, reshape, and aggregate that data directly on the command line — no throwaway scripts needed. This guide takes you from basic field access to production-grade recipes for Kubernetes, Terraform, and log analysis.

What jq Is and When to Use It

jq is to JSON what sed and awk are to text. It is a single binary with zero runtime dependencies that reads JSON from stdin or from files, applies a filter expression, and writes the result to stdout, which makes it composable with any Unix tool.
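As a small illustration of that composability, the sketch below hands jq's raw output to sort, uniq, and awk. The sample records and the `role` field are invented for the example.

```shell
# Hypothetical sample data: one JSON object per line.
records='{"name":"Alice","role":"admin"}
{"name":"Bob","role":"user"}
{"name":"Carol","role":"admin"}'

# jq extracts the raw role strings; sort and uniq do the counting,
# and awk normalizes uniq's padded output into role=count pairs.
role_counts=$(printf '%s\n' "$records" \
  | jq -r '.role' | sort | uniq -c | sort -rn | awk '{print $2 "=" $1}')
printf '%s\n' "$role_counts"
```

jq only does the JSON-aware step here; everything after `-r` is ordinary line-oriented text processing.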

jq in the Shell Pipeline
| Use jq When | Use Code When |
| --- | --- |
| Extracting a field from an API response | Building a full application around the data |
| Filtering Kubernetes pod lists | Complex multi-step business logic |
| Reshaping JSON for another tool | Persisting transformed data to a database |
| Ad-hoc log analysis on the command line | Scheduled ETL pipelines with error handling |
| Scripting CI/CD pipelines | UI-driven data exploration |

Core Syntax

Identity and Field Access

Basic field access (bash)
# Identity: pass JSON through unchanged (pretty-prints)
echo '{"name":"Alice","age":30}' | jq '.'

# Access a single field
echo '{"name":"Alice","age":30}' | jq '.name'
# Output: "Alice"

# Access nested fields
echo '{"user":{"name":"Alice","address":{"city":"NYC"}}}' | jq '.user.address.city'
# Output: "NYC"

# Missing keys already yield null, with or without ?
echo '{"user":{}}' | jq '.user.address?.city'
# Output: null

# ? matters when the value cannot be indexed at all: the error is suppressed
echo '{"user":"n/a"}' | jq '.user.address?'
# Output: (nothing, instead of a "Cannot index string" error)
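Because a missing field evaluates to null, a fallback is often more useful than the null itself. A minimal sketch using jq's alternative operator `//`, with an invented profile object and an arbitrary "unknown" default:

```shell
# Hypothetical profile with no address recorded.
profile='{"user":{"name":"Alice"}}'

# .user.address.city evaluates to null here, so // supplies the fallback.
city=$(printf '%s' "$profile" | jq -r '.user.address.city // "unknown"')
printf '%s\n' "$city"
```

Note that `//` also replaces `false`, not only null, so it is not a good fit for boolean fields.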

Pipes and Chaining

Piping filters (bash)
# The pipe operator passes output of one filter as input to the next
echo '{"users":[{"name":"Alice","role":"admin"},{"name":"Bob","role":"user"}]}' \
  | jq '.users[] | .name'
# Output:
# "Alice"
# "Bob"

# Array indexing
echo '[10,20,30,40,50]' | jq '.[2]'
# Output: 30

# Array slicing
echo '[10,20,30,40,50]' | jq '.[1:4]'
# Output: [20, 30, 40]

Raw Output and Compact Mode

Output formatting flags (bash)
# -r: raw output (no quotes around strings)
echo '{"url":"https://api.example.com"}' | jq -r '.url'
# Output: https://api.example.com

# -c: compact output (single line)
echo '{"a":1,"b":2}' | jq -c '.'
# Output: {"a":1,"b":2}

# -e: exit with error code 1 if output is null or false
echo '{}' | jq -e '.missing' || echo "Field not found"

Filtering with select()

select() examples (bash)
# Filter array elements by condition
echo '[{"name":"Alice","age":30},{"name":"Bob","age":25},{"name":"Carol","age":35}]' \
  | jq '.[] | select(.age > 28)'
# Output: {"name":"Alice","age":30}
# {"name":"Carol","age":35}

# Wrap results back into an array
echo '[{"name":"Alice","age":30},{"name":"Bob","age":25}]' \
  | jq '[.[] | select(.age >= 30)]'
# Output: [{"name":"Alice","age":30}]

# String matching
echo '[{"name":"payment-service"},{"name":"user-service"},{"name":"payment-worker"}]' \
  | jq '[.[] | select(.name | startswith("payment"))]'

# Regex matching
echo '[{"email":"alice@example.com"},{"email":"bob@example.org"}]' \
  | jq '[.[] | select(.email | test("@example\\.com$"))]'

Transforming with map(), reduce(), and group_by()

map()

map() transforms every array element (bash)
# Extract a single field from each object
echo '[{"name":"Alice","age":30},{"name":"Bob","age":25}]' \
  | jq 'map(.name)'
# Output: ["Alice", "Bob"]

# Transform each element
echo '[{"price":10,"qty":3},{"price":20,"qty":1}]' \
  | jq 'map({item_total: (.price * .qty)})'
# Output: [{"item_total":30},{"item_total":20}]

group_by() and sort_by()

Grouping and sorting (bash)
# Group users by role
echo '[{"name":"Alice","role":"admin"},{"name":"Bob","role":"user"},{"name":"Carol","role":"admin"}]' \
  | jq 'group_by(.role) | map({role: .[0].role, count: length, members: map(.name)})'
# Output: [
# {"role":"admin","count":2,"members":["Alice","Carol"]},
# {"role":"user","count":1,"members":["Bob"]}
# ]

# Sort by a field (descending)
echo '[{"name":"Bob","score":85},{"name":"Alice","score":92}]' \
  | jq 'sort_by(-.score)'

reduce()

Aggregation with reduce (bash)
# Sum all prices
echo '[{"price":10},{"price":20},{"price":15}]' \
  | jq 'reduce .[] as $item (0; . + $item.price)'
# Output: 45

# Build a lookup map from an array
echo '[{"id":"a","val":1},{"id":"b","val":2}]' \
  | jq 'reduce .[] as $item ({}; . + {($item.id): $item.val})'
# Output: {"a":1,"b":2}
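For simple aggregations like these, the built-ins `add` and `from_entries` are often a shorter route to the same result. A sketch on invented item data, shown as an alternative rather than a replacement for reduce:

```shell
# Hypothetical items; the field names are made up for the example.
items='[{"id":"a","price":10},{"id":"b","price":20},{"id":"c","price":15}]'

# Sum: collect the prices into an array, then add them up.
total=$(printf '%s' "$items" | jq 'map(.price) | add')

# Lookup map: from_entries turns {key, value} pairs into one object.
lookup=$(printf '%s' "$items" | jq -c 'map({key: .id, value: .price}) | from_entries')

printf '%s\n' "$total" "$lookup"
```

reduce remains the better tool when the accumulator needs custom logic, such as running counters or conditional updates.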

Object Construction and Reshaping

Building new JSON structures (bash)
# Construct a new object from existing fields
echo '{"first_name":"Alice","last_name":"Smith","age":30,"email":"alice@example.com"}' \
  | jq '{full_name: (.first_name + " " + .last_name), contact: .email}'
# Output: {"full_name":"Alice Smith","contact":"alice@example.com"}

# Add or override fields
echo '{"name":"Alice","version":"1.0"}' | jq '. + {"env":"production","version":"2.0"}'

# Delete fields
echo '{"name":"Alice","password":"secret","role":"admin"}' | jq 'del(.password)'

# Rename keys with with_entries
echo '{"user_name":"Alice","user_age":30}' \
  | jq 'with_entries(.key |= ltrimstr("user_"))'
# Output: {"name":"Alice","age":30}

Recursive Descent and walk()

Searching deep structures (bash)
# .. recurses into every value at any depth
# Find all "id" fields anywhere in the structure
echo '{"users":[{"id":1,"profile":{"id":"p1"}},{"id":2}]}' \
  | jq '.. | .id? // empty'
# Output: 1, "p1", 2 (one value per line)

# walk() applies a function to every node
# Convert all string values to uppercase
echo '{"name":"alice","address":{"city":"new york"}}' \
  | jq 'walk(if type == "string" then ascii_upcase else . end)'
# Output: {"name":"ALICE","address":{"city":"NEW YORK"}}
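Another common use of walk() is pruning. The sketch below removes null-valued keys at every depth; the input document is invented, and `-S` is used only to make the key order of the output predictable across jq versions.

```shell
# Hypothetical document with nulls at several depths.
doc='{"name":"alice","nickname":null,"address":{"city":"new york","zip":null}}'

# For every object that walk() visits, keep only entries whose value is not null.
cleaned=$(printf '%s' "$doc" | jq -cS '
  walk(if type == "object"
       then with_entries(select(.value != null))
       else . end)')
printf '%s\n' "$cleaned"
```

Values that are `false`, `0`, or empty strings are kept, because the test is specifically against null.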

Format Strings: @csv, @tsv, @base64

Output format conversions (bash)
# Convert array of arrays to CSV
echo '[["name","age"],["Alice",30],["Bob",25]]' | jq -r '.[] | @csv'
# Output:
# "name","age"
# "Alice",30
# "Bob",25

# Convert to TSV
echo '[["name","age"],["Alice",30]]' | jq -r '.[] | @tsv'

# Base64 encode a value
echo '{"token":"secret123"}' | jq -r '.token | @base64'
# Output: c2VjcmV0MTIz

# String interpolation
echo '{"name":"Alice","age":30}' | jq -r '"User: \(.name), Age: \(.age)"'
# Output: User: Alice, Age: 30
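API responses usually arrive as arrays of objects rather than arrays of arrays, so @csv needs a row-building step first. A sketch that assumes every object is flat and shares the keys of the first one (the sample data is invented):

```shell
# Hypothetical input: an array of flat objects sharing the same keys.
users='[{"name":"Alice","age":30},{"name":"Bob","age":25}]'

# Take the header from the first object's keys, then emit one row per
# object with the values looked up in header order.
csv=$(printf '%s' "$users" | jq -r '
  (.[0] | keys_unsorted) as $cols
  | $cols, (.[] | [.[$cols[]]])
  | @csv')
printf '%s\n' "$csv"
```

Objects that lack one of the header keys produce an empty field in that column, since the lookup yields null.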

Streaming Mode for Large Files

Standard jq loads the entire JSON document into memory. For multi-gigabyte files, use --stream to process data incrementally as path-value pairs:

Streaming large files (bash)
# Stream mode emits [path, value] pairs, plus one-element closing events
echo '{"users":[{"name":"Alice"},{"name":"Bob"}]}' | jq -c --stream '.'
# Output:
# [["users",0,"name"],"Alice"]
# [["users",0,"name"]]
# [["users",1,"name"],"Bob"]
# [["users",1,"name"]]
# [["users",1]]
# [["users"]]

# Extract the leaf values under a specific path in a huge file
jq --stream 'select(length == 2 and .[0][:2] == ["users",0]) | .[1]' huge.json

# Reconstruct the first 100 user objects without loading the whole file:
# truncate_stream strips the leading ["users", <index>] from each path,
# fromstream rebuilds each object, and limit stops after 100 results
jq -cn --stream '
  limit(100;
    fromstream(2 | truncate_stream(inputs | select(.[0][0] == "users"))))
' huge.json

When to Use Streaming

Use --stream when files exceed available memory or when you only need a small subset of a large document. For files under ~500 MB that fit in memory, standard mode is faster and simpler.

Real-World Recipes

Kubernetes: List Unhealthy Pods

Filter pods not in Running phase (bash)
kubectl get pods -A -o json | jq -r '
  .items[]
  | select(.status.phase != "Running")
  | [.metadata.namespace, .metadata.name, .status.phase]
  | @tsv'

Terraform: Extract Output Values

Parse terraform output (bash)
# Get all output values as a flat map
terraform output -json | jq 'to_entries | map({(.key): .value.value}) | add'

# Extract a specific nested value
terraform show -json | jq -r '.values.root_module.resources[]
  | select(.type == "aws_instance")
  | .values.public_ip'

API Debugging: GitHub API

Analyze GitHub repository data (bash)
# Top 5 contributors by commit count
curl -s "https://api.github.com/repos/owner/repo/contributors" \
  | jq 'sort_by(-.contributions) | .[0:5] | .[] | {login, contributions}'

# List open issues with labels
curl -s "https://api.github.com/repos/owner/repo/issues?state=open" \
  | jq '.[] | {title, labels: [.labels[].name], created: .created_at}'

Log Analysis

Process JSON log files (bash)
# Count errors by service
jq -s '
  [.[] | select(.level == "error")]
  | group_by(.service)
  | map({service: .[0].service, error_count: length})
  | sort_by(-.error_count)' app.log

# Find the slowest 10 requests
jq -s '
  [.[] | select(.duration_ms)]
  | sort_by(-.duration_ms)
  | .[0:10]
  | .[] | {path: .url, duration_ms, timestamp}' app.log

# P95 response time (entries without a numeric duration_ms are skipped,
# otherwise their nulls would sort first and skew the index)
jq -s '
  [.[] | .duration_ms | numbers] | sort
  | .[length * 0.95 | floor]' app.log
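The recipes above slurp the whole log with -s, which is convenient but holds every record in memory. For large logs, a counting job can be done record by record instead. A sketch using `-n` with `inputs` and `reduce`, on an invented four-line log:

```shell
# Hypothetical JSON Lines log; in practice this would be a file such as app.log.
log='{"level":"error","service":"api"}
{"level":"info","service":"api"}
{"level":"error","service":"db"}
{"level":"error","service":"api"}'

# -n stops jq from reading input on its own; `inputs` then yields one
# record at a time, so only the running tally is held in memory.
error_counts=$(printf '%s\n' "$log" | jq -cn '
  reduce (inputs | select(.level == "error")) as $e
    ({}; .[$e.service] += 1)')
printf '%s\n' "$error_counts"
```

This works for sums, counts, minima, and maxima. Percentiles and sorts still need all values at once, so those recipes keep using -s.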

Docker and docker-compose

Inspect Docker containers (bash)
# List container names and their IP addresses
docker inspect $(docker ps -q) | jq -r '.[] | "\(.Name): \(.NetworkSettings.IPAddress)"'

# Get environment variables from a container
docker inspect mycontainer | jq '.[0].Config.Env'

jq vs. Alternatives

| Tool | Language | Strengths | Best For |
| --- | --- | --- | --- |
| jq | Custom DSL | Zero deps, fastest for scripts, shell composable | CI/CD, shell scripts, automation |
| jaq | Rust (jq-compatible) | Faster execution, stricter error handling | Drop-in jq replacement when speed matters |
| gojq | Go (jq-compatible) | YAML support, easier to embed | Go projects needing jq semantics |
| fx | JavaScript | Interactive TUI, JS expressions | Exploring JSON interactively |
| xq / yq | Python / Go | XML and YAML support via jq syntax | Mixed-format pipelines |

Common Pitfalls

  • Redirecting output to the same file you read from: jq '.' f.json > f.json truncates the file to zero bytes. Use a temp file or sponge.
  • Forgetting -r when piping string output to other tools. Without it, jq wraps strings in quotes.
  • Using jq -s (slurp) on huge files. Slurp loads the entire input into a single array in memory.
  • Shell quoting: always single-quote jq filters. Inside double quotes the shell expands $ variables and backticks, and the filter's own string quotes have to be escaped.
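When a filter does need a value from the shell, jq's --arg option passes it in as a jq variable, so the filter can stay single-quoted. A minimal sketch; the `service` variable, the `$svc` name, and the sample data are invented:

```shell
# Hypothetical shell variable that should be matched literally.
service='payment-service'
data='[{"name":"payment-service","port":8080},{"name":"user-service","port":8081}]'

# --arg binds the shell value to the jq variable $svc as a string, so the
# filter itself stays single-quoted and nothing is spliced into it.
port=$(printf '%s' "$data" \
  | jq -r --arg svc "$service" '.[] | select(.name == $svc) | .port')
printf '%s\n' "$port"
```

Values passed with --arg are always strings. For numbers or other JSON values, --argjson is the counterpart.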

Frequently Asked Questions

What is jq?
jq is a lightweight, command-line JSON processor written in C with zero runtime dependencies. It lets you slice, filter, map, and transform structured JSON data as easily as sed, awk, and grep let you work with text. It is the de facto standard for command-line JSON processing.
How do I install jq?
On macOS: brew install jq. On Ubuntu/Debian: sudo apt-get install jq. On Windows: choco install jq or download the binary from the official site. Most Linux distributions include jq in their default repositories.
What is the difference between jq and fx?
jq uses its own filter language and is optimized for scripting and piping in shell workflows. fx is an interactive JSON viewer that lets you explore JSON with JavaScript expressions. Use jq for automation and scripts, fx for interactive exploration.
Can jq handle large JSON files?
Yes. jq supports a --stream flag that processes JSON incrementally as path-value pairs instead of loading the entire file into memory. This makes multi-gigabyte files workable, with memory use that depends on how much the filter accumulates rather than on the file size.
How do I pretty-print JSON with jq?
Pipe JSON through jq with the identity filter: echo '{"a":1}' | jq '.' This outputs formatted JSON, colorized when stdout is a terminal. Use the -c flag for compact output and -r for raw strings without quotes.
Can I modify JSON files in place with jq?
jq does not support in-place editing directly. The standard pattern is to write to a temporary file and replace the original: jq '.version = "2.0"' package.json > tmp.json && mv tmp.json package.json. Alternatively, use sponge from the moreutils package.
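The temp-file pattern can be wrapped in a small shell function so the original is only replaced when jq succeeds. This is a sketch, not a jq feature: `jq_inplace` is a hypothetical helper name, and the demo runs on a scratch file rather than a real package.json.

```shell
# Hypothetical helper: apply a jq filter to a file and replace the file
# only if jq succeeded. Usage: jq_inplace FILTER FILE
jq_inplace() {
  filter=$1
  file=$2
  # Create the temp file next to the target so mv stays on one filesystem.
  tmp=$(mktemp "${file}.XXXXXX") || return 1
  if jq "$filter" "$file" > "$tmp"; then
    mv "$tmp" "$file"
  else
    rm -f "$tmp"
    return 1
  fi
}

# Demo on a scratch file.
demo=$(mktemp)
printf '%s\n' '{"name":"demo","version":"1.0"}' > "$demo"
jq_inplace '.version = "2.0"' "$demo"
new_version=$(jq -r '.version' "$demo")
printf '%s\n' "$new_version"
rm -f "$demo"
```

One caveat: the replaced file takes on the permissions of the temp file, which mktemp creates as readable and writable by the owner only, so restore the mode afterwards if that matters.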