
Environment: jq 1.5, Windows 64-bit.

I have the following JSON:

{
  "unique": 1924,
  "coordinates": [
    {
      "time": "2015-01-25T00:00:01.683",
      "xyz": [
        {
          "z": 4,
          "y": 2,
          "x": 1,
          "id": 99
        },
        {
          "z": 9,
          "y": 9,
          "x": 8,
          "id": 100
        },
        {
          "z": 9,
          "y": 6,
          "x": 10,
          "id": 101
        }
      ]
    },
    {
      "time": "2015-01-25T00:00:02.790",
      "xyz": [
        {
          "z": 0,
          "y": 3,
          "x": 7,
          "id": 99
        },
        {
          "z": 4,
          "y": 6,
          "x": 2,
          "id": 100
        },
        {
          "z": 2,
          "y": 9,
          "x": 51,
          "id": 101
        }
      ]
    }
  ]
}

I would like to convert it into this CSV format with jq:

unique,time,id,x,y,z
1924,"2015-01-25T00:00:01.683",99,1,2,4
1924,"2015-01-25T00:00:01.683",100,8,9,9

(and so on)

I tried a few things, such as:

jq -r '{unique: .unique, coordinates: .coordinates[].xyz[] | [.id, .x, .y, .z], time: .coordinates.[].time} | flatten | @csv' 

which gave me the values I wanted, but multiplied for every id, x, y, and z (i.e., each line appears four times, once each for id, x, y, and z).

Indexing into the coordinates array, as in

jq -r '{unique: .unique, coordinates: .coordinates[0].xyz[] | [.id, .x, .y, .z], time: .coordinates.[0].time} | flatten | @csv' 

gives me rows for the first element of the coordinates array only, but naturally I'd like all of them.

3 Comments
  • For what language? Javascript? Commented May 21, 2015 at 16:31
  • With jq (stedolan.github.io/jq) Commented May 21, 2015 at 16:42
  • Mmm. Requiring a very recent build, there; I'm still on 1.4. But then, such is the way of shiny features... Commented May 21, 2015 at 18:05

1 Answer


The first step is to flatten the results down to one object per row.

[{ unique } + (.coordinates[] | { time } + .xyz[])]

This yields an array with one object per row:

[
  {
    "unique": 1924,
    "time": "2015-01-25T00:00:01.683",
    "id": 99,
    "x": 1,
    "y": 2,
    "z": 4
  },
  {
    "unique": 1924,
    "time": "2015-01-25T00:00:01.683",
    "id": 100,
    "x": 8,
    "y": 9,
    "z": 9
  },
  {
    "unique": 1924,
    "time": "2015-01-25T00:00:01.683",
    "id": 101,
    "x": 10,
    "y": 6,
    "z": 9
  },
  {
    "unique": 1924,
    "time": "2015-01-25T00:00:02.790",
    "id": 99,
    "x": 7,
    "y": 3,
    "z": 0
  },
  ...
]

Then it's just a matter of converting this to CSV rows.

["unique","time","id","x","y","z"] as $fields | $fields, (.[] | [.[$fields[]]]) | @csv
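Putting both steps together, the complete command looks like this (assuming the sample JSON above is saved as input.json, a filename chosen here for illustration):

```shell
# Flatten to one {unique, time, id, x, y, z} object per row,
# then emit a header row followed by one CSV row per object.
jq -r '[{ unique } + (.coordinates[] | { time } + .xyz[])]
       | ["unique","time","id","x","y","z"] as $fields
       | $fields, (.[] | [.[$fields[]]])
       | @csv' input.json
```

Note that @csv double-quotes string values, so the header fields and the time column come out quoted (e.g. "unique","time",...); that is still valid CSV.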

2 Comments

Thanks! I've been using the flatten / @csv paradigm for previous commands - is your method to convert to CSV faster/more efficient, would you say?
I don't think flatten would really work here. It's meant to flatten arrays of arrays, not objects, and flattening an object yields only its values: you lose the associated keys, and ordering is not guaranteed. You would need to convert to other intermediate forms to make that usable. The approach I took here is more direct, so I would expect it to be faster than trying to get flatten to do the job.
