How to convert JSON into Excel format using Python

Question

I tried to convert a JSON file into Excel, but pandas only handles some of the keys.

I have this JSON input:

{
    "result": [
        {
            "level": "L3_SW",
            "name": "L23",
            "type": "CM"
        },
        {
            "level": "L3_SW",
            "name": "SOFT",
            "type": "QM"
        }],
    "context": {
        "config": {
            "project_area_name": "XYZ",
            "component_name": "Configuration",
            "config_name": "_WorkOn",
            "bu": "H"
        },
        "meta": {
            "project": {
                "name": "_WorkOn",
                "key": "2023-05-02_96614cc50ac8dc7e121f7090",
                "started_at": "2023-05-02-16-00"
            },
            "task": {
                "started": "2023-05-02-16-00",
                "finished": "2023-05-02-16-00",
                "req_count": 1
            }
        }
    }
}

This is just one of the JSON files, and I don't know what the structure of the other JSON inputs will be.

I tried the pandas library, but I guess it requires a specific key, which I don't want.

import json
from tkinter import filedialog

import pandas as pd

json_file_path = filedialog.askopenfilename(title='Select a JSON file', filetypes=[('JSON files', '*.json')])

# Load the JSON file into a Python object
with open(json_file_path) as f:
    data = json.load(f)
print(data)
df = pd.DataFrame(data["result"])  # this only gives me Excel columns for the "result" array

# Ask the user to select a location to save the Excel file
excel_file_path = filedialog.asksaveasfilename(title='Save as Excel', defaultextension='.xlsx')

# Save the dataframe as an Excel file
df.to_excel(excel_file_path, index=False)

How can I convert the entire JSON to Excel, irrespective of its structure, so that I get a column for every key?

You'd have to "flatten" your JSON first, which would lead to ambiguous keys. E.g. you have a "name" property in result[0] and also in context["meta"]["project"]. How should this be handled?
– mrxra, Commented May 11, 2023 at 8:45
@mrxra I am not sure about that, since I cannot change the JSON input. Maybe something like: under the project column there would be 3 more columns for name, key and started_at. I cannot say what the contents of the JSON input will be. The only thing is that I need to create columns for all keys, with their values.
– NoobCoder, Commented May 11, 2023 at 8:50
What should the "title" row in your Excel file contain if you have multiple "name", "id", "whatever" fields in your unknown JSON structure? Or don't you need named columns in Excel?
– mrxra, Commented May 11, 2023 at 8:54
What exactly should your Excel file look like? E.g. 2 columns, with the keys from your JSON in column A and the values in column B? Or keys in the first row and values in the second row?
– mrxra, Commented May 11, 2023 at 9:31
@mrxra No, the keys should be the title of each column. For example, with my code I am able to generate the result array with the column names level, name, type. In the same way, I would like to generate more columns like project_area_name, component_name, etc.
– NoobCoder, Commented May 11, 2023 at 10:38
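
The ambiguity raised in these comments can be checked before choosing column titles. The following is a minimal sketch, not part of the original thread: the helper name `leaf_names` and the trimmed sample data are illustrative assumptions. It counts how often each leaf key occurs, so keys that would collide as bare column titles become visible.

```python
from collections import Counter


def leaf_names(value):
    """Yield the final key of every scalar value nested inside dicts and lists."""
    if isinstance(value, dict):
        for key, child in value.items():
            if isinstance(child, (dict, list)):
                yield from leaf_names(child)
            else:
                yield key
    elif isinstance(value, list):
        # Scalars stored directly in a list have no key of their own and are skipped
        for child in value:
            yield from leaf_names(child)


# Trimmed version of the JSON from the question (values shortened for illustration)
sample = {
    "result": [
        {"level": "L3_SW", "name": "L23"},
        {"level": "L3_SW", "name": "SOFT"},
    ],
    "context": {"meta": {"project": {"name": "_WorkOn", "key": "k"}}},
}

counts = Counter(leaf_names(sample))
duplicates = sorted(name for name, n in counts.items() if n > 1)
print(duplicates)
```

Any key listed in `duplicates` cannot serve as a unique column title on its own, which is why a path-style title (parent keys plus leaf key) is the safer choice for unknown structures.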

mrxra · Accepted Answer · 2023-05-11 09:48:32Z

1

This will convert any JSON structure into a flat Excel table, with the keys from your JSON in column A and the corresponding values in column B:

import json
import flatdict
import openpyxl
import pandas as pd

data = json.loads("""{
    "result":[{
        "level": "L3_SW",
        "name": "L23",
        "type": "CM"
    },{
        "level": "L3_SW",
        "name": "SOFT",
        "type": "QM"
    }],
    "context":{
        "config": {
            "project_area_name": "XYZ",
            "component_name": "Configuration",
            "config_name": "_WorkOn",
            "bu": "H"
        },
        "meta": {
            "project": {
                "name": "_WorkOn",
                "key": "2023-05-02_96614cc50ac8dc7e121f7090",
                "started_at": "2023-05-02-16-00"
            },
            "task": {
                "started": "2023-05-02-16-00",
                "finished": "2023-05-02-16-00",
                "req_count": 1
            }
        }
    }
}
""")

# Flatten the nested structure; keys become paths such as "result:0:level"
d = dict(flatdict.FlatterDict(data))
# EITHER: column A: keys, column B: values
df = pd.DataFrame.from_dict(d, orient='index')
df.to_excel("test.xlsx", header=False)

# OR: row 1: keys, row 2: values
# df = pd.DataFrame([d])
# df.to_excel("test.xlsx", index=False)
print(df)

output:

result:0:level                                                  L3_SW
result:0:name                                                     L23
result:0:type                                                      CM
result:1:level                                                  L3_SW
result:1:name                                                    SOFT
result:1:type                                                      QM
context:config:project_area_name                                  XYZ
context:config:component_name                           Configuration
context:config:config_name                                    _WorkOn
context:config:bu                                                   H
context:meta:project:name                                     _WorkOn
context:meta:project:key          2023-05-02_96614cc50ac8dc7e121f7090
context:meta:project:started_at                      2023-05-02-16-00
context:meta:task:started                            2023-05-02-16-00
context:meta:task:finished                           2023-05-02-16-00
context:meta:task:req_count                                         1
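
If installing `flatdict` is not an option, the same colon-delimited keys can be produced with a few lines of standard-library Python. This is a sketch under assumptions, not the answer's method: the `flatten` function is a hypothetical helper, it silently drops empty dicts and lists, and it may differ from `flatdict` in other edge cases.

```python
import json


def flatten(value, prefix="", sep=":"):
    """Return a flat dict mapping delimiter-joined paths to leaf values."""
    flat = {}
    if isinstance(value, dict):
        items = value.items()
    elif isinstance(value, list):
        # List positions become path segments, e.g. "result:0:name"
        items = enumerate(value)
    else:
        # Leaf value: store it under the path accumulated so far
        flat[prefix] = value
        return flat
    for key, child in items:
        path = f"{prefix}{sep}{key}" if prefix else str(key)
        flat.update(flatten(child, path, sep))
    return flat


# Trimmed version of the JSON from the question
sample = json.loads(
    '{"result": [{"name": "L23"}, {"name": "SOFT"}],'
    ' "context": {"meta": {"task": {"req_count": 1}}}}'
)
flat = flatten(sample)
for path, leaf in flat.items():
    print(path, leaf)
```

The resulting `flat` dict can be passed to `pd.DataFrame.from_dict(flat, orient='index')` or `pd.DataFrame([flat])` in the same way as `d` in the code above.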

10 Comments

NoobCoder Over a year ago

I am getting this error: "ValueError: dictionary update sequence element #0 has length 1; 2 is required", on this line: d = dict(flatdict.FlatterDict(data))

mrxra Over a year ago

...not with the example from the code above, I assume? What does your JSON look like?

NoobCoder Over a year ago

The same as what I posted in the question. The code is:

data = json.load(open(json_file_path))
json_formatted_str = json.dumps(data, indent=4)
d = dict(flatdict.FlatterDict(json_formatted_str))

mrxra Over a year ago

That's not possible. The data you posted has trailing "," characters, which json.load() cannot handle: json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: line 7 column 9 (char 119). Run the code I posted, then adapt it to your needs. The data it contains is your data, but fixed :)

mrxra Over a year ago

As for the trailing commas in your JSON file, you probably have to fix them: "trailing commas are not allowed in JSON" (developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/…)
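
The two failure modes mentioned in these comments can be reproduced with the standard library alone. This is a hedged sketch with made-up sample strings. The second part is an inference rather than something confirmed in the thread: the quoted ValueError has the same wording Python uses when `dict()` is given a string, which suggests that passing the `json.dumps()` output (a `str`) instead of the parsed `data` triggered it.

```python
import json

# 1) A trailing comma makes the text invalid JSON, so parsing fails.
bad_text = '{"level": "L3_SW", "name": "L23",}'
try:
    json.loads(bad_text)
    trailing_comma_error = None
except json.JSONDecodeError as exc:
    # The exact message text varies between Python versions
    trailing_comma_error = exc.msg

# 2) json.dumps() returns a str; building a dict from a str raises ValueError.
data = json.loads('{"level": "L3_SW", "name": "L23"}')
json_formatted_str = json.dumps(data, indent=4)
try:
    dict(json_formatted_str)
    string_error = None
except ValueError as exc:
    string_error = str(exc)

print(trailing_comma_error)
print(string_error)
```

If that inference holds, passing the parsed object (`data`) to the flattening step rather than the re-serialized string should avoid the ValueError.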
