I tried to convert a JSON file to Excel, but pandas does not pick up all of the keys.

I have a json input:

{
    "result": [
        {
            "level": "L3_SW",
            "name": "L23",
            "type": "CM"
        },
        {
            "level": "L3_SW",
            "name": "SOFT",
            "type": "QM"
        }],
   "context": {
        "config": {
            "project_area_name": "XYZ",
            "component_name": "Configuration",
            "config_name": "_WorkOn",
            "bu": "H"
        },
        "meta": {
            "project": {
                "name": "_WorkOn",
                "key": "2023-05-02_96614cc50ac8dc7e121f7090",
                "started_at": "2023-05-02-16-00"
            },
            "task": {
                "started": "2023-05-02-16-00",
                "finished": "2023-05-02-16-00",
                "req_count": 1
            }
        }
    }
}

This is just one of the JSON files, and I don't know what structure the other JSON inputs will have.

I tried the pandas library, but I guess it requires a specific key, which I don't want.

import json
from tkinter import filedialog

import pandas as pd

json_file_path = filedialog.askopenfilename(title='Select a JSON file', filetypes=[('JSON files', '*.json')])

# Parse the JSON file into a Python dict
with open(json_file_path) as f:
    data = json.load(f)
print(data)
df = pd.DataFrame(data["result"])  # <== this only produces Excel columns for the "result" array

# Ask the user to select a location to save the Excel file
excel_file_path = filedialog.asksaveasfilename(title='Save as Excel', defaultextension='.xlsx')

# Save the dataframe as an Excel file
df.to_excel(excel_file_path, index=False)

How can I convert the entire JSON to Excel, irrespective of its structure, so that I get a column for every key?

  • You'd have to "flatten" your JSON first, which would lead to ambiguous keys. For example, you have a "name" property in result[0] and also in context["meta"]["project"]. How should this be handled? Commented May 11, 2023 at 8:45
  • @mrxra I am not sure, since I cannot change the JSON input. Perhaps the project column would expand into three more columns: name, key, and started_at. I cannot say what the JSON input will contain; all I need is a column for every key, holding its value. Commented May 11, 2023 at 8:50
  • What should the "title" row in your Excel sheet contain if your unknown JSON structure has multiple "name", "id", or "whatever" fields? Or don't you need named columns in Excel? Commented May 11, 2023 at 8:54
  • What exactly should your Excel sheet look like? E.g. two columns, with the keys from your JSON in column A and the values in column B? Or keys in the first row and values in the second? Commented May 11, 2023 at 9:31
  • @mrxra No, the keys should be the titles of the columns. For example, my code generates the result array with the column names level, name, and type. In the same way, I would like to generate more columns such as project_area_name, component_name, etc. Commented May 11, 2023 at 10:38
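
To make the "flatten" idea from the comments concrete, here is a minimal standard-library sketch. The `flatten` helper and the `:` separator are illustrative choices, not part of pandas or any other library. It shows that once every key is replaced by its full path, the three "name" properties no longer collide.

```python
import json


def flatten(value, prefix="", sep=":"):
    """Return a flat dict mapping path strings to scalar leaf values.

    Hypothetical helper for illustration: dict keys and list indexes are
    joined with `sep`. Empty dicts and empty lists produce no entry.
    """
    flat = {}
    if isinstance(value, dict):
        items = value.items()
    elif isinstance(value, list):
        items = enumerate(value)
    else:
        # Scalar leaf: store it under the path built so far.
        flat[prefix] = value
        return flat
    for key, child in items:
        path = f"{prefix}{sep}{key}" if prefix else str(key)
        flat.update(flatten(child, path, sep))
    return flat


# A cut-down version of the input from the question.
doc = json.loads("""
{
    "result": [{"name": "L23"}, {"name": "SOFT"}],
    "context": {"meta": {"project": {"name": "_WorkOn"}}}
}
""")

flat = flatten(doc)
for path, leaf in flat.items():
    print(path, "=", leaf)
```

Each "name" ends up under its own path (`result:0:name`, `result:1:name`, `context:meta:project:name`), so the column titles stay unique at the cost of being longer than the bare key.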

1 Answer

This converts any JSON structure into a flat Excel table, with the keys from your JSON in column A and the corresponding values in column B:

import json
import flatdict
import openpyxl  # not called directly; pandas can use it as the writer engine for .xlsx files
import pandas as pd

data = json.loads("""{
    "result":[{
        "level": "L3_SW",
        "name": "L23",
        "type": "CM"
    },{
        "level": "L3_SW",
        "name": "SOFT",
        "type": "QM"
    }],
    "context":{
        "config": {
            "project_area_name": "XYZ",
            "component_name": "Configuration",
            "config_name": "_WorkOn",
            "bu": "H"
        },
        "meta": {
            "project": {
                "name": "_WorkOn",
                "key": "2023-05-02_96614cc50ac8dc7e121f7090",
                "started_at": "2023-05-02-16-00"
            },
            "task": {
                "started": "2023-05-02-16-00",
                "finished": "2023-05-02-16-00",
                "req_count": 1
            }
        }
    }
}
""")

# data must be the parsed dict (the result of json.load/json.loads), not a JSON string
d = dict(flatdict.FlatterDict(data))
# EITHER: column A: keys, column B: values
df = pd.DataFrame.from_dict(d, orient='index')
df.to_excel("test.xlsx", header=False)

# OR: row 1: keys, row 2: values
# df = pd.DataFrame([d])
# df.to_excel("test.xlsx", index=False)
print(df)

output:

result:0:level                                                  L3_SW
result:0:name                                                     L23
result:0:type                                                      CM
result:1:level                                                  L3_SW
result:1:name                                                    SOFT
result:1:type                                                      QM
context:config:project_area_name                                  XYZ
context:config:component_name                           Configuration
context:config:config_name                                    _WorkOn
context:config:bu                                                   H
context:meta:project:name                                     _WorkOn
context:meta:project:key          2023-05-02_96614cc50ac8dc7e121f7090
context:meta:project:started_at                      2023-05-02-16-00
context:meta:task:started                            2023-05-02-16-00
context:meta:task:finished                           2023-05-02-16-00
context:meta:task:req_count                                         1
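
If the goal is instead the layout described in the question's comments (one row per entry of the array, with the remaining keys as extra columns), the following standard-library sketch is one possible approach. It rests on assumptions the question does not confirm: the document is a JSON object, the first top-level list of objects holds the records, and everything else is repeated on every row. `flatten_dicts` and `to_rows` are hypothetical helper names.

```python
def flatten_dicts(value, prefix="", sep=":"):
    """Flatten nested dicts only; lists and scalars are kept as leaf values."""
    flat = {}
    for key, child in value.items():
        path = f"{prefix}{sep}{key}" if prefix else key
        if isinstance(child, dict):
            flat.update(flatten_dicts(child, path, sep))
        else:
            flat[path] = child
    return flat


def to_rows(doc, sep=":"):
    """Build one row per item of the first top-level list of dicts.

    Assumption: that list holds the records; every other key is treated
    as shared context and copied onto each row.
    """
    records_key = next(
        (k for k, v in doc.items()
         if isinstance(v, list) and v and all(isinstance(i, dict) for i in v)),
        None,
    )
    shared = flatten_dicts(
        {k: v for k, v in doc.items() if k != records_key}, sep=sep
    )
    if records_key is None:
        return [shared]
    # On a name clash, the shared value overwrites the record value.
    return [{**flatten_dicts(item, sep=sep), **shared} for item in doc[records_key]]


doc = {
    "result": [
        {"level": "L3_SW", "name": "L23", "type": "CM"},
        {"level": "L3_SW", "name": "SOFT", "type": "QM"},
    ],
    "context": {
        "config": {"project_area_name": "XYZ", "bu": "H"},
        "meta": {"project": {"name": "_WorkOn"}},
    },
}

rows = to_rows(doc)
for row in rows:
    print(row)
```

The resulting list of dicts can be handed to pandas, e.g. `pd.DataFrame(rows).to_excel("test.xlsx", index=False)`, which gives the columns level, name, type followed by the path-prefixed context columns. Inputs with several arrays, or arrays nested deeper in the structure, would need a different rule for choosing the records.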

10 Comments

I am getting this: ValueError: dictionary update sequence element #0 has length 1; 2 is required. It is raised on this line: d = dict(flatdict.FlatterDict(data))
...not with the example from the code above, I assume? What does your JSON look like?
The same as what I posted in the question. The code is: data = json.load(open(json_file_path)); json_formatted_str = json.dumps(data, indent=4); d = dict(flatdict.FlatterDict(json_formatted_str))
That's not possible. The data you posted has trailing "," characters, which json.load() cannot handle: json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: line 7 column 9 (char 119). Run the code I posted, then adapt it to your needs. The data it contains is your data, but fixed :)
As for the trailing commas in your JSON file, you probably have to fix them: "trailing commas are not allowed in JSON" (developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/…)
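
The two errors in this thread can be reproduced with the standard library alone. The sketch below does not call flatdict. The ValueError text reported above is what Python's built-in dict() raises when given a string, which suggests (an assumption, not checked against the flatdict source) that FlatterDict passes its argument on to dict(). The likely fix is to pass the parsed `data` rather than the `json.dumps` output.

```python
import json

data = {"result": [{"level": "L3_SW", "name": "L23", "type": "CM"}]}

# json.dumps turns the parsed dict back into a str.
json_formatted_str = json.dumps(data, indent=4)

# Building a dict from a str iterates over its characters, which fails.
try:
    dict(json_formatted_str)
    string_error = None
except ValueError as exc:
    string_error = str(exc)
print(string_error)

# A trailing comma makes the text invalid JSON, so the parser rejects it.
try:
    json.loads('{"name": "L23",}')
    trailing_comma_rejected = False
except json.JSONDecodeError:
    trailing_comma_rejected = True
print(trailing_comma_rejected)
```

The exact wording of the JSONDecodeError differs between Python versions, so only the exception type is checked here.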