I have the following catalog table and want to use AWS Glue to flatten it:

| accountId | resourceId | items                                                           |
|-----------|------------|-----------------------------------------------------------------|
| 1         | r1         | {application:{component:[{name: "tool", version: "1.0"}, {name: "app", version: "1.0"}]}} |
| 1         | r2         | {application:{component:[{name: "tool", version: "2.0"}, {name: "app", version: "2.0"}]}} |
| 2         | r3         | {application:{component:[{name: "tool", version: "3.0"}, {name: "app", version: "3.0"}]}} |

Here is my schema

root
 |-- accountId: 
 |-- resourceId: 
 |-- PeriodId: 
 |-- items: 
 |    |-- application: 
 |    |    |-- component: array

I want to flatten it to the following:

| accountId | resourceId | name | version |
|-----------|------------|------|---------|
| 1         | r1         | tool | 1.0     |
| 1         | r1         | app  | 1.0     |
| 1         | r2         | tool | 2.0     |
| 1         | r2         | app  | 2.0     |
| 2         | r3         | tool | 3.0     |
| 2         | r3         | app  | 3.0     |

1 Answer

Judging from your schema and data, items is a nested struct whose application.component field is an array of structs. You can explode items.application.component to get one row per array element, and then select the name and version fields from the exploded struct.

This link might help you understand: https://docs.databricks.com/spark/latest/dataframes-datasets/complex-nested-data.html

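    from pyspark.sql import functions as F

    # df is a Spark DataFrame; in a Glue job, call .toDF() on the DynamicFrame first.
    # Explode the component array into one row per element, then pull out its fields.
    (df.withColumn("items", F.explode(F.col("items.application.component")))
       .select("accountId", "resourceId", "items.name", "items.version")
       .show())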


    +---------+----------+----+-------+
    |accountId|resourceId|name|version|
    +---------+----------+----+-------+
    |        1|        r1|tool|    1.0|
    |        1|        r1| app|    1.0|
    |        1|        r2|tool|    2.0|
    |        1|        r2| app|    2.0|
    |        2|        r3|tool|    3.0|
    |        2|        r3| app|    3.0|
    +---------+----------+----+-------+
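
If it helps to see what the explode step does row by row, here is a minimal plain-Python sketch of the same logic. It is only an illustration and does not use Spark or Glue: the rows list is sample data copied from the table in the question, and flatten is a hypothetical helper name, not part of any library.

```python
# Sample rows mirroring the table in the question (plain dicts, no Spark needed).
rows = [
    {"accountId": 1, "resourceId": "r1", "items": {"application": {"component": [
        {"name": "tool", "version": "1.0"}, {"name": "app", "version": "1.0"}]}}},
    {"accountId": 1, "resourceId": "r2", "items": {"application": {"component": [
        {"name": "tool", "version": "2.0"}, {"name": "app", "version": "2.0"}]}}},
    {"accountId": 2, "resourceId": "r3", "items": {"application": {"component": [
        {"name": "tool", "version": "3.0"}, {"name": "app", "version": "3.0"}]}}},
]


def flatten(rows):
    """Yield one flat record per element of items.application.component."""
    for row in rows:
        # This inner loop plays the role of explode: one output row per array element.
        for comp in row["items"]["application"]["component"]:
            yield {
                "accountId": row["accountId"],
                "resourceId": row["resourceId"],
                "name": comp["name"],
                "version": comp["version"],
            }


flat = list(flatten(rows))
for rec in flat:
    print(rec)
```

As with F.explode, a row whose component list is empty produces no output rows here. In Spark, F.explode_outer keeps such rows and fills the exploded column with null.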