I have a DataFrame with a JSON string column.
I am trying to turn this JSON string column into a proper STRUCT object, but as you can see, my schema is dynamic and can differ for each row. In some rows the value is a single JSON object, and in others it is a JSON array of objects; the number of objects in that array cannot be known in advance.
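To make the two shapes concrete, here is a minimal plain-Python sketch. The sample values and the helper name `as_object_list` are hypothetical and only for illustration; my real data has different keys. It shows the kind of normalization I am after: every row ends up as a list of objects, whether it started as a single object or as an array.

```python
import json

# Hypothetical sample values; the real column contents differ.
samples = [
    '{"a": 1, "b": "x"}',                        # a single JSON object
    '[{"a": 2, "b": "y"}, {"a": 3, "b": "z"}]',  # an array of objects
]

def as_object_list(json_string):
    """Parse a JSON string and always return a list of objects."""
    parsed = json.loads(json_string)
    # Wrap a lone object in a list so both shapes share one structure.
    return parsed if isinstance(parsed, list) else [parsed]

normalized = [as_object_list(s) for s in samples]
```

If every row had this list-of-objects shape, a single array-of-struct schema might be enough to cover all rows.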
I tried the solution below, but it only generates a correct schema for a single object, not for an array of objects.
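from pyspark.sql import functions as F
# The column name contains a hyphen, so it must be accessed as row['json-string'], not as an attribute
json_schema = spark.read.json(df.rdd.map(lambda row: row['json-string'])).schema
df = df.withColumn('new-struct-column', F.from_json(F.col('json-string'), json_schema))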
Also, this method generates an extra key called text, and I don't know where it is coming from.

