I am writing a Spark application in Java that reads a Hive table and stores the output in HDFS in JSON format.
I read the Hive table using HiveContext, which returns a DataFrame. Below is the code snippet.
SparkConf conf = new SparkConf().setAppName("App");
JavaSparkContext sc = new JavaSparkContext(conf);
HiveContext hiveContext = new org.apache.spark.sql.hive.HiveContext(sc);
DataFrame data1 = hiveContext.sql("select * from tableName");
Now I want to convert the DataFrame to a JSON array. For example, the data in data1 looks like this:
| A | B |
-------------------
| 1 | test |
| 2 | mytest |
I need output like this:
[{1:"test"},{2:"mytest"}]
I tried using data1.schema.json(), and it gives me output like the following, with one object per line rather than an array:
{1:"test"}
{2:"mytest"}
What is the right approach or function to convert the DataFrame to a JSON array without using any third-party libraries?
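For reference, here is a minimal sketch of the joining step I have in mind, using only the JDK. It assumes Java 8 (for String.join) and that each row is already available as a JSON object string, for example from data1.toJSON() in Spark 1.x, which yields keys taken from the column names ({"A":1,"B":"test"}) rather than the {1:"test"} shape shown above. The class and method names (JsonArrayJoin, toJsonArray) are hypothetical and not part of Spark.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not part of Spark): joins JSON object strings
// into a single JSON array string using only the JDK.
final class JsonArrayJoin {

    // Each element of rows is assumed to already be a valid JSON object,
    // e.g. one string per DataFrame row.
    static String toJsonArray(List<String> rows) {
        return "[" + String.join(",", rows) + "]";
    }

    public static void main(String[] args) {
        // In the Spark job, the list would come from something like:
        //   List<String> rows = data1.toJSON().toJavaRDD().collect();
        // This assumes Spark 1.x, where DataFrame.toJSON() returns an RDD<String>,
        // and a result small enough to collect on the driver.
        List<String> rows = Arrays.asList(
            "{\"A\":1,\"B\":\"test\"}",
            "{\"A\":2,\"B\":\"mytest\"}");
        System.out.println(toJsonArray(rows));
        // prints [{"A":1,"B":"test"},{"A":2,"B":"mytest"}]
    }
}
```

Collecting to the driver only makes sense for small tables; for large ones the rows would presumably have to stay distributed and be written out as one JSON object per line.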