How to convert Spark DataFrame output to JSON?

Date: 2022-04-18 23:11:41

I am reading a CSV file with a Spark SQL context.

Code:

// Options for the spark-csv reader (Spark 1.x DataFrame API)
Map<String, String> m = new HashMap<>();
m.put("path", CSV_DIRECTORY + file.getOriginalFilename());
m.put("inferSchema", "true"); // infer column types; everything is string otherwise
m.put("header", "true");      // use the first line of each file as the header
m.put("delimiter", ";");

DataFrame df = sqlContext.load("com.databricks.spark.csv", m);
df.printSchema();

Fetching column names and data types with df.printSchema():

Output:

 |-- id: integer (nullable = true)
 |-- ApplicationNo: string (nullable = true)
 |-- Applidate: timestamp (nullable = true)

What is the return type of the printSchema statement? How can I convert its output to JSON format, i.e., how do I convert the data frame's schema into JSON?

Desired output:

{"column":"id","datatype":"integer"}

1 Answer

#1



DataType has a json() method and a fromJson() method that you can use to serialize and deserialize schemas.

val df = sqlContext.read.....load()
// `json` is a parameterless method on DataType, so it is called without parentheses
val jsonString: String = df.schema.json
val schema: StructType = DataType.fromJson(jsonString).asInstanceOf[StructType]
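Note that df.schema.json serializes the whole StructType, while the desired output in the question is one small JSON object per column. A minimal sketch of that per-column mapping, with the column/type pairs hard-coded so it runs without a Spark cluster; in Spark they would come from the schema, e.g. iterating df.schema().fields() and reading f.name() and f.dataType().typeName() (that wiring is only shown in a comment here):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ColumnJson {
    // In Spark this would be driven by the schema, roughly:
    //   for (StructField f : df.schema().fields())
    //       System.out.println(toJson(f.name(), f.dataType().typeName()));
    // The pairs below are hard-coded stand-ins for that schema.

    // Build one JSON object per column in the desired shape.
    static String toJson(String column, String datatype) {
        return "{\"column\":\"" + column + "\",\"datatype\":\"" + datatype + "\"}";
    }

    public static void main(String[] args) {
        Map<String, String> columns = new LinkedHashMap<>();
        columns.put("id", "integer");
        columns.put("ApplicationNo", "string");
        columns.put("Applidate", "timestamp");
        columns.forEach((name, type) -> System.out.println(toJson(name, type)));
        // prints {"column":"id","datatype":"integer"} on the first line
    }
}
```

For real JSON output (with proper escaping of quotes in column names), a JSON library such as Jackson would be safer than string concatenation.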
