I am reading a CSV file with the Spark SQLContext.
Code:
m.put("path", CSV_DIRECTORY+file.getOriginalFilename());
m.put("inferSchema", "true"); // Automatically infer data types else string by default
m.put("header", "true"); // Use first line of all files as header
m.put("delimiter", ";");
DataFrame df = sqlContext.load("com.databricks.spark.csv",m);
df.printSchema();
Fetching the column names and data types with df.printSchema().
O/P:
 |-- id: integer (nullable = true)
 |-- ApplicationNo: string (nullable = true)
 |-- Applidate: timestamp (nullable = true)
What is the return type of printSchema()? How do I convert that output to JSON format, i.e., how do I convert the data frame's schema into JSON?
Desired O/P:
{"column":"id","datatype":"integer"}
1 Answer
#1
printSchema() only prints the schema tree to the console and returns nothing (Unit in Scala, void in Java), so there is no output to capture from it. Work with the schema itself instead: DataType has a json() method and a fromJson() method which you can use to serialize/deserialize schemas.
import org.apache.spark.sql.types.{DataType, StructType}

val df = sqlContext.read. ... .load()    // load your DataFrame as before
val jsonString: String = df.schema.json  // the whole schema serialized as a JSON string
val schema: StructType = DataType.fromJson(jsonString).asInstanceOf[StructType]  // and deserialized back
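Note that df.schema.json gives the full nested schema (roughly {"type":"struct","fields":[...]}), not the per-column shape shown in the question. If you want exactly that shape, a minimal sketch (assuming the df loaded above) is to walk df.schema.fields and build one object per column from StructField.name and DataType.typeName; columnsAsJson below is just an illustrative name, not part of the original answer:

// One JSON string per column, matching the desired {"column":...,"datatype":...} shape.
val columnsAsJson: Array[String] = df.schema.fields.map { f =>
  s"""{"column":"${f.name}","datatype":"${f.dataType.typeName}"}"""
}
columnsAsJson.foreach(println)  // e.g. {"column":"id","datatype":"integer"}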