Rust: A Solution for Reading CSV Files That Contain Chinese Characters

Date: 2022-07-13 08:12:09

Thanks to 42 for the help with this article.

Rust has a library, csv, for working with CSV files.

Documentation: http://burntsushi.net/rustdoc/csv/

For example, suppose there is a CSV file with the following structure (it has a header row):

[Screenshot of the CSV file omitted]

The rows of the CSV can be read out in two ways:

1. Decode each row into a tuple

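extern crate csv;

use std::thread;

fn main() {
    // has_headers(false): the first row is not treated as a header and is
    // returned like any other record. Pass true to treat it as a header row.
    let mut rdr = csv::Reader::from_file("C:\\IC1505.csv")
        .unwrap()
        .has_headers(false);
    // enumerate() supplies the row counter that is printed as `num`.
    for (num, record) in rdr.decode().enumerate() {
        let (s1, s2, dist): (String, String, String) = record.unwrap();
        println!("{}=> {}, {}, {}", num, s1, s2, dist);
    }
    // Pause before the program exits.
    thread::sleep_ms(500000);
}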
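
The effect of the has_headers flag can be sketched without the csv crate. The snippet below is a simplified stand-in, not the library's implementation: `read_tuples` is a hypothetical helper that splits on commas only (no quoting support) and drops the first line when `has_headers` is true, which is how the reader is documented to keep the header row out of the decoded records. The sample data in the usage note is made up.

```rust
/// Hypothetical helper: split each line of `text` into a 3-tuple of Strings.
/// When `has_headers` is true the first line is skipped.
/// Simplification: splits on ',' only, so quoted fields are not supported.
fn read_tuples(text: &str, has_headers: bool) -> Vec<(String, String, String)> {
    let skip = if has_headers { 1 } else { 0 };
    text.lines()
        .skip(skip)
        .filter_map(|line| {
            let mut fields = line.split(',').map(|f| f.trim().to_string());
            // Rows with fewer than three fields are dropped in this sketch.
            Some((fields.next()?, fields.next()?, fields.next()?))
        })
        .collect()
}
```

Given the made-up input "market,code,date\nCFFEX,IC1505,2015-04-16\n", calling `read_tuples(text, true)` yields only the data row, while `read_tuples(text, false)` also yields the header row as if it were data.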

2. Decode each row into a struct, which is very convenient

extern crate csv;
extern crate rustc_serialize;

use std::thread;

// Fields are decoded by position, one per CSV column.
#[derive(RustcDecodable, RustcEncodable)]
struct bar {
    market: String,
    code: String,
    date: String,
    open: String,          // => f64?
    high: String,          // => f64?
    low: String,           // => f64?
    close: String,         // => f64? (added to match the struct in section 3)
    volume: String,        // => f64?
    openInterests: String, // => f64?
}

fn main() {
    let mut rdr = csv::Reader::from_file("C:\\IC1505.csv")
        .unwrap()
        .has_headers(false);
    // enumerate() supplies the row counter that is printed as `num`.
    for (num, record) in rdr.decode().enumerate() {
        let temp: bar = record.unwrap();
        println!("{}=> {}, {}, {},{},{}",
                 num,
                 temp.market,
                 temp.code,
                 temp.date,
                 temp.open,
                 temp.high);
    }
    // Pause before the program exits.
    thread::sleep_ms(500000);
}

3. Refining the struct

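How can the fields of bar be given their precise types (f64, String, ...)?
We found that if the header row is removed from the file, so that only data rows remain, every field of the bar struct can be decoded into its exact type. With has_headers(false) the header row is decoded like any other record, and column names cannot be parsed as f64.

[Screenshot of the CSV file without its header row omitted]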

extern crate csv;
extern crate rustc_serialize;

use std::thread;

// Numeric columns are now decoded directly as f64.
#[derive(RustcDecodable, RustcEncodable)]
struct bar {
    market: String,
    code: String,
    date: String,
    open: f64,
    high: f64,
    low: f64,
    close: f64,
    volume: f64,
    openInterests: f64,
}

fn main() {
    let mut rdr = csv::Reader::from_file("C:\\IC1505.csv")
        .unwrap()
        .has_headers(false);
    // Collect every decoded row.
    let mut data: Vec<bar> = Vec::new();
    for record in rdr.decode() {
        let temp: bar = record.unwrap();
        println!("len: =>{:?},{},{},{},{}",
                 data.len(),
                 temp.market,
                 temp.code,
                 temp.date,
                 temp.close);
        data.push(temp);
    }
    // Pause before the program exits.
    thread::sleep_ms(500000);
}
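
Why the header row gets in the way of typed fields can be shown with the standard library alone. The following is a minimal sketch, not the csv crate's decoder: `Bar` here is a cut-down, made-up record type and `parse_bar` is a hypothetical helper that splits on commas and parses the numeric columns with `str::parse::<f64>`.

```rust
/// Cut-down, made-up record type with two numeric columns.
#[derive(Debug, PartialEq)]
struct Bar {
    code: String,
    open: f64,
    close: f64,
}

/// Hypothetical helper: parse one "code,open,close" line into a Bar.
/// Returns an error message when the field count or a number is wrong.
fn parse_bar(line: &str) -> Result<Bar, String> {
    let fields: Vec<&str> = line.split(',').map(|f| f.trim()).collect();
    if fields.len() != 3 {
        return Err(format!("expected 3 fields, got {}", fields.len()));
    }
    // A header cell such as "open" is not a number, so parsing fails here.
    let open = fields[1].parse::<f64>().map_err(|e| format!("open: {}", e))?;
    let close = fields[2].parse::<f64>().map_err(|e| format!("close: {}", e))?;
    Ok(Bar { code: fields[0].to_string(), open, close })
}
```

With this sketch, `parse_bar("code,open,close")` returns an error while `parse_bar("IC1505,7000.5,7010.0")` succeeds. Deleting the header from the file avoids the failure; letting the reader skip the header row (has_headers(true)) should, as far as the csv documentation describes it, have the same effect without editing the file.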

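4. How can the csv library read Chinese characters?
For example, take a CSV file whose cells contain Chinese text:

[Screenshot of the CSV file with Chinese text omitted]

The program below reads the file as raw bytes, decodes them from GB18030 into a UTF-8 String with the encoding crate, and then hands that string to the csv reader.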


extern crate csv;
extern crate rustc_serialize;
extern crate encoding;

use std::fs::File;
use std::io::prelude::*;

use encoding::{Encoding, DecoderTrap};
use encoding::all::GB18030; // an encoding that covers Chinese characters

fn main() {
    let path = "C:\\Users\\Desktop\\test.csv";
    let mut f = File::open(path).expect("cannot open file");
    // Read the whole file as raw bytes; they are not UTF-8 yet.
    let mut bytes: Vec<u8> = Vec::new();
    f.read_to_end(&mut bytes).expect("cannot read file");
    // Decode the GB18030 bytes into a UTF-8 String, ignoring invalid sequences.
    let mut chars = String::new();
    GB18030
        .decode_to(&bytes, DecoderTrap::Ignore, &mut chars)
        .expect("cannot decode file as GB18030");
    // Parse the decoded text; the first row is treated as a header.
    let mut rdr = csv::Reader::from_string(chars).has_headers(true);
    for row in rdr.decode() {
        let (x, y, r): (String, String, String) = row.unwrap();
        println!("({}, {}): {:?}", x, y, r);
    }
}
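
The reason for the extra decoding step can be checked with the standard library: Rust strings must be valid UTF-8, and Chinese text stored in a GB-family encoding is not. The sketch below is only an illustration; `is_valid_utf8` and the constant `ZHONGWEN_GBK` are names introduced here, and the sample bytes are the GBK/GB18030 encoding of the two characters "中文".

```rust
/// Returns true when `bytes` form valid UTF-8.
fn is_valid_utf8(bytes: &[u8]) -> bool {
    std::str::from_utf8(bytes).is_ok()
}

/// The characters "中文" encoded as GBK/GB18030 bytes (illustrative sample).
const ZHONGWEN_GBK: [u8; 4] = [0xD6, 0xD0, 0xCE, 0xC4];
```

`is_valid_utf8(&ZHONGWEN_GBK)` is false, whereas the UTF-8 bytes of the same text (`"中文".as_bytes()`) pass the check. Converting the GBK bytes with `String::from_utf8_lossy` only produces replacement characters, which is why the bytes have to be decoded with the right encoding before the text is parsed.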