How to get the headers from a CSV file in Ruby

Date: 2022-11-24 23:24:50

I need to validate headers in a CSV file before parsing data in it.

require 'csv'

# convert the data into an array of hashes
# (custom converter: turn empty string fields into nil)
CSV::Converters[:blank_to_nil] = lambda do |field|
  field && field.empty? ? nil : field
end
csv = CSV.new(file, :headers => true, :header_converters => :symbol, :converters => [:all, :blank_to_nil])
csv_data = csv.to_a.map {|row| row.to_hash }

I know I can use the headers method to get the headers:

    headers = csv.headers

But the problem with the headers method is that it "Returns nil if headers will not be used, true if they will but have not yet been read, or the actual headers after they have been read."
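
To make the three documented states concrete, here is a small sketch. The sample data and variable names are made up for illustration, and StringIO stands in for a real file.

```ruby
require 'csv'
require 'stringio'

sample = "name,age\nalice,30\n" # hypothetical sample data

# No :headers option: headers are not used at all.
no_headers_value = CSV.new(StringIO.new(sample)).headers

# :headers => true, but nothing has been read yet.
csv = CSV.new(StringIO.new(sample), headers: true)
before_read = csv.headers

# Reading the first data row forces the header row to be parsed.
csv.shift
after_read = csv.headers

p no_headers_value # nil
p before_read      # true
p after_read       # ["name", "age"]
```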

So if I put headers = csv.headers above the line csv_data = csv.to_a.map {|row| row.to_hash }, headers is true; if I put it after the data has been read, headers contains the header row as an array. This forces a particular order of statements in my method, which is hard to test and poor design.

Is there a way to read the header row without imposing an order in this scenario? I'm using Ruby 2.0.

3 Answers

#1


12  

CSV.open(file_path, &:readline)
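
A possible way to build header validation on top of this one-liner is sketched below. The helper names read_header_row and missing_headers, the required header list, and the temp file are all made up for illustration. Because no :headers option is passed, readline simply returns the first line as a plain array of strings.

```ruby
require 'csv'
require 'tempfile'

# Hypothetical helper: returns the first row of the file as an array
# of strings (nil for an empty file), without parsing the remaining rows.
def read_header_row(file_path)
  CSV.open(file_path, &:readline)
end

# Hypothetical validation: lists the required names that are absent.
def missing_headers(file_path, required)
  required - (read_header_row(file_path) || [])
end

# Demo with a temporary file standing in for the real CSV.
Tempfile.create(['people', '.csv']) do |f|
  f.write("name,age\nalice,30\n")
  f.flush
  p read_header_row(f.path)                     # ["name", "age"]
  p missing_headers(f.path, %w[name email])     # ["email"]
end
```

Since this opens the file separately from the later parse, the validation step does not depend on when the data rows are read.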

#2


4  

I get the problem! I'm having the same one. Calling read seems to do what you want (populates the headers variable):

data = CSV.new(file, **flags)
data.headers # => true

data = CSV.new(file, **flags).read
data.headers # => ['field1', 'field2']

There might be other side effects I'm not aware of, but this works for me and doesn't smell too bad.
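
One side effect worth knowing: read parses the whole input into a CSV::Table, so everything is loaded into memory at once. If that is acceptable, the ordering concern from the question can be hidden inside a single method, roughly as in this sketch. The helper name load_csv and the sample data are assumptions for illustration.

```ruby
require 'csv'
require 'stringio'

# Hypothetical helper: parses everything once and returns both the
# headers and the rows, so callers never deal with read order.
def load_csv(io)
  table = CSV.new(io, headers: true, header_converters: :symbol).read
  [table.headers, table.map(&:to_h)]
end

headers, rows = load_csv(StringIO.new("Name,Age\nalice,30\n"))
p headers # [:name, :age]
p rows    # [{:name=>"alice", :age=>"30"}] (printed as {name: "alice", age: "30"} on newer Rubies)
```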

#3


2  

I don't quite get the problem. If you use one of the iterator methods, it's quite easy to do some validation on the headers:

CSV.foreach('tmp.txt', headers: true) do |row|
  # each yielded row carries the parsed headers, so they can be checked here
  return if row.headers[0] == 'xyz'
end
