I have an Excel spreadsheet that has many people's estimates of another person's height and weight. In addition, some people have left comments on both estimate cells like "This estimate takes into account such and such".
I want to take the data from the spreadsheet (I've already figured out how to parse it), and represent it in a plain text file such that I can easily parse it back into a structured format (using Perl, ideally).
Originally I thought to use YAML:
Tom:
  Height:
    Estimate: 5
    Comment: Not that confident
  Weight:
    Estimate: 7
    Comment: Very confident
Natalia: ...
But now I'm thinking this is a bit difficult to read, and I was wondering whether there is some textual tabular representation that would be easier to read and still parsable.
Something like:
PERSON     HEIGHT     WEIGHT
-----------------------------
Tom        5          7
___START_HEIGHT_COMMENT___
We hold these truths to be self-evident, that all men are created equal, that they are endowed by their Creator with certain unalienable Rights, that among these are Life, Liberty and the pursuit of Happiness. That to secure these rights, Governments are instituted among Men, deriving their just powers from the consent of the governed [...]
Wait, what's this project about again?
___END_HEIGHT_COMMENT___
___START_WEIGHT_COMMENT___
We hold these truths to be self-evident, that all men are created equal, that they are endowed by their Creator with certain unalienable Rights, that among these are Life, Liberty and the pursuit of Happiness. That to secure these rights, Governments are instituted among Men, deriving their just powers from the consent of the governed [...]
Wait, what's this project about again?
___END_WEIGHT_COMMENT___
Natalia    2          4
John       3          3
Is there a better way to do this?
5 solutions
#1
CSV (Comma Separated Values).
You can even save it directly into this format from Excel, and read it directly into Excel from this format. Yet it is also human readable, and easily machine parseable.
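As a rough illustration of the round trip described above, here is a minimal sketch using Python's standard `csv` module (the asker prefers Perl, where a CSV module from CPAN would play the same role). The column names `person`, `height`, and `weight` and the helper names `to_csv` and `from_csv` are assumptions made for this example; the sample values come from the question.

```python
import csv
import io

# Sample rows; names and numbers are taken from the question,
# the lowercase column names are an assumption for this sketch.
rows = [
    {"person": "Tom", "height": "5", "weight": "7"},
    {"person": "Natalia", "height": "2", "weight": "4"},
    {"person": "John", "height": "3", "weight": "3"},
]


def to_csv(records):
    """Serialize a list of dicts to CSV text with a header row."""
    buf = io.StringIO(newline="")
    writer = csv.DictWriter(buf, fieldnames=["person", "height", "weight"])
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


def from_csv(text):
    """Parse CSV text back into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text, newline="")))


csv_text = to_csv(rows)
parsed = from_csv(csv_text)
```

All values come back as strings, so any numeric conversion would have to happen after parsing.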
#2
Normally, if I want to capture data from a spreadsheet in textual form, I use CSV (which Excel can read and write). It's easy to generate and parse, and it's compatible with many other tools, but it doesn't rank high on the "human readable" chart. It can be read, but it's awkward for anything other than simple files whose fields have equal widths.
XML is an option, but YAML is easier to read. Being human-readable is one of the design goals of YAML. The YAML::Tiny module is a nice and lightweight module for typical cases.
It looks like what you have in mind is a plain text table, or possibly a tabular format with fixed-width columns. There are some modules on CPAN that might be useful: Text::Table, Text::SimpleTable, and others. These modules can generate a representation that's easy to read, but parsing it will be harder. (They're intended for data presentation, not storage and retrieval.) You'd probably have to build your own parser.
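To give a sense of what "building your own parser" might involve, here is a small Python sketch for the table layout proposed in the question. It is only one possible approach and rests on several assumptions: columns are separated by whitespace, names contain no spaces, each comment block follows the row it belongs to, and the marker lines look exactly like `___START_HEIGHT_COMMENT___` / `___END_HEIGHT_COMMENT___`. The function name `parse_table` and the shortened sample comments are made up for illustration.

```python
import re

SAMPLE = """\
PERSON     HEIGHT     WEIGHT
-----------------------------
Tom        5          7
___START_HEIGHT_COMMENT___
Not that confident
___END_HEIGHT_COMMENT___
___START_WEIGHT_COMMENT___
Very confident
___END_WEIGHT_COMMENT___
Natalia    2          4
John       3          3
"""

# Matches an opening marker and captures the column name, e.g. HEIGHT.
START = re.compile(r"^___START_(\w+?)_COMMENT___$")


def parse_table(text):
    """Parse the question's table layout into nested dicts.

    Assumes whitespace-separated columns, names without spaces, and
    comment blocks that follow the row they belong to.
    """
    lines = text.splitlines()
    columns = [c.lower() for c in lines[0].split()[1:]]  # skip PERSON
    people = {}
    current = None  # name from the most recent data row
    field = None    # column whose comment block is open, if any
    buffer = []
    for line in lines[2:]:  # skip the header and the dashed rule
        if field is not None:
            if line == f"___END_{field.upper()}_COMMENT___":
                people[current][field]["comment"] = "\n".join(buffer)
                field, buffer = None, []
            else:
                buffer.append(line)
            continue
        match = START.match(line)
        if match:
            field = match.group(1).lower()
            continue
        if not line.strip():
            continue
        name, *values = line.split()
        current = name
        people[name] = {
            col: {"estimate": val, "comment": None}
            for col, val in zip(columns, values)
        }
    return people


table = parse_table(SAMPLE)
```

The sketch does no error handling: a comment block that appears before any data row, or one that names an unknown column, would raise a `KeyError`. That fragility is part of the trade-off against an established format such as CSV or YAML.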
#3
Adding to Robert's answer, you can simply put the comments in additional columns (Excel's CSV export quotes any field that contains a comma, a double quote, or a line break, so such comments survive intact). More on the CSV format: www.csvreader.com/csv_format.php
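The extra-column idea can be sketched with Python's standard `csv` module, which applies the same kind of quoting on output. The column names `height_comment` and `weight_comment` and the sample comment text are assumptions for this example, not part of the original spreadsheet.

```python
import csv
import io

# Illustrative column layout: one comment column next to each estimate.
FIELDS = ["person", "height", "height_comment", "weight", "weight_comment"]

record = {
    "person": "Tom",
    "height": "5",
    "height_comment": "Not that confident, really",
    "weight": "7",
    "weight_comment": 'He said "about 7"\nWait, what\'s this project about again?',
}

# Write one header row and one data row; the default dialect quotes
# only the fields that need it (commas, quotes, or line breaks).
buf = io.StringIO(newline="")
writer = csv.DictWriter(buf, fieldnames=FIELDS)
writer.writeheader()
writer.writerow(record)
text = buf.getvalue()

# Read the row back; the multi-line comment is restored as a single field.
round_tripped = next(csv.DictReader(io.StringIO(text, newline="")))
```

Long or multi-line comments make the raw file harder to scan by eye, which is the readability cost mentioned in the other answers, but the data stays in one row per person and parses back without a custom parser.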
#4
No reason you can't use XML, though I'd imagine it's overkill in this particular case.
#5
There's also Config::General for simple data, and its family of related classes.