I have 2 files, each containing 2 words: "word1" and "word2".
They are
- An XML file
<text>
  <word id="word1">
    <file>File1Name.txt</file>
    <file>File2Name.txt</file>
    <file>File3Name.txt</file>
  </word>
  <word id="word2">
    <file>File1Name.txt</file>
    <file>File4Name.txt</file>
  </word>
</text>
- A CSV file
word1, File1Name.txt, File2Name.txt, File3Name.txt
word2, File1Name.txt, File4Name.txt
Suppose I have 1 million words in both formats and I have to search for one word. Which format would be faster for retrieving the files that contain that word?
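To make the comparison concrete, here is a minimal Python sketch of one lookup in each format (the file names words.csv and words.xml are assumptions for illustration). The CSV can be scanned line by line and the scan stops at the first match, while the XML has to be parsed into a tree before it can be searched:

import csv
import xml.etree.ElementTree as ET

# Assumed file names -- adjust to the actual index files.
CSV_PATH = "words.csv"
XML_PATH = "words.xml"

def lookup_csv(word, path=CSV_PATH):
    """Scan the CSV line by line; column 0 is the word, the rest are files."""
    with open(path, newline="") as f:
        for row in csv.reader(f, skipinitialspace=True):
            if row and row[0] == word:
                return row[1:]
    return []

def lookup_xml(word, path=XML_PATH):
    """Parse the whole XML document, then find the matching <word> element."""
    root = ET.parse(path).getroot()
    for node in root.iter("word"):
        if node.get("id") == word:
            return [f.text for f in node.findall("file")]
    return []

print(lookup_csv("word2"))  # ['File1Name.txt', 'File4Name.txt']
print(lookup_xml("word2"))  # ['File1Name.txt', 'File4Name.txt']

Either way, a flat-file scan is O(n) in the number of words; for repeated lookups over a million words, loading the data once into a keyed structure (for example a dict mapping each word to its file list) avoids re-reading the file on every lookup.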
1 Answer
#1
-1
Hey, I wanted to put in my two cents here. https://github.com/elastic/elasticsearch is something I highly recommend you look into for something like this. As far as performance goes, I would recommend JSON over either XML or CSV. But if you are going to have a million records, a document store in a non-relational DB such as MongoDB would most likely give you the fastest results, especially if your data is flat.
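For example, the word-to-files mapping could be stored as one document per word and queried on an indexed field; a minimal sketch with pymongo, assuming a MongoDB server on localhost (the word_index database and words collection names are just placeholders):

from pymongo import MongoClient, ASCENDING

client = MongoClient("mongodb://localhost:27017")   # assumes a local MongoDB server
words = client["word_index"]["words"]               # placeholder db/collection names

# One document per word, listing the files that contain it.
words.insert_many([
    {"word": "word1", "files": ["File1Name.txt", "File2Name.txt", "File3Name.txt"]},
    {"word": "word2", "files": ["File1Name.txt", "File4Name.txt"]},
])

# An index on "word" keeps single-word lookups fast even with a million documents.
words.create_index([("word", ASCENDING)], unique=True)

doc = words.find_one({"word": "word2"})
print(doc["files"])   # ['File1Name.txt', 'File4Name.txt']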
Alternatively, if this is something you are loading into memory, I would try some type of caching solution; something like Redis might be useful for you: http://redis.io/topics/introduction Let me know if you have more questions.
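If the index is small enough to keep in memory, the same mapping could be cached as Redis sets; a minimal sketch with redis-py, assuming a Redis server on localhost (the "word:" key prefix is just an assumed naming convention):

import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)  # assumes a local Redis server

# One set per word, holding the files that contain it.
r.sadd("word:word1", "File1Name.txt", "File2Name.txt", "File3Name.txt")
r.sadd("word:word2", "File1Name.txt", "File4Name.txt")

# A single-word lookup is then one keyed read.
print(r.smembers("word:word2"))   # {'File1Name.txt', 'File4Name.txt'}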