C#: Reading Large XML Files

This post shows how to read very large XML files in C#. Consider the following XML file:

    <?xml version="1.0" encoding="utf-8"?>
    <catalog>
      <book id="bk101">
        <author>Gambardella, Matthew</author>
        <title>C# developer</title>
        <genre>Computer</genre>
        <price>44.95</price>
        <publish_date>2000-10-01</publish_date>
        <description>An in-depth look at creating applications
          with XML.</description>
      </book>
      <book id="bk102">
        <author>Ralls, Kim</author>
        <title>Midnight Rain</title>
        <genre>Fantasy</genre>
        <price>5.95</price>
        <publish_date>2000-12-16</publish_date>
        <description>A former architect battles corporate zombies,
          an evil sorceress, and her own childhood to become queen
          of the world.</description>
      </book>
    </catalog>

LINQ to XML makes it easy to work with this file. For example, to count the books (here excluding the one whose id is "bk109"):

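    // Requires: using System; using System.Linq; using System.Xml.Linq;
    XElement doc = XElement.Load("Book.xml");
    var books = from book in doc.Descendants("book")
                // The cast yields null instead of throwing when "id" is missing
                where (string)book.Attribute("id") != "bk109"
                select book;
    Console.WriteLine("Books count: {0}", books.Count());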

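This gives the result quickly and with very little code. When the XML file is large (for example, 50 MB), however, this approach becomes slow. XElement loads the whole document at once and keeps a DOM-style object tree for it in memory, which consumes a lot of memory. Using XmlDocument on a large XML file has the same problem.

For large XML files, use XmlReader instead. See the code below: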
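
To see the memory cost of loading the whole tree, the following is a minimal sketch that estimates how many managed bytes stay allocated after XElement.Load. The class and method names (DomMemoryProbe, MeasureLoad) are made up for illustration, and GC.GetTotalMemory only gives an approximation, so treat the result as a rough indicator rather than an exact measurement.

```csharp
using System;
using System.Xml.Linq;

public static class DomMemoryProbe
{
    // Hypothetical helper: returns the approximate growth of the managed heap
    // caused by loading the whole document into an XElement tree.
    public static long MeasureLoad(string path)
    {
        // Force a full collection first so earlier garbage does not skew the baseline.
        long before = GC.GetTotalMemory(forceFullCollection: true);

        XElement root = XElement.Load(path);

        long after = GC.GetTotalMemory(forceFullCollection: true);

        // Keep the tree reachable until after the second measurement.
        GC.KeepAlive(root);
        return after - before;
    }
}
```

Calling DomMemoryProbe.MeasureLoad("Book.xml") on a large file will typically report a figure noticeably larger than the file's size on disk, although the exact ratio depends on the runtime and on the shape of the document.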

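    using System.Collections.Generic;
    using System.Xml;
    using System.Xml.Linq;

    // Extension methods must live in a static class (the class name is arbitrary).
    public static class XmlReaderExtensions
    {
        // Streams the <book> elements one at a time instead of loading the whole file.
        public static IEnumerable<Book> Books(this XmlReader source)
        {
            source.MoveToContent();
            while (!source.EOF)
            {
                if (source.NodeType == XmlNodeType.Element && source.Name == "book")
                {
                    // XNode.ReadFrom builds only this <book> subtree and leaves the
                    // reader on the node after </book>, so Read() must not be called
                    // again here; otherwise an adjacent <book> would be skipped.
                    var element = (XElement)XNode.ReadFrom(source);
                    yield return new Book
                    {
                        Id = (string)element.Attribute("id"),
                        Author = (string)element.Element("author"),
                        Title = (string)element.Element("title"),
                        Description = (string)element.Element("description")
                    };
                }
                else
                {
                    source.Read();
                }
            }
        }
    }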

    using (XmlReader reader = XmlReader.Create("Book.xml"))
    {
        Console.WriteLine("Books count: {0}", reader.Books().Count());
    }
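
The Books() iterator creates Book objects, but the post never shows that type. A minimal sketch of what it might look like, assuming plain string properties that match the names used in the object initializer (the real class may well have more members):

```csharp
// Assumed shape of the Book type used by the Books() iterator.
// Only the four properties the iterator assigns are included here.
public class Book
{
    public string Id { get; set; }
    public string Author { get; set; }
    public string Title { get; set; }
    public string Description { get; set; }
}
```

Note that Count() in the usage above comes from System.Linq, so that namespace has to be imported along with System and System.Xml.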

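When the file is read with XmlReader, it is not loaded into memory all at once; the reader moves forward through the document node by node. For large XML files this is much more efficient than XmlDocument or LINQ to XML.

Thanks for reading.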
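
If only the number of books is needed, the streaming idea can be taken one step further by not building any objects at all. The following is a rough sketch under that assumption; BookCounter and CountBooks are names invented for this example, and it assumes <book> elements are not nested inside one another.

```csharp
using System.Xml;

public static class BookCounter
{
    // Hypothetical helper: counts <book> elements using only the forward-only reader.
    // No XElement or Book instances are created, so memory use stays small.
    public static int CountBooks(XmlReader reader)
    {
        int count = 0;

        // ReadToFollowing advances to the next element named "book"
        // and returns false once the end of the document is reached.
        while (reader.ReadToFollowing("book"))
        {
            count++;
        }
        return count;
    }
}
```

It can be called the same way as before, for example with XmlReader.Create("Book.xml") inside a using block. Unlike the LINQ query earlier, this version applies no filter on the id attribute.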
