How do I take an XML file with nested elements and get a set of C# classes from it?

Time: 2022-06-09 14:26:46

First off, I'm not terribly experienced in XML. I know the very basics of reading in and writing it, but for the most part, things like schemas start to make my eyes cross really quickly. If it looks like I'm making incorrect assumptions about how XML works, there's a good chance that I am.

That disclaimer aside, this is a problem I've run into several times without finding an agreeable solution. I have an XML file that defines data, including nested entries (for example, a file might have a "Power" element with an "AlternatePowers" child node, which in turn contains "Power" elements). Ideally, I would like to be able to quickly generate a set of classes from this XML file to store the data I'm reading in. The general solution I've seen is to use Microsoft's XSD.exe tool to generate an XSD file from the XML file and then use the same tool to convert the schema into classes. The catch is that the tool chokes if there are nested elements. Example:

- A column named 'Power' already belongs to this DataTable: cannot set 
a nested table name to the same name.

Is there a nice simple way to do this? I did a couple of searches for similar questions here, but the only questions I found dealing with generating schemas with nested elements with the same name were unanswered.

Alternatively, it's also possible that I am completely misunderstanding how XML and XSD work and it's not possible to have such nesting...

Update

As an example, one of the things I'd like to parse is the XML output of a particular character builder program. Fair warning: this is a bit wordy even though I removed everything but the powers section.

<?xml version="1.0" encoding="ISO-8859-1"?>
<document>
  <product name="Hero Lab" url="http://www.wolflair.com" versionmajor="3" versionminor="7" versionpatch=" " versionbuild="256">Hero Lab® and the Hero Lab logo are Registered Trademarks of LWD Technology, Inc. Free download at http://www.wolflair.com
    Mutants &amp; Masterminds, Second Edition is ©2005-2011 Green Ronin Publishing, LLC. All rights reserved.</product>
  <hero active="yes" name="Pretty Deadly" playername="">
    <size name="Medium"/>
    <powers>
      <power name="Enhanced Trait 16" info="" ranks="16" cost="16" range="" displaylevel="0" summary="Traits: Constitution +6 (18, +4), Dexterity +8 (20, +5), Charisma +2 (12, +1)" active="yes">
        <powerdesc>You have an enhancement to a non-effect trait, such as an ability (including saving throws) or skill (including attack or defense bonus). Since Toughness save cannot be increased on its own,use the Protection effect instead of Enhanced Toughness (see Protection later in this chapter).</powerdesc>
        <descriptors/>
        <elements/>
        <options/>
        <traitmods>
          <traitmod name="Constitution" bonus="+6"/>
          <traitmod name="Dexterity" bonus="+8"/>
          <traitmod name="Charisma" bonus="+2"/>
        </traitmods>
        <flaws/>
        <powerfeats/>
        <powerdrawbacks/>
        <usernotes/>
        <alternatepowers/>
        <chainedpowers/>
        <otherpowers/>
      </power>
      <power name="Sailor Suit (Device 2)" info="" ranks="2" cost="8" range="" displaylevel="0" summary="Hard to lose" active="yes">
        <powerdesc>A device that has one or more powers and can be equipped and un-equipped.</powerdesc>
        <descriptors/>
        <elements/>
        <options/>
        <traitmods/>
        <flaws/>
        <powerfeats/>
        <powerdrawbacks/>
        <usernotes/>
        <alternatepowers/>
        <chainedpowers/>
        <otherpowers>
          <power name="Protection 6" info="+6 Toughness" ranks="6" cost="10" range="" displaylevel="1" summary="+6 Toughness; Impervious [4 ranks only]" active="yes">
            <powerdesc>You're particularly resistant to harm. You gain a bonus on your Toughness saving throws equal to your Protection rank.</powerdesc>
            <descriptors/>
            <elements/>
            <options/>
            <traitmods/>
            <extras>
              <extra name="Impervious" info="" partialranks="2">Your Protection stops some damage completely. If an attack has a damage bonus less than your Protection rank, it inflicts no damage (you automatically succeed on your Toughness saving throw). Penetrating damage (see page 112) ignores this modifier; you must save against it normally.</extra>
            </extras>
            <flaws/>
            <powerfeats/>
            <powerdrawbacks/>
            <usernotes/>
            <alternatepowers/>
            <chainedpowers/>
            <otherpowers/>
          </power>
        </otherpowers>
      </power>
    </powers>
  </hero>
</document>

Yes, there are a number of unnecessary tags in there, but it's an example of the kind of XML that I'd like to be able to plug in and get something reasonable out of. Running this XML through XSD.exe generates the following error:

- A column named 'traitmods' already belongs to this DataTable: cannot set
a nested table name to the same name.

3 Answers

#1


1  

I just finished helping someone with that. Try reading this thread here: https://*.com/a/8840309/353147

Taking from your example and my link, you'd have classes like this.

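using System.Linq;
using System.Xml.Linq;

// Thin wrapper around a <power> element.
// XML names are case-sensitive, so the lookups use the lowercase names from the posted file.
public class Power
{
    XElement self;

    public Power(XElement power) { self = power; }

    public AlternatePowers AlternatePowers
    { get { return new AlternatePowers(self.Element("alternatepowers")); } }
}

// Wrapper around the <alternatepowers> container.
public class AlternatePowers
{
    XElement self;

    public AlternatePowers(XElement power) { self = power; }

    public Power2[] Powers
    { 
        get 
        { 
            // A missing container yields an empty array instead of a NullReferenceException.
            if (self == null) return new Power2[0];
            return self.Elements("power").Select(e => new Power2(e)).ToArray();
        }
    }
}

// Wrapper around a <power> nested inside <alternatepowers>.
public class Power2
{
    XElement self;

    public Power2(XElement power) { self = power; }
}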

Without knowing the rest of your xml, I cannot make the properties that make up each class/node level, but you should get the gist from here and from the link.

You'd then reference it like this:

Power power = new Power(XElement.Load("file"));
foreach(Power2 power2 in power.AlternatePowers.Powers)
{
    ...
}
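
Since the posted file nests `power` elements inside `otherpowers` to an arbitrary depth, the same wrapper idea can also be written with a single class that refers to itself. The following is only a sketch under assumptions: `PowerNode`, `PowerDemo`, `SampleXml` and `ListPowerNames` are hypothetical names, and the sample string is a trimmed-down stand-in for the Hero Lab output in the question.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical wrapper: one class serves every nesting depth.
public class PowerNode
{
    private readonly XElement self;

    public PowerNode(XElement power) { self = power; }

    // Attribute name follows the sample file; a missing attribute yields null.
    public string Name { get { return (string)self.Attribute("name"); } }

    // Powers nested under <otherpowers>; empty when the container is absent or empty.
    public IEnumerable<PowerNode> OtherPowers
    {
        get
        {
            return self.Elements("otherpowers")
                       .Elements("power")
                       .Select(e => new PowerNode(e));
        }
    }

    // Walks this power and everything nested beneath it, depth first.
    public IEnumerable<PowerNode> SelfAndNested()
    {
        yield return this;
        foreach (PowerNode child in OtherPowers)
            foreach (PowerNode nested in child.SelfAndNested())
                yield return nested;
    }
}

public static class PowerDemo
{
    // Trimmed-down stand-in for the XML in the question (assumption).
    public const string SampleXml =
        "<document><hero name='Pretty Deadly'><powers>" +
        "<power name='Enhanced Trait 16'><otherpowers/></power>" +
        "<power name='Sailor Suit (Device 2)'><otherpowers>" +
        "<power name='Protection 6'><otherpowers/></power>" +
        "</otherpowers></power>" +
        "</powers></hero></document>";

    // Returns the names of all powers, top-level and nested.
    public static List<string> ListPowerNames(string xml)
    {
        return XDocument.Parse(xml)
                        .Descendants("powers")
                        .Elements("power")
                        .Select(e => new PowerNode(e))
                        .SelectMany(p => p.SelfAndNested())
                        .Select(p => p.Name)
                        .ToList();
    }
}
```

Calling `PowerDemo.ListPowerNames(PowerDemo.SampleXml)` walks both top-level powers and the one nested inside the device. Because the wrapper only reads from the `XElement` on demand, adding more properties is a matter of adding more one-line getters.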

#2


0  

Your error message implies that you are trying to generate a DataSet from the schema (/d switch), as opposed to a set of arbitrary classes decorated with XML Serializer attributes (/c switch).

I've not tried generating a DataSet like that myself, but I can see how it might fail. A DataSet is a collection of DataTables, which in turn contain a collection of DataRows. That's a fixed 3-level hierarchy. If your XML schema is more or less than 3 levels deep, then it won't fit into the required structure. Try creating a test DataSet in the designer and examine the generated .xsd file; that will show you what kind of schema structure will fit.

I can assure you from personal experience, if you convert the schema to a set of arbitrary classes instead, then it will handle pretty much any schema structure that you care to throw at it.
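
To illustrate what such attribute-decorated classes can look like, here is a hand-written sketch (not actual xsd.exe output) of a class that refers to itself, so that a `power` can contain further `power` elements under `otherpowers`. The names `SerializedPower` and `SerializerDemo` are hypothetical, only a few of the attributes from the question's XML are mapped, and it assumes `ranks` always holds an integer, as it does in the posted sample.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

// Hypothetical hand-written class; xsd.exe would generate something more verbose.
[XmlRoot("power")]
public class SerializedPower
{
    [XmlAttribute("name")]
    public string Name { get; set; }

    // Assumes the attribute is always a valid integer, as in the sample.
    [XmlAttribute("ranks")]
    public int Ranks { get; set; }

    [XmlElement("powerdesc")]
    public string Description { get; set; }

    // The recursive part: <otherpowers> holds zero or more <power> elements.
    [XmlArray("otherpowers")]
    [XmlArrayItem("power")]
    public List<SerializedPower> OtherPowers { get; set; } = new List<SerializedPower>();
}

public static class SerializerDemo
{
    // Deserializes a single <power> element, including anything nested in it.
    public static SerializedPower Load(string xml)
    {
        var serializer = new XmlSerializer(typeof(SerializedPower));
        using (var reader = new StringReader(xml))
        {
            return (SerializedPower)serializer.Deserialize(reader);
        }
    }
}
```

Elements and attributes that have no matching member (for example `flaws` or `cost`) are skipped by `XmlSerializer` by default, so the class can start small and grow as more of the file is needed.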

#3


0  

So, it's not pretty, but the following is what I wound up with as a solution. I run processElement on the base node and then I go through extantElements and export the class code.

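using System.Collections;          // Hashtable
using System.Collections.Generic;  // HashSet<T>
using System.Text;                 // StringBuilder
using System.Xml;                  // XmlNode, XmlAttribute

namespace XMLToClasses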
{
    public class Element
    {
        public string Name;
        public HashSet<string> attributes;
        public HashSet<string> children;

        public bool hasText;

        public Element()
        {
            Name = "";

            attributes = new HashSet<string>();
            children = new HashSet<string>();

            hasText = false;
        }

        public string getSource()
        {
            StringBuilder sourceSB = new StringBuilder();

            sourceSB.AppendLine("[Serializable()]");
            sourceSB.AppendLine("public class cls_" + Name);
            sourceSB.AppendLine("{");

            sourceSB.AppendLine("\t// Attributes" );

            if (hasText)
            {
                sourceSB.AppendLine("\tpublic string InnerText;");
            }

            foreach(string attribute in attributes)
            {
                sourceSB.AppendLine("\tpublic string atr_" + attribute + ";");
            }
            sourceSB.AppendLine("");
            sourceSB.AppendLine("\t// Children");
            foreach (string child in children)
            {
                sourceSB.AppendLine("\tpublic List<cls_" + child + "> list" + child + ";");
            }

            sourceSB.AppendLine("");
            sourceSB.AppendLine("\t// Constructor");
            sourceSB.AppendLine("\tpublic cls_" + Name + "()");
            sourceSB.AppendLine("\t{");
            foreach (string child in children)
            {
                sourceSB.AppendLine("\t\tlist" + child + " = new List<cls_" + child + ">()" + ";");
            }
            sourceSB.AppendLine("\t}");

            sourceSB.AppendLine("");
            sourceSB.AppendLine("\tpublic cls_" + Name + "(XmlNode xmlNode) : this ()");
            sourceSB.AppendLine("\t{");

            if (hasText)
            {
                sourceSB.AppendLine("\t\tInnerText = xmlNode.InnerText;");
                sourceSB.AppendLine("");
            }            

            foreach (string attribute in attributes)
            {
                sourceSB.AppendLine("\t\tif (xmlNode.Attributes[\"" + attribute + "\"] != null)");
                sourceSB.AppendLine("\t\t{");
                sourceSB.AppendLine("\t\t\tatr_" + attribute + " = xmlNode.Attributes[\"" + attribute + "\"].Value;");
                sourceSB.AppendLine("\t\t}");
            }

            sourceSB.AppendLine("");

            foreach (string child in children)
            {
                sourceSB.AppendLine("\t\tforeach (XmlNode childNode in xmlNode.SelectNodes(\"./" + child + "\"))");
                sourceSB.AppendLine("\t\t{");
                sourceSB.AppendLine("\t\t\tlist" + child + ".Add(new cls_" + child + "(childNode));");
                sourceSB.AppendLine("\t\t}");
            }

            sourceSB.AppendLine("\t}");

            sourceSB.Append("}");

            return sourceSB.ToString();
        }
    }

    public class XMLToClasses
    {
        public Hashtable extantElements;

        public XMLToClasses()
        {
            extantElements = new Hashtable();
        }

        public Element processElement(XmlNode xmlNode)
        {
            Element element;

            if (extantElements.Contains(xmlNode.Name))
            {
                element = (Element)extantElements[xmlNode.Name];
            }
            else
            {
                element = new Element();
                element.Name = xmlNode.Name;

                extantElements.Add(element.Name, element);
            }            

            if (xmlNode.Attributes != null)
            {
                foreach (XmlAttribute attribute in xmlNode.Attributes)
                {
                    if (!element.attributes.Contains(attribute.Name))
                    {
                        element.attributes.Add(attribute.Name);
                    }
                }
            }


            if (xmlNode.ChildNodes != null)
            {
                foreach (XmlNode node in xmlNode.ChildNodes)
                {
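                    // Text and CDATA both count as inner text.
                    if (node.NodeType == XmlNodeType.Text || node.NodeType == XmlNodeType.CDATA)
                    {
                        element.hasText = true;
                    }
                    // Only recurse into real elements; comments and similar nodes are skipped.
                    else if (node.NodeType == XmlNodeType.Element)
                    {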
                        Element childNode = processElement(node);

                        if (!element.children.Contains(childNode.Name))
                        {
                            element.children.Add(childNode.Name);
                        }
                    }
                }
            }

            return element;
        }
    }
}

I'm sure there are ways to make this prettier or work better, but it's sufficient for me.

Edit: Added ugly but functional deserialization code that takes an XmlNode containing the object and decodes it.

Later Thoughts: Two years later, I had an opportunity to re-use this code. Not only have I not kept it up to date here (I'd made changes to better normalize the names of the items), but I also think the commenters who said I was going about this the wrong way were right. I still think this could be a handy way of generating template classes for an XML file where a given type of element can show up at different depths, but it's inflexible (you have to rerun the code and re-extract the classes every time) and doesn't handle format changes between versions well. Between when I first wrote this code to quickly create a character file converter and now, the format changed, so I had people complaining that it stopped working. In retrospect, it would have made more sense to search for the correct elements using XPath and then pull the data from there.

Still, it was a valuable experience, and I suspect I'm probably going to come back to this code from time to time for quickly roughing out XML data, at least until I find something better.
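
For comparison, the XPath approach mentioned above might look roughly like the sketch below. It is an illustration under assumptions rather than the code I actually ended up with: `XPathDemo` and `FindPowerCosts` are hypothetical names, and the attribute names come from the sample file in the question.

```csharp
using System.Collections.Generic;
using System.Xml;

public static class XPathDemo
{
    // Returns "name: cost" for every <power> element, whatever its depth.
    public static List<string> FindPowerCosts(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);

        var result = new List<string>();

        // "//power" matches <power> elements anywhere in the document,
        // so nested powers are found without knowing how deep they sit.
        foreach (XmlNode node in doc.SelectNodes("//power"))
        {
            XmlAttribute name = node.Attributes["name"];
            XmlAttribute cost = node.Attributes["cost"];

            // Missing attributes are tolerated, which helps when the format changes.
            result.Add((name != null ? name.Value : "(unnamed)") + ": " +
                       (cost != null ? cost.Value : "?"));
        }

        return result;
    }
}
```

Because the query only names the elements and attributes it cares about, extra tags added by a newer version of the file format do not break it, which is the weakness of the generated classes described above.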

