What is the best way to wrap some text in an XML tag?

Time: 2022-11-29 13:20:51

I am trying to use Regex in C# to match a section of an XML document and wrap that section in a tag.

For example, I have this section:

<intro>
    <p>this is the first section of content</p>
    <p> this is another</p>
</intro>

and I want it to look like this:

<intro>
   <bodyText>
      <p> this is asdf</p>
      <p> yada yada </p>
   </bodyText>
</intro>

any thoughts?

I was considering doing it using the XPath class in C# or just by reading in the document and using Regex. I just can't seem to figure it out either way.

Here is one attempt:

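        // Read the whole file; "using" closes the reader even if an exception is thrown.
        string content;
        using (StreamReader reader = new StreamReader(filePath))
        {
            content = reader.ReadToEnd();
        }

        /* The regex stuff would go here */

        // Overwrite the file with the (possibly modified) content.
        using (StreamWriter writer = new StreamWriter(filePath))
        {
            writer.Write(content);
        }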
    }

Thanks!

2 Answers

#1


6  

I wouldn't recommend regular expressions for this task. Instead you can do it using LINQ to XML. For example, here is how you could wrap some tags inside a new tag:

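using System;
using System.Xml.Linq;

XDocument doc = XDocument.Load("input.xml");
// Select the <p> children of the root element, then replace the root's content with a <bodyText> wrapper around them.
var section = doc.Root.Elements("p");
doc.Root.ReplaceAll(new XElement("bodyText", section));
Console.WriteLine(doc.ToString());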

Result:

<intro>
  <bodyText>
    <p>this is the first section of content</p>
    <p> this is another</p>
  </bodyText>
</intro>

I assume that your actual document differs considerably from the example you posted so the code will need some adjustment to fit your requirements, but if you read the documentation for XDocument you should be able to do what you want.
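
As one possible adjustment: ReplaceAll also discards the attributes and any other children of the element it is called on, so for a larger document you may prefer to move the <p> elements in place. The following is only a sketch under assumptions. The class name WrapWithLinq, the helper WrapParagraphs, and the sample document in Main are made up for illustration and are not part of the question.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class WrapWithLinq
{
    // Hypothetical helper: wraps the <p> children of every <intro> element
    // in a new <bodyText> element, leaving attributes and other children alone.
    public static XDocument WrapParagraphs(XDocument doc)
    {
        // ToList() snapshots the query so the tree can be modified while looping.
        foreach (XElement intro in doc.Descendants("intro").ToList())
        {
            var paragraphs = intro.Elements("p").ToList();
            if (paragraphs.Count == 0) continue;

            var bodyText = new XElement("bodyText");
            // Put the wrapper where the first <p> currently sits.
            paragraphs[0].AddBeforeSelf(bodyText);
            foreach (XElement p in paragraphs)
            {
                p.Remove();       // detach from <intro>
                bodyText.Add(p);  // re-attach under <bodyText>
            }
        }
        return doc;
    }

    public static void Main()
    {
        // Assumed sample input; a real document would come from XDocument.Load(path).
        var doc = XDocument.Parse(
            "<doc><intro id=\"a\"><title>t</title><p>one</p><p>two</p></intro></doc>");
        Console.WriteLine(WrapParagraphs(doc));
    }
}
```

To write the result back to the file, XDocument.Load(filePath) and doc.Save(filePath) can stand in for the StreamReader/StreamWriter pair from the question.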

#2


1  

I would suggest using System.Xml and XPath. I don't think XML is a regular language (the same goes for HTML), which is why parsing it with regular expressions causes problems.

Use something like:

// Requires: using System.Xml; (the class is XmlDocument, not XMLDocument)
XmlDocument doc = new XmlDocument();
doc.Load("Path to your xml document");
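
From there, the wrapping itself might look something like the sketch below. It is not a definitive implementation: the class name WrapWithXPath, the helper WrapParagraphs, the XPath expression //intro, and the inline sample XML are all assumptions chosen to match the example in the question.

```csharp
using System;
using System.Linq;
using System.Xml;

public static class WrapWithXPath
{
    // Hypothetical helper: moves the <p> children of the first <intro>
    // element (found via XPath) into a new <bodyText> element.
    public static void WrapParagraphs(XmlDocument doc)
    {
        XmlNode intro = doc.SelectSingleNode("//intro");
        if (intro == null) return;

        // Copy the matches to a list first, because the node list returned
        // by SelectNodes should not be relied on once the document changes.
        var paragraphs = intro.SelectNodes("p").Cast<XmlNode>().ToList();
        if (paragraphs.Count == 0) return;

        XmlElement bodyText = doc.CreateElement("bodyText");
        // Insert the wrapper before the first <p> so it keeps that position.
        intro.InsertBefore(bodyText, paragraphs[0]);
        foreach (XmlNode p in paragraphs)
        {
            // AppendChild moves a node that is already in the tree.
            bodyText.AppendChild(p);
        }
    }

    public static void Main()
    {
        var doc = new XmlDocument();
        // Assumed sample input; a real document would come from doc.Load(path).
        doc.LoadXml("<intro><p>one</p><p>two</p></intro>");
        WrapParagraphs(doc);
        Console.WriteLine(doc.OuterXml);
    }
}
```

Calling doc.Save(path) afterwards would write the modified document back to disk.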

Enjoy!
