Edit: I took the recommended LINQ to XML approach (see the answer below), and everything works except that I can't replace the changed records with the records from the incremental file. I got the program working by removing the node from the full file and then adding the incremental node. Is there a way to swap them in place instead? Also, while this solution is very nice, is there any way to reduce memory usage without giving up the LINQ code? The current solution may still be workable, but I would be willing to trade run time for lower memory usage.
I'm trying to take two XML files (a full file and an incremental file) and merge them together. The XML file looks like this:
    <List>
      <Records>
        <Person id="001" recordaction="add">
          ...
        </Person>
      </Records>
    </List>
The recordaction attribute can also be "chg" for changes or "del" for deletes. The basic logic of my program is:
1) Read the full file into an XmlDocument.
2) Read the incremental file into an XmlDocument, select its nodes with XmlDocument.SelectNodes(), and place those nodes into a dictionary for easier searching.
3) Select all the nodes in the full file, then loop through and check each one against the dictionary of incremental records. If the matching incremental record has recordaction="chg" or "del", add the node to a list, then delete all the nodes from the XmlNodeList that are in that list. Finally, add the records with recordaction="chg" or "add" from the incremental file to the full file.
4) Save the XML file.
I'm having some serious problems with step 3. Here's the code for that function:
    private void ProcessChanges(XmlNodeList nodeList, Dictionary<string, XmlNode> dictNodes)
    {
        XmlNode lastNode = null;
        XmlNode currentNode = null;
        List<XmlNode> nodesToBeDeleted = new List<XmlNode>();

        // If node from full file matches to incremental record and is change or delete,
        // mark full record to be deleted.
        foreach (XmlNode fullNode in fullDocument.SelectNodes("/List/Records/Person"))
        {
            dictNodes.TryGetValue(fullNode.Attributes[0].Value, out currentNode);
            if (currentNode != null)
            {
                if (currentNode.Attributes["recordaction"].Value == "chg"
                    || currentNode.Attributes["recordaction"].Value == "del")
                {
                    nodesToBeDeleted.Add(currentNode);
                }
            }
            lastNode = fullNode;
        }

        // Delete marked records
        for (int i = nodeList.Count - 1; i >= 0; i--)
        {
            if (nodesToBeDeleted.Contains(nodeList[i]))
            {
                nodeList[i].ParentNode.RemoveChild(nodesToBeDeleted[i]);
            }
        }

        // Add in the incremental records to the new full file for records marked add or change.
        foreach (XmlNode weeklyNode in nodeList)
        {
            if (weeklyNode.Attributes["recordaction"].Value == "add"
                || weeklyNode.Attributes["recordaction"].Value == "chg")
            {
                fullDocument.InsertAfter(weeklyNode, lastNode);
                lastNode = weeklyNode;
            }
        }
    }
The XmlNodeList being passed in is simply all of the incremental records selected from the incremental file, and the dictionary holds those same nodes keyed on the id, so I don't have to loop through all of the incremental records each time. Right now the program dies at the "Delete marked records" stage with an index out of bounds error. I'm pretty sure the "Add in the incremental records" stage doesn't work either. Any ideas? Suggestions for making this more efficient would also be welcome. I could run into a problem because it reads a 250MB file, which balloons to 750MB in memory, so I was wondering whether there is an easier way to go node-by-node through the full file. Thanks!
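For reference, the failure described above looks consistent with the delete loop: `i` counts over `nodeList` (the incremental records) but is then used to index `nodesToBeDeleted`, which is a different and usually shorter list. The list is also filled with `currentNode` (the incremental node) rather than `fullNode`, and the final `InsertAfter` is called on the document itself with nodes that belong to another `XmlDocument`. Below is a minimal sketch of step 3 that stays with `XmlDocument`, assuming every `Person` has an `id` attribute and that a "chg" for an unknown id should simply be appended. The class name `IncrementalMerge` and the changed signature are hypothetical, chosen only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

static class IncrementalMerge
{
    // Applies the incremental Person records to fullDocument in place.
    public static void ProcessChanges(XmlDocument fullDocument, XmlNodeList incrementalNodes)
    {
        XmlNode records = fullDocument.SelectSingleNode("/List/Records");

        // Index the full file's records by id once, before anything is modified.
        var fullById = new Dictionary<string, XmlNode>();
        foreach (XmlNode fullNode in records.SelectNodes("Person"))
        {
            fullById[fullNode.Attributes["id"].Value] = fullNode;
        }

        foreach (XmlNode incNode in incrementalNodes)
        {
            string id = incNode.Attributes["id"].Value;
            string action = incNode.Attributes["recordaction"].Value;
            fullById.TryGetValue(id, out XmlNode existing);

            if (action == "del")
            {
                if (existing != null)
                {
                    records.RemoveChild(existing);
                    fullById.Remove(id);
                }
                continue;
            }
            if (action != "add" && action != "chg")
            {
                throw new InvalidOperationException("Unrecognized recordaction: " + action);
            }

            // A node belongs to the document that created it, so copy it
            // into fullDocument before attaching it.
            XmlNode imported = fullDocument.ImportNode(incNode, true);
            if (existing != null)
            {
                records.ReplaceChild(imported, existing); // swap in place
            }
            else
            {
                records.AppendChild(imported); // assumption: unknown ids are appended
            }
            fullById[id] = imported;
        }
    }
}
```

It would be called with the loaded full document and the nodes selected from the incremental document, for example `IncrementalMerge.ProcessChanges(fullDocument, incrementalDocument.SelectNodes("/List/Records/Person"))`.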
1 Answer
Here's an example of how you might accomplish it with LINQ-to-XML. No dictionary is needed:
    using System;
    using System.Linq;
    using System.Xml.Linq;

    // Load the main and incremental xml files into XDocuments
    XDocument fullFile = XDocument.Load("fullfilename.xml");
    XDocument incrementalFile = XDocument.Load("incrementalfilename.xml");

    // For each Person in the incremental file
    foreach (XElement person in incrementalFile.Descendants("Person")) {

        // If the person should be added to the full file
        if (person.Attribute("recordaction").Value == "add") {
            fullFile.Element("List").Element("Records").Add(person); // Add him
        }

        // Else the person already exists in the full file
        else {
            // Find the element of the Person to delete or change
            var personToChange =
                (from p in fullFile.Descendants("Person")
                 where p.Attribute("id").Value == person.Attribute("id").Value
                 select p).Single();

            // Perform the appropriate operation
            switch (person.Attribute("recordaction").Value) {
                case "chg":
                    personToChange.ReplaceWith(person);
                    break;
                case "del":
                    personToChange.Remove();
                    break;
                default:
                    throw new ApplicationException("Unrecognized attribute");
            }
        }
    } // end foreach

    // Save the changes to the full file
    fullFile.Save("fullfilename.xml");
Please let me know if you have any problems running it and I'll edit and fix it. I'm pretty sure it's correct, but don't have VS available at the moment.
EDIT: fixed the "chg" case to use personToChange.ReplaceWith(person) rather than 'personToChange = person'. The latter doesn't replace anything, as it just shifts the reference away from the underlying document.
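On the memory question raised in the edit to the question: one option that keeps `XElement` for each record is to stream the full file instead of loading it whole. The sketch below is only an illustration under stated assumptions, not a tested drop-in. It assumes the incremental file is small enough to hold in memory, that ids are unique within each file, and that the output is written to a different path than the full file being read. The names `StreamingMerge`, `StreamPersons`, and `Merge` are hypothetical.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml;
using System.Xml.Linq;

static class StreamingMerge
{
    // Yields one Person element at a time so the full file is never fully in memory.
    static IEnumerable<XElement> StreamPersons(string path)
    {
        using (XmlReader reader = XmlReader.Create(path))
        {
            reader.MoveToContent();
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == "Person")
                {
                    // ReadFrom consumes the whole element and leaves the reader
                    // on the node that follows it, so no extra Read() here.
                    yield return (XElement)XNode.ReadFrom(reader);
                }
                else
                {
                    reader.Read();
                }
            }
        }
    }

    public static void Merge(string fullPath, string incrementalPath, string outputPath)
    {
        // Assumption: the incremental file is small and its ids are unique.
        Dictionary<string, XElement> incremental = XDocument.Load(incrementalPath)
            .Descendants("Person")
            .ToDictionary(p => (string)p.Attribute("id"));

        var settings = new XmlWriterSettings { Indent = true };
        using (XmlWriter writer = XmlWriter.Create(outputPath, settings))
        {
            writer.WriteStartElement("List");
            writer.WriteStartElement("Records");

            foreach (XElement person in StreamPersons(fullPath))
            {
                string id = (string)person.Attribute("id");
                if (incremental.TryGetValue(id, out XElement change))
                {
                    incremental.Remove(id);
                    if ((string)change.Attribute("recordaction") == "del")
                    {
                        continue; // deleted: write nothing
                    }
                    change.WriteTo(writer); // changed: write the incremental version
                }
                else
                {
                    person.WriteTo(writer); // untouched: copy through
                }
            }

            // Whatever is left did not match an existing record; write the non-deletes.
            foreach (XElement added in incremental.Values
                .Where(p => (string)p.Attribute("recordaction") != "del"))
            {
                added.WriteTo(writer);
            }

            writer.WriteEndElement(); // Records
            writer.WriteEndElement(); // List
        }
    }
}
```

With this shape, memory use should depend on the size of the incremental file and of a single record rather than on the full file, at the cost of writing a new output file on every run.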