EOutOfMemory when parsing a large XML file with TXMLDocument in Delphi

Date: 2021-11-06 17:07:13

I have a large XML file to parse with code like the sample below. The problem seems to be that the memory allocated to childnode (an IXMLNode) is not released, even when childnode goes out of scope. The memory only appears to be released once the parent TXMLDocument is deactivated (Active := False) or freed. As a result, my process, which uses around 380 MB once the XML document is loaded, grows to 2 GB, at which point it fails with EOutOfMemory. Setting childnode to nil has no effect on memory usage.

My question is how to explicitly release the memory allocated to the IXMLNode interfaces. I'm not open to using a different XML object and I think I've tried almost every way to control the scope of the node interfaces.

var
  childnode: IXMLNode;

for i:=0 to rootnode.ChildNodes.Count-1 do begin
    childnode:=rootnode.ChildNodes[i];
    ...
    childnode:=nil;
end;

1 answer

#1


I know you said you didn't want a different XML library, but in case someone else finds it useful, here is sample code that uses the MSXML SAX reader:

var
   sax: SAXXMLReader60;
   stm: IStream;
begin
   // Wrap the large file in an IStream; soOwned makes the adapter free the TFileStream
   stm := TStreamAdapter.Create(
      TFileStream.Create('USGovBudgetLineItems2008.xml', fmOpenRead), soOwned);

   sax := CoSAXXMLReader60.Create;
   sax.contentHandler := TVBSAXContentHandler.Create; // receives the parse events
   sax.parse(stm);                                    // streams the file instead of building a DOM
end;

We then receive the parse events through our TVBSAXContentHandler object.

For all the IDispatch methods you can return E_NOTIMPL (msxml doesn't even call them).

In all the remaining methods you can plug in whatever code you want:

TVBSAXContentHandler = class(TInterfacedObject, IVBSAXContentHandler)
protected
    { IDispatch }
    function GetTypeInfoCount(out Count: Integer): HResult; stdcall;
    function GetTypeInfo(Index, LocaleID: Integer; out TypeInfo): HResult; stdcall;
    function GetIDsOfNames(const IID: TGUID; Names: Pointer; NameCount, LocaleID: Integer; DispIDs: Pointer): HResult; stdcall;
    function Invoke(DispID: Integer; const IID: TGUID; LocaleID: Integer; Flags: Word; var Params; VarResult, ExcepInfo, ArgErr: Pointer): HResult; stdcall;
public
    { IVBSAXContentHandler }
    procedure Set_documentLocator(const Param1: IVBSAXLocator); safecall;
    procedure startDocument; safecall;
    procedure endDocument; safecall;
    procedure startPrefixMapping(var strPrefix: WideString; var strURI: WideString); safecall;
    procedure endPrefixMapping(var strPrefix: WideString); safecall;
    procedure startElement(var strNamespaceURI: WideString; var strLocalName: WideString;
                                var strQName: WideString; const oAttributes: IVBSAXAttributes); safecall;
    procedure endElement(var strNamespaceURI: WideString; var strLocalName: WideString;
                             var strQName: WideString); safecall;
    procedure characters(var strChars: WideString); safecall;
    procedure ignorableWhitespace(var strChars: WideString); safecall;
    procedure processingInstruction(var strTarget: WideString; var strData: WideString); safecall;
    procedure skippedEntity(var strName: WideString); safecall;
//      property documentLocator: IVBSAXLocator write Set_documentLocator;
end;
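
As a rough illustration of the two points above, the method bodies for the class declared above might look something like the following sketch. It is only a fragment for the implementation section, not a complete unit: it assumes Winapi.Windows (for E_NOTIMPL) and the imported MSXML type library unit are in the uses clause, and the element name 'LineItem' and the field FLineItemCount are hypothetical names invented for this example, not part of the original code.

```delphi
{ IDispatch: the answer notes msxml does not call these, so just report "not implemented" }

function TVBSAXContentHandler.GetTypeInfoCount(out Count: Integer): HResult;
begin
  Count := 0;
  Result := E_NOTIMPL;
end;

function TVBSAXContentHandler.GetTypeInfo(Index, LocaleID: Integer; out TypeInfo): HResult;
begin
  Pointer(TypeInfo) := nil;
  Result := E_NOTIMPL;
end;

function TVBSAXContentHandler.GetIDsOfNames(const IID: TGUID; Names: Pointer;
  NameCount, LocaleID: Integer; DispIDs: Pointer): HResult;
begin
  Result := E_NOTIMPL;
end;

function TVBSAXContentHandler.Invoke(DispID: Integer; const IID: TGUID; LocaleID: Integer;
  Flags: Word; var Params; VarResult, ExcepInfo, ArgErr: Pointer): HResult;
begin
  Result := E_NOTIMPL;
end;

{ IVBSAXContentHandler: one example callback }

procedure TVBSAXContentHandler.startElement(var strNamespaceURI: WideString;
  var strLocalName: WideString; var strQName: WideString;
  const oAttributes: IVBSAXAttributes);
begin
  // Called once per opening tag as the reader streams through the file.
  if strLocalName = 'LineItem' then   // hypothetical element name
    Inc(FLineItemCount);              // hypothetical Integer field added to the class
end;
```

The other IVBSAXContentHandler methods (startDocument, endDocument, characters, and so on) still need bodies for the class to compile; empty begin..end blocks are enough for any event you do not care about.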

Note: Any code is released into the public domain. No attribution required.
