I'm currently using POI to extract text from a batch of Word documents, and I need to determine which entries a document contains. I've gotten as far as retrieving the document root and the first entry, but I want to view all entries. The getEntries() method seems to provide this, but I'm at a loss as to how to use getViewableIterator() to pull them out.
Here is the code I have so far:
<cfset myFile = createObject("java", "java.io.FileInputStream").init(fileInputPath)>
<cfset fileSystem = CreateObject( "java", "org.apache.poi.poifs.filesystem.POIFSFileSystem" ).Init(myFile)>
<cfloop from="1" to="#fileSystem.getRoot().getEntryCount()#" index="i">
    <cfset viewableIterator = fileSystem.getRoot().getEntries().next().getViewableIterator()>
    <cfset nextEntry = fileSystem.getRoot().getEntries().next()>
    <cfif viewableIterator.hasNext()>
        <cfdump var="#nextEntry.getShortDescription()#">
        <cfset viewableIterator.remove()>
    </cfif>
</cfloop>
On the first pass through the loop, I get the first entry just fine. However, I get a java.lang.IllegalStateException as soon as remove() is executed. Obviously I'm not using remove() correctly, but I haven't been able to find any examples of how it should be used. Any help would be greatly appreciated.
2 Solutions
#1
I don't really understand your XML tags (I usually write Java in its normal form, with curly braces and so on), but in general a Java iterator works like this:
while (iterator.hasNext()) {
    x = iterator.next(); // get element
    // do with x what you want
    if (/* you want to remove x from the underlying list */)
        iterator.remove();
}
In practice, remove is rarely used; it is meant for cases where you walk through a collection and discard the elements you no longer need. It throws an IllegalStateException if you call it before next() (or twice for the same element), which is most likely what happens in your loop, since remove() is called on an iterator whose next() was never called. It throws an UnsupportedOperationException if the iterator does not support removal, for example on a read-only collection, and it can also fail if the collection is being modified through another iterator at the same time. Just stick with hasNext and next.
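To make the ordering rule concrete, here is a minimal sketch that uses only java.util, so it runs without POI on the classpath. The class name, the helper methods, and the entry names are placeholders made up for illustration; a plain ArrayList stands in for the iterator you get back from getEntries(), and I am assuming that iterator follows the standard java.util.Iterator contract.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical demo class; the entry names below are placeholders.
class IteratorRemoveDemo {

    // Visits every element exactly once: create ONE iterator, then
    // alternate hasNext()/next() until it is exhausted.
    static List<String> listAll(Iterator<String> it) {
        List<String> seen = new ArrayList<>();
        while (it.hasNext()) {
            seen.add(it.next());
        }
        return seen;
    }

    // Returns true if calling remove() before next() throws
    // IllegalStateException, as the Iterator contract specifies.
    static boolean removeBeforeNextFails(List<String> source) {
        Iterator<String> it = source.iterator();
        try {
            it.remove(); // no element has been returned yet
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    // Removes every element that starts with the given prefix, calling
    // remove() only after next() has returned the element in question.
    static void removeMatching(List<String> source, String prefix) {
        Iterator<String> it = source.iterator();
        while (it.hasNext()) {
            String name = it.next();
            if (name.startsWith(prefix)) {
                it.remove();
            }
        }
    }

    public static void main(String[] args) {
        List<String> entries =
            new ArrayList<>(Arrays.asList("WordDocument", "1Table", "CompObj"));

        System.out.println("All entries: " + listAll(entries.iterator()));
        System.out.println("remove() before next() fails: "
            + removeBeforeNextFails(entries));

        removeMatching(entries, "1");
        System.out.println("After removal: " + entries);
    }
}
```

Two things in the original loop differ from this pattern: it asks for a fresh iterator on every pass, so next() keeps returning the first entry, and it calls remove() on an iterator that has not yet returned anything. Fetching the iterator once before the loop and only calling hasNext() and next() should be enough to list every entry.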
#2
Ben Nadel, of Kinky Solutions fame, wrote a component that might handle your situation. Take a look and report back on whether his project helped you.
POI Utility ColdFusion Component