I want to read a DOCX file in C#. When a .docx file is renamed to .zip and extracted, it exposes the XML that makes up the document. I want to read that XML. I need all the text from the document along with its formatting: font name, bold/italic settings, and color. How can we do this?
4 solutions
#1
3
The DOCX format is well documented. To read the package, you can use the classes in the System.IO.Packaging namespace.
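As a rough sketch of what this answer suggests, the following opens a .docx as a package and loads its main document part as XML. The class and method names (`DocxPackageReader`, `LoadMainDocument`) are made up for illustration, and the relationship type shown is the one used by transitional OOXML files; strict-conformance files use a different URI, so treat this as a starting point rather than a complete reader.

```csharp
using System;
using System.IO;
using System.IO.Packaging; // WindowsBase.dll on .NET Framework; System.IO.Packaging NuGet package on newer .NET
using System.Linq;
using System.Xml.Linq;

static class DocxPackageReader
{
    // Relationship type pointing from the package root to the main document part
    // (assumption: transitional OOXML; strict files use a different URI).
    const string OfficeDocumentRelType =
        "http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument";

    // Hypothetical helper: returns the main document XML (usually /word/document.xml).
    public static XDocument LoadMainDocument(string docxPath)
    {
        using (Package package = Package.Open(docxPath, FileMode.Open, FileAccess.Read))
        {
            PackageRelationship rel =
                package.GetRelationshipsByType(OfficeDocumentRelType).FirstOrDefault();
            if (rel == null)
                throw new InvalidDataException("No main document relationship found in the package.");

            // Resolve the relationship target against the package root to get the part URI.
            Uri partUri = PackUriHelper.ResolvePartUri(new Uri("/", UriKind.Relative), rel.TargetUri);
            PackagePart part = package.GetPart(partUri);

            using (Stream stream = part.GetStream(FileMode.Open, FileAccess.Read))
            {
                return XDocument.Load(stream);
            }
        }
    }

    static void Main(string[] args)
    {
        // Usage: pass the path to a .docx file as the first argument.
        XDocument doc = LoadMainDocument(args[0]);
        Console.WriteLine(doc.Root.Name);
    }
}
```

Following the relationship instead of hard-coding `/word/document.xml` keeps the code working for packages that store the main part under a different name.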
#2
4
Low-level answer: DOCX files use the OPC (Open Packaging Conventions) format (zip files with a manifest) and can be opened with the classes available in the System.IO.Packaging namespace.
High-level answer: DocX is an open-source framework that supports manipulating DOCX files using higher-level constructs.
#3
1
You could use the Microsoft Office 12.0 Object Library.
#4
0
If you're able to read the file as XML, then maybe you could apply some XPath queries to get the info you need.
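To make the XPath idea concrete, here is a minimal sketch that walks the runs (`w:r`) of a WordprocessingML fragment and reads the text, font name, bold/italic flags, and color from each run's properties (`w:rPr`). The inline XML is a made-up sample standing in for the real `word/document.xml`, and `RunFormattingDemo` and `IsOn` are illustrative names. It only looks at direct run formatting; anything inherited from paragraph or character styles (`styles.xml`) is not resolved here.

```csharp
using System;
using System.Xml;

static class RunFormattingDemo
{
    // WordprocessingML main namespace (transitional OOXML).
    const string W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // A toggle such as <w:b/> is on unless its w:val says otherwise.
    static bool IsOn(XmlNode run, string name, XmlNamespaceManager ns)
    {
        XmlNode el = run.SelectSingleNode("w:rPr/w:" + name, ns);
        if (el == null) return false;
        string val = el.Attributes["val", W]?.Value;
        return val != "0" && val != "false" && val != "off";
    }

    static void Main()
    {
        // Made-up sample; in practice load the XML read from the .docx package.
        string xml =
            "<w:document xmlns:w='" + W + "'><w:body><w:p>" +
            "<w:r><w:rPr><w:rFonts w:ascii='Arial'/><w:b/><w:color w:val='FF0000'/></w:rPr>" +
            "<w:t>Hello</w:t></w:r>" +
            "<w:r><w:rPr><w:i/></w:rPr><w:t> world</w:t></w:r>" +
            "</w:p></w:body></w:document>";

        var doc = new XmlDocument();
        doc.LoadXml(xml);

        // XPath needs the "w" prefix mapped to the namespace URI.
        var ns = new XmlNamespaceManager(doc.NameTable);
        ns.AddNamespace("w", W);

        foreach (XmlNode run in doc.SelectNodes("//w:r", ns))
        {
            string text = run.SelectSingleNode("w:t", ns)?.InnerText ?? "";
            string font = run.SelectSingleNode("w:rPr/w:rFonts/@w:ascii", ns)?.Value ?? "(not set on run)";
            string color = run.SelectSingleNode("w:rPr/w:color/@w:val", ns)?.Value ?? "(not set on run)";

            Console.WriteLine("text='{0}' font={1} bold={2} italic={3} color={4}",
                text, font, IsOn(run, "b", ns), IsOn(run, "i", ns), color);
        }
    }
}
```

A run with no `w:rPr` values is not necessarily unformatted; its font and color may come from the paragraph style or the document defaults, which would need a separate lookup.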