查询存储为xml的SQL Server 2012中的文档

时间:2022-02-14 01:37:50

I have been trying to learn Open XML, and I successfully generated a free text 2007 .docx file in Visual Studio c#. After that I saved the file in the .xml format in Word and then Imported the .xml file into sql server 2012 into a table called XML to a column called xml: here is the code for the table:

我一直在尝试学习Open XML,并且我在Visual Studio c#中成功生成了一个*文本2007 .docx文件。之后,我将文件保存为Word中的.xml格式,然后将.xml文件导入到sql server 2012中,将一个名为XML的表导入到名为xml的列中:这是该表的代码:

CREATE TABLE [dbo].[XML](
    [XML_ID] [int] IDENTITY(1,1) NOT NULL,
    [XML] [xml] NOT NULL,
 CONSTRAINT [PK_XML] PRIMARY KEY CLUSTERED 
(
    [XML_ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]

GO

Here is the insert I ran

这是我跑的插入

INSERT INTO XML(XML)
SELECT * FROM OPENROWSET(
   BULK 'C:\Users\bbt2d\Desktop\Document.xml',
   SINGLE_BLOB) AS x;

When I ran a select * the xml file showed up, but when I tried querying some of the data from the paragraphs I wasn't getting anywhere.

当我运行select * xml文件出现时,但是当我尝试查询段落中的一些数据时,我没有得到任何结果。

Here is a partial snipper of the xml that is stored in sql server: Was wondering if I went about the process wrong, or how would you query out text from the document.

这是存储在sql server中的xml的部分剪切器:想知道我是否错误地处理了该进程,或者如何从文档中查询文本。

?mso-application progid="Word.Document"?>
<pkg:package xmlns:pkg="http://schemas.microsoft.com/office/2006/xmlPackage">
  <pkg:part pkg:name="/_rels/.rels" pkg:contentType="application/vnd.openxmlformats-package.relationships+xml" pkg:padding="512">
    <pkg:xmlData>
      <Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
        <Relationship Id="rId3" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/extended-properties" Target="docProps/app.xml" />
        <Relationship Id="rId2" Type="http://schemas.openxmlformats.org/package/2006/relationships/metadata/core-properties" Target="docProps/core.xml" />
        <Relationship Id="rId1" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument" Target="word/document.xml" />
      </Relationships>
    </pkg:xmlData>
  </pkg:part>
  <pkg:part pkg:name="/word/_rels/document.xml.rels" pkg:contentType="application/vnd.openxmlformats-package.relationships+xml" pkg:padding="256">
    <pkg:xmlData>
      <Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
        <Relationship Id="rId8" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/fontTable" Target="fontTable.xml" />
        <Relationship Id="rId3" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/settings" Target="settings.xml" />
        <Relationship Id="rId7" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/footer" Target="footer1.xml" />
        <Relationship Id="rId2" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/styles" Target="styles.xml" />
        <Relationship Id="rId1" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/numbering" Target="numbering.xml" />
        <Relationship Id="rId6" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/endnotes" Target="endnotes.xml" />
        <Relationship Id="rId5" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/footnotes" Target="footnotes.xml" />
        <Relationship Id="rId10" Type="http://schemas.microsoft.com/office/2007/relationships/stylesWithEffects" Target="stylesWithEffects.xml" />
        <Relationship Id="rId4" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/webSettings" Target="webSettings.xml" />
        <Relationship Id="rId9" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/theme" Target="theme/theme1.xml" />
      </Relationships>
    </pkg:xmlData>
  </pkg:part>
  <pkg:part pkg:name="/word/document.xml" pkg:contentType="application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml">
    <pkg:xmlData>
      <w:document xmlns:ve="http://schemas.openxmlformats.org/markup-compatibility/2006" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math" xmlns:v="urn:schemas-microsoft-com:vml" xmlns:wp="http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing" xmlns:w10="urn:schemas-microsoft-com:office:word" xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main" xmlns:wne="http://schemas.microsoft.com/office/word/2006/wordml">
        <w:body>
          <w:p w:rsidR="00C32D34" w:rsidRDefault="00C32D34">
            <w:pPr>
              <w:pStyle w:val="Heading1" />
              <w:spacing w:line="240" w:lineRule="auto" />
              <w:jc w:val="center" />
              <w:rPr>
                <w:rFonts w:ascii="Times New Roman" w:hAnsi="Times New Roman" />
                <w:sz w:val="44" />
                <w:szCs w:val="44" />
              </w:rPr>
            </w:pPr>
            <w:r>
              <w:rPr>
                <w:rFonts w:ascii="Times New Roman" w:hAnsi="Times New Roman" />
                <w:sz w:val="44" />
                <w:szCs w:val="44" />
              </w:rPr>
              <w:t>School District of Marshfield</w:t>
            </w:r>
          </w:p>
          <w:p w:rsidR="00C32D34" w:rsidRDefault="00C32D34">
            <w:pPr>
              <w:autoSpaceDE w:val="0" />
              <w:autoSpaceDN w:val="0" />
              <w:adjustRightInd w:val="0" />
              <w:spacing w:after="0" w:line="240" w:lineRule="auto" />
              <w:rPr>
                <w:rFonts w:ascii="Cambria" w:hAnsi="Cambria" w:cs="Cambria" />
                <w:b />
                <w:bCs />
                <w:sz w:val="28" />
                <w:szCs w:val="28" />
              </w:rPr>
            </w:pPr>
          </w:p>
          <w:p w:rsidR="00C32D34" w:rsidRDefault="00DE219B">
            <w:pPr>
              <w:autoSpaceDE w:val="0" />
              <w:autoSpaceDN w:val="0" />
              <w:adjustRightInd w:val="0" />
              <w:spacing w:after="0" w:line="240" w:lineRule="auto" />
              <w:jc w:val="center" />
              <w:rPr>
                <w:rFonts w:ascii="Cambria" w:hAnsi="Cambria" w:cs="Cambria" />
                <w:b />
                <w:bCs />
                <w:sz w:val="28" />
                <w:szCs w:val="28" />
              </w:rPr>
            </w:pPr>
            <w:r>
              <w:rPr>
                <w:rFonts w:ascii="Cambria" w:hAnsi="Cambria" w:cs="Cambria" />
                <w:b />
                <w:bCs />
                <w:sz w:val="28" />
                <w:szCs w:val="28" />
              </w:rPr>
              <w:t>Exploring French</w:t>
            </w:r>
            <w:r w:rsidR="00C32D34">
              <w:rPr>
                <w:rFonts w:ascii="Cambria" w:hAnsi="Cambria" w:cs="Cambria" />
                <w:b />
                <w:bCs />
                <w:sz w:val="28" />
                <w:szCs w:val="28" />
              </w:rPr>
              <w:t xml:space="preserve"> &amp; Spanish</w:t>
            </w:r>

2 个解决方案

#1


3  

The XML you posted isn't complete - assuming your table's XML is complete, you can try something like this (BTW: as a side-note: I would recommend to never use reserved words for your columns; don't call your column XML - that's a reserved keyword; use XmlDoc or something like that....)

您发布的XML不完整 - 假设您的表的XML已完成,您可以尝试这样的事情(顺便说一下:作为旁注:我建议永远不要为您的列使用保留字;不要调用您的列XML - 这是一个保留的关键字;使用XmlDoc或类似的......)

-- define the two XML namespaces we need to use to get to the data in question
;WITH XMLNAMESPACES ('http://schemas.microsoft.com/office/2006/xmlPackage' as pkg, 
                     'http://schemas.openxmlformats.org/wordprocessingml/2006/main' as w)
select
    XmlContent.value('(//w:t)[2]', 'varchar(100)')
FROM dbo.YourTable
where ....(some condition here).....

The really interesting part is finding out the proper XPath expression to get to your data. Unfortunately, the text you're interested in doesn't seem to be in a specifically marked paragraph, e.g. the <w:r> and <w:rPr> tags surrounding it don't have any specific attributes you can check for - so all you can do is figure out that it's the second <w:t> node in that document that you're interested in, and get to its text that way.

真正有趣的部分是找到适当的XPath表达式来获取数据。不幸的是,您感兴趣的文本似乎没有特别标记的段落,例如围绕它的 标签没有你可以检查的任何特定属性 - 所以你所能做的就是弄清楚它是你在那个文件中的第二个 节点感兴趣,并以这种方式得到它的文本。 :t> :rpr> :r>

#2


1  

i allways suggest to use Xquery when you are dealing with XML datatypes in sql server,Below is the url which gives you clear understanding of Xquery. http://www.youtube.com/watch?v=mj4qodGsgDA

我总是建议在sql server中处理XML数据类型时使用Xquery,下面是url,它让你清楚地了解Xquery。 http://www.youtube.com/watch?v=mj4qodGsgDA

#1


3  

The XML you posted isn't complete - assuming your table's XML is complete, you can try something like this (BTW: as a side-note: I would recommend to never use reserved words for your columns; don't call your column XML - that's a reserved keyword; use XmlDoc or something like that....)

您发布的XML不完整 - 假设您的表的XML已完成,您可以尝试这样的事情(顺便说一下:作为旁注:我建议永远不要为您的列使用保留字;不要调用您的列XML - 这是一个保留的关键字;使用XmlDoc或类似的......)

-- define the two XML namespaces we need to use to get to the data in question
;WITH XMLNAMESPACES ('http://schemas.microsoft.com/office/2006/xmlPackage' as pkg, 
                     'http://schemas.openxmlformats.org/wordprocessingml/2006/main' as w)
select
    XmlContent.value('(//w:t)[2]', 'varchar(100)')
FROM dbo.YourTable
where ....(some condition here).....

The really interesting part is finding out the proper XPath expression to get to your data. Unfortunately, the text you're interested in doesn't seem to be in a specifically marked paragraph, e.g. the <w:r> and <w:rPr> tags surrounding it don't have any specific attributes you can check for - so all you can do is figure out that it's the second <w:t> node in that document that you're interested in, and get to its text that way.

真正有趣的部分是找到适当的XPath表达式来获取数据。不幸的是,您感兴趣的文本似乎没有特别标记的段落,例如围绕它的 标签没有你可以检查的任何特定属性 - 所以你所能做的就是弄清楚它是你在那个文件中的第二个 节点感兴趣,并以这种方式得到它的文本。 :t> :rpr> :r>

#2


1  

i allways suggest to use Xquery when you are dealing with XML datatypes in sql server,Below is the url which gives you clear understanding of Xquery. http://www.youtube.com/watch?v=mj4qodGsgDA

我总是建议在sql server中处理XML数据类型时使用Xquery,下面是url,它让你清楚地了解Xquery。 http://www.youtube.com/watch?v=mj4qodGsgDA