How to get case-insensitive elements in XML

Time: 2022-04-04 20:15:44

As far as I know XML element type names as well as attribute names are case sensitive.

Is there a way or any trick to get case insensitive elements?

Clarification: A grammar has been defined via XSD, and some clients use it to upload data. The users (the content generators) create XML files with different tools, but many of them use plain text editors or whatever is at hand. Sometimes when these people try to upload their files they get incompatibility errors. A common error is mixing lower-case and upper-case tags, although it was always clear that tags ARE case sensitive.

I have access to the XSD file which defines this grammar and I can change it. The question is how to avoid this error-prone lower/upper case tags problem.

Any idea?

Thanks in advance!

7 solutions

#1


5  

If I understand your problem correctly then the case errors can only be corrected between the creation and the upload by a 3rd party parsing tool.

i.e. XML File > Parsed against XSD and corrected > Upload approved

You could do this at run-time by developing a container application for your clients to create their XML files in. Alternatively you could write an application on the server side that takes the uploaded file and checks the syntax. Either way you're going to have to make a decision and then do some work!!

A lot depends on the scale of the problem. If your XSD defines similar tags that differ only in case, and you are receiving yet another case variant, then you will need a complicated solution based on node counting etc.

If you are purely stuck with clients using random cases against an XSD only containing lower case tags then you should be able to parse the files and convert all tags to lower case in one go. This is assuming the content between the tags is multi-case and you can't just convert the full document.

How you do this depends on the mechanics of your situation. Obviously it will be easier to get the clients to error check their own submissions. If this isn't practical then you'll need to identify a window of opportunity in the process which will allow you to convert the file to the correct format before errors are encountered.

There are far too many ways to go about this to discuss here. It mainly depends on the skill-sets or finance available to you.

#2


1  

XPath/XSLT processors are case sensitive. They can't select a node or attribute if you specify the wrong case.

In case you want to output the node name and want it to be in upper case, you can do:

upper-case(local-name())

#3


1  

As @Melkisadek said, the XSD validation exists for a purpose. If you allow users to upload files with invalid XML, your application is bound to fail at some point when the data within those files is accessed. Furthermore, the whole purpose of having an XSD validate the input XML is defeated. If you are willing to forgo the whole schema validation feature, then you would need to use an XSLT to convert all tags to uppercase or lowercase as you desire (see @Rashmi's answer).

It would be analogous to allowing a user to input special characters in a Social Security Number entry field, just because the user is more comfortable entering special characters (Yes, this example is silly, couldn't think of a better one!)

Therefore, in my mind, the solution lies in keeping the schema validation as-is, but giving users a way to validate their files against the schema before uploading. For instance, if this is a web app, you could provide a button on the page which uses JavaScript to validate the file against your schema. Alternatively, validate on the server only when the file is uploaded. In both cases, provide appropriate feedback such as the line number on which the errant entities lie, the character position, and the reason for flagging an error.
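
As a rough illustration of that kind of feedback, here is a server-side sketch in Python using the standard xml.parsers.expat module. It is not a schema validator: it only flags element names that differ from a known name by case, and reports where they occur. The set KNOWN_ELEMENTS and the function find_case_errors are hypothetical; in a real setup the names would be taken from the XSD.

```python
import xml.parsers.expat

# Hypothetical element names; in practice these would be read from the XSD.
KNOWN_ELEMENTS = {"order", "item", "quantity"}


def find_case_errors(xml_text, known=KNOWN_ELEMENTS):
    """Return (line, column, found, expected) for names that are wrong only in case."""
    by_lower = {name.lower(): name for name in known}
    problems = []
    parser = xml.parsers.expat.ParserCreate()

    def on_start(name, attrs):
        expected = by_lower.get(name.lower())
        if expected is not None and name != expected:
            # Position of the start tag currently being reported by the parser.
            problems.append(
                (parser.CurrentLineNumber, parser.CurrentColumnNumber, name, expected)
            )

    parser.StartElementHandler = on_start
    parser.Parse(xml_text, True)
    return problems


if __name__ == "__main__":
    for line, col, found, expected in find_case_errors("<order>\n  <Item>x</Item>\n</order>"):
        print("line %d, column %d: <%s> should be <%s>" % (line, col, found, expected))
```

If the file is not well-formed at all, Parse raises xml.parsers.expat.ExpatError, whose lineno and offset attributes can be shown to the user in the same way.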

#4


1  

In theory, you could try to hack the XML Schema to validate incorrectly capitalised element names.

This can be done by using the substitution group mechanism in XML Schema. For example, if your schema had defined:

  <xsd:element name="foobar" type="xsd:string"/>

then you could add the following to the XML Schema:

  <xsd:element name="Foobar" type="xsd:string" substitutionGroup="foobar"/>
  <xsd:element name="FooBar" type="xsd:string" substitutionGroup="foobar"/>
  <xsd:element name="fooBar" type="xsd:string" substitutionGroup="foobar"/>
  <xsd:element name="FOOBAR" type="xsd:string" substitutionGroup="foobar"/>

etc., to try to anticipate the possible mistakes they could make. For each element, there could be 2^n possible combinations of case, where n is the length of the name (assuming each character of the name is a letter).
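
To get a feel for the 2^n growth, the variants can simply be enumerated. This is only an illustration in Python; case_variants is a made-up helper and is not something you would feed into a schema generator.

```python
from itertools import product


def case_variants(name):
    """Return every distinct upper/lower-case spelling of name."""
    # Each character contributes one or two choices (digits contribute one).
    choices = [{ch.lower(), ch.upper()} for ch in name]
    return sorted({"".join(combo) for combo in product(*choices)})


if __name__ == "__main__":
    variants = case_variants("foobar")
    print(len(variants))  # 64, i.e. 2**6
```

A six-letter name already needs 63 extra substitution-group entries, and a twelve-letter name would need 4095.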

In practice, this is too much trouble, only delays the problem rather than solving it, and probably won't work. If the users don't realise that XML is case sensitive, then they might not have end tags that match the case of the start tag and it will still fail to validate.
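
The end-tag point is easy to check: a start tag and end tag that differ in case make the document ill-formed, so it is rejected by the parser before any schema is consulted. A minimal check in Python (is_well_formed is a made-up helper name):

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if xml_text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    print(is_well_formed("<Foobar>text</Foobar>"))  # True
    print(is_well_formed("<Foobar>text</foobar>"))  # False, mismatched tag
```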

As other people have said, either pre-process the submitted input to fix the case, or get the users to produce correct input before they submit it.

#5


0  

XML is normally machine generated. Therefore, you should have no real issue here with <RANdOm /> case.

If the real issue is that two different systems are generating two different types of the tag (<Widget /> vs. <widget />), I guess you could simply define both cases in your XSD.

#6


0  

After uploading, walk the XML file (via DOM or SAX) and fix the casing before you validate?
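
One possible shape for the SAX variant, sketched with Python's standard xml.sax module: a filter sits between the parser and a writer and lower-cases element and attribute names as the events stream through. The class LowercaseNames and the function normalize_case are hypothetical names, and the sketch assumes the target schema is all lower case and the upload is well-formed.

```python
import io
import xml.sax
from xml.sax.saxutils import XMLFilterBase, XMLGenerator
from xml.sax.xmlreader import AttributesImpl


class LowercaseNames(XMLFilterBase):
    """SAX filter that lower-cases element and attribute names."""

    def startElement(self, name, attrs):
        # Attributes that differ only in case would collapse into one here.
        lowered = {key.lower(): value for key, value in attrs.items()}
        super().startElement(name.lower(), AttributesImpl(lowered))

    def endElement(self, name):
        super().endElement(name.lower())


def normalize_case(xml_text):
    """Return xml_text re-serialized with lower-case names (hypothetical helper)."""
    out = io.StringIO()
    reader = LowercaseNames(xml.sax.make_parser())
    reader.setContentHandler(XMLGenerator(out, encoding="utf-8"))
    reader.parse(io.StringIO(xml_text))
    return out.getvalue()


if __name__ == "__main__":
    print(normalize_case('<Order ID="7"><Item>Text</Item></Order>'))
```

The output starts with an XML declaration written by XMLGenerator, followed by the renamed document, which can then go to the validator.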

#7


0  

The simplest solution is to convert all tags/attributes to lowercase when you load the XML from the user, and only then check it against an XSD designed for all-lowercase tags/attributes.
