XML Schema creation takes a very long time

Time: 2022-11-29 18:56:43

I have the following code:

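// Imports assumed from the identifiers used (Guava and JAXP).
// Resource, LOGGER and streamSourcesOf come from the surrounding class.
import static javax.xml.XMLConstants.W3C_XML_SCHEMA_NS_URI;
import com.google.common.base.Preconditions;
import com.google.common.collect.ImmutableList;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;

public XsdValidator(Resource... xsds) {
    Preconditions.checkArgument(xsds != null);
    try {
        this.xsds = ImmutableList.copyOf(xsds);
        SchemaFactory schemaFactory = SchemaFactory.newInstance(W3C_XML_SCHEMA_NS_URI);
        LOGGER.debug("Schema factory created: {}", schemaFactory);
        StreamSource[] streamSources = streamSourcesOf(xsds);
        LOGGER.debug("StreamSource[] created: {}", streamSources);
        // This call compiles the XSDs; it is the slow step described below.
        Schema schema = schemaFactory.newSchema(streamSources);
        LOGGER.debug("Schema created: {}", schema);
        validator = schema.newValidator();
        LOGGER.debug("Validator created: {}", validator);
    } catch (Exception e) {
        throw new IllegalArgumentException("Can't build XsdValidator", e);
    }
}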

It seems the line schemaFactory.newSchema(streamSources); takes a very long time (30 seconds) to execute against my XSD file.

After many tests on this XSD, it seems it's because I have:

  <xs:complexType name="entriesType">
    <xs:sequence>
      <xs:element type="prov:entryType" name="entry" minOccurs="0" maxOccurs="10000" />
    </xs:sequence>
  </xs:complexType>

The problem is maxOccurs="10000".

With maxOccurs="1" or maxOccurs="unbounded", it is very fast.
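
To reproduce this outside the application, schema compilation can be timed in isolation. The following is only a sketch: the class and method names (MaxOccursTiming, schemaWith, compile, compileMillis) are made up for illustration, and the small entryType defined inline is a stand-in for the real prov:entryType, which is not shown in the question. Timings will depend on the JDK and XML parser in use.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

public class MaxOccursTiming {

    // Builds a minimal XSD whose "entry" element uses the given maxOccurs value.
    // entryType here is a hypothetical placeholder for the real prov:entryType.
    static String schemaWith(String maxOccurs) {
        return "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
             + "<xs:complexType name=\"entryType\"><xs:sequence>"
             + "<xs:element name=\"value\" type=\"xs:string\"/>"
             + "</xs:sequence></xs:complexType>"
             + "<xs:complexType name=\"entriesType\"><xs:sequence>"
             + "<xs:element name=\"entry\" type=\"entryType\" minOccurs=\"0\" maxOccurs=\""
             + maxOccurs + "\"/>"
             + "</xs:sequence></xs:complexType>"
             + "<xs:element name=\"entries\" type=\"entriesType\"/>"
             + "</xs:schema>";
    }

    // Compiles the schema, which is the step reported as slow in the question.
    static Schema compile(String maxOccurs) {
        try {
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            return factory.newSchema(new StreamSource(new StringReader(schemaWith(maxOccurs))));
        } catch (SAXException e) {
            throw new IllegalStateException("Schema did not compile for maxOccurs=" + maxOccurs, e);
        }
    }

    // Returns the wall-clock time of one compilation, in milliseconds.
    static long compileMillis(String maxOccurs) {
        long start = System.nanoTime();
        compile(maxOccurs);
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        // Pass other values (for example 10000) as command-line arguments to compare.
        String[] values = args.length > 0 ? args : new String[] {"1", "unbounded"};
        for (String value : values) {
            System.out.println("maxOccurs=" + value + ": " + compileMillis(value) + " ms");
        }
    }
}
```

Running it with arguments such as 1, 100, 1000, 10000 and unbounded should show whether compile time grows with the bound on a given setup.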

Can someone tell me what the problem is with using maxOccurs="10000"?

1 solution

#1


4  

In my experience, particles bounded by what some may consider "unreasonably" high values are a known cause of performance problems (this link is from my browser's favourites).

The underlying cause seems to be memory allocation (in proportion to the maxOccurs value).

Also, I recall a documentation item that stated a threshold value beyond which, for all intents and purposes, the parser would treat maxOccurs as unbounded, regardless of what the XSD says (I'll revisit this post if I find it).
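
Since maxOccurs="unbounded" was reported as fast, one possible workaround (not something confirmed by any parser documentation here) is to declare the element as unbounded in the XSD and enforce the cap of 10000 in application code. The sketch below assumes the entry elements are direct children of the document's root element; EntryLimitCheck, countChildElements and withinLimit are hypothetical names.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class EntryLimitCheck {

    // The limit that the XSD originally expressed as maxOccurs="10000".
    static final int MAX_ENTRIES = 10000;

    // Counts the direct child elements of the root whose local name matches.
    static int countChildElements(String xml, String localName) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            int count = 0;
            for (Node n = doc.getDocumentElement().getFirstChild(); n != null; n = n.getNextSibling()) {
                if (n.getNodeType() == Node.ELEMENT_NODE && localName.equals(n.getLocalName())) {
                    count++;
                }
            }
            return count;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // True when the document respects the cap that the schema no longer enforces.
    static boolean withinLimit(String xml) {
        return countChildElements(xml, "entry") <= MAX_ENTRIES;
    }
}
```

The trade-off is that the limit no longer lives in the schema, so any other consumer of the XSD has to apply the same check.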
