Which class should I use to make many replacements in a string?

Time: 2022-09-10 23:31:32

I have to make a lot of text replacements. Which class is best for doing this efficiently? Is it StringBuilder?

StringBuilder stringBuilder = new StringBuilder(startString);
stringBuilder.Replace(literala1, literala2);
stringBuilder.Replace(literalb1, literalb2);
stringBuilder.Replace(literalc1, literalc2);
...

Or is there a better class for this? By the way, the literals will mostly be constants.

4 solutions

#1


3  

This exact question was dealt with at length on Roberto Farah's blog: Comparing RegEx.Replace, String.Replace and StringBuilder.Replace – Which has better performance?

I'll summarize the findings here, which come as a shock to many .NET developers. It turns out that for relatively simple string replacement (in cases where case-insensitive matching is not needed), Regex.Replace() has the worst performance and String.Replace() has the best.

A link is also provided to an article on CodeProject that confirms these findings: StringBuilder vs String / Fast String Operations with .NET 2.0

In general, I would say the rules ought to be as follows:

  • Use String.Replace() when you only have to do a small number of replacements (say, around 5).
  • Use StringBuilder.Replace() when you have to do a larger number of replacements.
  • Reserve regular expressions (Regex.Replace) for the most complex scenarios, where it's worth paying a slight performance penalty for the elegance of a single expression that handles all of the necessary replacements.
  • Ignore all of the above guidelines and use whatever makes your code most readable or expressive. Prematurely optimizing something like this isn't worth the time it took me to write this answer.
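
To make the first two options concrete, here is a minimal sketch of the same replacements done with chained String.Replace() calls and with StringBuilder.Replace(). The class name, the method names, and the replacement pairs are made up for illustration, and this is not a benchmark; measure with your own data before choosing.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class ReplaceDemo
{
    // Hypothetical replacement pairs, standing in for the literals in the question.
    private static readonly KeyValuePair<string, string>[] Pairs =
    {
        new KeyValuePair<string, string>("cat", "dog"),
        new KeyValuePair<string, string>("red", "blue"),
    };

    // Chained String.Replace: strings are immutable, so each call that
    // changes something produces a new string.
    public static string WithString(string input)
    {
        string result = input;
        foreach (var pair in Pairs)
        {
            result = result.Replace(pair.Key, pair.Value);
        }
        return result;
    }

    // StringBuilder.Replace: the builder is modified in place and
    // converted back to a string once at the end.
    public static string WithStringBuilder(string input)
    {
        var builder = new StringBuilder(input);
        foreach (var pair in Pairs)
        {
            builder.Replace(pair.Key, pair.Value);
        }
        return builder.ToString();
    }

    public static void Main()
    {
        const string text = "the red cat sat";
        Console.WriteLine(WithString(text));
        Console.WriteLine(WithStringBuilder(text));
    }
}
```

Both methods apply the pairs in order, so a later pair can also match text produced by an earlier one.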

#2


0  

I would go with Regex.Replace, specifically this overload: http://msdn.microsoft.com/en-us/library/cft8645c.aspx

All your different inputs can be matched in the regular expression, and all your different replacement strings can go in your MatchEvaluator.
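
As a rough sketch of that idea, the following builds one alternation pattern from a lookup table and lets a MatchEvaluator delegate pick the replacement for each match. The dictionary contents and the names RegexReplaceDemo, Map, and ReplaceAll are assumptions made for the example, not something from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class RegexReplaceDemo
{
    // Hypothetical lookup table: text to find -> replacement.
    private static readonly Dictionary<string, string> Map = new Dictionary<string, string>
    {
        { "cat", "dog" },
        { "red", "blue" },
    };

    // One alternation pattern; each key is escaped so it is matched literally.
    private static readonly Regex Pattern =
        new Regex(string.Join("|", Map.Keys.Select(key => Regex.Escape(key))));

    public static string ReplaceAll(string input)
    {
        // The lambda is converted to a MatchEvaluator; it runs once per match
        // and returns the text to substitute for that match.
        return Pattern.Replace(input, match => Map[match.Value]);
    }

    public static void Main()
    {
        Console.WriteLine(ReplaceAll("the red cat sat"));
    }
}
```

Unlike a chain of Replace calls, this makes a single pass, so replaced text is never scanned again. If one key is a prefix of another, the alternation should list the longer key first, which a plain dictionary does not guarantee.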

#3


0  

StringBuilder is probably the best class for doing this, as it won't create extra copies of the underlying character buffer during replacements. If you are performance-sensitive, String may be a poor choice because every call to Replace creates a new copy of the string, and a Regex will probably be slower than StringBuilder's straightforward search-and-replace.

#4


0  

I found that using this code, which implements Aho-Corasick string matching, to find all the strings to match, and then going through your string only once with a StringBuilder doing the replacements, was a lot better than looping over a set of string replacements one at a time.
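
The linked code is not reproduced here. The sketch below only illustrates the "go through the string once and let a StringBuilder collect the output" part. It is deliberately NOT Aho-Corasick: it tries every pattern at every position, which is simpler but slower when there are many patterns. The class name, the pattern list, and the replacements are hypothetical.

```csharp
using System;
using System.Text;

public static class SinglePassDemo
{
    // Hypothetical, non-empty patterns and their replacements (same index).
    // If one pattern is a prefix of another, list the longer one first.
    private static readonly string[] Patterns = { "cat", "red" };
    private static readonly string[] Replacements = { "dog", "blue" };

    public static string ReplaceAll(string input)
    {
        var output = new StringBuilder(input.Length);
        int i = 0;
        while (i < input.Length)
        {
            bool matched = false;
            for (int p = 0; p < Patterns.Length; p++)
            {
                string pattern = Patterns[p];
                // Naive check: does this pattern occur at position i?
                // (An Aho-Corasick automaton would answer this for all patterns at once.)
                if (i + pattern.Length <= input.Length &&
                    string.CompareOrdinal(input, i, pattern, 0, pattern.Length) == 0)
                {
                    output.Append(Replacements[p]);
                    i += pattern.Length; // skip past the matched text
                    matched = true;
                    break;
                }
            }
            if (!matched)
            {
                output.Append(input[i]); // copy the unmatched character as-is
                i++;
            }
        }
        return output.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(ReplaceAll("the red cat sat"));
    }
}
```

Because the input is read once and the output is built separately, replacement text is never re-scanned, which differs from applying the replacements one after another.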
