WebWorker calculates slow regexp matches significantly slower (3x) - Firefox only

Date: 2022-04-25 19:32:41

First, I created a regular expression that matches all unique external library paths in a list of all header files in a project. I asked a question about writing that regexp a week ago.

I then started experimenting to see how it behaves when run asynchronously and when moved into a web worker. For convenience and reliability I created this universal file that runs in all three modes:

/** Will call result() callback with every match it finds. Asynchronous unless called 
 *  with interval = -1.
 *  Javadoc style comment for Arnold Rimmer and other Java programmers:
 *  
 * @param regex regular expression to match in string
 * @param string guess what
 * @param result callback function that accepts one parameter, string match
 * @param done callback on finish, has no parameters
 * @param interval delay (not actual interval) between finding matches. If -1, 
 *        function  will be blocking
 * @property working false if loop isn't running, otherwise contains timeout ID
 *           for use with clearTimeout
 * @property done copy of done parameter
 * @throws heavy boulders
**/
function processRegex(regex, string, result, done, interval) {
  var m;
  //Please tell me interpreter optimizes this
  interval = typeof interval!='number'?1:interval;
  //And this
  processRegex.done = done;
  while ((m = regex.exec(string))) {
    Array.prototype.splice.call(m,0,1);
    var path = m.join("");
    //It's good to keep in mind that result() slows down the process
    result(path);
    if (interval>=0) {
      processRegex.working = setTimeout(processRegex, 
                              interval, regex, string, 
                              result, done, interval);
      // Comment these out for maximum speed
      processRegex.progress = regex.lastIndex/string.length;
      console.log("Progress: "+Math.round(processRegex.progress*100)+"%");
      return;
    }
  }

  processRegex.working = false;
  processRegex.done = null;
  if (typeof done=="function")
    done();
}
processRegex.working = false; 
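
For reference, a minimal sketch of how the blocking ([LOOP]) mode could be timed on its own. The helper name `timeRegexLoop`, the sample regex, and the sample input are all made up for illustration; they are not the files from the demo.

```javascript
// Hypothetical helper (not from the original post): runs a global regex over
// `text` in a blocking loop and reports the matches plus elapsed milliseconds.
function timeRegexLoop(regex, text) {
  if (!regex.global) {
    throw new Error("regex needs the g flag, otherwise exec() never advances");
  }
  regex.lastIndex = 0; // always start from the beginning of the input
  var matches = [];
  var start = Date.now();
  var m;
  while ((m = regex.exec(text))) {
    // Same convention as processRegex: drop m[0], join the capture groups
    matches.push(m.slice(1).join(""));
  }
  return { matches: matches, elapsedMs: Date.now() - start };
}

// Made-up sample input standing in for the real header list
var sample = "lib/foo/a.h\nlib/bar/b.h\nlib/foo/c.h\n";
var run = timeRegexLoop(/^(lib\/)(\w+)\//gm, sample);
console.log("[LOOP]: " + run.matches.length + " matches, " + run.elapsedMs + " ms");
```

Timing only the loop, without `result()` or `console.log` inside it, keeps callback overhead out of the measurement.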

I created a test file; rather than pasting it here, I uploaded it to very reliable web hosting: Demo - Test data.

What I find very surprising is that there is such a significant difference between web worker and browser execution of RegExp. The results I got:

  • Mozilla Firefox
    • [WORKER]: Time elapsed:16.860s
    • [WORKER-SYNC]: Time elapsed:16.739s
    • [TIMEOUT]: Time elapsed:5.186s
    • [LOOP]: Time elapsed:5.028s

You can also see that with my particular regular expression, the difference between a synchronous and an asynchronous loop is insignificant. I tried to use a match list instead of a lookahead expression and the results changed a lot. Here are the changes to the old function:

function processRegexUnique(regex, string, result, done, interval) {
  var matchList = arguments[5]||[];
  ... same as before ...
  while ((m = regex.exec(string))) {
    ... same as before ...
    if (matchList.indexOf(path)==-1) {
      result(path);
      matchList.push(path);
    }
    if (interval>=0) {
      // Reschedule processRegexUnique (not processRegex) so matchList is passed on
      processRegex.working = setTimeout(processRegexUnique, interval, 
                               regex, string, result, 
                               done, interval, matchList);
      ... same as before ...
    }
  }
  ... same as before ...
}

And the results:

  • Mozilla Firefox
    • [WORKER]: Time elapsed:0.062s
    • [WORKER-SYNC]: Time elapsed:0.023s
    • [TIMEOUT]: Time elapsed:12.250s (note to self: it's getting weirder every minute)
    • [LOOP]: Time elapsed:0.006s
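
To make the two variants concrete, here is a rough sketch of the difference between letting the regex enforce uniqueness with a lookahead and filtering duplicates with a match list. Both regexes and the input are hypothetical stand-ins, not the expressions from the test file; the lookahead pattern is only one plausible way to write such a check.

```javascript
// Made-up input: one path per line, with repeated directories
var paths = "boost/a.hpp\nqt/b.h\nboost/c.hpp\nqt/d.h\nboost/e.hpp\n";

// Variant A (assumed shape): the regex rejects a directory that shows up
// again later, so only the last occurrence of each directory matches.
// The lookahead has to scan the rest of the input at every candidate line.
var uniqueByLookahead = /^(\w+)\/(?![\s\S]*^\1\/)/gm;

// Variant B: a plain regex; duplicates are filtered afterwards with a list,
// the same idea as matchList in processRegexUnique.
var plain = /^(\w+)\//gm;

// Collects capture group 1 of every match, optionally skipping duplicates
function collect(regex, text, dedupe) {
  var out = [], m;
  regex.lastIndex = 0;
  while ((m = regex.exec(text))) {
    if (!dedupe || out.indexOf(m[1]) === -1) {
      out.push(m[1]);
    }
  }
  return out;
}

console.log(collect(uniqueByLookahead, paths, false));
console.log(collect(plain, paths, true));
```

Both calls yield the same set of directories. The lookahead variant keeps the last occurrence of each one and the list variant keeps the first, so the order can differ.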

Can anyone explain such a difference in speed?

1 Answer

#1


After a series of tests, I confirmed that this is a Mozilla Firefox issue (it affects all Windows desktop versions I tried). With Google Chrome, Opera, or even Firefox Mobile, the regexp matches take about the same time, worker or not.

If you need this issue fixed, be sure to vote on the bug report on Bugzilla. I will try to add more information if anything changes.
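
For anyone who wants to repeat the comparison in another browser, a minimal sketch along these lines may help. The names `countMatches` and `runInWorker` are hypothetical and this is not the code from the demo page. It builds the worker from the benchmark function's own source so that the page and the worker run identical code.

```javascript
// Hypothetical benchmark body, shared by the page and the worker.
function countMatches(source, flags, text) {
  if (flags.indexOf("g") === -1) flags += "g"; // exec() needs g to advance
  var regex = new RegExp(source, flags), n = 0, m;
  var start = Date.now();
  while ((m = regex.exec(text))) {
    if (m[0] === "") regex.lastIndex++; // avoid looping on empty matches
    n++;
  }
  return { count: n, elapsedMs: Date.now() - start };
}

// Runs countMatches inside a worker created from a Blob URL.
// Browser only: relies on the Worker, Blob and URL globals.
function runInWorker(source, flags, text, onResult) {
  var code = countMatches.toString() +
    "\nonmessage = function (e) {" +
    "  postMessage(countMatches(e.data.source, e.data.flags, e.data.text));" +
    "};";
  var url = URL.createObjectURL(new Blob([code], { type: "text/javascript" }));
  var worker = new Worker(url);
  worker.onmessage = function (e) {
    onResult(e.data);
    worker.terminate();
    URL.revokeObjectURL(url);
  };
  worker.postMessage({ source: source, flags: flags, text: text });
}

// Made-up pattern and input; substitute the real regex and test data
var mainResult = countMatches("\\w+\\/", "g", "a/b/c/");
console.log("[LOOP]", mainResult);
if (typeof document !== "undefined" && typeof Worker !== "undefined") {
  runInWorker("\\w+\\/", "g", "a/b/c/", function (r) {
    console.log("[WORKER]", r);
  });
}
```

The regex is passed as source and flags and rebuilt inside the worker, so each side compiles its own `RegExp` before timing starts.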
