如何在JavaScript中找到数组中的连续子数组?

时间:2022-12-13 21:46:39

I wanted to write a function to find a contiguous subarray within a given array from a given starting index and return the index of the subarray within the array if it's found, and -1 if it's not found. This is similar to String.indexOf, but for arrays and subarrays instead of strings and substrings.

我想编写一个函数,从给定的起始索引中查找给定数组中的连续子数组,如果找到了,则返回数组中的子数组的索引,如果没有找到,则返回-1。这类似于字符串。索引,但用于数组和子数组,而不是字符串和子字符串。

This is my working code:

这是我的工作代码:

var find_csa = function (arr, subarr, from_index) {
    if (typeof from_index === 'undefined') {
        from_index = 0;
    }

    var i, found, j;
    for (i = from_index; i < 1 + (arr.length - subarr.length); ++i) {
        found = true;
        for (j = 0; j < subarr.length; ++j) {
            if (arr[i + j] !== subarr[j]) {
                found = false;
                break;
            }
        }
        if (found) return i;
    }
    return -1;
};

And these are my tests and their expected values:

这些是我的测试和它们的期望值

console.log(find_csa([1, 2, 3, 4, 5], [2, 3, 4]) === 1);
console.log(find_csa([1, 2, 3, 4, 5], [5]) === 4);
console.log(find_csa([1, 2, 3, 4, 5], [1, 3]) === -1);
console.log(find_csa([1, 2, 3, 4, 5], [42]) === -1);
console.log(find_csa([1, 2, 3, 4, 5], []) === 0);
console.log(find_csa([3, 4, 3, 4, 3, 4], [3, 4, 3], 1) === 2);
console.log(find_csa([6, 6, 6, 7], [6, 6, 7]) === 1);
console.log(find_csa([12, 9, 16, 42, 7, 866, 3], [16, 42, 7, 866]) === 2);

My code passes the tests, but as you can see, it uses a boolean value found in the inner loop which is just my messy, ad-hoc way of continuing an outer loop from a nested loop. is there a cleaner way of writing it? I looked into Array.prototype.findIndex but it is an experimental technology at the moment so I can't use it. I want a method that works in most browsers. I know there is a "polyfill" code snippet written on the Mozilla page, but that is even longer than my current code and it will be slower due to the function calls, so I'd rather avoid it.

我的代码通过了测试,但是正如您所看到的,它使用了在内部循环中找到的布尔值,这是我从嵌套循环中继续外部循环的一种杂乱的、特别的方式。有没有更简洁的写法?我看着Array.prototype。findIndex,但它是一种实验技术,我无法使用它。我想要一个在大多数浏览器中都能工作的方法。我知道在Mozilla页面上有一个“polyfill”代码片段,但它甚至比我现在的代码还要长,而且由于函数调用,它会变得更慢,所以我宁愿避免它。

My primary goal for this function is performance (the subarrays will be very small, so I believe that using Boyer-Moore string search algorithm or tries is a tad overkill), and then my secondary goal is elegance of my implementation. With those two goals in mind, I would like to know if there is a better way of writing this code, or if there are any JavaScript features or functions that I'm missing that could help me avoid the found boolean.

这个函数的主要目标是性能(子数组将非常小,所以我认为使用Boyer-Moore字符串搜索算法或try是有点过分了),然后我的次要目标是实现的优雅性。考虑到这两个目标,我想知道是否有更好的方法来编写这段代码,或者是否有一些JavaScript特性或函数可以帮助我避免找到布尔值。

JSFiddle if it helps anyone: http://jsfiddle.net/qc4zxq2p/

JSFiddle是否有帮助:http://jsfiddle.net/qc4zxq2p/

5 个解决方案

#1


11  

Are there any JavaScript features or functions that I'm missing that could help me avoid the found boolean

有没有什么JavaScript特性或函数可以帮助我避免找到布尔值

Yes, you can use a label on your outer loop:

是的,你可以在你的外环上使用标签:

function find_csa(arr, subarr, from_index) {
    var i = from_index >>> 0,
        sl = subarr.length,
        l = arr.length + 1 - sl;

    loop: for (; i<l; i++) {
        for (var j=0; j<sl; j++)
            if (arr[i+j] !== subarr[j])
                continue loop;
        return i;
    }
    return -1;
}

#2


7  

This is the same as yours, just prettified a bit (at least to my aesthetics):

这和你的一样,只是稍微修饰了一下(至少在我的审美观上):

var find_csa = function (arr, subarr, from_index) {
    from_index = from_index || 0;

    var i, found, j;
    var last_check_index = arr.length - subarr.length;
    var subarr_length = subarr.length;

    position_loop:
    for (i = from_index; i <= last_check_index; ++i) {
        for (j = 0; j < subarr_length; ++j) {
            if (arr[i + j] !== subarr[j]) {
                continue position_loop;
            }
        }
        return i;
    }
    return -1;
};

#3


4  

The inner loop can be reduced to a single line using the array method every:

使用数组方法可以将内循环简化为一行:

if(subarr.every(function(e, j) { return (e === arr[i + j]); })
    return i;

or (ES6 proposal):

或(ES6建议):

if(subarr.every( (e, j) => (e === arr[i + j]) ))
    return i;

But this may be just a curiosity or educational example, unless you don't care about performance.

但这可能只是一个好奇或教育的例子,除非你不关心性能。

#4


1  

Inside your loop, you can eliminate the found variable and avoid continue like this:

在循环内部,您可以消除找到的变量,并避免继续如下:

for (j = 0; j < subarr.length; ++j) {
    if (arr[i + j] !== subarr[j]) break;
}
/*
 * the above loop breaks in two cases:
 * normally: j === subarr.length
 * prematurely: array items did not match
 * we are interested in kowing if loop terminated normally
 */
if (j === subarr.length) return i;

Having said that, here is my solution using Array.join and String.indexOf. This is only good for array of Numbers:

话虽如此,这是我使用数组的解决方案。加入和String.indexOf。这只适用于数字数组:

function find_csa(arr, subarr, from_index) {
    from_index |= 0;
    if (subarr.length === 0) {
        return from_index;
    }
    var haystack = "," + arr.slice(from_index).join(",") + ",",
        needle = "," + subarr.join(",") + ",",
        pos = haystack.indexOf(needle);
    if (pos > 0) {
        pos = haystack.substring(1, pos).split(",").length + from_index;
    }
    return pos;
}
console.log("All tests should return true");
console.log(find_csa([1, 2, 3, 4, 5], [1, 2, 3]) === 0);
console.log(find_csa([1, 2, 3, 4, 5], [2, 3, 4]) === 1);
console.log(find_csa([1, 2, 3, 4, 5], [5]) === 4);
console.log(find_csa([1, 2, 3, 4, 5], [6]) === -1);
console.log(find_csa([1, 2, 3, 4, 5], [1, 3]) === -1);
console.log(find_csa([6, 6, 6, 7], [6, 6, 7]) === 1);
console.log(find_csa([1, 2, 3, 4, 5], []) === 0);
console.log(find_csa([3, 4, 3, 4, 3, 4], [3, 4, 3], 1) === 2);
console.log(find_csa([1, 2, 3, 4, 5], [], 1) === 1);

#5


1  

Reading initial discussion based on zerkms proposition, I was interested to try a solution using JSON.stringify, despite the unfavorable opinions.

阅读基于zerkms的初始讨论,我有兴趣使用JSON尝试一个解决方案。尽管有一些不好的意见,但还是有些苛刻。

Then I finally got a solution, which passes all tests properly.
Probably not the faster method, but surely the shortest one:

然后我终于得到了一个解决方案,它通过了所有的测试。也许不是更快的方法,但肯定是最短的方法:

var find_csa = function (arr, subarr, from_index) {
  var start=from_index|0,
      needle=JSON.stringify(subarr),
      matches=JSON.stringify(arr.slice(start)).
      match(new RegExp('^\\[(.*?),?'+
        needle.substr(1,needle.length-2).replace(/([\[\]])/g,'\\$1')
      ));
  return !!matches?(matches[1].length?matches[1].split(',').length:0)+start:-1;
}

The above code accepts arrays of arrays, as suggested by Shashank, but fails to process items containing commas.

上面的代码接受数组,如Shashank所建议的,但是不能处理包含逗号的项。

So I developped another solution which also accepts commas (thanks to Steven Levithan for the elegant tip about while(str!=(str=str.replace(regexp,replacement)));).

因此我开发了另一个方案,它也接受逗号(感谢Steven Levithan提供了关于while的优雅提示(str!= str.replace(regexp,replace)););

But it is only for fun, since:

但这只是为了好玩,因为:

  • the code is not so short, now... Sigh!
  • 代码不是那么短,现在……叹息!
  • it probably consumes a lot of CPU time
  • 它可能会消耗大量的CPU时间
  • it doesn't properly work with empty items (they are ignored)
  • 它不能正确地处理空项目(它们被忽略)
  • I suspect (and didn't dig deeper :-) it might fail with complex objects as items
  • 我怀疑(而且没有深入挖掘:-)它可能会以复杂对象作为项目而失败

Anyway, here is it:

无论如何,这里是:

var find_csa = function (arr, subarr, from_index) {
  var start=from_index|0,
      commas=new RegExp('(?:(\')([^,\']+),([^\']+)\'|(")([^,"]+),([^"]+))"'),
      strip_commas='$1$2$3$1$4$5$6$4',
      haystack=JSON.stringify(arr.slice(start)),
      needle=JSON.stringify(subarr).replace(/^\[(.*)\]$/,'$1');
  while(haystack!=(haystack=haystack.replace(commas,strip_commas)));
  while(needle!=(needle=needle.replace(commas,strip_commas)));
  matches=haystack.match(new RegExp('^\\[(.*?),?'+needle.replace(/([\[\]])/g,'\\$1')));
  return !!matches?(matches[1].length?matches[1].split(',').length:0)+start:-1;
}

#1


11  

Are there any JavaScript features or functions that I'm missing that could help me avoid the found boolean

有没有什么JavaScript特性或函数可以帮助我避免找到布尔值

Yes, you can use a label on your outer loop:

是的,你可以在你的外环上使用标签:

function find_csa(arr, subarr, from_index) {
    var i = from_index >>> 0,
        sl = subarr.length,
        l = arr.length + 1 - sl;

    loop: for (; i<l; i++) {
        for (var j=0; j<sl; j++)
            if (arr[i+j] !== subarr[j])
                continue loop;
        return i;
    }
    return -1;
}

#2


7  

This is the same as yours, just prettified a bit (at least to my aesthetics):

这和你的一样,只是稍微修饰了一下(至少在我的审美观上):

var find_csa = function (arr, subarr, from_index) {
    from_index = from_index || 0;

    var i, found, j;
    var last_check_index = arr.length - subarr.length;
    var subarr_length = subarr.length;

    position_loop:
    for (i = from_index; i <= last_check_index; ++i) {
        for (j = 0; j < subarr_length; ++j) {
            if (arr[i + j] !== subarr[j]) {
                continue position_loop;
            }
        }
        return i;
    }
    return -1;
};

#3


4  

The inner loop can be reduced to a single line using the array method every:

使用数组方法可以将内循环简化为一行:

if(subarr.every(function(e, j) { return (e === arr[i + j]); })
    return i;

or (ES6 proposal):

或(ES6建议):

if(subarr.every( (e, j) => (e === arr[i + j]) ))
    return i;

But this may be just a curiosity or educational example, unless you don't care about performance.

但这可能只是一个好奇或教育的例子,除非你不关心性能。

#4


1  

Inside your loop, you can eliminate the found variable and avoid continue like this:

在循环内部,您可以消除找到的变量,并避免继续如下:

for (j = 0; j < subarr.length; ++j) {
    if (arr[i + j] !== subarr[j]) break;
}
/*
 * the above loop breaks in two cases:
 * normally: j === subarr.length
 * prematurely: array items did not match
 * we are interested in kowing if loop terminated normally
 */
if (j === subarr.length) return i;

Having said that, here is my solution using Array.join and String.indexOf. This is only good for array of Numbers:

话虽如此,这是我使用数组的解决方案。加入和String.indexOf。这只适用于数字数组:

function find_csa(arr, subarr, from_index) {
    from_index |= 0;
    if (subarr.length === 0) {
        return from_index;
    }
    var haystack = "," + arr.slice(from_index).join(",") + ",",
        needle = "," + subarr.join(",") + ",",
        pos = haystack.indexOf(needle);
    if (pos > 0) {
        pos = haystack.substring(1, pos).split(",").length + from_index;
    }
    return pos;
}
console.log("All tests should return true");
console.log(find_csa([1, 2, 3, 4, 5], [1, 2, 3]) === 0);
console.log(find_csa([1, 2, 3, 4, 5], [2, 3, 4]) === 1);
console.log(find_csa([1, 2, 3, 4, 5], [5]) === 4);
console.log(find_csa([1, 2, 3, 4, 5], [6]) === -1);
console.log(find_csa([1, 2, 3, 4, 5], [1, 3]) === -1);
console.log(find_csa([6, 6, 6, 7], [6, 6, 7]) === 1);
console.log(find_csa([1, 2, 3, 4, 5], []) === 0);
console.log(find_csa([3, 4, 3, 4, 3, 4], [3, 4, 3], 1) === 2);
console.log(find_csa([1, 2, 3, 4, 5], [], 1) === 1);

#5


1  

Reading initial discussion based on zerkms proposition, I was interested to try a solution using JSON.stringify, despite the unfavorable opinions.

阅读基于zerkms的初始讨论,我有兴趣使用JSON尝试一个解决方案。尽管有一些不好的意见,但还是有些苛刻。

Then I finally got a solution, which passes all tests properly.
Probably not the faster method, but surely the shortest one:

然后我终于得到了一个解决方案,它通过了所有的测试。也许不是更快的方法,但肯定是最短的方法:

var find_csa = function (arr, subarr, from_index) {
  var start=from_index|0,
      needle=JSON.stringify(subarr),
      matches=JSON.stringify(arr.slice(start)).
      match(new RegExp('^\\[(.*?),?'+
        needle.substr(1,needle.length-2).replace(/([\[\]])/g,'\\$1')
      ));
  return !!matches?(matches[1].length?matches[1].split(',').length:0)+start:-1;
}

The above code accepts arrays of arrays, as suggested by Shashank, but fails to process items containing commas.

上面的代码接受数组,如Shashank所建议的,但是不能处理包含逗号的项。

So I developped another solution which also accepts commas (thanks to Steven Levithan for the elegant tip about while(str!=(str=str.replace(regexp,replacement)));).

因此我开发了另一个方案,它也接受逗号(感谢Steven Levithan提供了关于while的优雅提示(str!= str.replace(regexp,replace)););

But it is only for fun, since:

但这只是为了好玩,因为:

  • the code is not so short, now... Sigh!
  • 代码不是那么短,现在……叹息!
  • it probably consumes a lot of CPU time
  • 它可能会消耗大量的CPU时间
  • it doesn't properly work with empty items (they are ignored)
  • 它不能正确地处理空项目(它们被忽略)
  • I suspect (and didn't dig deeper :-) it might fail with complex objects as items
  • 我怀疑(而且没有深入挖掘:-)它可能会以复杂对象作为项目而失败

Anyway, here is it:

无论如何,这里是:

var find_csa = function (arr, subarr, from_index) {
  var start=from_index|0,
      commas=new RegExp('(?:(\')([^,\']+),([^\']+)\'|(")([^,"]+),([^"]+))"'),
      strip_commas='$1$2$3$1$4$5$6$4',
      haystack=JSON.stringify(arr.slice(start)),
      needle=JSON.stringify(subarr).replace(/^\[(.*)\]$/,'$1');
  while(haystack!=(haystack=haystack.replace(commas,strip_commas)));
  while(needle!=(needle=needle.replace(commas,strip_commas)));
  matches=haystack.match(new RegExp('^\\[(.*?),?'+needle.replace(/([\[\]])/g,'\\$1')));
  return !!matches?(matches[1].length?matches[1].split(',').length:0)+start:-1;
}