I have 2 arrays. One with bad keywords and the other with names of sites.
$bad_keywords = array('google',
'twitter',
'facebook');
$sites = array('youtube.com', 'google.com', 'm.google.co.uk', 'walmart.com', 'thezoo.com', 'etc.com');
Simple task: I need to go through the $sites array and remove every value that contains any keyword from the $bad_keywords array. At the end I need an array of clean values in which none of the bad keywords occur.
I have scoured the web and can't seem to find a simple solution for this. Here are the methods I have tried:
1. two nested foreach loops (feels slow; I suspect built-in PHP functions would be faster)
2. array_walk
3. array_filter
But I haven't managed to nail down the most efficient way. I want a tool that filters a list of 20k+ sites against a list of up to 1k keywords, so performance is paramount. Also, which is the better method for the actual search in this case: regex or strpos?
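For reference, the array_filter plus strpos approach from the list above might look roughly like this. It is only a minimal sketch: the helper name filter_sites_strpos is made up for illustration, and it assumes case-sensitive substring matching.

```php
<?php
// Hypothetical helper (name chosen for illustration only).
// Keeps every site that contains none of the bad keywords.
function filter_sites_strpos(array $sites, array $bad_keywords): array
{
    return array_filter($sites, function (string $site) use ($bad_keywords): bool {
        foreach ($bad_keywords as $keyword) {
            // strpos returns false when the keyword is absent;
            // 0 is a valid match position, so compare strictly.
            if (strpos($site, $keyword) !== false) {
                return false; // drop the site on the first hit
            }
        }
        return true; // no keyword matched, keep the site
    });
}

$bad_keywords = array('google', 'twitter', 'facebook');
$sites = array('youtube.com', 'google.com', 'm.google.co.uk', 'walmart.com', 'thezoo.com', 'etc.com');

// array_filter preserves the original keys of the kept entries.
$clean = filter_sites_strpos($sites, $bad_keywords);
print_r($clean);
```

The inner loop stops at the first matching keyword, so the worst case (a clean site) still costs one strpos call per keyword.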
What other options are there to do this and what would be the best way?
1 solution
Short solution using preg_grep function:
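// Quote each keyword so it matches literally; PREG_GREP_INVERT keeps the non-matching entries.
$quoted = array_map(function ($k) { return preg_quote($k, '/'); }, $bad_keywords);
$result = preg_grep('/' . implode('|', $quoted) . '/', $sites, PREG_GREP_INVERT);
print_r($result);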
The output:
Array
(
[0] => youtube.com
[3] => walmart.com
[4] => thezoo.com
[5] => etc.com
)
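With a keyword list of around 1k entries, a single alternation can become a very long pattern. One possible variation, sketched below, builds the pattern in chunks and filters the remaining sites with each chunk in turn. The helper name filter_sites_regex, the default chunk size of 200, and the case-insensitive i flag are all assumptions for illustration and are not part of the question; whether chunking helps in practice would need benchmarking on real data.

```php
<?php
// Hypothetical helper (name and default chunk size are illustrative assumptions).
function filter_sites_regex(array $sites, array $bad_keywords, int $chunk_size = 200): array
{
    foreach (array_chunk($bad_keywords, $chunk_size) as $chunk) {
        // Escape regex metacharacters and the '/' delimiter in each keyword.
        $quoted = array_map(function (string $keyword): string {
            return preg_quote($keyword, '/');
        }, $chunk);

        // Assumption: case-insensitive matching is wanted, hence the 'i' flag.
        $pattern = '/' . implode('|', $quoted) . '/i';

        // PREG_GREP_INVERT keeps only the entries that do NOT match.
        $remaining = preg_grep($pattern, $sites, PREG_GREP_INVERT);
        if ($remaining === false) {
            throw new RuntimeException('preg_grep failed, error code ' . preg_last_error());
        }
        $sites = $remaining;
    }
    return $sites;
}

$bad_keywords = array('google', 'twitter', 'facebook');
$sites = array('youtube.com', 'google.com', 'm.google.co.uk', 'walmart.com', 'thezoo.com', 'etc.com');

// A chunk size of 2 is used here only to exercise the chunking path.
$clean = filter_sites_regex($sites, $bad_keywords, 2);
print_r($clean);
```

Because each pass only sees the sites that survived the previous chunk, later chunks have less work to do. The original keys are preserved; wrap the result in array_values() if a re-indexed list is needed.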