I need to use Python to extract the 3 words before and the 3 words after each word in a specific list
Nokia Lumia 930 Smartphone, Display 5 pollici, Fotocamera 20 MP, 2GB RAM, Processore Quad-Core 2,2GHz, Memoria 32GB, Windows Phone 8.1, Bianco [Germania]
At the moment I'm using this regex, without success:
((?:[\S,]+\s+){0,3})ram\s+((?:[\S,]+\s*){0,3})
https://regex101.com/r/yN6iI0/1
The list of words I need is:
- Display
- Fotocamera
- RAM
- Processore
- Memoria
2 Solutions
#1
1
Your regex did not work because \s+ requires at least one whitespace character, but there is none between RAM and the comma that follows it. Either use a * quantifier or just remove it and use ``
(?i)((?:\S+\s+){0,3})\bRAM\b\s*((?:\S+\s+){0,3})
See demo
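To make the difference concrete, here is a minimal Python sketch that runs both patterns against the sample string from the question. The variable names are only illustrative, and re.I is applied to the original pattern on the assumption that a case-insensitive match was intended.

```python
import re

# Sample text from the question.
text = ("Nokia Lumia 930 Smartphone, Display 5 pollici, Fotocamera 20 MP, "
        "2GB RAM, Processore Quad-Core 2,2GHz, Memoria 32GB, "
        "Windows Phone 8.1, Bianco [Germania]")

# Original pattern: \s+ right after "ram" cannot match, because "RAM" is
# followed directly by a comma in the text.
original = re.compile(r"((?:[\S,]+\s+){0,3})ram\s+((?:[\S,]+\s*){0,3})", re.I)

# Fixed pattern from this answer: \s* tolerates zero whitespace after RAM.
fixed = re.compile(r"(?i)((?:\S+\s+){0,3})\bRAM\b\s*((?:\S+\s+){0,3})")

print(original.search(text))  # None

m = fixed.search(text)
before = m.group(1).split()
after = m.group(2).split()
print(before)  # ['20', 'MP,', '2GB']
print(after)   # [',', 'Processore', 'Quad-Core']
```

Note that with this pattern the comma directly after RAM is counted as the first of the three "words" that follow it, since \S+ matches any run of non-whitespace characters.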
I added \b (word boundary) to make sure we match RAM, not RAMBUS.
Mind the re.I modifier (or use the inline version, (?i), at the beginning of the pattern).
Other patterns can be formed in a similar way: just replace RAM with the other words from your list.
#2
1
((?:[\S,]+\s+){0,3})ram,?\s+((?:[\S,]+\s*){0,3})
                       ^^
Just add an optional comma (,?) after ram. See demo.
https://regex101.com/r/yN6iI0/4
In the end, you can use this:
((?:[\S,]+\s+){0,3})(?:ram|Display|Fotocamera|RAM|Processore|Memoria),?\s+((?:[\S,]+\s*){0,3})
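As a rough sketch of how the combined pattern behaves in Python: the version below makes two adjustments that are not part of the answer, namely a capturing group around the alternation (so the matched keyword can be reported) and the re.I flag. The variable names are illustrative.

```python
import re

# Sample text from the question.
text = ("Nokia Lumia 930 Smartphone, Display 5 pollici, Fotocamera 20 MP, "
        "2GB RAM, Processore Quad-Core 2,2GHz, Memoria 32GB, "
        "Windows Phone 8.1, Bianco [Germania]")

# Combined pattern; the alternation is wrapped in a capturing group here
# (an adjustment for this sketch) so group(2) holds the keyword.
combined = re.compile(
    r"((?:[\S,]+\s+){0,3})"
    r"(ram|Display|Fotocamera|RAM|Processore|Memoria),?\s+"
    r"((?:[\S,]+\s*){0,3})",
    re.I,
)

found = []
for m in combined.finditer(text):
    found.append((m.group(1).split(), m.group(2), m.group(3).split()))

for before, keyword, after in found:
    print(keyword, before, after)
# Display ['Lumia', '930', 'Smartphone,'] ['5', 'pollici,', 'Fotocamera']
# RAM ['20', 'MP,', '2GB'] ['Processore', 'Quad-Core', '2,2GHz,']
# Memoria [] ['32GB,', 'Windows', 'Phone']
```

Because finditer returns non-overlapping matches, a keyword that is consumed as part of another match's context (Fotocamera and Processore in this sample) is not reported on its own, and Memoria gets an empty "before" list. If every keyword needs its own context, searching once per keyword avoids this.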