How to check if a string contains an element from a list in Python?

Time: 2020-11-26 01:37:46

I have something like this:

extensionsToCheck = ['.pdf', '.doc', '.xls']

for extension in extensionsToCheck:
    if extension in url_string:
        print(url_string)

I am wondering what would be a more elegant way to do this in Python (without using the for loop). I was thinking of something like this (as in C/C++), but it didn't work:

if ('.pdf' or '.doc' or '.xls') in url_string:
    print(url_string)

6 solutions

#1


222  

Use a generator together with any, which short-circuits on the first True:

if any(ext in url_string for ext in extensionsToCheck):
    print(url_string)

EDIT: I see this answer has been accepted by the OP. Although it may be a "good enough" solution to this particular problem, and it is a good general way to check whether any string in a list occurs in another string, keep in mind that this is all it does. It does not care where in the string the match is found. If that matters, as it often does with URLs, look at the answer by @Wladimir Palant, or you risk getting false positives.
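A minimal sketch of the points above, not part of the original answer. The helper name contains_any and the example.com URLs are made up for illustration. It shows that the `or` expression from the question collapses to its first operand, and that any() reports a match wherever the substring appears in the string.

```python
extensions_to_check = ['.pdf', '.doc', '.xls']


def contains_any(text, needles):
    """Return True if at least one needle occurs anywhere in text (hypothetical helper)."""
    # any() stops iterating as soon as one membership test is True.
    return any(needle in text for needle in needles)


# `or` returns its first truthy operand, so the attempt in the question
# only ever tests for '.pdf'.
first_only = ('.pdf' or '.doc' or '.xls')

plain_hit = contains_any('http://example.com/report.xls', extensions_to_check)
# '.doc' appears in the middle of the path, so this still counts as a match.
false_positive = contains_any('http://example.com/foo.doc/file.exe', extensions_to_check)
no_hit = contains_any('http://example.com/image.jpg', extensions_to_check)
```

If the position of the match matters, the substring test alone is not enough, which is the false-positive risk mentioned in the edit.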

#2


26  

extensionsToCheck = ('.pdf', '.doc', '.xls')

'test.doc'.endswith(extensionsToCheck)   # returns True

'test.jpg'.endswith(extensionsToCheck)   # returns False
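A small sketch of how this could be wrapped, assuming the extensions may arrive as a list. str.endswith accepts a single string or a tuple of strings, so a list has to be converted first. The helper name has_extension and the lower-casing step are assumptions added for illustration, not part of the answer.

```python
extensions_to_check = ['.pdf', '.doc', '.xls']


def has_extension(name, extensions):
    """Return True if name ends with one of the extensions (hypothetical helper)."""
    # str.endswith takes a str or a tuple of str; passing a list raises TypeError.
    # Lower-casing is an added assumption so that 'TEST.PDF' also matches.
    return name.lower().endswith(tuple(extensions))


doc_result = has_extension('test.doc', extensions_to_check)
upper_result = has_extension('TEST.PDF', extensions_to_check)
jpg_result = has_extension('test.jpg', extensions_to_check)
# A query string after the extension defeats a plain endswith check.
query_result = has_extension('test.doc?foo', extensions_to_check)
```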

#3


12  

It is better to parse the URL properly - this way you can handle http://.../file.doc?foo and http://.../foo.doc/file.exe correctly.

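from urllib.parse import urlparse  # Python 3; in Python 2: from urlparse import urlparse
import os
path = urlparse(url_string).path          # drops the query string and fragment
ext = os.path.splitext(path)[1]           # extension of the last path component
if ext in extensionsToCheck:
    print(url_string)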
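To see how this behaves on the two kinds of URL mentioned above, here is a short sketch. The host example.com stands in for the elided host in the answer, and the function name url_extension is made up for illustration.

```python
import os
from urllib.parse import urlparse


def url_extension(url):
    """Return the extension of the URL's path, ignoring query and fragment (hypothetical helper)."""
    path = urlparse(url).path
    return os.path.splitext(path)[1]


extensions_to_check = ['.pdf', '.doc', '.xls']

# The query string is not part of the path, so '.doc' is still found.
with_query = url_extension('http://example.com/file.doc?foo')
# Only the last path component counts, so this is '.exe', not '.doc'.
nested = url_extension('http://example.com/foo.doc/file.exe')

with_query_ok = with_query in extensions_to_check
nested_ok = nested in extensions_to_check
```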

#4


2  

Check if it matches this regex:

'(\.pdf$|\.doc$|\.xls$)'

Note: if your extensions are not at the end of the URL, remove the $ characters, but be aware that this weakens the check slightly, because the pattern can then match anywhere in the string.
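One possible way to apply this pattern, sketched with the standard re module. The names EXTENSION_AT_END and ends_with_known_extension are invented for illustration. re.search is used rather than re.match, since re.match only tries to match at the start of the string.

```python
import re

# The pattern from the answer, compiled once for reuse.
EXTENSION_AT_END = re.compile(r'(\.pdf$|\.doc$|\.xls$)')


def ends_with_known_extension(text):
    """Return True if text ends with .pdf, .doc or .xls (hypothetical helper)."""
    return EXTENSION_AT_END.search(text) is not None


pdf_result = ends_with_known_extension('http://example.com/report.pdf')
# With the $ anchors in place, a trailing query string prevents a match.
query_result = ends_with_known_extension('http://example.com/report.pdf?x=1')
jpg_result = ends_with_known_extension('http://example.com/image.jpg')
```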

#5


2  

Use a list comprehension if you want a single-line solution. The following code returns a list containing url_string once for each of the extensions .doc, .pdf and .xls that it contains, or an empty list when it contains none of them.

print([url_string for extension in extensionsToCheck if extension in url_string])

NOTE: This only checks whether the string contains one of the extensions; it is not useful when you want to extract the exact word that matches the extension.

#6


0  

This is a variant of the list comprehension answer given by @psun.

By switching the output value, you can actually extract the matching pattern from the list comprehension (something not possible with the any() approach by @Lauritz-v-Thaulow)

extensionsToCheck = ['.pdf', '.doc', '.xls']
url_string = 'http://.../foo.doc'

print([extension for extension in extensionsToCheck if extension in url_string])

['.doc']

You can furthermore insert a regular expression if you want to collect additional information once the matched pattern is known (this could be useful when the list of allowed patterns is too long to write into a single regex pattern)

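import re

# re.escape keeps the '.' in each extension literal instead of matching any character
print([re.search(r'(\w+)' + re.escape(extension), url_string).group(0) for extension in extensionsToCheck if extension in url_string])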

['foo.doc']
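The one-liner above calls .group(0) directly, which would fail if an extension occurs in the string without a word character in front of it. A slightly more defensive sketch of the same idea follows. The function name matched_names is hypothetical, and the second and third URLs are invented examples.

```python
import re


def matched_names(url, extensions):
    """Return the 'name.ext' fragments found in url, one per matching extension (hypothetical helper)."""
    names = []
    for ext in extensions:
        # re.escape treats the '.' in the extension as a literal dot.
        match = re.search(r'\w+' + re.escape(ext), url)
        if match:  # skip extensions that have no word characters before them
            names.append(match.group(0))
    return names


extensions_to_check = ['.pdf', '.doc', '.xls']

single = matched_names('http://.../foo.doc', extensions_to_check)
several = matched_names('http://example.com/a.pdf/b.xls', extensions_to_check)
bare = matched_names('http://example.com/.doc', extensions_to_check)
```

Results follow the order of the extension list, not the order in which the fragments appear in the URL.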
