Iteratively find/replace from a list of tuples in Python

Time: 2022-09-28 14:10:39

I have a list of tuples, each containing a find/replace value that I would like to apply to a string. What would be the most efficient way to do so? I will be applying this iteratively, so performance is my biggest concern.

More concretely, what would the innards of processThis() look like?

x = 'find1, find2, find3'
y = [('find1', 'replace1'), ('find2', 'replace2'), ('find3', 'replace3')]

def processThis(str,lst):
     # Do something here
     return something

>>> processThis(x,y)
'replace1, replace2, replace3'

Thanks, all!

5 solutions

#1


You could consider using re.sub:

import re
REPLACEMENTS = dict([('find1', 'replace1'),
                     ('find2', 'replace2'),
                     ('find3', 'replace3')])

def replacer(m):
    return REPLACEMENTS[m.group(0)]

x = 'find1, find2, find3'
# re.escape keeps regex metacharacters in the keys from being treated as pattern syntax
r = re.compile('|'.join(map(re.escape, REPLACEMENTS)))
print(r.sub(replacer, x))
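
Since the question says the replacements will be applied repeatedly, it may help to compile the pattern once and reuse it. The following is only a sketch of that idea: the names make_replacer and process_this are made up for illustration, and sorting the keys longest-first is an added assumption so that a key such as 'find10' is not shadowed by 'find1'.

```python
import re

def make_replacer(pairs):
    """Compile the find/replace pairs once; return a function that applies them."""
    mapping = dict(pairs)
    if not mapping:
        # An empty alternation would match everywhere, so just return the text as-is.
        return lambda text: text
    # Longest keys first, so 'find10' is tried before 'find1'.
    keys = sorted(mapping, key=len, reverse=True)
    # re.escape stops characters like '.' or '+' from acting as regex syntax.
    pattern = re.compile('|'.join(re.escape(k) for k in keys))

    def apply(text):
        # Single pass over the text: each match is looked up in the mapping.
        return pattern.sub(lambda m: mapping[m.group(0)], text)

    return apply

y = [('find1', 'replace1'), ('find2', 'replace2'), ('find3', 'replace3')]
process_this = make_replacer(y)  # hypothetical name, compiled once up front
print(process_this('find1, find2, find3'))  # replace1, replace2, replace3
```

Whether this is actually faster than repeated str.replace calls depends on the data, so it is worth timing both.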

#2


A couple of notes:

  1. The usual boilerplate argument applies: premature optimization, benchmarking, bottlenecks, 100 is small, etc.

  2. There are cases where the different solutions return different results. If y = [('one', 'two'), ('two', 'three')] and x = 'one', then mhawke's solution gives you 'two' and Unknown's gives 'three'. The re.sub approach makes a single pass over the string, whereas replacing pair by pair lets a later pair rewrite the output of an earlier one.

  3. Testing this out in a silly contrived example, mhawke's solution was a tiny bit faster. It should be easy to try it with your own data, though.
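
To see the difference in results and to time both approaches on your own data, something like the following could be used. It is a rough sketch: the function names replace_sequential and replace_single_pass, the sample string, and the repetition count are all invented for illustration, and the timings it prints say nothing about real workloads.

```python
import re
import timeit

def replace_sequential(text, pairs):
    # Apply each pair in turn; later pairs see the output of earlier ones.
    for find, replace in pairs:
        text = text.replace(find, replace)
    return text

def replace_single_pass(text, pairs):
    # One regex pass; replaced text is never scanned again.
    mapping = dict(pairs)
    pattern = re.compile('|'.join(re.escape(k) for k in mapping))
    return pattern.sub(lambda m: mapping[m.group(0)], text)

pairs = [('one', 'two'), ('two', 'three')]
print(replace_sequential('one', pairs))   # three
print(replace_single_pass('one', pairs))  # two

# Crude timing harness; swap in your own text and pairs.
sample = 'one two ' * 1000
for fn in (replace_sequential, replace_single_pass):
    seconds = timeit.timeit(lambda: fn(sample, pairs), number=100)
    print(fn.__name__, seconds)
```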

#3


x = 'find1, find2, find3'
y = [('find1', 'replace1'), ('find2', 'replace2'), ('find3', 'replace3')]

def processThis(text, lst):
    # Apply each (find, replace) pair in order; 'text' avoids shadowing the builtin str
    for find, replace in lst:
        text = text.replace(find, replace)

    return text

>>> processThis(x,y)
'replace1, replace2, replace3'

#4


from functools import reduce  # reduce is not a builtin in Python 3

s = reduce(lambda x, repl: str.replace(x, *repl), lst, s)

#5


Same answer as mhawke's, wrapped in a function str_replace:

def str_replace(data, search_n_replace_dict):
    import re
    REPLACEMENTS = search_n_replace_dict

    def replacer(m):
        return REPLACEMENTS[m.group(0)]

    # Escape the keys so regex metacharacters in them are matched literally
    r = re.compile('|'.join(map(re.escape, REPLACEMENTS)))
    return r.sub(replacer, data)

Then we can call it as in the example below:

s = "abcd abcd efgh efgh;;;;;; lkmnkd kkkkk"
d = {'abcd': 'aaaa', 'efgh': 'eeee', 'mnkd': 'mmmm'}

print(s)
print("\n")
print(str_replace(s, d))

Output:

abcd abcd efgh efgh;;;;;; lkmnkd kkkkk


aaaa aaaa eeee eeee;;;;;; lkmmmm kkkkk
