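MapReduce is a programming model, functional in style, for parallel computation over large data sets (larger than 1 TB). Its two central ideas, "Map" and "Reduce", are borrowed from functional programming languages, along with some features borrowed from vector programming languages. It makes it much easier for programmers with no experience in distributed parallel programming to run their programs on a distributed system.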
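The programmer specifies a Map function, which turns a set of key-value pairs into a new set of intermediate key-value pairs, and a Reduce function, run concurrently, which merges all of the intermediate values that share the same key.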
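The two phases described above can be illustrated with a single-process word count. This is only a minimal sketch: the names map_phase, shuffle, reduce_phase and word_count are hypothetical, and the grouping step that a real MapReduce framework performs across machines is simulated here with an in-memory dictionary.

```python
from collections import defaultdict
from functools import reduce


def map_phase(document):
    """Map: emit one (word, 1) pair for every word in a document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group values by key (a framework would do this between the phases)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: merge all values that share one key into a single result."""
    return key, reduce(lambda x, y: x + y, values)


def word_count(documents):
    # Run the map phase on every document and collect all emitted pairs.
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    # Reduce each group of values independently, one key at a time.
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


counts = word_count(["the quick fox", "the lazy dog", "The end"])
print(counts)
```

Because each key is reduced independently of the others, the reduce calls could in principle run in parallel, which is the property the model relies on.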
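Python's built-in map and reduce are simpler: map applies a function to every element of an iterable, and reduce folds a sequence into a single value by repeatedly applying a two-argument function to a running result and the next element.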
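To make the behaviour of the two built-ins concrete, here is a rough pure-Python equivalent of each. The names my_map and my_reduce are hypothetical and this is a sketch for illustration, not how CPython implements them.

```python
_MISSING = object()  # sentinel meaning "no initial value was passed"


def my_map(function, *iterables):
    """Lazily yield function(a, b, ...) for items taken in parallel."""
    for args in zip(*iterables):
        yield function(*args)


def my_reduce(function, iterable, initial=_MISSING):
    """Fold iterable from the left: f(f(f(a, b), c), d) ..."""
    it = iter(iterable)
    if initial is _MISSING:
        try:
            acc = next(it)  # the first element seeds the accumulator
        except StopIteration:
            raise TypeError("my_reduce() of empty iterable with no initial value") from None
    else:
        acc = initial
    for item in it:
        acc = function(acc, item)
    return acc


print(list(my_map(str, [-1, -2, -3])))
print(my_reduce(lambda x, y: x * 10 + y, [1, 3, 5, 7, 9]))
```

Like the built-in map in Python 3, my_map produces its results lazily, which is why the examples wrap the result in list() before printing.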
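# map/reduce
from functools import reduce

# map applies str to every element; list() materializes the lazy result
print(list(map(str, [-1, -2, -3, -4, -5])))

def fn(x, y):
    print(x, y)  # trace each step of the reduction
    return x * 10 + y

r = reduce(fn, [1, 3, 5, 7, 9])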
print(r)

def func_sum(x, y):
    return x + y

def square(x):
    return x * x

# Square every element, then sum the squares
list_r = map(square, [-1, -2, -3, -4, -5])
ll_r = list(list_r)
print('r = ', ll_r)
ans = reduce(func_sum, ll_r)
print('sum is ', ans)

def add_100(a, b, c):
    print(a, b, c)
    return a * 10000 + b * 100 + c

# map accepts several iterables and passes one element from each per call
list1 = [11, 22, 33]
list2 = [44, 55, 66]
list3 = [77, 88, 99]
rec = map(add_100, list1, list2, list3)
print(list(rec))

from functools import reduce

def fn(x, y):
    print(x, y)
    return x * 10 + y

def char2num(s):
    # {...} builds a dict and [s] looks up the key, so this returns the digit for s
    return {'0': 0, '1': 1, '2': 2, '3': 3, '4': 4, '5': 5, '6': 6, '7': 7, '8': 8, '9': 9}[s]
print(reduce(fn, map(char2num, '13579')))

def str2int(s):
    def fn(x, y):
        return x * 10 + y
    def char2num(s):
        return {'0': 0, '1': 1, '2': 2, '3': 3, '4': 4, '5': 5, '6': 6, '7': 7, '8': 8, '9': 9}[s]
    return reduce(fn, map(char2num, s))

print(str2int('13579'))
print(int('13579'))
print(str(13572))

def str2int_(s):
    def char2num(s):
        return {'0': 0, '1': 1, '2': 2, '3': 3, '4': 4, '5': 5, '6': 6, '7': 7, '8': 8, '9': 9}[s]
    # lambda defines an anonymous function: some simple functions are used only once, so they are not given a name
    return reduce(lambda x, y: x * 10 + y, map(char2num, s))

print(str2int_('13579'))

# Exercises
# 1. Normalize English names (first letter upper case, the rest lower case)
def normalize(name):
    return name.capitalize()

L1 = ['adam', 'LISA', 'barT']
L2 = list(map(normalize, L1))
print(L2)
# 2. Write a prod() function that accepts a list and uses reduce() to compute the product:
from functools import reduce

def prod(L):
    return reduce(lambda x, y: x * y, L)

print('3 * 5 * 7 * 9 =', prod([3, 5, 7, 9]))
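One edge case worth noting for exercise 2: reduce() without an initial value raises TypeError on an empty list. The sketch below passes 1 as the third argument so that an empty input yields the empty product. The name prod_safe is hypothetical, and math.prod is shown only for comparison (it is available from Python 3.8).

```python
import math
import operator
from functools import reduce


def prod_safe(numbers):
    """Product of numbers; the initial value 1 covers the empty list."""
    return reduce(operator.mul, numbers, 1)


print(prod_safe([3, 5, 7, 9]))
print(prod_safe([]))
print(math.prod([3, 5, 7, 9]))  # standard-library equivalent, Python 3.8+
```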
# 3. Use map and reduce to write a str2float function that converts the string '123.456' to the float 123.456:
from functools import reduce

def str2float(s):
    def char2num(s):
        return {'0': 0, '1': 1, '2': 2, '3': 3, '4': 4, '5': 5, '6': 6, '7': 7, '8': 8, '9': 9}[s]
    pos = s.find('.')             # index of the decimal point
    s_num = s.replace('.', '')    # all digits with the point removed
    print(pos, s_num)             # debug trace
    L = reduce(lambda x, y: x * 10 + y, map(char2num, s_num))
    # Scale by the number of digits after the point, not by the point's index
    return L / 10 ** (len(s) - pos - 1)

print('str2float(\'123.456\') =', str2float('123.456'))
# Another solution
def str2float_(s):
    def char2num(s):
        return {'0': 0, '1': 1, '2': 2, '3': 3, '4': 4, '5': 5, '6': 6, '7': 7, '8': 8, '9': 9}[s]
    a, b = s.split('.')           # integer part and fractional part
    L = reduce(lambda x, y: x * 10 + y, map(char2num, a + b))
    return L / 10 ** len(b)

print('str2float_(\'123.456\') =', str2float_('123.456'))
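Both solutions assume the input contains exactly one decimal point. As a possible variation, the sketch below uses str.partition so that strings without a point, such as '42', are also accepted. The names DIGITS and str2float_general are hypothetical, and signs, exponents and whitespace are deliberately not handled.

```python
from functools import reduce

# Lookup table from digit character to its integer value
DIGITS = {str(d): d for d in range(10)}


def str2float_general(s):
    """Convert 'ddd' or 'ddd.ddd' to a float using map and reduce."""
    whole, _, frac = s.partition('.')  # frac is '' when there is no point
    value = reduce(lambda x, y: x * 10 + y, map(DIGITS.__getitem__, whole + frac), 0)
    return value / 10 ** len(frac)


for text in ['123.456', '42', '0.5']:
    # Compare against the built-in float() as a sanity check
    print(text, str2float_general(text), float(text))
```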
All of the code above implements material and exercises from the liaoxuefeng tutorial.