Word Count (Map Reduce)

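This problem is mainly a way to see what the two developer-defined stages of the MapReduce framework, map and reduce, actually do. Broadly, the mapper function (in practice, a mapper machine) should not use extra space that grows linearly with the input, such as a hash table or an array, to process the data; it simply emits a key-value pair for each item it sees. In the reducer, values is a temporary collection of the results that the various map machines computed for the same key.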

class WordCount:

    # @param {str} line a line of text, for example "Bye Bye see you next"
    def mapper(self, key, line):
        # Emit (word, 1) for every whitespace-separated word in the line.
        # Nothing is accumulated here; counting is left to the reducer.
        for word in line.split():
            yield (word, 1)


    # @param key is a word emitted by the mapper
    # @param values is the collection of values emitted for that key
    def reducer(self, key, values):
        # Sum the per-occurrence counts and emit the total for this word.
        result = 0
        for value in values:
            result += value

        yield (key, result)
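
To try the class without a cluster, the step that sits between the two stages (grouping every emitted value by its key, usually called the shuffle) can be simulated in a single process. The following is only a minimal sketch: `run_local` is a hypothetical helper that is not part of any MapReduce framework, the line index passed as the mapper's `key` is an assumption, and `WordCount` is repeated in compact form so the snippet runs on its own.

```python
from collections import defaultdict


class WordCount:
    # Same logic as the class above, repeated so this snippet is self-contained.
    def mapper(self, key, line):
        for word in line.split():
            yield word, 1

    def reducer(self, key, values):
        yield key, sum(values)


def run_local(job, lines):
    """Hypothetical single-process driver: map, shuffle (group by key), reduce."""
    # Shuffle: the driver, not the mapper, gathers the values that share a key.
    grouped = defaultdict(list)
    for line_no, line in enumerate(lines):
        # Assumption: the mapper's key is the line index; WordCount ignores it.
        for word, count in job.mapper(line_no, line):
            grouped[word].append(count)

    # Reduce: each key is handed to the reducer together with all of its values.
    totals = {}
    for word, counts in grouped.items():
        for out_key, out_value in job.reducer(word, counts):
            totals[out_key] = out_value
    return totals


counts = run_local(WordCount(), ["Bye Bye see you next", "see you"])
print(counts)  # {'Bye': 2, 'see': 2, 'you': 2, 'next': 1}
```

The only place a hash table appears is in the driver, which stands in for the framework; the mapper itself stays stateless, which is the point the note above makes.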
