From the course: Learning Hadoop
Prepare for MapReduce Java coding - Hadoop Tutorial
- [Instructor] As we get ready to look at the code for MapReduce, I want to start us off with pseudocode. One of the evolutions of the Hadoop ecosystem is that initially it was mostly written in Java, but because not everybody codes in Java, there's more and more work around libraries, particularly for Python. So I'm starting with a sort of Pythonic pseudocode here, because I think that's more accessible. You can see this is a word count. You have a mapper and a reducer, and the mapper takes the file name and the file contents. For each word in the file contents, you emit the word and a one. Now of course, as I said in a previous movie, the definition of a word is going to vary vastly by your business case. There is a default, but that won't suit most business cases. The reducer is going to take the word and the values. Starting with a sum of zero, so that's your counter, for each value in values, sum plus values,…
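The word-count pseudocode described in the transcript can be sketched in plain Python. This is only an illustration under stated assumptions, not the code shown in the course. The regular expression used to define a "word", the `run_word_count` driver that stands in for Hadoop's shuffle step, and the final return of the word with its total are all assumptions, since the transcript cuts off before the reducer finishes.

```python
import re
from collections import defaultdict


def mapper(file_name, file_contents):
    """Emit a (word, 1) pair for every word in the file contents."""
    # Assumption: a "word" is a lowercased run of letters, digits, or
    # apostrophes. A real job would use a definition that fits the business case.
    for word in re.findall(r"[a-z0-9']+", file_contents.lower()):
        yield word, 1


def reducer(word, values):
    """Add up all the ones emitted for a single word."""
    total = 0  # the counter starts at zero
    for value in values:
        total += value
    # Assumption: the reducer emits the word with its total count.
    return word, total


def run_word_count(files):
    """Hypothetical local driver that stands in for Hadoop's shuffle step."""
    grouped = defaultdict(list)
    for file_name, file_contents in files.items():
        for word, one in mapper(file_name, file_contents):
            grouped[word].append(one)  # group values by key
    return dict(reducer(word, values) for word, values in grouped.items())


counts = run_word_count({"a.txt": "the cat and the hat", "b.txt": "The end"})
print(counts)
```

On a real cluster the framework does the grouping between the two phases. The driver here exists only so the mapper and reducer can be tried on a laptop.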