- Your presentation must have a minimum of 12 slides. The slides must be readable, neatly formatted, and original. You must attach them to this assignment, not send a link. This will be checked via Turnitin.
- Your presentation must include the following:
  - What are the components of YARN, and what does each component do when a Hadoop MapReduce job is submitted?
  - What is data locality? Does data locality optimization apply equally to Map tasks and Reduce tasks? Why?
  - What are the uses of Pig, Hive, and Impala? What are the differences?
  - What does a Mapper do in MapReduce?
  - What does a Reducer do in MapReduce?
  - What happens in HDFS when a worker node is corrupted or taken offline? How is data loss prevented? Specifically discuss the HDFS daemons.
Format your presentation as if you were presenting it to a group of people. It must be a coherent, prepared, and fluid presentation to receive full credit.