Which two of the following statements are true about Pig's approach toward data? Choose 2 answers
You have written a Mapper which invokes the following five calls to the OutputCollector.collect method:
output.collect(new Text("Apple"), new Text("Red"));
output.collect(new Text("Banana"), new Text("Yellow"));
output.collect(new Text("Apple"), new Text("Yellow"));
output.collect(new Text("Cherry"), new Text("Red"));
output.collect(new Text("Apple"), new Text("Green"));
How many times will the Reducer's reduce method be invoked?
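One way to reason about the question above is to simulate the grouping that happens between map and reduce. The following is a minimal sketch in plain Java with no Hadoop dependency. It assumes the default behavior, in which the framework calls reduce once per distinct key with all of that key's values. The class and method names (ReduceCallSketch, groupByKey, mapperOutput) are hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class ReduceCallSketch {
    // Groups (key, value) pairs by key, roughly as shuffle-and-sort does
    // before the reduce phase. A TreeMap keeps the keys in sorted order.
    static Map<String, List<String>> groupByKey(String[][] pairs) {
        Map<String, List<String>> groups = new TreeMap<>();
        for (String[] pair : pairs) {
            groups.computeIfAbsent(pair[0], k -> new ArrayList<>()).add(pair[1]);
        }
        return groups;
    }

    // The five key-value pairs emitted by the Mapper in the question.
    static String[][] mapperOutput() {
        return new String[][] {
            {"Apple", "Red"},
            {"Banana", "Yellow"},
            {"Apple", "Yellow"},
            {"Cherry", "Red"},
            {"Apple", "Green"},
        };
    }

    public static void main(String[] args) {
        Map<String, List<String>> groups = groupByKey(mapperOutput());
        // Each map entry stands for one simulated reduce(key, values) call.
        for (Map.Entry<String, List<String>> entry : groups.entrySet()) {
            System.out.println("reduce(" + entry.getKey() + ", " + entry.getValue() + ")");
        }
        System.out.println("reduce calls: " + groups.size());
    }
}
```

Under that assumption, the number of reduce invocations equals the number of distinct keys in the Mapper output, not the number of collect calls.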
What does the following command do? register '/piggybank/pig-files.jar';
Which TWO of the following statements are true regarding Hive? Choose 2 answers
You've written a MapReduce job that will process 500 million input records and generate 500 million key-value pairs. The data is not uniformly distributed. Your MapReduce job will create a significant amount of intermediate data that it needs to transfer between mappers and reducers, which is a potential bottleneck. A custom implementation of which interface is most likely to reduce the amount of intermediate data transferred across the network?
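The effect of map-side pre-aggregation on intermediate data volume can be illustrated without Hadoop. The following is a minimal sketch in plain Java, assuming a job whose reduce function is a simple sum (commutative and associative, so partial sums are safe). The class name CombinerSketch, the method combine, and the sample keys are hypothetical and not taken from the question.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class CombinerSketch {
    // Sums a count of 1 per occurrence of each key on the "map side",
    // so that one (key, partialSum) record per key is left to transfer.
    static Map<String, Integer> combine(String[] mapOutputKeys) {
        Map<String, Integer> combined = new LinkedHashMap<>();
        for (String key : mapOutputKeys) {
            combined.merge(key, 1, Integer::sum);
        }
        return combined;
    }

    public static void main(String[] args) {
        // Hypothetical map output: five (key, 1) records for two keys.
        String[] mapOutputKeys = {"a", "b", "a", "a", "b"};
        Map<String, Integer> combined = combine(mapOutputKeys);
        System.out.println("records before local aggregation: " + mapOutputKeys.length);
        System.out.println("records after local aggregation: " + combined.size());
        System.out.println("partial sums: " + combined);
    }
}
```

In Hadoop itself the framework may apply such local aggregation zero or more times per map task, so the job's result must not depend on it running.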
Which HDFS command uploads a local file X into an existing HDFS directory Y?
Which TWO of the following statements are true about HDFS? Choose 2 answers
Identify the MapReduce v2 (MRv2 / YARN) daemon responsible for launching application containers and monitoring application resource usage.