Map reduce is a framework that was developed to process massive amounts of data efficiently. For example, if we have 1 million records in a dataset, and it is stored in a relational representation – it is very expensive to derive values and perform any sort of transformations on these.
For Example In SQL, Given the Date of Birth, to find out How many people are of age > 30 for a million records would take a while, and this would only increase in the order of magnitude when the complexity of the query increases. Map Reduce provides a cluster-based implementation where data is processed in a distributed manner
The biggest disadvantage of MapReduce is high latency. which makes it unusable for real time applications. There are frameworks on top of hadoop like HBase that are well suited for real time use cases too but plain hadoop/mapreduce doesn’t provide that functionality.
Answer ( 1 )
Map reduce is a framework that was developed to process massive amounts of data efficiently. For example, if we have 1 million records in a dataset, and it is stored in a relational representation – it is very expensive to derive values and perform any sort of transformations on these.
For Example In SQL, Given the Date of Birth, to find out How many people are of age > 30 for a million records would take a while, and this would only increase in the order of magnitude when the complexity of the query increases. Map Reduce provides a cluster-based implementation where data is processed in a distributed manner
The biggest disadvantage of MapReduce is high latency. which makes it unusable for real time applications. There are frameworks on top of hadoop like HBase that are well suited for real time use cases too but plain hadoop/mapreduce doesn’t provide that functionality.