Programming model for processing large datasets in parallel across distributed clusters with a simple two-step logic (Map and Reduce).
MapReduce splits a job into two phases: 1) Map phase (transforms each input record into intermediate key-value pairs) and 2) Reduce phase (aggregates all values sharing the same key into final results); between them, a shuffle/sort step routes each key's values to the same reducer. A minimal sketch follows below.
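A minimal, framework-free sketch of the two phases using word count as the canonical example; the function names (map_phase, shuffle, reduce_phase) are illustrative only, not part of Hadoop's API.

```python
from collections import defaultdict
from itertools import chain

# Map phase: turn each input line into (word, 1) key-value pairs.
def map_phase(line):
    return [(word, 1) for word in line.split()]

# Shuffle: group all intermediate values by key (done by the framework in real systems).
def shuffle(pairs):
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

# Reduce phase: aggregate the values for each key.
def reduce_phase(key, values):
    return key, sum(values)

lines = ["the quick brown fox", "the lazy dog"]
intermediate = chain.from_iterable(map_phase(line) for line in lines)
result = dict(reduce_phase(k, v) for k, v in shuffle(intermediate).items())
print(result)  # {'the': 2, 'quick': 1, 'brown': 1, 'fox': 1, 'lazy': 1, 'dog': 1}
```

In a real cluster the map and reduce calls run in parallel on many nodes, and the shuffle moves data across the network, but the per-key logic is exactly this simple.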
Originated at Google as a proprietary system; Hadoop provides the best-known open-source implementation, and map/reduce-style operations also appear in other engines (e.g., Apache Spark's RDD API).
Handles fault tolerance automatically: if a node fails mid-computation, its map or reduce tasks are re-executed on other nodes.
Optimized for batch processing of large-scale unstructured data (e.g., web indexing, log analysis).
Limitations include heavy disk I/O (intermediate results are written to disk between phases, which Spark addresses with in-memory processing) and a poor fit for iterative algorithms, since each iteration requires a full MapReduce pass over the data.
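For comparison, a hedged sketch of the same word count in Spark's RDD API (assuming a local PySpark installation); keeping the RDD cached in memory is what lets iterative jobs avoid the per-pass disk writes of classic MapReduce.

```python
from pyspark import SparkContext

sc = SparkContext("local", "WordCount")  # assumes PySpark is installed locally

lines = sc.parallelize(["the quick brown fox", "the lazy dog"])
counts = (lines.flatMap(lambda line: line.split())   # Map: emit one word per record
               .map(lambda word: (word, 1))          # key-value pairs
               .reduceByKey(lambda a, b: a + b))     # Reduce: sum counts per key
counts.cache()            # keep the result in memory for reuse across iterations
print(counts.collect())
sc.stop()
```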