Map Reduce in MongoDB

According to the MongoDB documentation, map-reduce is a data processing paradigm that condenses large volumes of data into useful aggregated results. MongoDB provides the mapReduce command for map-reduce operations. In general, map-reduce is used to process large data sets.

MapReduce command in MongoDB

The basic syntax of mapReduce command is as follows:

 > db.collection.mapReduce(
     function () { emit(key, value); },                  // map function
     function (key, values) { return reduceFunction; },  // reduce function
     {
       out: collection,
       query: document,
       sort: document,
       limit: number
     }
   )

The mapReduce function first queries the collection, then maps each resulting document to emit key-value pairs, which are then reduced for every key that has multiple values.

In the above syntax:

map is a JavaScript function that maps a value to a key and emits a key-value pair.

reduce is a JavaScript function that reduces, or groups, all values emitted with the same key.

out specifies where the result of the map-reduce query is stored.

query specifies optional selection criteria for the documents to process.

sort specifies optional sort criteria for the input documents.

limit specifies an optional maximum number of documents to pass to the map function.
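The flow described above can be sketched in plain JavaScript, without a MongoDB server. This is only an in-memory illustration, not MongoDB's implementation: the function name simulateMapReduce, the orders sample data, and the choice to pass emit as an argument to map (in MongoDB, emit is available globally inside map) are all assumptions made for the example, and query supports simple equality only.

```javascript
// In-memory sketch of the mapReduce flow; NOT MongoDB's implementation.
// simulateMapReduce and its option handling are hypothetical.
function simulateMapReduce(docs, map, reduce, options = {}) {
  const { query = {}, sort = null, limit = Infinity } = options;

  // query: keep only documents whose fields equal the query's fields
  // (simple equality only; real MongoDB queries also support operators).
  let input = docs.filter((doc) =>
    Object.entries(query).every(([field, value]) => doc[field] === value)
  );

  // sort: order the input by one field, 1 = ascending, -1 = descending.
  if (sort) {
    const [[field, direction]] = Object.entries(sort);
    input = [...input].sort((a, b) =>
      a[field] < b[field] ? -direction : a[field] > b[field] ? direction : 0
    );
  }

  // limit: cap the number of documents passed to map.
  input = input.slice(0, limit);

  // map: each document is bound to `this`; emit collects values per key.
  const groups = new Map();
  const emit = (key, value) => {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  };
  for (const doc of input) map.call(doc, emit);

  // reduce: collapse each key's values into a single value.
  // A key with only one value is passed through without calling reduce.
  const out = [];
  for (const [key, values] of groups) {
    const value = values.length > 1 ? reduce(key, values) : values[0];
    out.push({ _id: key, value });
  }
  return out;
}

// Hypothetical sample data: total quantity per item for paid orders.
const orders = [
  { item: "pen", qty: 2, status: "paid" },
  { item: "ink", qty: 1, status: "paid" },
  { item: "pen", qty: 3, status: "paid" },
  { item: "pen", qty: 9, status: "open" },
];

const totals = simulateMapReduce(
  orders,
  function (emit) { emit(this.item, this.qty); },
  (key, values) => values.reduce((a, b) => a + b, 0),
  { query: { status: "paid" } }
);

console.log(totals);
```

Here the open order is filtered out by query before map runs, so only the three paid orders contribute to the totals.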

Use MapReduce in MongoDB

Consider the following document structure for storing user posts. Each document stores the post text, the user's user_name, and the status of the post.

 { "post_text" : "tutorialspoint is an awesome website for tutorials" , "user_name" : "mark" , "status" : "active" } 

Now, we will use a mapReduce function on the posts collection to select all active posts, group them based on user_name and then count the number of posts of each user by using the following code:

 > db.posts.mapReduce(
     function () { emit(this.user_name, 1); },             // map: emit 1 per post, keyed by user_name
     function (key, values) { return Array.sum(values); }, // reduce: add up the 1s for each user
     {
       query: { status: "active" },
       out: "post_total"
     }
   )
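One detail worth checking in a reduce function like the one above: MongoDB may call reduce more than once for the same key, feeding an earlier result back in as one of the values, so the returned value has to be usable as an input. The sketch below checks this for a counting reducer in plain JavaScript. The name countReduce is hypothetical, and because Array.sum is a helper of the mongo shell rather than standard JavaScript, the sketch uses Array.prototype.reduce instead.

```javascript
// Plain-JavaScript stand-in for the counting reduce function.
// countReduce is a hypothetical name used only for this illustration.
function countReduce(key, values) {
  return values.reduce((total, n) => total + n, 0);
}

// Reducing all emitted values in one call ...
const direct = countReduce("mark", [1, 1, 1]);

// ... should give the same answer as reducing a partial result
// together with the remaining values in a second call.
const partial = countReduce("mark", [1, 1]);
const rereduced = countReduce("mark", [partial, 1]);

console.log(direct === rereduced); // true
```

A reducer that returned, say, values.length instead of the sum would fail this check, because a partial result of 2 would be counted as a single value.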

The mapReduce query above returns the following result:

 {
   "result": "post_total",
   "timeMillis": 9,
   "counts": {
     "input": 4,
     "emit": 4,
     "reduce": 2,
     "output": 2
   },
   "ok": 1
 }

The result shows that 4 documents matched the query (status: "active"), the map function emitted 4 key-value pairs, and the reduce function finally grouped the mapped documents that share the same key into 2.
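To see where these four counters come from, the run can be replayed in plain JavaScript on a small in-memory data set. The posts array below is hypothetical sample data (the source does not list the collection's contents), and the sketch assumes that the reduce counter is the number of calls to the reduce function, which happens only for keys that have more than one value.

```javascript
// Hypothetical sample data: five posts, four of them active.
const posts = [
  { user_name: "tom", status: "active" },
  { user_name: "mark", status: "active" },
  { user_name: "tom", status: "active" },
  { user_name: "mark", status: "active" },
  { user_name: "mark", status: "disabled" },
];

const counts = { input: 0, emit: 0, reduce: 0, output: 0 };
const groups = new Map();

for (const post of posts) {
  if (post.status !== "active") continue; // query: { status: "active" }
  counts.input += 1;

  // map: emit(user_name, 1)
  if (!groups.has(post.user_name)) groups.set(post.user_name, []);
  groups.get(post.user_name).push(1);
  counts.emit += 1;
}

const results = [];
for (const [key, values] of groups) {
  let value = values[0];
  if (values.length > 1) {
    // Assumption: reduce is called only for keys with several values.
    value = values.reduce((a, b) => a + b, 0);
    counts.reduce += 1;
  }
  results.push({ _id: key, value });
  counts.output += 1;
}

console.log(counts);  // { input: 4, emit: 4, reduce: 2, output: 2 }
console.log(results);
```

The disabled post never reaches map, which is why input is 4 rather than 5, and the two distinct user names give 2 output documents.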

To see the results of this mapReduce query, use the find operator:

 > db.posts.mapReduce(
     function () { emit(this.user_name, 1); },
     function (key, values) { return Array.sum(values); },
     {
       query: { status: "active" },
       out: "post_total"
     }
   ).find()

The above query returns the following result, which shows that the users tom and mark each have two active posts.

 { "_id" : "tom", "value" : 2 }
 { "_id" : "mark", "value" : 2 }

In the same manner, MapReduce queries can be used to build complex Aggregation queries. The use of custom JavaScript functions makes using MapReduce more flexible and powerful.

According to Tutorialspoint
