Map Reduce in MongoDB
According to the MongoDB documentation, map-reduce is a data processing paradigm that condenses a large volume of data into useful aggregated results. MongoDB provides the mapReduce command for map-reduce operations. In general, map-reduce is used to process large data sets.
MapReduce command in MongoDB
The basic syntax of mapReduce command is as follows:
> db.collection.mapReduce(
   function() { emit(key, value); },                   // map function
   function(key, values) { return reduceFunction },    // reduce function
   {
      out: collection,
      query: document,
      sort: document,
      limit: number
   }
)
The mapReduce command first queries the collection, then maps the resulting documents to emit key-value pairs, which are then reduced for every key that has multiple values.
In the above syntax:
map is a JavaScript function that maps a value to a key and emits a key-value pair.
reduce is a JavaScript function that reduces or groups all the documents that have the same key.
out specifies the location of the map-reduce query result.
query specifies the optional selection criteria for selecting documents.
sort specifies the optional sort criteria.
limit specifies the optional maximum number of documents to be returned.
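The flow described above can be tried without a database. The following is a minimal plain-JavaScript sketch, not MongoDB's implementation: simulateMapReduce, the sample documents, and the simplified option handling (equality-only query, single-field sort) are all hypothetical. Unlike MongoDB, where emit is provided as a global, this sketch passes emit to the map function as an argument.

```javascript
// Plain-JavaScript sketch of the map-reduce flow; NOT MongoDB's implementation.
// simulateMapReduce and its simplified options are hypothetical.
function simulateMapReduce(docs, map, reduce, options) {
  const opts = options || {};
  const query = opts.query || {};

  // query: keep documents whose fields equal the requested values (equality only).
  let selected = docs.filter(function (doc) {
    return Object.keys(query).every(function (field) {
      return doc[field] === query[field];
    });
  });

  // sort: order by the first field of the sort document (1 ascending, -1 descending).
  if (opts.sort) {
    const field = Object.keys(opts.sort)[0];
    const direction = opts.sort[field];
    selected = selected.slice().sort(function (a, b) {
      if (a[field] < b[field]) return -direction;
      if (a[field] > b[field]) return direction;
      return 0;
    });
  }

  // limit: cap the number of documents passed to the map function.
  if (typeof opts.limit === "number") {
    selected = selected.slice(0, opts.limit);
  }

  // map phase: bind `this` to each document and collect emitted values by key.
  const groups = new Map();
  function emit(key, value) {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  }
  selected.forEach(function (doc) {
    map.call(doc, emit);
  });

  // reduce phase: a key with a single value is passed through without calling reduce.
  const output = [];
  groups.forEach(function (values, key) {
    output.push({ _id: key, value: values.length > 1 ? reduce(key, values) : values[0] });
  });
  return output;
}

// Hypothetical sample data and functions.
const sampleDocs = [
  { category: "a", amount: 5, enabled: true },
  { category: "b", amount: 7, enabled: true },
  { category: "a", amount: 3, enabled: true },
  { category: "b", amount: 9, enabled: false }
];
const mapAmount = function (emit) { emit(this.category, this.amount); };
const sumValues = function (key, values) {
  return values.reduce(function (total, v) { return total + v; }, 0);
};

const totals = simulateMapReduce(sampleDocs, mapAmount, sumValues, { query: { enabled: true } });
console.log(totals);
```

Here the query keeps the three enabled documents, the map function emits one amount per document, and only category "a" reaches the reduce function because it is the only key with more than one value.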
Using MapReduce in MongoDB
Consider the following document structure for storing user posts. Each document stores the user_name of the user and the status of the post.
{
   "post_text": "tutorialspoint is an awesome website for tutorials",
   "user_name": "mark",
   "status": "active"
}
Now, we will use a mapReduce function on the posts collection to select all active posts, group them based on user_name and then count the number of posts of each user by using the following code:
> db.posts.mapReduce(
   function() { emit(this.user_name, 1); },
   function(key, values) { return Array.sum(values); },
   {
      query: { status: "active" },
      out: "post_total"
   }
)
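The map and reduce functions in this command can also be written as named functions and exercised outside the database. The sketch below is an illustration under assumptions: mapPost, reducePostCount, and the emitted array are hypothetical names, emit is a local stand-in for the function MongoDB supplies, and the key is the user_name field of the sample document. Array.sum is a helper available in MongoDB's JavaScript environment rather than part of standard JavaScript, so the sketch sums with Array.prototype.reduce instead.

```javascript
// Hypothetical stand-in for the emit() that MongoDB provides to map functions.
const emitted = [];
function emit(key, value) {
  emitted.push({ key: key, value: value });
}

// Map: emit one count per post, keyed by the author's user_name.
function mapPost() {
  emit(this.user_name, 1);
}

// Reduce: add up the counts for one key using standard JavaScript only.
function reducePostCount(key, values) {
  return values.reduce(function (total, count) {
    return total + count;
  }, 0);
}

// Try the functions on the sample document from the text.
const samplePost = {
  post_text: "tutorialspoint is an awesome website for tutorials",
  user_name: "mark",
  status: "active"
};
mapPost.call(samplePost);                    // `this` is the document, as in MongoDB
console.log(emitted);                        // one pair: key "mark", value 1
console.log(reducePostCount("mark", [1, 1])); // 2
```

Keeping the reduce function free of shell-only helpers makes it easy to check on its own, since it only has to return a value of the same shape as the values it receives.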
The above mapReduce query outputs the following result:
{
"result": "post_total",
"timeMillis": 9,
"counts": {
"input": 4,
"emit": 4,
"reduce": 2,
"output": 2
},
"ok": 1
}
The result shows that a total of 4 documents matched the query (status: "active"), the map function emitted 4 key-value pairs, and finally the reduce function grouped the mapped documents that share the same key into 2 output documents.
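To see where these counts come from, the following sketch tallies them in plain JavaScript. The posts array is hypothetical data chosen to reproduce the numbers (the real collection is not shown in this article), and the sketch assumes one reduce call per key that collected more than one value.

```javascript
// Hypothetical posts chosen to reproduce the counts; the real collection is not shown.
const posts = [
  { user_name: "tom", status: "active" },
  { user_name: "tom", status: "active" },
  { user_name: "mark", status: "active" },
  { user_name: "mark", status: "active" },
  { user_name: "mark", status: "disabled" }
];

const counts = { input: 0, emit: 0, reduce: 0, output: 0 };
const valuesByKey = {};

posts.forEach(function (post) {
  if (post.status !== "active") return; // query: { status: "active" }
  counts.input += 1;
  // map: emit(user_name, 1)
  if (!valuesByKey[post.user_name]) valuesByKey[post.user_name] = [];
  valuesByKey[post.user_name].push(1);
  counts.emit += 1;
});

const postTotal = Object.keys(valuesByKey).map(function (key) {
  const values = valuesByKey[key];
  let value = values[0];
  if (values.length > 1) {
    // reduce is needed only for keys that collected more than one value
    value = values.reduce(function (a, b) { return a + b; }, 0);
    counts.reduce += 1;
  }
  counts.output += 1;
  return { _id: key, value: value };
});

console.log(counts);    // input 4, emit 4, reduce 2, output 2
console.log(postTotal); // tom and mark, each with value 2
```

The disabled post is filtered out before the map step, which is why input is 4 rather than 5.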
To see the result of this mapReduce query, use the find operator:
> db.posts.mapReduce(
   function() { emit(this.user_name, 1); },
   function(key, values) { return Array.sum(values); },
   {
      query: { status: "active" },
      out: "post_total"
   }
).find()
The above query gives the following result, which indicates that users tom and mark each have two posts in the active state:
{ "_id" : "tom", "value" : 2 }
{ "_id" : "mark", "value" : 2 }
In the same manner, MapReduce queries can be used to build complex Aggregation queries. The use of custom JavaScript functions makes using MapReduce more flexible and powerful.
According to Tutorialspoint