Learn about Non-relational Database - NoSQL

NoSQL is steadily becoming a force in the programming world. In this article, Quantrimang gives an overview of NoSQL, the differences between traditional SQL databases and NoSQL, and the most notable features of NoSQL. We invite you to read on.

Relational database management systems (RDBMS) emerged in the 1970s, allowing applications to store data using a standard data model and query language (Structured Query Language, SQL). SQL databases have been in use through decades of technological development and have proven their applicability and robustness under real-world load. At that time, data storage was quite expensive, but data schemas were relatively simple and easy to understand, so there was little need for a new kind of tool.

As technology has grown, especially since the rise of the web, the volume of data about users, products, objects and events that systems must handle has become very large. Google and Facebook, for example, must store and process a huge amount of data every day. As applications become more complex, even displaying a web page or responding to an API request can take dozens or hundreds of database requests.
Here SQL databases have run into obstacles: their restrictions, notably a rigid schema, make them less suitable for some types of applications.

To meet these database needs, service infrastructure and development practices have also changed dramatically. Easier and more affordable cloud technologies have emerged to replace complex and expensive servers. More engineering teams use agile methods, aiming to develop continuously in shorter cycles, and they need to query data quickly to meet users' needs.

And so NoSQL was born to serve these current requirements. A NoSQL system stores and manages data in a way that supports high-speed operation and gives developers great flexibility. Unlike SQL databases, many NoSQL databases can scale horizontally across hundreds or thousands of servers.

However, NoSQL systems do not guarantee data consistency the way SQL databases do. SQL databases traditionally put compliance with the ACID properties (atomicity, consistency, isolation, durability) first to ensure reliable transactions, ahead of performance and scalability, while many NoSQL databases largely give up the ACID guarantees to prioritize speed and scalability.

In summary, SQL and NoSQL make different trade-offs. Although the two can compete within a single project, in the bigger picture they support and complement each other. Which tool to choose depends on the nature of the actual work.

Differences between SQL and NoSQL

History
  SQL databases: Developed in the 1970s with the first wave of data storage applications.
  NoSQL databases: Developed in the 2000s to address the limitations of SQL databases, particularly around scaling and storing unstructured data.

Examples
  SQL databases: MySQL, Postgres, Oracle Database.
  NoSQL databases: MongoDB, Cassandra, HBase, Neo4j.

Data model
  SQL databases: Individual records (for example, "employees") are stored as rows in a table, with each column storing one specific piece of data about that record (e.g. "manager", "hire date"), much like a spreadsheet. Separate data types are stored in separate tables and joined together when more complex queries are executed. For example, "offices" can be stored in one table and "employees" in another. When a user wants to find an employee's work address, the database engine joins the "employee" and "office" tables to get all the necessary information.
  NoSQL databases: Varies by NoSQL database type. For example, key-value stores work similarly to SQL databases but have only two columns ("key" and "value"). Document databases do away with the table-and-row model altogether, storing all related data together in a single "document" in JSON, XML or another format, which can nest values hierarchically.

Scalability
  SQL databases: Vertical, meaning a single server must become increasingly powerful to keep up with growing data. A SQL database can be spread across multiple servers, but this requires significant additional engineering.
  NoSQL databases: Horizontal, meaning that to add capacity the database administrator only needs to add more servers or cloud instances. The NoSQL database automatically distributes data across the servers as necessary.

Development model
  SQL databases: A mix of open source (e.g. Postgres, MySQL) and closed source (e.g. Oracle Database).
  NoSQL databases: Open source.

Data manipulation
  SQL databases: A dedicated language, using the SELECT, INSERT and UPDATE statements. Example: SELECT fields FROM table WHERE ...
  NoSQL databases: Through object-oriented APIs.

Consistency
  SQL databases: Strong consistency.
  NoSQL databases: Depends on the system. Some provide strong consistency (e.g. MongoDB), while others offer eventual consistency (e.g. Cassandra).

Schema
  SQL databases: Structure and data types are fixed in advance. To store information about a new kind of data item, the entire database must be altered, during which time the database must be offline.
  NoSQL databases: Records can take on new information on the fly and, unlike rows in a SQL table, dissimilar data can be stored together as needed. For some databases, adding new fields dynamically is more difficult.
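
The employee and office example from the data model comparison can be sketched in Python. This is only an illustration: the table names, column names and sample values are made up, SQLite stands in for a SQL database, and a plain nested JSON value stands in for a record in a document database.

```python
import json
import sqlite3

# Relational model: employees and offices live in separate tables
# and are combined with a JOIN at query time.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE office (id INTEGER PRIMARY KEY, address TEXT);
    CREATE TABLE employee (
        id INTEGER PRIMARY KEY,
        name TEXT,
        office_id INTEGER REFERENCES office(id)
    );
    INSERT INTO office VALUES (1, '12 Example Street');
    INSERT INTO employee VALUES (1, 'Alice', 1);
""")
row = conn.execute(
    "SELECT e.name, o.address FROM employee e "
    "JOIN office o ON o.id = e.office_id WHERE e.name = ?",
    ("Alice",),
).fetchone()
sql_result = {"name": row[0], "address": row[1]}

# Document model: the related data is nested inside one record,
# so no join is needed to read the address.
document = json.loads(
    '{"name": "Alice", "office": {"address": "12 Example Street"}}'
)
doc_result = {
    "name": document["name"],
    "address": document["office"]["address"],
}
```

Both `sql_result` and `doc_result` hold the same name and address; the difference is where the work of combining the data happens.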

Features of NoSQL

  1. NoSQL stores data as key-value pairs and uses a large number of nodes to store information.
  2. A distributed model under software control.
  3. Duplicate data is accepted, because several nodes store the same information.
  4. A query is sent to several machines at once, so if one machine fails to respond, the result returned is not affected.
  5. Non-relational: there are no relational constraints enforcing data consistency.
  6. Consistency is not real-time: after a change, the update does not have to reach every copy of the data immediately; it spreads to them over time.
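
The last point, consistency that is not real-time, can be illustrated with a toy model. The sketch below is not how any particular NoSQL product works; `ToyCluster`, `Replica` and the explicit `propagate` step are hypothetical names, chosen only to show a write reaching one copy first and the others later.

```python
class Replica:
    """One node holding its own copy of the data."""

    def __init__(self):
        self.data = {}


class ToyCluster:
    """A toy cluster in which writes reach other replicas only later."""

    def __init__(self, n_replicas):
        self.replicas = [Replica() for _ in range(n_replicas)]
        self.pending = []  # (replica_index, key, value) not yet applied

    def write(self, key, value):
        # The write is applied to the first replica immediately...
        self.replicas[0].data[key] = value
        # ...and queued for every other replica.
        for index in range(1, len(self.replicas)):
            self.pending.append((index, key, value))

    def read(self, key, replica_index):
        # A read is answered by one replica from its local copy only.
        return self.replicas[replica_index].data.get(key)

    def propagate(self):
        # Stand-in for background replication catching up.
        for index, key, value in self.pending:
            self.replicas[index].data[key] = value
        self.pending.clear()


cluster = ToyCluster(3)
cluster.write("user:1", "Alice")
stale = cluster.read("user:1", replica_index=2)  # replica 2 has not seen it yet
cluster.propagate()
fresh = cluster.read("user:1", replica_index=2)  # now it has
```

Between the write and the call to `propagate`, a reader that happens to hit replica 2 sees no value at all, which is the behavior the list describes.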

Popular NoSQL systems

With NoSQL, data can be stored in a schema-less or free-form manner. Any data can be stored in any record. Among NoSQL databases there are four popular data storage models, and so four popular types of NoSQL system: document databases, key-value stores, wide column stores and graph databases.

1. Document databases (e.g. CouchDB, MongoDB): Data is stored as a free-form JSON structure or 'document', in which values can be of any type, from integers to strings to free text.

  1. Advantages: Useful when the source data is not fully described in advance.
  2. Disadvantages: Query performance; there is no standard syntax for querying data.

2. Key-value stores (e.g. Redis, Riak): Free-form values, from integers or simple strings to complex JSON documents, are accessed in the database by key.

  1. Advantages: Very fast lookups.
  2. Disadvantages: Stored data does not follow any fixed format (schema).

3. Wide column stores (e.g. HBase, Cassandra): Data is stored in columns instead of rows as in a conventional SQL system. Any number of columns (and therefore many different types of data) can be grouped or aggregated as needed for a query or data view.

  1. Advantages: Fast lookups; good data distribution.
  2. Disadvantages: Supported by very little software.

4. Graph databases (e.g. Neo4j): Data is represented as a network or graph of entities and the relationships between them, with each node in the graph being a free-form block of data.

  1. Advantages: Graph algorithms such as shortest path and connectivity can be applied.
  2. Disadvantages: The graph must be traversed internally to answer queries. Not easy to distribute.
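
To make the four storage models more concrete, here is a rough sketch that mimics each one with ordinary Python data structures. None of this is the API of CouchDB, Redis, HBase or Neo4j; the sample records and the `shortest_path` helper are illustrative assumptions.

```python
from collections import deque

# Document: a free-form record whose values can be nested.
document = {
    "_id": "u1",
    "name": "Alice",
    "tags": ["admin"],
    "address": {"city": "Hanoi"},
}

# Key-value: values of any shape, reachable only through their key.
key_value = {"session:42": "token-abc", "user:u1": document}

# Wide column: row key -> column name -> value.
# Different rows may carry different columns.
wide_column = {
    "u1": {"name": "Alice", "city": "Hanoi"},
    "u2": {"name": "Bob", "last_login": "2018-01-01"},
}
# Reading one column across all rows that have it.
names = {row: cols["name"] for row, cols in wide_column.items() if "name" in cols}

# Graph: each entity maps to the entities it points to.
graph = {"Alice": ["Bob"], "Bob": ["Carol"], "Carol": []}


def shortest_path(edges, start, goal):
    """Breadth-first search; returns the list of nodes on a shortest path."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbour in edges.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return None  # no route between the two nodes


path = shortest_path(graph, "Alice", "Carol")
```

The breadth-first search also shows the graph model's listed drawback: answering the query means walking the graph from node to node.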

Schema-less data storage is useful in the following cases:

  1. Users want fast access to data and care more about speed and simplicity of access than about reliable transactions or consistency.
  2. Users store a large amount of data and do not want to lock themselves into a schema, because changing the schema later may be slow and difficult.
  3. Users are taking in unstructured data from one or more sources and want to keep it in its original form for maximum flexibility.
  4. Users want to store data in a hierarchical structure, but want those hierarchies to be described by the data itself, not by an external schema. NoSQL allows data to be self-referential in ways that are more complex for SQL databases to emulate.

Querying NoSQL databases

The Structured Query Language used by traditional databases provides a uniform way to communicate with the server when storing and retrieving data. SQL syntax is highly standardized, so while individual databases may handle certain operations differently, the basics are the same.

In contrast, each NoSQL database tends to have its own syntax for querying and managing data. For example, CouchDB uses requests in the form of JSON, sent over HTTP, to create or retrieve documents from its database. MongoDB sends JSON objects over a binary protocol, by way of a command-line interface or a language library.
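
As a small illustration of "JSON sent over HTTP", the sketch below builds (but does not send) a request that would store a document in CouchDB, which accepts a PUT to a `/{database}/{document id}` URL. The server address, the `staff` database, the document ID and the helper name are assumptions made for the example.

```python
import json
from urllib.request import Request


def build_create_document_request(base_url, database, doc_id, document):
    """Build an HTTP PUT request carrying the document as a JSON body."""
    body = json.dumps(document).encode("utf-8")
    return Request(
        url=f"{base_url}/{database}/{doc_id}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Hypothetical local server and database; nothing is sent over the network.
request = build_create_document_request(
    "http://localhost:5984",
    "staff",
    "alice",
    {"name": "Alice", "role": "manager"},
)
```

Passing `request` to `urllib.request.urlopen` would actually send it, which requires a running server and, in most setups, credentials.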

Some NoSQL products can use SQL-like syntax to work with data, but only to a limited extent. For example, Apache Cassandra, a wide column store, has its own SQL-like language, the Cassandra Query Language (CQL). Some CQL syntax comes straight from SQL, such as the SELECT and INSERT keywords, but there is no way to perform a JOIN or a subquery in Cassandra.
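
When the database cannot join, the combining typically moves into the application (or the data is denormalized up front). The sketch below assumes two lists of rows that might have come back from two separate single-table queries; the field names and the `join_in_application` helper are hypothetical.

```python
# Rows as they might be returned by two independent queries.
employees = [
    {"name": "Alice", "office_id": 1},
    {"name": "Bob", "office_id": 2},
]
offices = [
    {"office_id": 1, "address": "12 Example Street"},
    {"office_id": 2, "address": "34 Sample Road"},
]


def join_in_application(employee_rows, office_rows):
    """Combine the two result sets in code, as a JOIN would in SQL."""
    # Index offices by ID so each employee lookup is a dictionary access.
    address_by_office = {o["office_id"]: o["address"] for o in office_rows}
    return [
        {"name": e["name"], "address": address_by_office.get(e["office_id"])}
        for e in employee_rows
    ]


joined = join_in_application(employees, offices)
```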

Shared-nothing architecture

A common design choice in NoSQL systems is a shared-nothing architecture. In this design, each node in the cluster operates independently of every other node. The system does not need consensus from every node to return data to a client. Queries are fast because they can be answered by the nearest or most convenient node.

Another advantage of shared-nothing is resilience and the ability to scale out. Expanding the cluster is as easy as spinning up new nodes and waiting for them to synchronize with the others. If a node fails, the other servers in the cluster keep running, and all data remains available to serve requests even with fewer nodes.
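
A toy version of this idea is sketched below. It assumes each key is hashed to a starting node and copied to the next node as well (a replication factor of 2), which is what lets a read succeed after one node fails. `ToyShardedStore` and its methods are hypothetical and far simpler than the placement schemes real systems use.

```python
import hashlib


class ToyShardedStore:
    """Nodes share nothing: each keeps its own dictionary of data."""

    def __init__(self, node_names, replication_factor=2):
        self.order = list(node_names)
        self.nodes = {name: {} for name in self.order}
        self.alive = set(self.order)
        self.replication_factor = replication_factor

    def owners(self, key):
        # Hash the key to pick a starting node, then take the next
        # nodes in order as additional replicas.
        digest = int(hashlib.sha256(key.encode("utf-8")).hexdigest(), 16)
        start = digest % len(self.order)
        return [
            self.order[(start + i) % len(self.order)]
            for i in range(self.replication_factor)
        ]

    def put(self, key, value):
        for name in self.owners(key):
            if name in self.alive:
                self.nodes[name][key] = value

    def get(self, key):
        # Any live replica can answer; no agreement between nodes is needed.
        for name in self.owners(key):
            if name in self.alive and key in self.nodes[name]:
                return self.nodes[name][key]
        return None

    def fail(self, name):
        self.alive.discard(name)


store = ToyShardedStore(["node-a", "node-b", "node-c"])
store.put("user:1", "Alice")
store.fail(store.owners("user:1")[0])  # take down the first owner
value_after_failure = store.get("user:1")  # served by the second replica
```

If every node holding a key fails, `get` returns `None`, so the availability described in the text depends on how many copies are kept.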

Limitations of NoSQL

If NoSQL provides so much freedom and flexibility, why not give up SQL completely? The answer is simple: many applications still require the kinds of constraints, consistency and protections that SQL databases provide. In those cases, some of NoSQL's advantages can turn into drawbacks.
Other limitations stem from the fact that NoSQL systems are still fairly young. They include the following:

1. There is no schema

Even if you take in free-form data, you almost always need to impose constraints on it to make it useful. With NoSQL, that responsibility shifts from the database to the application developer. For example, a developer can impose structure through an object-relational mapping (ORM) system. But if you want the schema to live with the data itself, NoSQL typically will not support that.

Some NoSQL solutions provide optional data typing and validation mechanisms. For example, Apache Cassandra has a set of native data types similar to those found in conventional SQL.
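
When the database enforces no schema, the checks end up in application code. The following is a minimal sketch of that idea, assuming a made-up rule set (`REQUIRED_FIELDS`) and a plain Python list standing in for a collection; it is not the validation mechanism of any particular database.

```python
# Hypothetical rules: field name -> type the application expects.
REQUIRED_FIELDS = {"name": str, "age": int}


def validate(document):
    """Return a list of problems; an empty list means the document passes."""
    errors = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in document:
            errors.append(f"missing field: {field}")
        elif not isinstance(document[field], expected_type):
            errors.append(f"{field} should be {expected_type.__name__}")
    return errors


def insert(collection, document):
    """Store the document only if it satisfies the application's rules."""
    errors = validate(document)
    if errors:
        raise ValueError("; ".join(errors))
    collection.append(document)


people = []  # stand-in for a schema-less collection
insert(people, {"name": "Alice", "age": 30, "nickname": "Al"})  # extra fields are fine
problems = validate({"name": "Bob", "age": "thirty"})
```

Every application that writes to the same data has to apply the same rules, which is the responsibility shift described above.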

2. Lack of consistency

NoSQL trades consistency for speed and performance. Any data inserted into the cluster will eventually be available throughout the system, but it is not possible to know exactly when.

Some NoSQL databases have mechanisms to mitigate this. MongoDB, for example, guarantees consistency for individual operations, but not for the database as a whole. Microsoft Azure Cosmos DB lets you choose the consistency level for each request, so you can pick the behavior appropriate to your use case.

3. NoSQL lock-in

Most NoSQL systems are similar in concept but very different in implementation. Each has its own mechanisms for querying and managing data. This can become a problem if you need to switch systems partway through a project.

For example, if you move from MongoDB to CouchDB, you will have to do more than just migrate data. You must also deal with the differences in data access and programming model; in other words, you must rewrite the parts of your application that access the database.

4. Lack of skills

Another limitation of NoSQL is the relative lack of expertise. While the market for SQL skills is still growing, the market for NoSQL skills is very young, because the systems are still quite new and not everyone knows how to use them well.

According to Truth.com, as of the end of 2017 the volume of job listings for SQL databases (MySQL, Microsoft SQL Server, Oracle Database and others) had remained higher over the previous three years than the volume of listings for MongoDB, Couchbase and Cassandra. Demand for NoSQL skills is growing, but it is still a small fraction of the market for SQL.

Deploying NoSQL databases in enterprises and organizations

With these advantages, NoSQL is well suited to the challenges of modern data storage. Savings in cost and time make NoSQL stand out further against relational database solutions.

Typically, organizations start with small-scale trials of a NoSQL database. Most of these databases are open source, meaning they can be downloaded, deployed and scaled at low cost.

Many organizations have begun to see significant advantages from using NoSQL databases in their projects. Because development cycles are shorter, organizations can innovate faster and provide a better customer experience at lower cost. For these reasons, NoSQL is widely used in big data projects, real-time projects and projects that handle large volumes of data.
