Summary of Sql VS NoSql
This is a summary of a comparison between Sql/relational and NoSql/document based database, from Designing Data Intensive Applications. Please read Chapter 2, 5, 7 of the book if you wish to know more details.
Comparisons in terms of data models
Comparison 1: To joins or not to joins
Relation database is better suited for inter-connected data with MANY-TO-ONE and MANY-TO-MANY, relationships because the relational model provides better support for database-level joins using SQL. In document-based database, you’d need to do the joins in application code which complicates the code and it’s often slower. If your application doesn’t make heavy use of joins, then it’s probably fine
Comparison 2: Convert document-like structure to tables
Document databased is better suited if data in your application has a document-like structure(a tree of one-to-many relationship): the process of splitting a document-like structure into many tables is cumbersome.
Comparison 3: Schema flexibility
In relational database, a schema needs to be predefined and every insertion/update to a table needs to conform to the schema. In document database, the concept of schema is implicit, we only assume a schema when we read the document.
We’d want schema flexibility when the objects we want to stored have too many types so that it’s unreasonable to create a table for each of them. Or if the objects we want to store come from external APIs so we don’t really know the schema. In document database, when we’d like to change the schema, we wouldn’t need to perform expensive migrations as in relation database, which is slow, as we’d need to change all rows of the database.
On the other hand, if the object structure is well defined and don’t change very often then it’d be nice to have the database enforce the structure with explicit schema.
Comparison 4: Data locality
In document database, if your ‘document’ is large and your application often needs to access the entire document(e.g display a user profile), then you get better read performance because of data locality. In the case of relational database, a user profile is split across many tables(e.g user, job, country, etc), it may take a couple of joins to piece together the entire user profile: not only does it not take advantage of data locality, it actually takes more disk reads to fetch the data.
Comparisons in terms of fault-tolerant properties and handlings of concurrency
Comparison1: scalability
NoSql database comes with replication and partitioning for free, for Mysql you have to build these yourself.
Comparison2: transaction support
NoSql databases abandoned transaction or only support a weaker set of guarantees. i.e maybe it only supports Read Committed for Isolation level