# NoSQL

NoSQL is a collection of data items represented in a **key-value store**, **document store**, **wide column store**, or a **graph database**. Data is denormalized, and joins are generally done in the application code. Most NoSQL stores lack true ACID transactions and favor eventual consistency.

## 1. BASE

**BASE** is often used to describe the properties of NoSQL databases. In comparison with the CAP Theorem, BASE chooses **availability over consistency**.

* **Basically available** - the system guarantees availability.
* **Soft state** - the state of the system may change over time, even without input.
* **Eventual consistency** - the system will become consistent over a period of time, given that the system doesn't receive input during that period.

In addition to choosing between SQL or NoSQL, it is helpful to understand which type of NoSQL database best fits your use case(s). We'll review key-value stores, document stores, wide column stores, and graph databases in the next section.

## 2. Key-value store

> Abstraction: hash table

A key-value store generally allows for **O(1)** reads and writes and is often backed by memory or SSD. Data stores can maintain keys in **lexicographic order**, allowing efficient retrieval of key ranges. Key-value stores can allow for storing of metadata with a value.

Key-value stores provide high performance and are often used for simple data models or for rapidly-changing data, such as an **in-memory cache layer**. Since they offer only a limited set of operations, complexity is shifted to the application layer if additional operations are needed.

A key-value store is the basis for more complex systems such as a document store, and in some cases, a graph database.

### More read

* [Key-value database](https://en.wikipedia.org/wiki/Key-value_database)
* [Disadvantages of key-value stores](https://stackoverflow.com/questions/4056093/what-are-the-disadvantages-of-using-a-key-value-table-over-nullable-columns-or)
* [Redis architecture](http://qnimate.com/overview-of-redis-architecture/)
* [Memcached architecture](https://www.adayinthelifeof.nl/2011/02/06/memcache-internals/)

## 3. Document store

> Abstraction: key-value store with documents stored as values

A document store is centered around **documents** (XML, JSON, binary, etc), where a document stores all information for a given object. Document stores provide APIs or a query language to query based on the internal structure of the document itself. Note, many key-value stores include features for working with a value's metadata, blurring the lines between these two storage types.

Based on the underlying implementation, documents are organized by **collections**, **tags**, **metadata**, or **directories**. Although documents can be organized or grouped together, documents may have fields that are completely different from each other.

Some document stores like [MongoDB](https://www.mongodb.com/mongodb-architecture) and [CouchDB](https://blog.couchdb.org/2016/08/01/couchdb-2-0-architecture/) also provide a SQL-like language to perform complex queries. DynamoDB supports both key-values and documents.

Document stores provide high flexibility and are often used for working with occasionally changing data.

### More read

* [Document-oriented database](https://en.wikipedia.org/wiki/Document-oriented_database)
* [MongoDB architecture](https://www.mongodb.com/mongodb-architecture)
* [CouchDB architecture](https://blog.couchdb.org/2016/08/01/couchdb-2-0-architecture/)
* [Elasticsearch architecture](https://www.elastic.co/blog/found-elasticsearch-from-the-bottom-up)

## 4. Wide column store

> Abstraction: nested map `ColumnFamily<RowKey, Columns<ColKey, Value, Timestamp>>`

![](https://2407442552-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-Lpv9LvHzpublmUWisvz%2Fsync%2F6c7bda1394cfee30860466268dfd3b05c1106442.png?generation=1591078390521164\&alt=media)

A wide column store's basic unit of data is a **column** (name/value pair). A column can be grouped in column families (analogous to a SQL table). Super column families further group column families. You can access each column independently with a row key, and columns with the same row key form a row. Each value contains a **timestamp for versioning** and for conflict resolution.

Google introduced Bigtable as the first wide column store, which influenced the open-source [HBase](https://mapr.com/blog/in-depth-look-hbase-architecture/) often-used in the Hadoop ecosystem, and [Cassandra](https://docs.datastax.com/en/archived/cassandra/3.0/cassandra/architecture/archIntro.html) from Facebook. Stores such as BigTable, HBase, and Cassandra maintain keys in **lexicographic order**, allowing efficient retrieval of selective key ranges.

Wide column stores offer **high availability** and **high scalability**. They are often used for very **large data sets**.

### More read

* [SQL & NoSQL, a brief history](https://blog.grio.com/2015/11/sql-nosql-a-brief-history.html)
* [Bigtable architecture](http://www.read.seas.harvard.edu/~kohler/class/cs239-w08/chang06bigtable.pdf)
* [HBase architecture](https://mapr.com/blog/in-depth-look-hbase-architecture/)
* [Cassandra architecture](https://docs.datastax.com/en/archived/cassandra/3.0/cassandra/architecture/archIntro.html)

## 5. Graph database

> Abstraction: graph

![](https://2407442552-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-Lpv9LvHzpublmUWisvz%2Fsync%2F2b43f9aa6f25bad12e8f2291378b668de9680648.png?generation=1591078391622185\&alt=media)

In a graph database, each **node** is a **record** and each **arc** is a **relationship between two nodes**. Graph databases are optimized to represent complex relationships with many foreign keys or many-to-many relationships.

Graphs databases offer **high performance** for data models with complex relationships, such as a social network. They are relatively new and are not yet widely-used; it might be more difficult to find development tools and resources. Many graphs can only be accessed with REST APIs.

### More read

* [Graph database](https://en.wikipedia.org/wiki/Graph_database)
* [Neo4j](https://neo4j.com/)
* [FlockDB](https://blog.twitter.com/engineering/en_us/a/2010/introducing-flockdb.html)

## 6. More read about NoSQL

* [NoSQL databases a survey and decision guidance](https://medium.baqend.com/nosql-databases-a-survey-and-decision-guidance-ea7823a822d#.wskogqenq)
* [Scalability](https://www.lecloud.net/post/7994751381/scalability-for-dummies-part-2-database)
* [Introduction to NoSQL](https://www.youtube.com/watch?v=qI_g07C_Q5I)
* [NoSQL patterns](http://horicky.blogspot.com/2009/11/nosql-patterns.html)


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://liuyang89116.gitbook.io/system-design/chapter/chapter-1/nosql.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
