
NoSQL Data Stores

Relational databases store the vast majority of web application persistent data. However, there are several alternative classifications of storage representations.

  1. Key-value pair
  2. Document-oriented
  3. Column-family table
  4. Graph

These persistent data storage representations are commonly used to augment, rather than completely replace, relational databases. The underlying persistence type used by a NoSQL database often gives it different performance characteristics than a relational database, with better results on some types of reads and writes and worse performance on others.

Key-value Pair

Key-value pair data stores are based on hash map data structures.
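
As a rough illustration of the hash map idea, the sketch below wraps a Python dict in a tiny key-value store. The class name TinyKeyValueStore and its methods are hypothetical and exist only for this example; real stores add persistence, expiry and network access on top of the same basic lookup.

```python
class TinyKeyValueStore:
    """Toy in-memory key-value store backed by a Python dict (a hash map)."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        # Overwrites any existing value stored under the same key.
        self._data[key] = value

    def get(self, key, default=None):
        # Average-case constant time lookup by key.
        return self._data.get(key, default)

    def delete(self, key):
        # Returns True if the key existed, False otherwise.
        existed = key in self._data
        self._data.pop(key, None)
        return existed


store = TinyKeyValueStore()
store.set("session:42", {"user_id": 7})
print(store.get("session:42"))
print(store.get("missing", "not found"))
```

The only way to reach a value is through its key, which is why these stores are fast for direct lookups but a poor fit for ad hoc queries across many records.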

Key-value pair data stores

  • Redis is an open source in-memory key-value pair data store. Redis is often called "the Swiss Army Knife of web application development." It can be used for caching, queuing, and storing session data for faster access than a traditional relational database, among many other use cases. Redis-py is a solid Python client to use with Redis.
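
The caching use case can be sketched roughly as follows. This is a minimal cache-aside example, not a recommended production setup. FakeRedis, slow_lookup and get_profile are hypothetical names invented for this sketch. FakeRedis is an in-memory stand-in that mimics only the get and setex methods of a redis-py client so the code runs without a Redis server.

```python
import time

lookup_calls = []  # records each trip to the slow data source


class FakeRedis:
    """Hypothetical stand-in that mimics redis-py's get() and setex() only."""

    def __init__(self):
        self._data = {}

    def get(self, name):
        value, expires_at = self._data.get(name, (None, 0.0))
        if value is None or time.monotonic() >= expires_at:
            self._data.pop(name, None)  # drop expired or missing entries
            return None
        return value

    def setex(self, name, time_seconds, value):
        self._data[name] = (value, time.monotonic() + time_seconds)
        return True


def slow_lookup(user_id):
    """Hypothetical stand-in for an expensive relational database query."""
    lookup_calls.append(user_id)
    return f"profile-for-{user_id}"


def get_profile(client, user_id, ttl_seconds=60):
    """Cache-aside read: try the cache first, fall back to the slow source."""
    key = f"profile:{user_id}"
    cached = client.get(key)
    if cached is not None:
        return cached
    value = slow_lookup(user_id)
    client.setex(key, ttl_seconds, value)  # store with an expiry time
    return value


client = FakeRedis()
first = get_profile(client, 7)   # cache miss, runs slow_lookup
second = get_profile(client, 7)  # cache hit, slow_lookup is skipped
print(first, second, len(lookup_calls))
```

With a real server, a redis-py client could be passed in place of FakeRedis. Note that redis-py returns bytes by default unless the client is created with decode_responses=True.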

Key-value pair resources

Document-oriented

A document-oriented database provides a semi-structured representation for nested data.
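
To make "semi-structured" and "nested" concrete, here is a small sketch that models documents as plain Python dicts. The blog post fields and the find_by_tag helper are assumptions made up for this example and do not correspond to any particular database's API.

```python
import json

# A hypothetical blog post stored as a single nested document.
post = {
    "_id": "post-1",
    "title": "Intro to NoSQL",
    "author": {"name": "Ada", "email": "ada@example.com"},
    "tags": ["nosql", "databases"],
    "comments": [
        {"user": "bob", "text": "Nice overview."},
        {"user": "eve", "text": "What about graphs?"},
    ],
}

# Documents in the same collection do not need to share the same fields.
draft = {"_id": "post-2", "title": "Untitled draft"}

collection = [post, draft]


def find_by_tag(documents, tag):
    """Return documents whose optional 'tags' list contains the tag."""
    return [doc for doc in documents if tag in doc.get("tags", [])]


matches = find_by_tag(collection, "nosql")
print(json.dumps(matches[0]["author"], indent=2))
```

In a relational schema the author, tags and comments would typically live in separate tables joined by foreign keys. Here they travel together inside one document.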

Document-oriented data stores

  • MongoDB is an open source document-oriented data store with a Binary JSON (BSON) storage format that is JSON-style and familiar to web developers. PyMongo is the most commonly used client to interface with MongoDB through Python code.

  • Riak is an open source distributed data store focused on availability, fault tolerance and large scale deployments.

  • Apache CouchDB is also an open source project where the focus is on embracing RESTful-style HTTP access for working with stored JSON data.

Document-oriented data store resources

  • MongoDB for startups is a guide about using non-relational databases in green field environments.

  • The creator and maintainers of PyMongo review four decisions they regret from building the widely used Python MongoDB driver.

    1. start_request
    2. use_greenlets
    3. copy_database
    4. MongoReplicaSetClient (to be covered in the final post)

Column-family table

The column-family table class of NoSQL data stores builds on the key-value pair type. Each key-value pair is considered a row in the store, while the column family is similar to a table in the relational database model.
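
One way to picture this layout is as nested dictionaries: a column family maps row keys to a set of columns. The sketch below is a simplified model under that assumption. The put and get_row helpers and the "users" family are hypothetical names for this example, and real column-family stores add features such as on-disk storage and distribution across nodes.

```python
from collections import defaultdict

# column family name -> row key -> {column name: value}
column_families = defaultdict(lambda: defaultdict(dict))


def put(family, row_key, column, value):
    """Write one column value into a row of the given column family."""
    column_families[family][row_key][column] = value


def get_row(family, row_key):
    """Return a copy of the row's columns, or an empty dict if absent."""
    return dict(column_families.get(family, {}).get(row_key, {}))


put("users", "user:1", "name", "Ada")
put("users", "user:1", "email", "ada@example.com")
put("users", "user:2", "name", "Bob")  # rows may hold different columns

print(get_row("users", "user:1"))
print(get_row("users", "user:2"))
```

Unlike a relational table, each row here carries only the columns that were actually written to it.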

Column-family table data stores

Graph

A graph database represents and stores data using three elements: nodes, edges, and properties.

A node is an entity, such as a person or business.

An edge is the relationship between two entities. For example, an edge could represent that a node for a person entity is an employee of a business entity.

A property represents information about a node. For example, a node representing a person could have a gender property with a value of "female" or "male".
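
The three elements can be modeled in a few lines of Python. This is a toy sketch, not how any particular graph database is implemented. Node, Edge, TinyGraph and the EMPLOYEE_OF relationship label are hypothetical names chosen to mirror the person and business example above.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    source: Node
    relationship: str
    target: Node


class TinyGraph:
    """Toy graph that keeps nodes and edges in plain lists."""

    def __init__(self):
        self.nodes = []
        self.edges = []

    def add_node(self, name, **properties):
        node = Node(name, properties)
        self.nodes.append(node)
        return node

    def add_edge(self, source, relationship, target):
        self.edges.append(Edge(source, relationship, target))

    def related(self, source, relationship):
        # Follow outgoing edges of one type from the given node.
        return [
            edge.target
            for edge in self.edges
            if edge.source is source and edge.relationship == relationship
        ]


graph = TinyGraph()
alice = graph.add_node("Alice", kind="person")
acme = graph.add_node("Acme", kind="business")
graph.add_edge(alice, "EMPLOYEE_OF", acme)

employers = graph.related(alice, "EMPLOYEE_OF")
print([node.name for node in employers])
```

Queries in a graph store are mostly traversals like related(), following edges from node to node rather than joining tables.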

Graph data stores

  • Neo4j is one of the most widely used graph databases and runs on the Java Virtual Machine stack.

  • Cayley is an open source graph data store from Google, written primarily in Go.

  • Titan is a distributed graph database built for multi-node clusters.

Graph data store resources

NoSQL third-party services

  • MongoHQ provides MongoDB as a service. It's easy to set up with either a standard LAMP stack or on Heroku.

NoSQL data store resources

  • NoSQL databases: an overview explains what NoSQL means, how data is stored differently than in relational systems and what the Consistency, Availability and Partition-Tolerance (CAP) Theorem means.

  • CAP Theorem overview presents the basic constraints all databases must trade off in operation.

  • This post on What is a NoSQL database? Learn By Writing One in Python is a detailed article that breaks the mystique behind what some forms of NoSQL databases are doing under the covers.

  • NoSQL Weekly is a free curated email newsletter that aggregates articles, tutorials, and videos about non-relational data stores.

  • NoSQL comparison is a large list of popular, BigTable-based, special purpose, and other datastores with attributes and the best use cases for each one.

NoSQL data stores learning checklist

Understand why NoSQL data stores are better for some use cases than relational databases. In general these benefits are only seen at large scale so they may not be applicable to your web application.

Integrate Redis into your project for a speed boost over slower persistent storage. Storing session data in memory is generally much faster than saving that data in a traditional relational database that uses persistent storage. Note that when memory is flushed the data goes away, so anything that must persist still needs to be backed up to disk on a regular basis.

Evaluate other use cases such as storing transient logs in document-oriented data stores such as MongoDB.

What's next?

Tell me more about standard relational databases.

My app is running but looks awful. How do I style the interface?

How do I create a better browser experience with JavaScript?

