Awesome Bigdata Overview




Awesome Big Data


A curated list of awesome big data frameworks, resources and other awesomeness. Inspired by awesome-php (⭐29k), awesome-python (⭐169k), awesome-ruby (⭐1.2k), hadoopecosystemtable & big-data.

Your contributions are always welcome!

RDBMS

Frameworks

Distributed Programming

Distributed Filesystem

Distributed Index

Document Data Model

Key Map Data Model

Note: There is some term confusion in the industry, and two different things are called "Columnar Databases". Some, listed here, are distributed, persistent databases built around the "key-map" data model: all data has a (possibly composite) key, with which a map of key-value pairs is associated. In some systems, multiple such value maps can be associated with a key, and these maps are referred to as "column families" (with value map keys being referred to as "columns").
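As a rough illustration of the key-map data model described above, here is a minimal in-memory sketch in Python. The class name `KeyMapStore`, its methods, and the sample keys and values are hypothetical and chosen only for this example; real systems in this category add persistence, distribution, and versioning that are not modeled here.

```python
from collections import defaultdict


class KeyMapStore:
    """Toy key-map store: row key -> column family -> {column: value}.

    Hypothetical illustration only, not the API of any real database.
    """

    def __init__(self):
        # Each row key maps to several named value maps ("column families").
        self._rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row_key, family, column, value):
        # The row key may be composite, e.g. a tuple such as ("user", 42).
        self._rows[row_key][family][column] = value

    def get_family(self, row_key, family):
        # Use .get() so that reads never create empty rows as a side effect,
        # and return a copy so callers cannot mutate internal state.
        return dict(self._rows.get(row_key, {}).get(family, {}))


store = KeyMapStore()
user = ("user", 42)  # composite key
store.put(user, "profile", "name", "Ada")
store.put(user, "profile", "city", "London")
store.put(user, "metrics", "logins", 7)

profile = store.get_family(user, "profile")
metrics = store.get_family(user, "metrics")
missing = store.get_family(("user", 99), "profile")
```

Here `"profile"` and `"metrics"` play the role of column families, and the keys inside each map (`"name"`, `"city"`, `"logins"`) play the role of columns.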

Another group of technologies that can also be called "columnar databases" is distinguished by how it stores data, on disk or in memory. Rather than storing data the traditional way, "row by row", where all column values for a given key are stored next to each other, these systems store all values of a given column next to each other. As a result, more work is needed to get all columns for a given key, but less work is needed to get all values for a given column.

The former group is referred to as "key map data model" here. The line between these and the Key-value Data Model stores is fairly blurry.

The latter, being more about the storage format than about the data model, is listed under Columnar Databases.

You can read more about this distinction on Prof. Daniel Abadi's blog: Distinguishing two major types of Column Stores.
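The storage-layout distinction in the note above can be sketched with plain Python lists. This is a simplified model under stated assumptions: the sample records, the `COLUMNS` tuple, and all function names are invented for the example, and real column stores add compression, encoding, and on-disk formats that are not shown.

```python
# Hypothetical sample data: three records with the same three columns.
records = [
    {"key": "k1", "name": "Ada", "age": 36},
    {"key": "k2", "name": "Grace", "age": 45},
    {"key": "k3", "name": "Alan", "age": 41},
]
COLUMNS = ("key", "name", "age")


def to_row_layout(recs):
    # Row-oriented: all column values of one record sit next to each other.
    flat = []
    for rec in recs:
        flat.extend(rec[col] for col in COLUMNS)
    return flat


def to_column_layout(recs):
    # Column-oriented: all values of one column sit next to each other.
    return {col: [rec[col] for rec in recs] for col in COLUMNS}


def read_record(columns, index):
    # Rebuilding one record must touch every column list (more work).
    return {col: columns[col][index] for col in COLUMNS}


def column_sum(columns, name):
    # Aggregating one column scans a single contiguous list (less work).
    return sum(columns[name])


row_layout = to_row_layout(records)
column_layout = to_column_layout(records)
second_record = read_record(column_layout, 1)
total_age = column_sum(column_layout, "age")
```

In the row layout, reading one record is a single contiguous slice, while summing `age` has to skip over the other columns. In the column layout the trade-off is reversed, which is the point made in the note.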

Key-value Data Model

Graph Data Model

Columnar Databases

Note: Please read the note in the Key Map Data Model section.

NewSQL Databases

Time-Series Databases

SQL-like processing

Data Ingestion

Service Programming

Scheduling

Machine Learning

Benchmarking

Security

System Deployment

Applications

Search engine and framework

MySQL forks and evolutions

PostgreSQL forks and evolutions

Memcached forks and evolutions

Embedded Databases

Business Intelligence

Data Visualization

Internet of things and sensor data

Interesting Readings

Interesting Papers

2015 - 2016

2013 - 2014

2011 - 2012

2001 - 2010

Videos

Books

Streaming

Distributed systems

Graph Based approach

Data Visualization

Other Awesome Lists