Track Awesome Cassandra Updates Weekly
A curated list of the best resources in the Cassandra community.
😺 Anant/awesome-cassandra · ⭐ 201 · 🏷️ Databases
Nov 29 - Dec 05, 2021
Cassandra from Relational / Cassandra as a Service / Managed Cassandra Based on Proprietary Technology
- Real-Time Replication from MySQL to Cassandra - Demonstration of migrating data from MySQL to Cassandra.
- Cassandra Tutorial for Beginners - Lesson plan for users just starting out with Cassandra.
Integrating with Cassandra / .NET and Cassandra
- Cassandra API with .NET - Quickstart guide on how to use .NET and the Azure Cosmos DB Cassandra API to build a profile app.
- DataStax C# Driver Documentation - Documentation on the C# Driver for Cassandra from DataStax.
- CQL data types to C# types - Documentation on CQL data types to C# types.
- Connect to Cassandra with C# - Instaclustr article on how to connect to Cassandra with C#.
- Access Amazon Keyspaces with a Cassandra .NET Core Driver - Article showing how to connect to Amazon Keyspaces by using a .NET Core client driver.
- Cassandra ADO.NET Driver - Cassandra ADO.NET Data Provider enables users to easily connect to Cassandra data from .NET applications.
- Cassandra Pagination with ASP.NET Core C# - Article covering how to create infinite scroll pagination with Cassandra and ASP.NET Core C#.
Miscellaneous / Custom Time Series
- Cassandra vs MongoDB - Article comparing the two popular NoSQL databases.
Communities / Custom Time Series
- Stack Overflow: Astra DataStax - ASP.NET Core - Answered question regarding connecting DataStax Astra and an ASP.NET Core API published to Microsoft Azure.
Videos / Custom Time Series
- Working with .NET and Cassandra/DataStax Enterprise - Getting started with a C# .NET Core application that works against a Cassandra or DSE database.
Nov 15 - Nov 21, 2021
Cassandra Deployment / Cassandra as a Service / Managed Cassandra Based on Proprietary Technology
- Benchmarking Cassandra with Local Storage on Azure - Comparison of Cassandra on Azure VMs with local vs. remote storage.
Nov 08 - Nov 14, 2021
Cassandra History
- IDG: 10 Years of Apache Cassandra - Retrospective discussing the first 10 years of Cassandra's history.
Cassandra Distributions / Cassandra as a Service / Managed Cassandra Based on Open Source Cassandra
- IBM Cloud Databases for DataStax - IBM Cloud Managed Service for DataStax Enterprise.
Aug 23 - Aug 29, 2021
Cassandra Distributions / Cassandra Compliant Databases on JVM
- DataStax Enterprise - Most widely used commercial distribution of Cassandra, integrated with Apache Spark (for SparkSQL, analytics), Apache Solr (for secondary index), Apache TinkerPop based Graph stored in Cassandra, and OpsCenter.
Cassandra Distributions / Cassandra as a Service / Managed Cassandra Based on Open Source Cassandra
- DataStax Astra - DataStax Astra is Cassandra as a Service, running on the Kubernetes operator for Cassandra and available on AWS and GCP.
Aug 02 - Aug 08, 2021
Books / Custom Time Series
Apr 26 - May 02, 2021
Cassandra History
- ZDNet: Cassandra Turns 10 - Highlights of the growth of Cassandra over its first 10 years.
Cassandra Distributions / Cassandra Compliant Databases on JVM
- DDAC/Luna - DataStax Distribution of Cassandra, a production-ready distribution with a bulk loader supported by DataStax. DDAC is now deprecated, but DataStax still supports Cassandra through its new Luna service.
Cassandra Distributions / Cassandra Compliant Databases on C++
- ScyllaDB (⭐8.5k) - NoSQL data store using the Seastar framework, compatible with Cassandra.
Cassandra Distributions / Cassandra as a Service / Managed Cassandra Based on Open Source Cassandra
- Instaclustr Managed Cassandra as a Service - Instaclustr provides a fully managed, SOC 2 certified hosted service for Cassandra® on AWS, Azure, GCP and IBM Cloud.
- Aiven for Cassandra - Aiven for Cassandra is a managed and hosted distributed NoSQL database providing scalability, high availability, and excellent fault tolerance. Cassandra as a Service is available on Google Cloud Platform, Amazon Web Services, Microsoft Azure, DigitalOcean, and UpCloud.
- Microsoft Azure Managed Instance for Cassandra - Azure Managed Instance for Cassandra provides automated deployment and scaling operations for managed open-source Cassandra datacenters. It accelerates hybrid scenarios and reduces ongoing maintenance.
Cassandra Distributions / Cassandra as a Service / Managed Cassandra Based on Proprietary Technology
- Microsoft Azure Cosmos DB: Cassandra API - Azure Cosmos DB provides the Cassandra API (preview) for applications that are written for Cassandra that need premium capabilities.
- Amazon Keyspaces for Cassandra - Amazon Web Services (AWS) Amazon Keyspaces for Cassandra provides CQL-compliant access to a "serverless" auto-scaling datastore.
Using Cassandra / Cassandra as a Service / Managed Cassandra Based on Proprietary Technology
- The LIMIT Clause in Cassandra might not work as you think - Blog post on efficiency considerations for the LIMIT clause.
- Top 5 reasons to use the Cassandra Database - A few good reasons why you'd want to consider Cassandra.
- Cassandra Use Cases: When to use and when not to use Cassandra - Practical guide for when to use and when not to use Cassandra.
- Cassandra Database (Guide) - Great guide to learn about Cassandra, from Instaclustr.
Cassandra Data Modeling / Cassandra as a Service / Managed Cassandra Based on Proprietary Technology
- CQL: This is not the SQL you are Looking For - Presentation that explores and explains the differences between the CQL and SQL languages.
- Spring Data Cassandra Examples (⭐4) - Maven project that contains examples showcasing the features and functionality of the Spring Data Cassandra project.
Cassandra Architecture / Cassandra as a Service / Managed Cassandra Based on Proprietary Technology
- The Gossip Protocol - Inside Cassandra. - Good visual explanation of how Cassandra stays consistent.
- Introduction To The Cassandra 3.x Storage Engine - The 3.x storage engine makes it easier for Cassandra to get bytes off disk.
- Dropping columns in Cassandra 3.0 - Blog post describing the steps Cassandra takes when a column is dropped.
- About Deletes and Tombstones in Cassandra - Deleting distributed and replicated data from a system such as Cassandra is far trickier than in a relational database.
- Undetectable tombstones in Cassandra - In-depth analysis of cell and range tombstones.
- Understanding the Nuance of Compaction in Cassandra - Overview of how Cassandra manages data on disk.
- Guide to Cassandra Thread Pools - Guide that provides a description of the different thread pools and how to monitor them. Includes what to alert on, common issues and solutions. Old but very useful reference.
- Improving Cassandra's Front Door and Backpressure - Explore how an incoming request was processed by Cassandra before, see what they changed, and look at new relevant configuration knobs available.
- Cassandra Architecture - High level overview of Cassandra from Instaclustr.
- The 10 Things I hate about Cassandra - Do you really want to use Cassandra? Learn why not to use it.
Cassandra Monitoring / Cassandra as a Service / Managed Cassandra Based on Proprietary Technology
- Resources for Monitoring Datastax, Cassandra, Spark, & Solr Performance - Blog post detailing different types of monitoring tools and their purpose.
- Monitoring Cassandra With Grafana And Influx DB - Blog post explaining how to set up Cassandra monitoring with InfluxDB and Grafana.
- Cassandra Monitoring - Introduction (1/2) - Blog post detailing how Cassandra metrics can be gathered.
- Monitoring Cassandra using Intel Snap and Grafana - Blog post describing how to monitor Cassandra using the Intel Snap open source telemetry framework.
- Cassandra Monitoring Best Practice Guide - Blog post that aims to touch all the important aspects of Cassandra monitoring.
Cassandra Maintenance / Cassandra as a Service / Managed Cassandra Based on Proprietary Technology
- Cassy (⭐39) - Simple and integrated backup tool for Cassandra.
- Medusa (⭐202) - Cassandra backup system.
Cassandra Performance Tuning / Cassandra as a Service / Managed Cassandra Based on Proprietary Technology
- Cassandra Node Diagnostics Tools (⭐51) - Monitoring and audit power kit for Cassandra.
- Performing User Defined Compactions in Cassandra - Documenting a process by which we tell Cassandra to create a compaction task for one or more tables explicitly.
- Modeling real life workloads with cassandra-stress is hard - Blog post detailing caveats with cassandra-stress when modeling real workloads.
Cassandra Security / Cassandra as a Service / Managed Cassandra Based on Proprietary Technology
- Securing Cassandra with Application Level Encryption - Discusses how to do application level data encryption to properly manage secure information in Cassandra.
- LDAP Authenticator for Cassandra (⭐20) - Pluggable authentication implementation for Cassandra, providing a way to authenticate and create users based on a configured LDAP server.
Cassandra Deployment / Cassandra as a Service / Managed Cassandra Based on Proprietary Technology
- tlp-cluster, a tool for launching Cassandra clusters in AWS (⭐17) - Provisioning tool for Cassandra designed for developers looking to both benchmark and test the correctness of Cassandra. It assists with builds and starting instances on AWS.
- Setting Up Cassandra Cluster Through Ansible - Guide detailing how to set up a Cassandra cluster with automation using Ansible.
- Running Cassandra on DC/OS (Mesos) - Blog post that shows how to set up DC/OS in the Amazon cloud, how to install Cassandra on a DC/OS cluster, and finally new ways to interact with Cassandra after it is installed.
Cassandra Deployment / Cassandra Deployment on Docker / Containerized Cassandra
- Docker-Cassandra (⭐214) - Set of scripts and config files to run a Cassandra cluster from Docker.
- Cassandra Docker (⭐8) - Instaclustr public Docker image for Cassandra. It contains Docker images for Cassandra 3.0 and 3.11.1.
- Cassandra / Elassandra Docker (⭐0) - Cassandra and Elassandra Docker images. Cass Operator is maintained by a team at DataStax and it is part of what powers DataStax Astra.
Cassandra Deployment / Cassandra Deployment on Kubernetes / Kubernetized Cassandra
- K8ssandra.io - Kubernetes + Cassandra - K8ssandra provides a production-ready platform for running Cassandra on Kubernetes. This includes automation for operational tasks such as repairs, backups, and monitoring.
- Datastax - Cassandra Kubernetes Operator (⭐245) - Datastax's Cassandra Kubernetes Operator which supports Datastax as well as open source Cassandra containers on Kubernetes.
- CassKop - Cassandra operator for Kubernetes (⭐186) - Kubernetes operator that automates Cassandra operations such as deploying a new rack-aware cluster, adding/removing nodes, configuring the Cassandra and JVM parameters, and upgrading JVM and Cassandra versions. Written in Go.
- Rook.io - Cassandra on Kubernetes - Rook is an open source cloud-native storage orchestrator, providing the platform, framework, and support for a diverse set of storage solutions to natively integrate with cloud-native environments. They have a special operator for Cassandra amongst other providers.
- KUDO Cassandra Operator (⭐10) - The KUDO Cassandra Operator makes it easy to deploy and manage Cassandra on Kubernetes.
Integrating with Cassandra / Cassandra Deployment on Kubernetes / Kubernetized Cassandra
- Building a Streaming Data Hub with Elasticsearch, Kafka and Cassandra - Blog post detailing how a streaming analytics system on top of open source, big data components can be done.
Libraries / Custom Time Series
- DataStax C# Driver (⭐547) - Modern, feature-rich and highly tunable C# client library for Cassandra (1.2+) and DataStax Enterprise (3.1+) using exclusively Cassandra's binary protocol and Cassandra Query Language v3.
- DataStax Java Driver (⭐1.3k) - Java client driver for Cassandra.
- DataStax C++ Driver (⭐376) - Modern, feature-rich, and highly tunable C/C++ client library for Cassandra (1.2+) and DataStax Enterprise (3.1+) using exclusively Cassandra's native protocol and Cassandra Query Language v3.
- DataStax Python Driver (⭐1.3k) - Modern, feature-rich and highly-tunable Python client library for Cassandra (2.1+) using exclusively Cassandra's binary protocol and Cassandra Query Language v3.
- DataStax Ruby Driver (⭐227) - Ruby client driver for Cassandra. This driver works exclusively with the Cassandra Query Language version 3 (CQL3) and Cassandra's native protocol.
- DataStax Node.js Driver (⭐1.2k) - Modern, feature-rich and highly tunable Node.js client library for Cassandra (1.2+) and DataStax Enterprise (3.1+) using exclusively Cassandra's binary protocol and Cassandra Query Language v3.
- DataStax PHP Driver (⭐419) - DataStax PHP Driver for Cassandra.
- Achilles - Achilles is an open source persistence manager for Cassandra, with features such as advanced bean mapping (compound primary key, composite partition key, timeUUID, etc.), native collections and map support, and so on.
- phpcassa (⭐250) - PHP client library for Cassandra.
- Caffinitas - Caffinitas is an advanced object mapper for Cassandra which has been especially designed to work with Datastax Java Driver 2.1+ against Cassandra 2.1, 2.0 or 1.2.
- Spring Data for Cassandra - Spring Data for Cassandra offers a familiar interface to those who have used other Spring Data modules in the past.
Integrating with Cassandra / Spark
- DataStax Spark Cassandra Connector (⭐1.9k) - Library that lets you expose Cassandra tables as Spark RDDs, write Spark RDDs to Cassandra tables, and execute arbitrary CQL queries in your Spark applications.
Timeseries Databases / Custom Time Series
- Newts - Time-series data store based on Cassandra.
- Hawkular.org - Time series / distributed tracing database from Red Hat, powered by Cassandra.
Miscellaneous / Custom Time Series
- Apache/Usergrid (⭐998) - Open source Backend as a Service (BaaS) on Cassandra, Elasticsearch with client SDKs for iOS/Android/.NET/Java.
Tools / Custom Time Series
- Hackolade - Visual data modeling tool for NoSQL databases and structures like Cassandra, ElasticSearch, Graph DBs, JSON, APIs.
- Datastax - Management API for Cassandra (⭐55) - The Management API is a sidecar service layer that attempts to build a well supported set of operational actions on Cassandra® nodes that can be administered centrally.
- Ansible-Galaxy: Cassandra GitHub (⭐16) - Collection called cassandra that aims to provide all the Ansible modules needed to interact with Cassandra.
- RazorSQL - Multi DB Manager Tool - Multi-db tool for Linux, Mac, and Windows that works with Cassandra.
- Cassandra Reaper - Automated repairs for Cassandra. Supports all versions.
- cstar perf (⭐71) - Cassandra performance testing platform.
- Spark Cassandra Stress (⭐26) - Tool for testing the DataStax Spark Connector against Cassandra or DSE.
- Cassalog (⭐14) - Cassalog is a schema change management library and tool for Cassandra that can be used with applications running on the JVM.
- Cassandra-web - Web interface for Cassandra.
- tlp-cluster - Provisioning tool for Cassandra designed for developers looking to benchmark and test Cassandra. It assists with builds and starting instances on AWS.
- Helenos (⭐163) - Free web-based environment that simplifies data exploration & schema management with a Cassandra database.
- Cassandra-Migration (⭐52) - Cassandra / DataStax Enterprise database migration (schema evolution) library.
- Instaclustr Kerberos plugin (⭐5) - GSSAPI authentication provider for Cassandra.
Open Source Applications / Custom Time Series
- Cassandra Cluster Admin (⭐207) - Cassandra Cluster Admin is a GUI tool to help people administrate their Cassandra cluster.
- CCM: Cassandra Cluster Manager (⭐1.2k) - Script/library to create, launch and remove a Cassandra cluster on localhost.
- CStar (⭐250) - Cassandra cluster orchestration tool for the command line.
Logging /Metrics / Custom Time Series
- Metrics Collector for Cassandra (⭐92) - Metric collection and dashboards for Cassandra (2.2, 3.0, 3.11, 4.0) clusters. Comes with dashboards for Grafana.
- Cassandra Log Tools (⭐8) - Simple scripts for working with Cassandra logs.
- ctop (⭐2) - Very simple console tool for monitoring column family read/write activity on a remote Cassandra host.
Documentation / Custom Time Series
- Cassandra Documentation - Definitive documentation for all published versions.
Communities / Custom Time Series
Blogs / Custom Time Series
- Datastax - DataStax, Inc. is a data management company that provides commercial support, software, and cloud database-as-a-service based on Cassandra.
- Codecentric: Cassandra - Codecentric is an IT consulting company; these are their blog posts on Cassandra.
- Pythian: Cassandra - Pythian provides data and cloud-related services. The company provides services for Oracle, SQL Server, MySQL, Hadoop, Cassandra and other databases and their supporting infrastructure.
- Instaclustr - Managed and supported open source solutions for Cassandra, Kafka, Elasticsearch & Redis.
- OpenCredo:Cassandra - OpenCredo is a consulting company that helps clients make informed decisions around cloud native and open source technologies, as well as public cloud services.
- DOAN DuyHai's Blog: Cassandra - Duyhai Doan is a freelance big data and cloud architect who values sharing knowledge and contributing to the technology community.
- Amy Tobert - Amy Tobert is a full-stack engineer & leader with a passion for sustainable systems and people-centered leadership. Her blog details different Cassandra deployments among other topics.
- Christopher Batey: Cassandra - Christopher Batey is a software engineer of over 15 years and is a primary contributor to Akka and occasional contributor to Cassandra.
- Distributed Bytes: Cassandra - Tim Ojo is the creator of Distributed Bytes and a software engineer at Capital One. This is a collection of his posts on Cassandra.
- The Netflix Tech Blog - Learn about Netflix’s world class engineering efforts, company culture, product developments and more.
- Spotify R&D / Engineering Blog : Cassandra - Cassandra related posts on Spotify's official technology blog.
- Ryan Svihla - Ryan Svihla is a principal engineer at DataStax. His blog posts cover topics surrounding Cassandra and associated tools.
- Anant - Anant builds and manages business platforms that connect customer experiences and information systems with real-time data platforms.
Videos / Custom Time Series
- Monitoring Cassandra: Don't Miss a Thing (Alain Rodriguez, The Last Pickle) | C* Summit 2016 - Talk given by Alain Rodriguez, Consultant at The Last Pickle, discussing what to monitor in Cassandra, how, and why.
- Cassandra.Lunch (⭐6) - Collection of all past Cassandra.Lunch webinars, including videos, slides, and blog posts on all things Cassandra.
Slides / Custom Time Series
- HAPI Cassandra (⭐5) - Simple REST API with hapi Node.js framework on top of a Cassandra database.
Apr 19 - Apr 25, 2021
Using Cassandra / Cassandra as a Service / Managed Cassandra Based on Proprietary Technology
- How to install Cassandra 2 on CentOS 7 / RHEL 7 - Guide on how to install Cassandra on the popular Linux distributions Red Hat and CentOS.
Cassandra Architecture / Cassandra as a Service / Managed Cassandra Based on Proprietary Technology
- Common Problems with Cassandra Tombstones - A large number of tombstones causes latency and heap pressure.
Cassandra Performance Tuning / Cassandra as a Service / Managed Cassandra Based on Proprietary Technology
- Gatling DSE Stress (⭐5) - Tool for stress testing DSE.
- Gatling DSE Plugin for Gatling Load injector (⭐8) - Plugin for the Gatling load injector. It adds CQL support in Gatling for Datastax Enterprise. It allows for benchmarking Datastax Enterprise features, including DSE Graph Fluent API.
Integrating with Cassandra / Cassandra Deployment on Kubernetes / Kubernetized Cassandra
- Docker container for Kafka - Spark streaming - Cassandra (⭐93) - Dockerfile that sets up a complete streaming environment for experimenting with Kafka, Spark streaming (PySpark), and Cassandra.
Integrating with Cassandra / Search / Secondary Indexes
- Elassandra - Elassandra = Elasticsearch as a Cassandra secondary index.
Timeseries Databases / Monitoring / Metrics
- cortexproject/cortex (⭐4.9k) - Horizontally scalable, highly available, multi-tenant, long term Prometheus storage.
Tools / Custom Time Series
- CassandraCAS (⭐2) - Compare-and-swap tool for Cassandra created by Datomic.
- Peloton (⭐581) - Unified resource scheduler created by Uber. This tool can handle many nodes and clusters through resource management and scalability.
- Ansible-dse (⭐15) - Set of Ansible playbooks that will build a Datastax Enterprise cluster.
- DBeaver - Free Universal Database Tool - Third party tool for dealing with all sorts of databases including Cassandra.
- Web: Cassandra Calculator - Simple calculator to see how size / replication factor affect the system's consistency.
- Netflix: Staash (⭐203) - Language-agnostic as well as storage-agnostic web interface for storing data into persistent storage systems. The metadata layer abstracts a lot of storage details, and the pattern automation APIs take care of automating common data access patterns.
- SSTable Tools (⭐155) - Toolkit for parsing, creating and doing other fun stuff with Cassandra 3.x SSTables.
- CQL Data Modeler - Very useful tool to test out a CQL schema and visualize what the partition would look like in relation to the columns and rows.
- Cassandra Snapshot Backup (⭐6) - Quick and easy way to snapshot files in a Cassandra database and back them up using Ansible.
- Slothsandra (⭐0) - Integration for Cassandra with the Slack app, which stores old messages that Slack no longer does itself.
- sandraREST (⭐23) - Cassandra manager with a web UI for RESTful APIs.
- Cassandra Leadership (⭐7) - Library to help elect leaders using Cassandra. Uses Paxos to build a leadership election module.
- Terraform Cassandra (⭐6) - Terraform module that creates a Cassandra cluster.
- Datadog - Third party tool that allows monitoring and metrics for Cassandra nodes and clusters.
Open Source Applications / Custom Time Series
- Twissandra (⭐800) - Twissandra is an example project, created to learn and demonstrate how to use Cassandra. Running the project will present a website that has similar functionality to Twitter.
- ChronoServer (⭐2) - Test server for sampling how long it takes mobile & web clients to make various types of requests to a server doing common request patterns.
- CMB (⭐280) - Highly available, horizontally scalable queuing and notification service compatible with AWS SQS and SNS.
- CassieQ (⭐50) - Distributed queue built off of Cassandra.
- Scheduler (⭐211) - Scala library for scheduling arbitrary code to run at an arbitrary time.
Videos / Custom Time Series
- Best Practices for Running Cassandra on AWS - Joint webinar between Amazon Web Services (AWS) and Stackdriver, an AWS Technology partner, covering best practices for storing, analyzing and managing queries that amount to over 1 billion measurements a day.
Slides / Custom Time Series
- Cassandra DataTables Using Restful API - How to create a performant API using Python / Flask.
- GumGum: Multi-Region Cassandra in AWS - Presentation that details how GumGum scaled out from one local Cassandra datacenter to a multi-datacenter Cassandra cluster, and all the problems they encountered and choices they made while implementing it.
- Hardening Cassandra for Compliance or Paranoia - Includes details on configuring SSL, setting up a certificate authority and creating certificates and trust chains for the JVM.
- Securing Cassandra - Ben Bromhead, CTO of Instaclustr, explores the various ways in which you can set up and secure Cassandra appropriately for your threat environment.
Apr 12 - Apr 18, 2021
Cassandra Use Cases
- Datastax Academy: What is Cassandra? - Introduction to what Cassandra is, where it came from, and some of its benefits.
Using Cassandra / Cassandra as a Service / Managed Cassandra Based on Proprietary Technology
- Installing the Cassandra / Spark OSS Stack - Installation process and user guide for the Cassandra / Spark OSS Stack.
- The Cassandra Query Language - Documentation for CQL.
- Building a Performant API using Go and Cassandra - Tutorial documenting how to build a RESTful API using Go and Cassandra.
- Introduction to Spark & Cassandra - Blog post on setting up a really simple Spark job that does a data migration for Cassandra.
- From Cassandra to S3, with Spark - Blog post showing how to connect Spark to Cassandra, analyze event data from Cassandra, and store the results of the analysis into S3, making it available for reporting or further analysis.
Cassandra from Relational / Cassandra as a Service / Managed Cassandra Based on Proprietary Technology
- Cassandra Query Language: CQL vs SQL - Blog post documenting similarities and differences between CQL and SQL.
Cassandra Data Modeling / Cassandra as a Service / Managed Cassandra Based on Proprietary Technology
- A Deep Look at the CQL Where Clause - Blog post to describe what is supported by the CQL WHERE clause and the reasons why it differs from normal SQL.
- Cassandra Time Series Data Modeling for Massive Scale - Blog post discussing a common Cassandra data modeling technique called bucketing.
- Scalar DB (⭐336) - Transaction library for Cassandra that makes non-ACID distributed databases/storages ACID-compliant.
Cassandra Architecture / Cassandra as a Service / Managed Cassandra Based on Proprietary Technology
- Deletes and Tombstones - Explains how deletes create tombstones in Cassandra and what they are.
- Curious Case of Tombstones - How someone dealt with tombstone issues and reclaimed space in their cluster.
- Cassandra Architecture and Operations - High level overview in one page of how Cassandra works.
Cassandra Monitoring / Cassandra as a Service / Managed Cassandra Based on Proprietary Technology
- How to Monitor Cassandra - Guide to help you monitor Cassandra performance and work metrics regardless of which monitoring tool you choose to use.
- Cassandra metrics and their use in Grafana - Case study of using Cassandra metrics in Grafana.
- Monitoring Cassandra with Prometheus - Quick setup guide to using Cassandra with Prometheus.
- Cassandra Monitoring - Graphite/InfluxDB & Grafana on Docker (2/2) - Continuation of the previous entry exploring the topic of Cassandra metric reporters mentioned in Part I. The goal is to configure a reporter that sends metrics to an external time series database.
Cassandra Maintenance / Cassandra as a Service / Managed Cassandra Based on Proprietary Technology
- Intro to CStar - Tutorial on how to use CStar.
Cassandra Performance Tuning / Cassandra as a Service / Managed Cassandra Based on Proprietary Technology
- Ryan Svihla's Cassandra 2.0 checklist - Checklist for determining the efficiency of your Cassandra database.
- Amy's Cassandra 2.1 tuning guide - Guide to tracking down performance issues in production level Cassandra clusters.
- DSE 5.1: Tuning Java Resource - Documentation for tuning JVM.
Cassandra Deployment / Cassandra as a Service / Managed Cassandra Based on Proprietary Technology
- An Introduction to Cassandra Multi-Data Centers: Part 1 - First part of a series on how to plan and implement Cassandra multi-data center deployments.
- An Introduction to Cassandra Multi-Data Centers: Part 2 - Second part of a series on how to plan and implement Cassandra multi-data center deployments.
Timeseries Databases / Custom Time Series
- kairosdb/kairosdb (⭐1.7k) - Fast scalable time series database.
- Cassandra Schema — KairosDB 1.0.1 documentation - KairosDB documentation.
- OpenNMS/newts (⭐190) - New-fangled Timeseries Data Store that powers OpenNMS.
- Hawkular GitHub - Hawkular's GitHub resources.
- OpenTSDB/opentsdb (⭐4.7k) - GitHub resources for OpenTSDB. A Distributed, Scalable Monitoring System built on a Time Series Database.
Graph / Custom Time Series
- Thinkaurelius/Titan (⭐5.2k) - Distributed Graph Database, predecessor to DSE Graph, JanusGraph, and now HugeGraph.
- Introduction to TitanDB - Introductory slides about TitanDB.
- JanusGraph/janusgraph (⭐4.6k) - JanusGraph: an open-source, distributed graph database, successor to TitanDB.
- Large Scale Graph Analytics with JanusGraph - Slides detailing deployment options and technical aspects of JanusGraph.
- Hugegraph/Hugegraph (⭐2.1k) - HugeGraph Database core component, including graph engine, API, and built-in backends.
- Architecture Overview · GitBook - Documentation for HugeGraph.
Miscellaneous / Custom Time Series
- Stargate (⭐672) - Stargate is an open-source data gateway that provides REST, GraphQL and schemaless JSON interfaces to Cassandra.
- Meet Stargate, DataStax's GraphQL for databases. First stop - Cassandra - Introduction and high-level overview of Stargate.
- Building Your Own BaaS With Apache Usergrid & Docker: Lessons Learned At Scale - Introductory presentation on Apache Usergrid.
- Scalar-labs/Scalardl (⭐75) - Tamper-evident and scalable distributed ledger platform.
- Wikimedia/Restbase (⭐98) - Distributed storage with REST API & dispatcher for backend services.
- Wikimedia/restbase-mod-table-spec (⭐3) - Shared spec and tests for RESTBase table storage.
Tools / Custom Time Series
- JetBrains Datagrip DB IDE - The Cross-Platform IDE for Databases & SQL by JetBrains, with support for Cassandra.
- Ansible-Galaxy: Cassandra - Documentation for Ansible-Galaxy: Cassandra.
- DbSchema - Cassandra Designer - DbSchema: diagram designer & GUI admin tool that supports Cassandra among other databases.
- Cassandra-Exporter (⭐41) - Simple Tool to Export / Import Cassandra Tables into JSON.
- Cassandra SStable Tools (⭐87) - Multiple different tools combined into one that helps admins get summaries, metadata, partition info, cell info.
- Cassandra-Client (⭐50) - Simple GUI tool for browsing tables and data in Cassandra.
- Zipkin (⭐16k) - Distributed tracing system.
- Instaclustr Java Driver for Kerberos (⭐4) - GSSAPI authentication provider for the Cassandra Java driver.
- Instaclustr TTL Remover (⭐19) - Command line tool for rewriting SSTables to remove TTLs.
- Instaclustr SSTable Generator (⭐5) - CLI tool for programmatic generation of Cassandra SSTables.
- Instaclustr Exporter (⭐54) - Java agent that exports Cassandra metrics to Prometheus.
- Instaclustr Go Client for Instaclustr Icarus (⭐4) - Go client for Instaclustr Icarus sidecar.
Feb 08 - Feb 14, 2021
Cassandra Data Modeling / Cassandra as a Service / Managed Cassandra Based on Proprietary Technology
- Cassandra Data Modeling Notes - Simple notes on how to estimate the size of your cluster.
- Cassandra Data Modeling Best Practices Guide - Explains five Cassandra data modeling best practices.
Cassandra Maintenance / Cassandra as a Service / Managed Cassandra Based on Proprietary Technology
- Backup Strategies for Cassandra - Good comparison of different backup and restoration strategies for Cassandra.
- Cassandra backup util (⭐40) - Instaclustr's Cassandra backup tool.
Cassandra Performance Tuning / Cassandra as a Service / Managed Cassandra Based on Proprietary Technology
- Analyzing Cassandra Performance with Flame Graphs - Examining Cassandra performance visually using flame graphs.
Cassandra Deployment / Cassandra Deployment on Kubernetes / Kubernetized Cassandra
- Sky UK - Cassandra Kubernetes Operator (⭐23) - Kubernetes operator that manages Cassandra clusters inside Kubernetes. Well designed and organized.
Timeseries Databases / Monitoring / Metrics
- filodb/FiloDB (⭐1.4k) - Distributed Prometheus time-series database compatible with Prometheus queries.
- cybem/cyanite-iow (⭐0) - Cassandra backed Carbon daemon and metric web service. IPONWEB repository, compatible with Carbon.
Graph / Custom Time Series
- DSE Graph | Datastax - Successor to TitanDB; a commercial, TinkerPop/Gremlin-compatible, large-scale graph database on DSE.
Tools / Custom Time Series
- cassandra-migration-tool-java (⭐98) - Lightweight Java tool for executing schema and data migrations on a Cassandra database.
- Presto - Distributed SQL Query Engine for Big Data. Presto allows querying data where it lives, including Hive, Cassandra, relational databases or even proprietary data stores.
Open Source Applications / Custom Time Series
- Cassandra-Tools (⭐55) - Python Fabric scripts to help automate the launching and managing of cluster testing on AWS.
Logging /Metrics / Custom Time Series
- Cassandra CFStats to CSV Parser (⭐1) - Converts the output of CFStats to CSV.
Books / Custom Time Series
Videos / Custom Time Series
- Tuning the Spark Cassandra Connector - Great talk by Russell Spitzer, maintainer of the Spark Cassandra Connector.
Slides / Custom Time Series
- Tuning the Spark Cassandra Connector - Slides by Russell Spitzer, maintainer of the Spark Cassandra Connector.
Jan 25 - Jan 31, 2021
Integrating with Cassandra / Spark
- Spark + Cassandra Best Practices - Outlines general use cases and best practices of Spark & Cassandra together.
Jan 18 - Jan 24, 2021
Tools / Custom Time Series
- DataStax OpsCenter - Simplified management for DataStax Enterprise and Cassandra database clusters.
- dseansible (⭐8) - DSE Installation and Upgrade Ansible Playbooks/Roles for Ubuntu Linux.
Logging /Metrics / Custom Time Series
- Cassandra Nagios (⭐5) - Perl-based scripts to get metrics for monitoring using Jolokia.
- Cassandra StatD Agent (⭐13) - Java Agent for Cassandra integration with StatsD.
Dec 14 - Dec 20, 2020
Tools / Custom Time Series
- Instaclustr Minotaur (⭐5) - Command line tool for consistent rebuilding of a Cassandra cluster.
Aug 17 - Aug 23, 2020
Cassandra Deployment / Cassandra Deployment on Kubernetes / Kubernetized Cassandra
- Strapdata - Elassandra Operator for Kubernetes (⭐11) - The Elassandra Kubernetes Operator automates the deployment and management of Elassandra clusters deployed in multiple Kubernetes clusters.
Apr 20 - Apr 26, 2020
Books / Custom Time Series
Apr 06 - Apr 12, 2020
Cassandra Deployment / Cassandra as a Service / Managed Cassandra Based on Proprietary Technology
Mar 02 - Mar 08, 2020
Tools / Custom Time Series
- ValuStor (⭐51) - ValuStor is a key-value pair database solution.
- JanusGraph-Utils (⭐200) - Tool to develop a graph database app.
- Scylla-Migrator (⭐33) - Migrates data to Scylla using Spark, normally from Cassandra.
- Cassandra CA Manager (⭐11) - Create and sign Java keystores.
Aug 05 - Aug 11, 2019
Cassandra Deployment / Cassandra Deployment on Kubernetes / Kubernetized Cassandra
- Instaclustr - Kubernetes Operator for Cassandra (⭐236) - The Cassandra operator manages Cassandra clusters deployed to Kubernetes and automates tasks related to operating a Cassandra cluster.
Jul 22 - Jul 28, 2019
Cassandra
- Apache Cassandra - Manage massive amounts of data, fast, without losing sleep.
Cassandra Use Cases
- Kaa application based on Raspberry Pi and DHT11 sensor (⭐0) - Cassandra IoT use case with a Raspberry Pi and a DHT11 sensor.
- Simple Node.js Express 4 Cassandra Application (⭐20) - MySubscribers is a very simple application (the start of an application) that lets you create, read, update, and delete users/subscribers. It was created only to accompany the YouTube course.
Using Cassandra / Cassandra as a Service / Managed Cassandra Based on Proprietary Technology
- Cassandra Data Copy Tool (⭐7) - Java tool to copy data from one Cassandra table to another.
- Import CSV files with spark (⭐0) - How to import a file from S3 into Cassandra using Spark.
- Cloud DevOps with Cassandra - Using Packer, Ansible/SSH, and AWS command line tools to create and manage EC2 Cassandra instances in AWS.
Cassandra from Relational / Cassandra as a Service / Managed Cassandra Based on Proprietary Technology
- RDBMS to NoSQL - Your roadmap to understanding whether NoSQL is right for you.
Cassandra Data Modeling / Cassandra as a Service / Managed Cassandra Based on Proprietary Technology
- Basic Rules Of Cassandra Data Modeling - Picking the right data model is the hardest part of using Cassandra. If you have a relational background, CQL will look familiar, but the way you use it can be very different.
- killrvideo-sample-schema (⭐20) - Sample Cassandra CQL Schema for a YouTube clone.
Cassandra Security / Cassandra as a Service / Managed Cassandra Based on Proprietary Technology
- Hardening Cassandra Step by Step: Part 1 - Inter-Node Encryption (And a Gentle Intro to Certificates).
Cassandra Deployment / Cassandra Deployment on Docker / Containerized Cassandra
- Docker Meet Cassandra. Cassandra Meet Docker - Article reviewing how to set up a complete Cassandra application with monitoring on Docker.
- Cassandra & Zeppelin Notebook on Docker (⭐4) - Docker-Compose script for Cassandra + Zeppelin setup.
Integrating with Cassandra / Spark
- fluxcapacitor/pipeline (⭐4.1k) - End-to-End, Real-time, Advanced Analytics Big Data Reference Pipeline using Spark, Spark SQL, Spark ML, GraphX, Spark Streaming, Kafka, NiFi, Cassandra, ElasticSearch, Redis, Tachyon, HDFS, Zeppelin, iPython/Jupyter Notebook, Tableau, Twitter Algebird.
Integrating with Cassandra / Search / Secondary Indexes
- Tuning DSE Search - Tuning indexing latency and query latency in DSE Search.
- Cassandra Lucene Index (⭐592) - Lucene based secondary indexes for Cassandra.
- cassandra-trigger - Cassandra trigger to push real-time updates to Elasticsearch.
Libraries / Custom Time Series
- express-cassandra (⭐192) - Cassandra ORM/ODM/OGM for Node.js with optional support for Elassandra & JanusGraph.
Tools / Custom Time Series
- cdeploy (⭐8) - Cdeploy is a simple tool to manage your Cassandra schema migrations in the style of dbdeploy.
- cql-vim (⭐36) - Cassandra CQL Syntax Highlighter for Vim.
Open Source Applications / Custom Time Series
- Cassandra Opstools (⭐54) - Generic scripts to review and monitor Cassandra, from Spotify.
- Netflix-Priam (⭐1k) - Co-Process for backup/recovery, Token Management, and Centralized Configuration management for Cassandra.
- Cherami - Distributed, scalable, durable, and highly available message queue system.
Logging /Metrics / Custom Time Series
- cassandra-log4j-appender (⭐19) - Cassandra appenders for Log4j.
Documentation / Custom Time Series
- DataStax Documentation - Documentation and Drivers from DataStax.
Courses / Custom Time Series
- DataStax Academy - Free online courses on Cassandra.
Communities / Custom Time Series
- Apache Software Foundation Slack - The #cassandra and #cassandra-dev channels are the official Slack channels, migrating from IRC.
Oct 08 - Oct 14, 2018
Cassandra from Relational / Cassandra as a Service / Managed Cassandra Based on Proprietary Technology
- Cassandra Schemas for Beginners (like me) - Great article for new developers to Cassandra.
Sep 17 - Sep 23, 2018
Books / Custom Time Series
Aug 06 - Aug 12, 2018
Cassandra Architecture / Cassandra as a Service / Managed Cassandra Based on Proprietary Technology
- Hinted Handoff and GC Grace Demystified - Tuning the balance between GC Grace and Hinted Handoff.
- Null bindings on prepared statements and undesired tombstone creation - Good follow-up to the previous article on tombstones.
Cassandra Maintenance / Cassandra as a Service / Managed Cassandra Based on Proprietary Technology
- Running commands cluster-wide without any management tool - Some tips and tricks to do basic Cluster operations without tools like Chef, Ansible, or Salt.
- Limiting Nodetool Parallel Threads - Little-known tool for running nodetool operations with fewer resources.
- Bootstrapping Cassandra Nodes - In-depth article on how to add nodes to a running Cassandra cluster.
- Node Replacement without Bootstrapping - How to avoid the long bootstrapping process.
- Cassandra Backup and Restore - Backup in AWS using EBS Volumes - In-depth article about backup and recovery in AWS.
Cassandra Performance Tuning / Cassandra as a Service / Managed Cassandra Based on Proprietary Technology
- Garbage Collection Tuning for Cassandra - Optimizing garbage collection for better performance.
- TWCS part 1 - how does it work and when should you use it? - Best suited for time series data that expires, Time Window Compaction Strategy comes with some caveats.
Jul 30 - Aug 05, 2018
Cassandra Distributions / Cassandra Compliant Databases on C++
- YugaByte Database (⭐7.2k) - YugaByteDB is a transactional, high-performance database for building distributed cloud services. It supports Cassandra-compatible and Redis-compatible APIs, with PostgreSQL in Beta.
Cassandra Performance Tuning / Cassandra as a Service / Managed Cassandra Based on Proprietary Technology
- Graphing cassandra-stress - Benchmarking schemas and configuration changes with the cassandra-stress tool before pushing them to production is something every Cassandra developer should know and regularly practice.
- Gatling DSE Stress Simulation Catalog (⭐4) - The goal of the repo is to provide a sample of the Gatling DSE Stress Framework's usage. Feel free to submit a pull request with example simulations.
Cassandra Security / Cassandra as a Service / Managed Cassandra Based on Proprietary Technology
- Encrypting EC2 ephemeral volumes with LUKS and AWS KMS - The example used here is Cassandra data stored on ephemeral disks.
Cassandra Deployment / Cassandra Deployment on Docker / Containerized Cassandra
Integrating with Cassandra / Cassandra Deployment on Kubernetes / Kubernetized Cassandra
- sample KafkaSparkCassandra (⭐23) - Introductory sample Scala app using Apache Spark Streaming to accept data from Kafka and write a summary to Cassandra.
- sample Spark Cassandra with SSL (⭐1) - Simple sample job illustrating the use of Apache Spark to run analytics against Cassandra over an SSL connection.
Integrating with Cassandra / Spark
- sample Spark Job Server Cassandra (⭐2) - Simple sample job illustrating the use of Spark Jobserver to execute Apache Spark analytics with Cassandra.
Libraries / Custom Time Series
- gocql (⭐2.3k) - Package gocql implements a fast and robust Cassandra client for the Go programming language.
Tools / Custom Time Series
- cqlmigrate (⭐45) - Cassandra CQL migration tool. cqlmigrate is a library for performing schema migrations on a Cassandra cluster.
- CassandraRestfulAPI (⭐12) - The CassandraRestfulAPI project exposes Cassandra data tables through a RESTful API.
Jul 23 - Jul 29, 2018
Using Cassandra / Cassandra as a Service / Managed Cassandra Based on Proprietary Technology
- Spring Data Cassandra Examples (⭐4) - Examples for the Spring Data Cassandra Project.
Cassandra Data Modeling / Cassandra as a Service / Managed Cassandra Based on Proprietary Technology
- Common Problems in Cassandra Data Models - Presentation and article on wide partitions, tombstones, and data skew.
Cassandra Deployment / Cassandra Deployment on Docker / Containerized Cassandra
- Packer: Cassandra Image (⭐48) - Cassandra Image using Packer for Docker and EC2 AMI. Covers managing EC2 Cassandra clusters with Ansible.
Communities / Custom Time Series