Awesome List Updates on Aug 14, 2018
8 awesome lists updated today.
1. Awesome Swift
Other Data
- Disk (⭐3.1k) - Delightful framework for iOS to easily persist structs, images, and data.
2. Awesome Embedded Rust
Architecture support crates / ARM
- cortex-a - Low-level access to Cortex-A processors (early state).
3. Awesome Keycloak
Community Extensions
4. Amas
Ask these people anything!
- Ali Spittel (⭐10) - Teaching code, Python and JavaScript developer, blogger.
5. Awesome Spark
Packages / Language Bindings
- Flambo (⭐607) - Clojure DSL.
- sparklyr (⭐903) - An alternative R backend, using dplyr.
- sparkle (⭐442) - Haskell on Apache Spark.
Packages / Notebooks and IDEs
- Apache Zeppelin - Web-based notebook that enables interactive data analytics with pluggable backends, integrated plotting, and extensive Spark support out-of-the-box.
- Spark Notebook (⭐3.1k) - Scalable and stable Scala and Spark focused notebook bridging the gap between JVM and Data Scientists (incl. extendable, typesafe and reactive charts).
- sparkmagic (⭐1.2k) - Jupyter magics and kernels for interactively working with remote Spark clusters through Livy (⭐995), in Jupyter notebooks.
Packages / General Purpose Libraries
- Succinct - Support for efficient queries on compressed data.
Packages / SQL Data Sources
- Spark XML (⭐441) - XML parser and writer.
- Spark Cassandra Connector (⭐1.9k) - Cassandra support including data source and API and support for arbitrary queries.
- Spark Riak Connector (⭐57) - Riak TS & Riak KV connector.
- Mongo-Spark (⭐674) - Official MongoDB connector.
- OrientDB-Spark (⭐19) - Official OrientDB connector.
Packages / Bioinformatics
- ADAM (⭐949) - Set of tools designed to analyse genomics data.
- Hail (⭐874) - Genetic analysis framework.
Packages / GIS
- Magellan (⭐525) - Geospatial analytics using Spark.
Packages / Time Series Analytics
- Spark-Timeseries (⭐1.2k) - Scala / Java / Python library for interacting with time series data on Apache Spark.
- flint (⭐979) - A time series library for Apache Spark.
Packages / Graph Processing
- Mazerunner (⭐377) - Graph analytics platform on top of Neo4j and GraphX.
- GraphFrames (⭐919) - Data frame based graph API.
- neo4j-spark-connector (⭐293) - Bolt protocol based, Neo4j Connector with RDD, DataFrame and GraphX / GraphFrames support.
- SparklingGraph - Library extending GraphX features with multiple functionalities useful in graph analytics (measures, generators, link prediction etc.).
Packages / Machine Learning Extension
- dbscan-on-spark (⭐178) - An implementation of the DBSCAN clustering algorithm on top of Apache Spark by irvingc, based on the paper by He, Yaobin, et al., "MR-DBSCAN: a scalable MapReduce-based DBSCAN algorithm for heavily skewed data".
- Apache SystemML - Declarative machine learning framework on top of Spark.
- Mahout Spark Bindings [status unknown] - Linear algebra DSL and optimizer with R-like syntax.
- spark-sklearn (⭐1.1k) - Scikit-learn integration with distributed model training.
- JPMML-Spark (⭐95) - PMML transformer library for Spark ML.
- Distributed Keras (⭐623) - Distributed deep learning framework with PySpark and Keras.
- ModelDB - A system to manage machine learning models for spark.ml and scikit-learn.
- Sparkling Water (⭐940) - H2O interoperability layer.
- BigDL (⭐4.2k) - Distributed Deep Learning library.
- MLeap (⭐1.4k) - Execution engine and serialization format which supports deployment of o.a.s.ml models without dependency on SparkSession.
Packages / Middleware
- spark-jobserver (⭐2.8k) - Simple Spark as a Service which supports object sharing using so-called named objects. JVM only.
- Mist (⭐322) - Service for exposing Spark analytical jobs and machine learning models as realtime, batch or reactive web services.
- Apache Toree (⭐711) - IPython protocol based middleware for interactive applications.
Packages / Utilities
- silex (⭐18) - Collection of tools varying from ML extensions to additional RDD methods.
- sparkly (⭐54) - Helpers & syntactic sugar for PySpark.
- Flintrock (⭐622) - A command-line tool for launching Spark clusters on EC2.
Packages / Natural Language Processing
- spark-corenlp (⭐425) - DataFrame wrapper for Stanford CoreNLP.
Packages / Streaming
- Apache Bahir - Collection of the streaming connectors excluded from Spark 2.0 (Akka, MQTT, Twitter, ZeroMQ).
Packages / Interfaces
- Apache Beam - Unified data processing engine supporting both batch and streaming applications. Apache Spark is one of the supported execution environments.
- Blaze (⭐3.1k) - Interface for querying larger than memory datasets using Pandas-like syntax. It supports both Spark DataFrames and RDDs.
Packages / Testing
- spark-testing-base (⭐1.4k) - Collection of base test classes.
- spark-fast-tests (⭐383) - A lightweight and fast testing framework.
Packages / Workflow Management
- Cromwell (⭐881) - Workflow management system with Spark backend.
6. Awesome Code Review
Tools
- Rubberduck - Browser extension that adds code-aware navigation to GitHub pull requests.
7. Awesome Sre
Service Level Agreement
8. Awesome Aws
Open Source Repos / Accompanying Repos
- aws-training-demo 🔥 (⭐128) - Demos from the Technical Trainers community.