A unified analytics engine for large-scale data processing
A Scala API for Apache Beam and Google Cloud Dataflow
Deequ is a library built on top of Apache Spark
Apache Spark to Apache Cassandra connector
Simple and distributed Machine Learning
Abstract Algebra for Scala
Memory-optimized analytics database, based on Apache Spark
A low-code, open-source programming language for data pipelines
An open-source, web-based, self-service BI tool for analytical databases
Streaming MapReduce with Scalding and Storm
Machine learning server for developers and ML engineers