Alternatives to Dask
Compare Dask alternatives for your business or organization using the curated list below. SourceForge ranks the best alternatives to Dask in 2026. Compare features, ratings, user reviews, pricing, and more from Dask competitors and alternatives in order to make an informed decision for your business.
-
1
PandaDoc
PandaDoc
PandaDoc empowers more than 50,000 growing organizations to thrive by taking the work out of document workflow. PandaDoc provides an all-in-one document automation platform that helps fast-scaling teams accelerate the ability to create, manage, and sign digital documents including agreements, proposals, quotes, contracts, and more. Powerful, integrated, and secure, PandaDoc enables business users to create and send personalized documents for electronic signing in under 4 minutes. The five main use cases of PandaDoc are proposals, quotes, contracts, eSignatures, and forms. PandaDoc seamlessly integrates with the existing software you use, such as CRM, payment gateways, and cloud storage. We support the tools highly effective teams use, like Zoom, Canva, Monday, HubSpot, and Salesforce. Moreover, our robust API and Zapier integration can connect PandaDoc with any custom software you may use.
Starting Price: Free -
2
Posit
Posit
Posit builds tools that help data scientists work more efficiently, collaborate seamlessly, and share insights securely across their organizations. Its Positron code editor provides the speed of an interactive console combined with the power to build, debug, and deploy data-science workflows in Python and R. Posit’s platform enables teams to scale open-source data science, offering enterprise-ready capabilities for publishing, sharing, and operationalizing applications. Companies rely on Posit’s secure infrastructure to host Shiny apps, dashboards, APIs, and analytical reports with confidence. Whether using open-source packages or cloud-based solutions, Posit supports reproducible, high-quality work at every stage of the data lifecycle. Trusted by millions of users—and more than half of the Fortune 100—Posit empowers professionals across industries to innovate with data. -
3
BigPanda
BigPanda
Aggregate data from all observability, monitoring, change and topology tools. BigPanda’s Open Box Machine Learning will correlate the data into a small number of actionable insights so incidents are detected in real-time, as they form, before they escalate into outages. Accelerate incident and outage resolution by automatically identifying the probable root cause of problems. BigPanda identifies both root cause changes and infrastructure-related root causes. Resolve incidents and outages faster. BigPanda automates and streamlines the incident response lifecycle across incident triage, ticketing, notifications, and war room creation. Accelerate remediation by integrating BigPanda with enterprise runbook automation tools. Applications and cloud services are the lifeblood of every company. When there’s an outage, everyone is impacted. BigPanda cements AIOps market leadership with $190M in funding, $1.2B valuation. -
4
Vaex
Vaex
At Vaex.io we aim to democratize big data and make it available to anyone, on any machine, at any scale. Cut development time by 80%: your prototype is your solution. Create automatic pipelines for any model. Empower your data scientists. Turn any laptop into a big data powerhouse, with no clusters and no engineers. We provide reliable and fast data-driven solutions. With our state-of-the-art technology we build and deploy machine learning models faster than anyone on the market. Turn your data scientists into big data engineers. We provide comprehensive training of your employees, enabling you to take full advantage of our technology. Vaex combines memory mapping, a sophisticated expression system, and fast out-of-core algorithms. Efficiently visualize and explore big datasets, and build machine learning models on a single machine. -
5
Polars
Polars
Built with common data wrangling habits in mind, Polars exposes a complete Python API, including the full set of features to manipulate DataFrames using an expression language that helps you write readable and performant code. Polars is written in Rust and is uncompromising in its choices to provide a feature-complete DataFrame API to the Rust ecosystem. Use it as a DataFrame library or as a query engine backend for your data models. -
6
Ray
Anyscale
Develop on your laptop and then scale the same Python code elastically across hundreds of nodes or GPUs on any cloud, with no changes. Ray translates existing Python concepts to the distributed setting, allowing any serial application to be easily parallelized with minimal code changes. Easily scale compute-heavy machine learning workloads like deep learning, model serving, and hyperparameter tuning with a strong ecosystem of distributed libraries. Scale existing workloads (e.g., PyTorch) on Ray with minimal effort by tapping into integrations. Native Ray libraries, such as Ray Tune and Ray Serve, lower the effort to scale the most compute-intensive machine learning workloads, such as hyperparameter tuning, training deep learning models, and reinforcement learning. For example, get started with distributed hyperparameter tuning in just 10 lines of code. Creating distributed apps is hard. Ray handles all aspects of distributed execution.
Starting Price: Free -
7
scikit-learn
scikit-learn
Scikit-learn provides simple and efficient tools for predictive data analysis. Scikit-learn is a robust, open source machine learning library for the Python programming language, designed to provide simple and efficient tools for data analysis and modeling. Built on the foundations of popular scientific libraries like NumPy, SciPy, and Matplotlib, scikit-learn offers a wide range of supervised and unsupervised learning algorithms, making it an essential toolkit for data scientists, machine learning engineers, and researchers. The library is organized into a consistent and flexible framework, where various components can be combined and customized to suit specific needs. This modularity makes it easy for users to build complex pipelines, automate repetitive tasks, and integrate scikit-learn into larger machine-learning workflows. Additionally, the library’s emphasis on interoperability ensures that it works seamlessly with other Python libraries, facilitating smooth data processing.
Starting Price: Free -
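The modular design described above can be illustrated with a minimal sketch that chains a preprocessing step and a classifier into one pipeline. The synthetic dataset, the step names ("scale", "classify"), and the choice of logistic regression are assumptions made for illustration only, not something the listing specifies.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data (illustrative assumption): 200 samples, 5 features.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Combine two interchangeable components into a single estimator.
model = Pipeline([
    ("scale", StandardScaler()),        # standardize each feature
    ("classify", LogisticRegression()),  # fit a linear classifier
])

# The pipeline exposes the same fit/predict/score interface as any estimator.
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)
print(f"held-out accuracy: {accuracy:.2f}")
```

Either step could be swapped for another transformer or estimator without changing the surrounding code, which is the kind of composition the description refers to.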
8
Bokeh
Bokeh
Bokeh makes it simple to create common plots, but also can handle custom or specialized use-cases. Plots, dashboards, and apps can be published in web pages or Jupyter notebooks. Python has an incredible ecosystem of powerful analytics tools: NumPy, SciPy, pandas, Dask, scikit-learn, OpenCV, and more. With a wide array of widgets, plot tools, and UI events that can trigger real Python callbacks, the Bokeh server is the bridge that lets you connect these tools to rich, interactive visualizations in the browser. Microscopium is a project maintained by researchers at Monash University. It allows researchers to discover new gene or drug functions by exploring large image datasets with Bokeh’s interactive tools. Panel is a tool for polished data presentation that utilizes the Bokeh server. It is created and supported by Anaconda. Panel makes it simple to create custom interactive web apps and dashboards by connecting user-defined widgets to plots, images, tables, or text.
Starting Price: Free -
9
IBM Watson Studio
IBM
Build, run and manage AI models, and optimize decisions at scale across any cloud. IBM Watson Studio empowers you to operationalize AI anywhere as part of IBM Cloud Pak® for Data, the IBM data and AI platform. Unite teams, simplify AI lifecycle management and accelerate time to value with an open, flexible multicloud architecture. Automate AI lifecycles with ModelOps pipelines. Speed data science development with AutoAI. Prepare and build models visually and programmatically. Deploy and run models through one-click integration. Promote AI governance with fair, explainable AI. Drive better business outcomes by optimizing decisions. Use open source frameworks like PyTorch, TensorFlow and scikit-learn. Bring together development tools, including popular IDEs, Jupyter notebooks, JupyterLab and CLIs, and languages such as Python, R and Scala. IBM Watson Studio helps you build and scale AI with trust and transparency by automating AI lifecycle management.
-
10
Azure Databricks
Microsoft
Unlock insights from all your data and build artificial intelligence (AI) solutions with Azure Databricks. Set up your Apache Spark™ environment in minutes, autoscale, and collaborate on shared projects in an interactive workspace. Azure Databricks supports Python, Scala, R, Java, and SQL, as well as data science frameworks and libraries including TensorFlow, PyTorch, and scikit-learn. Azure Databricks provides the latest versions of Apache Spark and allows you to seamlessly integrate with open source libraries. Spin up clusters and build quickly in a fully managed Apache Spark environment with the global scale and availability of Azure. Clusters are set up, configured, and fine-tuned to ensure reliability and performance without the need for monitoring. Take advantage of autoscaling and auto-termination to improve total cost of ownership (TCO). -
11
IntelliHub
Spotflock
We work closely with businesses to find out which common issues prevent companies from realising benefits. We design to open up opportunities that were previously not viable using conventional approaches. Corporations, big and small, require an AI platform with complete empowerment and ownership. Tackle data privacy and adopt AI platforms at a sustainable cost. Enhance the efficiency of businesses and augment the work humans do. We apply AI to gain control over repetitive or dangerous tasks and bypass human intervention, thereby expediting tasks with creativity and empathy. Machine Learning helps to give predictive capabilities to applications with ease. You can build classification and regression models. It can also do clustering and visualize different clusters. It supports multiple ML libraries like Weka, scikit-learn, H2O and TensorFlow. It includes around 22 different algorithms for building classification, regression and clustering models. -
12
Keepsake
Replicate
Keepsake is an open-source Python library designed to provide version control for machine learning experiments and models. It enables users to automatically track code, hyperparameters, training data, model weights, metrics, and Python dependencies, ensuring that all aspects of the machine learning workflow are recorded and reproducible. Keepsake integrates seamlessly with existing workflows by requiring minimal code additions, allowing users to continue training as usual while Keepsake saves code and weights to Amazon S3 or Google Cloud Storage. This facilitates the retrieval of code and weights from any checkpoint, aiding in re-training or model deployment. Keepsake supports various machine learning frameworks, including TensorFlow, PyTorch, scikit-learn, and XGBoost, by saving files and dictionaries in a straightforward manner. It also offers features such as experiment comparison, enabling users to analyze differences in parameters, metrics, and dependencies across experiments.
Starting Price: Free -
13
Flower
Flower
Flower is an open source federated learning framework designed to simplify the development and deployment of machine learning models across decentralized data sources. It enables training on data located on devices or servers without transferring the data itself, thereby enhancing privacy and reducing bandwidth usage. Flower supports a wide range of machine learning frameworks, including PyTorch, TensorFlow, Hugging Face Transformers, scikit-learn, and XGBoost, and is compatible with various platforms and cloud services like AWS, GCP, and Azure. It offers flexibility through customizable strategies and supports both horizontal and vertical federated learning scenarios. Flower's architecture allows for scalable experiments, with the capability to handle workloads involving tens of millions of clients. It also provides built-in support for privacy-preserving techniques like differential privacy and secure aggregation.
Starting Price: Free -
14
Lucidworks Fusion
Lucidworks
Fusion transforms your siloed data into personalized insights unique to each user. Lucidworks Fusion lets customers easily deploy AI-powered data discovery and search applications in a modern, containerized, cloud-native architecture. Data scientists interact with those applications by leveraging existing machine learning models and workflows. Or they can quickly create and deploy new models using popular tools like Python ML, TensorFlow, scikit-learn, and spaCy. Reduce the effort and risk of managing deployments of Fusion in the cloud. Lucidworks has modernized Fusion with a cloud-native microservices architecture orchestrated by Kubernetes. Fusion allows customers to dynamically manage application resources as utilization ebbs and flows, reduce the effort of deploying and upgrading Fusion, and avoid unscheduled downtime and performance degradation. Fusion includes native support for Python machine learning models. Plug your custom ML models into Fusion. -
15
NumPy
NumPy
Fast and versatile, the NumPy vectorization, indexing, and broadcasting concepts are the de-facto standards of array computing today. NumPy offers comprehensive mathematical functions, random number generators, linear algebra routines, Fourier transforms, and more. NumPy supports a wide range of hardware and computing platforms, and plays well with distributed, GPU, and sparse array libraries. The core of NumPy is well-optimized C code. Enjoy the flexibility of Python with the speed of compiled code. NumPy’s high level syntax makes it accessible and productive for programmers from any background or experience level. NumPy brings the computational power of languages like C and Fortran to Python, a language much easier to learn and use. With this power comes simplicity: a solution in NumPy is often clear and elegant.
Starting Price: Free -
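The vectorization, broadcasting, and indexing concepts mentioned above can be sketched in a few lines. The array values and variable names below are made up for illustration.

```python
import numpy as np

# Illustrative data: 3 samples with 2 features each.
samples = np.array([[1.0, 10.0],
                    [2.0, 20.0],
                    [3.0, 30.0]])

# Vectorization: one expression applies to every element, with no Python loop.
squared = samples ** 2

# Broadcasting: the per-column mean has shape (2,) and is stretched across
# the (3, 2) array during subtraction.
column_means = samples.mean(axis=0)
centered = samples - column_means

# Boolean indexing: keep only the rows whose first feature exceeds 1.
selected = samples[samples[:, 0] > 1.0]

print(centered)
print(selected)
```

Each of the three operations runs in compiled code over whole arrays, which is where the speed advantage over an explicit Python loop comes from.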
16
Metaflow
Netflix
Successful data science projects are delivered by data scientists who can build, improve, and operate end-to-end workflows independently, focusing more on data science, less on engineering. Use Metaflow with your favorite data science libraries, such as TensorFlow or scikit-learn, and write your models in idiomatic Python code with not much new to learn. Metaflow also supports the R language. Metaflow helps you design your workflow, run it at scale, and deploy it to production. It versions and tracks all your experiments and data automatically. It allows you to inspect results easily in notebooks. Metaflow comes packaged with tutorials, so getting started is easy. You can make copies of all the tutorials in your current directory using the metaflow command line interface. -
17
Deep Learning VM Image
Google
Provision a VM quickly with everything you need to get your deep learning project started on Google Cloud. Deep Learning VM Image makes it easy and fast to instantiate a VM image containing the most popular AI frameworks on a Google Compute Engine instance without worrying about software compatibility. You can launch Compute Engine instances pre-installed with TensorFlow, PyTorch, scikit-learn, and more. You can also easily add Cloud GPU and Cloud TPU support. Deep Learning VM Image supports the most popular and latest machine learning frameworks, like TensorFlow and PyTorch. To accelerate your model training and deployment, Deep Learning VM Images are optimized with the latest NVIDIA® CUDA-X AI libraries and drivers and the Intel® Math Kernel Library. Get started immediately with all the required frameworks, libraries, and drivers pre-installed and tested for compatibility. Deep Learning VM Image delivers a seamless notebook experience with integrated support for JupyterLab.
-
18
statsmodels
statsmodels
statsmodels is a Python module that provides classes and functions for the estimation of many different statistical models, as well as for conducting statistical tests and statistical data exploration. An extensive list of result statistics is available for each estimator. The results are tested against existing statistical packages to ensure that they are correct. The package is released under the open-source Modified BSD (3-clause) license. statsmodels supports specifying models using R-style formulas and pandas DataFrames. Have a look at dir(results) to see available results. Attributes are described in results.__doc__ and results methods have their own docstrings. You can also use NumPy arrays instead of formulas. The easiest way to install statsmodels is to install it as part of the Anaconda distribution, a cross-platform distribution for data analysis and scientific computing. This is the recommended installation method for most users.
Starting Price: Free -
19
Datatron
Datatron
Datatron offers tools and features built from scratch, specifically to make machine learning in production work for you. Most teams discover that there’s more to it than just deploying models, which is already a very manual and time-consuming task. Datatron offers a single model governance and management platform for all of your ML, AI, and Data Science models in production. We help you automate, optimize, and accelerate your ML models to ensure that they are running smoothly and efficiently in production. Data Scientists use a variety of frameworks to build the best models. We support anything you’d build a model with (e.g., TensorFlow, H2O, scikit-learn, and SAS). Explore models built and uploaded by your data science team, all from one centralized repository. Create a scalable model deployment in just a few clicks. Deploy models built using any language or framework. Make better decisions based on your model performance. -
20
h5py
HDF5
The h5py package is a Pythonic interface to the HDF5 binary data format. It lets you store huge amounts of numerical data, and easily manipulate that data from NumPy. For example, you can slice into multi-terabyte datasets stored on disk, as if they were real NumPy arrays. Thousands of datasets can be stored in a single file, categorized and tagged however you want. h5py uses straightforward NumPy and Python metaphors, like dictionary and NumPy array syntax. For example, you can iterate over datasets in a file, or check out the .shape or .dtype attributes of datasets. You don't need to know anything special about HDF5 to get started. In addition to the easy-to-use high level interface, h5py rests on an object-oriented Cython wrapping of the HDF5 C API. Almost anything you can do from C in HDF5, you can do from h5py.
Starting Price: Free -
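The dictionary-style and array-style access described above might look like the following minimal sketch. The file name, dataset names, and the tiny arrays are illustrative assumptions; a real file could hold far larger datasets, but the slicing syntax is the same.

```python
import os
import tempfile

import h5py
import numpy as np

# Write a small example file to a temporary directory (illustrative path).
path = os.path.join(tempfile.mkdtemp(), "example.h5")

with h5py.File(path, "w") as f:
    grid = np.arange(100, dtype="float64").reshape(10, 10)
    f.create_dataset("temperature", data=grid)
    f.create_dataset("counts", data=np.arange(5, dtype="int64"))

with h5py.File(path, "r") as f:
    # Dictionary metaphor: iterate over the dataset names in the file.
    names = sorted(f.keys())

    # Datasets expose NumPy-like attributes without loading the data.
    dset = f["temperature"]
    shape = dset.shape
    dtype = dset.dtype

    # Array metaphor: slicing reads only the requested region from disk.
    block = dset[2:4, :3]

print(names, shape, dtype)
print(block)
```

Only the sliced region is read into memory as a NumPy array, which is what makes the same pattern workable for datasets much larger than RAM.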
21
JAX
JAX
JAX is a Python library designed for high-performance numerical computing and machine learning research. It offers a NumPy-like API, facilitating seamless adoption for those familiar with NumPy. Key features of JAX include automatic differentiation, just-in-time compilation, vectorization, and parallelization, all optimized for execution on CPUs, GPUs, and TPUs. These capabilities enable efficient computation for complex mathematical functions and large-scale machine-learning models. JAX also integrates with various libraries within its ecosystem, such as Flax for neural networks and Optax for optimization tasks. Comprehensive documentation, including tutorials and user guides, is available to assist users in leveraging JAX's full potential. -
22
Quadratic
Quadratic
Quadratic enables your team to work together on data analysis to deliver faster results. You already know how to use a spreadsheet, but you’ve never had this much power. Quadratic speaks Formulas and Python (SQL & JavaScript coming soon). Use the language you and your team already know. Single-line formulas are hard to read. In Quadratic you can expand your recipes to as many lines as you need. Quadratic has Python library support built-in. Bring the latest open-source tools directly to your spreadsheet. The last line of code is returned to the spreadsheet. Raw values, 1/2D arrays, and Pandas DataFrames are supported by default. Pull or fetch data from an external API, and it updates automatically in Quadratic's cells. Navigate with ease, zoom out for the big picture, and zoom in to focus on the details. Arrange and navigate your data how it makes sense in your head, not how a tool forces you to do it. -
23
scikit-image
scikit-image
scikit-image is a collection of algorithms for image processing. It is available free of charge and free of restriction. We pride ourselves on high-quality, peer-reviewed code, written by an active community of volunteers. scikit-image provides a versatile set of image processing routines in Python. This library is developed by its community, and contributions are most welcome! scikit-image aims to be the reference library for scientific image analysis in Python. We accomplish this by being easy to use and install. We are careful in taking on new dependencies, and sometimes cull existing ones, or make them optional. All functions in our API have thorough docstrings clarifying expected inputs and outputs. Conceptually identical arguments have the same name and position in a function signature. Test coverage is close to 100% and code is reviewed by at least two core developers before being included in the library.
Starting Price: Free -
24
Amazon EC2 UltraClusters
Amazon
Amazon EC2 UltraClusters enable you to scale to thousands of GPUs or purpose-built machine learning accelerators, such as AWS Trainium, providing on-demand access to supercomputing-class performance. They democratize supercomputing for ML, generative AI, and high-performance computing developers through a simple pay-as-you-go model without setup or maintenance costs. UltraClusters consist of thousands of accelerated EC2 instances co-located in a given AWS Availability Zone, interconnected using Elastic Fabric Adapter (EFA) networking in a petabit-scale nonblocking network. This architecture offers high-performance networking and access to Amazon FSx for Lustre, a fully managed shared storage built on a high-performance parallel file system, enabling rapid processing of massive datasets with sub-millisecond latencies. EC2 UltraClusters provide scale-out capabilities for distributed ML training and tightly coupled HPC workloads, reducing training times. -
25
NVIDIA RAPIDS
NVIDIA
The RAPIDS suite of software libraries, built on CUDA-X AI, gives you the freedom to execute end-to-end data science and analytics pipelines entirely on GPUs. It relies on NVIDIA® CUDA® primitives for low-level compute optimization, but exposes that GPU parallelism and high-bandwidth memory speed through user-friendly Python interfaces. RAPIDS also focuses on common data preparation tasks for analytics and data science. This includes a familiar DataFrame API that integrates with a variety of machine learning algorithms for end-to-end pipeline accelerations without paying typical serialization costs. RAPIDS also includes support for multi-node, multi-GPU deployments, enabling vastly accelerated processing and training on much larger dataset sizes. Accelerate your Python data science toolchain with minimal code changes and no new tools to learn. Increase machine learning model accuracy by iterating on models faster and deploying them more frequently. -
26
GeoPandas
GeoPandas
GeoPandas is an open-source project to make working with geospatial data in Python easier. GeoPandas extends the datatypes used by pandas to allow spatial operations on geometric types. Geometric operations are performed by shapely. GeoPandas further depends on fiona for file access and matplotlib for plotting. It combines the capabilities of pandas and shapely, providing geospatial operations in pandas and a high-level interface to multiple geometries to shapely. GeoPandas enables you to easily do operations in Python that would otherwise require a spatial database such as PostGIS. GeoPandas is a community-led project written, used and supported by a wide range of people from all around the world and from a large variety of backgrounds. GeoPandas will always be 100% open source software, free for all to use and released under the liberal terms of the BSD-3-Clause license. -
27
PyQtGraph
PyQtGraph
PyQtGraph is a pure-Python graphics and GUI library built on PyQt/PySide and NumPy. It is intended for use in mathematics/scientific/engineering applications. Despite being written entirely in Python, the library is very fast due to its heavy leverage of NumPy for number crunching and Qt's GraphicsView framework for fast display. PyQtGraph is distributed under the MIT open-source license. Basic 2D plotting in interactive view boxes. Line and scatter plots. Data can be panned/scaled by mouse. Fast drawing for real-time data display and interaction. Displays most data types (int or float; any bit depth; RGB, RGBA, or luminance). Functions for slicing multidimensional images at arbitrary angles (great for MRI data). Rapid update for video display or real-time interaction. Image display with interactive lookup tables and level control. Mesh rendering with isosurface generation. Interactive viewports rotate/zoom with mouse. Basic 3D scenegraph for easier programming.
Starting Price: Free -
28
AWS ParallelCluster
Amazon
AWS ParallelCluster is an open-source cluster management tool that simplifies the deployment and management of High-Performance Computing (HPC) clusters on AWS. It automates the setup of required resources, including compute nodes, a shared filesystem, and a job scheduler, supporting multiple instance types and job submission queues. Users can interact with ParallelCluster through a graphical user interface, command-line interface, or API, enabling flexible cluster configuration and management. The tool integrates with job schedulers like AWS Batch and Slurm, facilitating seamless migration of existing HPC workloads to the cloud with minimal modifications. AWS ParallelCluster is available at no additional charge; users only pay for the AWS resources consumed by their applications. With AWS ParallelCluster, you can use a simple text file to model, provision, and dynamically scale the resources needed for your applications in an automated and secure manner. -
29
Daft
Daft
Daft is a framework for ETL, analytics and ML/AI at scale. Its familiar Python dataframe API is built to outperform Spark in performance and ease of use. Daft plugs directly into your ML/AI stack through efficient zero-copy integrations with essential Python libraries such as PyTorch and Ray. It also allows requesting GPUs as a resource for running models. Daft runs locally with a lightweight multithreaded backend. When your local machine is no longer sufficient, it scales seamlessly to run out-of-core on a distributed cluster. Daft can handle User-Defined Functions (UDFs) in columns, allowing you to apply complex expressions and operations to Python objects with the full flexibility required for ML/AI. -
30
Avanzai
Avanzai
Avanzai helps accelerate your financial data analysis by letting you use natural language to output production-ready Python code. Avanzai speeds up financial data analysis for both beginners and experts using plain English. Plot time series data, equity index members, and even stock performance data using natural prompts. Skip the boring parts of financial analysis by leveraging AI to generate code with relevant Python packages already installed. Further edit the code if you wish. Once you're ready, copy and paste the code into your local environment and get straight to business. Leverage commonly used Python packages for quant analysis, such as pandas and NumPy, using plain English. Take financial analysis to the next level: quickly pull fundamental data and calculate the performance of nearly all US stocks. Enhance your investment decisions with accurate and up-to-date information. Avanzai empowers you to write the same Python code that quants use to analyze complex financial data. -
31
Bodo.ai
Bodo.ai
Bodo’s powerful compute engine and parallel computing approach provide efficient execution and effective scalability even for 10,000+ cores and PBs of data. Bodo enables faster development and easier maintenance for data science, data engineering and ML workloads with standard Python APIs like pandas. Avoid frequent failures with bare-metal native code execution and catch errors before they appear in production with end-to-end compilation. Experiment faster with large datasets on your laptop with the simplicity that only Python can provide. Write production-ready code without the hassle of refactoring for scaling on large infrastructure! -
32
Slurm
SchedMD
Slurm Workload Manager, formerly known as Simple Linux Utility for Resource Management (SLURM), is a free, open-source job scheduler and cluster management system for Linux and Unix-like kernels. It's designed to manage compute jobs on high performance computing (HPC) clusters and high throughput computing (HTC) environments, and is used by many of the world's supercomputers and computer clusters.
Starting Price: Free -
33
Outerbounds
Outerbounds
Design and develop data-intensive projects with human-friendly, open-source Metaflow. Run, scale, and deploy them reliably on the fully managed Outerbounds platform. One platform for all your ML and data science projects. Access data securely from your existing data warehouses. Compute with a cluster optimized for scale and cost. 24/7 managed orchestration for production workflows. Use results to power any application. Give your data scientists superpowers, approved by your engineers. Outerbounds Platform allows data scientists to develop rapidly, experiment at scale, and deploy to production confidently. All within the outer bounds of policies and processes defined by your engineers, running on your cloud account, fully managed by us. Security is in our DNA, not at the perimeter. The platform adapts to your policies and compliance requirements through multiple layers of security. Centralized auth, a strict permission boundary, and granular task execution roles. -
34
broot
broot
The ROOT data analysis framework is widely used in High Energy Physics (HEP) and has its own output format (.root). ROOT can be easily interfaced with software written in C++. For software tools in Python there exists pyROOT. Unfortunately, pyROOT does not work well with python3.4. broot is a small library that converts data in Python NumPy ndarrays to ROOT files containing trees with a branch for each array. The goal of this library is to provide a generic way of writing Python NumPy data structures to ROOT files. The library should be portable and supports python2, python3, ROOT v5 and ROOT v6 (requiring no modifications on the ROOT part, just the default installation). Installation of the library should only require a user to compile the library once or install it as a Python package.
Starting Price: Free -
35
Azure Data Science Virtual Machines
Microsoft
DSVMs are Azure Virtual Machine images, pre-installed, configured and tested with several popular tools that are commonly used for data analytics, machine learning and AI training. They offer a consistent setup across teams, promote sharing and collaboration, and provide Azure scale and management, near-zero setup, and a full cloud-based desktop for data science. Quick, low-friction startup for one-to-many classroom scenarios and online courses. Ability to run analytics on all Azure hardware configurations with vertical and horizontal scaling. Pay only for what you use, when you use it. Readily available GPU clusters with deep learning tools already pre-configured. Examples, templates and sample notebooks built or tested by Microsoft are provided on the VMs to enable easy onboarding to the various tools and capabilities such as neural networks (PyTorch, TensorFlow, etc.), data wrangling, R, Python, Julia, and SQL Server.
Starting Price: $0.005 -
36
Shapelets
Shapelets
Powerful computing at your fingertips. Parallel computing, groundbreaking algorithms, so what are you waiting for? Designed to empower data scientists in business. Get the fastest computing in an all-inclusive time-series platform. Shapelets provides you with analytical features such as causality, discords and motif discovery, forecasting, clustering, etc. Run, extend and integrate your own algorithms into the Shapelets platform to make the most of Big Data analysis. Shapelets integrates seamlessly with any data collection and storage solution. It also integrates with MS Office and any other visualization tool to simplify and share insights without any technical acumen. Our UI works with the server to bring you interactive visualizations. You can make the most of your metadata and represent it in the many different visual graphs provided by our modern interface. Shapelets enables users from the oil, gas, and energy industry to perform real-time analysis of operational data. -
37
Panda Security Cleanup
Panda
Panda Cleanup cleans and speeds up your Windows devices, extending their useful life and improving performance. Speed up your Windows device and free up hard disk space by deleting unnecessary files. Delete temporary files and clear your browser's history. Delete cookies (Chrome, Firefox, Edge, and Internet Explorer). Clean up the Windows registry. Defragment the hard disk. Some programs and applications are configured to run automatically whenever you start your PC. This may slow down your computer. With Panda Cleanup, you'll be able to see which programs are configured to run at startup and disable them if you don't think they are necessary. Additionally, Panda Cleanup will warn you every time a program installs as part of your computer's startup sequence so you can keep the boot process always optimized. Panda Cleanup will delete any corrupted or unnecessary registry keys that may cause operating system errors. Starting Price: $17.93 per year -
38
Plotly Dash
Plotly
Dash & Dash Enterprise let you build & deploy analytic web apps using Python, R, and Julia. No JavaScript or DevOps required. Through Dash, the world's largest companies elevate AI, ML, and Python analytics to business users at 5% of the cost of a full-stack development approach. Deliver apps and dashboards that run advanced analytics: ML, NLP, forecasting, computer vision, and more. Work in the languages you love: Python, R, and Julia. Reduce costs by migrating legacy, per-seat licensed software to Dash Enterprise's open-core, unlimited end-user pricing model. Move faster by deploying and updating Dash apps without an IT or DevOps team. Create pixel-perfect dashboards & web apps without writing any CSS. Scale effortlessly with Kubernetes. Support mission-critical Python applications with high availability. -
39
Appsilon
Appsilon
Appsilon provides innovative data analytics, machine learning, and managed services solutions for Fortune 500 companies, NGOs, and non-profit organizations. We deliver the world’s most advanced R Shiny applications, with a unique ability to rapidly develop and scale enterprise Shiny dashboards. Our proprietary machine learning frameworks allow us to deliver Computer Vision, NLP, and fraud detection prototypes in as little as one week. Above all, we are committed to making a positive impact on the world. Through our AI For Good Initiative, we routinely contribute our skills to projects that support the preservation of human life and the conservation of animal populations all over the globe. Recently, our team has worked to mitigate poaching in Africa with computer vision, provide satellite image analysis for assessing damage after natural disasters, and build tools to help with COVID-19 risk assessment. Appsilon is also a pioneer in open source. -
40
H2O.ai
H2O.ai
H2O.ai is the open source leader in AI and machine learning with a mission to democratize AI for everyone. Our industry-leading enterprise-ready platforms are used by hundreds of thousands of data scientists in over 20,000 organizations globally. We empower every company to be an AI company in financial services, insurance, healthcare, telco, retail, pharmaceuticals, and marketing, delivering real value and transforming businesses today. -
41
Beaker Notebook
Two Sigma Open Source
BeakerX is a collection of kernels and extensions to the Jupyter interactive computing environment. It provides JVM support, Spark cluster support, polyglot programming, interactive plots, tables, forms, publishing, and more. All of BeakerX’s JVM languages plus Python and JavaScript have APIs for interactive time-series, scatter plots, histograms, heatmaps, and treemaps. The widgets remain interactive both in notebooks saved to disk and in notebooks published to the web. They include unique features for handling many points, nanosecond resolution, zooming, and exporting. BeakerX’s table widget automatically recognizes pandas data frames and allows you to search, sort, drag, filter, format, select, graph, hide, pin, and export to CSV or clipboard. This makes connecting to spreadsheets quick and easy. BeakerX has a Spark magic with GUIs for configuration, status, progress, and interrupt of Spark jobs. You can either use the GUI or create your own SparkSession with code. -
42
HPE Performance Cluster Manager
Hewlett Packard Enterprise
HPE Performance Cluster Manager (HPCM) delivers an integrated system management solution for Linux®-based high performance computing (HPC) clusters. HPE Performance Cluster Manager provides complete provisioning, management, and monitoring for clusters scaling up to exascale-sized supercomputers. The software enables fast system setup from bare metal, comprehensive hardware monitoring and management, image management, software updates, power management, and cluster health management. Additionally, it makes scaling HPC clusters easier and more efficient while providing integration with a plethora of third-party tools for running and managing workloads. HPE Performance Cluster Manager reduces the time and resources spent administering HPC systems, lowering total cost of ownership, increasing productivity, and providing a better return on hardware investments. -
43
QuantRocket
QuantRocket
QuantRocket is a Python-based platform for researching, backtesting, and trading quantitative strategies. It provides a JupyterLab environment, offers a suite of data integrations, and supports multiple backtesters: Zipline, the open-source backtester that originally powered Quantopian; Alphalens, an alpha factor analysis library; Moonshot, a vectorized backtester based on pandas; and MoonshotML, a walk-forward machine learning backtester. Built on Docker, QuantRocket can be deployed locally or to the cloud and has an open architecture that is flexible and extensible. -
44
Cloudera Data Science Workbench
Cloudera
Accelerate machine learning from research to production with a consistent experience built for your traditional platform. With Python, R, and Scala directly in the web browser, Cloudera Data Science Workbench (CDSW) delivers a self-service experience data scientists will love. Download and experiment with the latest libraries and frameworks in customizable project environments that work just like your laptop. Cloudera Data Science Workbench provides connectivity not only to CDH and HDP but also to the systems your data science teams rely on for analysis. Cloudera Data Science Workbench lets data scientists manage their own analytics pipelines, including built-in scheduling, monitoring, and email alerting. Quickly develop and prototype new machine learning projects and easily deploy them to production. -
45
Karpenter
Amazon
Karpenter simplifies Kubernetes infrastructure with the right nodes at the right time. Karpenter is an open source, high-performance Kubernetes cluster autoscaler that simplifies infrastructure management by automatically launching the appropriate compute resources to handle your cluster's applications. Designed to leverage the full potential of the cloud, Karpenter enables fast and straightforward compute provisioning for Kubernetes clusters. It enhances application availability by swiftly responding to changes in application load, scheduling, and resource requirements, efficiently placing new workloads onto a variety of available computing resources. By identifying opportunities to remove under-utilized nodes, replace costly nodes with more economical alternatives, and consolidate workloads onto more efficient compute resources, Karpenter effectively reduces cluster compute costs. Starting Price: Free -
46
OpenHPC
The Linux Foundation
OpenHPC is a collaborative community effort that was initiated from a desire to aggregate a number of common ingredients required to deploy and manage High Performance Computing (HPC) Linux clusters, including provisioning tools, resource management, I/O clients, development tools, and a variety of scientific libraries. Packages provided by OpenHPC have been pre-built with HPC integration in mind, with the goal of providing reusable building blocks for the HPC community. Over time, the community also plans to identify and develop abstraction interfaces between key components to further enhance modularity and interchangeability. The community includes representation from a variety of sources, including software vendors, equipment manufacturers, research institutions, supercomputing sites, and others. This community works to integrate a multitude of components that are commonly used in HPC systems and are freely available for open source distribution. Starting Price: Free -
47
MATLAB
The MathWorks
MATLAB® combines a desktop environment tuned for iterative analysis and design processes with a programming language that expresses matrix and array mathematics directly. It includes the Live Editor for creating scripts that combine code, output, and formatted text in an executable notebook. MATLAB toolboxes are professionally developed, rigorously tested, and fully documented. MATLAB apps let you see how different algorithms work with your data. Iterate until you’ve got the results you want, then automatically generate a MATLAB program to reproduce or automate your work. Scale your analyses to run on clusters, GPUs, and clouds with only minor code changes. There’s no need to rewrite your code or learn big data programming and out-of-memory techniques. Automatically convert MATLAB algorithms to C/C++, HDL, and CUDA code to run on your embedded processor or FPGA/ASIC. MATLAB works with Simulink to support Model-Based Design. -
48
Gathr.ai
Gathr.ai
Gathr is a Data+AI fabric, helping enterprises rapidly deliver production-ready data and AI products. Data+AI fabric enables teams to effortlessly acquire, process, and harness data, leverage AI services to generate intelligence, and build consumer applications, all with unparalleled speed, scale, and confidence. Gathr’s self-service, AI-assisted, and collaborative approach enables data and AI leaders to achieve massive productivity gains by empowering their existing teams to deliver more valuable work in less time. With complete ownership and control over data and AI, flexibility and agility to experiment and innovate on an ongoing basis, and proven reliable performance at real-world scale, Gathr allows them to confidently accelerate POVs to production. Additionally, Gathr supports both cloud and air-gapped deployments, making it the ideal choice for diverse enterprise needs. Gathr, recognized by leading analysts like Gartner and Forrester, is a go-to partner for Fortune 500 Starting Price: $0.25/credit -
49
SAS Viya
SAS
SAS® Viya® data science offerings provide a comprehensive, scalable analytics environment that's quick and easy to deploy, enabling you to meet diverse business needs. Automatically generated insights enable you to identify the most common variables across all models, the most important variables selected across models and assessment results for all models. Natural language generation capabilities are used to create project summaries written in plain language, enabling you to easily interpret reports. Analytics team members can add project notes to the insights report to facilitate communication and collaboration among team members. SAS lets you embed open source code within an analysis and call open source algorithms seamlessly within its environment. This facilitates collaboration across your organization because users can program in their language of choice. You can also take advantage of SAS Deep Learning with Python (DLPy), our open-source package on GitHub. -
50
HPE Cray
Hewlett Packard Enterprise
HPE Cray exascale supercomputers are an entirely new design, created from the ground up to handle today’s massive converged modeling, simulation, AI, and analytics workloads. Meet the next era of supercomputing. HPE Cray supercomputers are one of our most significant technology advancements in decades. With them, we’re introducing revolutionary capabilities for revolutionary questions. HPE Cray supercomputers are the next era of supercomputing for your next era of science, discovery, and achievement. Rethought and re-engineered, they are an entirely new solution to address today’s diversifying needs. Hardware and software innovations tackle challenges that emerge when core counts increase, compute node architectures proliferate, and workflows expand to incorporate AI at scale.