DSTK - DataScience ToolKit is free, open-source software for statistical analysis, data visualization, text analysis, and predictive analytics. A newer version with a smaller file size is available at: https://sourceforge.net/projects/dstk3/

It is designed to be straightforward and easy to use, and familiar to SPSS users. While JASP offers more statistical features, DSTK aims to be a broad workbench that also covers text analysis and predictive analytics. You can of course still use JASP for advanced data editing and RapidMiner for advanced prediction modeling.

DSTK is written in C#, Java, and Python, and interfaces with R, NLTK, and Weka. It can be extended with plugins written as R scripts. We have also created plugins for additional statistical functions and for big data analytics on Microsoft Azure HDInsight (Spark server) through Livy.
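As a rough illustration of what talking to a Spark cluster through Livy involves, the sketch below builds (but does not send) the two REST requests Livy expects: one to create an interactive session and one to run a statement in it. This is not DSTK's plugin code. The cluster URL, credentials, session id `0`, and the helper name `build_livy_request` are all hypothetical, and the `X-Requested-By` header is included on the assumption that the server has CSRF protection enabled.

```python
import base64
import json
import urllib.request


def build_livy_request(base_url, path, payload, user, password):
    """Build (but do not send) a JSON POST request for a Livy REST endpoint."""
    body = json.dumps(payload).encode("utf-8")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=base_url.rstrip("/") + path,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": "Basic " + token,  # HTTP basic auth (assumed)
            "X-Requested-By": user,  # assumed: needed when CSRF protection is on
        },
    )


# Hypothetical cluster name and credentials, for illustration only.
LIVY_URL = "https://example-cluster.azurehdinsight.net/livy"

# Step 1: ask Livy for an interactive SparkR session.
create_session = build_livy_request(
    LIVY_URL, "/sessions", {"kind": "sparkr"}, "admin", "secret"
)

# Step 2: submit R code to that session (session id 0 is assumed here;
# in practice it comes from the response to step 1).
run_statement = build_livy_request(
    LIVY_URL, "/sessions/0/statements", {"code": "summary(faithful)"}, "admin", "secret"
)
```

Passing either request to `urllib.request.urlopen` would send it; a real client would also poll the session and statement URLs until each reports that it has finished.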

License: R, RStudio, NLTK, SciPy, SKLearn, MatPlotLib, Weka, ... each has its own license.

Features

  • Data Scraping (Web Scraping, Video2Text, Image2Text)
  • Data and Text Preprocessing (stemming, stopword removal, ...)
  • Data Exploration and Visualizations (histogram, bar, pie, boxplot, ...)
  • Document Clustering
  • Text Analytics (Text Link Analysis, POS Tagging, Sentiment Analysis, ...)
  • Predictive Analytics (both numerical and text; Naive Bayes, with additional Weka add-ins)
  • Plugins with Big Data features (requires a Microsoft Azure account)
  • Expandable with Plugins using R Scripts
  • Text Explorer/Analytics uses GATE's Gazetteer .lst files and online university sentiment word lists
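
To make the text preprocessing and word-list sentiment features above more concrete, here is a minimal sketch in plain Python. It is not DSTK's implementation: the stopword set is a tiny sample, `crude_stem` is a simplistic stand-in for a real stemmer such as the ones in NLTK, and the word lists are assumed to contain one term per line (any extra annotations a real .lst file may carry are not handled). All function names are hypothetical.

```python
import re

# Tiny illustrative stopword set; a real list is much larger.
STOPWORDS = {"the", "a", "an", "is", "was", "and", "but", "of", "to"}


def crude_stem(token):
    """Strip a few common English suffixes (a stand-in for a real stemmer)."""
    for suffix in ("ing", "ed", "ly", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Lowercase, tokenize, drop stopwords, and stem the remaining tokens."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [crude_stem(t) for t in tokens if t not in STOPWORDS]


def parse_word_list(lst_text):
    """Parse word-list text (assumed one term per line) into stemmed terms."""
    return {
        crude_stem(line.strip().lower())
        for line in lst_text.splitlines()
        if line.strip()
    }


def sentiment_score(text, positive, negative):
    """Count positive-list hits minus negative-list hits in the text."""
    tokens = preprocess(text)
    return sum(t in positive for t in tokens) - sum(t in negative for t in tokens)


# Hypothetical word lists, inlined here instead of being read from files.
positive = parse_word_list("good\nexcellent\nloved\n")
negative = parse_word_list("bad\nboring\n")

score = sentiment_score(
    "The plot was boring but the acting was excellent and I loved the ending",
    positive,
    negative,
)
```

Stemming the word lists with the same function as the text keeps the two sides comparable, so "loved" in a list still matches "loved" in a document after both are reduced to the same stem. A positive `score` suggests more positive than negative terms were found.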

License

GNU General Public License version 3.0 (GPLv3)

Additional Project Details

Operating Systems

Windows

Intended Audience

Engineering, Financial and Insurance Industry, Information Technology, Management, Non-Profit Organizations, Science/Research

User Interface

.NET/Mono

Programming Language

C#, Java, Python

Related Categories

C# Business Software, C# Business Intelligence Software, C# Machine Learning Software, C# Data Analytics Tool, C# Web Scrapers, Python Business Software, Python Business Intelligence Software, Python Machine Learning Software, Python Data Analytics Tool, Python Web Scrapers, Java Business Software, Java Business Intelligence Software, Java Machine Learning Software, Java Data Analytics Tool, Java Web Scrapers

Registered

2017-04-26