Showing 469 open source projects for "python data analysis"

View related business solutions
  • Next-Gen Encryption for Post-Quantum Security | CLEAR by Quantum Knight Icon
    Next-Gen Encryption for Post-Quantum Security | CLEAR by Quantum Knight

    Lock Down Any Resource, Anywhere, Anytime

    CLEAR by Quantum Knight is a FIPS-140-3 validated encryption SDK engineered for enterprises requiring top-tier security. Offering robust post-quantum cryptography, CLEAR secures files, streaming media, databases, and networks with ease across over 30 modern platforms. Its compact design, smaller than a single smartphone image, ensures maximum efficiency and low energy consumption.
    Learn More
  • The Most Powerful Software Platform for EHSQ and ESG Management Icon
    The Most Powerful Software Platform for EHSQ and ESG Management

    Addresses the needs of small businesses and large global organizations with thousands of users in multiple locations.

    Choose from a complete set of software solutions across EHSQ that address all aspects of top performing Environmental, Health and Safety, and Quality management programs.
    Learn More
  • 1
    Mist Cloud Management Platform

    Mist Cloud Management Platform

    Mist is an open source, multicloud management platform

    Mist CE is an open-source multi-cloud management platform, offering unified control and monitoring for hybrid and multi-cloud environments.
    Downloads: 5 This Week
    Last Update:
    See Project
  • 2
    douyin

    douyin

    Open source Douyin crawler for collecting and downloading public data

    DouyinCrawler is an open source data collection tool designed to gather publicly available information from the Douyin platform. It demonstrates how to build a Python-based web crawler combined with a graphical interface and command line functionality. It allows users to collect data from various types of Douyin content, including user profiles, videos, hashtags, and music pages.
    Downloads: 5 This Week
    Last Update:
    See Project
  • 3
    waitress

    waitress

    A WSGI server for Python 3

    Waitress is meant to be a production-quality pure-Python WSGI server with very acceptable performance. It has no dependencies except ones which live in the Python standard library. It runs on CPython on Unix and Windows under Python 3.7+. It is also known to run on PyPy 3 (python version 3.7+) on UNIX. It supports HTTP/1.0 and HTTP/1.1. Waitress now validates that chunked encoding extensions are valid, and don't contain invalid characters that are not allowed. ...
    Downloads: 5 This Week
    Last Update:
    See Project
  • 4
    Helium Browser

    Helium Browser

    Private, fast, and honest web browser

    Helium is a Chromium-based web browser designed to deliver privacy, speed, and simplicity by removing Google’s proprietary services, telemetry, and bloat. It’s built atop ungoogled-chromium, extending its philosophy with additional privacy features, design refinements, and user experience improvements aimed at transparency and control. Helium blocks ads and trackers by default through an integrated, unbiased uBlock Origin extension prepackaged as a native browser component. Its UI and...
    Downloads: 89 This Week
    Last Update:
    See Project
  • Simplify Purchasing For Your Business Icon
    Simplify Purchasing For Your Business

    Manage what you buy and how you buy it with Order.co, so you have control over your time and money spent.

    Simplify every aspect of buying for your business in Order.co. From sourcing products to scaling purchasing across locations to automating your AP and approvals workstreams, Order.co is the platform of choice for growing businesses.
    Learn More
  • 5
    SearXNG

    SearXNG

    Free internet metasearch engine which aggregates

    SearXNG is a free and open-source metasearch engine designed to aggregate results from multiple search engines while prioritizing user privacy and anonymity. Instead of maintaining its own index, it queries numerous external search providers and merges the results into a single interface, increasing coverage and diversity of information. One of its core principles is privacy, as it does not track users, store personal data, or create search profiles, making it a strong alternative to...
    Downloads: 21 This Week
    Last Update:
    See Project
  • 6
    Linkedin Scraper

    Linkedin Scraper

    A library that scrapes Linkedin for user data

    Linkedin Scraper is a library that scrapes Linkedin for user data. Version 2.0.0 and before is called linkedin_user_scraper and can be installed via pip3 install --user linkedin_user_scraper. The reason is that LinkedIn has recently blocked people from viewing certain profiles without having previously signed in. So by setting scrape=False, it doesn't automatically scrape the profile, but Chrome will open the linkedin page anyways. You can login and logout, and the cookie will stay in the...
    Downloads: 8 This Week
    Last Update:
    See Project
  • 7
    CloudQuery

    CloudQuery

    The open-source cloud asset inventory powered by SQL

    ...Integrate CloudQuery with your current visualization, monitoring, and alerting such as Grafana. CloudQuery supports the TimescaleDB PostgreSQL extension, giving you full historical snapshots of your cloud asset inventory. Data analysis, security, auditing, and compliance. Leverage SQL to get visibility into your cloud infrastructure and SaaS applications. Build a cloud-asset inventory across any of our supported official or community providers.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 8
    SPX

    SPX

    A simple & straight-to-the-point PHP profiling extension

    ...Multi metrics capable: 22 are currently supported (various time & memory metrics, included files, objects in use, I/O...). Able to collect data without losing context. For example Xhprof (and potentially its forks) aggregates data per caller / callee pairs, which implies the loss of the full call stack and forbids timeline or Flamegraph based analysis.
    Downloads: 6 This Week
    Last Update:
    See Project
  • 9
    FEAPDER

    FEAPDER

    Powerful Python crawler framework for scalable web scraping tasks

    feapder is a Python-based web crawling framework designed to simplify the process of building scalable and efficient web scrapers. It focuses on providing a developer-friendly environment that makes it easier to create, run, and manage crawlers for a variety of data collection tasks. It includes several built-in spider types, such as AirSpider, Spider, TaskSpider, and BatchSpider, which address different crawling scenarios ranging from lightweight scraping to distributed and batch-based jobs. feapder supports features such as breakpoint resume, allowing crawlers to continue from where they stopped without losing progress. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • Outbound sales software Icon
    Outbound sales software

    Unified cloud-based platform for dialing, emailing, appointment scheduling, lead management and much more.

    Adversus is an outbound dialing solution that helps you streamline your call strategies, automate manual processes, and provide valuable insights to improve your outbound workflows and efficiency.
    Learn More
  • 10
    OpenWPM

    OpenWPM

    A web privacy measurement framework

    OpenWPM is a web privacy measurement framework that makes it easy to collect data for privacy studies on a scale of thousands to millions of websites. OpenWPM is built on top of Firefox, with automation provided by Selenium. It includes several hooks for data collection. Check out the instrumentation section below for more details. OpenWPM is tested on Ubuntu 18.04 via TravisCI and is commonly used via the docker container that this repo builds, which is also based on Ubuntu. Although we...
    Downloads: 7 This Week
    Last Update:
    See Project
  • 11
    Scweet

    Scweet

    Scrape tweets, profiles, followers and following from Twitter/X

    Scweet is a Python-based Twitter/X scraping library and CLI designed to collect tweets, profile timelines, followers, following lists, and user profile data without requiring the official Twitter/X API or a developer account. Instead of depending on deprecated unauthenticated scraping methods, it works by using X’s web GraphQL API together with authenticated browser cookies, which gives it a more current and practical approach for data extraction.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 12
    CyberScraper 2077

    CyberScraper 2077

    A Powerful web scraper powered by LLM | OpenAI, Gemini & Ollama

    CyberScraper 2077 is not just another web scraping tool – it's a glimpse into the future of data extraction. Born from the neon-lit streets of a cyberpunk world, this AI-powered scraper uses OpenAI, Gemini and LocalLLM Models to slice through the web's defenses, extracting the data you need with unparalleled precision and style.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 13
    dude uncomplicated data extraction

    dude uncomplicated data extraction

    dude uncomplicated data extraction: A simple framework

    Dude is a very simple framework for writing web scrapers using Python decorators. The design, inspired by Flask, was to easily build a web scraper in just a few lines of code. Dude has an easy-to-learn syntax. Dude is currently in Pre-Alpha. Please expect breaking changes. You can run your scraper from terminal/shell/command-line by supplying URLs, the output filename of your choice and the paths to your python scripts to dude scrape command.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    Spider

    Spider

    High-performance Rust web crawler and scraper for large-scale data

    ...Spider can operate concurrently across many pages, allowing it to gather large datasets in a short period of time. Spider also provides mechanisms for subscribing to crawl events so developers can process page data such as URLs, status codes, or HTML content as it is discovered. It supports advanced capabilities such as headless browser rendering, background crawling tasks, and configurable rules that control crawl depth or ignored paths. These capabilities make the project suitable for building search indexers, data extraction pipelines, & SEO analysis tools.
    Downloads: 6 This Week
    Last Update:
    See Project
  • 15
    news-please

    news-please

    Python tool for crawling and extracting structured data from news site

    ...It combines several established technologies and libraries to perform web crawling and content extraction, enabling reliable processing across a wide range of news sources. Developers can use the software either as a standalone command line application or integrate it into their own Python applications through its library interface. Extracted article data can be stored in different formats and systems, including JSON files or database-backed storage solutions.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    ScrapydWeb

    ScrapydWeb

    Web app for Scrapyd cluster management

    Web app for Scrapyd cluster management, with support for Scrapy log analysis & visualization. Make sure that Scrapyd has been installed and started on all of your hosts. Start ScrapydWeb via command scrapydweb. (a config file would be generated for customizing settings on the first startup.) Add your Scrapyd servers, both formats of string and tuple are supported, you can attach basic auth for accessing the Scrapyd server, as well as a string for grouping or labeling. You can select any...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 17
    Crawl4AI

    Crawl4AI

    Open-source LLM Friendly Web Crawler & Scraper

    Crawl4AI is a high-performance, AI‑ready web crawler tailored for LLM data ingestion and RAG pipelines. It supports adaptive crawling heuristics (stopping when enough info is gathered), structured markdown output, and high-speed parallel execution. Designed to operate at scale with optional Docker deployment and framework integrations.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    Docker-ELK

    Docker-ELK

    The Elastic stack (ELK) powered by Docker and Compose

    A turnkey Docker Compose stack to spin up the ELK stack (Elasticsearch, Logstash, Kibana) for log collection, analysis, and visualization. Based on official Elastic images and enhanced with configuration defaults optimized for local development and testing.
    Downloads: 8 This Week
    Last Update:
    See Project
  • 19
    CloudEvents

    CloudEvents

    CloudEvents Specification

    Events are everywhere. However, event producers tend to describe events differently. The lack of a common way of describing events means developers must constantly re-learn how to consume events. This also limits the potential for libraries, tooling and infrastructure to aide the delivery of event data across environments, like SDKs, event routers or tracing systems. The portability and productivity we can achieve from event data is hindered overall. CloudEvents is a specification for...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    Tweepy

    Tweepy

    Twitter for Python

    ...Each method can accept various parameters and return responses. When we invoke an API method most of the time returned back to us will be a Tweepy model class instance. This will contain the data returned from Twitter which we can then use inside our application. Models contain the data and some helper methods which we can then use. Tweepy supports both OAuth 1a (application-user) and OAuth 2 (application-only) authentication. Authentication is handled by the tweepy.AuthHandler class.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 21
    Algo VPN

    Algo VPN

    Set of Ansible scripts that simplifies the setup of a personal VPN

    Introducing Algo, a self-hosted personal VPN server designed for ease of deployment and security. Algo automatically deploys an on-demand VPN service in the cloud that is not shared with other users, relies on only modern protocols and ciphers, and includes only the minimal software you need. And it’s free. For anyone who is privacy conscious, travels for work frequently, or can’t afford a dedicated IT department, this one’s for you. Really, the paid-for services are just commercial...
    Downloads: 46 This Week
    Last Update:
    See Project
  • 22
    UTMStack

    UTMStack

    Customizable SIEM and XDR powered by Real-Time correlation

    Welcome to the UTMStack open-source project! UTMStack is a unified threat management platform that merges SIEM (Security Information and Event Management) and XDR (Extended Detection and Response) technologies. Our unique approach allows real-time correlation of log data, threat intelligence, and malware activity patterns from multiple sources, enabling the identification and halting of complex threats that use stealthy techniques. UTMStack stands out in threat prevention by surpassing the...
    Downloads: 6 This Week
    Last Update:
    See Project
  • 23
    hosts

    hosts

    Consolidate and extend hosts files from several well-curated sources

    Consolidating and extending hosts files from several well-curated sources. You can optionally pick extensions to block pornography, social media, and other categories. The unified hosts file is optionally extensible. Extensions are used to include domains by category. Currently, we offer the following categories: fakenews, social, gambling, and porn. Extensions are optional, and can be combined in various ways with the base hosts file. The combined products are stored in the alternates...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 24
    PyFCM

    PyFCM

    Python client for FCM - Firebase Cloud Messaging

    Python client for FCM - Firebase Cloud Messaging (Android, iOS and Web) Firebase Cloud Messaging (FCM) is the new version of GCM. It inherits the reliable and scalable GCM infrastructure, plus new features. GCM users are strongly recommended to upgrade to FCM. Using FCM, you can notify a client app that new email or other data is available to sync.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25
    changedetection.io

    changedetection.io

    The best free open source website change detection and restock service

    Loved by smart shoppers, data journalists, research engineers, data scientists, security researchers, and more. From simply monitoring website pages that have a change (such as watching prices, and restocking notifications), to deep inspection such as PDF text support, JSON and XML monitoring, and extensive text triggers. Monitor out-of-stock products and get alerts when those products are back in stock, get restock alerts via Discord, Slack, email, and many other platforms. Using the...
    Downloads: 13 This Week
    Last Update:
    See Project
MongoDB Logo MongoDB