Showing 272 open source projects for "text code"

  • 1
    dots.ocr

    Multilingual Document Layout Parsing in a Single Vision-Language Model

    ...Beyond standard OCR tasks, it extends its capabilities to parse complex visual elements such as charts, diagrams, and web interfaces, converting them into structured outputs like SVG code.
    Downloads: 1 This Week
  • 2
    LongBench

    LongBench v2 and LongBench (ACL 25'&24')

    ...Traditional language model benchmarks typically evaluate tasks involving relatively short inputs, which does not reflect many real-world applications such as analyzing large documents or entire code repositories. LongBench addresses this gap by providing datasets that require models to process and reason over long sequences of text across multiple tasks. The benchmark includes multiple categories such as single-document question answering, multi-document reasoning, summarization, long dialogue understanding, and code analysis. ...
    Downloads: 0 This Week
  • 3
    DocETL

    A system for agentic LLM-powered data processing and ETL

    ...The platform allows developers and researchers to construct structured workflows that extract, transform, and organize information from sources such as reports, transcripts, legal documents, and other text-heavy data. Instead of relying on single prompts or ad-hoc scripts, DocETL provides a declarative pipeline framework that breaks complex document analysis tasks into manageable operations that can be optimized and orchestrated automatically. Pipelines are typically defined using a low-code YAML interface, giving users full control over prompts and processing steps while still simplifying workflow creation.
    Downloads: 0 This Week
  • 4
    InfiniteYou

    Flexible Photo Recrafting While Preserving Your Identity

    InfiniteYou is an open-source image-generation and “identity-preserving image editing / generation” framework from ByteDance, designed to generate high-fidelity images that preserve a subject’s identity while allowing flexible editing or re-creation according to textual prompts. Using an architecture built around diffusion transformers (DiTs), InfiniteYou introduces a component called InfuseNet that injects identity features derived from reference images into the generation process — via...
    Downloads: 0 This Week
  • 5
    OpenAI DALL·E AsyncImage SwiftUI

    OpenAI swift async text to image for SwiftUI app using OpenAI

    SwiftUI views that asynchronously load and display an OpenAI image from the OpenAI API. You just type in your idea and AI will give you an art solution. DALL-E and DALL-E 2 are deep learning models developed by OpenAI to generate digital images from natural language descriptions, called "prompts". You need to have Xcode 13 installed in order to have access to Documentation Compiler (DocC). OpenAI's text-to-image model DALL-E 2 is a recent example of diffusion models. It uses diffusion models for...
    Downloads: 0 This Week
  • 6
    Vidi2

    Large Multimodal Models for Video Understanding and Editing

    Vidi is a family of large multimodal models developed for deep video understanding and editing tasks, integrating vision, audio, and language to allow sophisticated querying and manipulation of video content. It’s designed to process long-form, real-world videos and answer complex queries such as “when in this clip does X happen?” or “where in the frame is object Y during that moment?” — offering temporal retrieval, spatio-temporal grounding (i.e. locating objects over time + space), and...
    Downloads: 0 This Week
  • 7
    files-to-prompt

    Concatenate a directory full of files into a single prompt

    ...It includes rich filtering controls, letting you limit by extension, include or skip hidden files, and ignore paths that match glob patterns or .gitignore rules. The output format is flexible: you can emit plain text, Markdown with fenced code blocks, or a Claude-XML style format designed for structured multi-file prompts. It can read file paths from stdin (including NUL-separated paths), which makes it easy to combine with find, rg, or other shell tools.
    Downloads: 0 This Week
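The concatenation idea behind files-to-prompt can be illustrated with a minimal Python sketch. This is not the tool's implementation or its command-line interface: the function name `files_to_prompt`, its parameters, and the `---` separators are assumptions made for illustration. The sketch walks a directory, filters by extension, skips hidden files, applies glob-style ignore patterns, and joins each file's relative path and contents into one string.

```python
import fnmatch
import os
import tempfile


def files_to_prompt(root, extensions=None, include_hidden=False, ignore_patterns=()):
    """Concatenate text files under `root` into one prompt string.

    Hypothetical helper for illustration; names and output layout are assumptions.
    """
    sections = []
    for dirpath, dirnames, filenames in os.walk(root):
        if not include_hidden:
            # Prune hidden directories in place so os.walk does not descend into them.
            dirnames[:] = [d for d in dirnames if not d.startswith(".")]
        dirnames.sort()  # deterministic traversal order
        for name in sorted(filenames):
            if not include_hidden and name.startswith("."):
                continue
            if extensions and not name.endswith(tuple(extensions)):
                continue
            if any(fnmatch.fnmatch(name, pattern) for pattern in ignore_patterns):
                continue
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8") as handle:
                    body = handle.read()
            except UnicodeDecodeError:
                continue  # skip files that are not valid UTF-8 text
            relative = os.path.relpath(path, root)
            sections.append(f"{relative}\n---\n{body}\n---")
    return "\n\n".join(sections)


if __name__ == "__main__":
    # Build a throwaway directory so the demo does not depend on local files.
    with tempfile.TemporaryDirectory() as demo_root:
        with open(os.path.join(demo_root, "example.py"), "w", encoding="utf-8") as f:
            f.write("print('hello')\n")
        print(files_to_prompt(demo_root, extensions=[".py"]))
```

Keeping the relative path next to each file body lets a model refer back to individual files by name; a Markdown or XML-style layout would only change the string built for each section.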
  • 8
    Biomni

    Biomni: a general-purpose biomedical AI agent

    Biomni is a general-purpose biomedical AI agent designed to autonomously perform complex research tasks across a wide range of scientific domains, combining language model reasoning with structured planning and execution. It integrates retrieval-augmented generation with code-based execution, allowing it to access external knowledge, process data, and generate testable hypotheses in scientific workflows. The system is built to support researchers by automating repetitive and time-consuming tasks such as literature review, data analysis, and experimental design. Biomni operates within a comprehensive environment that includes tools, APIs, and datasets, enabling it to execute multi-step research processes rather than just generating text responses. ...
    Downloads: 0 This Week
  • 9
    Ollama JavaScript Library

    Ollama JavaScript library

    Ollama JavaScript is the official JavaScript client for integrating Ollama into JS and TS applications with a lightweight, developer-friendly API. It is designed around the Ollama REST API, so it feels consistent with the platform while making common tasks easier to handle in application code. The library supports standard chat interactions, text generation, embeddings, and model management, which makes it useful for both simple chat interfaces and more advanced AI-powered workflows. It works in Node.js and also supports browser usage through a dedicated browser import, which broadens where it can be deployed. Streaming responses are built in, returning an async generator so applications can render output progressively instead of waiting for a full response. ...
    Downloads: 0 This Week
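Progressive rendering from an async generator, as described for the Ollama JavaScript library, follows a pattern that can be sketched independently of the library. The Python below does not call Ollama and does not reproduce its API: `fake_stream` is a hypothetical stand-in for a streamed response, and `render` shows a consumer that handles each part as soon as it arrives while also collecting the full text.

```python
import asyncio


async def fake_stream(text, delay=0.0):
    """Hypothetical stand-in for a streamed model response: yields one word at a time."""
    for word in text.split():
        await asyncio.sleep(delay)  # simulates waiting for the next part
        yield word + " "


async def render(stream, sink):
    """Pass each part to `sink` as soon as it arrives, then return the full text."""
    received = []
    async for part in stream:
        sink(part)  # e.g. append to a UI element or write to stdout
        received.append(part)
    return "".join(received)


if __name__ == "__main__":
    # Print parts as they arrive instead of waiting for the whole response.
    asyncio.run(render(fake_stream("streamed output appears piece by piece", delay=0.05),
                       lambda part: print(part, end="", flush=True)))
    print()
```

The consumer never buffers the whole response before showing anything, which is the benefit the description attributes to streaming.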
  • 10
    Beads

    A memory upgrade for your coding agent

    ...By leveraging Git as the storage backbone, the project ensures that memory is persistent, diffable, and sharable, with the ability to roll back, branch, or merge memory states just like source code.
    Downloads: 6 This Week
  • 11
    MobileCLIP

    Implementation of "MobileCLIP" CVPR 2024

    MobileCLIP is a family of efficient image-text embedding models designed for real-time, on-device retrieval and zero-shot classification. The repo provides training, inference, and evaluation code for MobileCLIP models trained on DataCompDR, and for newer MobileCLIP2 models trained on DFNDR. It includes an iOS demo app and Core ML artifacts to showcase practical, offline photo search and classification on iPhone-class hardware.
    Downloads: 0 This Week
  • 12
    Milvus Bootcamp

    Dealing with all unstructured data, such as reverse image search

    Milvus Bootcamp is a collection of tutorials, examples, and best practices for using Milvus, an open-source vector database designed for AI-powered similarity search and retrieval applications.
    Downloads: 0 This Week
  • 13
    Trae Agent

    LLM-based agent for general purpose software engineering tasks

    Trae Agent is an open-source, LLM-based agent system developed by ByteDance, focused primarily on automating software engineering workflows. It provides a command-line interface (CLI) that accepts natural-language instructions (e.g. “refactor this module,” “write a unit test,” “generate a REST API skeleton”), and then orchestrates tool-based workflows (such as file editing, shell/batch commands, code generation, code formatting or refactoring) to carry out complex engineering tasks....
    Downloads: 0 This Week
  • 14
    MegaTTS 3

    Official PyTorch Implementation

    MegaTTS3 is an open-source text-to-speech (TTS) and voice-cloning system from ByteDance that aims to deliver high-quality, expressive speech synthesis, including zero-shot voice cloning of previously unseen speakers. Its backbone is a lightweight diffusion transformer (roughly 0.45B parameters), which enables efficient inference while still producing high-fidelity audio.
    Downloads: 0 This Week
  • 15
    LLM Foundry

    LLM training code for MosaicML foundation models

    Introducing MPT-7B, the first entry in our MosaicML Foundation Series. MPT-7B is a transformer trained from scratch on 1T tokens of text and code. It is open source, available for commercial use, and matches the quality of LLaMA-7B. MPT-7B was trained on the MosaicML platform in 9.5 days with zero human intervention at a cost of ~$200k. Large language models (LLMs) are changing the world, but for those outside well-resourced industry labs, it can be extremely difficult to train and deploy these models. ...
    Downloads: 0 This Week
  • 16
    Laminar

    Open-source all-in-one platform for engineering AI products

    Laminar is an open source all-in-one platform for engineering best-in-class LLM products. Data governs the quality of your LLM application. Laminar helps you collect it, understand it, and use it. When you trace your LLM application, you get a clear picture of every step of execution and simultaneously collect invaluable data. You can use it to set up better evaluations, as dynamic few-shot examples, and for fine-tuning. All traces are sent in the background via gRPC with minimal overhead....
    Downloads: 7 This Week
  • 17
    wa-automate-nodejs

    WhatsApp tool for chatbots with advanced features

    wa-automate-nodejs is the most advanced NodeJS library which provides a high-level API to control WA. Want to convert your WA account to an API instantly? You can now with the CLI. For more details see Easy API. After executing the create() function, @open-wa/wa-automate will create an instance of WA web. If you are not logged in, it will print a QR code in the terminal. Scan it with your phone and you are ready to go! @open-wa/wa-automate will remember the session so there is no need to...
    Downloads: 0 This Week
  • 18
    Advanced NLP with spaCy

    Advanced NLP with spaCy: A free online course

    ...The course is designed to teach developers how to build real-world NLP systems by combining rule-based techniques with machine learning models. The repository includes lessons, exercises, and examples that guide learners through tasks such as tokenization, named entity recognition, text classification, and training custom NLP models. It also demonstrates how spaCy pipelines work and how developers can extend them with custom components and training data. The course is structured as a hands-on learning environment where students can run code examples, experiment with NLP techniques, and build practical language processing applications. ...
    Downloads: 0 This Week
  • 19
    Sygil WebUI

    Stable Diffusion web UI

    Sygil WebUI is a browser-based interface for running Stable Diffusion image generation locally or on a server, wrapping common text-to-image and image-to-image workflows into a practical UI. It provides multiple UI modes (including a legacy Gradio interface) and focuses on making iterative prompting, parameter tuning, and post-processing accessible without writing code. The UI exposes core generation controls like resolution, CFG guidance, sampling steps, samplers, seeds, and batch generation so users can reproduce results and refine outputs systematically. ...
    Downloads: 0 This Week
  • 20
    LLM TLDR

    95% token savings. 155x faster queries. 16 languages

    LLM TLDR is a tool that leverages large language models (LLMs) to generate concise, coherent summaries (TL;DRs) of long documents, articles, or text files, helping users quickly understand large amounts of content without reading every word. It integrates with LLM APIs to handle input texts of varying lengths and complexity, applying techniques like chunking, context management, and multi-pass summarization to preserve accuracy even when the source is very large. The system supports both...
    Downloads: 0 This Week
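Chunking and multi-pass summarization, as mentioned in the LLM TLDR description, can be sketched generically in Python. This is an illustration under stated assumptions, not the project's code: `chunk_text` and `summarize_long` are hypothetical names, chunk sizes are measured in characters rather than tokens, and `summarize` is any caller-supplied function that would wrap an LLM call.

```python
def chunk_text(text, max_chars=1000, overlap=100):
    """Split text into chunks of at most max_chars, each sharing `overlap` chars with the previous one."""
    if overlap >= max_chars:
        raise ValueError("overlap must be smaller than max_chars")
    chunks = []
    step = max_chars - overlap
    for start in range(0, len(text), step):
        chunks.append(text[start:start + max_chars])
        if start + max_chars >= len(text):
            break  # the last chunk already reaches the end of the text
    return chunks


def summarize_long(text, summarize, max_chars=1000, overlap=100):
    """Summarize chunk by chunk, then summarize the joined summaries until the text fits in one chunk."""
    while len(text) > max_chars:
        partial = [summarize(chunk) for chunk in chunk_text(text, max_chars, overlap)]
        merged = "\n".join(partial)
        if len(merged) >= len(text):
            break  # the summarizer is not shrinking the text; stop to avoid looping forever
        text = merged
    return summarize(text)


if __name__ == "__main__":
    # Stand-in summarizer (an assumption): keeps the first 40 characters of its input.
    def toy_summarize(passage):
        return passage[:40]

    print(summarize_long("lorem ipsum " * 500, toy_summarize))
```

The overlap keeps sentences that straddle a chunk boundary visible to both neighbouring chunks, at the cost of summarizing a little text twice.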
  • 21
    Lingvo

    Framework for building neural networks

    Lingvo is a TensorFlow based framework focused on building and training sequence models, especially for language and speech tasks. It was originally developed for internal research and later open sourced to support reproducible experiments and shared model implementations. The framework provides a structured way to define models, input pipelines, and training configurations using a common interface for layers, which encourages reuse across different tasks. It has been used to implement state...
    Downloads: 0 This Week
  • 22
    agentation

    The visual feedback tool for agents

    Agentation is a visual annotation and feedback tool designed to make interacting with AI coding agents more intuitive and precise by letting developers visually click on frontend elements in a browser and annotate them with context before sending structured feedback to an agent. Instead of describing UI elements in text — like “the blue button in the sidebar” — users click directly on elements to automatically capture selectors, positions, and contextual metadata that can be consumed by AI agents to locate exact code references. This approach dramatically improves clarity and reduces ambiguity when working with AI tools that generate or modify UI code, making the handoff between human design intent and AI execution much clearer. ...
    Downloads: 0 This Week
  • 23
    GLM-4.6V

    GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning

    GLM-4.6V represents the latest generation of the GLM-V family and marks a major step forward in multimodal AI by combining advanced vision-language understanding with native “tool-call” capabilities, long-context reasoning, and strong generalization across domains. Unlike many vision-language models that treat images and text separately or require intermediate conversions, GLM-4.6V allows inputs such as images, screenshots or document pages directly as part of its reasoning pipeline — and...
    Downloads: 0 This Week
  • 24
    Step-Audio-EditX

    LLM-based Reinforcement Learning audio edit model

    Step-Audio-EditX is an open-source, 3-billion-parameter audio model from StepFun AI designed to make expressive and precise editing of speech and audio as easy as text editing. Rather than treating audio editing as low-level waveform manipulation, this model converts speech into a sequence of discrete “audio tokens” (via a dual-codebook tokenizer), combining a linguistic token stream and a semantic (prosody/emotion/style) token stream, thereby abstracting audio editing into high-level...
    Downloads: 0 This Week
  • 25
    PPTAgent

    PPTAgent: Generating and Evaluating Presentations

    PPTAgent is a research system for generating and evaluating slide decks that goes beyond simple text-to-slides. It follows a two-stage, edit-based workflow: first it analyzes reference presentations to infer slide roles and structure, then it drafts an outline and iteratively performs editing actions to produce new slides. The project includes both the generation agent and an evaluation framework, PPTEval, to score content quality, design, and coherence. The repository highlights the EMNLP...
    Downloads: 3 This Week