Search Results for "parallel language" - Page 4

Showing 139 open source projects for "parallel language"

  • 1
    text-dedup

    All-in-one text de-duplication

    ...This is especially useful for NLP tasks where duplicated training data can skew model performance. text-dedup scales to billions of documents and offers tools for chunking, hashing, and comparing text efficiently with low memory usage. It supports Jaccard similarity thresholding, parallel execution, and flexible deduplication strategies, making it ideal for cleaning web-scraped data, language model training datasets, or document archives.
    Downloads: 0 This Week
  • 2
    Transducers.jl

    Efficient transducers for Julia

    Transducers are transformations of a "sequence" of inputs that can be composed very efficiently. The interface used by transducers naturally describes a wide range of processes that are expressible as a succession of steps. Furthermore, transducers can be defined without specifying the details of the input and output (collections, streams, channels, etc.) and therefore achieve full reusability. Transducers were introduced by Rich Hickey, the creator of the Clojure language. His Strange Loop...
    Downloads: 2 This Week
  • 3
    Nunjucks

    A rich and powerful templating language for JavaScript

    A powerful templating engine with inheritance, asynchronous control, and more (jinja2 inspired). You've been looking for a more sophisticated templating engine for JavaScript. Here it is. Rich: a powerful language with block inheritance, autoescaping, macros, asynchronous control, and more, heavily inspired by jinja2. Fast & Lean: high-performance, with a small 8K gzipped runtime and precompiled templates in the browser. Crazy extensible with custom filters and extensions. Everywhere: available in node...
    Downloads: 1 This Week
  • 4
    GPT-NeoX

    Implementation of model parallel autoregressive transformers on GPUs

    This repository records EleutherAI's library for training large-scale language models on GPUs. Our current framework is based on NVIDIA's Megatron Language Model and has been augmented with techniques from DeepSpeed as well as some novel optimizations. We aim to make this repo a centralized and accessible place to gather techniques for training large-scale autoregressive language models, and accelerate research into large-scale training. For those looking for a TPU-centric codebase, we...
    Downloads: 0 This Week
  • 5
    TextBox

    A text generation library with pre-trained language models

    TextBox 2.0 is an up-to-date text generation library based on Python and PyTorch focusing on building a unified and standardized pipeline for applying pre-trained language models to text generation. From a task perspective, we consider 13 common text generation tasks such as translation, story generation, and style transfer, and their corresponding 83 widely-used datasets. From a model perspective, we incorporate 47 pre-trained language models/modules covering the categories of general, translation, Chinese, dialogue, controllable, distilled, prompting, and lightweight models (modules). ...
    Downloads: 0 This Week
  • 6
    OK

    Welcome to the future of programming languages

    ...The language emphasises readability and pushing logic out into functions so cases remain simple. It includes concurrency via a map function that executes callbacks in parallel. The project is illustrative of Duffield’s vision: code should feel “magical to write” by removing what is unnecessary.
    Downloads: 0 This Week
  • 7
    Fairseq

    Facebook AI Research Sequence-to-Sequence Toolkit written in Python

    Fairseq(-py) is a sequence modeling toolkit that allows researchers and developers to train custom models for translation, summarization, language modeling and other text generation tasks. We provide reference implementations of various sequence modeling papers. Recent work by Microsoft and Google has shown that data parallel training can be made significantly more efficient by sharding the model parameters and optimizer state across data parallel workers. These ideas are encapsulated in the new FullyShardedDataParallel (FSDP) wrapper provided by fairscale. ...
    Downloads: 0 This Week
  • 8
    Mycat2

    MySQL Proxy using Java NIO based on Sharding SQL, Calcite

    ...Supports parallel fetching of result sets, automatic forwarding of back-end result sets, and multiple routing and optimizer annotations. The requested SQL is parameterized and its physical execution plan is cached, so later requests with the same parameterized SQL skip part of the analysis and optimization work.
    Downloads: 3 This Week
  • 9
    GPT Neo

    An implementation of model parallel GPT-2 and GPT-3-style models

    An implementation of model & data parallel GPT3-like models using the mesh-tensorflow library. If you're just here to play with our pre-trained models, we strongly recommend you try out the HuggingFace Transformer integration. Training and inference are officially supported on TPU and should work on GPU as well. This repository will be (mostly) archived as we move focus to our GPU-specific repo, GPT-NeoX. NB: while neo can technically run a training step at 200B+ parameters, it is very...
    Downloads: 1 This Week
  • 10
    FARM

    Fast & easy transfer learning for NLP

    ...With FARM you can build fast proofs-of-concept for tasks like text classification, NER or question answering and transfer them easily into production. Easy fine-tuning of language models to your task and domain language. AMP optimizers (~35% faster) and parallel preprocessing (16 CPU cores => ~16x faster). Modular design of language models and prediction heads. Switch between heads or combine them for multitask learning. Full Compatibility with HuggingFace Transformers' models and model hub. Smooth upgrading to newer language models. ...
    Downloads: 0 This Week
  • 11
    XLM (Cross-lingual Language Model)

    PyTorch original implementation of Cross-lingual Language Model

    XLM (Cross-lingual Language Model) is a family of multilingual pretraining methods that align representations across languages to enable strong zero-shot transfer. It popularized objectives such as Masked Language Modeling (MLM) across many languages and Translation Language Modeling (TLM), which trains jointly on parallel sentence pairs to tighten cross-lingual alignment.
    Downloads: 0 This Week
  • 12
    gluon

    A static, type-inferred, and embeddable language written in Rust

    ...Marshalling values to and from gluon requires next to no boilerplate, allowing functions defined in Rust to be directly passed to gluon. Gluon supports Unicode out of the box with utf-8 encoded strings and Unicode codepoints as characters. Gluon is a garbage-collected language but uses a separate heap for each executing gluon thread. This keeps each heap small, reducing the overhead of the garbage collector. Gluon is written in Rust, which guarantees thread safety. Gluon keeps the same guarantees, allowing multiple gluon programs to run in parallel.
    Downloads: 0 This Week
  • 13
    Accelerate

    Embedded language for high-performance array computations

    ...Embedded in the advanced functional programming language Haskell, Accelerate programs are declarative, statically-typed, pure, functional, and ready to exploit all of the performance of modern parallel hardware. The combination of a strong type system, high-level code, and an interactive development environment allows you to develop code quickly with the confidence that it is correct.
    Downloads: 0 This Week
  • 14
    Saturn

    VIP.com's distributed job scheduling platform

    Saturn is a platform created by VIP.com to provide a distributed, fault-tolerant, and highly available job scheduling service.
    Downloads: 0 This Week
  • 15
    Idris-dev

    A Dependently Typed Functional Programming Language

    Idris‑dev is the development version of Idris 1, a general-purpose functional programming language featuring full dependent types, designed for writing type-safe programs and proofs within the language itself. It compiles to C and JavaScript (for Node.js and browsers), and supports code generation via substitute backends. This repository represents the latest development version of the language, and may contain bugs that are being actively worked on. For those who wish to use a more stable...
    Downloads: 0 This Week
  • 16
    Cubature.jl

    One- and multi-dimensional adaptive integration routines for Julia

    This module provides one- and multi-dimensional adaptive integration routines for the Julia language, including support for vector-valued integrands and for parallel evaluation of integrands, based on the Cubature Package by Steven G. Johnson. Adaptive integration works by evaluating the integrand at more and more points until the integral estimate converges to a specified tolerance (with the error estimated by comparing integral estimates with different numbers of points). ...
    Downloads: 4 This Week
  • 17
    Weld

    High-performance runtime for data analytics applications

    Weld is a programming language and runtime designed to improve the performance of data-intensive applications by optimizing computations across multiple libraries. Instead of optimizing individual functions independently, Weld introduces an intermediate representation that allows different frameworks to share optimization opportunities. This approach reduces data movement between libraries and enables the system to generate highly optimized machine code for parallel execution. ...
    Downloads: 0 This Week
  • 18
    YouTubeCrawler

    Go-based automation utility that downloads YouTube videos

    This tool is a Go-based automation utility that downloads YouTube videos and permanently embeds or “hard-codes” their subtitles (typically English) into MP4 output files. The workflow involves specifying one or more URLs (via a simple “url” text file in each folder); the program then uses youtube-dl to fetch the video and subtitles, and ffmpeg to overlay the subtitles onto the video track. The architecture follows a command-pattern setup: tasks implement a common interface and are scheduled and...
    Downloads: 1 This Week
  • 19
    UnsupervisedMT

    Phrase-Based & Neural Unsupervised Machine Translation

    ...The project also provides scripts to fetch and preprocess monolingual data, learn BPE codes, and train cross-lingual embeddings that bootstrap unsupervised alignment between languages. Beyond the core EMNLP 2018 setup, the codebase exposes additional, optional capabilities such as multi-language training, language model pretraining with shared parameters, and adversarial training.
    Downloads: 1 This Week
  • 20
    Pipelines

    An experimental programming language for data flow

    Pipelines is a language and runtime for crafting massively parallel pipelines. Unlike other languages for defining data flow, the Pipeline language requires the implementation of components to be defined separately in the Python scripting language. This allows the details of implementations to be separated from the structure of the pipeline while providing access to thousands of active libraries for machine learning, data analysis, and processing.
    Downloads: 0 This Week
  • 21
    Skylark

    Skylark in Go: the Skylark configuration language

    Skylark, now known as Starlark, is an interpreter for a Python-like language implemented in Go. It is designed as a lightweight, deterministic, and embeddable configuration and scripting language ideal for use within larger applications. Skylark maintains Python’s familiar syntax and high-level data types while omitting features that could cause nondeterminism, such as concurrency and dynamic module imports.
    Downloads: 3 This Week
  • 22
    OpenSeq2Seq

    Toolkit for efficient experimentation with Speech Recognition

    ...Its core goal is to give researchers a flexible, modular framework for building and training encoder–decoder architectures while fully leveraging distributed and mixed-precision training. The toolkit includes ready-made models for neural machine translation, automatic speech recognition, speech synthesis, language modeling, and additional NLP tasks such as sentiment analysis. It supports multi-GPU and multi-node data-parallel training, and integrates with Horovod to scale out across large GPU clusters. Mixed-precision support (float16) is optimized for NVIDIA Volta and Turing GPUs, allowing significant speedups and memory savings without sacrificing model quality. ...
    Downloads: 0 This Week
  • 23
    TensorFlow-ZH

    Chinese version of the official document of TensorFlow

    The tensorflow-zh repository is a Chinese translation of the official TensorFlow documentation, organized to make the core guides, tutorials, and reference material accessible to Chinese speakers. It was initiated shortly after TensorFlow’s open-sourcing, with translation and proofreading contributions from a community of volunteers who aimed to bridge the language barrier for learners in China and other Mandarin communities. The repo mirrors the structure of the original English docs:...
    Downloads: 0 This Week
  • 24
    popt4jlib

    Parallel Optimization Library for Java

    popt4jlib is an open-source parallel optimization library for the Java programming language supporting both shared memory and distributed message passing models. It implements a number of meta-heuristic algorithms for Non-Linear Programming, including Genetic Algorithms, Differential Evolution, Evolutionary Algorithms, Simulated Annealing, Particle Swarm Optimization, Firefly Algorithm, Monte-Carlo Search, Local Search algorithms, Gradient-Descent-based algorithms, as well as some well-known network flow and other graph algorithms. ...
    Downloads: 1 This Week
  • 25
    Chapel

    A Productive Parallel Programming Language

    Chapel is an emerging parallel programming language whose design and development are being led by HPE in collaboration with academia, computing labs, and industry. Chapel's goal is to improve the productivity of parallel programmers, from laptops to supercomputers. Please note that Chapel development has moved to GitHub.
    Downloads: 0 This Week