Search Results for "8-puzzle reinforcement learning python" - Page 3

Showing 252 open source projects for "8-puzzle reinforcement learning python"

  • 1
    VectorizedMultiAgentSimulator (VMAS)

    VMAS is a vectorized differentiable simulator

    VectorizedMultiAgentSimulator is a high-performance, vectorized simulator for multi-agent systems, focusing on large-scale agent interactions in shared environments. It is designed for research in multi-agent reinforcement learning, robotics, and autonomous systems where thousands of agents need to be simulated efficiently.
    Downloads: 0 This Week
  • 2
    machine_learning_examples

    A collection of machine learning examples and tutorials

    machine_learning_examples is an open-source repository that provides a large collection of machine learning tutorials and practical code examples. The project aims to teach machine learning concepts through hands-on programming rather than purely theoretical explanations. It includes implementations of many machine learning algorithms and neural network architectures using Python and popular libraries such as TensorFlow and NumPy. The repository covers a wide range of topics including supervised learning, unsupervised learning, reinforcement learning, and natural language processing. ...
    Downloads: 0 This Week
  • 3
    D4RL

    Collection of reference environments for offline reinforcement learning

    D4RL (Datasets for Deep Data-Driven Reinforcement Learning) is a benchmark suite focused on offline reinforcement learning — i.e., learning policies from fixed datasets rather than via online interaction with the environment. It contains standardized environments, tasks and datasets (observations, actions, rewards, terminals) aimed at enabling reproducible research in offline RL. Researchers can load a dataset for a given task (e.g., maze navigation, manipulation) and apply their algorithm...
    Downloads: 0 This Week
  • 4
    PRIME

    Scalable RL solution for advanced reasoning of language models

    PRIME is an open-source reinforcement learning framework designed to improve the reasoning capabilities of large language models through process-level rewards rather than relying only on final outputs. The system introduces the concept of process reinforcement through implicit rewards, allowing models to receive feedback on intermediate reasoning steps instead of evaluating only the final answer. This approach helps models learn better reasoning strategies and encourages them to generate...
    Downloads: 1 This Week
  • 5
    DeepSeek R1

    Open-source, high-performance AI model with advanced reasoning

    DeepSeek-R1 is an open-source large language model developed by DeepSeek, designed to excel in complex reasoning tasks across domains such as mathematics, coding, and language. DeepSeek R1 offers unrestricted access for both commercial and academic use. The model employs a Mixture of Experts (MoE) architecture, comprising 671 billion total parameters with 37 billion active parameters per token, and supports a context length of up to 128,000 tokens. DeepSeek-R1's training regimen uniquely...
    Downloads: 63 This Week
  • 6
    DeepSeek-V3

    Powerful AI language model (MoE) optimized for efficiency/performance

    DeepSeek-V3 is a robust Mixture-of-Experts (MoE) language model developed by DeepSeek, featuring a total of 671 billion parameters, with 37 billion activated per token. It employs Multi-head Latent Attention (MLA) and the DeepSeekMoE architecture to enhance computational efficiency. The model introduces an auxiliary-loss-free load balancing strategy and a multi-token prediction training objective to boost performance. Trained on 14.8 trillion diverse, high-quality tokens, DeepSeek-V3...
    Downloads: 117 This Week
  • 7
    rLLM

    Democratizing Reinforcement Learning for LLMs

    rLLM is an open-source framework for post-training language agents via reinforcement learning, that is, using reinforcement signals to fine-tune or adapt language models (LLMs) into customizable agents for real-world tasks. With rLLM, developers can define custom “agents” and “environments,” and then train those agents via reinforcement learning workflows, possibly surpassing what vanilla fine-tuning or supervised learning might provide. The project is designed to...
    Downloads: 0 This Week
  • 8
    Hello Python

    Comprehensive tutorial repository aimed at teaching the Python programming language

    Hello-Python is a comprehensive tutorial repository aimed at teaching the Python programming language from scratch for beginners. It includes over 100 classes and about 44 hours of video instruction, combined with code samples, projects, and a chat community for support. The material covers the fundamentals—variables, data types, loops, functions—as well as intermediate topics like date handling, list comprehensions, file IO, regular expressions, modules, and packages. The course is designed...
    Downloads: 1 This Week
  • 9
    The Arcade Learning Environment

    The Arcade Learning Environment (ALE) -- a platform for AI research

    Arcade Learning Environment (ALE) is a widely used open-source framework that wraps hundreds of Atari 2600 games via an emulator and presents them as RL environments for AI agents. It decouples the game/emulation aspects from the agent interface, providing a clean API (C++, Python, Gymnasium) so researchers can focus on agent design rather than game plumbing. This environment suite has been central to many RL breakthroughs, including value-based agents, deep Q-nets, and general-agent...
    Downloads: 0 This Week
  • 10
    verl-agent

    Designed for training LLM/VLM agents via RL

    verl-agent is an open-source reinforcement learning framework designed to train large language model agents and vision-language model agents for complex interactive environments. Built as an extension of the veRL reinforcement learning infrastructure, the project focuses on enabling scalable training for agents that perform multi-step reasoning and decision-making tasks. The framework supports multi-turn interactions between agents and their environments, allowing the system to receive...
    Downloads: 0 This Week
  • 11
    Humanoid-Gym

    Reinforcement Learning for Humanoid Robot with Zero-Shot Sim2Real

    Humanoid-Gym is a reinforcement learning framework designed to train locomotion and control policies for humanoid robots using high-performance simulation environments. The system is built on top of NVIDIA Isaac Gym, which allows large-scale parallel simulation of robotic environments directly on GPU hardware. Its primary goal is to make training in simulation efficient while letting the resulting policies transfer to real-world hardware without additional training....
    Downloads: 0 This Week
  • 12
    Complete-Python-3-Bootcamp

    Course Files for Complete Python 3 Bootcamp Course on Udemy

    The Complete-Python-3-Bootcamp repository is an educational resource created by Pierian Data as part of their popular Python for Data Science and Machine Learning Bootcamp course. It contains a comprehensive collection of Jupyter Notebooks designed to teach Python programming from the ground up. The repository covers a wide range of Python topics, including data types, control flow, functions, object-oriented programming, error handling, modules, and advanced concepts like decorators and...
    Downloads: 4 This Week
  • 13
    Diffusion for World Modeling

    Learning agent trained in a diffusion world model

    Diffusion for World Modeling is an experimental reinforcement learning system that trains intelligent agents inside a simulated environment generated by a diffusion-based world model. The project introduces the idea of using diffusion models, commonly used for image generation, to simulate the dynamics of an environment and predict future states based on previous observations and actions. Instead of interacting directly with a real environment, the reinforcement learning agent learns within...
    Downloads: 0 This Week
  • 14
    MetaClaw

    Just talk to your agent

    MetaClaw is an AI or agent-oriented system that appears to focus on advanced control, coordination, or training of autonomous agents, potentially within reinforcement learning or tool-using environments. The project likely emphasizes meta-level reasoning, where agents are not only executing tasks but also adapting their strategies based on feedback and performance signals. It may incorporate mechanisms for learning from interactions, improving decision-making over time, and generalizing...
    Downloads: 4 This Week
  • 15
    highway-env

    A minimalist environment for decision-making in autonomous driving

    HighwayEnv is an OpenAI Gym-compatible environment focused on autonomous driving scenarios. It provides flexible simulations for testing decision-making algorithms in highway, intersection, and merging traffic situations.
    Downloads: 0 This Week
  • 16
    RLax

    Library of JAX-based building blocks for reinforcement learning agents

    RLax (pronounced “relax”) is a JAX-based library developed by Google DeepMind that provides reusable mathematical building blocks for constructing reinforcement learning (RL) agents. Rather than implementing full algorithms, RLax focuses on the core functional operations that underpin RL methods—such as computing value functions, returns, policy gradients, and loss terms—allowing researchers to flexibly assemble their own agents. It supports both on-policy and off-policy learning, as well as...
    Downloads: 0 This Week
  • 17
    R1-V

    Witness the aha moment of VLM with less than $3

    R1-V is an initiative aimed at enhancing the generalization capabilities of Vision-Language Models (VLMs) through Reinforcement Learning in Visual Reasoning (RLVR). The project focuses on building a comprehensive framework that emphasizes algorithm enhancement, efficiency optimization, and task diversity to achieve general vision-language intelligence and visual/GUI agents. The team's long-term goal is to contribute impactful open-source research in this domain.
    Downloads: 0 This Week
  • 18
    Reco-papers

    Classic papers and resources on recommendation

    Reco-papers is a curated repository that collects influential research papers, technical resources, and industry materials related to recommender systems and recommendation algorithms. The project organizes a large body of literature into thematic sections such as classic recommender systems, exploration-exploitation strategies, deep learning–based recommendation models, and cold-start mitigation techniques. It serves as a reference library for researchers and engineers who want to explore...
    Downloads: 0 This Week
  • 19
    DreamerV3

    Mastering Diverse Domains through World Models

    DreamerV3 is an open-source implementation of a reinforcement learning algorithm that uses world models to train intelligent agents capable of learning complex behaviors across many environments. The system works by building an internal model of the environment and then using that model to simulate possible future outcomes of actions, allowing the agent to learn from imagined experiences rather than only from real interactions. This approach enables the algorithm to efficiently learn...
    Downloads: 0 This Week
  • 20
    ReCall

    Learning to Reason with Search for LLMs via Reinforcement Learning

    ReCall is an open-source framework designed to train and evaluate language models that can reason through complex problems by interacting with external tools. The project builds on earlier work focused on teaching models how to search for information during reasoning tasks and extends that idea to a broader system where models can call a variety of external tools such as APIs, databases, or computation engines. Instead of relying purely on static knowledge stored inside the model, ReCall...
    Downloads: 0 This Week
  • 21
    RLHF-Reward-Modeling

    Recipes to train reward model for RLHF

    RLHF-Reward-Modeling is an open-source research framework focused on training reward models used in reinforcement learning from human feedback for large language models. In RLHF pipelines, reward models are responsible for evaluating generated responses and assigning scores that guide the model toward outputs that better match human preferences. The repository provides training recipes and implementations for building reward and preference models using modern machine learning frameworks. It...
    Downloads: 0 This Week
  • 22
    CUDA Agent

    Large-Scale Agentic RL for High-Performance CUDA Kernel Generation

    CUDA Agent is a research-driven agentic reinforcement learning system designed to automatically generate and optimize high-performance CUDA kernels for GPU workloads. The project addresses the long-standing challenge that efficient CUDA programming typically requires deep hardware expertise by training an autonomous coding agent capable of iterative improvement through execution feedback. Its architecture combines large-scale data synthesis, a skill-augmented CUDA development environment,...
    Downloads: 0 This Week
  • 23
    Recursive Language Models

    General plug-and-play inference library for Recursive Language Models

    RLM (short for Reinforcement Learning Models) is a modular framework that makes it easier to build, train, evaluate, and deploy reinforcement learning (RL) agents across a wide range of environments and tasks. It provides a consistent API that abstracts away many of the repetitive engineering patterns in RL research and application work, letting developers focus on modeling, experimentation, and fine-tuning rather than infrastructure plumbing. Within the framework, you can define custom...
    Downloads: 0 This Week
  • 24
    Qbot

    AI-powered Quantitative Investment Research Platform

    Qbot is an open source quantitative research and trading platform that provides a full pipeline from data ingestion and strategy development to backtesting, simulation, and (optionally) live trading. It bundles a lightweight GUI client (built with wxPython) and a modular backend so researchers can iterate on strategies, run batch backtests, and validate ideas in a near-real simulated environment that models latency and slippage. The project places special emphasis on AI-driven strategies —...
    Downloads: 39 This Week
  • 25
    PyBoy

    Game Boy emulator written in Python

    PyBoy is an open-source Game Boy emulator written in Python, designed for both gameplay and AI experimentation. It allows users to run classic Game Boy games while providing a powerful API for automation, scripting, and reinforcement learning. Developers can interact directly with game memory, inputs, and screen data, making it ideal for training bots and analyzing game mechanics. PyBoy emphasizes performance, enabling accelerated emulation speeds and frame skipping for large-scale simulations. ...
    Downloads: 0 This Week