Search Results for "8-puzzle reinforcement learning python" - Page 3

Showing 230 open source projects for "8-puzzle reinforcement learning python"

1

PRIME

Scalable RL solution for advanced reasoning of language models

PRIME is an open-source reinforcement learning framework designed to improve the reasoning capabilities of large language models through process-level rewards rather than relying only on final outputs. The system introduces the concept of process reinforcement through implicit rewards, allowing models to receive feedback on intermediate reasoning steps instead of evaluating only the final answer. This approach helps models learn better reasoning strategies and encourages them to generate...

Downloads: 1 This Week

Last Update: 2026-03-06
2

DeepSeek R1

Open-source, high-performance AI model with advanced reasoning

DeepSeek-R1 is an open-source large language model developed by DeepSeek, designed to excel in complex reasoning tasks across domains such as mathematics, coding, and language. DeepSeek R1 offers unrestricted access for both commercial and academic use. The model employs a Mixture of Experts (MoE) architecture, comprising 671 billion total parameters with 37 billion active parameters per token, and supports a context length of up to 128,000 tokens. DeepSeek-R1's training regimen uniquely...

1 Review

Downloads: 87 This Week

Last Update: 2025-07-09
3

DeepSeek-V3

Powerful AI language model (MoE) optimized for efficiency/performance

DeepSeek-V3 is a robust Mixture-of-Experts (MoE) language model developed by DeepSeek, featuring a total of 671 billion parameters, with 37 billion activated per token. It employs Multi-head Latent Attention (MLA) and the DeepSeekMoE architecture to enhance computational efficiency. The model introduces an auxiliary-loss-free load balancing strategy and a multi-token prediction training objective to boost performance. Trained on 14.8 trillion diverse, high-quality tokens, DeepSeek-V3...

1 Review

Downloads: 155 This Week

Last Update: 2025-07-09
4

rLLM

Democratizing Reinforcement Learning for LLMs

rLLM is an open-source framework for post-training language agents via reinforcement learning, that is, using reinforcement signals to fine-tune or adapt large language models (LLMs) into customizable agents for real-world tasks. With rLLM, developers can define custom “agents” and “environments,” and then train those agents via reinforcement learning workflows, possibly surpassing what vanilla fine-tuning or supervised learning might provide. The project is designed to...

Downloads: 0 This Week

Last Update: 2025-12-18
5

Humanoid-Gym

Reinforcement Learning for Humanoid Robot with Zero-Shot Sim2Real

Humanoid-Gym is a reinforcement learning framework designed to train locomotion and control policies for humanoid robots using high-performance simulation environments. The system is built on top of NVIDIA Isaac Gym, which allows large-scale parallel simulation of robotic environments directly on GPU hardware. Its primary goal is to make training humanoid robots in simulation efficient while letting the resulting policies transfer to real-world hardware without additional training...

Downloads: 1 This Week

Last Update: 2026-03-15
6

verl-agent

Designed for training LLM/VLM agents via RL

verl-agent is an open-source reinforcement learning framework designed to train large language model agents and vision-language model agents for complex interactive environments. Built as an extension of the veRL reinforcement learning infrastructure, the project focuses on enabling scalable training for agents that perform multi-step reasoning and decision-making tasks. The framework supports multi-turn interactions between agents and their environments, allowing the system to receive...

Downloads: 0 This Week

Last Update: 2026-03-10
7

highway-env

A minimalist environment for decision-making in autonomous driving

HighwayEnv is an OpenAI Gym-compatible environment focused on autonomous driving scenarios. It provides flexible simulations for testing decision-making algorithms in highway, intersection, and merging traffic situations.

Downloads: 1 This Week

Last Update: 2025-10-18
8

PyBoy

Game Boy emulator written in Python

PyBoy is an open-source Game Boy emulator written in Python, designed for both gameplay and AI experimentation. It allows users to run classic Game Boy games while providing a powerful API for automation, scripting, and reinforcement learning. Developers can interact directly with game memory, inputs, and screen data, making it ideal for training bots and analyzing game mechanics. PyBoy emphasizes performance, enabling accelerated emulation speeds and frame skipping for large-scale simulations. ...

Downloads: 6 This Week

Last Update: 2026-01-24
9

Diffusion for World Modeling

Learning agent trained in a diffusion world model

Diffusion for World Modeling is an experimental reinforcement learning system that trains intelligent agents inside a simulated environment generated by a diffusion-based world model. The project introduces the idea of using diffusion models, commonly used for image generation, to simulate the dynamics of an environment and predict future states based on previous observations and actions. Instead of interacting directly with a real environment, the reinforcement learning agent learns within...

Downloads: 0 This Week

Last Update: 2026-03-12
10

MetaClaw

Just talk to your agent

MetaClaw is an AI or agent-oriented system that appears to focus on advanced control, coordination, or training of autonomous agents, potentially within reinforcement learning or tool-using environments. The project likely emphasizes meta-level reasoning, where agents are not only executing tasks but also adapting their strategies based on feedback and performance signals. It may incorporate mechanisms for learning from interactions, improving decision-making over time, and generalizing...

Downloads: 6 This Week

Last Update: 2026-04-11
11

RLax

Library of JAX-based building blocks for reinforcement learning agents

RLax (pronounced “relax”) is a JAX-based library developed by Google DeepMind that provides reusable mathematical building blocks for constructing reinforcement learning (RL) agents. Rather than implementing full algorithms, RLax focuses on the core functional operations that underpin RL methods—such as computing value functions, returns, policy gradients, and loss terms—allowing researchers to flexibly assemble their own agents. It supports both on-policy and off-policy learning, as well as...

Downloads: 0 This Week

Last Update: 2025-10-09
12

CUDA Agent

Large-Scale Agentic RL for High-Performance CUDA Kernel Generation

CUDA Agent is a research-driven agentic reinforcement learning system designed to automatically generate and optimize high-performance CUDA kernels for GPU workloads. The project addresses the long-standing challenge that efficient CUDA programming typically requires deep hardware expertise by training an autonomous coding agent capable of iterative improvement through execution feedback. Its architecture combines large-scale data synthesis, a skill-augmented CUDA development environment,...

Downloads: 1 This Week

Last Update: 2026-03-03
13

R1-V

Witness the aha moment of VLM with less than $3

R1-V is an initiative aimed at enhancing the generalization capabilities of Vision-Language Models (VLMs) through Reinforcement Learning in Visual Reasoning (RLVR). The project focuses on building a comprehensive framework that emphasizes algorithm enhancement, efficiency optimization, and task diversity to achieve general vision-language intelligence and visual/GUI agents. The team's long-term goal is to contribute impactful open-source research in this domain.

Downloads: 0 This Week

Last Update: 2025-03-19
14

Youtu-Agent

A simple yet powerful agent framework that delivers with models

Youtu-Agent is an open-source framework developed to simplify the creation, execution, and evaluation of autonomous AI agents. The system focuses on reducing the complexity traditionally involved in configuring large language model agents by providing a modular architecture that separates execution environments, tools, and context management. This structure allows developers to rapidly assemble agent systems capable of performing tasks such as research, file processing, and data analysis....

Downloads: 2 This Week

Last Update: 2026-03-10
15

Reco-papers

Classic papers and resources on recommendation

Reco-papers is a curated repository that collects influential research papers, technical resources, and industry materials related to recommender systems and recommendation algorithms. The project organizes a large body of literature into thematic sections such as classic recommender systems, exploration-exploitation strategies, deep learning–based recommendation models, and cold-start mitigation techniques. It serves as a reference library for researchers and engineers who want to explore...

Downloads: 0 This Week

Last Update: 2026-03-11
16

DreamerV3

Mastering Diverse Domains through World Models

DreamerV3 is an open-source implementation of a reinforcement learning algorithm that uses world models to train intelligent agents capable of learning complex behaviors across many environments. The system works by building an internal model of the environment and then using that model to simulate possible future outcomes of actions, allowing the agent to learn from imagined experiences rather than only from real interactions. This approach enables the algorithm to efficiently learn...

Downloads: 0 This Week

Last Update: 2026-03-13
17

ReCall

Learning to Reason with Search for LLMs via Reinforcement Learning

ReCall is an open-source framework designed to train and evaluate language models that can reason through complex problems by interacting with external tools. The project builds on earlier work focused on teaching models how to search for information during reasoning tasks and extends that idea to a broader system where models can call a variety of external tools such as APIs, databases, or computation engines. Instead of relying purely on static knowledge stored inside the model, ReCall...

Downloads: 0 This Week

Last Update: 2026-03-09
18

RLHF-Reward-Modeling

Recipes to train reward model for RLHF

RLHF-Reward-Modeling is an open-source research framework focused on training reward models used in reinforcement learning from human feedback for large language models. In RLHF pipelines, reward models are responsible for evaluating generated responses and assigning scores that guide the model toward outputs that better match human preferences. The repository provides training recipes and implementations for building reward and preference models using modern machine learning frameworks. It...

Downloads: 0 This Week

Last Update: 2026-03-06
19

Recursive Language Models

General plug-and-play inference library for Recursive Language Models

RLM (short for Recursive Language Models) is a general plug-and-play inference library for Recursive Language Models, an inference approach in which a language model can recursively call itself or other models over parts of its input rather than processing the whole context in a single pass.

Downloads: 0 This Week

Last Update: 2026-02-18
20

Qbot

AI-powered Quantitative Investment Research Platform

Qbot is an open source quantitative research and trading platform that provides a full pipeline from data ingestion and strategy development to backtesting, simulation, and (optionally) live trading. It bundles a lightweight GUI client (built with wxPython) and a modular backend so researchers can iterate on strategies, run batch backtests, and validate ideas in a near-real simulated environment that models latency and slippage. The project places special emphasis on AI-driven strategies —...

Downloads: 31 This Week

Last Update: 2025-11-03
21

NVIDIA Isaac Lab

Unified framework for robot learning built on NVIDIA Isaac Sim

Isaac Lab is an open-source modular robotics learning framework built atop Isaac Sim. It simplifies research workflows across reinforcement learning, imitation learning, and motion planning by offering robust, GPU-accelerated simulation with realistic sensor and physics fidelity, which makes it well suited to sim-to-real robot training. It is compatible with, and optimized for, Isaac Sim releases such as 5.0 and 4.5. GPU-accelerated, high-fidelity physics and sensor simulation suitable for complex learning...

Downloads: 8 This Week

Last Update: 2026-04-09
22

LiteMultiAgent

The Library for LLM-based multi-agent applications

LiteMultiAgent is a lightweight and extensible library for building LLM-based multi-agent applications, designed for rapid experimentation. It lets researchers design and test coordination and collaboration among multiple agents.

Downloads: 0 This Week

Last Update: 2025-03-13
23

PilottAI

Python framework for building scalable multi-agent systems

PilottAI is a Python framework for building scalable multi-agent systems in which autonomous agents coordinate to carry out tasks.

Downloads: 0 This Week

Last Update: 2025-12-01
24

Sapiens

High-resolution models for human tasks

Sapiens is a family of high-resolution vision models from Meta for human-centric tasks such as 2D pose estimation, body-part segmentation, depth estimation, and surface normal prediction.

Downloads: 0 This Week

Last Update: 2025-10-07
25

Minigrid

Simple and easily configurable grid world environments

Minigrid is a lightweight, minimalistic grid-world environment library for reinforcement learning (RL) research. It provides a suite of simple 2D grid-based tasks (e.g., navigating mazes, unlocking doors, carrying keys) where an agent moves in discrete steps and interacts with objects. The design emphasizes speed (agents can run thousands of steps per second), low dependency overhead, and high customizability — making it easy to define new maps, new tasks, or wrappers. It supports the...

Downloads: 1 This Week

Last Update: 2025-11-25