Search Results for "8-puzzle reinforcement learning python" - Page 3

Showing 183 open source projects for "8-puzzle reinforcement learning python"

1

rLLM

Democratizing Reinforcement Learning for LLMs

rLLM is an open-source framework for post-training language agents with reinforcement learning, that is, using reward signals to fine-tune or adapt large language models (LLMs) into customizable agents for real-world tasks. With rLLM, developers can define custom “agents” and “environments” and then train those agents through reinforcement learning workflows, potentially surpassing what supervised fine-tuning alone provides. The project is designed to...

Downloads: 0 This Week

Last Update: 2025-12-18
See Project
2

Humanoid-Gym

Reinforcement Learning for Humanoid Robot with Zero-Shot Sim2Real

Humanoid-Gym is a reinforcement learning framework designed to train locomotion and control policies for humanoid robots using high-performance simulation environments. The system is built on top of NVIDIA Isaac Gym, which allows large-scale parallel simulation of robotic environments directly on GPU hardware. Its primary goal is to enable efficient training of humanoid robots in simulation while enabling policies to transfer effectively to real-world hardware without additional training....

Downloads: 1 This Week

Last Update: 2026-03-15
See Project
3

verl-agent

Designed for training LLM/VLM agents via RL

verl-agent is an open-source reinforcement learning framework designed to train large language model agents and vision-language model agents for complex interactive environments. Built as an extension of the veRL reinforcement learning infrastructure, the project focuses on enabling scalable training for agents that perform multi-step reasoning and decision-making tasks. The framework supports multi-turn interactions between agents and their environments, allowing the system to receive...

Downloads: 0 This Week

Last Update: 2026-03-10
See Project
4

highway-env

A minimalist environment for decision-making in autonomous driving

HighwayEnv is an OpenAI Gym-compatible environment focused on autonomous driving scenarios. It provides flexible simulations for testing decision-making algorithms in highway, intersection, and merging traffic situations.

Downloads: 1 This Week

Last Update: 2025-10-18
See Project
5

MetaClaw

Just talk to your agent

MetaClaw is an AI or agent-oriented system that appears to focus on advanced control, coordination, or training of autonomous agents, potentially within reinforcement learning or tool-using environments. The project likely emphasizes meta-level reasoning, where agents are not only executing tasks but also adapting their strategies based on feedback and performance signals. It may incorporate mechanisms for learning from interactions, improving decision-making over time, and generalizing...

Downloads: 6 This Week

Last Update: 2026-04-11
See Project
6

Diffusion for World Modeling

Learning agent trained in a diffusion world model

Diffusion for World Modeling is an experimental reinforcement learning system that trains intelligent agents inside a simulated environment generated by a diffusion-based world model. The project introduces the idea of using diffusion models, commonly used for image generation, to simulate the dynamics of an environment and predict future states based on previous observations and actions. Instead of interacting directly with a real environment, the reinforcement learning agent learns within...

Downloads: 0 This Week

Last Update: 2026-03-12
See Project
7

CUDA Agent

Large-Scale Agentic RL for High-Performance CUDA Kernel Generation

CUDA Agent is a research-driven agentic reinforcement learning system designed to automatically generate and optimize high-performance CUDA kernels for GPU workloads. The project addresses the long-standing challenge that efficient CUDA programming typically requires deep hardware expertise by training an autonomous coding agent capable of iterative improvement through execution feedback. Its architecture combines large-scale data synthesis, a skill-augmented CUDA development environment,...

Downloads: 1 This Week

Last Update: 2026-03-03
See Project
8

Youtu-Agent

A simple yet powerful agent framework that delivers with models

Youtu-Agent is an open-source framework developed to simplify the creation, execution, and evaluation of autonomous AI agents. The system focuses on reducing the complexity traditionally involved in configuring large language model agents by providing a modular architecture that separates execution environments, tools, and context management. This structure allows developers to rapidly assemble agent systems capable of performing tasks such as research, file processing, and data analysis....

Downloads: 2 This Week

Last Update: 2026-03-10
See Project
9

R1-V

Witness the aha moment of VLM with less than $3

R1-V is an initiative aimed at enhancing the generalization capabilities of Vision-Language Models (VLMs) through Reinforcement Learning with Verifiable Rewards (RLVR). The project focuses on building a comprehensive framework that emphasizes algorithm enhancement, efficiency optimization, and task diversity to achieve general vision-language intelligence and visual/GUI agents. The team's long-term goal is to contribute impactful open-source research in this domain.

Downloads: 0 This Week

Last Update: 2025-03-19
See Project
10

Reco-papers

Classic papers and resources on recommendation

Reco-papers is a curated repository that collects influential research papers, technical resources, and industry materials related to recommender systems and recommendation algorithms. The project organizes a large body of literature into thematic sections such as classic recommender systems, exploration-exploitation strategies, deep learning–based recommendation models, and cold-start mitigation techniques. It serves as a reference library for researchers and engineers who want to explore...

Downloads: 0 This Week

Last Update: 2026-03-11
See Project
11

ReCall

Learning to Reason with Search for LLMs via Reinforcement Learning

ReCall is an open-source framework designed to train and evaluate language models that can reason through complex problems by interacting with external tools. The project builds on earlier work focused on teaching models how to search for information during reasoning tasks and extends that idea to a broader system where models can call a variety of external tools such as APIs, databases, or computation engines. Instead of relying purely on static knowledge stored inside the model, ReCall...

Downloads: 0 This Week

Last Update: 2026-03-09
See Project
12

RLHF-Reward-Modeling

Recipes to train reward model for RLHF

RLHF-Reward-Modeling is an open-source research framework focused on training reward models used in reinforcement learning from human feedback for large language models. In RLHF pipelines, reward models are responsible for evaluating generated responses and assigning scores that guide the model toward outputs that better match human preferences. The repository provides training recipes and implementations for building reward and preference models using modern machine learning frameworks. It...

Downloads: 0 This Week

Last Update: 2026-03-06
See Project
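
The reward-model role described above (scoring responses so that preferred outputs rank higher) is commonly trained with a pairwise preference loss. The following is a minimal sketch of that general idea in plain Python, assuming a Bradley-Terry style objective; the function names are hypothetical and nothing here is taken from the RLHF-Reward-Modeling repository itself.

```python
import math

def pairwise_preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Bradley-Terry style loss: -log(sigmoid(reward_chosen - reward_rejected)).

    Hypothetical helper for illustration; not the repository's API.
    """
    margin = reward_chosen - reward_rejected
    # -log(sigmoid(m)) == log(1 + exp(-m)); branch to avoid overflow for large |m|.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))

def preference_accuracy(pairs) -> float:
    """Fraction of (chosen, rejected) reward pairs where the chosen reward is higher."""
    correct = sum(1 for chosen, rejected in pairs if chosen > rejected)
    return correct / len(pairs)

# The loss shrinks as the reward model scores the preferred response higher.
well_ranked = pairwise_preference_loss(2.0, 0.0)
mis_ranked = pairwise_preference_loss(0.0, 2.0)
```

Under this assumed objective, a reward model that ranks the human-preferred response above the rejected one incurs a small loss, and a mis-ranked pair incurs a large one.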
13

LiteMultiAgent

The Library for LLM-based multi-agent applications

LiteMultiAgent is a lightweight, extensible library for building LLM-based multi-agent applications, designed for rapid experimentation with how multiple agents coordinate and collaborate on tasks.

Downloads: 0 This Week

Last Update: 2025-03-13
See Project
14

PilottAI

Python framework for building scalable multi-agent systems

PilottAI is a Python framework for building scalable multi-agent systems in which multiple AI agents are orchestrated to work on tasks together.

Downloads: 0 This Week

Last Update: 2025-12-01
See Project
15

Sapiens

High-resolution models for human tasks

Sapiens is a family of human-centric vision models from Meta, designed for high-resolution tasks such as 2D pose estimation, body-part segmentation, depth estimation, and surface normal prediction.

Downloads: 0 This Week

Last Update: 2025-10-07
See Project
16

Minigrid

Simple and easily configurable grid world environments

Minigrid is a lightweight, minimalistic grid-world environment library for reinforcement learning (RL) research. It provides a suite of simple 2D grid-based tasks (e.g., navigating mazes, unlocking doors, carrying keys) where an agent moves in discrete steps and interacts with objects. The design emphasizes speed (agents can run thousands of steps per second), low dependency overhead, and high customizability — making it easy to define new maps, new tasks, or wrappers. It supports the...

Downloads: 1 This Week

Last Update: 2025-11-25
See Project
17

PKU Beaver

Constrained Value Alignment via Safe Reinforcement Learning

PKU Beaver is an open-source research project focused on improving the safety alignment of large language models through reinforcement learning from human feedback under explicit safety constraints. The framework introduces techniques that separate helpfulness and harmlessness signals during training, allowing models to optimize for useful responses while minimizing harmful behavior. To support this process, the project provides datasets containing human-labeled examples that encode both...

Downloads: 0 This Week

Last Update: 2026-03-06
See Project
18

Ring

Ring is a reasoning MoE LLM provided and open-sourced by InclusionAI

Ring is a reasoning Mixture-of-Experts (MoE) large language model (LLM) developed by inclusionAI. It is built from or derived from Ling. Its design emphasizes reasoning, efficiency, and modular expert activation. In its “flash” variant (Ring-flash-2.0), it optimizes inference by activating only a subset of experts. It applies reinforcement learning/reasoning optimization techniques. Its architectures and training approaches are tuned to enable efficient and capable reasoning performance....

Downloads: 0 This Week

Last Update: 2025-09-30
See Project
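
Activating only a subset of experts, as the Ring description mentions, is usually done with top-k routing. The sketch below shows the generic technique in plain Python under simplifying assumptions (scalar inputs, toy experts given as functions, hypothetical names `top_k_route` and `moe_forward`); it is not Ring's actual router.

```python
import math

def top_k_route(router_logits, k):
    """Pick the k highest-scoring experts and softmax-normalize their weights.

    Hypothetical helper for illustration only.
    """
    ranked = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)
    selected = ranked[:k]
    # Softmax over the selected logits only, shifted by the max for stability.
    peak = max(router_logits[i] for i in selected)
    exps = {i: math.exp(router_logits[i] - peak) for i in selected}
    total = sum(exps.values())
    return {i: value / total for i, value in exps.items()}

def moe_forward(x, experts, router_logits, k=2):
    """Weighted sum of the outputs of the selected experts; the rest are never run."""
    weights = top_k_route(router_logits, k)
    return sum(weight * experts[i](x) for i, weight in weights.items())

# Four toy experts; only the two with the highest router logits are evaluated.
toy_experts = [lambda x: x, lambda x: 2 * x, lambda x: 3 * x, lambda x: 4 * x]
toy_logits = [0.1, 2.0, -1.0, 2.0]
output = moe_forward(1.0, toy_experts, toy_logits, k=2)
```

Because unselected experts are skipped entirely, compute per token scales with k rather than with the total number of experts.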
19

AgentEvolver

Towards Efficient Self-Evolving Agent System

AgentEvolver is an open-source research framework for building self-evolving AI agents powered by large language models. The system focuses on improving the efficiency and scalability of training autonomous agents by allowing them to generate tasks, explore environments, and refine strategies without heavy reliance on manually curated datasets. Its architecture combines reinforcement learning with LLM-driven reasoning mechanisms to guide exploration and learning. The framework introduces...

Downloads: 0 This Week

Last Update: 2026-03-28
See Project
20

Tongyi DeepResearch

Tongyi Deep Research, the Leading Open-source Deep Research Agent

DeepResearch (Tongyi DeepResearch) is an open-source “deep research agent” developed by Alibaba’s Tongyi Lab designed for long-horizon, information-seeking tasks. It’s built to act like a research agent: synthesizing, reasoning, retrieving information via the web and documents, and backing its outputs with evidence. The model is about 30.5 billion parameters in size, though at any given token only ~3.3B parameters are active. It uses a mix of synthetic data generation, fine-tuning and...

Downloads: 0 This Week

Last Update: 2026-02-27
See Project
21

Pearl

A Production-ready Reinforcement Learning AI Agent Library

Pearl is a production-ready reinforcement learning and contextual bandit agent library built for real-world sequential decision making. It is organized around modular components—policy learners, replay buffers, exploration strategies, safety modules, and history summarizers—that snap together to form reliable agents with clear boundaries and strong defaults. The library implements classic and modern algorithms across two regimes: contextual bandits (e.g., LinUCB, LinTS, SquareCB, neural...

Downloads: 1 This Week

Last Update: 2 days ago
See Project
22

Unsloth Studio

Unified web UI for training and running open models locally

Unsloth Studio is a web-based interface for running and training AI models locally with a unified and user-friendly experience. It allows users to work with a wide range of models for text, audio, vision, embeddings, and more without relying heavily on cloud infrastructure. Built on top of the Unsloth framework, it focuses on high-performance training with reduced VRAM usage and faster speeds compared to traditional methods. The platform supports fine-tuning, pretraining, and reinforcement...

Downloads: 14 This Week

Last Update: 2026-04-08
See Project
23

SWIFT LLM

Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs

SWIFT LLM is a comprehensive framework developed within the ModelScope ecosystem for training, fine-tuning, evaluating, and deploying large language models and multimodal models. The platform provides a full machine learning pipeline that supports tasks ranging from model pre-training to reinforcement learning alignment techniques. It integrates with popular inference engines such as vLLM and LMDeploy to accelerate deployment and runtime performance. The framework also includes support for...

Downloads: 3 This Week

Last Update: 5 days ago
See Project
24

Agent Behavior Monitoring

The open source post-building layer for agents

Agent Behavior Monitoring is an open-source framework designed to monitor, evaluate, and improve the behavior of AI agents operating in real or simulated environments. The system focuses on agent behavior monitoring by collecting interaction data and analyzing how agents perform across different scenarios and tasks. Developers can use the framework to observe agent actions in both online production environments and offline evaluation settings, making it useful for debugging and performance...

Downloads: 3 This Week

Last Update: 2026-04-09
See Project
25

slime LLM

slime is an LLM post-training framework for RL Scaling

slime is an open-source large language model (LLM) post-training framework developed to support reinforcement learning (RL)-based scaling and high-performance training workflows for advanced LLMs, blending training and rollout modules into an extensible system. It offers a flexible architecture that connects high-throughput training (e.g., via Megatron-LM) with a customizable data generation pipeline, enabling researchers and engineers to iterate on new RL training paradigms effectively. The...

Downloads: 3 This Week

Last Update: 2026-03-29
See Project

© 2026 Slashdot Media. All Rights Reserved.
