SourceForge Open Source Mirror Directory
Data-Juicer Activity
Data processing for and with foundation models
Activity for Data-Juicer
1 month ago
Data-Juicer
released
/v1.5.1/py_data_juicer-1.5.1-py3-none-any.whl
1 month ago
Data-Juicer
released
/v1.5.1/Release v1.5.1_ LaTeX OPs_ Compressed Format Support_ Operator Robustness Fixes source code.zip
1 month ago
Data-Juicer
released
/v1.5.1/Release v1.5.1_ LaTeX OPs_ Compressed Format Support_ Operator Robustness Fixes source code.tar.gz
1 month ago
Data-Juicer
released
/v1.5.1/README.md
2 months ago
Data-Juicer
released
/v1.5.0/py_data_juicer-1.5.0-py3-none-any.whl
2 months ago
Data-Juicer
released
/v1.5.0/Release v1.5.0_ Partitioned Ray Executor_ Embodied-AI OPs_ OP-level Env Management source code.zip
2 months ago
Data-Juicer
released
/v1.5.0/Release v1.5.0_ Partitioned Ray Executor_ Embodied-AI OPs_ OP-level Env Management source code.tar.gz
2 months ago
Data-Juicer
released
/v1.5.0/README.md
3 months ago
Data-Juicer
released
/v1.4.6/py_data_juicer-1.4.6-py3-none-any.whl
3 months ago
Data-Juicer
released
/v1.4.6/Release v1.4.6_ introduce Q_A Copilot_ Video bytes I_O_ Tracer for Ray mode source code.zip
3 months ago
Data-Juicer
released
/v1.4.6/Release v1.4.6_ introduce Q_A Copilot_ Video bytes I_O_ Tracer for Ray mode source code.tar.gz
3 months ago
Data-Juicer
released
/v1.4.6/README.md
3 months ago
Data-Juicer
released
/v1.4.5/py_data_juicer-1.4.5-py3-none-any.whl
3 months ago
Data-Juicer
released
/v1.4.5/Release v1.4.5_ Embodied-AI OPs_ Doc System Upgrading source code.zip
3 months ago
Data-Juicer
released
/v1.4.5/Release v1.4.5_ Embodied-AI OPs_ Doc System Upgrading source code.tar.gz
3 months ago
Data-Juicer
released
/v1.4.5/README.md
5 months ago
Data-Juicer
released
/v1.4.4/py_data_juicer-1.4.4-py3-none-any.whl
5 months ago
Data-Juicer
released
/v1.4.4/Release v1.4.4_ NeurIPS 2025 Spotlight_ New Video _ Multimodal Ops_ Repo Reorganization_ S3 I_O Support source code.zip
5 months ago
Data-Juicer
released
/v1.4.4/README.md
5 months ago
Data-Juicer
released
/v1.4.4/Release v1.4.4_ NeurIPS 2025 Spotlight_ New Video _ Multimodal Ops_ Repo Reorganization_ S3 I_O Support source code.tar.gz
7 months ago
Data-Juicer
released
/v1.4.3/Release v1.4.3_ OP Doc Enhancement_ Optimized Auto Parallelism_ Optimized Sandbox source code.zip
7 months ago
Data-Juicer
released
/v1.4.3/py_data_juicer-1.4.3-py3-none-any.whl
7 months ago
Data-Juicer
released
/v1.4.3/Release v1.4.3_ OP Doc Enhancement_ Optimized Auto Parallelism_ Optimized Sandbox source code.tar.gz
7 months ago
Data-Juicer
released
/v1.4.3/README.md
8 months ago
Data-Juicer
released
/v1.4.2/Release v1.4.2_ Python _ 3.10 are supported_ Data Attribution OPs_ External OPs are supported_ Install with _uv_ source code.zip
8 months ago
Data-Juicer
released
/v1.4.2/py_data_juicer-1.4.2-py3-none-any.whl
8 months ago
Data-Juicer
released
/v1.4.2/README.md
8 months ago
Data-Juicer
released
/v1.4.2/Release v1.4.2_ Python _ 3.10 are supported_ Data Attribution OPs_ External OPs are supported_ Install with _uv_ source code.tar.gz
9 months ago
Data-Juicer
released
/v1.4.1/py_data_juicer-1.4.1-py3-none-any.whl
9 months ago
Data-Juicer
released
/v1.4.1/README.md
9 months ago
Data-Juicer
released
/v1.4.1/Release v1.4.1_ MCP server_ GPU-based Minhash deduplicator_ Improved unit test coverage. source code.zip
9 months ago
Data-Juicer
released
/v1.4.1/Release v1.4.1_ MCP server_ GPU-based Minhash deduplicator_ Improved unit test coverage. source code.tar.gz
10 months ago
Data-Juicer
released
/v1.4.0/py_data_juicer-1.4.0-py3-none-any.whl
10 months ago
Data-Juicer
released
/v1.4.0/v1.4.0 Major Refactor for Env Management, Doc, Sandbox_ Derivative Works (TPAMI Survey_ Trinity-RFT _ DetailMaster) source code.zip
10 months ago
Data-Juicer
released
/v1.4.0/v1.4.0 Major Refactor for Env Management, Doc, Sandbox_ Derivative Works (TPAMI Survey_ Trinity-RFT _ DetailMaster) source code.tar.gz
10 months ago
Data-Juicer
released
/v1.4.0/README.md
12 months ago
Data-Juicer
released
/v1.3.3/py_data_juicer-1.3.3-py3-none-any.whl
12 months ago
Data-Juicer
released
/v1.3.3/Release v1.3.3_ Sandbox is accepted as Spotlight by ICML 2025_ Add Img-Diff recipes. source code.zip
12 months ago
Data-Juicer
released
/v1.3.3/Release v1.3.3_ Sandbox is accepted as Spotlight by ICML 2025_ Add Img-Diff recipes. source code.tar.gz
12 months ago
Data-Juicer
released
/v1.3.3/README.md
12 months ago
Data-Juicer
released
/v1.3.2/py_data_juicer-1.3.2-py3-none-any.whl
12 months ago
Data-Juicer
released
/v1.3.2/Release v1.3.2_ Enhancements on usability _ two OPs_ some bugs fixes source code.zip
12 months ago
Data-Juicer
released
/v1.3.2/Release v1.3.2_ Enhancements on usability _ two OPs_ some bugs fixes source code.tar.gz
12 months ago
Data-Juicer
updated
/v1.3.2/README.md
12 months ago
Data-Juicer
released
/v1.3.2/README.md
12 months ago
Data-Juicer
released
/v1.3.2/Enhancements on usability _ two OPs_ some bugs fixes source code.zip
12 months ago
Data-Juicer
released
/v1.3.2/Enhancements on usability _ two OPs_ some bugs fixes source code.tar.gz
1 year ago
Data-Juicer
released
/v1.3.1/py_data_juicer-1.3.1-py3-none-any.whl
1 year ago
Data-Juicer
released
/v1.3.1/Release v1.3.1_ added HumanOPs _ fixed some bugs source code.zip
1 year ago
Data-Juicer
released
/v1.3.1/Release v1.3.1_ added HumanOPs _ fixed some bugs source code.tar.gz
1 year ago
Data-Juicer
released
/v1.3.1/README.md
1 year ago
Data-Juicer
released
/v1.3.0/py_data_juicer-1.3.0-py3-none-any.whl
1 year ago
Data-Juicer
released
/v1.3.0/Refactor of dataset builder and executor! source code.zip
1 year ago
Data-Juicer
released
/v1.3.0/README.md
1 year ago
Data-Juicer
released
/v1.3.0/Refactor of dataset builder and executor! source code.tar.gz
1 year ago
Data-Juicer
released
/v1.2.2/py_data_juicer-1.2.2-py3-none-any.whl
1 year ago
Data-Juicer
released
/v1.2.2/Release v1.2.2 source code.zip
1 year ago
Data-Juicer
released
/v1.2.2/Release v1.2.2 source code.tar.gz
1 year ago
Data-Juicer
released
/v1.2.2/README.md
1 year ago
Data-Juicer
released
/v1.2.1/py_data_juicer-1.2.1-py3-none-any.whl
1 year ago
Data-Juicer
released
/v1.2.1/Release v1.2.1 source code.zip
1 year ago
Data-Juicer
released
/v1.2.1/Release v1.2.1 source code.tar.gz
1 year ago
Data-Juicer
released
/v1.2.1/README.md
1 year ago
Data-Juicer
released
/v1.2.0/py_data_juicer-1.2.0-py3-none-any.whl
1 year ago
Data-Juicer
released
/v1.2.0/v1.2.0 Doc refactored_ New algorithm proposed source code.zip
1 year ago
Data-Juicer
released
/v1.2.0/v1.2.0 Doc refactored_ New algorithm proposed source code.tar.gz
1 year ago
Data-Juicer
released
/v1.2.0/README.md
1 year ago
Data-Juicer
released
/v1.1.0/py_data_juicer-1.1.0-py3-none-any.whl
1 year ago
Data-Juicer
released
/v1.1.0/Release v1.1.0 source code.zip
1 year ago
Data-Juicer
released
/v1.1.0/Release v1.1.0 source code.tar.gz
1 year ago
Data-Juicer
released
/v1.1.0/README.md
1 year ago
Data-Juicer
released
/v1.0.3/py_data_juicer-1.0.3-py3-none-any.whl
1 year ago
Data-Juicer
released
/v1.0.3/Release v1.0.3_ More Powerful Distributed MinHashLSH Deduplicator_ Post-Tuning Formats _ OPs_ Ray Actor for GPU-based OPs source code.tar.gz
1 year ago
Data-Juicer
released
/v1.0.3/Release v1.0.3_ More Powerful Distributed MinHashLSH Deduplicator_ Post-Tuning Formats _ OPs_ Ray Actor for GPU-based OPs source code.zip
1 year ago
Data-Juicer
released
/v1.0.3/README.md
1 year ago
Data-Juicer
released
/v1.0.2/py_data_juicer-1.0.2-py3-none-any.whl