
Name	Modified	Size	Downloads / Week
koboldcpp_nocuda.exe	2025-01-23	63.5 MB	1
koboldcpp_cu12.exe	2025-01-23	596.2 MB	0
koboldcpp.exe	2025-01-23	478.1 MB	0
koboldcpp-mac-arm64	2025-01-23	26.3 MB	0
koboldcpp-linux-x64-nocuda	2025-01-23	64.0 MB	0
koboldcpp-linux-x64-cuda1210	2025-01-23	671.5 MB	0
koboldcpp-linux-x64-cuda1150	2025-01-23	586.3 MB	0
koboldcpp_oldcpu.exe	2025-01-23	478.3 MB	0
koboldcpp-1.82.4 source code.tar.gz	2025-01-23	28.0 MB	0
koboldcpp-1.82.4 source code.zip	2025-01-23	28.4 MB	0
README.md	2025-01-23	5.0 kB	0
Totals: 11 Items		3.0 GB	1

koboldcpp-1.82.4

Old kobo yells at cloud edition


NEW: Added OuteTTS for Text-To-Speech: OuteTTS is a text-to-speech model that can be used for narration by generating audio from KoboldCpp.
You need two models, an OuteTTS GGUF and a WavTokenizer GGUF which you can find here.
Once downloaded, load them in the Audio tab or with --ttsmodel and --ttswavtokenizer. You can also use --ttsgpu to load them on the GPU instead, and --ttsthreads to set a custom thread count.
When enabled, this sets up OpenAI Speech API and XTTS API compatibility endpoints, allowing you to easily hook KoboldCpp TTS into existing TTS frontends.
Comes with a set of included voices, as well as New Speaker Synthesis, allowing you to create hundreds of new unique voices just by entering a random name. Read more here.
All OuteTTS GGUF v0.2 and the NEW v0.3 models are supported, including both 500m and 1B models.
Credits to @ggerganov and @edwko for the original upstream implementation
NEW: Bundled GGUF file analyzer: In the GUI Extras tab, or with --analyze, you can now analyze any GGUF file, which will display the metadata and tensor names, dimensions and types within that file.
TAESD is now also available for SD3 and Flux! Enable it with --sdvaeauto or "AutoFix VAE" in the GUI. TAESD is now compressed to fp8, making this VAE only about 3 MB in size.
VAE tiling for image generation can now be disabled with --sdnotile, which fixes the bleeding graphical artifacts on some cards.
Adjusted compatibility build targets: CLBlast (Older CPU) mode no longer requires AVX, providing a good option for very old/cheap systems to still have some level of GPU support. For users with AVX but not AVX2, you can use the Vulkan (Old CPU) mode instead.
mmap is no longer the default option. To enable it, you now need to pass --usemmap or set it in the GUI.
Fix for save file GUI prompt not working
Fix for web browser not launching with --launch in Linux GUI.
Added more GUI slider options for context sizes.
Max supported images per API request for Multimodal Vision is now increased to 8.
Enabled multilingual support for Whisper (Voice Recognition) by setting specific language codes.
KoboldCpp now displays which capabilities and endpoints are enabled on launch.
Available Modules: TextGeneration ImageGeneration VoiceRecognition MultimodalVision NetworkMultiplayer WebSearchProxy TextToSpeech ApiKeyPassword
Available APIs: KoboldCppApi OpenAiApi OllamaApi A1111ForgeApi ComfyUiApi WhisperTranscribeApi XttsApi OpenAiSpeechApi
Updated Kobold Lite, multiple fixes and improvements
Added Whisper language selection: Instead of automatically detecting the speaker language, you can now optionally specify it with a two-character language code (e.g. ja for Japanese, fr for French). This ensures the output is in the right language.
Added Text-To-Speech support for KoboldCpp backend
Merged fixes and improvements from upstream
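
To make the new flags above concrete, here is a minimal Python sketch that assembles a launch command from them. Only the flag names (--ttsmodel, --ttswavtokenizer, --ttsgpu, --ttsthreads, --usemmap, --sdnotile) come from these notes. The helper name `build_launch_args` and the model file names are hypothetical placeholders, and any other launch parameters you need (see --help) would go in `extra`.

```python
import shlex


def build_launch_args(executable, tts_model, wav_tokenizer,
                      tts_gpu=False, tts_threads=None,
                      use_mmap=False, sd_no_tile=False, extra=()):
    """Assemble an argument list from flags named in these release notes.

    The function itself is illustrative; it is not part of KoboldCpp.
    """
    args = [executable,
            "--ttsmodel", tts_model,
            "--ttswavtokenizer", wav_tokenizer]
    if tts_gpu:
        args.append("--ttsgpu")            # load the TTS models on the GPU
    if tts_threads is not None:
        args += ["--ttsthreads", str(tts_threads)]
    if use_mmap:
        args.append("--usemmap")           # mmap is no longer on by default
    if sd_no_tile:
        args.append("--sdnotile")          # disable VAE tiling
    args += list(extra)                    # anything else, see --help
    return args


if __name__ == "__main__":
    # Placeholder file names; substitute the GGUF files you downloaded.
    cmd = build_launch_args("koboldcpp.exe",
                            "outetts-model.gguf",
                            "wavtokenizer-model.gguf",
                            tts_gpu=True, tts_threads=4)
    print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run` as-is, which avoids quoting problems with paths that contain spaces.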

Hotfix 1.82.1: Fixed --analyze, which should now work correctly. Minor fixes to OuteTTS v0.3 handling and updated Lite UI. Whisper now accepts 8-bit and 32-bit wav files, and form data input.
Hotfix 1.82.2: Added support for Deepseek R1 Qwen Distill.
Hotfix 1.82.3: Fixed a TTS crash and CLBlast mislabeling; quiet now overrides debug.
Hotfix 1.82.4: Fixed the deepseek adapter; draft decoding now accepts slightly different vocabs.

To use, download and run the koboldcpp.exe, which is a one-file pyinstaller.
If you don't need CUDA, you can use koboldcpp_nocuda.exe which is much smaller.
If you have an Nvidia GPU, but use an old CPU and koboldcpp.exe does not work, try koboldcpp_oldcpu.exe.
If you have a newer Nvidia GPU, you can use the CUDA 12 version koboldcpp_cu12.exe (much larger, slightly faster).
If you're using Linux, select the appropriate Linux binary file instead (not exe).
If you're on a modern macOS (M1, M2, M3) you can try the koboldcpp-mac-arm64 macOS binary.
If you're using AMD, we recommend trying the Vulkan option (available in all releases) first, for best support. Alternatively, you can try koboldcpp_rocm at YellowRoseCx's fork here
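
The download guidance above can be read as a small decision table. The sketch below encodes it in Python using the file names from this release's listing. The function `choose_binary` and its parameters are hypothetical, and the split between the two Linux CUDA builds (cuda1210 for newer GPUs, cuda1150 otherwise) is an assumption, since the notes only say to pick the appropriate Linux binary.

```python
def choose_binary(os_name, nvidia_gpu=False, newer_nvidia_gpu=False,
                  old_cpu=False):
    """Map a rough system description to a file name from this release.

    A simplification of the guidance in the notes, not an official
    selector. AMD users fall into the non-CUDA branch, where the Vulkan
    option is available.
    """
    if os_name == "windows":
        if not nvidia_gpu:
            return "koboldcpp_nocuda.exe"      # much smaller, no CUDA
        if old_cpu:
            return "koboldcpp_oldcpu.exe"      # Nvidia GPU with an old CPU
        if newer_nvidia_gpu:
            return "koboldcpp_cu12.exe"        # CUDA 12, larger, slightly faster
        return "koboldcpp.exe"
    if os_name == "mac-arm64":
        return "koboldcpp-mac-arm64"           # M1, M2, M3
    if os_name == "linux":
        if not nvidia_gpu:
            return "koboldcpp-linux-x64-nocuda"
        # Assumption: newer GPUs take the CUDA 12.1 build.
        if newer_nvidia_gpu:
            return "koboldcpp-linux-x64-cuda1210"
        return "koboldcpp-linux-x64-cuda1150"
    raise ValueError(f"no prebuilt binary listed for {os_name!r}")


if __name__ == "__main__":
    print(choose_binary("windows", nvidia_gpu=True, newer_nvidia_gpu=True))
```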

Run it from the command line with the desired launch parameters (see --help), or manually select the model in the GUI. Once the model is loaded, you can connect like this (or use the full koboldai client): http://localhost:5001
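
Besides opening that address in a browser, a script can talk to the running server over HTTP. The following is a rough Python sketch using only the standard library. The base URL comes from these notes; the paths `/api/v1/generate` and `/v1/audio/speech`, the payload fields, and the response shape read by `extract_text` are assumptions based on the KoboldAI-style and OpenAI-style APIs that the notes say are exposed, so check them against the API documentation of your running instance. All helper names are hypothetical.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5001"  # address given in the notes


def build_json_request(path, payload, base_url=BASE_URL):
    """Build (but do not send) a JSON POST request for the given path."""
    return urllib.request.Request(
        base_url.rstrip("/") + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate_request(prompt, max_length=64):
    # Assumed KoboldAI-style text generation endpoint and fields.
    return build_json_request("/api/v1/generate",
                              {"prompt": prompt, "max_length": max_length})


def speech_request(text, voice, model="placeholder-model"):
    # Assumed OpenAI-Speech-style endpoint; the model value is a
    # placeholder and the voice name must be one your server accepts.
    return build_json_request("/v1/audio/speech",
                              {"model": model, "input": text, "voice": voice})


def extract_text(response_body):
    # Assumed response shape: {"results": [{"text": "..."}]}
    return json.loads(response_body)["results"][0]["text"]
```

With a server running, `urllib.request.urlopen(generate_request("Hello"))` would send the request, and the body returned by `.read()` can be passed to `extract_text`. The speech request, if the endpoint behaves like its OpenAI counterpart, returns audio bytes rather than JSON.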

For more information, be sure to run the program from command line with the --help flag. You can also refer to the readme and the wiki.

Source: README.md, updated 2025-01-23