Name                                 Modified     Size
koboldcpp-linux-x64                  2024-03-14   389.9 MB
koboldcpp-linux-x64-nocuda           2024-03-14   56.7 MB
koboldcpp.exe                        2024-03-14   308.9 MB
koboldcpp_nocuda.exe                 2024-03-14   38.3 MB
koboldcpp-1.61.2 source code.tar.gz  2024-03-14   16.5 MB
koboldcpp-1.61.2 source code.zip     2024-03-14   16.7 MB
README.md                            2024-03-14   4.1 kB
Totals: 7 items, 826.9 MB

koboldcpp-1.61.2

Finally multimodal edition


  • NEW: KoboldCpp now supports Vision via Multimodal Projectors (aka LLaVA), allowing it to perceive and react to images! Load a suitable --mmproj file or select it in the GUI launcher to use vision capabilities. (Not working on Vulkan)
  • Note: This is NOT limited to only LLaVA models, any compatible model of the same size and architecture can gain vision capabilities!
  • Simply grab a ~200 MB mmproj file for your architecture here, load it with --mmproj alongside your favorite compatible model, and it will be able to see images as well!
  • KoboldCpp supports passing up to 4 images, each one will consume about 600 tokens of context (LLaVA 1.5). Additionally, KoboldCpp token fast-forwarding and context-shifting works with images seamlessly, so you only need to process each image once!
  • A compatible OpenAI GPT-4V API endpoint is emulated, so GPT-4-Vision applications should work out of the box (e.g. for SillyTavern in Chat Completions mode, just enable it). For Kobold API and OpenAI Text-Completions API, passing an array of base64 encoded images in the submit payload will work as well (planned Aphrodite compatible format).
  • An A1111 compatible /sdapi/v1/interrogate endpoint is also emulated, allowing easy captioning for other image-interrogation frontends.
  • In Kobold Lite, click any image to select from available AI Vision options.
  • NEW: Support for authentication via API Keys has been added, set it with --password. This key will be required for all text generation endpoints, using Bearer Authorization. Image endpoints are not secured.
  • Proper support for generating non-square images, scaling correctly based on aspect ratio
  • --benchmark limit increased to 16k context
  • Added aliases for the image sampler names for txt2img generation.
  • Added the clamped option for --sdconfig which prevents generating too large resolutions and potentially crashing due to OOM.
  • Pulled and merged improvements and fixes from upstream
  • Includes support for mamba models (CPU only). Note: mamba does not support context shifting
  • Updated Kobold Lite:
      • Added better support for displaying larger images, and support for generating portrait and landscape aspect ratios
      • Increased max image resolution in HD mode; non-square images can now be downloaded properly
      • Added ability to choose image samplers for image generation
      • Added ability to upload images to KoboldCpp for LLaVA usage, with 4 selectable "AI Vision" modes
      • Allow inserting images from files even when no image generation backend is selected
      • Added support for password input and using API keys over the KoboldAI API
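As a sketch, a vision request to the emulated OpenAI-compatible endpoint could be assembled like this. The payload shape follows the standard OpenAI Chat Completions vision format that the notes say is emulated; the image bytes, model name, and API key are placeholders, and the key must match whatever was passed to --password:

```python
import base64
import json

API_KEY = "mysecretkey"  # placeholder; must match the value given to --password
image_b64 = base64.b64encode(b"<raw image bytes>").decode("ascii")

# Standard OpenAI Chat Completions vision payload, which KoboldCpp emulates.
payload = {
    "model": "koboldcpp",  # placeholder; the local server serves whatever model is loaded
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image."},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{image_b64}"}},
            ],
        }
    ],
}

# Text generation endpoints require Bearer authorization when --password is set.
headers = {
    "Authorization": f"Bearer {API_KEY}",
    "Content-Type": "application/json",
}

body = json.dumps(payload)
# e.g. POST this body to http://localhost:5001/v1/chat/completions with `headers`
```

Since image endpoints are not secured, only the text generation request needs the Authorization header.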

Fix 1.61.1 - Fixed mamba (removed broken context shifting), merged other fixes from upstream, and added support for uploading non-square images.
Fix 1.61.2 - Added a new launch flag --ignoremissing, which deliberately ignores any optional missing files that were passed in (e.g. --lora, --mmproj), skipping them instead of exiting. Also added paste-image-from-clipboard to Lite.
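Putting the flags from this release together, a launch might look like the following. The model and projector file names are placeholders; the flags themselves are the ones described in the notes above:

```shell
# Load a model with a LLaVA-style projector, require an API key for text
# generation endpoints, and skip optional missing files instead of exiting.
./koboldcpp-linux-x64 \
  --model mymodel.gguf \
  --mmproj mmproj.gguf \
  --password mysecretkey \
  --ignoremissing
```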

To use, download and run koboldcpp.exe, which is a one-file pyinstaller build. If you don't need CUDA, you can use koboldcpp_nocuda.exe, which is much smaller. If you're using AMD, you can try koboldcpp_rocm from YellowRoseCx's fork here.

Run it from the command line with the desired launch parameters (see --help), or manually select the model in the GUI. Once loaded, you can connect at http://localhost:5001 (or use the full KoboldAI client).
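For example, a minimal text-generation request against the local server could be built like this. The /api/v1/generate path is the standard Kobold API endpoint; the prompt and parameters are illustrative, and the server must already be running for an actual request to succeed:

```python
import json
from urllib.request import Request

# Default address shown by the launcher once a model is loaded.
BASE_URL = "http://localhost:5001"

# Minimal generate payload; parameter names follow the Kobold API.
payload = {"prompt": "Hello, Kobold!", "max_length": 32}

req = Request(
    f"{BASE_URL}/api/v1/generate",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
# urllib.request.urlopen(req) would send it once the server is up.
```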

For more information, be sure to run the program from command line with the --help flag.

Source: README.md, updated 2024-03-14