FastChat is an open platform for training, serving, and evaluating large language model-based chatbots.

If you do not have enough memory, you can enable 8-bit compression by adding --load-8bit to the serving commands. This reduces memory usage by roughly half, with slightly degraded model quality, and is compatible with the CPU, GPU, and Metal backends. With 8-bit compression, Vicuna-13B can run on a single NVIDIA 3090/4080/T4/V100 (16GB) GPU. You can also add --cpu-offloading to offload weights that do not fit on your GPU into CPU memory. This requires 8-bit compression to be enabled and the bitsandbytes package to be installed, which is only available on Linux.
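As a rough sketch of how these flags combine, assuming FastChat's standard command-line entry point (fastchat.serve.cli) and using lmsys/vicuna-13b-v1.5 as an example model path:

```shell
# Run the FastChat CLI with 8-bit compression to roughly halve memory use.
# The model path is an example; substitute your own.
python3 -m fastchat.serve.cli --model-path lmsys/vicuna-13b-v1.5 --load-8bit

# Additionally offload weights that do not fit in GPU memory into CPU RAM
# (requires --load-8bit and the bitsandbytes package; Linux only).
python3 -m fastchat.serve.cli --model-path lmsys/vicuna-13b-v1.5 --load-8bit --cpu-offloading
```

Note that --cpu-offloading is only effective together with --load-8bit; on its own it has no supported configuration.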

Features

  • Weights, training code, and evaluation code for state-of-the-art models
  • A distributed multi-model serving system with a web UI and OpenAI-compatible RESTful APIs
  • Command-line interface for inference
  • Reduced CPU RAM requirements for weight conversion
  • Support for many models
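Because the serving system exposes OpenAI-compatible RESTful APIs, clients can talk to it with standard chat-completion payloads. A minimal sketch, assuming a FastChat API server running locally on port 8000 (the endpoint path and field names follow the OpenAI chat-completions format; the model name is an example):

```python
import json
import urllib.request


def build_chat_request(model: str, prompt: str, temperature: float = 0.7) -> dict:
    """Assemble an OpenAI-style chat-completions payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
    }


def ask(prompt: str, base_url: str = "http://localhost:8000/v1") -> str:
    """POST a chat request to a locally running FastChat API server
    and return the assistant's reply text."""
    payload = build_chat_request("vicuna-13b-v1.5", prompt)
    req = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Because the API mirrors OpenAI's, existing OpenAI client libraries can also be pointed at the local server by changing the base URL.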

License

Apache License V2.0


Additional Project Details

Operating Systems

Linux, Mac

Programming Language

Python

Related Categories

Python Artificial Intelligence Software, Python LLM Inference Tool

Registered

2023-06-01