DeepEval is a simple-to-use, open-source framework for evaluating and testing large language model (LLM) systems. It is similar to Pytest but specialized for unit testing LLM outputs. DeepEval incorporates the latest research to evaluate LLM outputs with metrics such as G-Eval, hallucination, answer relevancy, and RAGAS, which use LLMs and various other NLP models that run locally on your machine. Whether your application is built with RAG or fine-tuning, with LangChain or LlamaIndex, DeepEval has you covered. With it, you can determine the optimal hyperparameters for your RAG pipeline, prevent prompt drifting, or transition from OpenAI to hosting your own Llama2 with confidence.

Features

  • Large variety of ready-to-use LLM evaluation metrics (all with explanations) powered by ANY LLM of your choice
  • Red team your LLM application for 40+ safety vulnerabilities in a few lines of code
  • Documentation available
  • Examples available
  • Evaluate your entire dataset in bulk and in parallel in under 20 lines of Python code. Do this via the CLI in a Pytest-like manner, or through the evaluate() function
  • Create your own custom metrics that are automatically integrated with DeepEval's ecosystem by inheriting DeepEval's base metric class
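
The custom-metric pattern described in the last two features can be sketched in plain Python. This is a minimal illustration only: SimpleTestCase, SimpleBaseMetric, KeywordCoverageMetric, and evaluate_all are hypothetical stand-ins written for this example and are not DeepEval's actual classes or functions. The sketch assumes a metric exposes a measure step that returns a score and a success check against a threshold.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class SimpleTestCase:
    """Hypothetical stand-in for an LLM test case: a prompt and the model's reply."""
    input: str
    actual_output: str


class SimpleBaseMetric:
    """Hypothetical stand-in for a base metric class (not DeepEval's real class)."""
    threshold: float = 0.5

    def measure(self, test_case: SimpleTestCase) -> float:
        raise NotImplementedError

    def is_successful(self, score: float) -> bool:
        # A case passes when its score reaches the metric's threshold.
        return score >= self.threshold


class KeywordCoverageMetric(SimpleBaseMetric):
    """Custom metric: fraction of required keywords found in the output."""

    def __init__(self, keywords: List[str], threshold: float = 0.5):
        self.keywords = [k.lower() for k in keywords]
        self.threshold = threshold

    def measure(self, test_case: SimpleTestCase) -> float:
        text = test_case.actual_output.lower()
        hits = sum(1 for k in self.keywords if k in text)
        return hits / len(self.keywords) if self.keywords else 0.0


def evaluate_all(
    test_cases: List[SimpleTestCase], metric: SimpleBaseMetric
) -> List[Tuple[str, float, bool]]:
    """Score every test case and record (input, score, passed) for each."""
    results = []
    for case in test_cases:
        score = metric.measure(case)
        results.append((case.input, score, metric.is_successful(score)))
    return results
```

With the real framework, the equivalent step would be to inherit from DeepEval's own base metric class and pass test cases to its evaluate() function or CLI; the exact class and method names should be taken from the project's documentation rather than from this sketch.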

License

Apache License V2.0

Additional Project Details

Operating Systems

Linux, Mac, Windows

Programming Language

Python

Related Categories

Python Artificial Intelligence Software

Registered

2024-11-08