
SCLEB — Saanora Comprehensive LLM Evaluation Benchmark

Download the peer-reviewed PDF, full open-source codebase & Zenodo snapshot of our next-gen LLM evaluation framework.

SCLEB

by Saanora AI Research & Technology

“Yesterday’s leaderboards are saturated.
SCLEB keeps models honest with 450+ contamination-resistant tasks and hybrid auto + human + LLM-judge scoring.”


🚀 Download & Get Started

| Resource | Link |
| --- | --- |
| PDF (camera-ready) | Download PDF |
| GitHub (code + docs) | SCLEB Code Repo |

Clone the repo or grab the ZIP/PDF; everything lives in one project, so links never rot.


Why Another Benchmark?

  • Comprehensive – four pillars:
    1. Advanced Reasoning
    2. Nuanced Language
    3. Ethical AI & Safety
    4. Real-World Application
  • Contamination-resistant – 100% novel, expert-crafted tasks with quarterly refreshes
  • Hybrid scoring – exact-match and unit-test checks combined with an LLM-as-a-judge calibrated against human panels (see the sketch after this list)
  • Open & extensible – Apache-2.0 code, CC BY data, modular adapters (OpenAI, Anthropic, local)
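To make the hybrid-scoring idea concrete, here is a minimal sketch of how an automatic exact-match check and an LLM-judge score might be blended into one task score. The names (`TaskResult`, `exact_match`, `hybrid_score`) and the 50/50 weighting are illustrative assumptions, not the actual SCLEB implementation.

```python
# Hypothetical sketch of hybrid scoring: an automatic exact-match check
# blended with an LLM-as-a-judge score. Names and weights are illustrative,
# not taken from the SCLEB codebase.
from dataclasses import dataclass


@dataclass
class TaskResult:
    prediction: str     # model output for one task
    reference: str      # expert-written gold answer
    judge_score: float  # LLM-as-a-judge score in [0, 1]


def exact_match(prediction: str, reference: str) -> float:
    """Return 1.0 if the normalized prediction equals the reference, else 0.0."""
    return float(prediction.strip().lower() == reference.strip().lower())


def hybrid_score(result: TaskResult, auto_weight: float = 0.5) -> float:
    """Blend the automatic check with the judge score (weighting is assumed)."""
    auto = exact_match(result.prediction, result.reference)
    return auto_weight * auto + (1.0 - auto_weight) * result.judge_score


if __name__ == "__main__":
    r = TaskResult(prediction="42", reference="42", judge_score=0.9)
    print(hybrid_score(r))  # 0.95 with the default 50/50 weighting
```

In practice the judge score would come from a calibrated LLM prompt rather than a hard-coded number; the point is only that automatic and judge signals can be combined per task.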

TL;DR Run

git clone git@github.com:Saanora-Tech/SCLEB-Code.git
cd SCLEB-Code
pip install -r requirements.txt
python run_benchmark.py --model gpt-4o
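The run above targets gpt-4o. Because adapters are modular, local or self-hosted models should be swappable in as well; the sketch below shows one way such an adapter interface could look. The class and method names (`ModelAdapter`, `generate`, `EchoAdapter`) are assumptions for illustration, not the actual SCLEB API.

```python
# Hypothetical adapter interface for plugging a custom or local model into an
# evaluation loop. Names are illustrative, not the actual SCLEB API.
from abc import ABC, abstractmethod


class ModelAdapter(ABC):
    """Minimal contract an adapter might satisfy: prompt in, completion out."""

    @abstractmethod
    def generate(self, prompt: str, max_tokens: int = 512) -> str:
        ...


class EchoAdapter(ModelAdapter):
    """Stand-in 'model' that echoes the prompt, useful for dry-running a harness."""

    def generate(self, prompt: str, max_tokens: int = 512) -> str:
        return prompt[:max_tokens]


if __name__ == "__main__":
    adapter = EchoAdapter()
    print(adapter.generate("Summarize the four SCLEB pillars."))
```

See the repository docs for the adapter interface the codebase actually exposes and how to register a new backend.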