SCLEB
by Saanora AI Research & Technology
“Yesterday’s leaderboards are saturated.
SCLEB keeps models honest, with 450+ contamination-proof tasks and hybrid auto + human + LLM-judge scoring.”
🚀 Download & Get Started
Resource | Link |
---|---|
PDF (camera-ready) | Download PDF |
GitHub — code + docs | SCLEB Code Repo |
Clone the repo or grab the ZIP/PDF. Everything lives in one project, so the links stay stable.
Why Another Benchmark?
- Comprehensive – Four pillars:
  - Advanced Reasoning
  - Nuanced Language
  - Ethical AI & Safety
  - Real-World Application
- Contamination-resistant – 100% novel, expert-crafted tasks; quarterly refreshes
- Hybrid scoring – Exact-match & unit-tests + LLM-as-a-judge calibrated to human panels
- Open & extensible – Apache-2.0 code, CC BY data, modular adapters (OpenAI, Anthropic, local)
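The hybrid-scoring idea above can be illustrated with a minimal sketch. It is not SCLEB's actual implementation: the `Task` class, `hybrid_score`, `normalize`, and `stub_judge` are hypothetical names introduced here, and the routing rule (exact match when a reference answer exists, a judge otherwise) is an assumption about how such a pipeline might be wired.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Task:
    """Hypothetical task record; SCLEB's real schema may differ."""
    prompt: str
    reference: Optional[str] = None  # set for exact-match tasks, None for open-ended ones


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so formatting differences are ignored."""
    return " ".join(text.lower().split())


def hybrid_score(task: Task, answer: str, judge: Callable[[str, str], float]) -> float:
    """Return a score in [0, 1]: exact match if a reference exists, judge otherwise."""
    if task.reference is not None:
        return 1.0 if normalize(answer) == normalize(task.reference) else 0.0
    # Open-ended task: defer to the judge and clamp its output to [0, 1].
    return min(1.0, max(0.0, judge(task.prompt, answer)))


def stub_judge(prompt: str, answer: str) -> float:
    """Placeholder standing in for an LLM judge; not a real scoring rule."""
    return 0.5 if answer.strip() else 0.0


if __name__ == "__main__":
    print(hybrid_score(Task("What is 2 + 2?", "4"), " 4 ", stub_judge))
    print(hybrid_score(Task("Explain recursion."), "A function calling itself.", stub_judge))
```

In a real setup the judge callable would wrap an LLM call whose scores have been checked against human ratings. Keeping it as a plain function argument lets the deterministic path be tested without any model access.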
TL;DR Run
git clone git@github.com:Saanora-Tech/SCLEB-Code.git
cd SCLEB-Code
pip install -r requirements.txt
python run_benchmark.py --model gpt-4o