AI Benchmark Leaderboard

Every score, every source.

Benchmark data from official sources for frontier AI models. The current view hides superseded models and old benchmark versions. The archive preserves the history.

Last updated: 2026-05-29

Models shown: 13

Benchmarks shown: 15

Model	Provider	Input $/M	Output $/M	SWE-bench Pro	Terminal-Bench 2.1	MCP-Atlas	Toolathlon	AutomationBench	OSWorld-Verified	BrowseComp	GPQA Diamond	Humanity's Last Exam	Humanity's Last Exam with tools	FrontierMath T1-3	ARC-AGI-2	Finance Agent v2	GDPval-AA	CyberGym
GPT-5.5	OpenAI	$5.00	$30.00
GPT-5.5 Pro	OpenAI	$30.00	$180.00	-	-	-	-	-	-		-	-	-		-	-	-	-
Claude Opus 4.8	Anthropic	$5.00	$25.00				-							-	-			-
GPT-5.4	OpenAI	$2.50	$15.00		-			-				-	-			-	-
Gemini 3.1 Pro Preview	Google	$2.00	$12.00															-
Gemini 3.1 Flash-Lite	Google	$0.25	$1.50	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-
Gemini 3 Flash Preview	Google	$0.50	$3.00	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-
DeepSeek V4 Flash	DeepSeek	$0.14	$0.28	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-
DeepSeek V4 Pro	DeepSeek	$0.43	$0.87	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-
Grok 4.3	xAI	$1.25	$2.50	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-
GPT-OSS 120B	OpenAI via Groq	$0.15	$0.60	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-
Kimi K2.6	Moonshot AI	$0.95	$4.00	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-
GLM-5.1	Z.ai	$1.40	$4.40	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-

Official source URL

Official comparison source

Empty cells mean that no official source was found. We do not estimate.
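As a rough illustration of how the two price columns can be read, the sketch below estimates the cost of a single request. It assumes that "$/M" means US dollars per one million tokens, which the table does not state explicitly; the function name and the token counts are hypothetical. The prices used are the GPT-5.5 values from the table.

```python
def request_cost_usd(input_tokens: int, output_tokens: int,
                     input_price_per_m: float, output_price_per_m: float) -> float:
    """Estimate the cost of one request.

    Assumption: prices are quoted in USD per one million tokens.
    """
    input_cost = (input_tokens / 1_000_000) * input_price_per_m
    output_cost = (output_tokens / 1_000_000) * output_price_per_m
    return input_cost + output_cost


# Hypothetical request: 200,000 input tokens and 50,000 output tokens,
# priced at $5.00 input and $30.00 output per million tokens (GPT-5.5 row).
cost = request_cost_usd(200_000, 50_000, 5.00, 30.00)
print(f"${cost:.2f}")  # prints "$2.50"
```

Because output tokens are priced several times higher than input tokens for most rows, the output share tends to dominate the total for generation-heavy workloads.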

Editorial policy

Every score must cite an official provider, model-release, or benchmark-owner URL. When official data is missing, the cell is omitted rather than filled with an estimate. Current leaderboard rows hide archived models and superseded benchmark versions. Archive rows are kept for history and are clearly marked.

The default leaderboard shows current models and current benchmark versions only. The archive view keeps previous models and previous benchmark versions so older articles and historical comparisons remain traceable.
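The policy above can be sketched as a pair of filters. This is a minimal illustration, not the site's actual implementation: the `ScoreEntry` record, its field names, the model and benchmark names, and the example.com URLs are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ScoreEntry:
    """Hypothetical record for one leaderboard cell."""
    model: str
    benchmark: str
    score: Optional[float]       # None when no official figure exists
    source_url: Optional[str]    # official provider, release, or benchmark-owner URL
    model_archived: bool = False
    benchmark_superseded: bool = False


def is_publishable(entry: ScoreEntry) -> bool:
    # A score without an official source URL is never shown or estimated.
    return entry.score is not None and bool(entry.source_url)


def current_view(entries: List[ScoreEntry]) -> List[ScoreEntry]:
    # Default leaderboard: current models and current benchmark versions only.
    return [e for e in entries
            if is_publishable(e)
            and not e.model_archived
            and not e.benchmark_superseded]


def archive_view(entries: List[ScoreEntry]) -> List[ScoreEntry]:
    # Archive: sourced scores for previous models or previous benchmark versions.
    return [e for e in entries
            if is_publishable(e)
            and (e.model_archived or e.benchmark_superseded)]


# Placeholder data for illustration only; none of these are real scores.
entries = [
    ScoreEntry("Model A", "Bench X v2", 61.5, "https://example.com/a"),
    ScoreEntry("Model A", "Bench X v1", 58.0, "https://example.com/a-old",
               benchmark_superseded=True),
    ScoreEntry("Model B", "Bench X v2", 70.0, None),  # unsourced, so dropped
    ScoreEntry("Model C", "Bench X v2", 49.0, "https://example.com/c",
               model_archived=True),
]

print([(e.model, e.benchmark) for e in current_view(entries)])
print([(e.model, e.benchmark) for e in archive_view(entries)])
```

Keeping the sourcing check separate from the current/archive split means an unsourced score is excluded from both views, which matches the rule that missing data is omitted rather than estimated.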
