Sunlight AI Benchmarks

Real Benchmarks, Real Performance

Built on Kimi K2.6 and GPT-OSS-20B, proven models with published benchmark results. See how Sunlight compares to industry leaders on standardized tests.

DeepSearchQA

92.5%

F1-Score Leader

SWE-Bench Pro

58.6%

Agentic Coding Leader

Cost

96% Less

vs Opus 4.6

MathVision

93.2%

Visual Reasoning

Performance Comparison

[Chart: Overall Capabilities]

Detailed Model Comparison

| Model | Reasoning | Speed | Context Window | Pricing (input/output per 1M tokens) | Multimodal |
|---|---|---|---|---|---|
| Sunlight 1 (our model) | Advanced (High) | 128 tok/s | 200K tokens | $0.60 / $3.00 | Yes |
| Sunlight 1 Speed (our model) | Standard | 178 tok/s | 128K tokens | $0.20 / $1.00 | Yes |
| Opus 4.6 | Advanced (Max) | 65 tok/s | 200K tokens | $15.00 / $75.00 | Yes |
| Gemini 3.1 Pro | Good | 82 tok/s | 2M tokens | $1.25 / $5.00 | Yes |
| GPT 5.4 | Advanced (xHigh) | 80 tok/s | 128K tokens | $2.50 / $15.00 | Yes |
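The "96% Less" cost figure follows directly from the per-1M-token prices listed above. A minimal sketch of that arithmetic, assuming a hypothetical workload of 10M input and 2M output tokens (the workload size is illustrative, not from official pricing pages):

```python
# Per-1M-token prices (input, output) taken from the comparison table above.
PRICES = {
    "Sunlight 1": (0.60, 3.00),
    "Sunlight 1 Speed": (0.20, 1.00),
    "Opus 4.6": (15.00, 75.00),
}

def workload_cost(model: str, input_m: float, output_m: float) -> float:
    """Total dollar cost for a workload measured in millions of tokens."""
    in_price, out_price = PRICES[model]
    return input_m * in_price + output_m * out_price

# Hypothetical workload: 10M input tokens, 2M output tokens.
sunlight = workload_cost("Sunlight 1", 10, 2)  # 10*0.60 + 2*3.00 = 12.00
opus = workload_cost("Opus 4.6", 10, 2)        # 10*15.00 + 2*75.00 = 300.00
savings = 1 - sunlight / opus                  # 1 - 12/300 = 0.96

print(f"Sunlight 1: ${sunlight:.2f}, Opus 4.6: ${opus:.2f}, savings: {savings:.0%}")
```

Because both input and output prices are 4% of Opus 4.6's, the 96% savings holds at any input/output mix.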

Benchmark Methodology

Sunlight models are powered by proven open-source architectures: Kimi K2.6 (Sunlight 1) and GPT-OSS-20B (Sunlight 1 Speed). All benchmarks are from published academic papers and official model cards, tested under standardized conditions.

Key Benchmarks

  • HLE (Full) - Tool Use & Reasoning
  • BrowseComp - Web Navigation
  • DeepSearchQA - Information Retrieval
  • SWE-Bench Pro - Agentic Coding
  • Terminal-Bench 2.0 - CLI Tasks
  • MathVision - Visual Math Reasoning
  • OSWorld-Verified - Desktop Automation

Data Sources

  • Moonshot AI Official Benchmarks
  • OpenAI GPT-OSS Model Card
  • Artificial Analysis Reports
  • Academic Research Papers
  • Independent Third-Party Tests

Experience the Difference

Try Sunlight AI today and see why developers choose us for superior performance at unbeatable prices.