Sunlight AI Benchmarks

Real Benchmarks, Real Performance

Built on Kimi K2.6 and GPT-OSS-20B, proven models with published benchmark results. See how Sunlight compares to industry leaders on standardized tests.

DeepSearchQA

92.5%

F1-Score Leader

SWE-Bench Pro

58.6%

Agentic Coding Leader

Cost

96% Less

vs Opus 4.6

MathVision

93.2%

Visual Reasoning

Performance Comparison

[Chart: Overall Capabilities]

Detailed Model Comparison

| Model | Reasoning | Speed | Context Window | Pricing (input/output per 1M tokens) | Multimodal |
|---|---|---|---|---|---|
| Sunlight 1 (our model) | Advanced (High) | 128 tok/s | 200K tokens | $0.60 / $3.00 | Yes |
| Sunlight 1 Speed (our model) | Standard | 178 tok/s | 128K tokens | $0.20 / $1.00 | Yes |
| Opus 4.6 | Advanced (Max) | 65 tok/s | 200K tokens | $15.00 / $75.00 | Yes |
| Gemini 3.1 Pro | Good | 82 tok/s | 2M tokens | $1.25 / $5.00 | Yes |
| GPT 5.4 | Advanced (xHigh) | 80 tok/s | 128K tokens | $2.50 / $15.00 | Yes |
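The "96% Less" cost figure follows directly from the per-1M-token prices listed above. A minimal sketch of that arithmetic, assuming a hypothetical workload of 10M input and 2M output tokens (the workload size is illustrative, not from official pricing pages):

```python
# Per-1M-token prices (input, output) taken from the comparison table above.
PRICES = {
    "Sunlight 1": (0.60, 3.00),
    "Sunlight 1 Speed": (0.20, 1.00),
    "Opus 4.6": (15.00, 75.00),
}

def workload_cost(model: str, input_m: float, output_m: float) -> float:
    """Total dollar cost for a workload measured in millions of tokens."""
    in_price, out_price = PRICES[model]
    return input_m * in_price + output_m * out_price

# Hypothetical workload: 10M input tokens, 2M output tokens.
sunlight = workload_cost("Sunlight 1", 10, 2)  # 10*0.60 + 2*3.00 = 12.00
opus = workload_cost("Opus 4.6", 10, 2)        # 10*15.00 + 2*75.00 = 300.00
savings = 1 - sunlight / opus                  # 1 - 12/300 = 0.96

print(f"Sunlight 1: ${sunlight:.2f}, Opus 4.6: ${opus:.2f}, savings: {savings:.0%}")
```

Because both input and output prices are 4% of Opus 4.6's, the 96% savings holds at any input/output mix.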

Benchmark Methodology

Sunlight models are powered by proven open-source architectures: Kimi K2.6 (Sunlight 1) and GPT-OSS-20B (Sunlight 1 Speed). All benchmarks are from published academic papers and official model cards, tested under standardized conditions.

Key Benchmarks

  • HLE (Full) - Tool Use & Reasoning
  • BrowseComp - Web Navigation
  • DeepSearchQA - Information Retrieval
  • SWE-Bench Pro - Agentic Coding
  • Terminal-Bench 2.0 - CLI Tasks
  • MathVision - Visual Math Reasoning
  • OSWorld-Verified - Desktop Automation

Data Sources

  • Moonshot AI Official Benchmarks
  • OpenAI GPT-OSS Model Card
  • Artificial Analysis Reports
  • Academic Research Papers
  • Independent Third-Party Tests

Experience the Difference

Try Sunlight AI today and see why developers choose us for superior performance at unbeatable prices.