CrofAI updated my worldview

CrofAI is an AI inference provider I learned about a few days ago. I used to think inference was fast OR cheap - all providers were either mid, cheap but slow or unreliable, or fast but expensive. CrofAI is both fast and cheap.

Numbers show this best. Let's say you're using AI, specifically Kimi K2, to clone a set of calculators. Across 100 calls, you used 10K input tokens and 100K output tokens. How slow and expensive would this be on your favorite provider?

Please consult the table:

| Provider | Avg time | Total cost (~) |
|---|---|---|
| Groq (OR's fastest) | 6.32s | $0.31 |
| CrofAI Turbo | 4.67s | $0.21 |
| DeepInfra (OR's cheapest) | 14.99s | $0.23 |
| Chutes (OR's cheapest) | 15.73s | $0.05 |
| CrofAI non-turbo | 29.5s | $0.04 |
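A rough sanity check on what the table implies per call. The post doesn't publish per-token rates, so this only derives per-call cost and implied output throughput from the table's approximate totals and the stated workload (100 calls, 10K input tokens, 100K output tokens); these are my readings of the table, not anyone's rate card.

```python
# Workload from the post: 100 calls, 10K input / 100K output tokens total.
workload = {"calls": 100, "input_tokens": 10_000, "output_tokens": 100_000}

# Avg time per call and approximate total cost, copied from the table above.
providers = {
    "Groq":             {"avg_time_s": 6.32,  "total_cost": 0.31},
    "CrofAI Turbo":     {"avg_time_s": 4.67,  "total_cost": 0.21},
    "DeepInfra":        {"avg_time_s": 14.99, "total_cost": 0.23},
    "Chutes":           {"avg_time_s": 15.73, "total_cost": 0.05},
    "CrofAI non-turbo": {"avg_time_s": 29.5,  "total_cost": 0.04},
}

out_per_call = workload["output_tokens"] / workload["calls"]  # 1,000 tokens
for name, p in providers.items():
    tok_per_s = out_per_call / p["avg_time_s"]          # implied output speed
    cost_per_call = p["total_cost"] / workload["calls"]  # dollars per call
    print(f"{name}: ~{tok_per_s:.0f} tok/s, ${cost_per_call:.4f}/call")
```

So "fast and cheap" here means roughly 214 output tokens/second at $0.0021 per call for CrofAI Turbo, versus Groq's ~158 tok/s at $0.0031 per call.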

Almost certainly, your favorite provider is beaten by CrofAI.

Gemma 3n is also on CrofAI, which raises an interesting point: if input_audio blocks worked, we could transcribe at the price of $0.001/hour, dethroning Phi 4 Multimodal and transcribing a whole year of audio (including night) for just $9.86. This is what "the price of intelligence [going] to zero" looks like.
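The year-of-audio figure is just hours-in-a-year times the hourly rate. Note that the quoted $0.001/hour is presumably rounded, since at exactly that rate the year comes out slightly under the quoted total:

```python
hours_per_year = 24 * 365          # 8,760 hours, nights included
rate_per_hour = 0.001              # quoted (rounded) transcription price

annual_cost = hours_per_year * rate_per_hour
print(f"${annual_cost:.2f}")       # $8.76 at the rounded rate
```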

CrofAI is, in a sense, a story about markets. They're performing arbitrage: there are plenty of cheap 3090s and 4090s out there, but nobody had gone to the effort of finding the cheapest ones, manually optimizing inference, and selling it as a service with a small markup. Doing this is how they beat every other option, prove themselves to customers, and eventually get into OpenRouter (OR, please add CrofAI, thanks).

Who's closest to catching up with CrofAI? Arguably Chutes, or another entity on Bittensor. Their prices are already decent, and if there's an incentive to make kernels faster, those prices could drop further and kill CrofAI in the process. I doubt this happens soon, though: Chutes moves slowly, and CrofAI has already acknowledged Chutes as competition, planning further price cuts on K2 and other models within the next week.