Why is Gemini Flash so popular?
Gemini 2.0 Flash definitely has its virtues. It's a fast, low-latency model that offers high quality for its price. On one API gateway, OpenRouter, it processes around 321B tokens a week, or about 46B tokens a day. The question is what those tokens are doing.
Apps
Unfortunately, apps explain only a small portion of the usage. The top 15 consumers together account for 21.65B of the 321B tokens used weekly, or about 6.7%.
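To make the arithmetic explicit, here is a small sketch that recomputes the daily average and the share covered by the top 15 apps. The variable names are made up for illustration; the two input figures are the ones quoted above.

```python
# Figures quoted in the text, in billions of tokens per week.
WEEKLY_TOTAL_B = 321.0   # all Gemini 2.0 Flash tokens on OpenRouter
TOP_15_APPS_B = 21.65    # tokens attributed to the top 15 apps

# Average daily volume implied by the weekly total.
daily_average_b = WEEKLY_TOTAL_B / 7

# Fraction of weekly traffic explained by the top 15 apps.
top_apps_share = TOP_15_APPS_B / WEEKLY_TOTAL_B

# Tokens left unexplained by the app leaderboard.
unattributed_b = WEEKLY_TOTAL_B - TOP_15_APPS_B

print(f"daily average: {daily_average_b:.1f}B")
print(f"top 15 apps:   {top_apps_share:.1%} of weekly tokens")
print(f"unattributed:  {unattributed_b:.2f}B")
```

That leaves roughly 299B tokens a week that the app leaderboard says nothing about.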
Categories
You can tally up the weekly usage across categories:
- Roleplay: 0.794B
- Programming: 2.32B
- Marketing: 0.169B
- Marketing/SEO: 0.021B
- Technology: 0.781B
- Science: 1.66B
- Translation: 0.589B
- Legal: 0.204B
- Finance: 0.173B
- Health: 0.771B
- Trivia: 0.008B
- Academia: 1.28B
- Total: 8.77B
This is odd: why doesn't it sum to 321B (the weekly total) or 3.21B (the 1% of prompts that get categorized)? The most likely explanation is that OpenRouter's categorizer sometimes assigns several categories to a prompt and sometimes none.
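The tally and the mismatch can be checked with a few lines of Python. This is only a sketch: the dictionary copies the figures listed above, and the 1% sampling rate is taken from the text rather than from any OpenRouter API.

```python
# Weekly tokens per category, in billions, copied from the list above.
weekly_tokens_b = {
    "Roleplay": 0.794,
    "Programming": 2.32,
    "Marketing": 0.169,
    "Marketing/SEO": 0.021,
    "Technology": 0.781,
    "Science": 1.66,
    "Translation": 0.589,
    "Legal": 0.204,
    "Finance": 0.173,
    "Health": 0.771,
    "Trivia": 0.008,
    "Academia": 1.28,
}

WEEKLY_TOTAL_B = 321.0   # all weekly tokens, in billions
SAMPLE_FRACTION = 0.01   # assumed share of prompts that get categorized

category_sum_b = sum(weekly_tokens_b.values())
sampled_b = WEEKLY_TOTAL_B * SAMPLE_FRACTION

# Compare the category sum against both candidate totals.
share_of_total = category_sum_b / WEEKLY_TOTAL_B
ratio_to_sample = category_sum_b / sampled_b

# Rank categories from largest to smallest.
ranked = sorted(weekly_tokens_b.items(), key=lambda kv: kv[1], reverse=True)

print(f"sum of categories:     {category_sum_b:.2f}B")
print(f"share of weekly total: {share_of_total:.1%}")
print(f"ratio to 1% sample:    {ratio_to_sample:.2f}x")
print("largest categories:", [name for name, _ in ranked[:3]])
```

The category sum comes out at about 2.7 times the 1% sample, which is what you would expect if many prompts carry more than one label.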
Conclusion
One more thing to consider is that for every token Gemini 2.0 Flash outputs, it reads about 10 input tokens. This suggests that its users are working with long contexts, typical of uses like programming, search, and scientific summarization. Those uses appear to make up the majority of the traffic, even though there are a few others.