Understanding Cost Avoidance

Understanding Cost Avoidance
Savings represent pure Cost Avoidance. They are calculated deterministically on a per-request basis against the flagship 2026 gpt-5.5 model.
If a routed model costs less than the baseline, the difference is added to your savings.
If you intentionally route a task to a higher-cost premium SKU, the savings for that specific request is strictly $0.00.
Your cumulative savings track how much total money you avoided spending across all efficient routing decisions. Negative savings are mathematically floored to zero to ensure your dashboard accurately reflects pure saved capital without penalizing intentional premium model usage.

How it is computed in the platform

For each metered request, the gateway records a baseline cost using gpt-5.5 list economics ($5.00 per million input tokens and $30.00 per million output tokens) applied to the same token counts as the routed call. Your wholesale spend for the actual model is stored separately.

Cost avoidance for the period is the sum, over requests, of max(0, baseline cost − wholesale cost). Premium routes where wholesale meets or exceeds the baseline contribute zero to the total; the aggregate never goes negative, so intentional premium usage does not reduce the cumulative avoidance figure.

Platform fees are not part of this comparison—the metric is pure compute against the flagship baseline.

Understanding Cost Avoidance ​

How it is computed in the platform ​

Related ​

Understanding Cost Avoidance

How it is computed in the platform

Related