Stamford, Conn., June 24: By 2028, AI coding costs will overtake the average developer’s salary due to rising large language model (LLM) token consumption and the shift to consumption-based licensing models, according to Gartner, Inc., a business and technology insights company.
AI tokens are the units of data processed by generative AI models. Token consumption directly impacts the cost of AI coding tools, particularly under consumption-based pricing structures.
“Organizations are rapidly moving from experimentation to scaled deployment of AI coding agents, but many are underestimating the financial impact of rising token consumption,” said Nitish Tyagi, Sr. Principal Analyst at Gartner. “Token discipline will not emerge through developer choice alone, as developers tend to optimize for speed and convenience over cost efficiency. Without a governed engineering operating model, costs can escalate faster than the productivity gains these tools are designed to deliver.”
Consumption-Based Pricing Introduces Cost Predictability Challenges
The shift from seat-based licensing to consumption-based pricing among AI coding agent vendors is introducing highly variable cost structures for software engineering workloads. Many vendors provide little transparency into how token consumption is calculated and billed, which limits enterprises’ ability to accurately forecast and control costs.
Without clear visibility into token usage across development tasks, organizations risk budget overruns and a reduced ability to track cost-to-value outcomes.
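One way to picture this kind of visibility is a simple roll-up of token usage by development task. The sketch below is illustrative only: the record fields (`category`, `input_tokens`, `output_tokens`) and the function name `tokens_by_category` are assumptions made for this example, not a format defined by Gartner or by any vendor.

```python
from collections import defaultdict


def tokens_by_category(records):
    """Sum input and output tokens per task category, largest consumer first.

    Each record is a dict with the hypothetical keys
    "category", "input_tokens" and "output_tokens".
    """
    totals = defaultdict(int)
    for record in records:
        totals[record["category"]] += record["input_tokens"] + record["output_tokens"]
    # Sort so the highest-consuming categories appear first in the result.
    return dict(sorted(totals.items(), key=lambda item: item[1], reverse=True))


# Made-up usage records for illustration.
usage = [
    {"category": "refactor", "input_tokens": 1200, "output_tokens": 300},
    {"category": "tests", "input_tokens": 400, "output_tokens": 100},
    {"category": "refactor", "input_tokens": 800, "output_tokens": 200},
]
print(tokens_by_category(usage))
```

A report like this would only be a starting point; tying the totals back to business outcomes still requires the cost-to-value tracking described above.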
“Most organizations still lack the maturity and frameworks to effectively measure cost versus business impact,” said Tyagi. “Software engineering leaders are increasingly concerned as token-driven AI spend becomes harder to justify, with budgets often being depleted earlier than expected.”
Usage Patterns and Governance Gaps Are Driving Cost Pressure
Beyond pricing and visibility challenges, how AI coding agents are used within organizations is further driving cost pressures. Token overspending is often linked to how software engineering leaders govern usage, with common failure modes including ungoverned autonomy in agent-driven workflows, bloated context windows and the absence of structured feedback mechanisms to optimize usage.
In addition, AI coding vendors have yet to deliver mature, built-in cost optimization capabilities in their coding agents, which further contributes to cost escalation.
“AI coding costs will continue to rise as infrastructure investment and profitability challenges push model pricing higher,” said Tyagi. “At the same time, as more developers adopt AI tools, light users are expected to rapidly become mainstream users as familiarity and reliance increase, driving further growth in token consumption and overall spend.”
To manage rising costs and avoid budget overruns, Gartner recommends that software engineering leaders implement a disciplined operating model for AI usage:
- Establish a use-case-driven decision framework: Organizations should clearly define when AI coding agents should be used and determine appropriate levels of autonomy for each task. This includes classifying development tasks into three execution models: developer‑led, developer‑with‑agent, and fully agent‑led.
- Align model selection with task complexity: AI coding agents are most cost-effective when work is broken into smaller tasks that can be handled by smaller models, with escalation only when complexity demands it. Engineering and platform teams should implement intelligent model routing strategies that direct simpler, high-frequency tasks to smaller models while reserving frontier models for complex and high-value development work.
- Mandate context engineering practices: Developers must be trained to optimize the input context provided to AI systems by including only relevant information, summarizing content where possible, and eliminating unnecessary data to reduce token consumption without compromising output quality.
- Implement governance and cost controls: Organizations should introduce mechanisms such as token thresholds, escalation policies, and automated monitoring to manage usage. Embedding these controls into engineering workflows ensures consistency and prevents uncontrolled cost growth.
- Embed token usage reviews into development cycles: Leaders should mandate regular reviews of high-token-consuming workflows as part of sprint retrospectives to identify inefficiencies, refine practices, and promote knowledge sharing across engineering teams.
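As a rough illustration of two of the recommendations above, model routing by task complexity and a token threshold, the following minimal sketch shows how such controls might be combined. Everything in it is an assumption made for the example: the tier names, the prices in `PRICE_PER_1K_TOKENS`, the 1-to-5 complexity score, and the `TokenGovernor` class are hypothetical and do not come from Gartner or from any vendor's product.

```python
from dataclasses import dataclass

# Hypothetical model tiers and prices per 1,000 tokens (arbitrary currency
# units). Illustrative values only, not taken from any vendor's price list.
PRICE_PER_1K_TOKENS = {"small": 0.5, "frontier": 10.0}


@dataclass
class Task:
    name: str
    complexity: int        # hypothetical score: 1 (trivial) to 5 (hard)
    estimated_tokens: int  # assumed to be estimated before the task runs


class BudgetExceeded(Exception):
    """Raised when a task would push usage past the token threshold."""


class TokenGovernor:
    """Routes tasks to a model tier and enforces a token threshold."""

    def __init__(self, token_threshold, frontier_min_complexity=4):
        self.token_threshold = token_threshold
        self.frontier_min_complexity = frontier_min_complexity
        self.tokens_used = 0

    def route(self, task):
        # Simple tasks go to the small model; only complex work escalates.
        if task.complexity >= self.frontier_min_complexity:
            return "frontier"
        return "small"

    def submit(self, task):
        # Reject the task before it runs if it would breach the threshold.
        if self.tokens_used + task.estimated_tokens > self.token_threshold:
            raise BudgetExceeded(f"{task.name} would exceed the token threshold")
        model = self.route(task)
        self.tokens_used += task.estimated_tokens
        cost = task.estimated_tokens / 1000 * PRICE_PER_1K_TOKENS[model]
        return model, cost


governor = TokenGovernor(token_threshold=10_000)
print(governor.submit(Task("rename variable", complexity=1, estimated_tokens=2000)))
print(governor.submit(Task("redesign module", complexity=5, estimated_tokens=6000)))
```

In practice, a rejected task would more likely trigger an escalation policy, such as a review or an approval step, than a hard failure; the exception here simply marks where that policy would hook in.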
Additional Insights Available
Gartner clients can read more in “How to Optimize Token Consumption for AI Coding Agents.”
Learn how software development teams compare to others in the Gartner Software Engineering Score.
Gartner is the World Authority on AI
Gartner is the indispensable partner to C-Level executives and technology providers as they implement AI strategies to achieve their mission-critical priorities. The independence and objectivity of Gartner insights provide clients with the confidence to make informed decisions and unlock the full potential of AI. Clients across the C-Level are using Gartner’s proprietary AskGartner AI tool to determine how to leverage AI in their business. With more than 2,500 business and technology experts, 6,000 written insights, as well as more than 4,000 AI use cases and case studies, Gartner is the world authority on AI.