Mumbai, March 30: Mustafa Suleyman, the CEO of Microsoft AI, has outlined a sharp economic thesis on the near-term future of the artificial intelligence industry. In a recent detailed analysis, Suleyman argued that the next two to three years will not be defined by the intelligence of AI models, but rather by the ability of companies to fund and run them at scale. He noted that demand for inference compute is currently "wildly outstripping" supply, creating a significant bottleneck for the sector.
The industry shift moves the focus from the cost of training large foundation models to the recurring expense of serving them to millions of users in real time. According to Suleyman, companies with the financial margins to absorb high "token" costs will improve their products the fastest. This creates a competitive "flywheel" effect, where lower latency leads to higher user retention, which in turn generates the proprietary data required for further model refinement.
The Shift from Model Training to Inference Scarcity
Suleyman’s argument highlights a critical transition in the AI landscape for 2026. While previous years focused on building the "smartest" models, the current constraint lies on the serving side. Data from Deloitte’s 2026 TMT Predictions indicates that inference workloads now account for approximately two-thirds of all AI compute spending. This shift has led to GPU lead times stretching to nearly a year, with high-bandwidth memory from major suppliers reportedly sold out through 2026.
Infrastructure remains a primary constraint. Out of 16 GW of global data centre capacity planned for the year, only about 5 GW is currently under construction, with the remainder existing only as announced projects. Suleyman suggests that in this environment of scarcity, only products with high gross margins, such as enterprise legal tools, healthcare software and Microsoft 365 Copilot, can afford the premium inference costs required to maintain low latency and high performance.
Microsoft AI 'Flywheel' and the Scaling Gap
The "flywheel" logic proposed by Suleyman suggests a compounding advantage for high-margin products. When a company can afford premium inference, it delivers a faster user experience, which brings users back. Those returning users generate rich workflow data that feeds the company’s fine-tuning loop, a concept Suleyman has emphasised since late 2024. Microsoft’s internal data supports this growth trajectory, with paid Copilot seats reaching 15 million in Q2 FY2026, marking a 160 per cent year-on-year increase.
However, this economic reality presents an uncomfortable corollary for consumer-facing AI apps and smaller start-ups. Without the capital to secure limited tokens, these entities face slower response times and weaker user retention. While some industry analysts argue that open-source developments or on-device AI could eventually reduce these costs, Suleyman’s current strategy is backed by Microsoft’s annual investment of over USD 80 billion in AI infrastructure.
Industry Outlook and Token Rationing
As the industry navigates this period of token rationing, the gap between well-funded enterprise solutions and cash-strapped start-ups is expected to widen. Suleyman’s perspective suggests that for the next few years, the business that can pay for the most tokens will effectively win the intelligence race by proxy. The focus is no longer solely on laboratory results, but on commercial viability in a resource-constrained market.
While debate continues over whether intelligence per dollar may eventually level the playing field, the immediate trend favours large-scale infrastructure providers. For now, the AI industry appears to be entering a phase where financial scale is the primary requirement for technical superiority and market leadership.
(The above story first appeared on LatestLY on Mar 30, 2026 03:40 PM IST. For more news and updates on politics, world, sports, entertainment and lifestyle, log on to our website latestly.com).















