U.S. Tech Firms Quietly Adopt Chinese AI Models, with Coinbase Leading the Way on GLM and Kimi

American technology companies are quietly integrating Chinese open-source AI models into their production infrastructure. As the cost of leading U.S. model services continues to rise, firms like Coinbase Global, Inc. (COIN) are beginning to adopt Chinese open-source models as the default option, aiming to significantly reduce AI expenses without curbing usage.

Coinbase CEO Brian Armstrong said in a post on X last Friday evening that the company has made two recently released models, GLM 5.2 from Zhipu AI and Kimi 2.7 from Beijing-based Moonshot AI, the defaults for its engineers through an internal LLM gateway. Armstrong stated that, combined with measures such as routing optimization and improved caching, the switch has cut Coinbase's AI spending by "nearly half," even as token usage continues to grow at an exponential rate.

The Cost Advantage of Chinese Open-Source Models Comes to the Fore

Armstrong noted in his post that 91% of engineers never reached the previous usage limits. Lowering caps or adding spending alerts would therefore have done little, so Coinbase opted instead to switch to "cheaper default models."

GLM 5.2 comes from Zhipu AI and Kimi 2.7 from Moonshot AI; both are open-weight models. Armstrong explained that these models handle routine tasks, while engineers can still choose cutting-edge models for assignments that require complex reasoning. His rationale is that using top-tier models for execution tasks is often "overkill."

For code review, a multi-model parallel strategy is employed, allowing different models to cross-check outputs to maintain quality standards.

Three-Part Infrastructure Overhaul Drives Cost Reduction

Armstrong outlined three core methods.

The first is intelligent routing: Within a custom scheduling framework, the system pre-processes prompts and, considering cache hit rates and model pricing, automatically routes tasks to the most suitable and cost-effective model. He stated that the ultimate goal is for AI, not humans, to handle the task of model selection.
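The routing idea can be illustrated with a short Python sketch. It is a simplified guess at the general technique, not a description of Coinbase's framework: the model names, prices, complexity scale, and cache hit rates below are all invented for illustration. The router filters out models that are not trusted with the task, then picks the one with the lowest expected input cost once cached tokens are priced at their discounted rate.

```python
from dataclasses import dataclass
from typing import Dict, List

@dataclass(frozen=True)
class ModelOption:
    name: str                     # hypothetical label, not a real endpoint
    price_per_mtok: float         # assumed USD per million uncached input tokens
    cached_price_per_mtok: float  # assumed USD per million cached input tokens
    max_complexity: int           # highest task complexity (1-3) the model is trusted with

def expected_cost(model: ModelOption, prompt_tokens: int, cache_hit_rate: float) -> float:
    """Expected input cost in USD, given the share of tokens served from cache."""
    cached = prompt_tokens * cache_hit_rate
    uncached = prompt_tokens - cached
    return (uncached * model.price_per_mtok
            + cached * model.cached_price_per_mtok) / 1_000_000

def route(complexity: int, prompt_tokens: int,
          models: List[ModelOption], cache_hit_rates: Dict[str, float]) -> ModelOption:
    """Pick the cheapest model that is capable of the task."""
    candidates = [m for m in models if m.max_complexity >= complexity]
    if not candidates:
        raise ValueError("no model can handle this task")
    return min(
        candidates,
        key=lambda m: expected_cost(m, prompt_tokens, cache_hit_rates.get(m.name, 0.0)),
    )

# Invented figures for illustration only.
MODELS = [
    ModelOption("open-weight-default", 0.50, 0.10, 2),
    ModelOption("frontier", 5.00, 0.50, 3),
]
HIT_RATES = {"open-weight-default": 0.6, "frontier": 0.6}

routine = route(1, 10_000, MODELS, HIT_RATES)   # routine task
complex_ = route(3, 10_000, MODELS, HIT_RATES)  # task needing complex reasoning
```

With these made-up numbers, the routine task goes to the cheaper default model and the complex task goes to the frontier model. A production router would also need some way to estimate task complexity from the prompt, which this sketch takes as given.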

The second is aggressive caching: Coinbase requires every request to be cache-aware so that existing caches are reused as much as possible. Armstrong cited LibreChat as an example: once caching was implemented correctly, the cache hit rate jumped from 5% to 60%.
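The article does not say what "cache-aware" means in Coinbase's setup. One common interpretation, sketched below in Python as an assumption, is that prompt caches match on a request's leading text, so stable content (system prompt, tool definitions) should come first and volatile content (IDs, timestamps, the user's question) last. The `PrefixCache` class is a toy stand-in for a provider-side cache, and the whitespace token count is deliberately crude.

```python
import hashlib

class PrefixCache:
    """Toy prompt cache keyed on the exact prefix text (a hypothetical stand-in)."""

    def __init__(self) -> None:
        self._seen = set()
        self.hit_tokens = 0
        self.total_tokens = 0

    def send(self, prefix: str, suffix: str) -> None:
        key = hashlib.sha256(prefix.encode("utf-8")).hexdigest()
        prefix_tokens = len(prefix.split())  # crude whitespace token count
        suffix_tokens = len(suffix.split())
        if key in self._seen:
            self.hit_tokens += prefix_tokens  # prefix served from cache
        else:
            self._seen.add(key)               # first sighting: cache miss
        self.total_tokens += prefix_tokens + suffix_tokens

    @property
    def hit_rate(self) -> float:
        return self.hit_tokens / self.total_tokens if self.total_tokens else 0.0

SYSTEM = "You are a code assistant. " * 20  # stable instructions shared by every request
QUESTIONS = ["explain this diff", "write a unit test", "rename this variable"]

# Cache-unaware: a volatile request ID leads the prompt, so no prefix ever repeats.
naive = PrefixCache()
for i, question in enumerate(QUESTIONS):
    naive.send(f"request-id {i} " + SYSTEM, question)

# Cache-aware: the stable text leads and the volatile parts move to the end.
aware = PrefixCache()
for i, question in enumerate(QUESTIONS):
    aware.send(SYSTEM, f"request-id {i} " + question)
```

Here the cache-unaware ordering never hits the cache, while the cache-aware ordering reuses the shared prefix on every request after the first. The resulting hit rates are artifacts of the toy data and are unrelated to the 5% and 60% figures Armstrong reported.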

The third is context streamlining: Armstrong advises starting new sessions when switching tasks, narrowing file context scope, and disconnecting unused tools. He emphasized that the goal is not to reduce the total number of tokens used, but to reduce "wasted tokens."

Prioritizing Efficiency Over Restriction

Armstrong framed this cost-cutting as a prerequisite for scaling AI adoption, not as a limitation. He noted that engineers remain free to use any number of tokens and any model, but the company has made usage data visible and linked usage to business impact: "the more you spend, the more impact we expect."

He did not disclose absolute spending figures. Structurally, however, cutting expenses nearly in half while usage grows exponentially suggests that Coinbase has, at least in part, decoupled consumption from cost.

Armstrong concluded that this methodology is broadly applicable, and any enterprise can adopt it to achieve sustainable expansion of AI usage without letting cost become a ceiling.

