"The previous generation of models focused on knowledge modeling; this generation aims to model productivity." We have learned exclusively that enterprise AI company Yanli Intelligent recently completed an $8 million seed round led by BlueRun Ventures, with participation from the LightSource Entrepreneur Fund. Yanli Intelligent was founded by Zhang Fan, former COO of Zhihui AI, and aims to use commercial reinforcement learning to make digital workers more capable for enterprise clients.
Zhang Fan is a serial entrepreneur who previously conducted machine translation research at the French National Research Center. After returning to China in 2010, he worked on Siri-like intelligent voice products at Sogou and Tencent, founded Miaji Travel in 2013, became CTO of Dazuche in 2020, and established Yuanyin Intelligent in 2022, which was later acquired by Zhihui. He became COO of Zhihui AI in 2023 and left in June of this year.
Reinforcement learning has delivered significant results in areas such as mathematics and programming. For instance, new models from OpenAI and Google's Gemini reached gold-medal level at this year's International Mathematical Olympiad, while Cursor's code completion model, enhanced with online reinforcement learning, handles over 400 million requests daily. At its core, reinforcement learning learns an optimal strategy through trial and error: an agent interacts with an environment and is guided by rewards and penalties, which makes the approach well suited to tasks with clear rules and feedback. In business contexts, however, feedback is often sparse and delayed. In a sales scenario, for example, a three-hour conversation that does not end in a deal cannot simply be written off as ineffective. Business practice also relies on a wealth of tacit know-how that is not captured in explicit text, and this know-how must be combined with foundation models to build effective reward mechanisms for real-world business applications.
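To make the trial-and-error mechanism and the "sparse and delayed feedback" problem concrete, here is a minimal sketch of tabular Q-learning on a toy task. It is an illustration only, not a description of how Yanli Intelligent or any company named here trains its models. The environment (a short chain of states where the only reward arrives at the final step), the function names `train_q_table` and `greedy_policy`, and all parameter values are assumptions made for this example.

```python
import random


def train_q_table(n_states=6, episodes=500, alpha=0.5, gamma=0.9,
                  epsilon=0.2, seed=0):
    """Tabular Q-learning on a hypothetical chain of states 0..n_states-1.

    Actions: 0 = step left, 1 = step right. The only reward (+1) is paid
    on reaching the last state, so feedback is sparse and delayed.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    goal = n_states - 1
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap the number of steps per episode
            # Explore with probability epsilon (or on ties), else exploit.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state = max(0, state - 1) if action == 0 else state + 1
            reward = 1.0 if next_state == goal else 0.0
            # The goal is terminal, so it contributes no future value.
            best_next = 0.0 if next_state == goal else max(q[next_state])
            # Move the estimate toward reward plus discounted future value.
            target = reward + gamma * best_next
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if state == goal:
                break
    return q


def greedy_policy(q):
    """Best known action for every non-terminal state."""
    return [0 if left > right else 1 for left, right in q[:-1]]


if __name__ == "__main__":
    table = train_q_table()
    print(greedy_policy(table))
    print([round(max(row), 3) for row in table])
```

Even though intermediate steps pay nothing, the value of the final reward gradually propagates back to earlier states through the discounted update. A real business setting is much harder than this toy: the "goal" is rarely a single well-defined state, which is why the design of the reward mechanism matters so much.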
Internationally, Thinking Machines Lab, founded by former OpenAI CTO Mira Murati, has also indicated plans to use reinforcement learning to help enterprises customize AI models that directly optimize key performance indicators such as revenue or profit. Firms currently pursuing Agent-to-B generally fall into three types. The first is new companies that emerged after the 2023 surge and are exploring new AI-based enterprise services, such as Yuhua Technology, XunTu Technology, and Future Intelligence. The second is SaaS companies that apply new AI technologies to related services and already have established customer bases and industry data, such as Fourth Paradigm, MingLue Technology, and LaiYe Technology. The third is large companies and cloud vendors that build service platforms on self-developed models, such as Alibaba Cloud, ByteDance, Tencent, and Baidu, all of which offer related products and services.
Zhang Fan said the current mainstream approach to Agent-to-B is a fixed workflow of "customization + Full-parameter Fine-Tuning (FFT)." Each scenario must adapt the base model, processes, and systems from scratch, which makes the deployment cost of any single scenario extremely high. Meanwhile, foundation models are optimized against benchmarks, which usually yields general-purpose models that perform decently across industries, achieving 80% effectiveness, but struggle to truly excel in any particular one. Yanli Intelligent aims to orient its model development toward business outcomes by integrating commercial know-how with reinforcement learning tailored to the business domain. "The previous generation of models was about knowledge modeling, compressing knowledge into models; this generation needs to model productivity, decompressing the model back into the real world," said Zhang Fan.
Since 2015, China has seen a surge of investment in enterprise services. However, low willingness to pay in the domestic market, difficulty managing payment terms, and high sales and operating costs meant that many startups from that earlier wave developed poorly, and early investors incurred losses. As a result, funds are now more cautious about investing in the B2B enterprise service sector. An investor at a US dollar fund said the fund will continue to monitor the Agent-to-B track but will primarily target top founders and firms.