XPeng and Peking University Propose Novel Visual Token Pruning Framework; He Xiaopeng: New Breakthrough Achieved on the Path to Exploring L4

The paper acceptance results for AAAI 2026, an international artificial intelligence conference, were recently announced. Among the accepted papers is "FastDriveVLA: Efficient End-to-End Driving via Plug-and-Play Reconstruction-based Token Pruning," written jointly by XPeng Inc. and the National Key Laboratory of Multimedia Information Processing at Peking University's School of Computer Science. The paper's primary contribution is FastDriveVLA, an efficient visual token pruning framework tailored to end-to-end autonomous driving VLA (vision-language-action) models.

Reportedly, FastDriveVLA includes a plug-and-play visual token pruner called ReconPruner. During inference on the vehicle-side model, ReconPruner can be embedded directly into an autonomous driving VLA model to prune visual tokens, without retraining the entire model. To train the pruner, the team built a dedicated dataset, nuScenes-FG, containing 241,000 image-mask pairs from six camera views. This large-scale annotated dataset for foreground segmentation in driving scenes can also be used in future autonomous driving research.

In tests on the nuScenes autonomous driving dataset, the pruning framework achieved state-of-the-art (SOTA) results across various pruning ratios. With 25% of visual tokens pruned, driving performance showed almost no degradation, and the L2 trajectory error and collision rate were even better than those of the unpruned baseline model. With 50% of tokens pruned, performance was more balanced across all metrics. At the same time, the inference efficiency of the VLA model improved significantly.
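To make the notion of a pruning ratio concrete, here is a minimal Python sketch of score-based token pruning. It is an illustration only, not the ReconPruner method: the article does not describe how ReconPruner scores tokens, so the function name `prune_tokens`, the per-token `scores`, and the keep-the-highest-scores rule are all assumptions introduced here.

```python
from typing import List, Sequence, Tuple


def prune_tokens(
    tokens: Sequence[Sequence[float]],
    scores: Sequence[float],
    prune_ratio: float,
) -> Tuple[List[Sequence[float]], List[int]]:
    """Drop the lowest-scoring fraction of tokens (hypothetical helper).

    `scores` stands in for whatever importance signal a pruner produces;
    how ReconPruner computes it is not covered by the article.
    """
    if len(tokens) != len(scores):
        raise ValueError("tokens and scores must have the same length")
    if not 0.0 <= prune_ratio < 1.0:
        raise ValueError("prune_ratio must be in [0, 1)")

    n_drop = int(len(tokens) * prune_ratio)
    n_keep = len(tokens) - n_drop

    # Rank token indices by score, highest first.
    ranked = sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)

    # Keep the top-scoring indices, restored to their original order so the
    # surviving tokens keep their relative positions in the sequence.
    kept = sorted(ranked[:n_keep])
    return [tokens[i] for i in kept], kept


# Toy example: 8 two-dimensional "tokens" with made-up importance scores.
toy_tokens = [[float(i), float(i) * 2.0] for i in range(8)]
toy_scores = [0.9, 0.1, 0.4, 0.8, 0.2, 0.7, 0.3, 0.6]

_, kept_25 = prune_tokens(toy_tokens, toy_scores, 0.25)
_, kept_50 = prune_tokens(toy_tokens, toy_scores, 0.50)
print(kept_25)  # [0, 2, 3, 5, 6, 7]
print(kept_50)  # [0, 3, 5, 7]
```

In this toy setting, a 25% ratio removes 2 of 8 tokens and a 50% ratio removes 4. The downstream model then processes a shorter visual token sequence, which is where the inference savings would come from.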

The FastDriveVLA framework, proposed jointly by XPeng Inc. and Peking University, establishes a new paradigm for efficient visual token pruning in autonomous driving VLA models and sets a new benchmark for the efficient deployment of large vehicle-side models. XPeng Chairman He Xiaopeng commented on Weibo: "We are very pleased to have achieved another new breakthrough on our path to exploring L4. We will continue to advance in the field of Physical AI and look forward to the second-generation VLA delivering an even better smart driving experience to our Peng friends."

