<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>ForgeTrain &#8211; 孙威的阳光海</title>
	<atom:link href="https://www.sunnyfly.com/tag/forgetrain/feed" rel="self" type="application/rss+xml" />
	<link>https://www.sunnyfly.com</link>
	<description>Life has no ultimate meaning, only the good times along the way.</description>
	<lastBuildDate>Tue, 16 Jun 2026 09:27:03 +0000</lastBuildDate>
	<language>zh-Hans</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=7.0</generator>

<image>
	<url>https://www.sunnyfly.com/wp-content/uploads/2019/07/cropped-wyat-32x32.jpg</url>
	<title>ForgeTrain &#8211; 孙威的阳光海</title>
	<link>https://www.sunnyfly.com</link>
	<width>32</width>
	<height>32</height>
</image> 
	<item>
		<title>AI Builds AI: ModelBest Drops the &#8220;Human-Written&#8221; Part</title>
		<link>https://www.sunnyfly.com/ai%e8%87%aa%e5%b7%b1%e9%80%a0ai%ef%bc%8c%e9%9d%a2%e5%a3%81%e6%8a%8a%e4%ba%ba%e5%b7%a5%e4%b8%a4%e4%b8%aa%e5%ad%97%e5%88%a0%e4%ba%86.html</link>
		
		<dc:creator><![CDATA[孙威]]></dc:creator>
		<pubDate>Fri, 29 May 2026 09:26:14 +0000</pubDate>
				<category><![CDATA[AI]]></category>
		<category><![CDATA[ForgeTrain]]></category>
		<category><![CDATA[预训练框架]]></category>
		<guid isPermaLink="false">https://www.sunnyfly.com/?p=3355</guid>

					<description><![CDATA[Here is what happened. The night before last, I scrolled past a piece of news, and right then I couldn't...]]></description>
										<content:encoded><![CDATA[<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0, 0, 0, 0.9); line-height: 1.8; margin-bottom: 24px; visibility: visible;"><span style="visibility: visible;"><span style="color: rgba(0, 0, 0, 0.9); visibility: visible;">Here is what happened.</span></span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0, 0, 0, 0.9); line-height: 1.8; margin-bottom: 24px; visibility: visible;"><span style="visibility: visible;"><span style="color: rgba(0, 0, 0, 0.9); visibility: visible;">The night before last, I scrolled past a piece of news and could not hold it together.</span></span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0, 0, 0, 0.9); line-height: 1.8; margin-bottom: 24px; visibility: visible;"><span style="visibility: visible;"><span style="color: rgba(0, 0, 0, 0.9); visibility: visible;">ModelBest (面壁智能), a Chinese AI company, has built something called ForgeTrain.</span></span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0, 0, 0, 0.9); line-height: 1.8; margin-bottom: 24px; visibility: visible;"><span style="visibility: visible;"><span style="color: rgba(0, 0, 0, 0.9); visibility: visible;">It is not a new model and not a new app. It is a complete pretraining framework for large models.</span></span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0, 0, 0, 0.9); line-height: 1.8; margin-bottom: 24px; visibility: visible;"><span style="visibility: visible;"><span style="color: rgba(0, 0, 0, 0.9); visibility: visible;">The key point: the thing was written by AI itself.</span></span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0, 0, 0, 0.9); line-height: 1.8; margin-bottom: 24px; visibility: visible;"><span style="visibility: visible;"><span style="color: rgba(0, 0, 0, 0.9); visibility: visible;">Zero hand-written code.</span></span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0, 0, 0, 0.9); line-height: 1.8; margin-bottom: 24px; visibility: visible;"><span style="visibility: visible;"><span style="color: rgba(0, 0, 0, 0.9); visibility: visible;">You read that right. An AI wrote a framework for training AI, and that framework was then used to train a new model, MiniCPM5-1B.</span></span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0, 0, 0, 0.9); line-height: 1.8; margin-bottom: 24px; visibility: visible;"><span style="visibility: visible;"><span style="color: rgba(0, 0, 0, 0.9); visibility: visible;">From framework to model, no human programmer hand-wrote any of the core code.</span></span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0, 0, 0, 0.9); line-height: 1.8; margin-bottom: 24px; visibility: visible;"><span style="visibility: visible;"><span style="color: rgba(0, 0, 0, 0.9); visibility: visible;">If this is true, it is the first time anyone in the world has pulled this off end to end.</span></span></p>
<hr style="box-sizing: border-box; margin: 16px 0px; border-width: 0.8px medium medium; border-style: solid none none; border-color: #e5e5e5 currentcolor currentcolor; border-image: initial; color: rgba(0, 0, 0, 0.9); font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, 'Helvetica Neue', sans-serif; font-size: 15px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; white-space: normal; text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial; visibility: visible;" />
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0, 0, 0, 0.9); line-height: 1.8; margin-bottom: 24px; visibility: visible;"><span style="visibility: visible;"><span style="color: rgba(0, 0, 0, 0.9); visibility: visible;">Let me rein in the excitement and lay things out.</span></span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0, 0, 0, 0.9); line-height: 1.8; margin-bottom: 24px; visibility: visible;"><span style="visibility: visible;"><span style="color: rgba(0, 0, 0, 0.9); visibility: visible;">Most people have no feel for what a large-model training framework is, so here is an analogy.</span></span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0, 0, 0, 0.9); line-height: 1.8; margin-bottom: 24px; visibility: visible;"><span style="visibility: visible;"><span style="color: rgba(0, 0, 0, 0.9); visibility: visible;">If you are building a car, the training framework is the production line. Who used to build that line? NVIDIA's engineers built Megatron, Meta's engineers built Fairseq, and Google's engineers built TensorFlow.</span></span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0, 0, 0, 0.9); line-height: 1.8; margin-bottom: 24px; visibility: visible;"><span style="visibility: visible;"><span style="color: rgba(0, 0, 0, 0.9); visibility: visible;">All of it was human labor, stacked up line by line.</span></span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0, 0, 0, 0.9); line-height: 1.8; margin-bottom: 24px; visibility: visible;"><span style="visibility: visible;"><span style="color: rgba(0, 0, 0, 0.9); visibility: visible;">Now ModelBest says its AI designed this production line, wrote the code, and put it together on its own.</span></span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0, 0, 0, 0.9); line-height: 1.8; margin-bottom: 24px; visibility: visible;"><span style="visibility: visible;"><span style="color: rgba(0, 0, 0, 0.9); visibility: visible;">They gave it a name: ForgeTrain.</span></span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">Then comes the most striking part.</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">Once ForgeTrain was running, it trained 10% faster than NVIDIA's Megatron.</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">On Huawei Ascend chips, it ran 10% faster than Ascend's own native framework.</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">In other words, a framework written by AI beat the optimized frameworks that human engineers wrote for their own chips.</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">I stared at the screen for quite a while.</span></p>
<hr style="box-sizing: border-box; margin: 16px 0px; border-right: none; border-bottom: none; border-left: none; border-image: initial; border-top: 0.8px solid #e5e5e5; color: rgba(0, 0, 0, 0.9); font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, 'Helvetica Neue', sans-serif; font-size: 15px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; white-space: normal; text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;" />
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">To see why this matters, you first have to understand how far ForgeTrain actually got.</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">ModelBest defines a scale from L1 to L5 that describes how far autonomous AI research and development can go.</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">L1 is AI making suggestions that a human carries out, which is what GitHub Copilot does. L2 goes a step further: the AI can finish a function, revise a script, or tune a few parameters, and this is where Cursor and Claude Code sit. L3 is a different matter: the AI produces the next-generation model end to end, without humans stepping in at every turn. ForgeTrain stands at L3. Above that, L4 is AI that can rework its own training pipeline, and L5, further off still, is AI deciding for itself what to research tomorrow.</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">Most products in the industry today are still struggling somewhere between L1 and L2.</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">ModelBest says it has reached L3.</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">I was skeptical at first.</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">AI writing a crawler script or a front-end page is nothing special. But what does a pretraining framework do? It is the infrastructure that defines how a neural network communicates, how compute is allocated, and how training is coordinated across thousands or tens of thousands of accelerators.</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">When this kind of software has a bug, it does not crash. It runs to completion, produces garbage, and never raises an error.</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">Can AI really write this?</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">Then I looked at their technical approach and was somewhat convinced.</span></p>
<hr style="box-sizing: border-box; margin: 16px 0px; border-right: none; border-bottom: none; border-left: none; border-image: initial; border-top: 0.8px solid #e5e5e5; color: rgba(0, 0, 0, 0.9); font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, 'Helvetica Neue', sans-serif; font-size: 15px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; white-space: normal; text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;" />
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">At the core of ForgeTrain is an automated verification system called Harness.</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">Put plainly, the AI is locked in an automated exam room.</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">The process is simple: the AI generates a piece of code, the system runs the tests automatically and feeds the results back, the AI revises the code based on that feedback, the tests run again, and the cycle repeats.</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">No human intervention is needed at any point.</span></p>
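<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">To make the loop concrete, here is a minimal Python sketch of a generate, test, feed back, revise cycle. It is not ForgeTrain's code. The names HarnessReport, refine_until_pass, toy_generate, and toy_run_tests are hypothetical, and the "model" is a toy stand-in that fixes one known defect after seeing the feedback.</span></p>

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class HarnessReport:
    """Result of one automated test run (hypothetical structure)."""
    passed: bool
    feedback: str


def refine_until_pass(
    generate: Callable[[str], str],
    run_tests: Callable[[str], HarnessReport],
    max_rounds: int = 10,
) -> Tuple[str, int, bool]:
    """Generate code, test it, feed the report back, and repeat.

    Returns (latest_code, rounds_used, passed).
    """
    feedback = ""  # the first round has no feedback yet
    code = ""
    for round_number in range(1, max_rounds + 1):
        code = generate(feedback)
        report = run_tests(code)
        if report.passed:
            return code, round_number, True
        feedback = report.feedback
    return code, max_rounds, False


# Toy stand-in for the model: it repairs the bug once the feedback names it.
def toy_generate(feedback: str) -> str:
    if "off by one" in feedback:
        return "def add(a, b): return a + b"
    return "def add(a, b): return a + b + 1"


# Toy stand-in for the harness: run the candidate and check one known answer.
def toy_run_tests(code: str) -> HarnessReport:
    namespace: dict = {}
    exec(code, namespace)
    if namespace["add"](2, 3) == 5:
        return HarnessReport(True, "all tests passed")
    return HarnessReport(False, "add(2, 3) is off by one")


code, rounds, passed = refine_until_pass(toy_generate, toy_run_tests)
print(rounds, passed)
```

<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">In this toy run the first attempt fails, the feedback goes back in, and the second attempt passes. The human contribution is the test, not the code.</span></p>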
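<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">Their method proceeds in three stages.</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">First, key data is collected from existing pretraining frameworks to build the evaluation criteria and the Harness. Next, the Harness is used to push the AI to write code until it passes every evaluation. The last stage is the most interesting: the AI's code is no longer required to match the reference implementation bit for bit, and the AI is set free to keep optimizing until the result runs faster than the human-written reference framework.</span></p>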
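<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">The difference between a bit-for-bit check and the relaxed final stage can be sketched as two acceptance rules. This is an illustration under stated assumptions, not the project's actual criteria: the function names, the relative tolerance of 1e-3, and the loss values and timings are all made up.</span></p>

```python
import math
import struct
from typing import Sequence


def bitwise_equal(candidate: Sequence[float], reference: Sequence[float]) -> bool:
    """Strict check: every value has the same 64-bit float pattern."""
    if len(candidate) != len(reference):
        return False
    return all(
        struct.pack("!d", c) == struct.pack("!d", r)
        for c, r in zip(candidate, reference)
    )


def close_enough(candidate: Sequence[float], reference: Sequence[float],
                 rel_tol: float = 1e-3) -> bool:
    """Relaxed check: values agree within an assumed relative tolerance."""
    if len(candidate) != len(reference):
        return False
    return all(math.isclose(c, r, rel_tol=rel_tol)
               for c, r in zip(candidate, reference))


def accept_optimized(candidate: Sequence[float], reference: Sequence[float],
                     candidate_seconds: float, reference_seconds: float) -> bool:
    """Final-stage rule: numerically close and strictly faster than the reference."""
    return close_enough(candidate, reference) and reference_seconds > candidate_seconds


# Made-up loss curves and timings, for illustration only.
reference_losses = [2.50, 2.10, 1.80]
candidate_losses = [2.5001, 2.0999, 1.8001]

print(bitwise_equal(candidate_losses, reference_losses))
print(accept_optimized(candidate_losses, reference_losses, 90.0, 100.0))
```

<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">Here the candidate fails the strict check but passes the relaxed one, which is exactly the room an optimizer needs to reorder floating-point operations in exchange for speed.</span></p>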
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">When I read this, a picture came to mind.</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">It is like a teacher who writes an exam, locks the AI in the classroom, and says: when you finish I will grade it automatically, then you revise your answers, and you keep going until you score full marks.</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">The difference is that this exam does not test math. It tests "write a training framework that runs correctly and runs fast".</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">This verifiability is the smartest thing about ForgeTrain.</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">Whether a training framework is correct shows up as soon as you run it: either it trains a normal model or it does not. There is no gray area of "looks right but is actually wrong".</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">In an environment with a clear standard for right and wrong, AI can iterate its way to usable code.</span></p>
<hr style="box-sizing: border-box; margin: 16px 0px; border-right: none; border-bottom: none; border-left: none; border-image: initial; border-top: 0.8px solid #e5e5e5; color: rgba(0, 0, 0, 0.9); font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, 'Helvetica Neue', sans-serif; font-size: 15px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; white-space: normal; text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;" />
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">Then comes the comparison that stings a little.</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">On identical hardware, ForgeTrain trains 10% faster than NVIDIA's Megatron.</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">What is Megatron? It is the training framework NVIDIA spent years and a large number of engineers optimizing specifically for its own GPUs. And a framework that an AI "wrote" in a few dozen minutes runs faster.</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">In the past I would have written this off as marketing talk.</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">But this time ForgeTrain is open source, and the code can be found on GitHub at github.com/OpenBMB/ForgeTrain.</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">They also used ForgeTrain to pretrain MiniCPM5-1B on Huawei Ascend, again with a 10% speedup over Ascend's native framework.</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">Ascend's MindSpeed is a training framework that Huawei engineers tuned specifically for Ascend chips. The AI-written version won another round.</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">At this point, the first thing I felt was not excitement but something more complicated.</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">It is the feeling you get when you know something will happen sooner or later, and it still hits you when it actually does.</span></p>
<hr style="box-sizing: border-box; margin: 16px 0px; border-right: none; border-bottom: none; border-left: none; border-image: initial; border-top: 0.8px solid #e5e5e5; color: rgba(0, 0, 0, 0.9); font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, 'Helvetica Neue', sans-serif; font-size: 15px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; white-space: normal; text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;" />
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">MiniCPM5-1B deserves a few words of its own.</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">It has 1B parameters. The weights take about 2GB at FP16 precision and about 0.5GB after INT4 quantization.</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">What does 0.5GB mean? Plenty of the apps on your phone are bigger than that.</span></p>
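<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">Those two sizes follow from simple arithmetic: parameters times bits per parameter, divided by eight. The sketch below assumes exactly one billion parameters and decimal gigabytes, and it ignores the small overhead that real INT4 formats add for scales and zero points. The function name is hypothetical.</span></p>

```python
def weight_size_gb(num_params: float, bits_per_param: int) -> float:
    """Raw weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


fp16_gb = weight_size_gb(1e9, 16)  # 2 bytes per parameter
int4_gb = weight_size_gb(1e9, 4)   # half a byte per parameter

print(f"FP16: {fp16_gb:.1f} GB, INT4: {int4_gb:.1f} GB")
```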
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">Yet this small model averages 42.57 across the evaluation suite, and on public benchmarks such as MMLU-Pro, MMLU-Redux, AIME-2025, AIME-2026, BFCL-v4, and the AA leaderboard it ranks ahead of other models of the same size.</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">More striking still, on the well-known international AA-Index leaderboard it beats every model under 2B parameters.</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">Qwen3.5-2B, released three months ago, has twice as many parameters and still performs worse.</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">ModelBest wants this case to prove one point: small models can reach high intelligence density, and capability does not have to come from piling on parameters.</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">They have put a rate on the trend: the intelligence density of large models doubles roughly every 3.5 months.</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">If that pace holds, a 1B model would match today's 4B models after two doublings, which is about seven months and well within a year.</span></p>
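<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">The projection is plain exponential arithmetic. The sketch below takes the quoted 3.5-month doubling period at face value and assumes, as a simplification, that "density" trades one-for-one against parameter count. The function names are hypothetical.</span></p>

```python
import math

DOUBLING_MONTHS = 3.5  # doubling period quoted in the text


def months_to_match(size_ratio: float, doubling_months: float = DOUBLING_MONTHS) -> float:
    """Months until a model matches one that is size_ratio times larger today."""
    return doubling_months * math.log2(size_ratio)


def density_gain(months: float, doubling_months: float = DOUBLING_MONTHS) -> float:
    """Multiplicative density gain after the given number of months."""
    return 2 ** (months / doubling_months)


months_for_4x = months_to_match(4)  # two doublings
gain_in_a_year = density_gain(12)

print(f"{months_for_4x:.1f} months to close a 4x gap")
print(f"{gain_in_a_year:.1f}x gain over 12 months")
```

<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">Closing a 4x gap takes two doublings, or seven months. A full year at the same rate would be a gain of a little under 11x, which shows how sensitive the forecast is to the doubling period actually holding.</span></p>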
<hr style="box-sizing: border-box; margin: 16px 0px; border-right: none; border-bottom: none; border-left: none; border-image: initial; border-top: 0.8px solid #e5e5e5; color: rgba(0, 0, 0, 0.9); font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, 'Helvetica Neue', sans-serif; font-size: 15px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; white-space: normal; text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;" />
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">I think the impact on the industry can be looked at from a few angles.</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">The most obvious one is cost. Compute for training a large model is astronomically expensive. If ForgeTrain can reliably deliver a 10% speedup, read here as 10% less training time, the same budget covers about 11% more experiments, or the same experiments cost 10% less.</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">10% may not sound like much, but at the scale of large-model training, where budgets run into the hundreds of millions, 10% is tens of millions.</span></p>
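<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">How much a "10% faster" framework saves depends on what the 10% measures. The sketch below works through both readings, assuming cost is proportional to training time. The function names are hypothetical, and the source does not say which reading the published figure uses.</span></p>

```python
from typing import Tuple


def from_time_reduction(reduction: float) -> Tuple[float, float]:
    """Reading A: each run takes (1 - reduction) of the old wall-clock time.

    Returns (extra experiments per budget, cost saving per experiment).
    """
    time_ratio = 1.0 - reduction
    return 1.0 / time_ratio - 1.0, reduction


def from_throughput_gain(gain: float) -> Tuple[float, float]:
    """Reading B: throughput (tokens per second) rises by the given fraction.

    Returns (extra experiments per budget, cost saving per experiment).
    """
    time_ratio = 1.0 / (1.0 + gain)
    return gain, 1.0 - time_ratio


extra_a, saving_a = from_time_reduction(0.10)
extra_b, saving_b = from_throughput_gain(0.10)

print(f"Time cut by 10%:   {extra_a:.1%} more experiments, {saving_a:.1%} cheaper")
print(f"Throughput up 10%: {extra_b:.1%} more experiments, {saving_b:.1%} cheaper")
```

<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">The first reading gives roughly 11% more experiments and 10% lower cost. The second gives 10% more experiments and roughly 9% lower cost. Either way the saving is on the same order.</span></p>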
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">One layer down is research and development efficiency.</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">Writing a training framework used to take a team of human engineers weeks or even months. If ForgeTrain's approach matures, that cycle could shrink to a few dozen minutes, provided you can define the Harness test cases well.</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">The human engineer's role shifts from "writing code by hand" to "designing the verification criteria".</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">ModelBest describes this change as moving from Human in the loop to Human on the loop.</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">Before, humans executed inside the loop. From now on, humans supervise from outside it.</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">There is one more layer, and it is the one I care about most: the opportunity for China's domestic compute.</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">Huawei Ascend hardware has improved a great deal in recent years, but the software ecosystem is its weak spot. NVIDIA's CUDA ecosystem has more than 15 years of accumulation behind it, developers are used to the Megatron, PyTorch, and TensorFlow stack, and moving to Ascend means relearning and retuning.</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">If training frameworks can be generated and adapted automatically by AI, there is a chance to leapfrog that ecosystem gap.</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">AI does not need 15 years to build up an ecosystem. It can generate optimized code for Ascend within days.</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">If this really works out, the software gap in domestic compute might be filled by AI itself.</span></p>
<hr style="box-sizing: border-box; margin: 16px 0px; border-right: none; border-bottom: none; border-left: none; border-image: initial; border-top: 0.8px solid #e5e5e5; color: rgba(0, 0, 0, 0.9); font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, 'Helvetica Neue', sans-serif; font-size: 15px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; white-space: normal; text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;" />
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">By now, I expect some readers are ready to object.</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">Wyat, are you hyping things up again? Is AI writing frameworks actually reliable? Aren't you getting carried away over one or two examples?</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">To be honest, ForgeTrain is still a result in a specific setting. What it has validated so far is one specific stage, the pretraining framework, and it was demonstrated on a small 1B model.</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">Whether it extends to more complex models and more general scenarios still needs more validation.</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">But I think direction matters more than distance.</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">When we used to discuss whether AI can write code, the question was "can AI assist human programmers?"</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">ForgeTrain has turned the question into "can AI replace human programmers in writing core infrastructure?"</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">If only the first question gets a yes, AI is a tool. If the second one does too, AI is something else.</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">I am not sure what the final answer is, but I am sure the question now deserves serious thought.</span></p>
<hr style="box-sizing: border-box; margin: 16px 0px; border-right: none; border-bottom: none; border-left: none; border-image: initial; border-top: 0.8px solid #e5e5e5; color: rgba(0, 0, 0, 0.9); font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, 'Helvetica Neue', sans-serif; font-size: 15px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; white-space: normal; text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;" />
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">One detail left a deep impression on me.</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">Alongside ForgeTrain, ModelBest proposed a programming paradigm called Forge Engineering.</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">The core idea is to break with the "one-size-fits-all" logic of traditional general-purpose frameworks and have AI automatically generate specialized code for each model, each piece of hardware, and each task.</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">In plain terms: we used to make clothes by cutting a one-size pattern that more or less fits everyone and fits no one perfectly. The AI approach is that when a person walks in, a garment is generated on the spot that fits only that person.</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">The one-size logic is "I build a general framework and all of you adapt to me".</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">The Forge Engineering logic is "your model + your hardware + your task = I generate a dedicated framework for you".</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">The difference between the two amounts to a possible major shift in how software is developed.</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">It is a shift from "platform-based and general-purpose" to "generated on demand and optimized for one purpose".</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">If this really works, we may not need so many general-purpose frameworks in the future. Whatever you need, AI generates it on the spot, optimized for you.</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">It sounds like science fiction, but ForgeTrain has already produced one verifiable sample of this path.</span></p>
<hr style="box-sizing: border-box; margin: 16px 0px; border-right: none; border-bottom: none; border-left: none; border-image: initial; border-top: 0.8px solid #e5e5e5; color: rgba(0, 0, 0, 0.9); font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, 'Helvetica Neue', sans-serif; font-size: 15px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; white-space: normal; text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;" />
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">I want to talk about where this takes my thinking further out.</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">Last year people were still arguing about whether AI would replace programmers.</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">This year AI has started writing frameworks.</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">And next year?</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">I am not someone who gets anxious easily, but ForgeTrain did make me stop and think.</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">If AI can write a training framework and train a model, what is the next thing that AI "builds by itself"?</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">An AI-written operating system? An AI-written compiler? An AI-written database?</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">ModelBest's definition of L5 reads "AI sets its own research agenda and explores open-endedly".</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">If that stage arrives, AI is no longer a tool. It is something that can decide for itself what to research next.</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">I know some people will find this too distant, too much like science fiction.</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">But two days ago ForgeTrain turned "AI building AI" from a slogan into a reproducible engineering sample.</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">It happened on May 26, 2026. Remember the date. Looking back, it may turn out to be a milestone.</span></p>
<hr style="box-sizing: border-box; margin: 16px 0px; border-right: none; border-bottom: none; border-left: none; border-image: initial; border-top: 0.8px solid #e5e5e5; color: rgba(0, 0, 0, 0.9); font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, 'Helvetica Neue', sans-serif; font-size: 15px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; white-space: normal; text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;" />
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">Finally, here is one judgment I largely agree with.</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">In its release, the ModelBest (面壁智能) team wrote a passage to the effect that the logic of large-model competition is shifting from "piling up resources and competing on parameter count" to "improving efficiency."</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">ForgeTrain is a product of that efficiency logic.</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">In the future, the contest may no longer be about "how many GPUs I have" or "how many parameters my model has," but about "which framework I train with" and "how much more efficient my training is than yours."</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">For companies and regions with relatively limited resources, this shift is actually an opportunity.</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">That is because efficiency gives latecomers a way to overtake; piling up resources does not.</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">The path Chinese AI has taken with ForgeTrain is different from before. It is not chasing from behind; it is taking the first step in a genuinely frontier direction.</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">Whether this will work, whether it will scale, and whether it will really change the industry are all too early to call.</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">But I think it deserves a pause and a serious look.</span></p>
<hr style="box-sizing: border-box; margin: 16px 0px; border-right: none; border-bottom: none; border-left: none; border-image: initial; border-top: 0.8px solid #e5e5e5; color: rgba(0, 0, 0, 0.9); font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, 'Helvetica Neue', sans-serif; font-size: 15px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; white-space: normal; text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;" />
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">That's all. Since you have read this far, if you liked it, please give it a like, a "Wow" (在看), and a share. If you would like to get new posts as soon as they are pushed, you can also star my account<img src="https://s.w.org/images/core/emoji/17.0.2/72x72/2b50.png" alt="⭐" class="wp-smiley" style="height: 1em; max-height: 1em;" />.</span></p>
<p style="text-align: start; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;"><span style="color: rgba(0, 0, 0, 0.9);">Thank you for reading. See you next time.</span></p>
<blockquote style="font-size: 15px; font-weight: 400; color: rgba(0,0,0,0.55); line-height: 1.8; margin-bottom: 24px;">
<p style="box-sizing: border-box; margin: 0px; padding-bottom: 12px; font-size: 17px; font-weight: 400; color: rgba(0,0,0,0.9); line-height: 1.8; margin-bottom: 24px;">/ Author: Wyat / Contact email: wyat.sun@qq.com</p>
</blockquote>
]]></content:encoded>
					
		
		
			</item>
	</channel>
</rss>
