The Origins of Agar

January 23, 2026 · 郭瑞 · Source: tutorial资讯

Testing LLM reasoning abilities with SAT is not an original idea; recent research thoroughly tested models such as GPT-4o and found that, for hard enough problems, every model degrades to random guessing. But I couldn't find any research that used newer models like the ones I used. It would be nice to see equally thorough testing repeated with newer models.

270 million parameters: 10 times smaller than Gemma 3n E2B, yet sufficient for function calling

The two major missions of crewed lunar exploration.

Jimmy Lai wins appeal in fraud case: conviction and sentence quashed, release date brought forward

Pakistan now in 'open war' with Afghanistan, defence minister says, after countries trade attacks

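Notably, Zhou Yibao, head of the OPPO Find product line, also revealed on Weibo yesterday that the Find N6 will feature "the only Hasselblad 200-megapixel ultra-clear quad camera on a foldable" and will be the first foldable phone to carry a Danxia color reproduction lens.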