26 posts tagged "agents"

AI Daily Digest: Agent Success Rate Surges 12%→66%, a New Method for Detecting RL Reward Hacking - 2026/04/20

· 6 min read
Yi Wang
Full Stack & AI Engineer

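The Stanford 2026 AI Index has published new data: the AI agent task success rate jumped from 12% last year to 66%, and AI agent-related network traffic surged 7,851%. Meanwhile, this week's arXiv papers focus on AI safety auditing and RL reward hacking detection, Google released a new model for robotics, and Docker disclosed its agent sandbox architecture.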

AI Daily Digest: Claude Opus 4.7 Takes the Top Spot, OpenAI Moves into Life Sciences, Mozilla Launches Thunderbolt - 2026/04/17

· 7 min read
Yi Wang
Full Stack & AI Engineer

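On April 17, 2026, the AI industry saw another packed release day: Anthropic's Claude Opus 4.7 won 12 of 14 benchmarks, OpenAI released GPT-Rosalind, its first model dedicated to life sciences, and Mozilla challenged the enterprise AI market with the open-source Thunderbolt.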

The Agent Layer Is Hardening: OpenAI Sandboxes, Anthropic's Advisor Pattern, and the Protocol Wars Are Over

· 9 min read
Yi Wang
Full Stack & AI Engineer

April 16, 2026. The AI agent ecosystem just had one of its most consequential 72-hour windows of the year. OpenAI restructured how agents interact with compute. Anthropic published a new cost-efficiency architecture and shipped Claude Cowork to GA. Microsoft unified its fractured agentic SDKs. DeepSeek V4 is days away. And across developer communities, the backlash against unreliable agents is getting louder.

This is not hype. This is infrastructure. The agent layer is hardening.