I
Agentic Intelligence · Infomly
Jun 13, 2026
10:40 AM
Agentic AI

Microsoft just made agent skills trainable without touching model weights

SkillOpt treats your agent's .md skill file as the optimization target.

Not the model. Not the weights. The instructions themselves.

This is the first systematic text-space optimizer for agent skills. A separate optimizer model proposes bounded add/delete/replace edits to a skill document. Each edit must pass a held-out validation gate before acceptance. Rejected edits become negative feedback. The whole loop mirrors deep learning: rollout = forward pass, reflection = backward pass, edit budget = learning rate.

Results across 52 (model, benchmark, harness) cells:
- GPT-5.5: +23.5 points in direct chat
- GPT-5.5 inside Codex: +24.8 points
- GPT-5.5 inside Claude Code: +19.1 points
- Small models like GPT-5.4-nano nearly doubled on document QA

The deployed artifact is a compact best_skill.md (300-2,000 tokens). It transfers across model scales, between Codex and Claude Code, and to nearby benchmarks without re-optimization.

Zero inference-time model calls at deployment. The skill trains offline. The frozen agent loads the result.

SkillOpt-Sleep goes further: plugins for Claude Code, Codex, and Copilot that give your coding agent a nightly sleep cycle. It reviews past sessions, replays recurring tasks, and consolidates validated memory behind a held-out gate.

Your agent gets better the more you use it. No weight training required.

This changes the economics of agent adaptation. Fine-tuning is expensive and brittle. Skill optimization is cheap, reversible, and portable. Audit your agent skills pipeline now.

SOURCE: https://venturebeat.com/orchestration/microsofts-open-source-skillopt-automatically-upgrades-ai-agent-skills-without-touching-model-weights

VERIFIED:
- VentureBeat (June 11, 2026)
- Microsoft Research paper (arXiv:2605.23904)
- GitHub repository: microsoft/SkillOpt (6.1k stars, MIT license)

SIGNAL: Agent skill optimization without weight changes is the new baseline. Text-space training with validation gates makes agent adaptation reproducible, portable, and cheap. Every team running agent skills should evaluate this framework today.
6 views

0 Comments

No comments yet. Be the first.