Agentic X — What I'm Learning About LLMs as Implementation Agents

This section captures what I’m learning about applying LLMs to real engineering work. The “X” is deliberate—this is an evolving area where my understanding is changing rapidly.

The Core Insight

The biggest wins come from being the owner instead of the builder. When I own the intent, define the boundaries, and set the constraints, the LLM becomes a highly effective developer that implements the story.

This is a different mental model from “AI assistant” or “code completion.” It’s closer to having a junior developer who’s tireless, fast, and good at following instructions—but who needs clear direction.

What Has Worked

Clear User Stories

When I write user stories with explicit acceptance criteria, the LLM can implement them reliably. The key is being specific about what “done” looks like, including edge cases and error handling.

This has made me better at writing user stories in general. If I can’t explain what I want clearly enough for an LLM to implement it, I probably didn’t understand it clearly myself.
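One concrete way to pin down what “done” looks like is to express the acceptance criteria as executable checks and hand them to the model alongside the prose story. A minimal sketch, assuming a hypothetical story that asks for a `slugify()` helper:

```python
# Hypothetical story: "Add a slugify() helper for URL-safe titles."
# Each assertion below is one acceptance criterion, including edge cases,
# so the implementer (human or LLM) knows exactly what "done" means.

def slugify(title: str) -> str:
    """One implementation that satisfies the criteria below."""
    # Lowercase, replace every non-alphanumeric character with a space,
    # then join the remaining words with hyphens.
    cleaned = "".join(c if c.isalnum() else " " for c in title.lower())
    return "-".join(cleaned.split())

# Acceptance criteria as assertions:
assert slugify("Hello World") == "hello-world"                   # basic case
assert slugify("  Spaces   everywhere ") == "spaces-everywhere"  # trim and collapse
assert slugify("C++ & Rust!") == "c-rust"                        # strip punctuation
assert slugify("") == ""                                         # edge case: empty input
```

Criteria in this form leave the model far less room to guess, and they double as the review checklist afterward.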

Up-Front Integration Work

Putting in the work to connect MCP (Model Context Protocol) servers, scope SSH keys appropriately, and configure the environment so the model can do exactly what it needs pays off dramatically. The first hour of setup saves many hours later.

The pattern: give the LLM the minimum permissions it needs to accomplish the task, in an environment where it can’t accidentally cause damage outside that scope.
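The same pattern can be enforced in code. A minimal sketch, with an illustrative file-write tool that refuses to touch anything outside a sandbox directory (the names and the tool shape are mine, not any particular agent framework’s API):

```python
import tempfile
from pathlib import Path

# Hypothetical sandbox root; in real use this would be the agent's
# dedicated working directory.
SANDBOX = Path(tempfile.mkdtemp(prefix="agent-sandbox-")).resolve()

def safe_write(relative_path: str, content: str) -> Path:
    """Write a file, but only inside SANDBOX; reject path escapes."""
    target = (SANDBOX / relative_path).resolve()
    # Resolving first means "../.." tricks collapse before the check.
    if target != SANDBOX and SANDBOX not in target.parents:
        raise PermissionError(f"refusing to write outside sandbox: {target}")
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(content)
    return target

safe_write("notes/todo.txt", "scoped write")     # allowed
# safe_write("../../etc/passwd", "nope")         # raises PermissionError
```

The point isn’t this particular helper; it’s that the scope is enforced by the environment, so a confused model can’t do damage even when it tries something unexpected.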

Treating LLMs as Implementation Agents

The LLM is excellent at implementation—turning a clear specification into working code. It’s less reliable at architecture decisions, requirements gathering, or understanding implicit context.

I’ve learned to separate these concerns: I do the architecture and requirements work, then hand implementation to the LLM with clear boundaries.

Repetitive Tasks and Refactors

LLMs excel at tasks where the pattern is clear but the volume is high: renaming variables across a codebase, applying the same fix to multiple files, generating boilerplate, updating documentation.

These are tasks where human attention wanders and mistakes creep in. LLMs don’t get bored.
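Tasks like these are also easy to script, which makes a script a useful cross-check on the LLM’s work—or a replacement for it when the pattern is fully mechanical. A sketch of a tree-wide identifier rename with a dry-run step; the old and new names are hypothetical:

```python
import re
from pathlib import Path

OLD, NEW = "fetch_user", "load_user"            # hypothetical identifiers
# Word boundaries so fetch_user matches but fetch_user_id does not.
pattern = re.compile(rf"\b{re.escape(OLD)}\b")

def rename_in_tree(root: str, dry_run: bool = True) -> list[Path]:
    """Return files containing OLD; rewrite them unless dry_run is set."""
    changed = []
    for path in Path(root).rglob("*.py"):
        text = path.read_text()
        if pattern.search(text):
            changed.append(path)
            if not dry_run:
                path.write_text(pattern.sub(NEW, text))
    return changed

# Review the dry-run list first, then apply:
# rename_in_tree("src")                  # lists affected files
# rename_in_tree("src", dry_run=False)   # rewrites them
```

The dry run plays the same role as reviewing an LLM’s proposed diff: see the full blast radius before anything changes.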

What Has Not Worked

Expecting the Model to Intuit Missing Requirements

If I leave requirements ambiguous, the LLM will fill in the blanks—and often not the way I would have. The creativity that makes the model impressive is the same trait that leads it to invent plausible-sounding but wrong behavior.

I’ve learned to treat ambiguity as a bug in my specification, not something the model should resolve.

Assuming First-Pass Output is Production-Ready

Without tight constraints, LLM output often works but isn’t quite right—subtle issues with error handling, logging, edge cases. The first pass needs review, and often a second iteration with more specific feedback.

This is similar to code review with a junior developer: the code might work, but you need to check it.

Complex Multi-Step Reasoning

Tasks that require holding a lot of context in mind, reasoning through multiple interdependent decisions, or understanding implicit organizational knowledge—these still need human involvement.

The LLM can help with pieces of these tasks, but the overall reasoning still needs to be human-driven.

Early Experience

My early experiments with Cursor and LLM coding were mind-blowing, but also frustrating. The highs were high—watching the model implement something correctly on the first try felt like magic. The lows were confusing—when it got things wrong, understanding why was difficult.

The key insight was that the quality of output correlates directly with the quality of input. Clear constraints, specific instructions, and appropriate scoping transform the experience from frustrating to productive.

What I’m Still Exploring

  • Where to draw the line between human and LLM work. The boundary isn’t fixed—it depends on the task, the stakes, and how well I can specify what I want.

  • How to build feedback loops. Can I use LLMs to review their own output? To iterate on specifications?

  • Team integration. How do these tools work in team settings? What happens to code review, knowledge sharing, and collaboration?

The Meta-Point

LLMs are tools that amplify the owner’s clarity. If you know what you want and can specify it precisely, LLMs accelerate dramatically. If you’re still figuring out what you want, LLMs won’t figure it out for you—and they might lead you down the wrong path confidently.

This makes specification and intent-definition the critical skills for working with LLMs effectively.