How I Actually Work With AI

Saturday, June 6, 2026 · eng

This is a brain-dump of how I actually use AI to build software day-to-day. It’s opinionated and it will be wrong in six months. That’s fine. Just leverage the tech, do what works, experiment. There is no one right answer yet.

AI Hygiene — the mindset

Read the output. Don’t outsource the thinking. You are supposed to engage with AI analysis, not rubber-stamp it. When I present AI work to coworkers, I frame it explicitly: “AI says X. I agree with Y, I disagree with Z.” There’s no shame in using AI as a resource, but the thinking is still mine. It’s an assistant, not a manager — don’t let it bully you into believing it’s right when it isn’t. And there’s no shame in asking it how to center a div. It’s not Stack Overflow; it won’t mark your question as a duplicate and close it. It has infinite patience because it’s alien and has no concept of patience.

Separate learning from doing. Research → Plan → Execute. I brainstorm, plan, and revise the plan in one context window. Then I nuke the context and start fresh for implementation. Old context is noise; clean slates are fast. The implementer doesn’t need to know about junk deprecated files that have no relevance to the feature. If you’re sharp, you can do both in one context with subagents.

Maintain your own docs, feature to feature. Trust no generated build-plan files. Build files are stale before the plan is even written — who’s going to maintain that markdown? Just regenerate it. Curated markdown only: skills, AGENTS.md, the special context that unblocks the agent without supervision. Inline comments and docs are king; the agent has no choice but to read those.

Tool permissions: yolo modes.

Read-only yolo: ls good, mv bad. Let it explore and ingest info freely, but never write. You are the sentinel of sanity checks. Don’t give it the power to delete prod.
Supervised yolo: comprehensive allowlists — you decide exactly which operations are permitted. It’s worth curating.
No external reads. Vulnerabilities everywhere. Always provide docs yourself; don’t let the agent fetch them. Link the specific file in GitHub.

Clarity of Intent — communication as a superpower

My culture ship name, for any Iain M. Banks fans in the room.

No “still broken” prompts — that’s lazy in the wrong way. The agent is not psychic. Tell it exactly what “fixed” looks like: “Go inspect the logs in docker container foo. What went wrong?”

Goalposts: why AI crushes coding and lawyers lose their jobs. In computer science, everything is verifiable — lean into that. “Cite me a case” needs to be validated. You can do this. “Pass these unit tests without modifying them” is unambiguous, provable, and fast.

Testability is your north star. If you can do it in the computer world, so can the agent. Unit tests, integration tests, cURL, Playwright, screenshots — it’s nothing if not diligent. More “conform to this expected cURL I/O” and less “build this feature.” This is VDD: Verification-Driven Development — always ask the agent to prove its work. It applies to performance too: “add logging, don’t stop until you can prove a 50% improvement.” Real tool access is the hallucination antidote: when it can run the query, it’s harder to make something up.

Tooling & MCP — unblock the agent

You are not an agent. You are an agent coordinator, manager, and unblocker.

Bad: “Please run these hallucinated MySQL queries with syntax errors.”
Good: “I ran 300 ro queries, here’s a summary.”

Bring conversations to their most useful terminal. Remove blockers before the agent asks; anticipate what it needs. With the right set of prompts, the agent can work for 30+ minutes uninterrupted. I once completed an entire Jira ticket in a single context, given just the ticket and an API token/gateway. It kicked exactly two terminals back to me: “help, I’m out of disk space!” and “what model name is on the API key?” Otherwise, no blockers. Surface only the most useful terminals.

MCP is good for application logic — not for things you can just do on the command line.

Prefer CLI for well-known tools (the model already knows them and can quickly –help). MCP shines for novel/custom tools — pair an unfamiliar CLI with a Skill it doesn’t already ship with one.
All enabled MCP schemas pre-load into your context window every session. Prune tools before queries, or write a shim MCP to strip the noise.
If you can CLI it, the MCP overhead probably isn’t worth it (and we’re loading more tools every day).
Local Docker = yolo, writeable MySQL on the command line. “Oops, I blew the DB away” is acceptable there. “Write some accurate test data. Take a sample from prod.” — powerful.
Prod = MCP with read-only app logic, or a read-only connection string.

Servers and context worth your time:

Playwright — a godsend. Record videos of runs for even stronger proof-of-work than screenshots.
Database connections — essential for external-env data. No black boxes extends to cross-service calls too: give the agent access to both sides and let it answer “whose bug is this?”
Jira — good tickets → better agent context → better plans → better tickets you share with teammates. A virtuous cycle. (I grudgingly love it now.)
A shared “thoughts” / context server — loaded with powerful context. If you aren’t pulling it in yet, try it.

Skills — unblock the agent permanently

“Just unblock the agent” is my current philosophy. I’m sure this will change.

Skills are reusable, curated context that let the agent do a job without supervision. If you find yourself explaining the same thing to the agent twice or more, it should probably be a skill. Let the agent write its own skills: watch it stumble, unblock it, then ask it to document what it learned. Good tooling plus skills matters more than an expensive model — a mid-tier model with full tool access often beats a frontier model flying blind.

Collaboration at the AI floor

Everybody can technically fullstack now, but it’s still very doable to divide work well.

A monorepo changes how you split things up. Shared context across repos means the agent can traverse boundaries, but you still need to own the plan. Domain experts still matter; the AI floor just raises the baseline for everyone.

Local-first — Docker all the things

Killer local sandboxes are king. Everything needs to run locally so the agent can inspect its own work. Testability requires feedback loops — prove it locally. “Show me a screenshot of your feature working in Playwright” is a (positively) broken prompt. Use it.

Backend is easy; frontend is currently the hard problem for me.

Backend: “The route at URI should return response R when input I is sent.” → Agent implements. It’s done. (Security and perf concerns aside, it’s extremely cut and dry.)
Frontend: “The UX should be good, beautiful, snappy, and performant under many conditions.” → Agent: "?"

Frontend solution: design tools as goalposts. Use low-code/drag-and-drop builders (e.g. the Figma MCP) to communicate what beautiful, good UX looks like. Then say — or force with tests — “Match this design. Are all components in place? Are they connected to actions?” Clear, benchmarkable, testable. Same principle as the backend.