the shift nobody prepared for

Nine months ago I wrote a post about how Anthropic’s pricing changes hurt indie builders. I was frustrated, so I spent most of that post complaining about rate limits, weekly caps, and the $500/month paywall for doing real work. What I didn’t fully appreciate at the time was that the tooling was changing underneath me in a way more fundamental than pricing: the tools were getting more expensive, but they were also getting more capable, shifting the nature of what it means to “build software” from writing code to directing agents that write code for you.

I shipped a lot last year using Claude, ChatGPT, and whatever else was available; you can read more about what I built in my YC post if you are curious. At the time, I thought the hard part of building with AI tools was managing rate limits, context windows, and token costs. As I continued to build and develop more mature skills with these tools, I came to realize the hard part is actually learning how to think like an orchestrator: someone who prompts a handful of different models, bootstraps small toy applications to test ideas, and wires in enough observability to catch when the output is wrong. As anyone who has been building in the last year or two can attest, the term “Developer” now means something entirely different from what it meant before. The actual work of “building” looks more like product management, platform engineering, and the kind of irritating quality assurance you would run on a junior engineer’s work. Before AI tools, development was mostly designing and writing functions, debugging stack traces, and closing GitHub issues with PRs. Now, basically everyone is an engineering manager dealing with interns.

This post is about how AI tools have changed development workflows in practice: how I manage them, what tools exist, what works, what doesn’t, and what the future probably holds, at least from the perspective of a former developer.

claude code in 2026

Claude Code has matured significantly since the rate limit fiasco of mid-2025 and is now probably the most capable agentic coding tool available for retail users. The best way to take advantage of Claude Code today is to treat it less like an autocomplete engine and more like a junior engineer who needs clear direction, periodic review, and structured workflows, as I alluded to earlier.

Let’s talk about git tree workflows. For those not familiar with git trees, they are one of the core objects in how Git actually stores your code. Every time you make a commit, Git creates a tree object that is basically a snapshot of your entire project directory at that moment. The tree contains references to blobs, which are the actual file contents, and to other trees, which are subdirectories. It is recursive, meaning a tree can contain other trees, all the way down until you hit individual files. Think of it as a kind of “table of contents” that Git generates every time you commit. The tree knows what files exist, where they live in your folder structure, and what version of each file belongs to that specific commit. When Claude Code reads your git tree, it is reading that “table of contents” and understanding how your entire project fits together: which files import from which, what changed between branches, and where things might be inconsistent. This is what makes multi-file edits and branch-level reviews possible without you having to manually explain your project structure every time you start a new session.
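
If the “table of contents” framing feels abstract, you can peek at the raw objects yourself. Below is a minimal Node/TypeScript sketch (my own illustration, not anything Claude Code actually runs) that shells out to Git’s plumbing commands to print the root tree of the current commit and count the tracked files in it.

```typescript
// peek-tree.ts — a sketch of what "reading the tree" means at the plumbing level.
// Assumes Node.js and that you run it from inside a Git repository.
import { execSync } from "node:child_process";

// The root tree for the current commit: subdirectories appear as nested "tree" entries.
console.log(execSync('git cat-file -p "HEAD^{tree}"', { encoding: "utf8" }));

// The fully flattened view: every tracked file with its blob hash and path.
const entries = execSync("git ls-tree -r HEAD", { encoding: "utf8" })
  .trim()
  .split("\n");
console.log(`${entries.length} tracked files in this commit`);
```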

In practice, you open a terminal in your project root and start a Claude Code session. Because it can already see your git tree, you can immediately ask it to “look at the diff between main and feature/auth-refactor and tell me if I missed anything.” Claude will then traverse the tree objects for both branches, compare blobs, and give you a file-by-file breakdown of what changed, what looks wrong, and what might break downstream. It catches things like a renamed utility function in one file that is still being called by its old name in three others. The same approach works for onboarding into unfamiliar projects. Clone a repo you have never seen, ask Claude Code to “read the tree and explain the project structure,” and it will walk through the directories, identify the framework, note the config files, and summarize how things are organized. Useful for open-source contributions where you want to make a targeted change without reading every file first.

The workflow I rely on most involves using git trees to work on multiple tasks in parallel. Say you have three things on your plate: a bug fix in authentication, a new user profiles feature, and a database query refactor. You create three branches, open a Claude Code session on each one sequentially, and work through them. Because Claude Code reads the git tree fresh each time you switch branches, it picks up the correct state of the codebase for that branch without carrying over context or assumptions from the previous task. Each branch gets a clean mental model. Where this gets really useful is at the end when you need to merge. You ask Claude Code to compare the trees across all three branches and flag any files that were touched by more than one. If two branches both modified src/db/connection.ts, it will show you exactly what each changed and suggest how to reconcile the differences before you ever run git merge. While building status.health, I would run this at the end of every work session, asking Claude Code to audit my branches against main and give me a merge order that minimized conflicts. It would suggest merging infrastructure config first since the other branches depended on it, then the API branch, then the frontend. That kind of sequencing advice from a tool that can read your project’s dependency graph through the tree is worth more than most people realize until they have spent an evening untangling a bad merge.
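
For illustration, the overlap check can be approximated with a short script. The sketch below is my own, with hypothetical branch names; it leans on Git’s three-dot diff to list the files each branch has touched since diverging from main, and anything claimed by more than one branch is flagged as a likely conflict.

```typescript
// branch-overlap.ts — a rough, hand-rolled version of the overlap check described above.
// Branch names are hypothetical; assumes Node.js and a local repo containing these branches.
import { execSync } from "node:child_process";

const branches = ["fix/auth-bug", "feature/user-profiles", "refactor/db-queries"];

// Map each touched file to the branches that modified it since diverging from main.
const touched = new Map<string, string[]>();
for (const branch of branches) {
  const files = execSync(`git diff --name-only main...${branch}`, { encoding: "utf8" })
    .trim()
    .split("\n")
    .filter(Boolean);
  for (const file of files) {
    touched.set(file, [...(touched.get(file) ?? []), branch]);
  }
}

// Any file modified on more than one branch is a likely merge conflict.
for (const [file, owners] of touched) {
  if (owners.length > 1) {
    console.log(`potential conflict: ${file} modified on ${owners.join(", ")}`);
  }
}
```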

For really heavy workflows, Claude Code supports orchestration patterns where you can chain multiple tasks together. You can ask it to scaffold a new module, write tests for that module, run the tests, and iterate on failures, all in a single session if your context window holds. The key limitation here is still context. Once a session gets long enough, Claude Code starts losing track of earlier instructions, which is why breaking complex work into discrete, well-scoped tasks produces better results than trying to do everything in one conversation.
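
To make the chaining concrete, here is a rough sketch of that scaffold, test, iterate loop driven from a script. It assumes Claude Code exposes a non-interactive print mode (claude -p) and that the project has an npm test script; verify both against your own setup before leaning on this.

```typescript
// iterate-on-tests.ts — a sketch of the scaffold/test/fix loop, not an official workflow.
// Assumes Claude Code's non-interactive print mode (`claude -p`) and an `npm test` script;
// both are assumptions to verify against your own setup.
import { execSync } from "node:child_process";

const MAX_ATTEMPTS = 3;

function run(command: string): { ok: boolean; output: string } {
  try {
    return { ok: true, output: execSync(command, { encoding: "utf8", stdio: "pipe" }) };
  } catch (error: any) {
    return { ok: false, output: `${error.stdout ?? ""}${error.stderr ?? ""}` };
  }
}

// First pass: ask for the module and its tests in one well-scoped task.
run('claude -p "Scaffold src/rateLimiter.ts with a sliding-window rate limiter and a matching test file."');

// Then loop: run the tests and feed failures back as a tightly scoped iteration prompt.
for (let attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
  const tests = run("npm test");
  if (tests.ok) {
    console.log(`tests passing after ${attempt} attempt(s)`);
    break;
  }
  const feedback = tests.output.slice(0, 4000).replace(/["`$\\]/g, "'");
  run(`claude -p "These tests are failing. Fix the implementation, not the tests: ${feedback}"`);
}
```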

Pairing Claude Code with VS Code is important for version control and visibility. You can use Claude Code for the heavy generation work, then switch to VS Code to review diffs, stage or revert changes, and commit with intention. I find this workflow essential because Claude Code does not always get things right on the first pass, and reviewing its output in a proper editor with syntax highlighting and git integration makes it much easier to catch errors, hallucinations, or misunderstandings before they hit a branch. Think of Claude Code as the drafter and VS Code as the editing room.

Compared to OpenAI’s Codex, Claude Code is stronger at maintaining context across large codebases and following nuanced instructions. Codex is faster for short, self-contained tasks and has better integration with GitHub through OpenAI’s partnerships. If you are doing quick prototyping or need something disposable, Codex is fine, but if you are building something that needs to be maintained, Claude Code’s deeper context handling makes it worth the cost. Google’s Gemini Code Assist sits somewhere in between, strong on Google Cloud integrations but still catching up in terms of agentic autonomy.

Where Claude Code really shines for a solo founder is in building a startup end-to-end. I have used it to go from a product requirements document (PRD) to a working MVP in a weekend. The general workflow I follow looks like this: write a PRD in plain English describing what the product does, who it is for, and what the core features are. Feed that PRD to Claude Code and ask it to generate a project structure. Iterate on the structure until it matches your mental model. Then go feature by feature, having Claude Code implement each one while you review and test. For website design, Claude Code can generate full frontend layouts using frameworks like Next.js or Astro, and you can deploy those directly to services like Vercel or GitHub Pages. Domain registration is still manual because DNS providers have not built agentic APIs yet, but everything from code to deployment can be orchestrated through a single tool. I used this exact workflow to build attest.ink, this blog, and the early versions of status.health.

prompt strategy: the skill that actually matters now

The single most important skill for working with any AI tool is prompt construction. Specifically, understanding when to be detailed, when to be sparse, and how to structure your instructions so the model produces useful output on the first or second pass instead of the fifth.

There are roughly three categories of prompts that I use depending on the task.

The first is a specification prompt. This is for when you are starting something new and the model has no prior context. Specification prompts need to be detailed. You should include the programming language, the framework, the file structure you expect, naming conventions, and any constraints like “do not use any external dependencies” or “this needs to run in a serverless environment.” The more specific you are upfront, the less time you spend correcting the output later. A specification prompt for a new API endpoint might be three or four paragraphs long, and that is fine. The cost of being explicit here is far lower than the cost of debugging implicit assumptions.

The second is an iteration prompt. This is for when the model has already generated something and you need it to change or improve. Iteration prompts should be short and surgical. Point to the exact file, the exact function, the exact line if possible, and describe what is wrong and what you want instead. Do not restate the entire specification. The model already has context from the previous turn. Restating everything wastes tokens and can actually confuse the model by making it think you want a fresh start. A clear example would be something like the following: “In auth.ts, the validateToken function is not checking for token expiration. Add an expiration check using the exp claim from the JWT payload.”
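
For a sense of what that prompt should produce, here is roughly the shape of the resulting change. This is a hypothetical sketch, not code from a real session, and it skips signature verification to stay focused on the expiration check; in a real project you would likely lean on a JWT library instead of decoding the payload by hand.

```typescript
// auth.ts (sketch) — the kind of change the iteration prompt above is asking for.
// Hypothetical example; signature verification is omitted to keep the focus on the exp check.
export function validateToken(token: string): boolean {
  const parts = token.split(".");
  if (parts.length !== 3) return false;

  try {
    // Decode the JWT payload (the second segment, base64url-encoded JSON).
    const payload = JSON.parse(Buffer.from(parts[1], "base64url").toString("utf8"));

    // The fix the prompt asked for: reject tokens whose exp claim is in the past.
    const nowInSeconds = Math.floor(Date.now() / 1000);
    return typeof payload.exp === "number" && payload.exp > nowInSeconds;
  } catch {
    return false; // malformed payload
  }
}
```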

The third is a diagnostic prompt. This is for when something is broken and you do not know why. Diagnostic prompts benefit from including the error message, the relevant code, and a description of what you expected to happen versus what actually happened. The mistake most people make with diagnostic prompts is including too little context. If you just paste an error message and say “fix this,” the model will guess at the cause and often guess wrong. If you include the error, the function that produced it, and the input that triggered it, the model can reason about the problem with much higher accuracy.
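
If you write diagnostic prompts often enough, it helps to template the structure so you never leave out a piece. The helper below is hypothetical, just one way of encoding the checklist from this paragraph (error, code, triggering input, expected versus actual) into something reusable in your own scripts.

```typescript
// diagnostic-prompt.ts — a hypothetical helper that encodes the diagnostic checklist above.
interface DiagnosticReport {
  errorMessage: string;    // the exact error text, stack trace included
  relevantCode: string;    // the function or module that produced it
  triggeringInput: string; // the input that caused the failure
  expected: string;        // what you expected to happen
  actual: string;          // what actually happened
}

export function buildDiagnosticPrompt(report: DiagnosticReport): string {
  return [
    "Something is broken and I need help finding the cause.",
    `Error message:\n${report.errorMessage}`,
    `Relevant code:\n${report.relevantCode}`,
    `Input that triggered it:\n${report.triggeringInput}`,
    `Expected behavior: ${report.expected}`,
    `Actual behavior: ${report.actual}`,
    "Explain the likely cause before proposing a fix.",
  ].join("\n\n");
}
```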

There is a meta-skill underneath all three of these categories, and it is knowing what the model is good at versus what it is bad at. Models are excellent at boilerplate, at translating well-defined specifications into code, at refactoring, and at explaining existing code. Models are bad at architectural decisions, at understanding business logic they have not been told about, at maintaining consistency across very large codebases without explicit reminders, and at creativity in the way a human designer or writer would define it. I wrote about this in my hello world post: AI tools are generally intelligent but not expert at anything. That remains true. Knowing where the boundary is saves you from the frustration of expecting expert output from a general tool.

One more thing worth noting. The best prompts I have written are the ones where I spent five minutes thinking before typing anything. The instinct is to start prompting immediately, to just throw the problem at the model and see what comes back. Resist that instinct. Think about what you actually want. Write it down for yourself first if you have to. The model is a mirror for your clarity of thought. If your instructions are vague, the output will be vague. If your instructions are precise, the output will be precise. Garbage in, garbage out is not a new concept, but it has never been more directly applicable than it is right now.

open-source alternatives to claude code

The open-source ecosystem for AI-assisted development has improved dramatically since I last wrote about it. In my Lumo post, I mentioned that running a local LLM was still out of reach for most people. That is still partially true, but the gap has narrowed.

For self-hosted model inference, the two tools worth knowing about are LM Studio and Ollama. LM Studio provides a desktop application with a clean interface for downloading, running, and chatting with open-weight models locally. It supports quantized models, which means you can run surprisingly capable models on consumer hardware with 16-32GB of RAM. Ollama takes a more developer-friendly approach, running as a local server you can interact with via API, which makes it easier to integrate into scripts and automated workflows. Both are free and both run entirely on your machine, meaning your prompts and code never leave your device. For anyone building in a privacy-sensitive domain, as I am with status.health, this matters.
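
To make the “developer-friendly” point concrete, here is a minimal sketch of calling Ollama’s local HTTP API from Node/TypeScript. It assumes Ollama is running on its default port and that you have already pulled a model; the model name below is a placeholder, so swap in whatever you actually use.

```typescript
// ollama-review.ts — a minimal sketch of hitting a local Ollama server from a script.
// Assumes Ollama is running locally (default port 11434) and the named model has been pulled.
const response = await fetch("http://localhost:11434/api/generate", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    model: "qwen2.5-coder:7b", // placeholder: any locally pulled model works
    prompt: "Review this function for bugs:\n\nfunction add(a, b) { return a - b; }",
    stream: false, // return one JSON object instead of a token stream
  }),
});

const result = await response.json();
console.log(result.response); // the model's reply, generated entirely on your machine
```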

For agentic orchestration on top of these local models, tools like Open Code are emerging as open-source alternatives to Claude Code. Open Code supports multiple model backends including Ollama, and provides a terminal-based interface for agentic coding workflows like file editing, command execution, and multi-step task completion. It is not as polished as Claude Code and the model quality is noticeably lower for complex tasks, but for straightforward development work it is genuinely usable and entirely free.

The downside of self-hosted models is hardware. Running a 70B parameter model with acceptable speed requires a GPU with at least 48GB of VRAM, which means an NVIDIA A6000 or better, and those cards start at $4,000. You can run smaller quantized models on consumer GPUs like the RTX 4090 (24GB VRAM), but you sacrifice quality and context length. For most indie developers, the cost of hardware that can match Claude or GPT-4o quality exceeds the cost of just paying for API access for several years. The math does not work yet for most people.

The best-performing open-source models right now depend on the task. For development workflows, DeepSeek-V3 is exceptionally strong at code generation and refactoring, outperforming gpt-oss on several benchmarks after supervised fine-tuning, as I referenced in my Protocol Agent post. The Qwen3 series from Alibaba has been surprisingly competitive, especially the 30B variant which showed a 73.3% improvement after fine-tuning in the Protocol Agent benchmark. For general-purpose tasks, gpt-oss-120b from OpenAI remains solid out of the box but does not improve as dramatically with fine-tuning as the alternatives. Llama from Meta continues to be the most versatile option for self-hosting because of its broad community support and extensive fine-tuned variants.

The future is open-source and open-weight models. I believe this fully, and I wrote about it in my Lumo post and in my Protocol Agent notes. Prioritizing open-source alternatives over their paid, closed counterparts supports work like Protocol Agent and ERC-8004 and democratizes access for everyone. That said, the future will also be better hardware. Inference chips are getting cheaper and more efficient every quarter, and consumer-grade hardware capable of running frontier-quality models locally is probably two to three years away. Right now, for most builders, it makes more sense to pay for the frontier through Claude or ChatGPT and supplement with local models for privacy-sensitive or high-volume tasks where the cost savings justify the quality trade-off.

notes on being a builder in the ai era

The role of the developer is becoming something closer to a product manager who can read code. I say this as someone who accidentally became a self-taught engineer and then progressed through recruiting, program management, and product management before ending up as a founder. The trajectory I took through my career is now compressible into months instead of years because AI tools have collapsed the distance between having an idea and building it.

It is no longer necessary to spend hours debugging a cryptic error message or learning a new language’s syntax from scratch. The models handle that. What matters now is learning strategies for orchestrating these tools, testing their outputs rigorously, and displaying results publicly using services like GitHub Pages and Vercel, or cloud providers like AWS, GCP, and Azure. The skill is in knowing what to build, how to validate it, and how to ship it, not in memorizing the difference between useEffect and useLayoutEffect.

As developers transition from pure engineers to something resembling product managers and startup founders, the divide between ideas and actualizing them becomes smaller and smaller. I built four products in four months. A year before that, building one product in four months would have been ambitious. The cost of developing software is simultaneously increasing and decreasing. It is decreasing because the labor required for implementation is falling toward zero for many categories of work. It is increasing because the expectations for what constitutes a “minimum viable product” keep rising as everyone else also has access to the same tools. When everyone can ship a landing page in an afternoon, a landing page alone is no longer impressive. The bar moves up even as the cost of reaching it moves down.

This is the tension every builder should be thinking about. The tools make it easier to build, which means more people are building, which means differentiation comes from taste, judgment, and the quality of the problem you choose to solve rather than from technical execution. I wrote in my hello world post that AI is terrible at being creative. That has not changed. Creativity, taste, and the ability to identify a problem worth solving remain the competitive advantages that no model can replicate. The rest is orchestration.

what comes next

The agentic future is not hypothetical. It is here. I use agents to write code, review code, generate documentation, scaffold projects, and iterate on designs. The question is not whether this will become the dominant way software is built. It will. The question is what the landscape looks like when it does.

Will the tools stay closed and costly, locked behind $200/month paywalls and enterprise contracts, the way Anthropic tried to push things last year? Or will open-source alternatives catch up fast enough to keep the frontier accessible to indie builders, solo founders, and the “not-yet-funded” crowd I wrote about in my plea to Anthropic?

Will the next generation of builders need to learn programming at all, or will prompt engineering and system design replace syntax and data structures as the foundational skills? And if so, what does computer science education look like in five years?

I do not have answers to these questions. Nobody does. What I do know is that the people who will thrive in this era are the ones who learn to direct machines rather than compete with them. The ones who invest in taste, in judgment, in knowing what is worth building. The ones who treat AI as a tool and not as a replacement for thinking.

The end of writing code is not the end of building. It is the beginning of building differently. And for someone like me, a college dropout who started at the Apple Store and ended up here, differently has always been the only option anyway.