Let me acknowledge that I usually skip articles of the "why I started/stopped/use X" sort, because thanks, keep me posted, we are all deeply interested in your experience. And yet we are in the middle of a generative AI storm of sorts, and it would be weird and silly to avoid discussing personal experience in a personal blog.
Social media reveals many extreme views belonging to the generalized "LinkedIn" and "Homepages" camps. A stereotypical LinkedIn writer, presumably a manager, is often infatuated with new vistas. If we are to believe what regular employees say about their bosses, there is a common sentiment of "AI can replace you all" and "why do we need coders if I can ask ChatGPT directly". In contrast, a homepage owner, presumably a coder who also codes for leisure, is appalled by the quality of "neuroslop" produced at his company on an industrial scale, and shows with vivid examples that he can do better.
I could make a trivial observation that every tool needs to be mastered, that the truth is somewhere in the middle, etc., etc., and call it a day. Yet I want to unpack this into something more sensible, supporting and criticizing all kinds of extreme views, and highlighting my experience.
The Truths of Managers and Coders
OK, we all know that LLMs can be very persuasive. They can paint you a picture of everything being doable and the Moon at your fingertips. So it should not come as a surprise that the bosses who are not professional engineers fall for it. I clearly see that LLMs are especially good in the areas where my own skills fall short, or, more probably, they merely look capable when I don’t have the knowledge and experience to question them. I witnessed technically competent people grossly underestimating certain work because an LLM portrayed it this way (and I can’t even blame the LLM too much, because it did not have enough information to provide an accurate estimate), so I don’t even want to think what less tech-savvy managers can be made to believe.
Such managers are perhaps the most irritating byproduct of the ongoing "AI revolution" or whatever we are going through. I don’t see how a regular non-star engineer can overcome the authority of ChatGPT or Gemini, the "we do it because the AI said so" attitude. My condolences. However, I’d blame the overly ardent tool user rather than the tool itself. As an engineer, I accept as a fact of life that sometimes decisions are based on poor advice from unsuitable advisors, be it AI, drinking buddies, LinkedIn posts, or tarot cards. I am doing my best to separate my "decision maker mode" from my "expert mode". When I act as an expert, my job is to give advice, not to make decisions. If you ignore my advice, I don’t take it personally. Later, if things go awry, feel free to call me again for more advice.
Now let’s turn our attention to professional coders, or the "Homepages" camp. These folks generally grossly overestimate the ability, attitude, and communication skills of an average (especially junior) software engineer. I often stumble upon opinion pieces describing how a coding agent made a certain subtle error that a competent human would never make, and I feel like screaming: "show me this competent human!" In my view, such pieces should be read as "I, the author of this blog, can do better". This may very well be true, but it does not tell us much about the skills of someone who works for money and does not spend their free time writing technical blogs.
Frustratingly often such authors fail to provide crucial context to the LLM when formulating tasks. They seem to believe it is obvious that code must be compact, optimized, correct for certain edge cases, handle errors "right", and so on. It’s easy to be sarcastic when someone points out such a lack of instruction: "ah, right, I forgot to tell the system to write good code!", but I fail to understand the motivation to ridicule the tool: yes, it behaves counter-intuitively at times, but is your goal to learn how to use it to your advantage or just to vent online? Venting is a kind of psychotherapy, I get it.
The Team Lead Perspective
When working with code agents, I tend to think like a team lead. I am in charge of a project, and if I feel confident, I can do everything myself. Naturally, this approach will only get me so far, and eventually I will have to outsource something. Imagine for a moment that I have the choice between outsourcing my work to a human or to a coding agent. Since my human subordinate will use a coding agent anyway, there should be some added value from having an engineer between me and an LLM. In particular, this engineer has:
to understand clearly the scope of the project and its architecture in order to be able to expand my short task description into a detailed TODO list;
to show enough autonomy to make reasonable decisions even if something goes wrong;
to ensure the quality of output by carefully reviewing the resulting code and by implementing unit/feature tests;
to refrain from writing code when tired, angry, or intoxicated (yep, I have to mention this);
to keep documentation current and to record bugs, edge cases, and known limitations;
to show skill, diligence, and attention to detail, even if the given task is boring or exhausting.
In my opinion, such people are hard to come by, and pure coding skills constitute only a fraction of what is required, so in many cases outsourcing my tasks to a coding agent is an attractive alternative. For all their shortcomings, coding agents
are ready immediately when I need them;
work fast, so we can achieve a lot within a single session;
have some expertise in literally any technology;
are sufficiently diligent if requested.
I don’t trust their autonomy. They tend to make bad decisions sometimes, they have a short memory and might "forget" something relevant when you don’t expect it. They sometimes decide that it’s a good time to make massive changes in the code base, and when I come back from the kitchen with a sandwich I gape at hundreds of modifications where mere cosmetic touches were needed. These things happen, it’s frustrating, and in general I feel that LLMs need more babysitting than above-average human subordinates.
However, in my "team lead perspective" I don’t expect my human subordinates to be too good. Realistically, people who are too good won’t be my subordinates. I take it for granted that software development is a mass profession. We wish our doctors, police officers, and school teachers to be stellar, but we must learn to get by with mere average mortals. Software engineering is not much different nowadays: we can expect people at the very top to be exceptionally good, but it is already a win if a regular engineer is good at certain specific kinds of work, be it debugging, optimization, architecture, or translating vague requirements into code.
Software Factories
While it is exciting to watch the progress of language models, I believe the real work of a team lead is not to browse AI rankings pondering whether GPT 5.5 is up to the job that GPT 5.4 failed; it is to design a workflow, a conveyor that does not require superhuman coding abilities. This simple thought is completely in line with the philosophy that backed the technological progress of the past two centuries. You can find an exceptionally good individual able to design something like the Prague astronomical clock, but production at scale is only achievable with a conveyor belt system, requiring the workers to be good at a few specific operations.
It would be a stretch to compare software development with factory production. A factory produces identical items, while each software project is unique (even if pretty generic). We generally don’t need clones. Even if your product is a close imitation of another product, once it is available, any potential user can download it without the company having to manufacture a copy. Still, the software development industry is a mass-scale employer, and it cannot structure its business processes around stellar individuals. A company is not sustainable if its operations depend on whether a certain star coder has a stomachache or decides to switch jobs.
As a side "industrial" note, critics often like to cite a 2016 comic strip in which a character’s dream of a no-coder future, where a precise spec would be automatically converted to a program, is met with a retort: you can write such a spec right now, "it’s called code". This is clever as a comic strip plot, but in reality massive parts of any software system are fairly generic. Such parts are either completely standard ("File" menu on the left, "Help" menu on the right) or do not have hard requirements (any sensible defaults are fine for fonts, port numbers, hotkeys). Keeping this in mind, it actually makes sense to focus the initial specs on non-generic functionality and entrust the rest to the judgement of a coding agent. If the result does not look right, it is always possible to refine and revise.
This approach is consistent with the notion of "soft" or "humane" technology as described by Donald Norman. To borrow his example, imagine a search engine for restaurants that requires the user to provide all kinds of criteria in advance: location, food type, price range, etc. This would be a case of "inhumane" technology, mostly convenient for the machine. For a human, it would be more natural to start with something simple like location, and then refine the query if necessary. This might look like a strawman system, since our specialized search engines obviously work exactly this way; yet it certainly looked more realistic to Norman, who wrote his book in 1993.
When an LLM fails at a certain task, I make a conscious effort to understand whether (a) the given model is not up to the task; (b) my task description is inadequate; (c) my workflow is not solid. Cases (a) and (b) are somewhat easy to deal with: try a better model or a better prompt and see what happens. Case (c) is harder to identify and address. Usually I blame the workflow if I give an LLM a certain task that looks simple to me, but the agent still fails to produce a satisfactory solution. Note that a team lead must have the right intuitions about "simple" and "hard" tasks, and possibly adjust estimates if things happen to go wrong. An individual without solid technical expertise cannot really do that.
Soft Spots and Solid Workflows
LLMs do have objective weaknesses. They tend to provide bogus links and point to nonexistent settings and functions (i.e., hallucinate). They tend to ignore even direct instructions if these happen to fall out of the context window or conflict with some other goal or whatever else. These issues gradually become less frequent, but they are still very real, and we should discuss them in a constructive manner, the same way we approach the weaknesses and limitations of human workers. If quality cannot be achieved at the level of an individual, it must be achieved at the level of the system.
Fortunately, the software engineering industry has developed a large collection of techniques (often grouped under the umbrella of "agile development"), equally applicable to human and AI workers:
split work into small tasks (one task should take at most a few hours);
describe the desired functionality — these documents become "tasks", and once done, will be integrated into project documentation;
have all kinds of documentation at hand;
write tests;
do code reviews;
use continuous integration.
Some of these techniques can be adapted to a broader context. Need a reference to a book? Ask an LLM in a way that generates a testable outcome, e.g., ask it to provide an ISBN. Check if the provided ISBN is found in WorldCat. If not found, repeat the query.
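The ISBN check above can be sketched as a small verify-and-retry loop. This is only an illustration under stated assumptions: `ask_model` and `found_in_catalog` are hypothetical callables standing in for a real LLM call and a real catalog lookup (for example, a WorldCat search), and neither corresponds to an actual API. The only part that runs for real is the standard ISBN check-digit test, which cheaply filters out malformed answers before any lookup is attempted.

```python
from typing import Callable, Optional


def is_valid_isbn(raw: str) -> bool:
    """Verify the ISBN-10 or ISBN-13 check digit (a cheap local sanity test)."""
    s = raw.replace("-", "").replace(" ", "").upper()
    if not s.isascii():
        return False
    if len(s) == 13 and s.isdigit():
        # ISBN-13: weights alternate 1, 3, 1, 3, ...; the sum must be divisible by 10.
        total = sum(int(c) * (1 if i % 2 == 0 else 3) for i, c in enumerate(s))
        return total % 10 == 0
    if len(s) == 10 and s[:9].isdigit() and (s[9].isdigit() or s[9] == "X"):
        # ISBN-10: weights run 10 down to 1; a trailing "X" stands for 10.
        digits = [int(c) for c in s[:9]] + [10 if s[9] == "X" else int(s[9])]
        total = sum((10 - i) * d for i, d in enumerate(digits))
        return total % 11 == 0
    return False


def find_verified_isbn(
    ask_model: Callable[[str], str],          # hypothetical: prompt in, ISBN text out
    found_in_catalog: Callable[[str], bool],  # hypothetical: e.g. a WorldCat lookup
    question: str,
    max_attempts: int = 3,
) -> Optional[str]:
    """Repeat the query until an answer passes both checks, or give up."""
    for _ in range(max_attempts):
        candidate = ask_model(question).strip()
        if is_valid_isbn(candidate) and found_in_catalog(candidate):
            return candidate
    return None
```

Returning `None` after a bounded number of attempts is a deliberate choice in this sketch: an answer that cannot be verified is treated as no answer at all.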
Such techniques and workflows are usually codified as Skills or Agent guides, but ensuring the right workflow is still a team lead’s responsibility. For example, I strive to design and refine any functionality using OpenSpec before I instruct the agent to write any code. I run dedicated refactoring and code review workflows. I use structured code review skills to identify critical flaws. I use specialized code review reception skills to evaluate the code review automatically (in addition to my own judgement). I use different language models to write code and to review code. I instruct the agents to produce tests and documentation where applicable.
The concept of skills might be somewhat hard to comprehend: why do you need to instruct the agent to write bug-free code, to ensure security, or to avoid obvious code smells? A simple piece of advice is to treat AI agents as impersonators rather than software developers. You can instruct an LLM to "talk like a pirate", to "assume the role of a Victorian butler", and so on. There are dozens if not hundreds of ways to sort an array of numbers. A coding agent has no internal knowledge of the "right" implementation for your case. Therefore, it must rely on hints such as surrounding code (so its new code blends seamlessly) or specific instructions ("write code as a novice", "write code as a professional developer"). Normally, the coding agent provides enough context automatically. When it writes code, it knows the code around the intended change. However, refactoring and code reviews are separate activities requiring separate modes of thinking (which is true for humans as well). Structured activities require structured instructions.
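The idea of writing with one model and reviewing with another can be sketched as a short loop. Everything here is an assumption made for illustration: `Agent`, `write_and_review`, the prompt wording, and the "reply OK" convention are hypothetical and do not reflect OpenSpec, any particular skill format, or any vendor API. The sketch only shows the shape of the workflow, in which the reviewer must sign off before the code is accepted.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Agent:
    model: str                 # hypothetical model identifier
    run: Callable[[str], str]  # hypothetical: prompt in, text out


def write_and_review(
    spec: str, writer: Agent, reviewer: Agent, max_rounds: int = 3
) -> Tuple[str, List[str]]:
    """Return the final code and the findings that are still open for it."""
    if writer.model == reviewer.model:
        raise ValueError("use different models for writing and reviewing")

    code = writer.run(f"Implement this spec:\n{spec}")
    findings: List[str] = []
    for round_no in range(max_rounds):
        report = reviewer.run(
            "Review the code against the spec. "
            "List one issue per line, or reply OK.\n"
            f"Spec:\n{spec}\nCode:\n{code}"
        )
        findings = [
            line.strip()
            for line in report.splitlines()
            if line.strip() and line.strip() != "OK"
        ]
        if not findings:
            return code, []
        # Fix only if another review round remains, so that the returned
        # findings always describe the returned code.
        if round_no < max_rounds - 1:
            code = writer.run(
                "Fix these review findings:\n"
                + "\n".join(findings)
                + f"\nCode:\n{code}"
            )
    return code, findings
```

A non-empty list of findings in the result means the loop ran out of rounds, which is the point where a human team lead would step in.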
I consider myself a rather conservative developer. I like established approaches, I am not an early adopter, I hate hype, and I generally prefer to design and implement my software by hand. Unfortunately, this attitude is only good for small pet projects that showcase my craftsmanship. Once you aim higher, you have to outsource, whether to people or to LLMs. Coding agents provide a great opportunity to train your outsourcing skills. Yes, I believe you can do it yourself, and you can do it well. Now, can you instruct your very average employee to obtain passable results? Can you improve these results? This is an interesting challenge. I am still somewhat lukewarm about the idea of "large-scale LLM coding" with all these "fleets of agents", "active autonomous AIs", and other ways to produce megabytes of code on a daily basis. I am still struggling with workflows that do not provide auto-testable results (generation of graphics, music, or 3D models, for example). Yet I appreciate the opportunity to achieve more than I could on my own, and to hone my "team lead mindset" without sacrificing the comforts of solo development and a home workspace.