The emergence of this kind of thing has been so surprising to me. The exact same sort of people who managed to bottleneck themselves and obliterate signal-to-noise ratios at every company they worked for, with endless obsession over the trivial minutiae of the systems they were working with, have found a way to do it with LLMs too, which I would have assumed would have been the death of this kind of busywork.
All of this might as well be Greek to me. I use ChatGPT and copy-paste code snippets. That was bleeding edge a year or two ago, and now it feels like banging rocks together when reading these types of articles. I never had any luck integrating agents, MCP, using tools, etc.
Like, if I'm not ready to jump on some AI-spiced-up special IDE, am I then going to just be left banging rocks together? It feels like some of these AI agent companies just decided "OK, we can't adopt this into the old IDEs, so we'll build a new special IDE"? Or did I just use the wrong tools (I use Rider and VS, and I have only tried Copilot so far, but the "agent mode" of Copilot in those IDEs feels basically useless).
I'm so happy someone else says this, because I'm doing exactly the same. I tried to use agent mode in VS Code and the output was still bad. You read simple things like "We use it to write tests". I gave it a very simple repository, said to write tests, and the result wasn't usable at all. I really wonder if I'm doing it wrong.
I’m not particularly pro-AI, but I struggle with the mentality some engineers seem to apply when trying these tools.
If you read someone say “I don’t know what’s the big deal with vim, I ran it and pressed some keys and it didn’t write text at all” they’d be mocked for it.
But with these tools there seems to be an attitude of “if I don’t get results straight away it’s bad”. Why the difference?
I don't understand how to get even bad results. Or any results at all. I'm at a level where I'm going "This can't just be me not having read the manual".
I get the same change applied multiple times, the agent has some absurd method of applying changes that conflicts with what I tell it, like some git merge from hell, and so on. I can't get it to understand even the simplest of contexts, etc.
It's not really that the code it writes might not work. I just can't get past the actual tool use. In fact, I don't think I'm even at the stage where the AI output is even the problem yet.
There isn't a bunch of managers metaphorically asking people if they're using vim enough, nor are there many blog posts proclaiming vim as the only future for building software.
I’d argue that, if we accept that AI is relevant enough to at least be worth checking, then dismissing it with minimal effort is just as bad as mindlessly hyping the tech.
You must be new here. "I use vim, btw" and "you don't use vim, you use Visual Studio, your opinion doesn't count" are a thing in programming circles.
I agree to a degree, but I am in that camp. I subscribe to alphasignal, and every morning there are 3 new agent tools, and two new features, and a new agentic approach, and I am left wondering, where is the production stuff?
So just like in the JavaScript world?
Well one could say that since it's AI, AI should be able to tell us what we're doing wrong. No?
AI is supposed to make our work easier.
What you are doing wrong in respect to what? If you ask for A, how would any system know that you actually wanted to ask for B?
Honestly, IMO it's more that I ask for A but don't strongly enough discourage B, and then I get A, B, and maybe C, generally implemented poorly. The base systems need to have more focus and doubt built in before they'll be truly useful for things aside from greenfield apps, or for generating maintainable code.
You didn't actually just say "write tests" though right? What was the actual prompt you used?
I feel like that matters more than the tooling at this point.
I can't really understand letting LLMs decide what to test or not; they seem to completely miss the boat when it comes to testing. Half of the tests are useless because they duplicate what they test, and the other half don't test what they should be testing. So many shortcuts; LLMs require A LOT of hand-holding when writing tests, more so than with other code, I'd wager.
No, you're having the same experience as a lot of people.
LLMs just fail (hallucinate) in lesser-known fields of expertise.
Funny: today I asked Claude to give me the syntax for running Claude Code, and its answer was totally wrong :) So you go to the documentation… and parts of it are obsolete as well.
LLM development is done in the “move fast and break things” style.
So in a few years there will be so many repos with gibberish code, because “everybody is a coder now”, even basketball players or taxi drivers (no offense, of course, just an example).
It is like giving an F1 car to me :)
You need to write a test suite to check its test generation (soft /s).
Yeah if you've not used codex/agent tooling yet it's a paradigm shift in the way of working, and once you get it it's very very difficult to go back to the copy-pasta technique.
There's obviously a whole heap of hype to cut through here, but there is real value to be had.
For example yesterday I had a bug where my embedded device was hard crashing when I called reset. We narrowed it down to the tool we used to flash the code.
I downloaded the repository, jumped into codex, explained the symptoms and it found and fixed the bug in less than ten minutes.
There is absolutely no way I'd have been able to achieve that speed of resolution myself.
- We narrowed it down to the tool we used to flash the code.
- I downloaded the repository, jumped into codex, explained the symptoms and it found and fixed the bug in less than ten minutes.
Change the second step to:
- I downloaded the repository, explained the symptoms, copied the relevant files into Claude Web and 10 minutes later it had provided me with the solution to the bug.
Now I definitely see the ergonomic improvement of Claude running directly in your directory, saving you copy/paste twice. But in my experience the hard parts are explaining the symptoms and deciding what goes into the context.
And let's face it, in both scenarios you fixed a bug in 10-15 minutes which might have taken you a whole hour/day/week before. It's safe to say that LLMs are an incredible technological advancement. But the discussion about tooling feels like vim vs emacs vs IDEs. Maybe you save a few minutes with one tool over the other, but that saving is often blown out of proportion. The speedup I gain from LLMs (on some tasks) is incredible. But it's certainly not due to the interface I use them in.
Also I do believe LLM/agent integrations in your IDE are the obvious future. But the current implementations still add enough friction that I don't use them as daily drivers.
> I never had any luck integrating agents
What exactly do you mean with "integrating agents" and what did you try?
The simplest (and what I do) is not "integrating them" anywhere, but just replace the "copy-paste code + write prompt + copy output to code" with "write prompt > agent reads code > agent changes code > I review and accept/reject". Not really "integration" as much as just a workflow change.
I installed the copilot extension in my IDE, and switched on Agent mode.
I don't really get how the workflow is supposed to work, but I think it's mostly due to how the tool is made. It has like some sort of "change stack" similar to git commits/staging but which keeps conflicting with anything I manually edit.
Perhaps it's just this particular implementation (the Copilot integration in VS) that is bad, and others are better? I have extreme trouble feeding it context and handling suggested AI changes without completely corrupting the code, even for small changes.
Hm, yeah maybe. I've tried Cursor once, but the entire experience was so horrible, and it was really hard to know what's going on.
The workflow I have right now is something like what I described before, and I do it with Codex and Claude Code; both work the same. Maybe try out one of those, if you're comfortable with the terminal? It basically opens up a terminal UI, can read the current files, you enter a prompt, wait, then review the results with git or whatever VCS you use.
But I'm also never "vibe-coding"; I'm reviewing every single line, and I mercilessly ask the agent to refactor whenever the code isn't up to my standards. I also restart the agent after each prompt finishes, as they get really dumb as soon as more than about 20% of their "max" context is used.
Make sure you’re clicking “Keep” to “approve” the changes. It’s annoying but I don’t think there is a way around having to do that. Then if you manually edit something, you can mention it in your next chat message, e.g., “I made a few changes to <file>. <Next instruction>”
I used to do it the way you were doing it. A friend went to a hackathon and everyone was using Cursor and insisted that I try it. It lets you set project level "rules" that are basically prompts for how you want things done. It has access to your entire repo. You tell the agent what you want to do, and it does it, and allows you to review it. It's that simple; although, you can take it much further if you want or need to. For me, this is a massive leap forward on its own. I'm still getting up to speed with reproducible prompt patterns like TFA mentions, but it's okay to work incrementally towards better results.
I also sympathize with that approach, and found it sometimes better than agents. I believe some of the agentic IDEs are missing a "contained mode".
Let me select the lines in my code that you are allowed to edit for this prompt and nothing else, for those "add a function that does x" requests, without you starting to run amok.
Yes. And some way of using an instructions file. Because interacting with an agent in a tiny plugin window, without the use of "agents.md" or some sort of persistent prompt you can adjust, retry, etc., is horrible.
Now it's "please add one unit test for FooBar()" and it goes away and thinks for 2 minutes and does nothing, then I point it to where FooBar() is (which it didn't find), then it adds a test method, then I change the name to one I like better, but now the AI change wasn't "accepted"(?) so the thing is borked...
I think the UX for agents is important and ...this can't be it.
I recently pasted an error I found into claude code and asked who broke this. It found the commit and also found that someone else had fixed it in their branch.
You should use claude code.
There's no reason this should not be possible in other IDEs, except for the vendor lock-in.
You just didn't drink enough Kool-Aid and have an intact brain.
Copilot's agent mode is a disaster. Use better tools: try Claude Code or OpenCode (my favorite).
It's a new ecosystem with its own (atrocious!) jargon that you need to learn. The good news is that it's not hard to do so. It's not as complex or revolutionary as everyone makes it out to be. Everything boils down to techniques and frameworks for collecting the context/prompt before handing it over to the model.
Yep, basically this. In the end it helps having the mental model that (almost) everything related to agents is just a way to send the upstream LLM a better and more specific context for the task you need to solve at that specific time.
i.e. Claude Code "skills" are simply markdown files in a subdirectory with a specific name that translates to a `/SKILL_NAME` command in Claude, plus a prompt that is injected each time that skill is mentioned or Claude thinks it needs to use it, so it doesn't forget the specific way you want that specific task handled.
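For concreteness, a skill is roughly just a directory plus a markdown file with a bit of frontmatter. The layout below is from memory and the skill itself is made up, so treat it as a sketch rather than the exact format:

    .claude/skills/release-notes/SKILL.md

    ---
    name: release-notes
    description: How to draft release notes for this repo. Use when the user asks for release notes.
    ---
    Read CHANGELOG.md, group entries by feature/fix, keep each bullet to one line,
    and never invent entries that are not in the changelog.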
Sadly we have some partnership meaning it's Copilot or nothing.
I feel like: just use Claude Code. That is it. Use it and you get the feel for it. Everyone is overcomplicating this.
It is like learning to code itself. You need flight hours.
I'm stuck with the Copilot tools. Again, I don't think this is a problem with the models but with the tooling. I can't switch to Claude Code (for work, that is), and while I don't mind using more command line tools, I don't want to run multiple IDEs.
But it's good to hear that it's not me being completely dumb, it's Copilot Agent Mode tooling that is?
It's not that simple. That's how I started as well but now I have hooked up Gemini and GPT 5.2 to review code and plans and then to do consensus on design questions.
And then there's Ralph with cross LLM consensus in a loop. It's great.
This is something that continues to surprise me. LLMs are extremely flexible and already come prepackaged with a lot of "knowledge"; you don't need to dump hundreds of lines of text to explain to it what good software development practices are. I suspect these frameworks/patterns just fill up the context with unnecessary junk.
I think avoiding filling the context up with too much pattern information is partly where agent skills come from: the idea is that each skill has a set of triggers, and the main body of the skill is only loaded into context if a trigger is hit.
You could still overload with too many skills but it helps at least.
You get to 80% there (numbers pulled out of the air) by just telling it to do things. You do need more to get from 80% there to 90%+ there.
How much more depends on what you're trying to do and in what language (e.g. a "favourite" pet peeve: Claude occasionally likes to use instance_variable_get() in Ruby instead of adding accessors; it's a massive code smell). But there are some generic things, such as giving it instructions on keeping notes, and giving it subagents to farm repetitive tasks out to, so that completing truly independent tasks doesn't fill up the main context (in which case, for Claude Code at least, you can also tell it to do multiple in parallel).
But, indeed, just starting Claude Code (or Codex; I prefer Claude but it's a "personality thing" - try tools until you click with one) and telling it to do something is the most important step up from a chat window.
I agree about the small tweaks like the Ruby accessor thing, I also have some small notes like that myself, to nudge the agent in the right direction.
> I suspect these frameworks/patterns just fill up the context with unnecessary junk.
That's exactly the point. Agents have their own context.
Thus, you try to leverage them by giving them ad-hoc instructions for repetitive tasks (such as reviewing code or running a test checklist) without polluting your own conversation/context.
Ah do you mean sub-agents? I do understand that if I summon a sub-agent and give it e.g. code reviewing instructions, it will not fill up the context of the main conversation. But my point is that giving the sub-agent the instruction "review this code as if you were a staff engineer" (literally those words) should cover most use cases (but I can't prove this, unfortunately).
I do think you're right that you should be cautious about writing overly convoluted sub-agents.
I'd rather use more of them that are brief and specialized than try to over-correct by having a single agent "remember" too many rules. Not really because the description itself will eat too much context, but because having the sub-agent work for too long will accumulate too much context and dilute your initial instructions anyway.
If I don't instruct it to in some way, the agent will not write tests, will not conform with the linter standard, will not correctly figure out the command to run a subset of tests, etc.
The idea is to produce such articles, not read them. Do not even read them as the agent is spitting them out - simply feed them straight into another agent to verify.
Present it at the next team/management meeting to seem in the loop and hope nobody asks any questions
No questions. It will be pasted into their AI tool. And things will be great. For a few weeks at least, until something breaks and nobody will know what.
I'm doing the same. My reason is not the IDE, I just can't let AI agent software onto my machine. I have no trust at all in it and the companies who make this software. I neither trust them in terms of file integrity nor for keeping secrets secret, and I do have to keep secrets like API keys on my file system.
Am I right in assuming that the people who use AI agent software use them in confined environments like VMs with tight version control?
Then it makes sense but the setup is not worth the hassle for me.
I am on the other side: I have given complete control of my computer to Claude Code - Yolo Mode. Sudo. It just works. My servers run the same. I SSH in and let Claude Code there do whatever work it needs to do.
So my 2 cents. Use Claude Code. In Yolo mode. Use it. Learn with it.
Whenever I post something like this I get a lot of downvotes. But well... end of 2026 we will not use computers the way we use them now. Claude Code in Feb 2025 was the first step; now, in Jan 2026, CoWork (Claude Code for everyone else) is here. It is just a much, much more powerful way to use computers.
> end of 2026 we will not use computers the way we use them now.
I think it will take much longer than that for most people; I disagree with the timeline, not with where we're headed.
I have a project now where the entirety of the project falls into these categories:
- A small server that is geared towards making it easy to navigate the reports the agents produce. This server is 100% written by Claude Code - I have not even looked at it, nor do I have any interest in looking at it as it's throwaway.
- Agent definitions.
- Scripts written by the agents, for the agents, to automate away the parts where we (well, the agents mostly) have found a part of the task is mechanical enough to either take Claude out of the loop entirely, or to produce a script that does the mechanical part interspersed with claude --print calls for smaller subtasks (and then systematically try to see if sonnet or haiku can handle those tasks). Eventually I may get to the point of optimising it to use APIs for smaller, faster models where they can handle the tasks well enough.
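To make that last bullet concrete, the scripts are mostly ordinary glue code that shells out to the CLI for the judgment-heavy steps. A rough sketch of that shape in Python; file names and prompts here are made up for illustration, and you should check `claude --help` for the exact flags and model aliases your version supports:

    import subprocess

    def ask_claude(prompt: str, model: str = "haiku") -> str:
        # One non-interactive prompt through the claude CLI; returns its stdout.
        result = subprocess.run(
            ["claude", "--print", "--model", model, prompt],
            capture_output=True, text=True, check=True,
        )
        return result.stdout

    def summarize_report(path: str) -> str:
        # The mechanical part stays in plain Python...
        text = open(path).read()
        # ...and only the judgment-heavy step goes to the model.
        return ask_claude("Summarize the findings in this report in 5 bullets:\n\n" + text)

    if __name__ == "__main__":
        print(summarize_report("reports/latest.md"))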
The goal is for an increasing proportion of the project to migrate from the second part (agent definitions) to the third part, and we do that in "production" workflows (these aren't user facing per se, but third parties do see the outputs).
That is, I started with a totally manual task I was carrying out anyway, defined agents to take over parts of the process and produce intermediate reports, had it write the UI that lets me monitor the agents' progress, and then, progressively, after each step I'd ask the agent to turn any manual intervention into agents, commands, and skills, and to write tools to handle the mechanical functions we identified.
For each iteration, more stuff first went into the agent definitions, and then as I had less manual work to do, some of that time has gone into talking to the agent about which sub-tasks we can turn into scripts.
I see myself doing this more and more, and often "claude" is now the very first command I run when I start a new project whether it is code related or not.
This comment matches my experience with proto-AGI LLMs.
Claude Code is the secret.
Claude Code is the question and the answer.
Claude Code has already revolutionized this industry. Some of you are just too blind to see it yet.
Claude Code and agents are the hot new hammer, and they are cool, I use CC and like it for many things, but currently they suffer from the "hot new hammer" hype so people tend to think everything is a nail the LLM can handle. But you still need a screwdriver for screws, even if you can hammer them in.
Don't say "we" when talking about yourself.
I already do.
And yes, it is a hypothesis about the future. Claude Code was just a first step. It will happen to the rest of computer use as well.
[dead]
I’d rather just read the prompt that this article was generated from.
I finally found the perfect way to describe what I feel when I read stuff like this.
I remember some proto-memes about translating a text between English and Chinese 100 times and the results being hilarious... the modern parallel would be to ask an LLM to read the article and generate the prompt that constructed the article, then generate an article based on that prompt. Repeat x100.
I Would Rather Read The Prompt (IWRRTP)
I laughed when I noticed the username
JTPP - just the prompt, please
I hereby second the motion to get this acronym widely adopted
Tempted to copy the content and launder it through another LLM and post a comment linking to my own version
That's like saying you'd rather listen to someone ask a question than read a chapter of a textbook.
About 99% of the blogs [written by humans] that reach HN's front page are fundamentally incorrect. It's mostly hot takes by confident neophytes. If it's AI-written, it actually comes close to factual. The thing you don't like is usually right, the thing you like is usually wrong. And that's fine if you'd rather read fiction. Just know what you're getting yourself into.
Donate me the tokens, don't donate me slop PRs - open source maintainer
Not wanting to be a gatekeeper, but the author appears to be an "AI Growth Innovator" or some-such-I-don't-know-what rather than an actual engineer who has been ramping up on AI use to see what works in production:
https://www.nibzard.com/about
Scaled GitHub stars to 20,000+
Built engaged communities across platforms (2.8K X, 5.4K LinkedIn, 700+ YouTube)
etc, etc.
No doubt impressive to marketing types, but maybe a pinch of salt is required when it comes to using AI agents in production.
That's so trite. What makes people write such sentences and not feel embarrassed? I remember when bragging so callously about arbitrary stuff would make you seem off-putting; what happened to that? Today it seems like everyone is bragging about what they do more than actually doing it, and others seem fine with this, it's just part of "the hustle". Where did we go wrong?
Not only is the website layout horrible to read, it also smells like the article was written by AI.
My brain just screams "no" when I try to read that.
Don't worry, it's not supposed to be read. The idea is to induce FOMO and get you to subscribe to the author's newsletter for more "insights".
Seems like a reasonable feeling to have. Anything that's not worth writing is not worth reading imo.
Eh, you're going too far with that IMO.
The other day we were discussing a new core architecture for a microservice we were meant to split out of a "larger" microservice so that separate teams could maintain each part.
Instead of discussing it entirely without any basis, I made a quick prototype via explicit prompts telling the LLM exactly what to create, where, etc.
Finally, I asked it to go through the implementation and create a wiki page, concatenating the code and outlining in 1-4 sentences above each "file" excerpt what the goal of the file is.
In the end, I went through it to double-check that it held up to my intentions - which it did, so I didn't change anything.
Now we could all discuss the pros and cons of that architecture while going through it, and the intro sentence gave enough context to each code excerpt to improve understanding/reduce mental load as necessary context was added to each segment.
I would not have been able to allot that time to do all this without an LLM - especially the summarization to 1-3 sentences, so I'll have to disagree when you state this generally.
Though I definitely agree that a blog article like this isn't worth reading if the author couldn't even be arsed to write it themselves.
AI written article about AI usage, building things with AI that others will use to build their own AI with. The future is now indeed.
I feel like HN should have a policy of discouraging comments which accuse articles and other comments of being written by AI. We all know this happens, we all know it's a possibility, and often such comments may even be correct. But seeing this type of comment dozens of times a day on all sorts of different content is tedious. It almost feels like nobody can write anything anymore without someone immediately jumping up and saying "You used AI to write that!".
No. Public shaming for sharing AI written slop is what we need more of.
Such public shaming loses its value when it's overused though (see: boy who cried wolf). The "written by AI" accusation is thrown around so much, when it often isn't even true, that it just triggers scepticism as the initial reaction. At least, it does for me.
But it’s also true in this case. I’ve had my own comments claimed to be AI by someone because I used a phrase like “delve into”, but a few false positives from the over-eager are to be expected even if it’s not optimal.
So it begins: the Design Patterns and Agile/Scrum snake oil of modern times.
No dude, you just don't get it, if you shout at the ai that YOU HAVE SUPERPOWERS GO READ YOUR SUPERPOWERS AT ..., then give it skills to write new skills, and then sprinkle anti grader reward hacking grader design.md with a bit of proactive agent state externalization (UPDATED), and then emotionally abuse it in the prompt, it's going to replace programmers and cure cancer yesterday. This is progress.
Claude Code is AGI. It's simply a (brief) matter of time before it cures cancer. I give Claude Code until Q3 2026 before it synthesizes a complete treatment plan which can eliminate cancer in 80% of patients. This should be obvious to anyone who has intuited the awe-inspiring intelligence of Claude.
Yeah the (updated) tag on all patterns was a bit much
Curing cancer is H2 2030, once my options have vested. :cool-eyeglasses-emoji:
No no. We promise this solution has a totally different name.
In the spirit of the article, I asked ChatGPT to suggest names.
One of the better ones was "Unified LLM Interaction Model (ULIM)". You read it here first...
Here's a pattern I noticed: you notice some pattern that is working (let's say planning or TODO management); if the pattern is indeed solid, it gets integrated into the black box and your agent starts doing it internally. At which point your abstraction on top becomes defective, because agents get confused about planning the planning.
So with the top performers I think what's most effective is just stating clearly what you want the end result to be (with maybe some hints for verifying the results, which is just clarifying the intent further).
If you are interested here is a list of actual agentic patterns - https://go.cbk.ai/patterns
You could also disclose you work there.
Because as soon as I started reading the patterns I realized this was bogus and one could only recommend it because of personal stakes.
I sometimes feel like the cognitive cost of agentic coding is so much higher than that of a skilled human. There is so much more bootstrapping and process around making sure agents don't go off the rails (they will), or that they adhere to their goals (they won't). And in my experience, fixing issues downstream takes more effort than solving the issue at the root.
The pipe dream of agents handling Github Issue -> PullRequest -> Resolve Issue becomes a nightmare of fixing downstream regressions or other chaos unleashed by agents given too much privilege. I think people optimistic on agents are either naive or hype merchants grifting/shilling.
I can understand the grinning panic of the hype merchants because we've collectively shovelled so much capital into AI with very little to show for it so far. Not to say that AI is useless, far from it, but there's far more over-optimism than realistic assessment of the actual accuracy and capabilities.
Cognitive overhead is real. Spent the first few weeks fixing agent mess more than actually shipping.
One thing that helped: force the agent to explain confidence before anything irreversible. Deleting a file? Tell me why you're sure. Pushing code? Show me the reasoning. Just a speedbump but it catches a lot.
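Not my exact setup, but the shape of that speedbump is easy to sketch as a tool wrapper. Everything below is a hypothetical illustration rather than any particular agent framework's API:

    import os

    def guarded_delete(path: str, rationale: str) -> str:
        # Tool exposed to the agent: it must pass a rationale, and a human still confirms.
        if len(rationale.strip()) < 30:
            return "REFUSED: explain in a sentence or two why this file is safe to delete."
        print(f"Agent wants to delete {path}")
        print(f"Agent's rationale: {rationale}")
        if input("Approve? [y/N] ").strip().lower() != "y":
            return "REFUSED: human did not approve the deletion."
        os.remove(path)
        return f"Deleted {path}."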
Still don't buy the full issue→PR dream though. Too many failure modes.
It can definitely feel like that right now but I think a big part of that is us learning to harness it. That’s why resources like this are so valuable. There’s always going to be pain at the start.
I've seen this "we're still learning" argument for at least 6 months now and I get it and even agree with it. However at which point do we start to question how much is it part of a learning curve and how much is just limitations of the models/software?
[deleted]
> The Real Bottleneck: Time
Already a "no", the bottleneck is "drowning under your own slop". Ever noticed how fast agents seems to be able to do their work in the beginning of the project, but the larger it grows, it seems to get slower at doing good changes that doesn't break other things?
This is because you're missing the "engineering" part of software engineering, where someone has to think about the domain, design, tradeoffs and how something will be used, which requires good judgement and good wisdom regarding what is a suitable and good design considering what you want to do.
Lately (last year or so), more client jobs of mine have basically been "Hey, so we have this project that someone made with LLMs, they basically don't know how it works, but now we have a ton of users, could you redo it properly?", and in all cases, the applications have been built with zero engineering and with zero (human) regards to design and architecture.
I have not yet had any clients come to me and say "Hey, our current vibe-coders are all busy and don't have time, help us with X"; it's always "We've built hairball X, rescue us please?", and that to me makes it pretty obvious what the biggest bottleneck with this sort of coding is.
Moving slower is usually faster long-term granted you think about the design, but obviously slower short-term, which makes it kind of counter-intuitive.
> Moving slower is usually faster long-term granted you think about the design, but obviously slower short-term, which makes it kind of counter-intuitive.
Like an old mentor of mine used to say:
“Slow is smooth; smooth is fast”
[dead]
You should definitely read the whole thing, but tl;dr
- Generate a stable sequence of steps (a plan), then carry it out. Prevents malicious or unintended tool actions from altering the strategy mid-execution and improves reliability on complex tasks.
- Provide a clear goal and toolset. Let the agent determine the orchestration. Increases flexibility and scalability of autonomous workflows.
- Have the agent generate, self-critique, and refine results until a quality threshold is met.
- Provide mechanisms to interrupt and redirect the agent’s process before wasted effort or errors escalate. Effective systems blend agent autonomy with human oversight. Agents should signal confidence and make reasoning visible; humans should intervene or hand off control fluidly.
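Of these, the third bullet is the easiest one to picture in code. A toy generate/critique/refine loop, where `llm` is a placeholder for whatever model call you use and the SCORE format is just an illustration, not a standard:

    def generate_and_refine(task: str, llm, max_rounds: int = 3, threshold: int = 8) -> str:
        # Draft, then have the model critique its own output until it scores well enough.
        draft = llm("Complete this task:\n" + task)
        for _ in range(max_rounds):
            critique = llm(
                "Rate this result from 1-10 and list concrete problems.\n"
                "Task: " + task + "\nResult:\n" + draft + "\n"
                "Answer as: SCORE: <n> then PROBLEMS: <list>"
            )
            # Assumes the model followed the requested format; a real loop would parse defensively.
            score = int(critique.split("SCORE:")[1].split()[0])
            if score >= threshold:
                break
            draft = llm("Task: " + task + "\nPrevious attempt:\n" + draft + "\nFix these problems:\n" + critique)
        return draft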
If you've ever heard of "continuous improvement", now is the time to learn how that works, and hook that into your AI agents.
This is a great consolidation of various techniques and patterns for agentic coding. It’s valuable just to standardize our vocabulary in this new world of AI led or assisted programming. I’ve seen a lot of developers all converging toward similar patterns. Having clear terms and definitions for various strategies can help a lot in articulating the best way to solve a given problem. Not so different from approaching a problem and saying “hey, I think we’d benefit from TDD here.”
I recognized the need for this recently and started by documenting one [1]... then I dropped the ball because I, too, spent my winter holiday engrossed in agentic development. (Instead of documenting patterns.) I'm glad somebody kept writing!
I will ruefully admit that I had also planned a similar blog post! I am hoping I can still add some value to the conversation, but it does seem like _everyone_ is writing about agentic development right now.
I can imagine all the middle managers are just salivating at the idea of presenting this webpage to higher ups as part of their "AI Strategy" at the next shareholder meeting.
Bullet point lists! Cool infographics! Foreign words in headings! 93 pages of problem statement -> solution! More bullet points as tradeoffs breakdown! UPDATED! NEW!
So it doesn't include the only useful thing: the actual agent "code".
> Star History
How you know something is done by either a grifter or a starving student looking for work.
Why is this at the top?
I've flagged it, that's what we should be doing with AI content.
The word cost is mentioned only twice in the entire article, lol
If you're remotely interested in this type of stuff, then scan papers on arXiv[0] and you'll start to see patterns emerge. This article is awful both from a readability standpoint and from a "does this author give me the impression they know what they're talking about" standpoint.
But scrap that, it's better just thinking about agent patterns from scratch. It's a green field and, unless you consider yourself profoundly uncreative, the process of thinking through agent coordination is going to yield much greater benefit than eating ideas about patterns through a tube.
The emergence of this kind of thing has been so surprising to me. The exact same sort of person that managed to bottleneck themselves and obliterate signal-to-noise ratios at every company they work for with endless obsession over the trivial minutiae of the systems they are working with have found a way to do it with LLMs, too, which I would have assumed would have been the death of this kind of busywork
All of this might as well be greek to me. I use ChatGPT and copy paste code snippets. Which was bleeding edge a year or two ago, and now it feels like banging rocks together when reading these types of articles. I never had any luck integrating agents, MCP, using tools etc.
Like if I'm not ready to jump on some AI-spiced up special IDE, am I then going to just be left banging rocks together? It feels like some of these AI agent companies just decided "Ok we can't adopt this into the old IDE's so we'll build a new special IDE"?_Or did I just use the wrong tools (I use Rider and VS, and I have only tried Copilot so far, but feel the "agent mode" of Copilot in those IDE's is basically useless).
I'm so happy someone else says this, because I'm doing exactly the same. I tried to use agent mode in vs code and the output was still bad. You read simple things like: "We use it to write tests". I gave it a very simple repository, said to write tests, and the result wasn't usable at all. Really wonder if I'm doing it wrong.
I’m not particularly proAI but I struggle with the mentality some engineers seem to apply to trying.
If you read someone say “I don’t know what’s the big deal with vim, I ran it and pressed some keys and it didn’t write text at all” they’d be mocked for it.
But with these tools there seems to be an attitude of “if I don’t get results straight away it’s bad”. Why the difference?
I don't understand how to get even bad results. Or any results at all. I'm at a level where I'm going "This can't just be me not having read the manual".
I get the same change applied multiple times, the agent having some absurd method of applying changes that conflict with what I say it like some git merge from hell and so on. I can't get it to understand even the simplest of contexts etc.
It's not really that the code it writes might not work. I just can't get past the actual tool use. In fact, I don't think I'm even at the stage where the AI output is even the problem yet.
There isn't a bunch of managers metaphorically asking people if they're using vim enough, and not so many blog posts proclaiming vim as the only future for building software
I’d argue that, if we accept that AI is relevant enough to at least be worth checking, then dismissing it with minimal effort is just as bad as mindlessly hyping the tech.
You must be new here. "I use vim between", "you don't use vim, you use Visual Studio, your opinion doesn't count" is a thing in programming circles.
I agree to a degree, but I am in that camp. I subscribe to alphasignal, and every morning there are 3 new agent tools, and two new features, and a new agentic approach, and I am left wondering, where is the production stuff?
So just like in the JavaScript world?
Well one could say that since it's AI, AI should be able to tell us what we're doing wrong. No?
AI is supposed to make our work easier.
What you are doing wrong in respect to what? If you ask for A, how would any system know that you actually wanted to ask for B?
Honestly IMO it's more that I ask for A, but don't strongly enough discourage B then I get both A, B and maybe C, generally implemented poorly. The base systems need to have more focus and doubt built in before they'll be truely useful for things aside from a greenfield apps or generating maintainable code.
You didn't actually just say "write tests" though right? What was the actual prompt you used?
I feel like that matters more than the tooling at this point.
I can't really understand letting LLMs decide what to test or not, they seem to completely miss the boat when it comes to testing. Half of them are useless because they duplicate what they test, and the other half doesn't test what they should be testing. So many shortcuts, and LLMs require A LOT of hand-holding when writing tests, more so than other code I'd wager.
No, you have similar experience as a lot of people have.
LLMs just fail (hallucinate) in less known fields of expertise.
Funny: Today I have asked Claude to give me syntax how to run Claude Code. And its answer was totally wrong :) So you go to documentation… and its parts are obsolete as well.
LLM development is in style “move fast and break things”.
So in few years there will be so many repos with gibberish code because “everybody is coder now” even basketball players or taxi drivers (no offense, ofc, just an example).
It is like giving F1 car to me :)
you need to write a test suite to check his test generation (soft /s)
Yeah if you've not used codex/agent tooling yet it's a paradigm shift in the way of working, and once you get it it's very very difficult to go back to the copy-pasta technique.
There's obviously a whole heap of hype to cut through here, but there is real value to be had.
For example yesterday I had a bug where my embedded device was hard crashing when I called reset. We narrowed it down to the tool we used to flash the code.
I downloaded the repository, jumped into codex, explained the symptoms and it found and fixed the bug in less than ten minutes.
There is absolutely no way I'd of been able to achieve that speed of resolution myself.
- We narrowed it down to the tool we used to flash the code.
- I downloaded the repository, jumped into codex, explained the symptoms and it found and fixed the bug in less than ten minutes.
Change the second step to: - I downloaded the repository, explained the symptoms, copied the relevant files into Claude Web and 10 minutes later it had provided me with the solution to the bug.
Now I definitely see the ergonomic improvement of Claude running directly in your directory, saving you copy/paste twice. But in my experience the hard parts are explaining the symptoms and deciding what goes into the context.
And let's face it, in both scenarios you fixed a bug in 10-15 minutes which might have taken you a whole hour/day/week before. It's safe to say that LLMs are an incredible technological advancement. But the discussion about tooling feels like vim vs emacs vs IDEs. Maybe you save a few minutes with one tool over the other, but that saving is often blown out of proportion. The speedup I gain from LLMs (on some tasks) is incredible. But it's certainly not due to the interface I use them in.
Also I do believe LLM/agent integrations in your IDE are the obvious future. But the current implementations still add enough friction that I don't use them as daily drivers.
> I never had any luck integrating agents
What exactly do you mean with "integrating agents" and what did you try?
The simplest (and what I do) is not "integrating them" anywhere, but just replace the "copy-paste code + write prompt + copy output to code" with "write prompt > agent reads code > agent changes code > I review and accept/reject". Not really "integration" as much as just a workflow change.
I installed the copilot extension in my IDE, and switched on Agent mode.
I don't really get how the workflow is supposed to work, but I think it's mostly due to how the tool is made. It has like some sort of "change stack" similar to git commits/staging but which keeps conflicting with anything I manually edit.
Perhaps it's just this particular implementation (Copilot integration in VS) which is bad, and others are better? I have extreme trouble trying to feed it context, handling suggested AI changes without completely corrupting the code for even small changes.
Hm, yeah maybe. I've tried Cursor once, but the entire experience was so horrible, and it was really hard to know what's going on.
The workflow I have right now, is something like what I put before, and I do it with Codex and Claude Code, both work the same. Maybe try out one of those, if you're comfortable with the terminal? It basically opens up a terminal UI, can read current files, you enter a prompt, wait, then can review the results with git or whatever VCS you use.
But I'm also never "vibe-coding", I'm reviewing every single line, and mercilessly ask the agent to refactor whenever the code isn't up to my standards. Also restart the agent after each prompt finished, as they get really dumb as soon as context is used more than 20% of their "max".
Make sure you’re clicking “Keep” to “approve” the changes. It’s annoying but I don’t think there is a way around having to do that. Then if you manually edit something, you can mention it in your next chat message, e.g., “I made a few changes to <file>. <Next instruction>”
I used to do it the way you were doing it. A friend went to a hackathon and everyone was using Cursor and insisted that I try it. It lets you set project level "rules" that are basically prompts for how you want things done. It has access to your entire repo. You tell the agent what you want to do, and it does it, and allows you to review it. It's that simple; although, you can take it much further if you want or need to. For me, this is a massive leap forward on its own. I'm still getting up to speed with reproducible prompt patterns like TFA mentions, but it's okay to work incrementally towards better results.
I also sympathize with that approach, and found it sometimes better than agents. I believe some of the agentic IDEs are missing a "contained mode".
Let me select lines in my code which you are allowed to edit in this prompt and nothing else, for these "add a function that does x" without starting to run amok
Yes. And some way of using an instructions file. Because interacting with an agent in a tiny plugin window without use of "agents.md" or some sort of persistent prompt you can adjust retry etc is horrible.
Now it's "please add one unit test for Foobar()" and it goes away and thinks for 2 minues and does nothing then I point it to where the FooBar() which it didn't find and then adds a test method then I change the name to one I like better but now the AI change wasn't "accepted"(?) so the thing is borked...
I think the UX for agents is important and ...this can't be it.
I recently pasted an error I found into claude code and asked who broke this. It found the commit and also found that someone else had fixed it in their branch.
You should use claude code.
There's no reason this should not be possible in other IDEs, except for the vendor lock-in.
You just didn't drink enough cool-aid and have intact brain.
Copilot's agent mode is a disaster. Use better tools: try Claude Code or OpenCode (my favorite).
It's a new ecosystem with its own (atrocious!) jargon that you need to learn. The good news is that it's not hard to do so. It's not as complex or revolutionary as everyone makes it look like. Everything boils down to techniques and frameworks of collecting context/prompt before handing it over to the model.
Yep, basically this. In the end it helps having the mental model that (almost) everything related to agents is just a way to send the upstream LLM a better and more specific context for the task you need to solve at that specific time. i.e Claude Code "skills" are simply a markdown file in a subdirectory with a specific name that translates to a `/SKILL_NAME` command in Claude and a prompt that is injected each time that skill is mentioned or Claude thinks it needs to use, so it doesn't forget the specific way you want to handle that specific task.
Sadly we have some partnership meaning it's Copilot or nothing.
I feel like just use claude code. That is it. Use it you get the feel for it. Everyone is over complicating.
It is like learning to code itself. You need flight hours.
I'm stuck with the Copilot tools. Again, I don't think this is a problem with the models but with the tooling. I can't switch to claude code (for work, that is) and while I don't mind using more command line tools I don't want to run multiple IDE's.
But it's good to hear that it's not me being completely dumb, it's Copilot Agent Mode tooling that is?
It's not that simple. That's how I started as well but now I have hooked up Gemini and GPT 5.2 to review code and plans and then to do consensus on design questions.
And then there's Ralph with cross LLM consensus in a loop. It's great.
This is something that continues to surprise me. LLMs are extremely flexible and already come prepackaged with a lot of "knowledge", you don't need to dump hundreds of lines of text to explain to it what good software development practices are. I suspect these frameworks/patterns just fill up the context with unecessary junk.
I think avoiding filling context up with too much pattern information, is partially where agent skills are coming from, with the idea there being that each skill has a set of triggers, and the main body of the skill is only loaded into context, if that trigger is hit.
You could still overload with too many skills but it helps at least.
You get to 80% there (numbers pulled out of the air) by just telling it to do things. You do need more to get from 80% there to 90%+ there.
How much more depends on what you're trying to do and in what language (e.g. "favourite" pet peeve: Claude occasionally likes to use instance_variable_get() in Ruby instead of adding accessors; it's a massive code smell), but there are some generic things, such as giving it instructions on keeping notes and giving them subagents to farm out repetitive tasks to prevent the individual task completion from filling up the context for tasks that are truly independent (in which case, for Claude Code at least, you can also tell it to do multiple in parallel)
But, indeed, just starting Claude Code (or Codex; I prefer Claude but it's a "personality thing" - try tools until you click with one) and telling it to do something is the most important step up from a chat window.
I agree about the small tweaks like the Ruby accessor thing, I also have some small notes like that myself, to nudge the agent in the right direction.
> I suspect these frameworks/patterns just fill up the context with unecessary junk.
That's exactly the point. Agents have their own context.
Thus, you try to leverage them by combining ad-hoc instructions for repetitive tasks (such as reviewing code or running a test checklist) and not polluting your conversation/context.
Ah do you mean sub-agents? I do understand that if I summon a sub-agent and give it e.g. code reviewing instructions, it will not fill up the context of the main conversation. But my point is that giving the sub-agent the instruction "review this code as if you were a staff engineer" (literally those words) should cover most use cases (but I can't prove this, unfortunately).
I do think you're right that you should be cautious about writing too convoluted sub-agents.
I'd rather use more of them that are brief and specialized, than try to over-correct on having a single agent try to "remember" too many rules. Not really because the description itself will eat too much context, but because having the sub-agent work for too long will accumulate too much context and dilute your initial instructions anyway.
If I don't instruct it to in some way, the agent will not write tests, will not conform with the linter standard, will not correctly figure out the command to run a subset of tests, etc.
The idea is to produce such articles, not read them. Do not even read them as the agent is spitting them out - simply feed straight into another agent to verify.
Present it at the next team/management meeting to seem in the loop and hope nobody asks any questions
No questions. It will be pasted into their AI tool. And things will be great. For few weeks at least until something break a nobody will know what
I'm doing the same. My reason is not the IDE, I just can't let AI agent software onto my machine. I have no trust at all in it and the companies who make this software. I neither trust them in terms of file integrity nor for keeping secrets secret, and I do have to keep secrets like API keys on my file system.
Am I right in assuming that the people who use AI agent software use them in confined environments like VMs with tight version control?
Then it makes sense but the setup is not worth the hassle for me.
I am on the other side, I have given the complete control of my computer to Claude Code - Yolo Mode. Sudo. It just works. My servers run the same. I SSH into Claude Code there and let them do whatever work they need to do.
So my 2 cents. Use Claude Code. In Yolo mode. Use it. Learn with it.
Whenever I post something like this I get a lot of downvots. But well ... end of 2026 we will not use computer the way we use them now. Claude Code Feb 2025 was the first step, now Jan 2026 CoWork (Claude Code for everyone else) is here. It is just a much much more powerful way to use computers.
> end of 2026 we will not use computer the way we use them now.
I think it will take much longer than that for most people, but I disagree with the timeline, not where we're headed.
I have a project now where the entirety of the project fall into these categories:
- A small server that is geared towards making it easy to navigate the reports the agents produce. This server is 100% written by Claude Code - I have not even looked at it, nor do I have any interest in looking at it as it's throwaway.
- Agent definitions.
- Scripts written by the agents for the agents, to automate away the parts where we (well, the agents mostly) have found a part of the task is mechanical enough to either take Claude out of the loop entirely, or produce a script that does the mechanical part interspersed with claude --print for smaller subtasks (and then systematically try to see if sonnet or haiku can handle the tasks). Eventually I may get to a point of starting to optimise it to use API's for smaller, faster models where they can handle the tasks well enough.
The goal is for an increasing proportion of the project to migrate from the second part (agent definitions) to the third part, and we do that in "production" workflows (these aren't user facing per se, but third parties do see the outputs).
That is, I started with a totally manual task I was carrying out anyway, defined agents to take over part of the process and produce intermediate reports, had it write the UI that lets me monitor the agents progress, then progressively I'd ask the agent after each step to turn any manual intervention into agents, commands, and skills, and to write tools to handle the mechanical functions we identified.
For each iteration, more stuff first went into the agent definitions, and then as I had less manual work to do, some of that time has gone into talking to the agent about which sub-tasks we can turn into scripts.
I see myself doing this more and more, and often "claude" is now the very first command I run when I start a new project whether it is code related or not.
This comment matches my experience with proto-AGI LLMs.
Claude Code is the secret.
Claude Code is the question and the answer.
Claude Code has already revolutionized this industry. Some of you are just too blind to see it yet.
Claude Code and agents are the hot new hammer, and they are cool, I use CC and like it for many things, but currently they suffer from the "hot new hammer" hype so people tend to think everything is a nail the LLM can handle. But you still need a screwdriver for screws, even if you can hammer them in.
Don't say "we" when talking about yourself.
I already do.
And yes, it is a hypothesis about the future. Claude Code was just a first step. It will happen to the rest of computer use as well.
[dead]
I’d rather just read the prompt that this article was generated from.
I finally found the perfect way to describe what I feel when I read stuff like this.
I remember some proto-memes about translation of some text between English and Chinese 100 times and the results being hilarious...modern parallel would be to ask a LLM to read the article, and generate the prompt that constructed the article. Then generate an article based on that prompt. Repeat x100.
I Would Rather Read The Prompt (IWRRTP)
I laughed when I noticed the username
JTPP - just the prompt, please
I hereby second the motion to get this acronym widely adopted
Tempted to copy the content and launder it through another LLM and post a comment linking to my own version
That's like saying you'd rather listen to someone ask a question than read a chapter of a textbook.
About 99% of the blogs [written by humans] that reach HN's front page are fundamentally incorrect. It's mostly hot takes by confident neophytes. If it's AI-written, it actually comes close to factual. The thing you don't like is usually right, the thing you like is usually wrong. And that's fine if you'd rather read fiction. Just know what you're getting yourself into.
Donate me the tokens, dont donate me slop PRs - open source maintainer
Not wanting to be a gatekeeper, but the author appears to be a "AI Growth Innovator" or some-such-I-don't-know-what rather than an actual engineer who has been ramping up on AI use to see what works in production:
https://www.nibzard.com/about
Scaled GitHub stars to 20,000+
Built engaged communities across platforms (2.8K X, 5.4K LinkedIn, 700+ YouTube)
etc, etc.
No doubt impressive to marketing types but maybe a pinch of salt required for using AI Agents in production.
That's so trite, what makes people write such sentences and not feel embarrassed? I remember when bragging so callously about arbitrary stuff would make you seem off-putting, what happened with that? Today it seems like everyone is bragging about what they do more than actually doing, and others seem fine with this, just part of "the hustle", where did we go wrong?
Not only is the website layout horrible to read, it also smells like the article was written by AI. My brain just screams "no" when I try to read that.
Don't worry, it's not supposed to be read. The idea is to induce FOMO and subscribe to authors newsletter to get more "insights".
Seems like a reasonable feeling to have. Anything that's not worth writing is not worth reading imo.
Eh, you're going too far with that IMO.
The other day we were discussing a new core architecture for a Microservice we were meant to split out of a "larger" Microservice so that separate teams could maintain each part.
Instead of just discussing it entirely without any basis, I instead made a quick prototype via explicit prompts telling the LLM exactly what to create, where etc.
Finally, I asked it to go through the implementation and create a wiki page, concatting the code and outlining in 1-4 sentences above each "file" excerpt what the goal for the file is.
In the end, I went through it to double-check if it held up from my intentions - which it did and thus didn't change anything
Now we could all discuss the pros and cons of that architecture while going through it, and the intro sentence gave enough context to each code excerpt to improve understanding/reduce mental load as necessary context was added to each segment.
I would not have been able to allot that time to do all this without an LLM - especially the summarization to 1-3 sentences, so I'll have to disagree when you state this generally.
Though I definitely agree that a blog article like this isn't worth reading if the author couldn't even be arsed to write it themselves.
AI written article about AI usage, building things with AI that others will use to build their own AI with. The future is now indeed.
I feel like HN should have a policy of discouraging comments which accuse articles and other comments of being written by AI. We all know this happens, we all know it's a possibility, and often such comments may even be correct. But seeing this type of comment dozens of times a day on all sorts of different content is tedious. It almost feels like nobody can write anything anymore without someone immediately jumping up and saying "You used AI to write that!".
No. Public shaming for sharing AI written slop is what we need more of.
Such public shaming loses its value when it's overused though (see: boy who cried wolf). The "written by AI" accusation is thrown around so much, when it often isn't even true, that it just triggers scepticism as the initial reaction. At least, it does for me.
But it's also true in this case. I've had my own comments accused of being AI-written because I used a phrase like "delve into", but a few false positives from the over-eager are to be expected, even if it's not optimal.
So it begins, Design Patterns and Agile/Scrum snake oil of modern times.
No dude, you just don't get it, if you shout at the ai that YOU HAVE SUPERPOWERS GO READ YOUR SUPERPOWERS AT ..., then give it skills to write new skills, and then sprinkle anti grader reward hacking grader design.md with a bit of proactive agent state externalization (UPDATED), and then emotionally abuse it in the prompt, it's going to replace programmers and cure cancer yesterday. This is progress.
Claude Code is AGI. It's simply a (brief) matter of time before it cures cancer. I give Claude Code until Q3 2026 before it synthesizes a complete treatment plan which can eliminate cancer in 80% of patients. This should be obvious to anyone who has intuited the awe-inspiring intelligence of Claude.
Yeah the (updated) tag on all patterns was a bit much
Curing cancer is H2 2030, once my options have vested. :cool-eyeglasses-emoji:
No no. We promise this solution has a totally different name.
In the spirit of the article, I asked ChatGPT to suggest names.
One of the better ones was "Unified LLM Interaction Model (ULIM)". You read it here first...
Here's a pattern I've noticed: you spot some pattern that works (let's say planning or TODO management); if the pattern is indeed solid, it eventually gets integrated into the black box and your agent starts doing it internally. At which point your abstraction on top becomes defective, because the agent gets confused planning the planning.
So with the top performers, I think what's most effective is just stating clearly what you want the end result to be (with maybe some hints for verifying the result, which is really just clarifying the intent further).
If you are interested here is a list of actual agentic patterns - https://go.cbk.ai/patterns
You could also disclose that you work there.
Because as soon as I started reading the patterns, I realized this was bogus and that one could only recommend it because of personal stakes.
I sometimes feel like the cognitive cost of agentic coding is much higher than that of a skilled human. There is so much more bootstrapping and process overhead around making sure agents don't go off the rails (they will) or that they adhere to their goals (they won't). And in my experience, fixing issues downstream takes more effort than solving the issue at the root.
The pipe dream of agents handling GitHub Issue -> Pull Request -> Resolve Issue becomes a nightmare of fixing downstream regressions or other chaos unleashed by agents given too much privilege. I think people optimistic about agents are either naive or hype merchants grifting/shilling.
I can understand the grinning panic of the hype merchants because we've collectively shovelled so much capital into AI with very little to show for it so far. Not to say that AI is useless, far from it, but there's far more over-optimism than realistic assessment of the actual accuracy and capabilities.
Cognitive overhead is real. Spent the first few weeks fixing agent mess more than actually shipping. One thing that helped: force the agent to explain confidence before anything irreversible. Deleting a file? Tell me why you're sure. Pushing code? Show me the reasoning. Just a speedbump but it catches a lot. Still don't buy the full issue→PR dream though. Too many failure modes.
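(A minimal sketch of that speedbump as a tool-call wrapper, in Python. The tool names, the IRREVERSIBLE set, and the input() review step are all made up for illustration; the point is that destructive calls don't run without a stated justification and a human nod.)

    import os

    # Hypothetical tool names the agent is allowed to call.
    IRREVERSIBLE = {"delete_file", "git_push", "drop_table"}

    def run_tool(name: str, args: dict, justification: str = "") -> str:
        # Speedbump: irreversible tools must arrive with the agent's own
        # reasoning, and a human gets the final say before they execute.
        if name in IRREVERSIBLE:
            if not justification:
                return "REFUSED: explain why you're sure before calling this tool."
            answer = input(f"{name}({args})\nAgent says: {justification}\nAllow? [y/N] ")
            if answer.strip().lower() != "y":
                return "REFUSED: human reviewer declined."
        if name == "delete_file":
            os.remove(args["path"])
            return f"deleted {args['path']}"
        # ... dispatch the remaining (reversible) tools here ...
        return "ok"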
It can definitely feel like that right now but I think a big part of that is us learning to harness it. That’s why resources like this are so valuable. There’s always going to be pain at the start.
I've seen this "we're still learning" argument for at least 6 months now, and I get it and even agree with it. However, at what point do we start to question how much of it is part of a learning curve and how much is just a limitation of the models/software?
> The Real Bottleneck: Time
Already a "no" - the bottleneck is "drowning under your own slop". Ever noticed how agents seem to work fast at the beginning of a project, but the larger it grows, the slower they get at making good changes that don't break other things?
This is because you're missing the "engineering" part of software engineering, where someone has to think about the domain, the design, the tradeoffs and how something will be used - which requires good judgement and wisdom about what a suitable design looks like for what you want to do.
Lately (the last year or so), more of my client jobs have basically been "Hey, so we have this project that someone made with LLMs, they basically don't know how it works, but now we have a ton of users, could you redo it properly?", and in every case the application had been built with zero engineering and zero (human) regard for design and architecture.
I have not yet had any clients come to me and say "Hey, our current vibe-coders are all busy and don't have time, help us with X"; it's always "We've built hairball X, rescue us please?", and that, to me, makes it pretty obvious what the biggest bottleneck with this sort of coding is.
Moving slower is usually faster long-term, provided you think about the design, but obviously slower short-term, which makes it kind of counter-intuitive.
> Moving slower is usually faster long-term, provided you think about the design, but obviously slower short-term, which makes it kind of counter-intuitive.
Like an old mentor of mine used to say:
“Slow is smooth; smooth is fast”
You should definitely read the whole thing, but tl;dr
If you've ever heard of "continuous improvement", now is the time to learn how that works, and hook that into your AI agents.
This is a great consolidation of various techniques and patterns for agentic coding. It's valuable just to standardize our vocabulary in this new world of AI-led or AI-assisted programming. I've seen a lot of developers all converging toward similar patterns. Having clear terms and definitions for various strategies can help a lot in articulating the best way to solve a given problem. Not so different from approaching a problem and saying "hey, I think we'd benefit from TDD here."
I recognized the need for this recently and started by documenting one [1]... then I dropped the ball because I, too, spent my winter holiday engrossed in agentic development. (Instead of documenting patterns.) I'm glad somebody kept writing!
[1]: https://kerrick.blog/articles/2025/use-ai-to-stand-in-for-a-...
I will ruefully admit that I had also planned a similar blog post! I am hoping I can still add some value to the conversation, but it does seem like _everyone_ is writing about agentic development right now.
Agentic Patterns website https://agentic-patterns.com/
Github https://github.com/nibzard/awesome-agentic-patterns
I can imagine all the middle managers are just salivating at the idea of presenting this webpage to higher ups as part of their "AI Strategy" at the next shareholder meeting.
Bullet point lists! Cool infographics! Foreign words in headings! 93 pages of problem statement -> solution! More bullet points as tradeoffs breakdown! UPDATED! NEW!
So it doesn't include the only useful thing: the actual agent "code".
> Star History
How you know something is done either by a grifter or a starving student looking for work.
Why is this at the top?
I've flagged it, that's what we should be doing with AI content.
The word cost is mentioned only twice in the entire article, lol
If you're remotely interested in this type of stuff, then scan the papers on arXiv [0] and you'll start to see patterns emerge. This article is awful from a readability standpoint, and from a "does this author give me the impression they know what they're talking about" standpoint.
But scrap that, it's better just thinking about agent patterns from scratch. It's a green field and, unless you consider yourself profoundly uncreative, the process of thinking through agent coordination is going to yield much greater benefit than eating ideas about patterns through a tube.
0: https://arxiv.org/search/?query=agent+architecture&searchtyp...
Am I the only one with scrolling issues in Firefox on this website?
It literally gets "stuck" and becomes un-scrollable.
Looks to be a good resource with lots of links.
Thanks for the share!