Good Enough vs. Do It Myself: Sorting My AI Use-Cases
A non-exhaustive catalog of how a software engineer actually uses AI, scored 1–5 - and a pattern I absolutely did expect to find
I think one of the reasons I’m in this hate/love relationship with LLMs/AI-assistants/... is that there are so many different use-cases and somehow we are now getting used to using one single (or maybe two - think Claude/ChatGPT (chat interface)/... and Claude Code/Codex/...) interface to address all of them.
As an example, when I asked Ia whether she finds using AI “dangerous” because it might lower your own cognitive capabilities, she quite reasonably highlighted that it can be used in so many ways and you should be fine, as long as you are aware of which type of activity you are engaging in.
I decided to start cataloging the ways a software engineer is, in my experience, using AI interfaces and what I personally find useful or harmful, along with some tips for usage. The list is not intended to be exhaustive, and I hope you’ll add your personal use-cases and stories of success or failure with each of them.
I am also sure that each of these “types” of usage can be improved in a certain way - through prompts, agents, infra around them, etc. - and I’d love to hear your stories, to be honest, since I’m stuck in my own Claude Code bubble with my own problems.
Let’s start with the soft basics:
Rubberducking 🦆
You have a problem at hand and you have a lot of thoughts about it. You need to express them, so that through the expression of those thoughts you come to some conclusion.
How did I do it before AI?
a) Harass a friend - teammates, ex-colleagues, friends, my mom - everyone got some random “hey, help me think through this”.
b) Write it out - this one’s my favorite, because humans are not always present/awake, and on paper you can navigate up and down between different parts of your arguments and thoughts.
How I do it with AI
Honestly, it sounds like talking with AI in a conversational manner would replace the harass-a-friend kind of interaction. And maybe the first models actually did, but nowadays I feel much better if I:
1) first write down everything I think
2) mark all of the “unknowns” in my train of thought
3) check whether the first two steps already brought me forward (they usually do); if not - continue
4) once I’ve exhausted all thoughts but still have questions, research the marked unknowns - here, fine, you can use a chat assistant to do the research for you and read through it. Remember, though, to actually look at the papers/articles it found, and not just the final conclusion
Now here, you’ll be tempted to dump all of what you wrote into the AI and ask it to make sense of it / suggest more ideas / suggest an outcome - in my experience, this is a bad idea. It’ll generate a lot of NOISE that you don’t need at the moment, because the whole point of the exercise was to find signal, instead of generating more noise.
The research parts of your marked areas should already give you more structured information, that you yourself can process and turn into signal.
Plus, if you do that, afterwards you’ll feel lost as to which of the thoughts were YOUR thoughts and which were generated by the LLM. That is a very uncomfortable feeling if you want to be honest about your solution.
The process is the solution here - so you kinda have to walk the problem yourself.
Applicable to both the chat interface and the Claude Code-like interface - probably Claude Code would give you more structure, more access to your internal docs, skills, etc., but still.
AI-usefulness index: 2/5
Remind me what’s going on? 🤔
I’ve actually found this a very small, but very helpful use-case. You’ve been on vacation for two weeks. You’ve been so braindead and tired that you just LEFT everything behind and went to enjoy that wonderful sea/mountain/river-side at last. You come back and... what? It used to take us some time to “get back to work”. Nowadays I ask Claude to “remind me where we were at”. It’s pretty good actually. I think the output will be good enough to at least get you back into the working mood. And in my case it can be several repos and their commits; in your case it might be all the user stories created while you were gone - but the point being, summarizing a certain amount of “data” (commits, chats, stories, specs) over a relatively short time window works pretty well 😃
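To make the “bounded corpus over a short time window” idea concrete, here is a minimal sketch of how the commits part could be collected into one catch-up prompt. The function names, the prompt wording, and the two-week default are my assumptions for illustration, not a description of what Claude does internally.

```python
import subprocess
from pathlib import Path


def recent_commits(repo: Path, since: str = "2 weeks ago") -> list[str]:
    """Return one-line commit summaries from `repo` that are newer than `since`."""
    result = subprocess.run(
        ["git", "-C", str(repo), "log", f"--since={since}",
         "--pretty=format:%ad %an: %s", "--date=short"],
        capture_output=True, text=True, check=True,
    )
    return [line for line in result.stdout.splitlines() if line.strip()]


def build_catchup_prompt(commits_by_repo: dict[str, list[str]]) -> str:
    """Assemble a single prompt asking the agent to summarize the collected commits."""
    sections = []
    for repo, commits in commits_by_repo.items():
        body = "\n".join(f"- {c}" for c in commits) or "- (no commits in this window)"
        sections.append(f"## {repo}\n{body}")
    return (
        "Remind me where we were at. Summarize the work below per repo, "
        "then list anything that looks unfinished.\n\n" + "\n\n".join(sections)
    )


# Hypothetical usage (the repo paths are placeholders):
# repos = [Path("~/code/api").expanduser(), Path("~/code/web").expanduser()]
# print(build_catchup_prompt({r.name: recent_commits(r) for r in repos}))
```

The point of the sketch is only that the input is small, already exists, and is yours - which is exactly the kind of material that summarizes well.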
AI-usefulness index: 5/5
Enrichment 📖
I’ve talked about the enrichment step in my Automating myself out of development post.
To keep it short and honest - enrichment is when you know what you want, in terms of a user story, but you are too lazy to write the details out. Works best if you have prior approved-and-implemented specs somewhere accessible for the agent (let me call it “agent” from now on, ah? LLM, chatbot, Claude Code, whatever you are using - let’s just call it agent for simplicity). It would also work if it has to go and read the code every time, but yeah, tokens, you know.
Using the agent here typically goes fine, although it’s quite hard to evaluate it fairly: I still have the mental model of the application in my head, so even my one-sentence descriptions implicitly carry a good description and an understanding of the codebase.
So what would probably make sense here, once you move from solo to team development, is to make this step a critical one - critical as in “criticizing”. Make it (through prompts, or skills, or whatever the next agent framework gives you) question the original message and highlight any prior implementations that the next agent (or human) should pay attention to.
Useful also to remind yourself of prior implementations, you know? Like for example, we wanted to introduce “capabilities” to our agents (they are SRE agents investigating incidents, in short), but we already had “capabilities” as a concept attached to our integrations (e.g. Grafana/GitHub/GitLab integrations). So the enrichment step would highlight that this already exists and we should at least call it something different - better yet, rethink the whole relationship.
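As a rough illustration of the “capabilities” collision, the check could be as simple as comparing the story text against a glossary of concepts that already exist. This is only a sketch: the glossary, the function name, and the warning wording are hypothetical, and in practice the agent would do this lookup against your real specs and code.

```python
import re

# Hypothetical glossary: concept name -> where it already lives in the product.
EXISTING_CONCEPTS = {
    "capabilities": "integrations (e.g. Grafana/GitHub/GitLab)",
}


def find_concept_collisions(story: str, glossary: dict[str, str]) -> list[str]:
    """Return a warning for every glossary term that the story text reuses."""
    words = set(re.findall(r"[a-z]+", story.lower()))
    return [
        f"'{term}' already exists in {owner}; rename it or rethink the relationship."
        for term, owner in glossary.items()
        if term.lower() in words
    ]


# Hypothetical usage:
# for warning in find_concept_collisions(
#     "Introduce capabilities to our SRE agents", EXISTING_CONCEPTS
# ):
#     print(warning)
```

A plain word match like this only catches identical names; catching the same concept under a different name is the part where the agent earns its keep.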
So, summary -
AI-usefulness index: 4/5
Tip: make it critical
Spec 📝
I got roasted for saying this, at the beginning of the agentic-engineering discussions, but I’m going to state it again and I want to see some real arguments - the output that is good for an agent and the output that is good for a human are not the same. A human will be better off without code snippets in the spec, and would instead appreciate an architecture diagram; an agent, however, can “shift” part of the future implementation into the spec, by adding details such as code snippets there.
Here, what I’d pay attention to, and specifically have a mechanism for dealing with (I’m not being overly detailed on the how, because the concrete output might be different for different tools / teams - but the idea should stand), is this: even when agents have open questions and flag some areas as questionable, they tend to go “I’ve flagged these, and the planner/implementation agent will need to deal with them”. Not OK. I want to see those NOW, and make decisions about them sooner rather than later.
Basically, make them (prompt, ask for structured output, etc.) be explicit about areas that are likely to cause problems or are questionable.
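One way to picture “be explicit” is a structured spec output where open questions are a first-class field, plus a gate that refuses to pass the spec downstream while any of them is unanswered. The sketch below is an assumption about shape only: `SpecDraft`, `OpenQuestion`, and `ready_for_planner` are hypothetical names, and your tool of choice may express the same idea through a schema or a prompt instead.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class OpenQuestion:
    text: str
    answer: Optional[str] = None  # filled in by a human before work continues


@dataclass
class SpecDraft:
    title: str
    body: str
    open_questions: list[OpenQuestion] = field(default_factory=list)


def unresolved(spec: SpecDraft) -> list[str]:
    """Return the text of every question that has no answer yet."""
    return [q.text for q in spec.open_questions if not q.answer]


def ready_for_planner(spec: SpecDraft) -> bool:
    """Gate: the spec moves downstream only once every question is answered."""
    return not unresolved(spec)


# Hypothetical usage:
# spec = SpecDraft("Agent capabilities", "...",
#                  [OpenQuestion("Rename, or merge with integration capabilities?")])
# print(unresolved(spec))         # shows the questions I need to decide on NOW
# print(ready_for_planner(spec))  # stays False until I answer them
```

The design choice that matters is that the questions block the pipeline instead of traveling along with it as a footnote.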
And back to the human-output-vs-agent-output thing - if you care about building the model of the application in your head, alongside it being developed (and I’m not being condescending here; it can be valid not to care in the context of a demo, or an I-just-want-to-be-done-with-this-now-we’ll-throw-it-out-tomorrow kind of implementation, and it might be critical in some other case) - you need to ask for an additional “human-readable” output with extras like diagrams, and no code snippets.
So now, if you have this wonderfully crafted two-type output to help YOU understand, to help YOU answer the important questions, and to help downstream agents process the feature further, you will recognize that not all tasks are made equal (what a surprise).
Here you can create one more “categorizer” that decides which tasks are big enough that they need to go through this elaborate spec creation, or whether they’re easy and can just go further, let’s say without an arch diagram.
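A “categorizer” does not have to be clever. Here is a minimal rule-based sketch; the three signals and the threshold are placeholders I made up for illustration, and the whole point is that you swap in your OWN project criteria.

```python
from dataclasses import dataclass


@dataclass
class Task:
    title: str
    files_touched: int        # rough estimate (hypothetical signal)
    changes_public_api: bool  # hypothetical signal
    open_questions: int       # count coming out of the enrichment/spec step


def needs_full_spec(task: Task, max_files: int = 5) -> bool:
    """Route a task to the elaborate spec path (True) or straight onward (False)."""
    return (
        task.changes_public_api
        or task.open_questions > 0
        or task.files_touched > max_files
    )


# Hypothetical usage:
# typo = Task("Fix typo in settings page", files_touched=1,
#             changes_public_api=False, open_questions=0)
# print(needs_full_spec(typo))  # a task like this would skip the arch diagram
```

Whether this lives in code, in a prompt, or in your head is up to you; writing the criteria down is what makes the decision repeatable.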
I can hear you screaming that I’m overcomplicating things and following the herd to increase token usage. Maybe. MAYBE.
But the thing is - someone, somewhere in the development pipeline needs to do these things. Needs to enrich the story, needs to create the spec, needs to create a diagram describing the spec, needs to decide if the task is “complicated” or not. It’s up to YOU and YOUR use-case to delegate this or not. Lego bricks. Lego bricks....
I got a bit distracted; let’s go back to use-cases.
AI-usefulness index: 4/5
Tip:
think of the result artifacts and who’s going to read them
make sure it’s explicit in the parts where it asks a question
keep already-implemented specs somewhere for reference/context gathering
think about categorizing what is worth an elaborate spec and what’s worth implementing right away. Use your OWN project criteria for this.
Brainstorming 🌪️
Wait, you think - you just talked about rubberducking. Well, that’s different. And I’m not trying to overcomplicate things with terminology. I’m just defining it so we have common ground. For you, brainstorming can mean one thing; for me - another.
So, I see the difference between brainstorming and rubberducking in that, for the first one, you already - maybe not completely, but somewhat - see the possible options, possible tradeoffs, and possible places where something can break, and you need to discuss those tradeoffs with someone. Rubberducking is really more uncomfortable, and there you must “walk the problem”. Brainstorming is ahead of that.
And, as long as you are not dealing with novel issues, or issues that don’t have a lot of ready-made solutions out there in the sea of solutions on the internet - you’ll be fine using LLMs and agents.
Present the problem, ask it to push back, brainstorm options. That’s fine. Don’t forget to have an artifact in the end - a result of the brainstorming. And I’m afraid you are still better off if you brainstorm it first, and THEN ask the LLM for more input if needed.
Generally, I’ll be honest. The more I do something myself and THEN go back to the LLM, the more I re-discover my initial attitude towards agentic systems - which is that they are not doing THAT good of a job... The thing is... I’ve been told so many times that I’m overdoing things. Overcomplicating things. Overthinking things. So I have come to understand that some things are OK to be done “good enough”. One can’t do everything alone, right?
The KEY is for you to understand which things are OK to be done “good enough” and which are not.
Paraphrasing the famous quote:
Grant me the serenity to let the LLM handle the “good enough,”
the courage to do the rest myself,
and the wisdom to know the difference.
AI-usefulness index: 4/5
Tip: pick your battles. Brainstorm about things you know. Avoid jumping into brainstorming about things you don’t know.
Remind me, what was X? 🕵️‍♀️
In the conversation with Ia, I mention this tiny use-case that I find absolutely useful. This is when I forget the name of a book (which happens quite often), or (more applicable to software engineering) the name of some term (I’m so good at remembering things on the concept level, but so bad at remembering the exact term for them...). So I absolutely love when I write something vague and get an answer from the LLM.
I know it’s a small use-case, but yeah, it kinda feels like magic sometimes 😄
I’d put the “cold start” use-case here too. This is when you want to learn/research/search for something and you don’t know where to start at all. You need someone to point you at what to google in the first place. AI’s good at that.
AI-usefulness index: 4.5/5 (I still can’t find a detective sci-fi story from the 90s about a woman embedding her mind into someone else / there was AI involved...)
Tip: Use it if you’re bad at terms
Wait - is there a pattern here?
Okay. I wanted to put these on paper, because I like structure and I want to learn to be more aware of what type of activity I’m engaging in. Now that the whole list is sitting in front of me, the conclusion is the obvious one. Look at the highest scorers: remind me where we were, remind me what X was, the cold start. They’re all retrieval and summarization over a bounded corpus I already own or can reach - my commits, my chats, my specs, the existing pile of articles and papers already out there on the internet. The agent isn’t conjuring value out of thin air; it’s fetching and compressing something that already exists and that I have access to.
The low and middling ones - rubberducking, brainstorming - are generative work, where the process of doing it is the value. And the moment you outsource the process, you don’t get the thing the process was supposed to give you. That’s the whole “the process is the solution” point from way back at the top.
So that’s the line, more or less:
the agent shines when the value is in retrieving and compressing something that already exists
the agent gets shaky (or actively counterproductive) when the value is in the process of generating the thing
That’s why some land at 5 and some at 2. It’s not really a list of random use-cases I happened to think of - it’s the same axis viewed from different angles.
And the in-betweens (enrichment, spec) sit at 4 exactly because they’re mixed: a lot of their value is retrieval (pulling in prior implementations, prior specs, the existing model of the app), and the generative part stays useful only as long as I keep ownership of it - i.e., I demand the human-readable output, I make the open questions explicit NOW, I keep the mental model in my own head. The second I let the agent own the generative part wholesale, those scores would drop too.
Obvious, I know, and I was actually more aware of this when I first started using AI. But I don’t know about you: when you have this assistant next to you - which is by now so much more than just an LLM, with a lot of the provider’s internal logic around it - you don’t usually PAUSE and think about what you are going to tell it and how. You just do. And then you get frustrated when the output is not to your satisfaction.
Hopefully this list will help me manage my expectations before entering the dialog.
So maybe the one-liner to take away, going back to what Ia said: it’s not “is AI dangerous” - it’s am I asking it to retrieve, or to generate? And if it’s the second one, do I still want to own the process?
The usefulness matrix
Rough reading of the table: the higher up you go, the more it’s pure retrieval and the more comfortable you can be handing it off. The lower you go, the more the process itself is the point - and the more you should think twice before delegating it.
A lot of this advice will sound obvious, but pay attention to what you use LLMs for every day, and how, and you’ll start to notice that it takes a little bit of effort to introduce these friction points, instead of jumping into the conversation right away.
I’ll be adding more examples - about code generation, about learning something new, etc. - but I thought I’d wrap this one up before you get bored.
Looking forward to comments with real use-cases and thoughts.


