Hey! Ankur here, and this is the 24th edition of Lazy AI: 5 mins of reading to help you stay ahead of the AI curve.

If you’ve ever stared at a “you’ve reached your usage limit” screen on Claude and wondered what you did wrong, this article is for you.

It’s not the questions. It’s the conversation.

Picture this. You open Claude on a Monday morning. You ask it to help draft an email. Then you tweak it. Then you ask a follow-up. Then another. By noon, you’ve had a 25-message back-and-forth — and suddenly, Claude tells you you’re out of credits for the day.

What happened?

Well, here’s what you should know: Claude doesn’t just read your latest message. It re-reads the entire conversation every single time you hit send. Right from message #1.

And everything it reads is counted in tokens.

A token is simply a chunk of text. AI doesn’t read sentences the way we humans do. It breaks sentences down into tokens, each of which could be a whole word, part of a word, or even a punctuation mark.

(I’ve written more about tokens in an earlier newsletter.)

Now, chatbots consume tokens for every single message in the chat window, every time you send something new. That’s why a 25-message conversation eats up your credits so quickly. It’s not 25 separate questions. It’s 25 questions where each one carries the full weight of everything that came before it.
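If you like seeing the math, here is a rough back-of-the-envelope sketch in Python. It assumes every message, yours or Claude’s, is about 200 tokens (a made-up round number, not a measurement) and that each send re-reads the whole chat window. The function name is mine, purely for illustration.

```python
# Assumption: every message (yours or Claude's) is roughly 200 tokens.
TOKENS_PER_MESSAGE = 200


def tokens_read_over_conversation(num_sends, tokens_per_message=TOKENS_PER_MESSAGE):
    """Total tokens read across a chat where every send re-reads
    the full history plus the new message."""
    total = 0
    history = 0  # tokens already sitting in the chat window
    for _ in range(num_sends):
        history += tokens_per_message  # your new message joins the history
        total += history               # the whole history is read again
        history += tokens_per_message  # the reply joins the history too
    return total


standalone = 25 * TOKENS_PER_MESSAGE         # 25 questions, each in a fresh chat
carried = tokens_read_over_conversation(25)  # 25 questions in one long chat
print(standalone, carried)  # prints: 5000 125000
```

Under these assumptions, the long chat makes Claude read 25 times as many tokens as the same 25 questions asked on their own. Real numbers will differ, but the shape is the same: the cost grows much faster than the message count.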

One creator tracked his Claude usage in detail and found that 98.5% of his tokens were going towards re-reading conversation history. Not the actual work he was asking Claude to do.

Read that again.

You don’t need to be an engineer to talk AI. Subscribe to this newsletter — every other day, you’ll understand AI well enough to be the smartest person in the room at work.

Each issue is just 5 minutes — less than the time you spend doomscrolling before bed. Except, this actually moves your career forward. Join 8,000+ subscribers now.

So how do you ensure you don’t hit Claude’s limits?

Once you understand the token problem, the fixes are simple. Every habit below is answering one question: How do I reduce unnecessary token consumption?

1. Start a new chat for a new topic.

This is the one people resist most, because starting a fresh chat feels like starting over. It’s not.

When you switch topics inside the same conversation, you’re forcing Claude to carry the full history of the previous topic into every reply about the new one. Imagine asking your accountant a tax question in the middle of a meeting about your marketing strategy, and they insist on re-reading the entire marketing discussion before answering. That’s actually what’s happening.

New topic. New chat. The context you actually need takes 10 seconds to re-paste. Everything else is just weight.

2. Edit the message, don’t send a correction.

When Claude doesn’t get it right, most people just type “make it shorter” or “no, that’s not what I want. I want XYZ” and hit send.

But that adds one more message to the stack, and Claude re-reads that stack on every send for the rest of the conversation.

(A quick note before you scroll ahead: I’m using Claude in this article because this article IS about Claude. But the mechanics are pretty much the same across all chatbots. ChatGPT, Gemini and the rest all consume tokens the same way.)

In Claude’s chat interface, you can click Edit on any message you’ve already sent, change what you wrote, and regenerate the response. The new version replaces the old one. It doesn’t stack on top of it.

This single habit probably saves more tokens than anything else on this list. Please use it. Aggressively.
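Here is a small sketch of why editing beats correcting. It assumes a brand-new chat, 200 tokens per message and per reply (illustrative numbers only), and that an edited message fully replaces the earlier attempt, as described above. The function and its parameters are hypothetical, not anything from Claude itself.

```python
def tokens_read(attempts, message=200, reply=200, edit=False):
    """Tokens read while trying `attempts` times to get one answer right.

    edit=False: every retry is a new correction stacked onto the chat.
    edit=True:  every retry edits the original message, so the earlier
                attempt and its reply drop out of what gets re-read.
    """
    total = 0
    history = 0  # tokens sitting in the chat window before the next send
    for _ in range(attempts):
        if edit:
            total += message    # only the edited message is read
        else:
            history += message  # the correction joins the stack
            total += history    # and the whole stack is read again
            history += reply    # the new reply joins the stack too
    return total


print(tokens_read(4), tokens_read(4, edit=True))  # prints: 3200 800
```

Four attempts by correction cost four times as much reading as four attempts by editing in this toy setup, and the gap widens with every extra retry.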

3. Ask for everything at once.

“Summarise this article.” (reads response) “Now pull out the three key points.” (reads response) “Now suggest a headline.”

That’s three separate messages. Three full context reloads. Claude re-reads the entire conversation each time.

Instead, keep it one message — “Summarise this, pull out three key points, and suggest a headline”. It’s one reload. Same output. The answers are usually sharper too, because Claude sees what you’re building toward from the start.
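To put rough numbers on it, the sketch below compares the two approaches. The sizes are assumptions I picked for illustration: a 1,500-token article, 15 tokens per instruction, and 200 tokens per reply. It counts only what Claude has to read, not what it writes.

```python
def input_tokens(article, instruction, reply, split):
    """Tokens read when asking three things about one pasted article.

    split=True:  three separate messages, each re-reading the chat so far.
    split=False: one message containing all three instructions.
    """
    if not split:
        return article + 3 * instruction
    total = 0
    history = article  # the pasted article sits at the top of the chat
    for _ in range(3):
        history += instruction  # the new instruction is added
        total += history        # everything so far is read again
        history += reply        # the reply is added to the chat
    return total


separate = input_tokens(1500, 15, 200, split=True)
combined = input_tokens(1500, 15, 200, split=False)
print(separate, combined)  # prints: 5190 1545
```

With these made-up sizes, splitting the request costs more than three times the reading, mostly because the article itself gets read three times.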

4. Don’t redo the whole thing when one part is wrong.

“Rewrite this email” asks Claude to regenerate every word. “Only fix the opening line — keep the rest exactly as is” asks Claude to touch one sentence.

If your email is 300 words and you ask for a full rewrite, you’ve just spent a few hundred output tokens (a word usually costs a bit more than one token) fixing something a 20-token surgical edit would have handled.

Be specific about what’s broken. Point to the exact paragraph. And add “no explanation needed, just the updated version”. Otherwise Claude will spend a paragraph telling you what it changed. And that paragraph costs tokens too.
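If you want a quick way to guess token counts from word counts, here is a tiny estimator. It leans on a common rule of thumb for English text, roughly three-quarters of a word per token. That ratio is an assumption, not Claude’s actual tokenizer, so treat the results as ballpark figures.

```python
# Assumption: about 0.75 English words per token (a rule of thumb,
# not the real tokenizer, which varies by model and by text).
WORDS_PER_TOKEN = 0.75


def estimate_tokens(word_count):
    """Ballpark token count for a piece of English text."""
    return round(word_count / WORDS_PER_TOKEN)


print(estimate_tokens(300))  # full rewrite of a 300-word email -> prints: 400
print(estimate_tokens(15))   # a 15-word replacement sentence   -> prints: 20
```

So a full rewrite of the 300-word email is on the order of 400 output tokens, while fixing one 15-word sentence is about 20.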

5. Summarize and restart when a conversation gets long.

Every conversation has a point where it becomes more expensive to continue than to start fresh.

When you feel a session getting bloated (say, past 20-25 messages), ask Claude to write a summary of everything important so far. Copy it. Open a new chat. Paste the summary as your first message. And then continue from there.

You lose nothing. You carry forward the context that matters, without the token cost of re-reading every exchange that got you there.
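Here is a rough sketch of what the restart saves. It assumes a chat that has already piled up 10,000 tokens of history, a 300-token summary, and 200 tokens per message and reply for the next ten exchanges. All of these are illustrative numbers, and the one-off cost of asking for the summary is left out to keep things simple.

```python
def cost_of_next_sends(starting_history, sends, tokens_per_message=200):
    """Tokens read over the next `sends` messages, given how much
    history is already in the chat window."""
    total = 0
    history = starting_history
    for _ in range(sends):
        history += tokens_per_message  # your new message
        total += history               # full history is read again
        history += tokens_per_message  # the reply is added
    return total


keep_going = cost_of_next_sends(starting_history=10_000, sends=10)
fresh_start = cost_of_next_sends(starting_history=300, sends=10)
print(keep_going, fresh_start)  # prints: 120000 23000
```

In this toy example, the next ten exchanges cost about a fifth as much after a summary-and-restart, because the old history no longer rides along on every send.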

6. Use Projects for files you keep going back to.

Every time you upload a document to a new chat, Claude processes it fresh. Upload the same 10-page brief to five different chats and you’ve paid to process it five times.

Claude’s Projects feature lets you upload a file once and reference it across multiple conversations without that repeated cost. It also uses smarter retrieval — pulling only the sections relevant to your question instead of loading the whole document every time.

If there’s a document you return to regularly, like a brand guide, a brief, a research report or something else — just move it into a Project. And then just use that project’s environment for new tasks.

Okay, quick poll: have you ever used Claude Projects? If not, I’ll write a detailed article on it next. It’s one of the most underused features of Claude!

7. Choose your model carefully.

Claude has three main models: Haiku (fast, light, inexpensive), Sonnet (balanced), and Opus (the heavy one, best for complex reasoning).

Most people leave it on Sonnet or Opus by default. And then wonder why their credits disappear quickly. Haiku handles a surprising amount: summarising text, fixing grammar, reformatting a document, brainstorming ideas. It’s not a downgrade for those tasks. It’s the right tool.

Save Opus for the work that genuinely needs it, like nuanced analysis, complex writing, and situations where the quality difference is actually visible. For everything else, Haiku is fast, capable, and uses up far less of your limit. Or at least stick to Sonnet.

8. Stop asking Claude to do things it’s not built for.

This is the most important one. We’re all lazy, and we want one tool that does everything.

Now, Claude is exceptional at reasoning, writing, and analysis. It cannot generate images. It’s not the fastest at real-time search.

If you spend six messages trying to get Claude to produce a visual — describing the colours, the layout, the style — you’re burning tokens on a task it was never designed to solve. Use Gemini. Same logic for live news or real-time data: there are better tools for that job.

I’m not saying Claude is limited. I’m saying every tool has a lane. Keeping Claude in its lane isn’t a workaround — it’s just using it well.

So what does this mean for you?

The usage limit is not arbitrary. It’s just math — and most of us are doing the math wrong without realising it.

Pick just three habits from this list. Start with: new chat per topic, edit instead of correcting, and ask for everything at once. Those three alone will feel like a noticeable difference by the end of the week.

The limit stops being a wall once you stop carrying the whole conversation on your back.

Let me know if this helped!

See you next time.

Cheers,

Ankur
