
Module 3: What is an Agent?

A chatbot answers questions. An agent gets things done. Here's the difference, and why it matters.

This is Module 3 of a 12-part curriculum: Build Software Products with AI — From First Principles to Production Pipeline.


The word “agent” is everywhere right now and mostly meaningless in common usage. People call anything from a chatbot with a slightly longer prompt to a fully autonomous system an “agent.” That ambiguity makes it hard to reason about what you’re actually building.

Let’s fix the definition, then build up from there.


The Core Distinction

A chatbot processes a message and returns a response. One input, one output. The loop begins and ends in a single exchange. If a task takes multiple steps, the user does the work of chaining the exchanges together.

An agent does something fundamentally different. It receives a goal, breaks it into steps, takes actions to complete those steps (including using tools and external systems), observes the results, and continues until the goal is achieved — or until it decides it can’t proceed.

The difference is autonomy and action. An agent isn’t just generating text in response to your input. It’s executing a plan.


The Agent Loop

Every agent — regardless of the framework or platform — runs on some version of this loop:

1. Perceive — receive input (user message, event, scheduled trigger)
2. Think — reason about what to do next
3. Act — call a tool, write a file, send a message, spawn a sub-agent
4. Observe — receive the result of the action
5. Repeat — use the observation to inform the next cycle
6. Terminate — when the goal is achieved or can't be achieved

This is sometimes called the ReAct loop (Reason + Act). It’s the underlying structure of every serious agent implementation.

What makes this powerful: the agent can run through this loop many times in a single “turn.” Ask an agent to research a topic and write a report, and it might: search the web (act), read the results (observe), identify gaps (think), search again with refined queries (act), synthesise what it found (think), write the draft (act), check it against the original brief (think), revise (act) — all before returning anything to you.

The user experience is: you gave it a task, it completed the task. The underlying reality is: many cycles of reasoning and action.

THE AGENT LOOP — what runs behind every agent response
       👁  PERCEIVE    input arrives: message / event / scheduled trigger
               ↓
       🧠  THINK      what is the best next action?
               ↓
       ⚡  ACT        call tool  /  write file  /  spawn sub-agent  /  send message
               ↓
       👁  OBSERVE    what did the action return?
               ↓
       ↺  REPEAT?     goal achieved?  →  no   →  back to THINK
                                              →  yes  →  respond to user

This loop may run dozens of times before anything is returned to you.
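
In code, the loop is small. Here is a minimal sketch in Python — `llm_decide`, the step shape, and the tool registry are placeholders for whatever model client and tools you actually use, not any framework's real API:

MAX_ITERATIONS = 25  # guard against runaway loops (more on those below)

def run_agent(goal, tools, llm_decide):
    history = [{"role": "user", "content": goal}]              # perceive
    for _ in range(MAX_ITERATIONS):
        step = llm_decide(history)                             # think
        if step["type"] == "final_answer":                     # terminate
            return step["content"]
        result = tools[step["tool_name"]](**step["arguments"])    # act
        history.append({"role": "tool", "content": str(result)})  # observe
    return "Stopped: hit the iteration limit before completing the goal."

Everything a framework adds — streaming, retries, sub-agents, memory — hangs off this skeleton.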


Tools and Function Calling

Tools are what give an agent the ability to act in the world beyond generating text.

A tool is a function the model can choose to call. You define the function, describe what it does in natural language, and specify its parameters. The model decides when and how to call it based on the task at hand.

A simple tool definition (in JSON schema format):

{
  "name": "read_file",
  "description": "Read the contents of a file at the given path",
  "parameters": {
    "type": "object",
    "properties": {
      "path": {
        "type": "string",
        "description": "The absolute path to the file"
      }
    },
    "required": ["path"]
  }
}

When the model decides to call this tool, it returns a structured call with the parameter values filled in. Your code receives that call, executes the actual file read, and feeds the result back into the context window. The model then continues reasoning with the file contents in context.
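
The dispatch step itself is mundane. A hedged sketch — the `tool_call` shape is illustrative, and real provider APIs each use their own field names:

import json

TOOL_IMPLEMENTATIONS = {
    # Maps the name the model sees to your actual implementation.
    "read_file": lambda path: open(path).read(),
}

def handle_tool_call(tool_call):
    # Assumed shape: {"name": "read_file", "arguments": {"path": "..."}}
    fn = TOOL_IMPLEMENTATIONS[tool_call["name"]]
    result = fn(**tool_call["arguments"])
    # Serialise the result and append it to the context window as a
    # tool message, so the model keeps reasoning with it in view.
    return json.dumps({"tool": tool_call["name"], "result": result})

Every provider wraps this differently, but the contract is the same: structured call out, serialised result back in.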

Common categories of tools:

  • File I/O — read, write, search files
  • Shell execution — run commands, scripts, tests
  • Web access — fetch URLs, search the web
  • APIs — interact with external services (Slack, GitHub, Stripe, databases)
  • Spawning — create sub-agents or spawn new sessions

The tool set defines what the agent can do. An agent with no tools is just a chatbot with more complex prompting. An agent with a rich tool set can interact with virtually any system.


Memory and Goals

Two more things distinguish a real agent from a chatbot: memory and goals.

Memory is the mechanism by which an agent maintains context across actions and — critically — across sessions. We’ll go deep on this in Module 4, but the core idea is: without memory, every agent invocation starts from scratch. With memory, an agent can maintain continuity, learn from previous runs, and build up long-term context.

Goals give the agent direction. Rather than responding to a single message, a properly designed agent has a clear objective it’s working toward. That objective shapes every decision in the loop: what action to take, what information to seek, when the task is complete.

In practice, goals are encoded in the system prompt and refined through the conversation. A well-specified goal is the difference between an agent that meanders and one that executes.
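
To illustrate — the wording here is mine, adapted from the example in the next section, not lifted from a real system prompt:

Vague goal:     "Help me stay on top of my GitHub repos."

Specific goal:  "Review the open PRs in nikovijay/whorang. For each,
                 check review status and CI. Summarise anything that
                 needs my attention. Do not merge, close, or comment
                 on any PR."

The second version tells the agent what done looks like and what it must not do — both of which matter once it's acting autonomously.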


A Concrete Example

Let me make this concrete with an example from the setup I run daily.

I have an agent called Ultron — my main assistant agent. Here’s a simplified version of what happens when I tell it: “Review the open PRs in the whorang repo and summarise anything that needs my attention.”

Goal: Summarise open PRs needing attention

Cycle 1:
- Think: I need to list the open PRs in the whorang repo
- Act: call exec("gh pr list --repo nikovijay/whorang --state open")
- Observe: returns list of 3 open PRs with titles and ages

Cycle 2:
- Think: PR #47 is 4 days old — I should check the review status and diff
- Act: call exec("gh pr view 47 --json reviews,files,body")
- Observe: returns PR details — no reviews, 3 files changed

Cycle 3:
- Think: PR #51 has a failing CI check — I should check the log
- Act: call exec("gh run list --branch feat/filter-fix | head -5")
- Observe: returns failed test run

Cycle 4:
- Think: I have enough context to summarise
- Act: return response to user with summary of what needs attention

From my perspective: I asked a question. It came back with a useful summary. Underneath: four cycles of reasoning and tool use, each building on the last.

This is what separates an agent from a chatbot. The chatbot would have told me how to check PRs. The agent just did it.


Agentic Failure Modes

Building agents well requires understanding how they fail. The main failure modes:

Goal drift. The agent completes a plausible-looking task that isn’t actually what you asked for. Mitigation: precise goal specification, output validation.

Runaway loops. The agent gets stuck in a loop — taking actions, observing results that don’t satisfy the goal, taking more actions, not knowing when to stop. Mitigation: explicit termination conditions, maximum iteration limits.

Tool misuse. The agent calls the wrong tool, or calls the right tool with incorrect parameters, or calls a destructive tool when it shouldn’t. Mitigation: clear tool descriptions, approval gates for irreversible actions.

Context window exhaustion. Long agent runs fill up the context window, causing the agent to lose track of earlier context. Mitigation: periodic summarisation, context management strategies.

Hallucinated tool calls. The model invents tool calls that don’t exist, or invents parameter values. Mitigation: strict tool schemas, validation before execution.

None of these are showstoppers. All of them are manageable with good design. But you have to build with them in mind.
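
Two of these mitigations are cheap enough to sketch. Assuming the `read_file` schema from earlier (the `exec` entry and the approval stub are hypothetical), a validation gate that runs before anything executes:

KNOWN_TOOLS = {
    "read_file": {"required": ["path"]},     # schema shown earlier
    "exec":      {"required": ["command"]},  # hypothetical
}

DESTRUCTIVE_TOOLS = {"exec"}  # anything irreversible goes here

def user_approves(tool_call):
    # Approval gate for irreversible actions. Stubbed as a console
    # prompt; wire it to whatever confirmation flow you actually use.
    return input(f"Allow {tool_call['name']}? [y/N] ").strip().lower() == "y"

def validate_tool_call(tool_call):
    # Returns an error string to feed back to the model, or None if OK.
    schema = KNOWN_TOOLS.get(tool_call["name"])
    if schema is None:
        return f"Unknown tool: {tool_call['name']}"  # hallucinated call
    missing = [p for p in schema["required"]
               if p not in tool_call.get("arguments", {})]
    if missing:
        return f"Missing required parameters: {missing}"
    if tool_call["name"] in DESTRUCTIVE_TOOLS and not user_approves(tool_call):
        return "Action declined by user."
    return None

The useful trick is returning the error to the model rather than raising: a hallucinated call becomes one more observation, and the loop self-corrects.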


Agents Are Infrastructure, Not Magic

The mental model I want you to have: an agent is a piece of software infrastructure that has an LLM as its reasoning core.

The LLM decides what to do. The surrounding infrastructure — tool execution, memory management, session handling, error recovery — makes it reliable and useful in production.

Most “AI agent” failures are infrastructure failures, not model failures. The model reasoned correctly; the surrounding system didn’t execute the action properly, didn’t feed back the result, didn’t handle the error case, didn’t persist the state. Build the infrastructure right and the model performs remarkably well. Skimp on it and you’ll have a fragile system that works in demos and breaks in production.
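
One concrete instance, as a sketch: what the infrastructure should do when a tool call throws. (The message shape matches the loop sketch above; both are illustrative.)

def execute_with_recovery(tool_call, tools):
    # A failed action must be observed by the model — not swallowed,
    # and not allowed to kill the whole run.
    try:
        result = tools[tool_call["name"]](**tool_call["arguments"])
        return {"role": "tool", "content": str(result)}
    except Exception as exc:
        # Feeding the error back lets the model retry with different
        # parameters, or report honestly that it can't proceed.
        return {"role": "tool", "content": f"ERROR: {exc!r}"}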


What’s Next

You now understand what an agent is: a reasoning loop that uses tools to take actions in pursuit of a goal. The next question is: how does it maintain continuity over time? How does it remember what happened yesterday, last week, across hundreds of sessions?

That’s the memory problem. Module 4 covers it in full.


Further Reading


“This is my Mission Control: A Squad of 10 autonomous @openclaw agents… They create work on their own. They claim tasks on their own. They talk with each other.” — @pbteja1998

“We’re opening up a new job role for Firecrawl. This time humans aren’t allowed to apply. AI Agents only.” — @nickscamara_
