How to Run an Agentic Digital Agency

A practical, opinionated guide to how we actually run projects using agentic coding.
Lee Higgins
January 1, 2026

This is not theory. This is the distilled version of the workflow we evolved throughout 2025 after building real production systems with agents at the core.

Use this as an operating manual, not a manifesto.

1. First Principle: Start With Context, Not Code

Agents are only as good as the context you give them.

Never start with a task. Always start with a project.

A project is a long‑lived container for understanding, decisions, and memory.

2. Project Starting Point (Per Client)

Every client starts with a Claude Project.

This is not optional.

We do not work in ad-hoc chats. We do not start in an editor. We start by creating a project in the Claude / Anthropic ecosystem.

A Claude Project is a long-lived memory container.

It persists:

  • Context
  • Decisions
  • Research
  • Language

This project exists for the entire lifetime of the client relationship.

Initial Contents

The first iteration is intentionally high-level:

  • What the client does
  • What they specialise in
  • Their market
  • Their competitors
  • Known constraints

Accuracy is less important than alignment.

Client Feedback Loop

We share early summaries back with the client and explicitly ask:

“Is this how you see yourselves?”

Corrections at this stage prevent months of misalignment later.

3. Context Building (Living Knowledge Base)

The client-specific Claude Project evolves continuously.

Over time it accumulates:

  • Priorities
  • Working practices
  • Technical preferences
  • Things to avoid
  • Past decisions and why they were made

This becomes the entry point for every project.

When a new project starts, we do not re-explain the client.

We start from the client-specific Claude Project.

4. Research Phase (Before Any Specs)

We do deep research before defining solutions.

This phase is about widening the lens, not converging too early.

Tools

  • Claude Projects for long-form reasoning, synthesis, and maintaining continuity
  • ChatGPT Deep Research for broad, parallel exploration and cross-checking assumptions

We intentionally use both.

Claude excels at:

  • Holding long-lived context
  • Building coherent narratives
  • Carrying decisions forward

ChatGPT Deep Research is particularly strong at:

  • Rapidly surveying large problem spaces
  • Exploring adjacent domains
  • Surfacing alternatives and edge cases

Outputs

  • Market research
  • Competitor analysis
  • Product positioning
  • Technical feasibility notes

Research output is never throwaway.

It is summarised, curated, and fed back into the Claude Project so it becomes part of the permanent project context.

Only once this shared understanding exists do we move on to specs.

5. From Vision to Structure

Once the context is solid:

  1. Write a high‑level overview
  2. Break it into epics and features
  3. Capture behaviour, not implementation

Format

  • BDD‑style specs
  • Gherkin where possible

This forces clarity and creates a shared language for humans and agents.
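
For example, a spec for a hypothetical invoice-export feature might read:

```gherkin
Feature: Client invoice export
  Account managers export a client's invoices so finance can reconcile them.

  Scenario: Exporting invoices for a date range
    Given a client with 3 invoices dated March 2025
    When I request an export for March 2025
    Then I receive a CSV with 3 rows
    And each row contains the invoice number, date, and total
```

Agents can implement against this directly, and humans can review it without reading code.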

6. Git Is the Source of Truth

All execution happens in Git repositories.

Claude Projects hold thinking.

Git holds reality.

A key realisation for us in 2025 was that a repo is not just a place for code.

We use Git repositories, driven by Claude Code, as a universal work container for:

  • Planning
  • Specs
  • Documentation
  • Client deliverables
  • Internal process docs
  • Even presentations (as deployable Flutter sites/apps)

If it’s important enough to collaborate on, iterate on, or remember later, it belongs in a repo.

Why Claude Code (Even for Non-Code)

Claude Code’s superpower isn’t just code generation — it’s that it can:

  • Explore a filesystem
  • Read and write structured documents
  • Keep work organised inside a repo
  • Produce reviewable diffs

This makes it an extremely natural interface for planning and documentation.

Repository Structure

  • /docs – background, context, decisions
  • /planning – specs, epics, iterations

Plans evolve via pull requests just like code.
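
A sketch of what that looks like on disk (file names illustrative):

```
/docs
  context.md        # client background and constraints
  decisions.md      # what we decided, and why
/planning
  epic-01-onboarding/
    overview.md
    signup.feature  # Gherkin spec
/lib, /bin, ...     # implementation, when the project has code
```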

Core CLIs

We operate almost entirely via CLIs:

  • git – source of truth
  • gh (GitHub CLI) – issues, PRs, reviews

Agents are instructed to use these tools directly.

If it can’t be done via CLI, it doesn’t scale.

7. Starting Execution

Once specs exist:

  • Create a Git repository
  • Add planning and documentation first
  • Then begin implementation

We often run agents via:

  • Claude Code in a terminal
  • Persistent remote machines
  • tmux for long‑running sessions
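
A typical session setup, assuming the Claude Code CLI is installed as `claude` (session name illustrative):

```
tmux new-session -s acme-checkout   # persistent, named session on the remote box
claude                              # run Claude Code inside it
# detach with Ctrl-b d, reattach later:
tmux attach -t acme-checkout
```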

Agents work asynchronously.

Humans review via PRs.

8. Sub‑Agents

Rather than one monolithic agent, we use sub‑agents.

Typical roles:

  • Planner
  • Implementer
  • Reviewer
  • Refactorer
  • Researcher

These can run within a single session.

This reduces coordination overhead and keeps context tight.
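
In Claude Code, sub-agents can be defined as markdown files under `.claude/agents/`. A minimal reviewer sketch (the prompt body is ours and purely illustrative):

```markdown
---
name: reviewer
description: Reviews diffs for architecture and convention violations. Use after any implementation step.
tools: Read, Grep, Glob, Bash
---

You are a strict reviewer. Check every diff against our internal
package conventions and flag anything that re-implements existing
functionality instead of composing it.
```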

9. Agent Skills

As patterns repeat, we formalise them into skills.

Skills are reusable behaviours, not prompts.

Examples

  • Code review skill (via CodeRabbit CLI)
  • Refactoring skill
  • Architecture enforcement
  • Preferred library usage
  • Dependency injection patterns

Skills encode:

  • What is acceptable
  • What must be fixed
  • How decisions are made

This dramatically improves consistency.
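
In Claude Code, a skill is a `SKILL.md` file (for example under `.claude/skills/code-review/`). A sketch, with the body ours and illustrative:

```markdown
---
name: code-review
description: Review the current branch with the CodeRabbit CLI and triage findings before any PR is opened.
---

1. Run a CodeRabbit review against the working tree.
2. Classify each finding: must-fix, should-fix, or ignore (with a reason).
3. Apply must-fix changes, then re-run until the review is clean.
```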

10. Guardrails via Internal Packages

Agents are non‑deterministic.

Left unconstrained, they will:

  • Re‑implement the same thing differently
  • Increase long‑term complexity

Our Solution

Build and maintain small internal packages.

Examples:

  • Dependency injection
  • Data repositories
  • Styling / UI systems
  • Project bootstrapping

Agents are explicitly instructed to use these packages.

Most projects become composition, not reinvention.
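
A minimal sketch of the pattern. The `Container` below is a simplified stand-in for a hypothetical internal DI package; real projects would import it rather than define it:

```dart
// Simplified stand-in for a hypothetical internal DI package.
class Container {
  final _factories = <Type, Object Function()>{};

  void register<T extends Object>(T Function() factory) =>
      _factories[T] = factory;

  T resolve<T extends Object>() => _factories[T]!() as T;
}

// Interfaces and implementations come from internal packages too.
abstract class InvoiceRepository {
  List<String> fetchAll();
}

class InMemoryInvoiceRepository implements InvoiceRepository {
  @override
  List<String> fetchAll() => ['INV-001', 'INV-002'];
}

void main() {
  // A client project composes pre-built pieces instead of reinventing them.
  final container = Container()
    ..register<InvoiceRepository>(InMemoryInvoiceRepository.new);

  print(container.resolve<InvoiceRepository>().fetchAll());
}
```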

11. Unified Language — Pure Dart (No Bash)

We made a deliberate decision to standardise on one language: Dart.

Everything is Dart.

  • Backend services
  • Frontend apps (Flutter)
  • Tooling and CLIs
  • Deployment helpers
  • Automation scripts

No Bash.

No JSX.

No CSS.

No HTML.

No mixed template languages.

No competing mental models.

Dart’s Tooling Is a Force Multiplier

Dart isn’t just a pleasant language — it has exceptionally strong tooling, and that matters enormously for agentic work.

Out of the box, Dart gives us:

  • A world-class analyser that surfaces errors early and precisely
  • A deterministic formatter that removes style debates entirely
  • The ability to run Dart files directly, like scripts, during development
  • The ability to AOT-compile the same code into fast, static binaries for any target platform, with minimal dependencies

This creates a rare combination:

  • Script-like ergonomics during development
  • Production-grade performance and predictability in deployment
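
The same file serves both modes (paths and flags illustrative):

```
dart run tool/deploy.dart --env staging      # script-like during development
dart compile exe tool/deploy.dart -o deploy  # AOT-compiled static binary
./deploy --env staging                       # same code, production-grade
```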

Agents thrive in this environment because:

  • Errors are explicit, not implicit
  • Feedback loops are tight
  • Formatting and structure are enforced automatically

Dart for Tooling Changes Everything

Using Dart for scripting and tooling is not just a preference — it’s a structural advantage.

Because tooling is written in the same language as the application:

  • Enums can be shared
  • Constants can be shared
  • Configuration models can be shared
  • Validation logic can be shared

There is a single source of truth.

Deployment tools, CLIs, background workers, and applications all agree on:

  • Environment names
  • Feature flags
  • Service identifiers
  • API versions

This removes an entire class of bugs caused by:

  • Duplicated config
  • Stringly-typed environments
  • Drift between scripts and runtime code
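
A sketch of the idea, with names illustrative. Because the deploy CLI and the app import the same enum, an environment that parses in one parses identically in the other:

```dart
// shared/environment.dart: imported by both the deploy tool and the app.
enum Environment {
  dev('https://dev.example.com'),
  staging('https://staging.example.com'),
  prod('https://example.com');

  const Environment(this.baseUrl);
  final String baseUrl;
}

void main(List<String> args) {
  // One parser for scripts and runtime alike; unknown names throw,
  // so drift surfaces immediately instead of silently.
  final env = Environment.values.byName(args.first);
  print('Deploying to ${env.name} (${env.baseUrl})');
}
```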

Why This Matters for Agentic Work

Agents perform best when:

  • Logic is explicit
  • Types are enforced
  • Errors are surfaced early
  • Behaviour is predictable

Dart gives us:

  • A single syntax
  • A single type system
  • A single formatter
  • A single analyser
  • Deterministic execution

Benefits

  • Dramatically reduced context switching
  • Fewer environment-specific failures
  • Better agent reliability
  • Easier human review
  • Shared code between apps, servers, and tooling

Agents produce better work when the solution space is constrained.

Humans do too.

12. Integrations — CLIs, Not MCP

We deliberately do not use MCP.

This is an explicit decision.

Why No MCP

MCP-style integrations attempt to solve agent tooling by pushing more data and affordances into the model context.

In practice, this:

  • Bloats context
  • Obscures intent
  • Makes behaviour harder to reason about
  • Increases non-determinism

Claude already runs a clear loop: it searches dynamically for the context it needs and digs deeper when stuck, for example by reading a CLI's own documentation.

What We Use Instead

We rely on:

  • Explicit CLI commands
  • Deterministic inputs and outputs
  • Agentic execution loops

Core tools:

  • git
  • gh (GitHub CLI)
  • linear (Linear CLI)
  • coderabbit (CodeRabbit CLI)

These tools:

  • Are scriptable
  • Produce predictable output
  • Fail loudly
  • Are easy for agents to reason about

The Agentic Loop

A typical loop looks like:

  1. Read state (issues, code, docs)
  2. Make a plan
  3. Execute via CLI
  4. Inspect results
  5. Iterate
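
As a sketch of step 1, a Dart helper (or an agent calling `gh` directly) can read open P1 issues as JSON; the label is illustrative, the `gh` flags are real:

```dart
import 'dart:convert';
import 'dart:io';

Future<void> main() async {
  // Read state via an explicit, inspectable CLI call.
  // A human can run the identical command by hand.
  final result = await Process.run('gh', [
    'issue', 'list',
    '--label', 'P1',
    '--state', 'open',
    '--json', 'number,title',
  ]);
  if (result.exitCode != 0) {
    stderr.write(result.stderr); // fail loudly, surface the CLI error verbatim
    exit(result.exitCode);
  }
  final issues = jsonDecode(result.stdout as String) as List<dynamic>;
  for (final issue in issues) {
    print('#${issue['number']}: ${issue['title']}');
  }
}
```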

Nothing is hidden.

Nothing is implicit.

Humans can inspect every step.

Human-Compatible by Default

CLIs are not just agent-friendly — they are human-compatible.

That matters.

When agents use the same interfaces humans use:

  • Humans can reproduce behaviour
  • Humans can debug failures
  • Humans can step in without translation layers

There is no hidden protocol.

There is no invisible abstraction.

What the agent does is exactly what a human would do.

MCP-style integrations, by contrast, are not human-friendly.

They:

  • Hide execution details
  • Make it difficult to reproduce behaviour manually
  • Create a gap between human understanding and agent action

This gap is dangerous.

Agentic systems must be:

  • Inspectable
  • Reproducible
  • Debuggable by humans

CLIs give us that for free.

13. Bug Fixing Workflow

Bug fixing is driven by issues, not conversations.

Flow

  1. Issue is created (GitHub preferred, Linear synced)
  2. Issue is labelled (P1, P2, etc.)
  3. Agent is instructed to fix the issue
  4. Agent uses gh to:
    • Create branch
    • Implement fix
    • Open PR
  5. CodeRabbit runs automated review
  6. Review feedback is fed back to the agent
  7. Agent fixes issues and updates PR
  8. Human reviews and merges
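
The CLI calls behind steps 3–4 and 8 look roughly like this (issue number and branch name are illustrative):

```
gh issue view 123                   # read the bug report
git switch -c fix/123-null-crash    # branch for the fix
# ...implement, commit...
gh pr create --fill                 # open the PR from the branch
gh pr merge --squash                # the human merges after review
```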

We deliberately avoid auto-merging.

Understanding the system matters.

14. Code Reviews

Every PR is reviewed.

Automated

  • Local review tools
  • AI‑assisted review comments

Human

  • Architectural sanity
  • Long‑term maintainability

Agents fix review feedback themselves when possible.

15. Foreground vs Background Work

We explicitly separate work into foreground and background modes. While a foreground agent runs, there is often a waiting period, which we fill with background tasks.

Foreground work is:

  • High priority
  • Actively supervised
  • Tight feedback loops

Background work is:

  • Lower priority
  • Long-running
  • Reviewed at PR time

This allows parallel progress without overwhelming any single human.

16. Reducing Human Context Switching

One of the biggest productivity wins we saw in 2025 had nothing to do with raw speed.

It came from reducing human context switching.

We achieved this by making every project look fundamentally the same.

What “The Same” Means

Across all client projects:

  • Same language (Dart)
  • Same repo structure
  • Same internal packages
  • Same tooling
  • Same workflows
  • Same agent skills

Once you’ve seen one project, you’ve effectively seen them all.

Why This Matters

Context switching is the silent productivity killer.

When every project has different:

  • Languages
  • Frameworks
  • Build systems
  • Conventions

…humans burn enormous cognitive energy just re-orienting.

By enforcing uniformity:

  • Humans spend less time remembering how things work
  • Agents spend less time rediscovering patterns
  • Reviews get faster
  • Quality becomes easier to judge

The Compound Effect

This uniformity enables something that would otherwise be impossible:

  • One primary focus task
  • One or more secondary tasks progressing in parallel
  • Sometimes across different projects on the same day

Because everything looks the same, switching costs collapse.

This is what makes an agentic agency sustainable.

17. The Meta‑Rule

If something keeps repeating:

  • Encode it
  • Package it
  • Automate it

But only after you understand it.

18. Self & Company-Level Projects

Clients are not the only thing worth giving context to.

We maintain Claude Projects for ourselves.

Company Project

This project contains:

  • Company structure
  • Rates and pricing
  • Working practices
  • Contract templates
  • Past decisions
  • Legal language

This allows the agent to:

  • Act as a rubber duck
  • Help think through decisions
  • Draft SOWs and contracts
  • Prepare internal documents
  • Maintain consistency across the business

This is not a chatbot.

It is a contextual operating system for the company.

19. What Humans Actually Do

Humans are not replaced.

Their role shifts.

Humans provide:

  • Direction
  • Judgment
  • Taste
  • Accountability
  • Client communication

They also perform a critical operational role:

Humans Unblock Agents

Agents are extremely capable, but they can still get stuck.

Common reasons include:

  • Ambiguous requirements
  • Conflicting constraints
  • Missing context
  • Decisions that require taste or business judgment

When this happens, humans step in to:

  • Clarify intent
  • Make a decision
  • Adjust constraints
  • Add missing context

Once unblocked, agents continue execution autonomously.

This creates a healthy loop:

  • Agents do the work
  • Humans remove blockers
  • Progress resumes

Humans are not micromanaging.

They are enabling flow.

The Core Rules

If you only remember one section, make it this.

This is the compressed version of how an agentic digital agency actually works.

  • Start with Claude Projects, not tasks
    Long‑lived context beats clever prompts.
  • One project per client, for the lifetime of the relationship
    Memory compounds.
  • Research first, always
    Use Claude for depth, ChatGPT Deep Research for breadth. Collapse everything back into the project.
  • Git + Claude Code are inseparable
    We never use Git without Claude Code, and never use Claude Code without Git.
    Git provides versioned reality. Claude Code provides structured reasoning over that reality.
    Planning, docs, specs, deliverables — if it matters, it’s versioned.
    Claude Code is used for far more than code: plans, documents, presentations, internal ops, and client deliverables, all expressed as files and reviewed as diffs.
  • Pure Dart, everywhere
    No JSX. No CSS. No Bash. One syntax, one mental model.
  • Tooling in Dart is a superpower
    Shared enums, constants, config, validation across apps and tools.
  • CLIs over integrations
    git, gh, linear, coderabbit — explicit beats magical.
  • No MCP
    Context bloat hurts determinism. Clear agentic loops win.
  • Human‑compatible by default
    If a human can’t reproduce it, the system is broken.
  • Guardrails > intelligence
    Humans ultimately provide the intelligence: judgment, taste, intent.
    Agents provide relentless execution: consistency, persistence, follow-through.
    Internal packages, skills, and constraints create quality.
  • Agents execute, humans unblock
    Humans resolve ambiguity and judgment calls, then get out of the way.
  • Make every project look the same
    Uniformity collapses context switching and enables parallel work.

That’s the system.