Hiring an AI Consultant: The 11 Questions That Filter Out the Slop

A blunt buyer's filter for hiring an AI consultant: where the money goes, what real implementation looks like, and which answers expose vapor.

Ralph Duin · May 29, 2026 · 8 min read

Most teams do not waste AI consulting budget because the consultant is stupid. They waste it because the contract rewards polish, not shipment.

That is the surface. The interesting part is where the money actually goes. Slides, workshops, roadmap language, stakeholder interviews, and a final presentation can all feel like progress. Then the consultant leaves. Nobody owns the repo. Nobody owns the evals. Nobody owns the failure modes. Nothing runs on Monday.

If you want to hire an AI consultant, ask these 11 questions before you send the engagement letter. They separate strategy-deck consulting from implementation work that leaves behind code, tests, agents, runbooks, and a system your team can keep operating.

Receipts, not claims.

Deck or running system

An AI strategy consultant sells direction. An AI implementation consultant turns a specific workflow into a working system. Both can be useful. They are not the same purchase.

A strategy engagement usually ends with a deck: use cases, risk register, vendor shortlist, governance model, and roadmap. A real AI consulting implementation should end with a running asset: repo access, deployed workflow, eval suite, monitoring, rollback path, hand-off docs, and named owners.

The buying mistake is paying implementation money for strategy outputs. If the statement of work says “AI roadmap” but never names a repo, integration, evaluation method, or production owner, you are probably buying a very expensive opinion.

Where the money goes

For a strategy-heavy project, money goes into discovery calls, stakeholder alignment, slide production, partner meetings, and executive readouts. That can help when leadership has no shared language or risk appetite.

For an implementation-heavy project, money goes into workflow teardown, data access, integrations, evals, CI, security review, deployment, and operator training. That is the work that converts budget into shipped surface area.

The contract should make this visible. I want line items for code, evals, agents, infrastructure, documentation, and hand-off. If those are missing, the spend is drifting toward theater.

The 11-question buyer filter

1. What will exist at the end that runs without you?

A good answer names the system, repo, environment, workflow, and owner. It sounds boring because real implementation is specific.

A red-flag answer says the team will have “clarity,” “alignment,” or a “future-ready roadmap.” Those words only matter when attached to a shipped next step.

2. Which business workflow are you changing first?

Good consultants narrow the first bet. They ask where work repeats, where judgment matters, and where data is accessible. Business judgment picks the bet. The agent swarm is the engine.

Bad consultants start with generic demos. They show a chatbot, copilot, or search box before they understand the workflow. That is how AI gets pasted onto a process instead of changing the process.

3. What is your $/outcome ratio?

This is the scoring framework I use. Take the total project cost and divide it by the measurable outcome created. Not aspiration. Outcome.

Examples: cost per manual hour removed, qualified lead enriched, support ticket resolved, compliance review completed, or deployable workflow shipped. Every number below is measured, not aspirational.

A good answer defines the denominator before work starts. A red-flag answer says ROI will become clear after the roadmap phase.
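
The ratio is simple enough to write down before the statement of work is signed. A minimal sketch, assuming the outcome is a countable unit; the function name and the dollar and hour figures are hypothetical, chosen only to show the arithmetic:

```python
def cost_per_outcome(total_cost: float, outcomes: float) -> float:
    """Return total project cost divided by the count of measured outcomes."""
    if outcomes <= 0:
        # No measured outcome yet means the ratio is undefined, not "pending".
        raise ValueError("outcomes must be a positive, measured count")
    return total_cost / outcomes


# Hypothetical engagement: $60,000 total, 1,500 manual hours removed.
print(cost_per_outcome(60_000, 1_500))  # 40.0 dollars per manual hour removed
```

Raising an error on a zero denominator is deliberate: if nothing measurable shipped, there is no ratio to report.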

4. What do you count as implementation?

Implementation means the system touches reality. It reads from the right source, writes to the right destination, handles bad inputs, logs enough to debug, and has a clear owner.

That is why generative AI implementation is not the same as prompt advice. A good answer includes integration, evals, deployment, security, and hand-off. A red-flag answer treats a prototype video as the deliverable.

5. Where do evals live?

If the consultant cannot explain evals, they are not ready to own AI implementation. Evals stop teams from arguing from vibes. They define what “good” means before the model starts producing confident nonsense.
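
As a rough illustration of "defining good before the model runs", an eval suite can be as small as a list of inputs with pass/fail checks and an agreed threshold. This is a sketch under stated assumptions, not a recommended framework: `run_evals`, `toy_system`, and the two cases are hypothetical stand-ins for a real workflow and its acceptance criteria.

```python
from typing import Callable

# Each case pairs an input with a check that decides whether the output is acceptable.
EvalCase = tuple[str, Callable[[str], bool]]


def run_evals(system: Callable[[str], str], cases: list[EvalCase], threshold: float) -> bool:
    """Return True only if the share of passing cases meets the threshold."""
    if not cases:
        return False  # an empty suite proves nothing
    passed = sum(1 for prompt, check in cases if check(system(prompt)))
    return passed / len(cases) >= threshold


def toy_system(prompt: str) -> str:
    """Hypothetical stand-in for a model-backed workflow."""
    return prompt.upper()


cases: list[EvalCase] = [
    ("refund", lambda out: out == "REFUND"),
    ("cancel", lambda out: "CANCEL" in out),
]
print(run_evals(toy_system, cases, threshold=1.0))  # True
```

The useful part is that the cases and the threshold live in the repo, so "good" is a file someone can read and change, not a feeling.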

In my own work, branch protection, ship gates, evals, and audit are the governance baseline. CI Gate is a single fail-closed aggregate required check. k2k-merge-keeper and the Mergify merge queue use a 5-minute settling window.
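
"Fail-closed" means a missing or ambiguous result counts as a failure. The sketch below shows that idea in isolation; it is not the CI Gate implementation described above, and the check names and status strings are hypothetical.

```python
def ci_gate(required: list[str], results: dict[str, str]) -> bool:
    """Fail closed: pass only when every required check reported 'success'."""
    if not required:
        return False  # nothing required means nothing was verified
    # A check that never reported, or reported anything else, blocks the merge.
    return all(results.get(name) == "success" for name in required)


required = ["lint", "unit-tests", "evals"]  # hypothetical check names

print(ci_gate(required, {"lint": "success", "unit-tests": "success", "evals": "success"}))  # True
print(ci_gate(required, {"lint": "success", "unit-tests": "success"}))  # False, evals never reported
print(ci_gate(required, {"lint": "success", "unit-tests": "skipped", "evals": "success"}))  # False
```

The design choice is that the gate asks "did everything succeed?" instead of "did anything fail?", so a skipped or crashed job cannot slip through as a pass.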

A red-flag answer says the team will “review outputs manually” forever. That is not an operating model. That is a support burden wearing a strategy hat.

6. Who owns the repo?

This question cuts through a lot of slop.

A good answer says: your accounts, your code, runbooks included. No lock-in. The consultant may help operate it, but the client should not be trapped inside a black box.

My default stack is Next.js, Supabase, Fly, Cloudflare, Infisical, MCP, and a Claude Code agent fleet. Yours may differ. The point is that there is a stack, and someone can inspect it.

A red-flag answer avoids code ownership, credentials, deployment access, or hand-off details until after signature.

7. Can you show named products, real numbers, real dates?

If I can't defend it in a sales call, it doesn't go on the page.

I can point to AppHandoff, the agent-orchestration MCP server that finishes the Lovable 80%. I can point to infra-gha-runners-fly, the TeamK2K self-hosted GitHub Actions runner fleet on Fly.io. I can point to ~30 concurrent AI coding agents coordinated by one operator.

On the IBF repo, the measured average is 55 merged PRs/day, 66/day over the last 7 days, and a peak of 111 on 2026-05-21, with median queue-to-merge at 5 minutes. Those numbers are not a promise. They prove the operating system exists.

A red-flag answer hides behind unnamed “enterprise transformations” with no artifact trail.

8. How do you prevent the hand-off cliff?

The common failure pattern is simple. Strategy consultants arrive, interview everyone, produce a polished target state, and leave. The internal team now has a deck, a backlog, and no builder who understands the implied architecture.

A good engagement avoids that cliff by shipping during the engagement. The last week should transfer an already-running system: runbooks, environment ownership, access review, known failure modes, and an owner walkthrough.

9. When is strategy actually worth it?

Strategy is worth it when the organization lacks a shared operating model, has regulatory exposure, has many competing AI bets, or needs executive alignment before touching production workflows.

That is the useful lane for AI strategy consulting. It should compress uncertainty. It should not become a hiding place for avoiding implementation.

The red flag is strategy that never graduates into a system. If the consultant cannot tell you what happens after the deck, they are selling the preface as the book.

10. How much does an AI implementation consultant cost?

The honest answer: it depends on scope, seniority, and whether you are buying advice, build work, or an operator who can own both.

AI consultant salary and pricing data is noisy because the title covers analysts, prompt trainers, implementation engineers, architects, and fractional leaders. For buyers, the better question is cost per shipped outcome.

A lower-rate consultant who burns weeks on discovery can be more expensive than a senior operator who ships quickly. A higher rate is defensible when it buys architecture, implementation, evals, and transfer.
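
The rate comparison is easier to see with numbers. Every figure below is hypothetical and exists only to show how a lower weekly rate can still lose on cost per shipped outcome; the function name is an assumption too.

```python
def engagement_cost_per_outcome(weekly_rate: float, weeks: int, shipped: int) -> float:
    """Total engagement cost divided by the number of shipped outcomes."""
    total = weekly_rate * weeks
    if shipped == 0:
        return float("inf")  # money spent, nothing shipped
    return total / shipped


# Hypothetical: lower rate, long discovery phase, one workflow shipped.
lower_rate = engagement_cost_per_outcome(weekly_rate=4_000, weeks=12, shipped=1)
# Hypothetical: higher rate, short engagement, two workflows shipped.
senior = engagement_cost_per_outcome(weekly_rate=10_000, weeks=4, shipped=2)

print(lower_rate)  # 48000.0 per shipped workflow
print(senior)      # 20000.0 per shipped workflow
```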

11. How would someone become an AI implementation consultant?

This answer reveals whether the person respects the craft.

A good answer says: learn software delivery, workflow design, integrations, data handling, model behavior, evals, security, and change management. Then ship systems in public or inside real companies. Implementation consulting is not “I use ChatGPT a lot.”

A red-flag answer over-indexes on prompt libraries, certificates, and trend language. Useful consultants can explain how the thing gets built, tested, deployed, and maintained.

Three case patterns I see

Pattern 1: the board wants AI, but nobody owns the workflow

This is the strategy-valid case. The company needs a short, sharp discovery pass to pick the first workflow, name the risk boundaries, and define the operating model. Keep it short. Tie it to a build decision.

Pattern 2: the team has demos everywhere, but nothing in production

This is an implementation case. The missing layer is usually infrastructure, evals, access control, and ownership. AI agent development only matters if the agent is connected to the actual process and governed like software.

Pattern 3: the product is half-built in a visual tool

Inspired by frustration. I mean that literally. I built AppHandoff because Lovable-style speed often dies at the last 20%. The prototype exists, but the production path is unclear.

That is where a Senior AI Systems Architect should turn the prototype into a real repo, real deployment path, and real maintenance model. Two repos, one product.

How to structure the contract so implementation happens

Make the contract hostile to vapor.

Tie payment milestones to concrete artifacts: workflow map, repo initialized, first integration working, eval suite passing, deployment live, runbook accepted, owner trained. Name the accounts, environments, and hand-off assets.
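
One way to picture an artifact-tied payment schedule is as a small table of milestones, each releasing a share of the contract value only after the client accepts the artifact. The artifact names come from the list above; the percentage split, the `Milestone` type, and `amount_payable` are hypothetical illustrations, not contract advice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Milestone:
    artifact: str
    percent: int  # share of contract value released when the artifact is accepted


# Hypothetical split across the artifacts named in the contract.
MILESTONES = [
    Milestone("workflow map", 10),
    Milestone("repo initialized", 10),
    Milestone("first integration working", 20),
    Milestone("eval suite passing", 20),
    Milestone("deployment live", 20),
    Milestone("runbook accepted", 10),
    Milestone("owner trained", 10),
]


def amount_payable(contract_value: float, accepted: set[str]) -> float:
    """Sum the shares of accepted artifacts; unaccepted work releases nothing."""
    percent = sum(m.percent for m in MILESTONES if m.artifact in accepted)
    return contract_value * percent / 100


print(amount_payable(100_000, {"workflow map", "repo initialized"}))  # 20000.0
```

Note what is absent from the table: no line pays out for a deck, a workshop, or a status update.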

Require access and ownership language upfront. The client should own the source code, cloud accounts, secrets vault, documentation, and runbooks unless there is a specific reason not to.

Add a weekly demo rule. Not a slide update. A working-system demo. Broken is fine. Invisible is not.

And require a kill switch. If the work cannot produce a running artifact by the first checkpoint, you should be able to stop before the engagement turns into a sunk-cost theater project.

The operator test

Hiring an AI consultant is not about finding the person with the cleanest trend vocabulary. It is about finding someone who can turn business judgment into a system that runs.

Ask for receipts. Ask where the code lives. Ask how evals work. Ask what happens when the consultant leaves. Ask how the contract forces implementation.

If the answer keeps returning to decks, you are buying decks.

If the answer returns to repos, evals, agents, ship gates, ownership, and runbooks, you might have an operator.

For the implementation-first version, talk to us.