A field study published today gives empirical grounding to something Signal4i has been tracking across 287 signals and 15 categories. The binding constraint on AI transformation is not the model. It is the knowledge distance between the human and the task.
Researchers embedded at IG — a leading U.K. fintech — tested whether GenAI could enable professionals from different occupational backgrounds to perform tasks at the same level as domain specialists. They recruited three groups and gave them two sequential tasks: conceptualize a web article, then execute it.
Some participants had access to bespoke GenAI tools. Others did not. The experiment was designed to isolate one question: where does AI stop being able to bridge the expertise gap?
This is the GenAI wall effect: a threshold beyond which AI can no longer meaningfully compensate for missing expertise. It bridges adjacent knowledge distances. It cannot bridge distant ones.
The researchers identified why. One data scientist approached marketing content the way he'd approach technical documentation and stripped out the hooks, calls-to-action, and narrative structure he didn't recognize as valuable. He made the AI output worse because he couldn't judge it. Marketing specialists could evaluate and refine what the AI produced. Distant outsiders had to trust the AI for both the route and the destination, and that is where things collapsed.
The paper's conclusion: "The bottleneck isn't the idea's quality — it's the implementer's knowledge distance from the domain."
The Harvard/Stanford experiment doesn't just confirm the gap exists; it identifies the precise mechanism. Post-mortems on failed AI pilots return six root causes: lack of skills, high costs, inadequate tools, complex projects, data complexity, and a confidence gap. These are not six problems. They are one problem with six faces, and every face is knowledge distance.

Lack of skills is knowledge distance named from HR. High costs are the rework bill when distant evaluators degrade output. Inadequate tools is what organizations reach for when the actual problem is that the human can't judge what the current tool produces. Complex projects are what ordinary projects look like from too great a distance. Data complexity is invisible to anyone without domain proximity. The confidence gap is precisely what knowledge distance feels like from the inside. All six collapse into a single structural failure: the human holding the AI doesn't have the domain proximity to judge the output. So it stalls in review, or it gets degraded on the way to delivery.
This is why 94% of GenAI pilots are failing. It is not a technology story. It never was.
The three-layer transformation model (Technology, Organization, Human) maps directly onto what the experiment demonstrated. The layers are not a stack; they are a braid. Pull any one strand and all three stop moving. The experiment ran a controlled test and confirmed the model.
This is not a strategy. It's a procurement cycle. The pattern in most organizations: evaluate the AI tool, budget it, integrate it, move on. The org design layer gets skipped entirely. Buying a more sophisticated model doesn't close knowledge distance. It produces more sophisticated output the org still can't evaluate — and now can't slow down.
Client/server. Web. Cloud. Each transition was evaluated as an infrastructure question: assess the technology, build the business case, integrate it, move on. The platform evolved. The org stayed the same. That pattern worked because the technology was a tool. Tools don't act without being acted upon. Agents do.
Most organizations approaching AI are still running the infrastructure playbook. They're asking "what AI tools should we deploy?" when the actual question is "what does our org look like when agents are doing the execution work?" These are not the same question. The gap between them is where transformations fail.
The knowledge distance paper measures failure on a single task with a human reviewing the output. Agentic deployment removes that review by design. Agents observe, decide, and execute across time horizons no human monitors in real time. Knowledge distance × agentic deployment = degraded decisions at autonomous scale, running unmonitored in production systems. The experiment showed you the wall. Agentic deployment is what happens when you hit it at 10× the speed with no human in the loop.
The data scientists in the experiment failed because they were far from the domain. IBM i practitioners with 30 years of business logic embedded in their heads are not far from anything. They are the domain expertise the experiment shows AI needs to function at the execution layer.
That is not legacy. That is the specific asset the experiment shows determines whether AI output gets elevated or degraded.

The market has invented a job title for the person who closes that distance: someone who deeply understands a domain, earns the trust of senior practitioners, and makes AI work in real environments, not demos. That person already exists in the IBM i community. They just don't know what they're worth yet.
The researchers noted the GenAI wall is not fixed; it will move as AI improves. That makes the strategic window narrow, not wide. Organizations that encode practitioner knowledge into governed, auditable systems now are building a compounding asset. The ones that wait face a gap that doesn't close.
Harvard and Stanford measured the gap. Signal4i has been tracking it since signal one.