Digital Employees: How BNY Scaled to 100+ AI Agents When Others Can't Get Past 10

Suhit Anantula

19 Nov 2025 — 10 min read

We've been reading a lot of stories about AI adoption and its challenges. There's a new study every week: one claims AI isn't working, while another says it is. So, what's really happening? I think both are partly true. Many organisations are running pilots, which is great; that's exactly what you want. But we also want them to scale.

What's the gap here? It seems to be organisational: people, governance, and structure. The technology plays a role, but it isn't the whole story. That's why I like this story from the Wall Street Journal, which digs into how organisations like the Bank of New York (BNY) and Walmart are scaling AI. Doing so requires a new way of thinking and a new approach.

The article touches on ideas I've been sharing about what it takes to build co-intelligent organisations.


The Paradox of AI Agent Adoption

Last month at the Gartner IT Symposium in Orlando, a panel titled "Agentic AI: Is it Real?" addressed a question keeping executives awake at night: Should we move fast on AI agents, or wait for the technology to mature?

The evidence is now in, and it's not what most expected.

While a Massachusetts Institute of Technology study in August 2025 reported that most generative AI pilot projects were failing to generate meaningful returns, two Fortune 500 leaders presented starkly different results. BNY has deployed 100 "digital employees" delivering what CIO Leigh-Ann Russell calls "really, really tangible outcomes that impact our bottom line." Walmart compressed its fashion design-to-store timeline by 18 weeks using AI agents. Meanwhile, Salesforce reports 5x growth in its Agentforce platform since launch.

"For us, that's a hard false," Russell said when asked about the MIT study's findings.

The difference isn't the AI technology. It's something far more fundamental — and far harder to copy.

The 5-to-10 Agent Ceiling

Most organizations deploying AI agents hit an invisible wall at around 5 to 10 agents. The reasons are familiar: governance concerns, audit challenges, unclear accountability, and integration complexity. What starts as an exciting pilot quickly becomes an operational nightmare.

BNY has 100 digital employees. Walmart's fashion business runs on a multi-agent ecosystem. What did they figure out that others haven't?

The answer lies in a deceptively simple insight: AI agents need organizational identity, not just technical capability.

The Digital Employee Framework

At BNY, AI agents aren't "tools" or "copilots." They're digital employees. And like human employees, they have:

Distinct login credentials (just like every other employee in the company)
Communication protocols (they use email and Microsoft Teams)
Reporting relationships (each agent reports to a human manager)
Clear roles and responsibilities (such as "digital engineer")

"It's very, very hard to build an agentic framework," Russell acknowledged. "But we have 117 solutions touching everything that happens at the bank."

This isn't just clever branding. It's a fundamentally different organizational design approach.

Why Traditional Approaches Fail

When organizations treat AI agents as technology projects, they assign them to IT departments. When they treat them as productivity tools, they distribute them ad hoc across teams. Both approaches hit the same scaling barrier: lack of organizational legibility.

Without organizational structure, you can't answer basic questions:

Who is responsible when an agent makes a mistake?
How do you audit agent decisions?
What happens when an agent's scope needs to expand?
How do you manage 50+ agents across different departments?

BNY solved this by making agents organizationally legible. Each digital employee appears in the organizational system the same way a human employee does. Managers know which agents report to them. Audit trails exist automatically through login systems. Escalation paths are clear.

The framework was designed for three things: managing, auditing, and scaling.
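To make "organizationally legible" concrete, here is a minimal sketch of what such a directory could look like. It is not BNY's implementation; the names `DigitalEmployee`, `Directory`, and the sample login `de-001` are hypothetical and chosen only for illustration. The point it shows is that once every agent has a login and a manager, the accountability and audit questions above become simple lookups.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class DigitalEmployee:
    login_id: str   # distinct credential, like any other employee
    role: str       # e.g. "digital engineer"
    manager: str    # the human this agent reports to
    channels: tuple[str, ...] = ("email", "teams")


@dataclass
class Directory:
    """Hypothetical registry of digital employees plus their audit trail."""
    employees: dict[str, DigitalEmployee] = field(default_factory=dict)
    audit_log: list[dict] = field(default_factory=list)

    def register(self, emp: DigitalEmployee) -> None:
        # Refuse to onboard an agent that nobody is accountable for.
        if not emp.manager:
            raise ValueError(f"{emp.login_id} has no human manager")
        self.employees[emp.login_id] = emp

    def record(self, login_id: str, action: str) -> None:
        # Every action is tied to a login, so the trail follows from identity.
        if login_id not in self.employees:
            raise KeyError(f"unknown login: {login_id}")
        self.audit_log.append({
            "login_id": login_id,
            "action": action,
            "at": datetime.now(timezone.utc).isoformat(),
        })

    def accountable_for(self, login_id: str) -> str:
        return self.employees[login_id].manager

    def reports_of(self, manager: str) -> list[str]:
        return sorted(e.login_id for e in self.employees.values()
                      if e.manager == manager)

    def trail(self, login_id: str) -> list[str]:
        return [e["action"] for e in self.audit_log
                if e["login_id"] == login_id]


directory = Directory()
directory.register(DigitalEmployee("de-001", "digital engineer", manager="a.human"))
directory.record("de-001", "opened fix for low-complexity vulnerability")
```

With this in place, `directory.accountable_for("de-001")` answers "who is responsible?", `directory.trail("de-001")` answers "how do you audit?", and `directory.reports_of("a.human")` shows a manager their digital reports.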

The Technical Foundation

BNY's agents run on multiple foundation models — OpenAI, Google, and Anthropic — integrated through an internal platform called "Eliza." This multi-model approach, combined with security and accuracy enhancements, represents a crucial architectural decision: don't rely on any single AI provider.
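I don't know how Eliza is built internally, so treat the following as a generic sketch of the "don't rely on one provider" idea rather than a description of BNY's platform. The function `complete` and the two stand-in providers are hypothetical; in practice each provider would wrap a vendor's SDK.

```python
from typing import Callable

# A provider is any callable that takes a prompt and returns text.
Provider = Callable[[str], str]


class AllProvidersFailed(RuntimeError):
    """Raised when no configured provider could answer."""


def complete(prompt: str, providers: dict[str, Provider]) -> tuple[str, str]:
    """Try providers in order; return (provider_name, answer) from the first success."""
    errors: dict[str, str] = {}
    for name, call in providers.items():
        try:
            return name, call(prompt)
        except Exception as exc:  # a real gateway would catch narrower errors
            errors[name] = str(exc)
    raise AllProvidersFailed(errors)


# Stand-in providers for illustration only.
def flaky_provider(prompt: str) -> str:
    raise TimeoutError("provider unavailable")


def echo_provider(prompt: str) -> str:
    return f"answer to: {prompt}"


used, answer = complete(
    "summarise this ticket",
    {"primary": flaky_provider, "secondary": echo_provider},
)
print(used, "->", answer)
```

Because agents call `complete` rather than a vendor directly, swapping or adding a model is a configuration change, not a rewrite of every agent.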

But the more important decision was organizational. By creating the digital employee framework first, BNY built the governance structure needed to deploy agents at scale. The technology became the easy part.

Consider one example: BNY's "digital engineer" scans the codebase for vulnerabilities and autonomously writes and implements fixes for low-complexity problems. The agent has a manager, follows escalation protocols for complex issues, and maintains an audit trail through its login credentials.
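The split between "fix it autonomously" and "escalate to the manager" can be sketched as a small triage rule. This is an illustration under my own assumptions: the `Complexity` levels, the `Finding` and `Decision` types, and the rule that only low-complexity findings are auto-fixed are hypothetical, not BNY's actual policy.

```python
from dataclasses import dataclass
from enum import Enum


class Complexity(Enum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass(frozen=True)
class Finding:
    finding_id: str
    complexity: Complexity
    description: str


@dataclass(frozen=True)
class Decision:
    finding_id: str
    action: str      # "auto_fix" or "escalate"
    handled_by: str  # the agent's login, or the human manager


def triage(finding: Finding, agent_login: str, manager: str) -> Decision:
    # Only low-complexity findings are fixed autonomously;
    # everything else goes to the agent's human manager.
    if finding.complexity is Complexity.LOW:
        return Decision(finding.finding_id, "auto_fix", agent_login)
    return Decision(finding.finding_id, "escalate", manager)


findings = [
    Finding("F-1", Complexity.LOW, "outdated dependency pin"),
    Finding("F-2", Complexity.HIGH, "auth flow redesign needed"),
]
decisions = [triage(f, agent_login="de-001", manager="a.human") for f in findings]
for d in decisions:
    print(d.finding_id, d.action, d.handled_by)
```

The useful property is that every decision names who handled it, which is what keeps the agent's autonomy inside an accountable structure.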

It's not just working. It's working within a system designed to manage it.


Walmart: The Multi-Agent Orchestration Model

While BNY demonstrates agent governance at scale, Walmart shows what's possible when agents work together.

In the fashion industry, the traditional timeline from ideation to product availability in stores is about six months. Walmart's Trend-to-Product agent has compressed this timeline by 18 weeks — a 69% reduction. The agent monitors trend signals (what teenagers are buying), creates product specifications and patterns, and accelerates the entire design-to-production-to-store workflow.
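The arithmetic behind that percentage is worth spelling out. The only assumption here is that "about six months" means 26 weeks; the 18-week figure comes from the reporting.

```python
baseline_weeks = 26   # "about six months", taken here as 26 weeks (assumption)
weeks_saved = 18      # figure quoted above

remaining_weeks = baseline_weeks - weeks_saved
reduction_pct = 100 * weeks_saved / baseline_weeks
speedup = baseline_weeks / remaining_weeks

print(f"remaining: {remaining_weeks} weeks")  # remaining: 8 weeks
print(f"reduction: {reduction_pct:.0f}%")     # reduction: 69%
print(f"speedup:   {speedup:.2f}x")           # speedup:   3.25x
```

So an 18-week saving on a 26-week cycle leaves about 8 weeks, which is a cycle a little over three times faster than the traditional one.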

But Trend-to-Product doesn't work alone. As OpenAI CFO Sarah Friar explained at the WSJ Tech Live conference, "Walmart is a good example of a customer now working with us on the commerce side, but also using a lot of our technology internally around things like how to merchandise, how to handle risk."

Vinod Bidarkoppa, Walmart's Executive VP and CTO for Walmart International, describes AI as a "force multiplier" that lifts productivity across all lines of business, particularly in engineering where "significant increases are already visible."

His key insight: "This is real value, if all of these agents are working together."

Not isolated tools. Not individual copilots. An ecosystem of agents collaborating across the workflow.

The Behind-the-Scenes Transformation

Here's what most executives miss: the most valuable AI agent deployments are invisible.

Walmart's fashion timeline compression is visible — you can measure weeks saved. But what about the agents handling merchandising decisions? Risk management? Engineering productivity improvements? Those agents are working "behind the scenes," delivering value that won't show up in customer-facing metrics or industry reports.

This creates what we might call the "perception lag problem." Public adoption studies significantly underestimate actual deployment. The MIT study that reported widespread AI pilot failures? It captured a snapshot of organizations in the pilot phase, before they'd developed governance frameworks or reached production maturity.

Six months later, those same organizations may have 50+ agents in production. But nobody writes case studies about internal operational improvements.

BNY's Russell noted that their 117 AI solutions touch "everything that happens at the bank." How many of those 117 solutions are customer-facing? How many are internal transformation initiatives that competitors can't see?

Your competitors may be further ahead than you think.

Why First Movers Are Winning

At the Gartner panel, the moderator framed the central question directly: "The pack could appear prudent in the not-so-distant future for waiting to see how agents mature, or the first movers might gain a long-term, sustainable advantage."

The evidence now supports the first movers.

Here's why: organizational capability is the competitive moat, not technology access.

Every company can license the same foundation models. Every company can hire AI engineers. Every company can run pilot projects.

But building an organizational framework that enables 100+ agent deployments? That takes time. BNY's Russell was explicit: "It's very, very hard to build an agentic framework."

The organizations investing that time now — learning how to govern agents, train managers to oversee digital employees, develop escalation protocols, create audit frameworks — are building a capability advantage that can't be purchased.

The 12-Month Gap

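The MIT study (August 2025) and the production deployments reported in November 2025 sit only a few months apart, so they describe organizations at different stages rather than the same organizations over time. The pattern they suggest is a 6-12 month organizational learning curve from pilot to production.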

That's not technology maturation time. That's organizational transformation time.

Organizations waiting for "more proof" or "better technology" are falling 12-18 months behind leaders like BNY and Walmart. And unlike technology gaps, organizational capability gaps can't be closed by throwing money at the problem.

You can't buy organizational learning. You have to earn it.

What Executives Should Do Now

The digital employee framework points to a fundamentally different approach to AI agent deployment. Here's what that means for your organization:

1. Treat This as an Organizational Design Problem, Not a Technology Problem

Stop delegating AI agents to IT. This is a workforce composition question. How many digital employees do you need to compete? What roles should they fill? How will you structure accountability?

These are C-suite decisions, not IT department decisions.

2. Start With 5 Digital Employees and a Governance Framework

Don't try to scale to 100 agents immediately. Start with 5:

Define clear roles (like BNY's "digital engineer")
Assign each agent to a human manager
Establish communication protocols (email, Teams, etc.)
Create login credentials and audit trails
Document escalation procedures

The goal isn't to deploy 5 agents. The goal is to build the organizational infrastructure that enables scaling to 50-100 agents.
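One way to keep a small pilot honest is to treat the five items above as a checklist and report what is missing for each agent. This is a minimal sketch; the field names in `REQUIRED` and the sample records are hypothetical.

```python
# The five starter items, expressed as fields every agent record must fill in.
REQUIRED = ("role", "manager", "channels", "login_id", "escalation_doc")


def missing_items(agent: dict) -> list[str]:
    """Return the checklist items an agent record has not filled in."""
    return [key for key in REQUIRED if not agent.get(key)]


pilot = [
    {"login_id": "de-001", "role": "digital engineer", "manager": "a.human",
     "channels": ["email", "teams"], "escalation_doc": "runbooks/de-001.md"},
    {"login_id": "de-002", "role": "digital analyst", "manager": "",
     "channels": ["email"], "escalation_doc": None},
]

gaps = {agent["login_id"]: missing_items(agent) for agent in pilot}
ready = [login for login, missing in gaps.items() if not missing]

for login, missing in gaps.items():
    print(login, "ready" if not missing else f"missing: {', '.join(missing)}")
```

An agent with any gap simply isn't deployed yet. The same check that governs 5 agents works unchanged for 50.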

3. Design for Multi-Agent Orchestration, Not Individual Tools

Walmart's insight is critical: value comes when agents work together. Don't think about "an agent for customer service" or "an agent for code review." Think about agent ecosystems that collaborate across workflows.

How would a collection of agents working together compress your critical timelines by 18 weeks?

4. Measure Organizational Learning, Not Just Technology Deployment

Track metrics like:

How many agents can you govern simultaneously?
How quickly can managers learn to oversee digital employees?
What percentage of agent outputs require human intervention?
How fast can you add new agents to your infrastructure?

These metrics reveal organizational capability development — the actual competitive advantage.
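Several of these metrics can be computed from data most teams already have. The sketch below is illustrative only: the `AgentOutput` type, the function names, and the sample numbers are my own assumptions, not figures from BNY or Walmart.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class AgentOutput:
    agent: str
    needed_human: bool


def intervention_rate(outputs: list[AgentOutput]) -> float:
    """Share of agent outputs that required a human to step in."""
    if not outputs:
        return 0.0
    return sum(o.needed_human for o in outputs) / len(outputs)


def onboarding_days(requested: date, live: date) -> int:
    """Days from requesting a new agent to having it in production."""
    return (live - requested).days


def agents_per_manager(reporting: dict[str, str]) -> dict[str, int]:
    """Count digital reports per human manager (agent login -> manager)."""
    counts: dict[str, int] = {}
    for manager in reporting.values():
        counts[manager] = counts.get(manager, 0) + 1
    return counts


outputs = [
    AgentOutput("de-001", False),
    AgentOutput("de-001", False),
    AgentOutput("de-001", True),
    AgentOutput("de-002", False),
]
rate = intervention_rate(outputs)
days = onboarding_days(date(2025, 1, 6), date(2025, 1, 20))
span = agents_per_manager({"de-001": "a.human", "de-002": "a.human", "de-003": "b.human"})
print(rate, days, span)
```

Tracked over time, a falling intervention rate and shorter onboarding are signs that the organization, not just the technology, is getting better at this.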

5. Accept That This Is a Multi-Year Journey

BNY didn't build 100 digital employees overnight. Walmart didn't compress its fashion timeline in a quarter. These are multi-year organizational transformations.

The question isn't whether to start. The question is: will you start now, when you can still be a first mover, or will you start in 18 months, when you're already behind?

The Real ROI: Capacity Expansion, Not Cost Reduction

Notice BNY's framing: Russell didn't talk about headcount reduction or cost savings. She talked about "growing capacity."

This is the more powerful ROI story. AI agents don't replace your workforce. They expand what your existing workforce can accomplish.

A bank that can deploy 100 digital employees can:

Scan codebases for vulnerabilities 24/7
Handle low-complexity fixes autonomously
Free human engineers for high-complexity work
Scale operations without scaling headcount proportionally

A retailer that compresses design-to-store timelines by 18 weeks can:

Respond to trend signals roughly 3x faster than competitors on the traditional six-month cycle
Test more products with less risk
Capture market opportunities that competitors miss
Maintain fresher inventory and better margins

These aren't efficiency gains. These are capability advantages.

The Organizational Design Challenge

The hardest part of AI agent deployment isn't the technology. It's answering a question most executives haven't thought about:

What does it mean to manage a digital employee?

When a digital engineer identifies a vulnerability and proposes a fix, how does the human manager evaluate that proposal? When should the manager approve autonomous implementation versus requiring human review? How do you conduct a "performance review" for an agent? What happens when an agent makes a mistake — who is accountable?

These aren't technical questions. They're organizational design questions.

BNY has spent years developing answers. Their competitors are just starting to ask the questions.

The 100-Agent Future

Within 24 months, "digital employee count" will become a standard business metric, reported alongside FTE (full-time equivalent) in earnings calls.

Organizations with 100+ digital employees will have fundamentally different capabilities than organizations with 10. The gap will be as significant as the difference between a 10-person startup and a 1,000-person enterprise.

But unlike scaling human employees — which requires recruiting, hiring, training, and management overhead — scaling digital employees requires something different: organizational design capability.

The companies building that capability now are establishing a competitive moat that will become visible in 3-5 years and will be very hard to cross.

Conclusion: The Strategic Choice

At that Gartner panel, the question was whether first movers would gain sustainable advantage or whether the prudent path was to wait.

The evidence is clear: first movers are establishing organizational capabilities that can't be purchased later.

BNY's 100 digital employees. Walmart's 18-week compression. Salesforce's 5x growth. These aren't just early results. They're proof that organizational design — not AI technology — determines who wins.

The technology is available to everyone. The organizational capability is not.

Your competitors are making this strategic choice right now. Some are building governance frameworks, training managers, establishing digital employee infrastructure. Others are waiting for "more proof" or "better technology."

Which choice is your organization making?

And perhaps more importantly: do you have 12-18 months to spare?

Key Takeaways

The Digital Employee Framework — AI agents need organizational identity (logins, managers, communication protocols) to scale beyond the 5-10 agent ceiling
Governance Enables Scale — Organizations with governance frameworks deploy 100+ agents; those without one stay stuck at the pilot stage
Multi-Agent Orchestration — Value comes from agents working together across workflows, not individual tool deployments
Organizational Design > Technology — Success determined by organizational structure and governance, not AI capability
First Mover Advantage Is Real — The 6-12 month organizational learning curve can't be compressed; leaders building capability now establish sustainable competitive advantage
Behind-the-Scenes Transformation — Most valuable deployments are internal/invisible; public adoption data significantly underestimates reality
Manager as AI Governor — Human managers become accountability mechanism for digital workforce; requires new training and mindset
Capacity Expansion Over Cost Reduction — ROI story is "growing capacity" not "reducing headcount"

Sidebar: The MIT Study Contradiction Explained

In August 2025, an MIT study reported that most generative AI pilot projects had failed to generate meaningful returns. A few months later, BNY's CIO called this finding "a hard false" based on her company's results.

Both statements can be true. Here's why:

The MIT study captured organizations in the pilot phase (0-6 months), before they'd developed governance frameworks or organizational capabilities. At this stage, low ROI is expected and normal.

The WSJ reporting captured organizations in the production phase (roughly 6-12 months further along), after governance frameworks were established and agents were deployed at scale.

The lesson: pilot-phase metrics don't predict production-phase outcomes. The 6-12 month organizational learning curve is real and unavoidable.

Organizations that abandoned AI agent initiatives based on early pilot results missed the critical insight: the pilot phase is about building organizational capability, not achieving ROI. ROI comes later, once governance is established.

This timing pattern has strategic implications: organizations starting today won't see production-level ROI for 6-12 months. But organizations that don't start today will be 12-18 months behind those that do.