AI Foundation Models Go Vertical: $75M Funding Round
Standard Intelligence's $75M Series A funding marks a shift toward hyper-specialized foundation models trained on video understanding for specific software categories, challenging the horizontal LLM narrative.

Standard Intelligence's $75 million Series A funding led by Sequoia and Spark Capital (May 5, 2026) marks a critical inflection point: the next wave of enterprise AI isn't about general-purpose models—it's about hyper-specialized foundation models trained to control specific software categories through video understanding.
The Seattle-based startup's approach—building foundation models that watch and replicate software actions rather than trying to be everything to everyone—reveals where actual productivity gains will cluster. While accredited investors continue pouring capital into horizontal LLM platforms promising to revolutionize every industry simultaneously, the companies capturing enterprise spend are going narrow and deep.
This isn't theoretical. Standard Intelligence's thesis is that software automation at scale requires models trained on the visual patterns of specific application interfaces, not natural language instructions that break the moment UI updates or workflows change. The bet: vertical specialization trumps horizontal generalization when enterprises need reliability over flexibility.
Why General-Purpose Foundation Models Failed Enterprise Software Control
The original promise of large language models in enterprise software was simple: give the model a text instruction, let it figure out how to execute across any application. Write "schedule this meeting" and watch it navigate Outlook, Teams, and Salesforce without human intervention.
That promise broke immediately in production.
General-purpose models trained on internet text understand language patterns but have no persistent memory of software interfaces. When Salesforce updates its UI—which happens quarterly—the model's mental map becomes outdated. When a custom enterprise workflow involves fifteen different SaaS tools chained together, the model guesses its way through each step with compounding error rates.
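The compounding-error problem is simple arithmetic. A sketch, assuming a 95% per-step accuracy purely for illustration (the article does not give a measured figure):

```python
# Illustrative only: per-step reliability compounds across a chained workflow.
# The 95% per-step accuracy is an assumed number, not a measured rate.

def chain_success_rate(per_step_accuracy: float, steps: int) -> float:
    """Probability that every step in a sequential workflow succeeds,
    assuming independent per-step outcomes."""
    return per_step_accuracy ** steps

# A model that guesses each UI action correctly 95% of the time
# completes a fifteen-tool workflow less than half the time.
print(round(chain_success_rate(0.95, 15), 3))  # ~0.463
```

Even a seemingly strong per-step accuracy collapses over long chains, which is why end-to-end reliability, not per-action accuracy, is the metric that matters.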
The fundamental issue: foundation models trained on text treat software as a language problem when it's actually a visual-spatial problem. Humans don't read button labels to use software efficiently—they recognize layout patterns, color coding, and spatial relationships built through repetition. They see the "Submit" button is always bottom-right in this particular application. They know the dropdown menu pattern for this specific workflow.
Standard Intelligence's approach inverts this. Instead of training models on text descriptions of software tasks, they train on video recordings of expert users performing those tasks in specific applications. The model learns the visual grammar of enterprise software categories—CRM interfaces, ERP workflows, data analytics dashboards—rather than trying to understand every possible software interaction through natural language alone.
This matters because enterprise software automation isn't a reasoning problem; it's a pattern recognition problem at scale. The company that can reliably click the correct button in Workday 99.9% of the time beats the company that can have a fascinating conversation about HR policy but clicks the wrong field in 15% of deployments.
How Do Specialized Foundation Models Change Enterprise Software Economics?
The economics of horizontal versus vertical AI deployment diverge sharply when implementation costs hit enterprise budgets.
A general-purpose LLM requires extensive fine-tuning for each enterprise deployment. That means weeks of custom prompt engineering, integration testing across the client's specific tech stack, and ongoing maintenance every time any connected application updates. Consulting firms happily bill $500K+ for these implementations because the customization work is genuinely complex and brittle.
Vertical foundation models trained on specific software categories eliminate most of that customization tax. If the model was trained on 10,000 hours of Salesforce usage patterns across multiple enterprise instances, it already understands Salesforce's visual language. Deploy it to a new client running Salesforce, and the model recognizes the interface immediately—no extensive retraining required.
The implications cascade through enterprise budgets. Implementation timelines compress from months to weeks. Ongoing maintenance costs drop because the model vendor absorbs UI update retraining across all customers simultaneously. When Salesforce ships a major interface redesign, Standard Intelligence retrains one vertical model that immediately works for all clients—rather than each enterprise customer retraining their custom-tuned general model independently.
This creates network effects in model training that don't exist with horizontal approaches. Every new enterprise deployment generates usage data that improves the vertical model for all subsequent customers in that category. The 50th Salesforce automation customer gets a dramatically better model than the first, because the model learned from 49 previous deployments.
Compare this to general-purpose models, where learnings from one deployment rarely transfer to the next because each enterprise's custom workflow is unique. There's no compounding advantage.
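The cost divergence can be sketched numerically. The $500K per-deployment figure comes from the article; the vertical-side numbers (a smaller per-customer deployment fee plus one shared retraining cost absorbed by the vendor) are assumptions for illustration:

```python
# Hypothetical cost comparison. The $500K horizontal implementation figure
# is from the article; the vertical-model numbers are assumed for illustration.

def horizontal_total_cost(customers: int, per_deploy: float = 500_000) -> float:
    # Each customer pays for its own fine-tuning and integration work,
    # so costs scale linearly with no shared learning.
    return customers * per_deploy

def vertical_total_cost(customers: int, per_deploy: float = 50_000,
                        shared_retrain: float = 2_000_000) -> float:
    # One shared retraining cost is amortized across every customer;
    # per-customer deployment is cheap because the model already knows the UI.
    return customers * per_deploy + shared_retrain

for n in (10, 50, 200):
    print(n, horizontal_total_cost(n), vertical_total_cost(n))
```

Under these assumed numbers, the vertical model is already cheaper at ten customers, and the gap widens with every additional deployment because the retraining cost is paid once rather than per customer.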
What Does $75M in Series A Funding Signal About AI Investment Thesis Evolution?
Sequoia and Spark Capital leading a $75 million Series A for a company focused on narrow software automation—not broad AGI promises—represents a meaningful pivot in institutional capital allocation.
Three years ago, this capital would have flowed to startups promising "AI that understands everything." The pitch deck would have included slides about human-level reasoning, multimodal understanding across infinite domains, and revolutionary transformation of every industry simultaneously. Investors rewarded ambition and breadth.
That cycle is ending. Dry powder doesn't convert to deployed capital when enterprise customers can't get general-purpose models working reliably in production. The companies raising significant institutional rounds now are the ones with concrete proof points: "Our model automates invoice processing in NetSuite with 99.7% accuracy across 200 enterprise customers" beats "Our model can do anything you imagine" when CFOs are signing checks.
Standard Intelligence's funding validates a specific thesis: vertical depth beats horizontal breadth in enterprise AI deployment. The startup isn't promising to automate all knowledge work—it's promising to automate specific, high-value software workflows with measurable reliability. That's a thesis institutional LPs can underwrite with actual ROI projections rather than science fiction scenarios.
This funding also signals where subsequent capital will flow. Once one vertical-focused AI automation company proves the model works—raises a large Series A, signs marquee enterprise customers, demonstrates superior implementation economics—capital follows rapidly. Expect specialized foundation models for ERP automation, supply chain software control, healthcare records management, and financial services compliance workflows to raise significant rounds in 2026-2027.
The pattern repeats: technology that solves narrow problems extremely well scales faster than technology that solves broad problems poorly. Accredited investors betting exclusively on general-purpose LLM platforms are structurally overexposed to implementation risk and underexposed to proven enterprise deployment models.
Why Video-Based Software Training Changes Model Performance Expectations
Training foundation models on video recordings of software usage—rather than text descriptions—fundamentally changes what "good performance" means in enterprise automation.
Text-trained models hallucinate because they're generating probable next tokens based on internet text patterns, not ground truth about what actually happens when you click a button in SAP. Ask a text-based model to "submit this expense report," and it might generate a plausible-sounding sequence of steps that completely fails when executed because it invented a menu option that doesn't exist in your SAP instance.
Video-trained models don't hallucinate UI elements—they've seen thousands of hours of actual SAP interfaces and learned which buttons, menus, and workflows actually exist. When the model predicts the next action in an expense approval workflow, that prediction is grounded in observed reality across many real deployments, not linguistic probability.
This distinction matters enormously in enterprise contexts. A model that's 95% accurate in generating grammatically correct instructions but only 70% accurate in executing actual software actions is worse than useless—it creates more work through failed automation attempts. A model that's 99% accurate in software execution doesn't need perfect natural language understanding because the task is "click the right button" not "have a conversation about buttons."
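The failure arithmetic is worth making explicit, using the execution-accuracy figures from the paragraph above (the 1,000-task volume is an arbitrary illustration):

```python
# Uses the 70% vs 99% execution-accuracy figures from the text above;
# the 1,000-task volume is an arbitrary illustrative assumption.

def expected_failures(execution_accuracy: float, tasks: int = 1_000) -> int:
    """Expected number of failed automation attempts, each of which
    typically requires manual cleanup by a human operator."""
    return round(tasks * (1 - execution_accuracy))

print(expected_failures(0.70))  # 300 failed tasks to triage
print(expected_failures(0.99))  # 10 failed tasks to triage
```

Thirty times as many cleanup incidents per thousand tasks is the difference between automation that saves headcount and automation that adds it.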
Video training also captures implicit knowledge that text descriptions miss. Expert users develop muscle memory around software—they know the Submit button is always in the same position relative to the Cancel button in this particular application. They recognize visual patterns like color-coded status indicators or layout conventions that signal workflow state. Text descriptions of these workflows lose all that spatial and visual information.
Standard Intelligence's approach treats software interfaces as visual languages with grammatical rules that can be learned through observation. Just as humans learn software through watching tutorials and practicing, not through reading technical documentation, video-trained models learn software through visual pattern recognition rather than linguistic instruction following.
How Will Enterprise Buyers Evaluate Vertical Versus Horizontal AI Tools?
Enterprise software buyers now face a bifurcating market: general-purpose AI platforms that promise to automate anything, versus specialized foundation models that automate specific software categories with high reliability.
The evaluation criteria diverge sharply between these categories.
For general-purpose tools, buyers evaluate flexibility and reasoning capabilities. Can this model handle edge cases? Can it adapt to our unique workflows? Does it understand complex multi-step instructions? These are reasonable questions when the value proposition is "one model for all tasks."
For vertical-focused tools, buyers evaluate accuracy and implementation speed. What's your error rate in production? How fast can we deploy this? What's the maintenance burden when our software updates? These questions matter more because the value proposition is "solve this specific problem extremely well."
The friction point: most enterprises still think they want general-purpose tools because flexibility sounds valuable. The reality emerging from early deployments is that flexibility without reliability costs more than narrow capability with high accuracy. A CFO would rather have 99.9% automated invoice processing in NetSuite than 80% automated "everything involving numbers" across all systems.
This creates a wedge for vertical-focused companies. They can demonstrate ROI with specific dollar amounts—"we'll save you 10,000 hours annually on Salesforce data entry at 99.7% accuracy"—while horizontal platforms are still struggling to prove their implementation actually works in production at scale.
Expect enterprise buying patterns to shift toward vertical solutions rapidly once a few high-profile horizontal implementations fail publicly. The first Fortune 500 company to announce they're ripping out a general-purpose AI automation platform due to reliability issues will accelerate this rotation dramatically. Generic positioning gets punished when buyers learn to ask specific questions.
What Should Accredited Investors Watch in Enterprise AI Deal Flow?
Standard Intelligence's funding reveals clear patterns for evaluating enterprise AI investments in 2026-2027.
First: look for companies focused on specific software categories, not industries. "We automate healthcare" is still too broad. "We automate Epic EMR workflows for hospital billing departments" is a fundable thesis. The narrower the software target, the faster the model can achieve production-grade reliability.
Second: implementation economics matter more than model architecture. The winning companies will be those that can deploy their vertical models in weeks rather than months, with minimal custom integration work. If the startup's pitch requires six-month implementation timelines, they haven't solved the productization problem yet.
Third: training data sources differentiate vertical specialists from horizontal generalists. Companies with proprietary access to thousands of hours of recorded software usage in their target category have structural moats. Video training data for enterprise software is harder to acquire than internet text, which means first-movers can build defensible datasets.
Fourth: enterprise software vendors themselves will become acquirers. Salesforce, Workday, SAP, and Oracle all have strategic incentives to own specialized automation models for their own platforms. A vertical foundation model that automates Salesforce workflows perfectly is either a competitive threat or an acquisition target for Salesforce. Factor acquisition probability into valuation models accordingly.
Fifth: watch for evidence of network effects in model improvement. The best vertical specialists will demonstrate that their models get measurably better with each new customer deployment. Ask for metrics: accuracy improvement from customer 1 to customer 50, implementation time reduction across the customer base, maintenance cost per customer over time.
What to avoid: companies still pitching "AI that can do anything" without concrete production deployment metrics. The market already tried funding dozens of general-purpose enterprise AI platforms. Most are struggling with implementation at scale. The next wave of winners will be specialists who picked one software category and solved it completely.
AI reliability matters more than AI flexibility when enterprise buyers are writing checks. Standard Intelligence's $75 million round validates this thesis at institutional scale.
Why Foundation Model Competition Will Intensify in Vertical Categories
Once Standard Intelligence proves the vertical foundation model thesis works—large enterprise customers, measurable ROI, and eventually a successful exit—competitors will flood each software category rapidly.
The playbook is now clear: pick a high-value enterprise software category with repetitive workflows, acquire video training data of expert users, build a foundation model specialized for that category's visual interface patterns, and sell on implementation speed plus accuracy metrics. Every category with $10B+ in annual enterprise spend becomes a target.
This creates timing advantages for early movers. The first company to achieve production-grade reliability in SAP automation, Workday automation, or Epic EMR automation captures the initial enterprise customers and their usage data. That data compounds into model improvements that make it harder for later entrants to match accuracy metrics.
But timing advantages aren't permanent moats. If Standard Intelligence can train a specialized model in 12-18 months, so can competitors with comparable capital. The sustainable differentiation will be depth of enterprise relationships, quality of training data pipelines, and speed of deployment—not just being first.
Expect consolidation within each vertical category. Just as horizontal SaaS markets consolidated to 2-3 major players per category, vertical foundation models will likely consolidate similarly. Most enterprise software categories won't support ten different specialized automation vendors—buyers will standardize on the 1-2 platforms with best-in-class accuracy and integration.
This has implications for investor strategy. Early-stage investors should target categories where no clear leader has emerged yet. Growth-stage investors should focus on companies demonstrating market leadership within their chosen category. Both should avoid categories where a well-funded specialist already has 100+ enterprise customers and measurably superior accuracy metrics—that race is already over.
The exception: enterprise software vendors acquiring vertical specialists to integrate directly into their own platforms. That creates exit opportunities even in categories with established leaders, because Salesforce might acquire the #2 player in Salesforce automation to prevent a competitor from buying them.
Related Reading
- AvaWatz RegCF: $80.8M Raise for AI Reliability Platform — Crowdfunded AI infrastructure
- AI Search Could Flatten Generic Fund Manager Positioning — Specialization in AI investment
- Dry Powder Is Not Dry Powder if Your Due Diligence Still Sucks — Capital deployment discipline
Frequently Asked Questions
What is a vertical foundation model in enterprise AI?
A vertical foundation model is an AI system trained specifically on one category of enterprise software—like Salesforce CRM, SAP ERP, or Epic EMR—rather than attempting to understand all software types. These models learn the visual patterns and workflows of their target software through thousands of hours of recorded usage, enabling higher accuracy and faster deployment than general-purpose models.
Why did Standard Intelligence raise $75 million for specialized software automation?
The funding validates that narrow, reliable software automation beats broad, unreliable automation in enterprise buying decisions. Sequoia and Spark Capital are betting that companies solving specific software categories with 99%+ accuracy will capture more enterprise spend than platforms promising to automate everything at 80% accuracy.
How do video-trained AI models differ from text-trained models?
Video-trained models learn software interfaces by watching actual screen recordings of expert users, capturing visual layouts, button positions, and workflow patterns that text descriptions miss. This grounds their predictions in observed reality rather than linguistic probability, reducing hallucinations and improving execution accuracy in actual software environments.
Will enterprise software vendors acquire vertical AI automation companies?
Very likely. Companies like Salesforce, Workday, and SAP have strategic incentives to own specialized automation capabilities for their own platforms. A vertical model that perfectly automates a vendor's software is either a competitive threat or a natural acquisition target, creating exit opportunities beyond traditional IPO or strategic sale paths.
Should accredited investors focus on horizontal or vertical AI companies?
Current evidence favors vertical specialists with proven production deployments over horizontal platforms still struggling with implementation reliability. Look for companies demonstrating measurable accuracy improvements across multiple customers, fast implementation timelines, and focus on specific software categories rather than broad industry verticals.
What makes a strong vertical foundation model investment thesis?
Key indicators include: proprietary access to training data in the target software category, production deployments with 99%+ accuracy metrics, implementation timelines under 30 days, evidence of network effects as model quality improves with each customer, and total addressable market above $5B in annual enterprise software spend.
How will vertical AI automation impact enterprise software pricing?
As specialized models automate repetitive workflows with high reliability, enterprises will shift spending from headcount toward automation licenses. This creates pricing pressure on traditional software vendors who charged based on user seats, while creating new revenue opportunities for automation specialists charging based on tasks completed or time saved.
What happens to general-purpose LLM platforms in this shift?
General-purpose platforms will continue serving use cases requiring flexibility and reasoning—content generation, analysis, conversational interfaces. But the high-value enterprise automation market will increasingly favor vertical specialists. Some horizontal platforms may pivot toward becoming infrastructure for vertical model builders rather than competing directly in automation.
Ready to invest in the next wave of enterprise AI infrastructure? Apply to join Angel Investors Network to access curated dealflow in vertical foundation models and specialized software automation.
About the Author
Sarah Mitchell