From Pilots That Stall to Systematic Implementation
Most AI pilots succeed. Most scaled implementations fail.
The problem is execution methodology, not technology.
Organizations everywhere face the same pattern: A team runs a successful pilot. AI generates promising results. Leadership gets excited. Then someone says "Let's roll this out company-wide."
Six months later, the initiative is quietly shelved. The pilot worked. The scaling didn't.
The numbers tell the story:
- 95% of AI pilots never achieve enterprise-wide deployment
- Organizations lose an average of $1.9 million per failed AI initiative
- The pilot-to-scale transition kills more AI projects than technical failure
The problem is execution methodology. Most organizations know what they want AI to do—they've defined goals, identified use cases, secured budget. What they lack is a systematic approach to move from "this works" to "this works at scale."
SHAPE is a five-phase execution methodology that bridges the gap between strategy and results:
- Situation: Assess current state honestly before changing anything
- Hypothesis: Define measurable success criteria upfront
- Action: Execute systematic pilots with clear decision frameworks
- Process: Scale what works through systematic phases
- Evaluation: Measure continuously and iterate based on evidence
Like PAST (which provides strategic clarity), SHAPE works at every level:
- Organizational: Enterprise AI rollouts
- Team: Department-specific workflow improvements
- Individual: Personal productivity optimization
- Prompt: Iterative improvement of AI interactions
PAST tells you WHAT and WHY. Purpose, Audience, Scope, Tone—strategic clarity about what you're trying to achieve.
SHAPE tells you HOW and WHEN. Systematic execution from assessment through evaluation.
Together, they form a complete system from strategy to implementation. You can use SHAPE independently, but organizations using both frameworks report the most consistent results.
Implementation leaders who execute AI projects and need a systematic methodology for scaling.
Project managers moving from pilot success to production deployment.
Team leads optimizing workflows without getting lost in tool comparisons.
Consultants delivering execution methodology clients can follow independently.
Anyone who has strategy but struggles with systematic execution.
Read the five SHAPE chapters sequentially to understand the methodology deeply.
Apply the worksheets to your specific implementation challenge.
Use the Takers/Shapers/Makers decision framework (Chapter 3) before every implementation.
Return to Chapter 6 for level-specific examples when needed.
Most AI implementation failures trace back to a single cause: Organizations didn't understand where they were starting from.
They assumed their data was cleaner than it was. They overestimated team readiness. They underestimated integration complexity. They didn't know that "how people actually work" differs dramatically from "how processes are documented."
You can't plan the path without knowing the terrain.
Situation assessment isn't bureaucratic overhead—it's the foundation that prevents expensive mistakes. The 30 minutes you spend honestly assessing current state can save months of failed implementation.
Before selecting any AI approach, assess which deployment model fits your situation.
Dr. Janna Lipenkova notes in The Art of AI Product Development: "For companies with the technical resources, hosting an open source model internally grants complete control over all aspects of the LM, including its fine-tuning."
But she also warns: "Data is sent to the provider's servers for processing using a managed LLM via API, which can raise privacy concerns, especially for sensitive or proprietary data."
Infrastructure Decision Matrix:
| Factor | API/Vendor Better | Self-Hosted Better |
|---|---|---|
| Data Sensitivity | Public data, non-sensitive | Regulated, proprietary, competitive |
| Technical Team | Limited ML expertise | Can fine-tune and maintain |
| Control Requirements | Standard compliance | Audit trails, full visibility |
| Volume Economics | Variable/low usage | High-volume, predictable |
| Model Customization | General purpose sufficient | Domain-specific needed |
Decision: If 3+ factors favor self-hosting AND you have technical capability, evaluate build vs. buy. Otherwise, default to vendor solutions.
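The decision rule above can be written down as a short check. This is a minimal sketch, not a prescribed tool: the factor keys and the function name `recommend_deployment` are hypothetical, and each boolean stands for your own judgment of the matching row in the matrix.

```python
# Hypothetical factor keys, one per row of the Infrastructure Decision Matrix.
FACTORS = (
    "data_sensitivity",
    "technical_team",
    "control_requirements",
    "volume_economics",
    "model_customization",
)


def recommend_deployment(favors_self_hosting: dict, has_technical_capability: bool) -> str:
    """Apply the rule: 3+ factors favor self-hosting AND capability exists."""
    missing = [f for f in FACTORS if f not in favors_self_hosting]
    if missing:
        raise ValueError(f"Unrated factors: {missing}")
    self_hosted_votes = sum(bool(favors_self_hosting[f]) for f in FACTORS)
    if self_hosted_votes >= 3 and has_technical_capability:
        return "evaluate build vs. buy"
    return "default to vendor solutions"


# Illustrative assessment (made-up answers, not data from a real organization).
assessment = {
    "data_sensitivity": True,       # regulated data
    "technical_team": True,         # can fine-tune and maintain
    "control_requirements": True,   # audit trails needed
    "volume_economics": False,      # variable usage
    "model_customization": False,   # general purpose is sufficient
}
print(recommend_deployment(assessment, has_technical_capability=True))
```

Note that three self-hosting votes without technical capability still fall back to the vendor default, which is the conservative reading of the rule.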
Including this infrastructure assessment in the Situation phase helps prevent later scaling failures caused by choosing the wrong deployment model.
Technology Assessment:
What systems and tools are currently in use? Not the official IT inventory—what people actually use daily.
How well do existing systems integrate? Can data flow between them, or does someone manually copy-paste between applications?
What data is available and in what formats? Is it accessible, clean, structured? Or scattered across spreadsheets with inconsistent naming conventions?
Where are the technical skill gaps? Who on the team can troubleshoot when AI tools misbehave?
Workflow Assessment:
How do people actually work versus how they're supposed to work? This gap often surprises leadership. Shadow processes exist because official processes don't match reality.
Where are the biggest inefficiencies or pain points? What frustrates people most? What takes far longer than it should?
What processes are most ready for AI augmentation? Look for high-volume, repetitive tasks with clear quality standards.
Which workflows are too complex for initial AI implementation? Some processes have too many exceptions, too much tacit knowledge, too many edge cases. Save those for later.
Before concluding your Situation assessment, audit shadow AI usage—the unsanctioned tools employees already use.
Dr. Janna Lipenkova observes in The Art of AI Product Development: "Shadow processes exist because official processes don't match reality."
Shadow AI Audit Questions:
- Which unofficial AI tools are employees already using?
- Why did they choose those tools over official alternatives?
- What does shadow AI adoption reveal about workflow friction?
- How can official AI solutions address what shadow AI is solving?
The Signal: Shadow AI isn't a security problem to eliminate—it's market research about what your users actually need. Any Situation assessment that ignores shadow AI will miss critical adoption intelligence.
When 50% of employees prefer unsanctioned tools, the problem isn't employee behavior. The problem is that official solutions don't match how work actually happens.
Organizational Assessment:
How comfortable are teams with new technology? Past experience predicts future adoption. Teams with successful technology adoption histories will embrace AI more readily.
Where is leadership support strongest? AI implementations need visible champions. Where are yours?
What resistance should you expect and from whom? Identify skeptics early. Their concerns often reveal real implementation risks.
Before optimizing team workflows:
What's the team's current baseline? Measure before you change. How long do tasks take now? What does quality look like today?
Where does the team spend most time? Often different from where management thinks time goes. Ask the team directly.
What frustrates the team most? These pain points reveal the highest-value implementation opportunities.
What's the team's tech comfort level? Some teams embrace new tools. Others need extensive hand-holding. Plan accordingly.
Where has the team successfully adopted tools before? Past success patterns indicate what implementation approaches will work.
For personal productivity optimization:
Which repetitive tasks consume the most time? Track for a week. The answer often surprises you.
What's the quality standard I need to maintain? AI can accelerate low-stakes work easily. High-stakes work requires more careful implementation.
How much complexity am I comfortable managing? Be honest. Simple tools used consistently beat complex tools abandoned after a week.
What's my current workflow baseline? Measure before claiming improvement. "Feels faster" isn't evidence.
Where have I successfully changed habits before? Habit change is hard. Build on previous successes.
For AI interaction optimization:
What's my current process for this task? Before optimizing prompts, understand the workflow they serve.
What's working and what's frustrating? Which prompts produce usable results? Which require extensive editing?
What specific outcome do I need? Vague goals produce vague prompts, and vague prompts produce vague outputs.
What constraints exist? Time limits, format requirements, audience expectations, compliance needs.
What have I tried before? Document failed approaches so you don't repeat them.
AI Readiness Audit (Rate 1-10):
- Technical infrastructure maturity
- Data quality and accessibility
- Team AI skills and comfort level
- Leadership support and budget allocation
- Change management capabilities
Scoring: 40-50 = Ready for aggressive implementation. 25-39 = Proceed with careful pilots. Below 25 = Foundation work needed before AI implementation.
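The audit arithmetic is simple enough to script. The sketch below sums the five ratings and applies the bands stated above. The dimension keys and the function name `readiness_band` are hypothetical, and the sample ratings are invented for illustration.

```python
# Hypothetical keys for the five audit dimensions listed above.
AUDIT_DIMENSIONS = (
    "technical_infrastructure",
    "data_quality",
    "team_skills",
    "leadership_support",
    "change_management",
)


def readiness_band(ratings: dict) -> tuple:
    """Sum five 1-10 ratings and map the total to the scoring bands."""
    for name in AUDIT_DIMENSIONS:
        score = ratings[name]  # raises KeyError if a dimension was not rated
        if not 1 <= score <= 10:
            raise ValueError(f"{name} must be rated 1-10, got {score}")
    total = sum(ratings[name] for name in AUDIT_DIMENSIONS)
    if total >= 40:
        return total, "Ready for aggressive implementation"
    if total >= 25:
        return total, "Proceed with careful pilots"
    return total, "Foundation work needed before AI implementation"


# Invented example ratings.
example = {
    "technical_infrastructure": 7,
    "data_quality": 5,
    "team_skills": 6,
    "leadership_support": 8,
    "change_management": 4,
}
print(readiness_band(example))
```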
Workflow Mapping Exercise (For each potential AI use case):
- Current time investment (hours/week)
- Pain point severity (1-10)
- Process standardization level (1-10)
- Data availability for AI (1-10)
- User enthusiasm for AI assistance (1-10)
Prioritization: Total scores indicate implementation readiness. Start with highest-scoring workflows.
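One way to turn the exercise into a ranked list is sketched below. It rests on an assumption the exercise does not spell out: only the four 1-10 ratings are summed, and hours per week is used to break ties, because hours are on a different scale. The class name, field names, and the three sample use cases are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class UseCase:
    name: str
    hours_per_week: float   # current time investment
    pain: int               # pain point severity, 1-10
    standardization: int    # process standardization level, 1-10
    data_availability: int  # data availability for AI, 1-10
    enthusiasm: int         # user enthusiasm for AI assistance, 1-10

    @property
    def readiness_score(self) -> int:
        # Assumption: only the four 1-10 ratings are summed (maximum 40).
        return self.pain + self.standardization + self.data_availability + self.enthusiasm


def prioritize(use_cases):
    """Highest readiness score first; hours per week breaks ties (assumption)."""
    return sorted(use_cases, key=lambda u: (u.readiness_score, u.hours_per_week), reverse=True)


# Invented candidates for illustration.
candidates = [
    UseCase("Weekly reporting", 6, pain=8, standardization=9, data_availability=7, enthusiasm=8),
    UseCase("Contract review", 10, pain=9, standardization=3, data_availability=4, enthusiasm=5),
    UseCase("Meeting notes", 4, pain=6, standardization=7, data_availability=8, enthusiasm=9),
]
for uc in prioritize(candidates):
    print(f"{uc.name}: {uc.readiness_score}")
```

In this invented example the most painful workflow (contract review) ranks last because it is poorly standardized and short on data, which matches the earlier advice to save complex workflows for later.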
Most AI projects fail because success isn't clearly defined upfront.
Without clear success criteria, you can't measure results. Without measurable results, you can't make scaling decisions. Without scaling decisions, pilots drift indefinitely.
"We want to improve efficiency" isn't a hypothesis. "AI-assisted research will reduce report preparation time from 8 hours to 3 hours" is a hypothesis. One can be measured. The other can be debated forever.
Specific hypotheses enable learning and iteration. When you know exactly what success looks like, you can tell whether you've achieved it—and if not, why not.
Organizational Level:
"AI-assisted research will reduce report preparation time from 8 hours to 3 hours."
"Automated customer classification will improve sales team efficiency by 25%."
"AI content generation will increase marketing output by 50% with maintained quality scores."
"Loan processing time will decrease from 5 days to 2 days while maintaining approval accuracy."
Team Level:
"The team will complete weekly reporting in 30 minutes versus current 2 hours."
"Meeting notes will capture 100% of action items versus current 70%."
"Customer inquiry response time will drop from 4 hours to 30 minutes."
"First-pass document review will take 15 minutes instead of 45 minutes."
Individual Level:
"This automation will save 2 hours per week with less than 30 minutes setup."
"Email processing time will drop from 90 minutes to 30 minutes daily."
"Research summaries will take 15 minutes versus current 45 minutes."
"I'll complete this recurring task in half the time within two weeks."
Prompt Level:
"This prompt will generate usable first drafts 80% of the time."
"AI output will require less than 5 minutes of editing for typical use."
"Response quality will be consistent across similar requests."
"I'll save 60% of time on this recurring task."
Quantitative metrics matter, but qualitative outcomes often determine long-term success.
Organizational Level:
"Teams will find AI tools intuitive and helpful rather than frustrating."
"AI recommendations will be trusted and acted upon by experienced staff."
"AI integration will enhance rather than replace human expertise."
"Staff will view AI as augmentation, not threat."
Team Level:
"Team morale will improve with reduced manual work."
"Collaboration will increase with shared AI tools."
"Job satisfaction will rise as team members spend more time on strategic work."
"Inter-team coordination will improve with standardized AI outputs."
Individual Level:
"I'll feel more confident in my outputs."
"My work-life balance will improve."
"I'll have more time for creative and strategic thinking."
"I'll be less frustrated by repetitive tasks."
Efficiency Metrics:
- Time savings on specific tasks
- Throughput improvements
- Resource allocation optimization
- Cost per unit of output
Quality Metrics:
- Accuracy improvements
- Consistency enhancements
- Error reduction rates
- Quality score maintenance or improvement
Adoption Metrics:
- User engagement rates
- Feature utilization levels
- User satisfaction scores
- Return usage rates (do people keep using it?)
Business Impact Metrics:
- Revenue impact
- Cost savings
- Customer satisfaction improvements
- Competitive positioning changes
Veljko Krunic puts it directly in Succeeding with AI: "The business metric must be defined for every single AI project. AI methods are, by their nature, quantitative. The inability to easily recognize business metrics by which an AI project should be measured raises a big red flag. If you can't quantify the business result you're hoping to achieve, you have to ask yourself and your stakeholders whether the project is worth doing."
A valid hypothesis MUST include all five elements:
| Element | Valid Example | Invalid Example |
|---|---|---|
| Named metric | "Response time" | "Efficiency" |
| Baseline number | "Currently 4 hours" | "Currently slow" |
| Target number | "Target 30 minutes" | "Target faster" |
| Timeline | "Within 90 days" | "Soon" |
| Kill criteria | "Abandon if >60 min" | "Review if not working" |
Gate Rule: No hypothesis moves to Action phase without all five elements quantified. "Improve efficiency" isn't a hypothesis—it's a wish.
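The Gate Rule can be enforced mechanically if hypotheses are recorded in a structured form. The sketch below is one possible shape. The `Hypothesis` class and its field names are hypothetical, and the sample values come from the "Valid Example" column of the table (4 hours expressed as 240 minutes).

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class Hypothesis:
    metric: Optional[str] = None            # named metric, e.g. "Response time"
    baseline: Optional[float] = None        # measured baseline number
    target: Optional[float] = None          # target number
    timeline_days: Optional[int] = None     # deadline for evaluation
    kill_threshold: Optional[float] = None  # abandon if the result is worse than this


def missing_elements(h: Hypothesis) -> list:
    """Return the names of the elements that have not been filled in."""
    return [f.name for f in fields(h) if getattr(h, f.name) is None]


def passes_gate(h: Hypothesis) -> bool:
    """Gate Rule: all five elements must be present before the Action phase."""
    return not missing_elements(h)


valid = Hypothesis("Response time (minutes)", baseline=240, target=30,
                   timeline_days=90, kill_threshold=60)
wish = Hypothesis(metric="Efficiency")

print(passes_gate(valid))
print(missing_elements(wish))
```

This check only confirms that each element exists. Whether "Efficiency" is an acceptable named metric still needs human review, since the code cannot tell a measurable metric from a vague one.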
Before finalizing your Hypothesis, verify that your ROI assumptions aren't optimistic fantasy:
| Question | Red Flag |
|---|---|
| Is the baseline measurement accurate? | Estimated, not measured |
| Are time savings realistic? | Based on vendor claims, not pilot data |
| Does improvement justify investment? | Marginal gains with significant cost |
| Can we actually measure the metric? | Requires new instrumentation |
| Will users report honestly? | Self-reported time savings |
If more than two red flags appear, pause. Gather real baseline data before proceeding. Over-optimistic hypotheses guarantee "failure" when reality is measured—even when actual results are positive.
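The "more than two red flags" rule is easy to apply as a checklist. The following sketch assumes hypothetical flag identifiers, one per row of the table above. The function name `roi_reality_check` is likewise illustrative.

```python
# Hypothetical identifiers, one per red flag in the table above.
ROI_RED_FLAGS = (
    "baseline_estimated_not_measured",
    "savings_based_on_vendor_claims",
    "marginal_gains_significant_cost",
    "metric_requires_new_instrumentation",
    "savings_self_reported",
)


def roi_reality_check(observed) -> str:
    """Return 'pause' when more than two red flags are present, else 'proceed'."""
    observed = set(observed)
    unknown = observed - set(ROI_RED_FLAGS)
    if unknown:
        raise ValueError(f"Unknown red flags: {sorted(unknown)}")
    return "pause" if len(observed) > 2 else "proceed"


print(roi_reality_check({"baseline_estimated_not_measured", "savings_self_reported"}))
print(roi_reality_check({
    "baseline_estimated_not_measured",
    "savings_based_on_vendor_claims",
    "savings_self_reported",
}))
```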
Every AI implementation is an experiment. Approach it scientifically:
State your hypothesis clearly. If you can't write it in one sentence, you don't understand what you're testing.
Define measurement methodology. How exactly will you know if the hypothesis is true or false?
Set decision criteria. At what point will you conclude success or failure? What threshold triggers scaling versus abandonment?
Establish timeline. When will you evaluate results? Commit to a date.
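Decision criteria are easiest to honor when they are written down before the pilot starts. The sketch below covers a metric where lower is better, using the response-time figures from the valid-hypothesis table (target 30 minutes, abandon above 60 minutes). The middle "iterate" outcome is an assumption added for illustration; the text itself only names scaling and abandonment.

```python
def pilot_decision(measured: float, target: float, kill_threshold: float) -> str:
    """Decide the next step for a lower-is-better metric such as response time."""
    if target > kill_threshold:
        raise ValueError("target must not be worse than the kill threshold")
    if measured <= target:
        return "scale"
    if measured > kill_threshold:
        return "abandon"
    # Assumption: results between target and kill threshold call for another iteration.
    return "iterate"


for minutes in (25, 45, 75):
    print(minutes, pilot_decision(minutes, target=30, kill_threshold=60))
```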
Before any implementation, you face a fundamental choice. Research shows dramatically different success rates based on approach:
Takers: 67% Success Rate
Use off-the-shelf vendor solutions with minimal customization.
Characteristics:
- Fastest time-to-value (4-8 weeks typically)
- Lowest technical risk
- Vendor handles updates and maintenance
- Limited differentiation from competitors
Best for: Standard business processes, proven use cases, resource-constrained organizations, organizations new to AI.
Example: Implementing ChatGPT Enterprise, Microsoft Copilot, or Salesforce Einstein with standard configurations.
Shapers: 45% Success Rate
Customize vendor solutions for specific needs.
Characteristics:
- Moderate implementation timeline (8-16 weeks)
- Medium technical and organizational risk
- Some internal capability required
- Moderate differentiation possible
Best for: Industry-specific requirements, unique workflows, organizations with technical capability, moderate customization needs.
Example: Configuring Salesforce Einstein with custom data models, building on vendor APIs with specific business logic.
Makers: 33% Success Rate
Build custom AI solutions from scratch.
Characteristics:
- Longest implementation timeline (16+ weeks)
- Highest technical and organizational risk
- Significant internal capability required
- Maximum differentiation potential
Best for: Competitive differentiation requirements, highly unique processes, organizations with strong AI/ML capabilities.
Example: Developing proprietary machine learning models for specialized prediction or recommendation tasks.
Simple tools that work reliably outperform complex customizations requiring constant maintenance.
Organizations fall into the capability trap when they choose Shapers or Makers approaches for problems Takers solutions could solve. They're seduced by customization potential rather than guided by actual business requirements.
50% of employees prefer unsanctioned simple tools over complex official solutions. When your official AI tool is harder to use than ChatGPT, employees will use ChatGPT. Shadow AI exists because approved solutions are too complex.
| Criteria | Takers | Shapers | Makers |
|---|---|---|---|
| Success Rate | 67% | 45% | 33% |
| Time to Value | 4-8 weeks | 8-16 weeks | 16+ weeks |
| Resource Needs | Low | Medium | High |
| Technical Risk | Low | Medium | High |
| Customization | Minimal | Moderate | Maximum |
| Maintenance | Vendor | Shared | Internal |
| Differentiation | Low | Medium | High |
Default to Takers unless you have compelling, documented reasons for Shapers or Makers.
The Hidden Costs of Customization
The success rate differences (67% vs. 45% vs. 33%) reflect more than technical complexity. They reflect adoption friction that compounds over time.
Dr. Janna Lipenkova notes in The Art of AI Product Development: "To minimize adoption barriers, ensure the interface is easy to use and accessible for your team."
Adoption Friction by Approach:
| Approach | Typical Training | User Friction | Support Load |
|---|---|---|---|
| Takers | 1-2 hours | Low (familiar interfaces) | Vendor handles |
| Shapers | 1-2 days | Medium (custom elements) | Shared responsibility |
| Makers | 1+ week | High (entirely new systems) | Internal only |
The Hidden Cost: Custom solutions don't just have higher failure rates—they have higher ongoing support costs, longer training cycles, and greater resistance to adoption. Every hour of training is an hour of resistance, confusion, and "I'll just do it the old way."
When a Takers solution breaks, the vendor fixes it. When a Makers solution breaks, your team fixes it—and your team has other work to do.
Veljko Krunic advises in Succeeding with AI: "Don't start by chasing difficult projects that tie up all your resources and destroy you if they fail. Start instead with simple projects that have big business impact."
High Impact, Low Complexity: Choose use cases that deliver meaningful value without requiring extensive integration. The goal is demonstrable success, not technical achievement.
Willing Participants: Work with enthusiastic early adopters who will provide honest feedback. Skeptics can join Phase 2.
Measurable Outcomes: Select pilots where success can be clearly measured and communicated. Vague benefits can't build organizational momentum.
Representative Challenges: Ensure pilots address real organizational needs, not just interesting technical possibilities. Toy problems don't prove anything.
Favor Takers Approach: Unless you have compelling reasons for customization, start with proven vendor solutions. You can always customize later.
Week 1-2: Setup and Training
- Tool configuration (minimal for Takers, extensive for Makers)
- User training and support materials creation
- Baseline metrics measurement (before AI implementation)
- Communication plan with broader organization
- Support escalation paths defined
Week 3-8: Active Pilot
- Daily usage monitoring and support availability
- Weekly feedback collection from all pilot users
- Iterative improvements based on user input
- Documentation of unexpected challenges and benefits
- Regular check-ins with pilot team
Week 9-10: Evaluation and Documentation
- Comprehensive success metrics analysis versus hypotheses
- User satisfaction assessment
- Cost-benefit calculation
- Technical performance review
- Scaling recommendations and requirements documentation
Most AI projects fail before implementation begins—in the pitch meeting. Technical teams present AI in technical terms. Executives need business terms. The gap between those languages is where initiatives die.
Dr. Janna Lipenkova emphasizes in The Art of AI Product Development that effective AI communication must be "precise, business focused, and free of unnecessary technical jargon."
Pilot Communication Protocol:
| Audience | Communication Focus | Avoid |
|---|---|---|
| Executive sponsors | Business outcome progress | Technical metrics |
| Pilot users | Daily workflow impact | Roadmap promises |
| IT/Security | Stability and compliance | Performance benchmarks |
| Skeptics | Honest challenges + solutions | Overselling results |
Weekly Pilot Updates Must:
- Lead with business outcome progress, not technical status
- Include honest challenges (builds trust faster than false optimism)
- Avoid jargon that creates distance between teams
- Show user perspective, not project perspective
The same project needs different translations for different audiences. What resonates with your CFO will confuse your pilot users. What excites your IT team will bore your executive sponsor.
Team: Customer support optimizing response workflows
Situation: 4-hour average response time, inconsistent quality, team burnout from repetitive questions.
Hypothesis: AI-powered response suggestions can reduce average response time to 30 minutes while improving consistency and reducing team stress.
Action: 2-week pilot with 3 team members. Test different AI assistance levels—from simple templates to full draft generation.
Takers Approach: Use existing helpdesk platform's AI features rather than building custom solution.
Success Criteria: >50% time savings, quality maintained (measured by customer satisfaction scores), team satisfaction improved (measured by weekly pulse survey).
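It helps to check how the hypothesis and the success criterion relate numerically. The sketch below computes percentage time savings for this example; the function name is hypothetical. Reaching the 30-minute hypothesis from a 4-hour baseline would be well above the 50% threshold, while a result of exactly 2 hours would not clear a strict ">50%" criterion.

```python
def percent_time_savings(baseline_minutes: float, new_minutes: float) -> float:
    """Percentage reduction in time relative to the measured baseline."""
    if baseline_minutes <= 0:
        raise ValueError("baseline must be positive")
    return (baseline_minutes - new_minutes) / baseline_minutes * 100


baseline = 4 * 60  # 4-hour average response time, in minutes

hypothesis_savings = percent_time_savings(baseline, 30)   # hypothesis target
borderline_savings = percent_time_savings(baseline, 120)  # exactly half the baseline

print(hypothesis_savings, hypothesis_savings > 50)
print(borderline_savings, borderline_savings > 50)
```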
Individual: Consultant spending 10 hours/week on client reporting.
Situation: Manual reporting across 5 clients, inconsistent formats, time-consuming data gathering and synthesis.
Hypothesis: AI-assisted reporting can reduce time from 10 hours to 2 hours while maintaining personalization and quality.
Action: Create reporting templates for each client type. Test AI assistance for data synthesis, formatting, and first-draft generation over 3 weeks.
Process: Refine templates based on client feedback. Standardize workflow for maximum efficiency.
Evaluation: Track time savings weekly. Assess client satisfaction monthly. Iterate quarterly.
Task: Optimizing executive summary generation.
Situation: Current prompts produce inconsistent quality—sometimes too technical, sometimes too vague, often missing key points.
Hypothesis: A refined prompt with clear PAST elements will deliver usable summaries 90% of the time with less than 5 minutes of editing.
Action: Test 3 different prompt variations over 15 summary generations. Document which produces best results against quality checklist.
Process: Once the best approach is identified, create a template and refine it based on actual ongoing usage.
Evaluation: After 30 uses, assess consistency, editing time required, and areas for improvement.
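The 30-use evaluation can be tallied from a simple usage log. This is a sketch under stated assumptions: the `PromptUse` record and `evaluate_prompt` function are hypothetical, and "less than 5 minutes of editing" is interpreted as the average editing time across usable drafts. The sample log is invented.

```python
from dataclasses import dataclass


@dataclass
class PromptUse:
    usable: bool         # was the first draft usable?
    edit_minutes: float  # time spent editing the output


def evaluate_prompt(uses, min_usable_rate=0.90, max_edit_minutes=5.0) -> dict:
    """Compare logged results with the hypothesis thresholds."""
    if not uses:
        raise ValueError("no logged uses to evaluate")
    usable_rate = sum(u.usable for u in uses) / len(uses)
    usable_edits = [u.edit_minutes for u in uses if u.usable]
    # Assumption: editing time is averaged over usable drafts only.
    avg_edit = sum(usable_edits) / len(usable_edits) if usable_edits else float("inf")
    return {
        "usable_rate": usable_rate,
        "avg_edit_minutes": avg_edit,
        "hypothesis_met": usable_rate >= min_usable_rate and avg_edit < max_edit_minutes,
    }


# Invented log: 27 usable drafts at 4 minutes of editing, 3 unusable drafts.
log = [PromptUse(True, 4.0)] * 27 + [PromptUse(False, 12.0)] * 3
result = evaluate_prompt(log)
print(result)
```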
Pilots get special attention—selected users, close support, rapid iteration, high visibility.
Scaling gets none of that: broader user base, stretched support, slow iteration, reduced visibility.
Successful pilots must be systematically scaled, not just copied.
The same implementation that worked with 5 enthusiastic users in one department will fail with 50 skeptical users across the organization—unless you systematically adapt your approach.
Before scaling, honestly assess readiness across three dimensions:
Technical Readiness:
- Can infrastructure support broader usage without performance degradation?
- Are integrations stable and secure at scale?
- Is support documentation comprehensive enough for self-service?
- Are troubleshooting playbooks developed from pilot learnings?
Organizational Readiness:
- Are change management processes in place?
- Is additional training capacity available?
- Are success stories compelling and well-documented for skeptics?
- Do pilot users want to champion broader adoption?
Resource Readiness:
- Is budget allocated for expanded implementation?
- Are internal champions identified and empowered?
- Are vendor relationships and contracts scalable?
- Is ongoing support staffed adequately?
Dr. Janna Lipenkova emphasizes in The Art of AI Product Development: "AI's value isn't realized until it's successfully integrated into real-world workflows and embraced by users."
Technical readiness isn't enough. Before scaling, pass this adoption readiness gate:
Pre-Scaling Adoption Check:
| Factor | Question | Ready If... |
|---|---|---|
| Workflow Integration | How many steps change for new users? | ≤3 steps change |
| Interface Accessibility | Can non-technical staff use independently? | No IT support required |
| Training Load | How long until productive? | <4 hours training |
| Fallback Plan | What happens when AI fails? | Documented manual process exists |
| User Advocacy | Do pilot users want to champion expansion? | Active advocacy present |
| Support Scalability | Can support handle 5x users? | Yes, without linear headcount growth |
Scoring: All factors "ready" = proceed with scaling. Any factor "not ready" = address before scaling.
This gate prevents the common failure pattern: technically ready but adoption-doomed scaling attempts. A system that works perfectly but nobody uses delivers zero value.
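Because the gate is all-or-nothing, it can be expressed as a list of failing factors: an empty list means proceed. The sketch below encodes the thresholds from the Pre-Scaling Adoption Check. The class name, field names, and sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AdoptionCheck:
    workflow_steps_changed: int  # steps that change for new users
    needs_it_support: bool       # do non-technical staff need IT help?
    training_hours: float        # time until a new user is productive
    fallback_documented: bool    # documented manual process exists
    pilot_users_advocate: bool   # active advocacy from pilot users
    support_handles_5x: bool     # 5x users without linear headcount growth


def not_ready_factors(c: AdoptionCheck) -> list:
    """Return the factors that fail the gate; an empty list means proceed."""
    checks = {
        "Workflow Integration": c.workflow_steps_changed <= 3,
        "Interface Accessibility": not c.needs_it_support,
        "Training Load": c.training_hours < 4,
        "Fallback Plan": c.fallback_documented,
        "User Advocacy": c.pilot_users_advocate,
        "Support Scalability": c.support_handles_5x,
    }
    return [name for name, ready in checks.items() if not ready]


# Invented pilot results: everything is ready except support scalability.
pilot = AdoptionCheck(2, False, 3.0, True, True, False)
blockers = not_ready_factors(pilot)
print("proceed" if not blockers else f"address first: {blockers}")
```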
Remember the shadow AI lesson: 50% of employees prefer unsanctioned simple tools over complex official solutions.
When scaling, resist the temptation to add complexity. The pilot succeeded partly because it was simple. Every additional feature, integration, and approval layer reduces success probability.
The Customization Trap:
Someone will request customization during scaling. "Can we add integration with System X?" "Can we build a custom dashboard?" "Can we modify the workflow for our specific needs?"
These requests seem reasonable. They often kill scaling initiatives.
67% of Takers implementations succeed versus 33% of Makers. The difference is simplicity. Every custom layer reduces success probability.
Ask before approving customization:
- Does this customization deliver measurable business value that justifies the added complexity?
- Would users actually prefer the complex version to the simple version?
- Does this move us from Takers toward Shapers or Makers territory?
- Can we achieve 80% of the benefit without this complexity?
Schedule a complexity check every month during scaling. Complexity creeps in one "reasonable" request at a time.
Monthly Complexity Check:
| Question | Red Flag |
|---|---|
| Are we adding features users didn't request? | Yes |
| Is training time increasing? | By >20% |
| Are support tickets increasing per user? | Yes |
| Would shadow AI tools still be simpler? | Yes |
| Are customization requests driving scope? | Yes |
| Is the pilot version now "legacy"? | Yes |
The Rule: If 3+ red flags appear, pause scaling and simplify. With every month that passes without this check, the system becomes harder to simplify.
The best implementations look almost the same at month 12 as they did at month 1, because the team resisted complexity at every decision point.
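The Monthly Complexity Check can be run from a handful of tracked numbers. The sketch below is illustrative only: the `MonthlySnapshot` fields are hypothetical, and it assumes that training time and tickets per user are compared with the previous month's values, which the table does not specify.

```python
from dataclasses import dataclass


@dataclass
class MonthlySnapshot:
    unrequested_features_added: bool
    training_hours_previous: float      # assumption: compared with last month
    training_hours_now: float
    tickets_per_user_previous: float    # assumption: compared with last month
    tickets_per_user_now: float
    shadow_tools_still_simpler: bool
    customization_driving_scope: bool
    pilot_version_is_legacy: bool


def red_flags(s: MonthlySnapshot) -> list:
    """List the red flags raised by this month's snapshot."""
    checks = {
        "features users didn't request": s.unrequested_features_added,
        "training time up >20%": s.training_hours_now > s.training_hours_previous * 1.20,
        "support tickets per user increasing": s.tickets_per_user_now > s.tickets_per_user_previous,
        "shadow AI tools still simpler": s.shadow_tools_still_simpler,
        "customization requests driving scope": s.customization_driving_scope,
        "pilot version now legacy": s.pilot_version_is_legacy,
    }
    return [name for name, raised in checks.items() if raised]


def should_pause(s: MonthlySnapshot) -> bool:
    """The Rule: 3 or more red flags means pause scaling and simplify."""
    return len(red_flags(s)) >= 3


# Invented snapshot: training up 25%, tickets rising, shadow tools still simpler.
month = MonthlySnapshot(False, 2.0, 2.5, 0.4, 0.6, True, False, False)
print(red_flags(month), should_pause(month))
```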
Phase 1: Adjacent Expansion
Expand to similar use cases within the same department. Add users with similar skill levels and responsibilities.
During Phase 1:
- Maintain close monitoring and support
- Keep support ratios high (more support per user than at scale)
- Document all issues and solutions
- Keep it simple: Resist customization requests until Phase 2 is complete
Phase 1 Success Criteria:
- Adoption metrics maintained from pilot
- Minimal new issue types (problems should be familiar from pilot)
- Users can self-serve for common issues
- Support load is manageable
Phase 2: Cross-Functional Integration
Connect AI workflows across departments. This is where organizational complexity enters.
During Phase 2:
- Develop cross-functional success metrics
- Build organizational AI governance processes
- Create integration points between department workflows
- Selective complexity: Add integrations only when business value clearly justifies complexity
Phase 2 Success Criteria:
- Cross-functional workflows demonstrating value
- Governance processes preventing shadow AI growth
- Support scalable without linear headcount increase
- Integration points stable and documented
Phase 3: Strategic Integration
Integrate AI capabilities with core business processes. This is where competitive advantage emerges—or where over-engineering destroys value.
During Phase 3:
- Integrate AI with core business processes
- Develop AI-enhanced competitive advantages
- Build organizational AI capabilities and culture
- Strategic customization: Custom development only for genuine competitive differentiation
Phase 3 Success Criteria:
- Measurable competitive advantage from AI capabilities
- AI embedded in core business processes
- Organization has AI capability as strategic asset
- Innovation pipeline producing new AI applications
From pilot team to adjacent teams:
- Document what worked and why
- Identify early adopter champions in adjacent teams
- Adapt training materials for different contexts
- Maintain simplicity—resist "improvements" that add complexity
From pilot users to full team:
- Phase rollout within team (don't flip the switch for everyone at once)
- Pair pilot veterans with new users
- Create feedback channels for continuous improvement
- Monitor adoption metrics weekly
From single workflow to multiple workflows:
- Apply SHAPE methodology to each new workflow
- Resist temptation to build one complex system for all workflows
- Simple tools for each workflow beat complex unified platforms
From one workflow to similar workflows:
- Apply same prompt patterns to analogous tasks
- Create personal template library
- Iterate based on what actually works versus what should work
From occasional use to daily habit:
- Build AI into existing routines (don't create new routines)
- Remove friction from AI access
- Make AI the default, with manual as fallback
From single tool to integrated toolkit:
- Add tools slowly—master one before adding another
- Integrate tools that work together naturally
- Abandon tools that don't deliver consistent value
AI implementation is not a project—it's an ongoing capability development process.
Technology evolves monthly, new tools emerge, user needs change, and what worked yesterday may not work tomorrow.
Continuous evaluation enables:
- Early detection of degrading performance
- Identification of new opportunities
- Evidence-based scaling decisions
- Organizational learning and capability building
Quantitative Performance Review:
- Success metrics trending analysis (are we still meeting hypotheses?)
- Cost-benefit ratio assessment (is ROI improving or declining?)
- User adoption rate monitoring (growing, stable, or declining?)
- Technical performance optimization (speed, reliability, accuracy)
Qualitative Impact Assessment:
- User satisfaction and feedback analysis (what do users actually say?)
- Organizational culture impact evaluation (how is AI changing how people work?)
- Unexpected benefits and challenges documentation (what surprised us?)
- Strategic value realization progress (are we achieving intended outcomes?)
Simplicity Check:
- Are we adding unnecessary complexity?
- Could we achieve similar results with simpler approaches?
- Are custom features actually being used?
- Would shadow AI users prefer our solution to unsanctioned alternatives?
Capability Assessment:
- What new AI capabilities have been developed?
- How has organizational AI maturity progressed?
- What new opportunities have emerged from our learnings?
Competitive Analysis:
- How do our AI capabilities compare to competitors?
- What new competitive advantages have been created?
- Where are we falling behind industry leaders?
Strategic Alignment Review:
- How well do current AI implementations support business strategy?
- What adjustments are needed based on market changes?
- What new AI investments should be prioritized?
Build vs. Buy Re-evaluation:
- Are our custom solutions still delivering differentiated value?
- Could vendor solutions now address our needs more effectively?
- Should we simplify by migrating custom builds to vendor platforms?
- Has the Takers/Shapers/Makers balance shifted since last review?
Organizations often maintain custom solutions past their strategic value due to sunk cost bias. Every quarterly review should explicitly assess whether custom implementations still justify their complexity.
Quarterly Build vs. Buy Reassessment:
| Question | Action if Yes |
|---|---|
| Are we maintaining custom code that vendors now offer? | Evaluate migration to vendor solution |
| Is our custom solution now commodity capability? | Consider standardization |
| Has vendor ecosystem improved significantly? | Reassess approach (Makers → Shapers → Takers) |
| Are maintenance costs exceeding value delivered? | Simplify or sunset |
| Could we redirect engineering to differentiated work? | Evaluate shift in resource allocation |
The Sunk Cost Trap: "We've already invested so much" is not a valid reason to continue. The question is always: "Given where we are now, what's the best path forward?"
What was differentiated custom work two years ago may now be commodity capability that vendors do better, cheaper, and with less maintenance burden.
Team level:
Weekly: Quick metrics check, user satisfaction pulse. Are we still on track?
Monthly: Comprehensive workflow analysis, efficiency gains measurement, team feedback synthesis.
Quarterly: Team capability assessment, strategic alignment review, next quarter planning.
Individual level:
Daily: Quick reflection: what worked, what didn't, what to try tomorrow.
Weekly: Time savings analysis, quality assessment, habit formation check.
Monthly: Workflow optimization review, skill development assessment, tool utilization audit.
Quarterly: Major iteration decision: continue, modify, or abandon current approaches.
Prompt level:
After 5 uses: Does this prompt consistently deliver acceptable results?
After 10 uses: Should I refine or replace this prompt?
After 20 uses: Is this prompt optimized, or does it need significant overhaul?
After 50 uses: Time for complete review. Technology may have changed. Use case may have evolved. Start SHAPE cycle again.
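The use-count checkpoints above can be kept as a small lookup so a review is not forgotten. This is a minimal sketch, not part of the SHAPE method itself: the names `PROMPT_CHECKPOINTS` and `checkpoint_question` are hypothetical, and it assumes you log how many times each prompt has been used.

```python
from typing import Optional

# Checkpoint questions taken from the use-count schedule above.
# The 50-use wording is condensed for illustration.
PROMPT_CHECKPOINTS = {
    5: "Does this prompt consistently deliver acceptable results?",
    10: "Should I refine or replace this prompt?",
    20: "Is this prompt optimized, or does it need significant overhaul?",
    50: "Complete review: start the SHAPE cycle again.",
}


def checkpoint_question(use_count: int) -> Optional[str]:
    """Return the review question due at this use count, or None if none is due."""
    return PROMPT_CHECKPOINTS.get(use_count)


if __name__ == "__main__":
    for uses in (5, 7, 50):
        print(uses, "->", checkpoint_question(uses))
```

A lookup keyed on the exact count keeps the sketch faithful to the four checkpoints in the text. What happens after the 50th use (for example, resetting the counter) is left open because the guide does not specify it.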
Evaluation should drive clear decisions:
Continue: Metrics meeting or exceeding hypotheses. User satisfaction high. No simplification opportunities.
Expand: Strong performance with clear adjacent opportunities. Scaling readiness confirmed.
Modify: Performance acceptable but improvable. Specific changes identified. Resources available for iteration.
Simplify: Performance acceptable but complexity higher than necessary. Opportunity to reduce without losing value.
Pause: Performance declining or stagnant. Investigation needed. Don't scale while diagnosing problems.
Abandon: Hypotheses clearly false. Cost-benefit negative. Resources better deployed elsewhere.
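One way to make these six outcomes repeatable is to encode them as an ordered set of checks. The sketch below is illustrative only. The `Review` fields, the function name `decide`, and especially the order in which the conditions are tested are assumptions. The guide lists the outcomes but does not rank them, and your own kill criteria should replace the boolean flags.

```python
from dataclasses import dataclass


@dataclass
class Review:
    """Evaluation findings. Every field is a hypothetical simplification."""
    hypotheses_false: bool = False        # hypotheses clearly false
    cost_benefit_negative: bool = False   # costs exceed value delivered
    performance_declining: bool = False   # declining or stagnant
    excess_complexity: bool = False       # more complex than necessary
    changes_identified: bool = False      # specific improvements identified
    adjacent_opportunities: bool = False  # scaling readiness confirmed


def decide(review: Review) -> str:
    """Map findings to one of the six decisions (precedence is an assumption)."""
    if review.hypotheses_false or review.cost_benefit_negative:
        return "Abandon"
    if review.performance_declining:
        return "Pause"
    if review.excess_complexity:
        return "Simplify"
    if review.changes_identified:
        return "Modify"
    if review.adjacent_opportunities:
        return "Expand"
    return "Continue"


if __name__ == "__main__":
    print(decide(Review()))
    print(decide(Review(performance_declining=True, adjacent_opportunities=True)))
```

Checking the negative outcomes first reflects the guidance not to scale while diagnosing problems: a declining initiative returns "Pause" even when adjacent opportunities exist.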
Organization: Regional bank implementing AI for loan processing and customer service
Full SHAPE Application:
Situation Assessment:
- Legacy systems with limited integration capabilities
- Highly regulated environment requiring extensive compliance
- Experienced staff skeptical of automation
- Strong data security requirements
- Technical infrastructure: 6/10
- Data quality: 5/10
- Team readiness: 4/10
- Leadership support: 8/10
Critical Decision: Takers Approach
- Regulatory environment demands proven, compliant solutions
- Staff skepticism suggests need for low-risk initial implementation
- Leadership support means opportunity to build confidence systematically
- Select established vendor with industry compliance features rather than custom build
Hypothesis Definition:
- "AI pre-screening will reduce loan processing time by 30% while maintaining approval accuracy above 98%"
- "Automated customer inquiries will improve response time by 50% without reducing satisfaction scores"
- "Staff will embrace AI tools that enhance rather than replace their expertise" (measured by adoption rates and satisfaction surveys)
Action (Pilot Program):
- Selected: Commercial loan pre-screening for pilot (high volume, clear metrics, willing team)
- 8-week pilot with weekly feedback sessions
- 12 enthusiastic early adopters from lending team
- Clear baseline metrics established before implementation
- Vendor support engaged for training and troubleshooting
Results from Action Phase:
- Processing time reduced 35% (exceeded hypothesis)
- Accuracy maintained at 99.2% (exceeded hypothesis)
- 10 of 12 pilot users rated experience "positive" or "very positive"
- Three unexpected benefits identified: better documentation, fewer manual errors, improved audit trail
Process (Scaling):
- Phase 1: Expand to full lending team (45 users) over 6 weeks
- Phase 2: Add retail loan pre-screening (similar workflow, different product)
- Phase 3: Customer service inquiry routing
- Simplicity principle: Resisted 4 customization requests that would have added complexity without clear ROI
Evaluation Results:
- 6-month review: 87% daily usage rate across expanded user base
- Customer satisfaction maintained (no decline from pre-AI baseline)
- Staff satisfaction improved (less time on routine tasks)
- ROI: 340% return on implementation investment
- Decision: Proceed with Phase 3 customer service implementation
Team: Marketing team optimizing content production workflow
SHAPE Application:
Situation:
- 6-person team producing 20 pieces of content monthly
- Average production time: 8 hours per piece
- Quality inconsistent—some content excellent, some mediocre
- Team spending excessive time on research and first drafts
- High burnout from repetitive writing tasks
Hypothesis:
- "AI-assisted content production will reduce average production time from 8 hours to 4 hours per piece"
- "Content quality scores will remain at or above current average"
- "Team satisfaction will improve as repetitive work decreases"
Action:
- 4-week pilot with 2 team members
- Test AI for: research synthesis, first draft generation, headline variations
- Takers approach: Use existing content platform AI features rather than custom prompts
- Success metrics: time per piece, quality scores from review process, team pulse surveys
Process:
- Weeks 1-2: Setup, training, baseline measurement
- Weeks 3-6: Active 4-week pilot with daily logging
- Weeks 7-8: Evaluation and scaling decision
Results:
- Production time reduced to 5.5 hours average (31% improvement, below 50% hypothesis)
- Quality scores improved slightly (unexpected positive)
- Team satisfaction significantly improved ("I finally have time for strategy")
Scaling Decision: Scale to full team with modified hypothesis (35% improvement realistic, 50% aspirational). Address quality variance through standardized review process.
Individual: Consultant managing multiple client relationships and deliverables
SHAPE Application:
Situation:
- 8 active clients with different communication styles and requirements
- 12 hours weekly spent on client reporting and updates
- Quality inconsistent due to time pressure
- High cognitive load tracking different client contexts
Hypothesis:
- "AI-assisted client management will reduce reporting time from 12 hours to 4 hours weekly"
- "Client satisfaction will remain stable or improve"
- "Cognitive load will decrease (measured by end-of-week energy levels)"
Action:
- 3-week pilot across 3 representative clients
- AI applications: meeting note synthesis, progress report generation, email drafting
- Document what works and what doesn't for each client type
Process:
- Week 1: Establish baseline, test initial approaches
- Week 2: Refine based on what works
- Week 3: Standardize successful approaches
Results:
- Reporting time reduced to 5 hours weekly (58% improvement, short of the 4-hour hypothesis)
- Two clients commented positively on "improved communication"
- Cognitive load decreased (Friday energy significantly better)
Scaling: Roll out to remaining 5 clients over 2 weeks. Create client-specific templates based on pilot learnings.
Task: Weekly executive briefing generation
SHAPE Application:
Situation:
- Current process: 45 minutes gathering information, 30 minutes writing, 15 minutes formatting
- Quality inconsistent—sometimes too detailed, sometimes missing key points
- Executive feedback: "Usually helpful, occasionally misses what I actually need to know"
Hypothesis:
- "A refined prompt structure will produce usable first drafts in 80%+ of weekly briefings"
- "Total preparation time will decrease from 90 minutes to 30 minutes"
- "Executive satisfaction will improve ('always helpful' feedback)"
Action:
- Test 3 prompt variations over 6 weeks (2 weeks each)
- Variation A: Simple structure (situation, key developments, decisions needed)
- Variation B: PAST-structured (purpose, audience, scope, tone specified in prompt)
- Variation C: Example-based (include previous successful briefing as template)
Results:
- Variation A: 60% usable (too generic)
- Variation B: 85% usable (consistent quality)
- Variation C: 75% usable (over-fitted to example)
Process:
- Standardize on Variation B structure
- Create template with customization points for different meeting contexts
- Add checklist for critical items that must be included
Evaluation:
- After 10 weeks: 90% usable first drafts
- Total time: 25 minutes average (72% improvement)
- Executive feedback: "Much better—you always know what I need to know"
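The improvement percentages in the team, individual, and prompt case studies all come from the same calculation: the reduction divided by the baseline. A minimal sketch follows, with the hypothetical function name `percent_improvement`. It assumes a "lower is better" metric such as hours or minutes.

```python
def percent_improvement(baseline: float, current: float) -> float:
    """Percentage reduction from baseline for a lower-is-better metric."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - current) / baseline * 100


if __name__ == "__main__":
    # Figures from the case studies above.
    print(round(percent_improvement(8, 5.5)))   # team: 8 h -> 5.5 h per piece, prints 31
    print(round(percent_improvement(12, 5)))    # individual: 12 h -> 5 h weekly, prints 58
    print(round(percent_improvement(90, 25)))   # prompt: 90 min -> 25 min, prints 72
```

Running the same function against the hypothesis target (for example, 8 hours to 4 hours is 50%) gives the number the result should be compared with when making the scaling decision.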
Simple enough to remember. Five phases, five letters. You can recall it in any meeting, any situation.
Comprehensive enough to prevent major failures. Each phase addresses a common failure mode. Situation prevents bad assumptions. Hypothesis prevents unmeasurable goals. Action prevents analysis paralysis. Process prevents scaling disasters. Evaluation prevents stagnation.
Flexible enough to apply at any level. The same methodology that guides enterprise implementations guides individual prompt optimization. The logic scales because execution patterns are consistent.
Systematic enough to ensure consistency. Following SHAPE means following the same approach every time. Consistency enables organizational learning. Learning enables improvement.
PAST gives you strategic clarity: What are you trying to achieve (Purpose)? Who does it serve (Audience)? What are the boundaries (Scope)? How should it feel (Tone)?
SHAPE gives you execution methodology: Where are you now (Situation)? What does success look like (Hypothesis)? How will you test (Action)? How will you scale (Process)? How will you improve (Evaluation)?
Together: Complete system from "why are we doing this?" to "did it work and what's next?"
67% vs. 33%: Takers implementations succeed at twice the rate of Makers. Simplicity wins.
50%: Half of employees prefer unsanctioned simple tools. Complexity kills adoption.
95%: The share of AI pilots that never reach enterprise-wide deployment. Systematic methodology is designed to prevent this.
If you have a specific implementation challenge:
- Start with Situation assessment—30 minutes of honest evaluation
- Write your Hypothesis in one sentence with measurable criteria
- Choose your approach: Takers, Shapers, or Makers (default to Takers)
- Design a minimal pilot with clear timeline and success criteria
- Document everything for the Process phase
If you need strategic clarity first: Start with the PAST Framework Guide to establish Purpose, Audience, Scope, and Tone before applying SHAPE for execution.
If you're ready for comprehensive implementation: The Strategic Frameworks Bundle (PAST + SHAPE) provides complete methodology from strategy through evaluation.
Pilots succeed. Scaling fails—unless you have systematic methodology.
Simplicity scales better than complexity.
Takers approach (67% success) beats Makers (33%).
SHAPE works because execution patterns are consistent across levels.
Whether implementing enterprise AI or optimizing prompts, follow the same discipline: Situation, Hypothesis, Action, Process, Evaluation.
The question isn't whether AI can help you. The question is whether you'll implement it systematically enough to actually realize that help.
SHAPE is your answer.
Use this worksheet for any implementation initiative—organizational, team, individual, or prompt level.
Implementation Name: ________________________________
Level: [ ] Organizational [ ] Team [ ] Individual [ ] Prompt
Date Started: _____________
SITUATION ASSESSMENT
Current state description:
Key pain points (list top 3):
Readiness Scores (1-10):
- Technical readiness: ___
- Data quality: ___
- Team/user readiness: ___
- Leadership/personal support: ___
- Change capacity: ___
Total Score: ___ /50
Baseline metrics to track:
HYPOTHESIS DEFINITION
Quantitative success hypothesis:
"_____________________________________________________________"
Qualitative success hypothesis:
"_____________________________________________________________"
Timeline for evaluation: _____________
Decision criteria (what number/outcome triggers scaling vs. abandonment?):
ACTION PLANNING
Implementation approach: [ ] Takers (67%) [ ] Shapers (45%) [ ] Makers (33%)
Justification for approach choice:
Pilot scope (users, duration, use case):
Pilot success criteria:
Key risks and mitigation:
PROCESS PLANNING
Phase 1 (Adjacent Expansion):
Phase 2 (Cross-Functional/Broader):
Phase 3 (Strategic Integration):
Simplicity checkpoints—what customization requests will you resist?
EVALUATION SCHEDULE
Weekly check: _____________
Monthly review: _____________
Quarterly assessment: _____________
Key metrics to track:
Decision framework (continue/expand/modify/simplify/pause/abandon criteria):
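The readiness total in the worksheet's Situation Assessment is the sum of five scores, each from 1 to 10. The sketch below shows one way to tally and validate it. The dimension keys and the function name `readiness_total` are hypothetical, and the example scores are invented for illustration, not taken from any case study.

```python
# Keys mirror the five worksheet lines; the key names are an assumption.
READINESS_DIMENSIONS = (
    "technical",
    "data_quality",
    "team_user",
    "leadership_personal",
    "change_capacity",
)


def readiness_total(scores: dict) -> int:
    """Sum the five readiness scores after checking each is present and in 1-10."""
    for dimension in READINESS_DIMENSIONS:
        if dimension not in scores:
            raise ValueError(f"missing score: {dimension}")
        if not 1 <= scores[dimension] <= 10:
            raise ValueError(f"{dimension} must be between 1 and 10")
    return sum(scores[d] for d in READINESS_DIMENSIONS)


if __name__ == "__main__":
    example = {  # illustrative values only
        "technical": 6,
        "data_quality": 5,
        "team_user": 4,
        "leadership_personal": 8,
        "change_capacity": 7,
    }
    print(f"Total Score: {readiness_total(example)} /50")
```

The worksheet does not define cut-offs for the total, so the sketch stops at the sum. Any threshold for "ready" would be your own decision criterion.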
Complete before every implementation decision.
Choose TAKERS (67% success rate) if:
- Proven vendor solutions exist for this use case
- Your needs are similar to other organizations
- You want fastest time-to-value (4-8 weeks)
- Resources are limited
- Risk tolerance is low
- This is your first AI implementation in this area
Choose SHAPERS (45% success rate) if:
- Industry-specific requirements exist that vendors don't address
- Some customization needed but not building from scratch
- You have technical capability for configuration
- Timeline is moderate (8-16 weeks acceptable)
- Business case clearly justifies customization cost
- You've successfully implemented Takers solutions before
Choose MAKERS (33% success rate) if:
- True competitive differentiation requires custom solution
- No vendor solutions address your unique needs
- You have strong AI/ML engineering capability
- Timeline is flexible (16+ weeks acceptable)
- Strategic value justifies high investment and risk
- You've successfully implemented Shapers solutions before
Default to TAKERS unless compelling, documented reasons exist for alternatives.
Selected approach: _____________
Justification: _______________________________________________________________
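The "default to Takers" rule can be written down as a guard. In this sketch, `select_approach` is a hypothetical helper. It assumes two things the checklists suggest but do not state as hard rules: that a blank justification always falls back to Takers, and that the "successfully implemented ... before" items act as prerequisites.

```python
APPROACHES = ("Takers", "Shapers", "Makers")

# Ladder implied by the checklists: Shapers after Takers, Makers after Shapers.
PREREQUISITE = {"Shapers": "Takers", "Makers": "Shapers"}


def select_approach(preferred: str, justification: str = "", prior_successes=()) -> str:
    """Return the approach to use, falling back to Takers when in doubt."""
    if preferred not in APPROACHES:
        raise ValueError(f"unknown approach: {preferred!r}")
    if preferred == "Takers":
        return "Takers"
    if not justification.strip():
        return "Takers"  # no documented reason, so use the default
    if PREREQUISITE[preferred] not in prior_successes:
        return "Takers"  # prerequisite experience missing (assumed to be a hard gate)
    return preferred


if __name__ == "__main__":
    print(select_approach("Makers"))
    print(select_approach("Shapers", "Vendor lacks required audit export", {"Takers"}))
```

The function never upgrades a request: it either honors the preferred approach or falls back to the safest one.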
Complete before scaling from pilot.
Technical Readiness:
- System performance stable under increased load
- Integrations reliable and documented
- Support processes can handle projected growth
- Security and compliance requirements met
- Simplicity preserved from pilot version (no scope creep)
Organizational Readiness:
- Leadership committed to continued investment
- Change management resources allocated
- Success metrics clearly demonstrate value (with evidence)
- Internal champions identified and empowered
- Resistance plan developed (skeptics addressed)
User Readiness:
- Pilot users can mentor or train others
- Training materials comprehensive and tested
- Support documentation covers common issues
- Onboarding process streamlined and fast
- Quality standards understood and accepted
If any section has unchecked items: Address before scaling. Scaling before readiness is a leading cause of implementation failure.
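The scaling gate is all-or-nothing: one unchecked item in any section blocks the rollout. A minimal sketch of that rule follows. The function names `unchecked_items` and `ready_to_scale` are hypothetical, and the example checklist is abbreviated.

```python
def unchecked_items(checklist: dict) -> list:
    """List every unchecked item as 'Section: item', in checklist order."""
    return [
        f"{section}: {item}"
        for section, items in checklist.items()
        for item, checked in items.items()
        if not checked
    ]


def ready_to_scale(checklist: dict) -> bool:
    """True only when every item in every section is checked."""
    return not unchecked_items(checklist)


if __name__ == "__main__":
    checklist = {  # abbreviated, illustrative answers
        "Technical": {"Performance stable under load": True,
                      "Integrations documented": False},
        "Organizational": {"Internal champions identified": True},
        "User": {"Training materials tested": True},
    }
    print(ready_to_scale(checklist))
    for gap in unchecked_items(checklist):
        print("Address before scaling ->", gap)
```

Returning the list of gaps, not only a yes/no, gives the team a concrete to-do list for the "address before scaling" step.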
Monthly check to avoid the customization trap.
- Are we adding complexity that users didn't request?
- Could simpler approaches achieve 80% of the value?
- Are custom features actually being used (check usage data)?
- Is onboarding still as simple as during pilot phase?
- Do users prefer our solution to shadow AI alternatives?
- Can we migrate any custom builds to vendor solutions now?
- Are we still using Takers approach, or drifting toward Makers?
- Would a new user find this intuitive without extensive training?
Scoring:
- 6-8 checkboxes green: Simplicity maintained
- 4-5 checkboxes green: Complexity creep warning—review and simplify
- 3 or fewer green: Pause scaling. Simplify before continuing.
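The scoring bands above translate directly into a small function. This is a sketch with the hypothetical name `simplicity_verdict`. It assumes you have already judged each of the eight questions as green or not, since for some questions a "no" answer is the healthy one.

```python
def simplicity_verdict(green_count: int) -> str:
    """Map the number of green checks (0-8) to the audit outcome."""
    if not 0 <= green_count <= 8:
        raise ValueError("green_count must be between 0 and 8")
    if green_count >= 6:
        return "Simplicity maintained"
    if green_count >= 4:
        return "Complexity creep warning: review and simplify"
    return "Pause scaling. Simplify before continuing."


if __name__ == "__main__":
    for count in (8, 5, 3):
        print(count, "->", simplicity_verdict(count))
```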
Track metrics across evaluation cycles.
Initiative: ________________________________
| Metric | Baseline | Month 1 | Month 2 | Month 3 | Hypothesis |
|---|---|---|---|---|---|
Monthly Notes:
Month 1: _______________________________________________________________
Month 2: _______________________________________________________________
Month 3: _______________________________________________________________
Quarterly Decision: [ ] Continue [ ] Expand [ ] Modify [ ] Simplify [ ] Pause [ ] Abandon
Rationale: _______________________________________________________________
Book-sourced gates and checks to strengthen each SHAPE phase.
Situation Enhancements:
- Self-hosted vs. API decision documented (Infrastructure Decision Matrix)
- Shadow AI audit completed (what unofficial tools are users choosing?)
- Reality gap between documented and actual processes identified
Hypothesis Enhancements:
- Metric Readiness Gate passed (all 5 elements quantified)
- ROI Reality Check completed (no more than 2 red flags)
- Kill criteria are specific numbers, not vague conditions
Action Enhancements:
- Communication Protocol defined for each stakeholder group
- Takers approach selected unless documented justification for alternatives
- Pilot updates lead with business outcomes, not technical status
Process Enhancements:
- Adoption Readiness Gate passed before scaling (all factors "ready")
- Complexity Creep Audit scheduled monthly
- Simplification triggers documented
Evaluation Enhancements:
- Build vs. Buy Re-evaluation on quarterly schedule
- Sunk cost bias explicitly addressed in reviews
- Sunset criteria defined for all custom solutions
Jim Christian helps organizations implement AI systematically through observation-based frameworks rather than ideological debates about AI's societal impact.
With over 30 years in technology consulting and digital transformation, Jim has worked across industries from Fortune 500 financial institutions to regional SMEs, consistently focusing on one question: What actually works?
This curiosity-driven approach led to the development of the PAST and SHAPE frameworks featured in this guide—systematic methodologies now used by organizations worldwide to turn scattered AI experiments into sustainable competitive advantages.
Jim is the author of Signal Over Noise, a weekly newsletter serving professionals who need practical AI implementation guidance without the hype.
Based in Valencia, Spain, Jim works with clients globally, bringing three decades of implementation experience to organizations ready to move from AI experimentation to systematic advantage.
For ongoing AI implementation insights and strategic guidance, subscribe to Signal Over Noise at signalovernoise.at
For strategic clarity methodology, see The PAST Framework Field Guide.
PAST + SHAPE = Complete system from strategy to implementation.
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.