05 User Research & Validation Plan
1 Key Assumptions to Validate
Critical assumptions grouped by category and prioritized by risk. Validate these before the MVP build to de-risk the $350K investment.
| Assumption | Risk | Method | Target Evidence |
|---|---|---|---|
| **Problem Assumptions** | | | |
| Prompts scattered across Notion/files/chats (no central repo) | High | Interviews/surveys | 70%+ confirm multi-tool chaos |
| No version control leads to lost effective prompts weekly | High | Interviews/observation | 60%+ report revert needs |
| Manual multi-model testing >30min per prompt | High | Surveys/timed tasks | Avg time >30min, 80% frustration |
| Teams duplicate prompts (50%+ effort waste) | High | Team interviews | 50%+ duplication stories |
| No performance metrics; can't identify best prompts | Medium | Surveys | 75%+ desire analytics |
| Sharing prompts via copy-paste loses context | Medium | Interviews | 65%+ collaboration pain |
| Prompt tweaking untracked, leads to inconsistency | Medium | Observation | 70%+ tweak frequently |
| **Solution Assumptions** | | | |
| Users adopt dedicated prompt manager over docs | High | Landing page/prototype | >5% signup rate |
| Git-like versioning valued by 70%+ | Medium | Interviews/prototype | 70%+ positive reaction |
| Multi-model testing saves >50% time | Critical | Wizard of Oz | 80%+ accuracy/usability score |
| Analytics drive prompt optimization | High | Prototype feedback | 75%+ report value |
| Team collab features essential for teams | High | Interviews | 60%+ teams prioritize |
| VS Code integration boosts daily use | Medium | Surveys | 50%+ current VS Code users interested |
| API/webhooks needed for prod workflows | Medium | Interviews | 40%+ prod users confirm |
| **Business Assumptions** | | | |
| Pro tier ($19/mo) acceptable to individuals | Critical | Pricing surveys/pre-orders | 40%+ willing at target |
| CAC <$50 via AI communities | High | Ad tests | <$50 for 5%+ signup |
| 10% free-to-paid conversion | High | Waitlist follow-up | >5% pre-order |
| Team tier ($49/user) viable for 10+ person teams | High | Interviews | 50%+ teams at price |
| Low churn (<5%/mo) from prompt lock-in | Medium | Prototype retention | >70% weekly return |
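The CAC assumption above (<$50 via AI communities at a 5%+ signup rate) reduces to simple arithmetic that can be sanity-checked against the planned $500 ad budget. A minimal sketch; the function name and inputs are illustrative, not part of any tracking tool:

```python
def cac(ad_spend, clicks, signup_rate):
    """Customer acquisition cost: ad spend divided by resulting signups."""
    signups = clicks * signup_rate
    return ad_spend / signups if signups else float("inf")
```

For example, $500 of ads driving 250 clicks at a 5% signup rate yields 12.5 signups, a $40 CAC, under the <$50 target.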
2 Customer Discovery Interview Guide (60-90 min)
Target 25 interviews: 60% AI engineers, 20% prompt engineers, 20% consultants. Recruit via LinkedIn (AI/ML groups), Reddit (r/PromptEngineering, r/MachineLearning), Twitter. Incentive: $50 Amazon card + free Pro access.
Part 1: Background (10 min)
- Tell me about your role and daily LLM usage.
- How many prompts do you manage weekly?
- Team size using AI?
Part 2: Problem Exploration (20 min)
- Walk me through last time you tested a prompt across models.
- How often do you lose track of good prompts?
- Describe sharing a prompt with your team—what went wrong?
- How much time/money wasted on re-testing?
- What's the worst prompt chaos story?
Part 3: Current Solutions (15 min)
- What tools (Notion, spreadsheets, Langchain Hub)? Likes/dislikes?
- Ever switched? Why?
- What would make you switch to a dedicated tool?
Part 4: Solution Exploration (15 min)
- Pitch: a tool with Git-style versioning, multi-model testing, and team sharing. What's your initial reaction?
- Which features are most and least valuable to you?
- Any concerns (security, LLM API costs)?
- What would you expect to pay? Who would need to approve the purchase?
Part 5: Wrap-up (10 min)
- On a scale of 1-10, how painful is prompt management for you today?
- Would you join a beta? Can you refer anyone else we should talk to?
Logistics: record sessions (Otter.ai); use a note template capturing quotes, reactions, and pricing signals. Analyze for patterns after the first 15 interviews.
3 Survey Designs
Screening Survey (Target: 300 responses, Typeform/LinkedIn)
- Role? [AI Engineer, Prompt Engineer, ML Engineer, Consultant, Other]
- Daily/weekly LLM prompt usage? [Daily, Weekly, Rarely]
- Prompts managed? [<50, 50-200, 200+]
- Pain scale 1-10: prompt organization/testing?
- Current tools? [Notion, Spreadsheets, None, Other]
- Team collab needs? [Yes/No]
- Interview interest? ($50 card) [Yes: email]
Validation Survey (Post-screening, 200+ responses)
- Problem frequency (e.g., lost prompts: Never-Always)
- Solution satisfaction (1-10 per tool)
- A/B messaging: "Version prompts like code" vs. "AI prompt analytics hub"
- Van Westendorp: Too cheap/expensive thresholds for $19/mo
- Demographics: Team size, LLM spend/mo
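The Van Westendorp analysis above can be computed from the four price answers per respondent (too cheap, bargain, getting expensive, too expensive) by finding where the cumulative curves cross. A minimal sketch using a naive grid search, with ties resolving to the lowest grid price; not a full implementation:

```python
def van_westendorp(responses, price_grid):
    """Estimate Van Westendorp price points from survey answers.

    responses: list of (too_cheap, bargain, expensive, too_expensive)
    tuples, one per respondent. price_grid: candidate prices to scan.
    """
    n = len(responses)

    def pct(idx, price, at_or_above):
        # Cumulative share of respondents answering at/above or at/below a price.
        vals = [r[idx] for r in responses]
        hits = sum(v >= price for v in vals) if at_or_above else sum(v <= price for v in vals)
        return hits / n

    def crossing(curve_a, curve_b):
        # Price where the two cumulative curves are closest (naive grid search).
        return min(price_grid, key=lambda p: abs(curve_a(p) - curve_b(p)))

    too_cheap = lambda p: pct(0, p, True)
    cheap     = lambda p: pct(1, p, True)
    expensive = lambda p: pct(2, p, False)
    too_exp   = lambda p: pct(3, p, False)

    pmc = crossing(too_cheap, expensive)   # point of marginal cheapness
    opp = crossing(too_cheap, too_exp)     # optimal price point
    pme = crossing(cheap, too_exp)         # point of marginal expensiveness
    return pmc, opp, pme
```

The [PMC, PME] range brackets acceptable pricing; if $19/mo falls outside it, revisit the Pro tier assumption.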
4-6 Experiments: Landing Page, Prototypes, Fake Door
Landing Page (Carrd, $500 FB/LinkedIn ads)
- A/B Headlines: "Version Control for AI Prompts", "Test Prompts Across LLMs Instantly", "End Prompt Chaos for Teams"
- Waitlist signup + fake door ($19/mo button)
- Success: 1K visitors, 7% signup, <10% bounce
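Before declaring an A/B headline winner at these traffic levels, check that the signup-rate difference clears statistical significance. A standard two-proportion z-test, sketched with the stdlib only (variable names are illustrative):

```python
from math import erf, sqrt

def two_prop_z(signups_a, visitors_a, signups_b, visitors_b):
    """Two-sided two-proportion z-test on the signup rates of two variants."""
    p_a = signups_a / visitors_a
    p_b = signups_b / visitors_b
    pooled = (signups_a + signups_b) / (visitors_a + visitors_b)
    se = sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    z = (p_a - p_b) / se
    # Two-sided p-value via the standard normal CDF.
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value
```

With ~500 visitors per variant, only large rate differences will reach p < 0.05, so treat close headline results as ties.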
Prototype Testing (Rec: Wizard of Oz)
- Google Form input → Manual LLM tests → Email results
- Test 15 users: versioning, multi-model, analytics
- Cost: $0 + 20hrs; Metrics: NPS>40, 80% repeat request
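The NPS>40 target uses the standard formula: percent promoters (scores 9-10) minus percent detractors (0-6), with passives (7-8) counted only in the denominator. A one-function sketch:

```python
def nps(scores):
    """Net Promoter Score from 0-10 'would you recommend?' answers."""
    n = len(scores)
    promoters = sum(s >= 9 for s in scores)
    detractors = sum(s <= 6 for s in scores)
    return round(100 * (promoters - detractors) / n)
```

Note that with only 15 respondents a single detractor moves the score by ~7 points, so treat the >40 threshold as directional.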
Fake Door/Pre-Order
- Show tiers post-signup; Stripe pre-order (refundable)
- Success: 12% click, 3% pay ($19 x 10+)
7 8-Week Validation Timeline
Wk 1-2: Problem
15 interviews, 300 screening surveys, pattern analysis.
Wk 3-4: Solution
Landing page A/B ($500 ads), 100+ waitlist, 25 validation surveys.
Wk 5-6: Pricing
15 pricing interviews, Van Westendorp, fake door/pre-orders (target 10).
Wk 7-8: Prototype
Wizard of Oz for 20 users, NPS/feedback, iterate.
| Metric | Target | Actual | Pass? |
|---|---|---|---|
| Pain confirmation (interviews) | 80%+ | | |
| Landing signup | >7% | | |
| Price acceptance | 60%+ @ $19 | | |
| Pre-orders | 10+ | | |
| Prototype NPS | >40 | | |
If 4 of 5 metrics pass, proceed to MVP. Total validation budget: $2K.
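The 4-of-5 pass rule can be encoded as a simple scorecard for the decision meeting. The `results` values below are illustrative placeholders for the table's empty "Actual" column, not real data:

```python
def go_no_go(results, min_passes=4):
    """Score validation metrics; return (passes, proceed-to-MVP?)."""
    passes = sum(target(actual) for actual, target in results.values())
    return passes, passes >= min_passes

# Illustrative placeholder values -- replace with measured results.
results = {
    "pain_confirmation": (0.82, lambda v: v >= 0.80),  # 80%+ target
    "landing_signup":    (0.06, lambda v: v > 0.07),   # >7% target
    "price_acceptance":  (0.63, lambda v: v >= 0.60),  # 60%+ @ $19
    "pre_orders":        (11,   lambda v: v >= 10),    # 10+ target
    "prototype_nps":     (44,   lambda v: v > 40),     # >40 target
}
```

With these placeholders, four metrics pass (landing signup misses), which still clears the go bar.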
8 Research Synthesis Template
Problem Summary
- Top pains + quotes
- Invalidated assumptions
- Unexpected insights
Solution Summary
- Must-have features
- Low-interest
- UX/integration needs
Pricing Summary
- Optimal price
- Segment sensitivity
- Model prefs
GTM Insights
- Channels
- Discovery
- Objections
Next step: execute Week 1 immediately; schedule 5 interviews this week for quick signals.