05. User Research & Validation Plan
Strategy to validate the "Git for Prompts" hypothesis and de-risk the PromptVault development roadmap.
1 Critical Assumptions Matrix
| Assumption Category | Hypothesis | Risk Level | Validation Target |
|---|---|---|---|
| Problem | Engineers currently lose significant time searching for or recreating lost prompts. | HIGH | 80% of interviewees admit to losing a high-value prompt in the last month. |
| Problem | Teams struggle to maintain consistency across members (e.g., everyone using different prompt versions). | MED | 5+ Team Leads confirm "prompt drift" is a pain point. |
| Solution | Users prefer a "Git-like" version control mental model (commits, diffs) over simple document history. | HIGH | Users engage with the "Diff View" in prototypes >3 times per session. |
| Solution | Users want to test prompts against multiple providers (OpenAI vs Anthropic) in one UI. | MED | Feature is ranked in top 2 "Must Haves" in surveys. |
| Business | Teams will pay $49/user/mo rather than building internal tools or using Notion. | CRITICAL | 10 pre-orders or LOIs from team leads. |
| Business | Security concerns regarding storing prompts on a 3rd party SaaS are manageable. | CRITICAL | Less than 20% of qualified leads drop off due to data residency concerns. |
2 Discovery Interview Guide
Target: 25 Interviews (15 Engineers, 10 Product Managers)
Part A: The "Current Mess" (Deep Dive)
- "Show me, don't tell me: Can you share your screen and show me where your top 3 production prompts live right now?"
- "Walk me through the last time you had to update a prompt. How did you test that it didn't break edge cases?"
- "How do you share a successful prompt with a colleague? Slack? Email? Notion?"
Part B: Pain Quantification
- "Have you ever rolled out a prompt change that made things worse? How long did it take to revert?"
- "On a scale of 1-10, how confident are you that your current prompts are cost-optimized?"
Part C: Solution Fit
- "If you had a 'GitHub for Prompts', what is the one feature that would make it a 'must-buy' for your team?"
- "We're considering a Team plan at $49/user. Who in your org would sign off on that?"
3 Survey & Screening
The "Builder" Screener
Goal: Filter out casual ChatGPT users. We need production builders.
Q1. How many LLM calls does your product make per day?
[ ] 0 (Ideation phase)
[ ] 1-1,000
[ ] 1,000+ (Production)
Q2. Which models do you actively use?
[ ] GPT-4 only
[ ] Claude + GPT
[ ] Open Source (Llama/Mistral)
Q3. How do you store prompts?
[ ] Hardcoded in code
[ ] Database
[ ] Notion/Docs
[ ] 0 (Ideation phase)
[ ] 1-1,000
[ ] 1,000+ (Production)
Q2. Which models do you actively use?
[ ] GPT-4 only
[ ] Claude + GPT
[ ] Open Source (Llama/Mistral)
Q3. How do you store prompts?
[ ] Hardcoded in code
[ ] Database
[ ] Notion/Docs
Recruitment Channels
- r/LocalLLaMA & r/OpenAI: High concentration of power users.
- YCombinator "Startup School" Forum: Founders dealing with this exact pain.
- Direct LinkedIn Outreach: Search "AI Engineer" or "Prompt Engineer".
4 Validation Experiments
5 8-Week Validation Roadmap
Weeks 1-2: Problem Discovery
- Conduct 15 "Show me your mess" interviews.
- Launch "Builder" screening survey on Reddit/Twitter.
- Define the "Anti-Persona" (who we definitely do NOT serve).
Weeks 3-4: Solution & Messaging Fit
- Launch Landing Page A/B test ($500 ad spend).
- Build "Clickable Figma Prototype" focusing on the Diff View and Test Runner.
- Test Prototype with 10 interviewees from Phase 1.
Weeks 5-6: Willingness to Pay
- Execute Concierge MVP (Manual Prompt Testing Service).
- Attempt to pre-sell "Team Plan" lifetime deal to 5 startups ($299 one-time).
- Validate security requirements with 2 Enterprise prospects.
Weeks 7-8: Synthesis & Go/No-Go
- Aggregate all qualitative and quantitative data.
- Calculate CAC based on ad experiments.
- Decision Gate: Proceed to Code or Pivot.
🚦 Go/No-Go Decision Criteria
We will only proceed to full engineering development if we hit 3 out of 4 of these targets:
80%
Problem Validation
(Interviews)
(Interviews)
5%
Waitlist Conversion
(Cold Traffic)
(Cold Traffic)
10
Paid Pre-orders
($299+)
($299+)
>40
NPS Score
(Prototype Users)
(Prototype Users)