PromptVault - Prompt Library Manager

05 User Research & Validation Plan

1 Key Assumptions to Validate

Critical assumptions are grouped by category and prioritized by risk. Validate them before the MVP build to de-risk the $350K investment.

Problem Assumptions

| Assumption | Risk | Method | Target Evidence |
|---|---|---|---|
| Prompts scattered across Notion/files/chats (no central repo) | High | Interviews/surveys | 70%+ confirm multi-tool chaos |
| No version control leads to lost effective prompts weekly | High | Interviews/observation | 60%+ report revert needs |
| Manual multi-model testing takes >30 min per prompt | High | Surveys/timed tasks | Avg time >30 min, 80% frustration |
| Teams duplicate prompts (50%+ effort waste) | High | Team interviews | 50%+ duplication stories |
| No performance metrics; can't identify best prompts | Medium | Surveys | 75%+ desire analytics |
| Sharing prompts via copy-paste loses context | Medium | Interviews | 65%+ collaboration pain |
| Prompt tweaking untracked, leads to inconsistency | Medium | Observation | 70%+ tweak frequently |

Solution Assumptions

| Assumption | Risk | Method | Target Evidence |
|---|---|---|---|
| Users adopt a dedicated prompt manager over docs | High | Landing page/prototype | >5% signup rate |
| Git-like versioning valued by 70%+ | Medium | Interviews/prototype | 70%+ positive reaction |
| Multi-model testing saves >50% time | Critical | Wizard of Oz | 80%+ accuracy/usability score |
| Analytics drive prompt optimization | High | Prototype feedback | 75%+ report value |
| Team collaboration features essential for teams | High | Interviews | 60%+ of teams prioritize |
| VS Code integration boosts daily use | Medium | Surveys | 50%+ of current VS Code users interested |
| API/webhooks needed for production workflows | Medium | Interviews | 40%+ of production users confirm |

Business Assumptions

| Assumption | Risk | Method | Target Evidence |
|---|---|---|---|
| Pro tier ($19/mo) acceptable to individuals | Critical | Pricing surveys/pre-orders | 40%+ willing at target price |
| CAC <$50 via AI communities | High | Ad tests | <$50 for 5%+ signup |
| 10% free-to-paid conversion | High | Waitlist follow-up | >5% pre-order |
| Team tier ($49/user) viable for 10+ person teams | High | Interviews | 50%+ of teams accept at price |
| Low churn (<5%/mo) from prompt lock-in | Medium | Prototype retention | >70% weekly return |

2 Customer Discovery Interview Guide (60-90 min)

Target 25 interviews: 60% AI engineers, 20% prompt engineers, 20% consultants. Recruit via LinkedIn (AI/ML groups), Reddit (r/PromptEngineering, r/MachineLearning), Twitter. Incentive: $50 Amazon card + free Pro access.

Part 1: Background (10 min)

  • Tell me about your role and daily LLM usage.
  • How many prompts do you manage weekly?
  • Team size using AI?

Part 2: Problem Exploration (20 min)

  • Walk me through last time you tested a prompt across models.
  • How often do you lose track of good prompts?
  • Describe sharing a prompt with your team—what went wrong?
  • How much time/money wasted on re-testing?
  • What's the worst prompt chaos story?

Part 3: Current Solutions (15 min)

  • What tools do you currently use (Notion, spreadsheets, LangChain Hub)? What do you like or dislike about each?
  • Ever switched? Why?
  • What would make you switch to a dedicated tool?

Part 4: Solution Exploration (15 min)

  • Describe the concept: a tool with Git-style versioning, multi-model testing, and team sharing. What's your initial reaction?
  • Most/least valuable features?
  • Concerns (security, LLM costs)?
  • What would you expect to pay? Who would need to approve the purchase?

Part 5: Wrap-up (10 min)

  • Pain scale 1-10 for prompt management?
  • Beta interest? Referrals?

Logistics: record sessions with Otter.ai; capture notes in a shared template (quotes, reactions, pricing signals). Analyze for patterns after the first 15 interviews.

3 Survey Designs

Screening Survey (Target: 300 responses, Typeform/LinkedIn)

  1. Role? [AI Engineer, Prompt Engineer, ML Engineer, Consultant, Other]
  2. Daily/weekly LLM prompt usage? [Daily, Weekly, Rarely]
  3. Prompts managed? [<50, 50-200, 200+]
  4. Pain scale 1-10: prompt organization/testing?
  5. Current tools? [Notion, Spreadsheets, None, Other]
  6. Team collab needs? [Yes/No]
  7. Interview interest? ($50 card) [Yes: email]
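
To turn screening responses into an interview shortlist, a minimal filter sketch in Python; the CSV column names are assumptions about how the Typeform export would be labeled, not the actual export schema:

```python
import csv

# Hypothetical column names for the Typeform CSV export; adjust to the real export.
PAIN_THRESHOLD = 7  # screening question 4: pain scale 1-10

def is_interview_candidate(row: dict) -> bool:
    """Keep respondents with high pain, real prompt volume, regular usage, and interview interest."""
    return (
        int(row["pain_score"]) >= PAIN_THRESHOLD
        and row["prompts_managed"] in ("50-200", "200+")
        and row["llm_usage"] in ("Daily", "Weekly")
        and row["interview_interest"] == "Yes"
    )

with open("screening_responses.csv", newline="") as f:
    candidates = [r for r in csv.DictReader(f) if is_interview_candidate(r)]

print(f"{len(candidates)} candidates for the 25-interview target (60/20/20 role split)")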

Validation Survey (Post-screening, 200+ responses)

  • Problem frequency (e.g., lost prompts: Never-Always)
  • Solution satisfaction (1-10 per tool)
  • A/B messaging: "Version prompts like code" vs. "AI prompt analytics hub"
  • Van Westendorp price sensitivity: too cheap / cheap / expensive / too expensive price points, benchmarked against $19/mo (see the sketch after this list)
  • Demographics: Team size, LLM spend/mo
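
For the Van Westendorp question, a simplified analysis sketch (Python/NumPy) that estimates only the Optimal Price Point, the price at which the shares calling it "too cheap" and "too expensive" cross; the response rows are illustrative:

```python
import numpy as np

# Each validation-survey respondent gives four price points (illustrative rows below).
responses = [
    {"too_cheap": 5, "cheap": 10, "expensive": 25, "too_expensive": 40},
    {"too_cheap": 8, "cheap": 15, "expensive": 29, "too_expensive": 49},
    # ... 200+ rows from the validation survey
]

prices = np.arange(1, 101)  # $1-$100/mo grid

# Share of respondents who would call each price "too cheap" / "too expensive".
too_cheap = np.array([np.mean([r["too_cheap"] >= p for r in responses]) for p in prices])
too_expensive = np.array([np.mean([r["too_expensive"] <= p for r in responses]) for p in prices])

# Optimal Price Point: where the two curves cross.
opp = prices[np.argmin(np.abs(too_cheap - too_expensive))]
print(f"Optimal price point ≈ ${opp}/mo (compare with the $19/mo hypothesis)")
```

The full method also derives the acceptable price range from the "cheap" and "expensive" answers (points of marginal cheapness and expensiveness); the same cumulative-share approach applies.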

4-6 Experiments: Landing Page, Prototypes, Fake Door

Landing Page (Carrd, $500 FB/LinkedIn ads)

  • A/B Headlines: "Version Control for AI Prompts", "Test Prompts Across LLMs Instantly", "End Prompt Chaos for Teams"
  • Waitlist signup + fake door ($19/mo button)
  • Success: 1K visitors, 7% signup, <10% bounce
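
To judge the headline A/B test against the 7% signup target, a minimal two-proportion z-test sketch (Python, standard library only); the visitor and conversion counts are illustrative, assuming roughly 500 visitors per variant out of the 1K target:

```python
from math import erf, sqrt

def two_proportion_z(conv_a: int, n_a: int, conv_b: int, n_b: int) -> tuple[float, float]:
    """Two-sided z-test for a difference in signup rate between two headline variants."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))  # normal approximation
    return z, p_value

# Illustrative counts: ~500 visitors per headline.
z, p = two_proportion_z(conv_a=42, n_a=500, conv_b=28, n_b=500)  # 8.4% vs. 5.6% signup
overall = (42 + 28) / 1000
print(f"z={z:.2f}, p={p:.3f}; overall signup {overall:.1%} vs. 7% success threshold")
```

With only a few hundred visitors per variant, only large headline differences will reach significance, so treat the A/B result as directional.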

Prototype Testing (recommended: Wizard of Oz)

  • Google Form input → Manual LLM tests → Email results
  • Test 15 users: versioning, multi-model, analytics
  • Cost: $0 + 20hrs; Metrics: NPS>40, 80% repeat request
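
NPS for the prototype testers can be computed directly from 0-10 "how likely are you to recommend?" scores; a minimal sketch with illustrative scores:

```python
def nps(scores: list[int]) -> float:
    """Net Promoter Score: % promoters (9-10) minus % detractors (0-6)."""
    promoters = sum(s >= 9 for s in scores)
    detractors = sum(s <= 6 for s in scores)
    return 100 * (promoters - detractors) / len(scores)

# Illustrative scores from the Wizard of Oz testers (not real data).
scores = [10, 9, 9, 8, 10, 7, 9, 6, 10, 9, 8, 9, 10, 5, 9]
print(f"NPS = {nps(scores):.0f} (target: >40)")
```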

Fake Door/Pre-Order

  • Show tiers post-signup; Stripe pre-order (refundable)
  • Success: 12% click a pricing tier, 3% pay (10+ pre-orders at $19)
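
Rough funnel arithmetic for the pre-order target, assuming both percentages are measured against total waitlist signups (the plan doesn't specify the base):

```python
# Assumption: click and pay rates are both fractions of total waitlist signups.
signup_rate = 0.07       # landing-page signup target
click_rate = 0.12        # click a pricing tier after signup
preorder_rate = 0.03     # complete the refundable $19 pre-order
target_preorders = 10

signups_needed = target_preorders / preorder_rate      # ≈ 334 signups
tier_clicks = signups_needed * click_rate              # ≈ 40 pricing-tier clicks expected
visitors_needed = signups_needed / signup_rate         # ≈ 4,762 visitors
print(f"~{signups_needed:.0f} signups (~{visitors_needed:.0f} landing-page visitors, "
      f"~{tier_clicks:.0f} tier clicks) to expect {target_preorders} pre-orders "
      f"at a {preorder_rate:.0%} pay rate")
```

Under these assumptions the visitor count sits above the 1K landing-page target, so hitting 10 pre-orders may require running the fake door beyond the initial ad spend.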

7 Validation Timeline (8 Weeks)

  • Wk 1-2 (Problem): 15 interviews, 300 screening surveys, pattern analysis.
  • Wk 3-4 (Solution): landing page A/B test ($500 ads), 100+ waitlist signups, 25 validation surveys.
  • Wk 5-6 (Pricing): 15 pricing interviews, Van Westendorp analysis, fake door/pre-orders (target 10).
  • Wk 7-8 (Prototype): Wizard of Oz test with 20 users, NPS/feedback, iterate.

| Metric | Target | Actual | Pass? |
|---|---|---|---|
| Pain confirmation (interviews) | 80%+ | | |
| Landing page signup | >7% | | |
| Price acceptance | 60%+ @ $19 | | |
| Pre-orders | 10+ | | |
| Prototype NPS | >40 | | |

4 of 5 passes → proceed to MVP. Total validation budget: $2K.
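
A minimal sketch of the go/no-go check against the scorecard above; the "actual" values are illustrative placeholders, not results:

```python
# Go/no-go check against the validation scorecard.
targets = {
    "pain_confirmation": (0.80, lambda v, t: v >= t),
    "landing_signup":    (0.07, lambda v, t: v > t),
    "price_acceptance":  (0.60, lambda v, t: v >= t),
    "pre_orders":        (10,   lambda v, t: v >= t),
    "prototype_nps":     (40,   lambda v, t: v > t),
}
# Illustrative placeholders; replace with measured results after Week 8.
actuals = {"pain_confirmation": 0.85, "landing_signup": 0.08,
           "price_acceptance": 0.55, "pre_orders": 12, "prototype_nps": 47}

passes = sum(check(actuals[name], target) for name, (target, check) in targets.items())
decision = "Proceed to MVP" if passes >= 4 else "Pivot or iterate"
print(f"{passes}/5 metrics passed -> {decision}")
```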

8 Research Synthesis Template

Problem Summary

  • Top pains + quotes
  • Invalidated assumptions
  • Unexpected insights

Solution Summary

  • Must-have features
  • Low-interest features
  • UX/integration needs

Pricing Summary

  • Optimal price
  • Segment sensitivity
  • Pricing model preferences

GTM Insights

  • Effective acquisition channels
  • How users discover new tools
  • Common objections

Next Step: Execute Week 1 immediately—schedule 5 interviews this week for quick signals.