05 User Research & Validation Plan
1 Key Assumptions to Validate
Critical assumptions grouped by category and prioritized by risk. Validate these before the MVP build to de-risk the $350K investment.
| Assumption | Risk | Method | Target Evidence |
|---|---|---|---|
| **Problem Assumptions** | | | |
| Prompts scattered across Notion/files/chats (no central repo) | High | Interviews/surveys | 70%+ confirm multi-tool chaos |
| No version control leads to lost effective prompts weekly | High | Interviews/observation | 60%+ report revert needs |
| Manual multi-model testing >30min per prompt | High | Surveys/timed tasks | Avg time >30min, 80% frustration |
| Teams duplicate prompts (50%+ effort waste) | High | Team interviews | 50%+ duplication stories |
| No performance metrics; can't identify best prompts | Medium | Surveys | 75%+ desire analytics |
| Sharing prompts via copy-paste loses context | Medium | Interviews | 65%+ collaboration pain |
| Prompt tweaking untracked, leads to inconsistency | Medium | Observation | 70%+ tweak frequently |
| **Solution Assumptions** | | | |
| Users adopt dedicated prompt manager over docs | High | Landing page/prototype | >5% signup rate |
| Git-like versioning valued by 70%+ | Medium | Interviews/prototype | 70%+ positive reaction |
| Multi-model testing saves >50% time | Critical | Wizard of Oz | 80%+ accuracy/usability score |
| Analytics drive prompt optimization | High | Prototype feedback | 75%+ report value |
| Team collab features essential for teams | High | Interviews | 60%+ teams prioritize |
| VS Code integration boosts daily use | Medium | Surveys | 50%+ current VS Code users interested |
| API/webhooks needed for prod workflows | Medium | Interviews | 40%+ prod users confirm |
| **Business Assumptions** | | | |
| Pro tier ($19/mo) acceptable to individuals | Critical | Pricing surveys/pre-orders | 40%+ willing at target |
| CAC <$50 via AI communities | High | Ad tests | <$50 for 5%+ signup |
| 10% free-to-paid conversion | High | Waitlist follow-up | >5% pre-order |
| Team tier ($49/user) viable for 10+ person teams | High | Interviews | 50%+ teams at price |
| Low churn (<5%/mo) from prompt lock-in | Medium | Prototype retention | >70% weekly return |
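The CAC assumption above (<$50 via AI communities at a 5%+ signup rate) reduces to simple arithmetic that can be sanity-checked against the planned $500 ad budget. A minimal sketch; the function name and inputs are illustrative, not part of any tracking tool:

```python
def cac(ad_spend, clicks, signup_rate):
    """Customer acquisition cost: ad spend divided by resulting signups."""
    signups = clicks * signup_rate
    return ad_spend / signups if signups else float("inf")
```

For example, $500 of ads driving 250 clicks at a 5% signup rate yields 12.5 signups, a $40 CAC, under the <$50 target.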
2 Customer Discovery Interview Guide (60-90 min)
Target 25 interviews: 60% AI engineers, 20% prompt engineers, 20% consultants. Recruit via LinkedIn (AI/ML groups), Reddit (r/PromptEngineering, r/MachineLearning), Twitter. Incentive: $50 Amazon card + free Pro access.
Part 1: Background (10 min)
- Tell me about your role and daily LLM usage.
- How many prompts do you manage weekly?
- Team size using AI?
Part 2: Problem Exploration (20 min)
- Walk me through last time you tested a prompt across models.
- How often do you lose track of good prompts?
- Describe sharing a prompt with your team—what went wrong?
- How much time/money wasted on re-testing?
- What's the worst prompt chaos story?
Part 3: Current Solutions (15 min)
- What tools (Notion, spreadsheets, Langchain Hub)? Likes/dislikes?
- Ever switched? Why?
- What would make you switch to a dedicated tool?
Part 4: Solution Exploration (15 min)
- Pitch: a tool with Git-style versioning, multi-model testing, and team sharing. What's your initial reaction?
- Which features are most and least valuable to you?
- Any concerns (security, LLM API costs)?
- What would you expect to pay? Who would need to approve the purchase?
Part 5: Wrap-up (10 min)
- On a scale of 1-10, how painful is prompt management for you today?
- Would you join a beta? Can you refer anyone else we should talk to?
Logistics: record sessions (Otter.ai); use a note template capturing quotes, reactions, and pricing signals. Analyze for patterns after the first 15 interviews.
3 Survey Designs
Screening Survey (Target: 300 responses, Typeform/LinkedIn)
- Role? [AI Engineer, Prompt Engineer, ML Engineer, Consultant, Other]
- Daily/weekly LLM prompt usage? [Daily, Weekly, Rarely]
- Prompts managed? [<50, 50-200, 200+]
- Pain scale 1-10: prompt organization/testing?
- Current tools? [Notion, Spreadsheets, None, Other]
- Team collab needs? [Yes/No]
- Interview interest? ($50 card) [Yes: email]
Validation Survey (Post-screening, 200+ responses)
- Problem frequency (e.g., lost prompts: Never-Always)
- Solution satisfaction (1-10 per tool)
- A/B messaging: "Version prompts like code" vs. "AI prompt analytics hub"
- Van Westendorp: Too cheap/expensive thresholds for $19/mo
- Demographics: Team size, LLM spend/mo
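The Van Westendorp analysis above can be computed from the four price answers per respondent (too cheap, bargain, getting expensive, too expensive) by finding where the cumulative curves cross. A minimal sketch using a naive grid search, with ties resolving to the lowest grid price; not a full implementation:

```python
def van_westendorp(responses, price_grid):
    """Estimate Van Westendorp price points from survey answers.

    responses: list of (too_cheap, bargain, expensive, too_expensive)
    tuples, one per respondent. price_grid: candidate prices to scan.
    """
    n = len(responses)

    def pct(idx, price, at_or_above):
        # Cumulative share of respondents answering at/above or at/below a price.
        vals = [r[idx] for r in responses]
        hits = sum(v >= price for v in vals) if at_or_above else sum(v <= price for v in vals)
        return hits / n

    def crossing(curve_a, curve_b):
        # Price where the two cumulative curves are closest (naive grid search).
        return min(price_grid, key=lambda p: abs(curve_a(p) - curve_b(p)))

    too_cheap = lambda p: pct(0, p, True)
    cheap     = lambda p: pct(1, p, True)
    expensive = lambda p: pct(2, p, False)
    too_exp   = lambda p: pct(3, p, False)

    pmc = crossing(too_cheap, expensive)   # point of marginal cheapness
    opp = crossing(too_cheap, too_exp)     # optimal price point
    pme = crossing(cheap, too_exp)         # point of marginal expensiveness
    return pmc, opp, pme
```

The [PMC, PME] range brackets acceptable pricing; if $19/mo falls outside it, revisit the Pro tier assumption.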
4-6 Experiments: Landing Page, Prototypes, Fake Door
Landing Page (Carrd, $500 FB/LinkedIn ads)
- A/B Headlines: "Version Control for AI Prompts", "Test Prompts Across LLMs Instantly", "End Prompt Chaos for Teams"
- Waitlist signup + fake door ($19/mo button)
- Success: 1K visitors, 7% signup, <10% bounce
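Before declaring an A/B headline winner at these traffic levels, check that the signup-rate difference clears statistical significance. A standard two-proportion z-test, sketched with the stdlib only (variable names are illustrative):

```python
from math import erf, sqrt

def two_prop_z(signups_a, visitors_a, signups_b, visitors_b):
    """Two-sided two-proportion z-test on the signup rates of two variants."""
    p_a = signups_a / visitors_a
    p_b = signups_b / visitors_b
    pooled = (signups_a + signups_b) / (visitors_a + visitors_b)
    se = sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    z = (p_a - p_b) / se
    # Two-sided p-value via the standard normal CDF.
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value
```

With ~500 visitors per variant, only large rate differences will reach p < 0.05, so treat close headline results as ties.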
Prototype Testing (Rec: Wizard of Oz)
- Google Form input → Manual LLM tests → Email results
- Test 15 users: versioning, multi-model, analytics
- Cost: $0 + 20hrs; Metrics: NPS>40, 80% repeat request
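The NPS>40 target uses the standard formula: percent promoters (scores 9-10) minus percent detractors (0-6), with passives (7-8) counted only in the denominator. A one-function sketch:

```python
def nps(scores):
    """Net Promoter Score from 0-10 'would you recommend?' answers."""
    n = len(scores)
    promoters = sum(s >= 9 for s in scores)
    detractors = sum(s <= 6 for s in scores)
    return round(100 * (promoters - detractors) / n)
```

Note that with only 15 respondents a single detractor moves the score by ~7 points, so treat the >40 threshold as directional.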
Fake Door/Pre-Order
- Show tiers post-signup; Stripe pre-order (refundable)
- Success: 12% click, 3% pay ($19 x 10+)
7 8-Week Validation Timeline
Wk 1-2: Problem
15 interviews, 300 screening surveys, pattern analysis.
Wk 3-4: Solution
Landing page A/B ($500 ads), 100+ waitlist, 25 validation surveys.
Wk 5-6: Pricing
15 pricing interviews, Van Westendorp, fake door/pre-orders (target 10).
Wk 7-8: Prototype
Wizard of Oz for 20 users, NPS/feedback, iterate.
| Metric | Target | Actual | Pass? |
|---|---|---|---|
| Pain confirmation (interviews) | 80%+ | | |
| Landing signup | >7% | | |
| Price acceptance | 60%+ @ $19 | | |
| Pre-orders | 10+ | | |
| Prototype NPS | >40 | | |
If 4 of 5 metrics pass, proceed to MVP. Total validation budget: $2K.
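The 4-of-5 pass rule can be encoded as a simple scorecard for the decision meeting. The `results` values below are illustrative placeholders for the table's empty "Actual" column, not real data:

```python
def go_no_go(results, min_passes=4):
    """Score validation metrics; return (passes, proceed-to-MVP?)."""
    passes = sum(target(actual) for actual, target in results.values())
    return passes, passes >= min_passes

# Illustrative placeholder values -- replace with measured results.
results = {
    "pain_confirmation": (0.82, lambda v: v >= 0.80),  # 80%+ target
    "landing_signup":    (0.06, lambda v: v > 0.07),   # >7% target
    "price_acceptance":  (0.63, lambda v: v >= 0.60),  # 60%+ @ $19
    "pre_orders":        (11,   lambda v: v >= 10),    # 10+ target
    "prototype_nps":     (44,   lambda v: v > 40),     # >40 target
}
```

With these placeholders, four metrics pass (landing signup misses), which still clears the go bar.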
8 Research Synthesis Template
Problem Summary
- Top pains + quotes
- Invalidated assumptions
- Unexpected insights
Solution Summary
- Must-have features
- Low-interest
- UX/integration needs
Pricing Summary
- Optimal price
- Segment sensitivity
- Model prefs
GTM Insights
- Channels
- Discovery
- Objections
Next step: execute Week 1 immediately; schedule 5 interviews this week for quick signals.