AI: PromptVault - Prompt Library Manager

Model: google/gemini-3-pro-preview
Status: Completed
Cost: $2.09
Tokens: 286,814
Started: 2026-01-02 23:25

05. User Research & Validation Plan

Strategy to validate the "Git for Prompts" hypothesis and de-risk the PromptVault development roadmap.

1 Critical Assumptions Matrix

Assumption Category | Hypothesis | Risk Level | Validation Target
Problem | Engineers currently lose significant time searching for or recreating lost prompts. | HIGH | 80% of interviewees admit to losing a high-value prompt in the last month.
Problem | Teams struggle to maintain consistency across members (e.g., everyone using different prompt versions). | MED | 5+ Team Leads confirm "prompt drift" is a pain point.
Solution | Users prefer a "Git-like" version control mental model (commits, diffs) over simple document history. | HIGH | Users engage with the "Diff View" in prototypes >3 times per session.
Solution | Users want to test prompts against multiple providers (OpenAI vs Anthropic) in one UI. | MED | Feature is ranked in top 2 "Must Haves" in surveys.
Business | Teams will pay $49/user/mo rather than building internal tools or using Notion. | CRITICAL | 10 pre-orders or LOIs from team leads.
Business | Security concerns regarding storing prompts on a 3rd party SaaS are manageable. | CRITICAL | Less than 20% of qualified leads drop off due to data residency concerns.

2 Discovery Interview Guide

Target: 25 Interviews (15 Engineers, 10 Product Managers)

Part A: The "Current Mess" (Deep Dive)

  • "Show me, don't tell me: Can you share your screen and show me where your top 3 production prompts live right now?"
  • "Walk me through the last time you had to update a prompt. How did you test that it didn't break edge cases?"
  • "How do you share a successful prompt with a colleague? Slack? Email? Notion?"

Part B: Pain Quantification

  • "Have you ever rolled out a prompt change that made things worse? How long did it take to revert?"
  • "On a scale of 1-10, how confident are you that your current prompts are cost-optimized?"

Part C: Solution Fit

  • "If you had a 'GitHub for Prompts', what is the one feature that would make it a 'must-buy' for your team?"
  • "We're considering a Team plan at $49/user/month. Who in your org would sign off on that?"

3 Survey & Screening

The "Builder" Screener

Goal: Filter out casual ChatGPT users. We need production builders.

Q1. How many LLM calls does your product make per day?
[ ] 0 (Ideation phase)
[ ] 1-1,000
[ ] 1,000+ (Production)

Q2. Which models do you actively use?
[ ] GPT-4 only
[ ] Claude + GPT
[ ] Open Source (Llama/Mistral)

Q3. How do you store prompts?
[ ] Hardcoded in code
[ ] Database
[ ] Notion/Docs
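
The screener answers could be scored automatically before scheduling interviews. The sketch below is one possible way to do that, not a defined part of the plan: the names `ScreenerResponse`, `qualifies`, and `priority` are hypothetical, and the rule itself (any live LLM traffic qualifies, 1,000+ calls per day is contacted first) is an assumption, since the plan only says to filter out casual users.

```python
from dataclasses import dataclass

# Answer labels copied from Q1 of the screener.
CALLS_NONE = "0 (Ideation phase)"
CALLS_LOW = "1-1,000"
CALLS_PRODUCTION = "1,000+ (Production)"


@dataclass
class ScreenerResponse:
    calls_per_day: str   # Q1 answer
    models: str          # Q2 answer
    prompt_storage: str  # Q3 answer


def qualifies(response: ScreenerResponse) -> bool:
    """Assumed rule: any respondent with live LLM traffic passes the screener."""
    return response.calls_per_day in {CALLS_LOW, CALLS_PRODUCTION}


def priority(response: ScreenerResponse) -> int:
    """Assumed ranking: 2 = production volume, 1 = some traffic, 0 = ideation only."""
    if response.calls_per_day == CALLS_PRODUCTION:
        return 2
    if response.calls_per_day == CALLS_LOW:
        return 1
    return 0


if __name__ == "__main__":
    responses = [
        ScreenerResponse(CALLS_NONE, "GPT-4 only", "Notion/Docs"),
        ScreenerResponse(CALLS_PRODUCTION, "Claude + GPT", "Hardcoded in code"),
        ScreenerResponse(CALLS_LOW, "Open Source (Llama/Mistral)", "Database"),
    ]
    # Contact the highest-priority qualified respondents first.
    shortlist = sorted(filter(qualifies, responses), key=priority, reverse=True)
    for r in shortlist:
        print(r.calls_per_day, "|", r.prompt_storage)
```

Q2 and Q3 are stored but not scored here; they are more useful for segmenting interview notes than for filtering.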

Recruitment Channels

  • r/LocalLLaMA & r/OpenAI: High concentration of power users.
  • Y Combinator "Startup School" Forum: Founders dealing with this exact pain.
  • Direct LinkedIn Outreach: Search "AI Engineer" or "Prompt Engineer".

4 Validation Experiments

Exp 1: Landing Page A/B

Hypothesis: "Organization" sells better than "Testing".

Variant A: "Stop Losing Your Prompts. The CMS for AI."
Variant B: "Compare GPT-4 vs Claude Instantly. The Testing Lab for AI."
Goal: >5% Conversion to Waitlist
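
To decide whether one variant really outperforms the other, the waitlist conversion rates can be compared with a two-proportion z-test. This is a minimal sketch using only the standard library; the plan does not specify a statistical method, so the choice of test, the function names, and the visitor counts in the example are all assumptions.

```python
import math


def conversion_rate(signups: int, visitors: int) -> float:
    """Share of visitors who joined the waitlist."""
    return signups / visitors


def two_proportion_z_test(conv_a: int, n_a: int, conv_b: int, n_b: int):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 0.0, 1.0
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


def meets_goal(signups: int, visitors: int, goal: float = 0.05) -> bool:
    """Check the '>5% Conversion to Waitlist' goal for one variant."""
    return conversion_rate(signups, visitors) > goal


if __name__ == "__main__":
    # Hypothetical traffic numbers, for illustration only.
    z, p = two_proportion_z_test(conv_a=60, n_a=500, conv_b=30, n_b=500)
    print(f"z = {z:.2f}, p = {p:.4f}")
    print("Variant A meets goal:", meets_goal(60, 500))
```

With a $500 ad budget the sample per variant may be small, so a non-significant result would not by itself show that the two messages perform the same.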

Exp 2: Concierge MVP

Hypothesis: Users will pay for prompt optimization reports.

Manual Service: User emails a prompt -> We manually run it on 4 models -> Send back PDF comparison report.

Goal: 10 Users @ $29 one-off

Exp 3: Fake Door Integration

Hypothesis: VS Code is the preferred environment.

Create a landing page specifically for a "PromptVault VS Code Extension" (that doesn't exist yet) and track "Install" clicks.

Goal: >15% CTR on "Install"

5 8-Week Validation Roadmap

Weeks 1-2: Problem Discovery

  • Conduct 15 "Show me your mess" interviews.
  • Launch "Builder" screening survey on Reddit/Twitter.
  • Define the "Anti-Persona" (who we definitely do NOT serve).

Weeks 3-4: Solution & Messaging Fit

  • Launch Landing Page A/B test ($500 ad spend).
  • Build "Clickable Figma Prototype" focusing on the Diff View and Test Runner.
  • Test the prototype with 10 interviewees from Weeks 1-2.

Weeks 5-6: Willingness to Pay

  • Execute Concierge MVP (Manual Prompt Testing Service).
  • Attempt to pre-sell "Team Plan" lifetime deal to 5 startups ($299 one-time).
  • Validate security requirements with 2 Enterprise prospects.

Weeks 7-8: Synthesis & Go/No-Go

  • Aggregate all qualitative and quantitative data.
  • Calculate CAC based on ad experiments.
  • Decision Gate: Proceed to Code or Pivot.
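
The CAC step reduces to ad spend divided by the number of acquisitions. The sketch below assumes that simple definition; the function name is hypothetical, the $500 figure is the ad spend from Weeks 3-4, and the acquisition count is made up for illustration. If the count is waitlist signups rather than paying customers, the result is a cost per signup and will understate the true CAC.

```python
from typing import Optional


def customer_acquisition_cost(ad_spend: float, acquisitions: int) -> Optional[float]:
    """Spend per acquisition, or None when nothing was acquired."""
    if acquisitions <= 0:
        return None
    return ad_spend / acquisitions


if __name__ == "__main__":
    # $500 is the planned ad spend; 25 acquisitions is a hypothetical outcome.
    print(customer_acquisition_cost(500.0, 25))  # 20.0
    print(customer_acquisition_cost(500.0, 0))   # None
```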

🚦 Go/No-Go Decision Criteria

We will proceed to full engineering development only if we hit at least 3 of these 4 targets:

  • Problem Validation (Interviews): 80%
  • Waitlist Conversion (Cold Traffic): 5%
  • Paid Pre-orders ($299+): 10
  • NPS Score (Prototype Users): >40
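
The decision gate can be written down as a small check so the outcome is not argued after the fact. This is a sketch under stated assumptions: the class and function names are hypothetical, NPS uses the standard definition (percentage of 9-10 ratings minus percentage of 0-6 ratings), and targets listed without a comparison sign (80%, 5%, 10) are treated as "at least", which the plan does not spell out.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ValidationResults:
    problem_validation_rate: float   # share of interviewees confirming the problem
    waitlist_conversion_rate: float  # cold-traffic conversion to waitlist
    paid_preorders: int              # pre-orders at $299 or more
    nps: float                       # Net Promoter Score from prototype users


def net_promoter_score(ratings: List[int]) -> float:
    """Standard NPS: % promoters (9-10) minus % detractors (0-6)."""
    if not ratings:
        raise ValueError("at least one rating is required")
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    return 100.0 * (promoters - detractors) / len(ratings)


def targets_hit(results: ValidationResults) -> Dict[str, bool]:
    """Evaluate each of the four targets (inclusive bounds are an assumption)."""
    return {
        "problem_validation": results.problem_validation_rate >= 0.80,
        "waitlist_conversion": results.waitlist_conversion_rate >= 0.05,
        "paid_preorders": results.paid_preorders >= 10,
        "nps": results.nps > 40,
    }


def go_decision(results: ValidationResults, required: int = 3) -> bool:
    """Go if at least `required` of the four targets are hit."""
    return sum(targets_hit(results).values()) >= required


if __name__ == "__main__":
    # Hypothetical outcome, for illustration only.
    results = ValidationResults(
        problem_validation_rate=0.84,
        waitlist_conversion_rate=0.06,
        paid_preorders=7,
        nps=net_promoter_score([10, 9, 9, 8, 10, 6, 9, 10, 7, 9]),
    )
    print(targets_hit(results))
    print("Go" if go_decision(results) else "No-Go")
```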