AI: PromptVault - Prompt Library Manager

Model: google/gemini-3-pro-preview

Status: Completed

Cost: $2.09

Tokens: 286,814

Started: 2026-01-02 23:25

05. User Research & Validation Plan

Strategy to validate the "Git for Prompts" hypothesis and de-risk the PromptVault development roadmap.

1 Critical Assumptions Matrix

Assumption Category	Hypothesis	Risk Level	Validation Target
Problem	Engineers currently lose significant time searching for or recreating lost prompts.	HIGH	80% of interviewees admit to losing a high-value prompt in the last month.
Problem	Teams struggle to maintain consistency across members (e.g., everyone using different prompt versions).	MED	5+ Team Leads confirm "prompt drift" is a pain point.
Solution	Users prefer a "Git-like" version control mental model (commits, diffs) over simple document history.	HIGH	Users engage with the "Diff View" in prototypes >3 times per session.
Solution	Users want to test prompts against multiple providers (OpenAI vs Anthropic) in one UI.	MED	Feature is ranked in top 2 "Must Haves" in surveys.
Business	Teams will pay $49/user/mo rather than building internal tools or using Notion.	CRITICAL	10 pre-orders or LOIs from team leads.
Business	Security concerns regarding storing prompts on a 3rd party SaaS are manageable.	CRITICAL	Less than 20% of qualified leads drop off due to data residency concerns.

2 Discovery Interview Guide

Target: 25 Interviews (15 Engineers, 10 Product Managers)

Part A: The "Current Mess" (Deep Dive)

"Show me, don't tell me: Can you share your screen and show me where your top 3 production prompts live right now?"
"Walk me through the last time you had to update a prompt. How did you test that it didn't break edge cases?"
"How do you share a successful prompt with a colleague? Slack? Email? Notion?"

Part B: Pain Quantification

"Have you ever rolled out a prompt change that made things worse? How long did it take to revert?"
"On a scale of 1-10, how confident are you that your current prompts are cost-optimized?"

Part C: Solution Fit

"If you had a 'GitHub for Prompts', what is the one feature that would make it a 'must-buy' for your team?"
"We're considering a Team plan at $49/user. Who in your org would sign off on that?"

3 Survey & Screening

The "Builder" Screener

Goal: Filter out casual ChatGPT users. We need production builders.

Q1. How many LLM calls does your product make per day?
[ ] 0 (Ideation phase)
[ ] 1-1,000
[ ] 1,000+ (Production)

Q2. Which models do you actively use?
[ ] GPT-4 only
[ ] Claude + GPT
[ ] Open Source (Llama/Mistral)

Q3. How do you store prompts?
[ ] Hardcoded in code
[ ] Database
[ ] Notion/Docs
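One way to apply the screener is sketched below. It assumes that only the "1,000+ (Production)" answer to Q1 qualifies a respondent, and that Q2 and Q3 are recorded for segmentation only. The plan does not state the exact cut-off, so treat that rule, the class and function names, and the sample responses as hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class CallVolume(Enum):
    """Answer options for Q1 (daily LLM calls)."""
    IDEATION = "0"
    LOW = "1-1,000"
    PRODUCTION = "1,000+"


@dataclass
class ScreenerResponse:
    call_volume: CallVolume
    models: str   # Q2 answer, kept for segmentation only
    storage: str  # Q3 answer, kept for segmentation only


def is_production_builder(response: ScreenerResponse) -> bool:
    # Assumption: only the "1,000+ (Production)" answer qualifies.
    return response.call_volume is CallVolume.PRODUCTION


# Hypothetical survey responses, for illustration only.
responses = [
    ScreenerResponse(CallVolume.IDEATION, "GPT-4 only", "Notion/Docs"),
    ScreenerResponse(CallVolume.PRODUCTION, "Claude + GPT", "Hardcoded in code"),
]
qualified = [r for r in responses if is_production_builder(r)]
print(len(qualified), "of", len(responses), "respondents qualify")
```

A looser rule, for example also admitting the "1-1,000" answer when prompts are hardcoded, would only change the body of `is_production_builder`.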

Recruitment Channels

r/LocalLLaMA & r/OpenAI: High concentration of power users.
Y Combinator "Startup School" Forum: Founders dealing with this exact pain.
Direct LinkedIn Outreach: Search "AI Engineer" or "Prompt Engineer".

4 Validation Experiments

Exp 1: Landing Page A/B

Hypothesis: "Organization" sells better than "Testing".

Variant A: "Stop Losing Your Prompts. The CMS for AI."

Variant B: "Compare GPT-4 vs Claude Instantly. The Testing Lab for AI."

Goal: >5% Conversion to Waitlist
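To judge whether one variant really "sells better", the two conversion rates can be compared with a significance test. The sketch below uses a pooled two-proportion z-test, which is an assumption of this example rather than a method named in the plan. The visitor and signup counts are hypothetical.

```python
import math


def conversion_rate(signups: int, visitors: int) -> float:
    """Share of visitors who joined the waitlist."""
    return signups / visitors


def two_proportion_z_test(signups_a: int, visitors_a: int,
                          signups_b: int, visitors_b: int) -> tuple[float, float]:
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    p_a = signups_a / visitors_a
    p_b = signups_b / visitors_b
    # Pooled rate under the null hypothesis that both variants convert equally.
    pooled = (signups_a + signups_b) / (visitors_a + visitors_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    z = (p_a - p_b) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


WAITLIST_GOAL = 0.05  # the ">5% Conversion to Waitlist" goal

# Hypothetical traffic: 500 visitors per variant.
rate_a = conversion_rate(40, 500)  # Variant A ("Organization")
rate_b = conversion_rate(20, 500)  # Variant B ("Testing")
z, p_value = two_proportion_z_test(40, 500, 20, 500)

print(f"A: {rate_a:.1%}, B: {rate_b:.1%}, z = {z:.2f}, p = {p_value:.4f}")
print("A meets goal:", rate_a > WAITLIST_GOAL)
```

With small ad budgets the sample may be too small for a reliable result, so the p-value is best read alongside the raw rates rather than as a verdict on its own.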

Exp 2: Concierge MVP

Hypothesis: Users will pay for prompt optimization reports.

Manual Service: User emails a prompt -> We manually run it on 4 models -> Send back PDF comparison report.

Goal: 10 Users @ $29 one-off

Exp 3: Fake Door Integration

Hypothesis: VS Code is the preferred environment.

Create a landing page specifically for a "PromptVault VS Code Extension" (that doesn't exist yet) and track "Install" clicks.

Goal: >15% CTR on "Install"

5 8-Week Validation Roadmap

Weeks 1-2: Problem Discovery

Conduct 15 "Show me your mess" interviews.
Launch "Builder" screening survey on Reddit/Twitter.
Define the "Anti-Persona" (who we definitely do NOT serve).

Weeks 3-4: Solution & Messaging Fit

Launch Landing Page A/B test ($500 ad spend).
Build "Clickable Figma Prototype" focusing on the Diff View and Test Runner.
Test Prototype with 10 interviewees from Phase 1.

Weeks 5-6: Willingness to Pay

Execute Concierge MVP (Manual Prompt Testing Service).
Attempt to pre-sell "Team Plan" lifetime deal to 5 startups ($299 one-time).
Validate security requirements with 2 Enterprise prospects.

Weeks 7-8: Synthesis & Go/No-Go

Aggregate all qualitative and quantitative data.
Calculate CAC based on ad experiments.
Decision Gate: Proceed to Code or Pivot.
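The CAC step reduces to dividing ad spend by the number of conversions it produced. The sketch below assumes the $500 ad spend from Weeks 3-4. The signup and customer counts are hypothetical. Cost per waitlist signup is shown next to CAC because paying customers may be too few at this stage for a stable figure.

```python
def cost_per_conversion(ad_spend: float, conversions: int) -> float:
    """Ad spend divided by the number of conversions it produced."""
    if conversions <= 0:
        raise ValueError("no conversions recorded; cost per conversion is undefined")
    return ad_spend / conversions


AD_SPEND = 500.0       # landing page A/B test budget from Weeks 3-4
waitlist_signups = 40  # hypothetical
paying_customers = 4   # hypothetical

cost_per_signup = cost_per_conversion(AD_SPEND, waitlist_signups)
cac = cost_per_conversion(AD_SPEND, paying_customers)

print(f"Cost per waitlist signup: ${cost_per_signup:.2f}")
print(f"CAC (paying customers): ${cac:.2f}")
```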

🚦 Go/No-Go Decision Criteria

We will proceed to full engineering development only if we hit at least 3 of these 4 targets:

Problem Validation (Interviews): 80%
Waitlist Conversion (Cold Traffic)
Paid Pre-orders ($299+)
NPS Score (Prototype Users): >40
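The gate above can be sketched as a count of targets met, with NPS computed in the standard way: the percentage of promoters (scores 9-10) minus the percentage of detractors (scores 0-6). The function names, survey scores, and per-target outcomes below are hypothetical. Each target's pass or fail result is supplied by the caller, so the sketch does not assume any threshold beyond the NPS target of >40.

```python
def nps(scores: list[int]) -> float:
    """Net Promoter Score from 0-10 ratings: % promoters minus % detractors."""
    promoters = sum(1 for s in scores if s >= 9)
    detractors = sum(1 for s in scores if s <= 6)
    return 100 * (promoters - detractors) / len(scores)


def go_decision(targets_met: dict[str, bool], required: int = 3) -> bool:
    """Go if at least `required` of the targets were hit."""
    return sum(targets_met.values()) >= required


# Hypothetical prototype survey scores.
prototype_scores = [10, 9, 9, 8, 7, 10, 10, 9, 3, 10]
nps_score = nps(prototype_scores)

# Hypothetical outcomes for the other three targets.
targets_met = {
    "problem_validation": True,
    "waitlist_conversion": True,
    "paid_preorders": False,
    "nps": nps_score > 40,
}

print(f"NPS: {nps_score:.0f}")
print("Decision:", "Go" if go_decision(targets_met) else "No-Go")
```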