When to use this prompt
Before signing any contract worth more than a month of your budget. The most common vendor-decision failure is “we’ll just go with the one Bob recommended” — fast, but it produces buyer’s remorse and surprise costs in month four. The second-most-common failure is the opposite: a 40-row spreadsheet where every cell is “TBD” and the team gives up after week two.
This prompt produces a real scorecard the team can act on in an afternoon. Use it for software, agencies, contractors, partnerships, or any selection where 3+ options are on the table and the wrong choice has real cost.
The prompt
<role>Procurement specialist who builds vendor scorecards that produce defensible decisions and surface hidden risks.</role>
<task>Score the vendors below against the criteria, weight the criteria, and recommend one. Surface a per-vendor risk watch list with each vendor's specific concerns.</task>
<inputs>
<purchase>[WHAT YOU'RE BUYING, e.g., "product analytics platform", "PR agency for B2B tech", "fractional CFO"]</purchase>
<budget_band>[ANNUAL OR PROJECT BUDGET, e.g., "$50K-100K/year"]</budget_band>
<vendors>
[3 TO 5 VENDORS, EACH WITH:
- Vendor name
- Pricing model and rough cost
- Key strengths the vendor markets
- Anything you've heard from peers, references, or reviews
- Anything that gave you pause]
</vendors>
<must_haves>
[3-5 non-negotiables. If a vendor fails any, they are out regardless of total score.]
</must_haves>
<team_context>[1-3 sentences on the buying team: size, technical sophistication, current tech stack, what previous vendors in this category got right or wrong]</team_context>
</inputs>
<instructions>
1. Identify 6 to 9 selection criteria that genuinely matter for this purchase. Criteria must be observable in a sales process or checkable in references — not "good culture fit." Mix categories: capability, cost, risk, fit, time-to-value.
2. Weight each criterion 5 to 25 points. Total must equal 100. Weighting reflects what actually matters per <team_context>, not what every vendor scorecard says.
3. Score each vendor 1 to 5 on each criterion. Use only what's in <vendors>. Where you don't have enough information to score, write "?" and add a [NEED] flag noting what reference call or demo question would close the gap.
4. After the scorecard, identify the top 3 risks for each vendor. Risks must be specific to that vendor — not "implementation could be hard." A risk like "Vendor B's CSM team is staffed at 1:60 ratio per recent G2 reviews; expect long ticket times" is what you want.
5. Apply the must-haves filter. Mark any vendor failing a must-have as DISQUALIFIED regardless of score.
6. Recommend one vendor. The recommendation must include: total score, the single strongest reason to pick them, the single biggest risk to mitigate before signing, and one specific contract term to negotiate.
7. Constraints:
- Do not invent vendor capabilities, prices, or reviews not in <vendors>. Mark anything unverified [VERIFY].
- Do not declare a tie; if the top two scores are within 3 points, pick one anyway and name the tiebreaker explicitly.
- Do not produce a "they all have strengths and weaknesses" recommendation. Pick one.
</instructions>
<output_format>
**Selection criteria + weights (total 100):**
| # | Criterion | Weight | Why this weight |
|---|-----------|--------|-----------------|
| 1 | ... | XX | one sentence |
| ... | | | |
**Scorecard:**
| Criterion (Weight) | Vendor A | Vendor B | Vendor C | [Vendor D] | [Vendor E] |
|--------------------|----------|----------|----------|------------|------------|
| ... (XX) | N | N | N | N | N |
| ... | | | | | |
| **Weighted total** | XX | XX | XX | XX | XX |
| **Must-have status** | OK / DQ | | | | |
**Per-vendor risks (top 3 each):**
### Vendor A
1. [Specific risk]
2. ...
3. ...
### Vendor B
...
**Recommendation:** [Vendor name]
- **Total score:** XX/100
- **Strongest reason:** [one sentence]
- **Biggest risk to mitigate before signing:** [one sentence with specific action]
- **Contract term to negotiate:** [specific term, e.g., "30-day exit clause if CSM SLA is missed two consecutive months"]
**Tiebreaker (if scores within 3):** [explicit reason for the pick]
**[NEED] gaps to close before signing:**
- [Vendor]: [specific question to ask in next reference call or demo]
**[VERIFY] flags:** [Anything in the inputs that should be confirmed]
</output_format>
How it works
The forced weighting (5 to 25 points each, totaling 100) reveals what the team actually cares about. Most vendor scorecards are flat, so every criterion seems equally important and the highest-budget vendor wins by inertia. Weighting means the team must say out loud whether implementation speed beats integration depth, or vice versa.
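The arithmetic is worth making explicit. Below is a minimal sketch, assuming each 1-to-5 score earns that fraction of its criterion's weight (weight × score / 5, so a 5 collects the full weight); the prompt doesn't mandate a formula, but this reading reproduces the totals in the example output further down. All criterion names and scores here are illustrative.

```python
# A minimal sketch of the scoring arithmetic. Assumption: each 1-5 score
# earns that fraction of the criterion's weight (weight * score / 5).
# Names and numbers are illustrative, taken from the example output below.

weights = {
    "time_to_value": 20,
    "integration": 20,
    "cost_vs_budget": 15,
    "references": 15,
    "roadmap": 10,
    "stability": 10,
    "impl_support": 10,
}
assert sum(weights.values()) == 100  # the prompt's hard constraint

# Vendor A's column from the example scorecard.
vendor_a = {"time_to_value": 4, "integration": 3, "cost_vs_budget": 5,
            "references": 4, "roadmap": 4, "stability": 5, "impl_support": 3}

def weighted_total(scores: dict) -> float:
    # Each criterion contributes weight * (score / 5); a 5 earns full weight.
    return sum(weights[c] * s / 5 for c, s in scores.items())

print(weighted_total(vendor_a))  # 79.0
```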
The per-vendor risk watch list is what separates a real scorecard from a feature-comparison spreadsheet. Every vendor has specific risks; naming them early prevents post-contract surprise. The constraint that risks must be specific to the vendor (not generic) is what makes this section useful.
The “specific contract term to negotiate” output is a narrow, checkable directive that frontier models reliably obey. Most procurement loops produce a vendor recommendation but no negotiation lever. Naming one specific term gives the buyer a concrete handle for the contract phase.
The [NEED] flags surface the questions still worth asking before signing. Without them, models fabricate scores in cells where they don’t have data; with them, the team knows exactly which reference calls to schedule.
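Mechanically, the discipline is “exclude and enqueue”: an unscored cell contributes nothing to the weighted total and emits a question instead. A sketch, with the vendor data and question wording as illustrative stand-ins:

```python
# "?" cells are excluded from the weighted total and converted into
# [NEED] questions instead of being guessed. None stands in for "?".
weights = {"references": 15, "integration": 20}
vendor_b = {"references": None, "integration": 5}

needs, total = [], 0.0
for criterion, score in vendor_b.items():
    if score is None:
        needs.append(f"Vendor B: close the {criterion} gap in the next reference call")
    else:
        total += weights[criterion] * score / 5

print(total)  # 20.0 -- only the scored cell counts
print(needs)  # the [NEED] list the prompt asks for
```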
Example output
| Criterion (Weight) | Vendor A | Vendor B | Vendor C |
|---|---|---|---|
| Time-to-value (20) | 4 | 3 | 5 |
| Integration with our stack (20) | 3 | 5 | 2 |
| Cost relative to budget (15) | 5 | 3 | 4 |
| Reference call quality (15) | 4 | ? | 4 |
| Roadmap alignment (10) | 4 | 5 | 3 |
| Vendor stability (10) | 5 | 4 | 3 |
| Implementation support (10) | 3 | 4 | 5 |
| Weighted total | 79 | 67 (“?” excluded) | 74 |
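The totals follow the weight × score / 5 reading sketched earlier, summed down each column; Vendor B’s “?” cell is left out of its total rather than guessed, which is why B’s number carries a caveat. A quick check of Vendor C’s total from its (weight, score) column:

```python
# Vendor C's column as (weight, score) pairs; each cell adds weight * score / 5.
vendor_c = [(20, 5), (20, 2), (15, 4), (15, 4), (10, 3), (10, 3), (10, 5)]
print(sum(w * s / 5 for w, s in vendor_c))  # 74.0
```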
**Recommendation:** Vendor A
- **Total score:** 79/100
- **Strongest reason:** A’s integration risk is lower than its score of 3 suggests: its published API matches our existing stack with no custom work, which outweighs B’s lead in raw integration depth.
- **Biggest risk to mitigate before signing:** Implementation support is A’s weakest cell; negotiate a named CSM and a 60-day onboarding success milestone into the contract.
- **Contract term to negotiate:** 90-day exit clause if the onboarding success milestone is missed.
[NEED] gaps:
- Vendor B: schedule a reference call with a current customer of similar size; current data is from G2 only.
[VERIFY] flags: Vendor C’s published case study claiming “30-day implementation” — confirm with two named references.
Variations
- Build vs buy version: When one option is “build it ourselves,” add an internal-build column to the scorecard and apply the same scoring discipline.
- Replacement vendor mode: When you’re switching from an incumbent, add a column for the incumbent so the scorecard surfaces whether switching is genuinely better or just different.
- Bake-off mode: For RFPs where vendors have submitted written responses, paste the responses as inputs and ask the model to score against the rubric with citation back to specific RFP responses.