Home Models Compare Scorecards Evals Methodology FAQ
← Back to all evals
Fine-Tuning vs Prompt Engineering: When to Use Each

Fine-Tuning vs Prompt Engineering: When to Use Each


Quick Answer

Start with prompt engineering. Only fine-tune when prompts aren’t enough.

Prompt Engineering

When It Works

  • General task improvements
  • Format control
  • Few-shot learning
  • Role assignment

When It Doesn’t

  • Consistent style across many examples
  • Domain-specific knowledge needed
  • Need better base model behavior

Cost

  • Time only (no API cost increase)
  • Iterative experimentation

Fine-Tuning

When It Works

  • Consistent style/voice
  • Domain-specific tasks
  • Reducing hallucinations
  • Cost optimization (smaller model = cheaper)

When It Doesn’t

  • General knowledge tasks
  • Need latest model capabilities
  • Limited training data

Cost

  • Training: $50-500+ (one-time)
  • Inference: Can use smaller model

Decision Framework

NeedApproach
Better formattingPrompt engineering
Specific styleFine-tune
More knowledgeFine-tune
Tool useBase model
Lower costFine-tune small model
Latest modelPrompt engineering

Our Recommendation

  1. Start: Prompt engineering
  2. Iterate: 20+ iterations before giving up
  3. Fine-tune: If prompts hit ceiling
  4. Re-evaluate: As models improve

Most teams don’t need fine-tuning.