Prompt Engineering
9 Techniques for Better LLM Outputs
Master the techniques that separate demos from production systems. Learn what to use, when to use it, and what to skip.
Why This Matters
Getting an LLM to return something is easy. Getting it to return the right thing consistently is harder.
The techniques below address specific problems: enforcing output formats, improving accuracy on edge cases, making decisions explainable, preventing unexpected behavior. Use what solves your actual problems.
The 9 Techniques
The 9 techniques below range from a basic prompt to reasoning models. Mix and match based on your use case; not all are needed for every scenario.
We'll use this ticket throughout all 9 techniques to show how each approach handles the same input.
"I've been charged twice for my subscription this month. Order #12345. This is the third time this has happened. I want a refund NOW or I'm leaving."
Basic Prompt
A simple instruction telling the model what to do.
Why This Technique?
The starting point: a direct instruction with no additional context or structure.
Prompt
Classify this customer support ticket: "I've been charged twice for my subscription this month. Order #12345. This is the third time this has happened. I want a refund NOW or I'm leaving."
Model Output
This is a billing complaint. The customer is angry about being charged twice and wants a refund. High priority.
Pros
- Simple to write
- Fast execution
- Low token cost
Cons
- Inconsistent output format
- May miss important details
- No structured data for downstream processing
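In code, the basic prompt is a single API call. Here's a minimal sketch using the OpenAI Python SDK; the model name is an assumption, and any chat model works:

```python
# Minimal sketch of the basic prompt: one instruction, no structure.
from openai import OpenAI

client = OpenAI()

TICKET = (
    "I've been charged twice for my subscription this month. "
    "Order #12345. This is the third time this has happened. "
    "I want a refund NOW or I'm leaving."
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # assumption: swap in whatever model you use
    messages=[
        {"role": "user", "content": f'Classify this customer support ticket: "{TICKET}"'}
    ],
)

# Free-text answer: the format varies run to run, which is the core weakness.
print(response.choices[0].message.content)
```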
How to Choose Techniques
Start simple. Add techniques when you hit specific problems, not preemptively. If your outputs are inconsistent, try few-shot examples. If you need structured data, enforce a JSON schema. If decisions need to be explainable, use a reasoning model or chain-of-thought.
Below are three example combinations at different levels of complexity. These aren't rules; they're starting points. The right combination depends on your requirements and failure modes.
Simple
Internal tools, prototypes, simple classification
- Structured Output
- Few-Shot (optional)
Example: Tag classifier for internal docs
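Here's a minimal sketch of the Simple combination: OpenAI's JSON mode plus two few-shot examples baked into the message history. The tag set, examples, and model name are illustrative assumptions:

```python
# Sketch of the "Simple" combo: JSON mode + few-shot examples.
import json
from openai import OpenAI

client = OpenAI()

messages = [
    {"role": "system", "content": (
        "Classify internal docs. Reply with JSON of the form "
        '{"tags": [...]}, using only these tags: '
        "onboarding, billing, infra, security."
    )},
    # Few-shot examples: pick edge cases, not random ones.
    {"role": "user", "content": "Runbook: rotating production TLS certs"},
    {"role": "assistant", "content": '{"tags": ["infra", "security"]}'},
    {"role": "user", "content": "FAQ: expensing your first-week laptop"},
    {"role": "assistant", "content": '{"tags": ["onboarding", "billing"]}'},
    {"role": "user", "content": "Postmortem: billing cron double-charged users"},
]

response = client.chat.completions.create(
    model="gpt-4o-mini",  # assumption
    messages=messages,
    response_format={"type": "json_object"},  # provider-native JSON mode
)

print(json.loads(response.choices[0].message.content))
```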
Moderate
Customer-facing, reliability matters
- Role Prompting
- Structured Output
- Few-Shot
- Guardrails
Example: Customer support ticket analyzer
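Role prompting is the piece this tier adds, and it lives in the system message. A sketch against the same ticket; the persona, field names, and model are assumptions:

```python
# Sketch of role prompting: the system message sets persona and rules.
from openai import OpenAI

client = OpenAI()

SYSTEM = (
    "You are a senior support-triage analyst. Classify each ticket into "
    'JSON with keys "category", "sentiment", and "priority". Never invent '
    "order numbers; copy them verbatim from the ticket."
)

TICKET = (
    "I've been charged twice for my subscription this month. "
    "Order #12345. This is the third time this has happened. "
    "I want a refund NOW or I'm leaving."
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # assumption
    messages=[
        {"role": "system", "content": SYSTEM},
        {"role": "user", "content": TICKET},
    ],
    response_format={"type": "json_object"},
)
print(response.choices[0].message.content)
```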
Complex
Multi-step reasoning, verification needed
- Reasoning model (e.g., OpenAI o1/o3)
- Structured Output
- Prompt Chaining or Self-Consistency
- Guardrails
Example: Multi-step analysis with verification
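Self-consistency is the simplest of these to sketch: run the same classification several times at a non-zero temperature and take a majority vote. `classify` below is a hypothetical stand-in for whatever LLM call you already have:

```python
# Sketch of self-consistency: sample N times, keep the majority answer.
from collections import Counter
from typing import Callable

def self_consistent_classify(
    ticket: str,
    classify: Callable[[str], str],  # hypothetical: one LLM call -> label
    samples: int = 5,
) -> str:
    votes = Counter(classify(ticket) for _ in range(samples))
    label, count = votes.most_common(1)[0]
    # No clear majority means the input is ambiguous: route it to a human.
    if count <= samples // 2:
        return "needs_human_review"
    return label
```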
Common Mistakes
These are common pitfalls when building LLM applications. Most are fundamentals that get skipped or deprioritized, then cause issues later.
Over-engineering from day one
Adding all 9 techniques before you know what fails. Start with structured output + few-shot, measure, then iterate.
No structured output enforcement
Parsing free-text responses with regex and hoping. Use native JSON modes from providers (OpenAI, Anthropic, Google).
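For example, OpenAI's json_schema response format rejects output that doesn't match your schema (Anthropic and Google offer equivalents); the schema fields here are illustrative:

```python
# Sketch of provider-native structured output via a strict JSON schema.
from openai import OpenAI

client = OpenAI()

schema = {
    "type": "object",
    "properties": {
        "category": {"type": "string", "enum": ["billing", "technical", "account"]},
        "priority": {"type": "string", "enum": ["low", "medium", "high"]},
        "order_id": {"type": "string"},
    },
    "required": ["category", "priority", "order_id"],
    "additionalProperties": False,  # required by strict mode
}

response = client.chat.completions.create(
    model="gpt-4o-mini",  # assumption
    messages=[{"role": "user", "content": "Classify: 'Charged twice. Order #12345. Refund NOW.'"}],
    response_format={
        "type": "json_schema",
        "json_schema": {"name": "ticket", "strict": True, "schema": schema},
    },
)
print(response.choices[0].message.content)  # guaranteed to match the schema
```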
Generic few-shot examples
Random examples don't help. Pick edge cases, ambiguous inputs, and domain-specific scenarios your model will actually see.
No evaluation metrics
Flying blind without measuring accuracy, format compliance, or consistency. Build eval sets early, not after production issues.
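An eval set can start as a handful of labeled cases and a loop. A sketch, assuming a hypothetical `classify_ticket` that returns a raw JSON string:

```python
# Sketch of a minimal eval loop: format compliance + accuracy on labeled cases.
import json

EVAL_SET = [  # illustrative labeled examples
    {"ticket": "Charged twice, order #12345, refund now", "category": "billing"},
    {"ticket": "App crashes when I open settings", "category": "technical"},
]

def run_eval(classify_ticket) -> None:
    parsed, correct = 0, 0
    for case in EVAL_SET:
        raw = classify_ticket(case["ticket"])
        try:
            out = json.loads(raw)
            parsed += 1
        except json.JSONDecodeError:
            continue  # format failure: counts against compliance
        correct += out.get("category") == case["category"]
    n = len(EVAL_SET)
    print(f"format compliance: {parsed}/{n}, accuracy: {correct}/{n}")
```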
Skipping guardrails in production
No input validation, no output constraints, no audit logs. Then wondering why things break or behave unexpectedly.
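A minimal guardrail layer can be plain Python: validate the input before the call, constrain the output after it, and log everything in between. The thresholds, categories, and `classify_ticket` helper are assumptions:

```python
# Sketch of lightweight guardrails around an LLM classification call.
import json
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("ticket-analyzer")

ALLOWED_CATEGORIES = {"billing", "technical", "account"}
MAX_TICKET_CHARS = 4_000  # assumption: tune to your inputs

def guarded_classify(ticket: str, classify_ticket) -> dict:
    # Input validation: reject empty or oversized tickets before spending tokens.
    if not ticket.strip() or len(ticket) > MAX_TICKET_CHARS:
        raise ValueError("ticket empty or too long")
    raw = classify_ticket(ticket)  # hypothetical LLM call returning JSON text
    log.info("model_output=%s", raw)  # audit trail
    out = json.loads(raw)  # raises on malformed output
    # Output constraint: only accept categories your downstream code handles.
    if out.get("category") not in ALLOWED_CATEGORIES:
        raise ValueError(f"unexpected category: {out.get('category')!r}")
    return out
```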
Need Help Building GenAI Systems?
I help startups and scale-ups build production-ready GenAI systems with the evaluation rigor and operational discipline it takes to ship reliably at scale.
Get in Touch