
What Is Prompt Engineering? Techniques Every Developer Should Know


---
title: "What Is Prompt Engineering? Techniques Every Developer Should Know"
date: "2026-03-05"
description: "A practical guide to prompt engineering — zero-shot, few-shot, chain-of-thought, and system prompts — with code examples showing how to get consistent, high-quality outputs from LLMs."
tags: ["Prompt Engineering", "AI Fundamentals"]
---

Prompt engineering is the practice of crafting inputs to language models to reliably elicit the outputs you want. As LLMs become infrastructure-level tools, writing good prompts is becoming a core software engineering skill.

This isn't about magic tricks. It's about understanding how LLMs process information and structuring your inputs accordingly.

Why Prompts Matter

LLMs are incredibly sensitive to how questions are phrased. The exact same underlying capability can produce wildly different results:

Bad prompt:

"Fix my code"

Better prompt:

"You are a senior TypeScript engineer. The following function should return the sum of all even numbers in an array, but it has a bug. Identify the bug, explain why it's a bug, and provide the corrected code."

The model hasn't changed — your communication has.

1. Zero-Shot Prompting

A zero-shot prompt gives the model a task with no examples. You rely entirely on the model's pre-trained knowledge.

import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-opus-4-5",
    max_tokens=512,
    messages=[
        {
            "role": "user",
            "content": "Classify the sentiment of this review as positive, negative, or neutral:\n\n'The battery life is decent but the screen is way too dim for outdoor use.'"
        }
    ]
)

print(response.content[0].text)
# Output: Negative

Zero-shot works well for tasks the model has seen frequently during training (translation, summarization, basic classification). For domain-specific or nuanced tasks, you'll want few-shot examples.

2. Few-Shot Prompting

Few-shot prompting provides examples of the desired input/output format. You're showing the model the pattern, not just describing it.

few_shot_prompt = """
Classify the sentiment of customer reviews.

Review: "This product is absolutely incredible! Best purchase I've made this year."
Sentiment: Positive

Review: "Arrived damaged and customer service was unhelpful. Total waste of money."
Sentiment: Negative

Review: "It works as described. Nothing special but gets the job done."
Sentiment: Neutral

Review: "The build quality is great but it's a bit overpriced for what you get."
Sentiment: """

response = client.messages.create(
    model="claude-opus-4-5",
    max_tokens=10,
    messages=[{"role": "user", "content": few_shot_prompt}]
)

print(response.content[0].text)  # Output: Negative

Tips for Few-Shot Prompts

  • 3–5 examples is usually the sweet spot
  • Cover edge cases in your examples
  • Keep the format strictly consistent
  • Order matters — put more representative examples later

3. Chain-of-Thought (CoT) Prompting

For complex reasoning tasks, asking the model to "think step by step" dramatically improves accuracy. This is Chain-of-Thought prompting.

cot_prompt = """
Solve this problem step by step:

A server processes 500 requests per second at peak load.
Each request takes an average of 200ms to process.
The server has 4 CPU cores.

How many concurrent requests can the server handle before queuing begins?

Think through this step by step, then give your final answer.
"""

response = client.messages.create(
    model="claude-opus-4-5",
    max_tokens=512,
    messages=[{"role": "user", "content": cot_prompt}]
)

print(response.content[0].text)

The model will reason through:

  1. Each request holds a worker for 200ms = 0.2 seconds
  2. Each core can serve 1 / 0.2s = 5 requests per second, so 4 cores give 20 RPS of capacity
  3. By Little's law, sustaining 500 RPS requires 500 × 0.2 = 100 requests in flight at once
  4. With only 20 RPS of capacity against 500 RPS of demand, queuing begins almost immediately

Without CoT, the model might just guess. With CoT, it works through the logic correctly.
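The arithmetic the model is expected to reproduce can be sanity-checked directly (a quick verification, using Little's law: average requests in flight = arrival rate × average latency):

```python
# Sanity-check the chain-of-thought arithmetic from the prompt above.
rate_rps = 500    # peak requests per second
latency_s = 0.2   # 200 ms average processing time per request
cores = 4         # assume one CPU-bound request per core at a time

# Little's law: requests in flight = arrival rate * latency
in_flight = rate_rps * latency_s       # 100 requests in flight on average

# Throughput the cores can actually sustain
throughput = cores * (1 / latency_s)   # 20 requests per second

print(in_flight, throughput)  # 100.0 20.0
```

Demand (500 RPS) exceeds capacity (20 RPS) by 25×, which is exactly why the reasoning concludes that queuing starts almost immediately.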

Zero-Shot CoT

You don't always need examples. Simply appending "Think step by step." or "Let's reason through this carefully." activates chain-of-thought reasoning:

response = client.messages.create(
    model="claude-opus-4-5",
    max_tokens=1024,
    messages=[{
        "role": "user",
        "content": "Should I use PostgreSQL or MongoDB for a social media app's post feed? Think step by step."
    }]
)

4. System Prompts

System prompts set the model's role, persona, and constraints before the conversation begins. They're the most powerful tool for consistent behavior across a production application.

response = client.messages.create(
    model="claude-opus-4-5",
    max_tokens=1024,
    system="""You are a senior software engineer specializing in Kotlin and Android development.
You have 10 years of experience building production Android apps.

Your responses should:
- Be technically precise and opinionated
- Prefer idiomatic Kotlin over Java-style code
- Recommend Jetpack Compose for UI
- Mention potential performance implications when relevant
- Be concise — no fluff

If asked something outside Android/Kotlin, politely redirect.""",
    messages=[{
        "role": "user",
        "content": "What's the best way to handle API calls in a ViewModel?"
    }]
)

5. Structured Output

LLMs can reliably generate structured formats when explicitly instructed. This is essential for parsing model outputs programmatically.

import json

structured_prompt = """
Extract the key information from this job posting and return it as valid JSON only.
No explanation, just the JSON object.

Job posting:
"We're hiring a Senior Backend Engineer to join our fintech startup.
You'll work with Python, FastAPI, and PostgreSQL.
3+ years of experience required. Remote-friendly, salary $130k-$160k."

Return this exact JSON structure:
{
  "title": "",
  "level": "",
  "technologies": [],
  "experience_years_min": 0,
  "salary_min": 0,
  "salary_max": 0,
  "remote": true/false
}
"""

response = client.messages.create(
    model="claude-opus-4-5",
    max_tokens=256,
    messages=[{"role": "user", "content": structured_prompt}]
)

parsed = json.loads(response.content[0].text)
print(parsed["technologies"])  # ["Python", "FastAPI", "PostgreSQL"]

Modern LLM APIs (including Anthropic) also support tool use for enforced JSON schemas — prefer that for production over free-text JSON parsing.
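As a rough sketch of that tool-use approach with the Anthropic SDK (the tool name `record_job` and the trimmed-down schema here are illustrative, not part of the post's earlier example): you define a tool whose `input_schema` is the JSON shape you want, then force the model to call it, so the reply arrives as a parsed object rather than free text.

```python
# JSON Schema for the structure we want enforced (illustrative subset).
job_schema = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "technologies": {"type": "array", "items": {"type": "string"}},
        "salary_min": {"type": "integer"},
        "salary_max": {"type": "integer"},
        "remote": {"type": "boolean"},
    },
    "required": ["title", "technologies"],
}

def extract_job(posting: str) -> dict:
    import anthropic
    client = anthropic.Anthropic()
    response = client.messages.create(
        model="claude-opus-4-5",
        max_tokens=256,
        tools=[{
            "name": "record_job",
            "description": "Record structured fields extracted from a job posting.",
            "input_schema": job_schema,
        }],
        # Force the tool call so the reply is always schema-shaped.
        tool_choice={"type": "tool", "name": "record_job"},
        messages=[{"role": "user", "content": f"Extract the key fields:\n\n{posting}"}],
    )
    # With forced tool use, content[0] is a tool_use block and
    # .input is already a Python dict — no json.loads needed.
    return response.content[0].input
```

Compared with free-text JSON, this removes the failure modes of markdown fences, trailing commentary, and malformed output.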

6. Prompt Chaining

For complex tasks, break them into a pipeline of simpler prompts. Each model call does one thing well.

def analyze_code(code: str) -> dict:
    # Step 1: Identify potential issues
    issues_response = client.messages.create(
        model="claude-opus-4-5",
        max_tokens=512,
        system="You are a code reviewer. List potential bugs and issues concisely.",
        messages=[{"role": "user", "content": f"Review this code:\n\n```python\n{code}\n```"}]
    )
    issues = issues_response.content[0].text

    # Step 2: Generate fixes based on identified issues
    fixes_response = client.messages.create(
        model="claude-opus-4-5",
        max_tokens=1024,
        system="You are a senior engineer. Provide corrected code with explanations.",
        messages=[{
            "role": "user",
            "content": f"Original code:\n```python\n{code}\n```\n\nIdentified issues:\n{issues}\n\nProvide the fixed code."
        }]
    )

    return {
        "issues": issues,
        "fixed_code": fixes_response.content[0].text
    }

Common Prompt Engineering Mistakes

1. Being vague about format
Don't: "Give me some options"
Do: "List exactly 5 options, one per line, starting with a dash"

2. Not specifying the audience
Don't: "Explain recursion"
Do: "Explain recursion to a junior developer who knows Python but hasn't studied computer science formally"

3. Prompting for opinion without constraints
Don't: "What database should I use?"
Do: "I'm building a real-time chat app for 50k users, using Node.js, with a 3-person team. We need flexible schema for message metadata. Should I use MongoDB or PostgreSQL? Give a direct recommendation with reasoning."

4. Not using system prompts for production
In production, system prompts are your guardrails. Always define the model's role, output format, and constraints.

5. Ignoring temperature
For deterministic tasks (classification, extraction), use temperature=0. For creative tasks, temperature=0.7–1.0 gives variety.
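Temperature is just another parameter on the API call. A minimal sketch of pinning it for a deterministic task (the function name is illustrative):

```python
def classify_sentiment(review: str) -> str:
    """Classification is deterministic work: pin temperature to 0
    so repeated calls on the same input agree."""
    import anthropic
    client = anthropic.Anthropic()
    response = client.messages.create(
        model="claude-opus-4-5",
        max_tokens=10,
        temperature=0,  # no sampling randomness for classification/extraction
        messages=[{
            "role": "user",
            "content": f"Classify the sentiment as Positive, Negative, or Neutral:\n\n{review}",
        }],
    )
    return response.content[0].text.strip()
```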

Prompt Engineering in Production

When shipping LLM features:

  1. Version your prompts — store them in code, track changes in git
  2. Evaluate systematically — build eval datasets, measure accuracy
  3. Use structured output — tool use / JSON mode for parseable responses
  4. Add guardrails — validate outputs before showing to users
  5. Monitor in production — log inputs, outputs, latencies, error rates
  6. Iterate based on real failures — production data is your best eval set

Summary

| Technique | When to use |
|---|---|
| Zero-shot | Simple, well-defined tasks the model knows well |
| Few-shot | Domain-specific tasks, consistent output format |
| Chain-of-thought | Multi-step reasoning, math, logic |
| System prompts | Production apps, consistent persona/behavior |
| Structured output | Data extraction, classification, API responses |
| Prompt chaining | Complex multi-step workflows |

Prompt engineering is an empirical discipline. Form a hypothesis, test it, measure the results, iterate. The engineers who get good at it aren't just better at using AI tools — they ship better AI products.
