Published: 2026-06-01 • Updated: 2026-07-05

Advanced Prompting: System Messages, Few-Shot, and Chain-of-Thought

For developers, moving from basic chat interactions with Large Language Models (LLMs) to building production-grade applications requires a deep understanding of advanced prompting techniques. While simple queries work for casual use, software development demands predictability, structured output, and logical reasoning. This lesson covers the three pillars of advanced prompting: System Messages, Few-Shot Prompting, and Chain-of-Thought (CoT) reasoning.

The Advanced Prompting Pipeline

When building LLM-powered applications, a prompt is typically assembled from distinct layers that are sent to the model together. Understanding this layout helps you structure your API calls for efficiency and predictability.

+-------------------------------------------------------------+
|                        SYSTEM MESSAGE                       |
|  (Defines the Persona, Constraints, and Output Format)      |
+-------------------------------------------------------------+
                              |
                              v
+-------------------------------------------------------------+
|                      FEW-SHOT EXAMPLES                      |
|  (Demonstrates Input -> Output patterns for the model)      |
+-------------------------------------------------------------+
                              |
                              v
+-------------------------------------------------------------+
|                     CHAIN-OF-THOUGHT                        |
|  (Forces step-by-step reasoning before the final answer)    |
+-------------------------------------------------------------+
                              |
                              v
+-------------------------------------------------------------+
|                         USER INPUT                          |
|  (The actual runtime query or data to be processed)         |
+-------------------------------------------------------------+
    

1. System Messages: Setting the Rules of Engagement

The System Message (often referred to as the system prompt) is the foundational instruction set given to the LLM. It establishes the model's persona, defines behavioral boundaries, sets the output format, and dictates how to handle edge cases.

In APIs such as OpenAI's Chat Completions API, the system message is passed under its own role, distinct from the user's input; Anthropic's Messages API accepts it as a separate top-level system parameter. This separation makes it harder, though not impossible, for user input to override your instructions.

Why System Messages Matter

  • Consistency: They ensure the model maintains the same tone and format across multiple API calls.
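  • Security: They provide a first layer of defense against prompt injection attacks, though they are not sufficient on their own; validate inputs and outputs in application code as well.
  • Format Control: They steer the model toward raw JSON, XML, or specific code structures without conversational filler.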

Example: API Configuration System Message

// System Message
You are a strict code-review assistant. Your task is to analyze Java code for security vulnerabilities.
- Output ONLY valid JSON. Do not include any markdown formatting, backticks, or introductory text.
- If no vulnerabilities are found, return an empty JSON array: [].
- If vulnerabilities are found, return an array of objects containing "line", "severity", and "description".
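
One way to wire this system message into application code is sketched below. The Message record, the role strings, and the looksLikeJsonArray helper are illustrative assumptions, not any vendor's SDK; real client libraries define their own request types. The sketch also shows a cheap check that a reply respects the JSON-only rule before it is handed to a proper JSON parser.

```java
import java.util.List;

// Minimal sketch: Message is a hypothetical type, not part of any vendor SDK.
class SystemMessageSketch {
    record Message(String role, String content) {}

    static final String SYSTEM_PROMPT = String.join("\n",
        "You are a strict code-review assistant. Your task is to analyze Java code for security vulnerabilities.",
        "- Output ONLY valid JSON. Do not include any markdown formatting, backticks, or introductory text.",
        "- If no vulnerabilities are found, return an empty JSON array: [].",
        "- If vulnerabilities are found, return an array of objects containing \"line\", \"severity\", and \"description\".");

    // The system message goes first; the code under review is the user message.
    static List<Message> buildMessages(String javaSource) {
        return List.of(
            new Message("system", SYSTEM_PROMPT),
            new Message("user", javaSource));
    }

    // Cheap pre-check only: rejects replies wrapped in prose or markdown.
    // It does not prove the reply is valid JSON; parse it with a JSON library afterwards.
    static boolean looksLikeJsonArray(String reply) {
        String trimmed = reply.strip();
        return trimmed.startsWith("[") && trimmed.endsWith("]");
    }

    public static void main(String[] args) {
        List<Message> messages = buildMessages("String q = \"SELECT * FROM users WHERE id = \" + id;");
        System.out.println(messages.get(0).role() + " message comes first");
        System.out.println(looksLikeJsonArray("[]"));
        System.out.println(looksLikeJsonArray("Sure! Here is the JSON: []"));
    }
}
```

Keeping the format check in code means a reply that ignores the system message is caught before it reaches downstream consumers.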

2. Few-Shot Prompting: Teaching by Example

LLMs are highly capable of Zero-Shot Learning, which means performing a task based purely on instructions without seeing any examples. However, when you need a highly specific output format, complex classification, or domain-specific logic, Few-Shot Prompting is usually the more reliable technique.

Few-Shot prompting involves providing the model with one or more high-quality examples of input-output pairs before presenting the actual query. This teaches the model the exact pattern you expect.

Example: Converting Natural Language to SQL

Without examples, an LLM might write overly complex SQL queries or use non-existent tables. Few-shot prompting aligns the model to your specific database schema.

// System Message
You are a SQL generator for a PostgreSQL database. Use only the tables: users(id, name, created_at) and purchases(id, user_id, amount, purchase_date).

// Example 1 (User Input)
Find users who registered in the last 30 days.

// Example 1 (Model Output)
SELECT name FROM users WHERE created_at >= CURRENT_DATE - INTERVAL '30 days';

// Example 2 (User Input)
Get the total spend of user "Alice".

// Example 2 (Model Output)
SELECT SUM(p.amount) FROM purchases p JOIN users u ON p.user_id = u.id WHERE u.name = 'Alice';

// Active User Query (User Input)
Find the names of users who spent more than $100 in total.
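
The prompt layout above can be expressed in code as an alternating user/assistant message history. This is a minimal sketch; the Message and Example records and the buildMessages helper are hypothetical names introduced for illustration, not a specific SDK's API.

```java
import java.util.ArrayList;
import java.util.List;

// Minimal sketch: Message and Example are hypothetical types for illustration.
class FewShotSketch {
    record Message(String role, String content) {}
    record Example(String input, String output) {}

    // Order: system message, then each example as a user/assistant pair, then the live query.
    static List<Message> buildMessages(String systemPrompt, List<Example> examples, String query) {
        List<Message> messages = new ArrayList<>();
        messages.add(new Message("system", systemPrompt));
        for (Example example : examples) {
            messages.add(new Message("user", example.input()));
            messages.add(new Message("assistant", example.output()));
        }
        messages.add(new Message("user", query));
        return messages;
    }

    public static void main(String[] args) {
        List<Example> examples = List.of(
            new Example("Find users who registered in the last 30 days.",
                "SELECT name FROM users WHERE created_at >= CURRENT_DATE - INTERVAL '30 days';"),
            new Example("Get the total spend of user \"Alice\".",
                "SELECT SUM(p.amount) FROM purchases p JOIN users u ON p.user_id = u.id WHERE u.name = 'Alice';"));

        List<Message> messages = buildMessages(
            "You are a SQL generator for a PostgreSQL database. Use only the tables: "
                + "users(id, name, created_at) and purchases(id, user_id, amount, purchase_date).",
            examples,
            "Find the names of users who spent more than $100 in total.");

        for (Message message : messages) {
            System.out.println(message.role() + ": " + message.content());
        }
    }
}
```

Because the examples live in a list, adding or swapping one is a data change rather than a rewrite of the prompt text.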

3. Chain-of-Thought (CoT) Prompting: Logical Reasoning

For complex logic, mathematical calculations, or multi-step debugging, LLMs can fail if they attempt to generate the final answer immediately. This happens because autoregressive language models predict the next token based on previous tokens. If they start writing the answer without "thinking" first, they often hallucinate or make logical leaps.

Chain-of-Thought (CoT) prompting addresses this by having the model generate its intermediate reasoning steps before delivering the final answer. Simply adding the phrase "Let's think step by step" to a prompt can noticeably improve accuracy on multi-step problems, although the size of the gain depends on the model and the task.
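
In application code, the reasoning usually has to be separated from the answer your program consumes. The sketch below appends the step-by-step instruction to a task and then extracts the final answer from the reply. The "Final answer:" marker and both helper names are assumptions made for this example, not a standard convention.

```java
import java.util.Optional;

// Minimal sketch: the answer marker is an assumed convention chosen for this example.
class ChainOfThoughtSketch {
    static final String ANSWER_MARKER = "Final answer:";

    // Appends the CoT trigger phrase and asks for a machine-readable last line.
    static String withStepByStep(String task) {
        return task + "\n\nLet's think step by step. "
            + "After your reasoning, finish with a single line that starts with \""
            + ANSWER_MARKER + "\".";
    }

    // Returns the text after the last marker, or empty if the model ignored the format.
    static Optional<String> extractFinalAnswer(String reply) {
        int start = reply.lastIndexOf(ANSWER_MARKER);
        if (start < 0) {
            return Optional.empty();
        }
        return Optional.of(reply.substring(start + ANSWER_MARKER.length()).strip());
    }

    public static void main(String[] args) {
        System.out.println(withStepByStep("What is 17 * 3 + 4?"));

        // A hand-written reply standing in for real model output.
        String reply = "Step 1: 17 * 3 = 51.\nStep 2: 51 + 4 = 55.\nFinal answer: 55";
        System.out.println(extractFinalAnswer(reply).orElse("(no answer found)"));
    }
}
```

Returning an Optional makes the missing-marker case explicit, so the caller can retry or fall back instead of treating the reasoning text as the answer.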

Example: Debugging a Complex Java Concurrency Issue

Instead of asking the model to "fix this code," we instruct it to analyze the execution flow step-by-step.

// Prompt
Analyze this Java code for potential deadlocks. 
Explain your reasoning step-by-step, identifying which threads hold which locks and in what order, before suggesting the fix.

public class DeadlockDemo {
    private final Object lock1 = new Object();
    private final Object lock2 = new Object();

    public void methodA() {
        synchronized (lock1) {
            synchronized (lock2) {
                System.out.println("Method A executed");
            }
        }
    }

    public void methodB() {
        synchronized (lock2) {
            synchronized (lock1) {
                System.out.println("Method B executed");
            }
        }
    }
}

Expected Chain-of-Thought Output

Step 1: Identify the shared resources. The class has two lock objects: lock1 and lock2.
Step 2: Trace Thread 1 executing methodA(). Thread 1 acquires lock1 and waits to acquire lock2.
Step 3: Trace Thread 2 executing methodB(). Thread 2 acquires lock2 and waits to acquire lock1.
Step 4: Analyze the conflict. If Thread 1 holds lock1 and wants lock2, while Thread 2 holds lock2 and wants lock1, neither thread can proceed. This is a classic circular wait condition (deadlock).
Step 5: Formulate the solution. To prevent deadlock, both methods must acquire the locks in the exact same order.

Suggested Fix:
Modify methodB() to acquire lock1 before lock2.
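
The suggested fix can be sketched as follows. The class name DeadlockFixed, the completedCalls counter (used here in place of the print statements so the result can be checked), and the runConcurrently helper are additions made for this illustration.

```java
// Minimal sketch of the fix: both methods take lock1 first, then lock2.
class DeadlockFixed {
    private final Object lock1 = new Object();
    private final Object lock2 = new Object();
    private int completedCalls = 0; // only modified while holding both locks

    public void methodA() {
        synchronized (lock1) {
            synchronized (lock2) {
                completedCalls++;
            }
        }
    }

    public void methodB() {
        // Same acquisition order as methodA, so a circular wait cannot form.
        synchronized (lock1) {
            synchronized (lock2) {
                completedCalls++;
            }
        }
    }

    // Runs methodA and methodB on two threads and returns the total number of completed calls.
    static int runConcurrently(int iterationsPerThread) {
        DeadlockFixed demo = new DeadlockFixed();
        Thread t1 = new Thread(() -> {
            for (int i = 0; i < iterationsPerThread; i++) demo.methodA();
        });
        Thread t2 = new Thread(() -> {
            for (int i = 0; i < iterationsPerThread; i++) demo.methodB();
        });
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for worker threads", e);
        }
        // Safe to read without locking: join() guarantees the workers' writes are visible.
        return demo.completedCalls;
    }

    public static void main(String[] args) {
        System.out.println("Completed calls: " + runConcurrently(10_000));
    }
}
```

With a single global lock order, whichever thread acquires lock1 first is guaranteed to be able to acquire lock2 as well, so both threads always finish.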

Combining All Three: The Production-Grade Developer Prompt

To build robust applications, you should combine System Messages, Few-Shot examples, and Chain-of-Thought reasoning into a single, cohesive prompt structure.

// SYSTEM MESSAGE
You are an expert Java log-analysis assistant. Your job is to parse raw log files, identify stack traces, explain the root cause step-by-step, and output a structured JSON response.

// FEW-SHOT EXAMPLES
User:
Log: "Exception in thread "main" java.lang.NullPointerException at com.example.App.main(App.java:14)"

Assistant:
Reasoning:
1. The log shows a NullPointerException (NPE).
2. The NPE occurred in the main thread.
3. The exact location is in the class com.example.App, in the main method, at line 14 of App.java.
4. This means a variable accessed at line 14 was null when a method or property was invoked on it.
Output:
{
  "exception": "java.lang.NullPointerException",
  "class": "com.example.App",
  "method": "main",
  "line": 14,
  "root_cause": "An object reference was null when accessed."
}

// ACTUAL USER INPUT
User:
Log: "Exception in thread "pool-1-thread-2" java.lang.ArrayIndexOutOfBoundsException: Index 5 out of bounds for length 5 at com.example.ArrayUtils.getElement(ArrayUtils.java:22)"
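
A reply that follows the few-shot layout above contains a "Reasoning:" section followed by an "Output:" section, so the application needs to split the two before parsing the JSON. The sketch below shows one way to do that; the ParsedReply record and the parse helper are hypothetical names, and the sample reply is hand-written rather than real model output.

```java
// Minimal sketch: splits a reply laid out as "Reasoning: ... Output: { ... }".
class CombinedReplySketch {
    record ParsedReply(String reasoning, String json) {}

    static final String REASONING_LABEL = "Reasoning:";
    static final String OUTPUT_LABEL = "Output:";

    static ParsedReply parse(String reply) {
        int reasoningStart = reply.indexOf(REASONING_LABEL);
        int outputStart = reply.indexOf(OUTPUT_LABEL);
        // Both labels must be present, with the reasoning section first.
        if (reasoningStart < 0 || outputStart < reasoningStart) {
            throw new IllegalArgumentException("Reply does not follow the Reasoning/Output layout");
        }
        String reasoning = reply.substring(reasoningStart + REASONING_LABEL.length(), outputStart).strip();
        String json = reply.substring(outputStart + OUTPUT_LABEL.length()).strip();
        return new ParsedReply(reasoning, json);
    }

    public static void main(String[] args) {
        // Hand-written stand-in for a model reply.
        String reply = String.join("\n",
            "Reasoning:",
            "1. The log shows an ArrayIndexOutOfBoundsException.",
            "2. Index 5 was requested from an array of length 5, whose last valid index is 4.",
            "Output:",
            "{ \"exception\": \"java.lang.ArrayIndexOutOfBoundsException\", \"line\": 22 }");

        ParsedReply parsed = parse(reply);
        System.out.println("Reasoning (for logs): " + parsed.reasoning());
        System.out.println("JSON (for the pipeline): " + parsed.json());
    }
}
```

Keeping the reasoning for audit logs while passing only the JSON downstream preserves the debugging value of Chain-of-Thought without polluting structured consumers.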

Real-World Use Cases

  • Automated Code Review Systems: Using system messages to enforce company coding standards, few-shot prompts to show acceptable vs. unacceptable code, and CoT to explain why a line of code fails standards.
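  • Natural Language Database Interfaces: Enabling non-technical users to query databases by converting English to SQL, using few-shot examples written against the actual database schema. Generated queries should still be validated and run with restricted permissions before they are treated as safe.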
  • Structured Log Analyzers: Processing millions of chaotic server logs into clean, structured JSON data for ingestion into monitoring dashboards like Kibana or Datadog.

Common Mistakes to Avoid

  • Putting Examples in the System Message: Keep system messages focused on instructions and constraints. Put your few-shot examples in the user/assistant message history to keep the context clean and structured.
  • Example Bias: If your few-shot examples only show one type of output (e.g., all examples return true), the model may become biased and output that response even when it shouldn't. Ensure your examples cover diverse scenarios.
  • Ignoring Token Costs: Chain-of-Thought prompting requires the model to write out its reasoning. Since you pay per token for both input and output, CoT increases API costs and latency. Use it only when logical reasoning is strictly necessary.
  • Vague System Instructions: Writing "Be helpful and write good code" is too vague. Be explicit: "You are a Java 17 code generator. Output only valid Java code without markdown formatting."
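
The token-cost point can be made concrete with a little arithmetic. In the sketch below, the prices and token counts are hypothetical placeholders chosen only to illustrate the calculation; substitute your provider's actual rates and your own measured token counts.

```java
// Minimal sketch: all prices and token counts here are hypothetical placeholders.
class CotCostSketch {
    // Assumed prices in USD per one million tokens; not any real provider's rates.
    static final double INPUT_PRICE_PER_MILLION = 1.00;
    static final double OUTPUT_PRICE_PER_MILLION = 4.00;

    // cost = (input tokens * input price + output tokens * output price) / 1,000,000
    static double costUsd(long inputTokens, long outputTokens) {
        return (inputTokens * INPUT_PRICE_PER_MILLION
            + outputTokens * OUTPUT_PRICE_PER_MILLION) / 1_000_000.0;
    }

    public static void main(String[] args) {
        // Direct answer: short prompt, short reply.
        double direct = costUsd(500, 50);
        // Chain-of-Thought: slightly longer prompt, much longer reply.
        double withCot = costUsd(520, 400);

        System.out.printf("Direct answer:    $%.5f per call%n", direct);
        System.out.printf("Chain-of-Thought: $%.5f per call%n", withCot);
        System.out.printf("Cost ratio:       %.1fx%n", withCot / direct);
    }
}
```

Under these assumed numbers most of the extra cost comes from output tokens, which is why capping or shortening the reasoning is usually the first lever to try.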

Interview Notes for Developers

  • How do you handle LLM hallucinations in production? Explain that you use System Messages to define strict boundaries, Few-Shot prompting to show expected outputs, and Chain-of-Thought to make the model reason before it answers. Note that these techniques reduce hallucinations rather than eliminate them, so outputs should still be validated in code.
  • What is the difference between Zero-Shot and Few-Shot prompting? Zero-shot asks the model to perform a task with instructions only. Few-shot provides one or more examples showing the desired input-output behavior, making it far more reliable for structured tasks.
  • How do you optimize LLM latency when using Chain-of-Thought? Mention two options: route simple tasks to a cheaper, faster model and reserve Chain-of-Thought for tasks that need it, or instruct the model to keep its reasoning steps concise to minimize output tokens.
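
The routing idea from the latency question can be sketched as a small decision function. The keyword list below is a deliberately naive, hypothetical heuristic used only to show the shape of the code; a real system would more likely route on task type, measured accuracy, or a classifier.

```java
import java.util.List;
import java.util.Locale;

// Minimal sketch: the keyword heuristic is a hypothetical stand-in for a real routing rule.
class PromptRouterSketch {
    enum Strategy { DIRECT_ANSWER, CHAIN_OF_THOUGHT }

    // Assumed signals that a task needs multi-step reasoning.
    static final List<String> REASONING_HINTS = List.of("deadlock", "debug", "calculate", "root cause");

    static Strategy choose(String task) {
        String normalized = task.toLowerCase(Locale.ROOT);
        for (String hint : REASONING_HINTS) {
            if (normalized.contains(hint)) {
                return Strategy.CHAIN_OF_THOUGHT;
            }
        }
        return Strategy.DIRECT_ANSWER;
    }

    public static void main(String[] args) {
        System.out.println(choose("Analyze this Java code for potential deadlocks."));
        System.out.println(choose("Convert this date to ISO 8601 format."));
    }
}
```

Tasks routed to DIRECT_ANSWER can go to a cheaper, faster model with no reasoning instruction, while CHAIN_OF_THOUGHT tasks get the step-by-step prompt.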

Summary

Advanced prompting makes LLM output predictable and structured enough to build software on. By mastering System Messages, you control the model's persona and constraints. Through Few-Shot prompting, you define the exact patterns and formats you require. With Chain-of-Thought reasoning, you improve the model's ability to solve complex, multi-step logical problems. Combining these three techniques gives you a practical foundation for production-grade AI integrations.

About the Author

Naresh Kumar

Senior Java Backend Engineer experienced in Banking, Payments, ISO 20022, Spring Boot, Microservices, Kafka, Docker, Kubernetes, AWS and Cloud Native Systems.

Built enterprise payment solutions, transaction processing systems, API platforms and scalable microservices used in production.
