Prompt engineering is quickly becoming one of the most useful everyday skills in the world of AI. It lets regular people guide powerful models like Gemini (or ChatGPT, Claude, and others) to give more accurate, creative, or structured answers. This post gives you a clear, detailed walkthrough of the key ideas from Lee Boonstra's comprehensive whitepaper on prompt engineering.
We have expanded the explanations so you can really understand how each technique works, why it helps, and when to reach for it, all in plain English. The goal is to help you grasp the fundamentals deeply without needing to read the full 68-page document. Everything here comes directly from the whitepaper's content and examples.
Let's dive in!
What Prompt Engineering Really Means
Large language models are trained on enormous amounts of text. They do not truly "understand" in the human sense. Instead, they predict the next word (or small chunk called a token) based on patterns they have seen before. When you type something into the model, that input is your prompt, and everything that follows is the model's best guess at what should come next.
Prompt engineering is the art and practice of writing those inputs in a smart way so the model produces exactly the kind of output you want. You control the direction, tone, format, accuracy, and usefulness of the reply simply by choosing your words carefully.
You do not need to be a programmer or data scientist. Anyone can improve at it through trial and error. The whitepaper stresses that the best results come from iteration: write a prompt, read the answer, notice what went wrong or could be better, then adjust and try again. Small changes in wording, order, or added instructions often make a big difference.
Shaping the Model's Behavior with Settings
Even before you write the prompt itself, you can adjust a few built-in controls that affect every response the model gives.
First is output length. You can set a maximum number of tokens so answers stay short and focused (which also saves time and cost). Good prompts help the model pack useful information into that limit without cutting off early.
Next come the sampling settings, which decide how the model picks each next token from all its possible choices.
- Temperature controls randomness. A value close to 0 makes the model very focused and predictable. It almost always picks the most likely word, which is great for factual questions, math, code, or anything where you want consistency. Higher temperature (for example 0.7–1.0) introduces more variety and surprise, which suits creative writing, brainstorming ideas, or storytelling.
- Top-K narrows the options to only the K most probable tokens. A small K (like 10) keeps things safe and on-topic. A larger K allows more unusual but still reasonable choices.
- Top-P (also called nucleus sampling) works differently. It includes tokens until their combined probability reaches a set threshold (for example 0.95). This usually gives a nice balance between focus and diversity.
The whitepaper suggests combining these thoughtfully. A common reliable mix is low temperature (around 0.2), top-P at 0.95, and moderate top-K (around 30–40). It also warns about a common issue called the repetition loop bug: if settings push the model too far toward certainty or randomness, it can get stuck repeating the same words or phrases endlessly.
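To make the three settings concrete, here is a toy next-token sampler that applies temperature, then top-K, then top-P filtering to a made-up probability table. This is a sketch for intuition only, not how any real model implements sampling, and the food-word "vocabulary" is invented:

```python
import math
import random

def sample_token(probs, temperature=1.0, top_k=None, top_p=None, rng=None):
    """Pick one token from a {token: probability} dict using temperature,
    then top-K and top-P (nucleus) filtering. Toy illustration only."""
    rng = rng or random.Random()
    # Temperature: rescale log-probabilities, then renormalize.
    # Low temperature sharpens the distribution; high temperature flattens it.
    logits = {t: math.log(p) / temperature for t, p in probs.items()}
    peak = max(logits.values())
    weights = {t: math.exp(l - peak) for t, l in logits.items()}
    total = sum(weights.values())
    dist = {t: w / total for t, w in weights.items()}

    # Top-K: keep only the K most probable tokens.
    ranked = sorted(dist.items(), key=lambda kv: kv[1], reverse=True)
    if top_k is not None:
        ranked = ranked[:top_k]

    # Top-P: keep tokens until their cumulative probability reaches the threshold.
    if top_p is not None:
        kept, cum = [], 0.0
        for t, p in ranked:
            kept.append((t, p))
            cum += p
            if cum >= top_p:
                break
        ranked = kept

    tokens, ps = zip(*ranked)
    return rng.choices(tokens, weights=ps, k=1)[0]

probs = {"pizza": 0.5, "pasta": 0.3, "salad": 0.15, "sushi": 0.05}
# Very low temperature is almost deterministic: the top token nearly always wins.
print(sample_token(probs, temperature=0.1))
```

Notice how `top_k=1` or a tight `top_p` collapses the choice down to the single most likely token, which is why low-randomness settings give such consistent answers.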
Starting with the Basics: Core Prompting Styles
The simplest and often most effective techniques come first.
Zero-shot prompting
You give the model a clear task description with no examples at all.
- For example, you might write: "Read this movie review and classify it as POSITIVE, NEUTRAL, or NEGATIVE. Review: 'The special effects were amazing, but the story felt rushed and unoriginal.'"
- The model tries to follow your instructions directly. Keep temperature low so it stays consistent. Many tasks work surprisingly well this way. When results are shaky, track different wordings in a simple table so you can see patterns and improve.
One-shot and few-shot prompting
Here you show the model what you want by including examples.
- One-shot means one example. Few-shot usually means three to five.
- A classic use case is turning free-form text into structured data. You might show two or three pizza orders written naturally, each followed by the same JSON format, then give a new order and ask for the same treatment.
- Pick examples that cover normal cases and tricky edge cases. Put your actual task at the very end. This method often lifts performance a lot when zero-shot struggles, because the model learns the pattern from your demonstrations.
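Assembling a few-shot prompt is mostly string-building. Here is a sketch of the pizza-order pattern; the order texts and JSON fields are invented for illustration, not the whitepaper's exact examples:

```python
import json

# Hypothetical worked examples: natural-language order -> structured JSON.
EXAMPLES = [
    ("I'd like a small pizza with cheese and pepperoni.",
     {"size": "small", "toppings": ["cheese", "pepperoni"]}),
    ("Can I get a large one, half mushroom and half ham?",
     {"size": "large", "toppings": ["mushroom", "ham"]}),
]

def few_shot_prompt(new_order):
    """Build a few-shot prompt: a task description, the worked
    examples, and the actual task at the very end."""
    parts = ["Parse each pizza order into JSON."]
    for text, parsed in EXAMPLES:
        parts.append(f"Order: {text}\nJSON: {json.dumps(parsed)}")
    parts.append(f"Order: {new_order}\nJSON:")
    return "\n\n".join(parts)

print(few_shot_prompt("One medium pizza with just olives, please."))
```

The prompt deliberately ends right at `JSON:`, so the model's most natural continuation is the structured output you showed it twice already.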
System, role, and contextual instructions
These give the model stronger guardrails or personality.
- System instructions apply to the whole conversation. You can demand valid JSON output only, forbid inventing facts (and instruct the model to say "I don't know" instead), enforce a certain tone, or require step-by-step reasoning every time.
- Role prompting tells the model to act as a specific character. For example: "You are a friendly, humorous travel expert who has lived in Amsterdam for 20 years. Suggest three unusual museums I should visit and explain why each one is worth seeing." The persona makes answers more engaging and on-brand.
- Contextual prompting adds useful background information right before the task. If you want blog ideas about 1980s arcade games, first supply a short list of popular titles, key cultural details, and target audience preferences. The model then builds on that foundation.
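In chat-style APIs these three layers often map onto a message list. Here is a rough sketch using the common `role`/`content` convention; the exact field names and how system instructions are passed vary by provider:

```python
def build_messages(system, persona, context, task):
    """Combine a system instruction, a role/persona, background
    context, and the task itself into one chat-style message list."""
    return [
        {"role": "system", "content": f"{system} {persona}"},
        {"role": "user", "content": f"Context:\n{context}\n\nTask: {task}"},
    ]

msgs = build_messages(
    system="Answer truthfully; say 'I don't know' rather than inventing facts.",
    persona="You are a friendly, humorous travel expert who has lived in Amsterdam for 20 years.",
    context="The reader enjoys quirky, off-the-beaten-path museums.",
    task="Suggest three unusual museums I should visit and explain why each is worth seeing.",
)
for m in msgs:
    print(m["role"], "->", m["content"][:60])
```

Keeping guardrails in the system message and per-request details in the user message makes it easy to reuse the same persona across many tasks.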
Moving to Advanced Techniques for Harder Tasks
When basic methods are not enough, these strategies help the model reason better, stay consistent, or tackle multi-step problems.
Step-back prompting
Ask a broader, more abstract question first to activate the right knowledge, then follow up with the detailed one. This "step back" often unlocks better performance on complex topics.
Chain-of-Thought (CoT)
Encourage the model to think aloud. Simply add "Let's think step by step" to a zero-shot prompt, or show examples that include detailed reasoning before the final answer (few-shot CoT). This dramatically improves accuracy on math problems, logic puzzles, planning tasks, and anything requiring multiple steps.
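Both CoT variants boil down to prompt construction. A minimal sketch, using a worked age-riddle example in the style of the whitepaper's CoT demonstrations:

```python
COT_EXAMPLE = (
    "Q: When I was 3 years old, my partner was 3 times my age. "
    "Now I am 20. How old is my partner?\n"
    "A: When I was 3, my partner was 3 * 3 = 9, so 6 years older than me. "
    "Now I am 20, so my partner is 20 + 6 = 26. The answer is 26."
)

def zero_shot_cot(question):
    """Zero-shot CoT: just append the step-by-step trigger."""
    return f"Q: {question}\nA: Let's think step by step."

def few_shot_cot(question):
    """Few-shot CoT: show a worked example with its reasoning first."""
    return f"{COT_EXAMPLE}\n\nQ: {question}\nA:"
```

The few-shot version tends to be the stronger of the two, because the example shows the model both the reasoning style and the answer format you expect.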
Self-consistency
Run the same prompt several times (with a bit of randomness from higher temperature or sampling), collect all answers, and choose the most common one. This reduces mistakes on classification or decision questions, such as deciding whether an email reports an urgent bug.
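The voting step is simple to sketch. Here `ask_model` is a stub standing in for real API calls made at a temperature above zero; with a live model each call could return a different answer:

```python
from collections import Counter

def self_consistency(ask_model, prompt, runs=5):
    """Sample the same prompt several times and return the
    answer that appears most often (majority vote)."""
    answers = [ask_model(prompt) for _ in range(runs)]
    return Counter(answers).most_common(1)[0][0]

# Stub model: occasionally misclassifies, but usually gets it right.
responses = iter(["IMPORTANT", "IMPORTANT", "NOT IMPORTANT", "IMPORTANT", "IMPORTANT"])
ask_model = lambda prompt: next(responses)

verdict = self_consistency(ask_model, "Classify this email as IMPORTANT or NOT IMPORTANT: ...", runs=5)
print(verdict)  # IMPORTANT
```

The occasional wrong sample gets outvoted, which is exactly why this technique smooths out one-off reasoning slips.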
Tree of Thoughts
Builds on CoT by exploring several possible reasoning paths at once, like branches on a tree, then evaluating which path looks most promising.
ReAct (Reason + Act)
The model alternates between thinking and taking actions (for example searching the web, doing a calculation, or recalling updated information). This works best when you connect the model to external tools, allowing it to handle questions that need real-time or external data.
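The thought/action/observation loop behind ReAct can be sketched with one toy tool and a scripted stand-in for the model. The `Action: tool[input]` text format here is an assumption for illustration; real ReAct setups define their own action syntax:

```python
import re

def calculator(expr):
    """Toy external tool: evaluate a simple arithmetic expression.
    (eval is for demo purposes only; never use it on untrusted input.)"""
    return str(eval(expr, {"__builtins__": {}}))

def react_loop(model, question, tools, max_steps=5):
    """Alternate between model 'thoughts' and tool calls until the
    model emits a final answer."""
    transcript = f"Question: {question}"
    for _ in range(max_steps):
        step = model(transcript)
        transcript += "\n" + step
        match = re.search(r"Action: (\w+)\[(.+?)\]", step)
        if match:
            tool_name, arg = match.groups()
            # Feed the tool's result back as an observation.
            transcript += f"\nObservation: {tools[tool_name](arg)}"
        elif step.startswith("Final Answer:"):
            return step.removeprefix("Final Answer:").strip()
    return None

# Scripted model: first reasons and calls the tool, then answers.
script = iter([
    "Thought: I need to compute this. Action: calculator[17 * 3]",
    "Final Answer: 51",
])
print(react_loop(lambda t: next(script), "What is 17 * 3?", {"calculator": calculator}))  # 51
```

With a real model, each `model(transcript)` call would see the growing transcript, including tool observations, and decide the next thought or action itself.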
Automatic Prompt Engineering (APE)
Let the model improve your prompt for you. Ask it to generate ten variations of your original instruction, then rank or score them. Pick the best one and use it.
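The generate-then-score loop can be sketched as follows. Both stubs stand in for model calls; in a real APE setup you would score candidates against an evaluation set or ask the model itself to rank them:

```python
def ape(generate_variants, score, instruction, n=10):
    """Automatic Prompt Engineering: have a model propose n rewrites
    of the instruction, score every candidate, and keep the best."""
    candidates = [instruction] + generate_variants(instruction, n)
    return max(candidates, key=score)

# Stubs standing in for real model calls:
variants = lambda text, n: [f"{text} Variant {i}: answer in JSON." for i in range(n)]
score = lambda prompt: ("JSON" in prompt) + len(prompt) / 1000  # toy scoring heuristic

best = ape(variants, score, "Summarize the order.", n=3)
print(best)
```

The original instruction stays in the candidate pool on purpose, so a bad batch of rewrites can never make your prompt worse.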
Using Prompts Effectively for Code
The whitepaper spends time showing how prompts handle programming tasks.
- Generate code: Describe what you need in plain language ("Write a Python script that adds 'draft_' to the start of every filename in a folder").
- Explain code: Paste existing code and ask for a clear, step-by-step breakdown of what each part does.
- Translate code: Convert from one language to another (for example from Bash shell script to Python).
- Debug and review: Show buggy or messy code and ask the model to find errors, explain them, suggest fixes, and recommend cleaner ways to write it.
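A script matching the first bullet's request might look like this. It is a sketch of what the model could plausibly produce, not the whitepaper's exact listing, and the demo runs in a throwaway directory so nothing real gets renamed:

```python
import os
import tempfile

def add_draft_prefix(folder):
    """Rename every regular file in `folder` so its name starts with 'draft_'."""
    for name in os.listdir(folder):
        old = os.path.join(folder, name)
        if os.path.isfile(old) and not name.startswith("draft_"):
            os.rename(old, os.path.join(folder, "draft_" + name))

# Demo in a temporary directory.
with tempfile.TemporaryDirectory() as d:
    open(os.path.join(d, "notes.txt"), "w").close()
    add_draft_prefix(d)
    print(os.listdir(d))  # ['draft_notes.txt']
```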
Always test any code the model produces yourself before using it in important work.
The document briefly mentions multimodal prompting (combining text with images), but most examples stay text-focused.
Practical Tips to Get Better Results
Here are the whitepaper's most repeated pieces of advice:
- Be clear and specific about what you want. Focus on positive instructions (what to do) rather than negatives (what not to do).
- Ask for structured output whenever it helps: JSON, bullet points, tables, numbered steps.
- Use plenty of examples when the task has a pattern.
- Experiment with different roles, tones, and levels of detail.
- Keep a log of your prompts, settings, and outputs so you can track what works.
- Remember that models keep improving. A prompt that worked perfectly last month might need small tweaks after an update.
- Iterate patiently. The difference between an okay answer and a great one often comes from three or four careful refinements.
Final Thoughts
Prompt engineering is really about clear communication combined with smart experimentation. Start with straightforward zero-shot instructions and low randomness. When you need more precision or creativity, add examples, reasoning steps, roles, or advanced patterns. Adjust the model's settings to match the task. Keep testing and tweaking.
With these ideas in your toolkit, you will get far more reliable and useful results from any large language model. The whitepaper shows that anyone can reach expert-level prompting through practice, no special degree required.
There you have it, the essentials from the whitepaper in one easy read! If you want the full details, grab the original at the link below. At prompt01.com, we're all about making AI prompting approachable, so happy prompting!
Original Prompt Engineering whitepaper by Lee Boonstra: https://www.kaggle.com/whitepaper-prompt-engineering