Claude AI — The Complete Guide

0 %

Course content

How Claude Works — Models, Context, and Prompts

Section 1 — Lesson 2

How Claude Works

Models, context windows, tokens, prompts, and temperature — the essential mechanics you need to understand before you can use Claude like a pro.

How Large Language Models Work (Simplified)

Before diving into Claude’s specific features, it helps to understand the basic idea behind all large language models (LLMs). Do not worry — you do not need a PhD in machine learning. The core concept is surprisingly intuitive.

An LLM is a massive neural network that has been trained on enormous amounts of text — books, articles, websites, code repositories, academic papers, and more. During training, the model learns statistical patterns: given a sequence of words, what is the most likely next word? It does this billions of times across trillions of words, gradually building an incredibly rich internal representation of language, logic, facts, and reasoning patterns.

When you send Claude a message, the model does not “look up” an answer in a database. Instead, it generates a response one token at a time, where each token is predicted based on everything that came before it — your message, the conversation history, and the model’s learned patterns. Think of it as an extraordinarily sophisticated autocomplete that understands context, nuance, and logic at a level that often feels like genuine understanding.

💡 Key Insight

Claude does not “know” things the way a human does. It has learned patterns from training data and generates statistically likely continuations. This is why it can be brilliantly correct on one topic and subtly wrong on another — and why understanding how it works helps you use it more effectively.

Tokens and the Context Window

One of the most important concepts to understand is tokens. A token is the basic unit that language models work with. In English, one token is roughly three-quarters of a word. The word “understanding” might be split into two tokens: “under” and “standing”. Short common words like “the” or “is” are single tokens. Code and technical text often use more tokens per word because of special characters and syntax.

The context window is the total amount of text — measured in tokens — that the model can “see” at one time. Think of it as the model’s working memory. Everything in the context window — your current message, all previous messages in the conversation, any uploaded documents, and the system prompt — must fit within this limit.

200K

Claude’s context window in tokens

~150K

Equivalent words (roughly 500 pages)

1 token

≈ 0.75 English words on average

Claude’s 200,000-token context window is one of the largest in the industry. In practical terms, you can paste an entire novel, a full codebase, or hundreds of pages of legal documents into a single conversation — and Claude can reason about all of it simultaneously. This is a game-changer for tasks like document analysis, code review, and research synthesis where having the full picture matters.

However, context windows have an important limitation: once a conversation exceeds the limit, the oldest messages are silently dropped. The model does not warn you. It simply loses access to the earliest parts of the conversation. This is why long conversations can sometimes feel like Claude has “forgotten” something you discussed earlier — it literally has, because that information no longer fits in the window.

✅ Pro Tip

For very long tasks, start a fresh conversation periodically and provide a concise summary of previous context. This gives Claude a clean, full context window to work with and produces better results than a 50-message thread where half the messages have been silently truncated.

Model Tiers in Practice: Opus vs. Sonnet vs. Haiku

You learned about the three Claude tiers in the previous lesson. Now let us look at when to use each one — because choosing the right model is one of the simplest ways to get better results.

Scenario	Best Model	Why
Analyzing a 200-page contract	Opus	Complex reasoning over long context
Writing a marketing email	Sonnet	Good writing, fast turnaround
Classifying 10,000 support tickets	Haiku	High volume, low cost, simple task
Debugging a complex codebase	Opus	Needs deep multi-file reasoning
Brainstorming blog topics	Sonnet	Creative, fast, cost-effective
Quick translation of a sentence	Haiku	Simple task, speed matters

On claude.ai, the default model is Sonnet, which is the right choice for 80% of everyday tasks. You can switch to Opus for demanding work using the model selector dropdown. Haiku is primarily available through the API, where developers build it into automated pipelines that process thousands of requests.

Understanding Prompts

A prompt is simply the text you send to Claude. It can be a question, an instruction, a document to analyze, or a conversation. The quality of your prompt is the single biggest factor in the quality of Claude’s response. This is so important that an entire discipline — prompt engineering — has emerged around the art and science of writing effective prompts.

At its core, the prompt is Claude’s only window into what you want. Claude cannot read your mind. It does not know your background, your industry, your preferences, or your goals unless you tell it. The more context and clarity you provide, the better the output. A vague prompt like “write me an email” will produce a generic result. A specific prompt like “write a professional follow-up email to a client who missed our product demo, tone should be warm but urgent, mention the recording link” will produce something you can actually send.

❌ Weak Prompt

“Write an email.”

No context, no audience, no goal. Claude has to guess everything.

✅ Strong Prompt

“Write a warm but urgent follow-up email to a client who missed our SaaS product demo yesterday. Include a link to the recording and suggest two alternative times next week.”

Clear context, audience, tone, and specific deliverables.

System Prompts — The Hidden Instruction Layer

Beyond what you type in the chat, there is another layer of instructions that shapes Claude’s behavior: the system prompt. This is a special message, usually set by the developer or platform, that Claude receives before your conversation begins. It tells Claude how to behave — what role to play, what tone to use, what constraints to follow, and what information to prioritize.

When you use claude.ai directly, Anthropic sets a default system prompt that instructs Claude to be helpful, harmless, and honest. But when businesses build Claude into their products via the API, they write custom system prompts tailored to their use case. For example, a customer service bot might have a system prompt like: “You are a support agent for Acme Corp. Only answer questions about our products. Never discuss competitors. Always be polite and offer to escalate to a human agent if the issue is complex.”

You can also use system prompts in your own workflows through the API or through tools like Claude Code. This is an incredibly powerful technique that we will explore in depth in later sections — it essentially lets you “program” Claude’s personality and behavior for specific tasks.

Temperature — Controlling Creativity vs. Precision

Temperature is a parameter that controls how “creative” or “random” Claude’s responses are. It is a number between 0 and 1. Understanding temperature helps you get the right type of output for different tasks.

Temperature 0

Most Deterministic

Claude always picks the most likely next token. Responses are consistent and repeatable. Best for factual tasks: data extraction, classification, code generation where correctness is paramount.

Temperature 0.5

Balanced

A mix of reliability and variety. Good for general writing, analysis, and most everyday tasks. This is roughly what you experience on claude.ai by default.

Temperature 1.0

Most Creative

Claude samples from a wider range of possible tokens. Responses are more varied, surprising, and creative. Best for brainstorming, storytelling, and generating diverse ideas.

On claude.ai, you do not have direct control over temperature — Anthropic sets a sensible default. But when using the API, you can set temperature precisely for each request. This is especially useful in production applications: you might use temperature 0 for extracting data from invoices (where you want exact, repeatable results) and temperature 0.8 for generating marketing copy (where variety and creativity are valued).

The Conversation as Input → Output

One final concept that ties everything together: every interaction with Claude is fundamentally an input-output operation. You provide an input (the full conversation context, including the system prompt, all previous messages, and your latest message), and Claude produces an output (the next response). There is no persistent memory between conversations. There is no hidden state. Every response is generated purely from what is in the context window at that moment.

This means that the quality of your output is entirely determined by the quality of your input. If you give Claude clear instructions, relevant context, and specific examples of what you want — the output will be excellent. If you give it a vague one-liner with no context — you get a generic response. This principle is the foundation of everything you will learn in this course.

⚠️ Common Misconception

Claude does not remember previous conversations. Each new conversation starts with a blank slate. If you had a great discussion yesterday and start a new chat today, Claude has zero knowledge of what you discussed. Always provide necessary context at the start of each conversation.

📚 Lesson Summary

LLMs like Claude generate text by predicting the most likely next token based on learned patterns from training data
Claude’s 200K token context window (~150K words) lets you work with entire books, codebases, or document sets at once
Use Opus for complex reasoning, Sonnet for daily tasks, and Haiku for fast high-volume processing
Your prompt is the single biggest factor in output quality — be specific, provide context, state your goal
System prompts let developers and power users program Claude’s behavior for specific use cases
Temperature controls the creativity-precision tradeoff: 0 for factual tasks, higher for creative work
Every conversation is stateless — Claude only knows what is in the current context window

About
Comments (0)

Praktické vysvětlení toho, jak Claude funguje pod kapotou — od základů LLM a tokenových kontextových oken po úrovně modelů, základy prompt engineeringu a nastavení teploty.