Claude API Fundamentals
Messages, Streaming & Models
Your First Claude API Call
The Anthropic SDK is the official way to talk to Claude from code. Unlike chat interfaces, the API gives you full control: you choose the model, set the temperature, define system prompts, and stream responses token by token.
The Messages API
Every Claude interaction is a messages request. You send an array of messages (alternating user/assistant turns) and get back a response:
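```typescript
import Anthropic from "@anthropic-ai/sdk";

// Reads the API key from the ANTHROPIC_API_KEY environment variable by default.
const client = new Anthropic();
const response = await client.messages.create({
  model: "claude-sonnet-4-20250514",
  max_tokens: 1024,
  messages: [{ role: "user", content: "Hello, Claude!" }],
});
```
The response contains an array of content blocks, usually a single text block with Claude's reply.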
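Because the reply arrives as an array of blocks rather than a plain string, it can help to pull the text out in one place. The following is a minimal sketch: `ContentBlock` is a simplified local stand-in for the SDK's response types, and `extractText` is a hypothetical helper name, not part of the SDK.

```typescript
// Simplified local mirror of response content blocks (an assumption for
// this sketch; the SDK ships its own, richer types).
type ContentBlock =
  | { type: "text"; text: string }
  | { type: "tool_use"; id: string; name: string; input: unknown };

type TextBlock = Extract<ContentBlock, { type: "text" }>;

// Concatenate the text of every text block, skipping any other block types.
function extractText(content: ContentBlock[]): string {
  return content
    .filter((block): block is TextBlock => block.type === "text")
    .map((block) => block.text)
    .join("");
}
```

With a real response this would be called as `extractText(response.content)`, assuming the block shapes match the simplified types above.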
Message Roles
| Role | Purpose | Example |
|---|---|---|
| `system` | Sets persistent behavior and context | "You are a helpful personal assistant" |
| `user` | The human's input | "What meetings do I have today?" |
| `assistant` | Claude's responses (or pre-filled) | "You have 3 meetings today..." |
The system prompt is special — it's not part of the messages array but a separate parameter. It shapes every response Claude gives.
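As a rough illustration of that separation, the sketch below builds a request object with the system prompt as a top-level `system` field next to `messages`. The `MessageRequest` interface and the `buildRequest` helper are hypothetical names used only for this example; the object it returns has the shape that would be passed to `client.messages.create`.

```typescript
interface Message {
  role: "user" | "assistant";
  content: string;
}

// Hypothetical, trimmed-down request shape for illustration.
interface MessageRequest {
  model: string;
  max_tokens: number;
  system?: string;
  messages: Message[];
}

// The system prompt travels alongside the messages array, never inside it.
function buildRequest(system: string, messages: Message[]): MessageRequest {
  return {
    model: "claude-sonnet-4-20250514",
    max_tokens: 1024,
    system,
    messages,
  };
}

const request = buildRequest("You are a helpful personal assistant", [
  { role: "user", content: "What meetings do I have today?" },
]);
```

Keeping the prompt in its own field means the conversation history can grow or be trimmed without touching the assistant's persistent instructions.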
Streaming: Why It Matters
Without streaming, you wait for the entire response before showing anything. With streaming, tokens appear as they're generated, so text usually starts appearing within about 200ms of the first token being produced.
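```typescript
const stream = await client.messages.stream({
  model: "claude-sonnet-4-20250514",
  max_tokens: 1024,
  messages: [{ role: "user", content: "Explain streaming." }],
});

// Print each text fragment as soon as it arrives.
for await (const event of stream) {
  // Only text deltas carry a `text` field; other delta types are skipped.
  if (event.type === "content_block_delta" && event.delta.type === "text_delta") {
    process.stdout.write(event.delta.text);
  }
}
```
Streaming is essential for assistants because the user sees text as it is generated instead of waiting for the entire response.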
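An assistant usually needs the full reply as well as the live output, for example to store it in the conversation history. One way to sketch that is a small reducer that folds stream events into a string. The `StreamEvent` type here is a simplified local stand-in for the SDK's event types, and `appendDelta` is a hypothetical helper name.

```typescript
// Simplified mirror of streaming events (an assumption for this sketch;
// real events carry more fields).
type StreamEvent =
  | {
      type: "content_block_delta";
      delta:
        | { type: "text_delta"; text: string }
        | { type: "input_json_delta"; partial_json: string };
    }
  | { type: "message_start" | "content_block_start" | "content_block_stop" | "message_stop" };

// Append the text of a text delta to the buffer; ignore every other event.
function appendDelta(buffer: string, event: StreamEvent): string {
  if (event.type === "content_block_delta" && event.delta.type === "text_delta") {
    return buffer + event.delta.text;
  }
  return buffer;
}

// Hand-written events standing in for a real stream.
const sampleEvents: StreamEvent[] = [
  { type: "message_start" },
  { type: "content_block_delta", delta: { type: "text_delta", text: "Hel" } },
  { type: "content_block_delta", delta: { type: "text_delta", text: "lo" } },
  { type: "message_stop" },
];
const fullText = sampleEvents.reduce(appendDelta, "");
```

Inside a real `for await` loop the same function could be applied per event, writing each fragment to the screen while the buffer accumulates the complete reply.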
Model Selection
| Model | Speed | Intelligence | Best For |
|---|---|---|---|
| Haiku | Fastest | Good | Quick lookups, classification |
| Sonnet | Fast | Very good | Most assistant tasks |
| Opus | Slower | Best | Complex analysis, nuanced reasoning |
For a personal assistant, Sonnet is the sweet spot — fast enough for interactive use, smart enough for complex tasks like email triage and research synthesis.
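The table can be turned into a small routing helper, sketched below. The task names and the `tierFor` function are hypothetical. The Sonnet ID is the one used in this chapter; the Haiku and Opus IDs are assumptions that should be checked against the current model list before use.

```typescript
type Tier = "haiku" | "sonnet" | "opus";
type Task =
  | "lookup"
  | "classification"
  | "email_triage"
  | "research_synthesis"
  | "complex_analysis";

// Sonnet ID taken from this chapter; the other two are assumptions.
const MODEL_IDS: Record<Tier, string> = {
  haiku: "claude-3-5-haiku-20241022",
  sonnet: "claude-sonnet-4-20250514",
  opus: "claude-opus-4-20250514",
};

// Map each task to the tier the table recommends for it.
function tierFor(task: Task): Tier {
  switch (task) {
    case "lookup":
    case "classification":
      return "haiku";
    case "email_triage":
    case "research_synthesis":
      return "sonnet";
    case "complex_analysis":
      return "opus";
  }
}

const triageModel = MODEL_IDS[tierFor("email_triage")];
```

Centralizing the choice in one function makes it easy to move a task to a different tier later without touching the call sites.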
Temperature for Assistants
For your assistant, start with 0.3 for most tasks. You can adjust per-task later.
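Per-task adjustment could look like the sketch below: a default of 0.3 with optional overrides. The override values and task names are purely illustrative assumptions, not recommendations from this chapter. The result is clamped to the 0 to 1 range the API accepts for `temperature`.

```typescript
const DEFAULT_TEMPERATURE = 0.3;

// Hypothetical per-task overrides; tune these for your own assistant.
const TASK_TEMPERATURES: Record<string, number> = {
  classification: 0.0,
  brainstorming: 0.9,
};

// Look up a task's temperature, fall back to the default, clamp to [0, 1].
function temperatureFor(task: string): number {
  const value = TASK_TEMPERATURES[task] ?? DEFAULT_TEMPERATURE;
  return Math.min(1, Math.max(0, value));
}
```

The returned value would be passed as the `temperature` field of the request alongside `model` and `max_tokens`.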
Building a Chat Loop
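A chat loop is the simplest interactive assistant pattern: read the user's input, append it to the conversation history, send the whole history to Claude, append the reply, and repeat.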
The conversation history is just an array that grows with each turn. Claude sees the entire conversation every time, which is how it "remembers" what you said earlier.
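The growing-history pattern can be sketched as follows. To keep the example runnable without network access, the API call is injected as a `send` function; in a real assistant it would wrap `client.messages.create` and return the reply text. `chatTurn`, `runChat`, and `Send` are hypothetical names, and the inputs come from an array instead of the terminal.

```typescript
interface Message {
  role: "user" | "assistant";
  content: string;
}

// Stand-in for the API call: receives the full history, returns the reply text.
type Send = (history: Message[]) => Promise<string>;

// One turn: record the user's input, send everything, record the reply.
async function chatTurn(history: Message[], userInput: string, send: Send): Promise<string> {
  history.push({ role: "user", content: userInput });
  const reply = await send(history);
  history.push({ role: "assistant", content: reply });
  return reply;
}

// The loop: every turn sees all earlier turns because the array only grows.
async function runChat(inputs: string[], send: Send): Promise<Message[]> {
  const history: Message[] = [];
  for (const input of inputs) {
    await chatTurn(history, input, send);
  }
  return history;
}
```

An interactive version would replace the `inputs` array with lines read from standard input and print each reply as it is returned.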
Key Takeaways
This is chapter 1 of Build Your AI Assistant with Claude.