Anthropic: Claude Haiku 4 5

anthropic/claude-haiku-4-5

Claude Haiku 4.5 is Anthropic’s fastest and most efficient model, delivering near-frontier intelligence at a fraction of the cost and latency of larger Claude models. Matching Claude Sonnet 4’s perfor...

visiontools

MODALITIES

→

INPUT PRICE

$0.09per 1M

OUTPUT PRICE

$0.45per 1M

CONTEXT

200K

RELEASED

Oct 16, 2025

Provider				Cache	Uptime	Chat
ccapi	—	$0.09	$0.45	Cache read$0.009

Capabilities

Input modalities

fileimagetext

Output modalities

text

Features

include_reasoningmax_tokensreasoningresponse_formatstopstructured_outputstemperaturetool_choicetoolstop_ktop_p

Get your API key

Create an API key from the Tokens page, then set it as an environment variable:

export ONLIST_API_KEY=sk-...

Make your first request

Endpoints

POSThttps://onlist.io/v1/chat/completions

OpenAI Chat Completions format

Request Headers

Authorization:Bearer $ONLIST_API_KEY

Content-Type:application/json

Model:anthropic/claude-haiku-4-5

POSThttps://onlist.io/v1/responses

OpenAI Responses format

Request Headers

Authorization:Bearer $ONLIST_API_KEY

Content-Type:application/json

Model:anthropic/claude-haiku-4-5

POSThttps://onlist.io/v1/messages

Anthropic Messages format

Request Headers

Authorization:Bearer $ONLIST_API_KEY

Content-Type:application/json

Model:anthropic/claude-haiku-4-5

Code samples

curl https://onlist.io/v1/chat/completions \
  -H "Authorization: Bearer $ONLIST_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
       "model": "anthropic/claude-haiku-4-5",
       "messages": [
         {
           "role": "user",
           "content": "Explain quantum entanglement in one paragraph."
         }
       ]
     }'

Replace $ONLIST_API_KEY with the API key from your token settings.

Authentication

All requests must include an Authorization: Bearer <TOKEN> header. Generate tokens from the Tokens page; tokens can be scoped to specific models, groups, IP ranges, and rate limits.

Enable streaming

Add "stream": true to receive partial responses as server-sent events in real time.

Streaming example

curl https://onlist.io/v1/chat/completions \
  -H "Authorization: Bearer $ONLIST_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
       "model": "anthropic/claude-haiku-4-5",
       "messages": [
         {
           "role": "user",
           "content": "Write a haiku about recursion."
         }
       ],
       "stream": true
     }'

Supported parameters

Name	Type	Description
`include_reasoning`
`max_tokens`	integer	Maximum number of tokens to generate in the completion.
`reasoning`
`response_format`	object	Specifies the output format. Use {"type": "json_object"} for JSON mode.
`stop`	string \| array	Up to 4 sequences where the API will stop generating tokens.
`structured_outputs`
`temperature`	number	Sampling temperature between 0 and 2. Higher values make output more random.
`tool_choice`	string \| object	Controls which tool is called. "auto", "none", "required", or a specific function.
`tools`	array	A list of tools the model may call. Currently supports functions.
`top_k`	integer	Limits token selection to the k most likely candidates at each step.
`top_p`	number	Nucleus sampling. The model considers tokens with top_p probability mass.

These are the request parameters this model accepts. Parameter semantics follow the OpenAI Chat Completions specification.