Claude Haiku 4.5 is Anthropic’s fastest and most efficient model, delivering near-frontier intelligence at a fraction of the cost and latency of larger Claude models. Matching Claude Sonnet 4’s perfor...
| Provider | Cache | Uptime | Chat | |||
|---|---|---|---|---|---|---|
| — | $0.09 | $0.45 | Cache read $0.009 |
Capabilities
Get your API key
Create an API key from the Tokens page, then set it as an environment variable:
export ONLIST_API_KEY=sk-...
Make your first request
Endpoints
https://onlist.io/v1/chat/completions (OpenAI Chat Completions format)
https://onlist.io/v1/responses (OpenAI Responses format)
https://onlist.io/v1/messages (Anthropic Messages format)
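The three endpoints expect differently shaped request bodies. The sketch below builds a minimal body for each one, assuming the shapes follow the public OpenAI and Anthropic specifications and that the same model ID is accepted on every endpoint (neither is confirmed on this page). `build_request` and the format labels are hypothetical names used only for illustration.

```python
BASE_URL = "https://onlist.io/v1"
MODEL = "anthropic/claude-haiku-4-5"  # assumption: same ID works on all three endpoints


def build_request(api_format: str, prompt: str, max_tokens: int = 256) -> tuple[str, dict]:
    """Return (url, json_body) for one of the three request formats."""
    user_message = {"role": "user", "content": prompt}
    if api_format == "chat_completions":
        # OpenAI Chat Completions: a list of role/content messages.
        return f"{BASE_URL}/chat/completions", {"model": MODEL, "messages": [user_message]}
    if api_format == "responses":
        # OpenAI Responses: the prompt goes in the "input" field.
        return f"{BASE_URL}/responses", {"model": MODEL, "input": prompt}
    if api_format == "messages":
        # Anthropic Messages: "max_tokens" is a required field.
        return f"{BASE_URL}/messages", {
            "model": MODEL,
            "max_tokens": max_tokens,
            "messages": [user_message],
        }
    raise ValueError(f"unknown format: {api_format!r}")


url, body = build_request("messages", "Explain quantum entanglement in one paragraph.")
print(url)
```

Only the body and path change between formats; the `Authorization` header is the same for all requests.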
Code samples
curl https://onlist.io/v1/chat/completions \
-H "Authorization: Bearer $ONLIST_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "anthropic/claude-haiku-4-5",
"messages": [
{
"role": "user",
"content": "Explain quantum entanglement in one paragraph."
}
]
}'
Replace $ONLIST_API_KEY with the API key from your token settings.
Authentication
All requests must include an Authorization: Bearer <TOKEN> header. Generate tokens from the Tokens page; tokens can be scoped to specific models, groups, IP ranges, and rate limits.
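As a rough Python counterpart to the curl sample, the following sketch reads the key from the `ONLIST_API_KEY` environment variable and attaches it as a Bearer token using only the standard library. The helper names (`api_key_from_env`, `build_chat_request`, `send`) are hypothetical, and the response is assumed, not confirmed here, to follow the OpenAI Chat Completions shape.

```python
import json
import os
import urllib.request

URL = "https://onlist.io/v1/chat/completions"
MODEL = "anthropic/claude-haiku-4-5"


def api_key_from_env(name: str = "ONLIST_API_KEY") -> str:
    """Read the API key from the environment, failing early if it is missing."""
    key = os.environ.get(name)
    if not key:
        raise RuntimeError(f"{name} is not set")
    return key


def build_chat_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) an authenticated POST request."""
    body = {"model": MODEL, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

A call would look like `send(build_chat_request("Explain quantum entanglement in one paragraph.", api_key_from_env()))`. If the response follows the OpenAI format, the generated text would be under `choices[0].message.content`.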
Enable streaming
Add "stream": true to receive partial responses as server-sent events in real time.
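On the client side, a streamed response has to be reassembled from individual events. The sketch below parses server-sent event lines, assuming the stream follows the OpenAI Chat Completions chunk shape (`data: {...}` lines carrying `choices[].delta.content`, ending with `data: [DONE]`). That shape is an assumption based on the endpoint's stated format, and `iter_stream_text` is a hypothetical helper name.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from server-sent event lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text


# Hand-written sample events standing in for a live response body.
sample = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "Hello"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": ", world"}}]}',
    "",
    "data: [DONE]",
]
print("".join(iter_stream_text(sample)))  # Hello, world
```

With a live connection, the same function could be fed the decoded lines of the HTTP response instead of the sample list.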
Streaming example
curl https://onlist.io/v1/chat/completions \
-H "Authorization: Bearer $ONLIST_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "anthropic/claude-haiku-4-5",
"messages": [
{
"role": "user",
"content": "Write a haiku about recursion."
}
],
"stream": true
}'
Supported parameters
| Name | Type | Description |
|---|---|---|
| include_reasoning | | |
| max_tokens | integer | Maximum number of tokens to generate in the completion. |
| reasoning | | |
| response_format | object | Specifies the output format. Use {"type": "json_object"} for JSON mode. |
| stop | string \| array | Up to 4 sequences where the API will stop generating tokens. |
| structured_outputs | | |
| temperature | number | Sampling temperature between 0 and 2. Higher values make output more random. |
| tool_choice | string \| object | Controls which tool is called: "auto", "none", "required", or a specific function. |
| tools | array | A list of tools the model may call. Currently supports functions. |
| top_k | integer | Limits token selection to the k most likely candidates at each step. |
| top_p | number | Nucleus sampling. The model considers only the most likely tokens whose cumulative probability reaches top_p. |
These are the request parameters this model accepts. Parameter semantics follow the OpenAI Chat Completions specification.
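To show how these parameters combine in one request body, here is a minimal sketch that checks the two limits stated in the table (temperature between 0 and 2, at most 4 stop sequences) before building the payload. `build_payload` and the `get_weather` tool are hypothetical, and the tool definition assumes the OpenAI function-tool shape.

```python
from typing import Any

MODEL = "anthropic/claude-haiku-4-5"
SUPPORTED = {
    "include_reasoning", "max_tokens", "reasoning", "response_format", "stop",
    "structured_outputs", "temperature", "tool_choice", "tools", "top_k", "top_p",
}


def build_payload(messages: list[dict[str, Any]], **params: Any) -> dict[str, Any]:
    """Merge optional parameters into a request body after basic checks."""
    unknown = set(params) - SUPPORTED
    if unknown:
        raise ValueError(f"unsupported parameters: {sorted(unknown)}")
    temperature = params.get("temperature")
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    stop = params.get("stop")
    if isinstance(stop, list) and len(stop) > 4:
        raise ValueError("stop accepts at most 4 sequences")
    return {"model": MODEL, "messages": messages, **params}


# Hypothetical tool definition, for illustration only.
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

payload = build_payload(
    [{"role": "user", "content": "What is the weather in Oslo?"}],
    max_tokens=256,
    temperature=0.2,
    stop=["\n\n"],
    tools=[weather_tool],
    tool_choice="auto",
)
```

Validating on the client is optional; it simply surfaces out-of-range values before a request is sent.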