# Chat Completions

**Endpoint**

`POST https://llm.onerouter.pro/v1/chat/completions`

### Basic chat completion

Create a non-streaming chat completion.

**Example request**

{% tabs %}
{% tab title="Python" %}

```python
import os
from openai import OpenAI
 
client = OpenAI(
    api_key='<API_KEY>',
    base_url='https://llm.onerouter.pro/v1'
)
 
completion = client.chat.completions.create(
    model='claude-3-5-sonnet@20240620',
    messages=[
        {
            'role': 'user',
            'content': 'What is the meaning of life?'
        }
    ],
    stream=False,
)
 
print('Assistant:', completion.choices[0].message.content)
print('Tokens used:', completion.usage)
```

{% endtab %}

{% tab title="TypeScript" %}

```typescript
import OpenAI from 'openai';
 
const openai = new OpenAI({
  apiKey: '<API_KEY>',
  baseURL: 'https://llm.onerouter.pro/v1',
});
 
const completion = await openai.chat.completions.create({
  model: 'claude-3-5-sonnet@20240620',
  messages: [
    {
      role: 'user',
      content: 'What is the meaning of life?',
    },
  ],
  stream: false,
});
 
console.log('Assistant:', completion.choices[0].message.content);
console.log('Tokens used:', completion.usage);
```

{% endtab %}
{% endtabs %}

### Streaming chat completion

Create a chat completion that streams tokens back as they are generated.

**Example request**

{% tabs %}
{% tab title="Python" %}

```python
import os
from openai import OpenAI
 
client = OpenAI(
    api_key='<API_KEY>',
    base_url='https://llm.onerouter.pro/v1'
)
 
stream = client.chat.completions.create(
    model='claude-3-5-sonnet@20240620',
    messages=[
        {
            'role': 'user',
            'content': 'What is the meaning of life?'
        }
    ],
    stream=True,
)
 
for chunk in stream:
    content = chunk.choices[0].delta.content
    if content:
        print(content, end='', flush=True)
```

{% endtab %}

{% tab title="TypeScript" %}

```typescript
import OpenAI from 'openai';
 
const openai = new OpenAI({
  apiKey: '<API_KEY>',
  baseURL: 'https://llm.onerouter.pro/v1',
});
 
const stream = await openai.chat.completions.create({
  model: 'claude-3-5-sonnet@20240620',
  messages: [
    {
      role: 'user',
      content: 'What is the meaning of life?',
    },
  ],
  stream: true,
});
 
for await (const chunk of stream) {
  const content = chunk.choices[0]?.delta?.content;
  if (content) {
    process.stdout.write(content);
  }
}
```

{% endtab %}
{% endtabs %}

**Streaming response format**

Streaming responses are sent as [Server-Sent Events (SSE)](https://developer.mozilla.org/en-US/docs/Web/API/Server-sent_events), a web standard for real-time data streaming over HTTP. Each event contains a JSON object with the partial response data.

The response format follows the OpenAI streaming specification:

```
data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1677652288,"model":"anthropic/claude-sonnet-4.5","choices":[{"index":0,"delta":{"content":"Once"},"finish_reason":null}]} 

data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1677652288,"model":"anthropic/claude-sonnet-4.5","choices":[{"index":0,"delta":{"content":" upon"},"finish_reason":null}]} 

data: [DONE]
```

Key characteristics:

* Each line starts with `data:` followed by JSON
* Content is delivered incrementally in the `delta.content` field
* The stream ends with `data: [DONE]`
* Empty lines separate events

SSE parsing libraries:

If you're building custom SSE parsing (instead of using the OpenAI SDK), these libraries can help:

* JavaScript/TypeScript: [`eventsource-parser`](https://www.npmjs.com/package/eventsource-parser) - Robust SSE parsing with support for partial events
* Python: [`httpx-sse`](https://pypi.org/project/httpx-sse/) - SSE support for HTTPX, or [`sseclient-py`](https://pypi.org/project/sseclient-py/) for requests

For more details about the SSE specification, see the [WHATWG HTML specification](https://html.spec.whatwg.org/multipage/server-sent-events.html).
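
To make the format concrete, here is a minimal sketch of parsing the `data:` lines shown above by hand. It assumes each event arrives as a complete line; real network streams can split events across reads, which the libraries above handle for you.

```python
import json

def parse_sse_chunks(raw: str):
    """Yield content deltas from SSE text in the format shown above."""
    for line in raw.splitlines():
        line = line.strip()
        if not line.startswith('data:'):
            continue  # skip blank separator lines and SSE comments
        payload = line[len('data:'):].strip()
        if not payload:
            continue
        if payload == '[DONE]':
            return  # end-of-stream sentinel
        event = json.loads(payload)
        # Finish chunks may carry an empty delta, so content can be absent.
        content = event['choices'][0]['delta'].get('content')
        if content is not None:
            yield content
```

In practice you would feed this decoded chunks from the HTTP response body rather than a single string.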

### Image attachments

Send images as part of your chat completion request.

{% content-ref url="https://app.gitbook.com/s/Z9C9AjT7j46HAcQrOVWw/features/multimodal-input/images-inputs" %}
[Images Inputs](https://app.gitbook.com/s/Z9C9AjT7j46HAcQrOVWw/features/multimodal-input/images-inputs)
{% endcontent-ref %}

### PDF attachments

Send PDF documents as part of your chat completion request.

{% content-ref url="https://app.gitbook.com/s/Z9C9AjT7j46HAcQrOVWw/features/multimodal-input/pdf-inputs" %}
[PDF Inputs](https://app.gitbook.com/s/Z9C9AjT7j46HAcQrOVWw/features/multimodal-input/pdf-inputs)
{% endcontent-ref %}

### Audio attachments

Send audio files as part of your chat completion request.

{% content-ref url="https://app.gitbook.com/s/Z9C9AjT7j46HAcQrOVWw/features/multimodal-input/audio-inputs" %}
[Audio Inputs](https://app.gitbook.com/s/Z9C9AjT7j46HAcQrOVWw/features/multimodal-input/audio-inputs)
{% endcontent-ref %}

### Video attachments

Send video files as part of your chat completion request.

{% content-ref url="https://app.gitbook.com/s/Z9C9AjT7j46HAcQrOVWw/features/multimodal-input/video-inputs" %}
[Video Inputs](https://app.gitbook.com/s/Z9C9AjT7j46HAcQrOVWw/features/multimodal-input/video-inputs)
{% endcontent-ref %}

### Parameters

The chat completions endpoint supports the following parameters:

**Required parameters**

* `model` (string): The model to use for the completion (e.g., `anthropic/claude-sonnet-4`)
* `messages` (array): Array of message objects with `role` and `content` fields

**Optional parameters**

* `stream` (boolean): Whether to stream the response. Defaults to `false`
* `temperature` (number): Controls randomness in the output. Range: 0-2
* `max_tokens` (integer): Maximum number of tokens to generate
* `top_p` (number): Nucleus sampling parameter. Range: 0-1
* `frequency_penalty` (number): Penalty for frequent tokens. Range: -2 to 2
* `presence_penalty` (number): Penalty for present tokens. Range: -2 to 2
* `stop` (string or array): Stop sequences for the generation
* `tools` (array): Array of tool definitions for function calling
* `tool_choice` (string or object): Controls which tools are called (`auto`, `none`, or specific function)
* `provider` (object): [Provider routing and configuration options](https://app.gitbook.com/s/Z9C9AjT7j46HAcQrOVWw/routing-and-gateway/inference-provider-routing)
* `response_format` (object): Controls the format of the model's response
  * For OpenAI standard format: `{ type: "json_schema", json_schema: { name, schema, strict?, description? } }`
  * For legacy format: `{ type: "json", schema?, name?, description? }`
  * For plain text: `{ type: "text" }`
  * See [Structured outputs](https://vercel.com/docs/ai-gateway/openai-compat/structured-outputs) for detailed examples
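
As a sketch, the optional sampling and length parameters above can be combined in a single request. The values below are illustrative defaults, not recommendations; pass the dictionary to `client.chat.completions.create()` using the same client setup as the earlier examples.

```python
# Keyword arguments for client.chat.completions.create().
params = {
    'model': 'claude-3-5-sonnet@20240620',
    'messages': [{'role': 'user', 'content': 'Write a haiku about the sea.'}],
    'temperature': 0.7,   # 0-2: higher values produce more random output
    'max_tokens': 100,    # hard cap on generated tokens
    'top_p': 0.9,         # nucleus sampling, 0-1
    'stop': ['\n\n'],     # stop generation at the first blank line
}

# completion = client.chat.completions.create(**params)
```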

### Message format

Messages support different content types:

**Text messages**

```json
{  
    "role": "user",  
    "content": "Hello, how are you?"
}
```

**Multimodal messages**

```json
{  
    "role": "user",  
    "content": [    
        { 
            "type": "text", 
            "text": "What's in this image?" 
        },    
        {      
            "type": "image_url",      
            "image_url": {        
                "url": "data:image/jpeg;base64,/9j/4AAQSkZJRgABAQAAAQABAAD..."      
            }    
        }  
    ]
}
```
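
A small helper like the following can assemble the multimodal message above from raw image bytes; the function name and default MIME type are illustrative, not part of the API.

```python
import base64

def image_message(text: str, image_bytes: bytes, mime: str = 'image/jpeg') -> dict:
    """Build a multimodal user message with an inline base64 data URL."""
    b64 = base64.b64encode(image_bytes).decode('ascii')
    return {
        'role': 'user',
        'content': [
            {'type': 'text', 'text': text},
            {'type': 'image_url',
             'image_url': {'url': f'data:{mime};base64,{b64}'}},
        ],
    }
```

The returned dictionary can be placed directly in the `messages` array of a request.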

**File messages**

```json
{  
    "role": "user",  
    "content": [    
        { 
            "type": "text", 
            "text": "Summarize this document" 
        },    
        {      
            "type": "file",      
            "file": {        
                "data": "JVBERi0xLjQKJcfsj6IKNSAwIG9iago8PAovVHlwZSAvUGFnZQo...",        
                "media_type": "application/pdf",        
                "filename": "document.pdf"      
            }    
        }  
    ]
}
```
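
The file message above can likewise be built from raw document bytes; this helper is a sketch (the function name is illustrative), hard-coded to PDFs for brevity.

```python
import base64

def pdf_message(text: str, pdf_bytes: bytes, filename: str) -> dict:
    """Build a user message carrying a base64-encoded PDF in a file part."""
    return {
        'role': 'user',
        'content': [
            {'type': 'text', 'text': text},
            {'type': 'file',
             'file': {
                 'data': base64.b64encode(pdf_bytes).decode('ascii'),
                 'media_type': 'application/pdf',
                 'filename': filename,
             }},
        ],
    }
```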
