DeepSeek Reasoning Conversation Format (Chat Completions-style)
Official Documentation
Reasoning model (deepseek-reasoner)
📝 Introduction
deepseek-reasoner is a reasoning model launched by DeepSeek. Before producing the final answer, the model first outputs a chain-of-thought section to improve the accuracy of the final response. The API exposes this chain of thought as the reasoning_content field, which users can view, display, and distill.
💡 Request Example
Basic text chat ✅
curl "https://<4All API address>/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $DEEPSEEK_API_KEY" \
  -d '{
    "model": "deepseek-reasoner",
    "messages": [
      {
        "role": "user",
        "content": "9.11 and 9.8, which is greater?"
      }
    ],
    "max_tokens": 4096
  }'
Response example:
{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "deepseek-reasoner",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "reasoning_content": "Let me think step by step:\n1. We need to compare 9.11 and 9.8\n2. Both numbers are decimals, so we can compare them directly\n3. 9.8 = 9.80\n4. 9.11 < 9.80\n5. So 9.8 is greater",
      "content": "9.8 is greater than 9.11."
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 15,
    "total_tokens": 25
  }
}
Streaming response ✅
curl "https://<4All API address>/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $DEEPSEEK_API_KEY" \
  -d '{
    "model": "deepseek-reasoner",
    "messages": [
      {
        "role": "user",
        "content": "9.11 and 9.8, which is greater?"
      }
    ],
    "stream": true
  }'
Streaming response example:
{"id":"chatcmpl-123","object":"chat.completion.chunk","created":1694268190,"model":"deepseek-reasoner","choices":[{"index":0,"delta":{"role":"assistant","reasoning_content":"Let me"},"finish_reason":null}]}
{"id":"chatcmpl-123","object":"chat.completion.chunk","created":1694268190,"model":"deepseek-reasoner","choices":[{"index":0,"delta":{"reasoning_content":" think"},"finish_reason":null}]}
{"id":"chatcmpl-123","object":"chat.completion.chunk","created":1694268190,"model":"deepseek-reasoner","choices":[{"index":0,"delta":{"reasoning_content":" step by step:"},"finish_reason":null}]}
// ... more chain-of-thought content ...
{"id":"chatcmpl-123","object":"chat.completion.chunk","created":1694268190,"model":"deepseek-reasoner","choices":[{"index":0,"delta":{"content":"9.8"},"finish_reason":null}]}
{"id":"chatcmpl-123","object":"chat.completion.chunk","created":1694268190,"model":"deepseek-reasoner","choices":[{"index":0,"delta":{"content":" is greater"},"finish_reason":null}]}
// ... more final answer content ...
{"id":"chatcmpl-123","object":"chat.completion.chunk","created":1694268190,"model":"deepseek-reasoner","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}
📮 Request
Endpoint
POST /v1/chat/completions
Authentication method
Include the following in the request header for API key authentication:
Authorization: Bearer $DEEPSEEK_API_KEY
Where $DEEPSEEK_API_KEY is your API key.
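As a rough illustration of the endpoint and header described above, the sketch below builds (but does not send) a request using only the Python standard library. The base URL `https://api.example.invalid`, the helper name `build_chat_request`, and the fallback key `sk-placeholder` are hypothetical placeholders, not values from this documentation.

```python
import json
import os
import urllib.request

# Hypothetical placeholder; substitute the base URL of your own gateway.
BASE_URL = "https://api.example.invalid"


def build_chat_request(api_key: str, body: dict, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a POST request for /v1/chat/completions."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Bearer-token authentication, as described above.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


req = build_chat_request(
    api_key=os.environ.get("DEEPSEEK_API_KEY", "sk-placeholder"),
    body={
        "model": "deepseek-reasoner",
        "messages": [{"role": "user", "content": "Hello"}],
    },
)
print(req.get_method(), req.full_url)
```

Passing the resulting object to `urllib.request.urlopen` would send it; that step is left out so the sketch runs without network access.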
Request body parameters
messages
- Type: array
- Required: Yes
The list of messages containing the conversation so far. Note that if you pass reasoning_content in the input messages sequence, the API will return a 400 error.
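One way to avoid the 400 error is to drop the reasoning_content key from every message before reusing a history as input. The helper below is a minimal sketch; the name `strip_reasoning` is hypothetical and not part of any SDK.

```python
from typing import Any


def strip_reasoning(messages: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Return copies of the messages without any reasoning_content key."""
    return [
        {key: value for key, value in message.items() if key != "reasoning_content"}
        for message in messages
    ]


history = [
    {"role": "user", "content": "9.11 and 9.8, which is greater?"},
    {
        "role": "assistant",
        "reasoning_content": "Compare 9.11 with 9.80 ...",
        "content": "9.8 is greater than 9.11.",
    },
]

# Safe to send as the messages array of the next request.
clean = strip_reasoning(history)
```

The function returns new dictionaries, so the original history (including the chain of thought) stays available for logging or display.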
model
- Type: string
- Required: Yes
- Value: deepseek-reasoner
The model ID to use. Currently only deepseek-reasoner is supported.
max_tokens
- Type: integer
- Required: No
- Default: 4096
- Maximum: 8192
The maximum length of the final answer, excluding the chain-of-thought output. Note that chain-of-thought output can reach up to 32K tokens.
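The default and maximum above could be enforced on the client before a request is sent. The sketch below assumes you prefer to clamp oversized values rather than let the API reject them; that policy, and the names `resolve_max_tokens`, `DEFAULT_MAX_TOKENS`, and `MAX_MAX_TOKENS`, are illustrative choices rather than documented behavior.

```python
from typing import Optional

DEFAULT_MAX_TOKENS = 4096  # default stated above
MAX_MAX_TOKENS = 8192      # maximum stated above


def resolve_max_tokens(requested: Optional[int] = None) -> int:
    """Pick a max_tokens value for the final answer (chain of thought excluded)."""
    if requested is None:
        return DEFAULT_MAX_TOKENS
    if requested < 1:
        raise ValueError("max_tokens must be a positive integer")
    # Client-side policy (an assumption): clamp instead of failing.
    return min(requested, MAX_MAX_TOKENS)


body = {
    "model": "deepseek-reasoner",
    "messages": [{"role": "user", "content": "9.11 and 9.8, which is greater?"}],
    "max_tokens": resolve_max_tokens(10000),
}
```

Because max_tokens covers only the final answer, total output for a request can still be much larger than this value once the chain of thought is counted.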
stream
- Type: boolean
- Required: No
- Default: false
Whether to use streaming responses.
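When stream is true, the chain of thought and the final answer arrive as separate delta fields, so a client has to accumulate them independently. The sketch below does this for raw chunk lines shaped like the streaming example earlier in this page. Handling of a `data:` prefix and a `[DONE]` sentinel is an assumption about server-sent-event framing that the chunk listing does not show, and `accumulate_stream` is a hypothetical helper name.

```python
import json
from typing import Iterable, Tuple


def accumulate_stream(lines: Iterable[str]) -> Tuple[str, str]:
    """Join streamed deltas into (reasoning_content, content)."""
    reasoning_parts, content_parts = [], []
    for raw in lines:
        line = raw.strip()
        # Assumed SSE framing: tolerate an optional "data:" prefix.
        if line.startswith("data:"):
            line = line[len("data:"):].strip()
        if not line or line == "[DONE]":
            continue
        chunk = json.loads(line)
        for choice in chunk.get("choices", []):
            delta = choice.get("delta", {})
            # Either field may be absent or null in a given chunk.
            reasoning_parts.append(delta.get("reasoning_content") or "")
            content_parts.append(delta.get("content") or "")
    return "".join(reasoning_parts), "".join(content_parts)


sample_lines = [
    '{"choices":[{"index":0,"delta":{"role":"assistant","reasoning_content":"Let me"},"finish_reason":null}]}',
    '{"choices":[{"index":0,"delta":{"reasoning_content":" think"},"finish_reason":null}]}',
    '{"choices":[{"index":0,"delta":{"content":"9.8"},"finish_reason":null}]}',
    '{"choices":[{"index":0,"delta":{"content":" is greater"},"finish_reason":null}]}',
    '{"choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}',
]
reasoning, answer = accumulate_stream(sample_lines)
```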
Unsupported parameters
The following parameters are currently not supported:
- temperature
- top_p
- presence_penalty
- frequency_penalty
- logprobs
- top_logprobs
Note: To remain compatible with existing software, setting temperature, top_p, presence_penalty, or frequency_penalty will not produce an error, but they will also not take effect. Setting logprobs or top_logprobs will produce an error.
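The two behaviors in the note (silently ignored versus rejected) can be mirrored on the client so that mistakes surface before a request is sent. This is a minimal sketch; `prepare_body` and the two constant names are hypothetical, and dropping the ignored parameters is a stylistic choice, since the note says sending them is harmless.

```python
# Accepted by the API but without effect, per the note above.
IGNORED_PARAMS = {"temperature", "top_p", "presence_penalty", "frequency_penalty"}
# Cause an error when set, per the note above.
REJECTED_PARAMS = {"logprobs", "top_logprobs"}


def prepare_body(body: dict) -> dict:
    """Fail early on rejected parameters and drop the ones that have no effect."""
    rejected = REJECTED_PARAMS.intersection(body)
    if rejected:
        raise ValueError(f"unsupported parameters: {sorted(rejected)}")
    return {key: value for key, value in body.items() if key not in IGNORED_PARAMS}


cleaned = prepare_body({
    "model": "deepseek-reasoner",
    "messages": [{"role": "user", "content": "Hello"}],
    "temperature": 0.7,
})
```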
Supported features
- Chat completion
- Prefix continuation in chat (Beta)
Unsupported features
- Function call
- JSON output
- FIM completion (Beta)
📥 Response
Successful response
Returns a chat completion object, or a streamed sequence of chat completion chunk objects if the request is streamed.
id
- Type: string
- Description: Unique identifier for the response
object
- Type: string
- Description: Object type, value is “chat.completion”
created
- Type: integer
- Description: Response creation timestamp
model
- Type: string
- Description: The model name used, value is “deepseek-reasoner”
choices
- Type: array
- Description: Contains the generated response options
- Properties:
- index : Option index
- message : Message object containing the role, chain-of-thought content, and final answer
  - role : Role, value is “assistant”
  - reasoning_content : Chain-of-thought content
  - content : Final answer content
- finish_reason : Completion reason
usage
- Type: object
- Description: Token usage statistics
- Properties:
- prompt_tokens : Number of tokens used by the prompt
- completion_tokens : Number of tokens used by the completion
- total_tokens : Total number of tokens
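To make the response layout above concrete, the sketch below pulls the chain of thought, the final answer, the finish reason, and the token total out of a decoded, non-streaming response. The `ReasonerReply` dataclass and `parse_completion` function are hypothetical names, and the payload is a shortened version of the response example earlier in this page.

```python
from dataclasses import dataclass


@dataclass
class ReasonerReply:
    reasoning_content: str
    content: str
    finish_reason: str
    total_tokens: int


def parse_completion(payload: dict) -> ReasonerReply:
    """Extract the fields described above from a chat.completion object."""
    choice = payload["choices"][0]
    message = choice["message"]
    return ReasonerReply(
        reasoning_content=message.get("reasoning_content") or "",
        content=message.get("content") or "",
        finish_reason=choice["finish_reason"],
        total_tokens=payload["usage"]["total_tokens"],
    )


payload = {
    "id": "chatcmpl-123",
    "object": "chat.completion",
    "created": 1677652288,
    "model": "deepseek-reasoner",
    "choices": [{
        "index": 0,
        "message": {
            "role": "assistant",
            "reasoning_content": "9.8 = 9.80, and 9.11 < 9.80",
            "content": "9.8 is greater than 9.11.",
        },
        "finish_reason": "stop",
    }],
    "usage": {"prompt_tokens": 10, "completion_tokens": 15, "total_tokens": 25},
}
reply = parse_completion(payload)
```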
📝 Context concatenation notes
During each round of conversation, the model outputs chain-of-thought content (reasoning_content) and the final answer (content). In the next round, the chain-of-thought content from the previous round is not concatenated into the context; only the final answer (content) is carried forward.
Note
If you pass reasoning_content in the input messages sequence, the API will return a 400 error. Therefore, please remove the reasoning_content field from the API response before making the next API request, as shown in the example below.
Example:
from openai import OpenAI

client = OpenAI(api_key="<DeepSeek API Key>", base_url="https://<4All API address>")

# Round 1
messages = [{"role": "user", "content": "9.11 and 9.8, which is greater?"}]
response = client.chat.completions.create(
    model="deepseek-reasoner",
    messages=messages
)

reasoning_content = response.choices[0].message.reasoning_content
content = response.choices[0].message.content

# Round 2 - only concatenate the final answer content
messages.append({"role": "assistant", "content": content})
messages.append({"role": "user", "content": "How many Rs are there in the word 'strawberry'?"})
response = client.chat.completions.create(
    model="deepseek-reasoner",
    messages=messages
)
Streaming response example:
# Round 1
messages = [{"role": "user", "content": "9.11 and 9.8, which is greater?"}]
response = client.chat.completions.create(
    model="deepseek-reasoner",
    messages=messages,
    stream=True
)

reasoning_content = ""
content = ""

for chunk in response:
    delta = chunk.choices[0].delta
    # reasoning_content may be missing from a chunk, so read it defensively
    reasoning_piece = getattr(delta, "reasoning_content", None)
    if reasoning_piece:
        reasoning_content += reasoning_piece
    elif delta.content:
        # Skip chunks with no content (e.g. the final chunk with an empty delta)
        content += delta.content

# Round 2 - only concatenate the final answer content
messages.append({"role": "assistant", "content": content})
messages.append({"role": "user", "content": "How many Rs are there in the word 'strawberry'?"})
response = client.chat.completions.create(
    model="deepseek-reasoner",
    messages=messages,
    stream=True
)