DeepSeek is a platform that provides API endpoints for large language models.
See DeepSeek's model library for the full list of available models.
We recommend experimenting to find the model best suited to your use case. Here are some general recommendations:
- deepseek-v4-flash (default): fast V4 model, 1M context. Hybrid thinking + non-thinking.
- deepseek-v4-pro: flagship V4 model, 1M context. Hybrid thinking + non-thinking.
DeepSeek does not enforce rate limits, but responses can be slower during periods of high traffic. See the DeepSeek docs for how to handle slower responses.
Authentication
Set your DEEPSEEK_API_KEY environment variable. You can get a key from the DeepSeek platform.
export DEEPSEEK_API_KEY=***
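If you would rather fail early with a clear message than get an authentication error on the first request, you can check the variable yourself before building the model. The following is a minimal sketch using only the standard library; the helper name get_deepseek_api_key is hypothetical and not part of Agno.

```python
import os


def get_deepseek_api_key(env_var: str = "DEEPSEEK_API_KEY") -> str:
    """Return the API key from the environment, or raise a clear error.

    Hypothetical helper for illustration; not part of Agno or DeepSeek.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"{env_var} is not set; export it before creating the model."
        )
    return key
```

The returned value can then be passed explicitly through the api_key parameter listed in the Parameters table, for example DeepSeek(api_key=get_deepseek_api_key()).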
Example
Use DeepSeek with your Agent:
from agno.agent import Agent
from agno.models.deepseek import DeepSeek
agent = Agent(model=DeepSeek(id="deepseek-v4-flash"), markdown=True)
# Print the response in the terminal
agent.print_response("Share a 2 sentence horror story.")
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| id | str | "deepseek-v4-flash" | The id of the DeepSeek model to use |
| name | str | "DeepSeek" | The name of the model |
| provider | str | "DeepSeek" | The provider of the model |
| api_key | Optional[str] | None | The API key for DeepSeek (defaults to DEEPSEEK_API_KEY env var) |
| base_url | str | "https://api.deepseek.com" | The base URL for the DeepSeek API |
| use_thinking | Optional[bool] | None | Control thinking mode. See Thinking mode. |
The DeepSeek class extends the OpenAI-compatible interface, so it accepts most of the parameters supported by the OpenAI model.
Note: DeepSeek supports JSON mode but not native json_schema structured outputs, so supports_native_structured_outputs is set to False. Use use_json_mode=True for structured output.
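Because there is no native json_schema enforcement, it can be worth checking the shape of a JSON-mode reply on the client side. The sketch below assumes you have the raw reply as a string; the HorrorStory shape and the parse_story helper are hypothetical names chosen for illustration, and only the standard library is used.

```python
import json
from dataclasses import dataclass


@dataclass
class HorrorStory:
    """Hypothetical target shape for the model's JSON reply."""

    title: str
    story: str


def parse_story(raw: str) -> HorrorStory:
    """Parse a JSON-mode reply and check the fields we rely on."""
    # json.loads raises json.JSONDecodeError (a ValueError) on invalid JSON.
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    # Collect every required field that is absent or not a string.
    missing = [k for k in ("title", "story") if not isinstance(data.get(k), str)]
    if missing:
        raise ValueError(f"missing or non-string fields: {missing}")
    return HorrorStory(title=data["title"], story=data["story"])
```

Both failure paths raise ValueError, so a caller can catch a single exception type and retry or fall back as appropriate.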
Available Models
| Model id | Notes |
|---|---|
| deepseek-v4-flash | Fast V4 model (default), 1M context. Hybrid thinking + non-thinking. |
| deepseek-v4-pro | Flagship V4 model, 1M context. Hybrid thinking + non-thinking. |
The legacy ids deepseek-chat and deepseek-reasoner still work and route server-side to the V4 models (deepseek-chat to non-thinking, deepseek-reasoner to thinking), but you should migrate to deepseek-v4-flash / deepseek-v4-pro.
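One way to migrate is to translate each legacy id into an explicit V4 id plus a use_thinking value that preserves the old behavior. This is a sketch only: migrate_legacy_id is a hypothetical helper, and the choice of deepseek-v4-flash as the default target is an assumption, since the routing described above does not say which V4 model each legacy id maps to.

```python
# Thinking behavior of the legacy ids, as described in the text above:
# deepseek-chat is non-thinking, deepseek-reasoner is thinking.
LEGACY_THINKING = {"deepseek-chat": False, "deepseek-reasoner": True}


def migrate_legacy_id(model_id: str, target: str = "deepseek-v4-flash") -> dict:
    """Return DeepSeek(...) keyword arguments equivalent to a legacy id.

    `target` is an assumption; pick the V4 model that suits your workload.
    """
    if model_id not in LEGACY_THINKING:
        # Not a legacy id: keep it and leave thinking at its default.
        return {"id": model_id, "use_thinking": None}
    return {"id": target, "use_thinking": LEGACY_THINKING[model_id]}
```

The result can be unpacked into the model constructor, for example DeepSeek(**migrate_legacy_id("deepseek-chat")).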
Thinking mode
V4 models run with thinking enabled by default, so you get reasoning_content out of the box. Control it with the use_thinking flag:
| Value | Behavior |
|---|---|
| None (default) | V4 models think by default; legacy deepseek-chat does not. |
| True | Force thinking on. The model returns reasoning_content. |
| False | Force thinking off for a faster, cheaper response. |
For complex tasks, set reasoning_effort="high" or reasoning_effort="max". While thinking is active, temperature, top_p, presence_penalty and frequency_penalty are ignored by the API.
from agno.agent import Agent
from agno.models.deepseek import DeepSeek
agent = Agent(
    model=DeepSeek(id="deepseek-v4-pro", reasoning_effort="max"),
    markdown=True,
)
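Since sampling parameters are silently ignored while thinking is active, it can help to make that visible in your own configuration code. The sketch below restates the use_thinking table as a small function and drops the ignored parameters. The helper names are hypothetical, and treating every id other than deepseek-chat as thinking by default is an assumption based on the table above.

```python
from typing import Optional

# Parameters the text above says the API ignores while thinking is active.
IGNORED_WHEN_THINKING = {
    "temperature",
    "top_p",
    "presence_penalty",
    "frequency_penalty",
}


def thinking_active(model_id: str, use_thinking: Optional[bool] = None) -> bool:
    """Mirror the use_thinking table: an explicit value wins, else the default."""
    if use_thinking is not None:
        return use_thinking
    # Assumption: only legacy deepseek-chat defaults to non-thinking.
    return model_id != "deepseek-chat"


def effective_params(
    model_id: str, use_thinking: Optional[bool] = None, **params
) -> dict:
    """Return only the parameters that would take effect for this setup."""
    if thinking_active(model_id, use_thinking):
        return {k: v for k, v in params.items() if k not in IGNORED_WHEN_THINKING}
    return dict(params)
```

If you need a setting such as temperature to take effect, the table above suggests passing use_thinking=False so that thinking is switched off for that model.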