DeepSeek provides API endpoints for its large language models. See their library of models here. We recommend experimenting to find the model best suited to your use case. Here are some general recommendations:
  • deepseek-v4-flash (default): fast V4 model, 1M context. Hybrid thinking + non-thinking.
  • deepseek-v4-pro: flagship V4 model, 1M context. Hybrid thinking + non-thinking.
DeepSeek does not have rate limits. See their docs for information about how to deal with slower responses during high traffic.
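One way to cope with slow responses is to retry a call that timed out, waiting longer between attempts. The sketch below is a generic helper, not part of Agno or the DeepSeek API: call_with_backoff, its parameters, and the choice of TimeoutError as the retried exception are all assumptions, so substitute whichever exception your client actually raises.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_backoff(
    fn: Callable[[], T],
    retries: int = 3,
    base_delay: float = 1.0,
    retry_on: Tuple[Type[BaseException], ...] = (TimeoutError,),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn(), retrying with exponential backoff (hypothetical helper).

    The delay doubles after each failed attempt: base_delay, 2x, 4x, ...
    """
    if retries < 0:
        raise ValueError("retries must be >= 0")
    for attempt in range(retries + 1):
        try:
            return fn()
        except retry_on:
            # Out of attempts: surface the original error to the caller.
            if attempt == retries:
                raise
            sleep(base_delay * 2 ** attempt)
    raise RuntimeError("unreachable")
```

You could wrap an agent call in a zero-argument function (for example a lambda) and pass it as fn. The sleep parameter exists only so the wait can be replaced in tests.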

Authentication

Set your DEEPSEEK_API_KEY environment variable. Get your key from here.
export DEEPSEEK_API_KEY=***
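If you prefer to fail early with a clear message when the variable is missing, a small check like the following can run before the model is created. This is a minimal sketch: get_deepseek_key is a hypothetical helper, not an Agno function.

```python
import os
from typing import Mapping, Optional


def get_deepseek_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the DeepSeek API key from the environment (hypothetical helper).

    Raises RuntimeError if DEEPSEEK_API_KEY is unset or blank.
    """
    env = os.environ if env is None else env
    key = env.get("DEEPSEEK_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "DEEPSEEK_API_KEY is not set; export it before creating the model."
        )
    return key
```

The returned value can also be passed explicitly through the api_key parameter instead of relying on the environment lookup.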

Example

Use DeepSeek with your Agent:
from agno.agent import Agent
from agno.models.deepseek import DeepSeek

agent = Agent(model=DeepSeek(id="deepseek-v4-flash"), markdown=True)

# Print the response in the terminal
agent.print_response("Share a 2 sentence horror story.")

View more examples here.

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| id | str | "deepseek-v4-flash" | The id of the DeepSeek model to use |
| name | str | "DeepSeek" | The name of the model |
| provider | str | "DeepSeek" | The provider of the model |
| api_key | Optional[str] | None | The API key for DeepSeek (defaults to DEEPSEEK_API_KEY env var) |
| base_url | str | "https://api.deepseek.com" | The base URL for the DeepSeek API |
| use_thinking | Optional[bool] | None | Control thinking mode. See Thinking mode. |

DeepSeek extends the OpenAI-compatible interface and supports most parameters from the OpenAI model. Note: DeepSeek supports JSON mode but not native json_schema structured outputs, so supports_native_structured_outputs is set to False. Use use_json_mode=True for structured output.
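Because JSON mode is not the same as schema-enforced output, it can be worth checking the shape of the result yourself if you ever handle the raw response text. The sketch below uses only the standard library. HorrorStory, its fields, and parse_story are hypothetical names chosen for illustration.

```python
import json
from dataclasses import dataclass


@dataclass
class HorrorStory:
    """Hypothetical target shape for a JSON-mode response."""
    title: str
    story: str


def parse_story(raw: str) -> HorrorStory:
    """Parse a JSON string and check that the expected keys are present."""
    data = json.loads(raw)  # raises json.JSONDecodeError on invalid JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = [key for key in ("title", "story") if key not in data]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    return HorrorStory(title=str(data["title"]), story=str(data["story"]))
```

A check like this is only a fallback for raw text. It does not replace whatever parsing your framework already performs when structured output is configured.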

Available Models

| Model id | Notes |
| --- | --- |
| deepseek-v4-flash | Fast V4 model (default), 1M context. Hybrid thinking + non-thinking. |
| deepseek-v4-pro | Flagship V4 model, 1M context. Hybrid thinking + non-thinking. |

The legacy ids deepseek-chat and deepseek-reasoner still work and route server-side to the V4 models (deepseek-chat to non-thinking, deepseek-reasoner to thinking), but you should migrate to deepseek-v4-flash / deepseek-v4-pro.
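When migrating, the thinking behavior of each legacy id can be kept by setting use_thinking explicitly. The following is a minimal sketch: migrate_model_kwargs is a hypothetical helper, and because the text does not say which V4 model each legacy id routes to, the target id is left to the caller (defaulting to the documented default, deepseek-v4-flash).

```python
from typing import Any, Dict

# Thinking behavior of the legacy ids, as described above.
LEGACY_THINKING = {"deepseek-chat": False, "deepseek-reasoner": True}


def migrate_model_kwargs(
    model_id: str, target_id: str = "deepseek-v4-flash"
) -> Dict[str, Any]:
    """Map a legacy id to keyword arguments for a V4 model (hypothetical helper).

    Non-legacy ids pass through unchanged.
    """
    if model_id in LEGACY_THINKING:
        return {"id": target_id, "use_thinking": LEGACY_THINKING[model_id]}
    return {"id": model_id}
```

The returned dictionary uses only the id and use_thinking parameters from the table above, so it can be unpacked into the model constructor, for example DeepSeek(**migrate_model_kwargs("deepseek-chat")).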

Thinking mode

V4 models run with thinking enabled by default, so you get reasoning_content out of the box. Control it with the use_thinking flag:
| Value | Behavior |
| --- | --- |
| None (default) | V4 models think by default; legacy deepseek-chat does not. |
| True | Force thinking on. The model returns reasoning_content. |
| False | Force thinking off for a faster, cheaper response. |

For complex tasks, set reasoning_effort="high" or reasoning_effort="max". While thinking is active, temperature, top_p, presence_penalty and frequency_penalty are ignored by the API.
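from agno.agent import Agent
from agno.models.deepseek import DeepSeek

# Use the flagship model with maximum reasoning effort for a complex task
agent = Agent(
    model=DeepSeek(id="deepseek-v4-pro", reasoning_effort="max"),
    markdown=True,
)

# Print the response in the terminal
agent.print_response("Explain why the square root of 2 is irrational.")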
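The rules in this section can be restated as two small functions: one that resolves whether thinking is active for a given id and use_thinking value, and one that separates out the sampling parameters that would be ignored in that case. This is a sketch of the behavior described above, not Agno's implementation. thinking_enabled and split_sampling_params are hypothetical helpers, and the default rule only covers the model ids named on this page.

```python
from typing import Any, Dict, Mapping, Optional, Tuple

# Sampling parameters the text says are ignored while thinking is active.
IGNORED_WHEN_THINKING = frozenset(
    {"temperature", "top_p", "presence_penalty", "frequency_penalty"}
)


def thinking_enabled(model_id: str, use_thinking: Optional[bool] = None) -> bool:
    """Resolve the thinking mode following the table above."""
    if use_thinking is not None:
        return use_thinking  # True or False forces the mode
    # Default: thinking is on, except for the legacy deepseek-chat id.
    return model_id != "deepseek-chat"


def split_sampling_params(
    params: Mapping[str, Any], thinking: bool
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Return (effective, ignored) parameters for the given thinking mode."""
    if not thinking:
        return dict(params), {}
    effective = {k: v for k, v in params.items() if k not in IGNORED_WHEN_THINKING}
    ignored = {k: v for k, v in params.items() if k in IGNORED_WHEN_THINKING}
    return effective, ignored


if __name__ == "__main__":
    params = {"temperature": 0.2, "max_tokens": 512}
    thinking = thinking_enabled("deepseek-v4-pro")
    effective, ignored = split_sampling_params(params, thinking)
    print("effective:", effective)
    print("ignored:", ignored)
```

Logging the ignored dictionary is a cheap way to notice that a temperature setting has no effect. If you depend on those sampling parameters, use_thinking=False is the documented way to turn thinking off.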