Configuring LLM Models
- How to select a platform and model for your LLM agent
- What each setting on the General tab does
- How to fine-tune temperature, token limits, and advanced options
When you set an agent's Tipo de assistente (Assistant Type) to llm, AutoTalk reveals a rich set of configuration options. First you choose the AI platform; then you configure the model and its behavior through the Geral (General) tab.
Selecting a platform
The Tipo de plataforma (Platform Type) dropdown determines which AI provider powers your agent. AutoTalk supports five platforms:
| Platform | Description |
|---|---|
| openai (default) | OpenAI models including GPT-4.1 and others. The most popular choice with a wide range of models at different price and quality levels. |
| deepseek | DeepSeek models, offering competitive pricing for capable AI performance. |
| gemini | Google Gemini models, part of Google's AI ecosystem. |
| custom | Connect to a custom or third-party API endpoint. Use this for providers not listed above, or for your own fine-tuned models served through a compatible API. |
| node-llama-cpp | Run a large language model locally on your own hardware using node-llama-cpp. No external API calls are made, which keeps data fully on-premises. Best for organizations with strict data residency requirements. |
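For the node-llama-cpp platform, AutoTalk manages the local runtime for you; local inference with that library looks roughly like the sketch below (v3-style API; the model path is a placeholder, and you never write this code yourself when using the platform dropdown):

```ts
// Minimal sketch of local inference with node-llama-cpp (v3-style API).
// The model path is a placeholder; AutoTalk performs the equivalent wiring internally.
import { getLlama, LlamaChatSession } from "node-llama-cpp";

const llama = await getLlama();
const model = await llama.loadModel({
  modelPath: "./models/my-local-model.gguf", // placeholder path
});
const context = await model.createContext();
const session = new LlamaChatSession({
  contextSequence: context.getSequence(),
});

// All inference happens on local hardware; no data leaves the machine.
const answer = await session.prompt("Hello! How can I track my order?");
console.log(answer);
```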
After selecting a platform, four configuration tabs appear: Geral (General), Mensagens (Messages), Ações (Actions), and Ferramentas (Tools). This page covers the General tab; see System Messages and Tools for the other tabs.
General tab settings
The Geral (General) tab contains the core model configuration:
Modelo (Model)
Select the specific model to use from a dropdown list. Each model displays helpful metadata including:
- A quality tier label (e.g., "Médio" for medium quality)
- Pricing per 1K tokens for both input and output, so you can estimate costs
For example, selecting gpt-4.1 shows its quality rating and per-token pricing. Lighter models cost less but may produce lower-quality responses; heavier models cost more but handle complex conversations better.
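Because pricing is listed per 1K tokens for input and output separately, estimating the cost of a conversation turn is simple arithmetic. A small sketch (the prices are made-up placeholders, not real rates):

```ts
// Rough cost estimate from per-1K-token prices.
// Prices here are illustrative placeholders, not real rates.
const inputPricePer1K = 0.002;  // USD per 1K input tokens (placeholder)
const outputPricePer1K = 0.008; // USD per 1K output tokens (placeholder)

function estimateCost(inputTokens: number, outputTokens: number): number {
  return (inputTokens / 1000) * inputPricePer1K +
         (outputTokens / 1000) * outputPricePer1K;
}

// e.g. a conversation turn with 800 input tokens and 300 output tokens:
console.log(estimateCost(800, 300).toFixed(4)); // "0.0040"
```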
Temperatura (Temperature)
Controls how creative or deterministic the model's responses are. The default value is 1. Like Máximo de tokens, this maps to a standard parameter on the model request; see the sketch after Máximo de tokens below.
- Lower values (0.0 to 0.5): The agent gives more focused, predictable, and consistent answers. Best for factual customer support.
- Higher values (0.8 to 1.5): The agent produces more varied and creative responses. Useful for brainstorming or casual conversation, but may reduce accuracy.
Token
A searchable field where you can select an existing API token or create a new one. This token authenticates your agent with the chosen AI platform. If you have not yet added an API key for the platform, you can create one directly from this field.
Máximo de tokens (Max Tokens)
Sets the upper limit on how many tokens the model can generate in a single response. Use this to control response length and cost. If left empty, the model uses its default maximum.
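Both Temperatura and Máximo de tokens correspond to standard parameters on the underlying chat-completion request. A minimal sketch using the official openai Node SDK (the model name and values are examples; AutoTalk sends the equivalent request for you based on the General tab):

```ts
// Sketch only: AutoTalk builds the equivalent request from the General tab.
import OpenAI from "openai";

const client = new OpenAI(); // reads OPENAI_API_KEY from the environment

const response = await client.chat.completions.create({
  model: "gpt-4.1",
  temperature: 0.3, // Temperatura: low value for focused, consistent answers
  max_tokens: 512,  // Máximo de tokens: caps response length (and cost)
  messages: [{ role: "user", content: "Where can I see my order status?" }],
});

console.log(response.choices[0].message.content);
```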
Máximo de caracteres de entrada (Max Input Characters)
Limits how many characters from the user's message are sent to the model. The default is 1024 characters. Increase this if your customers tend to send longer messages and you want the agent to consider the full text; decrease it to reduce costs on verbose inputs.
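In effect, overly long messages are cut off at the limit before being sent; a sketch of that behavior (how AutoTalk applies the limit internally is an assumption here, shown for illustration):

```ts
// Sketch of the Máximo de caracteres de entrada limit (default 1024).
// The exact internal mechanics are an assumption for illustration.
const MAX_INPUT_CHARS = 1024;

function truncateInput(userMessage: string): string {
  return userMessage.slice(0, MAX_INPUT_CHARS); // no-op if already shorter
}
```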
Advanced options
The General tab also includes three optional checkboxes for advanced behavior:
Ativar saídas estruturadas (Enable Structured Outputs)
When checked, the model is instructed to return responses in a structured format (such as JSON). This is useful when the agent's output is consumed by another system rather than displayed directly to a customer.
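On OpenAI-compatible platforms, structured output is typically requested through a response-format parameter. A hedged sketch using OpenAI's JSON mode (AutoTalk's exact request shape may differ):

```ts
// Sketch: requesting JSON output via OpenAI's "JSON mode".
// AutoTalk's actual request shape may differ; this shows the general idea.
import OpenAI from "openai";

const client = new OpenAI();

const response = await client.chat.completions.create({
  model: "gpt-4.1",
  response_format: { type: "json_object" },
  messages: [
    { role: "system", content: "Reply with a JSON object with keys `intent` and `reply`." },
    { role: "user", content: "I want to cancel my subscription." },
  ],
});

// Safe to parse because the model was constrained to JSON.
const structured = JSON.parse(response.choices[0].message.content ?? "{}");
console.log(structured);
```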
Ativar orçamento de tokens do contexto (Enable Context Token Budget)
When checked, AutoTalk manages how much conversation history is sent to the model by enforcing a token budget for context. This prevents the context window from exceeding the model's limit on long conversations and helps control costs.
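Conceptually, a context budget drops the oldest turns until the remaining history fits. A sketch with a crude 4-characters-per-token estimate (real accounting uses the model's tokenizer and will differ):

```ts
// Sketch of a context token budget: drop oldest turns until history fits.
// Uses a crude ~4 chars/token estimate; real tokenizers are more precise.
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

function enforceBudget(history: ChatMessage[], budget: number): ChatMessage[] {
  const trimmed = [...history];
  // Keep removing the oldest non-system message while over budget.
  while (
    trimmed.reduce((sum, m) => sum + estimateTokens(m.content), 0) > budget &&
    trimmed.length > 1
  ) {
    const oldest = trimmed.findIndex((m) => m.role !== "system");
    if (oldest === -1) break;
    trimmed.splice(oldest, 1);
  }
  return trimmed;
}
```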
Ativar processamento de imagens (Enable Image Processing)
When checked, the agent can receive and process images sent by customers (on models that support vision capabilities). This allows the agent to describe, analyze, or respond to photos and screenshots.
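With a vision-capable model, an incoming image travels as an image part of the message content. A sketch in the OpenAI chat format (the image URL is a placeholder):

```ts
// Sketch: sending a customer's screenshot to a vision-capable model.
// The image URL is a placeholder.
import OpenAI from "openai";

const client = new OpenAI();

const response = await client.chat.completions.create({
  model: "gpt-4.1", // must be a vision-capable model
  messages: [
    {
      role: "user",
      content: [
        { type: "text", text: "What error is shown in this screenshot?" },
        { type: "image_url", image_url: { url: "https://example.com/screenshot.png" } },
      ],
    },
  ],
});

console.log(response.choices[0].message.content);
```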
For most customer-facing agents, start with the openai platform, choose a capable model such as gpt-4.1, set Temperatura between 0.3 and 0.5 for reliable answers, and leave Máximo de caracteres de entrada at its default. Enable image processing only if your use case requires it, since it increases token usage.