Optional abort
An AbortSignal for cancelling the request.

Optional max
If set, returns a failure when the number of messages exceeds this limit.
Optional max
Maximum number of tokens the model can generate in its response.
The conversation messages to send to the model.
Optional num
Number of alternative completions to generate. Not currently used by any provider.
Optional parallel
Whether the model can call multiple tools in a single turn. (OpenAI Responses API only)
Optional raw
Arbitrary provider-specific attributes passed directly to the underlying API call.
Optional reasoning
Provider-agnostic reasoning effort level. If thinking is also set, it takes precedence for Anthropic/Google.
Optional response
A Zod schema to constrain the model's output to structured JSON matching the schema.

Optional stream
If true, returns an AsyncGenerator of StreamChunks instead of a single result.
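Because stream changes the return type, a consumer iterates the generator rather than awaiting a single value. A minimal sketch, using a stand-in generator in place of a real client call; the { text } chunk shape is an assumption, since this reference does not define StreamChunk's fields:

```typescript
type StreamChunk = { text: string };

// Stand-in for what a call with { stream: true } would return.
async function* fakeStream(): AsyncGenerator<StreamChunk> {
  for (const text of ["Hel", "lo"]) yield { text };
}

// Accumulate streamed chunks into the full response text.
async function collect(chunks: AsyncGenerator<StreamChunk>): Promise<string> {
  let out = "";
  for await (const chunk of chunks) out += chunk.text;
  return out;
}

collect(fakeStream()).then((s) => console.log(s)); // logs "Hello"
```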
Optional temperature
Sampling temperature (0-2). Higher values make output more random, lower values more deterministic. (OpenAI only)
Optional thinking
Enable extended thinking / thought signatures. When enabled, the model returns its reasoning process alongside the response. (Anthropic and Google only — OpenAI reasoning tokens are not exposed)
Optional budgetTokens?: number
Token budget for the thinking process. Defaults to 5000. (Anthropic only)
Whether to enable extended thinking.
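A sketch of a thinking configuration built from the fields above. The sub-field name `enabled` is an assumption (this reference gives only its description), as is passing the object under a `thinking` key in the request options:

```typescript
// Hypothetical thinking configuration; field names other than
// budgetTokens are assumptions, not confirmed by this reference.
const thinking = {
  enabled: true,      // whether to enable extended thinking
  budgetTokens: 8000, // Anthropic only; defaults to 5000 when omitted
};
```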
Optional tool
Define behavior if too many repeated tool calls are detected (loop prevention).
Optional tools
Tools (functions) the model can call. Each tool has a name, optional description, and a Zod schema defining its parameters.