    Smoltalk

    Smoltalk exposes a common API to different LLM providers. Other packages do this too, but Smoltalk also lets you build model-selection strategies, such as fallbacks and races, on top of that API. Here is a simple example.

    pnpm install smoltalk

    import { text, userMessage } from "smoltalk";

    async function main() {
      const messages = [userMessage("Write me a 10 word story.")];
      const response = await text({
        messages,
        model: "gpt-5.4",
      });
      console.log(response);
    }

    main();

    So far, this is functionality that other packages also provide.

    Response
    {
      success: true,
      value: {
        output: 'Clock stopped; everyone smiled as tomorrow finally arrived before yesterday.',
        toolCalls: [],
        usage: {
          inputTokens: 14,
          outputTokens: 15,
          cachedInputTokens: 0,
          totalTokens: 29
        },
        cost: {
          inputCost: 0.000035,
          outputCost: 0.000225,
          cachedInputCost: undefined,
          totalCost: 0.00026,
          currency: 'USD'
        },
        model: 'gpt-5.4'
      }
    }

    What if you wanted to have fallbacks in case the OpenAI API was down? Just change the model field:

    const response = await text({
      messages,
      model: fallback("gpt-5.4", "gemini-2.5-flash-lite"),
      // or multiple fallbacks:
      // model: fallback("gpt-5.4", ["gemini-2.5-flash-lite", "gemini-3-flash-preview"]),
    });

    Or what if you wanted to try a couple of models and take the first response?

    const response = await text({
      messages,
      model: race("gpt-5.4", "gemini-2.5-flash-lite", "o4-mini"),
    });

    Or combine them:

    const response = await text({
      messages,
      model: race(fallback("gpt-5.4", "gemini-2.5-flash-lite"), "o4-mini"),
    });

    You get the idea.
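    Conceptually, a fallback strategy tries candidates in order and returns the first success, while a race runs them concurrently and takes the first response (much like Promise.any). A rough sketch of the fallback idea, independent of smoltalk's actual implementation:

```typescript
// Conceptual sketch only: try each attempt in order, return the first success.
// This is not smoltalk's code, just the shape of the strategy.
async function tryInOrder<T>(attempts: Array<() => Promise<T>>): Promise<T> {
  let lastError: unknown = new Error("no attempts provided");
  for (const attempt of attempts) {
    try {
      return await attempt(); // first success wins
    } catch (err) {
      lastError = err; // remember the failure and fall through to the next
    }
  }
  throw lastError; // every candidate failed
}
```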

    To use Smoltalk, you first create a client:

    import { getClient } from "smoltalk";

    const client = getClient({
      openAiApiKey: process.env.OPENAI_API_KEY || "",
      googleApiKey: process.env.GEMINI_API_KEY || "",
      logLevel: "debug",
      model: "gemini-2.0-flash-lite",
    });

    Then you can call different methods on the client. The simplest is prompt:

    const resp = await client.prompt("Hello, how are you?");
    

    If you want tool calling, structured output, etc., text may be a cleaner option:

    let messages: Message[] = [];
    messages.push(
      userMessage(
        "Please use the add function to add the following numbers: 3 and 5"
      )
    );
    const resp = await client.text({
      messages,
    });

    Here is an example with tool calling:

    function add({ a, b }: { a: number; b: number }): number {
      return a + b;
    }

    const addTool = {
      name: "add",
      description: "Adds two numbers together and returns the result.",
      schema: z.object({
        a: z.number().describe("The first number to add"),
        b: z.number().describe("The second number to add"),
      }),
    };

    const resp = await client.text({
      messages,
      tools: [addTool],
    });
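    When the model decides to call a tool, the call comes back in the response's toolCalls and your code executes it. A minimal dispatcher might look like the sketch below; the { name, arguments } shape of each entry is an assumption for illustration, so check the actual toolCalls entries in your responses.

```typescript
// Hypothetical dispatcher for returned tool calls. The ToolCall shape
// ({ name, arguments }) is assumed here, not taken from smoltalk's types.
type ToolCall = { name: string; arguments: Record<string, unknown> };

const toolImpls: Record<string, (args: any) => unknown> = {
  add: ({ a, b }: { a: number; b: number }) => a + b,
};

function runToolCalls(calls: ToolCall[]): unknown[] {
  // Look up each call by name and invoke it with the provided arguments.
  return calls.map((call) => toolImpls[call.name](call.arguments));
}
```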

    Here is an example with structured output:

    const resp = await client.text({
      messages,
      responseFormat: z.object({
        result: z.number(),
      }),
    });

    A couple of design decisions to note:

    • API keys for different providers use different parameter names. You can set several keys up front and then switch models freely without touching key configuration, which also makes things easier for code generation.
    • The schema for tools and structured outputs is defined using Zod.
    • Parameter names are camelCase, as that is the naming convention in TypeScript. They are converted to snake_case for you if required by the APIs.
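    The case conversion mentioned in the last point is the usual camelCase-to-snake_case mapping. Smoltalk's internal implementation may differ, but conceptually it is just:

```typescript
// Illustrative camelCase -> snake_case conversion; smoltalk's own code may differ.
function camelToSnake(name: string): string {
  // Insert an underscore before each uppercase letter, then lowercase everything.
  return name.replace(/([a-z0-9])([A-Z])/g, "$1_$2").toLowerCase();
}
```

    So maxTokens becomes max_tokens, parallelToolCalls becomes parallel_tool_calls, and so on.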

    SmolPromptConfig is the union of client config (SmolConfig) and per-request config (PromptConfig). You can pass all options together to text(), or split them between getClient() and individual calls.
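    As a type-level illustration of that split (field lists heavily abbreviated; the real interfaces carry all the options listed below):

```typescript
// Abbreviated sketch of the config split; the real types have more fields.
type SmolConfig = { model: string; openAiApiKey?: string; logLevel?: string };
type PromptConfig = { messages: unknown[]; temperature?: number };
// SmolPromptConfig combines both halves, so text() can accept everything at once.
type SmolPromptConfig = SmolConfig & PromptConfig;

const clientOptions: SmolConfig = { model: "gemini-2.0-flash-lite" };
const requestOptions: PromptConfig = { messages: [], temperature: 0.2 };
// Passing everything to text() amounts to merging the two halves:
const allAtOnce: SmolPromptConfig = { ...clientOptions, ...requestOptions };
```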

    SmolConfig (client) options:

    • model (ModelName | ModelConfig): Required. The model to use (e.g. "gpt-4o", "gemini-2.0-flash-lite").
    • openAiApiKey (string): OpenAI API key.
    • googleApiKey (string): Google Gemini API key.
    • ollamaApiKey (string): Ollama API key (only needed for cloud Ollama).
    • ollamaHost (string): Ollama host URL (for self-hosted or cloud Ollama).
    • provider (Provider): Override provider detection. One of "openai", "openai-responses", "google", "ollama", "anthropic", "replicate", "modal", "local".
    • logLevel (LogLevel): Logging verbosity: "debug", "info", "warn", "error", etc.
    • toolLoopDetection (ToolLoopDetection): Config to detect and break tool call loops. See below.
    PromptConfig (per-request) options:

    • messages (Message[]): Required. The conversation messages to send.
    • instructions (string): System-level instructions (system prompt).
    • tools ({ name, description?, schema }[]): Tool definitions. schema is a Zod object schema.
    • responseFormat (ZodType): Zod schema for structured output. The response will be parsed and validated against this schema.
    • responseFormatOptions (object): Fine-grained control over structured output (see below).
    • maxTokens (number): Maximum number of output tokens to generate.
    • temperature (number): Sampling temperature (0–2 for most providers).
    • numSuggestions (number): Number of completions to generate.
    • parallelToolCalls (boolean): Whether to allow the model to call multiple tools in parallel.
    • stream (boolean): If true, returns an AsyncGenerator<StreamChunk> instead of a Promise.
    • maxMessages (number): If the message list exceeds this count, returns a failure instead of calling the API.
    • rawAttributes (Record<string, any>): Pass provider-specific attributes directly to the API request.
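    With stream: true, you consume the result with for await instead of awaiting a single value. The { delta } chunk shape below is a stand-in for illustration, not necessarily smoltalk's actual StreamChunk:

```typescript
// Stand-in async generator; smoltalk's real StreamChunk shape may differ.
async function* fakeStream(): AsyncGenerator<{ delta: string }> {
  for (const delta of ["Clock ", "stopped; ", "everyone ", "smiled."]) {
    yield { delta };
  }
}

// Consuming a stream looks the same regardless of what produces it.
async function collect(stream: AsyncGenerator<{ delta: string }>): Promise<string> {
  let out = "";
  for await (const chunk of stream) out += chunk.delta;
  return out;
}
```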

    responseFormatOptions

    Used with responseFormat to control validation behavior (currently OpenAI only).

    • name (string): Name for the response format schema.
    • strict (boolean): Whether to use strict schema validation.
    • numRetries (number, default 2): How many times to retry if the response fails schema validation.
    • allowExtraKeys (boolean): If true, strips unexpected keys instead of failing validation.
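    For example, a responseFormatOptions value using the fields above might look like this (the schema name is made up for illustration):

```typescript
// Hypothetical responseFormatOptions value; pass it to client.text() alongside
// responseFormat. The "math_result" name is made up for this example.
const responseFormatOptions = {
  name: "math_result",
  strict: true,
  numRetries: 2, // matches the documented default
  allowExtraKeys: false, // fail validation on unexpected keys
};
```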

    toolLoopDetection

    Detects when the model is stuck in a repetitive tool-call loop.

    • enabled (boolean): Whether loop detection is active.
    • maxConsecutive (number): Number of consecutive identical tool calls before triggering intervention.
    • intervention (string): Action to take: "remove-tool", "remove-all-tools", "throw-error", or "halt-execution".
    • excludeTools (string[]): Tool names to ignore when counting consecutive calls.
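    A loop-detection config using those options might look like this (the excluded tool name is made up for illustration):

```typescript
// Hypothetical toolLoopDetection value; pass it to getClient().
const toolLoopDetection = {
  enabled: true,
  maxConsecutive: 3, // three identical calls in a row triggers the intervention
  intervention: "remove-tool", // drop the offending tool and keep going
  excludeTools: ["log"], // never count this (made-up) tool toward the limit
};
```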

    Smoltalk currently supports a limited number of providers and focuses mostly on stateless text-completion APIs. I plan to add support for more providers, as well as image and speech models, later. Smoltalk is also a personal project; there are alternatives backed by companies:

    • LangChain
    • OpenRouter
    • Vercel AI

    Contributions are welcome. Any of the following contributions would be helpful:

    • Adding support for API parameters or endpoints
    • Adding support for different providers
    • Updating the list of models