Jun 15, 2025
5 min read
Rust, LLM, OpenAI

Function Calling with Large Language Models in Rust

Fundamentally, large language models (LLMs) are text-to-text models and do not inherently possess the ability to call functions. When we talk about an LLM’s function-calling capability, what we actually mean is its ability to decide, based on the user’s question and the list of available functions, whether a function should be invoked, and then to return the corresponding function name and parameters.

This seemingly simple ability has enabled AI agents to flourish. With this feature, many previously difficult tasks have become feasible — for example, generating fully functional websites or games from a single sentence, or converting natural language instructions into smart home commands to control hardware devices.

In terms of instruction following, OpenAI and Claude stand out among competitors. Before the release of DeepSeek v3/r1, most Chinese LLMs had very poor instruction-following capabilities, essentially unusable for practical applications. Fortunately, after DeepSeek v3/r1, many Chinese LLMs significantly improved their instruction-following abilities. In this regard, DeepSeek deserves credit.

To demonstrate the function-calling ability of LLMs, I’ll use Qwen as an example (mainly because it’s affordable and fast). If you’re building production-grade applications, I recommend using DeepSeek as your primary choice; however, considering DeepSeek often runs under heavy load, having a backup option is advisable.

Why Do We Need Function Calls?

A realistic limitation is that LLMs cannot know real-time information such as the current time, weather, or breaking news. This is because the training data used for LLMs is time-bound — even though it’s now 2025, many LLMs might only include data up to 2023.

Therefore, we need LLMs to “intelligently” perceive real-world data through function calls.

Function Calling Workflow

The workflow for function calling looks like this:

graph TD
    A[User Question] --> B[Call LLM Based on User Input]
    C[Tool Information: Weather Query / Current Time Function] --> B
    B --> D{Determine Whether to Call Tool}
    D -- Yes --> E[Output Tool Name & Parse Parameters]
    D -- No --> F[Direct Response]
    E --> G[Invoke Tool Using Name & Parameters]
    G --> H[Output Tool Result]
    H --> I[Call LLM Again Using Tool Result + User Question]
    I --> J[Return Final LLM Response]
    J --> K[Final Answer]
    F --> K

Our goal is to implement this exact flow.

Implementation in Rust

Tongyi Qianwen (Qwen) provides two API styles: one similar to OpenAI, and another specific to DashScope.

If you prefer the OpenAI style, I recommend using this library:

cargo add async-openai
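
If you go this route, the same model can be reached through DashScope’s OpenAI-compatible endpoint. Here is a minimal, hedged sketch using async-openai’s builder API; the base URL, the DASHSCOPE_API_KEY environment variable, and the exact builder names reflect the crate version I had in mind and should be checked against the current documentation:

use async_openai::{
    config::OpenAIConfig,
    types::{
        ChatCompletionRequestUserMessageArgs, ChatCompletionToolArgs, ChatCompletionToolType,
        CreateChatCompletionRequestArgs, FunctionObjectArgs,
    },
    Client,
};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Point the OpenAI-style client at DashScope's OpenAI-compatible endpoint
    // (URL and env var name are assumptions; verify them against the Alibaba Cloud docs).
    let config = OpenAIConfig::new()
        .with_api_base("https://dashscope.aliyuncs.com/compatible-mode/v1")
        .with_api_key(std::env::var("DASHSCOPE_API_KEY")?);
    let client = Client::with_config(config);

    // Declare the tool and ask the question, OpenAI-style.
    let request = CreateChatCompletionRequestArgs::default()
        .model("qwen-turbo")
        .messages([ChatCompletionRequestUserMessageArgs::default()
            .content("What time is it now?")
            .build()?
            .into()])
        .tools(vec![ChatCompletionToolArgs::default()
            .r#type(ChatCompletionToolType::Function)
            .function(
                FunctionObjectArgs::default()
                    .name("get_current_time")
                    .description("return the current time")
                    .build()?,
            )
            .build()?])
        .build()?;

    let response = client.chat().create(request).await?;
    // If the model decided to call the tool, its name and arguments show up here.
    dbg!(&response.choices[0].message.tool_calls);
    Ok(())
}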

For the DashScope style, use this library instead:

cargo add async-dashscope

Here, I’ll demonstrate using DashScope-style APIs, taking the query “What time is it now?” as an example. First, we need to define a function to retrieve the current time:

use chrono::Utc;

/// Return the current UTC time as a string.
fn get_current_time() -> String {
    Utc::now().to_string()
}

Next, construct the message payload:

let mut messages = vec![MessageBuilder::default()
    .role("user")
    .content("What time is it now?")
    .build()
    .unwrap()];

Then, attach function definitions to the request parameters:

// Attach the function definition to the request parameters
let request = GenerationParamBuilder::default()
    .model("qwen-turbo".to_string())
    .input(InputBuilder::default().messages(messages.clone()).build()?)
    .parameters(
        ParametersBuilder::default()
            .functions([FunctionCallBuilder::default()
                .typ("function")
                .function(
                    FunctionBuilder::default()
                        .name("get_current_time")
                        .description("return the current time")
                        .build()?,
                )
                .build()?])
            // or call .tools(value)
            .result_format("message")
            .parallel_tool_calls(true)
            .build()?,
    )
    .build()?;

With both user input and function definitions included, we make the first LLM call:

let client = Client::default();
let response = client.generation().call(request).await?;
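
Both calls are async, so they need to run inside an async runtime. The snippets in this post omit the surrounding function; a minimal skeleton, assuming tokio and anyhow (neither is shown in the original code), would look like this:

// Skeleton only; tokio as the runtime and anyhow for `?` error handling are assumptions.
#[tokio::main]
async fn main() -> anyhow::Result<()> {
    // 1. Build `messages` and the tool-carrying `request` as shown above.
    // 2. Create the client and make the first call:
    //    let client = Client::default();
    //    let response = client.generation().call(request).await?;
    // 3. Handle any tool call and make the second call (next sections).
    Ok(())
}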

At this point, the LLM will automatically decide whether a tool needs to be called. If so, it returns the function name and arguments. Based on that, we handle the function invocation accordingly — here’s a simplified version:

let response_message = response.output.choices.unwrap().first().unwrap().message.clone();
// Check whether the model decided to call a tool
if let Some(func_calls) = response_message.tool_calls {
    for call in &func_calls {
        if call.function.name == "get_current_time" {
            let func_response = get_current_time();
            // Feed the tool result back into the conversation
            // (kept simple here by appending it as a user message)
            messages.push(
                MessageBuilder::default()
                    .role("user")
                    .content(func_response)
                    .build()?,
            );
            break;
        }
    }
}
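
The example tool takes no arguments, but most real tools do. In that case the model returns the arguments as a JSON string, which can be deserialized with serde_json. A small sketch, assuming serde and serde_json as extra dependencies (the struct and field names are purely illustrative):

use serde::Deserialize;

// Hypothetical argument struct for a weather tool; the field names are illustrative.
#[derive(Deserialize)]
struct WeatherArgs {
    city: String,
}

// The model supplies arguments as a JSON string, e.g. r#"{"city": "Beijing"}"#.
fn parse_weather_args(raw: &str) -> Result<WeatherArgs, serde_json::Error> {
    serde_json::from_str(raw)
}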

Once we obtain the function result, we append it back to the message history and make a second call to the LLM — this time without including any function definitions:

let request = GenerationParamBuilder::default()
    .model("qwen-turbo".to_string())
    .input(InputBuilder::default().messages(messages.clone()).build()?)
    .build()?;

let response = client.generation().call(request).await?;

// Print the final summarized response
dbg!(&response.output.text);

And finally, we receive a summarized output:

The date and time you provided is: **June 5, 2025 at 16:00:00**.

In reality, things aren’t always this straightforward. However, this example serves as a solid foundation for developing practical AI agent applications.
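
In particular, real agents rarely stop after a single tool call: the two-step flow above becomes a loop that keeps calling the model and executing tools until it answers directly. A library-agnostic sketch of that shape (every type and function here is a placeholder, not part of async-dashscope):

// Library-agnostic sketch of an agent loop; every type here is a placeholder.
enum ModelReply {
    ToolCall { name: String, arguments: String },
    Answer(String),
}

fn call_model(_history: &[String]) -> ModelReply {
    // Placeholder: a real implementation would call the LLM API here.
    ModelReply::Answer("done".to_string())
}

fn run_tool(name: &str, _arguments: &str) -> String {
    // Placeholder: dispatch to real tool implementations by name.
    format!("result of {name}")
}

fn agent_loop(question: &str) -> String {
    let mut history = vec![question.to_string()];
    loop {
        match call_model(&history) {
            ModelReply::ToolCall { name, arguments } => {
                // Append the tool result so the next model call can use it.
                history.push(run_tool(&name, &arguments));
            }
            ModelReply::Answer(text) => return text,
        }
    }
}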