Function calling fails with the quantized glm-4-9b-chat bin model (q8_0 precision)

```python
>First I quantized the glm3 and glm4 models. I downloaded the full glm-3-6b-chat and glm-4-9b-chat models and quantized both to q8_0:
chatglm.cpp# python3 chatglm_cpp/convert.py -i /glm-3-6b-chat/ -t q8_0 -o models/chatglm3-q8_0-ggml.bin
chatglm.cpp# python3 chatglm_cpp/convert.py -i /glm-4-9b-chat/ -t q8_0 -o models/chatglm4-q8_0-ggml.bin

>Then I tested function calling with each quantized model, following the author's official cli_demo.py.
First, the quantized glm3 model:
/chatglm.cpp/examples# python3 cli_demo.py -m /chatglm.cpp/models/chatglm3-q8_0-ggml.bin --temp 0.1 --top_p 0.8 --sp system/function_call.txt -i

>Entering "请帮我查询一下今天苏州的天气" ("Please check today's weather in Suzhou for me") at the Prompt prints tool_call, which shows that function calling was triggered:

Prompt  > 请帮我查询一下苏州今天的天气
ChatGLM3 > get_weather
tool_call(city_name='苏州')

>I added code just above line 120 of cli_demo.py to print the message list (the two print calls below):
prompt_image = image
    while True:
        print("--------------------------------------------------------------")
        print(messages)
        if messages and messages[-1].tool_calls:
            (tool_call,) = messages[-1].tool_calls
            if tool_call.type == "function":
                print(
                    f"Function Call > Please manually call function `{tool_call.function.name}` and provide the results below."
                )
                input_prompt = "Observation   > "
            elif tool_call.type == "code":
                print(f"Code Interpreter > Please manually run the code and provide the results below.")
                input_prompt = "Observation      > "
            else:
                raise ValueError(f"unexpected tool call type {tool_call.type}")
            role = "observation"
        else:
            input_prompt = f"{'Prompt':{prompt_width}} > "
            role = "user"

>At runtime, the printed messages show that tool_calls in the role="assistant" message does contain a call to get_weather:
>[ChatMessage(role="system", content="Answer the following questions as best as you can. You have access to the following tools:
{
    \"random_number_generator\": {
        \"name\": \"random_number_generator\",
        \"description\": \"Generates a random number x, s.t. range[0] <= x < range[1]\",
        \"params\": [
            {
                \"name\": \"seed\",
                \"description\": \"The random seed used by the generator\",
                \"type\": \"int\",
                \"required\": true
            },
            {
                \"name\": \"range\",
                \"description\": \"The range of the generated numbers\",
                \"type\": \"tuple[int, int]\",
                \"required\": true
            }
        ]
    },
    \"get_weather\": {
        \"name\": \"get_weather\",
        \"description\": \"Get the current weather for `city_name`\",
        \"params\": [
            {
                \"name\": \"city_name\",
                \"description\": \"The name of the city to be queried\",
                \"type\": \"str\",
                \"required\": true
            }
        ]
    }
}", tool_calls=[]), ChatMessage(role="user", content="请帮我查询苏州今天的天气", tool_calls=[]), ChatMessage(role="assistant", content="```python
tool_call(city_name='苏州')
```", tool_calls=[ToolCallMessage(type="function", function=FunctionMessage(name="get_weather", arguments="tool_call(city_name='苏州')"), code=CodeMessage(input=""))])]
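For reference, the `arguments` string in the ChatGLM3 dump above has the form `tool_call(city_name='苏州')`. The following is a minimal sketch of turning such a string into a keyword dictionary without executing it. The helper name `parse_tool_call` is hypothetical and is not part of chatglm.cpp; it assumes the arguments are always literal keyword arguments.

```python
import ast


def parse_tool_call(arguments: str) -> dict:
    """Extract keyword arguments from a string such as "tool_call(city_name='苏州')".

    Hypothetical helper: assumes the model emits a single call to `tool_call`
    whose arguments are all literal keyword arguments.
    """
    tree = ast.parse(arguments.strip(), mode="eval")
    call = tree.body
    if (
        not isinstance(call, ast.Call)
        or not isinstance(call.func, ast.Name)
        or call.func.id != "tool_call"
    ):
        raise ValueError(f"not a tool_call expression: {arguments!r}")
    if call.args:
        raise ValueError("positional arguments are not supported")

    kwargs = {}
    for kw in call.keywords:
        if kw.arg is None:  # rejects `**mapping` style arguments
            raise ValueError("**kwargs unpacking is not supported")
        # literal_eval accepts only literals, so no code is executed
        kwargs[kw.arg] = ast.literal_eval(kw.value)
    return kwargs


if __name__ == "__main__":
    print(parse_tool_call("tool_call(city_name='苏州')"))
```

Parsing with `ast` instead of `eval` keeps model output from running arbitrary code, at the cost of rejecting anything that is not a plain literal.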

>Next, I used the same command to test function calling with glm4:
chatglm.cpp/examples# python3 cli_demo.py -m /chatglm.cpp/models/chatglm4-q8_0-ggml.bin --temp 0.1 --top_p 0.8 --sp system/function_call.txt

>Entering the same Suzhou weather query at the Prompt prints:

Prompt   > 请帮我查询一下今天苏州的天气
ChatGLM4 > get_weather
{"city_name": "苏州"}

As you can see, there is no tool_call in the output, which suggests that function calling was not triggered.

>Printing the message list shows the following:
[ChatMessage(role="system", content="Answer the following questions as best as you can. You have access to the following tools:
{
    \"random_number_generator\": {
        \"name\": \"random_number_generator\",
        \"description\": \"Generates a random number x, s.t. range[0] <= x < range[1]\",
        \"params\": [
            {
                \"name\": \"seed\",
                \"description\": \"The random seed used by the generator\",
                \"type\": \"int\",
                \"required\": true
            },
            {
                \"name\": \"range\",
                \"description\": \"The range of the generated numbers\",
                \"type\": \"tuple[int, int]\",
                \"required\": true
            }
        ]
    },
    \"get_weather\": {
        \"name\": \"get_weather\",
        \"description\": \"Get the current weather for `city_name`\",
        \"params\": [
            {
                \"name\": \"city_name\",
                \"description\": \"The name of the city to be queried\",
                \"type\": \"str\",
                \"required\": true
            }
        ]
    }
}", tool_calls=[]), ChatMessage(role="user", content="请帮我查询一下今天苏州的天气", tool_calls=[]), ChatMessage(role="assistant", content="get_weather
{\"city_name\": \"苏州\"}", tool_calls=[])]

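>Here tool_calls in the role="assistant" message is empty, so the function call was not recognized, even though the content itself looks correct.
>I don't know whether this is an oversight by the chatglm.cpp author or something else. I also quantized glm-4-9b-chat-1m and the glm-4-9b base model, and function calling gives the same result.
>Is there any way to fix this?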
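Until the parsing is handled inside chatglm.cpp, one possible client-side workaround is to inspect the assistant `content` directly. The sketch below assumes GLM-4 always emits the function name on the first line followed by a JSON object of arguments, as in the output above. `ParsedCall` and `parse_glm4_content` are hypothetical names, not part of chatglm.cpp, and this does not change how the library fills `tool_calls`.

```python
import json
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class ParsedCall:
    name: str
    arguments: dict


def parse_glm4_content(content: str, known_tools: Set[str]) -> Optional[ParsedCall]:
    """Return a ParsedCall if `content` looks like "<tool name>\\n<JSON object>".

    Hypothetical fallback: returns None for ordinary chat replies so the
    caller can treat them as plain text.
    """
    name, sep, rest = content.strip().partition("\n")
    name = name.strip()
    if not sep or name not in known_tools:
        return None  # no second line, or first line is not a declared tool
    try:
        arguments = json.loads(rest)
    except json.JSONDecodeError:
        return None  # second part is not valid JSON
    if not isinstance(arguments, dict):
        return None  # arguments must be a JSON object
    return ParsedCall(name=name, arguments=arguments)


if __name__ == "__main__":
    tools = {"random_number_generator", "get_weather"}
    reply = 'get_weather\n{"city_name": "苏州"}'
    print(parse_glm4_content(reply, tools))
```

Restricting the first line to the tool names declared in the system prompt reduces the chance of misreading an ordinary reply as a function call.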


Function calling fails with the quantized glm-4-9b-chat bin model (q8_0 precision) #345
