
.Net: Update Ollama Connector to support streaming with function calling. #10292

Open
landonzeng opened this issue Jan 24, 2025 · 5 comments
Assignees
Labels
bug Something isn't working .NET Issue or Pull requests regarding .NET code

Comments

@landonzeng

Describe the bug
When I run the following code:

var endpoint = new Uri("http://localhost:11434/v1/");
var modelId = "qwen2.5:14b";

var httpClient = new HttpClient();
var builder = Kernel.CreateBuilder()
    .AddOpenAIChatCompletion(modelId: modelId!, apiKey: null, endpoint: endpoint, httpClient: httpClient);
var kernel = builder.Build();

var chatCompletionService = kernel.GetRequiredService<IChatCompletionService>();

var openAIPromptExecutionSettings = new OpenAIPromptExecutionSettings()
{
    ToolCallBehavior = ToolCallBehavior.AutoInvokeKernelFunctions
};
KernelArguments kernelArgs = new(openAIPromptExecutionSettings);
var history = new ChatHistory();
string? userInput;
do
{
    Console.Write("User > ");
    userInput = Console.ReadLine();
    history.AddUserMessage(userInput!);

    var result = chatCompletionService.GetStreamingChatMessageContentsAsync(history, executionSettings: openAIPromptExecutionSettings, kernel: kernel);
    string fullMessage = "";

    Console.Write("Assistant > ");
    await foreach (var content in result)
    {
        Console.Write(content.Content);
        fullMessage += content.Content;
    }
    Console.WriteLine();

    history.AddAssistantMessage(fullMessage);
} while (userInput is not null);

If the executionSettings parameter is passed to chatCompletionService.GetStreamingChatMessageContentsAsync, streaming output stops working; if it is omitted, streaming works correctly. For example:

chatCompletionService.GetStreamingChatMessageContentsAsync(history,  kernel: kernel);

With this change, the console streams the AI's response as expected.

Expected behavior
GetStreamingChatMessageContentsAsync should stream the response content correctly.

Platform

  • Language: C#
  • Source: 1.34.0
  • AI model: qwen2.5:14b
  • IDE: Visual Studio
  • OS: Windows
@landonzeng landonzeng added the bug Something isn't working label Jan 24, 2025
@markwallace-microsoft markwallace-microsoft added .NET Issue or Pull requests regarding .NET code triage labels Jan 24, 2025
@github-actions github-actions bot changed the title from "Bug: Streaming output stops working when executionSettings is passed to GetStreamingChatMessageContentsAsync" to ".Net: Bug: Streaming output stops working when executionSettings is passed to GetStreamingChatMessageContentsAsync" Jan 24, 2025
@RogerBarreto
Member

@landonzeng This is not a supported scenario for the OpenAI connector. For use against Ollama, we recommend using the OllamaSharp IChatClient. Please follow the example here:

var chatService = ollamaClient.AsChatCompletionService();

OR you can use our Ollama Connector extensions through the KernelBuilder or IServiceCollection types.
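
For context, a minimal sketch of the two registration paths mentioned above could look like this (assuming the prerelease Microsoft.SemanticKernel.Connectors.Ollama package and OllamaSharp; exact package and extension names may differ between versions):

// Sketch, assuming the prerelease Microsoft.SemanticKernel.Connectors.Ollama package and OllamaSharp.
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.ChatCompletion;
using OllamaSharp;

var endpoint = new Uri("http://localhost:11434"); // native Ollama endpoint, not the OpenAI-compatible /v1 route
var modelId = "qwen2.5:14b";

// Option 1: wrap an OllamaSharp client directly.
var ollamaClient = new OllamaApiClient(endpoint, modelId);
var chatService = ollamaClient.AsChatCompletionService();

// Option 2: register through the kernel builder extension.
var kernel = Kernel.CreateBuilder()
    .AddOllamaChatCompletion(modelId: modelId, endpoint: endpoint)
    .Build();
var chatFromKernel = kernel.GetRequiredService<IChatCompletionService>();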

Regarding function calling, I also see you are using our legacy ToolCallBehavior; we recommend using the new FunctionChoiceBehavior:

OpenAIPromptExecutionSettings settings = new() { FunctionChoiceBehavior = FunctionChoiceBehavior.Auto() };

i.e.:

var openAIPromptExecutionSettings = new OpenAIPromptExecutionSettings()
{
    FunctionChoiceBehavior = FunctionChoiceBehavior.Auto()
};

Let me know if this solves your problem; otherwise, feel free to reopen the issue.

@RogerBarreto RogerBarreto moved this from Bug to Sprint: Done in Semantic Kernel Jan 30, 2025
@landonzeng
Author

landonzeng commented Feb 5, 2025

Following your suggestion, I added the following code to the method:

var openAIPromptExecutionSettings = new OpenAIPromptExecutionSettings()
{
    FunctionChoiceBehavior = FunctionChoiceBehavior.Auto()
};

After making this change, the response is still not streamed. The plugin is invoked correctly, but the endpoint only returns to the frontend after the entire response has been received.
The full endpoint code is below:

[HttpPost("test")]
[Produces("text/event-stream")]
public async Task<IResult> SSETest([FromBody] QuestionDTO input)
{
    // Note: the history is built here but never passed to the prompt invocation below.
    var _history = new ChatHistory();

    _semanticKernelService.Kernel.Plugins.AddFromType<TimesPlugin>();

    _history.AddUserMessage(input.Question);

    OpenAIPromptExecutionSettings settings = new() { FunctionChoiceBehavior = FunctionChoiceBehavior.Auto() };

    var content = _semanticKernelService.Kernel.InvokePromptStreamingAsync(input.Question, new(settings));

    Response.Headers.ContentType = "text/event-stream";
    Response.Headers.CacheControl = "no-cache";
    await Response.Body.FlushAsync();

    if (content is null)
    {
        var error = JsonSerializer.Serialize(MessageModel<string>.Fail("Generation failed"));
        // SSE events must be terminated by a blank line ("\n\n") so clients dispatch them.
        await Response.WriteAsync($"data: {error}\n\n");
        await Response.Body.FlushAsync();
    }
    else
    {
        await foreach (var item in content)
        {
            // Serialize the payload, matching the error path above.
            var payload = JsonSerializer.Serialize(MessageModel<string>.Ok(item.ToString()));
            await Response.WriteAsync($"data: {payload}\n\n");
            await Response.Body.FlushAsync();
        }
    }

    await Response.Body.FlushAsync();
    _semanticKernelService.Kernel.Plugins.Clear();
    return Results.Empty;
}
public class TimesPlugin
{
    [KernelFunction]
    [Description("Returns the current time")]
    public Task<string> GetCurrTimeAsync()
    {
        Console.WriteLine("[Native function] Returns the current time - GetCurrTimeAsync");
        return Task.FromResult(DateTime.Now.ToString());
    }

    [KernelFunction]
    [Description("Returns the current date")]
    public Task<string> GetCurrDateAsync()
    {
        Console.WriteLine("[Native function] Returns the current date - GetCurrDateAsync");
        return Task.FromResult(DateTime.Now.Date.ToString());
    }

    [KernelFunction]
    [Description("Returns the current month")]
    public Task<string> GetCurrMonthsAsync()
    {
        Console.WriteLine("[Native function] Returns the current month - GetCurrMonthsAsync");
        return Task.FromResult(DateTime.Now.Month.ToString());
    }

    [KernelFunction]
    [Description("Returns the current day of the week")]
    public Task<string> GetCurrWeekAsync()
    {
        Console.WriteLine("[Native function] Returns the current day of the week - GetCurrWeekAsync");
        return Task.FromResult(DateTime.Now.DayOfWeek.ToString());
    }

    [KernelFunction]
    [Description("Calculates the total number of minutes for a given time")]
    [return: Description("Total minutes")]
    public Task<string> CalculateTimeHour([Description("Hours")] int hour, [Description("Minutes")] int minute)
    {
        Console.WriteLine("[Native function] Calculates the total minutes for a given time - CalculateTimeHour");
        return Task.FromResult((hour * 60 + minute).ToString());
    }
}

[Screenshot: with the plugin, the response arrives as a single block]

If I don't use the plugin, streaming output works normally:

[Screenshot: without the plugin, the response is streamed]

@RogerBarreto

@landonzeng
Author

landonzeng commented Feb 5, 2025

I've found that as soon as FunctionChoiceBehavior = FunctionChoiceBehavior.Auto() is used, the method no longer yields streamed data; it only returns after all the data has arrived. This looks like a bug.
@RogerBarreto

@RogerBarreto
Member

As far as I know, streaming is not fully supported for function calling with Ollama. Last time I checked, when using streaming with function calling, the result of the function was not streamed in chunks but rather provided as one big block.

I will reopen this issue to investigate the streaming behavior of Ollama, but for now I recommend using the non-streaming API.
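
For reference, the non-streaming equivalent of the original loop would look roughly like this (a sketch reusing the history, chatCompletionService, and kernel variables from the repro code above, not an exact drop-in):

// Sketch: non-streaming call with automatic function invocation,
// reusing history, chatCompletionService, and kernel from the repro code above.
var settings = new OpenAIPromptExecutionSettings
{
    FunctionChoiceBehavior = FunctionChoiceBehavior.Auto()
};

var reply = await chatCompletionService.GetChatMessageContentAsync(
    history,
    executionSettings: settings,
    kernel: kernel);

Console.WriteLine(reply.Content);
history.AddAssistantMessage(reply.Content ?? string.Empty);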

@RogerBarreto RogerBarreto reopened this Feb 11, 2025
@RogerBarreto RogerBarreto moved this from Sprint: Done to Sprint: Planned in Semantic Kernel Feb 11, 2025
@John0King

Based on my test with the DeepSeek API (using the OpenAI connector), I find that when calling the streaming GetStreamingChatMessageContentsAsync, the streaming response is not the final UpdateResponseMessageContent but a UpdateFunctionCallContent, and it ends with the UpdateFunctionCallContent.
I think this is what it does:

[user]      ----> question with tool description
[assistant] ----> tool call streaming response
          |-> tool call streaming response ends
 append chatHistory
 AsyncEnumerable ends

In fact, when I call GetStreamingChatMessageContentsAsync again, it then gives me the UpdateResponseMessageContent with the real message response.

So I think this is indeed a bug!

What I want is a streaming chat client with tool support:

[user] --> question
[assistant]: streaming response:
       |--> response to the user: "I'm looking at your data/file" (tool call)
       |--> response with the answer to the user's question, using the tool response

[user] --> question 2
[assistant]: streaming response:
      |-> ....
      |-> ....

...
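
A possible interim workaround for the flow described above is the manual function-invocation pattern from the Semantic Kernel samples: let the model request tools without auto-invoking them, accumulate the streamed function-call updates, invoke the calls, and loop so the follow-up answer is also streamed. A sketch (type and method names assume Semantic Kernel 1.x; an illustration, not a verified fix for this issue):

// Sketch of manual function invocation with streaming, following the pattern shown in the
// Semantic Kernel samples; reuses history, chatCompletionService, and kernel from earlier snippets.
var settings = new OpenAIPromptExecutionSettings
{
    // Let the model request tools, but do not auto-invoke them.
    FunctionChoiceBehavior = FunctionChoiceBehavior.Auto(autoInvoke: false)
};

while (true)
{
    AuthorRole? role = null;
    var callBuilder = new FunctionCallContentBuilder();

    // Each pass streams either text chunks or function-call updates.
    await foreach (var update in chatCompletionService.GetStreamingChatMessageContentsAsync(history, settings, kernel))
    {
        if (update.Content is not null)
        {
            Console.Write(update.Content); // stream text chunks to the user as they arrive
        }
        role ??= update.Role;
        callBuilder.Append(update); // accumulate partial function-call updates
    }

    var functionCalls = callBuilder.Build();
    if (functionCalls.Count == 0)
    {
        break; // no tool calls requested; the streamed text was the final answer
    }

    // Record the assistant's tool-call message, invoke each call, then loop to stream the follow-up.
    var assistantMessage = new ChatMessageContent(role ?? AuthorRole.Assistant, content: null);
    history.Add(assistantMessage);

    foreach (var call in functionCalls)
    {
        assistantMessage.Items.Add(call);
        var result = await call.InvokeAsync(kernel);
        history.Add(result.ToChatMessage());
    }
}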

@RogerBarreto RogerBarreto changed the title from ".Net: Bug: Streaming output stops working when executionSettings is passed to GetStreamingChatMessageContentsAsync" to ".Net: Update Ollama Connector to support streaming with function calling." Mar 30, 2025
Projects
Status: Sprint: Planned
Development

No branches or pull requests

5 participants