Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Python: Feat/Add DEVELOPER role for openai o1 #10033

Merged

Conversation

ymuichiro
Copy link
Contributor

@ymuichiro ymuichiro commented Dec 24, 2024

Motivation and Context

Currently, the OpenAI O1 model introduces a new role "developer". However, passing the conventional "system" role results in an error. To address this issue, the following changes are proposed.

Description

  1. Add a Method to the ChatHistory Class

    • Implement a method to handle role conversion or substitution logic for compatibility with the O1 model.
  2. Expand the AuthorRole Enum

    • Add "developer" as a new value to the AuthorRole enum.
  3. Improve Developer Experience (UX)

    • If the "system" role is mistakenly passed, the logic should internally convert it to "developer" for better UX.
    • However, since all models other than O1 still use "system", simply adding support for the "developer" role seems to be the most optimal solution at this point.

Background

  • The O1 model no longer supports the "system" role, causing errors when it is passed.
  • The introduction of the "developer" role must be integrated in a way that preserves compatibility with other models.

Usage

import asyncio

from semantic_kernel.connectors.ai.function_choice_behavior import FunctionChoiceBehavior
from semantic_kernel.connectors.ai.open_ai.prompt_execution_settings.azure_chat_prompt_execution_settings import (
    AzureChatPromptExecutionSettings,
)
from semantic_kernel.connectors.ai.open_ai.services.azure_chat_completion import AzureChatCompletion
from semantic_kernel.contents.chat_history import ChatHistory
from semantic_kernel.kernel import Kernel

async def main():
    kernel = Kernel()
    service_id = "azure-openai"
    kernel.add_service(
        service=AzureChatCompletion(
            service_id=service_id,
            api_key=*********,
            endpoint=*********,
            deployment_name="o1",
            api_version="2024-12-01-preview",
        )
    )

    settings = kernel.get_prompt_execution_settings_from_service_id(service_id=service_id)

    if isinstance(settings, AzureChatPromptExecutionSettings):
        settings.function_choice_behavior = FunctionChoiceBehavior.Auto(auto_invoke=True)

    service = kernel.get_service(service_id=service_id)

    if not isinstance(service, AzureChatCompletion):
        raise Exception("Invalid Value")

    history = ChatHistory()
    history.add_developer_message("You are a helpful assistant.")
    history.add_user_message("hello")

    result = await service.get_chat_message_contents(chat_history=history, settings=settings, kernel=kernel)

    if not result:
        raise Exception("result is None")

    print(result[0].content)

if __name__ == "__main__":
    asyncio.run(main())

Contribution Checklist

@ymuichiro ymuichiro requested a review from a team as a code owner December 24, 2024 01:31
@markwallace-microsoft markwallace-microsoft added the python Pull requests for the Python Semantic Kernel label Dec 24, 2024
@moonbox3
Copy link
Contributor

Thanks for having a look at this, @ymuichiro. I think it’d be good to make sure other o1 api features are in as well. I saw they have a reasoning_effort param, too. Are there other features we’d want to get in to make sure the o1 experience works well?

@ymuichiro
Copy link
Contributor Author

@moonbox3

Thank you for checking this so quickly. Regarding o1, here are the changes I’m aware of:

※ It might be worth considering providing OpenAIChatReasoningBase as a separate service from OpenAIChatCompletionBase in the future. However, since it seems that the functionality is being gradually integrated, I believe the appropriate place to incorporate these changes for now would be in OpenAIChatPromptExecutionSettings.

o1: 2024-12-17
API: 2024-09-01-preview

  1. In o1, the system role has been deprecated, and a new developer role has been introduced.
  2. In o1, max_tokens has been deprecated and replaced with max_completion_tokens.
    max_completion_tokens: int
    • The value represents the sum of reasoning tokens and completion tokens, with a recommendation to specify 25,000 or more.
  3. A new property, reasoning_effort, has been added in o1.
    reasoning_effort: enum, low, medium, high
    • Higher values indicate that the model will spend more time processing the request.
  4. A new usage metric, reasoning_tokens, has been introduced in o1.
    • This represents the number of tokens used during reasoning.
{
  "completion_tokens": 1843,
  "prompt_tokens": 20,
  "total_tokens": 1863,
  "completion_tokens_details": {
    "audio_tokens": null,
    "reasoning_tokens": 448
  },
  "prompt_tokens_details": {
    "audio_tokens": null,
    "cached_tokens": 0
  }
}

Currently Unsupported

• Parallel tool invocation
temperature, top_p, presence_penalty, frequency_penalty, logprobs, top_logprobs, logit_bias

It would be nice if these were applied to settings, usage, and author_role.

@eavanvalkenburg
Copy link
Member

I agree @moonbox3 bit more work is needed to make sure we fully support o1, these extra settings can just be added with None defaults, but we also need to consider the rendered prompt to ChatHistory parser, that uses a system message as default, that can be left as is, but might throw when used with o1 but might also be usefull to have a look at how we can support this as well!

@ymuichiro
Copy link
Contributor Author

@moonbox3 @eavanvalkenburg

Thank you for your comment. After reviewing the current Semantic Kernel code, I found several instances where AuthorRole.SYSTEM is used as a default value. Since these involve shared classes that are used not only with OpenAI models but also with other models, I believe it is not appropriate to make changes that could impact them. Additionally, to handle these cases comprehensively, it is necessary to determine whether the model in question is a traditional ChatCompletion model or a Reasoning model.

With that in mind, I am planning to proceed with the following tasks. Does this approach make sense to you?

  1. Support for the new role AuthorRole.DEVELOPER

    Add a new inference class to automatically replace AuthorRole.SYSTEM with AuthorRole.DEVELOPER where specified.

    • Additions to model types:

      • Add REASONING to OpenAIModelTypes in semantic_kernel/connectors/ai/open_ai/services/open_ai_model_types.py.
    • Additions to inference classes:

      • Add the following three classes extending OpenAIChatCompletionBase in semantic_kernel/connectors/ai/open_ai/services/open_ai_chat_completion_base.py:
        • OpenAIChatReasoning
        • AzureChatReasoning
      • Add the following class extending AzureAIInferenceBase in semantic_kernel/connectors/ai/azure_ai_inference/services/azure_ai_inference_base.py:
        • AzureAIInferenceChatReasoning

※ Most methods can be reused, so it might be sufficient to simply override some methods of the existing OpenAIChatCompletion (...etc) class.

  1. Addition of new configuration parameters:
    1. max_completion_tokens
    2. reasoning_effort

Regarding item 2, even if certain properties cannot be used, I believe it is the responsibility of the user to ensure only valid properties are passed. Therefore, I plan to set the default values for these properties to None, so they will only be used if explicitly specified.

Furthermore, regarding the new response property reasoning_tokens, since it depends on ChatCompletionChunk from openai.types.chat, I am not including it in the current scope of work. However, as the reasoning result chunks are preserved as inner_content, this value is expected to be included as is in the current implementation.

@moonbox3
Copy link
Contributor

moonbox3 commented Jan 6, 2025

Hi @ymuichiro thanks for the follow up details. Just getting back to the office after holidays - apologies for the slow reply.

As I have a lot of items to go through, let me dig deeper into this today, and I can provide some more comments. As per @eavanvalkenburg's comment above, we do need to think about how the chat history will be handled now that o1 doesn't support the system message -- today one can specify the system message via the constructor, so we'd need to now handle this in a different way.

Another quick thought is: could we have OpenAIReasoningPromptExecutionSettings, which would contain the new properties? That way we're not trying to add too much to the chat prompt execution settings -- nothing is needing to change with the chat models. I also kind of like OpenAIReasoning instead of OpenAIChatReasoning but that might be a personal preference.

Let me circle back soon. And thank you again for all your work on this thus far.

@eavanvalkenburg
Copy link
Member

I'm not sure if a seperate PES makes things easier or harder, not having too many settings is great, but we rely on the class property for the "right" PES at different points and having two possibly valid PES's means headaches for that, we could say, check the model id, if starts with o then use Reasoning, but that doesn't work for Azure OpenAI since there we can't reason about which model is actually behind a deployment, so I don't think that would be ideal, I think keeping it simple and add the new params with None defaults, is still easier.

@eavanvalkenburg
Copy link
Member

@ymuichiro thanks for your patience, me and @moonbox3 had a chat on this, the current PR is good, and please also add the two new params (max_completion_tokens and reasoning_effort) to the existing OpenAIChatPromptExecutionSettings, we will need to have a broader discussion in our team on how to handle this more broadly so for now we will just add without breaking anyone! Thanks again!

@ymuichiro ymuichiro force-pushed the feat/python-add-developer-role-for-openai-o1 branch from 0c5fdb0 to afb791d Compare January 7, 2025 09:58
@ymuichiro
Copy link
Contributor Author

ymuichiro commented Jan 7, 2025

@moonbox3 @eavanvalkenburg

Thank you very much for your review. I have just added a commit with additional modifications.
Regarding the changes, I thought splitting the commits would make them too granular, so I have rebased them into a single commit o1 Provisional Support.

@eavanvalkenburg
Copy link
Member

Don't worry about your commits, the pr gets squashed on merge anyway

@moonbox3
Copy link
Contributor

moonbox3 commented Jan 7, 2025

Don't worry about your commits, the pr gets squashed on merge anyway

Was just going to write this -- you beat me to it!

@markwallace-microsoft
Copy link
Member

markwallace-microsoft commented Jan 7, 2025

Python Test Coverage

Python Test Coverage Report •
FileStmtsMissCoverMissing
semantic_kernel/connectors/ai/open_ai/prompt_execution_settings
   open_ai_prompt_execution_settings.py99199%139
semantic_kernel/contents
   chat_history.py165299%109, 114
TOTAL16695177889% 

Python Unit Test Overview

Tests Skipped Failures Errors Time
3000 4 💤 0 ❌ 0 🔥 1m 11s ⏱️

@moonbox3
Copy link
Contributor

moonbox3 commented Jan 7, 2025

Thanks for your help on this, @ymuichiro.

@ymuichiro
Copy link
Contributor Author

To resolve the test error, I investigated the cause and found a challenging issue, so I am documenting it for now. The error occurs in the Azure AI Inference section. The test verifies whether all defined roles can be parsed by the Message class of Azure AI Inference.

MESSAGE_CONVERTERS: dict[AuthorRole, Callable[[ChatMessageContent], ChatRequestMessage]] = {

To pass the test, we would need to link the AuthorRole defined on the Semantic Kernel side with the ChatRole defined in the Azure AI Inference SDK. However, the Azure AI Inference SDK does not yet include the DEVELOPER role...

https://github.com/Azure/azure-sdk-for-python/blob/3fef975949b7fcb0a45fdde599902d3e39a6c0d0/sdk/ai/azure-ai-inference/azure/ai/inference/models/_enums.py#L29

As a temporary measure, we could prepare a placeholder function with a TODO: note or similar. However, this approach risks missing future updates, so I have started considering how best to address this issue ☹️

@eavanvalkenburg
Copy link
Member

@ymuichiro we had a chat on this one, we think it's best if the other classes raise an error for now when presented with a DEVELOPER message, that way we don't have to introduce behavior changes later on (the alternative is to map from developer to system, but then in the future if Azure AI Inference does support Developer it will change the behavior)

@ymuichiro
Copy link
Contributor Author

@eavanvalkenburg
Thank you. I have implemented support for Azure AI inference.

@moonbox3 @eavanvalkenburg
I discovered new issues and addressed them as follows:

  1. 404 error occurs even when parallel_tool_call is set to False

    • I modified the code to allow None for parallel_tool_call and excluded it from the request.
  2. There are unsupported properties not documented in the OpenAI documentation

    • The following were also not supported:
      • role: tool
      • property: tool_choice
  3. reasoning_effort is not supported in the current openai-python==1.57.2

    • Changed the version in uv.lock to 1.58.0

@ymuichiro ymuichiro force-pushed the feat/python-add-developer-role-for-openai-o1 branch from b5629ab to fd3e0dc Compare January 9, 2025 11:51
@ymuichiro
Copy link
Contributor Author

sorry. It seems that the uv.lock file has a conflict. Since uv is being used, I ran uv add "openai==1.58”, but is a different command actually used in this case?

@ymuichiro
Copy link
Contributor Author

@moonbox3 @eavanvalkenburg

I resolved the conflict in uv.lock. To update the openai package to a version that supports “reasoning_effort,” I attempted to update uv.lock using the following commands. However, since these commands consistently broke uv.lock, I ended up updating it manually.

# The command was executed with the expectation that it would resolve the issue automatically.
# however, this command ended up rewriting the markers in uv.lock
uv sync --all-extras --dev --frozen
uv lock --upgrade-package "openai==1.58"

Note:
As part of troubleshooting, I worked within a devcontainer on GitHub Codespaces, but I was unable to resolve the issue.

@moonbox3
Copy link
Contributor

@moonbox3 @eavanvalkenburg

I resolved the conflict in uv.lock. To update the openai package to a version that supports “reasoning_effort,” I attempted to update uv.lock using the following commands. However, since these commands consistently broke uv.lock, I ended up updating it manually.

# The command was executed with the expectation that it would resolve the issue automatically.
# however, this command ended up rewriting the markers in uv.lock
uv sync --all-extras --dev --frozen
uv lock --upgrade-package "openai==1.58"

Note: As part of troubleshooting, I worked within a devcontainer on GitHub Codespaces, but I was unable to resolve the issue.

Would you like some help with this? I can pull down your branch and take a look. Let me know!

@ymuichiro
Copy link
Contributor Author

@moonbox3

Thank you! This time, I was able to resolve the issue manually. Therefore, no further help is needed for this PULL REQUEST.🙏

@eavanvalkenburg
Copy link
Member

thanks @ymuichiro you just have a readme conflict, then we can get this in!

@ymuichiro
Copy link
Contributor Author

@eavanvalkenburg

Conflicts were resolved!

@moonbox3 moonbox3 added this pull request to the merge queue Jan 15, 2025
Merged via the queue into microsoft:main with commit 2e4455b Jan 15, 2025
25 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation python Pull requests for the Python Semantic Kernel
Projects
Status: Sprint: Done
Development

Successfully merging this pull request may close these issues.

Python: New Feature: Add Support for "developer" Role for OpenAI o1 model
5 participants