Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: make-rag-optional-but-default #2123

Conversation

jeannotdamoiseaux
Copy link
Contributor

Purpose

Added functionality to optionally disable RAG from the developer settings. The default option remains using all documents in the knowledge base.

This change also addresses hallucination issues with sources observed when RAG was disabled, caused by the master prompt instruction to include sources in the answer. The fix involves injecting the source-related part of the master prompt only when RAG is enabled. Additionally, the supporting content button is now displayed (in the chat) or enabled (in the analysis panel) only when supporting content is available.

Does this introduce a breaking change?

When developers merge from main and run the server, azd up, or azd deploy, will this produce an error?
If you're not sure, try it out on an old environment.

[ ] Yes
[ X ] No

Does this require changes to learn.microsoft.com docs?

This repository is referenced by this tutorial
which includes deployment, settings and usage instructions. If text or screenshot need to change in the tutorial,
check the box below and notify the tutorial author. A Microsoft employee can do this for you if you're an external contributor.

[ ] Yes
[ X ] No

Type of change

[ ] Bugfix
[ X ] Feature
[ ] Code style update (formatting, local variables)
[ ] Refactoring (no functional changes, no api changes)
[ ] Documentation content changes
[ ] Other... Please describe:

Code quality checklist

See CONTRIBUTING.md for more details.

  • The current tests all pass (python -m pytest).
  • I added tests that prove my fix is effective or that my feature works
  • I ran python -m pytest --cov to verify 100% coverage of added lines
  • I ran python -m mypy to check for type errors
  • I either used the pre-commit hooks or ran ruff and black manually on my code.

Note: Many tests are failing with the message "... value does not match the expected value in snapshot tests/snapshots/test_app/...". These failures need to be addressed before merging.

@jeannotdamoiseaux
Copy link
Contributor Author

@pamelafox, could you help me identify where the errors in the test scripts might be coming from? When I deploy the app, everything seems to work as expected on my end.

bnodir added a commit to bnodir/azure-search-openai-demo that referenced this pull request Nov 6, 2024
@@ -79,7 +79,8 @@
"retrieveCount": "Hent dette antal søgeresultater:",
"includeCategory": "Inkludér kategori",
"includeCategoryOptions": {
"all": "Alle"
"all": "Alle",
"none": "Ingen"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@EMjetrot Can you review this string- "Ingen" for "None" , as in "no categories"?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pamelafox - That is correct. "Ingen" as in no categories and "Alle" as in all categories.

@@ -58,7 +58,7 @@ def system_message_chat_conversation(self):
return """Assistant helps the company employees with their healthcare plan questions, and questions about the employee handbook. Be brief in your answers.
Answer ONLY with the facts listed in the list of sources below. If there isn't enough information below, say you don't know. Do not generate answers that don't use the sources below. If asking a clarifying question to the user would help, ask the question.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jeannotdamoiseaux - Only after overriding the prompt template, I received an answer not based on the sources. Otherwise, the response was 'I don't know.' This might be due to line 59 of the master prompt instruction. Can you confirm if this is intended to be like this?
image

Copy link
Contributor Author

@jeannotdamoiseaux jeannotdamoiseaux Nov 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bnodir This is actually the intended behavior. With this feature, RAG is essentially turned off, transforming the app into a more generic chatbot that relies on the general knowledge of the LLM.

@@ -83,7 +83,8 @@
"retrieveCount": "ここで指定する検索結果数を取得:",
"includeCategory": "カテゴリを指定",
"includeCategoryOptions": {
"all": "全て"
"all": "全て",
"none": "无"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The correct Japanese translation for "none" is "なし".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review! I’ve made the changes.

@@ -58,7 +58,7 @@ def system_message_chat_conversation(self):
return """Assistant helps the company employees with their healthcare plan questions, and questions about the employee handbook. Be brief in your answers.
Answer ONLY with the facts listed in the list of sources below. If there isn't enough information below, say you don't know. Do not generate answers that don't use the sources below. If asking a clarifying question to the user would help, ask the question.
If the question is not in English, answer in the language used in the question.
Each source has a name followed by colon and the actual information, always include the source name for each fact you use in the response. Use square brackets to reference the source, for example [info1.txt]. Don't combine sources, list each source separately, for example [info1.txt][info2.pdf].
{sources_reference_content}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We've considered moving the prompts into Jinja files. I'm wondering if we should make that change first, as this could read a bit nicer, like

{% if should_reference_sources $}
Each source...
{% endif %}

Our current method of storing the templates inside multi-line strings is not easy to work with, so I've been hoping to move them at least into separate files, and just hadn't decided about .txt versus Jinja versus prompty files. Jinja is a happy medium. That could be done by you or I in a separate PR. Thoughts?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. I'll create a separate PR for this first.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implemented in #2164

sources_content = []
extra_info = {"thoughts": [], 'data_points': []}

if include_category != "__NONE__":
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think include_category != "__NONE__" condition should be executed only once, stored in a variable, and be under a comment that says that NONE is a value passed down by the frontend categories picker.
I'm nervous about special magical values like NONE so want to make it clear where it came from and minimize places to use it incorrectly.
It could also be a class attribute on Approach, like NO_CATEGORIES

extra_info = {"thoughts": [], 'data_points': []}

if include_category != "__NONE__":
tools: List[ChatCompletionToolParam] = [
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm. It'd be nice to make this change in a way that doesn't cause this large indentation change, since that can make it harder for developers to merge in new changes..but I think that's not possible, right? Just musing out loud.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about that as well. I think the only way to avoid indentation is to create a separate function. What do you think?


extra_info = {
"data_points": data_points,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You seem to have lost the data_points from thought process, please bring them back.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This behavior is intentional, as there are no data points when RAG is disabled. However, the solution could be more elegant than simply overwriting the extra_info variable.

@@ -198,7 +198,8 @@ export const Settings = ({
onChange={(_ev?: React.FormEvent<HTMLElement | HTMLInputElement>, option?: IDropdownOption) => onChange("includeCategory", option?.key || "")}
aria-labelledby={includeCategoryId}
options={[
{ key: "", text: t("labels.includeCategoryOptions.all") }
{ key: "", text: t("labels.includeCategoryOptions.all") },
{ key: "__NONE__", text: t("labels.includeCategoryOptions.none") }
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not 100% sure that this will be intuitive to folks as the way to go into "non-RAG" mode. Before looking at the PR, I was expecting a top-level checkbox like "Use sources for answers". But it's also nice to avoid adding even more things to settings. Do any other folks have thoughts?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given my use case, this approach is preferable. We will move this setting out of the developer settings and make it available to all users. This way, they will have a single dropdown to select a relevant subset of the documents in the index.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jeannotdamoiseaux - Am I correct in understanding that the 'Include category' dropdown will now be visible in the UI for all users, and that selecting 'None' in the dropdown will effectively disable RAG mode?

If this is the case, I believe the UI would be more intuitive for users if these options were presented as two separate settings: one for disabling RAG mode and another for selecting a specific category, provided that RAG mode is enabled and multiple categories are available in the index.

Additionally, as a developer, I would need the ability to control the visibility of these settings for end users. This is important because it represents a significant shift in the application's intended use and would require updates to our legal assessments (DPIA) if such functionality were permitted. Furthermore, some developers might want to allow category selection without providing the option to disable RAG mode.

Copy link
Contributor Author

@jeannotdamoiseaux jeannotdamoiseaux Nov 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@EMjetrot No, in this repository, it will remain a developer setting. However, in our clone, we plan to migrate the dropdown to the UI, making it accessible to all users.

By the way, I’m curious to know which organization you're working with, as we also need to comply with regulations like DPIAs as a governmental entity.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point by @EMjetrot, this feature is significant enough that there should be an option to enable/disable, as many organizations won't want to enable a general purpose chat. Given that, and concerns about unintuitive nature of the "None" dropdown, I'd vote for either a separate developer setting or a setting in the user space. (We don't have an obvious place for it, probably next to User uploads, but it's getting crowded. Perhaps you have ideas)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My preference would be to remove the text next to the buttons, move the 'Upload File' button to the left of the chat input container, change the 'Remove Chat' button to 'New Chat,' and place it in the top-left (similar to the ChatGPT interface). The RAG switch could either be placed to the left of the question input container or in the top-right.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Re "upload file": I'd be slightly concerned that folks would think the file was only for that particular question, since that's often a feature of chat interfaces, versus persistent.
  • "Remove chat" -> "New chat" seems like a good change, given that "Remove chat" doesn't actually remove from chat history for folks using that. That's also the phrasing used by GitHub Copilot Chat.
  • Placing in top-left to match ChatGPT could work. I don't feel strongly either way.
  • I do like when there's text next to icons, as I can be a bit "icon-blind" at times, but I realize that's not always feasible. It's fine as long as we keep accessible tooltips.

cc @zedhaque as I think he's thought through this for the mobile-optimized design

@pamelafox
Copy link
Collaborator

@jeannotdamoiseaux The snapshot tests are failing due to the addition of a newline from the template change. They can be updated with pytest --snapshot-update and the differences reviewed in GitHub.

@jeannotdamoiseaux
Copy link
Contributor Author

@jeannotdamoiseaux The snapshot tests are failing due to the addition of a newline from the template change. They can be updated with pytest --snapshot-update and the differences reviewed in GitHub.

I'm new to this type of testing and not entirely sure what steps to follow here. Could you clarify what exactly needs to be done?

@jeannotdamoiseaux jeannotdamoiseaux deleted the feat--make-rag-optional-but-default branch December 4, 2024 15:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants