
Commit

add open in colab
masci committed Oct 6, 2024
1 parent 97fc036 commit c125cd0
Showing 1 changed file with 84 additions and 57 deletions.
141 changes: 84 additions & 57 deletions cookbooks/Prompt_Caching_with_Anthropic.ipynb
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "JPUfjAlRUB8w"
},
"source": [
"# Use banks to cache prompts with Anthropic API\n",
"\n",
"<a target=\"_blank\" href=\"https://colab.research.google.com/github/masci/banks/blob/main/cookbooks/Prompt_Caching_with_Anthropic.ipynb\">\n",
" <img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/>\n",
"</a>\n",
"\n",
"Prompt caching allows you to store and reuse context within your prompt, saving time and money. When using Anthropic's prompt caching feature, chat messages have to be expressed as blocks rather than as simple text, so that the cache behaviour can be defined for each block.\n",
"\n",
"Let's see how Banks makes this super easy."
]
},
{
"cell_type": "code",
@@ -40,47 +30,61 @@
},
{
"cell_type": "markdown",
"source": [
"To simulate a huge prompt, we'll provide Claude with a full book in the context, \"Pride and prejudice\"."
],
"metadata": {
"id": "QF9UZVjaUsK1"
}
},
"source": [
"To simulate a huge prompt, we'll provide Claude with a full book in the context, \"Pride and prejudice\"."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "Ayno0BHEStAm"
},
"outputs": [],
"source": [
"!curl -O https://www.gutenberg.org/cache/epub/1342/pg1342.txt"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Read the whole book and assign it to the `book` variable."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "N2EcJ1P6Svx6"
},
"outputs": [],
"source": [
"with open(\"pg1342.txt\") as f:\n",
"    book = f.read()"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "xzTdJJubVGkL"
},
"source": [
"With Banks we can define which specific part of the prompt will be cached, directly from the prompt template text:\n",
"the `cache_control` built-in filter tells Anthropic that we want to cache the whole text resulting from\n",
"the `{{ book }}` template block."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "7PO4397MSm-f"
},
"outputs": [],
"source": [
"import time\n",
"\n",
@@ -106,15 +110,22 @@
"chat_messages = p.chat_messages({\"book\": book})\n",
"# dump the ChatMessage objects into dictionaries to pass to LiteLLM\n",
"messages_dict = [m.model_dump(exclude_none=True) for m in chat_messages]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's call the Anthropic API for the first time. We don't expect any difference from a normal call without caching."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "3hsJHr29ThLj"
},
"outputs": [],
"source": [
"# First call has no cache\n",
"start_time = time.time()\n",
@@ -123,15 +134,22 @@
"print(f\"Non-cached API call time: {time.time() - start_time:.2f} seconds\")\n",
"print(response.usage)\n",
"print(response)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now the book content is in the cache, and the difference in time and cost repeating the previous call is obvious."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "8F75jH4BTZ6U"
},
"outputs": [],
"source": [
"# Second call, the book is cached\n",
"start_time = time.time()\n",
@@ -140,12 +158,21 @@
"print(f\"Cached API call time: {time.time() - start_time:.2f} seconds\")\n",
"print(response.usage)\n",
"print(response)"
]
}
],
"metadata": {
"colab": {
"provenance": []
},
"kernelspec": {
"display_name": "Python 3",
"name": "python3"
},
"language_info": {
"name": "python"
}
},
"nbformat": 4,
"nbformat_minor": 0
}
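
For reference, the collapsed hunks above hide the cells that define the prompt template and the LiteLLM calls. Below is a minimal sketch of that flow, assuming the banks `Prompt` API and the `cache_control` built-in filter the notebook text describes; the template wording, the `{% chat %}` tag usage, and the model name are illustrative assumptions, not the commit's exact content.

import time

from banks import Prompt
from litellm import completion

# Hypothetical reconstruction of the collapsed template cell: the book text
# interpolated by {{ book }} is marked for caching with the cache_control
# built-in filter, so Anthropic caches that block across calls.
template = """
{% chat role="user" %}
Analyze the book included below.

{{ book | cache_control("ephemeral") }}

What is the title of this book? Output only the title.
{% endchat %}
"""

p = Prompt(template)
# Render the template into block-based chat messages...
chat_messages = p.chat_messages({"book": book})
# ...and dump the ChatMessage objects into dictionaries to pass to LiteLLM.
messages_dict = [m.model_dump(exclude_none=True) for m in chat_messages]

# The model name is an assumption; depending on the LiteLLM version,
# Anthropic's prompt-caching beta may also require an extra request header.
start_time = time.time()
response = completion(model="anthropic/claude-3-5-sonnet-20240620", messages=messages_dict)
print(f"API call time: {time.time() - start_time:.2f} seconds")
print(response.usage)

Running this twice, as the notebook does, should show the second call reporting cache-read tokens in `response.usage` and returning noticeably faster.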
