docs: update v3 documentation, readme and examples #1526

Merged
merged 6 commits into from
Jan 17, 2025

23 changes: 15 additions & 8 deletions README.md
@@ -25,12 +25,16 @@ Load your data, save them as a dataframe, and push them to the platform
```python
import pandasai as pai

df = pai.read_csv("./filepath.csv")
df.push()
pai.api_key.set("your-pai-api-key")

file = pai.read_csv("./filepath.csv")

df.save(path="your-organization/dataset-name",
dataset = pai.create(path="your-organization/dataset-name",
df=file,
name="dataset-name",
description="dataset-description")

dataset.push()
```
Your team can now access and query this data using natural language through the platform.

@@ -113,7 +117,7 @@ df.chat(
You can also pass multiple dataframes to PandaAI and ask questions that relate them to each other.

```python
from pandasai import Agent
import pandasai as pai

employees_data = {
'EmployeeID': [1, 2, 3, 4, 5],
@@ -133,8 +137,7 @@ salaries_df = pai.DataFrame(salaries_data)
# You can get your free API key by signing up at https://app.pandabi.ai (you can also configure it in your .env file)
pai.api_key.set("your-pai-api-key")

agent = Agent([employees_df, salaries_df])
agent.chat("Who gets paid the most?")
pai.chat("Who gets paid the most?", employees_df, salaries_df)
```

```
@@ -145,16 +148,20 @@ You can find more examples in the [examples](examples) directory.

## 📜 License

PandaAI is available under the MIT expat license, except for the `pandasai/ee` directory (which has it's [license here](https://github.com/Sinaptik-AI/pandas-ai/blob/master/pandasai/ee/LICENSE) if applicable.
PandaAI is available under the MIT expat license, except for the `pandasai/ee` directory of this repository, which has its [license here](https://github.com/Sinaptik-AI/pandas-ai/blob/master/pandasai/ee/LICENSE).

If you are interested in managed PandaAI Cloud or self-hosted Enterprise Offering, [contact us](https://forms.gle/JEUqkwuTqFZjhP7h8).
If you are interested in managed PandaAI Cloud or self-hosted Enterprise Offering, [contact us](https://getpanda.ai/pricing).

## Resources

> **Beta Notice**
> Release v3 is currently in beta. The following documentation and examples reflect the features and functionality in progress and may change before the final release.

- [Docs](https://pandas-ai.readthedocs.io/en/latest/) for comprehensive documentation
- [Examples](examples) for example notebooks
- [Discord](https://discord.gg/KYKj9F2FRH) for discussion with the community and PandaAI team


## 🤝 Contributing

Contributions are welcome! Please check the outstanding issues and feel free to open a pull request.
2 changes: 1 addition & 1 deletion docs/mint.json
@@ -64,7 +64,7 @@
},
{
"group": "Natural Language",
"pages": ["v3/overview-nl", "v3/large-language-models", "v3/chat-and-cache", "v3/output-formats"],
"pages": ["v3/overview-nl", "v3/large-language-models", "v3/chat-and-output"],
"version": "v3"
},
{
88 changes: 65 additions & 23 deletions docs/v3/agent.mdx
@@ -3,19 +3,22 @@ title: 'Agent'
description: 'Add few-shot learning to your PandaAI agent'
---

<Note title="Beta Notice">
Release v3 is currently in beta. This documentation reflects the features and functionality in progress and may change before the final release.
</Note>

You can train PandaAI to understand your data better and to improve its performance. Training is as easy as calling the `train` method on the `Agent`.


## Prerequisites

Before you start training PandaAI, you need to set your PandaAI API key. You can generate your API key by signing up at [https://app.pandabi.ai](https://app.pandabi.ai).

Then you can set your API key as an environment variable:
Before you start training PandaAI, you need to set your PandaAI API key.
You can generate your API key by signing up at [https://app.pandabi.ai](https://app.pandabi.ai).

```python
import os
import pandasai as pai

os.environ["PANDABI_API_KEY"] = "YOUR_PANDABI_API_KEY"
pai.api_key.set("your-pai-api-key")
```

Make sure the API key is set; otherwise training fails with the following error: `No vector store provided. Please provide a vector store to train the agent`.
@@ -33,10 +36,10 @@ The training uses by default the `BambooVectorStore` to store the training data,
As an alternative, if you want to use a local vector store (enterprise only for production use cases), you can use the `ChromaDB`, `Qdrant` or `Pinecone` vector stores (see examples below).

```python
import pandasai as pai
from pandasai import Agent

# Set your PandaAI API key (you can generate one by signing up at https://app.pandabi.ai)
os.environ["PANDABI_API_KEY"] = "YOUR_PANDABI_API_KEY"
pai.api_key.set("your-pai-api-key")

agent = Agent("data.csv")
agent.train(docs="The fiscal year starts in April")
@@ -61,19 +64,22 @@ agent = Agent("data.csv")

# Train the model
query = "What is the total sales for the current fiscal year?"
response = """
import pandas as pd

df = dfs[0]
# The following code is passed as a string to the response variable
response = '\n'.join([
'import pandas as pd',
'',
'df = dfs[0]',
'',
'# Calculate the total sales for the current fiscal year',
'total_sales = df[df[\'date\'] >= pd.to_datetime(\'today\').replace(month=4, day=1)][\'sales\'].sum()',
'result = { "type": "number", "value": total_sales }'
])

# Calculate the total sales for the current fiscal year
total_sales = df[df['date'] >= pd.to_datetime('today').replace(month=4, day=1)]['sales'].sum()
result = { "type": "number", "value": total_sales }
"""
agent.train(queries=[query], codes=[response])

response = agent.chat("What is the total sales for the last fiscal year?")
print(response)

# The model will use the information provided in the training to generate a response
```

@@ -110,15 +116,17 @@ agent = Agent("data.csv", vectorstore=vector_store)

# Train the model
query = "What is the total sales for the current fiscal year?"
response = """
import pandas as pd

df = dfs[0]
# The following code is passed as a string to the response variable
response = '\n'.join([
'import pandas as pd',
'',
'df = dfs[0]',
'',
'# Calculate the total sales for the current fiscal year',
'total_sales = df[df[\'date\'] >= pd.to_datetime(\'today\').replace(month=4, day=1)][\'sales\'].sum()',
'result = { "type": "number", "value": total_sales }'
])

# Calculate the total sales for the current fiscal year
total_sales = df[df['date'] >= pd.to_datetime('today').replace(month=4, day=1)]['sales'].sum()
result = { "type": "number", "value": total_sales }
"""
agent.train(queries=[query], codes=[response])

response = agent.chat("What is the total sales for the last fiscal year?")
@@ -145,3 +153,37 @@ vector_store = BambooVectorStore(api_key="YOUR_PANDABI_API_KEY")
# Instantiate the agent with the custom vector store
agent = Agent(connector, config={...}, vectorstore=vector_store)
```
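
The training snippets above filter on `pd.to_datetime('today').replace(month=4, day=1)`, which is April 1 of the current calendar year; for dates from January through March that day is still in the future. Below is a minimal sketch of the date arithmetic using only the standard library, assuming the April start stated in the training docs. `fiscal_year_start` and `previous_fiscal_year` are hypothetical helper names, not part of PandaAI.

```python
from datetime import date
from typing import Tuple


def fiscal_year_start(today: date, start_month: int = 4) -> date:
    """Return the first day of the fiscal year that contains `today`."""
    # Before the start month, the current fiscal year began in the previous calendar year.
    year = today.year if today.month >= start_month else today.year - 1
    return date(year, start_month, 1)


def previous_fiscal_year(today: date, start_month: int = 4) -> Tuple[date, date]:
    """Return the half-open range [start, end) of the last completed fiscal year."""
    end = fiscal_year_start(today, start_month)
    start = date(end.year - 1, start_month, 1)
    return start, end
```

Bounds computed this way could replace the single `replace(...)` comparison in a trained code snippet, so that questions about the current and the last fiscal year select different date ranges.
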
## Custom Head

In some cases, you might want to provide custom data samples to the conversational agent to improve its understanding and responses. For example, you might want to:
- Provide better examples that represent your data patterns
- Avoid sharing sensitive information
- Guide the agent with specific data scenarios

You can do this by passing a custom head to the agent:

```python
import pandas as pd
import pandasai as pai

# Your original dataframe
df = pd.DataFrame({
'sensitive_id': [1001, 1002, 1003, 1004, 1005],
'amount': [150, 200, 300, 400, 500],
'category': ['A', 'B', 'A', 'C', 'B']
})

# Create a custom head with anonymized data
head_df = pd.DataFrame({
'sensitive_id': [1, 2, 3, 4, 5],
'amount': [100, 200, 300, 400, 500],
'category': ['A', 'B', 'C', 'A', 'B']
})

# Use the custom head
smart_df = pai.SmartDataframe(df, config={
"custom_head": head_df
})
```

The agent will use your custom head instead of the default first 5 rows of the dataframe when analyzing and responding to queries.
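
Instead of typing the anonymized sample by hand, it could be derived from the real rows. The sketch below uses plain Python with no PandaAI calls; `anonymized_head` is a hypothetical helper, and the rows are illustrative values in the same shape as the example above.

```python
from typing import Any, Dict, List


def anonymized_head(rows: List[Dict[str, Any]], sensitive_key: str, n: int = 5) -> List[Dict[str, Any]]:
    """Copy the first n rows, replacing the sensitive column with surrogate ids 1..n."""
    head = []
    for i, row in enumerate(rows[:n], start=1):
        sample = dict(row)          # shallow copy so the original rows stay untouched
        sample[sensitive_key] = i   # surrogate keeps the column name and integer type
        head.append(sample)
    return head


# Illustrative rows (assumed values, same columns as the example above)
rows = [
    {"sensitive_id": 1001, "amount": 150, "category": "A"},
    {"sensitive_id": 1002, "amount": 200, "category": "B"},
    {"sensitive_id": 1003, "amount": 300, "category": "A"},
]
head_rows = anonymized_head(rows, "sensitive_id")
```

The resulting list of dictionaries can be turned into a dataframe with `pd.DataFrame(head_rows)` before being passed as the custom head.
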
4 changes: 4 additions & 0 deletions docs/v3/ai-dashboards.mdx
@@ -3,6 +3,10 @@ title: 'AI Dashboards'
description: 'Turn your dataframes into collaborative AI dashboards'
---

<Note title="Beta Notice">
Release v3 is currently in beta. This documentation reflects the features and functionality in progress and may change before the final release.
</Note>

PandaAI provides a [data platform](https://app.pandabi.ai) that maximizes the power of your [semantic dataframes](/v3/dataframes).
With a single line of code, you can turn your dataframes into auto-updating AI dashboards - no UI development needed.
Each dashboard comes with a pre-generated set of insights and a conversational agent that helps you and your team explore the data through natural language.
64 changes: 0 additions & 64 deletions docs/v3/chat-and-cache.mdx

This file was deleted.

43 changes: 39 additions & 4 deletions docs/v3/output-formats.mdx → docs/v3/chat-and-output.mdx
@@ -1,12 +1,48 @@
---
title: 'Output formats'
description: 'Understanding the different output formats supported by PandaAI'
title: "Chat and output formats"
description: "Learn how to use PandaAI's powerful chat functionality and the output formats for natural language data analysis"
---

PandaAI supports multiple output formats for responses, each designed to handle different types of data and analysis results effectively. This document outlines the available output formats and their use cases.
<Note title="Beta Notice">
Release v3 is currently in beta. This documentation reflects the features and functionality in progress and may change before the final release.
</Note>

## Chat

The `.chat()` method is PandaAI's core feature that enables natural language interaction with your data. It allows you to:
- Query your data using plain English
- Generate visualizations and statistical analyses
- Work with multiple DataFrames simultaneously

For a more UI-based data analysis experience, check out our [Data Platform](/v3/ai-dashboards).

### Basic Usage

```python
import pandasai as pai

df_customers = pai.load("company/customers")

response = df_customers.chat("Which are our top 5 customers?")
```

### Chat with multiple DataFrames

```python
import pandasai as pai

df_customers = pai.load("company/customers")
df_orders = pai.load("company/orders")
df_products = pai.load("company/products")

response = pai.chat('Who are our top 5 customers and what products do they buy most frequently?', df_customers, df_orders, df_products)
```

## Available Output Formats

PandaAI supports multiple output formats for responses, each designed to handle different types of data and analysis results effectively. This section outlines the available output formats and their use cases.
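
The training snippets in `docs/v3/agent.mdx` end by assigning a `result` dictionary with a `type` and a `value`, for example `{ "type": "number", "value": total_sales }`. The following is a rough sketch of that convention, not PandaAI's implementation: `wrap_result` is a hypothetical helper, and the `"string"` label is an assumption, since only `"number"` appears in this diff.

```python
from typing import Any, Dict


def wrap_result(value: Any) -> Dict[str, Any]:
    """Wrap a Python value in a {"type": ..., "value": ...} dictionary."""
    # bool is a subclass of int, so reject it before the numeric check
    if isinstance(value, bool):
        raise TypeError("booleans are not handled in this sketch")
    if isinstance(value, (int, float)):
        return {"type": "number", "value": value}
    if isinstance(value, str):
        # "string" is an assumed label; only "number" is confirmed by the source
        return {"type": "string", "value": value}
    raise TypeError(f"unsupported result type: {type(value).__name__}")
```
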


### DataFrame Response
Used when the result is a pandas DataFrame. This format preserves the tabular structure of your data and allows for further data manipulation.

@@ -28,7 +64,6 @@ The response format is automatically determined based on the type of analysis pe

Example:
```python
import pandas as pd
import pandasai as pai

df = pai.load("my-org/users")
39 changes: 0 additions & 39 deletions docs/v3/conversational-agent.mdx

This file was deleted.
