-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(sandbox): adding docker sandbox #1517
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❌ Changes requested. Reviewed everything up to fdcfb87 in 1 minute and 57 seconds
More details
- Looked at
338
lines of code in6
files - Skipped
1
files when reviewing. - Skipped posting
7
drafted comments based on config settings.
1. pandasai/sandbox/Dockerfile:7
- Draft comment:
It's recommended to specify versions for the Python packages installed to ensure reproducibility. For example:
RUN pip install pandas==1.3.3 numpy==1.21.2 matplotlib==3.4.3
- Reason this comment was not posted:
Confidence changes required:50%
The Dockerfile installs specific Python packages but does not specify versions. This can lead to inconsistencies if the packages are updated. It's a best practice to specify package versions to ensure reproducibility.
2. pandasai/sandbox/docker_sandbox.py:99
- Draft comment:
Ensure thatdfs
is not empty before accessingdfs[0]
to avoid potentialIndexError
. - Reason this comment was not posted:
Comment did not seem useful.
3. pandasai/sandbox/docker_sandbox.py:71
- Draft comment:
Consider expanding the SQL query detection to include other SQL operations like "INSERT", "UPDATE", or "DELETE" for a more comprehensive extraction. - Reason this comment was not posted:
Confidence changes required:50%
Theextract_sql_queries_from_code
method only looks for SQL queries containing "SELECT". This might miss other types of SQL queries like "INSERT", "UPDATE", or "DELETE". Consider expanding the logic to capture a broader range of SQL queries.
4. pandasai/sandbox/docker_sandbox.py:147
- Draft comment:
Consider making the path "/tmp" configurable or document the reason for choosing this path for transferring the tar archive to the container. - Reason this comment was not posted:
Confidence changes required:30%
Thepass_csv
method uses a hardcoded path "/tmp" for transferring the tar archive to the container. This might not be suitable for all use cases. Consider making the path configurable or document the choice of path.
5. pandasai/sandbox/sandbox.py:8
- Draft comment:
raise NotImplementedError("Subclasses must implement the stop method.")
- Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable:
- The current error message follows a consistent pattern used throughout the file. 2. The suggested change would break this consistency. 3. Both messages convey the same meaning, but the existing one matches the style used in start(), _exec_code(), and pass_csv() methods. 4. Consistency in error messages is generally more important than minor wording preferences.
The suggested message might be marginally more concise. The consistency argument might not be as important if the other messages should also be changed.
While the suggested message is slightly shorter, maintaining consistency across the codebase is more valuable than a minor rewording that doesn't significantly improve clarity.
The comment should be deleted as it suggests breaking the established error message pattern without providing meaningful improvement.
6. pandasai/sandbox/sandbox.py:6
- Draft comment:
raise NotImplementedError("Subclasses must implement the start method.")
- Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable:
The suggested change is extremely minor - just rewording the same message in a slightly different way. Both versions are grammatically correct and clear. The original message follows the same pattern as other error messages in the file. Making this change would actually make this error message inconsistent with the others.
The suggested wording could be considered more active voice and direct. Error messages should be clear and consistent.
While active voice is good, consistency across error messages in the same file is more important. The original message is already clear and matches the style of other messages.
Delete this comment as it suggests an unnecessary rewording that would make the error messages less consistent within the file.
7. pandasai/sandbox/sandbox.py:28
- Draft comment:
"Subclasses must implement the pass_csv method."
- Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable:
The current error message follows the same pattern as other error messages in the file ("The X method must be implemented by subclasses"). The suggested change would break this consistency. While both messages convey the same meaning, consistency in error messages is valuable for readability and maintenance.
The suggested message might be considered more concise. Error messages should be clear and to the point.
While conciseness is good, consistency across related error messages is more important for maintainability and readability. The current message already follows a clear, established pattern.
The comment should be deleted as it suggests breaking the consistent error message pattern established in the file without providing significant benefit.
Workflow ID: wflow_LuwXf4cZub1aDXIj
Want Ellipsis to fix these issues? Tag @ellipsis-dev
in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet
mode, and more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❌ Changes requested. Reviewed everything up to b630742 in 2 minutes and 14 seconds
More details
- Looked at
996
lines of code in14
files - Skipped
2
files when reviewing. - Skipped posting
7
drafted comments based on config settings.
1. examples/docker_sandbox.py:1
- Draft comment:
Typo in comment. ChangeIntall
toInstall
. - Reason this comment was not posted:
Confidence changes required:10%
The comment on line 1 ofexamples/docker_sandbox.py
contains a typo in the word 'Install'.
2. extensions/sandbox/docker/pandasai_docker/docker_sandbox.py:163
- Draft comment:
Consider adding type annotations for thecsv_data
parameter inpass_csv
method for better code clarity. - Reason this comment was not posted:
Confidence changes required:30%
TheDockerSandbox
class indocker_sandbox.py
has a methodpass_csv
that takes acsv_data
parameter, but it is not type-annotated. Adding type annotations improves code readability and helps with static analysis.
3. extensions/sandbox/docker/pandasai_docker/docker_sandbox.py:186
- Draft comment:
Ensure the container is running before attempting to stop and remove it in the__del__
method to avoid potential errors. - Reason this comment was not posted:
Confidence changes required:50%
Indocker_sandbox.py
, the__del__
method should check if the container is running before attempting to stop and remove it to avoid potential errors.
4. extensions/sandbox/docker/pandasai_docker/serializer.py:20
- Draft comment:
Ensureresult
contains atype
key before accessing it to avoid potentialKeyError
. - Reason this comment was not posted:
Comment did not seem useful.
5. examples/docker_sandbox.py:1
- Draft comment:
Typo in comment. ChangeIntall
toInstall
. - Reason this comment was not posted:
Confidence changes required:50%
The comment on line 1 ofexamples/docker_sandbox.py
contains a typo in the word 'Install'.
6. examples/docker_sandbox.py:21
- Draft comment:
Typo in method call. Changesanbox.stop()
tosandbox.stop()
. - Reason this comment was not posted:
Marked as duplicate.
7. extensions/sandbox/docker/pandasai_docker/docker_sandbox.py:88
- Draft comment:
Typo in parameter name. Changeenviroment
toenvironment
. This typo appears in multiple places in this file. - Reason this comment was not posted:
Marked as duplicate.
Workflow ID: wflow_fREAPEFqLWwZDa2Y
Want Ellipsis to fix these issues? Tag @ellipsis-dev
in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet
mode, and more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Looks good to me! Incremental review on f880b01 in 40 seconds
More details
- Looked at
61
lines of code in3
files - Skipped
0
files when reviewing. - Skipped posting
2
drafted comments based on config settings.
1. pandasai/sandbox/sandbox.py:23
- Draft comment:
"The _exec_code method must be implemented by subclasses."
- Reason this comment was not posted:
Comment was on unchanged code.
2. extensions/sandbox/docker/pandasai_docker/docker_sandbox.py:118
- Draft comment:
"execute_sql_query function is not defined in the environment."
- Reason this comment was not posted:
Comment was on unchanged code.
Workflow ID: wflow_jU5dIBuygeamV8WF
You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet
mode, and more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Looks good to me! Incremental review on 221c1dc in 13 seconds
More details
- Looked at
11
lines of code in1
files - Skipped
0
files when reviewing. - Skipped posting
2
drafted comments based on config settings.
1. extensions/sandbox/docker/README.md:8
- Draft comment:
The change frompoetry install
topoetry add
is correct for adding a package dependency. - Reason this comment was not posted:
Confidence changes required:0%
The change from 'poetry install' to 'poetry add' is correct for adding a package dependency. However, the PR description is empty, which is not ideal for understanding the context of the change.
2. extensions/sandbox/docker/README.md:8
- Draft comment:
The change frompoetry install
topoetry add
is correct and improves the installation instructions. No issues found here. - Reason this comment was not posted:
Confidence changes required:0%
The change in the README.md file is correct and improves the installation instructions. No issues found.
Workflow ID: wflow_a3ZTqJFTY63AjpZz
You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet
mode, and more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Skipped PR review on a341e07 because no changed files had a supported extension. If you think this was in error, please contact us and we'll fix it right away.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well done, just a few leftovers!
@ArslanSaleem one of the most interesting features of the docker is the fact of running the code without internet connection. Docker allows it with Could we enforce it so that it can run offline? |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## release/v3 #1517 +/- ##
==============================================
- Coverage 79.09% 78.80% -0.29%
==============================================
Files 149 61 -88
Lines 6013 2015 -3998
==============================================
- Hits 4756 1588 -3168
+ Misses 1257 427 -830
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>
…into feat/sandbox
Done |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❌ Changes requested. Incremental review on a78039d in 39 seconds
More details
- Looked at
15
lines of code in1
files - Skipped
0
files when reviewing. - Skipped posting
0
drafted comments based on config settings.
Workflow ID: wflow_XvwQSg5a7DWDlIzr
Want Ellipsis to fix these issues? Tag @ellipsis-dev
in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet
mode, and more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Looks good to me! Incremental review on 5d99577 in 14 seconds
More details
- Looked at
35
lines of code in1
files - Skipped
0
files when reviewing. - Skipped posting
0
drafted comments based on config settings.
Workflow ID: wflow_62CoZ3EI3jQYCCc7
You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet
mode, and more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Looks good to me! Incremental review on 1806976 in 32 seconds
More details
- Looked at
90
lines of code in2
files - Skipped
1
files when reviewing. - Skipped posting
2
drafted comments based on config settings.
1. docs/v3/agent.mdx:137
- Draft comment:
The addition of the sandbox environment is well-documented. Ensure users are aware of the security benefits and usage instructions. - Reason this comment was not posted:
Confidence changes required:20%
The addition of the sandbox environment is a significant change, and the documentation should ensure that users are aware of the security benefits and usage instructions. The documentation seems clear and provides necessary details.
2. docs/v3/agent.mdx:137
- Draft comment:
Ensure that there are corresponding unit tests for the new sandbox feature to verify its functionality. - Reason this comment was not posted:
Comment did not seem useful.
Workflow ID: wflow_OS4qeKIVzVU72BOR
You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet
mode, and more.
@@ -134,6 +134,56 @@ print(response) | |||
# The model will use the information provided in the training to generate a response | |||
``` | |||
|
|||
## Using the Sandbox Environment | |||
|
|||
To enhance security and protect against malicious code through prompt injection, PandaAI provides a sandbox environment for code execution. The sandbox runs your code in an isolated Docker container, ensuring that potentially harmful operations are contained. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe it would be useful to specify that the docker daemon must be running in the machine before using this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@scaliseraoul Yes definitely we should mention
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ArslanSaleem great, can we add it accordingly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall code is solid and well tested.
There are few things that could be discussed.
-
The first time user run
sandbox = DockerSandbox()
2 container with random names get created in Docker. And not deleted. -
It would be more solid to have the serializer code as part of the pandasai-docker image instead of adding the code in the code_executor.
-
The SQL function is still executed outside of the docker so we should think how to avoid any DML statements (INSERT, UPDATE, DELETE, DROP etc.)
-
Should df.chat and pai.chat have the possibility to use the sandbox as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approving based on previous feedback.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Looks good to me! Incremental review on 878f325 in 50 seconds
More details
- Looked at
85
lines of code in3
files - Skipped
0
files when reviewing. - Skipped posting
3
drafted comments based on config settings.
1. extensions/sandbox/docker/pandasai_docker/Dockerfile:3
- Draft comment:
LABEL image_name="pandasai-sandbox"
- Reason this comment was not posted:
Comment looked like it was already resolved.
2. extensions/sandbox/docker/pandasai_docker/docker_sandbox.py:20
- Draft comment:
self._image_name: str = image_name
- Reason this comment was not posted:
Comment was not on a valid diff hunk.
3. extensions/sandbox/docker/pandasai_docker/Dockerfile:3
- Draft comment:
The image name in the LABEL should be consistent with the rest of the code.
LABEL image_name="pandasai-sandbox"
- Reason this comment was not posted:
Marked as duplicate.
Workflow ID: wflow_m3miv8VMEUEd1pd6
You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet
mode, and more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❌ Changes requested. Incremental review on 7c18a8c in 2 minutes and 4 seconds
More details
- Looked at
66
lines of code in2
files - Skipped
0
files when reviewing. - Skipped posting
2
drafted comments based on config settings.
1. extensions/sandbox/docker/tests/test_sandbox.py:59
- Draft comment:
Ensuremock_subprocess.run
is called with the correct image name and arguments.
mock_subprocess.run.assert_called_once_with([
"docker",
"build",
"-f",
dockerfile_path,
"-t",
image_name,
".",
], check=True, capture_output=True, text=True)
- Reason this comment was not posted:
Comment did not seem useful.
2. extensions/sandbox/docker/tests/test_sandbox.py:38
- Draft comment:
The testtest_build_image
has been correctly updated to mocksubprocess.run
instead ofself._client.images.build
. Ensure that the test covers all necessary scenarios for the new implementation. - Reason this comment was not posted:
Confidence changes required:50%
The_build_image
method indocker_sandbox.py
has been modified to usesubprocess.run
instead ofself._client.images.build
. The corresponding testtest_build_image
has been updated to mocksubprocess.run
. This is a good practice to ensure the new implementation is tested.
Workflow ID: wflow_xeBc5cpcz4LXXbH5
Want Ellipsis to fix these issues? Tag @ellipsis-dev
in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet
mode, and more.
Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>
Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>
Important
Adds a Docker-based sandbox environment for secure code execution in PandaAI, with comprehensive documentation and testing.
DockerSandbox
class inpandasai_docker/docker_sandbox.py
for isolated code execution using Docker.Sandbox
base class insandbox.py
for sandbox interface.Agent
class inbase.py
to support sandbox execution.agent.mdx
to include sandbox usage instructions.README.md
for Docker sandbox extension.DockerSandbox
intest_sandbox.py
.ResponseSerializer
intest_serializer.py
.Sandbox
base class intest_sandbox.py
.mint.json
to change logo href.Dockerfile
for sandbox environment setup.pyproject.toml
for managing dependencies with Poetry.This description was created by for 7c18a8c. It will automatically update as commits are pushed.