-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: transpiling sql to specific dialect #1596
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❌ Changes requested. Reviewed everything up to b98fb72 in 1 minute and 53 seconds
More details
- Looked at
119
lines of code in6
files - Skipped
0
files when reviewing. - Skipped posting
8
drafted comments based on config settings.
1. pandasai/data_loader/sql_loader.py:44
- Draft comment:
Transpiling query using 'to_dialect=source_type' without specifying 'from_dialect' may lead to inconsistent behavior compared to DuckDB loader. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 20% vs. threshold = 50%
The comment raises a valid technical concern about SQL dialect conversion. However, without seeing the implementation of SQLParser.transpile_sql_dialect() or understanding the default behavior when from_dialect is omitted, we can't be certain this is actually an issue. The mention of DuckDB seems to require knowledge of other parts of the codebase. The comment is speculative ("may lead to") rather than definitive.
I might be too harsh - this could be a real compatibility issue that an experienced developer spotted. The DuckDB reference might indicate a known issue in the system.
While it could be a real issue, we need stronger evidence within this file's context to keep the comment. The speculative nature and reference to external components make it less actionable.
Delete the comment because it's speculative ("may lead to") and requires knowledge of external components (DuckDB loader) to understand or verify the issue.
2. pandasai/data_loader/view_loader.py:93
- Draft comment:
Ensure that transpiling with 'to_dialect=source_type' correctly handles the expected dialect conversion for view queries. - Reason this comment was not posted:
Marked as duplicate.
3. pandasai/data_loader/duck_db_connection_manager.py:37
- Draft comment:
Hardcoding the input dialect as 'mysql' may be too restrictive. Consider exposing or detecting the source dialect. - Reason this comment was not posted:
Marked as duplicate.
4. pandasai/data_loader/sql_loader.py:44
- Draft comment:
Missing explicit 'from_dialect' parameter may lead to inconsistent transpilation compared to other parts. Confirm if input queries are in the expected dialect. - Reason this comment was not posted:
Marked as duplicate.
5. pandasai/data_loader/view_loader.py:93
- Draft comment:
As above, consider whether a 'from_dialect' argument is needed for consistency in transpiling queries. - Reason this comment was not posted:
Marked as duplicate.
6. pandasai/query_builders/sql_parser.py:55
- Draft comment:
Consider adding error handling (try/except) around parse_one to provide clearer feedback on SQL parsing failures. - Reason this comment was not posted:
Marked as duplicate.
7. tests/unit_tests/data_loader/test_sql_loader.py:156
- Draft comment:
Test assertion relies on specific SQL formatting which may be brittle if sqlglot's pretty-printing changes. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 20% vs. threshold = 50%
The comment raises a valid concern - the test is now asserting exact string matching including whitespace and newlines, which could break if the SQL formatter's output changes. However, this seems to be an intentional change to match the actual behavior of the system. The test should reflect the real behavior.
The comment may be too speculative - we don't know if sqlglot's formatting is actually likely to change. The test might intentionally verify the exact formatting.
While the formatting could change, tests should verify actual behavior. If the formatting changes, the test should be updated to match the new behavior.
The comment should be deleted. While it raises a valid point about test brittleness, exact string matching is sometimes necessary in tests and changes to the formatting would require intentional test updates.
8. tests/unit_tests/query_builders/test_sql_parser.py:62
- Draft comment:
Basing tests on exact output formatting can lead to brittle tests if the underlying formatter changes. Consider using a more flexible comparison. - Reason this comment was not posted:
Marked as duplicate.
Workflow ID: wflow_A6ReyntpAVBI0Alg
Want Ellipsis to fix these issues? Tag @ellipsis-dev
in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet
mode, and more.
Important
Add SQL dialect transpilation to data loaders using
SQLParser.transpile_sql_dialect
and update tests accordingly.SQLParser.transpile_sql_dialect
added to convert SQL queries to specific dialects.DuckDBConnectionManager.sql
now transpiles queries from MySQL to DuckDB.SQLDatasetLoader.execute_query
andViewDatasetLoader.execute_query
transpile queries to the source type.test_sql_loader.py
to check for transpiled query format.test_mysql_transpilation
intest_sql_parser.py
to verify SQL dialect conversion.This description was created by
for b98fb72. It will automatically update as commits are pushed.