feat(redshift-mcp-server): add describe_execution_plan tool#1927
feat(redshift-mcp-server): add describe_execution_plan tool#1927saeedma8 wants to merge 0 commit into
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1927 +/- ##
==========================================
+ Coverage 91.40% 92.60% +1.19%
==========================================
Files 1017 911 -106
Lines 75524 73452 -2072
Branches 12186 11721 -465
==========================================
- Hits 69036 68020 -1016
+ Misses 3997 3298 -699
+ Partials 2491 2134 -357 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
Looking... |
794efdd to
76d2479
Compare
|
This pull request is now marked as stale because it hasn't seen activity for a while. Add a comment or it will be closed soon. If you wish to exclude this issue from being marked as stale, add the "backlog" label. |
|
WIP |
3aa8f6d to
45342ed
Compare
grayhemp
left a comment
There was a problem hiding this comment.
Additionally a general request: I see some tests which don't relate to the subject of the PR are either changed or removed. Please make sure that the CR changes relevant functionality only and leaves unrelated tests intact.
|
@saeedma8 let's also:
Thanks. |
10ad683 to
0bf3f81
Compare
91d22f1 to
24aa1eb
Compare
8fa2354 to
807ebf1
Compare
ef258a5 to
af2156c
Compare
|
Reviewed and commented on few changes in DM. |
8c0693e to
4ba1082
Compare
| # Match native scans ("on <table>") and external/Spectrum scans where the | ||
| # table name follows directly or after "of" (PartitionInfo scans). | ||
| scan_pattern = regex.compile( | ||
| r'(?:S3 Seq Scan|S3 Query Scan|Seq Scan PartitionInfo of|Seq Scan|Index Scan|Bitmap Heap Scan) (?:on )?(\S+)', |
There was a problem hiding this comment.
Index Scan|Bitmap Heap Scan? Does RS have those?
| stats_sequential_scans: Optional[int] = Field( | ||
| None, description='Number of sequential scans initiated' | ||
| ) | ||
| stats_sequential_tuples_read: Optional[int] = Field( | ||
| None, description='Number of live rows fetched by sequential scans' | ||
| ) | ||
| stats_rows_inserted: Optional[int] = Field(None, description='Number of rows inserted') | ||
| stats_rows_updated: Optional[int] = Field(None, description='Number of rows updated') | ||
| stats_rows_deleted: Optional[int] = Field(None, description='Number of rows deleted') |
There was a problem hiding this comment.
For the stat columns we need to clarify "since the last statistics reset" in the description.
| f'Could not fetch table extra stats (may not be supported on serverless): {table_info_error}' | ||
| ) | ||
| # Continue without table info - tables already have None values |
There was a problem hiding this comment.
I don't really like that we are "swallowing" a possible error here. Please double-check (provide evidence) if the query is runnable on provisioned/serverless with superuser/non-superuser user. If it is true for all 4 options then we need it to be an error and raise.
| - table_type: The type of the table (TABLE, VIEW, EXTERNAL TABLE, SHARED TABLE). | ||
| - remarks: Remarks about the table. | ||
|
|
||
| Redshift-specific properties (from pg_catalog, may be None on serverless): |
There was a problem hiding this comment.
That doesn't seem to be true. Dist style and estimated rows should be available normally on serverless. Please double check and provide evidence.
| 4. 'SHARED TABLE' types indicate tables from datashares. | ||
| 5. Use table names for subsequent column discovery and query operations. | ||
| 6. Consider table permissions (table_acl) for access planning. | ||
| 7. Monitor table statistics and run ANALYZE when needed for accurate query planning. |
There was a problem hiding this comment.
"run ANALYZE" should be "suggest the user using ANALYZE" - you can not run ANALYZE directly throuh the MCP because of RO mode.
| 1. First use list_clusters to get valid cluster identifiers. | ||
| 2. Then use list_databases to get valid database names for the cluster. | ||
| 3. Do NOT include EXPLAIN keyword in your SQL - the tool adds it automatically. | ||
| 4. Run ANALYZE on tables before checking execution plans for accurate cost estimates. |
There was a problem hiding this comment.
Again, "suggest the user running ANALYZE", see the comment above.
| 3. Do NOT include EXPLAIN keyword in your SQL - the tool adds it automatically. | ||
| 4. Run ANALYZE on tables before checking execution plans for accurate cost estimates. | ||
| 5. Compare plans before and after schema changes to measure optimization impact. | ||
| 6. Use this tool to validate query performance before running expensive queries. |
There was a problem hiding this comment.
"Ask the user if they want to ..."
| # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| # See the License for the specific language governing permissions and | ||
| # limitations under the License. | ||
|
|
There was a problem hiding this comment.
All that is a complete overkill. We just need to import BaseSchema, create a DumySchema(BaseSchema) class, and tests that it eliminates None fileds when dumped and dumped as JSON.
| - Access permissions | ||
| - Remarks and metadata | ||
| - Access permissions and remarks | ||
| - Redshift-specific metadata (from pg_catalog tables, may be None on serverless): |
There was a problem hiding this comment.
"may be None on serverless" - as well here.
| - Human-readable plan text (summarized for large plans) | ||
| - Rule-based performance optimization suggestions derived from plan analysis, table design, and column statistics | ||
|
|
||
| **Permissions note**: Column planner statistics from `pg_stats` require SELECT privilege on the table. Table activity statistics from `pg_stat_user_tables` are visible to all users regardless of permissions. |
There was a problem hiding this comment.
Clarify on pg_stats "... for the statistics to be visible." soe it won't create an impression that it will raise permission error.
471a3fb to
abe37a4
Compare
Fixes #1019
Summary
Added describe_execution_plan tool to the Redshift MCP. This tool uses Redshift's EXPLAIN VERBOSE command to explain how a query would be executed, providing insights into query performance without actually running the query.
Changes
User experience
User can get execution plan and execution verbose plan by using this tool.
Checklist
If your change doesn't seem to apply, please leave them unchecked.
Is this a breaking change? (N)
RFC issue number: #1019
Checklist:
Acknowledgment
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of the project license.
Testing explain only
E2E Testing