Skip to content

Feature Request: Structured Diff Output API #5703

@VISHNU7KASIREDDY

Description

@VISHNU7KASIREDDY

Summary

Add a new RPC method to the schema engine that returns structured, machine-readable diff output instead of only human-readable text or SQL scripts.

Motivation

Currently, migrateDiff only outputs human-readable text or SQL scripts to stdout. This makes it difficult to programmatically analyze schema changes without brittle regex parsing. Several use cases would benefit from structured diff output:

  1. Auto-generating migration names based on schema changes (e.g., add_email_to_User, create_table_Post)
  2. Building diff visualization tools that need to understand what changed
  3. Automated schema analysis for CI/CD pipelines
  4. Custom migration workflows that need to inspect changes before applying

Current Limitations

The existing diff RPC method:

  • Returns only an exit code in the JSON response
  • Outputs human-readable text via "print" requests to stdout
  • Requires parsing text output with regex (brittle and error-prone)
  • Text format may change between versions, breaking tooling

Proposed Solution

Add a new RPC method diffStructured (or add a parameter to existing diff method) that returns structured JSON output.

Proposed API

// RPC method name: "diffStructured"
interface DiffStructuredInput {
  from: MigrateDiffTarget
  to: MigrateDiffTarget
  shadowDatabaseUrl: string | null
  filters: SchemaFilter
}

interface DiffStructuredOutput {
  changes: SchemaChange[]
  exitCode: MigrateDiffExitCode
}

type SchemaChange = 
  | AddTableChange
  | DropTableChange
  | RenameTableChange
  | AddColumnChange
  | DropColumnChange
  | AlterColumnChange
  | AddIndexChange
  | DropIndexChange
  | AddForeignKeyChange
  | DropForeignKeyChange
  // ... other change types

interface AddTableChange {
  type: 'AddTable'
  table: string
  columns: ColumnDefinition[]
}

interface DropTableChange {
  type: 'DropTable'
  table: string
}

interface AddColumnChange {
  type: 'AddColumn'
  table: string
  column: string
  columnType: string
  nullable: boolean
  default?: string
}

interface DropColumnChange {
  type: 'DropColumn'
  table: string
  column: string
}

interface AlterColumnChange {
  type: 'AlterColumn'
  table: string
  column: string
  changes: ColumnPropertyChange[]
}

type ColumnPropertyChange = 
  | { property: 'type', from: string, to: string }
  | { property: 'nullable', from: boolean, to: boolean }
  | { property: 'default', from: string | null, to: string | null }

// ... similar structures for indexes, foreign keys, etc.

Example Output

{
  "changes": [
    {
      "type": "AddColumn",
      "table": "User",
      "column": "email",
      "columnType": "String",
      "nullable": false
    },
    {
      "type": "AddTable",
      "table": "Post",
      "columns": [
        {
          "name": "id",
          "type": "Int",
          "nullable": false,
          "autoIncrement": true
        },
        {
          "name": "title",
          "type": "String",
          "nullable": false
        }
      ]
    }
  ],
  "exitCode": 2
}

Benefits

  1. Robust tooling: No regex parsing needed, changes are strongly typed
  2. Version stability: JSON structure can be versioned and evolved
  3. Better UX: Enables features like auto-generated migration names
  4. Ecosystem growth: Third-party tools can build on structured diff data
  5. Backward compatible: Existing diff method continues to work

Use Cases

1. Auto-generate migration names

const diff = await engine.diffStructured({ from, to })
const name = generateMigrationName(diff.changes)
// e.g., "add_email_to_User" or "create_table_Post"

2. Diff visualization

const diff = await engine.diffStructured({ from, to })
renderDiffUI(diff.changes) // Show structured diff in UI

3. CI/CD validation

const diff = await engine.diffStructured({ from, to })
const hasBreakingChanges = diff.changes.some(c => 
  c.type === 'DropColumn' || c.type === 'DropTable'
)
if (hasBreakingChanges) {
  throw new Error('Breaking changes detected')
}

Implementation Considerations

  • The schema engine already computes these changes internally for generating SQL
  • This would expose existing internal data structures via RPC
  • Could reuse existing diff logic, just format output differently
  • Should support all database providers (PostgreSQL, MySQL, SQLite, etc.)
  • Consider adding this as a new RPC method to avoid breaking existing diff behavior

Related Issues

Alternatives Considered

  1. Parse human-readable output: Brittle, breaks when format changes
  2. Parse SQL output: Database-specific, complex to parse correctly
  3. Use evaluateDataLoss warnings: Only contains data loss warnings, not all changes

None of these alternatives provide a robust, maintainable solution.

Questions

  1. Would the team be open to this enhancement?
  2. Should this be a new RPC method or a parameter on existing diff?
  3. What level of detail should be included in the change structures?
  4. Any concerns about exposing internal diff structures?

I'm happy to contribute to the implementation if the team is interested, though I'd need guidance on the Rust codebase structure.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions