
Commit 7dea43b

feat: add schema-only dump, ERD generation, env command, integration tests, and Python SDK
Schema-only dump (dump.schema_path):
- Add DumpOptions{SchemaOnly} to Engine interface; engines pass --schema-only
  (Postgres) or --no-data (MySQL) when set
- Scheduler runs a second DDL-only dump to schema_path after the full dump,
  skipping obfuscation bake (no data to scrub)
- Documented in ditto.yaml.example

ditto erd command:
- New internal/erd package: information_schema introspection for Postgres and
  MySQL, Mermaid erDiagram renderer, DBML renderer with unit tests
- ditto erd [--format mermaid|dbml] [--output file] [--source]
- Default: creates a copy, introspects, destroys; --source connects directly

ditto env command:
- ditto env export: creates a copy, prints eval-able export lines for
  DATABASE_URL and DITTO_COPY_ID
- ditto env destroy <id>: destroys a copy by ID
- ditto env -- <cmd>: thin alias for copy run (reuses runCopyExec)

Integration tests (//go:build integration):
- engine/postgres/integration_test.go: full dump/restore cycle + schema-only
- engine/mysql/integration_test.go: same for MySQL
- dockerutil.RunContainerOnNetwork added to support named network attachment
- CI workflow gains a dedicated integration job on ubuntu-latest

Python SDK (sdk/python/):
- ditto.Client: pure stdlib, create/destroy/list/with_copy context manager
- pytest fixture plugin (ditto_client, ditto_copy) auto-registered via pytest11
- pip install ditto-sdk[pytest]

Remove hooks/:
- pre-job.sh and post-job.sh are superseded by actions/create + actions/delete
  composite actions and the new ditto env export workflow
- References removed from .goreleaser.yaml, README.md, CONTRIBUTING.md
1 parent 3b4947f commit 7dea43b

29 files changed

+1809
-120
lines changed

.github/workflows/ci.yml

Lines changed: 11 additions & 0 deletions
@@ -51,3 +51,14 @@ jobs:
           go-version: "1.26"
           cache: true
       - run: go build ./cmd/ditto
+
+  integration:
+    name: Integration
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v6
+      - uses: actions/setup-go@v6
+        with:
+          go-version: "1.26"
+          cache: true
+      - run: go test -tags=integration -race -count=1 -timeout=15m ./engine/...

.gitignore

Lines changed: 8 additions & 0 deletions
@@ -25,6 +25,14 @@ vendor/
 *.swp
 *.swo
 
+# Python
+__pycache__/
+*.pyc
+*.pyo
+*.egg-info/
+dist/
+.venv/
+
 # OS
 .DS_Store
 Thumbs.db

.goreleaser.yaml

Lines changed: 0 additions & 5 deletions
@@ -37,7 +37,6 @@ archives:
       - README.md
       - LICENSE
       - ditto.yaml.example
-      - hooks/
 
 brews:
   - repository:
@@ -73,10 +72,6 @@ nfpms:
       - src: ./ditto.yaml.example
         dst: /etc/ditto/ditto.yaml.example
         type: config|noreplace
-      - src: ./hooks/pre-job.sh
-        dst: /usr/share/ditto/hooks/pre-job.sh
-      - src: ./hooks/post-job.sh
-        dst: /usr/share/ditto/hooks/post-job.sh
 
 changelog:
   sort: asc

CONTRIBUTING.md

Lines changed: 1 addition & 1 deletion
@@ -55,7 +55,7 @@ internal/
 cmd/
   ditto/main.go   CLI entry point; blank engine imports
   *.go            cobra command implementations
-hooks/            GHA pre/post job shell scripts
+actions/          GitHub Actions composite actions (create, delete)
 ```
 
 ## Adding a new database engine

README.md

Lines changed: 0 additions & 16 deletions
@@ -703,22 +703,6 @@ Or a standalone cron job for just the dump:
 
 ### Runner setup (GitHub Actions self-hosted)
 
-Install the hooks on the runner host:
-
-```bash
-cp hooks/pre-job.sh /home/runner/hooks/pre-job.sh
-cp hooks/post-job.sh /home/runner/hooks/post-job.sh
-chmod +x /home/runner/hooks/*.sh
-```
-
-Add to the runner's systemd service unit:
-
-```ini
-[Service]
-Environment=ACTIONS_RUNNER_HOOK_JOB_STARTED=/home/runner/hooks/pre-job.sh
-Environment=ACTIONS_RUNNER_HOOK_JOB_COMPLETED=/home/runner/hooks/post-job.sh
-```
-
 The runner user must be able to reach the configured runtime socket. For
 Docker Engine on Linux:
 

cmd/env.go

Lines changed: 156 additions & 0 deletions
@@ -0,0 +1,156 @@
+package cmd
+
+import (
+	"context"
+	"fmt"
+	"os"
+	"time"
+
+	copypkg "github.com/attaradev/ditto/internal/copy"
+	"github.com/spf13/cobra"
+)
+
+func newEnvCmd() *cobra.Command {
+	var serverURL string
+
+	cmd := &cobra.Command{
+		Use:   "env",
+		Short: "Inject DATABASE_URL into your shell or a subprocess",
+		Long: `Manage DATABASE_URL injection for shell sessions and subprocesses.
+
+Subcommands:
+  ditto env -- <command>    Run a command with DATABASE_URL set (same as copy run)
+  ditto env export          Create a copy and print eval-able export lines
+  ditto env destroy <id>    Destroy a copy created by export
+
+Shell session workflow:
+  eval $(ditto env export)          # creates a copy; sets DATABASE_URL + DITTO_COPY_ID
+  psql $DATABASE_URL                # use the copy from any tool
+  ditto env destroy $DITTO_COPY_ID  # clean up when done`,
+	}
+
+	cmd.PersistentFlags().StringVar(&serverURL, "server", "",
+		"Remote ditto server URL (e.g. http://ditto.internal:8080)")
+
+	// Propagate --server into context so copyClientFromContext picks it up.
+	cmd.PersistentPreRunE = func(cmd *cobra.Command, args []string) error {
+		if serverURL != "" {
+			ctx := context.WithValue(cmd.Context(), keyServerURL, serverURL)
+			cmd.SetContext(ctx)
+		}
+		return nil
+	}
+
+	cmd.AddCommand(
+		newEnvRunCmd(),
+		newEnvExportCmd(),
+		newEnvDestroyCmd(),
+	)
+
+	return cmd
+}
+
+// newEnvRunCmd: ditto env -- <command> [args…]
+// Thin wrapper around runCopyExec — identical lifecycle, signal handling, and
+// DATABASE_URL injection as `ditto copy run`.
+func newEnvRunCmd() *cobra.Command {
+	var (
+		ttl       string
+		label     string
+		dumpURI   string
+		obfuscate bool
+	)
+
+	cmd := &cobra.Command{
+		Use:                "-- <command> [args...]",
+		Short:              "Run a command with DATABASE_URL injected",
+		DisableFlagParsing: false,
+		Args:               cobra.MinimumNArgs(1),
+		RunE: func(cmd *cobra.Command, args []string) error {
+			return runCopyExec(cmd, ttl, label, dumpURI, obfuscate, args)
+		},
+	}
+
+	cmd.Flags().StringVar(&ttl, "ttl", "", "Copy lifetime (e.g. 1h, 30m)")
+	cmd.Flags().StringVar(&label, "label", "", "Run identifier tag")
+	cmd.Flags().StringVar(&dumpURI, "dump", "", "Dump source: local path, s3://..., or https://...")
+	cmd.Flags().BoolVar(&obfuscate, "obfuscate", false, "Apply obfuscation rules after restore")
+
+	return cmd
+}
+
+// newEnvExportCmd: ditto env export
+// Creates a copy and prints eval-able shell export lines. Intended use:
+//
+//	eval $(ditto env export)
+func newEnvExportCmd() *cobra.Command {
+	var (
+		ttl   string
+		label string
+	)
+
+	cmd := &cobra.Command{
+		Use:   "export",
+		Short: "Create a copy and print eval-able export lines",
+		Long: `Create an ephemeral database copy and print shell-eval-able export lines.
+
+Usage:
+  eval $(ditto env export)
+
+After eval, DATABASE_URL and DITTO_COPY_ID are set in the current shell.
+Destroy the copy when you are done:
+  ditto env destroy $DITTO_COPY_ID`,
+		RunE: func(cmd *cobra.Command, args []string) error {
+			return runEnvExport(cmd, ttl, label)
+		},
+	}
+
+	cmd.Flags().StringVar(&ttl, "ttl", "", "Copy lifetime (e.g. 1h, 30m)")
+	cmd.Flags().StringVar(&label, "label", "", "Run identifier tag")
+
+	return cmd
+}
+
+func runEnvExport(cmd *cobra.Command, ttl, label string) error {
+	client := copyClientFromContext(cmd)
+
+	runID := label
+	if runID == "" {
+		runID = detectRunID()
+	}
+	opts := copypkg.CreateOptions{
+		RunID:   runID,
+		JobName: detectJobName(),
+	}
+	if ttl != "" {
+		d, err := time.ParseDuration(ttl)
+		if err != nil {
+			return fmt.Errorf("invalid --ttl %q: %w", ttl, err)
+		}
+		opts.TTLSeconds = int(d.Seconds())
+	}
+
+	c, err := client.Create(cmd.Context(), opts)
+	if err != nil {
+		return fmt.Errorf("env export: create copy: %w", err)
+	}
+
+	// Print eval-able lines to stdout.
+	if _, err := fmt.Fprintf(os.Stdout, "export DATABASE_URL=%q\nexport DITTO_COPY_ID=%q\n",
+		c.ConnectionString, c.ID); err != nil {
+		return fmt.Errorf("env export: write output: %w", err)
+	}
+	return nil
+}
+
+// newEnvDestroyCmd: ditto env destroy <id>
+func newEnvDestroyCmd() *cobra.Command {
+	return &cobra.Command{
+		Use:   "destroy <id>",
+		Short: "Destroy a database copy by ID",
+		Args:  cobra.ExactArgs(1),
+		RunE: func(cmd *cobra.Command, args []string) error {
+			return copyClientFromContext(cmd).Destroy(cmd.Context(), args[0])
+		},
+	}
+}
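
Review note on `runEnvExport`: the export lines are formatted with `%q`, which applies Go string-literal quoting. Inside the resulting double quotes a POSIX shell still expands `$` and backticks, so a connection string containing those characters could be altered by `eval`. The sketch below shows one possible alternative using single quotes. `shellQuote` and `exportLine` are hypothetical helpers, not part of this commit.

```go
package main

import (
	"fmt"
	"strings"
)

// shellQuote wraps s in single quotes for a POSIX shell. A single quote inside
// s is written as '\'' (close the quote, emit an escaped quote, reopen).
func shellQuote(s string) string {
	return "'" + strings.ReplaceAll(s, "'", `'\''`) + "'"
}

// exportLine builds one eval-able line. The name is assumed to be a valid
// shell identifier chosen by the program, so only the value is quoted.
func exportLine(name, value string) string {
	return fmt.Sprintf("export %s=%s", name, shellQuote(value))
}

func main() {
	fmt.Println(exportLine("DATABASE_URL", "postgres://ditto:pa$$w'rd@localhost:5433/app"))
	fmt.Println(exportLine("DITTO_COPY_ID", "copy-123"))
}
```

On the calling side, quoting the command substitution, as in `eval "$(ditto env export)"`, avoids word splitting before `eval` sees the text.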

cmd/erd.go

Lines changed: 146 additions & 0 deletions
@@ -0,0 +1,146 @@
+package cmd
+
+import (
+	"database/sql"
+	"fmt"
+	"os"
+
+	"github.com/attaradev/ditto/engine"
+	copypkg "github.com/attaradev/ditto/internal/copy"
+	"github.com/attaradev/ditto/internal/erd"
+	"github.com/attaradev/ditto/internal/secret"
+	"github.com/spf13/cobra"
+)
+
+func newErdCmd() *cobra.Command {
+	var (
+		format    string
+		output    string
+		useSource bool
+		serverURL string
+	)
+
+	cmd := &cobra.Command{
+		Use:   "erd",
+		Short: "Generate an Entity-Relationship Diagram from the database schema",
+		Long: `Generate an ERD by introspecting the live database schema.
+
+By default ditto creates a temporary copy, introspects its schema, then
+destroys it — so your source database is never accessed at query time.
+Use --source to connect directly to the configured source database instead.
+
+Supported output formats:
+  mermaid   Mermaid erDiagram syntax (default) — paste into any Mermaid renderer,
+            GitHub markdown fences, Notion, or VS Code with the Mermaid extension.
+  dbml      DBML syntax — compatible with dbdiagram.io.
+
+Examples:
+  ditto erd                              # Mermaid ERD to stdout via a copy
+  ditto erd --format=dbml                # DBML to stdout
+  ditto erd --output=schema.md           # Write Mermaid to file
+  ditto erd --source                     # Connect to source DB directly (no copy)
+  ditto erd --server http://ditto:8080   # Use remote ditto server for the copy`,
+		RunE: func(cmd *cobra.Command, args []string) error {
+			return runERD(cmd, format, output, useSource)
+		},
+	}
+
+	cmd.Flags().StringVar(&format, "format", "mermaid", "Output format: mermaid, dbml")
+	cmd.Flags().StringVar(&output, "output", "", "Output file path (default: stdout)")
+	cmd.Flags().BoolVar(&useSource, "source", false, "Connect directly to source DB instead of creating a copy")
+	cmd.Flags().StringVar(&serverURL, "server", "", "Remote ditto server URL (e.g. http://ditto.internal:8080)")
+
+	return cmd
+}
+
+func runERD(cmd *cobra.Command, format, output string, useSource bool) error {
+	cfg := configFromContext(cmd)
+
+	var (
+		dsn     string
+		copyID  string
+		cleanup func()
+	)
+
+	if useSource {
+		var sc secret.Cache
+		pwd, err := sc.Resolve(cmd.Context(), cfg.Source.PasswordSecret, cfg.Source.Password)
+		if err != nil {
+			return fmt.Errorf("erd: resolve source password: %w", err)
+		}
+		dsn = buildERDSourceDSN(cfg.Source.Engine, cfg.Source.Host, cfg.Source.Port, cfg.Source.Database, cfg.Source.User, pwd)
+		cleanup = func() {}
+	} else {
+		client := copyClientFromContext(cmd)
+		c, err := client.Create(cmd.Context(), copypkg.CreateOptions{RunID: "erd"})
+		if err != nil {
+			return fmt.Errorf("erd: create copy: %w", err)
+		}
+		dsn = c.ConnectionString
+		copyID = c.ID
+		cleanup = func() {
+			if err := client.Destroy(cmd.Context(), copyID); err != nil {
+				// Non-fatal: copy will expire via TTL.
+				_ = err
+			}
+		}
+	}
+	defer cleanup()
+
+	driver := erdDriverName(cfg.Source.Engine)
+	db, err := sql.Open(driver, dsn)
+	if err != nil {
+		return fmt.Errorf("erd: open db: %w", err)
+	}
+	defer func() { _ = db.Close() }()
+
+	eng, err := engine.Get(cfg.Source.Engine)
+	if err != nil {
+		return fmt.Errorf("erd: %w", err)
+	}
+
+	schema, err := erd.Introspect(cmd.Context(), db, eng.Name(), cfg.Source.Database)
+	if err != nil {
+		return fmt.Errorf("erd: introspect: %w", err)
+	}
+
+	w := os.Stdout
+	if output != "" {
+		f, err := os.Create(output)
+		if err != nil {
+			return fmt.Errorf("erd: create output file: %w", err)
+		}
+		defer func() { _ = f.Close() }()
+		w = f
+	}
+
+	switch format {
+	case "mermaid":
+		return erd.RenderMermaid(schema, w)
+	case "dbml":
+		return erd.RenderDBML(schema, w)
+	default:
+		return fmt.Errorf("erd: unknown format %q — use mermaid or dbml", format)
+	}
+}
+
+// buildERDSourceDSN builds a DSN for direct connection to the source database.
+// Unlike copy container DSNs, this uses sslmode=require for Postgres (the
+// source is a real server, not a local container).
+func buildERDSourceDSN(eng, host string, port int, database, user, password string) string {
+	switch eng {
+	case "mysql":
+		return fmt.Sprintf("%s:%s@tcp(%s:%d)/%s", user, password, host, port, database)
+	default: // postgres
+		return fmt.Sprintf("postgres://%s:%s@%s:%d/%s?sslmode=require", user, password, host, port, database)
+	}
+}
+
+// erdDriverName returns the database/sql driver name for the given engine.
+// Both drivers are registered via blank imports in cmd/ditto/main.go.
+func erdDriverName(eng string) string {
+	if eng == "mysql" {
+		return "mysql"
+	}
+	return "pgx"
+}

cmd/root.go

Lines changed: 2 additions & 0 deletions
@@ -114,6 +114,8 @@ Use it when shared staging databases make test runs flaky, schema fidelity matte
 		newStatusCmd(),
 		newDaemonCmd(),
 		newServeCmd(),
+		newErdCmd(),
+		newEnvCmd(),
 	)
 	return root
 }
