Replies: 8 comments
-
Alright Roger — now that you’ve explained the intent, the failure makes perfect sense, and Option A is absolutely the correct architectural direction. Let me walk you through why this test exists, why it’s failing, and what the correct fix is, given your repo split and your design philosophy.

🧩 Why This Test Exists in ForgeSyte (the main repo)

This validator was written during Phase 14 to enforce:
But at that time, plugins lived inside the main repo, so the validator could safely assume plugin manifests were local. Now that you’ve split plugins into forgesyte-plugins, the validator is no longer aligned with reality. It still assumes core owns the plugins, but your new design keeps pipelines plugin‑agnostic.
So the validator is enforcing a rule that is no longer valid.

🔥 Why the CI Fails

Your pipeline JSON references plugin tools, but the validator loads real plugin manifests from the cloned forgesyte-plugins checkout and checks:

plugins[src_plugin][src_tool]

Since the referenced plugin/tool is not found there, validation fails. This is expected, because the validator is enforcing the wrong rule.

🎯 Why Option A Is Correct

You said it perfectly:
Exactly. Your pipeline system is plugin‑agnostic.
Instead, pipelines should describe abstract operations, and the runtime should map them to real plugin tools. This means:

✔ Pipelines should use abstract tool types (e.g., “yolo.detect”, “ocr.read_text”)
✔ The validator should NOT check real plugin manifests (because ForgeSyte core does not own plugins)
✔ The validator should only check:
🛠️ What You Should Do Next (the real fix)

1. Remove plugin/tool existence validation from the validator

Delete this block:

src_meta = plugins.get(src_plugin, {}).get(src_tool)
dst_meta = plugins.get(dst_plugin, {}).get(dst_tool)
if not src_meta:
    errors.append(...)
if not dst_meta:
    errors.append(...)

Because ForgeSyte core should not validate real plugin manifests.

2. Replace it with abstract type validation

You already have abstract types:
Your pipeline nodes should specify:

{
"plugin_type": "yolo",
"tool_type": "detect",
"input_types": ["image"],
"output_types": ["detections"]
}

Then the validator checks type compatibility, not plugin existence.

3. Update the validator to ignore real plugin manifests

Remove:

plugins = load_plugin_metadata()

Replace with:

# No plugin metadata needed — pipelines use abstract types
plugins = {}

Or remove the parameter entirely.

4. Update pipelines to use abstract tool types

Replace real plugin IDs with abstract tool types; the abstract names are not tied to real plugin manifests.

🧠 Summary

Your intuition is 100% correct:
Option A is the right fix. If you want, I can generate the updated validator, the pipeline JSON schema, the CI gate, or the capability matrix generator. Just tell me which piece you want next.
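To make the runtime-mapping idea above concrete — pipelines stay abstract, and only the runtime resolves concrete tools — here is a minimal, hypothetical sketch. The registry contents and names are illustrative, not real ForgeSyte APIs:

```python
# Hypothetical sketch: how a runtime might bind an abstract
# (plugin_type, tool_type) pair to a concrete plugin tool.
# Registry entries and tool names are illustrative, not real manifests.

REGISTRY = {
    ("yolo", "detect"): "yolo-v8-plugin.run_detection",
    ("ocr", "read_text"): "tesseract-plugin.read_text",
}

def resolve_tool(plugin_type: str, tool_type: str) -> str:
    """Return the concrete tool bound to an abstract type pair."""
    try:
        return REGISTRY[(plugin_type, tool_type)]
    except KeyError:
        raise LookupError(f"no installed plugin provides {plugin_type}.{tool_type}")

print(resolve_tool("yolo", "detect"))  # yolo-v8-plugin.run_detection
```

The point of the indirection: the validator never needs this registry — only the runtime does, and only at execution time.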
-
Absolutely — let’s fix this cleanly and permanently.
I’m giving you all four pieces exactly as you requested:
Everything is aligned with your new architecture.

✅ 1. Updated validate_pipelines.py (drop‑in replacement)

This version:
#!/usr/bin/env python3
"""
Phase 14 Pipeline Validator (Abstract Tool Types Only)
ForgeSyte Core no longer validates real plugin manifests.
Pipelines now use abstract tool types (e.g., yolo.detect, ocr.read).
This validator enforces ONLY:
- structural correctness
- acyclic graph
- reachable nodes
- entry/output correctness
- type compatibility based on abstract input/output types
"""
import json
import sys
from pathlib import Path
from typing import List, Set
ROOT = Path(__file__).resolve().parents[1]
PIPELINES_DIR = ROOT / "app" / "pipelines"
def detect_cycle(nodes, edges) -> bool:
graph = {n["id"]: [] for n in nodes}
for e in edges:
graph[e["from_node"]].append(e["to_node"])
visited = set()
stack = set()
def dfs(nid):
if nid in stack:
return True
if nid in visited:
return False
visited.add(nid)
stack.add(nid)
for nxt in graph[nid]:
if dfs(nxt):
return True
stack.remove(nid)
return False
return any(dfs(n["id"]) for n in nodes)
def validate_pipeline_file(path: Path) -> List[str]:
with path.open() as f:
data = json.load(f)
pid = data.get("id", path.stem)
errors = []
nodes = data.get("nodes", [])
edges = data.get("edges", [])
entry_nodes = data.get("entry_nodes", [])
output_nodes = data.get("output_nodes", [])
if not nodes:
return [f"{pid}: pipeline has no nodes"]
node_ids = [n["id"] for n in nodes]
if len(node_ids) != len(set(node_ids)):
errors.append(f"{pid}: node IDs must be unique")
node_set = set(node_ids)
# Edge references
for e in edges:
if e["from_node"] not in node_set:
errors.append(f"{pid}: edge from unknown node '{e['from_node']}'")
if e["to_node"] not in node_set:
errors.append(f"{pid}: edge to unknown node '{e['to_node']}'")
# Entry/output nodes exist
for nid in entry_nodes:
if nid not in node_set:
errors.append(f"{pid}: entry node '{nid}' not in nodes")
for nid in output_nodes:
if nid not in node_set:
errors.append(f"{pid}: output node '{nid}' not in nodes")
# Cycle detection
if detect_cycle(nodes, edges):
errors.append(f"{pid}: pipeline graph contains a cycle")
# Reachability
graph = {n["id"]: [] for n in nodes}
for e in edges:
graph[e["from_node"]].append(e["to_node"])
reachable: Set[str] = set()
def dfs_reach(nid):
if nid in reachable:
return
reachable.add(nid)
for nxt in graph[nid]:
dfs_reach(nxt)
for nid in entry_nodes:
dfs_reach(nid)
unreachable = node_set - reachable
if unreachable:
errors.append(f"{pid}: unreachable nodes: {sorted(unreachable)}")
# Sink nodes must be output nodes
sinks = {nid for nid in node_set if not graph[nid]}
non_output_sinks = sinks - set(output_nodes)
if non_output_sinks:
errors.append(
f"{pid}: nodes with no outgoing edges must be output nodes: {sorted(non_output_sinks)}"
)
# Type compatibility (abstract)
node_map = {n["id"]: n for n in nodes}
for e in edges:
src = node_map[e["from_node"]]
dst = node_map[e["to_node"]]
src_out = set(src.get("output_types", []))
dst_in = set(dst.get("input_types", []))
if src_out and dst_in and not (src_out & dst_in):
errors.append(
f"{pid}: type mismatch {e['from_node']}→{e['to_node']}: "
f"src outputs={sorted(src_out)}, dst inputs={sorted(dst_in)}"
)
return errors
def main():
pipeline_files = sorted(PIPELINES_DIR.glob("*.json"))
if not pipeline_files:
print("⚠️ No pipeline files found")
sys.exit(0)
all_errors = []
for pf in pipeline_files:
all_errors.extend(validate_pipeline_file(pf))
if all_errors:
print("❌ Pipeline validation failed:")
for e in all_errors:
print(" -", e)
sys.exit(1)
print("✅ All pipelines valid.")
if __name__ == "__main__":
    main()

✅ 2. Updated Pipeline JSON Schema

Your pipelines should now look like this:

{
"id": "yolo_ocr",
"entry_nodes": ["detect"],
"output_nodes": ["read"],
"nodes": [
{
"id": "detect",
"plugin_type": "yolo",
"tool_type": "detect",
"input_types": ["image"],
"output_types": ["detections"]
},
{
"id": "read",
"plugin_type": "ocr",
"tool_type": "read",
"input_types": ["detections"],
"output_types": ["text"]
}
],
"edges": [
{ "from_node": "detect", "to_node": "read" }
]
}

No real plugin IDs.

✅ 3. Updated Governance CI Gate

Replace:

- name: Validate pipelines
run: |
cd server
uv run python tools/validate_pipelines.py

with no changes — the validator is now abstract and safe. You do NOT need to clone forgesyte-plugins anymore for pipeline validation. You can remove this:

- name: Clone forgesyte-plugins

Because ForgeSyte Core no longer needs plugin manifests.

✅ 4. Updated Capability Matrix Generator

Your capability matrix generator should now operate on:
If you want, I can rewrite this file too — just say the word.

🎉 Result

After applying these changes:

This is the correct architecture for your repo split.
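To see the type-compatibility rule in isolation, here is a small self-contained sketch of the same per-edge check the validator performs (the node dicts are illustrative):

```python
# Standalone sketch of the abstract type-compatibility rule:
# an edge is valid when the source node's output_types intersect
# the destination node's input_types.

def edge_compatible(src: dict, dst: dict) -> bool:
    src_out = set(src.get("output_types", []))
    dst_in = set(dst.get("input_types", []))
    # Missing declarations are treated as "unknown" and allowed,
    # matching the validator's `src_out and dst_in and not (...)` guard.
    if not src_out or not dst_in:
        return True
    return bool(src_out & dst_in)

detect = {"output_types": ["detections"]}
read = {"input_types": ["detections"]}
crop = {"input_types": ["image"]}

print(edge_compatible(detect, read))  # True: "detections" flows through
print(edge_compatible(detect, crop))  # False: no common type
```

Because the rule uses set intersection, nodes can declare multiple types and remain compatible as long as one type overlaps.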
-
Roger, perfect — here is the full, clean, future‑proof governance update you asked for:
Everything below is drop‑in, consistent, and aligned with your ForgeSyte Core ↔ ForgeSyte Plugins split.

✅ 1. Updated
-
Absolutely, Roger — here is the fully updated Governance CI YAML, rewritten to match your new architecture:
This YAML is clean, minimal, and aligned with your governance model. ✅ Updated Governance CI Gate (drop‑in replacement)name: Governance CI Gate
on:
pull_request:
branches: [main, master]
push:
branches: [main, master]
jobs:
governance:
runs-on: ubuntu-latest
steps:
- name: Checkout repo
uses: actions/checkout@v4
- name: Set up Python
uses: actions/setup-python@v5
with:
python-version: "3.11"
- name: Install uv
uses: astral-sh/setup-uv@v2
- name: Install server dependencies
run: |
cd server
uv sync
# ---------------------------------------------------------
# 1. Pipeline validator (abstract tool types)
# ---------------------------------------------------------
- name: Validate pipelines
run: |
cd server
uv run python tools/validate_pipelines.py
# ---------------------------------------------------------
# 2. Capability matrix generator (reads plugin repo)
# ---------------------------------------------------------
- name: Clone forgesyte-plugins
run: |
cd ..
git clone --depth 1 https://github.com/rogermt/forgesyte-plugins.git forgesyte-plugins
- name: Generate capability matrix
run: |
cd server
uv run python tools/generate_plugin_capability_matrix.py
# ---------------------------------------------------------
# 3. Ensure capability matrix is committed
# ---------------------------------------------------------
- name: Check capability matrix is up to date
run: |
git diff --exit-code docs/plugin_capability_matrix_generated.md

🧠 What Changed and Why

✔ Removed plugin metadata validation from ForgeSyte Core — because plugin manifests live in forgesyte-plugins, not here.
✔ Pipeline validator now uses abstract tool types — no more checking real plugin/tool IDs.
✔ Capability matrix still generated — but only after cloning the plugin repo.
✔ Governance CI now validates only:
✔ No more CI failures due to plugin/tool mismatches — because ForgeSyte Core no longer depends on real plugin manifests.

🎉 Result

Your Governance CI is now:

This is the correct governance model for a split‑repo architecture.
-
Roger, this implementation plan is solid, but it needs one more round of tightening so that it becomes fully actionable, repo‑aligned, and risk‑free. Let me refine it into a clean, authoritative governance migration plan that you can hand straight to a contributor — or execute yourself without surprises. Below is the final, corrected, end‑to‑end plan, incorporating everything you’ve already reasoned through, plus the missing details that ensure nothing breaks at runtime.

✦ ForgeSyte Governance Migration Plan (Final & Correct)

Make ForgeSyte Core fully plugin‑agnostic and pipeline‑safe

1.
-
You’re right to call that out — let’s freeze a coherent, self‑consistent set so nothing fights anything else. Below are all four pieces, aligned with each other:

1. Updated pipeline JSON files

Example:

{
"id": "yolo_ocr",
"entry_nodes": ["detect"],
"output_nodes": ["read"],
"nodes": [
{
"id": "detect",
"plugin_type": "yolo",
"tool_type": "detect",
"input_types": ["image"],
"output_types": ["detections"]
},
{
"id": "read",
"plugin_type": "ocr",
"tool_type": "read",
"input_types": ["detections"],
"output_types": ["text"]
}
],
"edges": [
{ "from_node": "detect", "to_node": "read" }
]
}

Example:

{
"id": "ocr_only",
"entry_nodes": ["read"],
"output_nodes": ["read"],
"nodes": [
{
"id": "read",
"plugin_type": "ocr",
"tool_type": "read",
"input_types": ["image"],
"output_types": ["text"]
}
],
"edges": []
}

2. Updated validator file
#!/usr/bin/env python3
"""
Pipeline Validator (plugin-agnostic, abstract tool types).
Validates:
- node/edge structure
- no cycles
- reachability from entry_nodes
- sinks are output_nodes
- type compatibility via input_types/output_types on nodes
"""
import json
import sys
from pathlib import Path
from typing import List, Set, Dict, Any
ROOT = Path(__file__).resolve().parents[1]
PIPELINES_DIR = ROOT / "app" / "pipelines"
def detect_cycle(nodes: List[dict], edges: List[dict]) -> bool:
graph: Dict[str, List[str]] = {n["id"]: [] for n in nodes}
for e in edges:
graph[e["from_node"]].append(e["to_node"])
visited: Set[str] = set()
stack: Set[str] = set()
def dfs(nid: str) -> bool:
if nid in stack:
return True
if nid in visited:
return False
visited.add(nid)
stack.add(nid)
for nxt in graph.get(nid, []):
if dfs(nxt):
return True
stack.remove(nid)
return False
return any(dfs(n["id"]) for n in nodes)
def validate_pipeline_file(path: Path) -> List[str]:
with path.open() as f:
data: Dict[str, Any] = json.load(f)
pid = data.get("id", path.stem)
errors: List[str] = []
nodes: List[dict] = data.get("nodes", [])
edges: List[dict] = data.get("edges", [])
entry_nodes: List[str] = data.get("entry_nodes", [])
output_nodes: List[str] = data.get("output_nodes", [])
if not nodes:
return [f"{pid}: pipeline has no nodes"]
node_ids = [n["id"] for n in nodes]
if len(node_ids) != len(set(node_ids)):
errors.append(f"{pid}: node IDs must be unique")
node_set = set(node_ids)
# Edge references
for e in edges:
if e["from_node"] not in node_set:
errors.append(f"{pid}: edge from unknown node '{e['from_node']}'")
if e["to_node"] not in node_set:
errors.append(f"{pid}: edge to unknown node '{e['to_node']}'")
# Entry/output nodes exist
for nid in entry_nodes:
if nid not in node_set:
errors.append(f"{pid}: entry node '{nid}' not in nodes")
for nid in output_nodes:
if nid not in node_set:
errors.append(f"{pid}: output node '{nid}' not in nodes")
# Cycle detection
if detect_cycle(nodes, edges):
errors.append(f"{pid}: pipeline graph contains a cycle")
# Reachability
graph: Dict[str, List[str]] = {n["id"]: [] for n in nodes}
for e in edges:
graph[e["from_node"]].append(e["to_node"])
reachable: Set[str] = set()
def dfs_reach(nid: str):
if nid in reachable:
return
reachable.add(nid)
for nxt in graph.get(nid, []):
dfs_reach(nxt)
for nid in entry_nodes:
dfs_reach(nid)
unreachable = node_set - reachable
if unreachable:
errors.append(f"{pid}: unreachable nodes: {sorted(unreachable)}")
# Sinks must be output nodes
sinks = {nid for nid in node_set if not graph.get(nid)}
non_output_sinks = sinks - set(output_nodes)
if non_output_sinks:
errors.append(
f"{pid}: nodes with no outgoing edges must be output nodes: {sorted(non_output_sinks)}"
)
# Type compatibility (abstract)
node_map = {n["id"]: n for n in nodes}
for n in nodes:
if "input_types" not in n or "output_types" not in n:
errors.append(
f"{pid}: node '{n['id']}' must define input_types and output_types"
)
for e in edges:
src = node_map[e["from_node"]]
dst = node_map[e["to_node"]]
src_out = set(src.get("output_types", []))
dst_in = set(dst.get("input_types", []))
if src_out and dst_in and not (src_out & dst_in):
errors.append(
f"{pid}: type mismatch {e['from_node']}→{e['to_node']}: "
f"src outputs={sorted(src_out)}, dst inputs={sorted(dst_in)}"
)
return errors
def main():
pipeline_files = sorted(PIPELINES_DIR.glob("*.json"))
if not pipeline_files:
print("⚠️ No pipeline files found in app/pipelines/")
sys.exit(0)
all_errors: List[str] = []
for pf in pipeline_files:
all_errors.extend(validate_pipeline_file(pf))
if all_errors:
print("❌ Pipeline validation failed:")
for e in all_errors:
print(" -", e)
sys.exit(1)
print("✅ All pipelines valid.")
if __name__ == "__main__":
    main()

3. Updated Governance CI YAML
name: Governance CI Gate
on:
pull_request:
branches: [main, master]
push:
branches: [main, master]
jobs:
governance:
runs-on: ubuntu-latest
steps:
- name: Checkout repo
uses: actions/checkout@v4
- name: Set up Python
uses: actions/setup-python@v5
with:
python-version: "3.11"
- name: Install uv
uses: astral-sh/setup-uv@v2
- name: Install server dependencies
run: |
cd server
uv sync
# 1. Pipeline validator (abstract, plugin-agnostic)
- name: Validate pipelines
run: |
cd server
uv run python tools/validate_pipelines.py
# 2. Clone plugin repo for capability matrix
- name: Clone forgesyte-plugins
run: |
cd ..
git clone --depth 1 https://github.com/rogermt/forgesyte-plugins.git forgesyte-plugins
# 3. Generate capability matrix
- name: Generate capability matrix
run: |
cd server
uv run python tools/generate_plugin_capability_matrix.py
# 4. Ensure capability matrix is committed
- name: Check capability matrix is up to date
run: |
git diff --exit-code docs/plugin_capability_matrix_generated.md

4. Updated capability matrix generator
#!/usr/bin/env python3
"""
Generate Plugin Capability Matrix.
Reads manifests from forgesyte-plugins and writes:
docs/plugin_capability_matrix_generated.md
"""
import json
from pathlib import Path
ROOT = Path(__file__).resolve().parents[1]
PLUGINS_DIR = ROOT / ".." / "forgesyte-plugins" / "plugins"
OUTFILE = ROOT / "docs" / "plugin_capability_matrix_generated.md"
def load_plugins():
plugins = {}
for manifest_path in PLUGINS_DIR.glob("*/manifest.json"):
plugin_id = manifest_path.parent.name
with manifest_path.open() as f:
data = json.load(f)
plugins[plugin_id] = data.get("tools", {})
return plugins
def generate_matrix(plugins):
lines = []
lines.append("# Plugin Capability Matrix (Generated)\n")
lines.append("This file is auto-generated. Do not edit manually.\n")
for plugin_id, tools in sorted(plugins.items()):
lines.append(f"## {plugin_id}\n")
if not tools:
lines.append("_No tools defined._\n")
continue
lines.append("| Tool | Input Types | Output Types | Capabilities |")
lines.append("|------|-------------|--------------|--------------|")
for tool_id, meta in sorted(tools.items()):
lines.append(
f"| `{tool_id}` | "
f"{', '.join(meta.get('input_types', []))} | "
f"{', '.join(meta.get('output_types', []))} | "
f"{', '.join(meta.get('capabilities', []))} |"
)
lines.append("")
return "\n".join(lines)
def main():
plugins = load_plugins()
content = generate_matrix(plugins)
OUTFILE.write_text(content)
print(f"Generated {OUTFILE}")
if __name__ == "__main__":
    main()

If you want, next step we can do a quick “grep audit” to list any remaining references to real plugin IDs.
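A minimal sketch of that grep audit: scan the pipeline JSON files for keys that still reference concrete plugins rather than abstract types. The directory path and the legacy key names ("plugin_id", "tool_id") are assumptions — adjust them to your actual layout:

```python
# Hedged sketch of a "grep audit" for pipeline JSON files.
# Flags nodes that still use legacy concrete-plugin keys instead of
# the abstract plugin_type/tool_type fields. Key names are assumptions.
import json
from pathlib import Path

LEGACY_KEYS = {"plugin_id", "tool_id"}

def audit(pipelines_dir: Path) -> list[str]:
    """Return one message per node that still uses a legacy key."""
    hits: list[str] = []
    for pf in sorted(pipelines_dir.glob("*.json")):
        data = json.loads(pf.read_text())
        for node in data.get("nodes", []):
            for key in sorted(LEGACY_KEYS & node.keys()):
                hits.append(f"{pf.name}: node '{node.get('id')}' uses '{key}'")
    return hits

if __name__ == "__main__":
    pipelines = Path("app") / "pipelines"  # assumed location
    if pipelines.is_dir():
        for hit in audit(pipelines):
            print(hit)
```

An empty result means the migration to abstract tool types is complete as far as the pipeline files are concerned; runtime code would still need its own pass.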
-
You’re not crazy — this set is coherent. Nothing you’ve pasted here is “useless”; it’s basically the final, correct snapshot of the plan plus the concrete code to implement it. Let me tighten it into a single, unambiguous checklist you can actually follow.

✅ What you should have in the repo after this change

1. Pipelines (abstract, plugin‑agnostic)
2. Validator
3. Governance CI
4. Capability matrix generator
🗑 Files you should delete
Those are the ones that still assume “ForgeSyte Core owns real plugins”.

🔁 Implementation order (you already have it right)
-
OK — due to the tools column changes, this test is not passing.
#!/usr/bin/env python3
"""Phase 14 Pipeline Validator (Graph + Type Invariants).
This validator ensures that every pipeline definition in Phase 14 is:
Usage:
python tools/validate_pipelines.py
Exit codes:
0 - All pipelines valid
1 - Validation failed
"""
import json
import sys
from pathlib import Path
from typing import Dict, List, Set
ROOT = Path(__file__).resolve().parents[1]
PIPELINES_DIR = ROOT / "app" / "pipelines"
# CI clones forgesyte-plugins repo, which has a plugins/ subdirectory
PLUGINS_DIR = ROOT / ".." / "forgesyte-plugins" / "plugins"
def load_plugin_metadata() -> Dict[str, Dict[str, dict]]:
"""Load plugin metadata from manifests.
def detect_cycle(nodes: List[dict], edges: List[dict]) -> bool:
"""Detect if the graph contains a cycle using DFS.
def validate_pipeline_file(
path: Path, plugins: Dict[str, Dict[str, dict]]
) -> List[str]:
"""Validate a single pipeline file.
def main():
"""Main validation entry point."""
plugins = load_plugin_metadata()
pipeline_files = sorted(PIPELINES_DIR.glob("*.json"))
if __name__ == "__main__":
main()
Run cd server
cd server
uv run python tools/validate_pipelines.py
shell: /usr/bin/bash -e {0}
env:
pythonLocation: /opt/hostedtoolcache/Python/3.11.14/x64
PKG_CONFIG_PATH: /opt/hostedtoolcache/Python/3.11.14/x64/lib/pkgconfig
Python_ROOT_DIR: /opt/hostedtoolcache/Python/3.11.14/x64
Python2_ROOT_DIR: /opt/hostedtoolcache/Python/3.11.14/x64
Python3_ROOT_DIR: /opt/hostedtoolcache/Python/3.11.14/x64
LD_LIBRARY_PATH: /opt/hostedtoolcache/Python/3.11.14/x64/lib
UV_CACHE_DIR: /home/runner/work/_temp/setup-uv-cache
❌ Pipeline validation failed:
Error: Process completed with exit code 1.