@VDFaller
Contributor

@VDFaller VDFaller commented Oct 29, 2025

Summary

Added a tool for parsing the manifest to get the lineage.

What Changed

Just parses then reads the manifest.
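As a rough illustration of lineage from a manifest (not the code in this PR), the sketch below walks the `parent_map` and `child_map` keys that dbt writes into `manifest.json`. The helper names `load_manifest` and `get_lineage`, and the `direction`/`recursive` parameters, are hypothetical and chosen only for this example.

```python
import json
from pathlib import Path


def load_manifest(path):
    """Read a dbt manifest.json from disk and return it as a dict."""
    return json.loads(Path(path).read_text())


def get_lineage(manifest, unique_id, direction="parents", recursive=False):
    """Return unique_ids upstream ("parents") or downstream ("children").

    Hypothetical helper: it relies only on the `parent_map` and
    `child_map` keys of the manifest, which map a unique_id to a list
    of directly connected unique_ids.
    """
    edges = manifest["parent_map" if direction == "parents" else "child_map"]
    found = []
    seen = set()
    queue = list(edges.get(unique_id, []))
    while queue:
        node = queue.pop(0)
        if node in seen:
            continue  # a node can be reachable through several paths
        seen.add(node)
        found.append(node)
        if recursive:
            # Keep walking in the same direction from this node.
            queue.extend(edges.get(node, []))
    return found
```

With `recursive=False` only the direct neighbours come back; with `recursive=True` the walk continues breadth-first until the graph is exhausted. Since `child_map` also lists tests attached to a model, asking for children returns them alongside downstream models.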

Why

So that it's usable by non-cloud customers.

Checklist

  • I have performed a self-review of my code
  • I have made corresponding changes to the documentation (in https://github.com/dbt-labs/docs.getdbt.com) if required -- Mention it here
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

Additional Notes

There are some differences in the outputs. I can try to line them up if needed.

Prompt Call & Response

  • "Get me the lineage of this model"
    • Bad - doesn't use mcp
  • "use the dbt mcp server to tell me the full lineage of this model"
    • Good - calls both directions, recursively
  • "use the dbt mcp server to tell me the lineage of this model"
    • Bad - ran jq on the manifest
    • 2nd run - Fine - calls both directions recursively
  • "use the dbt mcp server to get me the children of this model, including tests"
    • Good - Gets non-recursive children (used the name not the unique_id) and had the tests included.

@VDFaller VDFaller requested review from a team, b-per and jasnonaz as code owners October 29, 2025 18:06
@VDFaller VDFaller marked this pull request as draft October 29, 2025 20:39
@VDFaller VDFaller force-pushed the model-lineage-cli branch 2 times, most recently from a48b7a2 to 76ac932 Compare October 30, 2025 17:27
@b-per
Collaborator

b-per commented Oct 31, 2025

Thanks Vince!

I am wondering whether we should create new tools instead of making get_model_parents/get_model_children able to query either the Metadata API or the local artefacts.

Here are my Pros/Cons of having new tools for get_model_parents/children dedicated to the CLI/manifest

  • Pros
    • it would be easy to activate those alongside the rest of the CLI tools and deactivate the dbt platform ones if needed (just by activating the CLI toolset)
    • people could query both the get_model_parents from the metadata API and the new local tool in a single LLM session/context to compare the changes that they are introducing
    • it feels simpler to understand what tool does what and easier to know which ones someone might want to activate/deactivate
  • Cons
    • we already have many tools and it would add more (but realistically most people shouldn't activate all tools and tweak those to their use case)
    • we'd need to find good names and descriptions to explain to the LLM the difference between the children from metadata and the children from manifest if we want to avoid confusing the LLM

So, I am in the camp of adding this functionality but in new tools, and I'd be keen to hear other people's opinions about it.


As a side note, would you be able to set up signed commits for this repo? We can bypass this check at the PR level as repo admins, but this repo expects all commits to be signed now.

@VDFaller
Contributor Author

VDFaller commented Nov 5, 2025

@b-per Crap, that's how I originally had it (shouldn't have squashed 😢 )

I didn't like it because when I was trying it out, exactly as you pointed out, it seemed to arbitrarily pick which tool it used. So I could run very similar queries and it would give two different results. We could give better names/descriptions so the tool wouldn't get confused, but I don't think the user would necessarily know if they were getting the answer they expected.

  • me
    • Get me the model parents for jaffle_shop.orders.
  • MCP
    • Okay, we've got some options there: do you want production parents or local parents, recursive or not?
  • me
    • What?

I think it would just run the tool and the user would see it asking to run get_model_lineage(...) and go "that seems right", without knowing the nuance.

If it were to be two separate tools, do you think it should be a CLI tool or a discovery tool? My entire thought process was "Discovery is NOT just platform", especially after listening to Jason's talk at Coalesce where he talked about them abstractly. This very much relates to #418 in my head.

Rebased with gpgSign on, no idea why it was set to false for this repo.

Also on this

we already have many tools and it would add more (but realistically most people shouldn't activate all tools and tweak those to their use case)

  • Do y'all have data to show that's the case? I'd expect people to just give it everything.

@DevonFulcher
Collaborator

Hey @VDFaller and @b-per, I'm sorry for the back-and-forth on this. I told Vince that I appreciated sticking with the existing get_model_parents/get_model_children tools and routing between the local or remote version depending on the user's config. Let's get aligned on this. I think the Pros you listed are valid, Benoit, but they may not be the features worth optimizing heavily for.

I like the router approach because the agent typically doesn't care whether the information is coming from a local or remote source; the user cares more about that. Also, with the latest config changes, it is quite easy to point the agent to local or remote. If DBT_HOST is present, use GQL; otherwise, use the local version. Turning on/off more tool options depending on local or remote usage is more flexible, but it is also more complex, and I don't think most users want to use both local and remote at the same time.
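The routing rule described above could be sketched roughly as follows. This is only an illustration under assumptions: `remote_fetch` and `local_fetch` are hypothetical callables standing in for the GQL-backed and manifest-backed implementations, and an empty `DBT_HOST` is treated as unset.

```python
import os


def get_model_parents(model_name, remote_fetch, local_fetch, env=None):
    """Route one tool call to the remote or local backend.

    Hypothetical sketch: `remote_fetch` and `local_fetch` are
    placeholders for the real implementations. If DBT_HOST is set
    (and non-empty), the remote backend is used; otherwise the
    local one is.
    """
    env = os.environ if env is None else env
    fetch = remote_fetch if env.get("DBT_HOST") else local_fetch
    return fetch(model_name)
```

The agent sees a single `get_model_parents` tool either way; only the user's configuration decides where the answer comes from.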

Furthermore, this router approach can be applied to the Semantic Layer tools in the future. It is a source of frustration for some users that these tools don't work locally.

Add a fallback path for Discovery tools to use CLI functionality
Add ModelLineage type with main constructor `from_manifest`

The CLI path will not work until auto-disable is functioning correctly.
@VDFaller VDFaller marked this pull request as ready for review November 7, 2025 16:35
@VDFaller VDFaller changed the title from "CLI fallback for get_model_parents/get_model_children discovery tools" to "Add get_model_lineage_dev CLI tool" Nov 7, 2025
@VDFaller VDFaller requested a review from a team as a code owner December 3, 2025 20:40
3 participants