feat: Add GitHub integration with agent_prompts and github_components #1637

Open
wants to merge 28 commits into
base: main

julian-risch
Member

@julian-risch julian-risch commented Apr 10, 2025

Related Issues

Proposed Changes:

  • Move github_components from experimental to a new integration
  • Move agent_prompts from experimental to a new integration

The idea is to enable users to run the example notebook (or a version with updated imports) after installing this new integration.

How did you test it?

New unit tests and I ran all usage examples successfully with a test repo.

I haven't tested it with the notebook yet, which we would need to update first (tracked by deepset-ai/haystack-cookbook#183).

Notes for the reviewer

  • I suggest we rename github_token parameter to api_key for consistency with many other integrations.
  • Some components have github_token: Optional[Secret] = None, because they can work without any token while others use Secret.from_env_var("GITHUB_TOKEN"). I suggest we use Secret.from_env_var("GITHUB_TOKEN", strict=False) where we currently have None as the default.
  • The internal implementation of the components differs in how they use _get_headers or _get_request_headers or define headers inline. We could refactor that.
  • While we could find a way to set up integration tests, I would rather leave them out of this PR.
  • GithubRepositoryViewer has a branch parameter in the run method, which could also be named ref to make it clearer that it can also be a tag or commit hash.
  • Similar to the empty lines and line breaks in prompts: do we need to add whitespace at the beginning of new lines? Trailing whitespace is removed by the linter.
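
To make the token and ref suggestions above concrete, here is a minimal, stdlib-only sketch. It is not the code from this PR: `resolve_token`, `build_headers`, and `contents_url` are hypothetical helper names, and `resolve_token` only imitates what a non-strict, env-var-backed secret would do (return nothing instead of raising when the variable is unset). The `ref` query parameter of the GitHub contents endpoint accepts a branch, tag, or commit SHA, which is the motivation for the rename.

```python
import os
from typing import Dict, Optional
from urllib.parse import quote, urlencode


def resolve_token(env_var: str = "GITHUB_TOKEN") -> Optional[str]:
    """Hypothetical stand-in for a non-strict secret: None when the variable is unset or empty."""
    return os.environ.get(env_var) or None


def build_headers(token: Optional[str]) -> Dict[str, str]:
    """Build request headers, adding authorization only when a token is available."""
    headers = {"Accept": "application/vnd.github+json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return headers


def contents_url(repo: str, path: str = "", ref: Optional[str] = None) -> str:
    """Build a contents API URL; ref may be a branch name, a tag, or a commit SHA."""
    url = f"https://api.github.com/repos/{repo}/contents"
    if path:
        url += "/" + quote(path.strip("/"))
    if ref:
        url += "?" + urlencode({"ref": ref})
    return url


if __name__ == "__main__":
    # Works with or without GITHUB_TOKEN set in the environment.
    print(build_headers(resolve_token()))
    print(contents_url("deepset-ai/haystack", "README.md", ref="main"))
```

With a shared helper like `build_headers`, the components that work unauthenticated and those that require a token could use the same code path, which would also address the inconsistency between `_get_headers`, `_get_request_headers`, and inline headers.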

Checklist

@github-actions github-actions bot added the type:documentation Improvements or additions to documentation label Apr 10, 2025
@julian-risch julian-risch marked this pull request as ready for review April 25, 2025 10:28
@julian-risch julian-risch requested a review from a team as a code owner April 25, 2025 10:28
@julian-risch julian-risch requested review from sjrl and removed request for a team April 25, 2025 10:28
@@ -0,0 +1,21 @@
# github-haystack
Contributor

I think it would be good to add usage examples here in the README, or to link to the tutorial/Google Colab that shows how to use it.

Also another relevant detail I think is that these prompts were optimized using Anthropic models. Could be a useful thing for users to know.

Member Author

I imagine a Google Colab in the cookbook and some more examples on an integration page. We currently don't fill out the READMEs; for example, see: https://pypi.org/project/opensearch-haystack/
It might be a good idea to change that and use a copy of the integrations page. I don't see a good reason to keep it empty, but I would prefer a consistent solution. I'll talk to Bilge.

@sjrl
Contributor

sjrl commented Apr 29, 2025

@julian-risch maybe a general comment on the structure here. I see that the prompts aren't being used within the library and I understand they will be used in a future tutorial/colab.

I wonder then if it would be helpful to instead pre-assemble the tools within the repo so users could easily import the tools and immediately pass them to an Agent. What do you think?

Comment on lines +57 to +62
def _get_request_headers(self) -> dict:
    """
    Get headers with resolved token for the request.

    :return: Dictionary of headers including authorization if token is present
    """
Contributor

I like this pattern for an incidental reason: by not resolving the Secret in the init method, we let users defer providing the secret until run time. That is especially nice when running pipeline validation, as in the deepset platform, where secrets are not available to the pipeline in the pipeline builder.

Could be nice to do this in the other components, basically to move the resolve_value() outside of the init method.
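
A minimal sketch of that deferred-resolution pattern, using only the standard library so it runs without the integration installed. `EnvSecret` is a hypothetical stand-in that imitates an env-var-backed secret with a `resolve_value()` method, and `IssueFetcher` is a made-up component name; neither is the code under review.

```python
import os
from typing import Dict, Optional


class EnvSecret:
    """Hypothetical stand-in for an env-var-backed secret."""

    def __init__(self, env_var: str, strict: bool = True) -> None:
        self.env_var = env_var
        self.strict = strict

    def resolve_value(self) -> Optional[str]:
        # The environment is read only when the value is requested.
        value = os.environ.get(self.env_var)
        if value is None and self.strict:
            raise ValueError(f"Environment variable {self.env_var} is not set")
        return value


class IssueFetcher:
    """Made-up component that keeps its secret unresolved until a request is built."""

    def __init__(self, token: Optional[EnvSecret] = None) -> None:
        # No resolve_value() here, so construction succeeds without the secret.
        self.token = token

    def _get_request_headers(self) -> Dict[str, str]:
        headers = {"Accept": "application/vnd.github+json"}
        if self.token is not None:
            value = self.token.resolve_value()  # resolved at request time
            if value:
                headers["Authorization"] = f"Bearer {value}"
        return headers


if __name__ == "__main__":
    # Constructing the component never touches the environment.
    fetcher = IssueFetcher(token=EnvSecret("GITHUB_TOKEN", strict=False))
    print(fetcher._get_request_headers())
```

Because `__init__` only stores the secret object, a pipeline containing such a component can be built and validated in an environment where `GITHUB_TOKEN` is absent; a missing strict secret surfaces only when a request is actually made.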

@sjrl
Contributor

sjrl commented Apr 29, 2025

@julian-risch overall this looks really good! I mostly have minor comments and only one larger conceptual one about maybe providing users with Tools directly instead of requiring them to compose the Tools themselves.

I didn't comb through every line since there is a lot, but it's well tested so it's good to go from my perspective! We can always make quick updates to this if things arise and depending on usage.

Labels
topic:CI type:documentation Improvements or additions to documentation
2 participants