add function to combine all tasks metadata into tasks.json #1323

Open
wants to merge 2 commits into
base: main

Contributor

@zeke zeke commented Apr 1, 2025

This PR updates the inference-codegen script to build a single JSON file containing the metadata for every task.

The goal is to make this data more portable, so it can be easily consumed in different contexts outside of the huggingface.js codebase, e.g. for Replicate model classification.

To test:

$ pnpm run inference-codegen

Here's an example of the current output: https://gist.github.com/zeke/460a58b7aa50e305072415844a335209

To Do

  • Update the inference-codegen script to output all task data as tasks.json.
  • Dereference the JSON schema so everything is included.
  • Make sure the summary is included, e.g. "Keypoint detection is the task of identifying meaningful distinctive points or features in an image."
  • Git-ignore the generated tasks.json file.
  • ???
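As a rough illustration of the first to-do item, here is a minimal sketch of folding per-task schema files into one object shaped like the proposed tasks.json. It is not the code in this PR: `TaskSpecFile`, `CombinedTasks`, and `combineTaskSpecs` are hypothetical names, and the sketch assumes the spec files have already been read and parsed elsewhere.

```typescript
// Hypothetical shapes for illustration; the real script may differ.
type SpecKind = "input" | "output" | "output_stream";
type JsonSchema = Record<string, unknown>;

interface TaskSpecFile {
  task: string; // e.g. "image-to-image"
  kind: SpecKind;
  schema: JsonSchema; // already parsed JSON
}

type CombinedTasks = Record<string, Partial<Record<SpecKind, JsonSchema>>>;

// Group every parsed spec file under its task name.
function combineTaskSpecs(files: TaskSpecFile[]): CombinedTasks {
  const combined: CombinedTasks = {};
  for (const { task, kind, schema } of files) {
    const entry = combined[task] ?? {};
    entry[kind] = schema;
    combined[task] = entry;
  }
  return combined;
}
```

Serializing the result with `JSON.stringify(combined, null, 2)` would give the file contents; sorting the task names first would keep the output stable between runs.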

Contributor Author

zeke commented Apr 1, 2025

@SBrandeis @julien-c @Wauplin 👋🏼

I finally got around to bumping this forward a bit. My goal is to use this to classify all the models on Replicate.

I could use a little help with dereferencing the JSON $refs. I'm not familiar with quicktype, and my initial attempts failed.
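For context on what dereferencing involves, here is a hand-rolled sketch that inlines `$ref` pointers. It does not use quicktype and is not the approach taken in this PR. It assumes every `$ref` points inside the same document (a `#/...` JSON Pointer); refs to other files would have to be loaded and merged first. `Json`, `resolveLocalRef`, and `dereference` are hypothetical names.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Resolve a same-document reference such as "#/$defs/Params".
function resolveLocalRef(root: Json, ref: string): Json {
  if (!ref.startsWith("#")) {
    throw new Error(`Only same-document refs are handled in this sketch: ${ref}`);
  }
  const pointer = ref.slice(1);
  if (pointer === "") return root;
  let node: Json = root;
  for (const rawToken of pointer.slice(1).split("/")) {
    // Undo JSON Pointer escaping: "~1" is "/" and "~0" is "~".
    const token = rawToken.replace(/~1/g, "/").replace(/~0/g, "~");
    if (node === null || typeof node !== "object") {
      throw new Error(`Cannot resolve ${ref}`);
    }
    const next: Json | undefined = Array.isArray(node) ? node[Number(token)] : node[token];
    if (next === undefined) throw new Error(`Cannot resolve ${ref}`);
    node = next;
  }
  return node;
}

// Return a copy of `node` with every $ref replaced by the schema it points to.
function dereference(root: Json, node: Json = root, seen: string[] = []): Json {
  if (Array.isArray(node)) return node.map((item) => dereference(root, item, seen));
  if (node === null || typeof node !== "object") return node;

  const copy: { [key: string]: Json } = {};
  for (const [key, value] of Object.entries(node)) {
    if (key !== "$ref") copy[key] = dereference(root, value, seen);
  }

  const ref = node["$ref"];
  if (typeof ref !== "string") return copy;
  if (seen.includes(ref)) throw new Error(`Circular $ref: ${ref}`);

  const target = dereference(root, resolveLocalRef(root, ref), [...seen, ref]);
  if (target === null || typeof target !== "object" || Array.isArray(target)) return target;
  // Keys written next to the $ref (e.g. a description) override the target's keys.
  return { ...target, ...copy };
}
```

The sketch throws on circular references rather than trying to expand them, since a fully inlined recursive schema cannot be written out as finite JSON.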

Contributor

Wauplin commented Apr 1, 2025

Hey @zeke, thanks for looking into this! Just to be sure, the goal of this PR is to generate a single tasks.json with all definitions from all input/output/output_stream from all tasks defined in https://github.com/huggingface/huggingface.js/tree/main/packages/tasks/src/tasks. Am I correct? No extra information added to it? (just want to be sure about the goal of it)

Contributor Author

zeke commented Apr 1, 2025

> Hey @zeke, thanks for looking into this! Just to be sure, the goal of this PR is to generate a single tasks.json with all definitions from all input/output/output_stream from all tasks defined in https://github.com/huggingface/huggingface.js/tree/main/packages/tasks/src/tasks. Am I correct? No extra information added to it? (just want to be sure about the goal of it)

Yep! That's the goal!

Contributor Author

zeke commented Apr 1, 2025

Poking around the codebase, I see other code paths where more task types are mentioned. Here's one:

"image-to-video": ["diffusers"],

For example, image-to-video is not present in the code I'm generating here, but it is present in that file ☝🏼

Let me know if there's a better way to structure this so that it uses the most complete and up-to-date set of Tasks.
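One way to spot gaps like this is to diff the task names in the generated output against the keys of the other map. A minimal sketch, where `findMissingTasks` and both sample objects are hypothetical stand-ins rather than identifiers from the codebase:

```typescript
// List task names that appear in `reference` but not in `generated`.
function findMissingTasks(
  generated: Record<string, unknown>,
  reference: Record<string, unknown>,
): string[] {
  const present = new Set(Object.keys(generated));
  return Object.keys(reference)
    .filter((task) => !present.has(task))
    .sort();
}

// Illustrative data only: a map like the one quoted above, and a generated file
// that lacks image-to-video.
const referenceTasks = {
  "image-to-image": ["diffusers"],
  "image-to-video": ["diffusers"],
};
const generatedTasks = { "image-to-image": {} };

console.log(findMissingTasks(generatedTasks, referenceTasks));
```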

Contributor Author

zeke commented Apr 1, 2025

Ideally I also want the generated file to include these prose summary strings for each Task:

"Image-to-image is the task of transforming an input image through a variety of possible manipulations and enhancements, such as super-resolution, image inpainting, colorization, and more.",
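A minimal sketch of how those summaries could be merged into the generated entries, assuming they are available as a plain map from task name to string. `attachSummaries` and the `summary` key are hypothetical; tasks without a summary are left unchanged.

```typescript
type TaskEntry = Record<string, unknown> & { summary?: string };

// Return a copy of `tasks` where each entry gains its summary string, if one exists.
function attachSummaries(
  tasks: Record<string, Record<string, unknown>>,
  summaries: Record<string, string>,
): Record<string, TaskEntry> {
  const result: Record<string, TaskEntry> = {};
  for (const [task, entry] of Object.entries(tasks)) {
    const summary = summaries[task];
    result[task] = summary === undefined ? { ...entry } : { ...entry, summary };
  }
  return result;
}
```

Because the function copies each entry instead of mutating it, the schema-only data stays usable on its own.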

2 participants