The api and rows services cannot store datasets cache #1560

Open
severo opened this issue Jul 25, 2023 · 1 comment
Labels
bug (Something isn't working), infra, P2 (Nice to have)

Comments

@severo
Collaborator

severo commented Jul 25, 2023

The datasets cache directory is not set for the api and rows services (both depend on datasets), so it defaults to /.cache/huggingface/datasets. That directory is not accessible to the python user.

I'm not sure if it's an issue, but I think we should:

  • set the datasets cache environment variable for these services (note that every pod that depends on libcommon potentially has the same issue, but not all of them use datasets)
  • or, better (but more work), create a libs/libdatasets library that should only be used by /rows and the workers
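
To illustrate the first option, here is a minimal sketch, assuming the variable in question is `HF_DATASETS_CACHE` (the variable the datasets library reads for its cache location when it is imported). The helper names and the fallback path are hypothetical and not taken from this repository. In practice the variable would more likely be set in the pod environment, with a check like this run at service startup before `datasets` is imported.

```python
import os
import tempfile
from pathlib import Path


def is_writable_dir(path: Path) -> bool:
    """Return True if `path` exists (or can be created) and accepts new files."""
    try:
        path.mkdir(parents=True, exist_ok=True)
        # Creating a temporary file proves the current user can write here.
        with tempfile.TemporaryFile(dir=path):
            pass
        return True
    except OSError:
        return False


def ensure_datasets_cache(fallback: Path) -> Path:
    """Point HF_DATASETS_CACHE at a writable directory and return it.

    Hypothetical helper: keeps the configured (or default) location when it
    is usable, otherwise switches to `fallback`.
    """
    default = os.path.expanduser("~/.cache/huggingface/datasets")
    configured = Path(os.environ.get("HF_DATASETS_CACHE", default))
    chosen = configured if is_writable_dir(configured) else fallback
    if not is_writable_dir(chosen):
        raise RuntimeError(f"no writable datasets cache directory: {chosen}")
    # Must happen before `import datasets`, which reads the variable on import.
    os.environ["HF_DATASETS_CACHE"] = str(chosen)
    return chosen
```

A service entry point could call `ensure_datasets_cache(Path("/tmp/datasets-cache"))` (an assumed fallback location) and fail fast at startup rather than at the first dataset access.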
@severo added the bug (Something isn't working) and infra labels on Jul 25, 2023
@github-actions

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@github-actions (bot) closed this as completed on Sep 3, 2023
@severo added the P2 (Nice to have) label on Sep 4, 2023
@severo reopened this on Sep 4, 2023