Update benchmarks and fuzz targets bucket periodically in GKE #944
Hi DonggeLiu,
Thanks a lot for your guidance on this! I’ve implemented the changes to automate syncing OSS-Fuzz targets and benchmarks to GCS, and opened a draft PR (#66) for your review. Please let me know if you’d like any changes or want to dive deeper into any part of the implementation.
Summary of Updates
🔹 Data Source Implementation
Added a new script: tools/gcs_sync.py
Fetches human-written fuzz targets and function signatures using target_collector.py and introspector.py
Excludes LLVMFuzzerTestOneInput
Stores timestamped JSON files in the oss-fuzz-gen-targets GCS bucket under project-specific directories
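As a rough illustration of the points above, here is a minimal sketch of the filtering and object-naming logic. It is not the actual contents of tools/gcs_sync.py: the helper names (`filter_signatures`, `object_name`, `build_payload`), the timestamp format, and the JSON layout are all assumptions made for the example, and the upload step is only described in a comment.

```python
import json
from datetime import datetime, timezone

# The libFuzzer entry point that the sync is meant to exclude.
ENTRY_POINT = "LLVMFuzzerTestOneInput"


def filter_signatures(signatures):
    """Return the signatures that do not mention the fuzzer entry point."""
    return [sig for sig in signatures if ENTRY_POINT not in sig]


def object_name(project, kind, now=None):
    """Build a timestamped object path under a project-specific directory.

    The "<project>/<kind>-<timestamp>.json" layout is hypothetical.
    """
    now = now or datetime.now(timezone.utc)
    stamp = now.strftime("%Y%m%dT%H%M%SZ")
    return f"{project}/{kind}-{stamp}.json"


def build_payload(project, targets, signatures):
    """Serialize one project's data as JSON (field names are assumptions)."""
    return json.dumps(
        {
            "project": project,
            "targets": targets,
            "signatures": filter_signatures(signatures),
        },
        indent=2,
    )


# Uploading would then be a call into the google-cloud-storage client, e.g.
# storage.Client().bucket("oss-fuzz-gen-targets").blob(name)
#     .upload_from_string(payload, content_type="application/json")
# which is left out here so the sketch runs without credentials.
```

Keeping the naming and filtering separate from the upload call makes both easy to unit test without touching a real bucket.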
🔹 Scheduling & Automation
Created k8s/gcs-sync-cron.yaml to define a Kubernetes CronJob
Default schedule: Runs daily at 3 AM UTC (adjustable via Cron syntax)
Uses a dedicated service account (oss-fuzz-sync-sa) whose only role is storage.objectAdmin, so it can manage objects in GCS and nothing else
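For reference, the shape of such a CronJob can be sketched as a Python dict and dumped to JSON, which Kubernetes accepts in place of YAML. This is not the content of k8s/gcs-sync-cron.yaml: the resource name, container name, image reference, command, and `concurrencyPolicy` value are assumptions for illustration. Only the schedule and the service account name come from the description above.

```python
import json


def cronjob_manifest(image,
                     schedule="0 3 * * *",
                     service_account="oss-fuzz-sync-sa"):
    """Return a CronJob manifest as a plain dict.

    "0 3 * * *" means minute 0 of hour 3, every day.
    """
    return {
        "apiVersion": "batch/v1",
        "kind": "CronJob",
        "metadata": {"name": "gcs-sync"},  # hypothetical resource name
        "spec": {
            "schedule": schedule,
            # Assumption: skip a run if the previous one is still going.
            "concurrencyPolicy": "Forbid",
            "jobTemplate": {
                "spec": {
                    "template": {
                        "spec": {
                            "serviceAccountName": service_account,
                            "restartPolicy": "OnFailure",
                            "containers": [
                                {
                                    "name": "gcs-sync",  # hypothetical
                                    "image": image,
                                    # Assumed entry point for the sync script.
                                    "command": ["python", "tools/gcs_sync.py"],
                                }
                            ],
                        }
                    }
                }
            },
        },
    }


if __name__ == "__main__":
    # The image reference below is a placeholder, not a real registry path.
    print(json.dumps(cronjob_manifest("example.invalid/gcs-sync:latest"), indent=2))
```

Changing the schedule is then a one-argument change, for example `cronjob_manifest(image, schedule="0 */6 * * *")` for a run every six hours.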
🔹 GKE Integration & Docker Image
Built a lightweight Docker image (Dockerfile.sync)
Includes only essential dependencies: Python, GCS SDK, and the sync script
Optimized image size (~300MB) to prevent unnecessary bloat
Automated build & deployment via GitHub Actions
🔹 Cluster Access & Workflow
Everything is managed via YAML files and GitHub Actions
No direct GKE access required, ensuring a more secure and auditable system
Let me know if you have any feedback! 🚀
Cheers,
Ekamjot