Description
The team has done an impressive job setting up TabArena.
To further improve usability, it would be very helpful to have clear documentation on how to integrate a new model into the TabArena pipeline and evaluate its performance within the current framework.
At present, much of the material is spread across tabarena_benchmarking_examples and tabrepo, which makes it hard to work out the correct workflow and the order in which to run the scripts for a consistent comparison with the other models in TabArena.
Would it be possible to consolidate these resources or provide step-by-step guidance for model integration and benchmarking?
This would make apples-to-apples comparisons between newly added models and existing ones much more straightforward.
Thank you.