Hey,
great job with this repo. Caformer with 100M parameters is really powerful, but I am struggling to fine-tune it due to hardware limitations. Have you already experimented with parameter-efficient approaches such as adapter fine-tuning or LoRA? At first glance, it looks like adding them would require rewriting a lot of the code.