Hi, I think your diffusion models are amazing, especially the turbo versions, which don't seem to need the audio codes to generate.
But the Qwen omni LLM is poor at instruction following, perhaps because it's multifunctional.
I was thinking of fine-tuning a different text-only LLM for enhancing/embellishing prompts, lyric generation,
and perhaps metadata generation.
I couldn't find the training set you used. Can you direct me to it, please?
Also, I couldn't find the training pipeline for the LLM acestep base model in this repo (not LoRA training). Could you point me to that as well?