Songs ending #47

ValfarDeveloper · 2025-02-03T18:08:16Z

Hi! thank you to all your team for this.

I've been playing with the model, and I just noted that the songs ending are not generated correctly and they are cutted off, ignoring the last words of the last segment, I just tried to use a special prompt for the final segment, but unfortunatelly it doesn't work:

Instead of this:

segment_prompt = eos_token + sos_token + self.tokenizer.tokenize(section_text) + [self.tokenizer.soa] + self.codectool_stage1.sep_token_ids()

I've tried this:

segment_prompt = eos_token + sos_token + self.tokenizer.tokenize(section_text) + [self.tokenizer.soa] + self.codectool_stage1.sep_token_ids() + [self.tokenizer.eov] + [self.tokenizer.eoi]

but it fails with this error:

Traceback (most recent call last):
  File "/app/inference/infer.py", line 823, in <module>
    main()
  File "/app/inference/infer.py", line 819, in main
    pipeline.run()
  File "/app/inference/infer.py", line 806, in run
    self.run_stage1()
  File "/app/inference/infer.py", line 505, in run_stage1
    vocal_seg = self.codectool_stage1.ids2npy(rearrange(seg_tokens, "(n b) -> b n", b=2)[0])
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/inference/codecmanipulator.py", line 326, in ids2npy
    assert codebook_0_range[0] <= token_ids[0] < codebook_0_range[1], \
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError: token_ids[0]=32006 no está en el rango (45334, 46358)

Also I've tried to introduce manually the EOA, but already we have an assertion that makes it fail:

segment_prompt = eos_token + sos_token + self.tokenizer.tokenize(section_text) + [self.tokenizer.soa] + self.codectool_stage1.sep_token_ids() + [self.tokenizer.eoa]

ids = raw_output[0].cpu().numpy()
soa_idx = np.where(ids == mmtokenizer.soa)[0].tolist()
eoa_idx = np.where(ids == mmtokenizer.eoa)[0].tolist()
if len(soa_idx)!=len(eoa_idx):
    raise ValueError(f'invalid pairs of soa and eoa, Num of soa: {len(soa_idx)}, Num of eoa: {len(eoa_idx)}')

I've been thinking that could be an issue of the max_context limit that we have? not sure I'm just learning about it.

# Use window slicing in case output sequence exceeds the context of model
    max_context = 16384-max_new_tokens-1
    if input_ids.shape[-1] > max_context:
        print(f'Section {i}: output length {input_ids.shape[-1]} exceeding context length {max_context}, now using the last {max_context} tokens.')
        input_ids = input_ids[:, -(max_context):]

maybe it is being cutted abruptly causing that specific issue? also I noted in the demos that you posted in the website that you had a similar issue. I could help to create a PR with the fix, but only I need more context to approach the fix better.

The text was updated successfully, but these errors were encountered:

a43992899 · 2025-02-03T18:41:25Z

Maybe this issue is related: #46

We are looking into this~

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Songs ending #47

Songs ending #47

ValfarDeveloper commented Feb 3, 2025 •

edited

Loading

a43992899 commented Feb 3, 2025

Songs ending #47

Songs ending #47

Comments

ValfarDeveloper commented Feb 3, 2025 • edited Loading

a43992899 commented Feb 3, 2025

ValfarDeveloper commented Feb 3, 2025 •

edited

Loading