Skip to content

Conversation

blazethunderstorm
Copy link

@blazethunderstorm blazethunderstorm commented Jul 15, 2025

Fixes #893

Updated the generator.py to correctly detect instruction encodings when using the new type/subtype schema. Previously, it looked only for the old encoding field, so some instructions were skipped. Now it checks subtype fields and extracts encoding info from there.

image image

@blazethunderstorm blazethunderstorm changed the title fix(gen): support subtype-based encodings in generator fix(gen): support subtype-based encodings in C header generator Jul 15, 2025
Copy link
Collaborator

@ThinkOpenly ThinkOpenly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are WAY too many unrelated changes to easily review this. Could you create a single commit with just the necessary changes to address the issue?

@blazethunderstorm
Copy link
Author

@ThinkOpenly pls see now

@ThinkOpenly
Copy link
Collaborator

I don't think these changes have any impact. Are you seeing some problems being resolved?

I would expect that instead of finding the needed "match" string directly, you'd need to compute it by going into the "format" attribute and its children "opcodes" and "variables".

Note that these changes would likely appear in backends/generators/generator.py in load_instructions(). Indeed when you run ./do gen:c_header, you get error messages where these issues occur:

ERROR:: Missing 'encoding' field in instruction add.uw in /workspace/riscv-unified-db/gen/resolved_spec/_/inst/Zba/add.uw.yaml
ERROR:: Missing 'encoding' field in instruction rolw in /workspace/riscv-unified-db/gen/resolved_spec/_/inst/B/rolw.yaml
ERROR:: Missing 'encoding' field in instruction rol in /workspace/riscv-unified-db/gen/resolved_spec/_/inst/B/rol.yaml
ERROR:: Missing 'encoding' field in instruction xnor in /workspace/riscv-unified-db/gen/resolved_spec/_/inst/B/xnor.yaml
ERROR:: Missing 'encoding' field in instruction clmul in /workspace/riscv-unified-db/gen/resolved_spec/_/inst/B/clmul.yaml
ERROR:: Missing 'encoding' field in instruction orn in /workspace/riscv-unified-db/gen/resolved_spec/_/inst/B/orn.yaml
ERROR:: Missing 'encoding' field in instruction clmulh in /workspace/riscv-unified-db/gen/resolved_spec/_/inst/B/clmulh.yaml
ERROR:: Missing 'encoding' field in instruction andn in /workspace/riscv-unified-db/gen/resolved_spec/_/inst/B/andn.yaml
ERROR:: Missing 'encoding' field in instruction rorw in /workspace/riscv-unified-db/gen/resolved_spec/_/inst/B/rorw.yaml
ERROR:: Missing 'encoding' field in instruction ror in /workspace/riscv-unified-db/gen/resolved_spec/_/inst/B/ror.yaml

Part of your mission is to make those error messages go away. The current code just extracts the value of the "match" attribute, but you'll need to compute it.

@blazethunderstorm
Copy link
Author

@ThinkOpenly would do that

@dhower-qc
Copy link
Collaborator

Thanks for the contribution! Like @ThinkOpenly said, you'll need to build up what used to be explicit in 'match'. Basically, start with an instruction-length string of '-'s and then replace any position that is occupied by an opcode value.

So if you have:

# andn.yaml
format:
  funct7:
    display_name: ANDN
    location: 31-25
    value: 0b0100000
  funct3:
    display_name: ANDN
    location: 14-12
    value: 0b111
  opcode:
    display_name: OP
    location: 6-0
    value: 0b0110011

You want to wind up with the string:

0100000----------111-----0110011

@blazethunderstorm
Copy link
Author

@dhower-qc thanks for help would make the changes as req

Copy link
Collaborator

@ThinkOpenly ThinkOpenly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks again for your efforts. Good code. Comments/questions inline.

encoding_filtered += 1
continue

continue
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this have any effect?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@blazethunderstorm can you see this review? It also seems randomly placed to me

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The continue statement is still there, but looking again it does have a purpose: the code above handles the "format" case, and the code below handles the "encoding" case. These are mutually exclusive, so OK.

Nit: could you remove the extra blank line above the continue statement?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

still there :-)

@AFOliveira
Copy link
Collaborator

@blazethunderstorm I believe @ThinkOpenly comments were not addressed.

Copy link
Collaborator

@AFOliveira AFOliveira left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@blazethunderstorm not to this one, I thin k

encoding_filtered += 1
continue

continue
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@blazethunderstorm can you see this review? It also seems randomly placed to me

@AFOliveira
Copy link
Collaborator

@blazethunderstorm You still have not addressed @ThinkOpenly comments, can you please take a look at them?

@blazethunderstorm
Copy link
Author

@ThinkOpenly @AFOliveira pls review

Copy link
Collaborator

@AFOliveira AFOliveira left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please finish that small loose end. I plan to re-test Golang generation and C Header this week to see if we are finaly close enough, and getting this PR in before I try it, would be great!

@AFOliveira
Copy link
Collaborator

@AFOliveira sry to ask but what loose end like I have to fix

@ThinkOpenly comment.

Copy link

codecov bot commented Aug 7, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 46.05%. Comparing base (292b34f) to head (c76ff94).
⚠️ Report is 46 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #904   +/-   ##
=======================================
  Coverage   46.05%   46.05%           
=======================================
  Files          11       11           
  Lines        4942     4942           
  Branches     1345     1345           
=======================================
  Hits         2276     2276           
  Misses       2666     2666           
Flag Coverage Δ
idlc 46.05% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@ThinkOpenly
Copy link
Collaborator

CI is failing, complaining about the extra blank line and three other formatting issues.

Copy link
Collaborator

@AFOliveira AFOliveira left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, please fix the commit history!

encoding_filtered += 1
continue

match_bits = ["-" for _ in range(32)]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this work for compressed (extension "C" and friends) instructions? For example:

match: 000-----------01

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed pls review

)
val_bin = bin(val_int)[2:].zfill(width)

match_bits[31 - hi : 32 - lo + 1] = list(val_bin)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also needs to accommodate compressed instructions (16 bit encoding)

@blazethunderstorm

This comment was marked as off-topic.

Copy link
Collaborator

@ThinkOpenly ThinkOpenly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think the recent changes work in practice.

To determine the actual length of the instruction encoding you probably need to:

  1. For instructions which have been converted to "format" style
    1. Access the instruction type's "length" attribute:
      1. Get the instruction's subtype.
      2. Get the subtype's type.
      3. Get the type's length.
    2. Or, compute the instruction's length by walking through all of the "opcodes" and "variables" and accounting for all of the bits.
  2. For instructions which have not been converted to "format" and still use "encoding", get the length of the "match" field.


match_bits = ["-" for _ in range(32)]
bit_length = 32
if "length" in data:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this ever true?

pass
elif "C" in enabled_extensions:
if any(
loc.get("location", "").startswith("0..15")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this ever true?

val_bin = bin(val_int)[2:].zfill(width)

match_bits[31 - hi : 32 - lo + 1] = list(val_bin)
inst_width = 16 if encoding.get("compressed", False) else 32
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is encoding.get("compressed"... ever true?

@ThinkOpenly
Copy link
Collaborator

Looks good to me, please fix the commit history!

What do you seek, @AFOliveira , given that the merge queue will squash everything?

@AFOliveira
Copy link
Collaborator

Looks good to me, please fix the commit history!

What do you seek, @AFOliveira , given that the merge queue will squash everything?

Nevermind, I'm just not used to the merge queue squash.

@ThinkOpenly
Copy link
Collaborator

Superceded by #1051

@blazethunderstorm blazethunderstorm deleted the fix-gen-subtype-encoding-893 branch September 17, 2025 19:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Generated C header missing instructions with new subtype schema
4 participants