-
Notifications
You must be signed in to change notification settings - Fork 2
Format improvements (WIP) #63
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
|
||
| #[derive(Debug, Copy, Clone, PartialEq, Eq, Hash)] | ||
| pub struct IntFormat { | ||
| pub unsigned: bool, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might prefer signed rather than unsigned to avoid double negatives
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I went with unsigned since it's more of an exception to the norm than anything. I tend to structure my bools with true as the "do something different" value so that they don't need to be negated whenever being read, but I guess that doesn't matter as much with match since you have to be explicit in both cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
true as "do something different" makes sense in C-likes where it's easy to zero-initialize things. It doesn't make so much sense in rust.
|
|
||
| | ArgEncoding::Padding | ||
| | ArgEncoding::Padding { .. } | ||
| => Ok(ast::Expr::from(raw.expect_int())), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if this is unreachable now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's definitely not unreachable since the padding args are only dropped after being raised thanks to some sort of logic with @arg0. I don't know how important the positioning of that is since you'd left a comment about it being important to reference the original args list, so I chose to leave it there for now. If it's possible to implement the "verify the bytes are 0 or fallback to blob" check at at line 541 before raising all the arguments, then it would be unreachable for sure.
| let time = f.read_i32()?; | ||
| let opcode = f.read_u16()?; | ||
| let size = f.read_u16()? as usize; | ||
| let size = f.read_i16()? as usize; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it seems to me this is just begging for an Out of Memory error without providing any advantages
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The field is signed though, read with MOVSX. If a size is greater than 0x7FFF, it'll index backwards (and probably explode)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't matter whether it's signed in game because it gets cast immediately to an unsigned type. So unless you plan to add range checking (which you certainly can do), you're literally just trading one incorrect behavior for another...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Range checking would definitely be ideal, though IDK how you'd want the error to be formatted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems similar to the check in anm
| (Th08, 128, Some(("S(imm)f(imm)f(imm)f(imm)f(imm)", None))), // Argument 1 is unread | ||
| (Th08, 129, Some(("b(imm)---", None))), | ||
| (Th08, 130, Some((r#"s(imm;enum="EclSub")--"#, None))), | ||
| (Th08, 130, Some((r#"s(imm;extend;enum="EclSub")--"#, None))), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"extend" is too short of a name for something that is probably only ever going to be used once. Probably want something with multiple words, meaning we might need to pick now whether we want snake_case or kebab-case
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There're quite a few places where the game engines will read a full dword from the signature in order to check for variables but then only read the lower bits of the resulting value. Ideally extend could be replaced with a more general purpose attribute applicable to those situations as well since that would eliminate the need for extended padding anyway. Maybe this could be written as E(imm;range="s") or something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was also thinking this should probably be a dword-sized value with an attribute that limits its value size. However, I really do not want to see abi letters get used inside string literals.
src/llir/abi.rs
Outdated
| /// stored in the arg0 header field of the instruction in EoSD and PCB ECL. (which is mapped to the | ||
| /// `@arg0` pseudo-argument in raw instruction syntax) | ||
| Integer { size: u8, ty_color: Option<TypeColor>, arg0: bool, immediate: bool, format: ast::IntFormat }, | ||
| Integer { size: u8, ty_color: Option<TypeColor>, arg0: bool, immediate: bool, extend: bool, format: ast::IntFormat }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is getting out of hand, let's break this up over lines.
| /// | ||
| /// The first argument may have `arg0` if it is two bytes large. This indicates that the argument is | ||
| /// stored in the arg0 header field of the instruction in EoSD and PCB ECL. (which is mapped to the | ||
| /// `@arg0` pseudo-argument in raw instruction syntax) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Document extend.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What sort of formatting should be used for that? The only currently documented attributes are mask and furibug, which don't even use the same format.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh bugger, I forgot to document enum. That's the only thing I see missing from here.
It should be in prose in the documentation of Integer, just like the documentation for arg0 that I put this review comment under. (basically, "put it here!")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
well, imm is also missing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And hex isn't included either. There're enough attributes that there should probably be a list with the names at the front.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we probably want to make a markdown list
///
/// Supported attributes:
///
/// * **`imm`**: lorem ipsum dolor sit amet
/// i have a barnicle in my backseat
/// * **`arg0`**: blah blah blah blah
etc
| defs::Signature { | ||
| return_ty: sp!(value::ExprType::Void), | ||
| params: abi.encodings.iter().enumerate().flat_map(|(index, enc)| { | ||
| params: abi.encodings.iter().filter(|enc| !matches!(*enc, ArgEncoding::Padding { .. })).enumerate().flat_map(|(index, enc)| { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line is too noisy. Funnily, this is a flat_map and yet it doesn't even look like anything is currently taking advantage of that; but this is exactly the sort of use case for that, so let's make use of it:
| params: abi.encodings.iter().filter(|enc| !matches!(*enc, ArgEncoding::Padding { .. })).enumerate().flat_map(|(index, enc)| { | |
| params: abi.encodings.iter().enumerate().flat_map(|(index, enc)| { |
And change the Padding match arm to:
| ArgEncoding::Padding { .. }
=> return None, // hide from signatureThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was concerned something like that would end up inserting a None into the resulting collection. Didn't know it would skip over it. Seems like there could be issues in the future though if index is ever used for anything though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops I missed the fact that this would change indices, lemme read this over again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was concerned something like that would end up inserting a None into the resulting collection.
flat_map flattens the iterators being returned by the closures. Option is an iterator of zero or one items, so None produces no items. Technically, this should be using .filter_map instead (you can just change the .flat_map to .filter_map and it should still compile); this is just an alias for Option-based flat_maps that is clearer in intent.
Oops I missed the fact that this would change indices, lemme read this over again.
So the index is only used to generate arg names. I would pull that out.
// in a line of code outside the loop
let arg_names = (1..).map(|i| ident!("arg_{i}"));
// inside the loop, change this line:
let name = sp!(abi_span => ctx.resolutions.attach_fresh_res(arg_names.next().unwrap()));There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and of course get rid of the enumerate()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make more sense to just add another function to the InstrAbi implementation similar to arg_encodings that returns the filtered version? This seems like the sort of thing that would end up continually popping up throughout the code base otherwise. (I would've done that in the first place, but rust complained about the filter output not being compatible with VeclikeIterator or something and I don't know how to make it happy.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmmmm. Crap, that's a pain.
Writing this function is probably a good idea. It will have to return the slightly weaker DoubleEndedIterator instead of VeclikeIterator. (basically because filter can't be ExactSizeIterator, which is the other half of VeclikeIterator)
| match size { | ||
| 1 => args_blob.write_u8(0).expect("Cursor<Vec> failed?!"), | ||
| 1 => { | ||
| if (args_blob.position() & 3) == 0 { small_padding_value = 0i8 } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please avoid one-line ifs. I reserve these for extreme cases (usually only when there are multiple similar ifs in a row and I want to emphasize the similarities between them)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll freely admit the entire logic for handling that was a 5 minute quick fix that bolts a random functionality where it doesn't quite belong. I'd love to replace the logic with something better that doesn't feel as dirty.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As the complexity of this function increases (which it is no doubt going to, without end), we're going to want to start factoring out pieces of it into separable units.
This thing right here is easily separated out into a Finite State Machine. You could have something like:
/// Finite-state machine helper for the value of `-` padding.
struct SmallPaddingFillHelper {
fill_value: i8,
}
impl SmallPaddingFillHelper {
/// Update the helper in response to the next arg encoding.
///
/// Returns the current fill value for `-` padding.
fn step(&mut self, enc: &ArgEncoding, blob_position: u64) -> i8 {
todo!("update self.fill_value");
self.fill_value
}
}and then you could put let small_padding_fill = padding_helper.step(enc, args_blob.position()); near the top of the loop
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I meant more that I'd rather leave the padding bytes as exclusively 0, eliminate the padding fill variable entirely, and let the values be handled by a range sort of attribute within a single larger value. That way there isn't any more multi-iteration mutable state than there needs to be.
| let mut param_mask: raw::ParamMask = 0; | ||
| let mut current_param_mask_bit: raw::ParamMask = 1; | ||
|
|
||
| let mut small_padding_value = 0i8; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I want to see tests for:
- basic functionality of
extend - padding value resetting to 0 at dword boundaries. (
s(extend)------) - resetting to zero on positive values
That last one would involve
10 s(extend)b-
ins_10(-2, 1)
args_blob should compile to blobify![-2i16, 1i16]. You will need to impl Blobify for i16 here:
truth/tests/integration_impl/mod.rs
Line 227 in c41358e
| impl Blobify for i32 { |
src/image/color.rs
Outdated
| ColorFormat::Argb4444 => Rc::new(Argb8888::encode(&Argb4444::decode(bytes))), | ||
| ColorFormat::Argb8332 => Rc::new(Argb8888::encode(&Argb8332::decode(bytes))), | ||
| ColorFormat::Gray8 => Rc::new(Argb8888::encode(&Gray8::decode(bytes))), | ||
| ColorFormat::Rgb332 => Rc::new(Rgb332::encode(&Rgb332::decode(bytes))), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should be Argb8888::encode?
src/llir/abi.rs
Outdated
| arg0: bool, | ||
| immediate: bool, | ||
| extend: bool, | ||
| format: ast::IntFormat |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
im anal about trailing commas :P
Fix th20 std support
Fix th20 std support (again)
| ))); | ||
| } | ||
| } | ||
| pub(crate) fn validate(&self, _ctx: &CompilerContext) -> Result<(), ErrorReported> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was removed because its only purpose was to handle the padding arguments from old STD. This was refactored into the _ and - formats usable in any script type rather than optional arguments, so this check is unnecessary
|
|
||
| let sprite_id = sprite.id.unwrap_or(*next_auto_sprite_id); | ||
| *next_auto_sprite_id = sprite_id + 1; | ||
| *next_auto_sprite_id = sprite_id.wrapping_add(1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ZUN did something extremely cursed in a few old engine files and uses -1 as the sprite ID to intentionally overwrite sprites from a different ANM. This didn't break previously because truth didn't dump the ID field to roundtrip it but blew up once the field was added.
| } | ||
| Ok(abi_parts::PaddingInfo { count, index: first_index }) | ||
| fn find_and_remove_padding(&self, arg_encodings: &mut Vec<(usize, &ArgEncoding)>) { | ||
| arg_encodings.retain(|(_, enc)| !matches!(*enc, ArgEncoding::Padding { .. })); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was also simplified by changing padding to be format letters instead of optional args
At some point, zero had attempted to merge some commits from main into his branch, but it did not get saved in git as a merge. This attempts to correct that. A change on my branch to add decompilation of string literals to enum consts was undone. We decided it didn't make sense. At the end there was this ScalarValueMap type left over that seemed no longer used, though I'm surprised it wasn't used in other places...
Grabs some more commits from zero's branch
…ruth into format_improvements_test
| /// `C` in mapfile. 4-byte integer immediate or register, printed as hex when immediate. | ||
| Color, | ||
| /// `_` in mapfile. Unused 4-byte space. | ||
| /// `-` in mapfile. Unused 1-byte space. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
forgot to mention before. The first line of a docstring is special. It shows up as the summary of the item.
No description provided.