Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SquashFS superblock #308
base: master
Are you sure you want to change the base?
SquashFS superblock #308
Changes from 1 commit
aa22e7a
96745d1
502d6da
044a726
2d2c852
9a30969
0b78cdd
9218754
87b8a36
44b535e
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not rename the field to
fileformats
?justsolve
doesn't sound like a descriptive name for that wiki.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, that would be a question for a new issue, but I don't think it's much beneficial. Right now,
justsolve
is an established identifier, so it doesn't make sense to change it arbitrarily.In general, I don't care about the exact identifiers, it's just important to keep them consistent (so that no two different KSY specs use another names to refer to the same thing).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This
justsolve
contains broken links, doesn't link to the real spec and assumes that.sfs
extension without any references. I am not too enthusiastic to add it as a reference. I found more relevant results with Google than digging for thejustsolve
keyword and following it there.I can't say that this
xref
tojustsolve
is human friendly, and if it is for robots, then wikidata is a better way to link various related resources, IMO.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair enough.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually SquashFS can also be big endian:
https://github.com/plougher/squashfs-tools/blob/master/squashfs-tools/squashfs_fs.h#L30
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW KS allows switcheable endianness.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SquashFS 4.0 is strictly little endian. The older versions of SquashFS could be either.
This definition is only there for compatibility, since
unsquashfs
supports unpacking older formats.There are some big endian 4.0 images floating around, typically created by cheap plastic router vendors who thought it's a great idea to hack up the format. But those usually also contain other modifications as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should link to a specific revision, not to
master
that can change over time:There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not to the original squashfs project as well?
https://github.com/plougher/squashfs-tools/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess because https://github.com/plougher/squashfs-tools/blob/master/COPYING
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, this is not a problem. Also, for interoperability you are allowed to look at this in many jurisdictions without any problem at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Depends heavily on court decisions on per-case basis. And I don't think that in the case where source code is available and when one explicitly claims he bases his work on that code, the court gonna decide that the work is independent and not derivative one.
Copyright trolls are copyright trolls, and choosing GPL when not forced to, is one of the markers of a copyright troll.
It is in the jurisdictions where reversing for interoperability is fair use ... There are ones in which there is no concepts of fair use, but reversing is allowed ... with a lot of limitations .... only for the cases almost never happening in practice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would find it very hard to believe that a header file that contains no code, just definitions and a description of an on disk structure, would be copyrightable.
This doesn't make sense. Who is choosing GPL here when not forced to?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not "here", but in general. Some people prefer to license their software to GPL and other licenses designed to create legal troubles to other people. Not everyone considers it as OK. In the current legal system the best way to deal with such software and people is to avoid them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I vehemently disagree. I suggest take this discussion offline as to not clutter the comments :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@adrianherrera because https://github.com/plougher/squashfs-tools/ is not a link to reference doc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not move the
superblock
definition ofseq
directly here?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess because this spec should be called
squashfs
, notsquashfs_superblock
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep. It should. But for now I am unable to overcome the complexity to develop it further.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't it be a enum?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most likely. I am still learning the .ksy format. In the spec it is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
btw, we have a lib of decompressors
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
zlib (raw + gzip), lzma (both versions), lz4, zstd
see https://github.com/kaitai-io/kaitai_compress
also: https://github.com/KOLANICH/transformerz.py
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It'd be nice to add
doc
to each field non-fully described by its title. And also crosslink it to squashfs impls using-orig-id
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The field name
-orig-id
is not self-descriptive either. :D It is not in the .ksy spec. What is it for?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First: we usually call a
field
an entry inseq
.I don't remember we have the established terminology for the stuff within
fields
andinstance
s, but I'd call it an attribute. Attrs beginning with-
are not validated by the compiler against specs, but there are some conventions.-orig-id
is used for names in original specs and source code. I.e. if in the original spec some field is called so, you can and should put it there. If in the reference impl a variable or a struct member to which the val is parsed is called so, you can also use it. Multiple names can be inserted using YAML arrays.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Original name in spec.
Added the
doc
, but https://ide.kaitai.io/ doesn't render anything specific for it. JS code also doesn't contain the comment.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@AgentD I copied comments for the parser from your https://github.com/AgentD/squashfs-tools-ng/blob/master/doc/format.txt spec. It is okay for you that the parser with these comments is CC-0?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess that it may make sense to add that fact into https://github.com/AgentD/squashfs-tools-ng/blob/master/COPYING.md , if he is OK with relicensing that doc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have no problem with that. In fact, one of the main reasons I wanted to document the SquashFS format was to enable others to create implementations without getting tangled up in a license mess (as reading the GPL'd code as a starting point would).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@AgentD, could you explicitly license the docs under a permissive license then, please?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should probably be a bit-sized type
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bit mapping in 16-bit little endian block in Kaitai is not trivial. May need to write a script to avoid mistakes.
There are only 11 bits used, so it may seem that the mapping should start with 5 bit padding. But that's a mistake. Because the first 8 bits that Kaitai sees are from the lower byte, because the flag stored as little endian. The thing that goes first are these 7 bits.
The bits in Kaitai will be written from the end. To make it 8 bits, there should be 1 padding bit for the start of Kaitai structure. Wrong. The bit is actually 0x0004 missing in the middle. I don't know if you can compare it with spec manually, but for me it is hard.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. PTAL.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@abitrolly Note that you already can read bit integers also in little-endian byte order, see kaitai-io/kaitai_struct#155 (comment). Looking at the hex values of the bits in your comment, it seems that you can read them exactly in the order as they're defined in the spec (if the whole flag is stored in little-endian bit order as you said), but rather look at the comment above to see the actual bit layout of little-endian bit integers.
You'll need a snapshot compiler version installed though, or use the devel Web IDE that comes preinstalled with the latest 0.9-SNAPSHOT out-of-the-box.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@generalmimon I don't find bit masking syntax more readable. Any attempt to work with bits without getting the byte or the word out of the flow first looks complicated. I coud use something like this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@abitrolly
If the hex bit masks specified here are meant to be used on a single
u2le
parsed previously:One can declare the structure in the
.ksy
spec in the exact same order using little-endian bit integers like this:I accept your opinion. Yeah, I realize that may be difficult to understand the pattern how the little-endian bit integers work at first glance, but once you get it, it starts to pay off with brevity, simplicity and flexibility.
And as you can see on the example above, it doesn't have to be hard to understand at all.
It also seamlessly fits into the KS concept of declarativity, meaning that you just describe the actual bit layout of the fields, without imperatively saying how to read (or write) them. This makes serialization (kaitai-io/kaitai_struct#27) much easier as well - the bit layout is everything you need to know to be able to write the values back.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@generalmimon is this format already implemented? I don't see
bit-endian
here http://doc.kaitai.io/ksy_reference.htmlHow make sure the size is set to
u2le
for parsing this type? Maybe set it inmeta
?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it is, but unfortunately the documentation for this feature hasn't been written yet, sorry for that. Hopefully I will add the section about the little-endian bit integers to the User Guide soon.
I explained the bit layout of the little-endian bit-sized integers on a simple example in comment kaitai-io/kaitai_struct#155 (comment), so I suggest you read it first. If you have any questions, feel free to ask on our Gitter channel.
And please note that the KSY reference is now disabled, refer to the KSY syntax diagram instead. I'm going to add the
bit-endian
key to the KSY schema soon, so it'll be available there.No, you have to make sure that every
seq
orinstance
attribute withtype: flags
has alsosize: 2
(i.e. theflags
type will be wrapped in its own substream with 2 bytes in size). Omitting the last padding field while some attribute doesn't wrapflags
to 2-byte substream can cause wrong readings on further bit-sized integer readings using the same stream (because the internal bit position pointerbitsLeft
stays on the wrong place in the middle of the partially-consumed byte, and just jumping out of theflags
type doesn't reset it).Though the compiler inserts a
alignToByte()
call if it detects a byte-aligned field (anything withsize: [0-9]+
, ortype: [us][1248]
) after a not byte-aligned bit-sized integer (b[0-9]+
) in the sameseq
, the detection algorithm is currently quite dumb, and it doesn't recognizeif
conditions, subtypes, repetitions and type-switchings which have to be handled with special care. See kaitai-io/kaitai_struct#743 (comment) for more info.So that's why I suggest wrapping every subtype with
seq
that might not be byte-aligned in its own substream, because you're safe this way. It's sort of a workaround for this bug. But even when it's resolved, I like to make the bytesize
explicit because you immediately see how many bytes the bitfield occupies in the stream.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, I'm still hoping for some response here, @abitrolly. I'd be glad if you tried the approach I presented in #308 (comment), if it works for your format.
If there are any difficulties, feel free to share them, I'm here to help you.
To make it simple, you can ignore the whole thing with omitting the last padding field and adding
size
, just take the spec from #308 (comment) as it is and remove my recommendation comment.I might be a little biased as I implemented the feature 😃, but I think that using the little-endian bit integers makes sense here and can improve the readability of the spec. Currently you use big-endian bit integers, which enforces declaring the individual bits in unusual order, as you might've noticed. Your comment bit count starts from the largest, byte count form the smallest just reflects the mess. The sequence
0x0080
,0x0040
, ...0x0001
,bits 11110000
,0x0800
,0x0100
seems to me unnecessarily illogical and chaotic.To be clear, it's still much better than extracting the bits manually using value instances and bitwise AND
&
and bitshift>>
operations, because big-endian bit-integers can be at least called declarative, but there's a more organized way of doing it, so why not try it.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@generalmimon the events from 9th of August, 2020 in Belarus were too disturbing to do coding. There is a lot of info I need to reread, and because it didn't fit in my head last time, I don't think it will fit in those spare 15 minutes I have now either. If there is already a new tutorial about bit mapping in Kaitai, I may be able to find some time to experiment with that. Right now my focus is on getting my
ksykaitai
helper released.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW it may make sense to create a
version
struct.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If there were other versions, but there aren't.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even now, because it will allow to put them in the same field called
version
. It is easier to read.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer the .ksy file to closely follow the spec with no surprises.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it is a good approach. IMHO we must always improve if we can.