Could you publish the specification for the grammars syntax? #5

Open
KOLANICH opened this issue Jun 13, 2018 · 7 comments

@KOLANICH

I mean that there is a lot of information in the grammars themselves, but some things are still unclear:

  • what is the full list of available encodings?
  • what is dynamic endian?
  • what is a "pascal" string?
  • what is the difference between zero-terminated and delimiter-terminated with delimiter=\0?
  • what exactly do fixedvalues do?
  • what does mustmatch do, even in tags without fixedvalues?
  • can ids be non-integer?
    ... etc.
@synalysis
Owner

@KOLANICH
Author

KOLANICH commented Jun 14, 2018

Thank you.

Pascal strings: https://en.wikipedia.org/wiki/String_(computer_science)#Length-prefixed

I know that they are length-prefixed, but the size of the length field may differ. Is the length field 1 byte?

For values with mustmatch set, the constraints set by fixedvalues and min/max value are used. If none of the fixedvalues is found or the value is not between the min/max values, the enclosing structure is not applied.

This is an especially valuable clarification, since I expected slightly different behavior (raising an error when a signature is not matched).

BTW, the software I develop is https://github.com/KOLANICH/synalysis2kaitai .

@synalysis
Owner

For multi-byte encodings like UTF-16 the length is 2 bytes. However, Pascal strings are usually prefixed with a single byte length. These built-in data types work in many cases - for more special requirements data types can be parsed with Python or Lua scripts.
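
A minimal sketch of reading a length-prefixed ("Pascal") string as described above. It is only an illustration, not code from Synalyze It!: the name `read_pascal_string`, its parameters, and the little-endian default are hypothetical, and it assumes the prefix is an unsigned integer that counts payload bytes (not characters), which this thread does not confirm.

```python
def read_pascal_string(data: bytes, offset: int = 0, prefix_size: int = 1,
                       encoding: str = "ascii", byteorder: str = "little"):
    """Return (text, next_offset) for a length-prefixed string.

    Assumption: the prefix counts payload bytes, not characters.
    """
    end_of_prefix = offset + prefix_size
    if end_of_prefix > len(data):
        raise ValueError("truncated length prefix")
    # Decode the unsigned length stored in the prefix.
    length = int.from_bytes(data[offset:end_of_prefix], byteorder)
    end = end_of_prefix + length
    if end > len(data):
        raise ValueError("truncated string payload")
    return data[end_of_prefix:end].decode(encoding), end


# One-byte prefix, the common Pascal-string layout; the trailing "!" is not part of the string.
text, pos = read_pascal_string(b"\x05hello!")

# Two-byte prefix followed by a UTF-16LE payload (4 bytes, 2 characters).
wide, wide_pos = read_pascal_string(
    b"\x04\x00h\x00i\x00", prefix_size=2, encoding="utf-16-le"
)
```

Making the prefix width a parameter keeps both cases mentioned here (a single-byte length and a 2-byte length for UTF-16) in one function; whether the real format counts bytes or code units would need to be checked against the application.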

In chapter 6 of https://www.synalyze-it.com/Synalyze_It_Manual.pdf you can find some additional information regarding "mustmatch". For structures with variable element order, the parser can automatically select the structure that should be used: the one where all constraints are met.
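
The selection rule discussed in this thread (a mustmatch value has to equal one of its fixedvalues and lie within min/max, otherwise the enclosing structure is not applied) can be sketched roughly as follows. This is an illustration only, not the actual parser: `Field`, `Structure`, `field_matches`, `structure_applies`, and `select_structure` are hypothetical names, fields are simplified to fixed-size unsigned big-endian integers, and trying candidates in order and taking the first match is an assumption.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Field:
    size: int                         # width in bytes
    must_match: bool = False
    fixed_values: tuple = ()          # allowed values; empty means no list
    min_value: Optional[int] = None
    max_value: Optional[int] = None


@dataclass
class Structure:
    name: str
    fields: list = field(default_factory=list)


def field_matches(f: Field, value: int) -> bool:
    """Check the constraints that apply when must_match is set."""
    if not f.must_match:
        return True
    if f.fixed_values and value not in f.fixed_values:
        return False
    if f.min_value is not None and value < f.min_value:
        return False
    if f.max_value is not None and value > f.max_value:
        return False
    return True


def structure_applies(s: Structure, data: bytes, offset: int) -> bool:
    """Return True only if every field fits in the data and passes its checks."""
    for f in s.fields:
        chunk = data[offset:offset + f.size]
        if len(chunk) < f.size:
            return False
        if not field_matches(f, int.from_bytes(chunk, "big")):
            return False
        offset += f.size
    return True


def select_structure(candidates, data, offset=0):
    """Return the first candidate whose constraints are all met, else None."""
    for s in candidates:
        if structure_applies(s, data, offset):
            return s
    return None


# Hypothetical candidates: a 4-byte signature and a 1-byte ranged tag.
png_like = Structure("png_like", [Field(4, must_match=True, fixed_values=(0x89504E47,))])
small_tag = Structure("small_tag", [Field(1, must_match=True, min_value=1, max_value=9)])

chosen = select_structure([png_like, small_tag], b"\x05\x00\x00\x00")
no_match = select_structure([png_like, small_tag], b"\xff\x00\x00\x00")
```

Note that a failed check makes `select_structure` skip the candidate rather than raise an error, which mirrors the "structure is not applied" behavior quoted earlier in the thread.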

@KOLANICH
Author

Thank you for the info. Could you also clarify what valueexpression, floating, lengthoffset, disabled and debug do?

@KOLANICH KOLANICH reopened this Jun 24, 2018
@synalysis
Owner

valueexpression: for structures, it determines what is displayed as the value in the parsing results tree. For elements, it allows showing a different value than the one that was parsed from the file. For elements it's not used much. For structures it's useful because you can avoid opening structures in the parsing results but still see what they're about.

floating can be ignored. The idea was to mark as floating those structures that can appear at different places in a file.

lengthoffset can also be ignored.

disabled elements or structures are simply ignored while parsing.

debug can be set for structures in order to get additional log messages while parsing a file. This makes grammar development easier because you see parsing messages only for a certain structure rather than for all of them.

@KOLANICH KOLANICH reopened this Sep 16, 2018
@KOLANICH
Author

KOLANICH commented Sep 16, 2018

Could you clarify, please:

  • what the difference between disabled and unused is
  • what the difference between prev and this is
  • whether you could share the grammars for expressions, like the ones used in length, under a permissive license
  • how exactly repeatmin, repeatmax and repeat work (in pseudocode, taking into account all versions of the format)
  • whether mask is always an integer, whether it must always contain a fixedvalue, and how it interacts with endianness
  • whether you have test grammars that test aspects of the format separately (like https://github.com/KOLANICH/synalysis2kaitai/tree/master/tests ) and whether you are ready to share them under a permissive license
  • what the algorithm of member resolution is, in pseudocode. If one structure refers to a member of another structure by its id and, for example, the structure of the referenced member is not a substructure of the referrer, to which instance of that structure does the referenced member belong?
  • whether fixedvalue inside binary always uses a hex value

@KOLANICH
Author

@synalysis
