Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Expose optional lineno in thrift modules #298

Open
stiga-huang opened this issue Dec 30, 2024 · 4 comments
Open

Expose optional lineno in thrift modules #298

stiga-huang opened this issue Dec 30, 2024 · 4 comments
Labels
enhancement New feature or request

Comments

@stiga-huang
Copy link

We have a precommit job to check incompatible thrift changes and add warnings in Gerrit code reviews. E.g. when a patch adds a required field in an existing struct, the job will add a comment at that line saying "this might be incompatible".

Currently, the job doesn't understand the thrift syntax so has lots of false positive warnings. I'd like to improve the job by using the parser of thriftpy2. However, to add a Gerrit comment, I need to locate the symbol in the file. Basically just need the file name and line number. I can't find a way to expose it in thriftpy2. Not sure if I miss anything.

I see yacc.LRParser.parse() which is used in parser.py actually provides a tracking mode:

    def parse(self, input=None, lexer=None, debug=False, tracking=False, tokenfunc=None):

https://github.com/dabeaz/ply/blob/3.11/ply/yacc.py#L803-L809
However, thriftpy2 just uses the default mode so maybe line numbers are not attached in the results:

lexer.lineno = 1
parser.parse(data)

What confused me is in parser.py, there are some variables of line number. E.g. p.lineno here:

def p_error(p):
if p is None:
raise ThriftGrammarError('Grammar error at EOF')
raise ThriftGrammarError('Grammar error %r at line %d' %
(p.value, p.lineno))

It seems thriftpy2 can still get the line numbers.

So is it doable in current thriftpy2 versions to get the location (fileName + lineno) of each thrift symbols (struct/enum/method)? Any inputs are appreciated!

@cocolato
Copy link
Collaborator

cocolato commented Dec 31, 2024

Thriftpy2 is not suitable for parsing work. Maybe you should use Antlr to write your own visitor for parsing thrift metainfo.
https://github.com/antlr/grammars-v4/tree/master/thrift

@aisk aisk added the enhancement New feature or request label Jan 4, 2025
@stiga-huang
Copy link
Author

Thank @cocolato ! There are projects like https://github.com/thrift-labs/thrift-parser using Antlr but they are not that powerful as thriftpy2.

For instance, one of our requirements is just checking methods/structs used in some internal RPCs. There are some structs and enums used in data exchange between Java and C++ codes in the same process. They are allowed to add/remove required fields.

Thriftpy2 is convenient that it parses all thrift files at once and we can easily track methods and structs used in internal RPCs. However, using thrift-parser mentioned above, we have to parse the thrift files one by one and "join" the structs defined in different files.

It'd be nice if the parser in Thriftpy2 can expose the line numbers. I thought it's not that hard since line numbers are already used to report parsing/compilation errors. Do you think it needs complex changes?

@cocolato
Copy link
Collaborator

Could you describe the example of using the parse API that you expect?

Thriftpy2 is convenient that it parses all thrift files at once and we can easily track methods and structs used in internal RPCs. However, using thrift-parser mentioned above, we have to parse the thrift files one by one and "join" the structs defined in different files.

@ethe
Copy link
Member

ethe commented Jan 21, 2025

BTW I don't think that thriftpy parser is the best practice, it uses lots of anonymouse struct (tuples) to represent parsing intermediates, each item semantic of tuple is so indistinct, this makes the parser is not that easy to be extended, I'm happy to see if there are better alternatives.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

4 participants