Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversion of % format using flint #2634

Merged
merged 2 commits into from
Jan 28, 2025

Conversation

un-pogaz
Copy link
Contributor

I found a command line tool to automatically convert Python code from old % format and .format() strings to f-strings: ikamensh/flynt.

flynt is able to convert a good number of % format that Ruff cannot, even if it can be a little to aggressive during the conversion, so is not 1:1 of the result that flynt would have produced.

A note for a later use: flynt have also a agresive mode (not performed here), will which catch and convert almost any % format to f-strings.

Using flint. Change has been filtred
because even on safe mode, flint can be too aggressive.
@un-pogaz un-pogaz changed the title Flint conversion of % format Conversion of % format using flint Jan 28, 2025
@eli-schwartz
Copy link
Contributor

flynt is able to convert a good number of % format that Ruff cannot, even if it can be a little to aggressive during the conversion, so is not 1:1 of the result that flynt would have produced.

ikamensh/flynt#176

Personally I have thus far been basically avoiding f-strings that I didn't write by hand. It's not really clear in most auto-converted cases that the f-string is an improvement.

Note as well that f-strings are much faster than .format() due to no function call... but percent-formatted strings got the same speed boost as f-strings (eventually) and are now faster than .format(), so many of the changes in this PR make calibre slightly slower.

@eli-schwartz
Copy link
Contributor

@kovidgoyal
Copy link
Owner

kovidgoyal commented Jan 28, 2025

@eschwartz: Looking at the diff its almost all % formatting to
f-strings. The only cases where it foes to .format() is when the
arguments themselves use string literals. I dont think those few cases
are a performance concern. I have made a followup commit to fix a few of
them in the one place where the performance might matter slightly.

@kovidgoyal kovidgoyal closed this Jan 28, 2025
@kovidgoyal kovidgoyal merged commit 9c5e88c into kovidgoyal:master Jan 28, 2025
4 checks passed
@eli-schwartz
Copy link
Contributor

-' id="%s"' % (aid + '-' + aid_anchor_suffix)
+f' id="{aid + "-" + aid_anchor_suffix}"'

It's a very personal stylistic thing but I actually don't like this anyway, since my text editor syntax highlighting treats the latter as a simple string and doesn't syntax highlight as an expression.

-'search_%s'%('author_title' if x == 'book' else x)
+'search_{}'.format('author_title' if x == 'book' else x)

Was changed in the PR (this is what is meant by "can be a little too aggressive" I think), and not changed again by you to become

-'search_%s'%('author_title' if x == 'book' else x)
+f'search_{"author_title" if x == 'book' else x}'

Same for

PR:

-'content/book_%d.html' % int(self.books_by_description[0]['id']))
+'content/book_{}.html'.format(int(self.books_by_description[0]['id'])

which could become:

-'content/book_%d.html' % int(self.books_by_description[0]['id']))
+f'content/book_{int(self.books_by_description[0]['id'])}.html'

but I question whether any of that is better and would be inclined to often prefer percent formatting :D

@un-pogaz
Copy link
Contributor Author

un-pogaz commented Jan 28, 2025

this is what is meant by "can be a little too aggressive" I think

Yes, that was this. Also, not making recursive-string was the behavior of Ruff for the same conversion.

@eli-schwartz I share your reticence about automatic tools, but I find that flynt was extremely efficient and faithful. After looked the code, this is explain by it uses of the Python internal compliator and the Abstract Syntax Trees to interpret the original code and the value passed, before rewrite the same Code Node into a f-string. The code write its own code.
This doesn't mean you can trust him blindly, especially on aggressive mode, but it can catch a lot of % format and provide a good hint of the good edit.

Else @kovidgoyal, a thing I want to share about convert more string in the future:
The enormous majority of % format that need the use of the aggressive mode is because the original string contain a %d, the only format specificator that have none .format()/f-string equivalent. And because of a subtile behavior of it, is a implicte int convertor, flynt avoid to convert them since it cannot guess if the value pased is compatible or not.
In aggressive mode, flynt solve that by brute-force adding a explicit int conversion, but after looked a test run, the vaste majority of this explicit int convertion are useless because the value is already confirmed to be a int.

I initialy started to convert them too, but after see the numbers of files grown out of control like the pep8-strict project, I conclude that such masive change need to be perform by you when you want rather than reviewing a monstruous PR, and so resume to only the safer convesion.

@kovidgoyal
Copy link
Owner

@eli-schwartz Need to upgrade the syntax highlighting in your editor then :) Here's what it looks like in my editor:
screenshot

@un-pogaz yes I know the issue with %d, the vast majority of the time it is safe to convert as just {} without int() but sadly this cant be done automatically reliably, which means tons of manual work. Someday when I have nothing better to do perhaps.

@eli-schwartz
Copy link
Contributor

There's a vim feature request for it but it appears to be taking a while... IIRC it doesn't help remotely, that reliably determining the end quote char for an f-string requires fully parsing the semantics of python, since any valid python expression is now valid inside of one, including nested f-strings as well as the same quote char you used to open the original string itself. It's a documented requirement that you need a PEG parser running the cpython grammar to figure out where the start and end bounds of a string is, now.

Would probably be easier if they focused on implementing pre-PEG versions of f-string syntax highlighting but it is what it is.

Mostly the increased power of them and the ability to "aggressively" rewrite things as f-strings seems to be geared towards people who want to write a 12-line program surrounded by two bytes of string literals, as a string containing a 12-line program. That's why I tend to get nervous whenever I see an f-string expression more complicated than {func(varname)}

@kovidgoyal
Copy link
Owner

That screenshot is from nvim, FYI. Probably using tree-sitter, IIRC.

@un-pogaz
Copy link
Contributor Author

un-pogaz commented Jan 29, 2025

A idea that I just got to make this task easier:
Edit the formatted_value() in flynt/transform/percent_transformer.py so that it not cast a int() during the conversion of %d (Since in your case, it much easier to add manualy a int() if your not sure than remove them from almost all of the occurence).

remove this

val = ast.Call(
    func=ast.Name(id="int", ctx=ast.Load()),
    args=[val],
    keywords={},
)

EDIT: You know what, I think a will retake the full conversion remaining % format by altered flynt and separet the edit in the same way that for the pep8-strict project so that you can programmatic review the change of the most big commit.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants