Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

newint performance #136

Open
wizzat opened this issue Feb 1, 2015 · 4 comments
Open

newint performance #136

wizzat opened this issue Feb 1, 2015 · 4 comments

Comments

@wizzat
Copy link

wizzat commented Feb 1, 2015

I'm seeing a HUGE performance regression in my applications since switching from six -> python-future. I've narrowed at least one new bottleneck down to the newint monkey patch.

from __future__ import (absolute_import, division, print_function, unicode_literals)
from future.builtins import int

a = int(0)
for _ in range(1000000):
     a += 1

Python 2.7 without the int import: 0.211 sec
Python 2.7 with the int import: 5.388 sec

@LxiaoGirl
Copy link

Do not specify a type can obtain a faster running speed?

@edschofield
Copy link
Contributor

@wizzat: Thanks for your feedback! Wow, this looks pretty bad.

It comes down to the __add__ method in the newint class. Commenting out the __add__ method brings the time down to 0.23 sec on my machine, but then this behaviour results:

>>> from future.builtins import int
>>> a = int() 
>>> a
0
>>> a += 1
>>> type(a)
long

Performance would be helped by a C implementation of newint, but this would make python-future much less portable. A more realistic option might be a big recommendation in the docs on newint to use it sparingly because the compatibility it offers is at the expense of Py2 performance.

As @LxiaoGirl suggests, the toy script above doesn't need the newint type from python-future for Py2/3 compatibility. Does your actual code need it, or would a native int suffice?

(newint is useful for simplifying Py2/3 compatibility in applications and libraries like xlwt that handle both short and long integers on Py2. There's an example in the docs here: http://python-future.org/what_else.html#int.)

By the way: is this the only source of performance regressions you have seen? I'd be grateful to know about any others you find (ideally as separate issues).

@wizzat
Copy link
Author

wizzat commented Feb 3, 2015

Hey, thanks for looking into this! So, I end up doing a lot of numerical stuff in Python and this problem originally cropped up because of a bunch of float -> int conversions in code like this:

import collections
from future.builtins import *

c = collections.Counter()
for _ in range(1000000):
    c[int(3.5)] += 1

The natural answer is to use future.utils.native, but even that has a performance hit of ~2x:

import collections
from future.utils import native

c = collections.Counter()
for _ in range(1000000):
    c[native(3.5)] += 1

So the particular block of code that I was converting from six to python-future doesn't actually rely on newint (its bounded on uint64_t) - but the "standard" set of compatibility shims (from future.builtins import *) was unexpectedly a huge performance hit.

@Bernhard10
Copy link

In newint's __new__ there is the following instance check (line 62)

try:
    val = x.__int__()
except AttributeError:
    val = x
else:
    if not isint(val):
        raise TypeError('__int__ returned non-int ({0})'.format(type(val)))

When using from future.builtins import * and int(2.5) to convert floats to integers, there is quite a large performance hit. Profiling shows that the instance check (isint(val)) is partly to blame for that.

Is this check really necessary? As far as I can see, the line return super(newint, cls).__new__(cls, val) (line 76) will raise a TypeError anyway, if val is not an integer type.

Further more, if a programmer uses a class with an __int__ method that does not return an integer type, this is the programmer's fault and IMHO we shouldn't care.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants