On Python 2 with all builtins imported, bytes(str(u'abc'), 'utf8') == b"b'abc'" is True #193

Open
Valloric opened this issue Feb 9, 2016 · 9 comments

Comments

@Valloric

Valloric commented Feb 9, 2016

$ python2
Python 2.7.6 (default, Jun 22 2015, 17:58:13)
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from builtins import *
>>> bytes(str(u'abc'), 'utf8')
b"b'abc'"
$ python3
Python 3.4.3 (default, Oct 14 2015, 20:28:29)
[GCC 4.8.4] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from builtins import *
>>> bytes(str(u'abc'), 'utf8')
b'abc'

Obviously this is a bug.

@Valloric
Author

Valloric commented Feb 9, 2016

On Python 2, this can be worked around with str(u'abc').encode('utf8'). It returns the newbytes type from python-future as expected.
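A minimal sketch of that workaround, assuming python-future is installed when running on Python 2 (on Python 3, `builtins` is part of the standard library). The variable names are illustrative only.

```python
from builtins import bytes, str  # python-future backports on Python 2 (assumed installed)

text = str(u'abc')

# Encode explicitly instead of calling bytes(text, 'utf8').
encoded = text.encode('utf8')

# Sanity checks: the result is bytes-like and carries the expected payload.
assert isinstance(encoded, bytes)
assert encoded == b'abc'
```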

@edschofield
Contributor

Thanks for reporting this, Val. Yes, I confirm this bug.

edschofield added a commit that referenced this issue Feb 9, 2016
@edschofield
Contributor

I've added a putative fix to the v0.15.x branch. Could you please try this out? Can you find any related brokenness?

@Valloric
Author

I can confirm the fix works! Thanks for the quick turnaround! :)

May I ask when this will go up on PyPI?

@ankostis

The fix did not break (or fix) the following related behavior,
but I'm wondering whether it is on purpose:

>>> from future.types import newbytes
>>> newbytes(newbytes(b'12'))
b'12'

While:

>>> bytes(newbytes(b'12'))
"b'12'"

@Valloric
Author

@ankostis That too looks like a bug.

@ankostis

Should I open a new issue or will you deal with it here?

Note that the solution might not be that straightforward: the root cause is that the "original" bytes invokes newbytes.__str__(), which mimics Python 3 and returns unicode.
On Python 3, I suppose the original bytes would call __bytes__() instead.
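To illustrate the difference, here is a rough Python 3 sketch. `Payload` is a hypothetical stand-in, not the actual newbytes implementation; it only shows which hook each constructor consults.

```python
class Payload(object):
    """Hypothetical stand-in for a bytes-like wrapper such as newbytes."""

    def __init__(self, raw):
        self._raw = raw

    def __bytes__(self):
        # On Python 3, the bytes() constructor consults this hook.
        return self._raw

    def __str__(self):
        # Mimics Python 3's bytes.__str__, which returns the repr as text.
        return repr(self._raw)


p = Payload(b'12')
as_bytes = bytes(p)  # Python 3: goes through __bytes__
as_text = str(p)     # goes through __str__
```

On Python 2, the native bytes is just an alias for str, so bytes(p) would go through __str__ instead, which matches the output shown above.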

@ankostis

Any progress on that?

@posita
Contributor

posita commented Jun 14, 2016

See also #171.
