Skip to content

newbytes not fully compatible with bytes: newbytes(newstr(...), '<encoding>') looks like it produces something similar to (but not quite the same as) newbytes(repr(newstr(...)), '<encoding>') #171

Description

@posita

On Python 3.4:

>>> from __future__ import print_function, unicode_literals ; from builtins import *
>>> bytes
<class 'bytes'>
>>> str
<class 'str'>
>>> b1 = str(u'abc \u0123 do re mi').encode(u'utf_8') # this works
>>> b1
b'abc \xc4\xa3 do re mi'
>>> b2 = bytes(u'abc \u0123 do re mi', u'utf_8') # so does this
>>> b2
b'abc \xc4\xa3 do re mi'
>>> b1 == b2
True
>>> b3 = bytes(str(u'abc \u0123 do re mi'), u'utf_8') # this works too (unsurprisingly)
>>> b3
b'abc \xc4\xa3 do re mi'
>>> b1 == b3
True

On Python 2.7:

>>> from __future__ import print_function, unicode_literals ; from builtins import *
>>> bytes
<class 'future.types.newbytes.newbytes'>
>>> str
<class 'future.types.newstr.newstr'>
>>> b1 = str(u'abc \u0123 do re mi').encode(u'utf_8') # this works
>>> b1
b'abc \xc4\xa3 do re mi'
>>> type(b1)
<class 'future.types.newbytes.newbytes'>
>>> b2 = bytes(u'abc \u0123 do re mi', u'utf_8') # so does this (argument is native unicode object)
>>> b2
b'abc \xc4\xa3 do re mi'
>>> b1 == b2
True
>>> b3 = bytes(str(u'abc \u0123 do re mi'), u'utf_8') # but this looks like it's encoding the repr() of the newstr
>>> b3
b"b'abc \xc4\xa3 do re mi'"
>>> b1 == b3
False
>>> # I can't figure out what it's actually doing though; these aren't quite the same
>>> bytes(repr(str(u'abc \u0123 do re mi')).encode(u'utf_8'))
b"'abc \\u0123 do re mi'"
>>> bytes(repr(str(u'abc \u0123 do re mi').encode(u'utf_8')), 'utf_8')
b"b'abc \\xc4\\xa3 do re mi'"

Metadata

Metadata

Assignees

No one assigned

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions