
newbytes not fully compatible with bytes: newbytes(newstr(...), '<encoding>') looks like it produces something similar to (but not quite the same as) newbytes(repr(newstr(...)), '<encoding>') #171

Open
@posita

On Python 3.4:

>>> from __future__ import print_function, unicode_literals ; from builtins import *
>>> bytes
<class 'bytes'>
>>> str
<class 'str'>
>>> b1 = str(u'abc \u0123 do re mi').encode(u'utf_8') # this works
>>> b1
b'abc \xc4\xa3 do re mi'
>>> b2 = bytes(u'abc \u0123 do re mi', u'utf_8') # so does this
>>> b2
b'abc \xc4\xa3 do re mi'
>>> b1 == b2
True
>>> b3 = bytes(str(u'abc \u0123 do re mi'), u'utf_8') # this works too (unsurprisingly)
>>> b3
b'abc \xc4\xa3 do re mi'
>>> b1 == b3
True

On Python 2.7:

>>> from __future__ import print_function, unicode_literals ; from builtins import *
>>> bytes
<class 'future.types.newbytes.newbytes'>
>>> str
<class 'future.types.newstr.newstr'>
>>> b1 = str(u'abc \u0123 do re mi').encode(u'utf_8') # this works
>>> b1
b'abc \xc4\xa3 do re mi'
>>> type(b1)
<class 'future.types.newbytes.newbytes'>
>>> b2 = bytes(u'abc \u0123 do re mi', u'utf_8') # so does this (argument is native unicode object)
>>> b2
b'abc \xc4\xa3 do re mi'
>>> b1 == b2
True
>>> b3 = bytes(str(u'abc \u0123 do re mi'), u'utf_8') # but this looks like it's encoding the repr() of the newstr
>>> b3
b"b'abc \xc4\xa3 do re mi'"
>>> b1 == b3
False
>>> # I can't figure out what it's actually doing though; these aren't quite the same
>>> bytes(repr(str(u'abc \u0123 do re mi')).encode(u'utf_8'))
b"'abc \\u0123 do re mi'"
>>> bytes(repr(str(u'abc \u0123 do re mi').encode(u'utf_8')), 'utf_8')
b"b'abc \\xc4\\xa3 do re mi'"
