Chapter 4.3 of Dive Into Python 3 says:

In Python 3, all strings are sequences of Unicode characters. There is no such thing as a Python string encoded in UTF-8, or a Python string encoded as CP-1252. “Is this string UTF-8?” is an invalid question.

I think I understand what this means: strings are sequences of characters from the Unicode set, and Python can help you encode those characters with different encodings. But aren't the characters of a Python string stored as bytes in the computer anyway? For example, given s = 'strings', s is surely stored in memory as a byte stream such as '0100100101...'. So which encoding is used there? Is there a "default" encoding in Python?

Thank you

1 Answer


Python 3 distinguishes between text and binary data. Text is guaranteed to be Unicode, but as far as I can see no specific internal encoding is specified. So internally it could be UTF-8, UTF-16, or UTF-32, and you would never notice.
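As a rough illustration of that point, the sketch below encodes one string in three different ways. The sample word and the variable names are made up for this example. The str object stays the same sequence of code points; only the explicitly encoded byte sequences differ.

```python
# Illustrative sample text: "naive" with U+00EF (i with diaeresis) as third character.
text = "na\u00efve"

# Encoding is an explicit step that turns text into bytes.
utf8 = text.encode("utf-8")
utf16 = text.encode("utf-16-le")   # "-le" fixes the byte order and omits the BOM
utf32 = text.encode("utf-32-le")

# The string has 5 code points regardless of how it is later encoded.
lengths = (len(text), len(utf8), len(utf16), len(utf32))
print(lengths)  # (5, 6, 10, 20)

# Decoding each byte sequence with the matching codec gives back the same text.
round_trips = [
    utf8.decode("utf-8") == text,
    utf16.decode("utf-16-le") == text,
    utf32.decode("utf-32-le") == text,
]
print(all(round_trips))  # True
```

None of these three byte sequences is "the" string; each is just one possible serialization of it.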

The main point is: you shouldn't care. If you want to deal with text, use text strings and access them by code point (the number assigned to a single Unicode character, which is independent of the internal encoding, even though that encoding may split a code point into several smaller code units). If you want bytes, use b"" and access them by byte. And if you want a string as a byte sequence in a specific encoding, use .encode().
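A minimal sketch of the three cases described above, using a made-up sample string: indexing a str yields characters (code points), indexing a bytes object yields integers, and .encode() / .decode() convert between the two.

```python
# Text: a str is indexed by code point.
s = "h\u00e9llo"            # 'h', U+00E9 (e with acute accent), 'l', 'l', 'o'
second_char = s[1]          # a one-character str
code_point = ord(second_char)
print(len(s), code_point)   # 5 233

# Bytes: a bytes literal is indexed by byte, and each item is an int.
raw = b"abc"
first_byte = raw[0]
print(first_byte)           # 97

# Text -> bytes in a specific encoding: use .encode().
encoded = s.encode("utf-8")
print(encoded)              # b'h\xc3\xa9llo'
print(len(encoded))         # 6, because U+00E9 takes two bytes in UTF-8
print(encoded[1])           # 195, the first byte (0xC3) of that two-byte sequence

# Bytes -> text: use .decode() with the same encoding.
decoded = encoded.decode("utf-8")
print(decoded == s)         # True
```

Note that s[1] and encoded[1] refer to different things: the first is a whole character, the second is only part of that character's UTF-8 representation.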
