in Python by (47.8k points)

What exactly is a Unicode string?

What's the difference between a regular string and Unicode string?

What is utf-8?

I'm trying to learn Python right now, and I keep hearing this buzzword. What does the code below do?

i18n Strings (Unicode)

> ustring = u'A unicode \u018e string \xf1' 

> ustring 

u'A unicode \u018e string \xf1' 

## (ustring from above contains a unicode string) 

> s = ustring.encode('utf-8') 

> s 

'A unicode \xc6\x8e string \xc3\xb1' ## bytes of utf-8 encoding

> t = unicode(s, 'utf-8') ## Convert bytes back to a unicode string

> t == ustring ## It's the same as the original, yay!

True

Files Unicode

import codecs

f = codecs.open('foo.txt', 'rU', 'utf-8')

for line in f:
    # here line is a *unicode* string
    print(repr(line))

1 Answer

0 votes
by (107k points)

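Unicode is a standard for working with a wide range of characters. Each symbol has a code point (a number), and these code points can be encoded (converted to a sequence of bytes) using a variety of encodings. UTF-8 is one such encoding: it uses one to four bytes per code point and leaves ASCII characters unchanged. In Python 2, a regular string (str) is a sequence of bytes, while a Unicode string (unicode, written with the u'' prefix) is a sequence of code points. In the code in the question, encode('utf-8') turns the Unicode string into UTF-8 bytes, and unicode(s, 'utf-8') decodes those bytes back into the original Unicode string.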
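
The session in the question is Python 2. As a rough sketch of the same round trip in Python 3 (an assumption about which interpreter you are using), where str is the Unicode type and bytes is the byte-string type, something like the following should work. The temporary file path is made up purely for illustration, and the built-in open() with an encoding argument stands in for codecs.open.

```python
import os
import tempfile

# In Python 3 every str is a Unicode string, so no u'' prefix is needed.
text = 'A unicode \u018e string \xf1'

# Each character has a code point, which is just an integer.
code_point = ord('\u018e')          # 0x18e

# Encoding converts code points to bytes. UTF-8 uses 1 to 4 bytes per
# code point, so the two non-ASCII characters take 2 bytes each here.
utf8_bytes = text.encode('utf-8')

# Decoding with the same encoding restores the original string.
round_trip = utf8_bytes.decode('utf-8')

# Files: pass the encoding to open() and each line comes back as str.
# The path below is a hypothetical temp file, used only for this demo.
path = os.path.join(tempfile.mkdtemp(), 'foo.txt')
with open(path, 'w', encoding='utf-8') as f:
    f.write(text + '\n')
with open(path, encoding='utf-8') as f:
    lines = [line.rstrip('\n') for line in f]

print(hex(code_point))
print(utf8_bytes)
print(len(text), len(utf8_bytes))
print(round_trip == text, lines == [text])
```

The character count and the byte count differ because a Unicode string counts code points, while the encoded form counts bytes.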

