
When piping the output of a Python program, the Python interpreter gets confused about encoding and sets it to None. This means a program like this:

# -*- coding: utf-8 -*-

print u"åäö"

will work fine when running normally, but fail with:

UnicodeEncodeError: 'ascii' codec can't encode character u'\xa0' in position 0: ordinal not in range(128)

when used in a pipe sequence.

What is the best way to make this work when piping? Can I just tell it to use whatever encoding the shell/filesystem/whatever is using?

The suggestions I have seen so far are to modify site.py directly, or to hardcode the default encoding using this hack:

# -*- coding:utf-8 -*-

import sys

reload(sys)

sys.setdefaultencoding('utf-8')

print u"åäö"

Is there a better way to make piping work?

1 Answer


To set the correct encoding when piping stdout in Python:

Your code works when run directly in a terminal because Python encodes the output to whatever encoding the terminal application is using. When the output is piped, Python cannot detect an encoding (sys.stdout.encoding is None) and falls back to ASCII, so you must encode the output yourself.
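To see the difference for yourself, a small diagnostic along these lines can help. It is only a sketch: `describe_stream` is a hypothetical helper name, not part of the standard library, and the report goes to stderr so that it stays visible when stdout is piped. On Python 2 the encoding is reported as None when the output is piped; Python 3 usually reports a locale-derived encoding instead.

```python
import sys


def describe_stream(stream):
    """Return (is_terminal, encoding) for a file-like object.

    Hypothetical helper for illustration; it only reads attributes
    that ordinary file objects expose.
    """
    is_terminal = hasattr(stream, "isatty") and stream.isatty()
    encoding = getattr(stream, "encoding", None)
    return is_terminal, encoding


if __name__ == "__main__":
    is_terminal, encoding = describe_stream(sys.stdout)
    # Write to stderr so the report is still visible when stdout is piped.
    sys.stderr.write("terminal: %s, encoding: %s\n" % (is_terminal, encoding))
```

Run the script once on its own and once with its output piped into another command (for example `cat`) to compare the two reports.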

An important rule to keep in mind: always use unicode internally. Decode what you receive, and encode what you send. The code below shows how to encode the output explicitly.

# -*- coding: utf-8 -*- 

print u"åäö".encode('utf-8')
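If you would rather honour the terminal's encoding when one is known and fall back to UTF-8 only when piping, the same idea can be wrapped in a helper. This is a minimal sketch, not a standard API: `encode_for` and its `fallback` parameter are names made up for this example, and UTF-8 as the fallback is an assumption about what the program at the other end of the pipe expects.

```python
# -*- coding: utf-8 -*-
import io


def encode_for(stream, text, fallback="utf-8"):
    """Encode text with the stream's encoding, or `fallback` if it has none.

    Hypothetical helper: `fallback` is an assumption about what the
    consumer of the output expects.
    """
    encoding = getattr(stream, "encoding", None) or fallback
    return text.encode(encoding)


# A byte buffer stands in for a piped stdout, which reports no encoding.
pipe = io.BytesIO()
pipe.write(encode_for(pipe, u"åäö"))
```

On Python 2 the returned byte string can be passed straight to sys.stdout.write.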

The Python program below converts ISO-8859-1 input to UTF-8, making everything uppercase in between.

import sys

for line in sys.stdin:

    line = line.decode('iso8859-1')

    line = line.upper()

    line = line.encode('utf-8')

    sys.stdout.write(line)
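The same decode, transform, encode pattern can also be written as a function over binary streams, which makes it easy to try out without a real pipe. This is a sketch under assumptions: `recode_upper` is a hypothetical name, and the input is assumed to be ISO-8859-1 as in the program above.

```python
# -*- coding: utf-8 -*-
import io


def recode_upper(src, dst, src_encoding="iso8859-1", dst_encoding="utf-8"):
    """Copy lines from src to dst, uppercasing them and changing encoding.

    Both streams must be binary: bytes are decoded on the way in and
    encoded on the way out, so only unicode text is handled in between.
    """
    for raw_line in src:
        text = raw_line.decode(src_encoding)    # decode what you receive
        text = text.upper()                     # work on unicode internally
        dst.write(text.encode(dst_encoding))    # encode what you send


# In-memory buffers stand in for stdin and stdout.
source = io.BytesIO(u"åäö\n".encode("iso8859-1"))
target = io.BytesIO()
recode_upper(source, target)
```

On Python 2, sys.stdin and sys.stdout already carry bytes and can be passed in directly; on Python 3 the binary streams are sys.stdin.buffer and sys.stdout.buffer.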
