I keep getting a UnicodeEncodeError when trying to print an 'Á' from a website that I request using Selenium in Python 3.4.

I have already declared the encoding at the top of my .py file:

#  -*- coding: utf-8 -*-

The function is something like this:

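from selenium import webdriver

# Open the page in Firefox
b = webdriver.Firefox()
b.get('http://fisica.uniandes.edu.co/personal/profesores-de-planta')

# Collect the table cells that hold the professors' data
dataProf = b.find_elements_by_css_selector('td[width="508"]')

# Print the text of each cell (this is the line that raises the error)
for datos in dataProf:
    print(datos.text)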

And the exception:

Traceback (most recent call last):
  File "C:/Users/Andres/Desktop/scrap/scrap.py", line 444, in <module>
    dar_p_fisica()
  File "C:/Users/Andres/Desktop/scrap/scrap.py", line 390, in dar_p_fisica
    print(datos.text) #.encode().decode('ascii', 'ignore')
  File "C:\Python34\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u2010' in position 173: character maps to <undefined>

Thanks in advance.

1 Answer


The encoding your terminal is using doesn't support that character:

>>> '\xdf'.encode('cp866')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/encodings/cp866.py", line 12, in encode
    return codecs.charmap_encode(input,errors,encoding_map)
UnicodeEncodeError: 'charmap' codec can't encode character '\xdf' in position 0: character maps to <undefined>

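Python is handling the character just fine; it is your output encoding that cannot represent it. Note that in your traceback the failing character is '\u2010' (a hyphen), not 'Á', and the codec being used for output is cp1252.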
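To see the difference between the two characters without involving the console at all, here is a minimal sketch. The helper name can_encode and the sample string are made up for illustration; only str.encode and the codec names come from Python itself.

```python
def can_encode(s, encoding):
    """Return True if every character in s can be represented in encoding."""
    try:
        s.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False

# Hypothetical sample text containing U+2010 (HYPHEN), the character
# named in the traceback from the question.
sample = 'profesores\u2010de\u2010planta'

print(can_encode('\u00c1', 'cp1252'))  # 'Á' exists in cp1252
print(can_encode(sample, 'cp1252'))    # U+2010 does not
print(can_encode(sample, 'utf-8'))     # UTF-8 covers every code point
```

The three calls print True, False and True, which matches the traceback: the string itself is valid, and the failure only appears when it is encoded to cp1252 for output.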

You can try running chcp 65001 in the Windows console to switch its code page to UTF-8; chcp is the Windows command for changing code pages.

My terminal on OS X, which uses UTF-8, handles it just fine:

>>> print('\xdf')
ß
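If changing the console's code page is not an option, one possible workaround is to replace the characters the output encoding lacks before printing. This is only a sketch: to_printable is a hypothetical helper, and falling back to 'ascii' when sys.stdout.encoding is unset is an assumption of mine, not something from the question.

```python
import sys

def to_printable(text, encoding=None):
    """Round-trip text through the target encoding, replacing any
    character the encoding cannot represent with '?'."""
    # Assumption: fall back to ASCII if stdout reports no encoding.
    encoding = encoding or sys.stdout.encoding or 'ascii'
    return text.encode(encoding, errors='replace').decode(encoding)

# With cp1252, U+2010 is replaced and the rest is left untouched.
print(to_printable('a\u2010b', 'cp1252'))  # a?b
```

In the loop from the question this would be print(to_printable(datos.text)). The trade-off is that unsupported characters are lost in the output, so switching the terminal to UTF-8 is preferable when you can.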

Hope this helps!