
I have a piece of code like this:

host = 'http://www.bing.com/search?q=%s&go=&qs=n&sk=&sc=8-13&first=%s' % (query, page)
req = urllib2.Request(host)
req.add_header('User-Agent', User_Agent)
response = urllib2.urlopen(req)

When I enter a query of more than one word, such as "the dog", I get the following error:

    response = urllib2.urlopen(req)
  File "/usr/lib/python2.7/urllib2.py", line 126, in urlopen
    return _opener.open(url, data, timeout)
  File "/usr/lib/python2.7/urllib2.py", line 400, in open
    response = meth(req, response)
  File "/usr/lib/python2.7/urllib2.py", line 513, in http_response
    'http', request, response, code, msg, hdrs)
  File "/usr/lib/python2.7/urllib2.py", line 438, in error
    return self._call_chain(*args)
  File "/usr/lib/python2.7/urllib2.py", line 372, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.7/urllib2.py", line 521, in http_error_default
    raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 400: Bad Request

Can anyone point out what I'm doing wrong? Thanks in advance.

1 Answer


The query "the dog" returns a 400 error because the space in it is not escaped for use in a URL, so the request line sent to the server is malformed.

If you do this:

import urllib, urllib2

# Percent-encode the query so that "the dog" becomes "the%20dog"
quoted_query = urllib.quote(query)
host = 'http://www.bing.com/search?q=%s&go=&qs=n&sk=&sc=8-13&first=%s' % (quoted_query, page)
req = urllib2.Request(host)
req.add_header('User-Agent', User_Agent)
response = urllib2.urlopen(req)

It will work.
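To see what the escaping step does without sending a request, here is a minimal sketch. It assumes Python 3, where quote lives in urllib.parse instead of urllib; the helper name build_search_url is made up for illustration and is not part of the original code.

```python
from urllib.parse import quote

def build_search_url(query, page):
    """Return the search URL from the question with the query percent-encoded.

    build_search_url is a hypothetical helper used only for this illustration.
    """
    # quote() replaces characters that are not URL-safe, e.g. a space becomes %20
    quoted_query = quote(query)
    return 'http://www.bing.com/search?q=%s&go=&qs=n&sk=&sc=8-13&first=%s' % (quoted_query, page)

url = build_search_url("the dog", 1)
print(url)
```

The printed URL contains q=the%20dog, so there is no raw space left in it.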

However, I strongly suggest using requests instead of urllib/urllib2/httplib. It is much easier to use, and it URL-encodes the query parameters for you.

This is the same request written with requests:

import requests

# params are URL-encoded by requests, so the raw query can be passed as-is
results = requests.get("http://www.bing.com/search",
                       params={'q': query, 'first': page},
                       headers={'User-Agent': User_Agent})
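If you want a rough idea of the kind of encoding that happens when you pass a params dict, the standard library's urlencode does something similar. This is only a sketch under the assumption of Python 3 (urllib.parse); it is an approximation for illustration, not the exact URL that requests would produce, and the sample values are invented.

```python
from urllib.parse import urlencode

# Sample values invented for this illustration
params = {'q': 'the dog', 'first': 1}

# urlencode() escapes each value and joins the pairs with '&';
# spaces are encoded as '+' in query strings
query_string = urlencode(params)
full_url = "http://www.bing.com/search?" + query_string
print(full_url)
```

Either form of escaping ('+' or '%20') is a valid way to send a space in a query string.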


...