Encoding Special Characters with the urllib quote() Function in Python

I was actually testing my final year project today and there was so much noise in the data, I was really frustrated by the end of it.

One of the most troublesome problems, and the hardest to figure out, was the urllib.quote(movie) call.

You should see the movie titles people post. Here are a few:

I ♥ Bollywood, Funniest movies ever

Sending titles like these in an API call turned out to be a real challenge. We thought of stripping the special characters out, but a few French and Italian movies always have some odd character or other in their titles. I passed the unicode title straight to quote() and kept getting a KeyError exception.

Finally I figured it out: you have to encode the string to UTF-8 before quoting it. In Python 2, quote() looks up each character in a table that only covers single-byte characters, so a unicode character such as ♥ is missing from the table and raises KeyError. So when you call a URL that contains any special characters, encode the value first and then send it across.
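For anyone on Python 3, here is a minimal sketch of the same encode-then-quote idea. It assumes the Python 3 standard library, where the function lives at urllib.parse.quote instead of urllib.quote. The helper name quote_title is made up for illustration and is not part of my project.

```python
from urllib.parse import quote, unquote


def quote_title(title):
    """Percent-encode a movie title for use in a URL query value.

    quote_title is a hypothetical helper name used only in this sketch.
    """
    # Encode to UTF-8 bytes first, then percent-escape every byte that is
    # not a letter, digit, or one of "_.-~". safe='' also escapes "/".
    return quote(title.encode('utf-8'), safe='')


encoded = quote_title('♥+ツ')
print(encoded)           # %E2%99%A5%2B%E3%83%84
print(unquote(encoded))  # ♥+ツ
```

Python 3's quote() also accepts a str directly and encodes it as UTF-8 by default, so the explicit encode() is there only to mirror the Python 2 fix.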

Example (Google App Engine):

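import urllib
from google.appengine.api import urlfetch

# Unicode title containing non-ASCII characters
data = u'♥+ツ'
url = 'http://www.google.com?search='

# Encode to UTF-8 bytes first; quote() then percent-escapes each byte
response = urlfetch.fetch(url + urllib.quote(data.encode('utf-8')))

if response.status_code == 200:
    output = response.content
    # This part runs inside a request handler, where self.response exists
    self.response.out.write(output)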

I wasted almost an hour figuring out what to do. If you ever get a KeyError when calling urllib.quote, this is the solution.
