Here’s a very simple piece of code, written in Python 2.7 with webapp2 on Google App Engine, that does memcaching. It is possibly the easiest example of memcaching on Google App Engine that I could think of and implement.
The main confusion that arises when one starts to look into memcaching is that Google’s documentation does not make it obvious that memcaching has to be done after instantiating a memcache client object.
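To make the "client object first, then cache calls" idea concrete, here is a minimal sketch of the usual get-then-set pattern. It is not the code from my application: `FakeMemcacheClient` and `cached_lookup` are hypothetical names I made up so the snippet can run outside App Engine, and the fake client only imitates the `get` and `set` calls of the real client.

```python
class FakeMemcacheClient(object):
    """Hypothetical in-memory stand-in for a memcache client (local runs only)."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        # Return None on a cache miss, like memcache does.
        return self._store.get(key)

    def set(self, key, value):
        self._store[key] = value
        return True


def cached_lookup(client, key, compute):
    """Return the cached value for key; on a miss, compute it and store it."""
    value = client.get(key)
    if value is None:
        value = compute()
        client.set(key, value)
    return value


# On App Engine the client would instead come from:
#   from google.appengine.api import memcache
#   client = memcache.Client()
client = FakeMemcacheClient()

calls = []  # records how often the slow function really runs


def slow_greeting():
    calls.append(1)
    return "Hello, memcache!"


first = cached_lookup(client, "greeting", slow_greeting)
second = cached_lookup(client, "greeting", slow_greeting)
print("%s / %s / computed %d time(s)" % (first, second, len(calls)))
```

The second lookup is served from the cache, so `slow_greeting` runs only once. One caveat of this sketch: a value that is legitimately `None` would be recomputed on every call.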
The source code for this application is available here and is self-explanatory.
Questions, comments, doubts are welcome.
UPDATE (24/03/2018): I’m in the process of rewriting this article. If you are comfortable reading a bit of non-trivial Python code, you can take a look at my GitHub repository for a more elegant implementation.
OUTDATED information from here…
I’ve written a very small code snippet that generates n-grams. I’ve also added a small tweak that gives us the number of times an n-gram has appeared in the document.
The example I’ve considered is a Shakespeare play (All’s Well That Ends Well). I’ll be generating the most common 3-, 4-, 5- or 6-word phrases that Shakespeare used in this particular play.
The first thing to do is to clean up the document by removing things like ACT1, SCENE 1, [To Derpina] and so on. The next step is tokenising the document, that is, splitting it into tokens by stripping punctuation and whitespace.
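The clean-up and tokenising steps described above could be sketched like this. The function name `clean_and_tokenise`, the regular expressions, the lowercasing and the tiny sample text are all my assumptions for illustration; a real copy of the play will probably need the patterns adjusted to its formatting.

```python
import re


def clean_and_tokenise(text):
    """Strip stage directions and act/scene headings, then split into words."""
    # Drop bracketed stage directions such as [To Derpina].
    text = re.sub(r"\[[^\]]*\]", " ", text)
    # Drop heading lines such as "ACT1", "ACT I" or "SCENE 1"
    # (assumed to be upper case and at the start of a line).
    text = re.sub(r"^\s*(ACT|SCENE)\s*[0-9IVXLC]+\b.*$", " ", text,
                  flags=re.MULTILINE)
    # Lowercase, then keep runs of letters and apostrophes as tokens.
    tokens = re.findall(r"[a-z']+", text.lower())
    # Trim apostrophes left dangling at either end of a token.
    return [t.strip("'") for t in tokens if t.strip("'")]


sample = """ACT1
SCENE 1
[To Derpina] In delivering my son from me, I bury a second husband.
"""

word_list = clean_and_tokenise(sample)
print(word_list)
```

Lowercasing is a design choice: it makes "The" and "the" count as the same word, at the cost of losing capitalised names.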
Now we get into action:
# By now word_list should hold the words of the file, in order,
# with no stray punctuation or whitespace attached to any word.
import operator

# n for n-gram; change it to whatever the requirement is
n = 6
ngrams = dict()

# count every run of n consecutive words
for i in range(len(word_list) - n + 1):
    gram = tuple(word_list[i:i+n])
    if gram in ngrams:
        ngrams[gram] += 1
    else:
        ngrams[gram] = 1

# ngrams now maps every n-gram of the book to its count;
# sort them from most to least frequent
sorted_ngrams = sorted(ngrams.items(), key = operator.itemgetter(1), reverse = True)
Okay! This is the only part of the program that does real work and needs explaining. I believe the code is self-explanatory if you know a bit of Python.
The source code can be found in my repository.
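For comparison, the same counting can be written more compactly with `collections.Counter` from the standard library. This is an alternative sketch, not the code from my repository; the helper name `most_common_ngrams` and the short sample sentence are made up for the example.

```python
from collections import Counter


def most_common_ngrams(word_list, n, top=10):
    """Count every run of n consecutive words; return the `top` most frequent."""
    grams = (tuple(word_list[i:i + n])
             for i in range(len(word_list) - n + 1))
    return Counter(grams).most_common(top)


# A made-up word list, just to have something to count.
words = "to be or not to be that is the question to be or not".split()

for gram, count in most_common_ngrams(words, 2, top=3):
    print("%d  %s" % (count, " ".join(gram)))
```

`most_common` returns a list of `(ngram, count)` pairs sorted from most to least frequent, which is the same shape as `sorted_ngrams` above. If the list has fewer than n words, the result is simply empty.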