tkbe

December 17, 2009

django :: Http404 handling

Filed under: django — tb @ 5:12 am

Just like "everyone" else I've written a custom CMS (Content Management System) that we use internally. One can discuss the merits of writing your own vs. using something like Plone, but given a boss who sometimes wants it "just so" I felt more comfortable with a system I wrote myself. Creating custom "portals" is one of the services we offer, so it's also part of our core business.

Today I wanted to learn more about pages we don't serve but that someone is nevertheless looking for. The usual causes are dead links in bookmarks or emails, mistyped URLs, and search engines running amok. Sometimes there is an obvious alternative page that could be displayed, e.g. if a page has moved.

Our urls.py file has this as its last rule

PYTHON:
    (r'^(?P<path>.+)$', views.page),

and the page view starts by finding the correct page (we use separate settings files for the various websites we run)

PYTHON:
    webpage = get_object_or_404(Page, site=settings.WEBSITE, path=path)

The Page model keeps track of which version of resources should be displayed, which permissions (if any) are required to view the page, keywords, description, title, etc.
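
The catch-all rule at the end of urls.py can be tried out in isolation with the standard re module. This is only a sketch: captured_path is a hypothetical helper, not part of the CMS, and it assumes the path has already had its leading slash removed, as Django does before matching URL patterns. The query string is not part of what the pattern sees.

```python
import re

# Same pattern as the last rule in urls.py
CATCH_ALL = re.compile(r'^(?P<path>.+)$')

def captured_path(url_path):
    """Return the value the view would receive as `path`, or None if no match."""
    match = CATCH_ALL.match(url_path)
    return match.group('path') if match else None

print(captured_path('about/contact/'))   # about/contact/
print(captured_path('page.asp'))         # page.asp
print(captured_path(''))                 # None
```

Because the pattern requires at least one character, the empty path (the site root) never reaches this rule and has to be handled by an earlier one.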

Changing get_object_or_404 to our own function, we get

PYTHON:
    webpage = select_page(request, path)

The settings are global, so we don't need to pass settings.WEBSITE as an argument. The select_page function uses the Redirect model:

PYTHON:
    from django.db import models

    class Redirect(models.Model):
        website = models.PositiveIntegerField()
        path = models.CharField(max_length=240)
        # null=True because select_page stores None for "no target decided yet"
        redirect = models.CharField(max_length=240, default='/', null=True, blank=True)
        keywords = models.CharField(max_length=240, null=True, blank=True)
        counter = models.PositiveIntegerField(default=0, blank=True)

        class Admin:
            list_display = 'id website path redirect counter keywords'.split()
            search_fields = 'path redirect keywords'.split()
            list_filter = ['website']

Finally, the select_page function:

PYTHON:
    import time

    from django import http
    from django.conf import settings

    # Page, Redirect, normpath and bjorn are this project's own models and helpers.

    def select_page(request, path):
        orig_path = path
        path = normpath(request, path)  # make sure there is a slash at the end

        try:
            webpage = Page.objects.get(site=settings.WEBSITE, path=path)

        except Page.DoesNotExist:
            referer = request.META.get('HTTP_REFERER')

            # to handle paths like /page.asp?page=454 we need to make the
            # query string significant...
            q = request.META.get('QUERY_STRING')
            url = orig_path
            if q:
                url += '?' + q

            if not referer:
                # probably search engine...
                r, created = Redirect.objects.get_or_create(website=settings.WEBSITE, path=url)
                if created:
                    r.redirect = None  # default is "/" which might have been a mistake...
                    r.save()
                else:
                    if r.redirect is not None:
                        return select_page(request, r.redirect)

                    r.counter += 1
                    r.save()
                    time.sleep(min(9, 2 * r.counter))  # bad spider!

                raise http.Http404

            # someone probably linked to our old site... try to find some
            # useful content for 'em...
            try:
                r = Redirect.objects.get(website=settings.WEBSITE, path=url,
                                         redirect__isnull=False)
                return select_page(request, r.redirect)

            except Redirect.DoesNotExist:
                # we could use the same strategy as above (simply inserting into the
                # Redirect table), however I like to handle these manually since they
                # are usually a sign that something is very wrong...
                # The bjorn (I'm bjorn, btw.) module makes it trivially easy to send
                # email to myself -- in its simplest form: bjorn.email('hello world')
                bjorn.email(repr(request),
                            subject="missing-page[linked]: %s" % url,
                            use_snippet=False)
                raise http.Http404

        return webpage
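
Two small pieces of select_page are easy to check on their own: how the lookup key is built from the path and query string, and how long a repeat visitor without a referer is made to wait. The sketch below pulls them out into two hypothetical helpers (lookup_url and spider_delay are names made up for illustration; the original does both inline).

```python
def lookup_url(path, query_string):
    """Build the key used for the Redirect lookup: path plus '?query' if present."""
    if query_string:
        return path + '?' + query_string
    return path

def spider_delay(counter):
    """Seconds to sleep for the given hit counter: 2 s per hit, capped at 9 s."""
    return min(9, 2 * counter)

print(lookup_url('page.asp', 'page=454'))       # page.asp?page=454
print(lookup_url('page.asp', ''))               # page.asp
print([spider_delay(n) for n in range(1, 7)])   # [2, 4, 6, 8, 9, 9]
```

The cap matters because the sleep happens inside the request, so an uncapped delay would tie up a server process for longer with every repeated hit.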

The astute reader will notice that we're not in fact redirecting. If a visitor goes to http://example.com/foo.html and the Redirect table contains an entry "foo.html" -> "/", the visitor's browser keeps showing the URL they typed while displaying the content from '/'. While not necessarily a bug, it doesn't help your page statistics, and I'll probably end up changing it later...
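
One possible way to turn this into a real redirect is to separate "which page do I end up at" from "how do I answer the request". The following is a Django-free sketch of that decision, not the code the site runs: resolve, pages, redirects and max_hops are all assumptions made for illustration, with a set and a dict standing in for the Page and Redirect tables.

```python
def resolve(path, pages, redirects, max_hops=5):
    """Decide how to answer a request for `path`.

    pages     -- set of paths that exist (stand-in for the Page table)
    redirects -- dict mapping old path -> new path (stand-in for Redirect)
    Returns (status, location): (200, path), (301, target) or (404, None).
    """
    if path in pages:
        return 200, path

    target = path
    seen = set()
    for _ in range(max_hops):
        # Stop on a loop or when there is no further redirect entry.
        if target in seen or target not in redirects:
            break
        seen.add(target)
        target = redirects[target]
        if target in pages:
            return 301, target
    return 404, None

pages = {'/'}
print(resolve('/', pages, {}))                            # (200, '/')
print(resolve('foo.html', pages, {'foo.html': '/'}))      # (301, '/')
print(resolve('a', pages, {'a': 'b', 'b': 'a'}))          # (404, None)
```

In a Django view the 301 case would typically be answered with django.http.HttpResponsePermanentRedirect(target), so the browser's address bar and the page statistics both reflect the page actually shown. Whether a permanent or a temporary redirect is appropriate depends on how sure you are about the mapping.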

November 13, 2009

django :: implementing reCAPTCHA in 5 easy steps

Filed under: django — tb @ 2:58 am

It turns out reCAPTCHA is very easy to implement :-)

Step #1: install the Python client:

CODE:
    easy_install recaptcha-client

Step #2: obtain public and private keys here: http://recaptcha.net/whyrecaptcha.html.

Step #3: Then create a small helper module, mycaptcha.py, for convenience and to keep your keys in one place...

PYTHON:
    import recaptcha.client.captcha as rc

    public_key = 'your-public-key-from-step-2-goes-here'
    private_key = 'your-private-key-from-step-2-goes-here'

    def displayhtml(use_ssl=False, error=None):
        return rc.displayhtml(public_key, use_ssl, error)

    def submit(request):
        # the reCAPTCHA fields arrive with the form's POST data
        return rc.submit(
            request.POST.get('recaptcha_challenge_field', ''),
            request.POST.get('recaptcha_response_field', ''),
            private_key,
            request.META.get('REMOTE_ADDR', ''))

Step #4: create your Django view

PYTHON:
    from django.http import HttpResponseRedirect
    from django.shortcuts import render_to_response

    import mycaptcha

    def captcha_test_view(request):
        error = None

        if request.method == 'POST':
            response = mycaptcha.submit(request)
            if response.is_valid:
                return HttpResponseRedirect('success-url/')
            else:
                error = response.error_code

        # pass error by keyword; the first positional argument is use_ssl
        recaptcha = mycaptcha.displayhtml(error=error)

        return render_to_response('recaptcha_test.html', {'recaptcha': recaptcha})

Step #5: include the following inside the form element of your html...

HTML:
    <form method=POST ...>
        {{ recaptcha|safe }}
        ...
        <input type=submit>
    </form>
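
The branching in the view from step #4 can be exercised without Django or a round trip to the reCAPTCHA servers. The sketch below is a stand-in, not part of the recipe: FakeCaptchaResponse and handle_submission are hypothetical names, and only the two attributes the view reads (is_valid and error_code) are modelled. The error code string is just an example value.

```python
from collections import namedtuple

# Stand-in for the object returned by mycaptcha.submit(); only the two
# attributes used by the view are modelled.
FakeCaptchaResponse = namedtuple('FakeCaptchaResponse', 'is_valid error_code')

def handle_submission(method, submit):
    """Mirror the view's branching without Django.

    Returns ('redirect', None) on a solved captcha, otherwise
    ('form', error) where error is None or the reported error code.
    """
    error = None
    if method == 'POST':
        response = submit()
        if response.is_valid:
            return 'redirect', None
        error = response.error_code
    return 'form', error

print(handle_submission('GET', lambda: None))
# ('form', None)
print(handle_submission('POST', lambda: FakeCaptchaResponse(True, None)))
# ('redirect', None)
print(handle_submission('POST', lambda: FakeCaptchaResponse(False, 'incorrect-captcha-sol')))
# ('form', 'incorrect-captcha-sol')
```

The point of the third case is that a failed attempt falls through to rendering the form again, this time with the error code handed to displayhtml so the widget can show it.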

May 27, 2009

django :: date filter cheat sheet

Filed under: django — tb @ 12:48 pm

I can't seem to memorize the list of format characters for the date filter, maybe because the documentation lists them in alphabetical order. I've created the cheat sheet below with an attempt at semantic grouping...

Time
    seconds
        s   Seconds, 2 digits (with leading zeros).
    minutes
        i   Minutes (with leading zeros).
    hours (24 hr clock)
        G   Hour, 24-hour format without leading zeros.
        H   Hour, 24-hour format with leading zeros.
    hours (12 hr clock)
        g   Hour, 12-hour format without leading zeros.
        h   Hour, 12-hour format with leading zeros.
        a   a.m. or p.m.
        A   A.M. or P.M.

Date
    year
        y   Year, 2 digits.
        Y   Year, 4 digits.
        z   Day of the year (0 - 365).
    week
        W   ISO-8601 week number.
        w   Day number of the week (0=Sunday, 6=Saturday).
    month
        n   Month number.
        m   Month number, 2 digits (leading zeros).
        b   Month name, lower case, 3 characters.
        M   Month name, 3 characters.
        F   Month name.
    day
        j   Day of the month.
        d   Day of the month, 2 digits (leading zeros).
        D   Day name, 3 characters.
        l   Day name.
        z   Day of the year (0 - 365).
        w   Day number of the week (0=Sunday, 6=Saturday).
    misc. metainfo
        t   Number of days in the given month.
        L   Leap year? (bool).
    special formats
        r   RFC 2822 formatted date.
        U   Seconds since the Unix Epoch (1970-01-01 00:00:00).
        z   Day of the year (0 - 365).
    time zone
        O   Difference to Greenwich time in hours.
        Z   Time zone offset in seconds. The offset for timezones west of UTC is always negative, and for those east of UTC is always positive.
        T   Time zone of this machine ('EST', 'MDT').

English only
        S   English ordinal suffix for day of the month (1st, 2nd).
        P   Time, in 12-hour hours, minutes and 'a.m.'/'p.m.', with minutes left off if they're zero and the special-case strings 'midnight' and 'noon' if appropriate. (Proprietary extension.)
        f   Time, in 12-hour hours and minutes, with minutes left off if they're zero. (Proprietary extension.)
        N   Month abbreviation in Associated Press style. (Proprietary extension.)
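
For a feel of how a format string is put together from these characters, here is a small plain-Python sketch that maps a subset of them onto their strftime equivalents. It is not how Django implements the filter: DJANGO_TO_STRFTIME and format_like_django are names made up for this example, only characters with a direct strftime counterpart are covered, and the month and day names produced by strftime depend on the current locale.

```python
from datetime import datetime

# Subset of date-filter characters with a direct strftime equivalent.
DJANGO_TO_STRFTIME = {
    's': '%S', 'i': '%M', 'H': '%H',
    'y': '%y', 'Y': '%Y',
    'm': '%m', 'd': '%d',
    'M': '%b', 'F': '%B', 'D': '%a', 'l': '%A',
}

def format_like_django(value, fmt):
    """Format `value` using the subset above; other characters are copied as-is."""
    out = []
    for ch in fmt:
        if ch in DJANGO_TO_STRFTIME:
            out.append(value.strftime(DJANGO_TO_STRFTIME[ch]))
        else:
            out.append(ch)
    return ''.join(out)

posted = datetime(2009, 5, 27, 12, 48, 0)
print(format_like_django(posted, 'Y-m-d H:i:s'))   # 2009-05-27 12:48:00
print(format_like_django(posted, 'd/m/y'))         # 27/05/09
```

In a template the first format would be written as {{ value|date:"Y-m-d H:i:s" }}.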