I was checking out a number of 37signals’ webapps today (Highrise, Basecamp, and a few others… well worth a look) when I discovered I could sign in with OpenID. After a little googling and watching an excellent screencast over at Simon Willison's weblog, I figured out what it was and managed to turn my other site into an OpenID for myself. I might take a stab at implementing OpenID login for the site I’m working on, although it might not be the right audience… So what is it, you ask? Go watch the screencast; it explains it better than I ever could.
It looks like Python is getting tuples with named members in 2.6 (http://www.oluyede.org/blog/2007/03/11/updates-from-python-svn-part-2/ and http://docs.python.org/dev/lib/named-tuple-factory.html). I suspect many of us have implemented similar functionality ourselves; e.g. Shannon -jj Behrens describes how he sometimes uses dictionaries to return composite polymorphic values (http://jjinux.blogspot.com/2007/03/python-returning-multiple-things-of.html). The problem with dictionaries is of course that they require too much exercise of your little finger in typing ['xx']. That's even worse for me, since I'm using a keyboard layout that switches national characters onto those keys when I tap the caps-lock key, so I can use my American-keyboard touch typing skillz and eat my national characters as well (I'm looking forward to the Metaphor-off!).
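To make the comparison concrete, here is a minimal sketch of the two styles. It assumes the named tuple factory in the form it was eventually released, `collections.namedtuple`; the `Point` type and its fields are made-up names for illustration only.

```python
from collections import namedtuple

# Returning a dictionary: every access needs the ['...'] syntax.
as_dict = {'x': 1, 'y': 2}
print(as_dict['x'] + as_dict['y'])

# Returning a named tuple: attribute access, plus indexing and unpacking.
# `Point`, `x` and `y` are hypothetical names chosen for this example.
Point = namedtuple('Point', ['x', 'y'])
pt = Point(x=1, y=2)
print(pt.x + pt.y)
print(pt[0], pt[1])
x, y = pt

# The set of fields is fixed when the type is created, so assigning
# to a new attribute fails.
try:
    pt.z = 3
except AttributeError:
    print('cannot add field z')
```

The last few lines show the restriction that the next paragraph is about.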
It looks like the new Python NamedTuple type is going to limit the fields to those that are defined at creation time. It's based on a tuple, so I suppose that follows naturally; however, it doesn't seem natural for the abstract data type of a container of named fields with iteration and indexing. I've called my implementation of this ADT a property set, since most of the motivating use cases were returning values that had properties attached to them. The use is as follows...
The only limitation on field names is that they cannot start with an underscore, but public fields wouldn't have that anyway, so it's not really a limitation. (It comes from the fact that the implementation overrides __setattr__, and being able to interpret names starting with an underscore as internal to the implementation simplifies things quite a bit.) Otherwise you can assign to arbitrary fields:
PYTHON:
>>> p = pset()
>>> p.a = 42
>>> p.b = 'hello'
>>> p.c = [p.a, p.b]
>>> p
pset(a=42, b='hello', c=[42, 'hello'])
You can iterate over the key/value pairs:
PYTHON:
>>> for key, value in p:
...     print key, value
...
a 42
b hello
c [42, 'hello']
Notice that it maintains the insertion order. You can also access fields by index, as shown further down.
For technical reasons (keyword arguments arrive in an ordinary, unordered dictionary) it is not possible to maintain the order when creating a pset from keyword arguments. I hesitated to put this functionality in, but practicality beats purity, and it's turned out to be very practical. Equality does not depend on field order, which means that as long as the sets have the same fields with equal values they compare equal:
PYTHON:
>>> q = pset(a=42, b='hello', c=[42,'hello'])
>>> q
pset(a=42, c=[42, 'hello'], b='hello')
>>> p == q
True
You can keep the order given to the constructor by initializing with a list of tuples:
PYTHON:
>>> list(p.items())
[('a', 42), ('b', 'hello'), ('c', [42, 'hello'])]
>>> r = pset(p.items())
>>> r
pset(a=42, b='hello', c=[42, 'hello'])
Of course, you can also create a pset from a pset directly (this maintains order as well):
PYTHON:
>>> s = pset(p)
>>> s
pset(a=42, b='hello', c=[42, 'hello'])
It's also extremely useful to be able to use indexing notation, either by position or by field name:
PYTHON:
>>> p
pset(a=42, b='hello', c=[42, 'hello'])
>>> p.b
'hello'
>>> p[1]
'hello'
>>> p['b']
'hello'
>>> p['b'] = 'world'
>>> p
pset(a=42, b='world', c=[42, 'hello'])
>>> p[1] = 'foo'
>>> p
pset(a=42, b='foo', c=[42, 'hello'])
Here's the code:
PYTHON:
class pset(dict):
    """This code is placed in the Public Domain.

    Property Set class.
    A property set is an object where values are attached to attributes,
    but can still be iterated over as key/value pairs.
    The order of assignment is maintained during iteration.
    Only one value allowed per key.

    >>> x = pset()
    >>> x.a = 42
    >>> x.b = 'foo'
    >>> x.a = 314
    >>> x
    pset(a=314, b='foo')
    """
    def __init__(self, items=(), **attrs):
        # _order records the keys in insertion order. It is set through
        # object.__setattr__ to bypass our own __setattr__.
        object.__setattr__(self, '_order', [])
        super(pset, self).__init__()
        for k, v in items:
            self.add(k, v)
        for k, v in attrs.items():
            self.add(k, v)

    def add(self, key, value):
        # An integer key is a position in insertion order.
        if type(key) in (int, long):
            key = self._order[key]
        elif key not in self._order:
            self._order.append(key)
        dict.__setitem__(self, key, value)

    def __eq__(self, other):
        """Equal iff they have the same set of keys, and the values for
        each key is equal. Key order is not considered for equality.
        """
        if set(self._order) == set(other._order):
            for key in self._order:
                if self[key] != other[key]:
                    return False
            return True
        return False

    def __iadd__(self, other):
        for k, v in other:
            self.add(k, v)
        # In-place operators must return the result; without this,
        # ``tmp += self`` in __add__ would rebind tmp to None.
        return self

    # should probably have an __radd__ method too...
    def __add__(self, other):
        tmp = self.__class__()
        tmp += self
        tmp += other
        return tmp

    def __repr__(self):
        vals = ', '.join('%s=%s' % (k, repr(v)) for (k,v) in self)
        return '%s(%s)' % (self.__class__.__name__, vals)

    def __getattr__(self, key):
        if key not in self:
            raise AttributeError(key)
        return self.get(key)

    def __getitem__(self, key):
        if type(key) in (int, long):
            key = self._order[key]
        return self.get(key)

    __str__ = __repr__

    def __iter__(self):
        # Iteration yields (key, value) pairs in insertion order.
        return ((k, self.get(k)) for k in self._order)

    def items(self):
        return iter(self)

    def __setattr__(self, key, val):
        # Names starting with an underscore are internal to the
        # implementation; everything else becomes a field.
        if key.startswith('_'):
            object.__setattr__(self, key, val)
        else:
            self.add(key, val)

    def __setitem__(self, key, val):
        self.add(key, val)
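The listing above is written for the Python of the day (the `long` type, the print statement). For readers on a current interpreter, here is a rough sketch of the same idea, not a drop-in port. It assumes Python 3.7 or later, where plain dictionaries and keyword arguments both keep insertion order, so the separate `_order` list and the keyword-argument caveat go away. Two deliberate differences from the original: a missing key raises KeyError instead of returning None, and equality is simply the inherited dictionary equality (which already ignores order).

```python
class pset(dict):
    """Property set sketch for Python 3.7+ (relies on ordered dicts)."""

    def __init__(self, items=(), **attrs):
        super().__init__()
        for k, v in items:
            self[k] = v
        for k, v in attrs.items():
            self[k] = v

    def _resolve(self, key):
        # An integer key is a position in insertion order.
        if isinstance(key, int):
            return list(dict.keys(self))[key]
        return key

    def __setitem__(self, key, value):
        dict.__setitem__(self, self._resolve(key), value)

    def __getitem__(self, key):
        return dict.__getitem__(self, self._resolve(key))

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        try:
            return dict.__getitem__(self, name)
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        # Underscore names stay ordinary attributes; the rest are fields.
        if name.startswith('_'):
            object.__setattr__(self, name, value)
        else:
            self[name] = value

    def __iter__(self):
        # Iterate over (key, value) pairs, like the original.
        return iter(dict.items(self))

    def items(self):
        return iter(self)

    def __iadd__(self, other):
        for k, v in other:
            self[k] = v
        return self

    def __add__(self, other):
        tmp = type(self)()
        tmp += self
        tmp += other
        return tmp

    def __repr__(self):
        vals = ', '.join('%s=%r' % (k, v) for k, v in self)
        return '%s(%s)' % (type(self).__name__, vals)


p = pset()
p.a = 42
p.b = 'hello'
p.c = [p.a, p.b]
print(p)                      # pset(a=42, b='hello', c=[42, 'hello'])
p[1] = 'foo'                  # assign by position
print(p.b, p['b'], p[1])      # foo foo foo
q = pset(c=[42, 'hello'], b='foo', a=42)
print(p == q)                 # True: field order is ignored
print(p + pset(d=1))          # pset(a=42, b='foo', c=[42, 'hello'], d=1)
```

As in the original, a field whose name collides with a dictionary method (`items`, `get`, and so on) is only reachable through indexing, since normal attribute lookup finds the method first.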
In-house staff has been getting spoiled lately by the simplicity of the search box in the Django admin interface. My favorite comment from last week was "the stupid thing doesn't even search the middle name field when I enter a name, I wish it could be more like the Django", in reference to an application for which we're paying a significant amount in licensing fees. (On a side note, I would suggest that everyone re-brand the admin pages -- my users are now convinced Django is the stuff that I've put on their admin page...)
For a number of reasons, however, I'm starting to swap out parts of the admin interface for my own home-made pages. While the admin pages have served us well so far, we've gotten to the point where the feature requests require more custom work than can be integrated into the admin interface. It's a good thing. It means users are thinking about how things can be done better, and it's also the way the Django admin interface is supposed to be used -- allowing everyone to focus on something besides basic admin functionality until later in a project.
It would seem important, though, not to take a step backward and lose functionality such as the simplicity of a single search box. I was too lazy to go look at the source, but a quick googling only turned up an entry from Petro Verkhogliad (http://petro.tanreisoftware.com/?p=22) that only deals with searching a single field (and a suggestion for an algorithm to search multiple fields that materializes way too much data to be practical). The second entry I found was from Steven Ametjan (http://www2.wolfsreign.com/archives/2007/01/22/writing-search-view-django/). His solution is unfortunately buggy if there is more than one search term.
A correct version should look something like this:
PYTHON:
from django.db.models import Q

# Customer is the application's own model, imported from its models module.

def search(terms=None):
    if terms is None:
        return Customer.objects.all()

    query = Customer.objects
    for term in terms:
        # Each filter() call is ANDed with the previous ones, so every
        # term has to match at least one of the fields.
        query = query.filter(
            Q(fname__icontains=term)
            | Q(lname__icontains=term)
            | Q(email__icontains=term)
            | Q(zipcode__icontains=term)
            | Q(birthdate=term))
    return query
That works ok, but searching the birthdate field requires using database syntax to enter dates (2007-03-25). People get very upset when they can't enter dates in their local format (I can't emphasize this too much). Where I'm sitting right now, dates are always entered as dd.mm.yyyy. We can fix this problem though, and at the same time make our searches faster by utilizing some domain knowledge. In this case we know that we don't need to search in non-date fields for terms that are dates (nor for non-date data in date fields). The only numeric column we're searching is the zipcode column, so we can limit terms that are matched against this column as well, and we end up with something like this:
PYTHON:
import datetime

from django.db.models import Q

# Customer is the application's own model, imported from its models module.

def possible_zipcode(v):
    try:
        int(v)
        return True
    except ValueError:
        return False

def local_date_format(v):
    # Accept only dd.mm.yyyy strings that name a real calendar date.
    if len(v) == 10 and len(v.split('.')) == 3:
        try:
            day, month, year = map(int, v.split('.'))
            datetime.date(year, month, day)
            return True
        except ValueError:
            pass
    return False


def search(terms=None):
    if terms is None:
        return Customer.objects.all()

    query = Customer.objects
    for term in terms:
        # Route each term to the only column(s) it could possibly match.
        if possible_zipcode(term):
            query = query.filter(zipcode=term)
        elif local_date_format(term):
            day, month, year = map(int, term.split('.'))
            query = query.filter(birthdate=datetime.date(year, month, day))
        elif '@' in term:
            query = query.filter(email__icontains=term)
        else:
            query = query.filter(
                Q(fname__icontains=term)
                | Q(lname__icontains=term))
    return query
For my real code running against real data, this approach turned out to be an order of magnitude faster than Django's generic algorithm... (In all fairness to Django, I should probably mention that the only time Django's search box isn't almost instantaneous is when I'm running the admin interface on my personal machine against our production database over a VPN connection from home.)
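The term classification in the second search function can be tried out without a database. The sketch below is only an illustration of that routing step: `parse_local_date` and `classify` are hypothetical helper names, not part of Django or of the code above, and `datetime.strptime` stands in for the hand-rolled length-and-split check, so the exact set of accepted date strings may differ slightly. It is written for Python 3.

```python
import datetime


def parse_local_date(term):
    """Return a date for a dd.mm.yyyy term, or None if it does not parse."""
    try:
        return datetime.datetime.strptime(term, '%d.%m.%Y').date()
    except ValueError:
        return None


def classify(term):
    """Return (kind, value) for one search term, mirroring the branches
    of the search function: zipcode, birthdate, email, then name."""
    if term.isdigit():
        return ('zipcode', term)
    date = parse_local_date(term)
    if date is not None:
        return ('birthdate', date)
    if '@' in term:
        return ('email', term)
    return ('name', term)


# Made-up sample terms, purely for illustration.
for term in ['0150', '25.03.2007', 'ola@example.com', 'Nordmann']:
    print(term, '->', classify(term))
```

Returning the parsed date from the helper avoids splitting the term a second time inside the search loop, and an impossible date such as 31.02.2007 simply falls through to the name branch.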