This is a Python implementation of Jamie Zawinski's threading algorithm, which takes a bunch of e-mail messages and groups them into threads.

The module contains a thread() function that takes a list of Message objects. Message is a class defined in this module, not the rfc822.Message class; a make_message() function takes an rfc822.Message object, reads its 'References:' and 'In-Reply-To:' headers, and returns a jwzthreading.Message object. thread() returns a dictionary mapping subject lines to Container objects, each of which is the root of a tree containing the threaded messages.
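
To make the header handling concrete, here's a rough sketch of the kind of extraction make_message() is described as doing. It is not the module's own code: it uses the standard email package rather than rfc822, extract_thread_ids() is a made-up name, and treating the first In-Reply-To ID as a final reference is an assumption taken from Zawinski's description of the algorithm.

```python
import re
from email import message_from_string

# Matches anything that looks like <some-id> inside a header value.
MSGID_RE = re.compile(r"<([^>]+)>")


def extract_thread_ids(msg):
    """Hypothetical helper (not part of jwzthreading).

    Returns (message_id, references) for a parsed e-mail message.
    """
    match = MSGID_RE.search(msg.get("Message-ID", ""))
    message_id = match.group(1) if match else None

    # References lists ancestors, oldest first.
    references = MSGID_RE.findall(msg.get("References", ""))

    # Assumption: the first ID in In-Reply-To is the direct parent, so it
    # is appended last unless References already mentions it.
    in_reply_to = MSGID_RE.findall(msg.get("In-Reply-To", ""))
    if in_reply_to and in_reply_to[0] not in references:
        references.append(in_reply_to[0])

    return message_id, references


raw = (
    "Message-ID: <id_b>\n"
    "References: <id_a>\n"
    "In-Reply-To: <id_a>\n"
    "Subject: Re: subject\n"
    "\n"
    "body\n"
)
message_id, references = extract_thread_ids(message_from_string(raw))
```

The resulting values are the sort of thing you would assign to a Message's message_id and references attributes before calling thread().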

Here's an example:

bash-2.05a$ python
Python 2.3a0 (#6, Aug 23 2002, 11:59:34) 
[GCC 3.1 20020620 (Red Hat Linux Rawhide 3.1-7)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import jwzthreading as jwz
>>> m1 = jwz.Message()
>>> m1.message = "message body, or object, or whatever"
>>> m1.subject = 'Subject'
>>> m1.message_id = "id_a" 
>>> m2 = jwz.Message()
>>> m2.message_id = 'id_b' ; m2.references = ['id_a']
>>> m2.subject = 'Re: subject'
>>> d = jwz.thread([m1, m2])
>>> d    # Returned value maps subject strings to Containers
{'Subject': <Container 401f370c: <Message: 'id_a'>>}
>>> root = d['Subject']        # Get root of a thread
>>> root.message               # It has the jwz.Message as .message ...
<Message: 'id_a'>
>>> root.children              # ... and a list of children.
[<Container 401f35ec: <Message: 'id_b'>>]
>>> 
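
Since each Container has a .message and a list of .children, walking a thread is a simple recursion. The sketch below is only an illustration: Container here is a small stand-in class with those two attributes, not the one from the module, and format_thread() is a hypothetical helper.

```python
class Container:
    """Stand-in with the two attributes shown in the session above."""

    def __init__(self, message=None):
        self.message = message
        self.children = []


def format_thread(container, depth=0):
    """Return one indented line per message, parents before replies."""
    lines = ["  " * depth + repr(container.message)]
    for child in container.children:
        lines.extend(format_thread(child, depth + 1))
    return lines


# Build the same two-message shape as the session: id_b replies to id_a.
root = Container("id_a")
root.children.append(Container("id_b"))

outline = format_thread(root)
print("\n".join(outline))
```

With the real module, the same function should work on each value in the dictionary returned by thread(), assuming the Containers expose .message and .children as shown.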

Download the code

jwzthreading-0.91.tar.gz [3K] [Signature]

