This module defines a number of classes that allow easy and uniform access to mail messages in a (Unix) mailbox.
For maximum portability, messages in a Unix-style mailbox are
separated by any line that begins exactly with the string
'From ' (note the trailing space) if preceded by exactly two newlines.
Because of the wide range of variations in practice, nothing else on
the From_ line should be considered. However, the current
implementation doesn't check for the leading two newlines. This is
fine for most applications.
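The delimiter rule described above can be sketched in a few lines. This is a minimal standalone illustration, not the module's implementation: the function name split_on_from_lines and the sample addresses are made up for the example, and, like the current implementation, the sketch does not check for the leading two newlines.

```python
def split_on_from_lines(text):
    """Split mailbox text into messages at every line starting with 'From '."""
    messages = []
    current = None  # None until the first From_ line has been seen
    for line in text.splitlines(keepends=True):
        if line.startswith("From "):
            # A new From_ line closes the previous message, if any.
            if current is not None:
                messages.append("".join(current))
            current = []  # the From_ line itself is not kept
        elif current is not None:
            current.append(line)
    if current is not None:
        messages.append("".join(current))
    return messages


# Hypothetical two-message mailbox used only for demonstration.
sample = (
    "From alice@example.com Mon Jan  1 00:00:00 2001\n"
    "Subject: first\n"
    "\n"
    "Hello.\n"
    "\n"
    "From bob@example.com Tue Jan  2 00:00:00 2001\n"
    "Subject: second\n"
    "\n"
    "Goodbye.\n"
)
parts = split_on_from_lines(sample)
```

Because only the 'From ' prefix is examined, any unquoted body line that starts with 'From ' would also be treated as a delimiter by this sketch.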
The UnixMailbox class implements a stricter version of From_ line checking, using a regular expression that usually matches From_ delimiters correctly. It considers messages to be separated by "From name time" lines. For maximum portability, use the PortableUnixMailbox class instead. This class is identical to UnixMailbox except that individual messages are separated by only "From " lines.
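The difference between the strict and the portable check can be illustrated as follows. The regular expression below is a hypothetical pattern written for this example only; it is not the expression used by UnixMailbox, and the helper names are likewise assumptions.

```python
import re

# Hypothetical "From name time" pattern, for illustration only.
# It expects: From <name> <weekday> <month> <day> <hh:mm[:ss]> <year>
STRICT_FROM = re.compile(
    r"From \S+\s+\w{3}\s+\w{3}\s+\d{1,2}\s+\d{1,2}:\d{2}(:\d{2})?\s+\d{4}\s*$"
)


def is_strict_delimiter(line):
    """Strict check: the whole line must look like 'From name time'."""
    return STRICT_FROM.match(line) is not None


def is_portable_delimiter(line):
    """Portable check: only the leading 'From ' is considered."""
    return line.startswith("From ")


delimiter = "From alice@example.com Mon Jan  1 00:00:00 2001\n"
body_line = "From the start, it went badly.\n"
```

With these helpers, the delimiter line passes both checks, while the body line passes only the portable one. The portable check therefore relies on such body lines having been quoted before they reach the mailbox.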
For more information, see Configuring Netscape Mail on Unix: Why the Content-Length Format is Bad.
Lines within a message that begin with 'From ' are quoted by mail handling software at delivery time.
In the Babyl format, the original headers appear before a line containing only '*** EOOH ***' (End-Of-Original-Headers) and the visible headers appear after the EOOH line. Babyl-compliant mail readers will show you only the visible headers, and BabylMailbox objects will return messages containing only the visible headers. You'll have to do your own parsing of the mailbox file to get at the original headers. Mail messages start with the EOOH line and end with a line containing only '\037\014'. The factory argument is as with the UnixMailbox class.
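The message boundaries described above (start at the EOOH line, end at a line containing only '\037\014') can be sketched as follows. This is a simplified illustration, not the BabylMailbox implementation: the function name visible_messages and the sample text are hypothetical, and real Babyl files may contain additional structure that is not modelled here.

```python
EOOH = "*** EOOH ***"
END = "\037\014"


def visible_messages(text):
    """Return the visible part (after EOOH, before the end marker) of each message."""
    messages = []
    current = None  # None while outside the visible part of a message
    # Split on "\n" only: str.splitlines() would also break on "\014".
    for line in text.split("\n"):
        if current is None:
            if line == EOOH:
                current = []  # visible headers start on the next line
        elif line == END:
            messages.append("\n".join(current))
            current = None
        else:
            current.append(line)
    return messages


# Hypothetical single-message sample used only for demonstration.
sample = (
    "Received: by relay.example.com\n"
    "Subject: original subject\n"
    "*** EOOH ***\n"
    "Subject: visible subject\n"
    "\n"
    "Body text.\n"
    "\037\014\n"
)
visible = visible_messages(sample)
```

The original headers before the EOOH line are skipped, which mirrors the statement that only the visible headers are returned.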
Note that because the rfc822 module is deprecated, it is recommended that you use the email package to create message objects from a mailbox. (The default can't be changed for backwards compatibility reasons.) The safest way to do this is with the following bit of code:
import email
import email.Errors
import mailbox

def msgfactory(fp):
    try:
        return email.message_from_file(fp)
    except email.Errors.MessageParseError:
        # Don't return None since that will
        # stop the mailbox iterator
        return ''

mbox = mailbox.UnixMailbox(fp, msgfactory)
The above wrapper is defensive against ill-formed MIME messages in the mailbox, but you have to be prepared to receive the empty string from the mailbox's next() method. On the other hand, if you know your mailbox contains only well-formed MIME messages, you can simplify this to:
import email
import mailbox

mbox = mailbox.UnixMailbox(fp, email.message_from_file)
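When the defensive factory is used, the consumer has to cope with the empty strings it returns for ill-formed messages. One possible way to do that is sketched below. The helper name skip_unparsed is hypothetical, and a plain list stands in for the mailbox so that the example is self-contained.

```python
import email


def skip_unparsed(messages):
    """Yield only real message objects, dropping '' placeholders."""
    for msg in messages:
        if msg == "":
            # The defensive factory returned '' for an ill-formed message.
            continue
        yield msg


# Stand-in for what iterating over a mailbox might produce (assumption).
items = [
    email.message_from_string("Subject: first\n\nHello.\n"),
    "",
    email.message_from_string("Subject: second\n\nGoodbye.\n"),
]
subjects = [msg["Subject"] for msg in skip_unparsed(items)]
```

Filtering in a small generator keeps the iteration going past bad messages, which is the reason the factory returns '' instead of None.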