Updated 2012-08-29 10:59:35 by RLE

Documentation can be found at http://tcllib.sourceforge.net/doc/pop3d.html .

POP3 Daemon , i.e. Server.

This tcllib module provides three packages which allow the construction of pop3 email servers.

AMG: I wrote a simple pop3 daemon for use with gmail pop3 import. It can be used for importing mbox, maildir, or any other email format into gmail. I'm sure it can also be adapted for other purposes.

Since this code only supports one user, and since pop3 disallows concurrent connections to a single maildrop, my code only accepts one connection at a time. Authentication is done through USER/PASS. I guess I could have gotten clever and implemented APOP, but I didn't. :^) UIDL is implemented only because gmail requires it; it uses sha1c.tcl from tcllib for a tremendous performance boost. The entire mailbox is stored in a list, and you have to customize the script to configure where the actual email is read from. DELE deletes from the in-memory mailbox, but restarting the program causes all emails to be restored. Everything's logged to stdout, so you can see the progress of gmail importing. Most importantly, you can tell when it's done, because that's when it's time to stop running the server to plug the gaping security hole. :^)

Source code:
#!/bin/sh
# The next line restarts with tclsh.\
exec tclsh "$0" ${1+"$@"}

package require Tcl 8.5
lappend auto_path [file join [pwd] critcl.vfs lib]
package require sha1
source sha1c.tcl

set bindaddr 0          ;# Address of interface to listen on, or 0 for any.
set bindport 110        ;# TCP port number to listen on.
set username google     ;# Username.  Unset this variable to allow any.
set password IMPORT     ;# Password.  Unset this variable to allow any.

# Load mailbox from disk.  Replace this code as necessary.
set mailbox {}
foreach file [glob mail/*] {
    set chan [open $file]
    lappend mailbox [string map {\n \r\n} [read $chan]]
    close $chan
}

# Timestamp and log a message to stdout.
proc log {msg} {
    set time [clock microseconds]
    puts [format "%s.%06d %s" [clock format [expr {$time / 1000000}]\
        -format {%Y-%m-%d %H:%M:%S} -timezone :UTC]\
        [expr {$time % 1000000}] $msg]
}

# Process an incoming connection.  This procedure doesn't return until the
# connection has been terminated, so only one connection at a time is allowed.
proc accept {chan peeraddr peerport} {
    global username password mailbox

    log "connected to $peeraddr:$peerport"
    fconfigure $chan -translation crlf -buffering line
    set userauth 0
    set passauth 0
    set deletes {}
    puts $chan +OK
    while {[gets $chan input] >= 0} {
        log "received $input"
        if {![regexp {^(\S+)(?: (.*))?$} $input _ command arguments]
         || ($command ni {USER PASS QUIT} && (!$userauth || !$passauth))} {
            log "bad or unauthorized command"
            puts $chan -ERR
            continue
        }
        switch -- $command {
        USER {
            if {[info exists username] && $arguments ne $username} {
                log "bad username"
                # Don't admit to client that username is bad, or else client
                # can repeatedly use this command to query for valid usernames.
            } else {
                set userauth 1
            }
            puts $chan +OK
        } PASS {
            if {[info exists password] && $arguments ne $password} {
                log "bad password"
                puts $chan -ERR
                break
            } else {
                set passauth 1
                puts $chan +OK
            }
        } STAT {
            set size 0
            foreach mail $mailbox {
                incr size [string length $mail]
            }
            puts $chan "+OK [llength $mailbox] $size"
        } LIST {
            if {$arguments eq ""} {
                puts $chan +OK
                set num 0
                foreach mail $mailbox {
                    incr num
                    puts $chan "$num [string length $mail]"
                }
                puts $chan .
            } elseif {![scan $arguments %d num]
                   || $num <= 0 || $num > [llength $mailbox]} {
                log "bad LIST command"
                puts $chan -ERR
            } else {
                puts $chan "+OK $num [string length\
                    [lindex $mailbox [expr {$num - 1}]]]"
            }
        } RETR {
            if {![scan $arguments %d num]
             || $num <= 0 || $num > [llength $mailbox]} {
                log "bad RETR command"
                puts $chan -ERR
            } else {
                fconfigure $chan -translation binary
                puts $chan +OK\r\n[regsub -all -line {^\.}\
                    [lindex $mailbox [expr {$num - 1}]] ..]\r\n.\r
                fconfigure $chan -translation crlf
            }
        } DELE {
            if {![scan $arguments %d num]
             || $num <= 0 || $num > [llength $mailbox]} {
                log "bad DELE command"
                puts $chan -ERR
            } else {
                lappend deletes [expr {$num - 1}]
                puts $chan +OK
            }
        } NOOP {
            puts $chan +OK
        } RSET {
            set deletes {}
            puts $chan +OK
        } QUIT {
            puts $chan +OK
            break
        } UIDL {
            if {$arguments eq ""} {
                puts $chan +OK
                set num 0
                foreach mail $mailbox {
                    incr num
                    puts $chan "$num [sha1::Hex [sha1::sha1c $mail]]"
                }
                puts $chan .
            } elseif {![scan $arguments %d num]
                   || $num <= 0 || $num > [llength $mailbox]} {
                log "bad UIDL command"
                puts $chan -ERR
            } else {
                puts $chan "+OK $num [sha1::Hex [sha1::sha1c\
                    [lindex $mailbox [expr {$num - 1}]]]]"
            }
        } default {
            log "unrecognized command"
            puts $chan -ERR
        }}
    }
    close $chan
    log "disconnected from $peeraddr:$peerport"
    foreach num [lsort -decreasing -integer -unique $deletes] {
        set mailbox [lreplace $mailbox $num $num]
    }
}

socket -server accept -myaddr $bindaddr $bindport
log "listening for connections on $bindaddr:$bindport"

vwait forever

# vim: set sts=4 sw=4 tw=80 et ft=tcl:

Example log:
2010-03-29 02:01:58.603649 received DELE 283
2010-03-29 02:01:58.667210 received DELE 288
2010-03-29 02:01:58.718815 received DELE 282
2010-03-29 02:01:58.782199 received DELE 281
2010-03-29 02:01:58.846147 received DELE 280
2010-03-29 02:01:58.909667 received QUIT
2010-03-29 02:01:58.929500 disconnected from 209.85.221.14:33859
2010-03-29 02:03:17.996801 connected to 209.85.221.29:46246
2010-03-29 02:03:18.056628 received USER google
2010-03-29 02:03:18.112344 received PASS IMPORT
2010-03-29 02:03:18.166294 received CAPA
2010-03-29 02:03:18.167507 unrecognized command
2010-03-29 02:03:18.223426 received LIST
2010-03-29 02:03:18.566214 received UIDL
2010-03-29 02:03:25.604822 received RETR 279
2010-03-29 02:03:26.188850 received RETR 278
2010-03-29 02:03:26.336384 received RETR 277
2010-03-29 02:03:26.542660 received RETR 276
2010-03-29 02:03:26.719905 received RETR 275
2010-03-29 02:03:26.933912 received RETR 274

gmail reads mails in descending order, then issues delete commands for those mails. The order of delete commands appears to be slightly randomized. 200 mails are read at a time before disconnecting. Each time it connects, it may use a different IP address. gmail uses UIDL to avoid redownloading emails it already has.

There is one email in my archive that gmail refused to import. It's from MS, in fact.

The message "oops" from miguel sofer (mig@youteedeetee.edu) contained a virus or a suspicious attachment. It was therefore not fetched from your account google@myhostname and has been left on the server.

The attachment was tcl8.3.4-flow2.tar.gz. It seems gmail scanned it and determined that it had a few bugs. :^)