AMG: I wrote a simple pop3 daemon for use with gmail pop3 import. It can be used for importing mbox, maildir, or any other email format into gmail. I'm sure it can also be adapted for other purposes.Since this code only supports one user, and since pop3 disallows concurrent connections to a single maildrop, my code only accepts one connection at a time. Authentication is done through USER/PASS. I guess I could have gotten clever and implemented APOP, but I didn't. :^) UIDL is implemented only because gmail requires it; it uses sha1c.tcl from tcllib for a tremendous performance boost. The entire mailbox is stored in a list, and you have to customize the script to configure where the actual email is read from. DELE deletes from the in-memory mailbox, but restarting the program causes all emails to be restored. Everything's logged to stdout, so you can see the progress of gmail importing. Most importantly, you can tell when it's done, because that's when it's time to stop running the server to plug the gaping security hole. :^)Source code:
#!/bin/sh # The next line restarts with tclsh.\ exec tclsh "$0" ${1+"$@"} package require Tcl 8.5 lappend auto_path [file join [pwd] critcl.vfs lib] package require sha1 source sha1c.tcl set bindaddr 0 ;# Address of interface to listen on, or 0 for any. set bindport 110 ;# TCP port number to listen on. set username google ;# Username. Unset this variable to allow any. set password IMPORT ;# Password. Unset this variable to allow any. # Load mailbox from disk. Replace this code as necessary. set mailbox {} foreach file [glob mail/*] { set chan [open $file] lappend mailbox [string map {\n \r\n} [read $chan]] close $chan } # Timestamp and log a message to stdout. proc log {msg} { set time [clock microseconds] puts [format "%s.%06d %s" [clock format [expr {$time / 1000000}]\ -format {%Y-%m-%d %H:%M:%S} -timezone :UTC]\ [expr {$time % 1000000}] $msg] } # Process an incoming connection. This procedure doesn't return until the # connection has been terminated, so only one connection at a time is allowed. proc accept {chan peeraddr peerport} { global username password mailbox log "connected to $peeraddr:$peerport" fconfigure $chan -translation crlf -buffering line set userauth 0 set passauth 0 set deletes {} puts $chan +OK while {[gets $chan input] >= 0} { log "received $input" if {![regexp {^(\S+)(?: (.*))?$} $input _ command arguments] || ($command ni {USER PASS QUIT} && (!$userauth || !$passauth))} { log "bad or unauthorized command" puts $chan -ERR continue } switch -- $command { USER { if {[info exists username] && $arguments ne $username} { log "bad username" # Don't admit to client that username is bad, or else client # can repeatedly use this command to query for valid usernames. } else { set userauth 1 } puts $chan +OK } PASS { if {[info exists password] && $arguments ne $password} { log "bad password" puts $chan -ERR break } else { set passauth 1 puts $chan +OK } } STAT { set size 0 foreach mail $mailbox { incr size [string length $mail] } puts $chan "+OK [llength $mailbox] $size" } LIST { if {$arguments eq ""} { puts $chan +OK set num 0 foreach mail $mailbox { incr num puts $chan "$num [string length $mail]" } puts $chan . } elseif {![scan $arguments %d num] || $num <= 0 || $num > [llength $mailbox]} { log "bad LIST command" puts $chan -ERR } else { puts $chan "+OK $num [string length\ [lindex $mailbox [expr {$num - 1}]]]" } } RETR { if {![scan $arguments %d num] || $num <= 0 || $num > [llength $mailbox]} { log "bad RETR command" puts $chan -ERR } else { fconfigure $chan -translation binary puts $chan +OK\r\n[regsub -all -line {^\.}\ [lindex $mailbox [expr {$num - 1}]] ..]\r\n.\r fconfigure $chan -translation crlf } } DELE { if {![scan $arguments %d num] || $num <= 0 || $num > [llength $mailbox]} { log "bad DELE command" puts $chan -ERR } else { lappend deletes [expr {$num - 1}] puts $chan +OK } } NOOP { puts $chan +OK } RSET { set deletes {} puts $chan +OK } QUIT { puts $chan +OK break } UIDL { if {$arguments eq ""} { puts $chan +OK set num 0 foreach mail $mailbox { incr num puts $chan "$num [sha1::Hex [sha1::sha1c $mail]]" } puts $chan . } elseif {![scan $arguments %d num] || $num <= 0 || $num > [llength $mailbox]} { log "bad UIDL command" puts $chan -ERR } else { puts $chan "+OK $num [sha1::Hex [sha1::sha1c\ [lindex $mailbox [expr {$num - 1}]]]]" } } default { log "unrecognized command" puts $chan -ERR }} } close $chan log "disconnected from $peeraddr:$peerport" foreach num [lsort -decreasing -integer -unique $deletes] { set mailbox [lreplace $mailbox $num $num] } } socket -server accept -myaddr $bindaddr $bindport log "listening for connections on $bindaddr:$bindport" vwait forever # vim: set sts=4 sw=4 tw=80 et ft=tcl:Example log:
2010-03-29 02:01:58.603649 received DELE 283 2010-03-29 02:01:58.667210 received DELE 288 2010-03-29 02:01:58.718815 received DELE 282 2010-03-29 02:01:58.782199 received DELE 281 2010-03-29 02:01:58.846147 received DELE 280 2010-03-29 02:01:58.909667 received QUIT 2010-03-29 02:01:58.929500 disconnected from 209.85.221.14:33859 2010-03-29 02:03:17.996801 connected to 209.85.221.29:46246 2010-03-29 02:03:18.056628 received USER google 2010-03-29 02:03:18.112344 received PASS IMPORT 2010-03-29 02:03:18.166294 received CAPA 2010-03-29 02:03:18.167507 unrecognized command 2010-03-29 02:03:18.223426 received LIST 2010-03-29 02:03:18.566214 received UIDL 2010-03-29 02:03:25.604822 received RETR 279 2010-03-29 02:03:26.188850 received RETR 278 2010-03-29 02:03:26.336384 received RETR 277 2010-03-29 02:03:26.542660 received RETR 276 2010-03-29 02:03:26.719905 received RETR 275 2010-03-29 02:03:26.933912 received RETR 274gmail reads mails in descending order, then issues delete commands for those mails. The order of delete commands appears to be slightly randomized. 200 mails are read at a time before disconnecting. Each time it connects, it may use a different IP address. gmail uses UIDL to avoid redownloading emails it already has.There is one email in my archive that gmail refused to import. It's from MS, in fact.The message "oops" from miguel sofer (mig@youteedeetee.edu) contained a virus or a suspicious attachment. It was therefore not fetched from your account google@myhostname and has been left on the server.The attachment was tcl8.3.4-flow2.tar.gz. It seems gmail scanned it and determined that it had a few bugs. :^)