package require XOTcl namespace import xotcl::* Class ZipArchive @ ::ZipArchive idemeta component ZipArchive ZipArchive instproc addFile {inputFile fileName} { # inputFile - source file to archive # fileName - name of file in the archive my lappend files $inputFile $fileName } ZipArchive instproc addToStream {stream fin fout} { my instvar written set offset $written set fdata [open $fin r] fconfigure $fdata -encoding binary -translation binary set data [read $fdata] set datacompresed [string range [::vfs::zip -mode compress $data] 2 end-4] close $fdata binary scan \x04\x03\x4B\x50 I LFH_SIG my writeLong $stream $LFH_SIG incr written 4 my writeShort $stream 20 # java implementation make 8 # but tools (WinZip) leave it 0 my writeShort $stream 0 incr written 4 my writeShort $stream 8 incr written 2 # last mod. time and date set dosTime [my toDosTime $fin] my writeLong $stream $dosTime incr written 4 set crc [::vfs::crc $data] set csize [string length $datacompresed] set size [string length $data] my writeLong $stream $crc my writeLong $stream $csize my writeLong $stream $size incr written 12 # file name length my writeShort $stream [string length $fout] incr written 2 # extra field length set extra "" my writeShort $stream [string length $extra] incr written 2 # file name puts -nonewline $stream $fout incr written [string length $fout] puts -nonewline $stream $extra incr written [string length $extra] set dataStart written; puts -nonewline $stream $datacompresed incr written $csize list $offset $dosTime $crc $csize $size } ZipArchive instproc createFile file { set fout [open $file w] fconfigure $fout -encoding binary -translation binary my createToStream $fout close $fout } ZipArchive instproc createToStream ostream { my instvar files cdOffset cdLength written set descriptionList [list] foreach {fin fout} $files { lappend descriptionList [my addToStream $ostream $fin $fout] } set cdOffset $written foreach {fin fout} $files desc $descriptionList { foreach {offset dosTime crc csize size} $desc {} my writeCentralFileHeader $ostream $fin $fout $offset $dosTime $size $csize $crc } set cdLength [expr {$written - $cdOffset}] # wirte header # EOCD 0X06054B50L scan 0X06054B50L %x s set s binary scan \x06\x05\x4B\x50 I EOCD my writeLong $ostream $EOCD # disk numbers my writeShort $ostream 0 my writeShort $ostream 0 # number of entries set filenum [expr {[llength $files]>>1}] my writeShort $ostream $filenum my writeShort $ostream $filenum # length and location of CD my writeLong $ostream $cdLength my writeLong $ostream $cdOffset # zip file comment set comment "" # comment lenght my writeShort $ostream [string bytelength $comment] puts -nonewline $ostream $comment } ZipArchive instproc init {} { my set files [list] package require vfslib my instvar cdLength cdOffset written set cdLength 0 set cdOffset 0 set written 0 } ZipArchive instproc toDosTime file { set sec [file mtime $file] foreach {year month day hour minute secound} [clock format $sec -format "%Y %m %e %k %M %S"] {} set month [string trimleft $month 0] set year [string trimleft $year 0] if {$minute eq ""} { set minute 0 } set secound [string trimleft $secound 0] if {$secound eq ""} { set secound 0 } set value [expr (($year - 1980) << 25) | ($month << 21) | ($day << 16) | ($hour << 11) | ($minute << 5) | ($secound >> 1)] return $value } ZipArchive instproc writeCentralFileHeader {ostream fin fout offset dosTime size csize crc} { my instvar written # CFH 0X02014B50L binary scan \x02\x01\x4B\x50 I CFH_SIG my writeLong $ostream $CFH_SIG incr written 4 if {$::tcl_platform(platform) eq "windows"} { # unix set pid 5 } else { # windows set pid 11 } my writeShort $ostream [expr { (($pid << 8) | 20)}] incr written 2 # version needed to extract # general purpose bit flag my writeShort $ostream 20 my writeShort $ostream 0 incr written 4 # compression method my writeShort $ostream 8 incr written 2 # last mod. time and date my writeLong $ostream $dosTime incr written 4 # CRC # compressed length # uncompressed length my writeLong $ostream $crc my writeLong $ostream $csize my writeLong $ostream $size incr written 12; set comment "" set extra "" # file name length my writeShort $ostream [string bytelength $fout] incr written 2; # extra field length my writeShort $ostream [string bytelength $extra] incr written 2; # file comment length my writeShort $ostream [string bytelength $comment] incr written 2; # disk number start my writeShort $ostream 0 incr written 2 # internal file attributes my writeShort $ostream 0 incr written 2 # external file attributes my writeLong $ostream 0 incr written 4 # relative offset of LFH my writeLong $ostream $offset incr written 4 # file name puts -nonewline $ostream $fout incr written [string bytelength $fout] # extra field puts -nonewline $ostream $extra incr written [string bytelength $extra] # file comment puts -nonewline $ostream $comment incr written [string bytelength $comment] } ZipArchive instproc writeLong {stream short} { puts -nonewline $stream [binary format i $short] } ZipArchive instproc writeShort {stream short} { puts -nonewline $stream [binary format s $short] } ZipArchive proc createZip {zipFile files} { set zipArch [my new] foreach f $files { $zipArch addFile $f [file tail $f] } $zipArch createFile $zipFile $zipArch destroy } ZipArchive proc testZip {} { my createZip {C:/tmp/my2.zip} {C:/tmp/test.txt C:/tmp/test2.txt C:/tmp/tmp.zip} # ZipArchive dumpFile {C:/tmp/my2.zip} # ZipArchive dumpFile {C:/tmp/tmp.zip} }
requires vfslib Usage as in class method createZip
set zipArch [ZipArchive new] foreach f $files { $zipArch addFile $f [file tail $f] } $zipArch createFile $zipFile $zipArch destroy
PT 23-Jan-2009: Now that zlib functions are part of the core I'm working to make it simple to create jar-style Tcl archives currently known as [zipkit]s. The following code is a tcl program to construct a zip archive from a directory tree using nothing but Tcl 8.6 core features. The resulting zip file should be compatible with other zip programs - with the possible exception of unicode support. The Tcl generated files use utf-8 encoding for all filenames and comments but I notice particularly on Windows info-zip and the Windows built-in zip view have rather poor support for this part of the ZIP file specification. The 7-Zip program does correctly display utf8 filenames however and the vfs::zip package will use these of course.If you use mkzip mystuff.tm -zipkit -directory mystuff.vfs it will pack your mystuff.vfs/ virtual filesystem tree into a zip archive with a suitable header such that on unix you may mark it executable and it should run with tclkit. Or you can run it with tclsh or wish 8.6 if you like. To change the executable header, specify -runtime preface where preface is a file containing code you want prefixed. For instance, on windows you can create a self-extracting zip archive using mkzip mystuff.exe -directory mystuff.vfs -runtime unzipsfx.exe (unzipsfx is the Info-Zip self-extracting stub).31-May-2009: a minor update is to use the streaming part of the core zlib api to deflate the data in chunks to avoid using large amounts of memory when compressing big files.
# mkzip.tcl -- Copyright (C) 2009 Pat Thoyts <patthoyts@users.sourceforge.net> # # Create ZIP archives in Tcl. # # Create a zipkit using mkzip filename.zkit -zipkit -directory xyz.vfs # or a zipfile using mkzip filename.zip -directory dirname -exclude "*~" # # version 1.2 package require Tcl 8.6 namespace eval zip {} # zip::timet_to_dos # # Convert a unix timestamp into a DOS timestamp for ZIP times. # # DOS timestamps are 32 bits split into bit regions as follows: # 24 16 8 0 # +-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+ # |Y|Y|Y|Y|Y|Y|Y|m| |m|m|m|d|d|d|d|d| |h|h|h|h|h|m|m|m| |m|m|m|s|s|s|s|s| # +-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+ # proc zip::timet_to_dos {time_t} { set s [clock format $time_t -format {%Y %m %e %k %M %S}] scan $s {%d %d %d %d %d %d} year month day hour min sec expr {(($year-1980) << 25) | ($month << 21) | ($day << 16) | ($hour << 11) | ($min << 5) | ($sec >> 1)} } # zip::pop -- # # Pop an element from a list # proc zip::pop {varname {nth 0}} { upvar $varname args set r [lindex $args $nth] set args [lreplace $args $nth $nth] return $r } # zip::walk -- # # Walk a directory tree rooted at 'path'. The excludes list can be # a set of glob expressions to match against files and to avoid. # The match arg is internal. # eg: walk library {CVS/* *~ .#*} to exclude CVS and emacs cruft. # proc zip::walk {base {excludes ""} {match *} {path {}}} { set result {} set imatch [file join $path $match] set files [glob -nocomplain -tails -types f -directory $base $imatch] foreach file $files { set excluded 0 foreach glob $excludes { if {[string match $glob $file]} { set excluded 1 break } } if {!$excluded} {lappend result $file} } foreach dir [glob -nocomplain -tails -types d -directory $base $imatch] { set subdir [walk $base $excludes $match $dir] if {[llength $subdir]>0} { set result [concat $result $dir $subdir] } } return $result } # zip::mkzipfile -- # # Add a single file to a zip archive. The zipchan channel should # already be open and binary. You may provide a comment for the # file The return value is the central directory record that # will need to be used when finalizing the zip archive. # # FIX ME: should handle the current offset for non-seekable channels # proc zip::mkzipfile {zipchan base path {comment ""}} { set fullpath [file join $base $path] set mtime [timet_to_dos [file mtime $fullpath]] set utfpath [encoding convertto utf-8 $path] set utfcomment [encoding convertto utf-8 $comment] set flags [expr {(1<<11)}] ;# utf-8 comment and path set method 0 ;# store 0, deflate 8 set attr 0 ;# text or binary (default binary) set version 20 ;# minumum version req'd to extract set extra "" set crc 0 set size 0 set csize 0 set data "" set seekable [expr {[tell $zipchan] != -1}] if {[file isdirectory $fullpath]} { set attrex 0x41ff0010 ;# 0o040777 (drwxrwxrwx) } elseif {[file executable $fullpath]} { set attrex 0x81ff0080 ;# 0o100777 (-rwxrwxrwx) } else { set attrex 0x81b60020 ;# 0o100666 (-rw-rw-rw-) if {[file extension $fullpath] in {".tcl" ".txt" ".c"}} { set attr 1 ;# text } } if {[file isfile $fullpath]} { set size [file size $fullpath] if {!$seekable} {set flags [expr {$flags | (1 << 3)}]} } set offset [tell $zipchan] set local [binary format a4sssiiiiss PK\03\04 \ $version $flags $method $mtime $crc $csize $size \ [string length $utfpath] [string length $extra]] append local $utfpath $extra puts -nonewline $zipchan $local if {[file isfile $fullpath]} { # If the file is under 2MB then zip in one chunk, otherwize we use # streaming to avoid requiring excess memory. This helps to prevent # storing re-compressed data that may be larger than the source when # handling PNG or JPEG or nested ZIP files. if {$size < 0x00200000} { set fin [open $fullpath rb] set data [read $fin] set crc [zlib crc32 $data] set cdata [zlib deflate $data] if {[string length $cdata] < $size} { set method 8 set data $cdata } close $fin set csize [string length $data] puts -nonewline $zipchan $data } else { set method 8 set fin [open $fullpath rb] set zlib [zlib stream deflate] while {![eof $fin]} { set data [read $fin 4096] set crc [zlib crc32 $data $crc] $zlib put $data if {[string length [set zdata [$zlib get]]]} { incr csize [string length $zdata] puts -nonewline $zipchan $zdata } } close $fin $zlib finalize set zdata [$zlib get] incr csize [string length $zdata] puts -nonewline $zipchan $zdata $zlib close } if {$seekable} { # update the header if the output is seekable set local [binary format a4sssiiii PK\03\04 \ $version $flags $method $mtime $crc $csize $size] set current [tell $zipchan] seek $zipchan $offset puts -nonewline $zipchan $local seek $zipchan $current } else { # Write a data descriptor record set ddesc [binary format a4iii PK\7\8 $crc $csize $size] puts -nonewline $zipchan $ddesc } } set hdr [binary format a4ssssiiiisssssii PK\01\02 0x0317 \ $version $flags $method $mtime $crc $csize $size \ [string length $utfpath] [string length $extra]\ [string length $utfcomment] 0 $attr $attrex $offset] append hdr $utfpath $extra $utfcomment return $hdr } # zip::mkzip -- # # Create a zip archive in 'filename'. If a file already exists it will be # overwritten by a new file. If '-directory' is used, the new zip archive # will be rooted in the provided directory. # -runtime can be used to specify a prefix file. For instance, # zip myzip -runtime unzipsfx.exe -directory subdir # will create a self-extracting zip archive from the subdir/ folder. # The -comment parameter specifies an optional comment for the archive. # # eg: zip my.zip -directory Subdir -runtime unzipsfx.exe *.txt # proc zip::mkzip {filename args} { array set opts { -zipkit 0 -runtime "" -comment "" -directory "" -exclude {CVS/* */CVS/* *~ ".#*" "*/.#*"} } while {[string match -* [set option [lindex $args 0]]]} { switch -exact -- $option { -zipkit { set opts(-zipkit) 1 } -comment { set opts(-comment) [encoding convertto utf-8 [pop args 1]] } -runtime { set opts(-runtime) [pop args 1] } -directory {set opts(-directory) [file normalize [pop args 1]] } -exclude {set opts(-exclude) [pop args 1] } -- { pop args ; break } default { break } } pop args } set zf [open $filename wb] if {$opts(-runtime) ne ""} { set rt [open $opts(-runtime) rb] fcopy $rt $zf close $rt } elseif {$opts(-zipkit)} { set zkd "#!/usr/bin/env tclkit\n\# This is a zip-based Tcl Module\n" append zkd "package require vfs::zip\n" append zkd "vfs::zip::Mount \[info script\] \[info script\]\n" append zkd "if {\[file exists \[file join \[info script\] main.tcl\]\]} {\n" append zkd " source \[file join \[info script\] main.tcl\]\n" append zkd "}\n" append zkd \x1A puts -nonewline $zf $zkd } set count 0 set cd "" if {$opts(-directory) ne ""} { set paths [walk $opts(-directory) $opts(-exclude)] } else { set paths [glob -nocomplain {*}$args] } foreach path $paths { puts $path append cd [mkzipfile $zf $opts(-directory) $path] incr count } set cdoffset [tell $zf] set endrec [binary format a4ssssiis PK\05\06 0 0 \ $count $count [string length $cd] $cdoffset\ [string length $opts(-comment)]] append endrec $opts(-comment) puts -nonewline $zf $cd puts -nonewline $zf $endrec close $zf return } if {!$tcl_interactive} { set r [catch [linsert $argv 0 zip::mkzip] err] if {$r} {puts stderr $errorInfo} elseif {[string length $err]} {puts $err} exit $r }
june 1 2009 Harm Olthof: I have tried this (just what I was looking for), but found a few problems:
- If directories have spaces in it, it will halt because it breaks down directory names at the white space and subsequently it can not find the directory. I could solve this by changing the line in proc "walk":
set result [concat $result $dir $subdir]
- to
set result [concat $result [list $dir] $subdir]
- with or without this change: directories are treated as plain files (size 0) by winzip (8.1) and 7-zip (sub directories), so I could not extract directories;
- jdc 29-sep-2010 To solve this, a / should be added to directory names:
if {[file isdirectory $fullpath]} { append path / } set utfpath [encoding convertto utf-8 $path]
- I added "level" to the options and tested with level 9 on a few files (txt, jpg and doc) and found in each time for winzip a slightly better compression. Could this have to do with the 4096 chunks? In general: how does zlib stream work? Are all files compressed again if data is added taking all data into account? i.e. I would like to work without a data window, so that I could have maximum compression. Is that possible? (I know there will be a performance penalty for that.)
- After ending compressing a 650Mb folder, wish86 still allocates some 800Mb virtual memory, which almost brings my desktop to a hold (1Gb). Is this to expected because of virtual memory management or could this be a memory leak?
See also: Using zipper to create zip files
[CesareSalvioni] - 2012-03-24 12:12:14Still using tcl version 8.5, so back to zipArchive using XOTcl. Looks to me that the procedure toDosTime is missing some codes before the expr. If day, hour or minute are lower then 10 expr raise me an error for invalid octal number. I changed the lines between foreach and set value like this:
set year [string trimleft $year 0] set month [string trimleft $month 0] set day [string trimleft $day 0] set hour [string trimleft $hour 0] if {$hour eq ""} { set hour 0 } set minute [string trimleft $minute 0] if {$minute eq ""} { set minute 0 } set secound [string trimleft $secound 0] if {$secound eq ""} { set secound 0 }
MHo 2014-04-05: I got a buffer error while zipping a large directory structure. And I see, if I do a zip::mkzip test1.zip -directory /name/of/some/dir *, the proc never ended and i had to kill it. I thougt, this would pack all (*) files in the given directory (which are approx 5 mbytes in size, but I stopped at 100 mbytes...) but I had to use something like zip::mkzip test1.zip -directory /name/of/some/dir /name/of/some/dir/* to achieve the result I'm expecting. If I only use zip::mkzip test1.zip /name/of/some/dir/*, no recursion happens. I'm confused.Another alternative for creating ZIP files from tcl without using an external program is to use Freewrap's built-in freewrap::makeZIP.