Code edit
package provide vfs::chm 0.5 set ::verbose 0 package require vfs 1.0 package provide chmvfs 0.5 set ::vfs::debug 1 # Basic idea is having access to a site structure as saved in a chm file, then # serving it via a tcl-based http server. namespace eval vfs::chm { variable status array set status [list / /] } namespace eval ::lzx { variable bitbuf "" variable bufsiz 0 variable bufpos 0 variable status array set status [list / /] variable LZX_FRAME_SIZE 32768 variable LZX_CHECK_BLOCK [expr (32768 + 6144)*8] variable LZX_MIN_MATCH 2 variable LZX_MAX_MATCH 257 variable LZX_NUM_CHARS 256 #define LZX_BLOCKTYPE_VERBATIM (1) #define LZX_BLOCKTYPE_ALIGNED (2) #define LZX_BLOCKTYPE_UNCOMPRESSED (3) variable LZX_PRETREE_NUM_ELEMENTS 20 variable LZX_ALIGNED_NUM_ELEMENTS 8 variable LZX_NUM_PRIMARY_LENGTHS 7 variable LZX_NUM_SECONDARY_LENGTHS 249 variable LZX_LENGTH_MAXSYMBOLS [expr $LZX_NUM_SECONDARY_LENGTHS + 1] variable LZX_LENGTH_TABLEBITS 12 variable LZX_PRETREE_MAXSYMBOLS $LZX_PRETREE_NUM_ELEMENTS variable LZX_PRETREE_TABLEBITS 6 variable LZX_MAINTREE_MAXSYMBOLS [expr $LZX_NUM_CHARS + 50 * 8] variable LZX_MAINTREE_TABLEBITS 12 variable LZX_ALIGNED_MAXSYMBOLS $LZX_ALIGNED_NUM_ELEMENTS variable LZX_ALIGNED_TABLEBITS 7 variable extra_bits variable position_base # Initialize LZX lookup tables for { set i 0; set j 0} { $i < 51 } { incr i 2 } { # 0,0,0,0,1,1,2,2,3,3,4,4,5,5,6,6,7,7... set extra_bits($i) $j set extra_bits([expr $i + 1]) $j # 0,0,1,2,3,4...15,16,17,17,17,17... if { ($i != 0) && ($j < 17) } { incr j } } for { set i 0; set j 0 } { $i < 51 } { incr i } { # 0,1,2,3,4,6,8,12,16,24,32,... set position_base($i) $j # 1,1,1,1,2,2,4,4,8,8,16,16,32,32,... incr j [expr 1 << $extra_bits($i)] } } proc vfs::chm::Mount {chm local} { if {![file exists $chm] || ![file isfile $chm] } { error "No such file $chm" } set fileHeader(size) 56 set fileHeader(structure) { {n type f a4} {n version f i} {n totLength f i} {n unknown f i} {n timestamp f i} {n langId f i} {n guid1 f h32} {n guid2 f h32}} set recordFormat "h32" set fl [::open $chm r] fconfigure $fl -translation binary set fmt "" set varList {} foreach headerElement $fileHeader(structure) { array set element $headerElement set fmt "$fmt$element(f)" set varList [concat $varList fileHeader($element(n))] } set content [read $fl $fileHeader(size)] eval binary scan \$content $fmt $varList if { $fileHeader(version) != 2 } { set hst(size) 40 set hst(structure) { {n offs0 f w} {n len0 f w} {n offs1 f w} {n len1 f w} {n offs2 f w}} } else { set hst(size) 32 set hst(structure) { {n offs0 f w} {n len0 f w} {n offs1 f w} {n len1 f w}} } set fmt "" set varList {} foreach headerElement $hst(structure) { array set element $headerElement set fmt "$fmt$element(f)" set varList [concat $varList hst($element(n))] } set content [read $fl $hst(size)] set offset $fileHeader(size) eval binary scan \$content $fmt $varList set hSect0(size) $hst(len0) set hSect0(structure) { {n unkn0 f H4} {n unkn1 f i} {n fileSize f w } {n unkn2 f i} {n unkn3 f i}} set fmt "" set varList {} foreach headerElement $hSect0(structure) { array set element $headerElement set fmt "$fmt$element(f)" set varList [concat $varList hSect0($element(n))] } set content [read $fl $hSect0(size)] eval binary scan \$content $fmt $varList set hSect1(size) 84 set hSect1(structure) { { n type f a4 } { n version f i } { n length f i } { n unkn0 f i } { n chunksz f i } { n density f i } { n depth f i } { n rtchunkno f i } { n PGML1 f i } { n PMGLls f i } { n unkn1 f i } { n dirChunks f i } { n wLangId f i } { n guid f h32 } { n len2 f i } { n unkn2 f i } { n unkn3 f i } { n unkn4 f i } } set fmt "" set varList {} foreach headerElement $hSect1(structure) { array set element $headerElement set fmt "$fmt$element(f)" set varList [concat $varList hSect1($element(n))] } seek $fl $hst(offs1) set content [read $fl $hSect1(size)] eval binary scan \$content $fmt $varList set offset [expr $hst(offs1) + $hSect1(length)] seek $fl $offset for {set chunk 0} { $chunk < $hSect1(dirChunks) } { incr chunk } { set chOffset 0 set content [read $fl $hSect1(chunksz)] incr offset $hSect1(chunksz) set fmt a4 eval binary scan \$content $fmt chunkType if { $chunkType == "PMGL" } { set begLstChunk(size) 20 set begLstChunk(structure) { { n type f a4 } { n length f i } { n zero f i } { n prev f i } { n next f i } } set fmt "" set varList {} foreach headerElement $begLstChunk(structure) { array set element $headerElement set fmt "$fmt$element(f)" set varList [concat $varList begLstChunk($element(n))] } eval binary scan \$content $fmt $varList set quickref [expr $chOffset + $hSect1(chunksz) - $begLstChunk(length)] set end [expr $chOffset + $hSect1(chunksz) - 2] set fmt s eval binary scan \$content @${end}$fmt begLstChunk(entryNum) lappend begLstChunk(structure) {n entryNum f s} set chunks($chunk) [array get begLstChunk] incr chOffset $begLstChunk(size) set chunkList {} for {set entryno 0} { $entryno < $begLstChunk(entryNum) } {incr entryno} { set chOffset [readEntry $content $chOffset entry] array set temp $entry set name [string trim $temp(NAME) /] if { $name == "/" } { set name "/" } set ::vfs::chm::tree${chm}($name) $entry lappend chunkList $entry } set chunks(lst$chunk) $chunkList } else { if { $chunkType == "PMGI" } { } else { error "Unknown chunk type $chunkType" } } } if { $fileHeader(version) != 2 } { set offset $hst(offs2) } else { incr offset $hSect1(chunksz) } close $fl if { $fileHeader(version) != 2 } { set contentStart $hst(offs2) } else { set contentStart $offset } set ::vfs::chm::tree${chm}(contentStart) $contentStart set nlfid [open $chm ::DataSpace/NameList r 438] set content [read $nlfid] close $nlfid binary scan $content ss len num set offset 4 for { set idx 0 } { $idx < $num } { incr idx } { binary scan $content @${offset}s fileNameLen incr offset 2 set beg $offset incr offset [expr 2 * $fileNameLen] if { $idx > 0 } { set fname [string range $content $beg $offset] set fname [join [split $fname "\0"] ""] array set sectData [set ::vfs::chm::tree${chm}(::DataSpace/Storage/$fname/Content)] incr sectData(OFFSET) $contentStart set ctrlfd [open $chm ::DataSpace/Storage/$fname/ControlData r 438] set control [read $ctrlfd] close $ctrlfd binary scan $control ia4iiiii wnum sectData(SIGN) sectData(VERSION) sectData(RSTINT) sectData(WINSZ) sectData(CACSZ) zero if { $sectData(VERSION) == 2 } { set sectData(WINSZ) [expr $sectData(WINSZ) * 32768] set sectData(RSTINT) [expr $sectData(RSTINT) * 32768] } switch $sectData(WINSZ) { 32768 { set sectData(WINBT) 15 } 65536 { set sectData(WINBT) 16 } 131072 { set sectData(WINBT) 17 } 262144 { set sectData(WINBT) 18 } 524288 { set sectData(WINBT) 19 } 1048576 { set sectData(WINBT) 20 } 2097152 { set sectData(WINBT) 21 } default { error "bad controldata window size" } } set ctrlFileName "::DataSpace/Storage/$fname/Transform/{7FC28940-9D31-11D0-9B27-00A0C91E9C7C}/InstanceData/ResetTable" set ctrlfd [open $chm $ctrlFileName r 438] set control [read $ctrlfd] close $ctrlfd binary scan $control iiiiwwwww sectData(unk) sectData(RSTBEN) sectData(RSTBSZ) sectData(RSTBLN) sectData(RSTBULN) sectData(RSTBCLN) sectData(RSTBBSZ) sectData(RSTBZRO) sectData(RSTBBND) array set rstTable [set ::vfs::chm::tree${chm}($ctrlFileName)] set sectData(RSTBLOFFS) $rstTable(OFFSET) set ::vfs::chm::tree${chm}(Section$idx) [array get sectData] } incr offset 2 } set ::vfs::chm::status($local) $chm set ::vfs::chm::tree${chm}(mountPoint) $local ::vfs::log "chm $chm mounted at $local" vfs::filesystem mount $local [list vfs::chm::handler $chm] vfs::RegisterMount $local [list vfs::chm::Unmount] return $local } proc vfs::chm::Unmount {local} { set chm $::vfs::chm::status($local) unset ::vfs::chm::tree$chm unset ::vfs::chm::status($local) vfs::filesystem unmount $local } proc vfs::chm::handler {chm cmd root relative actualpath args} { if {$cmd == "matchindirectory"} { eval [list $cmd $chm $relative $actualpath] $args } else { eval [list $cmd $chm $relative] $args } } # If we implement the commands below, we will have a perfect # virtual file system for Compiled HTML archives. proc vfs::chm::stat {chm name} { ::vfs::log "stat $name" if { $name == ""} { return [list type directory size 0 mode 0555 \ ino -1 depth 0 name $name atime 0 ctime 0 mtime 0 dev -1 \ uid -1 gid -1 nlink 1] } if {[catch "set ::vfs::chm::tree${chm}($name)" x]} { error "No such file: $x" } array set entry $x if { $entry(LENGTH) == 0 } { return [list type directory size 0 mode 0111 \ ino -1 depth 0 name $name atime 0 ctime 0 mtime 0 dev -1 \ uid -1 gid -1 nlink 1] } return [list type file size $entry(LENGTH) mode 0111 \ ino -1 depth 0 name $name atime 0 ctime 0 mtime 0 dev -1 \ uid -1 gid -1 nlink 1] } proc vfs::chm::access {chm name mode} { if {$mode & 2} { vfs::filesystem posixerror $::vfs::posix(EROFS) } if { $name == ""} { return 1 } if {[catch "set ::vfs::chm::tree${chm}($name)" x]} { vfs::filesystem posixerror $::vfs::posix(ENOENT) } return 1 } proc vfs::chm::exists {chm name} { if { $name == ""} { return 1 } if {[catch "set ::vfs::chm::tree${chm}($name)" x]} { vfs::filesystem posixerror $::vfs::posix(ENOENT) } return 1 } proc vfs::chm::matchindirectory {chm path actualpath pattern type} { set pattern [file join $path $pattern] set biggest [array name ::vfs::chm::tree${chm} $pattern] set root [set ::vfs::chm::tree${chm}(mountPoint)] set bigger [list] foreach p $biggest { if { [string match "${pattern}/*" $p ]} continue lappend bigger [file join $root $p] } #::vfs::log "got $newres" return [::vfs::matchCorrectTypes $type $bigger] } proc vfs::chm::open {chm name mode permissions} { switch -- $mode { "" - "r" { if {[catch "set ::vfs::chm::tree${chm}($name)" x]} { vfs::filesystem posixerror $::vfs::posix(ENOENT) } array set entry $x if { !$entry(LENGTH) } { # There are no empty files: they are folders! vfs::filesystem posixerror $::vfs::posix(EPERM) } set nfd [vfs::memchan] fconfigure $nfd -translation binary set chmfd [::open $chm r] fconfigure $chmfd -translation binary if { $entry(SECTION) } { # needs decompression array set sectData [set ::vfs::chm::tree${chm}(Section${entry(SECTION)})] set entryNum [expr $entry(OFFSET) / $sectData(RSTBBSZ) / 2] set entry(NUM) [expr $entryNum * 2] if { $entryNum > $sectData(RSTBEN) } { error "Entry $entryNum not present in Reset Table($sectData(RSTBEN))" } set entryOffset [expr [set ::vfs::chm::tree${chm}(contentStart)] + $sectData(RSTBLOFFS) + $sectData(RSTBLN) + $entry(NUM) * $sectData(RSTBSZ)] set fl [::open $chm r] fconfigure $fl -translation binary seek $fl $entryOffset set addr [read $fl $sectData(RSTBSZ)] if { $sectData(RSTBSZ) == 4 } { binary scan "$addr" i compOffset } else { binary scan "$addr" w compOffset } set entry(OFFS) $compOffset set skip [expr $entry(OFFSET) - $entry(NUM) * $sectData(RSTBBSZ)] seek $chmfd $sectData(OFFSET) seek $chmfd $entry(OFFS) current ::lzx::decompress $chmfd $nfd $entry(LENGTH) $skip $sectData(WINBT) } else { seek $chmfd [set ::vfs::chm::tree${chm}(contentStart)] start seek $chmfd $entry(OFFSET) current set data [read $chmfd $entry(LENGTH)] puts -nonewline $nfd $data } fconfigure $nfd -translation auto seek $nfd 0 return [list $nfd] } default { vfs::filesystem posixerror $::vfs::posix(EROFS) } } } proc vfs::chm::createdirectory {chm name} { #::vfs::log "createdirectory $name" vfs::filesystem posixerror $::vfs::posix(EROFS) } proc vfs::chm::removedirectory {chm name recursive} { #::vfs::log "removedirectory $name" vfs::filesystem posixerror $::vfs::posix(EROFS) } proc vfs::chm::deletefile {chm name} { #::vfs::log "deletefile $name" vfs::filesystem posixerror $::vfs::posix(EROFS) } proc vfs::chm::fileattributes {chm name args} { #::vfs::log "fileattributes $args" switch -- [llength $args] { 0 { # list strings return [list NAME SECTION OFFSET LENGTH] } 1 { # get value if {[catch "set ::vfs::chm::tree${chm}($name)" x]} { vfs::filesystem posixerror $::vfs::posix(ENOENT) } array set entry $x set index [lindex $args 0] set name [lindex [list NAME SECTION OFFSET LENGTH] $index] return $entry($name) } 2 { # set value set index [lindex $args 0] set val [lindex $args 1] vfs::filesystem posixerror $::vfs::posix(EROFS) } } } proc vfs::chm::utime {fd path actime mtime} { vfs::filesystem posixerror $::vfs::posix(EROFS) } proc vfs::chm::readEntry { content offset lst } { set l 0 set b -1 while { $b < 0 } { eval binary scan \$content @${offset}c1 b incr offset set l [expr ( $l << 7 ) | ($b & 127) ] } eval binary scan \$content @${offset}a$l n incr offset $l set s 0 set b -1 while { $b < 0 } { eval binary scan \$content @${offset}c1 b incr offset set s [expr ( $s << 7 ) | ($b & 127) ] } set o 0 set b -1 while { $b < 0 } { eval binary scan \$content @${offset}c1 b incr offset set o [expr ( $o << 7 ) | ($b & 127) ] } set l 0 set b -1 while { $b < 0 } { eval binary scan \$content @${offset}c1 b incr offset set l [expr ( $l << 7 ) | ($b & 127) ] } upvar $lst x set x [list NAME $n SECTION $s OFFSET $o LENGTH $l] return $offset } proc ::lzx::checkBuffer { fds } { variable bufpos variable bufsiz variable bitbuf # How many bytes to be skipped? set bigSkip [expr ($bufpos >> 3 ) & -2] # How many bits to be skipped? set smlSkip [expr $bufpos & 15] # In case of exhausted buffer, read some # We should have at least LZX_CHECK_BLOCK = ( LZX_FRAME_SIZE + 6k ) * 8 bits, # the maximum size an uncompressed block can achieve, but we # read a MegaByte at the time if { $bufsiz < $::lzx::LZX_CHECK_BLOCK } { set bytes [read $fds 1048576] set bitbuf [string range $bitbuf $bigSkip end] append bitbuf $bytes # Use real bytes length, in case of less input data incr bufsiz [expr [string length $bytes] * 8] set bufpos $smlSkip set bigSkip 0 } } proc ::lzx::bits { len } { variable bufpos variable bufsiz variable bitbuf # How many bytes to be skipped? set bigSkip [expr ($bufpos >> 3 ) & -2] # How many bits to be skipped? set smlSkip [expr $bufpos & 15] # I read an entire DWORD: it should be enough if { [binary scan $bitbuf @${bigSkip}ss num1 num2] < 2} { error "Exhausted input" } set num [expr ($num1 << 16) | ($num2 & 65535)] set res [expr $num >> ( 32 - $smlSkip - $len ) ] set res [expr $res & ( ( 1 << $len ) - 1 ) ] incr bufpos $len incr bufsiz -$len return $res } proc ::lzx::build_table { obj } { set nsyms [set ::lzx::LZX_${obj}_MAXSYMBOLS ] set nbits [set ::lzx::LZX_${obj}_TABLEBITS ] set table_mask [expr 1 << $nbits] set bit_mask [expr $table_mask >> 1] set next_symbol $bit_mask set bit_num 1 set pos 0 while { $bit_num <= $nbits } { for { set sym 0 } { $sym < $nsyms } { incr sym } { if { [set ::lzx::status(${obj}_len$sym)] == $bit_num } { set leaf $pos incr pos $bit_mask if { $pos > $table_mask } { error "Table overrun" } set fill $bit_mask while { $fill > 0 } { set ::lzx::status(${obj}_table$leaf) $sym incr fill -1 incr leaf } } } set bit_mask [expr $bit_mask >> 1] incr bit_num } if { $pos != $table_mask } { for { set sym $pos } { $sym < $table_mask } { incr sym } { set ::lzx::status(${obj}_table$sym) 0 } set pos [expr $pos << 16] set table_mask [expr $table_mask << 16] set bit_mask $::lzx::LZX_FRAME_SIZE while { $bit_num <= 16 } { for { set sym 0 } { $sym < $nsyms } { incr sym } { if { [set ::lzx::status(${obj}_len$sym)] == $bit_num } { set leaf [expr $pos >> 16] for { set fill 0 } { $fill < $bit_num - $nbits } { incr fill } { if { $::lzx::status(${obj}_table$leaf) == 0 } { set s [expr $next_symbol << 1] set ::lzx::status(${obj}_table$s) 0 incr s set ::lzx::status(${obj}_table$s) 0 set ::lzx::status(${obj}_table$leaf) $next_symbol incr next_symbol } set leaf [expr $::lzx::status(${obj}_table$leaf) << 1] if { ($pos >> (15-$fill)) & 1 } { incr leaf } } set ::lzx::status(${obj}_table$leaf) $sym incr pos $bit_mask if { $pos > $table_mask } { error "table overflow" } } } set bit_mask [expr $bit_mask >> 1] incr bit_num } } if { $pos == $table_mask } return } proc ::lzx::read_huffsym { obj } { set mp $::lzx::bufpos set ms $::lzx::bufsiz set bits [set ::lzx::LZX_${obj}_TABLEBITS ] set r [::lzx::bits $bits] set i $::lzx::status(${obj}_table$r) if { $i >= [set ::lzx::LZX_${obj}_MAXSYMBOLS ] } { set j [expr 1 << (32 - $bits)] set go 1 while { $go || $i >= [set ::lzx::LZX_${obj}_MAXSYMBOLS ] } { set j [expr $j >> 1] set i [expr $i << 1] set i [expr $i | [::lzx::bits 1]] if {!$j} { error "Illegal data" } set go 0 set i $::lzx::status(${obj}_table$i) } } set j $::lzx::status(${obj}_len$i) set ::lzx::bufpos $mp set ::lzx::bufsiz $ms incr ::lzx::bufpos $j incr ::lzx::bufsiz -$j return $i } proc ::lzx::read_lengths { obj first last } { for { set x 0 } { $x < 20 } { incr x } { set ::lzx::status(PRETREE_len$x) [::lzx::bits 4] } build_table PRETREE for { set x $first } { $x < $last} {} { set z [::lzx::read_huffsym PRETREE] if { $z == 17 } { set y [::lzx::bits 4]; incr y 4 while { $y} { incr y -1 set ::lzx::status(${obj}_len$x) 0 incr x } } else { if { $z == 18 } { set y [::lzx::bits 5]; incr y 20 while { $y} { incr y -1 set ::lzx::status(${obj}_len$x) 0 incr x } } else { if { $z == 19 } { set y [::lzx::bits 1]; incr y 4 set z [::lzx::read_huffsym PRETREE] set z [expr $::lzx::status(${obj}_len$x) - $z] if { $z < 0 } { incr z 17 } while { $y} { incr y -1 set ::lzx::status(${obj}_len$x) $z incr x } } else { set z [expr $::lzx::status(${obj}_len$x) - $z] if { $z < 0 } { incr z 17 } set ::lzx::status(${obj}_len$x) $z incr x } } } } } proc ::lzx::decompress {infd askfd len skip wndbit } { variable bufsiz variable bufpos variable bitbuf set bufsiz 0 set bufpos 0 set bitbuf "" if { $wndbit == 20 } { set posn_slots 42 } else { if { $wndbit == 21 } { set posn_slots 50 } else { set posn_slots [expr $wndbit * 2] } } if { $skip } { set outfd [vfs::memchan] fconfigure $outfd -translation binary } else { set outfd $askfd } # decompress as much as needed, but outputs only the content, skip previous # the idea is to start with initial values: LZX status is not used nor needed set window_posn 0 set last_window_posn 0 set window_size [expr 1 << $wndbit] set bufsiz 0 set main_elements [expr $::lzx::LZX_NUM_CHARS + ($posn_slots << 3)] set todo [expr $len + $skip] while { $todo > 0 } { ::lzx::checkBuffer $infd if { $window_posn == $window_size } { set window_posn 0 } if { $window_posn == 0 } { set header_read 0 for { set i 0 } { $i < $::lzx::LZX_MAINTREE_MAXSYMBOLS } { incr i } { set ::lzx::status(MAINTREE_len$i) 0 } for { set i 0 } { $i < $::lzx::LZX_LENGTH_MAXSYMBOLS } { incr i } { set ::lzx::status(LENGTH_len$i) 0 } set R0 1; set R1 1; set R2 1 } if { !$header_read } { set i 0; set j 0 set k [::lzx::bits 1] set intel_filesize 0 if { $k } { set intel_filesize [::lzx::bits 32] } set header_read 1 } set block_type [::lzx::bits 3] set block_remaining [::lzx::bits 24] set block_length $block_remaining switch $block_type { 1 { # LZX_BLOCKTYPE_VERBATIM read_lengths MAINTREE 0 256 read_lengths MAINTREE 256 $main_elements build_table MAINTREE if { [set ::lzx::status(MAINTREE_len232)] != 0} { set intel_started 1 } read_lengths LENGTH 0 $::lzx::LZX_NUM_SECONDARY_LENGTHS build_table LENGTH } 2 { # LZX_BLOCKTYPE_ALIGNED for { set i 0 } { $i < 8 } { incr i } { set ::lzx::status(ALIGNED_len$i) [::lzx::bits 3] } build_table ALIGNED read_lengths MAINTREE 0 256 read_lengths MAINTREE 256 $main_elements build_table MAINTREE if { [set ::lzx::status(MAINTREE_len232)] != 0} { set intel_started 1 } read_lengths LENGTH 0 $::lzx::LZX_NUM_SECONDARY_LENGTHS build_table LENGTH } 3 { # LZX_BLOCKTYPE_UNCOMPRESSED set intel_started 1 set b [expr $::lzx::bufpos & 15] if { $b } { set b [expr 16 - $b] incr bufpos $b incr bufsiz -$b } set R0 [::lzx::bits 32] set R1 [::lzx::bits 32] set R2 [::lzx::bits 32] } default { error "Illegal block type $block_type" } } set this_run $block_remaining while { $this_run > 0 && $todo > 0 } { if { $this_run > $todo } { set this_run $todo } incr todo -$this_run incr block_remaining -$this_run set last_window_posn $window_posn set window_posn [expr $window_posn & ( $window_size - 1 )] if { ($window_posn + $this_run) > $window_size } { error "Invalid format" } switch $block_type { 1 { # LZX_BLOCKTYPE_VERBATIM while { $this_run > 0 } { set main_element [read_huffsym MAINTREE] if { $main_element < $::lzx::LZX_NUM_CHARS } { puts -nonewline $outfd [binary format c $main_element] incr window_posn incr this_run -1 } else { incr main_element -$::lzx::LZX_NUM_CHARS set match_length [expr $main_element & $::lzx::LZX_NUM_PRIMARY_LENGTHS] if { $match_length == $::lzx::LZX_NUM_PRIMARY_LENGTHS } { incr match_length [read_huffsym LENGTH] } incr match_length $::lzx::LZX_MIN_MATCH set match_offset [expr $main_element >> 3] if { $match_offset > 2 } { if { $match_offset != 3 } { set extra $::lzx::extra_bits($match_offset) set verbatim_bits [::lzx::bits $extra] set match_offset [expr $::lzx::position_base($match_offset) - 2 + $verbatim_bits] } else { set match_offset 1 } set R2 $R1; set R1 $R0; set R0 $match_offset } elseif { $match_offset == 0 } { set match_offset $R0 } elseif { $match_offset == 1 } { set match_offset $R1 set R1 $R0; set R0 $match_offset } else { set match_offset $R2 set R2 $R0; set R0 $match_offset } set length $match_length while { $length } { set rundest [tell $outfd] seek $outfd -$match_offset end set match [read $outfd $length] seek $outfd $rundest puts -nonewline $outfd $match set l [string length $match] incr length -$l } incr window_posn $match_length incr this_run -$match_length } if { ($window_posn % 32768 ) == 0 && $window_posn != 0 } { set b [expr $::lzx::bufpos & 15] if { $b } { set b [expr 16 - $b] incr ::lzx::bufpos $b incr bufsiz -$b } } } } 2 { # LZX_BLOCKTYPE_ALIGNED while { $this_run > 0 } { set main_element [read_huffsym MAINTREE] if { $main_element < $::lzx::LZX_NUM_CHARS } { puts -nonewline $outfd [binary format c $main_element] incr window_posn incr this_run -1 } else { incr main_element -$::lzx::LZX_NUM_CHARS set match_length [expr $main_element & $::lzx::LZX_NUM_PRIMARY_LENGTHS] if { $match_length == $::lzx::LZX_NUM_PRIMARY_LENGTHS } { incr match_length [read_huffsym LENGTH] } incr match_length $::lzx::LZX_MIN_MATCH set match_offset [expr $main_element >> 3] if { $match_offset > 2 } { # It is not a repeated offset set extra $::lzx::extra_bits($match_offset) set match_offset [expr $::lzx::position_base($match_offset) - 2] if { $extra > 3 } { incr extra -3 set verbatim_bits [::lzx::bits $extra] incr match_offset [expr $verbatim_bits << 3] set aligned_bits [::lzx::read_huffsym ALIGNED] incr match_offset $aligned_bits } elseif { $extra == 3 } { set aligned_bits [::lzx::read_huffsym ALIGNED] incr match_offset $aligned_bits } elseif { $extra > 0 } { set verbatim_bits [::lzx::bits $extra] incr match_offset $verbatim_bits } else { set match_offset 1 } set R2 $R1; set R1 $R0; set R0 $match_offset } elseif { $match_offset == 0 } { set match_offset $R0 } elseif { $match_offset == 1 } { set match_offset $R1 set R1 $R0; set R0 $match_offset } else { set match_offset $R2 set R2 $R0; set R0 $match_offset } set length $match_length while { $length } { set rundest [tell $outfd] seek $outfd -$match_offset end set match [read $outfd $length] seek $outfd $rundest puts -nonewline $outfd $match set l [string length $match] incr length -$l } incr window_posn $match_length incr this_run -$match_length } if { ($window_posn % 32768 ) == 0 && $window_posn != 0 } { set b [expr $::lzx::bufpos & 15] if { $b } { set b [expr 16 - $b] incr bufpos $b incr bufsiz -$b } } } } 3 { # LZX_BLOCKTYPE_UNCOMPRESSED set tt [expr $::lzx::bufpos / 8] set part [string range $::lzx::bitbuf $tt [expr $tt + $this_run - 1]] puts -nonewline $outfd $part incr ::lzx::bufpos [expr $this_run * 8] incr bufsiz [expr -$this_run * 8] incr window_posn $this_run if { $::lzx::bufpos & 8 } { incr ::lzx::bufpos 8 incr bufsiz -8 } } default { error "Illegal block type $block_type" } } set this_run $block_remaining } } if { $skip } { seek $outfd $skip set tmp [read $outfd $len] puts -nonewline $askfd $tmp } } proc ::vfs::chm::test {} { ::vfs::chm::Mount c:/Tcl/doc/ActiveTclHelp8.4.chm c:/chm cd c:/chm set d [glob -types d -directory c:/chm -tails ActiveTcl8.*] cd $d puts [pwd] puts [join [lsort [glob *]] \n] catch "file mkdir test" err puts $err set fid [::open c:/chm/$d/aspn.css r] set content [read $fid] close $fid set fid [::open c:/aspn.css w] fconfigure $fid -translation binary puts $fid $content close $fid cd c:/ ::vfs::chm::Unmount c:/chm }
CHM Web server edit
This is just a DustMote modification to browse the CHM file as a site.package require vfs::chm #source chmvfs.tcl ::vfs::chm::Mount c:/Tcl/doc/ActiveTclHelp8.4.chm c:/chm set d [glob -types d -directory c:/chm -tails ActiveTcl8.*] set root "c:/chm/$d" set default "at.toc.html" set port 80 proc bgerror {trouble} {puts stdout "bgerror: $trouble"} proc answer {socketChannel host2 port2} { fileevent $socketChannel readable [list readIt $socketChannel] } proc readIt {socketChannel} { global root default fconfigure $socketChannel -blocking 0 set gotLine [gets $socketChannel] if { [fblocked $socketChannel] } then {return} fileevent $socketChannel readable "" set shortName "/" regexp {/[^ ]*} $gotLine shortName set many [string length $shortName] set last [string index $shortName [expr $many-1] ] if {$last=="/"} then {set shortName $shortName$default } set wholeName $root$shortName set err "" puts "Serving $wholeName" if [catch {set fileChannel [open $wholeName RDONLY] } err ] { puts $socketChannel "HTTP/1.0 404 Not found" puts $socketChannel "" puts $socketChannel "<html><head><title><No such URL.></title></head>" puts $socketChannel "<body><center>" puts $socketChannel "The URL you requested does not exist on this site." puts $socketChannel "</center>$err</body></html>" puts $err } else { fconfigure $fileChannel -translation binary fconfigure $socketChannel -translation binary -buffering full puts $socketChannel "HTTP/1.0 200 OK" puts $socketChannel "" set work [read $fileChannel] puts $socketChannel $work close $fileChannel } close $socketChannel } socket -server answer $port vwait forEver
Discussion edit
AK: A number of questions.
- The history of the lzx code is not fully clear to me from the description. Does it come from cabextract, or from chmtools?
- Am I right in my assumption that with porting you mean that the original code is in C(++), and you ported it to Tcl?
- What licenses was the original code under, and what license is your code under?
- Do you know where it is possible to find an lzx compressor in Tcl or C(++) ? (For people interested in doing the write part of the VFS).
- Can you provide links to web pages which describe the structure of CHM and CAB files in detail? I.e., a good specification of the file format.
DAG: A number of answers.
- It does come from chmtools. In the source code I found, and copied, the reference.
- You're perfectly right: it was coded in C, and I ported in Tcl. It was just for fun, and to make it much more portable, without the need of a binary library: I love Tcl-only packages.
- It was GNU GPL by Caie. Russotto didn't claim any right on modifications. For me is just the same.
- See below.
- My ending point, after a long research was [1], the reason why I thanked Matthew.
- Here is another place of information [2]. It is more related to CHM format, and can be used to implement next features.
DDG: Very interesting work. However I can't read the aspn.css file. Getting the "illegal block type 4" error. The file exists however as the glob results shows. Any suggestions?
DAG: How unfortunate! May I know which version of file do you use? I am afraid the versions I own have only VERBATIM and ALIGNED blocks, so UNCOMPRESSED block part could be not tested at all.Anyway, last night I found that there is still a bug related to Random Access: it seems probably something about odd blocks. I am afraid I access those with some misalignment. I am looking into it: next tests will copy all files from archive to another location.Another thing I fixed is the buffer reset at the beginning of decompress proc.
DDG Thanks for your response. I use the file D:/ActiveTcl8.4.2/doc/ActiveTclHelp.chm. I can give you a link for downloading "my" chm file. It would be nice to use chmvfs for my dgHelpBrowser like the metakit files.
DAG: That's a good idea. Have a try with this last version: it works fine on both my versions, but for XOTcl pdf files. I'm working on it, but I think that probably it is still related to UNCOMPRESSED blocks. Even chmtools seems to have some problems with those pdf files, though.I also added a small modification of DustMote, which I use to browse the CHM file as a site. In case it is not working for You, let me know where I can download Your file from.
DDG It now works also for my chm file. Great! I will see that I update dgHelpBrowser for reading chm-files on Unixes,Win and OS-X etc. I once tried to adjust chmlib for this purpose using Swig but I failed. Thanks.
DAG: I thought I found the last problem in UNCOMPRESSED blocks handling, realigning to 16-bit boundaries after each of them, if needed. Nonetheless, it seems that pdf extracted are not always identical to stored files. Still investigating.
DAG: I did it. At last I found the bug in UNCOMPRESSED blocks handling. Nevertheless, I discovered a .chm that puzzled me again. Filenames are stored in all lower-cases, but referred also with upper cases. This means that I need to add an option to mount the VFS in both modes: case-sensitive or case-insensitive. I think that, coming from Windows, the latter should be the default.