- write a proc, e.g. "f", in Tcl
- disassemble its body
- use dis2asm to convert the disassembly to TAL (input) language
- test it by calling it (only then the assembly takes place), see if it works like before
- possibly optimize the TAL code by hand or script (see dis2asm gets better for optimization potentials) - and test again
#interp alias {} asm {} ::tcl::unsupported::assemble ;# worksn't - the [assemble] command isn't exported yet namespace eval tcl::unsupported {namespace export assemble} namespace import tcl::unsupported::assemble rename assemble asm interp alias {} disasm {} ::tcl::unsupported::disassembleI have extended the aproc function, which before just returned the disassembly, to accept an extra -x flag to "reassemble" the disassembly, and eval it as a proc, so you can test it just by calling it. The original disassembly code is also shown in a comment.
proc aproc {name argl body args} { proc $name $argl $body set res [disasm proc $name] if {"-x" in $args} { set res [list proc $name $argl [list asm [dis2asm $res]]] eval $res } return $res }This now is the converter from "dis" to TAL:
proc dis2asm body { set res "" set jumptargets {} foreach line [split $body \n] { if [regexp {\# pc (\d+)} $line -> pc] {lappend jumptargets $pc} } foreach line [split $body \n] { set line [string trim $line] if {$line eq ""} continue set code "" if {[regexp {\((\d+)\) (.+)} $line -> pc instr]} { if {$pc in $jumptargets} { append res "\n label L$pc;" } if {[regexp {(.+)#(.+)} $instr -> instr comment]} { set arg [lindex $comment end] if {$arg eq ""} {set arg "{}"} if [string match jump* $instr] {set arg L$arg} } else {set arg ""} set instr0 [normalize [lindex $instr 0]] if {$instr0 in {invokeStk}} {set arg [lindex $instr end]} if {$instr0 in {incrImm}} {set arg [list $arg [lindex $instr end]]} if {$instr0 in {list}} {set arg [lindex $instr end]} ;# PZ: 'list' missing arg. added set code [format " %-24s" "$instr0 $arg"] if {$instr0 in {startCommand}} {set code ""} append res "\n $code ;# [string trim $line]" } } append res \n return $res }This translates "dis" instruction names to "asm" instruction names, where different:
proc normalize instr { regsub {\d+$} $instr "" instr ;# strip off trailing length indicator set instr [string map { loadScalar load nop "" storeScalar store incrScalar1Imm incrImm } $instr] return $instr }Now to try it out.
% aproc f x {expr {sqrt($x)+1}} -x proc f x {asm { push tcl::mathfunc::sqrt ;# (0) push1 0 # "tcl::mathfunc::sqrt" load x ;# (2) loadScalar1 %v0 # var "x" invokeStk 2 ;# (4) invokeStk1 2 push 1 ;# (6) push1 1 # "1" add ;# (8) add ;# (9) done}} % f 2 2.414213562373095The following test shows that while mostly we have to remove parts of the disassembly, for jump targets we need to insert label pseudo-instructions:
% aproc f x {if {$x==1} {set x 2};return $x} -x proc f x {asm { load x ;# (0) loadScalar1 %v0 # var "x" push 1 ;# (2) push1 0 # "1" eq ;# (4) eq jumpFalse L13 ;# (5) jumpFalse1 +8 # pc 13 push 2 ;# (7) push1 1 # "2" store x ;# (9) storeScalar1 %v0 # var "x" jump L15 ;# (11) jump1 +4 # pc 15 label L13; push {} ;# (13) push1 2 # "" label L15; pop ;# (15) pop load x ;# (16) loadScalar1 %v0 # var "x" ;# (18) done}} % f 3 3 % f 1 2Of course, tcl::mathfunc::hypot is C-coded and much faster than this...
% aproc hypot {x y} {expr {sqrt($x**2+$y**2)}} -x proc hypot {x y} {asm { push tcl::mathfunc::sqrt ;# (0) push1 0 # "tcl::mathfunc::sqrt" load x ;# (2) loadScalar1 %v0 # var "x" push 2 ;# (4) push1 1 # "2" expon ;# (6) expon load y ;# (7) loadScalar1 %v1 # var "y" push 2 ;# (9) push1 1 # "2" expon ;# (11) expon add ;# (12) add invokeStk 2 ;# (13) invokeStk1 2 tryCvtToNumeric ;# (15) tryCvtToNumeric ;# (16) done}} % hypot 3 4 5.0
% aproc f x {incr x -1} -x proc f x {asm { incrImm x -1 ;# (0) incrScalar1Imm %v0 -1 # var "x" ;# (3) done}} % f 5 4while loops can already be handled:
% aproc f x {while {$i <= $x} {puts $i; incr i}} -x proc f x {asm { jump L22 ;# (0) jump1 +22 # pc 22 label L2; push puts ;# (2) push1 0 # "puts" load i ;# (4) loadScalar1 %v1 # var "i" invokeStk 2 ;# (6) invokeStk1 2 pop ;# (8) pop ;# (9) startCommand +12 1 # next cmd at pc 21 incrImm i +1 ;# (18) incrScalar1Imm %v1 +1 # var "i" pop ;# (21) pop label L22; load i ;# (22) loadScalar1 %v1 # var "i" load x ;# (24) loadScalar1 %v0 # var "x" le ;# (26) le jumpTrue L2 ;# (27) jumpTrue1 -25 # pc 2 push {} ;# (29) push1 1 # "" ;# (31) done }}for loops can also well be handled:
% aproc f x {for {set i 0} {$i<$x} {incr i} {puts $i}} -x proc f {x {i {}}} {asm { push 0 ;# (0) push1 0 # "0" store i ;# (2) storeScalar1 %v1 # var "i" pop ;# (4) pop jump L27 ;# (5) jump1 +22 # pc 27 label L7; push puts ;# (7) push1 1 # "puts" load i ;# (9) loadScalar1 %v1 # var "i" invokeStk 2 ;# (11) invokeStk1 2 pop ;# (13) pop ;# (14) startCommand +12 1 # next cmd at pc 26 incrImm i +1 ;# (23) incrScalar1Imm %v1 +1 # var "i" pop ;# (26) pop label L27; load i ;# (27) loadScalar1 %v1 # var "i" load x ;# (29) loadScalar1 %v0 # var "x" lt ;# (31) lt jumpTrue L7 ;# (32) jumpTrue1 -25 # pc 7 push {} ;# (34) push1 2 # "" ;# (36) done }} % f 4 0 1 2 3 %
What didn't work yetThe following illustrates the issue of mid-code "done" (see (9)).. we can't convert it to input TAL. Maybe a "jump" to a label at the end might help? More specifically, when a "done" is met in non-final position, it shall be converted to "jump done", and a "label done" added at the end.
% aproc f x {if {$x > 0} {return 1} else {return 0}} -x proc f x {asm { load x ;# (0) loadScalar1 %v0 # var "x" push 0 ;# (2) push1 0 # "0" gt ;# (4) gt jumpFalse L12 ;# (5) jumpFalse1 +7 # pc 12 push 1 ;# (7) push1 1 # "1" ;# (9) done ;# (10) nop ;# (11) nop label L12; push 0 ;# (12) push1 0 # "0" ;# (14) done ;# (15) done}} % f 1 inconsistent stack depths on two execution pathsThe above error came from a Tclkit 8.6.1 on a Lubuntu netbook. The same example on tclsh 8.6b2 on Win XP works there (with the solution I had in mind - jump to final "done" which here is called "L33" ;^)
% aproc f x {if {$x > 0} {return 1} else {return 0}} -x proc f x {asm { load x ;# (0) loadScalar1 %v0 # var "x" push 0 ;# (2) push1 0 # "0" gt ;# (4) gt jumpFalse L21 ;# (5) jumpFalse1 +16 # pc 21 ;# (7) startCommand +12 1 # next cmd at pc 19 push 1 ;# (16) push1 1 # "1" ;# (18) done jump L33 ;# (19) jump1 +14 # pc 33 label L21; ;# (21) startCommand +12 1 # next cmd at pc 33 push 0 ;# (30) push1 0 # "0" ;# (32) done label L33; ;# (33) done }} 37 % f 3 1This was fixed in dis2asm gets things done.Another issue: the "dis" language generated for foreach looks so different that at the moment I was only puzzled. The language accepted by "asm" has no instruction beginning with "foreach"...The story continues at dis2asm does macros - dis2asm gets things done - dis2asm gets better - dis2asm learns to catch