NJG January 23, 2005: A speed-tuned version can now be downloaded from A regexp twist!
MG takes a quick shot at this in pure-Tcl on 8.4.9 ...
    proc regexpScriptPre8.5 {args} {
        if { [llength $args] < 3 } {
            error "wrong # args"
        }
        set rArgs [lrange $args 0 end-3]
        set cmd [lindex $args end]
        eval "foreach x \[regexp -inline $rArgs \[lindex \$args end-2\] \[lindex \$args end-1\]\] \{ \
            uplevel 1 \[list [list $cmd]\] \[list \$x\] \
        \}"
    }

Or, in 8.5, simplified with {*} (though untested, as I don't have 8.5):
    proc regexpScript8.5 {args} {
        if { [llength $args] < 3 } {
            error "wrong # args"
        }
        set rArgs [lrange $args 0 end-3]
        set cmd [lindex $args end]
        foreach x [regexp -inline {*}$rArgs [lindex $args end-2] [lindex $args end-1]] {
            uplevel 1 [list $cmd] [list $x]
        }
    }

Lars H: A backport of the 8.5 version to 8.4 with less Quoting hell:
    proc regexpScript {args} {
        if { [llength $args] < 3 } {
            error "wrong # args"
        }
        set rArgs [lrange $args 0 end-3]
        set cmd [lindex $args end]
        foreach x [eval [list regexp -inline] $rArgs [lrange $args end-2 end-1]] {
            uplevel 1 [list $cmd] [list $x]
        }
    }

which can of course be optimised further still by concatenating the lranges. But note that these do not do the same as the thing at the top of the page: here the last argument is a command prefix, whereas the compiled command is supposed to take an arbitrary script that accesses the match in global variables.
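The lrange concatenation mentioned above might look like the following. This is only a minimal sketch: the name regexpScriptShort is made up for illustration and does not appear elsewhere on this page. It relies on the fact that the switches, the expression, and the string together are simply everything except the last argument.

```tcl
# Hypothetical variant of regexpScript: the two lranges (0..end-3 and
# end-2..end-1) are merged into a single lrange 0..end-1.
proc regexpScriptShort {args} {
    if { [llength $args] < 3 } {
        error "wrong # args"
    }
    # The last argument is the command prefix to call for each element.
    set cmd [lindex $args end]
    # Everything before it is passed on to regexp -inline unchanged.
    foreach x [eval [list regexp -inline] [lrange $args 0 end-1]] {
        uplevel 1 [list $cmd] [list $x]
    }
}
```

Like the versions above, it still takes a command prefix rather than a script, so the caveat about the compiled command applies here too.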
MG Just decided to do a quick test to see what, if anything, the speed difference was...
    (Desktop) 6 % time {regexpScriptPre8.5 -all . {This is a test string} bleh} 500
    635 microseconds per iteration
    (Desktop) 7 % time {regexpScriptPre8.5 -all . {This is a test string} bleh} 5000
    664 microseconds per iteration
    (Desktop) 8 % time {regexpScriptPre8.5 -all . {This is a test string} bleh} 50000
    730 microseconds per iteration
    (Desktop) 9 % load xregexp.dll
    Extended regexp handling is in place
    (Desktop) 10 % time {regexp -inline -all . {This is a test string} bleh} 500
    897 microseconds per iteration
    (Desktop) 11 % time {regexp -inline -all . {This is a test string} bleh} 5000
    903 microseconds per iteration
    (Desktop) 12 % time {regexp -inline -all . {This is a test string} bleh} 50000
    1017 microseconds per iteration

As you can see, I did the tests using the pre-8.5 version (with eval), rather than the {*} version. All the tests were done on Tcl 8.4.9, and the Tcl-only version used the "normal" regexp; i.e., I only loaded NJG's package after I'd tested the plain-Tcl code. Surprisingly, the plain-Tcl version comes out slightly faster. The 'bleh' script used there was just:
proc bleh {x} { set ::tmp $x; return }
NJG January 19, 2005: MG, I would not say that a 40-50% difference is slight, so I looked into the code again. I found that the major part of the difference must come from my code saving 10 match variables at each match, while your regexp -inline creates a sublist of only as many elements as there are subexpression matches. In the actual test, 9 of the saves are superfluous!

It is easy to remedy this, so I will shortly post the corrected version.

I stated in the original posting (A regexp twist) that executing the script by Tcl_Eval at each match, as the code does now, is the least efficient way to do it. However, its effect is least pronounced when the script consists of only a single parameterless procedure call, as is the case in your test.

Anyway, I did this hack as it was easy and I found the result aesthetically pleasing. Perhaps I should create the solution which is efficient as well ...

Finally, your test does not take into account the time needed for fetching the match and submatch values from the list representation returned by regexp -inline! (The time of at least one set <varname> [lindex $x <index>] or lindex $x <index> command.)

Thanks for the feedback!
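To make the last point concrete, here is a small sketch of the extra step a caller of regexp -inline has to perform. The expression, the input string, and the variable names are made up for illustration and are not taken from either test above. With -all, regexp -inline returns one flat list holding, for every match, the full match followed by its subexpression matches, so the values still have to be picked apart afterwards.

```tcl
# Illustrative pattern with two subexpressions.
set matches [regexp -inline -all {([a-z])([0-9])} "a1 b2"]

# Walk the flat list three elements at a time:
# full match, first subexpression, second subexpression.
set pairs {}
foreach {full letter digit} $matches {
    lappend pairs [list $letter $digit]
}

# A single value can also be fetched by index, e.g. the first submatch:
set firstLetter [lindex $matches 1]
```

This unpacking is the work that the timings above leave out on the regexp -inline side.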