About regexp -about
The regexp command has an -about flag that, instead of attempting a match, returns information about the regular expression itself. Hopefully more information about this option will be added here as time goes by.
 $ set r {a{,3}}
 a{,3}
 $ set b {aabc}
 aabc
 $ regexp $r $b
 0
 $ regexp $r {abc}
 0
 $ regexp -- $r $b
 0
 $ regexp -about $r $b
 0 {REG_UBRACES REG_UUNSPEC}
 $ set r {a{0,3}}
 a{0,3}
 $ regexp -about $r $b
 0 {REG_UBOUNDS REG_UEMPTYMATCH}
 $ regexp -- $r $b
 1
The values returned from regexp -about are:
- A count of the capturing subexpressions in the regular expression (i.e., the number of submatches available),
- A list of symbols which indicate something about the regular expression.
- REG_UBACKREF: Contains back-references.
- REG_ULOOKAHEAD: Contains lookahead constraints (e.g., “(?=...)”)
- REG_UBOUNDS: appears to indicate that the {m,n} quantifier is used
- REG_UBRACES: appears to indicate that braces are used in a non-metacharacter manner
- REG_UBSALNUM: Backslash followed by (unrecognized) alphanumeric. REG_UUNSPEC also set.
- REG_UPBOTCH: RE engine detected that it is in a case that the POSIX spec botched (unmatched “)” character)
- REG_UBBS: Backslash in bracketed term (i.e., “[...\...]”)
- REG_UNONPOSIX: Not a POSIX RE.
- REG_UUNSPEC: Contains something not covered by the specification.
- REG_UUNPORT: RE is formally unportable to different character sets other than the one it was designed for (not a problem in practice; Tcl always uses UNICODE characters)
- REG_ULOCALE: Has a dependency on the locale (only one locale currently supported, so not a problem)
- REG_UEMPTYMATCH: Can match the empty string.
- REG_UIMPOSSIBLE: Cannot match anything.
- REG_USHORTEST: Overall non-greedy regular expression.
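As a rough illustration of how the two values returned by -about might be consumed, here is a small sketch. The proc names reInfo, reHasFlag and reMatches are made up for this example and are not part of Tcl; the expected values in the comments rely only on the session above and on the flag descriptions in the list. It assumes Tcl 8.5 or later (for lassign, dict and the in operator).

```tcl
# Hypothetical helper: return a dict holding the subexpression count and
# the list of REG_U* symbols that regexp -about reports for an RE.
proc reInfo {re} {
    # The empty string argument is passed only to mirror the session above.
    lassign [regexp -about -- $re {}] count flags
    return [dict create count $count flags $flags]
}

# Hypothetical helper: true if -about reports the given symbol for the RE.
proc reHasFlag {re flag} {
    expr {$flag in [dict get [reInfo $re] flags]}
}

# Hypothetical helper: split the flat list returned by -all -inline into
# one sublist per match, using the count from -about (plus one for the
# full match) as the group width.
proc reMatches {re str} {
    set width [expr {[dict get [reInfo $re] count] + 1}]
    set groups {}
    set flat [regexp -all -inline -- $re $str]
    for {set i 0} {$i < [llength $flat]} {incr i $width} {
        lappend groups [lrange $flat $i [expr {$i + $width - 1}]]
    }
    return $groups
}

puts [dict get [reInfo {(a+)(b*)}] count]     ;# 2
puts [reHasFlag {a{0,3}} REG_UEMPTYMATCH]     ;# 1, as in the session above
puts [reHasFlag {(a)\1} REG_UBACKREF]         ;# 1
puts [reMatches {(\d+)-(\d+)} "1-2 30-40"]    ;# {1-2 1 2} {30-40 30 40}
```

The grouping helper is one possible use of the count: it avoids having to parse the RE by hand to find out how many submatches each full match carries.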
If you set up your regular expression in a Tcl variable, then you can have unintended consequences:
 set foo abc(def)
 set RE "$foo"
 regexp $RE $another_variable
Has anybody written a filter for variables that can clean them up before sticking them in a regular expression like this?
Lars H: It appears whoever wrote the above was either confused or made some fatal typo. Besides setting variables foo and RE, the above is 100% equivalent to
  regexp {abc(def)} $another_variable
Mixing it up
Mixing greedy and non-greedy quantifiers might not have the results you'd expect. See Henry Spencer's reply in "tcl 8.2 regexp not doing non-greedy matching correctly", comp.lang.tcl, 1999-09-20.
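A small sketch of the effect, assuming the rule from the re_syntax manual page that a branch takes its preference (greedy or non-greedy) from the first quantified atom in it. The patterns and the test string are invented for illustration.

```tcl
# The first quantifier is non-greedy, so the whole RE prefers the
# shortest match and the later greedy b+ gets only one character.
puts [regexp -inline {(a+?)(b+)} aabbb]   ;# aab aa b

# The first quantifier is greedy, so the whole RE prefers the longest
# match and the non-greedy b+? still ends up matching every b.
puts [regexp -inline {(a+)(b+?)} aabbb]   ;# aabbb aa bbb
```

In both cases the overall match length is settled first, and the individual quantifiers only divide up that span between them.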
See also: regexp, Regular Expressions, Regular Expression Examples, Regular Expression Debugging Tips
Testing and Debugging REs
Visual REGEXP
Visual REGEXP is a little script which helps you to debug your regexp with a "trial and error" method (get it here [1]).
http://www.lucidway.org/Marty/Tcl/TclWikiImages/1345.png (Link "broken" 15 Sep 2005, i.e., target server requires some kind of login.)
Image from Softpedia - http://linux.softpedia.com/
A Test Script
The following little test script can be used for testing RE's on the fly - BBH
#
# regexp tester/viewer
#
package require Tk
set SubMatchColors {red blue magenta orange cyan purple green}
proc clear {} {
    # clear old info
    foreach t [.txt tag names] {.txt tag remove $t 1.0 end}
}
proc do_re {} {
    clear
    
    # get matches by index
    set cmd [list regexp -inline -indices]
    if {$::LINE} {lappend cmd -line}
    if {$::ALL} {lappend cmd -all}
    lappend cmd -- $::EXP [.txt get 1.0 end]
    set l [eval $cmd]
    
    if {[llength $l] > 0} {
        # mark range of entire match
        set i1 "1.0 + [lindex [lindex $l 0] 0] chars"
        set i2 "1.0 + [expr {[lindex [lindex $l 0] 1] + 1}] chars"
        .txt tag add FullMatch $i1 $i2
        # mark any submatches
        set modval [llength $::SubMatchColors]
        set num 0
        set p2 -1
        foreach {match} [lrange $l 1 end] {
            if {[lindex $match 0] < $p2} {
                # previous match was really a full match when -all specified
                #   NOTE: this will also cause the outer set(s) of nested submatches
                #         to not be highlighted in any way - an enhancement would
                #         be to determine (by parsing the RE itself) how many subexpressions
                #         there are, then use that to determine the true "total match"
                #         instead of just looking for overlapping ranges, then any
                #         nested parens can be formatted (maybe by background, or underline,
                #         or italic, or bold, or size or ....) but you would need to
                #         determine a set of non-canceling highlights, then keep track
                #         of how many levels deep in an overlapping region of text you
                #         are in and use a set of modifiers for each level
                #
                #          BUT that is too complicated for a simple little test tool
                #                (at least for now)
                #
                #   Additional NOTE: the -about flag may be of use in determining the number of submatches
                #
                .txt tag add FullMatch "1.0 + $p1 chars" "1.0 + [expr {$p2 + 1}] chars"
                set num [expr {($num - 1) % $modval}]
            }
            set i1 "1.0 + [lindex $match 0] chars"
            set i2 "1.0 + [expr {[lindex $match 1] + 1}] chars"
            .txt tag add SubMatch$num $i1 $i2
            set p1 [lindex $match 0]
            set p2 [lindex $match 1]
            set num [expr {($num + 1) % $modval}]
        }
    } else {
        tk_messageBox -message "RE doesn't match!"
    }
}
wm title . "RE Checker"
label .lbl -text "Expression:"
entry .exp -textvar EXP
bind .exp <Return> do_re
set LINE 0
set ALL 0
frame .f
pack [label .f.label -text "options:"] -side left
pack [checkbutton .f.line -text "-line" -variable LINE] -side left
pack [checkbutton .f.all -text "-all" -variable ALL] -side left
pack [button .f.doit -text "Run regexp!" -command do_re] -side left -expand 1 -fill none
pack [button .f.clear -text "Reset Text" -command clear] -side left -expand 1 -fill none
text .txt -background grey25 -foreground white
.txt tag config FullMatch -background black -relief raised
set i 0
foreach clr $SubMatchColors {
    .txt tag config SubMatch$i -foreground $clr
    incr i
}
grid .lbl .exp -sticky ew
grid .f   -    -sticky ew -pady 5
grid .txt -    -sticky news
grid columnconfigure . 1 -weight 10
grid rowconfigure    . 2 -weight 10
Komodo
Someone ought to explain the RE debugger available in Komodo. Also, it would be good to have a comparison with Visual RegExp.
TREV
Check out http://www.doulos.com/knowhow/tcltk/examples/trev/ , where TREV, the Tcl Regular Expression Visualiser, is discussed. The purpose of it is to demonstrate how a regular expression matches text.
Regex-coach
Yet another ("sexier"?) regexp debugger appears at http://www.weitz.de/regex-coach . Note, though, that it implements Perl's regexp syntax.
redet
See redet for a tool to assist in developing regular expressions.
txt2regex
See ^txt2regex$ for a tool to assist in constructing regular expressions.
regexpviewer
I made my own, as well, located at regexpviewer - davidw
regfuzz
regfuzz is a collection of programs and scripts for testing regular expression robustness using randomly generated valid and invalid regular expressions. The base implementation is in C, but a Tcl interface via swig is included along with samples of its use.