Summary
Arjen Markus (21 February 2006) I am facing the task of modifying a lot of text files in a rather mechanical way. I used to do this kind of thing with AWK, but Tcl lends itself to this too. It is just a matter of the right "little language".
The task I am facing is not really interesting for anyone else, but the characteristics are fairly common:
- Certain modifications are required for a particular part of the file
- Some modifications apply to particular lines
- Defining regular expressions to capture exactly the lines you need can be tricky.
So it is probably easier to do it in steps.
# modify.tcl --
#     Yet another AWK-like utility. This one reads a file line by line
#     and decides on the basis of patterns marking the beginning and
#     end of a block of lines (section) what actions to take.
#
#     Note:
#     - sections may overlap
#     - what they do is up to you
#     - special sections are: begin, end and otherwise
#     - the command "nextline" causes the actions for any subsequent
#       sections to be cancelled.
#
namespace eval ::Sections {
    variable section_number 0
    variable section_data   {}
    variable section_active {}
    variable nextline       0

    namespace export section begin end otherwise nextline scanfile

    proc _begin {} {}
    proc _end {} {}
    proc _otherwise line {}
}

# section --
#     Define the beginning and end of a section and the actions to take
#
# Arguments:
#     begin      The regexp pattern marking the start
#     end        The regexp pattern marking the end
#     actions    The script to be run
#
# Result:
#     None
#
proc ::Sections::section {begin end actions} {
    variable section_number
    variable section_active
    variable section_data

    lappend section_data   $begin $end
    lappend section_active 0

    proc ::Sections::$section_number line $actions

    incr section_number
}

# begin --
#     Define the actions for the beginning of a file
#
# Arguments:
#     actions    The script to be run
#
# Result:
#     None
#
proc ::Sections::begin {actions} {
    proc ::Sections::_begin {} $actions
}

# end --
#     Define the actions for the end of a file
#
# Arguments:
#     actions    The script to be run
#
# Result:
#     None
#
proc ::Sections::end {actions} {
    proc ::Sections::_end {} $actions
}

# otherwise --
#     Define the actions for lines not falling in any section
#
# Arguments:
#     actions    The script to be run
#
# Result:
#     None
#
proc ::Sections::otherwise {actions} {
    proc ::Sections::_otherwise line $actions
}

# nextline --
#     Instruct the scanning procedure to skip all remaining sections
#
# Arguments:
#     None
#
# Result:
#     None
#
proc ::Sections::nextline {} {
    variable nextline
    set nextline 1
}

# scanfile --
#     Scan the file, taking actions appropriate for the
#     sections the line is part of
#
# Arguments:
#     filename   Name of the file to scan
#
# Result:
#     None
#
proc ::Sections::scanfile {filename} {
    variable section_number
    variable section_data
    variable section_active
    variable nextline

    set infile [open $filename r]

    _begin

    while { [gets $infile line] >= 0 } {
        set nextline  0
        set id        -1
        set insection 0
        foreach {start stop} $section_data active $section_active {
            incr id
            if { $active } {
                if { [regexp $stop $line] } {
                    lset section_active $id 0
                }
            } else {
                if { [regexp $start $line] } {
                    lset section_active $id 1
                    set active 1
                }
            }
            if { $active } {
                set insection 1
                $id $line
                if { $nextline } {
                    break
                }
            }
        }
        if { ! $insection } {
            _otherwise $line
        }
    }

    _end

    close $infile
}

# main --
#     Simple test case and demo
#
namespace import ::Sections::*

begin {
    puts "List of procedures:"
    set ::count 0
}
section "^#.*--" "^ *proc" {
    puts "| $line"
    if { [regexp "#.*--" $line] } {
        set ::count 0
    }
}
section "{" "^#.*--" {
    incr ::count
    if { $line == "\}" } {
        # Naive criterium for the end of a procedure
        puts "(Number of lines: $::count)"
    }
}

scanfile $argv0
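The heart of scanfile is the on/off flag it keeps for each section. The following is a minimal, self-contained sketch of that toggle for a single section, working on a list of lines rather than a file. The proc name linesInSection and the sample data are made up for illustration and are not part of modify.tcl. As in scanfile, the line matching the stop pattern still counts as part of the section; the section is only inactive from the next line on.

```tcl
# linesInSection -- (hypothetical helper, for illustration only)
#     Return the lines that fall inside a section delimited by the
#     regexp patterns start and stop, mimicking the toggle in scanfile.
#
proc linesInSection {start stop lines} {
    set stored 0      ;# state carried over from line to line
    set result {}
    foreach line $lines {
        set active $stored
        if { $active } {
            if { [regexp $stop $line] } {
                # The section ends, but this line is still processed
                set stored 0
            }
        } else {
            if { [regexp $start $line] } {
                set stored 1
                set active 1
            }
        }
        if { $active } {
            lappend result $line
        }
    }
    return $result
}

set sample [list "header" "BEGIN data" "1 2 3" "4 5 6" "END data" "footer"]
foreach line [linesInSection {^BEGIN} {^END} $sample] {
    puts "| $line"
}
```

With the sample data, only the lines from "BEGIN data" up to and including "END data" are printed; "header" and "footer" would be the lines that scanfile hands to the otherwise actions.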
Comments
Very useful indeed! I fixed a small bug: the "if {$insection} ..." test is better placed outside the foreach loop.
An excerpt of the demo's output:
| # scanfile --
| #     Scan the file, taking actions appropriate for the
| #     sections the line is part of
| #
| # Arguments:
| #     filename   Name of the file to scan
| #
| # Result:
| #     None
| #
| proc ::Sections::scanfile {filename} {
(Number of lines: 46)