Description
This module provides support for CSV data. It supports multi-line records.

Example: Parsing CSV
    #! /bin/env tclsh

    package require csv
    package require struct::matrix

    # Create a matrix object named "data" and fill it from the file
    ::struct::matrix data
    set chan [open myfile.csv]
    csv::read2matrix $chan data , auto
    close $chan

    # Print every row of the matrix
    set rows [data rows]
    for {set row 0} {$row < $rows} {incr row} {
        puts [data get row $row]
    }

auto is almost always what you want when reading into a matrix, but it isn't the default, so it has to be passed explicitly to [csv::read2matrix]. That in turn means the separator argument that precedes it, usually a comma, must also be passed.
    #! /bin/env tclsh

    package require csv
    package require struct::queue

    # Create a queue object named "data" and fill it from the file
    ::struct::queue data
    set chan [open myfile.csv]
    csv::read2queue $chan data
    close $chan

    # Each queue item is one record
    while {[data size] > 0} {
        puts [data get]
    }
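When the input is known to hold exactly one record per line, a matrix or queue may be more than is needed. The following is a minimal sketch, not one of the original examples, that splits a single line with csv::split. The sample line is made up for illustration. Note that splitting line by line will not cope with records that span several lines.

```tcl
package require csv

# A made-up record: a plain field, a field containing a comma,
# and a field containing embedded (doubled) quotes.
set line {123,"123,521.2","Mary says ""Hello, I am Mary"""}

# csv::split returns the fields of one record as a Tcl list.
set fields [csv::split $line]

foreach field $fields {
    puts "<$field>"
}
```

For files whose records may contain embedded newlines, the read2matrix and read2queue commands shown above remain the safer choice.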
Example: Generating CSV
    package require csv

    # Make the lists to convert to csv-format
    set exampleList1 {123 123,521.2}
    lappend exampleList1 {Mary says "Hello, I am Mary"}
    lappend exampleList1 {}
    set exampleList2 {a b c d e f}
    set exampleList3 {}
    for {set i 0} {$i < 10} {incr i} {
        lappend exampleList3 $i
    }

    # Make a list of lists...
    set exampleLists [list $exampleList1 $exampleList2 $exampleList3]

    # Write the data to a file
    set f [open exampleLists.csv w]
    puts $f [csv::joinlist $exampleLists]
    close $f

The result of running this program is 4 lines: one for each example list, and an empty line.
[JDW]: The "empty line" (mentioned above) is a nuisance for some applications. It is the result of [csv::joinlist] including a newline at the end of every line, rather than as a delimiter between lines. Then the [puts] adds another newline. The extra newline can be avoided by using the following construct:
    puts -nonewline $f [csv::joinlist $exampleLists]

Of course, the [write_file] command from Tclx would make the [open]/[puts]/[close] sequence all one line:
    write_file -nonewline exampleLists.csv [csv::joinlist $exampleLists]

However, -nonewline isn't supported on write_file. My first thought was that the extra newline shouldn't be added by [csv::joinlist], but perhaps the real deficiency is that [write_file] should support -nonewline. One way or the other, it would be handy to make [write_file] and [csv::joinlist] work together.

The (very ugly) workaround I've come up with is:
    write_file exampleLists.csv [string range [csv::joinlist $exampleLists] 0 end-1]

Of course, that's probably not efficient for writing non-trivial file sizes.

In case this behavior is version-dependent, this was tested using ActiveTcl 8.4.19.1 on Linux.
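Another possibility, offered here only as a sketch, is to avoid the trailing newline altogether by formatting each record with csv::join and placing newlines between records rather than after them. The proc name joinRows and the sample data are hypothetical and not part of the csv package.

```tcl
package require csv

# Hypothetical helper: format a list of records as CSV text with
# newlines *between* records and none after the last one.
proc joinRows {rows} {
    set lines {}
    foreach row $rows {
        # csv::join formats one record, quoting fields where needed.
        lappend lines [csv::join $row]
    }
    return [join $lines \n]
}

# Made-up sample data.
set text [joinRows {{a b c} {1 {2,5} 3}}]
puts $text
```

The result can then be handed to a plain [puts] or to [write_file], each of which adds the single final newline itself.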
Demos
Tcllib also comes with a few sample programs demonstrating the usefulness of the csv package. See the tcllib/examples/csv/ directory for code to convert csv files to html, to cut out csv columns, to join csv data from two files, to sort csv files by column, to do a 'uniq' type function on csv columns, etc. Currently at version 0.0.

These demos are in the tcllib source tree. If you want to use them, however, you have to install them by hand.

The csv utility commands in tcllib/examples/csv/ are:

csv2html
    csv2html ?-sep sepchar? ?-title string? file...

Reads CSV data from the files and returns it as an HTML table on stdout.
csvcut
    csvcut ?-sep sepchar? LIST file...

Like "cut", but for CSV files. Prints selected parts of CSV records from each FILE to standard output. LIST is a comma-separated list of column specifications. The allowed forms are:

    N    numeric specification of a single column
    N-M  range specification, both parts numeric, N < M required
    -M   see N-M; N defaults to 0
    N-   see N-M; M defaults to the last column

If there are no files, or file is "-", input is read from stdin.
csvdiff
    csvdiff ?-n? ?-sep sepchar? ?-key LIST? file1 file2

Like "diff", but for CSV files. Compares selected columns of the CSV records in the two files and prints the differences to standard output.

    -n            indicates that line numbers should be output
    -sep sepchar  indicates that sepchar, instead of a comma, separates the CSV columns
    -key LIST     LIST is a comma-separated list of column specifications

The allowed forms of a column specification are:

    N    numeric specification of a single column
    N-M  range specification, both parts numeric, N < M required
    -M   see N-M; N defaults to 0
    N-   see N-M; M defaults to the last column

file1 and file2 are the files to be compared.

Example of use:
    $ cat > f1
    a|b|c|d|e|f|g|h|i|j|
    1|2|3|d|e|F|g|h|i|j|
    x|y|z|d|e|f|g|h|i|j|
    ^D
    $ cat > f2
    a|b|c|d|e|f|g|h|i|j|
    1|2|3|d|e|f|g|h|i|j|
    x|y|z|d|e|f|g|h|i|j|
    ^D
    $ csvdiff -sep '|' -key '0 5 8 9' f1 f2
    -|1|2|3|d|e|F|g|h|i|j|
    +|1|2|3|d|e|f|g|h|i|j|

Note that if you want to compare several fields, I find that I have to use spaces to separate them, rather than commas as the comments imply. Also, if there are multiple lines in one of the files that are identical in the columns specified, a warning similar to this will appear:
    warning: 0 2942 0000 R occurs multiple times in f1 (lines 2634 and 2633)

Also, all the first file's lines are output first, then the second file's lines are output.
csvjoin
    csvjoin ?-sep sepchar? ?-outer? keycol1 file1.in keycol2 file2.in file.out|-

Joins the two CSV input tables using the specified columns as keys to compare and associate. The result contains all columns from both files except the second key column (the result needs only one key column; the other is identical by definition and therefore superfluous).

Options:

    -sep    specifies the separator character used in the input file. Default is comma.
    -outer  flag; perform an outer join. If the key is missing in file2, a record is nevertheless written, extended with empty values.
csvsort
    csvsort ?-sep sepchar? ?-f? ?-n? ?-r? ?-skip cnt? column file.in|- file.out|-

Like "sort", but for CSV files. Sorts by the specified column. Input and output are from and to a file or stdin and stdout (any combination is possible).

Options:

    -sep   specifies the separator character used in the input file. Default is comma.
    -n     if specified, integer sorting is used
    -f     if specified, floating point sorting is used (-n and -f exclude each other; if both are used, the last option decides the mode)
    -r     if specified, reverse sorting is used (largest first)
    -skip  if specified, that number of rows is skipped at the beginning, i.e. excluded from sorting. This allows sorting of CSV files with header lines.
csvuniq
    csvuniq ?-sep sepchar? column file.in|- file.out|-

Like "uniq", but for CSV files. Uniq's the specified column. Writes the first record it encounters for a value. Input and output are from and to a file or stdin and stdout (any combination is possible).

Options:

    -sep  specifies the separator character used in the input file. Default is comma.

[Examples of how to use the above commands would be helpful]
Alternatives
- tclcsv is a binary extension that parses CSV much faster than the pure-Tcl implementation can manage.
- tsv (part of tcl-hacks) is a TclOO-based parser for CSV-like formats. It is somewhat slower than Tcllib's CSV, but offers a nicer interface and more flexibility while passing the same test suite.
- csv - this page's "See Also" section links several related projects