Updated 2015-04-03 17:50:27 by pooryorick

Diff utilities determine and present the differences between texts.

See Also

comparing files in tcl
Using Snit to glue diff, patch, and md5sum

Tcl Tools

diff in Tcl
Another diff in Tcl
DiffUtilTcl
tkdiff
eskil
A very nice graphical diff tool.
bindiff

Non-Tcl Tools

Active File Compare
Proprietary. CL has received mild testimonials. There's no particular Tcl connection; it's just been valuable to me as a Tcl developer when working under Windows.
WinMerge
Open-source, for Windows. A future cross-platform version is planned.
xdelta
Open-source binary diff and differential compression tools; VCDIFF (RFC 3284) delta compression. Jean-Samuel Gauthier: ...supports a simple but flexible callback interface to feed/extract data to/from the compressor. Tcl examples included.

Description

Most frequently, a diff is a comparison of two files. When the output is text, the Unix tradition is to display the differences as the changes that would have to be made to the first file to turn it into the second file.

In a GUI application, coloring or other techniques are often used to convey more information about what changed. Some applications highlight entire lines, while others highlight the individual characters that differ.

Specialized Diff

Arjen Markus: We have faced a slightly different problem: two files that should be compared with special care for (floating-point) numbers. The solution was simple in design:

  • Read the files line by line (all lines should be comparable, we did not need to deal with inserted or deleted lines)
  • Split the lines into words and compare the words as either strings or as numbers.
  • Use [string is double] to identify whether a "word" is actually a number and, if so, compare the words numerically (allowing a certain tolerance if required).

This way you are immune to numbers formatted in different ways: 0.1, +.1, 1.0E-01, +1.00e-001 all spell the same number and you can encounter all of these forms (sometimes you have less than perfect control over the precise format).
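The steps above might be sketched as follows. This is only an illustration, not Arjen Markus's actual code: the procedure names (wordsEqual, linesEqual, diffNumeric) and the absolute tolerance are assumptions made for the example, and the inputs are taken as lists of lines that correspond one to one.

```tcl
# Compare two words numerically when both look like numbers,
# otherwise as plain strings.
# tol is an absolute tolerance (an assumption; a relative tolerance
# may suit some data better).
proc wordsEqual {a b {tol 0.0}} {
    if {[string is double -strict $a] && [string is double -strict $b]} {
        return [expr {abs($a - $b) <= $tol}]
    }
    return [expr {$a eq $b}]
}

# Compare two lines word by word; words are runs of non-space characters.
proc linesEqual {line1 line2 {tol 0.0}} {
    set words1 [regexp -all -inline {\S+} $line1]
    set words2 [regexp -all -inline {\S+} $line2]
    if {[llength $words1] != [llength $words2]} {
        return 0
    }
    foreach w1 $words1 w2 $words2 {
        if {![wordsEqual $w1 $w2 $tol]} {
            return 0
        }
    }
    return 1
}

# Return the 1-based numbers of the lines that differ.
# Assumes both lists have corresponding lines (no insertions or deletions).
proc diffNumeric {lines1 lines2 {tol 0.0}} {
    set result {}
    set lineno 0
    foreach l1 $lines1 l2 $lines2 {
        incr lineno
        if {![linesEqual $l1 $l2 $tol]} {
            lappend result $lineno
        }
    }
    return $result
}
```

To use it on files, each file can be read and split into lines first, for example with [split [read $channel] \n]. Note that in Tcl 8.x a word with a leading zero such as 010 is treated as octal by expr, so data containing such words may need extra care.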

Arjen Markus: Question: would this not be a nice addition to the fileutil module in Tcllib?

GPS: maybe it would...

Arjen Markus: If so, it would benefit (in my opinion) from two custom procedures:

  • A procedure one can supply to compare the lines (for instance: ignore white-space or interpret numbers as numbers - my original problem)
  • A procedure to process the output (in a manner as Tkdiff does for instance)
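A minimal sketch of such an interface, assuming a line-by-line comparison of corresponding lines. Nothing here is part of Tcllib's fileutil; the names diffLines, exactMatch, ignoreSpace and printDiff are hypothetical, and the callback signatures are one possible choice among many. It requires Tcl 8.5 or later for {*}.

```tcl
# Default comparison callback: exact string equality.
proc exactMatch {a b} {
    return [expr {$a eq $b}]
}

# Hypothetical comparison callback: lines are equal if they match after
# collapsing runs of white space and trimming both ends.
proc ignoreSpace {a b} {
    set a [string trim [regsub -all {\s+} $a " "]]
    set b [string trim [regsub -all {\s+} $b " "]]
    return [expr {$a eq $b}]
}

# Hypothetical output callback: print one differing pair of lines.
proc printDiff {lineno a b} {
    puts "line $lineno:\n< $a\n> $b"
}

# Compare corresponding lines using compareCmd and hand every
# difference to reportCmd. Returns the number of differing lines.
# Both callbacks are command prefixes, so they may carry extra arguments.
proc diffLines {lines1 lines2 {compareCmd exactMatch} {reportCmd printDiff}} {
    set lineno 0
    set ndiff 0
    foreach a $lines1 b $lines2 {
        incr lineno
        if {![{*}$compareCmd $a $b]} {
            {*}$reportCmd $lineno $a $b
            incr ndiff
        }
    }
    return $ndiff
}
```

A caller would pick the callbacks per use, for instance [diffLines $old $new ignoreSpace printDiff]; a GUI front end could pass a report command that tags lines in a text widget instead of printing them.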

Arjen Markus: A few thoughts for improving the performance:

  • Store the lines as {lineno content}
  • Sort by content (lsort has this ability via "-index")
  • Use binary search to replace the inner loop.

This would reduce the number of iterations from O(N^2) to O(N log N). But perhaps it is not worth the trouble :-)
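The idea could look roughly like this, assuming the inner loop in question searches one file for a line equal to a given line of the other file. The procedure names buildIndex and findLine are made up for the example, and [lsearch -sorted -index] requires Tcl 8.5 or later.

```tcl
# Build a list of {lineno content} pairs, sorted by content.
# lsort and lsearch -sorted both default to -ascii ordering,
# so the two stay consistent.
proc buildIndex {lines} {
    set pairs {}
    set lineno 0
    foreach line $lines {
        lappend pairs [list [incr lineno] $line]
    }
    return [lsort -index 1 $pairs]
}

# Binary search for a line with the given content.
# Returns its 1-based line number, or 0 if there is no such line.
proc findLine {index content} {
    set pos [lsearch -sorted -index 1 $index $content]
    if {$pos < 0} {
        return 0
    }
    return [lindex $index $pos 0]
}
```

Sorting costs O(N log N) once, and each lookup is then O(log N) instead of a linear scan, for example:

```tcl
set index [buildIndex {delta alpha charlie bravo}]
findLine $index charlie
```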