Updated 2016-04-25 20:57:50 by gold

Richard Suchenwirth 2002-12-28 - In my Xmas math reading, I came upon the algorithm proposed by Georg Cantor (the creator of set theory, 1845-1918) for identifying points in 2D plane, e.g. in the unit square (x and y from 0 to 1), with scalar (1D) values, e.g. between 0 and 1, such that the mapping is unambiguous in both ways (bijective) - with the drastic consequence that the (infinite) number of points in the square equals the number of real numbers between 0 and 1. In Cantor's days there was no Tcl, but his approach strongly reminded me of the Tcl way:

  • interpret the digits of the x and y value as strings
  • split them into characters (digits) and iterate over them...
  • ... to build up a new digit string by concatenation. Example:
   x = 0.1 3 5 7 ...
   y = 0. 2 4 6 8 ...

By alternatingly appending digits from one and the other sequence, the resulting scalar value is
   z = 0.12345678......

This works in theory for irrational numbers, just that the algorithm would never terminate, because it would have to iterate over two infinite sequences of non-predictible digits to produce another infinite sequence. Luckily, in computing all our "float" numbers are of finite length (which can be queried or set with tcl_precision, maximum 17), so the following code is guaranteed to terminate:
 proc cantor'join {x y} {
    if ![regexp {^0[.]([0-9]*)$} $x -> xdigits] {
        error "bad x $x, must be between 0 and 1"
    }
    if ![regexp {^0[.]([0-9]*)$} $y -> ydigits] {
        error "bad y $y, must be between 0 and 1"
    }
    set res 0.
    foreach i [split $xdigits ""] j [split $ydigits ""] {
        append res [empty=0 $i] [empty=0 $j]
    }
    set res
 }
 proc cantor'split z {
    if ![regexp {^0[.]([0-9]*)$} [format %f $z] -> zdigits] {
        error "bad z $z, must be between 0 and 1"
    }
    foreach result {x y} {set $result 0.}
    foreach {i j} [split $zdigits ""] {
        append x $i
        append y $j
    }
    list [string trimright $x 0] [string trimright $y 0]
 }
 # Little helper to ensure 0s come after the end of a digit sequence:
 proc empty=0 x {expr {$x==""? 0: $x}}
 #------------ Testing:
 set tcl_precision 17
 puts z=[set z [cantor'join 0.12345 0.6789]]
 puts split:[cantor'split $z]

which results in:
 z=0.1627384950
 split:0.12345 0.67890
 % cantor'split [cantor'join 0.123 0.2345678]
 0.123 0.2345678

So the implementation appears to be correct - with the restriction, imposed by the fact that we have no "real" real numbers, that x,y,z all have to stay below 1. Otherwise, 1 could be rewritten as 0.99999... which amounts to infinitely many 9's - we have no infinite time at disposal to process that, but at least can get as close to 1 as 17 9's will bring us - and expr understands that message:-)
 % expr 0.99999999999999999
 1.0

When I told this story to my 12yo daughter Hanke at breakfast, her first question was: "Can you do that on a higher level too?" And yes - of course we can! The above methods, modified, are capable of weaving a "plait" from the coordinates in a vector of arbitrary dimension, and untangling it again (given the target dimension). The discussion then drifted into dimensions in general. She claimed that the fourth and fifth dimension must be "wind" and "smell" from cinema advertisements she'd seen, which is debatable - but made me think on what else could be "weaved" - unsigned integers (or bit sequences) come to mind, which would however have to be reverted in order to have common positional values...

To demonstrate the behavior of the cantor'split function, i.e. how a scalar iteration fills the square, here's some Tk code:
 proc dot {w x y {color red}} {
    $w create rect $x $y [incr x] [incr y] -fill $color -outline $color
 }
 pack [canvas .c -width 100 -height 100 -bg white] -fill both -expand 1
 for {set i 0} {$i<10000} {incr i} {
    foreach {x y} [cantor'split [expr {$i/10000.}]] {
        dot .c [expr int($x*100)] [expr int($y*100)]
        update
    }
 }

KPV -- I'm getting different results, probably resulting from different underlying number handling. Specifically, in the test above I get:
 z=0.1627384950
 () 62 % puts split:[cantor'split $z]
 split:0.123 0.678

rather than the expected:
 split:0.12345 0.67890

I'm running tk 8.4.0 and 8.4a4 on W2K - RS: That should not be. Have you set tcl_precision to 17 (the maximum, like in the above testing example)? - KPV: yes, I cut and pasted the example.

SCT: I get the same results with Tcl 8.4.1 on NT4. The problem appears to be the [format %f $z] in cantor'split. Apparently %f doesn't take tcl_precision into account. %0.17f gets closer, but isn't quite right. (because of floating point inaccurracies?)
 () 12 % set z
 0.1627384950
 () 13 % format %f $z
 0.162738
 () 14 % format %0.17f $z
 0.16273849500000001

RS: Right - tcl_precision is known for "binary noise" if set to 17. I must admit that I tested before adding the scan, and after that only with little numbers, so didn't notice.. However, with a %.16f format it works well on my W2k box (8.4.1):
 % cantor'join 0.12345 0.24689
 0.1224364859
 % set tcl_precision
 12
 % cantor'split 0.1224364859
 0.12345 0.24689

The scan was introduced only for normalizing omitted leading 0, e-format etc. Sorry for untested software!