Arity corner cases in math commands

SS 18Mar2004

Changelog

20 Mar 2004 - Changed this document to reflect what I think now, after some days of discussion.

Background

TIP 174 proposed adding Tcl commands as an alternative to expr for doing math in Tcl. Ideally, all the operators and mathematical functions in expr should be provided as commands, but in this document I want to focus on the arity corner cases, so I'll mainly take +, -, *, and / as examples.

In fact, while it's trivial to guess what the command [+ 2 3] should do (just add the two numbers), it isn't for [- 4], [/ 40 10 2] and so on. This short memo tries to summarize the ideas discussed in the Tcler's chat, and in some cases expresses my own view of what the best design for TIP 174 is.

Looking for existing designs

We are not the first to face these problems. Lisp dialects already had to find solutions for them. We will study the solution adopted by Scheme, one of the formally cleaner dialects of Lisp, and try to improve on it so that it fits Tcl better.

Binary arity case

The simplest case is of course the binary one. Since +, -, * and / are all binary operators, the meaning is just:

  [+ a b] => a + b
  [- a b] => a - b
  [* a b] => a * b
  [/ a b] => a / b

That's of course Scheme's behaviour.

Three or more arguments

The next step is to decide what to do with three or more arguments.

For + and *, the commands just do the sum or product of all the arguments passed. For - and /, left association is used. So:

  [+ a b c ... z] => a + b + c + ... + z
  [- a b c ... z] => (( ... ((a - b) - c) - ... ) - z)
  [* a b c ... z] => a * b * c * ... * z
  [/ a b c ... z] => (( ... ((a / b) / c) / ... ) / z)
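
The left association for - and / can be sketched in a few lines of Tcl. This is only an illustration, not the TIP 174 implementation: the namespace fold and the helper leftfold are hypothetical names chosen for this example, and the {*} syntax assumes Tcl 8.5 or later.

```tcl
# Hypothetical sketch: "fold" and "leftfold" are names invented for this example.
namespace eval fold {
    # Combine the arguments left to right with the binary expr operator $op.
    proc leftfold {op first args} {
        set r $first
        foreach a $args {
            # $op is substituted here; $r and $a are resolved by expr itself.
            set r [expr "\$r $op \$a"]
        }
        return $r
    }
    # Both commands require at least two arguments in this sketch.
    proc - {first second args} { leftfold - $first $second {*}$args }
    proc / {first second args} { leftfold / $first $second {*}$args }
}

puts [fold::- 10 1 2 3]   ;# ((10 - 1) - 2) - 3, prints 4
puts [fold::/ 40 10 2]    ;# (40 / 10) / 2, prints 2
```

Note that / follows expr here, so integer operands give integer division.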

This seems to be what the user expects, following the principle of least surprise. Not providing this extension would just waste an opportunity to type less.

One or zero arguments

Handling the one and zero argument cases is a bit more complex. Here is what Scheme does: if we pass a single argument, it is used as the second argument, and the neutral (identity) element for the operator is used as the first argument (0 for + and -, 1 for * and / of course).

  (+ x) is equivalent to (+ 0 x), that's x
  (- x) is equivalent to (- 0 x), that's -x
  (* x) is equivalent to (* 1 x), that's x
  (/ x) is equivalent to (/ 1 x), that's 1/x

With zero arguments, + and * just return the neutral:

  (+) returns 0
  (*) returns 1

- and / are invalid with zero arguments.

What are the advantages of this for Tcl?

It saves some (minimal) keystrokes when you need the reciprocal.
The commands work with any arity in a meaningful way (except for - and / with zero arguments).
[- x] returning -x is visually somewhat natural.
Variadic - and / make it much more likely that less [] nesting is needed.

What about the problems of this solution?

The least surprise rule is violated: the behaviour is not obvious. (I'm not sure btw, maybe [- x] => error is a bigger surprise!)
It does not always play well with {*}.
Other operators may have a less sound extension to 0 or 1 arguments.

An example of bad interaction with {*} is the following:

  - $n {*}$list

The intuitive meaning of the expression is "to subtract all the elements of $list in turn, from $n". You may not expect it to return the reciprocal of $n if $list happens to be empty. Sometimes you may also want to raise an error if there aren't at least two arguments, in order to avoid hard to trace bugs. Of course you can always write:

  - $n 0 {*}$list

to avoid the reciprocal problem, but the default behaviour is still not good, and there is no way to raise an error by default on a suspicious number of arguments.
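
To make the problem concrete, here is a minimal sketch of a Scheme-style minus and what happens when the expanded list is empty. The namespace scheme and the variable names are assumptions made for this example only. For -, the single-argument result is the negated value of $n (what the text above loosely calls the reciprocal).

```tcl
# Hypothetical sketch: the namespace "scheme" exists only for this example.
namespace eval scheme {
    # One argument negates; two or more are subtracted left to right.
    proc - {first args} {
        if {[llength $args] == 0} {
            return [expr {0 - $first}]
        }
        set r $first
        foreach a $args {
            set r [expr {$r - $a}]
        }
        return $r
    }
}

set n 10
set items {1 2}
set empty {}
puts [scheme::- $n {*}$items]     ;# 10 - 1 - 2, prints 7
puts [scheme::- $n {*}$empty]     ;# becomes [- 10], prints -10
puts [scheme::- $n 0 {*}$empty]   ;# becomes [- 10 0], prints 10
```

The second call is the surprising one: syntactically it has two arguments, but after expansion the command sees only one.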

BUT SEE THIS!! Added later to this document; it's verbatim what I wrote in the chat:

 <antirez> I think I found the real reson why - interact badly with {*} sometimes
 <antirez> it's the only Tcl form where what is in syntax one argument, may be not in semantic
 <antirez> proc x {args} {puts [llength $args]}
 <antirez> x => 0
 <antirez> x 1 2 3 => 3
 <antirez> x {*}$emtpyList => 0
 <antirez> this was impossible in Tcl before {*}.

An alternative solution that may work better is to use the neutral element as the second argument when the user provides just one. So:

  [+ $x] will be equivalent to [+ $x 0]
  [- $x] will be equivalent to [- $x 0]
  [* $x] will be equivalent to [* $x 1]
  [/ $x] will be equivalent to [/ $x 1]
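
A minimal sketch of this variant, under the assumption that a call with no arguments at all should be an error (the memo does not say). The namespace right is a hypothetical name used only here.

```tcl
# Hypothetical sketch: the namespace "right" exists only for this example.
namespace eval right {
    foreach {op neutral} {+ 0 - 0 * 1 / 1} {
        proc $op {first args} [format {
            # With a single argument, use the neutral element as the second one.
            if {[llength $args] == 0} {
                set args [list %s]
            }
            set r $first
            foreach a $args {
                set r [expr {$r %s $a}]
            }
            return $r
        } $neutral $op]
    }
}

set empty {}
puts [right::- 5]               ;# same as [- 5 0], prints 5
puts [right::/ 5]               ;# same as [/ 5 1], prints 5
puts [right::- 10 {*}$empty]    ;# prints 10, no sign flip
```

With this definition a single argument is always returned unchanged.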

This saves us from the reciprocal problem with {*}. Still, the least surprise principle is violated for quite small advantages. On the other hand, these two solutions are sound in terms of orthogonality and coherence. What's a good alternative?

An alternative solution

Raise an error for *every* binary operator called with fewer than two arguments. This is of course very simple to explain, and having to type:

  [- 0 $x] instead of [- $x]

is not too hard, though it is needless if users are aware of the [- x] => -x semantics. On the other hand, this interacts quite well with {*}.

  [+ {*}$l] ; # Will raise an error if the list has fewer than two elements.
  [+ 0 {*}$l] ; # Will raise an error if the list is empty.
  [+ 0 0 {*}$l]; # Returns zero with an empty list. Otherwise the sum.

What's important here is that the user can select from different behaviours, with the default being the safest.

It works well even with - and /.

  [- $n {*}$l]; # Error on empty list.
  [- $n 0 {*}$l]; # Returns $n on an empty list; otherwise what most people expect.
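
The strict variant is easy to sketch, because a Tcl proc with two required parameters already raises a "wrong # args" error when called with fewer. The namespace strict is a hypothetical name for this example; three or more arguments are still folded left to right.

```tcl
# Hypothetical sketch: the namespace "strict" exists only for this example.
namespace eval strict {
    foreach op {+ - * /} {
        # Two required parameters: Tcl itself rejects calls with fewer.
        proc $op {a b args} [format {
            set r [expr {$a %s $b}]
            foreach x $args {
                set r [expr {$r %s $x}]
            }
            return $r
        } $op $op]
    }
}

set l {}
puts [catch {strict::+ {*}$l}]    ;# prints 1: an error was raised
puts [catch {strict::+ 0 {*}$l}]  ;# prints 1: still only one argument
puts [strict::+ 0 0 {*}$l]        ;# prints 0
puts [strict::- 10 0 {*}$l]       ;# prints 10
```

The caller opts in to the lenient behaviour by writing the neutral elements explicitly.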

But there are several problems with this:

[- x] => -x is very useful and clean, why not?
[- a b c] => (a - b) - c is even more useful, and [- [- a b] c] is not clean at all. You can still use expr of course.
Since Scheme's design is not bad at all, why invent something new for advantages that are not clear?

Scheme semantics mini implementation

The following code implements the Scheme behaviour (with the optional part of three or more arguments for - and /), with the only difference that it accepts zero arguments for - and / (which return what + and * return with zero arguments).

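 # Define +, -, * and / as commands; neutral is the identity element of each operator.
 foreach {op neutral} {+ 0 - 0 * 1 / 1} {
    proc $op args [format {
        # With zero or one argument, prepend the neutral element.
        if {[llength $args] <= 1} {
            set args [concat %s $args]
        }
        # Fold the remaining arguments from left to right.
        set r [lindex $args 0]
        set args [lrange $args 1 end]
        foreach a $args {
            set r [expr {$r %s $a}]
        }
        return $r
    } $neutral $op]
 }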

NEM Just want to clarify some stuff here. Firstly, I have a scheme interpreter sitting here (actually, it's LispMe [1] on my palm pilot. It's apparently a mostly conforming scheme interpreter). The results I get are:

 (+) -> 0
 (-) -> error (wrong # args)
 (*) -> 1
 (/) -> error (wrong # args)

+ and * both operate on lists, and are defined (it seems) equivalently to:

 proc + {args} { expr [join [concat $args 0] +] }
 proc * {args} { expr [join [concat $args 1] *] }

However, - and / work on either 1 or 2 arguments, and no more. Anything else gives an error. In the one argument case, - acts as negation, and / acts as reciprocal (if they're the right terms):

 (- 1) -> -1
 (/ 2) -> 0.5

In the two argument case, they work as expected. The more I think about it, the more this behaviour makes sense (with the possible exception of (/ x) being equivalent to (1/x)). If you try and make - work on a list, then you need to specify a different operator for negation, IMHO.

SS You are right, (-) and (/) do not return 0 and 1 but just an error. About three or more arguments for - and /, here is what R5RS states:

 procedure:  (+ z1 ...)
 procedure:  (* z1 ...)

 These procedures return the sum or product of their arguments.

 (+ 3 4)                         ===>  7
 (+ 3)                           ===>  3
 (+)                             ===>  0
 (* 4)                           ===>  4
 (*)                             ===>  1

 procedure:  (- z1 z2)
 procedure:  (- z)
 optional procedure:  (- z1 z2 ...)
 procedure:  (/ z1 z2)
 procedure:  (/ z)
 optional procedure:  (/ z1 z2 ...)

Actually they are optional; my interpreter ([mzscheme]) just happens to support this, and your implementation does not. I corrected the above document regarding - and / without arguments.

I think that if we want [- x] returning 0-x, we should also have [/ x] returning 1/x. Btw note that most scheme interpreters follow the "optional" part.

For me it is ok to follow Scheme exactly (with or without the optional part; I prefer with, btw). I just don't think that mixing is a good idea.

male - 18.03.2004:

Sorry - in my opinion, we have already non-intuitive commands in tcl and/or tk, so ... if it is clearly described in the man pages, that ...

 [- x {*}list]

... is equal to ...

 [- x listElem1 listElem2 ... listElemN]

... then everything is ok!

But to disallow such things like ...

 [- 1]

... and to force to use ...

 [- 0 1]

... is most counterintuitive and should be avoided!

NEM Well, but consider this case:

 set somecalc [- $total {*}$list]

If $list is empty, then the result is [expr {0 - $total}]. This is also not very intuitive. So, it seems much better to me to either support using - for negation, or support lists of arguments greater than 2 elements. If you try and do both, it's a mess.

male Yes, but it's my duty as a programmer to let only "valid" lists go into this calculation! The "API" is only as intelligent as the programmer who uses it. To make an "API" absolutely intuitive, foolproof, ... is like creating a (in German) "eierlegende Wollmilchsau". Especially because different people have different intuitions. Better to describe an API well and create rules for how to use it, and that's it!

NEM Yes, of course. But you just said that you disliked [- 0 1] because it is unintuitive! This is my point. Either case is unintuitive at some point. So, you either do one or the other, and don't try and do both.

escargo 18 Mar 2004 - Note that several places above use the term reciprocal in a way that I believe to be incorrect: "A number related to another in such a way that when multiplied together their product is 1." That is, the result of [- n] is not the reciprocal of n. (It might be properly described as the additive inverse of n.)

Wouldn't we just want the equivalent of:

 proc + {args} {while {[llength $args]<2} {set args [linsert $args 0 0]};return [expr [join $args +]]}
 proc - {args} {while {[llength $args]<2} {set args [linsert $args 0 0]};return [expr [join $args -]]}
 proc * {args} {while {[llength $args]<2} {set args [linsert $args 0 1]};return [expr [join $args *]]}
 proc / {args} {while {[llength $args]<2} {set args [linsert $args 0 1]};return [expr [join $args /]]}

But I don't know enough about {*} to understand if it conflicts or not. (Bob Clark)