KJN - Remove arrays from Tcl, and use the name(index) syntax to mean dictionaryVariable(key). If the value of index is permitted to be a Tcl list, then it can be interpreted as dictionaryVariable(key ?key ...?).RZ Make it possible to add traces to name(index) so it can be used as textvariable in Tk.DKF: Traces make that tricky, since they're attached to variables (which arrays are collections of) and not values (which dictionaries are collections of).KJN In other words, these modified dictionaries should have the property "if a variable varName holds a dictionary value, each dictionary key varName(foo ...) should have the behavior of a variable" (including traces, upvar).KJN - I expect this will not be feasible for various reasons, but I'll suggest it anyway. I know that many dict operations work on values $dictionaryVariable, but I am assuming that Tcl uses dual format to optimise these operations, e.g. that the parsing of
dict get $dictionaryVariable foodoes not convert $dictionaryVariable to a string, so it will be possible in principle to track the access to particular keys of dictionaryVariable.The new trace operations needed would be
- trace add dict name ops commandPrefix
- trace remove dict name ops commandPrefix
- trace info dict name ops commandPrefix
- read
- write
- unset
- vread
- vwrite
- vunset
set foo $dictionaryVariable set dictionaryVariable {foo bar} unset dictionaryVariable dict set dictionaryVariable key1 value ;# where {dictionaryVariable key1 key2} is being tracedFor Migration from Tcl8 arraysImplement the command array (except array exists) as a (deprecated) ensemble of pure-Tcl commands to ease the migration path from 8 to 9.There are clearly incompatibilities that prevent a dictionary variable from being a drop-in replacement for an array: hence this suggestion is for Tcl 9.
- A dictionary variable is an ordinary variable, and can be overwritten with either a dictionary or non-dictionary value; whereas an array cannot be overwritten with a scalar value, or a scalar variable with an "array value", and any attempt to do so will throw an error. The subcommand "array exists" has no meaningful counterpart for a dictionary variable.
- A dictionary key path of varName(foo bar) would imply the existence of key path varName(foo). This is not the case with the array index arrayName(foo bar). Also, the parser currently rejects arrayName(foo bar) if the index is specified as a literal, rather than (for example) by arrayName($index). (DKF: That's just a syntax problem, just as the parser needs extra syntax to handle words with spaces in.)
KJN Similar suggestions were made on If I were to complain, and DKF pointed out that as well as breaking traces, upvar would become much hairier.On the other hand, this could be a useful feature, especially when handling deeply nested dicts. Using parentheses as above, an upvar extended to look into dicts would allow a useful shorthand:
upvar 0 varName(foo bar branch twig) varName2
KJN It seems that if something has been suggested once, it has been suggested a dozen times. See also Dictionaries as Arrays, which points out more problems. Scroll down that page to the discussion of the values of these three variables:
set A "0 1 2 3" set B [list 0 1 2 3] set C [dict create 0 1 2 3]and whether it is meaningful to ask for the values $A(0) $B(0); and what should happen if the number of elements in the list is odd. I'm not sure that this is a show-stopper: we don't mind that
dict get $A 0 dict get $B 0each return a value, or that
set D "call me Bond" dict get $D callthrows an error because $D has an odd number of elements. We can think of
$foo(key ?key ...?)as syntactic sugar for
[dict get $foo key ?key ...?]
KJN - The discussion below was last modified in 2001, and many of the suggestions have now been implemented with the dict command in Tcl 8.5.Better Arrays. Arrays should be objects that can be passed around and returned.DKF - Some of this might in fact find its way into the core before 9.0; c.f. dictionary values.MSJ - Is the idea here to use the syntax of present day arrays, but just to store the value like in Jean-Claude Wippler's Dictionary? [Do you perhaps mean Frederic Bonnet's dictionary? ] In this case I think it would be a huge improvement, as not only would one get more functionality, but the TCL "everything is a string" philosophy would now also apply to arrays (I've never really understood why arrays had the limitations in the first place). The way I understand it, the following would apply:
set a(fruit) apple ==> apple set a(vegetable) tomato ==> tomato set a ==> fruit apple vegetable tomato set basket(contents) $a ==> fruit apple vegetable tomato set basket ==> contents {fruit apple vegetable tomato}I would like to add a few requests so that slightly more complex data structires can be handled. I would like to be able to do:
set basket(contents)(fruit) pear ==> pear set basket ==> contents {fruit pear vegetable tomato}It would also be nice if lists could be easier to use than at present, e.g.:
set xx [list apples bananas cherries] ==> apples bananas cherries set xx((1)) [list BANANAS BROCOLI] ==> BANANAS BROCOLI set xx ==> apples {BANANAS BROCOLI} cherries puts $xx((2)) ==> cherries puts $xx((1))((0)) ==> BANANASDKF - Maybe. The problem with doing something like this is it obscures a number of other useful functionalities (arbitrary array names come to mind) and there is code about that tests for presence, absence and/or type of variables by using [catch] which would cease to work under such a change. That isn't to say that this is a bad idea, but the side effects are quite long-reaching and may not all be obvious. I have also corrected a few minor errors in your examples above; Tcls association lists - those accepted by [array set] - are not the same as TclXs keyed lists; they have fewer layers of quoting...MSJ - I suppose what I am saying is "in the process of reforming arrays, please don't forget arrays & lists embedded in arrays". I do realise that the double parenthesis proposal would require changing the satus of parenthesis to a reserved symbol that would need to be escaped in certain sequences much like the $ symbol. This would of course break some existing code, but would be beautiful - everything would be a string, but with the symbols $,( and ) one would be able to reduce the use of functions set, lindex, lreplace, array etc, for basic tasks such as saving/retrieving a value in an array/list. One could of course also use a less common symbol (e.g. @ or $$) instead of parenthesis, but this would not look as good. BTW, was your answer to my first question a yes with the changes that you made?DKF - Yes. Are there any overlaps with BLTs vectors?MSJ - Don't know. I just had another brainstorm that I think will fix the present difficulty with embedded data structures without backwards compatibility problems - One function, let's call it struct for the purposes of demonstration:
The first argument is the variable name Hereafter, the arguments come in pairs First argument in the pair is a charcter and determines how the arguments just before and just after it are interpreted: A . has the same meaning as in C (the argument before is a Tcl-array (C-struct)) An @ means that the argument before is a list and the argument after is the index An = means the next argument should be assigned to the previous % foreach idx {a b c d} {struct x . $idx = $idx$idx} % set x ==> a aa b bb c cc d dd % foreach idx {1 2 3 4 5} {struct x . vec @ $idx = 4$idx} % set x ==> a aa b bb c cc d dd vec {{} 41 42 43 44 45} % struct x . vec @ end ==> 45One can of course work on allowing more complicated indices after @ characters e.g. "end+1", "2*$i+5", "1..end". Instead of ".", "@" or "=", one might also write "delete" to delete an item.RS -- Can't the deficiency of
set foo(x) 1 1 set foo can't read "foo": variable is array set foo {bar grill} can't set "foo": variable is arraybe overcome even before 9.0 by just calling array set/get in the positions where these two errors are thrown? Performance may get lost if huge arrays are recast to lists and back, but in the object representation one bit might indicate array state, so converting $foo to a list would be needed for accesses like lindex $foo 1 .JCW I've been pondering on whether arrays can be virtualized (for doing things like pulling data off the disk on demand, or for presenting an array layer on things like /etc/passwd, *dbm, or eh... Metakit). Right now, one can do a lot with traces to fake it, but not everything. The "array names" command won't work unless elements actually exist. Read traces can fill an array lazily but not keep the size such a cache within bounds. Write traces are fine, but there is no "raw write" to fill in data without triggering such traces. Plus, somewhat related, I see a need for "pre-traces" ... executing code before a value changes or goes away. With a virtual interface, one might even be able to move the implementation of arrays into a separate extension.JCW Another idea is to reconsider the dichotomy between arrays and lists. Both store collections, but one of them is indexable via ints, and the other via a more general key value. Lua has generalized this into the wrong direction IMNSHO, and stores lists in arrays ("tables") - albeit very efficiently due to some clever ideas. The opposite direction would seem much more general: lists store aggregate data, with arrays being a special case. Special in two ways: the odd-even format is an obvious (and natural, in Tcl) way to create an equivalence. The other trick would be to use a secondary/hidden list for hashing / buckets / overflow. This does not necessarily slow down things (quite the opposite in fact, I expect). What it does mean is that fast/hashed access is a property one can slap onto any pairwise list. FWIW, I have started doing this in Metakit with good results so far. It has an extra benefit that the hashing information becomes explicit (a bunch of ints, which can be stored like everything else). Hm... perhaps I'm not making too much sense with all this. It's a bit cryptic, I'll think about a way to describe this more clearly...RS Let me see if I understand you right. We start from even-elemented lists (2 is even too):
set x {} lappend x John 123 ;# set y(John) 123 lappend x Mary 456 ;# set y(Mary) 456 @ x John 234 ;# overwrite: set y(John) 123 @ x Mary => 456 ;# set y(Mary), or: $y(Mary) @ x => {John Mary} ;# array names y expr {[llength $x]/2} ;# array size yWhat I don't see yet is how to do hashing with a (hidden) list - given a key to search, lsearch with check on odd/even index would be required in both cases. The advantage of lsearch however is that you can search by key and value, which I've used in some translation tables (e.g. Morse en/decoder). But to replace hashtables with lists in equal efficiency seems difficult to me, and on the C side they're a fact of Tcl life anyway..JCW Nice interface! After I wrote that, it occurred to me that I could actually code this up and have a go it it (trace-less, but mutable and storable). Ok let me try to describe things further...
- let's call this thing a map for now
- the internal rep of a map is two lists
- one list is the pairwise key/value contents
- the other is pairwise too, but hashcode/index
- the size of the second list is larger, it has spare slots
- lookup hashes the key, looks it up in 2nd, then finds index
- when the 2nd list fills, it gets re-sized (in prime-size chunks)
- when its use drops below 2/3's, it is resized down again
- the implementation I use is somewhat derived from Python's