- i.e.: lindex for lists, string range for strings.
    interp alias {} get {} set
    set x Hello
    set x  -> Hello
    get x  -> Hello

We see that the semantics of get/set are not dissimilar. While in TCL we use the duality of set for both operations, consider for the moment that we use the get/set model. We also implement this by way of extending the internal tclObjType handler to have pointers for getter/setter functions.

Now take the following:

    command $var
    command $var(key)

These can be interpreted as:

    command [get var]      ;# is the same as [set var]
    command [get var key]

The meaning of:
    $var(key)

currently is: get the variable from the hash var that has the key key and return its tclObj datum.

The new meaning would be: call the tclObjType accessor function for the variable var, requesting the part key.

When a variable has a type of assocArray (a new obj type), the key would refer to an entry in an attached associative array whose value is a Tcl variable structure.

Note that the key has no inherent meaning. Each tclObjType will use key as appropriate. Some subsequent examples will clarify this.

Now we see why we cannot use the duality of set for both read and write. We have now extended the get proc to take a second argument, for which we will use the notation:

    part

We think of the value within the parentheses as a request for part of the whole. With an array, for example, we request one of the elements, not all of them.

By definition, when there are no parentheses we are requesting the whole of the datum.

The raison d'etre for this change in the interpretation of the notation $var(index) is to allow a common syntax regardless of internal representation, as well as extensibility.

Rather than limiting the notation to arrays, we are free to apply it to any tclObjType.

Note that we will also have to change the implementation of set to call the setter method of the tclObjType for the variable when given the variable notation var(key). In all likelihood we would change set to the notation set var part value.

Benefits
- Unified syntax.
- Extensibility.
- Less obfuscation of the meaning of the statements.
Problems
- The main problem in the implementation of this construct is that associative arrays are a kludge that was never reworked when Tcl variables were objectified.
- An associative array is an abstract quantity, so how do you call setter/getter functions when you don't know the type of data associated with the variable until you have accessed it?
- Performance. While a reference to a scalar will degenerate to a call to set, which will be byte coded out of existence, and an array reference will turn into a cached hash lookup, commands such as lindex and string range will not be able to be bytecoded when called as accessors through the $ notation (due to run time type determination).
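The two-argument get described above can be approximated in today's Tcl (8.5 or later, for the {*} syntax). The following is only a minimal sketch, not the proposed C-level mechanism: scripts cannot portably inspect a value's internal type, so the type is declared explicitly, and the names accessor::declare, accessor::getString, accessor::getList and accessor::getDict are hypothetical helpers invented for the illustration.

```tcl
# Hypothetical emulation of the proposed [get var ?part?] command.
# Assumption: the "type" of a variable is a tag declared by the script,
# standing in for the tclObjType the proposal would consult.
namespace eval accessor {
    variable typeOf                 ;# variable name -> declared type
    array set typeOf {}
    variable getter                 ;# declared type -> accessor proc
    array set getter {
        string ::accessor::getString
        list   ::accessor::getList
        dict   ::accessor::getDict
    }
}

proc accessor::declare {varName type} {
    variable typeOf
    set typeOf($varName) $type
}

# One index selects a character, two indices select a range.
proc accessor::getString {value part} {
    if {[llength $part] == 2} {
        return [string range $value {*}$part]
    }
    return [string index $value $part]
}

# The part is passed to lindex as one or more (nested) indices.
proc accessor::getList {value part} {
    return [lindex $value {*}$part]
}

# The part is a key path; a missing key yields the empty string.
proc accessor::getDict {value part} {
    if {[dict exists $value {*}$part]} {
        return [dict get $value {*}$part]
    }
    return {}
}

# With no part the whole datum is returned, otherwise the accessor
# registered for the variable's declared type decides what "part" means.
proc get {varName args} {
    upvar 1 $varName var
    if {[llength $args] == 0} {
        return $var
    }
    set type $::accessor::typeOf($varName)
    return [{*}$::accessor::getter($type) $var [lindex $args 0]]
}

set astring {Hello!}
accessor::declare astring string
puts [get astring 0]        ;# H
puts [get astring end]      ;# !
puts [get astring {1 3}]    ;# ell

set fred [dict create A 1 New [dict create A 100 B 10]]
accessor::declare fred dict
puts [get fred {New A}]     ;# 100
```

Keeping the type in an explicit tag sidesteps the run-time type determination mentioned in the performance point, at the cost of the script having to state what each variable holds.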
    set astring {Hello!}
    puts $astring(0)    -> H
    puts $astring(end)  -> !
    puts $astring(1 3)  -> ell

Try lists:
    set alist {A {B C D} E}             ;# MS pointed out this is not a list yet
    set alist [list A [list B C D] E]   ;# this is a list
    puts $alist(0)    -> A
    puts $alist(end)  -> E
    puts $alist(1 2)  -> C

MS notes that this is not what would happen; until some list op is done to $alist, it has a string tclObjType, so that [puts $alist(1 2)] -> " \{" and not "C". Had there been an intervening [llength $alist], the story would change.

pwq: amended above, thanks.

Try Dicts:
    set fred [dict create A 1 B 2 C 3 D 4]   ;# or whatever syntax is implemented
    $fred add New [dict create A 100 B 10]
    puts $fred(A)      -> 1
    puts $fred(end)    -> {}
    puts $fred(1)      -> {}
    puts $fred(New A)  -> 100
    puts "I think that [get $fred {New A}] is the same as $... above"

Try keyed lists as structures, or binary data represented as named quantities (as in ASN.1 notation, for example).

Pause and Think

You do not have to use it.

What the above does is give the programmer the ability to determine what meaning the $ notation has.

If you want, you can always use [set varname] to access a variable rather than the $ notation.

What's happening behind the variable

Take for example accessors for a list type. The notation $var(end) could be interpreted as:
    lindex $var end

while the notation $var(1 .. 3) could be interpreted as:

    lrange $var 1 3

This could be a call through the scripted Tcl interface, or it could be a direct call to the lower-level C API function, or it could be implemented inside the accessor function.

Which approach is best probably depends on the underlying data type. Maybe this model is best reserved for creating new abstract data types, and the core types, such as dicts, lists and arrays, do not implement any accessor functions.

Note: in the above, "could" means just that. The meaning of the notation is entirely determined by the getter function. Any defaults applied to core types such as list would be dictated by the TCT. It could even raise an error. The programmer can, however, override this default interpretation if desired. The above is an example of one such interpretation.

Var vs $var

Currently set takes a varname rather than a tclObj as the reference to the variable to be set. It is not yet determined whether the new Tcl commands set/get should or need to take a varname. The principal requirement would be that traces can be determined when a variable is accessed. Ideally, using the tclObj directly would be more orthogonal.

Exposing accessors to the script level

The most benefit from having accessors would be realized if the functionality were available as scripted procs.

This would allow programmers to change the meaning of the key inside the parentheses to create new constructs that match the application's processing of the data.

Coupling this with namespaces, which allow private getter/setter functions, gives a controlled and structured replacement strategy. That is, it does not need to affect the default accessor behaviour, much as namespaces allow core Tcl commands to be overridden safely.

But we don't need to do it

We also do not need a virtual file system in Tcl. I can do the same thing under Linux: I can mount an ftp server as a directory, and any program can access files as though they were local to the machine.

However, there are times when the VFS facility is of use, such as in starkits.

Likewise, the ability to implement accessor functions can be of benefit when the programmer requires them.

The key phrase would be:
- Is there any reason to limit functionality?
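The idea of namespace-private accessors that leave the default untouched can be illustrated with ordinary command resolution. This is a hedged sketch only: listPart, part and the report namespace are hypothetical names, and the dispatch through uplevel merely stands in for whatever mechanism a real implementation would use.

```tcl
# Default accessor for list-like values, defined in the global namespace.
proc listPart {value key} {
    return [lindex $value {*}$key]
}

# Hypothetical dispatcher. The accessor name is resolved in the caller's
# namespace first, so a namespace-local listPart shadows the global default.
proc part {value key} {
    return [uplevel 1 [list listPart $value $key]]
}

namespace eval report {
    # Private override: inside this namespace keys are 1-based columns.
    proc listPart {value key} {
        return [lindex $value [expr {$key - 1}]]
    }
    proc firstColumn {row} {
        return [part $row 1]
    }
}

puts [part {a b c} 1]                ;# b  (global default, 0-based)
puts [report::firstColumn {a b c}]   ;# a  (override, 1-based)
```

Code outside the report namespace keeps the default meaning of the key, which is the controlled replacement strategy described above.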
NEM: This is quite interesting, and partially related to some stuff I have been thinking about recently, with a view to eventually working towards a TIP. See Feather, and particularly read up on the interfaces stuff, as it is closely related. The getter/setter methods you propose adding to Tcl_ObjType would instead be in an interface. Paul came up with a generic container interface (or something like that), which was similar. The mechanism is more generalized, though. One thing which needs to be cleared up in the above is the difference between values and variables. [set] works with variables, and we have the following scheme currently:
    set var "a"       ;# Store the value "a" in the variable "var"
    set var           ;# Retrieve the value stored in the variable "var"
    set foo(a) "bar"  ;# Store the value "bar" in the variable with key "a" which is part of the array variable "foo"
    set foo(a)        ;# etc.

The point is that the $a(b) syntax (and the [set] equivalent) currently means finding a value which is stored in a variable which is stored inside an array variable. AIUI, your proposed change is to just have normal variables (and drop arrays), so that:

    puts $foo(a)

would instead retrieve the value held in the variable foo, and then locate a sub-part in that which corresponds to the key "a". The difference is subtle, but involves one less dereference than currently:

Current:
- find variable called "foo"
- get array "value" from that
- find variable called "a" in array
- get value from that
Proposed:
- find variable called "foo"
- get container value from that
- get part "a"
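The two lookup sequences listed above can be compared in today's Tcl. In this sketch a dict stands in for the proposed container value; that substitution is an assumption made for illustration, and box is a hypothetical variable name.

```tcl
# Current scheme: foo is an array variable and foo(a) is a variable inside it.
set foo(a) bar
puts [array exists foo]      ;# 1
puts [info exists foo(a)]    ;# 1: the element is itself a variable
puts $foo(a)                 ;# bar

# Stand-in for the proposed scheme (assumption: a dict plays the container).
# box is one ordinary variable; the part is looked up inside its value.
set box [dict create a bar]
puts [array exists box]      ;# 0
puts [info exists box(a)]    ;# 0: there is no variable named box(a)
puts [dict get $box a]       ;# bar
```

In the array case the element can be traced, unset or passed by name like any other variable; in the container case only the whole value lives in a variable.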
Lars H: The big problem with the above is that it assumes that Tcl values have types! The meaning of
    $V(2)

would be different depending on whether V was set as

    set V {4 3 2 1}              ;# Type string: [string index] accessor => 3
    set V [list 4 3 2 1]         ;# Type list: [lindex] accessor => 2
    set V [dict create 4 3 2 1]  ;# Type dict: [dict get] accessor => 1

whereas today these are all identical (at least for a suitable choice of hash function; there's a 50% chance that [dict create 4 3 2 1] returns "2 1 4 3" instead). That is a very fundamental change to the language, and it would break several programming idioms.

Today one can do
    set F [open "tempfile" w]
    fconfigure $F -encoding utf-8 -translation lf
    puts -nonewline $F $value
    close $F

later do

    set F [open "tempfile" r]
    fconfigure $F -encoding utf-8 -translation lf
    set value [read $F]
    close $F

and recover exactly the same $value, regardless of how complicated that value is! (It probably won't be stored in the same way, but any script can go ahead and use it as if it had never been written to file.) This is possible because Tcl accessors come with a choice of interpretation for the data they are applied to, but it would not be possible with the one-accessor-fits-all strategy proposed above.

I might add that having "types" (technically known as "categories": letter, other, active, begin-group, end-group, etc.) of data internally but losing these when data is written to a file has been a major headache in the history of LaTeX, and with respect to multi-lingual documents still is. Since Tcl already has a fully working solution here, there is no need to break it.

Another problem, which NEM mentioned briefly above, is that the proposer seems to be confused about the distinction between values and variables. set is a device for accessing variables. Tcl_Objs are values. Variables were never obj'ified in any other sense than that they store Tcl_Objs, and that was done for array elements just as for scalar variables. Integers, floats, and lists, on the other hand, were obj'ified. Commands can be obj'ified (expect Tcl_Objs as arguments), but need not be (there is a C function for declaring Tcl commands that take strings as arguments).

What probably could be done (but not in any Tcl 8.*, please) is to change the interpretation of variable names, so that one could have a name for a part of the value stored in the variable, and use that in cases where a variable name was expected. Suppose, for example, that a variable name has a proper list structure and the first element of that list is the name of some type of container, e.g. list. Then this would be interpreted as referring to a part of the value stored in the variable whose name is given in the second element of the name, and any remaining elements would be interpreted as specifying what part precisely. Examples, with translations:
    ${list L 2}               ;# [lindex $L 2]
    set {list L 2} xyz        ;# lset L 2 xyz
    set {list L 2 3}          ;# lindex $L 2 3
    set {dict D surname}      ;# dict get $D surname
    set {list {list L 2} 3}   ;# lindex [lindex $L 2] 3

It should of course be nestable,

    set {list {dict D addresses} 0} {10 Downing Street, ...}
    # dict set D addresses [lreplace [dict get $D addresses] 0 0 {10 Downing Street, ...}]

that is where it really shines! It can also be handy with commands such as scan that only store data in variables. (Try coding

    binary scan $headerBytes H8IA40A20c {dict header checksum}\
        {dict header designsize} {dict header codingscheme}\
        {dict header family} {dict header face}

without using auxiliary variables in current Tcl.)

If anything should be done with respect to the $arr(index) notation, it should rather be to create alternatives than to make it more powerful, because

    set Arr($index) $newval

has a problem with respect to shimmering of $index (it has to first be embedded into a string, and then parsed out of that string). One could easily extend the above scheme so that
    set [list array Arr $index] $newval

provides such an alternative, if desired.

PWQ 11 May 04: Thanks, Lars H, for those comments. I don't believe I am confused about vars and objects.

Lars H: You do propose "set" and "get" entries in the Tcl_Obj structure, despite "set" being something that modifies a variable.

pwq: see followup.

And objects do have types; it's the typePtr field of the Tcl_Obj structure!

Lars H: That is a private implementation detail, which it is a bug to expose at the script level. The language would be a lot more fragile with public types than it currently is.

The premise for all of the above stems from the fact that $ is supposed to be conceptually the same as set; however, that connection appears to have long since been lost.

In closing, Tcl programmers don't care how variables or objs are stored or accessed internally; they are only concerned with using Tcl commands to perform some processing. The extension of the $ notation is one way of getting that processing done with less typing. When Tcl introduced the $ notation, it did not have extended data types such as dicts and structs (or possibly even arrays), so it's time to bring the $ notation up to speed with the rest of the core.

pwq Followup 12 May 04:

Lars H asked the question, "do set/get operate on the TclObj structure or on variables?"

The answer to this is that set/get operate on variables. The subtle difference would be as follows:
Case 1

    set var [get x]

The above is the same as now:

    set var $x

Case 2

    set var [get x subpart]  ;# aka set var $x(subpart)

In the above case, set has to dereference x, then call the accessor function for the object type of x, and then assign the returned Tcl_Obj to var.

In C pseudocode, with error handling and definitions omitted, the command

    set var(0) $x

would be handled as:

    SetObjCmd(dummy, interp, objc, objv)
    {
        /* LookupVar now only needs to deal with scalars */
        varPtr = TclObjLookupVar(interp, part1Ptr, part2, flags,
                /*createPart1*/ 0, /*createPart2*/ 0, objv[1]);
        if (part2 == NULL) {
            /* whole datum: plain assignment */
            varPtr->objPtr = objv[2];
            Tcl_ReturnObj(varPtr->objPtr);
        } else {
            /* part of the datum: delegate to the type's setter/getter */
            varPtr->objPtr->typePtr->setProc(varPtr->objPtr, part2, objv[2]);
            Tcl_ReturnObj(varPtr->objPtr->typePtr->getProc(varPtr->objPtr, part2));
        }
    }

I am not sure what problem Lars H has with the above. All we have actually done is move the call to a hash lookup from within the LookupVar function to the ObjType structure (assuming that we now have an array type of object).

Another consideration: since we do have both variables and TclObj, and this creates issues when looking at changes like those mentioned above, I propose instead that variables be implemented as just another Tcl_ObjType, so that we can always access the variable info if needed without having to pass a varname to set/get. Instead we would pass a Tcl_Obj of variable type, and set/get would do the necessary double indirect dereference to return the objPtr.

Again, I do not see anything controversial or magical about my proposal.

Lars H, 12 May 2004: You appear familiar with the details of Tcl:the-C-library, but not so familiar with Tcl:the-language. Nothing magic about your proposal? It could probably be implemented, but there are likely to be some ugly corner cases. Nothing controversial about your proposal? Read what I wrote was the big problem with it. It does throw away "everything is a string"! You can't get more controversial than that!
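The "everything is a string" point can be checked directly in a current interpreter. The sketch below assumes Tcl 8.5 or later, where dicts are built in and keep insertion order; it shows that the three ways of building the value give the same string, and that each accessor gives the same answer whichever way the value was built.

```tcl
set a {4 3 2 1}
set b [list 4 3 2 1]
set c [dict create 4 3 2 1]

# All three values are the same string.
puts [expr {$a eq $b && $b eq $c}]   ;# 1

# The command, not the value, chooses the interpretation.
foreach v [list $a $b $c] {
    puts [list [string index $v 2] [lindex $v 2] [dict get $v 2]]   ;# 3 2 1
}
```

Under a type-driven $V(2), the result would instead depend on how V happened to be constructed, which is the fragility being objected to.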