Updated 2016-01-13 19:47:12 by pooryorick

In Tcl, a pure list is a value that has an internal list representation but for which no string representation has been generated. Pure lists are significant primarily because the evaluator and bytecode compiler treat them specially. If confronted with a pure list, these components can recognize immediately that

  • the object in question is a single Tcl command; the words making up the command are precisely the elements of the list.
  • the object requires no substitution.

They can therefore avoid using a parser on the object. They simply look up the command (element 0 of the list), and construct a parameter vector consisting of the list elements.
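As a minimal sketch of what this means in practice (the variable names payload, cmd and result are made up for illustration): a command built with list is dispatched word by word, so characters that would otherwise end the command or trigger substitution arrive in the argument untouched.

```tcl
# Hypothetical payload containing characters that are special to the parser.
set payload {a; b $c [d]}

# Build the command as a list: element 0 is the command name,
# the remaining elements are its arguments.
set cmd [list set result $payload]

# eval runs exactly one command, "set", with two arguments.
# Nothing inside $payload is substituted or split.
eval $cmd

puts [expr {$result eq $payload}]   ;# prints 1
```

The same construction applies to the other commands that take a script or command prefix, such as uplevel and after.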

Pure lists are preferred for any commands that directly evaluate some argument as a script, command or command prefix, e.g. trace, eval, uplevel, and after.

PYK 2015-12-16: I'm adding "script" back to the previous paragraph because commands like eval evaluate scripts, and even if a pure list is only capable of representing a script composed of a single command, it's still a script, and eval still benefits, at least in theory, from receiving a pure list as an argument.

list always returns a pure list.

Lars H: Actually, this isn't completely true. See [1].

AMG: Why is empty string not considered to be a zero-length pure list?

Lars H: It's not a pure list, by definition, because it has a string representation.

AMG, cont: It seems to me the places that check for pure list type could also check for null type.

Lars H: Checking for null type is pointless, since what one is interested in is the property that the string representation (if it would be generated) of the Tcl_Obj would be the canonical one. A Tcl_Obj of null type has no intrep, so you're worse off than for an impure list. What one could check is whether the length of the string representation is 0, but that would slow almost everything down (testing two conditions instead of one) for the benefit of one very special case, so it's most likely not worth it.

AMG, cont: Also, are there ways to get a zero-length pure list, or a zero-length non-null type string?

Lars H: Quoting myself from the referenced bug report: in case anyone wants a list equivalent that always returns a pure list, they can do
interp alias {} list {} lreplace x 0 0

jcw: Is it essential that there be no string rep - isn't a list rep sufficient for this either way?

DGP: Yes, it is essential that there be no string rep. Consider these two strings:
list a set a 1 set b 2

list a
set a 1
set b 2

Both of those strings are valid Tcl lists. In fact they are both the same list. However, they are not the same string, and when evaluated as scripts, they are not the same script.

This distinction is important because eval sees newlines as the ends of commands, while list sees them as just another whitespace character to separate list elements.
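The difference can be checked directly. The following sketch stores the two strings in variables (oneLine and threeLine are names chosen here, not part of the original discussion) and compares them first as lists and then as scripts.

```tcl
set oneLine   "list a set a 1 set b 2"
set threeLine "list a\nset a 1\nset b 2"

# As lists, the two strings have the same eight elements.
puts [llength $oneLine]
puts [llength $threeLine]
puts [expr {[lrange $oneLine 0 end] eq [lrange $threeLine 0 end]}]

# As scripts they differ: the first is a single call to list,
# the second is three commands, and eval returns the result of the last one.
puts [eval $oneLine]      ;# a set a 1 set b 2
puts [eval $threeLine]    ;# 2
```

Note that threeLine already has a list representation by the time it is evaluated (llength created it), yet eval still treats it as a three-command script because the string representation is present.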

Aha! So this means that one cannot construct a string, do an [llength] to force a list rep, and then expect the result to eval quickly. That's unfortunate. Is there a simple way to *force* a string to a pure list rep in Tcl?

DGP: lrange $string 0 end appears to work, but I don't know that I would count on it. I could easily see someone optimizing that "no-op" away.

AMG: list {*}$string can be used: list {*}{ "this" {is} \x61 l\u0069st } returns the pure list this is a list. Also, it's a little faster than lrange:
set string { "this" {is} \x61 l\u0069st }
proc 1 [list [list string $string]] {list {*}$string}
proc 2 [list [list string $string]] {lrange $string 0 end}
time 1 1000000             ;# 3.484332 microseconds per iteration
time 2 1000000             ;# 4.503587 microseconds per iteration
info patchlevel            ;# 8.6b1.2
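Whether a given value is currently pure can be inspected with ::tcl::unsupported::representation, available in Tcl 8.6 and later. This is only a sketch: the command is an unsupported debugging aid, its output format is not guaranteed, and the variable names below are invented for illustration. Run it as a script rather than typing it at an interactive prompt, because the interactive shell prints each result and thereby generates a string representation.

```tcl
# ::tcl::unsupported::representation is a debugging aid; treat its output
# as informational only.
set x a
set pure [list {*}[list $x b c]]   ;# built by list from non-constant input
set impure "a b c"
llength $impure                    ;# adds a list rep, keeps the string rep

# The first description is expected to end in "no string representation";
# the second is expected to show the string representation "a b c".
puts [::tcl::unsupported::representation $pure]
puts [::tcl::unsupported::representation $impure]
```

Reading the value as a string, for example with puts $pure, generates the string representation, after which the value is no longer pure.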

Another thought: would it be possible to further special-case eval so that it still uses the list rep if the string has no newlines? OTOH, perhaps it's not worth the extra scan.

DGP: Haven't thought about it, but one would need to check for semi-colons too, at least.

KBK: It's a good bit messier than that. Consider one simple example:
% set foo bar
bar
% set grill {list $foo}
list $foo
% lrange $grill 0 end
list {$foo}
% eval $grill
bar

You'd have to make sure that the string is free of substitutions, which would almost defeat the purpose.

An interesting example for list conversion was shown by DKF in the Tcl chatroom on 2002-12-10:
% set t a\ b\ c\\\ d\ e ;# legal string markup (blanks escaped)
a b c\ d e              ;# The backslash is still there
% lrange $t 0 end       ;# Is this a valid list?
a b {c d} e             ;# Yes, though it looks different
% set t                 ;# ..and the string rep still has the backslash
a b c\ d e
% set t [lrange $t 0 end] ;# Now reassign the pure list to t (= throw away string rep)
a b {c d} e
% set t                 ;# The new string rep is re-created from the list
a b {c d} e