set v [foo bar [set bar 20] [baz $bar]]In that example we first create an array of Tcl_Objs. The first element will be for a result of set v ...
[0 RESULT] [1 set]We push the index of the set command, which in this case is 1.
LIFO stack[] = 1We continue down the line adding objects for strings until we reach another branch.
[0 RESULT] [1 set] [2 v] [3 RESULT]Now we are at a branch, where we have:
[foo bar [set bar 20] [baz $bar]]We append another object for the foo command and push the index of the object, which in this case is 4.
[0 RESULT] [1 set] [2 v] [3 RESULT] [4 foo] [5 bar] LIFO stack[] = 1 4Now we are at another branch for:
[set bar 20] [baz $bar]We append a RESULT object and push an index for the set object.
[0 RESULT] [1 set] [2 v] [3 RESULT] [4 foo] [5 bar] [6 RESULT] [7 set] [8 bar] [9 20] LIFO stack[] = 1 4 79 was the end of a command ( ] ), so we pop one item off of the LIFO into our order list:
order list = 7Now we just have the [baz $bar] to finish in a similar manner. We push the index of [baz $bar]'s baz onto the LIFO and we end up with:
LIFO stack[] = 1 4 11 [0 RESULT] [1 set] [2 v] [3 RESULT] [4 foo] [5 bar] [6 RESULT] [7 set] [8 bar] [9 20] [10 RESULT] [11 baz] [12 $bar] [13 END OF ALL]The end of the command [baz $bar] is reached so we pop another item off the stack into our order list.
order list = 7 11The end of the [foo ...] command is reached so we pop another and get:
order list = 7 11 4and so forth until the command is complete
our order is: 7 11 4 1Compare that order to these and you will see that it's the proper result:
[0 RESULT] [1 set] [2 v] [3 RESULT] [4 foo] [5 bar] [6 RESULT] [7 set] [8 bar] [9 20] [10 RESULT] [11 baz] [12 $bar] [13 END OF ALL]One aspect which we haven't discussed is result transfer. Each command copies its result to the object behind its beginning. This makes it so that each command result works properly.This approach allows us to write the Tcl_Obj structures to the disk more easily, and unlike a parse-tree it doesn't require many little mallocs, frees and evaluation at each branch.Based on my somewhat naive assesment of the Tcl sources this design would require a new parser, and perhaps some changes to the Tcl_Obj structure or a new Tcl_ObjType. I also propose a preprocessor that would tranlate $foo to [set foo].To improve performance each head of a command would also have an offset into an array that stores a function pointer. An alternative would be to use an epoch and do hash lookup, but to me that is overkill. The disadvantage of using an array offset is that to add new commands we would need to either increase the array's size and add more pointers, or replace existing offsets with new pointers and risk having some old commands call the new inappropriately.What do you think?
DKF: I don't yet see my way through this idea far enough. OK, it's Monday morning and I'm not at my sharpest. If the fundamental execution model is stack-based bytecode (and that's an easy target to generate from a compiler), could we have a bit of an illustration of a few opcodes and their effect on the stack? And then perhaps a sample conversion of a Tcl command fragment into those bytecodes? That would help my wooly-thinking very much. Thanks! (And if you've already done some of that above, please forgive me. I'm just having problems seeing where this is all going. Once I 'get' it, I'll probably be able to offer much more useful comments...)RS would like to know how this differs from the current bytecode compiler, and what advantages it would bring.GPS: The idea is that rather than using a BCC we store the Tcl_Objs in a format that is more easily executed. To execute a command all we have to do is make pointers to the Tcl_Objs in the array, do lookup on the first object for the function pointer (or cache it), and call the function. Basically there is no stack used at runtime (only at Tcl_Compile() time to fetch the order list). I'll work on some examples.To me the BCC is too complicated. MS and DGP seem to fix a BCC bug a week. It's kind of ridiculous. MS once called Tcl_ExecuteByteCode Medusa, because of the way it snakes around with goto and so on.MS: I'm not sure I understand correctly, but (modulo possible storage details) I imagine this being the current bcc engine without bcc'ed commands - ie, where INST_INVOKE is the main player. I have thought that that would be a desirable approach for a while. Note that the function pointers are currently cached in the command name Tcl_Objs.The example above would produce a bytecode stream in the flavor of:
[INST_PUSH "set"] [INST_PUSH "v"] [INST_PUSH "foo"] [INST_PUSH "bar"] [INST_PUSH "set"] [INST_PUSH "bar"] [INST_PUSH "20"] [INST_INVOKE 3] [INST_PUSH "baz"] [INST_LOAD_SCALAR "bar"] [INST_INVOKE 2] [INST_INVOKE 4] [INST_INVOKE 3]where [INST_INVOKE x] uses the top x stack items to build a command, and pushes the command's result on the stack. Seems to me the idea proposed here is at least somewhat related to storing the object (pointer) list, and separately the instructions:
1. Object list {"set" "v" "foo" "bar" "set" "bar" "20" "baz" "bar"} 2. Instruction list: [PUSH 7] [INVOKE 3] [PUSH 1] [LOAD 1] [INVOKE 2] [INVOKE 4] [INVOKE 3]which is to be compared to the new approach proposed in this page (as I understand it):
1. Object list {"set" "v" "foo" "bar" "set" "bar" "20" "baz" "bar"} 2. Invocation list (index of object command): {4 7 2 0}Note: Is this understanding correct, modulo the missing RESULT objects and the not-handling of $bar (to be replaced by a call to set as explained above)?Details I am missing in the proposed approach:
- How do the commands invoked from the order list know how many arguments they receive?
- How is the object list collapsed so that the invocation of command [4 foo] receives its four arguments in a packed C-array as required by the tcl API?