    package require md5 2

    # A very simple implementation of a Linda-like Tuplespace.
    # Tuples take the form {field1 ... fieldN}.
    #
    namespace eval tuplespace {
        # This array defines the tuplespace. Each column contains a list
        # of matching tuple ids (tids). The hash is a combination of
        # column number, tuple field. (e.g. tspace(0,hello) references any
        # tuple (via its id) that has 'hello' in its first column.)
        #
        variable tspace

        # Tuples are referenced in tspace by tuple ids (tids). Tids are
        # md5 hashes of the actual tuple and are used as an index into the
        # array that holds the tuples.
        #
        variable tid_arr

        # Put a tuple out into tuplespace.
        #
        proc out {tuple} {
            variable tspace
            set tid [new_tid $tuple]; # create a new tuple id.

            # For each field in the tuple, assign it to a column in tuplespace.
            #
            set fn 0
            foreach tpf $tuple {
                lappend tspace($fn,$tpf) $tid;# this column now references tid.
                incr fn
            }
            return $tid
        }

        # Read (non-destructively) one (default) or more tuples from the space
        # matched by 'tpat'. Tpat is a tuple specification that may have a
        # "don't care" placeholder (?) in place of any tuple field. At least
        # one matching tuple field must be supplied. If rd_multiple is 1, then
        # return all matching tuples.
        #
        proc rd {tpat {rd_multiple 0}} {
            set res {}
            foreach tid [_rd $tpat $rd_multiple] {
                if {!$rd_multiple} {
                    return [tid.tuple $tid]
                }
                lappend res [tid.tuple $tid]
            }
            return $res
        }

        # Take (by removing) a tuple from the space that matches 'tpat'.
        #
        proc in {tpat} {
            set res {}
            foreach tid [_rd $tpat] {
                set res [tid.tuple $tid]
                _remove $tid
            }
            return $res
        }

        # Return a directory listing of all tuples that match 'tpat'.
        #
        proc dir {tpat} {
            return [rd $tpat 1]
        }

        # Do all of the work of reading a tuple.
        #
        proc _rd {tpat {rd_multiple 0}} {
            variable tspace
            set match {}

            # Look at all of the supplied pattern fields and collect
            # a list of matching tuples (based on each field). If a field
            # fails to match (and the field is not ?), return an empty tuple.
            #
            set fn 0
            foreach tpf $tpat {
                if {[string equal $tpf "?"]} {
                    incr fn
                    continue
                }
                if {![info exists tspace($fn,$tpf)]} {
                    return {}
                }
                lappend match $tspace($fn,$tpf)
                incr fn
            }
            return [_intersection $match $tpat $rd_multiple]
        }

        # Find the intersection between all of the matched (candidate) tuples.
        #
        proc _intersection {match tpat rd_multiple} {
            # Calculate the number of fields actually supplied in tpat.
            # This is the number of intersections in the matches we must find
            # in order to return a tuple.
            #
            set tpat_len [llength [lsearch -not -all -exact $tpat ?]]

            # Find the intersect of these collected matches. The first
            # intersect count that equals $tpat_len is our matched
            # tuple (unless all matches are requested).
            #
            set result {}
            foreach tid [join $match] {
                set tuple [tid.tuple $tid]
                if {[llength $tuple] != [llength $tpat]} {
                    continue
                }
                if {![info exists intersect($tuple)]} {
                    set intersect($tuple) 1
                } else {
                    incr intersect($tuple)
                }
                if {$intersect($tuple) == $tpat_len} {
                    lappend result $tid
                    if {!$rd_multiple} {
                        break
                    }
                }
            }
            return $result
        }

        # Remove a tuple from the space based on the tuple id.
        #
        proc _remove {tid} {
            variable tspace
            set tuple [tid.tuple $tid]
            set fn 0
            foreach tpf $tuple {
                if {[info exists tspace($fn,$tpf)]} {
                    set tspace($fn,$tpf) [lsearch -all -not -inline \
                        $tspace($fn,$tpf) $tid]
                }
                incr fn
            }
            tid.delete $tid
        }

        # Create a new tid to hold a tuple.
        #
        proc new_tid {tuple} {
            variable tid_arr
            set id [tid $tuple]
            set tid_arr($id) $tuple
            return $id
        }

        # Calculate the tid from a given tuple.
        #
        proc tid {tuple} {
            return "\#Tuple[::md5::md5 -hex $tuple]"
        }

        # Return the tuple from the given tid.
        #
        proc tid.tuple {tid} {
            variable tid_arr
            return $tid_arr($tid)
        }

        # Destroy a tuple named by tid.
        #
        proc tid.delete {tid} {
            variable tid_arr
            unset tid_arr($tid)
        }

        # Flush a tuple from memory, but keep the tspace references around.
        # DANGEROUS!
        proc tid.flush_tuple {tid} {
            variable tid_arr
            set tid_arr($tid) {}
        }
    }

    # Examples:
    puts [tuplespace::out {hello there Todd Coram}]
    puts [tuplespace::out {hello there Maroc Ddot}]
    puts [tuplespace::out {linda space}]
    puts [tuplespace::dir {hello there ? ?}]
    puts [tuplespace::rd {hello there ? ?}]
    puts [tuplespace::in {hello there ? ?}]
    puts [tuplespace::dir {hello there ? ?}]
    puts [tuplespace::in {hello there ? ?}]
    puts [tuplespace::dir {hello there ? ?}]
jmn 2004-02-01 The above example seems to work just fine, but fails when I try:

    puts [tuplespace::out {hello there test etc}]

I don't know what's magic about that string, but the tid returned doesn't begin with #Tuple like all the others. Something fishy with the md5 implementation I'm using I guess.

Using tcllib1.5, md5 2.0.0, Tcl8.4.4 on Windows.

Further experimentation seems to suggest there's some sort of carriage return or backspace type character being returned by

    [md5 {hello there test etc}]

Seems the 1st 9 chars on the line are getting eaten!

    % puts "123456789[md5 {hello there test etc}]"
    ììÆ?¼ê[3#ü~²ºq
    % puts "123456789xxx[md5 {hello there test etc}]"
    ììÆ?¼ê[3#xxxü~²ºq

Beats me what this means or whether it's a problem in the md5 implementation itself, or just in how it's being used here.

Argh. The md5 package appears to have changed its API between versions. The prior version returned hex strings by default. Adding -hex to the tid generator proc's md5 call should fix the problem -- Todd Coram

Either that or package require version 1 of the md5 package. schlenk
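The garbled output above is what a raw binary digest looks like on a terminal: bytes such as carriage return or backspace move the cursor instead of printing. As a minimal sketch using only core Tcl (no md5 package), the following converts raw bytes to printable hex with `binary scan`. The 16 bytes in `digest` are made up for illustration and are not the MD5 of any particular tuple; `to_hex` is a hypothetical helper name.

```tcl
# Made-up 16-byte "digest" containing a carriage return (0d),
# a backspace (08) and a null byte (00). Not a real MD5 value.
set digest [binary format H* 0d08ecc63fbcea5b3323fc7eb2ba7100]

# Hypothetical helper: render arbitrary bytes as lowercase hex.
proc to_hex {bytes} {
    binary scan $bytes H* hex
    return $hex
}

puts [string length $digest]     ;# 16 raw bytes
puts "#Tuple[to_hex $digest]"    ;# #Tuple0d08ecc63fbcea5b3323fc7eb2ba7100
```

A hex id built this way is safe to print, to use as an array index, and to store as an ordinary string.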
Ruby, incidentally, has a nice, standard Tuple implementation as part of "Distributed Ruby" (drb).
09/11/02 -- Incorporated Tcl'ish improvements made by Michael Schlenker into the code above (replacing a few clumsy for loops with foreach loops).

Also, each tid is now uniquely identified by an md5 hash from the tcllib md5 package, and a few more tid procs were added. Why? Well, it makes it easier to add persistence (via metakit!) --- Todd Coram:
    # Meta-kit persistence for the tuplespace.
    #
    namespace eval tuplespace-db {
        proc open {dbpath} {
            mk::file open db $dbpath
            mk::view layout db.tspace "id tuple"
            _load
        }

        proc close {} {
            mk::file close db
        }

        proc _load {} {
            trace variable ::tuplespace::tid_arr ruw {}
            mk::loop row db.tspace {
                set tuple [mk::get $row tuple]
                set tid [tuplespace::out $tuple]
                tuplespace::tid.flush_tuple $tid
            }
            trace variable ::tuplespace::tid_arr ruw ::tuplespace-db::getset
        }

        proc getset {name tid op} {
            switch -- $op {
                w { _write $tid }
                r { _read $tid }
                u { _unset $tid }
            }
        }

        proc _unset {tid} {
            set r [mk::select db.tspace -count 1 \
                -exact id $tid]
            if {$r != {}} {
                mk::row delete db.tspace!$r
                mk::file commit db.tspace
            }
        }

        proc _read {tid} {
            if {[tuplespace::tid.tuple $tid] == {}} {
                # reload from database
                #
                set r [mk::select db.tspace -count 1 -exact id $tid]
                tuplespace::new_tid [mk::get db.tspace!$r tuple]
            }
        }

        proc _write {tid} {
            set tuple [tuplespace::tid.tuple $tid]

            # See if the tuple already exists in the database.
            #
            set r [mk::select db.tspace -count 1 -exact id $tid]
            if {$r != {}} {
                # Replace the tuple.
                #
                mk::set db.tspace!$r id $tid tuple $tuple
            } else {
                # Nope, add a new row.
                #
                mk::row append db.tspace id $tid tuple $tuple
            }
            mk::file commit db.tspace
        }
    }
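The persistence layer above hangs off variable traces on `tid_arr`: every read, write, or unset of an array element calls back with the array name, the element index (here the tid), and the operation. A minimal stand-alone sketch of that mechanism follows. It uses the newer `trace add variable` form, which reports the operations as `read`, `write`, and `unset` rather than the `r`, `w`, `u` letters of the older `trace variable` used above. The names `demo_arr`, `log`, and `watcher` are hypothetical.

```tcl
# Hypothetical demo array and log, kept in the global namespace.
array set ::demo_arr {}
set ::log {}

# Trace callback: receives array name, element index and operation.
proc watcher {name index op} {
    lappend ::log [list $op $index]
}

# A trace set on the array as a whole fires for every element.
trace add variable ::demo_arr {read write unset} watcher

set ::demo_arr(k1) hello     ;# fires a write trace
set v $::demo_arr(k1)        ;# fires a read trace
unset ::demo_arr(k1)         ;# fires an unset trace

puts $::log                  ;# {write k1} {read k1} {unset k1}
```

In the persistence code the same three events are what drive `_write`, `_read`, and `_unset`.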
15/Oct/03 schlenk: I noticed that the Metakit persistence for this tuplespace does not work reliably if md5 hashes are used. I base64 encoded things and everything works well:

    # Calculate the tid from a given tuple.
    #
    package require base64
    proc tid {tuple} {
        return "\#Tuple[::base64::encode [::md5::md5 $tuple]]"
    }

Is the problem really with Metakit itself, or with Mk4Tcl?

schlenk Bad phrasing on my side: It did not work with Metakit if only MD5 was used for this, but it worked when base64 encoded MD5 was used, probably due to embedded nulls or something like it...; have to take a look more closely tomorrow.

jcw - Yep, nulls are trouble. Change:

    mk::view layout db.tspace "id tuple"

to

    mk::view layout db.tspace "id:B tuple"

Are nulls trouble for Metakit, or just for Mk4Tcl?

jcw - Neither. Nulls terminate strings in C. That's why MK has type S (default in Mk4tcl) and type B properties. If you store a string with embedded null bytes in an S property, MK's C API will ignore everything past the first one, just like every other C function taking char*'s. Hmm... now you have me wondering: in a Tcl_Obj* string rep, embedded nulls are not possible, right? I wonder where the problem lies in this case.

Wrong. Tcl_Obj string reps are counted strings precisely so they can include embedded NULLs. Since 8.1 established (modified) UTF-8 as the preferred internal string encoding, embedded NULLs are no longer needed and are frowned upon, but for compatibility with extensions written for 8.0, Tcl_Obj's still accept and preserve embedded NULLs in their string reps.

jcw - Thanks, Don, for setting this straight. In that case, the conclusion is: if your strings can have null bytes, don't store them in an S property - use B instead. The frown carries over to Mk4tcl, MK, C, and C++. It would be nice for Mk4tcl to catch such cases and throw an exception - right now (MK 2.4.9.2), it doesn't - it truncates.

Jacob Levy JCW, are the costs of storing B and S items basically the same? If yes, why not make the default property type for mk4tcl be B?
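To make the embedded-null point concrete, here is a small sketch in core Tcl. It shows that a Tcl string keeps bytes after a null, and it simulates what a consumer that stops at the first null (as a C `char*` function would) ends up with. `c_string_view` is a hypothetical helper written for this illustration; it is not Metakit or Mk4tcl code.

```tcl
# A five-character Tcl string with a null in the middle.
set raw "ab\0cd"
puts [string length $raw]            ;# 5: Tcl counts past the null

# Hypothetical helper: what a null-terminated reader would see.
proc c_string_view {s} {
    set nul [string first "\0" $s]
    if {$nul < 0} {
        return $s                    ;# no null, nothing is lost
    }
    return [string range $s 0 [expr {$nul - 1}]]
}

puts [c_string_view $raw]            ;# ab
puts [c_string_view "plain text"]    ;# plain text
```

Two different binary ids that share a prefix up to their first null would collapse to the same truncated key, which is consistent with the unreliable lookups reported above.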
3nov2003 Todd Coram A first cut at a Tuplespace server:
    namespace eval tuplespace_server {
        array set cb {}

        proc register_cb {chan tpat} {
            variable cb
            lappend cb($chan) $tpat
            return "ok"
        }

        proc deregister_cb {chan} {
            variable cb
            unset cb($chan)
            return "ok"
        }

        proc read {chan cmd tpat} {
            set res [::tuplespace::$cmd $tpat]
            return "ok $res"
        }

        proc write {chan tuple} {
            variable cb
            set tid [::tuplespace::out $tuple]
            foreach cbchan [array names cb] {
                foreach tpat $cb($cbchan) {
                    if {[::tuplespace::rd $tpat] != {}} {
                        puts $cbchan "cb $tpat"
                    }
                }
            }
            return "ok $tid"
        }

        proc dispatch {chan cmd tuple} {
            switch -- $cmd {
                cb {
                    set res [register_cb $chan $tuple]
                }
                rd -
                dir -
                in {
                    set res [read $chan $cmd $tuple]
                }
                out {
                    set res [write $chan $tuple]
                }
                default {
                    set res "error invalid command!"
                }
            }
            puts $chan $res
        }
    }

    proc register_client {chan addr port} {
        fconfigure $chan -blocking 0 -buffering line
        fileevent $chan readable [list handle_input $chan]
    }

    proc handle_input {chan} {
        if {![eof $chan]} {
            if {[gets $chan data] == -1} {
                return; # only handle complete lines
            }
        } else {
            catch {close $chan}
            ::tuplespace_server::deregister_cb $chan
            return
        }
        set l [regexp -inline -- {(\w+)\s+(.*)} $data]
        if {[llength $l] != 3} {
            puts $chan "error usage: command {tuple}"
        } else {
            foreach {dummy cmd tuple} $l break
            puts stderr "cmd=($cmd), tuple=($tuple)"
            ::tuplespace_server::dispatch $chan $cmd $tuple
        }
    }

    socket -server [list register_client] 6667
    vwait ::forever

Connect to the server via telnet and try the following commands:
    out tom baker 56
    out bob baker 54
    dir ? baker ?
    cb ? baker ?
    out ginger baker
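Each line sent to the server is split into a command word and a tuple by the regular expression in `handle_input`. The sketch below isolates just that step so it can be tried without a socket. `parse_line` is a hypothetical helper name; the regular expression is the one from the server code.

```tcl
# Hypothetical helper: split a protocol line into {command tuple}.
# Returns an empty list when the line has no tuple part.
proc parse_line {data} {
    set l [regexp -inline -- {(\w+)\s+(.*)} $data]
    if {[llength $l] != 3} {
        return {}
    }
    # Element 0 is the whole match; keep only command and tuple.
    return [lrange $l 1 end]
}

puts [parse_line "out tom baker 56"]   ;# out {tom baker 56}
puts [parse_line "dir ? baker ?"]      ;# dir {? baker ?}
puts [llength [parse_line "out"]]      ;# 0, rejected as a usage error
```

The tuple part is passed on as a plain string and later treated as a Tcl list, so fields are separated by whitespace.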
Todd Coram Ugh. The above tuplespace server commits the sin of not practicing what it preaches. Here are revised procs that use the tuplespace itself to store callback information (not a Tcl array!):
    proc register_cb {chan tpat} {
        ::tuplespace::out [list cb_clients $chan $tpat]
        return "ok"
    }

    proc deregister_cb {chan {tpat ?}} {
        foreach cb [::tuplespace::dir [list cb_clients $chan $tpat]] {
            ::tuplespace::in [list cb_clients $chan [lindex $cb 2]]
        }
        return "ok"
    }

    proc tuple_match {tpat tid} {
        foreach tuple [::tuplespace::dir $tpat] {
            if {[::tuplespace::tid $tuple] == $tid} {
                return 1
            }
        }
        return 0
    }

    proc write {chan tuple} {
        set tid [::tuplespace::out $tuple]
        foreach client [::tuplespace::dir [list cb_clients ? ?]] {
            foreach {cb_clients chan tpat} $client break
            if {[tuple_match $tpat $tid]} {
                if {[catch {puts $chan "cb $tpat"}] != 0} {
                    deregister_cb $chan
                }
            }
        }
        return "ok $tid"
    }
1/31/2004 Yikes! A long-standing bug in the tuple callback code above has been fixed. The addition of tuple_match means that you no longer get a callback every time anything is written to the space (which used to happen whenever a previous match hadn't yet been read). -- Todd Coram (Yes, this code is getting too unwieldy for the wiki)
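The matching rule that `tuple_match` relies on (same number of fields, `?` matches anything, at least one concrete field) can also be written as a direct predicate on a single tuple. This is only a sketch of the rule as described in the comments of the tuplespace code, not the author's implementation, which goes through the column index; `tuple_matches` is a hypothetical name.

```tcl
# Hypothetical predicate: does 'tuple' satisfy pattern 'tpat'?
proc tuple_matches {tpat tuple} {
    # Patterns only match tuples with the same number of fields.
    if {[llength $tpat] != [llength $tuple]} {
        return 0
    }
    set concrete 0
    foreach p $tpat f $tuple {
        if {$p eq "?"} {
            continue             ;# "don't care" field
        }
        incr concrete
        if {$p ne $f} {
            return 0             ;# a supplied field differs
        }
    }
    # A pattern made only of ? placeholders matches nothing.
    return [expr {$concrete > 0}]
}

puts [tuple_matches {? baker ?} {tom baker 56}]   ;# 1
puts [tuple_matches {? baker ?} {ginger baker}]   ;# 0
```

With such a predicate, a callback check on write needs to look only at the tuple just written rather than at everything currently in the space.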
schlenk 01/Feb/2004. I have a variant of the above code (without the notification part at the moment) running under Tclhttpd exposed as a SOAP webservice. Ask me if you are interested, it's not really polished yet.
AK: While storing the callback information in the tuplespace itself is interesting, it also opens the possibility of an application screwing with the server by modifying and/or removing the relevant tuples. That is IMHO a bad thing from a security point of view. I.e., keeping the server management information (callbacks) and the application data (tuples) separate is IMHO the better approach. And nothing prevents us from storing the data in a second tuplespace, if we wish to :). Of course, that requires rephrasing the code above as a class, so that we can have multiple independent t-spaces. ... I would also use comm for the communication part. Its hooks allow the restricted protocol we are running here as well. Possibly even the cross-linkage of many tuple-servers into one space.

Todd Coram: Or... reserve any tuple beginning with cb_clients for system use only by disallowing any modifications via the write proc. A feature of my Street Programming Tuplespace was to reserve any tuple with a first element beginning with a '#' as a system tuple that couldn't be modified without a privileged connection. The benefit of storing tuplespace meta information as tuples themselves is that you can delegate system facilities to privileged clients outside of the tuplespace server. Maybe you would want a callback manager that could track who was getting what and display the frequency of queries in a graph. That would bloat the tuplespace server, but would make a nice external client.

Of course, you now need some sort of password facility to make this useful.

AK: Right. Access control, authentication, secrecy, the works. Because these system tuples should not be seen by a regular user either. Otherwise system data leaks out, which can help in attacks in other ways. Who was getting what ... That type of manager I would place on top of a tuplespace actually. Otherwise the tuplespace has a hardwired assumption that requests come from a network and that there is an id to be had identifying the requester. Note that this type of tracking can be of general interest beyond statistics. Consider linking several tuplespaces, where each space can query the others if it cannot answer a query on its own. That goes into the realm of P2P systems, or HA through replication and redundancy. Quite a lot of fun can be had.

CMcC - I have a few questions:
- Take only grabs one matching tuple, right? (Yes - Todd Coram)
- Do tuple IDs have to be unique for all time as well as at any given time for all tuples? (The ID value is directly related to the contents of the tuple. {Hello World} will always have the same unique ID, regardless when it was put into the space. Or, at least as unique as MD5 hashing allows - Todd Coram ;-)
- Are duplicates allowed (duplicates for all but tuple ID, I guess)? (No. An exact duplicate tuple will produce the same ID and squash the old one. - Todd Coram) escargo 3 Apr 2005 - Wouldn't an exact duplicate not need to be written, since it's already in the database?
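On the duplicate question: reading the `out` proc in the tuplespace code, it appears to append the tid to every column list each time it is called, so writing the same tuple twice would leave the tid listed twice in the index even though only one tuple is stored. A minimal sketch of the guard escargo suggests follows. It is a stand-alone toy, not a patch to the code above: the `dedupe_demo` namespace, its `index` and `tuples` arrays, and `id_of` are all hypothetical, and the tuple text stands in for the md5-based tid so that no package is needed.

```tcl
namespace eval dedupe_demo {
    variable index       ;# "column,field" -> list of ids
    variable tuples      ;# id -> tuple
    array set index {}
    array set tuples {}

    # Stand-in for the md5-based tid: the tuple text itself.
    proc id_of {tuple} {
        return "#Tuple$tuple"
    }

    # Like 'out', but an exact duplicate is not written again.
    proc out {tuple} {
        variable index
        variable tuples
        set id [id_of $tuple]
        if {[info exists tuples($id)]} {
            return $id           ;# already in the space
        }
        set tuples($id) $tuple
        set fn 0
        foreach f $tuple {
            lappend index($fn,$f) $id
            incr fn
        }
        return $id
    }
}

dedupe_demo::out {hello world}
dedupe_demo::out {hello world}
puts [llength $::dedupe_demo::index(0,hello)]   ;# 1
```

Keeping each tid at most once per column list matters for matching, since the intersection step counts how often a tuple turns up across the candidate lists.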
One of the reasons Erlang is so cool is that it builds in tuplespace concepts, although, to my astonishment, I have yet to come across a presentation of that line of descent. Erlang is all about sending messages between processes [explain status of "cloud" and comparison to Erlang semantics]; receipt is pattern-matched.
NEM 21 Aug 2006: A tuplespace also seems to be quite related to Blackboard Systems.
MJ - YATS (Yet Another Tupleserver) moved to Tupleserver