Updated 2016-02-27 05:31:07 by pooryorick

Disambiguation  edit

MS-DOS
The Microsoft operating system.

Description  edit

Jacob Levy 2003-05-30: Recently there was some discussion on comp.lang.tcl about whether Tcl is vulnerable to DoS attacks and what to do about it, if anything.

First of all, the term DoS is shorthand for denial of service. A denial of service results when someone malicious succeeds in preventing legitimate users from accessing and using a shared resource. For example, a denial of service attack could prevent people from withdrawing money from an ATM, if the ATM back end processing was somehow prevented from processing legitimate transactions. Most denial of service attacks mounted via the internet involve sending massive amounts of bogus data to a server accessible via the net, thereby burying legitimate data in a pile of garbage. The result is that to find the legitimate data, the server has to examine and reject all the bogus data, which tremendously slows things down. Often, when you hear about DoS attacks, you also hear about a variant called DDoS, in which a perpetrator uses a lot of other computers on the net to mount a coordinated attack, making it less likely that the server will detect that the attack is even occurring.

Security attacks are often evaluated and analyzed in terms of their practicality and effectiveness. Practicality measures how much work it takes to mount the attack; if it's not a lot of work, the attack is deemed practical. Effectiveness is a measure of how successful the attack is at achieving its goal, which could range from obtaining information that should stay protected or preventing legitimate access to a shared resource.

In Algorimic Complexity Attack on Tcl [sic], Tcl core mailing list, 2003-05-29, Scott Crosby described a DoS attack that exploits deterministic hashing algorithms. Additional details are available in Denial of Service via Algorithmic Complexity Attacks, Scott and Dan Wallach, 2003.

Here's my take: I think that the paper is significant, and not "drivel", because they showed that with very little input bandwidth consumption, they were able to exploit a flaw in the deterministic algorithm used in an example application so that that application ceased to effectively function. That makes the attack practical, and this means that other, legitimate users, could not make use of this shared resource -- a DoS attack. One of the other example applications that was mentioned is Perl 5; there is no reason to believe that Tcl 8.4, for example, would be invulnerable to a practical attack along these lines.

I do think this is serious and worthy of consideration. For a Tcl-based networked server, there is a DoS risk here if you let a remote, untrusted application determine the hash keys you're going to use. That happens, as was pointed out, when you parse mime headers, when you parse HTTP headers, and (a new one from me) when you represent XML as nested arrays. In fact, even an assignment to an element of an array (or any local variable) uses hash routine in Tcl.

Fortunately, it's very easy to fix this DoS risk: instead of making Tcl's array resizing algorithm deterministic, let it choose, at Tcl_Init time, a resize factor in a bounded range. This effectively defeats the attack, because the remote attacker has no way to predict what inputs will clog a specific invocation of Tcl. Of course they could still get lucky and hit a sequence of inputs that causes a particular invocation to lock up tight. However, a sufficiently large range for the resize factor makes the attack impractical. To make the array resizing algorithm even more resistant to this attack, allow it to choose a resize factor each time an array is created, and store the resize factor in the array metadata.

Now, as to whether it's worthwile to fix this: Since Tcl accepts data as code, and allows you to execute strings received from untrusted sources, it is already exposed to DoS and hack and crash attacks, even without this specific vulnerability. Even if you run all untrusted code in a safe interpreter, you're still open to DoS attacks. For an example, try this simple code in a safe interpreter and wait for the Tcl prompt to return (hint: it won't :)):
set i [interp create -safe]
$i eval {
    proc donothing {} {}
    for {} true {} donothing
}

If some hacker sent this code to your networked server, and you accepted it for execution and followed all the rules about executing untrusted code only in safe interpreters, your application would still become catatonic to the world. This seems to me to be an even more reliable and low bandwidth method for DoS than what Scott Crosby and Dan Wallace propose.

Maybe after we fix this problem, it'll be worthwile to fix the hash table DoS attack also :)

AM 2003-06-03: In the chatroom this was discussed as well, and one idea that was put forward is to see how chroot() and setrlimit() could be used as a model to limit the possibilities of a potentially malicious or careless script.

It was noted that chroot() could be implemented via a script: redefine the open command in the slave interpreter.

(Actually, you would have to redefine cd and pwd as well and the details are a bit complicated, as you do not want a file name as "../../myfile.inp" to compromise the starting point. Or on Windows, "c:/c:/myfile.inp", and you only strip off one copy of "c:/" ... Oh well, just some thoughts.)

The infinite loop, as noted on the c.l.t. later on, could be broken by watching the command count. Here too, there are gory details to consider, but it can be done.

DKF notes (on 04-Jun-2003) that Tcl already provides the tools for ensuring that nothing bad happens when doing a chroot()-alike, through the file normalize command, as well as safe interpreters as normal.

Jacob Levy 2003-06-03 I'm waiting for Kevin Kenny to describe another idea he was proposing, a new [timelimit script ms] command.

DKF 2007-12-16: In 8.5, something like interp limit will do that.

Jacob Levy 2003-06-07: Recently there was a thread on comp.lang.tcl about a supposed bug in safe slaves where an "update" command made it appear that safe slaves somehow break out of an infinite loop. In the master, if control flow ever returns to it, the slave interpreter is destroyed.

The scenario reported as a "bug" is that the master evaluates a script in a safe slave. The slave's script is an infinite "while" loop, and on each iteration the script calls "update". The problematic behavior observed was that the slave was destroyed somehow even though control was never supposed to return to the master. The person reporting the problem said that the problem only appears in "wish", not in "tclsh". This difference in the behavior can probably be attributed to the fact that the event loop is inactive in tclsh whereas it is active in wish. The "update" command probably allowed a nested script to start in the slave, and when control flow returned from it to the master the slave was destroyed. This makes it appear that the outer script somehow returned.

Also, recently we had a discussion on how to implement resource limits for computations in general, in particular in the context of evaluating untrusted scripts in safe interpreters. One idea floated in that discussion was to place a clock time limit on the computation, so that if control does not return by the time limit, the computation is terminated. Also, nested time limits are unwound until the time limit that hit is reached.

Now consider what happens when you place a time limit on a script that calls "update". The call to "update" does not return until all pending events are processed. If you use the event loop to start untrusted scripts e.g. those arriving on a socket, this script will cause nested untrusted computations to be started in the slave, and then when the time limit hits for the outter script, the nested computations are unwound as part of servicing the time limit. This in essence has the effect of time limiting these nested computations with the time limit of the outter script, denying service to these nested computations. Note that the outter script is malicious but the nested ones may be legitimate.

I dont have a solution for this problem (yet). Is it possibly a security bug that "update" is present in a safe slave? Or perhaps it should be somehow limited in scope if invoked from a safe interpreter?

DKF 2003-06-09: This is a problem with time-limits for limiting execution, and there's not much you can do about it. However, the way you resolve these things is to mark the slave interpreter as expired, which stops further processing of commands within it. OK, it doesn't shut the interpreter down immediately (though in fact, deleting the interpreter shouldn't have any really bad effects - Tcl's careful like that) but just prevents further execution of commands within it. It's up to the other interpreter to exit from its processing before the [while 1 update] can respond, but that's reasonable enough.

DKF: An interesting reference for doing something about the hash table problem that was discussed earlier this year in c.l.t [1] is [2]. The code isn't suited to being used as-is, but seems to have at least some of the spectral properties we'd like. And it is fairly cheap too.

US 2003-09-04: A DoS attack can very well be handled in pure Tcl!

I've prepared a small example server, which is able to handle and limit infinite while and for loops. You may set a time limit (in milliseconds) and a maximum of iterations through the loop. Loops may even be nested. I'm sure the code still needs some polishing, but it works. It should be easy to extend this capability to foreach loops (I don't know in which respect they are dangerous).

Please try it out, try to initiate a DoS. If you accomplish a DoS, please tell me how, I'll try to improve the code. (No, redefinition of while or for in terms of other loop constructs (which?) is not the way to fool this server. The other loops are not yet protected, but in the final version they will be.)

All puts statements are just for illustration purposes, they aren't needed.

The interp create command is still dangerous. For now I just hide interp, a more flexible solution will follow soon.
proc lim_for {chan start cond step body} {
    global next_loop_id cmd slave maxnum maxtim
    puts "Slave: for \{$start\} \{$cond\} \{$step\} \{$body\}"
    set loop_id for#[incr next_loop_id]
    set cmd($loop_id,time) [expr {[clock clicks -milliseconds] + $maxtim}]
    set cmd($loop_id,runs) $maxnum
    set new_cond "\[check_limit $loop_id\] && ($cond)"
    interp invokehidden $slave($chan) for $start $new_cond $step $body
}

proc lim_while {chan cond body} {
    global next_loop_id cmd slave maxnum maxtim
    puts "Slave: while \{$cond\} \{$body\}"
    set loop_id while#[incr next_loop_id]
    set cmd($loop_id,time) [expr {[clock clicks -milliseconds] + $maxtim}]
    set cmd($loop_id,runs) $maxnum
    set new_cond "\[check_limit $loop_id\] && ($cond)"
    interp invokehidden $slave($chan) while $new_cond $body
}

proc lim_proc {chan name arglst body} {
    global slave
    if {[lsearch -exact {proc check_limit} $name] != -1} {
        error "redefinition of $name prohibited"
    }
    interp invokehidden $slave($chan) proc $name $arglst $body
}

proc check_limit {loop_id} {
    global cmd
    if {$cmd($loop_id,runs)} {
        incr cmd($loop_id,runs) -1
    } else {
        error "$loop_id exceeded maxnum"
    }
    if {[clock clicks -milliseconds] > $cmd($loop_id,time)} {
        error "$loop_id timed out"
    }
    return 1
}

proc connect {chan addr port} {
    global slave
    fconfigure $chan -buffering line
    fileevent $chan readable [list rcv $chan]
    set slave($chan) [interp create -safe]
    interp hide $slave($chan) interp
    interp hide $slave($chan) while
    interp alias $slave($chan) while {} lim_while $chan
    interp hide $slave($chan) for
    interp alias $slave($chan) for {} lim_for $chan
    interp hide $slave($chan) proc
    interp alias $slave($chan) proc {} lim_proc $chan
    interp alias $slave($chan) check_limit {} check_limit
    puts "Host $addr connected at port $port ($chan)"
}

proc rcv {chan} {
    global cmd slave
    if {[gets $chan line] == -1} {
        close $chan
        interp delete $slave($chan)
        unset cmd($chan)
        return
    }
    append cmd($chan) $line "\n"
    if {[info complete $cmd($chan)]} {
        if {[catch {$slave($chan) eval $cmd($chan)} res]} {
            puts "Illegal instruction ${res}."
            set res "Illegal instruction ignored."
        }
        puts $chan $res
        set cmd($chan) ""
    }
}

set maxnum 100
set maxtim 1000
set next_loop_id 0

set server [socket -server connect 4711]
vwait forever
close $server
exit

VI 2003-10-22: There's an (up-to-date?) port of Tcl and wish using a ck interface at [3]. I tried both the tclsh85 and the cwsh are both remarkable. I don't understand Russian and can't figure out who the author really is... Thanks though!

TV: Isn't it more or less always possible to do a dos on something on a network? You could generate so much, legitimate, traffic that most any service relying on it suffers. Then again, when the link to that network limits all possibilities to generate more than a reasonable amount of all possible traffic, serious Dos of the flooding kind would be preventable.

I run tclhttpd, and got a korean machine (at least that's what the IP address suggested) to call for bogus pages very quickly on a row a few days ago just after the machine restarted.

On a non-t1 like connection and altogether lame machine, Tcl should respond to any well formed request in probably less time than the network can ever handle, even including file transfer.

Different matter is cgi-access, which involves spawning a process, doing things with various os layers (including memory management) other then sockets, which I like to prevent altogether because it seems like a stupid thing to do on a tcl server anyway (before you know it, you want Tcl in your cgi..). I think the server I am running has I think only a single dos in that way, to generate a (unix) fortune:
   Theo Verelst Local Fortune Page

   An automatically generated fortune even I, the programmer of the server procedure for it, don't know about...

       Corruption is not the #1 priority of the Police Commissioner.  His job
       is to enforce the law and fight crime.
                            -- P.B.A. President E. J. Kiernan

   If you like it a lot, paste it in the message page, so I can read it, too.

This was a random one from http://82.168.209.239/fortune , which has a hardcoded link to the executable.