2005-02-13 12:00 GMT - I've re-enabled the wiki with more patient locking; it looks like some edits now take longer than the previous lock acquire/break rules were prepared to wait for. There may be a race condition in the lockfile logic. If things break again, I'll revert to r/o mode. Please let me know by email -jcw
Eureka! There is indeed a race condition when wikit cannot acquire the lock. In wikit/lock.tcl, it does the following:
- check whether the pid stored in the lockfile belongs to a running process
- if not, remove the lock file as stale
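The liveness test at the heart of this can be sketched in plain shell (a simplified sketch, not the actual wikit code; the lockfile path is made up for illustration):

```shell
#!/bin/sh
# Sketch of the stale-lock test: a lock is stale when the pid recorded
# in it no longer corresponds to a running process.
lockfile=/tmp/wikit-demo.lock

sleep 0 &                 # start and immediately reap a child, so we have
opid=$!                   # a pid that is guaranteed not to be running
wait "$opid"
echo "$opid" > "$lockfile"

pid=$(cat "$lockfile")
# kill -0 delivers no signal; it only tests whether the process exists
# (the same idea as checking for /proc/$pid on Linux, as the Tcl code does).
if [ -n "$pid" ] && ! kill -0 "$pid" 2>/dev/null; then
  echo "stale lock: pid $pid is gone, removal can be considered"
else
  echo "lock is live"
fi
rm -f "$lockfile"
```

The race the entry describes sits between "removal can be considered" and the actual delete: two processes can both conclude the lock is stale and both try to reclaim it, which is why the code below waits and re-checks the mtime first.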
proc AcquireLock {lockFile {maxAge 3600}} {
  for {set i 0} {$i < 300} {incr i} {
    catch {
      set t [file mtime $lockFile]
      set fd [open $lockFile]
      set opid [gets $fd]
      close $fd
      if {[clock seconds] > $t + $maxAge ||
          $opid != "" && [file isdir /proc] && ![file isdir /proc/$opid]} {
        # If the lock looks stale, wait a bit to see if it is about to go away
        # and be reclaimed by another process - if so, avoid a file delete race.
        # This caused a damaged db twice in mid-Feb 2005; the new logic should
        # make most wikit instances back off if they see a lock.
        after 2500
        if {[file mtime $lockFile] == $t} {
          # Here, the lock needs to be deleted *and* no other process has
          # done so for 2.5s - so we *assume* it can now be deleted without a race.
          file delete $lockFile
          set fd [open savelog.txt a]
          set now [clock format [clock seconds]]
          puts $fd "# $now drop lock $opid -> [pid]"
          close $fd
        }
      }
    }
    catch {close $fd}
    if {![catch {open $lockFile {CREAT EXCL WRONLY}} fd]} {
      puts $fd [pid]
      close $fd
      return 1
    }
    after 1100
  }
  return 0
}

More cleanup - it turns out that a large number of requests from search-engine spiders are bogus. I've added a shell filter in front of the CGI calls which rejects all URLs that do not match the following patterns:
shopt -s extglob
case "$REQUEST_URI" in
  ?(/tcl)/+([0-9])?(.txt|.html)) ;;
  ?(/tcl)/[^0-9/@!]+([^/@!])) ;;
  ?(/tcl)/edit/+([0-9])\@) ;;
  ?(/tcl)/references/+([0-9])\!) ;;
  ?(/tcl)/2\?+([^/@!])) ;;
  /cgi-bin/nph-wikit/+([0-9])) ;;
  *) cat <<'EOF'
HTTP/1.0 400 Bad request
Content-type: text/plain

This is not a valid URL for this site
EOF
    exit;;
esac
echo HTTP/1.0 200 OK

It leads to a 5-fold reduction in requests triggering CGI. Please let me know if I accidentally locked out any valid requests -jcw

14feb05 jcw - Made various tweaks and fixes to /edit/ and /references/ links. The Recent Changes
page now omits the link if the page has essentially no content. As far as I can tell, the changes to wikit over the past day or so bring everything back in working order, make URLs stricter, and prevent incorrect requests from triggering a CGI call to wikit (about 80% fewer requests now). If any further problems show up, please email me [1] -jcw
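For what it's worth, the extglob patterns in the spider filter above can be exercised standalone. Here is a small harness (my own sketch for testing; the valid_url helper is not part of the production filter, which emits an HTTP 400 instead of returning a status):

```shell
#!/bin/bash
# Standalone harness for the spider-filter patterns.
# extglob must be enabled before bash parses the case patterns.
shopt -s extglob

valid_url() {
  case "$1" in
    ?(/tcl)/+([0-9])?(.txt|.html)) ;;        # numeric page, optional extension
    ?(/tcl)/[^0-9/@!]+([^/@!])) ;;           # named page, no slashes or @/!
    ?(/tcl)/edit/+([0-9])\@) ;;              # edit link
    ?(/tcl)/references/+([0-9])\!) ;;        # references link
    ?(/tcl)/2\?+([^/@!])) ;;                 # search on page 2
    /cgi-bin/nph-wikit/+([0-9])) ;;          # legacy CGI path
    *) return 1;;
  esac
  return 0
}

for url in /tcl/1234 /1234.html /tcl/edit/1234@ /tcl/references/12! \
           /etc/passwd /tcl/../secret; do
  if valid_url "$url"; then echo "accept $url"; else echo "reject $url"; fi
done
```

The two "reject" cases illustrate why the named-page pattern excludes `/`: any URL with a second path component (or a `..` traversal) falls through to the 400 branch.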
