davfs2 shenanigans
[Hardcore nerd post. Non-geeks, please ignore.]
WebDAV is a nice, lightweight, widely implemented network filesystem protocol. davfs2 is the client implementation for Linux, which lets you mount a remote WebDAV server.
Heaven forbid that you should confuse it, though. If you confuse davfs2 (as I did, by trying to unmount the same share twice), you might get this delightfully cryptic and mildly ungrammatical error message:
/sbin/mount.davfs: Process 5852 already uses device cfs0. Maybe it's only a stale pid-file left from unclean shutdown. Please clean up manually.
Notice the lack of any hint as to where that pidfile is. The phrase “Maybe it’s only a stale pid-file left from unclean shutdown.” doesn’t appear in a Google or a Yahoo search; I hope putting it here will fix that.
I thought I’d be smart about this. “I know!” I said. “I’ll use
strace!” (strace prints out all of the
system calls that a program makes, so you can get an idea of what its
inputs and outputs are.)
strace -o foo mount /path/to/mountpoint
Of course, strace revealed not a word about pidfiles, device cfs0, or the long-dead process 5852. Neither did ltrace.
Eventually I remembered that mount forks and then execs
mount.davfs. By default strace traces only the process it
started, so it wasn’t picking up anything that happened after the
fork, which is when all of the interesting stuff happened.
Thankfully, strace has a switch that makes it follow
child processes as they are created: -f. So,
strace -f -o foo mount /path/to/mountpoint
…revealed the last few things that mount.davfs was doing:
6475 write(2, "\n", 1) = 1
6475 setresuid32(-1, 0, -1) = 0
6475 access("/dev/cfs0", F_OK) = 0
6475 open("/dev/cfs0", O_RDWR) = 5
6475 chown32("/dev/cfs0", 0, 100) = 0
6475 chmod("/dev/cfs0", 0660) = 0
6475 getuid32() = 0
6475 setresuid32(-1, 0, -1) = 0
6475 ioctl(5, CIOC_KERNEL_VERSION, 0xbffffc54) = 0
6475 open("/var/run/mount.davfs/cfs0.pid", O_RDONLY) = 6 <------ AHA!
6475 fstat64(6, {st_mode=S_IFREG|0644, st_size=5, ...}) = 0
6475 mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7fe8000
6475 read(6, "5852\n", 4096) = 5
6475 write(2, "/sbin/mount.davfs: ", 19) = 19
6475 write(2, "Process 5852 already uses device"..., 129) = 129
6475 write(2, "\n", 1) = 1
6475 exit_group(1) = ?
6474 <... wait4 resumed> [{WIFEXITED(s) && WEXITSTATUS(s) == 1}], 0, NULL) = 6475
6474 --- SIGCHLD (Child exited) @ 0 (0) ---
6474 exit_group(1) = ?
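A log from strace -f gets long quickly, so it can help to boil it down to the file-related calls before reading it line by line. What follows is only a sketch: list_opened_paths is a made-up helper name, not part of strace, and it assumes the log was written with -o so that each line looks like the ones in the excerpt.

```shell
# list_opened_paths LOG
# Hypothetical helper: print each distinct path that appears as the first
# quoted argument of an open(), openat(), or access() call in an strace log
# (the file written by "strace -f -o LOG ...").
list_opened_paths() {
    # The leading PID column is optional, so logs made without -f also work.
    grep -E '^([0-9]+ +)?(open|openat|access)\(' "$1" |
        # Keep only the first double-quoted string on each matching line.
        sed -nE 's/^[^"]*"([^"]*)".*/\1/p' |
        sort -u
}
```

Run as list_opened_paths foo against the log from the command above, the list would include /var/run/mount.davfs/cfs0.pid, which is a lot easier to spot than one line in a few thousand.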
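Note the “AHA!” above. The pidfile was /var/run/mount.davfs/cfs0.pid, and it held 5852, the PID of the long-dead process from the error message. Deleting that file is presumably the manual cleanup the message had in mind.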
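Before deleting a pidfile by hand, it is worth confirming that the process it names really is gone. Here is a minimal sketch of that check, assuming a Linux box. pidfile_status is a hypothetical helper of mine, not something davfs2 ships.

```shell
# pidfile_status PIDFILE
# Hypothetical helper: prints one of "missing", "garbage", "alive", "stale".
pidfile_status() {
    if [ ! -r "$1" ]; then
        echo missing
        return 0
    fi
    # Take the first line and strip whitespace, e.g. "5852\n" becomes "5852".
    pid=$(head -n 1 "$1" | tr -d '[:space:]')
    case $pid in
        '' | *[!0-9]*)
            echo garbage
            ;;
        *)
            # kill -0 sends no signal; it only tests whether we could signal
            # the process. It also fails for a live process owned by another
            # user, so fall back to looking for /proc/<pid> (Linux only).
            if kill -0 "$pid" 2>/dev/null || [ -d "/proc/$pid" ]; then
                echo alive
            else
                echo stale
            fi
            ;;
    esac
}
```

If pidfile_status /var/run/mount.davfs/cfs0.pid prints stale, removing the file should be safe. An answer of alive is weaker evidence, since PIDs get reused and the number may now belong to an unrelated process.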
I have no idea how /var/ is supposed to be arranged. Some
things make sense: logfiles are in /var/log, and mailboxes
live in /var/spool/mail (depending on your setup). But
some things don’t: MySQL databases live in
/var/lib/mysql. (They aren’t code libraries!) I suppose
that /var/run makes sense for pidfiles, but why couldn’t
it be something obvious like /var/pids?
(I know. I know. That would be too easy.)
Update: The source of the problem that prompted me to mount and
unmount davfs2 until it died and left behind a stray pidfile was also
in /var/ – the /var/cache/davfs2
directory. As it turns out, if you remove files behind davfs2’s back,
it can get confused, spewing errors like this:
/sbin/mount.davfs: Connection failed, mounting anyway. File system will only be usable when connection comes up.
When I looked at the network dumps (using tethereal "tcp port $PORT" -i lo -d tcp.port==$PORT,http -x -w path-to-output,
which I highly recommend), it turned out that davfs2 was requesting
files that I’d long since nuked. Blowing away the contents of
/var/cache/davfs2 fixed the problem. (Why the heck do
these hang around between mounts, anyway?)
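Since removing files behind davfs2's back is what caused the trouble in the first place, it seems prudent to empty the cache only while the share is unmounted. The sketch below does that. clear_davfs_cache is a hypothetical helper, the mounted check via /proc/mounts is my own precaution rather than anything davfs2 documents, and the default cache path is simply the one from this post.

```shell
# clear_davfs_cache MOUNTPOINT [CACHE_DIR]
# Hypothetical helper: empty CACHE_DIR, but refuse if MOUNTPOINT is listed
# in /proc/mounts. CACHE_DIR defaults to /var/cache/davfs2.
clear_davfs_cache() {
    mountpoint=$1
    cache=${2:-/var/cache/davfs2}
    # /proc/mounts lists "device mountpoint fstype ...", so match the
    # mountpoint as a whole space-delimited field.
    if grep -qsF -- " $mountpoint " /proc/mounts; then
        echo "$mountpoint is still mounted; unmount it first" >&2
        return 1
    fi
    # Delete everything inside the cache directory but keep the directory.
    find "$cache" -mindepth 1 -delete
}
```

You would call it as clear_davfs_cache /path/to/mountpoint, most likely as root, since the cache lives under /var.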
