Jérôme Belleman

Remotely Executing Processes from Files

21 Sep 2018

A rather absurd approach I implemented to remotely, non-interactively control a computer with a file-based interface. Absurd, yes, but it does work wonders.

1 Context

What do you do when you need to remotely operate a computer that's behind a router? The conventional way is to have it forward ports – typically 22 for SSH. All common routers let you do this in a matter of a few clicks. But what if you're a little scared of the security consequences, you want to leave the router configuration untouched, you don't like doing things the conventional way and you feel like coding a new service anyway? You reinvent the wheel and give it an eye-catching rim. That's what I did when I wrote filerexec.

2 Remotely Running Jobs

The idea is to adopt a pull approach rather than a push one, and leave all the networking and security concerns to a file hosting service such as Dropbox. In essence, you have a service watch a job file called filerexec.yaml which describes what to run and how. In this example, filerexec is instructed on how to run the ls and date jobs. The commands key is a command line as you'd enter it in a shell and as such supports control operators, redirections, and all the rest of it:

ls:
    commands: ls /home/john; ls /etc > /home/john/ls-etc
    operation: start
    stderr: true
    stdout: true
date:
    commands: date
    operation: start
    stderr: false
    stdout: true

In the ls job, filerexec lists the contents of the /home/john directory to stdout. It then lists the contents of the /etc directory, this time writing it to the /home/john/ls-etc file. The stdout and stderr keys enable or disable the recording of stdout and stderr, respectively, to the filerexec log file. In the case of ls, only the contents of /home/john will be logged, since those of /etc will be recorded in /home/john/ls-etc as per the redirection.

The operation key, when set to start, causes the job to be executed on the remote host. Setting it to stop not only prevents the job from starting if it isn't already running, but also kills the corresponding remote processes if it is. For this particular key, I chose to make the file a two-way interface: on the one hand, the local client defines the commands to run, whether or not to log stdout and stderr, and when to start and stop the remote job; on the other hand, the remote host which runs the job also gets to edit it. In particular, it sets the operation key to stop when the job finishes. It's like using Vim as a dashboard.
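
To illustrate the two-way traffic, here's what the date entry from the example above might look like once the job has run its course – the remote host will have flipped operation back to stop, leaving the other keys alone:

date:
    commands: date
    operation: stop
    stderr: false
    stdout: true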

Is this outrageously dangerous in that it gives way to all sorts of race conditions? Probably, and yet, after I had used it rather intensively for a few months to manage tens of jobs, this approach proved not only to work fine in practice, but also to be convenient. I probably fully embraced the UNIX philosophy on this one: I didn't mind living dangerously because I knew what I was doing, and I understood and accepted the consequences of having different processes edit the same file. Then again, Vim made my life easy by warning me that the job file had been changed behind my back by the remote host, so I knew I had better reload it before saving any changes I'd made on my end in the meantime. Yanking those changes before reloading allowed me to avoid any inextricable definition conflicts.

3 Watching and Synchronising the Job File

If Vim makes it easy to merge on the fly the changes made by both me and the remote host, I'm a little scared that Dropbox's conflicted copies feature might get in the way in some as-yet unmet circumstances. When several users edit the same file at the same time, Dropbox keeps a renamed copy of it instead of overwriting it – something named like filerexec.yaml (John Doe's conflicted copy 2018-09-21). If the remote host only reads filerexec.yaml, none of the changes you might bring to the job file will be taken into account. Not even a salvage operation like:

mv "filerexec.yaml (John Doe's conflicted copy 2018-09-21)" filerexec.yaml

If you're physically far away from the remote host, you've had it. Given the stakes and consequences in the context of filerexec, you'd much rather lose changes on one end than end up stuck. Sadly, Dropbox doesn't offer a way to turn off this feature. I've not suffered from this, thank goodness. But I really ought to have filerexec watch for conflicted copies and handle them one way or another.
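
A minimal sketch of what such handling might look like, assuming the job file lives in /home/john/Dropbox and that keeping the most recently modified copy is an acceptable way out:

import glob
import os

JOBFILE = '/home/john/Dropbox/filerexec.yaml'

# Look for Dropbox conflicted copies of the job file.
conflicts = glob.glob('%s (*conflicted copy*)' % JOBFILE)
if conflicts:
    # Promote the most recently modified copy to canonical job file.
    newest = max(conflicts, key=os.path.getmtime)
    os.rename(newest, JOBFILE)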

Speaking of watching files and directories, this project was an opportunity for me to try watchdog, a Python module which helps you monitor filesystem events. I've been a lifelong and happy user of (py)inotify; nevertheless, I was eager to try something new. Somehow, watchdog offers a simpler interface while achieving precisely what you need, whereas I always found pyinotify rather hard to coax, even though it appeared to be very flexible. It's usually the other way around. However easy watchdog made my life, I somehow got myself tangled up in an algorithm of mine which, under some as-yet unclear conditions, fails to collect events – something I still need to sort out. As an easy way to alleviate this problem, I added a safety net to filerexec to force a job file change check every so often, even if it didn't collect any event from watchdog. This gave me a chance several times to restart filerexec and put it out of its misery.
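
The core of such a watcher fits in a few lines. Here's a minimal sketch of what it might look like – the watched path and the check_jobfile() function are hypothetical stand-ins for filerexec's own logic:

import time

from watchdog.events import FileSystemEventHandler
from watchdog.observers import Observer

WATCHED = '/home/john/Dropbox'  # hypothetical Dropbox directory
INTERVAL = 60                   # seconds between forced checks

def check_jobfile():
    # Stand-in for filerexec's logic: reload filerexec.yaml and
    # start or stop jobs accordingly.
    pass

class JobFileHandler(FileSystemEventHandler):
    # Listen for on_any_event rather than on_modified, so that
    # events reported differently on Windows are caught too.
    def on_any_event(self, event):
        check_jobfile()

observer = Observer()
observer.schedule(JobFileHandler(), WATCHED)
observer.start()
try:
    while True:
        # Safety net: check the job file every so often, even if
        # watchdog didn't collect any event.
        time.sleep(INTERVAL)
        check_jobfile()
finally:
    observer.stop()
    observer.join()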

This works because, interestingly enough, filerexec can restart itself. It's not something I planned, but I once needed to write a job for it to restart itself and – thankfully – it worked. I was quite pleased with that, not least because I was a few miles away from the remote host and still wearing pyjamas. I wrote a short Python script to carefully aim for the process to shoot down, rather than coming up with a write-only zsh one-liner:

import os
import re
import subprocess

# Match any Python process running filerexec.
regex = re.compile('python.*filerexec')

for entry in os.listdir('/proc'):
    if entry.isdigit():
        try:
            # /proc/<pid>/cmdline holds NUL-separated arguments; the
            # regex happily matches across the separators.
            with open('/proc/%s/cmdline' % entry) as fhl:
                cmdline = fhl.read()
        except IOError:
            continue  # the process vanished in the meantime
        if regex.search(cmdline):
            print("Killing %s: %s" % (entry, cmdline))
            subprocess.call(['kill', entry])

I summoned the Python script with filerexec as follows:

kill:
    commands: python /home/john/Dropbox/scripts/kill.py
    operation: start
    stderr: true
    stdout: true

This of course will only be useful if you've set up a service for filerexec with e.g. systemd, such that it gets restarted if it's killed or something otherwise nasty befalls it. Rather predictably, filerexec won't be able to update the operation key in the job file, as it will already be pushing up the daisies by the time kill.py is through with it. Consequently, as soon as systemd brings it back from the dead, it will commit suicide again. And again. And again. Until you realise what's going on and manually set the operation key to stop, at which point everything will be back to normal.
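
For reference, here's a minimal sketch of what such a systemd unit might look like – the paths are hypothetical, and the Restart directive is what keeps bringing filerexec back from the dead:

[Unit]
Description=Remotely executing processes from files

[Service]
# Hypothetical path; adjust to wherever filerexec lives.
ExecStart=/usr/bin/python /home/john/filerexec.py
# Restart filerexec whenever it dies, for whatever reason.
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target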

4 Windows in Addition to Linux

The primary reason why I gave up on pyinotify in favour of watchdog isn't that I didn't like pyinotify anymore – I always quite liked it, and still do, in fact. The main reason was that I needed to remotely run jobs not only on Linux hosts, but also on Windows ones. And Windows doesn't understand inotify. Another neat aspect of watchdog is that it harnesses whichever filesystem event monitoring mechanism it finds available on the machine it's running on – inotify on Linux, kqueue on BSD systems (including macOS), ReadDirectoryChangesW on Windows, and disk polling if all else fails.

At first, when I was testing between Linux machines, all I needed was to catch on_modified events. But as I started involving Windows in the tests, I realised it wasn't enough and listened for on_any_event instead, to please everyone. Easy and effective.

If the filerexec project was an opportunity to discover and like watchdog, it was also one to (re-)discover and at-first-not-quite-utterly-loathe Windows. In particular, when I came to turn filerexec into a service, I must confess that I had a rather pleasant experience that I felt like sharing here. A while back, I discussed the weird and wonderful world of systemd and how I fell in love with it for trying so hard to be helpful. In all fairness, Microsoft did a good job of making service management just as comfortable. Only, they'd already done it all a long time ago. Can't believe I just said that.

Nevertheless, this didn't keep me from installing Cygwin to bring some of the UNIX beauty to Windows. In fact, it played an important part in setting up a service for filerexec. There's a very useful utility called cygrunsrv available in the Cygwin package of the same name. It lets you manage Windows services. For instance, installing filerexec as a service was a matter of running:

cygrunsrv -I filerexec -p /home/john/filerexec.sh -n

Ensure that you have administrator privileges and that filerexec.sh is executable. Also specify the full path to filerexec.sh, or Windows won't be able to start the service. As making cygrunsrv pass arguments to filerexec proved a little hard, I soon chose to have it call a wrapper shell script which, in turn, executes filerexec with the bespoke arguments (see the sketch after the screenshot below). The -n option indicates that the service should never exit by itself, and that Windows may restart it automatically if it does. Open the Start menu and type Services to show the list of available services: filerexec should appear there for you to start or stop. Double-click its entry in the list and switch to the Recovery tab to make sure that filerexec gets restarted every time something horrible happens – this is where the -n option comes in:

Setting Up Automatic Restart in the Filerexec Service Properties
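
As for the wrapper script, here's a minimal sketch of what it might look like – the argument filerexec takes is hypothetical:

#!/bin/sh
# Hypothetical wrapper: run filerexec with its bespoke arguments,
# since making cygrunsrv pass them directly is a little hard.
exec python /home/john/filerexec.py /home/john/Dropbox/filerexec.yaml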
