How do we set up continuous integration (CI) for a project? Okay it’s not that hard, we just copy some yaml files from a similar project and edit in our own project details. But like, hypothetically, what if all the other CI config files in the world got deleted? Or in an even more far-fetched scenario, what if we were working on a project that was the first of its kind, and its needs were unique? How would we proceed?
Surely in that case, we’d read up on the CI service’s docs and figure out exactly how to write what we need. And everything would just work exactly as documented. On the first try.
Nah, of course we couldn’t do that. There wouldn’t be enough documentation, or things would behave differently from how they’re documented, or we’d simply make mistakes. Or all three of those things would happen.
So if not that way, then how? Well I know how to write shell scripts: I try a bunch of stuff in a terminal, then I copy down the stuff that works. Could something like that work for writing a script for a CI config?
The problems with applying this to writing a script for CI are that:
The CI environment is different enough from your own computer (e.g. filesystem layout, installed software, authentication credentials) that developing a script locally and using it on CI might not work;
Trying out commands on CI is pretty slow. In my impression of a typical CI environment, each attempt means editing a command in a file, committing, pushing, and waiting for a bunch of other setup to run at the beginning of the CI run.
The goal of Linechunk is to alleviate these two problems: to run commands in the CI environment so that you can find out how they behave there, and to let you run those commands interactively without baking them into a Git branch and waiting for a new CI run for each one.
Have you seen those software installation instructions that are like “run curl rustup-or-whatever | sh”? People have raised some complaints about that method. The server could send you anything, even something surreptitiously different from the correct installer. The shell will run commands as they’re received, so if something goes wrong with the network, you might end up with only part of the installation script run.
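The “runs commands as they’re received” part is easy to check locally. Here is a small sketch in Python (not part of Linechunk; the function name run_line_by_line is made up for illustration) that feeds commands to sh through a pipe one at a time and reads each command’s output before sending the next one. It assumes an sh on the PATH and that every command prints exactly one line.

```python
import subprocess


def run_line_by_line(commands):
    """Feed commands to `sh` one at a time, reading each result before sending the next."""
    proc = subprocess.Popen(
        ["sh"],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )
    outputs = []
    for command in commands:
        # Send exactly one line, then push it through the pipe.
        proc.stdin.write(command + "\n")
        proc.stdin.flush()
        # Assumption: each command prints exactly one line of output.
        # A command that prints nothing would make this call block forever.
        outputs.append(proc.stdout.readline().rstrip("\n"))
    # Closing stdin is the end of the script, so the shell exits.
    proc.stdin.close()
    proc.wait()
    return outputs
```

If the shell waited for the whole script before running anything, the first readline() would never return, because the second command has not been written yet at that point.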
These properties of curl | sh are real flaws with using it as an installation mechanism, but they’re just right for building Linechunk. We can set up a CI script to run curl somewhere | sh, and that “somewhere” server is free to send anything and have it executed as it’s sent—line by line, even. (Hence the “line” in the project name. “Chunk” refers to the use of “chunked” transfer encoding in HTTP, which allows the server not to pre-commit to a total length of the script before it starts sending commands.)
Linechunk is, on one end, a server—the mythical “somewhere” described above—that serves up a shell script one line at a time, which you can curl and pipe into a shell in your CI script. On the other end, it’s a little web interface for you to tell it what line to serve next.
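To make the server end concrete, here is a minimal sketch of the idea in Python. This is not Linechunk’s actual code. The names pending_lines, encode_chunk, ScriptHandler, and start_server are all made up for illustration, and a plain in-memory queue stands in for the web interface. Each queued line is sent as its own HTTP chunk, and a None in the queue ends the script.

```python
import queue
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Hypothetical stand-in for the web interface: whatever gets put here is served next.
pending_lines = queue.Queue()


def encode_chunk(data: bytes) -> bytes:
    """Frame data as one HTTP/1.1 chunk: hex length, CRLF, payload, CRLF."""
    return f"{len(data):x}".encode("ascii") + b"\r\n" + data + b"\r\n"


class ScriptHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # chunked transfer encoding needs HTTP/1.1

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Transfer-Encoding", "chunked")  # no Content-Length up front
        self.send_header("Connection", "close")
        self.end_headers()
        while True:
            line = pending_lines.get()  # blocks until someone queues a line
            if line is None:  # sentinel: end of script
                break
            self.wfile.write(encode_chunk(line.encode("utf-8") + b"\n"))
            self.wfile.flush()
        self.wfile.write(b"0\r\n\r\n")  # zero-length chunk terminates the body
        self.wfile.flush()

    def log_message(self, format, *args):
        pass  # keep the sketch quiet


def start_server(port=0):
    """Start the server on a background thread; port 0 picks any free port."""
    server = ThreadingHTTPServer(("127.0.0.1", port), ScriptHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

With this running, a client that curls the server and pipes the response into sh would block until something calls pending_lines.put("echo hello"), run that line, and then block again waiting for the next one.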
Coming up next, I’ll write about how authentication works in Linechunk.
Linechunk is a public Glitch project. Probably can’t make it a private Glitch project, or it would be much more complicated to curl it. But we don’t want to let the general public send commands to our CI runs. Thus I’ve put in some password protection for the side that sends commands.
To use Linechunk, you first choose a password, and you get a special URL to curl which contains a hash of the password. When your CI environment curls that URL, the server gets a copy of that hash and uses it to authenticate you when you come to send commands.
And that’s kind of neat I think. Because the password is hashed, even if the CI logs are public, e.g. in a public repo using GitHub Actions, viewers won’t be able to log in from anything that shows up in the logs.
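Here is a rough sketch of that scheme in Python. The details are my assumptions for illustration, not a description of Linechunk’s actual code: SHA-256 as the hash, a made-up base URL and path layout, and the hypothetical helper names password_hash, script_url, and may_send_commands.

```python
import hashlib
import hmac

BASE_URL = "https://linechunk.example"  # hypothetical host, for illustration only


def password_hash(password: str) -> str:
    """Hash the password. SHA-256 is an assumption, not necessarily what Linechunk uses."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def script_url(password: str) -> str:
    """Build the URL that goes in the CI script. It contains only the hash."""
    return f"{BASE_URL}/script/{password_hash(password)}"


def may_send_commands(submitted_password: str, hash_from_url: str) -> bool:
    """Check a password typed into the web interface against the hash curl presented."""
    return hmac.compare_digest(password_hash(submitted_password), hash_from_url)
```

The server never needs a stored user record: the hash arrives with the curl request, and the password arrives from the web interface. One caveat with a plain fast hash like this is that a weak password could still be guessed offline from a hash in public logs, so a strong password (or a slow, salted hash) would matter.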
Next I’ll write about limitations and how Linechunk compares with related projects.
This gives you a one-directional stream of data. You don’t see the results of the commands on the web interface. The assumption is that you can look at your CI logs in real time to see what’s happening with these commands. A more sophisticated system could stream data back somehow. Would a web socket be suitable? But we don’t really have any widely installed web socket clients like we have curl.
You only control stdin. The curl | sh invocation does not create a TTY. That’s good enough for some things, but certain utilities like less and sudo (glitch pls) would rather read from /dev/tty (I think, correct me if I’m wrong on that). Not sure how that ends up on CI, but I was testing from inside the Glitch project terminal, and it wouldn’t work right. I suppose in real CI you wouldn’t use a pager in the first place and sudo would be passwordless, so maybe this isn’t a big deal.
There’s no echo of input on the CI side. Linechunk recommends running the shell with -v to see a copy of the commands though, which gets you most of the utility. But for providing inputs to interactive programs, it’s gonna look weird in the CI logs with the prompts showing up but the stuff you type not showing up.
There’s no port forwarding or file copying or anything. It’s not like we’re running a whole SSH connection.
Although the password-hash-in-the-URL makes Linechunk not have to store a users database, it also means you can’t change your password unless you push a change to the URL in your CI script.
Teleconsole (GitHub - gravitational/teleconsole at b750dfe4a8881de20c4371f0cc5bb6f88f0b116b):
This service (no longer offered) would set up a proxied SSH server and give you a URL which opens an in-browser SSH client, with the right credentials, so you could click that and start running commands. I got a lot of inspiration from this. (A lot of motivation from it being shut down too.) Although it’s not as suitable for publicly visible CI, because anyone could click the link. Linechunk adds a password so that you can control who is allowed to send commands.
Reverse shells (What Is a Reverse Shell | Acunetix, a search result on the topic):
These, in the network intrusion field, are commands or programs that connect from a victim machine to an attacker’s machine and run things received from that connection. (They’re reverse in the sense that connecting to the attacker’s machine is the reverse of listening for a connection from the attacker’s machine.) If you’re breaking into a system where you can run an arbitrary command, you could run such a reverse shell command and then be able to send multiple commands. Like a wish for more wishes. In these scenarios, there’s no assumption that you can see the output on a separate CI log, so these commands/programs are designed to send the results back too. A lot of these use raw TCP for communication, whereas Linechunk uses HTTP so that we can run it easily on Glitch.
Maybe the whole password hashing thing is overengineered. Would anyone really trust a Glitch project not under their own control to send shell commands to their CI environment? Not sure. I run a lot of curl | sh installers for various things. And I think some projects I have allow CI to run on PR branches from other people. But couldn’t all this have been some simple shared secret in the .env file, and anyone else who wants to use it should remix the project and set their own secret? Hm