Experience report: building an administrative connection to Glitch

At first I wanted to build a tool to transfer files to and from a Glitch project, and many posts in the feature requests section confirm I’m not alone in wanting one.

I decomposed this into two problems: (i) a way to transport data between a local computer and a Glitch project and (ii) some additional logic and user interface to harness that data transport for file transfer. This report focuses on (i), and that’s all we’ll be discussing: what’s the best way for a project administrator to set up a one-off data communication between a local computer and a Glitch project?

Background: the elemental communications with Glitch

There are a few ways a Glitch project connects to the outside. Let’s review these:

  1. Anyone (only approved users in the case of private projects) can communicate with the project over HTTP on the project’s glitch.me subdomain.
  2. A project member can view and modify code files through the web editor.
  3. A project member can open a terminal session.
  4. A project member can connect to a debugger.
  5. Anyone (only approved users in the case of private projects) can fetch the project as a Git repository and a project member can push to it.
  6. A project member can upload assets.
  7. A project member can view project logs.
  8. A project member can execute a command and get its output.

I’ve built some cool things with (1), and it’s a highly capable means of communication, but it doesn’t seem right for this job. I’m focusing on public projects because they’re available for free, and on a public project anyone can reach the glitch.me subdomain, which means we’d have to build our own authentication system. It’s also more disruptive, in that the tool would somehow have to set something up on the project to intercept whatever data we’re sending.

Option (2) is limited to files that show up in the editor, and it’s limited to text. I can’t help but imagine that I’d be using a ‘file transfer’ tool to back up and restore binary database files from .data, though, which are neither shown in the editor nor text. I have some experimental scripts that do this already, but for these reasons they don’t live up to my expectations of a ‘file transfer’ tool. This route does have the nice property that you don’t have to run refresh afterwards, so I’d like to add it to my snail CLI suite regardless.

Option (3) is built on WeTTY, which is built on a package called node-pty, which it turns out has a text-based API. On top of that, there’s whatever translation terminal devices normally do to make sure, for example, that nothing takes effect until you press enter and that line endings get munged up properly.
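To make that concrete, here’s a minimal sketch of my own (not WeTTY’s actual code) of what that text-based API looks like, assuming the 0.7-era node-pty linked later in this thread:

const pty = require('node-pty');

// Spawn a shell attached to a pseudoterminal. Everything in and out is text.
const term = pty.spawn('bash', [], { name: 'xterm-color', cols: 80, rows: 24 });

// Output arrives as strings, after the terminal device's translation
// (echoing, line ending munging, and so on).
term.on('data', (data) => process.stdout.write(data));

// Input is written as if typed; nothing runs until the carriage return.
term.write('echo hello\r');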

Option (4) is intriguing, but I see a lot of people complaining that it doesn’t connect right, so I’ve held off on using it for this job. But I’ve recently started looking into it, and I might revisit this.

Option (5) is pretty much meant for transferring files, but it has some limitations. It’s restricted to stuff in the Git repository at /app. Moreover, if I’m to be a good person, I’d have to restrict myself to things that should go in version control. Furthermore, any file transferred this way would first have to touch down in the most precious /app mount, i.e. the one limited to 200 MB (some other number for boosted projects), because that’s where the .git directory is.

Option (6) can be, and in recommendations already on this forum is, paired with a curl or wget from within the container to achieve a fairly low-hassle one-directional transfer. You could also transfer things out of a project by giving the project a copy of Glitch’s upload policy; I want to explore the idea of a project uploading assets on its own sometime. But this approach has drawbacks too. It makes the file public, and you can’t later delete the file from the CDN, which is undesirable from a security standpoint. Maybe you could hide it by uploading under a long unguessable filename and not updating the .glitch-assets file, but I don’t know how secretive Glitch is overall with asset filenames. Also, it’s two separate transfers through two different servers, which feels inefficient.

Option (7) is too crazy, even for me. Perish the thought.

Option (8) actually exists separately from the terminal, in case you didn’t know. It’s used for things like the editor’s ‘find in all files’ function. You can send a command and get the result back as a string. But that might be a little scary for file transfer, because in several places the file would have to fit in memory.

An unlisted option would be to build on a bug in Glitch that lets us connect to a project. But we can’t coherently both (i) hope for Glitch to get faster at fixing security vulnerabilities and (ii) hope for the longevity of tools that exploit a vulnerability. And we’d better side with (i), because what good would it be if we could transfer files but those files weren’t safe?

Next post: choosing from these

The options don’t look great, but they’re all we have. In the next post, I’ll discuss which one I went with.


I would say option three is quite promising. I believe all you need to connect to your project’s terminal from outside Glitch is the socket URL, your persistent token, and for the project to be on. Good luck with this; I’m curious to see what you come up with.

The least bad option

This might not have turned out to be true, but when I was deciding among these various ways to communicate with a Glitch project, it sure looked like option (3) was the least bad of the bunch. @RiversideRocks, I see you’re thinking the same.

Communicating over the terminal indeed has that going for it. And it has all these things that were variously missing in the other options:

  • It’s very undisruptive. You don’t need to change anything about how your project runs.
  • We can rely on Glitch’s existing authentication mechanisms.
  • You can read and write files that don’t show up in the editor, such as in .data. Or /tmp. Or even node_modules.
  • It’s streaming, so it ought to be possible to transfer a file without loading it all into memory.
  • Yeah, it’s text-only, and there’s some weird translation and buffering due to the pseudoterminal, but these are things we ought to be able to work around end to end. The fact that it starts you out in a shell makes it pretty easy to send a command to decode some sort of encoding protocol.
  • It seems to be pretty reliable, with not many reports of it being broken while the rest of the project was working.
  • It can write directly to a file without first making another copy on a CDN or somewhere else in the filesystem.
  • It’s clear how it would work in both directions.

But there were many things I didn’t know, and those things ended up making this still kind of bad in the end. I’ll discuss those things in detail later.

Related work

This couldn’t be a new problem, could it? Transferring a piece of data over a terminal shell session? Nah, of course not.

XMODEM and subsequent protocols look a lot like what we’ll need to do. From this video of a sample use (https://youtu.be/zxTO5qxti-I?t=381): you run a program on the remote side, giving it a destination filename; then you get your local side to send the file, encoded in some protocol that the receiver program understands.

tramp is Emacs’s way to edit files on a remote server. It has extensive logic (https://github.com/emacs-mirror/emacs/blob/emacs-27.1/lisp/net/tramp-sh.el) to work through all sorts of configurations. It’s able to load and save files through ‘inline’ shell methods, which go through a normal shell terminal session. I want to revisit this, to try adding a tramp method that goes through Glitch’s terminal API.

Ansible and Salt are remote administration tools that run, or can run, ‘agentless,’ where they SSH into a machine and don’t require a special program to be installed there beforehand. They send commands and data over SSH. But we shouldn’t take that to mean they do things through a shell session: SSH lets you run commands without a pseudoterminal, it has subsystems specifically for file transfer, it has port forwarding, etc. That’s the stuff we’re trying to build in the first place. There isn’t much literature on how these tools do their job, and I don’t really know where to look in their codebases, so I haven’t gotten to the bottom of what we can learn from them. Here are some links anyway, though I’m not sure they’re the right places to look: plugins/connection/ssh.py for Ansible and client/ssh/__init__.py for Salt.

Metasploit, that pentesting tool, you’d think would have a leet way to automate various things given a shell session. Apparently (rex-exploitation/bourne.rb at v0.1.26 · rapid7/rex-exploitation · GitHub) it echoes a base64-encoded program into a temporary file, then decodes it, chmods it, and runs it. But the actual payload it delivers (metasploit-framework/shell_to_meterpreter.rb at 6.0.27 · rapid7/metasploit-framework · GitHub) is by default a program that makes a separate connection back to your machine, which would be a lot more hassle for us, since we’d have to require people to set up port forwarding.

There’s a lot of good stuff in this related work, but some of it we don’t have to worry about for this job. Our connection is reliable and won’t go flipping bits. And Glitch containers are all homogeneous, with a lot of software available.

Next post: bad attempts

Eventually I did start writing code to try out a few different ideas. In the next post, I’ll describe them and what I found out about why each one wouldn’t work.


Summary of problems to solve

We’ve avoided (tried to avoid) some serious problems by choosing to build on the terminal API, and what remains are things we should be able to solve ourselves.

  1. The API is in text, but we want to be able to transfer binary data.
  2. The pseudoterminal will translate line endings in various ways.
  3. The pseudoterminal will buffer an entire line before releasing it to the program. (Or the shell has a userspace line editor that buffers similarly? Please educate us if you have the details on this.)

That’s, at least, what I was aware of when I started experimenting. Other problems came up, which I’ll introduce along with the experiments that exposed them.

Idea 1: base64 -d >dst <<E_OF

A heredoc of base64-encoded data. To be extra sure, we can put some underscores in the delimiter, which shouldn’t show up in base64.
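Concretely, the sending side would look something like this (a sketch of my own, assuming a connected socket.io client called socket, WeTTY’s ‘input’ event for keystrokes, and the payload in a Buffer called data):

// Sketch of idea 1: wrap the whole payload in a heredoc.
socket.emit('input', 'base64 -d >dst <<E_OF\n');
socket.emit('input', data.toString('base64') + '\n');
socket.emit('input', 'E_OF\n'); // the delimiter; the underscore keeps it out of the base64 alphabet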

What went wrong: I found out that the shell doesn’t stream the heredoc content. It buffers it all. Memory usage went to the moon and Glitch terminated the session.

Idea 2: base64 -d >dst

Alright, since this isn’t a shell script and the shell and decoder are sharing the same input, let’s just hit enter and start sending base64 data.

What went wrong: You can’t stop. It turns out there’s no part of WeTTY that lets you gracefully signal the end of input. In real life, we’d be able to Ctrl+D our way out of the decoder, but that only works because we’re slow humans. Ctrl+D doesn’t specifically mean “The End”: it tells the terminal device to end a read call with whatever it has, which, if it has zero bytes ready, returns the same zero-byte result you get at the end of a file. Thus tacking a 0x04 onto the end of a big data stream won’t actually exit the decoder. You’d need some kind of delay to make sure the decoder had started another read call, and I’m philosophically opposed to fussing with that.
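In code, the non-solution looks deceptively simple (my own sketch; socket and data as before):

// Sketch of idea 2 and why it fails.
socket.emit('input', 'base64 -d >dst\n');
socket.emit('input', data.toString('base64'));
// The 0x04 below only produces the zero-byte end-of-file result if the
// terminal's input buffer happens to be empty when it arrives. Right after
// a flood of data, it merely flushes whatever is pending, and the decoder
// keeps on reading.
socket.emit('input', '\x04');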

Idea 3: head -c1234 | base64 -d >dst

Sort of like using a length prefix instead of a delimiter, let’s run a program that knows exactly how many bytes of encoded data to read.
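The local side then needs to know the encoded length up front. A sketch, assuming the data is sent as unwrapped base64 (wrapped output would add newline bytes to the count):

const fs = require('fs');

// Sketch of idea 3: base64 expands every 3 input bytes to 4 output bytes,
// so for unwrapped output the encoded length is computable in advance.
// srcPath is a hypothetical path to the local file.
const encodedLength = Math.ceil(fs.statSync(srcPath).size / 3) * 4;
socket.emit('input', `head -c${encodedLength} | base64 -d >dst\n`);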

What went wrong: By now things were better, and short transfers worked. But longer transfers were revealing more insidious problems.

Addendum to problems to solve

  4. Terminal echo will wastefully send back a copy of whatever we sent out.

Idea 4: stty raw; head ...

Putting the terminal in raw mode does a lot of good things for us:

  • It turns off the wasteful echoing, which feels more efficient.
  • It turns off the line buffering, which would allow us to send really long lines without guilt.
  • It turns off line ending translation, which, although that isn’t a problem with this prototype, would be one less thing to worry about.

But also:

  • It makes it impossible to send signals, which, although we don’t need them in this prototype, leaves us one less trick up our sleeve.

What went wrong: It actually didn’t turn off the echoing at all.

Idea 5: stty raw\n <delay> head ...

We actually need the stty raw to be received and executed before we start shoveling the data.

What went wrong: Having just declared myself philosophically opposed to fussing with delays, I couldn’t accept this.

Idea 6: stty raw; echo ready; head ...

This led to a lot more complexity on the local sending side. We’d send this command, wait for the ready to come back, and only then start sending the data. The extra round trip was lame.
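Here’s roughly what that looks like (my own sketch; I’m assuming WeTTY relays terminal output to the client on a ‘data’ event):

// Sketch of idea 6: hold the data until the remote shell confirms raw mode.
// The marker is spelled rea""dy in the command so that the echoed command
// line itself can't match; the shell glues the pieces back into 'ready'.
socket.emit('input', 'stty raw; echo rea""dy; head -c1234 | base64 -d >dst\n');

const onData = (output) => {
  if (String(output).includes('ready')) {
    socket.off('data', onData);
    startSending(); // hypothetical: begin shoveling the encoded data
  }
};
socket.on('data', onData);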

What went wrong: The echoing was gone, but long transfers would fail with a connection lost error. I verified that the project memory wasn’t climbing, and observed that other terminal sessions to the same project would stay open, so it wasn’t Glitch getting annoyed at excessive memory use. It didn’t seem like random chance of actual network problems either: many attempts all failed, around the same point in the transfer, even.

Idea 6a: ... ready; head -c1234 >dst

Around this time, I found out that node-pty, while documented to take string-typed writes, wouldn’t do any string-specific processing and would forward whatever JavaScript value it received to a Node stream write (node-pty/unixTerminal.ts at 0.7.8 · microsoft/node-pty · GitHub); thus, a Buffer would work too. WeTTY uses socket.io for the connection from the client, which also transparently supports sending Buffers. And we were using a raw mode terminal, which would do no translation, and a length prefix, so we wouldn’t need to reserve any special delimiter that couldn’t appear in the data. That all meant we could perhaps get away without doing any encoding.
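The change on the sending side was tiny (sketch; src is the file read stream):

// Sketch of idea 6a: emit raw Buffers and let socket.io, WeTTY, and node-pty
// pass them through untouched. No encoding, no 4/3 size inflation.
src.on('data', (chunk) => {
  socket.emit('input', chunk); // chunk is a Buffer, not a string
});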

What went wrong: Cutting out the encoding actually worked for the purpose of, well, cutting out the encoding. Neither WeTTY nor socket.io nor node-pty nor anything had problems with the binary data, and a sample binary file could be sent, and the checksums would match. But this didn’t fix the ‘connection’ problem in common with idea 6.

As for this optimization, I ended up abandoning it for a few reasons. First, I wasn’t sure socket.io could send binary data other than by base64-encoding it internally. Second, I wasn’t able to measure the speedup empirically, because the variance in my experiments’ timings dominated. Third, passing a value type inconsistent with the node-pty documentation felt kind of risky for maintainability.

Next post: the connection loss problem

I was stuck on this connection loss problem for a while. In the next post, I’ll describe what caused this problem and several more ideas I tried in order to address it.


The connection loss problem: no backpressure

socket.io has a debug logging system where you set an environment variable to have it print out what’s going on internally. I tried running a long upload with this enabled. Here’s what showed up:

  1. Many large chunks of the file would be ‘sent’ very quickly.
  2. 25 seconds into the program, the socket.io client itself would send a ‘ping’ message.
  3. 5 seconds after that, the client would get mad that it hadn’t received a ‘pong’ message.
  4. The client would determine that the connection was unresponsive and disconnect it.
  5. The program would exit abnormally because it couldn’t handle the connection loss.

The local side reads the large-ish file pretty quickly. I had been running this experiment a few times, and the whole file was probably in cache. The code to bring it over to socket.io was simple. It was something like this:

const io = require('socket.io-client');
const fs = require('fs');

const socket = io(...);
const src = fs.createReadStream(...);
src.on('data', (chunk) => {
  socket.emit('input', whateverEncodingThereWas(chunk));
});

There was no backpressure. We would give this poor socket a huge amount of data more or less all at once. Some layer below that obviously wouldn’t be able to send it over the network instantaneously, so that data would be waiting in line somewhere.

Anyway, later, on an interval, socket.io would try to send a ping to check if the connection still works. That ‘ping’ message gets in the back of the line, but it’s so far back it won’t stand a chance. The five seconds that the client is willing to wait for a pong will pass, and the line will have only inched forward.

Remember that time we were looking down on those other options for reading the whole file into memory? That’s us right now. And we’re not just wasting memory; the deluge is starving out another job that’s on a deadline. We have some new problems.

Addendum to problems to solve

  5. Uncontrolled reading will make memory usage grow.
  6. Uncontrolled writing will starve out the pinging and kill our connection.

So it looks like we’ll need to get backpressure working.

But you couldn’t do the usual Node.js thing where you check whether write (or emit, in this case) returns false and wait for a drain event; socket.io (Client API | Socket.IO) doesn’t provide you with that information. The next level down, called engine.io, has a drain event, but I found out it’s faked for the WebSocket transport (engine.io-client/websocket.js at 3.5.0 · socketio/engine.io-client · GitHub). Reaching into even lower levels of abstraction seemed too complicated, because one of engine.io’s jobs is to switch around between transports as needed.
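For contrast, the usual pattern with plain Node streams looks like this, with src as the file read stream and writable a hypothetical stream.Writable; it’s exactly the signal socket.io’s emit doesn’t give us:

// Standard Node.js backpressure: pause the reader when the writer's buffer
// fills, resume on 'drain'. socket.io's emit offers no equivalent to check.
src.on('data', (chunk) => {
  if (!writable.write(chunk)) {
    src.pause();
    writable.once('drain', () => src.resume());
  }
});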

Idea 7: Acknowledge each chunk at the application layer

Increase the complexity by running a custom program in the Glitch project that receives data chunk-by-chunk and sends back a little something to acknowledge that it received each one. Also increase the complexity by having our program watch for these acknowledgements.

And we can’t send just one chunk at a time. This sort of ‘stop and wait’ design was one of the issues with XMODEM (XMODEM - Wikipedia) that later protocols notably improved upon. For performance, we’d need to allow a ‘window’ of chunks to be out at the same time, as sketched below.
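A sketch of my own of what that looks like, with hypothetical encode and countAcks helpers for the chunk framing and acknowledgement parsing, and again assuming terminal output arrives on a ‘data’ event:

// Sketch of idea 7: cap the number of unacknowledged chunks in flight.
// WINDOW is the magic number I never found a safe value for.
const WINDOW = 8;
let inFlight = 0;

src.on('data', (chunk) => {
  socket.emit('input', encode(chunk));
  if (++inFlight >= WINDOW) src.pause();
});

socket.on('data', (output) => {
  inFlight -= countAcks(output); // one ack per chunk the receiver got
  if (inFlight < WINDOW) src.resume();
});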

What went wrong: I couldn’t settle on the perfect window size. The window needs to be large enough that the network always has something to send. The window needs to be small enough that when a ‘ping’ gets in the back of the line, it can get to the front, get sent, and have a reply come back in time. And that all depends on the network connection, and we’re building on a library that’s not transparent about that stuff.

From backpressure to any-irrelevant-message

When I was experimenting with window sizes on the last prototype, I tried larger and larger ones. I even tried a size so large that the entire ‘long upload’ file could be sent without any waiting. It seemed like it ought to have died for the same starvation reason as before, but these transfers succeeded.

I found out that the client wasn’t waiting for a ‘pong’ message per se: “any packet counts” (engine.io-client/socket.js at 3.5.0 · socketio/engine.io-client · GitHub).

Idea 8: ...; dd count=1234 status=progress | ...

If all we needed to keep our connection from dying was for there to be some messages flowing the other direction, we could just use a ‘noisier’ off-the-shelf program in our pipeline. This one replaces head with dd in a mode where it shows a little progress line.
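Assembled with the earlier pieces, it might look like the sketch below; the elided parts of the command are my guess, borrowed from ideas 3 and 6.

// Sketch of idea 8: dd periodically prints a progress line to stderr, so
// something keeps flowing client-ward and the ping timer stays happy.
// (One caveat of mine: dd's count= is in blocks by default, so a real
// invocation needs block size arithmetic or GNU dd's iflag=count_bytes.)
socket.emit('input', 'stty raw; echo rea""dy; dd count=1234 status=progress | base64 -d >dst\n');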

What went wrong: Nothing. Well, except for the memory growing as the program reads the file faster than it can send it. Nothing new.

Giving up on the detail

I decided not to solve problem (5). I had been working extensively with prototypes for the upload direction, but it would be a problem in the download direction as well. The code on the Glitch project side for putting data into the socket (wetty/term.ts at 496db5e5632517052fb9abaeddd5ee769e77e296 · etamponi/wetty · GitHub) was also “simple,” in the same sense I used for my own code. It would take yet more complexity to run a custom program on the Glitch project side to watch for acknowledgements. Heck, maybe I wouldn’t even end up using this tool to transfer files that large in the first place. People have brought up how Glitch has this weird thing where a project gets more RAM than it gets disk anyway.

Pivoting on the big picture

There was this whole second part: how to use a data transport primitive to make a file transfer utility. I was tired from all this work on the transport. And this second part isn’t necessarily trivial either:

  • What if the user puts the name of a directory as the destination?
  • What if the user wants to copy a file to a different filename at the destination?
  • Should we support recursive copying?
  • Might people want a prompt for overwriting?
  • Should we offer to replicate the file permissions?

And so on. It seemed that having an 8-bit clean administrative connection was enough to cut a release. Maybe we could later use that connection to run rsync and have all those fiddly little questions above settled with “yes, and many more features too.” Maybe we could use it to run Visual Studio Code’s remote plugin. Lots of possibilities.

Introducing snail t pipe

In the end, I polished up the prototype to support full duplex communication and to encode each chunk separately so that chunks can be sent immediately. Instead of a user interface for copying files, it lets you run a command on the Glitch project. With all the problems we solved above, you’re free to use something as simple as cat >dst as that command and pipe data into the snail invocation. It provides separate stdout and stderr, and it also forwards the command’s exit status.
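For example, hypothetical invocations for the backup-and-restore use case from earlier (the exact CLI shape may differ; see the repository):

# upload: pipe local data into a command running on the project
snail t pipe 'cat >.data/db.bak' <db.bak
# download: capture a remote command's stdout locally
snail t pipe 'cat .data/db.bak' >db.bak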

See the resulting code in the wh0/snail-cli repository on GitHub.

I left a big note about the remaining problems (snail-cli/index.js at 6e6ca11a7605d0c15783fbc0ff6a2985967307d1 · wh0/snail-cli · GitHub). Notably, in the download direction, data piles up in the project container’s WeTTY process, which runs as root. Intriguing. That alone is why I posted this thread in the Feedback section, in case anyone was wondering.

Concluding remarks

I’ve subjected you to a rambling 3,785-word report just to get to this point: Dear Glitch, runaway printing in the terminal can elevate a root process’s memory usage, seemingly without bound. Please check that out.
