NixOS 23.11 packages

Project URL: https://tmpnix-gsdrv.glitch.me/built.html

NixOS 23.11 is out. While we’re not quite looking to put an entire operating system on Glitch, the release means that the Nix community has settled on a set of package versions that it’ll maintain for the next six months.

For now, click the project link to see the packages built for Glitch, from the “small” set. Highlights:

  • Nix 2.18.1
  • Node.js 18.18.2
  • Python 3.11.6
  • Emacs 28.2 still :cry:
  • OpenJDK 19.0.2
  • Ruby 3.1.4
  • PHP 8.2.13
  • nginx 1.24.0
  • Apache HTTP Server 2.4.58
  • PostgreSQL 15.5
  • MariaDB 10.11.6
  • GCC 12.3.0
  • Clang 16.0.6
  • rustc 1.73.0
  • glibc 2.38 :skull:

See here for a somewhat tolerable way to install those packages: NAR Flinger, a package installer in a single script

last thread: List of packages I've compiled for Glitch


In the following posts, I’ll walk through the horrible things encountered in getting this to build on Glitch. The good things too.


Would it be possible to build discourseAllPlugins so it would work on Glitch?

glibc

Somewhere between the glibc in earlier NixOS releases and the one in this release, logic was added that remembers which syscalls give ENOSYS so that later calls go straight to the fallback. Or at least there is such logic for clone3, the syscall I patched.

So I’ve had to rewrite the patch. It gives the EPERM we get custom behavior: use the fallback, but don’t go into never-try-again mode either. In retrospect, how did I decide that that’s how I want to do it? Something vaguely like “patches that could be added upstream are better, and a real EPERM on a non-Glitch system shouldn’t put the process into fallback-only mode.” But it’s not that meaningful; this project is full of patches that are bad outside of Glitch.

Oh, and for some reason there is now a second place in the glibc codebase that does this “try clone3 and see if it’s not supported” logic (spawni). I missed that the first time and had to recompile gcc an extra time :smiling_face_with_tear:
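
For the record, the cases are easy to tell apart from userspace. Here’s a minimal ctypes probe, assuming x86-64 (where clone3 is syscall number 435): a kernel that supports clone3 rejects the deliberately invalid NULL arguments with EINVAL, an old kernel says ENOSYS, and a seccomp policy that denies the syscall outright (which is what Glitch appears to do) says EPERM before the arguments are even looked at.

import ctypes
import errno

# Probe clone3 (x86-64 syscall 435) with NULL args: EINVAL means the
# kernel supports it, ENOSYS means it doesn't exist, EPERM means a
# seccomp filter denied it outright.
libc = ctypes.CDLL(None, use_errno=True)
ret = libc.syscall(435, None, 0)
print(ret, errno.errorcode.get(ctypes.get_errno()))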

libredirect

Libredirect, that thing that interposes on various filesystem-related libc functions to let you pretend files are in other places, has a few changes.

First, they’ve added a new test for NULL path arguments, which is good. But it was inserted right between two tests I had disabled for incompatibility (they use system, which doesn’t work), so I had to rewrite the patch that disables them.

Second, they’ve since removed the -ldl link flag, which could make it slightly more interoperable with older programs.

Before this removal, the libredirect.so file would get the new glibc libdl added to its rpath, which would cause problems when trying to load an old binary: the new libdl would be incompatible with the old libc in the process.

Earlier this year though, the team observed that the libcs they intend to package don’t really put anything in libdl (treewide: remove -ldl linker flags by alyssais · Pull Request #212275 · NixOS/nixpkgs · GitHub), so they removed that flag. Huh, that PR was from back in January. Maybe it was already in the last release, 23.05, and I didn’t look into it back then?

Unfortunately, the old glibc on Glitch still requires libdl to provide the dlopen symbol that libredirect uses. In many programs this can still work, as long as the original program requests libdl, which seems to be common. But Glitch’s /bin/sh does not request libdl, so dlopen won’t be found, so libredirect won’t load correctly in /bin/sh, and so it continues to be impossible to run Glitch’s /bin/sh (e.g. for the system function) with libredirect. Ugh.
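
Checking whether a binary requests libdl is easy enough; a quick sketch, assuming binutils’ objdump is available:

import subprocess

# Print the binary's direct (DT_NEEDED) dependencies. If libdl.so.2
# isn't listed, a preloaded libredirect.so has nowhere to resolve
# dlopen from on a glibc old enough to still keep dlopen in libdl.
out = subprocess.run(
    ["objdump", "-p", "/bin/sh"], capture_output=True, text=True, check=True
).stdout
print([line.split()[-1] for line in out.splitlines() if "NEEDED" in line])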

python psutil

Remember that thing about Python APIs? I had forgotten that the way I wrote that Nix “overlay” sets up the patch for one specific version of Python, 3.10 at the time. Now that the default Python 3 has been upgraded to 3.11, I had to update the overlay file. Am I overlooking a simpler, non-version-specific way to do this? Needs more research.

libuv

libuv, that async I/O library used in Node.js, added a test for thread affinity. Thread affinity is super weird on Glitch, as we saw earlier with the AWS SDK. I’ve added a patch to disable the thread affinity test.

I also looked into what the heck is so weird about thread affinity on Glitch.

import os

os.sched_getaffinity(os.getpid())
# {1, 2, 3}
os.sched_setaffinity(os.getpid(), [0])
# raises OSError: [Errno 22] Invalid argument

It seems there really are 4 logical CPUs, but we’re not allowed to use CPU 0. What’s the actual mechanism for this? Needs more research. It feels like I’ll find out with enough sifting through various cgroup options.

Anyway, that really throws off tests that reasonably figure “there must be at least one CPU, so let’s test on CPU 0.”

I guess we could make our patches smaller by having them test on CPU 1 instead of disabling these affinity tests completely. But I’d need a lot of motivation to make that change.
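
For what it’s worth, pinning to CPU 1 does work:

import os

# CPU 0 is off-limits in the container's cpuset, but CPU 1 is fine.
# (pid 0 means the calling process.)
os.sched_setaffinity(0, {1})
print(os.sched_getaffinity(0))  # {1}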


Alright, that’s enough writing for now. I just keep uncovering more and more places I need to do additional research. But let me make a list of remaining topics so I don’t forget to mention them later.

  • openssh
  • openexr :skull:
  • php pear installer thing
  • nix sandbox test
  • nix ‘installable’ derivation meaning

https://hydra.nixos.org/build/242650383#tabs-details

Closure size: 1577.11 MiB

good god, that’d take like 5 minutes just to download each time


update:

https://www.kernel.org/doc/Documentation/cgroup-v1/cpusets.txt

Cpusets provide a mechanism for assigning a set of CPUs and Memory
Nodes to a set of tasks.

That must be it.

The /proc/<pid>/status file for each task has four added lines,
displaying the task’s cpus_allowed (on which CPUs it may be scheduled)
and mems_allowed (on which Memory Nodes it may obtain memory)

$ cat /proc/$$/status
...
Cpus_allowed:   e
Cpus_allowed_list:      1-3

Yup. (0xe being 0b1110.)
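
Same decoding in Python, as a sanity check:

# Cpus_allowed is a hex bitmask: bit i set means CPU i may be used.
mask = 0xE
print([cpu for cpu in range(mask.bit_length()) if mask >> cpu & 1])
# [1, 2, 3]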


If it’d work, at least it’d be a modern version

A little more about overriding a Python package

No good findings.

And overriding python3 instead of python311 is not satisfactory: python3 is just an alias for python311 in nixpkgs, so overriding python3 only takes effect under that one name, while overriding python311 takes effect under both.

openssh

They added a bunch more references to /bin/sh in their tests. To recap, we can’t let the tests run /bin/sh, because the nixpkgs maintainers run the openssh regress tests with libredirect, and libredirect is incompatible with the Glitch container’s old native libc.

I’ve rewritten the overlay to replace /bin/sh with a Nix-provided shell more widely. Previously, it had only done shebangs.
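
As a standalone sketch of that wider replacement (hypothetical: the real overlay does this during the Nix build, and the shell path below is a placeholder):

import pathlib

# Replace every reference to /bin/sh in the regress tests, not just
# "#!/bin/sh" shebangs. NIX_SH is a placeholder for whatever
# Nix-provided shell the overlay actually points at.
NIX_SH = "/tmp/nix/store/...-bash/bin/sh"

for path in pathlib.Path("regress").rglob("*"):
    if not path.is_file():
        continue
    try:
        text = path.read_text()
    except (UnicodeDecodeError, OSError):
        continue
    if "/bin/sh" in text:
        path.write_text(text.replace("/bin/sh", NIX_SH))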

openexr

OpenEXR has a thread pool implementation. It has tests for that thread pool implementation. We run those tests when we build OpenEXR. That thread pool test tests resizing the thread pool. Resizing the thread pool starts new threads to make the thread pool bigger. To make the thread pool smaller, it joins all threads and starts new ones. The test randomly resizes the thread pool ten thousand times in a tight loop.

And that’s the backdrop for why last week I complained about that not working on Glitch: Joining threads leaves something behind?

Couldn’t figure out why. We’re joining the threads, right? We wouldn’t ever have more than 33 at a time. So why would we ever run out of cgroup tasks? Please post if you can figure this out :pray:
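
If anyone wants to poke at it, here’s a rough Python analogue of the churn the test does (an assumption on my part: that the leak isn’t specific to C++ threads):

import threading

# Repeatedly start and join batches of short-lived threads, mimicking
# OpenEXR's pool-resize test. Every thread is joined, so in theory
# nothing should accumulate; if the cgroup task leak applies here too,
# this eventually fails with "can't start new thread".
def noop():
    pass

for i in range(10_000):
    batch = [threading.Thread(target=noop) for _ in range(32)]
    for t in batch:
        t.start()
    for t in batch:
        t.join()
    if i % 500 == 0:
        print("iteration", i)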

So I worked around it by reducing the test’s iteration count to something much smaller.

PEAR installer for PHP

There’s a part of the normal PHP distribution called PEAR. I have not looked into what it is, but someone on the internet said (php: add optional `phpSrc` attribute by drupol · Pull Request #254556 · NixOS/nixpkgs · GitHub) that the normal way to build PHP from source is to obtain PEAR by downloading this “install-pear-nozlib.phar” file from a website.

By some luck, that website published a new version of the file, for the first time in two years, the same week I was trying to build PHP. Of course Nix does the right thing of making sure the file you get is the one the package definition expects, so it failed the build when it got this new, unexpected version.
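
Roughly, what that check boils down to (the URL and hash here are placeholders, not the real values from the package definition):

import hashlib
import urllib.request

# Fetch the file and refuse it unless it's byte-for-byte what the
# package definition pins: this is exactly what tripped when the
# site published a new version under the same URL.
url = "https://example.com/install-pear-nozlib.phar"  # placeholder
expected_sha256 = "0000..."  # placeholder for the pinned hash

data = urllib.request.urlopen(url).read()
actual = hashlib.sha256(data).hexdigest()
if actual != expected_sha256:
    raise RuntimeError(f"got {actual}, expected {expected_sha256}")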

As usual, the workaround for this was to find a copy of the old version and download it with nix-prefetch-url.

I helped submit a change to the package definition to use a URL pointing at the old copy: php: use a versioned url for install-pear-nozlib.phar by wh0 · Pull Request #271972 · NixOS/nixpkgs · GitHub

new Nix test needing sandboxing

There was a new test that involved (let me look this up: nix help-stores - Nix Reference Manual) a “chroot” store, which requires the same features that Nix’s sandboxing uses (not just chroot, actually, but unshare as well).

The developers are merciful though. Tests that need sandboxing are skipped when you run them on a system that doesn’t support it, as is the case on Glitch (no permission to use user namespaces).

A bug caused this new test not to be skipped completely on a system without sandbox support. I was able to patch this and submit the fix upstream: tests: avoid a chroot store without sandbox support by wh0 · Pull Request #9529 · NixOS/nix · GitHub

how to specify derivation outputs

The version of Nix newly distributed in NixOS has changed how part of the command line interface works. Telling Nix to copy around a derivation’s outputs now requires a ^* after the derivation path; otherwise, you’re now talking about the derivation file itself. So, for example, nix copy now wants something like /tmp/nix/store/…-foo.drv^*.

I rewrote my automatically-upload-what-I-build script to work with this :person_shrugging:


I could try

update: this thing wants me to build nodejs again, but “slim.” aaaaaaaaaaaaaaaaaaaaaaaaaaaa

update: it also wants me to build v8 again, separately. which after like 8 hours, just failed

[3252/3262] LINK default/mksnapshot
FAILED: default/mksnapshot 
python3 "../../build/toolchain/gcc_link_wrapper.py" --output="default/mksnapshot" -- g++ -pie -Wl,--fatal-warnings -Wl,--build-id -fPIC -Wl,-z,noexecstack -Wl,-z,relro -Wl,-z,now -m64 -rdynamic -Wl,-z,defs -Wl,--as-needed -pie -Wl,--disable-new-dtags -Wl,-O2 -Wl,--gc-sections -o "default/mksnapshot" -Wl,--start-group @"default/mksnapshot.rsp"  -Wl,--end-group  -latomic -ldl -lpthread -lrt
/tmp/nix/store/32mq38h73xs22ji5whndc9ixrrsy9zii-gcc-wrapper-12.3.0/bin/ld: line 269: 17658 Killed                  /tmp/nix/store/ffamw1h76yjx8bpr9drzjdqi4zzyhc8v-binutils-2.40/bin/ld ${extraBefore+"${extraBefore[@]}"} ${params+"${params[@]}"} ${extraAfter+"${extraAfter[@]}"}
collect2: error: ld returned 137 exit status
ninja: build stopped: subcommand failed.
error: builder for '/tmp/nix/store/rfhda09rnd9xpbmmv46h1rbl00n74kpw-v8-9.7.106.18.drv' failed with exit code 1

whaaat just happened

update: it actually used too much ram trying to link that mksnapshot program. and not just “over the limit glitch says”: if glitch were to kill your processes for that, it would kill them all at once, and there’d be no ninja build tool left to say “ah, this step failed.” it used so much ram that the container had to kill a process.

so now what? maybe build on a bigger server somewhere else?

update: trying on a machine with more RAM

[3252/3262] LINK default/mksnapshot
FAILED: default/mksnapshot 
python3 "../../build/toolchain/gcc_link_wrapper.py" --output="default/mksnapshot" -- g++ -pie -Wl,--fatal-warnings -Wl,--build-id -fPIC -Wl,-z,noexecstack -Wl,-z,relro -Wl,-z,now -m64 -rdynamic -Wl,-z,defs -Wl,--as-needed -pie -Wl,--disable-new-dtags -Wl,-O2 -Wl,--gc-sections -o "default/mksnapshot" -Wl,--start-group @"default/mksnapshot.rsp"  -Wl,--end-group  -latomic -ldl -lpthread -lrt
/tmp/nix/store/ffamw1h76yjx8bpr9drzjdqi4zzyhc8v-binutils-2.40/bin/ld: final link failed: No space left on device
collect2: error: ld returned 1 exit status
ninja: build stopped: subcommand failed.
note: keeping build directory '/tmp/nix-build-v8-9.7.106.18.drv-0'
error: builder for '/tmp/nix/store/rfhda09rnd9xpbmmv46h1rbl00n74kpw-v8-9.7.106.18.drv' failed with exit code 1

didn’t have enough disk :skull:

update: /tmp/nix/store/zw87cg9f61x9hk3lh8cqdgfl93mf7m69-discourse-3.2.0.beta1 here you go :person_shrugging:


Discourse actually worked? Wow.

far from it. we have the compiled package now, but I’ve never run it. and from what I’ve seen of your work on getting it to run on replit, it seems pretty gnarly. chmodding the nix store and editing the files :dizzy_face:

and as for compiling it, it wasn’t even possible on glitch. v8 needed more RAM than the container had. and the asset precompile step somehow froze, so I manually killed the postgresql and redis processes to unfreeze it.

anyway, want to try it out?


I’ll try it out.

I tried Ruby 3.1.4. It seems that I cannot use bundle exec:

Traceback (most recent call last):
        12: from /usr/lib/ruby/2.5.0/rubygems/core_ext/kernel_require.rb:59:in `require'
        11: from /usr/lib/ruby/2.5.0/rubygems/core_ext/kernel_require.rb:59:in `require'
        10: from /tmp/nix/store/i4gvrq9smsbz9c4pn1g61gzn4qx7hq1d-ruby-3.1.4/lib/ruby/3.1.0/bundler/setup.rb:10:in `<top (required)>'
         9: from /tmp/nix/store/i4gvrq9smsbz9c4pn1g61gzn4qx7hq1d-ruby-3.1.4/lib/ruby/3.1.0/bundler/ui/shell.rb:88:in `silence'
         8: from /tmp/nix/store/i4gvrq9smsbz9c4pn1g61gzn4qx7hq1d-ruby-3.1.4/lib/ruby/3.1.0/bundler/ui/shell.rb:136:in `with_level'
         7: from /tmp/nix/store/i4gvrq9smsbz9c4pn1g61gzn4qx7hq1d-ruby-3.1.4/lib/ruby/3.1.0/bundler/setup.rb:10:in `block in <top (required)>'
         6: from /tmp/nix/store/i4gvrq9smsbz9c4pn1g61gzn4qx7hq1d-ruby-3.1.4/lib/ruby/3.1.0/bundler.rb:161:in `setup'
         5: from /tmp/nix/store/i4gvrq9smsbz9c4pn1g61gzn4qx7hq1d-ruby-3.1.4/lib/ruby/3.1.0/bundler/runtime.rb:20:in `setup'
         4: from /tmp/nix/store/i4gvrq9smsbz9c4pn1g61gzn4qx7hq1d-ruby-3.1.4/lib/ruby/3.1.0/bundler/shared_helpers.rb:76:in `set_bundle_environment'
         3: from /tmp/nix/store/i4gvrq9smsbz9c4pn1g61gzn4qx7hq1d-ruby-3.1.4/lib/ruby/3.1.0/bundler/shared_helpers.rb:282:in `set_bundle_variables'
         2: from /tmp/nix/store/i4gvrq9smsbz9c4pn1g61gzn4qx7hq1d-ruby-3.1.4/lib/ruby/3.1.0/bundler/rubygems_integration.rb:178:in `bin_path'
         1: from /usr/lib/ruby/2.5.0/rubygems.rb:263:in `bin_path'
/usr/lib/ruby/2.5.0/rubygems.rb:284:in `find_spec_for_exe': Could not find 'bundler' (2.3.26) required by `$BUNDLER_VERSION`. (Gem::GemNotFoundException)
To update to the latest version installed on your system, run `bundle update --bundler`.
To install the missing version, run `gem install bundler:2.3.26`

Are files from the older Ruby being invoked instead of the newly installed one?
I checked which bundle and which ruby; they are both the newly installed ones.


Edit: seems that restarting the container solves the problem.

worked for me too