Kapsule v0.2.1: sponsored by my wife's horror movies
In my last post, I made a solemn vow to not touch Kapsule for a week. Focus on the day job. Be a responsible adult.
Success level: medium.
I did get significantly more day-job work done than the previous week, so partial credit there. But my wife's mother and sister are visiting from Japan, and they're really into horror movies. I am not. So while they were watching people get chased through dark corridors by things with too many teeth, I was in the other room hacking on container pipelines with zero guilt. Sometimes the stars just align.
Here's what came out of that guilt-free hack time.
Konsole integration: it's actually done
The two Konsole merge requests from the last post—!1178 (containers in the New Tab menu) and !1179 (container association with profiles)—are merged. They're in Konsole now. Shipped.
Building on that foundation, I've got two more MRs up:
!1182 adds the KapsuleDetector—the piece that actually wires Kapsule into Konsole's container framework. It uses libkapsule-qt to list containers over D-Bus, plus OSC 777 escape sequences for in-session detection, following the same pattern as the existing Toolbox and Distrobox detectors. It also handles default containers: even if you haven't created any containers yet, the distro-configured default shows up in the menu so you can get into a dev environment in one click.
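For a sense of what "list containers over D-Bus" looks like from the client side, here's a rough Python sketch. The bus, bus name, object path, interface, and method below are placeholders made up for illustration; the real detector goes through libkapsule-qt in C++, and the daemon's actual D-Bus names may differ.

```python
# Rough client-side sketch of listing containers over D-Bus. Every D-Bus name
# here is hypothetical; the real detector uses libkapsule-qt in C++.
import dbus

bus = dbus.SessionBus()  # assumption: the daemon sits on the session bus
daemon = bus.get_object("org.kde.kapsule", "/org/kde/kapsule")               # hypothetical
manager = dbus.Interface(daemon, dbus_interface="org.kde.kapsule.Manager")   # hypothetical

for name in manager.ListContainers():  # hypothetical method name
    print(f"New Tab menu entry: {name}")
```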
!1183 is a small quality-of-life addition: when containers are present, a Host section appears at the top of the container menu showing your machine's hostname. Click it, and you get a plain host terminal. This matters because once you set a container as your default, you need a way to get back to the host without going through settings. Obvious in hindsight.
The OSC 777 side of this lives in Kapsule itself—kapsule enter now emits container;push / container;pop escape sequences so Konsole knows when you've entered or left a container. This is how the tab title and container indicator stay in sync.
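If you're curious what those sequences look like on the wire, here's a minimal Python sketch of emitting them around a container session. It follows the Toolbox-style convention of container;push on entry and container;pop on exit; the exact arguments after push (and whether pop carries any) are assumptions, not Kapsule's literal output.

```python
# Minimal sketch of emitting OSC 777 container escape sequences. The exact
# push/pop payloads are assumptions based on the Toolbox-style convention.
import sys

def osc777(payload: str) -> None:
    # OSC 777, terminated with ST (ESC \)
    sys.stdout.write(f"\033]777;{payload}\033\\")
    sys.stdout.flush()

def enter(name: str) -> None:
    osc777(f"container;push;{name};kapsule")  # assumed argument order
    try:
        ...  # exec the container shell here
    finally:
        osc777("container;pop")  # assumed pop payload
```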
Four merge requests across two repos (Konsole and Kapsule) to get from "Konsole doesn't know Kapsule exists" to "your containers are in the New Tab menu and your terminal knows when you're inside one." Not bad for horror movie time.
Configurable host mounts: the trust dial is real
In the last post, I talked about making filesystem mounts configurable—turning the trust model into a dial rather than a switch. That's shipped now.
--no-mount-home does what it says: your home directory stays on the host and the container gets its own. --custom-mounts lets you selectively share specific directories. And --no-host-rootfs goes further, removing the full host filesystem mount entirely and providing only the targeted socket mounts needed for Wayland, audio, and display to work.
The use case I had in mind was sandboxing AI coding agents and other tools you don't fully trust with your home directory. But it's also useful for just keeping things clean—some containers don't need to see your host files at all.
Snap works now
Here's a screenshot of Firefox running in a Kapsule container on KDE Linux, installed via Snap:
I expected this one to be a multi-day ordeal. It wasn't.
Snap apps—like Firefox on Ubuntu—run in their own mount namespace, and snap-update-ns can't follow symlinks that point into /.kapsule/host/. So our Wayland, PipeWire, PulseAudio, and X11 socket symlinks were invisible to anything running under Snap, resulting in helpful errors like "Failed to connect to Wayland display."
The fix was straightforward: replace all those symlinks with bind mounts via nsenter. Bind mounts make the sockets appear as real files in the container's filesystem, so Snap's mount namespace setup handles them correctly. That was basically it.
While I was in there, I batched all the mount operations into a single nsenter call instead of running separate incus exec invocations per socket. That brought the mount setup from "noticeably slow" to "instant"—roughly 10-20x faster on a cold cache. And the mount state is now cached per container, so subsequent kapsule enter calls skip the work entirely.
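To make that concrete, here's a sketch of the single-nsenter approach, assuming the host rootfs is visible at /.kapsule/host/ inside the container; the socket list, the UID in the paths, and how the container's init PID is obtained are all illustrative assumptions rather than the daemon's real code.

```python
# Sketch of batching socket bind mounts into one nsenter call instead of one
# incus exec per socket. Socket paths, the UID, and the init PID lookup are
# assumptions for illustration.
import subprocess

SOCKETS = [
    "/run/user/1000/wayland-0",
    "/run/user/1000/pipewire-0",
    "/run/user/1000/pulse/native",
    "/tmp/.X11-unix/X0",
]

def bind_mount_sockets(init_pid: int) -> None:
    # Inside the container's mount namespace, bind each host socket (visible
    # under /.kapsule/host/) over a real file at the path apps expect.
    script = " && ".join(
        f"mkdir -p $(dirname {path}) && touch {path} && "
        f"mount --bind /.kapsule/host{path} {path}"
        for path in SOCKETS
    )
    subprocess.run(
        ["nsenter", "--target", str(init_pid), "--mount", "sh", "-c", script],
        check=True,
    )
```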
NVIDIA GPU support (experimental)
This one's interesting both technically and in terms of where it's going.
Kapsule containers are privileged by design—that's what lets us do nesting, host networking, and all the other things that make them feel like real development environments. The problem is that both upstream Incus and LXC refuse to apply their NVIDIA runtime integration to privileged containers. The upstream LXC hook expects user-namespace UID/GID remapping, and the default codepath wants to manage cgroups for device isolation. Neither applies to our containers.
So I wrote a custom LXC mount hook that runs nvidia-container-cli directly with --no-cgroups (privileged containers have unrestricted device access anyway) and --no-devbind (Incus's GPU device type already passes the device nodes through). This leaves nvidia-container-cli with exactly one job: bind-mount the host's NVIDIA userspace libraries into the container rootfs so CUDA, OpenGL, and Vulkan work without the container image shipping its own driver stack.
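Here's roughly what such a hook boils down to, written as a Python sketch rather than Kapsule's actual hook script. The --no-cgroups and --no-devbind flags are the two pieces described above; the capability flags (--utility, --compute, --graphics, --display) are my guess at a sensible set.

```python
#!/usr/bin/env python3
# Sketch of a custom LXC mount hook that injects the host's NVIDIA userspace
# libraries into a privileged container's rootfs. Not Kapsule's real hook;
# the capability flags below are assumptions.
import os
import subprocess

def main() -> None:
    # LXC exposes the path of the container's mounted rootfs to mount hooks.
    rootfs = os.environ["LXC_ROOTFS_MOUNT"]

    subprocess.run(
        [
            "nvidia-container-cli", "--load-kmods", "configure",
            "--no-cgroups",   # privileged containers already have full device access
            "--no-devbind",   # Incus's gpu device type passes the nodes through
            "--utility", "--compute", "--graphics", "--display",
            rootfs,
        ],
        check=True,
    )

if __name__ == "__main__":
    main()
```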
There's a catch, though. On Arch Linux, the injected NVIDIA libraries conflict with mesa packages. The container's package manager thinks mesa owns those files, and now there are mystery bind-mounts shadowing them. It works, but it's ugly and will cause problems during package updates. I hit this on Arch first, but I'd be surprised if other distros don't have the same issue—any distro where mesa owns those library paths is going to complain.
So NVIDIA support is disabled by default for now. The plan: build Kapsule-specific container images that ship stub packages for the conflicting files, and have images opt in to NVIDIA driver injection via metadata. Two independent flags control the behavior: --no-gpu disables GPU device passthrough entirely (passthrough itself is still on by default), and --nvidia-drivers enables the driver injection (off by default).
Architecture: pipelines all the way down
The biggest behind-the-scenes change in v0.2.1 is the complete restructuring of container creation. The old container_service.py was a 1,265-line monolith that did everything sequentially in one massive function. It's gone now.
In its place is a decorator-based pipeline system. Container creation is a series of composable steps, each a standalone async function that handles one concern:
Pre-creation: validate → parse image → build config → store options → build devices
Incus API call: create instance
Post-creation: host network fixups → file capabilities → session mode
User setup: mount home → create account → configure sudo → custom mounts → host dirs → enable linger → mark mapped
Each step is registered with an explicit order number, with gaps of 100 between numbers so inserting new functionality doesn't require renumbering everything. The decorator handles sorting by priority with stable tie-breaking, so import order doesn't matter.
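Here's a condensed sketch of the shape of that system; the step names and bodies are simplified placeholders, not the real daemon code.

```python
# Condensed sketch of a decorator-based creation pipeline: each step registers
# with an explicit order number, and execution sorts by (order, registration
# index) so ties resolve stably regardless of import order.
import asyncio

_STEPS = []

def creation_step(order: int):
    def decorator(func):
        _STEPS.append((order, len(_STEPS), func))  # registration index = stable tie-break
        return func
    return decorator

@creation_step(order=100)
async def validate(ctx):
    if not ctx["name"]:
        raise ValueError("container needs a name")

@creation_step(order=200)
async def parse_image(ctx):
    ctx.setdefault("image", "archlinux/current")  # placeholder default

@creation_step(order=300)
async def build_config(ctx):
    ctx["config"] = {"security.privileged": "true"}

async def create_container(ctx: dict) -> dict:
    for _, _, step in sorted(_STEPS, key=lambda s: (s[0], s[1])):
        await step(ctx)
    return ctx

print(asyncio.run(create_container({"name": "dev"})))
```

Because steps register at 100, 200, 300, a new step can slot in at, say, 150 without touching anything else.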
This pattern worked well enough that I plan to extend it to other large operations—delete, start, stop—as they accumulate their own pre/post logic.
On the same theme of "define it once, use it everywhere": container creation options are now defined in a single Python schema that serves as the source of truth for the daemon's validation, the D-Bus interface (which now uses a{sv} variant dicts, so adding an option never changes the method signature), and the C++ CLI's flag generation. Add a new option in Python, recompile the CLI, and you've got a --flag with help text and type validation. Zero manual C++ work.
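As a sketch of the idea (field names and shape are illustrative, not the actual schema), the option table might look something like this, with the daemon validating incoming a{sv}-style dicts against it and the CLI generator reading the same table:

```python
# Illustrative single-source-of-truth option schema. Names and structure are
# assumptions; only the option names themselves come from the real CLI flags.
from dataclasses import dataclass

@dataclass(frozen=True)
class Option:
    name: str       # key in the D-Bus a{sv} dict, also the CLI flag name
    type: type
    default: object
    help: str

CREATE_OPTIONS = [
    Option("no-mount-home", bool, False, "Keep the host home directory out of the container"),
    Option("no-host-rootfs", bool, False, "Skip the full host filesystem mount"),
    Option("custom-mounts", list, [], "Host directories to share with the container"),
]

def validate(options: dict) -> dict:
    """Check an incoming options dict against the schema and apply defaults."""
    schema = {o.name: o for o in CREATE_OPTIONS}
    for key, value in options.items():
        if key not in schema:
            raise ValueError(f"unknown option: {key}")
        if not isinstance(value, schema[key].type):
            raise TypeError(f"{key} expects {schema[key].type.__name__}")
    return {o.name: options.get(o.name, o.default) for o in CREATE_OPTIONS}
```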
The long-term plan is to use this same schema to dynamically generate the graphical UI in a future KCM. Define the option once, get the CLI flag, the D-Bus parameter, the daemon validation, and the Settings page widget—all from the same schema.
First external contributor
Marie Ramlow (@meowrie) submitted a fix for PATH handling on NixOS—the first external contribution to Kapsule. I don't have a NixOS setup to test it on, so this one's on trust. That's open source for you: someone shows up, fixes a problem you can't even reproduce, and you merge it with gratitude and a prayer.
Testing
The integration test suite grew substantially. New tests cover host mount modes, custom mount options, OSC 777 escape sequence emission, and socket passthrough. The test runner now does two full passes—once with the default full-rootfs mount and once with --no-host-rootfs—to verify both configurations work.
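The two-pass runner amounts to something like this sketch; the test command and the environment variable used to pass create flags through are placeholders, not the real runner.

```python
# Sketch of running the integration suite twice: once with the default
# full-rootfs mount, once with --no-host-rootfs. Command and env var are
# hypothetical.
import os
import subprocess

for create_flags in ([], ["--no-host-rootfs"]):
    env = dict(os.environ, KAPSULE_TEST_CREATE_FLAGS=" ".join(create_flags))
    subprocess.run(["pytest", "tests/integration"], env=env, check=True)
```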
Bugs caught during testing that would have been embarrassing in production: a race condition in the Incus client where sequential device additions could clobber each other (the client wasn't waiting for PUT operations to complete), and Alpine containers failing because they don't ship /etc/sudoers.d by default.
CI/CD: of all the things to break
I finally built out the CI/CD pipelines. They use the same kde-linux-builder image that builds KDE Linux itself—mainly because it's one of the few CI images with sudo access enabled, which we need for Incus operations.
The good news: the pipeline successfully builds the entire project, packages it into a sysext, deploys it to a VM, and runs the integration tests. That whole chain works. I was pretty pleased with myself for about ten minutes.
The bad news: when the first test tries to actually create a container, the entire CI job dies. Not "the test fails." Not "the runner reports an error." The whole thing just... stops. No exit code, no error message, no logs after that point. Nothing.
I'm fairly sure it's causing a kernel panic in the CI runner's VM. Which is, you know, not great.
Debugging this has been miserable. I can't get any logs after the panic because there are no logs—the kernel is gone. I tried adding debug prints before each step in the container creation pipeline to isolate exactly where it dies. The prints don't come through either, probably because of output buffering, or maybe the runner agent doesn't get a chance to stream the output to GitLab before the entire VM goes down.
The weird part: it's not a nested virtualization issue. Regular Incus works fine on the same runner—you can create containers interactively, no problem. And it doesn't reproduce on KDE Linux at all. Something about the specific combination of the CI environment and Kapsule's container creation path is triggering it, and I have no way to see what.
I've shelved this for now. The pipeline is there, the build and deploy stages work, and the tests would work if the runner didn't kernel panic when Kapsule tries to create a container. If anyone reading this has ideas, I'm all ears.
What's next: custom container images
The biggest item on my plate is custom container images. Right now, Kapsule uses stock distribution images from the Incus image server. They work, but they're not optimized for our use case—things like the NVIDIA stub packages I mentioned above need to live somewhere, and "just install them at container creation time" adds latency and fragility.
Incus uses distrobuilder for image creation, so the plan is straightforward: image definitions live in a directory in the Kapsule repo, a CI pipeline invokes distrobuilder to build them, and the images get published to a server.
The "published to a server" part is where it gets political. I talked to Ben Cooksley about hosting Kapsule images on KDE infrastructure, and he's—understandably—not yet convinced that Kapsule needs its own image server. It's a fair pushback. This is all still experimental, and spinning up image hosting infrastructure for a project that might change direction is a reasonable thing to be cautious about.
So for now, I'll host the images on my own server. They probably won't be the default, since the server is in the US and download speeds won't be great for everyone. But they'll be available for testing and for anyone who wants the NVIDIA integration or other Kapsule-specific tweaks. I'll bug Ben again when the image story is more fleshed out and there's a clearer case for why KDE infrastructure should host them.
Beyond that: get the Konsole MRs (!1182 and !1183) reviewed and merged, and figure out why CI kills the kernel. The usual.






