Loose ideas for operating systems

This post has been copy-edited by doppler. Thanks!

Most research nerds either start writing Unix hagiographies or start stapling a 99-point thesis to the doors of Murray Hill. This is the latter kind of post; I’ll try to cover ideas for systems that could be meaningfully different from current systems. I’ve done a lot of research on existing concepts and existing systems, particularly those that could have been the future. Existing systems can be extrapolated into something new.

A lot of the ideas have been percolating in my head for a while now and are rough ideas for what could be. Perhaps I’ll iterate on them further, or realize there’s a reason no one was doing these before. The main idea is a place to start off, and it iterates from there. Treat it like a buffet of ideas; caveat emptor for people who don’t like musing.

Erlang as OS

Erlang is a very interesting programming language, with its focus on resilient distributed systems and its use of functional programming and message passing to achieve them. The Erlang VM itself looks like an operating system, with its own radically different notions of processes, IPC, and concurrency. OTP as a standard library provides a lot of services one would see as outside of the domain of applications, like SSH and SNMP servers. With all this operating system on top of an operating system, why run your operating system on top of POSIX’s different semantics?

For those not familiar, in the Erlang world, processes are designed with the assumption that problems can occur, and thus a process must be able to die without taking out the system. This means tolerating failure by isolating it. Processes are supervised (another process watches over them), and can establish links for the purpose of notification and restarting. Supervising can be done recursively too, as has been suggested for Unix environments. Applications themselves are composed of several processes; in other environments, we might model these as threads (which lack many safety properties) or containers (which can be quite a heavyweight way to model the world, especially with k8s involved).
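To make the supervision idea concrete, here is a minimal sketch in Python rather than Erlang. The `Supervisor` class, its `max_restarts` parameter, and `flaky_worker` are all hypothetical names made up for illustration; this is not OTP's supervisor API, and plain function calls stand in for real isolated processes.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Supervisor:
    """Toy stand-in for an Erlang supervisor (hypothetical, not OTP's API)."""
    max_restarts: int = 3
    log: List[str] = field(default_factory=list)

    def run(self, name: str, worker: Callable[[int], str]) -> str:
        # Each attempt is a fresh "process": nothing survives a crashed run.
        for attempt in range(self.max_restarts + 1):
            try:
                result = worker(attempt)
                self.log.append(f"{name}: exited normally on attempt {attempt}")
                return result
            except Exception as exc:
                # The crash is contained here instead of taking the system down.
                self.log.append(f"{name}: crashed on attempt {attempt} ({exc})")
        # Too many restarts: give up and let whoever supervises us decide.
        raise RuntimeError(f"{name}: restart limit reached")


def flaky_worker(attempt: int) -> str:
    # Hypothetical worker that fails twice, then succeeds.
    if attempt < 2:
        raise ValueError("transient fault")
    return "done"


sup = Supervisor(max_restarts=3)
outcome = sup.run("flaky", flaky_worker)
```

Raising once the restart limit is hit is what makes recursive supervision possible: a supervisor that gives up becomes a failure for the supervisor above it to handle.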

Processes become more interesting with Erlang’s model of IPC: processes may be sent messages, receiving them through an inbox. This has benefits for concurrency; there’s no shared mutable state, which simplifies a lot of issues with multiple processes needing to work on the same data. Receiving messages is also an extremely common pattern familiar to people who know state machines well. It’s no surprise that GUI programming is often based around this model – on Windows, each GUI thread has a message queue, and a loop pulls messages off it and dispatches them to the window they’re addressed to; in early versions, that loop was also the basis of multitasking.

A common idiom in the Erlang world is to hold state through recursion. Processes in a receive loop simply call the loop function again with the state they carry, since state isn’t implicit; it must be managed explicitly, as in any good functional program. All of this is starting to sound familiar – processes/actors holding state, message passing between processes/actors… this is actual object-oriented programming, not the inheritance you were told OOP was! It’s a shame so many object-oriented languages essentially model distributed systems without the distributed.
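The inbox and state-through-recursion idioms can be sketched roughly as follows, again in Python as a stand-in. The names `counter_loop` and `spawn` and the tuple message shapes are assumptions made up for this example; a thread with a `queue.Queue` approximates an Erlang process and its mailbox, without the isolation guarantees the real thing has.

```python
import queue
import threading


def counter_loop(inbox: queue.Queue, count: int) -> None:
    """Receive one message, then recurse with the new state."""
    msg = inbox.get()  # block until a message arrives
    if msg[0] == "add":
        _, amount = msg
        return counter_loop(inbox, count + amount)  # new state, no mutation
    if msg[0] == "get":
        _, reply_to = msg
        reply_to.put(count)  # reply with a message, not shared memory
        return counter_loop(inbox, count)
    if msg[0] == "stop":
        return None
    return counter_loop(inbox, count)  # ignore unknown messages


def spawn(loop, *args) -> queue.Queue:
    """Start `loop` in its own thread and return its inbox."""
    inbox = queue.Queue()
    threading.Thread(target=loop, args=(inbox, *args), daemon=True).start()
    return inbox


counter = spawn(counter_loop, 0)
counter.put(("add", 2))
counter.put(("add", 3))
reply = queue.Queue()
counter.put(("get", reply))
total = reply.get(timeout=1)
counter.put(("stop",))
```

One caveat: Erlang eliminates tail calls, so a loop like this can run forever there. Python does not, so this sketch would hit the recursion limit after enough messages; it only illustrates the shape of the idiom.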

Such an environment would lessen the need for things like Kubernetes, because the environment naturally allows for handling partial, rolling upgrades across cluster nodes. (As an example, check out how ferd managed Erlang application updates.) So much of DevOps could be simplified – no more container management morass, or having to manage another operating system below BEAM (or below the container). (The boundaries of how applications are deployed become a question in this case – do you run all applications under a single instance per machine, VMs as units of applications, or something in between?)

Without an OS below our Erlang environment, what could interaction be like? While the operating system being proposed might not be intended for interactive use, a good interactive environment can do wonders for debugging and iterative configuration. For example, Cisco IOS has a command line designed for configuring network equipment (that is, CLI commands and config syntax are mostly the same). Telecom exchanges, Erlang’s home, were moving towards interactivity in the late ’70s. Eshell might provide a foundation. It’s a REPL that can operate in your deployed application – in fact, the (previously mentioned) SSH server can drop you into one! Because the REPL and application are in the same environment, this lets us do ad-hoc component testing and have the same rich data types (i.e., not just strings) for easier handling of data.

If we’re in an environment with only managed code, perhaps we could also revisit some other ideas that could help with performance, reliability, and safety. (They are often generally applicable techniques, but we have an opportunity here to include them if they help.) For example, if safety properties are enforced by a runtime, we could omit a separate address space for processes. This would save us the cost of task-switch detritus like TLB and cache flushes, a penalty that keeps increasing due to vulnerabilities like Spectre. We could employ single-level storage to effectively treat disk as memory, and memory as its cache. This would have the benefit of unifying the notions of storage – instead of a filesystem, we could simply persist objects in memory. Then we could go further – it might even be possible to have pausable/restartable processes – maybe even if the machine is unplugged! The Erlang process model could provide an interesting framework to implement such an idea.
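As a very loose illustration of persisting objects instead of files, here is a sketch using Python's standard `shelve` module. This is only an approximation: real single-level storage would be transparent and handled by the system, whereas here the program saves and reloads state by hand, and the key name `counter` and the notion of a "power cycle" are assumptions of the example.

```python
import os
import shelve
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "heap")  # hypothetical backing store

    # First "boot": a process leaves its state behind as an object.
    with shelve.open(path) as heap:
        heap["counter"] = {"count": 41}

    # Simulated power cycle: everything in memory is gone at this point.

    # Second "boot": the process picks up from its persisted state.
    with shelve.open(path) as heap:
        state = heap["counter"]
        state["count"] += 1
        heap["counter"] = state  # write the updated object back
        restored = heap["counter"]["count"]
```

Combined with the explicit state passing of a receive loop, the state handed to the next iteration is exactly what would need to be persisted for a process to resume later.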

Hypervisor based systems

Although virtualization got its start on the mainframe and has spread far in the modern day, there haven’t been OSes really designed around virtualization since IBM VM. I can already hear the audience at home going “what about ESXi?”, but that’s still mostly just siloed-off full-fat VMs with a thinner kernel and fancier management. What could be possible is lightweight VMs that work together.

In IBM VM, while virtualization of other operating systems was a major application, one of the others was lightweight VMs that gave you an entire virtualized machine to play around with. Instead of sharing a single OS instance, everyone logged in could get their own VM running CMS, a lightweight operating system that inspired CP/M and its successors. Of course, there is IPC between VMs. Services, be they for other VMs or on the network (even the TCP/IP stack), are implemented by, you guessed it, VMs. It’s no surprise esr noticed the odd similarities between VM and Unix.

(Sidebar: It’s worth noting how little difference there is between a hypervisor and a microkernel in this regard. What is a process at a low level but an address space? Virtualization was intended to simply run operating systems as programs, trapping or specially handling what the operating system does when it assumes it has control. When you get things like User-mode Linux (UML) involved, the lines are even blurrier.)

In a modern implementation of this idea, we already have a lightweight operating system that can be used interactively while providing access to the whole virtual machine – it’s called the EFI shell. With the software industry focusing on efficient virtualization, we can provide paravirtual devices that’d be both easy to work directly against and already supported by operating systems – virtio is a good choice here.

Having a thin hypervisor-based OS also encourages the usage of “specialty” or even niche/hobbyist OSes; the obvious example being the rise of unikernels. If they only need to support paravirtual devices to run on real hardware, courtesy of a thin hypervisor, it becomes much easier to develop one without having to write several different drivers for diverse hardware.

Desktop virtualization is often just an afterthought, usually just for testing labs or running legacy systems. An interesting trend is running multiple operating systems closer to equals. This is either for security, like Qubes OS does for extreme isolation, or for providing significant hardware resources like a GPU to a guest, often for gaming (the typical example being a Linux host with VFIO for a Windows guest). While Qubes tries its best to make the process transparent for users, the case of VFIO on Linux rarely is. It’s often involved, because Linux needs to be kept away from the resources being shared, and you effectively have two computers to deal with from an I/O perspective. What if these two systems could share the I/O they have without excessive manual intervention?

Yet even more ideas

That’s all I have for now. Let me know if you want a clarification about something. Stay tuned for when I get even more ideas (or just further development of what I already have)….
