Web Terminal Project Writeup

Intro

Webterm is a microservice-style application that provides a web interface to access Linux containers hosted in a Kubernetes cluster.

Technologies involved are Go, Kubernetes, ArgoCD, Docker, Helm, Node.js, and JavaScript.

Technical Overview

Webterm has three deployable services: webserver, pseudo-terminal-manager, and pseudo-terminal
A Go program handles cluster orchestration and terminal allocation, and Node.js serves as the terminal backend
The website hosts a static bundle serving a terminal UI based on xterm.js
Deployment uses a Helm chart, with image tags updated by the GitHub Actions workflow in pipeline.yml
- (See shellbin for an explanation of a similar pipeline.)
Terminal pods use node-pty to bind a real shell process to a browser socket connection; keystrokes and shell output are sent over that connection

Source Code

The source for Webterm is available here.

Microservices Overview

webserver
- Serves the static frontend bundle (e.g., index.js, index.html)
- Exposes the /getPseudoTerminalAddress API used by the browser
- Rewrites the returned terminal address into something the browser can dial
pseudo-terminal
- Represents a user-facing terminal pod
- Runs node-pty within Node.js to communicate terminal data with the frontend’s xterm.js
- Forwards browser input to the shell and streams shell output back to the browser
- Asks the manager to kill the pod after idle timeout
pseudo-terminal-manager
- Talks to the Kubernetes API from inside the cluster
- Tracks which terminal pods are free, in use, or are being recreated
- Automatically exposes each pseudoterminal through a NodePort and provides pseudoterminal addresses to the browser
- Creates additional pseudoterminals by scaling a StatefulSet
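
The writeup does not spell out how webserver rewrites the terminal address, so the following is only a minimal sketch under one assumption: the manager returns a cluster-internal host:port whose port is the NodePort, and the browser needs the same port on a publicly reachable host. The function name rewriteAddress and both example addresses are hypothetical.

```go
package main

import (
	"fmt"
	"net"
)

// rewriteAddress keeps the port from the manager's address (assumed here to be
// the NodePort) and swaps the host for one the browser can reach.
// This is an illustrative helper, not the project's actual code.
func rewriteAddress(managerAddr, publicHost string) (string, error) {
	_, port, err := net.SplitHostPort(managerAddr)
	if err != nil {
		return "", fmt.Errorf("unexpected address %q: %w", managerAddr, err)
	}
	return net.JoinHostPort(publicHost, port), nil
}

func main() {
	// Both values below are made-up examples.
	addr, err := rewriteAddress("10.42.0.17:30007", "terminal.example.com")
	fmt.Println(addr, err) // prints "terminal.example.com:30007 <nil>"
}
```

Keeping the port and replacing only the host fits the per-pod NodePort setup described above, but the real rewrite may do more (for example, choosing a scheme or path).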

General Connection Flow

The backend needs to provide each user with their own pseudoterminal pod.

Here’s a rough overview of this process:

The browser loads the frontend from webserver
The frontend calls webserver for a terminal address
webserver forwards the request to pseudo-terminal-manager
The manager first checks whether that client already owns a pod
If yes, the existing pod is returned and the session is treated as a reconnect
If no, the manager allocates a spare pod that is already in the ready state
If that was the last spare pod, pseudo-terminal-manager scales the StatefulSet up so a new spare is created
The browser receives the address of a pseudo-terminal
The browser then opens a socket connection directly to the assigned terminal pod
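
The allocation steps above can be sketched as a small in-memory model. This is not the manager's real code: the pool type, the state names, the string client ID, and the scaleUp callback (standing in for the StatefulSet scale call) are all assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// podState is a hypothetical model of the states a terminal pod can be in.
type podState int

const (
	stateReady    podState = iota // spare pod, nobody attached
	stateInUse                    // owned by a client
	stateNotReady                 // being created or recreated, not assignable
)

var errNoSpare = errors.New("no spare pod ready")

// pool is an illustrative in-memory view of the terminal pods.
type pool struct {
	states  []podState         // index = StatefulSet ordinal
	owners  map[string]int     // client ID -> pod ordinal
	scaleUp func(replicas int) // stand-in for scaling the StatefulSet
}

// assign returns the pod ordinal for a client and whether it is a reconnect.
func (p *pool) assign(client string) (ordinal int, reconnect bool, err error) {
	if i, ok := p.owners[client]; ok {
		return i, true, nil // client already owns a pod
	}
	for i, s := range p.states {
		if s != stateReady {
			continue
		}
		p.states[i] = stateInUse
		p.owners[client] = i
		if !p.hasSpare() {
			// The last spare was just handed out: request one more replica.
			p.states = append(p.states, stateNotReady)
			p.scaleUp(len(p.states))
		}
		return i, false, nil
	}
	return 0, false, errNoSpare
}

// hasSpare reports whether any pod is still ready and unowned.
func (p *pool) hasSpare() bool {
	for _, s := range p.states {
		if s == stateReady {
			return true
		}
	}
	return false
}

func main() {
	var scaledTo []int
	p := &pool{
		states:  []podState{stateReady, stateReady},
		owners:  map[string]int{},
		scaleUp: func(n int) { scaledTo = append(scaledTo, n) },
	}
	a, _, _ := p.assign("client-a")
	b, re, _ := p.assign("client-a") // same client again: reconnect
	c, _, _ := p.assign("client-b")  // takes the last spare, triggers scale-up
	fmt.Println(a, b, re, c, scaledTo) // prints "0 0 true 1 [3]"
}
```

In the real system the new replica only becomes assignable once Kubernetes reports it as running, which is what the watch filter described later is for.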

What Are Pseudoterminals For?

We’re making a web terminal, but unfortunately the browser cannot run Linux or host a real shell process by itself.

The browser can only render a terminal-like interface and send user input over the network. The actual shell has to run somewhere else, in this case inside a container in the cluster.

This process is enabled by pseudoterminals.

Basically, a pseudoterminal is an OS primitive: a pair of virtual devices that gives a process something that behaves like a real terminal, while another program controls the other end. Pseudoterminals matter because all modern terminal emulators, even local ones, are built on them. You can think of a pseudoterminal as exchanging keyboard input for the terminal process's output.

For a web-based terminal, instead of handling input and output locally, the browser sends user input over the network to a backend container running a pseudoterminal, and the resulting output is streamed back to the browser.

Here’s a quick rundown of how Webterm uses pseudoterminals:

The xterm.js frontend handles the terminal UI in the browser, but it is not running the shell
When a user connects, the frontend gets assigned a specific terminal pod and opens a socket connection to it
Inside that pod, node-pty creates a pseudoterminal and starts the real shell process
Keystrokes from the browser are sent over the socket into that pseudoterminal
Output from the pseudoterminal is sent back over the socket and rendered by xterm.js
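
The data flow in the steps above boils down to a two-way byte relay between the browser socket and the pseudoterminal. The sketch below shows only that relay, written in Go for consistency with the manager code; the project itself does this with node-pty in Node.js. No real pty is opened here: in-memory pipes from net.Pipe stand in for both the socket and the pty pair, and fakeShell, relay, and roundTrip are hypothetical names.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"net"
	"strings"
)

// relay copies bytes in both directions between the browser connection and
// the pseudoterminal until either side closes.
func relay(browser, pty io.ReadWriteCloser) {
	done := make(chan struct{}, 2)
	go func() { io.Copy(pty, browser); done <- struct{}{} }() // keystrokes -> shell
	go func() { io.Copy(browser, pty); done <- struct{}{} }() // shell output -> browser
	<-done
	browser.Close() // closing both ends unblocks the other copy
	pty.Close()
	<-done
}

// fakeShell stands in for the shell behind a pty: it answers each input line.
func fakeShell(tty io.ReadWriter) {
	sc := bufio.NewScanner(tty)
	for sc.Scan() {
		fmt.Fprintf(tty, "you typed: %s\n", sc.Text())
	}
}

// roundTrip sends one line as the "browser" and returns the shell's reply.
func roundTrip(line string) (string, error) {
	browserSide, serverSide := net.Pipe() // stands in for the browser socket
	ptyMaster, ptySlave := net.Pipe()     // stands in for the pty pair
	go fakeShell(ptySlave)
	go relay(serverSide, ptyMaster)
	defer browserSide.Close()

	if _, err := io.WriteString(browserSide, line+"\n"); err != nil {
		return "", err
	}
	reply, err := bufio.NewReader(browserSide).ReadString('\n')
	return strings.TrimSuffix(reply, "\n"), err
}

func main() {
	reply, err := roundTrip("ls")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(reply) // prints "you typed: ls"
}
```

A real pty adds behavior the pipes do not model, such as echo, line editing, window size, and signal delivery, which is why terminal backends use a pty library rather than plain pipes.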

See containerPseudoTerminal.js for the actual pseudoterminal server running in terminal client containers.

Filtering a Kubernetes Watch to Manage Pods

Kubernetes exposes a watch mechanism, which is basically a way to subscribe to changes in cluster objects without constantly polling the API.

pseudo-terminal-manager uses this to notice when a terminal pod changes state, especially when a pod is being recreated or when a newly scaled pod becomes ready.

Here’s some context for the code below:

The manager instantiates a watch on pods in the webterm namespace, so it receives a stream of events whenever those pods change state or are updated.
That stream contains events about all pods in the webterm namespace.
The filter takes that single stream of many events and derives a custom, fine-grained watch per pod, one that detects the event pattern indicating the pod has become ready.
Several pods may need their own isolated, filtered event stream at the same time, and each filter must attach to an API watch stream that is already running, hence the need for a concurrent filter system.

filter.go

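// The 'brain' of the watch filter system: one loop handles filter
// registration, filter removal, and event fan-out.
for {
    select {
    case paramToAppend := <-fil.paramStream:
        // Register a new per-pod filter.
        fil.params = append(fil.params, &paramToAppend)
    case indexToRemove := <-fil.remIndexChan:
        // A consumer found its pattern and no longer needs events.
        fil.params = remove(fil.params, indexToRemove)
        if len(fil.params) == 0 {
            // The last consumer is gone, so shut the filter down. Checking here
            // instead of in a default case lets the select block rather than spin.
            close(fil.done)
            runningFilter = nil
            return
        }
    case event := <-fil.inChan:
        // Fan each watch event out to every filter whose predicate accepts it.
        for _, fp := range fil.params {
            if fp.pass(event, fil.done) {
                fp.outChan <- event
            }
        }
    case <-fil.done:
        return
    }
}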

apiEndpoints.go

// waitPatternPendingRunning is a consumer of an isolated watch event stream that is produced by
// filter.go. waitPatternPendingRunning helps us connect to pods as soon as they're available, and
// avoids a race condition where separate users can connect to the same pod at once.
func waitPatternPendingRunning(fp *filterParam, wg *sync.WaitGroup) {
	var lastPhase v1.PodPhase
	for event := range fp.outChan {
		// Skip anything that is not a Pod (for example, a watch error status)
		// instead of dereferencing a nil pointer.
		pod, ok := event.Object.(*v1.Pod)
		if !ok {
			continue
		}

		currentPhase := pod.Status.Phase
		if lastPhase == v1.PodPending && currentPhase == v1.PodRunning {
			fmt.Println("pattern found")
			// Deregister this filter, then release the waiting caller.
			runningFilter.remIndexChan <- runningFilter.getFpIndex(fp)
			wg.Done()
			return
		}
		lastPhase = currentPhase
	}
}

This code works well enough for this project, but in the future I will probably use Kubebuilder for interacting with the Kubernetes API.
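
To try the Pending-to-Running pattern from waitPatternPendingRunning without a cluster, here is a self-contained sketch. It does not use client-go: podEvent is a simplified, hypothetical stand-in for a watch event, the pod names are made up, and filterPod, sawPendingThenRunning, and detect are illustrative helpers rather than functions from the project.

```go
package main

import "fmt"

// podEvent is a simplified, hypothetical stand-in for a Kubernetes watch event.
type podEvent struct {
	Pod   string
	Phase string
}

// filterPod forwards only the events for one pod onto its own channel.
// Closing done lets the goroutine exit even if nobody reads out anymore.
func filterPod(done <-chan struct{}, in <-chan podEvent, name string) <-chan podEvent {
	out := make(chan podEvent)
	go func() {
		defer close(out)
		for ev := range in {
			if ev.Pod != name {
				continue
			}
			select {
			case out <- ev:
			case <-done:
				return
			}
		}
	}()
	return out
}

// sawPendingThenRunning reports whether a Pending event is directly followed
// by a Running event on the given (already filtered) stream.
func sawPendingThenRunning(events <-chan podEvent) bool {
	last := ""
	for ev := range events {
		if last == "Pending" && ev.Phase == "Running" {
			return true
		}
		last = ev.Phase
	}
	return false
}

// detect replays a fixed list of events and checks the pattern for one pod.
func detect(name string, evs []podEvent) bool {
	in := make(chan podEvent, len(evs))
	for _, ev := range evs {
		in <- ev
	}
	close(in)
	done := make(chan struct{})
	defer close(done)
	return sawPendingThenRunning(filterPod(done, in, name))
}

func main() {
	// Events for two pods arrive interleaved on one stream, as with a namespace watch.
	evs := []podEvent{
		{"pseudo-terminal-0", "Running"},
		{"pseudo-terminal-1", "Pending"},
		{"pseudo-terminal-0", "Running"},
		{"pseudo-terminal-1", "Running"},
	}
	fmt.Println(detect("pseudo-terminal-1", evs)) // prints "true"
	fmt.Println(detect("pseudo-terminal-0", evs)) // prints "false"
}
```

The interleaved pseudo-terminal-0 event sits between the Pending and Running events of pseudo-terminal-1, so the pattern is only visible after per-pod filtering, which is the reason the filter stage exists.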

Closing

Webterm combines a few different ideas into one project: browser-side terminal rendering, pseudoterminals, container networking, and direct use of the Kubernetes API.

This writeup skips many implementation details, like the exact Helm setup, the exact UI-to-terminal connection logic, the GitHub Actions pipeline, the per-pod NodePort Services, and the removal system for stale terminal pods. But the main idea is still simple: the browser asks for a terminal, the manager finds or creates one, and the frontend then talks to a shell running inside a container.

Future plans

Learn how to security-harden Kubernetes pods and implement abuse prevention so that the project may be deployed publicly
Rewrite ugly logic using standard tools within the Kubernetes community

Demo Video

Note: in this demo, each browser tab is treated as a distinct IP, so each tab gets a fresh terminal. The software can be configured so that only one terminal backend is allocated per real IP, and new tabs would all lead to the same terminal.