Catbird: An experimental game engine for Scheme programmers

October 25, 2022

I've been participating in the Lisp Game Jam for several years now (the next one starts on 10/28), and with each attempt I've accumulated more code that forms something resembling a game engine. I'm now attempting to solidify some of the concepts I've been exploring in order to make a game engine worth releasing, in case someone else wants to use it, too. I've named this engine Catbird, continuing the bird theme established by my game programming library, Chickadee, which the engine is built on.

Game engines are opinionated machines, and no single engine is suitable for every type of game or developer. So, the first step in designing an engine is to determine what kind of games to target and who might want to use it. I have decided to target small-scale 2D games (though I've left the door open for adding 3D support later) made by developers who love Emacs, love Scheme, and for whom having a pleasant, REPL-driven workflow is more important than raw performance. In other words, I've designed it for myself, but I know there's a small community out there that shares my preferences. It takes a billion dollars and a ton of developers to make the Unreal Engine, but a small, niche engine can be made by a hobbyist in their spare time.

These are my design goals in more detail:

REPL-driven development model AKA live coding. This means that everything needs to be modifiable at runtime.
Good enough performance for solo developers and small indie teams making games on a small scale. It's okay to use a language implementation with garbage collection!
Game objects are composable. Simple objects can be combined to form complex ones.
Game objects can run asynchronous scripts.
A state machine abstraction breaks down games into small, manageable chunks.
A well-defined user input layer cleanly separates game state modifications from the input devices and buttons that trigger them.
Linux and macOS as the initial target platforms. Windows and Android would be nice future additions.

I've taken design inspiration from several places:

Godot, a project that has shown that a FOSS engine can compete with the likes of Unity. If I didn't have such Lispy tendencies and didn't enjoy implementing engine stuff so much, I'd just use Godot. I like their take on the scene node abstraction.
Emacs, the extensible, self-documenting text editor and the greatest developer tool ever created. Modifying Emacs at runtime to suit your needs/preferences while you work on your projects was a transcendent experience for me when I first tried Emacs just over a decade ago. Not only do I want to develop games within Emacs, I want my engine to share some of that Emacs spirit.
Xelf, the Common Lisp game engine that combines concepts from Smalltalk and Emacs with great results. Common Lisp isn't really my thing, but if it's your thing then you should really try making a game with it.

In order to implement the design goals, one of the first big decisions that needed to be made was about the programming paradigm. Scheme is often referred to as a functional language, but it is really a multi-paradigm language. Different layers of a program can choose the paradigm that best suits the domain. I decided to copy Xelf and use an object-oriented architecture built on Guile's OOP system: GOOPS. GOOPS closely resembles the almighty CLOS: The Common Lisp Object System. If you're unfamiliar with CLOS, the most important thing to know about it is that methods do not belong to classes. This separation of class and method unlocks the ability for methods to dispatch on all of their arguments instead of traditional single dispatch on only the first argument. Combine that with support for multiple inheritance and the metaobject protocol (which I will not go into here, but it rocks) and you have an OOP system that I actually enjoy using. Classes and methods can be redefined at runtime, and it's a much better experience than modifying record types (which do not update existing instances) and procedures (which often have old versions referenced elsewhere that are not updated without re-evaluating that code, too). So, GOOPS provides the flexible foundation for Catbird's REPL-driven development model.
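To make multiple dispatch a bit more concrete, here is a minimal GOOPS sketch. The classes and the collide method are made up for illustration and are not part of Catbird; the point is only that the method chosen depends on the types of both arguments.

```scheme
(use-modules (oop goops))

;; Hypothetical classes for illustration only; not Catbird's API.
(define-class <entity> ())
(define-class <player> (<entity>))
(define-class <enemy> (<entity>))
(define-class <coin> (<entity>))

;; Methods live outside the classes.  Each one specializes on the
;; types of both arguments, not just the first.
(define-method (collide (a <entity>) (b <entity>)) 'nothing)
(define-method (collide (a <player>) (b <enemy>)) 'take-damage)
(define-method (collide (a <player>) (b <coin>)) 'collect)

(define player (make <player>))

(collide player (make <enemy>)) ; uses the <player>/<enemy> method
(collide player (make <coin>))  ; uses the <player>/<coin> method
(collide (make <coin>) player)  ; falls back to <entity>/<entity>
```

Re-evaluating any one of these define-method forms at the REPL replaces just that method, which is the kind of runtime modification the design relies on.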

Here's a simple diagram of the most important classes in the engine:

Catbird class diagram

Nodes and modes

There are two fundamental types in Catbird, and they rhyme: Nodes and modes. Nodes encapsulate objects in the game world, such as the player character. Modes encapsulate states of interactivity, such as moving the player character around a map with the arrow keys. Catbird nodes are similar to Godot nodes, and modes are similar to Emacs modes.

Nodes encapsulate the state of an object in the game world. Nodes can be rendered and updated, and they can run asynchronous scripts. Nodes can have one parent and zero or more children, forming a tree structure often called a scene graph, though in this case it is a true tree because nodes cannot belong to more than one parent. This structure allows for composing complex objects from simpler ones. A player character node might contain an animated sprite node to handle its various animations. The state of a parent node affects child nodes. If sprite A is at position (2, 3) and contained within sprite B at position (3, 5), then sprite A will be rendered at position (5, 8). If sprite B is paused, then neither sprite A nor sprite B will have its state updated. If sprite B is hidden, then neither sprite A nor sprite B will be rendered.
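The position example above can be sketched in a few lines of GOOPS. This is a toy under stated assumptions: the <toy-node> class, attach!, and world-position are hypothetical names, and only translation is handled (no rotation, scale, pausing, or hiding).

```scheme
(use-modules (oop goops))

;; Hypothetical minimal node; not Catbird's actual node class.
(define-class <toy-node> ()
  (x #:init-keyword #:x #:init-value 0 #:accessor node-x)
  (y #:init-keyword #:y #:init-value 0 #:accessor node-y)
  (parent #:init-value #f #:accessor node-parent)
  (children #:init-value '() #:accessor node-children))

(define (attach! parent child)
  ;; A node may have at most one parent, so the graph stays a tree.
  (when (node-parent child)
    (error "node already has a parent" child))
  (set! (node-parent child) parent)
  (set! (node-children parent) (cons child (node-children parent))))

(define (world-position node)
  ;; Sum the local positions from NODE up to the root of the tree.
  (let loop ((n node) (x 0) (y 0))
    (if n
        (loop (node-parent n) (+ x (node-x n)) (+ y (node-y n)))
        (cons x y))))

(define sprite-b (make <toy-node> #:x 3 #:y 5))
(define sprite-a (make <toy-node> #:x 2 #:y 3))
(attach! sprite-b sprite-a)

(world-position sprite-a) ; => (5 . 8)
```

Pausing and hiding could propagate the same way: walk up the parent chain and stop updating or rendering as soon as any ancestor is paused or hidden.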

Modes encapsulate pieces of a game's state machine and serve as the input handling layer. There are two types of modes: Major and minor. Major modes are considered mutually incompatible with each other and are used to represent different states within a game scene. For example, map traversal in an RPG could be represented by a major mode for moving the player character around the map and another major mode for interacting with objects/talking to NPCs. Minor modes implement smaller, composable pieces of game logic. For example, text field editing controls could be contained within a minor mode and composed with every major mode that has the player type something. All modes have an input map, which translates keyboard/mouse/controller input events to calls to their respective event handlers.
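As a rough illustration of what an input map does, the sketch below models one as an association list from event descriptions to handler procedures. Everything here (<toy-mode>, handle-input, the event format) is an assumption made for the example; Catbird's real input map API may look quite different.

```scheme
(use-modules (oop goops))

;; Hypothetical mode with an input map stored as an association list.
(define-class <toy-mode> ()
  (input-map #:init-keyword #:input-map #:init-value '()
             #:getter input-map))

(define (handle-input mode event)
  ;; Find the handler bound to EVENT and call it.  Returns #f when
  ;; the mode has no binding for the event.
  (let ((binding (assoc event (input-map mode))))
    (and binding ((cdr binding)))))

;; Handlers record what they did so the effect is easy to inspect.
(define actions '())
(define (move-left) (set! actions (cons 'move-left actions)) #t)
(define (move-right) (set! actions (cons 'move-right actions)) #t)

(define walk-mode
  (make <toy-mode>
    #:input-map `(((key-press left) . ,move-left)
                  ((key-press right) . ,move-right))))

(handle-input walk-mode '(key-press left))  ; calls move-left
(handle-input walk-mode '(key-press space)) ; unbound, returns #f
```

Because the handlers only know about game state, rebinding move-left to a controller button would mean changing one entry in the map and nothing else.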

Scenes

Nodes and modes are fundamental but have no direct relationship to each other. Nodes do not contain modes, and modes do not contain nodes. A new type is required to link the two together: Scenes. In Emacs, buffers contain text. In Catbird, scenes contain nodes. Both buffers and scenes have modes.

The scene type is a subclass of node that is used to encapsulate a coarse chunk of a game's state machine. For example, an RPG could be divided into several scenes: world map, inventory, and battle. Modes are attached to scenes to form a playable game. Scenes always have one active major mode and zero or more minor modes. Scenes and modes together form the state machine abstraction, handling coarse- and fine-grained states, respectively.
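Here is a small sketch of the "one major mode, zero or more minor modes" rule. The <toy-scene> class and its helper procedures are hypothetical, and modes are plain symbols to keep the example short.

```scheme
(use-modules (oop goops))

;; Hypothetical scene that tracks only its modes; not Catbird's API.
(define-class <toy-scene> ()
  (major-mode #:init-keyword #:major-mode #:accessor major-mode)
  (minor-modes #:init-value '() #:accessor minor-modes))

(define (switch-major-mode! scene mode)
  ;; Major modes are mutually exclusive: the new one replaces the old.
  (set! (major-mode scene) mode))

(define (add-minor-mode! scene mode)
  ;; Minor modes accumulate, but each one is attached at most once.
  (unless (memq mode (minor-modes scene))
    (set! (minor-modes scene) (cons mode (minor-modes scene)))))

(define (remove-minor-mode! scene mode)
  (set! (minor-modes scene) (delq mode (minor-modes scene))))

(define world-map (make <toy-scene> #:major-mode 'walk))

(add-minor-mode! world-map 'text-entry)
(switch-major-mode! world-map 'talk)

(major-mode world-map)  ; => talk
(minor-modes world-map) ; => (text-entry)
```

Switching from walk to talk leaves the text-entry minor mode in place, which is what lets minor modes be composed with any major mode.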

Cameras and regions

Okay, so scenes have nodes, but how is a scene rendered? How do you move around within a scene? With a camera, of course! Cameras provide a view into a scene. They have a projection, position, and orientation. The same scene can be rendered using multiple cameras, if desired, such as in a split-screen multiplayer game. This is like how an Emacs buffer can be viewed from many different scroll positions at the same time.

Cameras need a place to render, and I wanted to make sure that it wasn't always assumed that rendering should cover the whole screen, so I adapted another Emacs concept and renamed it. Emacs has windows (these are not desktop windows; Emacs calls those frames because it predates windowing systems!) and Catbird has regions. Regions represent a sub-section of the game window, defining a viewport to which a scene can be rendered. Regions can be associated with one scene and one camera. When both a scene and camera are present, the scene is rendered to the region's viewport from the perspective of the camera. In a typical game, one region that covers the entire window is all that's needed. A split-screen multiplayer game, however, could divide the window into two regions and render the same scene using different cameras. The scene associated with a region can be changed to transition from one scene to another.
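The split-screen case can be sketched as data. The <toy-region> class, renderable?, and split-screen below are hypothetical names, and scenes and cameras are stand-in symbols; no actual rendering happens.

```scheme
(use-modules (oop goops))

;; Hypothetical region: a viewport rectangle plus an optional scene
;; and camera.  Not Catbird's actual region class.
(define-class <toy-region> ()
  (x #:init-keyword #:x #:getter region-x)
  (y #:init-keyword #:y #:getter region-y)
  (width #:init-keyword #:width #:getter region-width)
  (height #:init-keyword #:height #:getter region-height)
  (scene #:init-keyword #:scene #:init-value #f
         #:accessor region-scene)
  (camera #:init-keyword #:camera #:init-value #f
          #:accessor region-camera))

(define (renderable? region)
  ;; A region draws only when it has both a scene and a camera.
  (and (region-scene region) (region-camera region) #t))

(define (split-screen width height scene camera-1 camera-2)
  ;; Two side-by-side regions showing the same scene through
  ;; different cameras.
  (let ((half (quotient width 2)))
    (list (make <toy-region> #:x 0 #:y 0
                #:width half #:height height
                #:scene scene #:camera camera-1)
          (make <toy-region> #:x half #:y 0
                #:width (- width half) #:height height
                #:scene scene #:camera camera-2))))

(define regions
  (split-screen 640 480 'battle 'player-1-camera 'player-2-camera))

(define empty-region
  (make <toy-region> #:x 0 #:y 0 #:width 640 #:height 480))
```

A scene transition in this model is just a set! on region-scene; the viewport and camera stay where they are.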

Assets

Assets are containers for data that is loaded from the file system, such as images or audio files. They are meant to be defined as top-level variables in modules and referenced by whichever nodes need them. Assets keep track of the file(s) from which the data was loaded. Assets are lazily loaded when they are first dereferenced, but they can also be manually loaded ahead of time. When developing, the files associated with assets are watched for changes. When a change is detected in an asset file, the asset is automatically reloaded. Nodes keep references to assets, not the data within, so that the freshly reloaded data is automatically used by all nodes with a reference to that asset. With code and data modifiable at runtime, there is rarely a reason to stop and restart the game.
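Lazy loading and reloading can be sketched as follows. The <toy-asset> class, load!, and asset-ref are hypothetical, the loader is a stand-in that does no file I/O, and file watching is left out entirely.

```scheme
(use-modules (oop goops))

;; Hypothetical asset: remembers its file name and a loader
;; procedure, and caches the loaded data.  Not Catbird's API.
(define-class <toy-asset> ()
  (file-name #:init-keyword #:file-name #:getter asset-file-name)
  (loader #:init-keyword #:loader #:getter asset-loader)
  (loaded? #:init-value #f #:accessor asset-loaded?)
  (data #:init-value #f #:accessor asset-data))

(define (load! asset)
  ;; Load (or reload) the data from the asset's file.
  (set! (asset-data asset)
        ((asset-loader asset) (asset-file-name asset)))
  (set! (asset-loaded? asset) #t))

(define (asset-ref asset)
  ;; Dereference the asset, loading it on first use.
  (unless (asset-loaded? asset)
    (load! asset))
  (asset-data asset))

;; Stand-in loader that counts calls instead of reading a file.
(define load-count 0)
(define (fake-loader file-name)
  (set! load-count (+ load-count 1))
  (string-append file-name " v" (number->string load-count)))

(define player-sprite
  (make <toy-asset> #:file-name "player.png" #:loader fake-loader))

(asset-ref player-sprite) ; first use triggers the load
(asset-ref player-sprite) ; cached, the loader is not called again
(load! player-sprite)     ; what a file watcher would do on change
(asset-ref player-sprite) ; now returns the reloaded data
```

A node that holds player-sprite rather than the string it returns picks up the new data on its next asset-ref call without being touched.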

Conclusion

So yeah, this engine design isn't novel by any means. I'm just combining some traditional game engine design with concepts from Emacs, some of which have already been applied in engines like Xelf. I think the result is quite nice, though, and I don't know of any other Scheme project that's quite like it. I will be continuing to develop Catbird here, if you want to check out the code: https://git.dthompson.us/catbird.git/