SUB NIVIS

SUB NIVIS is a first-person shooter inspired by classics like Quake and DOOM. Play as Sister Maria, a nun sent to the Black Pyramid of Alaska to succeed where others, like the Vatican and the UN, have failed.

The game runs on a custom engine that I have worked on as an engine programmer since the start of my second year at BUas. Read more about it here.

Download here

Tech Lead

During my second year at BUas, I was fortunate enough to spend the majority of my time developing the engine that would eventually power SUB NIVIS.

This led to me being chosen as Tech Lead for the game. Being the Tech Lead was very insightful, as I saw how different disciplines worked with the engine I had worked on since the beginning.

It was also interesting to explain to other developers what was possible with the engine and to give them examples.

In the next section, I will explain the cool tech behind SUB NIVIS and the choices we made for ‘The Other Engine’.

The Tech Behind SUB NIVIS

The Other Engine

SUB NIVIS is powered by a custom engine called The Other Engine. The Other Engine is a cross-platform engine that focuses on developing retro-like shooters. It supports the original Quake’s level file format (BSP) and uses EnTT for its ECS.

The Quake BSP level format

Because we wanted the engine to be compatible with the numerous Quake mod tools, we decided to use the BSP level format from Quake.

The name of this level format comes from “binary space partitioning”, a technique that recursively divides the level into smaller chunks. The original Quake relied heavily on it to avoid drawing more of the level than strictly necessary (occlusion culling). It was one of the reasons Quake managed to render fully 3D levels back in the day.
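To illustrate how such a partitioned level is queried, here is a minimal sketch of finding the leaf that contains a point. It is not the engine's code: `Vec3`, `Plane`, `Node` and `find_leaf` are simplified, hypothetical types and names, and the sketch assumes the Quake convention that a negative child index `c` encodes leaf number `~c`.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified types; the on-disk BSP structs are laid out differently.
struct Vec3  { float x, y, z; };
struct Plane { Vec3 normal; float dist; };

struct Node
{
    int32_t plane; // index into the plane array
    int32_t front; // child on the front side of the plane
    int32_t back;  // child on the back side of the plane
                   // assumption: a child >= 0 is a node index, a child < 0 encodes leaf ~child
};

inline float dot(const Vec3& a, const Vec3& b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Walk down from the root, picking a side at every splitting plane,
// until a (negative) leaf reference is reached.
int32_t find_leaf(const std::vector<Node>& nodes,
                  const std::vector<Plane>& planes,
                  const Vec3& point)
{
    int32_t index = 0; // the root node
    while (index >= 0)
    {
        const Node&  node  = nodes[index];
        const Plane& plane = planes[node.plane];

        // signed distance from the point to the splitting plane
        const float side = dot(plane.normal, point) - plane.dist;
        index = (side >= 0.0f) ? node.front : node.back;
    }
    return ~index; // decode the leaf number
}
```

Each step discards one half of the remaining space, so the lookup cost grows with the depth of the tree rather than with the number of faces in the level.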

The original version of ‘The Other Engine’ also used the BSP for occlusion culling. However, profiling showed that it was cheaper to upload the whole level to the GPU, because the levels contain relatively few vertices. The culling data is therefore only used to activate and deactivate enemies based on the position of the player.
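As a rough sketch of what visibility-driven activation can look like, the snippet below expands one row of the visibility list and toggles enemies with it. It assumes the visilist uses the original Quake run-length encoding (a zero byte followed by the number of zero bytes it stands for) and that bit `n` of a row refers to leaf `n + 1`. `Enemy`, `decompress_pvs`, `leaf_visible` and `update_enemy_activation` are hypothetical names, not the engine's actual API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Expand one run-length-encoded visibility row that starts at `offset`.
// `num_leaves` is the number of leaves excluding the solid leaf 0.
std::vector<uint8_t> decompress_pvs(const std::vector<uint8_t>& visilist,
                                    std::size_t offset, int num_leaves)
{
    const std::size_t row_bytes = static_cast<std::size_t>((num_leaves + 7) / 8);
    std::vector<uint8_t> row;
    row.reserve(row_bytes);

    std::size_t pos = offset;
    while (row.size() < row_bytes && pos < visilist.size())
    {
        const uint8_t value = visilist[pos++];
        if (value != 0)
        {
            row.push_back(value); // non-zero bytes are stored as-is
            continue;
        }
        if (pos >= visilist.size())
            break; // truncated data: stop instead of reading out of bounds

        // a zero byte is followed by the length of the run of zero bytes
        const std::size_t zeros = visilist[pos++];
        for (std::size_t i = 0; i < zeros && row.size() < row_bytes; ++i)
            row.push_back(0);
    }
    row.resize(row_bytes, 0); // pad if the input ended early
    return row;
}

// Bit (leaf - 1) tells whether `leaf` is potentially visible; leaf 0 never is.
bool leaf_visible(const std::vector<uint8_t>& row, int leaf)
{
    if (leaf <= 0)
        return false;
    const std::size_t bit = static_cast<std::size_t>(leaf - 1);
    if ((bit >> 3) >= row.size())
        return false;
    return ((row[bit >> 3] >> (bit & 7)) & 1) != 0;
}

// Hypothetical enemy record: the leaf it stands in and whether it is updated.
struct Enemy { int leaf; bool active; };

void update_enemy_activation(std::vector<Enemy>& enemies,
                             const std::vector<uint8_t>& player_row)
{
    for (Enemy& enemy : enemies)
        enemy.active = leaf_visible(player_row, enemy.leaf);
}
```

In this sketch the row only needs to be decompressed again when the player moves into a different leaf.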


BSP_Resource.cpp

bool BspResource::parse_bsp(const std::string& file_name)
{
	m_file_path = file_name;
	size_t data_size;
	void *data = read_entire_file(file_name.c_str(), &data_size);

	if (!data)
	{
		// read_entire_file already complained so just exit
		return false;
	}

	unsigned char *base = (unsigned char *)data;
	BSP::dheader_t *header = (BSP::dheader_t *)base;

	// read entity data
	{
		assert(header->entities.size < MAX_MAP_ENTSTRING && "if this fails try increasing the MAX_MAP_ENTSTRING value");
		memcpy(m_ent_string, base + header->entities.offset, header->entities.size);
	}

	// read plane data
	{
		m_num_planes = header->planes.size / sizeof(BSP::plane_t);
		
		m_planes = static_cast<BSP::plane_t*>(MemoryManager::alloc_CPU(header->planes.size));
		memcpy(m_planes, base + header->planes.offset, header->planes.size);
	}

	// read miptex data
	{
		m_num_textures = *(int32_t*)(base + header->miptex.offset);
		assert(m_num_textures <= MAX_MAP_TEXTURES);

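		// the miptex lump starts with an int32 count followed by one int32 offset per texture,
		// so copy offsets here rather than whole miptex_t structs
		memcpy(m_mip_texture_offsets, base + header->miptex.offset + 4, sizeof(int32_t) * m_num_textures);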
	}

	// read vertices data.
	{
		m_num_vertices = header->vertices.size / sizeof(BSP::vertex_t);

		m_vertices = static_cast<BSP::vertex_t*>(MemoryManager::alloc_CPU(header->vertices.size));
		memcpy(m_vertices, base + header->vertices.offset, header->vertices.size);
	}

	// read visilist data
	{
		m_num_visilist = header->visilist.size;

		m_visilist = static_cast<BSP::u_char*>(MemoryManager::alloc_CPU(header->visilist.size));
		memcpy(m_visilist, base + header->visilist.offset, header->visilist.size);
	}

	// read bsp node data
	{
		m_num_nodes = header->nodes.size / sizeof(BSP::node_t);
		
		m_bsp_nodes = static_cast<BSP::node_t*>(MemoryManager::alloc_CPU(header->nodes.size));
		memcpy(m_bsp_nodes, base + header->nodes.offset, header->nodes.size);
	}

	// read texture info data
	{
		m_num_texture_info = header->texinfo.size / sizeof(BSP::texinfo_t);
		
		m_texture_info = static_cast<BSP::texinfo_t*>(MemoryManager::alloc_CPU(header->texinfo.size));
		memcpy(m_texture_info, base + header->texinfo.offset, header->texinfo.size);
	}

	// read face data
	{
		m_num_faces = header->faces.size / sizeof(BSP::face_t);

		m_faces = static_cast<BSP::face_t*>(MemoryManager::alloc_CPU(header->faces.size));
		memcpy(m_faces, base + header->faces.offset, header->faces.size);
	}

	// read lightmap data
	{
		m_lightmaps_size = header->lightmaps.size;
		assert(m_lightmaps_size <= MAX_MAP_LIGHTING);

		memcpy(m_lightmaps, base + header->lightmaps.offset, header->lightmaps.size);
	}

	// TODO: clipnodes

	// read leaf data
	{
		m_num_leaves = header->leaves.size / sizeof(BSP::dleaf_t);

		m_leaves = static_cast<BSP::dleaf_t*>(MemoryManager::alloc_CPU(header->leaves.size));
		memcpy(m_leaves, base + header->leaves.offset, header->leaves.size);
	}

	// read lface data
	{
		m_num_lfaces = header->lface.size / sizeof(BSP::u_short);

		m_lfaces = static_cast<BSP::u_short*>(MemoryManager::alloc_CPU(header->lface.size));
		memcpy(m_lfaces, base + header->lface.offset, header->lface.size);
	}

	// read edge data
	{
		m_num_edges = header->edges.size / sizeof(BSP::edge_t);

		m_edges = static_cast<BSP::edge_t*>(MemoryManager::alloc_CPU(header->edges.size));
		memcpy(m_edges, base + header->edges.offset, header->edges.size);
	}

	// read Ledge data
	{
		m_num_ledges = header->ledges.size / sizeof(int32_t);

		m_ledges = static_cast<int32_t*>(MemoryManager::alloc_CPU(header->ledges.size));
		memcpy(m_ledges, base + header->ledges.offset, header->ledges.size);
	}

	// read model data
	{
		m_num_models = header->models.size / sizeof(BSP::model_t);

		m_models = static_cast<BSP::model_t*>(MemoryManager::alloc_CPU(header->models.size));
		memcpy(m_models, base + header->models.offset, header->models.size);
	}

	// Parse texture data
	{
		for (size_t i = 0; i < m_num_textures; i++)
		{
			// texture is unused
			if (m_mip_texture_offsets[i] == -1)
				continue;

			// read all the mip textures
			// note: because textures have unpredictable size, we need to load each texture individually
			const BSP::u_char *mip_textures_base = base + header->miptex.offset;
			memcpy(&m_textures[i], mip_textures_base + m_mip_texture_offsets[i], sizeof(BSP::miptex_t));
		}
	}

	LightmapFormat lightmap_format = LightmapFormat::mono;

	// bspx
	{
		// find bspx header, it should be at the end of all the lumps, aligned to a 4 byte boundary
		int bspx_offset = 0;

		for (int i = 0; i < BSP_ENTRY_COUNT; i++)
		{
			int offset = header->entries[i].offset + header->entries[i].size;

			if (bspx_offset < offset)
				bspx_offset = offset;
		}

		// do the align
		bspx_offset = (bspx_offset + 3) & ~3;

		// if there's space left in the file for a bspx header, let's try parsing it
		if (bspx_offset + sizeof(BSP::bspx_header_t) <= data_size)
		{
			BSP::bspx_header_t *bspx = (BSP::bspx_header_t *)(base + bspx_offset);

			// see if this header looks reasonable, that the id matches
			if (memcmp(&bspx->id, "BSPX", 4) == 0)
			{
				// and see that we have space left for all the entries
				int total_size = bspx_offset + sizeof(*bspx) + sizeof(BSP::bspx_entry_t)*bspx->entry_count;
				if (total_size <= data_size)
				{
					BSP::bspx_entry_t *entry_base = (BSP::bspx_entry_t *)(bspx + 1);
					for (int entry_index = 0; entry_index < bspx->entry_count; entry_index++)
					{
						BSP::bspx_entry_t *entry = &entry_base[entry_index];
						if (strcmp(entry->name, "RGBLIGHTING") == 0)
						{
							// handle rgb lighting
							lightmap_format = LightmapFormat::rgb;

							// TODO: This copy is totally pointless, why are we copying this.
							assert(entry->size <= sizeof(m_lightmaps));
							m_lightmaps_size = entry->size;

							memcpy(m_lightmaps, base + entry->offset, entry->size);
						}
					}
				}
			}
			else
			{
				logf(LogLevel::warning, "there is space left in the bsp but it does not look like a valid bspx header!");
			}
		}
	}

	int lightmap_channels = (lightmap_format == LightmapFormat::rgb ? 3 : 1);

	stbi_write_bmp("lightmap_debug/lightmap_full.bmp", 50000 / 256, 256, lightmap_channels, m_lightmaps);

	// extract lightmaps
	{
		for (int i = 0; i < m_num_faces; i++)
		{
			const BSP::face_t &face = m_faces[i];

			if(face.lightmap != -1)
			{
				SurfExtents ext;
				calculate_surface_extents(face, ext);

				int lw = ext.w / LIGHTMAP_SCALE + 1;
				int lh = ext.h / LIGHTMAP_SCALE + 1;

				BSP::u_char *lightmap = &m_lightmaps[lightmap_channels*m_faces[i].lightmap];

				ScopedMemory temp;

				int stbi_result = 0;
				stbi_result = stbi_write_bmp(temp->format_string("lightmap_debug/lightmap_%d.bmp", i).data(), lw, lh, lightmap_channels, lightmap);

				Rendering::LightMap tex;
				tex.face     = i;
				tex.channels = 4;
				tex.width    = lw;
				tex.height   = lh;
				tex.name     = "light";

				tex.pixel_buffer.resize(tex.channels*tex.width*tex.height);
				unsigned char *dst = tex.pixel_buffer.data();

				switch (lightmap_format)
				{
					case LightmapFormat::mono:
					{
						BSP::u_char *src = lightmap;

						for (int j = 0; j < lw*lh; j++)
						{
							BSP::u_char value = *src++;

							*dst++ = value;
							*dst++ = value;
							*dst++ = value;
							*dst++ = 255;
						}
					} break;

					case LightmapFormat::rgb:
					{
						BSP::u_char *src = lightmap;

						for (int j = 0; j < lw*lh; j++)
						{
							BSP::u_char r = *src++;
							BSP::u_char g = *src++;
							BSP::u_char b = *src++;

							*dst++ = r;
							*dst++ = g;
							*dst++ = b;
							*dst++ = 255;
						}
					} break;
				}

				stbi_result = stbi_write_bmp(temp->format_string("lightmap_debug/lightmap_result_%d.bmp", i).data(), lw, lh, 4, tex.pixel_buffer.data());

				m_lightmap_textures.push_back(tex);
			}
		}
		
	}

	free(data);

	// every lump was read successfully
	return true;
}

std::vector<std::vector<Vertex>> BspResource::get_faces(size_t model_index) const
{
	std::vector<std::vector<Vertex>> faces;
	faces.reserve(65536);

	const BSP::model_t& model = m_models[model_index];

	// for each face in the model
	for (int face_offset = 0; face_offset < model.face_num; face_offset++)
	{
		const BSP::face_t& face = m_faces[model.face_id + face_offset];

		const BSP::texinfo_t& texture_info = m_texture_info[face.texinfo_id];

		glm::vec3 vector_s = { texture_info.vectorS.x,  texture_info.vectorS.y, texture_info.vectorS.z };
		glm::vec3 vector_t = { texture_info.vectorT.x,  texture_info.vectorT.y, texture_info.vectorT.z };

		glm::vec2 texture_scale = get_texture_scale(texture_info.texture_id);

		SurfExtents ext;
		calculate_surface_extents(face, ext);

		std::vector<Vertex> face_vertices;
		face_vertices.reserve(face.ledge_num);
		for (int edge_offset = 0; edge_offset < face.ledge_num; edge_offset++)
		{
			size_t edge_id = face.ledge_id + edge_offset;

			BSP::edge_t edge = m_edges[abs(m_ledges[edge_id])];
			glm::vec3 v0 = make_vec3(m_vertices[edge.vertex0]);
			glm::vec3 v1 = make_vec3(m_vertices[edge.vertex1]);

			float uv0 = dot(v0, vector_s) + texture_info.distS;
			float uv1 = dot(v0, vector_t) + texture_info.distT;

			uv0 = uv0 / texture_scale.x;
			uv1 = uv1 / texture_scale.y;

			float lm_uv0 = dot(v0, vector_s) + texture_info.distS;
			lm_uv0 -= ext.texture_mins[0];
			// lm_uv0 += 8.0f;
			lm_uv0 /= (float)ext.w; // texture_scale.x;

			float lm_uv1 = dot(v0, vector_t) + texture_info.distT;
			lm_uv1 -= ext.texture_mins[1];
			// lm_uv1 += 8.0f;
			lm_uv1 /= (float)ext.h; // texture_scale.y;

			Vertex vert0 = {
				{v0.x, v0.y, v0.z},
				{0, 0, 0},
				{ uv0, uv1 },
				{ lm_uv0, lm_uv1 },
				(float)texture_info.texture_id,
				(float)face_offset
			};

			uv0 = dot(v1, vector_s) + texture_info.distS;
			uv1 = dot(v1, vector_t) + texture_info.distT;

			uv0 = uv0 / texture_scale.x;
			uv1 = uv1 / texture_scale.y;

			lm_uv0 = dot(v1, vector_s) + texture_info.distS;
			lm_uv0 -= ext.texture_mins[0];
			// lm_uv0 += 8.0f;
			lm_uv0 /= (float)ext.w; // texture_scale.x;

			lm_uv1 = dot(v1, vector_t) + texture_info.distT;
			lm_uv1 -= ext.texture_mins[1];
			// lm_uv1 += 8.0f;
			lm_uv1 /= (float)ext.h; // texture_scale.y;

			Vertex vert1 ={
				{v1.x, v1.y, v1.z},
				{0, 0, 0},
				{ uv0, uv1 },
				{ lm_uv0, lm_uv1 },
				(float)texture_info.texture_id,
				(float)face_offset
			};

			if (m_ledges[edge_id] > 0)
			{
				face_vertices.push_back(vert0);
				face_vertices.push_back(vert1);
			}
			else
			{
				face_vertices.push_back(vert1);
				face_vertices.push_back(vert0);
			}
		}

		if (face_vertices.size() >= 3)
		{
			faces.push_back(face_vertices);
		}
	}
	faces.shrink_to_fit();
	return faces;
}

Memory Management in The Other Engine

Dynamic allocations can hurt the performance of a game. Each allocation may require a system call, which switches the CPU from user mode to kernel mode so the operating system can provide the memory. These transitions are relatively expensive, so they should be kept to a minimum.

I’ve implemented a stack-based allocator to prevent dynamic allocations in ‘The Other Engine’. In this style of allocator, memory is allocated and freed in a last-in-first-out (LIFO) manner. This means that long-lived data, for instance the level and the player, should preferably be allocated first, so that short-lived allocations on top of it can be freed without touching it.

The custom stack-based allocator also helped a great deal with porting the game to the PS4, as memory works a bit differently on that console.
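The LIFO pattern can be sketched with a small self-contained allocator. This is an illustration, not the engine's `StackAllocator`: `MiniStack` and its member names are hypothetical, and unlike the engine's version it tracks an offset, aligns every allocation, and returns `nullptr` when it runs out of space.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical minimal stack allocator: one up-front malloc, then pointer bumps.
class MiniStack
{
public:
    explicit MiniStack(std::size_t capacity)
        : m_base(static_cast<unsigned char*>(std::malloc(capacity)))
        , m_capacity(capacity)
        , m_used(0)
    {
        assert(m_base && "backing allocation failed");
    }

    ~MiniStack() { std::free(m_base); }

    MiniStack(const MiniStack&) = delete;
    MiniStack& operator=(const MiniStack&) = delete;

    // `alignment` must be a power of two no larger than alignof(std::max_align_t),
    // which is the alignment malloc guarantees for the base pointer.
    void* alloc(std::size_t size, std::size_t alignment = alignof(std::max_align_t))
    {
        // round the current offset up to the requested alignment
        const std::size_t start = (m_used + alignment - 1) & ~(alignment - 1);
        if (start + size > m_capacity)
            return nullptr; // out of space
        m_used = start + size;
        return m_base + start;
    }

    // A marker is simply the current top of the stack.
    std::size_t marker() const { return m_used; }

    // Free everything that was allocated after `marker` was taken.
    void free_to_marker(std::size_t marker)
    {
        assert(marker <= m_used && "marker is above the current top");
        m_used = marker;
    }

    std::size_t used() const { return m_used; }

private:
    unsigned char* m_base;
    std::size_t    m_capacity;
    std::size_t    m_used;
};
```

A typical usage is to allocate the level first, take a marker, allocate temporary data, and then call `free_to_marker` with that marker: the temporary data is released in one step while the level stays where it is.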


StackAllocator.cpp

#include "stdafx.h"
#include "StackAllocator.h"
#include <cassert>

// Windows specific allocation function
#ifdef PLATFORM_X64
#include <malloc.h>
#endif // PLATFORM_X64

using namespace toe;

StackAllocator::StackAllocator(uint64_t maxSize) 
{
	assert(maxSize > 0 && "Cannot allocate empty space.");

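	// Reserve the whole block up front; every later alloc() only bumps m_top
	m_top = malloc(maxSize);
	assert(m_top && "Failed to allocate backing memory for the stack allocator.");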

	m_size = maxSize;
	m_bottom = m_top;
}

StackAllocator::~StackAllocator() 
{
	free(m_bottom);
}

void* StackAllocator::alloc(uint64_t size) 
{
	assert(size > 0 && "Cannot allocate empty space");
	void* ptr = m_top;
	m_top = static_cast<char*>(m_top) + size;
	// using exactly m_size bytes is still valid, so compare with <=
	assert((reinterpret_cast<uint64_t>(m_top) - reinterpret_cast<uint64_t>(m_bottom)) <= m_size && "Stack allocator ran out of space.");
	return ptr;
}

StackAllocator::Marker StackAllocator::getMarker() const
{
	return reinterpret_cast<Marker>(m_top);
}

void StackAllocator::freeToMarker(Marker marker) 
{
	if(marker == 0)
	{
		Debug::logSysWarning("0 Marker free requested... Nothing is freed.", "MEMORY");
		return;
	}
	assert(marker >= (uint64_t)m_bottom && marker <= (uint64_t)m_top && "Marker not within stack range");
	m_top = reinterpret_cast<void*>(marker);
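	// assuming printf-style formatting: the marker is 64 bits wide, so print it as unsigned long long
	Debug::logSysMessage("Memory freed to: %#018llx","MEMORY", static_cast<unsigned long long>(marker));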
}