Doom STM

Background Knowledge:

The picture above shows an STM32F429 dev board running Doom. The port is based on Chocolate Doom, and I used the project as an exercise in learning to program the STM32 at a low level. The STM32 is a family of 32-bit ARM Cortex-M microcontrollers designed for embedded systems. I had to work with the STM32 for the MR-Clutch robot arm, where it was supposed to act as the controller for the arm while the Jetson TX2 served as the brain of the system.

Quick Run:

The hardest part of the STM32 learning curve was the linker script. Every chip has a different memory map, so code and data must be placed in the right memory regions for the program to work correctly. The biggest help here was code generators and the datasheet for the chip I used. Below is the linker script I used for the STM32. The memory regions it refers to (FLASH, RAM, CCRAM, and BATTRAM) have to be declared in a MEMORY block, which is not shown here.

ENTRY(Reset_Handler)

STACK_SIZE = 0x100;
HEAP_SIZE = 0x100;

PROVIDE(_estack = ALIGN(ORIGIN(RAM) + LENGTH(RAM) - 8, 8));

SECTIONS
{
	.isr_vector :
	{
		. = ALIGN(4);
		KEEP(*(.isr_vector))
		. = ALIGN(4);
	} >FLASH

	.text :
	{
		__preinit_array_start = .;
		KEEP(*(.preinit_array*))
		__preinit_array_end = .;

		__ctor_start__ = .;
		__init_array_start = .;
		KEEP(SORT(*)(.init_array))
		__ctors_end__ = .;
		__init_array_end = .;
		__dtors_start__ = .;
		
		. = ALIGN(4);
		*(.text)
		*(.text.*)
		*(.rodata)
		*(.rodata.*)

		*(.eh_frame_hdr)
		*(.eh_frame)
		*(.ARM.extab* .gnu.linkonce.armextab.*)

		. = ALIGN(4);
	} >FLASH

	__exidx_start = .;

	. = ALIGN(4);
	_etext = .;
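	_sidata = LOADADDR(.data);	/* flash (load) address of .data; stays correct even if the linker places something after _etext */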

	.data :
	{
		. = ALIGN(4);
		_sdata = .;
		*(.data)
		*(.data.*)
	} > RAM AT > FLASH

	.ramfunc :
	{
		. = ALIGN(4);
		*(.ramfunc)
		*(.ramfunc.*)
		. = ALIGN(4);
		_edata = .;
	} > RAM AT > FLASH

	.bss (NOLOAD):
	{
		. = ALIGN(4);
		_sbss = .;
		*(.bss)
		*(.bss.*)
		*(COMMON)
		. = ALIGN(4);
		_ebss = .;
	} >RAM

	.noinit (NOLOAD):
	{
		. = ALIGN(4);
		_start_of_noinit = .;
		*(.noinit)
		*(.noinit.*)
		. = ALIGN(4);
		_end_of_noinit = .;
		_end = .;
		__end = .;
	} >RAM

	.battram (NOLOAD):
	{
		. = ALIGN(4);
		_start_of_batt_ram = .;
		*(.battram)
		*(.battram.*)
		_end_of_batt_ram = .;
	} > BATTRAM

	.ccram :
	{
		. = ALIGN(4);
		_start_of_ccram = .;
		*(.ccram)
		*(.ccram.*)
		_end_of_ccram = .;
	} > CCRAM

	._usr_stack_heap :
	{
		. = ALIGN(4);
		. = . + STACK_SIZE;
		. = . + HEAP_SIZE;
		. = ALIGN(4);
	} >RAM

	PROVIDE(_heap = _end);	/* start the heap after .noinit; starting at _ebss would overlap .noinit */
	PROVIDE(_eheap = _estack - STACK_SIZE);
}

PROVIDE(__top_of_stack = _estack);
PROVIDE(__idata_start = _sidata);
PROVIDE(__data_start = _sdata);
PROVIDE(__data_end = _edata);
PROVIDE(__bss_start = _sbss);
PROVIDE(__bss_end = _ebss);
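
To illustrate what the symbols exported by this script are for, here is a minimal sketch of the C side of a reset handler. It is not ST's startup file and not necessarily what this project used. The helper names copy_data and zero_region and the STM32_BARE_METAL build flag are made up for this example; only the underscore-prefixed symbols come from the script above. The sketch assumes every boundary is 4-byte aligned, which the ALIGN(4) statements in the script are meant to guarantee.

```c
#include <stdint.h>

/* Copy initialized data from its load address (flash) to its run address (RAM). */
void copy_data(const uint32_t *src, uint32_t *dst, const uint32_t *dst_end)
{
    while (dst < dst_end) {
        *dst++ = *src++;
    }
}

/* Zero-fill a word-aligned region such as .bss. */
void zero_region(uint32_t *start, const uint32_t *end)
{
    while (start < end) {
        *start++ = 0u;
    }
}

#ifdef STM32_BARE_METAL /* hypothetical flag: only defined when building for the board */
/* Symbols defined by the linker script; only their addresses are meaningful. */
extern uint32_t _sidata, _sdata, _edata, _sbss, _ebss;
extern int main(void);

void Reset_Handler(void)
{
    copy_data(&_sidata, &_sdata, &_edata); /* .data and .ramfunc: flash -> RAM */
    zero_region(&_sbss, &_ebss);           /* .bss must start out as zeros */
    main();
    for (;;) { }                           /* nothing to return to on bare metal */
}
#endif
```

On a Cortex-M part the hardware loads the initial stack pointer from the first entry of the vector table, so this sketch never sets it. A real startup file typically also configures clocks and runs the .init_array constructors before calling main.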

This is all well and good, but the linker script only lays out memory; it does not provide the startup code. Every CPU runs a piece of code at the very beginning, at the address the program counter starts from after a reset. For the STM32, this code can be found by looking up ST's startup files. The harder part is libc, which is not available by default on bare-metal systems. Functions such as strcmp are easy to take for granted. Because of size constraints, I wrote the majority of the string library in assembly. Below is a section of that code for the STM32.

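.syntax unified

@ void *memcpy(void *dest, const void *src, size_t n)
@ r0 = dest, r1 = src, r2 = n; copies one byte at a time and returns dest
.global memcpy
.type memcpy, %function
memcpy:
	mov	r12, r0			@ save dest for the return value
1:
	subs	r2, #1			@ n--, setting the flags
	bmi	2f			@ done once n goes negative
	ldrb	r3, [r1], #1		@ r3 = *src++
	strb	r3, [r0], #1		@ *dest++ = r3
	b	1b
2:
	mov	r0, r12			@ return the original dest
	bx	lr

@ void *memset(void *dest, int value, size_t n)
@ r0 = dest, r1 = value, r2 = n; stores the low byte of value n times
.global memset
.type memset, %function
memset:
	mov	r12, r0			@ save dest for the return value
1:
	subs	r2, #1			@ n--, setting the flags
	bmi	2f			@ done once n goes negative
	strb	r1, [r0], #1		@ *dest++ = (unsigned char)value
	b	1b
2:
	mov	r0, r12			@ return the original dest
	bx	lr

@ size_t strlen(const char *s)
@ r0 = s; returns the number of bytes before the NUL terminator
.global strlen
.type strlen, %function
strlen:
	mov	r2, r0			@ remember the start of the string
1:
	ldrb	r1, [r0], #1		@ r1 = *s++
	tst	r1, r1			@ is it the NUL terminator?
	bne	1b			@ no: keep scanning
	sub	r0, r0, r2		@ bytes consumed, including the NUL
	sub	r0, r0, #1		@ drop the NUL from the count
	bx	lr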

While this is not the full string library, it was all that was needed. I could make this code more efficient in several ways (for example, by copying a word at a time instead of a byte at a time). However, I wrote it when I was younger and feel it would not be right to change it now. I may write a post on optimizing assembly later.
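
For readers less used to ARM assembly, the three routines above correspond roughly to the C below. This is a reading aid, not code from the project, and the ref_ prefix is only there so the names do not clash with the real libc. One small difference: the assembly tests the counter with a signed branch (bmi), so it would stop immediately for lengths above 2 GiB, which cannot occur on this chip.

```c
#include <stddef.h>

/* Byte-wise copy; returns dest, just as the assembly restores r0 from r12. */
void *ref_memcpy(void *dest, const void *src, size_t n)
{
    unsigned char *d = dest;
    const unsigned char *s = src;
    while (n--) {
        *d++ = *s++;
    }
    return dest;
}

/* Byte-wise fill with the low 8 bits of value; returns dest. */
void *ref_memset(void *dest, int value, size_t n)
{
    unsigned char *d = dest;
    while (n--) {
        *d++ = (unsigned char)value;
    }
    return dest;
}

/* Count the bytes before the NUL terminator. */
size_t ref_strlen(const char *s)
{
    const char *p = s;
    while (*p != '\0') {
        p++;
    }
    return (size_t)(p - s);
}
```

Writing the routines this way also gives something to compare the assembly against on a desktop machine before flashing the board.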

Most Important Tidbits: