
Background Knowledge:
The picture above is an STM32F429 dev board running Doom. The port is based on Chocolate Doom, and I used it as an exercise to figure out how to code for the STM32 this way. The STM32 is a family of 32-bit ARM Cortex-M microcontrollers designed for embedded systems. I had to work with the STM32 for the MR-Clutch arm robot, where it was supposed to act as the controller for the arm while the Jetson TX2 was the brain of the system.
Quick Run:
The hardest part of the STM32 learning curve was the linker script. Each chip has a different memory map, so code and data must be placed in the correct memory regions (flash, RAM, and so on) for the program to work. The biggest help here came from code generators and the datasheet for the chip I used. Below is the linker script I used for the STM32. It refers to the FLASH, RAM, CCRAM, and BATTRAM regions, whose MEMORY definitions are not shown here.
ENTRY(Reset_Handler)
STACK_SIZE = 0x100;
HEAP_SIZE = 0x100;
/* Initial stack pointer: top of RAM, 8-byte aligned */
PROVIDE(_estack = ALIGN(ORIGIN(RAM) + LENGTH(RAM) - 8, 8));
SECTIONS
{
    /* Interrupt vector table, placed first in flash */
    .isr_vector :
    {
        . = ALIGN(4);
        KEEP(*(.isr_vector))
        . = ALIGN(4);
    } >FLASH
    /* Code and read-only data */
    .text :
    {
        __preinit_array_start = .;
        KEEP(*(.preinit_array*))
        __preinit_array_end = .;
        __ctor_start__ = .;
        __init_array_start = .;
        KEEP(SORT(*)(.init_array))
        __ctors_end__ = .;
        __init_array_end = .;
        __dtors_start__ = .;
        . = ALIGN(4);
        *(.text)
        *(.text.*)
        *(.rodata)
        *(.rodata.*)
        *(.eh_frame_hdr)
        *(.eh_frame)
        *(.ARM.extab* .gnu.linkonce.armextab.*)
        . = ALIGN(4);
    } >FLASH
    __exidx_start = .;
    . = ALIGN(4);
    _etext = .;
    _sidata = _etext;
    /* Initialized data: runs from RAM, initial values stored in flash */
    .data :
    {
        . = ALIGN(4);
        _sdata = .;
        *(.data)
        *(.data.*)
    } > RAM AT > FLASH
    /* Functions that execute from RAM, also loaded from flash */
    .ramfunc :
    {
        . = ALIGN(4);
        *(.ramfunc)
        *(.ramfunc.*)
        . = ALIGN(4);
        _edata = .;
    } > RAM AT > FLASH
    /* Zero-initialized data: occupies RAM only, nothing stored in flash */
    .bss (NOLOAD):
    {
        . = ALIGN(4);
        _sbss = .;
        *(.bss)
        *(.bss.*)
        *(COMMON)
        . = ALIGN(4);
        _ebss = .;
    } >RAM
    /* RAM variables with no load image */
    .noinit (NOLOAD):
    {
        . = ALIGN(4);
        _start_of_noinit = .;
        *(.noinit)
        *(.noinit.*)
        . = ALIGN(4);
        _end_of_noinit = .;
        _end = .;
        __end = .;
    } >RAM
    .battram (NOLOAD):
    {
        . = ALIGN(4);
        _start_of_batt_ram = .;
        *(.battram)
        *(.battram.*)
        _end_of_batt_ram = .;
    } > BATTRAM
    .ccram :
    {
        . = ALIGN(4);
        _start_of_ccram = .;
        *(.ccram)
        *(.ccram.*)
        _end_of_ccram = .;
    } > CCRAM
    /* Reserves room in RAM for the stack and heap */
    ._usr_stack_heap :
    {
        . = ALIGN(4);
        . = . + STACK_SIZE;
        . = . + HEAP_SIZE;
        . = ALIGN(4);
    } >RAM
    PROVIDE(_heap = _ebss);
    PROVIDE(_eheap = _estack - STACK_SIZE);
}
PROVIDE(__top_of_stack = _estack);
PROVIDE(__idata_start = _sidata);
PROVIDE(__data_start = _sdata);
PROVIDE(__data_end = _edata);
PROVIDE(__bss_start = _sbss);
PROVIDE(__bss_end = _ebss);
This is all well and good, but the linker script does not handle the startup code. Every CPU runs a piece of code at the very beginning, starting from wherever the program counter points after reset. On the STM32 that is Reset_Handler, the ENTRY symbol above, which has to copy .data from flash into RAM and zero .bss before calling main. This code can be found by looking up ST's startup files.
The harder part is libc, which is not there by default on bare-metal systems. Functions such as strcmp are easy to take for granted. Because of size constraints, I wrote the majority of the string library in assembly. Below is a section of that code.
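    .syntax unified
    @ void *memcpy(void *dst, const void *src, size_t n)
    @ r0 = dst, r1 = src, r2 = n; returns dst in r0
    .global memcpy
memcpy:
    mov r12, r0             @ save dst for the return value
1:
    subs r2, #1             @ one fewer byte left to copy
    bmi 2f                  @ count went negative: done
    ldrb r3, [r1], #1       @ load a byte, post-increment src
    strb r3, [r0], #1       @ store it, post-increment dst
    b 1b
2:
    mov r0, r12             @ return the original dst
    bx lr
    @ void *memset(void *dst, int value, size_t n)
    @ r0 = dst, r1 = value, r2 = n; returns dst in r0
    .global memset
memset:
    mov r12, r0             @ save dst for the return value
1:
    subs r2, #1             @ one fewer byte left to fill
    bmi 2f                  @ count went negative: done
    strb r1, [r0], #1       @ store the low byte of value, post-increment dst
    b 1b
2:
    mov r0, r12             @ return the original dst
    bx lr
    @ size_t strlen(const char *s)
    @ r0 = s; returns the length in r0
    .global strlen
strlen:
    mov r2, r0              @ remember the start of the string
1:
    ldrb r1, [r0], #1       @ load a byte, post-increment the pointer
    tst r1, r1              @ is it the terminating zero?
    bne 1b
    sub r0, r0, r2          @ bytes walked, including the terminator
    sub r0, r0, #1          @ drop the terminator from the count
    bx lr
While this is not the full string library, it was all that was needed. I could make it more efficient in several ways, but I wrote this code when I was younger and feel it would not be right to change it now. I may write a post on optimizing assembly later.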
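To tie the two listings together, here is a rough C sketch of what a reset handler does with the symbols the linker script exports (_sidata, _sdata, _edata, _sbss, _ebss). It is not ST's startup file, only an illustration of the idea. The helper names copy_data and zero_region and the BARE_METAL_STM32 build flag are made up for this example, and the two helpers are kept separate so they can be tried out on a PC.

```c
#include <stdint.h>

/* Hypothetical helper: copy initialized data from its load address
 * (flash) to its run address (RAM), one 32-bit word at a time. */
void copy_data(const uint32_t *load, uint32_t *start, const uint32_t *end)
{
    while (start < end) {
        *start++ = *load++;
    }
}

/* Hypothetical helper: clear a zero-initialized region such as .bss. */
void zero_region(uint32_t *start, const uint32_t *end)
{
    while (start < end) {
        *start++ = 0u;
    }
}

#ifdef BARE_METAL_STM32   /* assumed flag for this sketch, not set on a PC */
/* Symbols from the linker script; only their addresses are meaningful. */
extern uint32_t _sidata, _sdata, _edata, _sbss, _ebss;
extern int main(void);

void Reset_Handler(void)
{
    copy_data(&_sidata, &_sdata, &_edata);  /* flash image -> .data/.ramfunc */
    zero_region(&_sbss, &_ebss);            /* clear .bss */
    (void)main();
    for (;;) { }                            /* nowhere to return to */
}
#endif
```

Word-sized copies are safe here only because the script aligns _sdata, _edata, _sbss, and _ebss to 4 bytes.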
Most Important Tidbits:
- ASM is easy once you get the hang of it
- .S files are run through the C preprocessor when you build them with GCC, and .s files are not
- STM32 has a greater set of libraries than the PIC brand MCUs, but it is harder to learn
- Remember to follow the ABI (AAPCS on ARM) when writing ASM functions, otherwise you will be unable to call them from C (unless you use inline asm)
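To illustrate the ABI point: under the ARM procedure call standard the first four arguments arrive in r0 to r3 and the result goes back in r0, while r12 is a scratch register. That is why the assembly routines can be declared in C with the ordinary string.h prototypes. The C below is a rough equivalent of the three routines, with the register mapping in comments. The ref_ names are invented for this sketch so they do not clash with a real libc, and the behaviour matches the assembly only for sizes below 2 GiB, since the assembly treats the count as signed.

```c
#include <stddef.h>

/* r0 = dst, r1 = src, r2 = n; the original dst is returned in r0. */
void *ref_memcpy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    while (n-- > 0) {
        *d++ = *s++;                   /* ldrb r3, [r1], #1 / strb r3, [r0], #1 */
    }
    return dst;                        /* mov r0, r12 */
}

/* r0 = dst, r1 = value, r2 = n; only the low byte of value is stored. */
void *ref_memset(void *dst, int value, size_t n)
{
    unsigned char *d = dst;
    while (n-- > 0) {
        *d++ = (unsigned char)value;   /* strb r1, [r0], #1 */
    }
    return dst;                        /* mov r0, r12 */
}

/* r0 = s; the length, not counting the terminator, is returned in r0. */
size_t ref_strlen(const char *s)
{
    const char *p = s;
    while (*p != '\0') {
        p++;
    }
    return (size_t)(p - s);
}
```

A quick way to check an assembly routine on target is to run it and its C counterpart on the same input and compare the results.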