REAL-TIME OPERATING SYSTEMS: LABORATORY EXERCISES

Janez Puhan

University of Ljubljana
Faculty of Electrical Engineering
Založba FE
Ljubljana, 2019

Cataloguing-in-publication (CIP) record prepared by the National and University Library in Ljubljana
COBISS.SI-ID=299480576
ISBN 978-961-243-381-9 (pdf)
URL: http://fides.fe.uni-lj.si/~janezp/real-time_operating_systems_laboratory_exercises.pdf
Publisher: Založba FE, Ljubljana
Issued by: Fakulteta za elektrotehniko, Ljubljana
Editor: prof. dr. Sašo Tomažič
1st electronic edition

Contents

Preface
1 Installing IDE
2 Tasks and scheduling algorithms in FreeRTOS™
3 Implementing other scheduling algorithms
4 Assembly language function
5 MPU
6 Stack management in FreeRTOS™
7 Heap management in FreeRTOS™
8 Deadlocks
9 Ramp application
A Peripheral device initialization and usage receipts
B External board LCD
Bibliography

Preface

The laboratory exercises described in this script are part of the Real-time operating systems course. The course is held in the third semester of the 2nd Cycle Postgraduate Study Programme in Electrical Engineering, study programme option Electronics, at the Faculty of Electrical Engineering of the University of Ljubljana, Slovenia.

The students are introduced to RTOS¹ concepts through nine laboratory exercises. A µC² system with an ARM³ Cortex-based processor core is used. The Arduino Due board with the Olimex ARM-USB-OCD-H⁵ JTAG⁷ interface serves as the hardware platform. The open source FreeRTOS™ software is used as the operating system platform. The Eclipse IDE⁸ for C/C++ Developers is used as a graphical interface to the GNU⁹ tools (compiler, linker, debugger, etc.). The environment is installed on a PC¹⁰ running the Linux operating system.

A solid knowledge of the C programming language is a prerequisite.

¹ RTOS ... Real-Time Operating System
² µC ... Micro-Controller
³ ARM ... Advanced RISC⁴ Machines
⁴ RISC ... Reduced Instruction Set Computer
⁵ ARM-USB-OCD-H ... ARM - USB⁶ - On-Chip Debugger - High speed
⁶ USB ... Universal Serial Bus
⁷ JTAG ... Joint Test Action Group
⁸ IDE ... Integrated Development Environment
⁹ GNU ... GNU's Not Unix
¹⁰ PC ... Personal computer
FreeRTOS is a trademark of Real Time Engineers Ltd.

Exercise 1
Installing IDE

Prepare a working environment to program the Arduino Due board through the Olimex ARM-USB-OCD-H interface on a PC¹ with Linux pre-installed. Use Eclipse IDE for C/C++ Developers as a graphical interface, GCC² as the ARM cross-compiler, and OpenOCD for communication with the ARM-USB-OCD-H interface. Find the required software on the Internet. Create and cross-compile a template project with an empty main() function. Use ASF³ code for µC initialization from reset to the start of the main() function. Upload the cross-compiled project to the Arduino Due board.

Explanation

Installing the environment

Eclipse is a platform consisting of several components used to develop applications in various programming languages. Since our code will be written in the C programming language, the Eclipse IDE for C/C++ Developers component needs to be installed. The Eclipse IDE for C/C++ Developers tarball can be downloaded from the Eclipse website [1]. Open a terminal window and extract the files from the tarball. To open the terminal window, go to Applications in the menu bar, select Accessories, and run the Terminal program. A terminal window with the user home directory as the current working directory is opened. To extract the tarball (i.e., the eclipse-cpp-kepler-SR2-linux-gtk-x86_64.tar.gz file), use the tar command.

    user@host:~$ tar xvfz eclipse-cpp-kepler-SR2-linux-gtk-x86_64.tar.gz

Eclipse requires a Java VM⁴ to run. Installing the Java SE⁵ Development Kit solves that. The JDK⁶ tarball can be downloaded from the Oracle website [2].
Change into the eclipse directory created by the previous tarball extraction and extract the JDK tarball there.

    user@host:~$ cd eclipse
    user@host:~/eclipse$ tar xvfz jdk-8u45-linux-x64.tar.gz

¹ The Wheezy release of the Debian Linux distribution
² GCC ... GNU Compiler Collection
³ ASF ... Atmel Software Framework
⁴ VM ... Virtual Machine
⁵ SE ... Standard Edition
⁶ JDK ... Java Development Kit

To ensure that Eclipse runs on the installed JVM⁷, the JVM has to be specified in the eclipse.ini file. The file can be edited with an arbitrary text editor (e.g., vi, nano, gedit). Add the following lines before the VM arguments line (i.e., before the -vmargs line) in eclipse.ini.

    -vm
    /home/user/eclipse/jdk1.8.0_45/bin/java

Eclipse will serve as a graphical interface to the GNU tools (compiler, linker, debugger, etc.). The GNU tools (i.e., GCC) for ARM embedded processors will be used. GCC [3] is a collection of compilers supporting various programming languages and targeting various platforms (e.g., µCs or µPs⁸). In our case, the GCC ARM cross-compiler is required. The source code will be cross-compiled on a PC to build an executable for an ARM Cortex processor. The GCC for ARM embedded processors tarball can be downloaded from the GCC ARM Embedded project on the Launchpad website [4]. Extract the tarball (i.e., the gcc-arm-none-eabi-4_9-2015q1-20150306-linux.tar.bz2 file) into the eclipse directory.

    user@host:~/eclipse$ tar xvjf gcc-arm-none-eabi-4_9-2015q1-20150306-linux.tar.bz2

To communicate with the Olimex ARM-USB-OCD-H interface [5], the OCD software is required. The OCD software (i.e., OpenOCD) provides programming and debugging of the target embedded system (i.e., the µC on the Arduino Due board). To do so, a debug interface (i.e., the Olimex ARM-USB-OCD-H) is needed to produce the required electric signals (i.e., JTAG). In our case, Eclipse will communicate with the ARM-USB-OCD-H interface.
Thus, the GNU ARM Eclipse OpenOCD distribution of the OpenOCD project [6] will be installed. The tarball can be downloaded from the GNU ARM Eclipse Plug-ins project on the Sourceforge website [7]. Extract the tarball (i.e., the gnuarmeclipse-openocd-debian64-0.8.0-201503201909.tgz file) into the eclipse directory.

    user@host:~/eclipse$ tar xvfz gnuarmeclipse-openocd-debian64-0.8.0-201503201909.tgz

The OpenOCD software needs to be properly configured to use the selected debug interface (i.e., the Olimex ARM-USB-OCD-H) talking to the selected target embedded system (i.e., the Atmel AT91SAM3X8E µC [8] on the Arduino Due board [9]). Create the following openocd.cfg configuration file in the openocd/0.8.0-201503201909/scripts subdirectory of the eclipse directory.

    source [find interface/ftdi/olimex-arm-usb-ocd-h.cfg]
    source [find target/at91sam3ax_8x.cfg]
    $_TARGETNAME configure -event gdb-attach {
        echo "Halting target"
        halt
    }

The Olimex ARM-USB-OCD-H interface and the AT91SAM3X8E Arduino Due µC are specified in openocd.cfg. In addition, the target processor is halted on the GDB⁹ attach event¹⁰. Use the chmod command to set the openocd.cfg permissions to read/write for the owner and read for everyone else.

    user@host:~/eclipse$ chmod 644 openocd/0.8.0-201503201909/scripts/openocd.cfg

⁷ JVM ... Java Virtual Machine
⁸ µP ... MicroProcessor

The OpenOCD software needs the lib32ncurses5 package to be installed. The libcanberra-gtk-module package is also required by Eclipse. To install both packages, root user permissions are required.

    root@host:~# apt-get update
    root@host:~# apt-get install lib32ncurses5
    root@host:~# apt-get install libcanberra-gtk-module

The Olimex ARM-USB-OCD-H interface is identified by the udev daemon when plugged in. The udev daemon identifies a new device and creates its name according to the rules in the /etc/udev/rules.d directory. The 99-openocd.rules file contains rules for various interfaces (including the ARM-USB-OCD-H) that OpenOCD can work with.
It has to be copied into the /etc/udev/rules.d directory. The rules have to be reloaded to take effect. Root user permissions are required.

    root@host:~# cp /home/user/eclipse/openocd/0.8.0-201503201909/contrib/99-openocd.rules /etc/udev/rules.d
    root@host:~# udevadm control --reload-rules

To use the Olimex ARM-USB-OCD-H interface, the user has to be a member of the plugdev group.

    root@host:~# usermod -a -G plugdev user

It is time to run the freshly installed Eclipse for the first time.

    user@host:~/eclipse$ ./eclipse

To make the Eclipse environment work with the Olimex ARM-USB-OCD-H OpenOCD interface and the AT91SAM3X8E µC, the Eclipse extensions for GNU tools for ARM embedded processors have to be installed. These extensions are provided by the GNU ARM plug-ins. Since debugging sessions are powered by the GDB, the C/C++ GDB Hardware Debugging plug-in is a prerequisite. It is a part of the CDT¹¹ plug-ins. The CDT zip file (i.e., the cdt-master-8.3.0.zip file) can be downloaded from the Eclipse website [1].

To install the C/C++ GDB Hardware Debugging plug-in into Eclipse, select the Install New Software... menu item from the Help menu in the menu bar. The Install dialog box opens. Press the Add... button to add a new software repository. In the Add Repository dialog box shown in Fig. 1.1, specify the CDT repository, i.e., Name: CDT, Location: absolute path to the cdt-master-8.3.0.zip file. After the repository is specified, the Install dialog box regains the focus. Select the C/C++ GDB Hardware Debugging plug-in from the CDT Optional Features list as shown in Fig. 1.2. Press the Next > button and follow the installation procedure.

⁹ GDB ... GNU debugger
¹⁰ Occurs when the GDB connects to the target (i.e., at the beginning of the debug session).
¹¹ CDT ... C/C++ Development Tooling

Install the GNU ARM plug-ins in the same manner.
The zip file (i.e., the ilg.gnuarmeclipse.repository-2.8.1-201504061754.zip file) can be downloaded from the GNU ARM Eclipse Plug-ins project on the Sourceforge website [7]. This time specify the repository as Name: GNU ARM Eclipse Plug-ins, and Location: absolute path to the ilg.gnuarmeclipse.repository-2.8.1-201504061754.zip file. In the Install dialog box select the entire package of the GNU ARM C/C++ Cross Development Tools plug-ins.

Finally, a path to the OpenOCD binary directory has to be configured in the Eclipse environment. To do so, select the Preferences menu item from the Window menu in the menu bar. The Preferences dialog box opens. Select the String Substitution item from Run/Debug as shown in Fig. 1.3. Select the openocd_path variable and press the Edit... button. In the Edit Variable: openocd_path dialog box specify the absolute path to the OpenOCD binary directory, i.e., the absolute path to the openocd/0.8.0-201503201909/bin directory.

Eclipse with its plug-ins, the GNU tools and the OpenOCD software are now installed as shown in Fig. 1.4.

A few handy, optional settings of the Eclipse environment follow. They ease the usage of the created working environment.
• Window | Preferences → Preferences dialog box → General | Editors | Text Editors → enable Show print margin and Show line numbers → press the Apply button
• Window | Preferences → Preferences dialog box → General | Workspace → disable Build automatically and enable Save automatically before build → press the Apply button
• Window | Preferences → Preferences dialog box → C/C++ | Build | Console → enable Bring console to top when building (if present) and Wrap lines on the console, set Limit console output (number of lines) to 5000 → press the Apply button
• Window | Preferences → Preferences dialog box → C/C++ | Code Analysis → disable all problems → press the Apply button
• Window | Preferences → Preferences dialog box → C/C++ | Code Style | Formatter → set Active profile to GNU [built-in] → press the Apply button
• Window | Preferences → Preferences dialog box → C/C++ | Editor → in the Documentation tool comments section, set Workspace default to Doxygen → press the Apply button
• Window | Preferences → Preferences dialog box → C/C++ | Editor | Folding → in the Initially fold this region types section, disable Header Comments → press the Apply button
• Window | Preferences → Preferences dialog box → C/C++ | Indexer → in the Build configuration for the indexer section, select Use active build configuration → press the Apply button
• Window | Preferences → Preferences dialog box → Run/Debug | Launching → in the Launch Operation section, enable Always launch the previously launched application → press the Apply button

Selecting an appropriate µC boot mode

The Atmel AT91SAM3X8E µC on the Arduino Due board has three non-volatile memory blocks that can retain their contents when not powered. Those are ROM¹² (16 kB) starting at the 0x00000000 address, the first Flash memory bank (256 kB) starting at 0x00080000, and the second Flash memory bank (256 kB) starting at 0x000c0000. The reset vector¹³ of the µC can reside in any of them.
The location of the reset vector is selected by the GPNVM¹⁴ bits (see Tab. 1.1) [8].

    GPNVM bit                 if bit value = 0         if bit value = 1
    0 (security bit)          Flash access enabled     Flash access disabled
    1 (boot mode selection)   reset vector in ROM      reset vector in Flash
    2 (flash selection)*      reset vector in Flash0   reset vector in Flash1
    * used only when GPNVM bit1 = 1

    Table 1.1: GPNVM bits

Any kind of outside access to Flash is disabled when the GPNVM bit0 is set. Therefore, the code in the Flash is protected and cannot be read by a third party. The protected code can only be deleted by tying the Erase pin to a high voltage level for at least 220 ms (i.e., pressing the ERASE button on the Arduino Due board for 220 ms [9]). The GPNVM bits are also erased by this procedure. Thus, access to the fresh empty Flash is enabled. Of course, the GPNVM bit0 must not be set during code development.

The GPNVM bit1 selects the location of the reset vector. When ROM is selected, the SAM-BA¹⁵ program hard-coded there is started. It programs¹⁶ the on-chip Flash memory via the UART¹⁷ or USB. On the other hand, when the Flash is selected, the reset vector is read from the first or the second Flash bank, depending on the GPNVM bit2. Since the GDB will be used for uploading the code into the on-chip Flash, the GPNVM bit1 must be set. SAM-BA will not be used.

The code can be compiled for either the first or the second Flash bank. The bank is selected in the linker script provided by the ASF. Since the ASF uses the first Flash bank (i.e., Flash0), the GPNVM bit2 must not be set.

The default value of the GPNVM bits is zero (i.e., after the ERASE button is pressed). To get the desired values (i.e., GPNVM bits = 0b010), the GPNVM bits have to be set with OpenOCD. Plug in the Olimex ARM-USB-OCD-H debug interface with the Arduino Due board connected over the JTAG. Open two terminal windows (Applications → Accessories → Terminal). In the first terminal start the OpenOCD debugger.
    user@host:~/eclipse/openocd/0.8.0-201503201909/bin$ ./openocd
    GNU ARM Eclipse 64-bit Open On-Chip Debugger 0.8.0-00063-gbda7f5c (2015-01-01-00:00)
    Licensed under GNU GPL v2
    For bug reports, read
    http://openocd.sourceforge.net/doc/doxygen/bugs.html
    Info : only one transport option; autoselect 'jtag'
    adapter speed: 500 kHz
    adapter_nsrst_delay: 100
    jtag_ntrst_delay: 100
    cortex_m reset_config sysresetreq
    adapter speed: 500 kHz
    Info : clock speed 500 kHz
    Info : JTAG tap: sam3.cpu tap/device found: 0x4ba00477 (mfg: 0x23b, part: 0xba00, ver: 0x4)
    Info : sam3.cpu: hardware has 6 breakpoints, 4 watchpoints

¹² ROM ... Read-Only Memory
¹³ Reset vector is loaded into the program counter register at power-up. It defines the µC starting address.
¹⁴ GPNVM ... General Purpose Non-Volatile Memory
¹⁵ SAM-BA ... Smart ARM MCU¹⁸ - Boot Assistant
¹⁶ SAM-BA starts the FFPI¹⁹ to program the on-chip Flash.
¹⁷ UART ... Universal Asynchronous Receiver/Transmitter
¹⁸ MCU ... µC Unit
¹⁹ FFPI ... Fast Flash Programming Interface

Connect to the OpenOCD debugger via telnet in the second terminal. Use localhost port 4444. The GPNVM bits can be set and viewed with the at91sam3 OpenOCD command [6].

    user@host:~$ telnet localhost 4444
    Trying ::1...
    Trying 127.0.0.1...
    Connected to localhost.
    Escape character is '^]'.
    Open On-Chip Debugger
    > reset init
    JTAG tap: sam3.cpu tap/device found: 0x4ba00477 (mfg: 0x23b, part: 0xba00, ver: 0x4)
    target state: halted
    target halted due to debug-request, current mode: Thread
    xPSR: 0x01000000 pc: 0x0010004c msp: 0x20001000
    > at91sam3 gpnvm clr 0
    > at91sam3 gpnvm set 1
    > at91sam3 gpnvm clr 2
    > at91sam3 gpnvm
    sam3-gpnvm0: 0
    sam3-gpnvm1: 1
    sam3-gpnvm2: 0
    > exit
    Connection closed by foreign host.

Creating a template project

To create an empty template project in the Eclipse environment, select the Project... submenu item from the File | New menu. In the New Project dialog box shown in Fig. 1.5 select the C/C++ | C Project. In the next C Project dialog box, shown in Fig.
1.6, set the Project name and select Makefile project | Empty Project for the Project type, and Cross ARM GCC for the Toolchains. An empty makefile project is created.

A path to the GNU tools directory (i.e., /home/user/eclipse/gcc-arm-none-eabi-4_9-2015q1/bin) has to be set. Highlight the project in the Project Explorer view of the C/C++ perspective²⁰. Select the Properties menu item from the Project menu. In the Properties for <project name>²¹ dialog box set the following:

• C/C++ Build | Settings → Toolchains tab → press the Apply button²²
• C/C++ Build | Environment → press the Add... button → New variable dialog box → set variable Name to PATH and Value to /home/user/eclipse/gcc-arm-none-eabi-4_9-2015q1/bin → press the OK button → back in the Properties for <project name> dialog box press the Apply button (see Fig. 1.7)

Atmel provides the ASF software library for its µCs. It contains source code for µC initialization, APIs²³ to peripheral units, etc. For Cortex-based processors, the CMSIS²⁴ provided by ARM [10] is included. The ASF software library can be downloaded from the Atmel website [11]. It comes as a standalone archive file (i.e., as an asf-standalone-archive-3.21.0.6.zip file). Makefiles and linker scripts are added, so the code can be compiled and linked using the GCC [3] and the GNU Make utility [12].

The AT91SAM3X8E µC source code files accompanied by makefiles and a linker script (all extracted from the ASF library) can be downloaded from [13]. Note that only the files needed in these laboratory exercises are included. The directory structure in [13] is branched and contains many files, which can be a bit confusing. Thus, the three key files are pointed out here:

• The sam/utils/cmsis/sam3x/source/templates/gcc/startup_sam3x.c file contains the exception table. The second entry in the table is the reset vector loaded into the program counter register at power-up. The reset vector refers to the Reset_Handler() function also defined in this file. Thus, the µC starts in Reset_Handler(), which after some basic initialization²⁵ calls the main() function. The main() function is considered the beginning of the program in the C programming language.
• The config.mk file contains the build settings used by the GNU Make utility. The compiler and linker flags, linker script filename, output (elf) filename, list of C and assembly source files, include paths, library paths, etc., are defined here.
• The sam/utils/linker_scripts/sam3x/sam3x8/gcc/flash.ld file is the linker script. Among other things, the address of the selected Flash memory bank is defined here (see Selecting an appropriate µC boot mode).

Extract and copy the files from [13] into the project directory, e.g., /home/user/workspace/<project name>.

To run and debug the freshly created template project, a debug configuration has to be defined. Highlight the project in the Project Explorer view of the C/C++ perspective. Select the Debug Configurations... menu item from the Run menu. In the Debug Configurations dialog box select the GDB OpenOCD Debugging item and click the New button. Select the newly created Default debug configuration under the GDB OpenOCD Debugging item. Define the configuration settings in the tabs on the right side of the Debug Configurations dialog box as shown in Fig. 1.8.

²⁰ For an explanation of the Eclipse environment views and perspectives, consult the documentation pages on the Eclipse website [1].
²¹ The name template is used in the figures.
²² The toolchain settings have to be applied although no changes are made. Otherwise the build command (i.e., make) is not set. This is a bug.
²³ API ... Application Programming Interface
²⁴ CMSIS ... Cortex Microcontroller Software Interface Standard
²⁵ Copy the relocate segment to RAM²⁶, clear the bss segment, set the exception table address (e.g., to 0x00080000 = the first Flash memory bank), all according to the linker script, and initialize the libc standard C library.
Note that the first entry in the exception table is the initial stack top address loaded into the r13 SP²⁷ register at power-up.
²⁶ RAM ... Random Access Memory
²⁷ SP ... Stack Pointer

In the Main tab, define the C/C++ Application executable file as it is defined in the config.mk file. The TARGET_FLASH = ....elf line defines the name of the elf file.

In the Debugger tab, define the OpenOCD Config options. The options reside in the openocd.cfg file (see Installing the environment), which has to be passed as an argument to the OpenOCD executable.

In the Startup tab, the debug starting point can be defined with the Set breakpoint at option. After the upload, the program on the target µC board starts executing. It stops at the first breakpoint, which is set to the main() function by default. By changing the Set breakpoint at option, the initial breakpoint can be placed elsewhere (e.g., at the Reset_Handler() function right "after" the reset vector and even before the basic initializations).

In the Common tab, the directory containing the *.launch file, where the settings are saved, is specified.

There is an inconsistency between the standard cdefs.h²⁸ and the ASF compiler.h²⁹ header files. Both define the __always_inline³⁰ macro. Therefore, one of the definitions is redundant. Since the definitions are not exactly identical, the ASF definition is used and the definition in the cdefs.h header file is commented out. With this minor hack, the project can be compiled by selecting the Build Project menu item from the Project menu.

To upload the compiled elf file to the target µC board (i.e., the Arduino Due board) and start a debug session, select the Debug Configurations... menu item from the Run menu. Select the project under the GDB OpenOCD Debugging item and press the Debug button.

Enabling serial communication over the USB

When the Programming USB port on the Arduino Due board is connected to a Linux PC, it is identified as a new serial device (e.g., /dev/ttyACM0).
The user can access such a device only as a member of the dialout group. The super user can add the user to the group with the following command:

    root@host:~# usermod -a -G dialout user

The change takes effect at the next login.

A serial terminal program is also required for serial communication. The PuTTY serial console can be used. It can be installed on a Linux PC with the commands:

    root@host:~# apt-get update
    root@host:~# apt-get install putty

carried out as the super user. To start PuTTY, select the PuTTY SSH Client submenu item from the Applications | Internet menu. Note that the serial terminal settings must match the UART configuration (see Appendix A). An example of the PuTTY serial settings is shown in Fig. 1.9.

²⁸ Located in the include/sys subdirectory, e.g., /home/user/eclipse/gcc-arm-none-eabi-4_9-2015q1/arm-none-eabi/include/sys/cdefs.h.
²⁹ Located in the sam/utils subdirectory, e.g., /home/user/workspace/<project name>/sam/utils/compiler.h.
³⁰ In line 359 of cdefs.h and in line 162 of compiler.h.

Exercise 2
Tasks and scheduling algorithms in FreeRTOS™

Create four FreeRTOS [14] tasks. Assign the same above-idle priority to two tasks, and the idle priority to the other two. No task should ever end; each should run in an infinite loop. Assign an LED¹ and a button key on the external board to each task. In every iteration of the infinite loop, a task should turn its LED on and all the other LEDs off to indicate which task is running. The Idle task should turn all the LEDs off. Also, in every iteration of the infinite loop, a task should check all the button keys. If a button key belonging to a particular task is pressed, that task should be suspended; if it is released, the task should be resumed. Thus, a task is suspended while its button key is pressed, and vice versa. Observe the various scheduling algorithms available in FreeRTOS. Can a running task explicitly request a context switch? How can a delay be effectively implemented?
Explanation

Create a new empty project in the Eclipse working environment as explained in Exercise 1. Connect the Arduino Due board to the host PC over the Olimex ARM-USB-OCD-H interface, and to the external board button keys and LEDs, as shown in Fig. 2.1.

The default main() function of the freshly created empty project can be found in the src/main.c file. The first function called is prvSetupHardware(), where the hardware initialization is performed. Inside the prvSetupHardware() function, the functions sysclk_reinit(), NVIC_SetPriorityGrouping() and board_init() perform the basic initialization of the AT91SAM3X8E µC and the Arduino Due board. To be able to read the keys and drive the LEDs, the pins from PC21 to PC26, PC28 and PC29 have to be configured as GPIO² pins (see Fig. 2.1). The keys require pull-up resistors on their input pins, while debouncing filters are not necessary. See Appendix A for GPIO pin configuration and usage.

FreeRTOS configuration

The FreeRTOS properties can be set by options in the src/FreeRTOSConfig.h configuration file. The relevant options will be introduced in parallel with the explanation of the exercises in this script.

¹ LED ... Light Emitting Diode
² GPIO ... General Purpose I/O³
³ I/O ... Input Output

Tasks in FreeRTOS

A task is a standalone program implemented as a function taking one void pointer argument, i.e., a function of type void func(void *). FreeRTOS can 'simultaneously' run more than one task by using the time slicing technique. A task can be in one of four states (see Fig. 2.2):

• ready; the task is ready to run whenever scheduled,
• running; there is always exactly one task in the running state,
• suspended; the task stays suspended until resumed, and
• blocked; the task is waiting for an event.

Task creation and FreeRTOS starting

A new task is created by the xTaskCreate() function⁵ [15] [16].
The declaration of the function⁶ is:

    BaseType_t xTaskCreate( TaskFunction_t pvTaskCode,
                            char *pcName,
                            unsigned short usStackDepth,
                            void *pvParameters,
                            UBaseType_t uxPriority,
                            TaskHandle_t *pxCreatedTask );

The function returns the pdPASS value on success. The function will fail if there is not enough heap memory available to allocate the task's stack and TCB⁷. A newly created task is placed into the ready state. The function arguments are:

    pvTaskCode    ... pointer to task function
    pcName        ... task name⁸
    usStackDepth  ... task's stack size in words⁹
    pvParameters  ... pointer passed to the task as an argument
    uxPriority    ... task priority (zero is the lowest priority)
    pxCreatedTask ... pointer to where a handle to the created task is returned; if NULL, the handle is not returned

For instance, the function call

    xTaskCreate(tsk, "Task", 150, (void *)2, 1, &xHnd);

creates a task named "Task" coded in the tsk() function of type void tsk(void *). It has 600 bytes of stack (150 words of four bytes each). The pointer passed to the task function, e.g., arg, has the value 0x00000002. The task priority is one, just above the lowest. The handle to the newly created task is written into the xHnd variable defined as TaskHandle_t xHnd;. Note that xHnd is actually a void pointer, and &xHnd is the address where this void pointer resides.

The FreeRTOS scheduler is started by the vTaskStartScheduler() function [15] [16] declared as:

    void vTaskStartScheduler( void );

If the scheduler is started successfully, the function will never return. The FreeRTOS scheduler starts to 'simultaneously' execute the created tasks following the scheduling algorithm. The function also creates an additional Idle task with the lowest priority. Therefore, when no application-created task is ready to be scheduled into the running state, the Idle task is available. Note that exactly one task must be in the running state at all times.
Starting the FreeRTOS scheduler will fail if there is not enough heap memory available for allocating the Idle task. In that case, the vTaskStartScheduler() function returns. FreeRTOS is thus started with a simple vTaskStartScheduler() call:

    vTaskStartScheduler();

⁵ The configSUPPORT_DYNAMIC_ALLOCATION option must not be set to zero in the src/FreeRTOSConfig.h configuration file to make the xTaskCreate() function available.
⁶ BaseType_t is the most suitable integer type for the architecture, i.e., a 32-bit integer type for the AT91SAM3X8E µC. TaskFunction_t type is a pointer to a function of type: void func(void *arg). UBaseType_t is the most suitable unsigned integer type for the architecture, i.e., an unsigned 32-bit integer type for the AT91SAM3X8E µC. TaskHandle_t is a void pointer type.
⁷ TCB ... Task Control Block, i.e., task data used by FreeRTOS
⁸ The maximum number of characters (terminating null character included) in a task name is set by the configMAX_TASK_NAME_LEN option in the src/FreeRTOSConfig.h configuration file.
⁹ For AT91SAM3X8E, one word is four bytes.

Pseudo code that creates the required tasks and starts the FreeRTOS scheduler is as follows:

    create two tasks with priority one
    create two tasks with priority zero (idle priority)
    start scheduler

To use the xTaskCreate() and vTaskStartScheduler() functions, and the accompanying data types, the FreeRTOS.h and task.h header files have to be included.

    #include <FreeRTOS.h>
    #include <task.h>

Suspending and resuming a task

Before discussing the actual implementation of the task functions, it is necessary to explain how a task can be suspended and resumed. A task can be suspended by the vTaskSuspend() function¹⁰ call [15] [16]. The declaration of the function is:

    void vTaskSuspend( TaskHandle_t pxTaskToSuspend );

The function suspends the specified task, i.e., places the task into the suspended state. The argument of the function is:

    pxTaskToSuspend ... task handle

The following vTaskSuspend() function call, for example, suspends a task specified by the xHnd handle:

    vTaskSuspend(xHnd);

Passing NULL as the argument is equivalent to passing the handle of the calling task, i.e., the task suspends itself.

To resume a suspended task, the vTaskResume() function¹⁰ can be used [15] [16]. The declaration of the function is:

    void vTaskResume( TaskHandle_t pxTaskToResume );

The function call transfers the specified task from the suspended into the ready state (see Fig. 2.2). The function call has no effect if the task to be resumed is not in the suspended state. The argument of the function is:

    pxTaskToResume ... task handle

The following vTaskResume() function call, for instance, resumes a previously suspended task specified by the xHnd handle:

    vTaskResume(xHnd);

¹⁰ The INCLUDE_vTaskSuspend option must be set to one in the src/FreeRTOSConfig.h configuration file to make the vTaskSuspend() and vTaskResume() functions available.

Task implementation

The four tasks in this exercise are actually four instances of the same algorithm. In every iteration of an endless loop, the LED corresponding to the task is turned on, and the tasks belonging to currently pressed keys are suspended. Other LEDs are turned off, and other tasks are resumed. The pseudo code of the algorithm is as follows:

    while forever
        turn LED belonging to this task on, others off
        get key positions
        suspend tasks belonging to pressed keys, resume others

The Idle task

When FreeRTOS is started, an additional Idle task with the lowest (i.e., zero or idle) priority is created. The stack depth of the Idle task is set by the configMINIMAL_STACK_SIZE option in the src/FreeRTOSConfig.h configuration file. The Idle task basically just waits in an endless loop. Its only real job is releasing system resources, e.g., heap memory, after an application task is deleted. Otherwise, the Idle task is equivalent to any other application-created task with idle priority.
The Idle task is invisible to the application by default. If the configUSE_IDLE_HOOK option is set to one in the src/FreeRTOSConfig.h configuration file, the vApplicationIdleHook() callback function is called in every iteration of the Idle task endless loop [15] [16]. The function has to be defined in the application code as:

    void vApplicationIdleHook(void)
    {
        ...
    }

The Idle task calls the vApplicationIdleHook() function regularly. The function has to return within a short period of time, or else the Idle task cannot release resources promptly. To avoid an error when no task is available to enter the running state, the Idle task must never be blocked, suspended or deleted.

The Idle task functionality required in this exercise is implemented in the vApplicationIdleHook() function. The pseudo code of the function is as follows:

    turn all LEDs off
    get key positions
    suspend tasks belonging to pressed keys, resume others

Scheduling algorithms

With all the code ready, different scheduling algorithms can be tested.

Cooperative scheduling is selected when the configUSE_PREEMPTION option is set to zero in the src/FreeRTOSConfig.h configuration file. A context switch occurs when the running task ends, is blocked, is suspended, or explicitly requests a switch. The running task cannot be preempted by a higher priority task. The next task to enter the running state is the ready task with the highest priority. If there is more than one candidate, the task that has been in the ready state longest is selected.

A context switch can be explicitly requested by calling the taskYIELD() function [15] [16] declared as:

    void taskYIELD( void );

Thus, the running task places itself into the ready state by the following call:

    taskYIELD();

The Idle task requests a context switch in every iteration of its endless loop. With cooperative scheduling, the request cannot be canceled by setting the configIDLE_SHOULD_YIELD option to zero. Therefore, the Idle task should not be able to block any application task.
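The selection rule described above (highest priority first, and among equal priorities the task that has waited longest) can be illustrated with a small host-side model. This is only a sketch: the ready_task_t structure and the pick_next() function are hypothetical names introduced for illustration, they are not part of the FreeRTOS API, and the kernel's internal data structures differ. Tick counter wrap-around is ignored.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of a ready task; not a FreeRTOS structure. */
typedef struct {
    unsigned priority;     /* larger number = higher priority */
    uint32_t ready_since;  /* tick at which the task entered the ready state */
} ready_task_t;

/* Returns the index of the task to run next, or -1 if no task is ready. */
int pick_next(const ready_task_t *tasks, size_t count)
{
    int best = -1;
    for (size_t i = 0; i < count; i++) {
        if (best < 0 ||
            tasks[i].priority > tasks[best].priority ||
            (tasks[i].priority == tasks[best].priority &&
             tasks[i].ready_since < tasks[best].ready_since)) {
            best = (int)i;  /* higher priority, or same priority but ready longer */
        }
    }
    return best;
}
```

For two ready tasks with priority one and one ready task with idle priority, the model selects the priority-one task that became ready first.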
Prioritized preemptive scheduling without time slicing is selected when the configUSE_PREEMPTION option is set to one and the configUSE_TIME_SLICING option is set to zero in the src/FreeRTOSConfig.h configuration file. A context switch occurs when a task with higher priority than the running task becomes ready, or when the running task ends, is blocked, is suspended, or explicitly requests a switch. The next task to enter the running state is the ready task with the highest priority. If there is more than one candidate, the task that has been in the ready state longest is selected. The Idle task requests a context switch in every iteration of its endless loop. The requests can be canceled by setting the configIDLE_SHOULD_YIELD option to zero. Therefore, the Idle task can block an idle priority application task.

Prioritized preemptive scheduling with time slicing is selected when the configUSE_PREEMPTION and configUSE_TIME_SLICING options are set to one in the src/FreeRTOSConfig.h configuration file. The time slice frequency is set by the configTICK_RATE_HZ option. The time slice length¹¹ is therefore 1/configTICK_RATE_HZ seconds. With this scheduling algorithm, a context switch occurs when a task with higher priority than the running task becomes ready, or at the beginning of a new time slice if a task with the running task's priority is ready, or when the running task ends, is blocked, is suspended, or explicitly requests a switch. The next task to enter the running state is the ready task with the highest priority. If there is more than one candidate, the task that has been in the ready state longest is selected. The Idle task requests a context switch in every iteration of its endless loop. The requests can be canceled by setting the configIDLE_SHOULD_YIELD option to zero. Regardless of canceling, the Idle task cannot block any application task since a context switch takes place at the beginning of a new time slice.
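The relation between the tick rate and the time slice length can be checked with a short host-side calculation. The sketch below assumes that the tick is generated by a 24-bit down-counting timer (such as the Cortex-M SysTick) clocked at the master clock frequency, and that its reload value is computed as clock/rate − 1. The function names systick_reload() and tick_rate_is_feasible() are hypothetical and introduced only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define SYSTICK_MAX_RELOAD 0xFFFFFFUL  /* largest 24-bit reload value: 2^24 - 1 */

/* Timer counts per time slice, minus one (the counter counts down to zero).
   The caller must pass a non-zero tick rate. */
uint32_t systick_reload(uint32_t clock_hz, uint32_t tick_rate_hz)
{
    return clock_hz / tick_rate_hz - 1U;
}

/* True if one time slice of 1/tick_rate_hz seconds fits into the 24-bit timer. */
bool tick_rate_is_feasible(uint32_t clock_hz, uint32_t tick_rate_hz)
{
    return tick_rate_hz != 0U && tick_rate_hz <= clock_hz &&
           systick_reload(clock_hz, tick_rate_hz) <= SYSTICK_MAX_RELOAD;
}
```

Under these assumptions, an 84 MHz clock with a tick rate of 1000 Hz gives a reload value of 83999, while a tick rate of 5 Hz (a 200 ms slice) no longer fits into 24 bits.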
Delay

If an application task needs to wait for a predefined amount of time, the most efficient way is to place the task into the blocked state for that period. That can be achieved with the vTaskDelay() function¹³ [15] [16] declared as:

    void vTaskDelay( TickType_t xTicksToDelay ); ¹⁴

The function places the calling task into the blocked state for the specified number of time slices, i.e., ticks. When the period expires, the task is placed into the ready state. Inconveniently, the delay interval has to be specified as a number of ticks. To convert milliseconds into a number of ticks, the pdMS_TO_TICKS() macro can be used [15] [16]. For instance, the following vTaskDelay() call makes a task wait for 2 s:

    vTaskDelay(pdMS_TO_TICKS(2000));

Note that the vTaskDelay() function must never be called from the vApplicationIdleHook() callback function since the Idle task must not be placed into the blocked state.

Include a delay into one or more of the four tasks and observe the scheduling algorithms.

¹¹ The share of time consumed by the operating system overhead increases as the time slice gets shorter. Therefore, the time slice should not be too short. On the other hand, the maximum time slice length is 1 s, which cannot be achieved by any MCK¹² frequency since FreeRTOS uses the 24-bit SysTick timer, e.g., $t_{\mathrm{SLICEMAX}}\big|_{f_{\mathrm{MCK}}=84\,\mathrm{MHz}} = \frac{2^{24}-1}{f_{\mathrm{MCK}}} < 200\,\mathrm{ms}$.
¹² MCK ... Master Clock
¹³ The INCLUDE_vTaskDelay option must be set to one in the src/FreeRTOSConfig.h configuration file to make the vTaskDelay() function available.
¹⁴ TickType_t is an unsigned 32-bit integer by default. It will be set to an unsigned 16-bit integer if the configUSE_16_BIT_TICKS option is set to one.

Exercise 3
Implementing other scheduling algorithms

Write code for four finite tasks with various BTs¹. A task should run in a finite loop and end after a predefined number of iterations defining the BT. Use button keys on the external board as asynchronous task triggers.
Each time a button key is pressed, a request for a single run of the corresponding finite task should be issued. Use LEDs on the external board to indicate which task is currently running. Implement the FCFS², SJF³, SRTF⁴ and RR⁵ scheduling algorithms by dynamically adjusting the task priorities. Which scheduling algorithms are cooperative, and which are preemptive? Which can cause the convoy effect, and/or CPU⁶ starvation? Why does FreeRTOS sometimes skip a task, i.e., the task is not run although a request was issued?

Explanation

Create a new empty project in the Eclipse working environment as explained in Exercise 1. Connect the Arduino Due board to the host PC over the Olimex ARM-USB-OCD-H interface, and to the external board button keys and LEDs, as shown in Fig. 2.1. Configure the pins from PC21 to PC26, PC28 and PC29 as GPIO pins. The pull-up resistors are required on input pins, i.e., keys. To properly detect key presses, the debouncing filters are also needed. See Appendix A for GPIO pin configuration and usage.

Start the FreeRTOS scheduler as explained in Exercise 2. As FreeRTOS is started without any tasks created, the only task at the beginning is the Idle task.

Ending a finite task

A finite task is usually executed at an event, i.e., a key press. The task ends when its job is done, e.g., the event is handled. In FreeRTOS, a task is a standalone function that should never return. Therefore, it cannot just end. The vTaskDelete() function⁷ [15] [16] must be called instead. The declaration of the function is:

    void vTaskDelete( TaskHandle_t pxTask );

¹ BT ... Burst Time, i.e., task's running state time
² FCFS ... First Come First Serve
³ SJF ... Shortest Job First
⁴ SRTF ... Shortest Remaining Time First
⁵ RR ... Round Robin
⁶ CPU ... Central Processing Unit
⁷ The INCLUDE_vTaskDelete option must be set to one in the src/FreeRTOSConfig.h configuration file to make the vTaskDelete() function available.
The function informs the FreeRTOS kernel to delete the specified task. In general, any task can delete any other task. The task related system resources are released later by the Idle task (see Exercise 2). The function argument is:

    pxTask ... task handle

Passing NULL as the argument is equivalent to passing the handle of the calling task, i.e., the task deletes itself. In that case, the function never returns.

To end a finite task, the task must call the vTaskDelete() function and delete itself before ending:

    void func(void *arg)
    {
        ...
        vTaskDelete(NULL);
    }

To use the vTaskDelete() function, the FreeRTOS.h and task.h header files have to be included.

    #include <FreeRTOS.h>
    #include <task.h>

FCFS scheduling algorithm

The FCFS scheduling algorithm executes tasks one after another in FIFO⁸ order as their requests arrive. There is no priority and no preemption.

In FreeRTOS, the FCFS scheduling algorithm can be implemented by cooperative scheduling with all tasks having the same idle priority. The idle priority allows the Idle task to participate, thus being able to release system resources after a task ends.

Since there is no task preemption, the FCFS is a cooperative algorithm.

While a long BT task is being executed, new single run task requests pile up. The newly arrived tasks are made to wait in line. Every request, including those for short BT tasks, is added at the end of the line. Such an accumulation of waiting tasks is called the convoy effect. The convoy effect can significantly increase the average waiting time⁹, which leads to lower CPU utilization [17]. The FCFS scheduling algorithm can cause the convoy effect.

A task is starving of CPU time when it is continuously denied entry to the running state. The CPU starvation effect occurs when a scheduling algorithm denies one or more tasks to be scheduled for an infinite amount of time. Assuming finite task BTs, the FCFS scheduling algorithm cannot cause CPU starvation under any task arrival scheme.

FreeRTOS was started without any tasks created.
The only task at the beginning is the Idle task. It turns all the LEDs off, and creates a task corresponding to the newly pressed key. The created task is placed into the ready state, which is in fact a run request. The Idle task functionality can be coded in the vApplicationIdleHook() callback function (see Exercise 2). The pseudo code of the function is:

    turn all LEDs off
    for each of the four keys
        if the key is down and was up in the previous Idle task iteration
            create new key corresponding task
        save key position for the next Idle task iteration

⁸ FIFO ... First In First Out
⁹ An average time interval between task request arrival and entering the running state.

Current key positions are saved into a global array in each iteration of the Idle task. They are used to detect a key pressed event.

The tasks implement the same algorithm in a finite loop. The pseudo code of the finite task algorithm is as follows:

    for predefined number of iterations
        turn LED belonging to this task on, others off
        for each of the four keys
            if the key is down and was up in the previous iteration
                create new key corresponding task
            save key position for the next iteration
    delete this task

A new task is created on every key pressed event. Each task requires some operating system resources, e.g., some heap space. If the line of waiting tasks gets long, there may not be enough space left to create another task. The xTaskCreate() function then fails (see Exercise 2), and the run request gets skipped. Obviously, skipping is more common when the heap size is small. The heap size can be set by the configTOTAL_HEAP_SIZE option in the src/FreeRTOSConfig.h configuration file.

SJF scheduling algorithm

The SJF scheduling algorithm executes tasks one after another. There is no preemption. A task with a shorter BT has higher priority, though. Therefore, a new task request is not just added at the end of the waiting line, but is inserted into the line according to its BT.
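The difference between the FCFS and the SJF ordering of the waiting line can be made concrete with a small host-side calculation of the average waiting time. This is a sketch under simplifying assumptions: all requests arrive at time zero, the burst times are given in arbitrary units, and the function names average_wait() and sort_shortest_first() are hypothetical.

```c
#include <stddef.h>
#include <stdlib.h>

/* Average waiting time when the tasks run back to back in the given order.
   All requests are assumed to arrive at time zero. */
double average_wait(const unsigned *burst, size_t count)
{
    unsigned long waited = 0, elapsed = 0;
    for (size_t i = 0; i < count; i++) {
        waited += elapsed;     /* task i waits for all tasks ahead of it */
        elapsed += burst[i];
    }
    return count ? (double)waited / (double)count : 0.0;
}

/* Comparison callback for qsort(): ascending burst time. */
static int by_burst(const void *a, const void *b)
{
    unsigned x = *(const unsigned *)a, y = *(const unsigned *)b;
    return (x > y) - (x < y);
}

/* Reorders the waiting line as SJF would: shortest BT first. */
void sort_shortest_first(unsigned *burst, size_t count)
{
    qsort(burst, count, sizeof burst[0], by_burst);
}
```

For a line with burst times 24, 3 and 3 served in arrival order, the tasks wait 0, 24 and 27 units, i.e., 17 units on average. Served shortest first, they wait 0, 3 and 6 units, i.e., 3 units on average.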
In FreeRTOS, the SJF scheduling algorithm can be implemented by cooperative scheduling and creating tasks with above idle priorities according to their BTs. There are four finite tasks with predefined BTs in this exercise. The priority of the task with the longest BT should be set to one, the next to two, etc., and the task with the shortest BT should have priority four. The maximum number of available priorities is set by the configMAX_PRIORITIES option in the src/FreeRTOSConfig.h configuration file. For SJF scheduling of four tasks, the configMAX_PRIORITIES option has to be at least five.

Obviously, the SJF scheduling algorithm is a cooperative algorithm.

Since the shorter BT tasks get scheduled first, the waiting line is shorter than in the FCFS algorithm. Consequently, the convoy effect is smaller or absent, and skipped requests are rarer.

On the other hand, the SJF scheduling algorithm can cause CPU starvation. Constantly arriving short BT tasks can block a long BT task indefinitely.

The SJF version of the pseudo code of the Idle task callback function is only a slightly modified FCFS version. The tasks have to be created with priorities according to their BTs:

    turn all LEDs off
    for each of the four keys
        if the key is down and was up in the previous Idle task iteration
            create new key corresponding task with priority reflecting its BT
        save key position for the next Idle task iteration

The same goes for the pseudo code of the finite tasks:

    for predefined number of iterations
        turn LED belonging to this task on, others off
        for each of the four keys
            if the key is down and was up in the previous iteration
                create new key corresponding task with priority reflecting its BT
            save key position for the next iteration
    delete this task

Critical section of code

Tasks may want to access the same resource, e.g., a global variable, a peripheral device register, etc., at the same time.
Since preemption can occur at any time, the outcome depends on the sequence in which the individual tasks access the resource. The phenomenon is called a race condition. A race condition cannot occur in cooperative scheduling.

Example: Task A wants to increment, and task B to decrement the same global variable. At the end, the variable value should be the same. Task A reads the variable. Before it succeeds in storing the incremented value back into the memory, task B preempts task A. Since the variable has not changed yet, task B decrements the original value. After a while, task A is rescheduled. It continues with incrementing and storing the previously read value. The global variable unexpectedly ends up incremented. Had the preemption taken place a bit later, the expected result would have been obtained. Task A would have managed to store the incremented value, and task B would have decremented the variable back to its original value.

A critical section of code is a region where a race condition can arise. If preemption is disabled during the critical section, a race condition cannot occur. The critical section becomes an atom, i.e., a region of code that cannot be interrupted. To make a section of code an atom, the taskENTER_CRITICAL() and taskEXIT_CRITICAL() macros can be used [15] [16]. The enclosed code becomes an atom.

    ...
    taskENTER_CRITICAL();
    atom code
    taskEXIT_CRITICAL();
    ...

Atoms should be very short. The code inside an atom is guaranteed to stay in the running state. It must not request a context switch, go blocked, suspended, or end. The taskENTER_CRITICAL() and taskEXIT_CRITICAL() macros can be nested.

Task priority modification

The initial task priority defined at task creation (see Exercise 2) can be dynamically modified. The vTaskPrioritySet() function¹⁰ [15] [16] can be used.
Its declaration is:

    void vTaskPrioritySet( TaskHandle_t pxTask, UBaseType_t uxNewPriority );

¹⁰ The INCLUDE_vTaskPrioritySet option must be set to one in the src/FreeRTOSConfig.h configuration file to make the vTaskPrioritySet() function available.

The function modifies the specified task's priority. The arguments of the function are:

    pxTask ... task handle
    uxNewPriority ... task priority

Passing NULL as the task handle argument is equivalent to passing the handle of the calling task, i.e., the task modifies its own priority.

The following vTaskPrioritySet() function call, for instance, sets the priority of the calling task to three:

    vTaskPrioritySet(NULL, 3);

SRTF scheduling algorithm

The SRTF scheduling algorithm always runs the task with the shortest remaining time to completion. A newly arrived task will preempt the running task if its BT is shorter than the remaining time of the running task.

In FreeRTOS, the SRTF scheduling algorithm can be implemented by prioritized preemptive scheduling without time slicing. The task priority has to be initially set according to its BT. During the execution, the remaining time to completion decreases, and the task priority has to be correspondingly raised.

The SRTF scheduling algorithm is a preemptive algorithm. There is no convoy effect. Skipping requests due to a lack of system resources should not be an issue.

The SRTF algorithm can cause CPU starvation. Constantly arriving short BT tasks can block a long BT task indefinitely.

The section of code from getting to saving the current key position for the next iteration is critical and should be an atom. Otherwise, a key press requesting a short BT task will be handled twice if the remaining time of the running task is longer.
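The priority raising required by SRTF can be tried out on a host PC before it is moved into a task function. The sketch below models one possible scheme: the initial priority reflects the BT, and a task raises its priority by one whenever its remaining iteration count passes the full count of a shorter task, and once more just before it ends. The iteration counts ITER_A to ITER_D and the function names are hypothetical. On the target, the line marked in the comment would call vTaskPrioritySet().

```c
/* Hypothetical burst times in loop iterations, with A < B < C < D. */
enum { ITER_A = 10, ITER_B = 20, ITER_C = 30, ITER_D = 40 };

/* Initial priority reflecting the BT: the shortest task gets 4, the longest 1. */
unsigned initial_priority(unsigned iterations)
{
    if (iterations <= ITER_A) return 4;
    if (iterations <= ITER_B) return 3;
    if (iterations <= ITER_C) return 2;
    return 1;
}

/* Host-side model of the finite loop. Returns the priority the task has
   reached just before it would delete itself. */
unsigned run_finite_task(unsigned iterations)
{
    unsigned priority = initial_priority(iterations);
    for (unsigned left = iterations; left > 0; left--) {
        /* ... LED handling and key scanning would go here ... */
        if (left == 1 || left == ITER_A + 1 ||
            left == ITER_B + 1 || left == ITER_C + 1) {
            priority++;  /* on the target: vTaskPrioritySet(NULL, priority); */
        }
    }
    return priority;
}
```

With these counts every task ends with priority five, regardless of its initial priority, so six priority levels (zero to five) are needed.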
The SRTF version of the pseudo code of the Idle task callback function is as follows:

    turn all LEDs off
    for each of the four keys
        enter critical section
        if the key is down and was up in the previous Idle task iteration
            create new key corresponding task with priority reflecting its BT
        save key position for the next Idle task iteration
        exit critical section

The tasks implement essentially the same algorithm in a finite loop. Since the remaining time constantly decreases, each task should raise its priority accordingly during the execution. The number of the remaining finite loop iterations can be used as a remaining time measure. In this exercise, there are four tasks with predefined numbers of iterations. Suppose the number of iterations of the first task is A, of the second B, of the third C, and of the fourth D, and A < B < C < D. The initial priority of the first task is set to four, of the second to three, of the third to two, and of the fourth to one, thus reflecting the tasks' BTs. The priority is then raised as the number of remaining finite loop iterations decreases. The following pseudo code implements the described mechanism:

    for predefined number of iterations, i.e., one of A, B, C, or D
        turn LED belonging to this task on, others off
        for each of the four keys
            enter critical section
            if the key is down and was up in the previous iteration
                create new key corresponding task with priority reflecting its BT
            save key position for the next iteration
            exit critical section
        if the number of iterations left equals any of 1, A + 1, B + 1, or C + 1
            increment priority
    delete this task

Note that the number of remaining iterations is decremented after the iteration is completed. The task priority is gradually raised. The final priority before task completion is five. Therefore, the configMAX_PRIORITIES option has to be at least six.

RR scheduling algorithm

The RR scheduling algorithm assigns one time slice per task in a circular manner.
Each task gets at most one time slice of CPU before the context switch takes place. There is no priority. A newly arrived task request is added at the end of the line of tasks.

In FreeRTOS, the RR scheduling algorithm can be implemented by prioritized preemptive scheduling with time slicing and all tasks having the same idle priority. The idle priority allows the Idle task to participate, thus being able to release system resources after a task ends.

The RR scheduling algorithm is a preemptive algorithm. The convoy effect and skipping requests due to a lack of system resources both increase with the time slice length. For an infinitely long time slice, the RR converts into the FCFS algorithm.

The RR algorithm cannot cause CPU starvation under any task arrival scheme.

Since a preemption can occur at any place in the code, the section of code from getting to saving the current key position for the next iteration is critical and should be an atom. To obtain the RR version of the pseudo code of the Idle task callback function, the FCFS version has to be equipped with critical section markers:

    turn all LEDs off
    for each of the four keys
        enter critical section
        if the key is down and was up in the previous Idle task iteration
            create new key corresponding task
        save key position for the next Idle task iteration
        exit critical section

The same goes for the pseudo code of the finite tasks:

    for predefined number of iterations
        turn LED belonging to this task on, others off
        for each of the four keys
            enter critical section
            if the key is down and was up in the previous iteration
                create new key corresponding task
            save key position for the next iteration
            exit critical section
    delete this task

Exercise 4
Assembly language function

In the assembly language of the AT91SAM3X8E µC, write an external function which performs an addition of two arbitrarily long unsigned integers. The function should receive four arguments.
The first argument should be a pointer to the final sum, i.e., the address where the function writes the result. The next two arguments should be pointers to both summands. The fourth argument should provide the length of the integers in 32-bit words. The function should return the final carry bit value. To test the function, write a program reading two 128-bit long unsigned integers in hexadecimal form from the stdin stream, and writing their sum to the stdout stream. Use the UART peripheral device as the stdio¹, and an arbitrary serial terminal program as a console.

Explanation

Create a new empty project in the Eclipse working environment as explained in Exercise 1. Connect the Arduino Due board to the host PC over the Olimex ARM-USB-OCD-H interface as shown in Fig. 4.1. Configure the UART peripheral device and the stdio in serial mode as explained in Appendix A.

Function description

As can be understood from the exercise text, the function has the following declaration:

    uint32_t func(uint32_t *sum, uint32_t *summand1, uint32_t *summand2, uint32_t length); ²

¹ stdio ... Standard I/O

It has to be declared as an extern function, since its definition will reside in a separate assembly code file.

The function usage from the C code is demonstrated with the following lines. The two 128-bit (= 4 × 32-bit) summands pulNum1 and pulNum2 are added, the result is written into pulSum, and the final carry bit is returned.

    uint32_t pulNum1[4], pulNum2[4], pulSum[4], ulC;
    ...
    ulC = func(pulSum, pulNum1, pulNum2, 4);

Both summands and the sum are arrays of four 32-bit unsigned integers. Each array represents a 128-bit number. The function adds the numbers from the pulNum1 and pulNum2 arrays and writes the result into the pulSum array as shown in Fig. 4.2. The length of the numbers is four times 32 bits, i.e., 128 bits.

Function implementation

The AT91SAM3X8E µC is based on the ARM Cortex-M3 processor [18]. The processor has 16 32-bit core registers labeled from r0 to r15.
There are 13 general-purpose registers r0 to r12, an SP register r13, a link register r14, and a program counter r15.

When writing a subroutine, i.e., an external function, in the AT91SAM3X8E µC assembly language, the subroutine calling convention for the ARM architecture [19] has to be taken into account. By following the convention, the assembly subroutine can be transparently called from the C code as an external function. A short summary of the convention for the purpose of this exercise follows:

• The r0 to r3 registers are used to pass the arguments into the subroutine, and to pass the result value out. If there are more arguments, or the result is larger, the stack will be used. The r0 to r3 registers do not need to be restored before returning.
• The r4 to r11 registers can be freely used by the subroutine. They must be restored before returning. Therefore, the used registers are usually pushed onto the stack on subroutine entry, and restored from it before returning.
• The r12 register is a scratch register and can be used for any purpose.
• The r13 register is the SP register and must not be used for any other purpose. The stack operates in full-descending mode, i.e., the SP register points to the last item on the stack and the stack grows downwards to lower memory addresses.
• The r14 register is the link register with the return address. Note that the link register must be stored, e.g., to the stack, when a sub-subroutine is called.

² uint32_t is a 32-bit unsigned integer type.

According to the convention, the first argument, i.e., the pointer to the sum, is passed in the r0 register, the second and the third, i.e., the pointers to both summands, in r1 and r2, and the fourth, i.e., the length, in r3. The stack is not used. Before returning, the result, i.e., the final carry bit value, has to be stored into the r0 register to be passed back.

The summands are added by parts, 32 bits at a time. The carry bit from the previous iteration is added in each step.
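Before writing the assembly code, it can help to have a reference model of the addition in C against which the assembly function can later be compared. The sketch below is such a model, not the required solution. It assumes that word 0 of each array is the least significant word, and the name add_words() is hypothetical. A 64-bit intermediate value is used to obtain the carry, which the assembly version gets directly from the processor's carry flag.

```c
#include <stdint.h>

/* Reference model of the multi-word addition.
   Word 0 is assumed to be the least significant word.
   Returns the final carry bit (0 or 1). */
uint32_t add_words(uint32_t *sum, const uint32_t *summand1,
                   const uint32_t *summand2, uint32_t length)
{
    uint32_t carry = 0;
    for (uint32_t i = 0; i < length; i++) {
        /* 64-bit intermediate keeps the carry out of the 32-bit addition */
        uint64_t partial = (uint64_t)summand1[i] + summand2[i] + carry;
        sum[i] = (uint32_t)partial;          /* low 32 bits of the partial sum */
        carry = (uint32_t)(partial >> 32);   /* carry into the next word */
    }
    return carry;
}
```

Adding one to a 128-bit number consisting of all ones, for example, yields four zero words and a final carry of one.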
The number of iterations equals the length argument. The final carry bit is stored into the r0 register for returning. The registers used in the subroutine must be saved, i.e., pushed onto the stack, at the subroutine beginning, and restored, i.e., popped off the stack, at the end. The pseudo code of the subroutine is as follows:

          save working registers (push to the stack)
          C = 0
    iter: C, [r0] ← [r1] + [r2] + C
          increment pointers r0, r1 and r2
          decrement r3
          if r3 is not zero go to iter
          r0 ← C
          restore working registers (pop off the stack)
          return

As the exercise requires, the subroutine has to be written in assembly language in a separate assembly code file. In the empty project, the src/sum.S file is prepared for that purpose. Since the subroutine represents an external function to be used from the C code, it has to be visible outside the assembly file. Thus, the .global assembler directive [20] is required. The assembly file structure is:

          .thumb
          .syntax unified
          .global func
          .text
    func: ...
          ...
          .end

The ARM Cortex-M3 processor implements the ARMv7-M Thumb instruction set [21], which is quite extensive. To code the function from this exercise, only some basic variations of load and store, data-processing and branch instructions are required. A few selected instructions can be found in Tab. 4.1. The angle brackets <> denote a required field, the square brackets [] denote address dereferencing, e.g., [rn] denotes a value stored at the memory location whose address is in the rn register, and the curly brackets {} denote an optional field.

instruction stmfd {!}, ldmfd {!}, mov{s} ,# mov{s} , ldr ,= ldr ,[{,#+/-}] ldr ,[,#+/-]! 
str ,[],#+/- str ,[,{,}] cmp ,# cmp ,{,} cmn ,# cmn ,{,} add{s} {,}{,#} operation regs → [rn]3 [rn] → regs3 const → rd4 rm → rd4 c32 → rd5 [rn±imm] → rd6 [rn±imm] → rd, rn±imm → rn6 [rn] → rd, rn±imm → rn6 [rn+shift(rm)] → rd7 rs → [rn±imm]8 rs → [rn±imm], rn±imm → rn8 rs → [rn], rn±imm → rn8 rs → [rn+shift(rm)]9 rn−const10 rn−shift(rm)11 rn+const10 rn+shift(rm)11 rn+const → rd12 18 add{s} {,},{,} rn+shift(rm) → rd13 18 adc{s} {,}{,#} rn+const+C → rd14 18 adc{s} {,},{,} rn+shift(rm)+C → rd15 18 sub{s} {,}{,#} rn−const → rd16 18 sub{s} {,},{,} rn−shift(rm) → rd17 18 b{cond}