Build Flow

Preprocessor -> Compiler -> Assembler -> Linker

The -c option flag for gcc will run the first three steps, but skip the linking.

.bin vs .elf

A .bin file is a pure binary image with no relocation information, although it may carry an IVT header: explicit instructions to be loaded at a specific memory address...

ELF stands for Executable and Linkable Format. An ELF file carries symbol tables and relocation tables, so relocatable or position-independent code can be placed at whatever memory address the linker or loader chooses. All symbols used are then adjusted by the offset from the address where the code was loaded. Usually ELF files have a number of sections, such as .data, .text, .bss, etc. It is within those sections that the linker or loader calculates how to adjust the symbols' memory references. The ELF file also contains the raw binary content within; objcopy -O binary extracts it.

Object files

GCC adds symbols like _start and frame_dummy to the executable. _start is the actual beginning, even before main(). _start comes from crt1.o while _init comes from crti.o. These are not really libraries but small startup objects, written largely in assembly, that do pre-main init work like setting up interrupts, initializing the stack, etc. The assembly files have a .s extension.

Linker scripts (name.ld or name.ld.S, or sometimes even name.lsl as on the Aurix TriCore), also called ldscripts, are commands to the linker telling it where to place the symbols. This is where the load memory addresses are established. The address assigned to the .text output section is where the code will be loaded. The main purpose of the linker script is to describe how the sections in the input files should be mapped into the output file, and to control the memory layout of the output file. Linker scripts are not meant to be modified by end users. They should only be modified by toolchain developers, but sometimes you have to step in to make a fix. They also lay out the sections that hold initialization code (for example .init), the init steps that run before main.
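The pre-main initialization mentioned above can be observed from plain C. The following is a minimal sketch, assuming GCC or Clang: the constructor attribute is a compiler extension, and the names premain_ran, early_init and premain_status are made up for illustration. The startup code pulled in from the crt objects calls such functions before main() is entered.

```c
/* Hypothetical flag, set by code that runs before main(). */
static int premain_ran = 0;

/* GCC/Clang extension: the startup code calls functions carrying this
   attribute before it calls main(). */
__attribute__((constructor))
static void early_init(void)
{
    premain_ran = 1;
}

/* Returns 1 once early_init() has run, 0 otherwise. */
int premain_status(void)
{
    return premain_ran;
}
```

Calling premain_status() as the first statement of main() should already return 1, which shows that main() is not the first code to execute in the program.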
From the ld manual: The most fundamental command of the ld command language is the SECTIONS command (see "Specifying Output Sections" in the ld manual). Every meaningful command script must have a SECTIONS command: it specifies a "picture" of the output file's layout, in varying degrees of detail. No other command is required in all cases. The MEMORY command complements SECTIONS by describing the available memory in the target architecture. This command is optional; if you don't use a MEMORY command, ld assumes sufficient memory is available in a contiguous block for all output (see "Memory Layout" in the ld manual).

Here is the full syntax of a section definition, including all the optional portions:

    SECTIONS {
    ...
    secname start BLOCK(align) (NOLOAD) : AT ( ldadr )
      { contents } >region :phdr =fill
    ...
    }

You can determine the name of the compiler/linker used by looking at the debug .map file. The .map file shows the addresses of the symbols, the sizes of the functions, etc. It can be generated by passing the -Map=name.map flag to the linker (-Wl,-Map=name.map when linking through gcc).

If you get a linker error saying "cannot move location counter backwards" then you are probably exceeding your available memory. You could try adding the gcc flag -ffunction-sections and the ld flag --gc-sections (garbage collection), but this may strip important headers off of your binary. Take a look at the size of your ELF and at all the libraries that are getting linked in.

Explain 'relocateable'

The GNU linker has an option -r for creating relocatable output (older ld documentation spells it "relocateable"). Another word for this is partial or incremental linking.

    -r
    --relocateable
        Generate relocatable output--i.e., generate an output file that can in turn serve as input to ld. This is often called partial linking.

A linker takes .o or .a files as input. It produces executable or relocatable output. Passing -r (or --relocatable) to ld will create an object that is suitable as input to ld. In the nominal use case, a linker receives relocatable object files (like ELF) and produces an executable of the same format (ELF).
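To make the difference between relocatable and executable output concrete, here is a rough sketch in C that classifies a file image by looking at its first bytes. It relies on the ELF header layout (4 magic bytes, the byte-order flag at e_ident[5], and the 16-bit e_type field at offset 16). The function name elf_kind and the returned strings are invented for this example, and it is not a substitute for readelf -h.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Classify an in-memory file image by its ELF header.
   Returns a short description; the strings are illustrative only. */
const char *elf_kind(const unsigned char *buf, size_t len)
{
    static const unsigned char magic[4] = { 0x7f, 'E', 'L', 'F' };
    uint16_t type;

    /* Need e_ident (16 bytes) plus the 2-byte e_type field. */
    if (len < 18 || memcmp(buf, magic, 4) != 0)
        return "not ELF";               /* e.g. a raw .bin image */

    /* e_ident[5] is EI_DATA: 1 = little-endian, 2 = big-endian. */
    if (buf[5] == 2)
        type = (uint16_t)((buf[16] << 8) | buf[17]);
    else
        type = (uint16_t)(buf[16] | (buf[17] << 8));

    switch (type) {
    case 1:  return "relocatable";      /* ET_REL: .o files, output of ld -r */
    case 2:  return "executable";       /* ET_EXEC: fixed-address executable */
    case 3:  return "shared object";    /* ET_DYN: .so files and PIE executables */
    default: return "other";
    }
}
```

Feeding it the first 18 bytes of a .o file should give "relocatable", and the same holds for the output of ld -r, since partial linking produces another object file rather than an executable.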
Warning: when an input file does not have the same format as the output file, partial linking is only supported if that input file does not contain any relocations.

A relocatable object has no final address information for symbols, only offsets from the start of the section each symbol lives in. The linker moves blocks of bytes of your program to their run-time addresses. These blocks slide to their run-time addresses as rigid units; their length does not change and neither does the order of bytes within them. Such a rigid unit is called a section. Assigning run-time addresses to sections is called relocation. Apart from the text, data and bss sections you need to know about the absolute section. When the linker mixes partial programs, addresses in the absolute section remain unchanged. With an incremental link, you can leave "undefined references" in the code because it is presumed they will be resolved at the final linking.

Dependencies

Many build systems write automatically detected make dependencies into a .d file (with GCC, the -MMD or -MD flags generate it). In particular, for C/C++ source files they determine which #include files are required and automatically generate that information into the .d file. It contains a list of targets and dependencies like

    foo.o : foo.h bar.h biz.h

Newer C++ Versions

If you want to compile against a newer language standard, you'll need to specify the version to support on the compile command, i.e.

    g++ -std=c++11 <yada-yada>

Compile-time computations (is the new assert stuff an example of this?) result in output going into the .rodata section.

Libraries

A .a file is a static library. The Unix archive format is a collection of relocatable object files with a header for size and location descriptions. The general rule for libraries is to put them at the end of the linking command line; otherwise references may not be resolved, because the definition (library) is read before the call (module).

Good explanation from over on Stack Overflow:

Static libraries are .a (or in Windows .lib) files.
All the code relating to the library is in this file, and it is directly linked into the program at compile time. A program using a static library takes copies of the code that it uses from the static library and makes it part of the program. [Windows also has .lib files which are used to reference .dll files, but they act the same way as the first one]. Shared libraries are .so (or in Windows .dll) files; the program only references their code, which is loaded at run time.

There are advantages and disadvantages in each method. Shared libraries reduce the amount of code that is duplicated in each program that makes use of the library, keeping the binaries small. They also allow you to replace the shared object with one that is functionally equivalent, but may have added performance benefits, without needing to recompile the program that makes use of it. Shared libraries will, however, have a small additional cost for the execution of the functions as well as a run-time loading cost, as all the symbols in the library need to be connected to the things they use. Additionally, shared libraries can be loaded into an application at run time, which is the general mechanism for implementing binary plug-in systems.

Static libraries increase the overall size of the binary, but it means that you don't need to carry along a copy of the library that is being used. As the code is connected at compile time there are not any additional run-time loading costs. The code is simply there.

Personally, I prefer shared libraries, but use static libraries when needing to ensure that the binary does not have many external dependencies that may be difficult to meet, such as specific versions of the C++ standard library or specific versions of the Boost C++ library.

The general recommendation is to prefer dynamic linking when possible. Note that there are problems statically linking some libraries with some compilers on certain platforms. For example, the pthread library has a fail-silent problem in some circumstances, and if you want to statically link it you need to make sure to use the --whole-archive flag.
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52590, https://stackoverflow.com/questions/7090623/c0x-thread-static-linking-problem)

How can I know which dependencies an executable has?

Use readelf -d <bin>

How can I know the arch target for a .a archive file?

The file command won't work here. Use readelf -h <archive>.a | grep 'Class\|File\|Machine'.

Where does the loader search for dependencies on running a program?

The rpath designates the run-time search path hard-coded in an executable file or library. Dynamic linking loaders use the rpath to find required libraries. The rpath is revealed with readelf -d <prog>

objdump

The meaning of the columns for -T (dynamic symbol table):

COLUMN ONE: the symbol's value.
COLUMN TWO: a set of characters and spaces indicating the flag bits that are set on the symbol. There are seven groupings, listed below (a blank in a position means "none of these"):
    group one: (l, g, blank, !) local, global, neither, both.
    group two: (w, blank) weak or strong symbol.
    group three: (C, blank) symbol denotes a constructor or an ordinary symbol.
    group four: (W, blank) symbol is a warning or a normal symbol.
    group five: (I, blank) indirect reference to another symbol or a normal symbol.
    group six: (d, D, blank) debugging symbol, dynamic symbol or normal symbol.
    group seven: (F, f, O, blank) symbol is the name of a function, file, object or normal symbol.
COLUMN THREE: the section in which the symbol lives; ABS means not associated with a certain section, UND means referenced but not defined in this file.
COLUMN FOUR: the symbol's size or alignment.
COLUMN FIVE: ??? (in practice this is often where symbol version information such as GLIBC_2.2.5 shows up)
COLUMN SIX: the symbol's name.

I've seen binaries bereft of .text sections, presumably because they are getting everything from a dynamic lib.

An object file written by the GNU assembler has at least three sections, any of which may be empty. These are named the .text, .data and .bss sections. You may allocate address space in the .bss section, but you may not dictate data to load into it before your program executes. When your program starts running, all the contents of the .bss section are zeroed bytes.

Starting and Running

The very beginning is not main... instead the linker script has a command (ENTRY) to identify the actual starting label:

What goes on once you get into the _start function? For the Aurix TriCore:
Note that with some processors you can specify a program start location using a boot mode index defined in a boot mode header file, which is just a compiled C file of structures and pointers formed into an image to place into memory. It has its own location in flash; this location is given by the processor architecture and specified in the linker script. On reset, with the right boot mode pins, the processor will look to that flash location to load a program counter. Here is the sequence as the program counter is first given the address of the start location in instruction memory (could be RAM or flash):

Run Time

What's with this term, "runtime"?

I think it's often a poorly used term. It literally means run or execution time, but is often used as a lazy replacement for the more descriptive terms runtime library, runtime system, or runtime environment. Runtime is not a thing, it's a time.

The C runtime library is small and different from the C standard library. It does define the stdlib, but the implementation of those functions must be added. A runtime library is always specific to the platform and compiler, so it is hardware-dependent. GCC provides libgcc.

The runtime environment consists of environment variables, and these are accessed via the runtime system. The runtime system also includes the stack management instructions.

Also from Wikipedia:

"Most programming languages have some form of runtime system that provides an environment in which programs run. This environment may address a number of issues including the management of application memory, how the program accesses variables, mechanisms for passing parameters between procedures, interfacing with the operating system, and otherwise. The compiler makes assumptions depending on the specific runtime system to generate correct code.
Typically the runtime system will have some responsibility for setting up and managing the stack and heap, and may include features such as garbage collection, threads or other dynamic features built into the language."

"As a simple example of a basic runtime, the runtime system of the C language is a particular set of instructions inserted into the executable image by the compiler. Among other things, these instructions manage the processor stack, create space for local variables, and copy function-call parameters onto the top of the stack."

Another distinction to make is that a compiled object file contains only the machine instructions of its functions, while an executable binary also contains the runtime environment implementation. The object files depend on that environment.

There is actually a hierarchy of runtime systems in a complex machine, with the microcode/logic of the CPU itself being the lowest-level runtime system.

For ARM (and others?), there is a file of assembly instructions called crt0.s that sets up the runtime environment by initializing stack pointers for things like IRQ, FIQ, services, etc. Its final instruction is a branch to main, so it executes before main.
http://en.wikipedia.org/wiki/Crt0

Aurix C Runtime Environment Init

    /* Initialization of C runtime variables */
    Ifx_Ssw_C_Init();

    IFX_SSW_INLINE void Ifx_Ssw_C_InitInline(void)
    {
        Ifx_Ssw_CTablePtr pBlockDest, pBlockSrc;
        unsigned int      uiLength, uiCnt;
        unsigned int     *pTable;

        /* clear table: (destination, length) pairs to be zeroed */
        pTable = (unsigned int *)&__clear_table;

        while (pTable)
        {
            pBlockDest.uiPtr = (unsigned int *)*pTable++;
            uiLength         = *pTable++;

            /* we are finished when length == -1 */
            if (uiLength == 0xFFFFFFFF)
            {
                break;
            }

            /* zero 8 bytes at a time, then the 4/2/1 byte remainder */
            uiCnt = uiLength / 8;

            while (uiCnt--)
            {
                *pBlockDest.ullPtr++ = 0;
            }

            if (uiLength & 0x4)
            {
                *pBlockDest.uiPtr++ = 0;
            }

            if (uiLength & 0x2)
            {
                *pBlockDest.usPtr++ = 0;
            }

            if (uiLength & 0x1)
            {
                *pBlockDest.ucPtr = 0;
            }
        }

        /* copy table: (source, destination, length) triples to be copied */
        pTable = (unsigned int *)&__copy_table;

        while (pTable)
        {
            pBlockSrc.uiPtr  = (unsigned int *)*pTable++;
            pBlockDest.uiPtr = (unsigned int *)*pTable++;
            uiLength         = *pTable++;

            /* we are finished when length == -1 */
            if (uiLength == 0xFFFFFFFF)
            {
                break;
            }

            /* copy 8 bytes at a time, then the 4/2/1 byte remainder */
            uiCnt = uiLength / 8;

            while (uiCnt--)
            {
                *pBlockDest.ullPtr++ = *pBlockSrc.ullPtr++;
            }

            if (uiLength & 0x4)
            {
                *pBlockDest.uiPtr++ = *pBlockSrc.uiPtr++;
            }

            if (uiLength & 0x2)
            {
                *pBlockDest.usPtr++ = *pBlockSrc.usPtr++;
            }

            if (uiLength & 0x1)
            {
                *pBlockDest.ucPtr = *pBlockSrc.ucPtr;
            }
        }
    }

What is a dynamic run-time environment?

IBM says it's the idea of having different libraries attached to the user portion of the "Runtime". What dynamically changes is the library lists. Other than that, this term is unclear. There is not much source material discussing it.

Demystifying the Stack

Terminology

user stack = sometimes used to refer to a user interrupt stack, i.e. an interrupt stack that is only for one core, in its own memory

A thread consists of a user stack and a kernel stack. Interrupt stacks are associated on a per-processor basis, and are only used while the kernel is currently using that particular CPU.
When an (external) interrupt happens, the kernel switches to the interrupt stack, since this saves it from needing more space on the kernel stack of the associated thread.

What is the frame pointer?

AKA base pointer. When a new function is invoked, the caller's frame pointer is saved on the stack on entry, and a new frame pointer is established for the new function. Locals and arguments are then referenced relative to the frame pointer. Strictly speaking, frame pointers are not totally necessary because you can use the stack pointer as your anchor instead. gcc gives you the -fomit-frame-pointer option, for example. Within a function the FP remains constant while the SP moves around as the stack grows and shrinks.

The -fomit-frame-pointer option instructs the compiler not to keep a stack frame pointer in functions that do not need one. You can use this option to reduce the code image size.

The -fno-omit-frame-pointer option instructs the compiler to keep the stack frame pointer in a register.

What is RTL?
Aurix TriCore Predefined Program Sections

Default sections
    .text           Section for instructions (code)
    .data           Initialized data is stored in .data
    .bss            Non-initialized data is stored in .bss
    .rodata         Location of read-only data
    .version_info   Information in the module on the compiler and options used

Small addressable sections
    .sdata          Initialized data that is addressable via the small data area pointer (%a0)
    .sbss           Non-initialized data, addressed via the small data area pointer (%a0)
    .srodata        Read-only data that can be small addressed

Absolute addressable sections
    .zdata          Initialized data that is absolute addressable
    .zbss           Non-initialized data, absolute addressable
    .zrodata        Read-only data that can be absolute addressed

PCP sections
    .pcptext        PCP code section
    .pcpdata        PCP data section

C++ sections
    .eh_frame       Exception handling frame of C++ exceptions
    .ctors          Section for constructors
    .dtors          Section for destructors

Debug sections
    .debug_<name>   Various debug sections

System Design: Resource Estimates

It turns out this is a very difficult problem without a lot of direct experience with the target architecture, the compiler, the software itself (drivers/modules/apps), the libraries involved, and the requirements.

Some good thoughts on RAM usage estimates: https://electronics.stackexchange.com/questions/140116/how-do-you-determine-how-much-flash-ram-you-need-for-a-microcontroller

Measuring "software size" typically uses:
MMU, Translation, Program Memory

In a 32-bit system you have 4GB of virtual address space to play with. But there may only be a smaller amount of RAM. A translation table is used by the MMU to perform the mapping from virtual addresses to physical addresses. Here's a readelf output from a 32-bit system application that shows the NULL exception address at origin, even though the program has been placed at the 2GB location of 0x8000.0000.

    Section Headers:
      [Nr] Name   Type      Addr     Off    Size   ES Flg Lk Inf Al
      [ 0]        NULL      00000000 000000 000000 00      0   0  0
      [ 1] .text  PROGBITS  80000000 008000 0a7fa4 00  AX  0   0 64

The application translation tables reserve an invalid/inaccessible region of memory there at 0x0 (may be cacheable for bootloader). It is small in size, for example maybe 1MB, just because that's the minimum size of a region. One page.

In this case both the bootloader and application translation tables have been altered to indicate an accessible DDR cacheable region up at 0x8000.0000, and this is translated through the FPGA and AXI interface to the 0x0000.0000 location in off-chip DDR memory accessible through the processor system's DDR interface. In other words it goes out to the FPGA and then back before going to RAM.

Table entries:

    /* BOOTLOADER */
    .rept   0x0100              /* 0x80000000 - 0x8fffffff (DDR cacheable, virt mapping to ECC-prot) */
    .word   SECT + 0x15de6      /* S=b1 TEX=b101 AP=b11, Domain=b1111, C=b0, B=b1 */
    .set    SECT, SECT+0x100000
    .endr

    /* APP */
    .rept   CPU0_CACHEABLE_PAGES/16     /* (DDR Cacheable) */
    .rept   16
    #if XPAR_CPU_ID==0
    .word   SECT + 0x45de6      /* 16MB page, S=b0 TEX=b101 AP=b11, Domain=b1111, C=b0, B=b1 */
    #else
    .word   SECT + 0x0          /* invalid, generates a translation fault */
    #endif
    .endr
    .set    SECT, SECT+0x01000000
    .endr

Uncached Memory

When the linker script sets up memory regions, stack and heap, etc, you can also designate an uncached region and the umalloc() function can actually allocate space there instead of in the cached heap that's part of CPU memory.
ARM example:

    MEMORY
    {
        /* ... */
        UNCACHED_BASEADDR : ORIGIN = 0x20000000, LENGTH = 0x0A000000
        /* ... */
    }
    /* ... */
    uncached (NOLOAD) : {
        _uncached_start = .;
        . += _UNCACHED_END - _UNCACHED_START;
        _uncached_end = .;
    } > UNCACHED_BASEADDR
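How an allocator might hand out space from such a region can be sketched in a few lines of C. This is only an illustration under stated assumptions, not the actual umalloc() implementation: the name umalloc_sketch is hypothetical, and a static array stands in for the memory between _uncached_start and _uncached_end so that the code also runs on a host machine. It is a simple bump allocator that never frees.

```c
#include <stddef.h>

/* Stand-in for the uncached region. On target, the bounds would come from
   the linker symbols _uncached_start and _uncached_end instead. */
static _Alignas(8) unsigned char uncached_pool[256];
static size_t uncached_used = 0;

/* Hypothetical bump allocator: returns 8-byte aligned blocks carved out of
   the pool in order, or NULL when the request does not fit. */
void *umalloc_sketch(size_t size)
{
    /* Round the request up to a multiple of 8 bytes. */
    size_t aligned = (size + 7u) & ~(size_t)7u;

    /* Reject wrapped-around sizes and requests larger than what is left. */
    if (aligned < size || aligned > sizeof uncached_pool - uncached_used)
        return NULL;

    void *block = &uncached_pool[uncached_used];
    uncached_used += aligned;
    return block;
}
```

A 10-byte request consumes 16 bytes of the pool, so the next block starts 16 bytes further on. The NOLOAD attribute in the linker script fits this usage, since nothing needs to be loaded into the region from the image; the allocator only needs the address range.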