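Introduction to the GNU Linker
- 1 Introducing linker scripts with a simple program
- 1.1 Test program
- 1.2 Compiling the test program
- 1.2.1 Compiling without a custom linker script
- 1.2.1.1 Compiling without a custom linker script
- 1.2.1.2 Reading the structure of objdump_test
- 1.2.2 Linking with a linker script
- 1.2.2.1 The linker script
- 1.2.2.2 Compiling with the linker script
- 1.2.2.3 Reading the structure of objdump
- 2 Linker scripts
- 2.1 Basic linker script concepts
- 2.2 Linker script format
- 2.3 Simple linker script example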
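The linker is an important component of the compilation toolchain. It combines one or more object files and library files into an executable or a library. In languages such as C and C++, the compiler produces one object file for each source file. These object files contain machine code and a symbol table, which records the names of functions and variables together with their addresses.
1 Introducing linker scripts with a simple program
To introduce the linker, we start with a minimal test program and look briefly at one visible effect of the linker script: how it changes the entry point address of an executable or library.
1.1 Test program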
rlk@rlk:test$ cat objdump_test.c
#include <stdio.h>
void greet() {
    printf("Hello, World!\n");
}
int main() {
    greet();
    return 0;
}
rlk@rlk:test$
1.2 Compiling the test program
1.2.1 Compiling without a custom linker script
1.2.1.1 Compiling without a custom linker script
gcc -o objdump_test objdump_test.c
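1.2.1.2 Reading the structure of objdump_test
- Entry point address: 0x1050. The entry point of the executable objdump_test is 0x1050. Because the file type is DYN (a position-independent executable), this address is relative to the base address chosen when the program is loaded.
- Number of program headers: 11. The executable has 11 program headers (segments).
- Number of section headers: 29. The executable has 29 sections.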
rlk@rlk:test$ readelf -hW objdump_test
ELF Header:
Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
Class: ELF64
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: DYN (Shared object file)
Machine: Advanced Micro Devices X86-64
Version: 0x1
Entry point address: 0x1050
Start of program headers: 64 (bytes into file)
Start of section headers: 14664 (bytes into file)
Flags: 0x0
Size of this header: 64 (bytes)
Size of program headers: 56 (bytes)
Number of program headers: 11
Size of section headers: 64 (bytes)
Number of section headers: 29
Section header string table index: 28
rlk@rlk:test$
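The values that readelf prints come from the fixed-size ELF header at the start of the file. As a rough sketch (this is not how readelf itself is implemented), the C helper below extracts the entry point from a little-endian ELF64 header held in memory. The function names read_le64 and elf64_entry are made up for this illustration; the offset 24 is the position of the e_entry field in the ELF64 header layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a little-endian 64-bit value starting at p. */
static uint64_t read_le64(const unsigned char *p)
{
    uint64_t v = 0;
    for (int i = 7; i >= 0; i--)
        v = (v << 8) | p[i];
    return v;
}

/*
 * Hypothetical helper: if buf holds a little-endian ELF64 header,
 * store its entry point in *entry and return 0; otherwise return -1.
 */
int elf64_entry(const unsigned char *buf, size_t len, uint64_t *entry)
{
    assert(buf != NULL && entry != NULL);
    if (len < 64)                       /* an ELF64 header is 64 bytes */
        return -1;
    if (buf[0] != 0x7f || buf[1] != 'E' || buf[2] != 'L' || buf[3] != 'F')
        return -1;                      /* magic: 7f 45 4c 46 */
    if (buf[4] != 2 || buf[5] != 1)     /* class ELF64, little endian */
        return -1;
    *entry = read_le64(buf + 24);       /* e_entry lives at offset 24 */
    return 0;
}
```

Given the first 64 bytes of objdump_test, this helper should report the same value as the Entry point address line in the readelf output above.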
1.2.2 Linking with a linker script
1.2.2.1 The linker script
rlk@rlk:test$ cat link.ld
/* linker_script.ld */
/* ENTRY(_start) */ /* specify the entry point */
SECTIONS
{
    . = 0x10000000; /* set the start address */
    .init_array :
    {
        __init_array_start = .;
        KEEP(*(SORT(.init_array.*)))
        KEEP(*(.init_array))
        __init_array_end = .;
    }
    .text : { *(.text) } /* place the .text section */
    .rodata : { *(.rodata) } /* read-only data */
    .data : { *(.data) } /* initialized data */
    .bss : { *(.bss) } /* uninitialized or zero-initialized data */
}
rlk@rlk:test$
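The .init_array output section in this script collects pointers to functions that must run before main. One common source of such entries, assuming GCC or Clang on an ELF target, is the constructor function attribute. The sketch below uses hypothetical names (init_calls, early_init, init_was_run) to show the effect.

```c
/* Counts how many times the constructor has run. */
static int init_calls;

/*
 * With GCC or Clang on ELF targets, a pointer to this function is
 * typically emitted into an .init_array input section, so the startup
 * code calls it before main.
 */
__attribute__((constructor))
static void early_init(void)
{
    init_calls++;
}

/* Returns the number of constructor calls seen so far. */
int init_was_run(void)
{
    return init_calls;
}
```

If the linker script dropped .init_array instead of keeping it with KEEP, a function like early_init would most likely never be called, which is why the script brackets the section with __init_array_start and __init_array_end.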
1.2.2.2 Compiling with the linker script
gcc -o objdump objdump_test.c -T link.ld
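1.2.2.3 Reading the structure of objdump
- Entry point address: 0x10000010. The entry point of the executable is 0x10000010, which sits just above the start address 0x10000000 set in the linker script. The script does not name an entry point itself, because its ENTRY command is commented out.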
rlk@rlk:test$ readelf -hW objdump
ELF Header:
Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
Class: ELF64
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: EXEC (Executable file)
Machine: Advanced Micro Devices X86-64
Version: 0x1
Entry point address: 0x10000010
Start of program headers: 64 (bytes into file)
Start of section headers: 8704 (bytes into file)
Flags: 0x0
Size of this header: 64 (bytes)
Size of program headers: 56 (bytes)
Number of program headers: 7
Size of section headers: 64 (bytes)
Number of section headers: 32
Section header string table index: 31
rlk@rlk:test$
2 Linker scripts
Every link is controlled by a linker script. This script is written in the linker command language.
The main purpose of the linker script is to describe how the sections in the input files should be mapped into the output file, and to control the memory layout of the output file. Most linker scripts do nothing more than this. However, when necessary, the linker script can also direct the linker to perform many other operations, using the commands described below.
The linker always uses a linker script. If you do not supply one yourself, the linker will use a default script that is compiled into the linker executable. You can use the ‘--verbose’ command-line option to display the default linker script. Certain command-line options, such as ‘-r’ or ‘-N’, will affect the default linker script.
You may supply your own linker script by using the ‘-T’ command line option. When you do this, your linker script will replace the default linker script.
You may also use linker scripts implicitly by naming them as input files to the linker, as though they were files to be linked.
2.1 Basic linker script concepts
We need to define some basic concepts and vocabulary in order to describe the linker script language.
The linker combines input files into a single output file. The output file and each input file are in a special data format known as an object file format. Each file is called an object file. The output file is often called an executable, but for our purposes we will also call it an object file. Each object file has, among other things, a list of sections. We sometimes refer to a section in an input file as an input section; similarly, a section in the output file is an output section.
Each section in an object file has a name and a size. Most sections also have an associated block of data, known as the section contents. A section may be marked as loadable, which means that the contents should be loaded into memory when the output file is run. A section with no contents may be allocatable, which means that an area in memory should be set aside, but nothing in particular should be loaded there (in some cases this memory must be zeroed out). A section which is neither loadable nor allocatable typically contains some sort of debugging information.
Every loadable or allocatable output section has two addresses. The first is the VMA, or virtual memory address. This is the address the section will have when the output file is run. The second is the LMA, or load memory address. This is the address at which the section will be loaded. In most cases the two addresses will be the same. An example of when they might be different is when a data section is loaded into ROM, and then copied into RAM when the program starts up (this technique is often used to initialize global variables in a ROM based system). In this case the ROM address would be the LMA, and the RAM address would be the VMA.
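To make the ROM-to-RAM case concrete, here is a minimal sketch of what startup code on such a system might do. The function and parameter names are hypothetical; in a real system the pointers would usually come from symbols defined in the linker script rather than being passed in as arguments.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy the initialized data image from its load address (LMA, e.g. ROM)
 * to its run-time address range (VMA, e.g. RAM).
 */
void copy_data(const unsigned char *lma_start,
               unsigned char *vma_start, unsigned char *vma_end)
{
    assert(vma_end >= vma_start);
    memcpy(vma_start, lma_start, (size_t)(vma_end - vma_start));
}

/*
 * Zero an allocatable section that has no contents in the file
 * (the role usually played by .bss).
 */
void zero_bss(unsigned char *bss_start, unsigned char *bss_end)
{
    assert(bss_end >= bss_start);
    memset(bss_start, 0, (size_t)(bss_end - bss_start));
}
```

The copy only needs the LMA start address because the size of the data is already determined by the VMA range.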
You can see the sections in an object file by using the objdump program with the ‘-h’ option.
Every object file also has a list of symbols, known as the symbol table. A symbol may be defined or undefined. Each symbol has a name, and each defined symbol has an address, among other information. If you compile a C or C++ program into an object file, you will get a defined symbol for every defined function and global or static variable. Every undefined function or global variable which is referenced in the input file will become an undefined symbol.
You can see the symbols in an object file by using the nm program, or by using the objdump program with the ‘-t’ option.
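As a small illustration of defined and undefined symbols, consider the following translation unit. All names in it (counter, calls, bump, name_length) are invented for this sketch; strlen is the only symbol it references without defining.

```c
#include <stddef.h>
#include <string.h>

/* Defined global variable with an initializer: a defined data symbol. */
int counter = 42;

/* Static variable: defined, but local to this object file. */
static int calls;

/* Defined function: a defined symbol in the code section. */
int bump(void)
{
    calls++;
    return ++counter;
}

/* strlen is declared in <string.h> but defined elsewhere, so the
 * object file refers to it through an undefined symbol. */
size_t name_length(const char *s)
{
    return strlen(s);
}
```

Compiling this file with gcc -c and running nm on the resulting object file would typically show counter, bump, and name_length as defined global symbols, calls as a defined local symbol, and strlen as undefined, to be resolved later by the linker against the C library.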
2.2 Linker script format
Linker scripts are text files.
You write a linker script as a series of commands. Each command is either a keyword, possibly followed by arguments, or an assignment to a symbol. You may separate commands using semicolons. Whitespace is generally ignored.
Strings such as file or format names can normally be entered directly. If the file name contains a character such as a comma which would otherwise serve to separate file names, you may put the file name in double quotes. There is no way to use a double quote character in a file name.
You may include comments in linker scripts just as in C, delimited by ‘/*’ and ‘*/’. As in C, comments are syntactically equivalent to whitespace.
2.3 Simple linker script example
Many linker scripts are fairly simple.
The simplest possible linker script has just one command: ‘SECTIONS’. You use the ‘SECTIONS’ command to describe the memory layout of the output file.
The ‘SECTIONS’ command is a powerful command. Here we will describe a simple use of it. Let’s assume your program consists only of code, initialized data, and uninitialized data. These will be in the ‘.text’, ‘.data’, and ‘.bss’ sections, respectively. Let’s assume further that these are the only sections which appear in your input files.
For this example, let’s say that the code should be loaded at address 0x10000, and that the data should start at address 0x8000000. Here is a linker script which will do that:
SECTIONS
{
    . = 0x10000;
    .text : { *(.text) }
    . = 0x8000000;
    .data : { *(.data) }
    .bss : { *(.bss) }
}
You write the ‘SECTIONS’ command as the keyword ‘SECTIONS’, followed by a series of symbol assignments and output section descriptions enclosed in curly braces.
The first line inside the ‘SECTIONS’ command of the above example sets the value of the special symbol ‘.’, which is the location counter. If you do not specify the address of an output section in some other way (other ways are described later), the address is set from the current value of the location counter. The location counter is then incremented by the size of the output section. At the start of the ‘SECTIONS’ command, the location counter has the value ‘0’.
The second line defines an output section, ‘.text’. The colon is required syntax which may be ignored for now. Within the curly braces after the output section name, you list the names of the input sections which should be placed into this output section. The ‘*’ is a wildcard which matches any file name. The expression ‘*(.text)’ means all ‘.text’ input sections in all input files.
Since the location counter is ‘0x10000’ when the output section ‘.text’ is defined, the linker will set the address of the ‘.text’ section in the output file to be ‘0x10000’.
The remaining lines define the ‘.data’ and ‘.bss’ sections in the output file. The linker will place the ‘.data’ output section at address ‘0x8000000’. After the linker places the ‘.data’ output section, the value of the location counter will be ‘0x8000000’ plus the size of the ‘.data’ output section. The effect is that the linker will place the ‘.bss’ output section immediately after the ‘.data’ output section in memory.
The linker will ensure that each output section has the required alignment, by increasing the location counter if necessary. In this example, the specified addresses for the ‘.text’ and ‘.data’ sections will probably satisfy any alignment constraints, but the linker may have to create a small gap between the ‘.data’ and ‘.bss’ sections.
That’s it! That’s a simple and complete linker script.
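The location counter arithmetic described in this section can be modeled in a few lines of C. This is only a toy model under stated assumptions, not the linker's actual implementation: the function names align_up and place_section are hypothetical, and any section sizes and alignments passed in are made-up inputs.

```c
#include <assert.h>
#include <stdint.h>

/* Round addr up to the next multiple of align (a power of two). */
uint64_t align_up(uint64_t addr, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}

/*
 * Place one output section: bump the location counter *dot up to the
 * required alignment, use that value as the section address, then
 * advance the counter by the section size.
 */
uint64_t place_section(uint64_t *dot, uint64_t size, uint64_t align)
{
    *dot = align_up(*dot, align);
    uint64_t addr = *dot;
    *dot += size;
    return addr;
}
```

For example, with the counter set to 0x8000000, a ‘.data’ section of an assumed size 0x1c leaves the counter at 0x800001c. If ‘.bss’ then required 32-byte alignment, it would be placed at 0x8000020, leaving a 4-byte gap like the one mentioned above.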