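The GNU assembler, as, targets many processor architectures.
This note covers only ARM-related material and only the ELF object file format.
The directives listed are limited to those that appear in u-boot's assembly sources or are otherwise in common use. For more, follow the link below.
Time and ability are limited, so to avoid introducing misunderstandings through translation, the original text is copied verbatim. Please go easy on a beginner.
Reference: http://sourceware.org/binutils/docs-2.20/as/index.html#Top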
===============================================================================================================================
ARM Machine Directives:
.align expression [, expression]
This is the generic .align directive. For the ARM however if the first argument is zero (i.e. no alignment is needed) the assembler will behave as if the argument had been 2 (i.e. pad to the next four byte boundary). This is for compatibility with ARM's own assembler.
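The zero-argument rule can be modelled in a few lines of Python. This is only an illustrative sketch, not how as is implemented: the function name arm_align is made up here, and it assumes the ARM convention described further down, where the argument is a power of two rather than a byte count.

```python
def arm_align(location, expr):
    """Model of ARM `.align expr`: return the advanced location counter."""
    # On ARM the argument is the number of low-order zero bits required;
    # an argument of 0 is treated as if it were 2 (four-byte boundary).
    power = 2 if expr == 0 else expr
    boundary = 1 << power
    # Round up to the next multiple of the boundary.
    return (location + boundary - 1) & ~(boundary - 1)


print(hex(arm_align(0x1001, 0)))  # 0x1004
print(hex(arm_align(0x1001, 3)))  # 0x1008
print(hex(arm_align(0x1008, 3)))  # 0x1008 (already aligned)
```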
name .req register name
This creates an alias for register name called name. For example:
foo .req r0
.code [16|32]
This directive selects the instruction set being generated. The value 16 selects Thumb, with the value 32 selecting ARM.
.thumb
This performs the same action as .code 16.
.arm
This performs the same action as .code 32.
.force_thumb
This directive forces the selection of Thumb instructions, even if the target processor does not support those instructions.
.thumb_func
This directive specifies that the following symbol is the name of a Thumb encoded function. This information is necessary in order to allow the assembler and linker to generate correct code for interworking between Arm and Thumb instructions and should be used even if interworking is not going to be performed. The presence of this directive also implies .thumb.
.thumb_set
This performs the equivalent of a .set directive in that it creates a symbol which is an alias for another symbol (possibly not yet defined). This directive also has the added property in that it marks the aliased symbol as being a thumb function entry point, in the same way that the .thumb_func directive does.
.ltorg
This directive causes the current contents of the literal pool to be dumped into the current section (which is assumed to be the .text section) at the current location (aligned to a word boundary).
.pool
This is a synonym for .ltorg.
.global symbol or .globl symbol:
.global makes the symbol visible to ld. If you define symbol in your partial program, its value is made available to other partial programs that are linked with it. Otherwise, symbol takes its attributes from a symbol of the same name from another file linked into the same program.
Both spellings (`.globl' and `.global') are accepted, for compatibility with other assemblers.
.word expressions:
This directive expects zero or more expressions, of any section, separated by commas.
The size of the number emitted, and its byte order, depend on what target computer the assembly is for.
Warning: Special Treatment to support Compilers
Machines with a 32-bit address space, but that do less than 32-bit addressing, require the following special treatment. If the machine of interest to you does 32-bit addressing (or doesn't require it; see Machine Dependencies), you can ignore this issue.
In order to assemble compiler output into something that works, as occasionally does strange things to `.word' directives. Directives of the form `.word sym1-sym2' are often emitted by compilers as part of jump tables. Therefore, when as assembles a directive of the form `.word sym1-sym2', and the difference between sym1 and sym2 does not fit in 16 bits, as creates a secondary jump table, immediately before the next label. This secondary jump table is preceded by a short-jump to the first byte after the secondary table. This short-jump prevents the flow of control from accidentally falling into the new table. Inside the table is a long-jump to sym2. The original `.word' contains sym1 minus the address of the long-jump to sym2.
If there were several occurrences of `.word sym1-sym2' before the secondary jump table, all of them are adjusted. If there was a `.word sym3-sym4', that also did not fit in sixteen bits, a long-jump to sym4 is included in the secondary jump table, and the .word directives are adjusted to contain sym3 minus the address of the long-jump to sym4; and so on, for as many entries in the original jump table as necessary.
.balign[wl] abs-expr, abs-expr, abs-expr:
Pad the location counter (in the current subsection) to a particular storage boundary. The first expression (which must be absolute) is the alignment request in bytes. For example `.balign 8' advances the location counter until it is a multiple of 8. If the location counter is already a multiple of 8, no change is needed.
The second expression (also absolute) gives the fill value to be stored in the padding bytes. It (and the comma) may be omitted. If it is omitted, the padding bytes are normally zero. However, on some systems, if the section is marked as containing code and the fill value is omitted, the space is filled with no-op instructions.
The third expression is also absolute, and is also optional. If it is present, it is the maximum number of bytes that should be skipped by this alignment directive. If doing the alignment would require skipping more bytes than the specified maximum, then the alignment is not done at all. You can omit the fill value (the second argument) entirely by simply using two commas after the required alignment; this can be useful if you want the alignment to be filled with no-op instructions when appropriate.
The .balignw and .balignl directives are variants of the .balign directive. The .balignw directive treats the fill pattern as a two byte word value. The .balignl directive treats the fill pattern as a four byte longword value. For example, .balignw 4,0x368d will align to a multiple of 4. If it skips two bytes, they will be filled in with the value 0x368d (the exact placement of the bytes depends upon the endianness of the processor). If it skips 1 or 3 bytes, the fill value is undefined.
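The byte-count alignment, the optional maximum skip, and the two-byte fill pattern of .balignw can be sketched as a small Python model. The function names balign and balignw_fill are hypothetical and the code only mirrors the description above; it is not taken from as itself.

```python
def balign(location, alignment, max_skip=None):
    """Model of `.balign alignment,,max_skip`: return the new location counter."""
    padding = (-location) % alignment
    # If more bytes would be skipped than allowed, no alignment is done at all.
    if max_skip is not None and padding > max_skip:
        return location
    return location + padding


def balignw_fill(padding, pattern, byteorder="little"):
    """Model of the padding bytes emitted by `.balignw` for a 2-byte pattern."""
    if padding % 2:
        # The manual says the fill is undefined in this case, so refuse to guess.
        raise ValueError("fill value is undefined for an odd number of bytes")
    return pattern.to_bytes(2, byteorder) * (padding // 2)


print(balign(10, 8))                  # 16
print(balign(9, 8, max_skip=4))       # 9 (seven bytes needed, so nothing is skipped)
print(balignw_fill(2, 0x368d).hex())  # 8d36
```

The byteorder parameter stands in for the endianness remark in the text: on a big-endian target the same two bytes would appear as 36 8d.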
.macro:
.macro and .endm allow you to define macros that generate assembly output. For example, this definition specifies a macro sum that puts a sequence of numbers into memory:
.macro sum from=0, to=5
.long \from
.if \to-\from
sum "(\from+1)",\to
.endif
.endm
With that definition, `SUM 0,5' is equivalent to this assembly input:
.long 0
.long 1
.long 2
.long 3
.long 4
.long 5
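The recursion in the sum macro may be easier to follow as a Python model. This is a sketch under one simplification: the real macro passes the text "(\from+1)" along unevaluated, while the hypothetical expand_sum below evaluates the arithmetic at each step.

```python
def expand_sum(from_=0, to=5):
    """Return the lines that the `sum` macro would emit."""
    lines = [".long %d" % from_]          # .long \from
    if to - from_:                        # .if \to-\from (non-zero means true)
        lines += expand_sum(from_ + 1, to)  # sum "(\from+1)",\to
    return lines


for line in expand_sum(0, 5):
    print(line)
```

Running it prints the six .long lines listed above, one per recursive invocation.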
.macro macname
.macro macname macargs ...
Begin the definition of a macro called macname. If your macro definition requires arguments, specify their names after the macro name, separated by commas or spaces. You can qualify the macro argument to indicate whether all invocations must specify a non-blank value (through `:req'), or whether it takes all of the remaining arguments (through `:vararg'). You can supply a default value for any macro argument by following the name with `=deflt'. You cannot define two macros with the same macname unless it has been subject to the .purgem directive (see Purgem) between the two definitions. For example, these are all valid .macro statements:
- .macro comm
Begin the definition of a macro called comm, which takes no arguments.
- .macro plus1 p, p1
- .macro plus1 p p1
Either statement begins the definition of a macro called plus1, which takes two arguments; within the macro definition, write `\p' or `\p1' to evaluate the arguments.
- .macro reserve_str p1=0 p2
Begin the definition of a macro called reserve_str, with two arguments. The first argument has a default value, but not the second. After the definition is complete, you can call the macro either as `reserve_str a,b' (with `\p1' evaluating to a and `\p2' evaluating to b), or as `reserve_str ,b' (with `\p1' evaluating as the default, in this case `0', and `\p2' evaluating to b).
- .macro m p1:req, p2=0, p3:vararg
Begin the definition of a macro called m, with at least three arguments. The first argument must always have a value specified, but not the second, which instead has a default value. The third formal will get assigned all remaining arguments specified at invocation time.
When you call a macro, you can specify the argument values either by position, or by keyword. For example, `sum 9,17' is equivalent to `sum to=17, from=9'.
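How positional arguments, keyword arguments and defaults end up bound to the formals can be sketched in Python. This is a deliberately simplified model (no :req or :vararg handling, no error checking), and bind_macro_args is a made-up helper, not something as provides.

```python
def bind_macro_args(formals, actuals):
    """formals: list of (name, default) pairs, default None when absent.
    actuals: argument strings as written at the call, e.g. "9" or "to=17"."""
    values = dict(formals)
    names = [name for name, _ in formals]
    position = 0
    for arg in actuals:
        key, eq, value = arg.partition("=")
        if eq and key in values:
            values[key] = value            # keyword form: name=value
        else:
            if arg:                        # a blank argument keeps the default
                values[names[position]] = arg
            position += 1
    return values


sum_formals = [("from", "0"), ("to", "5")]
print(bind_macro_args(sum_formals, ["9", "17"]))
print(bind_macro_args(sum_formals, ["to=17", "from=9"]))
```

Both calls produce the same binding, which is the equivalence between `sum 9,17' and `sum to=17, from=9' stated above.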
Note that since each of the macargs can be an identifier exactly as any other one permitted by the target architecture, there may be occasional problems if the target hand-crafts special meanings to certain characters when they occur in a special position. For example, if the colon (:) is generally permitted to be part of a symbol name, but the architecture specific code special-cases it when occurring as the final character of a symbol (to denote a label), then the macro parameter replacement code will have no way of knowing that and consider the whole construct (including the colon) an identifier, and check only this identifier for being the subject to parameter substitution. So for example this macro definition:
.macro label l
\l:
.endm
might not work as expected. Invoking `label foo' might not create a label called `foo' but instead just insert the text `\l:' into the assembler source, probably generating an error about an unrecognised identifier.
Similarly problems might occur with the period character (`.') which is often allowed inside opcode names (and hence identifier names). So for example constructing a macro to build an opcode from a base name and a length specifier like this:
.macro opcode base length
\base.\length
.endm
and invoking it as `opcode store l' will not create a `store.l' instruction but instead generate some kind of error as the assembler tries to interpret the text `\base.\length'.
There are several possible ways around this problem:
- Insert white space
If it is possible to use white space characters then this is the simplest solution. e.g.:
.macro label l
\l :
.endm
- Use `\()'
The string `\()' can be used to separate the end of a macro argument from the following text. e.g.:
.macro opcode base length
\base\().\length
.endm
- Use the alternate macro syntax mode
In the alternative macro syntax mode the ampersand character (`&') can be used as a separator. e.g.:
.altmacro
.macro label l
l&:
.endm
Note: this problem of correctly identifying string parameters to pseudo ops also applies to the identifiers used in .irp (see Irp) and .irpc (see Irpc) as well.
.endm
Mark the end of a macro definition.
.exitm
Exit early from the current macro definition.
\@
as maintains a counter of how many macros it has executed in this pseudo-variable; you can copy that number to your output with `\@', but only within a macro definition.
LOCAL name [ , ... ]
Warning: LOCAL is only available if you select “alternate macro syntax” with `--alternate' or .altmacro. See .altmacro.
.align abs-expr, abs-expr, abs-expr:
Pad the location counter (in the current subsection) to a particular storage boundary. The first expression (which must be absolute) is the alignment required, as described below.
The second expression (also absolute) gives the fill value to be stored in the padding bytes. It (and the comma) may be omitted. If it is omitted, the padding bytes are normally zero. However, on some systems, if the section is marked as containing code and the fill value is omitted, the space is filled with no-op instructions.
The third expression is also absolute, and is also optional. If it is present, it is the maximum number of bytes that should be skipped by this alignment directive. If doing the alignment would require skipping more bytes than the specified maximum, then the alignment is not done at all. You can omit the fill value (the second argument) entirely by simply using two commas after the required alignment; this can be useful if you want the alignment to be filled with no-op instructions when appropriate.
The way the required alignment is specified varies from system to system. For the arc, hppa, i386 using ELF, i860, iq2000, m68k, or32, s390, sparc, tic4x, tic80 and xtensa, the first expression is the alignment request in bytes. For example `.align 8' advances the location counter until it is a multiple of 8. If the location counter is already a multiple of 8, no change is needed. For the tic54x, the first expression is the alignment request in words.
For other systems, including ppc, i386 using a.out format, arm and strongarm, it is the number of low-order zero bits the location counter must have after advancement. For example `.align 3' advances the location counter until it is a multiple of 8. If the location counter is already a multiple of 8, no change is needed.
This inconsistency is due to the different behaviors of the various native assemblers for these systems which GAS must emulate. GAS also provides .balign and .p2align directives, described later, which have a consistent behavior across all architectures (but are specific to GAS).
.section name:
Use the .section directive to assemble the following code into a section named name. This directive is only supported for targets that actually support arbitrarily named sections; on a.out targets, for example, it is not accepted, even with a standard a.out section name.
ELF Version
This is one of the ELF section stack manipulation directives. The others are .subsection (see SubSection), .pushsection (see PushSection), .popsection (see PopSection), and .previous (see Previous).
For ELF targets, the .section directive is used like this:
.section name [, "flags"[, @type[,flag_specific_arguments]]]
The optional flags argument is a quoted string which may contain any combination of the following characters:
- a: section is allocatable
- w: section is writable
- x: section is executable
- M: section is mergeable
- S: section contains zero terminated strings
- G: section is a member of a section group
- T: section is used for thread-local-storage
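Each flag letter corresponds to one bit in the section header's sh_flags field. The Python sketch below maps a flags string onto those bits. The SHF_* values are the ones defined by the ELF specification rather than by the text above, and section_flags is a hypothetical helper name.

```python
# sh_flags bit values as defined by the ELF specification.
SHF_WRITE, SHF_ALLOC, SHF_EXECINSTR = 0x1, 0x2, 0x4
SHF_MERGE, SHF_STRINGS = 0x10, 0x20
SHF_GROUP, SHF_TLS = 0x200, 0x400

FLAG_BITS = {
    "a": SHF_ALLOC,
    "w": SHF_WRITE,
    "x": SHF_EXECINSTR,
    "M": SHF_MERGE,
    "S": SHF_STRINGS,
    "G": SHF_GROUP,
    "T": SHF_TLS,
}


def section_flags(flags):
    """Combine the letters of a .section flags string into an sh_flags value."""
    bits = 0
    for letter in flags:
        bits |= FLAG_BITS[letter]  # KeyError for letters not listed above
    return bits


print(hex(section_flags("ax")))   # 0x6  (typical for code)
print(hex(section_flags("aw")))   # 0x3  (typical for writable data)
print(hex(section_flags("aMS")))  # 0x32 (mergeable string section)
```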
The optional type argument may contain one of the following constants:
- @progbits: section contains data
- @nobits: section does not contain data (i.e., section only occupies space)
- @note: section contains data which is used by things other than the program
- @init_array: section contains an array of pointers to init functions
- @fini_array: section contains an array of pointers to finish functions
- @preinit_array: section contains an array of pointers to pre-init functions
Many targets only support the first three section types.
Note on targets where the @ character is the start of a comment (e.g. ARM) then another character is used instead. For example the ARM port uses the % character.
If flags contains the M symbol then the type argument must be specified as well as an extra argument, entsize, like this:
.section name , "flags"M, @type, entsize
Sections with the M flag but not S flag must contain fixed size constants, each entsize octets long. Sections with both M and S must contain zero terminated strings where each character is entsize bytes long. The linker may remove duplicates within sections with the same name, same entity size and same flags. entsize must be an absolute expression. For sections with both M and S, a string which is a suffix of a larger string is considered a duplicate. Thus "def" will be merged with "abcdef"; a reference to the first "def" will be changed to a reference to "abcdef"+3.
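The suffix rule for sections with both M and S can be illustrated with a toy string pool in Python. This only demonstrates the effect described above for single-byte characters (entsize of 1); it makes no claim about how the linker actually performs the merge, and merge_strings is a made-up name.

```python
def merge_strings(strings):
    """Build a NUL-terminated string pool in which suffixes share storage.

    Returns (pool, offsets), where offsets maps each string to its
    position inside pool."""
    pool = b""
    offsets = {}
    # Place longer strings first so shorter ones can reuse their tails.
    for text in sorted(set(strings), key=len, reverse=True):
        data = text.encode() + b"\0"
        position = pool.find(data)   # the NUL forces a match at a string end
        if position < 0:
            position = len(pool)
            pool += data
        offsets[text] = position
    return pool, offsets


pool, offsets = merge_strings(["def", "abcdef"])
print(pool)            # b'abcdef\x00'
print(offsets["def"])  # 3, i.e. "abcdef"+3
```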
If flags contains the G symbol then the type argument must be present along with an additional field like this:
.section name , "flags"G, @type, GroupName[, linkage]
The GroupName field specifies the name of the section group to which this particular section belongs. The optional linkage field can contain:
- comdat: indicates that only one copy of this section should be retained
- .gnu.linkonce: an alias for comdat
Note: if both the M and G flags are present then the fields for the Merge flag should come first, like this:
.section name , "flags"MG, @type, entsize, GroupName[, linkage]
If no flags are specified, the default flags depend upon the section name. If the section name is not recognized, the default will be for the section to have none of the above flags: it will not be allocated in memory, nor writable, nor executable. The section will contain data.
For ELF targets, the assembler supports another type of .section directive for compatibility with the Solaris assembler:
.section "name"[, flags...]
Note that the section name is quoted. There may be a sequence of comma separated flags:
- #alloc: section is allocatable
- #write: section is writable
- #execinstr: section is executable
- #tls: section is used for thread local storage
This directive replaces the current section and subsection. See the contents of the gas testsuite directory gas/testsuite/gas/elf for some examples of how this directive and the other section stack directives work.
.type:
This directive is used to set the type of a symbol.
ELF Version
For ELF targets, the .type directive is used like this:
.type name , type description
This sets the type of symbol name to be either a function symbol or an object symbol. There are five different syntaxes supported for the type description field, in order to provide compatibility with various other assemblers.
Because some of the characters used in these syntaxes (such as `@' and `#') are comment characters for some architectures, some of the syntaxes below do not work on all architectures. The first variant will be accepted by the GNU assembler on all architectures so that variant should be used for maximum portability, if you do not need to assemble your code with other assemblers.
The syntaxes supported are:
.type <name> STT_<TYPE_IN_UPPER_CASE>
.type <name>,#<type>
.type <name>,@<type>
.type <name>,%<type>
.type <name>,"<type>"
The types supported are:
- STT_FUNC, function: Mark the symbol as being a function name.
- STT_GNU_IFUNC, gnu_indirect_function: Mark the symbol as an indirect function when evaluated during reloc processing. (This is only supported on Linux targeted assemblers).
- STT_OBJECT, object: Mark the symbol as being a data object.
- STT_TLS, tls_object: Mark the symbol as being a thread-local data object.
- STT_COMMON, common: Mark the symbol as being a common data object.
- STT_NOTYPE, notype: Does not mark the symbol in any way. It is supported just for completeness.
- gnu_unique_object: Marks the symbol as being a globally unique data object. The dynamic linker will make sure that in the entire process there is just one symbol with this name and type in use. (This is only supported on Linux targeted assemblers).
Note: Some targets support extra types in addition to those listed above.
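The five spellings of the type description all select the same symbol type. A Python sketch of that mapping follows. The numeric STT_* values come from the ELF specification (and its GNU extension for STT_GNU_IFUNC), not from the text above; symbol_type is a hypothetical helper, and gnu_unique_object is left out of the model.

```python
# Symbol type values as defined by the ELF specification.
STT_VALUES = {
    "STT_NOTYPE": 0,
    "STT_OBJECT": 1,
    "STT_FUNC": 2,
    "STT_COMMON": 5,
    "STT_TLS": 6,
    "STT_GNU_IFUNC": 10,
}

TYPE_NAMES = {
    "notype": "STT_NOTYPE",
    "object": "STT_OBJECT",
    "function": "STT_FUNC",
    "common": "STT_COMMON",
    "tls_object": "STT_TLS",
    "gnu_indirect_function": "STT_GNU_IFUNC",
}


def symbol_type(description):
    """Map a .type description in any of the five syntaxes to its STT_ value."""
    text = description.strip()
    if text.startswith("STT_"):
        return STT_VALUES[text]
    if text[0] in "#@%":
        return STT_VALUES[TYPE_NAMES[text[1:]]]
    if len(text) >= 2 and text[0] == '"' and text[-1] == '"':
        return STT_VALUES[TYPE_NAMES[text[1:-1]]]
    raise ValueError("unrecognised type description: " + description)


print(symbol_type("%function"))   # 2, the form that works on ARM
print(symbol_type("STT_OBJECT"))  # 1
```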
.text subsection:
Tells as to assemble the following statements onto the end of the text subsection numbered subsection, which is an absolute expression. If subsection is omitted, subsection number zero is used.
.set symbol, expression:
Set the value of symbol to expression. This changes symbol's value and type to conform to expression. If symbol was flagged as external, it remains flagged (see Symbol Attributes).
You may .set a symbol many times in the same assembly.
If you .set a global symbol, the value stored in the object file is the last value stored into it.
The syntax for set on the HPPA is `symbol .set expression'.
On Z80 set is a real instruction, use `symbol defl expression' instead.
.rept count:
Repeat the sequence of lines between the .rept directive and the next .endr directive count times.
For example, assembling
.rept 3
.long 0
.endr
is equivalent to assembling
.long 0
.long 0
.long 0
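The expansion can be mimicked with a short Python function. It is a minimal sketch that handles only a single, non-nested .rept block with a literal count; expand_rept is a hypothetical name.

```python
def expand_rept(source):
    """Expand non-nested .rept/.endr blocks in a string of assembly lines."""
    output, body, count = [], None, 0
    for line in source.splitlines():
        words = line.split()
        if words and words[0] == ".rept":
            count, body = int(words[1], 0), []   # start collecting the body
        elif words and words[0] == ".endr":
            output += body * count               # emit the body count times
            body = None
        elif body is not None:
            body.append(line.strip())
        else:
            output.append(line.strip())
    return output


print(expand_rept(".rept 3\n.long 0\n.endr"))
# ['.long 0', '.long 0', '.long 0']
```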
.byte expressions:
.byte expects zero or more expressions, separated by commas. Each expression is assembled into the next byte.
.int expressions:
Expect zero or more expressions, of any section, separated by commas. For each expression, emit a number that, at run time, is the value of that expression. The byte order and bit size of the number depend on what kind of target the assembly is for.
.short expressions:
.short is normally the same as `.word'. See .word.
In some configurations, however, .short and .word generate numbers of different lengths. See Machine Dependencies.
.hword expressions:
This expects zero or more expressions, and emits a 16 bit number for each.
This directive is a synonym for `.short'; depending on the target architecture, it may also be a synonym for `.word'.
.long expressions:
.long is the same as `.int'. See .int.
.org new-lc , fill:
Advance the location counter of the current section to new-lc. new-lc is either an absolute expression or an expression with the same section as the current subsection. That is, you can't use .org to cross sections: if new-lc has the wrong section, the .org directive is ignored. To be compatible with former assemblers, if the section of new-lc is absolute, as issues a warning, then pretends the section of new-lc is the same as the current subsection.
.org may only increase the location counter, or leave it unchanged; you cannot use .org to move the location counter backwards.
Because as tries to assemble programs in one pass, new-lc may not be undefined. If you really detest this restriction we eagerly await a chance to share your improved assembler.
Beware that the origin is relative to the start of the section, not to the start of the subsection. This is compatible with other people's assemblers.
When the location counter (of the current subsection) is advanced, the intervening bytes are filled with fill which should be an absolute expression. If the comma and fill are omitted, fill defaults to zero.
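The forward-only movement and the fill byte can be modelled in Python by treating a section as a growing byte array whose length is the location counter. This is a sketch of the rules as described, with a made-up function name org; it ignores subsections and the section checks on new-lc.

```python
def org(section, new_lc, fill=0):
    """Model of `.org new_lc, fill` on a section held as a bytearray."""
    current = len(section)       # the location counter, relative to section start
    if new_lc < current:
        raise ValueError(".org cannot move the location counter backwards")
    # Fill the gap between the old and new location with the fill byte.
    section.extend(bytes([fill & 0xFF]) * (new_lc - current))


text = bytearray(b"\x01\x02")
org(text, 4, 0xFF)
print(text.hex())  # 0102ffff
org(text, 6)
print(text.hex())  # 0102ffff0000
```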
.extern:
.extern is accepted in the source program, for compatibility with other assemblers, but it is ignored. as treats all undefined symbols as external.
.size:
This directive is used to set the size associated with a symbol.
ELF Version
For ELF targets, the .size directive is used like this:
.size name , expression
This directive sets the size associated with a symbol name. The size in bytes is computed from expression which can make use of label arithmetic. This directive is typically used to set the size of function symbols.
.hidden names:
This is one of the ELF visibility directives. The other two are .internal (see .internal) and .protected (see .protected).
This directive overrides the named symbols' default visibility (which is set by their binding: local, global or weak). The directive sets the visibility to hidden which means that the symbols are not visible to other components. Such symbols are always considered to be protected as well.
.equ symbol, expression:
This directive sets the value of symbol to expression. It is synonymous with `.set'; see .set.
The syntax for equ on the HPPA is `symbol .equ expression'.
The syntax for equ on the Z80 is `symbol equ expression'. On the Z80 it is an error if symbol is already defined, but the symbol is not protected from later redefinition. Compare Equiv.
.data subsection:
.data tells as to assemble the following statements onto the end of the data subsection numbered subsection (which is an absolute expression). If subsection is omitted, it defaults to zero.
.include "file":
This directive provides a way to include supporting files at specified points in your source program. The code from file is assembled as if it followed the point of the .include; when the end of the included file is reached, assembly of the original file continues. You can control the search paths used with the `-I' command-line option (see Command-Line Options). Quotation marks are required around file.
.error "string":
Similarly to .err, this directive emits an error, but you can specify a string that will be emitted as the error message. If you don't specify the message, it defaults to ".error directive invoked in source file". See Error and Warning Messages.
.error "This code has not been assembled and tested."
.local names:
This directive, which is available for ELF targets, marks each symbol in the comma-separated list of names as a local symbol so that it will not be externally visible. If the symbols do not already exist, they will be created.
For targets where the .lcomm directive (see Lcomm) does not accept an alignment argument, which is the case for most ELF targets, the .local directive can be used in combination with .comm (see Comm) to define aligned local common data.
.struct expression:
Switch to the absolute section, and set the section offset to expression, which must be an absolute expression. You might use this as follows:
.struct 0
field1:
.struct field1 + 4
field2:
.struct field2 + 4
field3:
This would define the symbol field1 to have the value 0, the symbol field2 to have the value 4, and the symbol field3 to have the value 8. Assembly would be left in the absolute section, and you would need to use a .section directive of some sort to change to some other section before further assembly.
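The offsets in this example follow a simple running total, which the Python sketch below reproduces. It is only a model of the arithmetic: struct_offsets is a hypothetical helper, and the field sizes are passed in explicitly instead of being written as `.struct fieldN + 4'.

```python
def struct_offsets(fields):
    """fields: list of (name, size_in_bytes); return each name's offset."""
    offsets = {}
    offset = 0                   # .struct 0
    for name, size in fields:
        offsets[name] = offset   # the label takes the current offset
        offset += size           # .struct <label> + size
    return offsets


# field3 is only a label in the example, so it is given size 0 here.
print(struct_offsets([("field1", 4), ("field2", 4), ("field3", 0)]))
# {'field1': 0, 'field2': 4, 'field3': 8}
```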