NASM, though it attempts to avoid the bureaucracy of assemblers like
MASM and TASM, is nevertheless forced to support a few directives.
These are described in this chapter.
NASM's directives come in two types: user-level directives and
primitive directives. Typically, each directive has a user-level
form and a primitive form. In almost all cases, we recommend that users use
the user-level forms of the directives, which are implemented as macros
that call the primitive forms.
Primitive directives are enclosed in square brackets; user-level
directives are not.
In addition to the universal directives described in this chapter, each
object file format can optionally supply extra directives in order to
control particular features of that file format. These
format-specific directives are documented along with the formats
that implement them, in chapter 6.
The BITS directive specifies whether NASM
should generate code designed to run on a processor operating in 16-bit
mode, or code designed to run on a processor operating in 32-bit mode. The
syntax is BITS 16 or
BITS 32.
In most cases, you should not need to use BITS
explicitly. The aout,
coff, elf and
win32 object formats, which are designed for use
in 32-bit operating systems, all cause NASM to select 32-bit mode by
default. The obj object format allows you to
specify each segment you define as either USE16
or USE32, and NASM will set its operating mode
accordingly, so the use of the BITS directive is
once again unnecessary.
The most likely reason for using the BITS
directive is to write 32-bit code in a flat binary file; this is because
the bin output format defaults to 16-bit mode in
anticipation of it being used most frequently to write DOS
.COM programs, DOS .SYS
device drivers and boot loader software.
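As a minimal sketch of that case, the following flat binary source overrides the 16-bit default. The load address and the label names are hypothetical, chosen only for illustration:

```nasm
        bits    32              ; override the bin format's 16-bit default
        org     0x100000        ; hypothetical load address, for illustration only

start:
        mov     eax, 0x12345678 ; 32-bit operand, native in this mode
        mov     [counter], eax  ; 32-bit address, native in this mode
        ret

counter:
        dd      0               ; storage for the value written above
```

Assembled with the bin output format, this should produce 32-bit code with no operand-size or address-size prefixes on these instructions.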
You do not need to specify BITS 32
merely in order to use 32-bit instructions in a 16-bit DOS program; if you
do, the assembler will generate incorrect code because it will be writing
code targeted at a 32-bit platform, to be run on a 16-bit one.
When NASM is in BITS 16 state, instructions
which use 32-bit data are prefixed with an 0x66 byte, and those referring
to 32-bit addresses have an 0x67 prefix. In
BITS 32 state, the reverse is true: 32-bit
instructions require no prefixes, whereas instructions using 16-bit data
need an 0x66 and those working on 16-bit addresses need an 0x67.
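The effect can be seen by assembling the same kinds of instruction in both states. This is an illustrative sketch; the comments state only which prefix the assembler is expected to emit:

```nasm
        bits    16
        mov     ax, bx          ; 16-bit data: no prefix
        mov     eax, ebx        ; 32-bit data: 0x66 prefix emitted
        mov     ax, [ebx]       ; 32-bit address: 0x67 prefix emitted

        bits    32
        mov     eax, ebx        ; 32-bit data: no prefix
        mov     ax, bx          ; 16-bit data: 0x66 prefix emitted
        mov     eax, [bx]       ; 16-bit address: 0x67 prefix emitted
```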
The BITS directive has an exactly equivalent
primitive form, [BITS 16] and
[BITS 32]. The user-level form is a macro which
has no function other than to call the primitive form.
Note that the space is necessary: BITS32 will
not work!
The SECTION directive
(SEGMENT is an exactly equivalent synonym)
changes which section of the output file the code you write will be
assembled into. In some object file formats, the number and names of
sections are fixed; in others, the user may make up as many as they wish.
Hence SECTION may sometimes give an error
message, or may define a new section, if you try to switch to a section
that does not (yet) exist.
The Unix object formats, and the bin object
format (but see section 6.1.3),
all support the standardised section names .text,
.data and .bss for the
code, data and uninitialised-data sections. The
obj format, by contrast, does not recognise these
section names as being special, and indeed will strip off the leading
period of any section name that has one.
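A short sketch of a source file that uses the three standardised names follows. The label names are hypothetical:

```nasm
        section .data
greeting:
        db      "hello", 0      ; initialised data

        section .bss
buffer:
        resb    64              ; reserved, uninitialised space

        section .text
entry:
        mov     eax, greeting   ; code is assembled into .text
        ret
```

Each SECTION line switches the target section, so the order in which the sections appear in the source is a matter of convenience.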
The SECTION directive is unusual in that its
user-level form functions differently from its primitive form. The
primitive form, [SECTION xyz], simply switches
the current target section to the one given. The user-level form,
SECTION xyz, however, first defines the
single-line macro __SECT__ to be the primitive
[SECTION] directive which it is about to issue,
and then issues it. So the user-level directive
SECTION .text
expands to the two lines
%define __SECT__ [SECTION .text]
[SECTION .text]
Users may find it useful to make use of this in their own macros. For
example, the writefile macro defined in
section 4.3.3 can be usefully
rewritten in the following more sophisticated form:
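A sketch of such a rewrite, consistent with the description that follows, is given here. It assumes the two-parameter writefile from section 4.3.3 (a file handle followed by a greedy string parameter) and the DOS write call (INT 0x21 with AH set to 0x40); those details are assumptions, not quoted from this section:

```nasm
%macro  writefile 2+

        [section .data]         ; primitive form: leaves __SECT__ untouched

  %%str:        db      %2      ; the string, placed in the data section
  %%endstr:

        __SECT__                ; switch back to the user's section

        mov     dx, %%str       ; address of the string
        mov     cx, %%endstr-%%str ; its length in bytes
        mov     bx, %1          ; file handle
        mov     ah, 0x40        ; DOS write function (assumed)
        int     0x21

%endmacro

        section .text
        writefile 1, "hello, world", 13, 10
```

In the usage line, handle 1 is assumed to be standard output under DOS.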
This form of the macro, once passed a string to output, first switches
temporarily to the data section of the file, using the primitive form of
the SECTION directive so as not to modify
__SECT__. It then declares its string in the data
section, and then invokes __SECT__ to switch back
to whichever section the user was previously working in. It thus
avoids the need, in the previous version of the macro, to include a
JMP instruction to jump over the data, and also
does not fail if, in a complicated OBJ format
module, the user could potentially be assembling the code in any of several
separate code sections.
The ABSOLUTE directive can be thought of as an
alternative form of SECTION: it causes the
subsequent code to be directed at no physical section, but at the
hypothetical section starting at the given absolute address. The only
instructions you can use in this mode are the
RESB family.
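For instance, a listing along the following lines reserves three consecutive variables starting at absolute address 0x1A. The word sizes are chosen to match the addresses quoted in the discussion below; the 16-word length of kbuf is an assumption:

```nasm
absolute 0x1A

kbuf_chr    resw    1       ; at 0x1A
kbuf_free   resw    1       ; at 0x1C
kbuf        resw    16      ; at 0x1E (length assumed for illustration)
```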
This example describes a section of the PC BIOS data area, at segment
address 0x40: the above code defines kbuf_chr to
be 0x1A, kbuf_free to be 0x1C, and
kbuf to be 0x1E.
The user-level form of ABSOLUTE, like that of
SECTION, redefines the
__SECT__ macro when it is invoked.
STRUC and ENDSTRUC
are defined as macros which use ABSOLUTE (and
also __SECT__).
ABSOLUTE doesn't have to take an absolute
constant as an argument: it can take an expression (actually, a critical
expression: see section 3.8) and it
can be a value in a segment. For example, a TSR can re-use its setup code
as run-time BSS like this:
        org     100h            ; it's a .COM program
        jmp     setup           ; setup code comes last
        ; the resident part of the TSR goes here
setup:
        ; now write the code that installs the TSR here
absolute setup
runtimevar1     resw    1
runtimevar2     resd    20
tsr_end:
This defines some variables 'on top of' the setup code, so that after
the setup has finished running, the space it took up can be re-used as data
storage for the running TSR. The symbol tsr_end can be used to calculate
the total size of the part of the TSR that needs to be made resident.
EXTERN is similar to the MASM directive
EXTRN and the C keyword
extern: it is used to declare a symbol which is
not defined anywhere in the module being assembled, but is assumed to be
defined in some other module and needs to be referred to by this one. Not
every object-file format can support external variables: the
bin format cannot.
The EXTERN directive takes as many arguments
as you like. Each argument is the name of a symbol:
extern _printf
extern _sscanf,_fscanf
Some object-file formats provide extra features to the
EXTERN directive. In all cases, the extra
features are used by suffixing a colon to the symbol name followed by
object-format specific text. For example, the obj
format allows you to declare that the default segment base of an external
should be the group dgroup by means of the
directive
extern _variable:wrt dgroup
The primitive form of EXTERN differs from the
user-level form only in that it can take only one argument at a time: the
support for multiple arguments is implemented at the preprocessor level.
You can declare the same variable as EXTERN
more than once: NASM will quietly ignore the second and later
redeclarations. You can't declare a variable as
EXTERN as well as something else, though.
GLOBAL is the other end of
EXTERN: if one module declares a symbol as
EXTERN and refers to it, then in order to prevent
linker errors, some other module must actually define the symbol
and declare it as GLOBAL. Some assemblers use the
name PUBLIC for this purpose.
The GLOBAL directive applying to a symbol must
appear before the definition of the symbol.
GLOBAL uses the same syntax as
EXTERN, except that it must refer to symbols
which are defined in the same module as the
GLOBAL directive. For example:
global _main
_main:
; some code
GLOBAL, like EXTERN,
allows object formats to define private extensions by means of a colon. The
elf object format, for example, lets you specify
whether global data items are functions or data:
global hashlookup:function, hashtable:data
Like EXTERN, the primitive form of
GLOBAL differs from the user-level form only in
that it can take only one argument at a time.
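The two directives are typically used together. The following sketch assumes a 32-bit object format in which C symbols carry a leading underscore and arguments are passed on the stack; the label msg is hypothetical:

```nasm
        extern  _printf         ; defined in the C library, not in this module
        global  _main           ; declared before the definition below

        section .data
msg:    db      "hello", 10, 0  ; NUL-terminated string for printf

        section .text
_main:
        push    dword msg       ; single argument: address of the string
        call    _printf
        add     esp, 4          ; caller removes the argument
        xor     eax, eax        ; return 0 from main
        ret
```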
The COMMON directive is used to declare
common variables. A common variable is much like a global variable
declared in the uninitialised data section, so that
common intvar 4
is similar in function to
global intvar
section .bss
intvar resd 1
The difference is that if more than one module defines the same common
variable, then at link time those variables will be merged, and
references to intvar in all modules will point at
the same piece of memory.
Like GLOBAL and
EXTERN, COMMON supports
object-format specific extensions. For example, the
obj format allows common variables to be NEAR or
FAR, and the elf format allows you to specify the
alignment requirements of a common variable:
common commvar 4:near ; works in OBJ
common intarray 100:4 ; works in ELF: 4 byte aligned
Once again, like EXTERN and
GLOBAL, the primitive form of
COMMON differs from the user-level form only in
that it can take only one argument at a time.
The CPU directive restricts assembly to those
instructions which are available on the specified CPU.
Options are:
CPU 8086        Assemble only 8086 instruction set
CPU 186         Assemble instructions up to the 80186 instruction set
CPU 286         Assemble instructions up to the 286 instruction set
CPU 386         Assemble instructions up to the 386 instruction set
CPU 486         486 instruction set
CPU 586         Pentium instruction set
CPU PENTIUM     Same as 586
CPU 686         P6 instruction set
CPU PPRO        Same as 686
CPU P2          Same as 686
CPU P3          Pentium III (Katmai) instruction sets
CPU KATMAI      Same as P3
CPU P4          Pentium 4 (Willamette) instruction set
CPU WILLAMETTE  Same as P4
CPU PRESCOTT    Prescott instruction set
CPU IA64        IA64 CPU (in x86 mode) instruction set
All options are case insensitive. An instruction is accepted only if it
is available on the selected CPU or on an earlier one. By default, all
instructions are available.
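As an illustrative sketch, the restriction can be tightened and relaxed within one source file. The commented-out line is one that the assembler would be expected to reject under the lower setting:

```nasm
        bits    16
        cpu     286
        push    word 1          ; immediate push is available from the 186 onwards
        ; movzx eax, bl         ; 386 instruction: expected to be rejected under CPU 286

        cpu     386
        movzx   eax, bl         ; accepted once the limit is raised
```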