3.8. Critical Expressions
=========================

   A limitation of NASM is that it is a two-pass assembler; unlike TASM
and others, it will always do exactly two assembly passes. Therefore it
is unable to cope with source files that are complex enough to require
three or more passes.

   The first pass is used to determine the size of all the assembled
code and data, so that the second pass, when generating all the code,
knows all the symbol addresses the code refers to. So one thing NASM
can't handle is code whose size depends on the value of a symbol
declared after the code in question. For example,

             times (label-$) db 0
     label:  db      'Where am I?'

   The argument to `TIMES' in this case could equally legally evaluate
to anything at all; NASM will reject this example because it cannot
tell the size of the `TIMES' line when it first sees it. It will just
as firmly reject the slightly paradoxical code

             times (label-$+1) db 0
     label:  db      'NOW where am I?'

in which _any_ value for the `TIMES' argument is by definition wrong!

   NASM rejects these examples by means of a concept called a _critical
expression_, which is defined to be an expression whose value is
required to be computable in the first pass, and which must therefore
depend only on symbols defined before it. The argument to the `TIMES'
prefix is a critical expression; for the same reason, the arguments to
the `RESB' family of pseudo-instructions are also critical expressions.
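
   By contrast, a critical expression that refers only to symbols
defined earlier in the source is accepted. The following is a minimal
sketch; the names `buffer', `bufsize' and `scratch' are illustrative
only and not part of NASM.

     buffer:  db      'hello, world'          ; 12 bytes
              times   64-($-buffer) db ' '    ; `buffer' is already defined

              section .bss
     bufsize  equ     128
     scratch: resb    bufsize                 ; `bufsize' is already defined

   Both arguments can be evaluated when the line is first seen: the
`TIMES' line pads `buffer' out to 64 bytes in total, and the `RESB'
line reserves 128 bytes.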

   Critical expressions can crop up in other contexts as well: consider
the following code.

                     mov     ax,symbol1
     symbol1         equ     symbol2
     symbol2:

   On the first pass, NASM cannot determine the value of `symbol1',
because `symbol1' is defined to be equal to `symbol2' which NASM hasn't
seen yet. On the second pass, therefore, when it encounters the line
`mov ax,symbol1', it is unable to generate the code for it because it
still doesn't know the value of `symbol1'. On the next line, it would
see the `EQU' again and be able to determine the value of `symbol1',
but by then it would be too late.

   NASM avoids this problem by defining the right-hand side of an `EQU'
statement to be a critical expression, so the definition of `symbol1'
would be rejected in the first pass.
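
   One way to make such code acceptable is to define `symbol2' before
the `EQU' that refers to it. A sketch of the reordered source, using
the same symbol names as above:

     symbol2:
     symbol1         equ     symbol2         ; `symbol2' is already known
                     mov     ax,symbol1

   The right-hand side of the `EQU' now depends only on a symbol that
has already been seen, so it can be evaluated in the first pass.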

   There is a related issue involving forward references: consider this
code fragment.

             mov     eax,[ebx+offset]
     offset  equ     10

   NASM, on pass one, must calculate the size of the instruction `mov
eax,[ebx+offset]' without knowing the value of `offset'. It has no way
of knowing that `offset' is small enough to fit into a one-byte offset
field and that it could therefore get away with generating a shorter
form of the effective-address encoding; for all it knows, in pass one,
`offset' could be a symbol in the code segment, and it might need the
full four-byte form. So it is forced to compute the size of the
instruction to accommodate a four-byte address part. In pass two, having
made this decision, it is now forced to honour it and keep the
instruction large, so the code generated in this case is not as small
as it could have been. This problem can be solved by defining `offset'
before using it, or by forcing byte size in the effective address by
coding `[byte ebx+offset]'.
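
   Both remedies are sketched below. The second symbol, `later', is a
hypothetical name introduced here only so that the two forms can appear
side by side.

     offset  equ     10
             mov     eax,[ebx+offset]        ; `offset' is known in pass one

             mov     eax,[byte ebx+later]    ; one-byte offset field forced
     later   equ     10

   Assuming 32-bit code (`BITS 32'), each of these instructions should
assemble to three bytes, where the unhinted forward reference would
take six.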

   Note that use of the `-On' switch (with n>=2) makes some of the above
no longer true (see *Note Section 2.1.16::).

