Making of Tiger #6, Stack frames

Intro

In this post we’re going to add support for stack frames. Here I’ve made some notes while reading the chapter 6 to make sure I understand things clearly.

Stack

A stack is a region of memory that grows downward and shrinks upward (like icicles). The top of the stack is it’s lowest memory address. We treat the stack as a big array, with a special register – the stack pointer (SP).

The stack grows only at the entry to a function, by the increment large enough to hold all the local variables for that function. The stack shrinks at the exit from the function by the same amount.

Stack frame (SF)

We need a stack frame abstraction because:

  • Local variables are pushed/popped in large batches on function entry/exit
  • When local variables created they are not always initialized right away
  • After many variables have been pushed, we want to continue accessing variables deep within the stack

An SF consists of:

  • Some of local variables (others are kept in CPU registers)
  • Return address (where control should return after completion of the current function)
  • Temporaries and saved registers
  • Outgoing/incoming arguments

Another name for a stack frame is a function activation record. Stack frame layout depends on the ISA and the programming language being compiled.

Here is a typical stack frame layout:

:    :             (higher addresses)
:    :             previous frame
| i1 | [FP+8]      (2-nd argument)
| i2 | [FP+4]      (1-st argument)
| SL | [FP]        (static link)
-------------------------------------------------
| l1 | [FP-4]      (1-st local variable)
| l2 | [FP-8]      (2-nd local variable)
:    :
| lk | [FP-k*4]    (k-nd local variable)
| RA | [FP-k*4-k]  (return address)
:    :
| t1 |             (1-st temp) temporaries and
| t2 |             (2-nd temp) saved registers
:    :
| o1 |             (1-st outgoing arg) outgoing
| o2 |             (2-nd outgoing arg) arguments
:    :
| SP | [FP-?]      (current stack pointer)
-------------------------------------------------
:    :             next frame
:    :             (lower addresses)

Stack pointer (SP)

Always points to the top of the stack (lowest memory address in a stack).

Frame pointer (FP)

A term we use for convenience.

On entry to some function f it allocates a new SF by subtracting the frame size from the SP and the old FP is saved in the SF.

Basically, if the frame size is fixed: FP = SP + size(SF).

On function enter:

  • SF[0...k] = V[0...k] (put local variables in SF)
  • SF[k + 1] = FP (save the current FP in SF)
  • FP = SP (old SP becomes the current FP)

On function exit:

  • SP = FP (copy FP back to SP)
  • FP = SF[k + 1] (fetch back the saved FP)

The size of SF is not known until quite late in the compilation process, but we want to know the offsets of function arguments and local variables much earlier. We put args and locals right near the FP at offsets that are known early (temporaries and saved registers are known later).

Registers

We want to use CPU registers as much as possible to make compiled programs run faster. Also we have a limited set of registers available and many functions that use them. We must be able to save and restore them (to/from a stack frame).

We say that a register is caller-save if the caller must save and restore it, otherwhise it is callee-save. Which registers are preserved depends on the machine architecture.

Parameters and returns addresses

In the early days (in 70’s) on most machines function arguments and return addresses were always passed on the stack. On modern machines, for efficiency (to avoid high memory traffic), the first several arguments, result and return address are passed in registers. The rest args are passed in memory.

Why and how usage of registers helps

Suppose f(a, b, c) calls g(x, y, z).

The f receives its args in registers \( r_1, r_2 \) and \( r_3 \), but it must pass the args x, y and z in the same registers \( r_1, r_2 \) and \( r_3 \). So the f should save those registers to the its SF before calling g and we’re back to the memory traffic problem.

So how the usage of registers saved any time?

  1. Leaf procedures – those that don’t call other procedures. The most procedures called are leafs which need not write their incoming args to memory. And often they don’t need to allocatie a stack frame at all.
  2. Interprocedural register allocation. A technique used by some optimizing compilers to analyze all the functions in an entire program at once and assign them different registers in which to receive parameters and hold local variables.
  3. Sometimes f doesn’t need its args anymore by the time it calls g, so it can just overwrite corresponding registers.
  4. Register windows available on some architectures. Each function can allocate a fresh set of registers without memory traffic.

How incoming parameters are passed

The first k args \( a_1, \dots, a_k \) are passed in registers and the rest \( a_{k+1} ,\dots, a_n \) are placed at the end of the callers own frame.

If the callee needed to write any of these arguments (including those passed in registers) to memory, it would write them to the very beginning of its own SF.

For simplicity we’ll assume that by default everything is passed and kept in registers, except for cases when:

  • variable is passed by reference
  • variable is accessed by a nested function
  • variable is too big to fit into a single register
  • array variables
  • register holding the variable is needed for some other purpose
  • there are many local variables and temporary values that they won’t all fit in registers

Escaping variables

Variable escapes if:

  • It is passed by reference
  • It is accessed from a nested function
  • Its address is taken (not applicable for Tiger)

But when the compiler sees a formal parameter or local variable for the first time it doesn’t yet know whether it escapes and how many registers the calculation will require.

So we must assign some provisional locations (see the alloc_local function) to all formals and locals, and decide later which of them should really go in registers.

  • Block structure – a feature that allow the inner functions use variables declared in outer functions
  • Static link (SL) – a pointer to the function statically enclosing the current one

Static links are stored in stack frames.

Example of program with nested functions in Tiger.

type tree = {
  key: string,
  left: tree,
  right: tree
}

function prettyprint(tree: tree) : string =
  let
    var output := ""

    function write(s: string) =
      output := concat(output, s)

    function show(n: int, t: tree) =       <--- [2]
      let
        function indent(s: string) = (
          for i := 1 to n do               <--- [5]
             write(" ");
             output := concat(output, s);
             write("\n")
        )
      in
        if t = nil
        then ident(".")                    <--- [3]
        else (
          indent(t.key);
          show(n + 1, t.left);             <--- [4]
          show(n + 1, t.right)
        )
      end
  in
    show(0, tree);                         <--- [1]
    output
  end
  • 1: prettyprint calls show, passing prettyprint’s own FP as show’s SL
  • 2: show stores its SL into its own SF
  • 3: show calls indent, passing its own FP as indent’s SL
  • 4: show calls show, passing its own SL as the SL
  • 5: indent uses the value n from show’s frame, it can find n by using the SL (which points at the SF of show)

Each call requires a one or more fetches using SL’s to resolve variables declared in outers functions.

Interface

Stack frame and calling conventions

Every machine architecture might have a different standart stack frame layout.

Suppose we have a call of the following function g(x1, x2, x3) where the 1-st parameter escapes. Here is an example of how the Frame.mk g [true; false; false] works on three different architecture.

Architecture Formals
Pentium [InFrame 8; InFrame 12; InFrame 16]
MIPS [InFrame 0; InReg t157; InReg t158]
SPARC [InFrame 8; InReg t157; InReg t158]


Pentium MIPS SPARC
\( SF_{SP+0} \leftarrow FP \) \( SP \leftarrow SP-k \) SAVE \(\ SP \), \( -k \), \(\ SP \)
\( FP \leftarrow SP \) \( SF_{SP+k+0} \leftarrow r_2 \) \( SF_{FP+68} \leftarrow i_0 \)
\( SP \leftarrow SP-k \) \( t_{157} \leftarrow r_4 \) \( t_{157} \leftarrow i_1 \)
\( t_{157} \leftarrow r_5 \) \( t_{158 }\leftarrow i_2 \)


\(SF\) – stack frame memory region
\(k\) – number of formal parameters
\(t\) – temporary location
\(r_i\) – MIPS registers
\(i_i\) – SPARC registers

Thats why we’ll use an abstract interface for stack frames.

Most of PC’s and laptops are based on the x86 and x64 ISA. Mobile devices usually based on the ARM ISA.

  • x86 – family of (32-bit) ISA based on the Intel 8086 and 8088 microprocessors
  • x64 – a 64-bit version of the x86 ISA supporting larger amounts of virtual and physical memory + additional general-purpose registers

x86

  • Has 8 general-purpose registers: eax, ebx, ecx, edx, ebp, esp, esi, edi
  • 32-bit (4-byte) words

x64

  • Has 16 general-purpose registers: rax, rbx, rcx, rdx, rbp, rsp, rsi, rdi, r8 - r15
  • 64-bit (8-byte) words

Lets use the x64 ISA, there are 2 flavors of it:

  • Microsoft x64
  • System V AMD64 (which I prefer)

According to the System V AMD64 ABI, the first 6 integer arguments are passed in left-to-right order in rdi, rsi, rdx, rcx, r8 and r9 registers, respectively.

Arguments 5 and higher are passed in memory. They are pushed onto the stack in reversed (right-to-left) order.

Helpful links:

Here is our interface for a stack frame:

(** Holds information about formal parameters and
    local variables allocated in this frame *)
type t [@@deriving show]

(** Abstract location of a formal parameter (function argument) or
    a local variable that may be placed in a frame or in a register *)
type access [@@deriving show]

(** Makes a new frame for a function with the
    given label and formal parameters *)
val mk : Temp.label -> bool list -> t

(** Extracts a list of accesses denoting
    the locations where the formal parameters will be
    kept at runtime, as seen from inside the callee *)
val formals : t -> access list

(** Allocates a new local variable in the given frame or in a register.
    The boolean argument specifies whether the new variable
    escapes and needs to go in the frame.
    Returns "in-memory" access with an offset from the frame pointer or
    "in-register" access in case if it can be allocated in a register *)
val alloc_local : t -> bool -> access

To make a new frame for some function f we’ll call Frame.mk label formals, where:

  • label: Temp.label – static memory address of the f function (that is yet to be determined)
  • formals: bool listtrue for each parameter that escapes and false for each that doesn’t

The SF should contain information about formal parameters and local variables allocated:

type t = {
  (* label at which the function's machine code begins *)
  label: Temp.label;
  (* locations of all the formals *)
  formals: access list;
  (* number of locals allocated so far *)
  locals: int ref;
  (* instructions required to implement the "view shift" *)
  instrs: Instruction.t list;
} [@@deriving show]

We don’t want to implement the view shift right now, so the Instruction.t is defined like this:

type t = unit [@@deriving show]

ABI

Lets add some constants for System V ADM64 ABI to the Frame module. We’ll need them later.

(* word size in bytes *)
let word_size = 64 / 8 (* = 8 bytes *)

(* special registers *)
let fp = "rbp" (* frame pointer *)
let sp = "rsp" (* stack pointer *)

(* x64 "parameter"-registers *)
let rdi = "rdi"
let rsi = "rsi"
let rdx = "rdx"
let rcx = "rcx"
let	r8  = "r8"
let r9  = "r9"
let arg_regs = [rdi; rsi; rdx; r8; r9]

(* other x64 registers *)
let rbx = "rbx"
let r10 = "r10"
let r11 = "r11"
let r12 = "r12"
let r13 = "r13"
let r14 = "r14"
let r15 = "r15"

(* registers that are preserved by the caller *)
let caller_regs = [r10; r11]

(* registers that are preserved by the callee *)
let callee_regs = [rbx; r12; r13; r14; r15]

Access

The access data type describes location of a formal parameter or a local variable that may be placed in a frame or in a register:

type access =
   (* memory location at the specific offset from the frame pointer *)
  | InFrame of int
  (* register location *)
  | InReg of Temp.t
  [@@deriving show]

The rest of the Frame module implementation:

(* creates a new location for a formal parameter or
   a local variable, given its index and [esc] flag *)
let mk_access i = function
  | true -> InFrame ((i + 1) * (-word_size)) (* escapes - alloc in frame *)
  | false -> InReg (Temp.mk ()) (* doesn't escape - use temp (register) *)

(* makes a new stack frame *)
let mk label formals =
  let formals = List.mapi mk_access formals in
  let locals = ref 0 in
  (* don't know yet what instructions we need,
     so just leave it empty for now *)
  let instrs = [] in
  { label; formals; locals; instrs }

let formals { formals; _ } = formals

(* local variables that do not escape can be allocated in a register,
   escaping variables must be allocated in the frame *)
let alloc_local { locals; _ } = function
  | true ->
    incr locals;
    let offset = (!locals + 1) * (-word_size) in
    InFrame offset
  | false ->
    InReg (Temp.mk ())

alloc_local allocates a new local variable in the given frame or in a register. The boolean argument specifies whether the new variable escapes and needs to go in the frame. Returns in-memory access with an offset from the frame pointer.

alloc_local t true
InFrame -4
alloc_local t true
InFrame -8
alloc_local t false
InReg t1
alloc_local t false
InReg t2

A clever compiler might optimize the frame size by noticing when two frame-resident variables could be allocated to the same slot.

Temporaries and labels

We need a couple of more abstractions to represent a “not known yet” register and machine-language locations:

  • Temporary – an abstract name for a value that is temporarily held in some register
  • Label (just like label in assembly language) – an abstract name for a static machine-language location whose exact address is yet to be determined

The Temp module manages these two distinct sets of names:

(** Abstract name for a local variable that
    is temporarily held in a register *)
type t [@@deriving show]

(** Abstract name for a static memory address that
    is yet to be determined *)
type label [@@deriving show]

(** Returns a new temporary from an infinite set of temporaries *)
val mk : unit -> t

(** Returns a new [label], whose assembly-language name is
    the given string (if given), otherwise it is generated. *)
val mk_label : string option -> label

The implmenetation is trivial:

module S = Symbol

type t = int [@@deriving show]
type label = Symbol.t [@@deriving show]

let mk =
  let idx = ref (-1) in
  fun () ->
    incr idx;
    !idx

let mk_label name =
  let idx = ref (-1) in
  match name with
  | Some s ->
    S.symbol s
  | None ->
    incr idx;
    let name = string_of_int !idx in
    S.symbol name

View shift

Function arguments are seen differently by the caller and the callee. In the book this is referred to as the view shift. For example, a caller my put a parameter into register \(r_6\), but the callee may want to access it from register \(r_9\). We want to handle this view shift in the Frame module.

For each formal parameter we should calculate:

  • How it will be seen by callee (in a register, or in a frame location)
  • What instructions are required to implement this view shift

To keep things simple, we are not going to implement this right now.

The Frame module should not know anything about static links, because we want it be independent of any specific source language being compiled. We’ll use Translate module to manage static links.

The static link (which is a pointer to the enclosing function) is passed to a function in a register and stored into the frame, just like any other escaping formal parameter. So we will treat it as one by adding another true value at the front the list of booleans representing formal parameters. It means that for some function f(x,y) (assuming neigher x nor y escapes) we’ll have the following list:

[true; false; false]

Notice the extra true at the beginning.

Translate module

The Translate module handles the notion of nested scopes (via static links), providing the interface to the Semant module.

We separate Semant from Translate module to avoid a huge, unweildy module that does both: type checking and semantic translation.

Ok, I think at this point we are ready to write some initial implementation:

type expr = unit

(** Represents a nesting level *)
type level

(** Describes a way to access a formal parameter or a local variable.
    Basically, it is just a [Frame.access] plus a nesting [level] *)
type access = level * Frame.access

(** Outermost level at which all
    top-level functions and variables are declared *)
val outermost : level

(** Creates a new "nesting level" for a function *)
val mk : level option -> Temp.label -> bool list -> level

(** Extracts a list of accesses *)
val formals : level -> access list

(** Creates an [access] at the given [level].
    The argument [bool] specifies whether the variable escapes *)
val alloc_local : level -> bool -> access

We don’t know yet what the expr will be, thats why it is a type alias for unit. And here is the basic implementation:

type expr = unit [@@deriving show]

type level = {
  parent: level option;
  frame: Frame.t
} [@@deriving show]

type access = level * Frame.access

let outermost =
  let label = Temp.mk_label None in
  { parent = None;
    frame = Frame.mk label [];
  }

let mk parent label formals =
  let formals = true :: formals in
  let frame = Frame.mk label formals in
  { parent; frame }

(* Returns formals (excluding the static link) *)
let formals lev =
  (* exclude the SL *)
  let args = List.tl (Frame.formals lev.frame) in
  List.map (fun access -> lev, access) args

let alloc_local lev esc =
  let access = Frame.alloc_local lev.frame esc in
  lev, access

Notice the formals = true :: formals in the mk function, it is there to represent a static link.

Env

We must keep the Translate.level along with the Temp.label in the FunEntry. Also we need to update the VarEntry to include the Translate.access.

Here is how our updated Env module looks like:

module T = Type
module Table = Symbol.Table

type access

(** Variable entry *)
type var_entry = {
  access: Translate.access; (** Describes how to access the variable **)
  ty: T.t (** Type of the variable *)
}

(** Function entry *)
type fun_entry = {
  level: Translate.level; (** Nesting level *)
  label: Temp.label; (** Label of the machine-code entry point *)
  formals: T.t list; (** Types of the formal parameters *)
  result: T.t (** Type of the result returned by the function **)
}

(** Term-level entry *)
type entry =
  | VarEntry of var_entry
  | FunEntry of fun_entry

(** Contains bindings for predefined functions *)
val base_venv : entry Table.t

(** Predefined types *)
val base_tenv : T.t Table.t

Escapes

To calculate which variables should be stored in registers and which should be allocated in the frame we’ll have a special module named Escape.

Here its interface in OCaml:

(** Depth (nesting level) of the function that
    contains the variable declaration *)
type depth

(** Environment that maps variables to pairs of depth and
    a reference to a boolean flag indicating if a
    particular variable escapes *)
type env = (depth * bool ref) Symbol.Table.t

(** Looks for escaping variables and records this
    info in the [escape] fields of the abstract syntax *)
val traverse_prog : Syntax.expr -> unit

The traverse_prog function is going to traverse the entire AST looking for escaping uses of every variable. Also, we’ll have a separate environment env to track “escaping” of variables.

Whenever a variable or formal-parameter declaration \(a\) is found at static function-nesting depth \(d\) then a new binding (d, ref false) is entered into the environment env. This new environment is used in processing expressions within the scope of the variable. Then whenever this var or formal-parameter \(a\) is used at depth \(\gt d\) (which means that it escapes), then our “escape ref” is set to true in the environment.

This should take place before semantic analysis, because the Semant module needs to know about escaping variables to do it’s work.

Semant

Now lets go and update our Semant module.

I won’t paste the whole implementation, the code is here.

Summary