A directed acyclic graph (DAG!) is a directed graph that contains no cycles. A rooted tree is a special kind of DAG and a DAG is a special kind of directed graph. For example, a DAG may be used to represent common subexpressions in an optimizing compiler.

+ + . . . . . . . . * () *<---| () . . . . . . | . . . . . . . . | . | a b f * a b | f | . . ^ v . . | | a b |--<---- Tree DAG

expression: a*b+f(a*b)

Example of Common Subexpression.

The common subexpression a*b need only be compiled once but its value can be used twice.

A DAG can be used to represent prerequisites in a university course, constraints on operations to be carried out in building construction, in fact an arbitrary partial-order `<'. An edge is drawn from a to b whenever a<b. A partial order `<' satisfies:

(i) transitivity, a<b and b<c implies a<c

(ii) non-reflexive, not(a < a)

Standard graph terminology is used throughout this section. In addition,

for a directed edge e we denote by src(e) and dst(e) the source vertex and

the destination vertex of e.

‘**Optimizations’ of Basic Blocks**

Equivalent transformations: Two basic block are equivalent if they compute the same set of expressions.

-Expressions: are the values of the live variables at the exit of the block.

Two important classes of local transformations:

1:structure preserving transformations:

*common sub expression elimination

*dead code elimination

*renaming of temporary variables

*interchange of two independent adjacent statements.

*2:algebraic transformations (countlessly many):*

*simplify expressions

*replace expensive operations with cheaper ones.

**The DAG Representation of Basic Blocks**

Directed acyclic graphs (DAGs) give a picture of how the value computed by each statement in the basic block is used in the subsequent statements of the block.

Definition: a dag for a basic block is a directed acyclic graph with the following labels on nodes:

leaves are labeled with either variable names or constants.

they are unique identifiers

from operators we determine whether l- or r-value.

represent initial values of names. Subscript with 0.

interior nodes are labeled by an operator symbol.

Nodes are also (optionally) given a sequence of identifiers for labels.

- interior node = computed values

- identifiers in the sequence – have that value.

**Algorithm DAG: Constructing a DAG**

*Input: A basic block of three address statements. No pointers or array*

*references.*

*Output:A DAG where each node n has a value, VALUE(n), which is an*

*operator in the case of an interior node or a variable name if the node is a*

*leaf. Also, each node n has a (possibly empty) list of identifiers attached,*

ID(n).

**Method:**

**begin**

**for each instruction I in the basic block do**

**if I is of form (x := y op z) then**

**Find a node, ny, such that y ε ID(ny) (only**

**one can exist). If it cannot be found, create a**

**leaf with VALUE(ny)=ID(ny)={y}.**

**... same for z, node is nz ...**

**Find or, if not found, create node m such that**

**VALUE(m) = op and ny and nz are**

**resp. its left and rignt child.**

**if there is p such that x ε ID(p) then**

**ID(p) := ID(p) - {x}**

**fi**

**ID(m) := ID(m) + {x}**

**elseif I is of form ( x := op y) then**

**... similar code as above...**

**elseif I is of form (x := y) then**

**Find a node, ny, such that y ε ID(ny) (only**

**one can exist). If it cannot be found, create a**

**leaf with VALUE(ny)=ID(ny)={y}.**

**m := ny**

**if there is p such that x ε ID(p) then**

**ID(p) := ID(p) - {x}**

**fi**

**ID(m) := ID(m) + {x}**

**fi**

**od**

**end**

With the DAG it is easy to determine which identifiers have their values

used in the block. These are the identifiers for which a leaf is created.

Also, it is easy to determine the statements that compute values that

could be used outside the block. These are the statements whose

associated node, m, still has its left hand side, x, in ID(m) at the end of

the algorithm.

To improve the chances of finding common subexpressions,

commutative operations should be normalized. For example,when both

operands are variables, alphabetical order could be used. Also, if one of

the operands is a leaf and the other an internal node, the leaf could be

placed on the left.

Constant folding can be applied by replacing the nodes that evaluate to a

constant, say c, with a node m such that VALUE(m) is c.

The previous algorithm was introduced by Cocke and Schwartz and is

known as “Value Numbering of Basic Blocks”.

When there are references to array elements and to pointers, we need to

make sure that:

1. Common subexpressions are properly identified

2. The order of instructions generated from the DAG is correct.

To make sure that common subexpressions are correctly identified an

extra bit is added to each node. Every time there is an assignment to an

element of an array, all nodes representing elements of that array are

killed by setting the bit. The ID of a killed node cannot be extended with

new variable names. That is, they cannot be recognized as common

subexpressions.

Also, when there is an assignment of the form *p :=a all nodes in the

DAG must be killed if we do not know what p might point to.

Similar observations could be made about formal parameters when the

language allows aliasing.

To guarantee correct order of generated instructions, the DAG could be

extended with arcs that enforce the following rules:

1. Any two references to an array one of which is a write, must be

performed in the original order.

2. Any two references, if one is a write and at least one of the references

is through a pointer must be performed in the original order

## No comments:

## Post a Comment