In data structures, trees are frequently used to represent hierarchies. One of the special types of trees is the expression tree, which is a special type of tree structure for expressions in the mathematical or logical domains. A common operation with trees is to evaluate some mathematical expression; there are also cases when analyzing some statements or expressions would be helpful, and often during the compilers process of development you may take a closer look at a tree structure to assist you in developing the compiler before instead of just evaluating the expression. It is helpful to know how expression trees work and how to make them so that you can evaluate the expressions efficiently.
The expression tree is a binary tree, and as such each internal node represents an operator, and each leaf node represents an operand.
For instance, the expression tree for the equation (3 + 5) * (2 - 1) looks like this:
*
/ \
+ -
/ \ / \
3 5 2 1
Expression trees are a primary data structure for representing arithmetic and logical expressions in many notations (infix, postfix, and prefix). Expression trees can represent many kinds of expressions, depending on the operators, operands, and relationships based on the arrangement of the tree. Outlined below are the kinds of expressions that can be represented by expression trees:
Arithmetic expressions use mathematical operations like addition, subtraction, multiplication, and division and are typically based on numerical values (whether constants or variables).
+
/ \
3 *
/ \
4 2
In this example:
Boolean expressions represent logical operations such as AND, OR, NOT, and combinations of these operations. They refer to Boolean variables or constants (true and false).
OR
/ \
AND C
/ \
A B
In this case:
Relational expressions compare two values and are used commonly in conditional expressions based upon relational operators—comparative expressions—like: =, >, <, >=, <=, and !=.
AND
/ \
> <
/ \ / \
A B C D
Here:
Conditional expressions are expressions that can be used in a decision-making context, such as in if-else statements or a ternary operator. A common example is the ternary conditional operator (? :)
?
/ \
> :
/ \ / \
x 10 y z
In this case:
An expression tree can be constructed using infix, prefix, or postfix notations. Let's explain how they are constructed and displayed referring to an example:
+
/ \
A *
/ \
B C
In infix notation, operators are placed between operands. This is the most common way of writing arithmetic expressions.
Step 1: Start with the lowest precedence operator (if multiple operators are present).
Step 2: Make this operator the root of the tree.
Step 3: The left subtree is created from the part of the expression to the left of the operator, and the right subtree is created from the part of the expression to the right of the operator.
Step 4: Recursively apply this process to the left and right subexpressions.
In postfix notation, operators are placed after their operands.
Step 1: Start from the left of the postfix expression and push operands onto a stack.
Step 2: When an operator is encountered, pop the required number of operands (2 for binary operators) from the stack, create a new node for the operator, and make the popped operands its children.
Step 3: Push the new operator node back onto the stack.
Step 4: At the end of the expression, the stack will contain a single node, which is the root of the expression tree.
Prefix notation consists of operators coming before their operands.
Step 1: Start from the right of the prefix expression and push operands onto a stack.
Step 2: When an operator is encountered, pop the required number of operands (2 for binary operators) from the stack, create a new node for the operator, and make the popped operands its children.
Step 3: Push the new operator node back onto the stack.
Step 4: At the end of the expression, the stack will contain a single node, which is the root of the expression tree.
| Notation | Example | Expression Tree |
|---|---|---|
| Infix | A + B * C | + (root), with left child A and right child * (subtree with B and C as leaves) |
| Postfix | A B C * + | Same as the infix tree, but the construction order follows postfix operations |
| Prefix | + A * B C | Same as the infix tree, but the construction order follows prefix operations |
Here are the advantages and disadvantages of the expression tree:
Some of the algorithms are commonly used to manipulate and evaluate expression trees:
Expression trees do not only represent mathematical operations; they have been implemented in various programming languages such as C, C++, and Java, to name a few. In fact, these data structures get a lot of attention in compiler design and are used in the evaluation of arithmetic or mathematical expressions. The representation of expressions as a binary tree makes the processes of parsing, evaluation, and optimization (e.g., constant folding and common subexpression elimination) more efficient.
Expression trees are typically constructed in C by means of structures for nodes, where every node contains an operator or operand and pointers to its left and right subtrees. The construction is usually done by parsing the postfix (or any other notation) expression and using a stack to create the tree. The code for the structure and the algorithm might look like this:
#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>
// Define tree node
struct Node {
char data;
struct Node *left, *right;
};
// Define stack for tree nodes
struct Stack {
int top;
unsigned capacity;
struct Node **array;
};
// Create new tree node
struct Node* newNode(char data) {
struct Node* node = (struct Node*)malloc(sizeof(struct Node));
node->data = data;
node->left = node->right = NULL;
return node;
}
// Stack operations
struct Stack* createStack(unsigned capacity) {
struct Stack* stack = (struct Stack*)malloc(sizeof(struct Stack));
stack->top = -1;
stack->capacity = capacity;
stack->array = (struct Node**)malloc(stack->capacity * sizeof(struct Node*));
return stack;
}
int isEmpty(struct Stack* stack) {
return stack->top == -1;
}
void push(struct Stack* stack, struct Node* node) {
stack->array[++stack->top] = node;
}
struct Node* pop(struct Stack* stack) {
if (!isEmpty(stack))
return stack->array[stack->top--];
return NULL;
}
// Construct expression tree from postfix
struct Node* constructExpressionTree(char postfix[]) {
struct Stack* stack = createStack(100); // arbitrary capacity
for (int i = 0; postfix[i] != '\0'; i++) {
// If operand, create a node and push
if (isalnum(postfix[i])) {
push(stack, newNode(postfix[i]));
}
// If operator, pop two nodes and make them children
else {
struct Node* node = newNode(postfix[i]);
node->right = pop(stack);
node->left = pop(stack);
push(stack, node);
}
}
// Root of the tree
return pop(stack);
}
// Inorder traversal
void inorder(struct Node* root) {
if (root != NULL) {
inorder(root->left);
printf("%c ", root->data);
inorder(root->right);
}
}
// Preorder traversal
void preorder(struct Node* root) {
if (root != NULL) {
printf("%c ", root->data);
preorder(root->left);
preorder(root->right);
}
}
// Postorder traversal
void postorder(struct Node* root) {
if (root != NULL) {
postorder(root->left);
postorder(root->right);
printf("%c ", root->data);
}
}
int main() {
char postfix[] = "ab+c*d-";
struct Node* root = constructExpressionTree(postfix);
printf("Inorder Traversal : ");
inorder(root);
printf("\n");
printf("Preorder Traversal : ");
preorder(root);
printf("\n");
printf("Postorder Traversal : ");
postorder(root);
printf("\n");
return 0;
}
By this method, the system can easily handle the parsing and evaluation of arithmetic expressions. This is one of the ways in which the compiler can be optimized such as constant folding, where the subtrees that represent constant expressions are evaluated at compile time.
An expression tree in C++ is achieved through the use of object-oriented features where nodes and tree operations are encapsulated in classes. The method of construction is similar to C, but it is more readable and maintainable. Here is a short excerpt of the code for illustration:
#include <iostream>
#include <stack>
using namespace std;
// Node class for expression tree
class Node {
public:
char data;
Node* left;
Node* right;
Node(char val) : data(val), left(NULL), right(NULL) {}
};
// Function to construct expression tree from postfix
Node* constructExpressionTree(const string& postfix) {
stack<Node*> st;
for (char ch : postfix) {
// If operand, create a node and push to stack
if (isalnum(ch)) {
st.push(new Node(ch));
}
// If operator, pop two nodes and make them children
else {
Node* node = new Node(ch);
// Right child is the top of stack
node->right = st.top();
st.pop();
// Left child is the next
node->left = st.top();
st.pop();
st.push(node);
}
}
// Final node is root
return st.top();
}
// Inorder traversal (to verify the tree)
void inorder(Node* root) {
if (root != NULL) {
inorder(root->left);
cout << root->data << " ";
inorder(root->right);
}
}
// Preorder traversal
void preorder(Node* root) {
if (root != NULL) {
cout << root->data << " ";
preorder(root->left);
preorder(root->right);
}
}
// Postorder traversal
void postorder(Node* root) {
if (root != NULL) {
postorder(root->left);
postorder(root->right);
cout << root->data << " ";
}
}
int main() {
string postfix = "AB*CD/+";
Node* root = constructExpressionTree(postfix);
cout << "Inorder Traversal : ";
inorder(root);
cout << endl;
cout << "Preorder Traversal : ";
preorder(root);
cout << endl;
cout << "Postorder Traversal : ";
postorder(root);
cout << endl;
return 0;
}
Besides this, C++ expression trees are also a major component of compiler designs used for parsing mathematical expressions and for further stages of compiler like optimization by methods such as common subexpression elimination.
Java provides robust support for building expression trees using classes and data structures such as Stack. Below is an example of constructing an expression tree from a postfix expression:
import java.util.Stack;
class Node {
char data;
Node left, right;
Node(char data) {
this.data = data;
left = right = null;
}
}
public class ExpressionTree {
// Construct Expression Tree from Postfix
public static Node constructTree(String postfix) {
Stack<Node> stack = new Stack<>();
for (char ch : postfix.toCharArray()) {
if (!isOperator(ch)) {
// Operand → create a node and push
stack.push(new Node(ch));
} else {
// Operator → pop two nodes
Node node = new Node(ch);
node.right = stack.pop();
node.left = stack.pop();
stack.push(node);
}
}
return stack.pop();
}
// Check if character is operator
private static boolean isOperator(char ch) {
return ch == '+' || ch == '-' || ch == '*' || ch == '/';
}
// Inorder traversal
public static void inorder(Node root) {
if (root != null) {
inorder(root.left);
System.out.print(root.data + " ");
inorder(root.right);
}
}
// Preorder traversal
public static void preorder(Node root) {
if (root != null) {
System.out.print(root.data + " ");
preorder(root.left);
preorder(root.right);
}
}
// Postorder traversal
public static void postorder(Node root) {
if (root != null) {
postorder(root.left);
postorder(root.right);
System.out.print(root.data + " ");
}
}
// Driver Code
public static void main(String[] args) {
String postfix = "AB*CD/+";
Node root = constructTree(postfix);
System.out.print("Inorder Traversal : ");
inorder(root);
System.out.println();
System.out.print("Preorder Traversal : ");
preorder(root);
System.out.println();
System.out.print("Postorder Traversal : ");
postorder(root);
System.out.println();
}
}
Java's object-oriented features make it convenient to extend expression trees for advanced compiler optimizations or symbolic computation, such as constant folding and detecting common subexpressions.
Infix Expression : a + b * c - d
Prefix Expression : - * + a b c d
Postfix Expression : a b + c * d -
To summarize, an expression tree is a binary tree that represents mathematical expressions in which the leaf nodes are operands and the interior nodes are operators. Expression trees allow for evaluation of an expression while preserving the precedence of operators and for transformations to be simplified. Expression trees are often used in both compilers and calculators in the computer algebra systems for parsing, evaluating or optimizing expressions. Using post-order traversal, arbitrary expressions can be evaluated or simplified quickly.
Yes, expression trees are an important component of symbolic computation systems, allowing for symbolic manipulation and simplification of algebraic expressions.
The time complexity for evaluating an expression tree is O(n), where n means the total number of nodes. This is because a post-order traversal processes each node exactly once.