CS358 Principles of Programming Languages | Winter 2025
Basic Information
This language report is all about Rust, a language meant to occupy the same space as C and C++ while providing more safety and ease of use than either. Graydon Hoare started the project in 2006 while working at Mozilla, and the first stable release came out in 2015 with many contributions from the open source community. Rust is meant to handle a wide variety of tasks, from low-level systems programming to networking and cloud services. Ultimately, the goal of Rust is to maximize the speed at which a developer can write code (like Python) while also maximizing how quickly a program will run (like C/C++). It pursues this by building many safeguards into the compiler that prevent common errors such as concurrency bugs, and by managing memory in a novel way that needs no garbage collector. Like C/C++, Rust compiles directly to native machine code to maximize runtime efficiency.
At the time of this report, I'm working with stable release 1.84.1. Rust has a three-tiered release system. Stable releases are the most thoroughly tested and suitable for most developers, and a new one ships every 6 weeks. Beta releases are feature-frozen previews of the next stable release, used to iron out bugs and quirks; they are also released every 6 weeks, offset from stable. Finally, nightly releases are the most cutting edge (and unstable) versions of Rust you can find. They are updated daily and meant for developers who want to experiment with new features.
Rust is a very well documented language, offering many resources for someone getting started. Its main site is an index of all the resources available for the language, including an introductory book and a more detailed reference for niche behaviors and tasks. You can also find a list of all releases as well as the project's GitHub repository.
The language is developed and governed by the open source Rust Project and supported by The Rust Foundation, a non-profit formed in 2021 with five founding corporate members (Amazon, Huawei, Google, Microsoft, and Mozilla).
Compiler
Rust's compiler, rustc, uses an LLVM backend (the name was originally short for Low Level Virtual Machine) to produce machine code from Rust source code (.rs files). It works with Cargo, Rust's build tool and package manager, which takes care of downloading and building dependencies for the user. rustc supports cross-compilation, letting developers target platforms ranging from embedded systems to WebAssembly. Rust does not publish an official, complete formal grammar, in BNF (Backus–Naur form) or otherwise. The closest it gets is the lexical structure section in the reference. The grammar is effectively defined by the parser component of the compiler, which anyone can dig into because it is open source. Third parties have defined Rust's grammar in this way, but none of these grammars have been adopted officially. The compiler takes 7 steps to produce machine code.
Lexical Analysis & Parsing: rustc first breaks the code down into tokens and organizes them into an AST (abstract syntax tree)
Macro Expansion: Macros such as println! or #[derive(Debug)] are expanded into regular Rust code
Name Resolution & Type Checking: The compiler resolves variables, functions, and modules to their true definitions. Since Rust is statically typed, type errors are reported in this stage
Intermediate Representations (IR): Here the compiler lowers the code through 3 separate intermediate stages to analyze and optimize it
- High Level IR: A desugared, slightly simplified version of the AST for the compiler to work with
- Mid Level IR: A simpler, control-flow-based representation where complex constructs (such as loops and pattern matching) are reduced to simpler parts. The borrow checker runs on this representation, guaranteeing memory safety at compile time
- Low Level IR: At this point the code is translated into LLVM IR for the next stage of optimization
Optimization: LLVM heavily optimizes the code here, for example by unrolling loops or removing dead code
Code Generation: The optimized LLVM IR is converted into the machine code of the target architecture (ARM, x86 etc.)
Linker: Here rustc invokes a linker to combine external libraries or dependencies with the compiled code and produce the final executable file
Primitive Types
Integers & Floating Point
Rust natively supports integers and floating point numbers in several sizes and precisions. Integers come in 8, 16, 32, 64, and 128 bit sizes in signed or unsigned variants, along with the pointer-sized isize and usize, while floats offer 32 or 64 bit precision (always signed). Signed integers use two's complement, which determines their ranges, and floats follow the IEEE 754 standard. On integer overflow, a program compiled in debug mode exits, or in Rust terminology, panics. Compiling with the --release flag omits this check by default and performs two's complement wrapping instead of exiting. In the case of a u8 variable, 256 would become 0, 257 would become 1, and so on. Dividing an integer by zero causes the program to panic at runtime. Dividing a nonzero float by 0.0, however, results in either inf or -inf as defined by the IEEE standard, and 0.0 / 0.0 results in NaN (not a number). The ranges of the different types and some initializations are shown in the code snippet below:
// i8: -128 to 127
// u8: 0 to 255
// ...
// i128: -2^127 to 2^127 - 1
// u128: 0 to 2^128 - 1
// ...
// f32: 7 digit precision
// f64: 15 digit precision
fn main() {
    let x: i32 = -42;
    let y: u32 = 42;
    let big: i64 = 10_000_000_000;
    let hex: u32 = 0xff; // Hexadecimal
    let octal: u32 = 0o77; // Octal
    let binary: u32 = 0b1010_1010; // Binary
}
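Because overflow behaves differently in debug and release builds, the standard library also offers methods whose behavior does not depend on the build profile. The following is only a minimal sketch of two of them, checked_add and wrapping_add; the helper names add_checked and add_wrapping are made up for illustration.

```rust
/// Hypothetical helper: returns None instead of panicking or wrapping on overflow.
fn add_checked(a: u8, b: u8) -> Option<u8> {
    a.checked_add(b)
}

/// Hypothetical helper: always wraps on overflow, in debug and release builds alike.
fn add_wrapping(a: u8, b: u8) -> u8 {
    a.wrapping_add(b)
}

fn main() {
    println!("{:?}", add_checked(200, 55)); // Some(255)
    println!("{:?}", add_checked(255, 1)); // None
    println!("{}", add_wrapping(255, 1)); // 0
    println!("{}", add_wrapping(255, 2)); // 1

    // Floating point division by zero follows IEEE 754 and never panics.
    let zero = 0.0_f64;
    println!("{}", 1.0 / zero); // inf
    println!("{}", (zero / zero).is_nan()); // true
}
```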
Booleans
Rust has a dedicated boolean type, bool, that is completely separate from integers. In fact, you cannot implicitly treat integers as booleans in conditional tests as you can in other languages such as C. There are no "truthy" or "falsy" values in Rust, only true and false. The boolean operators are standard: && (and), || (or), and ! (not). An example of boolean types is shown below:
fn main() {
    let is_true: bool = true;
    let is_false: bool = false;
    // Conditions need no parentheses; && and || short-circuit
    if is_false && is_true {
        println!("This won't print");
    } else if is_false || is_true {
        println!("This will print!");
    }
}
Strings
There are two main string types in Rust, String and &str. A String owns its data on the heap and can grow or be modified, while a &str is an immutable borrowed slice of string data (a mutable slice, &mut str, exists but is rarely used). Both string types are encoded as UTF-8. As a result, slicing a string in the middle of a multi-byte character will cause your program to panic. An example of slicing a String into a &str is shown below.
let s1 = String::from("HelloWorld");
let s2 = &s1[0..5]; // byte range 0..5, the end index is excluded
println!("{}", s2); // "Hello"
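When the byte index is not known to fall on a character boundary, the standard library's str::get returns an Option instead of panicking. The sketch below assumes a made-up helper named prefix and a sample string containing one two-byte character; it is meant as an illustration rather than a recommended API.

```rust
/// Hypothetical helper: returns the first `n` bytes of `s` as a slice,
/// or None if `n` is out of range or falls inside a multi-byte character.
fn prefix(s: &str, n: usize) -> Option<&str> {
    s.get(..n)
}

fn main() {
    let ascii = "HelloWorld";
    let accented = "h\u{e9}llo"; // U+00E9 (e with acute) occupies bytes 1 and 2

    println!("{:?}", prefix(ascii, 5)); // Some("Hello")
    println!("{:?}", prefix(accented, 2)); // None: byte 2 is inside the accented character
    println!("{}", prefix(accented, 3).is_some()); // true: byte 3 is a boundary
    println!("{}", accented.is_char_boundary(2)); // false
}
```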
Due to the danger of slicing by byte index, there is a separate, safer way of walking through a string: the built-in .chars() method, which iterates over its characters. An example using it is shown below.
let s1 = String::from("Hello, world");
for c in s1.chars() {
    println!("{}", c); // Prints every character in the string on separate lines
}
The + operator concatenates strings. It takes a String on the left and a &str on the right. The example below showcases this, as well as how ownership transfers when concatenating strings.
let s1 = String::from("Hello");
let s2 = String::from(", world!");
let s3 = s1 + &s2; // s1 gets moved, s3 now owns the new string
println!("{}", s3); // prints "Hello, world!"
It is possible to do the above example without transferring ownership by using the format! macro. This macro works the same as println!, but returns a new String instead of printing. An example is shown below.
let s1 = String::from("Hello");
let s2 = String::from(", world!");
let s3 = format!("{}{}", s1, s2); // creates a brand new string s3 containing the contents of s1 & s2 without altering the originals
println!("{}", s1); // prints "Hello"
println!("{}", s2); // prints ", world!"
println!("{}", s3); // prints "Hello, world!"
For searching operations, Rust includes contains, starts_with, and find as string functions. Examples of each are shown below.
let s1 = String::from("Hello, world");
println!("{}", s1.contains("world")); // prints true
println!("{}", s1.starts_with("world")); // prints false
println!("{}", s1.find("world").unwrap()); // prints 7
find returns an Option holding the byte position of the first letter of the match, and .unwrap() extracts that position. If no matching substring is found, find returns None and .unwrap() will panic. To avoid this, the programmer can handle the None case explicitly or use .unwrap_or(), which returns a default value supplied by the programmer when no match is found.
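As a small illustration of handling the no-match case without a panic, the sketch below matches on the Option returned by find. The helper name describe_match and the choice of the string length as a fallback value are assumptions made only for this example.

```rust
/// Hypothetical helper: reports where `needle` first occurs in `haystack`.
fn describe_match(haystack: &str, needle: &str) -> String {
    match haystack.find(needle) {
        Some(index) => format!("found at byte {}", index),
        None => String::from("not found"),
    }
}

fn main() {
    let s = String::from("Hello, world");
    println!("{}", describe_match(&s, "world")); // found at byte 7
    println!("{}", describe_match(&s, "Rust")); // not found

    // unwrap_or supplies a fallback value instead of panicking.
    let fallback = s.find("Rust").unwrap_or(s.len());
    println!("{}", fallback); // 12
}
```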
Rust also supports changing the case of an entire string through the to_lowercase() and to_uppercase() functions.
let s1 = String::from("Hello, world");
println!("{}", s1.to_lowercase()); // prints "hello, world"
println!("{}", s1.to_uppercase()); // prints "HELLO, WORLD"
Operators & Precedence in Expressions
All of the regular arithmetic operators are allowed in expressions, however some of their uses and precedence rules differ from languages such as C++ or Python. Some operators behave differently depending on the context they are written in: * means multiplication if it sits between two expressions, dereferences a reference or pointer when placed in front of an expression, and also appears in raw pointer types such as *const T. I will go into the most notable or unique operators and rules Rust offers. The full list of operators in Rust can be found here, while their order and precedence can be found here. Many of the operators can also be overloaded for user-defined types by implementing the traits in std::ops or std::cmp.
Auto-chaining comparisons in Rust is not allowed, as opposed to Python where a statement like 2 < 3 < 4 is valid, meaning (2 < 3) and (3 < 4). In Rust you must specify 2 < 3 && 3 < 4.
In C++ and Python, programmers can freely compare different numeric types (such as ints to floats). In Rust, both sides of a numeric comparison must have the same type, so the programmer has to explicitly cast one of the operands. The example below compares a 32 bit integer and a 64 bit float by casting the float to an integer.
let x: i32 = 10;
let y: f64 = 10.5;
println!("{}", (y as i32) < x); // evaluates to false (the cast truncates 10.5 to 10, which is not less than 10)
? acts as an error propagation operator that short circuits and returns errors early. This operator can be used in functions that return Result<T, E> or Option<T> types. It is built to reduce boilerplate for error handling. An example is shown below where ? is used to handle any errors opening the file with the passed file_path string.
fn read_file(file_path: &str) -> Result<String, std::io::Error> {
    // On Err, ? returns the error to the caller immediately
    let content = std::fs::read_to_string(file_path)?;
    Ok(content)
}
The .. and ..= operators specify ranges with an exclusive or inclusive upper bound when placed between integer expressions, as in 1..5 (1 through 4) or 1..=5 (1 through 5). Python's range, by comparison, always excludes its upper bound. Either bound can be left out, and .. used on its own indicates a full range. The example below leaves out the lower bound to take the first elements of an array.
let arr = [1, 2, 3, 4, 5];
let slice = &arr[..3];
println!("{:?}", slice); // prints the first 3 elements "[1, 2, 3]"
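To make the difference between the two range operators concrete, here is a minimal sketch. The helper names sum_exclusive and sum_inclusive are hypothetical and exist only for this illustration.

```rust
/// Hypothetical helper: sums the integers in start..end (end excluded).
fn sum_exclusive(start: u32, end: u32) -> u32 {
    (start..end).sum()
}

/// Hypothetical helper: sums the integers in start..=end (end included).
fn sum_inclusive(start: u32, end: u32) -> u32 {
    (start..=end).sum()
}

fn main() {
    println!("{}", sum_exclusive(1, 5)); // 10
    println!("{}", sum_inclusive(1, 5)); // 15

    let arr = [1, 2, 3, 4, 5];
    println!("{:?}", &arr[1..3]); // [2, 3]
    println!("{:?}", &arr[3..]); // [4, 5]
    println!("{:?}", &arr[..]); // [1, 2, 3, 4, 5]
}
```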
The .. operator is actually used in a few different contexts to perform various functions. We covered its use in ranges, but it can also be used when building a struct to copy the remaining fields from another struct without modifying that struct's values. An example of this is shown below.
struct Point { x: i32, y: i32, z: i32 }
let p1 = Point { x: 1, y: 2, z: 3 };
let p2 = Point { x: 4, ..p1 };
println!("{}, {}, {}", p1.x, p1.y, p1.z); // prints "1, 2, 3"
println!("{}, {}, {}", p2.x, p2.y, p2.z); // prints "4, 2, 3"
It also serves a purpose in match statements, letting the programmer ignore middle elements in array and tuple patterns. This is similar to Python's * used for unpacking, but within pattern matching. The example below will only print the first and last value of the array, 1 and 5.
let numbers = [1, 2, 3, 4, 5];
match numbers {
    [first, .., last] => println!("First: {}, Last: {}", first, last),
}
Other operators also behave interestingly within patterns. The | operator acts as a logical or, letting the user list multiple alternatives for a single match arm. The pattern binding operator @ binds a value to a name while also testing it against a pattern. In the snippet below, the match statement prints "Small number: 3" because 3 falls within the range pattern attached to n by @.
let x = Some(3);
match x {
    Some(n @ 1..=4) => println!("Small number: {}", n),
    Some(n) => println!("Big number: {}", n),
    None => println!("No number"),
}
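The | operator described above is not exercised by that snippet, so here is a brief sketch of it alongside a range pattern. The function name classify and its categories are invented for this illustration.

```rust
/// Hypothetical helper: buckets a number using `|` alternatives and a range pattern.
fn classify(n: u32) -> &'static str {
    match n {
        0 => "zero",
        1 | 2 | 3 => "small", // any one of the listed values matches this arm
        4..=9 => "medium",
        _ => "large",
    }
}

fn main() {
    println!("{}", classify(2)); // small
    println!("{}", classify(7)); // medium
    println!("{}", classify(42)); // large
}
```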
Let Bindings
Rust supports let bindings for pattern matching in multiple places, such as if let and while let. As of version 1.65, Rust also supports let - else, which binds a pattern or takes an early exit when the pattern does not match, shown in the example below.
fn get_value(opt: Option<i32>) -> i32 {
    let Some(x) = opt else { return 0 }; // If None, return 0
    x * 2
}
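For comparison with let - else, the following sketch shows if let and while let. The helper names describe and drain_sum are assumptions made for this example only.

```rust
/// Hypothetical helper: uses `if let` to handle only the Some case specially.
fn describe(opt: Option<i32>) -> String {
    if let Some(x) = opt {
        format!("got {}", x)
    } else {
        String::from("nothing")
    }
}

/// Hypothetical helper: uses `while let` to pop values until the vector is empty.
fn drain_sum(mut stack: Vec<i32>) -> i32 {
    let mut total = 0;
    while let Some(top) = stack.pop() {
        total += top;
    }
    total
}

fn main() {
    println!("{}", describe(Some(3))); // got 3
    println!("{}", describe(None)); // nothing
    println!("{}", drain_sum(vec![1, 2, 3])); // 6
}
```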
Functions, Binding, Scoping
Functions
Defining a function in Rust follows the general syntax shown below. The return type (if applicable) is denoted by an arrow, and the return value is given either by the return keyword or by a final expression with no semicolon after it. If a function does not specify a return type, it defaults to the unit type, or (). This type is similar to None in Python or void in C++. The parameters of a function must have explicit types. A programmer may pass an argument by reference, making it either mutable or immutable. This is done by placing the & operator in front of the type, and lets the function 'borrow' the value without taking ownership. Making the reference mutable requires the mut keyword as we've seen before, which gives the function exclusive access to that value for as long as the borrow lasts.
fn function_name(param1: Type1, param2: &Type2, param3: &mut Type3) -> ReturnType {
    block ...
    ...
    return_value
}
Mutually recursive functions don't have any special syntax and are allowed in Rust. The example below checks whether a number is even or odd using mutually recursive functions.
fn is_even(n: u32) -> bool {
    if n == 0 {
        true
    } else {
        is_odd(n - 1)
    }
}
fn is_odd(n: u32) -> bool {
    if n == 0 {
        false
    } else {
        is_even(n - 1)
    }
}
fn main() {
    println!("{}", is_even(4)); // true
    println!("{}", is_odd(4)); // false
}
Rust allows for functions to be nested as well. The syntax for defining them is the same, just nested within another function. In the example below we wrap the two functions into a more abstracted "check_number" function.
fn check_number(n: u32) -> bool {
    fn is_even(n: u32) -> bool {
        if n == 0 {
            true
        } else {
            is_odd(n - 1)
        }
    }
    fn is_odd(n: u32) -> bool {
        if n == 0 {
            false
        } else {
            is_even(n - 1)
        }
    }
    is_even(n)
}
fn main() {
    println!("{}", check_number(7)); // false
}
Functions can also be passed as arguments to other functions in Rust using the fn type in the parameter definition. You can see an example of this below.
fn add_one(x: i32) -> i32 {
    x + 1
}
fn apply_function(f: fn(i32) -> i32, value: i32) -> i32 {
    f(value)
}
fn main() {
    let result = apply_function(add_one, 5);
    println!("{}", result); // Prints: 6
}
Similarly to when we pass functions as arguments, we can also use them as callbacks by storing them in data structures using function pointers. The example below shows the square function being used as a callback with pointers.
struct Callback {
    func: fn(i32) -> i32,
}
fn square(x: i32) -> i32 {
    x * x
}
fn main() {
    let cb = Callback { func: square }; // Stores function in the struct
    println!("{}", (cb.func)(4)); // Calls square(4), prints 16
}
Lambda expressions or anonymous functions are called closures in Rust. The input parameters go inside vertical bars, and a body made of a single expression implicitly returns that expression's value. A body with multiple statements needs curly braces like a regular function, and its final expression is still returned implicitly. Unlike functions, closures can capture variables from the scope in which they are defined. Closures do have some restrictions put on them. A closure cannot refer to itself by name, so it cannot directly call itself recursively. Parameter and return types are optional and are usually inferred by the compiler, although they can be annotated explicitly. An example of a multi-expression closure is defined below.
let closure = |x: i32| {
    let doubled = x * 2;
    doubled + 1
};
println!("{}", closure(5)); // Prints: 11
Function values can be returned by other functions as well, with different syntax for closures than for named functions. Named functions can be returned as function pointers such as fn(i32) -> i32 because function pointers have a fixed size and thus can be returned directly. Each closure, on the other hand, has its own anonymous type. A function that always returns the same closure can declare its return type as impl Fn(T) -> U, while a function that may return different closures has to box them, for example as Box<dyn Fn(T) -> U>.
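The three ways of returning a function value can be sketched as follows. All function names here (pick_named, make_adder, make_op) are hypothetical and chosen only for illustration.

```rust
fn add_one(x: i32) -> i32 {
    x + 1
}

/// Returns a named function through a function pointer.
fn pick_named() -> fn(i32) -> i32 {
    add_one
}

/// Returns a single closure type, so `impl Fn` is enough.
/// `move` copies `n` into the closure so it can outlive this call.
fn make_adder(n: i32) -> impl Fn(i32) -> i32 {
    move |x| x + n
}

/// Returns one of two different closures, so they are boxed as trait objects.
fn make_op(double: bool) -> Box<dyn Fn(i32) -> i32> {
    if double {
        Box::new(|x| x * 2)
    } else {
        Box::new(|x| x + 1)
    }
}

fn main() {
    println!("{}", pick_named()(1)); // 2
    println!("{}", make_adder(10)(5)); // 15
    println!("{}", make_op(true)(4)); // 8
    println!("{}", make_op(false)(4)); // 5
}
```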
Binding & Scoping
Rust enforces strict rules in order to keep the language memory safe by preventing null references, dangling pointers, and data races. One interesting binding feature is shadowing. The example below shows how x is redeclared with a different type. This is different from mutation: the second let creates a new variable that replaces the string with a number.
let x = "hello";
let x = x.len();
Most types in Rust are moved instead of copied. As we saw with strings, concatenating two strings moves the left operand into the result, so the original variable can no longer be used. Passing a variable to a function works the same way, moving ownership into the scope of the function unless it is passed by reference. Rust also has a borrowing system, where you can use a value without taking ownership of it. An immutable borrow allows the value to be read, but not modified. There can be multiple immutable borrowers of the same value. A mutable borrow allows the value to be modified, but there can only be one mutable borrower at a time for any given value. The original value can't be used while a mutable borrower holds it. Similarly, there cannot be any immutable borrowers of a value while there is a mutable borrower. Unlike Python's global declarations, mutable global variables in Rust are declared with static mut and can only be read or modified within unsafe blocks of code.
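The moving and borrowing rules can be sketched with three small functions. The names total, push_twice, and consume are made up for this example; it is one possible illustration, not the only way to structure such code.

```rust
/// Immutable borrow: may read the values but not change them.
fn total(values: &[i32]) -> i32 {
    values.iter().sum()
}

/// Mutable borrow: has exclusive access for the duration of the call.
fn push_twice(values: &mut Vec<i32>, x: i32) {
    values.push(x);
    values.push(x);
}

/// Takes ownership: the caller can no longer use the vector afterwards.
fn consume(values: Vec<i32>) -> usize {
    values.len()
}

fn main() {
    let mut data = vec![1, 2, 3];

    // Any number of immutable borrows may coexist.
    let a = &data;
    let b = &data;
    println!("{} {}", total(a), total(b)); // 6 6

    // A mutable borrow is allowed here because `a` and `b` are no longer used.
    push_twice(&mut data, 4);
    println!("{:?}", data); // [1, 2, 3, 4, 4]

    // Passing by value moves ownership into the function.
    let count = consume(data);
    println!("{}", count); // 5
    // println!("{:?}", data); // would not compile: `data` was moved
}
```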
By default, Rust supports static binding. Because it aims to optimize runtime, Rust resolves function calls and variable types at compile time. It is possible to use dynamic binding however, using trait objects with dyn Trait. This allows for dynamic dispatch, meaning the function calls will be resolved at runtime. This is a more expensive operation, but can be useful when dealing with polymorphism.
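A minimal sketch of the two kinds of dispatch is shown below. The trait HasArea and the types Square and Rect are hypothetical names introduced only for this illustration.

```rust
trait HasArea {
    fn area(&self) -> f64;
}

struct Square(f64);
struct Rect(f64, f64);

impl HasArea for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
}

impl HasArea for Rect {
    fn area(&self) -> f64 {
        self.0 * self.1
    }
}

/// Static dispatch: the compiler generates one copy per concrete type T.
fn area_static<T: HasArea>(shape: &T) -> f64 {
    shape.area()
}

/// Dynamic dispatch: each call goes through the trait object's vtable at runtime.
fn total_area(shapes: &[Box<dyn HasArea>]) -> f64 {
    shapes.iter().map(|s| s.area()).sum()
}

fn main() {
    println!("{}", area_static(&Square(2.0))); // 4

    let shapes: Vec<Box<dyn HasArea>> = vec![Box::new(Square(2.0)), Box::new(Rect(3.0, 1.0))];
    println!("{}", total_area(&shapes)); // 7
}
```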
Statements & Control
In Rust there are a few different forms of atomic statements. Variable declarations using the let keyword are atomic and cannot be used in expressions. Both break and continue are atomic statements used in loops to direct control flow. Any expression that appears alone is treated as a statement, and the result is discarded. If you have the line:
x + 5;
This is treated as an atomic statement. Assignment produces no useful value either: it evaluates to the unit type, (). The code below will result in an error since x = 1 evaluates to (), which cannot be added to 2.
let y = (x = 1) + 2;
There are no traditional exceptions in Rust such as the try - except blocks found in Python. Rust handles recoverable errors through the Result<T, E> type, whose variants Ok(T) and Err(E) represent the good and bad cases, and represents values that may be absent with Option<T>. These types are often used in conjunction with the ? operator to propagate any errors that might occur. The simple example below, taken from The Rust Programming Language book, shows how Ok and Err could be used to handle opening a file. Source here.
use std::fs::File;
fn main() {
    let greeting_file_result = File::open("hello.txt");
    let greeting_file = match greeting_file_result {
        Ok(file) => file,
        Err(error) => panic!("Problem opening the file: {error:?}"),
    };
}
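Where panicking is not desired, the same Ok and Err cases can be passed back to the caller with ?. The sketch below uses string parsing instead of file access so it runs without any files; the helper names parse_and_double and first_char_upper are assumptions made for this example.

```rust
use std::num::ParseIntError;

/// Hypothetical helper: `?` returns the parse error to the caller if parsing fails.
fn parse_and_double(text: &str) -> Result<i32, ParseIntError> {
    let n: i32 = text.trim().parse()?;
    Ok(n * 2)
}

/// Hypothetical helper: `?` also works on Option, returning None early.
fn first_char_upper(text: &str) -> Option<char> {
    let first = text.chars().next()?; // None for an empty string
    Some(first.to_ascii_uppercase())
}

fn main() {
    match parse_and_double("21") {
        Ok(value) => println!("doubled: {}", value), // doubled: 42
        Err(error) => println!("could not parse: {}", error),
    }
    println!("{}", parse_and_double("abc").is_err()); // true
    println!("{:?}", first_char_upper("rust")); // Some('R')
    println!("{:?}", first_char_upper("")); // None
}
```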
Sequencing Control Statements
Similarly to C++, Rust features blocks and uses ; to separate statements. Blocks of code are denoted by an open curly brace followed by the statements or expressions and a closed curly brace, i.e. { ... }. Statements must end with ; but the last expression in a block may omit it, in which case its value becomes the value of the block. If the last expression does end with ;, the block evaluates to the unit type, or ().
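A short sketch of a block used as an expression follows. The function name block_value is hypothetical.

```rust
/// Hypothetical helper: shows a block producing a value.
fn block_value() -> i32 {
    let y = {
        let a = 2;
        let b = 3;
        a * b // no semicolon: the block evaluates to 6
    };
    y + 1 // no semicolon: this is the function's return value
}

fn main() {
    println!("{}", block_value()); // 7
}
```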
Selection Control Statements
We have touched on many of Rust's selection control statements throughout this report. if statements can be used normally just as in other languages, but they can also be used in place of the ternary operator to return an expression, shown below.
let x = 10;
let y = if x > 5 { "big" } else { "small" };
println!("{}", y); // Prints: big
Rust's match statements are powerful control statements that can pattern match on values and support destructuring and guards. This is Rust's equivalent of the switch case or match case statements in other languages. if let statements provide simpler pattern matching where a full match isn't needed.
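Destructuring and guards can be sketched on a tuple as follows. The function name describe_point and its messages are invented for this illustration.

```rust
/// Hypothetical helper: destructures a tuple and uses a guard on one arm.
fn describe_point(point: (i32, i32)) -> String {
    match point {
        (0, 0) => String::from("origin"),
        (x, 0) => format!("on the x axis at {}", x),
        (0, y) => format!("on the y axis at {}", y),
        (x, y) if x == y => format!("on the diagonal at {}", x), // guard
        (x, y) => format!("at ({}, {})", x, y),
    }
}

fn main() {
    println!("{}", describe_point((0, 0))); // origin
    println!("{}", describe_point((5, 0))); // on the x axis at 5
    println!("{}", describe_point((3, 3))); // on the diagonal at 3
    println!("{}", describe_point((1, 2))); // at (1, 2)
}
```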
Iteration Control Statements
Rust offers the basic iteration control statements like while and for, but provides more flexible uses for them. Basic while loops work the same as in most other languages, but Rust also offers while let statements for looping with pattern matching. This will keep the loop running as long as the given pattern matches. The loop statement is an infinite loop like a while(true) found in other languages. This will keep looping until it encounters a break statement. for loops can loop over a specified range (using .. or ..=), collections (like an array), or iterators. Rust provides iterators for collections like Vec, arrays, or HashMap. Rust also allows programmers to define custom iterators using the Iterator trait. Iterators come with built-in functional methods such as map, filter, fold, or take for transforming and consuming sequences. Rust also lets the programmer define loop labels to break out of multiple loops at the same time. The example below shows how this can be used within multiple for loops.
'outer: for i in 1..=3 {
    for j in 1..=3 {
        if i == 2 && j == 2 {
            break 'outer;
        }
    }
}
// break 'outer continues execution here
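The iterator methods and the Iterator trait mentioned above can be sketched as follows. The names sum_of_even_squares and Countdown are hypothetical and serve only as an illustration.

```rust
/// Hypothetical helper: chains filter and map, then consumes the iterator with sum.
fn sum_of_even_squares(values: &[i32]) -> i32 {
    values
        .iter()
        .filter(|&&v| v % 2 == 0)
        .map(|v| v * v)
        .sum()
}

/// Hypothetical custom iterator that counts down from `remaining` to 1.
struct Countdown {
    remaining: u32,
}

impl Iterator for Countdown {
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        if self.remaining == 0 {
            None // signals the end of iteration
        } else {
            let current = self.remaining;
            self.remaining -= 1;
            Some(current)
        }
    }
}

fn main() {
    println!("{}", sum_of_even_squares(&[1, 2, 3, 4])); // 20

    for n in (Countdown { remaining: 3 }) {
        println!("{}", n); // 3, then 2, then 1
    }

    // take limits the number of items; fold accumulates a result.
    let first_two: Vec<u32> = Countdown { remaining: 5 }.take(2).collect();
    println!("{:?}", first_two); // [5, 4]
    println!("{}", (1..=4).fold(1, |acc, x| acc * x)); // 24
}
```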
Data Structures
Simple Product Types
Rust offers two simple product types, tuples and structs, both allowing heterogeneous typing. Both are immutable by default, and can be made mutable with the mut keyword on the binding that holds them. It is important to note that mutability is all or nothing: it applies to the whole value, and individual fields of a tuple or struct cannot be declared mutable on their own. Both tuples and structs are unboxed by default (stored inline, typically on the stack). Using the Box<T> type, a programmer can force a value to be boxed (stored on the heap), which is useful for a particularly large struct and required when a struct contains recursive definitions. Both tuples and structs require static indexing; struct fields must be accessed by name, and tuple indices must be constants. Both are checked at compile time. Below is an example of a simple tuple and struct both defining coordinate points.
struct Point {
    x: i32,
    y: i32,
    c: char,
}
let point_tup: (i32, i32, char) = (2, 2, 'x');
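How these two types are read, updated, and boxed can be sketched as follows. The helper shift_right is a hypothetical function added only for this illustration.

```rust
struct Point {
    x: i32,
    y: i32,
    c: char,
}

/// Hypothetical helper: takes ownership of a point, moves it along x, and returns it.
fn shift_right(mut p: Point, dx: i32) -> Point {
    p.x += dx; // allowed because the binding `p` is declared `mut`
    p
}

fn main() {
    // Tuple fields are read by constant index, or by destructuring.
    let mut tup: (i32, i32, char) = (2, 2, 'x');
    tup.0 = 5; // `mut` on the binding makes every field assignable
    let (a, b, label) = tup;
    println!("{} {} {}", a, b, label); // 5 2 x

    // Struct fields are read by name.
    let p = shift_right(Point { x: 1, y: 2, c: 'p' }, 3);
    println!("{} {} {}", p.x, p.y, p.c); // 4 2 p

    // Box<T> moves a value to the heap; field access looks the same.
    let boxed: Box<Point> = Box::new(p);
    println!("{}", boxed.x); // 4
}
```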
Sum Types
Rust has a built-in sum type called enum that allows a programmer to define a type whose values can be one of a few different variants, each of which may carry its own data. The example below defines a Shape enum whose value can be a circle or a rectangle, with different data for each variant. Rust has a built-in Option<T> type as well, which can represent some value of type T, or nothing. Commonly used to wrap values that may be absent, this type is itself defined in the standard library as an enum, shown below.
enum Option<T> {
    Some(T),
    None,
}
enum Shape {
    Circle(f64), // Radius
    Rectangle(f64, f64), // Width, Height
}
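A value of a sum type is usually consumed with match, which must cover every variant. The sketch below repeats the Shape definition so it is self-contained; the area function is an assumption added for illustration.

```rust
enum Shape {
    Circle(f64),         // Radius
    Rectangle(f64, f64), // Width, Height
}

/// Hypothetical helper: computes an area by matching on the variant.
fn area(shape: &Shape) -> f64 {
    match shape {
        Shape::Circle(radius) => std::f64::consts::PI * radius * radius,
        Shape::Rectangle(width, height) => width * height,
    }
}

fn main() {
    println!("{}", area(&Shape::Rectangle(2.0, 3.0))); // 6
    println!("{:.2}", area(&Shape::Circle(1.0))); // 3.14
}
```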
Arrays
Rust offers three commonly used collection types: arrays, vectors, and hash maps. Arrays are the simplest of these, and require homogeneous typing as well as a length fixed at compile time. They are indexed with usize integers and are unboxed structures. Vectors are similar in that they require homogeneous typing and are indexed with integers, however their elements are stored on the heap (boxed), and they can be resized dynamically. Dynamic indexing of arrays and vectors is allowed, but out of bounds access will cause the program to panic. Hash maps offer key-value storage similar to Python's dictionaries. These structures are boxed, and can be resized dynamically. Keys and values can be different types, and the keys are looked up dynamically at runtime. Any type that implements the Eq and Hash traits is a valid key type, which includes integers, booleans, characters, strings, and tuples of such types. Floating point numbers cannot be used directly as keys because f32 and f64 do not implement Eq or Hash. Simple forms of each structure are defined below.
use std::collections::HashMap;
fn main() {
    let mut map = HashMap::new();
    map.insert("one", 1);
    map.insert("two", 2);
    let arr: [i32; 2] = [1, 2];
    let mut v: Vec<i32> = vec![1, 2];
}
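Lookups that might fail can avoid a panic by using the get methods, which return an Option. The sketch below is one possible illustration; the helper names third_or_zero and word_counts are made up for this example.

```rust
use std::collections::HashMap;

/// Hypothetical helper: reads index 2 without panicking when it is out of bounds.
fn third_or_zero(values: &[i32]) -> i32 {
    values.get(2).copied().unwrap_or(0)
}

/// Hypothetical helper: counts how often each word occurs in `text`.
fn word_counts(text: &str) -> HashMap<&str, usize> {
    let mut counts = HashMap::new();
    for word in text.split_whitespace() {
        // entry() inserts 0 for a new key, then the count is incremented.
        *counts.entry(word).or_insert(0) += 1;
    }
    counts
}

fn main() {
    println!("{}", third_or_zero(&[1, 2, 3])); // 3
    println!("{}", third_or_zero(&[1])); // 0

    let counts = word_counts("a b a");
    println!("{:?}", counts.get("a")); // Some(2)
    println!("{:?}", counts.get("z")); // None
}
```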
Creating an AST Type
Defining an abstract syntax tree type in Rust would probably be best using an enum along with Box<T> for any recursive definitions. The enum could define the different types used by the language, as well as the structures for each of the statements and expressions.
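As a rough sketch of that idea, the enum below models a tiny arithmetic language. The variant names and the eval function are assumptions chosen for illustration, not a design taken from any particular compiler.

```rust
/// Hypothetical AST for a small arithmetic language.
/// Box<Expr> gives the recursive variants a known size.
enum Expr {
    Num(i64),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
    Neg(Box<Expr>),
}

/// Evaluates the tree by matching on each node kind.
fn eval(expr: &Expr) -> i64 {
    match expr {
        Expr::Num(n) => *n,
        Expr::Add(left, right) => eval(left) + eval(right),
        Expr::Mul(left, right) => eval(left) * eval(right),
        Expr::Neg(inner) => -eval(inner),
    }
}

fn main() {
    // Represents (1 + 2) * -3
    let tree = Expr::Mul(
        Box::new(Expr::Add(Box::new(Expr::Num(1)), Box::new(Expr::Num(2)))),
        Box::new(Expr::Neg(Box::new(Expr::Num(3)))),
    );
    println!("{}", eval(&tree)); // -9
}
```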
Comparing Structures
Rust's == operator performs structural equality (comparing the values) for types that implement the PartialEq trait, which can be derived automatically for structs and enums. This is unlike Java, for example, where == on objects tests referential equality (comparing the objects' memory addresses). In Rust, referential equality is only available by comparing addresses explicitly, through std::ptr::eq.
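The two kinds of comparison can be sketched as follows. The Point type and the helpers same_value and same_object are hypothetical names used only for this example.

```rust
#[derive(Debug, PartialEq)] // PartialEq enables == by comparing fields
struct Point {
    x: i32,
    y: i32,
}

/// Structural equality: compares the field values.
fn same_value(a: &Point, b: &Point) -> bool {
    a == b
}

/// Referential equality: compares the addresses the references point to.
fn same_object(a: &Point, b: &Point) -> bool {
    std::ptr::eq(a, b)
}

fn main() {
    let a = Point { x: 1, y: 2 };
    let b = Point { x: 1, y: 2 };
    println!("{}", same_value(&a, &b)); // true
    println!("{}", same_object(&a, &b)); // false
    println!("{}", same_object(&a, &a)); // true
}
```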
Type Systems
Rust supports polymorphic or generic types for both functions and data structures. The syntax for this is to add a type parameter such as <T> after the name in the definition. This allows for more efficient code reusability for multiple types without having to redefine a function over and over. This can be applied to functions, structs, enums, and traits.
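A generic function and a generic struct might look like the sketch below. The names largest and Pair, and the choice of the PartialOrd and Copy bounds, are assumptions made for this illustration.

```rust
/// Hypothetical generic function: works for any type that can be compared and copied.
fn largest<T: PartialOrd + Copy>(items: &[T]) -> Option<T> {
    let mut iter = items.iter().copied();
    let first = iter.next()?; // None for an empty slice
    Some(iter.fold(first, |best, x| if x > best { x } else { best }))
}

/// Hypothetical generic struct holding two values of the same type.
struct Pair<T> {
    first: T,
    second: T,
}

impl<T: PartialOrd + Copy> Pair<T> {
    fn max(&self) -> T {
        if self.first > self.second {
            self.first
        } else {
            self.second
        }
    }
}

fn main() {
    println!("{:?}", largest(&[3, 7, 2])); // Some(7)
    println!("{:?}", largest(&[1.5, 0.5])); // Some(1.5)
    println!("{}", Pair { first: 'a', second: 'z' }.max()); // z
}
```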
Rust has strong type inference, in addition to being statically typed. This means that the compiler needs to know what type each variable is, but it can infer this based on the way variables are used. Closures and iterators also utilize type inference to decide what the type is of the variables being passed to them. Examples for each are shown below.
fn main() {
    let x = 42; // Inferred as i32
    let y = 3.14; // Inferred as f64
    let name = "Rust"; // Inferred as &str
    let multiply = |x, y| x * y; // Compiler infers x & y are i32
    println!("{}", multiply(4, 5));
}
Rust goes far beyond type checking during static analysis to ensure safety and performance in compilation. The early stages in the compiler that we saw above all fall under static analysis. Some static analysis happens right after the parsing stage. Once an AST is created, Rust resolves names, ensuring that every variable and function that is used has a definition in scope. This is also where scoping rules are checked. After type checking, Rust runs the ownership and borrow checker, which makes sure that no rules about value access or sharing are being broken. As part of this, lifetimes are checked to detect references that would outlive the data they point to. Rust also performs an exhaustiveness check on match statements to ensure pattern matching covers all cases. Finally, Rust finds any unused variables, functions, or imports and emits warnings about them. Rust's compiler was built to optimize runtime performance and safety, not compile time, which shows.
Memory
Rust includes a novel way of managing memory without a garbage collector. Using the ownership and borrowing system that we've touched on above, Rust ensures that any handling of values is done in a safe manner. When the variable that owns a value goes out of scope, the compiler automatically inserts the code that frees the value, including any heap memory it owns. The Drop trait can be implemented on custom types to define logic that runs just before a value is deallocated. A programmer can also explicitly deallocate a value early by using std::mem::drop. This strict system makes it much harder to fall into simple pitfalls surrounding memory management, while also getting rid of the runtime overhead of using a garbage collector.
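The order in which values are cleaned up can be observed with a small sketch. The Guard type and the drop_order function are hypothetical; the shared RefCell log is only a device for recording when each drop runs.

```rust
use std::cell::RefCell;

/// Hypothetical type that records its name in a shared log when it is dropped.
struct Guard<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<&'static str>>,
}

impl Drop for Guard<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

/// Returns the order in which the three guards were dropped.
fn drop_order() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        let _a = Guard { name: "a", log: &log };
        let b = Guard { name: "b", log: &log };
        drop(b); // explicit early drop (std::mem::drop is in the prelude)
        let _c = Guard { name: "c", log: &log };
        // End of scope: remaining values drop in reverse declaration order.
    }
    log.into_inner()
}

fn main() {
    println!("{:?}", drop_order()); // ["b", "c", "a"]
}
```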
Objects & Modules
Programming Methods
Rust supports programming in functional, procedural, and object-oriented styles. Focusing on the object-oriented paradigm, Rust is different from Java or C++ in that it does not have traditional classes or inheritance. Structs are used in place of classes, with impl blocks to define their methods for encapsulation. Rust relies on traits to provide polymorphism in place of traditional class inheritance. Trait objects allow for dynamic dispatch similar to C++'s virtual functions using the dyn keyword. Behavior similar to subtyping can be achieved through generics with trait bounds, which keeps code reuse high. A simple example of code written in OOP style is shown below, where a Bike is of a broader type Vehicle.
trait Vehicle {
    fn drive(&self);
}
struct Bike;
impl Vehicle for Bike {
    fn drive(&self) {
        println!("You go zoooooooom!");
    }
}
fn start_trip<T: Vehicle>(v: &T) {
    v.drive();
}
fn main() {
    let bike = Bike;
    start_trip(&bike); // You go zoooooooom!
}
Modules & Submodules
Rust has a module system for organizing code into separate files and namespaces. A module is defined with the mod keyword, and its contents are private by default unless marked public with the pub keyword. The use keyword is used to import module items into scope, and it can name a specific function within the module, as in use module::function;. Modules can also be nested in one another for more abstraction if necessary. A submodule that lives in its own file is declared in the parent module file with pub mod submodule_name;.
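A minimal sketch of nested modules kept in a single file is shown below. The module and function names (geometry, shapes, square_area, square) are hypothetical and chosen only for illustration.

```rust
mod geometry {
    pub mod shapes {
        /// Public: reachable from outside the module.
        pub fn square_area(side: u32) -> u32 {
            square(side)
        }

        /// Private: only usable inside `shapes`.
        fn square(x: u32) -> u32 {
            x * x
        }
    }
}

// Bring one function into scope by its path.
use geometry::shapes::square_area;

fn main() {
    println!("{}", square_area(3)); // 9
    println!("{}", geometry::shapes::square_area(4)); // 16
    // geometry::shapes::square(3); // would not compile: `square` is private
}
```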