Data types in Rust
Rust is a statically typed language, meaning the Rust compiler must know the type of every variable at compile time. This property lets the compiler check and enforce type safety, which rules out many kinds of errors and ensures that operations on variables are valid for their types.
Like many programming languages, Rust can also infer the type of a variable from its value when possible. Let's have a look at a simple example:
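A minimal sketch of this kind of inference, assuming the values 5 and 10 and a hypothetical helper `type_name_of` that is used only to make the chosen types visible:

```rust
use std::any::type_name;

// Hypothetical helper: reports the type the compiler chose for a value.
fn type_name_of<T>(_value: &T) -> &'static str {
    type_name::<T>()
}

fn main() {
    let x = 5; // no annotation: the compiler infers i32
    let y: i64 = 10; // explicit annotation: the compiler uses i64

    println!("x is an {}", type_name_of(&x));
    println!("y is an {}", type_name_of(&y));
}
```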
In the example above, the type of `x` is inferred as `i32` by the compiler. Even though the value of `x` would also fit in an `i8` or `i16`, the Rust compiler defaults to `i32` for integers when nothing in the context dictates another type, because `i32` strikes a balance between range and performance on modern computer architectures. It is large enough to handle most integer calculations without overflowing, and it is handled efficiently by both 32-bit and 64-bit processors. We'll go into more detail about integer types later in this lesson.
For the variable `y`, we explicitly tell the Rust compiler to use the type `i64`.
But there are cases in which several types are possible for a value. In such cases you must annotate the type so that the Rust compiler knows what type the variable is; if you don't, the compiler reports an error. A common case is converting a string to a number with the `parse` method.
The Rust compiler can't infer the type of the variable, because `parse` can turn a string into an integer or a floating-point number of any size. The compiler can't know which type you want and won't fall back to a default, so you get a compile error.
[Image: Rust compile error reporting that type annotations are needed]
To solve this, you can annotate the type explicitly, so the Rust compiler can know what type the variable is.
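A minimal sketch of the fix, assuming the input strings shown and a hypothetical helper `parse_number`. The failing line is left as a comment because it would not compile:

```rust
// Hypothetical helper: parses text as a 32-bit signed integer.
fn parse_number(text: &str) -> i32 {
    // The annotation on `number` tells `parse` which type to produce.
    let number: i32 = text.parse().expect("not a number");
    number
}

fn main() {
    // Without an annotation the compiler reports "type annotations needed":
    // let guess = "42".parse().expect("not a number");

    // Annotating the variable resolves the ambiguity.
    let guess: u32 = "42".parse().expect("not a number");

    // The turbofish syntax is another way to name the target type.
    let ratio = "2.5".parse::<f64>().expect("not a number");

    println!("{} {} {}", guess, ratio, parse_number("-7"));
}
```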
Subsets of types in Rust
Rust has two main categories of types: scalar types and compound types. Scalar types represent a single value, and compound types represent a collection of values.
Let's have a look at each of them in more detail.
Scalar types
Scalar types represent a single value. Rust has four primary scalar types:
- Integers
- Floating-point numbers
- Booleans
- Characters
Integer types
An integer is a number without a fractional component. Rust has several different integer types, which can be signed or unsigned, and have different sizes.
Signed and unsigned integers
Signed integers can store both positive and negative numbers, while unsigned integers can only store zero and positive numbers.
Here is a list of all the integer types in Rust:
Length | Signed | Unsigned | Signed range | Unsigned range |
---|---|---|---|---|
8-bit | i8 | u8 | -128 to 127 | 0 to 255 |
16-bit | i16 | u16 | -(2^15) to 2^15 - 1 | 0 to 2^16 - 1 |
32-bit | i32 | u32 | -(2^31) to 2^31 - 1 | 0 to 2^32 - 1 |
64-bit | i64 | u64 | -(2^63) to 2^63 - 1 | 0 to 2^64 - 1 |
128-bit | i128 | u128 | -(2^127) to 2^127 - 1 | 0 to 2^128 - 1 |
arch | isize | usize | Based on the kind of computer your program is running on | Based on the kind of computer your program is running on |
Each signed integer type can store numbers from -(2^(n-1)) to 2^(n-1) - 1, where n is the number of bits the type uses. An `i16`, for example, can store numbers from -32768 to 32767. Unsigned types can store numbers from 0 to 2^n - 1, so a `u16` can store numbers from 0 to 65535.
The `isize` and `usize` types depend on the kind of computer your program is running on: 64 bits if you're on a 64-bit architecture and 32 bits if you're on a 32-bit architecture.
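The range formulas can be checked against the constants the standard library exposes. The sketch below assumes two hypothetical helpers, `signed_range` and `unsigned_range`, which are only meant for bit widths from 1 to 64:

```rust
// Hypothetical helper: evaluates -(2^(n-1)) to 2^(n-1) - 1 for an n-bit signed type.
// Intended for n between 1 and 64 so the result fits in an i128.
fn signed_range(bits: u32) -> (i128, i128) {
    let half = 2_i128.pow(bits - 1);
    (-half, half - 1)
}

// Hypothetical helper: evaluates 0 to 2^n - 1 for an n-bit unsigned type.
fn unsigned_range(bits: u32) -> (u128, u128) {
    (0, 2_u128.pow(bits) - 1)
}

fn main() {
    println!("i16 by formula: {:?}", signed_range(16));
    println!("i16 constants:  ({}, {})", i16::MIN, i16::MAX);
    println!("u16 by formula: {:?}", unsigned_range(16));
    println!("u16 constants:  ({}, {})", u16::MIN, u16::MAX);

    // usize is as wide as a pointer on the platform the program is compiled for.
    println!("usize is {} bits wide here", usize::BITS);
}
```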
Integer overflow (Wraparound)
When we use the type `i32`, the value of the integer must lie between -(2^31) and 2^31 - 1. But what happens if a value ends up outside this range?
If a calculation produces a number that is too large for the integer type you are using (e.g., adding 1 to a `u8` that already holds 255), there are two possible behaviors, depending on the build mode you are using. (Assigning an out-of-range literal such as 256 directly to a `u8` is rejected at compile time.)
- In debug mode: Rust checks for overflow. Since `u8` can only hold values from 0 to 255, computing 255 + 1 in a `u8` causes a panic at runtime, as the result exceeds the maximum value a `u8` can represent.
- In release mode: Rust does not include overflow checks by default, to optimize performance. Instead, it performs "wrapping" arithmetic. For `u8`, 255 + 1 wraps around to 0, because 256 is exactly one more than the maximum `u8` value (255), so the result wraps back to the minimum value of the range.
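If you don't want the result to depend on the build mode, the integer types in the standard library offer methods that make the overflow behavior explicit. A minimal sketch, with `add_or_none` as a hypothetical helper:

```rust
// Hypothetical helper: adds two u8 values and returns None instead of overflowing.
fn add_or_none(a: u8, b: u8) -> Option<u8> {
    a.checked_add(b)
}

fn main() {
    let max = u8::MAX; // 255

    // These methods behave the same way in debug and release builds.
    println!("checked:     {:?}", add_or_none(max, 1)); // None
    println!("wrapping:    {}", max.wrapping_add(1)); // 0
    println!("saturating:  {}", max.saturating_add(1)); // 255
    println!("overflowing: {:?}", max.overflowing_add(1)); // (0, true)
}
```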
In real-life applications you should be really careful with integer overflow, as it can lead to security vulnerabilities and bugs that are hard to track down. You can use `i32` for almost any use case, as it is large enough for most values, but if you feel like you need a larger integer, you can use `i64` or `i128`.
Integer literals
An integer literal is an integer value written directly in your source code. You can write integer literals in any of the following formats:
- Decimal: `99_999`
- Hexadecimal: `0x1a3f`
- Octal: `0o137`
- Binary: `0b1010_1010`
- Byte (u8 only): `b'Z'`
The underscore `_` can be used as a visual separator, which makes large numbers easier to read; it is ignored by the compiler. `1000000` (one million) and `1_000_000` are treated as the same number.
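To see that these notations are just different spellings of ordinary numbers, here is a small sketch that prints each literal from the list in decimal. The helper `literal_values` is hypothetical:

```rust
// Hypothetical helper: returns each literal form from the list as a plain number.
fn literal_values() -> (i32, i32, i32, i32, u8) {
    let decimal = 99_999;
    let hex = 0x1a3f;
    let octal = 0o137;
    let binary = 0b1010_1010;
    let byte = b'Z'; // the ASCII code of 'Z', typed as u8
    (decimal, hex, octal, binary, byte)
}

fn main() {
    let (decimal, hex, octal, binary, byte) = literal_values();
    // prints: 99999 6719 95 170 90
    println!("{} {} {} {} {}", decimal, hex, octal, binary, byte);

    // Underscores do not change the value.
    assert_eq!(1000000, 1_000_000);
}
```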
Mathematical operations
Rust supports all the basic mathematical operations you would expect from a programming language, such as addition, subtraction, multiplication, division, and remainder.
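A minimal sketch of the five operations on two integers, assuming the values 17 and 5 and a hypothetical helper `basic_ops`:

```rust
// Hypothetical helper: returns (sum, difference, product, quotient, remainder).
fn basic_ops(a: i32, b: i32) -> (i32, i32, i32, i32, i32) {
    (a + b, a - b, a * b, a / b, a % b)
}

fn main() {
    let (sum, difference, product, quotient, remainder) = basic_ops(17, 5);
    // prints: 22 12 85 3 2
    println!("{} {} {} {} {}", sum, difference, product, quotient, remainder);
}
```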
But there is a catch: you can't perform mathematical operations between different types. For example, you can't add an `i32` to an `i64`; you must first convert one of the values to the other's type.
Normally the compiler will infer the type of the result of the operation, but you can also annotate the type explicitly if you want to.
If you write a floating-point calculation without any annotations, Rust defaults to the `f64` type, which is a 64-bit floating-point number. If you want to use `f32` instead, you must annotate the type explicitly.
If you mix different types in one operation, the Rust compiler reports a type mismatch error:
[Image: Rust compile error reporting a type mismatch]
To fix it, convert one of the operands to the other's type before performing the operation.
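One way to do the conversion is `i64::from`, which widens an `i32` without losing information. A minimal sketch, with `add_mixed` as a hypothetical helper:

```rust
// Hypothetical helper: adds an i32 to an i64 by widening the i32 first.
fn add_mixed(small: i32, big: i64) -> i64 {
    // `small + big` would not compile, because the operand types differ.
    i64::from(small) + big
}

fn main() {
    let a: i32 = 10;
    let b: i64 = 5_000_000_000;
    println!("{}", add_mixed(a, b)); // 5000000010
}
```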
Another way to change the type of a number is to use the `as` keyword to cast the number to the desired type.
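A minimal sketch of an `as` cast from a floating-point number to an integer, using a hypothetical helper `truncate_then_multiply`:

```rust
// Hypothetical helper: drops the fractional part of `value`, then multiplies.
fn truncate_then_multiply(value: f64, factor: i32) -> i32 {
    // Casting a float to an integer with `as` truncates toward zero.
    (value as i32) * factor
}

fn main() {
    let result = truncate_then_multiply(30.4, 4);
    println!("{}", result); // 120
}
```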
In this case the number `30.4` is truncated to `30` and then multiplied by `4`, so the result is `120`.
You have to be careful with mathematical operations in Rust: if you divide one integer by another, the result is also an integer, and the fractional part is truncated.
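A minimal sketch of the difference, dividing 11 by 10 once as integers and once as floating-point numbers. Both helper functions are hypothetical:

```rust
// Hypothetical helper: integer division discards the fractional part.
fn divide_as_integers(a: i32, b: i32) -> i32 {
    a / b
}

// Hypothetical helper: converts both operands to f64 before dividing.
fn divide_as_floats(a: i32, b: i32) -> f64 {
    f64::from(a) / f64::from(b)
}

fn main() {
    println!("{}", divide_as_integers(11, 10)); // 1
    println!("{}", divide_as_floats(11, 10)); // 1.1
}
```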
In the example above, you might expect a result of `1.1`, but the result is `1`: because `11` and `10` are both integers, the division is an integer division and the fractional part is discarded. This is rarely what you want when the exact quotient matters, so be careful when using integers in operations that should produce a fractional number, and convert the operands to a floating-point type first.
Floating-point numbers
Floating-point numbers differ from integers in that they have a fractional component (digits after the decimal point). Rust has two primitive types for floating-point numbers: `f32` and `f64`, which are 32 and 64 bits in size, respectively.
The default type for a floating-point number is `f64`, meaning that whenever you declare a floating-point variable without an annotation, the compiler infers its type as `f64`. On modern CPUs `f64` is roughly as fast as `f32` while offering more precision, so you rarely need `f32` in Rust.
Here's an example of floating-point numbers in Rust:
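A minimal sketch, assuming the value 2.5 and a hypothetical helper `byte_width` that is used only to show which type the compiler picked:

```rust
// Hypothetical helper: returns the size in bytes of the type of a value.
fn byte_width<T>(_value: &T) -> usize {
    std::mem::size_of::<T>()
}

fn main() {
    let x = 2.5; // f64 by default
    let y: f32 = 2.5; // f32 because of the annotation
    let z = 2.5_f32; // f32 because of the literal suffix

    println!("x uses {} bytes", byte_width(&x)); // 8
    println!("y uses {} bytes", byte_width(&y)); // 4
    println!("z uses {} bytes", byte_width(&z)); // 4
}
```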
Booleans
Just like in most programming languages, a boolean in Rust represents a binary value: it can be either `true` or `false`. Booleans take up one byte of memory.
The primary use of booleans is conditional logic, such as if statements, loops, and other control flow mechanisms, which we'll cover in later lessons.
The boolean type is written `bool` in Rust. Here's an example of booleans in Rust:
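A minimal sketch, with `is_adult` as a hypothetical helper showing that a comparison produces a `bool`:

```rust
// Hypothetical helper: a comparison evaluates to a bool.
fn is_adult(age: u32) -> bool {
    age >= 18
}

fn main() {
    let t = true; // inferred as bool
    let f: bool = false; // explicit annotation

    // Booleans drive conditional logic.
    if is_adult(21) && t && !f {
        println!("access granted");
    }
}
```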
The Character type
Rust has a `char` type that represents a single Unicode character (more precisely, a Unicode scalar value). Character literals are written in single quotes `'` and can hold any such character, including emojis and special characters.
A `char` takes up 4 bytes of memory, which is enough to represent roughly 1.1 million different characters.
A character literal must contain exactly one character between the single quotes; otherwise, Rust gives a compile error.
Here's an example of characters in Rust:
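A minimal sketch, assuming three sample characters and a hypothetical helper `code_point` that returns the numeric Unicode value of a `char`:

```rust
// Hypothetical helper: returns the Unicode code point of a character.
fn code_point(c: char) -> u32 {
    c as u32
}

fn main() {
    let letter = 'z';
    let symbol: char = 'ℤ'; // explicit annotation
    let emoji = '😻';

    println!("{} is U+{:04X}", letter, code_point(letter));
    println!("{} is U+{:04X}", symbol, code_point(symbol));
    println!("{} is U+{:04X}", emoji, code_point(emoji));

    // Every char has the same size, no matter which character it holds.
    println!("a char occupies {} bytes", std::mem::size_of::<char>());
}
```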
Compound types
Compound types are types that can group multiple values into one type. Rust has two primitive compound types: Arrays and Tuples.
Arrays
Arrays in Rust are a collection of values of the same type with a fixed length. Unlike arrays in some other programming languages, once a Rust array is declared, it cannot grow or shrink.
Here's an example of arrays in Rust:
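A minimal sketch, assuming the values 1 through 5 and a hypothetical helper `sum` that accepts exactly five elements:

```rust
// Hypothetical helper: sums the elements of a five-element array.
fn sum(values: [i32; 5]) -> i32 {
    values.iter().sum()
}

fn main() {
    let a = [1, 2, 3, 4, 5]; // inferred as [i32; 5]
    let zeros = [0; 5]; // shorthand for five elements that are all 0

    println!("{:?} has {} elements and sums to {}", a, a.len(), sum(a));
    println!("{:?}", zeros);
}
```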
Arrays are declared using square brackets `[]`, and the type of the array is inferred from the values inside it. In the example above, the type of the array `a` is inferred as `[i32; 5]`, which means it is an array of 5 elements of type `i32`.
An array like this is stored in a region of memory called the stack, which we'll cover in more detail in later lessons.
Arrays should not be confused with vectors. Vectors are a flexible type provided by Rust's standard library; they are similar to arrays, but they can grow or shrink, while arrays have a fixed size. We will cover vectors in more detail in later chapters.
When to use arrays
Arrays are useful when you want to store a fixed number of elements of the same type. If you know for sure that the number of elements will not change, use an array; if you're not sure, use a vector.
A good example of using an array would be storing the seasons of the year, as there are always 4 seasons, and they will never change.
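A minimal sketch of such an array. The season names and the helper `first_season` are assumptions made for illustration:

```rust
// Hypothetical helper: accepts exactly four seasons and returns the first one.
fn first_season(seasons: [&str; 4]) -> &str {
    seasons[0]
}

fn main() {
    let seasons = ["Spring", "Summer", "Autumn", "Winter"]; // inferred as [&str; 4]

    println!(
        "{} seasons, starting with {}",
        seasons.len(),
        first_season(seasons)
    );
}
```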
In the example above, the type of the variable `seasons` is inferred as `[&str; 4]`, which means it is an array of 4 elements of type `&str`. `str` is the string slice type, and the `&` in front of it means each element is a reference to a string slice. We will cover slices and references in more detail later in the guide.
You can also tell the Rust compiler the type of the array explicitly if you want to, but in most cases, you don't need to do this, as the Rust compiler can infer the type of the array automatically.
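A minimal sketch of explicit array annotations. The temperature values and the helper `average` are made up for illustration; they show a case where the annotation changes the element type from the default `f64` to `f32`:

```rust
// Hypothetical helper: averages exactly four f32 values.
fn average(values: [f32; 4]) -> f32 {
    values.iter().sum::<f32>() / 4.0
}

fn main() {
    // The annotation repeats what the compiler would infer anyway.
    let seasons: [&str; 4] = ["Spring", "Summer", "Autumn", "Winter"];

    // Here the annotation matters: without it the elements would be f64.
    let temperatures: [f32; 4] = [12.5, 24.0, 14.5, 3.0];

    println!("{:?}", seasons);
    println!("average temperature: {}", average(temperatures)); // 13.5
}
```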
In this example, the compiler would have inferred the type of `seasons` as `[&str; 4]` anyway; the annotation simply states that same type explicitly.
Accessing array values
You can access the values of an array by using the index of the element you want to access. The index starts at 0, so the first element of the array is at index 0, the second element is at index 1, and so on.
Here's how to access the values of an array in Rust:
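A minimal sketch, assuming a five-element array and a hypothetical helper `first_and_last`:

```rust
// Hypothetical helper: reads the first and last elements by index.
fn first_and_last(values: [i32; 5]) -> (i32, i32) {
    (values[0], values[values.len() - 1])
}

fn main() {
    let a = [1, 2, 3, 4, 5];

    let first = a[0]; // index 0 is the first element
    let second = a[1]; // index 1 is the second element

    println!("first = {}, second = {}", first, second);
    println!("first and last = {:?}", first_and_last(a)); // (1, 5)
}
```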
If you try to access an element that is out of bounds of the array, the program will panic at runtime, so you should be careful when accessing array elements.
For example, an array `a` with 5 elements has 4 as its last valid index. If you write the out-of-bounds index as a literal, as in `a[5]`, the compiler can see the problem and rejects the program with a compile error.
In real-life applications, the compiler often cannot know which index will be accessed, because the index is calculated at runtime or given by the user. In those cases the check happens at runtime, and Rust panics whenever an out-of-bounds index is accessed.
In many other low-level programming languages, accessing an array out of bounds or reading an invalid memory location does not stop the program. Instead it results in undefined behavior: the program's behavior becomes completely unpredictable, which can lead to security vulnerabilities and bugs that are hard to track down. Rust prevents you from accessing invalid memory locations in this way, which is a large part of why it is considered a safe language.
Even though Rust panics instead of letting us access invalid memory, a panic at runtime is the last thing you want in a real-life production application. For that reason you should handle such scenarios with proper error handling, which we will cover in more detail in later chapters.
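As a small preview, one option the standard library offers is the `get` method, which returns an `Option` instead of panicking. A minimal sketch, with `describe` as a hypothetical helper:

```rust
// Hypothetical helper: looks up an index and reports the outcome as text.
fn describe(values: &[i32], index: usize) -> String {
    // `get` returns Some(&value) for a valid index and None otherwise.
    match values.get(index) {
        Some(value) => format!("a[{}] = {}", index, value),
        None => format!("index {} is out of bounds", index),
    }
}

fn main() {
    let a = [1, 2, 3, 4, 5];

    println!("{}", describe(&a, 4)); // a[4] = 5
    println!("{}", describe(&a, 5)); // index 5 is out of bounds
}
```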
Tuples
Tuples are another compound type in Rust. A tuple is a collection of values that can be of different types or the same type. Like arrays, tuples have a fixed length: once declared, they cannot grow or shrink.
It's a good idea to use a tuple when you want to group multiple values of different types together, and you know for sure that the number of elements will not change.
Tuples are created by writing a comma-separated list of values inside parentheses `()`. Here's an example of tuples in Rust:
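A minimal sketch of a tuple with an explicit type annotation. The values and the helper `sample_tuple` are assumptions made for illustration:

```rust
// Hypothetical helper: builds a tuple that mixes three different types.
fn sample_tuple() -> (i32, f64, u8) {
    let tup: (i32, f64, u8) = (500, 6.4, 1);
    tup
}

fn main() {
    let tup: (i32, f64, u8) = (500, 6.4, 1);

    println!("{:?}", tup); // (500, 6.4, 1)
    println!("{:?}", sample_tuple());
}
```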
I have annotated the type of the tuple explicitly in the example above, but it's not needed, as the Rust compiler can infer the type of the tuple automatically.
To get at the individual values of a tuple, we can use pattern matching to destructure it, or we can use the dot `.` operator followed by the index of the value we want to access.
Let's try to access the values of the tuple using the dot `.` operator:
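A minimal sketch, assuming the tuple `(500, 6.4, 1)` and a hypothetical helper `first_and_last`:

```rust
// Hypothetical helper: picks the first and last fields of a three-element tuple.
fn first_and_last(tup: (i32, f64, u8)) -> (i32, u8) {
    (tup.0, tup.2)
}

fn main() {
    let tup = (500, 6.4, 1);

    let five_hundred = tup.0; // first value
    let six_point_four = tup.1; // second value
    let one = tup.2; // third value

    println!("{} {} {}", five_hundred, six_point_four, one);
    println!("{:?}", first_and_last(tup)); // (500, 1)
}
```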
In this example, we read each value directly by writing the dot `.` operator followed by the index of the value we want to access. As with arrays, tuple indexes start at 0.
The other way to get access to the values of a tuple is destructuring: we can break the tuple apart and assign its values to new variables. Here's an example of how you would do that:
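A minimal sketch, assuming the tuple `(500, 6.4, 1)`. The helper `swap` is hypothetical and shows the same pattern inside a function:

```rust
// Hypothetical helper: destructures a pair and returns it in reverse order.
fn swap(pair: (i32, i32)) -> (i32, i32) {
    let (a, b) = pair;
    (b, a)
}

fn main() {
    let tup = (500, 6.4, 1);

    // The pattern on the left binds each position to its own variable.
    let (x, y, z) = tup;

    println!("x = {}, y = {}, z = {}", x, y, z);
    println!("{:?}", swap((1, 2))); // (2, 1)
}
```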
In the example above, we are destructuring the tuple `tup` into three variables, `x`, `y`, and `z`, which hold the values of the tuple.
The unit type
In Rust there is a special type called the "unit type", which is written as the empty tuple `()`. The unit type has only one value, which is also written `()`. We will discuss the unit type in more detail later in the guide.
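As a brief illustration, a function that declares no return type returns the unit value. In the sketch below, `greet` is a hypothetical function:

```rust
// Hypothetical function with no declared return type: it implicitly returns ().
fn greet(name: &str) {
    println!("Hello, {}!", name);
}

fn main() {
    let result = greet("Rust");

    println!("{:?}", result); // ()
    println!("() occupies {} bytes", std::mem::size_of::<()>()); // 0
}
```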
Conclusion
In this lesson we've covered the main primitive types in Rust: scalar types such as integers, floating-point numbers, booleans, and characters, and compound types such as arrays and tuples. However, there are more types in Rust, such as strings and slices, that we haven't discussed in this lesson. Growable strings (`String`), for example, are not primitive types: their size can change at runtime, and their contents are allocated on the heap rather than the stack. We will talk about all of these types in more detail later in this guide.
In the next lesson, we will cover functions: how to define a function, how to pass arguments, the difference between statements and expressions, and how to return values from functions.