Strings and slices
In the data types lesson we covered the scalar and compound types, but we never touched on the String type or the slice type. The reason is that String is not a primitive data type: it is an owned collection of bytes, stored in a Vec<u8> and guaranteed to be a valid UTF-8 encoded string.
Strings
Because the String type is a wrapper around a Vec<u8>, it is a growable, heap-allocated data structure. What it adds on top of the plain byte vector is the guarantee that its contents are always a valid sequence of UTF-8 encoded characters.
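To make the relationship between String and Vec<u8> concrete, here is a minimal sketch using standard library calls (String::from_utf8, as_bytes, into_bytes). The variable names and byte values are illustrative assumptions, not part of the original lesson:

```rust
fn main() {
    // Build a String from raw bytes; from_utf8 checks that they are valid UTF-8.
    let bytes: Vec<u8> = vec![72, 101, 108, 108, 111]; // the bytes of "Hello"
    let greeting = String::from_utf8(bytes).expect("bytes should be valid UTF-8");
    println!("{}", greeting); // prints "Hello"

    // Invalid UTF-8 is rejected instead of producing a broken String.
    let invalid = String::from_utf8(vec![0xff, 0xfe]);
    println!("invalid bytes accepted? {}", invalid.is_ok()); // prints "false"

    // A String can grow, and its bytes can be inspected or taken back out.
    let mut owned = String::from("Hi");
    owned.push_str(" there");
    println!("{:?}", owned.as_bytes());
    let raw: Vec<u8> = owned.into_bytes();
    println!("{}", raw.len()); // prints "8"
}
```

Converting back with into_bytes consumes the String, which is consistent with the String owning its bytes.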
If we want a reference to a String, or to part of a String, we can use a slice. A slice is a reference to a contiguous sequence of elements in a collection; in this case, a sequence of bytes in the String. The string slice type is &str, which is a reference to a sequence of UTF-8 encoded bytes.
Let's have a look at an example:
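A minimal sketch of such a snippet, assuming the byte ranges 0..5 and 7..12 to match the indices described in this lesson:

```rust
fn main() {
    let my_string = String::from("Hello, World!");

    // The end of a range is exclusive: 0..5 covers indices 0 to 4.
    let hello = &my_string[0..5];
    // 7..12 covers indices 7 to 11 and leaves out the final '!'.
    let world = &my_string[7..12];

    println!("{}, {}", hello, world); // prints "Hello, World"
}
```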
In this snippet of code, we have a String called my_string that contains the string "Hello, World!". We then create two slices, hello and world.
The hello slice is a reference to the first 5 characters of my_string (index 0 to 4), which is "Hello". The world slice is a reference to the 5 characters at index 7 to 11, which is "World"; the final "!" at index 12 is not included.
When we print the world slice, it outputs "World"; the whole output in the console is Hello, World.
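One detail worth keeping in mind is that the indices of a string slice count bytes, not characters. For plain ASCII text like "Hello, World!" the two are the same, but they differ as soon as a character needs more than one byte. The sketch below uses a hypothetical word variable and the standard str::get method, which returns None rather than panicking when a range would cut a character in half:

```rust
fn main() {
    // "héllo": the 'é' (written as an escape here) takes 2 bytes in UTF-8.
    let word = String::from("h\u{e9}llo");

    println!("{}", word.len());           // number of bytes: 6
    println!("{}", word.chars().count()); // number of characters: 5

    // Byte index 2 falls in the middle of 'é', so this range is rejected.
    println!("{:?}", word.get(0..2)); // prints "None"
    // Byte index 3 is a character boundary, so this range is fine.
    println!("{:?}", word.get(0..3));
}
```

Indexing with &word[0..2] instead of word.get(0..2) would panic at runtime for the same reason.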
Strings are just collections of bytes, represented by the String type (UTF-8 encoded) or &str for slices (string references).
Slices
In the previous lesson, The stack and the heap, we learned about both the stack and the heap and how data is stored in memory. We also talked about pointers.
References are a kind of pointer: they are typically stored on the stack and point to a value stored somewhere else, usually on the heap, although the value can also live elsewhere, such as on the stack.
Let's illustrate the concept of slices with an example:
Figure: Slices illustrated
The two boxes you see on the left represent the two references that are stored on the stack, and the one on the right represents the actual String value stored on the heap.
The first reference points to the first character of the String and has a length of 13 characters, so if we count from the first character and stop at the 13th character, we get the whole string.
The other slice, world, points to the 8th character (index 7) and has a length of 6 characters, so if we count 6 characters from there, we stop at the 13th character and get "World!", exclamation mark included.
Note that these variables don't hold any of the string data themselves; they hold a reference to the actual value stored on the heap.
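The pointer-plus-length picture can be checked in code. This is a minimal sketch, assuming the same "Hello, World!" string and two slices named whole and world (the name whole is an assumption made for this example):

```rust
fn main() {
    let my_string = String::from("Hello, World!");
    let whole: &str = &my_string[..];  // starts at index 0
    let world: &str = &my_string[7..]; // starts at index 7

    // Each slice stores only a pointer and a length; no string data is copied.
    println!("whole: len = {}", whole.len()); // 13
    println!("world: len = {}", world.len()); // 6

    // The world slice points 7 bytes past the start of the String's buffer.
    let offset = world.as_ptr() as usize - my_string.as_ptr() as usize;
    println!("world starts at byte offset {}", offset); // 7

    // On current Rust compilers a &str is two machine words: pointer and length.
    let two_words = std::mem::size_of::<&str>() == 2 * std::mem::size_of::<usize>();
    println!("{}", two_words);
}
```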
String slices (&str) allow us to work with parts of strings without copying data, adhering to ownership rules.
String slices let us reference part of a collection without copying the data. However, they still have to follow the ownership rules, and if we break them, the code will not compile.
Let's have a look at this example:
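A minimal sketch of such a program follows. The body of get_prefix is an assumption (here it returns the first whitespace-separated word), and so is the trailing space in "Hello ", which is needed for "Hello" to become a word of its own:

```rust
// Assumption: the prefix is the first whitespace-separated word.
// The result is an owned String, copied out of the input.
fn get_prefix(text: &str) -> String {
    text.split_whitespace().next().unwrap_or("").to_string()
}

fn main() {
    let mut text = String::from("good morning");

    // The immutable borrow of text only lasts for the duration of the call.
    let prefix = get_prefix(&text);

    // Prepend to the original string; prefix is an independent copy.
    text.insert_str(0, "Hello ");

    println!("{}", prefix); // prints "good"
}
```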
In this example, we have a String called text that contains the string "good morning". We then call the get_prefix function with a reference to text. The get_prefix function returns an owned String that contains the prefix of text.
Let's walk through the code line by line with the ownership rules in mind:
- We create a String called text that contains the string "good morning".
- We call the get_prefix function with an immutable reference to text.
- The immutable reference is used inside the function, which returns an owned String.
- We mutate text by prepending the string "Hello" to it.
- We print prefix to the console. Because it is an owned String, printing it does not borrow text.
The code works because none of the ownership rules are violated. We never use an immutable reference at the same time as a mutable one, so the code is safe.
The problem with this code is that once we prepend "Hello" to text, the prefix of text is "Hello" instead of "good", but the prefix variable still holds the old value "good", which is not what we want.
A better approach is to tie the two variables together, and string slices (&str) let us do that.
So, instead of returning an owned String, we can return a string slice that points to the part of text holding the prefix.
Let's have a look at the modified code:
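A minimal sketch of the slice-returning version, under the same assumptions as before (get_prefix returns the first whitespace-separated word, and the prepended string is "Hello " with a trailing space). The early mutation is left commented out so that the sketch compiles; uncommenting it reproduces the compiler error discussed in this section:

```rust
// Returns a slice that borrows from text instead of copying it.
fn get_prefix(text: &str) -> &str {
    text.split_whitespace().next().unwrap_or("")
}

fn main() {
    let mut text = String::from("good morning");
    let prefix = get_prefix(&text); // immutable borrow of text starts here

    // Uncommenting the next line fails to compile, because text would be
    // mutated while prefix still borrows from it:
    // text.insert_str(0, "Hello ");

    println!("{}", prefix); // last use of the immutable borrow

    // Mutating here is allowed: prefix is not used again after this point.
    text.insert_str(0, "Hello ");
    println!("{}", text); // prints "Hello good morning"
}
```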
This way we can be sure that prefix is always pointing to the same value as the text string.
Let's run the code and see what happens:
Figure: Can't mutate borrowed value error
Now let's go over the code and see what happens when we try to compile it:
- We create a String called text that contains the string "good morning".
- We call the get_prefix function with an immutable reference to text.
- We then mutate text by prepending the string "Hello" to it (the prefix has now changed).
- We print prefix, which is a slice that references text.
There's a problem, however: the immutable reference held by prefix is used after text has been mutated, which violates the ownership rules. Mutating text requires a mutable borrow, and that is not allowed while an immutable borrow is still in use, so the code will not compile.
Figure: Slices error
This is a good example of how the ownership system in Rust works. It catches at compile time errors that could otherwise surface later in your code and be hard to debug.
If prefix were not a reference to the text string, it would hold the older prefix value "good" instead of "Hello", a bug that would be fairly difficult to track down.
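One way to satisfy the compiler, sketched here under the same assumptions about get_prefix and the "Hello " string, is to finish mutating text before taking the slice, so that prefix reflects the current contents:

```rust
// Same assumed helper: the first whitespace-separated word, as a borrowed slice.
fn get_prefix(text: &str) -> &str {
    text.split_whitespace().next().unwrap_or("")
}

fn main() {
    let mut text = String::from("good morning");

    // Mutate first, while nothing borrows from text.
    text.insert_str(0, "Hello ");

    // Borrow afterwards: the slice now points into the updated string.
    let prefix = get_prefix(&text);
    println!("{}", prefix); // prints "Hello"
}
```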
String slices are references to portions of a String, avoiding unnecessary data duplication and ensuring data integrity.
Other slices
Slices are not exclusive to the String type; they can also be used with arrays. Let's have a look at an example:
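A minimal sketch of such an array slice, assuming the range 1..3 to select the second and third elements:

```rust
fn main() {
    let numbers = [1, 2, 3, 4, 5];

    // 1..3 covers indices 1 and 2: the second and third elements.
    let slice = &numbers[1..3];

    println!("{:?}", slice); // prints "[2, 3]"
}
```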
In this example, we have an array called numbers that contains the numbers 1, 2, 3, 4, 5. We then create a slice called slice that references the second and third elements of the numbers array. When executed, the code prints the slice and outputs [2, 3].
Figure: Array slice
The syntax for array slices is very similar to that of string slices. You specify the starting index (inclusive) and the ending index (exclusive) within square brackets, separated by two dots (..).
Here are a few more examples to demonstrate the flexibility of array slices:
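The following sketch shows a few range forms on the same numbers array. The variable names and the sum helper are hypothetical, added only to show that one function can accept any of these slices:

```rust
// Hypothetical helper: works on any slice of i32, whatever its length.
fn sum(values: &[i32]) -> i32 {
    values.iter().sum()
}

fn main() {
    let numbers = [1, 2, 3, 4, 5];

    let first_two = &numbers[..2];   // start omitted: [1, 2]
    let from_third = &numbers[2..];  // end omitted: [3, 4, 5]
    let everything = &numbers[..];   // both omitted: [1, 2, 3, 4, 5]
    let inclusive = &numbers[1..=3]; // inclusive end: [2, 3, 4]

    println!("{:?} {:?}", first_two, from_third);
    println!("{:?} {:?}", everything, inclusive);
    println!("{}", sum(inclusive)); // prints "9"
}
```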
Slices are not limited to strings; they can be used with arrays as well, providing a flexible way to work with sequences of data.
Conclusion
In this lesson, we explored string slices and their connection to the String type. We demonstrated how slices can be used with arrays to reference specific portions of data without copying it.
String slices are a powerful feature in Rust, enabling you to efficiently work with segments of strings while maintaining data integrity and adhering to ownership rules. This approach helps to optimize memory usage and ensures the safe management of string data in your applications.
In the next lesson, we're going to learn about structs and enums, which are custom data types that enable you to define your own data structures and models. These are fundamental concepts in Rust, and mastering them will assist you in creating more complex and robust applications.