NG#3 - Dictionary - script (⏱,💰)

The reference for completing the introductory task is Installation Guide.

Racing Riverboats is the third task of the Think Cairo project, which focuses on using dictionary queries to find duplicate values in an array. There are two subtasks in this task.

Index of Duplicate Values in a List#

The first subtask, "Finding Crocodiles," involves finding duplicate items in an array and marking their indices in another array.

The array contains sub-elements that are structs composed of u32, u32, and felt252. The initial data is an Array<Obstacle>.

#[derive(Drop)]
struct Obstacle {
    length: u32,
    width: u32,
    description: Array<felt252>
}

The initial approach was to use Felt252Dict. The Obstacle elements were looped through and popped into the Felt252Dict, similar to a mapping(index => struct) in Solidity. If the value already existed, the index was recorded; otherwise, it was inserted.

After writing the initial version of the code, it was discovered that felt252Dict.get(index) -> Obstacle was not valid because the Obstacle sub-element Array<felt252> cannot be copied. Therefore, a different approach was needed.

Without using a dictionary, a nested loop was used to compare the Obstacles and add the duplicates to a temporary list. Then, the indices were found. However, this algorithm resulted in an "out of gas" error during testing and actually consumed 3.2 million gas, exceeding the target of 1.8 million. Optimization was needed.

Analyzing the gas limit of the test case, the cost of comparing two Obstacles once was 26,100 gas. The test case had 13 Obstacles, so calculating 1,800,000 gas would allow for a maximum of 69 comparisons. However, the above algorithm required 78 comparisons (1+2+...+12) to loop through all the elements. Therefore, the number of comparisons needed to be reduced.

The approach taken was inspired by the test cases of the core library dict. The Felt252Dict was used to store the indices, and if a value was obtained, it was skipped, reducing unnecessary comparisons. However, even with this optimization, the gas limit was still exceeded, and further reduction in the number of comparisons was not possible. Therefore, the cost of a single comparison needed to be optimized, specifically the comparison of Obstacle.description.

After being stuck for a while without any ideas, the mentor of NG was contacted privately, and he suggested hashing the Obstacles to achieve the goal in O(n) complexity.

The question then became how to hash a struct. Initially, the #[derive(Drop)] of Obstacle was changed to #[derive(Drop,Serde)], and the following code was used to obtain the hash:

let mut raw_data = Default::default();
obs.serialize(ref raw_data);
let hash = poseidon_hash_span(raw_data.span());

However, it was later discovered that this approach did not pass the online tests because Obstacle cannot be modified.

If Obstacle cannot be modified, all the attributes of the struct can be manually packed and hashed.

After successfully hashing the Obstacles, it was only necessary to insert (index, hash) into the dict and use get(index) to obtain the duplicate indices. The code then passed the tests smoothly.

There were also many conversions between u_size and felt252, which required the use of into and unwarp. You can refer to the documentation to resolve these issues.

Another pitfall was that if the index is 0 and the value is also 0, both dict.get(0) and dict.get(nonexistent index) would return 0, leading to incorrect results. This issue was resolved by using the library use nullable::{NullableTrait, match_nullable, FromNullableResult};. It is recommended to study the source code of nullable to understand this part.

In summary, the key points of the first subtask are:

How to hash a struct
Conversions between u_size and felt252
Creating, getting, and inserting into Felt252Dict
Using nullable

Option Return in Conditional Statements within Loops#

This subtask involves returning the minimum number of steps needed to surpass the river (array) without exceeding a step length of 3. If the first and last cells are crocodiles, None should be returned.

The approach is to use nested loops and jump 3, 2, or 1 step(s) each time. If a jump is possible, the step count is increased by 1. If no jump is possible, Option::None is returned. The goal is achieved by making multiple jumps.

The implementation of this subtask is relatively simple. After obtaining the index array from the first subtask, it can be solved using two loops. The main challenge is encountering an "out of gas" error during testing, which means that checking for an empty river and checking if the start and end points are crocodiles should be done in advance to save gas.

It is recommended to compare the test cases and make use of print statements to see where each jump lands.

Some pitfalls have already been mentioned in the first subtask, so if they are addressed in advance, there should be no need for further modifications.

After submitting the verification, there was a small bug that prevented passing the tests. It is important to note whether the condition for breaking the loop is river.len or river.len-1.

This subtask involves many return values within loops, and since Cairo does not have a return statement, it is easy to make mistakes and requires multiple modifications.

This subtask also provides practice with various operations related to Option.

Conclusion#

Since Cairo's memory is immutable and slots cannot be overwritten, some techniques are needed to achieve the effect of a dictionary. Familiarity with these operations will be helpful for future development.