Definition and declaration of variables

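In Rust, valid identifiers (variable names, function names, trait names, and so on) consist of letters, digits, and underscores, and cannot start with a digit. This is the same as in many other languages. Since Rust 1.53, identifiers may also contain non-ASCII Unicode characters. Rust also supports raw identifiers (stable since Rust 1.30), which allow a keyword to be used as an identifier, such as r#type. A few keywords, including self, Self, super, and crate, cannot be written as raw identifiers. Raw identifiers are mainly useful when interoperating with code whose names collide with Rust keywords, for example in FFI bindings or when using a crate written for an older edition.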
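The identifier rules can be illustrated with a minimal sketch. The names below (describe, r#match, and the local variables) are hypothetical and chosen only for illustration:

```rust
// `match` is a keyword, so the raw form `r#match` is needed to use it as a function name.
fn r#match(input: &str, pattern: &str) -> bool {
    input.contains(pattern)
}

fn describe() -> String {
    let _private_count = 3; // a leading underscore is allowed
    let value2 = 10;        // digits are allowed after the first character
    let r#type = "i32";     // `type` is a keyword; `r#type` makes it usable as a name
    // let 2value = 10;     // would not compile: an identifier cannot start with a digit
    format!("{} {} {}", _private_count, value2, r#type)
}

fn main() {
    println!("{}", describe());
    println!("{}", r#match("hello world", "world"));
}
```

A raw identifier must be written with the r# prefix at every use site, which is why it is usually reserved for cases where the name is dictated by external code.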

Variable declaration: let variable: i32 = 100;. Rust's declaration syntax differs from that of C-family languages: the variable name comes first, followed by a colon and the type. A declaration without an initializer therefore looks like let variable: i32;.

The advantage of this declaration style is that it is easier to parse, and it puts the most important part of the declaration, the variable name, first. The type is additional information that can often be inferred from context, in which case the annotation can be omitted. However, Rust's type inference has limits: when the compiler cannot infer a type, an annotation must be written by hand.
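As a minimal sketch of where inference works and where it does not (the function name double_parsed is hypothetical), consider str::parse, which can produce many different types:

```rust
fn double_parsed(text: &str) -> String {
    // `str::parse` is generic over its result type, and nothing else in this
    // function pins that type down, so the annotation on `n` selects `i32`.
    // Removing `: i32` makes the compiler ask for a type annotation.
    let n: i32 = text.parse().expect("not a valid integer");
    (n * 2).to_string()
}

fn main() {
    let inferred = 100; // no annotation: integer literals default to i32
    let wide = 100u64;  // type given by the literal suffix
    let sum = inferred + 1;
    println!("{} {} {}", sum, wide, double_parsed("21"));
}
```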

The let keyword is borrowed from functional programming languages. It expresses binding: the variable name is bound to a value and to the memory that holds it. In Rust, a statement that declares a local variable and initializes it is generally called a "variable binding". The emphasis on "binding" distinguishes it from the "assignment initialization statement" of C and C++.

Some issues with variable declaration:

In Rust, every variable must be initialized before it is used. Safe Rust offers no way to read an uninitialized variable.

Checking for uninitialized variables:

In the example let variable: i32; above, the variable is declared without being given a value. Some other languages accept this, but in Rust the compiler reports an error if the variable is used before it is initialized. The Rust compiler performs static control-flow analysis on the code to ensure that every variable is initialized before use. Because variable is not bound to any value, reading it could produce unexpected results or crash the program, so the compiler rejects the code.

let variable: i32;
println!("variable = {}", variable); // error[E0381]; newer compilers word it as "used binding `variable` isn't initialized", older ones as "use of possibly uninitialized `variable`"

Checking for uninitialized variables in branch flow:

The Rust compiler's static branch flow analysis is quite strict.

fn main() {
    let x: i32;
    if true {
        x = 1;
    } else {
        x = 2;
    }
    println!("x = {}", x);
}

In this case, every branch of the if statement binds a value to x, so the code compiles and runs. But if the else branch is removed, the compiler reports an error:

error: use of possibly uninitialized variable: 'x'
println!("x = {}", x);

From this, we can see that the compiler has detected that x may not be initialized. With the else branch removed, the control-flow analysis finds a path, the one that skips the if body, on which the println! after the if expression uses x before any value is bound to it. The analysis does not evaluate the condition, not even a literal true, so it treats both outcomes as possible and checks every branch. (Static analysis of this kind is an area of research in programming languages. One reference: the Software Analysis course at Nanjing University.)

If the println! statement is removed as well, the code compiles and runs, because x is no longer used anywhere outside the if expression. Inside the if body, x is bound to a value before it is used, so the check passes.

// An example
fn test(condition: bool) {
    let x: i32; // declare x
    if condition {
        x = 1; // initialize x (this is initialization, not assignment)
        println!("{}", x);
    }
    // If the condition is not satisfied, x is not initialized

    // But it doesn't matter as long as x is not used here
}
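The same rule applies to other branching constructs. As a minimal sketch (the function name sign_label is hypothetical), a match that initializes the variable in every arm leaves it definitely initialized afterwards:

```rust
fn sign_label(n: i32) -> &'static str {
    let label: &'static str; // declared, not yet initialized
    match n {
        i32::MIN..=-1 => label = "negative",
        0 => label = "zero",
        _ => label = "positive",
    }
    // Every arm initialized `label`, so it can be used here.
    label
}

fn main() {
    println!("{} {} {}", sign_label(-3), sign_label(0), sign_label(7));
}
```

If one arm did not initialize label, the compiler would reject the use after the match, just as it does for an if without an else.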

Checking for uninitialized variables in loops:

A loop expression can only be exited through break. If the variable is initialized before every break, the compiler treats it as initialized after the loop.

fn main() {
    let x: i32;
    loop {
        if true {
            x = 2;
            break;
        }
    }
    println!("{}", x); // 2
}

The compiler's control-flow analysis knows that the only way out of the loop is the break, and that x = 2 runs before it, so the println! after the loop can use x.
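The following sketch shows deferred initialization inside a loop with a condition that is not a constant, next to a related form in which break carries a value out of the loop expression. Both function names are hypothetical:

```rust
// Deferred initialization: `x` is initialized immediately before the only `break`.
fn power_of_two_deferred(limit: u32) -> u32 {
    let x: u32;
    let mut candidate = 1;
    loop {
        if candidate >= limit {
            x = candidate;
            break;
        }
        candidate *= 2;
    }
    x // initialized on every path that leaves the loop
}

// Alternative: `loop` is an expression, and the operand of `break` becomes its value.
fn power_of_two_break_value(limit: u32) -> u32 {
    let mut candidate = 1;
    loop {
        if candidate >= limit {
            break candidate;
        }
        candidate *= 2;
    }
}

fn main() {
    println!("{} {}", power_of_two_deferred(5), power_of_two_break_value(5));
}
```

The second form avoids the separate declaration entirely, which is often easier to read when the loop exists only to compute one value.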

Empty arrays or vectors can be used to initialize variables:

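When binding an empty array or vector to a variable, the type usually needs to be specified explicitly, because an empty literal gives the compiler nothing to infer the element type from. The annotation can be omitted only when later use of the variable determines the type.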

fn main() {
    let a: Vec<i32> = vec![];
    let b: [i32; 0] = [];
}

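If the explicit type annotation is omitted and no later use determines the element type, the compiler reports error[E0282]: type annotations needed. Empty arrays and vectors can be used to initialize variables. In current Rust they can also initialize constants and static variables: an empty array literal is a constant expression, and Vec::new() is a const fn (for vectors this was not possible before Vec::new became a const fn).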
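As a minimal sketch of the ways to give an empty vector its element type (the function name collect_evens is hypothetical), the type can come from an annotation, from the turbofish syntax, or from how the variable is used later:

```rust
fn collect_evens(limit: i32) -> Vec<i32> {
    // No annotation: the element type is inferred from the `push` calls
    // and from the function's return type.
    let mut evens = Vec::new();
    for n in 0..limit {
        if n % 2 == 0 {
            evens.push(n);
        }
    }
    evens
}

fn main() {
    let annotated: Vec<i32> = Vec::new(); // explicit annotation on the binding
    let turbofish = Vec::<i32>::new();    // element type given on the call itself
    let mut inferred = Vec::new();        // element type decided by the next line
    inferred.push(1u8);
    println!(
        "{} {} {:?} {:?}",
        annotated.len(),
        turbofish.len(),
        inferred,
        collect_evens(5)
    );
}
```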

Uninitialized variables caused by ownership transfer:

When an already initialized variable y is bound to another variable y2, Rust treats y as logically uninitialized if its type has move semantics: ownership of the value is transferred to y2. Box<T> is such a type. Types that implement Copy, such as i32, have value (copy) semantics instead, like the default pass-by-value behavior in C and C++, so the original variable remains usable.

fn main() {
    let x = 42; // Primitive types have value (Copy) semantics and are stored on the stack by default
    let y = Box::new(4); // Box::new allocates the value on the heap and returns a pointer,
    // which is bound to y; the pointer itself is stored on the stack
    println!("{}", y);
    let x2 = x; // x is copied, so x remains usable
    let y2 = y; // ownership moves to y2
    //println!("{}", y); // Would not compile: ownership has been transferred, so y is considered uninitialized
    // If y were declared with `let mut` and bound to a value again, it would be usable again; this is called reinitialization
}
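Reinitialization can be sketched as follows. The function names are hypothetical, and note that the variable has to be declared with let mut for the second assignment to be accepted:

```rust
fn reinitialize() -> (i32, i32) {
    let mut y = Box::new(4); // `mut` is required to assign to `y` again later
    let y2 = y;              // ownership moves to y2; y is now logically uninitialized
    // println!("{}", y);    // would not compile here: y has been moved
    y = Box::new(5);         // reinitialization: y owns a new heap allocation
    (*y2, *y)
}

fn copy_stays_usable() -> i32 {
    let x = 42;
    let x2 = x; // i32 is Copy, so x is copied rather than moved
    x + x2      // both x and x2 are usable
}

fn main() {
    println!("{:?} {}", reinitialize(), copy_stays_usable());
}
```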