Apr 05, 2025

Rust's 'Uncommon but Super Useful' Syntaxes: How Many Do You Know?

This article introduces some uncommon but practical syntax features in Rust: unions (`union`), slice pattern matching, raw pointer operations, inline assembly (`asm!`), extern blocks, loop return values, `@` bindings in match patterns, labeled loop control, the never type (`!`), `#[repr]` attributes, and the alternative delimiters for macro invocation. Each of them can make code shorter or more flexible in the specific scenarios it was designed for.

Rust has a number of syntax features that many developers rarely use, either because they serve special use cases or because they are relatively complex. Of course, "common" and "uncommon" are relative: given Rust's wide range of applications, a syntax that is rare in one field can be everyday practice in another.

Union (union)

Rust supports unions similar to those in C. A union is declared with the same syntax as a struct, replacing struct with union. The key feature of a union is that all fields share the same storage, so writing to one field may overwrite the others, and the size of the union is determined by its largest field (rounded up to the union's alignment). Because the compiler cannot track which field was written last, reading a field requires an unsafe block.

fn main() {
    #[repr(C)]
    union MyUnion {
        i: i32,
        f: f32,
    }
    let u = MyUnion { i: 42 };
    println!("{}", unsafe { u.i })
}
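
To make the shared storage visible, here is a minimal sketch that writes one field and reads the other. The name IntOrFloat and the helper float_bits are made up for this illustration; in real code f32::to_bits does the same job safely.

```rust
use std::mem::size_of;

// `IntOrFloat` is a hypothetical type used only for this sketch.
#[repr(C)]
union IntOrFloat {
    i: u32,
    f: f32,
}

/// Reinterprets the bits of an `f32` as a `u32` by writing `f` and reading `i`.
fn float_bits(value: f32) -> u32 {
    let u = IntOrFloat { f: value };
    // Reading a union field is unsafe: the compiler does not know
    // which field was written last.
    unsafe { u.i }
}

fn main() {
    // Both fields are 4 bytes wide, so the whole union is 4 bytes.
    println!("size: {}", size_of::<IntOrFloat>());
    // 1.0_f32 has the IEEE 754 bit pattern 0x3f800000.
    println!("bits: {:#x}", float_bits(1.0));
}
```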

Slice Pattern Matching

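Slice patterns can match the head and tail of a slice or array, with the .. rest pattern standing in for any number of elements in between. This is not commonly used in everyday code (if you use it often, you might disagree). The same .. token also builds ranges for indexing, as in the second half of the example below, and that form is quite common in deep learning code, such as in candle: tensor.i((.., ..4))?.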

fn main() {
    let arr = [1, 2, 3, 4];
    let &[a, .., b] = &arr;
    println!("First: {}, Last: {}", a, b); // Outputs 1 and 4

    let arr = [
        [1,2,3],
        [4,5,6],
        [7,8,9],
    ];
    println!("{:?}", &arr[1][..]); //[4, 5, 6]
    println!("{:?}", &arr[1][1..]); //[5, 6]
}
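
Slice patterns become more useful inside match, where each arm can describe a different shape. The sketch below uses two hypothetical helpers, describe and sum, to show the rest pattern both on its own and bound to a name with @.

```rust
/// Describes a slice by matching on its shape.
fn describe(values: &[i32]) -> String {
    match values {
        [] => "empty".to_string(),
        [only] => format!("one element: {}", only),
        // `..` skips the middle; `first` and `last` bind the two ends.
        [first, .., last] => format!("first {}, last {}", first, last),
    }
}

/// Sums a slice recursively; `rest @ ..` binds the remaining elements as a sub-slice.
fn sum(values: &[i32]) -> i32 {
    match values {
        [] => 0,
        [head, rest @ ..] => head + sum(rest),
    }
}

fn main() {
    println!("{}", describe(&[1, 2, 3, 4]));
    println!("{}", sum(&[1, 2, 3, 4]));
}
```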

.. has another, less common use: struct update syntax, which fills in every field not listed explicitly from an existing value of the same type.

fn main() {
    #[derive(Debug)]
    struct Point {
        x: i32,
        y: i32,
    }
    let p1 = Point { x: 1, y: 2 };
    let p2 = Point { y: 3, ..p1 }; // p2.x = 1, p2.y = 3
    println!("p2 = {:?}", p2);
}
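
Struct update syntax pairs naturally with the Default trait, so that only the fields that differ from the defaults need to be spelled out. A small sketch, assuming a made-up Config type:

```rust
// `Config` and its fields are hypothetical, used only for illustration.
#[derive(Debug, Default)]
struct Config {
    name: String,
    retries: u32,
    verbose: bool,
}

fn verbose_config(name: &str) -> Config {
    Config {
        name: name.to_string(),
        verbose: true,
        // Every field not listed above comes from `Config::default()`.
        ..Default::default()
    }
}

fn main() {
    let base = verbose_config("job");
    // `name` is moved out of `base` here, and `verbose` is copied.
    let tuned = Config { retries: 3, ..base };
    println!("{:?}", tuned);
}
```

Note that the remaining fields are moved (or copied, for Copy types) out of the source value, so a source with non-Copy fields is partially moved afterwards.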

Raw Pointer Operations (*const T, *mut T)

*const T and *mut T are raw pointers that let you work with memory addresses directly. They are rare in day-to-day projects because most people stay within safe Rust, but they are quite common for library authors, especially in FFI code. Creating a raw pointer is safe; dereferencing one is not, so every read or write through it must happen inside an unsafe block.

fn main() {
    let a = 42;
    let ptr = &a as *const i32;
    unsafe {
        println!("{}", *ptr); // Outputs 42
    }
}
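
The example above only reads through a *const pointer. As a sketch of the mutable side, the hypothetical helper below writes through a *mut i32 with pointer arithmetic; an ordinary iterator would be the idiomatic choice, and this version exists only to show the syntax.

```rust
/// Doubles every element of a slice using raw pointer arithmetic.
fn double_in_place(values: &mut [i32]) {
    let len = values.len();
    // Creating the raw pointer is safe; only dereferencing it is not.
    let ptr: *mut i32 = values.as_mut_ptr();
    for i in 0..len {
        // SAFETY: `i < len`, so `ptr.add(i)` stays inside the slice,
        // and no other reference to the slice is used while we write.
        unsafe {
            *ptr.add(i) *= 2;
        }
    }
}

fn main() {
    let mut data = [1, 2, 3];
    double_in_place(&mut data);
    println!("{:?}", data);
}
```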

Inline Assembly (asm!)

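Rust can embed assembly code directly through the asm! macro, which lives in std::arch and must be called inside an unsafe block. The assembly itself is architecture-specific: the example below assumes x86_64.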

use std::arch::asm;

fn main() {
    let x: u64;
    unsafe {
        // x86_64 only: move the constant 5 into the register chosen for x
        asm!("mov {}, 5", out(reg) x);
    }
    assert_eq!(x, 5);
}
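
Operands can also be inputs. The following sketch passes a value in and reads the result back from the same register with inout(reg). The function name add_five is made up, and the cfg fallback is an assumption added so that the snippet still builds on other architectures.

```rust
#[cfg(target_arch = "x86_64")]
fn add_five(x: u64) -> u64 {
    use std::arch::asm;
    let mut value = x;
    unsafe {
        // `inout(reg)` hands `value` to the assembly in a register
        // and reads the updated register back into `value`.
        asm!("add {0}, 5", inout(reg) value);
    }
    value
}

// Plain Rust fallback for targets where the x86_64 assembly does not apply.
#[cfg(not(target_arch = "x86_64"))]
fn add_five(x: u64) -> u64 {
    x.wrapping_add(5)
}

fn main() {
    println!("{}", add_five(37));
}
```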

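Foreign Function Interface: Extern Blocks

The extern keyword was quite common in older versions of Rust, where every dependency needed an extern crate declaration. Now it is mainly used for interfacing with C: an extern block declares functions that are defined in another language, and calling them is unsafe because the compiler cannot check them. Since the 2024 edition the block itself must be written as unsafe extern.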

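use std::ffi::{c_char, c_int};

fn main() {
    // `unsafe extern` needs Rust 1.82+ and is mandatory in the 2024 edition
    unsafe extern "C" {
        // Matches the C prototype: int printf(const char *format, ...);
        fn printf(fmt: *const c_char, ...) -> c_int;
    }
    unsafe {
        // C strings must end with a NUL byte
        printf("Hello, world!\0".as_ptr().cast::<c_char>());
    }
}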
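
In practice an extern declaration is usually hidden behind a safe wrapper. The sketch below does this for the C library's strlen. The wrapper name c_strlen is hypothetical, and the unsafe extern spelling assumes Rust 1.82 or newer (older toolchains accept a plain extern "C" block instead).

```rust
use std::ffi::{c_char, CString};

unsafe extern "C" {
    // Declared to match the C prototype: size_t strlen(const char *s);
    fn strlen(s: *const c_char) -> usize;
}

/// Safe wrapper: `CString` guarantees a NUL-terminated buffer.
fn c_strlen(text: &str) -> usize {
    let c_text = CString::new(text).expect("text must not contain NUL bytes");
    // SAFETY: `c_text` is a valid NUL-terminated string that outlives the call.
    unsafe { strlen(c_text.as_ptr()) }
}

fn main() {
    println!("{}", c_strlen("hello"));
}
```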

Loop Return Values

You can directly return a value from a loop through the break expression:

fn main() {
    let result = loop {
        break 42; 
    };

    println!("result: {}", result);
}
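
A slightly more realistic sketch: the loop keeps state between iterations and hands the final value out through break. The function name is made up for illustration. Only loop supports break with a value; while and for do not, because they can finish without ever reaching a break.

```rust
/// Returns the first power of two that is greater than or equal to `limit`.
fn first_power_of_two_at_least(limit: u32) -> u32 {
    let mut candidate = 1;
    // The whole `loop` is an expression; its value is whatever `break` carries.
    let result = loop {
        if candidate >= limit {
            break candidate;
        }
        candidate *= 2;
    };
    result
}

fn main() {
    println!("{}", first_power_of_two_at_least(100));
}
```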

@ in Match Patterns

@ binds the matched value to a variable while also testing it against a pattern, so the value can be used inside the arm without matching it a second time.

fn main() {
    let value = Some(5);
    match value {
        // x is bound only when the inner value lies in 1..=10
        Some(x @ 1..=10) => {
            println!("x: {}", x);
        }
        _ => {}
    }
}
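
The same binding works directly on plain values. Here is a short sketch with a hypothetical classify function, where each arm needs both the range test and the number itself:

```rust
/// Puts an age into a bucket and echoes the matched number back.
fn classify(age: u32) -> String {
    match age {
        0 => "newborn".to_string(),
        // `n @ range` tests the range and keeps the value in `n`.
        n @ 1..=12 => format!("child ({})", n),
        n @ 13..=19 => format!("teen ({})", n),
        // A bare identifier matches everything that is left.
        n => format!("adult ({})", n),
    }
}

fn main() {
    println!("{}", classify(15));
}
```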

Labeled Loop Control

A loop can be given a label such as 'outer, and break or continue can then name that label to act on an enclosing loop instead of the innermost one.

fn main() {
    let mut x = 0;
    'outer: loop {
        x += 100;
        loop {
            x += 1;
            if x % 7 == 0 {
                println!("x is {}", x); // 105
                break 'outer; // exits both loops
            }
        }
    }
}
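
Labels work with for loops and with continue as well. The sketch below searches a grid and uses both forms. The function find and the rule that a negative cell invalidates the rest of its row are assumptions made up for this example.

```rust
/// Returns the first (row, column) position of `target`, scanning row by row.
/// A negative cell makes the search skip the rest of that row.
fn find(grid: &[Vec<i32>], target: i32) -> Option<(usize, usize)> {
    let mut found = None;
    'rows: for (r, row) in grid.iter().enumerate() {
        for (c, &cell) in row.iter().enumerate() {
            if cell < 0 {
                // Abandon this row and continue with the next one.
                continue 'rows;
            }
            if cell == target {
                found = Some((r, c));
                // Leave both loops at once.
                break 'rows;
            }
        }
    }
    found
}

fn main() {
    let grid = vec![vec![1, 2, 3], vec![-1, 5, 6], vec![7, 5, 9]];
    println!("{:?}", find(&grid, 5));
}
```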

Never Type: !

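The never type, written !, is the return type of a function that never returns to its caller. Note that this is different from omitting the return type, which means the function returns the unit type ().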

fn forever() -> ! {
    loop {}
}
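
The practical benefit is that an expression of type ! can stand in for any other type, so a diverging call is allowed in a position where a value is expected. A minimal sketch, with fail and parse_port as hypothetical helpers:

```rust
/// Never returns: it always panics, so its return type is `!`.
fn fail(msg: &str) -> ! {
    panic!("fatal: {}", msg);
}

fn parse_port(text: &str) -> u16 {
    match text.parse::<u16>() {
        Ok(port) => port,
        // This arm has type `!`, which coerces to `u16`,
        // so both arms of the match agree on a type.
        Err(_) => fail("invalid port"),
    }
}

fn main() {
    println!("{}", parse_port("8080"));
}
```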

#[repr] Attribute

As the union example above shows, the #[repr] attribute controls the memory layout of a type; #[repr(C)], for instance, lays fields out the way a C compiler would. It is mostly needed for FFI or low-level optimizations.

#[repr(C)]
struct Point {
    x: i32,
    y: i32,
}
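
The effect of different representations can be observed with size_of and align_of. The types below are made up for this sketch, and the numbers in the comments assume a mainstream target where u32 is 4-byte aligned.

```rust
use std::mem::{align_of, size_of};

// C layout: fields stay in declaration order, with padding for alignment.
#[allow(dead_code)]
#[repr(C)]
struct Padded {
    a: u8,
    b: u32,
    c: u8,
}

// Packed C layout: same order, but no padding at all.
#[allow(dead_code)]
#[repr(C, packed)]
struct Packed {
    a: u8,
    b: u32,
    c: u8,
}

// The discriminant is stored as a single u8.
#[allow(dead_code)]
#[repr(u8)]
enum Level {
    Low = 1,
    High = 2,
}

fn main() {
    println!("Padded: size {}, align {}", size_of::<Padded>(), align_of::<Padded>()); // 12, 4
    println!("Packed: size {}, align {}", size_of::<Packed>(), align_of::<Packed>()); // 6, 1
    println!("Level:  size {}, High = {}", size_of::<Level>(), Level::High as u8); // 1, 2
}
```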

Macro Invocation

Because of code auto-completion, many people are used to the vec![] form. In fact, any macro can be invoked with parentheses, square brackets, or braces, so vec!() and vec!{} work too. The square brackets are just a convention.

fn main() {
    let data1 = vec![1, 2, 3, 4, 5];
    let data2 = vec!(1, 2, 3, 4, 5);
    let data3 = vec!{1, 2, 3, 4, 5};

    println!("data1: {:?}", data1);
    println!("data2: {:?}", data2);
    println!("data3: {:?}", data3);
}
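
The same holds for macros you define yourself: the macro_rules! definition does not fix the delimiter, the call site chooses it. A minimal sketch with a hypothetical square macro:

```rust
// `square` is a hypothetical macro defined just for this sketch.
macro_rules! square {
    ($x:expr) => {
        $x * $x
    };
}

fn main() {
    // All three delimiters expand to the same code.
    let a = square!(3);
    let b = square![4];
    let c = square!{5};
    println!("{} {} {}", a, b, c);
}
```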