rust_ffi

note: I am not a Rust expert and I am still learning a ton. If you see something that can be improved, or is straight up wrong, let me know at ntranswe@gmail.com

At the end of the day, your tech stack is just a (hopefully carefully chosen) means to an end, but being able to use tools you enjoy at work is really nice. While I still have a billion things to study and I’m only a few years into industry, I’m starting to get an idea of what I like and dislike.

I remember taking my first computer science course in college and learning Python. The language felt so clean and intuitive. Just glue some libraries together and it would work. Freshman me took things for granted and ignored the scary and mysterious iceberg that existed underneath print("Hello, World!"). I didn’t care about the stack and the heap and what a syscall was. At one point in undergrad I probably thought it’d be my main language as a software engineer. In fact, I even recently had a job offer to do back-end Python/Java/AWS/K8s stuff… but I ended up turning it down for a different opportunity (primarily for pay + career growth + a cooler product, although the tech stack was still a small but significant factor).

I’ve realized I dislike dynamic typing. At one of my previous roles, I wrote a bunch of Python scripts wrapping APIs from a hardware SDK, and automated a bunch of annoying manual work revolving around the SDK’s GUI. Now generally, the ideal goal of any API is to empower a developer without forcing them to look under the hood too much, but this is far from reality. In particular, it seems like abstractions tend to leak out even more in the world of embedded. Figuring out these APIs’ internal runtime errors in tandem with cryptic, barely-documented types wasn’t very fun. Even coming back to my own scripts months later after losing mental context sometimes felt painful. I tried mitigating this with type hints, docstrings, and following PEP8 to the best of my abilities, but I just don’t get excited about reading/writing Python too much. To be fair though, it’s very possible I just don’t have enough experience with it.

Anyway, I now prefer statically-typed languages with a focus on compile-time errors. I’m happy to say I’m primarily working in Rust now, which is quite fortunate in the grand scheme of things. The overwhelming majority of the embedded industry still runs on C and C++, and from what I’ve read recently, hobbyist Rust users wish they could use Rust in their job, but the demand just isn’t quite widespread enough from most employers yet. In my case, I somehow stumbled into the domain of network security/cryptography, where this press release from the White House certainly helped me out a bit.

Actually, I even got to use it for some months at the tail-end of a former role, but I was rushed into it with inadequate time and little context in order to deliver an MVP. In this post, I’ll be showing you what that looked like, along with some lessons I’ve learned along the way.

constructors/in-place initialization

To start, here’s some miscellaneous context about the environment I was developing in:

With that out of the way, let’s get into some nasty code I wrote. I was rewriting an interface around Xilinx’s board support package APIs, specifically their SPI drivers. They’re written in C, so I had to use FFI. The purpose behind the interface is unimportant here.

#include <tuple> // for std::ignore

class SpiDriver {
public:
    SpiDriver() {
        XSpiPs_Config config = {
            // blah blah
        };

        std::ignore = XSpiPs_CfgInitialize(&device_, &config, 0xDEADBEEF);
        // pass device_ through other various initialization APIs
    }

private:
    XSpiPs device_;
};

use core::mem::MaybeUninit;

extern "C" {
    // Xilinx BSP stuff
}

pub struct SpiDriver {
    device: XSpiPs,
}

impl SpiDriver {
    pub fn new() -> Result<Self, SpiError> {
        let mut device = MaybeUninit::<XSpiPs>::uninit();
        let config = XSpiPs_Config { 
            // blah blah
        };
        unsafe {
            let _ = XSpiPs_CfgInitialize(device.as_mut_ptr(), &config, 0xDEADBEEF);
            // pass device through other various initialization APIs
            let device = device.assume_init();
            Ok(Self { device })
        }
    }
}

Now if you haven’t already, take a look at the docs for XSpiPs_CfgInitialize(). With some context from the docs, and some familiarity with Rust’s move semantics, you can spot the issue:

Here is the first lesson I was forced to learn coming from C++ and being C++ brained:

Rust does NOT have built-in “emplace” constructors and in-place stack initialization the same way C++ does. (…yet)

When I instantiate an XSpiPs inside SpiDriver::new() and return it wrapped in a Result, it gets copied out to the caller, which changes its underlying memory address and breaks the contract I was supposed to uphold, invoking undefined behavior. Starting out in Rust, I saw people define new() methods for their structs everywhere and thought, “oh it’s basically like a C++ constructor!” That was terribly wrong and they are not the same at all. Naming these functions new() is just customary and it’s not even a special keyword.

The key difference I’m trying to point out here is that the C++ version worked because C++ constructors fully own the section of memory they operate on when they run, and the base address of the created object is stable from that point on. The memory is initialized in-place. The contract stated in the docs was preserved with the C++ version. Again, in my Rust version, SpiDriver::new() is just an ordinary associated function: the XSpiPs instance is initialized inside new()’s stack frame and then moved (a bitwise copy) out to the caller, changing its address and breaking the API’s contract. Normally, types in Rust do not have to care about their memory addresses, but I was in FFI-land using C data types here.
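To make the address problem concrete, here is a minimal sketch that runs on a host machine. FakeDevice and its methods are hypothetical stand-ins, not the Xilinx types, and it assumes the init routine cares about where the struct lives, as described above. The device is initialized in one array slot and then moved to the other with a swap, so the two addresses are guaranteed to differ.

```rust
/// Hypothetical stand-in for a C driver struct whose init routine
/// remembers where the struct lives (this is NOT the real XSpiPs layout).
#[derive(Default)]
pub struct FakeDevice {
    registered_addr: usize,
}

impl FakeDevice {
    /// Mimics a C `Initialize(Device *ptr)` call that records `ptr`.
    pub fn init_in_place(&mut self) {
        let addr = &*self as *const Self as usize;
        self.registered_addr = addr;
    }

    /// True while the struct still lives at the address it registered.
    pub fn address_still_valid(&self) -> bool {
        self.registered_addr == self as *const Self as usize
    }
}

/// Initializes slot 0 in place, then moves the device into slot 1.
/// Returns (valid before the move, valid after the move).
pub fn move_breaks_contract() -> (bool, bool) {
    let mut slots = [FakeDevice::default(), FakeDevice::default()];
    slots[0].init_in_place();
    let before = slots[0].address_still_valid();
    slots.swap(0, 1); // a move: the initialized bytes now live in slots[1]
    let after = slots[1].address_still_valid();
    (before, after)
}
```

Calling move_breaks_contract() reports that the registered address is valid before the swap and stale afterwards. Returning the struct from new() is the same kind of move, except that the compiler is free to decide whether the bytes actually change location.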

Typical solutions to this could be to use Box<T> inside SpiDriver::new(), or a static mut instance of XSpiPs for initialization on startup, but as mentioned at the start, those were out of the equation. In the end, as I was in a rush, what I opted to do was just split the creation of the struct and its initialization into two separate methods:

use core::mem::MaybeUninit;

extern "C" {
    // Xilinx BSP stuff
}

pub struct SpiDriver {
    device: MaybeUninit<XSpiPs>,
}

impl SpiDriver {
    pub fn new() -> Self {
        Self { 
            // uninit() is a safe fn; only reading the value back is unsafe
            device: MaybeUninit::uninit(),
        }
    }

    pub fn initialize(&mut self) -> Result<(), SpiError> {
        let config = XSpiPs_Config { 
            // blah blah
        };
        unsafe {
            // initialize the field in place, so the address the C API sees
            // is the address where this SpiDriver actually lives
            let _ = XSpiPs_CfgInitialize(self.device.as_mut_ptr(), &config, 0xDEADBEEF);
            // pass self.device.as_mut_ptr() through other various initialization APIs
            Ok(())
        }
    }
}

Notice how the device field of SpiDriver is now wrapped in MaybeUninit. While this is more correct semantically (rather than filling the fields of XSpiPs with meaningless values in SpiDriver::new()), ergonomically it created more pain, since every method that wants to use self.device would now have to go through something like assume_init_mut() or as_mut_ptr(). This choice would indirectly come back to bite me down the line, and I’ll explain this in a later section.
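Here is a rough sketch of what that ergonomic cost looks like. FakeDevice, Driver, and the initialized flag are all hypothetical and only exist for illustration (the flag is my addition so the example can be checked safely; it is not part of the original design). Every accessor has to pass through an unsafe assume_init_*() call and justify it.

```rust
use core::mem::MaybeUninit;

/// Hypothetical stand-in for the C struct (not the real XSpiPs).
#[repr(C)]
pub struct FakeDevice {
    pub is_ready: u32,
}

pub struct Driver {
    device: MaybeUninit<FakeDevice>,
    // hypothetical runtime flag, added only so the sketch stays sound
    initialized: bool,
}

impl Driver {
    pub fn new() -> Self {
        Self { device: MaybeUninit::uninit(), initialized: false }
    }

    pub fn initialize(&mut self) {
        // write() fills the slot in place without reading the old bytes
        self.device.write(FakeDevice { is_ready: 1 });
        self.initialized = true;
    }

    /// Each accessor needs its own unsafe block and safety argument.
    pub fn is_ready(&self) -> Option<bool> {
        if !self.initialized {
            return None;
        }
        // SAFETY: `initialized` is only set after write() in initialize()
        let device = unsafe { self.device.assume_init_ref() };
        Some(device.is_ready == 1)
    }
}
```

Before initialize() runs, is_ready() returns None; afterwards it returns Some(true). The typestate approach later in the post removes the need for a runtime flag like this one.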

Anyway, while it was very far from ideal, it was working and I moved on to other things. I wrote a long comment explaining why the initialization had to be split into two methods and called it a day.

Some days after, a principal software engineer I was working with proceeded to make the exact same C++ brained mistake I did.

misuse-resistance with Pin

As stated earlier, my interface was far from ideal. The core of writing Rust interfaces around C functions is that we provide a “safe” interface around “unsafe” calls, but my design is still very prone to mistakes. The move and subsequent change of memory addresses at the call site of SpiDriver::new() was really annoying, but the problem persists even after calling SpiDriver::initialize():

    let mut spi = SpiDriver::new();
    let mut other_spi = SpiDriver::new();

    let _ = spi.initialize();
    let _ = other_spi.initialize();

    for device in [spi, other_spi] { // oops
        // undefined behavior now
    }

The snippet of for device in [spi, other_spi] just sneakily performed a move into an array and broke the contract once again! Let’s change my interface to be more misuse-resistant with Pin:

use core::mem::MaybeUninit;
use core::marker::PhantomPinned;
use core::pin::{Pin, pin};

extern "C" {
    // Xilinx BSP stuff
}

pub struct SpiDriver {
    device: MaybeUninit<XSpiPs>,
    _pin: PhantomPinned,
}

impl SpiDriver {
    pub fn new() -> Self {
        Self {
            device: MaybeUninit::uninit(),
            _pin: PhantomPinned,
        }
    }

    pub fn initialize(self: Pin<&mut Self>) -> Result<(), SpiError> {
        let config = XSpiPs_Config { 
            // blah blah
        };
        unsafe {
            // SAFETY: we only take a pointer to the device field and never
            // move anything out of the pinned SpiDriver
            let this = self.get_unchecked_mut();
            let _ = XSpiPs_CfgInitialize(this.device.as_mut_ptr(), &config, 0xDEADBEEF);
            // pass this.device.as_mut_ptr() through other various initialization APIs
            Ok(())
        }
    }
}

fn try_to_move(_x: SpiDriver) {
}

fn main() {
    let unpinned_driver = SpiDriver::new();
    let mut pinned_driver: Pin<&mut SpiDriver> = pin!(unpinned_driver);
    let _ = pinned_driver.as_mut().initialize();
    try_to_move(*pinned_driver); // fails to compile
}

With Pin, we make the user promise not to move their pinned instance of SpiDriver. From my understanding, in my usage, Pin is a bit more of a semantic indication rather than compile-time enforcement of address stability. The real compile-time enforcement simply comes from the fact that the Pin is wrapping a &mut SpiDriver, rather than the user having a raw SpiDriver. With the call of try_to_move(), the compiler rejects the program with error E0507, since we cannot move out of the dereference of a Pin<&mut SpiDriver>.
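As a small check of the positive case, the sketch below pins a hypothetical address-sensitive type on the stack and confirms that the address seen during initialization is the same one seen by a later call. Pinned and its methods are illustrative names only, and registered_addr stands in for whatever the C side might remember.

```rust
use core::marker::PhantomPinned;
use core::pin::{pin, Pin};

/// Hypothetical address-sensitive type (stand-in for the wrapped C struct).
pub struct Pinned {
    registered_addr: usize,
    _pin: PhantomPinned, // opts the type out of Unpin
}

impl Pinned {
    pub fn new() -> Self {
        Self { registered_addr: 0, _pin: PhantomPinned }
    }

    pub fn initialize(self: Pin<&mut Self>) {
        // SAFETY: we only write a field; nothing is moved out of `self`
        let this = unsafe { self.get_unchecked_mut() };
        let addr = &*this as *const Self as usize;
        this.registered_addr = addr;
    }

    pub fn address_still_valid(self: Pin<&Self>) -> bool {
        self.registered_addr == self.get_ref() as *const Self as usize
    }
}

/// Pins a value to this stack frame, initializes it, then checks
/// that later calls still see it at the registered address.
pub fn pinned_address_is_stable() -> bool {
    let mut driver = pin!(Pinned::new());
    driver.as_mut().initialize();
    driver.as_ref().address_still_valid()
}
```

Because the caller only ever holds a Pin<&mut Pinned>, safe code has no way to get a plain &mut Pinned to swap or move the value, which is why the check holds.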

enforcement of state machine rules at compile-time

So now it’s MUCH harder to accidentally move around an instance of SpiDriver, but what about other forms of misuse with my interface? The SPI hardware itself is basically a state machine, and has valid or invalid API calls depending on what state it is in. For example, let’s say we have a method to transfer data across the SPI bus:

impl SpiDriver {
    pub fn data_transfer(self: Pin<&mut Self>, tx: &[u8], rx: &mut [u8]) -> Result<(), SpiError> {
        // ...
        Ok(())
    }
}

fn main() {
    let unpinned_driver = SpiDriver::new();
    let mut pinned_driver: Pin<&mut SpiDriver> = pin!(unpinned_driver);

    let x = [0u8; 32];
    let mut y = [0u8; 32];

    // incorrect order, SPI hardware hasn't been initialized
    let _ = pinned_driver.as_mut().data_transfer(&x, &mut y);
    let _ = pinned_driver.as_mut().initialize();
}

In most languages, we would just have to let this suffice and leave it up to the user to avoid potential runtime errors. However, with Rust’s type system, we can actually force this to be a compile-time error. For me, this is personally one of the most mind-blowing things about this language. State machines are everywhere in embedded after all. The name of this design pattern is the typestate pattern:

use core::mem::MaybeUninit;
use core::marker::PhantomPinned;
use core::pin::{Pin, pin};

extern "C" {
    // Xilinx BSP stuff
}

// new states
// empty structs are ZSTs that are optimized away, so there is 0 runtime overhead
pub struct Uninitialized;
pub struct Initialized;

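// repr(C) pins down the field layout; the default Rust representation
// makes no layout promises between SpiDriver<Uninitialized> and SpiDriver<Initialized>
#[repr(C)]
pub struct SpiDriver<S> {
    device: MaybeUninit<XSpiPs>,
    _pin: PhantomPinned,
    _state: core::marker::PhantomData<S>,
}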

impl SpiDriver<Uninitialized> {
    pub fn new() -> Self {
        Self {
            device: MaybeUninit::uninit(),
            _pin: PhantomPinned,
            _state: core::marker::PhantomData,
        }
    }

    pub fn initialize(self: Pin<&mut Self>) -> Result<Pin<&mut SpiDriver<Initialized>>, SpiError> {
        // do initialization as before...

        // the Pin makes us do an ugly cast for the moment of state transition, 
        // but it's fine since SpiDriver<Uninitialized> and SpiDriver<Initialized> 
        // are identical in layout... because again, the state is a ZST
        unsafe {
            let ptr = Pin::get_unchecked_mut(self) as *mut SpiDriver<Uninitialized>
                as *mut SpiDriver<Initialized>;
            Ok(Pin::new_unchecked(&mut *ptr))
        }
    }
}

impl SpiDriver<Initialized> {
    pub fn data_transfer(self: Pin<&mut Self>, _tx: &[u8], _rx: &mut [u8]) -> Result<(), SpiError> {
        // ...
        Ok(())
    }
}

fn main() {
    let unpinned_driver = SpiDriver::<Uninitialized>::new();
    let mut pinned_driver: Pin<&mut SpiDriver<Uninitialized>> = pin!(unpinned_driver);
    
    let tx = [0u8; 32];
    let mut rx = [0u8; 32];
    
    // since pinned_driver's state is the wrong type, the below fails to compile
    // let _ = pinned_driver.as_mut().data_transfer(&tx, &mut rx);

    let mut initialized_driver: Pin<&mut SpiDriver<Initialized>> =
        pinned_driver.as_mut().initialize().unwrap();
        
    // this compiles fine
    let _ = initialized_driver.as_mut().data_transfer(&tx, &mut rx);
}

If we uncomment the data_transfer() call on a SpiDriver<Uninitialized>, we get this message:

I think this is pretty cool. Obviously, there is more overhead in reading this interface as there are wrappings on top of wrappings. But I’d rather have the problems be explicit and upfront at compile-time, instead of bashing my head against a wall at runtime.
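The zero-overhead and identical-layout claims can be checked with size_of and align_of. This is only a sketch: payload is a hypothetical stand-in for the wrapped C struct, and the #[repr(C)] attribute is my assumption to pin the layout down, since the default Rust representation does not formally promise that two instantiations of a generic struct are laid out the same way.

```rust
use core::marker::{PhantomData, PhantomPinned};
use core::mem::{align_of, size_of};

pub struct Uninitialized;
pub struct Initialized;

/// `payload` is a hypothetical stand-in for the wrapped C struct.
#[repr(C)]
pub struct SpiDriver<S> {
    payload: [u32; 4],
    _pin: PhantomPinned,
    _state: PhantomData<S>,
}

/// Sizes of the state marker, the PhantomData wrapper, and PhantomPinned.
pub fn marker_sizes() -> (usize, usize, usize) {
    (
        size_of::<Uninitialized>(),
        size_of::<PhantomData<Initialized>>(),
        size_of::<PhantomPinned>(),
    )
}

/// True if both typestates have the same size and alignment.
pub fn layouts_match() -> bool {
    size_of::<SpiDriver<Uninitialized>>() == size_of::<SpiDriver<Initialized>>()
        && align_of::<SpiDriver<Uninitialized>>() == align_of::<SpiDriver<Initialized>>()
}
```

All three markers come out as zero bytes, and the whole driver is exactly as large as its payload, so the state parameter only exists at compile time.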

Now, I’ve been writing a bunch of vague sample code with missing lines based off of my embedded work in the past. Here’s a version that’s stripped of all the BSP stuff so that it actually compiles:

use core::marker::PhantomPinned;
use core::pin::{Pin, pin};

// new states
// empty structs are ZSTs that are optimized away, so there is 0 runtime overhead
pub struct Uninitialized;
pub struct Initialized;

pub struct SpiDriver<S> {
    _pin: PhantomPinned,
    _state: core::marker::PhantomData<S>,
}

impl SpiDriver<Uninitialized> {
    pub fn new() -> Self {
        Self {
            _pin: PhantomPinned,
            _state: core::marker::PhantomData,
        }
    }

    pub fn initialize(self: Pin<&mut Self>) -> Result<Pin<&mut SpiDriver<Initialized>>, ()> {
        // the Pin makes us do an ugly cast for the moment of state transition, 
        // but it's safe since SpiDriver<Uninitialized> and SpiDriver<Initialized> 
        // are identical in layout... because again, the state is a ZST
        unsafe {
            let ptr = Pin::get_unchecked_mut(self) as *mut SpiDriver<Uninitialized>
                as *mut SpiDriver<Initialized>;
            Ok(Pin::new_unchecked(&mut *ptr))
        }
    }
}

impl SpiDriver<Initialized> {
    pub fn data_transfer(self: Pin<&mut Self>, _tx: &[u8], _rx: &mut [u8]) -> Result<(), ()> {
        // ...
        Ok(())
    }
}

fn main() {
    let unpinned_driver = SpiDriver::<Uninitialized>::new();
    let mut pinned_driver: Pin<&mut SpiDriver<Uninitialized>> = pin!(unpinned_driver);
    
    let tx = [0u8; 32];
    let mut rx = [0u8; 32];
    
    // since pinned_driver's state is the wrong type, the below fails to compile
    // let _ = pinned_driver.as_mut().data_transfer(&tx, &mut rx);

    let mut initialized_driver: Pin<&mut SpiDriver<Initialized>> =
        pinned_driver.as_mut().initialize().unwrap();
        
    // this compiles fine
    let _ = initialized_driver.as_mut().data_transfer(&tx, &mut rx);
}

With all this, my SPI interface is now much harder to misuse. In summary, we’ve learned a few lessons here:

a silly bonus lesson

My SPI wrapper was finally solid and things worked. Tests on target were passing… until suddenly they weren’t. Remember when I mentioned how my use of MaybeUninit came back to bite me?

To be specific, it wasn’t just MaybeUninit, but rather MaybeUninit::uninit(). I was passing an uninitialized XSpiPs to XSpiPs_CfgInitialize(). I mean, makes sense right? After all, the docs describe it like so:

Eventually as the codebase progressed further, I suddenly started receiving the error return XST_DEVICE_IS_STARTED from the C API. I remember stupidly wasting a few hours on this until I just decided to look at the API’s implementation:

s32 XSpiPs_CfgInitialize(XSpiPs *InstancePtr, const XSpiPs_Config *ConfigPtr,
             u32 EffectiveAddr)
{
    s32 Status;
    Xil_AssertNonvoid(InstancePtr != NULL);
    Xil_AssertNonvoid(ConfigPtr != NULL);

    /*
     * If the device is busy, disallow the initialize and return a status
     * indicating it is already started. This allows the user to stop the
     * device and re-initialize, but prevents a user from inadvertently
     * initializing. This assumes the busy flag is cleared at startup.
     */
    if (InstancePtr->IsBusy == TRUE) {
        Status = (s32)XST_DEVICE_IS_STARTED;
    } else {
        // ...
    }
    // ...
}

…It turns out, before actually initializing my passed XSpiPs, they were first checking one of its fields for some weird edge case. As a result, they were reading into my garbage stack memory, and I had undefined behavior for a while without even realizing. The fact that it worked for a period was pure luck. What I really should’ve done was MaybeUninit::zeroed(), or even core::mem::zeroed() (to avoid MaybeUninit’s ergonomics) which are basically the equivalents of a typical defensive memset(data, 0, size);.
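A minimal sketch of the zeroed approach, using a mock in place of the BSP. FakeDevice, fake_cfg_initialize(), and the status constants are hypothetical and only mimic the shape of the C function quoted above, which reads the busy flag before writing anything.

```rust
use core::mem::MaybeUninit;

// Hypothetical status codes, not the real Xilinx values.
pub const FAKE_SUCCESS: i32 = 0;
pub const FAKE_DEVICE_IS_STARTED: i32 = 1;

/// Hypothetical stand-in for the C struct (not the real XSpiPs layout).
#[repr(C)]
pub struct FakeDevice {
    pub is_busy: u32,
    pub is_ready: u32,
}

/// Mimics the C init routine: it READS is_busy before initializing,
/// so the caller must hand it memory with a defined value there.
pub unsafe fn fake_cfg_initialize(instance: *mut FakeDevice) -> i32 {
    // SAFETY: the caller promises `instance` points to readable, writable memory
    unsafe {
        if (*instance).is_busy == 1 {
            return FAKE_DEVICE_IS_STARTED;
        }
        (*instance).is_ready = 1;
    }
    FAKE_SUCCESS
}

/// Starts from zeroed memory, the Rust analogue of a defensive memset().
pub fn init_from_zeroed() -> (i32, u32) {
    let mut device = MaybeUninit::<FakeDevice>::zeroed();
    // SAFETY: the pointer is valid and every byte behind it is defined (zero)
    let status = unsafe { fake_cfg_initialize(device.as_mut_ptr()) };
    // SAFETY: all-zero bytes are a valid FakeDevice, since it only holds integers
    let device = unsafe { device.assume_init() };
    (status, device.is_ready)
}

/// Shows the "already started" path being triggered by a set busy flag.
pub fn init_while_busy() -> i32 {
    let mut device = FakeDevice { is_busy: 1, is_ready: 0 };
    // SAFETY: `device` is a live, fully initialized local
    unsafe { fake_cfg_initialize(&mut device) }
}
```

With zeroed memory the busy check is deterministic and initialization succeeds. With uninit() the same check would read undefined bytes, which is undefined behavior on the Rust side no matter what value happens to be there.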

Very ironically, if I was writing my code in C, I probably would’ve remembered this habit and I wouldn’t have run into this strange issue. I let my guard down just because I was writing Rust, but I shouldn’t have as I was in FFI/unsafe land.

Actually… if I was writing C, I probably would’ve just initialized my XSpiPs instance statically, that way it’s automatically zeroed out on startup. Too bad I was discouraged from using a global static mut in Rust :D

I ended up writing a minor GitHub issue, although I don’t think I’d blame the Xilinx devs for this. I think it’s just kind of funny because a lot of somewhat niche preconditions had to be in place for me to run into this.
