Steve Klabnik recently wrote about whether out parameters are idiomatic in Rust. The post ends by showing a snippet of code: a generic function, with a non-generic function inside of it which contains the actual implementation. Steve says this pattern may warrant its own post, so here is that post, where I’ll explain why this inner function is useful, discuss the trade-offs of doing it, and describe why this pattern will hopefully not be necessary in the future.
The snippet Steve showed is this one:
// Taken from https://steveklabnik.com/writing/are-out-parameters-idiomatic-in-rust
pub fn read_to_string<P: AsRef<Path>>(path: P) -> io::Result<String> {
    fn inner(path: &Path) -> io::Result<String> {
        let mut file = File::open(path)?;
        let mut string = String::with_capacity(initial_buffer_size(&file));
        file.read_to_string(&mut string)?;
        Ok(string)
    }
    inner(path.as_ref())
}
The example code from Steve’s post.
Notice that the outer function is generic, taking anything which is convertible via the AsRef trait into a &Path, while the inner function is not generic, only taking a &Path. The outer function does nothing but perform the as_ref() conversion and then pass the result to the inner function.
So why have two functions at all?
Well, Rust generics are monomorphized at compile time. That means the compiler identifies all the concrete types which any generic function or type is called with throughout the codebase, and generates copies of the generic code, specialized to those concrete types. So, for example, if read_to_string() were called with a PathBuf, the compiler would generate a copy of read_to_string() which takes a PathBuf specifically. This is how we can have code that is both generic and fast to call (not involving any runtime checking to identify the implementation to use).
We can imagine this monomorphization looks like this:
use std::path::{Path, PathBuf};
use std::error::Error as StdError;
use std::fs::File;
use std::io::{self, Read as _};

type Error = Box<dyn StdError + Send + Sync + 'static>;

// Copied from https://doc.rust-lang.org/src/std/fs.rs.html#198-203
fn initial_buffer_size(file: &File) -> usize {
    // Allocate one extra byte so the buffer doesn't need to grow before the
    // final `read` call at the end of the file. Don't worry about `usize`
    // overflow because reading will fail regardless in that case.
    file.metadata().map(|m| m.len() as usize + 1).unwrap_or(0)
}

// Adapted from code in https://steveklabnik.com/writing/are-out-parameters-idiomatic-in-rust
pub fn read_to_string<P: AsRef<Path>>(path: P) -> io::Result<String> {
    let mut file = File::open(path.as_ref())?;
    let mut string = String::with_capacity(initial_buffer_size(&file));
    file.read_to_string(&mut string)?;
    Ok(string)
}

fn main() -> Result<(), Error> {
    let path = {
        let mut path = PathBuf::new();
        path.push("some");
        path.push("path");
        path.push("with");
        path.push("a");
        path.push("file.txt");
        path // the path is now 'some/path/with/a/file.txt'
    };

    let contents = read_to_string(path)?;
    println!("{}", contents);
    Ok(())
}

/*
// At compile time, this could become:

use std::path::{Path, PathBuf};
use std::error::Error as StdError;
use std::fs::File;
use std::io::{self, Read as _};

type Error = Box<dyn StdError + Send + Sync + 'static>;

// Copied from https://doc.rust-lang.org/src/std/fs.rs.html#198-203
fn initial_buffer_size(file: &File) -> usize {
    // Allocate one extra byte so the buffer doesn't need to grow before the
    // final `read` call at the end of the file. Don't worry about `usize`
    // overflow because reading will fail regardless in that case.
    file.metadata().map(|m| m.len() as usize + 1).unwrap_or(0)
}

// Adapted from code in https://steveklabnik.com/writing/are-out-parameters-idiomatic-in-rust
pub fn read_to_string_path_buf(path: PathBuf) -> io::Result<String> {
    // The generic `path.as_ref()` call has been resolved to the concrete
    // `<PathBuf as AsRef<Path>>::as_ref` implementation.
    let mut file = File::open(<PathBuf as AsRef<Path>>::as_ref(&path))?;
    let mut string = String::with_capacity(initial_buffer_size(&file));
    file.read_to_string(&mut string)?;
    Ok(string)
}

fn main() -> Result<(), Error> {
    let path = {
        let mut path = PathBuf::new();
        path.push("some");
        path.push("path");
        path.push("with");
        path.push("a");
        path.push("file.txt");
        path // the path is now 'some/path/with/a/file.txt'
    };

    let contents = read_to_string_path_buf(path)?;
    println!("{}", contents);
    Ok(())
}
*/
An example of what monomorphization looks like.
While the generated code doesn’t look too different from the original code, consider that some generic functions may be called with several different concrete types throughout a codebase, with each concrete type causing the generation of a complete copy of the original generic code. This creates a tension and a trade-off between compilation speed and the size of the resulting binary on one side, and generic, reusable code on the other.
To explain: Rust compilation is complicated, and slow compile times can have any number of causes, but code generation is consistently identified as one of the slowest compilation phases. Monomorphization is one form of code generation in Rust, so using generic code more heavily, and thus causing the generation of more concrete copies of a generic function or type, contributes to a crate compiling slowly. Those copies are also made in their entirety, meaning you may end up with a lot of repeated code in the binary, one copy for each concrete type, bloating the resulting executable file.
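To make that concrete, here is a minimal sketch of my own (not code from Steve’s post, and the file names are just placeholders) showing a fully generic function called with three different path types. Each distinct concrete type gets its own monomorphized copy of the entire function body; the exact machine code also depends on inlining and optimization, but each instance is generated in full:
use std::fs::File;
use std::io::{self, Read as _};
use std::path::{Path, PathBuf};

// A fully generic function: its whole body is monomorphized once per
// concrete type it is called with.
pub fn read_to_string<P: AsRef<Path>>(path: P) -> io::Result<String> {
    let mut file = File::open(path.as_ref())?;
    let mut string = String::new();
    file.read_to_string(&mut string)?;
    Ok(string)
}

fn main() -> io::Result<()> {
    // Three distinct concrete types for P, so the compiler emits three
    // specialized copies of read_to_string's body:
    let _a = read_to_string("a.txt")?;                // P = &str
    let _b = read_to_string(String::from("b.txt"))?;  // P = String
    let _c = read_to_string(PathBuf::from("c.txt"))?; // P = PathBuf
    Ok(())
}
A sketch showing one monomorphized copy of the generic function per concrete argument type.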
However, the ability to be generic and reusable is valuable, and taking in anything implementing AsRef<Path> is often in principle preferable to taking in a &Path (this is why we have the standard conversion traits like AsRef or Into in the first place).
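As a quick illustration of that flexibility, here is a small sketch of my own (the function names flexible and strict are hypothetical): a P: AsRef<Path> parameter lets callers pass string literals, Strings, or PathBufs directly, while a &Path parameter forces every caller to perform the conversion themselves.
use std::path::{Path, PathBuf};

// Generic version: callers can pass anything convertible to a &Path.
fn flexible<P: AsRef<Path>>(path: P) {
    let path: &Path = path.as_ref();
    println!("opening {}", path.display());
}

// Concrete version: callers must hand over a &Path themselves.
fn strict(path: &Path) {
    println!("opening {}", path.display());
}

fn main() {
    flexible("file.txt");                // &str works
    flexible(String::from("file.txt"));  // String works
    flexible(PathBuf::from("file.txt")); // PathBuf works

    // With the strict version, each caller does the conversion:
    strict(Path::new("file.txt"));
    strict(PathBuf::from("file.txt").as_path());
}
A sketch of why accepting AsRef<Path> is more convenient for callers than accepting &Path.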
So, what do you do?
Well, you can do what Steve does in his example. Most of the function body isn’t generic; the generic parameter exists only to be flexible in the input type, which the function then converts into a single concrete type. So, you can create a separate function which takes the converted-to type, and have the generic function do nothing but perform the conversion and call the non-generic function. This way, only the thin conversion wrapper is monomorphized for each concrete type, and both the time spent generating code and the size of the generic code in the resulting binary are reduced.
You could do this pattern with a function outside of the original like so:
// Adapted from https://steveklabnik.com/writing/are-out-parameters-idiomatic-in-rust
pub fn read_to_string<P: AsRef<Path>>(path: P) -> io::Result<String> {
    inner(path.as_ref())
}

fn inner(path: &Path) -> io::Result<String> {
    let mut file = File::open(path)?;
    let mut string = String::with_capacity(initial_buffer_size(&file));
    file.read_to_string(&mut string)?;
    Ok(string)
}
An example showing the inner function placed outside the original generic function.
However, this does pollute the namespace with a function which will only ever have one caller. The one question to answer before putting this inner function inside of read_to_string is: if the inner function is inside the generic function, doesn’t it still contribute to code bloat?
The answer is no, and to prove it, let’s turn to the Rust Playground and its ability to generate MIR (Rust’s Mid-level Intermediate Representation, which is converted into LLVM IR and then into machine code). The MIR for read_to_string in the above full code example looks like this:
fn read_to_string(_1: P) -> std::result::Result<std::string::String, std::io::Error> {
    debug path => _1;                 // in scope 0 at src/main.rs:17:39: 17:43
    let mut _0: std::result::Result<std::string::String, std::io::Error>; // return place in scope 0 at src/main.rs:17:51: 17:69
    let mut _2: &std::path::Path;     // in scope 0 at src/main.rs:24:11: 24:24
    let _3: &std::path::Path;         // in scope 0 at src/main.rs:24:11: 24:24
    let mut _4: &P;                   // in scope 0 at src/main.rs:24:11: 24:15

    bb0: {
        StorageLive(_2);              // scope 0 at src/main.rs:24:11: 24:24
        StorageLive(_3);              // scope 0 at src/main.rs:24:11: 24:24
        StorageLive(_4);              // scope 0 at src/main.rs:24:11: 24:15
        _4 = &_1;                     // scope 0 at src/main.rs:24:11: 24:15
        _3 = <P as std::convert::AsRef<std::path::Path>>::as_ref(move _4) -> [return: bb2, unwind: bb3]; // scope 0 at src/main.rs:24:11: 24:24
                                      // mir::Constant
                                      // + span: src/main.rs:24:16: 24:22
                                      // + literal: Const { ty: for<'r> fn(&'r P) -> &'r std::path::Path {<P as std::convert::AsRef<std::path::Path>>::as_ref}, val: Value(Scalar(<ZST>)) }
    }

    bb1 (cleanup): {
        resume;                       // scope 0 at src/main.rs:17:1: 25:2
    }

    bb2: {
        _2 = _3;                      // scope 0 at src/main.rs:24:11: 24:24
        StorageDead(_4);              // scope 0 at src/main.rs:24:23: 24:24
        _0 = read_to_string::inner(move _2) -> [return: bb4, unwind: bb3]; // scope 0 at src/main.rs:24:5: 24:25
                                      // mir::Constant
                                      // + span: src/main.rs:24:5: 24:10
                                      // + literal: Const { ty: for<'r> fn(&'r std::path::Path) -> std::result::Result<std::string::String, std::io::Error> {read_to_string::inner}, val: Value(Scalar(<ZST>)) }
    }

    bb3 (cleanup): {
        drop(_1) -> bb1;              // scope 0 at src/main.rs:25:1: 25:2
    }

    bb4: {
        StorageDead(_2);              // scope 0 at src/main.rs:24:24: 24:25
        StorageDead(_3);              // scope 0 at src/main.rs:25:1: 25:2
        drop(_1) -> bb5;              // scope 0 at src/main.rs:25:1: 25:2
    }

    bb5: {
        return;                       // scope 0 at src/main.rs:25:2: 25:2
    }
}
A snippet of MIR showing the generic function is smaller with the non-generic inner function pattern.
This is a generated textual representation of MIR’s internal structure, so it may be a bit hard to read, but it shows the function performing the conversion (inside the bb0 block) and calling the inner function (inside the bb2 block). All the code which actually performs the file operations is contained inside the inner function, which is split out and does not contribute to the size of the outer, generic function in the generated code.
So, placing inner inside of the generic function limits its scope to only the callsite that needs it, without harming our goal of reducing compile time and the size of the resulting binary.
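To see the scoping benefit concretely, here is a small sketch of my own (not code from the post, with a simplified body) showing that a function nested inside another function is only nameable inside that function’s body, so it cannot pollute the surrounding module’s namespace:
use std::path::Path;

pub fn read_to_string<P: AsRef<Path>>(path: P) -> std::io::Result<String> {
    // `inner` is only visible inside `read_to_string`'s body.
    fn inner(path: &Path) -> std::io::Result<String> {
        std::fs::read_to_string(path)
    }
    inner(path.as_ref())
}

fn main() {
    // This works:
    let _ = read_to_string("file.txt");

    // But this would not compile; `inner` is not in scope here:
    // let _ = inner(Path::new("file.txt"));
    // error: cannot find function `inner` in this scope
}
A sketch showing that the nested inner function is invisible outside the generic function.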
The final question to ask in this situation is why the Rust compiler doesn’t just perform this optimization itself. In an ideal world, the Rust compiler would see that most of the function isn’t generic, or that the parameters are generic over one of the standard conversion traits, and would create an inner non-generic function and update the original function to perform the conversion and call the inner function.
That’s not the reality right now. The compiler isn’t smart enough to do this yet, although it could be in the future. If it did, everyone could get the benefit of this pattern without having to remember to apply it themselves, which would be lovely.
So, to summarize:
- The non-generic inner function pattern exists to reduce compile time and code bloat due to monomorphization of certain sorts of generic functions.
- The inner function ought to be placed inside the generic function, which reduces its scope to only the relevant location without causing any problems.
- Eventually, it would be nice if the Rust compiler could do this all automatically.