This post continues the series on Tide, sketching a possible design for routing and extraction that combines some of the best ideas from frameworks like Rocket, Actix, and Gotham.
- Routing is how the framework maps from an HTTP request to an endpoint, i.e. a piece of code intended to handle the request.
- Extraction is how an endpoint accesses data from the HTTP request.
The two concerns are usually somewhat coupled, because the extraction strategy shapes the signature of the endpoints that are being routed to. As we’ll see in this post, however, the coupling can be extremely loose.
Nothing in this post is set in stone! Rather, this is a sketch of one possible API direction, to kick off discussion and collaboration. Please leave your thoughts on the internals post.
A simple example
We’ll start with a very simple example “app” built on top of the routing and extraction system from this post, and then we’ll look at that system in closer detail.
The data
The app maintains a simple in-memory list of messages:
#[derive(Serialize, Deserialize)]
struct Message {
    contents: String,
    author: Option<String>,
    // etc...
}

/// A handle to an in-memory list of messages
#[derive(Clone)]
struct Database { /* ... */ }

impl Database {
    /// Create a handle to an empty database
    fn new() -> Database;
    /// Add a new message, returning its ID
    fn insert(&mut self, msg: Message) -> u64;
    /// Attempt to look up a message by ID
    fn get(&mut self, id: u64) -> Option<Message>;
    /// Attempt to edit a message; returns `false`
    /// if `id` is not found.
    fn set(&mut self, id: u64, msg: Message) -> bool;
}
This tiny API is meant as a stand-in for more complex back ends. The main point of interest is that Database is a handle to a database, meaning that it is Clone (and uses an Arc under the hood). We’ll see why that’s important later on.
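To make the handle idea concrete, here is one minimal way such a Database could be backed. This is only a sketch: the messages field, the Mutex, and the use of vector indices as IDs are assumptions made for illustration (the post leaves the internals unspecified), and the serde derives on Message are omitted so that the example depends only on the standard library.

```rust
use std::sync::{Arc, Mutex};

#[derive(Clone, Debug, PartialEq)]
struct Message {
    contents: String,
    author: Option<String>,
}

/// A cheap-to-clone handle; every clone points at the same list.
#[derive(Clone)]
struct Database {
    // Hypothetical storage: a shared, lock-protected vector.
    messages: Arc<Mutex<Vec<Message>>>,
}

impl Database {
    fn new() -> Database {
        Database { messages: Arc::new(Mutex::new(Vec::new())) }
    }

    /// Add a new message; its ID is its index in the vector.
    fn insert(&mut self, msg: Message) -> u64 {
        let mut messages = self.messages.lock().unwrap();
        messages.push(msg);
        (messages.len() - 1) as u64
    }

    fn get(&mut self, id: u64) -> Option<Message> {
        self.messages.lock().unwrap().get(id as usize).cloned()
    }

    /// Returns `false` if `id` is not found.
    fn set(&mut self, id: u64, msg: Message) -> bool {
        let mut messages = self.messages.lock().unwrap();
        match messages.get_mut(id as usize) {
            Some(slot) => {
                *slot = msg;
                true
            }
            None => false,
        }
    }
}
```

Because every clone shares the same Arc, a message inserted through one handle is visible through all the others.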
The web API: table of contents
We will build a simple, JSON-based web API for operating on this in-memory database. As per the last post, we’ll do this in two parts: a high-level “table of contents” showing how to route requests to endpoints, and then lower-level endpoint definitions.
The table of contents for the app is specified via a builder API:
fn main() {
    // The endpoints will receive a handle to the app state, i.e. a `Database` instance
    let mut app = App::new(Database::new());
    app.at("/message").post(new_message);
    app.at("/message/{}").get(get_message);
    app.at("/message/{}").put(set_message);
    app.serve();
}
Posting at /message creates a new message, while /message/{} allows retrieving and editing existing messages. The {} segment matches any single URL segment (not containing a separator). We’ll see how to extract the matched data momentarily.
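As a rough illustration of what “matches any single URL segment” means, the following sketch compares a route pattern with a request path one segment at a time and collects whatever each {} matched. The function name match_route and its return type are hypothetical and not part of the proposed API.

```rust
/// Match `path` against `pattern`, returning the text captured by each
/// `{}` segment, or `None` if the path does not match.
fn match_route<'a>(pattern: &str, path: &'a str) -> Option<Vec<&'a str>> {
    let mut pattern_segments = pattern.trim_matches('/').split('/');
    let mut path_segments = path.trim_matches('/').split('/');
    let mut captures = Vec::new();
    loop {
        match (pattern_segments.next(), path_segments.next()) {
            // Both exhausted at the same time: a full match.
            (None, None) => return Some(captures),
            // `{}` accepts exactly one non-empty segment.
            (Some("{}"), Some(seg)) if !seg.is_empty() => captures.push(seg),
            // A fixed segment must match literally.
            (Some(p), Some(seg)) if p == seg => {}
            // Length mismatch or differing fixed segments.
            _ => return None,
        }
    }
}
```

With this sketch, /message/42 matches /message/{} and captures "42", while /message/1/2 does not match because {} never spans a separator.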
The web API: endpoint implementation
To finish out the app, we need to implement the endpoint functions we passed into the table of contents.
Insertion
Let’s start with the new_message endpoint:
async fn new_message(mut db: AppState<Database>, msg: Json<Message>) -> Display<u64> {
    Display(db.insert(msg.0))
}
First off, we’re using async fn to write the endpoint. This feature, currently available on Nightly, allows you to write futures-based code with ease. The function signature is equivalent to:
fn new_message(mut db: AppState<Database>, msg: Json<Message>) -> impl Future<Output = Display<u64>>
Every endpoint signature has this same form:
- Zero or more arguments, each of which implements the Extractor trait. The Extractor implementation says how the argument data should be extracted from the request. Generally, extractors are just wrapper structs with public fields.
- An asynchronous return value that can be transformed into a response (via an IntoResponse trait).
For new_message, we use two extractors: one to get a handle to the application state (a Database), and another to extract the body (as a JSON-encoded Message).
Within the body of the function, we can use the parameters directly. The extractor wrapper types implement Deref and DerefMut, and we can use .0 to extract the inner object when we need ownership:
db.insert(msg.0)
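For readers less familiar with this wrapper pattern, here is a minimal sketch of how a type like Json could expose its contents. The real extractor would also implement Extractor; only the Deref and DerefMut part is shown, and the exact impls are an assumption based on the description above.

```rust
use std::ops::{Deref, DerefMut};

/// A wrapper with a public field, as described above.
struct Json<T>(pub T);

impl<T> Deref for Json<T> {
    type Target = T;
    fn deref(&self) -> &T {
        &self.0
    }
}

impl<T> DerefMut for Json<T> {
    fn deref_mut(&mut self) -> &mut T {
        &mut self.0
    }
}
```

Methods of the inner type can then be called directly on the wrapper (for a Json<String>, both json.len() and json.push('!') work), while json.0 moves the inner value out.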
Finally, we return the identifier of the inserted message, a Display<u64> value.
Like Json, the Display type is a decorator saying to serialize the given value into a vanilla HTTP 200 with body generated by formatting via the standard Display trait.
In particular, it provides the following impl:
impl<T: fmt::Display> IntoResponse for Display<T> { ... }
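A minimal sketch of how that impl might be filled in follows. The Response struct is a simplified stand-in for the real HTTP response type, and this IntoResponse is a cut-down version of the trait with no streaming body; both are assumptions for illustration.

```rust
use std::fmt;

/// Simplified stand-in for an HTTP response.
struct Response {
    status: u16,
    body: String,
}

/// Cut-down version of the trait, without a streaming body type.
trait IntoResponse {
    fn into_response(self) -> Response;
}

/// Decorator: respond with 200 and the value's `fmt::Display` output.
struct Display<T>(pub T);

impl<T: fmt::Display> IntoResponse for Display<T> {
    fn into_response(self) -> Response {
        Response {
            status: 200,
            body: self.0.to_string(),
        }
    }
}
```

Under these assumptions, Display(42u64).into_response() yields a 200 response whose body is the text 42.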
Updates
Next, we’ll look at updating an existing message:
async fn set_message(mut db: AppState<Database>, id: Path<u64>, msg: Json<Message>) -> Result<(), NotFound> {
    if db.set(id.0, msg.0) {
        Ok(())
    } else {
        Err(NotFound)
    }
}
The basic setup here is quite similar to new_message. However, for this endpoint we need to extract the {} parameter from the URL. We use the Path extractor to say that the corresponding URL segment should parse as a u64 value, matching the ID type used by Database. Otherwise, the arguments and body of the function are pretty self-explanatory.
One important detail: the Result return value will serialize into a response via the serialization for the type it contains. Both () and NotFound serialize to responses with empty bodies, but the former generates a 200 response code, while the latter yields 404.
Note that this way of structuring the return type is just an example. In practice, you’d probably have a custom app error type with a more sophisticated serialization approach.
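To illustrate how a Result can defer to whichever value it holds, here is a small sketch. As before, Response is a simplified stand-in and IntoResponse is a cut-down, non-streaming version of the trait; the status codes follow the description above, and everything else is an assumption.

```rust
/// Simplified stand-in for an HTTP response.
struct Response {
    status: u16,
    body: String,
}

trait IntoResponse {
    fn into_response(self) -> Response;
}

struct NotFound;

// `()` becomes an empty 200 response.
impl IntoResponse for () {
    fn into_response(self) -> Response {
        Response { status: 200, body: String::new() }
    }
}

// `NotFound` becomes an empty 404 response.
impl IntoResponse for NotFound {
    fn into_response(self) -> Response {
        Response { status: 404, body: String::new() }
    }
}

// A `Result` delegates to whichever side it contains.
impl<T: IntoResponse, E: IntoResponse> IntoResponse for Result<T, E> {
    fn into_response(self) -> Response {
        match self {
            Ok(value) => value.into_response(),
            Err(error) => error.into_response(),
        }
    }
}
```

A custom app error type would slot in the same way: implement IntoResponse for it once, and every endpoint returning Result<T, AppError> picks it up.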
Retrieval
Finally, we can implement retrieval of messages:
async fn get_message(mut db: AppState<Database>, id: Path<u64>) -> Result<Json<Message>, NotFound> {
    if let Some(msg) = db.get(id.0) {
        Ok(Json(msg))
    } else {
        Err(NotFound)
    }
}
The only twist here is that we’re using the Json marker in the success case, indicating that we want to return a 200 response whose body is Message serialized via JSON.
Digging deeper
Walking through the example app already introduced many of the relevant APIs, but now it’s worth stepping back and seeing a more complete picture, as well as rationale and relationship to existing Rust web frameworks.
Design goals
There are a few core goals for the API design being sketched here:
- Make it very straightforward to understand how URLs map to code. We do this by making a sharp separation between routing and other concerns (including extraction), and by limiting the expressive power of routing.
- Make extraction and response serialization ergonomic. We do this by leveraging the trait system on both sides.
- Avoid macros and code generation at the core; prefer simple, “plain Rust” APIs. While macros can be very powerful, they can also obscure the underlying mechanics of a framework, and lead to hard-to-understand errors when things go wrong. While this is not a hard constraint, achieving the above two goals while using only “plain Rust” is a nice-to-have.
- Provide a clean mechanism for middleware and configuration. We’ll see later on how the proposed API is well-suited for extensibility and customization.
The design draws ideas liberally from Rocket, Gotham, and Actix-Web, with some new twists of its own. Let’s dig in!
Routing
For routing, to achieve the clarity goals, we follow these principles:
- Separate out routing via a “table of contents” approach, making it easy to see the overall app structure.
- No “fallback” in route matching; use match specificity. In particular, the order in which routes are added has no effect, and you cannot have two identical routes.
- Drive endpoint selection solely by URL and HTTP method. Other aspects of a request can affect middleware and the behavior of the endpoint, but not which endpoint is used in the successful case. So for example, middleware can perform authentication and avoid invoking the endpoint on failure, but it does this by explicitly choosing a separate way of providing a response, rather than relying on “fallback” in the router.
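The second and third principles can be pictured as a table keyed only by path and method. The sketch below is a deliberately simplified model (exact paths only, no {} segments); Table, Method, add, and lookup are hypothetical names, and returning Err for a duplicate is just one way a real router might reject identical routes.

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
#[allow(dead_code)]
enum Method {
    Get,
    Put,
    Post,
    Delete,
}

type Handler = fn() -> &'static str;

#[derive(Default)]
struct Table {
    // Keyed solely by (path, method), so insertion order cannot matter.
    routes: HashMap<(String, Method), Handler>,
}

impl Table {
    /// Register a route, refusing an identical (path, method) pair.
    fn add(&mut self, path: &str, method: Method, handler: Handler) -> Result<(), String> {
        let key = (path.to_string(), method);
        if self.routes.contains_key(&key) {
            return Err(format!("duplicate route: {:?} {}", method, path));
        }
        self.routes.insert(key, handler);
        Ok(())
    }

    /// Select a handler using only the path and the method.
    fn lookup(&self, path: &str, method: Method) -> Option<Handler> {
        self.routes.get(&(path.to_string(), method)).copied()
    }
}

// Two placeholder handlers for the example.
fn list_messages() -> &'static str {
    "list"
}

fn create_message() -> &'static str {
    "create"
}
```

Registering list_messages and create_message in either order produces the same lookups, and a second registration for the same path and method is refused.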
At the core of the routing system are several data types:
/// An application, which houses application state and other top-level concerns.
pub struct App<AppData> { .. }
/// Configures routing within an application. Routers can be nested.
pub struct Router<AppData> { .. }
/// Configures the responses for an application for a particular URL match.
pub struct Resource<AppData> { .. }
/// Embeds a typemap for providing hierarchical configuration of extractors, middleware, and more.
pub struct Config { .. }
The app-level and router APIs are straightforward:
impl<AppData> App<AppData> {
    pub fn new(app_data: AppData) -> App<AppData>;

    /// Access the top-level router for the app.
    pub fn router(&mut self) -> &mut Router<AppData>;

    /// Access the top-level configuration.
    pub fn config(&mut self) -> &mut Config;

    /// Convenience API to add routes directly at the top level.
    pub fn at(&mut self, path: &str) -> &mut Resource<AppData>;

    /// Start up a server instance.
    pub fn serve(&mut self);
}

impl<AppData> Router<AppData> {
    /// Configure the router.
    pub fn config(&mut self) -> &mut Config;

    /// Add a route.
    pub fn at(&mut self, path: &str) -> &mut Resource<AppData>;
}
The syntax for routes is very simple: URLs with zero or more {} segments, possibly ending in a * segment (for matching an arbitrary “rest” of the URL). The {} segments hook into the Path extractor; each Path<T> argument in an endpoint extracts one such segment, in order.
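One way to picture the “in order” rule: the router hands the endpoint machinery an iterator over the matched segments, and each Path<T> argument consumes the next one and parses it. The helper next_path below is hypothetical, and using FromStr for parsing is an assumption rather than something the post specifies.

```rust
use std::str::FromStr;

struct Path<T>(pub T);

/// Take the next matched `{}` segment and parse it as `T`.
/// Returns `None` if no segments are left or parsing fails.
fn next_path<'a, T, I>(captures: &mut I) -> Option<Path<T>>
where
    T: FromStr,
    I: Iterator<Item = &'a str>,
{
    captures.next()?.parse::<T>().ok().map(Path)
}
```

For a route such as /users/{}/posts/{} and an endpoint taking (Path<String>, Path<u64>), the first call would yield the user segment and the second the post number.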
To provide maximum clarity, the router only allows two routes to overlap if one route is more specific than the other; the most specific route is preferred. So, for example, the following routes can all coexist:
"/users/{}"
"/users/{}/help"
"/users/new"
"/users/new/help"
and a request at /users/new or /users/new/help will use the last two routes, respectively.
More generally: routes can share an identical prefix, but at some point must either have segments that don’t overlap (e.g. two different fixed strings), or exactly one of the routes must have a {} or * segment where the other has a fixed string.
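The specificity rule can be sketched by ranking each matched segment (fixed string over {} over *) and picking the route whose ranks compare highest, segment by segment. This is only an illustration under that assumed ordering: Rank, ranks, and most_specific are hypothetical names, and a real router would more likely use a trie than a linear scan.

```rust
/// How specific a matched segment is, from least to most specific.
#[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Debug)]
enum Rank {
    Glob,     // `*`: the rest of the URL
    Wildcard, // `{}`: any single segment
    Fixed,    // a literal string
}

/// If `pattern` matches `path`, return the rank of each matched segment.
fn ranks(pattern: &str, path: &str) -> Option<Vec<Rank>> {
    let mut segments = path.trim_matches('/').split('/');
    let mut out = Vec::new();
    for p in pattern.trim_matches('/').split('/') {
        match (p, segments.next()) {
            // `*` swallows everything that is left.
            ("*", Some(_)) => {
                out.push(Rank::Glob);
                return Some(out);
            }
            ("{}", Some(s)) if !s.is_empty() => out.push(Rank::Wildcard),
            (p, Some(s)) if p == s => out.push(Rank::Fixed),
            _ => return None,
        }
    }
    // Leftover path segments mean the pattern was too short.
    if segments.next().is_none() { Some(out) } else { None }
}

/// Pick the matching route that is most specific, comparing ranks
/// segment by segment (lexicographically).
fn most_specific<'a>(routes: &[&'a str], path: &str) -> Option<&'a str> {
    routes
        .iter()
        .copied()
        .filter_map(|route| ranks(route, path).map(|r| (r, route)))
        .max_by(|a, b| a.0.cmp(&b.0))
        .map(|(_, route)| route)
}
```

Given the four routes listed above, a request at /users/new selects "/users/new" and a request at /users/42 selects "/users/{}", regardless of the order in which the routes were supplied.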
Once a route has been given, you get a handle to a Resource, which allows mounting endpoints or working with a router nested at that URL:
impl<AppData> Resource<AppData> {
    pub fn get<T>(&mut self, endpoint: impl Endpoint<AppData, T>) -> &mut Config;
    pub fn put<T>(&mut self, endpoint: impl Endpoint<AppData, T>) -> &mut Config;
    pub fn post<T>(&mut self, endpoint: impl Endpoint<AppData, T>) -> &mut Config;
    pub fn delete<T>(&mut self, endpoint: impl Endpoint<AppData, T>) -> &mut Config;
    pub fn nest(&mut self, builder: impl FnOnce(&mut Router<AppData>));
    pub fn config(&mut self) -> &mut Config;
}
If there’s a mismatch between the number of {} or * segments and the corresponding Path and Glob extractors in an endpoint, the resource builder API will panic on endpoint registration. Hence, such mismatches are trivially caught before a server even runs.
Most of these methods return a handle to a Config, which makes it possible to tweak the configuration at a route or endpoint level. This post won’t go into detail on the configuration API, but the idea is that configuration, like middleware, applies in a hierarchical fashion along the routing table of contents. So, app-level configuration provides a global default, which can then be adjusted at each step along the way down a route (or even parts of a route) and an endpoint.
Endpoints
A route terminates at an endpoint, which is an asynchronous Request to Response function:
pub trait Endpoint<AppData, Kind> {
    type Fut: Future<Output = Response> + Send + 'static;
    fn call(&self, state: AppData, req: Request, config: &Config) -> Self::Fut;
}
The Request and Response types here are wrappers around those from the http crate; we won’t get into that part of the API in too much detail here, since most end-users won’t ever work directly with these types.
The endpoint is given a handle to the application state and a reference to the configuration, in addition to ownership of the Request.
Note that the Endpoint trait has a Kind parameter which is not used in the body of the trait. This additional parameter is what makes it possible to overload the mounting APIs to work with a variety of function signatures. In particular, here’s a fragment of some of the provided implementations:
/// A marker struct to avoid overlap
struct Ty<T>(T);

// An endpoint implementation for *zero* extractors.
impl<T, AppData, Fut> Endpoint<AppData, Ty<Fut>> for T
where
    T: Fn() -> Fut,
    Fut: Future,
    Fut::Output: IntoResponse,
// ...

// An endpoint implementation for *one* extractor.
impl<T, AppData, Fut, T0> Endpoint<AppData, (Ty<T0>, Ty<Fut>)> for T
where
    T: Fn(T0) -> Fut,
    T0: Extractor<AppData>,
    Fut: Future,
    Fut::Output: IntoResponse,
// ...

// An endpoint implementation for *two* extractors.
impl<T, AppData, Fut, T0, T1> Endpoint<AppData, (Ty<T0>, Ty<T1>, Ty<Fut>)> for T
where
    T: Fn(T0, T1) -> Fut,
    T0: Extractor<AppData>,
    T1: Extractor<AppData>,
    Fut: Future,
    Fut::Output: IntoResponse,
// ...

// and so on...
Putting this all together, from the user’s perspective an “endpoint” is any async function where each argument type implements Extractor and where the return type implements IntoResponse. This is how we provide an endpoint experience comparable to the one in Rocket (which has a similar setup), without using macros or code generation.
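The marker-parameter trick is easier to see in a stripped-down, synchronous form. The sketch below drops AppData, Config, and futures entirely, uses String as the response, and invents a PathLen extractor and a respond helper purely for demonstration; only the shape of the two impls mirrors the fragment above.

```rust
struct Request {
    path: String,
}

trait Extractor: Sized {
    fn extract(req: &Request) -> Option<Self>;
}

trait IntoResponse {
    fn into_response(self) -> String;
}

impl IntoResponse for String {
    fn into_response(self) -> String {
        self
    }
}

impl IntoResponse for &'static str {
    fn into_response(self) -> String {
        self.to_string()
    }
}

/// Marker used only inside `Kind`; it is never constructed.
#[allow(dead_code)]
struct Ty<T>(T);

trait Endpoint<Kind> {
    fn call(&self, req: &Request) -> String;
}

// Zero extractors. `Kind = Ty<R>` keeps this impl distinct from the next.
impl<F, R> Endpoint<Ty<R>> for F
where
    F: Fn() -> R,
    R: IntoResponse,
{
    fn call(&self, _req: &Request) -> String {
        self().into_response()
    }
}

// One extractor. `Kind` is a tuple here, so the two impls cannot overlap.
impl<F, R, T0> Endpoint<(Ty<T0>, Ty<R>)> for F
where
    F: Fn(T0) -> R,
    T0: Extractor,
    R: IntoResponse,
{
    fn call(&self, req: &Request) -> String {
        match T0::extract(req) {
            Some(arg) => self(arg).into_response(),
            None => String::from("bad request"),
        }
    }
}

/// Hypothetical extractor: the length of the request path.
struct PathLen(usize);

impl Extractor for PathLen {
    fn extract(req: &Request) -> Option<Self> {
        Some(PathLen(req.path.len()))
    }
}

/// Stand-in for a mounting API such as `get` or `post`.
fn respond<Kind>(endpoint: impl Endpoint<Kind>, req: &Request) -> String {
    endpoint.call(req)
}

// Two plain functions with different signatures, both usable as endpoints.
fn hello() -> &'static str {
    "hello"
}

fn path_len(len: PathLen) -> String {
    format!("len={}", len.0)
}
```

Both respond(hello, &req) and respond(path_len, &req) type-check: the compiler infers Kind from the function’s signature, so no macro is needed to adapt either one.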
Extraction
Extractors work similarly to many other Rust frameworks. They are asynchronous functions that extract data from app state, configuration, and the request:
pub trait Extractor<AppData>: Sized {
    type Error: IntoResponse;
    type Fut: Future<Output = Result<Self, Self::Error>> + Send + 'static;
    fn extract(state: AppData, config: &Config, req: &mut Request) -> Self::Fut;
}
Note that extractors can fail with an error. Unlike in some frameworks, where this results in rerouting, this API works more like Actix-Web: the error must itself be directly convertible to a response, and no further routing is performed. As with Actix-Web, you can use the Config object to customize parameters of an extractor, including what kind of error it produces.
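Here is a synchronous sketch of that failure path, assuming a simplified Request and Response and leaving out AppData, Config, and futures. The Id extractor, the BadRequest error, the handle helper, and the choice of a 400 status are all hypothetical; the point is only that a failed extraction is turned straight into a response.

```rust
struct Request {
    segment: String,
}

struct Response {
    status: u16,
    body: String,
}

trait IntoResponse {
    fn into_response(self) -> Response;
}

/// Simplified, synchronous version of the extractor trait.
trait Extractor: Sized {
    type Error: IntoResponse;
    fn extract(req: &Request) -> Result<Self, Self::Error>;
}

/// Hypothetical error type that becomes a 400 response.
struct BadRequest(String);

impl IntoResponse for BadRequest {
    fn into_response(self) -> Response {
        Response { status: 400, body: self.0 }
    }
}

/// Hypothetical extractor: a numeric ID taken from the request.
struct Id(u64);

impl Extractor for Id {
    type Error = BadRequest;
    fn extract(req: &Request) -> Result<Self, BadRequest> {
        req.segment
            .parse::<u64>()
            .map(Id)
            .map_err(|_| BadRequest(format!("invalid id: {}", req.segment)))
    }
}

/// Run the extractor; on failure, respond with the error directly.
/// No other route is tried.
fn handle<E: Extractor>(req: &Request, endpoint: impl Fn(E) -> Response) -> Response {
    match E::extract(req) {
        Ok(value) => endpoint(value),
        Err(error) => error.into_response(),
    }
}

/// Example endpoint that echoes the extracted ID.
fn show(id: Id) -> Response {
    Response { status: 200, body: id.0.to_string() }
}
```

With these definitions, a request whose segment is 7 reaches show, while a segment of abc never reaches the endpoint and yields the 400 response built from BadRequest.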
Much like with Actix-Web and other frameworks, we can provide a set of prebuilt extractors:
// These all implement `Extractor`:
pub struct Json<T>(pub T);
pub struct AppState<T>(pub T);
pub struct Path<T>(pub T);
pub struct Glob<T>(pub T);
pub struct Query<T>(pub T);
Unlike with Actix-Web and Gotham, if you want to extract multiple {} matches in a URL, you do so using multiple Path parameters, rather than e.g. using a tuple within Path.
Serialization
Finally, an endpoint must return data that is convertible into a response:
pub trait IntoResponse: Sized {
    type Body: BufStream + Send + 'static;
    fn into_response(self) -> http::Response<Self::Body>;
}
The Body type here is to allow for streaming response bodies. Otherwise, this setup works identically to Rocket.
What’s next?
In general, this API seems to take some of the most appealing aspects of existing Rust frameworks and consolidate them in a fairly streamlined way. While there are a lot of details elided from this post, it hopefully gives enough of a sketch to spark a useful discussion. I’m eager for feedback on basically any aspect of what’s being laid out here.
If feedback is generally positive, there are a couple of next steps that can proceed in parallel:
- While I’ve prototyped the type system aspects of these APIs, I don’t have a working implementation yet. If we want to go forward in something like this direction, I’d love to collaborate with folks to get it working!
- In addition to the APIs spelled out here, I’ve also put thought into middleware and configuration APIs. So, again assuming we want to head in this direction, the next post will spell out those ideas.
Looking forward to hearing what you think!