Rust, the typestate pattern and Kubernetes operators: a match made in heaven
Handling CRDs with the typestate pattern is COOL
Not that long ago I had to code a Kubernetes operator for work-related matters and I wanted to make it in the most flexible, yet simple (to read, at least), way possible.
The fun thing about Rust is that there are a LOT of very interesting constructs you don't normally encounter when writing certain kinds of software, especially when you're not writing libraries: one of these is PhantomData.
Citing the documentation:
Zero-sized type used to mark things that “act like” they own a T.
Adding a PhantomData<T> field to your type tells the compiler that your type acts as though it stores a value of type T, even though it doesn’t really. This information is used when computing certain safety properties.
This is really cool and perfectly solved the problem I had.
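To make the quoted description a bit more concrete, here is a tiny standalone sketch that has nothing to do with Kubernetes yet. The `Tagged`, `Meters` and `Seconds` names are made up purely for illustration: the point is only that the marker type lives in the type system and costs nothing at runtime.

```rust
use std::marker::PhantomData;
use std::mem::size_of;

// Hypothetical marker types: they are never instantiated.
struct Meters;
struct Seconds;

// `Tagged<U>` only stores an f64; `U` exists purely at the type level.
struct Tagged<U> {
    value: f64,
    _unit: PhantomData<U>,
}

impl<U> Tagged<U> {
    fn new(value: f64) -> Self {
        Self { value, _unit: PhantomData }
    }
}

// Only values carrying the same marker can be added together:
// mixing Tagged<Meters> and Tagged<Seconds> is a compile error.
fn add<U>(a: Tagged<U>, b: Tagged<U>) -> Tagged<U> {
    Tagged::new(a.value + b.value)
}

// PhantomData is zero-sized, so the wrapper is exactly as big as its f64.
fn tagged_size_matches_f64() -> bool {
    size_of::<Tagged<Meters>>() == size_of::<f64>()
        && size_of::<PhantomData<Seconds>>() == 0
}
```

Calling `add(Tagged::<Meters>::new(1.0), Tagged::<Seconds>::new(2.0))` would be rejected by the compiler, with no runtime check involved.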
Pursuing a nice API
I wanted my operator to have a straightforward way to handle resources; let's say, for example, that we needed to handle Secrets.
The "official" (abbreviated) way to handle resources (via kube-rs, the library that implements the operator control loop) is the following:
// example Secret creation
#[tokio::main]
async fn main() {
    let mut secret = Secret {
        metadata: ObjectMeta {
            name: Some(String::from("your-secret-name")),
            namespace: Some(String::from("the-namespace")),
            owner_references: /* other stuff unrelated to this post */,
            ..ObjectMeta::default()
        },
        ..Default::default()
    };
    // `data` maps each key to a ByteString (a newtype around Vec<u8>)
    secret.data = Some(BTreeMap::from([
        (String::from("key-1"), ByteString(b"value-1".to_vec())),
        (String::from("key-2"), ByteString(b"value-2".to_vec())),
        // ...
    ]));
}
Once you're done creating your Secret, you can feed it to the API server:
// `client` is a kube::Client instance, the client that talks to the apiserver
let api: Api<Secret> = Api::namespaced(client, "the-namespace");
// other stuff
api.patch("your-secret-name", /* other unrelated stuff */, &Patch::Apply(&secret)).await?;
This effectively applies the secret inside Kubernetes, and everyone is happy.
Since everything is very cumbersome, we can create a function that creates secrets for us:
fn create_secret(name: &str, namespace: &str, owner_references: /* unrelated */) -> Secret {
Secret {
metadata: ObjectMeta {
name: Some(name.into()),
namespace: Some(namespace.into()),
owner_references: /* other stuff unrelated to this post */,
..ObjectMeta::default()
},
..Default::default()
}
}
... and you can figure out the function that applies secrets on your own.
The problem with this approach is obvious (and easily solvable with generics): I have to create a dedicated function for every resource I want to handle in my operator.
The API I had in mind, though, wasn't just generic: I also wanted it to be easily extensible. I didn't want to die every time I had to register a new resource.
I wanted something like this:
let secret = create::<Secret>(String::from("secret-name"), String::from("secret-namespace"), /* other stuff */);
let config_map = create::<ConfigMap>(...);
let namespace = create_global::<Namespace>(...); // for cluster-wide resources!
secret.add_secret_kv("key", "value"); // and "value" gets automatically encoded in base64
config_map.add_kv("key", "value"); // this is in plaintext
namespace.add_resource_quota(...); // another, completely different, method
Basically, I wanted to be able to define types with known behaviours (Secrets behave differently than Namespaces, of course), but I wanted that information to be available at compile time, without any kind of runtime type checking or other terrible stuff.
Other than that, I wanted all my types to be instantiated through a common interface, the create or create_global functions (that's possible, since all Kubernetes resources share, at least, a name and a namespace - if namespaced - and the "other stuff" you see in the create function signature). The thing I was wrapping my head around was: how can I make these functions instantiate a different type every time I pass them a different resource? In other words, how can I make create::<Secret>() return a Secret and create::<ConfigMap>() return a ConfigMap without having to do any kind of runtime type checking?
This is exactly the problem that the typestate pattern solves, and that's where PhantomData comes into play.
Making a nice API
Designing the create function is actually really simple in Rust because, through PhantomData, I can tell the compiler which type the function is going to return:
pub fn create<R>(
    name: &str,
    namespace: &str,
    // other stuff
) -> R
where
    R: NamespacedResource, // we'll talk about this in a bit
{
    let resource_generator = ResourceGenerator::<R>::new(/* other stuff */);
    // no trailing semicolon: this expression is the function's return value
    resource_generator.create_resource(name, namespace, /* some other stuff */)
}
pub(crate) struct ResourceGenerator<R> {
    _resource: PhantomData<R>, // here's the magic!
}
impl<R> ResourceGenerator<R> {
    fn new(/* stuff */) -> Self {
        Self {
            _resource: PhantomData, // this is the **concrete** PhantomData, <R> is inferred from the struct definition
            /* stuff */
        }
    }
}
impl<R> ResourceGenerator<R>
where
    R: NamespacedResource,
{
    fn create_resource(
        &self,
        name: &str,
        namespace: &str,
        /* stuff ... */
    ) -> R {
        R::create(name, namespace, /* stuff ... */)
    }
}
When I first discovered this I was AMAZED at how elegant this was.
PhantomData essentially enables us to "take" the generic parameter R inside our struct without actually storing it: this way, we get compile-time type information about the resource we are passing to the ResourceGenerator.
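If you want to play with the idea without pulling in kube-rs, here is a stripped-down sketch of the same shape. `FakeSecret` and `FakeConfigMap` are stand-ins I'm assuming for the example (they are not real Kubernetes types), and the trait is reduced to the bare minimum:

```rust
use std::marker::PhantomData;

// Simplified stand-in for the trait used in this post, with no kube-rs types.
trait NamespacedResource {
    fn create(name: &str, namespace: &str) -> Self;
    fn kind(&self) -> &'static str;
}

// Hypothetical resources, only here to have two different types to create.
struct FakeSecret {
    name: String,
    namespace: String,
}

struct FakeConfigMap {
    name: String,
    namespace: String,
}

impl NamespacedResource for FakeSecret {
    fn create(name: &str, namespace: &str) -> Self {
        Self { name: name.to_owned(), namespace: namespace.to_owned() }
    }
    fn kind(&self) -> &'static str {
        "Secret"
    }
}

impl NamespacedResource for FakeConfigMap {
    fn create(name: &str, namespace: &str) -> Self {
        Self { name: name.to_owned(), namespace: namespace.to_owned() }
    }
    fn kind(&self) -> &'static str {
        "ConfigMap"
    }
}

// The generator never stores an `R`; it only remembers which `R` to build.
struct ResourceGenerator<R> {
    _resource: PhantomData<R>,
}

impl<R: NamespacedResource> ResourceGenerator<R> {
    fn new() -> Self {
        Self { _resource: PhantomData }
    }

    fn create_resource(&self, name: &str, namespace: &str) -> R {
        R::create(name, namespace)
    }
}

// The common entry point: the caller picks `R`, the compiler does the rest.
fn create<R: NamespacedResource>(name: &str, namespace: &str) -> R {
    ResourceGenerator::<R>::new().create_resource(name, namespace)
}
```

With this in place, `create::<FakeSecret>("a", "ns")` and `create::<FakeConfigMap>("b", "ns")` return two different concrete types, and the choice is resolved entirely at compile time.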
Oh, let's not forget about NamespacedResource: I need this trait because I have to abstract the instantiation of the Kubernetes resource, of course, because I can't simply write this:
fn create_resource(...) -> R {
R {
name, namespace, /* ... */
} // this syntax is not allowed in Rust
}
NamespacedResource is NOT, however, the actual Kubernetes resource we are going to create, but rather a wrapper around it (we'll see why in a second):
pub trait NamespacedResource {
// Wrapped is the REAL Kubernetes resource, such as Secrets, ConfigMaps or whatever
type Wrapped: Resource<Scope = NamespaceResourceScope, DynamicType = ()> // Resource is a kube-rs trait, as well as NamespaceResourceScope
+ Clone
+ DeserializeOwned
+ Debug
+ Serialize
+ Sync
+ Send
+ 'static; // these are all traits needed for the kube-rs API. This allows me
// to narrow down the implementation to ONLY Kubernetes resources!
// you can find more information on the "excursus" appendix
// using an associated type is even cooler because this way I can lock the implementations to only
// the types I want.
fn name(&self) -> String;
async fn apply(self, controller_name: &str) -> kube::Result<()>; // this creates the resource inside the api server
fn create(
name: &str,
namespace: &str,
owner_reference: OwnerReference,
client: kube::Client,
) -> Self; // this is the method that is called by the `create_resource` function!
}
Note: the associated type is more clearly explained at the end of the article
This way, I have a very clean API to implement around my Kubernetes resources, for example:
pub(crate) struct KubernetesConfigMap {
    name: String,
    resource: ConfigMap,
    client: kube::Client,
    namespace: String,
}
impl NamespacedResource for KubernetesConfigMap {
    type Wrapped = ConfigMap; // (associated type) this is the REAL type for the Kubernetes API
    fn create(name: &str, namespace: &str, /* other stuff */) -> Self {
        let actual_config_map = ConfigMap {
            metadata: ObjectMeta {
                name: Some(name.to_owned()),
                namespace: Some(namespace.to_owned()),
                // other stuff ...
                ..ObjectMeta::default()
            },
            ..Default::default()
        };
        KubernetesConfigMap {
            resource: actual_config_map,
            client, // part of the "other stuff" passed to `create`
            name: name.to_owned(),
            namespace: namespace.to_owned(),
        }
    }
    async fn apply(self, controller_name: &str) -> kube::Result<()> {
        // you don't need to understand this method, just know that it's always the same
        // for every kubernetes resource
        let api: Api<Self::Wrapped> = Api::namespaced(self.client.clone(), &self.namespace);
        let serverside = PatchParams::apply(controller_name).force();
        api.patch(&self.name, &serverside, &Patch::Apply(&self.resource))
            .await?;
        Ok(()) // yea!
    }
    // implement all the other methods ...
}
impl KubernetesConfigMap {
    pub fn add_data(&mut self, other: BTreeMap<String, String>) {
        if let Some(ref mut data) = self.resource.data {
            data.extend(other);
        } else {
            // `other` is already owned, so no clone is needed
            self.resource.data = Some(other);
        }
    }
}
or ...
pub(crate) struct KubernetesSecret {
name: String,
resource: Secret,
namespace: String,
}
impl NamespacedResource for KubernetesSecret {
    type Wrapped = Secret;
    // implement all the other methods ...
}
impl KubernetesSecret {
/// this method adds data to a kubernetes secret
/// it automatically takes care of encoding the strings you pass using base64
pub fn add_data(&mut self, other: BTreeMap<String, String>) {
if let Some(ref mut data) = self.resource.data {
data.extend(other.into_iter().map(|(k, v)| {
(
k,
ByteString(general_purpose::STANDARD.encode(&v).into_bytes()),
)
}));
} else {
self.resource.data = Some(
other
.iter()
.map(|(k, v)| {
(
k.to_string(),
ByteString(general_purpose::STANDARD.encode(&v).into_bytes()),
)
})
.collect(),
);
}
}
}
And that gives us the exact API we were looking for, effectively implementing the typestate pattern!
let mut secret = create::<KubernetesSecret>("my-secret", "my-namespace", /* other... */);
secret.add_data(...);
secret.apply(controller_name).await?; // super clean!
I was still bugged by something... GODDAMN IT
You know it's not perfect if there's still something to remove... And there definitely was something to remove in this design: the implementation for the NamespacedResource trait was ALWAYS THE SAME and that was driving me crazy!!! That was SO BAD!
Luckily, Rust has an amazing macro system, so I could create a #[derive] macro in no time!
#[proc_macro_derive(NamespacedResource)]
pub fn derive_namespaced_wrapped_resource(input: TokenStream) -> TokenStream {
let input = parse_macro_input!(input as DeriveInput);
if let syn::Data::Struct(ref data) = input.data {
if let Fields::Named(ref fields) = data.fields {
// check if all the fields are registered into the struct
if !check_required_fields_namespaced(&data.fields) {
return TokenStream::from(
syn::Error::new(
input.ident.span(),
"make sure you used all the required struct fields: `name`, `resource`, `client`, `namespace`",
)
.to_compile_error(),
);
}
let struct_name = input.ident;
// we need to get Wrapped, which has the same type as the resource field
// and should always be defined, since we already checked for the fields
let wrapped_type = get_type_from_resource_field(&data.fields).unwrap();
return TokenStream::from(
quote!(impl operator_core::prelude::NamespacedResource for #struct_name {
type Wrapped = #wrapped_type;
fn create(
name: &str,
namespace: &str,
owner_reference: OwnerReference,
client: kube::Client,
) -> Self {
let resource = Self::Wrapped {
metadata: ObjectMeta {
name: Some(name.to_owned()),
namespace: Some(namespace.to_owned()),
owner_references: Some(vec![owner_reference]),
..ObjectMeta::default()
},
..Default::default()
};
Self {
name: name.to_string(),
resource,
client,
namespace: namespace.to_string(),
}
}
fn wrapped(&self) -> Self::Wrapped {
self.resource.clone()
}
fn name(&self) -> String {
self.name.clone()
}
async fn apply(self, controller_name: &str) -> kube::Result<()> {
let api: Api<Self::Wrapped> = Api::namespaced(self.client.clone(), &self.namespace);
let serverside = PatchParams::apply(controller_name).force();
api.patch(&self.name, &serverside, &Patch::Apply(&self.resource))
.await?;
Ok(())
}
}),
);
}
}
TokenStream::from(
syn::Error::new(
input.ident.span(),
"only named structs can implement `NamespacedResource`",
)
.to_compile_error(),
)
}
fn get_type_from_resource_field(fields: &Fields) -> Option<&syn::Type> {
for field in fields {
if let Some(ident) = &field.ident {
if ident == "resource" {
return Some(&field.ty);
}
}
}
None
}
This effectively solved the problem, since the macro automatically generates the code I need for the trait implementation. Everything shrank down to this:
#[derive(NamespacedResource)]
pub(crate) struct KubernetesConfigMap {
name: String,
resource: ConfigMap,
client: Client,
namespace: String,
}
// no more clunky impl NamespacedResource for <whatever>!!!
impl KubernetesConfigMap {
// my custom methods here :)
}
and it felt SO good!
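A proc macro needs its own crate, so it's hard to try in a single file. If you just want to get a feel for the "generate the boring impl for me" idea, here is a rough approximation using a declarative macro_rules! macro instead. To be clear, this is NOT the derive macro shown above: the `namespaced_resource!` macro, the simplified trait and `FakeConfigMap` are all assumptions made for this sketch, and there is no kube-rs involved.

```rust
// Simplified, kube-free stand-in for the trait used in this post.
trait NamespacedResource {
    type Wrapped: Default + Clone;
    fn create(name: &str, namespace: &str) -> Self;
    fn name(&self) -> String;
    fn wrapped(&self) -> Self::Wrapped;
}

// Hypothetical macro: declares the wrapper struct AND its boilerplate impl.
macro_rules! namespaced_resource {
    ($wrapper:ident wraps $wrapped:ty) => {
        struct $wrapper {
            name: String,
            namespace: String,
            resource: $wrapped,
        }

        impl NamespacedResource for $wrapper {
            type Wrapped = $wrapped;

            fn create(name: &str, namespace: &str) -> Self {
                Self {
                    name: name.to_owned(),
                    namespace: namespace.to_owned(),
                    resource: <$wrapped>::default(),
                }
            }

            fn name(&self) -> String {
                self.name.clone()
            }

            fn wrapped(&self) -> Self::Wrapped {
                self.resource.clone()
            }
        }
    };
}

// Stand-in for a real Kubernetes resource.
#[derive(Default, Clone, Debug, PartialEq)]
struct FakeConfigMap {
    data: Vec<(String, String)>,
}

// One line instead of a hand-written impl block.
namespaced_resource!(KubernetesConfigMap wraps FakeConfigMap);

// Custom, resource-specific methods still live in a normal impl block.
impl KubernetesConfigMap {
    fn add_kv(&mut self, key: &str, value: &str) {
        self.resource.data.push((key.to_owned(), value.to_owned()));
    }
}
```

The trade-off compared to the derive version is that the macro owns the struct definition, so you can't add extra fields or attributes as freely, and you don't get the nice custom compile errors a proc macro can emit.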
(excursus) Why an associated type?
If I used a generic type, the implementation would have changed to this:
pub trait NamespacedResource<Wrapped>
where Wrapped: Resource<Scope = NamespaceResourceScope, DynamicType = ()>
+ Clone
+ DeserializeOwned
+ Debug
+ Serialize
+ Sync
+ Send
+ 'static
{
fn wrapped(&self) -> Wrapped;
// ... rest of the stuff is the same
}
This would change all the definitions I made before (I'm not even sure this would work for the rest of the code of the operator, but that's another problem), but more importantly, it would not make any sense, since there's only ONE way to implement and apply a NamespacedResource.
When implementing a resource, then, I would have to write something like this:
pub(crate) struct KubernetesSecret<R> {
name: String,
resource: R,
namespace: String,
}
impl NamespacedResource<Secret> for KubernetesSecret<Secret> {
    // implement all the other methods ...
}
...but this wouldn't make any sense! Is there any other resource a KubernetesSecret could apply? Why do I need to be generic over the resource field?
Nothing can stop me from writing another implementation:
// Deployment is another kubernetes resource
impl NamespacedResource<Deployment> for KubernetesSecret<Deployment> {
    // implement all the other methods ...
}
Nope, that's complete nonsense, but it's a nice way to understand why sometimes you need associated types and why some other times you need generics.
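If you want to see the difference compile (or refuse to compile) on your own machine, here is a minimal sketch. `Secret`, `Deployment` and `KubernetesSecret` are empty stand-in structs here, not the real k8s types, and both traits are reduced to a single method I made up for the example:

```rust
// Empty stand-ins, just to have some types to talk about.
struct Secret;
struct Deployment;
struct KubernetesSecret;

// Generic flavour: `GenericResource<W>` is really a family of traits, one per `W`,
// so the same type can implement it several times.
trait GenericResource<W> {
    fn kind(&self) -> &'static str;
}

impl GenericResource<Secret> for KubernetesSecret {
    fn kind(&self) -> &'static str {
        "Secret"
    }
}

// Nothing stops this second, nonsensical implementation.
impl GenericResource<Deployment> for KubernetesSecret {
    fn kind(&self) -> &'static str {
        "Deployment"
    }
}

// Associated-type flavour: a type can implement the trait only once,
// so `Wrapped` is decided once and for all by the implementor.
trait AssocResource {
    type Wrapped;
    fn kind(&self) -> &'static str;
}

impl AssocResource for KubernetesSecret {
    type Wrapped = Secret;
    fn kind(&self) -> &'static str {
        "Secret"
    }
}
// A second `impl AssocResource for KubernetesSecret` with `type Wrapped = Deployment`
// is rejected by the compiler as a conflicting implementation (E0119).

// With the generic trait, callers have to say WHICH impl they mean.
fn generic_kinds() -> (&'static str, &'static str) {
    let s = KubernetesSecret;
    (
        <KubernetesSecret as GenericResource<Secret>>::kind(&s),
        <KubernetesSecret as GenericResource<Deployment>>::kind(&s),
    )
}

// With the associated type there is exactly one answer.
fn assoc_kind() -> &'static str {
    AssocResource::kind(&KubernetesSecret)
}
```

The generic version happily lets a `KubernetesSecret` pretend to wrap a `Deployment`; the associated-type version makes that a compile error, which is exactly the guarantee I wanted.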
I'm really glad you made it this far, I hope this was somewhat insightful.
Until the next post!