Silverfin Takeaways, KubeCon Valencia 🍊

Leading cloud native pioneers and open-source community members gathered in beautiful and warm Valencia.

The last time Silverfin Cloud Infrastructure engineers were able to join KubeCon was pre-corona (Barcelona, 2019), so it felt good to be back. We met many great people, saw many exciting projects, and networked with our suppliers and service providers.

Takeaways

Crossplane seemed very prevalent at the conference, and that makes sense. With many ops teams building their own internal Heroku/PaaS that engineering teams can self-service, Crossplane can hide a lot of the complexity involved in automating public cloud infrastructure.

The whole concept of ephemeral containers seems to have matured a lot since we last checked, and Aaron Alpar’s talk made us want to revisit how we provide “support” debugging containers for our engineers. Currently, we spin up support deploys through some internal tooling called sfctl (that’s for another blog post), but ephemeral containers are more native to Kubernetes and more powerful.
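To make that concrete, here is a minimal sketch (pod, namespace and container names are made up) of attaching an ephemeral debug container with client-go, which is roughly what kubectl debug does for you:

```go
// A minimal sketch (names are hypothetical) of attaching an ephemeral debug
// container to a running pod with client-go, essentially what kubectl debug does.
package main

import (
	"context"
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Build a client from the local kubeconfig.
	cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)

	ctx := context.Background()
	ns, podName := "default", "web-7d4b9c" // hypothetical target pod

	pod, err := client.CoreV1().Pods(ns).Get(ctx, podName, metav1.GetOptions{})
	if err != nil {
		panic(err)
	}

	// Add a throwaway container that targets the "app" container but brings
	// its own, tool-rich image.
	pod.Spec.EphemeralContainers = append(pod.Spec.EphemeralContainers, corev1.EphemeralContainer{
		EphemeralContainerCommon: corev1.EphemeralContainerCommon{
			Name:  "debugger",
			Image: "busybox:1.36",
			Stdin: true,
			TTY:   true,
		},
		TargetContainerName: "app",
	})

	// Ephemeral containers live behind their own subresource; a plain Update won't work.
	if _, err := client.CoreV1().Pods(ns).UpdateEphemeralContainers(ctx, podName, pod, metav1.UpdateOptions{}); err != nil {
		panic(err)
	}
	fmt.Println("debug container attached; kubectl attach/exec into it to poke around")
}
```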

How Lombard Odier Deployed VPA to Increase Resource Usage Efficiency, by Vincent Sevel, made us want to take another look at Vertical Pod Autoscalers (VPA). Silverfin currently scales for performance using Horizontal Pod Autoscalers (HPA), but we also want to scale for cost optimization using VPAs. Unfortunately, given the current limitation that a VPA shouldn’t manage the same CPU/memory metrics an HPA is already scaling on, we cannot use it in conjunction with our current HPAs for the time being.
VPAs did segue us into looking at the Multidimensional Pod Autoscaler (MPA), which we’ve immediately put to the test for specific batch workloads in one of our QA clusters so we can learn all about it and get familiar. 🤞

FinOps (a portmanteau of “Finance” and “DevOps”) was also more prevalent. While we didn’t attend anything FinOps-specific, it’s become clear that it’s an ever-evolving practice that deserves attention.

Mercedes’s open-source manifesto is a thing.
The company went from a corporate, closed-box environment to an open-source-first company.
Besides being a massive benefit for the company and its employees, it’s also great marketing that earned them a keynote slot. But it was not just words: their manifesto can be found on a dedicated page, and their work on their GitHub account.

Having a suitable service mesh has been on our minds for a while. It brings complexity but adds a bunch of excellent features.
Charles Pretzer demonstrated one of these features during the ‘Multi-cluster failover with linkerd’ talk: individual services were reachable across multiple stacks and could neatly fail over between them. Pretty neat stuff.
We do not have any plans to add this to our current stack, but when we do, we first have to compare more of the meshing options out there. A Devoteam rep told us we should also consider Istio, as Google provides some support for it.

We also refreshed our knowledge of the available autoscalers and metric types during the ‘autoscaling Kubernetes deployments’ talk.
An interesting takeaway was that we should also consider network throughput and downstream dependencies as scaling signals. All of this was presented with the help of metrics from Pixie, which looks terrific.
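As a thought experiment rather than anything we run today, scaling on network throughput could look roughly like the sketch below: an autoscaling/v2 HPA targeting a hypothetical per-pod metric, assuming a metrics adapter (e.g. the Prometheus adapter) exposes it. The deployment name, metric name and thresholds are all invented.

```go
// Sketch: an HPA that scales on a custom per-pod network metric instead of CPU.
// Assumes a metrics adapter exposes "network_transmit_bytes_per_second" per pod.
package main

import (
	"context"
	"fmt"

	autoscalingv2 "k8s.io/api/autoscaling/v2"
	"k8s.io/apimachinery/pkg/api/resource"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)

	minReplicas := int32(2)
	hpa := &autoscalingv2.HorizontalPodAutoscaler{
		ObjectMeta: metav1.ObjectMeta{Name: "web-network-hpa", Namespace: "default"},
		Spec: autoscalingv2.HorizontalPodAutoscalerSpec{
			ScaleTargetRef: autoscalingv2.CrossVersionObjectReference{
				APIVersion: "apps/v1", Kind: "Deployment", Name: "web",
			},
			MinReplicas: &minReplicas,
			MaxReplicas: 20,
			Metrics: []autoscalingv2.MetricSpec{{
				Type: autoscalingv2.PodsMetricSourceType,
				Pods: &autoscalingv2.PodsMetricSource{
					Metric: autoscalingv2.MetricIdentifier{Name: "network_transmit_bytes_per_second"},
					Target: autoscalingv2.MetricTarget{
						Type: autoscalingv2.AverageValueMetricType,
						// Scale out once pods average roughly 10 MiB/s transmitted.
						AverageValue: resource.NewQuantity(10*1024*1024, resource.BinarySI),
					},
				},
			}},
		},
	}

	created, err := client.AutoscalingV2().HorizontalPodAutoscalers("default").
		Create(context.Background(), hpa, metav1.CreateOptions{})
	if err != nil {
		panic(err)
	}
	fmt.Println("created HPA:", created.Name)
}
```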

‘Building a nodeless Kubernetes platform’ (GKE Autopilot) was an insightful talk on how Autopilot came into existence and which decisions were made along the way.
The point is to provide a managed Kubernetes environment without having to think much about the underlying infrastructure: Google abstracts away node types, node pools, and part of the affinity groups.
You specify the required resources (and, if needed, a compute class) right in the pod itself, and Autopilot will provision the proper infrastructure for it.
The most intriguing point is that pricing is no longer based on the nodes you’re renting (as you no longer have to think about those) but on your pods themselves: the vCPU and memory they request, billed per second.
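A hedged sketch of what that looks like in practice (names, image and numbers are made up): the pod’s own resource requests are what Autopilot provisions for and bills on, and the compute-class node selector is, as we understand it, the optional GKE mechanism for steering towards a machine family instead of picking machine types.

```go
// Sketch: on Autopilot you size (and pay for) the pod's own resource requests.
// Names, image, and the numbers below are hypothetical.
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"sigs.k8s.io/yaml"
)

func main() {
	pod := corev1.Pod{
		TypeMeta:   metav1.TypeMeta{APIVersion: "v1", Kind: "Pod"},
		ObjectMeta: metav1.ObjectMeta{Name: "report-worker", Namespace: "default"},
		Spec: corev1.PodSpec{
			// Optional: ask Autopilot for a compute class instead of a machine type.
			NodeSelector: map[string]string{"cloud.google.com/compute-class": "Balanced"},
			Containers: []corev1.Container{{
				Name:  "worker",
				Image: "example.registry.dev/report-worker:latest",
				Resources: corev1.ResourceRequirements{
					// Autopilot provisions nodes for, and bills per second on, these requests.
					Requests: corev1.ResourceList{
						corev1.ResourceCPU:    resource.MustParse("500m"),
						corev1.ResourceMemory: resource.MustParse("1Gi"),
					},
				},
			}},
		},
	}

	// Render the manifest you would apply to an Autopilot cluster.
	out, err := yaml.Marshal(pod)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```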

Next up, Matthew LeRay and Omid Azizi took things to another level with ‘Reproducing Production issues in CI using eBPF.’
eBPF is a powerful tool that allows extending the Linux kernel without loading modules or changing the kernel itself. Great stuff that can quickly become quite complex. Be sure to check it out if you like things like syscalls!
In this presentation, they used eBPF to log syscalls and fed that data into Pixie.
They then used the data inside Pixie to generate curl commands that replay the exact same requests, and compared the results.
Basic stuff for now, but a cool idea to keep in mind. The demo was recorded (without audio).
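The eBPF and Pixie plumbing is theirs, but the replay-and-compare step at the end boils down to something like the sketch below, with the captured request, URLs and status codes entirely invented:

```go
// Sketch of the replay-and-compare idea: take a request observed in production
// (here hard-coded; in the talk it came out of Pixie's syscall data), re-issue it
// against a CI environment, and compare status codes. Everything below is made up.
package main

import (
	"fmt"
	"io"
	"net/http"
	"strings"
)

// capturedRequest mirrors the fields you can recover from traced traffic.
type capturedRequest struct {
	Method, Path, Body string
	Headers            map[string]string
	ProdStatus         int // status observed in production
}

func replay(baseURL string, c capturedRequest) (int, error) {
	req, err := http.NewRequest(c.Method, baseURL+c.Path, strings.NewReader(c.Body))
	if err != nil {
		return 0, err
	}
	for k, v := range c.Headers {
		req.Header.Set(k, v)
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	io.Copy(io.Discard, resp.Body) // drain so the connection can be reused
	return resp.StatusCode, nil
}

func main() {
	captured := capturedRequest{
		Method:     "POST",
		Path:       "/api/v1/reports",
		Body:       `{"period":"2022-Q1"}`,
		Headers:    map[string]string{"Content-Type": "application/json"},
		ProdStatus: 500,
	}

	ciStatus, err := replay("http://ci.example.internal:8080", captured)
	if err != nil {
		panic(err)
	}
	if ciStatus == captured.ProdStatus {
		fmt.Println("reproduced in CI:", ciStatus)
	} else {
		fmt.Printf("not reproduced: prod=%d ci=%d\n", captured.ProdStatus, ciStatus)
	}
}
```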

Did you ever wonder what happens on the receiving end when you do a docker push? The fantastic people from DigitalOcean have built their own Docker registry (based on ‘distribution’) and were kind enough to share some insights: how layers can reference, or request to overwrite, layers of other images; how objects have no size limits (and cause many timeouts in an HTTP stack); and how a manifest can only become public after all the objects it references are live.
Then they went a bit deeper down the rabbit hole, where one might adjust storage drivers in the code for God knows what reason. A bit of zoning-out later, there was a perfectly comprehensible reason why garbage collection can only work on a read-only registry (hint: a live registry accepts reads and writes at the same time, so a blob that looks unreferenced could gain a new reference while it’s being collected).

Masks or no masks, there is nothing like attending in person. After 2+ years, we had missed that conference vibe. We had a great week, met new people, and, thanks to that vibe, came home with an itch to improve allthethings. Conferences are also great for team building: we’re a distributed team, and this was the first time we all got to see each other’s faces in real life. Thank you Silverfin, thank you CNCF, and see you next year!

Silverfin Cloud Infrastructure Team