WEKA at Cloud Field Day 18

If you haven’t heard of WEKA, you need to take notice. It’s a pleasure to see how the data platform has evolved and also a great win to see Phil Curran and the team presenting what they are working on at Cloud Field Day 18.

WEKA has a neat origin story too, having been born in the cloud at AWS re:Invent 2017. The capabilities are going to be fun to dive into here, as well as how folks are using the platform.

About WEKA Data Platform

WEKA’s data platform runs on the 4 major cloud providers (AWS, Azure, Google Cloud, Oracle Cloud) and also has on-premises options for private and hybrid hosting.

Use cases for a performance-oriented and cost-optimized data platform are likely obvious, with AI and ML workloads being a great fit.

The hybrid option is very interesting, and it led right into the next session and our chance to dive in.

Inside WEKA Architecture

WEKA runs on cloud compute instances in the native cloud platform. The requirement is to use one of a few specific instance types.

Getting started with WEKA is easy: go to start.weka.io and it will spin up your environment.

Deployment is automated, with options to decide how your environment is configured. Terraform is used to spawn the initial environment, and centralized state management inside DynamoDB or Cosmos DB handles health checks. Control of the scaling is user-driven: you define the rules for when to scale.
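To make the user-driven scaling idea a bit more concrete, here is a minimal Python sketch of what a "scale when X happens" rule could look like. Everything in it (the ScalingRule class, the desired_instances function, the thresholds and the instance counts) is made up for illustration. It is not WEKA's API or configuration format, just one way to picture a rule that you define and the platform evaluates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScalingRule:
    """Hypothetical user-defined scaling rule (illustration only, not a WEKA API)."""
    scale_up_above: float    # utilization fraction that triggers adding an instance
    scale_down_below: float  # utilization fraction that triggers removing an instance
    min_instances: int       # never shrink below this count
    max_instances: int       # never grow beyond this count


def desired_instances(rule: ScalingRule, current: int, utilization: float) -> int:
    """Return the instance count the rule asks for, clamped to the rule's bounds."""
    target = current
    if utilization > rule.scale_up_above:
        target = current + 1
    elif utilization < rule.scale_down_below:
        target = current - 1
    return max(rule.min_instances, min(rule.max_instances, target))


# Arbitrary example values, chosen only to exercise the function.
rule = ScalingRule(scale_up_above=0.80, scale_down_below=0.30,
                   min_instances=6, max_instances=12)
print(desired_instances(rule, current=8, utilization=0.90))
```

The point of the sketch is the split of responsibility: you own the thresholds and bounds, and the automation only acts inside them.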

All of the deployment is done in your tenant which makes it a smoother conversation with your security team 🙂

WEKA Tiering to S3

It was interesting to see how S3 comes into play and where WEKA can do things like snapshots to S3 while also co-locating with live data. This has been difficult for a lot of other platforms trying to solve the similar problem of putting databases onto S3 and Glacier object storage.

You can definitely tell the WEKA team is keenly watching the risk factors and is building towards resilience early on. You can do multi-region deployments and there are many other layers of protection in the deployment architecture.

Resiliency is tunable, which is also an advantage.

Hybrid and Multi-Cloud Goodness

The ability to do a snapshot for processing

Scale to zero is also an interesting option, and it helps with migration use cases and portability.

Being able to operate in a hybrid model is a win for me. It’s ideal to have flexibility and portability. I’m seeing a lot of focus on flexibility throughout the presentation and talk track which tells me this is popular.

Deployment Example on AWS

It’s great to get to see under the covers and we were lucky enough to spend some off

Architecturally, WEKA is transparent about its configuration and the reasoning behind it. There is a lot of differentiated IP. The result is a few advantages for both performance and costs.

If we take the same hardware and compare it with and without WEKA, this is the result:

A raw comparison shows how the performance and cost differences come from the way WEKA manages the data platform environment.

Sustainability Advantage of Optimized Data Platforms

The bonus is that the very same consolidation and optimization activities that happen with the WEKA data platform are also driving more efficient utilization. End result: better sustainability.

You can run more workloads, with higher performance, on less hardware. That’s a total win when we see the opportunity to optimize cloud and on-premises usage for less environmental impact. Reducing the power and scaling requirements with platform efficiencies has turned into real reductions in carbon emissions from power and other consumption. Great news!

Kubernetes? Yes, please!

There is a CSI plugin to support presenting storage to Kubernetes right from WEKA. Features below the connector are partially abstracted by the platform, but it also exposes additional capabilities for things like crash-consistent snaps and continuous optimization. There is much more to explore here, and I hope to get a dedicated session with them to dive into the Kubernetes goodies.
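For anyone newer to CSI, the consuming side usually looks like a normal PersistentVolumeClaim that points at a StorageClass backed by the plugin. Here is a small Python sketch that builds such a claim as JSON, which kubectl can apply just like YAML. The PVC fields are standard Kubernetes, but the build_pvc helper, the claim name, the size and especially the "weka-example" StorageClass name are placeholders I made up. The real StorageClass name depends on how the CSI plugin is installed in your cluster.

```python
import json


def build_pvc(name: str, storage_class: str, size: str) -> dict:
    """Build a standard Kubernetes PersistentVolumeClaim manifest as a dict."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            # ReadWriteMany suits a shared filesystem mounted by many pods.
            "accessModes": ["ReadWriteMany"],
            # Placeholder: use whatever StorageClass your CSI install defines.
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": size}},
        },
    }


# Hypothetical names and size, for illustration only.
pvc = build_pvc(name="training-data", storage_class="weka-example", size="100Gi")
print(json.dumps(pvc, indent=2))
```

Saving that output to a file and running kubectl apply -f on it is the usual next step, assuming the StorageClass exists.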

My Thoughts on WEKA

WEKA is worth a deeper look. Data performance is challenging, and the cost of storing data with performance goals is normally high. There is a lot of attention to resiliency, with both enterprise resiliency and features that are going to be very attractive to data enthusiasts.

The AI and ML processing performance and snapshot capabilities are slick, and Kubernetes support is already there with snapshotting, plus future options for things like container-level snapshots.

The core simplicity of presenting performance and cost-optimized storage over standard protocols to multiple clouds is something that needs attention. WEKA solved a problem that has plagued a lot of us.

Check out the full session on LinkedIn and make sure you follow WEKA and visit them to find out if it’s a fit.

DISCLOSURE: My travel expenses were covered by Tech Field Day (GestaltIT) for the event. All analysis and content is my opinion from the presentation, discussion, and independent research.
