
AI Chihuahua! Part II

Build or Buy: All-in-One vs Best-of-Breed

Feb 22, 2021

Ian Hellström

D2iQ


 
Build-or-buy decisions often come down to a choice between an all-in-one platform and a mixture of best-of-breed technologies. With open-source technology, companies can actually get the best of everything. So, why not roll your own platform based on top-notch technologies?
 
The real question is whether enterprises can afford to. Open-source software is free to use, but teams have to invest quite a bit in selecting, introducing, using, and maintaining these technologies. The first step may sound trivial—selecting a stack—but let’s take a look at the cloud-native and AI landscapes to get a better idea of what organizations are up against.
 
 
Finding the handful of relevant technologies for your business can take a while, and even then, do you really know you have the best for your use cases? Do you really have the skills and time to create custom components to wire these up, make them work and scale at your organization, and maintain both the base technologies and the glue code that makes up 95% of a machine learning platform? And can you document all of it without introducing bus factors of one everywhere?
 
The situation is roughly the same in the AI world.
 

Linux Foundation AI Landscape.
 
On the surface, most product alternatives are similar enough to be mistaken for identical. A quick-and-dirty proof of concept (POC) won’t tell you much about how they differ in long-term usage or how hard they are to integrate with other technologies. Running many detailed POCs won’t be possible because of time and budget constraints. After all, you were hired to solve business problems with technology, not to evaluate technologies.
 
If only you did not have to settle for an all-in-one platform that is mediocre at everything and not really great at anything. If only you could have a best-of-breed platform without the need to dig through thousands of pages of not-always-that-great documentation and out-of-date tutorials. 
 
If only...
 
DKP: D2iQ Kubernetes Platform
So far, we have skipped a few topics that are relevant but oft forgotten. Machine learning does not work without stateful data services, for instance for data ingestion, transformation, and storage. Without lifecycle management, version upgrades are an exercise in risk management and often entail significant loss of business due to maintenance windows. Enterprises ideally rely on services that can be upgraded with minimal or no downtime while keeping their data safe and secure.
 
Most enterprises operate development, staging, and production environments. That implies three separate clusters. You may even want to split the production environment: production training and tuning vs production serving, as each sports different types of workloads with potentially differing hardware requirements and SLAs. That’s already a total of four clusters to manage, just for machine learning. 
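
To make the cluster count concrete, here is a minimal Python sketch of that inventory. The class, the field names, and the accelerator flags are hypothetical and only serve as an illustration; they are not part of any D2iQ product or API.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Cluster:
    """One cluster in a hypothetical ML environment inventory."""
    name: str
    workload: str
    needs_accelerators: bool  # assumption for illustration only


def plan_clusters(split_production: bool) -> List[Cluster]:
    """List the clusters to manage, optionally splitting production."""
    clusters = [
        Cluster("development", "experimentation", False),
        Cluster("staging", "pre-release validation", False),
    ]
    if split_production:
        # Training/tuning and serving have different workload profiles,
        # so each gets its own cluster.
        clusters.append(Cluster("production-training", "training and tuning", True))
        clusters.append(Cluster("production-serving", "model serving", False))
    else:
        clusters.append(Cluster("production", "training, tuning, and serving", True))
    return clusters


if __name__ == "__main__":
    for cluster in plan_clusters(split_production=True):
        print(f"{cluster.name}: {cluster.workload}")
```

Every entry in that list needs its own upgrades, access controls, and monitoring, which is where the operational overhead adds up.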
 
Now you need more software, more glue code, and more documentation. On top of that, the team needs to be trained in all of these technologies, which is easy to overlook.
 
DKP, D2iQ’s Kubernetes Platform, offers all of these components: best-of-breed solutions with day-two features out of the box, such as:
 
  • Kaptain (formerly: KUDO for Kubeflow), an opinionated end-to-end machine learning platform
  • Lifecycle support for all services, including stateful ones, thanks to KUDO with supported open-source operators for Kafka, Cassandra, and Spark
  • CI/CD with Dispatch
  • Enterprise-grade security and observability baked into Konvoy, our Kubernetes platform, with Kommander to deal with cluster sprawl and governance
     

DKP: the leading independent suite of cloud-native technologies and services to succeed at machine learning. It comes with all functionality for day-two operations, such as authentication, security, cost management, and observability. No need to piece everything together by pretending to be a Roomba with faulty navigation.
 
All our products are tested for resilience and with mixed workloads to simulate realistic environments. That way, our customers can rest assured everything will work, scale up, and not fail on the second step of the installation or when running a tutorial notebook in a real environment. With our suite, we promise compatibility with the open-source editions while giving you the best of what open source and cloud-native technologies have to offer. 
 
Talk to us if you want your machine learning initiatives to succeed anywhere and every time.
