Data is growing at a rate faster than ever before. Every day, 2.5 quintillion bytes of data are created. But how are enterprises taking advantage of this data?
Over the past two to three years, companies have started transitioning from big data, where analytics are processed after the fact in batch mode, to fast data, where analysis runs against data streaming in real time to provide immediate insights. Fast data allows companies to create new business opportunities and serve their customers in new ways, and has become the core of powerful business applications. According to a recent OpsClarity survey [1], over 92% of companies plan to increase their investment in streaming data in the next year, indicating a growing shift toward using data processing to serve customer-facing applications.
It appears enterprises are finally realizing business value from data. But the learning curve is still steep.
Why We Need Data Services Automation
Businesses looking to develop new services (from personalization to Internet of Things) will often need to build their solution using a combination of data services. Kafka and Spark are common examples. However, the distributed nature of these data services can make them very difficult to deploy and operate. These challenges fall into three areas.
First, installing a production-grade platform service such as Kafka or Cassandra requires specialized knowledge from operators; even for an expert, deployment is time-consuming and often requires significant engineering effort. Second, ongoing operation of these technologies is complex and risky; common tasks such as upgrading software, deploying updates, rolling back in failure scenarios, monitoring health, and managing storage resources are often manual and error-prone. Third, maintaining enough infrastructure to handle data-processing peaks gets expensive. Average datacenter utilization continues to hover around 6-12%, driven by companies' desire to maintain high service quality through peak load periods.
The consequences are severe and include:
- Longer time to market for new applications
- Low productivity of developers and operators
- Increased risk of downtime
- Diminished ability for data scientists and developers to experiment with new technologies
- High on-premises or cloud infrastructure costs
Mesosphere DC/OS: Production-Proven Infrastructure for Fast Data
Mesosphere DC/OS is the only production-proven platform that runs both containers and data services on the same infrastructure. DC/OS provides one-click installation of data services such as databases, message queues, and analytics engines, on par with cloud providers such as Amazon Web Services. These services are simple to deploy, operate, and scale, and they run on shared infrastructure, dramatically increasing utilization. This is possible because DC/OS uses two-level scheduling: the platform offers available resources to per-service frameworks, and each framework decides how to use them, which allows application-aware lifecycle management. This is a key differentiator versus simply running services in containers on container orchestration platforms such as Kubernetes and Docker Swarm. For example, Mesosphere DC/OS enables:
- Single-command install of cloud native data services such as Spark, Cassandra, HDFS, Kafka and Elasticsearch, among many others. DC/OS also dramatically simplifies resizing instances of a data service, as well as adding more instances.
- Reduced time and effort involved with operating cloud native data services through simple runtime software upgrades and updates, application-level monitoring and metrics, and managed persistent storage volumes.
- Dramatically increased utilization, enabled by multiple data services, containerized applications and traditional applications all running on the same infrastructure.
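To make the idea of two-level scheduling more concrete, here is a minimal toy sketch in Python. It is not DC/OS or Mesos code: the class names, the `one_per_agent` rule, and the resource numbers are all hypothetical, and a real cluster uses a far more sophisticated allocator and offer protocol. The sketch only illustrates the division of labor: a first level offers each machine's free resources, and a second level (one framework per service) applies its own application-aware rules when deciding whether to accept.

```python
from dataclasses import dataclass, field


@dataclass
class Offer:
    """Free resources on one agent (machine), as presented to a framework."""
    agent: str
    cpus: float
    mem_mb: int


@dataclass
class Framework:
    """Second level: a per-service scheduler with its own placement rules."""
    name: str
    task_cpus: float
    task_mem_mb: int
    tasks_wanted: int
    one_per_agent: bool = False  # hypothetical application-aware rule
    placements: list = field(default_factory=list)

    def consider(self, offer: Offer) -> bool:
        """Accept the offer only if a task is still needed, allowed, and fits."""
        if len(self.placements) >= self.tasks_wanted:
            return False
        if self.one_per_agent and offer.agent in self.placements:
            return False
        if offer.cpus < self.task_cpus or offer.mem_mb < self.task_mem_mb:
            return False
        self.placements.append(offer.agent)
        return True


def allocate(agents: dict, frameworks: list) -> None:
    """First level: keep offering each agent's free resources to every
    framework until a full round passes with no offer accepted."""
    progress = True
    while progress:
        progress = False
        for agent, free in agents.items():
            for fw in frameworks:
                offer = Offer(agent, free["cpus"], free["mem_mb"])
                if fw.consider(offer):
                    # Deduct what the accepted task consumes on this agent.
                    free["cpus"] -= fw.task_cpus
                    free["mem_mb"] -= fw.task_mem_mb
                    progress = True


# Two agents shared by two services (all numbers are made up).
agents = {
    "agent-1": {"cpus": 4.0, "mem_mb": 8192},
    "agent-2": {"cpus": 4.0, "mem_mb": 8192},
}
kafka = Framework("kafka", task_cpus=2.0, task_mem_mb=4096,
                  tasks_wanted=2, one_per_agent=True)
spark = Framework("spark", task_cpus=1.0, task_mem_mb=1024, tasks_wanted=3)

allocate(agents, [kafka, spark])
print(kafka.placements)  # brokers spread across distinct agents
print(spark.placements)  # executors packed into the remaining capacity
```

In this toy run, the hypothetical Kafka framework spreads its brokers across agents because it knows that is what the application needs, while the Spark framework fills the leftover capacity on the same machines. That combination of service-specific placement and shared hardware is the property the bullets above describe.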
One-Click Data Service Installation with Mesosphere DC/OS
Working with Our Partners to Bring New Data Services to DC/OS
An operating system is only as powerful as the applications that run on it. To that end, we've been incredibly focused on working with our partners to bring new data services into the DC/OS ecosystem. More than a year ago, we integrated the popular open source tools Spark, Kafka, Cassandra, and Akka (colloquially known as the "SMACK stack"), and in August announced partnerships with the top companies behind these technologies, including Confluent, DataStax, and Lightbend.
But we didn't stop there. DC/OS 1.9 includes production-quality integrations with seven additional data services, including both open source and partner-supported technologies. With these partners, we are bringing incredible value to our customers, who can now install these services datacenter-wide with a single click, operate them with no downtime, and run them elastically across shared infrastructure.
Mesosphere DC/OS Universe: One-Click Installation of Over 100 Platform Services
Interested in learning more? You are invited to our data services webinar series taking place over the next several weeks. We will spend an hour with each partner to dive deeper into their technology, use cases, and demos!
- Alluxio: April 4
- Lightbend: April 11
- DataStax: April 18
- dataArtisans: April 25
- Couchbase: May 2
- Redis Labs: May 9
- Confluent: May 23
DataStax is the provider of DataStax Enterprise (DSE), the always-on data platform for cloud applications powered by the industry's best distribution of Apache Cassandra™. DSE lets you focus on what matters most to you and makes it easy to distribute your data across datacenters or cloud regions, making your applications always-on, ready to scale, and able to create instant insight and experiences. Your applications are ready for anything – be it enormous growth, handling mixed workloads, or enduring catastrophic failure. With DSE's unique, fully distributed, masterless architecture, your application scales reliably and effortlessly.
Approximately six months ago, we worked with DataStax to bring DataStax Enterprise to DC/OS. Today, we are announcing expanded support for DataStax Enterprise which includes integrated analytics, search, and graph capabilities as well as administration and monitoring, developer tooling, and more. Our customers have been clamoring for this joint offering, and we are excited to make it available.
"DataStax provides data management for cloud applications. Since we announced our partnership with Mesosphere six months ago, we've seen accelerated interest among our customer base," said Kathryn Erickson, Director of Strategic Partnerships at DataStax. "Deploying DataStax Enterprise, the always-on data platform, on DC/OS helps customers get the most out of their infrastructure, and critically, it makes their data management highly portable across clouds or data centers. With the most recent integration of DataStax Enterprise, we bring a unified data platform of search, analytics, graph, and monitoring capabilities into the DC/OS ecosystem. The partnership provides our customers an always-on, scalable solution that can deliver instantly actionable insight for their applications."