Apache Mesos uses OS containers such as Docker and Linux cgroups to provide task and resource isolation. While this approach works nicely for local resources like CPU and memory, it does not provide a mechanism for managing resources across networks of containers. This is why Mesos now supports a distinct IP address for each container in a cluster (also known as IP per container), a feature first introduced in Mesos 0.23.0.
Without IP per container, containers share the host's IP address and therefore have to share the host's ports. Applications are assigned non-standard ports to avoid port conflicts, which prevents them from listening on well-known ports. This hinders service discovery and makes it harder for other applications to reach the containerized application.
The lack of network isolation also creates a security concern related to multi-tenancy when a Mesos cluster is shared among different classes of applications. For example, if a financial firm is running both risk-analysis simulations and customer-facing applications on a single Mesos cluster, there is no easy way to prevent an application from accessing sensitive information, whether maliciously or by accident. Another risk is that a poorly performing application could saturate the network, starving a mission-critical application running on the same node.
Finally, each organization has different network needs. There is no one size that fits all when it comes to networking.
To address these problems, the Mesos community has enhanced Mesos to support IP per container for the native Mesos Containerizer (support for Docker containers is planned for the near future). This pluggable design also lets third-party network-isolation providers such as Calico, WeaveWorks and others supply a plugin for your Mesos cluster.
IP per container, explained
One of the design goals for IP per container in Mesos was a pluggable architecture that lets users choose among existing third-party networking vendors for their networking solution. Five key components make this work:
- The framework/scheduler tags tasks to indicate the IP requirements for the to-be-launched container. This is an opt-in service allowing existing frameworks to work without any side effects.
- A Mesos cluster comprised of a Mesos master and a Mesos agent.
- A third-party IP Address Management (IPAM) server assigns IP addresses on demand and recycles them once they are no longer in use.
- A (third-party) network isolation provider is responsible for isolating containers and allows operators to configure reachability and routes.
- A network isolation module, which is a lightweight Mesos module loaded into the agent, looks at the task requirements set by the scheduler and uses the IPAM and network isolator services to provide IP addresses to the container. It then forwards the IP addresses to the master as well as the framework.
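To make the division of labor among these components concrete, here is a minimal sketch in Python. It is a toy simulation, not Mesos code: `SimpleIpam`, `Task`, `NetworkIsolationModule`, and the `wants_ip` flag are all hypothetical names invented for illustration, and the step where a real isolation provider would configure routes is only marked with a comment.

```python
import ipaddress
from dataclasses import dataclass


class SimpleIpam:
    """Hypothetical in-memory IPAM: assigns addresses on demand, recycles them."""

    def __init__(self, cidr):
        self._free = [str(ip) for ip in ipaddress.ip_network(cidr).hosts()]
        self._in_use = {}  # container name -> assigned IP

    def allocate(self, container):
        if not self._free:
            raise RuntimeError("address pool exhausted")
        ip = self._free.pop(0)
        self._in_use[container] = ip
        return ip

    def release(self, container):
        # Return the address to the pool so it can be reused later.
        self._free.append(self._in_use.pop(container))


@dataclass
class Task:
    name: str
    wants_ip: bool = False  # opt-in tag set by the framework/scheduler


class NetworkIsolationModule:
    """Hypothetical stand-in for the agent-side module."""

    def __init__(self, ipam, host_ip):
        self.ipam = ipam
        self.host_ip = host_ip
        self.reported = {}  # what the master and framework would be told

    def on_launch(self, task):
        if task.wants_ip:
            ip = self.ipam.allocate(task.name)
            # A real isolation provider would set up reachability/routes here.
        else:
            ip = self.host_ip  # untagged tasks keep sharing the host IP
        self.reported[task.name] = ip
        return ip

    def on_terminate(self, task):
        if task.wants_ip:
            self.ipam.release(task.name)
        self.reported.pop(task.name, None)


ipam = SimpleIpam("192.168.0.0/29")  # assumed example address pool
module = NetworkIsolationModule(ipam, host_ip="10.0.0.5")
print(module.on_launch(Task("web", wants_ip=True)))
print(module.on_launch(Task("legacy")))
```

Note that the untagged task simply falls through to the host IP, which is the opt-in behavior described in the first bullet.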
Even though IP assignment and network isolation can be provided by a single unit, conceptually, they provide two different services. One can imagine two independent service providers offering IPAM and network isolation services. For example, one can use Ubuntu FAN for IP address allocation and Project Calico for network isolation.
The opt-in nature of the IP-per-container service allows for a rolling cluster upgrade because existing frameworks are unaffected. One can therefore run containers in mixed mode (with and without IP per container) in the same cluster without any incompatibility.
Having an IP address per container allows for both coarse-grained and fine-grained network isolation between containers. While it is up to the third-party network isolation provider, one can imagine a trivial coarse-grained isolation using a network routing table.
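As a rough illustration of what trivial coarse-grained isolation could look like, the sketch below models a routing table in Python. The subnets, the `ROUTES` table, and the `can_reach` helper are assumptions made up for this example; an actual isolation provider would enforce such rules in the network itself rather than in application code.

```python
import ipaddress

# Hypothetical routing table: source subnet -> subnets it may reach.
# Each class of application gets its own subnet and can only talk to itself.
ROUTES = {
    "10.1.0.0/24": ["10.1.0.0/24"],  # e.g., risk-analysis simulations
    "10.2.0.0/24": ["10.2.0.0/24"],  # e.g., customer-facing applications
}


def can_reach(src, dst, routes=ROUTES):
    """Return True if a container at src has a route to a container at dst."""
    src_ip = ipaddress.ip_address(src)
    dst_ip = ipaddress.ip_address(dst)
    for src_net, allowed in routes.items():
        if src_ip in ipaddress.ip_network(src_net):
            return any(dst_ip in ipaddress.ip_network(net) for net in allowed)
    return False  # unknown sources get no routes at all


print(can_reach("10.1.0.4", "10.1.0.9"))
print(can_reach("10.1.0.4", "10.2.0.7"))
```

Fine-grained isolation would replace the subnet-level table with per-container or per-port rules, which is where the third-party provider's own policy model comes in.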
Without per-container IP addresses, the application must register its <localhost, port-assignment> pair with some discovery service (e.g., Consul or ZooKeeper). Next, an HAProxy or similar reverse proxy must be deployed on each compute node to forward traffic from localhost:
With per-container IP, after the IP address is assigned to a container and networking isolation and routes are enabled, the Mesos master and scheduler are informed of the IP addresses. At this point, the scheduler can use the container IP for reaching the application. It can also provide this information to newer containers and applications as they are launched.
Further, the Mesos master makes the container IP available via its state endpoint. This information is used by DNS service providers such as Mesos-DNS and Mesos-Consul to enable name resolution.
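For a sense of how a DNS provider might consume this information, here is a hedged Python sketch that pulls container IPs out of a state document. The JSON layout is an assumption: the field names (`statuses`, `container_status`, `network_infos`, `ip_address`, `ip_addresses`) follow what some Mesos versions expose, but the exact shape varies by release, so check it against your own master's output. The `SAMPLE_STATE` data and the `container_ips` helper are invented for illustration.

```python
# Assumed, trimmed-down example of what the master's state might contain.
SAMPLE_STATE = {
    "frameworks": [
        {
            "name": "marathon",
            "tasks": [
                {
                    "name": "web",
                    "statuses": [
                        {
                            "state": "TASK_RUNNING",
                            "container_status": {
                                "network_infos": [{"ip_address": "192.168.0.1"}]
                            },
                        }
                    ],
                },
                {"name": "legacy", "statuses": [{"state": "TASK_RUNNING"}]},
            ],
        }
    ]
}


def container_ips(state):
    """Map task name -> sorted list of container IPs found in the state."""
    result = {}
    for framework in state.get("frameworks", []):
        for task in framework.get("tasks", []):
            ips = set()
            for status in task.get("statuses", []):
                container = status.get("container_status", {})
                for info in container.get("network_infos", []):
                    # Older layout (assumed): a single address per entry.
                    if "ip_address" in info:
                        ips.add(info["ip_address"])
                    # Newer layout (assumed): a list of addresses per entry.
                    for entry in info.get("ip_addresses", []):
                        if "ip_address" in entry:
                            ips.add(entry["ip_address"])
            if ips:
                result[task["name"]] = sorted(ips)
    return result


print(container_ips(SAMPLE_STATE))
```

Tasks that did not opt in carry no container IP in this sketch, so a resolver would fall back to the agent's address for them.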
With the advent of unique IP addresses per container, each container owns the entire port range available for its IP and there are no more port conflicts to worry about. The application can now listen on standard ports, thus making service discovery trivial and eliminating the need for a reverse proxy.
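The difference is easy to see in a toy model. The Python sketch below is not a real network stack; `PortTable` is a hypothetical class that only records which (IP, port) pairs are taken, which is enough to show why two applications cannot both use port 80 on a shared host IP but can when each has its own address.

```python
class PortTable:
    """Hypothetical bookkeeping of bound (ip, port) pairs on one node."""

    def __init__(self):
        self._bound = set()

    def bind(self, ip, port):
        key = (ip, port)
        if key in self._bound:
            raise OSError(f"{ip}:{port} is already in use")
        self._bound.add(key)
        return key


# Shared host IP: the second application cannot have port 80.
shared = PortTable()
shared.bind("10.0.0.5", 80)
try:
    shared.bind("10.0.0.5", 80)
except OSError as err:
    print("conflict:", err)

# IP per container: each container owns its whole port range.
per_container = PortTable()
per_container.bind("192.168.0.1", 80)
per_container.bind("192.168.0.2", 80)
print("both containers listen on port 80")
```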
The IP-per-container approach allows us to assign each Mesos container a unique IP address. This solves the inherent port-conflict problem, lets applications listen on well-known ports, and makes service discovery easier. The pluggable mechanism allows users to pick their preferred third-party vendor for IP address management and network isolation according to their specific requirements.
As usage of Mesos and the Mesosphere Datacenter Operating System picks up, we're excited for more users to experience the type of networking control that IP per container enables. Give it a try and let us know what you think!