Marathon 0.9.0 released

Jul 21, 2015

Pierluigi Cau



Mesosphere's Marathon is a cluster-wide init and control system for services in cgroups or Docker containers. It is the most popular framework on Mesos and has been used in large-scale production at some of the largest companies in the world.
The Datacenter Operating System (DCOS) uses Marathon to manage its processes and services; Marathon is the "init system" for the DCOS. Marathon starts and monitors your applications and services, automatically healing failures.
Today, we are announcing our latest stable release, 0.9.0, which is now available for download. It offers significant stability improvements and allows for easier integration thanks to the new event stream API endpoint. It also brings improved support for Mesos roles, a new ZooKeeper-based storage abstraction layer and much, much more.
The changelog from Marathon 0.8.2 to 0.9.0 can be found here, but keep reading to learn about the improvements and new features introduced in this version.
Breaking changes
This version of Marathon contains a number of breaking changes. Please review them before upgrading:
  • Disk resource limits are now passed to Mesos. (For information on how to enable disk quota enforcement in Mesos, see the containerizer documentation.)
  • New format for the http_endpoints command line parameter.
  • New default for the zk_max_versions command line parameter. In Marathon, most state is versioned, including app definitions and group definitions. Marathon previously allowed restricting the number of versions kept via --zk_max_versions, but the limit had to be specified explicitly. Starting with this version, Marathon keeps only 25 versions by default, which means it will start removing old versions to enforce the limit.
  • Removed the deprecated zk_hosts and zk_state command line parameters. Use the zk parameter instead.
Overview of changes
Restrict applications to certain Mesos roles
Prior Marathon versions already supported registering with a --mesos_role. This causes Mesos to offer resources of the specified role to Marathon in addition to resources without any role designation ("*"). Marathon would then use resources of any role for tasks of any app.
Now you can specify which roles Marathon should consider by default when launching apps via the --default_accepted_resource_roles configuration argument. You can override the default by specifying a list of accepted roles for your app via the acceptedResourceRoles attribute of the app definition.
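As a minimal sketch, an app definition that overrides the default could be built as follows. The app id, command, resource values and the role name "production" are hypothetical; only the acceptedResourceRoles attribute is taken from the description above.

```python
import json


def build_app_definition(app_id, cmd, accepted_roles):
    """Build a Marathon app definition restricted to the given Mesos roles."""
    return {
        "id": app_id,
        "cmd": cmd,
        "cpus": 0.1,
        "mem": 32,
        "instances": 1,
        # Only resources offered with one of these roles are considered.
        "acceptedResourceRoles": list(accepted_roles),
    }


# Hypothetical app that may only use resources reserved for "production".
app = build_app_definition("/my-service", "sleep 3600", ["production"])
app_json = json.dumps(app, indent=2, sort_keys=True)
print(app_json)
```

The resulting JSON document is what you would submit to Marathon when creating the app. Listing "*" as well would additionally allow unreserved resources.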
Event stream as server sent events
Prior Marathon versions already notified other services of events via event subscriptions: services could register an HTTP endpoint at which they received all events. Marathon now also provides an event stream endpoint from which you can conveniently receive all events as Server-Sent Events.
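To illustrate what consuming such a stream involves, here is a small sketch of a Server-Sent Events parser. It follows the generic SSE wire format (event: and data: fields, with a blank line ending each event) and is not Marathon's own client code. The sample event and its payload fields are made up for illustration.

```python
import json


def parse_sse(lines):
    """Yield (event_type, data) tuples from an iterable of SSE text lines."""
    event_type, data_lines = None, []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data_lines:
                yield event_type or "message", "\n".join(data_lines)
            event_type, data_lines = None, []
        elif line.startswith(":"):
            continue  # comment or keep-alive line
        else:
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]
            if field == "event":
                event_type = value
            elif field == "data":
                data_lines.append(value)


# Illustrative input only; real payloads contain more fields.
sample = [
    "event: status_update_event\n",
    'data: {"eventType": "status_update_event", "taskStatus": "TASK_RUNNING"}\n',
    "\n",
]
events = [(name, json.loads(data)) for name, data in parse_sse(sample)]
```

In practice the lines would come from a long-lived HTTP response requested with the Accept: text/event-stream header, read line by line as they arrive.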
Abstraction for persistent storage added with ZooKeeper access directly in the JVM
A new storage abstraction has been created, which allows for different storage providers. It is completely non-blocking and provides consistent usage patterns.
The new ZooKeeper Storage Provider is implemented in a backward-compatible fashion: it uses the same data format and storage layout as prior versions of Marathon.
You can use this version of Marathon without migrating data, and it is also possible to switch back to an older version. The new persistent storage layer is enabled by default, so no further action is needed.
Satisfy ports from any offered port range
In prior Marathon versions, matching port resources to the demands of a task had various restrictions:
  • Marathon could not launch a task if it required port resources with different Mesos roles.
  • Dynamically assigned non-docker host ports had to come from a single port range.
Now the port resources of a task can be satisfied by any combination of port ranges with any matching offered role.
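The idea can be sketched as follows. This is a simplified, deterministic first-fit illustration and not Marathon's actual matching code; the function name and the (role, begin, end) tuple layout are assumptions made for this example.

```python
def take_ports(offered_ranges, count):
    """Pick `count` ports from any combination of offered (role, begin, end) ranges.

    Returns a list of (role, port) pairs, or None if the offer is too small.
    """
    picked = []
    for role, begin, end in offered_ranges:
        for port in range(begin, end + 1):  # range bounds are inclusive
            if len(picked) == count:
                return picked
            picked.append((role, port))
    return picked if len(picked) == count else None


# Hypothetical offer: two unreserved ports plus one port reserved for a role.
offer = [("*", 31000, 31001), ("production", 8080, 8080)]
ports = take_ports(offer, 3)
```

With the old restrictions, a request for three ports could not have been satisfied from this offer, because no single range and no single role provides all three.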
Randomize dynamic Docker host ports
If a task reuses recently freed port resources, dependencies of old tasks may still expect the old task to be reachable at the old port for a limited time. For this reason, Marathon already randomized the assignment of dynamic non-Docker host ports to minimize the risk of launching a new task on ports recently used by other tasks.
Now Marathon also randomly assigns dynamic Docker host ports.
Disk resource limits are passed to Mesos
If you specify a non-zero disk resource limit, this limit is now passed to Mesos on task launch. If you rely on disk limits, you also need to configure Mesos appropriately. This includes configuring the correct isolator and enabling disk quota enforcement with --enforce_container_disk_quota.
Improved proxying to current leader
One of the Marathon instances is always elected as leader and is the only instance processing your requests. For convenience, Marathon has long proxied all requests sent to non-leaders to the current leader, so that you neither have to look up the current leader yourself nor deal with redirects. This proxying has now been improved and has gained additional configuration parameters:
  • --leader_proxy_connection_timeout (Optional. Default: 5000): Maximum time, in milliseconds, for connecting to the current Marathon leader from this Marathon instance.
  • --leader_proxy_read_timeout (Optional. Default: 10000): Maximum time, in milliseconds, for reading from the current Marathon leader.
Furthermore, leader proxying now uses HTTPS to talk to the leader if --http_disable was specified.
Relative URL paths in the UI
The UI now uses relative URL paths, making it easier to run Marathon behind a reverse proxy.
Restrict the number of versions by default
In Marathon, most state is versioned, including app definitions and group definitions. Marathon already allowed restricting the number of versions kept via --zk_max_versions, but you had to specify that explicitly. Since some of our users were running into problems with too many versions, we decided to restrict the number of versions to a maximum of 25 by default. We recommend setting this to an even lower number, e.g. 3, since higher numbers impact performance negatively.
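To make the effect of the limit concrete, here is a sketch of the pruning decision. It is an illustration rather than Marathon's implementation, and it assumes that versions are identified by ISO 8601 timestamps, which sort chronologically as strings. The function name is hypothetical.

```python
def versions_to_remove(versions, max_versions=25):
    """Return the oldest versions that exceed the retention limit."""
    ordered = sorted(versions)  # ISO 8601 timestamps sort chronologically
    excess = len(ordered) - max_versions
    return ordered[:excess] if excess > 0 else []


# Five hypothetical versions of one app definition.
versions = ["2015-07-0%dT00:00:00.000Z" % day for day in range(1, 6)]
stale = versions_to_remove(versions, max_versions=3)
```

With a limit of 3, the two oldest of the five versions are selected for removal; with the default limit of 25, nothing would be removed.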
New format for the http_endpoints command line parameter
We changed the format of the http_endpoints command line parameter from a space-separated to a comma-separated list of endpoints, in order to be more consistent with the documentation and with the format used in other parameters. WARNING: If you previously used the http_endpoints parameter with multiple space-separated URLs, you will need to migrate to the comma-separated format.
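If you generate your Marathon command line from scripts, the migration can be as small as the following sketch. The function name and the example URLs are hypothetical.

```python
def migrate_http_endpoints(old_value):
    """Convert a space-separated endpoint list to the comma-separated format."""
    # split() without arguments also collapses repeated or surrounding whitespace.
    return ",".join(old_value.split())


old = "http://host1:8080/events http://host2:8080/events"
new = migrate_http_endpoints(old)
print(new)
```

A value containing a single endpoint is returned unchanged, so the helper is safe to apply to existing configurations of either kind.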
Do not delay task launches anymore as a result of failed health checks
Marathon uses an exponential backoff strategy to delay further task launches after task failures. This prevents keeping the cluster busy with task launches that are set up to fail anyway. Previously, the delay was also increased when health checks failed, which led to delayed recovery. Failed health checks no longer increase the launch delay: since health checks typically (depending on configuration) take a while to determine that a task is unhealthy, restarts are already delayed sufficiently.
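For readers unfamiliar with the technique, a generic exponential backoff can be sketched like this. It is not Marathon's exact formula. The parameter names mirror the backoffSeconds, backoffFactor and maxLaunchDelaySeconds app settings, and the default values used here are assumptions for the sake of the example.

```python
def launch_delay(failures, backoff_seconds=1.0, backoff_factor=1.15,
                 max_delay_seconds=3600.0):
    """Return the delay in seconds before the next launch attempt.

    The delay grows geometrically with each consecutive task failure
    and is capped at max_delay_seconds.
    """
    if failures <= 0:
        return 0.0
    delay = backoff_seconds * backoff_factor ** (failures - 1)
    return min(delay, max_delay_seconds)


# Delays after the first five consecutive failures.
delays = [launch_delay(n) for n in range(1, 6)]
```

The point of this change is that only task failures feed into the failure count; a failed health check does not.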
Removed deprecated command line arguments zk_hosts and zk_state
The command line arguments zk_hosts and zk_state had been deprecated for some time and have been removed in this version.
Use the --zk command line argument to define the ZooKeeper connection string.
Python replacement for haproxy-marathon-bridge
This release also includes a replacement for the haproxy-marathon-bridge, implemented in Python. It reads Marathon task information and generates HAProxy configuration. It supports advanced functions like sticky sessions, HTTP to HTTPS redirection, SSL offloading, VHost support and templating.
Be more careful about using ulimit in startup script
The startup script now only increases the maximum number of open files if the limit is too low and if the script is started as root.
Upgrade now
We are truly excited about the stability gains and other improvements introduced by this release. Head over to our downloads page to download Marathon v0.9.0 now, then follow the installation instructions.
