Marathon 0.11: More powerful UI and debugging capabilities

Oct 02, 2015

Philip Norman

D2iQ

We are excited to announce the release of Marathon 0.11.0, a cluster-wide init and control system for services in cgroups or Docker containers. It is the most popular framework on Mesos and has been used in large-scale production at some of the largest companies in the world.

Mesosphere's Datacenter Operating System (DCOS) uses Marathon to manage processes and services. Marathon starts and monitors your applications and services, automatically healing failures, and is used as the "init" system for DCOS.

This new version brings significant performance updates, a simplified and updated API, and a greatly extended user interface. Here's more detail on what we added and what we fixed.

The headlines

Marathon 0.11.0 allows you to:

  • Create Docker apps from the UI.
  • Edit your running applications from the UI.
  • Compare your application statistics across deployments.
  • Access far more debugging information than before.

UI: The details

New application modal, including Docker-specific fields

The application modal has undergone significant changes, simplifying the app-creation process and giving the user access to more advanced features. Dockerized applications can now be created directly from the UI.

Modify existing configurations from the UI

It's now possible to edit an application configuration and deploy it using the same improved modal. In previous versions of Marathon, users wanting to do this from the UI had to stop, delete, and recreate applications to make a simple change.

Usability improvements to the applications view

The applications view is the "dashboard" of the Marathon UI. This version makes the applications view significantly more useful, especially by revealing more about application health.

Search bar

Applications can be filtered by name using the search bar in the top left corner of the applications view.

Total resource usage

Previously, only the configured resource usage was shown in the applications view, so an application with 100 running tasks would show the same resource usage as an identical application with only 1 running task. Now, the combined resource usage is shown, allowing users to sort applications by their total assigned resources.
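To make the difference concrete, here is a minimal sketch of the arithmetic. The App class and its field names are hypothetical and exist only for this illustration; the sketch also assumes every configured instance is running.

```python
from dataclasses import dataclass


@dataclass
class App:
    """Hypothetical, simplified view of an application (not a Marathon type)."""
    app_id: str
    cpus: float      # CPUs configured per task
    mem: float       # memory configured per task, in MiB
    instances: int   # number of tasks, assumed here to all be running


def total_resources(app: App) -> dict:
    """Combined resources across all tasks of one application."""
    return {"cpus": app.cpus * app.instances, "mem": app.mem * app.instances}


apps = [App("/small", 0.5, 64.0, 1), App("/large", 0.5, 64.0, 100)]

# Both apps have the same per-task configuration, but very different totals.
by_total_cpu = sorted(apps, key=lambda a: total_resources(a)["cpus"], reverse=True)
print([a.app_id for a in by_total_cpu])
```

Sorting by the per-task values alone could not distinguish the two applications; sorting by the totals puts the 100-task application first.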

Sort applications by health

Unhealthy apps can be found quickly by sorting the application table by health status.

Better progress information feedback

A tooltip displays when the user hovers over the application progress bar, showing individual health statuses.

More information on applications and deployments

In previous versions of Marathon, only an application's configuration and health check status were available from the UI. Marathon 0.11 brings the following features:

Debug app tab

A new tab is available in the application detail view that displays the most recent changes to the application configuration, the most recent task failure, and relevant statistics. There's also a direct link to the sandbox in the Mesos UI.

The new task statistics are also shown in the debug tab, allowing users to see at a glance how their latest scaling operation or configuration change affected their application.

API: The details

Smarter resource offer handling

When launching tasks across multiple applications, Marathon now spreads offers across all apps. In previous versions of Marathon, tasks were launched sequentially by application, which meant that one big deployment could block all the others.

To avoid overloading Mesos, we've made some new configuration parameters available. You can now restrict how many tasks Marathon launches in a given time period. By default, we allow 1000 unconfirmed task launches every 30 seconds. A TASK_RUNNING status update from Mesos is considered confirmation and allows a new task to be launched.
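The idea can be modeled with a small toy limiter. This is a sketch of the concept only, not Marathon's implementation: the class name, its methods, and the rule that the full budget is restored at the start of each window are all assumptions made for illustration.

```python
class LaunchLimiter:
    """Toy model of a cap on unconfirmed task launches per time window."""

    def __init__(self, max_unconfirmed: int = 1000, window_seconds: float = 30.0):
        self.max_unconfirmed = max_unconfirmed
        self.window_seconds = window_seconds
        self.tokens = max_unconfirmed
        self.window_start = 0.0

    def _refresh(self, now: float) -> None:
        # Assumption: once the window has elapsed, the full budget is restored.
        if now - self.window_start >= self.window_seconds:
            self.tokens = self.max_unconfirmed
            self.window_start = now

    def try_launch(self, now: float) -> bool:
        """Consume one token if available; return whether the launch may proceed."""
        self._refresh(now)
        if self.tokens == 0:
            return False
        self.tokens -= 1
        return True

    def confirm(self) -> None:
        """Model a TASK_RUNNING update: the launch is confirmed, freeing one slot."""
        self.tokens = min(self.tokens + 1, self.max_unconfirmed)


limiter = LaunchLimiter(max_unconfirmed=2, window_seconds=30.0)
first_three = [limiter.try_launch(now=0.0) for _ in range(3)]
limiter.confirm()                       # one task reported TASK_RUNNING
after_confirm = limiter.try_launch(now=1.0)
```

With a budget of two, the third launch is held back until either a confirmation arrives or the next window begins.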

When Marathon receives a resource offer from Mesos, it looks for suitable tasks to launch against that offer. In 0.11 it is possible to configure the period in which these matches are made. Any tasks still unmatched when this period ends are temporarily rejected and retried later.

No pseudo-deterministic assignment of host ports

Non-zero "ports" in an application configuration are now used as service ports. Previous versions of Marathon assigned these ports as host ports. It could thus appear that supplied "ports" corresponded to actual host ports, which caused unnecessary confusion. In 0.11, host port assignment is randomized where not explicitly configured.

Additional task statistics

Granular statistics are provided in each apps endpoint query, which gives the user a detailed picture of application performance across multiple configurations and at different scale levels.

The following detailed statistics are provided:

  • withLatestConfig - tasks running with the latest application configuration.
  • startedAfterLastScaling - tasks that were launched by the last scale or restart.
  • withOutdatedConfig - the inverse of withLatestConfig.
  • totalSummary - all tasks.
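As an illustration of how these categories might be consumed, the sketch below pulls the number of running tasks out of each one. The four category names come from the list above; the nested "stats" / "counts" / "running" layout and the sample values are assumptions, so check an actual API response before relying on them.

```python
CATEGORIES = (
    "withLatestConfig",
    "startedAfterLastScaling",
    "withOutdatedConfig",
    "totalSummary",
)


def running_counts(task_stats: dict) -> dict:
    """Return the running-task count per category, using 0 when a category is absent."""
    result = {}
    for name in CATEGORIES:
        # Assumed layout: {category: {"stats": {"counts": {"running": N}}}}
        counts = task_stats.get(name, {}).get("stats", {}).get("counts", {})
        result[name] = counts.get("running", 0)
    return result


# Hypothetical sample: a rolling upgrade with 3 new and 2 old tasks.
sample_task_stats = {
    "withLatestConfig": {"stats": {"counts": {"running": 3}}},
    "withOutdatedConfig": {"stats": {"counts": {"running": 2}}},
    "totalSummary": {"stats": {"counts": {"running": 5}}},
}

summary = running_counts(sample_task_stats)
```

Comparing withLatestConfig against withOutdatedConfig in this way gives a quick view of how far a configuration change has rolled out.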

Faster lost-task reconciliation

Newly launched tasks are tracked by ID, leading to faster reconciliation of lost tasks.

Specific "embed" parameters for application-related information

All GET requests now take optional "embed" parameters that allow the user to specify the desired information. By default, Marathon will deliver all information, as in previous versions. Users are encouraged to take advantage of this new parameter to improve performance.
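A request URL with repeated "embed" parameters could be built as in the sketch below. The host name is a placeholder and the embed values shown are assumptions for illustration; consult the REST API documentation for the values each endpoint accepts.

```python
from typing import Iterable
from urllib.parse import urlencode


def apps_url(base_url: str, embed: Iterable[str] = ()) -> str:
    """Build a /v2/apps URL, repeating the embed parameter once per requested value."""
    query = urlencode([("embed", value) for value in embed])
    return f"{base_url}/v2/apps" + (f"?{query}" if query else "")


# Hypothetical host and embed values, used only to show the URL shape.
url = apps_url("http://marathon.example.com:8080", ["apps.tasks", "apps.counts"])
print(url)
```

Requesting only the parts of the response you need keeps payloads small, which is where the performance benefit comes from.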

New versioning information in API and UI

We now have "versionInfo" with "lastConfigChangeAt" and "lastScalingAt" in the apps JSON of our API. "lastConfigChangeAt" is the timestamp of the last change to the application that was not just a restart or a scaling operation. "lastScalingAt" is the timestamp of the last scaling or restart operation.
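For example, a client could compare the two timestamps to tell what kind of operation touched an application most recently. The sketch assumes ISO 8601 UTC timestamps with fractional seconds, and it treats equal timestamps as a configuration change; both points are assumptions, and the sample values are made up.

```python
from datetime import datetime, timezone


def parse_timestamp(value: str) -> datetime:
    """Parse a timestamp such as 2015-10-01T15:30:00.000Z (assumed format)."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


def last_operation(version_info: dict) -> str:
    """Classify the most recent operation recorded in a versionInfo object."""
    config_at = parse_timestamp(version_info["lastConfigChangeAt"])
    scaling_at = parse_timestamp(version_info["lastScalingAt"])
    # Assumption: equal timestamps mean the last operation was a config change.
    return "scaling or restart" if scaling_at > config_at else "configuration change"


# Hypothetical sample values.
sample_version_info = {
    "lastConfigChangeAt": "2015-09-30T09:00:00.000Z",
    "lastScalingAt": "2015-10-01T15:30:00.000Z",
}
operation = last_operation(sample_version_info)
```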

Persistent last task failure information

Information about task failure now persists across restarts and failover, preserving valuable debugging information.

Logging parameters on startup

All command line parameters are now logged on startup, enabling users to check that their Marathon configuration is correct at a glance.

Improved logging of offer rejections

Marathon now logs offer rejection in greater detail, giving useful error messages. For example, attempting to assign a host port that is already in use results in the following error message:

    Cannot find range with host port 8080 for app [/product/frontend]
    Insufficient resources for [/product/frontend] (need cpus=1.0, mem=64.0, disk=1.0, ports=([8080] required + 1 dynamic), available in offer:...

Attempting to deploy an application which requires more CPUs than are available:

    Not all basic resources satisfied: cpu NOT SATISFIED (30.0 > 8.0), disk SATISFIED (0.0 <= 0.0), mem SATISFIED (16.0 <= 15360.0)
    Insufficient for [/test] (need cpus=30.0, mem=16.0, disk=0.0, ports=(1 dynamic), available in offer:...
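Because these messages follow a regular shape, they are easy to pick apart in a log-processing script. The sketch below extracts the unsatisfied resources from the "Not all basic resources satisfied" message quoted above. The regular expression is derived from that one example only, so treat the exact format as an assumption.

```python
import re

# Matches fragments such as "cpu NOT SATISFIED (30.0 > 8.0)".
RESOURCE_PATTERN = re.compile(
    r"(\w+) (NOT SATISFIED|SATISFIED) \(([\d.]+) (?:>|<=) ([\d.]+)\)"
)


def unsatisfied_resources(line: str) -> dict:
    """Map each unsatisfied resource to a (required, offered) pair."""
    return {
        name: (float(required), float(offered))
        for name, status, required, offered in RESOURCE_PATTERN.findall(line)
        if status == "NOT SATISFIED"
    }


log_line = (
    "Not all basic resources satisfied: cpu NOT SATISFIED (30.0 > 8.0), "
    "disk SATISFIED (0.0 <= 0.0), mem SATISFIED (16.0 <= 15360.0)"
)
missing = unsatisfied_resources(log_line)
```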

Reflect configuration changes immediately

When a new deployment is accepted, the associated configuration data (the data provided in the application's endpoint) is updated immediately. Previous versions of Marathon only updated application data after the deployment had started. This could cause confusion because successful API requests could appear to have had no effect. Similarly, group data is now updated as soon as a deployment is accepted.

Better backoff behavior when tasks fail

When task failures occur, Marathon throttles subsequent task launches using the backoff duration parameter. The backoff delay is no longer reset automatically; in previous versions of Marathon, this could lead to problems when an application crashed shortly after startup. As in previous versions of Marathon, backoff delay can still be reset manually.
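As a rough picture of how a backoff delay that is not reset grows with repeated failures, consider the sketch below. It is not Marathon's implementation: the exponential formula, the parameter names, and the default values are assumptions chosen for illustration.

```python
def launch_delay(
    failures: int,
    backoff_seconds: float = 1.0,       # assumed initial delay
    backoff_factor: float = 1.15,       # assumed growth factor per failure
    max_delay_seconds: float = 3600.0,  # assumed upper bound on the delay
) -> float:
    """Delay before the next launch after a number of consecutive task failures."""
    if failures <= 0:
        return 0.0
    delay = backoff_seconds * backoff_factor ** (failures - 1)
    return min(delay, max_delay_seconds)


# An app that keeps crashing right after startup waits longer each time,
# because the delay is not reset between launches.
delays = [launch_delay(n) for n in range(0, 6)]
```

Under this model, an application that crashes immediately after every start is launched less and less often, instead of being relaunched in a tight loop.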

MARATHON_APP_DOCKER_IMAGE environment variable

Dockerized applications are now started with their Docker image name in the MARATHON_APP_DOCKER_IMAGE environment variable.
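An application can read this variable like any other environment variable. In the minimal sketch below, the helper name and the sample image name are hypothetical.

```python
import os
from typing import Mapping, Optional


def docker_image(environ: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the Docker image name set by Marathon, or None if it is not set."""
    return environ.get("MARATHON_APP_DOCKER_IMAGE")


# Passing a mapping explicitly makes the helper easy to try outside Marathon.
image = docker_image({"MARATHON_APP_DOCKER_IMAGE": "example/frontend:1.0"})
print(image or "not started as a Dockerized Marathon app")
```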

Important bug fixes

  • #1553 - Marathon will now correctly reload task state information after a failover.
  • #1924 - Marathon will accept offers without disk resources if no disk resources are required.
  • #1671 - Mesos will now use the hostname given by the --hostname parameter to communicate with Marathon.
  • #1926 - Our leader proxy code used buffered IO without intermediate flushing which did not play well with streaming events from our /v2/events endpoint.
  • #1877 - Marathon will now exit on startup failures instead of continuing to run without being able to answer requests. For example, Marathon will now exit if the specified HTTP port is already in use.

Under the hood

Jetty 9 as Servlet Engine

The latest Jetty servlet engine is used in this version of Marathon. Jetty 9 has a completely overhauled I/O layer, Servlet API 3.0, SPDY/3 and WebSocket support.

Improved SSE handling

As part of the Jetty 9 update, the SSE support for /v2/events has been improved. The event name event: event-name is added to every data: json entry for easier filtering and handling.
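With the event name present, a client can filter events without inspecting each JSON payload. The sketch below parses a Server-Sent Events text stream into (event name, payload) pairs. The sample stream is made up for illustration and the parser handles only "event:" and "data:" lines.

```python
import json
from typing import Iterator, Optional, Tuple


def parse_sse(text: str) -> Iterator[Tuple[Optional[str], dict]]:
    """Yield (event_name, payload) pairs; a blank line terminates each event."""
    event_name, data_lines = None, []
    for line in text.splitlines() + [""]:
        if line == "":
            if data_lines:
                yield event_name, json.loads("\n".join(data_lines))
            event_name, data_lines = None, []
        elif line.startswith("event:"):
            event_name = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data_lines.append(line[len("data:"):].strip())


# Hypothetical sample stream with two events.
sample_stream = (
    "event: status_update_event\n"
    'data: {"eventType": "status_update_event", "taskStatus": "TASK_RUNNING"}\n'
    "\n"
    "event: deployment_success\n"
    'data: {"eventType": "deployment_success"}\n'
    "\n"
)

status_updates = [
    payload for name, payload in parse_sse(sample_stream)
    if name == "status_update_event"
]
```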

Play JSON everywhere

We finished our transition from Jackson JSON serialization to Play JSON. Play JSON provides a type-safe interface, which makes it easier to write correct code.

Upgrade now

Familiarize yourself with the upgrade process, then head over to our downloads page to download and install Marathon v0.11.0 now!

The detailed changelog is available on GitHub. Please note that we recommend using Marathon 0.11.0 with Mesos 0.23.0.
