Megaupload on Mesos

Jan 29, 2014

Jason Dusek



SSSP is a simple web application that provides a white-label "Megaupload" for storing and sharing files in S3. By holding the credentials and allowing configurable routing of subpaths to buckets, SSSP offers a simple and intuitive storage interface for software releases, intermediate data products, internal documents and large media files.
SSSP is configured with Amazon S3 credentials and bucket names, and then provides signed redirects for PUTs, GETs and DELETEs. Putting the AWS information behind a central, firewalled storage router like SSSP makes credential management easier for administrators and users alike, allows plain HTTP libraries and tools to function as usual, and provides for intuitive and memorable naming of storage locations.
The modern, Mesos-enabled SSSP, written in Scala, is a reimplementation of the Haskell version, which was in use at Airbnb and Erudify for build storage and deployment. As a simple proxy, SSSP can be made highly available by running a few instances of it and using DNS load-balancing; but Mesos allowed us to easily implement a few new features:
Dynamic reconfiguration, with Mesos's framework messages
Access to the cluster topology through a web endpoint
Automated recovery when worker nodes go down
Mesos also makes some planned additions, like dynamic scaling and resilience in the face of coordinator failure, much easier to implement.
Using SSSP To Store & Share Files
To get started with SSSP in standalone mode, it's enough to install Play, check out the latest release of SSSP from GitHub, start SSSP and POST a bucket configuration:
:;  git clone
:;  pushd sssp
:;  mv conf/mesos.conf mesos.conf        # Disable Mesos for the time being
:;  play run
:;  curl -X POST http://localhost:9000/ \
         -H 'Content-Type: application/json' \
         -d '{ "/": { "s3": { "bucket": "a-bucket",
                              "access": "the-access-key",
                              "secret": "the-secret-key" } } }'
(This JSON configuration, a map of paths to buckets, can also be placed in conf/s3.json.)
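To make the "map of paths to buckets" idea concrete, here is a minimal Python sketch of how a request path might be resolved against such a configuration. The post does not say how SSSP matches subpaths, so the longest-prefix rule, the `resolve` function and the `/releases` route are all assumptions made for illustration.

```python
def resolve(routes, path):
    """Return (bucket, key) for the longest configured prefix matching path.

    `routes` has the same shape as the JSON configuration: a map from
    path prefixes to {"s3": {"bucket": ..., "access": ..., "secret": ...}}.
    Longest-prefix matching is an assumption, not documented SSSP behavior.
    """
    best = None
    for prefix, target in routes.items():
        p = prefix.rstrip("/")  # the root "/" becomes the empty prefix
        if path == p or path.startswith(p + "/"):
            if best is None or len(p) > len(best[0]):
                best = (p, target)
    if best is None:
        return None
    p, target = best
    # The object key is whatever remains of the path below the prefix.
    return target["s3"]["bucket"], path[len(p):].lstrip("/")


routes = {
    "/": {"s3": {"bucket": "a-bucket",
                 "access": "the-access-key",
                 "secret": "the-secret-key"}},
    # Hypothetical second route, added only for this example.
    "/releases": {"s3": {"bucket": "release-bucket",
                         "access": "the-access-key",
                         "secret": "the-secret-key"}},
}

print(resolve(routes, "/xyz"))
print(resolve(routes, "/releases/v1.tgz"))
```

With a rule like this, a single root entry serves every path, and more specific prefixes can be pointed at other buckets without changing client URLs.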
Once a bucket is configured in SSSP, PUTs, GETs and DELETEs to paths below the root result in a signed redirect, pointing to a location in Amazon S3. The redirects are signed for ten seconds. To upload data with curl, simply pass the -L option, which causes redirects to be followed. With -i, curl allows us to watch the sign-and-redirect flow as it progresses:
:;  curl -ifL -X PUT http://localhost:9000/xyz -d text
HTTP/1.1 307 Temporary Redirect
Location: max-age=9
Content-Length: 0

HTTP/1.1 200 OK
x-amz-id-2: /2rv2TXhIxMNHvV6DFtCHU4voQEcXpvUm+pk5xffmfGAr2eouoCJxa5Mnzd9ba7n
x-amz-request-id: D1D0016D87DAD49D
Date: Mon, 27 Jan 2014 23:10:08 GMT
ETag: "1cb251ec0d568de6a929b520c4aed8d1"
Content-Length: 0
Server: AmazonS3
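For readers curious what a signed redirect target can look like, the following Python sketch builds a short-lived S3 URL using the legacy query-string authentication scheme (Signature Version 2). SSSP's own signing code is not shown in this post, so this is an illustration of the general technique rather than SSSP's implementation; the `presign` function and its parameters are hypothetical, and newer AWS regions require Signature Version 4 instead.

```python
import base64
import hashlib
import hmac
import time
from urllib.parse import quote


def presign(method, bucket, key, access, secret,
            ttl=10, now=None, content_type=""):
    """Build a query-string-authenticated S3 URL valid for `ttl` seconds."""
    expires = int(now if now is not None else time.time()) + ttl
    resource = "/" + bucket + "/" + quote(key)
    # Signature V2 string to sign: verb, Content-MD5 (left empty here),
    # Content-Type, expiry timestamp, then the canonicalized resource.
    string_to_sign = "\n".join(
        [method, "", content_type, str(expires), resource])
    digest = hmac.new(secret.encode("utf-8"),
                      string_to_sign.encode("utf-8"),
                      hashlib.sha1).digest()
    signature = quote(base64.b64encode(digest).decode("ascii"), safe="")
    return ("https://" + bucket + ".s3.amazonaws.com/" + quote(key)
            + "?AWSAccessKeyId=" + quote(access, safe="")
            + "&Expires=" + str(expires)
            + "&Signature=" + signature)


# Ten seconds matches the signing window mentioned in the text.
print(presign("PUT", "a-bucket", "xyz",
              "the-access-key", "the-secret-key", ttl=10))
```

Because the HTTP verb is part of the signed string, a URL signed for a PUT cannot be reused for a GET or a DELETE, which is why each request to the router gets its own redirect.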
Retrieval and deletion work similarly:
:;  curl -fL -X GET http://localhost:9000/xyz
text
:;  curl -fL -X DELETE http://localhost:9000/xyz
Now that the file is gone, a GET returns a 404:
:;  curl -fL -X GET http://localhost:9000/xyz
curl: (22) The requested URL returned error: 404
SSSP Configuration & Deployment On Elastic Mesos
SSSP can be deployed like any other Play application, using play dist. While the web API and console allow most things to be configured online, static files are supported, too: conf/s3.json for bucket configuration and conf/mesos.conf for Mesos settings (Mesos mode is only enabled when the Mesos settings are present).
For Mesos configuration, two settings are needed -- a Mesos master URL and the number of workers to spawn. Here is an example that works well on Elastic Mesos (assuming you start the framework on the master):
With Play and a Scala build chain installed locally, you can easily create a distribution for launching on Elastic Mesos (a basic mesos.conf is included in the distribution):
:;  git clone
:;  pushd sssp
:;  play universal:package-zip-tarball
Send the distribution to your Elastic Mesos primary host and unpack it:
:;  cat target/universal/sssp-*.tgz | ssh ubuntu@ela.stic.mesos tar -xz
Once you're logged in, install Java and launch SSSP.
:;  sudo aptitude install -y default-jre
:;  cd sssp-*
:;  MESOS_NATIVE_LIBRARY=/usr/local/lib/ bin/sssp
Note that the MESOS_NATIVE_LIBRARY environment variable tells SSSP where to find the Mesos dynamic library.
The Web Console
SSSP's web console offers an overview of the bucket layout and the cluster topology.
Buckets can be added and removed dynamically from the web console. The web console is served at the only endpoint handled by SSSP directly -- the root. The routes can be retrieved as JSON if the Accept header indicates application/json is desired.
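As a rough sketch of reading the routes programmatically, the Python below asks the root endpoint for JSON by setting the Accept header. The function names and the `http://localhost:9000/` address (taken from the standalone examples earlier in the post) are assumptions; the shape of the returned JSON is not specified here, so the code simply decodes whatever the server sends.

```python
import json
import urllib.request


def build_routes_request(base_url):
    """Prepare a GET of the SSSP root that asks for the routes as JSON."""
    return urllib.request.Request(
        base_url, headers={"Accept": "application/json"})


def fetch_routes(base_url, timeout=5):
    """Send the request and decode the JSON body (needs a running SSSP)."""
    request = build_routes_request(base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


request = build_routes_request("http://localhost:9000/")
print(request.get_method(), request.full_url, request.get_header("Accept"))
```

Calling `fetch_routes("http://localhost:9000/")` against a running instance would return the decoded routes.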
Dynamic Load-Balancing & SSSP
The web API provides an easy way to automatically find active nodes and configure your load balancer. Requests to the root with Accept: text/plain set will return a service endpoints line in the same format that Marathon uses:
sssp    9000
The host:port pairs are the endpoints, separated from the application name and canonical port, as well as each other, by tabs. Marathon's haproxy_cfg script can be used with this data to generate full HAProxy configurations:
:;  ( ./haproxy_cfg header &&
      curl -sSf http://localhost:9000 -H 'Accept: text/plain' |
      ./haproxy_cfg rules ) > /tmp/haproxy.conf
or partial HAProxy configurations:
:;  curl -sSf http://localhost:9000 -H 'Accept: text/plain' |
    ./haproxy_cfg rules
listen sssp
  bind
  mode http
  option tcplog
  option httpchk GET /
  balance leastconn
  server sssp-1 check
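If you would rather consume the endpoints line from your own tooling, the Python sketch below parses the tab-separated format described above and renders a listen block shaped like the haproxy_cfg output. It is not Marathon's script: the function names, the sample host addresses and the `0.0.0.0` bind address are assumptions chosen for illustration.

```python
def parse_endpoints(line):
    """Split 'app<TAB>port<TAB>host:port...' into (app, port, endpoints)."""
    fields = line.rstrip("\n").split("\t")
    app, port = fields[0], int(fields[1])
    endpoints = []
    for pair in fields[2:]:
        if not pair:
            continue  # tolerate a trailing tab
        host, _, endpoint_port = pair.rpartition(":")
        endpoints.append((host, int(endpoint_port)))
    return app, port, endpoints


def render_listen(app, port, endpoints):
    """Render an HAProxy listen block with one server line per endpoint."""
    lines = [
        "listen " + app,
        "  bind 0.0.0.0:" + str(port),  # bind address is an assumption
        "  mode http",
        "  option tcplog",
        "  option httpchk GET /",
        "  balance leastconn",
    ]
    for index, (host, endpoint_port) in enumerate(endpoints, start=1):
        lines.append("  server {}-{} {}:{} check".format(
            app, index, host, endpoint_port))
    return "\n".join(lines)


# Hypothetical endpoints line; real addresses come from the running cluster.
sample = "sssp\t9000\t10.0.0.1:31000\t10.0.0.2:31001"
print(render_listen(*parse_endpoints(sample)))
```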
Thanks to the committers who contributed to the rewrite of SSSP!
Interested in Contributing?
Mesosphere loves open source. SSSP is a new project and there are plenty of opportunities to contribute to its design and development.
