
Dynamo - Dynamic Data Management System

Dynamo is a Dynamic Data Management system that organizes and optimizes large-scale data usage across distributed storage sites (storage elements). It is designed to seamlessly integrate a large number of storage sites, even if the individual sites differ greatly in size. The data are assumed to be grouped into chunks that have some clear relation (datasets) and that share common metadata, which can be used to organize their usage.

Once the data are entered into the system, Dynamo policies can be set to let the system organize the data for best access. The policy language is rich and powerful and can be adjusted at runtime. Popularity is used as a metric to dynamically adjust the number of replicas of data in the system. Tape storage systems can also be integrated, which offers a wider range of accessibility levels and good protection against data loss.
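To illustrate the idea of popularity-driven replication, here is a minimal sketch in Python. It is not Dynamo's policy language or API: the `Dataset` class, the `accesses_last_week` metric, the thresholds, and the function names are all hypothetical and chosen only to show how a replica count could follow a popularity metric within fixed bounds.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    """Hypothetical dataset record; not a Dynamo data structure."""
    name: str
    accesses_last_week: int  # assumed popularity metric
    replicas: int            # current number of disk replicas


def target_replicas(ds: Dataset, min_replicas: int = 1, max_replicas: int = 4,
                    accesses_per_replica: int = 100) -> int:
    """Return a replica count that grows with popularity, within fixed bounds."""
    # One replica per `accesses_per_replica` accesses, rounded up.
    wanted = -(-ds.accesses_last_week // accesses_per_replica)
    return max(min_replicas, min(max_replicas, wanted))


def plan(ds: Dataset) -> str:
    """Describe the action needed to move a dataset to its target replica count."""
    delta = target_replicas(ds) - ds.replicas
    if delta > 0:
        return f"replicate {ds.name}: +{delta}"
    if delta < 0:
        return f"delete {ds.name}: {delta}"
    return f"keep {ds.name}"


if __name__ == "__main__":
    for ds in (Dataset("/hot", 250, 1), Dataset("/cold", 0, 2), Dataset("/full", 1000, 4)):
        print(plan(ds))
```

In this sketch a frequently read dataset gains replicas up to the upper bound, while an unused one is reduced to the minimum. A real policy would also have to account for site capacity and for tape copies.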

The system was originally designed for the CMS experiment at CERN. In that setup there were seven storage sites with tape system access (Tier-1 sites) and about 40 sites with disk-only storage (Tier-2 sites).

If you are interested in managing your data with Dynamo or in contributing to its development, please contact the Dynamo Team at ddm-dynamo@mit.edu. The documentation is maintained on GitHub, compiled using Sphinx, and uploaded to Read the Docs.

History

The development of Dynamo was started in 2014 by the CMS Computing Operations organization, and Dynamo was initially foreseen as a tool working on top of the existing CMS data transfer engine, PhEDEx. During the last year, Dynamo evolved its capabilities to take over the organization of the data transfers and to provide a complete package for managing data, using DBS as the source for the initial definition of the metadata, FTS to perform the specific data transfers, and the CMS popularity service to track the usage of the data.

As the package emerged from CMS operations, it was initially coupled to the CMS environment. However, in the last half year, the CMS-specific components have been decoupled and a standalone version, ready to be used by other experiments, has been produced. It allows external, specialized plugins for services such as the popularity service or a master metadata source describing the data. We look forward to supporting other efforts where large amounts of data have to be managed across a potentially heterogeneous set of storage sites. Fully integrated tape storage is supported.
