Tenth International Workshop on Self-Managing Database Systems

Self-Managing Databases (自我管理 数据库), 8 April 2019, Macau SAR China


During the last forty years, data management systems have grown in scale, complexity, and number of installations. At the same time, administration of these systems has become very expensive with the human factor dominating the total cost of ownership. Current trends like cloud computing make this situation even more problematic for service providers who have to configure and manage thousands of database nodes.

There has been a significant amount of research addressing this problem by providing autonomic or self-* features in database systems to support complex administrative tasks like physical database design, problem diagnosis, and performance tuning. However, novel challenges arise from trends like containerization and virtualization; emerging systems like Kafka, Presto, and Spark; auto-scaling and transient clusters on the cloud; and Software-as-a-Service (SaaS). A major challenge is the need to scale self-management capabilities to the level of hundreds to thousands of nodes while taking economic factors into account.

Autonomic, or self-managing, systems are a promising approach to achieve the goal of systems that are easier to use and maintain. A system is considered to be autonomic if it possesses the capabilities to be self-configuring, self-optimizing, self-healing and self-protecting. The aim of the SMDB workshop is to provide a forum for researchers from both industry and academia to present and discuss ideas related to self-management and self-organization in data management systems, ranging from classical databases to data stream engines to large-scale cloud environments that utilize advances in AI, machine learning, and data mining and analysis.




Topics of interest include, but are not limited to:

* Principles and architecture of autonomic data management systems
* Retro-fitting existing systems vs. designing for self-management
* Self-* capabilities in databases and storage systems
* Data management in cloud and multi-tenant databases
* Autonomic capabilities in database-as-a-service platforms
* Automated testing of data management systems
* Automated physical database design and adaptive query tuning
* Automated provisioning and integration
* Automatic enforcement of information quality
* Robust query processing techniques
* Self-managing data stream engines and adaptive event-based systems
* Self-managing distributed / decentralized / peer-to-peer information systems
* Self-management of internet-scale distributed systems
* Self-management for big data infrastructures
* Monitoring and diagnostics in data management systems
* Policy automation and visualization for datacenter administration
* User acceptance and trust of autonomic capabilities
* Evaluation criteria and benchmarks for self-managing systems
* Self-evaluation of data management services in the cloud
* Use cases and war stories on deploying autonomic capabilities


Authors are invited to submit original research contributions in English of up to 6 pages in the IEEE camera-ready format (templates are available at the ICDE 2019 submission guidelines page) to the submission site https://cmt3.research.microsoft.com/SMDB2019. Authors of accepted papers will be encouraged to submit an extended paper of up to 8 pages for final publication. All accepted papers will appear in the formal Proceedings of the Conference Workshops published by IEEE CS Press, and will be included in the IEEE digital library.

Paper submission deadline:

December 3, 2018 5pm PST    December 17, 2018 5pm PST (abstract, optional)
December 10, 2018 5pm PST    December 22, 2018 5pm PST (full paper)

January 28, 2019    January 31, 2019

February 22, 2019



Shivnath Babu

CTO, Unravel Data Systems
Adjunct Professor, Duke University

Panos K. Chrysanthis

Professor, Computer Science Department
University of Pittsburgh

Meichun Hsu

Sr. Director of R&D Database Server Technology
Oracle Corporation

Constantinos Costa

Research Associate, Computer Science Department
University of Pittsburgh


Anastasia Ailamaki, EPFL, Switzerland
Ashraf Aboulnaga, Qatar Computing Research Institute, Qatar
Khuzaima Daudjee, University of Waterloo, Canada
Vivek Narasayya, Microsoft Research, USA
Beng Chin Ooi, National University of Singapore, Singapore
Neoklis Polyzotis, Google Inc, USA
Jeff Lefevre, University of California Santa Cruz, USA
Ken Salem, University of Waterloo, Canada
Matthias J. Sax, Confluent Inc., USA
Alkis Simitsis, Microfocus Inc, USA
S. Sudarshan, IIT Bombay, India
Herodotos Herodotou, Cyprus University of Technology, Cyprus
Sam Lightstone, IBM, Canada
Kai-Uwe Sattler, TU Ilmenau, Germany
Evaggelia Pitoura, University of Ioannina, Greece


Daniel Petrov, University of Pittsburgh, USA


08:30-08:45 Opening and Introductions

08:45-10:00 Session 1: Self-Managing Frameworks

Session Chair: Constantinos Costa, University of Pittsburgh

A Framework for Self-Managing Database Systems
Jan Kossmann (Hasso Plattner Institute); Rainer Schlosser (Hasso Plattner Institute)

Towards Auto-Scaling Existing Transactional Databases with Strong Consistency
Michael Georgiou (Cyprus University of Technology); Aristodemos Paphitis (Cyprus University of Technology); Michael Sirivianos (Cyprus University of Technology); Herodotos Herodotou (Cyprus University of Technology)

Distribution-Driven, Embedded Synthetic Data Generation System and Tool for RDBMS
Joseph W. Hu (SAP); Ivan Bowman (SAP); Anisoara Nica (SAP SE, Waterloo, Canada); Anil Goel (SAP)

10:00-10:30 Break

10:30-12:00 Session 2: Keynote 1

Session Chair: Meichun Hsu, Oracle Co.

"Towards self-managing cloud-scale computing platforms: experiences and challenges"
Jingren Zhou, Vice President at Alibaba Group

SMDB Business Meeting

12:00-13:30 Lunch

13:30-15:00 Session 3: Machine Learning Driven Self-Managing

Session Chair: Constantinos Costa, University of Pittsburgh

Gray Box Modeling Methodology for Runtime Prediction of Apache Spark Jobs
Hani Al-Sayeh (TU Ilmenau); Kai-Uwe Sattler (TU Ilmenau)

Guided Bayesian Optimization to AutoTune Memory-based Analytics
Mayuresh Kunjir (Duke University)

AutoCache: Employing Machine Learning to Automate Caching in Distributed File Systems
Herodotos Herodotou (Cyprus University of Technology)

15:00-15:30 Break

15:30-17:00 Session 4: Keynote 2

Session Chair: Meichun Hsu, Oracle Co.

"Cost/Performance in Modern Data Stores: How Data Caching Systems Succeed"
David Lomet, Microsoft Research

17:00 Closing Remarks


Towards self-managing cloud-scale computing platforms: experiences and challenges

Jingren Zhou, Vice President at Alibaba Group


More and more companies rely heavily on massive data analysis of many kinds to gain insights from data and drive business decisions. To support this ever-increasing need, big data computing platforms have grown to an unprecedented scale, way beyond human manageability. In this talk, I'll share our experiences at Alibaba in enabling our big data platforms to configure, optimize, monitor, and protect themselves automatically, including automatic version testing and deployment control, system health monitoring and alerting, automatic physical design/data placement/storage optimization, etc. I'll also outline some outstanding research and engineering challenges.


Jingren Zhou is Vice President at Alibaba Group. He is responsible for driving data intelligence infrastructure and several key data-driven businesses at Alibaba. Specifically, he leads work to develop a cloud-scale distributed computing platform, data analytic products, and various business solutions. He also leads work to develop advanced techniques for personalized search, product recommendation, and advertising on Alibaba's e-commerce platforms, including Taobao and Tmall. His research interests include cloud computing, databases, and large-scale machine learning systems. He received his PhD in Computer Science from Columbia University. He is a Fellow of IEEE.

Cost/Performance in Modern Data Stores: How Data Caching Systems Succeed

David Lomet, Microsoft Research


Data in traditional “caching” data systems resides on secondary storage, and is read into main memory only when operated on. This limits system performance. Main memory data stores with data always in main memory are much faster. But this performance comes at a cost. In this paper, we analyze the costs of both in-memory operations and secondary storage operations where data is not “in cache”. We study the performance impact of cache misses on caching system performance. The analysis considers both execution and storage costs. Based on our analysis, we derive cost/performance results for a data caching system [Deuteronomy and its Bw-tree] and a main memory system [MassTree] to understand where each demonstrates the best cost per operation, what is driving the cost differences, and the scale of the differences. This analysis (1) provides insight into why data caching systems continue to dominate the market; (2) points to higher performance that does not rely on simply increasing main memory cache size; and (3) suggests a path to lower costs and hence better cost/performance.


David Lomet founded the Database Group at Microsoft Research Redmond in 1995 and managed it for 20 years. His research career began at IBM where, while on a 1975-76 sabbatical at the University of Newcastle-on-Tyne, he invented atomic actions (a form of transactions). He later worked at Wang Institute as a faculty member, and at Digital Equipment Corporation as a software architect and research staff member. He received a Ph.D. in computer science from the University of Pennsylvania. Lomet’s primary focus has been the engineering of database systems, with a focus on database system kernels. His work on concurrency control and recovery contributed to making DEC’s Rdb and Microsoft’s SQL Server database management systems leaders in cost/performance. His Deuteronomy research project’s latch-free Bw-tree index and log-structured store are key elements in Microsoft’s Hekaton main memory database and Azure Cosmos DB cloud data service. Deuteronomy won the Microsoft Research Redmond “2017 Best Research Project” Award. Lomet is an author of over 120 papers, with two SIGMOD best paper awards, and over 60 patents. Lomet has won IEEE awards as well as the ACM SIGMOD Contributions Award, and at this conference the TCDE Service Award, for his 25-year tenure as EIC of the IEEE Data Engineering Bulletin. He has also served as editor of ACM TODS, VLDB Journal and others, and has been a member of the VLDB Board. He has been a PC co-chair for ICDE and VLDB. He is a member of the IEEE Computer Society Board of Governors and society Secretary, and has been First Vice President and Treasurer. He is a fellow of IEEE, ACM, and AAAS, and a member of the National Academy of Engineering.