
Due to recent directives involving COVID-19, SMDB 2020 will no longer be held in person in Dallas, TX on April 20th, 2020. The workshop will follow ICDE 2020's decision to take place online. Pursuant to ICDE conference guidelines, at least one author registration is required for each paper to be included in the proceedings. Please stay tuned for further guidance regarding the conference.


During the last forty years, data management systems have grown in scale, complexity, and number of installations. At the same time, administration of these systems has become very expensive with the human factor dominating the total cost of ownership. Current trends like cloud computing make this situation even more problematic for service providers who have to configure and manage thousands of database nodes.
There has been a significant amount of research addressing this problem by providing autonomic or self-* features in database systems to support complex administrative tasks like physical database design, problem diagnosis, and performance tuning. However, new challenges arise from trends like cloud and cluster computing, virtualization, and Software-as-a-Service (SaaS). A major challenge is the need to scale self-management capabilities to the level of hundreds to thousands of nodes while taking economic factors into account.
Autonomic, or self-managing, systems are a promising approach to achieve the goal of systems that are easier to use and maintain. A system is considered to be autonomic if it possesses the capabilities to be self-configuring, self-optimizing, self-healing and self-protecting. The aim of the SMDB workshop is to provide a forum for researchers from both industry and academia to present and discuss ideas related to self-management and self-organization in data management systems ranging from classical databases to data stream engines to large-scale cloud environments that utilize advanced AI, machine learning, and data mining and analysis.
We plan to follow the successful format of previous instances of this workshop: approximately 10 presentations of accepted papers, a keynote address by a well-known speaker and subject matter expert in self-managing database systems, as well as a panel discussion involving experts from industry and academia.




Topics of interest include, but are not limited to:

* Principles and architecture of autonomic data management systems
* Retro-fitting existing systems vs. designing for self management
* Self-* capabilities in databases and storage systems
* Data management in cloud and multi-tenant databases
* Autonomic capabilities in database-as-a-service platforms
* Automated testing of data management systems
* Automated physical database design and adaptive query tuning
* Automated provisioning and integration
* Automatic enforcement of information quality
* Robust query processing techniques
* Self-managing data stream engines and adaptive event-based systems
* Self-managing distributed / decentralized / peer-to-peer information systems
* Self-management of internet-scale distributed systems
* Self-management for big data infrastructures
* Monitoring and diagnostics in data management systems
* Policy automation and visualization for datacenter administration
* User acceptance and trust of autonomic capabilities
* Evaluation criteria and benchmarks for self-managing systems
* Self-evaluation of data management services in the cloud
* Use cases and war stories on deploying autonomic capabilities


Authors are invited to submit original research contributions in English of up to 6 pages in the IEEE camera-ready format (templates are available at the ICDE 2020 submission guidelines page) to the submission site https://cmt3.research.microsoft.com/SMDB2020. Authors of accepted papers will be encouraged to submit an extended paper of up to 8 pages for final publication. Authors are also invited to submit short papers of up to 4 pages. All accepted papers will appear in the formal Proceedings of the Conference Workshops published by IEEE CS Press, and will be included in the IEEE digital library.

Paper submission deadline:

January 6, 2020 5pm PST (abstract, optional)
January 13, 2020 5pm PST, extended to January 23, 2020 5pm PST (full & short)


February 13, 2020


February 28, 2020



Panos K. Chrysanthis

Professor, Computer Science Department
University of Pittsburgh

Meichun Hsu

Sr. Director of R&D Database Server Technology
Oracle Corporation

Herodotos Herodotou

Assistant Professor, Dept. of Electrical Eng., Computer Eng. and Informatics
Cyprus University of Technology

Yingjun Wu

Software Development Engineer, Redshift Team
Amazon Web Services

Constantinos Costa

Research Associate, Computer Science Department
University of Pittsburgh


  • Alex Delis, University of Athens, Greece
  • Alkis Polyzotis, Google, USA
  • Alkis Simitsis, Microfocus, USA
  • Andreas Kipf, TU Munich, Germany
  • Bailu Ding, Microsoft Research, USA
  • Deepak Majeti, Vertica/MicroFocus, USA
  • Eduardo Cunha de Almeida, Federal University of Paraná, Brazil
  • Jeff Lefevre, UCSC, USA
  • Jiaheng Lu, University of Helsinki, Finland
  • Kai-Uwe Sattler, Ilmenau University of Technology, Germany
  • Khuzaima Daudjee, University of Waterloo, Canada
  • Matthias Sax, Confluent, USA
  • Ryan Marcus, MIT, USA
  • Sam Lightstone, IBM, Canada
  • Stefanie Scherzinger, OTH Regensburg, Germany
  • Vivek Narasayya, Microsoft Research, USA
  • Yao Lu, Microsoft Research, USA


Constantinos Costa, University of Pittsburgh, USA


Brian Nixon, University of Pittsburgh, USA


Times are displayed in CDT. Look up your local times: http://www.timebie.com/std/cdt.php.


Opening and Introductions


Keynote 1

Chair: Shimin Chen

Software Hardware Co-Design for Cloud Native Database Systems

Feifei Li, Vice President of Alibaba Group, Professor at University of Utah


Joint Invited Talk

Chair: Yingjun Wu

AI-native Database

Guoliang Li, Professor, Tsinghua University




Session 1

Chair: Herodotos Herodotou


Research Talk 1

Adaptive Distributed Partitioning in Apache Flink

Theodoros Toliopoulos (Aristotle University of Thessaloniki), Anastasios Gounaris (Aristotle University of Thessaloniki)


Research Talk 2

Towards Self-Adapting Data Migration in the Context of Schema Evolution in NoSQL Databases

Andrea Hillenbrand (Darmstadt University of Applied Sciences), Uta Störl (University of Applied Sciences Darmstadt), Maksym Levchenko (University of Applied Sciences Darmstadt), Shamil Nabiyev (Darmstadt University of Applied Sciences), Meike Klettke (Universität Rostock)


Research Talk 3

PatchIndex - Exploiting Approximate Constraints in Self-managing Databases

Steffen Kläbe (TU Ilmenau), Kai-Uwe Sattler (TU Ilmenau), Stephan Baumann (Actian Germany GmbH)


Research Talk 4

START – Self-Tuning Adaptive Radix Tree

Philipp Fent (TU München), Michael Jungmair (TU München), Andreas Kipf (TU München), Thomas Neumann (TU München)




Keynote 2

Chair: Meichun Hsu

AIOps with the Oracle Autonomous Database

Sandesh Rao, VP of Autonomous Health and Machine Learning, Oracle Autonomous Database Group


Lunch Break


Session 2

Chair: Constantinos Costa


Research Talk 5

Cost-Guided Cardinality Estimation: Focus Where it Matters

Parimarjan Negi (MIT), Ryan Marcus (MIT), Hongzi Mao (MIT CSAIL), Nesime Tatbul (Intel Labs and MIT), Tim Kraska (MIT), Mohammad


Research Talk 6

Online Index Selection Using Deep Reinforcement Learning for a Cluster Database

Seyedeh Zahra Sadri Tabaee (University of Oklahoma), Le Gruenwald (University of Oklahoma)


Coffee Break



Chair: Panos K. Chrysanthis

Business Meeting on the Future of TCDE Workgroup on SMDB




Software Hardware Co-Design for Cloud Native Database Systems

Feifei Li, Vice President of Alibaba Group


Cloud native database systems are becoming increasingly popular. They leverage the virtualized resource pool provided by the underlying cloud infrastructure to offer excellent elasticity, high availability, and scalability. Decoupling resource usage and management across the stack (e.g., compute and storage) is a critical step towards realizing cloud native properties. Software-hardware co-design plays an important role in this paradigm, for example through kernel bypassing, RDMA for shared distributed storage, FPGA acceleration, NVM for a tiered memory hierarchy, and TEE for secure and trustworthy compute, to name a few. This talk shares our experience and lessons learned from applying software-hardware co-design principles to building cloud native database systems.


Feifei Li is currently a Vice President of Alibaba Group, ACM Distinguished Scientist, President of the Database Products Business Unit of Alibaba Cloud, and director of the Database and Storage Lab of DAMO Academy. He is a tenured full professor at the School of Computing, University of Utah (on leave). He has won multiple awards from NSF, ACM, IEEE, Visa, Google, HP, Microsoft, IBM, etc. He is a recipient of the ACM SoCC 2019 Best Paper Award (runner-up), the IEEE ICDE 2014 10 Years Most Influential Paper Award, the ACM SIGMOD 2016 Best Paper Award, the ACM SIGMOD 2015 Best System Demonstration Award, and the IEEE ICDE 2004 Best Paper Award. He has served as an associate editor, PC co-chair, and core committee member for many prestigious journals and conferences.

AIOps with the Oracle Autonomous Database

Sandesh Rao, Oracle USA, Vice President, Cloud Diagnosability and RAC Assurance


Autonomous Database is one of the hottest Oracle products, and we have applied machine learning to several aspects of the service. We will cover use cases for finding and troubleshooting anomalies at a scale of several petabytes a year, using a Log Anomaly timeline built on semi-supervised machine learning techniques that reduce logs and match them in near real time. We will also cover how we detect changing workloads, use Z-scores to pinpoint faults, use time series analysis to find good times for backups or maintenance, and build models for detecting performance tuning issues and for root cause analysis, as well as fleet learning, which applies knowledge of trends and issues across multiple symptoms affecting the fleet, including rediscovery. We will cover examples, code where applicable, and the frameworks we use for this.
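
To illustrate the Z-score idea mentioned in the abstract, here is a minimal sketch. It is not taken from the talk or from any Oracle product: the function name `zscore_outliers`, the threshold, and the sample latency values are all hypothetical and chosen only for illustration. The sketch flags samples that lie more than a chosen number of standard deviations from the mean.

```python
from statistics import mean, pstdev


def zscore_outliers(values, threshold=3.0):
    """Return (index, z) pairs whose absolute Z-score exceeds threshold.

    Hypothetical helper for illustration; uses the population standard
    deviation of the whole sample.
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All samples are identical, so nothing can be an outlier.
        return []
    flagged = []
    for i, v in enumerate(values):
        z = (v - mu) / sigma
        if abs(z) > threshold:
            flagged.append((i, z))
    return flagged


# Made-up latency samples (milliseconds) with one obvious spike at index 8.
latencies_ms = [12.0, 11.5, 12.3, 11.9, 12.1, 12.2, 11.8, 12.0, 95.0, 12.1]
flagged = zscore_outliers(latencies_ms, threshold=2.5)
print([i for i, _ in flagged])
```

With a small sample, a single extreme value also inflates the standard deviation, so its Z-score stays bounded. That is why this sketch uses a threshold of 2.5 rather than the customary 3; a production system would more likely compute the baseline from a separate window of known-good data.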


Sandesh Rao is a VP running AIOps Automation for the Autonomous Database Group at Oracle Corporation, specializing in using AI/ML for use cases ranging from predicting faults before they happen to anomaly detection within log and metrics data. His previous positions have focused on performance tuning, high availability, disaster recovery, and architecting cloud-based solutions using the Oracle Stack. With more than 18 years of experience working in the HA space and having worked on several versions of Oracle with different application stacks, he is a recognized expert in RAC, Database Internals, PaaS, SaaS, and IaaS solutions and in solving Big Data related problems. Most of his work involves working with customers on the implementation of public and hybrid cloud projects in the financial, retail, scientific, insurance, biotech, and tech sectors. He is also responsible for developing best-practice assessments for the Oracle Grid Infrastructure 19c, including products like RAC (Real Application Clusters) and Storage (ASM, ACFS). More details: https://bit.ly/1UCL46K


AI-native database

Guoliang Li, Tsinghua University


In the big data era, database systems face three challenges. First, traditional heuristics-based optimization techniques (e.g., cost estimation, join order selection, knob tuning) cannot meet the high-performance requirements of large-scale data, varied applications, and diversified data. We can design learning-based techniques to make databases more intelligent. Second, many database applications require AI algorithms, e.g., image search in a database. We can embed AI algorithms into the database, utilize database techniques to accelerate AI algorithms, and provide AI capabilities inside databases. Third, traditional databases focus on general-purpose hardware (e.g., CPUs) and cannot fully utilize new hardware (e.g., AI chips). Moreover, besides the relational model, we can utilize the tensor model to accelerate AI operations. Thus, we need to design new techniques to make full use of new hardware. To address these challenges, we design an AI-native database. On the one hand, we integrate AI techniques into databases to provide self-configuring, self-optimizing, self-healing, self-protecting, and self-inspecting capabilities. On the other hand, we enable databases to provide AI capabilities through declarative languages, in order to lower the barrier to using AI. In this talk, I will introduce the five levels of AI-native databases and present the open challenges of designing an AI-native database. I will also use automatic database knob tuning, a deep reinforcement learning based optimizer, machine-learning based cardinality estimation, and an automatic index/view advisor as examples to showcase the advantages of AI-native databases.


Guoliang Li is a full Professor in the Department of Computer Science, Tsinghua University, Beijing, China. His research interests include AI-native databases, big data analytics and mining, crowdsourced data management, big spatio-temporal data analytics, and large-scale data cleaning and integration. He has published more than 100 papers in premier conferences and journals, such as SIGMOD, VLDB, ICDE, SIGKDD, SIGIR, TODS, VLDB Journal, and TKDE. He will be the General Co-chair of SIGMOD 2021 and Demo Chair of VLDB 2021. He serves as an associate editor for IEEE Transactions on Knowledge and Data Engineering, VLDB Journal, ACM Transactions on Data Science, and IEEE Data Engineering Bulletin. He has received several best paper awards at top conferences, such as the CIKM 2017 best paper award, ICDE 2018 best paper candidate, KDD 2018 best paper candidate, DASFAA 2014 best paper runner-up, and APWeb 2014 best paper award. He received the VLDB Early Research Contribution Award in 2017 and the IEEE TCDE Early Career Award in 2014.