

During the last forty years, data management systems have grown in scale, complexity, and number of installations, while the workloads they serve have become more diverse and demanding. Current trends like cloud computing make this situation even more challenging for service providers, who have to configure and manage thousands of database nodes while ensuring that service level agreements are met.
There has been a significant amount of research addressing these issues by providing autonomic or self-* features in database systems to support complex administrative tasks, such as physical database design, problem diagnosis, and performance tuning, as well as to optimize the operations of database components such as the query optimizer and the execution engine. However, new challenges arise from trends like cloud and cluster computing, virtualization, and Software-as-a-Service (SaaS). A major challenge is the need to scale self-management capabilities to the level of hundreds to thousands of nodes while considering economic factors.
Autonomic, or self-managing, systems are a promising approach to achieve the goal of systems that are easier to use and maintain. A system is considered autonomic if it possesses the capabilities to be self-configuring, self-optimizing, self-healing and self-protecting. The aim of the SMDB workshop is to provide a forum for researchers from both industry and academia to present and discuss ideas related to self-management and self-organization in data management systems ranging from classical databases to data stream engines to large-scale cloud environments that utilize advanced AI, machine learning, and data mining and analysis.
We plan to follow the successful format of previous instances of this workshop: approximately eight presentations of accepted papers, a keynote address by a well-known speaker and subject matter expert in self-managing database systems, and a panel discussion involving experts from industry and academia. In the last two years, SMDB also featured joint keynote addresses with the Joint International Workshop on Big Data Management on Emerging Hardware and Data Management on Virtualized Active Systems (HardBD&Active). Furthermore, in previous years, the best papers presented at SMDB and HardBD&Active were invited for extended submissions to a special issue of the Distributed and Parallel Databases (DAPD) journal under the theme “Self-Managing and Hardware-Optimized Database Systems.”




Topics of interest include, but are not limited to:

* Principles and architecture of autonomic data management systems
* Retro-fitting existing systems vs. designing for self management
* Self-* capabilities in databases and storage systems
* Data management in cloud and multi-tenant databases
* Autonomic capabilities in database-as-a-service platforms
* Automated testing of data management systems
* Automated physical database design and adaptive query tuning
* Automated provisioning and integration
* Automatic enforcement of information quality
* Robust query processing techniques
* Self-managing database components (e.g., query optimizer, execution engine)
* Self-managing data stream engines and adaptive event-based systems
* Self-managing distributed / decentralized / peer-to-peer information systems
* Self-management of internet-scale distributed systems
* Self-management for big data infrastructures
* Monitoring and diagnostics in data management systems
* Policy automation and visualization for datacenter administration
* User acceptance and trust of autonomic capabilities
* Evaluation criteria and benchmarks for self-managing systems
* Self-evaluation of data management services in the cloud
* Use cases and war stories on deploying autonomic capabilities


Authors are invited to submit original research contributions in English of up to 6 pages in the IEEE camera-ready format (templates are available at the ICDE 2023 submission guidelines page) to the submission site https://cmt3.research.microsoft.com/SMDB2023. Authors of accepted papers will be encouraged to submit an extended paper of up to 8 pages for final publication. Authors are also invited to submit short papers of up to 4 pages. The page limit includes the bibliography and any appendix. All accepted papers will appear in the formal Proceedings of the Conference Workshops published by IEEE CS Press, and will be included in the IEEE digital library.

Authors of a selection of accepted papers will be invited to submit an extended version to the Distributed and Parallel Databases (DAPD) journal.


Paper submission deadlines:

Paper Abstract (optional):

January 04, 2023 (Wednesday) 5pm PST

Paper Submission:

January 30, 2023 (Monday) 5pm PST (extended from January 11, 2023)

Notification of acceptance:

February 20, 2023 (Monday) (moved from February 01, 2023)


February 28, 2023 (Tuesday) 5pm PST



Herodotos Herodotou

Assistant Professor, Dept. of Electrical Eng., Computer Eng. and Informatics
Cyprus University of Technology

Yingjun Wu

Founder and CEO,
Singularity Data Inc

Constantinos Costa

Founder and CEO,
Rinnoco Ltd

Bailu Ding

Principal Researcher, Microsoft Research
Redmond, USA

Demetris Trihinas

Lecturer, Computer Science Department
University of Nicosia



Panos K. Chrysanthis

Professor, Computer Science Department
University of Pittsburgh

Meichun Hsu

Sr. Director of R&D Database Server Technology
Oracle Corporation


  • Alexandros Labrinidis, University of Pittsburgh, USA
  • Alkis Simitsis, Athena Research Center, Greece
  • Anshuman Dutt, Microsoft Research, USA
  • Bo Tang, Southern University of Science and Technology, China
  • Danica Porobic, Oracle, USA
  • Chunwei Liu, MIT, USA
  • Deepak Majeti, Ahana, USA
  • Dimitrios Georgakopoulos, Swinburne University of Technology, Australia
  • Eduardo Cunha de Almeida, Federal University of Paraná, Brazil
  • Evaggelia Pitoura, University of Ioannina, Greece
  • George Pallis, University of Cyprus, Cyprus
  • Jeff LeFevre, UCSC, USA
  • Jiaheng Lu, University of Helsinki, Finland
  • Kai-Uwe Sattler, TU Ilmenau, Germany
  • Khuzaima Daudjee, University of Waterloo, Canada
  • Le Gruenwald, University of Oklahoma, USA
  • Matthias J. Sax, Confluent Inc., USA
  • Mohamed A. Sharaf, United Arab Emirates University, UAE
  • Nikos Katsipoulakis, Snowflake, USA
  • Rebecca Taft, Cockroach DB, USA
  • Ryan Marcus, MIT, USA
  • Uta Störl, University of Hagen, Germany
  • Vincenzo Gulisano, Chalmers University of Technology, Sweden
  • Yongluan Zhou, University of Copenhagen, Denmark


Brian T. Nixon, University of Pittsburgh, USA

Rakan A. Alseghayer, University of Pittsburgh, USA


Accepted Papers

  • Towards Evaluating Stream Processing Autoscalers
    George Siachamis (TU Delft)*; Job Kanis (TU Delft); Wybe Koper (TU Delft); Kyriakos Psarakis (TU Delft); Marios Fragkoulis (Delivery Hero); Arie Van Deursen (Delft University of Technology); Asterios Katsifodimos (TU Delft)
  • Towards a Signature Based Compression Technique for Big Data Storage
    Constantinos Costa (Rinnoco Ltd)*; Panos K. Chrysanthis (Rinnoco Ltd); Marios Costa (Rinnoco Ltd); Efstathios Stavrakis (Algolysis); Nicolas Nicolaou (Algolysis)
  • Quantum Annealing Method for Dynamic Virtual Machine and Task Allocation in Cloud Infrastructures from Sustainability Perspective
    Valter Uotila (University of Helsinki)*; Jiaheng Lu (University of Helsinki)
  • A Self-managed Marketplace for Sharing IoT Sensors
    Anas Dawod (Swinburne University of Technology)*; Dimitrios Georgakopoulos (Swinburne University of Technology, Australia); Prem P. Jayaraman (Swinburne University of Technology, Australia); Panos K. Chrysanthis (University of Pittsburgh)

Times are displayed in California, USA. Look up your local times: https://time.is/.
Time (California, USA) Session Chairs Room  
8:30-8:40 Herodotos Herodotou / Yingjun Wu [Platinum 4]

Opening and Introductions

General Chairs: Herodotos Herodotou & Yingjun Wu

8:40-10:00 Bailu Ding / Yingjun Wu [Platinum 4]

Research Talk 1

Towards a Signature Based Compression Technique for Big Data Storage
Constantinos Costa (Rinnoco Ltd), Panos K. Chrysanthis (Rinnoco Ltd), Marios Costa (Rinnoco Ltd), Efstathios Stavrakis (Algolysis), Nicolas Nicolaou (Algolysis)

Research Talk 2

Quantum Annealing Method for Dynamic Virtual Machine and Task Allocation in Cloud Infrastructures from Sustainability Perspective
Valter Uotila (University of Helsinki), Jiaheng Lu (University of Helsinki)

Coffee Break

10:30-12:30 Panos K. Chrysanthis [Platinum 4]

Keynote 1

From self-managed database systems to runtime-intelligent analytics
Anastasia Ailamaki, EPFL and Google, Inc.

Keynote 2

Self-Managing Database Capabilities in SQL Azure
Hanuma Kodavalla, Technical Fellow, Microsoft


14:00-15:00 Yingjun Wu [Platinum 4]

Research Talk 3

Towards Evaluating Stream Processing Autoscalers
George Siachamis (TU Delft), Job Kanis (TU Delft), Wybe Koper (TU Delft), Kyriakos Psarakis (TU Delft), Marios Fragkoulis (Delivery Hero), Arie Van Deursen (Delft University of Technology), Asterios Katsifodimos (TU Delft)

Research Talk 4

A Self-managed Marketplace for Sharing IoT Sensors
Anas Dawod (Swinburne University of Technology), Dimitrios Georgakopoulos (Swinburne University of Technology, Australia), Prem P. Jayaraman (Swinburne University of Technology, Australia), Panos K. Chrysanthis (University of Pittsburgh)

15:00-15:30 Panos K. Chrysanthis [Platinum 4]

Invited Talk

Data Systems in Academia and Industry: Lessons Learned and Insights Gained
Yingjun Wu, Founder and CEO, Singularity Data Inc

Coffee Break

16:00-18:00 Tianzheng Wang [Platinum 3, HardBD&Active Room]

Keynote 3

The Data Systems Grammar
Stratos Idreos, Harvard

Keynote 4

A Composable Era for Data Management
Pedro Pedreira, Meta




From self-managed database systems to runtime-intelligent analytics

Anastasia Ailamaki, EPFL and Google, Inc.


Self-managed database systems have been successful in addressing various challenges such as database tuning, optimization, and maintenance. However, the growth and heterogeneity of data, hardware, and applications reveal limitations which can only be addressed using real-time adaptive query engines. Just-in-time query execution using code generation presents a promising solution which enables efficient processing of complex queries while minimizing overhead and maintenance costs and allowing databases to be more dynamic, adaptive, and efficient. This talk presents an overview of approaches to just-in-time query execution including dynamic query planning, adaptive caching, self-optimized data pipelines, and machine learning-based techniques. We discuss the benefits and challenges of these approaches, as well as their practical applications.


Anastasia Ailamaki

Anastasia Ailamaki is a Professor of Computer and Communication Sciences at the École Polytechnique Fédérale de Lausanne (EPFL) in Switzerland, as well as the co-founder and Chair of the Board of Directors of RAW Labs SA, a Swiss company developing systems to analyze heterogeneous big data from multiple sources efficiently. She earned a Ph.D. in Computer Science from the University of Wisconsin-Madison in 2000. She has received the 2019 ACM SIGMOD Edgar F. Codd Innovations Award and the 2020 VLDB Women in Database Research Award. She is also the recipient of an ERC Consolidator Award (2013), the Finmeccanica endowed chair from the Computer Science Department at Carnegie Mellon (2007), a European Young Investigator Award from the European Science Foundation (2007), an Alfred P. Sloan Research Fellowship (2005), an NSF CAREER award (2002), and twelve best-paper awards in international scientific conferences. She has received the 2018 Nemitsas Prize in Computer Science from the President of Cyprus and the 2021 ARGO Innovation Award from the President of the Hellenic Republic. She is an ACM fellow, an IEEE fellow, a member of the Academia Europaea, and an elected member of the Swiss, the Belgian, the Greek, and the Cypriot National Research Councils.

Self-Managing Database Capabilities in SQL Azure

Hanuma Kodavalla, Technical Fellow, Microsoft


SQL Azure is a Database Platform as a Service offering from Microsoft that manages more than 11 million databases worldwide in all geographies. Managing at this scale requires automating many aspects of running a database system. This talk describes how SQL Azure automatically allocates required resources efficiently based on the workload, detects and recovers from query plan regressions, protects against security attacks, recovers from hardware defects, fine-tunes the locking mechanism based on the concurrent workload, reduces recovery time and failover impact, protects against disasters such as power outages, fires, and floods, manages storage and indexes, and deploys new versions of infrastructure and database software with minimum disruption. The talk also covers the monitoring and debugging facilities, some of the lessons learnt, and the challenges that remain in making a large and growing fleet of databases self-managing.


Hanuma Kodavalla

Hanuma Kodavalla is a Technical Fellow in the Azure Databases group at Microsoft, where he has been for twenty years. He previously worked at Data General, Digital Equipment Corporation, Oracle, Sybase and Asera. For more than three decades, Hanuma has worked on many aspects of Relational Database Systems and has been instrumental in architecting multiple commercial database systems for high performance and high availability. Hanuma received a BTech in Electronics and Communications in 1981 from National Institute of Technology, Warangal, India, an MTech in Computer Science in 1983 from Indian Institute of Technology, Chennai, India, and an MS in Computer Science in 1988 from University of Massachusetts, Amherst, USA. He has a few publications in database conferences and many patents related to novel implementation techniques for online transaction processing and data warehousing in the areas of concurrency control, recovery, high-availability, query processing and security.

The Data Systems Grammar

Stratos Idreos, Harvard


Data systems are everywhere. A data system is a collection of data structures and algorithms working together to achieve complex data processing tasks. For example, with data systems that utilize the correct data structure design for the problem at hand, we can reduce the monthly bill of large-scale data applications on the cloud by hundreds of thousands of dollars. We can accelerate data science tasks by dramatically speeding up the computation of statistics over large amounts of data. We can train drastically more neural networks within a given time budget, improving accuracy. However, knowing the right data system design for any given scenario is a notoriously hard problem; there is a massive space of possible designs, while no single design is perfect across all data, queries, and hardware contexts. In addition, building a new system may take several years for any given (fixed) design. We will discuss our quest for the first principles of data system design. We will show that it is possible to reason about this massive design space. This allows us to create a self-designing data system that can take drastically different shapes to optimize for the workload, hardware, and available cloud budget using a grammar for data systems. These shapes include data structures, algorithms, and overall system designs which are discovered automatically and do not (always) exist in the literature or industry, yet they can be more than 10x faster.


Stratos Idreos

Stratos Idreos is an associate professor of Computer Science at Harvard University, where he leads the Data Systems Laboratory. His research focuses on making it easy and even automatic to design workload and hardware-conscious data structures and data systems with applications on relational, NoSQL, and data science problems. For his Ph.D. thesis on adaptive indexing, Stratos was awarded the 2011 ACM SIGMOD Jim Gray Doctoral Dissertation award and the 2011 ERCIM Cor Baayen award from the European Research Consortium for Informatics and Mathematics. In 2015 he was awarded the IEEE TCDE Rising Star Award from the IEEE Technical Committee on Data Engineering for his work on adaptive data systems, and in 2022 he received the ACM SIGMOD Test of Time award for the NoDB concept. Stratos is also a recipient of the National Science Foundation Career award and the Department of Energy Early Career award. Stratos was PC Chair of ACM SIGMOD 2021 and IEEE ICDE 2022; he is the founding editor of the ACM/IMS Journal of Data Science and the chair of the ACM SoCC Steering Committee. Finally, Stratos received the 2020 ACM SIGMOD Contributions award for his work on reproducible research.

A Composable Era for Data Management

Pedro Pedreira, Meta


The requirement for specialization in data management systems has evolved faster than our software development practices. After decades of organic growth, this situation has created a siloed landscape composed of hundreds of products developed and maintained as monoliths, with limited reuse between systems. This fragmentation has often forced us to reinvent the wheel, impacted our end users through SQL API inconsistencies, and ultimately slowed down innovation. In this talk, I will describe how the increasing popularity of open source projects aimed at standardizing different layers of the stack is changing how data management systems are developed, and outline a novel reference modular architecture. I will also discuss experiences with Velox and on componentizing one of the largest data warehouses in the world with the Shared Foundations effort at Meta, hoping to foster collaboration, motivate further research, and promote a more composable future for data management.


Pedro Pedreira

Pedro Pedreira is a Software Engineer at Meta. In his 10-year tenure, he has led a series of Data Infrastructure projects aimed at unifying and consolidating fragmented data management stacks. Currently, Pedro leads the Velox program, which is an effort at unifying execution engines into an open-source library, spanning more than a dozen engines within Meta and beyond. In the past, he worked on log analytics engines (such as Scuba), and created Cubrick, an in-memory analytical DBMS. Pedro holds a PhD and an MS in Computer Science from the Federal University of Paraná in Brazil.