[DB Talk][DB Seminar] CMU/Pitt Joint DB Monthly Meetup

Pitt/CMU Joint DB Monthly Meetup
Date: Wednesday, March 19th @ 4:30 pm
Location: Room GHC 6121, CMU Campus
1st Speaker: Nicholas L Farnan, PhD Pitt CS
2nd Speaker: Danai Koutra, PhD CMU CS

Nicholas L Farnan, PhD Pitt CS (webpage: http://people.cs.pitt.edu/~nlf4/)
Title: PAQO: Preference-Aware Query Optimization for Decentralized Database Systems

The declarative nature of SQL has traditionally been a major strength. Users simply state what information they are interested in, and the database management system determines the best plan for retrieving it. A consequence of this model is that should a user ever want to specify some aspect of how their queries are evaluated (e.g., a preference to read data from a specific replica, or a requirement for all joins to be performed by a single server), they are unable to do so. This can leave database administrators shoehorning evaluation preferences into database cost models. Further, for distributed database users, it can result in query evaluation plans that violate data handling best practices or the privacy of the user. To address such issues, we have developed a framework for declarative, user-specified constraints on the query optimization process and implemented it within PostgreSQL. Our Preference-Aware Query Optimizer (PAQO) upholds both strict requirements and partially ordered preferences that are issued alongside the queries that it processes. In this paper, we present the design of PAQO and thoroughly evaluate its performance.

Danai Koutra, PhD CMU CS (webpage: http://www.cs.cmu.edu/~dkoutra)
Title: Large Graph Mining and Sense-making

Given a large graph, like Facebook or Gmail, what can we say about its structure? Which are the most important structures in the graph? Are there many cliques, stars, and chains, or are there more complex structures? What do the structures reveal? Are there anomalies? How does the graph change over time? This thesis focuses on developing fast algorithms and models that promote the understanding of large graphs at different granularities: node and graph level. For this purpose we pursue two directions: similarity analysis, and pattern mining and modeling. For similarity analysis, we mainly concentrate on finding node affinities and roles, developing graph similarity approaches when the node correspondence is known or unknown, and using graph alignment to reveal similarities between nodes across different networks. For pattern mining, the goal is to provide the domain expert with a succinct summary of one or more graphs with billions of nodes and edges, and to spotlight the "important" and "interesting" (anomalous) graph patterns. For that reason, we model features in a variety of graphs, and also propose a graph summarization approach that unveils semantically interesting structures. Applications come from the domain of neuroscience, where we study brain connectivity graphs, as well as static or time-evolving social and collaboration networks (e.g., Twitter, Facebook, DBLP, IMDB, Enron, YahooWeb).