[DB Talk][DB Seminar] CMU/Pitt Joint DB Monthly Meetup

Date: Wednesday, February 19th @ 4:30 pm
Location: Board Room, Room 6329, 6th floor Sennott Square Building, Pitt Campus
1st Speaker: Danai Koutra, PhD CMU CS
2nd Speaker: Roxana Gheorghiu, PhD Pitt CS
 

1st Speaker: Danai Koutra, PhD CMU CS
Title: BiG-Align: Fast Bipartite Graph Alignment
Abstract:
How can we find the virtual twin (i.e., the same or a similar user) on LinkedIn for a user on Facebook? How can we effectively link an information network with a social network to support cross-network search? Graph alignment, the task of finding the node correspondences between two given graphs, is a fundamental building block in numerous application domains, such as social network analysis, bioinformatics, chemistry, and pattern recognition.

In this work, we focus on aligning bipartite graphs, a problem that has been largely ignored by the extensive existing work on graph matching, despite the ubiquity of such graphs (e.g., user-group networks). We introduce a new optimization formulation and propose an effective and fast algorithm to solve it. We also propose a fast generalization of our approach to align unipartite graphs. Extensive experimental evaluations show that our method outperforms state-of-the-art graph matching algorithms in both alignment accuracy and running time, being up to 10× more accurate or 174× faster on real graphs. This is joint work with Hanghang Tong (CUNY) and David Lubensky (IBM).

2nd Speaker: Roxana Gheorghiu, PhD Pitt CS
Title: Towards a Practical Database Preference Model
Abstract:
In this work, we present a theoretical framework and a practical system that supports both qualitative and quantitative preferences at the same time. Our integrated system is used to hold, manipulate, and utilize preferences, stored in user profile preference graphs, in order to make a query result more relevant to users’ needs. One key contribution of our proposed system is the ability to support and combine intensity values for both types of preferences, which allows us to detect the most preferred tuples based not only on how many preferences match a tuple but also on the combined intensity value. We use Neo4j as our system’s storage engine, and we test the system using real data extracted from DBLP. Our results demonstrate that intensity values play a key role in determining the most relevant set of tuples in the query result.