Multi-characteristic Subject Selection from Biased Datasets

DocUID: 2020-014

Authors: Tahereh Arabghalizi, Alexandros Labrinidis

Abstract: Subject selection plays a critical role in experimental studies, especially ones with human subjects. Anecdotal evidence suggests that many such studies, done at or near university campus settings, suffer from selection bias, i.e., the too-many-college-kids-as-subjects problem. Unfortunately, traditional sampling techniques, when applied over biased data, will typically return biased results. In this paper, we tackle the problem of multi-characteristic subject selection from biased datasets. We present a constrained optimization-based method that finds the best possible sampling fractions for the different population subgroups, based on the desired sampling fractions provided by the researcher running the subject selection. We perform an extensive experimental study, using a variety of real datasets. Our results show that our proposed method outperforms the baselines for all problem variations by up to 90%.
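
To illustrate the kind of problem the abstract describes, the following is a minimal sketch and not the paper's formulation. It assumes a least-squares objective: pick per-subgroup counts as close as possible to the researcher's desired fractions, subject to the counts summing to the sample size and never exceeding the subjects actually available in each subgroup. The function name `capped_allocation`, the subgroup labels, and all numbers are hypothetical, and the counts are left fractional (rounding to whole subjects is omitted).

```python
def capped_allocation(desired, available, n, iters=200):
    """Least-squares allocation of n subjects across subgroups (illustrative only).

    Minimises sum((x_g - n * desired[g]) ** 2) subject to
    sum(x_g) == n and 0 <= x_g <= available[g].
    Assumes desired fractions are non-negative and sum to 1.
    Returns the achieved fraction of the sample per subgroup.
    """
    if set(desired) != set(available):
        raise ValueError("desired and available must cover the same subgroups")
    if n <= 0 or n > sum(available.values()):
        raise ValueError("n must be positive and no larger than the dataset")

    target = {g: n * desired[g] for g in desired}

    def clipped(lam):
        # Optimality conditions give x_g = clip(target_g + lam, 0, available_g).
        return {g: min(max(target[g] + lam, 0.0), float(available[g]))
                for g in target}

    # At lo every count clips to 0; at hi every count clips to its cap,
    # so the shift that makes the counts sum to n lies in [lo, hi].
    lo, hi = -max(target.values()), float(max(available.values()))
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if sum(clipped(mid).values()) < n:
            lo = mid
        else:
            hi = mid

    counts = clipped(hi)
    return {g: counts[g] / n for g in counts}


# Hypothetical biased dataset: students are heavily over-represented.
desired = {"student": 0.30, "staff": 0.30, "resident": 0.40}
available = {"student": 500, "staff": 20, "resident": 60}
fractions = capped_allocation(desired, available, n=100)
```

In this made-up example only 20 staff members are available, so the staff fraction is capped at about 0.20 and the shortfall is spread evenly over the other two subgroups, giving roughly 0.35 for students and 0.45 for residents. A different objective or extra constraints would yield different fractions.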

Keywords: subject selection, biased data, human subjects, user study

Published In: arXiv

Year Published: 2020

DOI: https://doi.org/10.48550/arXiv.2012.10311

Project: PittSmartLiving

Subject Area: Others, Data Exploration, Quantitative Methods

Publication Type: Others

Sponsor: NSF CNS-1739413

Citation: Tahereh Arabghalizi and Alexandros Labrinidis. Multi-characteristic Subject Selection from Biased Datasets. arXiv. 2020. DOI: https://doi.org/10.48550/arXiv.2012.10311.