Theodoros (Theo) Rekatsinas
Post-doc, Stanford University
DB faculty candidate


Bio

Theodoros (Theo) Rekatsinas is a Moore Data Postdoctoral Fellow at Stanford, working with Christopher Ré. He earned his Ph.D. in Computer Science from the University of Maryland, where he was advised by Amol Deshpande and Lise Getoor. His research interests are in data management, with a focus on data integration, data cleaning, and uncertain data. Theo's work on using quality-aware data integration techniques to forecast the emergence and progression of disease outbreaks received the Best Paper Award at SDM 2015. Theo was awarded the Larry S. Davis Doctoral Dissertation Award in 2015.

Talk Information

  • Data Integration with Unreliable Sources
  • Thursday, 2/9/17, 3:20pm-5:00pm
  • SCI Conference Room (WEB 3780)

  • Abstract
    Data integration is an essential element of data-intensive science and modern analytics. Users often need to combine data from different sources to gain new scientific knowledge, obtain accurate insights, and create new services. However, today's upsurge in the number and heterogeneity (in terms of format and reliability) of data sources limits the ability of users to reason about the value of data. This raises fundamental questions: what makes a data source useful to end users, how can we integrate unreliable data, and which sources do we need to combine to maximize the user's utility? In this talk, I discuss how to assess and leverage the quality and reliability of data to make data integration more efficient. Specifically, I demonstrate how statistical learning is the key to managing large volumes of heterogeneous sources effectively. Building upon this observation, I introduce new solutions to classical data integration problems, such as data conflict resolution and data cleaning, and show that these solutions outperform their traditional counterparts by large margins. I finish with an outlook on how recent advancements in machine learning have the potential to streamline the construction of end-to-end data curation systems and bring data closer to users.

    Schedule

    Time | Plan | Appointment | Location
    2017-02-09 08:30:00 | Breakfast | Matthew Flatt |
    2017-02-09 10:00:01 | SoC Director | Ross Whitaker | Director's Office (MEB 3190)
    2017-02-09 10:30:00 | 1-1 | Jeff Phillips | MEB 3442
    2017-02-09 11:00:00 | 1-1 | Rajeev Balasubramonian | MEB 3414
    2017-02-09 11:30:00 | 1-1 | Aditya Bhaskara | MEB 3120
    2017-02-09 12:00:00 | Lunch | Tom Fletcher |
    2017-02-09 13:30:00 | 1-1 | Tucker Hermans | MEB 3112
    2017-02-09 14:00:00 | 1-1 | John Regehr | MEB 3470
    2017-02-09 14:30:00 | | |
    2017-02-09 15:00:00 | talk prep | |
    2017-02-09 15:20:00 | Talk | | SCI Conference Room (WEB 3780)
    2017-02-09 18:45:00 | Dinner | Ryan, Jeff, ... |

    Time | Plan | Appointment | Location
    2017-02-10 08:00:00 | Breakfast | Sneha Kasera |
    2017-02-10 09:30:00 | CoE Dean | Richard Brown | Dean's Office (WEB 1650)
    2017-02-10 10:00:00 | 1-1 | Feifei Li | WEB 2692
    2017-02-10 10:30:00 | 1-1 | Jason Wiese | MEB 3114
    2017-02-10 11:00:00 | 1-1 | Vivek Srikumar | MEB 3126
    2017-02-10 11:30:00 | 1-1 | Suresh Venkatasubramanian | MEB 3404
    2017-02-10 12:00:00 | Lunch with students | | Meet outside SoC office (MEB 3190)
    2017-02-10 13:30:00 | 1-1 | Alex Lex | WEB 3887
    2017-02-10 14:00:00 | 1-1 | Ellen Riloff | MEB 3140
    2017-02-10 14:30:00 | Roundtable | grad students | Graduate Lounge (MEB 3429)
    2017-02-10 15:30:00 | 1-1 | Ryan Stutsman | MEB 3436
    2017-02-10 16:00:00 | Meet at Director's Office | Mike Kirby | Director's Office (MEB 3190)
    2017-02-10 16:30:00 | | |
    2017-02-10 18:00:00 | Dinner | Feifei, Alex, John R |
