Hire an Illini

Dawei Zhou

  • Advisor:
    • Jingrui He
  • Departments:
  • Areas of Expertise:
    • Rare Category Analysis
    • Machine Learning
    • Data Mining
  • Thesis Title:
    • Harnessing Rare Category Trinity for Complex Data
  • Thesis abstract:
    • Despite the sheer volume of data being collected, it is often the rare categories that matter most in many high-impact domains, ranging from financial fraud detection in online transaction networks to emerging trend detection in social networks, and from spam image detection in social media to rare disease diagnosis in medical decision support systems. The unique challenges of rare category analysis include those associated with the nature of rare examples (i.e., rarity, non-separability, label scarcity) as well as emerging ones associated with complex data (i.e., heterogeneity, intangibility, and privacy). In my thesis, I aim to provide an interactive learning mechanism for rare category analysis on complex data. In particular, my research focuses on the following trinity of tasks for rare category analysis: (1) characterization (e.g., how to characterize the rare patterns with a compact representation?), (2) explanation (e.g., how to interpret the prediction results and provide relevant clues for the end-users?), and (3) generation (e.g., how to produce synthetic rare category examples that resemble the real ones?). Moreover, to address the aforementioned challenges, my research aims to develop a unified learning mechanism for rare category analysis on complex data, which integrates a variety of novel techniques, from rare category characterization for static data to rare category tracking for temporal data, from representing rare patterns in a salient embedding space to interpreting the prediction results and providing relevant clues for the end-users, and from mimicking the underlying distribution of rare categories to generating synthetic ones for data augmentation. In my future research, I will continue investigating the characteristics of rare category examples in complex data and try to enable domain adaptation for rare category analysis. Moreover, I plan to develop versatile generative models that are capable of label-informed and permutation-invariant rare example generation.

Contact information:
dzhou21@illinois.edu