Management Workshop: Perils of bias and scarcity: Overcoming challenges in Political Ideology Prediction from text data

Published: 2023-11-17 12:00

Perils of bias and scarcity: Overcoming challenges in Political Ideology Prediction from text data

Time: 10:00, November 17, 2023

Venue: Room 333, Chengze Garden (承泽园)

Speaker: Chen Chen

Abstract:

Political Ideology Prediction (PIP) from text data is pivotal in policy evaluation, online marketing, and understanding firm strategy. However, the development of Machine Learning (ML) models for PIP has faced crucial challenges: sparse self-reported labels, selection bias, and label bias, the last characterized by systematic distortion of observed labels from the ground truth. Together, these issues have severely limited the application of advanced ML algorithms such as LLMs to PIP from text. To address them, we designed two ML artifacts. The first artifact addresses sampling issues by decomposing document embeddings into a linear combination of a latent neutral context vector and a latent position vector. This semi-supervised model, which predicts ideology solely from position vectors, significantly outperforms the SoTAs in accuracy, even with as little as 5% biased data. The second artifact, designed to address label bias, is based on a kernel of Mixture of Theories. Preliminary results show that it adapts universally across various contexts and aligns with most currently identified causes of bias, demonstrating promising potential for improving PIP.

 

About the Speaker

Dr. Chen Chen is an Assistant Professor in the area of Information Systems at The Chinese University of Hong Kong, Shenzhen. After graduating from Tsinghua University, he went on to obtain his first Ph.D. in Molecular Cancer Biology from Duke University in 2014 and his second Ph.D. in Management from Boston University in 2020. His primary research interests include 1) using advanced deep learning algorithms and language models to decipher how human beings interact, behave, and make decisions in both virtual and real communities; 2) understanding the dynamics of patient-doctor interactions on online healthcare platforms; 3) knowledge engineering and knowledge interpolation/extrapolation via knowledge graphs and Graph Neural Networks; 4) AI augmentation and its implications for management, AI alignment and governance, and corporate AI strategy and its impact on personnel turnover. Dr. Chen has numerous publications in top-tier journals and computer science conferences, including ISR, PNAS, Nature Cell Biology, and ACL proceedings.

