Abstract
This study investigates how IP location disclosure affects online hostility, using Weibo as the main platform for analysis. A fine-tuned Chinese RoBERTa model jointly classifies hate speech and sentiment in a dataset of approximately 100,000 user posts. The resulting labels serve as outcome variables in a regression discontinuity design (RDD) that evaluates behavioral shifts before and after the policy took effect. The analysis finds that the policy led to a statistically significant decline in hate speech, reducing its probability by approximately 3.2 percentage points in the immediate post-policy period. Heterogeneity analysis shows that this effect is particularly pronounced in posts involving regional identity. The results are robust across model specifications and sampling windows, and validation exercises show high consistency between model predictions and human-coded labels. Taken together, the findings suggest that identity-based transparency can reduce online hostility by increasing perceived accountability, contributing to ongoing discussions about platform governance and social norms.
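To make the RDD step concrete, the following is a minimal sketch of how a discontinuity in a binary hate-speech label could be estimated at a policy date. It is not the study's implementation. The abstract does not specify the running variable, bandwidth, or estimator, so this sketch assumes the running variable is days relative to the policy date, a uniform kernel, and a linear fit with separate slopes on each side. The function name `rdd_jump` and all simulated numbers are hypothetical.

```python
import numpy as np


def rdd_jump(days, outcome, bandwidth):
    """Sharp RDD estimate of the jump in `outcome` at days == 0.

    Fits y = a + tau*post + b*days + c*post*days by ordinary least squares
    on observations with |days| <= bandwidth. The coefficient tau is the
    estimated discontinuity at the cutoff.
    """
    days = np.asarray(days, dtype=float)
    outcome = np.asarray(outcome, dtype=float)
    keep = np.abs(days) <= bandwidth
    x, y = days[keep], outcome[keep]
    post = (x >= 0).astype(float)  # 1 for posts on or after the policy date
    design = np.column_stack([np.ones_like(x), post, x, post * x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1]


# Synthetic illustration only: every number below is made up.
rng = np.random.default_rng(0)
days = rng.integers(-30, 30, size=200_000)  # days relative to the policy date
p_hate = 0.10 + 0.0005 * days - 0.03 * (days >= 0)  # assumed true jump: -0.03
is_hate = rng.random(days.size) < p_hate  # stand-in for classifier labels
tau = rdd_jump(days, is_hate, bandwidth=30)
print(f"estimated jump at cutoff: {tau:.3f}")
```

In practice the outcome column would hold the classifier's labels rather than simulated draws, and the bandwidth and functional form would be varied, in the spirit of the robustness checks across specifications and sampling windows described above.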