Abstract

Interacting with millions of global users, Large Language Models (LLMs) such as ChatGPT may possess unprecedented capabilities for automated, personalized persuasion at scale. Are LLMs actually good at personalized persuasion? Do they become more persuasive when they personalize texts to match users' linguistic patterns? This research evaluates the personalized persuasion potential of LLMs when they learn to speak like users. We assess GPT-4's mimicry capability against mimicry naturally occurring in human interaction and evaluate the effectiveness of linguistically personalized persuasion through a survey experiment. Results show that GPT-4 is proficient in linguistic mimicry akin to humans, yet linguistically personalized arguments are not more effective than general ones. Our approach underscores the importance of domain expertise in eliciting model capabilities and constructing evaluations for governing frontier AI systems. Future research should advance domain-inspired evaluation methods to better inform policy decisions for AI labs and governments regarding model training and deployment.
