On 17 February 2025, the study by Zsófia Rakovics and Márton Rakovics, "Exploring the potential and limitations of large language models as virtual respondents for social science research", was published in the Text as Data special issue of the journal Intersections.
Abstract:
Social and linguistic differences encoded in various textual content available on the internet represent certain features of modern societies. For any scientific research which is interested in social differences mediated by language, the advent of large language models (LLMs) has brought new opportunities. LLMs could be used to extract information about different groups of society and utilized as data providers by acting as virtual respondents generating answers as such.
Using LLMs (GPT-variants, Llama2, and Mixtral), we generated virtual answers for politics and democracy related attitude questions of the European Social Survey (10th wave) and statistically compared the results of the simulated responses to the real ones. We explored different prompting techniques and the effect of different types and richness of contextual information provided to the models. Our results suggest that the tested LLMs generate highly realistic answers and are good at invoking the needed patterns from limited contextual information given to them if a couple of relevant examples are provided, but struggle in a zero-shot setting.
A critical methodological analysis is inevitable when considering the potential use of data generated by LLMs for scientific research; the exploration of known biases and reflection on social reality not represented on the internet are essential.
The article is available at the link below:
Rakovics, Z. and Rakovics, M. 2025. Exploring the potential and limitations of large language models as virtual respondents for social science research. Intersections. East European Journal of Society and Politics. 10, 4 (Feb. 2025), 126–147. https://doi.org/10.17356/ieejsp.v10i4.1326