Using Machine Learning to Uncover the Semantics of Concepts: How Well Do Typicality Measures Extracted from a BERT Text Classifier Match Human Judgments of Genre Typicality?

Le Mens, Gaël; Kovács, Balázs; Hannan, Michael T.; Pros, Guillem

doi:10.15195/v10.a3

Tag Archives | Categories

Using Machine Learning to Uncover the Semantics of Concepts: How Well Do Typicality Measures Extracted from a BERT Text Classifier Match Human Judgments of Genre Typicality?

By Parker Webservices on March 3, 2023 in Articles

Gaël Le Mens, Balázs Kovács, Michael T. Hannan, Guillem Pros

Sociological Science, March 3, 2023
DOI 10.15195/v10.a3


Social scientists have long been interested in understanding the extent to which the typicalities of an object in concepts relate to its valuations by social actors. Answering this question has proven to be challenging because precise measurement requires a feature-based description of objects. Yet, such descriptions are frequently unavailable. In this article, we introduce a method to measure typicality based on text data. Our approach involves training a deep-learning text classifier based on the BERT language representation and defining the typicality of an object in a concept in terms of the categorization probability produced by the trained classifier. Model training allows for the construction of a feature space adapted to the categorization task and of a mapping between feature combination and typicality that gives more weight to feature dimensions that matter more for categorization. We validate the approach by comparing the BERT-based typicality measure of book descriptions in literary genres with average human typicality ratings. The obtained correlation is higher than 0.85. Comparisons with other typicality measures used in prior research show that our BERT-based measure better reflects human typicality judgments.
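As a rough illustration of the idea in this abstract, the sketch below turns a classifier's raw genre scores into categorization probabilities and treats each probability as the typicality of a book description in that genre. It then computes a Pearson correlation of the kind one might use to compare such scores with average human ratings. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the genre labels, and the numeric scores are hypothetical, and taking typicality to be the probability itself is only the simplest reading of "in terms of the categorization probability" (the article may apply a transformation). The authors' actual code is at https://osf.io/ta273/.

```python
import math


def softmax(logits):
    """Convert raw classifier scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def typicality(logits, genres):
    """Hypothetical helper: map each genre to its categorization probability.

    Assumption: typicality is taken to be the probability itself.
    """
    return dict(zip(genres, softmax(logits)))


def pearson(xs, ys):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


# Made-up scores for one book description; a trained classifier would supply these.
genres = ["mystery", "romance", "horror"]
example_logits = [2.0, 0.5, -1.0]
scores = typicality(example_logits, genres)
for genre, value in scores.items():
    print(f"{genre}: {value:.3f}")
```

To validate such a measure against people, one would collect the model-based typicality and the average human rating for each book-genre pair and pass the two lists to `pearson`.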

This work is licensed under a Creative Commons Attribution 4.0 International License.

Gaël Le Mens: Department of Economics and Business, Universitat Pompeu Fabra (UPF), Barcelona School of Economics, and UPF Barcelona School of Management, Barcelona, Spain
E-mail: gael.le-mens@upf.edu

Balázs Kovács: School of Management, Yale University, New Haven, CT, USA
E-mail: balazs.kovacs@yale.edu

Michael T. Hannan: Graduate School of Business, Stanford University, Stanford, CA, USA
E-mail: hannan@stanford.edu

Guillem Pros: Department of Economics and Business, Universitat Pompeu Fabra, Barcelona, Spain
E-mail: guillem.pros@upf.edu

Acknowledgments: We are grateful to Jerker Denrell, Amir Goldberg, Greta Hsu, Thorbjørn Knudsen, Cecilia Nunes, and Phanish Puranam for discussion of ideas developed in this article and for the detailed feedback we received from them on earlier versions. We thank conference participants at the 2021 and 2022 Nagymaros Conferences for valuable feedback and discussion. G. Le Mens and G. Pros received financial support from ERC Consolidator Grant #772268 from the European Commission. G. Le Mens also received financial support from grant PID2019-105249GBI00/AEI/10.13039/501100011033 from the Spanish Ministerio de Ciencia, Innovación y Universidades (MCIU) and the Agencia Estatal de Investigación (AEI) and from the BBVA Foundation Grant G999088Q. B. Kovács was supported by Yale School of Management. M. Hannan was supported by the Stanford Graduate School of Business. Data, material, and analysis code for all analyses are available online at https://osf.io/ta273/. We encourage readers to download the shared folder and use the code to compute BERT typicality on their own data sets.

Citation: Le Mens, Gaël, Balázs Kovács, Michael T. Hannan, and Guillem Pros. 2023. “Using Machine Learning to Uncover the Semantics of Concepts: How Well Do Typicality Measures Extracted from a BERT Text Classifier Match Human Judgments of Genre Typicality?” Sociological Science 10: 82-117.
Received: September 28, 2022
Accepted: November 9, 2022
Editors: Ari Adut, Filiz Garip
DOI: 10.15195/v10.a3

An Ecology of Social Categories

By Jesper Sorensen on August 18, 2014 in Articles

Elizabeth G. Pontikes, Michael T. Hannan

Sociological Science, August 18, 2014
DOI 10.15195/v1.a20


This article proposes that meaningful social classification emerges from an ecological dynamic that operates in two planes: feature space and label space. It takes a dynamic view of classification, allowing objects’ movements in both spaces to change the meaning of social categories. The first part of the theory argues that agents assign labels to objects based on perceptions of their similarities to existing members of a category. The second part of the theory shows that an object’s perceived similarity to members of other categories reduces its typicality in a focal category. This means that for categories with a high degree of overlap with other categories in label space (lenient categories), the link between feature-based similarities and labeling weakens. The findings suggest that social classification will likely evolve to contain both constraining and lenient categories. The theory implies that this process is self-reinforcing, so that constraining categories become more constraining, whereas lenient categories become more lenient.

Elizabeth G. Pontikes: University of Chicago. E-mail: elizabeth.pontikes@chicagobooth.edu.

Michael T. Hannan: Stanford University. E-mail: hannan@stanford.edu.


Citation: Pontikes, Elizabeth G. and Michael T. Hannan. 2014. “An Ecology of Social Categories.” Sociological Science 1: 311-343.
Received: April 15, 2014
Accepted: May 28, 2014
Editors: Olav Sorenson
DOI: 10.15195/v1.a20

The Diffusion of the Legitimate and the Diffusion of Legitimacy

By Jesper Sorensen on March 3, 2014 in Articles

Gabriel Rossman

Sociological Science, March 3, 2014
DOI 10.15195/v1.a5


This article models the implications of innovations being nested within categories. In effect, social actors assess the legitimacy of innovations vis-à-vis conformity to categories such that a sufficiently legitimate innovation may be adopted without direct reference to the behavior of peers. However, when innovations lack categorical legitimacy, actors default to proximately peer-oriented heuristics such as information cascades. Eventually, if enough similarly novel innovations achieve widespread popularity, their conventions will become accepted as a legitimate category. Thus density creates legitimacy, but this density can be at the level of the particular innovation or of the category within which it is embedded.

Gabriel Rossman: University of California, Los Angeles. E-mail: Rossman@soc.ucla.edu


Citation: Rossman, Gabriel. 2014. “The Diffusion of the Legitimate and the Diffusion of Legitimacy.” Sociological Science 1: 49–69.
Received: September 17, 2013
Accepted: September 20, 2013
Editors: Jesper Sørensen, Ezra Zuckerman
DOI: 10.15195/v1.a5
