<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (NISO Z39.96-2019) Journal Publishing DTD v1.2 20190208//EN" "https://jats.nlm.nih.gov/publishing/1.2/JATS-journalpublishing1-mathml3.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" dtd-version="1.2" xml:lang="en" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <front>
    <journal-meta>
  <journal-id journal-id-type="publisher-id">CCR</journal-id>
  <journal-title-group>
    <journal-title>Computational Communication Research</journal-title>
  </journal-title-group>
  <issn pub-type="ppub" />
  <issn pub-type="epub">2665-9085</issn>
  <publisher>
    <publisher-name>Amsterdam University Press</publisher-name>
    <publisher-loc>Amsterdam</publisher-loc>
  </publisher>
</journal-meta><article-meta>
      <article-id pub-id-type="publisher-id">CCR2025.1.14.MULL</article-id><article-id pub-id-type="doi">10.5117/CCR2025.1.14.MULL</article-id><article-categories><subj-group subj-group-type="heading"><subject>Article</subject></subj-group></article-categories><title-group>
        <article-title>Associations Measured = Stereotypes Conveyed? A Semantic Validation of Word Embedding-Based Measures of Implicit Group Stereotyping in Large Text Corpora</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <name>
            <surname>Müller</surname>
            <given-names>Philipp</given-names>
          </name>
          <aff>Institute for Media and Communication Studies &amp; Mannheim Centre for European Social Research (MZES), University of Mannheim, Germany</aff>
        </contrib>
        <contrib contrib-type="author">
          <name>
            <surname>Freudenthaler</surname>
            <given-names>Rainer</given-names>
          </name>
          <aff>Institute for Media and Communication Studies &amp; Mannheim Centre for European Social Research (MZES), University of Mannheim, Germany</aff>
        </contrib>
        <contrib contrib-type="author">
          <name>
            <surname>Ludwig</surname>
            <given-names>Katharina</given-names>
          </name>
          <aff>Institute for Media and Communication Studies &amp; Mannheim Centre for European Social Research (MZES), University of Mannheim, Germany</aff>
        </contrib>
        <contrib contrib-type="author">
          <name>
            <surname>Chan</surname>
            <given-names>Chung-hong</given-names>
          </name>
          <aff>GESIS - Leibniz-Institut für Sozialwissenschaften, Cologne, Germany</aff>
        </contrib>
      </contrib-group>
      <pub-date pub-type="epub"><year>2025</year></pub-date><volume>7</volume><issue>1</issue><fpage>1</fpage><permissions><copyright-statement>© The authors</copyright-statement><copyright-year>2025</copyright-year><copyright-holder>The authors</copyright-holder><license license-type="open-access"><license-p>This is an open access article distributed under the CC BY 4.0 license <ext-link ext-link-type="uri" xlink:href="https://creativecommons.org/licenses/by/4.0/">https://creativecommons.org/licenses/by/4.0/</ext-link></license-p></license></permissions><abstract>
    <title>Abstract</title><p>Word embedding-based measures are increasingly being used in computational communication research to assess how entities (such as individuals or social groups) are implicitly contextualized within mediated discourse. We argue, however, that these corpus-level metrics lack a demonstration of their semantic validity, and we point out several challenges that preclude researchers from using the traditional “gold-standard” coding to establish it. In this study, we propose and apply an alternative avenue, namely, using experimental survey logic to test a causal conjecture between human-perceived context and the implicit associations measured using word embeddings. We report the results of an application of this approach that uses texts and measures from a previous study investigating the implicit stigmatization of ethnic groups. Results indicate alignment between participants’ perceived group contextualization and the respective estimations from a word embedding model across experimental conditions. We interpret this as evidence for the semantic validity of word embedding-based measures of implicit stereotypical associations.</p>
  </abstract>
      <kwd-group>
        <title>Keywords:</title><kwd>validation</kwd>
        <kwd>word embeddings</kwd>
        <kwd>implicit association</kwd>
        <kwd>stereotype</kwd>
        <kwd>experiment</kwd>
        <kwd>automated content analysis</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="S1">
      
      
      
      
    <title>Word Embedding-Based ISA Measures and Their Validity</title><sec id="S1.SS1">
        
        
        
        
      <title>Word Embedding-Based ISA Metrics</title><p id="S1.SS1.p1">Word embeddings are trained on a large corpus of texts in an unsupervised manner, usually as a by-product of training a neural network to predict a word from the words surrounding it. This by-product can be used to quantify the semantic meanings of words based on the linguistic properties of distributional semantics. Word embeddings utilize co-occurrence statistics to place all unique words in a corpus in a multidimensional word embedding space. Each row in this space represents the word vector of one word. For example, words like <italic>fluid</italic> and <italic>water</italic> are more likely to occur in the same sentence as <italic>flow</italic>, while words like <italic>solid</italic> or <italic>steel</italic> are less likely to do so (Pennington et al., <xref rid="bib.bibx34" ref-type="bibr">2014</xref>). For words with a similar semantic meaning (e.g., <italic>fluid</italic> and <italic>water</italic> in the above example), this similarity can be quantified by comparing their word vectors. By the same token, words with different meanings (e.g., <italic>water</italic> and <italic>steel</italic>) should have very different word vectors. Comparing word vectors can therefore tell us which words are more closely associated. In communication research, this comparability was originally used to enhance supervised and unsupervised machine learning models over the traditional “bag-of-words” model (e.g., Rudkowsky et al., <xref rid="bib.bibx36" ref-type="bibr">2018</xref>; Atteveldt et al., <xref rid="bib.bibx44" ref-type="bibr">2021</xref>).
There exist several pretrained word vectors that researchers can use in an off-the-shelf manner.</p><p id="S1.SS1.p2">However, this application has also been criticized because pretrained word embeddings might contain unwanted associations inherited from the texts in their training corpora, so that a downstream task would also be tainted by those associations. Pioneering works in this regard are Caliskan et al. (<xref rid="bib.bibx10" ref-type="bibr">2017</xref>) and Garg et al. (<xref rid="bib.bibx17" ref-type="bibr">2018</xref>). These studies derived several “word embedding association” metrics to quantify what the authors interpret as “biases” in pretrained word vectors. These metrics are also based on the comparability of word vectors: the relative similarity in the word embedding space of a target (e.g., <italic>nurse</italic>) with different attributes (e.g., <italic>male</italic> and <italic>female</italic>) can tell us how the word vectors—and by extension, their original training corpus—represent different target entities as implicitly associated with other entities.</p><p id="S1.SS1.p3">More recently, this approach has been extended as a computational communication research method to assess implicit stereotypical associations within text: Instead of using pretrained word vectors, word embeddings are trained on a large corpus of texts that one wants to study.
The resulting word embedding association metrics are used to quantify ISAs in the corpora—which have in past research been interpreted as “stereotypes” (Kroon et al., <xref rid="bib.bibx25" ref-type="bibr">2021</xref>; Andrich et al., <xref rid="bib.bibx2" ref-type="bibr">2023</xref>; Andrich &amp; Domahidi, <xref rid="bib.bibx3" ref-type="bibr">2022</xref>; Azzalini, <xref rid="bib.bibx6" ref-type="bibr">2025</xref>), “media bias” (Sales et al., <xref rid="bib.bibx37" ref-type="bibr">2019</xref>; Curto et al., <xref rid="bib.bibx13" ref-type="bibr">2022</xref>), or “implicit representations/associations” (Fu, <xref rid="bib.bibx16" ref-type="bibr">2023</xref>; Müller et al., <xref rid="bib.bibx33" ref-type="bibr">2023</xref>; Urman &amp; Makhortykh, <xref rid="bib.bibx42" ref-type="bibr">2022</xref>). In this kind of application, word embedding-based metrics are not used to assess potential biases within a word embedding model with the aim of reducing these biases. Instead, they are used as a content analysis tool: Researchers calculate word embedding-based scores to make observations about the corpora they train the model on. However, while the body of applied studies employing word embedding-based ISA metrics is flourishing (see also Durrheim et al., <xref rid="bib.bibx15" ref-type="bibr">2023</xref>), these applications still lack a demonstration of their semantic validity.</p></sec><sec id="S1.SS2">
        
        
        
        
      <title>The Missing Semantic Validity of Using Word Embedding-Based ISA Metrics for Content Analysis</title><p id="S1.SS2.p1">DiMaggio et al. (<xref rid="bib.bibx14" ref-type="bibr">2013</xref>), Quinn et al. (<xref rid="bib.bibx35" ref-type="bibr">2010</xref>), and Grimmer &amp; Stewart (<xref rid="bib.bibx20" ref-type="bibr">2013</xref>) conceptualize three different aspects of validation in automated text analysis: statistical, predictive, and semantic. These three aspects supplement each other. Statistical validity indicates whether results generated from a model agree with the model’s statistical assumptions. Using word embedding-based ISA metrics as an example, the stability of the measurement is one aspect of these statistical assumptions. It can be assessed using standardized tests (e.g., Spliethöver &amp; Wachsmuth, <xref rid="bib.bibx40" ref-type="bibr">2021</xref>). Statistical validity <italic>per se</italic> is important but not sufficient as sole evidence of the validity of a measure because it evaluates only the internal coherence of a model.</p><p id="S1.SS2.p2">Predictive validity, on the other hand, measures the “expected correspondence between a measure and exogenous events uninvolved in the measurement process” (Quinn et al., <xref rid="bib.bibx35" ref-type="bibr">2010</xref>, p. 222). Foundational works on word embedding-based ISA measures established the validity of the approach primarily in terms of predictive validity (for an overview, see Durrheim et al., <xref rid="bib.bibx15" ref-type="bibr">2023</xref>). For instance, Caliskan et al. (<xref rid="bib.bibx10" ref-type="bibr">2017</xref>) show that WEAT-based scores derived from word vectors trained on Wikipedia and various web corpora (Pennington et al., <xref rid="bib.bibx34" ref-type="bibr">2014</xref>) correlate with Implicit Association Test (IAT) scores obtained from exogenous U.S.
experiments conducted in the 1990s (Greenwald et al., <xref rid="bib.bibx18" ref-type="bibr">1998</xref>). Similarly, Garg et al. (<xref rid="bib.bibx17" ref-type="bibr">2018</xref>) validate word embedding-based scores on gender biases in occupations within a Google News corpus by testing their correlation with the relative percentage of females in different occupations in the U.S. in the 1960s. For use cases in which word embedding-based ISA metrics are employed to detect biases within word embedding models and de-bias them for further application, the correspondence of these metrics to external data suffices.</p><p id="S1.SS2.p3">In recent years, though, word embedding-based measures have been used to assess stereotypical associations within texts from specific periods and regions. For example, Kroon et al. (<xref rid="bib.bibx25" ref-type="bibr">2021</xref>) aim to show that “representations of minorities in newspapers have become progressively remote from factual integration outcomes, and are therefore rather an artifact of news production processes than a true reflection of what is actually happening in society.” Their claim is that the associations measured within a word embedding model trained on news do not correspond to external data, but instead indicate a stereotypical portrayal within the news corpus. Word embedding-based scores are here used as a content analysis tool: The researchers’ goal in this kind of application is to make assertions about the content of news in the observed corpus, not about external society.
To validate word embedding-based scores for these kinds of applications, predictive validity does not suffice—in fact, where one expects a divergence between reporting and external societal phenomena, predictive validity cannot be used to assess the method’s validity at all.</p><p id="S1.SS2.p4">Therefore, when using word embedding-based metrics as a content analysis tool, one needs to supplement statistical and predictive validity with semantic validity. Krippendorff (<xref rid="bib.bibx24" ref-type="bibr">2018</xref>, p. 323) defined semantic validity as “the degree to which the analytical categories of texts correspond to the meanings these texts have for particular readers or the roles they play within a chosen context.” An important distinction between predictive validity and semantic validity is therefore what the measure of interest corresponds to: The ground truth in the case of semantic validity is the human understanding of texts, rather than some exogenous cultural patterns. One can therefore consider semantic validity a form of criterion validity, where the criterion test—more commonly referred to as “gold standard” (Song et al., <xref rid="bib.bibx38" ref-type="bibr">2020</xref>; Atteveldt et al., <xref rid="bib.bibx44" ref-type="bibr">2021</xref>; Lind et al., <xref rid="bib.bibx28" ref-type="bibr">2017</xref>)—is the human understanding. Consequently, semantic validity as a category is much closer than predictive validity to the actual target construct that text analysis is supposed to measure, namely the meaning conveyed by text to human readers.</p><p id="S1.SS2.p5">Krippendorff’s works refer to manual content analysis with human coders. In this context, he suggests comparing human coders’ ratings of media content to those of individuals with expertise in the field under study (political professionals for political texts, legal experts for legal texts, etc.; Krippendorff, <xref rid="bib.bibx23" ref-type="bibr">1980</xref>).
In the present study, instead of focusing on topical experts to produce the criterion values for semantic validity, we use a slightly different approach and examine a general media audience’s assessment. In manual content analysis, trained coders are typically not such experts, but rather resemble members of the general media audience. Therefore, the question of whether their coding is in line with the general audience’s understanding of texts appears less pressing. For automated analysis, the case is different. Here, it is far from self-evident that the analysis routines produce a textual understanding that coincides with that of the general media audience.</p><p id="S1.SS2.p6">However, this question is important when we consider the epistemological focus that content analytical assessments are typically conducted with. There are two inferential goals of media content analysis: (1) the “diagnostic approach” makes inferences about media messages’ production circumstances from content analysis, while (2) the “prognostic approach” tries to infer predictions of a message’s potential processing and effects (Maurer &amp; Reinemann, <xref rid="bib.bibx31" ref-type="bibr">2006</xref>, p. 13). The latter is especially important in the context of ISAs because of the harm that these kinds of messages might do to societal intergroup relations. Therefore, when establishing word embedding-based ISA assessment’s semantic validity, we are particularly interested in whether its substantive findings are in line with a general audience’s impression of the same content. This is a logical prerequisite to inferring predictions of media processing and effects from the content analytical results obtained by these measurements.</p></sec><sec id="S1.SS3">
        
        
        
        
      <title>The Challenges of Semantic Validation—And How to Address Them</title><p id="S1.SS3.p1">For common automated content analysis techniques such as sentiment detection or topic modeling, the units of analysis are typically articles or their sub-units such as sentences or paragraphs. Therefore, it is relatively easy to compare measures extracted for these units, for instance their sentiment scores or topic allocations, with assessments of the same categories made by human coders. It is also possible to test the semantic validity of the measure by studying a representative sample of articles in the corpus. Although the number of sampled articles does impact the validation outcome (Song et al., <xref rid="bib.bibx38" ref-type="bibr">2020</xref>), sampling is statistically valid (Krippendorff, <xref rid="bib.bibx24" ref-type="bibr">2018</xref>), and the approach has been recommended in the methods literature (Grimmer &amp; Stewart, <xref rid="bib.bibx20" ref-type="bibr">2013</xref>; Atteveldt et al., <xref rid="bib.bibx44" ref-type="bibr">2021</xref>; Lind et al., <xref rid="bib.bibx28" ref-type="bibr">2017</xref>) and is widely used in computational communication research.</p><p id="S1.SS3.p2">However, word embedding-based ISA measures represent the word associations in an entire corpus and are therefore aggregated text measurements at the corpus level. The resulting scores cannot be broken down into single texts, paragraphs, or even sentences as units of analysis. Consequently, semantic validation that would compare word embedding-based scores with human-generated data of the same texts has previously not been applied when attempting to validate the method’s suitability for assessing ISAs within text. Müller et al. (<xref rid="bib.bibx33" ref-type="bibr">2023</xref>, p. 409) even explicitly pointed out that a validation using human coding is impractical.
In their analysis of implicit stereotypical associations of ethnic and religious group names with emotions, the authors assert that “[a] proper human validation would need raters to read the entire corpus of 697,913 articles and point out what racial biases they have learned from the corpus.” For a similar method, Arendt &amp; Karadas (<xref rid="bib.bibx5" ref-type="bibr">2017</xref>, p. 13) defend the decision not to validate their measurement because “there is no real ‘gold standard’ of what the ‘true’ mediated associations are.” These assertions underscore the difficulty, but do not rule out the possibility of validating word embedding-based ISA measures. However, such an attempt cannot follow the same logic and approach as the semantic validation of text- or sentence-based measurements. For an entire corpus, it is impractical to use the so-called “gold standard” procedure of asking trained human coders to go over the texts under study and manually code ISAs as observational evidence of word embedding-based scores’ semantic validity.</p><p id="S1.SS3.p3">A practical approach to overcome these issues, according to Müller et al. (<xref rid="bib.bibx33" ref-type="bibr">2023</xref>, p. 409), is “to develop ways of validating word embedding bias methods using a well-defined causal conjecture.” Applying this logic, one can first propose a causal conjecture in the form of hypotheses and then test them empirically. A suitable causal conjecture for semantically validating word embedding-based ISA measures is that a sentence package with stronger ISAs for a specified entity (as measured by word embedding-based scores) <bold>causes</bold> human readers to perceive that the sentence package contains such associations to a higher extent than a control condition. This causal conjecture can be studied using an experimental research design (Imai et al., <xref rid="bib.bibx22" ref-type="bibr">2011</xref>).
In fact, a similar approach (using experimental designs to establish semantic validity) was proposed and applied in the early days of computational text analysis to study the validity of topic models (Grimmer &amp; King, <xref rid="bib.bibx19" ref-type="bibr">2011</xref>; Grimmer &amp; Stewart, <xref rid="bib.bibx20" ref-type="bibr">2013</xref>).</p><p>In transferring the notion of semantic validation via survey experiments to the case of word embedding-based scores, we propose a routine that proceeds in three steps:</p><p><list list-type="order" id="S1.I1"><list-item id="S1.I1.i1">
              
              
              
              
            <p id="S1.I1.i1.p1">As we cannot expect human participants to read a complete corpus, we extract different packages of sentences from a previous study’s corpus (Müller et al., <xref rid="bib.bibx33" ref-type="bibr">2023</xref>) that we expect to contain ISAs.</p></list-item><list-item id="S1.I1.i2">
              
              
              
              
            <p id="S1.I1.i2.p1">Then, we ensure that word embedding-based scores capture the ISAs assumed to be present in those sentences. We inject the sentence packages into a version of the original corpus that is stripped of all other mentions of the target group to remove the original ISA from the corpus. This modified corpus is used for assessing the level of ISAs in sentence packages.</p></list-item><list-item id="S1.I1.i3">
              
              
              
              
            <p id="S1.I1.i3.p1">Finally, we conduct an experimental study in which we let a large sample of human participants read scaled-down versions of the different sentence packages and afterwards ask them to answer a set of survey questions tailored to capture the ISAs conveyed by these sentences. If the results on these survey measures are in line with word embedding-based measures of ISAs, the causal conjecture established by the experimental design offers an argument for the semantic validity of using word embeddings to investigate ISAs in large text corpora.</p></list-item></list></p><p id="S1.SS3.p5">By conducting this validation procedure in an experimental survey setting instead of hiring a limited number of trained coders (as in the typical gold standard validation routine), we make use of the law of large numbers. In contrast to the traditional coding approach in which each unit of analysis is judged by one coder (also applied in crowd coding; see, e.g., Atteveldt et al., <xref rid="bib.bibx44" ref-type="bibr">2021</xref>; Lind et al., <xref rid="bib.bibx28" ref-type="bibr">2017</xref>), the same sentence packages are evaluated by large groups of individuals. Instead of one data point per text unit, we thus gather a large number of data points on the same units. This accounts for the fact that implicit meanings such as ISAs may be perceived differently by various human readers, even by trained coders. Considering this variance, we assess the average meaning conveyed by different sentence packages to humans based on a large number of data points. Further, contrary to a typical coding scheme, the survey approach can account for the implicit nature of conveyed ISAs by using various text-dependent measures as indicators, not just one (single-item) assessment that is typically used to capture (quasi-)manifest textual meanings by human coders.</p></sec></sec>
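The corpus-level association scoring discussed in this section boils down to comparing cosine similarities between a target word's vector and two attribute wordsets. The following is a minimal sketch of that logic, not the original study's implementation; the 3-dimensional toy vectors, wordset contents, and function names are invented for illustration:

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two word vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def differential_association(w_s, attr_a, attr_b):
    """g(s, A, B, w): mean cosine similarity of the target vector w_s
    with attribute wordset A minus its mean similarity with wordset B."""
    return (float(np.mean([cosine(w_s, v) for v in attr_a]))
            - float(np.mean([cosine(w_s, v) for v in attr_b])))

# Toy 3-dimensional embeddings (hypothetical values, for illustration only).
target = np.array([0.9, 0.1, 0.2])        # vector of a group label
admiration = [np.array([1.0, 0.0, 0.1]),  # attribute wordset A
              np.array([0.8, 0.2, 0.0])]
fear = [np.array([0.0, 1.0, 0.3]),        # attribute wordset B
        np.array([0.1, 0.9, 0.5])]

g = differential_association(target, admiration, fear)
# A positive g means the target sits closer to the admiration words
# than to the fear words in this toy embedding space.
print(round(g, 3))
```

In actual applications, such differential associations are computed from embeddings trained on the full corpus and normalized (e.g., following Caliskan et al., 2017) before interpretation.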
    <sec id="S2">
      
      
      
      
    <title>Study Design</title><p id="S2.p1">We preregistered the hypotheses, survey, and analytical plan for this study before data collection. The code to reproduce the stimulus generation process, as well as the obtained survey data, can be found on OSF.<xref rid="id1" ref-type="fn" specific-use="fn"><sup>1</sup></xref> All Appendices are available in the same repository.</p><sec id="S2.SS1">
        
        
        
        
      <title>Case</title><p id="S2.SS1.p1">We designed this study based on the Open Science materials shared by Müller et al. (<xref rid="bib.bibx33" ref-type="bibr">2023</xref>). In that study, word embedding-based ISA metrics were applied to assess implicit stigmatization of ethnic and religious groups in German news reporting by measuring the co-occurrence of group labels with words implicitly charged with the positive emotion admiration or the negative emotion fear. Importantly, in the original study, this measurement was not based on a pretrained word embedding model. Instead, the model was trained on the study’s own text corpus using the GloVe algorithm.</p><p id="S2.SS1.p2">For the present validation attempt, our goal was to create three artificial stimulus sentence packages—one consisting of sentences that implicitly associate a target group with fear, one implicitly associating the same group with admiration, and one that contains no implicit association of the target group with either of the two emotions (control condition). For the generation of sentence packages, we used a group (Italian people) that Müller et al. (<xref rid="bib.bibx33" ref-type="bibr">2023</xref>) found to be portrayed in a balanced way on average within the corpus. We reasoned that an overall balanced group portrayal meant we would find enough sentences that implicitly contained either admiration or fear. Additionally, if we assume that the associations found in the original study are somewhat representative of stereotypes the wider population has internalized, it should be easier to construct an implicitly biased dataset based on this group that appears intuitively credible.
For example, using a group that is stereotypically associated with fear and constructing an artificial corpus where this group is implicitly associated with admiration might result in a failure of the validation attempt because existing stereotypes are less malleable and the sentences might therefore be perceived as unrealistic or implausible by participants.</p><p id="S2.SS1.p3">Following Müller et al. (<xref rid="bib.bibx33" ref-type="bibr">2023</xref>) and Urman &amp; Makhortykh (<xref rid="bib.bibx42" ref-type="bibr">2022</xref>), we used the normalized association score (<inline-formula><mml:math id="S2.SS1.p3.m1" alttext="NAS" display="inline"><mml:mrow><mml:mi>N</mml:mi><mml:mo>⁢</mml:mo><mml:mi>A</mml:mi><mml:mo>⁢</mml:mo><mml:mi>S</mml:mi></mml:mrow></mml:math></inline-formula>) (Caliskan et al., <xref rid="bib.bibx10" ref-type="bibr">2017</xref>) to quantify ISAs. <inline-formula><mml:math id="S2.SS1.p3.m2" alttext="NAS" display="inline"><mml:mrow><mml:mi>N</mml:mi><mml:mo>⁢</mml:mo><mml:mi>A</mml:mi><mml:mo>⁢</mml:mo><mml:mi>S</mml:mi></mml:mrow></mml:math></inline-formula> is calculated with the word embeddings <inline-formula><mml:math id="S2.SS1.p3.m3" alttext="\boldsymbol{w}" display="inline"><mml:mi>𝒘</mml:mi></mml:math></inline-formula>, target words <inline-formula><mml:math id="S2.SS1.p3.m4" alttext="\boldsymbol{s}" display="inline"><mml:mi>𝒔</mml:mi></mml:math></inline-formula>, admiration attribute wordset <inline-formula><mml:math id="S2.SS1.p3.m5" alttext="\mathcal{A}" display="inline"><mml:mi class="ltx_font_mathcaligraphic">𝒜</mml:mi></mml:math></inline-formula>, and fear attribute wordset <inline-formula><mml:math id="S2.SS1.p3.m6" alttext="\mathcal{B}" display="inline"><mml:mi class="ltx_font_mathcaligraphic">ℬ</mml:mi></mml:math></inline-formula> (see Appendix A in the OSF repository).
The software implementation by Chan (<xref rid="bib.bibx12" ref-type="bibr">2022</xref>) was used.</p><p id="S2.SS1.p4">Suppose the cosine similarity score between words <inline-formula><mml:math id="S2.SS1.p4.m1" alttext="a" display="inline"><mml:mi>a</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="S2.SS1.p4.m2" alttext="b" display="inline"><mml:mi>b</mml:mi></mml:math></inline-formula> in the word embedding space of <inline-formula><mml:math id="S2.SS1.p4.m3" alttext="\boldsymbol{w}" display="inline"><mml:mi>𝒘</mml:mi></mml:math></inline-formula> is denoted as <inline-formula><mml:math id="S2.SS1.p4.m4" alttext="cos(\boldsymbol{w_{a}},\boldsymbol{w_{b}})" display="inline"><mml:mrow><mml:mi>c</mml:mi><mml:mo>⁢</mml:mo><mml:mi>o</mml:mi><mml:mo>⁢</mml:mo><mml:mi>s</mml:mi><mml:mo>⁢</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:msub><mml:mi>𝒘</mml:mi><mml:mi>𝒂</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>𝒘</mml:mi><mml:mi>𝒃</mml:mi></mml:msub><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula>.</p><disp-formula id="S2.E1">
            
            
            
          <mml:math id="S2.E1.m1" alttext="cos(\boldsymbol{w_{a}},\boldsymbol{w_{b}})=\frac{\boldsymbol{w_{a}}\cdot%&#10;\boldsymbol{w_{b}}}{||\boldsymbol{w_{a}}||||\boldsymbol{w_{b}}||}" display="block"><mml:mrow><mml:mrow><mml:mi>c</mml:mi><mml:mo>⁢</mml:mo><mml:mi>o</mml:mi><mml:mo>⁢</mml:mo><mml:mi>s</mml:mi><mml:mo>⁢</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:msub><mml:mi>𝒘</mml:mi><mml:mi>𝒂</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>𝒘</mml:mi><mml:mi>𝒃</mml:mi></mml:msub><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:msub><mml:mi>𝒘</mml:mi><mml:mi>𝒂</mml:mi></mml:msub><mml:mo lspace="0.222em" rspace="0.222em">⋅</mml:mo><mml:msub><mml:mi>𝒘</mml:mi><mml:mi>𝒃</mml:mi></mml:msub></mml:mrow><mml:mrow><mml:mrow><mml:mo stretchy="false">‖</mml:mo><mml:msub><mml:mi>𝒘</mml:mi><mml:mi>𝒂</mml:mi></mml:msub><mml:mo stretchy="false">‖</mml:mo></mml:mrow><mml:mo>⁢</mml:mo><mml:mrow><mml:mo stretchy="false">‖</mml:mo><mml:msub><mml:mi>𝒘</mml:mi><mml:mi>𝒃</mml:mi></mml:msub><mml:mo stretchy="false">‖</mml:mo></mml:mrow></mml:mrow></mml:mfrac></mml:mrow></mml:math></disp-formula><p id="S2.SS1.p6">For a given word <inline-formula><mml:math id="S2.SS1.p6.m1" alttext="s" display="inline"><mml:mi>s</mml:mi></mml:math></inline-formula>, let us denote its differential association with <inline-formula><mml:math id="S2.SS1.p6.m2" alttext="\mathcal{A}" display="inline"><mml:mi class="ltx_font_mathcaligraphic">𝒜</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="S2.SS1.p6.m3" alttext="\mathcal{B}" display="inline"><mml:mi class="ltx_font_mathcaligraphic">ℬ</mml:mi></mml:math></inline-formula> as <inline-formula><mml:math id="S2.SS1.p6.m4" alttext="g(s,\mathcal{A},\mathcal{B},\boldsymbol{w})" display="inline"><mml:mrow><mml:mi>g</mml:mi><mml:mo>⁢</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>s</mml:mi><mml:mo>,</mml:mo><mml:mi 
class="ltx_font_mathcaligraphic">𝒜</mml:mi><mml:mo>,</mml:mo><mml:mi class="ltx_font_mathcaligraphic">ℬ</mml:mi><mml:mo>,</mml:mo><mml:mi>𝒘</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula>. It is calculated as:</p><disp-formula id="S2.E2">
            
            
            
          <mml:math id="S2.E2.m1" alttext="g(s,\mathcal{A},\mathcal{B},\boldsymbol{w})=\frac{1}{|\mathcal{A}|}\sum%&#10;\nolimits_{a\in\mathcal{A}}cos(\boldsymbol{w_{s}},\boldsymbol{w_{a}})-\frac{1}%&#10;{|\mathcal{B}|}\sum\nolimits_{b\in\mathcal{B}}cos(\boldsymbol{w_{s}},%&#10;\boldsymbol{w_{b}})" display="block"><mml:mrow><mml:mrow><mml:mi>g</mml:mi><mml:mo>⁢</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>s</mml:mi><mml:mo>,</mml:mo><mml:mi class="ltx_font_mathcaligraphic">𝒜</mml:mi><mml:mo>,</mml:mo><mml:mi class="ltx_font_mathcaligraphic">ℬ</mml:mi><mml:mo>,</mml:mo><mml:mi>𝒘</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo>=</mml:mo><mml:mrow><mml:mrow><mml:mfrac><mml:mn>1</mml:mn><mml:mrow><mml:mo stretchy="false">|</mml:mo><mml:mi class="ltx_font_mathcaligraphic">𝒜</mml:mi><mml:mo stretchy="false">|</mml:mo></mml:mrow></mml:mfrac><mml:mo>⁢</mml:mo><mml:mrow><mml:msub><mml:mo>∑</mml:mo><mml:mrow><mml:mi>a</mml:mi><mml:mo>∈</mml:mo><mml:mi class="ltx_font_mathcaligraphic">𝒜</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mi>c</mml:mi><mml:mo>⁢</mml:mo><mml:mi>o</mml:mi><mml:mo>⁢</mml:mo><mml:mi>s</mml:mi><mml:mo>⁢</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:msub><mml:mi>𝒘</mml:mi><mml:mi>𝒔</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>𝒘</mml:mi><mml:mi>𝒂</mml:mi></mml:msub><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:mrow></mml:mrow><mml:mo>−</mml:mo><mml:mrow><mml:mfrac><mml:mn>1</mml:mn><mml:mrow><mml:mo stretchy="false">|</mml:mo><mml:mi class="ltx_font_mathcaligraphic">ℬ</mml:mi><mml:mo stretchy="false">|</mml:mo></mml:mrow></mml:mfrac><mml:mo>⁢</mml:mo><mml:mrow><mml:msub><mml:mo>∑</mml:mo><mml:mrow><mml:mi>b</mml:mi><mml:mo>∈</mml:mo><mml:mi class="ltx_font_mathcaligraphic">ℬ</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mi>c</mml:mi><mml:mo>⁢</mml:mo><mml:mi>o</mml:mi><mml:mo>⁢</mml:mo><mml:mi>s</mml:mi><mml:mo>⁢</mml:mo><mml:mrow><mml:mo 
stretchy="false">(</mml:mo><mml:msub><mml:mi>𝒘</mml:mi><mml:mi>𝒔</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>𝒘</mml:mi><mml:mi>𝒃</mml:mi></mml:msub><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:mrow></mml:mrow></mml:mrow></mml:mrow></mml:math></disp-formula><p id="S2.SS1.p8">The mean of all cosine similarity scores of the union of <inline-formula><mml:math id="S2.SS1.p8.m1" alttext="\mathcal{A}" display="inline"><mml:mi class="ltx_font_mathcaligraphic">𝒜</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="S2.SS1.p8.m2" alttext="\mathcal{B}" display="inline"><mml:mi class="ltx_font_mathcaligraphic">ℬ</mml:mi></mml:math></inline-formula> of a given word <inline-formula><mml:math id="S2.SS1.p8.m3" alttext="s" display="inline"><mml:mi>s</mml:mi></mml:math></inline-formula> is denoted as <inline-formula><mml:math id="S2.SS1.p8.m4" alttext="m(s,\mathcal{A},\mathcal{B},\boldsymbol{w})" display="inline"><mml:mrow><mml:mi>m</mml:mi><mml:mo>⁢</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>s</mml:mi><mml:mo>,</mml:mo><mml:mi class="ltx_font_mathcaligraphic">𝒜</mml:mi><mml:mo>,</mml:mo><mml:mi class="ltx_font_mathcaligraphic">ℬ</mml:mi><mml:mo>,</mml:mo><mml:mi>𝒘</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula>. It is calculated as:</p><disp-formula id="S2.E3">
            
            
            
          <mml:math id="S2.E3.m1" alttext="m(s,\mathcal{A},\mathcal{B},\boldsymbol{w})=\frac{1}{|\mathcal{A}\cup\mathcal{%&#10;B}|}\sum\nolimits_{x\in\mathcal{A}\cup\mathcal{B}}cos(\boldsymbol{w_{s}},%&#10;\boldsymbol{w_{x}})" display="block"><mml:mrow><mml:mrow><mml:mi>m</mml:mi><mml:mo>⁢</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>s</mml:mi><mml:mo>,</mml:mo><mml:mi class="ltx_font_mathcaligraphic">𝒜</mml:mi><mml:mo>,</mml:mo><mml:mi class="ltx_font_mathcaligraphic">ℬ</mml:mi><mml:mo>,</mml:mo><mml:mi>𝒘</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo>=</mml:mo><mml:mrow><mml:mfrac><mml:mn>1</mml:mn><mml:mrow><mml:mo stretchy="false">|</mml:mo><mml:mrow><mml:mi class="ltx_font_mathcaligraphic">𝒜</mml:mi><mml:mo>∪</mml:mo><mml:mi class="ltx_font_mathcaligraphic">ℬ</mml:mi></mml:mrow><mml:mo stretchy="false">|</mml:mo></mml:mrow></mml:mfrac><mml:mo>⁢</mml:mo><mml:mrow><mml:msub><mml:mo>∑</mml:mo><mml:mrow><mml:mi>x</mml:mi><mml:mo>∈</mml:mo><mml:mrow><mml:mi class="ltx_font_mathcaligraphic">𝒜</mml:mi><mml:mo>∪</mml:mo><mml:mi class="ltx_font_mathcaligraphic">ℬ</mml:mi></mml:mrow></mml:mrow></mml:msub><mml:mrow><mml:mi>c</mml:mi><mml:mo>⁢</mml:mo><mml:mi>o</mml:mi><mml:mo>⁢</mml:mo><mml:mi>s</mml:mi><mml:mo>⁢</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:msub><mml:mi>𝒘</mml:mi><mml:mi>𝒔</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>𝒘</mml:mi><mml:mi>𝒙</mml:mi></mml:msub><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:mrow></mml:mrow></mml:mrow></mml:math></disp-formula><p id="S2.SS1.p10">And the standard deviation <inline-formula><mml:math id="S2.SS1.p10.m1" alttext="\sigma(s,\mathcal{A},\mathcal{B},\boldsymbol{w})" display="inline"><mml:mrow><mml:mi>σ</mml:mi><mml:mo>⁢</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>s</mml:mi><mml:mo>,</mml:mo><mml:mi class="ltx_font_mathcaligraphic">𝒜</mml:mi><mml:mo>,</mml:mo><mml:mi 
class="ltx_font_mathcaligraphic">ℬ</mml:mi><mml:mo>,</mml:mo><mml:mi>𝒘</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula> is calculated as:</p><disp-formula id="S2.E4">
            
            
            
          <mml:math id="S2.E4.m1" alttext="\sigma(s,\mathcal{A},\mathcal{B},\boldsymbol{w})=\sqrt{\frac{1}{|\mathcal{A}%&#10;\cup\mathcal{B}|-1}\sum\nolimits_{x\in\mathcal{A}\cup\mathcal{B}}(cos(%&#10;\boldsymbol{w_{s}},\boldsymbol{w_{x}})-m(s,\mathcal{A},\mathcal{B},\boldsymbol%&#10;{w}))^{2}}" display="block"><mml:mrow><mml:mrow><mml:mi>σ</mml:mi><mml:mo>⁢</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>s</mml:mi><mml:mo>,</mml:mo><mml:mi class="ltx_font_mathcaligraphic">𝒜</mml:mi><mml:mo>,</mml:mo><mml:mi class="ltx_font_mathcaligraphic">ℬ</mml:mi><mml:mo>,</mml:mo><mml:mi>𝒘</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo>=</mml:mo><mml:msqrt><mml:mrow><mml:mfrac><mml:mn>1</mml:mn><mml:mrow><mml:mrow><mml:mo stretchy="false">|</mml:mo><mml:mrow><mml:mi class="ltx_font_mathcaligraphic">𝒜</mml:mi><mml:mo>∪</mml:mo><mml:mi class="ltx_font_mathcaligraphic">ℬ</mml:mi></mml:mrow><mml:mo stretchy="false">|</mml:mo></mml:mrow><mml:mo>−</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:mfrac><mml:mo>⁢</mml:mo><mml:mrow><mml:msub><mml:mo rspace="0em">∑</mml:mo><mml:mrow><mml:mi>x</mml:mi><mml:mo>∈</mml:mo><mml:mrow><mml:mi class="ltx_font_mathcaligraphic">𝒜</mml:mi><mml:mo>∪</mml:mo><mml:mi class="ltx_font_mathcaligraphic">ℬ</mml:mi></mml:mrow></mml:mrow></mml:msub><mml:msup><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mrow><mml:mi>c</mml:mi><mml:mo>⁢</mml:mo><mml:mi>o</mml:mi><mml:mo>⁢</mml:mo><mml:mi>s</mml:mi><mml:mo>⁢</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:msub><mml:mi>𝒘</mml:mi><mml:mi>𝒔</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>𝒘</mml:mi><mml:mi>𝒙</mml:mi></mml:msub><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo>−</mml:mo><mml:mrow><mml:mi>m</mml:mi><mml:mo>⁢</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>s</mml:mi><mml:mo>,</mml:mo><mml:mi class="ltx_font_mathcaligraphic">𝒜</mml:mi><mml:mo>,</mml:mo><mml:mi 
class="ltx_font_mathcaligraphic">ℬ</mml:mi><mml:mo>,</mml:mo><mml:mi>𝒘</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mn>2</mml:mn></mml:msup></mml:mrow></mml:mrow></mml:msqrt></mml:mrow></mml:math></disp-formula><p id="S2.SS1.p12">The <inline-formula><mml:math id="S2.SS1.p12.m1" alttext="NAS" display="inline"><mml:mrow><mml:mi>N</mml:mi><mml:mo>⁢</mml:mo><mml:mi>A</mml:mi><mml:mo>⁢</mml:mo><mml:mi>S</mml:mi></mml:mrow></mml:math></inline-formula> of word <inline-formula><mml:math id="S2.SS1.p12.m2" alttext="s" display="inline"><mml:mi>s</mml:mi></mml:math></inline-formula> is denoted as <inline-formula><mml:math id="S2.SS1.p12.m3" alttext="NAS(s,\mathcal{A},\mathcal{B},\boldsymbol{w})" display="inline"><mml:mrow><mml:mi>N</mml:mi><mml:mo>⁢</mml:mo><mml:mi>A</mml:mi><mml:mo>⁢</mml:mo><mml:mi>S</mml:mi><mml:mo>⁢</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>s</mml:mi><mml:mo>,</mml:mo><mml:mi class="ltx_font_mathcaligraphic">𝒜</mml:mi><mml:mo>,</mml:mo><mml:mi class="ltx_font_mathcaligraphic">ℬ</mml:mi><mml:mo>,</mml:mo><mml:mi>𝒘</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula>:</p><disp-formula id="S2.E5">
            
            
            
          <mml:math id="S2.E5.m1" alttext="NAS(s,\mathcal{A},\mathcal{B},\boldsymbol{w})=\frac{g(s,\mathcal{A},\mathcal{B%&#10;},\boldsymbol{w})}{\sigma(s,\mathcal{A},\mathcal{B},\boldsymbol{w})}" display="block"><mml:mrow><mml:mrow><mml:mi>N</mml:mi><mml:mo>⁢</mml:mo><mml:mi>A</mml:mi><mml:mo>⁢</mml:mo><mml:mi>S</mml:mi><mml:mo>⁢</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>s</mml:mi><mml:mo>,</mml:mo><mml:mi class="ltx_font_mathcaligraphic">𝒜</mml:mi><mml:mo>,</mml:mo><mml:mi class="ltx_font_mathcaligraphic">ℬ</mml:mi><mml:mo>,</mml:mo><mml:mi>𝒘</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mi>g</mml:mi><mml:mo>⁢</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>s</mml:mi><mml:mo>,</mml:mo><mml:mi class="ltx_font_mathcaligraphic">𝒜</mml:mi><mml:mo>,</mml:mo><mml:mi class="ltx_font_mathcaligraphic">ℬ</mml:mi><mml:mo>,</mml:mo><mml:mi>𝒘</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mi>σ</mml:mi><mml:mo>⁢</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>s</mml:mi><mml:mo>,</mml:mo><mml:mi class="ltx_font_mathcaligraphic">𝒜</mml:mi><mml:mo>,</mml:mo><mml:mi class="ltx_font_mathcaligraphic">ℬ</mml:mi><mml:mo>,</mml:mo><mml:mi>𝒘</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:mfrac></mml:mrow></mml:math></disp-formula><p id="S2.SS1.p14">In Müller et al. (<xref rid="bib.bibx33" ref-type="bibr">2023</xref>) and in the original implementation by Caliskan et al. 
(<xref rid="bib.bibx10" ref-type="bibr">2017</xref>), the target is also a wordset <inline-formula><mml:math id="S2.SS1.p14.m1" alttext="\mathcal{S}" display="inline"><mml:mi class="ltx_font_mathcaligraphic">𝒮</mml:mi></mml:math></inline-formula>, e.g., all grammatical forms of an ethnic group label <inline-formula><mml:math id="S2.SS1.p14.m2" alttext="\{Italiener,Italienerin,\ldots\}" display="inline"><mml:mrow><mml:mo stretchy="false">{</mml:mo><mml:mrow><mml:mi>I</mml:mi><mml:mo>⁢</mml:mo><mml:mi>t</mml:mi><mml:mo>⁢</mml:mo><mml:mi>a</mml:mi><mml:mo>⁢</mml:mo><mml:mi>l</mml:mi><mml:mo>⁢</mml:mo><mml:mi>i</mml:mi><mml:mo>⁢</mml:mo><mml:mi>e</mml:mi><mml:mo>⁢</mml:mo><mml:mi>n</mml:mi><mml:mo>⁢</mml:mo><mml:mi>e</mml:mi><mml:mo>⁢</mml:mo><mml:mi>r</mml:mi></mml:mrow><mml:mo>,</mml:mo><mml:mrow><mml:mi>I</mml:mi><mml:mo>⁢</mml:mo><mml:mi>t</mml:mi><mml:mo>⁢</mml:mo><mml:mi>a</mml:mi><mml:mo>⁢</mml:mo><mml:mi>l</mml:mi><mml:mo>⁢</mml:mo><mml:mi>i</mml:mi><mml:mo>⁢</mml:mo><mml:mi>e</mml:mi><mml:mo>⁢</mml:mo><mml:mi>n</mml:mi><mml:mo>⁢</mml:mo><mml:mi>e</mml:mi><mml:mo>⁢</mml:mo><mml:mi>r</mml:mi><mml:mo>⁢</mml:mo><mml:mi>i</mml:mi><mml:mo>⁢</mml:mo><mml:mi>n</mml:mi></mml:mrow><mml:mo>,</mml:mo><mml:mi mathvariant="normal">…</mml:mi><mml:mo stretchy="false">}</mml:mo></mml:mrow></mml:math></inline-formula>. 
<inline-formula><mml:math id="S2.SS1.p14.m3" alttext="NAS(\mathcal{S},\mathcal{A},\mathcal{B},\boldsymbol{w})" display="inline"><mml:mrow><mml:mi>N</mml:mi><mml:mo>⁢</mml:mo><mml:mi>A</mml:mi><mml:mo>⁢</mml:mo><mml:mi>S</mml:mi><mml:mo>⁢</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi class="ltx_font_mathcaligraphic">𝒮</mml:mi><mml:mo>,</mml:mo><mml:mi class="ltx_font_mathcaligraphic">𝒜</mml:mi><mml:mo>,</mml:mo><mml:mi class="ltx_font_mathcaligraphic">ℬ</mml:mi><mml:mo>,</mml:mo><mml:mi>𝒘</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula> is the average of all <inline-formula><mml:math id="S2.SS1.p14.m4" alttext="NAS(s_{i},\mathcal{A},\mathcal{B},\boldsymbol{w})" display="inline"><mml:mrow><mml:mi>N</mml:mi><mml:mo>⁢</mml:mo><mml:mi>A</mml:mi><mml:mo>⁢</mml:mo><mml:mi>S</mml:mi><mml:mo>⁢</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:msub><mml:mi>s</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:mi class="ltx_font_mathcaligraphic">𝒜</mml:mi><mml:mo>,</mml:mo><mml:mi class="ltx_font_mathcaligraphic">ℬ</mml:mi><mml:mo>,</mml:mo><mml:mi>𝒘</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula> of all target words <inline-formula><mml:math id="S2.SS1.p14.m5" alttext="s_{i}" display="inline"><mml:msub><mml:mi>s</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:math></inline-formula> in <inline-formula><mml:math id="S2.SS1.p14.m6" alttext="\mathcal{S}" display="inline"><mml:mi class="ltx_font_mathcaligraphic">𝒮</mml:mi></mml:math></inline-formula>. For the current experimental setting, we used only one target word (<italic>Italiener</italic>, the male nominative form, which is identical in singular and plural). 
Therefore, <inline-formula><mml:math id="S2.SS1.p14.m7" alttext="NAS(\mathcal{S},\mathcal{A},\mathcal{B},\boldsymbol{w})" display="inline"><mml:mrow><mml:mi>N</mml:mi><mml:mo>⁢</mml:mo><mml:mi>A</mml:mi><mml:mo>⁢</mml:mo><mml:mi>S</mml:mi><mml:mo>⁢</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi class="ltx_font_mathcaligraphic">𝒮</mml:mi><mml:mo>,</mml:mo><mml:mi class="ltx_font_mathcaligraphic">𝒜</mml:mi><mml:mo>,</mml:mo><mml:mi class="ltx_font_mathcaligraphic">ℬ</mml:mi><mml:mo>,</mml:mo><mml:mi>𝒘</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="S2.SS1.p14.m8" alttext="NAS(s,\mathcal{A},\mathcal{B},\boldsymbol{w})" display="inline"><mml:mrow><mml:mi>N</mml:mi><mml:mo>⁢</mml:mo><mml:mi>A</mml:mi><mml:mo>⁢</mml:mo><mml:mi>S</mml:mi><mml:mo>⁢</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>s</mml:mi><mml:mo>,</mml:mo><mml:mi class="ltx_font_mathcaligraphic">𝒜</mml:mi><mml:mo>,</mml:mo><mml:mi class="ltx_font_mathcaligraphic">ℬ</mml:mi><mml:mo>,</mml:mo><mml:mi>𝒘</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula> are identical.</p></sec><sec id="S2.SS2">
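The quantities in Equations 2 to 5 translate directly into code. The following minimal numpy sketch (not the study's original implementation; the function names and toy vectors are our own) computes the relative association g, the standard deviation σ, and the resulting NAS from plain word vectors, assuming the attribute sets 𝒜 and ℬ are disjoint so that concatenating them equals taking their union:

```python
import numpy as np

def cos(u, v):
    """Cosine similarity between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def nas(w_s, A_vecs, B_vecs):
    """Normalized Association Score of target vector w_s, given the
    embedding vectors of attribute wordsets A (fear) and B (admiration).
    Assumes A and B are disjoint, so concatenation equals their union."""
    sims_a = [cos(w_s, w_a) for w_a in A_vecs]
    sims_b = [cos(w_s, w_b) for w_b in B_vecs]
    g = np.mean(sims_a) - np.mean(sims_b)     # relative association g (Eq. 2)
    sigma = np.std(sims_a + sims_b, ddof=1)   # sd over the union (Eqs. 3-4)
    return g / sigma                          # NAS (Eq. 5)
```

Note that because σ is computed over the union of both attribute sets, swapping 𝒜 and ℬ in this sketch simply flips the sign of the score.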
        
        
        
        
      <title>Preregistered Hypotheses</title><p id="S2.SS2.p1">Our goal was to test whether the ISAs picked up by word embedding-based scores are conveyed to a human audience. To account for the implicit nature of the associations under study, we sought to establish this semantic validity with a larger set of measures that may all be causally linked to implicit associations of the target group with admiration or fear in the stimulus sentence packages. We assumed that the more of the following preregistered hypotheses<xref rid="id2" ref-type="fn" specific-use="fn"><sup>2</sup></xref> were supported, the stronger the evidence for the semantic validity of word embedding-based ISA measures:</p><list id="S2.I1"><list-item id="S2.I1.ix1">
              
              
              
            <p id="S2.I1.ix1.p1"><italic>If word embedding-based measures indicate that a sentence package contains an ISA of a target group with fear, the representation of this target group will be perceived as more negative (compared to a corpus with no ISA).</italic></p></list-item><list-item id="S2.I1.ix2">
              
              
              
            <p id="S2.I1.ix2.p1"><italic>If word embedding-based measures indicate that a sentence package contains an ISA of a target group with fear, the representation of this target group will be perceived as more negative (compared to a corpus with an ISA with admiration).</italic></p></list-item><list-item id="S2.I1.ix3">
              
              
              
            <p id="S2.I1.ix3.p1"><italic>If word embedding-based measures indicate that a sentence package contains an ISA of a target group with admiration, the representation of this target group will be perceived as more positive (compared to a corpus with no ISA).</italic></p></list-item><list-item id="S2.I1.ix4">
              
              
              
            <p id="S2.I1.ix4.p1"><italic>If word embedding-based measures indicate that a sentence package contains an ISA of a target group with admiration, this target group will be perceived as (a) more admirable and as (b) less frightening (compared to a corpus with no ISA).</italic></p></list-item><list-item id="S2.I1.ix5">
              
              
              
            <p id="S2.I1.ix5.p1"><italic>If word embedding-based measures indicate that a corpus contains an ISA of a target group with admiration, this target group will be perceived as (a) more admirable and as (b) less frightening (compared to a corpus with an ISA with fear).</italic></p></list-item><list-item id="S2.I1.ix6">
              
              
              
            <p id="S2.I1.ix6.p1"><italic>If word embedding-based measures indicate that a sentence package contains an ISA of a target group with fear, this target group will be perceived as (a) more frightening and as (b) less admirable (compared to a corpus with no ISA).</italic></p></list-item></list></sec><sec id="S2.SS3">
        
        
        
        
      <title>Semantic Validation Procedure</title><p id="S2.SS3.p1">As outlined above, establishing the semantic validity of an automated content analytical measure means assessing whether its results are mirrored in human readers’ judgments of the same texts. For word embedding-based metrics, which can only be calculated for large text corpora, it is nearly impossible to have human study participants read the full textual material that these measures rate. Moreover, we usually do not know which sentences introduce a specific association into the models, so it is not straightforward to arrive at a reduced corpus that human readers could process. By means of reverse engineering, however, we can attempt to reconstruct those sentences within a corpus that likely contributed to the association expressed in scores derived from a word embedding model trained on that corpus. We can then extract these sentences from the corpus and have them rated by human study participants.</p><p id="S2.SS3.p2">As previously outlined, our semantic validation routine followed a three-step process. The overall procedure for generating the sentence packages and testing their WEAT-based association scores within a large corpus is visualized in Figure <xref rid="S2.F1" ref-type="fig">1</xref>. In addition, the table in Appendix B offers a concise overview of the consecutive text-processing steps. We explain the individual steps in the following subsections.</p><fig id="S2.F1"><label>Figure 1:</label><caption><title>Flowchart of the stimulus sentence package generation process</title></caption>
          
          
          
          
        <graphic xlink:href="intermediate/wevarticle.drawio" /></fig><sec id="S2.SS3.SSS1">
          
          
          
          
        <title>Step 1: Identifying Potentially ISA-Inducing Sentence Packages</title><p id="S2.SS3.SSS1.p1">The idea behind implicit associations within word embeddings is that certain wordsets represent specific concepts. In the present study’s context, wordset <inline-formula><mml:math id="S2.SS3.SSS1.p1.m1" alttext="\mathcal{A}" display="inline"><mml:mi class="ltx_font_mathcaligraphic">𝒜</mml:mi></mml:math></inline-formula> represents fear and wordset <inline-formula><mml:math id="S2.SS3.SSS1.p1.m2" alttext="\mathcal{B}" display="inline"><mml:mi class="ltx_font_mathcaligraphic">ℬ</mml:mi></mml:math></inline-formula> represents admiration. They comprise words that explicitly express these concepts, such as <italic>scary</italic> or <italic>admirable</italic>. Then, there are sets of context words (<inline-formula><mml:math id="S2.SS3.SSS1.p1.m3" alttext="\mathcal{A^{\prime}}" display="inline"><mml:msup><mml:mi class="ltx_font_mathcaligraphic">𝒜</mml:mi><mml:mo>′</mml:mo></mml:msup></mml:math></inline-formula> and <inline-formula><mml:math id="S2.SS3.SSS1.p1.m4" alttext="\mathcal{B^{\prime}}" display="inline"><mml:msup><mml:mi class="ltx_font_mathcaligraphic">ℬ</mml:mi><mml:mo>′</mml:mo></mml:msup></mml:math></inline-formula>, such as <italic>crime</italic> or <italic>actor</italic>). These context words often co-occur with the words included in the explicit wordset of the respective concept (e.g., <italic>actor</italic> often co-occurs with words such as <italic>admirable</italic>). 
An implicit association with <inline-formula><mml:math id="S2.SS3.SSS1.p1.m5" alttext="\mathcal{A}" display="inline"><mml:mi class="ltx_font_mathcaligraphic">𝒜</mml:mi></mml:math></inline-formula> or <inline-formula><mml:math id="S2.SS3.SSS1.p1.m6" alttext="\mathcal{B}" display="inline"><mml:mi class="ltx_font_mathcaligraphic">ℬ</mml:mi></mml:math></inline-formula> occurs, then, when a target group <inline-formula><mml:math id="S2.SS3.SSS1.p1.m7" alttext="s" display="inline"><mml:mi>s</mml:mi></mml:math></inline-formula> is often mentioned close to those context words. For instance, if <italic>Italian</italic> often co-occurs with context words such as <italic>actor</italic> but less so with explicit concept words such as <italic>admirable</italic>, the group of <italic>Italians</italic> would be indirectly—or: implicitly—associated with the concept admiration. If present, such ISAs can be measured using <inline-formula><mml:math id="S2.SS3.SSS1.p1.m8" alttext="NAS" display="inline"><mml:mrow><mml:mi>N</mml:mi><mml:mo>⁢</mml:mo><mml:mi>A</mml:mi><mml:mo>⁢</mml:mo><mml:mi>S</mml:mi></mml:mrow></mml:math></inline-formula> scores calculated from a word embedding model that was trained on a corpus of interest. This is how <inline-formula><mml:math id="S2.SS3.SSS1.p1.m9" alttext="NAS" display="inline"><mml:mrow><mml:mi>N</mml:mi><mml:mo>⁢</mml:mo><mml:mi>A</mml:mi><mml:mo>⁢</mml:mo><mml:mi>S</mml:mi></mml:mrow></mml:math></inline-formula> can be used to operationalize ISAs in text corpora.</p><p id="S2.SS3.SSS1.p2">To validate this operationalization, in a first step, we generated packages of sentences that established such an implicit association between a group name and the two emotions fear and admiration from the original study’s corpus (Müller et al., <xref rid="bib.bibx33" ref-type="bibr">2023</xref>). 
For this purpose, two sentences are required: One that links a context word with an explicit emotion word (thereby, charging the context word with the respective emotion) and one that links a target group with a context word (thereby, implicitly charging the group label with the respective emotion). Such a sentence pair, for instance, looks like this:</p><p><list list-type="bullet" id="S2.I2"><list-item id="S2.I2.i1">
                
                
                
              <p id="S2.I2.i1.p1"><italic>The Italian actor keeps personal matters private and rarely discusses family life.</italic></p></list-item><list-item id="S2.I2.i2">
                
                
                
              <p id="S2.I2.i2.p1"><italic>Audiences were moved by the actor’s admirable portrayal of a young musician.</italic></p></list-item></list></p><p id="S2.SS3.SSS1.p4">These two sentences establish an implicit association between the target <inline-formula><mml:math id="S2.SS3.SSS1.p4.m1" alttext="s" display="inline"><mml:mi>s</mml:mi></mml:math></inline-formula> (<italic>Italian</italic>) and the attribute admiration (<italic>admirable</italic>), because a context word (<italic>actor</italic>) co-occurs with both. The first sentence is an example of a target-context association sentence, in which the target is associated with a context. The second sentence is an example of a context-attribute association sentence, in which the context is associated with an attribute.</p><p id="S2.SS3.SSS1.p5">An example for the implicit association with fear is:</p><p><list list-type="bullet" id="S2.I3"><list-item id="S2.I3.i1">
                
                
                
              <p id="S2.I3.i1.p1"><italic>The Italian from the Left Party blamed the government for the escalation.</italic></p></list-item><list-item id="S2.I3.i2">
                
                
                
              <p id="S2.I3.i2.p1"><italic>The mayor fears an escalation of violence in the region.</italic></p></list-item></list></p><p id="S2.SS3.SSS1.p7">In this example, the context word <italic>escalation</italic> co-occurs with an explicit <italic>fear</italic> word as well as with the group label <italic>Italian</italic>. Importantly, to establish an implicit rather than an explicit link between the concepts, these linking sentences do not occur close to each other or even within the same documents of a text corpus, but are distributed across several texts within the corpus.</p><p id="S2.SS3.SSS1.p8">In stimulus sentence package construction, the overarching goal therefore was to look for context words (e.g., <inline-formula><mml:math id="S2.SS3.SSS1.p8.m1" alttext="\mathcal{A^{\prime}}" display="inline"><mml:msup><mml:mi class="ltx_font_mathcaligraphic">𝒜</mml:mi><mml:mo>′</mml:mo></mml:msup></mml:math></inline-formula>) that co-occur with attribute words (e.g., <inline-formula><mml:math id="S2.SS3.SSS1.p8.m2" alttext="\mathcal{A}" display="inline"><mml:mi class="ltx_font_mathcaligraphic">𝒜</mml:mi></mml:math></inline-formula>) in context-attribute association sentences, and that co-occur with the target (<inline-formula><mml:math id="S2.SS3.SSS1.p8.m3" alttext="s" display="inline"><mml:mi>s</mml:mi></mml:math></inline-formula>) in target-context association sentences that are otherwise free from evaluative language. Using the word embedding model and the wordsets for <inline-formula><mml:math id="S2.SS3.SSS1.p8.m4" alttext="\mathcal{A}" display="inline"><mml:mi class="ltx_font_mathcaligraphic">𝒜</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="S2.SS3.SSS1.p8.m5" alttext="\mathcal{B}" display="inline"><mml:mi class="ltx_font_mathcaligraphic">ℬ</mml:mi></mml:math></inline-formula> provided in Müller et al. 
(<xref rid="bib.bibx33" ref-type="bibr">2023</xref>), we generated a list of context words that are not contained in <inline-formula><mml:math id="S2.SS3.SSS1.p8.m6" alttext="\mathcal{A}" display="inline"><mml:mi class="ltx_font_mathcaligraphic">𝒜</mml:mi></mml:math></inline-formula> or <inline-formula><mml:math id="S2.SS3.SSS1.p8.m7" alttext="\mathcal{B}" display="inline"><mml:mi class="ltx_font_mathcaligraphic">ℬ</mml:mi></mml:math></inline-formula>, but co-occur with both.</p><p id="S2.SS3.SSS1.p9">To identify context words, we selected terms that are (1) semantically close to the words in <inline-formula><mml:math id="S2.SS3.SSS1.p9.m1" alttext="\mathcal{A}" display="inline"><mml:mi class="ltx_font_mathcaligraphic">𝒜</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="S2.SS3.SSS1.p9.m2" alttext="\mathcal{B}" display="inline"><mml:mi class="ltx_font_mathcaligraphic">ℬ</mml:mi></mml:math></inline-formula>, but not those words themselves, and (2) also semantically distant from the opposite wordset: <inline-formula><mml:math id="S2.SS3.SSS1.p9.m3" alttext="\mathcal{A^{\prime}}" display="inline"><mml:msup><mml:mi class="ltx_font_mathcaligraphic">𝒜</mml:mi><mml:mo>′</mml:mo></mml:msup></mml:math></inline-formula> are terms close to <inline-formula><mml:math id="S2.SS3.SSS1.p9.m4" alttext="\mathcal{A}" display="inline"><mml:mi class="ltx_font_mathcaligraphic">𝒜</mml:mi></mml:math></inline-formula>, but distant from <inline-formula><mml:math id="S2.SS3.SSS1.p9.m5" alttext="\mathcal{B}" display="inline"><mml:mi class="ltx_font_mathcaligraphic">ℬ</mml:mi></mml:math></inline-formula>; <inline-formula><mml:math id="S2.SS3.SSS1.p9.m6" alttext="\mathcal{B^{\prime}}" display="inline"><mml:msup><mml:mi class="ltx_font_mathcaligraphic">ℬ</mml:mi><mml:mo>′</mml:mo></mml:msup></mml:math></inline-formula> are terms close to <inline-formula><mml:math id="S2.SS3.SSS1.p9.m7" alttext="\mathcal{B}" display="inline"><mml:mi 
class="ltx_font_mathcaligraphic">ℬ</mml:mi></mml:math></inline-formula>, but distant from <inline-formula><mml:math id="S2.SS3.SSS1.p9.m8" alttext="\mathcal{A}" display="inline"><mml:mi class="ltx_font_mathcaligraphic">𝒜</mml:mi></mml:math></inline-formula>. We applied the SemAxis technique by An et al. (<xref rid="bib.bibx1" ref-type="bibr">2018</xref>) to find these context words. First, we calculated the column mean vectors <inline-formula><mml:math id="S2.SS3.SSS1.p9.m9" alttext="\mathbf{V}^{\mathcal{A}}" display="inline"><mml:msup><mml:mi>𝐕</mml:mi><mml:mi class="ltx_font_mathcaligraphic">𝒜</mml:mi></mml:msup></mml:math></inline-formula> and <inline-formula><mml:math id="S2.SS3.SSS1.p9.m10" alttext="\mathbf{V}^{\mathcal{B}}" display="inline"><mml:msup><mml:mi>𝐕</mml:mi><mml:mi class="ltx_font_mathcaligraphic">ℬ</mml:mi></mml:msup></mml:math></inline-formula> of all word vectors of words in <inline-formula><mml:math id="S2.SS3.SSS1.p9.m11" alttext="\mathcal{A}" display="inline"><mml:mi class="ltx_font_mathcaligraphic">𝒜</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="S2.SS3.SSS1.p9.m12" alttext="\mathcal{B}" display="inline"><mml:mi class="ltx_font_mathcaligraphic">ℬ</mml:mi></mml:math></inline-formula> respectively as:</p><disp-formula id="S2.E6">
              
              
              
            <mml:math id="S2.E6.m1" alttext="\mathbf{V}^{\mathcal{A}}=\frac{1}{|\mathcal{A}|}\sum\nolimits_{a\in\mathcal{A}%&#10;}\boldsymbol{w_{a}}" display="block"><mml:mrow><mml:msup><mml:mi>𝐕</mml:mi><mml:mi class="ltx_font_mathcaligraphic">𝒜</mml:mi></mml:msup><mml:mo>=</mml:mo><mml:mrow><mml:mfrac><mml:mn>1</mml:mn><mml:mrow><mml:mo stretchy="false">|</mml:mo><mml:mi class="ltx_font_mathcaligraphic">𝒜</mml:mi><mml:mo stretchy="false">|</mml:mo></mml:mrow></mml:mfrac><mml:mo>⁢</mml:mo><mml:mrow><mml:msub><mml:mo>∑</mml:mo><mml:mrow><mml:mi>a</mml:mi><mml:mo>∈</mml:mo><mml:mi class="ltx_font_mathcaligraphic">𝒜</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mi>𝒘</mml:mi><mml:mi>𝒂</mml:mi></mml:msub></mml:mrow></mml:mrow></mml:mrow></mml:math></disp-formula><disp-formula id="S2.E7">
              
              
              
            <mml:math id="S2.E7.m1" alttext="\mathbf{V}^{\mathcal{B}}=\frac{1}{|\mathcal{B}|}\sum\nolimits_{b\in\mathcal{B}%&#10;}\boldsymbol{w_{b}}" display="block"><mml:mrow><mml:msup><mml:mi>𝐕</mml:mi><mml:mi class="ltx_font_mathcaligraphic">ℬ</mml:mi></mml:msup><mml:mo>=</mml:mo><mml:mrow><mml:mfrac><mml:mn>1</mml:mn><mml:mrow><mml:mo stretchy="false">|</mml:mo><mml:mi class="ltx_font_mathcaligraphic">ℬ</mml:mi><mml:mo stretchy="false">|</mml:mo></mml:mrow></mml:mfrac><mml:mo>⁢</mml:mo><mml:mrow><mml:msub><mml:mo>∑</mml:mo><mml:mrow><mml:mi>b</mml:mi><mml:mo>∈</mml:mo><mml:mi class="ltx_font_mathcaligraphic">ℬ</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mi>𝒘</mml:mi><mml:mi>𝒃</mml:mi></mml:msub></mml:mrow></mml:mrow></mml:mrow></mml:math></disp-formula><p id="S2.SS3.SSS1.p12">Subtracting <inline-formula><mml:math id="S2.SS3.SSS1.p12.m1" alttext="\mathbf{V}^{\mathcal{B}}" display="inline"><mml:msup><mml:mi>𝐕</mml:mi><mml:mi class="ltx_font_mathcaligraphic">ℬ</mml:mi></mml:msup></mml:math></inline-formula> from <inline-formula><mml:math id="S2.SS3.SSS1.p12.m2" alttext="\mathbf{V}^{\mathcal{A}}" display="inline"><mml:msup><mml:mi>𝐕</mml:mi><mml:mi class="ltx_font_mathcaligraphic">𝒜</mml:mi></mml:msup></mml:math></inline-formula> gives the semantic axis vector <inline-formula><mml:math id="S2.SS3.SSS1.p12.m3" alttext="\mathbf{V_{axis}}" display="inline"><mml:msub><mml:mi>𝐕</mml:mi><mml:mi>𝐚𝐱𝐢𝐬</mml:mi></mml:msub></mml:math></inline-formula>.</p><disp-formula id="S2.E8">
              
              
              
            <mml:math id="S2.E8.m1" alttext="\mathbf{V_{axis}}=\mathbf{V}^{\mathcal{A}}-\mathbf{V}^{\mathcal{B}}" display="block"><mml:mrow><mml:msub><mml:mi>𝐕</mml:mi><mml:mi>𝐚𝐱𝐢𝐬</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:mrow><mml:msup><mml:mi>𝐕</mml:mi><mml:mi class="ltx_font_mathcaligraphic">𝒜</mml:mi></mml:msup><mml:mo>−</mml:mo><mml:msup><mml:mi>𝐕</mml:mi><mml:mi class="ltx_font_mathcaligraphic">ℬ</mml:mi></mml:msup></mml:mrow></mml:mrow></mml:math></disp-formula><p id="S2.SS3.SSS1.p14">Given a word <inline-formula><mml:math id="S2.SS3.SSS1.p14.m1" alttext="c" display="inline"><mml:mi>c</mml:mi></mml:math></inline-formula> and its word vector <inline-formula><mml:math id="S2.SS3.SSS1.p14.m2" alttext="\boldsymbol{w_{c}}" display="inline"><mml:msub><mml:mi>𝒘</mml:mi><mml:mi>𝒄</mml:mi></mml:msub></mml:math></inline-formula>, we calculated <inline-formula><mml:math id="S2.SS3.SSS1.p14.m3" alttext="cos(\boldsymbol{w_{c}},\mathbf{V_{axis}})" display="inline"><mml:mrow><mml:mi>c</mml:mi><mml:mo>⁢</mml:mo><mml:mi>o</mml:mi><mml:mo>⁢</mml:mo><mml:mi>s</mml:mi><mml:mo>⁢</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:msub><mml:mi>𝒘</mml:mi><mml:mi>𝒄</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>𝐕</mml:mi><mml:mi>𝐚𝐱𝐢𝐬</mml:mi></mml:msub><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula> for all words covered in <inline-formula><mml:math id="S2.SS3.SSS1.p14.m4" alttext="\boldsymbol{w}" display="inline"><mml:mi>𝒘</mml:mi></mml:math></inline-formula>. 
We selected a word <inline-formula><mml:math id="S2.SS3.SSS1.p14.m5" alttext="d" display="inline"><mml:mi>d</mml:mi></mml:math></inline-formula> in <inline-formula><mml:math id="S2.SS3.SSS1.p14.m6" alttext="\mathcal{A^{\prime}}" display="inline"><mml:msup><mml:mi class="ltx_font_mathcaligraphic">𝒜</mml:mi><mml:mo>′</mml:mo></mml:msup></mml:math></inline-formula> if it satisfied the following criteria: (1) <inline-formula><mml:math id="S2.SS3.SSS1.p14.m7" alttext="d" display="inline"><mml:mi>d</mml:mi></mml:math></inline-formula> co-occurs with <inline-formula><mml:math id="S2.SS3.SSS1.p14.m8" alttext="s" display="inline"><mml:mi>s</mml:mi></mml:math></inline-formula> in a sentence of the corpus; (2) <inline-formula><mml:math id="S2.SS3.SSS1.p14.m9" alttext="d" display="inline"><mml:mi>d</mml:mi></mml:math></inline-formula> is not in <inline-formula><mml:math id="S2.SS3.SSS1.p14.m10" alttext="\mathcal{A}" display="inline"><mml:mi class="ltx_font_mathcaligraphic">𝒜</mml:mi></mml:math></inline-formula>; and (3) <inline-formula><mml:math id="S2.SS3.SSS1.p14.m11" alttext="\boldsymbol{w_{d}}" display="inline"><mml:msub><mml:mi>𝒘</mml:mi><mml:mi>𝒅</mml:mi></mml:msub></mml:math></inline-formula> has a high cosine similarity to <inline-formula><mml:math id="S2.SS3.SSS1.p14.m12" alttext="\mathbf{V_{axis}}" display="inline"><mml:msub><mml:mi>𝐕</mml:mi><mml:mi>𝐚𝐱𝐢𝐬</mml:mi></mml:msub></mml:math></inline-formula>. 
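The axis construction in Equations 6 to 8 and a ranking of candidate context words along it can be sketched as follows (an illustration under our own naming, not the code of An et al. (2018) or of the present study; <monospace>emb</monospace>, <monospace>cooccurring</monospace>, and the toy wordsets are hypothetical, and for simplicity this sketch excludes candidates belonging to either attribute set):

```python
import numpy as np

def cos(u, v):
    """Cosine similarity between two vectors."""
    u, v = np.asarray(u, float), np.asarray(v, float)
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def context_wordsets(emb, A, B, cooccurring, top_n=200):
    """Rank candidate context words along the SemAxis between the
    attribute sets A (fear) and B (admiration).

    emb         -- dict mapping words to embedding vectors (hypothetical)
    A, B        -- sets of explicit attribute words
    cooccurring -- words sharing a corpus sentence with the target s
    Returns (A_prime, B_prime): the words with the highest and the
    lowest cosine similarity to the axis vector, respectively."""
    V_A = np.mean([emb[a] for a in A if a in emb], axis=0)  # Eq. 6
    V_B = np.mean([emb[b] for b in B if b in emb], axis=0)  # Eq. 7
    V_axis = V_A - V_B                                      # Eq. 8
    # Candidates must co-occur with the target and not be attribute words.
    cands = [w for w in cooccurring if w in emb and w not in A | B]
    ranked = sorted(cands, key=lambda w: cos(emb[w], V_axis), reverse=True)
    return ranked[:top_n], ranked[-top_n:][::-1]
```

On toy vectors where <italic>crime</italic> lies near the fear pole and <italic>actor</italic> near the admiration pole, the function returns <italic>crime</italic> among the fear-side context words and <italic>actor</italic> among the admiration-side ones.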
For <inline-formula><mml:math id="S2.SS3.SSS1.p14.m13" alttext="\mathcal{B^{\prime}}" display="inline"><mml:msup><mml:mi class="ltx_font_mathcaligraphic">ℬ</mml:mi><mml:mo>′</mml:mo></mml:msup></mml:math></inline-formula>, the criteria are the same, except we selected a word <inline-formula><mml:math id="S2.SS3.SSS1.p14.m14" alttext="d" display="inline"><mml:mi>d</mml:mi></mml:math></inline-formula> which has a low cosine similarity to <inline-formula><mml:math id="S2.SS3.SSS1.p14.m15" alttext="\mathbf{V_{axis}}" display="inline"><mml:msub><mml:mi>𝐕</mml:mi><mml:mi>𝐚𝐱𝐢𝐬</mml:mi></mml:msub></mml:math></inline-formula><xref rid="id3" ref-type="fn" specific-use="fn"><sup>3</sup></xref> and not in <inline-formula><mml:math id="S2.SS3.SSS1.p14.m16" alttext="\mathcal{B}" display="inline"><mml:mi class="ltx_font_mathcaligraphic">ℬ</mml:mi></mml:math></inline-formula>. For <inline-formula><mml:math id="S2.SS3.SSS1.p14.m17" alttext="\mathcal{A^{\prime}}" display="inline"><mml:msup><mml:mi class="ltx_font_mathcaligraphic">𝒜</mml:mi><mml:mo>′</mml:mo></mml:msup></mml:math></inline-formula> and <inline-formula><mml:math id="S2.SS3.SSS1.p14.m18" alttext="\mathcal{B^{\prime}}" display="inline"><mml:msup><mml:mi class="ltx_font_mathcaligraphic">ℬ</mml:mi><mml:mo>′</mml:mo></mml:msup></mml:math></inline-formula>, we selected the top 200 words according to their cosine similarity scores.</p><p id="S2.SS3.SSS1.p15">Using <inline-formula><mml:math id="S2.SS3.SSS1.p15.m1" alttext="s" display="inline"><mml:mi>s</mml:mi></mml:math></inline-formula>, <inline-formula><mml:math id="S2.SS3.SSS1.p15.m2" alttext="\mathcal{A}" display="inline"><mml:mi class="ltx_font_mathcaligraphic">𝒜</mml:mi></mml:math></inline-formula>, <inline-formula><mml:math id="S2.SS3.SSS1.p15.m3" alttext="\mathcal{B}" display="inline"><mml:mi class="ltx_font_mathcaligraphic">ℬ</mml:mi></mml:math></inline-formula> provided by Müller et al. 
(<xref rid="bib.bibx33" ref-type="bibr">2023</xref>) and the created context wordsets <inline-formula><mml:math id="S2.SS3.SSS1.p15.m4" alttext="\mathcal{A^{\prime}}" display="inline"><mml:msup><mml:mi class="ltx_font_mathcaligraphic">𝒜</mml:mi><mml:mo>′</mml:mo></mml:msup></mml:math></inline-formula> and <inline-formula><mml:math id="S2.SS3.SSS1.p15.m5" alttext="\mathcal{B^{\prime}}" display="inline"><mml:msup><mml:mi class="ltx_font_mathcaligraphic">ℬ</mml:mi><mml:mo>′</mml:mo></mml:msup></mml:math></inline-formula>, we searched the original corpus for matching pairs of target-context association and context-attribute association sentences for the admiration and fear conditions. In addition to these two sentence packages, we also deemed it useful for the purpose of our survey experiment to have a sentence package in which no implicit or explicit association between the target group and the two emotions fear and admiration was established. This neutral sentence package should then serve as a control condition in the experimental setting. 
As control sentences, we selected pairs consisting of one sentence that contained only <inline-formula><mml:math id="S2.SS3.SSS1.p15.m6" alttext="s" display="inline"><mml:mi>s</mml:mi></mml:math></inline-formula> and no words from <inline-formula><mml:math id="S2.SS3.SSS1.p15.m7" alttext="\mathcal{A}" display="inline"><mml:mi class="ltx_font_mathcaligraphic">𝒜</mml:mi></mml:math></inline-formula>, <inline-formula><mml:math id="S2.SS3.SSS1.p15.m8" alttext="\mathcal{B}" display="inline"><mml:mi class="ltx_font_mathcaligraphic">ℬ</mml:mi></mml:math></inline-formula>, <inline-formula><mml:math id="S2.SS3.SSS1.p15.m9" alttext="\mathcal{A^{\prime}}" display="inline"><mml:msup><mml:mi class="ltx_font_mathcaligraphic">𝒜</mml:mi><mml:mo>′</mml:mo></mml:msup></mml:math></inline-formula>, or <inline-formula><mml:math id="S2.SS3.SSS1.p15.m10" alttext="\mathcal{B^{\prime}}" display="inline"><mml:msup><mml:mi class="ltx_font_mathcaligraphic">ℬ</mml:mi><mml:mo>′</mml:mo></mml:msup></mml:math></inline-formula>, and one sentence that contained neither <inline-formula><mml:math id="S2.SS3.SSS1.p15.m11" alttext="s" display="inline"><mml:mi>s</mml:mi></mml:math></inline-formula> nor words from <inline-formula><mml:math id="S2.SS3.SSS1.p15.m12" alttext="\mathcal{A}" display="inline"><mml:mi class="ltx_font_mathcaligraphic">𝒜</mml:mi></mml:math></inline-formula>, <inline-formula><mml:math id="S2.SS3.SSS1.p15.m13" alttext="\mathcal{B}" display="inline"><mml:mi class="ltx_font_mathcaligraphic">ℬ</mml:mi></mml:math></inline-formula>, <inline-formula><mml:math id="S2.SS3.SSS1.p15.m14" alttext="\mathcal{A^{\prime}}" display="inline"><mml:msup><mml:mi class="ltx_font_mathcaligraphic">𝒜</mml:mi><mml:mo>′</mml:mo></mml:msup></mml:math></inline-formula>, or <inline-formula><mml:math id="S2.SS3.SSS1.p15.m15" alttext="\mathcal{B^{\prime}}" display="inline"><mml:msup><mml:mi class="ltx_font_mathcaligraphic">ℬ</mml:mi><mml:mo>′</mml:mo></mml:msup></mml:math></inline-formula>.</p><p id="S2.SS3.SSS1.p16">We found
that this method did not generate a sufficient number of sentence pairs associating the selected group label ‘Italian’ with the concept of fear. Therefore, we drew sentences for a similar group label (<italic>Spanish</italic>) and replaced this group name with <italic>Italian</italic> in all sentences. In this way, we created 347 sentence pairs for each condition. To make the final data set semantically meaningful for human readers, we manually removed duplicates, rephrased fragmentary and incoherent sentences, and arrived at one sentence-pair dummy corpus for each condition.</p></sec><sec id="S2.SS3.SSS2">
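The pairing logic of this step, i.e. matching a target-context sentence with a context-attribute sentence that shares the same context word, can be sketched as follows (sentences are represented as token sets; `find_sentence_pairs` is an illustrative helper, not code from the study):

```python
def find_sentence_pairs(sentences, target, context_words, attribute_words):
    """Pair sentences that link the target label s to a context word d with
    sentences that link the same d to an attribute word (fear/admiration).

    sentences: list of token sets; target: group label s
    context_words: A' or B'; attribute_words: A or B
    """
    pairs = []
    for d in context_words:
        # target-context sentences: contain both s and d
        tc = [s for s in sentences if target in s and d in s]
        # context-attribute sentences: contain d and an attribute word, not s
        ca = [s for s in sentences
              if d in s and target not in s
              and any(a in s for a in attribute_words)]
        pairs.extend(zip(tc, ca))
    return pairs
```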
          
          
          
          
        <title>Step 2: Measuring ISAs of the Sentence Packages With WEAT-based Scores</title><p id="S2.SS3.SSS2.p1">Next, our goal was to assess whether the sentences extracted in Step 1 actually introduced the expected ISAs within the study corpus according to word embedding-based <inline-formula><mml:math id="S2.SS3.SSS2.p1.m1" alttext="NAS" display="inline"><mml:mrow><mml:mi>N</mml:mi><mml:mo>⁢</mml:mo><mml:mi>A</mml:mi><mml:mo>⁢</mml:mo><mml:mi>S</mml:mi></mml:mrow></mml:math></inline-formula> scores. The sentences selected to introduce an ISA of the target group with fear should receive a word embedding-based rating that points in the direction of fear, the admiration dummy corpus should receive the reverse rating, and the control corpus should fall somewhere in between. This is a prerequisite of the subsequent semantic validation.</p><p id="S2.SS3.SSS2.p2">As word embedding models are not statistically robust when trained on very small corpora such as our sentence packages, we had to re-introduce the sentence packages into the original study corpus first. For this purpose, before estimating <inline-formula><mml:math id="S2.SS3.SSS2.p2.m1" alttext="NAS" display="inline"><mml:mrow><mml:mi>N</mml:mi><mml:mo>⁢</mml:mo><mml:mi>A</mml:mi><mml:mo>⁢</mml:mo><mml:mi>S</mml:mi></mml:mrow></mml:math></inline-formula> scores for the three sentence packages, we tokenized the original corpus from Müller et al. (<xref rid="bib.bibx33" ref-type="bibr">2023</xref>) into sentences, removed all mentions of <italic>Italian(s)</italic>, and created three versions of the corpus, each of which was merged with all sentences from one of the three sentence packages. Each of these corpora thus contained only those sentences mentioning the target group that were included in the respective sentence package. Otherwise, they fully mirrored the original study’s corpus. 
Then, for each of the sentence packages, we trained a GloVe word embedding model with the same hyperparameters used by Müller et al. (<xref rid="bib.bibx33" ref-type="bibr">2023</xref>) <italic>de novo</italic> on these artificially constructed corpora. Finally, we calculated <inline-formula><mml:math id="S2.SS3.SSS2.p2.m2" alttext="NAS(s,\mathcal{A},\mathcal{B},\boldsymbol{w})" display="inline"><mml:mrow><mml:mi>N</mml:mi><mml:mo>⁢</mml:mo><mml:mi>A</mml:mi><mml:mo>⁢</mml:mo><mml:mi>S</mml:mi><mml:mo>⁢</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>s</mml:mi><mml:mo>,</mml:mo><mml:mi class="ltx_font_mathcaligraphic">𝒜</mml:mi><mml:mo>,</mml:mo><mml:mi class="ltx_font_mathcaligraphic">ℬ</mml:mi><mml:mo>,</mml:mo><mml:mi>𝒘</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula> for all three corpora.</p><p id="S2.SS3.SSS2.p3">With only about 300 mentions of <inline-formula><mml:math id="S2.SS3.SSS2.p3.m1" alttext="s" display="inline"><mml:mi>s</mml:mi></mml:math></inline-formula> within the artificial corpora, we expected the resulting word embeddings not to be very stable. Following Antoniak &amp; Mimno (<xref rid="bib.bibx4" ref-type="bibr">2018</xref>), who showed that the size of a corpus affects only the variance of a word embedding-based measurement but not its central tendency, we repeated the creation of a word embedding model and the measurement of its ISAs one hundred times. 
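As an illustrative sketch only: if NAS(s, A, B, w) is read as the difference between the target vector's mean cosine similarity to the attribute sets A and B (the exact normalization in Müller et al., 2023, may differ), the repeated measurement over re-trained models could look like this, where `train_model` is a hypothetical callback standing in for a full GloVe training run:

```python
import numpy as np

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def nas(w_s, A_vecs, B_vecs):
    """Relative association of the target vector w_s with attribute sets:
    mean cosine to A minus mean cosine to B. Positive values lean towards A
    (admiration), negative towards B (fear). Illustrative reading of NAS."""
    return (np.mean([cosine(w_s, a) for a in A_vecs])
            - np.mean([cosine(w_s, b) for b in B_vecs]))

def repeated_nas(train_model, n_runs=100):
    """Re-train the embedding model and re-measure n_runs times; the spread of
    the resulting distribution reflects the instability of small corpora
    (Antoniak & Mimno, 2018). train_model() must return (w_s, A_vecs, B_vecs)."""
    return [nas(*train_model()) for _ in range(n_runs)]
```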
Figure <xref rid="S2.F2" ref-type="fig">2</xref> shows the resulting distributions of <inline-formula><mml:math id="S2.SS3.SSS2.p3.m2" alttext="NAS(s,\mathcal{A},\mathcal{B},\boldsymbol{w})" display="inline"><mml:mrow><mml:mi>N</mml:mi><mml:mo>⁢</mml:mo><mml:mi>A</mml:mi><mml:mo>⁢</mml:mo><mml:mi>S</mml:mi><mml:mo>⁢</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>s</mml:mi><mml:mo>,</mml:mo><mml:mi class="ltx_font_mathcaligraphic">𝒜</mml:mi><mml:mo>,</mml:mo><mml:mi class="ltx_font_mathcaligraphic">ℬ</mml:mi><mml:mo>,</mml:mo><mml:mi>𝒘</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula> for the three dummy corpora.<xref rid="id4" ref-type="fn" specific-use="fn"><sup>4</sup></xref></p><fig id="S2.F2"><label>Figure 2:</label><caption><title>Distribution of the NAS scores for the admiration, fear and control dummy corpora. Negative values denote more implicit fear, positive values more implicit admiration.</title></caption>
            
            
            
            
          <graphic xlink:href="intermediate/ridges_gg" /></fig><p id="S2.SS3.SSS2.p4">The median values diverge in the expected directions across all three dummy corpora: The admiration dummy corpus has a positive median <inline-formula><mml:math id="S2.SS3.SSS2.p4.m1" alttext="NAS(s,\mathcal{A},\mathcal{B},\boldsymbol{w})" display="inline"><mml:mrow><mml:mi>N</mml:mi><mml:mo>⁢</mml:mo><mml:mi>A</mml:mi><mml:mo>⁢</mml:mo><mml:mi>S</mml:mi><mml:mo>⁢</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>s</mml:mi><mml:mo>,</mml:mo><mml:mi class="ltx_font_mathcaligraphic">𝒜</mml:mi><mml:mo>,</mml:mo><mml:mi class="ltx_font_mathcaligraphic">ℬ</mml:mi><mml:mo>,</mml:mo><mml:mi>𝒘</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula>, the fear dummy corpus a negative median, and the control dummy corpus a value between the two. At the same time, the control dummy corpus is closer to the admiration dummy corpus, with some overlap between the runs. This is in line with qualitative observations we made during the manual cleanup step: We found our method to generate more clearly fear-laden sentences than admiration-laden ones. Furthermore, the sentences for the control condition were not completely “neutral” in the sense of being free from any potentially emotion-eliciting words. 
They did, of course, not feature words from our word lists <inline-formula><mml:math id="S2.SS3.SSS2.p4.m2" alttext="\mathcal{A}" display="inline"><mml:mi class="ltx_font_mathcaligraphic">𝒜</mml:mi></mml:math></inline-formula>, <inline-formula><mml:math id="S2.SS3.SSS2.p4.m3" alttext="\mathcal{B}" display="inline"><mml:mi class="ltx_font_mathcaligraphic">ℬ</mml:mi></mml:math></inline-formula>, <inline-formula><mml:math id="S2.SS3.SSS2.p4.m4" alttext="\mathcal{A^{\prime}}" display="inline"><mml:msup><mml:mi class="ltx_font_mathcaligraphic">𝒜</mml:mi><mml:mo>′</mml:mo></mml:msup></mml:math></inline-formula>, or <inline-formula><mml:math id="S2.SS3.SSS2.p4.m5" alttext="\mathcal{B^{\prime}}" display="inline"><mml:msup><mml:mi class="ltx_font_mathcaligraphic">ℬ</mml:mi><mml:mo>′</mml:mo></mml:msup></mml:math></inline-formula>, yet they seemed to contain potential traces of other emotions or group-related valence. The sentences are therefore not emotionally “neutral” in an absolute sense, but in a relational one, meaning that they did not carry the two specific emotions fear and admiration. We deemed this sufficient for the purpose of the present validation attempt.</p></sec><sec id="S2.SS3.SSS3">
          
          
          
          
        <title>Step 3: Measuring ISAs of the Sentence Packages as Perceived By Humans</title><p id="S2.SS3.SSS3.p1">After testing whether our sentence packages generated the expected ISAs within the original corpus according to word embedding-based measures, we were ready to test whether the same sentences also conveyed ISAs that human readers would pick up. To do so, we drew a random sample of sentences for each condition, featuring 25 sentence pairs (i.e., 50 sentences in total). This additional reduction step was performed to avoid wear-out effects and unit non-response among participants provoked by overly lengthy stimulus exposure. Our pretest showed that native German speakers read 25 sentence pairs (or 50 sentences) in less than 10 minutes.</p><p id="S2.SS3.SSS3.p2">To still account for the semantic variance within the sentence packages, we drew three random samples of 25 sentence pairs as a stimulus package for each condition, giving us three stimulus packages of admiration sentence pairs, three stimulus packages of fear sentence pairs, and three stimulus packages of neutral sentence pairs for the control condition. In total, nine different stimulus sentence packages were created (3 ISA conditions <inline-formula><mml:math id="S2.SS3.SSS3.p2.m1" alttext="\times" display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 3 randomized packages).<xref rid="id5" ref-type="fn" specific-use="fn"><sup>5</sup></xref></p><p id="S2.SS3.SSS3.p3"><bold>Participants.</bold> To conduct the planned 3 <inline-formula><mml:math id="S2.SS3.SSS3.p3.m1" alttext="\times" display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 3-between-subjects experiment, we recruited a sample from the non-commercial German-language online-access panel SoSciPanel (Leiner, <xref rid="bib.bibx27" ref-type="bibr">2016</xref>). 
As members of this panel participate in studies without receiving incentives, they typically bring a high intrinsic motivation to contribute to research, which leads to relatively high data quality but also a strong sampling bias towards highly educated respondents. While this can be problematic for many types of research, we deemed it particularly advantageous for our research goal of detecting implicit associations within sentence packages, as this task might require close reading. We aimed to obtain 1,200 observations, or 133.3 observations per treatment. With this sample size, we could detect a so-called small effect size (<inline-formula><mml:math id="S2.SS3.SSS3.p3.m2" alttext="\eta^{2}" display="inline"><mml:msup><mml:mi>η</mml:mi><mml:mn>2</mml:mn></mml:msup></mml:math></inline-formula> = 0.02) at <inline-formula><mml:math id="S2.SS3.SSS3.p3.m3" alttext="\alpha=0.11" display="inline"><mml:mrow><mml:mi>α</mml:mi><mml:mo>=</mml:mo><mml:mn>0.11</mml:mn></mml:mrow></mml:math></inline-formula> (which is the selected <inline-formula><mml:math id="S2.SS3.SSS3.p3.m4" alttext="\alpha" display="inline"><mml:mi>α</mml:mi></mml:math></inline-formula> level for the Bayesian modelling strategy, following McElreath, <xref rid="bib.bibx32" ref-type="bibr">2016</xref>) with almost 100% statistical power. The study invitation that was sent out via e-mail to registered panel members led to an unusually high conversion rate. Thus, after data clean-up, the final sample still consisted of <italic>n</italic> = 1937 individuals
(48.4 % male; 83.5 % with highest German secondary school degree ‘Abitur’; age: <italic>M</italic> = 53.3, SD = 13.9).</p><p id="S2.SS3.SSS3.p4"><bold>Procedure.</bold> Upon arrival at the survey platform, participants were given detailed information about the study (without unmasking the actual research purpose) and actively consented to participation. Each participant was, then, randomly allocated to read one out of the nine stimuli. Randomization checks found no significant differences between the treatment groups regarding sociodemographics and political left-right orientation. Each stimulus contained 50 news sentences, which were displayed on five subsequent pages, presenting 10 news sentences each. All sentences were drawn in a random order from the respective stimulus sentence package during the experiment. To avoid priming effects, participants were instructed to observe the language and tonality journalists use in the sentences in general and that they would be asked to rate this language on a number of dimensions after exposure. After stimulus confrontation, the dependent variables were assessed. Finally, participants were thanked for their time and fully debriefed.</p><p id="S2.SS3.SSS3.p5"><bold>Survey Measures.</bold> Following stimulus exposure, we first asked participants to rate the language of the sentences they had read (e.g., concerning comprehensibility and complexity). This was included as a distraction task and to fulfill participants’ expectations that they would have to rate journalists’ language on multiple dimensions. Subsequently, we used one item to capture the perceived valence of the sentence packages. Similar to a feeling thermometer, we assessed the negativity/positivity participants felt the sentence package they read expressed towards the group of Italians, ranging from “very positive” to “very negative” on a 7-point scale (<italic>“How positive/negative is the portrayal of Italians in the sentences you have read?”</italic>). 
This item was used to test hypotheses H1, H2, and H3. To test hypotheses H4, H5, and H6, two further single-item measures assessed perceived admiration and perceived fear on a 7-point scale, ranging from “very much” to “very little” (<italic>“How much admiration/fear was expressed towards Italians in the sentences you have read?”</italic>). The original German versions of all measures are available in Appendix C in the OSF repository.</p><p id="S2.SS3.SSS3.p6"><bold>Statistical Analysis.</bold> Following our preregistered analysis plan, we applied Bayesian Analysis of Variance to test our hypotheses (Bürkner, <xref rid="bib.bibx9" ref-type="bibr">2017</xref>). We chose the Bayesian approach due to its pragmatic advantages, such as yielding directly interpretable uncertainty statements about parameters and providing mild regularization of estimates via informative priors. We report both (1) the conditional effects and (2) the effect size <inline-formula><mml:math id="S2.SS3.SSS3.p6.m1" alttext="\eta^{2}" display="inline"><mml:msup><mml:mi>η</mml:mi><mml:mn>2</mml:mn></mml:msup></mml:math></inline-formula> with its 89% high-density intervals (Lüdecke et al., <xref rid="bib.bibx30" ref-type="bibr">2022</xref>). For all analyses, we used a weakly informative prior of <inline-formula><mml:math id="S2.SS3.SSS3.p6.m2" alttext="\mathcal{N}(0,1)" display="inline"><mml:mrow><mml:mi class="ltx_font_mathcaligraphic">𝒩</mml:mi><mml:mo>⁢</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula>. 
This prior assumes that before conducting our experiment there was no evidence to show <inline-formula><mml:math id="S2.SS3.SSS3.p6.m3" alttext="NAS(s,\mathcal{A},\mathcal{B},\boldsymbol{w})" display="inline"><mml:mrow><mml:mi>N</mml:mi><mml:mo>⁢</mml:mo><mml:mi>A</mml:mi><mml:mo>⁢</mml:mo><mml:mi>S</mml:mi><mml:mo>⁢</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>s</mml:mi><mml:mo>,</mml:mo><mml:mi class="ltx_font_mathcaligraphic">𝒜</mml:mi><mml:mo>,</mml:mo><mml:mi class="ltx_font_mathcaligraphic">ℬ</mml:mi><mml:mo>,</mml:mo><mml:mi>𝒘</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula> is a valid measure of ISAs conveyed to human readers in the stimulus sentence packages. Four Bayesian models for the four dependent variables perceived valence, perceived admiration, perceived fear, and assumed message effects were constructed (see Appendix D for regression coefficients and Appendix E for posterior predictive checks). For these analyses, participants exposed to either of the three different sentence packages within the same ISA condition were collapsed, leading to a total of three different analytical units. In the following, we interpret the conditional effects and the effect sizes from these models.</p></sec></sec></sec>
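The sample-size reasoning reported above (detecting a small effect of η² = 0.02 at α = 0.11 with N = 1,200) can be sanity-checked with a standard noncentral-F power computation for a frequentist one-way ANOVA across the three collapsed ISA conditions. This is only an approximate analogue, since the study itself fitted Bayesian models:

```python
from scipy.stats import f, ncf

def anova_power(eta2, alpha, n_total, k_groups):
    """Approximate power of a one-way ANOVA for a given eta-squared."""
    f2 = eta2 / (1.0 - eta2)                    # Cohen's f^2
    lam = f2 * n_total                          # noncentrality parameter
    dfn, dfd = k_groups - 1, n_total - k_groups
    crit = f.ppf(1.0 - alpha, dfn, dfd)         # critical F value
    return float(1.0 - ncf.cdf(crit, dfn, dfd, lam))

# With the study's design parameters, power is indeed close to 1
power = anova_power(eta2=0.02, alpha=0.11, n_total=1200, k_groups=3)
```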
    <sec id="S3">
      
      
      
      
    <title>Results</title><sec id="S3.SS1">
        
        
        
        
      <title>Perceived Valence</title><fig id="S3.F3"><label>Figure 3:</label><caption><title>Conditional effect plots of perceived valence (top), perceived fear (center), and perceived admiration (bottom)</title></caption>
          
          
          
          
        <graphic xlink:href="intermediate/combined" /></fig><p id="S3.SS1.p1">Figure <xref rid="S3.F3" ref-type="fig">3</xref> displays the conditional effect plots for perceived valence. The top subplot shows the conditional effect of perceived valence as a function of ISA measured for the received stimulus treatment. Results reveal that participants in the fear condition perceived the representation of this group in the sentences they had read as much more negative than participants in the other two conditions. This supports H1 and H2. Moreover, participants in the admiration condition perceived a more positive representation of the target group than those in the control condition, supporting H3. The effect size <inline-formula><mml:math id="S3.SS1.p1.m1" alttext="\eta^{2}" display="inline"><mml:msup><mml:mi>η</mml:mi><mml:mn>2</mml:mn></mml:msup></mml:math></inline-formula> is 0.2 (89% HDI: 0.17 to 0.23), and therefore much higher than the anticipated 0.02.</p></sec><sec id="S3.SS2">
        
        
        
        
      <title>Perceived Admiration</title><p id="S3.SS2.p1">The center subplot of Figure <xref rid="S3.F3" ref-type="fig">3</xref> shows the conditional effect of perceived admiration as a function of ISA measured for the received stimulus treatment. Participants in the admiration condition perceived the representation of Italians as conveying clearly more admiration than those in the control or fear conditions, which supports H4a and H5a. Conversely, participants in the fear condition found the target group presentation to convey less admiration than those in the control condition, supporting H6b. The effect size <inline-formula><mml:math id="S3.SS2.p1.m1" alttext="\eta^{2}" display="inline"><mml:msup><mml:mi>η</mml:mi><mml:mn>2</mml:mn></mml:msup></mml:math></inline-formula> is 0.21 (89% HDI: 0.18 to 0.25), again much higher than the previously anticipated 0.02.</p></sec><sec id="S3.SS3">
        
        
        
        
      <title>Perceived Fear</title><p id="S3.SS3.p1">The bottom subplot of Figure <xref rid="S3.F3" ref-type="fig">3</xref> shows the conditional effect of perceived fear as a function of ISA measured for the received stimulus treatment. Participants in the admiration condition did not perceive the representation of Italians as less frightening than individuals in the control condition. H4b is therefore not supported. However, participants in the admiration condition perceived a less frightening representation of Italians than participants in the fear condition, supporting H5b. Likewise, respondents in the fear condition perceived the representation of Italians as more frightening than those in the control condition. H6a is supported. The effect size <inline-formula><mml:math id="S3.SS3.p1.m1" alttext="\eta^{2}" display="inline"><mml:msup><mml:mi>η</mml:mi><mml:mn>2</mml:mn></mml:msup></mml:math></inline-formula> is 0.16 (89% HDI: 0.14 to 0.2), and, thus, once more much higher than the anticipated 0.02.</p></sec></sec>
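The η² values reported in this section quantify the share of outcome variance attributable to the experimental condition. A minimal frequentist analogue of this quantity (the paper reports Bayesian estimates with 89% HDIs instead) is:

```python
import numpy as np

def eta_squared(groups):
    """eta^2 = SS_between / SS_total: proportion of total variance in the
    dependent variable explained by group (condition) membership."""
    all_vals = np.concatenate(groups)
    grand_mean = all_vals.mean()
    ss_total = float(((all_vals - grand_mean) ** 2).sum())
    ss_between = float(sum(len(g) * (np.mean(g) - grand_mean) ** 2 for g in groups))
    return ss_between / ss_total
```

Read against the results above, an η² of 0.2 means that condition membership accounts for roughly 20% of the variance in the corresponding rating.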
    <sec id="S4">
      
      
      
      
    <title>Discussion</title><p id="S4.p1">The goal of this study was to test the semantic validity of word embedding-based measures of implicit stereotypical associations (ISAs) in large text corpora. We argued that when word embedding association measures are employed to draw conclusions about the content of these corpora—and not just as representations of exogenous cultural patterns—predictive validity does not suffice to establish that these measures allow for conclusions about the content of a study corpus. It is necessary to establish the semantic validity of these measures to be able to make claims about these texts.</p><p id="S4.p2">To achieve this goal, we employed an experimental survey approach. This somewhat unusual validation strategy was chosen to compensate for the fact that traditional gold standard coding seemed inapplicable for the aggregated content analytical method that was to be validated in this study. Word embedding-based ISA metrics analyze implicit associations within a whole corpus in order to draw conclusions about the content of a corpus. We deemed it non-practicable for a limited number of trained human coders to make a reliable and, thus, reproducible generalizing judgment of ISAs within a whole corpus, particularly so since such associations are only implicitly present in the corpus.</p><p id="S4.p3">We therefore developed and employed a validation method that made use of judgments from a large number of untrained human coders assuming that, following the law of large numbers, individual participants’ erroneous judgments that might possibly occur would not preponderate in this setting because the resulting patterns would regress to the (non-erroneous) mean. We conducted the data collection as a survey using typical items designed to assess given media stimuli, instead of a traditional coding task that follows a precise coding instruction. 
We deemed that a somewhat unreflected, more intuitive assessment of stimulus sentence packages (likely to follow a heuristic processing routine within participants; Chaiken et al., <xref rid="bib.bibx11" ref-type="bibr">1989</xref>) was actually better able to detect ISAs than the traditional coding routine for which a reflected, thorough decision-making, and, thus, systematic processing is the explicit goal.</p><p id="S4.p4">For demonstration purposes, we decided to use the openly available analysis corpus from a recent application of word embedding-based ISA measures for media content analysis (Müller et al., <xref rid="bib.bibx33" ref-type="bibr">2023</xref>) which investigated implicit stigmatization of ethnic and religious groups in journalistic discourse, focusing on the implicit association of group labels with the emotions fear or admiration. A number of methodological decisions in this validation attempt had to be tailored to this specific study in enabling semantic validation at all. For instance, the original study (Müller et al., <xref rid="bib.bibx33" ref-type="bibr">2023</xref>) covered a large variety of ethnic groups. For the purposes of our experimental semantic validation attempt, the selection of one group out of this larger variety was a crucial step to keep group-level background factors constant in the experiment. For this group, the original article corpus needed to provide a sufficient number of sentences associating it with both of the two emotions, fear and admiration. 
For other semantic validation attempts of word embedding-based ISA measures, other factors might be more important when choosing the selection strategy for the sentence packages used—and this will be true for multiple other methodological decisions made during the planning of the present study, depending on the design of the application that is to be validated.</p><p id="S4.p5">For instance, for studies based on one-sided word embedding bias tests (Kroon et al., <xref rid="bib.bibx25" ref-type="bibr">2021</xref>), finding sentences that contain the opposite valence of the measured association would be more challenging—since our approach relies on using both the inverted distance to one end of the spectrum and the distance to the other end to identify clear context words. Possible solutions would be either to rely solely on the distance to the measured end of the spectrum, or to artificially construct terms that represent the implicit opposite end of the measured dimension. Another interesting case would be the validation of gender stereotypes, as in Garg et al. (<xref rid="bib.bibx17" ref-type="bibr">2018</xref>). In one of their analyses, for example, instead of measuring the association of different groups with two emotions, the authors measured the association of a large number of jobs with two genders. A validation study of this research that followed our procedure would therefore have to choose a set of target occupations and construct an artificial association with different genders. This would be an interesting robustness test for our validation procedure, as constructing artificial associations of gender with occupations can be counter-intuitive to readers who are used to the opposing stereotypes—for example, it would be interesting to see whether a set of stimulus sentences can suffice to induce an association counter to the observed stereotype, e.g., an association of <italic>men</italic> with <italic>nurse</italic>. 
A larger variance of target words would probably be necessary to control for more and less salient stereotypes.</p><p id="S4.p6">Arguably, such further validation attempts are necessary. The present study constitutes just one successful validation of word embedding-based ISA measures in a specific application, namely in Müller et al. (<xref rid="bib.bibx33" ref-type="bibr">2023</xref>). The between-subjects survey experiment presented in this article largely supported the assumption that participants’ perception of the stimulus sentence packages was in line with the measured ISAs, thus representing first evidence for the semantic validity of the word embedding-based approach to investigating ISAs (Durrheim et al., <xref rid="bib.bibx15" ref-type="bibr">2023</xref>). The successful semantic validation of an exemplary case, however, does not warrant inferences about the general semantic validity of said method. The consensus in computational communication science is that there exists no off-the-shelf method and that each method requires individual validation to show its (semantic) validity for one’s individual research data (Atteveldt &amp; Peng, <xref rid="bib.bibx43" ref-type="bibr">2018</xref>; Baden et al., <xref rid="bib.bibx7" ref-type="bibr">2021</xref>). Therefore, this study can only be considered a first step in establishing the semantic validity of word embedding-based measures of ISAs. Further steps will necessarily have to follow, particularly when considering Krippendorff’s (<xref rid="bib.bibx24" ref-type="bibr">2018</xref>, p. 323) notion that semantic validity is context dependent. Yet, if multiple future validation studies come to conclusions similar to the present one, this could be interpreted as cumulative evidence for the general validity of word embedding-based ISA measures. 
But even in this case, all future applications would still require individual validation efforts.</p><p id="S4.p7">In the specific context of the present validation, some findings call for a more in-depth engagement. For instance, the results showed that, for perceived fear as a dependent variable, there was no difference between the admiration and the no ISA stimuli. However, vice versa, perceived admiration of the target group was significantly lower in the fear stimulus condition than in the no ISA condition. This pattern does not put into question the semantic validity of the tested word embedding-based ISA measure in principle. But, it should be seen as cause to reconsider the decision to use fear and admiration as the end poles of an emotion continuum, as the preliminary work by Müller et al. (<xref rid="bib.bibx33" ref-type="bibr">2023</xref>) did. There, it was argued that the two emotions are functionally equivalent in group enhancement and devaluation in media reporting. As the present validation has shown, they are indeed causally linked, but only partially. Implicit fear-inducing messages not only increase the perceived fear of the group, but also reduce the positive emotion of admiration. Admiration-inducing messages, however, are unable to trigger a weakening of the negative emotion of fear. The communicative hurdles for overcoming fear of ethnic groups appear to be higher than those for eliminating admiration. This substantial finding of the present validation study should stimulate further research from an intergroup communication perspective. It adds a crucial facet to the discussion of the original study’s results that goes beyond mere methodological validation.</p><p id="S4.p8">Related to this, we observed another interesting pattern when constructing the sentence packages for the present validation study. 
It was easier to generate a clear difference between the fear and control conditions than between the admiration and control conditions. The survey experiment confirmed this impression: Participants evaluated the control sentence packages as closer to the admiration packages. This observation holds particularly for the overall perceived valence of the stimuli. One explanation could be that the negativity bias in reporting (Soroka &amp; McAdams, <xref rid="bib.bibx39" ref-type="bibr">2015</xref>) leads to more cases of directly expressed fear in news reporting, while admiration is more dispersed and subtle. This could explain why the admiration sentences we found contained, at face value, less obvious traces of admiration, which both the word embedding ISA model and the participants’ responses seem to confirm. This could be taken as additional evidence to challenge the decision to use fear and admiration as end poles of a two-dimensional emotion scale in Müller et al. (<xref rid="bib.bibx33" ref-type="bibr">2023</xref>).</p><p id="S4.p9">However, as the survey responses are largely in line with this pattern, they should still be interpreted as evidence for the overall semantic validity of the word embedding-based ISA measure tested in this study. The measure seems able to detect both more explicitly expressed associations (resulting in higher ISA metrics) and largely implicit associations (resulting in lower ISA metrics that are nevertheless still distinct from zero) in line with human judgments. Thus, the present semantic validation study can be seen as supporting the general idea that word embedding-based measures are able to detect ISAs in texts as human readers would. At the same time, it underscores the importance of making an informed and reflected choice about which concepts to contrast in such a necessarily bipolar measure. 
More broadly speaking, the observations reported here underscore the value of semantic validation, not only for ascertaining the validity of content-analytical measurements, but also for refining their conceptual underpinnings, and thus for substantial theorizing.</p><sec id="S4.SS1">
        
        
        
        
      <title>Limitations and Future Research</title><p id="S4.SS1.p1">The present validation attempt, of course, has some limitations. First, there is the question of the scalability of its results: How representative are our constructed, relatively small sentence packages of ISA-containing sentences for naturally occurring corpora with far larger numbers of group mentions and substantial noise, particularly within more “neutral” sentences? As our results indicate, we were able to construct very convincing sentence packages for the fear-association conditions. The “no ISA” sentence packages, however, both during a face validity check and in the survey experiment, appear to contain a visible residue of ISAs with emotional valence and higher semantic variance. One avenue for future research would be to find a way to scale the individual contribution of terms to the overall ISA model (both in terms of their prevalence and in terms of their effect on the resulting ISAs) to obtain a more fine-grained assessment of each sentence’s actual contribution. This would allow varying the degree of implicit association within the stimulus in a linear fashion, rather than using the current three-level ordinal ISA scale for classifying the sentence packages.</p><p id="S4.SS1.p2">In this study, we validated word embedding-based ISA measures for just one group that, within the original reporting, was portrayed relatively neutrally with regard to the two tested emotions, fear and admiration. The selection of this group was based on the assumption that implicit stereotypes towards such a group are more malleable, making it easier to measure primed perceptions of that group after exposure to a few sentence pairs. A more complex setup would have to test whether the same semantic validation would also be successful for groups with, presumably, more established stereotypical associations with either fear or admiration. 
It may be that the experiment-based semantic validation routine presented in this study does not work for groups that are subject to strongly one-sided prejudices in public perception.</p><p id="S4.SS1.p3">We show the concordance between <inline-formula><mml:math id="S4.SS1.p3.m1" alttext="NAS(s,\mathcal{A},\mathcal{B},\boldsymbol{w})" display="inline"><mml:mrow><mml:mi>N</mml:mi><mml:mo>⁢</mml:mo><mml:mi>A</mml:mi><mml:mo>⁢</mml:mo><mml:mi>S</mml:mi><mml:mo>⁢</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>s</mml:mi><mml:mo>,</mml:mo><mml:mi class="ltx_font_mathcaligraphic">𝒜</mml:mi><mml:mo>,</mml:mo><mml:mi class="ltx_font_mathcaligraphic">ℬ</mml:mi><mml:mo>,</mml:mo><mml:mi>𝒘</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula> and human perceptions of ISAs within a sentence package among the three specified levels (admiration, fear, and no ISA) through which we provide evidence of semantic validity. However, this approach does not provide any evidence on the calibration (Lindhiem et al., <xref rid="bib.bibx29" ref-type="bibr">2018</xref>) of <inline-formula><mml:math id="S4.SS1.p3.m2" alttext="NAS(s,\mathcal{A},\mathcal{B},\boldsymbol{w})" display="inline"><mml:mrow><mml:mi>N</mml:mi><mml:mo>⁢</mml:mo><mml:mi>A</mml:mi><mml:mo>⁢</mml:mo><mml:mi>S</mml:mi><mml:mo>⁢</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>s</mml:mi><mml:mo>,</mml:mo><mml:mi class="ltx_font_mathcaligraphic">𝒜</mml:mi><mml:mo>,</mml:mo><mml:mi class="ltx_font_mathcaligraphic">ℬ</mml:mi><mml:mo>,</mml:mo><mml:mi>𝒘</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula>, i.e. 
how the difference between any two arbitrary points on the scale of <inline-formula><mml:math id="S4.SS1.p3.m3" alttext="NAS(s,\mathcal{A},\mathcal{B},\boldsymbol{w})" display="inline"><mml:mrow><mml:mi>N</mml:mi><mml:mo>⁢</mml:mo><mml:mi>A</mml:mi><mml:mo>⁢</mml:mo><mml:mi>S</mml:mi><mml:mo>⁢</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>s</mml:mi><mml:mo>,</mml:mo><mml:mi class="ltx_font_mathcaligraphic">𝒜</mml:mi><mml:mo>,</mml:mo><mml:mi class="ltx_font_mathcaligraphic">ℬ</mml:mi><mml:mo>,</mml:mo><mml:mi>𝒘</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula> actually corresponds to the human perception of difference along the fear-admiration dichotomy. One possible way to demonstrate the calibration of word embedding-based ISA measures is the rich stimulus sampling approach proposed by Young et al. (<xref rid="bib.bibx46" ref-type="bibr">2011</xref>). Under that approach, stimuli (dummy corpora and their stimulus sentence packages) would be randomly generated on the fly, so that the level of ISA would not be preconfigured.</p><p id="S4.SS1.p4">Another limitation could be seen in the participant sample used for the present study. Perceptions of the stereotype intensity conveyed by the same implicit associations plausibly vary with political orientation, lived experience (including being a target of stereotyping), and socio-economic and cultural–linguistic background. For example, Italian immigrants may read the affective implications differently than members of the ethnic majority in Germany. This raises a normative question: Whose perceptions should count in semantic validation? The present experiment establishes semantic validity for a reference population that is somewhat skewed towards higher education, but diverse in terms of other socio-demographic factors and political attitudes (Leiner, <xref rid="bib.bibx27" ref-type="bibr">2016</xref>). 
Considering that typical gold-standard validation studies often rely exclusively on highly educated student assistants, who are even less diverse in terms of political orientation and age than our sample, we deemed the dominance of highly educated individuals in the sample acceptable. However, one could argue that, if the goal is to assess the reception of a general audience, a truly representative sample is required as a next step. If the goal is to assess potential harm, it would be advisable to oversample targeted groups. Semantic validation studies employing an experimental approach should therefore carefully consider which kind of benchmark they are aiming for and which participant sample structure is required to achieve it.</p><p id="S4.SS1.p5">Finally, it has to be mentioned that the approach to measuring ISAs that we semantically validated in the present study uses so-called static word embeddings such as GloVe. However, there are also newer models (e.g., BERT, ELMo, and GPT) that generate so-called contextual word embeddings. Methods such as the WEAT have recently been extended by the original authors to cover contextual word embeddings (Guo &amp; Caliskan, <xref rid="bib.bibx21" ref-type="bibr">2021</xref>), and there appear to be new applications of contextual word embeddings in communication research (Thijs et al., <xref rid="bib.bibx41" ref-type="bibr">2024</xref>). The evidence of semantic validity presented in this article is certainly not directly transferable to ISAs measured using contextual word embeddings. Yet, its survey experimental approach can be used as a basis to validate ISAs found through contextual word embeddings, too. 
For this purpose, one might need to adopt the multi-level approach of Guo &amp; Caliskan (<xref rid="bib.bibx21" ref-type="bibr">2021</xref>): (1) generate the same set of stimuli for various contexts, and then (2) combine the effect sizes from the different contexts using a random-effects model.</p><p id="S4.SS1.p6">At the same time, newer word embedding methods could ease validation in terms of the required computational effort: When conducting the present analysis based on GloVe word embeddings, our main analysis took seven days to compute, while the robustness check took about five weeks to run on a university HPC cluster. This process could be sped up significantly with word embedding models that are optimized for running on graphics cards instead of CPUs, allowing a larger variance of parameters and combinations of contexts to be included in the analysis at sensible timescales.</p></sec></sec>
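To make the measurement logic discussed above concrete, the following minimal Python sketch shows how a sentence-level association score along a fear-admiration semantic axis can be computed from static word embeddings, in the spirit of the axis construction described in footnote 3. It is an illustration under simplifying assumptions, not the exact NAS implementation validated in this study; the function names, toy vectors, and word lists are invented for demonstration.

```python
import numpy as np

def semantic_axis(emb, attr_a, attr_b):
    # Axis pointing from attribute set A (e.g., fear terms) towards
    # attribute set B (e.g., admiration terms): difference of mean vectors.
    v_a = np.mean([emb[w] for w in attr_a], axis=0)
    v_b = np.mean([emb[w] for w in attr_b], axis=0)
    return v_b - v_a

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def sentence_association(emb, tokens, attr_a, attr_b):
    # Mean cosine similarity of the sentence's known tokens with the axis:
    # positive values lean towards B, negative values towards A.
    axis = semantic_axis(emb, attr_a, attr_b)
    sims = [cosine(emb[t], axis) for t in tokens if t in emb]
    return float(np.mean(sims))

# Toy two-dimensional embedding, invented for illustration only.
emb = {
    "fear":    np.array([1.0, 0.0]),
    "threat":  np.array([0.9, 0.1]),
    "admire":  np.array([0.0, 1.0]),
    "praise":  np.array([0.1, 0.9]),
    "honored": np.array([0.2, 0.8]),
    "danger":  np.array([0.8, 0.2]),
}
fear_terms = ["fear", "threat"]
admiration_terms = ["admire", "praise"]

score_pos = sentence_association(emb, ["honored"], fear_terms, admiration_terms)
score_neg = sentence_association(emb, ["danger"], fear_terms, admiration_terms)
```

In an actual application, the embedding would be a GloVe model trained on the corpus under study, and the attribute lists would follow WEAT-style word sets; the sketch only conveys the geometry of the axis-based scoring.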
    <sec id="S5">
      
      
      
      
    <title>Conclusion</title><p id="S5.p1">Despite the aforementioned limitations, the present results are more than encouraging for the application of word embedding-based corpus-level metrics in the domain of computational communication analysis. While previous research had already tested the statistical and predictive validity of word embedding-based ISA detection methods (Durrheim et al., <xref rid="bib.bibx15" ref-type="bibr">2023</xref>), the present study complements the picture by offering initial evidence for their semantic validity. This should be read as further support for the assumption that word embedding models are able to capture and quantify actual implicit associations within text corpora as perceived by human readers. For the time being, a broader application of this method for the measurement of various kinds of associations within media (and other) texts does seem promising. However, researchers applying word embedding-based metrics in the future should, of course, be aware that the present semantic validation (even in conjunction with previous statistical and predictive validations of the method) may be limited in its transferability to other research domains.</p></sec>
    <sec id="S6">
      
      
      
      
    <title>Acknowledgments</title><p id="S6.p1">This research was supported by the German Federal Ministry for Family Affairs, Senior Citizens, Women and Youth (BMFSFJ) through a grant to the “Research Association Discrimination and Racism” (FoDiRa) of the DeZIM-Research Community (German Center for Integration and Migration Research).</p></sec>
    </body>
  <back>
    <fn-group><title>Notes</title><fn id="id1" symbol="1"><p id="footnote1">
            
            
            
            
          All materials can be found at: <ext-link xlink:href="https://doi.org/10.17605/OSF.IO/TQRJ3" ext-link-type="uri">https://doi.org/10.17605/OSF.IO/TQRJ3</ext-link></p></fn><fn id="id2" symbol="2"><p id="footnote2">
              
              
              
              
            Parts of the hypotheses were re-phrased during manuscript preparation to better specify the intent of the study. This, however, did not affect the relationships under study, the labeling of outcome variables, or the directions of assumed effects. The original ideas of the hypotheses are still fully represented. For full disclosure, the original versions of the pre-registered hypotheses are available via the OSF repository associated with this study. The preregistration included a seventh hypothesis about assumed message effects as a dependent variable (“Assumed message effects will be stronger for corpora containing ISAs with fear or admiration as indicated by WEAT-based measures, compared to a corpus with no ISA”). Analysis of this hypothesis was omitted from the present manuscript for the sake of brevity. However, the respective results can be obtained from the Online Appendices published in this study’s OSF repository.</p></fn><fn id="id3" symbol="3"><p id="footnote3">
                
                
                
                
              Another way to think about this is to calculate the semantic axis vector as <inline-formula><mml:math id="footnote3.m1" alttext="\mathbf{V_{-axis}}=\mathbf{V}^{\mathcal{B}}-\mathbf{V}^{\mathcal{A}}" display="inline"><mml:mrow><mml:msub><mml:mi>𝐕</mml:mi><mml:mrow><mml:mo>−</mml:mo><mml:mi>𝐚𝐱𝐢𝐬</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mrow><mml:msup><mml:mi>𝐕</mml:mi><mml:mi class="ltx_font_mathcaligraphic">ℬ</mml:mi></mml:msup><mml:mo>−</mml:mo><mml:msup><mml:mi>𝐕</mml:mi><mml:mi class="ltx_font_mathcaligraphic">𝒜</mml:mi></mml:msup></mml:mrow></mml:mrow></mml:math></inline-formula> and look for word <inline-formula><mml:math id="footnote3.m2" alttext="d" display="inline"><mml:mi>d</mml:mi></mml:math></inline-formula> with a high <inline-formula><mml:math id="footnote3.m3" alttext="cos(\boldsymbol{w_{d}},\mathbf{V_{-axis}})" display="inline"><mml:mrow><mml:mi>c</mml:mi><mml:mo>⁢</mml:mo><mml:mi>o</mml:mi><mml:mo>⁢</mml:mo><mml:mi>s</mml:mi><mml:mo>⁢</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:msub><mml:mi>𝒘</mml:mi><mml:mi>𝒅</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>𝐕</mml:mi><mml:mrow><mml:mo>−</mml:mo><mml:mi>𝐚𝐱𝐢𝐬</mml:mi></mml:mrow></mml:msub><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula>.</p></fn><fn id="id4" symbol="4"><p id="footnote4">
                
                
                
                
              To check the robustness of our results, we repeated the same analyses with a wider range of parameters, following the approach of Lai et al. (<xref rid="bib.bibx26" ref-type="bibr">2016</xref>), who vary the dimensionality of the word embedding model as well as the number of iterations for model optimization. We again ran 100 runs for each combination of 2, 5, 10, 50 and 75 iterations and a dimensionality of 100, 150, 200 and 250. The resulting graphs are available in Online Appendices F and G in the OSF repository. For all conditions except those with two iterations, the general tendency of the score difference is in the same direction as in Figure <xref rid="S2.F2" ref-type="fig">2</xref>, with results in the same range above 10 iterations. Notably, the number of iterations seems to have a smaller effect on the resulting difference between dummy corpora.</p></fn><fn id="id5" symbol="5"><p id="footnote5">
                
                
                
                
              We additionally calculated <inline-formula><mml:math id="footnote5.m1" alttext="NAS" display="inline"><mml:mrow><mml:mi>N</mml:mi><mml:mo>⁢</mml:mo><mml:mi>A</mml:mi><mml:mo>⁢</mml:mo><mml:mi>S</mml:mi></mml:mrow></mml:math></inline-formula> scores for the packages of 75 sentence pairs for fear, admiration and the control condition, by merging each of them with a smaller corpus containing <inline-formula><mml:math id="footnote5.m2" alttext="\frac{1}{60^{th}}" display="inline"><mml:mfrac><mml:mn>1</mml:mn><mml:msup><mml:mn>60</mml:mn><mml:mrow><mml:mi>t</mml:mi><mml:mo>⁢</mml:mo><mml:mi>h</mml:mi></mml:mrow></mml:msup></mml:mfrac></mml:math></inline-formula> of all sentences of the original corpus. The resulting graph is available in Appendix F in the OSF repository.</p></fn></fn-group><ref-list><title>References</title>
      <ref id="bib.bibx1"><mixed-citation publication-type="journal"><string-name><surname>An</surname>, <given-names>J.</given-names></string-name>, <string-name><surname>Kwak</surname>, <given-names>H.</given-names></string-name>, &amp; <string-name><surname>Ahn</surname>, <given-names>Y.-Y.</given-names></string-name> (<year>2018</year>). <article-title>SemAxis: A lightweight framework to characterize domain-specific word semantics beyond sentiment</article-title>. <source><italic>arXiv preprint arXiv:1806.05521</italic></source>.</mixed-citation></ref>
      <ref id="bib.bibx2"><mixed-citation publication-type="journal"><string-name><surname>Andrich</surname>, <given-names>A.</given-names></string-name>, <string-name><surname>Bachl</surname>, <given-names>M.</given-names></string-name>, &amp; <string-name><surname>Domahidi</surname>, <given-names>E.</given-names></string-name> (<year>2023</year>). <article-title>Goodbye, Gender Stereotypes? Trait Attributions to Politicians in 11 Years of News Coverage</article-title>. <source><italic>Journalism &amp; Mass Communication Quarterly</italic></source>, <fpage>107769902211422</fpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1177/10776990221142248">https://doi.org/10.1177/10776990221142248</ext-link></mixed-citation></ref>
      <ref id="bib.bibx3"><mixed-citation publication-type="journal"><string-name><surname>Andrich</surname>, <given-names>A.</given-names></string-name>, &amp; <string-name><surname>Domahidi</surname>, <given-names>E.</given-names></string-name> (<year>2022</year>). <article-title>A Leader and a Lady? A Computational Approach to Detection of Political Gender Stereotypes in Facebook User Comments</article-title>. <source><italic>International Journal of Communication</italic></source>, <volume>17</volume>, <fpage>20</fpage>.</mixed-citation></ref>
      <ref id="bib.bibx4"><mixed-citation publication-type="journal"><string-name><surname>Antoniak</surname>, <given-names>M.</given-names></string-name>, &amp; <string-name><surname>Mimno</surname>, <given-names>D.</given-names></string-name> (<year>2018</year>). <article-title>Evaluating the stability of embedding-based word similarities</article-title>. <source><italic>Transactions of the Association for Computational Linguistics</italic></source>, <volume>6</volume>, <fpage>107</fpage>–<lpage>119</lpage>.</mixed-citation></ref>
      <ref id="bib.bibx5"><mixed-citation publication-type="journal"><string-name><surname>Arendt</surname>, <given-names>F.</given-names></string-name>, &amp; <string-name><surname>Karadas</surname>, <given-names>N.</given-names></string-name> (<year>2017</year>). <article-title>Content analysis of mediated associations: An automated text-analytic approach</article-title>. <source><italic>Communication Methods and Measures</italic></source>, <volume>11</volume>(<issue>2</issue>), <fpage>105</fpage>–<lpage>120</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1080/19312458.2016.1276894">https://doi.org/10.1080/19312458.2016.1276894</ext-link></mixed-citation></ref>
      <ref id="bib.bibx6"><mixed-citation publication-type="journal"><string-name><surname>Azzalini</surname>, <given-names>M.</given-names></string-name> (<year>2025</year>). <article-title>Challenging implicit gender stereotypes in Italian news through language</article-title>. <source><italic>Journalism</italic></source>,  <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1177/14648849251371940">https://doi.org/10.1177/14648849251371940</ext-link></mixed-citation></ref>
      <ref id="bib.bibx7"><mixed-citation publication-type="journal"><string-name><surname>Baden</surname>, <given-names>C.</given-names></string-name>, <string-name><surname>Pipal</surname>, <given-names>C.</given-names></string-name>, <string-name><surname>Schoonvelde</surname>, <given-names>M.</given-names></string-name>, &amp; <string-name><surname>van der Velden</surname>, <given-names>M. A. C. G.</given-names></string-name> (<year>2021</year>). <article-title>Three Gaps in Computational Text Analysis Methods for Social Sciences: A Research Agenda</article-title>. <source><italic>Communication Methods and Measures</italic></source>, <volume>16</volume>(<issue>1</issue>), <fpage>1</fpage>–<lpage>18</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1080/19312458.2021.2015574">https://doi.org/10.1080/19312458.2021.2015574</ext-link></mixed-citation></ref>
      <ref id="bib.bibx8"><mixed-citation publication-type="book"><string-name><surname>Bolukbasi</surname>, <given-names>T.</given-names></string-name>, <string-name><surname>Chang</surname>, <given-names>KW.</given-names></string-name>, <string-name><surname>Zou</surname>, <given-names>J.</given-names></string-name>, <string-name><surname>Saligrama</surname>, <given-names>V.</given-names></string-name>, &amp; <string-name><surname>Kalai</surname>, <given-names>A.</given-names></string-name> (<year>2016</year>). <source><italic>Man Is to Computer Programmer as Woman Is to Homemaker? Debiasing Word Embeddings</italic></source>. <publisher-name>arXiv</publisher-name>.</mixed-citation></ref>
      <ref id="bib.bibx9"><mixed-citation publication-type="journal"><string-name><surname>Bürkner</surname>, <given-names>P.-C.</given-names></string-name> (<year>2017</year>). <article-title>Advanced Bayesian multilevel modeling with the R package brms</article-title>. <source><italic>arXiv preprint arXiv:1705.11123</italic></source>.</mixed-citation></ref>
      <ref id="bib.bibx10"><mixed-citation publication-type="journal"><string-name><surname>Caliskan</surname>, <given-names>A.</given-names></string-name>, <string-name><surname>Bryson</surname>, <given-names>J. J.</given-names></string-name>, &amp; <string-name><surname>Narayanan</surname>, <given-names>A.</given-names></string-name> (<year>2017</year>). <article-title>Semantics derived automatically from language corpora contain human-like biases</article-title>. <source><italic>Science</italic></source>, <volume>356</volume>(<issue>6334</issue>), <fpage>183</fpage>–<lpage>186</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1126/science.aal4230">https://doi.org/10.1126/science.aal4230</ext-link></mixed-citation></ref>
      <ref id="bib.bibx11"><mixed-citation publication-type="book"><string-name><surname>Chaiken</surname>, <given-names>S.</given-names></string-name>, <string-name><surname>Liberman</surname>, <given-names>A.</given-names></string-name>, &amp; <string-name><surname>Eagly</surname>, <given-names>A. H.</given-names></string-name> (<year>1989</year>). <chapter-title>Heuristic and Systematic Information Processing within and beyond the Persuasion Context</chapter-title>. In <source><italic>Unintended Thought</italic></source> (pp. <fpage>212</fpage>–<lpage>251</lpage>). <publisher-name>Guilford Press</publisher-name>, <publisher-loc>New York</publisher-loc>.</mixed-citation></ref>
      <ref id="bib.bibx12"><mixed-citation publication-type="journal"><string-name><surname>Chan</surname>, <given-names>Ch.</given-names></string-name> (<year>2022</year>). <article-title>sweater: Speedy Word Embedding Association Test and Extras Using R</article-title>. <source><italic>Journal of Open Source Software</italic></source>, <volume>7</volume>(<issue>72</issue>), <fpage>4036</fpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.21105/joss.04036">https://doi.org/10.21105/joss.04036</ext-link></mixed-citation></ref>
      <ref id="bib.bibx13"><mixed-citation publication-type="journal"><string-name><surname>Curto</surname>, <given-names>G.</given-names></string-name>, <string-name><surname>Jojoa Acosta</surname>, <given-names>M. F.</given-names></string-name>, <string-name><surname>Comim</surname>, <given-names>F.</given-names></string-name>, &amp; <string-name><surname>Garcia-Zapirain</surname>, <given-names>B.</given-names></string-name> (<year>2022</year>). <article-title>Are AI systems biased against the poor? A machine learning analysis using Word2Vec and GloVe embeddings</article-title>. <source><italic>AI &amp; SOCIETY</italic></source>, <volume>39</volume>(<issue>2</issue>), <fpage>617</fpage>–<lpage>632</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1007/s00146-022-01494-z">https://doi.org/10.1007/s00146-022-01494-z</ext-link></mixed-citation></ref>
      <ref id="bib.bibx14"><mixed-citation publication-type="journal"><string-name><surname>DiMaggio</surname>, <given-names>P.</given-names></string-name>, <string-name><surname>Nag</surname>, <given-names>M.</given-names></string-name>, &amp; <string-name><surname>Blei</surname>, <given-names>D.</given-names></string-name> (<year>2013</year>). <article-title>Exploiting affinities between topic modeling and the sociological perspective on culture: Application to newspaper coverage of U.S. government arts funding</article-title>. <source><italic>Poetics</italic></source>, <volume>41</volume>(<issue>6</issue>), <fpage>570</fpage>–<lpage>606</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1016/j.poetic.2013.08.004">https://doi.org/10.1016/j.poetic.2013.08.004</ext-link></mixed-citation></ref>
      <ref id="bib.bibx15"><mixed-citation publication-type="journal"><string-name><surname>Durrheim</surname>, <given-names>K.</given-names></string-name>, <string-name><surname>Schuld</surname>, <given-names>M.</given-names></string-name>, <string-name><surname>Mafunda</surname>, <given-names>M.</given-names></string-name>, &amp; <string-name><surname>Mazibuko</surname>, <given-names>S.</given-names></string-name> (<year>2023</year>). <article-title>Using Word Embeddings to Investigate Cultural Biases</article-title>. <source><italic>British Journal of Social Psychology</italic></source>, <volume>62</volume>(<issue>1</issue>), <fpage>617</fpage>–<lpage>629</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1111/bjso.12560">https://doi.org/10.1111/bjso.12560</ext-link></mixed-citation></ref>
      <ref id="bib.bibx16"><mixed-citation publication-type="journal"><string-name><surname>Fu</surname>, <given-names>KW.</given-names></string-name> (<year>2023</year>). <article-title>Propagandization of Relative Gratification: How Chinese State Media Portray the International Pandemic</article-title>. <source><italic>Political Communication</italic></source>, <fpage>1</fpage>–<lpage>22</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1080/10584609.2023.2207492">https://doi.org/10.1080/10584609.2023.2207492</ext-link></mixed-citation></ref>
      <ref id="bib.bibx17"><mixed-citation publication-type="journal"><string-name><surname>Garg</surname>, <given-names>N.</given-names></string-name>, <string-name><surname>Schiebinger</surname>, <given-names>L.</given-names></string-name>, <string-name><surname>Jurafsky</surname>, <given-names>D.</given-names></string-name>, &amp; <string-name><surname>Zou</surname>, <given-names>J.</given-names></string-name> (<year>2018</year>). <article-title>Word embeddings quantify 100 years of gender and ethnic stereotypes</article-title>. <source><italic>Proceedings of the National Academy of Sciences</italic></source>, <volume>115</volume>(<issue>16</issue>), <fpage>E3635</fpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1073/pnas.1720347115">https://doi.org/10.1073/pnas.1720347115</ext-link></mixed-citation></ref>
      <ref id="bib.bibx18"><mixed-citation publication-type="journal"><string-name><surname>Greenwald</surname>, <given-names>A. G.</given-names></string-name>, <string-name><surname>McGhee</surname>, <given-names>D. E.</given-names></string-name>, &amp; <string-name><surname>Schwartz</surname>, <given-names>J. L.</given-names></string-name> (<year>1998</year>). <article-title>Measuring individual differences in implicit cognition: The implicit association test</article-title>. <source><italic>Journal of Personality and Social Psychology</italic></source>, <volume>74</volume>(<issue>6</issue>), <fpage>1464</fpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1037/0022-3514.74.6.1464">https://doi.org/10.1037/0022-3514.74.6.1464</ext-link></mixed-citation></ref>
      <ref id="bib.bibx19"><mixed-citation publication-type="journal"><string-name><surname>Grimmer</surname>, <given-names>J.</given-names></string-name>, &amp; <string-name><surname>King</surname>, <given-names>G.</given-names></string-name> (<year>2011</year>). <article-title>General purpose computer-assisted clustering and conceptualization</article-title>. <source><italic>Proceedings of the National Academy of Sciences</italic></source>, <volume>108</volume>(<issue>7</issue>), <fpage>2643</fpage>–<lpage>2650</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1073/pnas.1018067108">https://doi.org/10.1073/pnas.1018067108</ext-link></mixed-citation></ref>
      <ref id="bib.bibx20"><mixed-citation publication-type="journal"><string-name><surname>Grimmer</surname>, <given-names>J.</given-names></string-name>, &amp; <string-name><surname>Stewart</surname>, <given-names>B. M.</given-names></string-name> (<year>2013</year>). <article-title>Text as Data: The Promise and Pitfalls of Automatic Content Analysis Methods for Political Texts</article-title>. <source><italic>Political Analysis</italic></source>, <volume>21</volume>(<issue>3</issue>), <fpage>267</fpage>–<lpage>297</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1093/pan/mps028">https://doi.org/10.1093/pan/mps028</ext-link></mixed-citation></ref>
      <ref id="bib.bibx21"><mixed-citation publication-type="journal"><string-name><surname>Guo</surname>, <given-names>W.</given-names></string-name>, &amp; <string-name><surname>Caliskan</surname>, <given-names>A.</given-names></string-name> (<year>2021</year>). <article-title>Detecting Emergent Intersectional Biases: Contextualized Word Embeddings Contain a Distribution of Human-like Biases</article-title>. <source><italic>Proceedings of the 2021 AAAI/ACM Conference on AI, Ethics, and Society</italic></source>,  <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1145/3461702.3462536">https://doi.org/10.1145/3461702.3462536</ext-link></mixed-citation></ref>
      <ref id="bib.bibx22"><mixed-citation publication-type="journal"><string-name><surname>Imai</surname>, <given-names>K.</given-names></string-name>, <string-name><surname>Keele</surname>, <given-names>L.</given-names></string-name>, <string-name><surname>Tingley</surname>, <given-names>D.</given-names></string-name>, &amp; <string-name><surname>Yamamoto</surname>, <given-names>T.</given-names></string-name> (<year>2011</year>). <article-title>Unpacking the Black Box of Causality: Learning about Causal Mechanisms from Experimental and Observational Studies</article-title>. <source><italic>American Political Science Review</italic></source>, <volume>105</volume>(<issue>4</issue>), <fpage>765</fpage>–<lpage>789</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1017/S0003055411000414">https://doi.org/10.1017/S0003055411000414</ext-link></mixed-citation></ref>
      <ref id="bib.bibx23"><mixed-citation publication-type="book"><string-name><surname>Krippendorff</surname>, <given-names>K.</given-names></string-name> (<year>1980</year>). <chapter-title>Validity in Content Analysis</chapter-title>. In <source><italic>Computerstrategien für die Kommunikationsanalyse</italic></source> (pp. <fpage>69</fpage>–<lpage>112</lpage>). <publisher-name>Campus</publisher-name>, <publisher-loc>Frankfurt a. M.</publisher-loc>.</mixed-citation></ref>
      <ref id="bib.bibx24"><mixed-citation publication-type="book"><string-name><surname>Krippendorff</surname>, <given-names>K.</given-names></string-name> (<year>2018</year>). <source><italic>Content analysis: An introduction to its methodology</italic></source>. <publisher-name>SAGE</publisher-name>.</mixed-citation></ref>
      <ref id="bib.bibx25"><mixed-citation publication-type="journal"><string-name><surname>Kroon</surname>, <given-names>A. C.</given-names></string-name>, <string-name><surname>Trilling</surname>, <given-names>D.</given-names></string-name>, &amp; <string-name><surname>Raats</surname>, <given-names>T.</given-names></string-name> (<year>2021</year>). <article-title>Guilty by Association: Using Word Embeddings to Measure Ethnic Stereotypes in News Coverage</article-title>. <source><italic>Journalism &amp; Mass Communication Quarterly</italic></source>, <volume>98</volume>(<issue>2</issue>), <fpage>451</fpage>–<lpage>477</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1177/1077699020932304">https://doi.org/10.1177/1077699020932304</ext-link></mixed-citation></ref>
      <ref id="bib.bibx26"><mixed-citation publication-type="journal"><string-name><surname>Lai</surname>, <given-names>S.</given-names></string-name>, <string-name><surname>Liu</surname>, <given-names>K.</given-names></string-name>, <string-name><surname>Xu</surname>, <given-names>L.</given-names></string-name>, &amp; <string-name><surname>Zhao</surname>, <given-names>J.</given-names></string-name> (<year>2016</year>). <article-title>How to generate a good word embedding</article-title>. <source><italic>IEEE Intelligent Systems</italic></source>, <volume>31</volume>(<issue>6</issue>), <fpage>5</fpage>–<lpage>14</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1109/MIS.2016.45">https://doi.org/10.1109/MIS.2016.45</ext-link></mixed-citation></ref>
      <ref id="bib.bibx27"><mixed-citation publication-type="journal"><string-name><surname>Leiner</surname>, <given-names>D. J.</given-names></string-name> (<year>2016</year>). <article-title>Our Research's Breadth Lives on Convenience Samples: A Case Study of the Online Respondent Pool “SoSci Panel”</article-title>. <source><italic>Studies in Communication | Media</italic></source>, <volume>5</volume>(<issue>4</issue>), <fpage>367</fpage>–<lpage>396</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.5771/2192-4007-2016-4-367">https://doi.org/10.5771/2192-4007-2016-4-367</ext-link></mixed-citation></ref>
      <ref id="bib.bibx28"><mixed-citation publication-type="journal"><string-name><surname>Lind</surname>, <given-names>F.</given-names></string-name>, <string-name><surname>Gruber</surname>, <given-names>M.</given-names></string-name>, &amp; <string-name><surname>Boomgaarden</surname>, <given-names>H. G.</given-names></string-name> (<year>2017</year>). <article-title>Content Analysis by the Crowd: Assessing the Usability of Crowdsourcing for Coding Latent Constructs</article-title>. <source><italic>Communication Methods and Measures</italic></source>, <volume>11</volume>(<issue>3</issue>), <fpage>191</fpage>–<lpage>209</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1080/19312458.2017.1317338">https://doi.org/10.1080/19312458.2017.1317338</ext-link></mixed-citation></ref>
      <ref id="bib.bibx29"><mixed-citation publication-type="journal"><string-name><surname>Lindhiem</surname>, <given-names>O.</given-names></string-name>, <string-name><surname>Petersen</surname>, <given-names>I. T.</given-names></string-name>, <string-name><surname>Mentch</surname>, <given-names>L. K.</given-names></string-name>, &amp; <string-name><surname>Youngstrom</surname>, <given-names>E. A.</given-names></string-name> (<year>2018</year>). <article-title>The Importance of Calibration in Clinical Psychology</article-title>. <source><italic>Assessment</italic></source>, <volume>27</volume>(<issue>4</issue>), <fpage>840</fpage>–<lpage>854</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1177/1073191117752055">https://doi.org/10.1177/1073191117752055</ext-link></mixed-citation></ref>
      <ref id="bib.bibx30"><mixed-citation publication-type="journal"><string-name><surname>Lüdecke</surname>, <given-names>D.</given-names></string-name>, <string-name><surname>Ben-Shachar</surname>, <given-names>M. S.</given-names></string-name>, <string-name><surname>Patil</surname>, <given-names>I.</given-names></string-name>, <string-name><surname>Wiernik</surname>, <given-names>B. M.</given-names></string-name>, <string-name><surname>Bacher</surname>, <given-names>E.</given-names></string-name>, <string-name><surname>Thériault</surname>, <given-names>R.</given-names></string-name>, &amp; <string-name><surname>Makowski</surname>, <given-names>D.</given-names></string-name> (<year>2022</year>). <article-title>easystats: Framework for Easy Statistical Modeling, Visualization, and Reporting</article-title>. <source><italic>CRAN</italic></source>. <ext-link ext-link-type="uri" xlink:href="https://easystats.github.io/easystats/">https://easystats.github.io/easystats/</ext-link></mixed-citation></ref>
      <ref id="bib.bibx31"><mixed-citation publication-type="book"><string-name><surname>Maurer</surname>, <given-names>M.</given-names></string-name>, &amp; <string-name><surname>Reinemann</surname>, <given-names>C.</given-names></string-name> (<year>2006</year>). <source><italic>Medieninhalte. Eine Einführung</italic></source>. <publisher-name>VS</publisher-name>, <publisher-loc>Wiesbaden</publisher-loc>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1007/978-3-531-90179-4">https://doi.org/10.1007/978-3-531-90179-4</ext-link></mixed-citation></ref>
      <ref id="bib.bibx32"><mixed-citation publication-type="book"><string-name><surname>McElreath</surname>, <given-names>R.</given-names></string-name> (<year>2016</year>). <source><italic>Statistical Rethinking: A Bayesian Course with Examples in R and Stan</italic></source>. <publisher-name>CRC Press</publisher-name>, <publisher-loc>New York</publisher-loc>.</mixed-citation></ref>
      <ref id="bib.bibx33"><mixed-citation publication-type="journal"><string-name><surname>Müller</surname>, <given-names>P.</given-names></string-name>, <string-name><surname>Chan</surname>, <given-names>CH.</given-names></string-name>, <string-name><surname>Ludwig</surname>, <given-names>K.</given-names></string-name>, <string-name><surname>Freudenthaler</surname>, <given-names>R.</given-names></string-name>, &amp; <string-name><surname>Wessler</surname>, <given-names>H.</given-names></string-name> (<year>2023</year>). <article-title>Differential Racism in the News: Using Semi-Supervised Machine Learning to Distinguish Explicit and Implicit Stigmatization of Ethnic and Religious Groups in Journalistic Discourse</article-title>. <source><italic>Political Communication</italic></source>, <volume>40</volume>(<issue>4</issue>), <fpage>396</fpage>–<lpage>414</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1080/10584609.2023.2193146">https://doi.org/10.1080/10584609.2023.2193146</ext-link></mixed-citation></ref>
      <ref id="bib.bibx34"><mixed-citation publication-type="journal"><string-name><surname>Pennington</surname>, <given-names>J.</given-names></string-name>, <string-name><surname>Socher</surname>, <given-names>R.</given-names></string-name>, &amp; <string-name><surname>Manning</surname>, <given-names>C.</given-names></string-name> (<year>2014</year>). <article-title>GloVe: Global Vectors for Word Representation</article-title>. <source><italic>Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)</italic></source>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.3115/v1/d14-1162">https://doi.org/10.3115/v1/d14-1162</ext-link></mixed-citation></ref>
      <ref id="bib.bibx35"><mixed-citation publication-type="journal"><string-name><surname>Quinn</surname>, <given-names>K. M.</given-names></string-name>, <string-name><surname>Monroe</surname>, <given-names>B. L.</given-names></string-name>, <string-name><surname>Colaresi</surname>, <given-names>M.</given-names></string-name>, <string-name><surname>Crespin</surname>, <given-names>M. H.</given-names></string-name>, &amp; <string-name><surname>Radev</surname>, <given-names>D. R.</given-names></string-name> (<year>2010</year>). <article-title>How to Analyze Political Attention with Minimal Assumptions and Costs</article-title>. <source><italic>American Journal of Political Science</italic></source>, <volume>54</volume>(<issue>1</issue>), <fpage>209</fpage>–<lpage>228</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1111/j.1540-5907.2009.00427.x">https://doi.org/10.1111/j.1540-5907.2009.00427.x</ext-link></mixed-citation></ref>
      <ref id="bib.bibx36"><mixed-citation publication-type="journal"><string-name><surname>Rudkowsky</surname>, <given-names>E.</given-names></string-name>, <string-name><surname>Haselmayer</surname>, <given-names>M.</given-names></string-name>, <string-name><surname>Wastian</surname>, <given-names>M.</given-names></string-name>, <string-name><surname>Jenny</surname>, <given-names>M.</given-names></string-name>, <string-name><surname>Emrich</surname>, <given-names>Š.</given-names></string-name>, &amp; <string-name><surname>Sedlmair</surname>, <given-names>M.</given-names></string-name> (<year>2018</year>). <article-title>More than Bags of Words: Sentiment Analysis with Word Embeddings</article-title>. <source><italic>Communication Methods and Measures</italic></source>, <volume>12</volume>(<issue>2-3</issue>), <fpage>140</fpage>–<lpage>157</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1080/19312458.2018.1455817">https://doi.org/10.1080/19312458.2018.1455817</ext-link></mixed-citation></ref>
      <ref id="bib.bibx37"><mixed-citation publication-type="journal"><string-name><surname>Sales</surname>, <given-names>A.</given-names></string-name>, <string-name><surname>Balby</surname>, <given-names>L.</given-names></string-name>, &amp; <string-name><surname>Veloso</surname>, <given-names>A.</given-names></string-name> (<year>2019</year>). <article-title>Media bias characterization in Brazilian presidential elections</article-title>. <source><italic>Proceedings of the 30th ACM Conference on Hypertext and Social Media</italic></source>, <fpage>231</fpage>–<lpage>240</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1145/3345645.3351107">https://doi.org/10.1145/3345645.3351107</ext-link></mixed-citation></ref>
      <ref id="bib.bibx38"><mixed-citation publication-type="journal"><string-name><surname>Song</surname>, <given-names>H.</given-names></string-name>, <string-name><surname>Tolochko</surname>, <given-names>P.</given-names></string-name>, <string-name><surname>Eberl</surname>, <given-names>JM.</given-names></string-name>, <string-name><surname>Eisele</surname>, <given-names>O.</given-names></string-name>, <string-name><surname>Greussing</surname>, <given-names>E.</given-names></string-name>, <string-name><surname>Heidenreich</surname>, <given-names>T.</given-names></string-name>, <string-name><surname>Lind</surname>, <given-names>F.</given-names></string-name>, <string-name><surname>Galyga</surname>, <given-names>S.</given-names></string-name>, &amp; <string-name><surname>Boomgaarden</surname>, <given-names>H. G.</given-names></string-name> (<year>2020</year>). <article-title>In Validations We Trust? The Impact of Imperfect Human Annotations as a Gold Standard on the Quality of Validation of Automated Content Analysis</article-title>. <source><italic>Political Communication</italic></source>, <volume>37</volume>(<issue>4</issue>), <fpage>550</fpage>–<lpage>572</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1080/10584609.2020.1723752">https://doi.org/10.1080/10584609.2020.1723752</ext-link></mixed-citation></ref>
      <ref id="bib.bibx39"><mixed-citation publication-type="journal"><string-name><surname>Soroka</surname>, <given-names>S.</given-names></string-name>, &amp; <string-name><surname>McAdams</surname>, <given-names>S.</given-names></string-name> (<year>2015</year>). <article-title>News, politics, and negativity</article-title>. <source><italic>Political Communication</italic></source>, <volume>32</volume>(<issue>1</issue>), <fpage>1</fpage>–<lpage>22</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1080/10584609.2014.881942">https://doi.org/10.1080/10584609.2014.881942</ext-link></mixed-citation></ref>
      <ref id="bib.bibx40"><mixed-citation publication-type="journal"><string-name><surname>Spliethöver</surname>, <given-names>M.</given-names></string-name>, &amp; <string-name><surname>Wachsmuth</surname>, <given-names>H.</given-names></string-name> (<year>2021</year>). <article-title>Bias Silhouette Analysis: Towards Assessing the Quality of Bias Metrics for Word Embedding Models</article-title>. <source><italic>IJCAI</italic></source>, <fpage>552</fpage>–<lpage>559</lpage>.</mixed-citation></ref>
      <ref id="bib.bibx41"><mixed-citation publication-type="journal"><string-name><surname>Thijs</surname>, <given-names>G.</given-names></string-name>, <string-name><surname>Trilling</surname>, <given-names>D.</given-names></string-name>, &amp; <string-name><surname>Kroon</surname>, <given-names>A. C.</given-names></string-name> (<year>2024</year>). <article-title>Contextualized Word Embeddings Expose Ethnic Biases in News</article-title>. <source><italic>ACM Web Science Conference</italic></source>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1145/3614419.3643994">https://doi.org/10.1145/3614419.3643994</ext-link></mixed-citation></ref>
      <ref id="bib.bibx42"><mixed-citation publication-type="journal"><string-name><surname>Urman</surname>, <given-names>A.</given-names></string-name>, &amp; <string-name><surname>Makhortykh</surname>, <given-names>M.</given-names></string-name> (<year>2022</year>). <article-title>“Foreign beauties want to meet you”: The sexualization of women in Google’s organic and sponsored text search results</article-title>. <source><italic>New Media &amp; Society</italic></source>, <fpage>146144482210995</fpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1177/14614448221099536">https://doi.org/10.1177/14614448221099536</ext-link></mixed-citation></ref>
      <ref id="bib.bibx43"><mixed-citation publication-type="journal"><string-name><surname>Van Atteveldt</surname>, <given-names>W.</given-names></string-name>, &amp; <string-name><surname>Peng</surname>, <given-names>TQ.</given-names></string-name> (<year>2018</year>). <article-title>When Communication Meets Computation: Opportunities, Challenges, and Pitfalls in Computational Communication Science</article-title>. <source><italic>Communication Methods and Measures</italic></source>, <volume>12</volume>(<issue>2-3</issue>), <fpage>81</fpage>–<lpage>92</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1080/19312458.2018.1458084">https://doi.org/10.1080/19312458.2018.1458084</ext-link></mixed-citation></ref>
      <ref id="bib.bibx44"><mixed-citation publication-type="journal"><string-name><surname>Van Atteveldt</surname>, <given-names>W.</given-names></string-name>, <string-name><surname>Van der Velden</surname>, <given-names>M. A. C. G.</given-names></string-name>, &amp; <string-name><surname>Boukes</surname>, <given-names>M.</given-names></string-name> (<year>2021</year>). <article-title>The Validity of Sentiment Analysis: Comparing Manual Annotation, Crowd-Coding, Dictionary Approaches, and Machine Learning Algorithms</article-title>. <source><italic>Communication Methods and Measures</italic></source>, <fpage>1</fpage>–<lpage>20</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1080/19312458.2020.1869198">https://doi.org/10.1080/19312458.2020.1869198</ext-link></mixed-citation></ref>
      <ref id="bib.bibx45"><mixed-citation publication-type="book"><string-name><surname>Wiedemann</surname>, <given-names>G.</given-names></string-name>, &amp; <string-name><surname>Fedtke</surname>, <given-names>C.</given-names></string-name> (<year>2021</year>). <chapter-title>From Frequency Counts to Contextualized Word Embeddings. The Saussurean Turn in Automatic Content Analysis</chapter-title>. In <source><italic>Handbook of Computational Social Science, Volume 2: Data Science, Statistical Modelling, and Machine Learning Methods</italic></source> (pp. <fpage>366</fpage>–<lpage>385</lpage>). <publisher-name>Routledge</publisher-name>, <publisher-loc>London</publisher-loc>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.4324/9781003025245-25">https://doi.org/10.4324/9781003025245-25</ext-link></mixed-citation></ref>
      <ref id="bib.bibx46"><mixed-citation publication-type="journal"><string-name><surname>Young</surname>, <given-names>M. E.</given-names></string-name>, <string-name><surname>Cole</surname>, <given-names>J. J.</given-names></string-name>, &amp; <string-name><surname>Sutherland</surname>, <given-names>S. C.</given-names></string-name> (<year>2011</year>). <article-title>Rich stimulus sampling for between-subjects designs improves model selection</article-title>. <source><italic>Behavior Research Methods</italic></source>, <volume>44</volume>(<issue>1</issue>), <fpage>176</fpage>–<lpage>188</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.3758/s13428-011-0133-5">https://doi.org/10.3758/s13428-011-0133-5</ext-link></mixed-citation></ref>
    </ref-list>
    </back>
</article>