Abstract
The existence of jingle fallacies (equally named constructs/measures that, in fact, assess different constructs) and jangle fallacies (differently named constructs/measures that, in fact, measure the same concept) jeopardizes psychological assessment, as both are associated with conceptual and assessment-related uncertainties. A guideline presented by Lawson and Robins (2021, Personality and Social Psychology Review, 25, 344–366) helps evaluate the intensity of such fallacies. While the guideline is well elaborated, psychometric aspects regarding (dis)similarities of nomological networks require extensions and differentiations. I recommend two analytical advancements, namely (a) the derivation of correlation difference hypotheses for criteria with which the allegedly jingled (jangled) variables are assumed to be correlated at equal (different) levels and (b) procedures to derive cutoffs for the overall similarity of nomological networks based on the elemental approach (Kay & Arrow, 2022, Social and Personality Psychology Compass, 16, e12662). With respect to correlation difference tests, I further outline the importance of power analyses. These extensions help improve the evaluation of assumed jingle and jangle fallacies, arguably increasing the stability and reliability of research findings.
Psychology differs from other scientific branches in terms of the visibility of its subject matter. One day, technical advances may enable physicists and chemists to make even subatomic particles visible, but psychologists will never directly observe psychological characteristics. For this reason, it is not easy to tell at first glance whether the intended psychological constructs are measured accurately or whether the employed instruments instead assess closely related yet distinct constructs. Consequently, research psychologists frequently postulate ostensibly new constructs that turn out to be redundant with already known constructs only after at least a few studies have been conducted on the allegedly new construct. Likewise, different researchers often come up with different definitions of the same construct, leading to different conceptualizations and, thus, non-convergence of the measures. Erroneously treating concepts/measures as distinct because they have different names is called a jangle fallacy; falsely claiming that concepts/measures are identical because they are named equally is called a jingle fallacy. Both pose serious problems because they impair the comparability of findings due to heterogeneity in assessment and contribute to the instability of findings and the waste of research resources (Hodson, 2021).
Owing to the above problems, it is important to be able to identify jingle and jangle fallacies at an early stage, ideally in the derivation study for a new construct/measure. Lawson and Robins (2021) provided such a guideline with 10 criteria that can be evaluated independently, so that different conclusions can arise from different criteria. Thus, jingle and jangle fallacies fall on a spectrum rather than being dichotomous conditions, suggesting that it is more appropriate to speak of weak, moderate, or strong evidence for or against a jingle/jangle fallacy than of presence versus absence. To acknowledge this, they introduced the term sibling constructs to refer to constructs that “share a close, familial relation, but are not identical; that is, they are not ‘twin constructs’” (p. 345). It stands to reason, however, that perfect twins rarely exist in the realm of psychological constructs. Following the principle of theoretical parsimony, researchers frequently treat constructs as equal if there is little conceptual and empirical space for uniqueness. Thus, postulating a “new” construct only pays off if it reflects a sibling construct in the sense of Lawson and Robins but not a twin construct in the sense of a jangle fallacy. In this regard, Lawson and Robins (2021) themselves consistently used the jingle/jangle terminology in their paper and applied their framework to examples of (assumed) jingle and jangle fallacies (not just what they called sibling constructs), making the framework universally applicable.
Specifically, Lawson and Robins’ (2021) framework evaluates conceptual similarities of definitions (1), similarities of the hypothesized and empirically observed nomological networks (2 and 3), correlations of the measures (4), the constitution of common vis-à-vis separate factors of pertinent measures in factor analyses (5), mutual incremental validity for crucial criteria (6), the existence of shared developmental routes, a common cause, or even a causal relationship (7 to 9), and whether the ostensibly jingled/jangled concepts are state or trait manifestations of the same entity (10). Problematically, it remains unclear where to draw the line when deciding, for a given criterion from the framework, whether two constructs are sibling constructs. The same problem applies to decisions about jingle and jangle fallacies, requiring researchers to provide more detailed considerations about the potentially subjective bases of their decisions, regardless of whether they rely on the sibling or the jingle/jangle terminology. The current manuscript describes psychometric refinements of Lawson and Robins’ (2021) framework that can be used for evaluations of jingle/jangle fallacies and sibling constructs alike.
The guideline itself is valuable, as it provides a structure of criteria with which many scholars will likely agree. However, it stands to reason that the recommended approach to evaluating distinct expected nomological networks (Criterion 2) oversimplifies conclusions about jingle/jangle fallacies and that Lawson and Robins’ (2021) proposed procedure to evaluate the equivalence of nomological networks (Criterion 3) relies on unrealistic premises. In this paper, I propose extensions of these aspects that call for stronger incorporation of conceptual descriptions of the evaluated constructs (cf. Criterion 1) and empirical overlaps (Criterion 3). A stronger emphasis on these aspects arguably helps quantify the extent to which a jingle/jangle fallacy (or a pair of sibling constructs) exists and advances the transparency of respective evaluations.
Expanding on “high degree of overlap in their actual nomological network”
Directional hypotheses versus difference hypotheses
The guideline by Lawson and Robins (2021) remained comparatively vague about how to analyze the nomological networks of potentially jingled/jangled variables. In fact, there are different ways to derive hypotheses about expected nomological networks, but they differ in specificity. The simplest approach is to postulate the direction of each correlation: Let X and Y be constructs for which a jangle fallacy is to be tested, and let A be a validation criterion with which X and Y should be correlated. A phrasing for directional hypotheses would be “X is expected to be positively correlated with A, whereas Y is expected to be negatively correlated with A.” This type of hypothesis is informative if correlations in different directions are assumed, but it entails a massive loss of information for evaluations of jingle and jangle fallacies if correlations in the same direction are assumed for both X and Y. A more informative yet less frequently adopted type of hypothesis concerns correlation differences. This type of hypothesis requires researchers to justify which of the competing constructs exhibits the stronger positive or negative correlation. An exemplary study employing the correlation difference approach dealt with the assumed jangle fallacy of subclinical psychopathy (hereafter psychopathy) and everyday sadism (hereafter sadism), both of which are characterized by aggression and dominance seeking (among other features). Resorting to violence is focal to the definition of sadism, and dominance seeking is the assumed underlying motive. In comparison, aggression and dominance seeking are rather peripheral to psychopathy and reflect only two out of many features, leading the authors to hypothesize that sadism would be more strongly related to aggression and dominance seeking (Blötner & Mokros, 2023). Similar correlations of measures of the competing constructs with aggression, dominance seeking, and other important constructs suggested a jangle fallacy that would not have become clear in such detail through mere directional testing (for confirmations of these findings, see Blötner et al., 2024; Kowalski et al., 2024). Lawson and Robins (2021) did not consider the potential of correlation differences, presumably leading most researchers to resort to mere directional analyses of correlations when they apply the framework. Arguably, it can be cumbersome to derive hypotheses about which construct/measure is most strongly related to an outcome or a set of outcomes, but these considerations provide additional information about areas of overlap and reduce the chances of falsely claiming a jingle/jangle fallacy (see below).
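To make the correlation difference approach concrete, the following base R sketch implements Steiger’s (1980) z test for two dependent, overlapping correlations. It is a didactic illustration with hypothetical coefficients, not the exact procedure of the cited studies; the diffcor package referenced below provides ready-made functions for such tests.

# Minimal R sketch of Steiger's (1980) z test for two dependent, overlapping
# correlations r(X,A) and r(Y,A); all coefficient values are hypothetical.
steiger_z <- function(r_xa, r_ya, r_xy, n) {
  # Fisher z transforms of the two correlations to be compared
  z_xa <- atanh(r_xa)
  z_ya <- atanh(r_ya)
  # mean correlation used to estimate the covariance of the two coefficients
  r_bar <- (r_xa + r_ya) / 2
  # covariance term for dependent, overlapping correlations (Steiger, 1980)
  s <- (r_xy * (1 - 2 * r_bar^2) - 0.5 * r_bar^2 * (1 - 2 * r_bar^2 - r_xy^2)) /
    (1 - r_bar^2)^2
  z <- (z_xa - z_ya) * sqrt((n - 3) / (2 - 2 * s))
  c(z = z, p = 2 * pnorm(-abs(z)))  # two-tailed p value
}

# Hypothetical example: is sadism (X) more strongly related to aggression (A)
# than psychopathy (Y) is?
steiger_z(r_xa = .45, r_ya = .30, r_xy = .60, n = 500)

Note that the intercorrelation of the two competing measures (r_xy) enters the test: the more strongly the ostensibly jangled measures correlate, the more precise the comparison of their criterion correlations becomes.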
Overall agreement of nomological networks
Besides evaluations of bivariate correlations with crucial criteria, Lawson and Robins (2021) recommended computing the overall agreement of the observed nomological networks (spanned by a reasonable set of criteria) of the measures for which a jingle or jangle fallacy is tested. To this end, they suggested the double-entry intraclass correlation (ICCDE). To calculate it, the correlations observed between one construct/measure in question and the validation criteria are appended to the observed correlations of the other one and vice versa (see Table 1 for an illustration). The ICCDE is the bivariate correlation between these doubly entered vectors of correlations. The double-entry method ensures that the resultant coefficient reflects the similarity of the coefficients rather than mere rank-order similarity.
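Computationally, the double-entry procedure reduces to correlating the two concatenated profiles in both orders. A minimal base R sketch with hypothetical correlation profiles follows (the iccde package discussed below provides a dedicated implementation):

# Double-entry intraclass correlation (ICCDE) between two hypothetical
# profiles of correlations with the same five validation criteria
r_x <- c(.40, -.20, .35, .10, -.05)  # correlations of measure X with criteria
r_y <- c(.35, -.25, .30, .15, .00)   # correlations of measure Y with criteria

# Append each profile to the other in reversed order ("double entry") so that
# differences in level, not just rank order, lower the coefficient
icc_de <- cor(c(r_x, r_y), c(r_y, r_x))
icc_de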
Lawson and Robins (2021) recommended ICCDE ≥ .60 as a cutoff for equivalence. However, as they themselves noted, the agreement of nomological networks strongly depends on the selection of constructs constituting the network. Furthermore, it stands to reason that empirical agreements of correlation profiles exceed this cutoff quite easily owing to common method effects, for instance, if all variables are measured with self-report scales or if all selected criteria, by and large, reflect very prosocial or very antisocial constructs, to name just two causes. In many cases, certain overlaps are inevitable because “everything correlates to some extent with everything else” (Meehl, 1990, p. 204). Previous studies that examined jangle fallacies applied a much higher cutoff than Lawson and Robins for treating nomological networks as equivalent (ICCDE ≈ .90; Blötner & Mokros, 2023; Few et al., 2016; Hart et al., 2023; Maples-Keller et al., 2019; Miller et al., 2017; Samuel et al., 2012). Applying a fixed cutoff, however, neglects the complexity of shared and non-shared elements of nomological networks. Thus, the present study proposes an approach to deriving a cutoff that also takes content-related specifics into account.
Decisions in the context of evaluations of jingle/jangle fallacies can be compared to a diagnostic process in that a decision must be made about whether committing a type I or a type II error is more problematic: A type I error in this context would occur if researchers claimed a jangle fallacy where there is none (i.e., a false positive), which could lead other scholars to treat the ostensibly jangled (yet factually distinguishable) constructs as equal. Imagine, for instance, that Blötner and Mokros (2023) had committed a type I error in their study on the jangle fallacy affecting sadism and psychopathy because they adopted a cutoff for the overall agreement of nomological networks that was too low. Under these circumstances, the construct of everyday sadism could be erroneously eliminated from a research area even though it was, in fact, sufficiently unique vis-à-vis subclinical psychopathy (note that subsequent studies confirmed Blötner and Mokros’ findings; e.g., Blötner et al., 2024; Kowalski et al., 2024). A type II error, in turn, occurs if researchers fail to recognize a jingle/jangle fallacy as such (i.e., a false negative), for example, because they adopted a cutoff for the overall agreement of nomological networks that was too high. Both kinds of false decisions can be costly and detrimental to the underlying field of research: In the case of a type I error, useful work done by earlier researchers would be labeled as “useless” despite being, in fact, practically and conceptually useful. Thus, the financial, personal, and time-related resources invested in the researchers’ work would not pay off as intended. In the case of a type II error, the same kinds of resources would be “wasted” on upholding parallel strands of research for factually the same hypothetical construct (cf. Hodson, 2021). The decision as to whether committing a type I or a type II error is more costly should depend on the research area and the amount of work already devoted to the constructs under scrutiny. In this regard, unspecific cutoffs and requirements for the evaluation of the similarity of hypothesized and observed nomological networks (Criteria 2 and 3) can erroneously suggest a jingle/jangle fallacy or favor a respective oversight, especially if researchers do not acknowledge the conceptual proximity of the constructs (Criterion 1). Thus, Lawson and Robins (2021) recommended examining their 10 criteria simultaneously and integrating them into an overall judgment.
Methodological considerations to advance Lawson and Robins’ framework
Which numerical differences are treated as meaningful differences?
An important issue related to statistical evaluations concerns drawing a reasonably large sample to test the hypotheses of the study, because the sample size determines whether an effect size of a given magnitude reaches significance at a specified α-level. That said, benchmarks for the evaluation of effect sizes that neglect the concrete context appear useless, suggesting that researchers should justify which effect size they deem nontrivial for the concrete study (Giner-Sorolla et al., 2024; Riesthuis, 2024). In the context of jangle fallacies, researchers should provide strong reasoning about which numerical difference between the correlations of two ostensibly jangled measures with an external construct they consider too small to be interpreted as a meaningful difference. Afterward, researchers should collect enough data to reliably detect this correlation difference, highlighting the role of a priori power analyses.
The R package diffcor (version 0.8.4; Blötner, 2024) provides Monte Carlo-based power analyses for tests of differences between dependent correlations (diffpwr.dep; see the supplemental template R file at https://osf.io/v5dte/; a hedged example call is sketched below, after the figure note). Consider, for example, a researcher who decides to treat a difference of Δr = rXA − rYA ≥ .10 between the correlations of the constructs X and Y with a validation criterion A as meaningful and who expects X and Y to be correlated with A at rXA = .10 and rYA = .20, respectively, with X and Y intercorrelated at rXY = .40. Applying the template R script to this example reveals that a sample size of n = 500 is not sufficient to detect this correlation difference with the level of statistical power usually deemed the lower bar for sufficiency (i.e., 1 − β ≥ .80; Giner-Sorolla et al., 2024), given an α-level of 5% (see the left side of Fig. 1), but a sample size of n = 950 is (see the right side of Fig. 1). In addition, the simulated data accurately reflect the target coefficients (indices denoted bias and cov; see the explanations in the Note).
Illustration of the power analyses for the outlined exemplary correlation difference. Note. n = Tested sample size. rho12 = Correlation of the validation criterion with the first construct (equals rXA from the text). rho13 = Correlation of the validation criterion with the second construct (equals rYA from the text). rho23 = Correlation between the ostensibly jangled constructs (equals rXY from the text). alpha = Type I error level α. n.samples = Number of samples drawn for the simulation. Parameters denoted with cov in the output indicate the proportion of confidence intervals around the simulated correlations that include the target correlation. Parameters denoted with bias in the output indicate the relative differences between the mean (appended _M) or the median (appended _Md) of the distribution of the simulated correlations and the target correlation. pwr = Achieved power 1 − β. For more information, see Blötner (2024)
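A call reproducing this example might look as follows. The argument names mirror the labels explained in the figure note (n, rho12, rho13, rho23, alpha, n.samples); readers should verify the exact signature via ?diffpwr.dep for their installed version.

# Monte Carlo power analysis for a dependent correlation difference test;
# argument names follow the labels described in the figure note and should
# be checked against ?diffpwr.dep for the installed package version.
library(diffcor)  # install.packages("diffcor") if necessary

diffpwr.dep(
  n = 950,          # tested sample size
  rho12 = .10,      # target correlation of the criterion A with construct X
  rho13 = .20,      # target correlation of the criterion A with construct Y
  rho23 = .40,      # target intercorrelation of X and Y
  alpha = .05,      # type I error level
  n.samples = 1000  # number of simulated samples
)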
Note that, depending on the intercorrelation of the ostensibly jingled/jangled constructs, this procedure requires larger sample sizes than approaches intended to ensure sufficient power to detect a bivariate correlation of prespecified size. Thus, a correlation coefficient of r = .10 reaches significance at a smaller sample size than a correlation difference of Δr = .10, given equal α-levels and target power (Riesthuis, 2024). It may be appealing to reduce the necessary sample size by increasing the correlation difference deemed critical for (non)redundancy (e.g., Δr = .20). Under these circumstances, ceteris paribus, 250 participants would be needed to detect the pattern outlined in the previous paragraph with a power of 83%. Note in this regard that a nonsignificant correlation difference due to a small sample size can lead scholars to falsely claim that two test scores are correlated with the validation criterion at similar levels although the descriptive difference (Δr) is noteworthy. For example, Gignac and Szodorai’s (2016) effect size conventions treat a correlation of r = ±.20 as a moderate effect. Therefore, I argue that robust and consequential research requires and deserves well-powered studies, suggesting that sound studies on jingle and jangle fallacies should not be too lenient by relying on unduly small samples and/or unduly large critical differences.
That said, power analyses per se cannot tell whether nomological networks differ, because, following the above reasoning on type I and II errors, statistically significant differences are not necessarily meaningful from a practical stance (i.e., sufficient power to detect even negligible differences), and practically meaningful differences do not always yield significance (i.e., insufficient power to detect meaningful effects). However, power analyses require researchers to reveal which (potentially subjective) criteria they apply to decide whether they treat (a difference between) effect sizes as meaningful or negligible (see Giner-Sorolla et al., 2024, for considerations about transparent reports and justifications of assumptions). For instance, if a study suffers from insufficient power to detect a target effect, differences between correlation coefficients that are noteworthy at a descriptive level could be erroneously treated as evidence in favor of a jangle fallacy because the statistical test does not yield significance at a specific α-level (e.g., rs = .10 and .30, given n = 100 and an intercorrelation r = .40, p = .06). If, in turn, the sample size of a study is sufficient to detect even negligible differences, researchers would overlook a jangle fallacy (e.g., rs = .10 and .12, given n = 4,000 and an intercorrelation r = .80, p = .04). Thus, the need to justify applied benchmarks when planning a jingle/jangle study renders power analyses a helpful tool to increase the transparency and reliability of respective conclusions.
Context-specific thresholds for the agreement of nomological networks
I argue that considerations about a critical level of agreement of the nomological networks must take the concrete degree of differences and overlaps into account (see Criterion 1 from the guideline by Lawson & Robins, 2021). That is, if only small differences exist between two potentially jangled variables, the applied cutoff should be higher than that for a case of two variables that reveal at least a moderate degree of differences. If the index of the similarity of the observed nomological networks is smaller than the cutoff yet still high at the numerical level, a strong jangle fallacy can be ruled out, and the conclusion of sibling constructs as per Lawson and Robins would make sense.
As yet, there is no consensus on an approach to deriving thresholds for the similarity of observed nomological networks that considers concrete commonalities and differences between the investigated variables. To address this gap, I propose a scheme that builds on the integration of correlation difference hypotheses. This extension also serves to integrate Lawson and Robins’ (2021) Criteria 1 to 3 more closely (i.e., conceptual differences as well as similarities of hypothesized and observed nomological networks): First, based on the underlying conceptualizations of the ostensibly jangled constructs, researchers need to select a reasonably large set of validation criteria for which distinct correlations are expected, but also criteria for which no differences are assumed (cf. Blötner & Mokros, 2023). Including criteria for which no differences are assumed ensures that shared features of both constructs are also considered. Based on the comparison of the underlying theoretical frameworks, researchers can identify external constructs with which one of the ostensibly jangled constructs is expected to exhibit stronger relations.
To specify a set of shared and distinct contents, the elemental approach provides a helpful framework. It reflects the decomposition of theoretically important contents of the constructs under scrutiny and can be visualized by Venn diagrams (Kay & Arrow, 2022; see Fig. 2 for a schematic illustration). Based on the decomposition of theoretically important contents, researchers can derive operationalizations of classes of criteria. For instance, researchers may operationalize selfishness, which represents important content of psychopathy, through contributions to a public goods game.
Schematic application of the elemental approach to derive unique and shared elements. Note. UniqueX and UniqueY denote contents that are theoretically more important for X and Y, respectively. SharedXY denotes contents that are (equally) important for both. The numbers of unique and shared contents (i, j, k) can, but need not, differ
After defining the central elements of the nomological networks, researchers can derive the threshold for (non)redundancy from the ratio of empirically detected correlation differences (denoted d) to the total number of measured criteria (denoted t). The critical threshold for the ICCDE results from the formula ICCDE,crit = 1 − d/t. For example, consider a researcher who administered 20 criterion measures (t = 20) and detected significant correlation differences in the sense of the last section in five cases (d = 5). Applying the above formula, the cutoff for a jangle fallacy would be ICCDE,crit = 1 − 5/20 = .75. If the empirically observed ICCDE exceeds .75, the researcher would conclude that a jangle fallacy occurred. An adaptation of this procedure for assumed negative overlaps (i.e., one construct is assumed to reflect the opposite of the other [e.g., altruism and egotism]) requires an alternative counting approach for d: In this case, researchers count the number of correlation pairs that differ in (absolute) strength of association rather than merely in direction (i.e., |rXA| ≠ |rYA|). That is, empirically observed correlations rXA and rYA are treated as functionally different only if, for instance, X is more strongly positively related to A than Y is negatively related to A (e.g., rs = .30 and −.30 are functionally equivalent, but rs = −.20 and .50 are not).
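A minimal sketch of the two counting rules, with hypothetical inputs:

# Context-specific ICCDE cutoff: 1 - d/t (hypothetical numbers)
t_total <- 20                     # number of administered criterion measures
d_diff <- 5                       # significant correlation differences found
icc_crit <- 1 - d_diff / t_total  # = .75

# Adaptation for assumed negative overlap: count pairs that differ in
# absolute strength, not merely in sign (in practice, this comparison should
# rest on a significance test of the difference, not on raw inequality)
r_x <- c(.30, -.20, .45)   # hypothetical correlations of X with three criteria
r_y <- c(-.30, .50, -.45)  # hypothetical correlations of Y with the criteria
d_neg <- sum(abs(r_x) != abs(r_y))       # here: 1 (only the second pair)
icc_crit_neg <- 1 - d_neg / length(r_x)  # = .67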
To provide a pertinent significance testing approach, the icc.de.boot function from the R package iccde (version 0.3.6; Blötner & Grosz, 2024) computes bootstrap confidence intervals for ICCDEs at the desired α-level. The only mandatory input is the argument data, which expects a data frame with variables in columns and participants in rows, containing the test scores of the ostensibly jingled/jangled constructs as well as the test scores of the constructs with which correlations are to be examined. Figure 3 provides the output for simulated data with four variables from 1,000 participants. By default, the function computes 95% confidence intervals from 1,000 bootstrap samples, but the confidence level and the number of bootstrap samples can be edited.
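A usage sketch with simulated data follows; the data argument is the only mandatory input (see above), the simulated population correlations are purely illustrative, and the names of the optional arguments for the confidence level and the number of bootstrap samples should be looked up via ?icc.de.boot.

# Bootstrap confidence interval for the ICCDE; the simulated data and the
# chosen population correlations are purely illustrative.
library(iccde)  # install.packages("iccde") if necessary
library(MASS)

set.seed(1)
# Columns 1-2: ostensibly jangled test scores; columns 3-4: validation criteria
sigma <- matrix(c(1.00, .60, .30, .20,
                   .60, 1.00, .25, .25,
                   .30, .25, 1.00, .10,
                   .20, .25, .10, 1.00), nrow = 4)
dat <- as.data.frame(mvrnorm(n = 1000, mu = rep(0, 4), Sigma = sigma))

# Defaults: 95% confidence interval from 1,000 bootstrap samples (see text)
icc.de.boot(data = dat)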
If the bootstrap confidence interval of the empirically observed ICCDE lies entirely above the threshold derived using the above formula, the findings militate in favor of a strong jangle fallacy (i.e., the nomological networks overlap significantly more strongly than expected, suggesting that the measures assess the same construct) or against a strong jingle fallacy (i.e., the nomological networks exhibit sufficient overlap to conclude that the measures assess the same construct).
It is important to note that the above proposal for the derivation of a cutoff for the similarity between nomological networks builds on conceptual considerations but does not consider measurement error. Thus, it is important to clarify how measurement error impacts the empirically observed ICCDE when the individual correlations of potentially jingled/jangled test scores with validation criteria are attenuated due to unreliability. To account for this, one might calculate ICCDEs from latent factor correlations obtained in confirmatory factor analyses or apply an attenuation correction to manifest correlations. Due to the easier implementation and lower computational requirements, however, most scholars who evaluate jingle or jangle fallacies presumably examine correlations between manifest test scores (hereafter raw correlations) rather than disattenuated correlations or correlations between latent factors.
To evaluate the extent to which an ICCDE that rests on raw correlations differs from the corresponding ICCDE that rests on disattenuated correlations, I conducted a simulation study (see the OSF supplement at https://osf.io/4um8a; a condensed sketch of the comparison follows the figure note below). I simulated 10,000 pairs of vectors of 20 raw correlation coefficients each (ranging from r = −.60 to r = .60) and computed the ICCDE for each of the 10,000 pairs (see Footnote 1). Furthermore, I computed the corresponding ICCDE for each pair of vectors of disattenuated correlation coefficients (i.e., vectors of correlations corrected for measurement error). In doing so, I simulated reliability levels ranging from rxx = .70 to rxx = 1.00. As can be seen in Fig. 4, differences between ICCDEs derived from raw vis-à-vis disattenuated correlations were negligible: The ICCDEs computed from raw versus disattenuated correlations were themselves correlated at r = .999, and the 95% confidence interval of the differences ranged from −.022 to .022 (see the dashed vertical lines in panel c in Fig. 4). Thus, computing ICCDEs with disattenuated correlations does not elicit particular advantages over analyses of raw correlations.
Relations and differences between ICCDEs derived from raw and disattenuated correlations. Note. a Scatterplot for the relationship between ICCDEs derived from raw versus disattenuated correlations. b Distribution of the difference between the ICCDEs derived from raw versus disattenuated correlations
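As announced above, a condensed sketch of the core comparison follows; the reliabilities and correlation profiles are hypothetical stand-ins for the reported simulation, whose full code is available in the OSF supplement.

# Compare ICCDEs computed from raw versus disattenuated correlations;
# reliabilities and correlation profiles are hypothetical stand-ins for the
# simulation reported in the text (full code in the OSF supplement).
set.seed(1)
n_crit <- 20
r_x <- runif(n_crit, -.60, .60)                            # raw correlations of X
r_y <- pmin(pmax(r_x + rnorm(n_crit, 0, .10), -.60), .60)  # similar profile for Y

rel_x <- .80                       # hypothetical reliability of X's measure
rel_y <- .85                       # hypothetical reliability of Y's measure
rel_a <- runif(n_crit, .70, 1.00)  # reliabilities of the criterion measures

# Classical attenuation correction: r_true = r_obs / sqrt(rel_1 * rel_2)
r_x_dis <- r_x / sqrt(rel_x * rel_a)
r_y_dis <- r_y / sqrt(rel_y * rel_a)

# Double-entry ICCs from the raw and the disattenuated correlation profiles
icc_raw <- cor(c(r_x, r_y), c(r_y, r_x))
icc_dis <- cor(c(r_x_dis, r_y_dis), c(r_y_dis, r_x_dis))
c(raw = icc_raw, disattenuated = icc_dis)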
Conclusion
Arguably, deriving unique and shared elements with the elemental approach is highly subjective, as most psychological theories lack the formalization needed to obtain markedly high inter-rater agreement on these elements. Likewise, the devised formula for deriving context-specific thresholds for critical ICCDEs is quite simple. Yet, given its incorporation of specific, content-related considerations tailored to the research question at hand, it stands to reason that the outlined formula is more appropriate than abstract and vague cutoffs or rules of thumb derived from gut feelings that are supposed to hold for all imaginable tests of jingle/jangle fallacies. Of note, cutoffs derived from the above formula that are extremely low or extremely high (e.g., ICCDE,crit = .20 or .95) and thus very easy or very hard to exceed should raise doubts about the suitability of the employed set of validation criteria and render that set uninformative for answering the research question under scrutiny. That said, I would like to encourage future research to employ, challenge, and (if necessary) refine the outlined formula.
Availability of data and materials
Data sharing is not applicable because no real data were collected or analyzed during the current study.
Code availability
All R codes used for the present manuscript are available at https://osf.io/v5dte/.
Notes
The decisions about the selected number of correlations per vector and the range between −.60 and .60 were arbitrary. However, similar conclusions arise for other numbers of correlations per vector (50, 100) and other ranges of correlations (−.70 ≤ rs ≤ .70; see OSF supplement). Note that the upper and lower boundaries for the correlations must not be too high, because otherwise, disattenuated correlations could exceed ±1.
References
Blötner, C. (2024). diffcor: Fisher's z-tests concerning differences between correlations (R package version 0.8.4). CRAN. https://doi.org/10.32614/CRAN.package.diffcor
Blötner, C., & Grosz, M. P. (2024). iccde: Computation of the Double-Entry Intraclass Correlation (R package version 0.3.6). CRAN. https://doi.org/10.32614/CRAN.package.iccde
Blötner, C., & Mokros, A. (2023). The next distinction without a difference: Do psychopathy and sadism scales assess the same construct? Personality and Individual Differences, 205, 112102. https://doi.org/10.1016/j.paid.2023.112102
Blötner, C., Spormann, S. S., Hofmann, M. J., & Mokros, A. (2024). Measures of subclinical psychopathy and everyday sadism are still redundant: A conceptual replication and extension of Blötner and Mokros (2023). Journal of Personality. Advance online publication. https://doi.org/10.1111/jopy.12996
Few, L. R., Miller, J. D., Grant, J. D., Maples, J., Trull, T. J., Nelson, E. C., Oltmanns, T. F., Martin, N. G., Lynskey, M. T., & Agrawal, A. (2016). Trait-based assessment of borderline personality disorder using the NEO Five-Factor Inventory: Phenotypic and genetic support. Psychological Assessment, 28(1), 39–50. https://doi.org/10.1037/pas0000142
Gignac, G. E., & Szodorai, E. T. (2016). Effect size guidelines for individual differences researchers. Personality and Individual Differences, 102, 74–78. https://doi.org/10.1016/j.paid.2016.06.069
Giner-Sorolla, R., Montoya, A. K., Reifman, A., Carpenter, T., Lewis, N. A., Aberson, C. L., Bostyn, D. H., Conrique, B. G., Ng, B. W., Schoemann, A. M., & Soderberg, C. (2024). Power to detect what? Considerations for planning and evaluating sample size. Personality and Social Psychology Review, 28(3), 276–301. https://doi.org/10.1177/10888683241228328
Hart, W., Kinrade, C., Lambert, J. T., Breeden, C. J., & Witt, D. E. (2023). A closer examination of the integrity scale’s construct validity. Journal of Personality Assessment, 105(6), 743–751. https://doi.org/10.1080/00223891.2022.2152346
Hodson, G. (2021). Construct jangle or construct mangle? Thinking straight about (nonredundant) psychological constructs. Journal of Theoretical Social Psychology, 5(4), 576–590. https://doi.org/10.1002/jts5.120
Kay, C. S., & Arrow, H. (2022). Taking an elemental approach to the conceptualization and measurement of Machiavellianism, narcissism, and psychopathy. Social and Personality Psychology Compass, 16(4), e12662. https://doi.org/10.1111/spc3.12662
Kowalski, C. M., Plouffe, R. A., Daljeet, K. N., Trahair, C., Johnson, L. K., Saklofske, D. H., & Schermer, J. A. (2024). A multi-study investigation assessing the potential redundancy among the Dark Tetrad using a narrowband trait approach. Scientific Reports, 14(1), 17433. https://doi.org/10.1038/s41598-024-67952-4
Lawson, K. M., & Robins, R. W. (2021). Sibling constructs: What are they, why do they matter, and how should you handle them? Personality and Social Psychology Review, 25(4), 344–366. https://doi.org/10.1177/10888683211047101
Maples-Keller, J. L., Williamson, R. L., Sleep, C. E., Carter, N. T., Campbell, W. K., & Miller, J. D. (2019). Using item response theory to develop a 60-item representation of the NEO PI–R using the international personality item pool: Development of the IPIP–NEO–60. Journal of Personality Assessment, 101(1), 4–15. https://doi.org/10.1080/00223891.2017.1381968
Meehl, P. E. (1990). Why summaries of research on psychological theories are often uninterpretable. Psychological Reports, 66(1), 195–244. https://doi.org/10.2466/pr0.1990.66.1.195
Miller, J. D., Hyatt, C. S., Maples-Keller, J. L., Carter, N. T., & Lynam, D. R. (2017). Psychopathy and Machiavellianism: A distinction without a difference? Journal of Personality, 85(4), 439–453. https://doi.org/10.1111/jopy.12251
Riesthuis, P. (2024). Simulation-based power analyses for the smallest effect size of interest: A confidence interval approach for minimum-effect and equivalence testing. Advances in Methods and Practices in Psychological Science, 7(2), 25152459241240722. https://doi.org/10.1177/25152459241240722
Samuel, D. B., Miller, J. D., Widiger, T. A., Lynam, D. R., Pilkonis, P. A., & Ball, S. A. (2012). Conceptual changes to the definition of borderline personality disorder proposed for DSM-5. Journal of Abnormal Psychology, 121(2), 467–476. https://doi.org/10.1037/a0025285
Funding
Open Access funding enabled and organized by Projekt DEAL. This research did not receive specific funding from a third-party organization.
Ethics declarations
Ethics approval
Not applicable.
Consent to participate
Not applicable.
Consent for publication
Not applicable.
Conflicts of interest
I have no conflict of interest to declare.