Open Access Repository for Measurement Instruments


German Satisfaction with the Political System Short Scale (SPS)

  • Author(s): Dentler, K., Bluemke, M., & Gabriel, O. W.
  • In ZIS since: 2020
  • DOI:
  • Abstract: The German Satisfaction with the Political System Short Scale (SPS) (in German: Kurzskala zur Erfassung Politischer Systemzufriedenheit) measures people’s satisfaction with the political system with four items that capture a variety of indications of satisfaction (in particular, justice and freedom for all, and equal treatment of different population groups within a political system). The empirical validation for Germany shows that the SPS allows reliable and valid measurement of people’s satisfaction with the political system. The items used in the SPS scale have been used previously (Fuchs, 1987, 1989; Roller, 1992; Neidhardt et al., 1994). However, in those studies the items were mainly used as single indicators, not as a joint scale or index of satisfaction. The dataset analysed here was obtained via an interviewer-guided face-to-face interview. The SEM-based and recommended two-factor model showed a much better model fit than a one-factor model, also in terms of a simple structure of factor loadings. Both factors are useful and contribute to the common core of the general factor. While the items of the first factor focus on value-based aspects such as justice, fairness and freedom (hence termed justice and freedom for all), the items of the second deal with the influence, or equal treatment, of different population groups (hence named equal treatment of different population groups). The German instrument has been validated for the West-German population regardless of age and social class, but the authors invite researchers to adapt it to the English-speaking realm by using the supplied translations.
  • Language of documentation: English
  • Language of items: German
  • Number of items: 4
  • Survey mode: PAPI face-to-face
  • Completion time: 1-2 minutes (authors’ estimate)
  • Reliability: Cronbach’s alpha = .62; McDonald’s omega = .69
  • Validity: Evidence for factorial validity and partly for construct validity
  • Construct: Satisfaction with the political system
  • Keywords: political system, political support, political trust, political satisfaction
  • Item(s) used in population survey: yes
  • Development status: Tried
    • Important note: To make scale content accessible to non-German speakers as well, the authors provide English translations or adaptions of the German instruction and items. If not otherwise stated, these are ad hoc translations by the authors. This English-language adaption of the SPS scale is not empirically evaluated yet and the findings reported for the German SPS scale do not pertain to the English-language adaption. However, other researchers are cordially invited to test and validate an English-language adaption of the SPS scale based on the provided adaption suggestion.



      The following instructions should be used - if possible - for standardized assessment. Please use the respective introductions right before the item groups to which they pertain.


      Items 1 and 2: Denken Sie nun bitte an das politische System in unserem Land. Auf dieser Liste stehen zwei Aussagen, wie man das gegenwärtige politische System beurteilen kann. Sagen Sie mir bitte zu jedem Satz, ob Sie damit voll übereinstimmen, weitgehend übereinstimmen oder ob Sie ihn weitgehend ablehnen bzw. voll und ganz ablehnen.

      [Please think about the political system in our country. On the list below, you find two statements evaluating the political system. Please tell me for each statement whether you strongly approve, widely approve or whether you widely disapprove, strongly disapprove.]

      Items 3 and 4: Auf dieser Liste stehen Ansichten, die manche Leute vertreten. Wir möchten gerne wissen, wie Sie darüber denken. Stimmen Sie mit den einzelnen Ansichten voll überein, weitgehend überein oder lehnen Sie sie weitgehend bzw. voll und ganz ab?

      [On this list, you see two views some people hold. We would like to know what you are thinking about them. Do you strongly approve, widely approve or widely disapprove, strongly disapprove with each of those views?]




      Table 1

      Items of the Satisfaction with the Political System Short Scale






      (1) Justice and freedom for all

      Das politische System der Bundesrepublik ist gerecht und fair.



      Das politische System der Bundesrepublik schützt die grundlegenden Freiheiten der Bürger.



      (2) Equal treatment of different population groups

      Im politischen System der Bundesrepublik wird nur das Wohl einiger weniger Interessengruppen berücksichtigt und nicht das Wohl aller Bevölkerungsgruppen.



      Jede Bevölkerungsgruppe hat im politischen System der Bundesrepublik die gleiche Chance die Politik zu beeinflussen.


      Note. Items taken from Gabriel (1987, Table 2). English-language adaptions of items: Item 1: "The political system of our country is just and fair." (taken from Allerbeck et al., 1983), Item 2: “<Respondent’s country’s> political system protects our basic liberties.” (taken from Allerbeck et al., 1983), Item 3: “The political system of <Respondent’s country> considers the well-being only of a few interest groups, yet not the well-being of all the people.” (ad hoc adaption by authors), Item 4: “In the political system of <Respondent’s country>, every social group has equal opportunity to influence politics.” (ad hoc adaption by authors).


      Response specifications

      Each of the four items was presented with four response categories “voll übereinstimmen” (strongly approve), “weitgehend übereinstimmen” (widely approve), “weitgehend ablehnen” (widely disapprove), “voll und ganz ablehnen” (strongly disapprove).



      All items are answered on a four-point rating scale ranging from 1 = “voll übereinstimmen” (strongly approve) to 4 = “voll und ganz ablehnen” (strongly disapprove). Items 1, 2 and 4 are formulated positively and item 3 is formulated negatively in the direction of the underlying aspect “satisfaction with the political system” (hence the scale name: SPS). Item 3 thus has to be recoded for the analysis (5 – raw score). However, due to the scoring scheme used in the seminal study, the researcher has to keep in mind that, counter-intuitively, low scores represent approval and satisfaction, whereas high scores represent disapproval and dissatisfaction. If an intuitive scoring scheme were used in a future scale application, any deviation from the scoring scheme should be communicated transparently and the data preparation should be unambiguous.

      We suggest aggregating individual answers to the (sub-)scale level only if there are no missing values. A total score (an overall index across all four unit-weighted items) may be formed after proper recoding of item 3. The two-factor structure accepted below implies that subscale scores might be used as well (though less reliably), if one were interested in distinguishing between satisfaction with the political system on the basis of values relating to justice and freedom for all (items 1 and 2) and values concerning the equal treatment of different population groups (items 3 and 4). More detailed information and recommendations on the formation of scale scores or indexes are provided in the “Item analyses” section below.
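      As a minimal sketch of this scoring procedure (using made-up response data, not the study’s), the recoding of item 3 and the complete-case aggregation could look like this:

```python
import numpy as np

# Hypothetical responses to the four SPS items, coded
# 1 = "voll uebereinstimmen" (strongly approve) ...
# 4 = "voll und ganz ablehnen" (strongly disapprove);
# np.nan marks a missing answer.
raw = np.array([
    [1.0, 2.0, 4.0, 2.0],
    [2.0, 2.0, 3.0, np.nan],
])

items = raw.copy()
items[:, 2] = 5 - items[:, 2]   # recode negatively keyed item 3 (5 - raw score)

# Aggregate to the unit-weighted total score only for complete cases,
# as recommended; incomplete cases stay missing.
complete = ~np.isnan(items).any(axis=1)
total = np.where(complete, items.mean(axis=1), np.nan)
```

      With the toy data above, the first respondent receives a total of 1.5 on the 1-4 metric, while the second remains missing because one answer is absent.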


      Application field

      The aim of the SPS is to measure citizens' individual satisfaction with their current political system. It is a rather succinct scale with a short completion time, which allows researchers to use it in time-constrained research settings. The instrument in German (Table 1) has been validated for the West-German population regardless of age and social class, but it can also be adapted to the English-speaking realm by using the supplied translations. The SPS may be used in written mode, such as a paper-and-pencil or online questionnaire, as well as in oral mode, such as a face-to-face or telephone interview. For example, the underlying dataset was collected through face-to-face interviews. Based on typical findings, the authors estimate the total time for presenting the items along with instructions, plus processing and answering by the participant, at about 1 minute (web survey) or a maximum of 2 minutes (face-to-face interviewing).

      The items presented have been used previously in separate studies. The following studies exemplify the use of SPS items and their application fields: Fuchs (1987) used the first two items to generate a measure of regime legitimacy. Similarly, Roller (1992) included items 1 and 2 to measure political support and legitimacy. Neidhardt and colleagues (1994) also used the first two items in their analysis of political violence and repression. Fuchs (1989) investigated the dimensions of political support and additionally included items 3 and 4 to measure people’s beliefs about the equality of societal groups in the political system, while items 1 and 2 captured whether people believed that the political system assures civil liberties and justice. However, in those studies the items were mainly used as single indicators, not as a joint scale or index of satisfaction.



    Satisfaction with the political system is seen as an important political attitude in social research. Especially in democracies, we expect higher contentment ratings than in other systems because democracies tend to deal in more humane ways with their citizens (Lauth et al., 2000; van Ham et al., 2017). The concept of satisfaction with the political system is based on Easton’s (1965, 1975) concept of political support. Its focus is mainly on positive and negative positions towards the political system, manifesting themselves either in attitudes (covert support) or in behavior (overt support). In distinguishing between various sub-dimensions of political support, Easton (1965, 1975) introduced support for political community, regime and authorities as the referents on the one hand, and specific and diffuse support as motivations on the other. Specific support derives from performance, while diffuse support has to do with moral convictions and value commitment. According to Easton (1965, 1975), legitimacy beliefs and trust are the two main categories of diffuse support. In line with this conceptualization, we also distinguish the cognitive-affective concept of satisfaction with the political system from the behavioral component of political support (Easton, 1965, 1975; Anderson & Guillory, 1997; Norris, 1999; Linde & Ekman, 2003).

    The measure of “Satisfaction with the Political System (SPS)" relates to other concepts used in analyses of peoples' relationship to political systems, but differs from them in several respects. Its focus is clearly on core values that are represented by a democratic regime, such as fairness, liberty, equal opportunities, and responsiveness. Due to these connotations, it goes beyond political trust and legitimacy beliefs. Moreover, the majority of measurement approaches have focused on trust in incumbent public officials and political institutions/authorities (Citrin, 1974; Denters et al., 2007; Miller, 1974; Schnaudt, 2019; Zmerli et al., 2007; Zmerli & van der Meer, 2017). Unlike satisfaction with democracy, the SPS does not refer to the political system's effective performance (Canache et al., 2001; Ferrín & Kriesi, 2016; Norris, 2011), but to the values incorporated in the system.

    In a pointed comparison, satisfaction with democracy can be understood as reflecting citizens’ evaluations of the performance of democracy in practice (Norris, 1999; van Ham et al., 2017). It pertains to how people evaluate the performance of the specific democratic regime as institutionalized in their country, but does not explicitly reveal how people think about democracy as a political system in abstract normative terms such as legitimacy, justice, and responsiveness. To distinguish the two concepts further, negative perceptions of the political system might lead to different - and potentially stronger - behavioral consequences, such as participation in protest marches that undermine or overturn democratic systems (André & Depauw, 2017). Whereas negative views of democracy performance may mostly invoke attempts to reform democracy from within (but not overturn it; Norris, 1999) and foster the activities of “critical democrats” (Klingemann, 2000), highly negative perceptions of the political system would indicate a veritable legitimacy crisis that might provoke uprisings and ignite violent system transformation, because the system itself is no longer perceived to be capable of changing.

    The conceptual distinction between the SPS and satisfaction with democracy is also mirrored in the relevant measurements. Scholars mainly use one question to measure people’s satisfaction with democracy: “How satisfied are you with the way democracy works in [your country]?” (André & Depauw, 2017; please also see the GESIS surveys Comparative Study of Electoral Systems [CSES] and German Longitudinal Election Study [GLES]; Roßteutscher et al., 2019). This question is grounded in Easton’s (1965, 1975) claim that political support is important for the stability of political systems. Initially, the question was used by some authors as a measure of the legitimacy of democracy (Kaase, 1985; Braun & Schmitt, 2009), but as a result of a broad conceptual debate, the view of this question as an indicator of regime performance – not legitimacy – now prevails (Norris, 2011; Ferrín, 2016). Nevertheless, the question is still critiqued for problems like vagueness (Linde & Ekman, 2003; Cutler et al., 2013; André & Depauw, 2017).

    What is absent among the social surveys is a validated scale that indicates diffuse support for a regime sensu Easton (1965, 1975) (that is, satisfaction with the political system in a more general way). In this sense, satisfaction with the political system is an important indicator in the field of democracy research (Demokratieforschung; Pickel & Pickel, 2006; van Ham et al., 2017). The construct is highly relevant in a variety of research areas, but for historical reasons it has mainly been developed and researched in political science (Easton, 1965, 1975). The relevance of the target construct is straightforward: Citizens’ satisfaction with political systems may bolster covert and overt support, and thus strongly influence the stability and persistence of political systems, for instance, due to legitimacy concerns; dissatisfaction may undermine overt and covert support for political systems (André & Depauw, 2017; Easton, 1965, 1975; but see Canache et al., 2001, for evidence to the contrary). The relevance of the concept intended to be measured by the new instrument may also be found in historical processes (Gabriel, 1987). For example, during the formation of the Federal Republic of Germany, not all people agreed about democracy becoming the foundation of the political system. Over time and with accumulating first-hand experience, people became more confident, trusting and satisfied with it (Conradt, 1980; Gabriel, 1987). Nonetheless, acceptance problems with the current political system still occur, and they are common in many political systems around the world. They promptly arise after specific political events, such as unfair elections (e.g., after fraud or illegitimate spending in political campaigns), or emerge continually as general dissatisfaction, for example, under dictatorship (Anderson & Tverdova, 2003; Banducci & Karp, 2003; Norris, 2011; Thomassen & van Ham, 2017).
Consequently, there is an unmet demand for an adequate tool to measure satisfaction with the political system.

    It is timely to think about a validated scale measuring people’s satisfaction with the political system of their respective countries. Although the scale presented in this contribution is not restricted by an exclusive focus on satisfaction with democracy, it explicitly references democratic core values such as liberty, equity, equality and responsiveness. Whether the concept can actually be applied to regimes other than democracies is an open question, because it is not at all clear whether concepts such as ‘liberties’ or ‘fairness’ can be understood identically by survey respondents from different political histories, cultures, ideologies, and regimes (see section Further Quality Criteria).

    To prevent misunderstandings, we acknowledge that not only the incorporation of democratic core values but also other related factors might causally influence satisfaction with the political system, or be otherwise associated with the construct. For example, economic development and growth (Gabriel, 1989) are inherently linked with satisfaction with the political system (Rattinger & Juhász, 1990), as people often hold the political system responsible for economic downturns (Magalhães, 2017; Peffley & Rohrschneider, 2014). Gabriel (1989) showed that changes in government affect citizens’ governmental support, yet only gradually affect support for the political system per se. By contrast, Blais and Gélineau (2007) showed that people who supported the winning side in an election tend to be more satisfied with the political system than the losers. Satisfaction with the political (viz. democratic) system has implications for voter turnout, too (Grönlund & Setälä, 2007).

    To sum up, the SPS relates to other concepts used to analyze peoples’ relationship to political systems. However, it goes beyond political trust, legitimacy beliefs, or measures of a political system’s effective performance. Instead, it refers to the values incorporated in a political system and thereby focuses clearly on core values represented by democratic regimes, such as fairness, liberty, equal opportunities, and responsiveness. The relevance of the SPS is straightforward: Citizens’ satisfaction with political systems may bolster covert and overt support, and thus strongly influence the stability and persistence of political systems.


    Item generation and selection

    The combination of items gathered for the SPS scale was adopted from Gabriel (1987). More specifically, we used the available subset of four of the five items that Gabriel (1987; see Table 2) had originally used to construct the SPS scale. The first item („Wie sehr entspricht unsere politische Ordnung und Demokratie dem, was Sie in der Politik für gut und richtig halten?“ [How well do our political order and democracy correspond to what you consider good and right in politics?]) was not available from the dataset. Hence, this item could not be included in the validity checks. The data were collected by GETAS - Gesellschaft für angewandte Sozialpsychologie mbH - on behalf of ZUMA - Zentrum für Umfragen, Methoden und Analysen e.V., the predecessor institute of GESIS – Leibniz Institute for the Social Sciences. The data collection was part of an international research project named ‘Political Action – An Eight Nation Study’. The first wave of the initially planned panel study was conducted in 1974 and continued in 1979. In addition to the panel study, GETAS conducted a representative cross-sectional study with a new sample in 1979/1980. The dataset has since been imported into the GESIS data archive (traceable by the study number ZA1191; Allerbeck et al., 1982). Our study uses the data gathered in this cross-sectional study from 1979/1980 (Allerbeck et al., 1982) because it included all four items of interest. Unfortunately, all four items were administered together only in the German part of the Political Action study, but in no other country.



    The cross-sectional study used a sample of 2,095 respondents representative of West Germany (full sample) who were interviewed in 1979/1980. The multi-stage stratified random sample comprised citizens of the Federal Republic of Germany aged 16 and older with their place of residence in West Germany (so-called ADM master sample). Relative to the originally planned 3,000 interviews, the participation rate was 69.8%. The highest share of participant drop-out (n = 380; 12.7%) was due to respondents not being encountered at home even after multiple attempts. The final sample size for the analyses was reduced by another 271 cases to 1,824 observations owing to item missingness, because we were only interested in analyzing those respondents who answered all four questions of the rather short scale (so there was no missingness to be dealt with in the analytical sample). Table 2 presents sample characteristics of the full and reduced samples; the only small differences between the two samples indicate that reducing the number of cases did not detrimentally affect the sample composition. The authors do not consider the sample suitable for norming (non-negligible differences between the former and the current population; the sample is potentially not representative of the present structure of the German population, which now also includes East Germany, nor of the distribution of educational degrees currently obtained).


    Table 2

    Sample Characteristics


                                          Full Sample                Final Sample

    Mean age in years (SD) [Range]        45.58 (17.55) [16-93]      44.78 (17.30) [16-93]
    Proportion of women in %              –                          –
    Educational level in %
      No educational qualification        –                          –
      Secondary modern school             –                          –
      General Certificate of
      Secondary Education                 –                          –
      A levels                            –                          –
      Technical schools                   –                          –
      Still at school                     –                          –

    (– = cell value missing from this version of the table.)



    Note. The equivalent German educational levels were as follows (from low to high): ohne Bildungsabschluss [no educational qualification]; Hauptschule/ Volksschule [secondary modern school]; Mittlere Reife [General Certificate of Secondary Education]; Abitur/ Hochschulreife [A levels]; Berufsfachschulen [technical schools].


    Item analyses

    We investigated the factorial structure of the SPS in Germany in two separate exploratory analyses: principal-axis factor analysis (PAF) and principal component analysis (PCA) – in line with a reflective vs. a formative modeling approach (as often encountered in psychological vs. sociological research traditions). The following exploratory analyses were conducted in SPSS (IBM SPSS Statistics 24), whereas the figures were produced with JASP (Version 0.10.2) (see below). While the PCA explains all the item variance, PAF builds exclusively on the shared item variance (covariance). Therefore, higher loadings result for the PCA, as components explain unique item variance on top of item covariance, part of which might represent legitimate unique construct variance or, alternatively, construct-unrelated unsystematic measurement error. Inspecting the progression of Eigenvalues in the PCA, the Eigenvalue of the first component was 1.98, while the Eigenvalue of the second component was 0.98 (hence very close to the Kaiser criterion of 1; Table 3). Therefore, we decided to extract one-component/factor and two-component/factor solutions. Beyond this exploratory approach, we tested structural equation models in AMOS with the default estimator (ML) and performed multiple confirmatory factor analyses (CFA) to inspect the fit of one-factor and two-factor models. Interested in unidimensionality (parsimony principle) on the one hand and a simple structure of loadings (interpretability) on the other, we initially used Varimax rotation in our exploratory approach but omitted the orthogonality constraint in the two-factor CFA model tested later.
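    The Eigenvalue-based extraction decision can be sketched as follows; the inter-item correlation matrix here is purely illustrative, not the published one:

```python
import numpy as np

# Illustrative 4x4 inter-item correlation matrix (not the study's data).
R = np.array([
    [1.00, 0.65, 0.25, 0.20],
    [0.65, 1.00, 0.22, 0.24],
    [0.25, 0.22, 1.00, 0.28],
    [0.20, 0.24, 0.28, 1.00],
])

# In a PCA of standardized items, the Eigenvalues of R give the variance
# explained per component; they sum to the number of items.
eigvals = np.sort(np.linalg.eigvalsh(R))[::-1]
pct_var = 100 * eigvals / eigvals.sum()
cum_var = np.cumsum(pct_var)

# Kaiser criterion: retain components with Eigenvalue > 1.
n_retain = int((eigvals > 1.0).sum())
```

    With four items, the Eigenvalues sum to 4, so a first Eigenvalue near 2 (as in the reported PCA) already corresponds to roughly half of the total variance.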

    First, we conducted the exploratory analyses and looked at the amount of variance explained by the first and second component or factor (Table 3) and the loading patterns of the one- and two-dimensional solutions (Table 4).  When we extracted one factor (PAF), the factor was able to explain a bit more than a third of the variance shared among the four items, whereas two factors cumulatively explained nearly half of the variance. After rotation, the variance of the additional factor reflected a non-negligible contribution to the first factor. With PCA, the first component alone already explained nearly half of the variance. When extracting two components, they explained an almost even share of variance, adding up to three quarters of the cumulative variance.


    Table 3

    Output statistics for principal component analysis (PCA) and exploratory factor analysis (EFA; here principal-axis factor analysis: PAF) for one- and two-dimensional solutions



                            PCA                                      EFA (PAF)
    Factor/Component    EV      % Var.    Cum. % Var.      EV      % Var.    Cum. % Var.
    1                   1.98    –         –                –       –         –
    2                   0.98    –         –                –       –         –

    (– = cell value missing from this version of the table; the PCA Eigenvalues of 1.98 and 0.98 are reported in the text above.)

    Note. EV = Eigenvalue, Var. = variance, Cum. Var. = cumulative variance. The two-dimensional solutions were based on Varimax rotation.



    Table 4

    PCA and EFA factor loadings for one- and two-dimensional solutions





                          PCA                                            EFA (PAF)
              1 component      2 components              1 factor          2 factors
    Item      λ       h²       λ1      λ2      h²        λ       h²        λ1      λ2      h²
    SPS1      –       –        –       –       –         –       –         –       –       –
    SPS2      –       –        –       –       –         –       –         –       –       –
    SPS3      –       –        –       –       –         –       –         –       –       –
    SPS4      –       –        –       –       –         –       –         –       –       –

    (– = cell value missing from this version of the table.)

    Note. λ = loading (model-compliant loadings in bold); h2 = communality. The two-dimensional solutions were based on Varimax rotation.


    Second, we conducted the confirmatory analyses using standardized latent variables to achieve model identification. The one-factor model as analyzed in CFA/SEM is plotted in Figure 1; its model fit indices did not suggest a good fit to the data (Hu & Bentler, 1999; Schermelleh-Engel et al., 2003; Schweizer, 2010). Furthermore, the pattern and sizes of the items’ factor loadings do not support strictly unidimensional measurement, as argued above. By comparison, the fit indices of the two-factor model (Figure 2) indicated a much better fit than the one-factor solution. The factor loadings of the respective factors suggested that using two factors might be the better decision (see Figure 2). Forcing the same loadings for items SPS3 and SPS4 on factor #2 did not hamper model fit.

    In line with the results, we recommend using a two-factor model especially for structural equation models, because it provides higher flexibility for detecting discriminant validity for the subscales and because it achieves better model fit to the data (Figure 2). Furthermore, in line with an additionally tested bifactor model we assume that a general factor also underlies all the indicators, but one half of the items is influenced by a secondary source of variability (Figure 3). This model fitted equally well, but supported a general factor on the basis of all items, while controlling for item specificity in SEM; on its basis, one may use all the items to form an index to approximate the general factor (thereby blurring the information about the general factor due to the presence of the specific factor to some extent).



    Figure 1. τ-congeneric measurement model for the SPS latent variable (one-factor model) with standardized path coefficients displayed, RMSEA = .155, CFI = .937, SRMR = .062, χ²(2) = 89.90, p < .001, N = 1,824.


    Figure 2. Two-dimensional representation with τ-congeneric measurement of each factor of the construct with standardized path coefficients displayed, RMSEA = .000, CFI = 1.000, SRMR = .001, χ²(1) = 0.06, p = .80, N = 1,824. When constraining the F2 loadings to equality (standardized paths estimated at .55 and .52), the model fit was still very good, with RMSEA = .058, CFI = .991, SRMR = .029, χ²(2) = 14.4, p ≤ .001, N = 1,824.



    Figure 3. Bifactor model with a general factor for the focal construct of satisfaction with the political system (F1) and a specific factor for the lower loading items (F2), with standardized path coefficients displayed, RMSEA = .000, CFI = 1.000, SRMR = .001, χ²(1) = 0.06, p = .80, N = 1,824. F2 loadings were constrained to equality to achieve bifactor model identification.
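    The RMSEA values reported in the figure captions follow the standard ML-based formula and can be checked directly from the reported χ², degrees of freedom, and sample size (a sketch, not the authors' code):

```python
import math

def rmsea(chi2: float, df: int, n: int) -> float:
    """Root Mean Square Error of Approximation from a chi-square model test:
    sqrt(max(chi2 - df, 0) / (df * (n - 1)))."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))

# One-factor model from Figure 1: chi-square(2) = 89.90, N = 1,824
print(round(rmsea(89.90, 2, 1824), 3))   # -> 0.155, matching the caption
```

    The same function reproduces the RMSEA of .058 reported for the equality-constrained model in Figure 2 (χ²(2) = 14.4, N = 1,824).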


    Overall, the analyses showed that the cross-loadings in the unconstrained PCA and EFA (PAF) models with Varimax-enforced orthogonal rotation appeared to be negligible. Yet the estimated correlation of both factors in the two-factor CFA model (r ≈ .5; Figure 2) illustrates that they are substantially correlated, though they do not measure exactly the same concept. Hence, both factors are useful and provide their unique contribution to the common core of the general construct (as in a “g-factor”, also represented in the bifactor model as F1; Figure 3). While the items of the first factor focus more on value-based aspects, like overall justice, fairness and freedom for all, the items of the second capture the political influence, or equal treatment, of different population groups. Keeping the political context as a subtext, we named the two dimensions represented in the two-factor model justice and freedom for all and equal treatment of different population groups, respectively.

    We give the following recommendations regarding the formation of total scale scores or, alternatively, subscale scores. More generally, factor loadings inform the reader about the legitimate use of the items as a composite. We present two possibilities to build manifest scale scores based on all four (alternatively two) items. First, when generating scale aggregates such as total scale (or subscale) indexes, some readers might put more weight on the item content than on factor loadings in order to represent the construct adequately in line with theoretical considerations, resulting in unit-weighted indexes. Second, others might use our factor loadings as weights for item scores prior to aggregating to a (sub)scale score, or decide to run an EFA or CFA and use their own factor loadings (which similarly happens when using a latent variable for further analyses within a CFA/SEM framework; DiStefano et al., 2009). The second approach mimics a manifest index with items weighted by suitable factor loadings, which might be preferred by those who are concerned about reliability (see the reliability estimates below, more specifically ωH). Whether to choose a unit-weighted or a loading-weighted index depends on the preferences of the user. The decision to use either a total scale score or two subscale scores likewise depends entirely on the research question. When weighting, the loadings on the general factor from the bifactor model should be used for the total scale index; for separate subscale indexes, the loadings resulting from the two-factor model should be used. However, if the loading of an item (say, item 4) were too low in a future application (e.g., λ4 < .30), then we suggest that readers consider dropping this item completely in any manual computation of unit-weighted scale scores.
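    The two index variants can be illustrated as follows; both the recoded item scores and the loadings are made-up stand-ins (the latter for the general-factor loadings one would take from the bifactor model):

```python
import numpy as np

# Hypothetical recoded item scores (item 3 already reflected) for three
# respondents, on the 1-4 metric.
items = np.array([
    [1.0, 2.0, 1.0, 2.0],
    [2.0, 2.0, 2.0, 1.0],
    [3.0, 4.0, 4.0, 4.0],
])
# Illustrative stand-in loadings, NOT the estimates reported here.
loadings = np.array([0.70, 0.65, 0.45, 0.40])

# Unit-weighted index: plain mean across items.
unit_weighted = items.mean(axis=1)

# Loading-weighted index: weight items by their loadings and normalize,
# keeping the index on the original 1-4 metric.
loading_weighted = items @ loadings / loadings.sum()
```

    Normalizing by the sum of loadings keeps the two indexes comparable in scale, so the choice between them affects only the relative influence of the items, not the metric.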


    Item parameters

    Descriptive statistics for the SPS items are displayed in Table 5: means, standard deviations, skewness, and kurtosis, as well as selectivity coefficients (part-whole corrected item-total correlations) at the item level, along with descriptive statistics for the SPS at the scale level (note that these statistics are based on a sample that is not suitable for norming).


    Table 5

    Descriptive Statistics for SPS Items and Scale







              M       SD      Skewness    Kurtosis    Item-total correlation
    Item 1    –       –       –           –           –
    Item 2    –       –       –           –           –
    Item 3    –       –       –           –           –
    Item 4    –       –       –           –           –
    Scale     –       –       –           –           –

    (– = cell value missing from this version of the table.)

    Note. Scale ranging from 1 (strong approval) to 4 (strong disapproval) for items 1, 2, 3 and 4; N = 1,824. Item 3 was recoded before the analysis. Item-total correlations are part-whole corrected.



    The standardized questionnaire format and written instructions, the labeled response categories and fixed scoring rules enable objective application, evaluation, and interpretation of the SPS.



    As estimates for the SPS scale reliability, we computed Cronbach’s alpha (Cronbach, 1951) and McDonald’s omega (McDonald, 1999; Raykov, 1997). First, we used Cronbach’s alpha as the most commonly used reliability estimate, although its appropriateness for representing internal consistency is limited, especially in the case of very short scales with heterogeneous loadings. Alpha for the total scale amounted to .62. When estimating the alpha coefficient for the two subscales separately (without controlling for measurement error variance via SEM), alpha amounted to .79 and .44 for subscales 1 and 2, respectively. Second, we report McDonald’s omega as a more appropriate measure of the reliability of the scale in the present case of only four – and noticeably heterogeneous – items. The reliability estimate from the bifactor model amounted to .69 (IBM SPSS Statistics 24), which we deem sufficient for many research purposes (Aiken & Groth-Marnat, 2006; Kemper et al., 2019). McDonald’s omega-hierarchical, which reflects the reliability of the total scale score with regard to measuring the common factor, was ωH = .56, which is reasonable given that there was additional reliability due to the specific factor, ωS = .14, and given that no correction factor was used to account for the ordinal nature of the data (e.g., by using the WLSMV estimator). Considering that our two-dimensional model fits better than a strictly one-dimensional model, and given the same fit for the bifactor model as for the two-dimensional model, the decision to apply either subscale scores or the total score depends on a researcher’s question. However, we do not report omega reliability for each of the two factors, because its utility is limited by the inclusion of only two items per factor.
Maximal reliability, also termed construct replicability sensu Hancock and Mueller (2001), indicated an internally consistent four-item scale for measuring the general factor of satisfaction with the political system, when using an optimally weighted scale composite (H-index = .80; Hammer, 2016).
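The coefficients reported above follow standard formulas that are straightforward to reproduce. The sketch below implements Cronbach’s alpha from raw scores, plus McDonald’s omega (unidimensional case, shown for illustration; the .69 reported above stems from a bifactor model) and Hancock and Mueller’s H from standardized loadings. The loadings used here are illustrative placeholders, not the fitted SPS estimates:

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha from an (n_respondents, k_items) score matrix:
    alpha = k/(k-1) * (1 - sum of item variances / variance of the total)."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

def mcdonald_omega(loadings):
    """Omega from standardized loadings (unidimensional case):
    (sum l)^2 / ((sum l)^2 + sum(1 - l^2))."""
    l = np.asarray(loadings, dtype=float)
    return l.sum() ** 2 / (l.sum() ** 2 + (1 - l ** 2).sum())

def construct_replicability_h(loadings):
    """Hancock & Mueller's H, the reliability of an optimally
    weighted composite: 1 / (1 + 1 / sum(l^2 / (1 - l^2)))."""
    l = np.asarray(loadings, dtype=float)
    return 1.0 / (1.0 + 1.0 / (l ** 2 / (1 - l ** 2)).sum())

# Illustrative standardized loadings only, NOT the fitted SPS estimates
loadings = [0.7, 0.6, 0.5, 0.6]
omega = mcdonald_omega(loadings)
h = construct_replicability_h(loadings)
```

Because H weights items by how well they measure the factor, it is never lower than omega for the same loadings, which is why an optimally weighted composite can reach H = .80 while the unit-weighted total score shows a lower omega.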



    Based on the theoretical outline above, we deem the four SPS items content-valid indicators of satisfaction with the political system. Because the items were developed several decades ago, future applications should check whether respondents can still handle them. Cognitive interviewing may be helpful to determine if the average respondent’s understanding of the item wordings is in line with a researcher’s current understanding.

    The factorial structure of the items (see Figures 2 and 3) gives a good impression of the factorial validity of the scale. Construct validity of the manifest scale scores was examined based on manifest correlations. The correlation coefficients are depicted in Table 6; their interpretation is based on Cohen (1992): small effect (r ≥ .10), medium effect (r ≥ .30), and strong effect (r ≥ .50). Significance levels are displayed for descriptive purposes only. We first present correlations with the (dummy-coded) federal states. Keeping in mind that the questions refer to the political system of the Federal Republic of Germany, these correlations merely check for any undue differences in scores between respondents residing in different federal states. Ideally, these correlations should be near zero; differential patterns across subscales might support the notion that the subscales focus on different content. Regarding the federal states, we see significant correlations with the SPS subscales, but their effect sizes are rather small; and the subscales do differ with regard to their relation to the states, supporting the notion of specificity to some extent.

    Furthermore, in order to investigate both types of construct validity (convergent and discriminant validity), we show convergent construct validity for the SPS total scale and each of its subscales; the evidence of discriminant validity refers to the different correlation patterns of the subscale scores with the criterion variables. We expected positive correlation of the SPS subscales and total score with the following available variables: (a) explicitly reported attitudes towards the federal government in terms of justice and fairness, as well as freedom, and (b) participants’ general political interest, which appears to be a necessity for evaluating political systems properly (Chang, 2018). The size of the correlation is difficult to forecast, as the relative magnitudes of positive and negative effects due to political interest, in combination with specific effects of media use, are critical for satisfaction with democracy (Chang, 2018). The more relevant subscale 1 should correlate with the specific attitude questions in the mid-range, and with unspecific political interest somewhat lower. For subscale 2, equality concerns about population groups should show up in correlations with general interest and justice/fairness, but not necessarily with freedom. However, one must bear in mind that we can only include (West-German) federal states existing before 1990. Furthermore, we calculate correlations of the SPS subscales and total score with relevant sociodemographic variables, namely age, gender and educational level.

    As Table 6 shows, the two attitudinal criterion measures converge substantially with the first subscale, justice and freedom, but not with the second subscale, equal treatment (confirming discriminant validity across the subscales). The general criterion, political interest per se, is also more strongly associated with the first subscale than with the second; yet, different from the attitudinal criteria, the correlation with political interest is higher when collapsing across the two dimensions into a total SPS score (due to a slight reliability gain). Unfortunately, the criteria present in the dataset do not allow discerning the relevance of the second subscale in terms of positive evidence of construct validity. A pessimistic view might be that the convergent validity of the second subscale is low. An optimistic view would be that none of the variables targeted concerns for the equality of different subgroups directly, and that we have to remain agnostic until further evidence on the second subscale is gathered. By contrast, though the influence of sociodemographic factors on the scale scores is conventionally called statistically significant, it is rather negligible in terms of the observed effect sizes.
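The analysis behind Table 6 rests on Pearson coefficients, supplemented by Spearman coefficients (Pearson applied to average ranks) for categorical variables, and on Cohen’s (1992) effect-size thresholds. A minimal sketch with hypothetical data, using our own helper names rather than the authors’ code:

```python
import numpy as np

def average_ranks(x):
    """Assign 1-based ranks, averaging the ranks of tied values."""
    x = np.asarray(x, dtype=float)
    order = np.argsort(x, kind="stable")
    ranks = np.empty(len(x), dtype=float)
    i = 0
    while i < len(x):
        j = i
        while j + 1 < len(x) and x[order[j + 1]] == x[order[i]]:
            j += 1                         # extend over the tie group
        ranks[order[i:j + 1]] = (i + j) / 2 + 1  # mean of positions i..j
        i = j + 1
    return ranks

def pearson(x, y):
    return float(np.corrcoef(x, y)[0, 1])

def spearman(x, y):
    """Spearman rho = Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))

def cohen_label(r):
    """Effect-size label following Cohen (1992), as used for Table 6."""
    r = abs(r)
    if r >= .50:
        return "strong"
    if r >= .30:
        return "medium"
    if r >= .10:
        return "small"
    return "negligible"

# Hypothetical monotone but non-linear relation: Spearman captures the
# monotone association perfectly, Pearson only the linear part.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = x ** 3
r_p, r_s = pearson(x, y), spearman(x, y)
```

This also illustrates why Spearman coefficients are arguably preferable for the categorical criteria: rank transformation makes the coefficient invariant to any monotone recoding of the response categories.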


    Table 6

    Correlation Coefficients of SPS with Relevant Criteria and Control Variables

                                                        Subscale 1        Subscale 2       Total Scale
    Federal states
        Lower Saxony                                    …                 …                …
        North Rhine-Westphalia                          …                 …                …
        Rhineland Palatinate                            …                 …                …
        Schleswig Holstein                              …                 …                …
    Attitudes towards the government: justice/fair      .36*** (.49***)   .04 (.05*)       .32*** (.45***)
    Attitudes towards the government: freedom           .41*** (.55***)   .04 (.07**)      .37*** (.52***)
    Political interest                                  .11*** (.09***)   .06** (.05*)     .13*** (.12***)
    Sociodemographic factors
        Age                                             −.02 (−.02)       −.05 (−.05*)     −.04 (−.04)
        Gender                                          .06* (.06**)      .05* (.05*)      .08*** (.09***)
        Educational level                               −.01 (−.06**)     .01 (−.03)       .00 (−.07**)

    Note. N = 1,824. Cell entries are Pearson coefficients, with arguably preferable Spearman coefficients supplemented in parentheses for categorical variables. Descriptive p-values: *p < .05, **p < .01, ***p < .001. Coefficients for the federal-state rows (…) were not preserved in this copy. Federal states only include states existing before 1990. Age ranged from 16 to 93. Gender: 0 = male, 1 = female. For educational levels, see Table 2.


    Further quality criteria

    With an estimated average duration of less than 2 minutes for presenting and answering the SPS, it constitutes an economical measurement instrument; the 4-item scale stresses economy somewhat more than reliability. The fakeability of the SPS is no different from that of other typical self-reports (in non-democratic regimes, it would be easy for respondents to disguise their true satisfaction with the political system, and they may be especially motivated to do so in a face-to-face interview situation). No data are available to judge the relevance of response biases; specifically, correlating the scale with social desirability measures might be a first step in future research to rule out significant bias in this regard. As the total scale is not fully balanced in terms of positively keyed and inverse items, acquiescence has not been controlled to date.

    In conclusion, we note two limitations. One is that use of the SPS scale implicitly assumes a respondent who possesses sufficient knowledge of the political system. If this precondition is not met, there is a risk that the answers are not comparable to those of respondents who are politically better informed (lack of measurement equivalence). With increasing heterogeneity in the population and growing sections with a non-native cultural upbringing and a different political socialization, running a latent class analysis prior to analyses that assume population homogeneity is recommended for future research, because different latent classes may require different measurement models (Raykov & Marcoulides, 2015). A second limitation is that application in different political regimes requires actual testing of measurement equivalence first, and possibly an inspection of notable differences in the interpretation of items by means of cognitive testing.

    • Klara Dentler (M.A. Political Science); GESIS - Leibniz Institute for the Social Sciences; PO 12 21 55, 68072 Mannheim, Germany; e-mail:
    • Dr. phil. Matthias Bluemke; GESIS - Leibniz Institute for the Social Sciences; PO 12 21 55, 68072 Mannheim, Germany; e-mail:
    • Prof. em. Dr. Oscar W. Gabriel; University of Stuttgart, Institute for Social Sciences; Breitscheidstr. 2; 70174 Stuttgart, Germany; e-mail:

    Allerbeck, K. R., Kaase, M., & Klingemann, H.-D. (1982). Politische Ideologie II (Repräsentativumfrage 1980). GESIS Datenarchiv, Köln. ZA1191 Datenfile Version 1.0.0. doi:10.4232/1.1191

    Allerbeck, K. R., Barnes, S. H., van Deth, J. W., Farah, B. G., Heunks, F. J., Inglehart, R., Jennings, M. K., Kaase, M., Klingemann, H.-D., Thomassen, J. J. A., & Stouthard, P. C. (1983). Political Action II. GESIS Datenarchiv, Köln. ZA1188 Datenfile Version 1.0.0. doi:10.4232/1.1188