    • Published: 04-03-2015 08:48 am

      3. What’s wrong with the Jungian theory and tests?
      3.1 Problem 1: Unsound theoretical foundations
      The first major problem is that there is no empirical evidence for the key concepts of psychoanalysis, paranormal phenomena, and mythology. Concepts from psychoanalysis have been refuted piece by piece: memory research has shown that ʻunconscious repressionʼ does not exist (e.g., Loftus, 1994a and 1994b); the interpretation of dreams does not yield any workable hypotheses (either according to Freudian or Jungian interpretation; e.g., Lavie & Hobson, 1986; Hobson, Pace-Schott, & Stickgold, 2000); schizophrenia is not caused by ʻregression of the libidoʼ (as claimed by Freud); and autism is not caused by insensitive ʻrefrigerator mothers.ʼ Autism and schizophrenia both have a strong genetic component and are currently regarded as developmental disorders of the brain. For example, on the basis of twin research, several research groups have calculated that the heritability of schizophrenia is 80% to 84% (Cardno et al., 1999; Kendler, Myers, Potter, & Opalesky, 2009). Similarly, there is no empirical support for other ideas from psychoanalysis such as penis envy (i.e., the claim that all girls envy boys because of their penis) or the Oedipus complex (i.e., the claim that every boy between the ages of three and five secretly dreams of having sex with his mother and killing his father). A more extensive discussion of the problems of psychoanalysis as a theory and as a therapy is outside the scope of this article, but let me conclude here by referring readers to the study led by epidemiologist Yolba Smit, which resulted in the discontinuation of reimbursement for psychoanalytical therapy by health insurers in the Netherlands (Smit et al., 2010; 2012), or to more extensive reviews (e.g., Buekens, 2006).
      Furthermore, Jung’s ideas stemming from mythology and metaphysics also lack empirical evidence. No one has ever provided any evidence for the existence of synchronicity, and when Jung defended himself, he often made use of fallacies such as, “... because statistics are possible only if there are exceptions” (Adler et al., 1973, C.G. Jung Letters, vol. 2, p.246). Jung was a notorious believer in alchemy, astrology, spiritism, telepathy, telekinesis, clairvoyance, and extrasensory perception. Historical sources have shown that he was influenced by William James, who, amongst other things, believed in communications with spirits via mediums in séances. Jung maintained that there are people who sense things ʻintuitively,ʼ for example that ʻa yellow car will come around the street corner.ʼ Jung also cited examples of this ʻfrom his practiceʼ: A gold (scarab) beetle that flew against the window as a patient was relating her dream, which featured a beetle, showed, according to Jung, that there has to be a non-coincidental link between the mental world and phenomena such as this one from the physical world (Jung, 1960, p.142). So far, however, no one has been able to demonstrate paranormal gifts or extrasensory perception under controlled conditions, despite a reward of one million US dollars offered by James Randi several years ago. Recently, that sum was supplemented by one million euros by the Belgian non-profit organisation Skepp.
      As described earlier, Jung postulated that of the four functions (thinking, feeling, sensing, and intuition) one was dominant and another one was auxiliary. The difference between the dominant function and the auxiliary function should be reflected in a test as a higher score for the dominant function and a lower score for the auxiliary function. In the MBTI this is expressed in the so-called JP index (Judgment-Perception Index), which was designed to determine a person’s dominant function. However, the existence of dominant and auxiliary functions has never been confirmed in research, neither in studies by the Myers themselves (Myers & Myers, 1980) nor by others (e.g., McCrae & Costa, 1989). Moreover, test results have not shown that auxiliary functions develop with age.
      If anyone is still in any doubt after this consideration of the main ʻtheoretical foundations,ʼ let me repeat that Jung’s typologies have never been proven empirically. The excuse that they cannot be tested using current scientific methods is not very convincing. Jung did not conduct any scientific studies, and he based his theories mainly on his own observations and anecdotal accounts during a period in which many people described him as psychologically sick. The major reason why tests based on Jung’s ʻtheoryʼ are almost always given an unfavourable assessment is that they rest on an unsound theoretical foundation (e.g., the assessment of the MBTI by the COTAN).
      Jung—just like Freud—never made any secret of the fact that he did not follow the path of academic science:
      “Anyone who wants to know the human psyche will learn next to nothing from experimental psychology. He would be better advised to abandon exact science, put away his scholar’s gown, bid farewell to his study, and wander with human heart through the world.” (Jung’s New Paths in Psychology, Collected Works, London, 1916).
      3.2 Problem 2: Type is at odds with biological variation
      Tests that are based on Jung’s ideas generally divide people into a series of discrete types. However, the reasoning behind typology contains a major fallacy, namely the assumption of dichotomy and bipolarity. Practitioners operate on the principle that the scales are discontinuous, dichotomous, or bimodal. This would mean that the population can be divided into two groups per scale, with a ʻgapʼ in the middle of each distribution. Compare this to the idea, for example, that the male population consisted of two groups: men between 1.40 and 1.60 m tall and men between 1.80 and 2.00 m tall, with hardly any men between 1.60 and 1.80 m tall. However, human characteristics are distributed approximately normally, whether it is a question of height, muscle power, intelligence, or personality traits. Differences in personality are therefore more gradual or ʻfluid.ʼ Modern personality psychology favours the trait approach, in which traits are presented as continuous scales or dimensions instead of dichotomies. The most accepted and scientifically established model is, of course, the Big Five or the Five Factor Model (FFM, with five major trait domains). The best known and most researched test is the NEO-PI-R by Costa and McCrae (1995). Both the five major domains and the thirty underlying facets (six per domain) show a normal distribution. Whichever scale one chooses, the mutual combinations of the many facets of our personality produce a vast potential for variation between personalities! A new domain with a number of facets may yet emerge, because there are now data that appear to show a sixth domain called ʻhonestyʼ (HEXACO model, Lee & Ashton, 2004).
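      The contrast between a ʻgapʼ in the middle and a normal distribution can be made concrete with a small calculation. The sketch below assumes an idealised trait that is standardised to mean 0 and standard deviation 1 and follows a normal distribution exactly; the cut-offs at one standard deviation are arbitrary and chosen only for illustration. It uses NormalDist from Python's standard library to compare the share of people in the middle band with the share in the two tails.

```python
from statistics import NormalDist

# Assumption: an idealised, standardised trait score (mean 0, SD 1).
trait = NormalDist(mu=0.0, sigma=1.0)


def share_between(low, high):
    """Proportion of the population scoring between low and high."""
    return trait.cdf(high) - trait.cdf(low)


middle = share_between(-1.0, 1.0)   # within one SD of the mean
low_tail = trait.cdf(-1.0)          # more than one SD below the mean
high_tail = 1.0 - trait.cdf(1.0)    # more than one SD above the mean

print(f"middle band: {middle:.1%}")
print(f"low tail:    {low_tail:.1%}")
print(f"high tail:   {high_tail:.1%}")
```

      Under these assumptions roughly two thirds of the population fall in the middle band, which is precisely where a strict typology would expect to find hardly anyone.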
      Typologies do not take into account the gradual differences in personality and the enormous variation of human characteristics and their possible combinations: For example, MBTI states that people can be divided up into sixteen types; LIFO uses four types; and Enneagram uses nine types. According to a typology, a person definitely belongs either in one category or the other. In other words, one category excludes the other. People are either extrovert or introvert; however, the bulk of the population is neither extrovert nor introvert, but lies somewhere in between (referred to as ambivert). The use of ipsative tests (forced choice) emphasises the dichotomy (see below). Any form of typology reasoning is problematic for two main reasons: (1) the enormous variation in personalities as described above, which is an effect of evolutionary influences (e.g., random mutations or the arbitrary mixture of genes as a result of sexual reproduction), other biological influences (e.g., hormonal influences during pregnancy or viruses), and developmental processes; and (2) the influence of environmental or contextual factors that cause people to react differently in different situations (Barkow, Cosmides, & Tooby, 1992; Moscowitz & Zuroff, 2004).
      And so, variations between people are caused mainly by evolutionary processes and are certainly not limited to four, eight, or sixteen types. There is little room for doubt because the theory of evolution is considered to be one of the best-founded scientific theories (Coyne, 2010; Dawkins, 2009; Dennett, 1996; Williams, 1996). Evidence for this theory has been supplied throughout history from many branches of science, such as DNA research, genetics, anthropology, and archaeology. Therefore, some people call the theory of evolution a theorem, which indicates that it is a theory or hypothesis that has been indisputably proven.
      Hans van Gossum (1) summarises evolution in four principles, which together offer an explanation for the personality differences between people.
      1. There is variation between different individuals. You only have to look at a school class photograph to realise that people are (outwardly) different in several respects.
      2. There are always more children born than the number who grow to become adults. Nature provides limited resources, therefore a lot of plant seeds and young animals perish.
      3. The individual who adapts best to the environment has more chance of survival. These adaptations are made possible by genetic variation. There are two known mechanisms that ensure a certain variation gets the upper hand. The first is coincidence: When genetic features randomly spread across a population, it is referred to as ‘genetic drift.’ The second mechanism is natural selection.
      4. Beneficial features (adapted or ‘fit’ to the environment) are passed on by means of procreation. Only those who can successfully procreate will spread the beneficial features, which means a population can adapt to changing circumstances.
      Thus, the origin of differences between people can be explained parsimoniously by the theory of evolution, but not on the basis of Jung’s ʻparallel metaphysical world.ʼ
      3.3 Problem 3: Fictitious and incorrectly used scales
      Jungian theory operates on the existence of three dichotomies, yet all three have been subjected to scientific criticism. First, there is the dichotomy of sensing versus intuition. Intuition, as described by Jung, arose from his faith in the paranormal, but, as I have already said, there is no evidence for this. Intuition is a concept deployed in modern psychology, but with a different definition: It is the whole of implicit knowledge acquired by multiple experiences in a regular and therefore predictable environment and the opportunity to learn these regularities by lengthy practice (Kahneman, 2011, p.252). This holds up well for professions such as fire fighting, medicine, and nursing, but not for professions such as financial investment advice, political science, and psychotherapy. Similarly, the feeling versus thinking scale is based on an untenable dichotomy. Research in both clinical psychology and neurobiology (e.g., Damasio et al., 2001) has shown that a sharp distinction cannot be made between emotions and thoughts. In fact, they are linked to each other indivisibly in neural networks in the human brain. Clinical psychologists (especially those trained in cognitive behavioural therapy) have firmly adopted that point of view. Anger, for example, is always related to the same sort of thoughts, namely thoughts that involve a command or prohibition. This often finds expression in thoughts involving the words ‘must,’ ‘not allowed,’ or ‘that’s not possible,’ because a particular desire or objective is under threat. Fear is always related to thoughts that express negative effects: ‘That dog will bite me,’ ‘My partner will be angry,’ ‘The client won’t like that,’ or ‘The dentist is going to hurt me.’ In terms of human behaviour, as is the case with animals, fear almost always leads to some form of avoidance, flight, fight, or freeze. Obviously, some people are less stable emotionally than others, but that has nothing to do with intellectual capacity. It is scientifically untenable to portray people who easily become afraid as less capable in ʻthinking.ʼ In the FFM or the Big Five, emotional stability is represented as a dimension (highly unstable to highly stable). However, the question of how prudently someone can think is a completely independent dimension.
      Finally, the way certain tests based on Jung’s ideas (such as the MBTI) deal with the scale of extroversion versus introversion is problematic. These tests present this scale as a dichotomy. However, Jung himself argued that there was no such thing as a person who was only extrovert or introvert, and that these were factors or dimensions (this is also what contemporary research in this field has shown–see above). Jung said that anyone who was only extrovert or introvert should be “admitted to an asylum.” And so, it is the developers of the MBTI and other Jungian typology tests who have introduced this dichotomy and formulated introvert versus extrovert as a type antithesis. A proposition such as ʻintroverts draw their energy from within themselves, whilst extroverts draw energy from othersʼ cannot be tested from a scientific point of view and is in contradiction with other scientific disciplines such as physics.
      3.4 Problem 4: The tests are unreliable, artificially reliable, or unresearched
      There is a problem in putting Jung’s archetypal theory into operation, and the problem lies in the tests themselves. Some of these tests have been researched for their psychometric quality, but the results have proved extremely problematic. With the MBTI, for example, there is up to a 60% chance of a person being classified under a completely different type after just four weeks–the test-retest reliability is unacceptably low (see below under MBTI).
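      Why cutting continuous scores into types can produce unstable classifications can be sketched with a small simulation. Everything in it is a hypothetical assumption made for illustration: four independent, normally distributed traits, a measurement error with a standard deviation of 0.5 at each sitting, and a cut-off at the midpoint of each scale. It is not a model of the MBTI itself and does not attempt to reproduce any published retest figure.

```python
import random


def simulate_retest(n_people=20000, n_scales=4, error_sd=0.5, seed=1):
    """Share of simulated people whose type code differs between two sittings."""
    rng = random.Random(seed)
    changed = 0
    for _ in range(n_people):
        # True trait levels are continuous and normally distributed.
        true_scores = [rng.gauss(0.0, 1.0) for _ in range(n_scales)]
        codes = []
        for _sitting in range(2):
            # Each sitting adds independent measurement error; the observed
            # score is then cut at the midpoint into one of two "types".
            observed = [t + rng.gauss(0.0, error_sd) for t in true_scores]
            codes.append(tuple(score >= 0.0 for score in observed))
        if codes[0] != codes[1]:
            changed += 1
    return changed / n_people


print(f"type changed on retest: {simulate_retest():.0%}")
```

      With these hypothetical settings more than half of the simulated respondents receive a different type code at the second sitting, even though nobody's underlying traits have changed. The instability comes from the many people who score close to a cut-off, which is what a normal distribution implies.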
      Without going into an analysis of all the problems again, I will briefly address the ipsative nature of these tests (2) in a little more depth. Ipsative scoring is a system whereby the respondent actually divides up a set number of points (the constant value) over (usually) a number of scales that are included in the test. Therefore, the sum of the different item scores will be equal for each respondent. The result is an arrangement of the scales on the basis of their importance to the respondent (intrapersonal). This also explains the term ipsative: Indeed, the term ipsative comes from ipse = self. There are various ways of obtaining ipsative scores, such as preference scores or comparison in pairs, but the most commonly used form is the forced choice answer form. For example, the set of items in the MBTI always consists of two items which represent the poles of a bipolar scale and one of the two has to be chosen.
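      The constant-sum property of forced-choice scoring can be illustrated with a minimal sketch. The item list and pole labels below are hypothetical and are not taken from the MBTI or any other published test; the point is only that every respondent ends up with the same total number of points, whatever they choose.

```python
# Hypothetical forced-choice items: each item offers one statement per pole
# of a bipolar scale, and the respondent must pick exactly one of the two.
ITEMS = [("E", "I"), ("E", "I"), ("S", "N"), ("S", "N"), ("T", "F"), ("T", "F")]


def score(choices):
    """Give one point to the chosen pole of each item."""
    if len(choices) != len(ITEMS):
        raise ValueError("one choice per item is required")
    totals = {pole: 0 for pair in ITEMS for pole in pair}
    for pair, choice in zip(ITEMS, choices):
        if choice not in pair:
            raise ValueError(f"{choice!r} is not an option for item {pair}")
        totals[choice] += 1
    return totals


respondent_a = score(["E", "E", "N", "S", "T", "T"])
respondent_b = score(["I", "I", "N", "N", "F", "T"])

# Both totals equal the number of items: the scores are ipsative.
print(sum(respondent_a.values()), sum(respondent_b.values()))
```

      Because a point given to one pole is necessarily a point withheld from the other, the scores only rank the scales within a person and carry no information about how that person compares with others.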
      The biggest problem with ipsative scores lies in the artificial reliability of the tests. Amongst other things, factor analysis is necessary to demonstrate construct validity and starts from an intercorrelation matrix, for example, of the different scales. However, intercorrelation matrices based on ipsative scores have one very peculiar feature: the average of all intercorrelations in the matrix is always equal to -1/(k-1), in which k stands for the number of scales in the test. Therefore, irrespective of the items in the scales, the average intercorrelation of the scales can be calculated in advance. For example, Meade (2004) collected the test data of 2,895 candidates for the job of sales assistant within a big supermarket group in the USA. The scores for the eight different scales were collected and reproduced in a matrix in both ipsative and normative ways. The predicted average intercorrelation for the ipsative matrix is therefore equal to -1/(8-1) or -0.14. The real average intercorrelation in this matrix was indeed -0.14 (p. 543). On the basis of this rule, it is also clear that the average intercorrelation moves closer to zero as the number of scales increases. With fifteen scales, the average intercorrelation is equal to -1/(15-1) or -0.07. As the number of scales rises, the individual correlation values will gradually decrease, and therefore a high correlation in the matrix will rarely be encountered. Such matrices are often incorrectly interpreted as evidence for a set of independent variables, certainly whenever the reader is not aware of the ipsative nature of the scales. In other words, the independence of the scales is purely artificial.
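      The -1/(k-1) rule can be checked with a short simulation. The sketch below makes several assumptions that are not in the article: the normative scores are drawn as independent standard normal values (so all scales have equal variance, the condition under which the rule holds exactly in the population), and ipsatisation is done by subtracting each respondent's own mean. It is not Meade's data or procedure.

```python
import random
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def mean_intercorrelation(columns):
    """Average of all off-diagonal correlations between scale columns."""
    pairs = list(combinations(range(len(columns)), 2))
    return sum(pearson(columns[i], columns[j]) for i, j in pairs) / len(pairs)


def simulate(k, n=4000, seed=0):
    """Return (normative, ipsative) mean intercorrelation for k scales."""
    rng = random.Random(seed)
    # Hypothetical normative scores: k independent scales per respondent.
    normative = [[rng.gauss(0.0, 1.0) for _ in range(k)] for _ in range(n)]
    # Ipsatise: subtract each respondent's own mean so every row sums to 0.
    ipsative = [[v - sum(row) / k for v in row] for row in normative]
    return (mean_intercorrelation(list(zip(*normative))),
            mean_intercorrelation(list(zip(*ipsative))))


for k in (8, 15):
    normative_r, ipsative_r = simulate(k)
    print(k, round(normative_r, 3), round(ipsative_r, 3), round(-1 / (k - 1), 3))
```

      The scales are unrelated by construction, so the normative average stays near zero, whereas the ipsative average lands close to -1/(k-1). The negative correlations are produced by the scoring method alone, not by anything the scales measure.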
      The intercorrelation matrix of a series of ipsative scales is an artefact of which there is no meaningful interpretation. If researchers carry out a factor analysis on such artificial matrices, they will see even more peculiar effects which have been well described by other authors (Baron, 1996; Bartram, 1996; Clemans, 1966; Dunlap & Cornwall, 1994; Hicks, 1970; Johnson, Wood, & Blinkhorn, 1988; Tenopyr, 1988). These effects are of a statistically technical nature (such as the production of artificial bipolar factors; communalities that are sometimes equal to 1 and the absence of specific variance; the fewer the scales, the greater the artificial distortion; and such like). Dunlap and Cornwall (1994, p.123) concluded: “We are left with recommending against the use of principal component, principal factoring, or maximum likelihood factor analysis with ipsative measures. The separation of artifactual bipolar factors induced by ipsativity from any true underlying relationship will be difficult at best, and not worth the danger of a largely incorrect interpretation.” Tenopyr (1988, p.750) states: “And so, it is clear that the internal consistency reliabilities of forced-choice scales are not only interdependent but may also augment each other enough to give the scale user a false sense of confidence in construct interpretation,” and “... a subject for major concern in view of the widespread use of forced-choice inventories in vocational guidance, diagnosis, selection, and research.”
      Basing one’s conclusions on ipsative test results can have major effects: The results often lead people to completely different conclusions than when using normative scales. To illustrate this, Meade established that recruitment decisions based on ipsative scales were totally different from those based on normative scales. Although some studies appear to show that ipsative scales can sometimes be appropriate in vocational preference tests, for example, they are not appropriate for measuring personality dimensions or for selection (Meade, 2004, p.548). Since the MBTI uses ipsative scales, both the internal reliability of the scales and the independence of the scales are unreliable because they are artificial.
    I knew, intuitively, that a fixed choice self-report is unreliable if you are going to calculate correlations.
