The Measurement of Differences between
Variable Quantities 1
In biological statistics it is often necessary to study the differences between
two variable types. The problem may be exemplified by a
consideration of the differences between the types represented by populations
of various countries, as, for instance, between the populations of
Sweden, Switzerland, and Central Africa. It is obvious that the type
of Sweden differs less from that of Switzerland than the latter differs
from the type of Central Africa. Nevertheless, it is difficult to say just
what is meant by greater or lesser difference in type. The attempt to
establish and describe varieties of races according to characteristic features
that are considered as significant from a morphological point of
view suffers, therefore, from a lack of clarity of concept.
The differences between the averages of types have been utilized for
the purpose of segregating subtypes of human races, as, for instance, in
the classification of the local types into which the European race may
be divided. Pigmentation, stature, form of hair, head, face, and nose
have been so utilized. For example, local types have been described by
Deniker 2 by assigning to each group peoples among whom certain
average values of measurements are found. All those that have average
statures, head indices, facial forms, nose forms, and pigmentation falling
within certain limits that may be expressed numerically were assigned
by him to a certain subrace. Although it is possible to give in
this manner a definite description of local types, the biological significance
of the observed differences remains undetermined. Obviously the
classification obtained by the method here indicated will vary according
to the limits set for each division. If we call tall those populations whose
average stature is more than 170 cm., their assignment to a subdivision
will not be the same as the one obtained when we call those tall whose
average stature is more than 172 cm. If no valid reason can be given for
the choice of one or the other limit, then the subtype so established can
have only a conventional descriptive meaning. If we wish to establish
a biologically significant classification we should have to prove that the
descriptive features selected are morphologically significant. Furthermore,
it would be necessary to distinguish between environmental and
hereditary influences that determine the particular features which are
made the basis of the classification. As a matter of fact, this study has
never been made; and since the lines of demarcation between classes
are arbitrary, these classes will be only a convenient schematic review
of the distribution of certain selected combinations of descriptive features. 1
In the following pages I wish to discuss the question whether a valid
method of comparing closely allied forms can be found, so that arbitrary
classifications may be avoided and measurable differences between two
It may seem that maps showing the distribution of a single feature or
of combined features would give this information. Retzius' maps of
Sweden, Livi's maps of Italy, Virchow's map of hair color in Germany,
anthropological maps of France, England, and Spain, all illustrate the
distribution of forms of the body, either by showing the areas in which
the same average value of a measurement occurs or by showing areas
in which certain selected values occur with equal frequency. The
maps are intended to convey the impression that sameness of
average values or of frequencies indicates the occurrence of the same
racial forms. They also indicate that the differences between types
are equivalent whenever the differences between averages or between
frequencies of occurrence of selected values are the same. This
has often led to the interpretation that the values whose frequency is
shown represent separate racial types. Thus, the frequency of long-headedness
in an area is often said to mean that a long-headed race
forms a certain proportion of the population, although no biological
basis can be given for the claim that the arbitrarily selected values
represent a separate racial type.
The essential difficulty of our problem may be made clearer by the
following considerations. Each racial type is variable. When we study
the distribution of any particular feature, let us say of the cephalic index
among European types, we find that the forms which occur in each
area are variable, and the individuals composing the populations of different
areas show in part the same numerical values of the measurement.
The distribution of forms in each population is such that the types overlap.
The average cephalic index in Sweden may be 77; in Bavaria 85.
Nevertheless, there will be many individuals that have the index 81,
both in Bavaria and in Sweden; and according to this particular feature
individuals may belong to either group. We know that if two regions
are not too far apart, in most cases it is quite impossible to assign with
certainty a single individual to either of them.
If we should assume for the moment the variability found both in
Sweden and in Bavaria to be very low, so that the highest cephalic index
occurring among Swedes would be not more than 80 and the lowest
occurring in Bavaria not less than 82, then the two series would appear
to us entirely distinct. It would be quite inadmissible to claim that the
differences between the pair of groups were the same in the cases of
greater variability (which has actually been observed) and the lesser
variability (which has here been assumed), although in both cases the
averages show the same differences. In the latter case we judge that the
difference is greater, or perhaps better, more fundamental.
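The effect of variability on this judgment can be illustrated numerically. The following is a minimal sketch, assuming that the cephalic index in each population follows a normal distribution with a common standard deviation. The averages 77 and 85 are those named in the text; the standard deviations 3.5 and 0.8, and the function names, are assumptions chosen only for illustration.

```python
import math

def normal_cdf(z):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def overlap_equal_sd(mean_a, mean_b, sd):
    """Area shared by two normal curves with the same standard deviation.

    With equal standard deviations the curves cross midway between the two
    averages, so the shared area is 2 * Phi(-|mean_a - mean_b| / (2 * sd)).
    """
    return 2.0 * normal_cdf(-abs(mean_a - mean_b) / (2.0 * sd))

# Averages from the text; both standard deviations are hypothetical.
observed_like = overlap_equal_sd(77.0, 85.0, sd=3.5)  # greater variability
assumed_low = overlap_equal_sd(77.0, 85.0, sd=0.8)    # lesser variability

print(f"overlap with sd 3.5: {observed_like:.4f}")
print(f"overlap with sd 0.8: {assumed_low:.6f}")
```

The difference between the averages is 8 units in both cases, yet the shared area is substantial in the first and practically nil in the second.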
Obviously our judgment is influenced by the degree of variability;
still more, by the degree of overlapping of the two series. Only if we
assume quite arbitrarily that the individuals that show the average
values of the measurement in question — or some other selected value —
were the true representatives of the whole population, and that all others
were present only as foreign, intrusive elements, or if their occurrence
represented modifications of the typical form due to extraneous causes
— only under these conditions could we say that the difference between
the selected values represents the difference between the types. A concept
of variability like the one involved in these assumptions is, however,
quite inadmissible. The group must be considered as a class and
its variability determined by the definition of the class in question.
Our detailed study of the class will always be directed toward the discovery
of new principles of classification by means of which subclasses
are formed whose variability will be less than that of the original class.
In this way we try to define the newly formed subclasses more sharply
than the original class, and the advance in our knowledge consists in
the discovery of the factors that make the subclass more determinate. It
would be quite arbitrary to select one particular individual as the type,
and to claim either that all others are not really members of the class or
that they are modified forms of the type. This method of procedure
would contravene the fundamental concept of variability, for a variable
comprises all the representatives of a class, the individual components of
which are only defined in so far as they are members of the class — this
in contradistinction to constants which are assumed to be completely
defined and must therefore be the same in every case.
As soon as these principles are held clearly in mind, it appears that
the ordinary definition of arithmetical difference is not applicable in
our case. The term “difference” as applied to variables does not mean
the same as the term “difference” applied to constants. Variables cannot
be brought into a measurable series by the same means that we use
for constants which may be compared by means of an arbitrary standard
that is also constant.
The problem before us is how to overcome these difficulties — how to
give a definite meaning to the differences between variables and make
these differences measurable.
The question has been treated by G. H. Mollison 1 and by J.
Czekanowski. 2 Mollison has discussed particularly the problem of differences
between two types, and he gives an arbitrary formula which
later on was modified by St. Poniatowski. 3
In the following pages I shall discuss some possible approaches to
What we call difference in this case is not by any means an arithmetical
difference; it is a judgment of the degree of dissimilarity of two
series. If two series are so far apart that notwithstanding their variability
they do not overlap, they are entirely dissimilar. If they do overlap they
will be the more dissimilar, the less the amount of overlapping. In this
sense we may say that two pairs of series in which the amount and character
of overlapping are equal will be equally dissimilar. While we may
thus determine equality of dissimilarity we are not in a position to
determine quantitatively the degree of dissimilarity.
In treating this problem we may first of all explain the meaning of
similarity and dissimilarity by means of a few examples. Let us assume
that a pure Negro and a pure White population are to be compared.
The types are so distinct in all their features that in comparing them
we should emphasize simply their dissimilarities. Now let us assume
that a third community is added, consisting perhaps of baboons. It appears
at once that our point of view would be shifted from a consideration
of dissimilarities between Negroes and Whites to the similarities
which they have in common as compared with the baboon, and their
similarities will appear to us now under a new angle and as of different
When we compare a group of blond, blue-eyed North Europeans
with dark complexioned, brown-eyed South Europeans, their dissimilarities
are the most striking feature. If we add a Negro community to
these two groups the similarities between the North and South Europeans
would be much more prominently in our minds. We may observe
the same changing attitude when we speak of family resemblances, or
similarities. When we consider the children of a family, entirely by
themselves, without any reference to any other family, they will appear
to us as dissimilar. If the family has a particular characteristic feature,
let us say, for instance, a long narrow nose, which all the children have
to a greater or less extent, this will become the feature which makes
them similar as compared to the rest of the population.
It is, therefore, clear that the concept of the degree of similarity depends
upon the characteristics of all the groups that are under consideration
and will change with the groups that are being compared.
In investigations on heredity it has been customary to determine the
degree of similarity by means of the coefficient of correlation. When,
for instance, parents and offspring are compared, the coefficient of correlation
between the two will indicate the degree of their similarity.
There is a biological relation between parent and offspring. The average
form of the offspring is determined by the degree to which the parent
differs from the average of the population to which he belongs. In
marriage we may have selective mating through which the forms of two
parents may be correlated. When the husband differs from the average
of the population by a certain amount his wife may differ by a correlated
amount. In both of these cases there is a functional relation between the
two values. The distinguishing feature of fraternal correlation is that
we are dealing with a natural group in which there is no true functional
relation between the members. In a very large fraternity, disregarding
the fraternity as part of a population, the bodily form of one
member does not influence in any way either the average body form of
the rest of the fraternity or the distribution of the individual forms.
This is due to the fact that the members of the fraternity are all members
of the same variable class, while in all the other cases previously
noted we are dealing with relations between different classes. Fraternal
correlation originates only in a population in which the fraternities
represent different types. If all the families had the same average value
there would be no correlation and no similarity between brothers.
The greater the heterogeneity of the family lines, the greater will be the
correlation and similarity between members of a fraternity.
Exactly the same considerations may be made for racial types. A local
variety may be considered as a fraternal group. The coefficient of correlation
between the local groups will then be a measure of their heterogeneity
or of their dissimilarity.
The problem of the definition of similarities has been treated fully
in experimental psychology. Weber's law is actually based on the observation
that the differences between two pairs of sensations are judged to
be equal. In this case the basis of empirical determination of similarity
is the probability of mistaking one difference for another. It is not, as
was originally assumed, a measure of quantitative value of the sensation
itself. This concept of similarity holds good not only in the case of simple
sensations but also in the field of more complex experience. We may
speak of similarity, or of the probability of failing to differentiate, for
the most diverse kinds and the most complex forms of mental experience.
The problem that we are discussing here has suggested itself in
every comparative study of mental processes.
In an analogous manner we may define the degree of similarity as
the probability of mistaking an individual who belongs to one group for
a member of any of the other groups concerned. The degree of dissimilarity
may then be determined by the probability of recognizing an individual
as belonging to his own group.
The same measurement will occur with varying frequency in the
groups forming the aggregate of groups that is being investigated. Each
individual may belong to any one of these groups and the probability
of his belonging to a particular group will be determined by the ratio
between the frequency of the measurement identifying the individual
as a member of his group and of its frequency in the aggregate. Thus
the probability of the correct assignment of a single individual or of all
individuals of the group having the trait in question can be determined.
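The rule just stated can be sketched in code. This is one possible reading of the passage, not a formula given in the text: for each value of the measurement, the probability of assigning an individual to a group is taken as that group's weighted frequency divided by the frequency in the aggregate, and the probability of correct assignment for the whole group is the average of these probabilities over the group's own members. The function name, the `weights` argument, and the frequency tables are hypothetical.

```python
def correct_assignment_probability(freqs, group, weights=None):
    """Probability that a random member of `group` is assigned to it.

    freqs   : dict mapping group name -> dict of {value: relative frequency}
    weights : dict mapping group name -> share of the aggregate
              (equal shares when omitted)
    """
    names = list(freqs)
    if weights is None:
        weights = {name: 1.0 / len(names) for name in names}
    total = 0.0
    for value, f_own in freqs[group].items():
        # Frequency of this value in the aggregate of all groups.
        f_aggregate = sum(weights[n] * freqs[n].get(value, 0.0) for n in names)
        if f_aggregate == 0.0:
            continue
        # Chance that an individual with this value belongs to `group`.
        p_group_given_value = weights[group] * f_own / f_aggregate
        total += f_own * p_group_given_value
    return total

# Hypothetical relative frequencies of a measurement in two groups.
freqs = {
    "A": {76: 0.2, 77: 0.5, 78: 0.2, 81: 0.1},
    "B": {81: 0.1, 84: 0.2, 85: 0.5, 86: 0.2},
}
p_a = correct_assignment_probability(freqs, "A")
p_b = correct_assignment_probability(freqs, "B")
print(p_a, p_b)
```

In this invented example only the value 81 is shared, so only the individuals showing it can be mistaken for members of the other group. Unequal `weights` would let the groups enter the aggregate in proportion to the number of individuals observed.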
When each series is compared with the aggregate of all the series and the
degrees of diversity are established these may be subtracted from one
another, and in this manner differences in the degree of similarity may
When three series are compared in this manner in pairs, the resultant
values are not additive. If only series (1) and series (2), then series (1)
and (3), then series (2) and (3) are considered, the sum of the differences
between (1) and (2) and between (2) and (3) will not
be equal to the difference between (1) and (3). This is another expression
of the observation made before that the meaning of similarity
changes with the aggregate of the series that is being considered.
It might also seem possible to arrange the single series in the order
of their averages and to determine their dissimilarities step by step.
Here the difficulty may arise that two succeeding averages may be
nearly the same, while their variabilities may be quite different. Whenever
this occurs quite an erroneous impression of the differences will
be given. The reason for this difficulty lies in the fact that the difference
as here defined depends upon the averages and variabilities of the
single series, and that certain combinations of these two values result
in the same degree of dissimilarity.
In the case treated here the various series enter into the aggregate
according to the number of individuals representing each series. It
might be, for instance, that a large mass of material has been accumulated
for one group and that another group is known through the study
of a very few individuals only. Our expression contains, therefore,
a weighting according to number which obscures the more general theoretical
question. If the groups were known perfectly, then all would
have equal weight, i.e., we should have to assume them to be represented
by equal numbers.
Whether this point of view or the other should be taken depends
upon the clarity of our concept of the characteristics of each group.
If we assume each group as thoroughly studied and therefore known in
all its characteristics, then equal numbers will represent the conditions
adequately. On the other hand, if we are impressed by the unclassified
series as a whole, without detailed study of each group, and if we try to
determine the similarities and dissimilarities on this basis, the actual
numerical frequency of each group will correspond to the conditions
of the investigation. If subjective elements are to be eliminated as far
as possible, we must try to adjust conditions so that equal numbers can
be applied. As a matter of fact, our judgment of similarity in all cases
of this type is fluctuating; sometimes one group, sometimes another, is
most prominently in our minds, and the actual assignments are therefore
different from the two extreme forms discussed here and may lie
somewhere in between, or they may change with changing mental conditions.
The more thorough our knowledge of each series, the closer
will be the approach to the treatment of all classes as equal in number.
The method here discussed presents the inconvenience that the values
obtained for similarity are the smaller, the larger the number of series
forming the aggregate, so that when the number of similar series is very
great the values of their similarities will be exceedingly small.
In the final results it may appear that some of these series have the
same degree of dissimilarity. If the averages and variabilities of these
series are also indicative of identity, the series should be combined.
It must be remembered that it is possible for a number of different
distributions to result in the same amount of dissimilarity. Since every
distribution depends at least upon two constants, average and standard
deviation, there are whole sets of functions which will give us the same
value for the total probability of mistaking a member of one series for a
member of the rest of the aggregate. However, owing to the general
likeness of forms of distribution, the occurrence of this event is improbable.
On the other hand, dissimilarity can occur only when distributions
are unlike. The minimum amount of dissimilarity is found
when all the series are identical. If there are n series, the value of dissimilarity,
in other words the probability of assigning any one individual
to its proper series, will be 1/n.
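This minimum can be checked in one line, under the assumption (one reading of the rule of assignment described earlier, in a notation not used in the text) that the n series enter the aggregate with equal weight and that f_i(x) denotes the relative frequency of the value x in series i. The probability P_i of assigning a member of series i to its proper series is then

```latex
P_i \;=\; \sum_x f_i(x)\,\frac{f_i(x)}{\sum_{k=1}^{n} f_k(x)} ,
\qquad\text{and if } f_1 = f_2 = \dots = f_n = f:\qquad
P_i \;=\; \sum_x f(x)\,\frac{f(x)}{n\,f(x)} \;=\; \frac{1}{n}\sum_x f(x) \;=\; \frac{1}{n}.
```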
In applying the fundamental thought underlying our considerations
to the classification of mankind, we might ask ourselves which are the
series for which the similarity or the probability of a misjudgment becomes
zero, and these might be considered as the present fundamental
human types. A satisfactory solution of this problem must not be based
on the consideration of a few standardized measurements, but the features
to be studied must be selected after a careful investigation of what
is most characteristic of each group.
It is also feasible to find in this manner outstanding types of a definite
area and to arrange them according to the degrees of their similarity.
The interpretation of the similarity, whether due to mixture, environment,
or other causes, is of course a purely biological problem for which
the statistical inquiry furnishes the material but which cannot be solved
by statistical methods.
We have seen that, in an attempt to analyze a mixed series according
to types, the individuals of a definite bodily form are not all assigned
by us to the group to which they belong. The impression which
we receive of characteristic forms of a particular series depends upon
the distribution and the forms of individuals whom we assign to it, and
for this reason our impression of the general characteristic form of the
series is expressed by the average of individuals whom we assign to it.
This value is obtained by averaging all those individuals who, according
to our judgment, are assigned to the local type, leaving out the others
that are placed erroneously. This consideration shows that we receive
an exaggerated impression of the characteristics of a series, because
individuals that are similar to other series are assigned to them according
to their appearance and are merged in the general background represented
by the aggregate. Our impression, however, does not correspond
to an actual type. This proves that the attempt to analyze a series into
a number of subtypes according to similarities of individuals is methodologically
not admissible, and that all subdivisions must be based on
the study of the series as a whole, not upon selected types.
The chief difficulty in the practical application of the method outlined
in the preceding pages is due to the facts that the degree of similarity
depends upon the aggregate treated, and that there is no relation
between the numerical values obtained for different aggregates. Not
even the equality of differences between several given series need persist
if new members are added to the aggregate or are taken away from it.
In cases of continuous changes of a type from one extreme form to
another, an artificial classification of the aggregate is unavoidable. By
means of repeated adjustment equal degrees of similarity might be
found according to the method outlined here, but the actual carrying
out of such a plan offers serious difficulties. In such cases each series
might be considered as a specialized form of the general aggregate
and compared with it. The aggregate itself may, however, be established
in two different ways. We may disregard the number of existing
individuals, considering each morphological type contained in the aggregate
as a unit. The units would then be given equal weight (i.e.,
equal numbers of cases). Or we may take the whole series as it exists at
the present time, counting the total number of individuals that it contains,
regardless of local types that may represent the same morphological
form. By either of these methods we ascertain how dissimilar each
morphological type is from the aggregate, but these values cannot be
used to determine the mutual dissimilarities of the single series contained
in the aggregate. When the types are combined according to the present
actual number of individuals representing them, the most numerous type
will appear least distinct from the average, merely on account of the
large number of its members. This difficulty can hardly be avoided by
comparing each series with the aggregate of the remaining series, because
by this method the standard of comparison is changing. On the
other hand, the formation of the aggregate by giving equal weight to
each morphological type entails the difficulty that we tried to avoid,
namely, an arbitrary classification of the groups as a number of morphological
The problem may be approached in another manner. We may determine
the frequency distribution of the differences between individuals
belonging to one series and those belonging to all the series of the aggregate
including the one selected for study. In this inquiry we have to
determine the average difference between the representatives of one
series and those of all the series, and the variability of this difference.
When the series are arranged in pairs, the differences between the averages
are additive, but the variabilities are not comparable. The interrelations
between the series can be determined only when we consider
any one series in relation to the whole series.
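A minimal sketch of this second approach, assuming small invented samples and treating every pairing of an individual from one series with an individual from another as equally likely, tabulates the mean and the standard deviation of the differences. All names and numbers are illustrative only.

```python
from itertools import product
from statistics import mean, pstdev

def difference_summary(series_a, series_b):
    """Mean and standard deviation of a - b over all pairs (a, b)."""
    diffs = [a - b for a, b in product(series_a, series_b)]
    return mean(diffs), pstdev(diffs)

# Hypothetical measurements for three series.
s1 = [75.0, 77.0, 79.0]
s2 = [80.0, 81.0, 82.0]
s3 = [80.0, 85.0, 90.0]

m12, sd12 = difference_summary(s1, s2)
m23, sd23 = difference_summary(s2, s3)
m13, sd13 = difference_summary(s1, s3)

# The average differences add up along the chain 1 -> 2 -> 3 ...
print(m12 + m23, m13)
# ... but the variabilities of the differences do not combine this way.
print(sd12, sd23, sd13)

# One series against the aggregate of all series, itself included.
# Concatenating the samples weights each series by its number of individuals.
aggregate = s1 + s2 + s3
print(difference_summary(s1, aggregate))
```

The variability of each pairwise difference depends on the variabilities of both series involved, which is why the pairwise values offer no common standard, whereas the comparison with the aggregate does.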
The problems take a slightly different form when populations are
compared with regard to features that occur in a certain percentage of
individuals and are absent in the rest. If, for instance, one population
consists of 15 per cent Negroes and 85 per cent Whites, another one of
30 per cent Negroes and 70 per cent Whites, it might seem that the
difference could be stated simply as a difference of 15 per cent, but
obviously the dissimilarity of these two types of population would not
be the same as in another pair in which we have 40 per cent Negroes
and 60 per cent Whites in one and 55 per cent Negroes and 45 per cent
Whites in the other. In the latter case the populations would seem more
alike to us than in the former case. The difficulty is still more pronounced
if there are present not merely two types but a larger number
in varying proportions. In all these cases we may apply the same
methods which we used for the determination of similarity of measurable
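The two pairs of populations described above can be compared by the probability of correct assignment. The following is a sketch under stated assumptions, not a computation given in the text: the two populations of each pair enter the aggregate with equal weight, and an individual of a given type is assigned to a population in proportion to the frequency of that type in it. The function name is hypothetical; the percentages are those of the text.

```python
def correct_assignment(own, other):
    """Probability of assigning a member of `own` to its own population.

    own, other : dicts of {category: proportion}; both populations are
    given equal weight in the aggregate (an assumption of this sketch).
    """
    total = 0.0
    for category, p_own in own.items():
        p_both = p_own + other.get(category, 0.0)
        if p_both > 0.0:
            total += p_own * p_own / p_both
    return total

# Proportions taken from the two pairs of populations in the text.
first_pair = ({"Negro": 0.15, "White": 0.85}, {"Negro": 0.30, "White": 0.70})
second_pair = ({"Negro": 0.40, "White": 0.60}, {"Negro": 0.55, "White": 0.45})

first = correct_assignment(*first_pair)
second = correct_assignment(*second_pair)
print(f"first pair:  {first:.4f}")
print(f"second pair: {second:.4f}")
```

Under these assumptions the first pair yields the larger probability of correct assignment, although the arithmetical difference is 15 per cent in both pairs; this agrees with the judgment that the populations of the second pair seem more alike.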
1 Quarterly Publication of the American Statistical Association (December,
1922), pp. 425-445.
2 The Races of Man (London, 1900).
1 See also St. Poniatowski, “Ueber den Wert der Index Klassifikation,” Archiv
für Anthropologie, N.F. vol. 10 (1911), p. 50.
1 Morphologisches Jahrbuch, vol. 42, p. 79.
2 Korrespondenz-Blatt der Deutschen anthropologischen Gesellschaft, vol. 40.
3 Archiv für Anthropologie, N.F. vol. 10 (1911), p. 274.