Fairbanks, Grant. Experimental Phonetics – T16

Additional Analyses of the Rhyme Test

Following the publication of the article describing
the Rhyme Test reprinted in this volume, Fairbanks
completed additional analyses of the test using an augmented
sample of subjects. Most of the data herein
reported were first presented by Fairbanks at the meetings
of the Workshop on Speech Discrimination held in
Houston, Texas, in October of 1961. 1

Procedure

The five forms of the Rhyme Test as previously described
(see response form displayed in Table 1) were
administered to 20 subgroups of 4 Ss in a random Latin
Square design, all Ss receiving all forms of the test such
that each form appeared an equal number of times in
all presentation positions. The tests were administered
to all Ss through PDR-10 headsets under continuously
monitored -2 dB V/N conditions, 2 the noise condition
previously identified as the 50% point of the average
intelligibility function for the combined five forms. Ss
were normal hearing college students.

Table 1. Sample response sheet used for all forms of the Rhyme Test.

tableau A | B | C | D | E | _ot | _ire | _en | _in | _ail | _ay | _ale | _ark | _ust | _ight | _op | _ent | _oil | _ine | _orn | _eel | _oon | _ig | _ink | _od | _ake | _ick | _age | _old | _ock | _aw | _ame | _ast | _it | _ump | _ile | _ide | _ain | _ed | _ate | _eat | _ip | _est | _end | _ell | _ook | _ore | _un | _id | _et | _ill | _ang | _eal | _ack | _uck
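The counterbalancing described under Procedure can be illustrated with a short sketch. It builds a cyclic 5 × 5 Latin square and randomizes its rows, columns, and form labels, then stacks four such squares to cover the 20 subgroups. This is only one common way of obtaining a randomized square; the function names and the stacking of four squares are assumptions made for illustration and are not claimed to be the procedure actually followed.

```python
import random

FORMS = ["RT-1", "RT-2", "RT-3", "RT-4", "RT-5"]


def random_latin_square(items, rng):
    """Return a randomized Latin square; each row is one presentation order."""
    n = len(items)
    # Cyclic square: row r, position c holds index (r + c) mod n.
    square = [[(r + c) % n for c in range(n)] for r in range(n)]
    rng.shuffle(square)                      # permute the rows
    cols = list(range(n))
    rng.shuffle(cols)                        # permute the columns
    square = [[row[c] for c in cols] for row in square]
    labels = list(items)
    rng.shuffle(labels)                      # permute the form labels
    return [[labels[i] for i in row] for row in square]


def orders_for_subgroups(n_subgroups, items, seed=0):
    """Stack randomized squares until every subgroup has a presentation order."""
    rng = random.Random(seed)
    orders = []
    while len(orders) < n_subgroups:
        orders.extend(random_latin_square(items, rng))
    return orders[:n_subgroups]


# Assumed layout: 20 subgroups, i.e. four stacked 5 x 5 squares.
orders = orders_for_subgroups(20, FORMS)

# counts[form][position] = number of subgroups hearing that form in that position
counts = {form: [0] * len(FORMS) for form in FORMS}
for order in orders:
    for position, form in enumerate(order):
        counts[form][position] += 1
```

Because the number of subgroups is a multiple of the number of forms, every form falls in every presentation position equally often, which is the balance property the design requires.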

Results and Discussion

Table 2 presents mean and variability estimates of
the per cent correct responses for each of the five forms.
The means are to be compared with the per cent correct
responses of 57% to 64% previously found for 40 Ss at
all V/N ratios.

It should be observed that these values are remarkably
homogeneous, especially in view of the fact that their
determination is necessarily at the steepest inflection
point of the intelligibility function. The small range
is to be interpreted as indicating that the five forms can
be considered as equivalent in their over-all effects.

Table 2. Means, ranges, and standard deviations of the five forms.

tableau X̄ | % | range | S.D. | RT-1 | RT-2 | RT-3 | RT-4 | RT-5 | combined

Table 3 displays the average per cent correct responses
obtained for each of the 18 consonants of the
five forms. These values are to be compared with those
presented in Table VI of the original article. Despite
the fact that the original values were based on the means
of only 8 subjects, while those of Table 3 summarize
80 subjects, the rank-order correlation between the two
sets is .66. Obviously, however, the newer values should
replace those of the earlier article since the stability of
these estimates is considerably enhanced by the greater
N. The total number of determinations used in calculating
these present values ranges from 2,160 for /s/ (27
stimulus words beginning with this phoneme times 80
Ss) to 80 for /j/ and /z/ appearing in the single stimulus
words yet and zeal.

Table 3 also presents the frequency (f) and probability
(p) of those 18 consonants in the language.
These estimates are drawn from the tables of transitional
frequencies of English phonemes (Hultzén, Allen,
and Miron, 1964) as calculated from a total of 20,032
running phonemes of conversational English. Values
are entered for both the frequency of occurrence of the
single consonantal phoneme and for the consonant following
juncture. The # juncture of the Trager-Smith
phonemic system is to be interpreted as being only
roughly equivalent to the space in orthography, since
in this count # subsumes four degrees of pause: +, |,
‖, and #. (See Trager and Smith, 1951.)

Table 3. Occurrence probabilities and diversities of the consonant stimuli ordered by intelligibility scores (I = % correct responses).

tableau all occurrences | occurrences after /#/ | /C/ | /m/ | /r/ | /l/ | /g/ | /f/ | /n/ | /d/ | /b/ | /w/ | /ʤ/ | /k/ | /v/ | /p/ | /t/ | /s/ | /h/ | /j/ | /z/ | sum

* Exclusive of diphthong off-glide occurrences
a based on four stimulus words
b based on two stimulus words
c based on one stimulus word

It is of interest to note that the frequency of occurrence of these
consonants has a rank-order correlation of .10 with the
per cent correct scores for all occurrences and -.05 for
frequencies following #. Since neither value differs significantly
from zero, we may assume that
frequency of occurrence in the language is not the
determinant of the intelligibility of these consonants.
Stated conversely, we may conclude that the test is apparently
free of obvious frequency contamination, justifying
the claim that the instrument is a test of phonemic
discrimination. The importance of this conclusion is
enhanced when one considers that these 18 phonemes
appearing in initial position account for 68% of all obtained
initial phonemes.
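For readers who wish to repeat the rank-order comparison on their own data, a minimal sketch of the Spearman coefficient follows. It ranks both variables (tied values receive the mean of their ranks) and takes the product-moment correlation of the ranks. The two lists are invented for illustration and are not the Table 3 values; the function names are likewise hypothetical.

```python
def average_ranks(values):
    """Rank values from 1 to n; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_rho(x, y):
    """Rank-order correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Invented values: per cent correct and language frequency for six consonants.
intelligibility = [72.0, 65.5, 61.0, 58.2, 49.9, 41.3]
frequency = [310, 1205, 198, 877, 640, 402]
rho = spearman_rho(intelligibility, frequency)
```

A coefficient near zero, as in this invented example, is the pattern the text reports for frequency against intelligibility.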

The remaining columns of Table 3 display the diversity
(D) and information measures (H) for these
consonant phonemes. The D index indicates the total
number of phonemes found to follow the consonant in
question, the highest value being 42 in the Trager-Smith
system. For example, the table indicates that /m/ in
all positions was followed by a total of 22 phonemes
and the combination /#m/, i.e., /m/ in initial position,
was followed by a total of 12 different phonemes. The
H statistic is that devised by Shannon (1949) and is
calculated from the formula:

H = -Σ pᵢ log₂ pᵢ

The statistic indexes the shape of the probability distribution
of the phoneme types found to follow each of
the consonants. If all following phonemes were equiprobable
in occurrence, the value of H would be maximal.
On the other hand, if only one phoneme is found
to follow a given consonant, regardless of its frequency
of occurrence, the minimum H value of zero would be
obtained. In order to compare the H values of phonemes
for which the language permits differing numbers
of following phonemes, we are required to calculate the
relative amount of information (H_rel), i.e., the amount
of information relative to the maximum possible, log₂N,
where N is the obtained number of following phonemes.
Thus we see, for example, that the distribution of the
fewer possible following phonemes, given knowledge of
/#m/, provides more relative information (more uncertainty
as to following phoneme) than does the knowledge
of /m/ alone; the relative information values being
.85 and .76, respectively.
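The two measures can be sketched as follows, assuming that the input is a list of counts of each phoneme type observed to follow a given consonant. The counts shown are invented, the function names are hypothetical, and the treatment of the single-type case (returning zero, since log₂ 1 = 0 would otherwise be a divisor) is a convention adopted here rather than one stated in the text.

```python
import math


def entropy_bits(counts):
    """Shannon H = -sum(p_i * log2(p_i)) over the observed following phonemes."""
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    return -sum(p * math.log2(p) for p in probs)


def relative_information(counts):
    """H divided by its maximum, log2(N), where N is the number of observed types."""
    n_types = sum(1 for c in counts if c > 0)
    if n_types < 2:
        # log2(1) = 0, so the ratio is undefined; zero is returned by convention.
        return 0.0
    return entropy_bits(counts) / math.log2(n_types)


# Invented follower counts, not taken from the transitional-frequency tables.
uniform = [5, 5, 5, 5]      # equiprobable followers: H is maximal
skewed = [17, 1, 1, 1]      # one dominant follower: H well below maximum
single = [20]               # only one follower: H is zero

h_uniform = entropy_bits(uniform)
h_rel_uniform = relative_information(uniform)
h_rel_skewed = relative_information(skewed)
h_single = entropy_bits(single)
```

Dividing by log₂ N is what permits a consonant with few possible followers to show a higher relative value than one with many, as in the /#m/ and /m/ comparison.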

Thus, even though /#m/ has fewer possible following
phonemes and hence lower absolute H, the probabilities
of these following phonemes are more nearly
equal in the language than is the case for the phonemes
found to follow /m/ alone. Of the 17 consonant phonemes
of the Rhyme Test for which the measure was
calculated (H not conveniently calculated excluding
diphthong off-glide occurrences), 10 of these show increases
in relative information when found in the initial
position. As before, the rank-order correlations between
intelligibility scores and these information measures are
essentially zero. The fact that none of the frequency
measures correlate with the obtained intelligibility scores
and that the initial consonant position yields essentially
flat distributions of the probabilities of the bi-phoneme
transitions indicates again that the test is a relatively
pure measure of phonemic discrimination.

Table 4. Consonant confusion matrix of the three highest-ranking responses to each stimulus.

tableau responses | /p/ | /t/ | /k/ | /f/ | /s/ | /h/ | /w/ | /j/ | /l/ | /r/ | /m/ | /n/ | /b/ | /d/ | /g/ | /v/ | /z/ | /ʤ/ | stimuli

* Tied

Table 4 presents the confusions found among the consonant
stimuli. For purposes of simplification, only the
ranks of the first three most frequent consonant responses
to each stimulus have been recorded. Although no particular
justification for the ordering of the consonants as
presented in the table is offered, the ordering employed
provides a reasonably compact distribution of the errors
about the accurate-identification diagonal. It should be
noted that the most salient characteristic of this display
is that the voiced and voiceless elements are separated
and that no confusions, at least for the first three most
frequent responses, ever involve voicing errors. If the
distinctive feature analysis of Jakobson, Fant, and Halle
(1952) were used to order the consonants, the eight
features 3 suggested by them would produce the following
metric of consonant differences:

tableau n 2.71 m 2.84 /ʤ/ 3.04 | m 2.74 m 2.87 w 3.05 | /b/ 2.75 m 2.87 a/ 3.06 | a/ 2.76 m 2.87 m 3.19 | /d/ 2.76 1*1 3.01 /j/ 3.19 | an/ 2.83 a/ 3.01 a/ 3.29

The values for each consonant were derived by taking
the square root of the sum of squares of the number of
distinctive features separating each from all other consonants.
This procedure is based on the generalized distance
measure for points in a multi-dimensional space.
The values indicate the average distance separating each
consonant from all others in the eight-dimensional space
defined by the distinctive features. The greater the difference
between the values of any two consonants, the
greater the distance between those consonants in the
space and, hence, the greater the number of distinctive
features required to distinguish them. Confusions, therefore,
would be predicted to occur between consonants
having close separation values. A cursory glance at the
confusion matrix will confirm, however, that such is not
the case. Obviously, the separation index provided by
the distinctive feature analysis fails to reproduce the
order of confusions obtained, the largest disparity occurring
relative to the handling of the voicing feature. The
intelligibility data clearly indicate that the distinctive
features discriminating these consonants are not all
equally important to the listener, with the voicing feature
far outweighing any other.
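The separation index can be sketched as below. The feature codings are hypothetical three-feature vectors for four unnamed segments and are in no sense the Jakobson, Fant, and Halle assignments. The text specifies the square root of the sum of squared feature differences; since it also calls the result an average distance, the sketch offers an optional division by the number of other segments (a root-mean-square), which is an assumption on the part of this illustration.

```python
import math

# Hypothetical binary feature codings, for illustration only.
FEATURES = {
    "A": (1, 0, 0),
    "B": (1, 1, 0),
    "C": (0, 1, 1),
    "D": (1, 1, 1),
}


def feature_distance(u, v):
    """Number of features on which two segments differ."""
    return sum(1 for a, b in zip(u, v) if a != b)


def separation_index(name, table, normalize=True):
    """Root of the summed squared distances from one segment to all others.

    With normalize=True the sum is first divided by the number of other
    segments (an assumed reading of "average distance").
    """
    distances = [
        feature_distance(table[name], vector)
        for other, vector in table.items()
        if other != name
    ]
    total = sum(d * d for d in distances)
    if normalize:
        total /= len(distances)
    return math.sqrt(total)


index = {name: separation_index(name, FEATURES) for name in FEATURES}
raw_a = separation_index("A", FEATURES, normalize=False)
```

In this toy table A and C receive identical indices although they differ on all three features, which shows how closeness of index values need not imply closeness of the segments themselves.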

Table 5 presents the reliabilities and intercorrelations
of the five forms of the test. The reliability estimates
are based on the intercorrelations of the item scores
which in turn are based on a random split of the 80 Ss
into two subgroups of 40. The intercorrelations were
also based on item statistics but use the full complement
of Ss. The reliability coefficients are all uniformly high.
The form intercorrelations are comparatively low and
undoubtedly reflect the fact that each form contains a
different distribution of both consonants and consonant-vowel
transitions. The form intercorrelations, based on
only those stimulus words containing the 14 consonants
common to all forms, are displayed in Table 6. It can
be seen that substantial increases in the intercorrelations
are obtained when the intercorrelations are restricted to
these common items.

Table 5. Form reliability and intercorrelation coefficients.

tableau RT-1 | RT-2 | RT-3 | RT-4 | RT-5

Table 6. Intercorrelation coefficients based on 14 phonemes common
to all forms.

tableau RT-1 | RT-2 | RT-3 | RT-4 | RT-5
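The split-half procedure described for the reliability estimates can be sketched as follows: the subjects are divided at random into two equal subgroups, the per cent correct is computed for each item within each subgroup, and the two sets of item scores are correlated. The response matrix below is simulated, the function names are hypothetical, and no correction for test length is applied, since none is mentioned in the text.

```python
import random


def pearson(x, y):
    """Product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def split_half_item_reliability(responses, rng):
    """Correlate item scores from two random halves of the subjects.

    responses[s][i] is 1 if subject s answered item i correctly, else 0.
    """
    subjects = list(range(len(responses)))
    rng.shuffle(subjects)
    half = len(subjects) // 2
    groups = (subjects[:half], subjects[half:])
    n_items = len(responses[0])
    item_scores = [
        [100.0 * sum(responses[s][i] for s in group) / len(group)
         for i in range(n_items)]
        for group in groups
    ]
    return pearson(item_scores[0], item_scores[1])


# Simulated data: 80 subjects, 50 items of varying (invented) difficulty.
rng = random.Random(1)
n_subjects, n_items = 80, 50
difficulty = [rng.uniform(0.2, 0.9) for _ in range(n_items)]
responses = [
    [1 if rng.random() < p else 0 for p in difficulty]
    for _ in range(n_subjects)
]
r = split_half_item_reliability(responses, rng)
```

The coefficient is high whenever the items differ in difficulty far more than the two subgroups differ from each other by sampling alone.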

The uniformly high reliabilities, freedom from language
frequency effects, and the implications of the data
derived for the development of an intelligibility index of
distinctive features, as well as the multitude of clinical
applications, indicate that the Rhyme Test can be
both an efficient and a practical tool in research and testing.

References

Hultzén, L. A., Allen, J. H. D., and Miron, M. S. Tables
of transitional frequencies of English phonemes. Urbana:
University of Illinois Press, 1964.

Jakobson, R., Fant, C. G. M., and Halle, M. Preliminaries
to speech analysis: the distinctive features and their correlates.
Cambridge, Mass.: MIT Press, 1952.

Shannon, C. E., and Weaver, W. The mathematical theory
of communication. Urbana: University of Illinois Press,
1949.

Trager, G. L., and Smith, H. L., Jr. An outline of English
structure. Studies in Linguistics, Occasional Papers 3.
Washington, D.C.: American Council of Learned Societies,
1951.

1 The information analysis of the frequencies of the consonant
phonemes has been added by the editor. Any errors in the interpretation
or execution of this analysis are solely his.

2 Vowel-to-noise ratio calculated on the basis of VU readings.
Fairbanks' convention of referring thus to speech-to-noise ratios
is followed in concurrence with his belief that such designation
is more accurate than the more conventional S/N (Ed.).

3 The eight distinctive features are: vocalic/non-vocalic, consonantal/non-consonantal,
compact/diffuse, grave/acute, nasal/oral,
tense/lax, continuant/interrupted, strident/mellow.