Additional Analyses of the Rhyme Test

Following the publication of the article describing

the Rhyme Test reprinted in this volume, Fairbanks

completed additional analyses of the test using an augmented

sample of subjects. Most of the data herein

reported were first presented by Fairbanks at the meetings

of the Workshop on Speech Discrimination held in

Houston, Texas, in October of 1961. 1

Procedure

The five forms of the Rhyme Test as previously described

(see response form displayed in Table 1) were

administered to 20 subgroups of 4 Ss in a random Latin

Square design, all Ss receiving all forms of the test such

that each form appeared an equal number of times in

all presentation positions. The tests were administered

to all Ss through PDR-10 headsets under continuously

Table 1. Sample response sheet used for all forms of the Rhyme Test.

A | B | C | D | E
_ot | _ire | _en | _in | _ail
_ay | _ale | _ark | _ust | _ight
_op | _ent | _oil | _ine | _orn
_eel | _oon | _ig | _ink | _od
_ake | _ick | _age | _old | _ock
_aw | _ame | _ast | _it | _ump
_ile | _ide | _ain | _ed | _ate
_eat | _ip | _est | _end | _ell
_ook | _ore | _un | _id | _et
_ill | _ang | _eal | _ack | _uck

monitored -2 dB V/N conditions, 2 the noise condition

previously identified as the 50% point of the average

intelligibility function for the combined five forms. Ss

were normal hearing college students.
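The counterbalancing described above can be illustrated with a short sketch. The source does not state how the random Latin square was drawn, so the construction below (shuffling the rows, columns, and symbols of a cyclic square) is an assumption, as are the function names and the seed. It does not sample uniformly from all possible Latin squares; it only guarantees the property the text requires, namely that each form occupies each presentation position equally often.

```python
import random

FORMS = ["RT-1", "RT-2", "RT-3", "RT-4", "RT-5"]


def random_latin_square(items, rng):
    """Build a Latin square by permuting a cyclic square (illustrative method)."""
    n = len(items)
    symbols = list(items)
    rng.shuffle(symbols)          # random relabelling of the forms
    rows = list(range(n))
    cols = list(range(n))
    rng.shuffle(rows)             # random row order
    rng.shuffle(cols)             # random column order
    return [[symbols[(r + c) % n] for c in cols] for r in rows]


def assign_orders(n_subgroups, items, seed=0):
    """Give each subgroup one row of the square as its presentation order."""
    if n_subgroups % len(items) != 0:
        raise ValueError("number of subgroups must be a multiple of the number of forms")
    rng = random.Random(seed)
    square = random_latin_square(items, rng)
    # Cycling through the rows uses every row equally often.
    return [square[g % len(items)] for g in range(n_subgroups)]


# 20 subgroups of 4 Ss, as in the procedure; each row is used by 4 subgroups.
orders = assign_orders(20, FORMS)
for group_number, order in enumerate(orders, start=1):
    print(group_number, order)
```

With 20 subgroups and five forms, each form falls in each of the five positions for exactly four subgroups, i.e., 16 Ss.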

Results and Discussion

Table 2 presents mean and variability estimates of

the per cent correct responses for each of the five forms.

The means are to be compared with the per cent correct

responses of 57% to 64% previously found for 40 Ss at

all V/N ratios.

It should be observed that these values are remarkably

homogeneous, especially in view of the fact that their

determination is necessarily at the steepest inflection

point of the intelligibility function. The small range

is to be interpreted as indicating that the five forms can

be considered as equivalent in their over-all effects.

Table 2. Means, ranges, and standard deviations of the five forms.

tableau X̄ | % | range | S.D. | RT-1 | RT-2 | RT-3 | RT-4 | RT-5 | combined

Table 3 displays the average per cent correct responses

obtained for each of the 18 consonants of the

five forms. These values are to be compared with those

presented in Table VI of the original article. Despite

the fact that the original values were based on the means

of only 8 subjects, while those of Table 3 summarize

80 subjects, the rank-order correlation between the two

sets is .66. Obviously, however, the newer values should

replace those of the earlier article since the stability of

these estimates is considerably enhanced by the greater

N. The total number of determinations used in calculating

these present values ranges from 2,160 for /s/ (27

stimulus words beginning with this phoneme times 80

Ss) to 80 for /j/ and /z/ appearing in the single stimulus

words *yet* and *zeal*.

Table 3 also presents the frequency (f) and probability

(p) of those 18 consonants in the language.

These estimates are drawn from the tables of transitional

frequencies of English phonemes (Hultzén, Allen,

and Miron, 1964) as calculated from a total of 20,032

running phonemes of conversational English. Values

are entered for both the frequency of occurrence of the

single consonantal phoneme and for the consonant following

juncture. The # juncture of the Trager-Smith

phonemic system is to be interpreted as being only

roughly equivalent to the space in orthography, since

in this count # subsumes four degrees of pause: +, |,

‖, and #. (See Trager and Smith, 1951.) It is of

interest to note that the frequency of occurrence of these

Table 3. Occurrence probabilities and diversities of the consonant stimuli ordered by intelligibility scores (I = % correct responses).

tableau all occurrences | occurrences after /#/ | /C/ | /m/ | /r/ | /l/ | /g/ | /f/ | /n/ | /d/ | /b/ | /w/ | /ʤ/ | /k/ | /v/ | /p/ | /t/ | /s/ | /h/ | /j/ | /z/ | sum

* Exclusive of diphthong off-glide occurrences

a based on four stimulus words

b based on two stimulus words

c based on one stimulus word

consonants has a rank-order correlation of .10 with the

per cent correct scores for all occurrences and -.05 for

frequencies following #. Since neither value significantly

exceeds a zero correlation, we may assume that

frequency of occurrence in the language is not the

determinant of the intelligibility of these consonants.

Stated conversely, we may conclude that the test is apparently

free of obvious frequency contamination, justifying

the claim that the instrument is a test of phonemic

discrimination. The importance of this conclusion is

enhanced when one considers that these 18 phonemes

appearing in initial position account for 68% of all obtained

initial phonemes.
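The rank-order correlations reported here can be computed as in the following sketch, which takes the Pearson correlation of the ranks and assigns average ranks to ties. The source does not say how ties were handled, so that choice is an assumption, and the two short lists in the example are placeholder values, not the data of Table 3.

```python
import math


def average_ranks(values):
    """Rank values from 1 upward, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1       # positions i..j share this 1-based rank
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)


def rank_order_correlation(x, y):
    """Spearman rank-order correlation: Pearson r computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Placeholder values only: intelligibility (%) and language frequency counts.
intelligibility = [72.0, 65.5, 61.0, 58.2, 40.1]
frequency = [310, 905, 120, 640, 455]
print(round(rank_order_correlation(intelligibility, frequency), 2))
```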

The remaining columns of Table 3 display the diversity

(D) and information measures (H) for these

consonant phonemes. The D index indicates the total

number of phonemes found to follow the consonant in

question, the highest value being 42 in the Trager-Smith

system. For example, the table indicates that /m/ in

all positions was followed by a total of 22 phonemes

and the combination /#m/, i.e., /m/ in initial position,

was followed by a total of 12 different phonemes. The

H statistic is that devised by Shannon (1949) and is

calculated from the formula:

H = -Σ p_{i} log_{2} p_{i}

The statistic indexes the shape of the probability distribution

of the phoneme types found to follow each of

the consonants. If all following phonemes were equiprobable

in occurrence, the value of H would be maximal.

On the other hand, if only one phoneme is found

to follow a given consonant, regardless of its frequency

of occurrence, the minimum H value of zero would be

obtained. In order to compare the H values of phonemes

for which the language permits differing numbers

of following phonemes, we are required to calculate the

relative amount of information (H_{rel}), i.e., the amount

of information relative to the maximum possible, log_{2}N,

where N is the obtained number of following phonemes.

Thus we see, for example, that the distribution of the

fewer possible following phonemes, given knowledge of

/#m/, provides more relative information (more uncertainty

as to following phoneme) than does the knowledge

of /m/ alone; the relative information values being

.85 and .76, respectively.
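As an illustration of the three measures just defined, the sketch below computes D, H, and H_{rel} from a list of counts of the phonemes following a given consonant. The function names and the counts in the example are hypothetical; they are not drawn from the Hultzén, Allen, and Miron tables.

```python
import math


def diversity(counts):
    """D: number of distinct following phonemes actually observed."""
    return sum(1 for c in counts if c > 0)


def information_bits(counts):
    """H = -sum(p_i * log2(p_i)) over the observed following phonemes."""
    total = sum(counts)
    probabilities = [c / total for c in counts if c > 0]
    return -sum(p * math.log2(p) for p in probabilities)


def relative_information(counts):
    """H_rel = H / log2(N), where N is the obtained number of following phonemes."""
    n = diversity(counts)
    if n <= 1:
        return 0.0                # a single follower carries no uncertainty
    return information_bits(counts) / math.log2(n)


# Hypothetical counts of phonemes following some consonant.
following_counts = [40, 25, 20, 10, 5]
print(diversity(following_counts))
print(round(information_bits(following_counts), 2))
print(round(relative_information(following_counts), 2))
```

Equiprobable followers give H_{rel} = 1, and a single follower gives H = 0, matching the two limiting cases described in the text.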

Thus, even though /#m/ has fewer possible following

phonemes and hence lower absolute H, the probabilities

of these following phonemes are more nearly

equal in the language than is the case for the phonemes

found to follow /m/ alone. Of the 17 consonant phonemes

of the Rhyme Test for which the measure was

calculated (H was not conveniently calculated where

diphthong off-glide occurrences are excluded), 10 show increases

in relative information when found in the initial

position. As before, the rank-order correlations between

intelligibility scores and these information measures are

Table 4. Consonant confusion matrix of the three highest-ranking responses to each stimulus.

tableau responses | /p/ | /t/ | /k/ | /f/ | /s/ | /h/ | /w/ | /j/ | /l/ | /r/ | /m/ | /n/ | /b/ | /d/ | /g/ | /v/ | /z/ | /ʤ/ | stimuli

* Tied

essentially zero. The fact that none of the frequency

measures correlate with the obtained intelligibility scores

and that the initial consonant position yields essentially

flat distributions of the probabilities of the bi-phoneme

transitions indicates again that the test is a relatively

pure measure of phonemic discrimination.

Table 4 presents the confusions found among the consonant

stimuli. For purposes of simplification, only the

ranks of the first three most frequent consonant responses

to each stimulus have been recorded. Although no particular

justification for the ordering of the consonants as

presented in the table is offered, the ordering employed

provides a reasonably compact distribution of the errors

about the accurate-identification diagonal. It should be

noted that the most salient characteristic of this display

is that the voiced and voiceless elements are separated

and that no confusions, at least for the first three most

frequent responses, ever involve voicing errors. If the

distinctive feature analysis of Jakobson, Fant, and Halle

(1952) were used to order the consonants, the eight

features 3 suggested by them would produce the following

metric of consonant differences:

tableau n 2.71 m 2.84 /d3/ 3.04 | m 2.74 m 2.87 w 3.05 | /b/ 2.75 m 2.87 a/ 3.06 | a/ 2.76 m 2.87 m 3.19 | /d/ 2.76 1*1 3.01 /j/ 3.19 | an/ 2.83 a/ 3.01 a/ 3.29

The values for each consonant were derived by taking

the square root of the sum of squares of the number of

distinctive features separating each from all other consonants.

This procedure is based on the generalized distance

measure for points in a multi-dimensional space.

The values indicate the average distance separating each

consonant from all others in the eight-dimensional space

defined by the distinctive features. The greater the difference

between the values of any two consonants, the

greater the distance between those consonants in the

space and, hence, the greater the number of distinctive

features required to distinguish them. Confusions, therefore,

would be predicted to occur between consonants

having close separation values. A cursory glance at the

confusion matrix will confirm, however, that such is not

the case. Obviously, the separation index provided by

the distinctive feature analysis fails to reproduce the

order of confusions obtained, the largest disparity occurring

relative to the handling of the voicing feature. The

intelligibility data clearly indicate that the distinctive

features discriminating these consonants are not all

equally important to the listener, with the voicing feature

far outweighing any other.
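The separation index described above can be sketched as follows. The feature vectors are arbitrary placeholders with neutral labels, not the Jakobson, Fant, and Halle specifications of any actual consonant. The text gives the root of the sum of squared feature differences but also calls the result an average distance, so the optional normalization by the number of comparisons is an assumption about what may have been computed.

```python
import math


def feature_distance(a, b):
    """Number of distinctive features on which two consonants differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def separation_index(features, normalize=False):
    """Root of the summed squared feature distances from each consonant to all others.

    With normalize=True the sum is first divided by the number of other
    consonants (a root-mean-square distance); this variant is an assumption.
    """
    index = {}
    for name, vector in features.items():
        squared = [
            feature_distance(vector, other_vector) ** 2
            for other_name, other_vector in features.items()
            if other_name != name
        ]
        total = sum(squared)
        if normalize:
            total /= len(squared)
        index[name] = math.sqrt(total)
    return index


# Placeholder consonants, each coded on eight binary features.
placeholder_features = {
    "C1": (0, 1, 0, 1, 0, 1, 0, 0),
    "C2": (0, 1, 0, 1, 0, 0, 0, 0),
    "C3": (0, 1, 1, 0, 0, 1, 1, 0),
    "C4": (0, 1, 0, 0, 1, 0, 0, 0),
}
for name, value in separation_index(placeholder_features, normalize=True).items():
    print(name, round(value, 2))
```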

Table 5 presents the reliabilities and intercorrelations

of the five forms of the test. The reliability estimates

are based on the intercorrelations of the item scores

which in turn are based on a random split of the 80 Ss

into two subgroups of 40. The intercorrelations were

also based on item statistics but use the full complement

of Ss. The reliability coefficients are all uniformly high.

The form intercorrelations are comparatively low and

Table 5. Form reliability and intercorrelation coefficients.

tableau RT-1 | RT-2 | RT-3 | RT-4 | RT-5

Table 6. Intercorrelation coefficients based on 14 phonemes common

to all forms.

tableau RT-1 | RT-2 | RT-3 | RT-4 | RT-5

undoubtedly reflect the fact that each form contains a

different distribution of both consonants and consonant-vowel

transitions. The form intercorrelations, based on

only those stimulus words containing the 14 consonants

common to all forms, are displayed in Table 6. It can

be seen that substantial increases in the intercorrelations

are obtained when the intercorrelations are restricted to

these common items.
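A minimal sketch of the split-half item reliability described above: the Ss are divided at random into two equal subgroups, the per cent correct for each item is computed within each subgroup, and the two sets of item scores are correlated. The response matrix here is simulated, and the function names, seeds, and simulated item difficulties are assumptions. No correction of the coefficient is applied, since the source mentions none.

```python
import math
import random


def pearson(x, y):
    """Product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)


def item_scores(responses, subject_ids):
    """Per cent correct on each item for the given subjects (1 = correct)."""
    n_items = len(responses[0])
    return [
        100.0 * sum(responses[s][i] for s in subject_ids) / len(subject_ids)
        for i in range(n_items)
    ]


def split_half_item_reliability(responses, seed=0):
    """Correlate item scores from two random halves of the subject sample."""
    rng = random.Random(seed)
    ids = list(range(len(responses)))
    rng.shuffle(ids)
    half = len(ids) // 2
    first, second = ids[:half], ids[half:2 * half]
    return pearson(item_scores(responses, first), item_scores(responses, second))


# Simulated data: 80 Ss by 50 items, with item difficulty varying around 50%.
sim = random.Random(1961)
difficulties = [0.2 + 0.6 * i / 49 for i in range(50)]
simulated = [[1 if sim.random() < p else 0 for p in difficulties] for _ in range(80)]
print(round(split_half_item_reliability(simulated), 2))
```

The same `item_scores` function applied to all 80 Ss on two different forms would give the inputs for a form intercorrelation, provided the items are matched across forms.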

The uniformly high reliabilities, freedom from language

frequency effects, and the implications of the data

derived for the development of an intelligibility index of

distinctive features, as well as the multitude of clinical

applications, indicate that the Rhyme Test can be both

an efficient and practical tool in research and testing.

References

Hultzén, L. A., Allen, J. H. D., and Miron, M. S. *Tables of transitional frequencies of English phonemes*. Urbana:

University of Illinois Press, 1964.

Jakobson, R., Fant, C. G. M., and Halle, M. *Preliminaries to speech analysis: the distinctive features and their correlates*.

Cambridge, Mass.: MIT Press, 1952.

Shannon, C. E., and Weaver, W. *The mathematical theory of communication*. Urbana: University of Illinois Press,

1949.

Trager, G. L., and Smith, H. L., Jr. *An outline of English structure*. Studies in Linguistics, Occasional Papers 3.

Washington, D.C.: American Council of Learned Societies,

1951.

1 The information analysis of the frequencies of the consonant

phonemes has been added by the editor. Any errors in interpretation

or execution for these data are solely his.

2 Vowel-to-noise ratio calculated on the basis of VU readings.

Fairbanks' convention of referring thus to speech-to-noise ratios

is followed in concurrence with his belief that such designation

is more accurate than the more conventional S/N (Ed.).

3 The eight distinctive features are: vocalic/non-vocalic, consonantal/non-consonantal,

compact/diffuse, grave/acute, nasal/oral,

tense/lax, continuant/interrupted, strident/mellow.