A Comparative Study of Declarative Intonation
in American English and Spanish *1

In order to discover the true characteristic contours of intonation in a given language,
one must observe speech in its most natural state, as it exists in conversation or as it is
recorded from the platform in extemporaneous expression. The Spanish material
which will be used here is part of an interview with Diego Rivera (DR) on the subject
of El papel social del artista, recorded by Caedmon Records, TC 1065. We are assured
that, from the moment Rivera's daughter asked the first question, the speaker forgot
notes and microphone and spoke freely. Within the body of the conversation one
finds efforts and hesitations which assure us of its unaffected diction.

To get something approaching the naturalness of our Spanish material we looked
for a recording of American speech, similarly presented without notes, and found
it in a series of extemporaneous lectures given by the anthropologist Margaret Mead
(MM) entitled Stripped Universals for a World-wide Culture. Her delivery resembles
that of DR in the number and frequency of the hesitations, in the occasional groping
for the right word, typical of spontaneous speech.

In both samples of speech we attempted to determine:

a) the distinctions in meaning supported by intonation,

b) to what extent these distinctions are similar in the two languages or are employed
to the same degree,

c) in what manner intonation is materialized for these distinctions: through what
melodic curves, by what frequency contours.

In presenting the contours, we do not use schematic notations, such as dots and
dashes, but the actual curve of moment-to-moment frequency variations extracted
electronically at the price of infinite time and patience. Our purpose is practical as
well as scientific: we hope to help the student not merely to make himself understood,
but to approach a native pronunciation.

We limit our investigation to unemphatic declarative intonation in its basic contrastive
forms: the expression of finality and the expression of continuation.

First, a word about our technique of analysis. We made, from the five-minute
recording of each of the languages, two types of spectrograms, both of them with
a narrow filtering that reveals the harmonic structure of the formants. One type, at a
scale of 2000 cycles per inch, makes it possible to read the formants and thus to
segment (divide into consonants and vowels) precisely. The other type, made at a
scale of 200 cycles per inch, shows only a few of the low harmonics, but these are
amplified ten times in order to make the rise and fall of the melodic line more apparent.
The picture of the variations in fundamental frequency (which corresponds to the fall
and rise of the voice and generally reflects the subjective impression of intonation
contours) can be observed and studied directly on the two types of spectrograms:
on the 200 scale by following the movement of the first (fundamental) or of the second
harmonic; on the 2000 scale, by following the movement of a high harmonic such as
the tenth. With our eye on the harmonic movements in the spectrograms, we listened
to our recording of the speakers, syllable by syllable, at both normal and reduced
speed, and noted what, in the intonation curves, corresponded to distinctive impressions
by ear. By this technique of spectrographic analysis, we hope to add substantial
and valuable knowledge to the already impressive work done in the field of
Spanish and comparative Spanish-English studies in intonation by Tomás Navarro
and several American linguists such as Bolinger, Bowen, Stockwell, Silva-Fuenzalida
and Cárdenas. Our objective results generally confirm the subjective notions on
which those linguists agree, but they provide, in addition:

a) the statistical element which indicates the amount of leeway that exists within
an acceptable pronunciation, and

b) the detailed shape of actual contours which is needed to complement the schematic
or numerical notations that are commonly used.
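
The reading technique described above rests on a simple relation: the k-th harmonic lies at k times the fundamental, so the fundamental can be recovered from any harmonic that is easy to follow on the spectrogram. The following is a minimal sketch of that conversion, not the procedure actually used for the figures. The function names and the sample readings are hypothetical.

```python
def fundamental_from_harmonic(harmonic_hz: float, harmonic_number: int) -> float:
    """Return the fundamental frequency implied by one harmonic reading."""
    if harmonic_number < 1:
        raise ValueError("harmonic_number must be 1 or greater")
    return harmonic_hz / harmonic_number


def contour_from_harmonic_track(track_hz, harmonic_number):
    """Convert successive readings of one harmonic into a fundamental contour."""
    return [fundamental_from_harmonic(f, harmonic_number) for f in track_hz]


# Hypothetical readings (Hz) of the tenth harmonic on the 2000 scale.
tenth_harmonic_track = [1100.0, 1250.0, 1400.0, 1400.0]
print(contour_from_harmonic_track(tenth_harmonic_track, 10))
```

Because a movement of the fundamental is multiplied by ten on the tenth harmonic, a high harmonic makes small melodic movements easier to see, which is presumably why it is followed on the 2000 scale.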

Figures 1 and 2 present frequency variations taken from the spectrograms for
sentences by DR and MM. (For practical purposes, the time scale is approximate.)
The two horizontal lines designate the limits of an octave for MM, and the limits of
nearly an octave and a half for DR. We note that although the Spanish speaker has
the wider pitch-range, both speakers have approximately the same lower limit. The
coincidence of the lower limit is to be attributed more to MM's unusually low voice
for a woman than to DR's high pitch for a man. This fact creates rather favorable
conditions for comparison in spite of our two subjects being a man and a woman.
In the following discussion we shall make frequent reference to these figures. Small
letters, such as a-3, will refer to the Spanish text (Figure 1); capital letters, such as
A-3, to the American text (Figure 2). If we occasionally use the terms Spanish intonation
and American intonation in the course of this study, it is because comparison
with numerous other informants has led us to believe that our two subjects have
intonation contours that are sufficiently typical of their respective tongues to justify
the extrapolation.
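
The pitch ranges mentioned above can be put on a common scale by converting frequency ratios to semitones, an octave being 12 semitones. The sketch below assumes a shared lower limit of 100 Hz purely for illustration. That value and the names used are hypothetical and are not measurements from the recordings.

```python
import math


def semitones_between(low_hz: float, high_hz: float) -> float:
    """Size of the interval from low_hz up to high_hz, in semitones."""
    return 12.0 * math.log2(high_hz / low_hz)


shared_floor_hz = 100.0                        # hypothetical common lower limit
mm_ceiling_hz = shared_floor_hz * 2.0          # one octave above the floor
dr_ceiling_hz = shared_floor_hz * 2.0 ** 1.5   # an octave and a half above the floor

print(semitones_between(shared_floor_hz, mm_ceiling_hz))
print(semitones_between(shared_floor_hz, dr_ceiling_hz))
```

On this scale MM's range is 12 semitones and DR's, described as nearly an octave and a half, is somewhat under 18.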

1. Continuation in General

Before examining particular cases of continuation, let us compare Spanish and
American continuation patterns as a whole, as they exist for our two subjects.

The outstanding difference in the manner of expressing continuation bears on the
last stressed syllable of sense-groups: for DR, the major portion of the last stressed
syllable is generally rising (a-2, a-3, b-1), and the subjective impression it causes is
predominantly one of ascent; for MM, the major portion of the last stressed syllables
is generally falling (A-1, A-2, A-3, B-1), and the subjective impression it causes is
substantially one of descent.

Statistically, the predominance of ascent in Spanish continuation and of descent in
American continuation is confirmed. Out of 139 continuation groups of DR, 92 show
a long rise on the last stressed syllable followed by a short high plateau which may
be on the stressed syllable itself (a-2, a-7, b-3) or on a subsequent unstressed syllable
(a-3, b-1, a-12); 22 show a long rise followed by a short fall which may also be either
on the high end of the stressed syllable itself (a-10) or on subsequent high unstressed
syllables (b-2, a-15); and only 25 show a rise that is shorter than or equal to the following
descent (a-8, a-9). Out of 207 continuation groups of MM, 145 show a long
descent most often somewhat in the shape of an inclined tilde or a sloping reversed-S
(B-1, A-4, A-7, A-8); 40 show a similar long descent followed by a short rising hook
(A-1, A-2, A-10, B-3); and only 22 show a rise, often slightly in S-shape (B-2, A-19).
The end of the tilde or the rising hook can naturally be on a subsequent unstressed
syllable if there is one (A-2, B-3) as well as on the stressed syllable itself (A-1,
A-7, B-4).
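
The counts given above can be turned into percentages with a few lines of code. This is only a convenience sketch. The counts come from the text, while the category labels and the function name are our own shorthand.

```python
# Contour counts for continuation groups, as reported in the text.
dr_counts = {
    "long rise + short high plateau": 92,
    "long rise + short fall": 22,
    "rise shorter than or equal to following descent": 25,
}
mm_counts = {
    "long descent": 145,
    "long descent + short rising hook": 40,
    "rise": 22,
}


def shares(counts):
    """Return each category's share of the total, in per cent (one decimal)."""
    total = sum(counts.values())
    return {label: round(100.0 * n / total, 1) for label, n in counts.items()}


for speaker, counts in (("DR", dr_counts), ("MM", mm_counts)):
    print(speaker, sum(counts.values()), shares(counts))
```

The three DR categories come to about 66, 16 and 18 per cent of 139 groups, and the three MM categories to about 70, 19 and 11 per cent of 207 groups.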

In addition to the difference in direction, there is one in the detailed shape of the
continuation pattern. For DR the typical shape is simple. It generally consists of
two portions only: a swift rise plus a plateau or a descent — only one change of direction
and a sharp one at that. For MM it is more complex. The tilde as well as the
S generally show at least two changes in direction and the bends are quite gentle.

Cases like those of the series a-8, a-9, a-10, which are noticeable for having very
pronounced descending portions in the continuation pattern, are included in our
statistics, but it is clear that they do not represent the norm and are not free from special
implication. Here in a-8 through a-10, the sudden change in intonation contour for
continuation is used by DR to recapture the attention of the listener for a new and
important idea. It does express continuation, but with special emphasis. The norm
for the expression of continuation is more likely to be in a series like a-1, 2, 3, 4, 5, 6,
b-1, where a long, sharp, direct rise is regularly followed by a high plateau.

The lack of rising intonation to express continuation in American English is well
confirmed by previous studies. Kenneth Pike (The Intonation of American English,
Ann Arbor, 1948, p. 155) presents the following statistics for continuation in several
passages of conversational prose: descending contours: 206, falling-rising contours:
77, level contours: 163, rising contours: 47, with rising patterns constituting only
9.6 per cent of the total contours. Milton Cowan (Pitch and Intensity Characteristics of Stage Speech, Iowa City, 1938, p. 63) summarizes voluminous statistics in this
manner: “… 63 per cent of all phrases ended with falling inflection, 12 per cent with a
rising inflection, and 25 per cent with a level intonation.”

Thus in general, we find two basic points of contrast in a comparison of Spanish
and American continuation: that of general direction — mainly rising for Spanish,
mainly falling for American, and the shape that these directions take — a simple
movement for Spanish, a complex one for American.

2. Two Forms of Continuation

Continuation can take two distinctive forms in Spanish. In a sentence of three
groups, Tomás Navarro (Entonación española, New York, 1948, p. 56) shows a
contrast in intonation which is correlated with a change in meaning. Here are the
two divisions he proposes.

El general se muestra emocionado como un muchacho ingenuo // ante su monumento.

El general se muestra emocionado // como un muchacho ingenuo | ante su monumento.

By dividing the sentence after ingenuo, group 3 will have type B intonation (major
continuation), which contrasts with type A intonation (minor continuation) in
groups 1 and 2. (B often rises higher than A, but may be distinguished by other factors
which we shall mention later.) By dividing the sentence after emocionado, group 2 has
B intonation, while groups 1 and 3 have A intonation. This difference between A and
B is distinctive since it changes the meaning of the sentence — when the word ingenuo
has B intonation, the general is standing in front of the monument; when the word
emocionado has B intonation, he is not. Thus the role of B intonation is to delimit
the major parts of a sentence, to add clarity to expression. Often it groups several
thoughts into a more complete, larger idea. This is done regularly and clearly by DR.
Five examples of the type B continuation can be observed in Figure 1, including the
following one:

De manera que (a-1) aquella sociedad (a-2) de cazadores (a-3) migratorios (a-4)
que caminaron (a-5) desde el polo sur (a-6) a casi el polo norte (b-1)

B intonation on the word norte indicates that this is a major division in the sentence
and that the seven small groups from a-1 through b-1 unite to form a single large

Out of 139 units of continuation by DR, we identify 31 B continuation groups
which are easily confirmed both by ear and by visual examination of the spectrograms.
Moreover, they occupy a logical position in the sentence to fulfill their role of major

3. Realization of the Contrast A/B: Minor Continuation,
Major Continuation

A spectrographic comparison of all the B continuation contours with all the A continuation
contours of DR permits the differences to be separated into four features.

a) The B contours always rise throughout the major portion of the last “stressed
syllable agglomerate” of the sense-group, (b-1, 2, 3, 4). (By “agglomerate” we mean
the stressed syllable plus the subsequent unstressed ones if any.) The A contours
have a few exceptions: 25 out of 108, (a-8, 9, 10). In these exceptions the A/B contrast
is obvious: A shows a predominance of descent; B shows a predominance of ascent.
Compare a-8, 9, 10 with b-2.

b) The ascending slope is on the average sharper (closer to the vertical) for B than
for rising A. Compare the slopes of b-1, 2, 3, 4 with those of a-2 through a-6.

c) In the majority of cases, B continuation rises visibly higher than the A continuations
that immediately precede it. Compare a-12, 13 with b-3. Compare also
a-8, 9, 10 with b-2.

d) At times, however, the B ascent does not actually reach a higher frequency than
the preceding A ascents. And yet the subjective impression is that it does. b-1 for
instance is not higher than a-6, yet it is identified as if it were. It must be, then,
that our perception is associated with another factor. This factor might well be the
range of the frequency-rise, since B has, on the average, a greater range than A. As in
the case of b-1 compared with a-6, the voice drops lower before rising for B continuation
than for A continuation.
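
The features b) through d) above (slope, height and range of the rise) can each be computed from a sampled contour. The following is a sketch under stated assumptions: a contour is taken to be a list of (time in seconds, frequency in Hz) samples over the last stressed syllable agglomerate, and the two sample contours are invented to imitate the relation between a-6 and b-1. They are not values read from Figure 1.

```python
import math


def semitones(from_hz: float, to_hz: float) -> float:
    """Interval from from_hz to to_hz in semitones (positive when rising)."""
    return 12.0 * math.log2(to_hz / from_hz)


def rise_features(contour):
    """Return peak, range and slope of the rise leading up to the peak."""
    peak_t, peak_hz = max(contour, key=lambda p: p[1])
    before_peak = [p for p in contour if p[0] <= peak_t]
    low_t, low_hz = min(before_peak, key=lambda p: p[1])
    rise_range = semitones(low_hz, peak_hz)
    duration = peak_t - low_t
    slope = rise_range / duration if duration > 0 else 0.0
    return {"peak_hz": peak_hz, "range_st": rise_range, "slope_st_per_s": slope}


# Hypothetical contours: same peak, but the B contour starts from a lower pitch.
a6 = [(0.00, 120.0), (0.10, 135.0), (0.20, 150.0)]
b1 = [(0.00, 100.0), (0.10, 125.0), (0.20, 150.0)]

for name, contour in (("a-6", a6), ("b-1", b1)):
    print(name, rise_features(contour))
```

With equal peaks, the B contour in this toy example still shows a greater range and a sharper slope, which is the situation described in d).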

For MM, it appears that this contrast is made less regularly and less clearly than
for DR. In the speech of many Americans it probably does not function at all.
Nevertheless, definite signs of its existence in the lecture by MM were found. Having
identified, in 207 continuation groups, 38 which would require form B to unite short
groups logically together, we were able to identify them by ear. Moreover, we found
that, on spectrograms, the 38 B are distinguished from the 169 A by several subtle
yet visible features:

a) The proportion of rising contours is greater for B: 10 out of 38 B (on Fig. 2,
we find, for example, one rising B, B-2, vs. three falling B, B-1, 3, 4), as opposed to
12 out of 169 A (on Fig. 2, we find no rising A as opposed to 20 falling A's).

b) The proportion of rising hooks after falling pitch is greater for B: 14 out of 38
B, as opposed to 26 out of 169 A, (B-3 vs. B-1, 4; A-1, 2, 10 vs. 17 others).

c) When continuation has a descending form (14 out of 38 B, 131 out of 169 A),
on the average it is more abruptly falling for B than for the preceding A, (compare
A-3 with B-1, A-11 with B-3).
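
The three proportions quoted in a) through c) can be compared side by side, and the A and B counts can be checked against the overall MM totals given in section 1. The sketch below uses only counts stated in the text. The labels and the helper name are our own.

```python
# MM continuation groups by shape, split into major (B) and minor (A).
mm_b = {"rising": 10, "falling + rising hook": 14, "descending": 14}
mm_a = {"rising": 12, "falling + rising hook": 26, "descending": 131}

# Overall MM totals reported in section 1.
mm_all = {"rising": 22, "falling + rising hook": 40, "descending": 145}


def percent(part: int, whole: int) -> float:
    """Share of part in whole, in per cent, rounded to one decimal."""
    return round(100.0 * part / whole, 1)


for shape in mm_b:
    assert mm_a[shape] + mm_b[shape] == mm_all[shape]  # A + B matches section 1
    b_share = percent(mm_b[shape], sum(mm_b.values()))
    a_share = percent(mm_a[shape], sum(mm_a.values()))
    print(f"{shape}: B {b_share}% vs. A {a_share}%")
```

Rising contours make up about 26 per cent of B against 7 per cent of A, and rising hooks about 37 per cent of B against 15 per cent of A.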

To summarize, for DR, the A/B contrast of continuation is either an obvious
one of falling versus rising (the exception), or a more subtle one between two rising
contours (the norm). In the latter case, B is distinguished from A by its sharper rise,
and/or its higher rise, and/or its greater range of rise. For MM, when the A/B
contrast of continuation is made perceptible, it is realized by stressing the B descents
as well as the B ascents, or by a more frequent use of the rising hook after descent
for B.

De manera que, aquella sociedad de cazadores migratorios
que caminaron desde el polo sur a casi el polo norte
tenían que tener una capacidad vital tremenda. Casi podemos
decir que es indudable que las gentes que hicieron aquellas
pinturas pertenecieron a una sociedad que no estaba dividida
en clases. De manera que en realidad el arte es una
necesidad vital para el ser humano. Después, pueden venir
las otras atribuciones. Pero, esencialmente, es una necesidad.

Figure 1. Examples of frequency variation contours from recording by Diego Rivera.

We don't know for certain whether human beings would even walk
upright if they were left to themselves with nobody to tell them
they should. And, but, we're not able to perform any very good
experiments along these lines because the only suggested experiments
are the so-called “wolf children” in India who have now been pretty
well debunked and who are believed to be miserable, disturbed,
defective children who run away from home and are reidentified
a few days later as having lived their lives with wolves.

Figure 2. Examples of frequency variation contours from recording by Margaret Mead.

4. Finality (C)

As would be expected, finality is identified by a descent in pitch in both languages.
The 23 sentence terminals of MM, as well as the 20 of DR, show this clearly. Moreover,
the major descent for both speakers takes place within the last stressed agglomerate
of the sentence. However, beyond these generalizations, differences in detail are
quite apparent, and can be best described in four steps.

a) The most striking difference between Spanish and American expressions of
finality in sentence terminals lies in the relation between the final stressed syllable and
the preceding unstressed syllable. For MM, the unstressed syllable that precedes the
sentence-final stress is usually low and falling, such as it is before all other sense-group
final stresses, (C-1, C-2). For DR, the unstressed syllable that precedes the final stress
is regularly high and flat in a manner that inescapably announces the sharp fall of
the final stressed syllable that follows, (c-1, 2, 3, 4, 5).

b) The final stressed syllable itself is quite different for our two subjects. For
MM it usually rises before falling, as in wolves (C-2). For DR, there is seldom any
rise before the descent; the fall usually proceeds from the very onset of the stressed
syllable (c-1, 2, 3, 4, 5), at, or near, the frequency of the preceding unstressed syllable.

c) The manner of descent is more leisurely and winding for MM than for DR.
The shape of MM's contour usually recalls that of a tilde with its two changes of
direction, but the concave ending is less pronounced than for continuation (C-1, C-2).
If unstressed syllables follow, they usually complement the concave ending in a downward
slope. (There are no examples of this on Fig. 2, all C endings being monosyllabic.)
For DR the descent is neither leisurely nor winding, but abrupt and direct. Its
shape approaches that of a straight line with no pronounced change of direction. If
unstressed syllables follow, they usually form a very low plateau at the frequency
level reached by the sharp descent of the stressed syllable, (c-1, 2, 3, 4).

d) The tempo of the down slope in the sentence-final stressed syllable was measured.
On the average, DR descends 9.18 semitones per 0.1 second, MM 5.76 semitones.
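
A rate in semitones per 0.1 second, as used in d), can be obtained from the frequencies at the start and end of the fall and from its duration. The sketch below shows one way to compute it and compares the two reported averages. The function and the sample fall are hypothetical, and only the figures 9.18 and 5.76 come from the text.

```python
import math


def descent_rate(start_hz: float, end_hz: float, duration_s: float) -> float:
    """Semitones descended per 0.1 second over a fall from start_hz to end_hz."""
    semitone_drop = 12.0 * math.log2(start_hz / end_hz)
    return semitone_drop * 0.1 / duration_s


# Hypothetical fall of one octave (12 semitones) in 0.2 second.
print(descent_rate(200.0, 100.0, 0.2))

# Averages reported in the text, in semitones per 0.1 second.
dr_rate, mm_rate = 9.18, 5.76
print(round(dr_rate / mm_rate, 2))
```

On these averages DR's final descent is roughly 1.6 times as fast as MM's.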

In short, we find that both the form and manner of the pitch contour for finality in
the two languages are different. On the stressed syllable, American rises, then
descends slowly; Spanish, without having risen, descends abruptly. The preceding
unstressed syllables in American are lower than the final stressed ones and of approximately
the same level as the other unstressed syllables of the sentence, whereas Spanish
unstressed syllables in the same position retain relatively high pitch which announces
the descent for finality. Possible unstressed syllables after the final stress in American
are usually incorporated into the gradual descent of finality much more than in
Spanish where the unstressed syllables are nearly level at a low frequency.

5. AB/C Contrast: Continuation/Finality

The contrast continuation/finality is quite sharp in Spanish for, in the great majority
of cases, predominantly rising continuation opposes predominantly falling finality
(a-13, b-3 vs. c-3). In addition, the rising continuation of the stressed syllable is
preceded by a low unstressed syllable whereas the falling finality of the stressed
syllable is preceded by a high unstressed syllable. In the few cases where falling
continuation opposes falling finality, the contrast is nonetheless clear, for falling
continuation is preceded by a low pitch (a-8, 9, 10) whereas the falling finality is
preceded by a high pitch (c-3, 4, 5). Besides, the descent is much smaller for falling
continuation than for falling finality.

This contrast is not so clear for our American informant. As we have seen in the
speech of MM, the great majority of continuation contours in American show a
falling pattern. This descent is often severe enough to approximate the sentence-final
descent. This AB/C contrast for American is thus between two degrees of
descent and can be quite ambiguous (A-1, B-1 vs. C-1).
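
The Spanish side of this contrast is regular enough to be written down as a small decision rule. The sketch below is only a restatement of the description above in code, for DR's patterns. The function name and the labels "low", "high", "rise" and "fall" are our own shorthand. No such rule is attempted for the American contours, where the contrast is one of degree.

```python
def classify_spanish_group(pre_stress_level: str, stressed_direction: str) -> str:
    """Classify a sense-group ending from two coarse cues.

    pre_stress_level: "low" or "high", the pitch of the unstressed syllable
        that precedes the last stressed syllable.
    stressed_direction: "rise" or "fall", the predominant movement of the
        last stressed syllable.
    """
    if stressed_direction == "rise":
        return "continuation"
    if stressed_direction == "fall":
        if pre_stress_level == "high":
            return "finality"
        return "falling continuation"
    raise ValueError("stressed_direction must be 'rise' or 'fall'")


print(classify_spanish_group("low", "rise"))   # the norm, as in a-13 or b-3
print(classify_spanish_group("low", "fall"))   # the exception, as in a-8, 9, 10
print(classify_spanish_group("high", "fall"))  # sentence-final, as in c-3, 4, 5
```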

Conclusion and Summary

A detailed spectrographic analysis of two lectures, chosen for their realistic naturalness,
has permitted us to present graphically and describe, for a Spanish subject
(DR) and for an American subject (MM), the patterns of three types of intonation
within the declarative sentence: minor continuation (A), major continuation (B) and
sentence-final intonation (C), comprising two contrasts: A/B (minor continuation/
major continuation) and AB/C (continuation/finality).

Briefly, we can state that continuation is substantially rising in Spanish and predominantly
falling in American English. Figure 3 presents in schematized form the
differences in contour. Spanish continuation typically rises on the last stressed syllable
of the sense-group. This rise is preceded by a low and flat unstressed syllable
and followed by a high plateau which is continued by subsequent unstressed syllables
(if any). American continuation typically shows the last stressed syllable of a sense-group
rising briefly before a long fall which ends in a short rising hook or a suggestion
of one. The shape is typically that of a tilde. This tilde is preceded by a low and
falling unstressed syllable, and often followed by unstressed syllables that are incorporated
in the falling tail of the tilde.

In Spanish, major continuation is distinguished from minor continuation by the
rise which occurs more frequently, and is usually more rapid, and/or higher, and/or
of greater pitch-range (Fig. 3). In American this contrast is less marked and less
regular. When it is perceived, major continuation seems to be realized by more
frequent rising pitch, more frequent hooks after falling pitch, and by slightly more
pronounced descent or ascent.

[Figure 3 panels: Minor Continuation, Major Continuation, Finality. Rows: Spanish, "Obreros de Burgos salían"; American, "The workmen from Boston were leaving".]

Figure 3. Schematic representation of the most typical frequency variation contours emphasizing
differences between Spanish and American declarative intonation. Three-syllable sense groups with
accent on the middle syllable frequently occur in both languages and are used here for illustration.

Finality is mainly falling on the last stressed syllable of the sentence in both languages,
yet it offers striking differences (Fig. 3). The long fall of the stressed syllable
is typically winding and preceded by a rise in American, straight and preceded by no
rise in Spanish. The unstressed syllable that precedes is low in American, high in
Spanish. The low unstressed syllables that follow (if any) are more falling in American
than in Spanish.

The contrast continuation/finality (Fig. 3) is often ambiguous in American, where
it opposes two falling contours whose differences are subtle. In Spanish this contrast
is, on the contrary, very clear (Fig. 3). It opposes a sequence of low-rise-high to one
of high-fall-low.

