This guide is written for students who are following GCE Advanced level (AS and A2) syllabuses in English Language. This resource may also be of general interest to language students on university degree courses, trainee teachers and anyone with a general interest in language science.
This page uses IPA symbols - you need a Unicode font, such as Lucida Sans Unicode, installed on your computer system to see these display correctly. For example, the red character between these square brackets [ə] should appear as schwa (which looks like an e upside down). If the schwa symbol does not appear, you should go to the IPA Unicode site to download a suitable font.
You will also not see Unicode fonts in some early browser versions. If you use Microsoft's Internet Explorer, you need version 5.5 or later; if you use Netscape Communicator, you need version 6.0 or later. Click on the links below to get the latest versions of these browsers.
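If you want to check which character a page or document is really using, a few lines of Python may help. This is only a sketch: it relies on the standard unicodedata module, and the helper name describe is my own, chosen for illustration.

```python
import unicodedata

def describe(symbol: str) -> str:
    """Return the code point and official Unicode name of a single character."""
    return f"U+{ord(symbol):04X} {unicodedata.name(symbol)}"

# "\u0259" is the schwa character used in the example above.
print(describe("\u0259"))
```

If the name that is printed is not the one you expect, the problem lies in the text itself rather than in your font.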
On this page I use red type for emphasis. Brown type is used where italics would appear in print (in this screen font, italic looks like this, and is unkind on most readers). Headings have their own hierarchical logic, too:
Main section headings look like this
Sub-section headings look like this
Minor headings within sub-sections look like this
What is phonology?
Phonology is the study of the sound system of languages. It is a huge area of language theory and it is difficult to do more on a general language course than have an outline knowledge of what it includes. In an exam, you may be asked to comment on a text that you are seeing for the first time in terms of various language descriptions, of which phonology may be one. At one extreme, phonology is concerned with anatomy and physiology - the organs of speech and how we learn to use them. At another extreme, phonology shades into socio-linguistics as we consider social attitudes to features of sound such as accent and intonation. And part of the subject is concerned with finding objective standard ways of recording speech, and representing this symbolically.
For some kinds of study - perhaps a language investigation into the phonological development of young children or regional variations in accent - you will need to use phonetic transcription to be credible. But this is not necessary in all kinds of study. In an exam, you may be concerned with stylistic effects of sound in advertising or literature, such as assonance, rhyme or onomatopoeia, and you do not need to use special phonetic symbols to do this.
The physics and physiology of speech
Man is distinguished from the other primates by having the apparatus to make the sounds of speech. Of course most of us learn to speak without ever knowing much about these organs, save in a vague and general sense - so that we know how a cold or sore throat alters our own performance. Language scientists have a very detailed understanding of how the human body produces the sounds of speech. Leaving to one side the vast subject of how we choose particular utterances and identify the sounds we need, we can think rather simply of how we use our lungs to breathe out air, produce vibrations in the larynx and then use our tongue, teeth and lips to modify the sounds. The diagram below shows some of the more important speech organs.
A few people have the ability to interpret most of a speaker's utterances from lip-reading. But many more have a sense of when the lip-movement does or does not correspond to what we hear - we notice this when we watch a feature film with dubbed dialogue, or a TV broadcast where the sound is not synchronized with what we see.
The diagram can also prove useful in conjunction with descriptions of sounds - for example indicating where the airflow is constricted to produce fricatives, whether on the palate, the alveolar ridge, the teeth or the teeth and lips together.
Speech therapists have a very detailed working knowledge of the physiology of human speech, and of exercises and remedies to overcome difficulties some of us encounter in speaking, where these have physical causes. An understanding of the anatomy is also useful to various kinds of expert who train people to use their voices in special or unusual ways. These would include singing teachers and voice coaches for actors, as well as the even more specialized coaches who train actors to produce the speech sounds of hitherto unfamiliar varieties of English or other languages. At a more basic level, my French teacher at school insisted that we (his pupils) could produce certain vowel sounds only with our mouths more open than we would ever need to do while speaking English. And a literally stiff upper lip is a great help if one wishes to mimic the speech sounds of Queen Elizabeth II.
So what happens? Mostly we use air that is moving out of our lungs (pulmonic egressive air) to speak. We may pause while breathing in, or try to use the ingressive air - but this is likely to produce quiet speech, which is unclear to our listeners. (David Crystal notes how the normally balanced respiratory cycle is altered by speech, so that we breathe out slowly, using the air for speech, and breathe in swiftly, in order to keep talking.) In languages other than English, speakers may also use non-pulmonic sound, such as clicks (found in southern Africa) or glottalic sounds (found worldwide). In the larynx, the vocal folds set up vibrations in the egressive air. The vibrating air passes through further cavities which can modify the sound, and is finally articulated by the passive (immobile) articulators - the hard palate, the alveolar ridge and the upper teeth - and the active (mobile) articulators. These are the pharynx, the velum (or soft palate), the jaw and lower teeth, the lips and, above all, the tongue. The tongue is so important and so flexible an organ that language scientists identify different regions of it by name, as these are associated with particular sounds. Working outwards these are:
- the back - opposite the soft palate
- the centre - opposite the meeting point of hard and soft palate
- the front - opposite the hard palate
- the blade - the tapering area facing the ridge of teeth
- the tip - the extreme end of the tongue
The first three of these (back, centre and front) are known together as the dorsum (which is Latin for “back”).
Phonology, phonemes and phonetics
You may have known for some time that the suffix “-phone” is to do with sounds. Think, for instance, of telephone, microphone, gramophone and xylophone. The morpheme comes from Greek phōnē, which means “sound” or “voice”.
- Telephone means “distant sound”
- Microphone means “small sound” (because it sends an input to an amplifier which in turn drives loudspeakers - so the original sound is small compared to the output sound)
- Gramophone was originally a trade name. It comes from inverting the original form, phonograph (=sound-writing) - so called because the sound caused a needle to trace a pattern on a wax cylinder. The process is reversed for playing the sound back
- Xylophone means “wood sound” (because the instrument is one of very few where the musical note is produced simply by making wood resonate)
The fundamental unit of grammar is a morpheme. A basic unit of written language is a grapheme. And the basic unit of sound is a phoneme. However, this is technically what Professor Crystal describes as “the smallest contrastive unit” and it is highly useful to you in explaining things - but strictly speaking may not exist in real spoken language use. That is, almost anything you say is a continuum and you rarely assemble a series of discrete sounds into a connected whole. (It is possible to do this with synthesised speech, as used by Professor Stephen Hawking - but the result is so different from naturally occurring speech that we can recognize it instantly.) And there is no perfect or single right way to say anything - which is just as well, because we can never exactly reproduce a previous performance.
However, in your comments on phonology, you will certainly want sometimes to focus on single phonemes or small sequences of phonemes. A phoneme is a sound segment of words or syllables. Quite a good way to understand how it may indicate meaning is to consider how replacing it with another phoneme will change the word - so if we replace the middle sound in “bad” we can make “bawd”, “bed”, “bid”, “bird” and “bud”. (In two cases here one letter is replaced with two letters but in all these cases it is a single vowel sound that changes.)
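The substitution test described above can be sketched in a few lines of Python. This is an illustration only: the mini-lexicon, its RP transcriptions and the function names are my own assumptions, and “minimal pair” is the usual term for two words that differ in a single phoneme.

```python
from itertools import combinations

# Hypothetical mini-lexicon: each word is a tuple of phoneme symbols (RP values assumed).
LEXICON = {
    "bad": ("b", "æ", "d"),
    "bed": ("b", "ɛ", "d"),
    "bid": ("b", "ɪ", "d"),
    "bird": ("b", "ɜ:", "d"),
    "bud": ("b", "ʌ", "d"),
    "bat": ("b", "æ", "t"),
}

def is_minimal_pair(a, b):
    """True if two transcriptions have the same length and differ in exactly one segment."""
    return len(a) == len(b) and sum(x != y for x, y in zip(a, b)) == 1

def minimal_pairs(lexicon):
    """List every pair of words whose transcriptions differ by one phoneme."""
    return [
        (w1, w2)
        for (w1, t1), (w2, t2) in combinations(lexicon.items(), 2)
        if is_minimal_pair(t1, t2)
    ]

for pair in minimal_pairs(LEXICON):
    print(pair)
```

Storing each word as a sequence of symbols, rather than as spelling, is what lets “bird” (four letters) count as differing from “bad” in one sound only.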
The first people to write in English used an existing alphabet - the Roman alphabet, which was itself adapted from the Greek alphabet for writing in Latin. (In the Roman empire, Latin was the official language of government and administration, and especially of the army but in the eastern parts of the empire Greek was the official language, and in Rome Greek was spoken as widely as Latin, according to F.F. Bruce, in The Books and the Parchments, Chapter 5). Because these first writers of English (Latin-speaking Roman monks) had more sounds than letters, they used the same letters to represent different sounds - perhaps making the assumption that the reader would recognize the word, and supply the appropriate sounds. It would be many years before anyone would think it possible to have more consistent spelling, and this has never been a realistic option for writers of English, though spelling has changed over time. And, in any case, the sounds of Old English are not exactly the same as the sounds of modern English.
As linguists have become aware of more and more languages, many with sounds never heard in English, they have tried to create a comprehensive set of symbols to correspond to features of sound - vowels, consonants, clicks and glottalic sounds and non-segmental or suprasegmental features, such as stress and tone. Among many schemes used by linguists one has perhaps more authority than most, as it is the product of the International Phonetic Association (IPA). In the table below, you will see the phonetic characters that correspond to the phonemes used in normal spoken English. To give examples is problematic, as no two speakers will produce the same sound. In the case of the vowels and a few consonants, the examples will not match the sounds produced by all speakers - they reflect the variety of accent known as Received Pronunciation or RP. Note that RP is not specific to any region, but uses more of the sounds found in the south and midlands than in the north. It is a socially prestigious accent, favoured to a greater or lesser degree by broadcasters, civil servants, barristers and people who record speaking clock messages. It is not fixed and has changed measurably in the last 50 years. But to give one example, the sound represented by θ is not common to all UK native speakers. In many parts of London and the south-east of England the sound represented by f will be substituted. So, in an advertisement, the mother-in-law of Vinnie Jones (former soccer player for Wimbledon and Wales; now an actor) says: “I fought 'e was a big fug” (/aɪ fɔ:t i: wɒz ə bɪg fʌg/).
You may also wonder what has happened to the letter x. This is used in English to represent two consonant sounds, those of k and s or of k and z. In phonetic transcription these symbols will be used.
“Consonant” and “vowel” each have two related but distinct meanings in English. In writing of phonology, you need to make the distinction clear. When you were younger you may have learned that b,c,d,f and so on are consonants while a,e,i,o,u are vowels - and you may have wondered about y. In this case consonants and vowels denote the letters that commonly represent the relevant sounds. Phonologists are interested in vowel and consonant sounds and the phonetic symbols that represent these (including vowel and consonant letters). It may be wise for you to use the words consonant and vowel (alone) to denote the sounds. But it is better to use an unambiguous phrase - and write or speak about consonant or vowel sounds, consonant or vowel letters and consonant or vowel symbols. In most words these sounds can be identified, but there are some cases where we move from one vowel to another to create an effect that is like neither - and these are diphthongs. We also have some triphthongs - where three vowel sounds come in succession in words such as “fire”, “power” and “sure”. (But this depends on the speaker - many of us alter the sounds so that we say “our” as if it were “are”.) For convenience you may prefer the term vowel glides - and say that “fine” and “boy” contain two-vowel glides while “fire” contains a three-vowel glide.
IPA symbols for the sounds of English
The examples show the letters in bold that correspond to the sound that they illustrate. You will find guidance below on how to use these symbols in electronic documents. The IPA distributes audio files in analog and digital form, with specimen pronunciations of these sounds.
- The document in the frame below uses Unicode symbols. If you do not see them, you can open a PDF version of the page.
A phoneme is a speech sound that helps us construct meaning. That is, if we replace it with another sound (where this is possible) we get a new meaning or no meaning at all. If I replace the initial consonant (/r/) in rubble, I can get double or Hubble (astronomer for whom the space telescope is named) or meaningless forms (as regards the lexicon of standard English) like fubble and wubble. The same thing happens if I change the vowel and get rabble, rebel, Ribble (an English river) and the nonsense form robble. (I have used the conventional spelling of “rebel” here, but to avoid confusion should perhaps use phonetic transcription, so that replacements would always appear in the same position as the character they replace.)
But what happens when a phoneme is adapted to the spoken context in which it occurs, in ways that do not alter the meaning either for speaker or hearer? Rather than say these are different phonemes that share the same meaning we use the model of allophones, which are variants of a phoneme. Thus if we isolate the l sound in the initial position in lick and in the final position in ball, we should be able to hear that the sound is (physically) different as is the way our speech organs produce it. Technically, in the second case, the back of the tongue is raised towards the velum or soft palate. The initial l sound is called clear l, while the terminal l sound is sometimes called a dark l. When we want to show the detail of phonetic variants or allophones we enclose the symbols in square brackets whereas in transcribing sounds from a phonological viewpoint we use slant lines. So, using the IPA transcription [l] is clear l, while [ɫ] is dark l.
If this is not clear think:
- Am I only describing a sound (irrespective of how this sound fits into a system, has meaning and so on)? If so, use square brackets.
- Am I trying to show how the sound is part of a wider system (irrespective of how exactly it sounds in a given instance)? If so, use slant lines.
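The relation between a phoneme in slant lines and its allophones in square brackets can be sketched as a small rule. The rule below is a deliberate simplification and an assumption of mine: it gives clear [l] before a vowel and dark [ɫ] everywhere else, which covers lick and ball but is not a full account of English. The function names are hypothetical.

```python
# The twelve vowel symbols used in this guide (RP values).
VOWELS = {"i:", "ɪ", "ɛ", "æ", "ɜ:", "ə", "ʌ", "u:", "ʊ", "ɔ:", "ɒ", "ɑ:"}

def realise_l(phonemes):
    """Turn a phonemic transcription (list of symbols) into a phonetic one,
    choosing clear [l] before a vowel and dark [ɫ] elsewhere (assumed rule)."""
    phones = []
    for i, seg in enumerate(phonemes):
        if seg == "l":
            next_seg = phonemes[i + 1] if i + 1 < len(phonemes) else None
            phones.append("l" if next_seg in VOWELS else "ɫ")
        else:
            phones.append(seg)
    return phones

def show(phonemes):
    """Format the phonemic form in slant lines and the phonetic form in square brackets."""
    return "/" + "".join(phonemes) + "/ -> [" + "".join(realise_l(phonemes)) + "]"

print(show(["l", "ɪ", "k"]))   # lick
print(show(["b", "ɔ:", "l"]))  # ball
```

Note that the input has only one /l/ symbol: the choice between the two variants is made by the context, not by the speaker's intended meaning.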
So long as we need a form of transcription, we will rely on the IPA scheme. But increasingly it is possible to use digital recording and reproduction to produce reference versions of sounds. This would not, of course, prevent change in the choice of which particular sounds to use in a given context. When people wonder about harass (/ˈhærəs/) or harass (/həˈræs/) they usually are able to articulate either, and are concerned about which reveals them as more or less educated in the use of the “proper” form. (For your information, the stress historically falls on the first syllable, to rhyme with embarrass - thus in both Pocket Oxford [UK, 1969] and Funk & Wagnalls New Practical Standard [US, 1946]. The fashion for hu-rass is found on both sides of the Atlantic and we should not credit it to, or blame it on, US speakers of English.)
Phonologists also refer to segments. A segment is “a discrete unit that can be identified in a stream of speech”, according to Professor Crystal. In English the segments would correspond to vowel sounds and consonant sounds, say. This is a clear metaphor if we think of fruit - the number of segments varies, but is finite in a whole fruit. So some languages have few segments and others many - from 11 in Rotokas and Mura to 141 in !Xu. The term may be most helpful in indicating what non-segmental or supra-segmental (above the segments) features of spoken language are.
The sounds of English
English has twelve pure vowel sounds (that is, not counting the diphthongs). In the table above they are divided into seven short and five long vowels. An alternative way of organizing them is according to where (in the mouth) they are produced. This method allows us to describe them as front, central and back. We can qualify them further by how high the tongue and lower jaw are when we make these vowel sounds, by whether our lips are rounded or spread, and finally by whether they are short or long. This scheme shows the following arrangement:
- /i:/ - cream, seen (long high front spread vowel)
- /ɪ/ - bit, silly (short high front spread vowel)
- /ɛ/ - bet, head (short mid front spread vowel); this may also be shown by the symbol /e/
- /æ/ - cat, dad (short low front spread vowel); this may also be shown by /a/
- /ɜ:/ - burn, firm (long mid central spread vowel); this may also be shown by the symbol /ə:/
- /ə/ - about, clever (short mid central spread vowel); this is sometimes known as schwa, or the neutral vowel sound - it never occurs in a stressed position.
- /ʌ/ - cut, nut (short low central spread vowel); this vowel is quite uncommon among speakers in the Midlands and further north in Britain.
- /u:/ - boob, glue (long high back rounded vowel)
- /ʊ/ - put, soot (short high back rounded vowel); also shown by /u/
- /ɔ:/ - corn, faun (long mid back rounded vowel); also shown by /o:/
- /ɒ/ - dog, rotten (short low back rounded vowel); also shown by /o/
- /ɑ:/ - hard, far (long low back spread vowel)
We can also arrange the vowels in a table or even depict them against a cross-section of the human mouth. Here is an example of a simple table:
| |Front|Central|Back|
|High|ɪ i:| |ʊ u:|
|Mid|ɛ|ɜ: ə|ɔ:|
|Low|æ|ʌ|ɒ ɑ:|
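The same classification can be held as data, so that a table of this kind is generated from the features rather than typed by hand. The sketch below is illustrative only: the class and function names are my own, and I have assumed that /ʌ/ belongs in the central column.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Vowel:
    symbol: str
    height: str    # "high", "mid" or "low"
    backness: str  # "front", "central" or "back"
    rounded: bool  # False means spread lips
    long: bool

VOWELS = [
    Vowel("i:", "high", "front", False, True),
    Vowel("ɪ", "high", "front", False, False),
    Vowel("ɛ", "mid", "front", False, False),
    Vowel("æ", "low", "front", False, False),
    Vowel("ɜ:", "mid", "central", False, True),
    Vowel("ə", "mid", "central", False, False),
    Vowel("ʌ", "low", "central", False, False),  # assumed central
    Vowel("u:", "high", "back", True, True),
    Vowel("ʊ", "high", "back", True, False),
    Vowel("ɔ:", "mid", "back", True, True),
    Vowel("ɒ", "low", "back", True, False),
    Vowel("ɑ:", "low", "back", False, True),
]

def cell(height, backness):
    """Return the symbols that share one height and one backness, space-separated."""
    return " ".join(
        v.symbol for v in VOWELS if v.height == height and v.backness == backness
    )

for height in ("high", "mid", "low"):
    print(height, [cell(height, b) for b in ("front", "central", "back")])
```

Because each vowel carries all four features, the same list can answer other questions, such as which vowels are rounded or how many are long.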
Diphthongs are sounds that begin as one vowel and end as another, while gliding between them. For this reason they are sometimes described as glide vowels. How many are there? Almost every modern authority says eight - but they do not all list the same eight (check this for yourself). Simeon Potter, in Our Language (Potter, S,  Chapter VI, Sounds and Spelling, London, Penguin) says there are nine - and lists those I have shown in the table above, all of which I have found in the modern reference works. The one most usually omitted is /ɔə/ as in bored. Many speakers do not use this diphthong, but use the same vowel in poured as in fraud - but it is alive and well in the north of Britain.
Potter notes that all English diphthongs are falling - that is the first element is stressed more than the second. Other languages have rising diphthongs, where the second element is stressed, as in Italian “uomo” (man) and “uovo” (egg).
Some authorities claim one or two fewer consonants than I have shown above, regarding those with double symbols (/tʃ/ and /dʒ/) as “diphthong consonants” in Potter's phrase. The list omits one sound that is not strictly a consonant but works like one. The full IPA list of phonetic symbols includes some for non-pulmonic consonants (not made with air coming from the lungs), click and glottal sounds. In some varieties of English, especially in the south of Britain (but the sound has migrated north) we find the glottal plosive or glottal stop, shown by the symbol /ʔ/ (essentially a question mark without the dot at the tail). This sound occurs in place of /t/ for some speakers - so /botəl/ or /botl/ (bottle) become /boʔəl/ or /boʔl/.
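The replacement of /t/ by the glottal stop can be written as a simple substitution. This sketch assumes, for illustration only, that the change happens whenever /t/ follows a vowel; real speakers are more selective than that. The function name and the small vowel set are hypothetical, and /o/ stands for the vowel of bottle, as in the transcriptions above.

```python
# Vowel symbols needed for the examples below (not a complete inventory).
VOWELS = {"o", "ə", "ɪ", "æ", "ʌ", "ɛ", "ʊ", "ɒ"}

def glottalise(phonemes):
    """Replace /t/ with the glottal stop when it follows a vowel (assumed context)."""
    out = []
    for i, seg in enumerate(phonemes):
        if seg == "t" and i > 0 and phonemes[i - 1] in VOWELS:
            out.append("ʔ")
        else:
            out.append(seg)
    return out

print("".join(glottalise(["b", "o", "t", "ə", "l"])))  # bottle
print("".join(glottalise(["b", "o", "t", "l"])))       # bottle, without schwa
print("".join(glottalise(["t", "ɪ", "p"])))            # tip: initial /t/ is kept
```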
We form consonants by controlling or impeding the egressive (outward) flow of air. We do this with the articulators - from the glottis, past the velum, the hard palate and alveolar ridge and the tongue, to the teeth and lips. The sound results from three things:
- voicing - causing the vocal cords to vibrate
- where the articulation happens
- how the articulation happens - how the airflow is controlled
All vowels must be voiced - they are caused by vibration in the vocal cords. But consonants may be voiced or not. Some of the consonant sounds of English come in pairs that differ in being voiced or not - in which case they are described as voiceless or unvoiced. So /b/ is voiced and /p/ is the unvoiced consonant in one pair, while voiced /g/ and voiceless /k/ form another pair.
We can explain the consonant sounds by the place where the articulation principally occurs or by the kinds of articulation that occurs there. The first scheme gives us this arrangement:
Articulation described by region
- Glottal articulation - articulation by the glottis. We use this for one consonant in English. This is /h/ in initial position in house or hope.
- Velar articulation - we do this with the back of the tongue against the velum. We use it for initial hard /g/ (as in golf) and for final /ŋ/ (as in gong).
- Palatal articulation - we do this with the front of the tongue on the hard palate. We use it for /dʒ/ (as in jam) and for /ʃ/ (as in sheep or sugar).
- Alveolar articulation - we do this with the tongue blade on the alveolar ridge. We use it for /t/ (as in teeth), /d/ (as in dodo), /z/ (as in zebra), /n/ (as in no) and /l/ (as in light).
- Dental articulation - we do this with the tip of the tongue on the back of the upper front teeth. We use it for /θ/ (as in think) and /ð/ (as in that). This is one form of articulation that we can observe and feel ourselves doing.
- Labio-dental articulation - we do this with the lower lip and upper front teeth. We use it for /v/ (as in vampire).
- Labial articulation - we do this with the lips for /b/ (as in boat) and /m/ (as in most). Where we use two lips (as in English) this is bilabial articulation.
Articulation described by manner
This scheme gives us a different arrangement into stop (or plosive) consonants, affricates, fricatives, nasal consonants, laterals and approximants.
- Stop consonants (so-called because the airflow is stopped) or plosive consonants (because it is subsequently released, causing an outrush of air and a burst of sound) are:
  - Bilabial voiced /b/ (as in boat) and voiceless /p/ (as in post)
  - Alveolar voiced /d/ (as in dad) and voiceless /t/ (as in tap)
  - Velar voiced /g/ (as in golf) and voiceless /k/ (as in cow)
- Affricates are a kind of stop consonant, where the expelled air causes friction rather than plosion. They are palatal /tʃ/ (as in cheat) and palatal /dʒ/ (as in jam)
- Fricatives come from restricting, but not completely stopping, the airflow. The air passes through a narrow space and the sound arises from the friction this produces. They come in voiced and unvoiced pairs:
  - Labio-dental voiced /v/ (as in vole) and unvoiced /f/ (as in foal)
  - Dental voiced /ð/ (as in those) and unvoiced /θ/ (as in thick)
  - Alveolar voiced /z/ (as in zest) and unvoiced /s/ (as in sent)
  - Palatal voiced /ʒ/ (as in the middle of leisure) and unvoiced /ʃ/ (as at the end of trash)
- Nasal consonants involve closing the articulators in the mouth but lowering the velum (soft palate), which normally closes off the route to the nose, so that the air escapes through the nose. There are three nasal consonants in English:
  - Bilabial /m/ (as in mine)
  - Alveolar /n/ (as in nine)
  - Velar /ŋ/ (as at the end of gong)
- Lateral consonants allow the air to escape at the sides of the tongue. In English there is only one such sound, which is alveolar /l/ (as at the start of lamp)
- Approximants do not impede the flow of air. They are all voiced but are counted as consonants chiefly because of how they function in syllables. They are:
  - Bilabial /w/ (as in water)
  - Alveolar /r/ (as in road)
  - Palatal /j/ (as in yet)
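The three things that define a consonant (voicing, place and manner) can be stored together, so that a description such as “voiceless palatal fricative” is looked up rather than remembered. The sketch below follows the list by manner given above; the names Consonant, describe and voicing_partner are my own. Treating the nasals and the lateral as voiced is my assumption, since the list does not state it, and /h/ is left out because that list does not classify it.

```python
from typing import NamedTuple, Optional

class Consonant(NamedTuple):
    symbol: str
    place: str
    manner: str
    voiced: bool

CONSONANTS = [
    Consonant("b", "bilabial", "stop", True),
    Consonant("p", "bilabial", "stop", False),
    Consonant("d", "alveolar", "stop", True),
    Consonant("t", "alveolar", "stop", False),
    Consonant("g", "velar", "stop", True),
    Consonant("k", "velar", "stop", False),
    Consonant("dʒ", "palatal", "affricate", True),
    Consonant("tʃ", "palatal", "affricate", False),
    Consonant("v", "labio-dental", "fricative", True),
    Consonant("f", "labio-dental", "fricative", False),
    Consonant("ð", "dental", "fricative", True),
    Consonant("θ", "dental", "fricative", False),
    Consonant("z", "alveolar", "fricative", True),
    Consonant("s", "alveolar", "fricative", False),
    Consonant("ʒ", "palatal", "fricative", True),
    Consonant("ʃ", "palatal", "fricative", False),
    Consonant("m", "bilabial", "nasal", True),      # voicing assumed
    Consonant("n", "alveolar", "nasal", True),      # voicing assumed
    Consonant("ŋ", "velar", "nasal", True),         # voicing assumed
    Consonant("l", "alveolar", "lateral", True),    # voicing assumed
    Consonant("w", "bilabial", "approximant", True),
    Consonant("r", "alveolar", "approximant", True),
    Consonant("j", "palatal", "approximant", True),
]

def describe(symbol: str) -> Optional[str]:
    """Return a three-part label (voicing, place, manner) for a consonant symbol."""
    for c in CONSONANTS:
        if c.symbol == symbol:
            voicing = "voiced" if c.voiced else "voiceless"
            return f"{voicing} {c.place} {c.manner}"
    return None

def voicing_partner(symbol: str) -> Optional[str]:
    """Return the consonant with the same place and manner but opposite voicing."""
    for c in CONSONANTS:
        if c.symbol == symbol:
            for other in CONSONANTS:
                same_slot = (other.place, other.manner) == (c.place, c.manner)
                if same_slot and other.voiced != c.voiced:
                    return other.symbol
    return None

print(describe("ʃ"))
print(voicing_partner("b"))
```

A consonant with no partner, such as /m/, simply returns nothing from voicing_partner, which reflects the fact that only some English consonants come in voiced and voiceless pairs.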
When you think of individual sounds, you may think of them in terms of syllables. These are units of phonological organization and smaller than words. Alternatively, think of them as units of rhythm. Although they may contain several sounds, they combine them in ways that create the effect of unity.
Thus splash is a single syllable but it combines three consonants, a vowel, and a final consonant /spl+æ+ʃ/.
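The division of splash into its parts can be sketched as a function that looks for the vowel and splits the syllable around it. This is a minimal sketch that assumes a single syllable with exactly one vowel sound; the function name is hypothetical. Linguists commonly call the three parts the onset, the nucleus and the coda, terms this guide does not otherwise use.

```python
# The twelve vowel symbols used in this guide (RP values).
VOWELS = {"i:", "ɪ", "ɛ", "æ", "ɜ:", "ə", "ʌ", "u:", "ʊ", "ɔ:", "ɒ", "ɑ:"}

def split_syllable(phonemes):
    """Split one syllable into (initial consonants, vowel, final consonants)."""
    for i, seg in enumerate(phonemes):
        if seg in VOWELS:
            return phonemes[:i], seg, phonemes[i + 1:]
    raise ValueError("no vowel found in syllable")

onset, nucleus, coda = split_syllable(["s", "p", "l", "æ", "ʃ"])
print("+".join(["".join(onset), nucleus, "".join(coda)]))
```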
Some words have a single syllable - so they are monosyllables or monosyllabic. Others have more than one syllable and are polysyllables or polysyllabic.
Sometimes you may see a word divided into its syllables, but this may be an artificial exercise, since in real speech the sounds are continuous. In some cases it will be impossible to tell whether a given consonant was ending one syllable or beginning another. It is possible, for example, to pronounce lamppost so that there are two /p/ sounds in succession with some interval between them. But many native English speakers will render this as /læm-pəʊst/ or /læm-pəʊsd/.
Students of language may find it helpful to be able to identify individual syllables in explaining pronunciation and language change - one of the things you may need to do is explain which are the syllables that are stressed in a particular word or phrase.
In written English we use punctuation to signal some things like emphasis, and the speed with which we want our readers to move at certain points. In spoken English we use sounds in ways that do not apply to individual segments but to stretches of spoken discourse from words to phrases, clauses and sentences. Such effects are described as non-segmental or suprasegmental - or, using the adjective in a plural nominal (noun) form, simply suprasegmentals.
Among these effects are such things as stress, intonation, tempo and rhythm - which collectively are known as prosodic features. Other effects arise from altering the quality of the voice, making it breathy or husky and changing what is sometimes called the timbre - and these are paralinguistic features. Both of these kinds of effect may signal meaning. But they do not do so consistently from one language to another, and this can cause confusion to students learning a second language.
- Stress or loudness - increasing volume is a simple way of giving emphasis, and this is a crude measure of stress. But it is usually combined with other things like changes in tone and tempo. We use stress to convey some kinds of meaning (semantic and pragmatic) such as urgency or anger or for such things as imperatives.
- Intonation - you may be familiar in a loose sense with the notion of tone of voice. We use varying levels of pitch in sequences (contours or tunes) to convey particular meanings. Falling and rising intonation in English may signal a difference between statement and question. Younger speakers of English may use rising (question) intonation without intending to make the utterance a question.
- Tempo - we speak more or less quickly for many different reasons and purposes. Occasionally it may be that we are adapting our speech to the time we have in which to utter it (as, for example, in a horse-racing commentary). But mostly tempo reflects some kinds of meaning or attitude - so we give a truthful answer to a question, but do so rapidly to convey our distraction or irritation.
- Rhythm - patterns of stress, tempo and pitch together create a rhythm. Some kinds of formal and repetitive rhythm are familiar from music, rap, poetry and even chants of soccer fans. But all speech has rhythm - it is just that in spontaneous utterances we are less likely to hear regular or repeating patterns.
How many voices do we have? We are used to “putting on” silly voices for comic effects or in play. We may adapt our voices for speaking to babies, or to suggest emotion, excitement or desire. These effects are familiar in drama, where the use of a stage whisper may suggest something clandestine and conspiratorial. Nasal speech may suggest disdain, though it is easily exaggerated for comic effect (as by the late Kenneth Williams in many Carry On films).
Such effects are sometimes described as changing timbre or voice quality. We all may use them sometimes but they are particularly common among entertainers such as actors or comedians. This is not surprising, as they practise using their voices in unusual ways, to represent different characters. The performers in the BBC's Teletubbies TV programme use paralinguistic features to suggest the different characters of Tinky-Winky, Dipsy, La-La and Po.
Everyone's use of the sound system is unique and personal. And few of us use sounds consistently in all contexts - we adapt to different situations. (We rarely adapt our sounds alone - more likely we mind our language in the popular sense, by attending to our lexical choices, grammar and phonology.)
Most human beings adjust their speech to resemble that of those around them. This is very easy to demonstrate, as when some vogue words from broadcasting surf a wave of popularity before settling down in the language more modestly or passing out of use again.
This is particularly true of sounds, in the sense that some identifiable groups of people share (with some individual variation) a collection of sounds that are not found elsewhere, and these are accents. We think of accents as marking out people by geographical region and, to a lesser degree, by social class or education. So we might speak of a Scouse (Liverpool), Geordie (Newcastle) or Brummie (Birmingham) accent. These are quite general descriptions - within each of these cities we would differentiate further. And we should also not confuse real accent features in a given region with stereotyped and simplified versions of these which figure in (or disfigure) TV drama - Emmerdale, Brookside, Coronation Street and Albert Square are not reliable sources for anything we might want to know about their real-world originals. And the student who hoped to study the speech of people in Peckham by watching episodes of John Sullivan's situation comedy Only Fools and Horses was deeply misguided.
Thinking of social class, we might speak of a public school accent (stiff upper lip and cut glass vowels). But we do not observe occupational accents and we are unlikely to speak of a baker's, soldier's or accountant's accent (whereas we might study their special uses of lexis and grammar).
This is not the place to study in detail the causes of such accents or, for example, how they are changing. Language researchers may wish to record regional variant forms and their frequency. In Britain today (perhaps because of the influence of broadcasting) we can observe sound features moving from one region to another (like the glottal stop which is now common in the north of England), while also recording how other features of accent are not subject to this kind of change.
Studying phonology alone will not answer such questions. But it gives you the means to identify specific phonetic features of accent and record them objectively.
Received Pronunciation (or RP) is a special accent - a regionally neutral accent that is used as a standard for broadcasting and some other kinds of public speaking. It is not fixed - you can hear earlier forms of RP in historical broadcasts, such as newsreel films from the Second World War. Queen Elizabeth II has an accent close to the RP of her own childhood, but not very close to the RP of the 21st century.
RP excites powerful feelings of admiration and repulsion. Some see it as a standard or the correct form of spoken English, while others see its use (in broadcasting, say) as an affront to the dignity of their own region. Its merit lies in its being more widely understood by a national and international audience than any regional accent. Non-native speakers often want to learn RP, rather than a regional accent of English. RP exists but no-one is compelled to use it. But if we see it as a reference point, we can decide how far we want to use the sounds of our region where these differ from the RP standard. And its critics may make a mistake in supposing all English speakers even have a regional identity - many people are geographically mobile, and do not stay for long periods in any one place.
RP is also a very loose and flexible standard. It is not written in a book (though the BBC does give its broadcasters guides to pronunciation) and does not prescribe such things as whether to stress the first or second syllable in research. You will hear it on all the BBC's national radio channels, to a greater or lesser degree. On Radio 3 you will perhaps hear the most conservative RP, while Radio 5 will give you a more contemporary version with more regional and class variety - but these are very broad generalizations, and refer mainly to the presenters, newsreaders, continuity announcers and so on. RP is used as a standard in some popular language reference works. For example, the Oxford Guide to the English Language (Weiner, E., Pronunciation, p. 45, Book Club Associates/OUP, London) has this useful description of RP:
“The aim of recommending one type of pronunciation rather than another, or of giving a word a recommended spoken form, naturally implies the existence of a standard. There are of course many varieties of English, even within the limits of the British Isles, but it is not the business of this section to describe them. The treatment here is based upon Received Pronunciation (RP), namely 'the pronunciation of that variety of British English widely considered to be least regional, being originally that used by educated speakers in southern England.' This is not to suggest that other varieties are inferior; rather, RP is here taken as a neutral national standard, just as it is in its use in broadcasting or in the teaching of English as a foreign language.”
Accent and social class
Accent is certainly related to social class. This is a truism - because accent is one of the things that we use as an indicator of social class. For a given class, we can express this positively or negatively. As regards the highest social class, positively we can identify features of articulation - for certain sounds, upper class speakers do not open or move the lips as much as other speakers of English. Negatively, we can identify such sounds as the glottal stop as rare among, and untypical of, speakers from this social class.
Alternatively we can look at vowel choices or preferences. For example, the upper classes long used the vowel /ʌ/ in some cases where /ɒ/ is standard - thus Coventry would be /kʌvəntri:/. C.S. Lewis in The Great Divorce depicts a character who pronounces “God” as “Gud” - “'Would to God' he continued, but he was now pronouncing it Gud...”
We may think of dropping or omitting consonants as a mark of the lower social classes and uneducated people. But dropping of terminal g - or rather substituting /n/ for /ŋ/ - was until recently a mark of the upper class “toff”, who would enjoy, for example, huntin', fishin' and shootin'. The British actor Ian Carmichael did this in playing the part of Dorothy L. Sayers' detective, Lord Peter Wimsey. In writing the dialogue for her novels Miss Sayers indicates Lord Peter's dropping of the terminal g by the use of an apostrophe:
“It's surprisin' how few people ever mean anything definite from one year's end to another...”
Gaudy Night, Chapter 4
Among real life speakers in whom I have observed this tendency I would identify the late Sir Alf Ramsey. (I do not know whether Alf Ramsey was brought up to speak in this way or acquired the habit later.)
Investigating the connection can be challenging, however, since social class is an artificial construct. Assuming that you have found a way to identify your subjects as belonging to some definable social group, then you can study vowel choices or frequencies. Even the most cursory attention tells us that the Queen has distinct speech sounds. But can we explain them in detail? Does she share them with other members of her family? Do other speakers share them?
Pronunciation and prescription
The English Language List is an Internet discussion forum for English language teachers. Recently (2001) a student, not a native speaker but clearly a very competent writer of English, asked where he could get help to learn to speak in a standard British accent. Many of the responses came from people who were not answering his question but trying to persuade him to stick with his current accent (which he felt would disadvantage him in his business career). Yet we are not disparaging regional accents when we try to learn the neutral and prestigious standard form. (What the discussion never really revealed was how many of the list members would identify themselves as RP speakers.)
The prescriptive tradition in English grammar was unscientific and perhaps harmful. But setting down authoritative standard forms is not always so unwise. In spelling they are useful, and the same may be true of pronunciation. Dictionaries do not compel the reader to learn and use the pronunciations they show - but they do give a representation of the pronunciation according to RP. Some show variant pronunciations as well as the principal RP form.
If you are a student (or even a teacher) you may find RP an unfamiliar accent - maybe you can see that the phonetic transcription indicates a pronunciation different from the one you normally use. No one is forcing you to change your own speech sounds, in which your sense of identity may be profoundly located. But you can become aware that the local norm is not the universal standard.
Now that English is an international language, its development is certainly not controlled by what happens in the UK. So British RP may cease to be a useful standard for learners of English. Increasingly, language learners favour a mid-Atlantic accent, which shares features of British RP and the speech of the eastern USA.
Very young children do not produce the sounds they will use as adults partly because they are unable to form them (physically their speech organs have not developed fully) and partly because they may not know exactly what the sound is that they wish to produce. Children may also be less subtle in controlling the flow of egressive air, so that they will continue speaking, rather than pause briefly, while drawing more air in.
Young children may have a sense of stressed syllables as more important - so they may omit unstressed elements before or after. So, for example, a child may ask for a 'nana rather than a banana. (Alternatively, the child may know that there is some repetition of sound here, but limit it to two syllables.) I am supposing that the non-standard form is spoken by a child, but perhaps repeated back by adults. But one often observes adults (unhelpfully) using what they suppose to be an easier form of a word and offering the child a 'nana. On the other hand, some children resist this tendency. Though they may not articulate a word in full or exactly, they can recognize it as an incomplete or mistaken form when an adult repeats it back to them. We see this in this exchange between an adult and a four-year-old, recorded by George Keith and John Shuttleworth:
Adult: What do you want to be when you grow up?
Child: A dowboy.
Adult: So you want to be a dowboy, eh?
Child: No! Not a dowboy, a dowboy!
The child cannot articulate the /k/ initial sound but knows that what he hears from the adult is not the form of the word he is used to hearing, so protests.
Since children learn by imitation of examples it may be helpful when they begin formal education to give them such examples, but not by continually rebuking them for saying things “wrongly”. Children do not learn to articulate all sounds at the same stage in their development. Teachers of children in early years (nursery and reception) classes should be able to identify the few cases where there is a disorder or problem for which some specialist intervention is appropriate.
Change happens in language - and the sounds of English are not exempt. Of course, basic sounds do not change in the sense that the phonemes represented in the IPA transcription will not go away. And it is rare, but not impossible, for speakers of a given language to begin to use phonemes they did not use before. Thus, most English speakers faced with French -ogne (as in Boulogne or Dordogne) anglicise to Buloyn (/bəlɔɪn/). And Welsh double l in initial position (as in Llanfair and many other place names) they sound simply as /l/ rather than a voiceless unilateral l.
What does change is the choice of which sound to use in a given context - though choice may suggest that this is voluntary whereas the change normally happens unnoticed. At a very simple level we can see, from rhymes in poetry that no longer work, that one or more words have acquired a new standard pronunciation. So John Donne (1571-1631) writes “And find/What wind/Serves to advance an honest mind”. We have retained the vowel sound in wind (verb, as in wind up) but not in wind (noun, as in north wind). We can still observe vowel change. In my own lifetime envelope was pronounced with the initial vowel /ɒ/ (as if it were onvelope). This pronunciation is becoming rarer, and persists mostly among older speakers. Turquoise was once commonly sounded as in French /tɜ:kwæz/ - but now it is more or less uniformly /tɜ:kɔɪz/ or /tɜ:kɔɪs/ (perhaps by analogy with tortoise).
Far more common are changes in stress patterns. So research with the stress on the second syllable (more or less universal in the UK when I was a child) has given way to re-search, with the stress on the first. In the case of harass the stress has shifted the other way, from the first syllable to the second. We cannot sensibly say that the new form is “wrong” or “bad English” (even if we prefer the older form). But we can observe the frequency with which the new form occurs, and see if it does come to supplant the older form or whether both forms persist.
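If you collect your own observations of competing pronunciations, a few lines of Python will turn a list of what you heard into proportions. This is only a sketch: the function name variant_shares, the capitals used to mark the stressed syllable, and the tallies in the usage lines are all my own inventions for illustration, not data from any survey.

```python
from collections import Counter


def variant_shares(observations):
    """Return each variant's share of all observations, as a fraction of 1."""
    counts = Counter(observations)
    total = sum(counts.values())
    if total == 0:
        # No observations yet, so there are no shares to report.
        return {}
    return {variant: count / total for variant, count in counts.items()}


# Invented tallies for illustration only - NOT real survey data.
# Capitals mark the stressed syllable (a labelling convention assumed here).
heard = ["RE-search"] * 6 + ["re-SEARCH"] * 4

for variant, share in sorted(variant_shares(heard).items()):
    print(f"{variant}: {share:.0%}")
```

Repeating the count with recordings from different dates or age groups would let you see whether one form is supplanting the other or whether both persist.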
Change happens within regional varieties, too - so the glottal stop has made its way northwards from London and southwards from Glasgow (where it has been found for 150 years). This is one feature of what Paul Kerswill calls dialect levelling. Similarly use of /f/ or /v/ in place of /θ/ and /ð/ is spreading north from London.
Perhaps the best documented change occurring now is in sentence intonation. This is especially common among younger people, but not exclusively so. The change lies in a tendency to use rising (question) intonation more frequently. What is not clear, in contexts that allow either, is whether the speaker intends to ask a question or means to make a statement. We cannot be sure if the rising intonation conveys meaning, or is habitual.
One common way for pronunciation to change is by elision - compressing the word to remove a syllable. Once it was common to sound the -ed ending on past tense verbs as a separate syllable, whereas now most of these verbs end with a simple /t/ or /d/ sound. We do still sound the -ed ending on adjectives, even when these are formed from the past tenses - as in naked, wicked and learned. We can contrast the learned professor with what her pupils learned in the lecture. (The first has two syllables, the second only one.)
Police is often pronounced as a monosyllable /pli:s/, for example by the newsreader Sue Lawley. Recently I have observed several newsreaders eliding the middle syllable of terrorist, producing the form /tɛrəɪst/ or sometimes /tɛrɪst/. On the other hand, literacy may alter pronunciation. The n in column is silent, and in the Second World War, people would often speak of the Fifth Columnist (/kɒləmɪst/). But now broadcasters speak of those who write columns in newspapers as /kɒləmnɪsts/ - thereby sounding what was a silent /n/.
Phonology for exam students
Phonology as an explicit subject of detailed study is not compulsory for students taking Advanced level courses in English Language. But it is one of the five “descriptions of language” commended by the AQA syllabus B (the others are: lexis, grammar, pragmatics and semantics). In some kinds of study it will be odd if it does not appear in your analysis or interpretation of data.
In written exams, you may want to comment on some features of phonology in explaining example language data - these may be presented to you on the exam paper, or may be your own examples, which illustrate, say, some point about language change, language acquisition or sociolinguistics. You may wish to use diagrams, models or the IPA transcription - and if you are able to do so, this may be helpful. But if you do not feel confident about using these, you can still make useful points about phonology - you can show stress simply by underlining or highlighting the stressed syllable. And you can show many aspects of phonology by using the standard Western (Roman-English) alphabet appropriately - as in contrasting pronunciations of “harass” as:
- ha-russ (first syllable stressed, vowel is a; second syllable unstressed, vowel is neutral) or
- huh-rass (first syllable unstressed, neutral vowel; second syllable stressed, vowel is a)
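If you are preparing notes on a computer, the same idea can be sketched in a few lines of Python. The function below is a hypothetical helper of my own (mark_stress is not part of any library), and it shows stress with capital letters rather than underlining, simply because plain text cannot underline.

```python
def mark_stress(syllables, stressed_index):
    """Join respelled syllables with hyphens, capitalising the stressed one."""
    if not 0 <= stressed_index < len(syllables):
        raise ValueError("stressed_index must point at one of the syllables")
    marked = [
        syllable.upper() if position == stressed_index else syllable.lower()
        for position, syllable in enumerate(syllables)
    ]
    return "-".join(marked)


# The two pronunciations of "harass" contrasted above.
print(mark_stress(["ha", "russ"], 0))   # first syllable stressed
print(mark_stress(["huh", "rass"], 1))  # second syllable stressed
```

The respellings themselves still have to be chosen by ear; the helper only makes the stress marking consistent.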
Phonetic symbols and electronic documents
Representing phonetic symbols in electronic documents can be a challenge, unless you have the right software. Assuming that you have a word-processing program, you need to use special fonts that will represent the IPA symbols. These are either the SIL IPA fonts (such as SILdoulosIPA) or Unicode fonts (like Lucida Sans Unicode, which I have used in this document).
If you are producing work that will be printed, then you can add things by hand later, but this is messy and best avoided. There is a lot of guidance on the IPA homepage about how to cope with this problem.
If you do find a way to reproduce the symbols you need, it may make sense to paste them all at the end of the document on which you are working. Then, you can copy and paste as you need to use them. If you do not do this, then you will have to use the Alt key and the numeric keypad, since the keys on the normal keyboard will only give you the symbols that resemble ordinary letters.
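Another way to produce the symbols, if you happen to have Python available, is to generate them from their Unicode code points and then copy them from the output. The sketch below is only an illustration: the informal labels in the dictionary are my own shorthand, and the selection is limited to a few symbols used on this page. The official character names come from Python's standard unicodedata module.

```python
import unicodedata

# A few IPA symbols used in this guide, keyed by an informal label.
# The labels are shorthand assumed for this example, not official names.
IPA_SYMBOLS = {
    "schwa": "\u0259",
    "eng": "\u014b",
    "theta": "\u03b8",
    "eth": "\u00f0",
    "open o": "\u0254",
    "turned v": "\u028c",
    "glottal stop": "\u0294",
}


def describe(symbol):
    """Return the code point and official Unicode name of one character."""
    return f"U+{ord(symbol):04X} {unicodedata.name(symbol)}"


for label, symbol in IPA_SYMBOLS.items():
    print(f"{symbol}  {label:12}  {describe(symbol)}")
```

Whether the printed symbols display correctly still depends on having a suitable Unicode font installed.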
Different ways of representing sound
Conventions of language science and lexicographers
If you study reference works you may find a variety of schemes for representing different aspects of phonology - there is no single universal scheme that covers everything you may need to do.
And many dictionaries may not even use the IPA alphabet, for the very obvious reason that the reader is not familiar with this transcription and can cope without it.
The text above comes from the Pocket Oxford Dictionary - this shows a simple phonetic representation based on the standard Western alphabet, with accents to show different vowels. Look in any dictionary you have and you may find something similar.
In representing speech - for example in drama, poetry or prose fiction - some authors are interested not merely in the words but also in how they are spoken. One of the most familiar concerns is that of how to represent regional accents. Here is a fairly early example, from the second chapter of Wuthering Heights (1847), in which the servant Joseph refuses to admit Mr. Lockwood into the house:
“'T' maister's dahn I't' fowld. Goa rahnd by the end ut' laith, if yah went to spake tull him”
Tennyson (1809-1892) has a similar approach in his poem, Northern Farmer, Old Style:
“What atta stannin' theer fur, and doesn' bring me the aäle?
Doctor's a 'toättler, lass, and 'e's allus i' the owd taäle...”
Joseph comes from what is now West Yorkshire, while Tennyson's farmer is supposedly from the north of Lincolnshire. Here is an earlier example, from Walter Scott's Heart of Midlothian (1818), which shows some phonetic qualities of the Lowland Scots accent. In this passage the Laird of Dumbiedikes (from the country near Edinburgh) is on his deathbed. He advises his son about how to take his drink:
“My father tauld me sae forty years sin', but I never fand time to mind him. - Jock, ne'er drink brandy in the morning, it files the stamach sair... ”
George Bernard Shaw, in Pygmalion (1914), uses one phonetic character (ə - schwa) in his attempt to represent the accent of Eliza Doolittle, a Cockney flower girl:
“There's menners f' yer! Tə-oo banches o voylets trod into the mad...Will ye-oo py me f'them.”
However, after a few sentences of phonetic dialogue, Shaw reverts to standard spelling, noting:
“Here, with apologies, this desperate attempt to represent her dialect without a phonetic alphabet must be abandoned as unintelligible outside London”.
In Pygmalion Professor Higgins teaches Eliza to speak in an upper-class accent, so as to pass her off as a duchess. In the course of the play, therefore, her accent changes. The actress playing the part, however, may have a natural accent closer to that with which Eliza speaks at the completion of her education, so in playing the part she may be doing the reverse of what Eliza undergoes, by gradually reverting to a natural manner of articulation. (Eliza's pronunciation improves ahead of her understanding of grammar, so that at one point she says memorably: “My aunt died of influenza: so they said. But it's my belief they done the old woman in.”) In Pygmalion Shaw does not merely represent accent (and other features of speech) but makes this crucial to an exploration of how speech relates to identity and social class.
Charles Dickens is particularly interested in the sounds of speech. He observes that many speakers have difficulty with initial /v/ and /w/. Sam Weller, in The Pickwick Papers, regularly transposes these:
“ 'Vell,' said Sam at length, 'if this don't beat cock-fightin' nothin' never vill...That wery next house...' ”
Mr. Hubble, in Great Expectations, does the same thing when he describes young people as “naterally wicious”. Joe Gargery, in the same novel, has many verbal peculiarities, of which perhaps the most striking is in his description of the Blacking Warehouse. This is less impressive than the picture Joe has seen on bills where it is “drawd too architectooralooral”.
In Chapter 16 of Our Mutual Friend, Betty Higden is proud of Mr. Sloppy (an orphan she has fostered) not only because he can read, but because he is able to use different voice styles for various speakers.
“You mightn't think it, but Sloppy is a beautiful reader of a newspaper. He do the Police in different voices.”
Dickens also finds a way to show tempo and rhythm. In Chapter 23 of Little Dorrit (and elsewhere in the novel), Flora Finching speaks at length and without any pauses:
“Most unkind never to have come back to see us since that day, though naturally it was not to be expected that there should be any attraction at our house and you were much more pleasantly engaged, that's pretty certain, and is she fair or dark blue eyes or black I wonder, not that I expect that she should be anything but a perfect contrast to me in all particulars for I am a disappointment as I very well know and you are quite right to be devoted no doubt though what am I saying Arthur never mind I hardly know myself Good gracious!”
Background reading on phonology
There are very full accounts of phonology in both of Professor David Crystal's encyclopedias. See his Cambridge Encyclopedia of Language, Part IV, The Medium of Language: Speaking and Listening (pp. 123-175; ISBN 0521424437) and his Encyclopedia of the English Language, Part IV, 17, The Sound System (pp. 236-255; ISBN 0521596556).
For a very clear and succinct account, look at Howard Jackson's and Peter Stockwell's Introduction to the Nature and Functions of Language, 2.1, Sounds and letters (pp. 11-23; ISBN 0748725806).
There is a longer and more discursive account in Shirley Russell's Grammar, Structure and Style, Spoken English (pp. 107-168; ISBN 0198311982).
You can find lots of help online. The best place to start is the International Phonetic Association.
You will find some excellent resources from the languages department of the University of Victoria in British Columbia.
For a great introduction to Scots - with some excellent guidance on phonology - try Andy Eagle's Wir Ain Laid (Our Own Language).
For help with fonts go to:
- the IPA Unicode site
- Alan Wells' Unicode Resources
- the Microsoft typography site