THE CLIENT’S AND THERAPIST’S VOCAL QUALITIES

IN CBT AND PE-EFT FOR DEPRESSION

by

Beth Ellen Bernholtz

A thesis submitted in conformity with the requirements

for the degree of Doctor of Philosophy

Graduate Department of Applied Psychology and Human Development

Ontario Institute for Studies in Education

University of Toronto

© Copyright by Beth Ellen Bernholtz 2013

Abstract

The psychotherapy client’s vocal quality indicates the quality of and shifts in the client’s

engagement in treatment. In contrast, the therapist’s vocal quality is a treatment intervention,

either facilitating or hindering the client’s progress. The Client Vocal Quality (CVQ) and

Therapist Vocal Quality (TVQ) measures (Rice & Kerr, 1986) were applied to responses in the

middle 20 minutes of low and high change sessions for 61 clients who received either cognitive

behavioural therapy (CBT) or process-experiential emotion-focused therapy (PE-EFT) for

depression (Watson, Gordon, Stermac, Kalogerakos, & Steckley, 2003). Sessions were selected

using the Client Task Specific Change Measure-Revised (CTSC-R; Watson, Greenberg, Rice, &

Gordon, 1996). Outcome measures included the Beck Depression Inventory, Dysfunctional

Attitude Scale, Global Severity Index of the Symptom Checklist-90-Revised, Inventory of

Interpersonal Problems, Problem-Focused Style of Coping Scale, and Rosenberg Self-Esteem

Scale. The CVQ categories differentiated between treatment types, with PE-EFT clients

expressing a significantly higher proportion of Emotional vocal quality and CBT clients

expressing a significantly higher proportion of Externalizing vocal quality. There was no

difference between the treatment types for Focused or Limited vocal quality. The CVQ

categories predicted outcome. This relationship was particularly evident in the session in which

clients first reported moderate to high change on the CTSC-R. The combination of Emotional

and Focused categories was a stronger predictor than either category alone. In contrast, a higher

proportion of Externalizing vocal quality predicted worse scores. In terms of Therapist Vocal

Style, PE-EFT therapists spoke predominantly in the Softened-Irregular Vocal Style, while CBT

therapists spoke predominantly in the Natural-Definite Vocal Style. Therapists’ use of a

Softened-Irregular Vocal Style, compared with a Natural-Definite Vocal Style, was associated

with the client’s report, at the end of therapy, that he or she felt less easily exploited by others and

had a greater ability to assert interpersonal boundaries without fear of offending others.

Limitations of the study are discussed.

Acknowledgements

I would like to thank my supervisor, Dr. Jeanne Watson. I feel tremendously privileged to have

been your student and I am so grateful to you for supporting me and this research. I would also

like to thank my committee members Dr. Lana Stermac and Dr. Ruth Childs. Thank you so

much Lana for making me feel comfortable in the program since the beginning. And Ruth, thank

you for giving me your time and expert guidance. I always left our meetings feeling like I knew

more about statistics than I had thought! Thank you also to Evelyn McMullen, Jon Danson,

Aline Rodrigues, and Laura Gollino for doing such a wonderful job as raters on this project and

for your insights about vocal quality. And, to Olesya Falenchuk, I can’t imagine how my study

would have looked without your assistance. I greatly enjoyed our time together working on the

statistics for this project. I also want to thank Gillian Kerr and Sharon Rappaport for bringing

my study to life by providing essential training information as well as a wonderful historical

connection. Thank you also to Dr. Steve Hollon and Dr. Les Greenberg for providing audio of

psychotherapy sessions for training purposes. Finally, my family. To my beloved husband

Jeff, thank you for always understanding everything—I love you. To my kids, I am very proud

of you and think you are very astute observers of vocal quality. To my sister, Adrienne, thank

you for always sending happiness and love my way. To my mother, Marcia Krem, thank you for

loving me so much. You always make me feel that I have a special, unique purpose in this

world. To my father and stepmother, Marvin and Linda Feldman, I want to thank you for

supporting me through such difficult times in my life. I would not be here, in this happy,

successful place, without your love.

Table of Contents

Abstract

Acknowledgements

List of Tables

List of Figures

List of Appendices

Glossary

Chapter 1: Literature Review

Special Characteristics of Vocal Quality and Audition

Connecting to other people through imitation

Rhythm, vocal quality, and synchronizing states of consciousness

Rhythm, emotion, and vocal quality

Paralanguage

Psychotherapists listen to their clients’ vocal qualities

Personality

Emotional States

Level of arousal

Psychopathology

Using vocal quality to guide treatment interventions

The therapist’s vocal quality: The therapist’s vocal quality as a treatment intervention

Mutual influence of the client’s and therapist’s vocal qualities

Studies of the therapist’s vocal quality

Rice (1965)

Duncan, Rice, and Butler (1968)

Kerr (1983)

Summary of studies of the therapist’s vocal quality

The client’s vocal quality: How the client’s use of his or her own vocal quality can facilitate treatment

The physical act of speaking and emotional changes

Paying attention to how one sounds

Increasing emotional arousal

Studies of the client’s vocal quality

The client’s vocal quality and treatment outcome-Butler, Rice, and Wagstaff (1962)

The client’s vocal quality and treatment outcome-Rice and Wagstaff (1967)

The client’s vocal quality and other treatment orientations-Sarnat (1976) and Nixon (1980)

The client’s vocal quality and in-session-Greenberg (1983)

The client’s vocal quality and in-session processes-Watson and Greenberg (1996)

Summary of the client’s vocal quality, treatment outcome, and in-session processes

The therapist’s and client’s vocal qualities together-Butler, Rice, and Wagstaff (1962)

The therapist’s and client’s vocal qualities together-Wiseman and Rice (1989)

Different Treatments, Different Demands

Methods of Studying vocal quality in the Psychotherapy Setting

Summary

Research questions and hypotheses

Research Question 1

Chapter 2: Method

Participants

Therapist

Treatments

Process Measures

Client Vocal Quality Scale (CVQ)

Therapist Vocal Quality Scale (TVQ)

Outcome Measures

Beck Depression Inventory (BDI)

Dysfunctional Attitudes Scale (DAS)

Problem-Focused Style of Coping (PF-SOC)

Inventory of Interpersonal Problems (IIP)

Rosenberg Self-esteem Scale (RSES)

Symptom Checklist-90-Revised (SCL-90-R)

Post-Session Outcome Measure

Client Task-Specific Change Measure-Revised (CTSC-R)

Procedure

Session selection

Preparation of materials

CVQ Training

Rating the CVQ data set

TVQ Training

Rating the TVQ data set

Descriptive Statistics for the Outcome Measures

Alpha level

TVQ Inter-rater reliability on the data set

TVQ Descriptives

CVQ Inter-rater reliability

Boxplots of CVQ Data

Chapter 3: Results

Research Question 1-Vocal quality and the client’s report of change

Research Question 2-Vocal qualities and the client’s scores on outcome measures

Hypothesis 2a-Productive CVQ categories will predict better scores for clients on the outcome measures at the end of treatment

Emotional vocal quality in the first report of moderate to high change session

Focused vocal quality in the first report of moderate to high change session

Emotional Plus Focused vocal quality in the first report of moderate to high change session

Additional Analyses of Externalizing and Limited vocal qualities in the first report of moderate to high change session for Research Question 2a

Externalizing vocal quality in the first report of moderate to high change session

Additional analyses of Limited vocal quality and Externalizing vocal quality in the session with the lowest change score for Research Question 2a

Additional analyses of Limited vocal quality and Externalizing vocal quality in the session with the highest change score for Research Question 2a

Hypothesis 2b-Productive TVQ categories will predict better scores for clients on the outcome measures at the end of treatment

Research Question 3-Differences between therapist and client vocal qualities and treatment types

Additional tests addressing Research Question 3 and the client’s vocal quality

Exploration of Hypothesis 3b-Therapist Vocal Style and treatment types

Summary of results

Chapter 4: Discussion

CVQ predicts clients’ scores on outcome measures at termination

Emotional Plus Focused vocal quality predict more favourable treatment outcomes in the first report of moderate to high change session

Limited vocal quality predicts more favourable treatment outcomes in the first report of moderate to high change session

Externalizing vocal quality predicts worse treatment outcomes in the first report of moderate to high change session

No difference in CVQ in the sessions with the lowest change score and the session with highest change score

Client and therapist vocal qualities differentiate the treatment types

Indirect support for other studies comparing different treatment types

CBT clients expressed a higher proportion of Externalizing vocal quality than PE-EFT clients

There was no difference between the treatment types for Focused vocal quality

Natural-Definite and Softened-Irregular Therapist Vocal Style and the treatment types

Strengths of the current study

Limitations of the study

Future Research

Implications for practice

Conclusion

References

List of Tables

Table 1 Features of the vocal quality Qualities and Manner of Speaking categories for the Client Classification System

Table 2 Client Characteristics at Pre-treatment

Table 3 Preliminary Inter-Rater Reliability Between the CVQ Raters and Each Rater and the Expert

Table 4 Preliminary Inter-Rater Reliability Between the TVQ Raters and Each Rater and the Expert

Table 5 Results of Cluster Analysis for the TVQ Observers (N = 59 Sessions)

Table 6 Results of Cluster Analysis on the TVQ Data Set (N = 177 Sessions)

Table 7 Crosstabulation Tables for Therapist Vocal Style by Treatment Type in Sessions with the Lowest and Highest Change Scores

Table 8 Crosstabulation Tables for Therapist Vocal Style by Treatment Type in the First Report of Moderate to High Change Session

Table 9 Intraclass Correlation Coefficients for CVQ Inter-Observer Agreement (N = 63)

Table 10 Results for Hypothesis 1a and b: Wilcoxon Signed-rank test – Mean Ranks, Ties, Z statistics, p values for the CVQ categories in the Highest Change Score Sessions and the Lowest Change Score Sessions (N = 61)

Table 11 A Higher Proportion of CVQ Category Predicts Outcome Scores at Post Treatment

Table 12 Results for Hypothesis #3a: Mean Ranks, Z statistics, Effect Sizes, and p values of CVQ Categories in Lowest plus Highest Change Score sessions by Treatment Type (N = 61 Clients)

Table 13 Results for Hypothesis #3a: Mean Ranks, Z statistics, Effect Sizes, and p values of CVQ Categories in First Report of Moderate to High Change Score sessions by Treatment Type (N = 58 Clients)

List of Figures

Figure 1 Crosstabulation matrix for the Focused vocal quality test with the main rater and expert

Figure 2 Boxplots of CVQ categories in first report of moderate to high change sessions (N = 60)

Figure 3 Boxplots of CVQ categories in sessions with the lowest change score and sessions with the highest change score (N = 63)

List of Appendices

Appendix A Ranges for interpretation of statistics

Appendix B Spearman’s Rho Correlations Between Outcome Measures

Appendix C, Figure C1 Boxplots of CVQ Categories in Sessions with the Lowest Change Score (N = 61)

Appendix C, Figure C2 Boxplots of CVQ categories in Sessions with the Highest Change Score (N = 61)

Appendix C, Figure C3 Boxplots of CVQ categories in First Report of Moderate to High Change Session (N = 58)

Appendix D Means, Standard Deviations, and Medians for CVQ Categories in the Session with the Lowest Change Score and the Session with the Highest Change Score

Appendix E Means, Standard Deviations, and Medians for Treatment Groups by CVQ Category in the First Report of Moderate to High Change Session

Appendix F, Table F1 Results for Hypothesis 2a: Standard (β) and Unstandardized (B) Regression Coefficients, their Standard Errors, and p values Emotional Vocal Quality in the Session with the Highest Change Score

and p values Focused Vocal Quality in the Session with the Highest Change Score

and p values Emotional Plus Focused Vocal Quality in the Session with the Highest Change Score

Appendix G, Table G1 Results for Hypothesis 2a: Standard (β) and Unstandardized (B) Regression Coefficients, their Standard Errors, and p values Emotional Vocal Quality in the Session with the Lowest Change Score

and p values Focused Vocal Quality in the Session with the Lowest Change Score

Unstandardized (B) Regression Coefficients, their Standard Errors, and p values Emotional Plus Focused Vocal Quality in the Session with the Lowest Change Score

Appendix H, Table H1 Results for Hypothesis 2a: Standard (β) and Unstandardized (B) Regression Coefficients, their Standard Errors, and p values for Emotional Vocal Quality in the Session with the First Report of Moderate to High Change

p values for Focused Vocal Quality in the Session with the First Report of Moderate to High Change

p values for Emotional Plus Focused Vocal Quality in the Session with the First Report of Moderate to High Change

Appendix I, Table I1 Results for Hypothesis 2a: Standard (β) and Unstandardized (B) Regression Coefficients, their Standard Errors, and p values for Limited Vocal Quality in the Session with the First Report of Moderate to High Change

Appendix I, Table I2 Results for Hypothesis 2a: Standard (β) and Unstandardized (B) Regression Coefficients, their Standard Errors, and p values for Externalizing Vocal Quality in the Session with the First Report of Moderate to High Change

Appendix J, Table J1 Results for Hypothesis 2a: Standard (β) and Unstandardized (B) Regression Coefficients, their Standard Errors, and p values Limited Vocal Quality in the Session with the Lowest Change Score

Appendix J, Table J2 Results for Hypothesis 2a: Standard (β) and Unstandardized (B) Regression Coefficients, their Standard Errors, and p values Externalizing Vocal Quality in Session with the Lowest Change Score

Appendix K, Table K1 Results for Hypothesis 2a: Standard (β) and Unstandardized (B) Regression Coefficients, their Standard Errors, and p values Limited vocal quality in the Session with the Highest Change Score

Appendix K, Table K2 Results for Hypothesis 2a: Standard (β) and Unstandardized (B) Regression Coefficients, their Standard Errors, and p values Externalizing Vocal Quality in the Session with the Highest Change Score

Appendix L, Table L1 Results for Hypothesis #2b: Standard (β) and Unstandardized (B) Regression Coefficients, their Standard Errors, and p values for Post-Treatment Scores for Outcome Measures by Therapist Vocal Style (Softened-Irregular and Natural-Definite) for Sessions with the Lowest and Highest Change Scores

Appendix L, Table L2 Results for Hypothesis #2b: Standard (β) and Unstandardized (B) Regression Coefficients, their Standard Errors, and p values for Post-Treatment Scores for Outcome Measures by Therapist Vocal Style (Softened-Irregular and Natural-Definite) in First Report of Moderate to High Change Scores

Glossary

Some terms in this dissertation are complex. This glossary can be used as a guideline while

reading.

Session with the lowest change score: This is one of the groupings of sessions used in

the analyses. Sessions placed in this group are those with the lowest CTSC-R score.

Session with the highest change score: This is one of the groupings of sessions used in

the analyses. Sessions placed in this group are those with the highest CTSC-R score.

First report of moderate to high change session: This is one of the groupings of sessions

used in the analyses. Sessions placed in this group are those that are the first session

scored as 5 or more on the CTSC-R.

Low change sessions: Sessions with the lowest change score.

High change sessions: Either sessions with the highest change score or the first report of

moderate to high change session, depending on the analysis.
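
The three session groupings defined above can be illustrated with a short sketch. It is an illustration only, not the selection procedure used in the study: the function names, the zero-based session indices, and the choice of the earliest session when scores tie are all assumptions. Only the cut-off of 5 on the CTSC-R is taken from the glossary.

```python
from typing import Optional, Sequence

# From the glossary: the first session "scored as 5 or more on the CTSC-R".
MODERATE_CHANGE_THRESHOLD = 5


def lowest_change_session(scores: Sequence[float]) -> int:
    """Index of the session with the lowest CTSC-R score.

    Assumption: when several sessions tie, the earliest one is returned.
    """
    return min(range(len(scores)), key=lambda i: scores[i])


def highest_change_session(scores: Sequence[float]) -> int:
    """Index of the session with the highest CTSC-R score.

    Assumption: when several sessions tie, the earliest one is returned.
    """
    return max(range(len(scores)), key=lambda i: scores[i])


def first_moderate_to_high_session(
    scores: Sequence[float],
    threshold: float = MODERATE_CHANGE_THRESHOLD,
) -> Optional[int]:
    """Index of the first session scored at or above the threshold.

    Returns None when no session reaches the threshold.
    """
    for index, score in enumerate(scores):
        if score >= threshold:
            return index
    return None


# Hypothetical CTSC-R scores for one client, in session order.
example_scores = [3.0, 2.5, 5.5, 6.0, 4.0]
lowest = lowest_change_session(example_scores)
highest = highest_change_session(example_scores)
first_moderate = first_moderate_to_high_session(example_scores)
```

In this hypothetical example the first report of moderate to high change (the third session) is not the session with the highest change score (the fourth), which is why the glossary treats the two as separate groupings. A client whose scores never reach 5 has no session in the third grouping; the sketch returns None in that case.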

Acronyms

CTSC-R: Client Task Specific Change Measure-Revised

CBT: Cognitive Behavioural Therapy

PE-EFT: Process-Experiential Emotion-Focused Therapy

Chapter 1:

Literature Review

Special Characteristics of Vocal Quality and Audition

Some researchers estimate that 55% of communication is nonverbal, consisting of

behaviours such as physical movements, facial expressions, and vocal quality characteristics

(Tepper & Haas, 1978). Of these behaviours, a person’s vocal quality is considered to be the

most accurate indicator of his or her inner state because it reflects physiological correlates of

emotion (Moses, 1954) and because it is the most difficult of the nonverbal behaviours to control

by choice (e.g., Ekman & Friesen, 1969). Vocal quality is also seen as transmitting more

information about the speaker than his or her words, contradicting, underscoring, undermining,

or calling into question the meaning of spoken content (Egan, 1998).

Despite the potential power to alter the impact of words, the sounds of a person’s vocal

quality are constantly changing and fleeting. For example, even if sound is recorded on

audiotape or video, it still cannot be caught, held still, and examined like a visual or tangible

object (Winkel, 1996, as cited in Ross, 2001). Sound also connects people to the world in ways

that the other senses cannot. For example, people hear sounds caused by objects that are out of

their visual range and people can be alerted to events occurring behind them by hearing them

first, before seeing them (Sabbadini, 1997, as cited in Ross, 2001).

Psychoanalysts have lauded vocal quality and audition for their special properties. One

reason for this is that they consider the sound of a patient’s mother’s vocal quality, even before

birth, to be among the first important experiences for the patient that can be accessed in

psychotherapy (e.g., Wrye, 1997). In addition, in spite of its apparent intangibility,

psychoanalysts have referred to sound as providing a physical “contact” experience (Bady,

1985). Sound has been described as having a physical impact on the listener that enables him or

her to hear (e.g., Ackerman, 1990) and as forming a “psychobiological bridge” between a

person’s physical/biological functioning and his or her psychological functioning (Stone, 1961,

p. 86).

Bady (1985) referred to the work of Niederland (1985) when commenting about the

physical aspects of sound that enable people to hear:

Sound waves from a person’s voice or other sources are transmitted through a medium of

air to create tiny, yet definite impressions on the skin and eardrum. A loud noise is thus

felt as well as heard. At lower frequencies, there is a gentle but definite vibration distinct

from and superimposed on the sound. (Bady, 1985, p. 488)

Ackerman (1990) described the links between sound and hearing at an even more fundamental

level, explaining that sound waves cause molecules in the air to vibrate. This vibration causes

molecules next to one another to vibrate as in a chain reaction. Molecules move one another

until they enter the ear, where they vibrate against the:

Three colorfully named bones (the hammer, the anvil, and the stirrup), the tiniest bones in

our body . . . [to] press fluid in the inner ear against membranes, which brush tiny hairs

that trigger nearby nerve cells, which telegraph messages to the brain: we hear.

(Ackerman, 1990, p. 177)

Referring to Stone’s (1961) idea that vocal quality forms a “psychobiological bridge” (p. 86)

with other people, Wrye (1997) wrote that unborn babies literally feel their mother’s vocal

qualities and heartbeats “even before birth, as the infant, rocked and rolled in the quadraphonic

audio chamber of the womb, heard and felt her voice” (p. 360). The physical sensation of vocal

quality continues to dominate the baby’s world even after birth. Stern (1990) wrote about what it

must be like for a 6-week-old baby to hear his parents speak to him:

Step into Joey’s earliest world and . . . imagine that none of the things you see or touch or

hear have names or functions, and few any memories attached to them. Joey experiences

objects and events mainly in terms of the feelings they evoke in him . . . . When his

parents call him “honey”, he doesn’t know that honey is a word and refers to him. He

doesn’t even particularly notice it as a sound distinct from a touch or a light. But he

attends carefully to how the sound flows over him. He feels its glide, smooth and easy,

soothing him; or its friction, turbulent and stirring him up, making him even more alert.

Every experience is like that, having its own special feeling tone. (p. 13)

The “psychobiological bridge” formed by speech enables a child to develop, eventually

separating from his or her mother, while still feeling the safety of connection to her (Stone, 1961,

pp. 85-86). Nass (1971) provided an example of this in explaining that the baby feels connected

to his or her mother just by hearing her vocal quality or the sounds of her activities in another

room. Nass (1971) stated that the mother’s vocal quality allows the child to “maintain the object

at a distance” and that hearing her vocal quality enables the child to hold on to her even though

she is not within physical reach (p. 309).

In addition to sound providing a kind of physical contact that enables the child to

develop, sound is an integral part of the child’s emotional growth. Moses (1954), a laryngologist

who wrote at length about the connections between vocal quality, human development, and

psychopathology, explained that as the child grows, he or she makes constant sensory and

emotional associations with sound. For example, the baby links the enjoyable sounds he

creates with his voice to the pleasant physical experience involved in making them.

Examples of such physical experiences might include the feeling of running his tongue along the

insides of his mouth or feeling the rush of air through his lips, filling up his lungs as he prepares

to make a joyful shriek. In addition to associating his own vocal quality with his physical

sensations, the child also makes associations between physical experience and the sounds of his

environment. For example, the sounds of anger, arguing, and bickering in the home would be

deeply associated with the child’s somatic experience of tension or fear.

Such links between vocal quality and feeling occur early on, before the child even learns

to speak. Moses (1954) wrote that, “family and group relationships have been symbolized by

vocal patterns before verbal thinking was mastered” (p. 27). These vocal pattern-feeling

associations are then linked to vocabulary as the child learns to speak in words. The result is

then heard in the child’s vocal quality as described by Moses (1954): “The voice of the child is

like a complicated engraving etched by experience with multitudinous fine traces which cannot

be easily eradicated” (p. 27). For Moses, the end result of the child’s vocal pattern-feeling

development is that vocal quality is the primary transmitter of the speaker’s personality and state

of mental health.

Connecting to other people through imitation

In addition to the human vocal quality and audition playing necessary roles in a child’s

development, Moses (1954) has asserted that they have been crucial in phylogenesis, or the

development of the human species. Moses explained that primitive humans understood or

acquired new information by imitating the new stimuli in their environment. Referring to

Bernfeld (1928) and his theories of fascination, Moses (1954) wrote, “primitive perception is

close to motor reaction. The primitive ego imitates what it perceives in order to master intense

stimuli” (p. 11). Moses (1954) also quoted Bernfeld (1928) when he wrote that, “perceiving and

changing one’s own body according to what is perceived were originally one and the same

thing” (Moses, 1954, pp. 11-12).

According to Moses (1954) people continue to use this primitive identification process,

though most do so unconsciously. For example, Moses wrote that listeners instinctively

reproduce the physical actions that the speaker must be making in order to emit particular sounds

or tones of vocal quality. Moses explained that this phenomenon is the reason so many people

were awed by legendary speakers like Franklin Delano Roosevelt and Adolf Hitler. These

speakers were able to captivate their audiences “with their superior breathing techniques that the

listener tends to duplicate” (Moses, 1954, pp. 38-39). Moses suggested that for early humans,

this duplication process likely began as pantomiming with physical gestures and vocal sounds.

An example might be hunters using their body movements as well as vocal qualities to tell the

story of the hunt to their clan. They likely imitated other sounds in their environment which,

over time, developed into words and eventually sentences and language.

Rhythm, vocal quality, and synchronizing states of consciousness

Anthropologist Byers (1979) wrote that each person and animal functions according to

underlying biological rhythms. People operate according to individual rhythms and their

interactions with one another create their own unique rhythmic states. In addition, according to

Byers, people can synchronize one another into a similar rhythmic state or state of

consciousness. Byers reported observing this in conversations between warrior Yanomamo

leaders whose distinct rhythm of conversation appeared to have a peace-keeping function. Byers

(1979) concluded that “two interactants in a tight synchrony necessarily and biologically are

brought into the same state (of consciousness) by virtue of their mutual entrainment” (p. 416).

Synchronizing states of consciousness as a means of healing has also been observed in

primitive healing ceremonies. Byers (1979) referred to the work of Coberly (1972) who studied

healing ceremonies in diverse cultures:

Coberly (1972) has examined the sequences, or processes, involved in shamanistic curing

ceremonies in ten cultures, and has shown that, despite the variety of cultural content, the

process is always the same: the shaman collects a group, synchronizes the group through

dance, movement, song, chanting, etc., and then brings in the state deviant patient who

joins—is entrained by—the group and is brought to their state in synchronous

participation. In eastern religions, and in numerous primitive religions, instrumental

means (chanting, breathing, dancing, singing, etc.) are used to change the state of

consciousness of individuals or to bring a group to the same state of consciousness.

(Byers, 1979, p. 416)

This finding, as well as Bernfeld’s (1928, as cited in Moses, 1954) observation that people

instinctively imitate others, has also been noted in current psychology as emotional contagion.

Emotional contagion has been defined as “the tendency to automatically mimic and synchronize

movements, expressions, postures, and vocalizations with those of another person, and

consequently, to converge emotionally” (Hatfield, Cacioppo, & Rapson, 1992, pp. 153-154, as

cited in Rosner, Beutler, & Daldrup, 2000, p. 3). In terms of creating effective social

interactions, Giles and Coupland (1991) described a similar concept rooted in communication

accommodation theory, or CAT. In CAT, the convergence of the nonverbal behaviours of

people engaged in dialogue occurs because of, “a speaker’s or group’s need (often nonconscious)

for social integration or identification with another” (Giles & Coupland, 1991, pp. 71-72 as cited

in Gregory, Green, Carrothers, Dagan, & Webster, 2001, p. 38). Giles and Coupland (1991)

listed a number of vocal characteristics that can converge during a conversation such as

“linguistic/prosodic/non-vocal features including speech rate, pausal phenomena and utterance

length, phonological variants…and so on” (p. 63 as cited in Gregory et al., 2001, p. 38).

Rhythm, emotion, and vocal quality

Byers (1979) referred to the work of Stetson (1905, 1951) and Lashley (1951) when

explaining the importance of vocal rhythm in conveying emotional information. Stetson (1905)

stated that a person’s vocal quality is the “most important natural rhythm-producing apparatus”

(p. 257). However, the rhythms produced by the human vocal quality are different from the

exact recurring rhythms produced by a machine. Instead, the rhythm of vocal quality is

“inexact” as it mirrors the “rapidly shifting inner affect states” of the speaker (Byers, 1979, p.

402). In addition to vocal quality reflecting an underlying emotional rhythm, Lashley (1951)

explained that people function on different levels of organization and that the rhythm of speech

is located on a different level than the content of speech. Lashley (1951) wrote, “the mechanism

[rhythm] which determines the serial activation of the motor units [speech] is relatively

independent, both of the motor units [speech] and of the thought structure” (p. 118). Stetson

(1951) also acknowledged the division between rhythm and words in his example of how hard it

is to learn a foreign language, explaining, “the rhythm is certainly one of the most fundamental

characteristics of the utterance of a language, and is most difficult for a foreigner to acquire” (p.

124).

The early psychoanalysts also commented on this distinction between manner and

content of speech. Gillies (1990) stated that Reich (1928/1950) recognized the division between

“what” a person says and “how” he or she says it (Gillies, 1990, p. 24). Sullivan (1954, as cited

in Gillies, 1990) described this division as “vocal” and “verbal”. The “verbal” category refers to

spoken words, while the “vocal” category developed into what is currently called “paralanguage”

(Gillies, 1990, pp. 24-25).

Paralanguage

Otswald (1979) explained that researchers began investigating which human vocal

sounds convey emotion as a result of the development of the telephone. According to Otswald

(1979), as scientists worked to make speech transmitted through the telephone understandable,

they discovered that “spoken language is highly redundant, i.e., that when a person speaks he

produces many more acoustic signals than are necessary for correct speech perception (Shannon

and Weaver, 1949)” (Otswald, 1979, p. 261). Otswald wrote that this finding prompted

researchers to discover which of these sounds carry the emotional tone of what people say as

well as to understand how the acoustic qualities of a person’s voice are connected to his or her

psychiatric state.

Regarding the acoustic aspects of speech, Meservy and Burgoon (2008) referred to

Trager (1958) as well as Pittenger, Hockett, and Danehy (1960) as pioneers in the field of

paralanguage, or the study of how communicative behaviours, other than spoken content, convey

meaning. Meservy and Burgoon (2008) defined paralanguage this way:

Paralanguage refers to the nonverbal elements of speech – such as vocal pitch, intonation,

and speaking tempo – that can be used to communicate attitudes, convey emotion, or

modify meaning. In simple terms, paralanguage can be thought of as how something is

said rather than what is said. (no page number)

Scherer and Oshinsky (1977) gave examples of paralinguistic terms used to characterize different

emotions. For example, small variation in pitch and slow tempo are paralinguistic descriptors of

boredom. In contrast, wide variation in pitch and fast tempo can be used to describe the sounds

of a person expressing happiness.

These terms describe the physical expressions of physiological changes that accompany

the speaker’s emotional state. As Ozdas, Shiavi, Silverman, Silverman, and Wilkes (2004)

explained, “there is considerable evidence that emotional arousal produces changes in the speech

production scheme by affecting the respiratory, phonatory, and articulatory processes that in turn

are encoded in the acoustic signal” (pp. 1530-1531). These physical changes thus alter the way

the speaker’s vocal quality sounds. Otswald (1979), referring to his earlier work (1960),

suggested that variations in “acoustic patterns are subject not only to the physiological laws

governing respiration and phonation, but to social rules regarding acoustic expression of emotion

as well” (Otswald, 1979, p. 261).

Psychotherapists listen to their clients’ vocal qualities

Otswald (1979) referred to Moses (1954) as one of the clinicians whose interest in vocal

quality as a transmitter of emotional information was motivated by the discoveries coming from

the development of telephone communication. Although Moses was a laryngologist, he asserted

that conditions such as neurosis and schizophrenia were diagnosable from patients’ vocal

qualities alone. Moses urged psychoanalysts to increase their understanding of the human vocal

quality as a tool for diagnosing mental disorders.

Moses (1954) also pointed out that because of the physical arrangement specified at that

time in psychoanalysis, in which the patient lies on the couch facing away from the analyst, the

analyst relies almost entirely on the patient’s vocal quality for information about his or her

psychic state. Referring to the work of clinicians as far back as Freud (1893) and Reich (1949),

Davis and Hadicks (1990) wrote that, “it has long been recognized that evaluation of nonverbal

behavior is an integral part of clinicians’ assessment of clients’ psychological states” (p. 340).

Of the nonverbal cues, the client’s vocal quality continues to be valued as an important source of

information for the psychotherapist. Vocal quality has been appreciated not only as an indicator

of psychopathology (Moses, 1954), but also as a reflection of the client’s personality (e.g., Rice

& Gaylin, 1973), changing emotional states (e.g., Stetson, 1905), and intensity of emotional arousal

(e.g., Warwar & Greenberg, 1999), and as a micro-marker of cognitive-affective processing (e.g.,

Elliott, Watson, Goldman, & Greenberg, 2004). Having this information allows the therapist to

gauge the appropriateness, timing, and effect of specific treatment interventions.

Personality.

Moses (1954) asserted that vocal quality, “is the primary expression of the individual” (p.

1). The connection of a person’s vocal quality to his or her personality was understood early on

as is seen in the Latin word for a drama character’s mask: persona. Moses (1954) explained that

persona means “per sona: the sound of the voice passes through” (p. 7), reflecting that vocal

quality conveys the essence of one’s personality. Moses (1954) regretted that persona eventually

evolved into the contemporary word, personality, bypassing its essential connection to vocal

quality.

However, vocal quality continues to be regarded as an indicator of personality. Moses

(1954) referred to his earlier investigations into vocal quality and personality in which he

described an adolescent man’s personality only from having listened to a phonographic recording

of his vocal quality. He remarked that his findings, based on the vocal analysis and described in

acoustic terms, corresponded well to his interpretation of the young man’s Rorschach protocol.

More than 30 years after Moses’ (1954) informal investigation, Rice and Gaylin (1973)

explored whether the client’s vocal quality was related to personality when tested on the

Minnesota Multiphasic Personality Inventory (MMPI) and on the Rorschach. The measure of

the client’s vocal quality they used is the Classification System for Client Vocal Quality (Rice &

Wagstaff, 1967), currently known as the Client Vocal Quality Scale (CVQ; Rice & Kerr, 1986).

While this classification system uses paralinguistic terms, the terms are grouped together to

define specific vocal quality categories called Focused, Externalizing, Limited, and Emotional.

The paralinguistic terms used include energy, stress, pitch, terminal contours, cadence, and

resonance (Rice & Gaylin, 1973).

Rice and Gaylin (1973) explained that, “previous evidence had suggested that vocal

quality reflects the kinds of resources that the client brings to the therapy situation, rather than

being related to particular kinds of psychopathology” (p. 134). Three of the four previously

mentioned categories were analyzed: 1) Focused, 2) Externalizing, and 3) Limited. The excluded

category, Emotional, was not expressed enough in the data set to be interpreted. The three vocal

quality categories were expected to reflect distinct personality styles. For example, clients using

predominantly Focused vocal quality were believed to have personality styles in which they

would pay attention to their psychological experience and use high energy to explore it. This

exploration could be heard in their vocal qualities as “groping and hesitation” associated with

“the pondering quality of one who is actively feeling his way into new territory” (Rice & Gaylin,

1973, p. 134). The person speaking in an Externalizing vocal quality, however, was thought to

be turning his or her attention to the outside and using his or her vocal energy “instrumentally to

accomplish something in the outside world.” The client may appear expressive, but the

expressiveness has a “‘talking at’ quality” (Rice & Gaylin, 1973, p. 134). The Limited category

leaves the impression “of limited involvement, of distance from what is being said. There is a

fragile, walking-on eggs quality that suggests a distancing or even passivity” (Rice & Gaylin,

1973, p. 134).

Fifty-two clients who had received client-centered therapy were tested on the MMPI and

the Rorschach. Samples of their therapy sessions were rated on the CVQ. While vocal quality

and MMPI scores were not significantly related, vocal quality categories and Rorschach scores

were. The Rorschach scores used in this study are called Rorschach function scores (Gaylin,

1966), which had been derived from a previous study. The Rorschach function scores reflected

the client’s “immediately available resources for engaging in a creative perceptual process” (Rice

& Gaylin, 1973, p. 134).

The results of the study indicated that clients with a Focused vocal quality had Rorschach

function scores showing high energy and high “internal organizational complexity” (Rice &

Gaylin, 1973, p. 137). Rice and Gaylin (1973) suggested this meant that clients using

predominantly Focused vocal quality “were able to bring to the therapy task resources of a high

order, immediately usable energy coupled with the availability of inner input, which could be

explored in complex and creative ways” (p. 137). In contrast, clients using Externalizing vocal

quality had fewer and less complex Rorschach responses. This finding was interpreted as

providing support for the clinical impressions and some research which suggested that people

speaking with Externalizing vocal quality lack awareness of their inner experience and that they

have “a preoccupation with the formal at the expense of the affective” (Rice & Gaylin, 1973, p.

137). However, Rice and Gaylin (1973) explained that the nature of the Rorschach test did not

enable the researchers to determine whether or not the client directed his or her attention outward

in order to have an impact on other people.

For clients with Limited vocal quality, the low number of Rorschach responses

corresponded to the low energy defining that vocal quality category. However, clients speaking

primarily in Limited vocal quality also responded with a large proportion of nonform

determinants. Rice and Gaylin (1973) explained that this finding suggested that Limited speakers

are quite aware of their emotions, but that they may suffer from “too much affectivity, imagery,

etc., compressed into too meager an output” (p. 138). Lastly, their Rorschach responses showed

the lowest organizational ability, suggesting “a withholding of affect that is potentially available,

possibly in quantities too difficult to handle” (Rice & Gaylin, 1973, p. 138).

Rice and Gaylin (1973) suggested that in order for clients to make use of client-centered

therapy they must be able to make “an intensive, self-directed inner search” which “requires

precisely the kind of functioning that is characteristic of the focused group,” which is “the ability

to interact freely with one’s own affect, imagery, impulses, etc., yet with controlled

concentration rather than free association” (p. 138). Because their findings linked vocal quality

to different personality-related capacities for engaging in psychotherapy, the researchers

suggested future research could focus on determining whether people with particular vocal

styles, as well as other characteristics, were better suited for particular treatments than others.

Emotional States.

Therapists also attend to their clients’ observable nonverbal behaviours for the information

these behaviours provide about the clients’ internal emotional experience (e.g., Elliott, Watson, Goldman, &

Greenberg, 2004). Greenberg and Johnson (1988) explained that, “nonverbal emotional

expression is clearly a visible and observable signal accompanying an emotional state” (p. 15).

While this is particularly important for therapists trained in emotion-oriented treatments like

gestalt therapy (Rosner, 1996), therapists from other treatment orientations also value nonverbal

information. For example, Bady (1985) explained psychoanalyst Karpf’s (1980) position that

vocal qualities are “among the other nonverbal cues reflecting intrapsychic conflict in the

patient and alerting the therapist to opportune times for intervention” (Bady, 1985, p. 480). Also,

even though cognitive behaviour therapy (CBT) is not known for its focus on the client’s

emotional state, exercises such as thought records include the client’s emotional state before and

after formulating an alternative view of a disturbing experience (see psychologytools.org for a

thought record worksheet). Helping the client identify his or her emotions during this exercise

can come from drawing the client’s attention to his or her expressions of sadness or relief, for

example.

Of the nonverbal behaviours, vocal quality is an especially clear indicator of emotion.

For example, listeners are able to discern some emotional states from the speaker’s vocal quality

alone, without words (e.g., Mohr, Shoham-Salomon, Engle, & Beutler, 1991). Listeners can also

identify emotion from vocal quality samples just as well as, and in some cases even more

effectively than, when they can also see the speaker’s facial expressions (Kappas, Hess, &

Scherer, 1991). In addition, using “content-free” vocal quality samples, Scherer, Banse, and

Wallbott (2001, p. 87) demonstrated that there is significant cross-cultural agreement in

identifying emotions just from vocal cues. Scherer et al. (2001) had 428 participants from nine

countries listen to the “content-free” (p. 87) vocal quality samples of German actors expressing

different emotions. There was significant agreement among listeners about which emotions were

being expressed in the vocal quality samples. This led Scherer et al. (2001) to comment that

there may be “similar inference rules from vocal expression [of emotion] across cultures” (p. 76).

However, there was also enough disagreement that Scherer et al. (2001) “concluded that culture-

and language-specific paralinguistic patterns may influence the decoding process” (p. 76).

Level of arousal.

Sounds in the client’s vocal quality, such as squeaks, mumbles, droning, and angry shouts,

signify different emotional intensities. Evaluating the intensity of the client’s

emotional expression is important to clinicians for a number of reasons. The client’s high

intensity expression alerts the CBT therapist to the arousal of “hot” cognitions (e.g., Samoilov &

Goldfried, 2000) and the PE-EFT therapist to activation of important emotion schemes (e.g.,

Greenberg, Rice, & Elliott, 1993). To gauge arousal in research contexts, scales have been

developed such as the Client Emotional Arousal Scale revised (CEAS-r; Machado, 1992) and

Client Expressed Emotional Arousal Scale III—Revised (CEAS; Warwar & Greenberg, 1999).

Both are ordinal scales in which the intensity of the speaker’s vocal quality is used as one of

several indicators of emotional arousal.

Research using the CEAS-r has shown a correspondence between emotional arousal and

problem resolution (Greenberg & Malcolm, 2002) as well as the client’s report of change after

a session (Goldstein, 2002). Ratings using the CEAS-r have also differentiated between

treatment approaches, as seen in the study conducted by Rosner, Beutler, and Daldrup (2000) of

a cognitive therapy and an emotional expressive therapy.

Carryer and Greenberg (2010) used the CEAS to investigate the most therapeutically

productive amount of emotional intensity. They wrote that using the CEAS to rate the client’s

arousal from videotapes involves evaluating the level of emotion in the client’s vocal quality:

“Emotional vocal quality is indicated by irregular patterns of accentuation, an uneven regularity

of pace, and unexpected terminal contours, suggesting accessibility to feelings” (Carryer &

Greenberg, 2010, p. 193). The authors found that a moderate amount of emotional expression

was more beneficial than either low or high amounts. This tends to fit with the idea that

emotional arousal is effective if it is at an intensity that activates the client’s problematic

emotion, but still permits the client to make new cognitive sense of his or her experience in the

session (e.g., Kennedy-Moore & Watson, 1999; Marks, 1991).

Psychopathology.

In addition to promoting the patient’s vocal quality as a diagnostic tool, Moses (1954)

wrote that, “one can go so far as to say that vocal expression is a record of the history of

mankind as well as a record of the individual” (p. 5). Moses (1954) presented the schizophrenic

patient’s vocal quality as evidence of this because it “has a marked archaic character, with

primordial attributes” (p. 5). In contrast, the neurotic patient’s vocal quality reflects his own

delayed development.

Though Moses’ (1954) explanations may not seem plausible today, his premise that the

speaker’s vocal quality reveals his or her psychological state still holds and continues to be

supported by research linking mental health to paralinguistic cues and emotional arousal. Ozdas

et al. (2004) described the diagnostic value “at a noncontent level” (p. 1530) of the speaker’s

vocal quality from a physiological, emotional arousal perspective. Ozdas et al. (2004) wrote:

There is considerable evidence that emotional arousal produces changes in the speech

production scheme by affecting the respiratory, phonatory, and articulatory processes that

in turn are encoded in the acoustic signal. This is largely due to the fact that vocalization

reflects the activity of many different aspects of the functioning of the neurophysiological

structures. (pp. 1530-1531)

Acoustic parameters measure different qualities of the speaker’s vocal quality. The

parameter used most widely in psychopathology investigations is fundamental frequency (fo)

(e.g., Ozdas et al., 2004). Fundamental frequency is the sound wave that “color[s]” (Scherer &

Zei, 1988, p. 179) the speaker’s vocal quality with prosodic and emotional information (e.g.,

Ellgring & Scherer, 1996). The fundamental frequency of a person’s vocal quality is heard by

listeners as baseline pitch or that aspect of a speaker’s vocal quality that is unique and identifying

of him or her.

According to Scherer and Zei (1988), there are several factors that influence the speaker’s

pitch, including emotional arousal. The reason for the latter is that sound waves are created by

air being pushed from the lungs through the speaker’s vocal folds. The vocal folds are muscles

which open and close at different rates depending on their level of tension. Tension in the vocal

folds is related to the “overall muscle tension of the speaker” (Scherer, 1979b; Scherer, 1986, as

cited in Ellgring & Scherer, 1996, p. 87). An example of this is that a relaxed person sounds

calm because “relaxed muscular walls of the vocal resonators… ‘damp’, stop or absorb high

frequencies and produce a mellow tone” (Green, 1964, p. 53, as cited in Laver, 1980, p. 142). In

contrast, a person who is tense from anxiety or fear, for example, could have vocal folds with

“taut muscular walls” that would “act as reflectors and produce harsh tone” (Green, 1964, p. 53,

as cited in Laver, 1980, p. 142). This high tension in the vocal folds can make the speaker’s

vocal quality sound high pitched (e.g., Hagenaars & Minnen, 2005).

Scherer and Zei (1988) suggested this relationship between the speaker’s body tension

and vocal fold tension can explain the seemingly contradictory results from research on the vocal

characteristics of depressed patients. Although the link between depression and altered speech

has long been accepted, some studies report high fundamental frequency for depressed

participants while others report low fo. Scherer and Zei explained that while there are other

possible reasons for these conflicting results, it could be that low frequencies characterize the

vocal qualities of patients with retarded depression while high frequencies characterize those

with an agitated depression. Scherer and Zei (1988) wrote, “it is important to note that agitated

forms of depressive symptomatology are generally considered to contain significant levels of

anxiety, which leads to an increase in fundamental frequency variation” (p. 1531).

Using vocal quality to guide treatment interventions.

Therapists also listen to their clients’ vocal quality to identify the right moment for a

particular treatment intervention as well as to assess the effect of the intervention. For example,

process-experiential researchers pay attention to client behaviours, or markers, which may

indicate that the client is dealing with his or her psychological experience in a potentially

problematic way. Elliott et al. (2004) wrote that, “markers are client statements or behaviors that

alert therapists to various aspects of clients’ functioning that might need attention” (p. 55).

In describing his study of the gestalt two chair intervention for “conflict splits”,

Greenberg (1979) explained that the therapist can use nonverbal clues about the client’s

cognitive-affective state to determine if and when to begin this intervention. Greenberg (1979)

wrote:

The client’s voice may suggest a certain urgency, his body an agitation; some increased

intensity of feeling is portrayed by the way in which the person talks about his

experiences. A difference is observed, some aspect of the client comes alive for the

therapist and it is this present cue which prompts the therapist’s intervention. (p. 319)

The therapist’s vocal quality: The therapist’s vocal quality as a treatment intervention

The vocal quality of the therapist has been seen as useful in treatment when it evokes the

client’s memories (Wrye, 1997); provides the patient with a connection to the analyst that is

potentially evocative and soothing, like the connection between a mother and child (Bady, 1985;

Stone, 1961); and supports the client in various emotional states (Bady, 1985). The therapist’s

vocal quality has also been linked to helping the client engage more productively in therapy.

Wrye (1997) referred to the work of Welch (1978) and Kahne (1995) when she explained

that psychoanalysts have compared the relationship between them and their patients to the

relationship between mothers and their children. This special psychotherapeutic relationship

evokes “sensual memories” (Wrye, 1997, p. 361) of the patient’s mother which may be

important to analyze in the session. Moreover, Wrye urged analysts to be aware of their client’s

nonverbal signalling for this kind of attention. Wrye (1997) commented that it is easy for

analysts to focus on the content of what their patients say rather than listening with “our third ear

for sounds we may be unaccustomed to hearing and to absorbing our patient’s need at certain

points in the treatment” (p. 365). For example, clients may need “to be rocked simply in the

lullaby of our voices, our enfolding sound envelope, or bathed in a ‘wordbath’” (Wrye, 1997, p.

365).

The analyst’s use of his or her vocal quality to soothe the vulnerable

patient is similar to Stone’s (1961) concept of the mother’s vocal quality forming a

“psychobiological bridge” (p. 86) for the child, enabling the child to keep her presence with him

even when she is out of sight. In referring to Stone’s (1961) writings, Bady (1985) commented

that, “speech is a form of human contact that the child learns while achieving actual physical

separation from the mother” (p. 483). In the case of the patient, the psychoanalyst’s vocal

quality is the main form of physical connection the patient can have with him or her. In this

way, the psychoanalyst’s vocal quality forms a “psychobiological bridge” to the patient (Stone,

1961, p. 86), assisting in his or her psychological development (Wrye, 1997).

In addition, Bady (1985) listed the various ways she uses her vocal quality to help her

clients move through important psychological states. Bady (1985) wrote:

I will intentionally use my voice along with my words. Sometimes I attempt through

vocal tones to sooth an anxious, agitated patient. Other times I use my voice to stimulate

a depressed and hopeless one. On still other occasions I talk to give the patient a human

response and my words are less important than the vocal indication of my presence.

Sometimes I remain silent in order to encourage separation from me. (p. 483)

Elliott et al. (2004) also commented that the empathic and caring therapist “speaks in a

gentle, prizing way, with a quiet, caring voice that respects the client’s fragile feelings as if

speaking to a small, frightened animal” (p. 133). In addition, there is research support for the

therapist’s use of vocal quality in this facilitative manner. For example, Ritchie (1998) cited

Old’s (1983) findings that a client was more likely to talk about personal issues when the

therapist used a mild vocal quality. Ritchie also cited Knowlton’s (1989) findings that the

optimal therapist vocal quality with which to deliver progressive muscle relaxation instructions

to anxious clients is one that gradually slows and softens.

Mutual influence of the client’s and therapist’s vocal qualities

While previously mentioned work concentrated on the idea of nonverbal behaviours of

two or more interactants converging or synchronizing, Lynch’s (1979) experiment showed how

physiological matching or harmony is associated with the therapist’s enhanced focus on the

patient. The study, discussed briefly in Bady (1985), demonstrated how the interaction between

the therapist and client can be so synchronous that even their hearts beat to a similar rhythm.

Lynch conducted experiments involving cardiac patients and their psychotherapists. In some of

these experiments, the heart rates of the psychotherapist and cardiac patient were monitored.

Bady (1985) wrote that “all experiments found a close coordination in increase or decrease of

heart rate between the two persons according to the material being discussed” (p. 488).

However, the “cardiac relationship was closest in those sessions where the therapist reported he

felt least distracted by personal concerns or counter transference responses to the patient” (Bady,

1985, pp. 488-489).

Mutual influence has also been a focal interest in the psychotherapy domain. Butler,

Rice, and Wagstaff (1962) were client-centered therapy researchers who discussed the power of

both the client and therapist to draw one another into different states of functioning through their

individual styles of participation. Insights about client-centered theory gained from previous

work about how people self-actualize, or strive to fulfill their potentials (see Butler & Rice,

1960) led Butler et al. (1962) to consider this possibility. Butler et al. (1962) concluded that

people are driven to self-actualize by a primitive, basic drive which they called “stimulus

hunger” (p. 187). “Stimulus hunger”, in theory, makes people want to have new experiences or

stimulation (Butler et al., 1962, p. 187). Stimulation could be in the form of obtaining things or

objects, but it could also be achieved through changing the degree or type of experience.

“Stimulus hunger” was also understood as having the power to drive people toward

dark and self-destructive stimuli as well as toward positive and healthful experiences (Butler et

al., 1962, p. 188). For example, the act of working toward having a new psychological

experience, as clients do in psychotherapy, would be considered a positive result of the

“stimulus hunger” drive (Butler et al., 1962, p. 188). In fact, Butler et al. (1962) suggested that

psychotherapy provides a potentially tremendously satisfying experience for a client’s “stimulus

hunger” drive, despite the emotional pain that is often involved (p. 188).

Butler et al. (1962) and Rice (1965) suggested that the therapist’s “style of participation”

in the therapy process can either help clients satisfy their “stimulus hunger” drive--through

assisting them to create new psychological experience--or constrict clients’ ability to engage in

the therapy process, thus frustrating the “stimulus hunger” drive (Butler et al., 1962, p. 188).

Butler et al. (1962) suggested that the therapist’s “style of participation”, including vocal

behaviours, “with the greatest connotative range, with the most far-reaching reverberations

within the organism” would result “in a maximum of satisfying experience” (p. 188).

Also, although Rice and Kerr (1986) did not refer specifically to styles of participation,

they did suggest that the therapist’s effective “style of participation” would provide the client

with a model of how to explore and process cognitive-affective information. For example, when

the client-centered therapist is trying to articulate the essence of the client’s experience, his or

her vocal style can sound fragmented, unpredictably fast or slow, with abrupt halting and

resuming of speech. Rice and Kerr suggested that just hearing this different style of putting

one’s inner thoughts and feelings into words could help the client to do the same. About this

style of speech, sometimes referred to as Expressive or Irregular, Rice and Kerr (1986) wrote

that, “if a therapist can slow a client’s pace to match this voice quality, it may in itself facilitate

exploration by breaking up habitual cognitive patterns and introducing gaps in the client’s rush

of externalizing verbiage” (p. 96). While this theoretical point comes out of the client-centered

tradition, Rice (1965) commented that the goal of all therapists, regardless of theoretical

orientation, is to help the client generate new experience.

Butler et al. (1962) also commented that instead of helping the client, the therapist’s

vocal quality could be unhelpful or even damaging to the client. For example, a therapist

whose vocal quality conveys harshness, disinterest, detachment, or an authoritarian attitude will

thwart the client’s efforts to satisfy his or her “stimulus hunger” drive (Butler et al., 1962, p.

188). The authors also warned that the client’s vocal quality could adversely influence the

therapist. For example, a client’s dull and disengaged vocal “style of participation” could

enervate and dampen the therapist’s own “style of participation” without him or her even being

aware that this is happening (Butler et al., 1962, p. 189). Butler et al. explained that even if

these client behaviours are the result of his or her emotional problems, they could neutralize or

diminish the therapist’s effective style of participation and, ultimately, undermine the healing

potential of the therapy. The researchers advised that if the therapist can maintain “stylistic

independence” from the client, then he or she can continue to stimulate the client in the direction

of new psychological insights, helping him or her to open up to novel experiences, which they

can explore and make sense of together (Butler et al., 1962, pp. 189-190).

Butler et al. (1962) researched vocal quality in psychotherapy in order to learn what kinds

of therapist behaviours would “stimulate the client to generate new experience for himself” and

which client behaviours are associated with the client’s engagement in self-actualizing, being

“open to the creation of new experience” and breaking “through the endless repetition of

experience so characteristic of maladjusted persons” (pp. 188-189). To investigate these

behaviours as they occur in the therapy sessions, two scales were developed: One was called the

Client Classification System and the other the Therapist Classification System. To create both

scales, the client’s and therapist’s behaviours were observed as they occurred in the therapy hour.

These observations, coupled with theory, were used to create scales that were more objective, as

opposed to being based on “supposed clinical meaning” (Butler et al., 1962, p. 190).

The Client Classification System was composed of three classes: level of

expression, quality of participation, and voice qualities and manner of speaking. Subclasses of

the voice qualities and manner of speaking class were defined by the amount of energy in the

vocal quality, the direction of this energy (inward or outward), and the level of control of this

energy (e.g., did the vocal quality convey controlled energy or did the energy sound as if it were

pouring forth without control?). Terms such as hesitations, pace, pitch and stresses in speech

were also used to describe the categories.

The Therapist Classification System was also composed of three classes:

freshness of words and combinations, functional level of response, and voice quality. The

therapist’s vocal quality class consisted of several subclasses, each described in terms of varying

amounts of energy, control, and newness vs. closure. They were also described with

paralinguistic terms such as inflection, pitch, and accentuation.

To study the elements of effective psychotherapy, Butler et al. (1962) planned to “apply

and analyze separately the classification systems for client and therapist, treating the responses of

client and therapist separately, and then to treat each dyad of client-therapist response as a unit”

(p. 194). Studies in the 1960s and 1980s addressed each of the goals and led to measures of

client and therapist vocal quality known today as the Client Vocal Quality and Revised Therapist

Vocal Quality systems. Studies related to the therapist’s vocal quality, the client’s vocal quality,

and to their vocal qualities together are presented below.

Studies of the therapist’s vocal quality

The major studies using scales specifically developed to evaluate the therapist’s vocal

quality in psychotherapy sessions include Rice (1965), Duncan, Rice, and Butler (1968), and

Kerr (1983). Rice’s (1965) study investigated the three main classes of the Therapist

Classification System in relation to one another. Duncan et al. (1968) researched sessions

identified as “peak” and “poor” using paralinguistic terms such as intensity, pitch, and speech

fluency (p. 566). The authors wanted to know “could these significant therapy hours be

differentiated by taking only voice quality into account, quite apart from content?” (Duncan,

Rice, & Butler, 1968, p. 566). Kerr’s (1983) study refined the vocal quality scale to its current

form, known as the Revised Therapist Vocal Quality scale or Revised TVQ and then analyzed

the categories in relation to outcome measure scores for clients.

Rice (1965).

Rice (1965) used the Therapist Classification System to explore the therapist’s behaviour

in client-centered therapy. The study was based on the theoretical position that “one of the

primary functions of the client-centered therapist, or indeed of any therapist, is to help the client

to generate new inner experience” (Rice, 1965, p. 156). Rice (1965) continued:

Even when the content of the therapist’s response is within the client’s internal frame of

reference…there is a range of possible responses, all equally accurate perhaps, but with

different stylistic qualities, having sharply different kinds of stimulus value for the client.

The more expressive the verbal and vocal behavior of the therapist, the more the client is

stimulated to generate new experience. The more constricted the therapist’s behavior, the

more the client tends to be confined within the grooves of his own repetitive thinking

process. (p. 156)

There were three subclasses used to categorize the therapist’s vocal quality: Expressive,

Usual, and Distorted. Each subclass was described with paralinguistic descriptors such as “pace,

hesitations, pitch range, patterns of emphasis, etc.” (Rice, 1965, p. 156). The Expressive

category was described this way:

This voice is characterized by high energy used in a controlled but not constricted way.

Color and range are present in the voice, but not to the extent of emotional overflow. The

pitch range is wide, and although there is considerable emphasis, it is irregular and

appropriate to the structure. (Rice, 1965, p. 157)

The second therapist vocal category, Usual, was characterized by a narrow range for pitch, but

sufficient energy. Distorted was the last vocal category which could vary in terms of energy and

pitch. For the Distorted vocal quality, “the most distinguished feature is the regular emphasis,

seemingly for effect rather than for spontaneous meaning. There is a subtly cadenced or sing-

song quality, in which emphasis is shifted from its natural location” (Rice, 1965, p. 157).

The other two categories of the Therapist Classification System were freshness of words

and combinations and functional level. The former category referred to the therapist’s language,

with effective language stimulating the client’s own capacity for making more complex

associations and more vivid and rich “inner experience” (Rice, 1965, p. 156). One subcategory

of the freshness of words and combinations was defined by evocative therapist language having

“high imagery, auditory and kinaesthetic as well as visual” (Rice, 1965, p. 156) characteristics.

The other subcategory was called ordinary language because of its mundane quality. The

functional level category is based on the assumption that “the stance the client takes toward his

own experience may be much influenced by the expressive stance that the therapist takes in

response to his message” (Rice, 1965, p. 157). Functional level referred to therapists responding

to the clients’ stances of exploring their inner worlds, observing them, or attending to situations

outside of themselves.

Rice’s (1965) study consisted of 20 client and therapist pairs. There were no reported

diagnoses for the clients who received Rogerian psychotherapy at the University of Chicago

Counseling Center. The number of psychotherapy sessions the clients had received ranged from

six to 68. The therapists also varied widely in experience. The success of each case was

determined by the therapist and the study included nine successful cases, four identified as

moderate to limited success, and seven considered to be poor. The second and the penultimate

sessions were selected for rating. To select the therapist responses that would be rated, Rice

divided each session using time to identify the beginning, middle, and end phases. Ten

responses from each phase were rated on each of the three components of the Therapist

Classification System: Freshness of words and combinations, functional level, and vocal quality.
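
As a rough illustration of the sampling step described above, the following Python sketch assigns therapist responses to beginning, middle, and end phases by time and keeps ten responses per phase. The account of Rice (1965) given here states only that each session was divided using time. The equal-thirds split, the choice of the first ten responses in each phase, and the names phase_of and sample_responses are therefore assumptions made for this example and are not details taken from the study.

```python
def phase_of(start_minute, session_length):
    """Label a time offset as 'beginning', 'middle', or 'end'.

    Assumption: the session is split into three equal time segments.
    """
    third = session_length / 3
    if start_minute < third:
        return "beginning"
    if start_minute < 2 * third:
        return "middle"
    return "end"


def sample_responses(responses, session_length, per_phase=10):
    """Pick up to `per_phase` responses from each phase.

    `responses` is a list of (start_minute, text) tuples in session order.
    Assumption: the first `per_phase` responses in each phase are kept.
    """
    picked = {"beginning": [], "middle": [], "end": []}
    for start_minute, text in responses:
        phase = phase_of(start_minute, session_length)
        if len(picked[phase]) < per_phase:
            picked[phase].append(text)
    return picked


# Placeholder session: one hypothetical therapist response per minute.
session = [(minute, f"response {minute}") for minute in range(60)]
sample = sample_responses(session, session_length=60)
for phase, items in sample.items():
    print(phase, len(items))
```

Each sampled response would then be rated on freshness of words and combinations, functional level, and vocal quality.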

A factor analysis of the ratings was then conducted which resulted in three factors. Rice

(1965) referred to the Type I factor as the “garden variety” (p. 158) because the therapist’s

language was Usual with an “even and relatively uninflected” vocal quality (p. 158). Most

responses in the Type I factor addressed the observations the client made about “the self as an

object” (Rice, 1965, p. 157), with little attention paid to the client’s inner exploration. Type II

was defined by the therapist’s Distorted vocal quality and involved a limited number of fresh and

connotative responses. Attention was directed mainly to the client’s observations of himself.

Type III was characterized by the therapist’s use of expressive vocal quality, fresh and

connotative language, and attention to the client’s inner exploration.

Rice (1965) then conducted a correlation analysis with these three therapist factors and

several criteria used to evaluate the success or failure of the treatment for the client. The criteria

were based on both the client’s and therapist’s perspectives. Taken together, the results showed

that in both the second and penultimate sessions, Type II, characterized by Distorted vocal

quality, was related to cases which both the therapist and client saw as unsuccessful. The

therapist’s vocal quality in the Type II factor sounded planned as if the therapist were trying to

have an impact on the client as opposed to sounding spontaneous. In contrast, Type III in the

penultimate session was associated with successful treatment according to both the therapist’s

and client’s evaluation. The therapist’s vocal quality in the Type III factor was Expressive. An

Expressive vocal quality is full of controlled energy, with a wide pitch range and an irregular

pattern of emphases on the words, not because the speaker is hesitant but because he or she is

searching for the word that best describes the client’s experience.

In the last test of the study, Rice (1965) grouped therapists according to their level of

experience. A Mann-Whitney U test compared the experienced and inexperienced groups on the

three factors. The results showed that “experienced therapists show significantly more Type III

behaviour than do inexperienced ones both early and late in therapy” (Rice, 1965, p. 160). This

meant that experienced therapists tended to use Expressive vocal quality more than the

inexperienced therapists.
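
The kind of group comparison described above can be sketched in a few lines of Python. The function below computes the Mann-Whitney U statistic by direct pairwise comparison. The scores are invented placeholders rather than Rice's (1965) data, and the name mann_whitney_u is hypothetical. The sketch returns only the U values; judging significance would additionally require a table of critical values or a normal approximation.

```python
def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b) for two independent samples.

    U_a counts the pairs in which a value from group_a exceeds a value
    from group_b; tied pairs count as one half.
    """
    u_a = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u_a += 1.0
            elif a == b:
                u_a += 0.5
    u_b = len(group_a) * len(group_b) - u_a
    return u_a, u_b


# Invented Type III factor scores, for illustration only.
experienced = [0.9, 1.4, 0.7, 1.1]
inexperienced = [0.2, 0.8, 0.1]

u_exp, u_inexp = mann_whitney_u(experienced, inexperienced)
print(f"U (experienced) = {u_exp}, U (inexperienced) = {u_inexp}")
```

The smaller of the two U values is the one conventionally compared against the critical value for the given group sizes.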

Duncan, Rice, and Butler (1968).

Duncan et al. (1968) conducted a factor analysis of paralinguistic patterns in sessions

rated as either exceptionally good or poor. Nine therapists, all from the Counseling and

Psychotherapy Research Center of the University of Chicago, submitted two of their own

sessions. One session was selected because it was evaluated as a “peak” or good session and the

other was considered to be a poor session (Duncan et al., 1968, p. 566). These sessions were rated

on the therapists’ vocal patterns that Duncan et al. (1968) described in combinations of

paralinguistic qualities such as intensity, pitch height, vocal cord control (referred to as vocal lip

control), and speech nonfluencies, such as unfilled hesitation pauses, filled hesitation pauses and

repeats (e.g., not completing sentences and false starts).

The analyses produced three factors that distinguished the session types. Duncan et al.

(1968) provided paralinguistic descriptions of each factor as well as audio impressions each

factor was thought to make on the listener. The therapists’ vocal qualities associated with Factor

I were described as sounding “dull and flat, rather uninvolved” and at times, it seemed as if the

therapists were “speaking for effect” (Duncan et al., 1968, p. 569). In Factor III, the therapists’

vocal qualities consisted mainly of filled pauses. Filled pauses are utterances such as “uhm” or

“uh”. Both of these factors were associated with the poor sessions. In contrast, Factor II was

associated with peak sessions and was described as lacking filled pauses, having an “oversoft

intensity with overlow pitch” and giving the “impression of being serious, warm, and relaxed”

(Duncan et al., 1968, p. 569).

Despite the striking contrast between the factors, the authors warned that the results

should be interpreted with caution. The reason for this is that there was only one male in the

peak group and only two women in the poor group. Because of this, the results should be

interpreted in light of the fact that “the peak behaviors were used by the therapists in

communicating primarily with females and the poor behaviors used in communicating primarily

with males” (Duncan et al., 1968, pp. 569-570).

Kerr (1983).

Kerr (1980, 1983) revised the Therapist Classification System for Vocal Quality

described in Rice (1965). Now known as the Revised Therapist Vocal Quality (TVQ), the

system consists of seven nominal categories representing different patterns of vocal quality.

Each of the TVQ categories is described thoroughly in the Method section. However, the TVQ

categories of special importance in this review of Kerr’s (1983) study are also explained here,

using paralinguistic terms and the expected impact on the client. The first category was called

Softened because the vocal quality does sound soft, with a lower pitch and slower rate of speech.

Softened vocal quality conveys “intimacy and involvement” (Kerr, 1983, p. 30). The second was

the Irregular category which reflects the therapist’s deep attunement to and involvement in the

client’s psychological exploration. The therapist’s vocal quality lacks fluency as a result of

unfilled pauses, emphases in unusual places, and ragged-sounding phrases or sentences.

Irregular vocal quality is closest to the Expressive therapist vocal quality category which was

associated with good treatment outcomes described in Rice (1965). The third category was

called Natural vocal quality. While it sounds similar to a vocal quality used in daily

conversation, it is different in that it is “unstrained and natural”, giving the listener the sense that

the speaker is interested in what the listener has to say (Kerr, 1983, p. 31).

The fourth category, called Definite, was described as moderately energetic, sometimes

taking on a pattern in which sentences end in a downward sloping pitch. This pattern can turn

Definite vocal quality into a “confrontational vocal quality” and make the speaker “sound

somewhat overbearing” (Kerr, 1983, p. 31). This vocal quality category seems closest to Rice’s

(1965) Distorted therapist vocal quality in which “the most distinguished feature is the regular

emphasis, seemingly for effect rather than for spontaneous meaning” with a “sing-song quality,

in which emphasis is shifted from its natural location” (Rice, 1965, p. 157). In Rice’s (1965)

study, this vocal quality was associated with psychotherapy cases which were evaluated as

failures by the therapists.

Finally, the Restricted category does not seem to fit with therapist vocal categorizations

mentioned in previous studies. The Restricted vocal quality in the TVQ however was thought to

adversely affect the client because it conveys tension and even insensitivity. While the speaker’s

vocal quality has enough energy to convey the message or content, the vocal quality also

sounds as though the speaker is holding back or keeping himself or herself distant. Kerr (1983) wrote,

“the voice can be slightly tremulous, whiny, droning or sounding as though the air is escaping

before the word is formed” (p. 31).

In this study, Kerr (1983) used data from an Interpersonal Process Recall (IPR) study

conducted by Elliott (1979). In the IPR study, the therapist and client, separately, reviewed a

video tape of the therapy session soon after it occurred. They evaluated the therapist’s responses

in the session on a questionnaire of items rated on a Likert scale. The results of the IPR

questionnaires (Elliott, 1978, as cited in Kerr, 1983, p. 100) produced four evaluations from the

client’s perspective and four from the therapist’s perspective. One of the client’s evaluations

was called CEmp and referred to the client’s rating of the therapist’s empathy for him or her.

This evaluation was based on the question: “When your therapist said that, did you feel

misunderstood or understood?” A second client evaluation was called CHelp, or the client’s

rating of the therapist’s helpfulness to him or her, and came from the question: “When your

therapist said that, did it hinder or help you?” CAff, or the client’s rating of the therapist’s

affective impact on him or her, was based on the question: “When your therapist said that, did it

make you feel worse or better?” The last evaluation, CCog, referred to the client’s rating of the

therapist’s cognitive impact on the client and is based on this question: “Did what s/he said make

you think more or less?”

The therapist’s evaluations were based on a questionnaire that paralleled the client’s

questionnaire. The abbreviations for the therapist’s evaluations are TEmp, THelp, TAff, and

TCog. These evaluations refer to the therapist’s ratings of his or her own impact on the client in

these four dimensions. Finally, there were evaluations made from the observer’s perspective. In

this study, five undergraduate students served as observers and rated the therapists’ responses. Each

rating was also made on a Likert scale and included ObHelp, a measure of General Helpfulness ranging

from the therapist’s being helpful to hindering the client. ImpExpl, or the Impact on Exploration

evaluation, was based on the question: “Does the response facilitate the Helpseeker in further

exploring or in bringing up new material? Or does it block or distract Helpseeker?” (p. 102).

The Collab or measure of collaboration was based on the question: “Does the Helper’s manner

communicate a sense of working together in a collaborative process?” Finally, a measure of the

Therapeutic Alliance was also given for the session overall.

In the IPR data set used for Kerr’s (1983) study, there were 16 client-therapist dyads

having the above post-session ratings. The therapists were psychodynamic in their orientation.

The author divided each session into three phases, with the first starting 5 minutes into the

session, the second 25 minutes, and the last 40 minutes into the session. Responses in each phase

were analyzed according to the revised TVQ. Two hundred and sixteen therapist responses were

then analyzed. Kerr (1983) correlated the “proportions of the TVQ categories across the session

with the averaged evaluative measures for all responses” (p. 52).
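
To make this analysis step concrete, the following Python sketch computes, for each session, the proportion of responses coded in one TVQ category and the mean evaluative rating, and then correlates the two across sessions with Pearson's r. All data are invented placeholders, not Kerr's (1983) values. The function names and the one-category-per-response coding are assumptions made for the example.

```python
from math import sqrt


def category_proportions(codes, categories):
    """Share of one session's responses coded into each category."""
    total = len(codes)
    return {c: codes.count(c) / total for c in categories}


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


categories = ["Softened", "Irregular", "Natural", "Definite", "Restricted"]

# Invented sessions: (vocal quality code per response, rating per response).
sessions = [
    (["Irregular", "Natural", "Natural", "Softened"], [4, 3, 3, 4]),
    (["Irregular", "Irregular", "Natural", "Definite"], [5, 4, 3, 4]),
    (["Natural", "Restricted", "Natural", "Definite"], [2, 2, 3, 3]),
    (["Irregular", "Irregular", "Irregular", "Natural"], [5, 5, 4, 4]),
]

irregular_share = [category_proportions(codes, categories)["Irregular"]
                   for codes, _ in sessions]
mean_rating = [sum(ratings) / len(ratings) for _, ratings in sessions]

print(f"r = {pearson_r(irregular_share, mean_rating):.2f}")
```

With only a handful of sessions such a coefficient is unstable, so the printed value illustrates the computation and nothing more.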

Kerr’s (1983) results showed that the therapists’ responses made in Irregular vocal

quality were associated with the clients’ evaluation of feeling their therapists understood them

(CEmp; r = .52), helped them (CHelp; r = .60), made them feel better (CAff; r = .40), and made

them think more (CCog; r = .43). From the therapist’s perspective, responses made in Irregular

vocal quality were associated with the therapists’ evaluation that they were more helpful to the

client (THelp; r = .44) and made the client feel better (TAff; r = .45). Irregular vocal quality

was also associated with the observers’ evaluations that the therapist had succeeded in creating a

strong Therapeutic Alliance (r = .50), that the therapist was helpful to the client (ObHelp; r =

.62); collaborated well with the client (Collab; r = .46), and facilitated the client’s exploration

(ImpExpl; r = .51). For all results, see Kerr (1983, p. 53).

According to Kerr (1983), however, the therapist’s Restricted vocal quality

was negatively associated with many evaluations. From the client’s perspective,

therapist responses made in Restricted vocal quality were associated with the client’s feeling that

the therapist did not understand him or her (CEmp; r = -.51). From the therapists’ perspective,

responses made in a Restricted vocal quality were associated with the therapists’ view that they

hindered the client’s progress (THelp; r = -.50) and made the client feel worse (TAff; r = -.57).

Finally, from the observers’ perspective, Restricted therapist responses were associated with

poorer Therapeutic Alliances (r = -.42).

In addition to Irregular and Restricted vocal qualities, Kerr (1983) reported that there

were also significant results for Natural and Definite vocal qualities. There were no significant results for

these categories with the client’s evaluation. However, from the therapist’s perspective, Natural

vocal quality responses were associated with making the client feel misunderstood (THelp;

r = -.53). In contrast, responses made in Definite vocal quality were associated with the therapists’

evaluation that they helped the client (THelp; r = .62) and that they made the client think more

(TCog; r = .53). There were no significant results for the Softened vocal quality.

In her summary, Kerr (1983) suggested that these results point to Irregular and Restricted

vocal qualities as being very important in treatment given their significant correlations with all

three perspectives. Irregular vocal quality was thought to represent “a style that is very

productive in therapy for both therapist and client” (Kerr, 1983, p. 69), while Restricted vocal

quality reflected unproductive processes. Regarding Definite vocal quality, Kerr (1983)

suggested that during the IPR, the therapists who heard themselves speak in this self-assured

vocal quality might have paid “more attention to the remembered feeling of confidence than to

what the client’s experience is” (p. 73). This would explain why the Definite category was

associated with positive therapist ratings, but not with the clients’ or observers’ ratings. In terms

of Natural vocal quality, Kerr (1983) explained that this category was “baseline vocal quality for

most therapists” that “makes up almost half of the TVQ frequencies” (p. 72). Kerr commented

that while Natural vocal quality seems to be a neutral pattern in terms of its effect on clients and

observers, therapists saw themselves as lacking in empathy when they used this vocal quality.

Kerr (1983) wrote that clinicians may construe Natural vocal quality as “negative because it does

give the effect of not doing anything special in the way of showing either intimacy or

competence” (p. 72).

Finally, there were no significant correlations between Softened vocal quality and the IPR

measures from any perspective. However, in other analyses, Kerr (1983) found that the impact

of Softened responses “varied widely…according to which of the other categories it

accompanied” (p. 75). Responses often exhibited more than one TVQ category. The vocal

pattern most often found together with Softened was Restricted. Kerr (1983) commented that in

these instances, “the effect seemed to be of weakness or phoniness” (p. 75). In contrast, the

presence of Softened with either Irregular or Definite seemed to enhance the effect of each: “In

conjunction with ‘Softened’ they sounded more intimate, and their higher energy seemed to lend

the ‘Softened’ category more impact in a positive way” (Kerr, 1983, p. 75). Kerr (1983)

suggested that patterns of vocal qualities might explain how the Softened vocal quality seemed to

change the impact of the other categories. Referring to Elliott’s (1983) examination of

noteworthy client responses, Kerr (1983) wrote that “the important insight was stated in the

‘Irregular’ pattern but prepared, in the response just previous, by the ‘Softened’ pattern” (p. 76).

Taken together, Kerr (1983) stated, “in fact, importance of many of the TVQ categories may lie

in their patterns of use” (p. 76).

Summary of studies of the therapist’s vocal quality.

Rice (1965), Duncan et al. (1968), and Kerr (1983) pioneered the exploration of the

therapist’s vocal quality in the psychotherapy hour. Taken together, the findings suggest that

particular therapist vocal qualities are differentially associated with other therapist behaviours

(Rice, 1965), different post-session evaluations (Kerr, 1983), and treatment outcome as evaluated

by the therapist alone (Duncan et al., 1968) and by both client and therapist (Rice, 1965).

Specifically, vocal qualities that sound energetic, controlled, searching (Kerr, 1983; Rice, 1965)

and soft with a lower pitch (Duncan et al., 1968) were associated with successful treatment.

As a group, the studies have some limitations. One limitation is the size of the samples,

ranging from 16 to 20 therapists, and the limited number of therapeutic orientations (e.g.,

Rogerian and psychodynamic). Also, the studies could not be compared directly because they lacked

standardized research protocols. For example, diagnoses were not reported and the cases

varied in treatment length.

The client’s vocal quality: How the client’s use of his or her own vocal quality can facilitate

treatment

When the client speaks, his vocal quality can also help him heal. This is thought to occur

in several ways. First, Bady (1985) suggested that the physical act of speaking can

affect the body’s emotional experience. Second, as with other nonverbal behaviours,

experiential therapists might draw the client’s attention to his or her vocal quality to heighten

awareness of underlying emotions (e.g., Greenberg, 1979). They may also ask the client to

exaggerate vocal characteristics that seem to reflect something of importance to the client in

order to increase arousal (e.g., Murray & Segal, 1994). Finally, the client’s vocal quality may

help him or her to engage in therapy in a more productive manner (e.g., Rice & Kerr, 1986).

The physical act of speaking and emotional changes.

Bady (1985) has asserted that just by speaking, the client may be helping to heal himself

or herself. This can occur through the physical, sensory act of confining thoughts and emotions

within the boundaries of words. During this process, the speaker can feel the production of

words and sentences. Using singing as an example, Bady (1985) wrote:

If we consider the actual mechanics of vocal quality production, we realize that when we

speak, something really is happening” because articulation involves “the movement of

the diaphragm, the relaxing of jaw and throat, the vibration of airwaves in the chest and

head. (p. 487) [italics in original]

When the speaker’s vocal quality matches the content of speech, the speaker can experience an

even more powerful physical sensation. Bady (1985) suggested that a catharsis leads to relief

because of “the physical action of a strong outpouring of words” (p. 487). The reason for this is

that “it may actually feel to the person as though he is ejecting an angry or sad thought from his

body” (Bady, 1985, pp. 487-488).

Paying attention to how one sounds.

Another use of the client’s vocal quality for healing involves the client’s becoming aware

of how he or she sounds in the moment. According to Greenberg (1979), “often clients are

engaged in certain behaviours or processes in the present that are fairly obvious but of which

they themselves are not aware” (p. 321). When the therapist believes it is helpful, he or she

draws the client’s attention to the behaviour and asks him or her to stay with the unique feeling

associated with that behaviour. For example, if the client states that a recent loss is not that

important to him, but he says this with a quiet and tremulous vocal quality, the therapist might

share this observation of the vocal quality with the client. Doing this can help the client gain

awareness of what the quivering in his vocal quality signifies. Drawing the client’s attention to

discrepant behaviours, as seen in the vocal quality example, is a way of putting clients “more

fully in touch with their present experiencing” (Greenberg, 1979, p. 322).

Increasing emotional arousal.

Murray and Segal (1994) wrote, “there is a good deal of emphasis in the clinical literature

on the capacity of vocal expression to arouse emotion in various forms of psychotherapy” (p.

393). In the case of experiential therapy, a therapist may ask a client to use his or her vocal

quality in order to increase emotional arousal. For example, if an edge of irritation sharpens the

client’s vocal quality when discussing how hard it is to confront his boss, the therapist might

suggest that he speak with even more irritation or even anger. This is done to increase the

client’s level of arousal so that thoughts, feelings, memories, and somatic sensations associated

with the client’s anger in this context come alive, making them more amenable to exploration

and change. In PE-EFT, this is known as activating a person’s emotion schemes. Doing this

enables the client to come into more direct contact with his or her psychological difficulty so that

it can be worked on in the session (Greenberg & Paivio, 1997).

Studies of the client’s vocal quality

The Client Vocal Quality measure, or CVQ, and its precursor, the vocal qualities and

manner of speaking component of the Client Classification System (Butler et al., 1962) have

been used in many studies to evaluate the client’s vocal quality in psychotherapy sessions.

Reviews are found in Rice and Koke (1981) and Rice and Kerr (1986). Studies by Rice and

Wagstaff (1967), Sarnat (1976), and Nixon (1980) are reviewed here because they address the

client’s vocal quality and treatment outcome. Work by Greenberg (1983) and Watson and

Greenberg (1996) has linked the client’s vocal quality to in-session processes. Finally, the

Wiseman and Rice (1989) and Butler et al. (1962) studies have investigated the client’s and

therapist’s vocal qualities in relation to one another.

The client’s vocal quality and treatment outcome-Butler, Rice, and Wagstaff (1962).

As part of their plan to study client and therapist behaviours that impact treatment, Butler

et al. (1962) developed the Client Classification System which included three components. The

level of expression component was used to rate “the level at which the client is dealing with and

expressing whatever subject matter is under discussion” (Butler et al., 1962, p. 190). The

quality of participation component was used to assess whether the client was behaving as an

observer or as a participant in the session. The third classification component was the voice

qualities and manner of speaking component, which was used to rate the client’s vocal quality.

Ratings for the voice qualities and manner of speaking system were made based on the

amount of energy in the vocal quality, the direction of this energy (inward or outward), and the

level of control of this energy (e.g., did the vocal quality convey controlled energy or did the

energy sound as if it were pouring forth without control?). This system included four vocal

subcategories: Focused, Emotional, Externalizing, and Limited. The essential features of the

Focused category include the client’s use of energy “in a controlled problem-solving way” so

that his vocal quality “gives an impression of pondering or exploration” (Butler et al., 1962, p.

191). The essential feature of Emotional vocal quality is that the client’s energy spills outward

and his or her vocal quality “breaks, trembles, is choked with crying” (Butler et al., 1962, p.

191). The primary characteristic of Externalizing vocal quality is substantial energy, but directed

“toward having some effect on the outside world” (Butler et al., 1962, p. 191). Here the client’s

vocal quality would have a pattern and smooth rhythm, giving a “soapbox quality” (Butler et al.,

1962, p. 191). For the Limited category, the client’s energy is low and it is not clear where the

energy is directed, though the client’s vocal quality can come through in a “matter-of-fact, even

incidental tone” (Butler et al., 1962, p. 191). Table 1 summarizes the characteristics of vocal

quality energy, rhythm and pace of the categories.

To identify the elements of effective psychotherapy, Butler et al. (1962) planned to study

the therapist’s and client’s behaviours both separately and together, intending to “apply and

analyze separately the classification systems for client and therapist, treating the responses of

client and therapist separately, and then to treat each dyad of client-therapist response as a unit”

(p. 194). This review of the study will focus only on the client’s vocal quality. Butler et al.

(1962) classified 24 cases of clients receiving Rogerian psychotherapy sessions as successful or

unsuccessful. Next, the three components of the Client Classification System (level of

expression, quality of participation, and voice qualities and manner of speaking) were used to

rate the first 10 responses from the beginning, middle, and late phases of each client’s second

and penultimate sessions. Butler et al. (1962) then conducted a factor analysis which revealed

three client factors.

According to Butler et al. (1962), in Factor I, clients expressed their inner experience

directly as opposed to speaking as if they were observers and they also spoke about their

feelings. The vocal qualities found in Factor I were Focused and Emotional. In Factor II, clients

were also engaged in the session as participants and, while they explored and expressed

psychological experience that lay “beneath the surface” (Butler et al., 1962, p. 190), they avoided

discussing their feelings. In addition, “the outstanding characteristic [of Factor II] seems to be

the low energy level of responses” with more than two-thirds of the responses in the group

Table 1

Features of the Vocal Qualities and Manner of Speaking categories for the Client Classification System

| Feature | Emotional | Focused | Externalizing | Limited |
| --- | --- | --- | --- | --- |
| Quality of energy | “high energy level, but the energy tends to overflow into discharge rather than being used in a controlled way” | High energy used in a “problem-solving way” | “fairly high energy” | “low energy used in a matter-of-fact, even incidental tone” |
| Direction of energy | Energy spills into speech, not directed toward influencing others. | “The energy seems to be turned inward rather than being propelled outward” | Energy is “directed outward, seemingly toward having some effect on the outside world” | “It is clearly a communication to the therapist, but it is not clearly directional in the sense of” Focused or Externalizing. |
| Rhythm, pace, etc. | Vocal quality “breaks, trembles, is choked with crying… discharge of acute tension” | “hesitations and irregularities of pace” especially “in stress of syllables” | The vocal quality has “cadence or rhythmic pattern… often mechanical inflection” | “the pace is even and relatively unstressed” |

Note. Adapted from Butler, Rice, and Wagstaff (1962, p. 191).

having a Limited vocal quality in which the client’s vocal quality sounded “rather matter-of-

factly serious and lacking in search or exploring quality” (Butler et al., 1962, p. 197). In Factor

III, the clients were about evenly split as participants and observers in the session. However, the

vast majority of responses involved “ideas or actions” and not the direct expression of feeling.

The vocal quality associated with Factor III was Externalizing, with clients sounding in most

responses as if they were “dramatizing or making a speech” (Butler et al., 1962, p. 197).

In summary, the factor analysis of the client’s vocal qualities revealed that Factor I was

dominated by the Focused and Emotional categories. Limited was the major vocal quality in

Factor II. Factor III was dominated by the Externalizing category. The researchers explained

that Factor I reflected therapy characteristics that would be most beneficial to the client. They

wrote, “the client’s energy, openness of expression, and ability to directly communicate

experience seem to point toward the likelihood of favourable personal reorganization” (Butler et

al., 1962, p. 198). Factor III behaviours were viewed as representing poor client participation in

therapy and a poor prognosis. Of this Factor, Butler et al. (1962) wrote, “the self-avoidant,

nonparticipating, describing, verbal behaviour with its externalizing quality, seem unlikely to be

associated with favourable outcomes in therapy” (p. 199). The behaviour in Factor II was not as

clear. The researchers wrote, “on the one hand the client communicates relevant material in a

somewhat expressive way, but some of the important ingredients of expressiveness seem to be

missing” (Butler et al., 1962, pp. 198-199).

To test the relationship between the factors and treatment outcome, Butler et al. (1962)

then conducted a correlation analysis of these Factors with other treatment measures, such as the

therapist’s and client’s ratings of whether or not the therapy was successful. The results

indicated that Factor I (defined by Focused and Emotional vocal qualities) was associated with

good treatment outcomes as identified by the therapist and with the client’s report of self-change

from before to after therapy. Factor II (defined by Limited vocal quality) was associated with

partially successful outcomes. In contrast, Factor III (defined by Externalizing vocal quality)

was significantly associated with unsuccessful treatment outcomes.

The client’s vocal quality and treatment outcome-Rice and Wagstaff (1967).

Also supporting the CVQ’s relationship to outcome is a study by Rice and Wagstaff

(1967), the results of which were recomputed by Rice and Kerr (1986). In this study,

psychotherapy sessions from 53 clients from the University of Chicago Counseling and

Psychotherapy Research Center were rated. Forty-one had received 20 therapy sessions (two per

week) in the client-centered orientation. Data for the study were obtained from the first, second,

and eleventh sessions. The first and second sessions were selected for 12 early attrition clients.

Rice and Wagstaff (1967) divided each session into consecutive thirds and then rated 10

consecutive responses within each third. A total of 30 responses per session were rated on the

CVQ.

Rice and Kerr (1986) explained the outcome evaluations as follows: “(1) unequivocal

success; (2) mixed group TH (high from therapist’s perspective but low from client’s); (3) mixed

group CH (high from client’s perspective but low from therapist’s); (4) unsuccessful group (as

seen from both perspectives); and (5) early attrition” (p. 82). Rice and Kerr’s (1986)

recalculation of the results indicated that clients in the unequivocal success group had

significantly more Focused vocal quality responses than clients in the unsuccessful, early

attrition, and mixed CH outcome categories. Clients in the early attrition group made more

Externalizing responses than clients in the

unequivocal success one. Last, clients with unsuccessful outcomes had a significantly higher

number of Limited responses than clients in the mixed and unequivocal success groups.

The client’s vocal quality and outcome in other treatments-Sarnat (1976) and Nixon

(1980).

Because the CVQ developed out of the client-centered tradition, its originators

encouraged researchers to use the measure in different treatment conditions (e.g., Rice & Kerr,

1986). Sarnat (1976) applied the CVQ to a psychodynamic sample of N = 40. She found that

when she controlled for the number of psychotherapy sessions, there was a significant correlation

between Focused vocal quality and treatment outcome as rated by the therapist (r = .43, p < .05;

as cited in Rice & Kerr, 1986).

Nixon (1980) applied the measure to a wholistic primal therapy sample (N = 29).

Speech samples were drawn from the pre-treatment interview and from the second therapy

session. Nixon predicted that Focused and Emotional vocal qualities would be positively

correlated with measures of client change after treatment was completed, but that the Limited and

Externalizing vocal qualities would be negatively correlated with these outcome measures.

Measures of client change included the clients’, therapists’, and independent observers’

perspectives. Measures were obtained before treatment began and again seven and a

half months after treatment was completed.

In terms of the study’s results, there were no significant findings for the Externalizing or

Focused categories. Nixon (1980) explained that reliability for the Focused category was very

poor due to the very small number of Focused responses and to the poor audibility of the audio

recordings. However, Nixon suggested that it could be that neither the Focused nor

Externalizing categories is linked to change in wholistic primal therapy.

In contrast, there were significant findings for the Limited category which was negatively

correlated with measures of client change from the client’s and therapist’s perspectives.

Correlations ranged from .31 to .36. The Emotional category was also significantly correlated

with change, but in the positive direction. Nixon (1980) found positive correlations between the

Emotional category and post-treatment scores on a measure from the client’s perspective called

the Q-sort Self-Ideal Correlation (Butler & Haigh, 1954) (r = .50) and on an observer rated

measure called the Psychiatric Status Schedule (Spitzer, Endicott, Fleiss, & Cohen, 1970) (r =

.54).

Importantly, while Nixon’s (1980) study was underway, raters heard a distinct type of

Emotional vocal quality which sounded as if clients were forcing the emotion out. Nixon

removed responses that sounded forced from the Emotional category and used them to create a

second category called Forced Emotional Voice. Nixon (1980) wrote that “forced emotional was

separated from the emotional category because it was felt that the two types of vocal patterns,

while sharing a common denominator of affective release, differ in terms of the manner in which

the affect is expressed” (p. 77). There were no significant results for the forced emotional vocal

category, though some findings approached significance suggesting that “forced emotional voice

quality may have value as a negative predictor in primal therapy” and that the results “certainly

justify a further exploration” (Nixon, 1980, p. 79).

The client’s vocal quality and in-session processes-Greenberg (1983).

The link between the client’s use of Focused or Emotional vocal quality and progress in a treatment

intervention was demonstrated by Greenberg (1983), who conducted a study of the gestalt two-chair

intervention for conflict splits. Conflict split is the name for a psychological experience in

which a person feels that two parts of the self are in conflict with one another. Indications that

the client is experiencing a conflict split can include his or her making a statement along the lines

of wanting to do X, but not feeling entitled or able to do so. It could also be indicated by a shift

in the client’s vocal quality. For example, the client might express hopefulness about resolving a

difficult situation, but do so in a vocal quality that turns wistful at the end, suggesting that, in his

reality, he cannot be hopeful after all.

Greenberg (1983) identified three stages in the model of resolution for the two-chair task:

(1) opposition, when the two sides of the self oppose one another; (2) merging, when each side

states its position and each is able to see the worth or intent of the other’s position; and (3)

integration, when the two sides come together to form a whole. The sides of the self are

identified like this: The side of the self represented by “the ‘other chair,’ is critical, hostile,

intimidating or threatening toward another part labeled the ‘experiencing chair,’ which is…

passively compliant, helpless or avoiding” (Greenberg, 1983, p. 191).

To study this intervention, Greenberg (1983) used the task analytic method (Gottman &

Markman, 1978; Greenberg, 1975). In the task analytic approach, aspects of the intervention that

are necessary for success are specified and then tested to determine the degree to which these

aspects distinguish groups that succeeded in the task, such as resolving their conflict, from the

groups that are unsuccessful, or that did not resolve their conflict. Greenberg (1983) expected

that the clients would use different vocal quality categories for the “other” and “experiencing”

chairs in different stages of the intervention (p. 191).

To evaluate the vocal qualities in this intervention, Greenberg (1983) used the Client

Vocal Quality (CVQ) categories. Greenberg (1983) grouped Focused and Emotional vocal

qualities into a single category called “good contact” (p. 193), indicating that the speaker is

“predominantly ‘in touch’ with him- or herself, experiencing what is being said or processing

new information” (p. 193). Externalizing and Limited vocal qualities were also grouped into

another single category called “poor contact” (Greenberg, 1983, p. 193), indicating that the client

is distant from his experience.
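
As an illustration only, the grouping described above can be sketched in Python. The function name, variable names, and sample ratings are hypothetical and do not come from Greenberg (1983). The sketch simply maps each CVQ rating to a contact category and computes the proportion of responses rated as “good contact”.

```python
from collections import Counter

# Grouping of the four CVQ categories into two contact categories,
# following the description of Greenberg (1983) given in the text.
CONTACT_GROUP = {
    "Focused": "good",
    "Emotional": "good",
    "Externalizing": "poor",
    "Limited": "poor",
}


def good_contact_proportion(ratings):
    """Return the share of rated responses that fall in 'good contact'."""
    if not ratings:
        raise ValueError("at least one rated response is required")
    counts = Counter(CONTACT_GROUP[r] for r in ratings)
    return counts["good"] / len(ratings)


# Hypothetical ratings for one chair in one phase (not real data).
example = ["Externalizing", "Focused", "Focused", "Limited"]
print(good_contact_proportion(example))  # 0.5
```

Proportions of this kind could then be compared between groups or phases. The statistical tests used in the original study are not reproduced here.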

Greenberg (1983) predicted that the group that successfully resolved the conflict would

speak in more “good contact” than the group that did not resolve the conflict (p. 192). He found

that in the opposition phase, there were no significant differences in the type of “contact”

category for either the “experiencing chair” or “other chair” in the resolution and nonresolution

groups (Greenberg, 1983, p. 196). However, in the merging phase, there was a significantly

higher proportion of the “good contact” vocal qualities in both the “other chair” and

“experiencing chair” in the resolution group (Greenberg, 1983, p. 196). In addition, in the

resolution group, there was significantly more “good contact” vocal quality for the “other chair”

in the merging phase than in the opposition phase (Greenberg, 1983, p. 196). Moreover, an

additional test showed that in the resolution group, the “other chair” switched from Externalizing

to Focused at the beginning of the merging stage more often than in the nonresolution group

(Greenberg, 1983, p. 196). This point was especially meaningful to Greenberg (1983) who

explained:

It appears as though the “turning inward” by the other chair, indicated by focused voice,

is a critical aspect of the process of softening. No longer is there a lecturing at quality of

the critic, but rather a true looking inside for what is to be said. This change of voice in

the other chair seems to be an important indicator that something new is happening and

almost always accompanies the affiliative content. (p. 199)

Greenberg (1983) interpreted the study’s findings to support the idea that the client’s

vocal quality can be relied on as a “good cue of ‘true process’” in successful resolution of these

conflicts (p. 199). While Greenberg (1983) listed other possible reasons for these results, he

concluded that the most likely explanation is that the resolution arose through the “naturalistic,

transactional occurrence of client performance in a specific task environment” (p. 199).

Limitations of the study include its lack of generalizability beyond the gestalt two-chair task and

the correlational design which precludes inferring causality.

The client’s vocal quality and in-session processes-Watson and Greenberg (1996).

Watson and Greenberg (1996) also investigated client change processes using the

CVQ. They used Focused and Emotional vocal qualities to demonstrate differences in problem

resolution between clients receiving client-centered therapy and clients receiving PE-EFT for

cognitive-affective difficulties such as conflict splits, unfinished business, and problematic

reactions. In this study, clients receiving client-centered therapy were used as a control group.

This was done because PE-EFT is based on client-centered therapy, but also uses gestalt

interventions for problems such as conflict splits described in Greenberg (1983). Gestalt

techniques are highly emotionally evocative. Watson and Greenberg’s (1996) sample consisted of 36 clients who were

diagnosed with depression.

In addition to the CVQ, the experimenters used two other measures. One measure called

experiencing (EXP; Klein, Mathieu-Coughlan, & Kiesler, 1986) reflects the degree to which the

client is engaged in self-exploration. The second measure called Expressive Stance (ES)

indicates “the stance [the clients] adopt toward their own experience during sessions” (Watson &

Greenberg, 1996, p. 266). For example, the ES “Category 1 refers to clients focusing inside and

actively re-experiencing an emotion or feeling in the session as they try to express it in words”

(Watson & Greenberg, 1996, p. 266).

Watson and Greenberg (1996) found a difference between treatment groups when clients

were working on a conflict split problem. The PE-EFT clients expressed more statements that

showed the combination of productive expressive stance plus productive vocal quality (either

Focused or Emotional) than client-centered therapy clients (z = .008, p < .01). The PE-EFT

clients’ levels of experiencing were also higher than those of the client-centered group.

Taken together, these results support the relationship of CVQ to client processes, with

Focused and Emotional vocal qualities being associated with productive processes. Referring to

researchers such as Greenberg (1986), Watson and Greenberg (1996) also suggested there was a

“need for more micro-process analyses of the steps in the pathway to change to more fully

illuminate the active ingredients of various treatment approaches” (p. 273). Although the study did

not link the CVQ measure from in-session process through to treatment outcome, Watson and

Greenberg (1996) added that a larger sample would be needed to adequately test links between

“clients’ in-session process, degree of problem resolution, post-session outcome, and final

outcome” (p. 273).

Summary of the client’s vocal quality, treatment outcome, and in-session processes.

Taken together, studies of the client’s vocal quality have shown that productive vocal

categories (Focused and Emotional) are related to beneficial client processes (Butler et al., 1962)

and to successful treatment outcomes from the client’s and therapist’s perspectives (Rice &

Wagstaff, 1967). In psychodynamic therapy, Focused vocal quality was also related to

successful treatment (Sarnat, 1976) as was Emotional vocal quality in wholistic primal therapy

(Nixon, 1980). In terms of in-session process, Focused and Emotional vocal qualities were

judged to be “true process” signals during the merging phase in gestalt chair work (Greenberg,

1983) and were found to be more evident in the more emotionally evocative PE-EFT therapy

than client-centered therapy (Watson & Greenberg, 1996). Some of the limitations of the studies

were the small sample sizes, lack of generalizability, and their correlational designs.

The therapist’s and client’s vocal qualities together-Butler, Rice, and Wagstaff (1962).

Although analysis of the relationship between the therapists’ and clients’ responses was

not complete at the time of publication, Butler et al. (1962) presented some findings. To analyze

the therapists’ responses, Butler et al. (1962) correlated the Expressive therapist vocal category

with the Client Classification System Factors previously described. The Expressive therapist

vocal quality category, characterized as “energetic”, “warm and confident” and as having a

“pondering, exploring quality” (p. 192) was found to be strongly and positively correlated with

client Factor I, which was characterized by Focused and Emotional vocal categories, and had

been associated with good treatment outcome. However, the Expressive therapist vocal quality

category was strongly and negatively correlated with client Factor III, which was characterized

by the client’s Externalizing vocal quality, and was associated with poor treatment outcome.

Butler et al. (1962) concluded from this and similar findings with other measures that “the

therapist behavior judged to be optimal tends to be associated with client behavior judged to be

optimal” (p. 202). No association was found between Factor II, characterized by the client’s

Limited vocal category, and the therapist’s vocal quality.

The therapist’s and client’s vocal qualities together-Wiseman and Rice (1989).

Wiseman and Rice (1989) investigated the therapist’s and client’s vocal qualities in a

sequential analysis. They used the Client Vocal Quality scale (CVQ; Rice et al., 1979), which is

described in the Method section and presented in a more general form in Table 1, and the

Therapist Vocal Quality scale (TVQ; Kerr, 1983), which uses the same therapist vocal categories

described in the Kerr (1983) study and is more fully detailed in the Method section.

In the sequential analysis, the client and therapist behaviours were analyzed while both

people were engaged in a treatment intervention. The client’s behaviour was recorded at time

“1”, before the therapist’s behaviour, and at time “2”, after the therapist’s behaviour. The

differences between these client behaviours were then analyzed. Wiseman and Rice (1989)

explained that “by using sequential designs in a conceptually based manner, the researcher-

clinician can study therapist-client interactions that are clinically significant and relevant to his

or her particular microtheory of change” (p. 285).
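
The time “1” and time “2” logic described above can be illustrated with a short Python sketch that tallies the client’s vocal quality before and after each therapist response of a given category. The function name, the data layout, and the example labels are assumptions made for illustration only; the sketch does not reproduce the statistical procedure used by Wiseman and Rice (1989).

```python
from collections import Counter

def client_shifts(turns, therapist_category):
    """Count (client before, client after) pairs around one therapist category.

    `turns` is a hypothetical, alternating list of (speaker, category) tuples,
    e.g. ("client", "Externalizing"), ("therapist", "Irregular"), ...
    """
    shifts = Counter()
    # Slide a three-turn window: client (time 1), therapist, client (time 2).
    for before, middle, after in zip(turns, turns[1:], turns[2:]):
        if (before[0] == "client" and middle[0] == "therapist"
                and after[0] == "client" and middle[1] == therapist_category):
            shifts[(before[1], after[1])] += 1
    return shifts

# Invented example data, for illustration only.
session = [
    ("client", "Externalizing"), ("therapist", "Irregular"),
    ("client", "Focused"), ("therapist", "Natural"),
    ("client", "Externalizing"), ("therapist", "Irregular"),
    ("client", "Externalizing"),
]
irregular_shifts = client_shifts(session, "Irregular")
print(irregular_shifts[("Externalizing", "Focused")])
```

A tally of this kind only describes what followed what; whether a shift is more frequent than chance would still require an appropriate sequential test.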

Wiseman and Rice (1989) predicted that the therapist’s vocal quality, as measured by the

TVQ, would impact the client’s cognitive-affective processing as it occurred in a client-centered

therapeutic intervention called systematic evocative unfolding (e.g., Greenberg, Rice, & Elliott,

1993). The therapist uses this intervention when the client makes a statement indicating that he

or she is puzzled or upset by his or her reaction to a specific situation. This statement is known

as a marker and its presence alerts the therapist that the client is experiencing a specific

cognitive-affective processing problem. The systematic evocative unfolding method is a

treatment for this problem and involves the therapist’s helping the client to unfold events

surrounding the perplexing event.

In the Wiseman and Rice (1989) study, two sessions from each of five female clients

receiving psychotherapy at a local university counseling center were selected. The sessions were

selected because they contained instances of the systematic evocative unfolding intervention.

Responses for the therapist and the client were rated on the Vocal Quality measures and

Experiencing scale (EXP) (Klein et al., 1986). The response was the unit of analysis and was

defined as “everything one participant said between two successive productions of the other

participant” (Wiseman & Rice, 1989, p. 283).

Wiseman and Rice (1989) hypothesized that the therapist’s use of Irregular vocal quality

would precede a shift in the client’s vocal quality. The hypothesized shift for the client would be

from an unproductive vocal quality, Externalizing, to a productive vocal quality, Focused. The

Focused category has been described as the client’s version of the therapist’s Irregular vocal

quality (e.g., Kerr, 1983). When the client speaks in a Focused vocal quality, he or she is

believed to be searching unexplored psychological territory in such a way as to permit new

meanings to emerge. When the therapist is speaking in Irregular vocal quality, he or she is co-

exploring new ground. During this process it appears for the therapist that “the effort to

symbolize seems to be as much for oneself as for the listener” (Kerr, 1983, p. 69).

The researchers also predicted that Irregular responses would precede a shift toward

improved client engagement in the therapy process as determined by the client Experiencing

scale (EXP) (Klein et al., 1986). Low EXP would reflect limited psychological engagement and

thus, poor processing. High EXP would reflect focused psychological engagement and,

therefore, productive client processing. Also, the peak rating of a response was evaluated on the

EXP measure in this study. The peak rating refers to the highest level of EXP reached in a

response.

The results were significant for the Vocal Quality measures: Irregular TVQ preceded the

client’s shift in vocal quality from Externalizing to Focused. In other words, the therapist’s use

of Irregular vocal quality preceded the client’s shift from speaking in an unproductive vocal

quality to a productive one. Wiseman and Rice (1989) did not test whether or not the client’s

vocal quality impacted the therapist’s vocal quality. While the analysis did not show that

Irregular vocal quality preceded a client shift from Low EXP (peak rating of 1, 2, or 3) to High

EXP (peak rating of 5, 6, or 7), it did show that Irregular vocal quality preceded a shift from

Low EXP to intermediate EXP (peak rating = 4).
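
The EXP cut-offs reported above can be written out as a small Python sketch. The peak-rating bands (1 to 3, 4, and 5 to 7) come from the text; the function names are hypothetical and chosen only for illustration.

```python
def exp_level(peak_rating):
    """Bin a peak EXP rating (1-7) using the cut-offs given in the text."""
    if peak_rating not in range(1, 8):
        raise ValueError("peak EXP ratings run from 1 to 7")
    if peak_rating <= 3:
        return "Low"
    if peak_rating == 4:
        return "Intermediate"
    return "High"

def exp_shift(peak_before, peak_after):
    """Describe the client's shift across one therapist response."""
    return exp_level(peak_before), exp_level(peak_after)

print(exp_shift(2, 4))  # a Low-to-Intermediate shift
```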

Although the study violated the assumption of independence, the researchers saw these

results as support for the usefulness of task-focused sequential analysis for understanding how

the therapist impacts the client. They suggested that future studies link client processes at the

session level, such as systematic evocative unfolding, to end-of-treatment outcomes for

successful versus unsuccessful cases.

Different Treatments, Different Demands

Nixon (1980) attributed the lack of significant results for Focused and Externalizing

vocal categories in her study to the possibility that these vocal patterns are not central to change

processes in wholistic primal therapy. Gillies (1990) encountered a similar situation in a study in

which three patients received brief psychodynamic treatment through the Mount Zion

Psychotherapy Project. During the pilot study, Gillies (1990) listened to two audiotaped

psychodynamic therapy sessions and found so few Focused examples that she consulted with

two CVQ developers. Gillies (1990) referred to her personal communication with Rice and

Greenberg who “hypothesized the Focused voice may be rather weak in psychodynamic therapy

(as opposed to client-centered or experiential therapy) because different psychological processes

may be at work” (p. 41).

Nixon’s (1980) and Gillies’ (1990) findings highlight the idea that different treatment

approaches can make different demands on the client. Rice and Kerr (1986) urged researchers to

apply the Vocal Quality scales to different treatments and problems in order to understand how

vocal quality is related to change processes in them. Rice and Kerr (1986) suggested that

examining vocal quality in cognitive behavioural therapy (CBT) would be particularly intriguing.

One reason this would be interesting is that the CBT client is not required to do the

emotionally charged self-searching that is required in experiential therapies like client-centered

and process-experiential therapies. Instead, there is a focus on understanding how one’s

thoughts lead to different emotions and behaviours. The work of CBT is on changing a person’s

thoughts, beliefs, etc., so that more comfortable or healthier psychological experiences can

follow (Beck & Weishaar, 1989, as cited in Burgoon et al., 1993). In addition, although the processing of

emotions is becoming more valued among CBT proponents, the CBT client’s emotions have

generally been used diagnostically, as with hot cognitions, which indicate that the client is in

touch with important issues (Samoilov & Goldfried, 2000). As a result, the CBT client may not

display Focused vocal quality in the amounts related to good client engagement and outcome as

seen in the studies cited above where treatments were primarily client-centered or experiential.

Regarding the therapist’s vocal quality, Kerr (1983) wrote that “one of the underlying

assumptions of the function of therapist vocal quality was that therapists will use different vocal

patterns when engaged in different help intended communications or speech acts” (p. 35). Like

the CVQ, the TVQ was also developed within the client-centered tradition. Research using this

measure found that Irregular vocal quality, which corresponds to therapist behaviours that

facilitate the client’s experiential searches (e.g., Kerr, 1983), was related to productive client

behaviours (Wiseman & Rice, 1989). Because the experiential search is not a core value of the

CBT approach, Irregular vocal quality would not be expected to be used as much in this

treatment. However, it makes intuitive sense that the CBT therapist would express warmth and

caring as well as a problem-solving attitude in his or her vocal quality. It seems that Natural

vocal quality would be an effective conveyor for these types of messages.

Methods of Studying Vocal Quality in the Psychotherapy Setting

In the psychotherapy setting, vocal quality has also been investigated by using acoustic

parameters such as fundamental frequency. Diamond, Rochman, and Amir (2010) measured a

number of acoustic parameters in their exploration of emotional changes in the client’s vocal

quality while he or she was engaged in a PE-EFT intervention for unfinished business. This

approach, however, has not yet been used to investigate the acoustic properties of the more

complex vocal quality patterns found in the Therapist and Client Vocal Quality scales.

Filtering the content from speech so that only vocal quality remains is another way in

which the psychotherapy client’s vocal quality has been studied. Removing the content from the

speaker’s speech is accomplished by running the audiotaped therapy sessions through a

low-pass filter at 300 Hz. Mohr et al. (1991) used this technique to study anger in clients engaged in

an emotionally evocative treatment. While their results were interesting, the content-filtering

procedure has drawbacks. The problem is that low-pass filters remove the upper frequencies of

sound. Verbal content lies at these frequencies, as do important acoustic characteristics such

as timbre and tone (Ochai & Fukumura, 1957, as cited in Gillies, 1990). While some emotions may be

detectable after low-pass filtering, the more subtle variations and accents needed to judge vocal

categories in the Client and Therapist Vocal Quality scales are not (David Orgel, personal

communication, March, 2007; Rice & Koke, 1981).
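
The effect of a 300 Hz cut-off can be illustrated with a minimal Python sketch. The first-order (single-pole) filter, the 16 kHz sample rate, and the synthetic sine tones are all assumptions made for illustration; the filter design used in the studies cited above is not described here, and real speech is far more complex than a pure tone.

```python
import math

def low_pass(samples, sample_rate_hz, cutoff_hz=300.0):
    """Apply a first-order (single-pole) low-pass filter to a list of samples."""
    dt = 1.0 / sample_rate_hz
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    alpha = dt / (rc + dt)  # smoothing factor in (0, 1)
    filtered, previous = [], 0.0
    for x in samples:
        previous = previous + alpha * (x - previous)
        filtered.append(previous)
    return filtered

def rms(samples):
    """Root-mean-square amplitude of a signal."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def tone(freq_hz, sample_rate_hz=16000, seconds=1.0):
    """Generate a pure sine tone (synthetic stand-in for recorded speech)."""
    n = int(sample_rate_hz * seconds)
    return [math.sin(2.0 * math.pi * freq_hz * i / sample_rate_hz) for i in range(n)]

low, high = tone(100), tone(2000)
gain_low = rms(low_pass(low, 16000)) / rms(low)     # mostly preserved
gain_high = rms(low_pass(high, 16000)) / rms(high)  # strongly attenuated
print(round(gain_low, 2), round(gain_high, 2))
```

In this sketch the 100 Hz tone passes nearly intact while the 2000 Hz tone loses most of its amplitude, which is the sense in which energy above the cut-off is lost to the rater.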

Summary

Butler et al. (1962) laid out a plan to investigate the vocal characteristics of the therapist

and client as individuals and in interaction with one another. These goals were achieved over the

course of twenty-plus years with several different studies (e.g., Butler et al., 1962; Rice, 1965;

Wiseman & Rice, 1989). There are some similarities in the methods of these studies, such as

analyzing data sets of clients treated with Rogerian-style therapy and using correlational designs

for the client’s and therapist’s vocal qualities. However, most sample sizes were small, there

were no discernible standard criteria in terms of diagnoses or treatment duration for the

participants, and different versions of the Vocal Quality scales were used in the more

contemporary studies. Several researchers suggested future investigations should involve larger

samples (e.g., Watson & Greenberg, 1996); different treatment orientations (Rice & Kerr, 1986);

and linking in-session process with treatment outcome (e.g., Wiseman & Rice, 1989).

To date, there are no studies testing the relationship of the therapist’s and client’s Vocal

Quality categories (TVQ and CVQ) in terms of the client’s report of in-session change. The

Kerr (1983) study involved the client’s evaluation of how the therapist impacted him or her, but

did not assess the degree to which the client felt he or she changed or experienced a shift in

insight from a particular session. Also, there are no studies of the relationship of TVQ and CVQ

to end-of-treatment outcome in process-experiential therapy (PE-EFT) for depression or

cognitive-behavioural treatment (CBT) for depression. As a result, the goal of the current study

is to explore the therapist’s and client’s vocal quality categories as they relate to the client’s

report of change in the session and to end-of-treatment outcome in PE-EFT and CBT treatments

for depression.

Research questions and hypotheses

Research Question 1.

The first research question is: Are the therapist’s and client’s vocal qualities

related to the client’s report of change? Three hypotheses follow from this.

1a. There will be a higher proportion of productive CVQ categories (Emotional, Focused,

Emotional Plus Focused) in high change than low change sessions.

1b. There will be a higher proportion of unproductive CVQ categories (Limited and

Externalizing) in low change than in high change sessions.

1c. There will be a higher proportion of Productive TVQ (Irregular, Softened, and

Natural) in high change than in low change sessions.
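
The proportions referred to in these hypotheses can be sketched in Python as follows. The grouping of CVQ categories into productive and unproductive sets follows Hypotheses 1a and 1b; the function names and the example ratings are invented for illustration and are not drawn from the study data.

```python
from collections import Counter

PRODUCTIVE_CVQ = {"Emotional", "Focused", "Emotional Plus Focused"}
UNPRODUCTIVE_CVQ = {"Limited", "Externalizing"}

def category_proportions(ratings):
    """Return the proportion of rated responses falling in each category."""
    counts = Counter(ratings)
    total = len(ratings)
    return {category: n / total for category, n in counts.items()}

def grouped_proportion(ratings, group):
    """Proportion of responses whose category belongs to `group`."""
    return sum(1 for r in ratings if r in group) / len(ratings)

# Invented ratings for one session segment, for illustration only.
high_change = ["Focused", "Externalizing", "Focused", "Emotional", "Limited"]
print(grouped_proportion(high_change, PRODUCTIVE_CVQ))    # 3 of 5 responses
print(grouped_proportion(high_change, UNPRODUCTIVE_CVQ))  # 2 of 5 responses
```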


Research Question 2.

The second research question is: Are the therapist’s and client’s vocal qualities related to

the client’s scores on the outcome measures at termination? Two hypotheses follow from this:

2a. A higher proportion of the productive CVQ categories (Emotional, Focused,

Emotional Plus Focused) will predict better scores for clients on the outcome measures at the

end of treatment.

2b. A higher proportion of the Productive TVQ (Irregular, Softened, and Natural) will

predict better scores for clients on the outcome measures at the end of treatment.


Research Question 3.

The third research question is: Is there a difference in the TVQ and CVQ categories

primarily expressed in PE-EFT and CBT? Two hypotheses follow from this:

3a. CBT clients will have a lower proportion of Focused vocal quality than PE-EFT

clients.

3b. CBT therapists will have a higher proportion of Natural vocal quality than PE-EFT

therapists.

Chapter 2:

Method

Participants

Data for the current study were drawn from the Depression Project conducted at the

Ontario Institute for Studies in Education/University of Toronto (OISE/UT). Sixty-six clients

who were diagnosed with major depression participated in the Depression Project. Table 2

displays the clients’ demographic and pre-treatment characteristics. Clients were diagnosed with

major depression using the Structured Clinical Interview for DSM-IV (SCID-IV; First, Spitzer,

Williams, & Gibbon, 1997) and the Diagnostic and Statistical Manual for Mental Disorders (4th

ed.; DSM-IV; American Psychiatric Association, 1994). None of the participants was diagnosed

with an Axis I eating disorder, manic depression, psychosis, or substance abuse, or with an

antisocial, borderline, or schizotypal Axis II disorder. None of the participants was at high

suicide risk or receiving pharmacotherapy or other psychological treatment during the study. All

clients were able to speak and understand English.

Therapists

There were 15 therapists about evenly divided between the CBT condition (n = 8) and the

PE-EFT condition (n = 7). Two therapists were psychologists and 13 were graduate students at

OISE/UT. Therapists treated more than one client. Therapists ranged in age from 26 to 43 years

(M = 32.73, SD = 6.08) and their experience varied from 1 to 15 years (M = 5.23, SD = 4.74).

There were no significant differences in age, experience, education or gender between therapists

in the treatment conditions.

Table 2

Client Characteristics at Pre-treatment

                                        Completers (n = 66)
Variable                                n (%)       M

Gender
    Male                                22 (33)
    Female                              44 (67)
Age in years                                        41.52ᵃ
Marital status
    Married/common law                  28 (42)
    Single                              28 (42)
    Separated/divorced                   9 (14)
    Widowed                              1 (2)
Education
    Secondary                           16 (24)
    Postsecondary/college               37 (56)
    Graduate school                     13 (20)
Beck Depression Inventory
    Mild-moderate                       12 (18)
    Moderate-severe                     38 (58)
    Extremely severe                    16 (24)
No. of previous episodes of MDDᵇ
    Current episode = 1st episode        4 (6)
    2-4 episodes                        17 (26)
    5 or more                           41 (62)
Length of current episodeᶜ
    < 6 months                          19 (29)
    6 months-9 years                    34 (51)
    > 9 years                            8 (2)
Global assessment of functioningᵉ                   58.17

Note. MDD = major depressive disorder. Criteria for excluding a client from the study: taking medication,

engaged in another form of treatment, inability to communicate in English, high risk of suicide, and current

or previous diagnosis of DSM-IV Axis I disorders of substance abuse, psychosis, manic-depression, or

eating disorder.

ᵃRange = 21-65 years, SD = 10.82. ᵇUnknown for 4 completer clients. ᶜUnknown for 5 completer clients.

ᵈDSM-IV Axis II disorders that were excluded include borderline, antisocial, or schizotypal. ᵉMean

Structured Clinical Interview for DSM-IV global assessment of functioning; range = 51-65 for completers.

This table was adapted from Watson, Gordon, Stermac, Kalogerakos, & Steckley (2003, p. 774).

Treatments

An expert in CBT trained the CBT therapists and an expert in PE-EFT trained the PE-

EFT therapists. The CBT therapists were trained according to the treatment manual written by

Beck, Rush, Shaw, and Emery (1979). The PE-EFT therapists were trained using the manuals

written by Greenberg, Rice, and Elliott (1993) and Greenberg and Watson (1998). The experts

trained and supervised students as well as personally acting as therapists in the study. This

arrangement controlled for investigator bias in that it eliminated any bias that may have resulted

from having an expert from only one of the treatment orientations conduct the training and

supervision of all the therapists on the study.

In terms of the differences between the treatments, CBT therapists target their clients’

dysfunctional cognitions, attitudes, core beliefs, and behaviours since these are seen as causing

the clients’ psychological distress. PE-EFT therapists focus on their clients’ distressing emotions

because these are seen as root causes of psychological distress. CBT therapists provide

treatment interventions such as behavioural experiments and thought records. PE-EFT therapists

provide interventions such as empathic reflections and gestalt-based chair work.

Clients were randomly assigned to either the PE-EFT or CBT treatment group. They

received 16 sessions of one-on-one therapy, one hour per week. Weekly supervision sessions

were used to check treatment adherence. All of the psychotherapy sessions were recorded on

audio and video after having obtained the clients’ consent.

Process Measures

Client Vocal Quality Scale (CVQ; Rice, Koke, Greenberg, & Wagstaff, 1979; Rice &

Kerr, 1986).

According to Rice (1980), the “CVQ was designed to assess the vocal style of

participation of the client in any given utterance, without regard to the content of what is being

said” (p. 1). The measure is usually used to rate the response or utterance, defined as everything

one speaker says “between two successive productions of the other participant” (Wiseman &

Rice, 1989, p. 283). The CVQ consists of four nominal categories, each of which is defined by

the “accents, accentuation, regularity of pace, terminal contours, perceived energy, and

disruption of speech” (Rice & Kerr, 1986, p. 79).

Speakers using Externalizing vocal quality speak in an even pace with high energy. They

accent their speech through rising pitch and sometimes by increasing loudness. Although the

person’s vocal quality seems full of energy and expression, there is a rhythm to it that conveys a

quality of “talking at” (Rice & Kerr, 1986, p. 78) or a “well-rehearsed speech or chatting to a

friend” (Kennedy-Moore & Watson, 1999, p. 207). In contrast, the vocal quality of Limited

vocal quality speakers sounds very low in energy and hollow. A vocal quality that sounds

“fragile, thin, or empty” is prototypical of this category (Rice & Kerr, 1986, p. 80). Emotional

vocal quality is present when emotion contorts the regular flow of speech. The speaker may

sound as if he or she is struggling to keep his or her vocal quality under control. “The vocal quality

may break, tremble, rise to a shriek, and so on” (Rice & Kerr, 1986, p. 80). Laughter, however,

is not considered to be Emotional vocal quality. In the Focused category, the speaker’s vocal

quality is energetic, but also softer than usual. According to Rice and Kerr (1986), words are

emphasized by increasing the loudness or drawling rather than pitch. Focused speech sounds

choppy with a pace that can slow down, speed up, and halt abruptly. In addition, unfilled pauses

and word accents occur in unpredictable places. Sometimes words sound drawled and the

endings of phrases and sentences can have “ragged” terminal contours (Rice & Kerr, 1986, p.

79).

Regarding the CVQ’s psychometric properties, more recent inter-rater reliabilities range

from .70 to .88 (Greenberg & Malcolm, 2002; Watson & Greenberg, 1996). The measure’s

construct validity is supported by Rice and Gaylin’s (1973) study involving clients in client-

centered therapy and their Rorschach scores. Wexler’s (1974) study of vocal quality and

therapy-relevant processes of differentiation and integration in university students also provides

evidence of construct validity.

Research also supports the measure’s predictive validity, such as the Butler et al. (1962)

study which associated Externalizing vocal quality with poor client processing and with poor

end-of-treatment outcome in client-centered therapy. Limited vocal quality has been associated

with poor client processes and with partially successful outcomes in client-centered therapy

(Butler et al., 1962; Rice & Wagstaff, 1967) and with poor treatment outcome in wholistic primal

therapy (Nixon, 1980). In contrast, Emotional vocal quality has been associated with productive

client processes in client-centered (Butler et al., 1962; Rice & Kerr, 1986) and process-

experiential therapy (Watson & Greenberg, 1996) and has been shown to be significantly

correlated with outcome in a study of wholistic primal therapy (Nixon, 1980), psychodynamic

therapy (Sarnat, 1976 in Rice & Kerr, 1986), and client-centered therapy (Butler et al., 1962).

Research on Focused vocal quality shows it is positively related to productive psychological

processes in gestalt, client-centered and process-experiential therapy (Greenberg, 1983; Watson

& Greenberg, 1996; Wiseman & Rice, 1989) and to end-of-treatment outcome in client-centered

and psychodynamic therapies (Butler et al., 1962; Rice & Wagstaff, 1967; Sarnat, 1976, as cited

in Rice & Kerr, 1986).

Therapist Vocal Quality Scale (TVQ; Rice & Kerr, 1986).

The TVQ was designed to identify therapist vocal qualities that impact the client’s ability

to engage in therapeutic work. The TVQ does this by identifying the therapist’s vocal quality in

relation to his or her baseline vocal quality. Shifts from the therapist’s baseline vocal quality can

be tracked and represent shifts in his or her interaction with the client (Rice & Kerr, 1986). The

TVQ is applied to the response or utterance, defined as everything one speaker says “between

two successive productions of the other participant” (Wiseman & Rice, 1989, p. 283).

The TVQ categories are viewed as helpful or unhelpful depending on the effect they are

thought to have on the client in theory. Works by Kerr (1983) and Rice and Kerr (1986) suggest

that Softened, Irregular, and Natural categories are productive because they either convey client-

centered relationship conditions or reflect the kinds of therapeutic assistance the client-centered

therapist would offer the client when exploring not-yet-known emotional experience and

working through newly discovered experience. The effectiveness of the Definite category is

thought to depend on the situation and context. Restricted, Patterned, and Limited vocal

qualities are expected to dampen the client’s ability to engage in therapeutic processes and are

regarded as negative regardless of the situation.

Regarding the TVQ’s psychometric properties, for inter-rater reliability, Rice and Kerr

(1986) report significant Cohen’s (1960) kappas “for the seven nominal TVQ categories… .33,

.31, and .31 for the combinations of the three raters” (p. 99). Rice and Kerr (1986) explain that

low inter-rater reliability is inevitable given the number and complexity of the TVQ categories.

They state that if the scale were simplified in order to increase inter-rater reliability, the scale’s

validity would likely be compromised. Wiseman and Rice (1989) report a Cohen’s kappa of .60

(p < .001) for two raters.
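
For readers unfamiliar with the statistic, Cohen’s (1960) kappa corrects the raters’ observed agreement for the agreement expected by chance: kappa = (observed − expected) / (1 − expected). The following Python sketch computes it for two raters. The function name and the ten example ratings are invented for illustration and do not come from any of the studies cited.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's (1960) kappa for two raters assigning nominal categories."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must rate the same non-empty set of responses")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: product of each rater's marginal proportions, summed.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)

# Invented ratings of ten therapist responses, for illustration only.
a = ["Natural", "Natural", "Irregular", "Softened", "Definite",
     "Natural", "Irregular", "Natural", "Softened", "Definite"]
b = ["Natural", "Definite", "Irregular", "Softened", "Definite",
     "Natural", "Natural", "Natural", "Softened", "Natural"]
print(round(cohens_kappa(a, b), 2))
```

In this invented example the raters agree on 7 of 10 responses, but because chance agreement is .30, kappa is noticeably lower than the raw agreement rate.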

The TVQ categories have been described in acoustic terms. Softened vocal quality

sounds soft and slow. It has been called a “lax voice” that sounds “muffled or fuzzy” (Rice &

Kerr, 1986, p. 95). Wiseman and Rice (1989) regarded the Irregular vocal quality as a parallel

to the client’s Focused vocal quality in that it “is characterized by the therapist groping for the

meaning of the client’s message with a new path quality” (p. 282). The pace of the Irregular

vocal quality speaker is variable with abrupt stops and starts as well as unexpected slowing and

quickening. Accentuation is also uneven and unexpected, with some words spoken with a drawl

or lengthening. Natural vocal quality has “adequate energy, fairly full, standard English

emphasis patterns and tempo, with neither an overly tense nor relaxed voice. The voice is

unstrained and natural” (Rice & Kerr, 1986, p. 95).

Definite vocal quality sounds energetic, full, and confident. Rice and Kerr (1986) write:

Stresses are usually down pitched, though they can rise in pitch if accompanied with high energy

and an irregular pattern. Phrase endings are definite, with “heavy,” strong emphases. This

category includes “confrontational voice;” for example, “Well, what are you going to do?” (p.

95). Restricted vocal quality has enough energy to convey the content, but the speaker’s voice

sounds strained, as if “something is being held back” with a “slightly tremulous, whiny, droning”

quality (Rice & Kerr, 1986, p. 95). The effect of the Restricted vocal quality on the listener is

thought to be “unsatisfying, distanced, and seems uninvolved” (Rice & Kerr, 1986, p. 95).

Patterned vocal quality is also regarded as negatively impacting the listener, as it is:

Patterned for emphasis, especially using pitch. Often a syllable at the end of a phrase, on

which pitch would normally go down, has a rising or level pitch. The tempo is normal or

fast, and the rhythm of the words is distorted to fit into the pattern. The category as a

whole sounds “sing-song”. (Rice & Kerr, 1986, p. 95)

Limited vocal quality sounds lifeless, flat, and monotone. “This pattern may be just too soft—so

whispery, breathy, or creaky that it fades away—or high-pitched, ending in a kind of squeak”

(Rice & Kerr, 1986, p. 95).

Outcome Measures

Beck Depression Inventory (BDI).

The BDI (Beck, Ward, Mendelson, Mock, & Erbaugh, 1961) is a well-researched,

commonly used self-report measure of depression. The questionnaire contains 21 items, each of

which reflects a depressive symptom. Each item contains four statements reflecting the severity

of the symptom ranging from the absence of the symptom (0) to very intense (3). Respondents

are instructed to select the one statement that most accurately reflects the way they have felt

over the past week. The total of all 21 items is the score used for interpretation. Higher scores

indicate worse depressive symptoms. The BDI’s psychometric properties are well established.

Reported test-retest reliabilities ranged from .69 to .90 (Moreno, Fuhriman, & Selby, 1993).

High internal consistency has also been reported (α = .81), as well as good discriminant

capabilities (Beck, Steer, & Garbin, 1988). Testing with the Hamilton Rating Scale for

Depression (HRSD; Hamilton, 1960, as cited in Moreno et al., 1993) shows that the BDI has

strong convergent validity (r = .84).
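
The scoring rule described above (21 items, each scored 0 to 3, summed to a total) can be expressed as a brief Python sketch. The function name and the example responses are hypothetical; no item content or clinical cut-off scores are implied.

```python
def bdi_total(item_scores):
    """Sum 21 BDI item scores, each rated 0 (absent) to 3 (very intense)."""
    if len(item_scores) != 21:
        raise ValueError("the BDI has 21 items")
    if any(score not in (0, 1, 2, 3) for score in item_scores):
        raise ValueError("each item is scored from 0 to 3")
    return sum(item_scores)

# Invented responses, for illustration only.
responses = [2, 1, 0, 3, 1, 2, 0, 1, 1, 2, 0, 1, 2, 1, 0, 1, 2, 1, 0, 1, 2]
print(bdi_total(responses))
```

Under this rule the total can range from 0 to 63.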

Dysfunctional Attitudes Scale (DAS).

The DAS (Weissman & Beck, 1978) is a self-report questionnaire meant to evaluate attitudes

that may make the respondent vulnerable to depression. The questionnaire contains 40

statements which respondents rate on a 7-point Likert scale ranging from totally disagree to

totally agree. Instructions on the first page of the questionnaire tell the respondent to answer

based on the way he/she thinks most of the time. An example of a DAS item is: “If I do not do

as well as other people, it means I am an inferior human being” (Cane, Olinger, Gotlib, &

Kuiper, 1986, p. 308). Higher scores indicate more dysfunctional thinking. Two subscales

include Performance Evaluation and Approval by Others. The DAS has solid psychometric

properties. Dobson and Breiter (1983) report high test-retest reliability (r = .84) for a two-month

time frame. Reports of internal consistency include α = .85 for an adult sample (Oliver &

Baumgart, 1985) and α = .88 to .90 for a university sample (Dobson & Breiter, 1983).
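
Assuming, as is conventional, that the internal consistency coefficients (α) reported for these measures are Cronbach’s alpha, the statistic is computed as α = (k / (k − 1)) × (1 − Σ item variances / variance of total scores), where k is the number of items. The Python sketch below applies this formula to a small invented data set; the function name and the ratings are illustrative assumptions, not DAS data.

```python
from statistics import variance

def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of respondents' item-score lists."""
    k = len(item_scores[0])  # number of items
    if k < 2 or len(item_scores) < 2:
        raise ValueError("need at least two items and two respondents")
    # Variance of each item across respondents (columns of the data).
    item_variances = [variance(column) for column in zip(*item_scores)]
    # Variance of respondents' total scores.
    total_variance = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)

# Invented 7-point ratings from five respondents on four items.
data = [
    [7, 6, 7, 6],
    [5, 5, 4, 5],
    [2, 3, 2, 2],
    [6, 6, 5, 7],
    [3, 2, 3, 3],
]
print(round(cronbach_alpha(data), 2))
```

Alpha rises toward 1 as respondents’ answers across items move together, which is why it is read as an index of internal consistency.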

Problem-Focused Style of Coping (PF-SOC).

The PF-SOC (Heppner, Cook, Wright, & Johnson, 1995) measures a person’s preferred

or dispositional manner of coping with problems. Wei, Heppner, and Mallinckrodt (2003)

explain that “in essence, the PF-SOC assesses the extent to which people believe in general that

they are coping well and making progress toward resolving their problems” (p. 440). Items

reflect cognitive, affective, and behavioral coping experiences. Respondents are asked to rate

how typical each item is of the way they respond to problem situations. They indicate their

responses on a 5-point scale ranging from 1 (almost never) to 5 (almost all of the time).

Factor analysis of all the items revealed three distinct coping methods. The Reflective

Style describes a tendency for a person to be systematic in his or her approach by considering

cause and effect and by planning responses. The Reactive Style of coping indicates negative

cognitive-affective reactions that would exhaust and confuse the person and interfere with efforts

to cope productively. A person using the third method, Suppressive Style, would avoid dealing

with the problem or perhaps not even acknowledge that there is a problem. The mean score of all

items within each coping style is that subscale’s score. Higher scores mean the person is

engaging more in that particular style.
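
The subscale scoring rule (the mean of the items belonging to each coping style) can be sketched in Python as follows. The item-to-subscale key shown here is hypothetical, since the actual PF-SOC item assignments are not given in this section, and the responses are invented.

```python
from statistics import mean

# Hypothetical item-to-subscale key; the real PF-SOC key is not given here.
SUBSCALE_ITEMS = {
    "Reflective": [1, 4, 7],
    "Reactive": [2, 5, 8],
    "Suppressive": [3, 6, 9],
}

def subscale_scores(responses, key=SUBSCALE_ITEMS):
    """Mean rating (1-5) per coping style from a {item_number: rating} dict."""
    for item, rating in responses.items():
        if rating not in (1, 2, 3, 4, 5):
            raise ValueError(f"item {item}: ratings run from 1 to 5")
    return {style: mean(responses[i] for i in items) for style, items in key.items()}

# Invented responses, for illustration only.
answers = {1: 4, 2: 2, 3: 1, 4: 5, 5: 3, 6: 2, 7: 3, 8: 1, 9: 3}
print(subscale_scores(answers))
```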

Regarding the measure’s psychometric properties, Heppner et al. (1995) found test-retest

correlations with a three-week span between testing to be .71 for Reactive Coping, .67 for

Reflective Coping, and .65 for Suppressive Coping. They also reported internal consistency figures for

each style: Reactive (α = .73), Suppressive (α = .76), and Reflective (α = .77). Concurrent

validity was demonstrated in a study pairing the PF-SOC scales with the Problem-Solving

Inventory (PSI; Heppner, 1988).

Inventory of Interpersonal Problems (IIP).

According to its developers Horowitz, Rosenberg, Baer, Ureno, and Villasenor (1988),

the IIP “describes the types of interpersonal problems that people experience and the level of

distress associated with them before, during, and after psychotherapy” (p. 885). The self-report

questionnaire consists of 127 items representing interpersonal problems in terms of “It is hard for

me to….” or “These things I do too much.” An example of the former is “It is hard for me to

trust other people” and of the latter is “I am too easily persuaded by other people”. For each

item, clients indicate their level of suffering by rating the statement on a scale ranging from 0

(not at all) to 4 (extremely). Higher scores reflect worse interpersonal problems. The mean

score of all IIP items produces the Circumplex Total, which is a global measure of interpersonal

suffering. Also, the 127 items can be classified into eight subscales: Domineering-Controlling,

Vindictive/Self-Centered, Cold-Distant, Socially Inhibited, Nonassertive, Self-Sacrificing,

Overly Accommodating, and Intrusive-Needy. Higher scores indicate greater interpersonal

distress.

Regarding the IIP’s psychometric properties, Horowitz et al. (1988) reported test-retest

correlations for the subscales from .80 to .87, with an IIP total correlation of .98 over a 10-week

interval. Good internal consistency alphas have been reported, ranging from .82 to .94

(Horowitz et al., 1988) and .72 to .85 (Alden, Wiggins, & Pincus, 1990). Comparisons of the IIP

with other measures of interpersonal distress indicate good concurrent validity (Lambert,

Hansen, Umphress, Lunnen, Okiishi, Burlingame, & Reisinger, 1996, as cited in Woodward,

Murrell, & Bettler, 2005). Research has also demonstrated the measure’s criterion validity and

sensitivity to clinical change in high versus low stress in university students (Woodward et al.,

2005). In addition, Horowitz et al. (1988) found that the measure could sort treatment

completers from noncompleters.

Rosenberg Self-Esteem Scale (RSES).

The RSES (Rosenberg, 1989) is a well-known measure of self-esteem in the social

sciences. The current study used the Bachman and O’Malley (1977) version of the RSES which

consists of 10 self-report items. An example of one item is: “I feel that I am a person of worth,

at least on an equal plane with others”. Respondents are asked to rate their agreement on each

item according to a 5-point scale ranging from never to almost always. The score used for

interpretation is the mean of the items. Higher scores indicate higher self-esteem.

The RSES has good psychometric properties. The measure has good test-retest reliability

(r = .82) for a college sample after one week (Torrey, Mueser, McHugo, & Drake, 2000 referring

to a study conducted by Fleming & Courtney, 1984). Bachman and O’Malley (1977) reported

good internal consistency (α = .81). Studies have also provided evidence of the measure’s

construct validity by demonstrating positive correlations with measures of happiness (r = .54)

and needs for self-development (r = .44) (Bachman & O’Malley, 1977) as well as parental

warmth (r = .42), optimism (r = .61), and life satisfaction (r = .61) (Greenberger, Chen,

Dmitrieva, & Farruggia, 2003). Discriminant validity was further supported by negative

correlations of the RSES with somatic symptoms (r = −.34) and negative affective states (r =

−.52) (Bachman & O'Malley, 1977) as well as depressive symptoms (r = −.64) (Greenberger et

al., 2003).

Symptom Checklist-90-Revised (SCL-90-R).

The SCL-90-R (Derogatis, Rickels, & Rock, 1976) is a 90-item self-report questionnaire

used to evaluate clients’ psychological suffering in both therapy and research contexts (Schmitz,

Hartkamp, & Franke, 2000). Peveler and Fairburn (1990) wrote that “each item describes the

experience of having a psychiatric symptom” (p. 874). Respondents are instructed to refer to

their experience from the past week when rating each item on a 5-point scale increasing in

intensity of distress from 0 (not at all) to 4 (extremely). Raw scores are converted to T-scores.

Higher scores indicate more distress.

In addition to a Grand Total score, the SCL-90-R scores produce three indices of distress:

the Global Severity Index (GSI), the Positive Symptom Total, and the Positive Symptom Distress

Index. The Global Severity Index score reflects the total number of symptoms the respondent

reports plus the reported severity. According to Derogatis (1983), the Global Severity Index is

“the best single indicator of the current level or depth of the disorder, and should be utilized in

most instances where a single summary measure is required” (Derogatis, 1983, p. 11 as cited in

Tingey, 1989, p. 24).

The measure has strong psychometric properties. High test-retest reliability has been

reported (r = .84) for the SCL-90-R total and for the subscales (r = .68 - .83) (Horowitz et al.,

1988, p. 887). Tennen, Affleck, and Herzberger (1985) reported the measure’s impressive

“internal consistency and test-retest reliability correlations; both ranged almost exclusively in the

eighties and nineties” (as cited in Tingey, 1989, p. 23). Woodward, Murrell, and Bettler (2005)

reported high concurrent validity for the Global Severity Index. Regarding clinical sensitivity,

Tennen et al. (1985) concluded that “the scores…are sensitive to treatment effects”

(p. 586, as cited in Tingey, 1989, p. 23).

Post-Session Outcome Measure

Client Task-Specific Change Measure-Revised (CTSC-R; Watson, Greenberg, Rice,

& Gordon, 1996).

The CTSC-R is a post-session self-report questionnaire intended to gauge the degree to

which clients felt they changed in the session. The items refer to experiences the client might have

had given the specific treatment orientation. For example, 12 items refer to PET treatment

effects such as “I understood a puzzling reaction of my own after I discovered what a particular

situation meant to me or how I was interpreting it”. Four items refer to CBT interventions such

as “I feel that I was able to successfully challenge my negative/automatic thoughts”. The clients

indicate how much they agree with the statements on a 7-point scale where number 1 is “not at

all” and number 7 is “very much”. The client’s indication of agreement for each question is

averaged over all questions, providing a single index of post-session change. Higher scores

indicate the client’s report of higher change. Mean scores of five or more indicate that the client is

reporting moderate to high change in that session. Watson, Goldman, and Greenberg (2007)

explain that a CTSC-R mean of five or more signifies a shift in the clients’ “understanding of

their problems, how they are treating themselves, and how they are feeling about themselves and

others” (p. 19).

Regarding the measure’s psychometric properties, Watson, Schein, and McMullen (2010)

reported that factor analysis revealed two distinct constructs underlying the questions: 1.

Behaviour Change and 2. Understanding and Awareness. Also, the measure has been found to

have high internal consistency, with Cronbach’s alphas ranging from .94 to .98. Reported item-total

correlations are as high as .90 to .96, with most exceeding .70. The construct validity of

the measure was established through its relationship to the Beck Depression Inventory (BDI,

Beck et al., 1961). CTSC-R (Watson et al., 1996) scores predicted BDI scores regardless of the

treatment type or of the two factors (Behaviour Change and Understanding and Awareness).

Procedure

Session selection.

Sessions were selected from all but the final session (#16). Sessions were selected based

on the CTSC-R mean score. For each client, the session with the lowest CTSC-R score and the

session with the highest CTSC-R score were chosen. The session with the lowest CTSC-R score

is referred to as the session with the lowest change score. The session with the highest CTSC-R

score is referred to as the session with the highest change score.

Another session representing the client’s first report of moderate to high change was

selected. The first session having a CTSC-R score of 5 or more was included in this group. This

session is referred to as the first report of moderate to high change session. If the highest

CTSC-R score among a client’s sessions was exactly 5, that session was included in this group. If the client did not have a

session with a CTSC-R score of 5 or more, then that client’s highest change score session was used. There

were 12 clients for whom the same session was the session with the highest change score and the

first report of moderate to high change session. Both of these sessions are referred to as “high

change sessions”, but analyses of them are run separately because of this overlap. If the client’s

session with the highest change score and the first report of moderate to high change session had

the same CTSC-R score, the earlier session was used in the first report of moderate to high

change group and the later session was used in the highest change score group. Lastly, some

clients did not have any sessions with a CTSC-R mean score of 5 or more. In order to be

included in the first report of moderate to high change session, it was decided that the score for

this session must be at least 3.5. There were three clients who did not meet this criterion and

they were excluded from analyses involving the first report of moderate to high change session.
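
The selection rules above can be sketched in code. This is a minimal illustration only, not the procedure actually scripted for the study: the function name `select_sessions`, the dictionary input, and the tie-breaking choices flagged in the comments are assumptions introduced for the example.

```python
from typing import Dict, Optional

MODERATE_HIGH = 5.0   # CTSC-R mean signalling moderate to high change
FALLBACK_FLOOR = 3.5  # minimum score accepted when no session reaches 5


def select_sessions(scores: Dict[int, float]) -> Dict[str, Optional[int]]:
    """Pick the three analysis sessions for one client.

    `scores` maps session number to CTSC-R mean and should already
    exclude the final session (#16).
    """
    sessions = sorted(scores)
    top = max(scores.values())

    # Lowest change session. Taking the earliest on ties is an assumption.
    lowest = min(sessions, key=lambda s: (scores[s], s))

    # Highest change session. Taking the latest tied session is an assumption
    # chosen to match the "later session" rule for equal scores.
    highest = max(s for s in sessions if scores[s] == top)

    # First report of moderate to high change: earliest session scoring >= 5.
    first = next((s for s in sessions if scores[s] >= MODERATE_HIGH), None)
    if first is None:
        # No session reached 5: fall back to the highest change session,
        # but only if it scored at least 3.5; otherwise the client is excluded.
        first = highest if top >= FALLBACK_FLOOR else None

    return {"lowest": lowest, "highest": highest, "first_moderate_high": first}


# Hypothetical CTSC-R means for one client.
example = {1: 3.0, 2: 5.0, 3: 4.0, 4: 5.0}
print(select_sessions(example))
```

In the hypothetical example, sessions 2 and 4 share the top score of 5, so the earlier one serves as the first report of moderate to high change and the later one as the highest change session.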

Preparation of materials.

The audio for each session was converted to MP3 format. Audio for each rater was

loaded onto an Apple iPod Nano assigned to that rater. Raters listened to the sessions with their

iPods mounted onto Bose SoundDock® Portable Digital Music systems. De-identifying numbers

were used for the transcripts and MP3 files.

The middle 20 minutes of each session were transcribed and formatted into a rating sheet,

with a column for raters to indicate their vocal quality rating for each response. For the CVQ

responses, the primary investigator made slash marks to indicate shifts in intonation. The raters

were instructed to rate each shift. When the session was complete, the primary investigator

tallied the ratings for each intonation shift for an overall rating for the response. This was done

in accordance with rules provided in the Manual for Client Vocal Quality (Rice et al., 1979).

CVQ Training.

The CVQ raters trained for more than 100 hours over more than one year using the Manual

for Client Vocal Quality (Rice et al., 1979) with an expert rater. Additional audio from

psychotherapy sessions that were not part of the data set was also used. Raters trained to an

acceptable level of agreement with each other and the expert before proceeding to code the data.

The preliminary inter-rater reliability was established through a test of non-data audio which was

assembled by the current study’s primary investigator.

When the tests were completed, the primary investigator entered the data and conducted

the analysis. During data entry, there were instances in which the raters gave a borderline rating

for a response. For example, for the same response, one rater may have indicated Focused vocal

quality, but the other rater may have written that the response was not clearly classifiable in

Focused vocal quality or Externalizing vocal quality—that it could be either category. Rice and

Kerr (1986) reported the same situation for TVQ raters, reporting that “since there were

‘borderline’ areas between each TVQ category in which either of two different ratings may have

been equally appropriate, each rater would vary slightly no matter how well trained” (p. 99).

Based on this, it was decided in the current study that raters were in agreement for a response

when the rating for one rater was the same as one of the two borderline categories indicated by a

second rater.
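
The borderline agreement rule can be expressed as a set overlap. The sketch below is illustrative only: representing a rating as a set of one or two category names, the function names, and the sample ratings are assumptions, and treating two overlapping borderline ratings as agreement goes slightly beyond what the rule states.

```python
from typing import FrozenSet, Iterable, Tuple

# One category, or two when the rater marked the response as borderline.
Rating = FrozenSet[str]


def raters_agree(a: Rating, b: Rating) -> bool:
    """True when the two ratings share at least one category."""
    return bool(a & b)


def proportion_agreed(pairs: Iterable[Tuple[Rating, Rating]]) -> float:
    """Share of responses on which the two raters count as agreeing."""
    pairs = list(pairs)
    return sum(raters_agree(a, b) for a, b in pairs) / len(pairs)


# Hypothetical ratings for three responses.
pairs = [
    (frozenset({"Focused"}), frozenset({"Focused", "Externalizing"})),  # agree
    (frozenset({"Limited"}), frozenset({"Externalizing"})),             # disagree
    (frozenset({"Emotional"}), frozenset({"Emotional"})),               # agree
]
print(proportion_agreed(pairs))
```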

Preliminary reliability was calculated between the raters and the expert. Three measures

of rater agreement were used: Cohen’s kappa (κ), Brennan and Prediger’s kappa (κn) and raw

agreement (râ). Cohen’s kappa was used because it is the inter-rater reliability statistic used in

the majority of studies using the CVQ measure. Brennan and Prediger’s kappa and raw

agreement were added to address the non-uniform marginal totals found in the CVQ

crosstabulation matrices, as per von Eye and Mun’s (2005) recommendation. The non-

uniformity in the CVQ reliability data was due to the high number of observations of one

category (Externalizing vocal quality) in contrast to a very small number of observations for

another category (Emotional vocal quality). This large disparity in observations per category can

distort Cohen’s kappa (Cohen, 1960) so that it does not accurately represent the raters’ level of

agreement.

Von Eye and Mun (2005) explain that Brennan and Prediger’s kappa, referred to as a

“‘free’ marginals” kappa (Brennan & Prediger, 1981, p. 690), is interpreted the same way as

Cohen’s kappa, while its formula, seen in Table 3, incorporates both the number of categories in

the scale as well as raw agreement (râ). By providing all three measures, agreement between two

raters can be understood as fitting within a range that spans from the overly stringent Cohen’s

kappa, to the more balanced Brennan and Prediger’s kappa and, finally, to the overly lax raw

agreement statistic.
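
To make the three statistics concrete, the sketch below computes them from a crosstabulation of two raters' ratings. It is an illustration rather than the study's analysis code: the function names and the skewed two-category matrix are hypothetical, while the κn calculation follows the formula given with Table 3.

```python
from typing import List

# Rows: rater A's category; columns: rater B's category.
Matrix = List[List[int]]


def raw_agreement(m: Matrix) -> float:
    """Proportion of responses on the diagonal (identical ratings)."""
    total = sum(sum(row) for row in m)
    return sum(m[i][i] for i in range(len(m))) / total


def cohen_kappa(m: Matrix) -> float:
    """Agreement corrected for chance estimated from the marginal totals."""
    total = sum(sum(row) for row in m)
    k = len(m)
    row_totals = [sum(row) for row in m]
    col_totals = [sum(m[i][j] for i in range(k)) for j in range(k)]
    chance = sum(row_totals[i] * col_totals[i] for i in range(k)) / total ** 2
    return (raw_agreement(m) - chance) / (1 - chance)


def brennan_prediger_kappa(agreement: float, n_categories: int) -> float:
    """Agreement corrected for a uniform chance level of 1 / n_categories."""
    chance = 1 / n_categories
    return (agreement - chance) / (1 - chance)


# Hypothetical ratings dominated by one category.
skewed = [[90, 4], [4, 2]]
ra = raw_agreement(skewed)
print(round(ra, 2), round(cohen_kappa(skewed), 2),
      round(brennan_prediger_kappa(ra, 2), 2))
```

In this hypothetical matrix the raters give the same rating to 92% of responses, yet Cohen's kappa is only about .29 because the marginal totals are so uneven, whereas κn is .84. Applying `brennan_prediger_kappa` to the raw agreement of .84 reported for Rater #1 and Rater #2 with the four CVQ categories gives .79, which matches Table 3.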

The reliability statistics for the preliminary CVQ tests are located in Table 3. Significant

Cohen’s kappas were obtained for each pair of raters. Agreement for the Cohen’s and Brennan

and Prediger’s kappas range from moderate to substantial (Landis & Koch, 1977 in von Eye &

Mun, 2005). Raw agreement statistics range from necessary to adequate (House, House, &

Campbell, 1981). Ranges for interpretation of the statistics are found in Appendix A.

Also, regarding the Cohen’s kappas, the statistics reported here are in line with the

significant Cohen’s kappas reported for rater reliability in early published CVQ research, which

ranged from κ = .40 (Wiseman & Rice, 1989) to κ = .49 (Rice & Kerr, 1986). More recent

studies report significant kappas for rater reliability ranging from κ = .36 (Safran & Muran,

1996) to κ = .72 (Watson & Greenberg, 1996). In sum, the results of the preliminary inter-rater

reliability tests were sufficient to proceed with the CVQ rating of the data set.

Table 3

Preliminary Inter-Rater Reliability Between the CVQ Raters and Each Rater and the Expert

                         N Responses     κ       κnᵃ     râ
Rater #1 + Rater #2          147        .75     .79     .84
Rater #1 + Expert            128        .63     .57     .68
Rater #2 + Expert            128        .72     .69     .77

Note. κ = Cohen’s kappa, κn = Brennan & Prediger’s kappa, and râ = raw agreement.

ᵃ Formula from von Eye & Mun (2005):

\[ \kappa_n = \frac{\text{raw agreement} - (1/\text{number of categories})}{1 - (1/\text{number of categories})} \]

Rating the CVQ data set.

One rater, called the main rater, rated all of the sessions, and the second rater rated one

session for each client (N = 63) so that inter-rater reliability for the data set could be calculated.

The sessions were randomized so that there was an equal number of sessions with the lowest

change score, sessions with the highest change score, and first report of moderate to high

change sessions. The main rater did not know which sessions were to be used for reliability

calculations.

The CVQ raters were instructed to ground themselves in the training audio and the CVQ

category descriptions before each rating session. They worked in separate rooms, they did not

consult with one another, and they did not have access to one another’s coded transcripts.

Three consistency checks were conducted during the rating process. In the first two

checks, there was a limited review of training audio, discussion of a few responses from the data

set, and some discussion of new non-data audio to address some variation in agreement. The

CVQ raters were also told they could provide an alternate rating for especially difficult

responses. This decision was based on Rice and Kerr’s (1986) experiences with TVQ raters who

made alternate ratings when “there were ‘borderline’ areas between one TVQ category in which

either of two different ratings may have been equally appropriate” (p. 99). Vognsen (1969) also

noted that CVQ raters would spontaneously include additional comments to clarify their final

ratings when they heard a response as a borderline or mixture of categories that prevented them

from settling on only one rating.

TVQ Training.

TVQ raters were not the same people who had rated the responses on the CVQ. The

TVQ raters were trained for more than 100 hours over more than one year using the Therapist

Vocal Quality manual (Rice & Kerr, 1986) with an expert rater. Additional audio from

psychotherapy sessions that were not part of the data set was also used.

During training, it was decided to combine the Restricted and Limited vocal quality

categories into one category called Restricted-Limited vocal quality. The reason for this is that it

was difficult to differentiate the categories from the Manual’s training audio. Because both are

considered to be unproductive TVQ categories and because the differences between the

categories were almost imperceptible from the audio, it was decided to combine them.

Raters trained to an acceptable level of agreement with each other and the expert before

proceeding to rate the data. Preliminary inter-rater reliability was established through a test of

non-data audio which was assembled by the study’s primary investigator. The marginal totals

for the crosstabulation matrices for each pair of raters were non-uniform due to the large number

of observations for Irregular vocal quality and very small number for Patterned vocal quality.

As a result, von Eye and Mun’s (2005) recommendation of reporting Cohen’s kappa, Brennan

and Prediger’s kappa (κn), and raw agreement (râ) was followed; the statistics are displayed in Table 4.

The raw agreement statistics were acceptable (House et al., 1981) and the Cohen’s kappa

and the Brennan and Prediger’s kappa ranged from moderate to substantial (Landis & Koch,

1977 in von Eye & Mun, 2005). Ranges for interpretation of the statistics are found in Appendix

A. These values also exceed those reported by Rice and Kerr (1986) for three pairs of raters (κ =

.31 to .33) and those reported by Wiseman and Rice (1989) (κ = .60) for three raters using the

Spearman Brown formula. The results of the preliminary inter-rater reliability tests were

sufficient to proceed with rating the TVQ data set.

Table 4

Preliminary Inter-Rater Reliability Between the TVQ Raters and Each Rater and the Expert

                         N Responses     κᵃ      κnᵇ     râ
Rater #1 + Rater #2           43        .61     .66     .72
Rater #1 + Expert             43        .70     .75     .79
Rater #2 + Expert             43        .71     .75     .79

Note. κ = Cohen’s kappa, κn = Brennan & Prediger’s kappa, and râ = raw agreement.

ᵃ The Online Vassar Kappa Calculator was used to calculate Cohen’s kappa when analyses included rater #2, because SPSS does not calculate Cohen’s kappa unless both raters have at least one observation for each category. Rater #2 made observations for five of the six categories, whereas rater #1 made at least one observation for all six categories.

ᵇ See Table 3 for the formula.

Rating the TVQ data set.

Both TVQ raters rated one session for each client (N = 63) to calculate the inter-rater

reliability on the data set. The sessions were randomized so that there were an equal number of

sessions with the lowest change score, sessions with the highest change score, and the first

report of moderate to high change sessions. In addition, rater #1 rated a second session for each

client and rater #2 rated a third session for each client. The raters did not know which sessions

were to be used for reliability calculations.

The TVQ raters were instructed to ground themselves in the training audio and the TVQ

category descriptions before each rating session. They worked in separate rooms, they did not

consult with one another, and they did not have access to one another’s rated transcripts. They

were also asked to rate in two hour blocks, as recommended by Kerr (1980). In addition, the

raters were asked to rate one group of eight pre-selected sessions at a time. The groups were

composed of sessions from different therapists. This was done to prevent the raters from being

influenced by hearing the same therapist’s vocal quality in relation to two different clients.

Three consistency checks were conducted during the rating process. In the first two

checks, there was a limited review of training audio, discussion of a few responses from the data

set, and some discussion of new non-data audio to address some variation in agreement. The

TVQ raters were also told they could provide an alternate rating for especially difficult

responses. This decision was based on Rice and Kerr’s (1986) experiences with TVQ raters who

made alternate ratings when “there were ‘borderline’ areas between one TVQ category in which

either of two different ratings may have been equally appropriate” (p. 99).

Descriptive Statistics for the Outcome Measures.

Spearman’s rho was used to detect high same-direction correlations between pairs

of measures or subscales. Measures that are highly correlated in the same direction

may be measuring the same construct. To reduce redundancy in the subsequent analyses, only

one measure was selected in these cases. The results are presented in Appendix B. The scales

that were not highly correlated with one another, and were therefore included in the subsequent

analyses, are the BDI, DAS Total, RSES, the SCL-90-R (GSI only), the IIP and all its

subscales, and all levels of the PF-SOC.
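
As an illustration of this redundancy screen, the sketch below computes Spearman's rho in plain Python and flags a pair of scales as redundant when the correlation is high and positive. The cutoff of .80, the function names, and the scores are all hypothetical; the cutoff actually applied in the study is not stated in this section.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = shared
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho is the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


REDUNDANCY_CUTOFF = 0.80  # hypothetical threshold for "highly correlated"

# Hypothetical scores for six clients on two scales.
scale_a = [12, 25, 31, 8, 19, 40]
scale_b = [14, 30, 27, 10, 22, 38]
rho = spearman_rho(scale_a, scale_b)
print(round(rho, 3), rho > REDUNDANCY_CUTOFF)
```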

Alpha level.

The alpha level for the current study is .05, as is conventional in psychology

research (e.g., Perneger, 1998). While there were many planned analyses, the Bonferroni

correction was not used to determine significance. Perneger (1998) explains that while the

Bonferroni adjustment decreases the probability of a Type I error (rejecting the null hypothesis

when it is true) for multiple tests, it also increases the probability of making a Type II error

(failing to reject the null hypothesis when it is false). Perneger (1998) writes, “Type II errors are no

less false than type I errors” (p. 1236). In a medical context, this could mean that “an effective

treatment may be deemed no better than placebo” (p. 1236). In the current study, using the

Bonferroni correction could obscure the relationship between vocal quality and psychological

change. Given the exploratory nature of the current study and the reasons above, the risk of

making a Type I error was considered acceptable.
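
The trade-off can be seen in a short calculation. The sketch below is illustrative: the count of 20 tests is hypothetical, and the family-wise error formula assumes the tests are independent, which is unlikely to hold exactly for correlated outcome measures.

```python
ALPHA = 0.05


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under the Bonferroni correction."""
    return alpha / n_tests


def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Chance of at least one Type I error across independent tests."""
    return 1 - (1 - alpha) ** n_tests


n_tests = 20  # hypothetical number of planned analyses
print(bonferroni_threshold(ALPHA, n_tests))
print(round(familywise_error_rate(ALPHA, n_tests), 3))
```

With 20 hypothetical tests, the uncorrected chance of at least one false positive is roughly 64%, while the Bonferroni threshold drops to .0025 per test, which is the stringency that raises the Type II error risk described above.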

TVQ Inter-rater reliability on the data set.

TVQ inter-rater reliability on the data set was calculated on one session for each client (N

= 63). Disagreements between the TVQ raters on the sessions used for reliability were resolved

by the expert. The categorical TVQ data were aggregated into proportions of each TVQ

category for each client. This resulted in each session being defined by its proportion of

Softened vocal quality, Irregular vocal quality, Natural vocal quality, Definite vocal quality,

Restricted-Limited vocal quality, and Patterned vocal quality. Because the TVQ hypotheses

were proportions-based, intraclass correlation coefficient (ICC) analyses were planned to

evaluate inter-rater reliability.

Before calculating inter-rater reliability, four clients were removed because

the raters agreed on fewer than 50% of the responses. It was decided that since the TVQ raters

rated just one session for each client for reliability calculations, if they disagreed on more than

50% of the responses for the reliability sessions, then their ratings for the other two sessions for

that client would also not be reliable. For this reason, it was decided that all three sessions for

these four clients would be excluded from the TVQ data set and further analysis.

Next, outliers in the remaining reliability sample were to be removed from the reliability

calculations in order to obtain the most representative sample of rater agreement. The

standardized residuals analysis was used to detect outlier cases. An outlier was defined as any

session in which the level of agreement exceeded plus or minus three standard deviations from

the mean. Eight outlier sessions were identified and each was an outlier due to a high level of

disagreement. It was decided that if the level of disagreement on the responses for these sessions

was so high that these sessions were outliers, then all three sessions for these clients should be

removed from the data as well. However, excluding so many clients from the data would reduce

the power of the study to an unacceptable level.
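
The three-standard-deviation rule can be sketched as follows. The exact residual model behind the standardized residuals analysis is not described here, so this example substitutes a plain z-score on per-session agreement rates; the function name and the data are hypothetical.

```python
from statistics import mean, stdev
from typing import List


def flag_outliers(agreement: List[float], cutoff: float = 3.0) -> List[int]:
    """Return indices of sessions more than `cutoff` SDs from the mean.

    Assumes the agreement rates are not all identical (non-zero SD).
    """
    centre = mean(agreement)
    spread = stdev(agreement)  # sample standard deviation
    return [
        i for i, value in enumerate(agreement)
        if abs((value - centre) / spread) > cutoff
    ]


# Hypothetical per-session agreement rates: 19 typical sessions and one
# session with very low agreement.
rates = [0.8] * 19 + [0.1]
print(flag_outliers(rates))
```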

In an effort to retain the outlier clients in the TVQ data set, a cluster analysis of the

reliability sessions was run to see if ratings clustered into similar groups of vocal qualities for

both raters. A two-step cluster analysis was performed and produced two clusters of good

quality which were well-defined and the same for both raters. The clusters were differentiated

by their relative proportions of each TVQ category as displayed in Table 5. For both raters,

Cluster #1 was defined by higher mean proportions of Natural vocal quality and Definite vocal

quality than found in Cluster #2. Because of this, Cluster #1 was called the Natural-Definite

Cluster. Cluster #2 was defined by higher mean proportions of Softened vocal quality and

Irregular vocal quality than found in Cluster #1. As a result, Cluster #2 was called the

Softened-Irregular Cluster. Restricted-Limited vocal quality did not differentiate the clusters.

Table 5

Results of Cluster Analysis for the TVQ Raters (N = 59 Sessions)

                              Rater #1                   Rater #2
TVQ Categories          Cluster #1   Cluster #2    Cluster #1   Cluster #2
Irregular                  .02          .09           .02          .09
Softened                   .08          .51           .10          .61
Natural                    .66          .30           .69          .23
Definite                   .19          .03           .14          .02
Restricted-Limited         .07          .06           .05          .05

Note. Cohen’s kappa for the raters on the Clusters is .76. Gray shading highlights proportions that

differentiate the clusters from one another. Patterned vocal quality was not included in the analyses

because agreement was very poor in preliminary analyses (8%) and there were few responses

classified as Patterned vocal quality.

This meant that for the inter-rater reliability set of sessions (N = 59), each rater’s ratings

resulted in each session being characterized as either a Natural-Definite or a Softened-Irregular

Cluster session. To ensure adequate agreement on the sessions as defined by the Clusters, a Cohen’s

kappa test was run on the cluster classification data for the two raters. Agreement was

substantial (κ = .76), as interpreted from Landis & Koch’s (1977) suggested ranges.

If a cluster analysis on the TVQ data set of 177 sessions (three sessions for each of the 59

clients) revealed the same two clusters as found in the reliability sessions, then the eight outlier

cases could be retained and used in the subsequent analyses. To test this, a second cluster

analysis was run on the entire TVQ data set (N = 177). The results, presented in Table 6,

revealed the same cluster types as were found for the two raters in the cluster analysis of the

reliability sessions. Based on this, the eight outlier sessions were retained in the data set. However,

the clusters replaced the individual TVQ categories as a new variable called Therapist Vocal

Style. This variable had two levels: Natural-Definite Therapist Vocal Style and the Softened-

Irregular Therapist Vocal Style.

Table 6

Results of Cluster Analysis on the TVQ Data Set (N = 177 Sessions)

                        Proportion of each TVQ per Therapist Vocal Style
TVQ Category            Natural-Definite           Softened-Irregular
                        Therapist Vocal Style      Therapist Vocal Style
Irregular                      .04                        .07
Softened                       .07                        .57
Natural                        .71                        .26
Definite                       .12                        .03
Restricted-Limited*            .06                        .07

Note. Gray shading highlights the TVQ proportions that differentiate Natural-Definite and Softened-

Irregular Therapist Vocal Styles from one another. *Restricted-Limited vocal quality was not significantly

different between the Therapist Vocal Styles, p = .539.
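
To show how a session's TVQ proportions relate to the two Therapist Vocal Styles, the sketch below assigns a session to whichever style has the closer mean profile in Table 6. This nearest-mean rule is a simplified stand-in for illustration and is not the two-step cluster analysis used in the study; the function name and the example sessions are hypothetical.

```python
from math import dist
from typing import Dict, Sequence, Tuple

# Order of proportions: Irregular, Softened, Natural, Definite, Restricted-Limited.
STYLE_MEANS: Dict[str, Tuple[float, ...]] = {
    "Natural-Definite": (0.04, 0.07, 0.71, 0.12, 0.06),    # from Table 6
    "Softened-Irregular": (0.07, 0.57, 0.26, 0.03, 0.07),  # from Table 6
}


def nearest_style(proportions: Sequence[float]) -> str:
    """Return the style whose mean profile is closest (Euclidean distance)."""
    return min(STYLE_MEANS, key=lambda style: dist(proportions, STYLE_MEANS[style]))


# Hypothetical sessions.
print(nearest_style((0.05, 0.60, 0.25, 0.02, 0.08)))
print(nearest_style((0.03, 0.05, 0.75, 0.12, 0.05)))
```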

TVQ Descriptives.

Approximately equal numbers of sessions were classified as Natural-Definite and

Softened-Irregular Therapist Vocal Style. More sessions were classified as

Softened-Irregular Therapist Vocal Style in the sessions with lowest change score (n = 31) than

in sessions with the highest change score (n = 24). More sessions were classified as Natural-

Definite Therapist Vocal Style in the sessions with the highest change score (n = 33) than in

sessions with the lowest change score (n = 26), as seen in Table 7. In the first report of moderate

to high change session, 29 sessions were classified as Softened-Irregular Therapist Vocal Style

and 28 as Natural-Definite Therapist Vocal Style, as seen in Table 8. In each group of sessions,

most of the Softened-Irregular Therapist Vocal Style sessions are PE-EFT sessions and most of

the Natural-Definite Therapist Vocal Style sessions are CBT sessions.

Table 7

Crosstabulation Tables for Therapist Vocal Style by Treatment Type in Sessions with the Lowest and

Highest Change Scores

Session with the lowest change score

                                   Therapist Vocal Style
Treatment                  Softened-Irregular   Natural-Definite   Total
CBT      Count                     7                  23             30
         % within Tx             23.3%              76.7%           100%
PE-EFT   Count                    24                   3             27
         % within Tx             88.9%              11.1%           100%
Total    Count                    31                  26             57
         % within Tx             54.4%              45.6%           100%

Session with the highest change score

                                   Therapist Vocal Style
Treatment                  Softened-Irregular   Natural-Definite   Total
CBT      Count                     1                  29             30
         % within Tx              3.3%              96.7%           100%
PE-EFT   Count                    23                   4             27
         % within Tx             85.2%              14.8%           100%
Total    Count                    24                  33             57
         % within Tx             42.1%              57.9%           100%

Table 8

Crosstabulation Tables for Therapist Vocal Style by Treatment Type in the First Report of Moderate to

High Change Session

                                   Therapist Vocal Style
Treatment                  Softened-Irregular   Natural-Definite   Total
CBT      Count                     5                  25             30
         % within Tx             16.7%              83.3%           100%
PE-EFT   Count                    24                   3             27
         % within Tx             88.9%              11.1%           100%
Total    Count                    29                  28             57
         % within Tx             50.9%              49.1%           100%

CVQ Inter-rater reliability.

CVQ inter-rater reliability was calculated on one session for each client (N = 63). The

categorical CVQ data were aggregated into proportions of each CVQ category for each client.

This resulted in each session being defined by its proportion of Emotional, Focused, Limited, and

Externalizing vocal qualities. Because the CVQ hypotheses are proportions-based, intraclass

correlation coefficient (ICC) analyses were used to evaluate inter-rater reliability.

To obtain the most representative sample for inter-rater reliability, eight outlier cases

were removed from the ICC analyses. The standardized residuals analysis was used to identify

outliers. An outlier was defined as any session in which the level of agreement exceeded plus or

minus three standard deviations from the mean. Some sessions were outliers due to an

exceptionally high level of agreement and some were due to a low level of agreement. Table 9

shows acceptable ICCs for all the CVQ categories except for Focused vocal quality. Because

the main rater rated the entire data set on the CVQ, it was decided to keep all three sessions for

the eight outlier cases in the data set.

Table 9

Intraclass Correlation Coefficients (ICC) for CVQ Inter-Rater Reliability (N = 63)

CVQ                            ICC      95% CI        Magnitudeᵃ
Externalizing vocal quality    .71**    [.56, .84]    Strong
Emotional vocal quality        .95**    [.91, .97]    Strong
Limited vocal quality          .88**    [.80, .93]    Strong
Focused vocal quality          .23*     [.00, .45]    Weak

Note. ICCs are two-way, random, consistency. CI = confidence interval.

ᵃ The magnitude of the correlation coefficients is based on ranges that are conventional in the behavioural sciences and include < .30 (weak), .30-.50 (moderate), > .50 (strong) (Green & Salkind, 2004, p. 256).

* p < .05. ** p < .01.
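
For readers unfamiliar with the statistic, the sketch below computes a two-way consistency ICC from the mean squares of a sessions-by-raters table. It assumes the single-measure form, since the table note does not say whether single or average measures were reported; the function name and the proportions are hypothetical.

```python
from typing import List


def icc_consistency(ratings: List[List[float]]) -> float:
    """Two-way, consistency, single-measure ICC.

    `ratings` holds one row per session and one column per rater.
    """
    n = len(ratings)      # sessions
    k = len(ratings[0])   # raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)


# Hypothetical proportions of one vocal quality in four sessions,
# as rated by two raters.
proportions = [[0.1, 0.2], [0.4, 0.3], [0.6, 0.7], [0.9, 0.8]]
print(round(icc_consistency(proportions), 2))
```

Because this is a consistency coefficient, a rater who is uniformly higher than the other by a constant amount does not lower the ICC; only disagreement in the ordering or spacing of sessions does.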

Because Focused was a category of special interest, a second test of rater reliability was

conducted. In the second test, the expert rated a set of responses, many of which consisted of

responses the main rater had classified as Focused vocal quality. The expert did not know how

the main rater had classified any of the responses. Cohen’s kappa was calculated on two

categories: Responses identified as Focused vocal quality and responses identified as not

Focused vocal quality. Cohen’s kappa for this test was .34, representing fair agreement (Landis

& Koch, 1977 in von Eye & Mun, 2005). One of the reasons reliability was not better was that

the expert rated many more responses as Focused vocal quality (n = 29) than the main rater (n =

17). Each rater’s total number of Focused vocal quality responses is highlighted in yellow in the

crosstabulation matrix in Figure 1.

                                Main Rater
                       Focused   Not Focused   Total
Expert   Focused          12          17         29
         Not Focused       5          45         50
         Total            17          62         79

Figure 1. Crosstabulation matrix for the Focused vocal quality test with the main rater and expert.

The fair kappa statistic does not provide strong evidence that the main rater reliably rated the

Focused category. Also, it does not provide a statistical measure of the pattern of ratings seen in

the crosstabulation matrix in which the main rater is more conservative than the expert in

identifying Focused vocal quality. To evaluate the extent to which the expert agreed with those

responses that the main rater identified as Focused vocal quality, a third reliability test was

conducted using the weighted occurrence agreement percentage measure (House et al., 1981).

This calculation shows the percent of the main rater’s Focused responses that the expert agrees

with. This method is not conventional because it does not account for the responses which the

expert rated as Focused, but which the main rater did not. Rather, as House et al. (1981) state,

“the rationale for its use is typically that one observer is assigned a ‘criterion’ status and the only

errors processed in analysis are when the ‘regular’ observer fails to detect/record a behavior

coded by the criterion observer” (p. 45). The formula for the weighted occurrence agreement

percentage measure, adapted from House et al., is

\[
\text{\% of main rater's responses verified as Focused vocal quality by the expert} = \frac{\text{Focused-A}}{\text{Focused-A} + \text{Focused-M}} \times 100\%
\]

where Focused-A is the number of responses both raters agree are Focused vocal quality and

Focused-M is the number of responses the main rater says are Focused, but that the expert says

are not.

The weighted occurrence agreement was 71%, meaning that the expert agreed with 71%

of the main rater’s Focused vocal quality ratings. This level of agreement is consistent with

figures reported by other CVQ studies using percent agreement as a measure of inter-rater

reliability. These reported reliabilities range from 69% (Sarnat, 1976; Vognsen, 1971) to 75%

(Clarke, 1989; Rice & Wagstaff, 1967). Referring to the percent agreement formulas presented

in their article, House et al. (1981) write “the question of ‘how much’ agreement is necessary,

good, or reasonable is not resolved (or a priori resolvable)” (p. 46). However, they suggest that

70% is the minimum acceptable level of agreement. House et al. (1981) write, “there seems

some consensus among behavioral investigators that average agreement at or above 70% is

necessary, above 80% is adequate, and above 90% is good” (p. 46).
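
The two reliability figures discussed above can be recomputed from the Figure 1 crosstabulation. The following is a minimal sketch in Python, not the procedure used in the thesis (which does not state its software for these calculations); the function and variable names are illustrative assumptions.

```python
# Counts from the Figure 1 crosstabulation (rows: expert, columns: main rater).
both_focused = 12          # Focused-A: both raters say Focused
expert_only_focused = 17   # expert Focused, main rater Not Focused
main_only_focused = 5      # Focused-M: main rater Focused, expert Not Focused
neither_focused = 45       # both raters say Not Focused


def cohens_kappa(a, b, c, d):
    """Cohen's kappa for a 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    observed = (a + d) / n
    # Chance agreement from the row and column marginals.
    expected = ((a + b) * (a + c) + (c + d) * (b + d)) / n ** 2
    return (observed - expected) / (1 - expected)


def weighted_occurrence_agreement(agreed, main_only):
    """Percent of the main rater's Focused ratings that the expert confirmed."""
    return 100 * agreed / (agreed + main_only)


kappa = cohens_kappa(both_focused, expert_only_focused,
                     main_only_focused, neither_focused)
agreement = weighted_occurrence_agreement(both_focused, main_only_focused)
print(f"kappa = {kappa:.2f}, agreement = {agreement:.0f}%")
```

With these counts the sketch gives a kappa of about .34 and an agreement of about 71%, matching the reported values. The agreement percentage ignores the 17 responses that only the expert rated as Focused, which is why it is so much more favourable than kappa.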

To summarize, the main rater was more conservative in identifying Focused vocal quality

than the expert. However, the expert agreed with 71% of the main rater’s observations of that

category. This suggests that when the main rater identified Focused vocal quality in the data set,

that Focused vocal quality was, in fact, present. Also, the level of agreement between the main

rater and expert is in line with previous research which used percent agreement as a measure of

reliability. Finally, 71% agreement meets the minimum criteria suggested by House et al.

(1981).

Although it was decided that, based on the above, the main rater’s identification of

Focused vocal quality was sufficiently reliable to proceed with analyses of this vocal category, it

is important to state that the results of the analyses should be viewed in the context of two

limitations. First, because the main rater was conservative compared to the expert, she may have

categorized some Focused responses as another vocal quality. As a result, the actual number of

Focused responses may be underrepresented in the data set. Second, while 71% agreement just

meets the minimum standard suggested by House et al. (1981), it is still low, indicating that the

expert disagreed with some responses the main rater heard as Focused vocal quality. Given

these issues, it is suggested that results involving the Focused category be interpreted with

caution.

Boxplots of CVQ Data.

Boxplots of the CVQ categories were created to evaluate the shape of each category’s

distribution, examine outliers, and explore patterns among the categories. Boxplots for each

CVQ category were created for the first report of moderate to high change sessions. These, seen

in Figure 2, are highly skewed. Boxplots for sessions with the lowest change score and sessions

with the highest change score are displayed in Figure 3. Boxplots for each CVQ category for

both sessions appear “squashed” with many outliers, indicating highly skewed distributions

(Delucchi & Bostrom, 2004, p. 1162).

Although Externalizing vocal quality represented the highest proportion of all the CVQ

categories, the proportions of Emotional vocal quality, Focused vocal quality, and Limited vocal

quality expressed by most clients were zero, creating outliers for each of these categories. A

visual inspection of the outliers revealed two clients whose vocal patterns were distinct from the

rest. One client was an outlier on three out of the four CVQ categories for both sessions with the

lowest and highest change score as well as the first report of moderate to high change session.

The second client was an outlier for two of the four CVQ categories in the session with the

lowest change score and on all four CVQ categories in the session with the highest change score.

Because their vocal patterns were so different from the rest, they were removed from the CVQ

data set. In order to keep groups comparable in the analyses, these two clients were removed

from the TVQ data set as well.

[Figure 2: boxplots; x-axis: Vocal Quality Categories (Emotional, Focused, Limited, Externalizing)]

Figure 2. Boxplots of CVQ categories in first report of moderate to high change sessions (N = 60). Three sessions were removed from this group because their CTSC-R scores were below 3.5.

[Figure 3: boxplots; panels: Sessions with the lowest change score; Sessions with the highest change score]

Figure 3. Boxplots of CVQ categories in sessions with the lowest change score and sessions with the highest change score (N = 63).

New boxplots were created without the two clients. Formal normality tests were used to

clarify whether the non-normality of the CVQ category distributions would require

non-parametric tests. There are no set rules for deciding when a distribution is so non-normal that

non-parametric tests should replace parametric tests (O. Falenchuk, personal communication,

2011). However, Field (2009) suggests evaluating normality by visually comparing the

distribution in question with a normal one, plus taking into account the distribution’s skewness

and kurtosis values. A distribution is considered non-normal if either skewness or kurtosis

values exceed twice their standard errors (SPSS). Appendices C1-C3 present the boxplots and

skewness and kurtosis values, all of which show that the distributions for the CVQ categories are

non-normal. Because of this, non-parametric tests were used where necessary. Numerical

descriptive statistics for sessions with the lowest change score and sessions with the highest

change score are located in Appendix D. Numerical descriptive statistics for the first report of

moderate to high change session are located in Appendix E.
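
The "twice the standard error" rule described above can be sketched as follows. This is an illustration under stated assumptions: the bias-adjusted skewness and kurtosis estimators and their standard-error formulas are the ones commonly used by statistical packages and are assumed here to match what SPSS reported; the sample data are hypothetical and are not taken from the thesis.

```python
import math


def skew_kurtosis_check(values):
    """Return skewness, excess kurtosis, their standard errors, and a flag."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values)
    m3 = sum((v - mean) ** 3 for v in values)
    m4 = sum((v - mean) ** 4 for v in values)
    s = math.sqrt(m2 / (n - 1))  # sample standard deviation

    # Bias-adjusted sample skewness and excess kurtosis.
    skew = n * m3 / ((n - 1) * (n - 2) * s ** 3)
    kurt = (n * (n + 1) * m4 / ((n - 1) * (n - 2) * (n - 3) * s ** 4)
            - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))

    # Standard errors depend only on the sample size.
    se_skew = math.sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))
    se_kurt = 2 * se_skew * math.sqrt((n ** 2 - 1) / ((n - 3) * (n + 5)))

    # Flag the distribution if either statistic exceeds twice its standard error.
    flagged = abs(skew) > 2 * se_skew or abs(kurt) > 2 * se_kurt
    return skew, kurt, se_skew, se_kurt, flagged


# Hypothetical proportions: most clients at zero, a few with higher values.
example = [0.0] * 40 + [0.05, 0.05, 0.1, 0.1, 0.15, 0.2, 0.25, 0.3, 0.4, 0.6]
skew, kurt, se_skew, se_kurt, flagged = skew_kurtosis_check(example)
print(f"skew = {skew:.2f} (SE {se_skew:.2f}), flagged = {flagged}")
```

A sample dominated by zeros, like the hypothetical one here, is flagged by this rule, which is the pattern the boxplots show for the Emotional, Focused, and Limited categories.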

Lastly, because the proportions of Emotional and Focused, the CVQ categories of

interest, were so small, it was decided to combine their proportions for some analyses. This

decision was based on the comment by Rice et al. (1979) that “combining the Emotional and

Focused categories improves the predictive power for client-centered therapy” (p. 10), given the

small number of observations of each category in their research. Previous research has also

shown that the presence of both Focused and Emotional vocal qualities indicate “good contact”

with the self in gestalt conflict split work (Greenberg, 1980, p. 149), suggesting an increased

productivity.

Chapter 3:

Results

Research Question 1-Therapist and client vocal quality and the client’s report of change

Research question 1 asks if the therapist’s and client’s vocal quality is related to the

client’s report of change. Regarding the client’s vocal quality, Hypothesis 1a states there will be

a higher proportion of productive CVQ categories (Emotional, Focused, and Emotional Plus

Focused) in high change than in low change sessions. Hypothesis 1b states that there will be a

higher proportion of unproductive CVQ categories (Limited and Externalizing) in low change

than in high change sessions.

Wilcoxon Signed-rank tests were used to address these hypotheses. The Wilcoxon test is

a non-parametric test that evaluates whether the median proportion of a CVQ category in the

sessions with the lowest change score is significantly different from the median proportion of

that CVQ category in the sessions with the highest change score. The results indicate that there

were no significant differences in the median proportions of any of the CVQ categories between

the sessions with the lowest change score and sessions with the highest change score. See Table

10 for mean ranks, Z statistics, and p values for each CVQ category. These results provide no evidence of an

association between the CVQ categories and the client’s report of change in the session, i.e.,

productive CVQ categories are not associated with high change sessions and unproductive CVQ

categories are not associated with low change sessions.

Table 10

Results for Hypothesis 1a and b: Wilcoxon Signed-rank test - Mean Ranks, Ties, Z statistics, p values for the CVQ categories in the Highest Change Score Sessions and the Lowest Change Score Sessions (N=61)

                           Negative Ranks     Positive Ranks
CVQ category               n    Mean Rank     n    Mean Rank     Ties    Z      (r)a    p value
Emotional                  12   8.67          6    11.17         43      -.81   .10     .420
Focused                    12   13.50         16   15.25         33      -.93   .12     .350
Emotional Plus Focused     20   17.00         16   20.38         25      -.11   .01     .912
Limited                    17   19.65         21   19.38         23      -.53   .07     .596
Externalizing              24   28.13         27   24.11         10      -.11   .07     .910

Note. a Effect size is (r) = Z/√N; 0.1 is small, 0.3 is medium, 0.5 is large (Field, 2009).
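
The effect size in the table note can be recomputed from the reported Z statistics. The sketch below is illustrative only; the dictionary and function names are assumptions, and the absolute value of Z is used so that r is reported as a magnitude. Applied to the reported Z of -.11, the formula gives roughly .01 for Externalizing, whereas the table lists .07; the sketch simply recomputes from Z and does not settle which value was intended.

```python
import math

N = 61  # clients in the Wilcoxon analyses (Table 10)

# Z statistics as reported in Table 10.
z_statistics = {
    "Emotional": -0.81,
    "Focused": -0.93,
    "Emotional Plus Focused": -0.11,
    "Limited": -0.53,
    "Externalizing": -0.11,
}


def effect_size_r(z, n):
    """Effect size r = |Z| / sqrt(N) for a Wilcoxon signed-rank test."""
    return abs(z) / math.sqrt(n)


effect_sizes = {name: effect_size_r(z, N) for name, z in z_statistics.items()}
for name, r in effect_sizes.items():
    print(f"{name}: r = {r:.2f}")
```

All recomputed values fall at or below the 0.1 threshold that Field (2009) labels small, which is consistent with the non-significant p values.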

Regarding the therapist’s vocal quality, Hypothesis 1c states that there will be a higher

proportion of Productive TVQ (Irregular, Softened, and Natural) in high change than in low

change sessions. This hypothesis could not be tested because the individual TVQ categories

were replaced by the Therapist Vocal Style variable. However, the Therapist Vocal Style

variable was used to explore possible differences in the therapist’s vocal quality between the

high change and low change sessions.

The Therapist Vocal Style variable could be used to explore this hypothesis because the

Softened-Irregular and Natural-Definite Therapist Vocal Styles can be contrasted with each

other according to theory. Each style is composed of relatively higher proportions of two of the

individual TVQ categories. The Softened-Irregular Therapist Vocal Style is composed of higher

proportions of two of the three Productive TVQ categories: Softened and Irregular. The

Natural-Definite Therapist Vocal Style is defined by one of the Productive TVQ categories,

Natural, as well as by Definite, which is a vocal quality that can be either helpful or unhelpful to

the client. Because of this, more instances of Softened-Irregular Therapist Vocal Style would be

expected to occur in the high change than low change sessions, while more instances of Natural-

Definite Therapist Vocal Style would be expected to occur in the low change than high change

sessions.

This question was explored in two ways. First, the Therapist Vocal Style was examined

in the lowest change score group and in the highest change score group. The non-parametric

McNemar test of dependent proportions was used to evaluate whether the proportion of sessions

classified as Softened-Irregular Therapist Vocal Style in the highest change score group was

significantly different from the proportion of sessions classified as Softened-Irregular Therapist

Vocal Style in the lowest change score group. The test simultaneously makes this comparison

for the proportion of sessions classified as Natural-Definite Therapist Vocal Style in the lowest

and highest change score groups. Instances in which one client has both sessions classified with

the same Therapist Vocal Style are excluded from the McNemar test calculations. The results

showed that 54% of the sessions classified predominantly as Softened-Irregular Therapist Vocal

Style occurred in the lowest change score group, while 42% of the sessions classified as

predominantly Softened-Irregular Therapist Vocal Style occurred in the highest change score

group. These percentages are reversed for the Natural-Definite Therapist Vocal Style. These

percentages were not significantly different (p = .092) and should be interpreted with caution

because there was a smaller number of pairs of non-tied scores (n = 13) than is required by SPSS

(n = 26) to conduct the z test.
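
Because the McNemar test uses only the discordant pairs, an exact version can be computed from a binomial distribution when there are too few pairs for the z approximation. The sketch below is an illustration, not the thesis's SPSS procedure. The thesis reports only the total of 13 non-tied pairs; the 3 versus 10 split used here is a hypothetical assumption, chosen because it yields a p value close to the reported .092.

```python
from math import comb


def mcnemar_exact_p(b, c):
    """Two-sided exact McNemar p value from the two discordant-pair counts."""
    n = b + c
    k = min(b, c)
    # Probability of a split at least this uneven under a fair 50/50 split.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical split of 13 discordant pairs; the thesis reports only their total.
p_value = mcnemar_exact_p(3, 10)
print(f"p = {p_value:.3f}")
```

Clients whose two sessions received the same Therapist Vocal Style do not enter the calculation at all, which is why the effective sample is 13 pairs rather than the full group.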

The second way in which this question was explored was to determine whether there

were differences in the session in which clients first reported moderate to high change. The non-

parametric Chi Square Goodness of Fit test was used to evaluate whether the proportions of

sessions classified as Softened-Irregular and Natural-Definite Therapist Vocal Style were equal

within this first report of moderate to high change group. The results were not significant, χ²(1,

N = 57) = .018, p = .895, indicating that within the first report of moderate to high change group,

there was no statistically significant difference between the proportion of sessions classified as

Softened-Irregular Therapist Vocal Style (50.9%) and Natural-Definite Therapist Vocal Style

(49.1%). Table 7 displays these figures.
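
The goodness-of-fit result can be checked by hand. In the sketch below, the counts of 29 and 28 sessions are not stated in the text; they are inferred from N = 57 and the reported 50.9% and 49.1%, and should be read as an assumption. The function name is illustrative.

```python
import math


def chi_square_equal_proportions(count_a, count_b):
    """Chi-square goodness-of-fit test (df = 1) against a 50/50 split."""
    expected = (count_a + count_b) / 2
    chi2 = ((count_a - expected) ** 2 + (count_b - expected) ** 2) / expected
    # For one degree of freedom the upper-tail p value is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Counts inferred from N = 57 and the reported 50.9% / 49.1% split.
chi2, p_value = chi_square_equal_proportions(29, 28)
print(f"chi-square = {chi2:.3f}, p = {p_value:.3f}")
```

Under these assumed counts the statistic is about .018 with p of about .895, in line with the reported result.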

To summarize the results for Research Question 1 and its associated hypotheses, the

results indicate that the client’s vocal quality and the therapist’s vocal quality appear unrelated to

the client’s report of change in the session.

Research Question 2-Therapist and client vocal qualities and the client’s scores on outcome

measures

Research question 2 asked if therapists’ and clients’ vocal qualities are related to the

clients’ outcome at termination. For the client’s vocal quality, it was hypothesized that a higher

proportion of the productive CVQ categories (Emotional, Focused, Emotional Plus Focused)

would predict better end-of-treatment outcome. Multiple regression analyses were conducted to

test this hypothesis because the test evaluates how well a higher proportion of each CVQ

category predicts scores on the outcome measures at termination.

The dependent variable was the client’s score on each outcome measure at the end of

treatment. The clients’ scores on these measures at pre-treatment were entered as covariates to

control for the varying baseline levels of the outcome measures for different clients. The

independent variables were the proportion of each CVQ. The regressions were conducted in the

session with the lowest change scores on the CTSC-R, the session with the highest change score

on the CTSC-R, and the first report of moderate to high change session on the CTSC-R.

The assumptions of the multiple regression analyses were tested by examining the

scatterplots of the standardized residuals versus the predicted values as recommended by

Tabachnick and Fidell (2007). Preliminary tests of normality of the outcome measure

distributions showed that eight of the outcome measures had non-normal distributions.

However, an inspection of the standard residual scatterplots, which account for the relationship

of the post-treatment scores and the independent variables, revealed no violation of assumptions

of multivariate normality, linearity, or homoscedasticity.

Hypothesis 2a-Productive CVQ categories will predict better scores for clients on

the outcome measures at the end of treatment.

There were no significant results for Emotional, Focused, or Emotional Plus Focused in

the session with the highest change score. Parameter estimates for the multiple regression

analyses for Emotional, Focused, and Emotional Plus Focused in the session with the highest

change score are found in Appendices F1 to F3 respectively. In the session with the lowest

change score there was one significant finding in which a higher proportion of Focused vocal

quality predicted better scores on the IIP Self-Sacrificing subscale at the end of treatment, R² =

.46, ΔR² = .05, F(1, 52) = 4.88, p = .032, 95% CI [-24.50, -1.18]. This means that a higher

proportion of Focused vocal quality in the session with the lowest change score predicted the

client’s report of fewer problems with trying to please others to one’s own detriment and being

more self-protective and cautious in trusting others. Parameter estimates for the multiple

regression analyses for Emotional, Focused, and Emotional Plus Focused vocal qualities in the

session with the lowest change score are found in Appendices G1 to G3 respectively.
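
The reported statistics can be related to one another through the usual F test for an increment in R². The sketch below assumes that the reported ΔR² is the increase in explained variance when the CVQ proportion is entered after the pre-treatment score, and that the second degrees-of-freedom value is the residual df of the full model; both are assumptions about how the regressions were run. The function name is illustrative.

```python
def f_change(r2_full, delta_r2, df_residual, df_added=1):
    """F statistic for the R-squared increment when predictors are added."""
    return (delta_r2 / df_added) / ((1 - r2_full) / df_residual)


# Rounded values reported for Focused vocal quality and IIP Self-Sacrificing
# in the session with the lowest change score: R2 = .46, delta R2 = .05, df = 52.
f_value = f_change(0.46, 0.05, 52)
print(f"F(1, 52) = {f_value:.2f}")
```

The result is close to, but not identical with, the reported F of 4.88, because the inputs are the two-decimal values printed in the text rather than the unrounded ones.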

All other significant results for Emotional, Focused, and Emotional Plus Focused vocal

qualities were found in the first session in which clients report moderate to high change.

Parameter estimates for the multiple regression analyses for Emotional, Focused, and Emotional

Plus Focused vocal qualities in the first session in which clients report moderate to high change

are found in Appendices H1 to H3 respectively.

Emotional vocal quality in the first report of moderate to high change session.

When clients’ pre-treatment scores were held constant, a higher proportion of Emotional

vocal quality in this session predicted the client’s report of fewer depressive symptoms, as seen

on the BDI scores at the end of treatment, R² = .09, ΔR² = .08, F(1, 53) = 4.87, p = .032, 95% CI

[-104.09, -4.95]. Similarly, a higher proportion of Emotional vocal quality predicted the clients’

report of less intense psychological distress at termination on the GSI, R² = .29, ΔR² = .08, F(1,

49) = 5.21, p = .027, 95% CI [-8.18, -.52], and less reactivity in coping with problems, on the

PF-SOC Reactive scale, R² = .22, ΔR² = .08, F(1, 48) = 4.72, p = .035, 95% CI [-11.89, -.46].

A higher proportion of Emotional vocal quality also predicted the client’s report of fewer

interpersonal problems overall at termination, as seen on the IIP Circumplex total, R² = .48,

ΔR² = .05, F(1, 49) = 5.19, p = .027, 95% CI [-6.55, -.41]. This was also the case for two IIP

subscales: Cold Distant, R² = .51, ΔR² = .05, F(1, 49) = 5.04, p = .029, 95% CI [-7.59, -.42], and

Socially Avoidant, R² = .58, ΔR² = .04, F(1, 49) = 4.27, p = .044, 95% CI [-9.53, -.13]. These

results mean that a higher proportion of Emotional vocal quality in this session predicted the

clients’ report of fewer problems with feeling interpersonally cold, distant, and unaffectionate as

well as feeling less socially awkward at the end of treatment.

Focused vocal quality in the first report of moderate to high change session.

A higher proportion of Focused vocal quality in this session predicted the client’s report

of fewer problems with interpersonal behaviours at the end of treatment including feeling less

exploitable and gullible as well as less permissive and excessively generous with others. This

was seen on the IIP subscales of Overly Accommodating, R² = .54, ΔR² = .04, F(1, 48) = 4.49,

p = .039, 95% CI [-6.72, -.18], and Self-Sacrificing, R² = .53, ΔR² = .09, F(1, 49) = 9.33, p = .004,

95% CI [-8.22, -1.70]. Similarly, a higher proportion of Focused vocal quality also predicted

the client’s report of greater ability to cope with problems by thinking through the issues and

strategizing as seen on the PF-SOC Reflective scale, R² = .62, ΔR² = .04, F(1, 48) = 5.74,

p = .021, 95% CI [.66, 7.5].

Emotional Plus Focused vocal quality in the first report of moderate to high change

session.

The proportions of Emotional vocal quality and Focused vocal quality in each session

were combined because doing so improves their power to predict change (Rice et al., 1979).

When analyzed in the first report of moderate to high change session, a higher proportion of

Emotional Plus Focused vocal quality predicted the client’s report of fewer symptoms of

depression, as seen on the BDI, R² = .14, ΔR² = .13, F(1, 53) = 8.09, p = .006, 95% CI [-70.61,

-12.21], and psychological distress, as seen on the GSI, R² = .32, ΔR² = .11, F(1, 49) = 7.87,

p = .007, 95% CI [-5.36, -.89], at the end of treatment.

A higher proportion of Emotional Plus Focused vocal quality in this session also

predicted the client’s report of fewer interpersonal problems overall at termination, as seen on

the IIP Circumplex Total, R² = .50, ΔR² = .08, F(1, 49) = 7.51, p = .009, 95% CI [-4.27, -.66].

Significant results were also found for four IIP subscales: Cold Distant, R² = .54, ΔR² = .08,

F(1, 49) = 8.04, p = .007, 95% CI [-5.08, -.87]; Nonassertive, R² = .51, ΔR² = .06,

F(1, 49) = 5.63, p = .022, 95% CI [-7.41, -.61]; Overly Accommodating, R² = .57, ΔR² = .07,

F(1, 48) = 7.47, p = .009, 95% CI [-5.93, -.90]; and Self-Sacrificing, R² = .54, ΔR² = .10, F(1,

49) = 10.62, p = .002, 95% CI [-6.71, -1.59]. These results mean that a higher proportion of

Emotional Plus Focused vocal quality predicts the client’s report of feeling less interpersonally

cold, distant, and unaffectionate; having an easier time asserting boundaries with others; feeling

less exploitable and gullible; and being less permissive and excessively generous with others at

the end of treatment. As well, a higher proportion of Emotional Plus Focused vocal quality in

this session predicted the client’s report of being more reflective and thoughtful in terms of

coping with problems as seen on the PF-SOC Reflective scale, R² = .62, ΔR² = .05, F(1, 48) =

6.28, p = .016, 95% CI [.674, 6.14].

Considering these results as a whole, three aspects stand out. First, with one exception, the

only session in which the productive CVQ categories predict the clients’ scores on the outcome

measures at termination is the first report of moderate to high change session. Apart from the

Focused vocal quality result for the IIP Self-Sacrificing subscale in the session with the lowest

change score, there are no significant results in the session with the lowest change score or in the

session with the highest change score on the CTSC-R. Second, the majority of effect sizes for the significant

results were small. However, there were some moderate effect sizes, mainly for Emotional Plus

Focused vocal quality where a higher proportion of these vocal quality categories predicted 13%

of the variance in BDI scores, 11% in GSI scores, and 10% in Self-Sacrificing scores at post

treatment. Third, a comparison of the standardised beta coefficients indicates that Emotional

Plus Focused vocal quality was a stronger predictor of better scores at termination on the BDI,

GSI, IIP Circumplex, and the IIP Cold Distant subscales than Emotional vocal quality alone.

Similarly, Emotional Plus Focused vocal quality was a stronger predictor of lower scores at

termination on IIP subscales of Overly Accommodating and Self-Sacrificing than Focused vocal

quality alone and of higher scores on the PF-SOC Reflective scale than Focused vocal quality

alone. In summary, with one exception, all significant results for Emotional, Focused, and Emotional Plus

Focused vocal quality occur in the first report of moderate to high change session. This

provides support for Hypothesis 2a, indicating that a higher proportion of productive CVQ

categories predicts a better end-of-treatment outcome.

Additional analyses of Externalizing vocal quality and Limited vocal quality in the first

report of moderate to high change session for Research Question 2a

Additional multiple regressions were conducted with proportions of Externalizing and

Limited vocal qualities in the first report of moderate to high change session and scores on the

outcome measures at termination. In the first report of moderate to high change session, a

higher proportion of Limited vocal quality predicted the client’s report of better problem-coping

skills involving strategizing and thinking through as seen on the PF-SOC Reflective scale, R² =

.61, ΔR² = .03, F(1, 48) = 4.08, p = .049, 95% CI [.01, 3.06]. In contrast, a higher proportion of

Limited vocal quality in this session predicted the client’s report of worse dysfunctional attitudes

at the end of treatment as seen on the DAS, R² = .64, ΔR² = .10, F(1, 45) = 13.15, p = .001, 95%

CI [-139.95, -39.99].

A higher proportion of Limited vocal quality in this session predicted the client’s report

of fewer interpersonal problems overall, as seen in the IIP Circumplex total, R² = .48, ΔR² = .05,

F(1, 49) = 5.19, p = .027, 95% CI [-2.15, -.13] at the end of treatment. A higher proportion of

Limited vocal quality in this session also predicted the client’s report of fewer problems in

several interpersonal areas at the end of treatment. These included setting boundaries and being

more assertive with others, seen on the IIP Nonassertive subscale, R² = .56, ΔR² = .10, F(1, 49) =

10.59, p = .002, 95% CI [-4.68, -1.11]; being less gullible and exploitable by others, seen in the

Overly Accommodating subscale, R² = .58, ΔR² = .08, F(1, 48) = 8.65, p = .005, 95% CI [-3.37,

-.63]; and being less permissive and overly generous with others, as seen on the Self-Sacrificing

subscale, R² = .52, ΔR² = .08, F(1, 49) = 8.46, p = .005, 95% CI [-3.51, -.64]. Parameter

estimates for the multiple regression analyses for Limited vocal quality in the first report of

moderate to high change session are found in Appendix I1.

Externalizing vocal quality in the first report of moderate to high change session.

A higher proportion of Externalizing vocal quality in the first report of moderate to high

change session predicted the client’s report of more depressive symptoms at the end of

treatment, as seen on the BDI, R² = .11, ΔR² = .10, F(1, 53) = 5.84, p = .019, 95% CI [2.67,

28.68]; more dysfunctional attitudes as seen on the DAS, R² = .53, ΔR² = .07, F(1, 46) = 6.72,

p = .013, 95% CI [12.92, 102.72]; and greater psychological distress as seen on the GSI, R² = .27,

ΔR² = .09, F(1, 49) = 6.22, p = .016, 95% CI [.22, 2.04] at the end of treatment.

A higher proportion of Externalizing vocal quality in this session also predicted the

client’s report of more interpersonal problems, as seen on the higher IIP Circumplex total at the

end of treatment, R² = .51, ΔR² = .09, F(1, 49) = 9.34, p = .004, 95% CI [.40, 1.92]. A higher

proportion of Externalizing vocal quality in this session also predicted the client’s report of

greater difficulty with being affectionate and generous with others, setting boundaries and being

assertive with others, being less gullible and exploitable, and being less permissive and overly

generous at the end of treatment. These differences are seen in the higher

scores on the following IIP subscales: Cold Distant, R² = .53, ΔR² = .07, F(1, 49) = 7.59,

p = .008, 95% CI [.33, 2.14]; Nonassertive, R² = .58, ΔR² = .12, F(1, 49) = 13.83, p = .001,

95% CI [1.16, 3.89]; Overly Accommodating, R² = .61, ΔR² = .11, F(1, 49) = 13.18, p = .001,

95% CI [.84, 2.91]; and Self-Sacrificing, R² = .57, ΔR² = .13, F(1, 49) = 14.95, p < .001, 95% CI

[.99, 3.142].

A higher proportion of Externalizing vocal quality in this session predicted the client’s

report of greater difficulty coping with problems through strategizing and planning as seen in the

lower scores on the PF-SOC Reflective scale at termination, R² = .63, ΔR² = .06, F(1, 48) =

7.29, p = .01, 95% CI [-2.73, -.40]. Externalizing vocal quality also predicted the client’s report

of lower self-esteem at termination as seen in the lower RSE score, R² = .27, ΔR² = .06, F(1, 50)

= 4.16, p = .047, 95% CI [-23.94, -.18]. Parameter estimates for the multiple regression analyses

for Externalizing vocal quality in the first report of moderate to high change session are found in

Appendix I2.

Additional analyses of Limited vocal quality and Externalizing vocal quality in the

session with the lowest change score for Research Question 2a

Additional multiple regressions were conducted with proportions of Limited vocal quality

and Externalizing vocal quality in the session with the lowest change score and scores on the

outcome measures at termination. There were no significant results for Limited vocal quality in

the session with the lowest change score. Parameter estimates for the multiple regression

analyses for Limited vocal quality in the session with the lowest change score are found in

Appendix J1.

A higher proportion of Externalizing vocal quality in the session with the lowest change

score predicted the client’s report of greater difficulty with excessive giving and trying hard to

please others than other clients at termination. This was seen on the IIP Self-Sacrificing

subscale, R² = .46, ΔR² = .04, F(1, 52) = 4.16, p = .047, 95% CI [.02, 1.93]. Parameter

estimates for the multiple regression analyses for Externalizing vocal quality in the session with

the lowest change score are found in Appendix J2.

Additional analyses of Limited vocal quality and Externalizing vocal quality in the

session with the highest change score for Research Question 2a

When multiple regression analyses were conducted for Limited vocal quality in the

session with the highest change score and scores on the outcome measures, several significant

results were found. A higher proportion of Limited vocal quality in this session predicted the

client’s report of coping with problems with greater reactivity and avoidance at termination, as

seen on the PF-SOC Reactive, R² = .25, ΔR² = .11, F(1, 51) = 7.55, p = .008, 95% CI [.79,

5.09], and PF-SOC Suppressive scales, R² = .32, ΔR² = .15, F(1, 51) = 11.18, p = .002, 95% CI

[1.62, 6.50]. Parameter estimates for the multiple regression analyses for Limited vocal quality

categories in the session with the highest change score are found in Appendix K1.

When multiple regression analyses were conducted for the Externalizing category in the

session with the highest change score and scores on the outcome measures at termination, there

were several significant results. A higher proportion of Externalizing vocal quality in this

session predicted the client’s report of dealing with problems in a less reactive and avoidant

manner at the end of treatment, as seen on the PF-SOC Reactive, R² = .21, ΔR² = .07, F(1, 51) =

4.89, p = .032, 95% CI [-3.73, -.18], and PF-SOC Suppressive scales, R² = .26, ΔR² = .09, F(1,

51) = 5.91, p = .019, 95% CI [-4.56, -.43]. Parameter estimates for the multiple regression

analyses for Externalizing vocal quality in the session with the highest change score are located

in Appendix K2.

In summary, the results of the additional analyses of Externalizing vocal quality and

Limited vocal quality show that in the session with the highest change score, a higher proportion

of Limited vocal quality predicts higher, meaning worse, scores on the PF-SOC Reactive and

Suppressive styles of coping at the end of treatment. In contrast, a higher proportion of

Externalizing vocal quality predicts lower, meaning better, scores on these measures at the end of

treatment. In the session with the lowest change score, a higher proportion of Externalizing

vocal quality predicts higher, meaning worse, scores on a measure of interpersonal problems.

Table 11 summarizes the results of all of the multiple regression analyses. In all, there

were very few significant results in the session with the lowest change score and session with the

highest change score. In contrast, many significant results were found in the first report of

moderate to high change session. A higher proportion of the Emotional Plus Focused category

in this session predicted better scores for the client at the end of treatment on the BDI, GSI, IIP

Circumplex, and some IIP subscales, and the PF-SOC Reflective scale. While the proportion of

Emotional vocal quality and Focused vocal quality alone also predicted better scores on the

outcome measures at the end of treatment, Emotional vocal quality combined with Focused

vocal quality (Emotional Plus Focused vocal quality) was the better predictor as seen by that

variable’s higher standardized beta coefficients. Results of the additional analyses also show that

a higher proportion of Limited vocal quality in the first session in which clients report moderate

to high change predicts better scores at termination on several outcome measures. In contrast, a

higher proportion of Externalizing vocal quality in this session predicts worse scores at

termination on several outcome measures.

104

Hypothesis 2b-Productive TVQ categories will predict better scores for clients on the

outcome measures at the end of treatment.

Hypothesis 2b states that a higher proportion of productive TVQ (Softened, Natural, and

Irregular) will predict better scores for clients on the outcome measures at termination. Because

the individual TVQ categories were replaced by the Therapist Vocal Style variable, this

hypothesis could not be tested directly. Instead, the relationship of the Therapist Vocal Style and

the clients’ scores on the outcome measures at post-treatment was explored. Specifically,

Table 11

A Higher Proportion of CVQ Category Predicts Outcome Scores at Post Treatment

[Table body not recoverable. Columns: the CVQ categories Emotional, Focused, Emotional + Focused, Limited, and Externalizing, repeated for the first reported moderate-high change session, the session with the lowest change score, and the session with the highest change score. Rows: BDI, DAS, GSI, Circumplex, VC, CD, SA, NA, OA, SS, IN, DC, Reactive, Reflective, Suppressive, RSE.]

Note. The symbols in the cells mark whether a higher proportion of the CVQ category in that session predicts lower or higher scores on the outcome measure at termination compared to other clients. BDI = Beck Depression Inventory; DAS = Dysfunctional Attitude Scale; GSI = Global Severity Index. Inventory of Interpersonal Problems subscale abbreviations are as follows: Circumplex = Total; VC = Vindictive; CD = Cold Distant; SA = Socially Avoidant; NA = Nonassertive; OA = Overly Accommodating; SS = Self-Sacrificing; IN = Intrusive-Needy; DC = Domineering/Controlling. Reactive, Reflective, and Suppressive refer to the Problem-Focused Style of Coping scales; RSE = Rosenberg Self-Esteem Scale.

multiple regression analyses were conducted to evaluate whether having at least one session

(either the session with the lowest change score or the session with the highest change score)

characterized as Softened-Irregular Therapist Vocal Style was related to having better scores on

the outcome measures at post-treatment compared to having no sessions classified as Softened-

Irregular Therapist Vocal Style.

To do this, two dummy variables were created. One represented clients for whom both

the session with the lowest change score and the session with the highest change score on the

CTSC-R were characterized by the Softened-Irregular Therapist Vocal Style. This dummy

variable was called Softened-Irregular Both. The second represented clients for whom one of

these sessions was characterized by the Softened-Irregular Therapist Vocal Style and the other

by the Natural-Definite Therapist Vocal Style. This dummy variable was called Mix Softened-

Irregular and Natural-Definite. The reference category represented clients for whom both the

session with the lowest change score and the session with the highest change score were

characterized by the Natural-Definite Therapist Vocal Style, or Natural-Definite Both.
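
The coding scheme described above can be sketched as follows. This is a minimal illustration and not the procedure used in the study: the function name code_vocal_style and the string labels are hypothetical, and the sketch assumes each session has already been classified as one of the two Therapist Vocal Styles.

```python
# Hypothetical labels for the two Therapist Vocal Style classifications.
SOFTENED = "Softened-Irregular"
NATURAL = "Natural-Definite"


def code_vocal_style(lowest_session, highest_session):
    """Return (softened_irregular_both, mix) as 0/1 dummy variables.

    The reference category, Natural-Definite Both, is coded (0, 0).
    """
    styles = (lowest_session, highest_session)
    for style in styles:
        if style not in (SOFTENED, NATURAL):
            raise ValueError(f"Unknown Therapist Vocal Style: {style!r}")
    # 1 only when both sessions are Softened-Irregular.
    softened_both = int(styles == (SOFTENED, SOFTENED))
    # 1 when one session is Softened-Irregular and the other Natural-Definite.
    mix = int(SOFTENED in styles and NATURAL in styles)
    return softened_both, mix


if __name__ == "__main__":
    for pair in [(SOFTENED, SOFTENED), (SOFTENED, NATURAL), (NATURAL, NATURAL)]:
        print(pair, code_vocal_style(*pair))
```

With two dummy variables and three groups, each regression coefficient compares its group with the Natural-Definite Both reference category.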

The clients’ pre-treatment scores on these measures were entered as covariates to control

for varying baseline levels of the outcome measures for different clients. Next, Softened-

Irregular Both and Mix Softened-Irregular and Natural-Definite were entered as one block. The

dependent variables were the clients’ scores at post-treatment on the outcome measures.

The assumptions of the multiple regression analyses were tested by examining the

scatterplots of the standardized residuals versus the predicted values. An inspection of the

standard residual scatterplots, which account for the relationship of the post-treatment scores and

the independent variables, revealed no violation of assumptions of multivariate normality,

linearity, or homoscedasticity.

Appendix L1 displays parameter estimates for the multiple regression analyses with the

Therapist Vocal Style in the session with the lowest change score and session with the highest

change score. None of the results were significant, with the exception of a significant test

parameter for the IIP Overly Accommodating subscale. Although the results of the regression

analysis were significant for the overall model, R² = .48, F(2, 47) = 14.28, p < .001, after

controlling for the pre-treatment scores, the block of Softened-Irregular Both and Mix Softened-

Irregular and Natural-Definite did not add significantly to explaining the model, ΔR² = .05, p =

.12. Interestingly, however, the parameter estimate did show that Softened-Irregular Both made

a significant contribution to the prediction equation, t = -2.09, p = .042. The finding that the

block of Softened-Irregular Both and Mix Softened-Irregular and Natural-Definite did not

explain the model could be an artifact of insufficient sample size in this analysis. However, the

finding that Softened-Irregular Both significantly contributes to the prediction equation suggests

a relationship worth further investigation.
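
As a rough arithmetic check on the block test, the F statistic for a change in R² can be written as F = (ΔR² / k) / ((1 - R²) / df_residual). The sketch below applies that standard formula to the rounded values reported above. It assumes k = 2 added dummy variables and takes 47 residual degrees of freedom from the reported F(2, 47); the function name and that reading of the degrees of freedom are assumptions, so the result is only approximate.

```python
def f_change(delta_r2, r2_full, k_added, df_residual):
    """F statistic for the increase in R-squared when a block of predictors is added.

    delta_r2    : increase in R-squared due to the block
    r2_full     : R-squared of the full model (covariate plus block)
    k_added     : number of predictors in the added block
    df_residual : residual degrees of freedom of the full model
    """
    if not 0 <= r2_full < 1:
        raise ValueError("r2_full must lie in [0, 1)")
    if k_added <= 0 or df_residual <= 0:
        raise ValueError("k_added and df_residual must be positive")
    return (delta_r2 / k_added) / ((1 - r2_full) / df_residual)


if __name__ == "__main__":
    # Rounded values taken from the IIP Overly Accommodating analysis.
    print(f_change(delta_r2=0.05, r2_full=0.48, k_added=2, df_residual=47))
```

Because the inputs are rounded to two decimals, the computed F should be read as an order-of-magnitude check rather than a reproduction of the original output.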

Though this result is very weak, it suggests that Softened-Irregular Both is significantly

different from Natural-Definite Both in predicting lower post-treatment scores on the Overly

Accommodating subscale. This finding tentatively suggests that clients having both the session

with the lowest change score and session with the highest change score classified as Softened-

Irregular Therapist Vocal Style, as opposed to having both sessions classified as Natural-

Definite Therapist Vocal Style, have fewer interpersonal problems at the end of treatment: they

worry less about offending others and feel less gullible and exploitable.

Another multiple regression analysis was conducted for the Softened-Irregular and

Natural-Definite Therapist Vocal Style in the first report of moderate to high change session.

The clients’ pre-treatment scores on these measures were entered as covariates to control for

varying baseline levels of the outcome measures for different clients. Because the first report of

moderate to high change session would be classified as either Softened-Irregular or Natural-

Definite Therapist Vocal Style, one dummy variable was created for the Softened-Irregular

Therapist Vocal Style, using the Natural-Definite Therapist Vocal Style as a reference category.

The Softened-Irregular Therapist Vocal Style variable was entered after controlling for the pre-

treatment scores. The dependent variables were the clients’ scores at post-treatment on the

outcome measures.

There were no significant results, indicating that in the first report of moderate to high

change session, the Softened-Irregular Therapist Vocal Style is not significantly different from

the Natural-Definite Therapist Vocal Style in predicting clients’ scores on the outcome measures

at post treatment. Appendix L2 displays parameter estimates for the multiple regression

analyses with the Therapist Vocal Style in the first report of moderate to high change session.

In summary, the results of the analyses exploring how well the Therapist Vocal Style

predicts the client’s scores on outcome measures at termination suggest only a weak association

between having both the session with the lowest change score and the session with the highest

change scores being classified as Softened-Irregular Therapist Vocal Style and improvement on

the IIP Overly Accommodating subscale, when compared to having both of these sessions

classified as Natural-Definite Therapist Vocal Style.

Research Question 3-Differences between therapist and client vocal qualities and treatment

types.

Research question 3 asks if there is a difference in the TVQ and CVQ categories

primarily expressed in PE-EFT and CBT. For the client’s vocal quality, Hypothesis 3a states

that CBT clients will have a lower proportion of Focused vocal quality than PE-EFT clients. A

Mann-Whitney U test was conducted to evaluate whether the CBT clients would have a lower

proportion of Focused vocal quality than PE-EFT clients. When the proportions of Focused

vocal quality in the session with the lowest change score on the CTSC-R were combined with

the proportions of Focused vocal quality in the session with the highest change score on the

CTSC-R, the results were not significant, z = .88, p = .377. CBT clients had an average rank of

29.30, while the PE-EFT clients had an average rank of 33.00. Table 12 shows the mean ranks,

Z statistics, effect sizes and p values for the proportion of CVQ categories by treatment type in

the session with the lowest change score plus session with the highest change score.

Table 12

Results for Hypothesis #3a: Mean Ranks, Z statistics, Effect Sizes, and p values of CVQ Categories in

Lowest plus Highest Change Score Sessions by Treatment Type (N=61 Clients)

CVQ Categories in     CBT (n = 33)   PE-EFT (n = 28)
Lowest + Highest      Mean Rank      Mean Rank         Z       (r)a    p
Emotional             27.67          36.11             2.57    .33     .010
Focused               29.30          33.00             .88     .11     .377
Emotional+Focused     27.02          35.70             1.97    .26     .049
Limited               30.08          32.09             .454    .06     .650
Externalizing         35.23          26.02             -2.02   .26     .043

Note. a Effect size is r = Z/√N; 0.1 is small, 0.3 is medium, 0.5 is large (Field, 2009).

A second Mann-Whitney U test was conducted to evaluate the proportions of Focused vocal

quality between CBT and PE-EFT in the session in which clients first report moderate to high

change on the CTSC-R. The test results were not significant, z = .98, p = .326. CBT clients had

an average rank of 27.72, while the PE-EFT clients had an average rank of 31.41. Table 13

shows the mean ranks, Z statistics, effect sizes and p values for the proportion of CVQ categories

by treatment type in the first report of moderate to high change sessions. In sum, there are no

significant differences between clients in CBT and PE-EFT in their use of Focused vocal quality.

Table 13

Results for Hypothesis #3a: Mean Ranks, Z statistics, Effect Sizes, and p values of CVQ Categories in

First Report of Moderate to High Change Score Sessions by Treatment Type (N=58 Clients)

                      CBT (n = 30)   PE-EFT (n = 28)
CVQ Categories        Mean Rank      Mean Rank         Z       (r)a    p
Emotional             27.38          31.77             1.50    .20     .133
Focused               27.72          31.41             .982    .13     .326
Emotional+Focused     26.68          32.52             1.44    .19     .149
Limited               26.70          32.50             1.45    .19     .149
Externalizing         33.77          24.93             -2.06   .27     .040

Note. a Effect size is r = Z/√N; 0.1 is small, 0.3 is medium, 0.5 is large (Field, 2009).

Additional tests addressing Research Question 3 and the client’s vocal quality.

Additional Mann-Whitney U tests were conducted to evaluate if there were differences in

the proportions of the other CVQ categories between the two treatment types. The Mann-

Whitney U tests were run on the combined proportions of each CVQ category in the session with

the lowest change score and session with the highest change score. The combination of these

sessions is referred to here as lowest plus highest change score group. The tests were also run on

the proportions of each CVQ category in the sessions in which the client first reported moderate

to high change.

When the proportions of each CVQ were combined, in the lowest plus highest change

score group, there were three significant results. PE-EFT clients expressed a significantly higher

proportion of Emotional vocal quality than CBT clients, z = 2.57, p = .010, r = .33. The mean

rank of the PE-EFT clients was 36.11, while for the CBT clients it was 27.67. PE-EFT clients

also expressed a higher proportion of Emotional Plus Focused vocal quality than the CBT

clients, z = 1.97, p = .049, r = .25. PE-EFT clients had a mean rank of 35.70, while the CBT

clients had a mean rank of 27.02. However, the CBT clients expressed a significantly higher

proportion of Externalizing vocal quality than the PE-EFT clients, z = -2.02, p = .043, r = .26.

The mean rank of the CBT clients was 35.23, while for the PE-EFT clients it was 26.02. There

were no significant differences for Limited vocal quality, z = .45, p = .650. The mean rank of

the CBT clients was 30.08 and for the PE-EFT clients it was 32.09.
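
The effect sizes reported with these Mann-Whitney U tests follow the formula given in the table notes, r = Z/√N (Field, 2009). A minimal sketch of that calculation, using values reported above, is shown here; the function name is hypothetical, and the absolute value is taken on the assumption that r is reported as a magnitude.

```python
import math


def effect_size_r(z, n):
    """Effect size for a Mann-Whitney U test: r = |Z| / sqrt(N).

    z : standardized test statistic
    n : total number of observations across both groups
    """
    if n <= 0:
        raise ValueError("n must be positive")
    return abs(z) / math.sqrt(n)


if __name__ == "__main__":
    # Lowest plus highest change score group, N = 61 clients.
    for label, z in [("Emotional", 2.57), ("Externalizing", -2.02)]:
        print(label, round(effect_size_r(z, 61), 2))
```

By the benchmarks cited from Field (2009), a value near .3 is a medium effect and a value near .1 is a small one.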

When the Mann-Whitney U test was conducted on the proportions of each CVQ category

in the session in which the client first reported moderate to high change, there was only one

significant result. CBT clients expressed a significantly higher proportion of Externalizing vocal

quality than the PE-EFT clients, z = -2.06, p = .040, r = .27. The mean rank for the CBT group

was 33.77, while for the PE-EFT group, it was 24.93. There was no significant difference in the

proportion of Emotional vocal quality, z = 1.50, p = .133; the CBT group had a mean rank of 27.38

and the PE-EFT group had a mean rank of 31.77. The proportions of Emotional Plus Focused

vocal quality were also not significantly different between the two groups, with the CBT group

having a mean rank of 26.68 and the PE-EFT group having a mean rank of 32.52, z = 1.44, p =

.149.

To summarize the results for Research Question 3 and Hypothesis 3a, there was no

support for the hypothesis that CBT clients will have a lower proportion of Focused vocal

quality than PE-EFT clients. Clients in the two treatment groups did not differ significantly in their

proportion of Focused vocal quality in the sessions with the lowest and highest change scores.

Results of the additional analyses for proportions of categories in the lowest plus highest change

score group show that CBT and PE-EFT clients expressed statistically equivalent proportions of

Limited vocal quality.

However, the additional analyses for Emotional, Emotional Plus Focused, and

Externalizing vocal qualities in the lowest plus highest change score group showed statistically

significant differences. The PE-EFT group expressed a higher proportion of Emotional

vocal quality than the CBT group. When the proportion of Emotional vocal quality was

combined with the proportion of Focused vocal quality (Emotional Plus Focused), results

showed that the PE-EFT group expressed a significantly higher proportion of Emotional Plus

Focused vocal quality than the CBT group. However, this result was due to the significantly

higher proportion of the Emotional category for the PE-EFT condition alone, as can be seen from the

Z statistic, which is higher for Emotional than Emotional Plus Focused vocal quality. Lastly,

CBT clients expressed a higher proportion of Externalizing vocal quality than PE-EFT clients

when the proportion of Externalizing in the session with lowest change score was combined with

the proportion of Externalizing in the session with the highest change score. This was also the

case in the first report of moderate to high change session.

Exploration of Hypothesis 3b-Therapist Vocal Style and treatment types.

Hypothesis 3b states that CBT therapists will have a higher proportion of Natural vocal

quality than PE-EFT therapists. Even though the Therapist Vocal Style variable replaced the

individual TVQ categories, it was possible to test this hypothesis by comparing the proportion of

sessions classified as Natural-Definite Therapist Vocal Style between the treatment groups. A

two-way contingency table analysis was conducted to evaluate whether the proportions of

sessions classified as Natural-Definite and Softened-Irregular Therapist Vocal Style were the

same in the CBT and PE-EFT conditions. The two variables were treatment condition (CBT and

PE-EFT) and Therapist Vocal Style (Natural-Definite and Softened-Irregular). The analysis was

run in the first report of moderate to high change session and in the session with the lowest

change score and the session with the highest change score.

Results show that the Therapist Vocal Style was significantly related to treatment type,

with the CBT group having significantly more Natural-Definite Therapist Vocal Style sessions

than the PE-EFT group and the PE-EFT group having significantly more Softened-Irregular

Therapist Vocal Style sessions than the CBT group. The effect size was large in the session with

the lowest change score, Pearson χ²(1, N = 57) = 24.62, p < .001, φ = -.66, and in the session

with the highest change score, Pearson χ²(1, N = 57) = 39.06, p < .001, φ = .83.

The analysis was also run in the first report of moderate to high change session. The

results in this session also showed the CBT group having significantly more Natural-Definite

Therapist Vocal Style sessions than the PE-EFT group and the PE-EFT group having

significantly more Softened-Irregular Therapist Vocal Style sessions than the CBT group. The

result also had a large effect size, Pearson χ²(1, N = 57) = 29.66, p < .001, φ = .72.

Taken together, these results indicate that a significantly higher proportion of CBT

sessions are characterized by the Natural-Definite Therapist Vocal Style than PE-EFT sessions

and that a significantly higher proportion of PE-EFT sessions are characterized by the Softened-

Irregular Therapist Vocal Style than the CBT sessions.
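
For a 2 × 2 contingency table, the magnitude of the phi coefficient can be recovered from the chi-square statistic as √(χ²/N). The sketch below applies this to the values reported above, assuming that the reported effect sizes are phi coefficients; the function name is hypothetical. The sign of phi depends on how the two variables were coded and cannot be recovered from χ² alone.

```python
import math


def phi_magnitude(chi_square, n):
    """Magnitude of the phi coefficient for a 2 x 2 table: sqrt(chi_square / N)."""
    if n <= 0:
        raise ValueError("n must be positive")
    if chi_square < 0:
        raise ValueError("chi_square must be non-negative")
    return math.sqrt(chi_square / n)


if __name__ == "__main__":
    reported = {
        "lowest change score session": 24.62,
        "highest change score session": 39.06,
        "first report of moderate to high change session": 29.66,
    }
    for label, chi_square in reported.items():
        print(label, round(phi_magnitude(chi_square, 57), 2))
```

Values above .5 are conventionally read as large effects, which matches the description of these results.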

Summary of results

Regarding Research Question 1, there was no support for the hypothesis that there would

be a higher proportion of productive CVQ categories (Emotional, Focused, Emotional Plus

Focused) in high change than low change sessions. Results of the additional analyses showed

that the proportions of Externalizing and Limited vocal qualities also did not differ significantly

between the high and low change sessions.

In terms of the Therapist Vocal Style, there were a statistically equivalent number of

sessions classified as Natural-Definite and Softened-Irregular Therapist Vocal Style in the group

of sessions with the lowest change score and the group of sessions with the highest change score.

Also, the proportion of sessions classified as Softened-Irregular and Natural-Definite Therapist

Vocal Style was not statistically different within the first reported moderate to high change

session. These results do not provide support for the hypothesis that there will be a higher

proportion of productive TVQ categories in high change rather than low change sessions.

Regarding Research Question 2, the proportions of productive CVQ categories in the

session with the highest change score did not significantly predict clients’ scores on the

outcome measures at termination in the multiple regression analyses. However, a higher

proportion of Focused vocal quality in the session with the lowest change score did predict better

scores at termination on the IIP Self-Sacrificing subscale. There were many significant results

for Emotional, Focused, and Emotional Plus Focused vocal qualities in the first session in which

clients report moderate to high change. Specifically, clients who expressed a higher proportion

of these productive CVQ categories had better scores on outcome measures at termination than

other clients. Also, although Emotional vocal quality and Focused vocal quality alone predicted

better scores at the end of treatment, combining these CVQ categories made them better

predictors of scores for some measures than either category alone.

Additional multiple regressions with the Externalizing and Limited categories showed

that a higher proportion of Limited vocal quality in the first report of moderate to high change

session predicted better scores on outcome measures at the end of treatment, while a higher

proportion of Externalizing vocal quality predicted worse scores. Taken together, these results

show that a higher proportion of each CVQ category discriminated between clients, in terms of

their outcome scores at termination, in the first report of moderate to high change session.

Additional multiple regressions with Externalizing vocal quality and Limited vocal

quality in the session with the highest change score show that a higher proportion of

Externalizing vocal quality predicted the client’s report of less reactivity and suppressive styles

of coping at termination, while a higher proportion of Limited vocal quality predicted the

opposite, meaning these clients reported experiencing more reactivity and suppressive coping

behaviours than other clients at the end of treatment. Additional tests in the session with the

lowest change score showed one significant result in which a higher proportion of Externalizing

vocal quality predicted worse scores on a measure of interpersonal problems at the end of

treatment.

In terms of the Therapist’s vocal style, exploratory multiple regression analyses were

conducted in the session with the lowest change score and the session with the highest change

score. The purpose of the test was to see how well having at least one session classified as

Softened-Irregular Therapist Vocal Style predicted the client’s scores on the outcome measures

at the end of treatment, when compared with having both sessions classified as Natural-Definite

Therapist Vocal Style. None of the results were significant, although a significant test parameter

suggests that having both the session with the lowest change score and session with the highest

change score classified as Softened-Irregular Therapist Vocal Style is related to better post-

treatment scores on the IIP Overly Accommodating subscale than having both sessions classified

as Natural-Definite Therapist Vocal Style. This finding suggests that the Softened-Irregular

Therapist Vocal Style may be important for clients having this interpersonal style.

Regarding Research Question 3, there was no evidence to support the hypothesis that

CBT clients expressed less Focused vocal quality than PE-EFT clients. However, results from

the additional analyses showed that Emotional vocal quality and Externalizing vocal quality

discriminated between the treatment approaches. Specifically, when the proportion of each CVQ

category in the session with the lowest change score was combined with the proportion of

that CVQ category in the session with the highest change score, the results showed

that PE-EFT clients expressed more Emotional vocal quality than CBT clients and that CBT

clients expressed more Externalizing vocal quality than PE-EFT clients. In the first report of

moderate to high change session, however, only Externalizing vocal quality discriminated

between the groups.

In terms of the therapist’s vocal quality, a new variable called the Therapist Vocal Style

replaced the individual TVQ categories in the analyses in order to keep the study’s power as high

as possible after the outlier cases were removed. Although the hypothesis that CBT therapists

would have a higher proportion of Natural vocal quality than PE-EFT therapists could not be

directly tested because of this change, it was explored using the Therapist Vocal Style variable.

The exploratory analyses showed that Therapist Vocal Style discriminated between treatment

approaches. Specifically, there were far more CBT sessions classified as Natural-Definite

Therapist Vocal Style than PE-EFT sessions, while there were far more PE-EFT sessions

classified as Softened-Irregular Therapist Vocal Style than CBT sessions. These results provide

support for the hypothesis that the CBT therapists spoke in Natural vocal quality more than the

PE-EFT therapists.

Chapter 4:

Discussion

The purpose of the current study was to explore the clients’ and therapists’ vocal qualities

in the CBT and PE-EFT treatments for major depression. As Greenberg (1984) explains, “voice

is a subtle, moment-by-moment indicator of change which is not easily subjected to conscious

control or external influence and is therefore a good cue of the ‘true’ process” (p. 109). Previous

research demonstrated that some vocal qualities are associated with productive client processes

(e.g., Wexler, 1974) and good treatment outcomes (e.g., Butler et al., 1962), while others are

not.

One of the most important findings in the current study to be discussed was that the

client’s vocal quality predicted scores on outcome measures at termination. The CVQ categories

with the smallest proportions, Emotional, Focused, and Limited, predicted better scores when

they occurred in the first report of moderate to high change session. On the other hand, a higher

proportion of the predominant vocal category, Externalizing vocal quality, in this session

predicted worse scores at termination. Also, it seems important

to explain how it was that there was no difference in the proportions of any CVQ category

between the session with the lowest change score and the session with the highest change score.

Another key finding to be discussed was that the client’s and therapist’s vocal quality

differentiated the treatment types from one another. In the PE-EFT group, clients expressed

more Emotional vocal quality and therapists spoke more in the Softened-Irregular Therapist

Vocal Style. In the CBT group, clients spoke mostly in Externalizing vocal quality and

therapists spoke predominantly in Natural-Definite Therapist Vocal Style.

CVQ predicts clients’ scores on outcome measures at termination

Although Rice et al. (1979) explained that combining Emotional vocal quality with

Focused vocal quality increases their “predictive power” (p. 10), it is important to reflect on

why that might be the case in the current study and why this was found only in the first report of

moderate to high change session. Also, given that Limited vocal quality is regarded as a vocal

quality that does not generate new experience (Wexler, 1974) and therefore suggests a poor

prognosis (Wexler & Butler, 1976), it is telling that in the current study Limited vocal quality

predicts better scores on the outcome measures at termination in the first report of moderate to

high change session, but predicts worse scores in the session with the highest change score. The

question also arises of how it could be that Externalizing vocal quality in the first report of

moderate to high change session predicts worse scores on the outcome measures at termination,

but predicts better scores in the session with the highest change score.

Emotional Plus Focused vocal quality predicts more favourable treatment outcomes in

the first report of moderate to high change session only.

Focused vocal quality and Emotional vocal quality, individually, in the first report of

moderate to high change session predicted better treatment outcomes. Focused vocal quality

predicted the client’s report of greater ability to express anger without fear of offending others;

being more caring for oneself in relation to meeting other people’s demands; and coping with

problems using a more deliberate, thoughtful approach. Although other studies using the CVQ

did not include these outcome measures, the results of the current project are consistent with the

findings from Butler et al. (1962) and Rice and Wagstaff (1967) in which more Focused vocal

quality was associated with favourable treatment results in client-centered therapy.

Similarly, in the current study, a higher proportion of Emotional vocal quality was

associated with less depression, less psychological distress, fewer interpersonal problems, and

less reactivity in dealing with problems at the end of treatment. These findings are consistent

with those obtained by Nixon (1980) in which the expression of Emotional vocal quality in

wholistic primal therapy “was significantly and positively correlated with many of the outcome

measures, with a significant correlation of .42 with the global outcome measure” (Rice & Kerr,

1986, p. 85). In client-centered therapy however, Rice et al. (1979) did not find enough

Emotional vocal quality to analyze and suggested that “it seems probable that combining the

Emotional and Focused categories improves the predictive power for client-centered therapy” (p.

10).

In the current study as well, Emotional vocal quality and Focused vocal quality were

present in very small amounts. Even though the number of observations for these CVQ

categories was the smallest of all four categories, they significantly predicted better scores on

outcome measures at termination. Furthermore, when their proportions were combined,

consistent with Rice et al. (1979), Emotional Plus Focused vocal quality was a stronger predictor

of scores than either Emotional vocal quality or Focused vocal quality alone. However,

Emotional Plus Focused vocal quality predicted treatment results only in the first report of

moderate to high change session and not in either the session with the lowest change score or the

session with the highest change score.

Together, these findings suggest there may be something unique about the confluence of

a higher proportion of Emotional vocal quality and Focused vocal quality, occurring within the

session that the client first reports experiencing moderate to high change, that predicts better

scores on the outcome measures at termination. The client who is working in therapy using both

of the vocal qualities may have a better chance of experiencing the significant shift “in terms of

their understanding of their problems, how they are treating themselves, and how they are feeling

about themselves and others” (p. 19) described by Watson, Goldman, and Greenberg (2007) as

the meaning of a CTSC-R score of 5 or more.

One reason for this may be that while emotional expression, on its own, may not

necessarily be enough for change to occur (e.g., Kennedy-Moore & Watson, 1999), it is a key

element in the change process. Regarding emotional arousal, which would be heard as

Emotional vocal quality, Murray and Segal (1994) referred to Daldrup, Beutler, Engle, and

Greenberg (1988) and Greenberg and Safran (1987) when they wrote, “there is a good deal of

emphasis in the clinical literature on the capacity of vocal expression to arouse emotion in

various forms of psychotherapy” (Murray & Segal, 1994, p. 393). Further, it is widely accepted

by proponents of diverse psychotherapies that expressing emotion is “a common factor crucial to

psychotherapeutic change” (Iwakabe, Rogan, & Stalikas, 2000, p. 376, referring to Frank &

Frank, 1991, and Garfield, 1989).

Researchers think change occurs with the arousal of emotion for several reasons. From

the PE-EFT perspective, emotional expression arouses the “client’s emotional schemes in order

to restructure old meanings and to create new ones” (Greenberg & Paivio, 1997 as cited in

Samoilov & Goldfried, 2000, p. 375). From the cognitive therapy perspective, Peternelli (1999)

described Beck et al.’s (1979) attitude that although “the meaning or emotional response of an

event depends upon the perception one has about this event…. The counsellor should first allow

the patient to experience and express his genuine emotion” (Peternelli, 1999, p. 13) as this

provides the client with relief.

Other researchers have discussed how emotional arousal plus another client processing activity

creates a more powerful predictor of change. For example, Diamond et al. (2010), referring to

Missirlian, Toukmanian, Warwar, and Greenberg (2005) and Pos, Greenberg, Goldman, and Korman

(2003), found that emotional expression “has been most strongly correlated with outcome when

it occurs in conjunction with cognitive exploration and reflection” (Diamond et al., 2010, pp.

402-403). Warwar (2005), too, found that “combining [emotional arousal] mid-therapy with EXP

predicted outcome on the SCL-90-R and BDI better than either of these variables alone” (p. iii).

Kennedy-Moore and Watson (1999) explained that while people can learn about themselves by

expressing their emotions, the amount of expression is not what is important. Instead, the

benefits of emotional expression depend on people’s abilities to “integrate their thinking and

their feeling, to draw upon their emotional experience without being driven blindly by it, and to

consider the interpersonal impact of their emotional behavior without discounting their own

experience” (Kennedy-Moore & Watson, 1999, p. 6).

In terms of Emotional and Focused vocal qualities, once emotions have been aroused,

speaking in Focused vocal quality indicates that the client is involved in a reflective-type of

process to form fresh, novel emotional experience (e.g., Wexler, 1974). For example, Watson

and Greenberg (1996) stated that Focused vocal quality represents the “tracking of inner

experience and clients’ attempts to symbolize it in words” (p. 265). Wexler (1974) wrote that

Focused vocal quality signals “an involved and fluid mode of processing where experience is

being created” (p. 48). Wexler studied vocal quality and the processing activities of students

who gave a personal speech about sadness. He found significant and positive relationships

between Focused vocal quality and the processes of differentiation, in which a person describes a more general experience

in finer-grained terms; integration, in which the person abstracts new meaning from discrete

experience; and vividness and variety of language used to describe experience. Wexler (1974)

summarized the results of his study by writing that:

Although the differentiation and integration of meaning is certainly a pervasive

characteristic of adult human functioning, the results show that the use of these

operations varies directly with the degree to which voice quality indicates involvement in

creating new experience…the relationship between the two is so strong as to suggest that

they are tapping the same phenomenon. (p. 51)

Taken together, these clinical meanings for Emotional vocal quality and Focused vocal quality

suggest that when they occur in higher proportions in the same session, they boost the client’s

ability to experience the kind of shifts in personal meaning described by Watson et al. (2007) in

“their understanding of their problems, how they are treating themselves, and how they are

feeling about themselves and others” (p. 19).

Limited Vocal Quality predicts more favourable treatment outcomes in the first

report of moderate to high change session.

Previous research associated Limited vocal quality with a personality type in which the

speaker is aware of a great deal of emotion, but is so overwhelmed by it that deeper

psychological exploration is hindered (Rice & Gaylin, 1973). This constricted and limited way

of relating to one’s own experience was inversely related to the creation of new, alternative

views (Wexler, 1974) and was linked to unsuccessful treatment outcomes (Rice & Wagstaff,

1967). One finding in the current study supports these results in that a higher proportion of

Limited vocal quality predicted the client’s report of avoidance and reactivity in coping with

problems at the end of therapy. However, this occurred in the session with the highest change

score and not in a low change session as would be predicted from theory. This scenario, in which

the client experienced the session as high change but still had a poor treatment outcome, points

to Limited vocal quality as a trait or “enduring style” of speech (Rice & Kerr, 1986, p. 74) that

impedes the process of change.

However, Rice and Kerr (1986) explained that the Vocal Quality measures are also

“flexible enough to reflect moment-to-moment shifts in participation” (p. 74), suggesting more of

a state than a trait. Rice and Koke (1981) stated that the “CVQ was originally designed as a state

rather than a trait measure, that is, it was intended to assess the quality of client’s involvement in

the therapy process at any given moment” (p. 161). Results for Limited vocal quality in the first

report of moderate to high change session predicting better scores on the outcome measures at

termination suggest that Limited vocal quality in this situation represents a client state rather than

a trait. These results also provide some support for ambiguous findings for Limited vocal quality

speakers in Rice and Wagstaff’s (1962) pilot study. They found that some Limited vocal quality

speakers had successful treatment outcomes, while others did not.

One explanation for this is that Limited vocal quality represents a temporary, surmountable

state of anxiety in certain contexts. Support for this comes from the similarity of the

paralinguistic descriptions of Limited vocal quality to Scherer’s (1986) prediction of how a

person’s vocal quality would sound if he or she felt powerless to deal with the outcome of an

event. Scherer (1986) suggested that a person who feels unable or powerless to deal with a

situation will likely speak in a “thin voice” (p. 156). A thin vocal quality has “shallow resonance

and low energy” and gives the impression of little self-confidence (Morreale, Spitzberg, &

Barge, 2007, p. 196). This description of thin vocal quality is consistent with descriptions of

Limited vocal quality as sounding thin and having low energy and above-platform pitch (e.g., Rice

& Kerr, 1986). Scherer (1986) listed emotional states that would sound thin including

“anxiety/worry” and “grief/desperation”, with the thinnest being “fear/terror” (p. 157).

The results of the Diamond et al. (2010) study, which used technology to detect changes in

paralinguistic characteristics, support the view of Limited vocal quality as a transient state. In

the Diamond et al. study, clients received one of two treatments for unresolved anger toward a

significant other. In one treatment, clients spoke about their feelings. In the second, clients

participated in the gestalt empty-chair intervention in which they spoke directly to a significant

other in imagination. Although Diamond et al. (2010) did not relate their findings to clients’

resolution of the problems, they found that paralinguistic indicators of fear were higher in the

emotionally arousing condition. “During the arousal of fear/anxiety,…F0 range values increased

due to increased muscle tension caused by the activation of the sympathetic nervous system”

(Diamond et al., 2010, p. 408). A higher pitched frequency, described here, can make a person’s

vocal quality sound Limited as if it is “not resting on [its] own platform” (Rice et al., 1979, p. 6,

see Volume II, Figure 4).

Diamond et al. (2010) concluded from this that “during empty-chair enactments,

participants faced their fear/anxiety that the significant other might respond in an indifferent,

rejecting, or even punitive manner” (p. 408) and that:

Facing and overcoming one’s fears of accessing and experiencing painful threatening

primary emotions, as well as one’s fears of being vulnerable while expressing hurt and

longing to the significant other, is a purported core change mechanism in relationally

oriented experiential therapies. (p. 408)

The idea of arousing or activating feared thoughts, images, memories, emotions, etc. in order to

heal emotional problems has a long history. Hunt (1998) explained Foa and Kozak’s (1986)

theory about this: “the cognitive structure underlying the pathological fear must be activated” so

that it can be changed by “new cognitive and affective information, which is incompatible with

the underlying structure” and “if this does not happen, then the cognitive structure will remain in

storage, unavailable for modification” (Hunt, 1998, p. 370).

When Limited vocal quality is associated with treatment success, it may be that

the client came into contact with his or her fears, but was not immobilized by them. Instead, it

seems that clients expressing a higher proportion of Limited vocal quality in the current study

were able to overcome, work with, or work in spite of their anxiety and fear in such a way that

the problem was resolved. Although the CTSC-R score does not give information about a shift

in or resolution of the individual’s specific problem, the fact that a higher proportion of Limited

vocal quality predicts better treatment outcomes when it occurs in the first report of moderate to

high change session suggests this possibility.

Externalizing Vocal Quality predicts worse treatment outcomes in the first report of

moderate to high change session.

In contrast to Emotional, Focused, and Limited vocal qualities, a higher proportion of

Externalizing vocal quality in the first report of moderate to high change session predicted worse

scores on the outcome measures at termination. Clients expressing more Externalizing vocal

quality in this session reported worse depression, greater dysfunctional attitudes, higher

psychological distress, more interpersonal problems, lower self-esteem and less use of a

reflective, thoughtful coping style at the end of treatment. Though other studies using the Client

Vocal Quality (CVQ) measure did not use these same outcome measures, these results are

consistent with the findings of Butler et al. (1962), in which Externalizing vocal quality was

associated with unsuccessful results in client-centered therapy. The current study’s results are

also consistent with Wexler’s (1974) observation that Externalizing vocal quality speakers have a

“style where little new experience is generated” (p. 48).

However, in the session with the highest change score, a higher proportion of

Externalizing vocal quality predicted better scores at the end of therapy in terms of coping with

problems with less reactivity and with greater awareness. It could be that in the session with the

highest change score, clients were more relaxed, perhaps reflecting on work accomplished in

previous sessions (J.C. Watson, personal communication, May, 2011). This kind of attitude

could be heard as Externalizing vocal quality and may be associated with a more comfortable

and accepting style of dealing with problems.

No difference in Client Vocal Quality Categories in the session with the lowest change score

and the session with highest change score

Contrary to expectations, there were no differences in the proportion of CVQ categories

between the session with the lowest change score and the session with the highest change score.

One reason for this may be that the session with the highest change score reflected the client’s

experience of “consolidated change” (J.C. Watson, personal communication, May, 2011). It

could be that in this session, the client was not engaged in the deep exploration which would

theoretically produce productive vocal qualities. Instead the client could be reviewing his or her

progress or noting new personal strengths and accomplishments. This could leave the client

feeling he or she changed substantially, which could show up on the CTSC-R measure as a high

score.

Client and therapist vocal qualities differentiate the treatment types

While the results showed no difference between the treatments for Focused vocal quality,

which will be discussed later, findings from the additional analyses showed significant

differences between the treatment types. PE-EFT clients expressed a significantly higher

proportion of Emotional vocal quality than CBT clients and CBT clients expressed a

significantly higher proportion of Externalizing vocal quality than PE-EFT clients. These

differences in the CVQ categories logically flow from the contrasting treatment demands of CBT

and PE-EFT. While these results support the findings of other studies discussed below, it is also

important to understand how Externalizing vocal quality, traditionally viewed as an unproductive

vocal quality and also found in the current study to predict worse treatment results, could be

related to change processes in the CBT condition. Lastly, explaining how it could be that there

were no differences between the treatment groups for Focused vocal quality is important.

Indirect support for other studies comparing different treatment types.

Because this is the only study conducted so far comparing client vocal quality in an

outcome study of CBT and PE-EFT, the results provide indirect support for earlier research

comparing CBT to more emotionally evocative treatments. For example, Burgoon et al. (1993)

studied clients’ nonverbal arousal in two group therapy conditions. One was the emotionally

evocative focused expressive psychotherapy (FEP) condition (Daldrup, Beutler, Engle, & Greenberg,

1988) and the second was the cognitive therapy (CT) condition. They found that FEP clients

who were working on their own problems in the group were rated higher on a Vocal Tension

scale than their CT counterparts. Also, Mackay, Barkham, Stiles, and Goldfried (2002) studied

depressed clients’ emotional arousal over the course of the session in two conditions. One was

the CBT condition and the other was the psychodynamic-interpersonal (PI) condition in which

therapists “often encourage clients to experience and explore their emotions deeply, particularly

in the context of the relationship with the therapist” (Mackay, Barkham, Stiles, & Goldfried,

2002, pp. 376-377, referring to Shapiro & Firth, 1987). The researchers found that over the

course of the session, while clients in both conditions expressed negative emotions, the intensity

was greater for the PI clients.

CBT clients expressed a higher proportion of Externalizing Vocal Quality than PE-

EFT clients.

The finding that CBT clients expressed a higher proportion of Externalizing vocal quality

than PE-EFT clients provides indirect support for other studies which did not use vocal quality,

but used other indicators of clients taking an external focus. For example, Goldfried,

Castonguay, Hayes, Drozd and Shapiro (1997) contrasted CBT with psychodynamic

interpersonal therapy (PI), which is similar to PE-EFT in that it is an emotionally evocative

treatment. They found that CBT sessions were characterized by a “greater emphasis on external

circumstances and clients’ ability to make decisions” (p. 740). In a study of insight events in

CBT and PI psychotherapy for clients with mood disorders, Elliott et al. (1994) found that

therapists in both groups made interpretations, but that in CBT the “key therapist

responses were external reattributions, which shifted blame from the client to others” (p. 458).

Elliott et al. (1994) explained that the clients then did the same thing by attributing the causes of

their problems to factors outside of themselves.

Even though Externalizing vocal quality has been associated with an external focus, “little

new experience”, and generally unproductive processes (Wexler, 1974, p. 48), clients in the

current study, in both treatment groups, had similar results at the end of therapy (Watson et al.,

2003). This suggests that Externalizing vocal quality does not necessarily indicate poor

processing activity. One reason for this could be that Externalizing vocal quality may be a

broader category than was previously thought. Issues arising during the rating process suggested

this might be the case. For example, during the rating process in the current study, the CVQ

raters commented that there were a number of instances in which the client’s vocal quality

verged on Focused but did not quite meet the criteria for that category. In the majority of these

cases, Externalizing vocal quality was selected instead. Also, there were instances in which a

response seemed to require a different category, sometimes because of poignancy or a serious

intensity that was not described in any of the CVQ categories. Rice and Kerr (1986) explained that

the CVQ categories were the most representative of the vocal patterns found in client-centered

therapy, but that other patterns may exist. They encouraged researchers to use the scale with

other treatment modalities including CBT for this reason.

Also, it is possible that there are Externalizing vocal quality subgroups related to different

processing activities. Although Fosha (2000) did not refer to Externalizing vocal quality, she

alluded to a way of speaking that would be rated this way according to the CVQ manual’s rules.

She warned the listener to pay closer attention to not miss what the speaker is actually

conveying:

When someone is calm and speaks in measured tones it does not mean that affect is

absent and that we are in the realm of defenses; quiet and simple communication can be a

statement of affective truth, a declaration of deeply felt personal meaning, which is an

aspect of core state functioning. (Fosha, 2000, p. 160)

Fosha (2000) continued, “it is important that these highly meaningful declarations not be

mistaken for defensive intellectualization” (p. 160). It could be that some Externalizing vocal

quality responses in the current study were used to identify these types of vocalizations which,

instead of reflecting an external or disengaged focus, actually signified important change

processes that could not be detected with the CVQ measure.

There was no difference between the treatment types for Focused Vocal Quality.

Contrary to prediction, CBT and PE-EFT clients expressed statistically equivalent proportions of

Focused vocal quality. One reason for this may be that a person’s ability to do the

inward-directed activities associated with the Focused vocal quality may be a trait and not

dependent on the different treatment demands of CBT and PE-EFT. Support for this explanation

comes from Wexler’s (1974) study in which the voice quality system developed by Rice and

Wagstaff (1967) was used to rate the vocal qualities of university students, each of whom gave a

four-minute extemporaneous speech about sadness. Wexler (1974) commented that:

Although the differentiation and integration of meaning is certainly a pervasive

characteristic of adult human functioning, the results show that the use of these

operations varies directly with the degree to which voice quality indicates the

involvement in creating new experience…the relationship between the two is so strong as

to suggest that they are tapping the same phenomenon. (p. 51)

The current study’s test results suggest that CBT and PE-EFT clients were equally engaged in

the types of processing activities indicated by Focused vocal quality. Also, both treatment

groups had “generally equivalent” results in the depression study (Watson et al., 2003). Taken

together, this suggests that clients working in both treatment types have different styles of doing

the work that achieves the same goal, which is recovery from depression.

Natural-Definite and Softened-Irregular Therapist Vocal Style and the treatment

types.

There was a significantly higher number of Natural-Definite Therapist Vocal Style

sessions in the CBT condition and a significantly higher number of Softened-Irregular Therapist

Vocal Style sessions in the PE-EFT condition. These differences can be explained by the

contrasting approaches each treatment takes toward emotion. PE-EFT therapists assist their

clients in processing emotional experience by arousing emotions so that they can be expressed,

symbolized, and differentiated (e.g., Watson & Geller, 2005). To facilitate this, PE-EFT

therapists aim to make clients feel “safe…and sufficiently accepted by their therapist, so that

they are not monitoring what they are saying, designing things to please the therapist, or

protecting themselves from potential criticisms” (Iwakabe et al., 2000, p. 377).

The Softened-Irregular therapist vocal style, defined by its higher proportion of Softened

vocal quality and Irregular vocal quality, matches these therapist intentions. The Softened vocal

quality is thought to communicate “to the client that the situation is safe, that the therapist can be

trusted, and that the client is prized” (Rice & Kerr, 1986, p. 96), so that the client feels

sufficiently secure and supported to explore painful experience. Irregular vocal quality is the

pattern made by the therapist as he or she attempts “to get the exact flavour of each feeling or

described event” (Kerr, 1980, p. 46). This describes one of the key therapist interventions in PE-

EFT which is empathic reflection. Because these therapist behaviours are emphasized in PE-

EFT, it makes sense that Softened-Irregular Therapist Vocal Style would be spoken more by PE-

EFT therapists than CBT therapists.

In contrast, CBT therapists traditionally work to dial down clients’ distressing emotion by

“managing or containing affective arousal” (Samoilov & Goldfried, 2000, p. 373). Instead of

treating emotional problems by concentrating on emotion, CBT therapists help their clients use

reason, logic, and behavioural experiments to challenge those thoughts and beliefs that lead to

distress in the first place (Beck & Weishaar, 1989 as cited in Burgoon et al., 1993). These

therapist behaviours correspond well with Mackay et al.’s (2002) remark that the CBT session is

generally “conducted in a businesslike manner, with little expressed emotion” (p. 376, referring

to Shapiro & Firth, 1985).

Bolinger (1978) refers to the speaker’s attitude as “the running commentary that

intonation adds to the propositional content of sentences” (p. 484). If the therapist’s attitude is

that the client will benefit from “a structured form of therapy” (Josefowitz & Myran, 2005, p.

330) that facilitates “active problem-solving strategies” (Wilson & Evans, 1977, p. 560 as cited

in Raue & Goldfried, 1994, p. 133), then it follows that the therapist’s intonation would reflect

this. These attitudes, plus the expectation that the CBT “therapists’ behaviour should be honest

and warm” (Hoffman & Asmundson, 2008, p. 3) make it easy to see how Natural-Definite

Therapist Vocal Style, defined by its higher proportion of Natural vocal quality and Definite

vocal quality, is a logical fit for the CBT therapist.

Natural vocal quality is referred to as the “working factor” because this is the way the

therapist’s vocal quality sounds when he or she is speaking “on topic” and not about the

therapeutic relationship or the “process of exploration” (Rice & Kerr, 1986, p. 96). Rice and

Kerr (1986) also described this vocal quality as “nonthreatening” and “relaxed” (p. 96). The

Definite category, however, can be either helpful or not to the client. This vocal quality is partly

defined by its downward-sloping terminal contours, which are thought to convey authority,

confidence, and also finality (Rice & Kerr, 1986). Ohala (1994), an intonation phonology

researcher, referred to Bolinger (1978) when he wrote, “it seems safe to conclude that such

‘social’ messages as ….assertiveness, authority, aggression, confidence, threat are conveyed by

low/or falling F0 [fundamental frequency]” (Ohala, 1994, p. 327). Also, the falling pitch in some

Definite vocal quality utterances does not leave room for the listener to differ with the speaker.

This is supported by Bolinger’s (1978) observation that, “the two intonational shapes that are

found everywhere are fall and rise, with their targets, low and high. The meanings are as

uniform as the shapes: falls for ‘being through,’ rises for ‘not being through’” (p. 516).

These characteristics of Definite vocal quality could make it an effective vocal quality

for instructing and guiding, both of which are likely to be found in CBT where clients are

“encouraged to ask questions to make sure that they understand and agree with treatment”

(Hoffman & Asmundson, 2008, p. 3). This Definite vocal quality could also be used to calm the

client down because it makes the therapist sound like an expert which could be “very reassuring”

to a vulnerable client (Rice & Kerr, 1986, p. 96).

Strengths of the current study

The current study added to the vocal quality literature in several ways. First, the finding

that CVQ categories predict the clients’ scores on the outcome measures at termination, but

primarily in the first report of moderate to high change session, suggests that the client’s vocal quality

does indicate whether or not he or she is working in a way that will lead to a feeling of change in

the session and that will be related to change at the end of treatment.

Second, prior to the current study, Limited vocal quality tended to be regarded as a

personality or trait variable, indicating that the client was distant from his or her experience,

fragile and vulnerable, ultimately unable to engage in therapy in a beneficial way. The findings

in this study, however, suggest that Limited vocal quality may also reflect an anxious state

signalling, for example, intense fear or terror. Diamond et al. (2010) explained that fear of

confronting an important attachment figure can prevent the client from accessing his or her

primary experience of feeling “worthlessness, loss, longing, and sadness” (p. 402). However,

facing this fear and overcoming it allows the client to access these painful emotions, enabling

him or her to learn what it is he or she needs and freeing him or her from having to “defend”

against these difficult feelings (Diamond et al., 2010, p. 402). Limited vocal quality in this

context could be a sign that the client is in fact in a state in which he or she is engaging in the

treatment, as in facing one’s fears, as opposed to expressing a trait that keeps him or her at a safe

distance from psychological experience.

Third, results for the Therapist Vocal Style and the CVQ show how CBT and PE-EFT

sessions can be characterized by the therapists’ and clients’ predominant use of certain vocal

qualities. These vocal qualities logically flow from the unique demands of each approach.

Finally, the proportions of the productive CVQ categories (Emotional, Focused, and

Emotional Plus Focused) in the current study were very small and yet they still predicted the

clients’ scores on the outcome measures at post treatment. This finding provides support for

both Emotional and Focused categories as indicators of important change processes.

Limitations of the study

There were several limitations to the current study including the sample size,

generalizability of the sample, moderate to small effect sizes, and the portion of the session rated.

Regarding generalizability, there are two possible problems. The first is that the data for this

vocal quality study were drawn from a homogeneous sample, consisting mostly of women and

people who report having a postsecondary/college education, as seen in Table 2. Because of this,

the results of the current study may not be as applicable to men or people with other types of

education.

The second problem has to do with the very small proportions of Emotional, Focused,

and Limited vocal qualities. Most of the clients in the study spoke in Externalizing vocal quality

most of the time. The remaining CVQ categories had very small proportions, with many outliers

in the distribution of each. Because of this it is unclear whether the categories of Emotional,

Focused, and Limited vocal qualities are rare in the population or if the session selection for the

current study only captured some instances of these vocal qualities. It is possible that if the

sessions had been selected based on work done on specific client problems or the occurrence of

particular treatment interventions, there would have been more instances of these categories.

Another limitation was the size of the sample. Although the sample size from which the

data for the current study were drawn is larger than that of previous research (Watson & Bedard,

2006), the size was further reduced in the current study due to damaged or inaudible audio or

videotapes. Also, some of the audio was excluded because the client’s and/or therapist’s vocal

qualities were so soft that they could not be rated. Had these cases been included in the data

set, the results may have been different.

A third issue is that most results had small to moderate effect sizes. One reason for this

could be that vocal quality is just one of several indicators of client and therapist processes.

Ritchie (1998) wrote, “another problem with the TVQ is that the effect of therapy on the client is

dependent on many variables other than voice quality such as body language and verbal

language” (p. 32). Along these same lines, Kinseth (1989) explained that “human

communication is a multichannel process, involving not only intentional verbal expression but

also simultaneous multiple nonverbal channels such as body movement, nonlinguistic

vocalization, and body orientation” (p. 6).

While there is complementary information streaming through these other nonverbal

channels, likely adding to the overall presentation of the clients, there are other, covert activities

that are also probably contributing to the variance in scores. Rice and Kerr (1986) wrote that “to

understand what makes therapy good or bad, we must be able to follow all of the important

elements in the interaction” (p. 101). Safran, Muran, and Samstag (1994) agreed, “the use of

converging measurement procedures…is important since no one measure can comprehensively

capture the important features of any given aspect of clinical process” (p. 231).

As a result, research has included other measures of client activity alongside the CVQ.

Examples include the CVQ alongside SASB (Benjamin, 1974) and EXP (Klein et al., 1986) in a

study of alliance ruptures (Safran & Muran, 1996) and Client Emotional Arousal Scale-revised

(CEAS-r) (Machado, 1992) in the investigation of resolvers and nonresolvers in the empty chair

task for unfinished business (Greenberg & Malcolm, 2002). The TVQ was investigated along

with the Level of Client Perceptual Processing (LCPP) (Toukmanian, 1994) in the study of the

therapist’s vocal quality in the treatment of hyperphagia (Ritchie, 1998).

A fourth limitation is the section of the session that was rated. The middle 20 minutes of

each session was chosen for rating because this section has been considered the “working phase”

of a session (Watson & Bedard, 2006, p. 154). However, rating only the middle 20 minutes of a

session can truncate important therapeutic activity, such as the client’s calming down from a cry

just as the middle 20 minutes begins or the client’s just beginning to speak in Focused vocal

quality at the 19th minute. The data for the study may have been more representative of the vocal

character of the sessions if the entire session had been rated.

The results of the Mackay et al. (2002) study support this idea. They found that

depressed clients receiving CBT and psychodynamic interpersonal (PI) therapy, which is an emotionally

evocative treatment, expressed equivalent amounts of negative emotion, but at different points in

the session. Emotional arousal in the CBT condition tended to be shaped like a U, with less

arousal in the middle of the session, while the PI clients’ arousal patterns tended toward an

upside down U, with greater arousal in the middle of the session. The authors attributed these

differences to the contrasting treatment demands. However, in the current study, only the middle

20 minutes were rated on the CVQ, which may have been the main reason the PE-EFT group

was found to have expressed a higher proportion of Emotional vocal quality than the CBT group.

It could be that meaningful instances of the Emotional vocal quality category for the CBT group

were missed because of the middle 20 minute limitation.

Future Research

Three results in particular merit further research. The first stems from the finding

suggesting that having both the session with the lowest change score and the session with the

highest change score classified as the Softened-Irregular Therapist Vocal Style predicts better

scores on the IIP Overly Accommodating subscale at the end of treatment, when compared to

having both of these sessions rated as Natural-Definite Therapist Vocal Style sessions. Clients

in the Depression Project, from which data for the current study were drawn, improved about the

same amount by the end of treatment in both the CBT and PE-EFT treatment conditions. There

was one exception to this, which was on the Inventory of Interpersonal Problems (IIP). On this

measure, the PE-EFT clients had greater improvement than the CBT clients. Furthermore, on the

IIP subscale called Overly Accommodating, while the PE-EFT clients reported improvement,

“the CBT group did not change at all” (Watson et al., 2003, p. 777).

Watson et al. (2003) attributed this difference on the IIP to two factors: “The type of

therapeutic relationship that is modeled, with its emphasis on empathy, acceptance and positive

regard, and the nature of the therapeutic tasks” (pp. 779-780). This reasoning links well with the

finding from the current study that PE-EFT therapists spoke predominantly in the Softened-

Irregular Therapist Vocal Style, while CBT therapists spoke predominantly in the Natural-

Definite Therapist Vocal Style. Possible influences of these vocal styles on the interpersonal

style of the Overly Accommodating client suggest a direction for future research on this topic.


Clarkin and Levy (2004) asked, “which client and therapist characteristics interact most

saliently and forcefully to produce symptom decline?” (pp. 194-195). The Softened-Irregular

Therapist Vocal Style can be seen as a strong healing match to the needs of the client presenting

with the Overly Accommodating interpersonal style based on theoretical and clinical

considerations. One way to understand the treatment needs of the Overly Accommodating client

is to view his or her behaviour in the context of interpersonal theory. High scorers on this

subscale report feeling gullible, easily exploited by others, and not only “assume that assertive

acts offend” (Schneider, Huprich, & Fuller, 2008, p. 19), but are so afraid of having this effect on

others that they will not express anger.

These are submissive behaviours (e.g., Pincus & Gurtman, 1995) which, seen through the

lens of interpersonal theory, help the client by protecting him or her from feeling anxious (e.g.,

Bernier & Dozier, 2002, referring to Leary, 1957). Anxiety in this case could arise from the

client’s fear of damaging or losing important relationships if the client were to express his or her

true feelings. People tend to respond to submissive behaviour according to the principle of

complementarity in which “interpersonal actions are designed to elicit, entice, or evoke restricted

classes of reactions from persons with whom we interact, especially from significant others”

(Kiesler & Auerbach, 2003, p. 1716). The complementary behaviour to submission is control or

domination (e.g., Benjamin, 1994), meaning that one person’s submissive behaviour elicits

dominant or controlling behaviour from another person. Benjamin (2003) saw the

submit/control transaction as “an example of a highly enmeshed, unhealthy interpersonal

interaction that is very stable (Benjamin, 1994), creating a ‘self-fulfilling prophecy’ experience”

(Benjamin, 2003, p. 48). In addition, the submissive interpersonal style is a form of “passive,


avoidant coping that is a central causal factor in depression” (Pearson, Watkins, & Mullan, 2010,

p. 971, referring to the work of Ferster, 1973).

Importantly, the push and pull of the submit/control dynamic is not only pervasive, but it

is also played out in the human vocal quality. As Gregory and Webster (1996) explained, “it is

common knowledge that power and status identifications are communicated through the voice

channel. Authoritative voices and deferent voices are easily recognized as such”

(p. 1232). Referring to this phenomenon, Schwartz (1996) wrote, “there’s a hidden battle for

dominance waged in almost every conversation—and the way we modulate the lower

frequencies of our voices shows who’s on top” (no page number).

The Definite vocal quality, along with the Natural vocal quality, characterizes the

Natural-Definite Therapist Vocal Style. Definite vocal quality is partly defined by its falling

frequencies at the ends of statements (Rice & Kerr, 1986). This fall to a lower vocal frequency

imparts an authoritative and potentially controlling tone to the speaker’s message. Ohala (1994),

referring to Bolinger (1978), wrote, “it seems safe to conclude that such ‘social’ messages as

….assertiveness, authority, aggression, confidence, threat are conveyed by low/or falling F0

[fundamental frequency]” (p. 327). These aspects of the Definite category help it exert control

over the listener, which could be helpful, but could also shut down a budding client process. In

the latter case, the therapist using the Natural-Definite Therapist Vocal Style with the Overly

Accommodating client could theoretically perpetuate the client’s submissive interpersonal cycle.

In client-centered therapy, a cornerstone of PE-EFT, the therapist’s vocal quality is

believed to transmit core humanistic values to the client. Rogers (1947) wrote, “the one value or

standard held by the therapist which would exhibit itself in his tone of voice, responses, and

activity, is a deep respect for the personality and attitudes of the client as a separate person” (p.


358). This relationship message would seem to provide a direct antidote to the Overly

Accommodating client whose behaviour with others is distorted by the sense that the client, just

as he or she is, is insufficient for others to stay in a close relationship with him or her.

This attitude of the therapist’s acceptance of the client as a separate, valued person is

conveyed in what Truax and Carkhuff (1967) refer to as “nonpossessive warmth” (p. 328).

Rogers (1957) described nonpossessive warmth as “caring for the client as a separate person,

with permission to have his own feelings, his own experiences” (p. 98). This attitude is also

referred to in more contemporary terms as unconditional positive regard and acceptance. The

therapist conveys these attitudes by maintaining a “consistent, genuine, noncritical interest and

tolerance for all aspects of the client” (Elliott et al., 2004, p. 10). Truax and Carkhuff (1967)

described the vocal quality of nonpossessive warmth as “low-pitched, full vocal tones in a

slowed rate of speech, communicating the intentness and seriousness of the therapist’s response”

(p. 245). This description is very similar to Rice and Kerr’s (1986) Softened vocal quality.

If all of these therapist attitudes are conveyed to the client through Softened vocal quality,

then the client is likely to feel secure with the therapist--possibly secure enough that he or she

could express anger, or take other interpersonal risks, without fear of the therapist leaving or

rejecting him or her. Feeling interpersonally secure “allows an adult to consider alternative

perspectives… to reflect on, discuss, and so revise realities…to self-disclose and assert one’s

needs” (Johnson, 2005, p. 411). In addition, contact with a supportive other “tranquilizes the

nervous system” (Schore, 1994, p. 244). These may be among the reasons Benjamin (1979)

wrote that “if a submissive patient who deals with ‘oughts and shoulds’ by total compliance

meets the therapist in the classic Rogerian posture, he/she is thereby encouraged to self-define in

a friendly way” (p. 308). These explanations for the contrasting influences of Therapist Vocal


Style on client processes may account for the finding suggesting that Softened-Irregular

Therapist Vocal Style predicts the client’s report of fewer overly accommodating behaviours at the

end of treatment. Importantly, the results of the current study were suggestive of this and require

further study.

As Rice and Kerr (1986) explained, “in studying the therapist’s process in an

interview…one is interested basically in the client’s change” (p. 94) [italics in original]. Because

of this, a second related area of further investigation could include looking at the Softened-

Irregular Therapist Vocal Style, with its overall non-threatening, soothing tone, as representing

“almost constant ameliorative processes” (Henry, Schacht, & Strupp, 1990, as cited in Henry,

1997, p. 391). These “constant ameliorative processes” have been associated with good

treatment results (Henry, 1997, p. 391). This could be investigated along with the Natural-

Definite Therapist Vocal Style which, because of the authoritative, assertive quality of Definite

vocal quality, might close down a client’s emotional experiencing. Experiencing is a process that

has been associated with successful outcomes in different treatment approaches (see Wiser &

Goldfried, 1998, for a list of studies).

Indirect support for this comes from the study done by Wiser and Goldfried (1998).

Although the authors did not investigate vocal qualities, they did find that “personally controlling

interventions were more often followed by shifts away from deeper affective exploration than by

maintenance of the affective focus” (Wiser & Goldfried, 1998, p. 639). Wiser and Goldfried

(1998) explained that:

These highly affiliative but moderately interpersonally controlling interventions were not

harsh, critical, or inappropriate; rather, such utterances received low ratings of affiliation.

Instead, these interventions had a guiding or challenging aspect, such as, “You’re sad,


yes, but perhaps you’re angry too?”; “You don’t seem impulsive to me, as you state, you

actually seem quite cautious”; and “What might be a different way of looking at that?”

Such comments are a large part of clinical work wherein therapists respectfully

encourage clients to consider their experience from a new vantage point. (p. 639)

Investigating these effects could be part of a third and larger effort to continue the work

of Wiseman and Rice (1989) who demonstrated through sequential analysis that the therapist’s

use of Irregular vocal quality can shift the client’s vocal quality from unproductive to

productive. Future studies should also include the bi-directional influence between the client and

therapist. As Rice (1965) explained, it is not only the therapist who can affect the client’s

behaviour in the session; the therapist’s “style of participation” can also be influenced by

the client (p. 160). Because vocal quality is such a revealing nonverbal behaviour (e.g., Perls,

1969), it is worth investigating how the client’s vocal quality might influence the therapist’s

vocal quality as well. Butler et al. (1962) stated that a very depressed, dull-sounding client could

drag down the energetic and focused therapist, which would, in response, influence the

therapist’s “style of participation” to become flatter and less stimulating to the client.

For therapists to gain an awareness of how their own vocal qualities and “styles of

participation” are influenced by their client’s vocal qualities and styles (Butler et al., 1962, pp.

188-189) seems like an obvious part of being an effective therapist. As Fosha (2000) explained:

The change occurring in oneself as a result of closely relating with another, where each

influences the other, “provides a behavioral basis for knowing” the other, thus enabling

one to enter “into the other’s perception, temporal world and feeling state” (Beebe &

Lachmann, 1988, p. 331). (Fosha, 2000, p. 152)


In addition, Pally (2001) wrote that, “since nonverbal mechanisms can be activated without

conscious awareness, neither patient nor analyst may be directly aware of their impact” (p.

71). The subtlety of these interpersonal transactions places a special responsibility on the

therapist to not unwittingly engage the client in his or her depression-sustaining interpersonal

cycle. Kiesler (1979) warns therapists that they “must break the vicious circle by not continuing

to be ‘hooked’ or trapped by the client’s engagement or pull” and that “it is essential that the

therapist not respond in the same locked-in and overdetermined manner as have others in the

client’s life” (p. 307) [italics in original].

Implications for Practice

Elliott et al. (2004) wrote, “therapists can…enhance their responsiveness to clients by

being alert to the possible meanings inherent in different vocal qualities” (p. 61). The authors

provide guidelines about what clients may need depending on the vocal quality they express.

One of the most important findings of the current study is that depressed clients’ use of

Emotional vocal quality, Focused vocal quality, and Limited vocal quality in the first session in

which they report moderate to high change predicts better scores on the outcome measures at the end of

treatment. In contrast, the use of a higher proportion of Externalizing vocal quality in this

session predicts worse scores. These results provide some support for the guidelines offered by

Elliott et al. (2004):

Clients who demonstrate little or no focused or emotional voice are seen as less

emotionally accessible and in need of further work to help them process internal

experiential information. Clients with a high degree of external vocal quality can

generally benefit from being helped to focus inward, whereas those with a high degree of


limited vocal quality need a safe environment to develop trust in the therapist and allow

them to relax. (p. 61)

Although attending to clients’ nonverbal cues is regarded as very important in the

psychotherapy hour, to prevent, for example “reinforce[ing] dysfunctional behaviour patterns”

(Kinseth, 1989), paying attention to and understanding the meaning of the client’s nonverbal

behaviour, such as vocal quality, is not an easy or natural thing to do for everyone. Rice and

Kerr (1986) suggested that raters have an “ear” or “sensitivity” for discerning vocal quality (p.

98). They add that some people do not have this ability and that they may not be able to acquire

it.

Along these lines, there is a research trend toward using technology to replace human

listening when it comes to treating clients. For example, Rochman, Diamond, and Amir (2008)

examined the acoustic parameters of the clients’ vocal qualities while they were expressing

emotions. Rochman et al. (2008) wrote that:

In terms of clinical practice, [their] research represents a first step toward the

development of a new measurement technology that, in the future, would enable

therapists (and clients) to continuously monitor clients’ emotional states online over the

course of therapy. Such a technology could be used to inform the therapist of clinically

significant changes in clients’ emotional states, changes that might not be evident on the

basis of client behaviour alone. Such information could be used to guide moment-to-

moment intervention strategies. (pp. 515-516)

At the entrepreneurial level, a “biocommunication” company called ZYTO (2012)

developed a product called EVOX to help health care professionals, such as medical doctors,

address their clients’ “static perceptions” (no page number). The way it works is that during an


office visit, the patient wears a head set with a mouth piece and places his or her hand in a

“cradle” (no page number). These devices transmit physical data, such as vocal frequencies, to a

computer, as the client speaks about painful or troubling issues. After about half an hour, there is

enough data for the doctor to show the patient various charts on the computer screen that will

display his or her “stuck” areas in the narrative, as indicated mainly by vocal frequencies and

other data from the hand cradle. This seems like a technological parallel to Elliott et al.’s (2004)

comment that “clients’ vocal quality can …provide clues concerning unacknowledged feelings”

(p. 61).

While using technology this way could make a therapist feel more confident about his or

her next step, it seems that it could also make the client lack confidence in the therapist’s ability

to connect with him or her on an essential level. Also, if a therapist relies on a computer screen

to tell him or her when the client is experiencing meaningful emotion, how can the therapist ever

really listen to what the client is saying? Being truly “heard” by a caring other can be a healing

experience in itself. While treating a client’s changing emotional states like heart beats on an

EKG printout may make the psychology profession seem more medical or scientific, it would

also impersonalize one of the core healing aspects of the therapy session, which is the therapeutic

relationship. It seems like a more genuine approach is for therapists to learn how to listen to

their clients’ vocal qualities for the wealth of information they offer.

However, technology could play a very substantial role in teaching practitioners how to

listen. Lacking an “ear” for vocal quality (Rice & Kerr, 1986, p. 98) could be a

research and training issue rather than some inherent deficit in a person’s ability to hear. Belin,

Fecteau, and Bedard (2004) wrote that “abilities involved in perceiving paralinguistic

information in voices—or ‘voice perception’ abilities – have been far less investigated than


speech perception” (p. 129). While the paralinguistic aspects of discrete emotions have been

studied a great deal (e.g., Patel, Scherer, Bjorkner, & Sundberg, 2011), other less-studied aspects

of vocal quality may only be distinguishable to the human ear after being pointed out with the

help of technology.

Given the results of the current study, it would be particularly helpful, for example, to

identify the paralinguistic aspects of Limited vocal quality as a trait and as a state, if these are

different at the paralinguistic level. Therapists might respond somewhat differently if these types

of Limited vocal qualities are distinct and if they can be discerned by the human ear. Also, since

CBT clients spoke mainly in Externalizing vocal quality and since the treatment outcomes for the

CBT and PE-EFT clients are about equal (Watson et al., 2003), a paralinguistic analysis of

Externalizing vocal quality could reveal subcategories that account for change processes in the

CBT group. If therapists can learn to hear these variations, they would have even more

information available to guide their treatment interventions.

There are audiotapes to accompany the CVQ and TVQ manuals; however, both should be

updated with more current vocal examples. Also, it might help teach listening skills if new

manuals included audio examples of the individual paralinguistic characteristics described in the

manual, such as rising and falling fundamental frequency as well as wordless, vocal quality

“melodies” typifying each vocal pattern. Creating a manual or catalogue of sounds such as this

could facilitate training for research purposes and the development of clinical skills. This

possibility becomes even more intriguing given Rice and Koke’s (1981) observations from

supervising students. Rice and Koke (1981) educated student-therapists about vocal quality to

help them understand their clients’ problematic “habitual processing styles” as opposed to the

more commonplace understanding of “psychotherapy process as a series of motivated


interpersonal acts, such as defence, manipulation, and so on” (p. 163). Moreover, “therapists, in

turn, often handle such acts by challenging or interpreting the defenses” (Rice & Koke, 1981, p.

163).

The work of Henry and colleagues (1986, 1994) suggests that these kinds of therapist

responses are negative interpersonal processes which are related to poor treatment outcomes.

Referring to the work of Henry, Schacht, and Strupp (1986), Henry and Strupp (1994) wrote of

their results: “In poor-outcome cases, the frequency of complementary exchanges between

therapist and patient that were negative (interpersonally disaffiliative and/or separating) was

significantly higher” (p. 65). An example of interpersonally disaffiliative and separating

behaviours would be the therapist’s responding to the client as if he or she is being defensive or

manipulative. Benjamin (1996), referring to the SASB measure, stated that “tone of voice and

the context are very important in assessing affiliation and interdependence” (p. 39). Reframing

vocal behaviour in terms of processing styles, as opposed to the client’s being manipulative for

example, would presumably shift the therapist’s intervention in a different direction.

Conclusion

The results of the current exploratory study indicate that the client’s and therapist’s vocal

qualities differentiate the CBT and PE-EFT treatments. Because both treatments were found to

be approximately equal in their effectiveness in treating depression in the Depression Project

(Watson et al., 2003), the differences in vocal qualities, as in the Mackay et al. (2002) study,

suggest that the treatments work by different “mechanisms” (p. 380, referring to Stiles, 1983).

As recommended by Watson et al. (2003), “future work needs to be concerned with identifying

more precisely what is differentially effective in each treatment and common to all to further our

understanding of treatment efficacy” (p. 780).


Knapp and Hall (2010) wrote, “you should be quick to challenge the cliché that vocal

cues only concern how something is said—frequently they are what is said” (p. 396). However,

psychotherapy researchers view communication in the psychotherapy setting as a multiple level

process (e.g., Kinseth, 1989). Specifying the change mechanisms and their interactions with one

another is an ongoing process.

The results also indicate that just the presence of Emotional vocal quality and Focused

vocal quality is not necessarily enough for change to occur. However, the presence of these

vocal qualities together in a session in which the client first reports moderate to high change was

sufficient, in the current study, to predict better treatment outcomes. In addition, exploring the

ways in which Externalizing vocal quality and Limited vocal quality can predict both better and

worse treatment outcomes would be valuable in terms of expanding the CVQ scale and

understanding what factors make a vocal quality productive in one setting, but not in another.

Finally, understanding how the therapist’s vocal style influences the client’s vocal quality

as well as changes in interpersonal problems continues to be important. The current research

suggests that the therapist’s Softened-Irregular vocal style may play a unique role in helping

clients with these problems. Identification of the Natural-Definite and Softened-Irregular

Therapist Vocal Styles may be an important clue to understanding how the “work” of therapy is

achieved in PE-EFT and CBT.


References

Ackerman, D. (1995). A natural history of the senses. New York, NY: Vintage Books.

Alden, L. E., Wiggins, J. S., & Pincus, A. L. (1990). Construction of circumplex scales for

the inventory of interpersonal problems. Journal of Personality Assessment, 55(3-4),

521–536. doi:10.1207/s15327752jpa5503&4_10

Bachman, J. G., & O’Malley, P. M. (1977). Self-esteem in young men: A longitudinal analysis

of the impact of educational and occupational attainment. Journal of Personality and

Social Psychology, 35(6), 365–380. doi:10.1037/0022-3514.35.6.365

Bady, S. L. (1985). The voice as a curative factor in psychotherapy. Psychoanalytic Review,

72(3), 479–490. Retrieved from

http://search.proquest.com/docview/617132515?accountid=14771

Beck, A. T., Rush, A. J., Shaw, B. F., & Emery, G. (1979). Cognitive therapy of depression.

New York, NY: Guilford.

Beck, A. T., Steer, R. A., & Garbin, M. G. (1988). Psychometric properties of the Beck

Depression Inventory: Twenty-five years of evaluation. Clinical Psychology Review,

8(1), 77–100. doi:10.1016/0272-7358(88)90050-5

Beck, A. T., Ward, C. H., Mendelson, M., Mock, J., & Erbaugh, J. (1961). An inventory for

measuring depression. Archives of General Psychiatry, 4, 561–571.


Belin, P., Fecteau, S., & Bédard, C. (2004). Thinking the voice: Neural correlates of voice

perception. TRENDS in Cognitive Sciences, 8(3), 129–135. Retrieved from

http://journals2.scholarsportal.info.myaccess.library.utoronto.ca/tmp/75817753959008024.pdf





Benjamin, L. S. (1974). Structural analysis of social behavior. Psychological Review, 81(5),

392–425. doi:10.1037/h0037024

Benjamin, L. S. (1979). Use of structural analysis of social behavior (SASB) and Markov chains

to study dyadic interactions. Journal of Abnormal Psychology, 88(3), 303–319.

doi:10.1037/0021-843X.88.3.303

Benjamin, L. S. (1994). SASB: A bridge between personality theory and clinical psychology.

Psychological Inquiry, 5(4), 273–316. Retrieved from

http://www.jstor.org/stable/1449133

Benjamin, L. S. (1996). A clinician-friendly version of the interpersonal circumplex: Structural

Analysis of Social Behavior (SASB). Journal of Personality Assessment, 66(2), 248–266.

doi:10.1207/s15327752jpa6602_5

Benjamin, L. S. (2003). Interpersonal diagnosis and treatment of personality disorders (2nd ed.).

New York, NY: Guilford Press.

Bernier, A., & Dozier, M. (2002). The client-counselor match and the corrective emotional

experience: Evidence from interpersonal and attachment research. Psychotherapy:

Theory, Research, Practice, Training, 39(1), 32–43. doi:10.1037/0033-3204.39.1.32

Bolinger, D. L. (1978). Intonation across languages. In J. Greenberg (Ed.), Universals of human

language: Vol. 2. Phonology (pp. 471–524). Stanford, CA: Stanford University Press.

Brennan, R. L., & Prediger, D. J. (1981). Coefficient kappa: Some uses, misuses, and

alternatives. Educational and Psychological Measurement, 41(3), 687–699.

doi:10.1177/001316448104100307



Burgoon, J. K., Beutler, L. E., Le Poire, B. A., Engle, D., Bergan, J., Salvio, M., & Mohr, D. C.

(1993). Nonverbal indices of arousal in group psychotherapy. Psychotherapy: Theory,

Research, Practice, Training, 30(4), 635–645. doi:10.1037/0033-3204.30.4.635

Butler, J. M., & Haigh, G. V. (1954). Changes in the relation between self-concepts and ideal

concepts consequent upon client-centered counseling. In C. R. Rogers & R. F. Dymond

(Eds.), Psychotherapy and personality change (pp. 55–75). Chicago, IL: University of

Chicago Press.


Butler, J. M., & Rice, L. (1960). Self-actualization, new experience, and psychotherapy.

University of Chicago Counseling Center Discussion Papers (Library), 6(12), 79–110.

Butler, J. M., Rice, L. N., & Wagstaff, A. K. (1962). On the naturalistic definition of variables:

An analogue of clinical analysis. In H. H. Strupp & L. Luborsky (Eds.), Research in

psychotherapy (pp. 178–205). Washington, DC: American Psychological Association.

doi:10.1037/10591-010

Byers, P. (1979). Biological rhythms as information channels in interpersonal communication

behavior. In S. Weitz (Ed.), Nonverbal communication: Readings with commentary

(pp. 398–418). New York, NY: Oxford University Press.

Cane, D. B., Olinger, L. J., Gotlib, I. H., & Kuiper, N. A. (1986). Factor structure of the

dysfunctional attitude scale in a student population. Journal of Clinical Psychology,

42(2), 307–309. doi:10.1002/1097-4679(198603)42:2<307::AID-JCLP2270420213>3.0.CO;2-J



Carryer, J. R., & Greenberg, L. S. (2010). Optimal levels of emotional arousal in experiential

therapy of depression. Journal of Consulting and Clinical Psychology, 78(2), 190–199.

doi:10.1037/a0018401

Clarke, K. M. (1989). Creation of meaning: An emotional processing task in psychotherapy.

Psychotherapy: Theory, Research, Practice, Training, 26(2), 139–148.

doi:10.1037/h0085412

Clarkin, J. F., & Levy, K. N. (2004). The influence of client variables on psychotherapy. In

M. J. Lambert (Ed.), Bergin and Garfield's handbook of psychotherapy and behavior

change (pp. 194–226). New York, NY: John Wiley & Sons.

Cohen, J. (1960). A coefficient of agreement for nominal scales. Educational and Psychological

Measurement, 20, 37–46. doi:10.1177/001316446002000104

Daldrup, R. J., Beutler, L. E., Engle, D., & Greenberg, L. S. (1988). Focused expressive

psychotherapy: Freeing the overcontrolled patient. New York, NY: Guilford.

Davis, M., & Hadicks, D. (1990). Nonverbal behavior and client state changes during

psychotherapy. Journal of Clinical Psychology, 46(3), 340–351. doi:10.1002/1097-4679(199005)46:3<340::AID-JCLP2270460315>3.0.CO;2-1

Delucchi, K. L., & Bostrom, A. (2004). Methods for analysis of skewed data distributions in

psychiatric clinical studies: Working with many zero values. The American Journal of

Psychiatry, 161(7), 1159–1168. doi:10.1176/appi.ajp.161.7.1159

Derogatis, L. R., Rickels, K., & Rock, A. F. (1976). The SCL-90 and the MMPI: A step in the

validation of a new self-report scale. British Journal of Psychiatry, 128(3), 280–289.

doi:10.1192/bjp.128.3.280


Diamond, G. M., Rochman, D., & Amir, O. (2010). Arousing primary vulnerable emotions in the

context of unresolved anger: “Speaking about” versus “speaking to”. Journal of

Counseling Psychology, 57(4), 402–410. doi:10.1037/a0021115

Dobson, K. S., & Breiter, H. J. (1983). Cognitive assessment of depression: Reliability and

validity of three measures. Journal of Abnormal Psychology, 92(1), 107–109.

doi:10.1037/0021-843X.92.1.107

Duncan, S., Rice, L. N., & Butler, J. M. (1968). Therapists’ paralanguage in peak and poor

psychotherapy hours. Journal of Abnormal Psychology, 73(6), 566–570.

doi:10.1037/h0026597

Egan, G. (1998). The skilled helper: A problem-management approach to helping (6th ed.).

Pacific Grove, CA: Brooks/Cole.

Ekman, P., & Friesen, W. V. (1969). The repertoire of nonverbal behaviour: Categories, origins,

usage, and coding. Semiotica, 1, 49–98.

Ellgring, H., & Scherer, K. R. (1996). Vocal indicators of mood change in depression. Journal of

Nonverbal Behavior, 20(2), 83.

Elliott, R. (1979). How clients perceive helper behaviors. Journal of Counseling Psychology,

26(4), 285–294. doi:10.1037/0022-0167.26.4.285

Elliott, R., Watson, J., Goldman, R., & Greenberg, L. S. (2004). Learning emotion-focused

therapy: The process-experiential approach to change. Washington, DC: American

Psychological Association. doi:10.1037/10725-000

Field, A. P. (2009). Discovering statistics using SPSS: (and sex and drugs and rock 'n' roll).

Thousand Oaks, CA: Sage.


First, M. B., Spitzer, R. L., Gibbon, M., & Williams, J. B. W. (1997). Structured clinical

interview for DSM-IV Axis I disorders (SCID-I): Clinician version. Washington, DC:

American Psychiatric Press.

Fosha, D. (2000). The transforming power of affect: A model for accelerated change. New York,

NY: Basic Books.


Gillies, L. A. (1990). An application of the client vocal quality system to three short term

psychotherapies carried out by the Mount Zion Psychotherapy Research Group

(Unpublished doctoral dissertation). York University, Toronto, Ontario, Canada.

Goldfried, M. R., Castonguay, L. G., Hayes, A. M., Drozd, J. F., & Shapiro, D. A. (1997). A

comparative analysis of the therapeutic focus in cognitive–behavioral and

psychodynamic–interpersonal sessions. Journal of Consulting and Clinical Psychology,

65(5), 740–748. doi:10.1037/0022-006X.65.5.740

Goldstein, B. (2002). Intensity of expressed emotion in process-experiential and cognitive-

behavioral treatment for depression. (Unpublished master's thesis). OISE/University of

Toronto, Toronto, Ontario, Canada.

Green, S. B., & Salkind, N. J. (2004). Using SPSS for Windows and Macintosh: Analyzing and

understanding data (4th ed.). Upper Saddle River, NJ: Pearson Prentice Hall.

Greenberg, L. S. (1979). Resolving splits: Use of the two chair technique. Psychotherapy:

Theory, Research & Practice, 16(3), 316–324. doi:10.1037/h0085895

Greenberg, L. S. (1980). The intensive analysis of recurring events from the practice of Gestalt

therapy. Psychotherapy: Theory, Research & Practice, 17(2), 143–152.

doi:10.1037/h0085904


Greenberg, L. S. (1983). Toward a task analysis of conflict resolution in Gestalt therapy.

Psychotherapy: Theory, Research & Practice, 20(2), 190–201. doi:10.1037/h0088490

Greenberg, L. S. (1984). A task analysis of intrapersonal conflict resolution. In L. Rice &

L. Greenberg (Eds.), Patterns of change: Intensive analysis of psychotherapy process

(pp. 67–123). New York, NY: Guilford Press.

Greenberg, L. S., & Johnson, S. M. (1988). Emotionally focused therapy for couples. New York,

NY: Guilford Press.


Greenberg, L. S., & Malcolm, W. (2002). Resolving unfinished business: Relating process to

outcome. Journal of Consulting and Clinical Psychology, 70(2), 406–416.

doi:10.1037/0022-006X.70.2.406

Greenberg, L. S., & Paivio, S. C. (1997). Working with emotions in psychotherapy. New York,

NY: Guilford Press.


Greenberg, L. S., Rice, L. N., & Elliott, R. (1993). Facilitating emotional change: The moment-

by-moment process. New York, NY: Guilford Press.

Greenberg, L. S., & Watson, J. (1998). Experiential therapy of depression: Differential effects of

client-centered relationship conditions and process experiential interventions.

Psychotherapy Research, 8(2), 210–224. doi:10.1093/ptr/8.2.210

Greenberger, E., Chen, C., Dmitrieva, J., & Farruggia, S. P. (2003). Item-wording and the

dimensionality of the Rosenberg self-esteem scale: Do they matter? Personality and

Individual Differences, 35(6), 1241–1254. doi:10.1016/S0191-8869(02)00331-8

Greene, M. C. L. (1964). The voice and its disorders. London, England: Pitman Medical.



Gregory, S. W., Jr., Green, B. E., Carrothers, R. M., Dagan, K. A., & Webster, S. T. (2001).

Verifying the primacy of voice fundamental frequency in social status accommodation.

Language and Communication, 21(1), 37–60. doi:10.1016/S0271-5309(00)00011-2

Gregory, S. W., & Webster, S. (1996). A nonverbal signal in voices of interview partners

effectively predicts communication accommodation and social status perceptions.

Journal of Personality and Social Psychology, 70(6), 1231–1240.

doi:10.1037/0022-3514.70.6.1231

Hagenaars, M. A., & van Minnen, A. (2005). The effect of fear on paralinguistic aspects of

speech in patients with panic disorder with agoraphobia. Journal of Anxiety Disorders,

19(5), 521–537. doi:10.1016/j.janxdis.2004.04.008

Hamilton, M. (1960). A rating scale for depression. Journal of Neurology, Neurosurgery, and

Psychiatry, 23(1), 56–62.

Henry, W. P. (1997). The circumplex in psychotherapy research. In R. Plutchik & H. R. Conte

(Eds.), Circumplex models of personality and emotions (pp. 385–410). Washington, DC:

American Psychological Association.

Henry, W. P., Schacht, T. E., & Strupp, H. H. (1986). Structural analysis of social behavior:

Application to a study of interpersonal process in differential psychotherapeutic outcome.

Journal of Consulting and Clinical Psychology, 54(1), 27–31.

doi:10.1037/0022-006X.54.1.27

Henry, W. P., & Strupp, H. H. (1994). The therapeutic alliance as interpersonal process. In

A. O. Horvath & L. S. Greenberg (Eds.), The working alliance: Theory, research,

and practice (pp. 51–84). New York, NY: Wiley.

Heppner, P. P. (1988). The problem solving inventory: Manual. Palo Alto, CA: Consulting

Psychologists Press.

Heppner, P. P., Cook, S. W., Wright, D. M., & Johnson, W. C. (1995). Progress in resolving

problems: A problem-focused style of coping. Journal of Counseling Psychology, 42(3),

279–293. doi:10.1037/0022-0167.42.3.279

Hofmann, S. G., & Asmundson, G. J. G. (2008). Acceptance and mindfulness-based therapy:

New wave or old hat? Clinical Psychology Review, 28(1), 1–16.

doi:10.1016/j.cpr.2007.09.003

Horowitz, L. M., Rosenberg, S. E., Baer, B. A., Ureño, G., & Villaseñor, V. S. (1988). Inventory

of interpersonal problems: Psychometric properties and clinical applications. Journal of

Consulting and Clinical Psychology, 56(6), 885–892. doi:10.1037//0022-006X.56.6.885

House, A. E., House, J., & Campbell, M. B. (1981). Measures of interobserver agreement:

Calculation formulas and distribution effects. Journal of Psychopathology and

Behavioral Assessment, 3(1), 37–57. doi:10.1007/BF01321350

Hunt, M. G. (1998). The only way out is through: Emotional processing and recovery after a

depressing life event. Behaviour Research and Therapy, 36(4), 361–384.

doi:10.1016/S0005-7967(98)00017-5

Iwakabe, S., Rogan, K., & Stalikas, A. (2000). The relationship between client emotional

expressions, therapist interventions, and the working alliance: An exploration of eight

emotional expression events. Journal of Psychotherapy Integration, 10(4), 375–401.

doi:10.1023/A:1009479100305

Johnson, S. (2005). Attachment theory and emotionally focused therapy for individuals and

couples: Perfect partners. In J. Obegi & E. Berant (Eds.), Attachment theory and research

in clinical work with adults (pp. 410–433). New York, NY: Guilford Press.

Josefowitz, N., & Myran, D. (2005). Towards a person-centred cognitive behaviour therapy.

Counselling Psychology Quarterly, 18(4), 329–336. doi:10.1080/09515070500473600

Kappas, A., Hess, U., & Scherer, K. R. (1991). Voice and emotion. In R. Feldman & B. Rimé

(Eds.), Fundamentals of nonverbal behavior (pp. 200–238). New York, NY: Cambridge

University Press. Retrieved from

http://search.proquest.com.myaccess.library.utoronto.ca/docview/618053926?accountid=14771

Kennedy-Moore, E., & Watson, J. C. (1999). Expressing emotion: Myths, realities, and

therapeutic strategies. New York, NY: Guilford Press.


Kerr, G. P. (1980). The relation of therapist vocal quality to client outcome: A pilot study

(Unpublished master's thesis). York University, Toronto, Ontario, Canada.

Kerr, G. P. (1983). Therapist vocal quality and psychotherapeutic effectiveness (Unpublished

doctoral dissertation). York University, Toronto, Ontario, Canada.

Kiesler, D. J. (1979). An interpersonal communication analysis of relationship in psychotherapy.

Psychiatry: Journal for the Study of Interpersonal Processes, 42(4), 299–311. Retrieved

from http://search.proquest.com/docview/616471679?accountid=14771

Kiesler, D. J., & Auerbach, S. M. (2003). Integrating measurement of control and affiliation in

studies of physician-patient interaction: The interpersonal circumplex. Social Science &

Medicine, 57(9), 1707–1722. doi:10.1016/S0277-9536(02)00558-0

Kinseth, L. M. (1989). Nonverbal training for psychotherapy. The Clinical Supervisor, 7(1),

5–25. doi:10.1300/J001v07n01_02

Klein, M., Mathieu-Coughlan, P., & Kiesler, D. (1986). The experiencing scales. In L.

Greenberg & W. Pinsof (Eds.), The psychotherapeutic process: A research handbook

(pp. 21–71). New York, NY: Guilford Press.


Knapp, M. L., & Hall, J. A. (2010). The effects of vocal cues that accompany spoken words (pp.

367-399). Nonverbal Communication in Human Interaction. (to be completed)

Landis, J. R., & Koch, G. G. (1977). The measurement of observer agreement for categorical

data. Biometrics, 33, 159–174.

Lashley, K. S. (1951). The problem of serial order in behavior. In K. S. Lashley (Ed.), Cerebral

mechanisms in behavior: The Hixon symposium (pp. 112–146). Oxford, England: Wiley.

Retrieved from http://search.proquest.com/docview/615230857?accountid=14771

Laver, J. (1980). The phonetic description of voice. Cambridge, England: Cambridge University

Press.

Lynch, J. J. (1979). The broken heart: The medical consequences of loneliness. New York, NY:

Basic Books.

Machado, P. (1992). Client's emotional arousal in therapy: Development of a rating scale.

Unpublished manuscript, University of California, Santa Barbara, CA.

Mackay, H. C., Barkham, M., Stiles, W. B., & Goldfried, M. R. (2002). Patterns of client

emotion in helpful sessions of cognitive-behavioral and psychodynamic-interpersonal

therapy. Journal of Counseling Psychology, 49(3), 376–380.

doi:10.1037/0022-0167.49.3.376

Marks, I. (1991). Emotional arousal as therapy: Activation vs. dissociation. European

Psychiatry, 6(4), 161–170.


Meservy, T. O., & Burgoon, J. K. (2008). Paralanguage. In W. Donsbach (Ed.), The international

encyclopedia of communication. Retrieved from

http://www.communicationencyclopedia.com.myaccess.library.utoronto.ca/subscriber/tocnode?id=g9781405131995_yr2012_chunk_g978140513199521_ss5-1

Mohr, D. C., Shoham-Salomon, V., Engle, D., & Beutler, L. E. (1991). The expression of anger

in psychotherapy for depression: Its role and measurement. Psychotherapy Research,

1(2), 124–134.


Moreno, J. K., Fuhriman, A., & Selby, M. J. (1993). Measurement of hostility, anger, and

depression in depressed and nondepressed subjects. Journal of Personality Assessment,

61(3), 511.

Morreale, S., Spitzberg, B., & Barge, J. (2007). Interpersonal competence: Developing skills. In

S. Morreale, B. Spitzberg, & J. Barge (Eds.), Human communication: Motivation,

knowledge, and skills (2nd ed., pp. 184–211). Belmont, CA: Thomson Wadsworth.

Moses, P. J. (1954). The voice of neurosis. New York, NY: Grune & Stratton.

Murray, E., & Segal, D. (1994). Emotional processing in vocal and written expression of feelings

about traumatic experiences. Journal of Traumatic Stress, 7(3), 391–405.

doi:10.1007/BF02102784


Nass, M. L. (1971). Some considerations of a psychoanalytic interpretation of music. The

Psychoanalytic Quarterly, 40(2), 303–316.


Niederland, W. C. (1958). Early auditory experiences, beating fantasies and the primal scene.

The Psychoanalytic Study of the Child, 13, 471–504.

Nixon, D. S. (1980). The relationships of primal therapy outcome with experiencing, voice

quality and transference (Unpublished doctoral dissertation). York University, Toronto,

Ontario, Canada.

Ohala, J. J. (1994). The frequency code underlies the sound-symbolic use of voice pitch.

In L. Hinton, J. Nichols, & J. Ohala (Eds.), Sound symbolism (pp. 325–347). Cambridge,

England: Cambridge University Press.

Oliver, J. M., & Baumgart, E. P. (1985). The dysfunctional attitude scale: Psychometric

properties and relation to depression in an unselected adult population. Cognitive Therapy

and Research, 9(2), 161–167. doi:10.1007/BF01204847

Ostwald, P. F. (1979). The sounds of emotional disturbance. In S. Weitz (Ed.), Nonverbal

communication: Readings with commentary (pp. 260–267). New York, NY: Oxford

University Press.

Ozdas, A., Shiavi, R. G., Silverman, S. E., Silverman, M. K., & Wilkes, D. M. (2004).

Investigation of vocal jitter and glottal flow spectrum as possible cues for depression and

near-term suicidal risk. IEEE Transactions on Bio-Medical Engineering, 51(9),

1530–1540. Retrieved from http://search.proquest.com/docview/66887419?accountid=14771

Pally, R. (2001). A primary role for nonverbal communication in psychoanalysis. Psychoanalytic

Inquiry, 21(1), 71–93.



Patel, S., Scherer, K. R., Bjorkner, E., & Sundberg, J. (2011). Mapping emotions into acoustic

space: The role of voice production. Biological Psychology, 87, 93–98.

Pearson, K. A., Watkins, E. R., & Mullan, E. G. (2010). Submissive interpersonal style mediates

the effect of brooding on future depressive symptoms. Behaviour Research and Therapy,

48(10), 966–973. doi:10.1016/j.brat.2010.05.029

Perls, F. S. (1969). Gestalt therapy verbatim. Lafayette, CA: Real People Press.

Perneger, T. V. (1998). What's wrong with Bonferroni adjustments. British Medical Journal,

316(7139), 1236–1238. Retrieved from http://www.jstor.org/stable/25178955

Peternelli, L. (1999). The relationship between emotionality and in-session therapeutic

phenomena. Dissertation Abstracts International: Section A. Humanities and Social

Sciences, 60(6-A), 1924.


Peveler, R. C., & Fairburn, C. G. (1990). Measurement of neurotic symptoms by self-report

questionnaire: Validity of the SCL-90R. Psychological Medicine: A Journal of Research

in Psychiatry and the Allied Sciences, 20(4), 873–879. doi:10.1017/S0033291700036576

Pincus, A. L., & Gurtman, M. B. (1995). The three faces of interpersonal dependency: Structural

analyses of self-report dependency measures. Journal of Personality and Social

Psychology, 69(4), 744–758. doi:10.1037/0022-3514.69.4.744

Raue, P. J., & Goldfried, M. R. (1994). The therapeutic alliance in cognitive-behavior therapy. In

A. Horvath & L. Greenberg (Eds.), The working alliance: Theory, research, and practice

(pp. 131–152). New York, NY: John Wiley & Sons.

Rice, L. N. (1965). Therapist's style of participation and case outcome. Journal of Consulting

Psychology, 29(2), 155–160. doi:10.1037/h0021926

Rice, L. N. (1980). Client vocal style and the description of therapeutic events. American

Psychological Association. North York: York University.

Rice, L. N., & Gaylin, N. L. (1973). Personality processes reflected in client vocal style and

Rorschach performance. Journal of Consulting and Clinical Psychology, 40(1), 133–138.

Rice, L. N., & Kerr, G. (1986). Measures of client and therapist vocal quality. In L. Greenberg &

W. Pinsof (Eds.), The psychotherapeutic process: A research handbook (pp. 73–105).

New York, NY: Guilford Press.

Rice, L. N., & Koke, C. J. (1981). Vocal style and the process of psychotherapy. In J. Darby

(Ed.), Speech evaluation in psychiatry (pp. 151–168). New York, NY: Grune & Stratton.

Rice, L. N., Koke, C. J., Greenberg, L. S., & Wagstaff, A. K. (1979). Manual for client vocal

quality. Unpublished manuscript, Counselling and Development Centre, York University,

Toronto, Ontario, Canada.

Rice, L. N., & Wagstaff, A. K. (1967). Client voice quality and expressive style as indexes of

productive psychotherapy. Journal of Consulting Psychology, 31(6), 557–563.

doi:10.1037/h0025164

Ritchie, M. M. (1998). Hyperphagia: The relationship between therapist vocal quality and levels

of client perceptual processing. Dissertation Abstracts International: Section B: The

Sciences and Engineering, 60(4-B), 1882.

Rochman, D., Diamond, G. M., & Amir, O. (2008). Unresolved anger and sadness: Identifying

vocal acoustical correlates. Journal of Counseling Psychology, 55(4), 505–517.

doi:10.1037/a0013720

Rogers, C. R. (1947). Some observations on the organization of personality. American

Psychologist, 2(9), 358–368. doi:10.1037/h0060883

Rogers, C. R. (1957). The necessary and sufficient conditions of therapeutic personality change.

Journal of Consulting Psychology, 21(2), 95–103.

Rosenberg, M. (1989). Society and the adolescent self-image (Rev. ed., 1st Wesleyan ed.).

Middletown, CT: Wesleyan University Press.

Rosner, R. (1996). The relationship between emotional expression, treatment and outcome in

psychotherapy: An empirical study. New York: Peter Lang.

Rosner, R., Beutler, L. E., & Daldrup, R. J. (2000). Vicarious emotional experience and

emotional expression in group psychotherapy. Journal of Clinical Psychology, 56(1),

1–10. doi:10.1002/(SICI)1097-4679(200001)56:1<1::AID-JCLP1>3.0.CO;2-7

Ross, M. (2002). The dynamic duet: The reciprocal process of voice modulation in the

psychotherapeutic encounter. Dissertation Abstracts International: Section B: The

Sciences and Engineering, 60(9-B), 4234.


Safran, J. D., & Muran, J. C. (1996). The resolution of ruptures in the therapeutic alliance.

Journal of Consulting and Clinical Psychology, 64(3), 447–458. doi:10.1037/0022-006X.64.3.447

Safran, J. D., Muran, J. C., & Samstag, L. W. (1994). Resolving therapeutic alliance ruptures: A

task analytic investigation. In A. O. Horvath & L. S. Greenberg (Eds.), The

working alliance: Theory, research, and practice (pp. 225–255). Oxford, England: John

Wiley & Sons.


Samoilov, A., & Goldfried, M. R. (2000). Role of emotion in cognitive-behavior therapy.

Clinical Psychology: Science and Practice, 7(4), 373–385. doi:10.1093/clipsy/7.4.373

Sarnat, J. E. (1976). A comparison of psychodynamic and client-centered measures of initial in-

therapy patient participation (Unpublished doctoral dissertation). University of

Michigan, Ann Arbor. Available from ProQuest Dissertations and Theses database. (UMI

No. 76-9504)

Scherer, K. R. (1986). Voice, stress, and emotion. In M. H. Appley & R. Trumbull (Eds.),

Dynamics of stress: Physiological, psychological, and social perspectives

(pp. 157–179). New York, NY: Plenum Press.

Scherer, K. R., Banse, R., & Wallbott, H. G. (2001). Emotion inferences from vocal expression

correlate across languages and cultures. Journal of Cross-Cultural Psychology, 32(1),

76–92. doi:10.1177/0022022101032001009

Scherer, K. R., & Oshinsky, J. S. (1977). Cue utilization in emotion attribution from auditory

stimuli. Motivation and Emotion, 1(4), 331–346. doi:10.1007/BF00992539

Scherer, K. R., & Zei, B. (1988). Vocal indicators of affective disorders. Psychotherapy and

Psychosomatics, 49(3–4), 179–186. doi:10.1159/000288082

Schmitz, N., Hartkamp, N., & Franke, G. H. (2000). Assessing clinically significant change:

Application to the SCL–90–R. Psychological Reports, 86(1), 263–274.

doi:10.2466/PR0.86.1.263-274


Schneider, R. B., Huprich, S. K., & Fuller, K. M. (2008). The Rorschach and the Inventory of

Interpersonal Problems, IIP-64. Rorschachiana, 29, 3–24. doi:10.1027/1192-5604.29.1.3

Schore, A. N. (1994). Affect regulation and the origin of the self: The neurobiology of emotional

development. Hillsdale, NJ: Lawrence Erlbaum.


Schwartz, J. (1996, July 22). Voices say more than mere words: Tone tells perception of others,

study finds. The Washington Post (pre-1997 Fulltext), p. A.04.


Spitzer, R. L., Endicott, J., Fleiss, J. L., & Cohen, J. (1970). The psychiatric status schedule: A

technique for evaluating psychopathology and impairment in role functioning. Archives

of General Psychiatry, 23(1), 41–55.


Stern, D. N. (1990). Diary of a baby. New York, NY: Basic Books.

Stetson, R. H. (1905). A motor theory of rhythm and discrete succession: I. Psychological

Review, 12(4), 250–270. doi:10.1037/h0071810

Stetson, R. H. (1951). Motor phonetics: A study of speech movements in action (Vol. 2).

Amsterdam, The Netherlands: North-Holland for Oberlin College.

Stone, L. (1961). The psychoanalytic situation: An examination of its development and essential

nature. New York, NY: International Universities Press.

Tabachnick, B. G., & Fidell, L. S. (2007). Using multivariate statistics. Boston, MA:

Pearson/Allyn & Bacon.

Tepper, D. T., & Haas, R. E. (1978). Verbal and nonverbal communication of facilitative

conditions. Journal of Counseling Psychology, 25, 35–44.




Tingey, R. C. (1989). Assessing clinical significance: Extensions in method and application to

the SCL-90-R. Dissertation Abstracts International, 50(4-B), 1659.


Torrey, W. C., Mueser, K. T., McHugo, G. H., & Drake, R. E. (2000). Self-esteem as an

outcome measure in studies of vocational rehabilitation for adults with severe mental

illness. Psychiatric Services, 51(2), 229–233. doi:10.1176/appi.ps.51.2.229

Toukmanian, S. (1994). Levels of client perceptual processing: A measure for psychotherapy

process research. Unpublished manuscript, York University, Toronto, Ontario, Canada.

Truax, C. B., & Carkhuff, R. R. (1967). Toward effective counseling and psychotherapy:

Training and practice. Hawthorne, NY: Aldine.


Vognsen, J. P. (1969). Need for new experience: An explanatory bridge between client vocal

style and outcome of psychotherapy (Unpublished doctoral dissertation). University of

Chicago, IL. Available from Microfilm (23814).

von Eye, A., & Mun, E. Y. (2005). Analyzing rater agreement: Manifest variable methods.

Mahwah, NJ: Lawrence Erlbaum.


Warwar, S. H., & Greenberg, L. S. (1999). Client Emotional Arousal Scale–III. Unpublished

manuscript, York University, Toronto, Ontario, Canada.

Watson, J. C., & Bedard, D. (2006). Clients’ emotional processing in psychotherapy: A

comparison between cognitive-behavioral and process-experiential psychotherapy.

Journal of Consulting and Clinical Psychology, 74(1), 152–159. doi:10.1037/0022-006X.74.1.152




Watson, J. C., & Geller, S. M. (2005). The relation among the relationship conditions, working

alliance, and outcome in both process–experiential and cognitive–behavioral

psychotherapy. Psychotherapy Research, 15(1–2), 25–33.

doi:10.1080/10503300512331327010

Watson, J. C., Goldman, R. N., & Greenberg, L. S. (2007). Case studies in emotion-focused

treatment of depression: A comparison of good and poor outcome. Washington, DC:

American Psychological Association. doi:10.1037/11586-000

Watson, J. C., Gordon, L. B., Stermac, L., Kalogerakos, F., & Steckley, P. (2003). Comparing

the effectiveness of process–experiential with cognitive–behavioral psychotherapy in the

treatment of depression. Journal of Consulting and Clinical Psychology, 71(4),

773–781. doi:10.1037/0022-006X.71.4.773

Watson, J. C., & Greenberg, L. S. (1996). Pathways to change in the psychotherapy of

depression: Relating process to session change and outcome. Psychotherapy: Theory,

Research, Practice, Training, 33(2), 262–274. doi:10.1037/0033-3204.33.2.262

Watson, J. C., Greenberg, L. S., Rice, L. N., & Gordon, L. B. (1996). Client task-specific change

measure-revised. Unpublished manuscript, Department of Adult Education and

Counselling Psychology, OISE/University of Toronto, Ontario, Canada.

Watson, J. C., Schein, J., & McMullen, E. (2010). An examination of clients’ in-session changes

and their relationship to the working alliance and outcome. Psychotherapy Research,

20(2), 224–233. doi:10.1080/10503300903311285

Wei, M., Heppner, P. P., & Mallinckrodt, B. (2003). Perceived coping as a mediator between

attachment and psychological distress: A structural equation modeling approach. Journal

of Counseling Psychology, 50(4), 438–447. doi:10.1037/0022-0167.50.4.438

Weissman, A. N., & Beck, A. T. (1978, August). Development and validation of the

Dysfunctional Attitude Scale: A preliminary investigation. Paper presented at the 86th

Annual Convention of the American Psychological Association, Toronto, Ontario,

Canada.

Wexler, D. A. (1974). Self-actualization and cognitive processes. Journal of Consulting and

Clinical Psychology, 42(1), 47–53. doi:10.1037/h0036034

Wexler, D. A., & Butler, J. M. (1976). Therapist modification of client expressiveness in client-

centered therapy. Journal of Consulting and Clinical Psychology, 44(2), 261–265.

doi:10.1037/0022-006X.44.2.261

Wiseman, H., & Rice, L. N. (1989). Sequential analyses of therapist–client interaction during

change events: A task-focused approach. Journal of Consulting and Clinical Psychology,

57(2), 281–286. doi:10.1037//0022-006X.57.2.281

Wiser, S., & Goldfried, M. R. (1998). Therapist interventions and client emotional experiencing

in expert psychodynamic–interpersonal and cognitive–behavioral therapies. Journal of

Consulting and Clinical Psychology, 66(4), 634–640. doi:10.1037/0022-006X.66.4.634

Woodward, L. E., Murrell, S. A., & Bettler, R. F. (2005). Stability, reliability, and norms for the

inventory of interpersonal problems. Psychotherapy Research, 15(3), 272–286.

doi:10.1080/10503300512331334977

Wrye, H. K. (1997). Voice of the analyst: The body/mind dialectic within the psychoanalytic

subject: Finding the analyst’s voice. The American Journal of Psychoanalysis, 57(4),

359–369.

ZYTO. (2012). ZYTO EVOX: How perception reframing can improve your health, relationships

& performance. Retrieved from www.zyto.com/evox.html

Appendix A

Ranges for Interpretation of Statistics

Percent Agreement from House, House, & Campbell, 1981, p. 46.

> 70% necessary, > 80% adequate, > 90% good

Guideline for evaluating agreement with Cohen’s κ from Landis & Koch

(1977) in von Eye & Mun, 2005, p. 6.

κ < 0.00 poor agreement

0.00 ≤ κ ≤ 0.20 slight

0.21 ≤ κ ≤ 0.40 fair

0.41 ≤ κ ≤ 0.60 moderate

0.61 ≤ κ ≤ 0.80 substantial

0.81 ≤ κ ≤ 1.00 almost perfect agreement

Strength of a correlation coefficient (r or ICC) adapted from Green & Salkind,

2004, p. 256.

< .30 is weak; .30–.50 is moderate; > .50 is strong

Strength of Effect Size (r²) (based on Green & Salkind, 2004, p. 256)

.00–.09 is weak; .09–.25 is moderate; > .25 is strong
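
For illustration only, the interpretation ranges in this appendix can be written as small lookup functions. The Python sketch below is not part of the original analyses; the function names are hypothetical, and the handling of boundary values is an assumption: κ values falling between two listed ranges (e.g., 0.205) are assigned to the lower label, a correlation of exactly .30 and an r² of exactly .09 are treated as moderate, and percent agreement at or below 70% is labelled as below the necessary level.

```python
def interpret_kappa(kappa: float) -> str:
    """Label Cohen's kappa using the Landis & Koch (1977) ranges listed above."""
    if kappa < 0.0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


def interpret_correlation(r: float) -> str:
    """Label the strength of r (or ICC) by its absolute value."""
    magnitude = abs(r)
    if magnitude < 0.30:
        return "weak"
    if magnitude <= 0.50:
        return "moderate"
    return "strong"


def interpret_effect_size(r_squared: float) -> str:
    """Label the strength of an r-squared effect size."""
    if r_squared < 0.09:
        return "weak"
    if r_squared <= 0.25:
        return "moderate"
    return "strong"


def interpret_percent_agreement(percent: float) -> str:
    """Label interrater percent agreement (0-100 scale)."""
    if percent > 90:
        return "good"
    if percent > 80:
        return "adequate"
    if percent > 70:
        return "necessary"
    return "below the necessary level"


if __name__ == "__main__":
    print(interpret_kappa(0.65))
    print(interpret_correlation(-0.45))
    print(interpret_effect_size(0.30))
    print(interpret_percent_agreement(85))
```

Taking the absolute value in `interpret_correlation` reflects the assumption that the ranges describe strength regardless of the direction of the association.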

Appendix B

Spearman’s Rho Correlations Between Outcome Measures

Note. Acronyms refer to the following measures: BDI = Beck Depression Inventory; RSE = Rosenberg Self-Esteem; DAS = Dysfunctional Attitudes Scale; DAS Perf = DAS Perfectionism

subscale; DAS SA = DAS Social Anxiety subscale. Regarding the Inventory of Interpersonal Problems (IIP): Circumplex = IIP Circumplex Total. The following are the IIP subscale names:

DC = Domineering; VC = Vindictive Self-Centered; CD=Cold Distant; SI = Socially Inhibited; NA = Nonassertive; OA = Overly Accommodating; SS = Self-Sacrificing; IN = Intrusive.

Symptom Checklist 90 Revised (SCL-90-R) indices are as follows: GT = Grand Total; GSI = Global Severity Index; PST = Positive Symptom Total; PSDI = Positive Symptom Distress

Index. SCL-90-R subscales include: DEP TOT = Depression Total; DEP MN = Depression Mean. Levels of the Problem-Focused Style of Coping scale include: REFLEC = Reflective Style of

Coping; SUPP = Suppressive Style of Coping; REACT = Reactive Style of Coping.

Appendix C


Skewness/SE 23.2 15.86 82.45 -12.74

Kurtosis/SE 87.4 47.61 41.77 27.35

Figure C1. Boxplots of CVQ categories in sessions with the

lowest change score (N = 61). Skewness and kurtosis values

are calculated by dividing the statistic by its standard error. The

distribution is non-normal if one of these values exceeds ± 2.00

(SPSS).
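
The normality screen described in the figure caption can be sketched in code. The Python example below is illustrative only and was not used in the thesis, which relied on SPSS output. It assumes the bias-adjusted sample skewness and excess kurtosis formulas, with their standard errors, as documented for SPSS descriptive statistics; the function names and the example data are hypothetical.

```python
import math


def skew_kurtosis_ratios(values):
    """Return (skewness / SE, kurtosis / SE) for a list of numbers.

    Assumes the bias-adjusted sample formulas; needs at least 4 values
    and non-zero variance.
    """
    n = len(values)
    if n < 4:
        raise ValueError("at least 4 observations are required")
    mean = sum(values) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))
    if sd == 0:
        raise ValueError("values must not all be identical")
    z = [(x - mean) / sd for x in values]

    # Bias-adjusted skewness and excess kurtosis.
    skew = n / ((n - 1) * (n - 2)) * sum(v ** 3 for v in z)
    kurt = (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))) * sum(v ** 4 for v in z)
    kurt -= 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))

    # Standard errors depend only on the sample size.
    se_skew = math.sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))
    se_kurt = 2 * se_skew * math.sqrt((n * n - 1) / ((n - 3) * (n + 5)))
    return skew / se_skew, kurt / se_kurt


def looks_non_normal(values, cutoff=2.0):
    """Flag a distribution when either ratio falls outside +/- cutoff."""
    skew_ratio, kurt_ratio = skew_kurtosis_ratios(values)
    return abs(skew_ratio) > cutoff or abs(kurt_ratio) > cutoff


if __name__ == "__main__":
    # Hypothetical proportions: most clients at 0, one client at 1.0.
    proportions = [0.0] * 20 + [1.0]
    print(skew_kurtosis_ratios(proportions))
    print(looks_non_normal(proportions))
```

Proportions concentrated near 0 (or near 1), as in the CVQ categories, produce large ratios of this kind, which is consistent with the values reported beneath the boxplots.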


Skewness/SE 10.50 8.05 9.31 -8.26

Kurtosis/SE 17.72 9.31 12.38 19.78

Figure C2. Boxplots of CVQ categories in sessions with the

highest change score (N = 61). Skewness and kurtosis values are

calculated by dividing the statistic by its standard error. The

distribution is non-normal if one of these values exceeds ± 2.00

(SPSS).


Skewness/SE 8.80 7.34 8.74 -7.99

Kurtosis/SE 1.62 8.88 12.39 13.11

Figure C3. Boxplots of CVQ categories in first report of moderate to

high change session (N = 58). Skewness and kurtosis values are

calculated by dividing the statistic by its standard error. The distribution is

non-normal if one of these values exceeds ± 2.00 (SPSS).

Appendix D

Means, Standard Deviations, and Medians for CVQ Categories in the Session with the Lowest

Change Score and the Session with the Highest Change Score

CVQ Category /              CBT (n = 33 Clients)   PET (n = 28 Clients)   Total (N = 61 Clients)
Session with the…           M     SD    Mdn        M     SD    Mdn        M     SD    Mdn

Emotional
  lowest change score       .03   .14   .00        .01   .04   .00        .02   .11   .00
  highest change score      .00   .01   .00        .01   .03   .00        .01   .02   .00
  Lowest + Highest          .01   .07   .00        .01   .02   .00        .01   .05   .00
Focused
  Lowest + Highest          .01   .02   .00        .02   .03   .00        .01   .02   .00
Emotional + Focused
  Lowest + Highest          .02   .07   .00        .03   .04   .01        .03   .06   .01
Limited
  Lowest + Highest          .04   .09   .01        .04   .08   .01        .04   .08   .01
Externalizing
  lowest change score       .93   .20   1.00       .93   .09   .97        .93   .16   .98
  Lowest + Highest          .94   .12   .99        .93   .09   .95        .93   .11   .98

Appendix E

Means, Standard Deviations, and Medians for Treatment Groups by CVQ Category in the First Report of

Moderate to High Change Session

CVQ Category          CBT (n = 30 Clients)   PET (n = 28 Clients)   Total (N = 58 Clients)
                      M     SD    Mdn        M     SD    Mdn        M     SD    Mdn

Emotional             .01   .02   .00        .02   .04   .00        .01   .03   .00
Focused               .02   .05   .00        .02   .04   .00        .02   .04   .00
Emotional + Focused   .03   .05   .00        .04   .06   .01        .03   .06   .00
Limited               .04   .10   .00        .06   .11   .01        .05   .10   .00
Externalizing         .93   .14   1.00       .90   .12   .96        .92   .13   .98

Appendix F

Table F1

Results for Hypothesis 2a: Standardized (β) and Unstandardized (B) Regression Coefficients, their

Standard Errors, and p values for Emotional Vocal Quality in the Session with the Highest Change

Score

Outcome Measures                              Nᵃ    B         SE       β      p value
Beck Depression Inventory                     61    -.92      60.17    .00    .988
Dysfunctional Attitudes Scale                 54    -103.51   234.92   -.05   .661
General Symptom Index of the SCL-90-R         55    -2.09     3.39     -.07   .539
IIPᵇ-Circumplex Total                         55    -1.83     2.76     -.07   .511
IIP subscale-Vindictive/Self-Centeredᶜ        54    -.81      2.53     -.04   .750
IIP subscale-Cold Distant                     55    -.88      3.12     -.03   .779
IIP subscale-Socially Avoidant                55    -4.46     4.10     -.10   .281
IIP subscale-Nonassertive                     55    -5.52     4.99     -.11   .274
IIP subscale-Overly Accommodatingᶜ            54    -.74      3.82     -.02   .847
IIP subscale-Self-Sacrificing                 55    1.35      4.01     .04    .738
IIP subscale-Intrusive-Needyᶜ                 54    -1.80     2.31     -.07   .439
IIP subscale-Domineering-Controlling          55    -2.83     2.00     -.16   .164
Problem-Focused Style of Coping-Reactive      54    -3.98     6.00     -.09   .510
Problem-Focused Style of Coping-Reflective    54    3.11      4.54     .06    .497
Problem-Focused Style of Coping-Suppressive   54    -3.93     7.05     -.07   .580
Rosenberg Self-Esteem Scale                   56    -25.57    40.62    -.07   .532

Note. ᵃ Different sample sizes are the result of missing data for some clients and/or the removal of outlier

cases. ᵇ IIP is Inventory of Interpersonal Problems. ᶜ Indicates the removal of one outlier case.

Table F2


Standard Errors, and p values Focused Vocal Quality in the Session with the Highest Change

Score


Outcome Measures                              Nᵃ    B         SE       β      p value
Beck Depression Inventory                     61    -5.37     31.81    -.02   .866
Dysfunctional Attitudes Scale                 54    52.65     95.67    .06    .584
General Symptom Index of the SCL-90-R         55    -1.34     1.95     -.08   .494
IIPᵇ-Circumplex Total                         55    -.02      1.46     .00    .990
IIP subscale-Vindictive/Self-Centeredᶜ        54    .50       1.32     .04    .707
IIP subscale-Cold Distant                     55    -.43      1.64     -.03   .795
IIP subscale-Nonassertive                     55    1.53      2.65     .06    .566
IIP subscale-Overly Accommodatingᶜ            54    1.13      2.01     .00    .578
IIP subscale-Self-Sacrificing                 55    .48       2.11     .02    .820
IIP subscale-Domineering-Controlling          55    -.22      1.08     -.02   .842
Problem-Focused Style of Coping-Reactive      54    1.83      2.72     .09    .503
Problem-Focused Style of Coping-Reflective    54    1.09      2.07     .05    .600
Problem-Focused Style of Coping-Suppressive   54    .48       3.20     .02    .881
Rosenberg Self-Esteem Scale                   56    -8.76     21.51    -.05   .685



Table F3


Standard Errors, and p values Emotional Plus Focused Vocal Quality in the Session with the

Highest Change Score




Outcome Measures                              Nᵃ    B         SE       β      p value
General Symptom Index of the SCL-90-R         55    -1.12     1.44     -.09   .442
IIPᵇ-Circumplex Total                         55    -.31      1.13     -.03   .781
IIP subscale-Vindictive/Self-Centeredᶜ        54    .17       1.02     -.02   .872
IIP subscale-Cold Distant                     55    -.40      1.27     -.03   .753
IIP subscale-Nonassertive                     55    .00       2.06     .00    .999
IIP subscale-Overly Accommodatingᶜ            54    .55       1.55     .03    .752
IIP subscale-Self-Sacrificing                 55    .51       1.63     .03    .755
IIP subscale-Intrusive-Needyᶜ                 54    -1.02     .93      -.10   .276
IIP subscale-Domineering-Controlling          55    -.59      .83      -.08   .474
Problem-Focused Style of Coping-Reactive      54    .64       2.17     .04    .768
Problem-Focused Style of Coping-Reflective    54    1.11      1.65     .06    .505
Problem-Focused Style of Coping-Suppressive   54    -.20      2.55     -.01   .937




Appendix G

Table G1


Standard Errors, and p values Emotional Vocal Quality in the Session with the Lowest Change

Score



Outcome Measures                              Nᵃ    B         SE       β      p value
Dysfunctional Attitudes Scale                 54    -10.78    32.48    -.04   .741
General Symptom Index of the SCL-90-R         54    -.22      .61      -.04   .714
IIPᵇ-Circumplex Total                         55    -.33      .49      -.07   .501
IIP subscale-Vindictive/Self-Centeredᶜ        54    -.47      .44      -.12   .296
IIP subscale-Cold Distant                     55    -.27      .56      -.05   .630
IIP subscale-Socially Avoidant                55    -.79      .73      -.10   .284
IIP subscale-Nonassertive                     55    -.16      .90      -.02   .856
IIP subscale-Overly Accommodatingᶜ            54    -.08      .69      -.01   .911
IIP subscale-Self-Sacrificing                 55    -.57      .71      -.09   .425
IIP subscale-Intrusive-Needyᶜ                 54    .27       .41      .06    .516
Problem-Focused Style of Coping-Reactive      54    -.09      .94      -.01   .927
Problem-Focused Style of Coping-Reflective    54    -.27      .72      -.03   .713
Problem-Focused Style of Coping-Suppressive   54    -.09      1.09     -.01   .934
Rosenberg Self-Esteem Scale                   55    6.47      7.23     .10    .375




Appendix G

Table G2


Standard Errors, and p values for Focused Vocal Quality in the Session with the Lowest Change Score

Outcome Measure Na B SE β p value

General Symptom Index of the SCL-90-R 54 -.92 2.34 -.05 .696

IIPb-

IIP subscale-Vindictive/Self-Centeredc 54 -3.49 3.79 -.11 .362

IIP subscale-Cold Distant 55 -5.37 4.65 -.12 .254

IIP subscale-

IIP subscale-Nonassertive 55 -11.59 7.33 -.16 .120

IIP subscale-

IIP subscale-Self-Sacrificing 55 -12.84 5.81 -.23 .032

IIP subscale-

IIP subscale-

Problem-Focused Style of Coping-Reactive 54 5.47 3.46 .20 .120

Problem-Focused Style of Coping-Reflective 54 .66 2.67 .02 .806

Problem-Focused Style of Coping-Suppressive 54 -.37 4.15 -.01 .929


Note. aDifferent sample sizes are the result of missing data for some clients and/or the removal of outlier cases.



Appendix G

Table G3


Standard Errors, and p values for Emotional Plus Focused Vocal Quality in the Session with the Lowest Change Score

Outcome Measure Na B SE β p value

General Symptom Index of the SCL-90-R 54 -.26 .58 -.05 .655

IIPb-

IIP subscale-

IIP subscale-Cold Distant 55 -.33 .54 -.06 .548

IIP subscale-Socially Avoidant 55 -.82 .71 -.10 .253

IIP subscale-Nonassertive 55 -.31 .87 -.04 .723

IIP subscale-

IIP subscale-

IIP subscale-

IIP subscale-

Problem-Focused Style of Coping-Reactive 54 .27 .90 .04 .763

Problem-Focused Style of Coping-Reflective 54 -.20 .69 -.03 .774

Problem-Focused Style of Coping-Suppressive 54 -.11 1.04 -.01 .919



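Note. aDifferent sample sizes are the result of missing data for some clients and/or the removal of outlier cases. bIIP is Inventory of Interpersonal Problems. cIndicates the removal of one outlier case.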


Appendix H

Table H1


Standard Errors, and p values for Emotional Vocal Quality in the Session with the First Report of Moderate to High Change

Outcome Measure Na B SE β p value

Beck Depression Inventoryb 56 -54.52 24.71 -.29 .032

General Symptom Index of the SCL-90-R 52 -4.35 1.91 -.28 .027

IIPc-

IIP subscale-Vindictive/Self-Centeredd 51 -2.68 1.36 -.23 .055

IIP subscale-Cold Distant 52 -4.00 1.78 -.22 .029

IIP subscale-

IIP subscale-Nonassertive 52 -5.70 2.86 -.20 .052

IIP subscale-Overly Accommodatingd 51 -3.49 2.17 -.16 .115

IIP subscale-

IIP subscale-Intrusive-Needy 52 -2.18 1.30 -.16 .100

IIP subscale-

Problem-Focused Style of Coping-Reactive 51 -6.18 2.84 -.28 .035

Problem-Focused Style of Coping-Reflective 51 2.18 2.39 .09 .367

Problem-Focused Style of Coping-Suppressive 51 -4.27 3.30 -.17 .201



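Note. aDifferent sample sizes are the result of missing data for some clients and/or the removal of outlier cases. bIndicates the removal of two outlier cases. cIIP is Inventory of Interpersonal Problems. dIndicates the removal of one outlier case.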


Appendix H

Table H2


Standard Errors, and p values for Focused Vocal Quality in the Session with the First Report of Moderate to High Change

Outcome Measure Na B SE β p value

General Symptom Index of the SCL-90-R 52 -2.51 1.48 -.21 .095

IIPc-

IIP subscale-Vindictive/Self-Centeredd 51 -.33 1.08 -.04 .762

IIP subscale-Cold Distant 52 -2.41 1.39 -.18 .089

IIP subscale-Socially Avoidant 52 -.85 1.89 -.04 .653

IIP subscale-Nonassertive 52 -3.12 2.21 -.15 .164

IIP subscale-

IIP subscale-

IIP subscale-Intrusive-Needy 52 -1.04 1.00 -.10 .303

IIP subscale-Domineering-Controlling 52 .35 .91 .05 .699

Problem-Focused Style of Coping-Reactive 51 2.20 2.26 .13 .335

Problem-Focused Style of Coping-Reflective 51 4.08 1.70 .21 .021

Problem-Focused Style of Coping-Suppressive 51 -.97 2.73 -.05 .725



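Note. aDifferent sample sizes are the result of missing data for some clients and/or the removal of outlier cases. bIndicates the removal of two outlier cases. cIIP is Inventory of Interpersonal Problems. dIndicates the removal of one outlier case.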


Appendix H

Table H3


Standard Errors, and p values for Emotional Plus Focused Vocal Quality in the Session with the First Report of Moderate to High Change

Outcome Measure Na B SE β p value

General Symptom Index of the SCL-90-R 52 -3.13 1.11 -.33 .007

IIPc-Circumplex Total 52 -2.46 .90 -.28 .009

IIP subscale-Vindictive/Self-Centeredd 51 -1.18 .84 -.17 .165

IIP subscale-Cold Distant 52 -2.98 1.05 -.28 .007

IIP subscale-

IIP subscale-Nonassertive 52 -4.01 1.69 -.24 .022

IIP subscale-

IIP subscale-

IIP subscale-Intrusive-Needy 52 -1.44 .78 -.17 .069

IIP subscale-

Problem-Focused Style of Coping-Reactive 51 -.86 1.77 -.07 .628

Problem-Focused Style of Coping-Reflective 51 3.41 1.36 .23 .016

Problem-Focused Style of Coping-Suppressive 51 -2.18 2.05 -.14 .292



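Note. aDifferent sample sizes are the result of missing data for some clients and/or the removal of outlier cases. bIndicates the removal of two outlier cases. cIIP is Inventory of Interpersonal Problems. dIndicates the removal of one outlier case.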


Appendix I

Table I1


Standard Errors, and p values for Limited Vocal Quality in the Session with the First Report of Moderate to High Change

Outcome Measure Na B SE β p value

Dysfunctional Attitudes Scalec 48 -89.97 24.82 -.3 .001

General Symptom Index of the SCL-90-R 52 -.60 .66 -.12 .365

IIPd-Circumplex Total 52 -1.15 .50 -.24 .027

IIP subscale-Vindictive/Self-Centerede 51 .07 .47 .02 .876

IIP subscale-Cold Distant 52 -1.12 .59 -.19 .063

IIP subscale-Socially Avoidant 52 -1.10 .79 -.13 .173

IIP subscale-Nonassertive 52 -2.89 .89 -.31 .002

IIP subscale-Overly Accommodatinge 51 -2.00 .68 -.28 .005

IIP subscale-Self-Sacrificing 52 -2.08 .72 -.29 .005

IIP subscale-Intrusive-Needy 52 -.14 .44 -.03 .757

IIP subscale-

Problem-Focused Style of Coping-Reactive 51 -1.06 .99 -.14 .293

Problem-Focused Style of Coping-Reflective 51 1.54 .76 .18 .049

Problem-Focused Style of Coping-Suppressive 51 -.62 1.14 -.07 .593



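Note. aDifferent sample sizes are the result of missing data for some clients and/or the removal of outlier cases. bIndicates the removal of two outlier cases. cIndicates the removal of three outlier cases. dIIP is Inventory of Interpersonal Problems. eIndicates the removal of one outlier case.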


Appendix I

Table I2


Standard Errors, and p values for Externalizing Vocal Quality in the Session with the First Report of Moderate to High Change

Outcome Measure Na B SE β p value

Beck Depression Inventoryb 56 15.67 6.49 .32 .019

Dysfunctional Attitudes Scaleb 49 57.82 22.31 .26 .013

General Symptom Index of the SCL-90-Rc 51 1.13 .45 .31 .016

IIPd-Circumplex Total 52 1.16 .38 .31 .004

IIP subscale-Vindictive/Self-Centeredb 50 .32 .33 .12 .335

IIP subscale-Cold Distant 52 1.24 .45 .27 .008

IIP subscale-Socially Avoidant 52 1.10 .61 .17 .078

IIP subscale-Nonassertive 52 2.53 .68 .35 .001

IIP subscale-Overly Accommodatingc 51 1.87 .52 .33 .001

IIP subscale-Self-Sacrificing 52 2.07 .54 .37 .000

IIP subscale-Intrusive-Needy 52 .37 .35 .10 .290

IIP subscale-

Problem-Focused Style of Coping-Reactive 51 .84 .77 .14 .287

Problem-Focused Style of Coping-Reflective 51 -1.56 .58 -.24 .010

Problem-Focused Style of Coping-Suppressive 51 .82 .90 .12 .366



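Note. aDifferent sample sizes are the result of missing data for some clients and/or the removal of outlier cases. bIndicates the removal of two outlier cases. cIndicates the removal of one outlier case. dIIP is Inventory of Interpersonal Problems.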
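
The coefficient tables in these appendices report B, its standard error, and a p value, so a reader can spot-check any row by forming the ratio t = B / SE and looking up a two-tailed p value. The sketch below is an illustration only and is not the procedure used in the thesis. The helper names are hypothetical, and the degrees of freedom are taken as N minus `n_params`, where `n_params = 3` is an assumption (the tables do not state how many predictors each model contained). The example row is the Beck Depression Inventory entry for Externalizing vocal quality (N = 56, B = 15.67, SE = 6.49, reported p = .019).

```python
import math

def t_ratio(b, se):
    """t statistic for a coefficient: the estimate divided by its standard error."""
    return b / se

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def two_tailed_p(t, df, steps=20000):
    """Two-tailed p value: 1 minus the area under the density on [-|t|, |t|].

    The area on [0, |t|] is computed with Simpson's rule (steps must be even)
    and doubled, because the density is symmetric about zero.
    """
    a = abs(t)
    if a == 0:
        return 1.0
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = 2 * (h / 3) * total
    return max(0.0, 1.0 - central)

# Values copied from the table row; n_params is an assumption, not a reported value.
b, se, n = 15.67, 6.49, 56
n_params = 3
t = t_ratio(b, se)
p = two_tailed_p(t, n - n_params)
print(f"t = {t:.2f}, two-tailed p = {p:.3f}")
```

Because the true degrees of freedom are not given in the tables, small discrepancies between a recomputed p value and the reported one should not be read as errors in the source.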


Appendix J

Table J1


Standard Errors, and p values for Limited Vocal Quality in the Session with the Lowest Change Score

Outcome Measure Na B SE β p value

Beck Depression Inventory 61 2.18 10.57 .03 .837

General Symptom Index of the SCL-90-R 54 .17 .61 .03 .785

IIPb-

IIP subscale-Vindictive/Self-Centeredc 54 .48 .44 .13 .284

IIP subscale-Cold Distant 55 -.24 .55 -.04 .671

IIP subscale-Socially Avoidant 55 .24 .73 .03 .741

IIP subscale-Nonassertive 55 -1.07 .88 -.12 .232

IIP subscale-Overly Accommodatingc 54 -1.23 .67 -.18 .074

IIP subscale-Self-Sacrificing 55 -1.31 .70 -.19 .068

IIP subscale-

IIP subscale-

Problem-Focused Style of Coping-Reactive 54 .54 .91 .08 .560

Problem-Focused Style of Coping-Reflective 54 -.07 .70 -.01 .920

Problem-Focused Style of Coping-Suppressive 54 1.10 1.08 .13 .311



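Note. aDifferent sample sizes are the result of missing data for some clients and/or the removal of outlier cases. bIIP is Inventory of Interpersonal Problems. cIndicates the removal of one outlier case.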


Appendix J

Table J2


Standard Errors, and p values for Externalizing Vocal Quality in the Session with the Lowest Change Score

Outcome Measure Na B SE β p value

Beck Depression Inventoryb 60 10.24 6.84 .19 .140

General Symptom Index of the SCL-90-R 54 .06 .42 .02 .891

IIPc-Circumplex Total 55 .37 .34 .11 .285

IIP subscale-Vindictive/Self-Centeredb 54 .02 .31 .01 .959

IIP subscale-Cold Distant 55 .28 .39 .07 .469

IIP subscale-

IIP subscale-Nonassertive 55 .67 .61 .11 .280

IIP subscale-Overly Accommodatingb 54 .68 .46 .14 .149

IIP subscale-Self-Sacrificing 55 .98 .48 .21 .047

IIP subscale-Intrusive-Needyb 54 -.16 .29 -.05 .586

IIP subscale-

Problem-Focused Style of Coping-Reactive 54 -.40 .64 -.08 .534

Problem-Focused Style of Coping-Reflective 54 .13 .48 .03 .786

Problem-Focused Style of Coping-Suppressive 54 -.47 .74 -.08 .534

Rosenberg Self-Esteem Scale 54 -7.79 4.52 -1.82 .091


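Note. aDifferent sample sizes are the result of missing data for some clients and/or the removal of outlier cases. bIndicates the removal of one outlier case. cIIP is Inventory of Interpersonal Problems.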


Appendix K

Table K1


Standard Errors, and p values for Limited Vocal Quality in the Session with the Highest Change Score

Outcome Measure Na B SE β p value

Beck Depression Inventory 61 7.42 10.92 .09 .500

General Symptom Index of the SCL-90-R 55 -.02 .75 .00 .982

IIPb-Circumplex Total 55 .16 .61 .03 .794

IIP subscale-Vindictive/Self-Centeredc 54 .50 .62 .09 .426

IIP subscale-Cold Distant 55 -.41 .69 -.06 .553

IIP subscale-

IIP subscale-Nonassertive 55 .34 1.11 .03 .759

IIP subscale-Overly Accommodatingc 54 .14 .93 .02 .878

IIP subscale-Self-Sacrificing 55 .03 .89 .00 .978

IIP subscale-

IIP subscale-

Problem-Focused Style of Coping-Reactive 54 2.94 1.07 .34 .008

Problem-Focused Style of Coping-Reflective 54 -.50 .86 -.05 .562

Problem-Focused Style of Coping-Suppressive 54 4.06 1.22 .39 .002





Appendix K

Table K2


Standard Errors, and p values for Externalizing Vocal Quality in the Session with the Highest Change Score

Outcome Measure Na B SE β p value

General Symptom Index of the SCL-90-R 55 .23 .61 .04 .709

IIPb-Circumplex Total 55 -.02 .49 .00 .971

IIP subscale-

IIP subscale-Cold Distant 55 .39 .56 .07 .483

IIP subscale-

IIP subscale-Nonassertive 55 -.18 .90 -.02 .844

IIP subscale-

IIP subscale-

IIP subscale-

IIP subscale-

Problem-Focused Style of Coping-Reactive 54 -1.96 .89 -.28 .032

Problem-Focused Style of Coping-Reflective 54 .14 .70 .02 .843

Problem-Focused Style of Coping-Suppressive 54 -2.50 1.03 -.29 .019





Appendix L

Table L1

Results for Hypothesis #2b: Standard (β) and Unstandardized (B) Regression Coefficients, their Standard Errors, and p values for Post-Treatment Scores for Outcome Measures by Therapist Vocal Style (Softened-Irregular and Natural-Definite) for Sessions with the Lowest and Highest Change Scores

Outcome Measure Na Therapist Vocal Style in Lowest and Highest Change Sessions B SE β p value

BDIb 56

Both sessions are Softened-Irregular -3.45 2.37 -.21 .152

One is Softened-Irregular and one is Natural-Definite 3.00 2.67 .16 .266

DAS 50

Both sessions are Softened-Irregular 2.58 8.28 .04 .757

One is Softened-Irregular and one is Natural-Definite .06 9.73 .00 .995

GSI 51

Both sessions are Softened-Irregular -.25 .16 -.21 .136

One is Softened-Irregular and one is Natural-Definite -.01 .18 -.01 .941

IIPc Circumplex Total 51

IIP Vindictive-Self-Centeredb 50

Both sessions are Softened-Irregular .08 .12 .09 .513

One is Softened-Irregular and one is Natural-Definite .06 .14 .06 .682

IIP Cold Distant 51

IIP Socially Inhibited 51

IIP Nonassertive 51

IIP Overly Accommodating 51

IIP Self-Sacrificing 51

IIP Intrusive-Needy 51




Note. Dummy variables were used in the multiple regression analysis. The reference category was Natural-Definite for both sessions. aDifferent sample sizes are the result of missing data for some clients and/or the removal of outlier cases. bOne outlier removed. cIIP is Inventory of Interpersonal Problems.
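
The dummy coding described in the note to Table L1 can be illustrated with a small sketch. It assumes a three-level therapist vocal style variable with Natural-Definite in both sessions as the reference category, as the note states. Everything else is hypothetical: the function names, the level labels, and the six outcome scores are made up for illustration and are not data from the study, and the model here contains no covariates. With only an intercept and the two dummies, each unstandardized B equals the difference between that group's mean and the reference group's mean.

```python
def dummy_code(labels, levels, reference):
    """Build design rows [1, d1, d2, ...] with one 0/1 column per non-reference level."""
    columns = [lv for lv in levels if lv != reference]
    rows = [[1.0] + [1.0 if lab == col else 0.0 for col in columns] for lab in labels]
    return rows, columns

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def fit_ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

# Hypothetical level labels and made-up scores, for illustration only.
LEVELS = ["both_natural_definite", "both_softened_irregular", "one_of_each"]
labels = ["both_natural_definite", "both_natural_definite",
          "both_softened_irregular", "both_softened_irregular",
          "one_of_each", "one_of_each"]
scores = [10.0, 12.0, 6.0, 8.0, 13.0, 15.0]

rows, columns = dummy_code(labels, LEVELS, reference="both_natural_definite")
coefs = fit_ols(rows, scores)
print(dict(zip(["intercept"] + columns, [round(c, 2) for c in coefs])))
```

Read this way, the "Both sessions are Softened-Irregular" row in Table L1 compares that group with clients whose therapist used a Natural-Definite style in both sessions, and the mixed row makes the same comparison for clients with one session of each style.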


Appendix L

Table L1 continued

Results for Hypothesis #2b: Standard (β) and Unstandardized (B) Regression Coefficients, their Standard Errors, and p values for Post-Treatment Scores for Outcome Measures by Therapist Vocal Style (Softened-Irregular and Natural-Definite) for Sessions with the Lowest and Highest Change Scores

Outcome Measure Na Therapist Vocal Style in Lowest and Highest Change Sessions B SE β p value

IIP Domineering-Controlling 51

PF-SOC Reactive Style 50

PF-SOC Reflective Style 50

One is Softened-Irregular and one is Natural-Definite .15 .22 .07 .504

PF-SOC Suppressive 50

RSE 52

Both sessions are Softened-Irregular 2.13 1.96 .15 .283

One is Softened-Irregular and one is Natural-Definite .54 2.20 .03 .809


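Note. Dummy variables were used in the multiple regression analysis. The reference category was Natural-Definite for both sessions. aDifferent sample sizes are the result of missing data for some clients and/or the removal of outlier cases. bOne outlier removed.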


Appendix L

Table L2

Results for Hypothesis #2b: Standard (β) and Unstandardized (B) Regression Coefficients, their Standard Errors, and p values for Post-Treatment Scores for Outcome Measures by Therapist Vocal Style (Softened-Irregular and Natural-Definite) in First Report of Moderate to High Change Session

Outcome Measure Na B SE β p value

BDIb 54 1.22 1.81 .10 .502

DAS 49 -5.84 7.45 -.09 .437

GSI 50 .08 .15 .07 .575

IIPc Circumplex Total 50 .13 .12 .12 .276

IIP Vindictive-Self-Centeredd 49 -.12 .10 -.14 .268

IIP Cold Distant 50 .13 .13 .10 .343

IIP Socially Inhibited 50 -.04 .18 -.02 .836

IIP Nonassertive 50 .35 .21 .18 .105

IIP Overly Accommodating 50 .34 .18 .21 .062

IIP Self-Sacrificing 50 .22 .17 .14 .216

IIP Intrusive-Needy 50 -.03 .09 -.04 .713

IIP Domineering-Controlling 50 .03 .08 .05 .698

PF-SOC Reactive Style 49 -.21 .22 -.13 .342

PF-SOC Reflective Style 49 -.01 .17 -.01 .939

PF-SOC Suppressive 49 .27 .24 .15 .272

RSE 51 -2.54 1.71 -.19 .144


sessions. bTwo outliers removed. cIIP=Inventory of Interpersonal Problems. dOne outlier removed.
