The Dyadic parent-child interaction coding system II (DPICS II)


Material Information

Title:
The Dyadic parent-child interaction coding system II (DPICS II) reliability and validity
Alternate title:
Reliability and validity
Physical Description:
x, 170 leaves : ; 29 cm.
Language:
English
Creator:
Bessmer, Janet L., 1965-
Publication Date:

Subjects

Subjects / Keywords:
Reproducibility of Results   ( mesh )
Psychological Tests   ( mesh )
Mother-Child Relations   ( mesh )
Behavior   ( mesh )
Genre:
bibliography   ( marcgt )
theses   ( marcgt )
non-fiction   ( marcgt )

Notes

Thesis:
Thesis (Ph. D.)--University of Florida, 1996.
Bibliography:
Includes bibliographical references (leaves 163-169).
Statement of Responsibility:
by Janet L. Bessmer.
General Note:
Typescript.
General Note:
Vita.

Record Information

Source Institution:
University of Florida
Rights Management:
All applicable rights reserved by the source institution and holding location.
Resource Identifier:
oclc - 49016091
ocm49016091
System ID:
AA00011213:00001

Full Text



















THE DYADIC PARENT-CHILD INTERACTION CODING
SYSTEM II (DPICS II): RELIABILITY AND VALIDITY














By

JANET L. BESSMER


A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL
OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT
OF THE REQUIREMENTS FOR THE DEGREE OF
DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA

1996














ACKNOWLEDGMENTS

A number of people have been extremely helpful to me in

completing this project and preparing this manuscript. I

would like to thank the DPICS II observers, Dan Edwards,

Jenifer Jacobs, Tricia Durning, and Nola Litwins, for their

many hours of careful work coding videotapes. In addition,

Arista Rayfield and Tricia Park were very generous with

their time helping in the collection of subjects. All of

you deserve many "labeled praises." I would also like to

thank Dr. Sheila Eyberg for her guidance during my graduate

school training and for her support in completing this

project. I am also most appreciative of my other committee

members, Dr. Stephen Boggs, Dr. Suzanne Bennett Johnson, Dr.

Jane Pendergast, and Dr. Gary Geffken, for providing me with

their time and expertise. My family and friends have

provided me with support and encouragement throughout my

graduate training and particularly during the completion of

my dissertation. Drs. Bob and Judy Rudman have also been

most helpful to me in the final stages of finishing my work.

Finally, I wish to thank Dr. Keith Ewell and R.G. Ewell for

their patience, understanding, and assistance.















TABLE OF CONTENTS

page

ACKNOWLEDGMENTS . .. ii

LIST OF TABLES ... . iv

ABSTRACT . ... viii

INTRODUCTION . . 1
Characteristics of children with Oppositional
Defiant Disorder and their parents 7
Dyadic Parent-Child Interaction Coding System 14
Dyadic Parent-Child Interaction Coding System II 19

METHODS . . . 38
Participants . . .. 38
Measures . ... .. 44
Procedures . .... 50
Observers . .... 53

RESULTS . . . 57
Psychometric properties of measures 57
Reliability . ... 57
Validity . . 68
Discriminant Validity . 68
Convergent Validity . .. .73

DISCUSSION . . 84
Reliability . . 84
Validity . . 93
Behavioral Differences Between Clinic-referred
and Non-referred Families .. .93
Effect of Socio-Economic Status on Behavior
Problems . ... 98
Limitations and Future Directions .. .101

APPENDIX A TABLES . . .. .106

APPENDIX B SUMMARY OF DPICS II CATEGORIES .. .160

REFERENCES ... . . 163

BIOGRAPHICAL SKETCH .... . .170














LIST OF TABLES


Table                                                        Page

1 Categories in the Dyadic Parent-Child Interaction
Coding System (DPICS) . ... 16

2 Categories of the DPICS and DPICS II .. .21

3 Summary of Reliability for Parent Categories
in CDI . ... 106

4 Summary of Reliability for Parent Categories
in PDI . ... 107

5 Summary of Reliability for Child Categories
in CDI . ... 108

6 Summary of Reliability for Child Categories
in PDI . ... 109

7 Scores on Measures Used to Screen Dyads ... .31

8 Mean Frequency of DPICS II Parent Categories
for Referred and Non-referred Mother-Child
Dyads during CDI . ... .110

9 Mean Frequency of DPICS II Child Categories
for Referred and Non-referred Mother-Child
Dyads during CDI . ... 111

10 Mean Frequency of DPICS II Parent Categories
for Referred and Non-referred Mother-Child
Dyads during PDI . ... 112

11 Mean Frequency of DPICS II Child Categories
for Referred and Non-referred Mother-Child
Dyads during PDI . ... 113

12 Correlations between SES and DPICS II variables
during CDI, PDI, and CU ... 114

13 Correlations between PPVT-R and DPICS II
variables during CDI, PDI, and CU 115










14 Normative data for the Parenting Stress Index 45

15 Characteristics of Clinic-referred groups 116

16 Sample Characteristics . ... 40

17 Scores on Measures Used to Compare Participants 42

18 Summary of reliability for parent categories in
CDI--Clinic-referred group ...... 118

19 Summary of reliability for parent categories in
PDI--Clinic-referred group ... .120

20 Summary of reliability for parent categories in
CU--Clinic-referred group ...... 122

21 Summary of reliability for parent categories in
CDI--Comparison group . 124

22 Summary of reliability for parent categories in
PDI--Comparison group ... 126

23 Summary of reliability for parent categories in
CU--Comparison group ... 128

24 Summary of reliability for child categories in
CDI--Clinic-referred group ...... 130

25 Summary of reliability for child categories in
PDI--Clinic-referred group ...... 132

26 Summary of reliability for child categories in
CU--Clinic-referred group 134

27 Summary of reliability for child categories in
CDI--Comparison group ... 136

28 Summary of reliability for child categories in
PDI--Comparison group ... 138

29 Summary of reliability for child categories in CU--
Comparison group . ... 140

30 Intraclass Correlations for Categories Combined
Across Situation for Each Group and the Total
Sample . . ... 142

31 Reliability for the DPICS II Parent Categories
Combined Across Situation ... .61








32 Reliability for the DPICS II Child Categories
Combined Across Situation ... .63

33 Kappa Estimates for DPICS II Summary and Individual
Variables Combined Across Situations ... .66

34 Summary of Univariate Analysis of Covariance for
Untransformed Summary Variables Comparing the
Clinic-Referred Group to a Comparison Group 70

35 Summary of Univariate Analysis of Covariance for
Transformed Summary Variables Comparing the
Clinic-referred and Comparison Group 145

36 Classification Function Coefficients for
Discriminant Analysis on Untransformed
Variables . .. 146

37 Summary of Step-wise Discriminant Analysis 147

38 Classification Function Coefficients for Stepwise
Discriminant Analysis on Untransformed
Variables . .. 148

39 Partial Correlations Between SES, DPICS II Summary
Variables and the ECBI Intensity Score
Controlling for SES . 74

40 Correlations Between SES, DPICS II Summary Variables
and ECBI Intensity Score ... .149

41 Summary of Hierarchical Multiple Regression Analysis
for Untransformed Variables Predicting ECBI
Intensity Score . ... .76

42 Summary of Hierarchical Multiple Regression Analysis
for Transformed Variables Predicting ECBI
Intensity Score . 150

43 Partial Correlation Matrix for Untransformed Child
Variables Used to Predict the Child Domain Score
on the PSI . ... .77

44 Correlation Matrix for Transformed Child Variables
Used to Predict the Child Domain Score
on the PSI . ... 151

45 Summary of Hierarchical Multiple Regression Analysis
for Untransformed Variables Predicting PSI Child
Domain Scores . ... 78








46 Summary of Hierarchical Multiple Regression Analysis
for Transformed Variables Predicting PSI Child
Domain Scores . . 152

47 Partial Correlation Matrix for Parent Variables Used
to Predict the Parent Domain Score on the PSI 79

48 Correlation Matrix for Transformed Parent Variables
Used to Predict the Parent Domain Score on the
PSI . .. .. 153

49 Summary of Hierarchical Multiple Regression Analysis
for Untransformed Variables Predicting PSI Parent
Domain . . ... .80

50 Summary of Hierarchical Multiple Regression Analysis
for Transformed Variables Predicting PSI Parent
Domain . . ... 154

51 Partial Correlation Matrix for Variables Predicting
the PLOC score . .. .81

52 Correlation Matrix for Transformed Variables
Predicting the PLOC score .. .155

53 Summary of Hierarchical Multiple Regression Analysis
for Untransformed Variables Predicting the PLOC
score . . 82

54 Kappa Confusion Matrix for Parent Verbalization
Categories for Both Groups Combined 156

55 Kappa Confusion Matrix for Child Verbalization
Categories for Both Groups Combined 157

56 Kappa Confusion Matrix for Parent Response
Categories for Both Groups Combined 158

57 Kappa Confusion Matrix for Child Response Categories
for Both Groups Combined . 159














Abstract of Dissertation Presented to the Graduate School
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Doctor of Philosophy

THE DYADIC PARENT-CHILD INTERACTION CODING
SYSTEM II (DPICS II): RELIABILITY AND VALIDITY

By

Janet L. Bessmer

December, 1996

Chairman: Sheila Eyberg, Ph.D.
Major Department: Clinical and Health Psychology

The reliability and validity of the Dyadic Parent-Child

Interaction Coding System--II (DPICS II) were assessed in the

present study. The DPICS II is the revised version of a

behavioral observation coding system used in research and

clinical settings to describe the quality of parent-child

dyadic interactions. The DPICS II contains 25 categories to

code parents' and children's verbal and nonverbal behavior

including commands, compliance to commands, questions,

praise, yell, whine, destructive, and criticism.

The study participants were sixty mother-child dyads

representing a clinic-referred group (n = 30) and a non-

problem comparison group (n = 30). The children in the

clinic-referred group were participants in a large treatment

outcome study (N = 100) for preschool children with behavior

problems. All participants had met diagnostic criteria for










Oppositional Defiant Disorder. The data on the clinic-

referred families used in the present study were collected as

part of the families' standard initial assessment in the

larger outcome study. The mother-child pairs in the

comparison group were recruited from the Gainesville, FL

community through advertisements. To be included in the

study, the children in the comparison group received

maternal ratings on the Eyberg Child Behavior Inventory

(ECBI)--Problem Scale of 11 or less.

The two groups were compared on several measures

including the Parenting Stress Index (PSI), the Parenting

Locus of Control (PLOC), and the ECBI as well as DPICS II

behavioral observations. Videotapes of the parent-child

dyads were coded by observers trained to use the DPICS II.

The primary coders were blind to the group membership and to

the study hypotheses. Fifty percent of the videotapes,

randomly selected, were re-coded to evaluate reliability.

Reliability was assessed using percent agreement, intraclass

correlations, and Cohen's kappa. Overall, the DPICS II

categories were shown to have acceptable reliability

estimates. The DPICS II also demonstrated convergent

validity by accounting for a significant proportion of

variance in the scores on the ECBI, the PLOC, and the parent

and child scales of the PSI. Selected DPICS II categories

were used in a discriminant analysis and correctly








classified families into the clinic-referred and non-

referred groups.














INTRODUCTION

Direct behavioral observation measures have been called

the "hallmark" of behavioral assessment (Ciminero, 1986) and

have been used widely across content areas within the field

of psychology (Foster & Cone, 1986; Bornstein, Bridgwater,

Hickey, & Sweeney, 1980). Systems for observing behavior

have proven to be particularly valuable tools for the

assessment of behavior problems in children (McMahon &

Forehand, 1988). Observational methods are useful with

children, who have difficulty providing an accurate

self-report, particularly about socially undesirable or

inappropriate behavior (Hartmann & Wood, 1990). In

addition, there are concerns about the accuracy of parental

perceptions of children's disruptive behavior (Wahler &

Sansbury, 1992). Direct observation by an independent

observer of children's behavior and their interactions with

relevant individuals in their environments is considered to

provide the most objective description of target behaviors,

such as noncompliance to parental commands and the

effectiveness of the parents' responses (McMahon & Forehand,

1988).

Although there is considerable support for using direct

observation as an assessment device, concerns regarding the









practicality of direct observation, particularly in clinical

settings, have been raised (Mash & Terdal, 1988). Direct

observation in naturalistic settings, such as the home or

school, is thought to be too time-consuming for typical

clinical applications, while observation of free play in

analogue situations is thought to yield information about

behavior that may not be generalizable to more relevant

settings. Structured behavioral observations in the

laboratory setting have been proposed as an effective

alternative because they can efficiently elicit the target

behaviors and can facilitate comparison of behavior across

subjects (Hughes & Haynes, 1978).

The literature concerning direct observation systems

contains guidelines for the development and use of these

systems with children. Roberts and Forehand (1978) stated

that for an observational system to be most useful it should

accomplish several goals: 1) describe maladaptive parent-

child interactions, 2) define the child behavior(s) targeted

for change, 3) specify the appropriate treatment
intervention, and 4) evaluate the effects of the intervention.

To this, one should add specifying the parent behaviors

targeted for change. In addition, Goldfried (1979)

advocated that observation systems should meet accepted

standards of reliability and validity. He also criticized

many behavioral observation systems for children for lacking

normative data.










The recent literature on behavioral observations

reflects an awareness of the complex methodological issues

involved in utilizing observation systems. While behavioral

observations were once seen as inherently objective and

valid, factors such as frequency of occurrence (Hartmann,

1977), complexity of the coding system (Hops, Davis, &

Longoria, 1995; Jones, Reid, & Patterson, 1974;

Mash & McElwee, 1974; Kazdin, 1977), and observer

expectancies (Kazdin, 1977) all have an impact on the

reliability and validity of the observational data. Current

guidelines for utilizing behavioral observation coding

systems incorporate and attempt to account for these

influences.

Methods for estimating the reliability of direct

observational data have also been developed and improved.

Percent agreement, as an indicator of interobserver

agreement, has been the most widely used estimate of

reliability due to its ease of computation and its ability

to provide immediate assessment of agreement for an

individual session (Jacob, Tennenbaum, & Krahn, 1987).

Percent agreement also has been the most widely criticized

form of reliability (Suen & Ary, 1989). Researchers have

frequently cited the potential for percent agreement figures

to be inflated by chance agreement, particularly for

behaviors with a high base rate (Jacob, Tennenbaum, & Krahn,

1987). Computing percent agreement on occurrences only for










low base rate behavior and on nonoccurrences of high base

rate behaviors has been suggested as one way to limit the

chance of artificial inflation of the reliability estimates

(Suen & Ary, 1989).
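For illustration, these agreement indices can be sketched in a few lines of Python. The sketch below computes overall interval-by-interval percent agreement and the occurrence-only variant suggested above; the data format and function names are illustrative assumptions and are not part of the DPICS II itself.

    def percent_agreement(obs_a, obs_b):
        """Overall agreement: proportion of intervals coded identically."""
        matches = sum(a == b for a, b in zip(obs_a, obs_b))
        return matches / len(obs_a)

    def occurrence_only_agreement(obs_a, obs_b):
        """Agreement counted only on intervals in which either observer scored
        an occurrence, as suggested for low base rate behaviors."""
        relevant = [(a, b) for a, b in zip(obs_a, obs_b) if a == 1 or b == 1]
        if not relevant:
            return None  # the behavior never occurred; agreement is undefined
        return sum(a == b for a, b in relevant) / len(relevant)

    # Two observers' interval records (1 = occurrence, 0 = nonoccurrence).
    observer_1 = [0, 0, 0, 0, 1, 0, 0, 0, 0, 0]
    observer_2 = [0, 0, 0, 0, 0, 0, 1, 0, 0, 0]
    print(percent_agreement(observer_1, observer_2))          # 0.8, inflated by nonoccurrences
    print(occurrence_only_agreement(observer_1, observer_2))  # 0.0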

Acceptable values for percent agreement have also been

disputed. Eighty percent agreement has been offered as a

reasonable standard for reliability estimates (Page & Iwata,

1986). However, Jones, Reid, and Patterson (1974) support

using 70% agreement as the minimal accepted standard for

more complex coding systems.

In spite of attempts to establish a standard for

percent agreement estimates, acceptable reliability has been

shown to be dependent on such factors as the occurrence of

the behavior (Suen & Ary, 1989) and the complexity of the

coding system (Jones, Reid, & Patterson, 1974). For

example, 75% agreement for a behavior with a base rate of

occurrence of 45% of the observation period is better than

75% agreement for a behavior occurring 70% of the time.

Awareness of the base rate is important, therefore, in

evaluating a given reliability estimate using percent

agreement. Studies examining the relationship between

complexity of the coding system (i.e., the number of
discriminations required of the observer during an

observation period) and estimates of interobserver agreement

have found significant correlations, with values ranging

from -.52 to -.75 (Jones, Reid, & Patterson, 1974).










Similarly, Mash and McElwee (1974) found that percent

agreement was reduced by increasing the number of categories

in a system from four to eight. The aforementioned

difficulties in using percent agreement as an estimate of

reliability have resulted in it becoming the least

recommended method (Hops, Davis, & Longoria, 1995; Suen &

Ary, 1989).

Although numerous types of reliability estimates exist

(Suen, Ary, & Ary, 1986), users of observation systems are

increasingly advocating the use of the kappa statistic,
intraclass correlations, or both (Hops, Davis, & Longoria,
1995; Suen & Ary, 1989). Percent agreement and

kappa both provide an estimate of interobserver agreement

with kappa having the added benefit of correcting for chance

agreement. Kappa "...is the ratio of actual nonchance

agreements divided by the total possible nonchance

agreements" (Suen & Ary, 1989, p.112). Kappa estimates have

been widely recommended for use as a measure of

interobserver agreement (Hops, Davis, & Longoria, 1995;
Jacob, Tennenbaum, & Krahn, 1987). The range of possible
kappa values extends from -1.00 to 1.00. Values at or near
zero indicate chance levels of agreement, and negative values
indicate agreement below chance.

Kappa values above .75 are considered excellent, and values

from .60 to .75 are considered good (Fleiss, 1981).
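The kappa computation can be made concrete with a short sketch. The code below is an illustrative implementation of Cohen's (1960) formula for two observers, not the procedure used in the present study, and it also shows how chance agreement rises with the base rate of the behavior.

    from collections import Counter

    def cohens_kappa(codes_a, codes_b):
        """Cohen's kappa for two observers assigning one category to each unit:
        kappa = (observed agreement - chance agreement) / (1 - chance agreement)."""
        n = len(codes_a)
        p_o = sum(a == b for a, b in zip(codes_a, codes_b)) / n
        freq_a, freq_b = Counter(codes_a), Counter(codes_b)
        # Chance agreement: sum over categories of the product of the two
        # observers' marginal proportions.
        p_e = sum((freq_a[c] / n) * (freq_b[c] / n)
                  for c in set(codes_a) | set(codes_b))
        return (p_o - p_e) / (1 - p_e)

    # For a behavior both observers code as occurring 70% of the time, chance
    # agreement alone is .70*.70 + .30*.30 = .58, so 75% observed agreement
    # corresponds to a kappa of only about .40; at a 45% base rate the same
    # 75% agreement yields a kappa of about .50.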









Although intraclass correlations have been discussed in

the literature on behavioral observations and proposed as an

estimate of reliability for many years (Jones, Reid, &

Patterson, 1974), it appears that only recently intraclass

correlations have become more widely known and used within

the field of behavioral assessment. In generalizability
theory (Cronbach, Gleser, Nanda, & Rajaratnam, 1972),
reliability is viewed within the context of

situational factors such as observers, setting, and time.

Analysis of variance procedures are used to evaluate these

factors to determine the sources of variance in the

observation data. Sources of error can then be partitioned

out leaving only the true variance attributed to a

particular factor (Suen & Ary, 1989; Hartmann & Wood, 1990).

Intraclass correlations are recommended over kappa and

percent agreement because they can provide an estimate of

both intraobserver and interobserver reliability (Hops,

Davis, & Longoria, 1995; Suen, 1988). The design of the

study, of course, has implications for which aspects of

reliability (e.g., setting, time) can be evaluated (Suen &

Ary, 1989). Some of the designs require significantly more

duplicate coding, which can be impractical and costly,

compared to other reliability estimates (Jacob, Tennenbaum,

& Krahn, 1987). Overall, intraclass correlations and
generalizability theory have been described in positive

terms as broadening the scope of analysis for reliability










studies to include contextual factors that may or may not

limit the generalizability of the data (Hartmann & Wood,

1990).
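As one concrete instance of an ANOVA-based estimate, the sketch below computes a one-way random-effects intraclass correlation, ICC(1,1), from a subjects-by-observers table of category frequencies. Other ICC forms are appropriate for other reliability designs, and the frequencies shown are hypothetical; nothing here is specific to the analyses reported in this study.

    def icc_oneway(ratings):
        """One-way random-effects intraclass correlation, ICC(1,1).
        ratings: one row per subject (e.g., dyad), each row holding the
        frequencies scored by the k observers for one DPICS II category."""
        n = len(ratings)        # number of subjects
        k = len(ratings[0])     # observers per subject
        grand_mean = sum(sum(row) for row in ratings) / (n * k)
        subject_means = [sum(row) / k for row in ratings]
        # Between-subjects and within-subjects mean squares from a one-way ANOVA.
        ms_between = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
        ms_within = sum((x - m) ** 2
                        for row, m in zip(ratings, subject_means)
                        for x in row) / (n * (k - 1))
        return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

    # Hypothetical frequencies of one category scored by two observers for five dyads.
    frequencies = [[12, 11], [4, 5], [20, 18], [7, 7], [15, 16]]
    print(round(icc_oneway(frequencies), 2))  # approximately 0.98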


Characteristics of Children with Oppositional Defiant
Disorder and their Parents

Children diagnosed with Oppositional Defiant Disorder

demonstrate a "recurrent pattern of negativistic, defiant,

disobedient, and hostile behavior toward authority figures,"

according to the Diagnostic and Statistical Manual of Mental

Disorders, 4th Edition (DSM-IV) (American Psychiatric

Association [APA], 1994). Estimates of the prevalence of
this disorder in children

range from 2% to 16% (APA, 1994). Attention-

Deficit/Hyperactivity Disorder, Learning Disorders, and

Communication Disorders often co-occur in children diagnosed

with Oppositional Defiant Disorder (APA, 1994).

Familial and environmental factors have also been found

to be associated with this disorder. Marital discord and

parental psychopathology (e.g., maternal depression,

parental history of behavior problems, and parental

Attention-Deficit/Hyperactivity Disorder) are more commonly

found in families with a child diagnosed with Oppositional

Defiant Disorder (APA, 1994; Dumas & Serketich, 1994; Griest

& Forehand, 1982).

Additional factors such as maternal "insularity" and

socio-economic disadvantage also have been identified as










characteristic of a subset of families in which a child is

clinic-referred for significant behavior problems. Wahler

(1980) described some mothers of children with behavior

problems as "insular" after noting a pattern of social

isolation from peers and primarily negative, coercive social

exchanges in the mothers' lives. In one such study, 59% of

mothers of clinic-referred children (29 out of 49) were

classified as insular (Dumas & Wahler, 1983). Given that

insularity is not routinely measured in the mothers of

clinic-referred children, it is unclear to what extent this

characteristic is associated with behavior disorders in

children in general. However, the results of studies

indicate that insularity is associated with the development

of behavior problems, with higher drop-out rates from

treatment, and with poorer maintenance of positive treatment

outcome (Wahler, 1980; Dumas & Wahler, 1983).

Socio-economic disadvantage also appears to be a

consistent characteristic of a subset of the families with

behaviorally disordered children. Dodge, Pettit, and Bates

(1994) found that in a sample of 585 children followed from

kindergarten to third grade there was a linear relationship

between the risk of developing behavior problems and lower

SES. In a 1984 treatment outcome study of clinic-referred

samples of children with behavior problems, Dumas found that

a significant portion of the families (approximately 50%)

were classified in the high socio-economic disadvantage










range based on factors such as income and maternal

education. Socio-economic disadvantage appears to be

related not only to the increased incidence of behavior

problems in children but also appears to be negatively

related to outcome in treatment. In Dumas's 1984 study, he

found that children in the high disadvantage families were

more likely not to succeed in a standard treatment program

(i.e., parent training) and were more likely to demonstrate

a higher level of aversive behavior in treatment and follow

up than children who were successful in treatment. In the

group of families who were unsuccessful in treatment, the

mothers (who were predominantly in the high disadvantage

group) were more aversive and more indiscriminate in their

use of aversive behavior towards their children than the

successful mothers at pre- and post-treatment as well as at

follow-up. These studies suggest that behavioral problems

in children frequently occur within the context of familial

and environmental stressors, and that the greater these
stressors, the more likely the children are to have a poorer
outcome in response to parent training, the conventional

treatment for behavior disorders (Dumas, 1984). The

relationship between these familial and environmental

stressors and the development of behavior problems has yet

to be determined. However, Dumas's work indicated that

specific interaction patterns differentiated the families

with lower socio-economic status from those families with










higher socioeconomic status which may account for the

greater numbers of lower status families in clinic-referred

groups.

In addition to familial and environmental factors,

specific behaviors and interaction patterns have also been

found to distinguish children with significant behavior

problems from non-referred children. Although a variety of

behavioral coding systems have been employed to study

children with behavior problems and their families, several

distinguishing features of these families have been

consistently found. The most frequently replicated feature

that distinguishes children with significant behavior
problems from normal children is their rate of

noncompliance. In a survey of 43 studies of home

observations of conduct disordered children, 77% of the

studies used coding systems that included some measure of

compliance/noncompliance (McIntyre, Bornstein, Isaacs, et

al., 1983). In the studies that included a normal

comparison group, compliance was found to be significantly

different between the conduct-disordered and the control

children. Griest, Forehand, Wells, and McMahon (1980) found

that percent compliance to "alpha" commands (i.e., commands

for motoric behavior for which the child has an opportunity

to comply) for clinic-referred children was 79.8% compared

to 86.2% for the non-referred children (p < .05).

Similarly, Robinson and Eyberg (1981) found that children










referred for behavior problems had 48% compliance to

commands compared to 62% compliance for non-referred

children (p < .01). Eyberg, Boggs, and Algina (1995) found

that baseline rates of compliance were 21% for an initial

treatment group and 19% for a wait list group of clinic-

referred children.

Their rate of inappropriate behaviors is another

distinguishing feature of children with behavior problems.

Robinson and Eyberg (1981) found that the children with

behavior problems were more likely than non-referred

children to whine and yell. Other investigators have also

documented significant differences in the frequency of

whining between groups (McIntyre, Bornstein, Isaacs, et al.,

1983; Lobitz & Johnson, 1975). Robinson and Eyberg (1981)

found that total inappropriate behaviors (i.e., cry, whine,

yell, smart talk, destructive) recorded during the Parent-

Directed Interaction (PDI) were 1.16 in ten minutes for

normal children and 6.65 in ten minutes for behavior problem

children. Forster, Eyberg, and Burns (1990) found that

children with conduct problems issued more commands during

the Child-Directed Interaction (CDI) than non-referred

children. It is possible that children who issue numerous

commands during play appear to be "bossy" to others, a

behavior that could have a negative impact on the child's

social interactions.










Some research indicates that children with behavior

problems display fewer prosocial behaviors compared to

normal children. Lobitz and Johnson (1975) found that the

proportion of child positive valence behavior (i.e., laugh,

approval, attention, talk, nonverbal interaction, and

independent activity) to total child behavior discriminated

between children with and without behavior problems.

Forster, Eyberg, and Burns (1990) noted that children with

behavior problems were less likely to use positive

verbalizations, such as praise of the parent and questions.

Based on clinical impressions, Patterson (1986) also

conceptualized children with conduct problems as having

social skills deficits although he did not find significant

differences between groups of referred and nonreferred

children on behaviors such as laugh, attend, approval, and

physical positive.

Two types of parental verbal behavior have been found

to consistently characterize parents of children with

behavior problems. First, the frequency of parental commands

has been shown to distinguish families of non-referred

children from families of children referred for behavior

problems (Lobitz & Johnson, 1975; Rogers, Forehand, &

Griest, 1981). Webster-Stratton (1985) found that the

frequency and type of command differentiated the two groups

of parents. Mothers of clinic-referred children were

observed to give both indirect (e.g., "Will you sit down?")









and direct commands (e.g., "Sit down.") more frequently and

to repeat commands before the child had sufficient

opportunity to comply. Similarly, direct commands were

significantly higher for parents of children with behavior

problems during PDI and CDI (Robinson & Eyberg, 1981).

Secondly, parents of clinic-referred children differ

from parents of non-referred children on the frequency of

their negative verbal behavior. Robinson and Eyberg (1981),

Webster-Stratton (1985), and Aragona and Eyberg (1981) found

that parents of clinic-referred children issued

significantly more critical statements. Similarly, Lobitz

and Johnson (1975) found that the parents of referred

children significantly differed from the parents of non-

referred children on a summary variable of negative

behaviors which included threatening commands, negative

commands, disapproval, ignoring, and physical negative.

Overall, these studies indicate that children with

behavior problems have significantly higher rates of

noncompliance, have higher rates of inappropriate behaviors,

and emit fewer prosocial behaviors. Parents of children

with behavior problems issue significantly more commands and

more negative verbalizations, such as criticism. Behavioral

observation systems intended to assess children with conduct

problems should include categories known to differ between

children with and without clinically significant behavioral

problems.










The Dyadic Parent-Child Interaction Coding System

The Dyadic Parent-Child Interaction Coding System

(DPICS) (Eyberg & Robinson, 1983) is a widely used system

that allows for efficient direct observation of parent-child

interactions in a standardized laboratory setting using a

reliable and valid method. The DPICS was designed for both

research and clinical purposes. The system was intended to

provide practicing clinicians with a manageable and

practical way to measure pre- and posttreatment changes and

on-going treatment progress. At the same time, the DPICS

was intended to provide researchers with a reliable and

valid system that measures behaviors with sufficient detail

and specificity to advance our knowledge in the treatment

and assessment of behaviorally disordered children.

Some of the basic categories of the DPICS, such as

direct and indirect commands, labeled and unlabeled praise,

physical positive and negative, and critical statements,

were originally derived from the Hanf (1968) and Patterson

(1969) coding systems. Additional categories including

descriptive statements, acknowledgments, and irrelevant

verbalizations, were included in the DPICS to allow for

continuous coding of all parental verbal behaviors. The

system also contained categories to assess children's

inappropriate behaviors (e.g., yell, whine, cry,

destructive, and smart talk). Two sequences of behaviors

were coded: 1) the parent's response to inappropriate child










behavior and 2) the child's response to parental commands

(i.e., compliance, noncompliance, or no opportunity for

compliance). The coding system was gradually developed and

improved using feedback from users of the system from 1974

until its publication in 1983. The 1983 version contains a

total of 22 parent and child behavior categories (Table 1).

The system continued to be evaluated and refined by its

users. For example, Wruble, Sheeber, Sorenson, et al.

(1991) evaluated the procedure for coding child compliance,

verifying the use of the 5-second interval for compliance

through observation of compliance times in non-referred

children.

DPICS observations are generally conducted in a

laboratory or clinic setting. For laboratory observation,

the parent-child pair is typically observed from behind a

one-way mirror while they play with a standard set of toys

that encourages positive, interactive play. The parent-

child dyads are observed during three standard DPICS

situations designed to assess the quality of the parent and

child's social interactions. Each situation differs both in

the amount of parental control required and the demand

placed on the child for compliance.

In the first situation, called the Child-Directed

Interaction (CDI), the parent is instructed to follow the

child's lead during play. CDI is intended to place little










Table 1

Categories in the Dyadic Parent-Child Interaction Coding
System (DPICS)


Parent Behavior                         Child Behavior

Acknowledgment                          Smart Talk
Reflective Statements                   Yell
Descriptive Statements                  Whine
Descriptive/Reflective Questions        Cry
Indirect Commands                       Physical Negative
Direct Commands                         Physical Positive
Labeled Praise                          Destructive
Unlabeled Praise                        Compliance
Criticism                               Noncompliance
Physical Negative                       No Opportunity for Compliance
Physical Positive                       Changes Activity
Ignores/Responds to Deviant

Note: Adapted from Eyberg & Robinson, 1983.



demand on the child for compliance and to offer the parent

opportunities to provide the child with positive attention










(e.g., descriptions, praises, answers). If the parent is

successful in following the child's lead in play, little

child noncompliance and inappropriate behavior is expected

to occur in this situation.

In the second situation, the Parent-Directed

Interaction (PDI), the parent is instructed to structure the

play and attempt to get the child to follow the parent's

direction and rules for the play. In this situation, the

observers assess the parent's ability to direct their child

and gain the child's cooperation. Because the PDI situation

increases the amount of parental control, it provides

observers with an opportunity to assess the child's response

to directions (e.g., compliance, inappropriate behavior).

In the third situation, Clean-up (CU), the parent is

instructed to get the child to pick up the toys without

assistance and put them into their respective containers.

The Clean-Up situation requires the highest level of

parental control. Unlike the first two play situations,

Clean-Up is a task situation in which it is anticipated that

parents will direct their children using commands and

directive information descriptions, informing the child

about the task demands. This situation provides

opportunities to assess the parent's success in gaining

compliance from the child, to assess inappropriate behavior

in the child in response to parental demands, and to assess








compliance to the commands. The parents' use of praise and

positive attention for compliance can also be evaluated.

The total time required for comprehensive baseline and

post-treatment observation of a parent-child dyad is

approximately 25 minutes. The CDI and PDI situations each

last approximately ten minutes. The first five minutes of

each of the two play situations is used as a warm-up and

transition period, and the last five minutes in each of the

two play situations is coded. Because cleaning up the toys

may not require more than a few minutes with a compliant

child, the initial five minutes of the CU situation is

coded, after which the observation session can be

terminated.

Coding of the dyad may be conducted "live" with the

observer coding during the actual observation period.

Clinicians may prefer to use selected categories when coding

"live" to assess behaviors targeted for change.

Alternatively, the parent-child interactions can be

videotaped and coded at a later date. The latter method is

recommended for research purposes.

In the 1981 study to standardize and validate the DPICS

(Robinson & Eyberg), parent-child dyads were observed in the

two play situations, Child-Directed and Parent-Directed

Interaction. The standardization sample consisted of 20

families who had been referred to a university child









psychology clinic for treatment of behavior problems and 22

control families recruited from the community.

The mean interrater reliability coefficient for parent

behaviors was .91 (range, .67-1.0) and for the child

behaviors was .92 (range, .76-1.0). In a discriminant

analysis, the DPICS variables were found to classify

correctly 94% of the families into either the clinic-

referred or non-referred group. The DPICS variables also

accounted for 61% of the variance in parental reports of

their children's behavior at home on the ECBI. Thus, DPICS

has demonstrated reliability and validity as an

observational system for children with behavior problems.

Since its development, the DPICS has been widely used

both clinically and in research to describe parent-child

interactions. For example, the DPICS has been used to

distinguish between the parent-child interactions of mothers

of neglected children, children with behavior problems, and

normal control children (Aragona & Eyberg, 1981) and abusive

and nonabusive families (Webster-Stratton, 1985).

Furthermore, DPICS has been employed as a measure of pre- to

post-treatment changes for children with behavior problems

(Eyberg & Robinson, 1982; Eyberg & Matarazzo, 1980;

Eisenstadt, Eyberg, McNeil, Newcomb, & Funderburk, 1993).


Dyadic Parent-Child Interaction Coding System II

Recently, the DPICS has been expanded and revised. The

new version, DPICS II, shares many similarities with the










original system including use of the same observation

procedures (i.e., CDI, PDI, and CU) and retention of many of

the original categories (Eyberg, Bessmer, Newcomb, Edwards,

& Robinson, 1994). The expanded version contains twenty-

five categories for child behavior and twenty-seven

categories for parent behavior (See Appendix B: Summary of

DPICS II Categories). The DPICS II differs from the

previous version in two major ways. First, the rules or

guidelines for coding behaviors have been clarified and

expanded based upon feedback from users of the DPICS. In

the manual for the DPICS II, the descriptions of the

categories are followed by more detailed guidelines to

facilitate accurate coding and by a greater variety of

examples of parent and child behaviors to illustrate the

coding principles. Attempts have been made by the authors

to operationalize the coding criteria to a greater degree to

reduce and/or eliminate subjective judgments by the coders

whenever possible. For verbalizations, coding rules were

often based on grammatical properties of the words used

rather than on assumptions about the intended meaning of the

words. For vocalizations and nonverbal behaviors, the

coding rules were designed to provide sufficient observable

behavioral criteria to facilitate reliable coding.

The second major change involved the addition of

several behavioral categories and the removal of several

other categories. Table 2 lists the original categories of










Table 2

Categories of the DPICS and DPICS II


Parent Behavior                         Child Behavior

Playtalk                                Playtalk
Acknowledgment                          Acknowledgment
Behavioral Description(a)               Behavioral Description
Information Description(a)              Information Description
Reflective Statements                   Reflective Statements
Descriptive/Reflective Questions        Descriptive/Reflective Questions
Information Questions                   Information Questions
Indirect Commands                       Indirect Commands
Direct Commands                         Direct Commands
Labeled Praise                          Labeled Praise
Unlabeled Praise                        Unlabeled Praise
Criticism                               Criticism
Smart Talk                              Smart Talk
Yell                                    Yell
Whine                                   Whine
Laugh                                   Laugh
Physical Negative                       Physical Negative
Physical Positive                       Physical Positive
Destructive                             Destructive
Compliance                              Compliance
Noncompliance                           Noncompliance
No Opportunity for Compliance           No Opportunity for Compliance
Answer                                  Answer
No Answer                               No Answer
No Opportunity for Answer               No Opportunity for Answer
Contingent Labeled Praise
Warning

(a) In DPICS, descriptions were coded Descriptive Statement.









the DPICS along with the categories that have been added in

DPICS II, which are shown in bold typeface. While DPICS

focused on behaviors that seemed most salient for assessment

and treatment (i.e., frequency of the child's inappropriate

behaviors, such as noncompliance, frequency of parents'

praise, commands, and criticism), in the DPICS II, the

categories for child and parent behaviors have been designed

to be reflexive, in that the same verbal and motor behaviors

are coded for both the parent and child. Although this

change greatly expands the total number of categories in the

system, the reflexive nature of the parent and child

categories enables coders to learn one set of rules for the

categories which then apply to both the parent and child

behaviors.

By allowing coding of the same behaviors for parents

and children, DPICS II may be a more useful research and

clinical tool. It is possible for clinicians and

researchers to use the DPICS II to describe behaviors within

the interaction that may elicit or maintain the children's

behavior problems and assess changes in these behaviors

during and after treatment. For example, parents' modeling

of inappropriate behaviors may be associated with the

children's use of these inappropriate behaviors.

Specifically, parents' use of critical statements and smart

talk may be associated with higher levels of criticisms,

smart talk, whine, and yell in their children.










In addition to inappropriate behaviors, the DPICS II

may also be useful in assessing socially appropriate

behaviors. Because children with behavior problems have

been found to be less socially competent than non-referred

children (La Freniere, Dumas, Capuano, & Dubeau, 1992),

categories for coding child appropriate verbalizations are

included to allow researchers and clinicians to use the

DPICS II to examine the child behaviors related to prosocial

skills.

Users of the original DPICS will note several new

categories, and other categories from the original system

have been divided into smaller units. Playtalk, for

example, a category that previously has been used only in

research (Forster, Eyberg, and Burns, 1984; Mee, 1991), is

intended to capture indirect communication in the form of

parent and child verbalizations that are spoken as if by

characters in their game. Forster, Eyberg, and Burns (1990)

and Speltz, DeKlyen, Greenberg, & Dryden (1995) found that

Playtalk was more frequent in families with conduct problem

children. In contrast, Mee (1991) found very low rates of

Playtalk in a group of children with behavior problems;

however, low reliability for this category found in Mee's

study may account for her findings. Further study of

Playtalk, provided it can be coded reliably, appears

warranted given these mixed findings.










The DPICS categories of Descriptive Statements and

Descriptive/Reflective Questions have each been divided into

two categories. The subdivision of the original categories

was intended to capture differences in the directiveness

and/or attentiveness communicated in the verbalization. The

Descriptive Statements category, for example, was divided

into Behavioral Descriptions, (i.e., statements that follow

the other person by directly describing the on-going or

immediately completed behavior of the other member of the

dyad), and Information Descriptions (i.e., statements that

describe aspects of the play situation (e.g., toys,

feelings, behaviors of the speaker) that need not be related

to the behavior of the other member of the dyad). The DPICS

category of Descriptive/Reflective Question also has been

divided into two types of questions: (a) Information

Questions that request a verbal response from the listener

that is more than an acknowledgment, and (b)

Descriptive/Reflective Questions that require no more than a

brief verbal acknowledgment from the listener.

The description and question categories were divided to

enable separate recordings of components of interactions

that may be directive or coercive and have a higher

likelihood of eliciting noncompliance or inappropriate

behavior in children. The division of these categories is

intended to allow researchers and clinicians to examine

further the maladaptive interactions that distinguish










children with behavior problems and their parents from

families in which these difficulties are not reported.

In addition to research applications, the DPICS II is

helpful in clinical settings as a way to monitor parents'

use of directive and non-directive verbal behaviors

frequently taught in behavior management treatment programs

for young children. The DPICS, and more recently the DPICS

II, have been used as an assessment device for Parent-Child

Interaction Therapy (PCIT), a parent training program

designed to treat children with behavior problems and their

parents (Eyberg, 1979, 1988; Hembree-Kigin & McNeil, 1995).

It is anticipated that some of the categories, such as

Contingent Labeled Praise, Behavioral Descriptions, and

Labeled Praise, will occur infrequently in parents who have

not received skills training. However, inclusion of these

categories will enable clinicians and researchers to use the

DPICS II to assess positive attending skills in parents,

such as praise and descriptions, during and after treatment.

Some additional categories in the system may provide

more information about important sequences of behavioral

events in parent-child interactions. Whereas the DPICS

allowed for coding of two sequences, parental response to

deviant child behavior and child compliance to parental

commands, the DPICS II allows several additional sequences

to be measured in the interaction. For example, parental

Labeled Praise following child Compliance to commands is now










coded as Contingent Labeled Praise, and whether a question

requesting information has been answered appropriately or

not also can be recorded. This sequential information may

be helpful in determining the amount of reciprocity in the

dyad and may be a feature of parent-child dyads that

distinguishes clinic-referred families from normal families.
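To illustrate how such a sequence might be tallied from a stream of coded events, the sketch below counts parental Labeled Praise that immediately follows child Compliance. The adjacency rule and the data format are assumptions made purely for illustration; the actual coding criteria are defined in the DPICS II manual.

    def count_contingent_labeled_praise(events):
        """events: ordered (actor, category) tuples from one observation session."""
        count = 0
        for previous, current in zip(events, events[1:]):
            if previous == ("child", "Compliance") and current == ("parent", "Labeled Praise"):
                count += 1
        return count

    session = [
        ("parent", "Direct Command"),
        ("child", "Compliance"),
        ("parent", "Labeled Praise"),    # tallied as Contingent Labeled Praise
        ("parent", "Direct Command"),
        ("child", "Noncompliance"),
    ]
    print(count_contingent_labeled_praise(session))  # 1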

Finally, several categories contained in the original

system have been removed. Reasons for removing categories

included poor interobserver agreement, infrequency of the

behavior, and lack of utility of the category. Parent
categories that are no longer contained in the system are
Irrelevant Verbalization, Responds to (Child) Deviant, and
Ignores (Child) Deviant. Child categories that have been

eliminated are Change Activity and Cry.

One study has examined the reliability of a subset of

the DPICS II categories which comprise the Clinic Version of

the DPICS II. Bessmer (1993) examined the reliability of

the parent verbalization categories, the child inappropriate

behavior categories, the child response to commands

categories and the child response to Information Questions

categories in a sample of 20 nonreferred mother-child dyads

recruited from the Gainesville, FL, community. To be

included in the study, the children had to fall within the

normal range on the Problem Score of the Eyberg Child

Behavior Inventory (less than 11) (Eyberg & Ross, 1978).

The dyads were coded "live" rather than from videotape, and










reliability estimates were obtained using Pearson product-

moment correlations, percent agreement on occurrence only,

and Cohen's kappa (Cohen, 1960) for the CDI and PDI

situations as summarized in Tables 3 and 4 in Appendix A.

Information Question, Descriptive/Reflective Question,

Behavioral Description, Acknowledgment, and Playtalk

categories in CDI demonstrated excellent reliability (i.e.,

correlations and percent agreement > .80 and kappa > .75).

The parent categories with good to fair reliability

estimates (correlations and percent agreement > .60 and

kappa > .40) were Direct Command, Indirect Command,

Information Description, Reflective Statement, and Unlabeled

Praise. Labeled Praise and Criticism/Smart Talk had poor

reliability estimates during the CDI situation.
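These benchmarks can be summarized in a small helper function. The cutoffs below simply restate the values used in this description (correlation and percent agreement above .80 with kappa above .75 for excellent; above .60 and .40 for good to fair) and are a rough summary for illustration, not a formal rule from the DPICS II manual.

    def classify_reliability(correlation, percent_agreement, kappa):
        """Rough benchmark labels for a category's reliability estimates."""
        if correlation > .80 and percent_agreement > .80 and kappa > .75:
            return "excellent"
        if correlation > .60 and percent_agreement > .60 and kappa > .40:
            return "good to fair"
        return "poor"

    print(classify_reliability(.91, .88, .82))  # excellent
    print(classify_reliability(.70, .65, .45))  # good to fair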

In the PDI situation shown in Table 4 in Appendix A,

Information Question, Unlabeled Praise, and Playtalk had

excellent reliability. The categories with good to fair

reliability estimates were Direct Command, Indirect Command,

Descriptive/Reflective Question, Information Description,

Reflective Statement, Acknowledgment, and Criticism/Smart

Talk. The categories with poor reliability were Behavioral

Description and Labeled Praise.

On all measures of reliability, the child categories of

Answer and No Opportunity for Answer had excellent

reliability during CDI (See Table 5 in Appendix A). If two

out of three methods of calculating reliability are









considered, Compliance, Noncompliance and No Opportunity for

Compliance appeared highly reliable. No Answer appeared to

have good to fair reliability estimates. Possibly as a

result of the infrequency of the behaviors they measure, the

other child categories demonstrated poor reliability.

In the PDI situation, the categories of Compliance, No

Opportunity for Compliance, Answer, and No Opportunity for

Answer were highly reliable (See Table 6 in Appendix A).

Noncompliance and No Answer had fair to good reliability

across the three methods. Similar to the results in CDI,

the child inappropriate behavior categories occurred rarely

or never, yielding no estimate or very low estimates of

reliability.

The Bessmer (1993) study provided reliability

information about the categories in the Clinic version of

DPICS II using live coding of a normal sample of mother-

child dyads. That study showed that many of the categories

could be reliably coded. Because a non-referred sample is

likely to exhibit some of the behaviors measured by the

DPICS II at extremely low rates, or not at all, the

reliability of some of the behavioral categories could not

be examined. Low reliability estimates were obtained for

categories such as Labeled Praise, Behavioral Description,

and Criticism/Smart Talk. These low estimates were likely

influenced by the very low frequency of occurrence and may

not represent a fair evaluation of the reliability of these










categories if examined under different conditions. In

addition, many of the child inappropriate behaviors were not

coded in any of the dyads, making computation of reliability

estimates impossible. To obtain information about the

reliability of the DPICS II categories that would occur with

a clinic sample, the reliability of the categories needs to

be evaluated in a sample of children with significant

behavior problems.

Preliminary information on the validity of the DPICS II

categories was obtained in a pilot study (Bessmer & Eyberg,

1993) which compared the 20 non-referred mother-child dyads

used in the Bessmer study to a sample of 20 mother-child

dyads referred for treatment of Oppositional Defiant

Disorder. The two groups were not significantly different

on demographic characteristics such as age, sex, and race of

the children. The two groups were found to be significantly

different on socio-economic status (SES) level using

Hollingshead's Four Factor Index of Social Status

(Hollingshead, 1975). The non-referred group obtained a

mean SES score of 51.5 while the clinic-referred group

obtained a mean score of 35.5. The differences in SES level

are likely related to the high level of education in the

non-referred sample which was recruited from a university

community. The two groups of children also differed

significantly on their mean standard score on the Peabody

Picture Vocabulary Test--Revised (PPVT-R), but were each










within one standard deviation of the normative mean, as

shown in Table 7. As expected, the two groups differed

significantly on the Eyberg Child Behavior Inventory (ECBI),

a measure of behavior problems.



Table 7

Scores on Measures Used to Screen Dyads


                          Normative       Referred        Nonreferred
                          M (SD)          M (SD)          M (SD)

PPVT-R Standard Score     100 (15)a       95.2 (13.7)     106.89 (14.02)*

ECBI Problem Score        7 (8)b          22.2 (6.6)      3.75 (2.73)**

ECBI Intensity Score      97 (35)b        171.5 (27.4)    98.8 (16.77)**

Note: Significance levels refer to comparisons between referred and
non-referred samples in the Bessmer and Eyberg pilot study (1993).
a(Dunn & Dunn, 1981); b(Colvin, Eyberg, & Adams, 1996); *p < .01;
**p < .001.



The mean frequencies of occurrence of the DPICS II
categories in the Clinic version were compared for the two

groups using multiple t tests. The complete results of the

t tests are reported in Tables 8-11 in Appendix A. The

comparisons of the two groups revealed significant










differences during the CDI situation on the parent

categories of Information Description and Acknowledgment.

During the PDI situation, the parent categories of Direct

Command, Information Description, Acknowledgment, Unlabeled

Praise, Criticism/Smart Talk, and Playtalk were

significantly different between groups. Fewer differences

were observed in the child categories. In the CDI

situation, there were no significant differences between

groups on the child categories, and in the PDI situation,

only the child behavior category of Noncompliance was

significant. The summary variable of Total Inappropriate

Behavior also was significantly different between groups in

PDI.

The differences between the two groups in the parent

and child categories during PDI are consistent with the

literature on children with behavior problems. In general,

behaviorally disordered children displayed significantly

more inappropriate behavior and were significantly more

noncompliant than non-referred children. The parents of

children with behavior problems issued more commands and

more critical statements than parents of nonreferred

children.

In this pilot study, non-referred parents were found to

provide more Information Descriptions and Acknowledgments

during CDI and PDI and more Unlabeled Praise during PDI.

These findings suggested that non-referred parents were more








interactive, positive, and responsive to their children than

the parents of clinic-referred children. While these data

support the validity of the DPICS II in discriminating

between clinic-referred and non-referred dyads, the

demographic confounds and the multiple t tests may have
accounted for some of the findings, and these results should

be considered preliminary.

A further investigation of the relationship between

DPICS II categories and the child's SES and PPVT-R scores

was completed because of the significant differences between

groups found on these characteristics. The correlations

between SES, PPVT-R scores and DPICS II variables relevant

to the present study were computed on a sample of clinic-

referred subjects (N = 47) drawn from the treatment outcome

study subjects. The correlation matrices for these analyses

are presented in Tables 12 and 13 in Appendix A.

Significant correlations were found between SES and several

child behavior categories: 1) Noncompliance in CDI (r =

.31); 2) Smart Talk in PDI and CU (r = -.31 and -.36,

respectively); 3) No Opportunity for Compliance in CDI (r =

-.40); and 4) Criticism in CU (r = -.40).

Significant correlations also were found between SES

and several parent categories: 1) Direct Command in CDI (r =

-.42) and CU (r =-.30); 2) Criticism in CDI (r = -.29). No

significant correlations were found between SES and child

Yell, Whine, Destructive, Physical Negative, and Compliance










nor were any correlations significant between SES and the

parent categories of Indirect Command and Labeled Praise.

Given that a large number of correlations were performed,

some significant findings were expected based on chance.

However, for some DPICS variables, significant correlations

with SES occurred in more than one situation, suggesting

that the frequency of occurrence of those variables may vary

as a function of SES. Parent Direct Command and child Smart

Talk, specifically, appeared to be related to SES. These

findings indicate that SES may be associated with some of the

categories in the DPICS II. Further analyses using the

DPICS II variables should evaluate the effects of SES in

relation to the categories. No significant correlations

between child PPVT-R standard scores and selected DPICS II

variables were found, indicating that these variables are

not related to the child's receptive vocabulary level.

The purpose of the present study was to examine the

reliability and validity of the research version of the

DPICS II by comparing videotaped interactions of mother-

child dyads in which the child was referred to a psychology

clinic for behavior problems to non-referred mother-child

dyads. Reliability was assessed using intraclass

correlations, percent agreement, and kappa. Validity was

examined by evaluating the degree to which the DPICS II

categories discriminated between clinic-referred and non-

referred dyads. In addition, the association between the










DPICS II categories and the ECBI, the PSI, and the PLOC,

measures which have been shown to discriminate between

clinic-referred and non-referred children, was used to

demonstrate the convergent validity of the coding system for

use in evaluating children with behavior problems.



Hypotheses

Based upon previous studies using DPICS with children

with behavior problems, the following hypotheses were made:

1. Parent-child interactions can be reliably coded using

the DPICS II, as measured by intraclass correlations,

percentage of interrater agreement and Cohen's kappa.

2. The DPICS II will demonstrate discriminant validity for

use with children with behavior problems by replicating

the previous findings about children with behavior

problems and their mothers:

a. Children who have been referred for behavior

problems will have significantly higher rates

of noncompliance compared to non-referred

children.

b. The referred children will have significantly

higher frequencies on the summary variable,

Total Inappropriate Behaviors, which includes

the categories Destructive, Physical

Negative, Yell, Whine, Smart Talk, and










Criticism, compared to children in the

comparison group.

c. The referred children will demonstrate fewer

prosocial behaviors; specifically, they will

have a lower rate of Total Prosocial

Behaviors (Answer + Acknowledgment + Laugh +

Information Descriptions + Behavioral

Descriptions + Physical Positive + Labeled

Praise + Unlabeled Praise).

d. Mothers of children with behavior problems

will have a higher total number of commands

and a higher direct command ratio (Direct

Commands/(Indirect + Direct Commands)). In

addition, mothers of clinic-referred children

will display significantly fewer Total

Prosocial Behaviors (Acknowledgment + Answer

+ Information Description + Behavioral

Description + Laugh + Labeled Praise +

Unlabeled Praise + Reflection + Physical

Positive).

e. Mothers of clinic-referred children will have

a higher rate of Negative Talk (Criticisms +

Smart Talk) and Total Inappropriate Behavior

(Criticisms + Smart Talk + Yell + Whine +

Destructive + Physical Negative) compared to

the mothers of non-referred children.








3. The DPICS II categories and summary variables of Child

Noncompliance, Child Total Inappropriate Behavior,

Child Prosocial Behavior, Parent Total Inappropriate

Behavior, Parent Direct Command Ratio, and Parent Total

Commands will demonstrate discriminant validity by

correctly classifying dyads into their respective

groups (i.e., clinic-referred and non-referred) in a

discriminant function analysis.

4. The DPICS II categories and summary variables of Child

Noncompliance, Child Total Inappropriate Behavior,

Child Prosocial Behavior, Parent Inappropriate

Behavior, Parent Direct Command Ratio, and Parent Total

Commands will demonstrate convergent validity by

accounting for a significant proportion of the

variance in ECBI intensity scores.

5. The DPICS II categories and summary variables for child

behavior (i.e., Noncompliance, Total Inappropriate

Behavior, Prosocial Behavior) will demonstrate

convergent validity by accounting for a significant

proportion of the variance in PSI child domain scores.

6. The DPICS II categories and summary variables for

parents (i.e., Direct Command Ratio, Total Commands,

Inappropriate Behavior, Prosocial Behavior) will

demonstrate convergent validity by accounting for a

significant proportion of the variance in PSI parent

domain scores. In a separate analysis, these parent









categories and summary variables will account for a

significant proportion of the variance in the total

PLOC score.














METHODS

Participants

Two groups of 30 mother-child dyads, a clinic-referred

group and a non-referred comparison group, participated in

the study. All dyads met the following criteria to be

included in the study: 1) The child was between 3.0 and 7.0

years old; 2) English was the primary language spoken in the

home; 3) The mother and child had no history of mental

retardation; 4) The mother's and child's language skills

were sufficient for them to understand each other and for

the observers to understand their verbalizations (e.g., the

child's language skills are at or above the 2 year old level

as measured by the PPVT-R or another standardized measure of

verbal ability).

The 30 mother-child pairs comprising the clinic-

referred group had been referred to the Child Study

Laboratory at the University of Florida Health Sciences

Center for the children's externalizing behavior problems.

The children and their parents were assessed in the Child

Study Laboratory for inclusion in a treatment outcome study.

Children included in the clinic-referred group had met

diagnostic criteria for Oppositional Defiant Disorder based

on their parent's responses to a structured interview









designed to yield DSM-III-R diagnoses. Dyads were eligible

for inclusion in the present study if they met the basic

criteria outlined above, and the child received a Problem

score on the ECBI of greater than 11. Parents signed a

standard consent form indicating that information about

their family would be used for research purposes, and data

used in this study was routinely collected as part of the

standard procedures for all families being assessed for

inclusion in the larger treatment outcome study. The thirty

families representing the clinic-referred group in the

present study were selected randomly from the larger sample

of clinic-referred children. They appear to be

representative of the characteristics of the larger sample.

(See Table 15 in Appendix A for comparison of demographic

variables between current sample and the larger sample they

were drawn from.)

The normal comparison group consisted of 30 mother-

child pairs recruited from advertisements placed in

recreation areas, libraries, preschools, and day-care

centers in the Gainesville area. To be eligible for the

study, the children met the basic inclusion criteria and

also received a Problem score on the Eyberg Child Behavior

Inventory of 11 or less. Additional demographic descriptors

were collected for the families, including parents' level of

education, average annual income, and single- versus

two-parent family status.










The normal comparison group was balanced for age, sex,

and race with the participants in the clinic-referred group.

The groups were not significantly different on these sample

characteristics. The children in the clinic-referred group

ranged in age from 3 years, 1 month to 6 years, 11 months

with the mean age being 4 years, 10 months (SD = 1 year, 1

month). The children in the comparison group ranged in age

from 3 years, 1 month to 6 years, 9 months with the mean age

being 4 years, 9 months (SD = 1 year, 1 month). The clinic-

referred group and the comparison group each consisted of 23

males and 7 females. In the clinic-referred group, 24

children were Caucasian, 3 children were African American, 2

children were Hispanic, and 1 child was of Asian descent but

was being raised by Caucasian parents. In the comparison

group, 25 children were Caucasian, 3 children were African

American, and 2 children were Hispanic.

A summary of the additional measures and demographic

information collected on the participant families is

presented in Table 16. The parents' SES was calculated

using Hollingshead's Four Factor Index, which yields a score

based on parents' education, occupation, sex, and marital

status (Hollingshead, 1975). The mean SES for clinic-

referred families was 38.3 (SD = 12.4) and ranged from 14 to

61. The mean SES for the comparison families was 48.9 (SD =

10.8) and ranged from 31 to 66. The clinic-referred group's










Table 16

Sample Characteristics


Variable Clinic-referred Comparison
(n = 30) (n = 30)


Sex of child (% male) 77% 77%

Mean child age (months) 58.5 57.1

Family composition (%)
Two-parent 60.0 86.0
Single-parent 40.0 14.0

Maternal age (years) 30.7 34.4

Maternal educationa 4.2 5.6

Maternal employment (%)
Employed 63.3 66.7
Not employed 36.7 33.3

Ethnicity (%)
Caucasian 80.0 82.0
African American 10.0 11.0
Hispanic 7.0 7.0
Other 3.0 0.0

PPVT-R standard score
M 95.1 102.1
SD (14.3) (14.4)

SES**
M 38.3 48.9
SD (12.4) (10.8)

Hollingshead SES (n)
Class V 1 0
Class IV 8 0
Class III 7 6
Class II 11 14
Class I 3 10


Note. a Maternal education was assessed on a scale of 1 to 7
(1 = less than 7 years of education, 7 = graduate degree).
** p < .01.










mean falls within Hollingshead's (1975) Class III SES

category, which he described as consisting of skilled

craftsmen, clerical, sales workers. The comparison group's

mean falls within the Class II category, described as medium

business, minor professional, and technical workers. The

significant difference found in SES between the groups may

be related to the markedly higher proportion of single

parent families in the clinic-referred group (40% versus 14%

in the comparison group). Mothers in the clinic-referred

group had typically graduated from high school while mothers

in the comparison group were more likely to have received

some college education.
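
For illustration only, the Four Factor computation can be

sketched as follows. The sketch assumes the commonly cited

weighting of occupation (scored 1 to 9) by 5 and education

(scored 1 to 7) by 3, averaged across employed parents; the

authoritative scoring tables are those of Hollingshead (1975),

and the values below are hypothetical.

    # Illustrative sketch of the Hollingshead (1975) Four Factor Index.
    # Assumption: status = occupation x 5 + education x 3, averaged across
    # employed parents; see Hollingshead (1975) for the occupation (1-9)
    # and education (1-7) scoring tables.
    def hollingshead_ses(parents):
        """parents: list of (education_score, occupation_score) pairs,
        one pair per employed parent in the household."""
        scores = [occupation * 5 + education * 3
                  for education, occupation in parents]
        return sum(scores) / len(scores)

    # Hypothetical single mother with some college (education = 4)
    # working in a clerical position (occupation = 5):
    print(hollingshead_ses([(4, 5)]))    # 37.0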

The clinic-referred and comparison groups were expected

to differ significantly on the measures related to behavior

problems and these differences were, in fact, found. The

clinic-referred group demonstrated significantly elevated

scores on all measures compared to the non-referred group.

Mothers in the clinic-referred and comparison group rated

their children on the ECBI as having mean intensity scores

of 176.9 (SD = 25.9) and 103.8 (SD = 23.3), respectively.

On the problem score, the clinic-referred children were

rated as having a mean of 24.0 problem behaviors (SD = 6.4)

and the comparison children had a mean of 4.2 problem

behaviors (SD = 3.2). It is important to note that the

problem score was used to select the participants, and

therefore, differences on this scale are anticipated.










On the PSI, mothers of clinic-referred children

reported a mean total stress score of 293.7 (SD = 34.7),

which falls at the 95th percentile. Mothers of children in

the comparison group received a mean total stress score of 216.1

(SD = 39.0), which falls at the 40th percentile. On the

PLOC, mothers of clinic-referred children received a mean

total score of 131.0 (SD = 14.9). Mothers of children in

the comparison group received a mean total score of 114.5

(SD = 12.9). The PLOC scores are similar to scores reported

for clinic and non-clinic groups. Roberts, Joe, and Rowe-

Hallbert (1992) reported the mean test score for a non-

clinic group was 108.2 (SD = 16.8) and was 121.7 (SD = 15.2)

for a clinic-referred sample.


Table 17

Scores on Measures Used to Compare Participants



Measure Clinic-referred Comparison
M SD M SD

ECBI-I*** 176.9 (25.9) 103.8 (23.3)

ECBI-P*** 24.0 (6.4) 4.2 (3.2)

PSI total*** 293.7 (34.7) 216.1 (39.0)

PSI child*** 146.1 (17.3) 98.2 (19.5)

PSI parent*** 147.3 (22.3) 118.0 (22.9)

PLOC*** 131.0 (14.9) 114.5 (12.9)

Note: Scores on the PLOC for the comparison group are based
on n = 24. All other analyses are based on n = 30 for the
clinic-referred group and the comparison group. *** p < .001.










Measures

The Eyberg Child Behavior Inventory

The Eyberg Child Behavior Inventory (ECBI; Eyberg, in

press) consists of 36 items describing typical problem

behaviors for children. Parents rate the frequency with

which these behaviors occur on a scale of 1 (never occurs)

to 7 (always occurs). An intensity score, ranging from 36-

252, may be derived by summing these ratings. A problem

score, ranging from 0-36, is derived by summing the number

of child behaviors deemed problematic by the parent.
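
For illustration, the scoring just described can be sketched as

follows (hypothetical ratings rather than actual ECBI responses):

    # Sketch of ECBI scoring as described above (hypothetical data).
    def score_ecbi(ratings, problems):
        """ratings: 36 frequency ratings from 1 (never) to 7 (always).
        problems: 36 booleans, True if the parent identifies the
        behavior as a problem."""
        intensity = sum(ratings)     # possible range 36-252
        problem = sum(problems)      # possible range 0-36
        return intensity, problem

    # A child rated 3 on every item, with 5 behaviors seen as problems:
    print(score_ecbi([3] * 36, [True] * 5 + [False] * 31))    # (108, 5)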

The ECBI was recently restandardized with 798 children

and adolescents aged 2 to 16 drawn from several pediatric

health care settings (Colvin, Eyberg, & Adams, 1996). The

participants were representative of the demographic

composition of the Southeastern United States. The internal

consistency coefficient for the ECBI Intensity score was

.94, and was .93 for the Problem score. The mean Intensity

scores (n = 221) ranged from 113.0 for children three years

old to 88.0 for children six years old. The mean Problem

scores (n = 221) ranged from 8.1 for children three years

old to 4.8 for children six years old. The ECBI Intensity

and Problem Scale scores were not significantly correlated

with Hollingshead SES scores, child age, or ethnic group.

The test-retest reliability coefficients for the Intensity

Scale and Problem Scale have been reported as r = .85 and r

= .80, respectively, after a three month interval and r =










.75 and r = .75, respectively, after a ten month interval

(Eyberg, 1992). The ECBI has demonstrated external validity

by differentiating children with behavior problems from

children without behavior problems (Eyberg & Ross, 1978) and

by showing pre- to post-treatment changes (Eyberg &

Robinson, 1982).



Parenting Stress Index

The Parenting Stress Index (Abidin, 1990), a 101-item

pencil and paper measure with an optional Life Stress scale

consisting of 19 items, was originally developed as a

screening measure for the detection of stressors within a

parent-child system commonly associated with dysfunctional

parenting. The items of the PSI are divided into two

domains, the Child Domain and the Parent Domain, each of

which is further divided into subscales. The 47 items

comprising the Child Domain are divided into 6 subscales:

Adaptability, Acceptability, Demandingness, Child Mood,

Distractibility, and Reinforces Parent.

The Parent Domain has 7 subscales: Parent Depression,

Parent Attachment, Restrictiveness of Parental Role,

Parental Sense of Competence, Social Isolation, Relationship

with Spouse, and Parental Health. Finally, the optional

Life Stress Domain assesses the number of major changes in

the family's environment, such as death in the family, job

changes, moving, etc.











Normative information on the PSI has been gathered on

large samples of parents recruited from a variety of public

and private pediatric clinics in the United States (Abidin,

1990). Percentiles, means and standard deviations are

available for the domain scores and the total score by child

age (See Table 14 for age groups relevant to this study).


Table 14

Normative Data for the Parenting Stress Index


                 3 YEARS      4 YEARS      5 YEARS      6 YEARS

                 M     SD     M     SD     M     SD     M     SD


Total
Stress          221    38    229    40    224    30    222    37

Child
Domain           97    18    103    19    101    16     99    20

Parent
Domain          122    23    126    26    123    27    121    21


From Abidin, 1990.



Internal reliability coefficients for the subscales of

the PSI have been determined for both the original

standardization sample of 534 parents who obtained services

from small group pediatric clinics in central Virginia and

from a cross-cultural sample of 435 parents from Bermuda and

the United States (Hauenstein, Scarr, & Abidin, 1986 as

cited in Abidin, 1990). The internal reliability












coefficients from both of these samples were fairly

consistent. The alpha coefficients for both samples ranged

from .59 to .78 for the Child Domain subscales and from .55

to .80 for the Parent Domain subscales. Based on the sample

of 534 parents, the internal reliability coefficients for

the Child Domain Total Scale score was .89, for the Parent

Domain Scale Total score was .93, and for the Total stress

score was .95.

Test-retest reliability coefficients have been computed

for intervals ranging from three weeks, three months, and

one year. Both domain scores and the total stress score

were shown to have adequate test-retest reliability

coefficients.

The PSI has been found to be correlated with both the

Intensity and the Problem Scales on the ECBI (Eyberg, Boggs,

& Rodriquez, 1992), indicating that child disruptive

behaviors are associated with maternal stress. The

correlation between the ECBI intensity score and the PSI

total score was .59, .45 for the parent domain score, and

.59 for the child domain score. The correlation between the

ECBI problem score and the total score was .60, .45 for the

parent domain score, and .62 for the child domain score.

Discriminant validity has been demonstrated by Mouton and

Tuma (1988) who found that the PSI parent and child domain

scores were significantly higher for clinic-referred mothers

compared to control mothers.










The Parental Locus of Control (PLOC)

The Parental Locus of Control Scale (Campis, Lyman, &

Prentice-Dunn, 1986) consists of 47 items intended to assess

parents' attitudes about their ability to influence their

children's behavior. The test items are rated by parents on

a five-point scale. These ratings are summed to yield a

total score. In addition, the PLOC is divided into five

subscales labeled Parental Efficacy, Parental

Responsibility, Child Control of Parent's Life, Parental

Belief in Fate/Chance, and Parental Control of Child's

Behavior. The PLOC total scale score was reported as having

high internal consistency (Cronbach's alpha = .92). The

Cronbach's alpha reliability coefficients for the subscales

were .75 for Parental Efficacy, .77 for Parental

Responsibility, .67 for Child Control, .75 for Fate/Chance,

and .65 for Parental Control (Campis, Lyman, & Prentice-

Dunn, 1986).

The test-retest reliability coefficient for the PLOC

was r = .829 after a mean interval of 16 days (Roberts, Joe,

& Rowe-Hallbert, 1992). Campis, Lyman, & Prentice-Dunn

(1986) found that parents whose children had

behavior/emotional problems had significantly higher total

scores than a non-problem sample, indicating that they

perceived their child's behavior as less under their

control. Comparing the results of behavioral observation

with the PLOC, the total score was found to be negatively










associated with child compliance (r = -.349, p < .01), and

positively associated with negative talk (r = .346, p <

.01), cry/yell (r = .276, p < .05), and the ECBI intensity

score (r = .259, p < .05) (Roberts, Joe, & Rowe-Hallbert,

1992).



Peabody Picture Vocabulary Test--Revised (PPVT-R)

The PPVT-R is a measure of receptive or hearing

vocabulary for American Standard English (Dunn & Dunn,

1981). The PPVT-R has two forms, Form L and M, which each

contain 175 items. The measure is individually administered

to participants who are asked to select verbally or

nonverbally the picture which best represents each test item

verbally presented by the examiner. The PPVT-R is easily

and quickly administered because only those items between

the child's basal and ceiling level are administered. The

instrument also can be rapidly scored. The raw scores then

can be converted to standard scores with a mean of 100 and

standard deviation of 15.
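
Conceptually, the standard-score metric can be illustrated as

follows; in practice the conversion is made with the age-norm

tables in the test manual, and the norm values below are

hypothetical.

    # Conceptual illustration of the standard-score metric (M = 100,
    # SD = 15); the PPVT-R manual's age-norm tables are used in practice.
    def standard_score(raw, norm_mean, norm_sd):
        z = (raw - norm_mean) / norm_sd    # position within the age norms
        return 100 + 15 * z

    # A raw score one standard deviation above the age-group mean:
    print(standard_score(raw=68, norm_mean=60, norm_sd=8))    # 115.0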

The PPVT-R was standardized on 4,200 children between

the ages of 2 1/2 and 18 years old, with 100 children of

each sex at each age level. The sample of children

approximated the 1970 U.S. census data for sex, age,

geographic location, occupational background, racio-ethnic,

and urban-rural population distributions. Internal

consistency coefficients of the PPVT-R, Form L, ranged from










.67 to .88, based on split-half reliability procedures.

Test-retest reliability coefficients for standard scores

ranged from .54 to .90, with a median value of .77. Test-

retest reliability was evaluated on a subsample of 962

children with the retest interval ranging from 9 to 31 days.

IQ scores and PPVT-R scores have been found to be correlated

between .40 and .60. However, the PPVT-R scores should not

be interpreted as intelligence scores.



Procedures

The videotaping and administration of questionnaires to

both groups of dyads were conducted in a standardized manner

during two data collection sessions held one week apart.

All data on the clinic-referred sample used in this study

were collected as part of a more extensive, standard

assessment conducted for a treatment outcome study. The

clinic-referred families were paid $50 for their

participation by the treatment outcome researchers. The

mothers of the non-referred children were recruited through

advertisements placed in the community. After completing

all of the data collection, the mothers were paid $20 for

their participation.

A total of forty-two non-referred mother-child dyads

completed the data collection process. Twelve dyads were

excluded from analysis in the present study for a variety of

reasons. The predominant reason for exclusion of subjects










was to create a non-referred sample that matched the

demographic characteristics of the clinic-referred sample.

For example, too many female subjects and three-year olds

were initially included in the non-referred group. In

addition, three children were excluded from the non-referred

group because of high maternal ratings on the ECBI Problem

Scale.

To maintain confidentiality of the dyads, the

videotapes and questionnaires, including demographic

information, were labeled only with a number and were kept

in locked files which were accessible only to the

researchers. Observers did not have access to demographic

or identifying information about the participant families.

For videotaping of both groups, the mother and child

were brought to a playroom in the Child Study Laboratory

where the same five age-appropriate toys1, selected for

their unstructured, interactive quality, were provided.

There, the dyad was videotaped from behind a one-way mirror

in the three DPICS II standard situations. The CDI and PDI

situations were videotaped for 10 minutes and CU was

videotaped for 5 minutes. Coding was completed on the last

five minutes of CDI and PDI and the five minutes of CU.

These situations were first described to the parents

before taping. During the observations, the parents wore a


1 The five toys used in the assessment are Nesting Animals,
Lincoln Logs, Waffle Blocks, Magna Doodles, and the Sesame Street
Garage.










bug-in-the-ear device, an audio receiver worn in the ear

similar to a hearing aid. This device was used to signal

unobtrusively to the mothers when CDI began and when to

change from one situation to another. At five minute

intervals, the parents were read standard instructions over

the device from a transmitter in the observation room.

For the first situation, CDI, the following directions

were given:

"In this situation tell that he/she

may play with whatever he/she chooses. Let

him/her choose any activity he/she wishes. You

just follow his/her lead and play along with

him/her."

After the 5 minute warm-up period, the parent was told:

"You're doing a nice job of allowing

to lead the play. Please continue to let him/her

lead."

In the second situation, PDI, the following instructions

were given:

"That was fine. Do not clean up the play things

at this time. Now we'll switch to another

situation. Tell (child's name) that it is your turn to

choose the game. You may choose any activity.

Keep him/her playing with you according to your

rules."

After the 5 minute PDI warm-up period, the parent was told:










"You're doing a nice job of leading the play.

Please continue to get (child's name) to play along

with you according to your rules."

For the third situation, CU, the parent was given the

following directions:

"That was fine. Now I'd like you to tell

that it is time to leave the playroom

and the toys must be put away. Make sure you have

him/her put the toys away by him/herself. Have

him/her put all the toys in their containers and

all the containers in the toybox."



Observers

Four graduate students, employed as DPICS II coders in

the aforementioned treatment outcome study, served as

primary coders for both the clinic-referred group and the

comparison group. These coders also served as reliability

coders for approximately half of the clinic-referred

subjects.

Before beginning to code videotapes for the present

study, all observers had successfully completed training

procedures for the DPICS II in accordance with the

recommendations provided by The Workbook: A coder training

manual for the Dyadic Parent-Child Interaction Coding System

II (Eyberg, Edwards, Bessmer, & Litwins, 1994). Training

consisted of a minimum of 30 hours of didactic training in









DPICS II which included reading the coding manual, studying

and successfully completing paper and pencil training

exercises and quizzes, and coding transcripts of actual

parent-child interactions. The observers then coded

training videotapes with a transcript, coded videotapes with

feedback from a trained coder, and finally coded criterion

tapes to evaluate their level of mastery. The coders were

considered successfully trained when they achieved a minimum

of 80% agreement with "correct" codings of a criterion tape

on selected codes specifically related to the hypotheses

(e.g., parent command, child compliance, parent critical).

During the period of data coding, unannounced checks on

reliability were conducted intermittently (the first author

re-coded tapes at random intervals), and the reliability

estimates were shared with the observers in order to

decrease observer drift. Training sessions were held weekly

with the observers as part of their coding responsibilities

for the larger treatment outcome study. These sessions were

used to discuss any ambiguities found in the coding manual

and to practice coding categories that were considered

difficult or had poor reliability.

The primary observers coded three videotaped situations

collected from each of the two observation periods for a

total of six five-minute segments of videotape. Their

observations were recorded using a computer software package,

Tape (Eyberg & Celebi, 1993), which documents the sequence










in which behaviors occurred, as well as the time they

occurred. To ensure optimal reliability, observers prepared

brief written logs for each five-minute situation coded,

noting the beginning and ending time of each segment and

noting difficult-to-understand verbalizations. If the

primary coder understood the verbalization, he or she wrote

down the words with no punctuation or code. However, if the

verbalization was unintelligible, the observer noted this on

the log so that the reliability coder would not attempt to

code this behavior. The logs were kept in order to aid the

reliability coder in: a) coding the same segment of tape,

using the same starting point and b) coding the same words

as the primary coder whenever possible.

Videotapes of the comparison group were assigned to the

observers initially by a faculty supervisor and later by the

author. Although the observers were aware that they would

be assigned tapes of families in the comparison group, they

were kept blind to the group membership of the dyads on the

videotapes and to the hypotheses of the present study.

Videotapes of dyads in the comparison group were stored with

the videotapes of clinic-referred children and were labeled

in a similar manner. Continued attempts were made to keep

the observers uninformed of group membership throughout the

coding process. However, for approximately 50% of the

tapes, the observers were aware of group membership when

they coded tapes for payment outside of their assistantship










duties. The observers did remain uninformed as to the

hypotheses throughout the study.

Reliability was assessed on 50% of the five-minute

segments for each family. A reliability observer re-coded

three of the six segments for each participant family. For

the clinic-referred group, reliability coding had been

partially completed by observers for the larger treatment

outcome study. The author, serving as an additional

reliability coder for the present study, re-coded videotapes

of the clinic-referred families as necessary so that 50% of

the segments were coded, equally distributed across

situations (CDI, PDI, CU) and data collection sessions (week

1, week 2). Approximately half (42 out of 90) of the

segments were coded by the author for reliability on the

clinic-referred group.

To assess reliability for the comparison group, the

author re-coded 50% of the five-minute segments of videotape

for each dyad in the comparison group. The three situations

were coded for each family equally distributed across week 1

and week 2 data collection sessions.














RESULTS

Psychometric Properties of Measures

All of the measures used to evaluate the DPICS II

categories for validity were first evaluated for internal

consistency within the present samples. Cronbach's index of

internal consistency (Cronbach, 1951) was computed for the

mothers' responses to the PSI, the PLOC, and the ECBI

Intensity score to assure measurement reliability for the

specific populations sampled. The internal consistency

estimates for the clinic-referred and non-referred groups

combined were as follows: alpha = .95 for the PSI total

score, alpha = .94 for the PSI child scale, alpha = .93

for the PSI parent scale, alpha = .85 for the PLOC total

score, and alpha = .95 for the ECBI intensity score.

Although the internal consistency estimate for the PLOC was

somewhat lower than previously reported estimates, the

internal consistency estimates of the remaining measures

used in this study were consistent with previous published

findings, and all were adequate for use in further analyses.
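
For illustration, coefficient alpha can be computed as in the

following sketch (hypothetical item ratings rather than the

questionnaire data analyzed here):

    # Sketch of Cronbach's (1951) coefficient alpha.
    import numpy as np

    def cronbach_alpha(items):
        """items: 2-D array; rows are respondents, columns are items."""
        items = np.asarray(items, dtype=float)
        k = items.shape[1]
        item_variances = items.var(axis=0, ddof=1).sum()
        total_variance = items.sum(axis=1).var(ddof=1)
        return (k / (k - 1)) * (1 - item_variances / total_variance)

    # Hypothetical 1-7 ratings from four mothers on three items:
    print(round(cronbach_alpha([[5, 6, 5], [2, 3, 2],
                                [4, 4, 5], [6, 7, 6]]), 2))    # 0.97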



Reliability

Percent agreement (average and weighted), Cohen's kappa

(1960), and intraclass correlations (P) were computed to










determine the reliability of the parent and child

categories. Reliability estimates were obtained for both

groups in the three DPICS situations on the 50% of the

segments observed by both the reliability and primary

coders. The average percent agreement and kappa were

computed using a computer software program developed

specifically for the DPICS II (Eyberg & Celebi, 1993). The

average percent agreement was computed using a five-second

window, which considers the same codes from the two raters

that are within five seconds of each other as agreements.

Percent agreement was based on occurrences only from either

the primary or reliability coder. In other words, only

behaviors that were coded by at least one observer were

included in the calculations. In this real-time coding

system, there are no "empty" intervals (i.e., intervals in

which there is no occurrence of behavior coded). Weighted

percent agreement was calculated by dividing the sum of

agreements across the subjects by the sum of the agreements

plus disagreements across subjects. Kappa estimates were

obtained using a one-second window and also were based on

occurrences only from either the primary or reliability

coder. Intraclass correlations were computed by a formula

derived by Fleiss (1981) which is based on analysis of

variance procedures.
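
The windowed, occurrence-based agreement logic can be

illustrated with the following sketch. It is not the algorithm

of the Tape program itself; the observer records below are

hypothetical, and each coded event is allowed to match at most

one event from the other observer.

    # Illustrative sketch of occurrence-based agreement with a time window.
    # Each observer record is a list of (time_in_seconds, code) events.
    def windowed_agreement(primary, reliability, window=5):
        """Identical codes from the two observers occurring within
        `window` seconds of each other are counted as agreements."""
        unmatched = list(reliability)
        agreements = 0
        for time, code in primary:
            match = next((event for event in unmatched
                          if event[1] == code
                          and abs(event[0] - time) <= window), None)
            if match is not None:
                unmatched.remove(match)
                agreements += 1
        disagreements = (len(primary) - agreements) + len(unmatched)
        return agreements, disagreements

    # Weighted percent agreement sums agreements and disagreements
    # across all coded segments before dividing.
    a = [(3, "Direct Command"), (10, "Labeled Praise"), (30, "Criticism")]
    b = [(5, "Direct Command"), (12, "Labeled Praise"), (45, "Criticism")]
    agree, disagree = windowed_agreement(a, b, window=5)
    print(agree / (agree + disagree))    # 0.5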

Given the number of categories in the coding system,

percent agreement above 70% was used as the standard for










acceptable reliability (Jones, Reid, & Patterson, 1975).

Fleiss (1981) indicated that kappa values greater than .75

can be considered as representing excellent agreement beyond

chance. Kappa values ranging from .60 to .75 indicate good

agreement beyond chance, values from .40 to .60 indicate

fair agreement, while values below .40 are considered as

indicative of poor agreement. These kappa values will be

used to evaluate the kappas found in this study. Intraclass

correlations can be evaluated as a standard correlation

coefficient (Suen & Ary, 1989).
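
These benchmarks amount to a simple classification rule:

    # Fleiss (1981) benchmarks for interpreting kappa, as applied here.
    def kappa_label(kappa):
        if kappa > .75:
            return "excellent"
        if kappa >= .60:
            return "good"
        if kappa >= .40:
            return "fair"
        return "poor"

    print(kappa_label(.69))    # "good"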

The reliability estimates, weighted and average percent

agreement and kappa, for each situation for each of the two

groups are presented in Tables 18-29 in Appendix A.

(Intraclass correlations could not be computed for each of

the situations. See Table 30 in Appendix A for the

intraclass correlations by group.) Listed with the

reliability estimates in the tables are: (a) the total

frequency each behavior occurred across the thirty dyads in

each group as observed by the primary coder; and (b) the

number of dyads in which either the primary and/or the

reliability coder observed at least one occurrence of the

behavior. These numbers are included to provide information

about the occurrence of the behaviors to aid in interpreting

the reliability estimates.

Tables 31 and 32 summarize the reliability estimates

for the parent and child categories combined across the










three situations and across groups. The reliability

estimates included in these tables are weighted and average

percent agreement, kappa, and intraclass correlations.

The DPICS II categories are ranked within the tables

with the highest estimates appearing first. In addition,

borrowing the convention used to rate kappa estimates

created by Fleiss (1981), the estimates are also divided

into groups considered to have "excellent", "good," and

"fair" reliability estimates. Given the variability of the

estimates across types of reliability for each category,

this classification was somewhat subjective and was often

based on two out of three types of estimates.

Several categories had either very poor estimates

(e.g., Parent Whine) or estimates could not be calculated

because of insufficient data. For example, the reliability

of the parent categories of Contingent Praise, Destructive,

and Yell could not be calculated because of no occurrence of

these behaviors. Similarly, child Labeled Praise could not

be calculated, and child Physical Positive and Negative had

extremely low estimates.

The remaining hypotheses of the study which were

related to the validity of the DPICS II involved selected

individual categories and summary variables, comprised of

combinations of the individual categories. In order to

provide kappa reliability estimates on the variables used in










Table 31

Reliability for the DPICS II Parent Categories Combined
Across Situation



Parent Category Weighted Average Kappa P
% Agree % Agree


Excellent

Information Question 85% 82% .80 .93
(83%-88%)

Direct Command 82% 77% .69 .99
(80%-84%)

Descriptive/Reflective 80% 78% .74 .90
Question (79%-82%)

Unlabeled Praise 77% 76% .70 .88
(72%-81%)

Information Description 75% 74% .63 .86
(73%-76%)

Answer 74% 71% .90 .93
(70%-78%)

Good

Labeled Praise 71% 66% .68 .89
(55%-88%)

Laugh 67% 64% .84 .90
(62%-72%)

Indirect Command 64% 57% .63 .92
(61%-67%)

Criticism 63% 58% .62 .94
(59%-67%)

Acknowledgment 58% 58% .59 .86
(55%-61%)

Playtalk 69% 46% .76 .85
(63%-75%)










Table 31 continued


Parent Category            Weighted        Average    Kappa    P
                           % Agree         % Agree

Good

Compliance                 54%             49%
                           (45%-62%)

No Opportunity for         53%             48%
Compliance                 (49%-57%)

Fair

Reflective Statement       48%             45%
                           (41%-55%)

Behavioral Description     43%             44%
                           (31%-55%)

Smart Talk                 47%             29%
                           (35%-60%)

No Opportunity for         44%             42%
Answer                     (35%-53%)

Physical Negative          39%             34%
                           (27%-51%)

Physical Positive          37%             34%
                           (27%-48%)

Noncompliance              37%             34%
                           (29%-44%)

No Answer                  35%             34%
                           (25%-44%)

Whine                      21%             19%
                           (6%-35%)

Insufficient Data

Contingent Praise          **              **

Yell                       **              **

Destructive                **              **


Note. Analyses based on n = 60, combining the clinic-referred
group (n = 30) with the comparison group (n = 30).










Table 32

Reliability for the DPICS II Child Categories Combined
Across Situation



Child Category Weighted Average Kappa P
% Agree % Agree


Excellent

Information Question 80% 79% .87 .96
(76%-83%)

Information Description 78% 77% .71 .93
(76%-80%)

Answer 73% 71% .82 .82
(69%-78%)

Good

Descriptive/Reflective 68% 65% .72 .89
Question (63%-72%)

Direct Command 68% 63% .72 .89
(64%-72%)

No Opportunity for 67% 62% .59 .95
Compliance (64%-70%)

Laugh 65% 53% .78 .96
(58%-72%)

Compliance 63% 63% .71 .92
(59%-67%)

Acknowledgment 64% 62% .74 .79
(61%-67%)

Playtalk 60% 50% .70 .95
(56%-64%)

No Opportunity for 59% 57% .71 .78
Answer (54%-65%)

Smart Talk 55% 47% .70 .91
(49%-61%)










Table 32 continued


Child Category Weighted Average Kappa P
% Agree % Agree

Good

Yell 52% 48% .57 .93
(44%-60%)

Whine 50% 39% .57 .85
(44%-56%)

Unlabeled Praise 50% 54% .72 .81
(29%-71%)

Noncompliance 48% 40% .55 .85
(44%-53%)

Indirect Command 45% 42% .61 .74
(38%-51%)

Fair

Criticism 45% 49% .58 .60
(39%-52%)

Reflective Statement 45% 39% .51 .66
(33%-57%)

Destructive 41% 39% .49 .79
(34%-48%)

No Answer 37% 36% .51 .64
(30%-45%)

Behavioral Description 32% 43% .43 .63
(7%-57%)

Physical Negative 18% 19% .33 .29
(4%-32%)

Physical Positive 17% 17% .61 .37
(5%-28%)
Insufficient Data

Labeled Praise ** ** ** -.34

Note. Analyses based on n = 60, combining the clinic-
referred group (n = 30) with the comparison group (n = 30).









further analyses, summary variables were created by dividing

the categories of interest into naturally occurring classes

of behaviors (e.g., verbalizations, vocalizations, and

motoric behaviors) (See Table 33). Three summary variables

described child and parent inappropriate behavior: Negative

Talk (Critical + Smart Talk), Negative Vocalizations (Whine

+ Yell), and Negative Behavior (Physical Negative +

Destructive). The parent summary variable of Negative

Vocalizations was dropped due to low reliability estimates.

Finally, the summary variable of Prosocial Behavior was

developed for parent and child. The parent Prosocial

category contained verbalizations (Acknowledgment,

Behavioral Description, Information Description, Labeled

Praise, Unlabeled Praise, and Reflection) and the categories

of Laugh, Physical Positive, and Answer. The child

Prosocial Behavior summary variable contained the same

categories except Physical Positive was dropped because of

its low reliability estimate. The variable child

Noncompliance was considerably less reliable than child

Compliance. Consequently, Compliance was used to test the

hypotheses regarding validity. Although the kappa estimate

for child Negative Behavior was somewhat low, all of the

variables demonstrated adequate kappa reliability and were

considered acceptable to be used in subsequent analyses.










Table 33

Kappa Estimates for DPICS II Summary and Individual
Variables Combined Across Situations



Variable Clinic Comparison Total
(n=30) (n=30) (n=60)


PARENT BEHAVIOR

Prosociala .64 .65 .65
Answer .92 .88 .90
Laugh .85 .82 .90
Physical Positive .46 .78 .49

Negative Talk .84 .56 .60

Negative Behavior .67 .57 .80

Total Command .70 .70 .70

Direct Command Ratiob .67 .67 .69

CHILD BEHAVIOR

Prosociala .67 .71 .69
Answer .85 .77 .82
Laugh .77 .77 .96

Compliance Ratioc .72 .69 .71

Negative Talk .71 .64 .69

Negative Vocalization .51 .75 .79

Negative Behavior .51 .53 .52


Note: a Kappa estimate for verbalization categories
combined (Acknowledgment + Descriptions + Praise +
Reflection). b Kappa estimate for Direct Command category.
c Kappa estimate for Compliance category.



With the exception of the prosocial behavior

categories, the summary variables were found to be non-










normally distributed, primarily due to the low or

nonexistent occurrence of some of the inappropriate behavior

categories. The distributions of the variables measuring

inappropriate behavior were skewed toward zero, particularly

for the comparison group. To address this problem, the

variables first were combined into broader or composite

summary variables to provide a wider range of possible

values. These values were then transformed using

mathematical functions described below.

The composite summary variable of child Inappropriate

Behavior was created from the sum of Negative Talk, Negative

Behavior, and Negative Vocalization (i.e., Critical + Smart

Talk + Destructive + Physical Negative + Whine + Yell).

The composite summary variable for parent Inappropriate

Behavior was comprised of Negative Talk and Negative

Behavior (i.e., Critical + Smart Talk + Destructive +

Physical Negative). Unlike the child summary variable,

parent Inappropriate Behavior did not include parent

Negative Vocalizations because no estimate could be

calculated for Yell, and Whine had an extremely low

reliability estimate.

In addition, the summary variables were transformed

prior to analysis. Arc sine transformations of the square

root of the value were completed on the child compliance

ratio and parent direct command ratios while the other

variables were transformed by adding .1 to their value and









then taking their square root (Personal communication, John

Hartzel, September, 1995). Analyses were performed both

with transformed and untransformed values for the summary

variables. The results of the analyses for the

untransformed variables may be more easily translated into

clinically useful information than results based on

transformed variables.
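
For illustration, the composite construction and the

transformations described above can be sketched as follows

(hypothetical frequency counts for a single dyad, not study

data):

    # Sketch of composite summary variables and their transformations.
    import math

    counts = {"Criticism": 6, "Smart Talk": 2, "Destructive": 1,
              "Physical Negative": 0, "Whine": 3, "Yell": 1,
              "Compliance": 12, "Direct Command": 20,
              "Indirect Command": 8}

    # Composite child Inappropriate Behavior: Negative Talk + Negative
    # Behavior + Negative Vocalization.
    child_inappropriate = sum(counts[c] for c in
                              ("Criticism", "Smart Talk", "Destructive",
                               "Physical Negative", "Whine", "Yell"))

    # Ratio variables: Compliance/Total Commands and
    # Direct Commands/(Direct + Indirect Commands).
    total_commands = counts["Direct Command"] + counts["Indirect Command"]
    compliance_ratio = counts["Compliance"] / total_commands
    direct_command_ratio = counts["Direct Command"] / total_commands

    # Transformations used before analysis: arc sine of the square root
    # for the ratios, square root of (value + .1) for frequency counts.
    compliance_transformed = math.asin(math.sqrt(compliance_ratio))
    inappropriate_transformed = math.sqrt(child_inappropriate + .1)

    print(child_inappropriate,
          round(compliance_transformed, 2),
          round(inappropriate_transformed, 2))    # 13 0.71 3.62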


Validity

Discriminant validity

Previous behavioral observation systems have found that

specific behaviors differentiate children with behavior

problems from non-referred children. Analyses were

conducted on the DPICS II variables to evaluate the

discriminant validity of the system. Because of the

significant difference in SES between the two groups, SES

was used as a covariate in the analyses. In order to

control for Type I error and SES effects, a multivariate

analysis of covariance (MANCOVA) was used to test

differences between groups on child Compliance Ratio

(Compliance/Total Commands), child Inappropriate Behavior

(Negative Talk + Negative Vocalizations + Negative

Behavior), child Prosocial Behavior (Acknowledgment +

Behavioral Descriptions + Information Descriptions + Labeled

Praise + Unlabeled Praise + Reflection + Laugh + Answer),

Total number of parental commands (Direct + Indirect

Commands), parent Direct Command Ratio (Direct










Commands/Total Commands), parent Prosocial Behavior, and

parent Inappropriate Behavior (Negative Talk + Negative

Behavior).

When all seven transformed variables were entered

together in the MANCOVA, controlling for the variable SES,

the effect of group membership was statistically

significant, Hotelling's T2 (7,51) = 7.91, p < .001.

Similarly, when the untransformed values of the seven

variables were entered in to the MANCOVA, the effect of

group membership was again found to be significant,

Hotelling's T2 (7,51) = 6.39, p < .001. Further analyses of

the individual summary variables were conducted to explore

which variables contributed to the overall difference found

between the two groups (See Table 34).
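
A multivariate analysis of covariance of this form can be

sketched as follows, using the MANOVA routine in statsmodels

with synthetic placeholder data and hypothetical variable

names; this is not the software or the data used in the study.

    # Sketch of a MANCOVA on seven summary variables with SES as a
    # covariate (synthetic placeholder data).
    import numpy as np
    import pandas as pd
    from statsmodels.multivariate.manova import MANOVA

    rng = np.random.default_rng(0)
    n = 60
    group = np.repeat(["clinic", "comparison"], 30)
    ses = rng.normal(np.where(group == "clinic", 38, 49), 11)
    outcomes = ["compliance_ratio", "child_inapp", "child_prosocial",
                "total_commands", "direct_cmd_ratio", "parent_prosocial",
                "parent_inapp"]
    df = pd.DataFrame({"group": group, "ses": ses,
                       **{name: rng.normal(size=n) for name in outcomes}})

    mancova = MANOVA.from_formula(
        " + ".join(outcomes) + " ~ C(group) + ses", data=df)
    # Multivariate tests for each term, including the Hotelling-Lawley
    # trace for the group effect after adjusting for SES.
    print(mancova.mv_test())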

The differences between groups were examined for each

summary variable using Analysis of Covariance (ANCOVA)

procedures to control for SES effects. Using the

untransformed values of the variables, the results revealed

that the two groups differed significantly on all of the

summary variables except for child Prosocial Behavior.

Table 34 also includes the mean frequency of occurrence of

the behavioral categories (untransformed values). The means

reported here represent the frequency of occurrence summed

across the three situations. See Table 35 in Appendix A for

the results of univariate analyses on the transformed values

of the variables.










Table 34

Summary of Univariate Analysis of Covariance for
Untransformed Summary Variables Comparing the Clinic-
referred and Comparison Group


Mean
Variable Clinic Comparison F(1,57)


Parent Categories
Prosocial Behavior 92.53 121.37 8.11**
Direct Command Ratio .72 .57 12.71**
Total Commands 62.58 39.67 9.78**
Inappropriate Behavior 20.37 5.40 20.59*
Negative Talk 18.35 8.30 15.13**
Negative Behavior 2.35 .42 8.73*
Child Categories
Prosocial Behavior 89.98 88.5 .49
Compliance ratio .53 .75 19.17**

Inappropriate Behavior 31.28 10.02 14.96**

Negative Talk 14.45 5.10 10.36**

Negative Vocalization 12.77 3.17 15.13**

Negative Behavior 4.07 1.75 6.39*

Note. Means based on n = 30 for each group. Values based
on the mean frequency of occurrence summed across the three
DPICS II situations.
* E < .05. ** P < .01. *** E < .001.



A discriminant function analysis also was performed on

the summary variables to demonstrate that DPICS II variables

could be used to distinguish between the clinic and non-

clinic samples. The technique known as "jackknifing" was

used to more accurately identify the classification rate










(Norusis, 1993). Using this technique, each case is, in

turn, excluded from the calculation of the discriminant

function, and later entered into the function to determine

if it can be correctly classified. This method is thought

to produce a more accurate estimate of the function's

classification rate because the case was not used to create

the function (Norusis, 1993).
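
In principle, this jackknifed classification rate corresponds

to a leave-one-out cross-validation of the discriminant

function; a sketch with scikit-learn and synthetic placeholder

data (rather than the software used in the study) follows.

    # Sketch of leave-one-out ("jackknifed") classification with a linear
    # discriminant function (synthetic placeholder data).
    import numpy as np
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
    from sklearn.model_selection import LeaveOneOut, cross_val_score

    rng = np.random.default_rng(1)
    # y: 0 = comparison, 1 = clinic-referred; X: SES plus the seven
    # DPICS II summary variables for each dyad (placeholder values).
    y = np.repeat([0, 1], 30)
    X = rng.normal(size=(60, 8)) + y[:, None]   # some group separation

    # Each dyad is held out, the function is built on the remaining 59
    # dyads, and the held-out dyad is then classified.
    scores = cross_val_score(LinearDiscriminantAnalysis(), X, y,
                             cv=LeaveOneOut())
    print(scores.mean())    # proportion of dyads correctly classified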

SES along with the transformed variables of child

Compliance, child Prosocial, child Inappropriate Behavior,

parent Total Command, parent Prosocial Behavior, parent

Direct Command Ratio, and parent Inappropriate Behavior were

used in the jackknifing procedure, yielding an overall

correct classification rate of 86.6%. Twenty-six of the

thirty dyads were correctly classified in each group. When

the same analysis was performed using SES and untransformed

values for the DPICS II summary variables, the overall

classification rate was the same (86.6%); however, twenty-

five non-referred dyads and twenty-seven referred dyads were

correctly classified.

When all of the subjects were entered simultaneously

into a discriminant analysis, the classification rate was

slightly higher at 93.3% for transformed values and 90% for

untransformed values of the variables. (See Table 36 in

Appendix A for the classification function coefficients for

the untransformed variables.)










When the variable of SES was eliminated from the

analyses to determine how well the DPICS II categories alone

could classify the participants, the transformed variables

yielded a classification rate of 86.6% using the jackknifing

procedure and 90% when all participants were entered

simultaneously. The untransformed variables without the

inclusion of SES in the analysis yielded a classification

rate of 81.7% using the jackknifing procedure and 86.6% when

all the participants were used in the analysis.

To determine whether a smaller group of variables

could effectively discriminate between participants, the

transformed and untransformed variables and SES were entered

into a stepwise discriminant analysis. Using only the

transformed variables of child Inappropriate Behavior,

parent Prosocial Behavior, and parent Inappropriate

Behavior, the classification rate was 90% for both the

jackknifing procedure and the regular discriminant analysis.

When the stepwise discriminant analysis was performed

on the untransformed values and SES, the variables of parent

Inappropriate Behavior, child Compliance Ratio, parent

Prosocial Behavior, and SES yielded a correct classification

rate of 91.6% for the jackknifing procedure and 95% for

the regular discriminant analysis (see Table 37 in Appendix

A). The classification function coefficients used to

calculate the group membership of a particular subject using

these four variables are listed in Table 38 in the appendix.










In summary of the analyses completed, the MANCOVA

revealed significant differences between groups using the

transformed and untransformed values of the DPICS II

variables. The post-hoc univariate analyses of these

variables also demonstrated significant differences between

groups using the transformed and untransformed values. In

addition, the transformed and untransformed variables were

able to successfully discriminate between the two groups of

clinic-referred and comparison participants during the

discriminant analysis. In order to provide the most

clinically relevant information, the remaining analyses will

focus on the untransformed values. The results of the

transformed values of the variables will also be reported,

primarily in the appendix.

Convergent validity

Next, the DPICS II variables were used to predict the

scores on various measures associated with children with

behavior problems. In the first analysis, the DPICS II

variables were used to predict the ECBI Intensity score.

Table 39 provides a matrix of partial correlations for the

untransformed variables used to predict the ECBI Intensity

score controlling for SES (See Table 40 in the appendix for

the Spearman correlation matrix of the transformed values).
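
A partial correlation controlling for SES amounts to

correlating two variables after the linear effect of SES has

been regressed out of each; a brief sketch with hypothetical

values follows.

    # Sketch of a partial correlation controlling for one covariate.
    import numpy as np

    def partial_corr(x, y, covariate):
        def residuals(values, cov):
            slope, intercept = np.polyfit(cov, values, 1)
            return values - (slope * cov + intercept)
        return np.corrcoef(residuals(x, covariate),
                           residuals(y, covariate))[0, 1]

    rng = np.random.default_rng(2)
    ses = rng.normal(45, 11, 60)                     # hypothetical values
    ecbi = 180 - 1.5 * ses + rng.normal(0, 20, 60)
    compliance = .3 + .005 * ses + rng.normal(0, .1, 60)
    print(round(partial_corr(ecbi, compliance, ses), 2))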

Table 39 reveals highly significant correlations

between variables. The ECBI Intensity score was

significantly correlated with all DPICS II variables with










the exception of child Prosocial. The DPICS II summary

variables also were significantly intercorrelated. Child

Compliance, for example, was negatively correlated with

parent Inappropriate Behavior. Child Inappropriate Behavior

was highly positively correlated with Parent Inappropriate

Behavior and Total Commands and negatively correlated with

Compliance. Parent Direct Command Ratio was negatively

correlated with parent Prosocial Behavior and positively


Table 39

Partial Correlations Between DPICS II Variables and ECBI
Intensity Score Controlling for SES



Variable 2 3 4 5 6 7 8


1. ECBI-I -.02 .30* -.41** .45** -.35** .40** .30*

CHILD BEHAVIOR

2. Prosocial .05 -.05 -.18 .08 -.18 .02

3. Inappropriate -.72*** .19 .09 .45** .46***

4. Compliance Ratio -.16 -.02 -.33* -.25

PARENT BEHAVIOR

5. Direct Command Ratio -.41** .40** .28*

6. Prosocial -.14 -.03

7. Inappropriate .72

8. Total Commands

Note. All analyses based on N = 60.
*E < .05. **p < .01. ***P < .001.









correlated with Inappropriate Behavior. Although these high

correlations are reasonable and expected, they can affect

the outcome of the multiple regression as each variable does

not contribute unique effects.

Hierarchical multiple regression analyses were used to

determine the relative contributions of SES, child Prosocial

Behavior, child Inappropriate Behavior, parent Inappropriate

Behavior, parent Direct Command Ratio, parent Prosocial

Behavior, and parent Total Command in the prediction of the

ECBI Intensity score. The untransformed variables

accounted for 40% of the variance in the ECBI Intensity

Score, F(8,51) = 4.17, p < .001. As shown in Table 41, after

considering the effect of SES, child Compliance accounted

for 13% of the variance, parent Inappropriate Behavior

accounted for 7% of the variance, and parent Direct Command

Ratio accounted for 6% of the variance. The other

variables entered into the analysis contributed little

additional predictive power.
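
The hierarchical procedure itself amounts to entering the

predictors in a fixed order and testing the increment in R2 at

each step; a sketch with statsmodels and hypothetical

placeholder data (not the study values) follows.

    # Sketch of hierarchical regression with R-squared change and its
    # F test at each step (hypothetical placeholder data).
    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(3)
    df = pd.DataFrame({"ses": rng.normal(45, 11, 60),
                       "compliance": rng.uniform(.3, .9, 60),
                       "parent_inapp": rng.poisson(12, 60)})
    df["ecbi"] = (220 - 1.2 * df.ses - 60 * df.compliance
                  + 2 * df.parent_inapp + rng.normal(0, 20, 60))

    prev_fit = smf.ols("ecbi ~ 1", data=df).fit()   # intercept-only model
    terms = []
    for term in ["ses", "compliance", "parent_inapp"]:   # order of entry
        terms.append(term)
        fit = smf.ols("ecbi ~ " + " + ".join(terms), data=df).fit()
        f_change, p_change, _ = fit.compare_f_test(prev_fit)
        print(term, round(fit.rsquared - prev_fit.rsquared, 3),
              round(f_change, 2), round(p_change, 4))
        prev_fit = fit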

When all of the transformed summary variables were

included in the analysis, they were found to account for a

significant proportion of variance (45%) in the ECBI

Intensity Score, F(8,51) = 5.24, p < .001 (See Table 42 in

Appendix A). SES and Parent Inappropriate Behavior

accounted for 12% and 27% of the variance respectively. The

other variables were not found to contribute significantly to

the prediction of the ECBI Intensity score.










Table 41

Summary of Hierarchical Multiple Regression Analysis for
Untransformed Variables Predicting ECBI Intensity Score


Predictor Variable                β      R2 change    F change


SES                              -.14       .11         7.88**

Compliance                       -.26       .13         8.63**

Parent Inappropriate              .24       .07         4.24**

Direct Command Ratio              .23       .06         3.35*

Parent Prosocial Behavior        -.12       .01          .64

Child Prosocial Behavior          .07       .003         .18

Total Command                    -.04       .001         .05

Child Inappropriate              -.04       .0005        .03

Note. β coefficients are given for the equation when all
variables are entered.
* p < .05. ** p < .01.


In the fifth hypothesis, it was proposed that DPICS II

variables of child Compliance, child Prosocial, and Child

Inappropriate Behavior would predict the child domain score

on the PSI. Table 43 presents the partial correlation

matrix for untransformed variables to be included in this

analysis, controlling for the effects of SES. Child

Compliance was negatively correlated with child

Inappropriate Behavior, (r(60) = -.72, p < .001). (See

Table 44 in the appendix for the Spearman correlation matrix

with transformed variables).













Table 43

Partial Correlation Matrix for Child Variables Used to
Predict the Child Domain Score on the PSI


Variable                        1        2        3


1. Compliance Ratio            --      -.05     -.72***

2. Prosocial                            --        .05

3. Inappropriate Behavior                        --


* p < .05. *** p < .001.


In a hierarchical multiple regression analysis, these

child summary variables were used as predictor variables of

the scores on the child domain of the PSI. Together, the

untransformed variables accounted for 22% of the variance,

F(4,55) = 3.96, p < .01. As shown in Table 45, SES


Table 45

Summary of Hierarchical Multiple Regression Analysis for
Untransformed Variables Predicting PSI Child Domain Scores


Predictor Variable                 β      R2 change    F change


SES                               -.33       .13         8.92**

Child Inappropriate Behavior       .09       .06         3.77*

Child Compliance                  -.23       .03         1.45

Child Prosocial Behavior          -.05       .003         .15

Note. β coefficients are given for the equation when all
variables are entered.
* p < .05. ** p < .01.









accounted for approximately 13% of the variance in the child

domain score and the untransformed variable of child

Inappropriate Behavior accounted for an additional 6%. In

the same analysis, the transformed variables were found to

account for significant variance (25%) in the PSI child

domain score, F(4,55) = 4.55, p < .01 (See Table 46 in the

appendix).

The sixth hypothesis of the study proposed that the

DPICS II summary variables of parent Inappropriate Behavior,

parent Prosocial Behavior, parent Total Command, and parent

Direct Command Ratio would account for significant variance

in the parent domain score of the PSI. Table 47 presents

the partial correlation matrix for the untransformed

variables used to predict the parent domain score of the PSI

(See Table 48 in the appendix for the Spearman correlations

for the transformed variables).


Table 47

Partial Correlation Matrix for Parent Variables Used to
Predict the Parent Domain Score on the PSI



Variable 1 2 3 4


1. Direct Command Ratio -- .28* .40** -.41**

2. Total Commands -- .72*** -.03

3. Inappropriate Behavior -- -.14

4. Prosocial Behavior -

* E < .05. ** E < .01. *** p < .001.










Parent Direct Command Ratio was significantly correlated,

controlling for SES, with parent Total Commands, parent

Inappropriate Behavior, and parent Prosocial Behavior. In

addition, parent Total Commands was significantly correlated

with parent Inappropriate Behavior (r(60) = .72, p < .001).

When the parent categories of Direct Command Ratio,

Prosocial Behavior, Total Commands, and Inappropriate

Behavior (Critical + Smart Talk + Physical Negative +

Destructive) were used as predictors of the PSI parent

domain score in a hierarchical multiple regression analysis,

the untransformed variables did not account for a

significant proportion of variance (13%) in the parent

domain score, F(5,54) = 1.66, p = .16. In this analysis,

SES accounted for 6% (p < .05) of the variance and parent

Inappropriate Behavior accounted for an additional 6% (p <

.05) of the variance in the scores. The other variables in

the equation did not contribute significantly to the

prediction of the PSI parent domain score. These results

are summarized in Table 49.

In the same analysis, the transformed variables

accounted for 19% of the variance in the parent domain

score, F(5,54) = 2.53, p < .05. The variables of SES and

parent Inappropriate Behavior accounted for 6% (p < .05) and

12% (p < .01) of the variance, respectively. (See Table 50

in the appendix for a summary.)










Table 49

Summary of Hierarchical Multiple Regression Analysis for
Untransformed Variables Predicting PSI Parent Domain Scores


Predictor Variable             β     R2 change   F change

SES                          -.16      .06        4.00*

Inappropriate Behavior        .27      .06        3.53*

Direct Command Ratio          .11      .007        .41

Total Commands               -.08      .003        .15

Prosocial Behavior            .03      .0005       .03

Note. β coefficients are given for equation when all
variables entered.
* p < .05.
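The R2 change and F change statistics reported in this table and the preceding ones follow the usual hierarchical (sequential) regression logic: each predictor is added to the equation in turn, and the increment in R2 is tested against the residual variance of the larger model. A minimal Python sketch of that computation is given below; the data are simulated and the variable names are hypothetical, so the output does not reproduce the tabled values.

import numpy as np

def r_squared(X, y):
    # R2 from an ordinary least-squares fit with an intercept
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    return 1 - resid.var() / y.var()

def f_change(r2_full, r2_reduced, n, k_full, k_added):
    # F test for the increment in R2 when k_added predictors are entered
    return ((r2_full - r2_reduced) / k_added) / ((1 - r2_full) / (n - k_full - 1))

# Simulated stand-ins for SES and one DPICS II summary variable
rng = np.random.default_rng(1)
n = 60
ses = rng.normal(size=n)
inappropriate = rng.normal(size=n)
psi_score = -0.4 * ses + 0.3 * inappropriate + rng.normal(size=n)

r2_step1 = r_squared(ses.reshape(-1, 1), psi_score)
r2_step2 = r_squared(np.column_stack([ses, inappropriate]), psi_score)
print(round(r2_step1, 2), round(r2_step2 - r2_step1, 2),
      round(f_change(r2_step2, r2_step1, n, k_full=2, k_added=1), 2))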


The sixth hypothesis also proposed that the parent

variables of parent Direct Command Ratio, Total Commands,

and parent Inappropriate Behavior would account for

significant variance in the total score on the Parenting

Locus of Control (PLOC) scale. Table 51 presents the

partial correlation matrix, controlling for SES, for the

untransformed variables used in the analysis and post-hoc

analyses. (See Table 52 in the appendix for the Spearman

correlation matrix for the transformed variables used in the

analysis and in post-hoc analyses.) As shown in previous

correlation matrices, there are several significant

intercorrelations between the predictor variables. Child

Inappropriate Behavior is positively correlated with Parent








Inappropriate Behavior, r(60) = .45, p < .001, and negatively

correlated with Compliance, r(60) = -.72, p < .001.


Table 51

Correlation Matrix for Variables Predicting the PLOC Score



Variable 2 3 4 5 6 7


1. Total Commands .28* .72*** .46*** -.25 -.03 .02

2. Direct Command .40** .19 -.16 -.41** -.18
Ratio

3. Parent .45*** -.33* -.14 -.18
Inappropriate

4. Child -.72*** .09 .05
Inappropriate

5. Compliance -.02 -.05

6. Parent Prosocial .08

7. Child Prosocial

* p < .05. ** p < .01. *** p < .001.



SES and the transformed parent variables of parent

Prosocial, Total Commands, Direct Command Ratio, and parent

Inappropriate Behavior were entered into a hierarchical

multiple regression analysis as predictor variables for the

total score of the PLOC. Together, these variables

accounted for a significant proportion of variance (27%),

F(5,50) = 3.67, p < .01. SES and parent Inappropriate

Behavior each accounted for approximately 10% of the

variance in the PLOC score. The same analysis completed









using the untransformed values of the variables resulted in

22% of the variance being accounted for, F(5,50) = 2.82, p <

.05. Table 53 presents the results of this analysis.


Table 53

Summary of Hierarchical Multiple Regression Analysis for
Untransformed Variables Predicting PLOC Total Score


Predictor Variable                β     R2 change   F change

SES                             -.22      .10        5.91*

Parent Prosocial                -.32      .06        3.53*

Parent Inappropriate            -.13      .04        2.16

Parent Total Commands            .23      .01         .77

Parent Direct Command Ratio     -.05      .01         .60

Child Prosocial                 -.06      .00002      .0001

Child Inappropriate             -.17      .05        2.70*

Child Compliance                -.58      .16        9.45***

Note. β coefficients are given for equation when all
variables entered.
* p < .05. ** p < .01. *** p < .001.


Three additional variables were examined in a multiple

regression analysis of predictors of the PLOC total score as

part of post-hoc analyses. Because the PLOC score is

thought to reflect the parents' perceptions of successfully

managing their child's behavior, the variables of child

Prosocial, child Compliance, and child Inappropriate

Behavior were entered into a hierarchical multiple

regression. When these variables were included in the

regression, the proportion of variance accounted for by the

variables increased from 22% to 42%, F(8,47) = 4.36, p <

.001. This increase was due largely to the effect of the

child Compliance variable. The results are summarized in

Table 53.














DISCUSSION

The results of the study supported the hypotheses that

the DPICS II coding system is reliable and valid for use in

assessing children with behavior problems. First, with

some exceptions, the categories of the DPICS II demonstrated

adequate reliability. Second, discriminant validity of the

DPICS II categories was demonstrated. Finally, evidence of

the convergent validity of the DPICS II was provided when

summary variables were shown to account for significant

proportions of variance in various other measures thought to

be associated with behavior problems in children.


Reliability

A variety of reliability estimates were examined for

these data. All of the parent and child DPICS II categories

were ranked and then subdivided into groupings labeled

"excellent," "good," "fair," and "insufficient data" to

provide a method of organizing the information. Subjective

judgements were used to group the variables, particularly

when the different estimates were discrepant. More weight

was placed on the kappa estimates and intraclass

correlations because of the complex nature of the coding

system (Hops, Davis, & Longoria, 1995; Suen & Ary, 1989).

The reliability estimates were interpreted using










several general guidelines. First, the estimates were

judged in relation to accepted standards within the

literature for reliability. Then, the similarity between

the estimates for each category was considered. The

reliability estimates for categories with similar percent

agreement, kappa values, and intraclass correlations were

thought to be fairly accurate estimates. The categories

with discrepant estimates of reliability were more difficult

to interpret. For those categories, factors that

contributed to the low estimates were examined. The

frequency of occurrence of the behavior plus the number of

dyads in which at least one behavior was coded were factors

that appeared to impact on the reliability.

In cases where the reliability estimates were based on

infrequent occurrence both within and across subjects, the

estimates were likely to be affected by restricted variance.

For example, in Table 17, Physical Negative occurred a total

of two times, once in each of two subjects. On one

occasion, both coders agreed, leading to 100% agreement. On

the second occasion, the observers disagreed yielding 0%

agreement. The average percent agreement was, therefore,

50%. For this category, only three possible percent

agreement values could occur: 0%, 50%, and 100%. Because

of the limited opportunity to code this behavior, the coders

would have to be 100% accurate across subjects to meet the

standards for reliability of 70% or greater (Jones, Reid, &









Patterson, 1974). In other words, receiving a score of 70%,

the standard for good percent agreement, is not possible

when only two behaviors are coded or when the behavior

occurs in so few participants. The resulting reliability

estimates will be either an overestimate or an underestimate

simply due to restricted variance.

Reliability estimates for infrequently occurring

behaviors and estimates for behaviors occurring in only a

few subjects also are likely to be inconsistent across

studies. Basic statistical principles indicate that an

estimate based on a single measurement is less reliable and

less representative of the population mean than an estimate

based upon multiple measurements (Fleiss, 1986).

Reliability estimates of categories with a moderate base

rate will likely be similar across future studies using the

DPICS II, if other factors remain the same (e.g., coder

training, population). The reliability estimates that are

based on only a few occurrences sample the behavior

insufficiently to obtain a stable estimate that would be

generalizable across studies.

Percent agreement can be difficult to interpret when

it is based on widely differing frequencies of occurrence

of a behavior within the participant families

(Hartmann, 1977). For example, if a participant displays 10

behaviors on which the raters agree on eight occurrences,

the raters would have 80% agreement. However, if another










family displays only two of the same behaviors, and the

coders disagree on one of those two occurrences, their

agreement will be 50% for that category. When

these values are averaged to get a mean agreement for the

category, the overall percent agreement is 65%.

The solution to this problem may be the use of a

weighted average. Using the above example, a weighted

average, which takes into account the differential

frequency, would yield a percent agreement of nine agreements

divided by twelve opportunities for agreement, or 75%

agreement. Weighted percent agreement is included in the

tables, and in some categories, the two types of percent

agreement are discrepant (e.g., parent Play Talk). Weighted

average, however, does not account for chance.
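The difference between the two forms of percent agreement can be made concrete with a short Python sketch. The counts come from the hypothetical two-family example above, not from the study data.

# Per-family (agreements, opportunities for agreement) for one category
families = [(8, 10), (1, 2)]

# Unweighted: average of the per-family percentages, (80% + 50%) / 2 = 65%
unweighted = sum(a / n for a, n in families) / len(families)

# Weighted: pooled agreements over pooled opportunities, 9 / 12 = 75%
weighted = sum(a for a, _ in families) / sum(n for _, n in families)

print(round(unweighted * 100), round(weighted * 100))   # prints 65 75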

Some researchers have noted that when the frequency

of occurrence of a behavior approaches the extremes of zero

or 100% of the total behaviors, the likelihood increases

that percent agreement will be inflated by chance (Hartmann,

1977; Suen & Ary, 1989). This has led some authors to

suggest that only in cases where the frequency of occurrence

is between 20% and 80% of the total behaviors should percent

agreement be used to assess reliability (Suen & Ary, 1989).

Overall, percent agreement has been increasingly criticized

and is currently considered one of the least desirable

methods for use in behavioral observation data (Hops, Davis,

& Longoria, 1995).










Kappa estimates have been proposed as a form of

reliability that corrects for chance agreement. Kappa

estimates are computed by taking into account the probability of the

behavior in relation to other behaviors. In addition, Kappa

estimates can be helpful in identifying common coding

errors. The "confusion matrix" that is obtained in the

process of calculating kappa indicates which pairs of

categories are mistaken for one another. Tables 54 and 55

in Appendix A illustrate the confusion matrices for the

parent and child verbalization categories and Tables 56 and

57 in Appendix A illustrate the parent and child

vocalization, response, and physical behavior categories.
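The chance correction operates on the confusion matrix itself: observed agreement is the proportion of tallies on the diagonal, and chance agreement is estimated from the marginal totals. A minimal Python sketch with a small, hypothetical three-category matrix is shown below; the counts are illustrative and are not taken from Tables 54 through 57.

import numpy as np

def cohens_kappa(matrix):
    # matrix[i][j] = number of behaviors coder 1 placed in category i
    # and coder 2 placed in category j
    m = np.asarray(matrix, dtype=float)
    total = m.sum()
    p_observed = np.trace(m) / total                              # diagonal agreements
    p_chance = (m.sum(axis=0) * m.sum(axis=1)).sum() / total**2   # from marginal totals
    return (p_observed - p_chance) / (1 - p_chance)

# Hypothetical counts for three verbalization categories
confusion = [[40,  5,  2],
             [ 6, 30,  4],
             [ 3,  2, 20]]
print(round(cohens_kappa(confusion), 2))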

Kappa estimates in this study tended to be higher than

the percent agreement estimates. The kappa estimates are

likely elevated somewhat by the very large number of

behaviors coded overall. The confusion matrix, for example,

contained 14,495 coded behaviors for all of the parent

verbalization categories for both groups combined across all

three situations (See Table 54). The equation for kappa

includes, as part of the estimate, the number of agreements

that a behavior is category X plus the number of times that

the coders agree that the behavior is not X (i.e., agreement

that the category is Y). The sum of the proportion of

agreements that the behavior was not X (i.e., the

proportions on the diagonal in a confusion matrix) was










generally quite large in relation to the total number of

behaviors coded in a single category.

In addition, the proportion of occurrence of the

behavior in relation to the other behaviors is affected by

the number of categories and thus behaviors included in the

analysis. The kappa estimate is, therefore, affected by the

number of other categories in the system and the number of

behaviors included in the confusion matrix. For DPICS II

observations, the kappa estimate was typically higher than

percent agreement because the confusion matrix reflected the

large number of behaviors being coded.

To avoid inflating the kappa estimates by including

additional, unrelated categories, the present study broke

the categories into two classes of behaviors, one containing the

verbalization categories and one containing all of the

vocalization, physical behaviors, and response to command

and question categories. These classes contain the

categories that are likely to be confused with one another.

For instance, Direct Commands could be confused with

Indirect Commands, questions, and descriptions but not with

Laugh. Each class of behaviors, containing

approximately thirteen categories, was analyzed separately

to reduce any artificial inflationary effect. Kappa

estimates for parent and child behaviors also were computed










in separate confusion matrices to reduce the likelihood of

an overestimation of the reliability.

Intraclass correlations were included in the study to

provide an alternative method of evaluating reliability.

Intraclass correlations are based on examining the amount of

variance attributed to between subjects differences and the

variance attributed to within subjects differences, or in

this case, coder error. The correlation coefficient (ρ) is

interpretable as "...the proportion of variance of an

observation due to subject-to-subject variability in error-

free scores" (Fleiss, 1986, p. 3). To achieve a high

intraclass correlation, the variability in the frequency of

occurrence between the participants must be greater than the

variability between the two observers' values for a given

behavioral category on a particular subject.
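With two observers, this amounts to a one-way analysis of variance across subjects. The Python sketch below computes the one-way random-effects form of the intraclass correlation from a subjects-by-coders table of category frequencies; the counts are hypothetical, and the formula is the standard one rather than a reproduction of the study's software output.

import numpy as np

def icc_oneway(scores):
    # scores: one row per subject, one column per coder (category frequencies)
    scores = np.asarray(scores, dtype=float)
    n, k = scores.shape
    grand_mean = scores.mean()
    subject_means = scores.mean(axis=1)
    # Between-subjects and within-subjects mean squares
    bms = k * ((subject_means - grand_mean) ** 2).sum() / (n - 1)
    wms = ((scores - subject_means[:, None]) ** 2).sum() / (n * (k - 1))
    return (bms - wms) / (bms + (k - 1) * wms)

# Hypothetical counts of one DPICS II category for five dyads and two coders
counts = [[12, 11], [4, 5], [20, 18], [0, 1], [9, 9]]
print(round(icc_oneway(counts), 2))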

In examining all of the reliability estimates for the

categories of the DPICS II, the majority of the categories

demonstrated adequate reliability by at least one estimate.

Those categories that occurred fairly frequently tended to

have the strongest estimates. A number of the categories in

the "fair" classification tended to be categories with a low

frequency of occurrence, such as Behavioral Descriptions,

parent Smart Talk, Physical Negative, and Physical Positive.

The probability that a behavior will occur, or the base

rate, likely affects the coders in several ways. They may

have less practice coding these low base rate behaviors and