Detecting significant change in the neuropsychological performance of college football players

Material Information

Title:
Detecting significant change in the neuropsychological performance of college football players
Physical Description:
iii, 142 leaves : ill. ; 29 cm.
Language:
English
Creator:
Phalin, Benjamin R
Publication Date:

Subjects

Subjects / Keywords:
Psychology-Clinical Psychology thesis, Ph. D   ( lcsh )
Dissertations, Academic -- Psychology-Clinical Psychology -- UF   ( lcsh )
Genre:
bibliography   ( marcgt )
theses   ( marcgt )
non-fiction   ( marcgt )

Notes

Thesis:
Thesis (Ph. D.)--University of Florida, 2004.
Bibliography:
Includes bibliographical references.
Statement of Responsibility:
by Benjamin R. Phalin.
General Note:
Printout.
General Note:
Vita.

Record Information

Source Institution:
University of Florida
Rights Management:
All applicable rights reserved by the source institution and holding location.
Resource Identifier:
aleph - 003103770
System ID:
AA00004699:00001

Full Text











DETECTING SIGNIFICANT CHANGE IN THE NEUROPSYCHOLOGICAL
PERFORMANCE OF COLLEGE FOOTBALL PLAYERS













By

BENJAMIN R. PHALIN


A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL
OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT
OF THE REQUIREMENTS FOR THE DEGREE OF
DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA


2004






























Copyright 2004

by

Benjamin R. Phalin















ACKNOWLEDGEMENTS

A project of this scope involves the collaboration of a great number of people.

First and foremost, I would like to acknowledge my parents, whose emphasis on the

value of education and unswerving support throughout my own educational endeavors

have allowed me to continue striving toward my potential. Secondly, I would like to

thank my wife for her encouragement throughout the process of the preparation of this

paper. Last, but certainly not least, I would like to thank Dr. Eileen Fennell and Dr.

Duane Dede for their guidance, and for providing me with outstanding examples of true

scientist-practitioners.















TABLE OF CONTENTS


ACKNOWLEDGEMENTS

ABSTRACT

CHAPTER

ONE INTRODUCTION

Historical Context
Minor Head Injury: Definition
Minor Head Injury: Pathology
Neuropsychological Effects of Minor Head Injury
Sports-related Minor Head Injury
Reliable Change
Reliable Change Index
Reliable Change Index with Correction for Practice
Linear Regression
Multiple Regression Model
Specific Aims and Hypotheses

TWO METHODS

Subjects
Protocol
Baseline
In-Season Concussion Evaluation
Data Analysis
Basic Models
Reliable Change Indices
Regression Models
Sensitivity and Specificity

THREE RESULTS

Demographic Data for Baseline Population
Demographic Characteristics for Concussed and Control Participants
Baseline Performance
Overview of Repeated Testing
Detecting Significant Decline Using Basic Statistical Methods
Model 1
Model 2
Reliable Change Index Models
Model 3
Model 4
Model 5
Regression Models
Model 6
Model 7

FOUR DISCUSSION

Basic Models
RCI Models
Regression Models
Limitations and Implications for Future Research
Contributions of the Current Study

REFERENCES

BIOGRAPHICAL SKETCH









































Abstract of Dissertation Presented to the Graduate School
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Doctor of Philosophy

DETECTING SIGNIFICANT CHANGE IN THE NEUROPSYCHOLOGICAL
PERFORMANCE OF COLLEGE FOOTBALL PLAYERS

By

Benjamin R. Phalin

August 2004


Chair: Eileen B. Fennell
Cochair: Duane E. Dede
Major Department: Clinical and Health Psychology

The use of neuropsychological test instruments has been advocated by researchers

in the detection of cognitive decline following sports-related concussion as being a

sensitive means of producing objective results with which to aid in making return-to-play

decisions. However, no study to date has provided a systematic examination of serial

neuropsychological testing over several days following concussion. In the current study,

a brief battery of neuropsychological tests was administered to 443 college football

players at four Division I universities. Of the 443 initial participants, 26 concussed

players were identified by team trainers and subsequently underwent retesting using the

same battery of tests within 24 hours, 3 days, 5 days, and 7 days of concussion.

Concussed players were assigned controls matched on position, history of diagnosed

learning disability, and history of previous concussion who were also re-administered the

battery at the same test intervals.

Seven different statistical models conceptually organized into three groups were

examined to determine their ability to distinguish between the concussed and

nonconcussed participants. Basic models included a comparison of retest performance to

normative data and comparison of retest performances to baseline performance. RCI

models included a basic RCI approach, RCI with practice correction, and indices created

based on the magnitude of change of the control population. Regression models included

simple linear regression and multiple regression models, though multiple regression

analysis was ended due to instability of prediction equations. Optimal

sensitivity/specificity combinations occurred when classification was made based on one

significant variable finding. Good sensitivity was observed for all models at the 24-hour

retest period, although specificity was a limiting factor for the most basic models. Linear

regression models and an analysis of magnitude of change over re-test intervals were the

only models to distinguish between groups at Day 3 and Day 7. Results suggested the

importance of systematic examination of cognitive decline following sports-related

concussion, especially given currently used AAN return-to-play guidelines, as well as the

need for appropriate comparison data and statistical analyses to aid in the determination

of cognitive recovery following concussive injury. Further data collection is necessary,

especially at later sessions, to accurately use multiple regression models.















CHAPTER ONE
INTRODUCTION

Historical Context

The use of neuropsychological measures has, over the course of the continuing

evolution of the field, been the focus of considerable debate in terms of utility and

correlates of specific test performance. This evolution has been described in terms of

distinct eras (McSweeny et al., 1993) or areas of focus in the use of neuropsychological

test performance. The first has been described by Rourke (quoted in McSweeny et al.,

1993), and involved the interpretation of test performance and the use of cognitive testing

mainly to lend some veracity to the hypothesis that performance on certain specifically

designed test measures indirectly suggested dysfunction in certain localized brain

anatomy. This endeavor, of course, has been the mainstay of the various academic

subspecialties concerned with the study of neuroanatomy and the associated cognitive

sciences and theories. The very roots of this goal can be traced back as far as the time of

Gall, phrenology, and the notion that various protrusions on the scalp of an individual

revealed certain personality traits and tendencies (Cowey, 2001). While the theory of

phrenology itself has long since been discredited, it is the notion of localization that has

stood the test of time and has led to a greater understanding and logical organization of

the study of the structure and function of the human brain. During the infancy of the field

of neuropsychological theory in general, the primary opportunity to further corroborate

structure and function was found to be the study of the performance of individuals with

localized lesions. From the performance of these individuals, inferences concerning the

function of the absent or damaged areas were formulated. One of the most cited cases is

that of patient H.M., who received resection of the anterior portions of both of his

temporal lobes as part of a radical surgical correction for intractable epilepsy. Brenda

Milner and her colleagues detailed the cognitive effects of this resection and established

the utility of neuropsychological tests in the measurement of deficient cognitive abilities

(Milner, 1958). As McSweeny et al. (1993) noted, neuropsychological tests have been

utilized recently in order to aid in diagnosis and to provide clinically significant

contributions to the diagnostic process. A vast literature has been amassed, for instance,

in the application of neuropsychological techniques in the examination and treatment of

epileptic patients subsequent to patient H.M. (Bennett, 1987; Hermann, 1985; Hermann

and Wyler, 1988). Cognitive tests have been used to aid in the process of localizing the

focus of epileptic activity by suggesting relatively discrete anatomical areas of

dysfunction which are thought to contribute to poor performance on certain measures.

Further, cognitive tests have been used in conjunction with reversible lesion methods

(e.g., the Wada test) to localize brain regions responsible for language and memory in

individuals undergoing surgical resection methods in order to avoid the accidental

destruction of these areas and their subsequent debilitating effects (Loring et al., 1992).

Determination of the language-dominant hemisphere via neuropsychological methods is

an important step in identifying those individuals who might best be served with this

surgical treatment.

A further use of neuropsychological techniques becomes salient when one seeks

to quantify the effect of a surgical resection or any other neurological insult. By

comparing an individual's performance before a potentially neurologically altering event

to his/her performance afterward, one is able to objectively describe the resulting

cognitive effects. Another commonly used methodology, in general, involves comparing

the performance of one individual to that of a group of other individuals, or normative

population. In this sense, the term "normative" implies that the mean score of a group of

individuals is the standard to which other individuals can be compared. This is perhaps

the most typical method used in neuropsychology to gauge the performance of the

individual in question, and has often been used without question as to the validity of such

practice. Within the past several years, neuropsychological test performance has been

used somewhat less frequently to localize neurological insult due to the continuing

growth and advance of imaging technology (Ruff, 2003). While the use of imaging

devices such as positron emission tomography (PET), magnetic resonance imaging (MRI),

and functional magnetic resonance imaging (fMRI) has continued to hone the ability of

the investigator to view the brain itself in a non-obtrusive way, and has continued to

lend themselves effectively to the diagnostic process, neuropsychological test results

have played a somewhat decreased role in this capacity. However, the focus of

neuropsychological examination and analysis of test performance have clearly proven

their merit in other ways. For example, the measurement of cognitive abilities and

inherent strengths and weaknesses within the individual offers the neuropsychologist the

unique opportunity to contribute to the practical knowledge about the cognitive

functioning of a patient, and allows him or her to recommend strategies designed to aid

the individual in returning to previous levels. In the event that a baseline measure of an

individual's performance is available, a comparison with post-event findings allows for

an objective review of resulting cognitive deficits and a gauge of changing cognitive

status. In this sense, a measure of individual cognitive change over time has become a

major focus of neuropsychological measures. With this increasing focus on longitudinal

follow-up and repeated testing, several issues regarding serial test administration and the

sensitivity of established neuropsychological measures have arisen. Recently, the use of

repeated assessment over time has been applied in various circumstances in order to

measure the progressive nature of cognitive change. This has been the case, as

McSweeny et al. (1993) pointed out, in the assessment of the dementias, which involve

the insidious loss of cognitive abilities due to the progressive alteration of brain structure

or chemistry. Measures of neuropsychological change over time have also been used to

assess treatment effects and interventions.
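
As a rough illustration of the two comparison strategies described above, the following Python sketch expresses one test score relative to a normative group and relative to the individual's own baseline. The function names, the use of a normative standard deviation, the convention that higher scores indicate better performance, and the example values are all assumptions made for illustration; none of them is drawn from the study data.

```python
def normative_z(score, norm_mean, norm_sd):
    """Standardized distance of one score from the normative group mean.

    norm_sd is an assumed input; the text above mentions only the group mean.
    """
    if norm_sd <= 0:
        raise ValueError("norm_sd must be positive")
    return (score - norm_mean) / norm_sd


def baseline_change(baseline_score, retest_score):
    """Raw change from the individual's own baseline.

    Negative values indicate decline, assuming higher scores are better.
    """
    return retest_score - baseline_score


# Hypothetical values, for illustration only.
z = normative_z(score=40.0, norm_mean=50.0, norm_sd=10.0)
change = baseline_change(baseline_score=55.0, retest_score=40.0)
print(z, change)
```

In this hypothetical case the retest score falls one standard deviation below the normative mean, yet it also represents a 15-point drop from the individual's own baseline, which the normative comparison alone would not reveal.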

Recently, the use of neuropsychological testing has been applied to the study of

one of the most common causes of cognitive change and continuing sequelae: head

injury. While the results of moderate to severe head injury are often obvious, both to

current imaging techniques and measures of cognitive status, and the recovery from such

is often prolonged and some effects permanent, the effects of more subtle, or mild, head

injuries are not so easy to identify unequivocally. Therefore, the use of neuropsychological

measures in this regard has become more valuable.

Minor Head Injury: Definition

While several researchers have endeavored to elucidate the consequent effects of

minor head injury, the lack of consistency in defining the phenomenon has made it

difficult, if not impossible, to reconcile the findings of different investigations. The use

of the term "minor head injury" generally suggests the occurrence of mild to moderate

traumatic brain injury (Kibby and Long, 1996). However, there has been little agreement

on the terminology utilized in describing various levels of severity within this broad

category. A review of the literature reveals a basic framework underlying the working

model of minor head injury. However, from this basic outline, many slight variations

have been created and used in experimental design. In general, a minor head injury is

considered to be one in which a force is applied to the head, producing posttraumatic

memory difficulties or other fast-resolving neurological symptoms (Kibby and Long,

1996).

In examining the origins of this very basic foundation for defining minor head

injury, Kibby and Long (1996) refer to research findings suggesting that the severity of

head injury is directly related to the force that is applied to it. They indicate that this

finding is supported by the work of Graham et al. (1995) which suggested that the degree

of axonal injury is naturally also affected by the amount and direction of force applied.

In other words, the amount of force applied to the head is directly correlated with the

amount of damage expected to be observed in the axons of the brain and the amount of

metabolic disruption (McIntosh et al., 1996; Meaney et al., 1995). In a natural

progression, one might assume that this also applies to any observed cognitive

dysfunction, which may be directly related to the extent of axonal injury incurred. Kibby

and Long suggest that this is shown through the work of Langfitt and Gennarelli (1982)

and Levin et al. (1987). In their respective investigations, these authors concluded that a

minor head injury resulting in a loss of consciousness of less than 20 minutes was not

associated with any lingering deficits or detectable neuroanatomical damage. Therefore,

the extent of the head injury was assumed to be minor.








Aside from the adoption of some common findings to the definition of minor

head injury, one of the most commonly used indicators of head injury severity is the

Glasgow Coma Scale (GCS) (Jennett and Bond, 1975). The GCS is a measure

commonly employed at the scene of a head injury-inducing accident by immediate care

providers or in an emergency medicine situation. In performing this assessment

technique, the degree of coma is measured by evaluating motor, verbal, and eye-opening

responses to a set of standard stimuli. The resulting score is based on a scale from 3 to

15, the lowest score representing no response to verbal commands or somatic stimulation,

and the highest suggesting the intact ability to follow verbal commands, orientation, and

obvious awareness of sensory input. In the current literature, it is generally suggested

that a minor head injury is one that results in a GCS of 13-15, scores that are often

correlated with a positive long-term outcome. However, the practice of using the GCS by

itself in the assessment of injury severity is questionable since the work of researchers

such as Cantu (1992) and Strauss and Savitsky (1934) has suggested that symptoms

subsequent to injury without the observation of loss of consciousness are also common.

Another commonly used indicator of head injury severity is posttraumatic

amnesia (PTA), which has been defined as the period of time following the applied trauma to the

brain in which new memories are unable to be stored. This time period is associated with

the inability to recall events, sequence time, or learn new information (Ahmed et al.,

2000; McAllister, 1992; McFarland et al., 2001). A correlation has been noted between

the length of PTA and the duration of loss of consciousness (LOC), though McAllister indicates that this

relationship is not always linear, and warns against using either measure as the sole

predictor of head injury severity and potential outcome.








While some commonalities exist between research projects in their inclusion

criteria for cases of minor head injury, a significant amount of slight variation around the

central theme has remained. The reasons for this occurrence may be many, from

theoretical differences to the utilization of samples of convenience or the use of

nonprospective research designs. Researchers have historically not applied a common

definition to their inclusion criteria when studying minor head injury. For instance, in

the often-cited study by Rimel et al. in 1981, patients were included as cases of minor

head injury if they had LOC of less than 20 minutes. This group added another specifier

to the basic definition. In order to control for cognitive changes observed that could

possibly have been related to other event-related trauma, Rimel et al. excluded patients

with a hospital stay of greater than three days. It was their supposition that a longer

hospital stay was indicative of more severe injuries that might introduce confounds to

results obtained that otherwise would have been attributed to the head injury alone.

Another definition was provided by the Centers for Disease Control in 1985, and

while this definition is not specific to minor head injury, a sense of the vague boundaries

that might be inferred from it is still recognizable. The CDC states that a traumatic brain

injury is the "occurrence of injury to the head arising from blunt or penetrating trauma, or

from acceleration-deceleration forces, that is associated with any of these symptoms or

signs attributable to the injury: decreased level of consciousness, amnesia, other

neurologic or neuropsychologic abnormalities" (Thurman et al., 1998).

Subsequent researchers have included patients who met some, but not all, of these

criteria; others have included patients with more severe injuries and confounding

neurological histories (Dikmen et al., 1986; Leininger et al., 1990; Levin et al., 1987).

What is clear in this rather cloudy picture is that the only consistency is inconsistency.

Table 1 indicates the variability in defining minor head injury among some of the most

cited authors in this field.

Table 1. Varying inclusion criteria for minor head injury
Author LOC PTA GCS

Barth et al. (1983) <20 minutes <60 minutes >12
Dikmen et al. (1986) <1 hour >1 hour >11
Parker (1996) <5 minutes
Alexander (1992) <15 minutes <2 hours
King (1996) <24
LOC=loss of consciousness, PTA=posttraumatic amnesia, GCS=Glasgow Coma Scale

In an attempt to assemble these various criteria, the basic definition of minor head

injury, as outlined by Kibby and Long (1996), is as follows. A minor head injury is

incurred by any blow to the head resulting in a loss of consciousness of less than 20

minutes, an initial GCS score of greater than or equal to 13, and posttraumatic amnesia of

less than 24 hours. This definition, while serving as a solid basis for the nosology of

minor head injury and indicating the upper limits of what is thought to constitute minor

head injury, fails to define the boundaries at the minimum degree of that which can be

considered in this category.
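
The upper limits summarized above from Kibby and Long (1996) can be sketched as a simple check. This is only an illustrative reading of the three thresholds stated in the text; the function name, parameter names, and units are hypothetical, and the sketch is not a clinical tool.

```python
def is_minor_head_injury(loc_minutes, initial_gcs, pta_hours):
    """Check the three upper limits described in the text (illustrative only).

    loc_minutes: duration of loss of consciousness, in minutes
    initial_gcs: initial Glasgow Coma Scale score (3 to 15)
    pta_hours:   duration of posttraumatic amnesia, in hours
    """
    if not 3 <= initial_gcs <= 15:
        raise ValueError("GCS scores range from 3 to 15")
    if loc_minutes < 0 or pta_hours < 0:
        raise ValueError("durations cannot be negative")
    return loc_minutes < 20 and initial_gcs >= 13 and pta_hours < 24


# A blow with no loss of consciousness and no amnesia still satisfies
# every upper limit, because no minimum is specified.
print(is_minor_head_injury(loc_minutes=0, initial_gcs=15, pta_hours=0))
```

The example call makes the missing lower boundary concrete: an event with no measurable symptoms passes all three conditions.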

Recognizing the variability in defining minor head injury from one study to the

next, a definition of mild traumatic brain injury (MTBI) was formulated by the Mild

Traumatic Brain Injury Committee of the Head Injury Interdisciplinary Special Interest

Group of the American Congress of Rehabilitation Medicine (1993). According to the

consensus reached by this organization, MTBI is defined by at least one of the following:

1. any period of loss of consciousness, 2. any loss of memory for events
immediately before or after the accident, 3. any alteration of mental state at the
time of the accident, 4. focal neurological deficits, which may or may not be
transient, but when the severity of the accident does not exceed the following: a.
loss of consciousness of 30 minutes or less; b. after 30 minutes, an initial GCS
score of 13-15; and c. posttraumatic amnesia not greater than 24 hours. (Busch
and Alpern, 1998).

This attempt at providing a uniform definition, while again clearly delineating an

upper limit for symptoms indicative of minor head injury, fails to account for the minimal

symptomatology necessary to qualify as such. The minimum limits of this definition

include a wide gradation of injuries, for which the definition itself does not account or

enumerate. Within this category is the type of injury known as concussion, which

according to the Mild Traumatic Brain Injury Subcommittee, also falls under the general

rubric of minor head injury.
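
The structure of the consensus definition quoted above, at least one qualifying feature combined with three severity ceilings, can be sketched as follows. The class and field names are hypothetical, the features are treated as independent inputs for simplicity, and the sketch is an illustrative reading of the quoted wording rather than an official operationalization.

```python
from dataclasses import dataclass


@dataclass
class InjuryPresentation:
    """Hypothetical record of findings after an injury (illustrative only)."""
    loc_minutes: float = 0.0               # duration of loss of consciousness
    memory_loss_around_event: bool = False  # memory loss just before or after
    altered_mental_state: bool = False      # at the time of the accident
    focal_neuro_deficit: bool = False       # transient or not
    gcs_after_30_min: int = 15              # initial GCS score after 30 minutes
    pta_hours: float = 0.0                  # duration of posttraumatic amnesia


def meets_mtbi_definition(p):
    """True if at least one feature is present and no ceiling is exceeded."""
    has_feature = (
        p.loc_minutes > 0
        or p.memory_loss_around_event
        or p.altered_mental_state
        or p.focal_neuro_deficit
    )
    within_limits = (
        p.loc_minutes <= 30
        and 13 <= p.gcs_after_30_min <= 15
        and p.pta_hours <= 24
    )
    return has_feature and within_limits


print(meets_mtbi_definition(InjuryPresentation(altered_mental_state=True)))
```

Under this reading, a brief alteration of mental state with no loss of consciousness qualifies, which illustrates how wide a range of injuries the definition admits.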

Concussion has come to be defined as a "clinical syndrome characterized by

immediate and transient posttraumatic impairment of neural function" (Maddocks and

Saling, 1996). In general, those suffering from concussion are thought to recover without

residual impairment. The categorization of concussion severity was the topic of a recent

set of guidelines authored by the Quality Standards Subcommittee of the American

Academy of Neurology (AAN) in 1997 (cited in Hinton-Bayre et al., 1999). In this

classification scheme, the severity of concussion is based upon a graded scale ranging

from 1, the least severe, to 3, the most severe. Grade 1 concussions are those in which

the individual does not lose consciousness as the result of the insulting force, but

experiences transient confusion (inattention, poor concentration, or inability to process

information or sequence tasks) that resolves in 15 minutes or less. Grade 2 concussions

are those in which, like grade 1 concussions, the individual does not lose consciousness,

but experiences transient confusion that does not remit within a 15-minute time period.

The most severe concussion, grade 3, according to the AAN guidelines, includes a period of

loss of consciousness, which may last seconds to minutes, as well as transient

neurological disruption (Kelly and Rosenberg, 1998). These transient neurological

symptoms have also been the subject of many research investigations. Patients suffering

from minor head injuries, including concussion, often complain of a cluster of symptoms,

which was recognized by Strauss and Savitsky (1934). In their study, these investigators

referred to this complex of symptoms as "post-concussion syndrome" (Watson et al.,

1995). This cluster includes those symptoms mentioned above plus many others, which

may persist for as long as weeks or months after injury. The current study will focus on

the neuropsychological effects observed in concussions incurred in a college football

population.
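
The three-grade AAN scheme described above can be sketched as a small decision function. The function name and parameters are hypothetical, and the handling of an event with neither loss of consciousness nor confusion (returning None) is an assumption, since the text does not address that case.

```python
def aan_concussion_grade(lost_consciousness, confusion_minutes):
    """Return the concussion grade (1 to 3) as described in the text.

    lost_consciousness: True if any loss of consciousness occurred
    confusion_minutes:  duration of transient confusion, in minutes
    Returns None when there is neither loss of consciousness nor confusion
    (an assumption of this sketch, not part of the guidelines as summarized).
    """
    if confusion_minutes < 0:
        raise ValueError("confusion_minutes cannot be negative")
    if lost_consciousness:
        return 3  # any loss of consciousness, seconds to minutes
    if confusion_minutes == 0:
        return None  # no qualifying symptoms reported
    if confusion_minutes <= 15:
        return 1  # confusion resolves in 15 minutes or less
    return 2  # confusion persists beyond 15 minutes


print(aan_concussion_grade(lost_consciousness=False, confusion_minutes=10))
```

Checking loss of consciousness first reflects the ordering in the text, where that single feature is sufficient for the most severe grade regardless of how long confusion lasts.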

Minor Head Injury: Pathology

Researchers have long sought to identify the core features and underlying

mechanism for the cognitive sequelae that are evidenced following mild traumatic brain

injury. Key to this search is the fact that little evidence has surfaced regarding the

biological changes that are thought to occur in these instances (Montgomery et al., 1991).

Lishman (1988) reviews some of the early research involving animal models of

concussion study in the search for the pathophysiological roots of concussion

symptomatology. Initially, he states, cerebral ischaemia as a result of a blow to the head

was thought to be responsible for the pattern of symptoms observed. This was

hypothesized to arise from the rise in intracranial pressure and obstruction of intracranial

circulation. While this hypothesis theoretically accounted for the brief loss of

consciousness observed in many cases of minor head injury, laboratory findings were not

supportive, and other theories were subsequently formulated. Lishman (1988) points to

the contribution of Denny-Brown and Russell in 1941 concerning the theory that shear

strain resulting from rotational forces is in some part responsible for transient cognitive

deficits in mild head injury. The studies of these researchers on

acceleration-deceleration effects upon the brain itself prompted further studies of the

effect of these forces. Denny-Brown's studies with animals suggested that what he

described as "uncomplicated" mild head injury, or concussion, was achieved with his

subjects only when the head was free to move about when the actual insult occurred. His

conclusions, therefore, implicated the acceleration and deceleration forces that were

applied to the skull during the course of the injury.

It was at this point that researchers began to appreciate the fluidity of the

environment within the cranium. That is, investigators moved toward experimental

designs that would allow them to directly observe and measure those forces acting within

the skull itself. Lishman (1988) describes the studies of Pudenz and Shelden in 1946 on

monkeys in which the skullcaps were removed and replaced with transparent material.

By so doing, these researchers were able to observe the movement of the brain at the

moment of impact. They indicated in their report that they observed a "swirling" motion

both at the point of impact and in other areas of the brain not directly impacted by the

blow. It was this swirling motion that was thought to produce many of the symptoms

associated with concussion.

Since the early work of these researchers, many others have confirmed and

elaborated upon these hypotheses. The brain can be thought of as a semi-gelatinous mass

that is suspended in a fluid within a confined space. It is held in place both by its limited

housing within the skull and through its connection via the brainstem and spinal cord.

Because it is "floating" in a homeostatic amount of cerebrospinal fluid (CSF), and

"tethered, it moves out of phase relative to the skull in response to changes in speed or

direction" (McAllister, 1992). The anatomically disrupting effects seen in mild head

injury are thought to be directly correlated with the force of the impact imparted to the

head. Further, relatively few instances of concussion have been noted in cases of static

injury. That is, when the individual is not free to move his head, the results of minor

head injury have not been found to be as substantial (Kingsley, 1996; McAllister, 1992).

Therefore, it would appear that the risk of incurring complications as a result of applied

forces is greatly increased when the head does not have a substance through which it may

conduct a portion of the force vectors acting upon it.

In the case of minor head injury, the most common insults are classified as

diffuse. In this sense, a diffuse injury is one in which a definitive location of focal impact

cannot be identified. Because minor head injury rarely results in fatality, it is often

impossible to detect the subtle neuroanatomical changes inherent to the injury.

Montgomery et al. (1991) suggested that one of the first researchers to identify

anatomical anomalies subsequent to minor head injury was Oppenheimer in 1968. In his

studies, Oppenheimer had the opportunity to examine the brains of five individuals who,

having suffered from mild traumatic brain injuries, died of complications from other

injuries also incurred during the causative incident. Microscopic examination of brain

tissue revealed diffuse axonal disruptions throughout the white matter. In particular, he

indicated that these injuries were more common in the midbrain tegmentum and the corpus

callosum (Montgomery et al., 1991). Subsequent research over the years has indicated

that this diffuse damage to white matter tracts is the probable cause of cognitive

disruption in minor head injury. The motion of the brain within the skull is now thought

to be responsible for the "twisting and stretching" of neuronal axons and the subsequent

damage that occurs. Because this damage can be caused by inertial forces, direct impact

with the head is not necessary to induce detrimental effects (Busch and Alpern, 1998).

Furthermore, Busch and Alpern (1998) point out that damage to axons by shearing forces

causes tearing and the disruption of other cell milieu including blood vessels, possibly

altering blood transport to the directly affected neurons and those immediately

surrounding. In reviewing the histopathology of minor head injury, Busch also refers to

the work of Strich (1956) and Povlishock et al. (1983) in which damage to neurons by

shear or stretching was observed extensively in the white matter of the hemispheres and

in the brainstem. In addition, these researchers discovered that retraction balls could be

viewed in the instances where axonal fibers had been disconnected.

McAllister (1992) suggests that a predilection for these types of anatomical

disruptions occurs in the corpus callosum, superior cerebellar peduncle, basal ganglia,

and periventricular white matter. In each of these cases, the long white matter tracts

appear to be more susceptible to the damaging effects of axonal twisting than are other

areas. In addition, because the brain is situated in a hard casing, the opportunity for

injury by impact with the skull wall is also present, particularly in those regions

vulnerable to bearing the brunt of the type of motion that is often seen in cases of mild

head injury.

Gennarelli's research in the area of traumatic brain injury has provided evidence

to support what he has termed the "coup/contrecoup" distinction. In general, an

injury resultant from a coup force is observed locally, at the site of the impact of the head

with the offending object (Gennarelli and Graham, 1998). He further asserted that the

area and severity of damage are directly related to the size and force of the impacting

object. He therefore suggested, while warning against overgeneralization, that coup injuries are

often associated with the acceleration of the head. Contrecoup injuries, on the other

hand, are often seen as arising from the sudden deceleration of the head. These insults

are "away from the site of injury" (Gennarelli and Graham, 1998). While they are often

observed in a direct line with the site of impact and the coup-related injury site,

because of the nature of the internal environment of the cranium, this is not always the

case. The location of the brain within the skull, as mentioned above, predisposes certain

areas to damage. These areas include the temporal lobe poles, the frontal lobe poles, and

the parasagittal region (Parker, 1996), all of which are thought to be commonly affected to a

degree in many cases of mild head injury.

In the early 1980s, Gennarelli and his colleagues suggested the term "diffuse

axonal injury" (DAI) in the description of the anatomical effects observed in many

instances of traumatic brain injury. As described above, focal damage produced by a

non-missile, closed-head injury is rather rare. More often, diffuse axonal damage is

prevalent. In elaborating on the definition of DAI, researchers formulated three

categories of DAI (Gennarelli and Graham, 1998). In a DAI of grade I, damage to axons

is notable within the white matter of the hemispheres and occasionally in the brain stem

and corpus callosum. This is most likely the extent of the damage seen in most cases of

minor head injury. A grade II DAI also evidences some "focal abnormalities in the

corpus callosum, often associated with small hemorrhages, called tissue tear

hemorrhages" (Gennarelli and Graham, 1998). According to this gradation system, grade

III DAIs also include extensive damage to the rostral brainstem. In a 1968 paper,

Oppenheimer and his colleagues reported their finding that "clusters of microglia" were

found in the white matter of patients who, having suffered minor head injury, had died of

unrelated causes. Gennarelli suggests that this finding has been supported recently by

other investigators and has been elaborated upon by others who have discovered the

presence of what he terms "microscopic DAI" (Gennarelli and Graham, 1998).

However, Gennarelli suggested that a possibly more important discovery was

made by Blumbergs et al. (1994) when these investigators reported having performed

immunostaining with an amyloid precursor protein (APP) antibody. In their results, these

researchers reported observing multiple areas of diffuse axonal injury when using this

staining technique in five patients who had received minor head injury. Experimentation

with other antibody staining techniques has met with less success in identifying those

regions that have suffered diffuse axonal injury. Staining for the detection of β-APP

remains the most sensitive method for elucidating this phenomenon. Gennarelli warns, though,

that this should not be considered a marker for head injury since heightened production

and collection of β-Apolipoprotein is also seen in the "dystrophic axons" of some elderly

patients and in some cases of known transient ischemic attack and infarction. He

suggests that an accumulation of this precursor protein should suggest an area in which

metabolic change has taken place, which certainly could be the case in minor head injury,

though it may also be true in other instances as well.

Recent work by Roberts et al. (1994) has also suggested a link between the

accumulation of β-APP and the increased risk for the development of Alzheimer's

Disease (AD). The work of these researchers supports the hypothesis that the expression

of β-APP is indeed part of "an acute phase response" to the type of neuronal damage that

is seen in minor head injury. They further suggest that the overaccumulation of the

precursor protein may lead to a build-up of β-APP and subsequently lead to the

expression of certain AD symptoms, or an increase in an individual's predilection to

develop AD later in life if they are already genetically susceptible. In addition to these

important biochemical findings, Busch and Alpern (1998) cite the work of Povlishock et

al. (1983) in which it was found that minor head injury can lead to a disruption in

electrolyte transfer and capillary permeability.

While the neuroanatomical effects of minor head injury are not easily identified

by common medical measures such as imaging devices and biochemical markers, recent

research in the area of diffuse microscopic damage and the detection of the accumulation

of amyloid precursor protein and its subsequent correlation with outcome have suggested

to investigators that there are biological consequences of even minor head injury. It is

becoming more commonly accepted that these changes do play a role in many of the

neuropsychological and other cognitive changes that are often observed following even

the mildest forms of head injury.

Neuropsychological Effects of Minor Head Injury

While the psychosocial ramifications of minor head injury and the effect of

persisting post concussion syndrome symptoms have received a large amount of attention

in the literature, the issue of cognitive impairment suffered as a result of minor head

injury has been one that has produced varying and debated results. This is due in large

part to the methodology used in different studies. However, there are some common

aspects to the findings from various research groups that suggest that minor head injury

does contribute to detectable cognitive sequelae.

The studies of Rimel et al. (1981) and Barth et al. (1983) are often cited in the

literature when referring to the finding that minor head injury produces a measurable

neuropsychological change (Dikmen et al., 1986). The findings by these two research

groups, using the same population, suggested that there are noticeable cognitive as well

as psychosocial deficits in those having suffered minor head injury. Dikmen argues that

these findings may not be accurate, however, since these groups used what she describes

as an inappropriate normative comparison group. While the groups were impaired

compared to this normative group, she states, it is difficult to parse out the effects of other

factors that are not known, including premorbid neurological status. In her own study on

the effects of minor head injury on neuropsychological functioning, Dikmen indicates

that the findings were much less definitive when testing her subjects at 1 and 12 months

post injury.

Dikmen performed a wide range of neuropsychological tests on 20 consecutive

hospitalized patients who had experienced a period of coma not over 1 hour or, in the

absence of coma with PTA of at least 1 hour, a GCS score of greater than or equal to 12

on admission, with no evidence of cortical or brainstem contusion, no history of prior

head injury or neurological insult, and who were between the ages of 15 and 60. She utilized a control

group of friends and relatives of those suffering from minor head injury rather than the

published normative data for the individually administered tests. Although her results

were not as conclusive as those of Rimel and Barth, she reported that there was a trend

for her minor head injury population to perform more poorly on tests of concentration

and short-term memory at 1 month after injury. This was supportive of the findings of a

previous and oft-cited study by Gronwall and Wrightson (1974) who showed that those

suffering minor head injury and suffering from post concussion symptoms tended to

demonstrate impairment in tests of attention and processing speed. She tempers her

finding, though, by stating that her results may not be indicative of significant differences

between the groups if one takes into account the premorbid differences in intellectual

ability between them (Dikmen et al., 1986). She did not have any measurements of

baseline functioning with which to compare her findings.

In another study of a large group of head injured individuals participating in three

separate experiments, Dikmen again sought to clarify the neuropsychological

ramifications of traumatic brain injury (Dikmen et al., 1995). The effort included 436

adult head-injured patients, 243 of whom had a GCS of 13-15 on admission. In this

study, she indicated that the extent of neuropsychological impairment observed was

directly related to the duration of coma associated with the injury. In the group that she

labeled as 'mild head injury' (those with coma of duration less than 1 hour and GCS of

13-15 on admission), she found that there was no statistically significant difference from

a trauma control group in neuropsychological performance. She suggests that the results

are similar to her previous study of neuropsychological outcome in minor head injury

with respect to the finding that measures of attention and short-term memory show a

trend toward persisting difference in those with minor head injury. In her discussion of

results, she stressed the fact that measures of attention and memory tend to show decline

with increasing severity of head injury. Again, she had no baseline comparative

measures.

Dikmen's studies of the neuropsychological functioning of those suffering minor

head injury have taken place at relatively long periods after the injurious incident (1

month and 1 year), and have shown limited persisting differences between the injured

group and the control population. Likewise, a meta-analytic review by Binder et al.

(1997) using 8 studies comprising 11 samples of individuals with a history of mild head

trauma at least 3 months after receiving their injury resulted in an overall nonsignificant

effect size of .07 when comparing a composite neuropsychological test score to control

groups. However, when they used the d statistic, they found a significant effect of .12,

p < .03 for measures of attention. It was their conclusion then, much like Dikmen's, that

the neuropsychological effects, if any, of minor head injury are generally resolved from 1

to 3 months post-injury.

In a study by Montgomery et al. (1991) of 26 consecutive admissions to an

accident and emergency room for minor head injury (defined by this group as an injury

requiring an overnight inpatient hospital stay as well as a PTA period of less than 12

hours), investigators utilized a Four-Choice Reaction-Time measurement in order to

detect possible cognitive sequelae of the injury. In addition, a measurement of brainstem

evoked potential and subjective measure of post concussive complaints were performed

on the same day as the injury, six weeks later, and at six months post-injury. Fenton

reported that head-injured patients exhibited a significantly prolonged reaction time

measurement on the day of the injury when compared to a control group matched for age,

sex, and social status. Results indicated "serial improvement" at the six-week and six-

month follow-up. Interestingly, over half of the patients involved in this study still

complained of post-concussion syndrome symptoms at the six-week and six-month

follow-up assessments. In addition, more than half of these patients exhibited what

Fenton called "significant delays" in brainstem conduction during the first testing session.

Other studies examining the cognitive sequelae of minor head injury have shown

that more significant differences are detectable in the acute and early recovery phases of

head injury and are often associated with the degree of persisting post concussional

symptoms. This is particularly true in investigations examining the effects of minor head

injuries incurred in contact sport related incidents, which usually involve lesser amounts

of applied forces (acceleration and deceleration) and generally result in milder concussive

effects (Maddocks and Saling, 1996).

Sports-related Minor Head Injury

The interest in sports related head injuries in the head injury literature has a long

history, though relatively little methodologically sound research has been conducted in

the area until recently. A 1991 report by the National Health Interview Survey estimated

that approximately 300,000 people in the United States suffered a head injury as a direct

result of participating in a sporting activity. Of these, 34% (approximately 103,000) were

categorized as "mild" (Kelly and Rosenberg, 1998; Thurman et al., 1998). In addition,

the survey contended that many of those individuals suffering head injuries never sought

medical treatment, and therefore many estimates of national head injury prevalence may

be significantly below the actual number. Until the last 10 to 15 years, however, most

investigation into the cognitive effects of sports-related head injury had been isolated or

anecdotal, and some investigations continue to employ methods that bring the validity of

findings into question.

In an article published in the Journal of the American Medical Association in

1928, Martland described the symptoms that he observed in boxers after receiving

repeated blows to the head. He termed the condition "punch drunk" and described the

symptoms as including a slight unsteadiness in gait or uncertainty in equilibrium, slight

mental confusion, and distinct slowing of muscular action. While he noted that most of

these cases remained relatively minor, some resulted in more severe neurological

complication including hesitancy in speech, tremors of the hands, and nodding

movements of the head (Martland, 1928). In addition, he postulated that the effects of

repeated blows to the head might cause lasting damage. While he indicated that he had to

rely in some cases on observation garnered from "laymen", he suggested that he could

identify many individuals known to suffer from persisting symptoms thought to be

related to repeated blows to the head. He was unable to empirically evaluate many of

them, he stated, as they were housed at the time in asylums across the country.

Further corroboration of this syndrome was provided by Millspaugh (1937) in his

review of the history of naval boxers. Millspaugh stated conclusively that there are

indeed cognitive sequelae to the repeated blows received in a boxing match, although the

cognitive deficits he described were of an objective nature. He also described a complex

of symptoms acutely accompanying repeated blows to the head, which included "fullness

in the head, cephalgia, vertigo, irritability, insomnia, increased fatiguability, and memory

defects" (Millspaugh, 1937). In addition, he described what appeared to him to be a

period of amnesia in both those suffering loss of consciousness during the fight, and

those without LOC. He took this to indicate that there was some physical damage to the

brain itself which may manifest for long periods of time. In his concluding remarks,

Millspaugh states: "repeated and frequent concussions, occasionally very severe, often

undoubtedly associated with intracranial capillary hemorrhages are to say the least not

conducive to stabilized mental equilibrium."

In a study by Tysvaer et al. (1989), 37 former professional soccer players were

examined in order to determine the existence of persisting neuropsychological deficits

caused by heading a soccer ball. The players ranged in age from 35 to 64, and had no

reported history of alcohol abuse. Investigators interviewed the players and used the

Wechsler Adult Intelligence Scale (WAIS), the Trail Making Test (parts A and B), a

modified version of the Halstead-Reitan aphasia screening test, tests of sensory-

perceptual functioning, motor tests, and the Benton visual retention test, Form C, in order

to determine neuropsychological status. Their scores on these tests were compared to a

control group consisting of 20 hospitalized patients who manifested signs of a variety of

disorders. These researchers found that the former soccer players exhibited a

significantly larger mean split between Performance IQ (PIQ) and Verbal IQ (VIQ) on

the WAIS, and significantly worse scores on both Trail Making Tests A and B. They

concluded that their results suggested that head trauma sustained while playing soccer

can result in permanent brain injury, as exhibited by performance on

neuropsychological tests measuring perception, learning, manual dexterity, speed, and

attention. While these results may be suggestive of persisting cognitive deficits as a

result of sports-related head injury, the methodology is fraught with problems. Most

importantly, the authors do not report the neurologic history of any of the players tested.

Therefore, it is impossible to distinguish the extent of head injuries any one individual

may have suffered. In addition, they do not appear to use the time since the supposed

injury as a covariate in examining test performance, nor do they provide a history of other

factors that may play a role in affecting cognitive functioning. Further, researchers

reported no knowledge of cognitive functioning before these individuals played soccer.

Finally, the researchers report that the control group used was one consisting of

hospitalized patients with "varying" injuries. Without knowledge of the history of the

group, it would appear that it is an inappropriate normative comparison.

The recent examination of minor head injury incurred in sporting events has

several advantages over other scenarios in which the state of the individual at the time of

the accident and observation of their immediate behavior is not possible (Collins et al.,

1999). Within the realm of organized sporting events involving physical contact between

players, the opportunity exists to assess injury very soon after the incident because of the

immediate availability of medical staff. In addition, there is often a means of reviewing

the incident via television recording. This affords the opportunity to assess the types of

forces applied to the head of the injured player. Further, some investigators now collect

baseline measures of neuropsychological functioning before a player can compete in the

contact sport. This allows for comparison of findings subsequent to the injurious incident

and more accurate analysis of subsequent cognitive changes following head injury.

A prospective study by Barth et al. (1989) of the cognitive effects of minor head

injury in a college football population suggested that neuropsychological testing is a

sensitive device in the detection of subtle cognitive changes following concussion. This

group collected baseline measures for 2,350 college football players at a total of 10

universities. Subsequently, they retested 182 players with documented cases of

concussion within 24 hours of the injury and again 5 and 10 days later. The battery used in this

investigation consisted of Trail Making Tests A and B, the Symbol Digit Test, and the

PASAT. As a comparison group, Barth used a population of 59 players sustaining "mild

orthopedic injuries and 48 male college students" (Barth et al., 1989). These individuals

were tested with the same protocol using the same time period as the concussed players.

Barth concluded that while the control group performed significantly better on the

"within 24 hour" PASAT and Symbol Digit test, constituting what he termed "normal

testing behavior," (exhibiting practice effects) the concussed group failed to show

improvement on these two measures within 24 hours of receiving a concussion.

However, he noted that these individuals tended to show the same improvement as

controls on the day 5 tests of these measures with a leveling at day 10, which suggested

to Barth that these individuals had returned to baseline functioning at that point. The

investigators failed to show the same pattern of results on the Trail Making Tests, which

Barth suggested may be due to the strong effects of practice inherent to its successive

administration. As a result of this study, Barth concluded that a single incident of mild

concussion in a college football player results in detectable neuropsychological changes

within 24 hours of the incident, and that a recovery to baseline functioning occurs quickly

within the 5 to 10 days following injury.

In a study of 130 Australian Rules Football players by Maddocks and Saling

(1996), baseline measures of cognitive function were collected using the Paced Auditory

Serial Addition Test (PASAT), the Digit Symbol Substitution Test (DSST), and Four-

Choice Reaction time, which involves measures of decision and movement times. Of the

130 with baseline cognitive measures, Maddocks reported on 10 players who

subsequently received concussion (5 were described as incurring no LOC, 5 had LOC of

less than 1 minute and PTA of less than 30 minutes) and were re-tested at 5 days post-

injury. He indicated that while the neurological features of concussion, including

headache, nausea and dizziness, had remitted after 5 days, persisting neuropsychological

deficits were noted on the DSST and Four Choice Reaction time (decision time). He

noted that significant differences were not found on the PASAT, a test that previous

authors had found to be particularly sensitive to the effects of minor head injury due to

the attentional component inherent to performing the task. Maddocks suggested that

since his population of concussed individuals presented with milder forms of minor head

injury than most minor head injury populations examined, the sensitivity of the test may

be less effective. Of importance, Maddocks utilized a control group consisting of age-

matched umpires who were tested on two separate occasions with the same test measures.

Although this may allow for comparison with the head injured population taking into

account the effect of practice on the various measures, Maddocks fails to elaborate on

other factors that may have made this an inappropriate comparison group (e.g., history of

past head injury, inter-test duration, educational history, etc.).

In another study by Hinton-Bayre et al. (1997), 54 players from a professional

rugby team were tested at baseline using the Symbol Digit Modalities Test (SDMT), the

Digit Symbol Substitution Test (DSST), and the Speed of Comprehension Test. Ten

players subsequently received concussions and were tested within 24-48 hours of the

incident. The authors reported that those measures of speed of information processing

were sensitive to the cognitive effects of the concussion, though an untimed test of word

recognition was not. They further stated that tests of speed of information processing

were more sensitive than either the SDMT or the DSST based on comparison with

baseline measures.

In a 1998 study by McCrea et al., researchers administered the Standardized

Assessment of Concussion (SAC) to 568 high school and college football players before

the 1995 and 1996 seasons. The SAC consists of a brief measure of orientation, short-

term delayed verbal memory (5 words recalled immediately and after a 5 minute delay), a

brief measure of attention and short-term memory (reverse Digit Span), and concentration

(recitation of the months of the year in reverse order), as well as several exertional

exercises in order to create conditions of intracranial pressure during which post

concussive symptoms are most likely to be seen. The scores are summed to

produce a composite total with a maximum score of 30. Of the 568 individuals tested at

baseline, 33 subsequently experienced concussion and were tested immediately. Twenty-eight of

the 33 were also tested 2 days after the injury occurred. McCrea found that the

concussed players performed significantly below both the control group and their own

baseline composite measures on the SAC. At 48 hours, he indicated the concussed

players had returned to their baseline level. He concluded that this brief battery of

modified neuropsychological measures is sensitive to the cognitive deficits obtained from

minor head injury incurred in football, and suggests that his observation that all

concussed players were not different from baseline at 48 hours is consistent with other

findings. Of note, however, was the lack of attention to the effects of practice, and the

possibility that they may have contributed to improvements seen at the follow-up testing,

thus potentially masking any persistent cognitive effects of concussion. Further, the lack

of matched controls excludes the researchers from making comment on the potential

effects of age and education level on results obtained, although they contend that these

variables have "minimal" effect on performance and that there were no significant

differences between the scores of high school and college players.

A more recent test-retest study by Barr and McCrea (2001) described the results

of the administration of the SAC to concussed players immediately following the injury

compared to their baseline performance. A control group of non-concussed players was

also tested at the same time intervals. These researchers found that both a reliable

change index approach and a linear regression approach were successful at identifying

detriments in cognitive performance immediately following concussion. Receiver

operating characteristic curves were used to determine the statistical significance of the

models in accurately classifying a player as concussed or not concussed.

A study by Collins et al. (1999) tested 393 athletes from 4 university football

teams using a brief neuropsychological battery that included the Hopkins Verbal

Learning Test (HVLT; verbal learning and delayed memory), Trail Making Tests A and

B, Digit Span test, Symbol Digit Modalities Test, Grooved Pegboard Test, and Controlled

Oral Word Association. In addition, the athletes completed the Concussion Symptom

Scale, a self-report, Likert scale of post concussion symptoms. Subsequently, the group

retested 16 individuals who suffered concussion, both within 24 hours of the incident and

at 3, 5, and 7 days afterwards. Collins concluded, after performing a discriminant

analysis on the resulting data, that the use of this brief neuropsychological battery

resulted in an overall correct classification rate of 89.5% for the concussed and control

groups at the 24-hour testing period. In particular, he noted that concussed individuals

were significantly worse on the Hopkins Total score and Hopkins Delay score than the

controls, and that moderate differences between the groups persisted until the 5-day

testing. While not included in the analysis of concussed players, Collins et al. analyzed

the baseline scores of all 393 athletes and discovered an interaction between history of

diagnosed learning disability and history of previous concussion, and significantly lower

baseline performance on two measures (Trail Making Test B and Symbol Digit

Modalities Test). This finding suggests that these demographic variables may be

important to examine and take into account when evaluating the neuropsychological

performance of an athlete after concussion.

Reliable Change

The tests of cognitive change and recovery of function described above generally

have utilized one of two methods to indicate recovery after concussion and fitness to

return to play. One assesses a player's return to baseline functioning (Barth et al., 1989;

McCrea et al., 1998), and the other a comparison to a control group involving the

determination of differences between serial testing results (Collins et al., 1999; Maddocks

and Saling, 1996). Hinton-Bayre et al. (1999) suggested that the former does not

properly account for the effects of practice, while the latter provides limited information

without the availability of "suitable" normative data. The results of neuropsychological

testing are influenced by many factors in addition to the cognitive effects of minor head

injury. Especially in repeated, or serial testing, as is performed in the evaluation of

recovery of a concussed athlete in order to determine return-to-play status, the results of

testing may be influenced by such things as the reliability of the test being used, practice

effects, and the attitude of the test taker (Temkin et al., 1999). Even when there is no true

change in cognitive status, these variables may suggest that a change has taken place. In

this sense, the ability to detect true change in an individual case can be affected by many

factors. Temkin et al. (1999) examined four statistical models in order to determine

which was most sensitive to detecting "real change" in the individual case. They used 7

test measures and retest scores from 384 "neurologically stable" individuals in the

comparison of four statistical models in order to address this question. These models

were the basic Reliable Change Index (RCI), the RCI with correction for practice effects,

a simple regression model, and a multiple regression model. Each of these models was

used in order to generate a predicted retest score which was then compared to the actual

retest score. The difference between the two, or residual, was then compared to a

confidence interval that was calculated to contain 90% of those scores expected given no

real change. This method was based on $z_{\alpha} = z_{.95} = 1.645$. This value suggests that 5% of

scores were thought to be significantly higher than the normal population, while 5% were

assumed to be significantly lower. In other words, any residual score lying outside this

interval was assumed to reflect real change.
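
The comparison of a residual against the 90% interval described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, and the standard error (`se`) is passed in by the caller rather than estimated, since the appropriate standard error depends on which of the four models produced the predicted retest score.

```python
from statistics import NormalDist


def z_cutoff(coverage: float = 0.90) -> float:
    """Two-sided z cutoff for an interval holding `coverage` of no-change scores."""
    tail = (1.0 - coverage) / 2.0  # 5% in each tail when coverage is 90%
    return NormalDist().inv_cdf(1.0 - tail)


def classify_residual(actual: float, predicted: float, se: float,
                      coverage: float = 0.90) -> str:
    """Label a retest score by where its residual falls relative to the interval.

    `se` is the standard error belonging to whichever model produced
    `predicted` (an assumption of this sketch; it is supplied, not estimated).
    """
    residual = actual - predicted
    half_width = z_cutoff(coverage) * se
    if residual > half_width:
        return "reliable increase"
    if residual < -half_width:
        return "reliable decrease"
    return "no reliable change"
```

With 90% coverage, `z_cutoff` returns approximately 1.645, so a residual is flagged only when it exceeds 1.645 standard errors in either direction.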

Reliable Change Index

The Reliable Change Index was originally developed to assess the effectiveness

of clinical interventions (Hinton-Bayre et al., 1999). It was used by Jacobson and Truax

(1991) with the idea that clinically significant change was related to a return to normal

functioning. The model is based on the assumption that a change in test scores from one

time to the next is significant if the size of that change is large compared to the standard

error of the difference of the test. Jacobson and Truax (1991) indicate that the error

variance of a test is calculated through the use of test-retest reliability and the variation of








scores around the mean. According to Hinton-Bayre, the formula for obtaining an

individual's RCI score is:

$$RCI = \frac{x_2 - x_1}{S_{diff}}$$

In this instance, $x_1$ is the individual's first score, $x_2$ is the retest score, and $S_{diff}$ is the

standard error of the difference, which is calculated as follows:

$$S_{diff} = \sqrt{2(S_E)^2}$$

The standard error of the measurement, $S_E$, is obtained through the use of the standard

deviation of the control group, $S$. In addition, the reliability of the measure, $r_{12}$, is used in

obtaining this statistic:

$$S_E = S\sqrt{1 - r_{12}}$$ (Hinton-Bayre et al., 1999)

Again, in the use of this statistic, an RCI z value exceeding 1.96 is considered to be a

significant change.
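
As a worked illustration of the RCI calculation, the sketch below chains the three quantities together: the standard error of measurement (the control-group standard deviation times the square root of one minus the test-retest reliability), the standard error of the difference (the square root of twice the squared standard error of measurement), and the RCI itself. The function names and the example scores, standard deviation, and reliability are hypothetical values chosen for arithmetic convenience, not data from this study.

```python
import math


def standard_error_of_measurement(sd: float, r12: float) -> float:
    """S_E = S * sqrt(1 - r12), with S the control-group SD and
    r12 the test-retest reliability."""
    return sd * math.sqrt(1.0 - r12)


def standard_error_of_difference(sd: float, r12: float) -> float:
    """S_diff = sqrt(2 * S_E ** 2)."""
    se = standard_error_of_measurement(sd, r12)
    return math.sqrt(2.0 * se ** 2)


def reliable_change_index(x1: float, x2: float, sd: float, r12: float) -> float:
    """RCI = (x2 - x1) / S_diff, where x1 is the first score and x2 the retest."""
    return (x2 - x1) / standard_error_of_difference(sd, r12)


if __name__ == "__main__":
    # Hypothetical example: baseline 25, retest 18, SD 4.0, reliability 0.75.
    rci = reliable_change_index(x1=25, x2=18, sd=4.0, r12=0.75)
    print(round(rci, 2), abs(rci) > 1.96)
```

With these assumed inputs the standard error of measurement is 2.0, the standard error of the difference is about 2.83, and the resulting RCI of roughly -2.47 exceeds the 1.96 criterion in absolute value.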

In the analysis of Temkin et al. (1999), a change was considered to be reliable,

using the RCI, if the absolute value of the post-test minus the pre-test exceeded the

standard error of the test-retest difference of the normative sample multiplied by the z-

score cutoff described above. This created a prediction interval that should contain 90%

of the scores in a normal population. Any change score lying outside of this interval in

either direction was considered "real" change.

Reliable Change Index with Correction for Practice

The second model used by Temkin et al. (1999) was a slight variation on the RCI

model described above. This model has been used by others in the evaluation of

cognitive outcome following various surgical procedures (Chelune et al., 1993; Hermann








et al., 1996; Kneebone et al., 1998; Sawrie et al., 1996). In this model, practice effects for

individual tests are determined by calculating the mean difference between the test and

retest scores of the normative population. This mean difference is then added to the

baseline score of the individual in question in order to calculate a predicted follow-up

score. A reliable change is then determined by the amount of difference between the

actual retest score and the predicted. A confidence interval is determined just as it was

for the basic RCI. That is, the confidence interval is determined by the standard error of

the difference of the normative sample multiplied by the z-score that determines the

cutoff point (generally Z.95).
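
The practice-corrected procedure can be sketched as follows. This is an illustrative outline only: the function names, the small normative sample, and the standard error of the difference passed in are all hypothetical, and the default cutoff of 1.645 simply follows the 90% interval described above.

```python
from statistics import mean


def practice_effect(norm_baseline, norm_retest):
    """Mean retest-minus-baseline difference in the normative sample."""
    return mean(r - b for b, r in zip(norm_baseline, norm_retest))


def practice_adjusted_change(baseline, retest, practice, s_diff, z_cut=1.645):
    """Classify a retest score against baseline plus the mean practice effect.

    The interval half-width is z_cut * s_diff, as for the basic RCI.
    """
    predicted = baseline + practice
    residual = retest - predicted
    half_width = z_cut * s_diff
    if residual < -half_width:
        return "decline"
    if residual > half_width:
        return "improvement"
    return "no reliable change"


if __name__ == "__main__":
    # Hypothetical normative test and retest scores.
    gain = practice_effect([20, 22, 24, 26], [22, 25, 25, 28])
    print(gain)
    print(practice_adjusted_change(baseline=25, retest=20,
                                   practice=gain, s_diff=2.0))
```

The only difference from the basic RCI is the shift of the expected retest score by the normative practice effect; the width of the interval is unchanged.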

Linear Regression

The third model utilized by Temkin et al. (1999) was a basic linear regression. A

formula that can be used for predicting any score in the population under investigation is

calculated by regressing the retest score of the normative sample on their initial scores.

Temkin suggests that this model, originally proposed by McSweeny et al. (1993),

controls for both practice effects and regression toward the mean. Reliable change is

determined again by examining the difference between the observed retest score and the

predicted score. A prediction interval is calculated based on the residuals estimated by

the equation multiplied by a z-cutoff score (again, generally Z.95).
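
A regression-based change score of this kind can be sketched in a few lines. The example below fits an ordinary least-squares line to hypothetical normative baseline and retest scores, then standardizes an individual's residual by the residual standard error. The function names and data are assumptions for illustration, and the sketch uses the residual standard error directly rather than a full prediction-interval formula.

```python
import math


def fit_line(x, y):
    """Ordinary least-squares fit of y on x.

    Returns (intercept, slope, residual standard error with n - 2 df).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return intercept, slope, math.sqrt(sse / (n - 2))


def regression_z(baseline, retest, intercept, slope, resid_se):
    """Standardized difference between observed and predicted retest score."""
    predicted = intercept + slope * baseline
    return (retest - predicted) / resid_se


if __name__ == "__main__":
    # Hypothetical normative sample: baseline scores and retest scores.
    b0, b1, se = fit_line([10, 20, 30, 40], [14, 21, 33, 40])
    z = regression_z(baseline=30, retest=25, intercept=b0, slope=b1, resid_se=se)
    print(round(b0, 2), round(b1, 2), round(se, 3), z < -1.645)
```

Because the slope is estimated from the normative data, a slope below 1 pulls extreme baseline scores toward the mean at retest, which is how this approach accounts for regression toward the mean as well as practice.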

Multiple Regression Model

The final model investigated by Temkin et al. (1999) incorporated the use of

stepwise linear regression in order to determine a predicted retest score. Multiple

measures were used in the stepwise regression which the investigators thought might be

important in determining follow-up scores. These included baseline test score, test-retest








interval, demographic variables, and a measure of overall neuropsychological

competence. In order to account for nonlinear relationships between test and retest score,

these investigators also included the square and cube of baseline scores in their multiple

regression models for each measure. Temkin explains that variables were added to the

stepwise linear regression if they had the "highest partial correlation among the variables

not already in the equation, and if the significance level associated with the variable was

under .05." Variables were removed from the equation if, after adding another variable,

the significance was above .10.

The results of Temkin et al. (1999) suggest that the prediction intervals associated

with the four models became narrower, and therefore more sensitive to the detection of

change, with the growing complexity of the methods used. In this instance, the term

"complexity" does not necessarily refer to the difficulty of mathematical operations, but

to the amount of error variance accounted for (e.g., test-retest reliability, practice effects,

regression to the mean, etc.). That is, the RCI was less sensitive than the RCI with

practice effects, which was less sensitive than the linear regression, which in turn was

less sensitive than the multiple regression model. However, they indicate that the RCI

produced high error rates due to the non-normality of the population residuals. This was

in large part due to the practice effects seen in the individual measures, which caused the

population data to become somewhat skewed. When practice effects were taken into

account, the residuals better estimated a normal curve, and predicted scores were more

closely correlated with observed scores. They report that overall prediction accuracy was

similar for the last three models, though, with the addition of demographic variables in

the multiple regression model, significantly better prediction was possible.








Specific Aims and Hypotheses

Given the magnitude of the occurrence of head injury in the United States, and the

subsequent cognitive as well as psychosocial consequences inherent to it, the importance

of detecting neuropsychological deficits is clear. The literature suggests that minor head

injury in sports, a particularly common occurrence in a college football population, varies

significantly from person to person, and that subtle deficits may be indicative of the

lingering effects of concussion. The assessment of recovery from concussion is

particularly important when one considers the potential effects of repeated concussion

and its long-term consequences (Cantu and Voy, 1995; McCrory and Berkovic, 1998).

As Collins et al. (1999) have indicated, the possibility of a correlation between multiple

concussion and reduced baseline neuropsychological functioning makes the assessment of

concussion a particularly salient issue in the student athlete population.

Studies on the effects of minor head injury to this point have produced varying,

often contradictory results. This is in no small part due to the variability of the

methodology utilized. In many cases, investigators have used inappropriate normative

samples or comparison groups, while in others, factors contributing to change (or lack

thereof) in serial testing have not been taken into account. More sensitive and sound

statistical measures are needed in the assessment of head injury and the period of

recovery after head injury.

The present study is designed to further the research on the sensitivity of

neuropsychological measures over serial testing sessions in assessing the functioning of

individuals suffering minor head injury in a college football population. The aims of this

investigation are (1) to analyze the effectiveness of various statistical models in detecting








ongoing cognitive sequelae in concussed college football players via serial

neuropsychological testing, (2) to determine appropriate thresholds for test performance

resulting in acceptable sensitivity and specificity classification values, (3) to determine

those functions which appear most indicative of enduring cognitive complaints over the

7-day period following concussion, and (4) to calculate representative models in order to

aid in prediction of cognitive recovery following concussion. It is hypothesized that each

progressively "complex" model will exhibit increasing sensitivity in terms of

demonstrating lower test performance relative to baseline at the retest session. It is also

hypothesized, based on the findings of Temkin et al. (1999), that only the most complex

models (i.e., regression models) will allow for accurate classification of concussed

individuals beyond the 24-hour test session. As was the case in the Temkin et al. (1999)

study, the term "complex" is used here to denote the level to which potential error

variance is taken into account. Therefore, in relation to the current study, it is

hypothesized that those models which account for more potential error variance will be

more successful at detecting real post-concussive cognitive decline at the more distal

testing sessions.















CHAPTER TWO
METHODS

Subjects

This study included the subjects that were originally reported by Collins et al.

(1999) as well as others from a continuing data collection project at the University of

Florida. Participants included male football players from four universities: the University

of Florida, Michigan State University, the University of Pittsburgh, and the University of

Utah. All participants in the study were volunteers who agreed to participate in the study

without compensation and without financial responsibility. All participants on the

football teams of these respective schools were asked to participate.

In the initial meeting with the student athlete, baseline data was collected

regarding age, football position, handedness, college entrance exam scores (SAT and/or

ACT), high school grade point average, highest level of education achieved by mother

and father, occupation of mother and father, history of diagnosed learning disability

(LD), history of diagnosed attention deficit-hyperactivity disorder (ADHD), history of

neurological problems (e.g., epilepsy, migraine headaches, etc.), history of significant

medical illness (e.g., asthma, diabetes, heart problems, etc.), history of psychological

illness (e.g., depression, anxiety, etc.), history of alcohol and substance abuse, and history

of recent medication use. In addition, each participant was asked to provide a history of

previous concussion (including date, description of circumstances surrounding the

incident, incidence of loss of consciousness, incidence of post concussive symptoms,








length of symptom presentation, and the results of imaging tests if any). All concussions

were subsequently classified according to the grading system outlined by the American

Academy of Neurology (1997).

Protocol

Baseline

Approval by the Institutional Review Boards for each respective school was

obtained. Each participant was provided the opportunity to review the nature, purpose,

and projected use of the data with either a Ph.D. psychologist or a doctoral level student.

All data were collected by Ph.D.-level psychologists, doctoral students, or in some

instances, team physicians and athletic trainers who had been trained in administering

this brief battery. Each had attended a training session in test administration in order to

ensure standardized administration techniques. Tests were scored in a standardized

fashion, and the results from other institutions were sent to this investigator for entry into

a centralized database.

Each participant was administered a brief neuropsychological battery including

the following tests: Hopkins Verbal Learning Test (HVLT), Trail Making Tests A and B,

the Digit Span subtest of the Wechsler Adult Intelligence Scale-Revised, Symbol Digit

Modalities Test (SDMT), Grooved Pegboard, and the Controlled Oral Word Association

test (COWA). This battery is one that is currently used in a professional football study

on the effects of concussion and was recommended by Lovell and Collins (1998) for

study of concussion in a college football population.

The Hopkins Verbal Learning Test (Brandt, 1991) consists of a presentation of 12

words belonging to three semantic categories. Three learning trials are administered,








after which a delay period is interjected for approximately 15-20 minutes. The subject is

then asked to recall as many words as they can remember. After this delayed free recall,

a 24-word recognition list is presented which contains all of the target words in addition

to 6 semantically related words and 6 unrelated words. As the words are read to the

subject, they are simply asked to identify those words that were presented previously.

This test allows for a measure of short-term verbal memory, attention, concentration, and

learning. In addition, a measure of delayed verbal memory is calculated as well as a

score for recognition memory. In addition, within the recognition trial, scores for related

false positive responses and unrelated false positives are calculated. A discrimination

score is also calculated to determine the subject's ability to differentiate between

previously presented words and foils. The HVLT has six alternate forms (Lezak, 1995).

For the purposes of this study, Hopkins Total score and Hopkins Delay score were

examined.

Trail Making Test A (Reitan and Wolfson, 1985) consists of 25 numbered circles.

The subject is asked to connect the numbered circles in the correct order (from 1 to 25) as

quickly as they can, without making any mistakes or lifting their pencil from the paper.

Score on this measure is recorded in seconds to completion. Errors during the

completion of the task are immediately pointed out to the subject and they are directed to

continue from the point of their last correct mark. No score is kept for the number of

errors committed; it is felt that this is reflected in an increased time of completion. Trail

Making Test A is thought to be a measure of visual scanning and sequencing ability as

well as motor speed. Trail Making Test B (Reitan and Wolfson, 1985), while similar in

appearance, encompasses circles with both letters and numbers. The subject is asked to








start with the number 1 and connect the circles in alternating increasing sequence (i.e., 1-

A-2-B-3-C, etc.). The score for this measure too is recorded in seconds to completion.

This measure is thought to reflect a person's ability to effectively shift mental set and

utilize cognitive flexibility in addition to visual scanning, sequencing, and motor speed.

These functions are often associated with frontal-executive functioning (Spreen and

Strauss, 1998).

The Digit Span subtest of the WAIS-III (Wechsler, 1997) consists of the

presentation of increasingly long sequences of numbers. In the Forward condition, the

subject is simply asked to listen to the numbers and repeat them in the same order in

which they were given. In the reverse condition, they are asked to recite them

backwards. Each part is discontinued when the subject fails to correctly recall two trials

of numbers of the same length. This measure is thought to reflect attention and

concentration. For the purposes of this study, the Digit Span total score (forward plus

back) was examined.

The Symbol Digit Modalities Test (SDMT) (Smith, 1982) is similar to the Digit

Symbol subtest of the WAIS-R in that it requires the subject to provide quick

substitutions, but in this measure, the subject is asked to write the correct number under

each symbol based on a key. Subjects are given 90 seconds to complete as many items as

they can. This test was chosen based on its ability to measure visual scanning and visuo-

motor speed. Immediately following its administration, the subject is asked to recall as

many of these associations as they can, which provides a measure of incidental learning.

The variable examined as part of this study was total number of symbols correctly

produced in the 90-second test administration.








The Grooved Pegboard test (Matthews and Klove, 1964) consists of a small board

containing a 5x5 matrix of slotted holes arranged in various angles. The subject is given

25 pegs, each the same shape, and asked to put them in successive slots starting with the

top row as quickly as they can without dropping. They are allowed only to hold one peg

at a time with one hand. Time to completely fill the holes is measured for each hand

independently (Lezak, 1995). This test was chosen for inclusion in the battery due to its

sensitivity to motor speed and coordination. Dominant hand and non-dominant hand

completion times were examined in the current study.

The Controlled Oral Word Association Test (COWA) (Benton and Hamsher,

1989) consists of three trials. In each, the subject is presented with a letter and asked to

produce as many words as they can until the examiner asks them to stop. They are told

that they must not use proper nouns or the same words with different endings. The

subject is given 60 seconds to produce as many words as they can. This measure was

chosen for its ability to reflect abilities of word fluency and retrieval. The current study

utilized total words produced over all three trials as the variable under examination.

In addition to these neuropsychological tests, the participants were administered

the Concussion Symptom Scale in order to examine baseline levels of symptoms

commonly associated with post concussion syndrome (e.g., headache, memory

difficulties, concentration difficulties, sensitivity to light or noise, etc.). This is a

subjective questionnaire utilizing a Likert-type scale.

In-season concussion evaluation

Those athletes who sustained a concussion during the football season, either in

practice or in a game, were evaluated with this same battery (utilizing alternate forms








where available) within 24 hours of the incident, and at 3, 5, and 7 days post-concussion.

In-season concussion was initially evaluated by trainers or team physicians and was then

referred for neuropsychological testing as part of this study. Grade of concussion was

evaluated based on the AAN guidelines. In addition, within 24 hours of concussion,

circumstances contributing to the concussion were obtained through interview with

trainers, and videotape of the incident (if available) was reviewed in some cases.

Football players within each respective team served as controls. They were

matched as closely as possible to the concussed player in terms of the following

variables: race, position, history of learning disability, and history of head injury.

Athletes were encouraged to continue with testing protocol over all test intervals.

However, some difficulty was encountered with attrition, especially in the control

population. Ultimately, data from participating individuals was excluded listwise. That

is, a player's data was utilized over each retest interval only if they generated data for

each corresponding test session. This resulted in a decreasing study population over each

successive test interval. At the 24-hour test session, total number of concussed players

who had data at both baseline and 24 hours was 21. The number of controls at the same

interval was 13. For the Day 3 test session, the total number of concussed players was 19

with 11 controls. At Day 5 the number of concussed players was 18 with 10 controls. At

Day 7, there were 19 concussed and 6 control participants.

Data Analysis

Data analysis included the application of 7 statistical models of significance

detection. These were conceptually divided into three broad categories based on their

statistical methodology.








Basic Models

The first subgroup consisted of two basic, clinically-oriented approaches (Model

1 and Model 2). For Model 1, each of the concussed and control participants' retest

scores for each measure was compared to their respective normative mean scores and

standard deviations, which were derived from the larger baseline group of football

players (n=409), ultimately resulting in a standardized z-score. Significant findings were

classified using a standard clinical approach (Reitan and Wolfson, 1985), where 1

standard deviation below the mean was interpreted as denoting mild impairment. In

Model 2, each concussed and control participant's retest scores for each variable were

compared to their own baseline performance again using a criterion of 1 standard

deviation (based on the normative population) decrease in score to denote a finding

suggesting significant cognitive decline following concussion.
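
The two basic models can be illustrated with a short sketch. The function names, the `higher_is_better` flag (needed because timed measures such as Trails B worsen as scores rise), the use of "at or beyond 1 standard deviation" as the boundary, and the individual retest scores are assumptions made for this example. The Hopkins Total and Trails B means and standard deviations are the total baseline values reported in Table 8.

```python
def model1_impaired(retest, norm_mean, norm_sd, higher_is_better=True):
    """Model 1: retest score at least 1 SD worse than the normative mean."""
    z = (retest - norm_mean) / norm_sd
    if not higher_is_better:
        z = -z                      # flip so that negative always means worse
    return z <= -1.0


def model2_declined(baseline, retest, norm_sd, higher_is_better=True):
    """Model 2: retest at least 1 normative SD worse than own baseline."""
    change = (retest - baseline) / norm_sd
    if not higher_is_better:
        change = -change
    return change <= -1.0


if __name__ == "__main__":
    # Hopkins Total: normative mean 24.96, SD 4.02 (higher is better).
    print(model1_impaired(20, 24.96, 4.02))
    # Trails B: normative mean 52.84 s, SD 17.39 s (lower is better).
    print(model1_impaired(75, 52.84, 17.39, higher_is_better=False))
    # Hypothetical player dropping from 26 to 21 on Hopkins Total.
    print(model2_declined(26, 21, 4.02))
```

Model 1 compares the player with the group, whereas Model 2 compares the player with himself, so a high-scoring athlete can decline substantially under Model 2 without ever appearing impaired under Model 1.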

Reliable Change Indices

The second conceptual group of statistical models included three variants of the

Reliable Change Index (Model 3, Model 4, and Model 5). Model 3 involved the

application of a basic RCI analysis as described by Jacobson and Truax (1991) and used

in studies by Temkin et al. (1999) and Barr and McCrea (2001) to detect change in serial

neuropsychological testing. Model 4 utilized the application of the same RCI procedures

with correction for practice. For both Model 3 and Model 4, findings for each variable

were deemed significant if the retest score fell outside the computed change index in the

negative direction. For Model 5, each concussed participant's change on the individual

variables from baseline to retest interval was directly compared to the average change

score for the control population over the same time period. A significant finding was








denoted by a change score which fell below a 95% confidence interval based on the

standard deviation of change scores for the control group on each variable and test

interval.
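
One plausible reading of Model 5 is sketched below: the lower bound is taken as the mean control-group change minus 1.96 control-group standard deviations of change, and a concussed player's change score is flagged when it falls below that bound. The exact form of the interval is an assumption here, as are the function names and the control scores, and the sketch applies to measures on which higher scores indicate better performance.

```python
from statistics import mean, stdev


def control_change_lower_bound(control_baseline, control_retest, z_cut=1.96):
    """Lower bound for change scores, built from the control group's
    mean change and the sample SD of its change scores."""
    changes = [r - b for b, r in zip(control_baseline, control_retest)]
    return mean(changes) - z_cut * stdev(changes)


def below_control_change(baseline, retest, lower_bound):
    """True when an individual's change score falls below the bound."""
    return (retest - baseline) < lower_bound


if __name__ == "__main__":
    # Hypothetical control scores at baseline and at one retest interval.
    bound = control_change_lower_bound([20, 22, 24, 26, 28],
                                       [22, 23, 27, 27, 31])
    print(round(bound, 2))
    print(below_control_change(baseline=25, retest=24, lower_bound=bound))
```

Because the bound is centered on the control group's average change, a concussed player who merely fails to show the expected practice gain can be flagged even without an absolute drop in score.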

Regression Models

The third conceptual subgroup of statistical models included two regression

approaches (Model 6 and Model 7). Consistent with the research of Temkin et al. (1999)

and Barr and McCrea (2001), in Model 6, simple linear regression equations were

calculated, based on control population data, in order to predict retest scores for each

variable over the four test intervals. The observed scores were then compared to the

predicted scores and a standard z-score was calculated using the standard error of the

residuals for the model based on the control participants. Observed retest scores falling

beyond a significance value of 1.96 in the negative direction were classified as

significantly below expectation. In Model 7, a multiple regression equation approach

using history of diagnosed learning disability and history of previous concussion as

additional predictive indicators was analyzed.

Sensitivity and Specificity

Sensitivity values were calculated for each variable over the 4 test sessions for

each model. For individual variables, sensitivity was determined by that variable's

probability of accurately classifying a concussed player as being significantly below

expectation based on the applied statistical model.

Each model was then examined over all test sessions to determine how many

variables from the battery of neuropsychological tests as a whole were found to have

been significantly below expectation for both concussed and control participants. A








classification cut-point (the optimal number of significant variables below expectation)

was determined based on the highest degree of sensitivity/specificity combination at each

time interval, where sensitivity was determined by the number of concussed participants

who were accurately classified as concussed at each cut-point, and specificity was

determined by the number of control participants who were accurately classified as "not

concussed" at those same cut-points.
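
The cut-point search can be sketched as follows. Each participant is represented by the number of test variables flagged as significantly below expectation; a participant is classified as concussed when that count reaches the cut-point. The text does not state how sensitivity and specificity were combined, so maximizing their sum is an assumption of this sketch, as are the function names and the example counts.

```python
def sens_spec(concussed_counts, control_counts, cut):
    """Sensitivity and specificity when `cut` or more flagged variables
    classifies a participant as concussed."""
    sens = sum(c >= cut for c in concussed_counts) / len(concussed_counts)
    spec = sum(c < cut for c in control_counts) / len(control_counts)
    return sens, spec


def best_cut_point(concussed_counts, control_counts, max_cut):
    """Return (cut, sensitivity, specificity) maximizing sens + spec.

    Ties are resolved in favor of the smallest cut-point.
    """
    best = None
    for cut in range(1, max_cut + 1):
        sens, spec = sens_spec(concussed_counts, control_counts, cut)
        if best is None or sens + spec > best[1] + best[2]:
            best = (cut, sens, spec)
    return best


if __name__ == "__main__":
    # Hypothetical counts of flagged variables per participant.
    concussed = [0, 1, 2, 2, 3, 4]
    controls = [0, 0, 1, 1, 2]
    print(best_cut_point(concussed, controls, max_cut=9))
```

Raising the cut-point trades sensitivity for specificity, so the selected value depends on how the two are weighted; a different weighting would be a one-line change to the comparison.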

Classification accuracy was then assessed using Receiver Operating Characteristic

curves, which provided a statistical indication of each model's ability to accurately

distinguish between the concussed and nonconcussed participants relative to chance

(Hanley, 1989).
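
The area under an ROC curve can be computed directly from the two groups' scores as the probability that a randomly chosen concussed participant has a higher score than a randomly chosen control, with ties counted as one half. The sketch below is a minimal illustration of that calculation; the function name and the example counts of flagged variables are hypothetical, and the software actually used for the ROC analysis in this study is not specified in the text.

```python
def roc_auc(concussed_scores, control_scores):
    """Area under the ROC curve via pairwise comparison.

    Counts the share of (concussed, control) pairs in which the concussed
    participant scores higher; a tie contributes half a pair.
    """
    wins = 0.0
    for c in concussed_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(concussed_scores) * len(control_scores))


if __name__ == "__main__":
    # Hypothetical numbers of variables flagged per participant.
    concussed = [0, 1, 2, 2, 3, 4]
    controls = [0, 0, 1, 1, 2]
    print(round(roc_auc(concussed, controls), 3))
```

An area of 0.5 corresponds to chance-level discrimination, and an area of 1.0 to perfect separation of the two groups.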

















CHAPTER THREE
RESULTS

Demographic Data for Baseline Population

During the current study, data was collected using the above-described pre-season

neuropsychological test battery, and demographic variables were collected from a total of

443 football players at four universities. The total number of pre-season participants

from each location is presented in Table 2, and is presented graphically in Figure 1.

Table 2. Numbers and percentage of total pre-season participants' locations

                             N     Percent   Cumulative Percent

University of Florida        155   35.2       35.2
Michigan State University    119   26.9       62.1
University of Utah            85   19.2       81.3
University of Pittsburgh      83   18.7      100.0

Figure 1. Percent of baseline participants from each contributing school.













Table 3 and Figure 2 indicate the racial composition of the total number of baseline

participants. The largest group was African American (47.9%), closely followed by

Caucasians (47.2%). There were far fewer Hispanic, Asian, Haitian, and Samoan

participants.


Table 3. Racial composition of baseline population

                    N     Percent   Cumulative Percent

African American    212   47.9       47.9
Caucasian           209   47.2       95.0
Hispanic              6    1.4       96.4
Asian                 2    0.5       96.8
Haitian               1    0.2       97.1
Samoan               13    2.9      100.0




Figure 2. Racial composition of baseline population across sites.


Table 4 indicates the frequencies of ages of the baseline population. The mean age of

baseline population participants was 19.85 years with a standard deviation of 1.75 years.









Table 4. Distribution of ages across the baseline population
Age (years)    N     Percent   Cumulative Percent

17               6    1.4        1.4
18             113   25.5       26.9
19             101   22.8       49.7
20              77   17.4       67.0
21              72   16.3       83.3
22              39    8.8       92.1
23              22    5.0       97.1
24               7    1.6       98.6
25               2    0.5       99.1
26               2    0.5       99.5
27               2    0.5      100.0


Figure 3. Distribution of ages for baseline population.

Table 5 indicates the frequencies of year in school for all participants in the baseline

population.

Table 5. Distribution of class status for the baseline population
                     N     Percent   Cumulative Percent

Freshman             171   38.6       38.6
Sophomore             92   20.8       59.4
Junior                95   21.4       80.8
Senior                67   15.1       95.9
Fifth Year Senior     18    4.1      100.0








Demographic Characteristics for Concussion and Control Participants

As indicated above, in-season concussion data were collected by re-administering the

baseline neuropsychological test battery to those individuals who were

reported by team trainers to have received a concussion either during practice or during a

game. Initial re-testing took place within 24 hours of the identified concussive event, with further re-testing at

three days, five days, and seven days post-concussion. During the time of the

investigation, trainer-identified concussive events resulted in a total of 26 concussed

players, including three players who incurred two concussions during the time of the

study. Table 6 indicates the demographic characteristics of the concussed players.

Table 6. Demographic characteristics of the concussed players
                        N     Percent

Age
  18                     7    26.9
  19                    11    42.3
  20                     6    23.1
  21                     2     7.7

Race
  African American      14    53.8
  Caucasian             12    46.2

History of LD
  Yes                    6    23.1
  No                    20    76.9

When a concussed player was identified by a trainer, a matched control was

assigned to that player for purposes of neuropsychological test result comparison over time. As

outlined above, control subjects were matched as closely as possible to concussed

individuals in terms of race, history of diagnosed learning disability, history of

concussion, and position. However, control subjects' continued participation in the time-

sensitive research design was less than optimal. Ultimately, when control participants








failed to continue in the study, the time requirements did not allow for replacement with

another control participant. Consequently, the total number of control participants was

less than the total number of concussed subjects, resulting in a total of 14 individuals.

Table 7 indicates the demographic characteristics of the control population.

Table 7. Demographic characteristics of the control population

                        N     Percent

Age
  18                     6    42.9
  19                     4    28.6
  20                     1     7.1
  21                     3    21.4

Race
  African American       7    50.0
  Caucasian              7    50.0

History of LD
  Yes                    2    14.3
  No                    12    85.7

Examination of the racial composition of the baseline population and the concussed

subjects indicates that both groups were nearly half African American and half

Caucasian.

A one-way ANOVA was performed to determine whether there were any

statistically significant differences between the baseline population (excluding those

players who eventually suffered concussion and their matched controls), the concussed

subjects, and their controls in terms of age, race, and year in school. Using status as the

between-subjects factor (baseline, concussed, control), no significant differences were

found between the groups in terms of racial composition [F(2,440) = 1.12, p=0.33], nor

were there any significant differences between the groups with regard to year in school








[F(2,440) = 2.97, p=0.052] although there was a trend for the concussed players to be of

lower school year. It was not surprising, therefore, that a significant effect for age was

observed [F(2,440) = 3.87, p=0.022]. However, paired t-tests using a Bonferroni

corrected p-value of 0.017 (.05/3) failed to demonstrate a significant difference between

the groups in terms of age. Again, a trend was observed for the concussed players to be

slightly younger than the baseline population as a whole, though this was not statistically

significant. For the purposes of this study, which examined baseline to retest session

performance (within 24 hours, day 3, day 5, and day 7), participants were excluded

listwise. That is, participants must have had a score at both test sessions for inclusion

into examination of performance for a particular test interval. This resulted in fewer

included participants for each successive time interval. Ultimately 21 concussed and 13

controls were included for the 24 hour analysis, 19 concussed and 11 controls for the Day

3 analysis, 18 concussed and 10 controls for Day 5, and 19 concussed and 6 controls for

Day 7.

Baseline Performance

As noted above, ten variables were examined from the battery of

neuropsychological tests administered during the current study. Those variables were

Hopkins total score, Trails A total time in seconds, Trails B total time in seconds,

Grooved Pegboard Test dominant and nondominant hand time in seconds, digit span

forward, digit span backward, Digit Symbol total correct, COWA total score, and

Hopkins Delay score. Table 8 shows the means and standard deviations for the total

Baseline population as well as the concussed participants and control subjects at baseline.








Table 8. Baseline Performance of total population, concussed, and controls
                     Total Baseline    Controls         Concussed
                     n=409             n=13             n=21

Measure              M (SD)            M (SD)           M (SD)

Hopkins Total        24.96 (4.02)      25.77 (3.44)     24.19 (4.13)
Hopkins Delay         8.35 (2.09)       8.85 (1.77)      7.71 (1.98)
Trails A             20.99 (5.96)      21.97 (6.17)     22.18 (5.53)
Trails B             52.84 (17.39)     45.95 (13.13)    57.83 (24.83)
Digits Forward        9.05 (2.18)       8.31 (2.14)      9.00 (1.89)
Digits Back           7.20 (2.48)       7.08 (2.53)      6.67 (2.29)
Digit Symbol         58.75 (9.07)      64.92 (8.03)     55.81 (8.98)
Pegs Dominant        66.56 (10.50)     65.48 (8.41)     70.73 (13.79)
Pegs Nondominant     73.34 (11.81)     72.72 (11.78)    79.62 (16.96)
COWA Total           39.08 (9.23)      35.77 (6.70)     41.32 (10.21)

A series of one-way ANOVAs was run in order to detect any significant

differences between the groups with regard to their baseline performance on any of these

ten measures. Table 9 provides a summary of those findings.

There was a significant finding with regard to baseline performance between the

three groups on the Digit Symbol test [F (2,440) = 4.14, p=0.02]. Subsequently, pairwise

t-tests were performed using a Bonferroni-corrected alpha value of 0.017

(.05/3). Results indicated that the control subjects scored, on average, significantly

better than did the concussed participants (t = 2.99, p=0.013) at baseline. Further, there

was also a trend indicating that the control participants' scores were better

than those of the total baseline group, although this did not reach significance after applying

the correction for multiple t-tests. However, because the difference between the concussed

group and control group scores on the Digit Symbol test at baseline was found to be

significant, this variable was omitted from further analysis.








Table 9. One-way ANOVAs using status as between-subjects factor
Measure df F Significance

Hopkins Total (2,440) 0.65 0.52
Hopkins Delay (2,440) 1.35 0.26
Trails A (2,440) 0.56 0.57
Trails B (2,440) 1.82 0.16
Digits Forward (2,440) 0.73 0.48
Digits Backward (2,440) 0.48 0.62
Digit Symbol (2,440) 4.14 0.02
Grooved Pegboard Dom (2,440) 1.63 0.19
Grooved Pegboard Nondom (2,440) 2.73 0.07
COWA Total (2,440) 1.40 0.25

Overview of Repeated Testing

Repeated testing utilizing the above-described neuropsychological battery

generally yielded results consistent with historical reviews of test-retest performance for

those having suffered minor head injury. That is, the concussed participants in the

current study, on average, exhibited a decline (or a stasis) in performance across test

variables within 24 hours of concussion when compared to their baseline performance.

Examination of subsequent performances at 3, 5, and 7 days after the concussive event

generally suggested a gradual improvement over these days. This is particularly evident

when the graphical representation of these performances is compared with the

control population's average performance over this same time interval. Figures 4 through

12 represent the average performance of individuals in both the concussed and control

groups over the testing period.

Detecting Significant Decline Using Basic Statistical Methods

Generally speaking, neuropsychological testing involves the examination of an

individual after a known or suspected neurological insult has occurred.











Figure 4. Hopkins Verbal Learning Test-Retest Performance for Both Groups
[Plot: Hopkins Total Score by testing session (Baseline, 24 Hours, Day 3, Day 5, Day 7); series: in-season concussion, control.]

Figure 5. Trail Making Test A Test-Retest Performance for Both Groups
[Plot: Trail Making Test A by testing session (Baseline, 24 Hours, Day 3, Day 5, Day 7); series: in-season concussion, control.]











Figure 6. Trail Making Test B Test-Retest Performance for Both Groups
[Plot: Trail Making Test B by test session (Baseline, 24 Hours, Day 3, Day 5, Day 7); series: in-season concussion, control.]

Figure 7. Digit Span Forward Test-Retest Performance for Both Groups
[Plot: Digit Span Forward by test session (Baseline, 24 Hours, Day 3, Day 5, Day 7); series: in-season concussion, control.]










Figure 8. Digit Span Backward Test-Retest Performance for Both Groups
[Plot: Digit Span Backward by test session (Baseline, 24 Hours, Day 3, Day 5, Day 7); series: in-season concussion, control.]

Figure 9. Grooved Pegboard (Dominant Hand) Test-Retest Performance for Both Groups
[Plot: Grooved Pegboard Dominant Hand by test session (Baseline, 24 Hours, Day 3, Day 5, Day 7); series: in-season concussion, control.]











Figure 10. Grooved Pegboard (Nondom. Hand) Test-Retest Performance for Both Groups
[Plot: Grooved Pegboard Nondominant Hand by test session (Baseline, 24 Hours, Day 3, Day 5, Day 7); series: in-season concussion, control.]

Figure 11. COWA Test-Retest Performance for Both Groups
[Plot: Controlled Oral Word Association by test session (Baseline, 24 Hours, Day 3, Day 5, Day 7); series: in-season concussion, control.]










Figure 12. Hopkins Delay Performance for Both Groups
[Plot: Hopkins Delay Score by test session (Baseline, 24 Hours, Day 3, Day 5, Day 7); series: in-season concussion, control.]

In most instances, the examiner is without the benefit of prior knowledge regarding the

individual's baseline cognitive abilities. Therefore, certain assumptions are made with

regard to that person's premorbid cognitive abilities, and a comparison is made between

their status-post neuropsychological performance and some normative group that is

deemed appropriate. In many circumstances, an arbitrary cut-point is determined for a

neuropsychological measure, and a performance falling below (or differing significantly from)

that cut-point is considered to indicate abnormality. In this study, the group of more than 400 football

players who participated in the baseline study provided an appropriate comparison group.

Model 1

The most basic model (referred to as Model 1 for the purposes of this study) for

detecting if a concussed player's performance suggested ongoing concussive sequelae

was a comparison at each post-concussive test interval to the normative baseline

population mean score for that measure. Using a liberal criterion, a difference of greater

than 1.0 SD below that expected (normative mean) score would suggest a performance








below expectation, perhaps indicating ongoing concussive effects. Using that approach,

Table 10 indicates the cut-points for the 9 neuropsychological measures that were

studied. These were derived by determining a score for each particular measure which

was 1 SD worse than the mean.

Using these criteria, the number of concussed and control participants who were

classified as significantly below the cut-point was determined for each test session

(within 24 hours of concussion, 3 days, 5 days, and 7 days post-concussion). Table 11

indicates the sensitivity of each measure for each test session.

Table 10. Cut Points Using 1 SD Difference from Normative Population Mean (Model 1)

Measure Normative Mean Score Standard Deviation Cut-Point

Hopkins Total 24.96 4.02 20.94
Trails A (sec) 20.99 5.96 26.95
Trails B 52.84 17.39 70.23
Digit Span Forward 9.05 2.18 6.87
Digit Span Backward 7.20 2.47 4.73
Grooved Pegboard (D) 66.56 10.50 77.06
Grooved Pegboard (N) 73.34 11.81 85.15
COWAT 39.08 9.23 29.85
Hopkins Delay 8.35 2.09 6.26
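
The cut-point rule described for Model 1 can be sketched in a few lines. The means and SDs below are copied from Table 10 for three of the measures; the dictionary layout, the function names, and the `higher_is_worse` flag are illustrative assumptions rather than anything taken from the original analysis.

```python
# Sketch of the Model 1 cut-point rule: one normative SD "worse" than the mean.
# Names and data layout are hypothetical; values come from Table 10.
NORMS = {
    # measure: (normative mean, normative SD, higher_is_worse)
    "Hopkins Total": (24.96, 4.02, False),  # words recalled, lower is worse
    "Trails A": (20.99, 5.96, True),        # seconds, higher is worse
    "COWAT": (39.08, 9.23, False),          # words produced, lower is worse
}


def model1_cut_point(mean, sd, higher_is_worse, n_sd=1.0):
    """Return the score lying n_sd standard deviations worse than the mean."""
    return mean + n_sd * sd if higher_is_worse else mean - n_sd * sd


def is_below_expectation(score, cut_point, higher_is_worse):
    """True when a retest score is worse than the cut-point."""
    return score > cut_point if higher_is_worse else score < cut_point


cut_points = {
    name: round(model1_cut_point(mean, sd, worse), 2)
    for name, (mean, sd, worse) in NORMS.items()
}
print(cut_points)
```

Timed measures add the SD and counted measures subtract it, which is why the direction flag is carried alongside each norm.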

As Table 11 indicates, within 24 hours of concussion, the most sensitive measure

appears to be total words recalled after the delay period for the Hopkins Verbal Learning

Test. However, none of the measures alone was particularly sensitive. Examining the

battery as a whole using Model 1, the numbers of concussed and control participants who

scored below the cut-points for each measure were totaled. Tables 12-15 show the

cumulative number in each group who fell below the cut-point on a given number of

the 9 variables examined.









Table 11. Sensitivity for Neuropsychological Measures for Model 1
Measure 24 Hours Day 3 Day 5 Day 7

Hopkins Total 0.29 0.11 0.17 0.05
Trails A 0.29 0.05 0.00 0.11
Trails B 0.19 0.11 0.06 0.00
Digit Span Forward 0.19 0.16 0.00 0.05
Digit Span Back 0.14 0.05 0.05 0.00
Grooved Pegboard D 0.14 0.16 0.06 0.05
Grooved Pegboard N 0.38 0.16 0.11 0.21
COWAT 0.19 0.05 0.11 0.00
Hopkins Delay 0.43 0.42 0.11 0.16

Table 12 demonstrates that, within 24 hours of concussion, using the criterion of

falling below the cut-points (derived using Model 1) for 2 or more variables yielded an

(Se + Sp) value of 1.37, essentially matching the maximum of 1.38 obtained with 3 or more

variables. That is, 11 of the 21 concussed individuals fell below

the cut-point for 2 or more variables while only 2 of the 13 control participants did so at

that test session. However, as has been noted in the literature, in the case of sports

concussion it is likely better to err on the side of caution, and while using a criterion of

only one variable below the cut-point resulted in a greater number of false positive

errors (Sp=0.38), it also resulted in a far better true positive rate (Se = 0.91).

Table 12. Number of Concussed and Control Participants Falling Below Cut-Point for
Multiple Variables Within 24 Hours with Model 1

No. of Sig.Measures Concussed Control Se Sp (Se + Sp)
0 21 13 1.0 0.0 (1.00)
1 19 8 0.91 0.38 (1.29)
2 11 2 0.52 0.85 (1.37)
3 8 0 0.38 1.00 (1.38)
4 3 0 0.14 1.00 (1.14)
5 2 0 0.10 1.00 (1.10)
6 2 0 0.10 1.00 (1.10)
7 1 0 0.05 1.00 (1.05)
8 1 0 0.05 1.00 (1.05)
9 1 0 0.05 1.00 (1.05)
Se=Sensitivity, Sp=Specificity
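
The Se and Sp columns can be recomputed directly from the counts. The sketch below assumes that each row gives the number of participants flagged on at least that many measures, so that the row for 0 measures is the group size; the function name and list layout are hypothetical.

```python
# Recompute sensitivity and specificity from cumulative counts (Table 12).
# Entry k is the number of participants flagged on at least k measures.
CONCUSSED_24H = [21, 19, 11, 8, 3, 2, 2, 1, 1, 1]
CONTROL_24H = [13, 8, 2, 0, 0, 0, 0, 0, 0, 0]


def se_sp_rows(concussed_counts, control_counts):
    """Return (k, Se, Sp, Se + Sp) for each 'k or more measures' criterion."""
    n_concussed = concussed_counts[0]
    n_control = control_counts[0]
    rows = []
    for k, (flagged_conc, flagged_ctrl) in enumerate(
        zip(concussed_counts, control_counts)
    ):
        se = flagged_conc / n_concussed      # true positive rate
        sp = 1.0 - flagged_ctrl / n_control  # true negative rate
        rows.append((k, se, sp, se + sp))
    return rows


rows = se_sp_rows(CONCUSSED_24H, CONTROL_24H)
for k, se, sp, total in rows:
    print(f"{k}  Se={se:.2f}  Sp={sp:.2f}  Se+Sp={total:.2f}")
```

Subtracting 1 from the Se + Sp column gives Youden's J, a common single-number summary of a cut-off, should a reader prefer that scale.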








Again, Table 13 demonstrates that the highest combined sensitivity and

specificity rates were achieved at Day 3 using a criterion of 3 or more variables falling

below their respective cut-points (Se + Sp = 1.12). However, sensitivity is severely

reduced by the more conservative inclusion criterion. This more conservative

criterion, however, also resulted in a significantly higher specificity rate, demonstrating

the need for a pre-determined tolerance for higher false-positive rates when attempting to

detect ongoing effects of concussion.

Table 13. Number of Concussed and Control Participants Falling Below Cut-Point for
Multiple Variables at Day 3 with Model 1
No. of Measures Concussed Control Se Sp (Se + Sp)

0 19 11 1.00 0.0 (1.00)
1 10 5 0.53 0.55 (1.08)
2 6 3 0.32 0.73 (1.05)
3 4 1 0.21 0.91 (1.12)
4 2 0 0.11 1.00 (1.11)
5 1 0 0.05 1.00 (1.05)
6 0 0 0.00 1.00 (1.00)
7 0 0 0.00 1.00 (1.00)
8 0 0 0.00 1.00 (1.00)
9 0 0 0.00 1.00 (1.00)
Se=Sensitivity, Sp=Specificity

At day 7 (Table 15), the highest combined sensitivity and specificity score was achieved

using a criterion of one variable below cut-point (1.25).

Certainly, the sensitivity and specificity values using a comparison of

performance at these time intervals to a normative population suggest that the model

itself has some value within 24 hours of concussion. Although the model is not an

optimal one, it was able to detect a degree of performance aberrance after concussion

compared to the appropriate standard scores, though specificity was conversely poor.








Receiver operating characteristic curves were calculated for this model using Analyse-It

software for Excel at each of the time intervals and are shown in Figures 13-16.

Table 14. Number of Concussed and Control Participants Falling Below Cut-Point for
Multiple Variables at Day 5 with Model 1
No. of Measures Concussed Control Se Sp (Se + Sp)

0 18 10 1.00 0.0 (1.00)
1 8 5 0.44 0.50 (0.94)
2 3 1 0.17 0.90 (1.07)
3 0 1 0.00 0.90 (0.90)
4 0 1 0.00 0.90 (0.90)
5 0 0 0.00 1.00 (1.00)
6 0 0 0.00 1.00 (1.00)
7 0 0 0.00 1.00 (1.00)
8 0 0 0.00 1.00 (1.00)
9 0 0 0.00 1.00 (1.00)
Se=Sensitivity, Sp=Specificity

Table 15. Number of Concussed and Control Participants Falling Below Cut-Point for
Multiple Variables at Day 7 with Model 1
No. of Measures Concussed Control Se Sp (Se + Sp)

0 19 6 1.00 0.0 (1.00)
1 8 1 0.42 0.83 (1.25)
2 4 0 0.21 1.00 (1.21)
3 0 0 0.00 1.00 (1.00)
4 0 0 0.00 1.00 (1.00)
5 0 0 0.00 1.00 (1.00)
6 0 0 0.00 1.00 (1.00)
7 0 0 0.00 1.00 (1.00)
8 0 0 0.00 1.00 (1.00)
9 0 0 0.00 1.00 (1.00)
Se=Sensitivity, Sp=Specificity

Figure 13 shows the receiver operating characteristic curve for the classification

rate for Model 1 at 24 hours. The diagonal line represents chance level, and by

definition, the area underneath it is equal to 0.50. The area underneath the curve for

Model 1 at 24 hours was 0.773 (95% Confidence Interval = .62-.93; SE = 0.08) and was

found to be statistically different than chance (p=0.0003). Using generally accepted









accuracy classification, this model at 24 hours after concussion would be characterized

only as fair.


Figure 13. ROC Curve for Model 1 at 24 Hours
[ROC plot; x-axis: 1 - Specificity (false positives); series: No discrimination, Model 1.]

At Day 3, detection of significant scores on the test variable compared to the baseline

normative population was not better than chance using Model 1 (Figure 14). The area

under the curve (AUC) was calculated as 0.56 (95% Confidence Interval = .35-.77; SE =

0.11) and was not statistically different than the reference diagonal (p=0.30). A similar

pattern was observed at Day 5 as can be seen in Figure 15. Using Model 1, the detection

rate was not statistically different from chance. The AUC was found to be 0.517 (95%

Confidence Interval = 0.29-0.74; SE = 0.12).

The calculated AUC for Model 1 at day 7 was 0.65 (95% Confidence Interval =

0.41-0.88; SE = 0.12), and again, was not statistically better than chance (p=0.11) at

differentiating between concussed participants and controls (Figure 16).











Figure 14. Receiver Operating Characteristic Curve for Model 1 at Day 3
[ROC plot; x-axis: 1 - Specificity (false positives); series: No discrimination, Model 1.]

Figure 15. Receiver Operating Characteristic Curve for Model 1 at Day 5
[ROC plot; x-axis: 1 - Specificity (false positives); series: No discrimination, Model 1.]










Figure 16. Receiver Operating Characteristic Curve for Model 1 at Day 7
[ROC plot; x-axis: 1 - Specificity (false positives); series: No discrimination, Model 1.]

Model 2

In the circumstance that one has baseline test measures for an individual,

comparison of their performance following neurological insult to that baseline

performance may be more appropriate in determining change in cognitive status during

subsequent testing sessions. In the second basic model examined in this study (here

referred to as Model 2), performance of concussed individuals at each testing session

following concussion was compared to their baseline performance. This was achieved by

determining significance based again on a liberal criterion of 1 SD (derived from the

normative population as a whole, n=409) below their baseline performance. In this

manner, each subject was afforded an individualized comparator as illustrated for one

concussed participant in Table 16.








Please note that for those variables for which time is the recorded score, a cut-point

was determined by the addition of 1 SD to the baseline performance, and cut-points for ordinal

variables were rounded to the nearest whole number. Therefore, as can be seen in Table

16, for this participant's time on Trails A to be classified as significantly different at 24

hours than baseline, his score would have to increase by nearly 6 seconds, from 33" to

nearly 39" total time.

Table 16. Example of Cut-Point Determination for Study Participants
Measure Baseline Performance 1 SD(from normative pop.) Cut-Point

Hopkins Total 20 4.02 16
Trails A 33" 5.96" 38.96"
Trails B 68" 17.39" 85.39"
Digits For. 12 2.18 10
Digits Back. 6 2.48 4
Pegs Dominant 73" 10.50" 83.50"
Pegs Nondominant 100" 11.81" 111.81"
COWAT 52 9.23 43
Hopkins Delay 8 2.09 6
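
A minimal sketch of the individualized comparator used in Model 2 follows. The normative SDs are those listed in Table 16; the set of timed measures, the function names, and the convention that a score must be strictly worse than the cut-point to be flagged are assumptions made for illustration.

```python
# Sketch of the Model 2 rule: each participant's own baseline score,
# shifted by one normative SD in the "worse" direction.
NORM_SD = {
    "Hopkins Total": 4.02,
    "Trails A": 5.96,
    "Trails B": 17.39,
    "Digits For.": 2.18,
    "Digits Back.": 2.48,
    "Pegs Dominant": 10.50,
    "Pegs Nondominant": 11.81,
    "COWAT": 9.23,
    "Hopkins Delay": 2.09,
}
# Measures scored as completion time in seconds (higher is worse).
TIMED = {"Trails A", "Trails B", "Pegs Dominant", "Pegs Nondominant"}


def model2_cut_point(measure, baseline):
    """Individual cut-point: baseline + 1 SD if timed, else rounded baseline - 1 SD."""
    sd = NORM_SD[measure]
    if measure in TIMED:
        return baseline + sd
    return round(baseline - sd)


def declined(measure, baseline, retest):
    """True when the retest score is worse than the individual cut-point."""
    cut = model2_cut_point(measure, baseline)
    return retest > cut if measure in TIMED else retest < cut


print(model2_cut_point("Hopkins Total", 20), model2_cut_point("Trails A", 33))
```

Because the SD comes from the normative sample while the baseline is the participant's own, two players with identical retest scores can be classified differently.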

The same procedure was used for each concussed and control participant, and their retest

results at each testing session were compared to their appropriate cut-point score in order

to determine if there was a significant change from their baseline performance. The

sensitivities for individual variables were calculated for each testing session and are

presented in Table 17.

As was noted in Model 1, the sensitivities for the individual variables were

relatively poor at detecting significant change. The most sensitive individual variables at

24 hours post-concussion were Hopkins Total score and Grooved Pegboard nondominant

hand completion time. As would be expected, using Model 2, the individual variables

were most effective in detecting significant change relative to baseline performance

within 24 hours of concussion, with a trend toward decreasing sensitivity at each








successive testing session. The one exception was the Hopkins Delay score, which

remained relatively stable over time.

Table 17. Sensitivity for Neuropsychological Measures for Model 2
Measure 24 Hours Day 3 Day 5 Day 7

Hopkins Total 0.43 0.11 0.17 0.11
Trails A 0.29 0.16 0.11 0.11
Trails B 0.29 0.05 0.06 0.05
Digit Span Forward 0.14 0.11 0.06 0.05
Digit Span Back 0.19 0.11 0.06 0.05
Grooved Pegboard D 0.19 0.05 0.00 0.05
Grooved Pegboard N 0.43 0.16 0.06 0.16
COWAT 0.14 0.11 0.00 0.00
Hopkins Delay 0.33 0.26 0.17 0.26

The battery as a whole was again examined to determine the number of concussed

and control participants who scored below their individual cut-points across variables

using Model 2. Sensitivity and specificity characteristics of the Model for each test

session are presented in Tables 18-21.

As Table 18 demonstrates, the greatest degree of classification accuracy occurred

at 24 hours for Model 2 when a criterion of a significant reduction in score on 1 or more

variables was used. Specificity greatly increased when the criterion was two or more

significant variables, though this also resulted in the largest single drop in sensitivity.

Using the same criterion on Days 3 through 7, a predictable drop in sensitivity was

observed over time, while specificity remained relatively constant. Notably, this model

saw its lowest sensitivity using the one-variable criterion at the Day 5 test session (Table

20), even relative to the Day 7 test results. This pattern was consistent in most of the

models examined as part of this study.








Table 18. Number of Concussed and Control Participants Falling Below Cut-Point for
Multiple Variables Within 24 Hours with Model 2

No. of Sig.Measures Concussed Control Se Sp (Se + Sp)

0 21 13 1.0 0.0 (1.00)
1 17 5 0.81 0.62 (1.43)
2 10 1 0.48 0.92 (1.40)
3 8 0 0.38 1.00 (1.38)
4 4 0 0.19 1.00 (1.19)
5 4 0 0.19 1.00 (1.19)
6 3 0 0.14 1.00 (1.14)
7 3 0 0.14 1.00 (1.14)
8 1 0 0.05 1.00 (1.05)
9 0 0 0.00 1.00 (1.00)
Se=Sensitivity, Sp=Specificity

Table 19. Number of Concussed and Control Participants Falling Below Cut-Point for
Multiple Variables at Day 3 with Model 2
No. of Sig.Measures Concussed Control Se Sp (Se + Sp)

0 19 11 1.0 0.0 (1.00)
1 12 5 0.63 0.55 (1.18)
2 4 2 0.21 0.82 (1.03)
3 2 0 0.11 1.00 (1.11)
4 1 0 0.05 1.00 (1.05)
5 1 0 0.05 1.00 (1.05)
6 1 0 0.05 1.00 (1.05)
7 1 0 0.05 1.00 (1.05)
8 0 0 0.00 1.00 (1.00)
9 0 0 0.00 1.00 (1.00)
Se=Sensitivity, Sp=Specificity

As with the use of this model in the previous testing sessions, the greatest degree

of sensitivity was achieved when a criterion of one significant variable was used.

However, this also resulted in falsely classifying greater than half of the control

participants as concussed. Receiver operating characteristic curves for Model 2 are

presented in Figures 17 through 20.









Table 20. Number of Concussed and Control Participants Falling Below Cut-Point for
Multiple Variables at Day 5 with Model 2
No. of Sig.Measures Concussed Control Se Sp (Se + Sp)

0 18 10 1.0 0.0 (1.00)
1 6 3 0.33 0.70 (1.03)
2 4 1 0.22 0.90 (1.12)
3 2 1 0.11 0.90 (1.01)
4 0 1 0.00 0.90 (0.90)
5 0 0 0.00 1.00 (1.00)
6 0 0 0.00 1.00 (1.00)
7 0 0 0.00 1.00 (1.00)
8 0 0 0.00 1.00 (1.00)
9 0 0 0.00 1.00 (1.00)
Se=Sensitivity, Sp=Specificity

Table 21. Number of Concussed and Control Participants Falling Below Cut-Point for
Multiple Variables at Day 7 with Model 2
No. of Sig.Measures Concussed Control Se Sp (Se + Sp)

0 19 6 1.0 0.0 (1.00)
1 10 4 0.53 0.33 (0.86)
2 5 0 0.26 1.00 (1.26)
3 1 0 0.05 1.00 (1.05)
4 0 0 0.00 1.00 (1.00)
5 0 0 0.00 1.00 (1.00)
6 0 0 0.00 1.00 (1.00)
7 0 0 0.00 1.00 (1.00)
8 0 0 0.00 1.00 (1.00)
9 0 0 0.00 1.00 (1.00)
Se=Sensitivity, Sp=Specificity

Data for the 24-hour test session resulted in a ROC curve (Figure 17) suggesting that this

model had "fair" classification accuracy. AUC was calculated to be 0.788 (SE =

0.0773; 95% Confidence Interval = 0.636-0.939), and was significantly greater than

chance with a significance value of p<0.0001. At the Day 3 test session (Figure 18),

Model 2 was found not to be statistically better than chance at classifying the concussed


and control participants (AUC = 0.566, SE = 0.1135, p<0.28).
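
To make the AUC figures more concrete, the sketch below rebuilds the 24-hour ROC curve for Model 2 from the cumulative counts in Table 18 and integrates it with the trapezoidal rule. The thesis reports using Analyse-It for Excel, so this is only a stand-in calculation under the assumption that each "k or more significant measures" criterion contributes one ROC point; the function name is hypothetical. With the Table 18 counts it returns 215/273, which rounds to the reported 0.788.

```python
# Trapezoidal AUC from cumulative classification counts (Table 18).
MODEL2_CONCUSSED_24H = [21, 17, 10, 8, 4, 4, 3, 3, 1, 0]
MODEL2_CONTROL_24H = [13, 5, 1, 0, 0, 0, 0, 0, 0, 0]


def roc_auc_from_counts(concussed_at_least, control_at_least):
    """Area under the ROC curve built from 'flagged on at least k' counts.

    Entry 0 of each list is the group size (everyone meets 'at least 0').
    """
    n_concussed = concussed_at_least[0]
    n_control = control_at_least[0]
    # One point per criterion: (1 - specificity, sensitivity), plus the origin.
    points = {(0.0, 0.0)}
    for flagged_conc, flagged_ctrl in zip(concussed_at_least, control_at_least):
        points.add((flagged_ctrl / n_control, flagged_conc / n_concussed))
    ordered = sorted(points)
    auc = 0.0
    for (x0, y0), (x1, y1) in zip(ordered, ordered[1:]):
        auc += (x1 - x0) * (y0 + y1) / 2.0  # trapezoid between adjacent points
    return auc


auc_24h = roc_auc_from_counts(MODEL2_CONCUSSED_24H, MODEL2_CONTROL_24H)
print(round(auc_24h, 3))
```

An AUC of 0.50 corresponds to the "No discrimination" diagonal, so the distance above 0.50 is what the significance test against chance evaluates.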













Figure 17. Receiver Operating Characteristic Curve for Model 2 at 24 Hour Test Session
[ROC plot; x-axis: 1 - Specificity (false positives); series: No discrimination, Model 2.]

Figure 18. Receiver Operating Characteristic Curve for Model 2 at Day 3 Test Session
[ROC plot; x-axis: 1 - Specificity (false positives); series: No discrimination, Model 2.]











Figure 19. Receiver Operating Characteristic Curve for Model 2 at Day 5 Test Session
[ROC plot; x-axis: 1 - Specificity (false positives); series: No discrimination, Model 2.]

Figure 20. Receiver Operating Characteristic Curve for Model 2 at Day 7 Test Session
[ROC plot; x-axis: 1 - Specificity (false positives); series: No discrimination, Model 2.]

Similarly, Model 2 was found not to be statistically different than chance at the

Day 5 and Day 7 test sessions, as shown in Figures 19 and 20. The AUC for Model 2 at

Day 5 was found to be 0.522 (SE = 0.1154, 95% Confidence Interval = 0.296-0.748,









p<0.42), while the AUC for Model 2 at Day 7 was 0.518 (SE = 0.1211, 95% Confidence

Interval = 0.280-0.755, p<0.44).

As the ROC analysis suggested, in the repeated testing of the concussed football

players, comparing their re-test performance to either an appropriate normative sample

(Model 1) or to their own baseline performance using a liberal criterion of 1.0 SD reduced

performance (Model 2) yielded rather poor sensitivity for the individual variables

included in this test battery. Further, examining the variables collectively, both models

were only fair in accurately discriminating between concussed and control participants

even within 24 hours of concussion, a time period in which the cognitive sequelae of

concussion are generally most evident.


Figure 21. Comparison of Model 1 and Model 2 at the 24-Hour Test Session
[ROC plot titled "Model 1 vs. Model 2 ROC at 24 Test Session (diff = 0.022, p<0.82)"; x-axis: 1 - Specificity (false positives); series: No discrimination, Model 2, Model 1.]


Neither was better than chance at discriminating between the two groups at the Day 3,

Day 5, or Day 7 test sessions. A comparison of Model 1 and Model 2 for the 24-Hour Test

Session suggested that Model 2, with an AUC of 0.788, discriminated slightly better than

Model 1, which had an AUC of 0.766. However, there was not a statistically significant

difference between the two (p=0.82) as illustrated in Figure 21.

Reliable Change Index Models

Model 3

Three variants of the Reliable Change Index (RCI) were subsequently applied to

the data set. The first, a basic RCI model (referred to as Model 3 for the purposes of this

study), essentially examined whether the test-retest performance difference for each

individual was large relative to the error variance of the variable under examination,

providing indication of a change from baseline performance. The following formula was

applied for each individual's performance on each variable:


$$\mathrm{RCI} = \frac{x_2 - x_1}{s_{\mathrm{diff}}}$$

In this equation, $x_1$ is the individual's score at the Baseline testing session, $x_2$ is the

individual's score at the second test session (in this case, within 24 hours of concussion,

day 3, day 5, and day 7), and $s_{\mathrm{diff}}$, the standard error of the difference, was calculated

using $S_E$, the standard error of the measure, with the equation:

$$s_{\mathrm{diff}} = \sqrt{2 S_E^2}$$

$S_E$ was derived from the correlation of the test-retest scores of the control group for each

variable ($r_{12}$) and the standard deviation ($s$) of the control group at baseline using the

equation:

$$S_E = s\sqrt{1 - r_{12}}$$









An index, outside of which a re-test score was considered significant, was calculated by

multiplying the standard error of the difference by a cutoff level of 1.96, the critical z value for a

two-tailed significance level of 0.05. Tables 22 through 25 indicate the data used to calculate these measures as well as

the RCI ranges used as comparators for each variable at each testing session.
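
The chain from test-retest correlation to RCI range can be sketched as follows. The function and parameter names are hypothetical; the inputs in the usage lines are the control-group values listed for Hopkins Total and Trails A at 24 hours (Table 22), and small differences from the tabled entries are to be expected because the published correlations are rounded.

```python
import math


def reliable_change_threshold(sd_baseline, r12, z_crit=1.96):
    """Return (SE, s_diff, RCI half-width) for one measure.

    sd_baseline: control-group standard deviation at baseline
    r12:         control-group test-retest correlation
    z_crit:      cutoff applied to the standard error of the difference
    """
    se = sd_baseline * math.sqrt(1.0 - r12)  # standard error of the measure
    s_diff = math.sqrt(2.0 * se ** 2)        # standard error of the difference
    return se, s_diff, z_crit * s_diff


# Hopkins Total at 24 hours: s = 3.52, r12 = 0.841
se, s_diff, half_width = reliable_change_threshold(3.52, 0.841)
print(round(se, 2), round(half_width, 2))

# Trails A at 24 hours: s = 5.59, r12 = 0.204
_, _, trails_a_half_width = reliable_change_threshold(5.59, 0.204)
print(round(trails_a_half_width, 2))
```

A low test-retest correlation widens the range sharply, which is why Trails A at 24 hours requires a change of nearly 14 seconds before it counts as reliable.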

Table 22. Data Used in the Calculation of Reliable Change Indices at 24 Hours
Measure r12 (a) sd (b) SE (c) Sdiff (d) RCI(+/-)

Hopkins Total 0.841 3.52 1.40 1.99 3.89
Trails A 0.204 5.59 4.99 7.05 13.82
Trails B 0.623 11.21 6.88 9.74 19.08
Digits Forward 0.942 2.27 0.55 0.77 1.51
Digits Backward 0.548 2.41 1.62 2.29 4.49
Pegs Dominant 0.720 7.87 4.17 5.89 11.55
Pegs Nondominant 0.737 12.02 6.16 8.72 17.09
COWAT 0.282 6.97 5.91 8.35 16.37
Hopkins Delay 0.682 1.73 0.98 1.38 2.70
a r12 based on the correlation for control participants between baseline and 24 hour performance, listwise n=13
b sd is the standard deviation for the control group at baseline
c standard error of the measure
d standard error of the difference


Table 23. Data Used in the Calculation of Reliable Change Indices at Day 3
Measure r12 (a) sd (b) SE (c) Sdiff (d) RCI(+/-)


Hopkins Total 0.679 3.52 1.99 2.82 5.53
Trails A 0.760 5.59 2.74 3.87 7.59
Trails B 0.215 11.21 9.93 14.05 27.53
Digits Forward 0.672 2.27 1.30 1.84 3.60
Digits Backward 0.659 2.41 1.41 1.99 3.90
Pegs Dominant 0.704 7.87 4.28 6.06 11.87
Pegs Nondominant 0.617 12.02 7.44 10.52 20.62
COWAT 0.429 6.97 5.27 7.45 14.60
Hopkins Delay 0.419 1.73 1.32 1.87 3.66
a r12 based on the correlation for control participants between baseline and Day 3 performance, listwise n=13
b sd is the standard deviation for the control group at baseline
c standard error of the measure
d standard error of the difference









Table 24. Data Used in the Calculation of Reliable Change Indices at Day 5
Measure r12 (a) sd (b) SE (c) Sdiff (d) RCI(+/-)

Hopkins Total 0.628 3.52 2.15 3.04 5.95
Trails A 0.392 5.59 4.36 6.16 12.08
Trails B 0.427 11.21 8.49 12.00 23.52
Digits Forward 0.418 2.27 1.73 2.45 4.80
Digits Backward 0.228 2.41 2.12 2.99 5.87
Pegs Dominant 0.297 7.87 6.60 9.34 18.30
Pegs Nondominant 0.735 12.02 6.19 8.75 17.15
COWAT 0.604 6.97 4.39 6.20 12.16
Hopkins Delay 0.386 1.73 1.36 1.92 3.76
a r12 based on the correlation for control participants between baseline and Day 5 performance, listwise n=10
b sd is the standard deviation for the control group at baseline
c standard error of the measure
d standard error of the difference


Table 25. Data Used in the Calculation of Reliable Change Indices at Day 7
Measure r12 (a) sd (b) SE (c) Sdiff (d) RCI(+/-)


Hopkins Total 0.492 3.52 2.51 3.55 6.96
Trails A 0.301 5.59 4.67 6.61 12.95
Trails B 0.365 11.21 8.93 12.63 24.76
Digits Forward 0.600 2.27 1.44 2.03 3.98
Digits Backward 0.497 2.41 1.71 2.42 4.74
Pegs Dominant 0.674 7.87 4.50 6.36 12.46
Pegs Nondominant 0.883 12.02 4.11 5.81 11.40
COWAT 0.025 6.97 6.88 9.73 19.08
Hopkins Delay 0.884 1.73 0.59 0.83 1.63
a r12 based on the correlation for control participants between baseline and Day 7 performance
b sd is the standard deviation for the control group at baseline
c standard error of the measure
d standard error of the difference

Please note that a change was considered to indicate a significant difference between a

performance at a particular test session and the appropriate baseline performance only if

the difference exceeded the RCI cutoff point associated with a significant decrease in

performance at the second test session. That is, it was considered significant if the second

test performance on a particular measure was significantly below the baseline


performance based on the RCI.








Table 26 demonstrates the use of this method on one concussed participant for

two test variables. As is illustrated, for the variable 'Hopkins Total words recalled,' a

reduction of 4 or more recalled words was calculated to be a significant change

relative to the error variance of the test as indicated by the standard error of the difference. This participant's

reduction of 8 words at 24 hours relative to his baseline performance resulted in a z-score

of -4.03, which, using a cut-point of z=1.96, was found to be a highly significant change

in the negative direction. That is, his performance on Hopkins Total was significantly

below his performance on the same measure at baseline. For the variable Trails A, which

is measured in seconds, his increase of 51" in total completion time from baseline to the

24 hour test session ultimately resulted in a Trails A time at 24 hours of 84", greatly

exceeding the maximum value in the calculated RCI of 46.8". This resulted in a z-score

of 7.23.

Table 26. Example of RCI Significance Determination for One Concussed Participant at
the 24 Hour Test Session

Measure          Observed Change From Baseline To 24 Hours    RCI Interval                     z-score

Hopkins Total    -8 words                                     20 +/- 3.89 = (16-24)            -4.03
Trails A         +51.0"                                       33" +/- 13.82 = (19.2-46.8)      7.23
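
As a check on the arithmetic behind the two z-scores, the worked computation below applies the RCI formula with the control-group values listed in Table 22 (s = 3.52 and r12 = 0.841 for Hopkins Total; s = 5.59 and r12 = 0.204 for Trails A). The standard errors of the difference are recomputed without intermediate rounding, which is an assumption about how the reported z-scores were obtained.

```latex
\begin{align*}
s_{\mathrm{diff}}^{\text{Hopkins}} &= 3.52\sqrt{2(1 - 0.841)} \approx 1.985,
  & z &= \frac{-8}{1.985} \approx -4.03 \\
s_{\mathrm{diff}}^{\text{Trails A}} &= 5.59\sqrt{2(1 - 0.204)} \approx 7.053,
  & z &= \frac{51}{7.053} \approx 7.23
\end{align*}
```

Both values exceed 1.96 in magnitude, and both lie in the direction of poorer performance: fewer words recalled on Hopkins Total and a longer completion time on Trails A.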



Table 27 presents the sensitivities for the individual measures at each test session using

Model 3, the basic RCI approach. Again, the data indicated that the sensitivities for the

individual variables were relatively poor, even within 24 hours, which was also seen in

Models 1 and 2. The sensitivities for the individual variables using Model 3 also








generally represent relatively lower values than in Model 2, which is not surprising since

the criteria used for Models 1 and 2 were more liberal (1.0 SD).

Table 27. Sensitivity for Neuropsychological Measures for Model 3
Measure 24 Hours Day 3 Day 5 Day 7

Hopkins Total 0.43 0.11 0.11 0.11
Trails A 0.14 0.11 0.00 0.00
Trails B 0.19 0.00 0.06 0.05
Digit Span Forward 0.29 0.05 0.00 0.00
Digit Span Back 0.14 0.00 0.06 0.05
Grooved Pegboard D 0.19 0.11 0.00 0.00
Grooved Pegboard N 0.29 0.05 0.00 0.16
COWAT 0.05 0.00 0.00 0.00
Hopkins Delay 0.38 0.16 0.11 0.37

The battery as a whole was again examined to determine the number of concussed and

control participants who scored below their individual cut-points across multiple

variables using Model 3. Sensitivity and specificity characteristics of Model 3 for each

test session are presented in Tables 28-31.

Table 28. Number of Concussed and Control Participants Falling Below Cut-Point for
Multiple Variables at 24 Hours using Model 3
No. of Sig.Measures Concussed Control Se Sp (Se + Sp)

0 21 13 1.0 0.0 (1.00)
1 16 1 0.76 0.92 (1.68)
2 10 0 0.48 1.00 (1.48)
3 7 0 0.33 1.00 (1.33)
4 4 0 0.19 1.00 (1.19)
5 3 0 0.14 1.00 (1.14)
6 2 0 0.10 1.00 (1.10)
7 1 0 0.05 1.00 (1.05)
8 1 0 0.05 1.00 (1.05)
9 0 0 0.00 1.00 (1.00)
Se=Sensitivity, Sp=Specificity

As with the first two models, Model 3 was most effective in detecting a concussed

participant when a criterion of one variable below the respective RCI cutoff was used.

Notably, the RCI model resulted in a much greater specificity value when using this








criterion than either Model 1 or Model 2, suggesting that the basic RCI model may be

more effective in separating the groups. This is not altogether surprising since the cutoff

criteria are based on the performance of the control population. At Day 3 (Table 29),

Model 3 saw a marked decline in sensitivity, even when using the minimal criterion of

one variable falling outside the RCI.

Table 29. Number of Concussed and Control Participants Falling Below Cut-Point for
Multiple Variables at Day 3 with Model 3
No. of Sig.Measures Concussed Control Se Sp (Se + Sp)

0 19 11 1.0 0.0 (1.00)
1 4 2 0.21 0.82 (1.03)
2 3 2 0.16 0.82 (0.98)
3 1 0 0.05 1.00 (1.05)
4 1 0 0.05 1.00 (1.05)
5 1 0 0.05 1.00 (1.05)
6 1 0 0.05 1.00 (1.05)
7 0 0 0.00 1.00 (1.00)
8 0 0 0.00 1.00 (1.00)
9 0 0 0.00 1.00 (1.00)
Se=Sensitivity, Sp=Specificity


Table 30. Number of Concussed and Control Participants Falling Below Cut-Point for
Multiple Variables at Day 5 with Model 3
No. of Sig.Measures Concussed Control Se Sp (Se + Sp)

0 18 10 1.0 0.0 (1.00)
1 4 1 0.22 0.90 (1.12)
2 1 1 0.06 0.90 (0.96)
3 1 1 0.05 0.90 (0.96)
4 0 1 0.00 0.90 (0.90)
5 0 0 0.00 1.00 (1.00)
6 0 0 0.00 1.00 (1.00)
7 0 0 0.00 1.00 (1.00)
8 0 0 0.00 1.00 (1.00)
9 0 0 0.00 1.00 (1.00)


As was seen in the previous two models, detection of concussed players saw its

most extensive drop at the Day 5 test session (Table 30), but saw a resurgence in sensitivity


at the Day 7 test session (Table 31).








Table 31. Number of Concussed and Control Participants Falling Below Cut-Point for
Multiple Variables at Day 7 with Model 3
No. of Sig.Measures Concussed Control Se Sp (Se + Sp)

0 19 6 1.0 0.0 (1.00)
1 9 3 0.47 0.50 (0.97)
2 5 0 0.26 1.00 (1.26)
3 0 0 0.00 1.00 (1.00)
4 0 0 0.00 1.00 (1.00)
5 0 0 0.00 1.00 (1.00)
6 0 0 0.00 1.00 (1.00)
7 0 0 0.00 1.00 (1.00)
8 0 0 0.00 1.00 (1.00)
9 0 0 0.00 1.00 (1.00)
Se=Sensitivity, Sp=Specificity

Receiver operating characteristic curves for Model 3 at each test session are

shown in Figures 22 through 25. As Figure 22 indicates, Model 3 was relatively good at

discriminating between concussed and control participants at the 24 hour test session.

Area under the curve was found to be 0.861 (SE = 0.0641, 95% Confidence Interval =

0.735-0.986), which was significantly better than chance level (p<0.0001).

While Model 3 was generally good at distinguishing between concussed

and control participants within 24 hours of concussion, Figures 23-25 indicate that the

model performed no better than chance at Day 3, Day 5, and Day 7. The AUC for Model

3 at Day 3 was 0.514 (SE = 0.111, 95% Confidence Interval = 0.298-0.731, p<0.45);

AUC for Model 3 at Day 5 was 0.556 (SE = 0.111, 95% Confidence Interval = 0.338-

0.773, p<0.31); and the AUC for Model 3 at Day 7 was 0.553 (SE = 0.123, 95%

Confidence Interval = 0.312-0.793, p<0.33).
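
The source does not name the test used to compare each AUC with chance. The reported p-values are, however, close to what a one-sided normal approximation gives when the AUC is compared with 0.5 using its standard error. The following is a sketch under that assumption; the function name and the choice of a one-sided test are not taken from the source:

```python
import math


def auc_vs_chance(auc, se, z_crit=1.96):
    """Normal-approximation test of an AUC against the chance value of 0.5.

    Returns (z, one_sided_p, ci_low, ci_high).
    """
    z = (auc - 0.5) / se
    # Upper-tail probability of the standard normal distribution.
    p = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p, auc - z_crit * se, auc + z_crit * se


# Model 3 at Day 3: AUC = 0.514, SE = 0.111
z, p, ci_low, ci_high = auc_vs_chance(0.514, 0.111)
print(f"z = {z:.2f}, p = {p:.2f}, 95% CI = {ci_low:.3f} to {ci_high:.3f}")
```

Under this assumption the Day 3 values give a p of about 0.45, and the interval limits agree with the reported ones to within rounding of the standard error.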











[Figure: ROC Curve for Model 3 at 24 Hours; x-axis: 1 - Specificity (false positives); series: Model 3, No discrimination]

Figure 22. Receiver Operating Characteristic Curve for Model 3 at 24 Hour Test Session



[Figure: ROC Curve for Model 3 at Day 3; x-axis: 1 - Specificity (false positives); series: Model 3, No discrimination]

Figure 23. Receiver Operating Characteristic Curve for Model 3 at Day 3 Test Session










[Figure: ROC Curve for Model 3 at Day 5; x-axis: 1 - Specificity (false positives); series: Model 3, No discrimination]

Figure 24. Receiver Operating Characteristic Curve for Model 3 at Day 5 Test Session


[Figure: ROC Curve for Model 3 at Day 7; x-axis: 1 - Specificity (false positives); series: Model 3, No discrimination]

Figure 25. Receiver Operating Characteristic Curve for Model 3 at Day 7 Test Session

Model 4


The second RCI variant analyzed as part of the current study (referred to as Model

4) utilized the same data indicators as were used in the basic RCI model (Model 3), but








also included a correction for observed practice effects in the control population for

individual variables at each testing session. Consistent with the literature, practice

effects were calculated by finding the mean change in score over a test-retest period for

the comparison population. For the current study, this involved calculating the mean

change for the control population for each variable from baseline to 24 hour test session,

baseline to Day 3 test session, baseline to Day 5 test session, and baseline to Day 7 test

session. Again, individual subjects were omitted listwise. Table 32 indicates the mean

change scores for the control population.

Table 32. Mean Change Scores for Individual Variables for Control Participants
Measure 24 Hour (n=13) Day 3 (n=11) Day 5 (n=10) Day 7 (n=6)

Hopkins Total 0.38 -0.91 -0.40 0.50
Trails A -3.69" -5.03" -4.87" -8.48"
Trails B 2.34"* -1.01" -8.40" -11.03"
Digits Forward 1.38 1.64 2.00 1.50
Digits Backward 0.77 1.73 2.50 0.67
Pegs Dominant -2.97" -4.05" -4.20" -8.07"
Pegs Nondominant -0.61" -3.35" 0.46"* -6.55"
COWAT 3.08 5.45 5.80 1.67
Hopkins Delay 0.00* -1.27* 0.60 -2.33*
*indicates no practice effect detected
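
The practice-effect estimate described above is a mean retest-minus-baseline change over the control participants who have complete data. The following is only a rough sketch of that calculation: the data layout, the participant ids, and all scores are hypothetical, and the reading of "listwise" as "drop a participant who is missing any score at either session" is an assumption.

```python
def mean_change_listwise(baseline, retest):
    """Mean retest-minus-baseline change per variable.

    baseline, retest: dicts mapping participant id -> {variable: score},
    with None marking a missing score. A participant missing any score at
    either session is dropped from every variable (listwise omission).
    """
    variables = sorted({v for scores in baseline.values() for v in scores})
    complete = [
        pid for pid in baseline
        if pid in retest
        and all(
            baseline[pid].get(v) is not None and retest[pid].get(v) is not None
            for v in variables
        )
    ]
    return {
        v: sum(retest[p][v] - baseline[p][v] for p in complete) / len(complete)
        for v in variables
    }


# Hypothetical control data (not from the study).
baseline = {
    "c1": {"cowat": 38, "trails_a": 25.0},
    "c2": {"cowat": 42, "trails_a": 30.0},
    "c3": {"cowat": 35, "trails_a": 28.0},
}
retest = {
    "c1": {"cowat": 41, "trails_a": 21.0},
    "c2": {"cowat": 47, "trails_a": 27.0},
    "c3": {"cowat": 36, "trails_a": None},  # incomplete, so c3 is dropped
}
changes = mean_change_listwise(baseline, retest)
print(changes)
```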

In order to correct for practice effect, the mean change score for a variable was

subtracted from each participant's score on that variable at the retest session. The

resulting number was then entered into the RCI equation as outlined above, and

calculated as before with the basic RCI method in Model 3. Therefore, the equation for

the RCI became:

$$ RCI = \frac{(x_2 - d_{12}) - x_1}{S_{diff}} $$

where $x_1$ was the participant's score on a variable at baseline, $x_2$ was their score at the

retest session, and $d_{12}$ was the average change score for the control group on the variable

over that test-retest interval. The $S_{diff}$ was calculated as in the basic RCI model using the

correlations and standard deviations for the control group as described above. Please

note that for those variables that were found, on average, not to have practice effects for

the control group at the retest session, no changes were made in the study participants'

retest score in the RCI model. This occurred five times, most frequently for the Hopkins

Verbal Learning Test Total and Delay scores, which was not unexpected given that these

tests used alternate forms over test sessions. Again, for the purposes of this study, scores

were classified as significant only if they exceeded their RCI cutoff in the negative or

declined performance level.
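
The practice-corrected RCI described above can be sketched as follows. This is an illustration rather than the study's actual procedure: the function names are invented, the standard error of the difference is taken as a given input because its calculation is described elsewhere, and the handling of timed measures (where a higher score is worse) is an assumption. The participant scores and the standard error value in the example are hypothetical.

```python
def practice_corrected_rci(x1, x2, d12, s_diff, practice_effect=True):
    """RCI with the control-group practice effect removed from the retest score.

    x1: baseline score; x2: retest score
    d12: mean control-group change over the same test-retest interval
    s_diff: standard error of the difference (computed elsewhere)
    practice_effect: False when no practice effect was detected for the
        variable, in which case the retest score is left uncorrected.
    """
    correction = d12 if practice_effect else 0.0
    return ((x2 - correction) - x1) / s_diff


def is_significant_decline(rci, higher_is_better=True, cutoff=1.96):
    """Flag only changes beyond the cutoff in the direction of worse performance."""
    return rci < -cutoff if higher_is_better else rci > cutoff


# Hypothetical participant whose COWAT score falls from 40 to 30 at 24 hours.
# d12 = 3.08 is the control mean change from Table 32; s_diff = 4.0 is made up.
rci = practice_corrected_rci(40, 30, 3.08, 4.0)
print(round(rci, 2), is_significant_decline(rci))
```

Passing `practice_effect=False` reduces the function to the uncorrected form, which mirrors how variables without a detected practice effect were handled.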

Table 33 presents the sensitivities for the individual measures at each test session

using Model 4, the RCI approach with correction for practice effects. Once again, it is

apparent that the individual variables, despite correction for practice effect, have

relatively low sensitivities. The practice correction does appear to have resulted in a

slight improvement in sensitivity for each re-test session, suggesting the detection of one

or two more significantly decreased scores for each. It is notable that both Hopkins

Delay and Grooved Pegboard nondominant hand performance have relatively larger

sensitivities when compared to the other variables at day 7, suggesting some lingering

short-term memory and motor speed decline in the concussed population. Tables 34

through 37 indicate the sensitivity and specificity of the battery of variables as a whole

for Model 4 at each test session.








Table 34 indicates a relatively high sensitivity/specificity combination was

achieved for the RCI model with correction for practice at the 24 Hour test session using

a criterion of 1 significant variable. Sixteen of the 21 concussed participants at this test

session fell below the cutoff for 1 or more of the 9 variables under examination while

only 1 control participant fell below the cutoff for 1 or more variables. By the Day 3

test session (Table 35), the RCI model with practice correction continued to have fairly

high classification success using a criterion of 1 significant variable, achieving a Se + Sp

value of 1.31.

Table 33. Sensitivity for Neuropsychological Measures for Model 4
Measure 24 Hours Day 3 Day 5 Day 7

Hopkins Total 0.43 0.11 0.11 0.11
Trails A 0.19 0.21 0.00 0.16
Trails B 0.19 0.00 0.06 0.05
Digit Span Forward 0.33 0.16 0.01 0.05
Digit Span Back 0.14 0.11 0.01 0.05
Grooved Pegboard D 0.24 0.11 0.00 0.11
Grooved Pegboard N 0.29 0.11 0.00 0.32
COWAT 0.10 0.11 0.11 0.00
Hopkins Delay 0.38 0.16 0.11 0.37

Table 34. Number of Concussed and Control Participants Falling Below Cut-Point for
Multiple Variables at 24 Hours with Model 4
No. of Sig.Measures Concussed Control Se Sp (Se + Sp)

0 21 13 1.0 0.0 (1.00)
1 16 1 0.76 0.92 (1.68)
2 11 0 0.52 1.00 (1.52)
3 7 0 0.33 1.00 (1.33)
4 5 0 0.24 1.00 (1.24)
5 4 0 0.19 1.00 (1.19)
6 2 0 0.10 1.00 (1.10)
7 2 0 0.10 1.00 (1.10)
8 1 0 0.05 1.00 (1.05)
9 0 0 0.00 1.00 (1.00)
Se=Sensitivity, Sp=Specificity








At the Day 5 test session (Table 36), the application of Model 4 resulted in an

accuracy of 33% when using a criterion of significance on one variable while producing 2

false positive identifications. Although slightly better than the corresponding accuracy

using Model 3, the ability to distinguish between the groups was relatively poor.

However, unlike the previous models, Model 4 achieved some success in distinguishing

between concussed and control participants by Day 7 as can be seen in Table 37. Using a

criterion of significance on one variable, Model 4 achieved a sensitivity level of 0.68

although it also produced a high number of false positive identifications.

Table 35. Number of Concussed and Control Participants Falling Below Cut-Point for
Multiple Variables at Day 3 with Model 4
No. of Sig.Measures Concussed Control Se Sp (Se + Sp)

0 19 11 1.0 0.0 (1.00)
1 11 3 0.58 0.73 (1.31)
2 4 2 0.21 0.82 (1.03)
3 1 0 0.05 1.00 (1.05)
4 1 0 0.05 1.00 (1.05)
5 1 0 0.05 1.00 (1.05)
6 1 0 0.05 1.00 (1.05)
7 0 0 0.00 1.00 (1.00)
8 0 0 0.00 1.00 (1.00)
9 0 0 0.00 1.00 (1.00)
Se=Sensitivity, Sp=Specificity

Table 36. Number of Concussed and Control Participants Falling Below Cut-Point for
Multiple Variables at Day 5 with Model 4
No. of Sig.Measures Concussed Control Se Sp (Se + Sp)

0 18 10 1.0 0.0 (1.00)
1 6 2 0.33 0.80 (1.13)
2 2 1 0.11 0.90 (1.01)
3 1 1 0.05 0.90 (0.95)
4 0 1 0.00 0.90 (0.90)
5 0 0 0.00 1.00 (1.00)
6 0 0 0.00 1.00 (1.00)
7 0 0 0.00 1.00 (1.00)
8 0 0 0.00 1.00 (1.00)
9 0 0 0.00 1.00 (1.00)








Using a criterion of two significant variables, it achieved its highest sensitivity/specificity

combination, but saw a substantial decline in sensitivity. While cursory examination of

the sensitivity/specificity values suggested some clinical utility in identifying lingering

cognitive complaints at this most distal test session, further analysis of statistical

significance using ROC curves indicated otherwise.

Table 37. Number of Concussed and Control Participants Falling Below Cut-Point for
Multiple Variables at Day 7 with Model 4
No. of Sig.Measures Concussed Control Se Sp (Se + Sp)

0 19 6 1.0 0.0 (1.00)
1 13 3 0.68 0.50 (1.18)
2 7 1 0.37 0.83 (1.20)
3 2 0 0.11 1.00 (1.11)
4 1 0 0.05 1.00 (1.05)
5 0 0 0.00 1.00 (1.00)
6 0 0 0.00 1.00 (1.00)
7 0 0 0.00 1.00 (1.00)
8 0 0 0.00 1.00 (1.00)
9 0 0 0.00 1.00 (1.00)
Se=Sensitivity, Sp=Specificity

Receiver operating characteristic curves for Model 4 at each testing session are presented

in Figures 26 through 29. As with the basic RCI model, the RCI with correction for

practice resulted in a model with a relatively good ability to distinguish between

concussed and control participants within 24 hours of concussion. The AUC for this

model was 0.863 (SE = 0.0636, 95% Confidence Interval = 0.738-0.987) and was

significantly better than chance with p<0.0001 (Fig. 26). As seen in Figure 27, Model 4

resulted in a slightly better classification rate at the Day 3 test session, although the model

was again not statistically different than chance level with an AUC of 0.634 (SE =

0.1082, 95% Confidence Interval = 0.422-0.846, p=0.1079). At Day 5, Model 4 was

clearly not better than chance at distinguishing between concussed and control








participants (Fig. 28). AUC was calculated as 0.556, p=0.3151 (SE = 0.1154, 95%

Confidence Interval = 0.329-0.782).


[Figure: ROC Curve for Model 4 at 24 Hours; x-axis: 1 - Specificity (false positives); series: Model 4, No discrimination]

Figure 26. Receiver Operating Characteristic Curve for Model 4 at 24 Hour Test Session

[Figure: ROC Curve for Model 4 at Day 3; x-axis: 1 - Specificity (false positives); series: Model 4, No discrimination]

Figure 27. Receiver Operating Characteristic Curve for Model 4 at Day 3 Test Session











[Figure: ROC Curve for Model 4 at Day 5; x-axis: 1 - Specificity (false positives); series: Model 4, No discrimination]

Figure 28. Receiver Operating Characteristic Curve for Model 4 at Day 5 Test Session


[Figure: ROC Curve for Model 4 at Day 7; x-axis: 1 - Specificity (false positives); series: Model 4, No discrimination]

Figure 29. Receiver Operating Characteristic Curve for Model 4 at Day 7 Test Session

As seen in Figure 29, the application of Model 4 at Day 7 appears to have been somewhat

more successful than the preceding models at distinguishing between concussed and

control participants. This was suggested by the sensitivity/specificity values in Table 37.









The AUC for this model at Day 7 was 0.636 (SE = 0.1245, 95% Confidence Interval =

0.392-0.880); however, this was again not statistically different than the reference

diagonal (p=0.1374).

A comparison between the basic RCI model and the RCI with correction for

practice applied to the data at the 24 Hour test session is shown in Figure 30. Although

the AUC for Model 4 is slightly larger than for Model 3 at the 24 Hour test session (0.863

vs. 0.861), the difference between the two was not statistically significant (p=1).


[Figure: Comparison of RCI and RCI with Correction at 24 Hours; x-axis: 1 - Specificity (false positives); series: RCI, RCI with Correction for Practice, No discrimination]

Figure 30. Comparison of ROC Curves for the RCI and RCI with Correction for Practice

Model 5
The third variant of the RCI model (referred to as Model 5) approached the task

of determining whether there was a real difference between baseline and retest

performance by abandoning some assumptions used in the previous two RCI models.

Those assumptions included that 1) the control participants' true scores for a particular








variable at test and retest sessions were equal, and 2) that the error variances for a

particular test variable at test and retest were also equal. Using the equation

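$$ z = \frac{(Y_i - X_i) - (\bar{Y} - \bar{X})}{\sqrt{S_X^2 + S_Y^2 - 2 S_X S_Y r_{XY}}} $$

where $(\bar{Y} - \bar{X})$, or $\bar{D}$, is the mean change score for a particular variable for the control

population, and $\sqrt{S_X^2 + S_Y^2 - 2 S_X S_Y r_{XY}}$, or $S_D$, is the standard deviation of the change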

scores for the control population, the participants' change scores from baseline to each

successive test session were examined to determine if they were large relative to the

distribution of change scores for the control group for each of those intervals.

Table 38 shows the average change scores for each variable at each testing

session with the standard deviation of the change scores for the control population. Z-

scores were calculated using the above equation. This process is illustrated in Table 39

for one concussed participant at the 24 Hour test session. As with the previous models, a

change score was considered significant if it exceeded 1.96 in the negative direction.

Therefore, as seen in Table 39, a z-score indicating a significant change relative to the

variance of the change scores for the control population for Hopkins Total score is

indicated by a z-score less than -1.96. By contrast, a greater-than-average change score

in the negative direction for Trails A (and all other variables where time is the measured

variable) is indicated by a z-score greater than +1.96.

Table 40 presents the sensitivities for the individual measures at each test session

using Model 5. As the data illustrate, the sensitivities for individual variables using

Model 5 at the 24 Hour test session are once again higher than those at any of the other

test sessions. Singly, the sensitivity values are not very high. As with most

neuropsychological test batteries, however, their combined use resulted in greater

sensitivity for this model over the repeated test sessions.

Table 38. Average Change Scores and Standard Deviations of Change Scores for the
Control Population
Test Session
24 Hour Day 3 Day 5 Day 7
Measure D SD D SD D SD D SD


Hopkins Total 0.38 2.53 -0.91 3.36 -0.40 3.95 -4.33 10.27
Trails A -3.69 6.03 -5.03 3.73 -4.87 5.55 -8.48 4.19
Trails B 2.34 10.55 -1.01 26.02 -8.40 13.96 -11.03 10.44
Digits Forward 1.38 1.12 1.64 1.63 2.00 1.94 1.50 1.87
Digits Backward 0.08 2.22 1.73 2.00 2.50 3.14 0.67 2.07
Pegs Dominant -2.97 6.08 -4.05 5.80 -4.20 13.25 -8.07 7.77
Pegs Nondominant -0.61 8.27 -3.35 10.65 0.46 12.61 -6.55 5.99
COWAT 3.08 8.92 5.45 8.36 5.80 7.93 1.67 11.45
Hopkins Delay 0.00 1.41 -1.27 2.53 0.60 1.84 -2.33 3.56
D =average change, SD =standard deviation of change score

Table 39. Example of Z-score Derivation using Model 5 for one Concussed Participant at
24 Hour Test Session
Measure Baseline 24 Hour Difference D SD z-score

Hopkins Total 20 12 -8 0.38 2.53 -3.31
Trails A 33.0 84.0 51 -3.69 6.03 9.06
D =average change, SD =standard deviation of change score
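
The derivation in Table 39 can be checked with a short sketch. The function names are illustrative and not from the source; the inputs are the Table 39 values, with the control-group mean change and standard deviation of change taken from the 24 Hour column of Table 38.

```python
import math


def sd_of_change(s_x, s_y, r_xy):
    """SD of change scores from the baseline SD, retest SD, and their correlation."""
    return math.sqrt(s_x**2 + s_y**2 - 2 * s_x * s_y * r_xy)


def change_z(baseline, retest, mean_change, sd_change):
    """z-score of one participant's change relative to the control change distribution."""
    return ((retest - baseline) - mean_change) / sd_change


# Hopkins Total: a higher score is better, so a decline is z < -1.96.
z_hopkins = change_z(20, 12, 0.38, 2.53)
# Trails A: time in seconds, so a decline in performance is z > +1.96.
z_trails_a = change_z(33.0, 84.0, -3.69, 6.03)

print(round(z_hopkins, 2), z_hopkins < -1.96)
print(round(z_trails_a, 2), z_trails_a > 1.96)
```

The Hopkins Total value matches the tabulated -3.31. The Trails A value comes out at about 9.07 rather than the tabulated 9.06, a difference that is presumably due to rounding of the tabulated inputs.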

Table 40. Sensitivity for Neuropsychological Measures for Model 5
Measure 24 Hours Day 3 Day 5 Day 7

Hopkins Total 0.43 0.05 0.00 0.00
Trails A 0.29 0.16 0.11 0.32
Trails B 0.19 0.00 0.06 0.05
Digit Span Forward 0.33 0.16 0.22 0.05
Digit Span Back 0.14 0.11 0.06 0.16
Grooved Pegboard D 0.24 0.11 0.00 0.11
Grooved Pegboard N 0.29 0.11 0.00 0.32
COWAT 0.05 0.11 0.00 0.00
Hopkins Delay 0.38 0.00 0.11 0.00








As can be seen, the sensitivity values at day 3 and day 5 are consistently poor.

Interestingly, however, the sensitivities for the individual variables at day 7 are somewhat

higher using Model 5 than with the previous four models, suggesting that this may be a more appropriate

model to use at more distal testing sessions. Tables 41 through 44 indicate the

sensitivity and specificity of the battery of variables as a whole for Model 5 at each test

session.

Table 41. Number of Concussed and Control Participants Falling Below Cut-Point for
Multiple Variables at 24 Hours with Model 5
No. of Sig.Measures Concussed Control Se Sp (Se + Sp)

0 21 13 1.0 0.0 (1.00)
1 17 1 0.81 0.92 (1.73)
2 12 0 0.57 1.00 (1.57)
3 7 0 0.33 1.00 (1.33)
4 5 0 0.24 1.00 (1.24)
5 4 0 0.19 1.00 (1.19)
6 2 0 0.10 1.00 (1.10)
7 1 0 0.05 1.00 (1.05)
8 1 0 0.05 1.00 (1.05)
9 0 0 0.00 1.00 (1.00)
Se=Sensitivity, Sp=Specificity

Table 41 indicates that Model 5 was very successful at distinguishing between concussed

and control participants at the 24 Hour test session. Using a criterion of one variable

falling below cutoff, the combined sensitivity/specificity was 1.73, with both good

sensitivity (0.81) and specificity (0.92).

Day 3 sensitivity/specificity data using Model 5 (Table 42) again indicated a

pattern of significantly decreased sensitivity and a less substantial drop in specificity with

the one variable criterion. Using two variables as a criterion results in an unacceptable

drop in sensitivity. A continued reduction in sensitivity at day 5, as was seen with all

previous models, was again evident for Model 5 (Table 43). However, at the Day 7 test








session, Model 5 resulted in fair sensitivity and good specificity using the one-variable

criterion (Table 44).

Table 42. Number of Concussed and Control Participants Falling Below Cut-Point for
Multiple Variables at Day 3 with Model 5
No. of Sig.Measures Concussed Control Se Sp (Se + Sp)

0 19 11 1.0 0.0 (1.00)
1 10 2 0.53 0.82 (1.35)
2 2 0 0.11 1.00 (1.11)
3 1 0 0.05 1.00 (1.05)
4 1 0 0.05 1.00 (1.05)
5 0 0 0.00 1.00 (1.00)
6 0 0 0.00 1.00 (1.00)
7 0 0 0.00 1.00 (1.00)
8 0 0 0.00 1.00 (1.00)
9 0 0 0.00 1.00 (1.00)
Se=Sensitivity, Sp=Specificity

Table 43. Number of Concussed and Control Participants Falling Below Cut-Point for
Multiple Variables at Day 5 with Model 5
No. of Sig.Measures Concussed Control Se Sp (Se + Sp)

0 18 10 1.0 0.0 (1.00)
1 7 2 0.39 0.80 (1.19)
2 1 0 0.06 1.00 (1.06)
3 0 0 0.00 1.00 (1.00)
4 0 0 0.00 1.00 (1.00)
5 0 0 0.00 1.00 (1.00)
6 0 0 0.00 1.00 (1.00)
7 0 0 0.00 1.00 (1.00)
8 0 0 0.00 1.00 (1.00)
9 0 0 0.00 1.00 (1.00)
Se=Sensitivity, Sp=Specificity

The statistical significance of Model 5 at each testing session was again determined

through ROC curve analysis using non-parametric measurement of area under the curve.


Results of this analysis are depicted in Figures 31 through 34.
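
A non-parametric area under the curve can be approximated by joining the (1 - Specificity, Sensitivity) points from a criterion table and summing the trapezoids beneath them. The sketch below assumes that approach; the source does not state which software or exact procedure was used, and the function name is illustrative. The counts are those of Table 41 (Model 5 at 24 hours).

```python
def roc_auc_from_counts(concussed, control):
    """Trapezoidal area under the ROC curve built from cumulative criterion counts.

    concussed[k] and control[k] are the numbers of participants with at least
    k significant measures; index 0 therefore holds the group sizes.
    """
    n_pos, n_neg = concussed[0], control[0]
    # ROC points as (false-positive rate, sensitivity), ordered from the
    # strictest criterion to the most lenient, starting at the origin.
    points = [(0.0, 0.0)]
    for c_pos, c_neg in reversed(list(zip(concussed, control))):
        points.append((c_neg / n_neg, c_pos / n_pos))
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Table 41: Model 5 at the 24 Hour test session.
concussed = [21, 17, 12, 7, 5, 4, 2, 1, 1, 0]
control = [13, 1, 0, 0, 0, 0, 0, 0, 0, 0]
auc = roc_auc_from_counts(concussed, control)
print(round(auc, 3))
```

With the Table 41 counts this comes out at approximately 0.888, in line with the AUC reported for Model 5 at 24 hours.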









Table 44. Number of Concussed and Control Participants Falling Below Cut-Point for
Multiple Variables at Day 7 with Model 5
No. of Sig.Measures Concussed Control Se Sp (Se + Sp)

0 19 6 1.0 0.0 (1.00)
1 12 0 0.63 1.00 (1.63)
2 6 0 0.32 1.00 (1.32)
3 1 0 0.05 1.00 (1.05)
4 0 0 0.00 1.00 (1.00)
5 0 0 0.00 1.00 (1.00)
6 0 0 0.00 1.00 (1.00)
7 0 0 0.00 1.00 (1.00)
8 0 0 0.00 1.00 (1.00)
9 0 0 0.00 1.00 (1.00)
Se=Sensitivity, Sp=Specificity


[Figure: ROC Curve for Model 5 at 24 Hours; x-axis: 1 - Specificity (false positives); series: Model 5, No discrimination]

Figure 31. Receiver Operating Characteristic Curve for Model 5 at 24 Hour Test Session

Figure 31 depicts the ROC curve for Model 5 at 24 hours. The AUC of 0.888 (SE =

0.0577, 95% Confidence Interval = 0.775-1.00) was larger than any of the preceding

models and was highly significant (p<0.0001), indicating that this model was good at

distinguishing between the change scores of the concussed and control participants.

Further, Figure 32 indicates that Model 5 was also able to distinguish between the









concussed and control populations at the Day 3 test session at a statistically significant

level, an achievement which no previous model was able to attain. The AUC for the

model at this test session was 0.682 (SE=0.0985, 95% Confidence Interval = 0.489-

0.875), which was statistically different than chance level (p=0.033).


[Figure: ROC Curve for Model 5 at Day 3; x-axis: 1 - Specificity (false positives); series: Model 5, No discrimination]

Figure 32. Receiver Operating Characteristic Curve for Model 5 at Day 3 Test Session

By day 5, however, Model 5 was not statistically different than chance at

distinguishing between the two groups (Fig. 33). The AUC was calculated to be 0.597

(SE = 0.1107, 95% Confidence Interval = 0.380-0.814), with a p-value of 0.19.

Interestingly, as was suggested by the sensitivity/specificity values for Model 5 at day 7

in Table 44, Figure 34 indicates that Model 5 was indeed successful at distinguishing

between concussed and control participants at the day 7 test session. The ROC plot

produced an AUC of 0.816 (SE = 0.0837, 95% Confidence Interval = 0.652-0.980),

which was statistically different than chance level (p<0.0001).