Citation
Analytical models for diagnostic classification and treatment planning for craniofacial pain.

Material Information

Title:
Analytical models for diagnostic classification and treatment planning for craniofacial pain.
Series Title:
Analytical models for diagnostic classification and treatment planning for craniofacial pain.
Creator:
Leonard, Michael Steven
Publisher:
Michael Steven Leonard
Publication Date:
Language:
English

Subjects

Subjects / Keywords:
Cost estimates ( jstor )
Discriminants ( jstor )
Facial pain ( jstor )
Information classification ( jstor )
Mathematical vectors ( jstor )
Matrices ( jstor )
Modeling ( jstor )
Statistical models ( jstor )
Symptomatology ( jstor )
Transition probabilities ( jstor )

Record Information

Source Institution:
University of Florida
Holding Location:
University of Florida
Rights Management:
Copyright [name of dissertation author]. Permission granted to the University of Florida to digitize, archive and distribute this item for non-profit research and educational purposes. Any reuse of this item in excess of fair use or other copyright exemptions requires permission of the copyright holder.
Resource Identifier:
022771867 ( alephbibnum )
14057039 ( oclc )



Full Text











ANALYTICAL MODELS FOR DIAGNOSTIC CLASSIFICATION AND
TREATMENT PLANNING FOR CRANIOFACIAL PAIN











By



Michael Steven Leonard


A DISSERTATION PRESENTED TO THE GRADUATE COUNCIL OF
THE UNIVERSITY OF FLORIDA IN PARTIAL
FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF
DOCTOR OF PHILOSOPHY



UNIVERSITY OF FLORIDA
1973


































To my wife,

Mary













ACKNOWLEDGEMENTS


Without the considerable contributions of time and effort by the

members of his committee, it would have been impossible for the author

to have completed this dissertation. In particular, the author expresses

gratitude to his Chairman, Dr. Kerry Kilpatrick, for his encouragement

and direction during the course of this research effort. The author also

thanks Dr. Kilpatrick for his editorial assistance during the development

and organization of this manuscript. The author thanks Dr. Richard

Mackenzie and Dr. Stephen Roberts for providing the initial direction for

this research. Additionally, the author is grateful to Dr. Than Hodgson

and Dr. Donald Ratliff for their assistance in evaluating and refining

the author's ideas throughout this project. The author expresses his

gratitude to Dr. Thomas Fast and Dr. Parker Mahan for the contribution of

their extensive knowledge about craniofacial pain to the author's research.

The author is deeply appreciative of Dr. Fast's and Dr. Mahan's willing-

ness to spend many hours examining dental records and their endurance of

the nomenclature and idiosyncrasies of this mathematical-modeling effort.

Financial support for this research was provided by the Health Systems

Research Division, J. Hillis Miller Health Center. The division's sup-

port in conjunction with a traineeship granted by the National Science

Foundation made it possible for the author to undertake this research.

The author is also grateful to the Industrial and Systems Engineering

Department for the contribution of computer funds. Additionally the au-

thor thanks Dr. William Solberg, University of California at Los Angeles;








Dr. Daniel Laskin, University of Illinois; and Dr. David Mitchell,

University of Indiana, for providing access to the patient records

employed in this modeling effort.

The author would like to express his thanks to the secretarial staff

of the Health Systems Research Division for their translation of the au-

thor's 'first-order' approximation to handwriting into a draft of this

manuscript. Their tolerance of a multitude of last minute changes made

by the author has been appreciated.

Finally, the author thanks his wife, Mary, and his parents, Dorothy

and Charles Leonard, for their encouragement and support throughout the

course of this research.

M.S.L.

August, 1973














TABLE OF CONTENTS



ACKNOWLEDGEMENTS

LIST OF TABLES

LIST OF FIGURES

ABSTRACT

Chapter

1. Introduction

    1.1 Craniofacial Pain

    1.2 Research Objective

    1.3 Dissertation Overview

2. Previous Research

    2.1 Bayesian Classification Models

    2.2 Non-Parametric Classification Models

    2.3 Finite-Horizon Treatment Planning

    2.4 Uncertain-Duration Treatment Planning

3. Diagnostic Classification

    3.1 Model Components

    3.2 Alternative Interpretations of Linear Separability

    3.3 Model Validation

    3.4 Minimum-Cost Symptom-Selection Algorithm

        3.4.1 Algorithm Development

        3.4.2 Statement of the Minimum-Cost Symptom-Selection Algorithm

        3.4.3 Computational Considerations

    3.5 Model Applications

4. Treatment Planning

    4.1 Model Components

        4.1.1 Patient States

        4.1.2 Transition Probabilities

        4.1.3 Cost Structure

    4.2 Selection of Optimal Treatments

    4.3 Model Validation

    4.4 Model Applications

5. Conclusions and Future Research

Appendices

A Craniofacial-Pain Patient Data Vector

B Modified Fixed-Increment Training Algorithm

C Application of the Minimum-Cost Symptom-Selection Algorithm

D Treatment Alternatives for Craniofacial-Pain Patients

E Stability of Transition-Probability Estimates

F Flow Charts of Patient-State Transitions

G Patient-State Treatment Selections

H Application of the Patient-State-Labeling and Optimal-Treatment-Selection Procedure

BIBLIOGRAPHY

BIOGRAPHICAL SKETCH













LIST OF TABLES



Tables

1. Survey of Diagnostic-Classification Models

2. Correlation Between Significant Symptoms and Discriminant-Function Weights

3. Tests of Diagnostic Classifier Accuracy

4. Classification Variability Among Dental Practitioners

5. Mean Transit Times Through the Craniofacial-Pain Care System













LIST OF FIGURES



Figures

1. Temporomandibular Joint

2. Diagnostic-Classification and Treatment-Planning Process for Craniofacial Pain

3. Craniofacial-Pain Diagnostic Alternatives

4. Procedure 2

5. Diagnostic-Classification Transitions

6. Patient-Visit Inconvenience Cost

7. Application of the Modified Fixed-Increment Algorithm

8. Multiple-State History-Augmented Process







Abstract of Dissertation Presented to the
Graduate Council of the University of Florida in Partial
Fulfillment of the Requirement for the Degree of Doctor of Philosophy


ANALYTICAL MODELS FOR DIAGNOSTIC CLASSIFICATION AND
TREATMENT PLANNING FOR CRANIOFACIAL PAIN


By

Michael Steven Leonard

December, 1973


Chairman: Dr. Kerry E. Kilpatrick
Major Department: Industrial and Systems Engineering


This dissertation presents a systematic approach to craniofacial-

pain diagnosis and treatment planning using analytic models of the under-

lying decision-making processes. Patient diagnoses are generated by a

linear pattern-recognition classifier trained with a sample of preclas-

sified craniofacial-pain patient data. For this classifier, an algorithm

is developed that minimizes the total cost of the set of features employed

in the classifying process. Diagnostic classifications, augmented by a

history of prior treatment applications, provide the state descriptions

for a Markovian decision model of the treatment-planning process. Cranio-

facial-pain patient records from four university dental clinics serve as

a data base for model construction and validation.

The analytic models provide a means of duplicating the diagnostic

classifications and treatment plans of experts. Approximately 90% of

the diagnostic classifier's classifications and 93% of the treatment-

planning model's treatment selections concurred with the decisions made

by experts in the field of care for craniofacial-pain patients. Moreover,

the models permit an examination of the critical considerations associated







with both decision-making processes. These capabilities are discussed

in terms of applications of the models in teaching, research, and in the

practice of dentistry.











CHAPTER 1

INTRODUCTION

The rapid pace of developments in medical and dental research pre-

vents the practicing physician and dentist from fully utilizing each new

diagnostic and treatment-planning aid as it is published. In each of the

last four years an average of 215,000 new publications have been written

to supplement the knowledge of the health-care practitioner [1]. Con-

currently, the pressures of an ever-increasing patient load force prac-

titioners to select the most expeditious means for diagnosing disorders

and selecting treatments. For example, the medical general-practitioner

(1970) saw an average of 173 patients a week [2], and the median dental

practitioner (1971) saw two patients an hour [3]. Given these circumstances,

practitioners may overlook possible diagnostic and treatment alternatives

or they may apply inappropriate treatments. If meaningful

analytic descriptions of the diagnostic and treatment-planning processes

can be developed, these models can assist educators in training new prac-

titioners, researchers in evaluating and disseminating new developments,

and practitioners in improving the quality of patient care [4].

Developing models of the diagnostic-classification and treatment-

planning process requires an understanding of the underlying physiological

processes of diseases and the mechanisms of their cures. Obviously, the

effects of disease and the means of cure vary from one health-care prob-

lem to another. Thus, modeling efforts in diagnosis and treatment plan-

ning must be integrally related to the facet of health care that is under







study. This reality prohibits the model builder from making broad state-

ments about the applicability of his models to other health-care environ-

ments. Accordingly, the models developed in this dissertation are spe-

cifically oriented toward the health-care problem presented in Section

1.1 with the understanding that the results of this modeling effort may

not be applicable to the whole of health-care diagnosis and treatment

planning.


1.1 Craniofacial Pain

The head and face are subject to chronic, persistent,
or recurrent pain more often than any other portion of the
body. Pain in the head or face has a greater significance
to patients than any other pain. It may arouse fears that
the patient is in danger of losing his mind or that he has
a tumor of the brain. In addition, the emotional state of
the patient is adversely influenced because it is generally
known by the layman that the profession's knowledge of the
causes of these pains is meager and that methods of treat-
ment are inadequate [5, p. v].

H. Houston Merritt, M.D., Dean
Columbia University College of
Physicians and Surgeons

One source of the pain Dr. Merritt describes is dysfunction of the

temporomandibular joint. The temporomandibular joint, see Figure 1,

provides the articulation between the mandible and the cranium. This

joint is unique both in its structure and its function. Within the plane

of the temporomandibular joint, lateral, vertical and pivoting motion is

permitted. In addition, the joint is the point of articulation for the

only articulated complex that contains teeth. With this joint, "motion

is directed more by the musculature and less by the shape of the artic-

ulating bones and ligaments than is the fact for other joints" [5, p. 34].

The fact that joint motion is highly dependent on musculature implies

that when mandibular dysfunction occurs there is some disturbance









FIGURE 1

TEMPOROMANDIBULAR JOINT

[Figure: right temporomandibular articulation, with an inset labeling the anatomical features of the joint: mandibular fossa, articular eminence, meniscus, and mandibular condyle.]








of the intricate neuromuscular mechanisms controlling mandibular move-

ment [5]. Emotional tension may also lead to hypertonicity of the

striated masticatory muscles resulting in facial pain or altered sensa-

tion without evidence of peripheral dysfunction. In addition, abnormal

occlusal contacts of the teeth may affect muscle tonicity resulting in

mandibular dysfunction [5]. Moreover, the temporomandibular joint is

prone to disorders common to all joints: rheumatoid arthritis, osteo-

arthritis, traumatic injuries, neoplasms, and nonarticular disorders.

Although the term 'craniofacial-pain' is a broad classification for pain

in the head and face, the term is used in this dissertation to describe

pathological, congenital, hereditary-based, or emotional causes of pain

in and around the temporomandibular joint.

Though the degree of severity may vary, one or more of the following

four 'cardinal symptoms' are exhibited by the craniofacial-pain patient:

pain, joint sounds, limitation of motion, and tenderness in the masticatory

muscles [6]. Accompanying these symptoms the patient may complain

of, or the practitioner may find, hearing loss, burning sensations, mi-

graine-like headaches, vertigo, tinnitus, subluxation, luxation, dental

pulpitis, sinus disease, glandular disorders, occlusal disharmony, and

radiographic evidence of joint abnormality. The degree of association

of these additional symptoms and findings with the etiology of the joint

disorders is subject to considerable variation.

Paralleling these areas of anatomic dysfunction is the possibility

that the craniofacial-pain patient may be suffering from psychic dis-

orders. In no other type of patient seen by the dentist does psychic

condition play a larger role [7]. Most craniofacial-pain patients have

symptoms or signs of anxiety, and a sensory preoccupation with the oc-








clusion of their teeth [8]. Many of these patients can be characterized

by a heavy reliance on denial, repression, and projection of their psy-

chic disorders in order to maintain their self-concept of emotional sta-

bility [6]. Often the complaints these patients relate to the practi-

tioner are not compatible with any objective signs.

The practitioner who manages the care of craniofacial-pain patients

assumes a difficult task. For some of these patients, diagnosis is ob-

vious. Generally, however, the craniofacial-pain patient presents a complex

combination of signs and symptoms [7]. More than one disease en-

tity normally accounts for the patient's symptoms and most craniofacial-

pain patients suffer from a pain-dysfunction complex involving a combination

of masticatory muscle disorders, occlusal disharmony, emotional

tension, and anxiety [5]. Nevertheless the possibility of multiple

almost sub-clinical etiologic factors combining to produce the dysfunc-

tion and pain must be considered. The close relationship of organic and

emotional disorders as they appear in craniofacial-pain patients provides

the examining dentist with the problem of discriminating which factor is

primary in the etiology of the patient's dysfunction [7]. Unfortunately,

the temporomandibular joint is one of the most difficult areas of the

body to examine radiographically [8]. Hence, with these patients, the

dentist relies to a large degree on tests of emotional stability and

physical examination by visualization, palpation, and auscultation [7].

Therapeutic measures for the care of craniofacial-pain patients are

as varied as the factors contributing to the disorder. "A small percentage

of patients with symptoms referrable to the temporomandibular joint

will portray such a confusing picture that consultation with other dental

or medical specialists is indicated" [7, p. 129]. The majority of these










patients will exhibit symptoms that lead to any one of several alterna-

tive courses of patient care. Altering the occlusion of the natural

teeth is one means of treating craniofacial-pain patients. Although in

many cases minor occlusal abnormalities are only contributing factors to

a patient's pain, attention by the dentist to occlusion is at least

partially successful for a majority of craniofacial-pain patients [8].

However, it is important in early therapy not to alter the occlusion ir-

reversibly. Treatment by means of tooth extraction or endodontics, jaw

fixation, prosthetic devices, or by topical treatments may also be sug-

gested by the patient's symptoms. The articular surface of the mandib-

ular condyle has an excellent reparative capacity [6]. Thus, the use of

sedatives, antibiotics, and muscle relaxants, along with physical therapy,

often leads to patient 'cures' as these treatments ease the patient's pain

and increase jaw mobility while natural restoration of the joint is in

progress. If, after a reasonable length of time (3 to 6 months) the pa-

tient's symptoms are not relieved, the dentist may consider referral to

another source of care or therapy such as surgery [7].

Typically, the health-care process for craniofacial-pain patients

may be viewed as following the format of Figure 2 [9]. When a patient

is admitted into the care system, he undergoes a data-collection process.

This involves taking a 'full and pertinent' patient history and a phys-

ical examination of the areas of discomfort. The data gathered consist

of symptoms, signs, medical and/or dental history, physical examination

findings, psychosocial information, and so forth. Once these elements

have been elicited, a diagnosis is attempted. If this is not yet pos-

sible, the severe symptoms are treated and the patient's health state is

monitored.



























































FIGURE 2

DIAGNOSTIC-CLASSIFICATION AND TREATMENT-
PLANNING PROCESS FOR CRANIOFACIAL PAIN








When initial treatment does not result in a 'cure' for the cranio-

facial-pain patient, treatment effects are evaluated and new data col-

lected. When a patient's diagnostic classification leads to a course

of treatment that is not within the realm of the practitioner's specialty,

he is referred to a more appropriate care source. Monitoring is continued

on those patients not rejected from the system at this point, and

the patient is discharged when he is symptom-free. However, when other

disorders have been isolated during the course of treatment, the patient

is recycled through the classification-treatment process.

The diagnosis-treatment sequence is not fixed. Treatment can begin

prior to a diagnostic classification or treatment can follow a diagnosis.

Moreover, there may be many diagnostic-treatment data-acquisition cycles

before the patient is considered 'well.'


1.2 Research Objective

The introductory discussion of the need for diagnostic and treatment-

planning models, and the brief description of the craniofacial-pain care

system, provide the setting for a statement of the research objective un-

derlying this dissertation. This objective is to derive analytic repre-

sentations of the decision processes involved in selecting diagnostic

classifications and planning treatments for craniofacial-pain patients.

A diagnostic-classification model that duplicates the classification of

expert practitioners is sought. For treatment planning, the modeling

goal is to provide a structure for interaction of the critical considera-

tions associated with the treatment-selection process. These analytic

representations will be structured to permit their application as teaching

devices in the training of dental practitioners, as methods of testing the

effects of new diagnostic tools and treatment applications, and as aids to

the practice of dentistry.







This research objective will be met by developing:

1. A diagnostic-classification model based on the theory

of non-parametric pattern classification, with

a. criteria for applicability of the modeling technique

to diagnostic classification

b. model validation for craniofacial-pain patients

c. development of a minimum-cost symptom-selection

algorithm

2. A Markovian representation of the treatment-selection

process, with

a. justification for utilizing a Markovian model of

the underlying care system

b. model validation for craniofacial-pain patients

3. A description of potential model applications in teaching,

research, and practice.


1.3 Dissertation Overview

In Chapter 1 the motivation and scope of this dissertation were presented.

Chapter 2 provides a review of literature relevant to the diagnostic

and treatment-selection processes. A model of the diagnostic-classification

process is developed in Chapter 3. Chapter 4 follows

with an analytic representation of the treatment-planning process. Con-

clusions derived from this model-building effort, and suggestions for

future research, are presented in Chapter 5.















CHAPTER 2

PREVIOUS RESEARCH


Over three-hundred publications have been addressed to the problem

of modeling the diagnostic and treatment-planning process. Spanning

fourteen years, this research has considered such diverse problems as

the classification of liver biopsies [10] and the optimal plan for

treating mid-shaft fractures of the femur [11]. At least ninety-one

disorders have been utilized as environments for developing diagnostic

and treatment-planning models. The magnitude of this research effort

emphasizes the need for analytic representations of these complex deci-

sion-making processes.

Fortunately, the significant contributions in this voluminous

literature can be neatly partitioned into four distinct categories. Research

in diagnostic classification has been based either on the application

of Bayesian statistics or on the use of non-parametric pattern

classifiers. Treatment planning has been presented as either a finite-

horizon decision problem or as an application of decision analysis to a

Markov process of uncertain duration. This section presents a brief dis-

cussion of each of these categories and evaluates their suitability as

analytic representations of the process of providing health care for

craniofacial-pain patients.


2.1 Bayesian Classification Models

Bayesian diagnostic-classification models, such as [12, 13, 14,








15, 16], make a diagnosis on the basis of selecting a patient's 'most

probable' disease state. The Bayesian classifier is an elementary type

of parametric pattern-classification model. In general, parametric

classifiers make use of one or more of the statistical characteristics

of the dispersion of the data being classified to establish rules for

data classification. With the Bayesian models, only the conditional

probabilities for exhibiting sets of symptoms, given a particular disease,

are tabulated from past medical data. Then, utilizing Bayes'

theorem, the probabilities for the presence of alternate diseases

d_1, d_2, ..., d_n can be calculated as a function of the symptom-complex S

the practitioner observes in the patient. Bayes' theorem provides that

for each of the d_i,

    P(d_i \mid S) = C(S) \, P(S \mid d_i) \, P(d_i),

where

    C(S) = 1 \Big/ \Big[ \sum_{k=1}^{n} P(S \mid d_k) \, P(d_k) \Big];

hence, a patient with symptom-complex S is classified in disease-group i

if

    P(d_i \mid S) = \max_{k} P(d_k \mid S).

A survey of the results of application of Bayesian models is given in

Table 1.
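
The classification rule above can be illustrated with a minimal sketch. The disease labels, symptom-complex labels, and probability values below are hypothetical and serve only to exercise the formula; the function names `bayes_posteriors` and `classify` are likewise assumptions of this illustration and are not drawn from any of the cited models.

```python
from typing import Dict

def bayes_posteriors(priors: Dict[str, float],
                     likelihoods: Dict[str, Dict[str, float]],
                     s: str) -> Dict[str, float]:
    """Return P(d_i | S) for every disease d_i via Bayes' theorem.

    priors[d]         -- P(d), the prior probability of disease d
    likelihoods[d][s] -- P(S | d), tabulated from past data
    """
    # Numerators P(S | d_i) P(d_i) for each disease.
    joint = {d: likelihoods[d].get(s, 0.0) * priors[d] for d in priors}
    # The sum of the numerators is 1 / C(S).
    total = sum(joint.values())
    if total == 0.0:
        raise ValueError("symptom-complex has zero probability under every disease")
    return {d: p / total for d, p in joint.items()}

def classify(priors, likelihoods, s) -> str:
    """Pick the disease-group with the maximum posterior probability."""
    post = bayes_posteriors(priors, likelihoods, s)
    return max(post, key=post.get)

# Hypothetical two-disease, two-complex example (illustrative values only).
priors = {"d1": 0.7, "d2": 0.3}
likelihoods = {
    "d1": {"S_a": 0.2, "S_b": 0.8},
    "d2": {"S_a": 0.9, "S_b": 0.1},
}
posteriors = bayes_posteriors(priors, likelihoods, "S_a")
diagnosis = classify(priors, likelihoods, "S_a")
print(diagnosis, posteriors)
```

In this hypothetical case the less prevalent disease is selected, because the observed symptom-complex is much more likely under it than under the alternative.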

Although the percentage of correct diagnoses in most of these test

applications is high, there are several reasons why a Bayesian diagnostic

model is not used as the means of generating diagnostic classification

in this dissertation. The first reason is the difficulty in acquiring

the proportional presence of alternate diseases P(d_i), i=1,2,...,n,

in the population of patients that are to be classified by the model.

These 'prior' probabilities of having a particular disease are a function









TABLE 1

SURVEY OF DIAGNOSTIC-CLASSIFICATION MODELS


Bayesian Classifiers

Reference    Disease Group       Number of Patients    % Correct Patient
Number                           in Study              Diagnoses

[12]         Nontoxic Goiter      88                    85.3
[13]         Bone Tumor           77                    77.9
[14]         Thyroid             268                    96.3
[15]         Congenital Heart    202                    90.0
[16]         Gastric Ulcer        14                   100.0


Non-Parametric Classifiers

Reference    Disease Group       Number of Patients    % Correct Patient
Number                           in Study              Diagnoses

[17]         Liver                52                    98.1
[18]         Asthma              230                    90.0
[19]         Hematologic          49                    93.9
[20]         Thyroid             225                    96.0








of seasonal variation, geographic location, population demography, and

many other factors. Secondly, valid Bayesian analysis requires the

analyst to determine the dependence among exhibited symptoms for each

disease considered by the diagnostic model. In this respect, the prob-

abilities for the presence of groups of symptoms are independent for

some diagnostic alternatives and strongly correlated for others [4]. The

third reason for not selecting a Bayesian model is the massive storage

requirement dictated by the necessity of keeping the set of conditional

probabilities. These conditionals, P(S | d_i) for every observable symptom-complex

S and every disease i considered, must be at hand each time the

model is used. For example, given ten alternate diseases and ten symptoms

for which no assumptions of between-symptom independence can be made,

storage is required for 10(2^10 - 1), or 10,230, conditional probabilities.
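
The storage count in this example can be checked with a short calculation. The sketch assumes that each symptom is recorded simply as present or absent and that the complex with no symptoms present is excluded, which is one reading consistent with the 2^10 - 1 figure; the function name is hypothetical.

```python
def dependent_table_size(n_diseases: int, n_symptoms: int) -> int:
    """Count the conditionals P(S | d_i) needed when no independence is assumed.

    Assumption: each symptom is present or absent, and the complex with
    no symptoms present is excluded, leaving 2**m - 1 observable
    symptom-complexes for m symptoms.
    """
    n_complexes = 2 ** n_symptoms - 1
    return n_diseases * n_complexes

size = dependent_table_size(10, 10)
print(size)
```

Because the count grows exponentially in the number of symptoms, each additional symptom roughly doubles the storage requirement.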


2.2 Non-Parametric Classification Models

Non-parametric diagnostic models, like [17, 18, 19, 20], utilize

non-parametric pattern classifiers, a form of pattern recognition modeling.

In the literature on pattern recognition, the term 'non-parametric'

implies that no form of probability distribution is assumed for the

dispersion of symptom data in establishing the rules for pattern classification.

These models do assume, however, that classes of symptom data

are distinct entities and, hence, a patient with a particular set of

symptoms S cannot simultaneously occupy more than one diagnostic state.

That is, the models assume a deterministic classification for each pattern

viewed by the pattern classifier where every observable pattern has

one, and only one, correct classification.

Non-parametric modeling permits the analyst to bypass the difficult

problems of explicitly determining the conditional probabilities for,

and the dependence among, symptoms that are required for Bayesian analysis.

With the non-parametric classifier, a diagnosis is generated for the

practitioner by evaluating a discriminant function associated with each

diagnostic classification, g_i(.), i=1,2,...,n. As was the case with the

Bayesian models, the values of these discriminants are a function of the

symptom-complex S exhibited by the patient. The patient's diagnostic

classification corresponds to that disease whose associated discriminant-function

value is maximum. That is, a patient with symptoms S is classified

in disease-group i if

g_i(S) > g_k(S) for all k ≠ i.

Results from some of the applications of pattern-recognition classifiers

are presented in Table 1. In these test applications diagnostic

accuracy was consistently high. Because of these models' ease of implementation

and small storage requirements, a non-parametric pattern classifier

is preferable as a vehicle for generating diagnostic classifications.

The use of a non-parametric classifier is further motivated by features

of the care process for craniofacial-pain patients discussed in Chapter 3.


2.3 Finite-Horizon Treatment Planning

In the realm of research on modeling the treatment-planning process,

several authors [9, 21, 22] have presented schemes for analysis that

utilize methods for making decisions under risk and uncertainty. The

treatment-selection process has alternately been defined as a two-person

zero-sum game, structured as a decision tree, and modeled as a Markov

process of limited duration. Treatment costs and the 'costs' of occupying

'non well' or terminal patient states provide the basis for selecting

an 'optimal' treatment plan. Finiteness of the planning horizon is

assured either by establishing a maximum permissible number of treatment







applications, or by considering at any stage of analysis the effects of

a fixed number of future treatments. Validation of the decisions gen-

erated by these models has thus far been limited to checks on the feasi-

bility of the treatment regimens selected. Unfortunately, the finite-

horizon models either do not consider the possibility of a patient's

prolonged stay in the health-care system, as is the case of the models

with a maximum number of possible treatments, or, where only a fixed

number of future treatments is considered, they provide no more than a

heuristic treatment-selection procedure.


2.4 Uncertain-Duration Treatment Planning

Bunch and Andrew [11] have considered the possibility of prolonged

occupation of the same diagnostic state during the course of a patient's

progression through the care system. In their Markovian representation

of the care system for mid-shaft fractures of the femur, they provide

this modeling refinement. As a consequence of this modification, the

number of treatment decisions made for each patient is a random variable

with no fixed upper bound. Howard's iterative scheme for policy selec-

tion [25] provides the means for choosing the optimal treatment regimen

by selecting treatment alternatives that maximize the relative 'value'

of occupying each disease state. Although the Bunch and Andrew model did

not consider return visits to the same disease state, a more generalized

Markovian representation could incorporate that possibility. Neverthe-

less, the proximity to reality that this category of transient Markovian

models provides requires considerable effort as holding-time distribu-

tions, treatment 'costs,' and transition probabilities must be supplied

by the analyst for all treatment alternatives at each of the disease

states in the care system.








The data collected on craniofacial-pain patient progressions

through the care system reveal that both prolonged occupation of a

single diagnostic state and return visits to the same state occur fre-

quently. Moreover, as will be discussed in Chapter 4, there are several

characteristics of the craniofacial-pain care system that permit reduc-

tions in the number of input parameters required for a transient Markovian

model of this system. Therefore, an uncertain-duration transient

Markovian representation of the health-care process has been selected as

the means of evaluating the effectiveness of alternative treatment regi-

mens on patients with craniofacial pain.













CHAPTER 3

DIAGNOSTIC CLASSIFICATION


The analytic model developed to provide diagnostic classifications

for craniofacial-pain patients is based on the principles employed in

non-parametric pattern classification. The patterns classified by this

diagnostic model are vector representations (see Section 3.1 and Appen-

dix A) of the craniofacial-pain patient's physical and emotional status.

In the first sections of this chapter the theoretical background for the

diagnostic model is established. This discussion is followed by a pre-

sentation of the validation procedures used to evaluate model perfor-

mance. Next, an algorithm is developed to reduce the 'costs' associated

with model utilization. The chapter closes with a discussion of poten-

tial applications of the craniofacial-pain diagnostic classifier in

teaching, in research, and in the health-care process.

3.1 Model Components

In the initial phase of the development of the diagnostic-classi-

fication model a set of possible alternative diagnostic classifications

was established for craniofacial-pain patients. Figure 3 provides a

list of these possible classifications. Note that the alternative classi-

fications in Figure 3 are not mutually exclusive as a craniofacial-pain

patient classified in some diagnostic alternative 'A' could also have

the disorder specified by some other diagnostic alternative 'B.'

However, for the purposes of this dissertation, each patient's diagnostic








1. Temporomandibular Joint Arthritis--Developmental

2. Temporomandibular Joint Arthritis--Infectious

3. Temporomandibular Joint Arthritis--Osteo (Degenerative)

4. Temporomandibular Joint Arthritis--Traumatic (Acute)

5. Temporomandibular Joint Arthritis--Traumatic (Chronic)

6. Myopathy--Acute Trauma

7. Myopathy--Myositis

8. Oral Pathology--Dental Pathology

9. Vascular Changes--Migrainous Vascular Changes

10. Myofacial Pain-Dysfunction Malocclusion-Balancing Interferences

11. Myofacial Pain-Dysfunction Malocclusion-Lateral Deviation of Slide

12. Myofacial Pain-Dysfunction Malocclusion-Uneven Centric Stops

13. Myofacial Pain-Dysfunction Psychoneurosis-Anxiety/Depression

14. Myofacial Pain-Dysfunction Bruxism

15. Myofacial Pain-Dysfunction Reflex Protective Muscular Contracture

16. Myofacial Pain-Dysfunction Loss of Posterior Occlusion

17. Neuropathy







FIGURE 3

CRANIOFACIAL-PAIN DIAGNOSTIC ALTERNATIVES











classification is made on the basis of specifying that etiological fac-

tor that requires most immediate action on the part of the attending

practitioner. Thus, diagnostic classification of a patient into diag-

nostic alternative 'A' signals that the etiology specified by that al-

ternative should determine the course of the patient's care.

The next step in model development isolated relevant data which

measured the physiological and psychological status of craniofacial-pain

patients. In particular, this step of model development sought those

elements of patient status that practitioners employ in their own classi-

fication of craniofacial-pain patients. Appendix A presents a list of

these data elements. Wherever it was feasible, measures of patient

status were segmented to amplify the significance of particular readings

of each measure. Thus, for example, while the duration of a patient's

pain is a continuous measure of his status, it is important for the pur-

poses of classification to know whether a craniofacial-pain patient's

duration of pain is less than 3 weeks, from 3 to 6 weeks, or longer than

6 weeks. For this measure of patient status, a short history of pain

indicates a strong possibility of a recent traumatic injury while pain

over a long period is more likely associated with long-standing arthritic

or psychic disorders.

To facilitate the development of an analytic model of the diagnostic-

classification process, a vector representation of the relevant elements

of patient data has been developed. The vector permits the notation of

any of the data elements shown in the listing in Appendix A. The pre-

sence of any of the items found in Appendix A is recorded in a patient's

data vector by an entry of '1' in the vector-dimension corresponding to

the item number, while the absence of a vector item is noted by a '0'








data-vector entry. For example, referring to the listing in Appendix A,

a male patient would have the following fifth, sixth, and seventh elements

in his data vector

(...,1,0,0,...),

while a pre-menopausal female would have the series of elements

(...,0,1,0,...).

This vector notation of a patient's status serves as the input data for

a non-parametric pattern classifier that assigns a diagnostic classification

to the patient's dysfunction.
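
As an illustration of this encoding, the following Python sketch builds a 0/1 data vector from a list of Appendix A item numbers. It is only a sketch under stated assumptions: items 5 and 6 follow the male and pre-menopausal female example above, while the starting item number passed to `pain_duration_item` is hypothetical, since Appendix A is not reproduced in this section.

```python
N_ITEMS = 295  # number of Appendix A items cited in this chapter


def make_data_vector(present_items):
    """Return a 0/1 list; element k-1 is 1 when Appendix A item k is present."""
    vector = [0] * N_ITEMS
    for item in present_items:
        if not 1 <= item <= N_ITEMS:
            raise ValueError(f"item number {item} is out of range")
        vector[item - 1] = 1
    return vector


def pain_duration_item(weeks, first_item):
    """Map a continuous duration onto one of three consecutive items:
    less than 3 weeks, 3 to 6 weeks, longer than 6 weeks.
    `first_item` is a hypothetical item number for the first segment."""
    if weeks < 3:
        return first_item
    if weeks <= 6:
        return first_item + 1
    return first_item + 2


male = make_data_vector([5])
pre_menopausal_female = make_data_vector([6])
print(male[4:7])                   # [1, 0, 0]
print(pre_menopausal_female[4:7])  # [0, 1, 0]
```

Segmenting a continuous measure in this way turns one reading into several mutually exclusive binary items, so that only one of the three duration items is set for any patient.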

Non-parametric pattern classification, as described in Meisel [23]

and Nilsson [24], is the process of creating decision surfaces that

separate patterns into homogeneous classes, C_i, i=1,2,...,p, specified

by the analyst. In the craniofacial-pain diagnostic model, the C_i are

the diagnostic alternatives shown in Figure 3. Classification of a pattern

(a patient's data vector) into one of the classes is performed by

a pattern classifier composed of a maximum detector and a set of discriminant

functions. These discriminants, g_j(a), j=1,2,...,p, are single-valued

functions of each patient's data vector a. If a_i represents a

data vector for a patient whose correct diagnostic classification is the

ith diagnostic alternative, then the g_j(a) are chosen so that

g_i(a_i) > g_j(a_i),  i, j=1,2,...,p, j ≠ i.

The craniofacial-pain classifier uses linear discriminant functions.

These discriminants are linear in the sense that they provide mappings

from E^n to E^1 that exhibit the form

g_j(a) = a_1 w_{j1} + a_2 w_{j2} + ... + a_n w_{jn} + w_{j(n+1)}

where in the patient-data-vector a, the value of a_r denotes the presence

(a_r = 1) or absence (a_r = 0) of patient-data-vector item r; and the

w_{jk}, k=1,2,...,n+1, are constants associated with the jth discriminant

function called 'weights.' These discriminant-function weights,

w_{jk}, j=1,2,...,p, k=1,2,...,n+1, provide an analytic means of duplicating

the correct classification of each pattern observed by the non-parametric

classifier. They provide a link between a pattern's correct classification

and the individual components of the pattern's vector representation.

In essence, each discriminant's weights are additive elements

whose component sums have significance in terms of isolating a pattern's

correct classification. These weights are a mathematical means of storing

information already known about the correct classification of observed

pattern vectors. Moreover, the weights can be interpreted from the

point of view of the significance that the practitioner places on each

data-vector component. A discussion of this interpretation of the

discriminant-function weights appears in Section 3.2.

Central to the use of linear discriminant functions is the assump-

tion that the space of observable patient data vectors is linearly

separable, for by definition [24],

a pattern space A is linear and its subsets of patterns

A_1, A_2, ..., A_p are linearly separable if and only if linear

discriminant functions g_1, g_2, ..., g_p exist such that

g_i(a) > g_j(a) for all a in A_i,

for all i=1,2,...,p, j=1,2,...,p, j ≠ i.

In the context of diagnostic classification, the assumption of linear

separability implies that there exists a set of hyperplanes that parti-

tion the space of observable patient data vectors into convex homogeneous

regions, each region representing a unique diagnostic classification.








Rosen [26] has provided a restatement of this assumption in the require-

ment that the sets of data vectors corresponding to each diagnostic al-

ternative have non-intersecting convex hulls. In either form, this is

a fairly restrictive assumption on the dispersion of patient data vec-

tors (see Section 3.2).

Selecting the 'weights' for each of the discriminant functions is

a process known as 'training.' For the linear non-parametric classifier,

training generates each discriminant function's wjk's by applying a sys-

tematic algorithm to the members of a set of representative patterns with

pre-established classifications. Nilsson [24] discusses several algorithms

suitable for training the craniofacial-pain diagnostic classifier. In

the course of using these algorithms for model development, a new 'mod-

ified fixed-increment' training algorithm was constructed (see Appendix

B). Employing the new algorithm has resulted in a reduction of approx-

imately 35% in the amount of training time required to derive the weights

for the craniofacial-pain classifier.
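
For orientation, a plain fixed-increment error-correction rule for a multi-class linear classifier can be sketched in Python as follows. This is not the modified algorithm of Appendix B, which is not reproduced here; it is a minimal sketch of the standard rule of the kind Nilsson [24] describes, and the function name, the `max_passes` limit, and the toy patterns are assumptions made for illustration.

```python
def train_fixed_increment(patterns, labels, num_classes, max_passes=1000):
    """patterns: augmented 0/1 vectors (last element fixed at 1).
    labels: 0-based class index of each pattern.
    Returns one weight vector per class once a full pass makes no error."""
    dim = len(patterns[0])
    weights = [[0] * dim for _ in range(num_classes)]
    for _ in range(max_passes):
        errors = 0
        for a, i in zip(patterns, labels):
            scores = [sum(x * w for x, w in zip(a, wj)) for wj in weights]
            # Strongest competing class for this pattern.
            rival = max((j for j in range(num_classes) if j != i),
                        key=scores.__getitem__)
            if scores[rival] >= scores[i]:
                # Error: reinforce the correct class, penalize the rival.
                errors += 1
                for k, x in enumerate(a):
                    weights[i][k] += x
                    weights[rival][k] -= x
        if errors == 0:
            return weights
    raise RuntimeError("no separating weights found within max_passes")


# Toy, linearly separable training set (hypothetical data).
toy_patterns = [[1, 0, 0, 1], [0, 1, 0, 1], [0, 0, 1, 1]]
toy_labels = [0, 1, 2]
toy_weights = train_fixed_increment(toy_patterns, toy_labels, num_classes=3)
```

The loop stops only when every training pattern satisfies the strict inequality among discriminant values, which is the sense in which a set of weights is called feasible in this chapter. If the patterns are not linearly separable the loop never reaches an error-free pass, hence the pass limit.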

Symbolically, the craniofacial-pain diagnostic classifier, with its

set of trained weights, can be represented in the following format:

let a_i = the 296-dimension data vector describing patient 'i'

a_{ik} = the kth element in the data vector describing patient
'i', whose value is either zero or one, k=1,2,...,295
(by definition a_{i,296} = 1)

C_j = diagnostic alternative 'j', j=1,2,...,17

d_{ij} = the value of the discriminant function for diagnostic
alternative 'j' generated by the data vector of patient 'i'

W_j = the 296-dimension vector of weights associated with
diagnostic alternative 'j'

w_{jk} = the kth element in the weight vector W_j,

that is

a_i = [a_{i1}, a_{i2}, ..., a_{i295}, 1]

W_j = [w_{j1}, w_{j2}, ..., w_{j295}, w_{j296}]

and

d_{ij} = a_i W_j^T = \sum_{k=1}^{296} a_{ik} w_{jk}

where T denotes vector transposition. Patient 'i' is classified in

diagnostic alternative C_j when d_{ij} > d_{is} for every s ≠ j. If max_j d_{ij} is

not unique, then it is not yet possible to classify patient 'i' into

one of the diagnostic alternatives. Treatment is prescribed for severe

symptoms and classification is attempted at a later date.
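
The classification step defined above, including the case of a non-unique maximum, can be sketched in Python as follows. The function names and the small weight vectors are hypothetical; the real classifier uses 17 weight vectors of dimension 296.

```python
def discriminant_values(a, weight_vectors):
    """d_ij = sum over k of a_ik * w_jk, for every diagnostic alternative j."""
    return [sum(x * w for x, w in zip(a, wj)) for wj in weight_vectors]


def classify_patient(a, weight_vectors):
    """Return the 1-based alternative j with the unique largest d_ij,
    or None when the maximum is tied and classification must be deferred."""
    d = discriminant_values(a, weight_vectors)
    top = max(d)
    winners = [j for j, value in enumerate(d, start=1) if value == top]
    return winners[0] if len(winners) == 1 else None


# Hypothetical weights for two alternatives over three items plus the
# constant final element, which is always 1.
example_weights = [[2, -1, 0, 0], [-1, 2, 0, 0]]
print(classify_patient([1, 0, 0, 1], example_weights))  # 1
print(classify_patient([1, 1, 0, 1], example_weights))  # None (tie)
```

Returning `None` for a tie mirrors the rule that such a patient is not yet classified.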

Data from four sources were used to construct and verify the diag-

nostic-classification model, as well as the treatment-planning model

presented in Chapter 4. Contributions of clinical records came from

the dental schools at the universities of California at Los Angeles,

Florida, Illinois, and Indiana. In all, the records of 250 patients,

involving a total of 480 patient-practitioner interactions, form the

data base for model building and validation. The relevant information

from each of these patient visits has been recorded in the data-vector

format of Appendix A. A diagnostic classification from Figure 3 was

assigned to each of these patient data vectors by either Dr. Thomas B.

Fast, Chairman of the Division of Oral Diagnosis, or by Dr. Parker E.

Mahan, Chairman of the Department of Basic Dental Sciences, at the

College of Dentistry, University of Florida.







With this basic structure for the diagnostic-classification model,

the classified patient data vectors, and the training algorithm presented

in Appendix B, an initial test was performed to verify that the space of

observed patient data vectors was separable by linear discriminant func-

tions. Application of the modified fixed-increment training algorithm

to the set of 480 data vectors verified this requirement, as the algo-

rithm terminated in a set of feasible discriminant-function weights.

Using the discriminant functions these constants determine, it is possi-

ble to duplicate the pre-established diagnostic classifications for each

of the patient data vectors.

This first test of the diagnostic classifier established that a non-

parametric classifier could be employed to reproduce the original clas-

sifications for each data vector used in model construction. However,

this test does not reveal how well the classification model will perform

on patient data not employed in developing the discriminant-function

weights. The remainder of this section, and Section 3.3, address the

question of how the diagnostic classifier performs on 'new' patient data

vectors, that is, vectors that have no duplicate in the training sample.

Model training has created a set of weights that, by the definition

of the training procedure, correctly classify every patient data vector

that lies within the bounds of the training-sample pattern-class convex

hulls. Since every data vector is a binary vector, new patient data

vectors must fall outside the convex hulls established by the training-

sample vectors. Yet, if new data vectors have a number of data-vector

elements that are identical to those of the training-sample vectors

with the same diagnostic classification, then this relationship will be

reflected in a 'close proximity,' as measured by a Euclidean-distance







function, between each new vector and its associated training-sample

convex hull. Given this close proximity, the classifier's discriminant

functions should correctly classify most new data vectors as these vectors

will lie within or near the boundaries of the appropriate discriminating

hyperplanes. Hence, the key to providing adequate classifier

performance for new data vectors lies in devising data-vector representations

of patient data for which the data vectors of a common diagnostic

classification exhibit strong similarity.
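
The notion of proximity used here can be made concrete with a short Python sketch. As a simplifying assumption, it measures the Euclidean distance from a new vector to the nearest training vector of a class rather than to the class's convex hull; because the hull contains the training vectors, this value is an upper bound on the distance to the hull. The function names are illustrative.

```python
import math


def euclidean_distance(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def nearest_training_distance(new_vector, training_vectors):
    """Distance from a new vector to the closest vector of one class."""
    return min(euclidean_distance(new_vector, t) for t in training_vectors)


# For 0/1 vectors the squared distance equals the number of items that differ.
u = [1, 0, 1, 0]
v = [1, 1, 0, 0]
print(euclidean_distance(u, v) ** 2)  # approximately 2: two items differ
```

For binary data vectors, then, "close proximity" amounts to sharing most data-vector elements with training vectors of the same classification.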

In the introductory discussion of the elements of patient data used

in the patient data vector, it was pointed out that an effort was made to

select components of patient status that assist the practitioner in his

selection of diagnostic classifications for a craniofacial-pain patient.

Then these elements were partitioned to generate as much discriminating

information as possible from each data element. In terms of the alter-

nate diagnostic classifications, these elements of patient data were

chosen so that all patients in any one diagnostic classification would

have a unique combination of exhibited or non-exhibited data-vector ele-

ments. Employing these carefully constructed qualitative data elements

resulted in a set of 'natural' gaps in the vector representations of

patient data from alternate diagnostic classifications. The fact that

there are portions of the pattern space that cannot be occupied by any

data vector, and partitions of the space where the vectors of each classification

must lie, assists the classifier in making correct classifications

of data not used in model construction.

As Section 3.3 shows, this discussion is not meant to imply that the

craniofacial-pain diagnostic classifier can, in its present state of

development, correctly classify every new data vector. What has been








stated is that a knowledge of the underlying classifying process can

be employed in constructing the data vector examined by the classifier,

and that fully utilizing this information will lead to a classifier that

can be expected to be capable of performing well on new patient data.

Of course, this discussion has been predicated on the separability of

the underlying pattern space of data vectors. If this requirement is

not met by some form of patient-data-vector representation, classification

of patients by a linear classifier is not possible.

The next section of this chapter provides relationships between

linear separability and the data that may be observed in a health-care

system for which diagnostic classification by linear discriminants is

being considered. This section has a dual purpose. First, linear separability

is couched in 'non-geometric' terms. Second, and more importantly,

using the craniofacial-pain health-care system as an example

of the section's developments provides information about the suitability

of the non-parametric classifier as a model of the decision-making pro-

cess associated with diagnostic classification in this care system.

3.2 Alternative Interpretations of Linear Separability

The criteria for pattern space separability are mathematically

concise. Unfortunately, these separability criteria are not readily

expressible in non-geometric terms. The discussion developed in this

section provides the reader with some non-geometric criteria that indicate

when the use of a non-parametric pattern classifier should be considered

as a means of generating diagnoses for a medical or dental disorder.

The first criterion is associated with a probabilistic measure of

symptom exhibition. Given a patient who exhibits some set of symptoms

S, non-parametric pattern classification requires that P[S|C_j] = 1 for

the diagnostic alternative 'C_j' that describes the patient's current

diagnostic status, and P[S|C_k] = 0 for all other diagnostic alternatives

'C_k.' However, assume that for the disorder in question the probability

of exhibiting any relevant symptom has been calculated from historical

data, that is, estimates of P[s_i|C_j] are available for all relevant

symptoms s_i and all diagnostic alternatives C_j. Then, if the following

decision rule leads to the correct classification of a majority of the

patients with the disorder in question, utilization of a non-parametric

classification model should be investigated:

classify a patient who exhibits the set of symptoms S in the

jth diagnostic alternative if

\prod_{s_i \in S} P[s_i|C_j] > \prod_{s_i \in S} P[s_i|C_k] for all k ≠ j. (1)

Since (1) holds if and only if

log [\prod_{s_i \in S} P[s_i|C_j]] > log [\prod_{s_i \in S} P[s_i|C_k]] for all k ≠ j,

decision rule (1) can be expressed in terms of logarithms. Let the set

of symptoms S be represented as a row vector a with the elements of a

assigned values as follows:

a_i = 1 if symptom s_i is an element of S

and a_i = 0 if symptom s_i is not an element of S, i=1,2,...,n,

where n is the total number of relevant symptoms. Form the column vectors

W_j = [log P[s_1|C_j], log P[s_2|C_j], ..., log P[s_n|C_j]]^T.

Then log [\prod_{s_i \in S} P[s_i|C_j]] = a W_j, and decision rule (1) can be restated as

classify a patient who is characterized by the vector a in the

jth diagnostic alternative if

a W_j > a W_k for all k ≠ j. (2)

Note that decision rule (2) is identical to the decision rule employed

in non-parametric pattern classification.

This equivalence implies that if (1) holds for every preclassified

patient examined, the values log P[s_i|C_j] form a set of feasible

discriminant-function weights. If (1) leads to the correct classification of

a majority of the patients examined, it is logical to assume that there

may be a set of feasible discriminant-function weights. This assumption

was examined using the craniofacial-pain patient data. From the data

vectors classified in Diagnostic Alternatives 13, 14, and 15, a total of

189 patient visits, the P[s_i|C_j] were calculated. Each data vector was

then classified with decision rule (1), and 164 of the data vectors

(86.7%) were assigned to their pre-established diagnostic alternative.
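
The equivalence between decision rules (1) and (2) can be illustrated with a small Python sketch. The probability estimates below are invented for illustration and are not taken from the craniofacial-pain data; the sketch also assumes every estimate is strictly positive so that its logarithm exists, and it returns a 0-based class index.

```python
import math

# Hypothetical estimates: probs[j][i] stands for P[s_i | C_j], all > 0.
probs = [
    [0.9, 0.2, 0.5],
    [0.3, 0.8, 0.4],
]


def argmax(values):
    """Index of the largest value (first one if several are equal)."""
    return max(range(len(values)), key=values.__getitem__)


def rule_1(a, probs):
    """Decision rule (1): compare products over the exhibited symptoms."""
    products = [math.prod(p for x, p in zip(a, row) if x == 1) for row in probs]
    return argmax(products)


def rule_2(a, probs):
    """Decision rule (2): compare a W_j, with W_j holding log P[s_i | C_j]."""
    scores = [sum(x * math.log(p) for x, p in zip(a, row)) for row in probs]
    return argmax(scores)


a = [1, 0, 1]  # symptoms s_1 and s_3 exhibited
print(rule_1(a, probs), rule_2(a, probs))  # 0 0
```

Because the logarithm is strictly increasing, the two rules rank the alternatives identically, which is why the log-probabilities can serve as candidate discriminant-function weights.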

The second criterion provides a subjective measure of the feasibil-

ity of using a non-parametric pattern classifier. If symptoms for most

of the diagnostic alternatives, associated with the disorder of interest,

can be isolated such that

1. a patient's exhibition of a subset of these symptoms leads

the practitioner to a selection of one of the diagnostic

alternatives, or

2. a patient's exhibition of a subset of these symptoms leads

the practitioner to eliminate from further consideration








one of the diagnostic alternatives,

then the use of a non-parametric classifier as a means of generating

classifications should be investigated.

The linear non-parametric classifier employs a weighted sum of

the symptoms exhibited by each patient in its discriminating functions.

If symptoms can be isolated that are significant to the classification

of patients with the disorder under investigation, then there is a

'natural' weight for each of these symptoms in the decision-making process

used by the practitioner. The existence of these natural weights

increases the probability that a training algorithm will be able to find

a feasible set of discriminant-function weights. Indeed, the relative

importance of the significant symptoms may be reflected in the magnitude

of the discriminant-function weights generated by the application of a

training algorithm.

As an example, the significant symptoms associated with two cranio-

facial-pain diagnostic alternatives, Alternatives 4 and 14, were isolated

by Dr. Fast. A comparison of these symptoms and their associated

discriminant-function weights revealed a high degree of correlation between

symptom significance and discriminant-function weights (see Table 2).

The reader should note that both of the criteria discussed in this

section are heuristic approximations to the geometric requirement for

pattern space separability. However, if the disorder under investigation

meets one or both of these criteria, it may be possible to employ a non-

parametric classifier to diagnose the disorder since the requirement for

pattern space separability is most likely met.









TABLE 2

CORRELATION BETWEEN SIGNIFICANT SYMPTOMS
AND DISCRIMINANT-FUNCTION WEIGHTS


Diagnostic Alternative 4: Temporomandibular Joint Arthritis--Traumatic (Acute)

                                                 Discriminant-Function
Significant Symptoms                             Weights

(+) Duration of Pain (less than 3 weeks)          + 3
(+) History of Trauma (accidental)                +30
(+) Preauricular Pain                             +11
(-) Salivary Gland Disease                        -12
(-) Otitis                                          1

(discriminant-function weights for Diagnostic Alternative 4 range
from -19 to +37)


Diagnostic Alternative 14: Myofacial Pain-Dysfunction Bruxism

                                                 Discriminant-Function
Significant Symptoms                             Weights

(+) Duration of Pain (more than 6 weeks)          +15
(+) Facets                                        + 2
(+) Bruxism and/or Clenching                      +56
(-) History of Trauma (accidental)                -16
(-) Salivary Gland Disease                          5

(discriminant-function weights for Diagnostic Alternative 14 range
from -23 to +56)

Note: For both Diagnostic Alternatives
(+) indicates a symptom that leads the practitioner to classify
    a patient in that diagnostic alternative

(-) indicates a symptom that leads the practitioner to classify
    a patient in some other diagnostic alternative










3.3 Model Validation

Validation of the craniofacial-pain diagnostic-classification

model presented in Section 3.1 has been accomplished by three types of

validating procedures. The discussion presented in the preceding sections,

and in particular the relationship between significant symptoms

and their associated weights shown in Table 2, reveals a close proximity

between the decision-making process the practitioner utilizes and the

non-parametric classifier's symptom-weighing scheme. This section presents

two other procedures employed in evaluating the diagnostic

classification model's performance.

The first procedure involved testing the diagnostic accuracy of

the classification model on patient data that were not employed in model

construction. Six classification tests were run in sequential order.

In the first five of these tests random samples of 50 patient data vectors

were drawn from the data base of 480 vectors discussed in Section

3.1. Then, as each of the tests was performed, the training algorithm

in Appendix B was applied to the remaining 430 data vectors. With the

weights derived from the training algorithm, the sample of 50 patients

was classified. The model-generated classifications for each of the

data vectors were compared to the classifications assigned to the vectors

when they were created. As each test classification of a sample was

completed, the diagnostic classifier's discriminant-function weights were

set equal to zero, the sample of data vectors was returned to the data

base, and the next test's random sample was drawn. A summary of the re-

sults of these tests of diagnostic accuracy is presented in Table 3.

TABLE 3

TESTS OF DIAGNOSTIC CLASSIFIER ACCURACY


              Number of        Number of
              Patient          Data Vectors            Classifier
              Data Vectors     Correctly Classified    Accuracy

TEST ONE          50                 46                  92.0%
TEST TWO          50                 45                  90.0%
TEST THREE        50                 44                  88.0%
TEST FOUR         50                 47                  94.0%
TEST FIVE         50                 45                  90.0%
TEST SIX          51                 43                  84.3%

Mean Classifier Accuracy                      89.7%
Standard Deviation of Classifier Accuracy      3.5%


In each of the first five tests it was possible for a patient who

has had multiple practitioner-visits to have some of the vectors representing

these visits in a test's random sample and some vectors used

in model construction. Such occurrences lead to test results that over-

estimate classifier accuracy. Hence, in Test Six, a random sample of

all of the patient data associated with 40 patients (a total of 51

patient data vectors) was selected. This sample was classified by the

diagnostic-classification model using the remaining 429 data vectors as

a data base. The results of this test are included in the data shown

in Table 3. There is one other possible factor affecting the classifier's

accuracy as measured by these tests. It is conceivable that there were

duplicate data vectors in the data base of 480 patient-data-vectors.

If duplicates do exist and were included in both the test samples and

the samples' training bases, measures of classifier accuracy will be

overly optimistic. However, since 'noise' is introduced by the variabil-

ity among craniofacial-pain patients and generated in the practitioner's

transcribing of the elements of patient data into the data-vector format,

and since there are 2^295 possible data vectors, the probability that two

or more of the data-based patient vectors include an identical specifica-

tion of data-vector elements is small enough to justify neglecting this

possibility and its effects.
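
The sampling design of Test Six, in which all visits of a sampled patient are held out together, can be sketched in Python as follows. The record layout (patient identifier, data vector, classification) and the function name are assumptions made for illustration.

```python
import random


def split_by_patient(records, num_test_patients, seed=None):
    """records: list of (patient_id, data_vector, classification) tuples.
    Every visit of a sampled patient goes to the test set, so no patient
    contributes vectors to both the training base and the test sample."""
    rng = random.Random(seed)
    patient_ids = sorted({pid for pid, _, _ in records})
    test_ids = set(rng.sample(patient_ids, num_test_patients))
    test = [r for r in records if r[0] in test_ids]
    train = [r for r in records if r[0] not in test_ids]
    return train, test


# Hypothetical records: patient "p1" has two visits.
records = [
    ("p1", [1, 0, 1], 4), ("p1", [1, 1, 1], 4),
    ("p2", [0, 1, 1], 13), ("p3", [0, 0, 1], 15),
]
train, test = split_by_patient(records, num_test_patients=1, seed=0)
```

Sampling patients rather than individual visits removes the overlap that can inflate measured accuracy in the first five tests.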

The results summarized in Table 3 reveal that the diagnostic-classification

model performs well in duplicating the diagnostic classifications

originally assigned by the reviewing practitioners, Dr. Fast and

Dr. Mahan. Moreover, the size of the test samples was quite large in

relation to the data base employed in developing each test's diagnostic

model. As new data become available and are incorporated in the parameters

of the model, the accuracy of the craniofacial-pain diagnostic

classifier can be expected to increase slightly.








The second validating procedure established a measure of variability

on the diagnostic classifications that might be given by different dental

practitioners. The discussion presented in Section 1.1 related the dif-

ficulties associated with diagnosing craniofacial-pain disorders. Prac-

titioners with varying kinds of professional experience can be expected

to reflect their dissimilar backgrounds in differing diagnostic classi-

fications for these patients. To measure the variability associated with

dissimilar backgrounds, five craniofacial-pain data vectors were selected

from the data base employed in constructing the craniofacial-pain diagnostic

classifier. Four dentists from the staff of the College of Dentistry

at the University of Florida were asked to review these patient

data vectors and assign to each of them a diagnostic classification.

Table 4 summarizes their assignments and also includes the diagnostic

classification originally given by the reviewing practitioners.

The variability in diagnostic assignments reflected in Table 4 re-

affirms the justification for the research objectives set forth in

Section 1.2. Some of the differences in the practitioners' choices of

diagnostic classifications can be explained by the limited amount of

data contained in each of the data vectors, and the less-than-full med-

ical statement of each of the diagnostic alternatives. Nevertheless, a

diagnostic-classification model that generates classifications that are

in 90% agreement with those of experts in the field provides a sizeable

improvement over the variability in classification assignments exhibited

in Table 4 in which only half the respondents agreed on a single diag-

nosis in four out of five cases.








TABLE 4

CLASSIFICATION VARIABILITY AMONG DENTAL PRACTITIONERS



                         Diagnostic Classification for

                  Patient 1   Patient 2   Patient 3   Patient 4   Patient 5+

Original
Classification        4          13          15          15           9

Practitioner 1        1           7          15          15           3

Practitioner 2        6          12          15           8           3

Practitioner 3        4          15          15          15          13

Practitioner 4        4          15          15          14           *



* No classification given

+ Patient 5 exhibited a minimal amount of input data (only 17
non-zero data-vector entries)



These four dental practitioners exhibited 100.0% agreement of the
diagnosis on one of the five patients, and 50.0% agreement on the
diagnostic classification of the remaining four patients.
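
The agreement figures quoted under Table 4 can be checked with a few lines of Python. This is only an illustrative sketch: the dictionary below transcribes the four practitioner rows of the table, and the helper name `modal_agreement` is an assumption introduced here, not part of the original study. Agreement is taken to mean the share of the four practitioners who chose the most common classification, with a missing classification counted as a non-agreeing response.

```python
from collections import Counter

# Practitioner rows of Table 4; None marks "no classification given".
ASSIGNMENTS = {
    "Patient 1": [1, 6, 4, 4],
    "Patient 2": [7, 12, 15, 15],
    "Patient 3": [15, 15, 15, 15],
    "Patient 4": [15, 8, 15, 14],
    "Patient 5": [3, 3, 13, None],
}

def modal_agreement(labels):
    """Percent of all practitioners who chose the most common classification."""
    given = [label for label in labels if label is not None]
    top_count = Counter(given).most_common(1)[0][1]
    return 100.0 * top_count / len(labels)

agreement = {patient: modal_agreement(row) for patient, row in ASSIGNMENTS.items()}
for patient, pct in agreement.items():
    print(f"{patient}: {pct:.1f}%")
```

Under this reading, Patient 3 shows full agreement and each of the other four patients shows agreement by half of the respondents.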








3.4 Minimum-Cost Symptom-Selection Algorithm

The craniofacial-pain diagnostic-classification model detailed in

the previous sections of this chapter has been structured upon the data

vector of the 295 relevant signs, symptoms, and items of patient history

shown in Appendix A. To utilize this model, the practitioner must ex-

amine a patient for the presence or absence of each of these data vector

elements. Although the cost in time and fees varies from item to item,

there is an expense to the practitioner, and to the patient, associated

with checking each element in the data vector. Hence, it is logical to

investigate the possibility of finding a reduced data vector that 'costs'

less for the patient and practitioner to use and yet still permits correct

classification of all craniofacial-pain patients.

A review of the literature (see Meisel [23] Chapter 9 for a survey)

reveals that many authors have considered the task of selecting a set

of features to be used in a pattern-classification scheme. Traditional

methods of viewing this problem are based on a search for a transformation

that takes a given set of patterns into some 'new' pattern space

where separation by discriminant functions is possible. Measures of

pattern class separability are employed to evaluate the effects of

transforming the set of patterns from one space to another. In general,

these transformations take a pattern representation in 'n' features and

create a set of 'r' (r < n) 'new' features; the

'new' features are linear combinations of the original features. However,

to reduce the 'costs' associated with using the craniofacial-pain

diagnostic classifier, a transformation must be found that decreases

the size of the data-vector pattern space by eliminating features rather

than combining them. For example, assume patients were diagnosed on








the basis of body-temperature and blood-pressure readings. Traditional

techniques for feature selection might employ a linear combination of

body temperatures and blood pressure measurements as one 'new' feature.

The transformation sought in this investigation would lead to the clas-

sification of patients by either body temperature or blood pressure

alone if this were possible. This example will be used again in Section

3.4.1 to illustrate the algebraic and geometric structure of the problem.

Nelson and Levy [27] have attacked the problem of selecting a re-

duced set of unaltered features for use in a classification scheme.

These authors attach a cost to the use of each available feature, and

employ a ranking scheme to measure each feature's discriminating power.

Then, under a restriction on the total cost of features employed, they

develop an algorithm that selects the set of features that maximizes the

classifier's discriminating power. Unfortunately, their scheme does not

guarantee the selection of a subset of original features that contains

enough 'information' to permit pattern class separation by discriminant

function. Therefore, a new algorithm is presented in this section that

minimizes the cost of the set of features used by the pattern classifier

yet ensures that all patterns can be correctly classified by a set of

linear discriminant functions. In the remainder of this section the

more general terms 'feature,' 'pattern,' and 'pattern class' will be

used respectively to represent a data vector item, a patient's data vec-

tor, and a diagnostic classification.

The problem of finding a minimum-cost collection of features would

not be considered if there did not already exist a set of 'n' features

by which the patterns under examination could be correctly classified

by linear discriminants. That is, given an 'n' dimensional representation
of each of the 'm_i' patterns in each of the 'p' pattern classes

\[
a_i^m = [a_{i1}^m, a_{i2}^m, \ldots, a_{in}^m, 1], \quad m = 1,2,\ldots,m_i, \quad i = 1,2,\ldots,p,
\]

where $a_{ik}^m$, $k = 1,2,\ldots,n$, equals either zero or one, there must exist
a set of 'n+1' dimensional $W_j$'s, $j = 1,2,\ldots,p$, such that

\[
a_i^m (W_i - W_j) > 0 \quad \text{for all } m = 1,2,\ldots,m_i; \; i = 1,2,\ldots,p; \; j = 1,2,\ldots,p; \; j \neq i. \tag{3}
\]

Letting $A_i$ be the $m_i \times (n+1)$ dimensional matrix of patterns in pattern-class
i, then the requirement of (3) can be written in the following form:

\[
A_i (W_i - W_j) > 0, \quad i = 1,2,\ldots,p; \; j = 1,2,\ldots,p; \; j \neq i.
\]

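If such pattern representations and $W_i$'s exist, then a solution to the
following problem yields a minimum-cost collection of pattern-classifying
features:

\[
\text{P1:} \quad \text{minimize } CX
\]
\[
\text{subject to } A_i [X \odot (W_i - W_j)] > 0, \quad i = 1,2,\ldots,p; \; j = 1,2,\ldots,p; \; j \neq i,
\]

where

\[
A_i = \begin{bmatrix}
a_{i1}^{1} & a_{i2}^{1} & \cdots & a_{in}^{1} & 1 \\
a_{i1}^{2} & a_{i2}^{2} & \cdots & a_{in}^{2} & 1 \\
\vdots & \vdots & & \vdots & \vdots \\
a_{i1}^{m_i} & a_{i2}^{m_i} & \cdots & a_{in}^{m_i} & 1
\end{bmatrix},
\]
\[
W_i = [w_{i1}, w_{i2}, \ldots, w_{in}, w_{i,n+1}]^T, \quad
C = [c_1, c_2, \ldots, c_n, 0], \quad
X = [x_1, x_2, \ldots, x_n, 1]^T,
\]

and $w_{ik}$ is an unrestricted variable,
    $c_j$ is the cost of using feature j,
    $x_i = 0$ if feature i is not used,
    $x_i = 1$ if feature i is used.

Note: The $\odot$ notation is to be read as element by element
multiplication, i.e., $Q \odot R = S$ with $[s_{ij}] = [q_{ij} r_{ij}]$.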
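
To make the objective and constraints of P1 concrete, the following Python sketch evaluates CX and tests the constraint set for a given selection vector X and a given set of weight vectors. Everything in it is an assumption made for illustration: the two-class patterns, the weights, the costs, and the function names `hadamard`, `total_cost`, and `satisfies_p1` do not come from the original work.

```python
def hadamard(x, w):
    """Element-by-element product of two equal-length vectors."""
    return [xi * wi for xi, wi in zip(x, w)]

def total_cost(c, x):
    """Objective CX of P1."""
    return sum(ci * xi for ci, xi in zip(c, x))

def satisfies_p1(patterns, weights, x):
    """True if A_i [X (.) (W_i - W_j)] > 0 holds for every ordered pair i != j."""
    for i, a_i in patterns.items():
        for j, w_j in weights.items():
            if i == j:
                continue
            diff = [wi - wj for wi, wj in zip(weights[i], w_j)]
            masked = hadamard(x, diff)
            for row in a_i:
                if sum(r * d for r, d in zip(row, masked)) <= 0:
                    return False
    return True

# Hypothetical two-class, two-feature data; the last column is the constant 1.
PATTERNS = {"X": [[1, 0, 1], [1, 1, 1]], "Y": [[0, 0, 1], [0, 1, 1]]}
WEIGHTS = {"X": [1.0, 0.0, -0.5], "Y": [-1.0, 0.0, 0.5]}
COSTS = [4, 9, 0]

keep_first = [1, 0, 1]    # use feature 1 only
keep_second = [0, 1, 1]   # use feature 2 only
print(total_cost(COSTS, keep_first), satisfies_p1(PATTERNS, WEIGHTS, keep_first))
print(total_cost(COSTS, keep_second), satisfies_p1(PATTERNS, WEIGHTS, keep_second))
```

A `False` result only says that these particular weights fail for that X; it does not show that no suitable weights exist. Settling that question is the purpose of the feasibility procedures developed in the remainder of this section.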


3.4.1 Algorithm Development

The algorithm developed to solve problem P1 is an enumerative

algorithm similar in structure to that of Balas [28]. Unfortunately,

the non-linear nature of problem P1's constraints prohibits full implementation

of the more powerful techniques used in implicit enumeration

on linear integer problems. The structure of these constraints and

their effect on the optimization of P1 will be discussed in a step-by-

step development.

The minimum-cost feature-selection algorithm does not solve P1 to

the extent of finding the values of the vectors $W_i$, $i = 1,2,\ldots,p$. This

algorithm does find the minimum-cost collection of features X* and the

total cost associated with using these features, and guarantees the

existence of $W_i$ vectors associated with this optimal feature set. Given

this guarantee, the modified fixed-increment algorithm from Appendix B

can be employed to find the vectors $W_i^*$, $i = 1,2,\ldots,p$.

Choose some solution to P1. By hypothesis there exists at least
one solution $(X, W_1, W_2, \ldots, W_p)$ to P1 where $X = [1,1,\ldots,1,1]$. Suppose
there is some other solution $(\hat{X}, \hat{W}_1, \hat{W}_2, \ldots, \hat{W}_p)$ where one or more elements
$x_i$ in the $\hat{X}$ vector are equal to zero. For the constraint matrices in P1,

\[
A_i [\hat{X} \odot (\hat{W}_i - \hat{W}_j)] > 0, \quad i = 1,2,\ldots,p; \; j = 1,2,\ldots,p; \; j \neq i.
\]

If the matrix products $[A_i \odot \hat{X}] = \hat{A}_i$, $i = 1,2,\ldots,p$, are constructed, then
each set of constraints in P1 can be written in the form

\[
\hat{A}_i (\hat{W}_i - \hat{W}_j) > 0, \quad i = 1,2,\ldots,p; \; j = 1,2,\ldots,p; \; j \neq i. \tag{4}
\]

The creation of the $\hat{A}_i$ is called the zeroing process. Of the columns
of $A_i$, $\hat{A}_i$ retains all columns j of $A_i$ where $x_j = 1$, and substitutes
a column of zeros for each of those columns k in $A_i$ where $x_k = 0$. Using
the zeroing process, the feasibility of any possible solution vector $\hat{X}$
to P1 can be examined in terms of the $\hat{A}_i = A_i \odot \hat{X}$ this vector $\hat{X}$ creates.

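As an example of the zeroing process for a particular set of patterns,
let $a^i$ be a two-dimensional patient-data-vector $a^i = [a_1^i, a_2^i]$ where

\[
a_1^i = \begin{cases} 0 & \text{if patient i has normal body temperature} \\ 1 & \text{if patient i has abnormal body temperature} \end{cases}
\]

and

\[
a_2^i = \begin{cases} 0 & \text{if patient i has normal blood pressure} \\ 1 & \text{if patient i has abnormal blood pressure.} \end{cases}
\]

Assume two diagnostic categories, X and Y, where data vectors $a_X^1$ and
$a_X^2$ are preclassified in category X and data vectors $a_Y^1$ and $a_Y^2$ are
preclassified in category Y.

If $a_X^1 = [1,0]$, $a_X^2 = [1,1]$, $a_Y^1 = [0,0]$, and $a_Y^2 = [0,1]$, then

\[
A_X = \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix} \quad \text{and} \quad A_Y = \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}.
\]

Graphically the pattern space can be represented as

[Figure: the patterns $a_X^1$, $a_X^2$, $a_Y^1$, and $a_Y^2$ plotted with temperature on the horizontal axis and pressure on the vertical axis.]

Consider the vector $\hat{X} = [1, 0]^T$; then

\[
[A_X \odot \hat{X}] = \begin{bmatrix} 1 & 0 \\ 1 & 0 \end{bmatrix} \quad \text{and} \quad [A_Y \odot \hat{X}] = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}.
\]

Graphically the pattern space, as transformed by $\hat{X}$, can be represented as

[Figure: the transformed patterns plotted on the same temperature and pressure axes.]

The vector $\hat{X}$ effectively creates a representation of each patient data
vector in terms of the patient's body temperature alone.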
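
The zeroing process in this example amounts to multiplying every pattern element by the corresponding entry of the selection vector. A minimal Python sketch follows; the function name `zero_out` and the list-of-rows layout are assumptions made here for illustration.

```python
def zero_out(patterns, x):
    """Return the zeroed pattern matrix: column k is kept if x[k] == 1, else set to 0."""
    return [[value * keep for value, keep in zip(row, x)] for row in patterns]

A_X = [[1, 0], [1, 1]]   # rows are the category-X patterns
A_Y = [[0, 0], [0, 1]]   # rows are the category-Y patterns
x_hat = [1, 0]           # keep body temperature, drop blood pressure

A_X_hat = zero_out(A_X, x_hat)
A_Y_hat = zero_out(A_Y, x_hat)
print(A_X_hat)
print(A_Y_hat)
```

After zeroing, the two category-X patterns coincide, as do the two category-Y patterns, and the categories differ only in the temperature entry.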








Note that relation (2) is the requirement for pattern separability
by linear discriminants. Hence, a vector $\hat{X}$ is a component in a feasible
solution $(\hat{X}, \hat{W}_1, \hat{W}_2, \ldots, \hat{W}_p)$ to P1 if and only if there exist $\hat{W}_i$, $i = 1,2,\ldots,p$,
such that (2) holds for all $i \neq j$. As discussed in Section 3.1, a pattern
space is linearly separable, and hence, feasible $\hat{W}_i$ exist, if and only if
the individual pattern classes have non-intersecting convex hulls. For
the pattern vectors considered in this section, the individual components
of each of the patterns in each pattern class are either zero or one. As
there is a one-to-one correspondence between the individual patterns in
a pattern class and the vertices of the pattern class's convex hull, the
convex hull of a pattern-class $A_i$ can be expressed as all convex combinations
of the individual pattern-class vectors $\hat{a}_i^m$, $m = 1,2,\ldots,m_i$. Consider
the following examples of the convex-hull representation of linear
separability.

Assume $a_X^1 = [1,0]$, $a_X^2 = [1,1]$, $a_Y^1 = [0,0]$, and $a_Y^2 = [0,1]$.

Graphically this pattern space can be represented as

[Figure: Feature 1 on the horizontal axis and Feature 2 on the vertical axis, showing line Y, line X, and a discriminating line 0 between them.]

where the line X from $a_X^1$ to $a_X^2$ represents the convex hull of pattern-
class X and the line Y from $a_Y^1$ to $a_Y^2$ represents the convex hull of
pattern-class Y. Since X and Y do not intersect, implying that the
space is linearly separable, it is possible to draw an infinite number
of lines 0 that serve as discriminating hyperplanes.








Assume $a_X^1 = [1,0]$, $a_X^2 = [0,1]$, $a_Y^1 = [0,0]$, and $a_Y^2 = [1,1]$.

Graphically this pattern space can be represented as

[Figure: Feature 1 on the horizontal axis and Feature 2 on the vertical axis, showing the crossing lines X and Y.]

where the line X from $a_X^1$ to $a_X^2$ represents the convex hull of pattern-
class X and the line Y from $a_Y^1$ to $a_Y^2$ represents the convex hull of
pattern-class Y. Since the lines X and Y intersect, the pattern space
is not linearly separable, and hence, it is impossible to draw a
discriminating hyperplane 0.

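Therefore, the following condition is equivalent to condition (4):
a vector $\hat{X}$ is feasible to P1 if and only if there do not exist $U^s$ and $U^t$
such that

\[
U^s \hat{A}_s = U^t \hat{A}_t \quad \text{for any } s = 1,2,\ldots,p; \; t = 1,2,\ldots,p; \; s \neq t, \tag{5}
\]

where

\[
U^i = [u_1^i, u_2^i, \ldots, u_{m_i}^i],
\]
\[
u_k^i \geq 0 \quad \text{for all } k = 1,2,\ldots,m_i,
\]

and

\[
\sum_{k=1}^{m_i} u_k^i = 1 \quad \text{for all } i = 1,2,\ldots,p.
\]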
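
Condition (5) can be illustrated on the two small examples given earlier by evaluating explicit convex combinations. The sketch below is only a spot check with hand-picked weights, not a general feasibility test; the helper name `convex_combination` is an assumption, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

def convex_combination(weights, patterns):
    """Return sum_k weights[k] * patterns[k] after checking the weights are convex."""
    assert all(w >= 0 for w in weights) and sum(weights) == 1
    dim = len(patterns[0])
    return [sum(w * p[d] for w, p in zip(weights, patterns)) for d in range(dim)]

half = Fraction(1, 2)

# Second example: the hulls cross, so equal combinations exist.
X2 = [[1, 0], [0, 1]]
Y2 = [[0, 0], [1, 1]]
point_x = convex_combination([half, half], X2)
point_y = convex_combination([half, half], Y2)
print(point_x == point_y)

# First example: Feature 1 is 1 for every X pattern and 0 for every Y pattern,
# so the first coordinates of any two convex combinations are 1 and 0.
X1 = [[1, 0], [1, 1]]
Y1 = [[0, 0], [0, 1]]
first_x = convex_combination([half, half], X1)[0]
first_y = convex_combination([half, half], Y1)[0]
print(first_x, first_y)
```

In the second example the midpoint of each segment is the same point, so weights satisfying (5) exist and the selection is infeasible. In the first example no choice of weights can make the first coordinates agree.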

Checking the feasibility of some vector X by condition (5) yields

[p(p-l)]/2 distinct subproblems. Each of these subproblems may be








characterized as follows:

let $\hat{A}_s^T = A$ and $\hat{A}_t^T = B$ with A and B having columns $a_i$
and $b_j$ respectively for any $\hat{A}_s$ and $\hat{A}_t$.

P2: Find $u_i \geq 0$, $\sum_{i=1}^{m_i} u_i = 1$, and $v_j \geq 0$, $\sum_{j=1}^{m_j} v_j = 1$
such that

\[
\sum_{i=1}^{m_i} u_i a_i = \sum_{j=1}^{m_j} v_j b_j.
\]

If such $u_i$ and $v_j$ exist for any one of the subproblems then $\hat{X}$ is not
feasible to P1. Because the number of subproblems is large even for a

relatively small number p of pattern classes, there is justification for

seeking methods to expedite the solution of each subproblem P2.

To achieve this goal, a series of conditions will be presented that

characterize some of the criteria necessary to the existence of a solution

to subproblem P2. In addition to establishing criteria for existence,

these conditions provide a means for reducing the size of the

matrices A and B. This reduction will be discussed after the conditions

are established.
Condition 1: If the kth row of A has all elements $a_i^k$, $i = 1,2,\ldots,m_i$,
equal to zero (one) and the kth row of B has all elements $b_j^k$,
$j = 1,2,\ldots,m_j$, equal to one (zero), then no $u_i \geq 0$,
$\sum_{i=1}^{m_i} u_i = 1$ and $v_j \geq 0$, $\sum_{j=1}^{m_j} v_j = 1$ exist such that

\[
\sum_{i=1}^{m_i} u_i a_i = \sum_{j=1}^{m_j} v_j b_j.
\]

Justification 1: Under Condition 1 there is no set of convex combinations
of the kth row elements of A and of the kth row elements of B such that
the combinations are equal. Hence, there can be no set of convex
combinations of the columns of A and of B such that the combinations are
equal. Symbolically, since no $u_i \geq 0$, $\sum_{i=1}^{m_i} u_i = 1$ and
$v_j \geq 0$, $\sum_{j=1}^{m_j} v_j = 1$ exist such that

\[
\sum_{i=1}^{m_i} u_i a_i^k = \sum_{j=1}^{m_j} v_j b_j^k,
\]

no $u_i \geq 0$, $\sum_{i=1}^{m_i} u_i = 1$, and $v_j \geq 0$, $\sum_{j=1}^{m_j} v_j = 1$
exist such that

\[
\sum_{i=1}^{m_i} u_i a_i = \sum_{j=1}^{m_j} v_j b_j.
\]

Condition 2: If the kth row of A has all elements $a_i^k$, $i = 1,2,\ldots,m_i$,
equal to zero (one) and the kth row of B has all elements $b_j^k$,
$j = 1,2,\ldots,m_j$, equal to zero (one), the kth row of matrices A and B
can be eliminated without loss of possible solutions to subproblem P2.

Justification 2: Under Condition 2 every convex combination of the kth
row elements of A equals every convex combination of the kth row elements
of B. Hence, a set of convex combinations of the columns of A and of the
columns of B are equal if and only if the convex combinations of the
remaining rows (all rows except the kth row) are equal. Symbolically,
let $a_{ik}$ denote the pattern $a_i$ whose kth component has been eliminated
and similarly let $b_{jk}$ denote the elimination of component k from
pattern $b_j$; then as

\[
\sum_{i=1}^{m_i} u_i a_i^k = \sum_{j=1}^{m_j} v_j b_j^k
\]

for any choice of $u_i \geq 0$, $\sum_{i=1}^{m_i} u_i = 1$ and $v_j \geq 0$,
$\sum_{j=1}^{m_j} v_j = 1$,

\[
\sum_{i=1}^{m_i} u_i a_i = \sum_{j=1}^{m_j} v_j b_j
\]

if and only if

\[
\sum_{i=1}^{m_i} u_i a_{ik} = \sum_{j=1}^{m_j} v_j b_{jk}.
\]

Condition 3: If the kth row of A has all elements $a_i^k$, $i = 1,2,\ldots,m_i$,
equal to zero, and some $b_r^k$ equals one, no $u_i \geq 0$,
$\sum_{i=1}^{m_i} u_i = 1$, and $v_j \geq 0$, $v_r > 0$, $\sum_{j=1}^{m_j} v_j = 1$
exist such that

\[
\sum_{i=1}^{m_i} u_i a_i = \sum_{j=1}^{m_j} v_j b_j.
\]

Justification 3: Under Condition 3 any convex combination of the columns
of B that includes a non-zero product of the rth column $b_r$ results in a
kth row term greater than zero. The value of the kth row term for any
convex combination of the columns of A is equal to zero. Hence, no set
of convex combinations of the columns of A and B can be equal if the
combination for B includes a specification that $v_r > 0$. Symbolically,
if $v_r > 0$, then for any choice of $v_j$, $j = 1,2,\ldots,m_j$, $j \neq r$,
where $v_j \geq 0$ and $\sum_{j=1}^{m_j} v_j = 1$,

\[
\sum_{j=1}^{m_j} v_j b_j^k > \sum_{i=1}^{m_i} u_i a_i^k = 0
\]

for any choice of $u_i$ such that $u_i \geq 0$ and $\sum_{i=1}^{m_i} u_i = 1$.
Hence, if $v_r > 0$, there exist no $u_i \geq 0$, $\sum_{i=1}^{m_i} u_i = 1$
and $v_j \geq 0$, $j \neq r$, $\sum_{j=1}^{m_j} v_j = 1$ such that

\[
\sum_{i=1}^{m_i} u_i a_i = \sum_{j=1}^{m_j} v_j b_j.
\]

Condition 4: If the kth row of A has all elements $a_i^k$, $i = 1,2,\ldots,m_i$,
equal to one, and some $b_r^k$ equals zero, no $u_i \geq 0$,
$\sum_{i=1}^{m_i} u_i = 1$ and $v_j \geq 0$, $v_r > 0$, $\sum_{j=1}^{m_j} v_j = 1$
exist such that

\[
\sum_{i=1}^{m_i} u_i a_i = \sum_{j=1}^{m_j} v_j b_j.
\]

Justification 4: Condition 4 is similar to Condition 3 in that any
convex combination of the columns of B that includes a non-zero product
of the rth column yields a kth row term whose value cannot equal any
convex combination of the kth row elements of A. Symbolically, for any
choice of $u_i$ and $v_j$, where $v_r > 0$,

\[
\sum_{j=1}^{m_j} v_j b_j^k < \sum_{i=1}^{m_i} u_i a_i^k = 1.
\]







Note that Conditions 3 and 4 can also be stated, and justified, with

the role of the elements of the A and B matrices reversed.

Given this set of four conditions, consider the following row partition
of the A and B matrices:

\[
A = \begin{bmatrix} A^* \\ A_1 \\ A_1^c \\ A_0 \\ A_0^c \end{bmatrix}, \quad
B = \begin{bmatrix} B^* \\ B_1^c \\ B_1 \\ B_0^c \\ B_0 \end{bmatrix}
\]

where by appropriate change of rows in A and B

1. every element in each row of $A_1$ is a one

2. every element in each row of $B_1$ is a one

3. every element in each row of $A_0$ is a zero

4. every element in each row of $B_0$ is a zero.

The partitions $A_1^c$, $B_1^c$, $A_0^c$, and $B_0^c$ are the rows of A and B corresponding
to $B_1$, $A_1$, $B_0$, and $A_0$, respectively, and $A^*$ and $B^*$ are the remaining rows
of A and B. With this partitioning and the four previously established
conditions, the size of the data vectors associated with many of the
[p(p-1)]/2 subproblems P2 can be significantly reduced. The reduction
process, Procedure 1, can be stated in this manner:

Step 1: If for some row k in $A_1$ ($B_1$) each element in the corresponding
        row of $B_1^c$ ($A_1^c$) is equal to one, then row k
        of A and B can be eliminated by Condition 2.

Step 2: If for some row k in $A_0$ ($B_0$) each element in the corresponding
        row of $B_0^c$ ($A_0^c$) is equal to zero, then row k of
        A and B can be eliminated by Condition 2.

Step 3: If for some row k in $A_0$ ($B_0$) the corresponding row in
        $B_0^c$ ($A_0^c$) has all elements equal to one or if for some row
        k in $A_1$ ($B_1$) the corresponding row in $B_1^c$ ($A_1^c$) has all
        elements equal to zero, then this particular subproblem
        P2 has no feasible solution by Condition 1. Procedure 1
        and the search for a solution to P2 are terminated at
        this point because the convex hulls of pattern-classes
        A and B do not intersect.

Step 4: If for some row k in $A_1$ ($B_1$) the corresponding row in
        $B_1^c$ ($A_1^c$) has one or more elements equal to zero, i.e.,
        $b_r^k = b_s^k = \cdots = b_t^k = 0$ ($a_r^k = a_s^k = \cdots = a_t^k = 0$), then
        columns $b_r, b_s, \ldots, b_t$ ($a_r, a_s, \ldots, a_t$) can be eliminated by
        Condition 4.

Step 5: If for some row k in $A_0$ ($B_0$) the corresponding row in
        $B_0^c$ ($A_0^c$) has one or more elements equal to one, i.e.,
        $b_r^k = b_s^k = \cdots = b_t^k = 1$ ($a_r^k = a_s^k = \cdots = a_t^k = 1$), then
        columns $b_r, b_s, \ldots, b_t$ ($a_r, a_s, \ldots, a_t$) can be eliminated by
        Condition 3.

Step 6: If the use of Steps 1, 2, 4, and 5 has eliminated all

elements of both matrices, then this particular subproblem

has an infinite number of feasible solutions by Condition

2. Procedure 1 and the search for a solution to P2 are

terminated at this point because the convex hulls of the

pattern-classes A and B intersect.









Step 7: If the use of Steps 1, 2, 4, and 5 has eliminated one or

more rows or columns from either matrix then repartition

the matrices and return to Step 1, otherwise terminate

Procedure 1.

In coding Procedure 1 for computer processing, there is no need to

physically partition the rows of the A and B matrices. Summing the

elements in any row of A or B reveals whether the individual elements in

the row are all equal to zero or are all equal to one. Given this infor-

mation, the steps from Procedure 1 determine whether a pattern is re-

moved from A or B, whether a row in A and B is removed, or whether the

procedure should be terminated because no feasible set of convex combina-

tions for P2 exists.

As an example of the use of Procedure 1, suppose the matrices
A and B in subproblem P2 were

\[
A = \begin{bmatrix} 0 & 1 & 1 & 0 \\ 1 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 1 & 1 \end{bmatrix}, \quad
B = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 0 & 0 & 0 \\ 1 & 1 & 1 & 0 \\ 1 & 1 & 1 & 0 \end{bmatrix}.
\]

In the first application of the steps of Procedure 1:

1. Column 4 can be eliminated from matrix A by Step 4 and

2. Column 1 can be eliminated from matrix A by Step 5.

After the first application of the steps of the procedure

\[
A = \begin{bmatrix} 1 & 1 \\ 0 & 0 \\ 0 & 0 \\ 1 & 1 \end{bmatrix}, \quad
B = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 0 & 0 & 0 \\ 1 & 1 & 1 & 0 \\ 1 & 1 & 1 & 0 \end{bmatrix}.
\]

In the second application of the steps of Procedure 1:

1. Row 1 can be eliminated from both matrices by Step 1

2. Row 2 can be eliminated from both matrices by Step 2 and

3. Column 4 can be eliminated from matrix B by Step 4.

After the second application of the steps of the procedure

\[
A = \begin{bmatrix} 0 & 0 \\ 1 & 1 \end{bmatrix}, \quad
B = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix}.
\]

In the third application of the steps of Procedure 1:

1. Row 2 can be eliminated from both matrices by Step 1 and

2. Procedure 1 can be terminated by Step 3.

Hence, for this set of A and B matrices, subproblem P2 has no feasible

solution.
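
The reduction can be sketched in Python along the lines suggested for computer coding, inspecting each row rather than physically partitioning the matrices. This is an illustrative sketch under stated assumptions, not the original program: the function name `reduce_subproblem` and its return labels are invented here, and eliminations are applied as soon as they are found instead of in the batched passes of Steps 1 through 7, so the intermediate matrices can differ from those in the worked example even though the same four conditions are used. The entries of A and B below are transcribed from that example, whose first row is partly illegible in the source.

```python
def reduce_subproblem(A, B):
    """Apply the row/column eliminations of Conditions 1-4 until nothing changes.

    A and B hold one pattern per column and one feature per row.
    Returns a status ("disjoint", "intersect" or "undecided") followed by
    the surviving row, A-column and B-column indices.
    """
    rows = list(range(len(A)))
    cols_a = list(range(len(A[0])))
    cols_b = list(range(len(B[0])))
    changed = True
    while changed:
        changed = False
        for k in list(rows):
            a_vals = {A[k][i] for i in cols_a}
            b_vals = {B[k][j] for j in cols_b}
            if len(a_vals) == 1 and len(b_vals) == 1:
                if a_vals != b_vals:
                    return "disjoint", rows, cols_a, cols_b   # Condition 1
                rows.remove(k)                                  # Condition 2
                changed = True
            elif len(a_vals) == 1:
                (c,) = a_vals                                   # Conditions 3 and 4
                cols_b = [j for j in cols_b if B[k][j] == c]
                changed = True
            elif len(b_vals) == 1:
                (c,) = b_vals                                   # roles of A and B reversed
                cols_a = [i for i in cols_a if A[k][i] == c]
                changed = True
        if not rows:
            return "intersect", rows, cols_a, cols_b            # every row eliminated
    return "undecided", rows, cols_a, cols_b


A = [[0, 1, 1, 0], [1, 0, 0, 0], [1, 0, 0, 0], [0, 1, 1, 1]]
B = [[1, 1, 1, 1], [0, 0, 0, 0], [1, 1, 1, 0], [1, 1, 1, 0]]
status, rows, cols_a, cols_b = reduce_subproblem(A, B)
print(status)
```

For the example matrices the sketch reports that the hulls are disjoint, in agreement with the conclusion that subproblem P2 has no feasible solution. An "undecided" result means the reduced subproblem still has to be settled by other means.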

Although the use of Procedure 1 may lead to a reduction in the size

of most subproblems, the pattern vectors (ai and bj) for each of these

problems may still be quite large. Restating subproblem P2 as a linear

program yields

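\[
\text{P3:} \quad \text{minimize } [\,0 \;\; 0\,] \begin{bmatrix} U \\ V \end{bmatrix}
\]
\[
\text{subject to } \begin{bmatrix} A & -B \\ 1\,1 \cdots 1 & 0\,0 \cdots 0 \\ 0\,0 \cdots 0 & 1\,1 \cdots 1 \end{bmatrix} \begin{bmatrix} U \\ V \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}
\]
\[
\text{and } U \geq 0, \quad V \geq 0,
\]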

where the existence of any solution vectors U* and V* signals the inter-
section of the convex hulls of pattern-classes A and B.








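Consider the dual of P3, written in the following form:

\[
\text{P4:} \quad \text{maximize } [\,0 \;\; 1 \;\; 1\,] \begin{bmatrix} \Pi \\ \lambda_1 \\ \lambda_2 \end{bmatrix}
\]
\[
\text{subject to } \begin{bmatrix} A^T & 1 & 0 \\ -B^T & 0 & 1 \end{bmatrix} \begin{bmatrix} \Pi \\ \lambda_1 \\ \lambda_2 \end{bmatrix} \leq \begin{bmatrix} 0 \\ 0 \end{bmatrix},
\]
\[
\Pi, \lambda_1, \lambda_2 \text{ unrestricted in sign.}
\]

Note that P4 may have many associated $\pi$ variables, but has only as many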

constraints as the number of patterns in A and B (as reduced by Procedure

1). P4 always has at least one solution to its constraint set. Thus, if

an application of a linear-programming algorithm to P4 reveals the exis-

tence of an unbounded solution, then P2 has no solution. Therefore, if

and only if P4 has a bounded solution do $u_i$ and $v_j$ exist such that

\[
\sum_{i=1}^{m_i} u_i a_i = \sum_{j=1}^{m_j} v_j b_j
\]

where

\[
u_i \geq 0, \quad \sum_{i=1}^{m_i} u_i = 1
\]

and

\[
v_j \geq 0, \quad \sum_{j=1}^{m_j} v_j = 1.
\]

The preceding discussion with its development of a reduction proce-

dure and dual formulation provides the structure for a second procedure.

Procedure 2 establishes a mechanism to verify the feasibility of any

assignment of zeros and ones to the X vector of problem P1, see Figure 4.

That is, given some vector X and a set of patterns $a_i^m$, $m = 1,2,\ldots,m_i$,

and $i = 1,2,\ldots,p$, the [p(p-1)]/2 subproblems P2 are formed by zeroing out
























































FIGURE 4


PROCEDURE 2








the appropriate pattern-vector elements. Then Procedure 1 is applied

to each subproblem. Finally, for each pair of pattern classes the

boundedness of the dual formulation P4 is examined. Vector X represents

a feasible set of pattern-classifying features for P1 if and only if

each of the [p(p-l)]/2 subproblem formulations P4 is unbounded.

Before a statement of the algorithm to solve problem P1 is presented

several terms must be defined. The assignment vector is defined as a

listing of variables xi, elements of the vector X in P1, whose values have

been determined by the steps of the algorithm. The elements in this vec-

tor are recorded with the value of their assignment, either zero or one.

These elements are entered in the vector in the order they were assigned,

with the first algorithm assignment in the first (left) position. For

example, consider the assignment vector

[x4 = 0, x10 = 1, x2 = 0].

This vector records that the algorithm first assigned x4 equal to zero,

then assigned x10 equal to one, and its last assignment was x2 equal to

zero. Feasibility of a solution X, as determined by the assignment-vector

component values, is checked by Procedure 2 with the value of those variables

not included in the assignment vector temporarily set equal to one.

The value V of an assignment vector is defined as minus one times the

sum of the costs associated with each of the variables in the assignment

vector, multiplied by the value assigned to the respective variable.

For the example assignment vector, [x4 = 0, x10 = 1, x2 = 0], where

c4 = 5, cl0 = 2, and c2 = 7, the assignment vector has the value


V = (-1)·[5(0) + 2(1) + 7(0)] = -2.
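
The value of an assignment vector is simple enough to compute directly. The short Python sketch below reproduces the example; the list-of-pairs representation and the function name `assignment_value` are assumptions chosen for illustration.

```python
def assignment_value(assignment, costs):
    """V = -1 * sum(cost * assigned value) over the variables in the assignment vector."""
    return -sum(costs[name] * value for name, value in assignment)

assignment = [("x4", 0), ("x10", 1), ("x2", 0)]   # listed in order of assignment
costs = {"x4": 5, "x10": 2, "x2": 7}
value = assignment_value(assignment, costs)
print(value)
```

Because V is the negative of the cost of the features retained, maximizing V is the same as minimizing the cost CX in P1.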









3.4.2 Statement of the Minimum-Cost Symptom-Selection Algorithm

Step 0: Create the assignment vector (at this point the vector is

null as there is no variable assignment in the vector).

Set V* = -∞ and go to Step 4.

Step 1: Start at the right side of the assignment vector and move

to left, stopping at the first variable assigned a zero

value. If no variable in the assignment vector has a

zero assignment, go to Step 2. Otherwise go to Step 3.

Step 2: Calculate V for the assignment vector. If V is greater

than V*, record the values of the variables in the assign-

ment vector as the optimal solution X* to P1. Otherwise,

record (as the optimal solution X* to P1) the values of the

variables in the best current solution X. Terminate the

algorithm.

Step 3: Change the value of the variable isolated in Step 1 to an

assigned value of one, and eliminate from the assignment

vector all variable assignments to the right of this new

assignment. If the assignment vector includes the assignment

xi=1 for every xi in X, return to Step 2. Otherwise go

to Step 4.

Step 4: Select a variable xk that is not an element of the assignment

vector. Assign this variable the value xk=0 in the

assignment vector. Use Procedure 2 to check the feasibility

of this assignment. If the assignment vector is not fea-

sible, go to Step 6. Otherwise go to Step 5.

Step 5: If the assignment vector with the new assignment xk=0 does

not include an assignment for every xi in X, return to

Step 4. Otherwise go to Step 7.








Step 6: If the assignment vector with the assignment xk=1 (xk is the

variable selected in Step 4) does not include an assignment

for every xi in X, return to Step 4. Otherwise go to Step 7.

Step 7: Calculate V for the assignment vector. If V* is greater

than V, go to Step 1. Otherwise go to Step 8.

Step 8: Record as the best current solution X the values of the

variables in this assignment vector. Set V*=V, and return

to Step 1.

Note that in the course of applying this algorithm all solutions are

considered and the best current solution is replaced only when another

solution has a larger associated value. As the number of possible solutions

is finite, the algorithm must terminate, and at this termination the value

of the optimal solution and its assignments are known. An application of

the minimum-cost symptom-selection algorithm is presented in Appendix C.
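
The enumeration described by Steps 0 through 8 can be sketched compactly as a recursive depth-first search. The Python below is a restatement under stated assumptions, not a transcription of the original steps: recursion replaces the explicit assignment vector and the right-to-left backtracking of Steps 1 and 3, and the feasibility check of Procedure 2 is supplied by the caller as a function. The names `select_features` and `toy_feasible`, the toy patterns, and the costs are all hypothetical.

```python
import math

def select_features(costs, is_feasible):
    """Depth-first enumeration: try x_k = 0 first, then fall back to x_k = 1.

    is_feasible(x) stands in for Procedure 2 and receives the full 0/1 vector,
    with every not-yet-assigned variable temporarily equal to one.
    """
    n = len(costs)
    x = [1] * n
    best = {"x": None, "cost": math.inf}

    def search(k):
        if k == n:                                   # every variable assigned
            cost = sum(c for c, xi in zip(costs, x) if xi)
            if cost <= best["cost"]:                 # V >= V*, as in Steps 7 and 8
                best["x"], best["cost"] = list(x), cost
            return
        x[k] = 0
        if is_feasible(x):
            search(k + 1)
        x[k] = 1                                     # x_k = 0 failed or is exhausted
        search(k + 1)

    search(0)
    return best["x"], best["cost"]


# Toy stand-in for Procedure 2 (hypothetical): a feature set is accepted when some
# retained feature is constant within each class and differs between the classes.
CLASS_X = [[1, 0], [1, 1]]
CLASS_Y = [[0, 0], [0, 1]]

def toy_feasible(x):
    for k, keep in enumerate(x):
        xs = {row[k] * keep for row in CLASS_X}
        ys = {row[k] * keep for row in CLASS_Y}
        if len(xs) == 1 and len(ys) == 1 and xs != ys:
            return True
    return False

result = select_features([1, 3], toy_feasible)
print(result)
```

In this toy case only the first feature is retained. Branches below an infeasible assignment are never explored, which is sound because the check is made with all unassigned variables set to one, the most permissive completion.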


3.4.3 Computational Considerations

Returning to the setting of diagnostic classification of craniofacial-

pain patients, application of the minimum-cost symptom-selection algorithm

would require an enumeration (explicit or implicit) over 2^295 possible

solutions in order to find the optimal collection of data-vector elements.

As the number of possible solutions is prohibitively large, heuristic

modifications to the symptom-selection algorithm are required for this

application. One possible modification could employ the fact that only

a few of the elements in the patient data vector have large associated

'costs' for their utilization. In particular, the eight elements of

radiographic data and the two measures of emotional trauma are significant-

ly more 'costly' to examine than the other items in the data vector.







With this modification, the algorithm would only consider eliminating

these ten high cost features. Another heuristic approximation to the

optimal collection of features might rank the data-vector elements in

order of descending cost of utilization. Procedure 2 would then be used

to eliminate these components one by one, starting with the item of high-

est cost, until the procedure signaled an infeasible solution to P1. Cer-

tainly, other heuristics might also be developed to exploit the structure

of this algorithm.
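
The second heuristic, eliminating features in order of descending cost until an elimination is infeasible, might be sketched as follows. The function name `greedy_drop`, the costs, and the toy oracle are hypothetical; in practice the oracle would be Procedure 2.

```python
def greedy_drop(costs, is_feasible):
    """Drop features in order of descending cost until a drop is infeasible."""
    x = [1] * len(costs)
    order = sorted(range(len(costs)), key=lambda k: costs[k], reverse=True)
    for k in order:
        x[k] = 0
        if not is_feasible(x):
            x[k] = 1          # restore the last feature and stop
            break
    return x

# Hypothetical oracle standing in for Procedure 2: features 1 and 3 must be retained.
def toy_feasible(x):
    return x[1] == 1 and x[3] == 1

costs = [8, 5, 3, 2]
selection = greedy_drop(costs, toy_feasible)
print(selection)
```

Here the search stops at the second most costly feature, even though the third could also have been dropped. This illustrates why the result is only an approximation to the minimum-cost collection.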


3.5 Model Applications

The structure of the craniofacial-pain diagnostic-classification

model permits model utilization for a variety of purposes. Since the

model is developed in terms of general data-vector and diagnostic-alternative

parameters, these model components can be altered to suit the application

in question. This section presents a brief discussion of some of

the possible applications of the diagnostic classifier.

In a teaching environment, the diagnostic-classification model with

its set of discriminant weights can be stored for computer-terminal access.

Then, on a set of tutorial example patients, students can compare

their diagnoses with those of the diagnostic model. Moreover, the student

can interact with the classifier in constructing his own 'sample' patients

for the classifier to diagnose. Finally, the student can request the

classifier to relate those discriminant-function weights that the model

employs in considering the 'significance' (Section 3.2) of any one or

group of symptoms.

The effectiveness of new diagnostic tests can be evaluated using the

minimum-cost symptom-selection algorithm. This algorithm provides an

immediate measure of the 'worth' of new research developments. Given a







cost for employing a new test, the algorithm returns an evaluation of

the test's classifying capability. The algorithm reveals whether the

test is included in the minimum-cost collection of features and whether

the use of the new test permits the practitioner to discontinue other

examination procedures. Additionally, the algorithm can be employed to

point out new areas for research, as it isolates diagnostic alternatives

where correct classification of patients is difficult using existing tests

and procedures.

As employed in the practitioner's office, the diagnostic classifier

will provide a direct link between the practicing dentist and the knowledge

of experts in the field of craniofacial pain. Information will

flow over the link in both directions. As new patients are seen by the

practitioner, the record of each visit will be reviewed by experts and

then used to supplement the data base employed in model construction.

Then, when developments dictate, new sets of discriminant-function weights

can be transmitted to the dental practitioners. This kind of interaction

results in a more accurate and representative diagnostic classifier as

the patient-sample data base becomes larger.












CHAPTER 4

TREATMENT PLANNING


The selection of treatment regimens for craniofacial-pain patients

is modeled as a Markovian decision process. The states in this Markovian

model are descriptions of a patient's health-care status and the

decision alternatives are feasible treatments for the patient's dys-

function (see Section 4.1). In the first two sections of this chapter,

motivation for the model structure is provided and the components of

the decision model are developed. The third section provides a descrip-

tion of the validating procedures used to determine the appropriateness

of the model and the model-generated treatment decisions. This chapter

closes with a discussion of potential teaching, research, and private

practice applications of the treatment-planning model.


4.1 Model Components

Several model-building components from the craniofacial-pain care

system are isolated to permit the construction of a Markovian represen-

tation of this system. A set of state descriptions that characterize,

for decision-making purposes, the status of craniofacial-pain patients

is presented in Section 4.1.1. Then transition probabilities measuring

the effects of treatment applications are discussed in Section 4.1.2.

Section 4.1.3 overlays the model's state descriptions and transition

probabilities with costs accrued during the patient's progression through

the care system. These components are integrated and verified in the

discussions of Sections 4.2 and 4.3.







Values for many of the treatment-planning model's parameters were

gathered from the set of patient records discussed in Section 3.1. As

the patient histories from the contributing university dental clinics

were reviewed, notations of treatment applications and time between suc-

cessive visits were made for each patient-practitioner interaction. The

values of the remaining model parameters were either estimated by the

reviewing practitioners, Dr. Fast and Dr. Mahan, or were gathered from

responses to questionnaires completed by patients who visited the

University of Florida's Dental Clinic. In modeling the complicated pro-

cess of care for craniofacial-pain patients, several simplifying assump-

tions were made. This section provides the motivation for these assump-

tions and presents the notation employed in the analytic description of

the treatment-planning process.


4.1.1 Patient States

In general, a Markovian system structure requires that the current

state of the system completely characterizes the probabilities associated

with future state occupancies of the system. To fully satisfy this

Markovian condition for state structure in the craniofacial-pain treatment-planning
model would require that the model include as distinct model states every possible
combination of diagnostic classifications a patient might have occupied, in conjunction
with every combination of treatment applications he might have undergone, during his stay in the care

system. Unfortunately, such a model would have an infinite number of

'patient states.'

However, for a majority of craniofacial-pain patients the know-

ledge of a patient's prior treatment record, coupled with his current

diagnostic classification, is adequate to determine his prior diagnostic







classifications. Even in the cases where the current classification

and prior treatment record do not provide a total description of a pa-

tient's condition, these elements of patient status do provide signifi-

cant information about the probabilities associated with a patient's

future status in the care system. For example, in the data employed in

model construction, 47 craniofacial-pain patients occupied Diagnostic

Alternative 15 and were treated with an application of drugs at least

once. Eight of these patients were 'well' after a first treatment with

drugs, while 39 required multiple applications of drugs or other treat-

ments during their stay in the system. Yet of the 12 patients who were

given two applications of drugs, 9 were 'well' following the second

repetition of drug therapy. Thus, while the overall data-based transi-

tion-probability estimate for a transition from Diagnostic Alternative

15 into the well state following any one application of drugs is .36,

the transition-probability estimate for a transition into the well state

following two successive applications of drugs is .75. Hence, for this

diagnostic classification, information on the prior application of drugs

is important in determining a patient's future status in the care system.
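
The arithmetic behind these estimates can be checked with a short sketch. The counts are those quoted above; the variable names are illustrative only. The reading of the .36 figure as (8 + 9)/47 is an assumption that happens to reproduce the quoted value, since the text does not state how that estimate was computed.

```python
# Counts quoted in the text for Diagnostic Alternative 15 treated with drugs.
treated_with_drugs = 47      # patients given drugs at least once
well_after_first = 8         # 'well' after the first drug application
given_two_applications = 12  # patients given two drug applications
well_after_second = 9        # 'well' after the second application

# Estimate conditioned on treatment history: two successive applications.
p_well_after_second = well_after_second / given_two_applications

# ASSUMPTION: the pooled .36 estimate is read here as all observed
# drug-to-well transitions divided by the 47 treated patients.
p_well_pooled = (well_after_first + well_after_second) / treated_with_drugs

# First-application estimate, shown for comparison (not quoted in the text).
p_well_after_first = well_after_first / treated_with_drugs

print(round(p_well_pooled, 2), round(p_well_after_second, 2))
```

Under this reading, conditioning on the prior drug application roughly doubles the estimated chance of reaching the well state, which is the motivation for carrying treatment history in the state description.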

This form of 'current diagnostic classification augmented by treat-

ment record' patient-state description is employed in the craniofacial-

pain treatment-planning model as an approximation to a 'true' Markovian

state structure. Each of the diagnostic alternatives shown in Figure 3

forms the basis for a collection of patient states. The diagnostic al-

ternative is augmented with a record of treatments that have been applied

since the patient entered the care system. Appendix D provides a list

of the treatment alternatives that may be prescribed for craniofacial-

pain patients. The record of each treatment given to the patient is noted

in the patient-state descriptions without regard to its chronological







order. For example, a patient's occupation of the state 'J|1,2,2'

denotes that he is currently classified in diagnostic alternative J,

and that since he entered the care system he has been treated with one

application of treatment 1 and two applications of treatment 2.

Augmenting the patient-state descriptions with treatment history

expands the dimensionality of the state space, yet the number of history-

augmented states remains finite for two reasons. The treatment records

used in model construction reveal that, for some combinations of diagnostic
alternatives and treatment applications, there is a feasible

limit to the number of treatment repetitions that can be given to any

one patient. Thus, the first reason for a finite state space is that no

patient state in the treatment-planning model includes more repetitions

of a particular treatment than the clinical data have established as a

feasible limit. As an example, the records of patient visits used in

model construction establish a feasible limit of only one application

of treatment 18 for patients classified in any of the diagnostic alter-

natives. Therefore, the treatment-planning model includes patient states

that exclude treatment 18 as a portion of their treatment history or

exhibit the form

'J|...,18,...'

for each diagnostic classification 'J' where 18 is a feasible treatment.

The second reason for a finite state space is that there is a 'boundary

application' of many treatments such that neither the treatment-record

data nor the reviewing practitioners established differences between the

transition probabilities for the boundary application and those for

further repetitions of the treatments (see Section 4.1.2 and Appendix E).

In Diagnostic Alternative 13, for example, the first application of treat-







ment 24 is the boundary repetition of that treatment. Hence, multiple

repetitions of treatment 24 are not added to the state description of

patient states based on Diagnostic Alternative 13, as the additional

information on multiple applications does not influence transition pro-

babilities associated with this treatment's effectiveness. Thus, a

second application of treatment 24 for a patient who continues to be

classified in Diagnostic Alternative 13 places the patient in a state

of the form

'13|...,24,...'.

The craniofacial-pain treatment-planning model includes two terminal

patient states in addition to the patient states that are based on diag-

nostic alternatives. One or the other of these two terminal states,

'well' or 'referred,' represents the patient's status when he exits the

care system. A patient exits the system in the 'well' state when the

effects of treatment applications result in sufficient improvement so

that no further treatment is required. The patient moves into the 're-

ferred' state in lieu of further treatment. This alternative to treat-

ment is selected when the 'expected costs' of remaining in the care sys-

tem exceed the costs of referring the patient to another source of care

(see Section 4.1.3).


4.1.2 Transition Probabilities

Patient-state transitions that involve a change of diagnostic classification
follow one of two basic formats (see Figure 5). For the initial

diagnostic classifications in Format I, with each treatment application,

the patient either remains in his original diagnostic classification or

he transits into the well state. For Format II, the six diagnostic alternatives
shown in the lower illustration form a different structure.








Format I

Patients whose first-visit diagnostic classification is Diagnostic
Alternative 1, 2, 3, 4, 5, 6, 10, 11, 14, 16, or 17 make transitions out
of their original classification 'I' according to the following figure:

[Transition diagram not reproduced in this text]

Format II

For patients originally classified in Diagnostic Alternative 7, 8, 9, 12,
13, or 15, the following kinds of diagnostic-classification transitions
are possible:

[Transition diagram not reproduced in this text]

FIGURE 5

DIAGNOSTIC-CLASSIFICATION TRANSITIONS







Here it is possible for the patient to alternate between any one of

several diagnostic classifications during the course of his stay in the

care system. Note that in both formats for diagnostic-classification

transitions a patient moves into the referred state not as a result of

a treatment application, but rather as an alternative to further treat-

ment.

To these underlying diagnostic-classification transitions the cranio-

facial-pain treatment-planning model adds a record of the changes in

treatment history. Appendix F displays complete charts of all of the

diagnostic-alternative-based patient states included in the treatment-

selection model. In these charts the patient states are connected by

arcs that represent feasible transitions from one state to another. Not

shown in the charts are the well and referred patient states and the arcs

that connect every diagnostic-alternative-based state with these terminal

states.

Howard [25] establishes that in terms of the policy decisions generated
by a Markovian decision model, holding-time distributions are important
only insofar as they affect the mean waiting time in each system
state and the expected costs of each state occupancy. The records

of the patient visits employed in model construction revealed that, in

the care of the patients described by the data, one or more treatments

were prescribed at each visit, and a series of return visits was scheduled

for the patient following his initial interaction with the practitioner

if return visits were warranted. Under these conditions, specifying

holding-time distributions for the time between successive patient-state

transitions does not refine the model. Therefore, the treatment-planning

model employs a Markovian rather than semi-Markovian representation of







the care system, since an 'n'-visit holding time in a particular patient
state can be modeled with no loss of information as 'n' repetitions of
the 'virtual' transition from the state in question to itself. Care for

craniofacial-pain patients is modeled as a discrete-stage Markovian sys-

tem with the beginning of visits to the practitioner serving as stage

indicators.

Using the history-augmented patient states, transition probabilities

are specified in terms of the treatment that generated the transformation.

In making a state-transition following a treatment, a patient must move

to a state that includes that treatment as a portion of its state description.
For example, following application of treatment 'k,' a patient
must progress from patient-state 'I|m,n' to 'J|k,m,n' where 'I' may be
equivalent to 'J.' The only exception to this rule is in the application
of a treatment beyond its boundary number of repetitions. Here, if treatment
'k' has a boundary number of two, then following an application of
treatment 'k' three or more times a patient progresses from patient state
'I|k,k,m,n' to 'J|k,k,m,n' where again 'I' may be equivalent to 'J.'

This structure is indicated because inclusion of more than the boundary

number of applications (two in this case) in the state description does

not affect the transition probabilities.
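
The state-update rule described above can be illustrated with a minimal sketch. The function names, the tuple layout, and the dictionary of boundary numbers are hypothetical; in particular, keying the boundary number by treatment alone is a simplification, since the text ties boundary applications to the diagnostic alternative as well.

```python
from typing import Dict, Tuple

# A patient state: (diagnostic alternative, treatment record). The record is
# kept sorted because chronological order is disregarded in the description.
State = Tuple[str, Tuple[int, ...]]


def apply_treatment(state: State, k: int, next_dx: str,
                    boundary: Dict[int, int]) -> State:
    """Return the state reached after treatment k, capping k's repetitions.

    ASSUMPTION: `boundary` maps a treatment to its boundary number of
    repetitions; treatments without an entry are treated as uncapped.
    """
    _, record = state
    cap = boundary.get(k)
    if cap is None or record.count(k) < cap:
        record = tuple(sorted(record + (k,)))
    return (next_dx, record)


def label(state: State) -> str:
    """Format a state in the 'J|1,2,2' style used in the text."""
    dx, record = state
    return f"{dx}|{','.join(str(t) for t in record)}"


# Treatment 2 with a boundary number of two (illustrative values).
first = apply_treatment(("I", (5, 7)), 2, "J", {2: 2})
capped = apply_treatment(("I", (2, 2, 5, 7)), 2, "J", {2: 2})
print(label(first), label(capped))
```

In the second call the record already holds two applications of treatment 2, so a third application changes only the diagnostic alternative and leaves the treatment record unchanged.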

Estimates of the values of the transition probabilities were ob-

tained from the patient records discussed previously. A discussion of

the stability of these probability estimates under variations in patient

data is presented in Appendix E. Where the data on the effects of treat-

ment alternatives were limited, the data-generated probability estimates

were refined by estimates from the reviewing practitioners. Notationally,

transition probabilities are represented in the analytic model in the







following form:

$p_{IJ}^{k}$ = the probability of making a transition from
patient-state 'I' to patient-state 'J' following
the application of treatment-alternative 'k.'


4.1.3 Cost Structure

A patient's progression through the craniofacial-pain system generates
a multitude of implicit and explicit costs. The explicit costs can

be measured in terms of the dollar charges paid by the patient or the

practitioner during the patient's stay in the system. Other costs are

implicit in nature and can be quantified only as they relate to the

'opportunities' lost by the patient and the practitioner while the patient
remains in the care system. For modeling purposes four major

system costs have been isolated. These costs are:

(a) Cost of treatment applications

(b) Cost of the practitioner and his staff's

services

(c) Cost to the patient of occupying a non-well

patient-state

(d) Patient-referral cost.

Although these costs do not encompass all of the system costs, they mea-

sure significant explicit and implicit charges associated with a patient's

stay in this system. In the treatment-planning model, each of these costs

is charged on a per-patient-visit basis.

Costs of the various treatment applications and the costs associated

with the practitioner and his staff's services were estimated by the re-

viewing practitioners. Estimates of treatment and care-system service

costs were partitioned by diagnostic classification as well as treatment









category. The cost estimates reflect typical charges in a dental clinic

environment.

The inconvenience experienced by a patient in making a visit to the

practitioner was used as a measure of the cost of occupying a 'non-well'

patient state. Estimates of this inconvenience cost were gathered from

responses to a questionnaire completed by patients at the University of

Florida's Dental Clinic. These were general dental patients not neces-

sarily suffering from craniofacial pain. Figure 6 shows the distribution

of these patient estimates.

Values for patient-referral costs were composed of the sum of three

distinct estimates. The first component was an estimate of the total

fee charged by the practitioner receiving the referred craniofacial-pain

patient. Record transferral and duplication costs, as well as the fees

lost by the referring practitioner, formed the second component. The

third component of the patient-referral cost is a measure of the incon-

venience experienced by the referred patient, a value estimated by using

a multiple of the value of the inconvenience cost discussed in the pre-

ceding paragraph. Appendix G provides a justification for using this

particular combination of components in the referred-cost estimates.

Symbolically, the patient-state transition costs (negative constants)
are represented in the analytical model as

$c_{IJ}^{k}$ = the sum of the costs generated by the transition
from patient-state 'I' to patient-state 'J'
following the application of treatment 'k.'

This sum includes the type (a), (b), (c), and (d) costs appropriate to

each patient-state transition.







Fifty-eight patients at the University of Florida's Dental Clinic responded
to the following question:

How much would you estimate that this trip to the
Dental Clinic cost you in terms of lost wages, baby-
sitting fees, transportation costs, and other costs
that you may have had to pay so that you could
be here for your appointment?

The distribution of these estimates is shown in this histogram.

[Histogram: Number of Respondents (vertical axis) by Dollars (horizontal axis),
with intervals 0.-.99, 1.-9., 10.-19., 20.-29., 30.-39., 40.-49., 50.-59.,
60.-69., 70.-79., and 80.-300.; bar heights not reproduced in this text]

The mean value for these 58 estimates of patient-visit inconvenience costs
was $30.72.

FIGURE 6

PATIENT-VISIT INCONVENIENCE COST








4.2 Selection of Optimal Treatments

The craniofacial-pain treatment-planning model is transient in the

sense that only two of the model's patient states, well and referred, can

represent the patient's status when he exits the health-care system. In

a stochastic sense, only the terminal states are recurrent as they alone

possess non-zero long-run probabilities of state occupancy. Hence, the

choice of treatment alternatives at each patient state is made with the

goal of minimizing the costs accrued by the patient as he passes through

the diagnostic-alternative-based patient states into one of the recurrent

states.

For notational convenience, in the analytic model the well patient

state is denoted as state 'W' and the referred state as state 'R.' In

modeling the care system for craniofacial-pain patients there is no

justification for providing costs for the transitions from states 'R'

and 'W' to themselves; hence, $c_{R,R}$ and $c_{W,W}$ are set equal to zero.
Analytically, the treatment-planning model is made monodesmic, i.e.,
having only one recurring state, by defining $p_{R,W} = 1$ and $p_{W,R} = 0$. The

total number of states, not including states 'W' and 'R,' is denoted by

'S.' With these definitions and the notation introduced in the previous

section, a procedure for selecting the set of optimal treatment decisions

is developed.

Howard [25] has shown that for a monodesmic, transient Markovian

decision model, a set of optimal decisions is defined as those decisions

that maximize the expected value $v_I$ of occupying each system-state 'I.'

Since the treatment-planning model for craniofacial-pain patients fits

into this category of decision model, a modification of Howard's algorithm

is employed to select optimal treatment regimes. The process of selecting
an optimal set of treatments is accomplished by finding the set of
treatment alternatives $k_1, k_2, \ldots, k_S$ that maximize each of the $v_I^{k_I}$ (the
expected value of occupying patient-state 'I' given treatment alternative
'$k_I$') where

$$v_I^{k_I} = r_I^{k_I} + \sum_{\text{all patient states } J} p_{IJ}^{k_I} v_J, \qquad I = 1, 2, \ldots, S$$

and

$$r_I^{k_I} = \sum_{\text{all patient states } J} p_{IJ}^{k_I} c_{IJ}^{k_I}.$$

With treatment-augmented patient states, maximizing the $v_I^{k_I}$ can be

carried out in the following manner:

1. Group for simultaneous analysis all patient states possessing

a common treatment history, where one or more of the treatments in this

history are at their boundary level. Each of the 'T' sets of states

complying with this description forms an analysis set $B_j$, $j = 1, 2, \ldots, T$.

2. Label sequentially the patient states, starting with state W

as 1, state R as 2, and then selecting numbers for the remaining unlabeled

patient states on the basis that the one with the most treatments in its

history receives the next number-label. For example, state 'J|1,2,2,4'
would be labeled with a smaller number than state 'J|2,6,6.' When the

numbering scheme reaches the members of one of the analysis sets isolated

in Step 1 (above), numbers for the members of that set may be arbitrarily

assigned. Given this state numbering scheme, the selection of optimal

treatments can proceed dynamically since for each state I that is not a

member of an analysis set, $I = 1, 2, \ldots, S$, $I \notin B_j$, $j = 1, 2, \ldots, T$,

$$v_I = r_I + \sum_{J=1}^{I} p_{IJ} v_J$$

and for the states of set $B_j$, $j = 1, 2, \ldots, T$,

$$v_I = r_I + \sum_{J \in B_j} p_{IJ} v_J + \sum_{J=1}^{t} p_{IJ} v_J, \qquad I \in B_j$$

where t = the number of the last non-group $B_j$ state immediately
preceding the smallest number-labeled state in $B_j$.

Thus, the process of selecting optimal treatments proceeds recur-

sively from the state of smallest number-label to the one of largest

number-label, stopping to consider simultaneously the values of a number

of states only when an analysis set is encountered.
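
The recursion for states outside the analysis sets can be sketched as follows. This is a minimal illustration under stated assumptions, not the program used in this research: the function name, the data layout, the toy labels and numbers, and the choice of a zero value for both terminal states are all hypothetical. Analysis sets, which call for Howard's iterative procedure, are not handled here.

```python
from typing import Dict, List, Tuple

# (next state label, transition probability, transition cost <= 0)
Transition = Tuple[int, float, float]


def solve_by_label_order(model: Dict[int, Dict[str, List[Transition]]],
                         terminal_values: Dict[int, float]):
    """Pick, for each state in increasing label order, the alternative that
    maximizes expected value. Every successor must carry a smaller label,
    so its value is already known when the state is reached."""
    values = dict(terminal_values)
    policy: Dict[int, str] = {}
    for state in sorted(model):
        best_k, best_v = None, float("-inf")
        for k, transitions in model[state].items():
            # r_I^k + sum_J p_IJ^k v_J, written as sum_J p_IJ^k (c_IJ^k + v_J)
            v = sum(p * (c + values[j]) for j, p, c in transitions)
            if v > best_v:
                best_k, best_v = k, v
        values[state] = best_v
        policy[state] = best_k
    return values, policy


# Toy labels (hypothetical): 1 = W, 2 = R, 3 = 'J|1,2', 4 = 'J|1'.
toy_model = {
    3: {"refer": [(2, 1.0, -100.0)], "treat_3": [(1, 1.0, -120.0)]},
    4: {"treat_2": [(1, 0.75, -40.0), (3, 0.25, -40.0)],
        "refer": [(2, 1.0, -100.0)]},
}
# ASSUMPTION: both terminal states carry zero value.
values, policy = solve_by_label_order(toy_model, {1: 0.0, 2: 0.0})
print(values[4], policy)
```

Because the state with the longer treatment history (label 3) is valued first, the value of state 4 is obtained in a single pass without iteration.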

Howard's value iteration and policy improvement algorithm [25] is

employed only in the case of selecting treatments for the analysis-set

patient states. An example of this section's labeling and optimization

procedure is presented in Appendix H.

This optimization procedure was applied to the states of the cranio-

facial-pain treatment-planning model. Appendix G presents a list of the

optimal treatment selections for each of the model's patient states.

4.3 Model Validation

Validation of the craniofacial-pain treatment-planning model was

accomplished in two phases. In the first phase of validation, the indi-

vidual components of the Markovian representation were examined by the
reviewing practitioners. The second phase of model validation compared

model-generated treatment decisions with those made by the reviewing ex-

perts. In addition, statistics generated by the model were compared to

the care-system description provided by the patient records from the

university dental clinics. This section discusses the results of these

validating efforts.







The review of model components was accomplished as values for the

model parameters were collected. Some of the data-based estimates of

transition probabilities and boundary-level application numbers did not

conform to expert judgment about the effects and effectiveness of vari-

ous treatment applications. When these disparities occurred, the esti-

mates were modified to reflect expert judgment.

The general structure of the patient states was reviewed to insure

that the representation shown in Appendix F did in fact portray a set of

logical progressions through the care system. Although this examination

established the validity of the patient progressions, the review did

point out one deficiency in the model's structure. The number and types

of treatment alternatives available for use at each patient state were

determined by records of actual applications of these treatments in the

data used for model construction. It was the judgment of the reviewing

practitioners that in several cases the selection of treatment alternatives
for a patient state did not include the 'most appropriate' treatment
alternative. Nevertheless, this model deficiency can readily be corrected.
With the collection of data on the effects of these 'most appropriate'

treatments, these additional treatment alternatives can be incorporated

as decision alternatives for the patient states in question.

The reviewing practitioners made selections of treatments for each

of the model's patient states. In those cases where the model's treat-

ment alternatives did not include the practitioners' 'most appropriate'

choice of treatments, the practitioners made a selection from the same

list of alternatives used by the model. Appendix G lists their choices

of treatment along with each model-generated selection. The two sets of

treatment plans include the same treatment selection for 87 out of 94







patient states, or 92.6% of the patient states. The 7 differences in

treatment selections arise in part from the approximations the treatment-

planning model employs in its representation of the care system and in

part from slight inconsistencies in the practitioners' treatment selections.

One last test was performed to verify the suitability of the Mark-

ovian representation of the craniofacial-pain care system. Mean transit

times through the care system to one of the terminal states were calcu-

lated using the model-generated treatment decisions, and each of six

first-visit patient states. These model-generated transit times were

compared to estimates of the same statistics gathered from the patient

records contributed by the university dental clinics. Table 5 presents

the values of both sets of statistics. The close correlation of these

values reveals that the treatment-planning model not only duplicates the

decisions of experts, but also provides a structure for gathering other

relevant information about the underlying care system.
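
A mean transit time of this kind can be computed from the transition probabilities among the non-terminal states alone. The sketch below is a standard absorbing-chain calculation offered as an illustration; the text does not state how the model-generated values were obtained, and the function name, state names, and probabilities are hypothetical.

```python
from typing import Dict


def mean_visits(Q: Dict[str, Dict[str, float]],
                tol: float = 1e-12) -> Dict[str, float]:
    """Expected number of visits spent in non-terminal states.

    Q[s][j] is the probability of moving from non-terminal state s to
    non-terminal state j under the chosen treatments; the remaining
    probability mass leads to a terminal state. Solves t = 1 + Q t by
    repeated substitution.
    """
    t = {s: 0.0 for s in Q}
    while True:
        new_t = {s: 1.0 + sum(p * t[j] for j, p in Q[s].items()) for s in Q}
        if max(abs(new_t[s] - t[s]) for s in Q) < tol:
            return new_t
        t = new_t


# Toy chain (hypothetical numbers). The self-transitions play the role of
# the 'virtual' transitions that model multi-visit holding times.
toy_Q = {
    "A": {"A": 0.25, "B": 0.25},  # leaves to a terminal state w.p. 0.5
    "B": {"B": 0.5},              # leaves to a terminal state w.p. 0.5
}
times = mean_visits(toy_Q)
print(round(times["A"], 6), round(times["B"], 6))
```

For this toy chain both starting states yield a mean of two visits, which can be confirmed by hand from t_B = 1 + 0.5 t_B and t_A = 1 + 0.25 t_A + 0.25 t_B.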


4.4 Model Applications

Like the diagnostic-classification model presented in Chapter 3, the

craniofacial-pain treatment-planning model has been structured to permit

its utilization in a variety of applications. Markovian modeling provides

an analytic representation of the craniofacial-pain care system as well

as establishing a means of making treatment selections. This section dis-

cusses applications of the model's analytic representation and treatment

selections in teaching, in research, and in practice.

The model-generated treatment decisions reveal which treatments are

most frequently used in the care of craniofacial-pain patients. In a

teaching environment, this information can be used to specify treatment-









TABLE 5

MEAN TRANSIT TIMES THROUGH THE CRANIOFACIAL-PAIN CARE SYSTEM

For a Patient Whose First             Model-       Truncated    Patient-
Diagnostic Classification Was         Generated    Model        Record
                                      Estimate*    Estimate+    Estimate#

Myopathy-Myositis                     1.50         1.34         1.35

Oral Pathology-Dental Pathology       1.11         1.04         1.08

Vascular Changes-
Migrainous Vascular Changes           3.89         3.42         3.06

Myofacial Pain Dysfunction-
Uneven Centric Stops                  1.86         1.43         1.50

Myofacial Pain Dysfunction-
Anxiety/Depression                    3.87         3.47         3.18

Myofacial Pain Dysfunction-
Reflex Protective Muscular
Contracture                           1.90         1.79         1.87

The values in these sets of estimates are specified in terms of the
number of patient visits in which the patient occupies a non-well or
non-referred patient state.

* Note: The treatment-planning model considers the possibility of
  'infinite duration' occupancy of non-well or non-referred states.

+ These truncated estimates were generated from the treatment-planning
  model on the conditional basis that a patient must transit into either
  the well or the referred state by his fifth patient visit.

# The maximum number of visits for any patient described by the clinical
  data was five patient visits.








application techniques that should be emphasized in training dental stu-

dents in craniofacial-pain care. Moreover, the parameters employed in

model development, in particular the transition probabilities and refer-

ral costs, are themselves valuable instructional materials in developing

the dental student's treatment-selection skills.

The treatment-planning model provides a method for evaluating new

developments in treatment for craniofacial-pain patients. With estimates

of the effectiveness of his new treatment, the researcher can use the

craniofacial-pain treatment-planning model to get two immediate responses.

First, the optimization technique of Section 4.2 will determine if this

new treatment provides 'better care' for the patient than any of the

other treatment alternatives the model has to choose fram. Second, if

optimal treatment selections for the model include the new treatment, the

model's statistics will show improvement in length of stay, and other

relevant measures of treatment effectiveness, introduced by using this

new treatment.

In the office of the practicing dentist, the treatment-planning mod-

el's decisions could provide a concise reference of the treatment selec-

tions suggested by experts in the field of craniofacial pain. Moreover,

the practitioner would have a chance to contribute to the refinement of

the listing as the treatment records of his patients could supplement

the data used in model construction. In addition, the practitioner could

employ the statistics associated with the treatment-planning model in

scheduling the length, and number, of his appointments for craniofacial-

pain patients.















CHAPTER 5

CONCLUSIONS AND FUTURE RESEARCH


This dissertation has presented analytic models of the decision pro-

cesses associated with diagnosing and selecting treatments for a partic-

ular health-care problem. The selection, construction, and testing of

these models have been discussed in some detail. Meanwhile, the model

building effort itself has been the source of a number of insights into

decision-making in a health-care environment. These insights will be

reflected in this chapter's discussion of the dissertation's central research
conclusion and suggestions of topics for future investigation.

The similarity between the decision-making processes employed by

the practitioner and the analytic structure of this dissertation's models

is quite revealing. In both diagnosis and treatment planning for cranio-

facial-pain patients it appears that the practitioner, like the analytic

models, makes 'first-order' decisions. The linearity of symptom signifi-

cance (a first-order polynomial of symptom weights), and the present-

patient-state dependency of transition probabilities measuring treatment

effectiveness (a first-order stochastic dependence) provide a means of

generating decisions that closely approximate the decisions made by dental

practitioners. This general conclusion on the applicability of first-

order decision techniques to craniofacial-pain diagnostic classification

and treatment planning characterizes the central development of this

dissertation.







Given this summary statement, there are several logical extensions

to this dissertation's research that should be examined in future inves-

tigations. The following suggestions identify some of the more fruitful

areas for further research efforts. These suggestions are ordered in

the author's view of their significance.

1. This dissertation's research found that first-order decision-

making models are valid descriptions of the underlying thought processes

employed by the craniofacial-pain practitioner. It is possible that these

first-order descriptive decisions are 'suboptimal' and that higher order

decision-making tools might yield prescriptive, or 'optimal,' diagnostic

classifications and treatment plans for craniofacial-pain patients. That

is, considering the interaction between significant symptoms and multiple-

state dependency for patient-state transitions may lead to optimal diagnostic
and treatment-selection decisions. As the models themselves can

readily be increased in their decision-making 'order,' an investigation

into this possibility would be hampered only by the necessity of collect-

ing an elaborate data base. Nevertheless, such an investigation should

be undertaken in this, the most significant, of future research areas.

2. As this dissertation's analytic models can be applied directly

to any health-care problem where there is verification that practitioners

make first-order decisions, one potential avenue of future research would

be to isolate those health-environments where these kinds of decisions

are made. However, a word of caution is interjected at this point. Math-

ematical modeling demands an underlying structure for the process being

modeled. Yet, in a process dealing with a product that is subject to

considerable variation, such as the care of a patient in a health-care

system, isolating an underlying process structure is difficult. Moreover,








the problem of finding process structure is compounded in the health-care

field by a lack of unifying and consistent nomenclature. In the health-

care field, scholarly literature and historical precedent can serve as

the justification for two or more contradicting sets of terminology for

the same anatomical structure or physiological process. Thus, in re-

searching the generality of first-order decision-making techniques, the

investigator must consider process variability and nomenclature inconsistency
before he makes any statement about the applicability of this

dissertation's decision-making tools to other health-care environments.

3. A non-geometric discussion of the criteria for pattern space

separability was presented to provide a means of characterizing health-

care disorders for which diagnostic classification by a linear pattern

classifier might be feasible. Unfortunately, this dissertation's tech-

niques are heuristic and do not provide an exact reproduction of the

underlying mathematical specifications. Future research in this area

could lead to a precise statement of non-geometric criteria for linear

separability, and thus provide an indirect means for evaluating potential

applications of linear non-parametric classifiers.

4. This dissertation's minimum-cost symptom-selection algorithm

represents a clear departure from previous research in feature selection.

The algorithm's utilization of the convex-hull representation of pattern

space separability makes this development unique in the literature of

feature selection. However, the algorithm's method of checking the fea-

sibility of potential feature collections is extremely tedious. A more

efficient method to check feature-collection feasibility may be revealed

through future investigations in this area.









5. From a mathematical-programming point of view, the symptom-selection
algorithm represents one of a limited number of techniques

capable of solving a problem with non-linear constraints. The algorithm

seeks an optimal assignment of components, where the feasibility of any

assignment is determined by the existence of a set of discriminating com-

ponent multipliers. In this more general context, the structure of the

algorithm may be applicable in a variety of problem areas not directly

related to the feature-selection problem. The possibility of employing

the algorithm in this general setting should be investigated.

6. In modeling the treatment-planning process for craniofacial-pain

patients the concept of boundary-level treatment applications was intro-

duced. Boundary numbers on the effects of repeated treatment applications

are likely to occur in data derived from the care of patients with a va-

riety of physiological disorders. Further investigations of this phenom-

enon may result in more effective methods of predicting which treatments

will have boundary-level application numbers, and more efficient statis-

tical techniques to determine values for these numbers.

7. The training algorithm developed in the construction of the

craniofacial-pain diagnostic classifier generates a feasible integer so-

lution to a large number of linear constraints. This algorithm is both

efficient and easily coded for computer applications. An investigation

of the uses of this algorithm in a mathematical-programming setting may

reveal applications in solution techniques for more general integer pro-

grams.

8. Potential applications have been suggested for the diagnostic-

classification and treatment-planning models in teaching, in research,

and in practice. The models and their applications have been presented so



that they might readily be employed by some future investigator. Actual

applications of the models should yield significant contributions to

the effectiveness of the teacher, researcher, and practitioner.














APPENDIX A

CRANIOFACIAL-PAIN PATIENT DATA VECTOR


Referral Through               001 Medical GP          002 Medical Specialist
                               003 Dental GP           004 Dental Specialist

Sex                            005 Male
                               006 Female
                               007 Female, menopausal or post menopausal

Age Group                      008 0 - 19              009 20 - 39
                               010 40 - 55             011 56 - up

Duration of Pain               012 Less than 3 weeks
                               013 From 3 to 6 weeks
                               014 More than 6 weeks
                               015 Episodic

Character of Pain              016 Aching              017 Burning
                               018 Cutting             019 Discomfort
                               020 Dull                021 Pressure
                               022 Pricking            023 Sharp
                               024 Soreness            025 Stinging
                               026 Tenderness          027 Throbbing

Change in Character of Pain    028 Constantly getting worse
                               029 Got worse, then plateaued
                               030 Got worse, plateaued, then better
                               031 Getting better
                               032 Intermittent periods without pain
                               033 No change since beginning







List of Drugs Taken            034 Mild Analgesics: Aspirin, APC, etc.
                               035 Moderate Analgesics (non-narcotic)
                               036 Strong Analgesics: Narcotics and
                                   Synthetic Narcotics
                               037 Anti-anxiety Agents: Mellaril, etc.
                               038 Anti-arthritic Agents: Steroids, etc.
                               039 Anti-depressives: Tofranil, etc.
                               040 Birth Control Pills
                               041 Hormone Preparations
                               042 Anti-inflammatory Agents
                               043 Muscle Relaxants: Valium
                               044 Muscle Relaxants: Meprobamate
                               045 Muscle Relaxants: Others
                               046 Sedatives: Barbiturates, etc.
                               047 Other Drugs

History of Trauma              048 Accidental
                               049 Factitial
                               050 Surgical


Location of Swelling           [diagram of numbered facial regions, left side and right side]

Location of Tenderness         [diagram of numbered facial regions, left side and right side]

Location of Pain               [diagram of numbered facial regions, left side and right side]

Limited Jaw Opening            243 Yes

Joint Sounds                   244 Clicking
                               245 Crepitation
                               246 Pain accompanying joint sound

Headaches                      247 Frequent headaches
                               248 Headache associated with joint pain





Changes in                     249 Taste
                               250 Hearing
                               251 Visual acuity
                               252 Perception of light touch on face

Upper Respiratory Infection    253 In conjunction with beginning of TMJ pain

Evidence of                    254 Arthritis
                               255 Every's Syndrome
                               256 Neuropathy
                               257 Otitis
                               258 Salivary gland disease
                               259 Sinusitis
                               260 Strokes
                               261 Vascular disease

Facets                         262 1 - 3               263 4 - up

Lateral Slide Prematurities    264 On working side
                               265 On balancing side

Tooth Ache                     266 Yes

Biting Stress Tooth Mobility   267 Yes

Recent Restorative or Dental Prosthesis    268 Yes

Jaw Deviates on Opening        269 Left                270 Right

Impingement of Coronoid Process on Zygomatic Arch    271 Left    272 Right

Meniscus-Condyle Dyscoordination    273 Left            274 Right

Radiographic Examination       275 Mandibular condyle apposition
                                   (such as spur formation)
                               276 Mandibular condyle resorption
                                   (such as flattening of anterior-
                                   superior surface or irregular surface)










Radiographic Examination       277 Fossa apposition
                               278 Fossa resorption
                               279 Articular eminence apposition
                               280 Articular eminence resorption
                               281 Evidence of fracture
                               282 Clinical or radiographic evidence of pathoses

Emotional Trauma               283 Anxiety             284 Depression

Bruxism or Clenching           285 Yes

Uneven Centric Stops           286 Yes

History of Lengthy Dental Procedures    287 Yes

History of General Anesthesia  288 Yes

Tinnitus                       289 Yes

Extraction of Teeth            290 Less than 6 weeks prior to TMJ pain
                               291 Leaving a space that permits extrusion

Preauricular Pain              292 Yes

Alteration of Inter-Occlusal or Inter-Arch Space    293 Yes

Paresthesia                    294 Yes

Luxation or Subluxation        295 Yes













APPENDIX B

MODIFIED FIXED-INCREMENT TRAINING ALGORITHM


In presenting the modified fixed-increment training algorithm the

following notation is employed:

p = the number of classification categories

t = the number of training-sample row vectors

$a_j^{(k)}$ = training-sample row vector number 'k' preclassified

in category 'j', j=1,2,...,p, k=1,2,...,t, and

k = i [mod t], where 'i' is the index of the training-

algorithm iteration

$W_j^{(i)}$ = the 'j-th' column of weights (the constraints in the

'j-th' discriminant function) used in the 'i-th'

iteration of the training algorithm, j=1,2,...,p

$\alpha$ = non-negative constant specified by the analyst

to adjust the size of the 'dead zone' [23] in discriminant-

function values, i.e., $\alpha \ge 0$

$\beta$ = positive constant specified by the analyst to adjust

the scale of the weight vectors, i.e., $\beta > 0$.

Using this notation, let $a_j^{(k)}$ be the i-th pattern examined by the

algorithm, then

case 1: if $a_j^{(k)} W_j^{(i)} > a_j^{(k)} W_c^{(i)} + \alpha$ for all $c \ne j$,

let $W_c^{(i+1)} = W_c^{(i)}$ for all c.

case 2: if $a_j^{(k)} W_j^{(i)} \le a_j^{(k)} W_z^{(i)} + \alpha$ for a subset B of the

p discriminants, $z \in B$, $j \notin B$,

let $W_z^{(i+1)} = W_z^{(i)} - \beta\,[a_j^{(k)}]^T$ for all $z \in B$,

$W_c^{(i+1)} = W_c^{(i)}$ for all $c \notin \{B \cup j\}$,

and $W_j^{(i+1)} = W_j^{(i)} + n_B\,\beta\,[a_j^{(k)}]^T$, where $n_B$ = the number

of discriminants in the subset B.

The algorithm is terminated when the values of the $W_j$, j=1,2,...,p, have

not changed during a complete cycle of the t training patterns, i.e.,

when $W_j^{(\theta+1)} = W_j^{(\theta+2)} = \dots = W_j^{(\theta+t)}$ for all j, where $\theta$ is the last case 2 pattern

examined by the algorithm.

This algorithm is guaranteed to terminate in a set of feasible

$W_j$, j=1,2,...,p, if the training sample is linearly separable and the

dead-zone constant $\alpha$ and the scale constant $\beta$ have been appropriately

selected. If the training sample is linearly separable, the algorithm

will converge for any fixed value of $\alpha \ge 0$, where $\beta$ is selected

appropriately large. Hence, the algorithm is normally applied to a

training sample with $\alpha = 0$ and $\beta = 1$. If the algorithm converges,

these constants can be adjusted and the training algorithm reapplied.

The justification for specifying a non-zero $\alpha$ ($\alpha$ = size of the

dead zone) is that as $\alpha$ is increased the accuracy of the classifier is

increased in making classifications of data not used in developing the

discriminant-function weights. For example, with the craniofacial-pain

diagnostic classifier and the test samples discussed in Section 3.3,

the diagnostic model correctly classified approximately 5% more of the

test samples' data vectors when the model was trained with $\alpha = 30$, $\beta = 3$

(versus an original training with $\alpha = 0$, $\beta = 1$).







Proof that the algorithm converges if feasible weight vectors

$W_j^*$, j=1,2,...,p, exist (that is, the sample space is linearly separable)

is developed in Nilsson [22]. Nilsson's proof can be directly applied

since for any set of feasible $W_j^*$

$a_j^{(k)} W_j^* > a_j^{(k)} W_z^* + \alpha$

for all k=1,2,...,t, and z=1,2,...,p, $z \ne j$, while for any $W_j^{(i)}$, j=1,2,...,p,

i=1,2,...,

$a_j^{(k)} W_j^{(i)} \le a_j^{(k)} W_z^{(i)}$

for some k and some z.

Typically, a training algorithm is applied to the members of a

training sample without prior knowledge of whether the sample pattern

space is linearly separable. The algorithm is allowed to process sample

patterns until it either converges on a set of discriminating hyperplanes

or it has run for a 'reasonable' amount of time without termination. Ex-

perience with medical data and the modified fixed-increment algorithm

has shown that if there is a set of discriminating hyperplanes, the

algorithm will find it in no more than 3 complete cycles for each of the

pattern classes. For example, if there are 5 pattern classes and the

pattern space can be linearly partitioned, the algorithm should terminate

in no more than 15 full cycles through the training data. This rough

measure of training time provides an index for establishing a limit on

computer processing time.

An application of the modified fixed-increment training algorithm

is presented in Figure 7.









Given the training sample of the form $a_i = [a_{i1}, a_{i2}, 1]$, where

    $a_1^{(1)} = [0,0,1]$

    $a_2^{(2)} = [1,0,1]$

    $a_3^{(3)} = [0,1,1]$

the training sample patterns can be represented in 3-dimensional space by

    [plot of the three sample patterns]

The modified fixed-increment algorithm with $\alpha = 0$ and $\beta = 1$

proceeds as follows:

(* indicates correct sample classification)

    Sample       W1           W2           W3          aW1  aW2  aW3

    [0,0,1]    [ 0, 0, 0]   [ 0, 0, 0]   [ 0, 0, 0]     0    0    0

    [1,0,1]    [ 0, 0, 2]   [ 0, 0,-1]   [ 0, 0,-1]     2   -1   -1

    [0,1,1]    [-1, 0, 1]   [ 2, 0, 1]   [-1, 0,-2]     1    1   -2

    [0,0,1]    [-1,-1, 0]   [ 2,-1, 0]   [-1, 2, 0]     0    0    0

    [1,0,1]    [-1,-1, 2]   [ 2,-1,-1]   [-1, 2,-1]     1    1   -2

   *[0,1,1]    [-2,-1, 1]   [ 3,-1, 0]   [-1, 2,-1]     0   -1    1

   *[0,0,1]    [-2,-1, 1]   [ 3,-1, 0]   [-1, 2,-1]     1    0   -1

   *[1,0,1]    [-2,-1, 1]   [ 3,-1, 0]   [-1, 2,-1]    -1    3   -2

Hence, the set of weights generated by this training sample is

    W1 = [-2,-1, 1]

    W2 = [ 3,-1, 0]

    W3 = [-1, 2,-1].

FIGURE 7

APPLICATION OF THE MODIFIED FIXED-INCREMENT ALGORITHM
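
The worked example of Figure 7 can be retraced with a short program. The following is only a sketch of the two update cases as they are read from this appendix, not the author's original code: the function name, the zero starting weights, the zero-based category numbers, and the default cycle limit (the "3 complete cycles for each of the pattern classes" rule of thumb quoted above) are assumptions made for illustration.

```python
def train_fixed_increment(samples, p, alpha=0, beta=1, max_cycles=None):
    """Sketch of the modified fixed-increment rule (hypothetical helper).

    samples: list of (pattern, category) pairs, category in 0..p-1,
             each pattern ending with the constant component 1.
    Returns (weights, converged); weights[j] is the weight vector W_j.
    """
    t = len(samples)
    n = len(samples[0][0])
    if max_cycles is None:
        max_cycles = 3 * p            # assumed limit: 3 cycles per class
    weights = [[0] * n for _ in range(p)]   # assumed zero starting weights
    unchanged = 0                     # consecutive case 1 patterns seen
    i = 0
    while unchanged < t and i < max_cycles * t:
        a, j = samples[i % t]
        d = [sum(x * w for x, w in zip(a, weights[c])) for c in range(p)]
        # Case 2 subset B: discriminants not beaten by more than alpha.
        B = [z for z in range(p) if z != j and d[j] <= d[z] + alpha]
        if B:
            for z in B:
                weights[z] = [w - beta * x for w, x in zip(weights[z], a)]
            weights[j] = [w + len(B) * beta * x
                          for w, x in zip(weights[j], a)]
            unchanged = 0
        else:                         # case 1: all weights carried forward
            unchanged += 1
        i += 1
    return weights, unchanged >= t


# The three training patterns of Figure 7, with categories renumbered 0, 1, 2.
figure7 = [([0, 0, 1], 0), ([1, 0, 1], 1), ([0, 1, 1], 2)]
W, converged = train_fixed_increment(figure7, p=3)
print(W, converged)
```

Run on the three patterns above with the dead-zone constant set to 0 and the scale constant set to 1, the sketch stops after eight pattern examinations, the last three of which leave the weights unchanged.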






Full Text
26
stated is that a knowledge of the underlying classifying process can
be employed in constructing the data vector examined by the classifier,
and that fully utilizing this information will lead to a classifier that
can be expected to be capable of performing well on new patient data.
Of course, this discussion has been predicated on the separability of
the underlying pattern space of data vectors. If this requirement is
not met by some form of patient-data-vector representation, classification
of patients by linear classifier is not possible.
The next section of this chapter provides relationships between
linear separability and the data that may be observed in a health-care
system for which diagnostic classification by linear discriminants is
being considered. This section has a dual purpose. First, linear
separability is couched in 'non-geometric' terms. Second, and more
importantly, using the craniofacial-pain health-care system as an example
of the section's developments provides information about the suitability
of the non-parametric classifier as a model of the decision-making
process associated with diagnostic classification in this care system.
3.2 Alternative Interpretations of Linear Separability
The criteria for pattern space separability are mathematically
concise. Unfortunately, these separability criteria are not readily
expressible in non-geometric terms. The discussion developed in this
section provides the reader with some non-geometric criteria that
indicate when the use of a non-parametric pattern classifier should be
considered as a means of generating diagnoses for a medical or dental
disorder.
The first criterion is associated with a probabilistic measure of
symptom exhibition. Given a patient who exhibits some set of symptoms


APPENDIX B
MODIFIED FIXED-INCREMENT TRAINING ALGORITHM
In presenting the modified fixed-increment training algorithm, the
following notation is employed:
p = the number of classification categories
t = the number of training-sample row vectors
$a_j^{(k)}$ = training sample row vector number 'k' preclassified
in category 'j', j=1,2,...,p, k=1,2,...,t, and
k = i [mod t], where 'i' is the index of the training-
algorithm iteration
$W_j^{(i)}$ = the 'j-th' column of weights (the constraints in the
'j-th' discriminant function) used in the 'i-th'
iteration of the training algorithm, j=1,2,...,p.
$\alpha$ = non-negative constant specified by the analyst
to adjust the size of the 'dead zone' [23] in discriminant-
function values, i.e., $\alpha \ge 0$
$\beta$ = positive constant specified by the analyst to adjust
the scale of the weight vectors, i.e., $\beta > 0$.
Using this notation, let $a_j^{(k)}$ be the i-th pattern examined by the
algorithm, then
case 1: if $a_j^{(k)} W_j^{(i)} > a_j^{(k)} W_c^{(i)} + \alpha$ for all $c \ne j$,
let $W_c^{(i+1)} = W_c^{(i)}$ for all c.
87


7
FIGURE 2
DIAGNOSTIC-CLASSIFICATION AND TREATMENT-PLANNING
PROCESS FOR CRANIOFACIAL PAIN


Dr. Daniel Laskin, University of Illinois; and Dr. David Mitchell,
University of Indiana, for providing access to the patient records
employed in this modeling effort.
The author would like to express his thanks to the secretarial staff
of the Health Systems Research Division for their translation of the
author's 'first-order' approximation to handwriting into a draft of this
manuscript. Their tolerance of a multitude of last minute changes made
by the author has been appreciated.
Finally, the author thanks his wife, Mary, and his parents, Dorothy
and Charles Leonard, for their encouragement and support throughout the
course of this research.
M.S.L.
August, 1973
iv


64
Format I
Patients whose first-visit diagnostic classification is Diagnostic
Alternative 1, 2, 3, 4, 5, 6, 10, 11, 14, 16, or 17, make transitions out
of their original classification 'I' according to the following figure:
Format II
For patients originally classified in Diagnostic Alternative 7, 8, 9, 12,
13, or 15, the following kinds of diagnostic-classification transitions
are possible:


113
Format 1: Referral Cost = [record transferral cost]
                        + [practitioner's lost fee]
                        + 2*[inconvenience cost associated with
                             a dental visit]
Format 2: Referral Cost = [fee paid to referral care system]
                        + 2*[inconvenience cost associated with
                             a dental visit]
Format 3: Referral Cost = [fee paid to referral care system]
                        + [practitioner's lost fee]
                        + [record transferral cost]
                        + 2*[inconvenience cost associated with
                             a dental visit]
where in all three formats the multiple of the inconvenience cost was
suggested by the fact that in the clinical records (Section 3.1) the
median number of visits to the referral care system was two visits. The
treatment-planning model was optimized with referral costs based on each
of these formats. Use of the Format 3 referral costs led to
model-generated treatment selections that most closely duplicated the selections
of the reviewing practitioners. Hence, this format for patient-referral
costs has been selected for utilization in the treatment-planning model.
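
The three referral-cost formats differ only in which cost components are summed, which a small function makes explicit. This is an illustrative sketch: the function name, the argument names, and the dollar figures in the usage lines are hypothetical and are not taken from the clinical records; only the structure of the three sums and the two-visit multiplier come from the text above.

```python
def referral_cost(fmt, fee_paid, lost_fee, record_transfer,
                  visit_inconvenience, visits=2):
    """Referral cost under Format 1, 2, or 3 (hypothetical helper).

    visits defaults to 2, the median number of visits to the
    referral care system reported in the text.
    """
    inconvenience = visits * visit_inconvenience
    if fmt == 1:
        return record_transfer + lost_fee + inconvenience
    if fmt == 2:
        return fee_paid + inconvenience
    if fmt == 3:
        return fee_paid + lost_fee + record_transfer + inconvenience
    raise ValueError("fmt must be 1, 2, or 3")


# Made-up component costs, for illustration only.
costs = dict(fee_paid=40.0, lost_fee=25.0, record_transfer=5.0,
             visit_inconvenience=10.0)
for fmt in (1, 2, 3):
    print(fmt, referral_cost(fmt, **costs))
```

Because Format 3 adds every component, it always yields the largest of the three costs for the same inputs, so it makes referral the least attractive alternative to continued treatment.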


APPENDIX D
TREATMENT ALTERNATIVES FOR CRANIOFACIAL-PAIN PATIENTS
Treatment Application
       Number            Treatments
         11              Chill Therapy
         12              Drug Therapy
         13              Fixation
         14              Heat Therapy
         15              Occlusal Adjustment
         16              Physical Therapy
         17              Prosthetics
         18              Tooth Extraction or Endodontics
         21              Drug Therapy and Fixation
         22              Drug Therapy and Heat Therapy
         23              Drug Therapy and Occlusal Adjustment
         24              Drug Therapy and Physical Therapy
         25              Drug Therapy and Prosthetics
         26              Heat Therapy and Physical Therapy
         27              Occlusal Adjustment and Physical Therapy
         28              Physical Therapy and Prosthetics
         31              Chill Therapy, Drug Therapy, and Physical Therapy
         32              Drug Therapy, Fixation, and Heat Therapy
         33              Drug Therapy, Fixation, and Physical Therapy
97


63
ment 24 is the boundary repetition of that treatment. Hence, multiple
repetitions of treatment 24 are not added to the state description of
patient states based on Diagnostic Alternative 13, as the additional
information on multiple applications does not influence transition
probabilities associated with this treatment's effectiveness. Thus, a
second application of treatment 24 for a patient who continues to be
classified in Diagnostic Alternative 13 places the patient in a state
of the form
1 "3T 24 1
XJ X / f
The craniofacial-pain treatment-planning model includes two terminal
patient states in addition to the patient states that are based on
diagnostic alternatives. One or the other of these two terminal states,
'well' or 'referred,' represents the patient's status when he exits the
care system. A patient exits the system in the 'well' state when the
effects of treatment applications result in sufficient improvement so
that no further treatment is required. The patient moves into the
'referred' state in lieu of further treatment. This alternative to
treatment is selected when the 'expected costs' of remaining in the care
system exceed the costs of referring the patient to another source of care
(see Section 4.1.3).
4.1.2 Transition Probabilities
Patient-state transitions that involve a change of diagnostic
classification follow one of two basic formats; see Figure 5. For the initial
diagnostic classifications in Format I, with each treatment application,
the patient either remains in his original diagnostic classification or
he transits into the well state. For Format II, the six diagnostic
alternatives shown in the lower illustration form a different structure.


85
Changes in
249 Taste
250 Hearing
251 Visual acuity
252 Perception of light
touch on face
Upper Respiratory Infection 253 In conjunction with beginning
of TMJ pain
Evidence of 254 Arthritis
255 Every's Syndrome
256 Neuropathy
257 Otitis
258 Salivary gland disease
259 Sinusitis
260 Strokes
261 Vascular disease
Facets 262 1-3 263 4 up
Lateral Slide Prematurities 264 On working side
265 On balancing side
Tooth Ache 266 Yes
Biting Stress Tooth Mobility 267 Yes
Recent Restorative or Dental Prosthesis 268 Yes
Jaw Deviates on Opening 269 Left 270 Right
Impingement of Coronoid Process 271 Left 272 Right
on Zygomatic Arch
Meniscus-Condyle Dyscoordination 273 Left 274 Right
Radiographic Examination 275 Mandibular condyle apposition
(such as spur formation)
276 Mandibular condyle resorption
(such as flattening of anterior-
superior surface or irregular surface)


116
*
vx = O
v2 = O
and Vj: find the that maximize
kI kI t D -kI vkJ
VI I + jf3 PIJ J
,1=3,4
v =
max
kc
k, 4 k *
^ +j3PW Vj
V, =
max
jf
and Vgt find the k^ that maximize
h
k 8 k k 6
V = ri +Jf7PU VJ +j!3puvj
*1
,1=7,8
v =
max
kn
k9 8 k *
r9 + VJ
1 ~
p99
V10
max
"10
-10
'10
8 kio *
+ j=3Pl0J Vj
1 P101010


19
classification is made on the basis of specifying that etiological
factor that requires most immediate action on the part of the attending
practitioner. Thus, diagnostic classification of a patient into
diagnostic alternative 'A' signals that the etiology specified by that
alternative should determine the course of the patient's care.
The next step in model development isolated relevant data which
measured the physiological and psychological status of craniofacial-pain
patients. In particular, this step of model development sought those
elements of patient status that practitioners employ in their own
classification of craniofacial-pain patients. Appendix A presents a list of
these data elements. Wherever it was feasible, measures of patient
status were segmented to amplify the significance of particular readings
of each measure. Thus, for example, while the duration of a patient's
pain is a continuous measure of his status, it is important for the
purposes of classification to know whether a craniofacial-pain patient's
duration of pain is less than 3 weeks, from 3 to 6 weeks, or longer than
6 weeks. For this measure of patient status, a short history of pain
indicates a strong possibility of a recent traumatic injury while pain
over a long period is more likely associated with long standing arthritic
or psychic disorders.
To facilitate the development of an analytic model of the diagnostic-
classification process, a vector representation of the relevant elements
of patient data has been developed. The vector permits the notation of
any of the data elements shown in the listing in Appendix A. The
presence of any of the items found in Appendix A is recorded in a patient's
data vector by an entry of '1' in the vector-dimension corresponding to
the item number, while the absence of a vector item is noted by a '0'


71
ing an optimal set of treatments is accomplished by finding the set of
K
treatment alternatives k, ,k,...,k that maximize each of the v_ (the
ls I
expected value of occupying patient-state I' given treatment alternative
'kj') where
h J5!
*i k.
^ = V + all patient ^ ^ '
states J
1=1,2,...,S
and
*1
rI
y ^ *1
all patient PlJ ^
states J
With treatment-augmented patient states, maximizing the v^ can be
carried out in the following manner:
1. Group for simultaneous analysis all patient states possessing
a common treatment history, where one or more of the treatments in this
history are at their boundary level. Each of the 'T' sets of states
complying with this description forms an analysis set $B_j$, j=1,2,...,T.
2. Label sequentially the patient states, starting with state W
as 1, state R as 2, and then selecting numbers for the remaining unlabeled
patient states on the basis that the one with the most treatments in its
history receives the next number-label. For example, state ..111,2,2,4*
would be labeled with a smaller number than state *.112,6,6.1 When the
numbering scheme reaches the members of one of the analysis sets isolated
in Step 1 (above), numbers for the members of that set may be arbitrarily
assigned. Given this state numbering scheme, the selection of optimal
treatments can proceed dynamically since for each state I that is not a
member of an analysis set, 1=1,2,...,S, IBj, j=l,2,...,T
I
VI rI ?UVJ


BIOGRAPHICAL SKETCH
Michael Steven Leonard was born February 2, 1947, in Salisbury,
North Carolina. In June, 1965, he was graduated cum laude from Cocoa
High School in Rockledge, Florida. He received the degree of Bachelor
of Industrial Engineering with High Honors from the University of Florida
in June, 1970. In September, 1970, he began graduate work in engineering
at the University of Florida. He received the degree of Master of
Engineering in March, 1972. In June, 1972, he was designated a
Distinguished Military Graduate of the Air Force Reserve Officer Training Corps.
From September, 1970, until the present, his graduate training has been
supported by a National Science Foundation traineeship.
Michael Leonard is married to the former Mary Elizabeth Stewart
of Cocoa, Florida. He holds the reserve commission of Second Lieutenant
in the United States Air Force. He is a member of Lambda Chi Alpha
fraternity; Alpha Pi Mu, Sigma Tau, and Tau Beta Pi honorary fraternities;
and the Operations Research Society of America.
121


APPENDIX C
APPLICATION OF THE MINIMUM-COST SYMPTOM-SELECTION ALGORITHM
Given three pattern classes X, Y, and Z, with patterns of the form
$a_j = [a_{j1}, a_{j2}, a_{j3}, 1]$, where
    $a_X^1 = [0\ 1\ 0\ 1]$    $a_Y^1 = [0\ 0\ 1\ 1]$    $a_Z^1 = [0\ 0\ 0\ 1]$
    $a_X^2 = [0\ 1\ 1\ 1]$    $a_Y^2 = [1\ 0\ 1\ 1]$    $a_Z^2 = [1\ 0\ 0\ 1]$
these patterns can be represented in three-dimensional space (without
their constant = 1 components) by
    [plot of the six patterns against the feature axes]
One set of feasible linear-classifier discriminant-function weights
for these patterns is
    $W_X = [-2\ \ 3\ \ 0\ -1]^T$
    $W_Y = [\ 1\ -2\ \ 2\ \ 0]^T$
    $W_Z = [\ 1\ -1\ -2\ \ 1]^T$.
Suppose feature 1 costs 2 units to employ in the classifier,
feature 2 costs 6 units to employ in the classifier,
and feature 3 costs 3 units to employ in the classifier.
Then, for the minimum-cost symptom-selection algorithm (Section 3.4)
    $A_1 = \begin{bmatrix} 0 & 1 & 0 & 1 \\ 0 & 1 & 1 & 1 \end{bmatrix}$, $A_2 = \begin{bmatrix} 0 & 0 & 1 & 1 \\ 1 & 0 & 1 & 1 \end{bmatrix}$, $A_3 = \begin{bmatrix} 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 1 \end{bmatrix}$, and C = [2 6 3 0].
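
The claim that the listed weights are feasible can be checked directly by evaluating each discriminant on each pattern. The sketch below is only such a check, not the symptom-selection algorithm of Section 3.4. The grouping of the six patterns into classes X, Y, and Z is an assumption recovered from a poorly legible listing, and the helper names are hypothetical.

```python
# Assumed class membership of the six patterns (constant 1 component last).
PATTERNS = {
    "X": [[0, 1, 0, 1], [0, 1, 1, 1]],
    "Y": [[0, 0, 1, 1], [1, 0, 1, 1]],
    "Z": [[0, 0, 0, 1], [1, 0, 0, 1]],
}
# Discriminant-function weights as listed in the appendix.
WEIGHTS = {
    "X": [-2, 3, 0, -1],
    "Y": [1, -2, 2, 0],
    "Z": [1, -1, -2, 1],
}


def discriminant(pattern, weights):
    """Weighted sum of the pattern's components."""
    return sum(x * w for x, w in zip(pattern, weights))


def separates(patterns, weights):
    """True if every pattern's own discriminant strictly exceeds all others."""
    for label, members in patterns.items():
        for a in members:
            own = discriminant(a, weights[label])
            if any(own <= discriminant(a, w)
                   for other, w in weights.items() if other != label):
                return False
    return True


print(separates(PATTERNS, WEIGHTS))
```

Under the assumed grouping every pattern's own discriminant is strictly largest, so the three weight vectors are a feasible starting point from which a cheaper feature collection could be sought.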
91


48
Note that Conditions 3 and 4 can also be stated, and justified, with
the role of the elements of the A and B matrices reversed.
Given this set of four conditions, consider the following row
partition of the A and B matrices:
A =
"a* "
'b*
*1
*1
B =
Bi
*0.
*0
I
o o
1
1
W
o
i
where by appropriate change of rows in A and B
1. every element in each row of $A_1$ is a one
2. every element in each row of $B_1$ is a one
3. every element in each row of $A_0$ is a zero
4. every element in each row of $B_0$ is a zero.
The partitions A^, B^, A^, and B are the rows of A and B corresponding
to B1, A1, Bq, and AQ, respectively, and A* and B* are the remaining rows
of A and B. With this partitioning and the four previously established
conditions, the size of the data vectors associated with many of the
[p(p-l)]/2 subproblems P2 can be significantly reduced. The reduction
process, Procedure 1, can be stated in this manner:
Step 1: If for some row k in A^ (B^) each element in the corresponding
row of B (Ap is equal to one, then row k
of A and B can be eliminated by Condition 2.


29
one of the diagnostic alternatives,
then the use of a non-parametric classifier as a means of generating
classifications should be investigated.
The linear non-parametric classifier employs a weighted sum of
the symptoms exhibited by each patient in its discriminating functions.
If symptoms can be isolated that are significant to the classification
of patients with the disorder under investigation, then there is a
'natural' weight for each of these symptoms in the decision-making
process used by the practitioner. The existence of these natural weights
increases the probability that a training algorithm will be able to find
a feasible set of discriminant-function weights. Indeed, the relative
importance of the significant symptoms may be reflected in the magnitude
of the discriminant-function weights generated by the application of a
training algorithm.
As an example, the significant symptoms associated with two
craniofacial-pain diagnostic alternatives, Alternatives 4 and 14, were isolated
by Dr. Fast. A comparison of these symptoms and their associated
discriminant-function weights revealed a high degree of correlation between
symptom significance and discriminant-function weights; see Table 2.
The reader should note that both of the criteria discussed in this
section are heuristic approximations to the geometric requirement for
pattern space separability. However, if the disorder under investigation
meets one or both of these criteria, it may be possible to employ a
non-parametric classifier to diagnose the disorder since the requirement for
pattern space separability is most likely met.


60
Values for many of the treatment-planning model's parameters were
gathered from the set of patient records discussed in Section 3.1. As
the patient histories from the contributing university dental clinics
were reviewed, notations of treatment applications and time between
successive visits were made for each patient-practitioner interaction. The
values of the remaining model parameters were either estimated by the
reviewing practitioners, Dr. Fast and Dr. Mahan, or were gathered from
responses to questionnaires completed by patients who visited the
University of Florida's Dental Clinic. In modeling the complicated
process of care for craniofacial-pain patients, several simplifying
assumptions were made. This section provides the motivation for these
assumptions and presents the notation employed in the analytic description of
the treatment-planning process.
4.1.1 Patient States
In general, a Markovian system structure requires that the current
state of the system completely characterizes the probabilities associated
with future state occupancies of the system. To fully satisfy this
Markovian condition for state structure in the craniofacial-pain
treatment-planning model would require that the model include as distinct
model states every possible combination of diagnostic classifications a
patient might have occupied, in conjunction with every combination of
treatment applications he might have undergone, during his stay in the care
system. Unfortunately, such a model would have an infinite number of
patient states.
However, for a majority of craniofacial-pain patients the
knowledge of a patient's prior treatment record, coupled with his current
diagnostic classification, is adequate to determine his prior diagnostic


12
TABLE 1
SURVEY OF DIAGNOSTIC-CLASSIFICATION MODELS

Bayesian Classifiers
Reference                            Number of Patients    % Correct Patient
 Number      Disease Group               in Study             Diagnoses
  [12]       Nontoxic Goiter                88                   85.3
  [13]       Bone Tumor                     77                   77.9
  [14]       Thyroid                       268                   96.3
  [15]       Congenital Heart              202                   90.0
  [16]       Gastric Ulcer                  14                  100.0

Non-Parametric Classifiers
Reference                            Number of Patients    % Correct Patient
 Number      Disease Group               in Study             Diagnoses
  [17]       Liver                          52                   98.1
  [18]       Asthma                        230                   90.0
  [19]       Hematologic                    49                   93.9
  [20]       Thyroid                       225                   96.0


23
$W_j$ = the 296-dimension vector of weights associated with
diagnostic alternative 'j'
$w_{jk}$ = the k-th element in the weight vector $W_j$,
that is
$W_j = [w_{j1}, w_{j2}, \dots, w_{j295}, w_{j296}]^T$
and
$d_{ij} = \sum_{k=1}^{296} a_{ik} w_{jk}$
where T denotes vector transposition. Patient 'i' is classified in
diagnostic alternative $C_j$ when $d_{ij} > d_{is}$ for every $s \ne j$. If $\max_j d_{ij}$ is
not unique, then it is not yet possible to classify patient 'i' into
one of the diagnostic alternatives. Treatment is prescribed for severe
symptoms and classification is attempted at a later date.
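
The classification rule just described, a set of weighted sums followed by a maximum detector that declines to classify on a tie, can be sketched in a few lines. The function name and the zero-based alternative numbers are hypothetical. For brevity, the usage lines borrow the three small weight vectors of Figure 7 in Appendix B rather than the model's 296-dimension vectors.

```python
def classify(a, weights):
    """Index of the alternative with the unique largest discriminant value.

    a:       patient data vector, constant 1 as its last component
    weights: one weight vector per diagnostic alternative
    Returns None when the maximum is not unique, i.e. when the patient
    cannot yet be classified.
    """
    scores = [sum(x * w for x, w in zip(a, column)) for column in weights]
    best = max(scores)
    winners = [j for j, s in enumerate(scores) if s == best]
    return winners[0] if len(winners) == 1 else None


# Weight vectors taken from the Figure 7 example (three alternatives).
W = [[-2, -1, 1], [3, -1, 0], [-1, 2, -1]]
print(classify([0, 0, 1], W))   # discriminant values 1, 0, -1
print(classify([1, 0, 1], W))   # discriminant values -1, 3, -2
```

Returning None rather than an arbitrary winner mirrors the text's handling of ties: classification is deferred to a later visit.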
Data from four sources were used to construct and verify the
diagnostic-classification model, as well as the treatment-planning model
presented in Chapter 4. Contributions of clinical records came from
the dental schools at the universities of California at Los Angeles,
Florida, Illinois, and Indiana. In all, the records of 250 patients,
involving a total of 480 patient-practitioner interactions, form the
data base for model building and validation. The relevant information
from each of these patient visits has been recorded in the data-vector
format of Appendix A. A diagnostic classification from Figure 3 was
assigned to each of these patient data vectors by either Dr. Thomas B.
Fast, Chairman of the Division of Oral Diagnosis, or by Dr. Parker E.
Mahan, Chairman of the Department of Basic Dental Sciences, at the
College of Dentistry, University of Florida.


41


20
data-vector entry. For example, referring to the listing in Appendix A,
a male patient would have the following fifth, sixth, and seventh
elements in his data vector
(...,1,0,0,...),
while a pre-menopausal female would have the series of elements
(...,0,1,0,...).
This vector notation of a patient's status serves as the input data for
a non-parametric pattern classifier that assigns a diagnostic
classification to the patient's dysfunction.
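
The encoding of a patient's status into a zero-one data vector can be illustrated as follows. This is a minimal sketch: the function name and the set-of-item-numbers input are hypothetical, and the vector length of 295 assumes that the Appendix A listing ends at item 295, with any constant component for the classifier appended separately.

```python
N_ITEMS = 295  # assumed: item numbers 001 through 295 in Appendix A


def encode_patient(observed_items, n_items=N_ITEMS):
    """Build a 0/1 data vector with a 1 at each observed item number."""
    vector = [0] * n_items
    for item in observed_items:
        if not 1 <= item <= n_items:
            raise ValueError(f"item number {item} is outside 1..{n_items}")
        vector[item - 1] = 1   # item numbers are 1-based
    return vector


male = encode_patient({5, 289})    # item 005 Male, item 289 Tinnitus
female = encode_patient({6})       # item 006 Female
print(male[4:7], female[4:7])      # fifth, sixth, and seventh elements
```

The slices printed at the end correspond to the fifth through seventh vector elements quoted in the example above.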
Non-parametric pattern classification, as described in Meisel [23]
and Nilsson [24], is the process of creating decision surfaces that
separate patterns into homogeneous classes, $C_i$, i=1,2,...,p, specified
by the analyst. In the craniofacial-pain diagnostic model, the $C_i$ are
the diagnostic alternatives shown in Figure 3. Classification of a
pattern (a patient's data vector) into one of the classes is performed by
a pattern classifier composed of a maximum detector and a set of
discriminant functions. These discriminants, $g_j(a)$, j=1,2,...,p, are
single-valued functions of each patient's data vector a. If $a_i$ represents a
data vector for a patient whose correct diagnostic classification is the
i-th diagnostic alternative, then the $g_j(a)$ are chosen so that
$g_i(a_i) > g_j(a_i)$, i, j=1,2,...,p, $j \ne i$.
The craniofacial-pain classifier uses linear discriminant functions.
These discriminants are linear in the sense that they provide mappings
from $E^n$ to $E^1$ that exhibit the form
$g_j(a) = a_1 w_{j1} + a_2 w_{j2} + \dots + a_n w_{jn} + w_{j(n+1)}$
where in the patient-data-vector a, the value of $a_k$ denotes the presence


37
the basis of body-temperature and blood-pressure readings. Traditional
techniques for feature selection might employ a linear combination of
body temperatures and blood pressure measurements as one 'new' feature.
The transformation sought in this investigation would lead to the
classification of patients by either body temperature or blood pressure
alone if this were possible. This example will be used again in Section
3.4.1 to illustrate the algebraic and geometric structure of the problem.
Nelson and Levy [27] have attacked the problem of selecting a
reduced set of unaltered features for use in a classification scheme.
These authors attach a cost to the use of each available feature, and
employ a ranking scheme to measure each feature's discriminating power.
Then, under a restriction on the total cost of features employed, they
develop an algorithm that selects the set of features that maximizes the
classifier's discriminating power. Unfortunately, their scheme does not
guarantee the selection of a subset of original features that contains
enough 'information' to permit pattern class separation by discriminant
function. Therefore, a new algorithm is presented in this section that
minimizes the cost of the set of features used by the pattern classifier
yet insures that all patterns can be correctly classified by a set of
linear discriminant functions. In the remainder of this section the
more general terms 'feature,' 'pattern,' and 'pattern class' will be
used respectively to represent a data vector item, a patient's data
vector, and a diagnostic classification.
The problem of finding a minimum-cost collection of features would
not be considered if there did not already exist a set of 'n' features
by which the patterns under examination could be correctly classified
by linear discriminants. That is, given an 'n' dimensional representa-


Fifty-eight patients at the University of Florida's Dental Clinic responded
to the following question:
How much would you estimate that this trip to the
Dental Clinic cost you in terms of lost wages, baby-sitting
fees, transportation costs, and other costs
that you may have had to pay so that you could
be here for your appointment?
The distribution of these estimates is shown in this histogram.
[Histogram not reproduced; horizontal axis: Dollars]
The mean value for these 58 estimates of patient-visit inconvenience costs
was $30.72.
FIGURE 6
PATIENT-VISIT INCONVENIENCE COST


15, 16], make a diagnosis on the basis of selecting a patient's 'most
probable' disease state. The Bayesian classifier is an elementary type
of parametric pattern-classification model. In general, parametric
classifiers make use of one or more of the statistical characteristics
of the dispersion of the data being classified to establish rules for
data classification. With the Bayesian models, only the conditional
probabilities for exhibiting sets of symptoms, given a particular
disease, are tabulated from past medical data. Then, utilizing Bayes'
theorem, the probabilities for the presence of alternate diseases
$d_1, d_2, \ldots, d_n$ can be calculated as a function of the symptom-complex $S$
the practitioner observes in the patient. Bayes' theorem provides that
for each of the $d_i$
$$P(d_i \mid S) = C(S)\,P(S \mid d_i)\,P(d_i),$$
where
$$C(S) = 1 \Big/ \Big[\sum_{k=1}^{n} P(S \mid d_k)\,P(d_k)\Big];$$
hence, a patient with symptom-complex $S$ is classified in disease-group $i$
if
$$P(d_i \mid S) = \max_{k} P(d_k \mid S).$$
A survey of the results of application of Bayesian models is given in
Table 1.
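
The Bayesian rule described above can be illustrated with a minimal Python sketch. The function names are hypothetical, and the likelihoods and priors used in the usage note are invented numbers, not values from any of the surveyed studies.

```python
from typing import List, Sequence


def posterior(likelihoods: Sequence[float], priors: Sequence[float]) -> List[float]:
    """Return P(d_k | S) for every disease k.

    likelihoods[k] is P(S | d_k) and priors[k] is P(d_k).
    """
    joint = [l * p for l, p in zip(likelihoods, priors)]
    total = sum(joint)  # this is 1 / C(S)
    if total == 0:
        raise ValueError("symptom-complex has zero probability under every disease")
    return [j / total for j in joint]


def bayes_classify(likelihoods: Sequence[float], priors: Sequence[float]) -> int:
    """Index (zero-based) of the most probable disease given the observed S."""
    post = posterior(likelihoods, priors)
    return max(range(len(post)), key=post.__getitem__)
```

For example, with assumed likelihoods `[0.2, 0.6, 0.1]` and assumed priors `[0.5, 0.25, 0.25]`, the second disease has the largest posterior even though the first has the largest prior.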
Although the percentage of correct diagnoses in most of these test
applications is high, there are several reasons why a Bayesian
diagnostic model is not used as the means of generating diagnostic
classification in this dissertation. The first reason is the difficulty in
acquiring the proportional presence of alternate diseases $P(d_i)$, $i=1,2,\ldots,n$,
in the population of patients that are to be classified by the model.
These 'prior' probabilities of having a particular disease are a function


Step 7: If the use of Steps 1, 2, 4, and 5 has eliminated one or
more rows or columns from either matrix then repartition
the matrices and return to Step 1, otherwise terminate
Procedure 1.
In coding Procedure 1 for computer processing, there is no need to
physically partition the rows of the A and B matrices. Summing the
elements in any row of A or B reveals whether the individual elements in
the row are all equal to zero or are all equal to one. Given this
information, the steps from Procedure 1 determine whether a pattern is
removed from A or B, whether a row in A and B is removed, or whether the
procedure should be terminated because no feasible set of convex
combinations for P2 exists.
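
The row-sum test mentioned above can be sketched as follows. This is only a partial illustration, not the full Procedure 1: it covers the two row conditions stated in the text (a row that is all zeros in one matrix and all ones in the other rules out a solution to P2; a row that is all zeros in both, or all ones in both, can be dropped) and it does not attempt the pattern-elimination steps. Matrices are assumed to be lists of rows with 0/1 entries, and the function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def row_status(row: Sequence[int]) -> str:
    """Classify a 0/1 row by its sum: 'zeros', 'ones', or 'mixed'."""
    total = sum(row)
    if total == 0:
        return "zeros"
    if total == len(row):
        return "ones"
    return "mixed"


def check_rows(A: Sequence[Sequence[int]],
               B: Sequence[Sequence[int]]) -> Tuple[bool, List[int]]:
    """Return (infeasible, removable_rows) for matrices with matching rows.

    infeasible is True when some row is constant in both matrices with
    opposite values; removable_rows lists rows constant and equal in both.
    """
    removable: List[int] = []
    for k, (row_a, row_b) in enumerate(zip(A, B)):
        status_a, status_b = row_status(row_a), row_status(row_b)
        if "mixed" in (status_a, status_b):
            continue
        if status_a != status_b:
            return True, []
        removable.append(k)
    return False, removable
```

Because only sums are inspected, each row is examined once per pass, which matches the remark that no physical partitioning of the matrices is needed.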
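As an example of the use of Procedure 1 consider the set of matrices
A and B in subproblem P2 where
$$A = \begin{bmatrix} 0 & 1 & 1 & 0 \\ 1 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 1 & 1 \end{bmatrix}, \qquad B = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 0 & 0 & 0 \\ 1 & 1 & 1 & 0 \\ 1 & 1 & 1 & 0 \end{bmatrix}.$$
In the first application of the steps of Procedure 1:
1. Column 4 can be eliminated from matrix A by Step 4 and
2. Column 1 can be eliminated from matrix A by Step 5.
After the first application of the steps of the procedure
$$A = \begin{bmatrix} 1 & 1 \\ 0 & 0 \\ 0 & 0 \\ 1 & 1 \end{bmatrix}, \qquad B = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 0 & 0 & 0 \\ 1 & 1 & 1 & 0 \\ 1 & 1 & 1 & 0 \end{bmatrix}.$$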

function, between each new vector and its associated training-sample
convex hull. Given this close proximity, the classifier's discriminant
functions should correctly classify most new data vectors as these
vectors will lie within or near the boundaries of the appropriate
discriminating hyperplanes. Hence, the key to providing adequate classifier
performance for new data vectors lies in devising data-vector
representations of patient data for which the data vectors of a common diagnostic
classification exhibit strong similarity.
In the introductory discussion of the elements of patient data used
in the patient data vector, it was pointed out that an effort was made to
select components of patient status that assist the practitioner in his
selection of diagnostic classifications for a craniofacial-pain patient.
Then these elements were partitioned to generate as much discriminating
information as possible from each data element. In terms of the
alternate diagnostic classifications, these elements of patient data were
chosen so that all patients in any one diagnostic classification would
have a unique combination of exhibited or non-exhibited data-vector
elements. Employing these carefully constructed qualitative data elements
resulted in a set of 'natural' gaps in the vector representations of
patient data from alternate diagnostic classifications. The fact that
there are portions of the pattern space that cannot be occupied by any
data vector, and partitions of the space where the vectors of each
classification must lie, assists the classifier in making correct
classifications of data not used in model construction.
As Section 3.3 shows, this discussion is not meant to imply that the
craniofacial-pain diagnostic classifier can, in its present state of
development, correctly classify every new data vector. What has been


By Step 3 of Procedure 1, no feasible convex combination of
these matrices exists.
Application of Procedure 1 to $A_s$ and $A_t$ yields the following
reduced matrices:
$A_s = [1\ 1]$, $A_t = [0]$.
By Step 3 of Procedure 1, no feasible convex combination of
these matrices exists.
Application of Procedure 1 to $A_s$ and $A_t$ yields the following
reduced matrices:
$A_s = [0\ 1]$, $A_t = [0\ 1]$.
For these reduced matrices, $A_s$ and $A_t$, P4 has the following
form: maximize $\lambda_1 + \lambda_2$
subject to $\lambda_1 \le 0$
$\pi + \lambda_1 \le 0$
$\lambda_2 \le 0$
$\pi + \lambda_2 \le 0$
$\pi, \lambda_1, \lambda_2$ unrestricted.
P4 has the bounded optimal solution $\lambda_1 = \lambda_2 = 0$.
Hence the assignment vector ^ = 1, x^ = 0] is infeasible
by the rules of Procedure 2. Go to Step 6 of the algorithm.
Step 6: The assignment vector is now ^ = 1, x^ = 1]. As this vector
does not include an assignment for every variable, return to
Step 4 of the algorithm.
The tree of possible solutions to P1 now has the form


APPENDIX F
FLOW CHARTS OF PATIENT-STATE
TRANSITIONS


classifications. Even in the cases where the current classification
and prior treatment record do not provide a total description of a
patient's condition, these elements of patient status do provide
significant information about the probabilities associated with a patient's
future status in the care system. For example, in the data employed in
model construction, 47 craniofacial-pain patients occupied Diagnostic
Alternative 15 and were treated with an application of drugs at least
once. Eight of these patients were 'well' after a first treatment with
drugs, while 39 required multiple applications of drugs or other
treatments during their stay in the system. Yet of the 12 patients who were
given two applications of drugs, 9 were 'well' following the second
repetition of drug therapy. Thus, while the overall data-based
transition-probability estimate for a transition from Diagnostic Alternative
15 into the well state following any one application of drugs is .36,
the transition-probability estimate for a transition into the well state
following two successive applications of drugs is .75. Hence, for this
diagnostic classification, information on the prior application of drugs
is important in determining a patient's future status in the care system.
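
As a small worked check of the figures quoted above, a relative-frequency estimate can be computed directly from the patient counts. The helper name is hypothetical. The .36 overall estimate is taken from the text as given and is not recomputed here, because the counts it rests on are not listed in this passage.

```python
def transition_estimate(transitions: int, opportunities: int) -> float:
    """Relative-frequency estimate of a transition probability."""
    if opportunities <= 0:
        raise ValueError("need at least one observed opportunity")
    return transitions / opportunities


# 9 of the 12 patients given two applications of drugs were 'well' afterwards.
after_second_application = transition_estimate(9, 12)

# 8 of the 47 patients were 'well' after a first treatment with drugs.
after_first_treatment = transition_estimate(8, 47)
```

The first value is 0.75, matching the estimate in the text; the second is roughly 0.17, which shows how strongly the treatment history changes the estimate for this diagnostic alternative.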
This form of 'current diagnostic classification augmented by
treatment record' patient-state description is employed in the
craniofacial-pain treatment-planning model as an approximation to a 'true' Markovian
state structure. Each of the diagnostic alternatives shown in Figure 3
forms the basis for a collection of patient states. The diagnostic
alternative is augmented with a record of treatments that have been applied
since the patient entered the care system. Appendix D provides a list
of the treatment alternatives that may be prescribed for
craniofacial-pain patients. The record of each treatment given to the patient is noted
in the patient-state descriptions without regard to its chronological




of seasonal variation, geographic location, population demography, and
many other factors. Secondly, valid Bayesian analysis requires the
analyst to determine the dependence among exhibited symptoms for each
disease considered by the diagnostic model. In this respect, the
probabilities for the presence of groups of symptoms are independent for
some diagnostic alternatives and strongly correlated for others [4]. The
third reason for not selecting a Bayesian model is the massive storage
requirement dictated by the necessity of keeping the set of conditional
probabilities. These conditionals, $P(S \mid d_i)$ for every observable
symptom-complex $S$ and every disease $i$ considered, must be at hand each time the
model is used. For example, given ten alternate diseases and ten
symptoms for which no assumptions of between-symptom independence can be made,
storage is required for $10 \cdot (2^{10}-1)$, or 10,230, conditional probabilities.
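
The storage count can be checked with a short calculation. The sketch below assumes, as the figure in the text suggests, that every non-empty subset of the symptoms counts as a distinct symptom-complex; the function name is hypothetical.

```python
def conditional_table_size(n_diseases: int, n_symptoms: int) -> int:
    """Number of conditionals P(S | d_i) when no independence is assumed.

    Each disease needs one entry per non-empty subset of the symptoms.
    """
    return n_diseases * (2 ** n_symptoms - 1)
```

For ten diseases and ten symptoms this gives 10,230 entries, and each additional symptom roughly doubles the table.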
2.2 Non-Parametric Classification Models
Non-parametric diagnostic models, like [17, 18, 19, 20], utilize
non-parametric pattern classifiers, a form of pattern recognition
modeling. In the literature on pattern recognition, the term 'non-parametric'
implies that no form of probability distribution is assumed for the
dispersion of symptom data in establishing the rules for pattern
classification. These models do assume, however, that classes of symptom data
are distinct entities and, hence, a patient with a particular set of
symptoms S cannot simultaneously occupy more than one diagnostic state.
That is, the models assume a deterministic classification for each
pattern viewed by the pattern classifier where every observable pattern has
one, and only one, correct classification.
Non-parametric modeling permits the analyst to bypass the difficult
problems of explicitly determining the conditional probabilities for,


When initial treatment does not result in a 'cure' for the
craniofacial-pain patient, treatment effects are evaluated and new data
collected. When a patient's diagnostic classification leads to a course
of treatment that is not within the realm of the practitioner's
specialty, he is referred to a more appropriate care source. Monitoring is
continued on those patients not rejected from the system at this point, and
the patient is discharged when he is symptom-free. However, when other
disorders have been isolated during the course of treatment, the patient
is recycled through the classification-treatment process.
The diagnosis-treatment sequence is not fixed. Treatment can begin
prior to a diagnostic classification or treatment can follow a diagnosis.
Moreover, there may be many diagnostic-treatment data-acquisition cycles
before the patient is considered 'well.'
1.2 Research Objective
The introductory discussion of the need for diagnostic and treatment
planning models, and the brief description of the craniofacial-pain care
system, provide the setting for a statement of the research objective
underlying this dissertation. This objective is to derive analytic
representations of the decision processes involved in selecting diagnostic
classifications and planning treatments for craniofacial-pain patients.
A diagnostic-classification model that duplicates the classification of
expert practitioners is sought. For treatment planning, the modeling
goal is to provide a structure for interaction of the critical
considerations associated with the treatment-selection process. These analytic
representations will be structured to permit their application as teaching
devices in the training of dental practitioners, as methods of testing the
effects of new diagnostic tools and treatment applications, and as aids to
the practice of dentistry.


characterized as follows:
P2: let $A = A_s$ and $B = A_t$ with $A$ and $B$ having columns $a_i$
and $b_j$ respectively for any $A_s$ and $A_t$.
Find $u_i \ge 0$, $\sum_{i=1}^{m_s} u_i = 1$, and $v_j \ge 0$, $\sum_{j=1}^{m_t} v_j = 1$
such that
$$\sum_{i=1}^{m_s} u_i a_i = \sum_{j=1}^{m_t} v_j b_j.$$
If such $u_i$ and $v_j$ exist for any one of the subproblems then X is not
feasible to P1. Because the number of subproblems is large even for a
relatively small number p of pattern classes, there is justification for
seeking methods to expedite the solution of each subproblem P2.
To achieve this goal, a series of conditions will be presented that
characterize some of the criteria necessary to the existence of a
solution to subproblem P2. In addition to establishing criteria for
existence, these conditions provide a means for reducing the size of the
matrices A and B. This reduction will be discussed after the conditions
are established.
Condition 1: If the $k$th row of A has all elements $a_i^k$, $i=1,2,\ldots,m_s$,
equal to zero (one) and the $k$th row of B has all
elements $b_j^k$, $j=1,2,\ldots,m_t$, equal to one (zero) then no
$u_i \ge 0$, $\sum_{i=1}^{m_s} u_i = 1$ and $v_j \ge 0$, $\sum_{j=1}^{m_t} v_j = 1$ exist such that
$$\sum_{i=1}^{m_s} u_i a_i = \sum_{j=1}^{m_t} v_j b_j.$$
Justification 1: Under Condition 1 there is no set of convex
combinations of the $k$th row elements of A and of the $k$th row
elements of B such that the combinations are equal.


following form:
$p_{IJ}^{k}$: the probability of making a transition from
patient-state 'I' to patient-state 'J' following
the application of treatment-alternative 'k.'
4.1.3 Cost Structure
A patient's progression through the craniofacial-pain system
generates a multitude of implicit and explicit costs. The explicit costs can
be measured in terms of the dollar charges paid by the patient or the
practitioner during the patient's stay in the system. Other costs are
implicit in nature and can be quantified only as they relate to the
opportunities lost by the patient and the practitioner while the
patient remains in the care system. For modeling purposes four major
system costs have been isolated. These costs are:
(a) Cost of treatment applications
(b) Cost of the practitioner and his staff's
services
(c) Cost to the patient of occupying a non-well
patient-state
(d) Patient-referral cost.
Although these costs do not encompass all of the system costs, they
measure significant explicit and implicit charges associated with a patient's
stay in this system. In the treatment-planning model, each of these costs
is charged on a per-patient-visit basis.
Costs of the various treatment applications and the costs associated
with the practitioner and his staff's services were estimated by the
reviewing practitioners. Estimates of treatment and care-system service
costs were partitioned by diagnostic classification as well as treatment


study. This reality prohibits the model builder from making broad
statements about the applicability of his models to other health-care
environments. Accordingly, the models developed in this dissertation are
specifically oriented toward the health-care problem presented in Section
1.1 with the understanding that the results of this modeling effort may
not be applicable to the whole of health-care diagnosis and treatment
planning.
1.1 Craniofacial Pain
The head and face are subject to chronic, persistent,
or recurrent pain more often than any other portion of the
body. Pain in the head or face has a greater significance
to patients than any other pain. It may arouse fears that
the patient is in danger of losing his mind or that he has
a tumor of the brain. In addition, the emotional state of
the patient is adversely influenced because it is generally
known by the layman that the profession's knowledge of the
causes of these pains is meager and that methods of
treatment are inadequate [5, p. v].
H. Houston Merritt, M.D., Dean
Columbia University College of
Physicians and Surgeons
One source of the pain Dr. Merritt describes is dysfunction of the
temporomandibular joint. The temporomandibular joint, see Figure 1,
provides the articulation between the mandible and the cranium. This
joint is unique both in its structure and its function. Within the plane
of the temporomandibular joint, lateral, vertical and pivoting motion is
permitted. In addition, the joint is the point of articulation for the
only articulated complex that contains teeth. With this joint, "motion
is directed more by the musculature and less by the shape of the
articulating bones and ligaments than is the fact for other joints" [5, p. 34].
The fact that joint motion is highly dependent on musculature
implies that when mandibular dysfunction occurs there is some disturbance


TABLE 5
MEAN TRANSIT TIMES THROUGH THE CRANIOFACIAL-PAIN CARE SYSTEM

For a Patient Whose First              Model-       Truncated    Patient-
Diagnostic Classification Was          Generated    Model        Record
                                       Estimate*    Estimate+    Estimate‡

Myopathy-Myositis                      1.50         1.34         1.35
Oral Pathology-Dental Pathology        1.11         1.04         1.08
Vascular Changes-
  Migrainous Vascular Changes          3.89         3.42         3.06
Myofacial Pain Dysfunction-
  Uneven Centric Stops                 1.86         1.43         1.50
Myofacial Pain Dysfunction-
  Anxiety/Depression                   3.87         3.47         3.18
Myofacial Pain Dysfunction-
  Reflex Protective Muscular
  Contracture                          1.90         1.79         1.87

The values in these sets of estimates are specified in terms of the
number of patient visits in which the patient occupies a non-well or
non-referred patient state.
* Note: The treatment-planning model considers the possibility of
'infinite duration' occupancy of non-well or non-referred
states.
+ These truncated estimates were generated from the
treatment-planning model on the conditional basis that a patient must
transit into either the well or the referred state by his
fifth patient visit.
‡ The maximum number of visits for any patient described by
the clinical data was five patient visits.


S, non-parametric pattern classification requires that $P[S \mid C_j] = 1$ for
the diagnostic alternative '$C_j$' that describes the patient's current
diagnostic status, and $P[S \mid C_k] = 0$ for all other diagnostic alternatives
'$C_k$.' However, assume that for the disorder in question the probability
of exhibiting any relevant symptom has been calculated from historical
data, that is, estimates of $P[s_i \mid C_j]$ are available for all relevant
symptoms $s_i$ and all diagnostic alternatives $C_j$. Then, if the following
decision rule leads to the correct classification of a majority of the
patients with the disorder in question, utilization of a non-parametric
classification model should be investigated:
classify a patient who exhibits the set of symptoms S in the
$j$th diagnostic alternative if
$$\prod_{s_i \in S} P[s_i \mid C_j] > \prod_{s_i \in S} P[s_i \mid C_k] \quad \text{for all } k \neq j. \tag{1}$$
Since (1) holds if and only if
$$\log\Big[\prod_{s_i \in S} P[s_i \mid C_j]\Big] > \log\Big[\prod_{s_i \in S} P[s_i \mid C_k]\Big] \quad \text{for all } k \neq j,$$
decision rule (1) can be expressed in terms of logarithms. Let the set
of symptoms S be represented as a row vector $a$ with the elements of $a$
assigned values as follows:
$a_i = 1$ if symptom $s_i$ is an element of S
and $a_i = 0$ if symptom $s_i$ is not an element of S,
where n is the total number of relevant symptoms. Form the column vectors
$$w_j = \big[\log P[s_1 \mid C_j],\ \log P[s_2 \mid C_j],\ \ldots,\ \log P[s_n \mid C_j]\big]^T$$
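
The step from the product rule to a linear form can be illustrated with a brief Python sketch: with 0/1 entries in $a$, the inner product of $a$ with the vector of log-probabilities equals the logarithm of the product over the exhibited symptoms, so both rules pick the same class. The function names and the probabilities in the usage note are invented for illustration, and all probabilities are assumed to be strictly positive so that the logarithms exist.

```python
import math
from typing import List, Sequence


def log_weight_vectors(cond_probs: Sequence[Sequence[float]]) -> List[List[float]]:
    """cond_probs[j][i] = P[s_i | C_j]; returns w_j = [log P[s_i | C_j]]."""
    return [[math.log(p) for p in row] for row in cond_probs]


def classify_by_products(a: Sequence[int], cond_probs: Sequence[Sequence[float]]) -> int:
    """Decision rule (1): largest product over the symptoms present in a."""
    scores = [math.prod(p for ai, p in zip(a, row) if ai == 1) for row in cond_probs]
    return max(range(len(scores)), key=scores.__getitem__)


def classify_by_logs(a: Sequence[int], cond_probs: Sequence[Sequence[float]]) -> int:
    """Equivalent linear form: largest inner product of a with w_j."""
    W = log_weight_vectors(cond_probs)
    scores = [sum(ai * wi for ai, wi in zip(a, w)) for w in W]
    return max(range(len(scores)), key=scores.__getitem__)
```

With the made-up table `[[0.8, 0.1, 0.5], [0.2, 0.7, 0.5]]`, the symptom vector `[1, 0, 1]` is assigned to the first class and `[0, 1, 1]` to the second under either rule.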


Consider the dual of P3, written in the following form:
P4: maximize $[\mathbf{0}\ 1\ 1]\,[\pi,\ \lambda_1,\ \lambda_2]^T$
with $\pi, \lambda_1, \lambda_2$ unrestricted in sign.
Note that P4 may have many associated $\pi$ variables, but has only as many
constraints as the number of patterns in A and B (as reduced by Procedure
1). P4 always has at least one solution to its constraint set. Thus, if
an application of a linear-programming algorithm to P4 reveals the
existence of an unbounded solution, then P2 has no solution. Therefore, if
and only if P4 has a bounded solution do $u_i$ and $v_j$ exist such that
$$\sum_{i=1}^{m_s} u_i a_i = \sum_{j=1}^{m_t} v_j b_j$$
where
$$u_i \ge 0, \quad \sum_{i=1}^{m_s} u_i = 1$$
and
$$v_j \ge 0, \quad \sum_{j=1}^{m_t} v_j = 1.$$
The preceding discussion with its development of a reduction
procedure and dual formulation provides the structure for a second procedure.
Procedure 2 establishes a mechanism to verify the feasibility of any
assignment of zeros and ones to the X vector of problem P1, see Figure 4.
That is, given some vector X and a set of patterns $a_m^i$, $m=1,2,\ldots,m_i$,
and $i=1,2,\ldots,p$, the $[p(p-1)]/2$ subproblems P2 are formed by zeroing out


Two treatments, T1 and T2, are available for patients classified
in Alternative I and Alternative J. T1 has a boundary-level
application number of one application and T2 may be given only
once during the patient's stay in the care system.
Note that this figure omits the transition arcs between the
diagnostic-classification-based patient states and the terminal
states well and referred.
FIGURE 8
MULTIPLE-STATE HISTORY-AUGMENTED PROCESS


ANALYTICAL MODELS FOR DIAGNOSTIC CLASSIFICATION AND
TREATMENT PLANNING FOR CRANIOFACIAL PAIN
By
Michael Steven Leonard
A DISSERTATION PRESENTED TO THE GRADUATE COUNCIL OF
THE UNIVERSITY OF FLORIDA IN PARTIAL
FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF
DOCTOR OF PHILOSOPHY
UNIVERSITY OF FLORIDA
1973




Hence, there can be no set of convex combinations
of the columns of A and of B such that the
combinations are equal.
Symbolically,
since no $u_i \ge 0$, $\sum_{i=1}^{m_s} u_i = 1$ and $v_j \ge 0$, $\sum_{j=1}^{m_t} v_j = 1$
exist such that
$$\sum_{i=1}^{m_s} u_i a_i^k = \sum_{j=1}^{m_t} v_j b_j^k,$$
no $u_i \ge 0$, $\sum_{i=1}^{m_s} u_i = 1$, and $v_j \ge 0$, $\sum_{j=1}^{m_t} v_j = 1$
exist such that
$$\sum_{i=1}^{m_s} u_i a_i = \sum_{j=1}^{m_t} v_j b_j.$$
Condition 2: If the $k$th row of A has all elements $a_i^k$, $i=1,2,\ldots,m_s$,
equal to zero (one) and the $k$th row of B has all
elements $b_j^k$, $j=1,2,\ldots,m_t$, equal to zero (one), the
$k$th row of matrices A and B can be eliminated without
loss of possible solutions to subproblem P2.
Justification 2: Under Condition 2 every convex combination of the $k$th
row elements of A and of the $k$th row elements of B
are equal. Hence, a set of convex combinations of the
columns of A and of the columns of B are equal if and
only if the convex combinations of the remaining rows
(all rows except the $k$th row) are equal. Symbolically,
let $a^*_{ik}$ denote the pattern $a_i$ whose $k$th component has
been eliminated and similarly let $b^*_{jk}$ denote the
elimination of component k from pattern $b_j$, then as


Patient       Model      Practitioner    Patient        Model      Practitioner
State         Selection  Selection       State          Selection  Selection

9             12         12              14|41          24         24
9|12          12         Refer+          14|12,12       Refer      Refer*
10            23         23              14|12,23       12         12*
10|23         12         12              14|24,24       22         Refer+
10|12,23      12         12              14|24,35       24         24
11            12         12*             15             12         12
11|12         12         12*             15|12          15         15
12            15         15              15|20          20         20
12|15         15         15              15|22          34         24
12|23         Refer      Refer           15|23          23         23
12|35         24         24              15|24          12         12
12|15,31      Refer      17*+            15|27          16         16
12|24,35      24         24              15|34          34         34
13            12         12              15|35          24         24
13|12         12         12              15|41          34         34
13|18         24         24              15|12,12       12         12
13|22         34         34*             15|12,23       12         12
13|23         12         12              15|16,27       16         16
13|24         24         24              15|20,20       20         Refer*+
13|12,12      23         23              15|22,22       12         12
14            23         23              15|22,34       34         34*
14|12         12         12*             15|23,23       23         23
14|23         12         12              15|24,24       24         24
14|24         24         24              15|34,34       34         34
14|26         26         26              15|12,22,22    12         12
14|35         35         35              15|22,34,34    34         34*


[25] Howard, R.A., Dynamic Probabilistic Systems, Vol. II: Semi-Markov
and Decision Processes, New York: Wiley (1971).
[26] Rosen, J.B., "Pattern Separation by Convex Programming,"
Journal of Mathematics and Application, Vol. 10 (1965)
123-134.
[27] Nelson, G.E., and Levy, D.M., "Selection of Pattern Features
by Mathematical Programming Algorithms," IEEE Transactions
on Systems Science and Cybernetics, Vol. SSC-6 (1970)
20-25.
[28] Balas, E., "An Additive Algorithm for Solving Linear Programs
with Zero-One Variables," Operations Research, Vol. 13
(1965) 517-547.
[29] Freund, J.E., Mathematical Statistics, Englewood Cliffs, N.J.:
Prentice-Hall (1962).


CHAPTER 2
PREVIOUS RESEARCH
Over three hundred publications have been addressed to the problem
of modeling the diagnostic and treatment-planning process. Spanning
fourteen years, this research has considered such diverse problems as
the classification of liver biopsies [10] and the optimal plan for
treating mid-shaft fractures of the femur [11]. At least ninety-one
disorders have been utilized as environments for developing diagnostic
and treatment-planning models. The magnitude of this research effort
emphasizes the need for analytic representations of these complex
decision-making processes.
Fortunately, the significant contributions in this voluminous
literature can be neatly partitioned into four distinct categories.
Research in diagnostic classification has been based either on the
application of Bayesian statistics or on the use of non-parametric pattern
classifiers. Treatment planning has been presented as either a
finite-horizon decision problem or as an application of decision analysis to a
Markov process of uncertain duration. This section presents a brief
discussion of each of these categories and evaluates their suitability as
analytic representations of the process of providing health care for
craniofacial-pain patients.
2.1 Bayesian Classification Models
Bayesian diagnostic-classification models, such as [12, 13, 14,




Abstract of Dissertation Presented to the
Graduate Council of the University of Florida in Partial
Fulfillment of the Requirement for the Degree of Doctor of Philosophy
ANALYTICAL MODELS FOR DIAGNOSTIC CLASSIFICATION AND
TREATMENT PLANNING FOR CRANIOFACIAL PAIN
By
Michael Steven Leonard
December, 1973
Chairman: Dr. Kerry E. Kilpatrick
Major Department: Industrial and Systems Engineering
This dissertation presents a systematic approach to craniofacial-pain
diagnosis and treatment planning using analytic models of the
underlying decision-making processes. Patient diagnoses are generated by a
linear pattern-recognition classifier trained with a sample of
preclassified craniofacial-pain patient data. For this classifier, an algorithm
is developed that minimizes the total cost of the set of features employed
in the classifying process. Diagnostic classifications, augmented by a
history of prior treatment applications, provide the state descriptions
for a Markovian decision model of the treatment-planning process.
Craniofacial-pain patient records from four university dental clinics serve as
a data base for model construction and validation.
The analytic models provide a means of duplicating the diagnostic
classifications and treatment plans of experts. Approximately 90% of
the diagnostic classifier's classifications and 93% of the
treatment-planning model's treatment selections concurred with the decisions made
by experts in the field of care for craniofacial-pain patients. Moreover,
the models permit an examination of the critical considerations associated


[12] Boyle, J.A., Greig, W.R., Franklin, D.A., Harden, R.McG.,
Buchanan, W.W., and McGirr, E.M., "Construction of a Model
for Computer-Assisted Diagnosis: Application to the Problem
of Nontoxic Goiter," Quarterly Journal of Medicine, Vol. 35
(1965) 565-588.
[13] Lodwick, G.S., Haun, C.L., Smith, W.E., Keller, R.F., and
Robertson, E.B., "Computer Diagnosis of Primary Bone Tumors:
A Preliminary Report," Radiology, Vol. 80 (1963) 273-275.
[14] Overall, J.E., and Williams, C.M., "Conditional Probability
Program for Diagnosis of Thyroid Function," Journal of the
American Medical Association, Vol. 183, No. 5 (1963)
307-313.
[15] Toronto, A.F., Veasy, L.G., and Warner, H.R., "Evaluation of a
Computer Program for Diagnosis of Congenital Heart Disease,"
Progress in Cardiovascular Diseases, Vol. 5, No. 4 (1963)
362-377.
[16] Wilson, W.J., Templeton, A.W., Turner, W.H., and Lodwick, G.S.,
"The Computer Analysis and Diagnosis of Gastric Ulcers,"
Radiology, Vol. 85 (1965) 1064-1073.
[17] Burbank, F., "A Computer Diagnostic System for the Diagnosis of
Prolonged Undifferentiated Liver Disease," American Journal
of Medicine, Vol. 46 (1969) 401-413.
[18] Collen, M.F., Rubin, L., Neyman, J., Dantzig, G.B., and Siegelaub,
A.B., "Automated Multiphasic Screening and Diagnosis,"
American Journal of Public Health, Vol. 54 (1964) 641-750.
[19] Lipkin, M., Engle, R.L., David, B.J., Zworykin, V.K., Ebald, R.,
Sendrow, M., and Berkley, C., "Digital Computers as an Aid
to Differential Diagnosis," Archives of Internal Medicine,
Vol. 108 (1961) 56-72.
[20] Overall, J.E., and Williams, C.M., "Comparison of Alternative
Computer Models for Thyroid Diagnosis," San Diego Symposium
on Biomedical Engineering, Vol. 3 (1963).
[21] Betague, N.E., and Gorry, A., "Automated Judgemental Decision-
Making for a Serious Medical Problem," Management Science,
Vol. 17, No. 1 (1971) B421-B434.
[22] Ledley, R.S., "Computer Aids to Clinical Treatment Evaluation,"
Operations Research, Vol. 15 (1967) 694-705.
[23] Meisel, W.S., Computer-Oriented Approaches to Pattern Recognition,
New York: Academic Press (1972).
[24] Nilsson, N.J., Learning Machines, New York: McGraw-Hill (1965).


and established a boundary application number for the treatment that
reflected their knowledge about the treatment's effectiveness as well
as the information supplied by the data.


order. For example, a patient's occupation of the state 'J|1,2,2'
denotes that he is currently classified in diagnostic alternative J,
and that since he entered the care system he has been treated with one
application of treatment 1 and two applications of treatment 2.
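
A history-augmented patient state of this kind can be sketched as a small data structure: the diagnostic alternative plus the multiset of treatments received, stored without their order. The class and function names are hypothetical, and the label format (alternative, a vertical bar, then a comma-separated treatment list) is an assumed reading of the notation in the scanned text.

```python
from typing import Iterable, NamedTuple, Tuple


class PatientState(NamedTuple):
    alternative: int              # current diagnostic alternative
    treatments: Tuple[int, ...]   # treatments received, kept sorted


def make_state(alternative: int, history: Iterable[int]) -> PatientState:
    """Build a state; sorting discards the chronological order of treatments."""
    return PatientState(alternative, tuple(sorted(history)))


def label(state: PatientState) -> str:
    """Render the state in the assumed 'J|t1,t2,...' form."""
    if not state.treatments:
        return str(state.alternative)
    return f"{state.alternative}|" + ",".join(str(t) for t in state.treatments)
```

Two patients in alternative 15 who received treatments in the orders 2, 1, 2 and 1, 2, 2 map to the same state, labeled `15|1,2,2`, which is what keeps the state description independent of treatment order.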
Augmenting the patient-state descriptions with treatment history
expands the dimensionality of the state space, yet the number of
history-augmented states remains finite for two reasons. The treatment records
used in model construction reveal that, for some combinations of
diagnostic alternatives and treatment applications, there is a feasible
limit to the number of treatment repetitions that can be given to any
one patient. Thus, the first reason for a finite state space is that no
patient state in the treatment-planning model includes more repetitions
of a particular treatment than the clinical data have established as a
feasible limit. As an example, the records of patient visits used in
model construction establish a feasible limit of only one application
of treatment 18 for patients classified in any of the diagnostic
alternatives. Therefore, the treatment-planning model includes patient states
that exclude treatment 18 as a portion of their treatment history or
exhibit the form
'J|...,18,...'
for each diagnostic classification 'J' where 18 is a feasible treatment.
The second reason for a finite state space is that there is a 'boundary
application' of many treatments such that neither the treatment-record
data nor the reviewing practitioners established differences between the
transition probabilities for the boundary application and those for
further repetitions of the treatments (see Section 4.1.2 and Appendix E).
In Diagnostic Alternative 13, for example, the first application of treat-


senting these visits in a test's random sample and some vectors used
in model construction. Such occurrences lead to test results that
overestimate classifier accuracy. Hence, in Test Six, a random sample of
all of the patient data associated with 40 patients (a total of 51
patient data vectors) was selected. This sample was classified by the
diagnostic-classification model using the remaining 429 data vectors as
a data base. The results of this test are included in the data shown
in Table 3. There is one other possible factor affecting the classifier's
accuracy as measured by these tests. It is conceivable that there were
duplicate data vectors in the data base of 480 patient-data-vectors.
If duplicates do exist and were included in both the test samples and
the samples' training bases, measures of classifier accuracy will be
overly optimistic. However, since 'noise' is introduced by the
variability among craniofacial-pain patients and generated in the
practitioner's transcribing of the elements of patient data into the
data-vector format, and since there are 2^295 possible data vectors, the
probability that two or more of the data-based patient vectors include
an identical specification of data-vector elements is small enough to
justify neglecting this possibility and its effects.
The results summarized in Table 3 reveal that the
diagnostic-classification model performs well in duplicating the
diagnostic classifications originally assigned by the reviewing
practitioners, Dr. Fast and Dr. Mahan. Moreover, the size of the test
samples was quite large in relation to the data base employed in
developing each test's diagnostic model. As new data become available
and are incorporated in the parameters of the model, the accuracy of the
craniofacial-pain diagnostic classifier can be expected to increase
slightly.


the appropriate pattern-vector elements. Then Procedure 1 is applied
to each subproblem. Finally, for each pair of pattern classes the
boundedness of the dual formulation P4 is examined. Vector X represents
a feasible set of pattern-classifying features for P1 if and only if
each of the [p(p-1)]/2 subproblem formulations P4 is unbounded.
Before a statement of the algorithm to solve problem P1 is presented
several terms must be defined. The assignment vector is defined as a
listing of variables x_i, elements of the vector X in P1, whose values
have been determined by the steps of the algorithm. The elements in this
vector are recorded with the value of their assignment, either zero or
one. These elements are entered in the vector in the order they were
assigned, with the first algorithm assignment in the first (left)
position. For example, consider the assignment vector
[x4 = 0, x10 = 1, x2 = 0].
This vector records that the algorithm first assigned x4 equal to zero,
then assigned x10 equal to one, and its last assignment was x2 equal to
zero. Feasibility of a solution X, as determined by the assignment-vector
component values, is checked by Procedure 2 with the value of those
variables not included in the assignment vector temporarily set equal to
one. The value V of an assignment vector is defined as minus one times
the sum of the costs associated with each of the variables in the
assignment vector, multiplied by the value assigned to the respective
variable. For the example assignment vector, [x4 = 0, x10 = 1, x2 = 0],
where c4 = 5, c10 = 2, and c2 = 7, the assignment vector has the value
V = (-1) [5(0) + 2(1) + 7(0)] = -2.
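
The definition of the value V can be checked with a few lines of Python. This is only an illustrative sketch: the function name `assignment_value` and the representation of the assignment vector as an ordered list of (variable, value) pairs are assumptions made here, not part of the dissertation's algorithm statement.

```python
def assignment_value(assignment, costs):
    """Return V = -1 * sum(cost * assigned value) over the assigned variables.

    `assignment` is an ordered list of (variable name, 0 or 1) pairs, in the
    order the algorithm made the assignments (hypothetical representation).
    `costs` maps each variable name to its feature-utilization cost.
    """
    return -sum(costs[var] * value for var, value in assignment)


# The example from the text: [x4 = 0, x10 = 1, x2 = 0], c4 = 5, c10 = 2, c2 = 7.
example_assignment = [("x4", 0), ("x10", 1), ("x2", 0)]
example_costs = {"x4": 5, "x10": 2, "x2": 7}
example_value = assignment_value(example_assignment, example_costs)
print(example_value)  # -2
```

Because V is the negative of the total cost of the features assigned the value one, a larger V corresponds to a cheaper collection of features.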


FIGURE 4
PROCEDURE 2


of the intricate neuromuscular mechanisms controlling mandibular
movement [5]. Emotional tension may also lead to hypertonicity of the
striated masticatory muscles resulting in facial pain or altered
sensation without evidence of peripheral dysfunction. In addition,
abnormal occlusal contacts of the teeth may affect muscle tonicity
resulting in mandibular dysfunction [5]. Moreover, the temporomandibular
joint is prone to disorders common to all joints: rheumatoid arthritis,
osteoarthritis, traumatic injuries, neoplasms, and nonarticular
disorders. Although the term 'craniofacial pain' is a broad
classification for pain in the head and face, the term is used in this
dissertation to describe pathological, congenital, hereditary-based, or
emotional causes of pain in and around the temporomandibular joint.
Though the degree of severity may vary, one or more of the following
four 'cardinal symptoms' are exhibited by the craniofacial-pain patient:
pain, joint sounds, limitation of motion, and tenderness in the
masticatory muscles [6]. Accompanying these symptoms the patient may
complain of, or the practitioner may find, hearing loss, burning
sensations, migraine-like headaches, vertigo, tinnitus, subluxation,
luxation, dental pulpitis, sinus disease, glandular disorders, occlusal
disharmony, and radiographic evidence of joint abnormality. The degree
of association of these additional symptoms and findings with the
etiology of the joint disorders is subject to considerable variation.
Paralleling these areas of anatomic dysfunction is the possibility
that the craniofacial-pain patient may be suffering from psychic
disorders. In no other type of patient seen by the dentist does psychic
condition play a larger role [7]. Most craniofacial-pain patients have
symptoms or signs of anxiety, and a sensory preoccupation with the oc-


where [ 0 ] is the null matrix.
By Step 6 of Procedure 1, feasible convex combinations of these
matrices exist.
Hence the assignment vector [x2 = 0] is infeasible by the rules
of Procedure 2. Go to Step 6 of the algorithm.
Step 6: The assignment vector is now [x2 = 1]. As this vector does not
include an assignment for every variable, return to Step 4 of
the algorithm.
The tree of possible solutions to P1 now has the form
[tree diagram not recoverable]
Step 4: Select the variable x3 for assignment. The assignment vector is
now [x2 = 1, x3 = 0]. Apply Procedure 2.
In Procedure 2, zeroing out the column k=3 from A1, A2, and A3
yields
[matrices A1, A2, and A3: entries not recoverable]
Application of Procedure 1 to A1 and A2 yields the following
reduced matrices:
[reduced matrices A1 and A2: entries not recoverable]


CHAPTER 1
INTRODUCTION
The rapid pace of developments in medical and dental research
prevents the practicing physician and dentist from fully utilizing each
new diagnostic and treatment-planning aid as it is published. In each of
the last four years an average of 215,000 new publications have been
written to supplement the knowledge of the health-care practitioner [1].
Concurrently, the pressures of an ever-increasing patient load force
practitioners to select the most expeditious means for diagnosing
disorders and selecting treatments. For example, the medical
general-practitioner (1970) saw an average of 173 patients a week [2],
and the median dental practitioner (1971) saw two patients an hour [3].
Given these circumstances, practitioners may overlook possible
diagnostic and treatment alternatives or they may apply inappropriate
treatments. If meaningful analytic descriptions of the diagnostic and
treatment-planning processes can be developed, these models can assist
educators in training new practitioners, researchers in evaluating and
disseminating new developments, and practitioners in improving the
quality of patient care [4].
Developing models of the diagnostic-classification and
treatment-planning process requires an understanding of the underlying
physiological processes of diseases and the mechanisms of their cures.
Obviously, the effects of disease and the means of cure vary from one
health-care problem to another. Thus, modeling efforts in diagnosis and
treatment planning must be integrally related to the facet of health
care that is under


Location of Tenderness     Left Side     Right Side
Location of Pain           Side
Limited Jaw Opening        243  Yes
Joint Sounds               244  Clicking
                           245  Crepitation
                           246  Pain accompanying joint sound
Headaches                  247  Frequent headaches
                           248  Headache associated with joint pain

assignment for every variable; go to Step 7 of the algorithm.
Step 7: The value V is calculated for this assignment vector, where
V = -1[1(6) + 1(3) + 0(2)] = -9.
As V* = -∞, go to Step 8 of the algorithm.
Step 8: V* is set equal to -9, and the values of the variables x1 = 0,
x2 = 1, and x3 = 1 in this assignment vector are stored for
future reference. Go to Step 1 of the algorithm.
Steps 2 and 3 of the algorithm dictate that the algorithm is
terminated at this point since these steps generate the assignment
vector [x2 = 1, x3 = 1, x1 = 1] which is known to be feasible to P1.
V for this assignment vector is -11, which is smaller than V*. Hence
the minimum-cost collection of classifying features is feature 2 and
feature 3, with a cost of 9 units associated with utilizing these
features in a linear pattern classifier.


Analysis by Duration of Pain

                      Less than     From 3 to     More than
                      3 weeks       6 weeks       6 weeks
Transition      8
into           13
               15
               Well

χ² = 5.047 with 6 degrees of freedom

Hence, the analysis reveals that duration of pain is not significant in
determining estimates of transition probabilities out of Diagnostic
Alternative 13 following application of treatment 24, as
χ²(0.05, 6) = 12.592.

Analysis by Nature of Pain

                      Continuous    Episodic
Transition      8          0            1
into           13          8            6
               15          1            0
               Well       11            3

χ² = 3.964 with 3 degrees of freedom

Hence, the analysis reveals that nature of pain is not significant in
determining estimates of transition probabilities out of Diagnostic
Alternative 13 following application of treatment 24, as
χ²(0.05, 3) = 7.815.
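
The nature-of-pain statistic can be reproduced with a short Python sketch. It assumes that the eight cell counts on this page (0, 1, 8, 6, 1, 0, 11, 3) fill the Continuous and Episodic columns row by row for transitions into 8, 13, 15, and Well; that layout is an assumption about the damaged table, although it does reproduce the reported value of 3.964. The function name and the small dictionary of tabulated 5% critical values are illustrative choices, not the author's code.

```python
def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Rows: transition into 8, 13, 15, Well; columns: Continuous, Episodic
# (assumed layout of the extracted cell counts).
nature_of_pain = [[0, 1], [8, 6], [1, 0], [11, 3]]

# Tabulated 5% critical values quoted in the text.
CRITICAL_05 = {3: 7.815, 6: 12.592}

statistic, dof = chi_square_independence(nature_of_pain)
significant = statistic > CRITICAL_05[dof]
print(round(statistic, 3), dof, significant)  # 3.964 3 False
```

Since the statistic falls below the critical value, the sketch reaches the same decision as the text: nature of pain is not treated as significant for these transition estimates.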


CHAPTER 5
CONCLUSIONS AND FUTURE RESEARCH
This dissertation has presented analytic models of the decision
processes associated with diagnosing and selecting treatments for a
particular health-care problem. The selection, construction, and testing
of these models have been discussed in some detail. Meanwhile, the
model-building effort itself has been the source of a number of insights
into decision-making in a health-care environment. These insights will
be reflected in this chapter's discussion of the dissertation's central
research conclusion and suggestions of topics for future investigation.
The similarity between the decision-making processes employed by
the practitioner and the analytic structure of this dissertation's
models is quite revealing. In both diagnosis and treatment planning for
craniofacial-pain patients it appears that the practitioner, like the
analytic models, makes 'first-order' decisions. The linearity of symptom
significance (a first-order polynomial of symptom weights), and the
present-patient-state dependency of transition probabilities measuring
treatment effectiveness (a first-order stochastic dependence) provide a
means of generating decisions that closely approximate the decisions
made by dental practitioners. This general conclusion on the
applicability of first-order decision techniques to craniofacial-pain
diagnostic classification and treatment planning characterizes the
central development of this dissertation.


TABLE 2
CORRELATION BETWEEN SIGNIFICANT SYMPTOMS
AND DISCRIMINANT-FUNCTION WEIGHTS

Diagnostic Alternative 4: Temporomandibular Joint Arthritis-Traumatic
(Acute)

                                               Discriminant-Function
Significant Symptoms                           Weights
(+) Duration of Pain (less than 3 weeks)       + 3
(+) History of Trauma (accidental)             +30
(+) Preauricular Pain                          +11
(-) Salivary Gland Disease                     -12
(-) Otitis                                       1
(discriminant-function weights for Diagnostic Alternative 4 range
from -19 to +37)

Diagnostic Alternative 14: Myofacial Pain-Dysfunction Bruxism

                                               Discriminant-Function
Significant Symptoms                           Weights
(+) Duration of Pain (more than 6 weeks)       +15
(+) Facets                                     + 2
(+) Bruxism and/or Clenching                   +56
(-) History of Trauma (accidental)             -16
(-) Salivary Gland Disease                       5
(discriminant-function weights for Diagnostic Alternative 14 range
from -23 to +56)

Note: For both Diagnostic Alternatives
(+) indicates a symptom that leads the practitioner to classify
    a patient in that diagnostic alternative
(-) indicates a symptom that leads the practitioner to classify
    a patient in some other diagnostic alternative


In the second application of the steps of Procedure 1:
1. Row 1 can be eliminated from both matrices by Step 1,
2. Row 2 can be eliminated from both matrices by Step 2, and
3. Column 4 can be eliminated from matrix B by Step 4.
After the second application of the steps of the procedure
\[
A = \begin{bmatrix} 0 & 0 \\ 1 & 1 \end{bmatrix}, \qquad
B = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix}.
\]
In the third application of the steps of Procedure 1:
1. Row 2 can be eliminated from both matrices by Step 1 and
2. Procedure 1 can be terminated by Step 3.
Hence, for this set of A and B matrices, subproblem P2 has no feasible
solution.
Although the use of Procedure 1 may lead to a reduction in the size
of most subproblems, the pattern vectors (a_i and b_i) for each of these
problems may still be quite large. Restating subproblem P2 as a linear
program yields
\[
\text{P3:} \quad \text{minimize} \quad
\begin{bmatrix} 0 & 0 \end{bmatrix}
\begin{bmatrix} U \\ V \end{bmatrix}
\]
subject to
\[
\begin{bmatrix}
A & -B \\
1\,1 \cdots 1 & 0\,0 \cdots 0 \\
0\,0 \cdots 0 & 1\,1 \cdots 1
\end{bmatrix}
\begin{bmatrix} U \\ V \end{bmatrix}
=
\begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}
\]
and
\[
U \geq 0, \qquad V \geq 0,
\]
where the existence of any solution vectors U* and V* signals the
intersection of the convex hulls of pattern-classes A and B.


[Flow chart of patient-state transitions; the diagram is not recoverable.
Legible state labels: 13|18, 8|12, 8|12,12, 10|23, 10|12,23, 11|12.]

Proof that the algorithm converges if feasible weight vectors
W_j*, j=1,2,...,p, exist (that is, the sample space is linearly
separable) is developed in Nilsson [24]. Nilsson's proof can be directly
applied since for any set of feasible W_j*
\[
a_j^{(k)} W_j^{*} > a_j^{(k)} W_z^{*} + \alpha
\]
for all k=1,2,...,t, and z=1,2,...,p, z ≠ j, while for any j=1,2,...,p,
and i=1,2,...,
\[
a_j^{(k)} W_j^{(i)} < a_j^{(k)} W_z^{(i)} + \alpha
\]
for some k and some z.
Typically, a training algorithm is applied to the members of a
training sample without prior knowledge of whether the sample pattern
space is linearly separable. The algorithm is allowed to process sample
patterns until it either converges on a set of discriminating hyperplanes
or it has run for a 'reasonable' amount of time without termination.
Experience with medical data and the modified fixed-increment algorithm
has shown that if there is a set of discriminating hyperplanes, the
algorithm will find it in no more than 3 complete cycles for each of the
pattern classes. For example, if there are 5 pattern classes and the
pattern space can be linearly partitioned, the algorithm should terminate
in no more than 15 full cycles through the training data. This rough
measure of training time provides an index for establishing a limit on
computer processing time.
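
The cycle limit described above can be illustrated with a short Python sketch. It implements the standard multi-class fixed-increment error-correction rule (as in Nilsson's Learning Machines) with a cap of three cycles per pattern class; it is not the modified algorithm of Appendix B, whose margin term is omitted here. The function names, the zero starting weights, and the two-pattern training set are all hypothetical.

```python
def dot(w, x):
    """Inner product of a weight vector and an augmented pattern vector."""
    return sum(wi * xi for wi, xi in zip(w, x))


def train_fixed_increment(samples, n_classes, cycles_per_class=3):
    """Train linear discriminants; stop after cycles_per_class * n_classes cycles.

    `samples` is a list of (augmented pattern, class index) pairs.
    Returns (weights, cycles used, converged flag).
    """
    dim = len(samples[0][0])
    weights = [[0] * dim for _ in range(n_classes)]
    max_cycles = cycles_per_class * n_classes
    for cycle in range(1, max_cycles + 1):
        corrections = 0
        for x, label in samples:
            scores = [dot(w, x) for w in weights]
            # Strongest competing class for this pattern.
            rival = max((z for z in range(n_classes) if z != label),
                        key=lambda z: scores[z])
            if scores[rival] >= scores[label]:
                # Fixed-increment correction: reward the true class, penalize the rival.
                weights[label] = [w + xi for w, xi in zip(weights[label], x)]
                weights[rival] = [w - xi for w, xi in zip(weights[rival], x)]
                corrections += 1
        if corrections == 0:
            return weights, cycle, True
    return weights, max_cycles, False


def classify(x, weights):
    """Index of the class whose discriminant value is largest."""
    return max(range(len(weights)), key=lambda j: dot(weights[j], x))


# Hypothetical binary patterns, each augmented with a trailing 1.
training = [([1, 0, 1], 0), ([0, 1, 1], 1)]
weights, cycles_used, converged = train_fixed_increment(training, n_classes=2)
```

With two classes the cap is six cycles. If the flag comes back False, the sample is either not linearly separable or needs more cycles than the rule of thumb allows.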
An application of the modified fixed-increment training algorithm
is presented in Figure 7.


To my wife,
Mary


I certify that I have read this study and that in my opinion it
conforms to acceptable standards of scholarly presentation and is fully
adequate, in scope and quality, as a dissertation for the degree of
Doctor of Philosophy.
Richard S. Mackenzie
Professor and Director, Office of Dental
Education
I certify that I have read this study and that in my opinion it
conforms to acceptable standards of scholarly presentation and is fully
adequate, in scope and quality, as a dissertation for the degree of
Doctor of Philosophy.
Associate Professor of Industrial and
Systems Engineering
This dissertation was submitted to the Dean of the College of Engineering
and to the Graduate Council, and was accepted as partial fulfillment of
the requirements for the degree of Doctor of Philosophy.
Dean, Graduate School


clusion of their teeth [8]. Many of these patients can be characterized
by a heavy reliance on denial, repression, and projection of their
psychic disorders in order to maintain their self-concept of emotional
stability [6]. Often the complaints these patients relate to the
practitioner are not compatible with any objective signs.
The practitioner who manages the care of craniofacial-pain patients
assumes a difficult task. For some of these patients, diagnosis is
obvious. Generally, however, the craniofacial-pain patient presents a
complex combination of signs and symptoms [7]. More than one disease
entity normally accounts for the patient's symptoms and most
craniofacial-pain patients suffer from a pain-dysfunction complex
involving a combination of masticatory muscle disorders, occlusal
disharmony, emotional tension, and anxiety [5]. Nevertheless the
possibility of multiple almost sub-clinical etiologic factors combining
to produce the dysfunction and pain must be considered. The close
relationship of organic and emotional disorders as they appear in
craniofacial-pain patients provides the examining dentist with the
problem of discriminating which factor is primary in the etiology of the
patient's dysfunction [7]. Unfortunately, the temporomandibular joint is
one of the most difficult areas of the body to examine radiographically
[8]. Hence, with these patients, the dentist relies to a large degree on
tests of emotional stability and physical examination by visualization,
palpation, and auscultation [7].
Therapeutic measures for the care of craniofacial-pain patients are
as varied as the factors contributing to the disorder. "A small
percentage of patients with symptoms referrable to the temporomandibular
joint will portray such a confusing picture that consultation with other
dental or medical specialists is indicated" [7, p. 129]. The majority of
these


Step 4: Select the variable x1 for assignment. The assignment vector
is now [x2 = 1, x3 = 1, x1 = 0]. Apply Procedure 2.
In Procedure 2, zeroing out the column k=1 from A1, A2, and A3
yields
[matrices A1, A2, and A3: entries not recoverable]
Application of Procedure 1 to A1 and A2 yields the following
reduced matrices:
A1 = [ 1 ]    A2 = [0 0] .
By Step 3 of Procedure 1, no feasible convex combination of
these matrices exists.
Application of Procedure 1 to A1 and A3 yields the following
reduced matrices:
A1 = [ 1 ]    A3 = [0 0] .
By Step 3 of Procedure 1, no feasible convex combination of
these matrices exists.
Application of Procedure 1 to A2 and A3 yields the following
reduced matrices:
A2 = [1 1]    A3 = [0 0] .
By Step 3 of Procedure 1, no feasible convex combination of
these matrices exists.
Hence the assignment vector [x2 = 1, x3 = 1, x1 = 0] is
feasible by the rules of Procedure 2. Go to Step 5 of the
algorithm.
Step 5: The assignment vector [x2 = 1, x3 = 1, x1 = 0] includes an


the care system, since an 'n' visit holding time in a particular patient
state can be modeled with no loss of information as 'n' repetitions of
the 'virtual' transition from the state in question to itself. Care for
craniofacial-pain patients is modeled as a discrete-stage Markovian
system with the beginning of visits to the practitioner serving as stage
indicators.
Using the history-augmented patient states, transition probabilities
are specified in terms of the treatment that generated the transformation.
In making a state-transition following a treatment, a patient must move
to a state that includes that treatment as a portion of its state
description. For example, following application of treatment 'k,' a
patient must progress from patient-state 'I|m,n' to 'J|k,m,n' where 'I'
may be equivalent to 'J.' The only exception to this rule is in the
application of a treatment beyond its boundary number of repetitions.
Here, if treatment 'k' has a boundary number of two, then following an
application of treatment 'k' three or more times a patient progresses
from patient state 'I|k,k,m,n' to 'J|k,k,m,n' where again 'I' may be
equivalent to 'J.' This structure is indicated because inclusion of more
than the boundary number of applications (two in this case) in the state
description does not affect the transition probabilities.
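
The transition rule and its boundary-number exception can be sketched in Python. Everything named here is an illustrative assumption: a state is represented as a (diagnostic alternative, treatment history) pair, the history is kept as a sorted tuple, and `boundary` is a hypothetical mapping from a treatment to its boundary number of applications.

```python
def next_state(state, treatment, new_diagnosis, boundary):
    """Return the history-augmented state reached after applying `treatment`.

    `state` is (diagnostic alternative, tuple of treatments already applied).
    `boundary` maps a treatment to its boundary number of applications;
    treatments absent from the mapping are recorded without limit here.
    """
    _, history = state
    limit = boundary.get(treatment)
    if limit is not None and history.count(treatment) >= limit:
        # Beyond the boundary number the history is left unchanged,
        # e.g. 'I|k,k,m,n' -> 'J|k,k,m,n' when k has a boundary number of two.
        new_history = history
    else:
        new_history = tuple(sorted(history + (treatment,)))
    return (new_diagnosis, new_history)


# Hypothetical example: treatment 12 with a boundary number of two.
boundary = {12: 2}
first = next_state((8, (12,)), 12, 8, boundary)   # second application is recorded
second = next_state(first, 12, 13, boundary)      # third application: history unchanged
print(first, second)  # (8, (12, 12)) (13, (12, 12))
```

Capping the recorded repetitions in this way is what keeps the number of history-augmented states finite.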
Estimates of the values of the transition probabilities were
obtained from the patient records discussed previously. A discussion of
the stability of these probability estimates under variations in patient
data is presented in Appendix E. Where the data on the effects of
treatment alternatives were limited, the data-generated probability
estimates were refined by estimates from the reviewing practitioners.
Notationally, transition probabilities are represented in the analytic
model in the

This feature-selection algorithm will be employed to find the
minimum-cost collection of classifying features. For the purposes of
illustration, the 'feature' variables x_i, i=1,2,3, are selected in the
order x2, x3, x1 in Step 4 of the algorithm (Section 3.4.2). Note that
this is a logical ordering of features in descending order of
feature-utilization 'costs.' This prior specification of the order of
variable assignments permits the construction of a tree that represents
the possible solutions remaining to be considered at each step of the
algorithm. This tree of possible solutions to P1 has the form
[tree diagram not recoverable]
Step 0: The algorithm is initialized with V* = -∞. Go to Step 4 of the
algorithm.
Step 4: Select the variable x2 for assignment. The assignment vector is
now [x2 = 0]. Apply Procedure 2 (Figure 4).
In Procedure 2, zeroing out the column k=2 from A1, A2, and A3
yields
[matrices A1, A2, and A3: entries not recoverable]
Application of Procedure 1 (Section 3.4.1) to A1 and A2 yields
the following reduced matrices:
Am Am
= P 1 A2 = [ 0 ],


tion of each of the 't_i' patterns in each of the 'p' pattern classes
\[
a_i^m = [a_{i1}^m, a_{i2}^m, \ldots, a_{in}^m, 1], \qquad i = 1,2,\ldots,p,
\]
where a_{ik}^m, k=1,2,...,n, equals either zero or one, there must exist
a set of 'n+1' dimensional W_j's, j=1,2,...,p, such that
\[
a_i^m (W_i - W_j) > 0 \quad \text{for all } m = 1,2,\ldots,t_i;
\quad i = 1,2,\ldots,p; \quad j = 1,2,\ldots,p; \quad j \neq i. \tag{3}
\]
Letting A_i be the t_i x (n+1) dimensional matrix of patterns in
pattern-class i, then the requirement of (3) can be written in the
following form:
\[
A_i (W_i - W_j) > 0, \qquad i = 1,2,\ldots,p; \quad j = 1,2,\ldots,p; \quad j \neq i.
\]
If such pattern representations and W_j's exist, then a solution to the
following problem yields a minimum-cost collection of pattern-classifying
features:
\[
\text{P1:} \quad \text{minimize } CX \quad \text{subject to} \quad
A_i [X (W_i - W_j)] \geq \varepsilon, \qquad i = 1,2,\ldots,p; \quad j = 1,2,\ldots,p
\]

BIBLIOGRAPHY
[1] U.S. Department of Health, Education, and Welfare, Cumulated
Index Medicus, Washington, D.C.: U.S. Government Printing
Office (1970-1973).
[2] Ahevne, P., Ryan, G.A., and Walsh, R.J., 1972 Reference Data on
the Profile of Medical Practice, Chicago: Center for Health
Services Research and Development, American Medical
Association (1972).
[3] Bureau of Economic Research and Statistics, "1971 Survey of Dental
Practice," Journal of the American Dental Association,
Vol. 85 (1972) 154-158.
[4] Bruce, R.A., and Yarnall, S.R., "Computer-Aided Diagnosis of
Cardiovascular Disorders," Journal of Chronic Diseases,
Vol. 19 (1966) 473-484.
[5] Schwartz, L., and Chayes, C.M., Facial Pain and Mandibular
Dysfunction, Philadelphia: W. B. Saunders (1968).
[6] TMJ Research Center, Conference on Function and Dysfunction of
the Temporomandibular Joint Complex, Chicago: University of
Illinois (1969).
[7] Mitchell, D.F., The Dental Clinics of North America, Symposium
on Oral Medicine, Philadelphia: W. B. Saunders (1968).
[8] Mitchell, D.F., Standish, S.M., and Fast, T.B., Oral Diagnosis/
Oral Medicine, 2nd Edition, Philadelphia: Lea & Febiger
(1971).
[9] Ledley, R.S., "Practical Problems in the Use of Computers in
Medical Diagnosis," Proceedings of the IEEE, Vol. 57, No. 11
(1969) 1900-1918.
[10] Lincoln, T.L., and Parker, R.D., "Medical Diagnosis Using Bayes
Theorem," Health Services Research, Vol. 2, No. 1 (1967)
34-35.
[11] Bunch, W.H., and Andrew, G.M., "Use of Decision Theory in
Treatment Selection," Clinical Orthopaedics and Related
Research, No. 80 (1971) 39-52.


The second validating procedure established a measure of variability
on the diagnostic classifications that might be given by different dental
practitioners. The discussion presented in Section 1.1 related the
difficulties associated with diagnosing craniofacial-pain disorders.
Practitioners with varying kinds of professional experience can be
expected to reflect their dissimilar backgrounds in differing diagnostic
classifications for these patients. To measure the variability
associated with dissimilar backgrounds, five craniofacial-pain data
vectors were selected from the data base employed in constructing the
craniofacial-pain diagnostic classifier. Four dentists from the staff of
the College of Dentistry at the University of Florida were asked to
review these patient data vectors and assign to each of them a
diagnostic classification. Table 4 summarizes their assignments and also
includes the diagnostic classification originally given by the reviewing
practitioners.
The variability in diagnostic assignments reflected in Table 4
reaffirms the justification for the research objectives set forth in
Section 1.2. Some of the differences in the practitioners' choices of
diagnostic classifications can be explained by the limited amount of
data contained in each of the data vectors, and the less-than-full
medical statement of each of the diagnostic alternatives. Nevertheless,
a diagnostic-classification model that generates classifications that
are in 90% agreement with those of experts in the field provides a
sizeable improvement over the variability in classification assignments
exhibited in Table 4 in which only half the respondents agreed on a
single diagnosis in four out of five cases.


[Figure: TEMPOROMANDIBULAR JOINT. Right temporomandibular articulation.
Inset: Anatomical features of the temporomandibular joint.]


        3.4.2 Statement of the Minimum-Cost Symptom-Selection Algorithm
        3.4.3 Computational Considerations
    3.5 Model Applications
4. Treatment Planning
    4.1 Model Components
        4.1.1 Patient States
        4.1.2 Transition Probabilities
        4.1.3 Cost Structure
    4.2 Selection of Optimal Treatments
    4.3 Model Validation
    4.4 Model Applications
5. Conclusions and Future Research
Appendices
    A Craniofacial-Pain Patient Data Vector
    B Modified Fixed-Increment Training Algorithm
    C Application of the Minimum-Cost Symptom-Selection Algorithm
    D Treatment Alternatives for Craniofacial-Pain Patients
    E Stability of Transition-Probability Estimates
    F Flow Charts of Patient-State Transitions
    G Patient-State Treatment Selections
    H Application of the Patient-State-Labeling and
      Optimal-Treatment-Selection Procedure
BIBLIOGRAPHY
BIOGRAPHICAL SKETCH


patient states, or 92.6% of the patient states. The 7 differences in
treatment selections arise in part from the approximations the
treatment-planning model employs in its representation of the care
system and in part from slight inconsistencies in the practitioner's
treatment selections.
One last test was performed to verify the suitability of the
Markovian representation of the craniofacial-pain care system. Mean
transit times through the care system to one of the terminal states were
calculated using the model-generated treatment decisions, and each of
six first-visit patient states. These model-generated transit times were
compared to estimates of the same statistics gathered from the patient
records contributed by the university dental clinics. Table 5 presents
the values of both sets of statistics. The close correlation of these
values reveals that the treatment-planning model not only duplicates the
decisions of experts, but also provides a structure for gathering other
relevant information about the underlying care system.
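
A mean transit time of this kind can be computed from the transition probabilities among the non-terminal states once a treatment has been fixed for each state. The Python sketch below is illustrative only: the two-state matrix Q is invented, the function name is hypothetical, and the fixed-point iteration t = 1 + Qt is one simple way of obtaining the expected number of visits before a terminal state is reached, not necessarily the calculation used for Table 5.

```python
def mean_visits_to_absorption(Q, tol=1e-12, max_iter=100000):
    """Expected number of visits before reaching a terminal state.

    Q[i][j] is the probability of moving from non-terminal state i to
    non-terminal state j under the selected treatment; the remaining
    probability in each row leads to a terminal state.
    Solves t = 1 + Q t by repeated substitution.
    """
    n = len(Q)
    t = [0.0] * n
    for _ in range(max_iter):
        new_t = [1.0 + sum(Q[i][j] * t[j] for j in range(n)) for i in range(n)]
        if max(abs(a - b) for a, b in zip(new_t, t)) < tol:
            return new_t
        t = new_t
    raise RuntimeError("no convergence; check that every state can reach a terminal state")


# Hypothetical two-state system. The diagonal entries are 'virtual'
# self-transitions representing visits on which the patient state is unchanged.
Q = [[0.5, 0.25],
     [0.0, 0.5]]
transit_times = mean_visits_to_absorption(Q)
print([round(v, 6) for v in transit_times])  # [3.0, 2.0]
```

Starting in the second state, the patient needs two visits on average; starting in the first, three, since one quarter of the departures pass through the second state.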
4.4 Model Applications
Like the diagnostic-classification model presented in Chapter 3, the
craniofacial-pain treatment-planning model has been structured to permit
its utilization in a variety of applications. Markovian modeling provides
an analytic representation of the craniofacial-pain care system as well
as establishing a means of making treatment selections. This section
discusses applications of the model's analytic representation and
treatment selections in teaching, in research, and in practice.
The model-generated treatment decisions reveal which treatments are
most frequently used in the care of craniofacial-pain patients. In a
teaching environment, this information can be used to specify treatment-


and the dependence among symptoms that are required for Bayesian analysis.
With the non-parametric classifier, a diagnosis is generated for the
practitioner by evaluating a discriminant function associated with each
diagnostic classification, g_i(·), i=1,2,...,n. As was the case with the
Bayesian models, the values of these discriminants are a function of the
symptom-complex S exhibited by the patient. The patient's diagnostic
classification corresponds to that disease whose associated
discriminant-function value is maximum. That is, a patient with symptoms
S is classified in disease-group i if
g_i(S) > g_k(S) for all k ≠ i.
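
The maximum-discriminant rule can be written out in a few lines of Python. This sketch assumes linear discriminants, that is, each g_i(S) is a weighted sum of binary symptom indicators; the weights, the disease-group labels, and the function name are invented for illustration and do not come from the dissertation's tables.

```python
def classify_by_discriminant(symptoms, weight_vectors):
    """Return the disease group with the largest discriminant value, plus all values.

    `symptoms` is a list of 0/1 symptom indicators (the symptom-complex S).
    `weight_vectors` maps each disease-group label to its list of weights,
    so that g_i(S) is the weighted sum of the indicators.
    """
    values = {
        label: sum(w * s for w, s in zip(weights, symptoms))
        for label, weights in weight_vectors.items()
    }
    best = max(values, key=values.get)
    return best, values


# Hypothetical weights for two disease groups over three symptoms.
hypothetical_weights = {
    "group A": [3, 30, -12],
    "group B": [15, -16, -5],
}
patient = [1, 1, 0]  # first two symptoms present, third absent
group, values = classify_by_discriminant(patient, hypothetical_weights)
print(group, values)  # group A {'group A': 33, 'group B': -1}
```

Only one weight vector per diagnostic classification has to be stored, which is consistent with the small storage requirement the text attributes to this kind of classifier.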
Results from some of the applications of pattern-recognition
classifiers are presented in Table 1. In these test applications
diagnostic accuracy was consistently high. Because of these models' ease
of implementation and small storage requirements, a non-parametric
pattern classifier is preferable as a vehicle for generating diagnostic
classifications. The use of a non-parametric classifier is further
motivated by features of the care process for craniofacial-pain patients
discussed in Chapter 3.
2.3 Finite-Horizon Treatment Planning
In the realm of research on modeling the treatment-planning process,
several authors [9, 21, 22] have presented schemes for analysis that
utilize methods for making decisions under risk and uncertainty. The
treatment-selection process has alternately been defined as a two-person
zero-sum game, structured as a decision tree, and modeled as a Markov
process of limited duration. Treatment costs and the 'costs' of
occupying 'non well' or terminal patient states provide the basis for
selecting an 'optimal' treatment plan. Finiteness of the planning
horizon is assured either by establishing a maximum permissible number
of treatment

TABLE 3
TESTS OF DIAGNOSTIC CLASSIFIER ACCURACY

              Number of       Number of
              Patient         Data Vectors            Classifier
              Data Vectors    Correctly Classified    Accuracy
TEST ONE      50              46                      92.0%
TEST TWO      50              45                      90.0%
TEST THREE    50              44                      88.0%
TEST FOUR     50              47                      94.0%
TEST FIVE     50              45                      90.0%
TEST SIX      51              43                      84.3%

Mean Classifier Accuracy                   89.7%
Standard Deviation of Classifier Accuracy   3.5%


cost for employing a new test, the algorithm returns an evaluation of
the test's classifying capability. The algorithm reveals whether the
test is included in the minimum-cost collection of features and whether
the use of the new test permits the practitioner to discontinue other
examination procedures. Additionally, the algorithm can be employed to
point out new areas for research, as it isolates diagnostic alternatives
where correct classification of patients is difficult using existing tests
and procedures.
As employed in the practitioner's office, the diagnostic classifier
will provide a direct link between the practicing dentist and the
knowledge of experts in the field of craniofacial pain. Information will
flow over the link in both directions. As new patients are seen by the
practitioner, the record of each visit will be reviewed by experts and
then used to supplement the data base employed in model construction.
Then, when developments dictate, new sets of discriminant-function weights
can be transmitted to the dental practitioners. This kind of interaction
results in a more accurate and representative diagnostic classifier as
the patient-sample data base becomes larger.


xml record header identifier oai:www.uflib.ufl.edu.ufdc:UF0008974800001datestamp 2009-02-09setSpec [UFDC_OAI_SET]metadata oai_dc:dc xmlns:oai_dc http:www.openarchives.orgOAI2.0oai_dc xmlns:dc http:purl.orgdcelements1.1 xmlns:xsi http:www.w3.org2001XMLSchema-instance xsi:schemaLocation http:www.openarchives.orgOAI2.0oai_dc.xsd dc:title Analytical models for diagnostic classification and treatment planning for craniofacial pain.dc:creator Leonard, Michael Stevendc:publisher Michael Steven Leonarddc:date 1973dc:type Bookdc:identifier http://www.uflib.ufl.edu/ufdc/?b=UF00089748&v=00001000580622 (alephbibnum)14057039 (oclc)dc:source University of Floridadc:language English


ACKNOWLEDGEMENTS
Without the considerable contributions of time and effort by the
members of his committee, it would have been impossible for the author
to have completed this dissertation. In particular, the author expresses
gratitude to his Chairman, Dr. Kerry Kilpatrick, for his encouragement
and direction during the course of this research effort. The author also
thanks Dr. Kilpatrick for his editorial assistance during the development
and organization of this manuscript. The author thanks Dr. Richard
Mackenzie and Dr. Stephen Roberts for providing the initial direction for
this research. Additionally, the author is grateful to Dr. Thom Hodgson
and Dr. Donald Ratliff for their assistance in evaluating and refining
the author's ideas throughout this project. The author expresses his
gratitude to Dr. Thomas Fast and Dr. Parker Mahan for the contribution of
their extensive knowledge about craniofacial pain to the author's research.
The author is deeply appreciative of Dr. Fast's and Dr. Mahan's
willingness to spend many hours examining dental records and their endurance of
the nomenclature and idiosyncrasies of this mathematical-modeling effort.
Financial support for this research was provided by the Health Systems
Research Division, J. Hillis Miller Health Center. The division's
support in conjunction with a traineeship granted by the National Science
Foundation made it possible for the author to undertake this research.
The author is also grateful to the Industrial and Systems Engineering
Department for the contribution of computer funds. Additionally the
author thanks Dr. William Solberg, University of California at Los Angeles;


1. Temporomandibular Joint Arthritis - Developmental
2. Temporomandibular Joint Arthritis - Infectious
3. Temporomandibular Joint Arthritis - Osteo (Degenerative)
4. Temporomandibular Joint Arthritis - Traumatic (Acute)
5. Temporomandibular Joint Arthritis - Traumatic (Chronic)
6. Myopathy - Acute Trauma
7. Myopathy - Myositis
8. Oral Pathology - Dental Pathology
9. Vascular Changes - Migrainous Vascular Changes
10. Myofacial Pain-Dysfunction - Malocclusion - Balancing Interferences
11. Myofacial Pain-Dysfunction - Malocclusion - Lateral Deviation of Slide
12. Myofacial Pain-Dysfunction - Malocclusion - Uneven Centric Stops
13. Myofacial Pain-Dysfunction - Psychoneurosis - Anxiety/Depression
14. Myofacial Pain-Dysfunction - Bruxism
15. Myofacial Pain-Dysfunction - Reflex Protective Muscular Contracture
16. Myofacial Pain-Dysfunction - Loss of Posterior Occlusion
17. Neuropathy
FIGURE 3
CRANIOFACIAL-PAIN DIAGNOSTIC ALTERNATIVES


Assume $a_X^1 = [1,0]$, $a_X^2 = [0,1]$, $a_Y^1 = [0,0]$, and $a_Y^2 = [1,1]$.
Graphically this pattern space can be represented as
(figure: lines X and Y plotted in the two-dimensional pattern space)
where the line X from $a_X^1$ to $a_X^2$ represents the convex hull of
pattern-class X and the line Y from $a_Y^1$ to $a_Y^2$ represents the convex hull of
pattern-class Y. Since the lines X and Y intersect, the pattern space
is not linearly separable, and hence, it is impossible to draw a
discriminating hyperplane.
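
The intersection of the two convex hulls in this example can be checked numerically. The sketch below is only an illustration: the helper `convex_combination` and the choice of equal weights are assumptions made here, not the dissertation's procedure. It shows that the midpoint of line X coincides with the midpoint of line Y.

```python
def convex_combination(points, weights):
    """Weighted sum of points; weights must be non-negative and sum to one."""
    assert all(w >= 0 for w in weights) and abs(sum(weights) - 1.0) < 1e-12
    dim = len(points[0])
    return [sum(w * p[d] for w, p in zip(weights, points)) for d in range(dim)]


class_x = [[1, 0], [0, 1]]   # patterns of class X from the example
class_y = [[0, 0], [1, 1]]   # patterns of class Y from the example

u = [0.5, 0.5]               # equal weights, chosen for illustration
point_x = convex_combination(class_x, u)
point_y = convex_combination(class_y, u)

# The same point lies in both hulls, so no separating hyperplane exists.
print(point_x, point_y, point_x == point_y)
```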
Therefore, the following condition is equivalent to condition (4):
a vector X is feasible to P1 if and only if there do not exist $U^s$ and $U^t$
such that
$$U^s \hat{A}_s = U^t \hat{A}_t \quad \text{for any } s = 1,2,\dots,p;\ t = 1,2,\dots,p;\ s \neq t \tag{5}$$
where
$$U^i = [u_1^i, u_2^i, \dots, u_{m_i}^i],$$
$$u_k^i \geq 0 \quad \text{for all } k = 1,2,\dots,m_i,$$
and
$$\sum_{k=1}^{m_i} u_k^i = 1 \quad \text{for all } i = 1,2,\dots,p.$$
Checking the feasibility of some vector X by condition (5) yields
[p(p-1)]/2 distinct subproblems. Each of these subproblems may be


LIST OF FIGURES
Figures
1. Temporomandibular Joint 3
2. Diagnostic-Classification and Treatment-Planning
Process for Craniofacial Pain.... 7
3. Craniofacial-Pain Diagnostic Alternatives 18
4. Procedure 2 53
5. Diagnostic-Classification Transitions 64
6. Patient-Visit Inconvenience Cost 69
7. Application of the Modified Fixed-Increment Algorithm... 90
8. Multiple-State History-Augmented Process 115
viii


Given the training sample of the form $a = [a_1, a_2, 1]$ where
$a^1 = [0,0,1]$, $a^2 = [1,0,1]$, $a^3 = [0,1,1]$,
the training sample patterns can be represented in 3-dimensional
space by
The modified fixed-increment algorithm with $\alpha = 0$ and $\beta = 1$
proceeds as follows: (* indicates correct sample classification)

Sample      W1           W2           W3           g1   g2   g3
 [0,0,1]   [ 0, 0, 0]   [ 0, 0, 0]   [ 0, 0, 0]    0    0    0
 [1,0,1]   [ 0, 0, 2]   [ 0, 0,-1]   [ 0, 0,-1]    2   -1   -1
 [0,1,1]   [-1, 0, 1]   [ 2, 0, 1]   [-1, 0,-2]    1    1   -2
 [0,0,1]   [-1,-1, 0]   [ 2,-1, 0]   [-1, 2, 0]    0    0    0
 [1,0,1]   [-1,-1, 2]   [ 2,-1,-1]   [-1, 2,-1]    1    1   -2
*[0,1,1]   [-2,-1, 1]   [ 3,-1, 0]   [-1, 2,-1]    0   -1    1
*[0,0,1]   [-2,-1, 1]   [ 3,-1, 0]   [-1, 2,-1]    1    0   -1
*[1,0,1]   [-2,-1, 1]   [ 3,-1, 0]   [-1, 2,-1]   -1    3   -2

Hence, the set of weights generated by this training sample is
W1 = [-2,-1, 1]
W2 = [ 3,-1, 0]
W3 = [-1, 2,-1].
FIGURE 7
APPLICATION OF THE MODIFIED FIXED-INCREMENT ALGORITHM
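
The trace in Figure 7 can be reproduced with a short multi-class fixed-increment routine. This is a hedged sketch rather than the Appendix B algorithm itself: it assumes that the first parameter acts as a margin (a rival discriminant within that margin of the correct one counts as an error, so ties are errors at zero) and that the second is the size of the correction added to the correct class and subtracted from each offending class. The function name and the stopping rule are illustrative.

```python
def dot(w, a):
    return sum(wk * ak for wk, ak in zip(w, a))


def train_fixed_increment(samples, n_classes, alpha=0, beta=1, max_passes=100):
    """samples: list of (augmented pattern, zero-based class index) pairs."""
    dim = len(samples[0][0])
    weights = [[0] * dim for _ in range(n_classes)]
    consecutive_correct = 0
    for _ in range(max_passes):
        for pattern, true_class in samples:
            g = [dot(w, pattern) for w in weights]
            # Assumed error rule: a rival class violates when it is within alpha.
            violators = [j for j in range(n_classes)
                         if j != true_class and g[true_class] - g[j] <= alpha]
            if not violators:
                consecutive_correct += 1
                if consecutive_correct == len(samples):
                    return weights          # every sample classified correctly
                continue
            consecutive_correct = 0
            for j in violators:
                # Move the offending class away and the correct class toward the pattern.
                weights[j] = [wk - beta * ak for wk, ak in zip(weights[j], pattern)]
                weights[true_class] = [wk + beta * ak
                                       for wk, ak in zip(weights[true_class], pattern)]
    raise RuntimeError("no separating weights found within max_passes")


training_sample = [([0, 0, 1], 0), ([1, 0, 1], 1), ([0, 1, 1], 2)]
final_weights = train_fixed_increment(training_sample, n_classes=3)
print(final_weights)
```

Under these assumptions the routine stops after eight sample presentations with the three weight vectors listed in Figure 7.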


CHAPTER 4
TREATMENT PLANNING
The selection of treatment regimens for craniofacial-pain patients
is modeled as a Markovian decision process. The states in this
Markovian model are descriptions of a patient's health-care status and the
decision alternatives are feasible treatments for the patient's
dysfunction (see Section 4.1). In the first two sections of this chapter,
motivation for the model structure is provided and the components of
the decision model are developed. The third section provides a
description of the validating procedures used to determine the appropriateness
of the model and the model-generated treatment decisions. This chapter
closes with a discussion of potential teaching, research, and private
practice applications of the treatment-planning model.
4.1 Model Components
Several model-building components from the craniofacial-pain care
system are isolated to permit the construction of a Markovian
representation of this system. A set of state descriptions that characterize,
for decision-making purposes, the status of craniofacial-pain patients
is presented in Section 4.1.1. Then transition probabilities measuring
the effects of treatment applications are discussed in Section 4.1.2.
Section 4.1.3 overlays the model's state descriptions and transition
probabilities with costs accrued during the patient's progression through
the care system. These components are integrated and verified in the
discussions of Sections 4.2 and 4.3.


Step 2: If for some row k in $A_0$ ($B_0$) each element in the
        corresponding row of $B^c$ ($A^c$) is equal to zero, then row k of
        A and B can be eliminated by Condition 2.
Step 3: If for some row k in $A_0$ ($B_0$) the corresponding row in
        $B^c$ ($A^c$) has all elements equal to one, or if for some row
        k in $A_1$ ($B_1$) the corresponding row in $B^c$ ($A^c$) has all
        elements equal to zero, then this particular subproblem
        P2 has no feasible solution by Condition 1. Procedure 1
        and the search for a solution to P2 are terminated at
        this point because the convex hulls of pattern-classes
        A and B do not intersect.
Step 4: If for some row k in $A_1$ ($B_1$) the corresponding row in
        $B^c$ ($A^c$) has one or more elements equal to zero, i.e.,
        $b_r^k = b_s^k = \dots = b_t^k = 0$ ($a_r^k = a_s^k = \dots = a_t^k = 0$), then
        columns $b_r, b_s, \dots, b_t$ ($a_r, a_s, \dots, a_t$) can be eliminated by
        Condition 3.
Step 5: If for some row k in $A_0$ ($B_0$) the corresponding row in
        $B^c$ ($A^c$) has one or more elements equal to one, i.e.,
        $b_r^k = b_s^k = \dots = b_t^k = 1$ ($a_r^k = a_s^k = \dots = a_t^k = 1$), then
        columns $b_r, b_s, \dots, b_t$ ($a_r, a_s, \dots, a_t$) can be eliminated by
        Condition 4.
Step 6: If the use of Steps 1, 2, 4, and 5 has eliminated all
        elements of both matrices, then this particular subproblem
        has an infinite number of feasible solutions by Condition
        2. Procedure 1 and the search for a solution to P2 are
        terminated at this point because the convex hulls of the
        pattern-classes A and B intersect.


CHAPTER 3
DIAGNOSTIC CLASSIFICATION
The analytic model developed to provide diagnostic classifications
for craniofacial-pain patients is based on the principles employed in
non-parametric pattern classification. The patterns classified by this
diagnostic model are vector representations (see Section 3.1 and
Appendix A) of the craniofacial-pain patient's physical and emotional status.
In the first sections of this chapter the theoretical background for the
diagnostic model is established. This discussion is followed by a
presentation of the validation procedures used to evaluate model
performance. Next, an algorithm is developed to reduce the 'costs' associated
with model utilization. The chapter closes with a discussion of
potential applications of the craniofacial-pain diagnostic classifier in
teaching, in research, and in the health-care process.
3.1 Model Components
In the initial phase of the development of the
diagnostic-classification model a set of possible alternative diagnostic classifications
was established for craniofacial-pain patients. Figure 3 provides a
list of these possible classifications. Note that the alternative
classifications in Figure 3 are not mutually exclusive, as a craniofacial-pain
patient classified in some diagnostic alternative 'A' could also have
the disorder specified by some other diagnostic alternative 'B.'
However, for the purposes of this dissertation, each patient's diagnostic


TABLE OF CONTENTS
ACKNOWLEDGEMENTS iii
LIST OF TABLES vii
LIST OF FIGURES viii
ABSTRACT ix
Chapter
1. Introduction 1
1.1 Craniofacial Pain 2
1.2 Research Objective 8
1.3 Dissertation Overview 9
2. Previous Research 10
2.1 Bayesian Classification Models 10
2.2 Non-Parametric Classification Models 13
2.3 Finite-Horizon Treatment Planning 14
2.4 Uncertain-Duration Treatment Planning 15
3. Diagnostic Classification 17
3.1 Model Components 17
3.2 Alternative Interpretations of Linear Separability 26
3.3 Model Validation 31
3.4 Minimum-Cost Symptom-Selection Algorithm 36
3.4.1 Algorithm Development 39
v


category. The cost estimates reflect typical charges in a dental clinic
environment.
The inconvenience experienced by a patient in making a visit to the
practitioner was used as a measure of the cost of occupying a 'non-well'
patient state. Estimates of this inconvenience cost were gathered from
responses to a questionnaire completed by patients at the University of
Florida's Dental Clinic. These were general dental patients not
necessarily suffering from craniofacial pain. Figure 6 shows the distribution
of these patient estimates.
Values for patient-referral costs were composed of the sum of three
distinct estimates. The first component was an estimate of the total
fee charged by the practitioner receiving the referred craniofacial-pain
patient. Record transferral and duplication costs, as well as the fees
lost by the referring practitioner, formed the second component. The
third component of the patient-referral cost is a measure of the
inconvenience experienced by the referred patient, a value estimated by using
a multiple of the value of the inconvenience cost discussed in the
preceding paragraph. Appendix G provides a justification for using this
particular combination of components in the referred-cost estimates.
Symbolically, the patient-state transition costs (negative constants)
are represented in the analytical model as
$c_{IJ}^k$ = the sum of the costs generated by the transition
             from patient-state 'I' to patient-state 'J'
             following the application of treatment 'k.'
This sum includes the type (a), (b), (c), and (d) costs appropriate to
each patient-state transition.


APPENDIX G
PATIENT-STATE TREATMENT SELECTIONS

Patient State   Model Selection   Practitioner Selection
1               16                16
1116            16                16
2               24                24
2124            24                24
2124,24         24                Refer*+
3               23                23*
3112            Refer             12+
3123            41                41
3124            24                24
3132            32                32
3123,41         Refer             Refer
3132,32         Refer             Refer
4               35                35*
4120            Refer             Refer
4124            24                24
4134            34                34
4135            24                24
4120,20         Refer             Refer
4124,24         24                24
4124,35         24                24
4134,34         34                34
5               35                35
5116            Refer             17+
5124            17                17
5135            24                24
5136            24                24
5124,35         24                24
6               24                24
7               33                33
7112            24                24
7118            12                12
7133            24                24
7112,24         16                16
8               12                12
8112            12                12*
8118            12                12
8124            18                18
8112,12         16                16
8118,24         24                24
8134,41         12                12


Analysis by Sex

                        Male   Pre-menopausal   Menopausal or
                               Female           Post-menopausal Female
Transitions    8         0          1                0
into          13         1         11                2
              15         0          1                0
            Well         2         11                1

$\chi^2 = 1.250$ with 6 degrees of freedom
Hence, the analysis reveals that the sex of the patient is not significant
in determining estimates of transition probabilities out of Diagnostic
Alternative 13 following application of treatment 24, as $\chi^2_{.05,6} = 12.592$.
Analysis by Age Group

                        20-39 Years   40-55 Years
Transitions    8             0             1
into          13             8             6
              15             0             1
            Well             7             7

$\chi^2 = 2.286$ with 3 degrees of freedom
Hence, the analysis reveals that age group of the patient is not
significant in determining estimates of transition probabilities out of
Diagnostic Alternative 13 following application of treatment 24, as $\chi^2_{.05,3} = 7.815$.


Patient State   Model Selection   Practitioner Selection
16              25                25
17              Refer             Refer

Note: * indicates that the reviewing practitioners made their selection
        of treatment from a set of alternatives that did not include
        the 'most appropriate' treatment alternative (Section 4.3)
      + indicates a difference between the treatment selections made
        by the treatment-planning model and the reviewing practitioners.
The model-generated treatment selections agree with the reviewing
practitioners' selections in 87 out of 94 patient states. This represents a
92.6% agreement between the two sources of treatment selections.
In terms of the craniofacial-pain treatment-planning model's
treatment selections, the patient-referral costs are the most significant of
the model's components. For each of the model's patient states, the
cost of referral out of that state is used in making the decision whether
to continue treatment for a patient, or whether to suggest that he go to
another source of care. If this cost is set too low, then patients who
should be treated in the craniofacial-pain care system are inappropriately
referred out of the system. On the other hand, too high a referral cost
leads the model to suggest that patients remain in this care system when
it would be to their advantage to seek care elsewhere. For these reasons,
this cost was the subject of considerable examination in the building of
the treatment-planning model.
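
The sensitivity of the treat-or-refer decision to the referral cost can be illustrated with a deliberately small Markov decision model. Everything in the sketch below is hypothetical: a single 'non-well' state, a per-visit cost, a probability of becoming well after each treatment, and the two referral costs are invented for illustration and are not taken from the dissertation's data. Costs enter as negative rewards, and 'well' and 'referred' are treated as absorbing states with value zero.

```python
def plan_for_non_well_state(visit_cost, p_well, referral_cost, sweeps=200):
    """Value iteration for one 'non-well' state with two decisions.

    treat: pay visit_cost, become well with probability p_well, else stay.
    refer: pay referral_cost and leave the care system.
    """
    value = 0.0
    for _ in range(sweeps):
        treat_value = -visit_cost + (1.0 - p_well) * value
        refer_value = -referral_cost
        value = max(treat_value, refer_value)
    treat_value = -visit_cost + (1.0 - p_well) * value
    decision = "treat" if treat_value > -referral_cost else "refer"
    return decision, value


# Hypothetical numbers: each visit costs 10 and cures with probability 0.5,
# so continued treatment has an expected total cost of 10 / 0.5 = 20.
high_referral = plan_for_non_well_state(10.0, 0.5, referral_cost=30.0)
low_referral = plan_for_non_well_state(10.0, 0.5, referral_cost=15.0)
print(high_referral, low_referral)
```

With the referral cost above the expected cost of continued treatment the sketch keeps the patient in the system; with the referral cost below it, the patient is referred, which mirrors the trade-off described above.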
The reviewing practitioners suggested three possible alternative
formats for the cost of referring a patient out of the craniofacial-pain care
system. These were:


and for the states of set $B_j$, $j = 1,2,\dots,T$,
where t = the number of the last non-group-$B_j$ state
          immediately preceding the smallest number-labeled
          state in $B_j$.
Thus, the process of selecting optimal treatments proceeds
recursively from the state of smallest number-label to the one of largest
number-label, stopping to consider simultaneously the values of a number
of states only when an analysis set is encountered.
Howard's value iteration and policy improvement algorithm [25] is
employed only in the case of selecting treatments for the analysis-set
patient states. An example of this section's labeling and optimization
procedure is presented in Appendix H.
This optimization procedure was applied to the states of the
craniofacial-pain treatment-planning model. Appendix G presents a list of the
optimal treatment selections for each of the model's patient states.
4.3 Model Validation
Validation of the craniofacial-pain treatment-planning model was
accomplished in two phases. In the first phase of validation, the
individual components of the Markovian representation were examined by the
reviewing practitioners. The second phase of model validation compared
model-generated treatment decisions with those made by the reviewing
experts. In addition, statistics generated by the model were compared to
the care-system description provided by the patient records from the
university dental clinics. This section discusses the results of these
validating efforts.


3.4 Minimum-Cost Symptom-Selection Algorithm
The craniofacial-pain diagnostic-classification model detailed in
the previous sections of this chapter has been structured upon the data
vector of the 295 relevant signs, symptoms, and items of patient history
shown in Appendix A. To utilize this model, the practitioner must
examine a patient for the presence or absence of each of these data-vector
elements. Although the cost in time and fees varies from item to item,
there is an expense to the practitioner, and to the patient, associated
with checking each element in the data vector. Hence, it is logical to
investigate the possibility of finding a reduced data vector that 'costs'
less for the patient and practitioner to use and yet still permits
correct classification of all craniofacial-pain patients.
A review of the literature (see Meisel [23], Chapter 9, for a survey)
reveals that many authors have considered the task of selecting a set
of features to be used in a pattern-classification scheme. Traditional
methods of viewing this problem are based on a search for a
transformation that takes a given set of patterns into some 'new' pattern space
where separation by discriminant functions is possible. Measures of
pattern-class separability are employed to evaluate the effects of
transforming the set of patterns from one space to another. In general,
these transformations take a pattern representation in 'n' features and
create a set of 'r' (r < n) 'new' features that are linear combinations
of the original features. However, to reduce the 'costs' associated with
using the craniofacial-pain diagnostic classifier, a transformation must
be found that decreases the size of the data-vector pattern space by
eliminating features rather than combining them. For example, assume
patients were diagnosed on


Given this summary statement, there are several logical extensions
to this dissertation's research that should be examined in future
investigations. The following suggestions identify some of the more fruitful
areas for further research efforts. These suggestions are ordered
according to the author's view of their significance.
1. This dissertation's research found that first-order
decision-making models are valid descriptions of the underlying thought processes
employed by the craniofacial-pain practitioner. It is possible that these
first-order descriptive decisions are 'suboptimal' and that higher-order
decision-making tools might yield prescriptive, or 'optimal,' diagnostic
classifications and treatment plans for craniofacial-pain patients. That
is, considering the interaction between significant symptoms and
multiple-state dependency for patient-state transitions may lead to optimal
diagnostic and treatment-selection decisions. As the models themselves can
readily be increased in their decision-making 'order,' an investigation
into this possibility would be hampered only by the necessity of
collecting an elaborate data base. Nevertheless, such an investigation should
be undertaken in this, the most significant, of future research areas.
2. As this dissertation's analytic models can be applied directly
to any health-care problem where there is verification that practitioners
make first-order decisions, one potential avenue of future research would
be to isolate those health environments where these kinds of decisions
are made. However, a word of caution is interjected at this point.
Mathematical modeling demands an underlying structure for the process being
modeled. Yet, in a process dealing with a product that is subject to
considerable variation, such as the care of a patient in a health-care
system, isolating an underlying process structure is difficult. Moreover,


3.3 Model Validation
Validation of the craniofacial-pain diagnostic-classification
model presented in Section 3.1 has been accomplished by three types of
validating procedures. The discussion presented in the preceding
sections, and in particular the relationship between significant symptoms
and their associated weights shown in Table 2, reveals a close proximity
between the decision-making process the practitioner utilizes and the
non-parametric classifier's symptom-weighing scheme. This section
presents two other procedures employed in evaluating the
diagnostic-classification model's performance.
The first procedure involved testing the diagnostic accuracy of
the classification model on patient data that were not employed in model
construction. Six classification tests were run in sequential order.
In the first five of these tests random samples of 50
patient-data-vectors were drawn from the data base of 480 vectors discussed in Section
3.1. Then, as each of the tests was performed, the training algorithm
in Appendix B was applied to the remaining 430 data vectors. With the
weights derived from the training algorithm, the sample of 50 patients
was classified. The model-generated classifications for each of the
data vectors were compared to the classifications assigned to the vectors
when they were created. As each test classification of a sample was
completed, the diagnostic classifier's discriminant-function weights were
set equal to zero, the sample of data vectors was returned to the data
base, and the next test's random sample was drawn. A summary of the
results of these tests of diagnostic accuracy is presented in Table 3.
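
The repeated hold-out procedure described above can be sketched as follows. The function name `run_holdout_tests`, the `train` and `classify` callables, and the toy data are hypothetical stand-ins introduced for illustration; the dissertation's own training algorithm is the one in Appendix B and is not reproduced here.

```python
import random


def run_holdout_tests(data, train, classify, n_tests=5, sample_size=50, seed=0):
    """Return the fraction of held-out vectors classified correctly in each test."""
    rng = random.Random(seed)
    accuracies = []
    for _ in range(n_tests):
        held_out = set(rng.sample(range(len(data)), sample_size))
        training = [item for i, item in enumerate(data) if i not in held_out]
        testing = [data[i] for i in sorted(held_out)]
        weights = train(training)        # weights are rebuilt from scratch each test
        correct = sum(classify(weights, vector) == label
                      for vector, label in testing)
        accuracies.append(correct / sample_size)
        # `data` is never modified, so the sample is back in the pool for the next draw.
    return accuracies


# Toy stand-ins: 480 labelled vectors whose single component equals the label.
toy_data = [([i % 3], i % 3) for i in range(480)]


def toy_train(training):
    return None                          # placeholder for a real training algorithm


def toy_classify(weights, vector):
    return vector[0]                     # placeholder rule that reads off the label


toy_accuracies = run_holdout_tests(toy_data, toy_train, toy_classify)
print(toy_accuracies)
```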
In each of the first five tests it was possible for a patient who
has had multiple practitioner-visits to have some of the vectors repre-


This research objective will be met by developing:
1. A diagnostic-classification model based on the theory
   of non-parametric pattern classification, with
   a. criteria for applicability of the modeling technique
      to diagnostic classification
   b. model validation for craniofacial-pain patients
   c. development of a minimum-cost symptom-selection
      algorithm
2. A Markovian representation of the treatment-selection
   process, with
   a. justification for utilizing a Markovian model of
      the underlying care system
   b. model validation for craniofacial-pain patients
3. A description of potential model applications in teaching,
   research, and practice.
1.3 Dissertation Overview
In Chapter 1 the motivation and scope of this dissertation were
presented. Chapter 2 provides a review of literature relevant to the
diagnostic and treatment-selection processes. A model of the
diagnostic-classification process is developed in Chapter 3. Chapter 4 follows
with an analytic representation of the treatment-planning process.
Conclusions derived from this model-building effort, and suggestions for
future research, are presented in Chapter 5.


Analysis by Number of Replications of Treatment 24

                        1st Application   2nd Application   3rd Application
Transitions    8               1                 0                 0
into          13               9                 5                 0
              15               1                 0                 0
            Well               6                 3                 5

$\chi^2 = 8.099$ with 6 degrees of freedom
Hence, the analysis reveals that the number of replications is not
significant in determining estimates of transition probabilities out of
Diagnostic Alternative 13 following application of treatment 24, as
$\chi^2_{.05,6} = 12.592$.
Thus, for Diagnostic Alternative 13, the five factors of patient
variation do not affect transition-probability estimates of the
effectiveness of treatment 24.
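
The chi-square statistic reported for the replication analysis can be recomputed from the table of observed transitions. The sketch below uses the standard Pearson statistic for a contingency table; the function name `chi_square` is an illustrative choice, and the critical value 12.592 is the one the text cites for 6 degrees of freedom.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Rows: transitions into 8, 13, 15, Well; columns: 1st, 2nd, 3rd application.
replications = [
    [1, 0, 0],
    [9, 5, 0],
    [1, 0, 0],
    [6, 3, 5],
]

statistic, dof = chi_square(replications)
critical_value = 12.592   # value quoted in the text for 6 degrees of freedom
print(f"chi-square = {statistic:.3f} with {dof} degrees of freedom")
print("significant" if statistic > critical_value else "not significant")
```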
The last type of analysis performed, analysis by number of treatment
replications, established treatment-application boundary numbers. If
the analysis revealed no significant effect for differences in treatment
repetitions, then the boundary number for the treatment was set at zero
or one by the reviewing practitioner. Note that a zero
boundary-application number for a treatment alternative implies that a record of that
treatment provides no additional information about the patient's
progression through the care system, and, therefore, the treatment-planning model
does not add a record of the treatments to its patient-state descriptions.
If the analysis revealed a significant effect for treatment repetitions,
the reviewing practitioners examined the data-based estimates of
transition probabilities associated with multiple repetitions of the treatment




Treatment Application   Treatments
Number
34                      Drug Therapy, Heat Therapy, and Physical Therapy
35                      Drug Therapy, Occlusal Adjustment, and Physical Therapy
36                      Fixation, Heat Therapy, and Physical Therapy
41                      Drug Therapy, Fixation, Heat Therapy, and Physical Therapy




The review of model components was accomplished as values for the
model parameters were collected. Some of the data-based estimates of
transition probabilities and boundary-level application numbers did not
conform to expert judgment about the effects and effectiveness of
various treatment applications. When these disparities occurred, the
estimates were modified to reflect expert judgment.
The general structure of the patient states was reviewed to ensure
that the representation shown in Appendix F did in fact portray a set of
logical progressions through the care system. Although this examination
established the validity of the patient progressions, the review did
point out one deficiency in the model's structure. The number and types
of treatment alternatives available for use at each patient state were
determined by records of actual applications of these treatments in the
data used for model construction. It was the judgment of the reviewing
practitioners that in several cases the selection of treatment
alternatives for a patient state did not include the 'most appropriate'
treatment alternative. Nevertheless, this model deficiency can readily be
corrected. With the collection of data on the effects of these 'most appropriate'
treatments, these additional treatment alternatives can be incorporated
as decision alternatives for the patient states in question.
The reviewing practitioners made selections of treatments for each
of the model's patient states. In those cases where the model's
treatment alternatives did not include the practitioners' 'most appropriate'
choice of treatments, the practitioners made a selection from the same
list of alternatives used by the model. Appendix G lists their choices
of treatment along with each model-generated selection. The two sets of
treatment plans include the same treatment selection for 87 out of 94


application techniques that should be emphasized in training dental
students in craniofacial-pain care. Moreover, the parameters employed in
model development, in particular the transition probabilities and
referral costs, are themselves valuable instructional materials in developing
the dental student's treatment-selection skills.
The treatment-planning model provides a method for evaluating new
developments in treatment for craniofacial-pain patients. With estimates
of the effectiveness of his new treatment, the researcher can use the
craniofacial-pain treatment-planning model to get two immediate responses.
First, the optimization technique of Section 4.2 will determine if this
new treatment provides 'better care' for the patient than any of the
other treatment alternatives the model has to choose from. Second, if
optimal treatment selections for the model include the new treatment, the
model's statistics will show improvement in length of stay, and other
relevant measures of treatment effectiveness, introduced by using this
new treatment.
In the office of the practicing dentist, the treatment-planning
model's decisions could provide a concise reference of the treatment
selections suggested by experts in the field of craniofacial pain. Moreover,
the practitioner would have a chance to contribute to the refinement of
the listing as the treatment records of his patients could supplement
the data used in model construction. In addition, the practitioner could
employ the statistics associated with the treatment-planning model in
scheduling the length, and number, of his appointments for
craniofacial-pain patients.


where $v_r > 0$ and $\sum_{j=1}^{m_j} v_j = 1$,
$$\sum_{j=1}^{m_j} v_j b_j^k > \sum_{i=1}^{m_i} u_i a_i^k = 0$$
for any choice of $u_i$ such that $u_i \geq 0$ and $\sum_{i=1}^{m_i} u_i = 1$.
Hence, if $v_r > 0$, there exist no $u_i \geq 0$, $\sum_{i=1}^{m_i} u_i = 1$
and $v_j \geq 0$, $v_r \neq 0$, $\sum_{j=1}^{m_j} v_j = 1$ such that
$$\sum_{i=1}^{m_i} u_i a_i = \sum_{j=1}^{m_j} v_j b_j .$$
Condition 4:
If the $k$th row of A has all elements $a_i^k$, $i = 1,2,\dots,m_i$,
equal to one, and some $b_r^k$ equals zero,
no $u_i \geq 0$, $\sum_{i=1}^{m_i} u_i = 1$ and $v_j \geq 0$, $v_r \neq 0$, $\sum_{j=1}^{m_j} v_j = 1$
exist such that
$$\sum_{i=1}^{m_i} u_i a_i = \sum_{j=1}^{m_j} v_j b_j .$$
Justification 4:
Condition 4 is similar to Condition 3 in that any
convex combination of the rows of B that includes a
non-zero product of the $r$th column yields a $k$th row
term whose value cannot equal any convex combination
of the $k$th row elements of A. Symbolically,
for any choice of $u_i$ and $v_j$, where $v_r > 0$,
$$\sum_{j=1}^{m_j} v_j b_j^k < \sum_{i=1}^{m_i} u_i a_i^k = 1 .$$


that they might readily be employed by some future investigator. Actual
applications of the models should yield significant contributions to
the effectiveness of the teacher, researcher, and practitioner.


117
where r.
*1
10
z
J=1
*1
u u


($a_r = 1$) or absence ($a_r = 0$) of patient-data-vector item r; and the
$w_{jk}$, $k = 1,2,\dots,n+1$, are constants associated with the $j$th discriminant
function called 'weights.' These discriminant-function weights, $w_{jk}$,
$j = 1,2,\dots,p$, $k = 1,2,\dots,n+1$, provide an analytic means of duplicating
the correct classification of each pattern observed by the non-parametric
classifier. They provide a link between a pattern's correct
classification and the individual components of the pattern's vector
representation. In essence, each discriminant's weights are additive elements
whose component sums have significance in terms of isolating a pattern's
correct classification. These weights are a mathematical means of
storing information already known about the correct classification of observed
pattern vectors. Moreover, the weights can be interpreted from the
point of view of the significance that the practitioner places on each
data-vector component. A discussion of this interpretation of the
discriminant-function weights appears in Section 3.2.
Central to the use of linear discriminant functions is the
assumption that the space of observable patient data vectors is linearly
separable, for by definition [24],
    a pattern space A is linear and its subsets of patterns
    $A_1, A_2, \dots, A_p$ are linearly separable if and only if linear
    discriminant functions exist such that
        $g_i(a) > g_j(a)$ for all $a$ in $A_i$
    for all $i = 1,2,\dots,p$, $j = 1,2,\dots,p$, $j \neq i$.
In the context of diagnostic classification, the assumption of linear
separability implies that there exists a set of hyperplanes that
partition the space of observable patient data vectors into convex homogeneous
regions, each region representing a unique diagnostic classification.
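
Classification with linear discriminant functions amounts to evaluating each weighted sum and choosing the largest. The sketch below is a minimal illustration: the function names are hypothetical, the constant 1 appended to each pattern is assumed to correspond to the (n+1)th weight, and the weights are the three vectors reported in the dissertation's Figure 7 example.

```python
def discriminant_values(weights, pattern):
    """Evaluate every linear discriminant on the pattern, augmented with a constant 1."""
    augmented = list(pattern) + [1]
    return [sum(w * a for w, a in zip(row, augmented)) for row in weights]


def classify(weights, pattern):
    """Return the 1-based index of the discriminant with the largest value."""
    g = discriminant_values(weights, pattern)
    return max(range(len(g)), key=lambda j: g[j]) + 1


# Weight vectors from the Figure 7 example (one row per diagnostic class).
figure_7_weights = [
    [-2, -1, 1],
    [3, -1, 0],
    [-1, 2, -1],
]

labels = [classify(figure_7_weights, p) for p in ([0, 0], [1, 0], [0, 1])]
print(labels)
```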


algorithm does find the minimum-cost collection of features X* and the
total cost associated with using these features, and guarantees the
k
existence of iA vectors associated with this optimal feature set. Given
this guarantee, the modified fixed-increment algorithm frcm Appendix B
*
can be employed to find the vectors A, i=l,2,...,p.
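
For orientation, the sketch below shows the standard multi-class fixed-increment error-correction rule in Python. It is an assumption-laden illustration only: the modified algorithm of Appendix B is not reproduced in this section, so this is the textbook rule rather than the author's variant, and the names `discriminants` and `train_fixed_increment` and the three-pattern training sample are hypothetical.

```python
def discriminants(weights, pattern):
    """Evaluate every linear discriminant on a pattern augmented with a constant 1."""
    y = list(pattern) + [1.0]
    return [sum(w * a for w, a in zip(ws, y)) for ws in weights]


def train_fixed_increment(patterns, labels, num_classes, max_passes=1000):
    """Standard multi-class fixed-increment rule (not the Appendix B variant)."""
    dim = len(patterns[0]) + 1
    weights = [[0.0] * dim for _ in range(num_classes)]
    for _ in range(max_passes):
        corrections = 0
        for pattern, i in zip(patterns, labels):
            y = list(pattern) + [1.0]
            g = discriminants(weights, pattern)
            # Rival classes whose discriminant is not strictly below the correct one.
            rivals = [j for j in range(num_classes) if j != i and g[j] >= g[i]]
            if rivals:
                j = max(rivals, key=lambda c: g[c])
                weights[i] = [w + a for w, a in zip(weights[i], y)]  # reinforce correct class
                weights[j] = [w - a for w, a in zip(weights[j], y)]  # penalize strongest rival
                corrections += 1
        if corrections == 0:
            return weights  # every training pattern is now classified correctly
    raise RuntimeError("no separating weights found within max_passes")


# Hypothetical training sample: three classes over two binary symptoms.
patterns = [(0, 0), (1, 0), (0, 1)]
labels = [0, 1, 2]
weights = train_fixed_increment(patterns, labels, num_classes=3)
predicted = [max(range(3), key=lambda c: discriminants(weights, p)[c])
             for p in patterns]
print(predicted)  # [0, 1, 2]
```

The loop terminates only when a full pass makes no correction, which is why a guarantee that separating weights exist matters before training is attempted.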
Choose some solution to P1. By hypothesis there exists at least
one solution $(X, W_1, W_2, \ldots, W_p)$ to P1 where $X = [1,1,\ldots,1,1]$.
Suppose there is some other solution $(\hat{X}, \hat{W}_1, \hat{W}_2, \ldots, \hat{W}_p)$
where one or more elements $\hat{x}_k$ in the $\hat{X}$ vector are equal to zero.
For the constraint matrices in P1,

    $A_i [D_{\hat{X}} (\hat{W}_i - \hat{W}_j)] > 0$,   $i = 1,2,\ldots,p$,  $j = 1,2,\ldots,p$,  $j \neq i$.

If the matrix products $[A_i D_{\hat{X}}] = \hat{A}_i$, $i = 1,2,\ldots,p$, are
constructed, then each set of constraints in P1 can be written in the form

    $\hat{A}_i (\hat{W}_i - \hat{W}_j) > 0$,   $i = 1,2,\ldots,p$,  $j = 1,2,\ldots,p$,  $j \neq i$.   (4)

The creation of the $\hat{A}_i$ is called the zeroing process. Of the columns
of $A_i$, $\hat{A}_i$ retains all columns $j$ of $A_i$ where $\hat{x}_j = 1$, and
substitutes a column of zeros for each of those columns $k$ in $A_i$ where
$\hat{x}_k = 0$. Using the zeroing process, the feasibility of any possible
solution vector $X$ to P1 can be examined in terms of the $A_i D_X$ this
vector $X$ creates.
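
The zeroing process amounts to multiplying a pattern matrix by a diagonal 0/1 matrix. A minimal Python sketch follows, under the assumption that each row of the matrix is one pattern and each column one feature, and that $D_X$ is the diagonal matrix built from the feature-selection vector; the function name `zero_columns` and the sample values are hypothetical.

```python
def zero_columns(matrix, x):
    """Return matrix * D_x: keep column k where x[k] == 1, zero it where x[k] == 0."""
    return [[value * keep for value, keep in zip(row, x)] for row in matrix]


# Hypothetical pattern matrix (rows are patterns, columns are features).
A_i = [[1, 0, 1],
       [0, 1, 1]]
x_hat = [1, 0, 1]  # the second feature is dropped from the feature set

A_i_hat = zero_columns(A_i, x_hat)
print(A_i_hat)  # [[1, 0, 1], [0, 0, 1]]
```

Because a dropped feature contributes a column of zeros, its weight no longer affects any constraint, so feasibility can be tested on the zeroed matrix alone.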
As an example of the zeroing process for a particular set of patterns,
let $a^i$ be a two-dimensional patient-data-vector $a^i = [a_1^i, a_2^i]$ where

    $a_1^i = 0$ if patient $i$ has normal body temperature,
    $a_1^i = 1$ if patient $i$ has abnormal body temperature,

and


Here it is possible for the patient to alternate among several
diagnostic classifications during the course of his stay in the
care system. Note that in both formats for diagnostic-classification
transitions a patient moves into the referred state not as a result of
a treatment application, but rather as an alternative to further
treatment.
To these underlying diagnostic-classification transitions the
craniofacial-pain treatment-planning model adds a record of the changes in
treatment history. Appendix F displays complete charts of all of the
diagnostic-alternative-based patient states included in the
treatment-selection model. In these charts the patient states are connected by
arcs that represent feasible transitions from one state to another. Not
shown in the charts are the well and referred patient states and the arcs
that connect every diagnostic-alternative-based state with these terminal
states.
Howard [25] establishes that in terms of the policy decisions
generated by a Markovian decision model, holding-time distributions are
important only insofar as they affect the mean waiting time in each
system state and the expected costs of each state occupancy. The records
of the patient visits employed in model construction revealed that, in
the care of the patients described by the data, one or more treatments
were prescribed at each visit, and a series of return visits was scheduled
for the patient following his initial interaction with the practitioner
if return visits were warranted. Under these conditions, specifying
holding-time distributions for the time between successive patient-state
transitions does not refine the model. Therefore, the treatment-planning
model employs a Markovian rather than semi-Markovian representation of


List of Drugs Taken
    034 Mild Analgesics: Aspirin, APC, etc.
    035 Moderate Analgesics (non-narcotic)
    036 Strong Analgesics: Narcotics and Synthetic Narcotics
    037 Anti-anxiety Agents: Mellaril, etc.
    038 Anti-arthritic Agents: Steroids, etc.
    039 Anti-depressives: Tofranil, etc.
    040 Birth Control Pills
    041 Hormone Preparations
    042 Anti-inflammatory Agents
    043 Muscle Relaxants: Valium
    044 Muscle Relaxants: Meprobamate
    045 Muscle Relaxants: Others
    046 Sedatives: Barbiturates, etc.
    047 Other Drugs
History of Trauma
    048 Accidental
    049 Factitial
    050 Surgical
Location of Swelling
    Side


46
m. m.
E1 u.a. = E3 v.b. ,
. 11 3 ]
i=l 3=1 J J
for any choice of
m.
u.>0, E1
u.=l
o
A
>
m.
E3
i=l
1
3
j=l
II
H
m.
= E3
j=l
v.b.
3 3
if and only if
*
E u.a.. = EJ v.b.. .
i=l 3k
Condition 3:      If the $k$th row of $A$ has all elements $a_i^k$, $i = 1,2,\ldots,m_i$,
                  equal to zero, and some $b_r^k$ equals one,
                  no $u_i \geq 0$, $\sum_{i=1}^{m_i} u_i = 1$, and $v_j \geq 0$, $v_r > 0$,
                  $\sum_{j=1}^{m_j} v_j = 1$ exist such that
                      $\sum_{i=1}^{m_i} u_i a_i = \sum_{j=1}^{m_j} v_j b_j$.
Justification 3:  Under Condition 3 any convex combination of the columns
                  of $B$ that includes a non-zero product of the
                  column $b_r$ results in a $k$th row term greater than zero.
                  The value of the $k$th row term for any convex combination
                  of the columns of $A$ is equal to zero. Hence, no
                  set of convex combinations of the columns of $A$ and $B$
                  can be equal if the combination for $B$ includes a
                  specification that $v_r > 0$. Symbolically,
                  if $v_r > 0$,
                  then for any choice of $v_j$, $j = 1,2,\ldots,m_j$, $j \neq r$,


APPENDIX A
CRANIOFACIAL-PAIN PATIENT DATA VECTOR
Referral Through
    001 Medical GP
    002 Medical Specialist
    003 Dental GP
    004 Dental Specialist
Sex
    005 Male
    006 Female
    007 Female, menopausal or post menopausal
Age Group
    008 0 - 19
    009 20 - 39
    010 40 - 55
    011 56 up
Duration of Pain
    012 Less than 3 weeks
    013 From 3 to 6 weeks
    014 More than 6 weeks
    015 Episodic
Character of Pain
    016 Aching
    017 Burning
    018 Cutting
    019 Discomfort
    020 Dull
    021 Pressure
    022 Pricking
    023 Sharp
    024 Soreness
    025 Stinging
    026 Tenderness
    027 Throbbing
Change in Character of Pain
    028 Constantly getting worse
    029 Got worse, then plateaued
    030 Got worse, plateaued, then better
    031 Getting better
    032 Intermittent periods without pain
    033 No change since beginning


The data collected on craniofacial-pain patient progressions
through the care system reveal that both prolonged occupation of a
single diagnostic state and return visits to the same state occur
frequently. Moreover, as will be discussed in Chapter 4, there are several
characteristics of the craniofacial-pain care system that permit
reductions in the number of input parameters required for a transient
Markovian model of this system. Therefore, an uncertain-duration transient
Markovian representation of the health-care process has been selected as
the means of evaluating the effectiveness of alternative treatment
regimens on patients with craniofacial pain.


ANALYTICAL MODELS FOR DIAGNOSTIC CLASSIFICATION AND
TREATMENT PLANNING FOR CRANIOFACIAL PAIN
By
Michael Steven Leonard
A DISSERTATION PRESENTED TO THE GRADUATE COUNCIL OF
THE UNIVERSITY OF FLORIDA IN PARTIAL
FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF
DOCTOR OF PHILOSOPHY
UNIVERSITY OF FLORIDA
1973

To my wife,
Mary

ACKNOWLEDGEMENTS
Without the considerable contributions of time and effort by the
members of his committee, it would have been impossible for the author
to have completed this dissertation. In particular, the author expresses
gratitude to his Chairman, Dr. Kerry Kilpatrick, for his encouragement
and direction during the course of this research effort. The author also
thanks Dr. Kilpatrick for his editorial assistance during the development
and organization of this manuscript. The author thanks Dr. Richard
Mackenzie and Dr. Stephen Roberts for providing the initial direction for
this research. Additionally, the author is grateful to Dr. Thom Hodgson
and Dr. Donald Ratliff for their assistance in evaluating and refining
the author's ideas throughout this project. The author expresses his
gratitude to Dr. Thomas Fast and Dr. Parker Mahan for the contribution of
their extensive knowledge about craniofacial pain to the author's research.
The author is deeply appreciative of Dr. Fast's and Dr. Mahan's
willingness to spend many hours examining dental records and their endurance
of the nomenclature and idiosyncrasies of this mathematical-modeling effort.
Financial support for this research was provided by the Health Systems
Research Division, J. Hillis Miller Health Center. The division's
support in conjunction with a traineeship granted by the National Science
Foundation made it possible for the author to undertake this research.
The author is also grateful to the Industrial and Systems Engineering
Department for the contribution of computer funds. Additionally the
author thanks Dr. William Solberg, University of California at Los Angeles;
Dr. Daniel Laskin, University of Illinois; and Dr. David Mitchell,
University of Indiana, for providing access to the patient records
employed in this modeling effort.
The author would like to express his thanks to the secretarial staff
of the Health Systems Research Division for their translation of the
author's 'first-order' approximation to handwriting into a draft of this
manuscript. Their tolerance of a multitude of last minute changes made
by the author has been appreciated.
Finally, the author thanks his wife, Mary, and his parents, Dorothy
and Charles Leonard, for their encouragement and support throughout the
course of this research.
M.S.L.
August, 1973

TABLE OF CONTENTS
ACKNOWLEDGEMENTS
LIST OF TABLES
LIST OF FIGURES
ABSTRACT
Chapter
1. Introduction
    1.1 Craniofacial Pain
    1.2 Research Objective
    1.3 Dissertation Overview
2. Previous Research
    2.1 Bayesian Classification Models
    2.2 Non-Parametric Classification Models
    2.3 Finite-Horizon Treatment Planning
    2.4 Uncertain-Duration Treatment Planning
3. Diagnostic Classification
    3.1 Model Components
    3.2 Alternative Interpretations of Linear Separability
    3.3 Model Validation
    3.4 Minimum-Cost Symptom-Selection Algorithm
        3.4.1 Algorithm Development
        3.4.2 Statement of the Minimum-Cost Symptom-Selection Algorithm
        3.4.3 Computational Considerations
    3.5 Model Applications
4. Treatment Planning
    4.1 Model Components
        4.1.1 Patient States
        4.1.2 Transition Probabilities
        4.1.3 Cost Structure
    4.2 Selection of Optimal Treatments
    4.3 Model Validation
    4.4 Model Applications
5. Conclusions and Future Research
Appendices
A Craniofacial-Pain Patient Data Vector
B Modified Fixed-Increment Training Algorithm
C Application of the Minimum-Cost Symptom-Selection Algorithm
D Treatment Alternatives for Craniofacial-Pain Patients
E Stability of Transition-Probability Estimates
F Flow Charts of Patient-State Transitions
G Patient-State Treatment Selections
H Application of the Patient-State-Labeling and Optimal-Treatment-Selection Procedure
BIBLIOGRAPHY
BIOGRAPHICAL SKETCH

LIST OF TABLES
Tables
1. Survey of Diagnostic-Classification Models
2. Correlation Between Significant Symptoms and Discriminant-Function Weights
3. Tests of Diagnostic Classifier Accuracy
4. Classification Variability Among Dental Practitioners
5. Mean Transit Times Through the Craniofacial-Pain Care System

LIST OF FIGURES
Figures
1. Temporomandibular Joint
2. Diagnostic-Classification and Treatment-Planning Process for Craniofacial Pain
3. Craniofacial-Pain Diagnostic Alternatives
4. Procedure 2
5. Diagnostic-Classification Transitions
6. Patient-Visit Inconvenience Cost
7. Application of the Modified Fixed-Increment Algorithm
8. Multiple-State History-Augmented Process

Abstract of Dissertation Presented to the
Graduate Council of the University of Florida in Partial
Fulfillment of the Requirement for the Degree of Doctor of Philosophy
ANALYTICAL MODELS FOR DIAGNOSTIC CLASSIFICATION AND
TREATMENT PLANNING FOR CRANIOFACIAL PAIN
By
Michael Steven Leonard
December, 1973
Chairman: Dr. Kerry E. Kilpatrick
Major Department: Industrial and Systems Engineering
This dissertation presents a systematic approach to craniofacial-pain
diagnosis and treatment planning using analytic models of the underlying
decision-making processes. Patient diagnoses are generated by a
linear pattern-recognition classifier trained with a sample of
preclassified craniofacial-pain patient data. For this classifier, an
algorithm is developed that minimizes the total cost of the set of features
employed in the classifying process. Diagnostic classifications, augmented
by a history of prior treatment applications, provide the state descriptions
for a Markovian decision model of the treatment-planning process.
Craniofacial-pain patient records from four university dental clinics serve
as a data base for model construction and validation.
The analytic models provide a means of duplicating the diagnostic
classifications and treatment plans of experts. Approximately 90% of
the diagnostic classifier's classifications and 93% of the
treatment-planning model's treatment selections concurred with the decisions
made by experts in the field of care for craniofacial-pain patients. Moreover,
the models permit an examination of the critical considerations associated
with both decision-making processes. These capabilities are discussed
in terms of applications of the models in teaching, research, and in the
practice of dentistry.

CHAPTER 1
INTRODUCTION
The rapid pace of developments in medical and dental research
prevents the practicing physician and dentist from fully utilizing each new
diagnostic and treatment-planning aid as it is published. In each of the
last four years an average of 215,000 new publications have been written
to supplement the knowledge of the health-care practitioner [1].
Concurrently, the pressures of an ever-increasing patient load force
practitioners to select the most expeditious means for diagnosing disorders
and selecting treatments. For example, the medical general-practitioner
(1970) saw an average of 173 patients a week [2], and the median dental
practitioner (1971) saw two patients an hour [3]. Given these
circumstances, practitioners may overlook possible diagnostic and treatment
alternatives or they may apply inappropriate treatments. If meaningful
analytic descriptions of the diagnostic and treatment-planning processes
can be developed, these models can assist educators in training new
practitioners, researchers in evaluating and disseminating new developments,
and practitioners in improving the quality of patient care [4].
Developing models of the diagnostic-classification and
treatment-planning process requires an understanding of the underlying
physiological processes of diseases and the mechanisms of their cures.
Obviously, the effects of disease and the means of cure vary from one
health-care problem to another. Thus, modeling efforts in diagnosis and
treatment planning must be integrally related to the facet of health care
that is under study. This reality prohibits the model builder from making
broad statements about the applicability of his models to other health-care
environments. Accordingly, the models developed in this dissertation are
specifically oriented toward the health-care problem presented in Section
1.1 with the understanding that the results of this modeling effort may
not be applicable to the whole of health-care diagnosis and treatment
planning.
1.1 Craniofacial Pain
The head and face are subject to chronic, persistent,
or recurrent pain more often than any other portion of the
body. Pain in the head or face has a greater significance
to patients than any other pain. It may arouse fears that
the patient is in danger of losing his mind or that he has
a tumor of the brain. In addition, the emotional state of
the patient is adversely influenced because it is generally
known by the layman that the profession's knowledge of the
causes of these pains is meager and that methods of
treatment are inadequate [5, p. v].
H. Houston Merritt, M.D., Dean
Columbia University College of
Physicians and Surgeons
One source of the pain Dr. Merritt describes is dysfunction of the
temporomandibular joint. The temporomandibular joint, see Figure 1,
provides the articulation between the mandible and the cranium. This
joint is unique both in its structure and its function. Within the plane
of the temporomandibular joint, lateral, vertical and pivoting motion is
permitted. In addition, the joint is the point of articulation for the
only articulated complex that contains teeth. With this joint, "motion
is directed more by the musculature and less by the shape of the
articulating bones and ligaments than is the fact for other joints" [5, p. 34].

FIGURE 1
TEMPOROMANDIBULAR JOINT
(Right temporomandibular articulation. Inset: anatomical features of the
temporomandibular joint.)

The fact that joint motion is highly dependent on musculature
implies that when mandibular dysfunction occurs there is some disturbance
of the intricate neuromuscular mechanisms controlling mandibular
movement [5]. Emotional tension may also lead to hypertonicity of the
striated masticatory muscles resulting in facial pain or altered
sensation without evidence of peripheral dysfunction. In addition, abnormal
occlusal contacts of the teeth may affect muscle tonicity resulting in
mandibular dysfunction [5]. Moreover, the temporomandibular joint is
prone to disorders common to all joints: rheumatoid arthritis,
osteoarthritis, traumatic injuries, neoplasms, and nonarticular disorders.
Although the term 'craniofacial pain' is a broad classification for pain
in the head and face, the term is used in this dissertation to describe
pathological, congenital, hereditary-based, or emotional causes of pain
in and around the temporomandibular joint.
Though the degree of severity may vary, one or more of the following
four 'cardinal symptoms' are exhibited by the craniofacial-pain patient:
pain, joint sounds, limitation of motion, and tenderness in the
masticatory muscles [6]. Accompanying these symptoms the patient may complain
of, or the practitioner may find, hearing loss, burning sensations,
migraine-like headaches, vertigo, tinnitus, subluxation, luxation, dental
pulpitis, sinus disease, glandular disorders, occlusal disharmony, and
radiographic evidence of joint abnormality. The degree of association
of these additional symptoms and findings with the etiology of the joint
disorders is subject to considerable variation.
Paralleling these areas of anatomic dysfunction is the possibility
that the craniofacial-pain patient may be suffering from psychic
disorders. In no other type of patient seen by the dentist does psychic
condition play a larger role [7]. Most craniofacial-pain patients have
symptoms or signs of anxiety, and a sensory preoccupation with the
occlusion of their teeth [8]. Many of these patients can be characterized
by a heavy reliance on denial, repression, and projection of their
psychic disorders in order to maintain their self-concept of emotional
stability [6]. Often the complaints these patients relate to the
practitioner are not compatible with any objective signs.
The practitioner who manages the care of craniofacial-pain patients
assumes a difficult task. For some of these patients, diagnosis is
obvious. Generally, however, the craniofacial-pain patient presents a
complex combination of signs and symptoms [7]. More than one disease
entity normally accounts for the patient's symptoms and most
craniofacial-pain patients suffer from a pain-dysfunction complex involving
a combination of masticatory muscle disorders, occlusal disharmony,
emotional tension, and anxiety [5]. Nevertheless the possibility of multiple
almost sub-clinical etiologic factors combining to produce the
dysfunction and pain must be considered. The close relationship of organic and
emotional disorders as they appear in craniofacial-pain patients provides
the examining dentist with the problem of discriminating which factor is
primary in the etiology of the patient's dysfunction [7]. Unfortunately,
the temporomandibular joint is one of the most difficult areas of the
body to examine radiographically [8]. Hence, with these patients, the
dentist relies to a large degree on tests of emotional stability and
physical examination by visualization, palpation, and auscultation [7].
Therapeutic measures for the care of craniofacial-pain patients are
as varied as the factors contributing to the disorder. "A small
percentage of patients with symptoms referrable to the temporomandibular joint
will portray such a confusing picture that consultation with other dental
or medical specialists is indicated" [7, p. 129]. The majority of these
patients will exhibit symptoms that lead to any one of several
alternative courses of patient care. Altering the occlusion of the natural
teeth is one means of treating craniofacial-pain patients. Although in
many cases minor occlusal abnormalities are only contributing factors to
a patient's pain, attention by the dentist to occlusion is at least
partially successful for a majority of craniofacial-pain patients [8].
However, it is important in early therapy not to alter the occlusion
irreversibly. Treatment by means of tooth extraction or endodontics, jaw
fixation, prosthetic devices, or by topical treatments may also be
suggested by the patient's symptoms. The articular surface of the
mandibular condyle has an excellent reparative capacity [6]. Thus, the use of
sedatives, antibiotics, and muscle relaxants, along with physical therapy,
often leads to patient 'cures' as these treatments ease the patient's pain
and increase jaw mobility while natural restoration of the joint is in
progress. If, after a reasonable length of time (3 to 6 months), the
patient's symptoms are not relieved, the dentist may consider referral to
another source of care or therapy such as surgery [7].
Typically, the health-care process for craniofacial-pain patients
may be viewed as following the format of Figure 2 [9]. When a patient
is admitted into the care system, he undergoes a data-collection process.
This involves taking a 'full and pertinent' patient history and a
physical examination of the areas of discomfort. The data gathered consist
of symptoms, signs, medical and/or dental history, physical examination
findings, psychosocial information, and so forth. Once these elements
have been elicited, a diagnosis is attempted. If this is not yet
possible, the severe symptoms are treated and the patient's health state is
monitored.

FIGURE 2
DIAGNOSTIC-CLASSIFICATION AND TREATMENT-
PLANNING PROCESS FOR CRANIOFACIAL PAIN

When initial treatment does not result in a 'cure' for the
craniofacial-pain patient, treatment effects are evaluated and new data
collected. When a patient's diagnostic classification leads to a course
of treatment that is not within the realm of the practitioner's
specialty, he is referred to a more appropriate care source. Monitoring is
continued on those patients not rejected from the system at this point, and
the patient is discharged when he is symptom-free. However, when other
disorders have been isolated during the course of treatment, the patient
is recycled through the classification-treatment process.
The diagnosis-treatment sequence is not fixed. Treatment can begin
prior to a diagnostic classification or treatment can follow a diagnosis.
Moreover, there may be many diagnostic-treatment data-acquisition cycles
before the patient is considered 'well.'
1.2 Research Objective
The introductory discussion of the need for diagnostic and
treatment-planning models, and the brief description of the craniofacial-pain
care system, provide the setting for a statement of the research objective
underlying this dissertation. This objective is to derive analytic
representations of the decision processes involved in selecting diagnostic
classifications and planning treatments for craniofacial-pain patients.
A diagnostic-classification model that duplicates the classification of
expert practitioners is sought. For treatment planning, the modeling
goal is to provide a structure for interaction of the critical
considerations associated with the treatment-selection process. These analytic
representations will be structured to permit their application as teaching
devices in the training of dental practitioners, as methods of testing the
effects of new diagnostic tools and treatment applications, and as aids to
the practice of dentistry.

This research objective will be met by developing:
1. A diagnostic-classification model based on the theory
   of non-parametric pattern classification, with
   a. criteria for applicability of the modeling technique
      to diagnostic classification
   b. model validation for craniofacial-pain patients
   c. development of a minimum-cost symptom-selection
      algorithm
2. A Markovian representation of the treatment-selection
   process, with
   a. justification for utilizing a Markovian model of
      the underlying care system
   b. model validation for craniofacial-pain patients
3. A description of potential model applications in teaching,
   research, and practice.
1.3 Dissertation Overview
In Chapter 1 the motivation and scope of this dissertation were
presented. Chapter 2 provides a review of literature relevant to the
diagnostic and treatment-selection processes. A model of the
diagnostic-classification process is developed in Chapter 3. Chapter 4 follows
with an analytic representation of the treatment-planning process.
Conclusions derived from this model-building effort, and suggestions for
future research, are presented in Chapter 5.

CHAPTER 2
PREVIOUS RESEARCH
Over three hundred publications have been addressed to the problem
of modeling the diagnostic and treatment-planning process. Spanning
fourteen years, this research has considered such diverse problems as
the classification of liver biopsies [10] and the optimal plan for
treating mid-shaft fractures of the femur [11]. At least ninety-one
disorders have been utilized as environments for developing diagnostic
and treatment-planning models. The magnitude of this research effort
emphasizes the need for analytic representations of these complex
decision-making processes.
Fortunately, the significant contributions in this voluminous
literature can be neatly partitioned into four distinct categories.
Research in diagnostic classification has been based either on the
application of Bayesian statistics or on the use of non-parametric pattern
classifiers. Treatment planning has been presented as either a
finite-horizon decision problem or as an application of decision analysis to a
Markov process of uncertain duration. This chapter presents a brief
discussion of each of these categories and evaluates their suitability as
analytic representations of the process of providing health care for
craniofacial-pain patients.
2.1 Bayesian Classification Models
Bayesian diagnostic-classification models, such as [12, 13, 14,
15, 16], make a diagnosis on the basis of selecting a patient's 'most
probable' disease state. The Bayesian classifier is an elementary type
of parametric pattern-classification model. In general, parametric
classifiers make use of one or more of the statistical characteristics
of the dispersion of the data being classified to establish rules for
data classification. With the Bayesian models, only the conditional
probabilities for exhibiting sets of symptoms, given a particular
disease, are tabulated from past medical data. Then, utilizing Bayes'
theorem, the probabilities for the presence of alternate diseases
$d_1, d_2, \ldots, d_n$ can be calculated as a function of the symptom-complex $S$
the practitioner observes in the patient. Bayes' theorem provides that
for each of the $d_i$

    $P(d_i \mid S) = C(S)\, P(S \mid d_i)\, P(d_i)$

    where $C(S) = 1 / \left[ \sum_{k=1}^{n} P(S \mid d_k)\, P(d_k) \right]$;

hence, a patient with symptom-complex $S$ is classified in disease-group $i$
if

    $P(d_i \mid S) = \max_k P(d_k \mid S)$.

A survey of the results of application of Bayesian models is given in
Table 1.
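
The Bayesian rule above can be illustrated with a short Python sketch. It is offered only as an illustration under stated assumptions: the function name `bayes_posteriors` is hypothetical, and the prior and conditional probabilities are invented numbers, not data from any of the cited studies.

```python
def bayes_posteriors(priors, likelihoods):
    """P(d_i|S) = C(S) P(S|d_i) P(d_i), with C(S) = 1 / sum_k P(S|d_k) P(d_k)."""
    joint = [p * l for p, l in zip(priors, likelihoods)]
    total = sum(joint)
    if total == 0:
        raise ValueError("symptom complex has zero probability under every disease")
    return [j / total for j in joint]


# Hypothetical values for three alternate diseases and one observed complex S.
priors = [0.5, 0.3, 0.2]        # P(d_i), assumed for illustration
likelihoods = [0.1, 0.6, 0.3]   # P(S|d_i), assumed for illustration

posteriors = bayes_posteriors(priors, likelihoods)
best = max(range(len(posteriors)), key=posteriors.__getitem__)
print(best)  # 1, the index of the most probable disease
```

Note that the result depends on the priors as much as on the conditionals, which is the practical difficulty discussed next.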
Although the percentage of correct diagnoses in most of these test
applications is high, there are several reasons why a Bayesian
diagnostic model is not used as the means of generating diagnostic
classifications in this dissertation. The first reason is the difficulty in
acquiring the proportional presence of alternate diseases $P(d_i)$,
$i = 1,2,\ldots,n$, in the population of patients that are to be classified
by the model. These 'prior' probabilities of having a particular disease
are a function
TABLE 1
SURVEY OF DIAGNOSTIC-CLASSIFICATION MODELS

Bayesian Classifiers
Reference   Disease Group       Number of Patients   % Correct Patient
Number                          in Study             Diagnoses
[12]        Nontoxic Goiter             88                  85.3
[13]        Bone Tumor                  77                  77.9
[14]        Thyroid                    268                  96.3
[15]        Congenital Heart           202                  90.0
[16]        Gastric Ulcer               14                 100.0

Non-Parametric Classifiers
Reference   Disease Group       Number of Patients   % Correct Patient
Number                          in Study             Diagnoses
[17]        Liver                       52                  98.1
[18]        Asthma                     230                  90.0
[19]        Hematologic                 49                  93.9
[20]        Thyroid                    225                  96.0
of seasonal variation, geographic location, population demography, and
many other factors. Secondly, valid Bayesian analysis requires the
analyst to determine the dependence among exhibited symptoms for each
disease considered by the diagnostic model. In this respect, the
probabilities for the presence of groups of symptoms are independent for
some diagnostic alternatives and strongly correlated for others [4]. The
third reason for not selecting a Bayesian model is the massive storage
requirement dictated by the necessity of keeping the set of conditional
probabilities. These conditionals, $P(S \mid d_i)$ for every observable
symptom-complex $S$ and every disease $i$ considered, must be at hand each
time the model is used. For example, given ten alternate diseases and ten
symptoms for which no assumptions of between-symptom independence can be made,
storage is required for $10 \cdot (2^{10} - 1)$, or 10,230, conditional probabilities.
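
The storage figure can be reproduced with a few lines of Python. This is a small arithmetic sketch; the function names are hypothetical, and the comparison with a linear classifier assumes one weight per symptom plus one threshold weight per disease, as in the weight indexing $k = 1,2,\ldots,n+1$ used for the discriminant functions.

```python
def conditional_table_size(num_diseases, num_symptoms):
    """Entries needed when every non-empty symptom subset has its own P(S|d)."""
    return num_diseases * (2 ** num_symptoms - 1)


def linear_weight_count(num_diseases, num_symptoms):
    """Weights stored by linear discriminants: n symptom weights plus one constant."""
    return num_diseases * (num_symptoms + 1)


print(conditional_table_size(10, 10))  # 10230
print(linear_weight_count(10, 10))     # 110
```

The count grows exponentially in the number of symptoms for the conditional table but only linearly for the discriminant weights.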
2.2 Non-Parametric Classification Models
Non-parametric diagnostic models, like [17, 18, 19, 20], utilize
non-parametric pattern classifiers, a form of pattern-recognition
modeling. In the literature on pattern recognition, the term 'non-parametric'
implies that no form of probability distribution is assumed for the
dispersion of symptom data in establishing the rules for pattern
classification. These models do assume, however, that classes of symptom data
are distinct entities and, hence, a patient with a particular set of
symptoms $S$ cannot simultaneously occupy more than one diagnostic state.
That is, the models assume a deterministic classification for each
pattern viewed by the pattern classifier where every observable pattern has
one, and only one, correct classification.
Non-parametric modeling permits the analyst to bypass the difficult
problems of explicitly determining the conditional probabilities for,
and the dependence among, symptoms that are required for Bayesian analysis.
With the non-parametric classifier, a diagnosis is generated for the
practitioner by evaluating a discriminant function associated with each
diagnostic classification, $g_i(\cdot)$, $i = 1,2,\ldots,n$. As was the case
with the Bayesian models, the values of these discriminants are a function
of the symptom-complex $S$ exhibited by the patient. The patient's diagnostic
classification corresponds to that disease whose associated
discriminant-function value is maximum. That is, a patient with symptoms $S$ is
classified in disease-group $i$ if

    $g_i(S) > g_k(S)$ for all $k \neq i$.
Results from some of the applications of pattern-recognition
classifiers are presented in Table 1. In these test applications diagnostic
accuracy was consistently high. Because of these models' ease of
implementation and small storage requirements, a non-parametric pattern
classifier is preferable as a vehicle for generating diagnostic classifications.
The use of a non-parametric classifier is further motivated by features
of the care process for craniofacial-pain patients discussed in Chapter 3.
2.3 Finite-Horizon Treatment Planning
In the realm of research on modeling the treatment-planning process,
several authors [9, 21, 22] have presented schemes for analysis that
utilize methods for making decisions under risk and uncertainty. The
treatment-selection process has alternately been defined as a two-person
zero-sum game, structured as a decision tree, and modeled as a Markov
process of limited duration. Treatment costs and the 'costs' of
occupying 'non-well' or terminal patient states provide the basis for
selecting an 'optimal' treatment plan. Finiteness of the planning horizon is
assured either by establishing a maximum permissible number of treatment
applications, or by considering at any stage of analysis the effects of
a fixed number of future treatments. Validation of the decisions
generated by these models has thus far been limited to checks on the
feasibility of the treatment regimens selected. Unfortunately, the
finite-horizon models either do not consider the possibility of a patient's
prolonged stay in the health-care system, as is the case of the models
with a maximum number of possible treatments, or, where only a fixed
number of future treatments is considered, they provide no more than a
heuristic treatment-selection procedure.
2.4 Uncertain-Duration Treatment Planning
Bunch and Andrew [11] have considered the possibility of prolonged
occupation of the same diagnostic state during the course of a patient's
progression through the care system. In their Markovian representation
of the care system for mid-shaft fractures of the femur, they provide
this modeling refinement. As a consequence of this modification, the
number of treatment decisions made for each patient is a random variable
with no fixed upper bound. Howard's iterative scheme for policy selection
[25] provides the means for choosing the optimal treatment regimen
by selecting treatment alternatives that maximize the relative 'value'
of occupying each disease state. Although the Bunch and Andrew model did
not consider return visits to the same disease state, a more generalized
Markovian representation could incorporate that possibility. Nevertheless,
the proximity to reality that this category of transient Markovian
models provides requires considerable effort, as holding-time
distributions, treatment 'costs,' and transition probabilities must be supplied
by the analyst for all treatment alternatives at each of the disease
states in the care system.

The data collected on craniofacial-pain patient progressions
through the care system reveal that both prolonged occupation of a
single diagnostic state and return visits to the same state occur
frequently. Moreover, as will be discussed in Chapter 4, there are several
characteristics of the craniofacial-pain care system that permit
reductions in the number of input parameters required for a transient Markovian
model of this system. Therefore, an uncertain-duration transient
Markovian representation of the health-care process has been selected as
the means of evaluating the effectiveness of alternative treatment
regimens on patients with craniofacial pain.

CHAPTER 3
DIAGNOSTIC CLASSIFICATION
The analytic model developed to provide diagnostic classifications
for craniofacial-pain patients is based on the principles employed in
non-parametric pattern classification. The patterns classified by this
diagnostic model are vector representations (see Section 3.1 and
Appendix A) of the craniofacial-pain patient's physical and emotional status.
In the first sections of this chapter the theoretical background for the
diagnostic model is established. This discussion is followed by a
presentation of the validation procedures used to evaluate model
performance. Next, an algorithm is developed to reduce the 'costs' associated
with model utilization. The chapter closes with a discussion of
potential applications of the craniofacial-pain diagnostic classifier in
teaching, in research, and in the health-care process.
3.1 Model Components
In the initial phase of the development of the diagnostic-classification
model a set of possible alternative diagnostic classifications
was established for craniofacial-pain patients. Figure 3 provides a
list of these possible classifications. Note that the alternative
classifications in Figure 3 are not mutually exclusive, as a craniofacial-pain
patient classified in some diagnostic alternative 'A' could also have
the disorder specified by some other diagnostic alternative 'B.'
However, for the purposes of this dissertation, each patient's diagnostic
classification is made on the basis of specifying that etiological
factor that requires most immediate action on the part of the attending
practitioner. Thus, diagnostic classification of a patient into
diagnostic alternative 'A' signals that the etiology specified by that
alternative should determine the course of the patient's care.

1. Temporomandibular Joint Arthritis - Developmental
2. Temporomandibular Joint Arthritis - Infectious
3. Temporomandibular Joint Arthritis - Osteo (Degenerative)
4. Temporomandibular Joint Arthritis - Traumatic (Acute)
5. Temporomandibular Joint Arthritis - Traumatic (Chronic)
6. Myopathy - Acute Trauma
7. Myopathy - Myositis
8. Oral Pathology - Dental Pathology
9. Vascular Changes - Migrainous Vascular Changes
10. Myofacial Pain-Dysfunction - Malocclusion - Balancing Interferences
11. Myofacial Pain-Dysfunction - Malocclusion - Lateral Deviation of Slide
12. Myofacial Pain-Dysfunction - Malocclusion - Uneven Centric Stops
13. Myofacial Pain-Dysfunction - Psychoneurosis - Anxiety/Depression
14. Myofacial Pain-Dysfunction - Bruxism
15. Myofacial Pain-Dysfunction - Reflex Protective Muscular Contracture
16. Myofacial Pain-Dysfunction - Loss of Posterior Occlusion
17. Neuropathy
FIGURE 3
CRANIOFACIAL-PAIN DIAGNOSTIC ALTERNATIVES

The next step in model development isolated relevant data which
measured the physiological and psychological status of craniofacial-pain
patients. In particular, this step of model development sought those
elements of patient status that practitioners employ in their own
classification of craniofacial-pain patients. Appendix A presents a list of
these data elements. Wherever it was feasible, measures of patient
status were segmented to amplify the significance of particular readings
of each measure. Thus, for example, while the duration of a patient's
pain is a continuous measure of his status, it is important for the
purposes of classification to know whether a craniofacial-pain patient's
duration of pain is less than 3 weeks, from 3 to 6 weeks, or longer than
6 weeks. For this measure of patient status, a short history of pain
indicates a strong possibility of a recent traumatic injury, while pain
over a long period is more likely associated with long-standing arthritic
or psychic disorders.
To facilitate the development of an analytic model of the
diagnostic-classification process, a vector representation of the relevant elements
of patient data has been developed. The vector permits the notation of
any of the data elements shown in the listing in Appendix A. The
presence of any of the items found in Appendix A is recorded in a patient's
data vector by an entry of '1' in the vector-dimension corresponding to
the item number, while the absence of a vector item is noted by a '0'
data-vector entry. For example, referring to the listing in Appendix A,
a male patient would have the following fifth, sixth, and seventh
elements in his data vector
(...,1,0,0,...),
while a pre-menopausal female would have the series of elements
(...,0,1,0,...).
This vector notation of a patient's status serves as the input data for
a non-parametric pattern classifier that assigns a diagnostic
classification to the patient's dysfunction.
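The encoding described above can be illustrated with a minimal Python sketch. The function names and the `first_item` argument are hypothetical and are not taken from Appendix A; only the 295-item length, the 0/1 convention, the three duration segments, and the fifth-through-seventh-element example come from the text.

```python
from typing import Iterable, List

N_ITEMS = 295  # number of data-vector items listed in Appendix A


def encode_patient(present_items: Iterable[int], n_items: int = N_ITEMS) -> List[int]:
    """Return a 0/1 vector with a 1 at each (1-based) item number that is present."""
    vector = [0] * n_items
    for item in present_items:
        if not 1 <= item <= n_items:
            raise ValueError(f"item number {item} is outside 1..{n_items}")
        vector[item - 1] = 1
    return vector


def duration_item(weeks: float, first_item: int) -> int:
    """Map a pain duration to one of three consecutive segment items.

    The segments (less than 3 weeks, 3 to 6 weeks, longer than 6 weeks)
    follow the text; `first_item` is a hypothetical item number for the
    first segment, not the numbering actually used in Appendix A.
    """
    if weeks < 3:
        return first_item
    if weeks <= 6:
        return first_item + 1
    return first_item + 2


male = encode_patient([5])
pre_menopausal_female = encode_patient([6])
print(male[4:7])                   # [1, 0, 0]
print(pre_menopausal_female[4:7])  # [0, 1, 0]
```

Segmenting a continuous measure into mutually exclusive binary items keeps every data-vector entry at zero or one, which is what the linear discriminants of this chapter assume.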
Non-parametric pattern classification, as described in Meisel [23]
and Nilsson [24], is the process of creating decision surfaces that
separate patterns into homogeneous classes, $C_i$, $i = 1, 2, \ldots, p$, specified
by the analyst. In the craniofacial-pain diagnostic model, the $C_i$ are
the diagnostic alternatives shown in Figure 3. Classification of a
pattern (a patient's data vector) into one of the classes is performed by
a pattern classifier composed of a maximum detector and a set of
discriminant functions. These discriminants $g_j(a)$, $j = 1, 2, \ldots, p$, are
single-valued functions of each patient's data vector a. If $a_i$ represents a
data vector for a patient whose correct diagnostic classification is the
$i$th diagnostic alternative, then the $g_j(a)$ are chosen so that
$g_i(a_i) > g_j(a_i)$, $i, j = 1, 2, \ldots, p$, $j \neq i$.
The craniofacial-pain classifier uses linear discriminant functions.
These discriminants are linear in the sense that they provide mappings
from $E^n$ to $E^1$ that exhibit the form
$g_j(a) = a_1 w_{j1} + a_2 w_{j2} + \cdots + a_n w_{jn} + w_{j(n+1)}$
where in the patient-data-vector a, the value of $a_r$ denotes the presence
($a_r = 1$) or absence ($a_r = 0$) of patient-data-vector item r; and the
$w_{jk}$, $k = 1, 2, \ldots, n+1$, are constants associated with the $j$th discriminant
function called 'weights.' These discriminant-function weights, $w_{jk}$,
$j = 1, 2, \ldots, p$, $k = 1, 2, \ldots, n+1$, provide an analytic means of duplicating
the correct classification of each pattern observed by the non-parametric
classifier. They provide a link between a pattern's correct
classification and the individual components of the pattern's vector
representation. In essence, each discriminant's weights are additive elements
whose component sums have significance in terms of isolating a pattern's
correct classification. These weights are a mathematical means of
storing information already known about the correct classification of observed
pattern vectors. Moreover, the weights can be interpreted from the
point of view of the significance that the practitioner places on each
data-vector component. A discussion of this interpretation of the
discriminant-function weights appears in Section 3.2.
Central to the use of linear discriminant functions is the
assumption that the space of observable patient data vectors is linearly
separable, for by definition [24],
a pattern space A is linear and its subsets of patterns
$A_1, A_2, \ldots, A_p$ are linearly separable if and only if linear
discriminant functions exist such that
$g_i(a) > g_j(a)$ for all $a$ in $A_i$,
for all $i = 1, 2, \ldots, p$, $j = 1, 2, \ldots, p$, $j \neq i$.
In the context of diagnostic classification, the assumption of linear
separability implies that there exists a set of hyperplanes that
partition the space of observable patient data vectors into convex homogeneous
regions, each region representing a unique diagnostic classification.
Rosen [26] has provided a restatement of this assumption in the
requirement that the sets of data vectors corresponding to each diagnostic
alternative have non-intersecting convex hulls. In either form, this is
a fairly restrictive assumption on the dispersion of patient data
vectors (see Section 3.2).
Selecting the 'weights' for each of the discriminant functions is
a process known as 'training.' For the linear non-parametric classifier,
training generates each discriminant function's $w_{jk}$'s by applying a
systematic algorithm to the members of a set of representative patterns with
pre-established classifications. Nilsson [24] discusses several algorithms
suitable for training the craniofacial-pain diagnostic classifier. In
the course of using these algorithms for model development, a new
'modified fixed-increment' training algorithm was constructed (see Appendix
B). Employing the new algorithm has resulted in a reduction of
approximately 35% in the amount of training time required to derive the weights
for the craniofacial-pain classifier.
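For orientation, the following Python sketch shows a generic multi-class fixed-increment error-correction rule of the kind commonly described in the pattern-recognition literature. It is not the 'modified fixed-increment' algorithm of Appendix B, whose details are not reproduced in this section; the function name, the zero initial weights, the treatment of ties as errors, and the `max_epochs` cutoff are all assumptions made for the illustration.

```python
from typing import List, Optional, Sequence


def train_fixed_increment(
    patterns: Sequence[Sequence[int]],
    labels: Sequence[int],
    n_classes: int,
    max_epochs: int = 1000,
) -> Optional[List[List[int]]]:
    """Return one weight vector per class, or None if no solution was found.

    Each pattern is augmented with a trailing 1 so that the last weight
    plays the role of the constant term in the linear discriminant.
    """
    n = len(patterns[0]) + 1
    weights = [[0] * n for _ in range(n_classes)]
    for _ in range(max_epochs):
        errors = 0
        for a, label in zip(patterns, labels):
            y = list(a) + [1]
            d = [sum(x * w for x, w in zip(y, w_j)) for w_j in weights]
            # Strongest competing class for this pattern.
            rival = max((j for j in range(n_classes) if j != label),
                        key=lambda j: d[j])
            if d[rival] >= d[label]:  # a tie is treated as an error
                for k in range(n):
                    weights[label][k] += y[k]  # reinforce the correct class
                    weights[rival][k] -= y[k]  # penalize the rival class
                errors += 1
        if errors == 0:
            return weights  # every training pattern is strictly separated
    return None


# Toy training set (hypothetical): three classes, zero-based labels.
patterns = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
labels = [0, 1, 2]
weights = train_fixed_increment(patterns, labels, n_classes=3)
```

A return value of `None` only means that no feasible weights were found within `max_epochs` passes; it does not by itself show that the training set is not linearly separable.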
Symbolically, the craniofacial-pain diagnostic classifier, with its
set of trained weights, can be represented in the following format:
let $a_i$ = the 296-dimension data vector describing patient 'i'
$a_{ik}$ = the $k$th element in the data vector describing patient
'i', whose value is either zero or one, $k = 1, 2, \ldots, 295$
(by definition $a_{i,296} = 1$)
$C_j$ = diagnostic alternative 'j', $j = 1, 2, \ldots, 17$
$d_{ij}$ = the value of the discriminant function for diagnostic
alternative 'j' generated by the data vector of patient
'i'
$W_j$ = the 296-dimension vector of weights associated with
diagnostic alternative 'j'
$w_{jk}$ = the $k$th element in the weight vector $W_j$,
that is
$W_j = [w_{j1}, w_{j2}, \ldots, w_{j,295}, w_{j,296}]^T$
and
$d_{ij} = a_i W_j = \sum_{k=1}^{296} a_{ik} w_{jk}$
where T denotes vector transposition. Patient 'i' is classified in
diagnostic alternative $C_j$ when $d_{ij} > d_{is}$ for every $s \neq j$. If $\max_j d_{ij}$ is
not unique, then it is not yet possible to classify patient 'i' into
one of the diagnostic alternatives. Treatment is prescribed for severe
symptoms and classification is attempted at a later date.
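The classification rule just stated, including the case of a non-unique maximum, can be sketched in Python as follows. The function names and the toy weights are hypothetical; a real application would use 295 data items plus the constant element and 17 weight vectors, as defined above.

```python
from typing import List, Optional, Sequence


def discriminants(a: Sequence[int], weights: Sequence[Sequence[int]]) -> List[int]:
    """Compute d_j = sum_k a_k * w_jk after appending the constant element 1."""
    augmented = list(a) + [1]
    return [sum(x * w for x, w in zip(augmented, w_j)) for w_j in weights]


def classify(a: Sequence[int], weights: Sequence[Sequence[int]]) -> Optional[int]:
    """Return the 1-based alternative with the unique maximum discriminant.

    None is returned when the maximum is not unique, which corresponds to
    deferring classification to a later visit.
    """
    d = discriminants(a, weights)
    best = max(d)
    winners = [j for j, value in enumerate(d, start=1) if value == best]
    return winners[0] if len(winners) == 1 else None


# Toy weights (hypothetical): 3 data items plus a constant term, 2 alternatives.
toy_weights = [[2, -1, 0, 1],
               [-1, 2, 0, 1]]
print(classify([1, 0, 0], toy_weights))  # 1
print(classify([0, 1, 0], toy_weights))  # 2
print(classify([1, 1, 0], toy_weights))  # None (both discriminants equal 2)
```

Integer weights are used here so that the test for a tied maximum is exact; with floating-point weights a tolerance would be a reasonable design choice.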
Data from four sources were used to construct and verify the
diagnostic-classification model, as well as the treatment-planning model
presented in Chapter 4. Contributions of clinical records came from
the dental schools at the universities of California at Los Angeles,
Florida, Illinois, and Indiana. In all, the records of 250 patients,
involving a total of 480 patient-practitioner interactions, form the
data base for model building and validation. The relevant information
from each of these patient visits has been recorded in the data-vector
format of Appendix A. A diagnostic classification from Figure 3 was
assigned to each of these patient data vectors by either Dr. Thomas B.
Fast, Chairman of the Division of Oral Diagnosis, or by Dr. Parker E.
Mahan, Chairman of the Department of Basic Dental Sciences, at the
College of Dentistry, University of Florida.

With this basic structure for the diagnostic-classification model,
the classified patient data vectors, and the training algorithm presented
in Appendix B, an initial test was performed to verify that the space of
observed patient data vectors was separable by linear discriminant
functions. Application of the modified fixed-increment training algorithm
to the set of 480 data vectors verified this requirement, as the
algorithm terminated in a set of feasible discriminant-function weights.
Using the discriminant functions these constants determine, it is
possible to duplicate the pre-established diagnostic classifications for each
of the patient data vectors.
This first test of the diagnostic classifier established that a
non-parametric classifier could be employed to reproduce the original
classifications for each data vector used in model construction. However,
this test does not reveal how well the classification model will perform
on patient data not employed in developing the discriminant-function
weights. The remainder of this section, and Section 3.3, address the
question of how the diagnostic classifier performs on 'new' patient data
vectors, that is, vectors that have no duplicate in the training sample.
Model training has created a set of weights that, by the definition
of the training procedure, correctly classify every patient data vector
that lies within the bounds of the training-sample pattern-class convex
hulls. Since every data vector is a binary vector, new patient data
vectors must fall outside the convex hulls established by the
training-sample vectors. Yet, if new data vectors have a number of data-vector
elements that are identical to those of the training-sample vectors
with the same diagnostic classification, then this relationship will be
reflected in a 'close proximity,' as measured by a Euclidean-distance
function, between each new vector and its associated training-sample
convex hull. Given this close proximity, the classifier's discriminant
functions should correctly classify most new data vectors, as these
vectors will lie within or near the boundaries of the appropriate
discriminating hyperplanes. Hence, the key to providing adequate classifier
performance for new data vectors lies in devising data-vector
representations of patient data for which the data vectors of a common diagnostic
classification exhibit strong similarity.
In the introductory discussion of the elements of patient data used
in the patient data vector, it was pointed out that an effort was made to
select components of patient status that assist the practitioner in his
selection of diagnostic classifications for a craniofacial-pain patient.
Then these elements were partitioned to generate as much discriminating
information as possible from each data element. In terms of the
alternate diagnostic classifications, these elements of patient data were
chosen so that all patients in any one diagnostic classification would
have a unique combination of exhibited or non-exhibited data-vector
elements. Employing these carefully constructed qualitative data elements
resulted in a set of 'natural' gaps in the vector representations of
patient data from alternate diagnostic classifications. The fact that
there are portions of the pattern space that cannot be occupied by any
data vector, and partitions of the space where the vectors of each
classification must lie, assists the classifier in making correct
classifications of data not used in model construction.
As Section 3.3 shows, this discussion is not meant to imply that the
craniofacial-pain diagnostic classifier can, in its present state of
development, correctly classify every new data vector. What has been
stated is that a knowledge of the underlying classifying process can
be employed in constructing the data vector examined by the classifier,
and that fully utilizing this information will lead to a classifier that
can be expected to be capable of performing well on new patient data.
Of course, this discussion has been predicated on the separability of
the underlying pattern space of data vectors. If this requirement is
not met by some form of patient-data-vector representation,
classification of patients by linear classifier is not possible.
The next section of this chapter provides relationships between
linear separability and the data that may be observed in a health-care
system for which diagnostic classification by linear discriminants is
being considered. This section has a dual purpose. First, linear
separability is couched in 'non-geometric' terms. Second, and more
importantly, using the craniofacial-pain health-care system as an example
of the section's developments provides information about the suitability
of the non-parametric classifier as a model of the decision-making
process associated with diagnostic classification in this care system.
3.2 Alternative Interpretations of Linear Separability
The criteria for pattern space separability are mathematically
concise. Unfortunately, these separability criteria are not readily
expressible in non-geometric terms. The discussion developed in this
section provides the reader with some non-geometric criteria that
indicate when the use of a non-parametric pattern classifier should be
considered as a means of generating diagnoses for a medical or dental
disorder.
The first criterion is associated with a probabilistic measure of
symptom exhibition. Given a patient who exhibits some set of symptoms
S, non-parametric pattern classification requires that $P[S|C_j] = 1$ for
the diagnostic alternative '$C_j$' that describes the patient's current
diagnostic status, and $P[S|C_k] = 0$ for all other diagnostic alternatives
'$C_k$.' However, assume that for the disorder in question the probability
of exhibiting any relevant symptom has been calculated from historical
data, that is, estimates of $P[s_i|C_j]$ are available for all relevant
symptoms $s_i$ and all diagnostic alternatives $C_j$. Then, if the following
decision rule leads to the correct classification of a majority of the
patients with the disorder in question, utilization of a non-parametric
classification model should be investigated:
classify a patient who exhibits the set of symptoms S in the
$j$th diagnostic alternative if
$\prod_{s_i \in S} P[s_i|C_j] > \prod_{s_i \in S} P[s_i|C_k]$ for all $k \neq j$. (1)
Since (1) holds if and only if
$\log \left[ \prod_{s_i \in S} P[s_i|C_j] \right] > \log \left[ \prod_{s_i \in S} P[s_i|C_k] \right]$ for all $k \neq j$,
decision rule (1) can be expressed in terms of logarithms. Let the set
of symptoms S be represented as a row vector a with the elements of a
assigned values as follows:
$a_i = 1$ if symptom $s_i$ is an element of S
and $a_i = 0$ if symptom $s_i$ is not an element of S,
where n is the total number of relevant symptoms. Form the column vectors
$W_j = [\log P[s_1|C_j], \log P[s_2|C_j], \ldots, \log P[s_n|C_j]]^T$.
Then $\log \left[ \prod_{s_i \in S} P[s_i|C_j] \right] = aW_j$, and decision rule (1) can be restated as
classify a patient who is characterized by the vector a in the
$j$th diagnostic alternative if
$aW_j > aW_k$ for all $k \neq j$. (2)
Note that decision rule (2) is identical to the decision rule employed
in non-parametric pattern classification.
This equivalence implies that if (1) holds for every preclassified
patient examined, the values $\log P[s_i|C_j]$ form a set of feasible
discriminant-function weights. If (1) leads to the correct classification of
a majority of the patients examined, it is logical to assume that there
may be a set of feasible discriminant-function weights. This assumption
was examined using the craniofacial-pain patient data. From the data
vectors classified in Diagnostic Alternatives 13, 14, and 15, a total of
189 patient visits, the $P[s_i|C_j]$ were calculated. Each data vector was
then classified with decision rule (1), and 164 of the data vectors
(86.7%) were assigned to their pre-established diagnostic alternative.
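The equivalence between decision rules (1) and (2) can be checked numerically with a short Python sketch. The probability table below is invented purely for illustration and is not derived from the craniofacial-pain data; the sketch also assumes that every estimated probability is strictly positive, since the logarithm of zero is undefined.

```python
import math
from typing import Sequence


def product_rule(a: Sequence[int], probs: Sequence[Sequence[float]]) -> int:
    """Decision rule (1): choose the class with the largest product of
    P[s_i | C_j] taken over the symptoms that are present (a_i == 1)."""
    scores = [math.prod(p for x, p in zip(a, row) if x == 1) for row in probs]
    return scores.index(max(scores))


def log_weight_rule(a: Sequence[int], probs: Sequence[Sequence[float]]) -> int:
    """Decision rule (2): choose the class with the largest a . W_j,
    where W_j holds the logarithms of the same probabilities."""
    weights = [[math.log(p) for p in row] for row in probs]
    scores = [sum(x * w for x, w in zip(a, w_j)) for w_j in weights]
    return scores.index(max(scores))


# Hypothetical estimates of P[s_i | C_j]: 2 classes (rows), 3 symptoms (columns).
probs = [[0.9, 0.2, 0.5],
         [0.3, 0.8, 0.5]]

for a in ([1, 0, 1], [0, 1, 0], [0, 1, 1]):
    print(a, product_rule(a, probs), log_weight_rule(a, probs))
```

Both functions return a zero-based class index and ignore ties, which keeps the sketch short; the two rules agree because the logarithm is strictly increasing.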
The second criterion provides a subjective measure of the
feasibility of using a non-parametric pattern classifier. If symptoms for most
of the diagnostic alternatives associated with the disorder of interest
can be isolated such that
1. a patient's exhibition of a subset of these symptoms leads
the practitioner to a selection of one of the diagnostic
alternatives, or
2. a patient's exhibition of a subset of these symptoms leads
the practitioner to eliminate from further consideration
one of the diagnostic alternatives,
then the use of a non-parametric classifier as a means of generating
classifications should be investigated.
The linear non-parametric classifier employs a weighted sum of
the symptoms exhibited by each patient in its discriminating functions.
If symptoms can be isolated that are significant to the classification
of patients with the disorder under investigation, then there is a
'natural' weight for each of these symptoms in the decision-making
process used by the practitioner. The existence of these natural weights
increases the probability that a training algorithm will be able to find
a feasible set of discriminant-function weights. Indeed, the relative
importance of the significant symptoms may be reflected in the magnitude
of the discriminant-function weights generated by the application of a
training algorithm.
As an example, the significant symptoms associated with two
craniofacial-pain diagnostic alternatives, Alternatives 4 and 14, were isolated
by Dr. Fast. A comparison of these symptoms and their associated
discriminant-function weights revealed a high degree of correlation between
symptom significance and discriminant-function weights (see Table 2).
The reader should note that both of the criteria discussed in this
section are heuristic approximations to the geometric requirement for
pattern space separability. However, if the disorder under investigation
meets one or both of these criteria, it may be possible to employ a non-
parametric classifier to diagnose the disorder since the requirement for
pattern space separability is most likely met.

TABLE 2
CORRELATION BETWEEN SIGNIFICANT SYMPTOMS
AND DISCRIMINANT-FUNCTION WEIGHTS

Diagnostic Alternative 4: Temporomandibular Joint Arthritis - Traumatic
(Acute)
                                              Discriminant-Function
Significant Symptoms                          Weights
(+) Duration of Pain (less than 3 weeks)      + 3
(+) History of Trauma (accidental)            +30
(+) Preauricular Pain                         +11
(-) Salivary Gland Disease                    -12
(-) Otitis                                    - 1
(discriminant-function weights for Diagnostic Alternative 4 range
from -19 to +37)

Diagnostic Alternative 14: Myofacial Pain-Dysfunction - Bruxism
                                              Discriminant-Function
Significant Symptoms                          Weights
(+) Duration of Pain (more than 6 weeks)      +15
(+) Facets                                    + 2
(+) Bruxism and/or Clenching                  +56
(-) History of Trauma (accidental)            -16
(-) Salivary Gland Disease                    - 5
(discriminant-function weights for Diagnostic Alternative 14 range
from -23 to +56)

Note: For both Diagnostic Alternatives
(+) indicates a symptom that leads the practitioner to classify
a patient in that diagnostic alternative
(-) indicates a symptom that leads the practitioner to classify
a patient in some other diagnostic alternative

3.3 Model Validation
Validation of the craniofacial-pain diagnostic-classification
model presented in Section 3.1 has been accomplished by three types of
validating procedures. The discussion presented in the preceding
sections, and in particular the relationship between significant symptoms
and their associated weights shown in Table 2, reveal a close proximity
between the decision-making process the practitioner utilizes and the
non-parametric classifier's symptom-weighing scheme. This section
presents two other procedures employed in evaluating the diagnostic
classification model's performance.
The first procedure involved testing the diagnostic accuracy of
the classification model on patient data that were not employed in model
construction. Six classification tests were run in sequential order.
In the first five of these tests random samples of 50 patient data
vectors were drawn from the data base of 480 vectors discussed in Section
3.1. Then, as each of the tests was performed, the training algorithm
in Appendix B was applied to the remaining 430 data vectors. With the
weights derived from the training algorithm, the sample of 50 patients
was classified. The model-generated classifications for each of the
data vectors were compared to the classifications assigned to the vectors
when they were created. As each test classification of a sample was
completed, the diagnostic classifier's discriminant-function weights were
set equal to zero, the sample of data vectors was returned to the data
base, and the next test's random sample was drawn. A summary of the
results of these tests of diagnostic accuracy is presented in Table 3.
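One round of the sampling procedure just described might be sketched in Python as follows. The `train` and `classify` arguments are hypothetical callables standing in for the Appendix B training algorithm and the trained classifier; the majority-label stand-ins at the bottom exist only to make the sketch runnable and are not part of the dissertation's method.

```python
import random
from collections import Counter
from typing import Callable, List, Sequence


def holdout_accuracy(
    vectors: Sequence[Sequence[int]],
    labels: Sequence[int],
    sample_size: int,
    train: Callable,
    classify: Callable,
    rng: random.Random,
) -> float:
    """Draw a random test sample, train on the remainder, and score the sample."""
    test_idx = set(rng.sample(range(len(vectors)), sample_size))
    train_x = [v for i, v in enumerate(vectors) if i not in test_idx]
    train_y = [y for i, y in enumerate(labels) if i not in test_idx]
    model = train(train_x, train_y)  # training restarts from scratch each round
    correct = sum(classify(model, vectors[i]) == labels[i] for i in test_idx)
    return correct / sample_size


# Stand-in learner (assumption): always predicts the most common training label.
def majority_train(train_x: List, train_y: List[int]) -> int:
    return Counter(train_y).most_common(1)[0][0]


def majority_classify(model: int, vector: Sequence[int]) -> int:
    return model


toy_vectors = [[i % 2] for i in range(20)]
toy_labels = [13] * 20
accuracy = holdout_accuracy(toy_vectors, toy_labels, 5,
                            majority_train, majority_classify, random.Random(0))
```

Because the sample is drawn by data vector rather than by patient, visits from one patient can appear on both sides of the split.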
In each of the first five tests it was possible for a patient who
has had multiple practitioner-visits to have some of the vectors
representing these visits in a test's random sample and some vectors used
in model construction. Such occurrences lead to test results that
overestimate classifier accuracy. Hence, in Test Six, a random sample of
all of the patient data associated with 40 patients (a total of 51
patient data vectors) was selected. This sample was classified by the
diagnostic-classification model using the remaining 429 data vectors as
a data base. The results of this test are included in the data shown
in Table 3. There is one other possible factor affecting the classifier's
accuracy as measured by these tests. It is conceivable that there were
duplicate data vectors in the data base of 480 patient data vectors.
If duplicates do exist and were included in both the test samples and
the samples' training bases, measures of classifier accuracy will be
overly optimistic. However, since 'noise' is introduced by the
variability among craniofacial-pain patients and generated in the practitioner's
transcribing of the elements of patient data into the data-vector format,
and since there are $2^{295}$ possible data vectors, the probability that two
or more of the data-based patient vectors include an identical
specification of data-vector elements is small enough to justify neglecting this
possibility and its effects.

TABLE 3
TESTS OF DIAGNOSTIC CLASSIFIER ACCURACY

              Number of       Number of
              Patient         Data Vectors            Classifier
              Data Vectors    Correctly Classified    Accuracy
TEST ONE          50              46                    92.0%
TEST TWO          50              45                    90.0%
TEST THREE        50              44                    88.0%
TEST FOUR         50              47                    94.0%
TEST FIVE         50              45                    90.0%
TEST SIX          51              43                    84.3%

Mean Classifier Accuracy                                89.7%
Standard Deviation of Classifier Accuracy                3.5%
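As a quick arithmetic check, the per-test accuracies and the mean accuracy reported in Table 3 can be recomputed from the tabulated counts. The Python below is only a convenience sketch; the variable names are illustrative.

```python
# (number of data vectors, number correctly classified) for each test in Table 3
tests = {
    "ONE": (50, 46),
    "TWO": (50, 45),
    "THREE": (50, 44),
    "FOUR": (50, 47),
    "FIVE": (50, 45),
    "SIX": (51, 43),
}

# Accuracy of each test, in percent.
accuracies = {name: 100.0 * correct / total
              for name, (total, correct) in tests.items()}

# Unweighted mean of the six test accuracies.
mean_accuracy = sum(accuracies.values()) / len(accuracies)

# Accuracy pooled over all 301 classified data vectors.
pooled_accuracy = (100.0 * sum(c for _, c in tests.values())
                   / sum(t for t, _ in tests.values()))

print(round(accuracies["SIX"], 1))  # 84.3
print(round(mean_accuracy, 1))      # 89.7
print(round(pooled_accuracy, 1))    # 89.7
```

The unweighted mean and the pooled figure agree to one decimal place because the six samples are almost the same size.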
The results summarized in Table 3 reveal that the
diagnostic-classification model performs well in duplicating the diagnostic
classifications originally assigned by the reviewing practitioners, Dr. Fast and
Dr. Mahan. Moreover, the size of the test samples was quite large in
relation to the data base employed in developing each test's diagnostic
model. As new data become available and are incorporated in the
parameters of the model, the accuracy of the craniofacial-pain diagnostic
classifier can be expected to increase slightly.

The second validating procedure established a measure of variability
on the diagnostic classifications that might be given by different dental
practitioners. The discussion presented in Section 1.1 related the
difficulties associated with diagnosing craniofacial-pain disorders.
Practitioners with varying kinds of professional experience can be expected
to reflect their dissimilar backgrounds in differing diagnostic
classifications for these patients. To measure the variability associated with
dissimilar backgrounds, five craniofacial-pain data vectors were selected
from the data base employed in constructing the craniofacial-pain
diagnostic classifier. Four dentists from the staff of the College of
Dentistry at the University of Florida were asked to review these patient
data vectors and assign to each of them a diagnostic classification.
Table 4 summarizes their assignments and also includes the diagnostic
classification originally given by the reviewing practitioners.
The variability in diagnostic assignments reflected in Table 4
reaffirms the justification for the research objectives set forth in
Section 1.2. Some of the differences in the practitioners' choices of
diagnostic classifications can be explained by the limited amount of
data contained in each of the data vectors, and the less-than-full
medical statement of each of the diagnostic alternatives. Nevertheless, a
diagnostic-classification model that generates classifications that are
in 90% agreement with those of experts in the field provides a sizeable
improvement over the variability in classification assignments exhibited
in Table 4, in which only half the respondents agreed on a single
diagnosis in four out of five cases.

TABLE 4
CLASSIFICATION VARIABILITY AMONG DENTAL PRACTITIONERS

                              Diagnostic Classification for
                  Patient 1  Patient 2  Patient 3  Patient 4  Patient 5+
Original
Classification        4         13         15         15          9
Practitioner 1        1          7         15         15          3
Practitioner 2        6         12         15          8          3
Practitioner 3        4         15         15         15         13
Practitioner 4        4         15         15         14          *

* No classification given
+ Patient 5 exhibited a minimal amount of input data (only 17
non-zero data-vector entries)
These four dental practitioners exhibited 100.0% agreement on the
diagnosis of one of the five patients, and 50.0% agreement on the
diagnostic classification of the remaining four patients.
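The agreement figures quoted with Table 4 can be reproduced from the tabulated assignments with a small Python sketch. Here 'agreement' is taken to mean the share of the four practitioners who chose the most common classification for a patient, which is an interpretation adopted for the illustration; `None` marks the one case in which no classification was given.

```python
from collections import Counter
from typing import Dict, List, Optional

# Practitioner assignments from Table 4, keyed by patient number.
assignments: Dict[int, List[Optional[int]]] = {
    1: [1, 6, 4, 4],
    2: [7, 12, 15, 15],
    3: [15, 15, 15, 15],
    4: [15, 8, 15, 14],
    5: [3, 3, 13, None],
}


def modal_agreement(choices: List[Optional[int]]) -> float:
    """Fraction of all practitioners who chose the most common classification."""
    counts = Counter(c for c in choices if c is not None)
    return max(counts.values()) / len(choices)


agreement = {patient: modal_agreement(choices)
             for patient, choices in assignments.items()}
print(agreement)  # {1: 0.5, 2: 0.5, 3: 1.0, 4: 0.5, 5: 0.5}
```

Dividing by the full panel of four, rather than by the number of practitioners who answered, counts a missing classification as a disagreement.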

3.4 Minimum-Cost Symptom-Selection Algorithm
The craniofacial-pain diagnostic-classification model detailed in
the previous sections of this chapter has been structured upon the data
vector of the 295 relevant signs, symptoms, and items of patient history
shown in Appendix A. To utilize this model, the practitioner must
examine a patient for the presence or absence of each of these data-vector
elements. Although the cost in time and fees varies from item to item,
there is an expense to the practitioner, and to the patient, associated
with checking each element in the data vector. Hence, it is logical to
investigate the possibility of finding a reduced data vector that 'costs'
less for the patient and practitioner to use and yet still permits
correct classification of all craniofacial-pain patients.
A review of the literature (see Meisel [23], Chapter 9, for a survey)
reveals that many authors have considered the task of selecting a set
of features to be used in a pattern-classification scheme. Traditional
methods of viewing this problem are based on a search for a
transformation that takes a given set of patterns into some 'new' pattern space
where separation by discriminant functions is possible. Measures of
pattern class separability are employed to evaluate the effects of
transforming the set of patterns from one space to another. In general,
these transformations take a pattern representation in 'n' features and
create a set of 'r' (r < n) 'new' features, where the 'new' features are
linear combinations of the original features. However,
to reduce the 'costs' associated with using the craniofacial-pain
diagnostic classifier, a transformation must be found that decreases
the size of the data-vector pattern space by eliminating features rather
than combining them. For example, assume patients were diagnosed on
the basis of body-temperature and blood-pressure readings. Traditional
techniques for feature selection might employ a linear combination of
body temperatures and blood pressure measurements as one 'new' feature.
The transformation sought in this investigation would lead to the
classification of patients by either body temperature or blood pressure
alone if this were possible. This example will be used again in Section
3.4.1 to illustrate the algebraic and geometric structure of the problem.
Nelson and Levy [27] have attacked the problem of selecting a reduced
set of unaltered features for use in a classification scheme.
These authors attach a cost to the use of each available feature and
employ a ranking scheme to measure each feature's discriminating power.
Then, under a restriction on the total cost of features employed, they
develop an algorithm that selects the set of features that maximizes the
classifier's discriminating power. Unfortunately, their scheme does not
guarantee the selection of a subset of original features that contains
enough 'information' to permit pattern-class separation by discriminant
functions. Therefore, a new algorithm is presented in this section that
minimizes the cost of the set of features used by the pattern classifier
yet ensures that all patterns can be correctly classified by a set of
linear discriminant functions. In the remainder of this section the
more general terms 'feature,' 'pattern,' and 'pattern class' will be
used respectively to represent a data-vector item, a patient's data
vector, and a diagnostic classification.
The problem of finding a minimum-cost collection of features would
not be considered if there did not already exist a set of 'n' features
by which the patterns under examination could be correctly classified
by linear discriminants. That is, given an 'n' dimensional representation
of each of the $m_i$ patterns in each of the 'p' pattern classes
$$a_i^m = [a_{i1}^m, a_{i2}^m, \ldots, a_{in}^m, 1], \quad i = 1,2,\ldots,p,$$
where $a_{ik}^m$, $k = 1,2,\ldots,n$, equals either zero or one, there must exist
a set of 'n+1' dimensional $W_j$'s, $j = 1,2,\ldots,p$, such that
$$a_i^m (W_i - W_j) > 0 \quad \text{for all } m = 1,2,\ldots,m_i;\; i = 1,2,\ldots,p;\; j = 1,2,\ldots,p;\; j \neq i. \tag{3}$$
Letting $A_i$ be the $m_i \times (n+1)$ dimensional matrix of patterns in
pattern-class i, then the requirement of (3) can be written in the
following form:
$$A_i (W_i - W_j) > 0, \quad i = 1,2,\ldots,p;\; j = 1,2,\ldots,p;\; j \neq i.$$
If such pattern representations and $W_i$'s exist, then a solution to the
following problem yields a minimum-cost collection of pattern-classifying
features:
$$\text{P1:} \quad \text{minimize } CX$$
$$\text{subject to } A_i [X \odot (W_i - W_j)] > 0, \quad i = 1,2,\ldots,p;\; j = 1,2,\ldots,p;\; j \neq i$$
where
$$A_i = \begin{bmatrix} a_{i1}^1 & a_{i2}^1 & \cdots & a_{in}^1 & 1 \\ a_{i1}^2 & a_{i2}^2 & \cdots & a_{in}^2 & 1 \\ \vdots & \vdots & & \vdots & \vdots \\ a_{i1}^{m_i} & a_{i2}^{m_i} & \cdots & a_{in}^{m_i} & 1 \end{bmatrix}$$
$$W_i = [w_{i1}, w_{i2}, \ldots, w_{in}, w_{i,n+1}]^T$$
$$C = [c_1, c_2, \ldots, c_n, 0]$$
$$X = [x_1, x_2, \ldots, x_n, 1]^T$$
and $w_{ij}$ is an unrestricted variable,
$c_j$ is the cost of using feature j,
$$x_i = \begin{cases} 0 & \text{if feature } i \text{ is not used} \\ 1 & \text{if feature } i \text{ is used.} \end{cases}$$
Note: The notation $\odot$ is to be read as element by element
multiplication, i.e., $Q \odot R = S$, $[s_{ij}] = [q_{ij} r_{ij}]$.
3.4.1 Algorithm Development
The algorithm developed to solve problem P1 is an enumerative
algorithm similar in structure to that of Balas [28]. Unfortunately,
the non-linear nature of problem P1's constraints prohibits full
implementation of the more powerful techniques used in implicit enumeration
on linear integer problems. The structure of these constraints and
their effect on the optimization of P1 will be discussed in a
step-by-step development.
The minimum-cost feature-selection algorithm does not solve P1 to
the extent of finding the values of the vectors $W_i$, $i = 1,2,\ldots,p$. This
algorithm does find the minimum-cost collection of features $X^*$ and the
total cost associated with using these features, and guarantees the
existence of $W_i^*$ vectors associated with this optimal feature set. Given
this guarantee, the modified fixed-increment algorithm from Appendix B
can be employed to find the vectors $W_i^*$, $i = 1,2,\ldots,p$.
Choose some solution to P1. By hypothesis there exists at least
one solution $(X, W_1, W_2, \ldots, W_p)$ to P1 where $X = [1,1,\ldots,1,1]$. Suppose
there is some other solution $(\hat{X}, \hat{W}_1, \hat{W}_2, \ldots, \hat{W}_p)$ where one or more elements
$\hat{x}_k$ in the $\hat{X}$ vector are equal to zero. For the constraint matrices in P1,
$$A_i [\hat{X} \odot (\hat{W}_i - \hat{W}_j)] > 0, \quad i = 1,2,\ldots,p;\; j = 1,2,\ldots,p;\; j \neq i.$$
If the matrix products $[A_i \odot \hat{X}] = \hat{A}_i$, $i = 1,2,\ldots,p$, are constructed, then
each set of constraints in P1 can be written in the form
$$\hat{A}_i (\hat{W}_i - \hat{W}_j) > 0, \quad i = 1,2,\ldots,p;\; j = 1,2,\ldots,p;\; j \neq i. \tag{4}$$
The creation of the $\hat{A}_i$ is called the zeroing process. Of the columns
of $A_i$, $\hat{A}_i$ retains all columns j of $A_i$ where $\hat{x}_j = 1$, and substitutes
a column of zeros for each of those columns k in $A_i$ where $\hat{x}_k = 0$. Using
the zeroing process, the feasibility of any possible solution vector $X$
to P1 can be examined in terms of the $A_i \odot X$ this vector $X$ creates.
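The zeroing process can be illustrated with a minimal Python sketch. The function name `zero_columns` and the small pattern class below are hypothetical, chosen only for illustration; each pattern is a row that ends with the augmented 1, and the selection vector carries a trailing 1 in the same position.

```python
def zero_columns(patterns, x):
    """Return the patterns with feature column k zeroed wherever x[k] == 0.

    patterns: list of rows [a_1, ..., a_n, 1]; the trailing 1 is the augmented term.
    x: list [x_1, ..., x_n, 1] of 0/1 feature-selection flags.
    """
    # Element-by-element product of every pattern row with the selection vector.
    return [[a * flag for a, flag in zip(row, x)] for row in patterns]


# Hypothetical pattern class with three features plus the augmented 1.
A_i = [
    [1, 0, 1, 1],
    [0, 1, 1, 1],
]
x_hat = [1, 0, 1, 1]  # drop feature 2, keep features 1 and 3

A_hat = zero_columns(A_i, x_hat)
print(A_hat)  # [[1, 0, 1, 1], [0, 0, 1, 1]]
```

Zeroing a column rather than deleting it keeps every matrix at its original width, so the constraint sets of P1 keep the same shape for every candidate vector.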
As an example of the zeroing process for a particular set of patterns,
let $a^i$ be a two-dimensional patient-data-vector $a^i = [a_1^i, a_2^i]$ where
$$a_1^i = \begin{cases} 0 & \text{if patient } i \text{ has normal body temperature} \\ 1 & \text{if patient } i \text{ has abnormal body temperature} \end{cases}$$
and

41

42
Note that relation (2) is the requirement for pattern separability
by linear discriminants. Hence, a vector $\hat{X}$ is a component in a feasible
solution $(\hat{X}, \hat{W}_1, \hat{W}_2, \ldots, \hat{W}_p)$ to P1 if and only if there exist $\hat{W}_i$, $i = 1,2,\ldots,p$,
such that (2) holds for all $i \neq j$. As discussed in Section 3.1, a pattern
space is linearly separable, and hence, feasible $\hat{W}_i$ exist, if and only if
the individual pattern classes have non-intersecting convex hulls. For
the pattern vectors considered in this section, the individual components
of each of the patterns in each pattern class are either zero or one. As
there is a one-to-one correspondence between the individual patterns in
a pattern class and the vertices of the pattern class's convex hull, the
convex hull of a pattern-class $\hat{A}_i$ can be expressed as all convex
combinations of the individual pattern-class vectors $\hat{a}_i^m$, $m = 1,2,\ldots,m_i$. Consider
the following examples of the convex-hull representation of linear
separability.
Assume $a_X^1 = [1,0]$, $a_X^2 = [1,1]$, $a_Y^1 = [0,0]$, and $a_Y^2 = [0,1]$.
Graphically this pattern space can be represented as
[Figure: the four patterns plotted on axes Feature 1 and Feature 2, with lines X and Y and a separating line 0.]
where the line X from $a_X^1$ to $a_X^2$ represents the convex hull of
pattern-class X and the line Y from $a_Y^1$ to $a_Y^2$ represents the convex hull of
pattern-class Y. Since X and Y do not intersect, implying that the
space is linearly separable, it is possible to draw an infinite number
of lines 0 that serve as discriminating hyperplanes.

Assume $a_X^1 = [1,0]$, $a_X^2 = [0,1]$, $a_Y^1 = [0,0]$, and $a_Y^2 = [1,1]$.
Graphically this pattern space can be represented as
[Figure: the four patterns plotted on axes Feature 1 and Feature 2, with lines X and Y crossing.]
where the line X from $a_X^1$ to $a_X^2$ represents the convex hull of
pattern-class X and the line Y from $a_Y^1$ to $a_Y^2$ represents the convex hull of
pattern-class Y. Since the lines X and Y intersect, the pattern space
is not linearly separable, and hence, it is impossible to draw a
discriminating hyperplane 0.
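The two examples above can be checked numerically. The following is a minimal sketch that applies only to the special case shown, two patterns per class in two features, where each convex hull is a line segment; the function names are hypothetical, and the general case requires the linear-programming formulation developed later in this section.

```python
def orient(p, q, r):
    """Sign of the cross product (q - p) x (r - p): 1, -1, or 0 if collinear."""
    v = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return (v > 0) - (v < 0)


def on_segment(p, q, r):
    """True if r, already known to be collinear with p and q, lies between them."""
    return (min(p[0], q[0]) <= r[0] <= max(p[0], q[0])
            and min(p[1], q[1]) <= r[1] <= max(p[1], q[1]))


def hulls_intersect(x1, x2, y1, y2):
    """Do segment x1-x2 (hull of class X) and segment y1-y2 (hull of class Y) meet?"""
    o1, o2 = orient(x1, x2, y1), orient(x1, x2, y2)
    o3, o4 = orient(y1, y2, x1), orient(y1, y2, x2)
    if o1 != o2 and o3 != o4:
        return True  # the segments properly cross
    # Remaining possibility: an endpoint of one segment lies on the other.
    return ((o1 == 0 and on_segment(x1, x2, y1))
            or (o2 == 0 and on_segment(x1, x2, y2))
            or (o3 == 0 and on_segment(y1, y2, x1))
            or (o4 == 0 and on_segment(y1, y2, x2)))


# First example: hulls are parallel vertical segments, so the space is separable.
print(hulls_intersect((1, 0), (1, 1), (0, 0), (0, 1)))  # False
# Second example: hulls are crossing diagonals, so the space is not separable.
print(hulls_intersect((1, 0), (0, 1), (0, 0), (1, 1)))  # True
```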
Therefore, the following condition is equivalent to condition (4):
a vector $\hat{X}$ is feasible to P1 if and only if there do not exist $U^s$ and $U^t$
such that
$$U^s \hat{A}_s = U^t \hat{A}_t \quad \text{for any } s = 1,2,\ldots,p;\; t = 1,2,\ldots,p;\; s \neq t \tag{5}$$
where
$$U^i = [u_1^i, u_2^i, \ldots, u_{m_i}^i],$$
$$u_k^i \geq 0 \quad \text{for all } k = 1,2,\ldots,m_i,$$
and
$$\sum_{k=1}^{m_i} u_k^i = 1 \quad \text{for all } i = 1,2,\ldots,p.$$
Checking the feasibility of some vector $\hat{X}$ by condition (5) yields
[p(p-1)]/2 distinct subproblems. Each of these subproblems may be
characterized as follows:
P2: let $\hat{A}_s^T = A$ and $\hat{A}_t^T = B$ with A and B having columns $a_i$
and $b_j$ respectively for any $\hat{A}_s$ and $\hat{A}_t$.
Find $u_i \geq 0$, $\sum_{i=1}^{m_i} u_i = 1$, and $v_j \geq 0$, $\sum_{j=1}^{m_j} v_j = 1$
such that
$$\sum_{i=1}^{m_i} u_i a_i = \sum_{j=1}^{m_j} v_j b_j.$$
If such $u_i$ and $v_j$ exist for any one of the subproblems then $\hat{X}$ is not
feasible to P1. Because the number of subproblems is large even for a
relatively small number p of pattern classes, there is justification for
seeking methods to expedite the solution of each subproblem P2.
To achieve this goal, a series of conditions will be presented that
characterize some of the criteria necessary to the existence of a
solution to subproblem P2. In addition to establishing criteria for
existence, these conditions provide a means for reducing the size of the
matrices A and B. This reduction will be discussed after the conditions
are established.
Condition 1: If the $k$th row of A has all elements $a_i^k$, $i = 1,2,\ldots,m_i$,
equal to zero (one) and the $k$th row of B has all
elements $b_j^k$, $j = 1,2,\ldots,m_j$, equal to one (zero) then no
$u_i \geq 0$, $\sum_{i=1}^{m_i} u_i = 1$ and $v_j \geq 0$, $\sum_{j=1}^{m_j} v_j = 1$ exist such that
$$\sum_{i=1}^{m_i} u_i a_i = \sum_{j=1}^{m_j} v_j b_j.$$
Justification 1: Under Condition 1 there is no set of convex combinations
of the $k$th row elements of A and of the $k$th row
elements of B such that the combinations are equal.
Hence, there can be no set of convex combinations
of the columns of A and of B such that the combinations
are equal.
Symbolically,
since no $u_i \geq 0$, $\sum_{i=1}^{m_i} u_i = 1$ and $v_j \geq 0$, $\sum_{j=1}^{m_j} v_j = 1$
exist such that
$$\sum_{i=1}^{m_i} u_i a_i^k = \sum_{j=1}^{m_j} v_j b_j^k,$$
no $u_i \geq 0$, $\sum_{i=1}^{m_i} u_i = 1$, and $v_j \geq 0$, $\sum_{j=1}^{m_j} v_j = 1$
exist such that
$$\sum_{i=1}^{m_i} u_i a_i = \sum_{j=1}^{m_j} v_j b_j.$$
Condition 2: If the $k$th row of A has all elements $a_i^k$, $i = 1,2,\ldots,m_i$,
equal to zero (one) and the $k$th row of B has all
elements $b_j^k$, $j = 1,2,\ldots,m_j$, equal to zero (one), the
$k$th row of matrices A and B can be eliminated without
loss of possible solutions to subproblem P2.
Justification 2: Under Condition 2 every convex combination of the $k$th
row elements of A and of the $k$th row elements of B
are equal. Hence, a set of convex combinations of the
columns of A and of the columns of B are equal if and
only if the convex combinations of the remaining rows
(all rows except the $k$th row) are equal. Symbolically,
let $a_{ik}^*$ denote the pattern $a_i$ whose $k$th component has
been eliminated and similarly let $b_{jk}^*$ denote the
elimination of component k from pattern $b_j$, then as
$$\sum_{i=1}^{m_i} u_i a_i^k = \sum_{j=1}^{m_j} v_j b_j^k$$
for any choice of
$u_i \geq 0$, $\sum_{i=1}^{m_i} u_i = 1$, $v_j \geq 0$, $\sum_{j=1}^{m_j} v_j = 1$,
$$\sum_{i=1}^{m_i} u_i a_i = \sum_{j=1}^{m_j} v_j b_j$$
if and only if
$$\sum_{i=1}^{m_i} u_i a_{ik}^* = \sum_{j=1}^{m_j} v_j b_{jk}^*.$$
Condition 3: If the $k$th row of A has all elements $a_i^k$, $i = 1,2,\ldots,m_i$,
equal to zero, and some $b_r^k$ equals one,
no $u_i \geq 0$, $\sum_{i=1}^{m_i} u_i = 1$, and $v_j \geq 0$, $v_r > 0$, $\sum_{j=1}^{m_j} v_j = 1$
exist such that
$$\sum_{i=1}^{m_i} u_i a_i = \sum_{j=1}^{m_j} v_j b_j.$$
Justification 3: Under Condition 3 any convex combination of the columns
of B that includes a non-zero product of the
column $b_r$ results in a $k$th row term greater than zero.
The value of the $k$th row term for any convex combination
of the columns of A is equal to zero. Hence, no
set of convex combinations of the columns of A and B
can be equal if the combination for B includes a
specification that $v_r > 0$. Symbolically,
if $v_r > 0$,
then for any choice of $v_j$, $j = 1,2,\ldots,m_j$, $j \neq r$,
where $v_j \geq 0$ and $\sum_{j=1}^{m_j} v_j = 1$,
$$\sum_{j=1}^{m_j} v_j b_j^k > \sum_{i=1}^{m_i} u_i a_i^k = 0$$
for any choice of $u_i$ such that $u_i \geq 0$ and $\sum_{i=1}^{m_i} u_i = 1$.
Hence, if $v_r > 0$, there exist no $u_i \geq 0$, $\sum_{i=1}^{m_i} u_i = 1$
and $v_j \geq 0$, $j \neq r$, $\sum_{j=1}^{m_j} v_j = 1$ such that
$$\sum_{i=1}^{m_i} u_i a_i = \sum_{j=1}^{m_j} v_j b_j.$$
Condition 4: If the $k$th row of A has all elements $a_i^k$, $i = 1,2,\ldots,m_i$,
equal to one, and some $b_r^k$ equals zero,
no $u_i \geq 0$, $\sum_{i=1}^{m_i} u_i = 1$ and $v_j \geq 0$, $v_r > 0$, $\sum_{j=1}^{m_j} v_j = 1$
exist such that
$$\sum_{i=1}^{m_i} u_i a_i = \sum_{j=1}^{m_j} v_j b_j.$$
Justification 4: Condition 4 is similar to Condition 3 in that any
convex combination of the columns of B that includes a
non-zero product of the $r$th column yields a $k$th row
term whose value cannot equal any convex combination
of the $k$th row elements of A. Symbolically,
for any choice of $u_i$ and $v_j$, where $v_r > 0$,
$$\sum_{j=1}^{m_j} v_j b_j^k < \sum_{i=1}^{m_i} u_i a_i^k = 1.$$

Note that Conditions 3 and 4 can also be stated, and justified, with
the role of the elements of the A and B matrices reversed.
Given this set of four conditions, consider the following row
partition of the A and B matrices:
$$A = \begin{bmatrix} A^* \\ A_1 \\ A_0 \\ A_1^c \\ A_0^c \end{bmatrix}, \qquad B = \begin{bmatrix} B^* \\ B_1^c \\ B_0^c \\ B_1 \\ B_0 \end{bmatrix}$$
where by appropriate change of rows in A and B
1. every element in each row of $A_1$ is a one
2. every element in each row of $B_1$ is a one
3. every element in each row of $A_0$ is a zero
4. every element in each row of $B_0$ is a zero.
The partitions $A_1^c$, $B_1^c$, $A_0^c$, and $B_0^c$ are the rows of A and B corresponding
to $B_1$, $A_1$, $B_0$, and $A_0$, respectively, and $A^*$ and $B^*$ are the remaining rows
of A and B. With this partitioning and the four previously established
conditions, the size of the data vectors associated with many of the
[p(p-1)]/2 subproblems P2 can be significantly reduced. The reduction
process, Procedure 1, can be stated in this manner:
Step 1: If for some row k in $A_1$ ($B_1$) each element in the corresponding
row of $B_1^c$ ($A_1^c$) is equal to one, then row k
of A and B can be eliminated by Condition 2.
Step 2: If for some row k in $A_0$ ($B_0$) each element in the corresponding
row of $B_0^c$ ($A_0^c$) is equal to zero, then row k of
A and B can be eliminated by Condition 2.
Step 3: If for some row k in $A_0$ ($B_0$) the corresponding row in
$B_0^c$ ($A_0^c$) has all elements equal to one or if for some row
k in $A_1$ ($B_1$) the corresponding row in $B_1^c$ ($A_1^c$) has all
elements equal to zero, then this particular subproblem
P2 has no feasible solution by Condition 1. Procedure 1
and the search for a solution to P2 are terminated at
this point because the convex hulls of pattern-classes
A and B do not intersect.
Step 4: If for some row k in $A_1$ ($B_1$) the corresponding row in
$B_1^c$ ($A_1^c$) has one or more elements equal to zero, i.e.,
$b_r^k = b_s^k = \ldots = b_t^k = 0$ ($a_r^k = a_s^k = \ldots = a_t^k = 0$), then
columns $b_r, b_s, \ldots, b_t$ ($a_r, a_s, \ldots, a_t$) can be eliminated by
Condition 3.
Step 5: If for some row k in $A_0$ ($B_0$) the corresponding row in
$B_0^c$ ($A_0^c$) has one or more elements equal to one, i.e.,
$b_r^k = b_s^k = \ldots = b_t^k = 1$ ($a_r^k = a_s^k = \ldots = a_t^k = 1$), then
columns $b_r, b_s, \ldots, b_t$ ($a_r, a_s, \ldots, a_t$) can be eliminated by
Condition 4.
Step 6: If the use of Steps 1, 2, 4, and 5 has eliminated all
elements of both matrices, then this particular subproblem
has an infinite number of feasible solutions by Condition
2. Procedure 1 and the search for a solution to P2 are
terminated at this point because the convex hulls of the
pattern-classes A and B intersect.
Step 7: If the use of Steps 1, 2, 4, and 5 has eliminated one or
more rows or columns from either matrix then repartition
the matrices and return to Step 1, otherwise terminate
Procedure 1.
In coding Procedure 1 for computer processing, there is no need to
physically partition the rows of the A and B matrices. Summing the
elements in any row of A or B reveals whether the individual elements in
the row are all equal to zero or are all equal to one. Given this
information, the steps from Procedure 1 determine whether a pattern is
removed from A or B, whether a row in A and B is removed, or whether the
procedure should be terminated because no feasible set of convex
combinations for P2 exists.
As an example of the use of Procedure 1 consider the set of matrices
A and B in subproblem P2 where
$$A = \begin{bmatrix} 0 & 1 & 1 & 0 \\ 1 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 1 & 1 \end{bmatrix}, \qquad B = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 0 & 0 & 0 \\ 1 & 1 & 1 & 0 \\ 1 & 1 & 1 & 0 \end{bmatrix}$$
In the first application of the steps of Procedure 1:
1. Column 4 can be eliminated from matrix A by Step 4 and
2. Column 1 can be eliminated from matrix A by Step 5.
After the first application of the steps of the procedure
$$A = \begin{bmatrix} 1 & 1 \\ 0 & 0 \\ 0 & 0 \\ 1 & 1 \end{bmatrix}, \qquad B = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 0 & 0 & 0 \\ 1 & 1 & 1 & 0 \\ 1 & 1 & 1 & 0 \end{bmatrix}$$
In the second application of the steps of Procedure 1:
1. Row 1 can be eliminated from both matrices by Step 1
2. Row 2 can be eliminated from both matrices by Step 2 and
3. Column 4 can be eliminated from matrix B by Step 4.
After the second application of the steps of the procedure
$$A = \begin{bmatrix} 0 & 0 \\ 1 & 1 \end{bmatrix}, \qquad B = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix}$$
In the third application of the steps of Procedure 1:
1. Row 2 can be eliminated from both matrices by Step 1 and
2. Procedure 1 can be terminated by Step 3.
Hence, for this set of A and B matrices, subproblem P2 has no feasible
solution.
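Procedure 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the author's program: the function name `procedure_1` and its return convention are hypothetical, all eliminations found in one pass are applied together (so intermediate matrices may differ from the hand-worked trace above, although the verdict on that example is the same), and a class left with no admissible pattern is treated as having a hull that cannot intersect the other.

```python
def procedure_1(A, B):
    """Reduce subproblem P2 in the spirit of Procedure 1.

    A and B are lists of rows; each row is one feature and each column is one
    pattern, with every entry 0 or 1. Returns (verdict, rows, cols_a, cols_b),
    where verdict is "disjoint", "intersect", or "undecided" and the remaining
    lists hold the surviving row and column indices.
    """
    rows = list(range(len(A)))
    cols_a = list(range(len(A[0])))
    cols_b = list(range(len(B[0])))
    while True:
        drop_rows, drop_a, drop_b = set(), set(), set()
        for k in rows:
            a = [A[k][i] for i in cols_a]
            b = [B[k][j] for j in cols_b]
            a0, a1 = not any(a), all(a)
            b0, b1 = not any(b), all(b)
            if (a0 and b1) or (a1 and b0):
                return "disjoint", rows, cols_a, cols_b  # Step 3
            if (a1 and b1) or (a0 and b0):
                drop_rows.add(k)  # Steps 1 and 2
            # Steps 4 and 5, also applied with the roles of A and B reversed.
            if a1:
                drop_b.update(j for j in cols_b if B[k][j] == 0)
            if a0:
                drop_b.update(j for j in cols_b if B[k][j] == 1)
            if b1:
                drop_a.update(i for i in cols_a if A[k][i] == 0)
            if b0:
                drop_a.update(i for i in cols_a if A[k][i] == 1)
        rows = [k for k in rows if k not in drop_rows]
        cols_a = [i for i in cols_a if i not in drop_a]
        cols_b = [j for j in cols_b if j not in drop_b]
        if not cols_a or not cols_b:
            # Assumption of this sketch: with no admissible pattern left in a
            # class, no convex combination exists, so the hulls are disjoint.
            return "disjoint", rows, cols_a, cols_b
        if not rows:
            return "intersect", rows, cols_a, cols_b  # Step 6
        if not (drop_rows or drop_a or drop_b):
            return "undecided", rows, cols_a, cols_b  # Step 7 exit


# The example matrices from the text.
A = [[0, 1, 1, 0], [1, 0, 0, 0], [1, 0, 0, 0], [0, 1, 1, 1]]
B = [[1, 1, 1, 1], [0, 0, 0, 0], [1, 1, 1, 0], [1, 1, 1, 0]]
print(procedure_1(A, B)[0])  # disjoint
```

An "undecided" verdict means the reduced matrices still have to be passed to a linear-programming check.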
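Although the use of Procedure 1 may lead to a reduction in the size
of most subproblems, the pattern vectors ($a_i$ and $b_j$) for each of these
problems may still be quite large. Restating subproblem P2 as a linear
program yields
$$\text{P3:} \quad \text{minimize } [\,0 \;\; 0\,] \begin{bmatrix} U \\ V \end{bmatrix}$$
subject to
$$\begin{bmatrix} A & -B \\ 1\,1 \cdots 1 & 0\,0 \cdots 0 \\ 0\,0 \cdots 0 & 1\,1 \cdots 1 \end{bmatrix} \begin{bmatrix} U \\ V \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}$$
and
$$U \geq 0, \quad V \geq 0$$
where the existence of any solution vectors $U^*$ and $V^*$ signals the
intersection of the convex hulls of pattern-classes A and B.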

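Consider the dual of P3, written in the following form:
$$\text{P4:} \quad \text{maximize } [\,0 \;\; 1 \;\; 1\,] \begin{bmatrix} \pi \\ \lambda_1 \\ \lambda_2 \end{bmatrix}$$
subject to
$$\begin{bmatrix} A^T & 1 & 0 \\ -B^T & 0 & 1 \end{bmatrix} \begin{bmatrix} \pi \\ \lambda_1 \\ \lambda_2 \end{bmatrix} \leq \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$
$\pi$, $\lambda_1$, $\lambda_2$ unrestricted in sign.
Note that P4 may have many associated $\pi_i$ variables, but has only as many
constraints as the number of patterns in A and B (as reduced by Procedure
1). P4 always has at least one solution to its constraint set. Thus, if
an application of a linear-programming algorithm to P4 reveals the
existence of an unbounded solution, then P2 has no solution. Therefore, if
and only if P4 has a bounded solution do $u_i$ and $v_j$ exist such that
$$\sum_{i=1}^{m_i} u_i a_i = \sum_{j=1}^{m_j} v_j b_j$$
where
$$u_i \geq 0, \quad \sum_{i=1}^{m_i} u_i = 1$$
and
$$v_j \geq 0, \quad \sum_{j=1}^{m_j} v_j = 1.$$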
The preceding discussion with its development of a reduction
procedure and dual formulation provides the structure for a second procedure.
Procedure 2 establishes a mechanism to verify the feasibility of any
assignment of zeros and ones to the $\hat{X}$ vector of problem P1 (see Figure 4).
That is, given some vector $\hat{X}$ and a set of patterns $a_i^m$, $m = 1,2,\ldots,m_i$,
and $i = 1,2,\ldots,p$, the [p(p-1)]/2 subproblems P2 are formed by zeroing out
the appropriate pattern-vector elements. Then Procedure 1 is applied
to each subproblem. Finally, for each pair of pattern classes the
boundedness of the dual formulation P4 is examined. Vector $\hat{X}$ represents
a feasible set of pattern-classifying features for P1 if and only if
each of the [p(p-1)]/2 subproblem formulations P4 is unbounded.
[Figure 4. Procedure 2]
Before a statement of the algorithm to solve problem P1 is presented
several terms must be defined. The assignment vector is defined as a
listing of variables $x_k$, elements of the vector X in P1, whose values have
been determined by the steps of the algorithm. The elements in this
vector are recorded with the value of their assignment, either zero or one.
These elements are entered in the vector in the order they were assigned,
with the first algorithm assignment in the first (left) position. For
example, consider the assignment vector
$[x_4 = 0,\; x_{10} = 1,\; x_2 = 0]$.
This vector records that the algorithm first assigned $x_4$ equal to zero,
then assigned $x_{10}$ equal to one, and its last assignment was $x_2$ equal to
zero. Feasibility of a solution X, as determined by the assignment-vector
component values, is checked by Procedure 2 with the value of those
variables not included in the assignment vector temporarily set equal to one.
The value V of an assignment vector is defined as minus one times the
sum of the costs associated with each of the variables in the assignment
vector, multiplied by the value assigned to the respective variable.
For the example assignment vector, $[x_4 = 0,\; x_{10} = 1,\; x_2 = 0]$, where
$c_4 = 5$, $c_{10} = 2$, and $c_2 = 7$, the assignment vector has the value
$$V = (-1)[5(0) + 2(1) + 7(0)] = -2.$$
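The value of an assignment vector can be computed directly from its definition. The sketch below is illustrative only; the list-of-pairs representation and the name `assignment_value` are assumptions made for the example.

```python
def assignment_value(assignment, cost):
    """Value V: minus the summed cost of the variables that are assigned one.

    assignment: list of (variable index, assigned value) pairs, in the order
    the assignments were made.
    cost: dict mapping variable index to the cost of using that feature.
    """
    return -sum(cost[var] * val for var, val in assignment)


# The example from the text: [x4 = 0, x10 = 1, x2 = 0] with c4 = 5, c10 = 2, c2 = 7.
assignment = [(4, 0), (10, 1), (2, 0)]
cost = {4: 5, 10: 2, 2: 7}
print(assignment_value(assignment, cost))  # -2
```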

3.4.2 Statement of the Minimum-Cost Symptom-Selection Algorithm
Step 0: Create the assignment vector (at this point the vector is
null as there is no variable assignment in the vector).
Set $V^* = -\infty$ and go to Step 4.
Step 1: Start at the right side of the assignment vector and move
to the left, stopping at the first variable assigned a zero
value. If no variable in the assignment vector has a
zero assignment, go to Step 2. Otherwise go to Step 3.
Step 2: Calculate V for the assignment vector. If V is greater
than $V^*$, record the values of the variables in the
assignment vector as the optimal solution $X^*$ to P1. Otherwise,
record (as the optimal solution $X^*$ to P1) the values of the
variables in the best current solution X. Terminate the
algorithm.
Step 3: Change the value of the variable isolated in Step 1 to an
assigned value of one, and eliminate from the assignment
vector all variable assignments to the right of this new
assignment. If the assignment vector includes the
assignment $x_k = 1$ for every $x_k$ in X return to Step 2. Otherwise go
to Step 4.
Step 4: Select a variable $x_k$ that is not an element of the
assignment vector. Assign this variable the value $x_k = 0$ in the
assignment vector. Use Procedure 2 to check the feasibility
of this assignment. If the assignment vector is not
feasible, go to Step 6. Otherwise go to Step 5.
Step 5: If the assignment vector with the new assignment $x_k = 0$ does
not include an assignment for every $x_k$ in X, return to
Step 4. Otherwise go to Step 7.
Step 6: If the assignment vector with the assignment $x_k = 1$ ($x_k$ is the
variable selected in Step 4) does not include an assignment
for every $x_k$ in X, return to Step 4. Otherwise go to Step 7.
Step 7: Calculate V for the assignment vector. If $V^*$ is greater
than V, go to Step 1. Otherwise go to Step 8.
Step 8: Record as the best current solution X the values of the
variables in this assignment vector. Set $V^* = V$, and return
to Step 1.
Note that in the course of applying this algorithm all solutions are
considered and the best current solution is replaced only when another
solution has a larger associated value. As the number of possible solutions
is finite, the algorithm must terminate, and at this termination the value
of the optimal solution and its assignments are known. An application of
the minimum-cost symptom-selection algorithm is presented in Appendix C.
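The enumeration can be sketched compactly in Python. This is a recursive depth-first restatement in the spirit of the steps above, not a literal transcription of them: variables are taken in a fixed order, total cost is minimized directly instead of maximizing V, and the feasibility oracle `is_feasible` is a hypothetical stand-in for Procedure 2. The toy oracle and costs are invented for illustration.

```python
def min_cost_features(costs, is_feasible):
    """Return (x, cost) for a minimum-cost feasible 0/1 feature selection.

    costs: list of feature costs.
    is_feasible(x): stand-in for Procedure 2; takes a 0/1 list and reports
    whether the selected features still separate every pair of classes.
    """
    n = len(costs)
    best = {"cost": float("inf"), "x": None}

    def search(k, x):
        # x[:k] are assigned; x[k:] are held at the provisional value one.
        if k == n:
            cost = sum(c for c, v in zip(costs, x) if v)
            if cost < best["cost"]:
                best["cost"], best["x"] = cost, x[:]
            return
        x[k] = 0
        if is_feasible(x):  # try discarding feature k first
            search(k + 1, x)
        x[k] = 1  # then explore the branch that keeps feature k
        search(k + 1, x)

    start = [1] * n
    if not is_feasible(start):
        raise ValueError("the full feature set must be feasible")
    search(0, start)
    return best["x"], best["cost"]


def toy_oracle(x):
    """Hypothetical oracle: feature 1 alone, or features 2 and 3 together, suffice."""
    return bool(x[0] or (x[1] and x[2]))


print(min_cost_features([5, 2, 2], toy_oracle))  # ([0, 1, 1], 4)
```

Because every unassigned variable is held at one during a feasibility check, an infeasible zero assignment rules out the whole branch beneath it, which is the only pruning this sketch performs.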
3.4.3 Computational Considerations
Returning to the setting of diagnostic classification of
craniofacial-pain patients, application of the minimum-cost symptom-selection
algorithm would require an enumeration (explicit or implicit) over $2^{295}$
possible solutions in order to find the optimal collection of data-vector
elements. As the number of possible solutions is prohibitively large,
heuristic modifications to the symptom-selection algorithm are required
for this application. One possible modification could employ the fact that
only a few of the elements in the patient data vector have large associated
'costs' for their utilization. In particular, the eight elements of
radiographic data and the two measures of emotional trauma are
significantly more 'costly' to examine than the other items in the data vector.
With this modification, the algorithm would only consider eliminating
these ten high-cost features. Another heuristic approximation to the
optimal collection of features might rank the data-vector elements in
order of descending cost of utilization. Procedure 2 would then be used
to eliminate these components one by one, starting with the item of
highest cost, until the procedure signaled an infeasible solution to P1.
Certainly, other heuristics might also be developed to exploit the structure
of this algorithm.
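The second heuristic, eliminating features in order of descending cost until the first infeasible elimination, can be sketched as follows. The names `greedy_drop` and `toy_oracle` and the cost figures are hypothetical; the oracle again stands in for Procedure 2.

```python
def greedy_drop(costs, is_feasible):
    """Drop features from most to least costly, stopping at the first failure.

    Returns (x, cost) where x is the resulting 0/1 selection vector.
    """
    n = len(costs)
    x = [1] * n
    # Visit feature indices in order of descending cost.
    for k in sorted(range(n), key=lambda i: costs[i], reverse=True):
        x[k] = 0
        if not is_feasible(x):
            x[k] = 1  # restore the feature and stop, as the text describes
            break
    return x, sum(c for c, v in zip(costs, x) if v)


def toy_oracle(x):
    """Hypothetical oracle: feature 1 alone, or features 2 and 3 together, suffice."""
    return bool(x[0] or (x[1] and x[2]))


print(greedy_drop([5, 2, 2], toy_oracle))  # ([0, 1, 1], 4)
```

The heuristic needs at most one feasibility check per feature, against an enumeration over every 0/1 vector, but it carries no guarantee of optimality.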
3.5 Model Applications
The structure of the craniofacial-pain diagnostic-classification
model permits model utilization for a variety of purposes. Since the
model is developed in terms of general data-vector and
diagnostic-alternative parameters, these model components can be altered to suit the
application in question. This section presents a brief discussion of some of
the possible applications of the diagnostic classifier.
In a teaching environment, the diagnostic-classification model with
its set of discriminant weights can be stored for computer-terminal
access. Then, on a set of tutorial example patients, students can compare
their diagnoses with those of the diagnostic model. Moreover, the student
can interact with the classifier in constructing his own 'sample' patients
for the classifier to diagnose. Finally, the student can request the
classifier to relate those discriminant-function weights that the model
employs in considering the 'significance' (Section 3.2) of any one or
group of symptoms.
The effectiveness of new diagnostic tests can be evaluated using the
minimum-cost symptom-selection algorithm. This algorithm provides an
immediate measure of the 'worth' of new research developments. Given a
cost for employing a new test, the algorithm returns an evaluation of
the test's classifying capability. The algorithm reveals whether the
test is included in the minimum-cost collection of features and whether
the use of the new test permits the practitioner to discontinue other
examination procedures. Additionally, the algorithm can be employed to
point out new areas for research, as it isolates diagnostic alternatives
where correct classification of patients is difficult using existing tests
and procedures.
As employed in the practitioner's office, the diagnostic classifier
will provide a direct link between the practicing dentist and the
knowledge of experts in the field of craniofacial pain. Information will
flow over the link in both directions. As new patients are seen by the
practitioner, the record of each visit will be reviewed by experts and
then used to supplement the data base employed in model construction.
Then, when developments dictate, new sets of discriminant-function weights
can be transmitted to the dental practitioners. This kind of interaction
results in a more accurate and representative diagnostic classifier as
the patient-sample data base becomes larger.

CHAPTER 4
TREATMENT PLANNING
The selection of treatment regimens for craniofacial-pain patients
is modeled as a Markovian decision process. The states in this Markovian
model are descriptions of a patient's health-care status and the
decision alternatives are feasible treatments for the patient's
dysfunction (see Section 4.1). In the first two sections of this chapter,
motivation for the model structure is provided and the components of
the decision model are developed. The third section provides a description
of the validating procedures used to determine the appropriateness
of the model and the model-generated treatment decisions. This chapter
closes with a discussion of potential teaching, research, and
private-practice applications of the treatment-planning model.
4.1 Model Components
Several model-building components from the craniofacial-pain care
system are isolated to permit the construction of a Markovian
representation of this system. A set of state descriptions that characterize,
for decision-making purposes, the status of craniofacial-pain patients
is presented in Section 4.1.1. Then transition probabilities measuring
the effects of treatment applications are discussed in Section 4.1.2.
Section 4.1.3 overlays the model's state descriptions and transition
probabilities with costs accrued during the patient's progression through
the care system. These components are integrated and verified in the
discussions of Sections 4.2 and 4.3.
Values for many of the treatment-planning model's parameters were
gathered from the set of patient records discussed in Section 3.1. As
the patient histories from the contributing university dental clinics
were reviewed, notations of treatment applications and time between
successive visits were made for each patient-practitioner interaction. The
values of the remaining model parameters were either estimated by the
reviewing practitioners, Dr. Fast and Dr. Mahan, or were gathered from
responses to questionnaires completed by patients who visited the
University of Florida's Dental Clinic. In modeling the complicated
process of care for craniofacial-pain patients, several simplifying
assumptions were made. This section provides the motivation for these
assumptions and presents the notation employed in the analytic description of
the treatment-planning process.
4.1.1 Patient States
In general, a Markovian system structure requires that the current
state of the system completely characterizes the probabilities associated
with future state occupancies of the system. To fully satisfy this
Markovian condition for state structure in the craniofacial-pain
treatment-planning model would require that the model include as distinct
model states every possible combination of diagnostic classifications a
patient might have occupied, in conjunction with every combination of
treatment applications he might have undergone, during his stay in the care
system. Unfortunately, such a model would have an infinite number of
patient states.
However, for a majority of craniofacial-pain patients the
knowledge of a patient's prior treatment record, coupled with his current
diagnostic classification, is adequate to determine his prior diagnostic
classifications. Even in the cases where the current classification
and prior treatment record do not provide a total description of a
patient's condition, these elements of patient status do provide
significant information about the probabilities associated with a patient's
future status in the care system. For example, in the data employed in
model construction, 47 craniofacial-pain patients occupied Diagnostic
Alternative 15 and were treated with an application of drugs at least
once. Eight of these patients were 'well' after a first treatment with
drugs, while 39 required multiple applications of drugs or other
treatments during their stay in the system. Yet of the 12 patients who were
given two applications of drugs, 9 were 'well' following the second
repetition of drug therapy. Thus, while the overall data-based
transition-probability estimate for a transition from Diagnostic Alternative
15 into the well state following any one application of drugs is .36,
the transition-probability estimate for a transition into the well state
following two successive applications of drugs is .75. Hence, for this
diagnostic classification, information on the prior application of drugs
is important in determining a patient's future status in the care system.
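The history-conditioned estimate quoted above is a simple relative frequency, as the short sketch below shows. Only the counts stated in the text are used; the variable names are illustrative, and the overall .36 figure is not rederived here because the text does not give the counts behind it.

```python
from fractions import Fraction

# Counts stated in the text for Diagnostic Alternative 15 and drug therapy.
treated_at_least_once = 47
well_after_first = 8
treated_twice = 12
well_after_second = 9

# Relative-frequency estimate for reaching 'well' after a second application.
p_second = Fraction(well_after_second, treated_twice)
print(float(p_second))  # 0.75

# The same kind of estimate for the first application alone.
p_first = Fraction(well_after_first, treated_at_least_once)
print(round(float(p_first), 2))  # 0.17
```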
This form of 'current diagnostic classification augmented by
treatment record' patient-state description is employed in the
craniofacial-pain treatment-planning model as an approximation to a 'true' Markovian
state structure. Each of the diagnostic alternatives shown in Figure 3
forms the basis for a collection of patient states. The diagnostic
alternative is augmented with a record of treatments that have been applied
since the patient entered the care system. Appendix D provides a list
of the treatment alternatives that may be prescribed for
craniofacial-pain patients. The record of each treatment given to the patient is noted
in the patient-state descriptions without regard to its chronological
order. For example, a patient's occupation of the state 'J|1,2,2'
denotes that he is currently classified in diagnostic alternative J,
and that since he entered the care system he has been treated with one
application of treatment 1 and two applications of treatment 2.
.Augmenting the patient-state descriptions with treatment history
expands the dimensionality of the state space, yet the number of history-
augmented states remains finite for two reasons. The treatment records
used in model construction reveal that, for sane combinations of diag
nostic alternatives and treatment applications, there is a feasible
limit to the number of treatment repetitions that can be given to any
one patient. Thus, the first reason for a finite state space is that no
patient state in the treatment-planning model includes more repetitions
of a particular treatment than the clinical data have established as a
feasible limit. As an example, the records of patient visits used in
model construction establish a feasible limit of only one application
of treatment 18 for patients classified in any of the diagnostic alter
natives. Therefore, the treatment-planning model includes patient states
that exclude treatment 18 as a portion of their treatment history or
exhibit the form
'Jl...,18,...
for each diagnostic classification J* where 18 is a feasible treatment.
The second reason for a finite state space is that there is a boundary
application* of many treatments such that neither the treatment-record
data nor the reviewing practitioners established differences between the
transition probabilities for the boundary application and those for
further repetitions of the treatments (see Section 4.1.2 and Appendix E).
In Diagnostic Alternative 13, for example, the first application of
treatment 24 is the boundary repetition of that treatment. Hence, multiple
repetitions of treatment 24 are not added to the state description of
patient states based on Diagnostic Alternative 13, as the additional
information on multiple applications does not influence the transition
probabilities associated with this treatment's effectiveness. Thus, a
second application of treatment 24 for a patient who continues to be
classified in Diagnostic Alternative 13 places the patient in a state
of the form
'13|...,24,...'
The craniofacial-pain treatment-planning model includes two terminal
patient states in addition to the patient states that are based on
diagnostic alternatives. One or the other of these two terminal states,
'well' or 'referred,' represents the patient's status when he exits the
care system. A patient exits the system in the 'well' state when the
effects of treatment applications result in sufficient improvement that
no further treatment is required. The patient moves into the 'referred'
state in lieu of further treatment. This alternative to treatment is
selected when the 'expected costs' of remaining in the care system
exceed the costs of referring the patient to another source of care
(see Section 4.1.3).
4.1.2 Transition Probabilities
Patient-state transitions that involve a change of diagnostic
classification follow one of two basic formats (see Figure 5). For the initial
diagnostic classifications in Format I, with each treatment application,
the patient either remains in his original diagnostic classification or
transits into the well state. For Format II, the six diagnostic
alternatives shown in the lower illustration form a different structure.

Format I
Patients whose first-visit diagnostic classification is Diagnostic
Alternative 1, 2, 3, 4, 5, 6, 10, 11, 14, 16, or 17 make transitions out
of their original classification 'I' according to the following figure:
[Figure 5, Format I: transition diagram not reproduced.]
Format II
For patients originally classified in Diagnostic Alternative 7, 8, 9, 12,
13, or 15, the following kinds of diagnostic-classification transitions
are possible:
[Figure 5, Format II: transition diagram not reproduced.]

Here it is possible for the patient to alternate between any one of
several diagnostic classifications during the course of his stay in the
care system. Note that in both formats for diagnostic-classification
transitions a patient moves into the referred state not as a result of
a treatment application, but rather as an alternative to further
treatment.
To these underlying diagnostic-classification transitions the
craniofacial-pain treatment-planning model adds a record of the changes in
treatment history. Appendix F displays complete charts of all of the
diagnostic-alternative-based patient states included in the treatment-
selection model. In these charts the patient states are connected by
arcs that represent feasible transitions from one state to another. Not
shown in the charts are the well and referred patient states and the arcs
that connect every diagnostic-alternative-based state with these terminal
states.
Howard [25] establishes that, in terms of the policy decisions
generated by a Markovian decision model, holding-time distributions are
important only insofar as they affect the mean waiting time in each
system state and the expected costs of each state occupancy. The records
of the patient visits employed in model construction revealed that, in
the care of the patients described by the data, one or more treatments
were prescribed at each visit, and a series of return visits was scheduled
for the patient following his initial interaction with the practitioner
if return visits were warranted. Under these conditions, specifying
holding-time distributions for the time between successive patient-state
transitions does not refine the model. Therefore, the treatment-planning
model employs a Markovian rather than semi-Markovian representation of
the care system, since an 'n'-visit holding time in a particular patient
state can be modeled with no loss of information as 'n' repetitions of
the 'virtual' transition from the state in question to itself. Care for
craniofacial-pain patients is modeled as a discrete-stage Markovian
system with the beginning of visits to the practitioner serving as stage
indicators.
Using the history-augmented patient states, transition probabilities
are specified in terms of the treatment that generated the transition.
In making a state transition following a treatment, a patient must move
to a state that includes that treatment as a portion of its state
description. For example, following application of treatment 'k,' a patient
must progress from patient-state 'I|m,n' to 'J|k,m,n' where 'I' may be
equivalent to 'J.' The only exception to this rule is in the application
of a treatment beyond its boundary number of repetitions. Here, if
treatment 'k' has a boundary number of two, then following an application of
treatment 'k' three or more times a patient progresses from patient state
'I|k,k,m,n' to 'J|k,k,m,n' where again 'I' may be equivalent to 'J.'
This structure is indicated because inclusion of more than the boundary
number of applications (two in this case) in the state description does
not affect the transition probabilities.
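
The state-update rule just described can be illustrated with a minimal Python sketch. The function name `next_state`, the tuple representation of a treatment history, and the `boundary` dictionary keyed by (diagnostic alternative, treatment) are assumptions introduced here for illustration; they are not part of the original model description.

```python
from collections import Counter

def next_state(history, treatment, new_diagnosis, boundary):
    """Return the (diagnosis, history) state reached after one treatment.

    history:   tuple of treatment numbers already applied (order is ignored,
               as in the patient-state descriptions).
    boundary:  hypothetical dict mapping (diagnosis, treatment) to the
               boundary number of repetitions; a missing key is taken to
               mean that no boundary applies.
    """
    counts = Counter(history)
    counts[treatment] += 1
    limit = boundary.get((new_diagnosis, treatment))
    if limit is not None:
        # Applications beyond the boundary number are not recorded.
        counts[treatment] = min(counts[treatment], limit)
    return new_diagnosis, tuple(sorted(counts.elements()))

# 'I|m,n' to 'J|k,m,n' with k=3, m=5, n=7 and no boundary on treatment 3
print(next_state((5, 7), 3, "J", {}))
# 'I|k,k,m,n' to 'J|k,k,m,n' when treatment 3 has a boundary number of two
print(next_state((3, 3, 5, 7), 3, "J", {("J", 3): 2}))
```

Storing the history as a sorted tuple reflects the fact that treatments are recorded without regard to chronological order, so two patients with the same treatments in a different sequence occupy the same state.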
Estimates of the values of the transition probabilities were
obtained from the patient records discussed previously. A discussion of
the stability of these probability estimates under variations in patient
data is presented in Appendix E. Where the data on the effects of
treatment alternatives were limited, the data-generated probability estimates
were refined by estimates from the reviewing practitioners. Notationally,
transition probabilities are represented in the analytic model in the
following form:
$p_{IJ}^{k}$ = the probability of making a transition from
patient-state 'I' to patient-state 'J' following
the application of treatment-alternative 'k.'
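
As an illustration of how such probabilities might be estimated from visit records, the following Python sketch computes relative frequencies. The use of plain relative frequencies, the record layout (state, treatment, next state), and the function name are assumptions made here; the text states only that the estimates were obtained from patient records and refined by the reviewing practitioners.

```python
from collections import defaultdict

def estimate_transition_probabilities(records):
    """Estimate p[(I, k)][J] as the observed fraction of transitions.

    records: iterable of (state, treatment, next_state) triples, one per
             treatment application (a hypothetical record layout).
    """
    counts = defaultdict(lambda: defaultdict(int))
    for state, treatment, next_state in records:
        counts[(state, treatment)][next_state] += 1

    probabilities = {}
    for key, outcomes in counts.items():
        total = sum(outcomes.values())
        probabilities[key] = {j: n / total for j, n in outcomes.items()}
    return probabilities

# Hypothetical records for one state and one treatment
sample = [("I", 1, "W"), ("I", 1, "W"), ("I", 1, "I|1"), ("I", 1, "J|1")]
print(estimate_transition_probabilities(sample)[("I", 1)])
```

With few records per state and treatment, such frequencies are unstable, which is consistent with the text's note that sparse estimates were adjusted by expert judgment.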
4.1.3 Cost Structure
A patient's progression through the craniofacial-pain system
generates a multitude of implicit and explicit costs. The explicit costs can
be measured in terms of the dollar charges paid by the patient or the
practitioner during the patient's stay in the system. Other costs are
implicit in nature and can be quantified only as they relate to the
opportunities lost by the patient and the practitioner while the
patient remains in the care system. For modeling purposes four major
system costs have been isolated. These costs are:
(a) Cost of treatment applications
(b) Cost of the practitioner and his staff's
services
(c) Cost to the patient of occupying a non-well
patient-state
(d) Patient-referral cost.
Although these costs do not encompass all of the system costs, they
measure significant explicit and implicit charges associated with a patient's
stay in this system. In the treatment-planning model, each of these costs
is charged on a per-patient-visit basis.
Costs of the various treatment applications and the costs associated
with the practitioner and his staff's services were estimated by the
reviewing practitioners. Estimates of treatment and care-system service
costs were partitioned by diagnostic classification as well as treatment
category. The cost estimates reflect typical charges in a dental clinic
environment.
The inconvenience experienced by a patient in making a visit to the
practitioner was used as a measure of the cost of occupying a 'non-well'
patient state. Estimates of this inconvenience cost were gathered from
responses to a questionnaire completed by patients at the University of
Florida's Dental Clinic. These were general dental patients not
necessarily suffering from craniofacial pain. Figure 6 shows the distribution
of these patient estimates.
Values for patient-referral costs were composed of the sum of three
distinct estimates. The first component was an estimate of the total
fee charged by the practitioner receiving the referred craniofacial-pain
patient. Record transferral and duplication costs, as well as the fees
lost by the referring practitioner, formed the second component. The
third component of the patient-referral cost was a measure of the
inconvenience experienced by the referred patient, a value estimated by using
a multiple of the value of the inconvenience cost discussed in the
preceding paragraph. Appendix G provides a justification for using this
particular combination of components in the referred-cost estimates.
Symbolically, the patient-state transition costs (negative constants)
are represented in the analytical model as
$c_{IJ}^{k}$ = the sum of the costs generated by the transition
from patient-state 'I' to patient-state 'J'
following the application of treatment 'k.'
This sum includes the type (a), (b), (c), and (d) costs appropriate to
each patient-state transition.

Fifty-eight patients at the University of Florida's Dental Clinic responded
to the following question:
How much would you estimate that this trip to the
Dental Clinic cost you in terms of lost wages, baby-
sitting fees, transportation costs, and other costs
that you may have had to pay so that you could
be here for your appointment?
The distribution of these estimates is shown in this histogram.
[Histogram not reproduced; horizontal axis in dollars, with class
boundaries at .99, 9, 19, 29, 39, 49, 59, 69, 79, and 300.]
The mean value for these 58 estimates of patient-visit inconvenience costs
was $30.72.
FIGURE 6
PATIENT-VISIT INCONVENIENCE COST

4.2 Selection of Optimal Treatments
The craniofacial-pain treatment-planning model is transient in the
sense that only two of the model's patient states, well and referred, can
represent the patient's status when he exits the health-care system. In
a stochastic sense, only the terminal states are recurrent as they alone
possess non-zero long-run probabilities of state occupancy. Hence, the
choice of treatment alternatives at each patient state is made with the
goal of minimizing the costs accrued by the patient as he passes through
the diagnostic-alternative-based patient states into one of the recurrent
states.
For notational convenience, in the analytic model the well patient
state is denoted as state 'W' and the referred state as state 'R.' In
modeling the care system for craniofacial-pain patients there is no
justification for providing costs for the transitions from states 'R'
and 'W' to themselves; hence, $c_{RR}$ and $c_{WW}$ are set equal to zero.
Analytically, the treatment-planning model is made monodesmic, i.e.,
having only one recurrent state, by defining $p_{RW}=1$ and $p_{WR}=0$. The
total number of states, not including states 'W' and 'R,' is denoted by
'S.' With these definitions and the notation introduced in the previous
section, a procedure for selecting the set of optimal treatment decisions
is developed.
Howard [25] has shown that for a monodesmic, transient Markovian
decision model, a set of optimal decisions is defined as those decisions
that maximize the expected value $v_I$ of occupying each system state 'I.'
Since the treatment-planning model for craniofacial-pain patients fits
into this category of decision model, a modification of Howard's algorithm
is employed to select optimal treatment regimes. The process of selecting
an optimal set of treatments is accomplished by finding the set of
treatment alternatives $k_1, k_2, \ldots, k_S$ that maximize each of the
$v_I^{k_I}$ (the expected value of occupying patient-state 'I' given
treatment alternative '$k_I$') where
$$v_I^{k_I} = r_I^{k_I} + \sum_{\text{all patient states } J} p_{IJ}^{k_I}\, v_J^{k_J}, \qquad I = 1,2,\ldots,S$$
and
$$r_I^{k_I} = \sum_{\text{all patient states } J} p_{IJ}^{k_I}\, c_{IJ}^{k_I}.$$
With treatment-augmented patient states, maximizing the $v_I^{k_I}$ can be
carried out in the following manner:
1. Group for simultaneous analysis all patient states possessing
a common treatment history, where one or more of the treatments in this
history are at their boundary level. Each of the 'T' sets of states
complying with this description forms an analysis set $B_j$, j=1,2,...,T.
2. Label sequentially the patient states, starting with state 'W'
as 1, state 'R' as 2, and then selecting numbers for the remaining unlabeled
patient states on the basis that the one with the most treatments in its
history receives the next number-label. For example, state 'I|1,2,2,4'
would be labeled with a smaller number than state 'I|2,6,6.' When the
numbering scheme reaches the members of one of the analysis sets isolated
in Step 1 (above), numbers for the members of that set may be arbitrarily
assigned. Given this state-numbering scheme, the selection of optimal
treatments can proceed dynamically since for each state 'I' that is not a
member of an analysis set, I=1,2,...,S, $I \notin B_j$, j=1,2,...,T,
$$v_I = r_I + \sum_{J=1}^{I-1} p_{IJ}\, v_J$$
and for the states of set $B_j$, j=1,2,...,T,
$$v_I = r_I + \sum_{J=1}^{t} p_{IJ}\, v_J + \sum_{J \in B_j} p_{IJ}\, v_J$$
where t = the number of the last non-group-$B_j$ state immediately
preceding the smallest number-labeled state in $B_j$.
Thus, the process of selecting optimal treatments proceeds
recursively from the state of smallest number-label to the one of largest
number-label, stopping to consider simultaneously the values of a number
of states only when an analysis set is encountered.
Howard's value iteration and policy improvement algorithm [25] is
employed only in the case of selecting treatments for the analysis-set
patient states. An example of this section's labeling and optimization
procedure is presented in Appendix H.
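
The recursive part of the labeling and optimization procedure can be sketched in Python as follows. This is a minimal sketch under stated assumptions: it covers only states outside the analysis sets (so every transition leads to a smaller number-label), it takes the values of states 'W' and 'R' as zero, and it treats referral as one more decision alternative. The function name, the data layout, and all numbers are hypothetical; the simultaneous treatment of analysis-set states by Howard's algorithm is not shown.

```python
def select_treatments(actions, terminal_values):
    """Choose, for each labeled state, the alternative with maximum value.

    actions: dict label -> dict alternative -> list of
             (next_label, probability, cost) triples, where costs are
             negative constants and next_label < label.
    terminal_values: dict giving the values of the terminal states.
    """
    values = dict(terminal_values)
    policy = {}
    for state in sorted(actions):          # smallest number-label first
        best_alternative, best_value = None, float("-inf")
        for alternative, outcomes in actions[state].items():
            # r_I + sum_J p_IJ v_J, written as sum_J p_IJ (c_IJ + v_J)
            value = sum(p * (c + values[nxt]) for nxt, p, c in outcomes)
            if value > best_value:
                best_alternative, best_value = alternative, value
        values[state] = best_value
        policy[state] = best_alternative
    return policy, values

# Hypothetical example: label 1 is 'W', label 2 is 'R'
actions = {
    3: {"treatment c": [(1, 1.0, -60.0)], "refer": [(2, 1.0, -100.0)]},
    4: {"treatment b": [(1, 0.5, -40.0), (3, 0.5, -40.0)],
        "refer": [(2, 1.0, -100.0)]},
}
policy, values = select_treatments(actions, {1: 0.0, 2: 0.0})
print(policy, values)
```

Because each state depends only on states with smaller labels, a single pass in label order suffices, which is the property the numbering scheme is designed to provide.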
This optimization procedure was applied to the states of the
craniofacial-pain treatment-planning model. Appendix G presents a list of the
optimal treatment selections for each of the model's patient states.
4.3 Model Validation
Validation of the craniofacial-pain treatment-planning model was
accomplished in two phases. In the first phase of validation, the
individual components of the Markovian representation were examined by the
reviewing practitioners. The second phase of model validation compared
model-generated treatment decisions with those made by the reviewing
experts. In addition, statistics generated by the model were compared to
the care-system description provided by the patient records from the
university dental clinics. This section discusses the results of these
validating efforts.
The review of model components was accomplished as values for the
model parameters were collected. Some of the data-based estimates of
transition probabilities and boundary-level application numbers did not
conform to expert judgment about the effects and effectiveness of
various treatment applications. When these disparities occurred, the
estimates were modified to reflect expert judgment.
The general structure of the patient states was reviewed to ensure
that the representation shown in Appendix F did in fact portray a set of
logical progressions through the care system. Although this examination
established the validity of the patient progressions, the review did
point out one deficiency in the model's structure. The number and types
of treatment alternatives available for use at each patient state were
determined by records of actual applications of these treatments in the
data used for model construction. It was the judgment of the reviewing
practitioners that in several cases the selection of treatment
alternatives for a patient state did not include the 'most appropriate'
treatment alternative. Nevertheless, this model deficiency can readily be
corrected. With the collection of data on the effects of these 'most
appropriate' treatments, these additional treatment alternatives can be
incorporated as decision alternatives for the patient states in question.
The reviewing practitioners made selections of treatments for each
of the model's patient states. In those cases where the model's
treatment alternatives did not include the practitioners' 'most appropriate'
choice of treatments, the practitioners made a selection from the same
list of alternatives used by the model. Appendix G lists their choices
of treatment along with each model-generated selection. The two sets of
treatment plans include the same treatment selection for 87 out of 94
patient states, or 92.6% of the patient states. The 7 differences in
treatment selections arise in part from the approximations the treatment-
planning model employs in its representation of the care system and in
part from slight inconsistencies in the practitioners' treatment selections.
One last test was performed to verify the suitability of the
Markovian representation of the craniofacial-pain care system. Mean transit
times through the care system to one of the terminal states were
calculated using the model-generated treatment decisions and each of six
first-visit patient states. These model-generated transit times were
compared to estimates of the same statistics gathered from the patient
records contributed by the university dental clinics. Table 5 presents
the values of both sets of statistics. The close agreement of these
values reveals that the treatment-planning model not only duplicates the
decisions of experts, but also provides a structure for gathering other
relevant information about the underlying care system.
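
One standard way to compute such mean transit times for a Markov chain with terminal states is to solve the linear system (I - Q) n = 1, where Q holds the transition probabilities among the non-terminal states under the chosen treatments and n gives the expected number of visits before reaching a terminal state. The Python sketch below illustrates this; it is an assumption that the dissertation's statistics were computed this way, and the function name and the example probabilities are hypothetical.

```python
def mean_transit_times(q):
    """Expected number of visits spent in non-terminal states.

    q: square list of lists; q[i][j] is the probability of moving from
       non-terminal state i to non-terminal state j under a fixed policy.
    Solves (I - Q) n = 1 by Gauss-Jordan elimination with partial pivoting.
    """
    size = len(q)
    # Augmented matrix [I - Q | 1]
    a = [[(1.0 if i == j else 0.0) - q[i][j] for j in range(size)] + [1.0]
         for i in range(size)]
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("a terminal state cannot be reached from every state")
        a[col], a[pivot] = a[pivot], a[col]
        for row in range(size):
            if row != col:
                factor = a[row][col] / a[col][col]
                a[row] = [x - factor * y for x, y in zip(a[row], a[col])]
    return [a[i][size] / a[i][i] for i in range(size)]

# Hypothetical two-state example: from state 0 the patient moves to state 1
# with probability 0.5; state 1 repeats itself with probability 0.25.
print(mean_transit_times([[0.0, 0.5], [0.0, 0.25]]))
```

The unrestricted solution allows arbitrarily long stays in non-terminal states; an estimate conditioned on leaving the system within a fixed number of visits would require a separate calculation.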
4.4 Model Applications
Like the diagnostic-classification model presented in Chapter 3, the
craniofacial-pain treatment-planning model has been structured to permit
its utilization in a variety of applications. Markovian modeling provides
an analytic representation of the craniofacial-pain care system as well
as establishing a means of making treatment selections. This section
discusses applications of the model's analytic representation and treatment
selections in teaching, in research, and in practice.
TABLE 5
MEAN TRANSIT TIMES THROUGH THE CRANIOFACIAL-PAIN CARE SYSTEM

For a Patient Whose First                   Model-       Truncated    Patient-
Diagnostic Classification Was               Generated    Model        Record
                                            Estimate*    Estimate+    Estimate#

Myopathy-Myositis                           1.50         1.34         1.35
Oral Pathology-Dental Pathology             1.11         1.04         1.08
Vascular Changes-
  Migrainous Vascular Changes               3.89         3.42         3.06
Myofacial Pain Dysfunction-
  Uneven Centric Stops                      1.86         1.43         1.50
Myofacial Pain Dysfunction-
  Anxiety/Depression                        3.87         3.47         3.18
Myofacial Pain Dysfunction-
  Reflex Protective Muscular Contracture    1.90         1.79         1.87

The values in these sets of estimates are specified in terms of the
number of patient visits in which the patient occupies a non-well or
non-referred patient state.
* Note: The treatment-planning model considers the possibility of
  'infinite duration' occupancy of non-well or non-referred
  states.
+ These truncated estimates were generated from the treatment-
  planning model on the conditional basis that a patient must
  transit into either the well or the referred state by his
  fifth patient visit.
# The maximum number of visits for any patient described by
  the clinical data was five patient visits.

The model-generated treatment decisions reveal which treatments are
most frequently used in the care of craniofacial-pain patients. In a
teaching environment, this information can be used to specify treatment-
application techniques that should be emphasized in training dental
students in craniofacial-pain care. Moreover, the parameters employed in
model development, in particular the transition probabilities and
referral costs, are themselves valuable instructional materials in developing
the dental student's treatment-selection skills.
The treatment-planning model provides a method for evaluating new
developments in treatment for craniofacial-pain patients. With estimates
of the effectiveness of his new treatment, the researcher can use the
craniofacial-pain treatment-planning model to get two immediate responses.
First, the optimization technique of Section 4.2 will determine if this
new treatment provides 'better care' for the patient than any of the
other treatment alternatives the model has to choose from. Second, if
optimal treatment selections for the model include the new treatment, the
model's statistics will show the improvement in length of stay, and in other
relevant measures of treatment effectiveness, introduced by using this
new treatment.
In the office of the practicing dentist, the treatment-planning
model's decisions could provide a concise reference of the treatment
selections suggested by experts in the field of craniofacial pain. Moreover,
the practitioner would have a chance to contribute to the refinement of
the listing as the treatment records of his patients could supplement
the data used in model construction. In addition, the practitioner could
employ the statistics associated with the treatment-planning model in
scheduling the length, and number, of his appointments for craniofacial-
pain patients.

CHAPTER 5
CONCLUSIONS AND FUTURE RESEARCH
This dissertation has presented analytic models of the decision
processes associated with diagnosing and selecting treatments for a
particular health-care problem. The selection, construction, and testing of
these models have been discussed in some detail. Meanwhile, the model-
building effort itself has been the source of a number of insights into
decision-making in a health-care environment. These insights will be
reflected in this chapter's discussion of the dissertation's central
research conclusion and suggestions of topics for future investigation.
The similarity between the decision-making processes employed by
the practitioner and the analytic structure of this dissertation's models
is quite revealing. In both diagnosis and treatment planning for
craniofacial-pain patients it appears that the practitioner, like the analytic
models, makes 'first-order' decisions. The linearity of symptom
significance (a first-order polynomial of symptom weights), and the present-
patient-state dependency of transition probabilities measuring treatment
effectiveness (a first-order stochastic dependence) provide a means of
generating decisions that closely approximate the decisions made by dental
practitioners. This general conclusion on the applicability of first-
order decision techniques to craniofacial-pain diagnostic classification
and treatment planning characterizes the central development of this
dissertation.
Given this summary statement, there are several logical extensions
to this dissertation's research that should be examined in future
investigations. The following suggestions identify some of the more fruitful
areas for further research efforts. These suggestions are ordered
according to the author's view of their significance.
1. This dissertation's research found that first-order decision-
making models are valid descriptions of the underlying thought processes
employed by the craniofacial-pain practitioner. It is possible that these
first-order descriptive decisions are 'suboptimal' and that higher-order
decision-making tools might yield prescriptive, or 'optimal,' diagnostic
classifications and treatment plans for craniofacial-pain patients. That
is, considering the interaction between significant symptoms and multiple-
state dependency for patient-state transitions may lead to optimal
diagnostic and treatment-selection decisions. As the models themselves can
readily be increased in their decision-making 'order,' an investigation
into this possibility would be hampered only by the necessity of
collecting an elaborate data base. Nevertheless, such an investigation should
be undertaken in this, the most significant, of future research areas.
2. As this dissertation's analytic models can be applied directly
to any health-care problem where there is verification that practitioners
make first-order decisions, one potential avenue of future research would
be to isolate those health-care environments where these kinds of decisions
are made. However, a word of caution is interjected at this point.
Mathematical modeling demands an underlying structure for the process being
modeled. Yet, in a process dealing with a product that is subject to
considerable variation, such as the care of a patient in a health-care
system, isolating an underlying process structure is difficult. Moreover,
the problem of finding process structure is compounded in the health-care
field by a lack of unifying and consistent nomenclature. In the health-
care field, scholarly literature and historical precedent can serve as
the justification for two or more contradicting sets of terminology for
the same anatomical structure or physiological process. Thus, in
researching the generality of first-order decision-making techniques, the
investigator must consider process variability and nomenclature
inconsistency before he makes any statement about the applicability of this
dissertation's decision-making tools to other health-care environments.
3. A non-geometric discussion of the criteria for pattern-space
separability was presented to provide a means of characterizing health-
care disorders for which diagnostic classification by a linear pattern
classifier might be feasible. Unfortunately, this dissertation's
techniques are heuristic and do not provide an exact reproduction of the
underlying mathematical specifications. Future research in this area
could lead to a precise statement of non-geometric criteria for linear
separability, and thus provide an indirect means for evaluating potential
applications of linear non-parametric classifiers.
4. This dissertation's minimum-cost symptom-selection algorithm
represents a clear departure from previous research in feature selection.
The algorithm's utilization of the convex-hull representation of pattern-
space separability makes this development unique in the literature of
feature selection. However, the algorithm's method of checking the
feasibility of potential feature collections is extremely tedious. A more
efficient method to check feature-collection feasibility may be revealed
through future investigations in this area.
5. From a mathematical-programming point of view, the symptom-
selection algorithm represents one of a limited number of techniques
capable of solving a problem with non-linear constraints. The algorithm
seeks an optimal assignment of components, where the feasibility of any
assignment is determined by the existence of a set of discriminating
component multipliers. In this more general context, the structure of the
algorithm may be applicable in a variety of problem areas not directly
related to the feature-selection problem. The possibility of employing
the algorithm in this general setting should be investigated.
6. In modeling the treatment-planning process for craniofacial-pain
patients the concept of boundary-level treatment applications was
introduced. Boundary numbers on the effects of repeated treatment applications
are likely to occur in data derived from the care of patients with a
variety of physiological disorders. Further investigations of this
phenomenon may result in more effective methods of predicting which treatments
will have boundary-level application numbers, and more efficient
statistical techniques to determine values for these numbers.
7. The training algorithm developed in the construction of the
craniofacial-pain diagnostic classifier generates a feasible integer
solution to a large number of linear constraints. This algorithm is both
efficient and easily coded for computer applications. An investigation
of the uses of this algorithm in a mathematical-programming setting may
reveal applications in solution techniques for more general integer
programs.
8. Potential applications have been suggested for the diagnostic-
classification and treatment-planning models in teaching, in research,
and in practice. The models and their applications have been presented so
that they might readily be employed by some future investigator. Actual
applications of the models should yield significant contributions to
the effectiveness of the teacher, researcher, and practitioner.

APPENDIX A
CRANIOFACIAL-PAIN PATIENT DATA VECTOR
Referral Through
  001 Medical GP
  002 Medical Specialist
  003 Dental GP
  004 Dental Specialist
Sex
  005 Male
  006 Female
  007 Female, menopausal or post menopausal
Age Group
  008 0 - 19
  009 20 - 39
  010 40 - 55
  011 56 up
Duration of Pain
  012 Less than 3 weeks
  013 From 3 to 6 weeks
  014 More than 6 weeks
  015 Episodic
Character of Pain
  016 Aching
  017 Burning
  018 Cutting
  019 Discomfort
  020 Dull
  021 Pressure
  022 Pricking
  023 Sharp
  024 Soreness
  025 Stinging
  026 Tenderness
  027 Throbbing
Change in Character of Pain
  028 Constantly getting worse
  029 Got worse, then plateaued
  030 Got worse, plateaued, then better
  031 Getting better
  032 Intermittent periods without pain
  033 No change since beginning
List of Drugs Taken
  034 Mild Analgesics: Aspirin, APC, etc.
  035 Moderate Analgesics (non-narcotic)
  036 Strong Analgesics: Narcotics and Synthetic Narcotics
  037 Anti-anxiety Agents: Mellaril, etc.
  038 Anti-arthritic Agents: Steroids, etc.
  039 Anti-depressives: Tofranil, etc.
  040 Birth Control Pills
  041 Hormone Preparations
  042 Anti-inflammatory Agents
  043 Muscle Relaxants: Valium
  044 Muscle Relaxants: Meprobamate
  045 Muscle Relaxants: Others
  046 Sedatives: Barbiturates, etc.
  047 Other Drugs
History of Trauma
  048 Accidental
  049 Factitial
  050 Surgical
Location of Swelling (by side)
Location of Tenderness (left side, right side)
Location of Pain (by side)
  [Items 051 through 242 are not recoverable from this copy.]
Limited Jaw Opening
  243 Yes
Joint Sounds
  244 Clicking
  245 Crepitation
  246 Pain accompanying joint sound
Headaches
  247 Frequent headaches
  248 Headache associated with joint pain
Changes in
  249 Taste
  250 Hearing
  251 Visual acuity
  252 Perception of light touch on face
Upper Respiratory Infection
  253 In conjunction with beginning of TMJ pain
Evidence of
  254 Arthritis
  255 Every's Syndrome
  256 Neuropathy
  257 Otitis
  258 Salivary gland disease
  259 Sinusitis
  260 Strokes
  261 Vascular disease
Facets
  262 1-3
  263 4 up
Lateral Slide Prematurities
  264 On working side
  265 On balancing side
Tooth Ache
  266 Yes
Biting Stress Tooth Mobility
  267 Yes
Recent Restorative or Dental Prosthesis
  268 Yes
Jaw Deviates on Opening
  269 Left
  270 Right
Impingement of Coronoid Process on Zygomatic Arch
  271 Left
  272 Right
Meniscus-Condyle Dyscoordination
  273 Left
  274 Right
Radiographic Examination
  275 Mandibular condyle apposition (such as spur formation)
  276 Mandibular condyle resorption (such as flattening of anterior-
      superior surface or irregular surface)
  277 Fossa apposition
  278 Fossa resorption
  279 Articular eminence apposition
  280 Articular eminence resorption
  281 Evidence of fracture
  282 Clinical or radiographic evidence of pathoses
Emotional Trauma
  283 Anxiety
  284 Depression
Bruxism or Clenching
  285 Yes
Uneven Centric Stops
  286 Yes
History of Lengthy Dental Procedures
  287 Yes
History of General Anesthesia
  288 Yes
Tinnitus
  289 Yes
Extraction of Teeth
  290 Less than 6 weeks prior to TMJ pain
  291 Leaving a space that permits extrusion
Preauricular Pain
  292 Yes
Alteration of Inter-Occlusal or Inter-Arch Space
  293 Yes
Paresthesia
  294 Yes
Luxation or Subluxation
  295 Yes
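
One way a patient record could be turned into a data vector over these 295 items is sketched below in Python. The binary (0/1) encoding, the function name, and the example patient are assumptions made for illustration; the appendix itself lists only the item numbers and their meanings.

```python
VECTOR_LENGTH = 295  # items 001 through 295 in the list above

def encode_patient(present_items):
    """Return a 0/1 list with a 1 at each item number noted for the patient."""
    vector = [0] * VECTOR_LENGTH
    for item in present_items:
        if not 1 <= item <= VECTOR_LENGTH:
            raise ValueError(f"item number out of range: {item}")
        vector[item - 1] = 1  # item numbers start at 1, list indices at 0
    return vector

# Hypothetical patient: referred by a dental GP (003), female (006),
# aged 40-55 (010), pain for more than 6 weeks (014), aching pain (016),
# clicking joint sounds (244)
patient = encode_patient([3, 6, 10, 14, 16, 244])
print(sum(patient))
```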

APPENDIX B
MODIFIED FIXED-INCREMENT TRAINING ALGORITHM
In presenting the modified fixed-increment training algorithm, the
following notation is employed:
p = the number of classification categories
t = the number of training-sample row vectors
$a_j^{(k)}$ = training-sample row vector number 'k' preclassified
in category 'j', j=1,2,...,p, k=1,2,...,t, and
k=i[mod t] where 'i' is the index of the training-
algorithm iteration
$W_j^{(i)}$ = the column of weights (the constraints in the
'j-th' discriminant function) used in the 'i-th'
iteration of the training algorithm, j=1,2,...,p
$\alpha$ = non-negative constant specified by the analyst
to adjust the size of the 'dead zone' [23] in
discriminant-function values, i.e., $\alpha \ge 0$
$\beta$ = positive constant specified by the analyst to adjust
the scale of the weight vectors, i.e., $\beta > 0$.
Using this notation, let $a_j^{(k)}$ be the $i$th pattern examined by the
algorithm; then
case 1: if $a_j^{(k)} W_j^{(i)} > a_j^{(k)} W_c^{(i)} + \alpha$ for all $c \neq j$,
let $W_c^{(i+1)} = W_c^{(i)}$ for all $c$.
case 2: if $a_j^{(k)} W_j^{(i)} \le a_j^{(k)} W_z^{(i)} + \alpha$ for a subset B of the
p discriminants, $z \in B$, $j \notin B$,
let $W_j^{(i+1)} = W_j^{(i)} + \beta\, n_B\, [a_j^{(k)}]^T$,
$W_z^{(i+1)} = W_z^{(i)} - \beta\, [a_j^{(k)}]^T$ for all $z \in B$,
and $W_c^{(i+1)} = W_c^{(i)}$ for all $c \notin \{B \cup j\}$,
where $n_B$ = the number of discriminants in the subset B.
The algorithm is terminated when the values of the $W_j$, j=1,2,...,p, have
not changed during a complete cycle of the t training patterns, i.e.,
when $W_j^{(\theta+1)} = W_j^{(\theta+2)} = \ldots = W_j^{(\theta+t)}$ for all j, where $\theta$ is
the last case 2 pattern examined by the algorithm.
This algorithm is guaranteed to terminate in a set of feasible
$W_j^*$, j=1,2,...,p, if the training sample is linearly separable and $\alpha$ and $\beta$
have been appropriately selected. If the training sample is linearly
separable, the algorithm will converge for any fixed value of $\alpha > 0$,
where $\beta$ is selected appropriately large. Hence, the algorithm is normally
applied to a training sample with $\alpha=0$ and $\beta=1$. If the algorithm
converges, these constants can be adjusted and the training algorithm
reapplied.
The justification for specifying a non-zero $\alpha$ ($\alpha$ = size of the
dead zone) is that as $\alpha$ is increased the accuracy of the classifier is
increased in making classifications of data not used in developing the
discriminant-function weights. For example, with the craniofacial-pain
diagnostic classifier and the test samples discussed in Section 3.3,
the diagnostic model correctly classified approximately 5% more of the
test samples' data vectors when the model was trained with $\alpha=30$, $\beta=3$
(versus an original training with $\alpha=0$, $\beta=1$).

Proof that the algorithm converges if feasible weight vectors
$W_j^*$, j=1,2,...,p, exist (that is, the sample space is linearly separable)
is developed in Nilsson [24]. Nilsson's proof can be directly applied
since, for any set of feasible $W_j^*$,
$a_j^{(k)} W_j^* > a_j^{(k)} W_z^* + \alpha$
for all k=1,2,...,t, and z=1,2,...,p, $z \neq j$, while for any j=1,2,...,p,
$i=1,2,\ldots,\theta$,
$a_j^{(k)} W_j^{(i)} \leq a_j^{(k)} W_z^{(i)} + \alpha$
for some k and some z.
Typically, a training algorithm is applied to the members of a
training sample without prior knowledge of whether the sample pattern
space is linearly separable. The algorithm is allowed to process sample
patterns until it either converges on a set of discriminating hyperplanes
or has run for a 'reasonable' amount of time without termination.
Experience with medical data and the modified fixed-increment algorithm
has shown that if there is a set of discriminating hyperplanes, the
algorithm will find it in no more than 3 complete cycles for each of the
pattern classes. For example, if there are 5 pattern classes and the
pattern space can be linearly partitioned, the algorithm should terminate
in no more than 15 full cycles through the training data. This rough
measure of training time provides an index for establishing a limit on
computer processing time.
An application of the modified fixed-increment training algorithm
is presented in Figure 7.

Given the training sample of the form $a = [a_1, a_2, 1]$, where
$a_1^{(1)} = [0,0,1]$   $a_2^{(2)} = [1,0,1]$   $a_3^{(3)} = [0,1,1]$,
the training sample patterns can be represented in 3-dimensional
space by
[Plot of the three training-sample patterns.]
The modified fixed-increment algorithm with $\alpha = 0$ and $\beta = 1$
proceeds as follows: (* indicates correct sample classification)

Sample      $W_1$        $W_2$        $W_3$        $aW_1$  $aW_2$  $aW_3$
 [0,0,1]    [ 0, 0, 0]   [ 0, 0, 0]   [ 0, 0, 0]     0       0       0
 [1,0,1]    [ 0, 0, 2]   [ 0, 0,-1]   [ 0, 0,-1]     2      -1      -1
 [0,1,1]    [-1, 0, 1]   [ 2, 0, 1]   [-1, 0,-2]     1       1      -2
 [0,0,1]    [-1,-1, 0]   [ 2,-1, 0]   [-1, 2, 0]     0       0       0
 [1,0,1]    [-1,-1, 2]   [ 2,-1,-1]   [-1, 2,-1]     1       1      -2
*[0,1,1]    [-2,-1, 1]   [ 3,-1, 0]   [-1, 2,-1]     0      -1       1
*[0,0,1]    [-2,-1, 1]   [ 3,-1, 0]   [-1, 2,-1]     1       0      -1
*[1,0,1]    [-2,-1, 1]   [ 3,-1, 0]   [-1, 2,-1]    -1       3      -2

Hence, the set of weights generated by this training sample is
$W_1 = [-2,-1, 1]$
$W_2 = [ 3,-1, 0]$
$W_3 = [-1, 2,-1]$.
FIGURE 7
APPLICATION OF THE MODIFIED FIXED-INCREMENT ALGORITHM
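The training rule of this appendix can be sketched in a few lines of Python. This is a minimal illustration, not the program used in the study: the function name `train_fixed_increment`, the `max_cycles` cap, and the zero starting weights are assumptions, and a tie between discriminant values is treated as a case 2 misclassification, which is the reading that reproduces the Figure 7 trace.

```python
def train_fixed_increment(samples, n_classes, alpha=0, beta=1, max_cycles=100):
    """Train linear discriminant weights with the modified fixed-increment rule.

    samples: list of (pattern, class_index); each pattern ends with the constant 1.
    Returns one weight list per class, or None if max_cycles is exhausted.
    """
    dim = len(samples[0][0])
    weights = [[0] * dim for _ in range(n_classes)]  # assumed zero start
    for _ in range(max_cycles):
        changed = False
        for pattern, j in samples:
            scores = [sum(a * w for a, w in zip(pattern, wc)) for wc in weights]
            # Subset B: every rival class whose score is not beaten by more than alpha.
            rivals = [z for z in range(n_classes)
                      if z != j and scores[j] <= scores[z] + alpha]
            if not rivals:
                continue  # case 1: weights are left unchanged
            changed = True  # case 2: reward class j, penalise each rival in B
            for idx, a in enumerate(pattern):
                weights[j][idx] += beta * len(rivals) * a
                for z in rivals:
                    weights[z][idx] -= beta * a
        if not changed:  # a full cycle with no case 2 update
            return weights
    return None


# The three Figure 7 patterns, one per class (classes indexed from 0 here).
figure7_samples = [([0, 0, 1], 0), ([1, 0, 1], 1), ([0, 1, 1], 2)]
figure7_weights = train_fixed_increment(figure7_samples, n_classes=3)
print(figure7_weights)
```

With alpha = 0 and beta = 1 the sketch stops on the same three weight vectors that Figure 7 reports. The `max_cycles` argument plays the role of the processing-time limit discussed above (for example, 3 cycles per pattern class).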

APPENDIX C
APPLICATION OF THE MINIMUM-COST SYMPTOM-SELECTION ALGORITHM
Given three pattern classes X, Y, and Z, with patterns of the form
$a_j = [a_{j1}, a_{j2}, a_{j3}, 1]$, where
class X contains [0 1 0 1] and [0 1 1 1],
class Y contains [0 0 1 1] and [1 0 1 1],
class Z contains [0 0 0 1] and [1 0 0 1],
these patterns can be represented in three-dimensional space (without
their constant = 1 components) by
[Plot of the six patterns against the feature axes.]
One set of feasible linear-classifier discriminant-function weights
for these patterns is
$W_X = [-2\ \ 3\ \ 0\ -1]^T$
$W_Y = [\ 1\ -2\ \ 2\ \ 0]^T$
$W_Z = [\ 1\ -1\ -2\ \ 1]^T$.
Suppose feature 1 costs 2 units to employ in the classifier,
feature 2 costs 6 units to employ in the classifier,
and feature 3 costs 3 units to employ in the classifier.
Then, for the minimum-cost symptom-selection algorithm (Section 3.4)
$A_1 = \begin{bmatrix} 0 & 1 & 0 & 1 \\ 0 & 1 & 1 & 1 \end{bmatrix}$, $A_2 = \begin{bmatrix} 0 & 0 & 1 & 1 \\ 1 & 0 & 1 & 1 \end{bmatrix}$, $A_3 = \begin{bmatrix} 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 1 \end{bmatrix}$,
and C = [2 6 3 0].
This feature-selection algorithm will be employed to find the minimum-cost
collection of classifying features. For the purposes of illustration,
the feature variables $x_i$, i=1,2,3, are selected in the order $x_2$,
$x_3$, $x_1$ in Step 4 of the algorithm (Section 3.4.2). Note that this is a
logical ordering of features in descending order of feature-utilization
'costs.' This prior specification of the order of variable assignments
permits the construction of a tree that represents the possible solutions
remaining to be considered at each step of the algorithm. This
tree of possible solutions to P1 has the form
[Figure: tree of possible solutions to P1.]
Step 0: The algorithm is initialized with $V^* = -\infty$. Go to Step 4 of the
algorithm.
Step 4: Select the variable $x_2$ for assignment. The assignment vector is
now [$x_2$ = 0]. Apply Procedure 2 (Figure 4).
In Procedure 2, zeroing out the column k=2 from $A_1$, $A_2$, and $A_3$
yields
$\hat{A}_1^T = \begin{bmatrix} 0 & 0 \\ 0 & 0 \\ 0 & 1 \\ 1 & 1 \end{bmatrix}$, $\hat{A}_2^T = \begin{bmatrix} 0 & 1 \\ 0 & 0 \\ 1 & 1 \\ 1 & 1 \end{bmatrix}$, $\hat{A}_3^T = \begin{bmatrix} 0 & 1 \\ 0 & 0 \\ 0 & 0 \\ 1 & 1 \end{bmatrix}$.
Application of Procedure 1 (Section 3.4.1) to $\hat{A}_1^T$ and $\hat{A}_2^T$ yields
the following reduced matrices:
$\hat{A}_1^T = [\ 1\ ]$   $\hat{A}_2^T = [\ 0\ ]$,
where [ 0 ] is the null matrix.
By Step 6 of Procedure 1, feasible convex combinations of these
matrices exist.
Hence the assignment vector [$x_2$ = 0] is infeasible by the rules
of Procedure 2. Go to Step 6 of the algorithm.
Step 6: The assignment vector is now [$x_2$ = 1]. As this vector does not
include an assignment for every variable, return to Step 4 of
the algorithm.
The tree of possible solutions to P1 now has the form
[Figure: tree of possible solutions to P1.]
Step 4: Select the variable $x_3$ for assignment. The assignment vector is
now [$x_2$ = 1, $x_3$ = 0]. Apply Procedure 2.
In Procedure 2, zeroing out the column k=3 from $A_1$, $A_2$, and $A_3$
yields
$\hat{A}_1^T = \begin{bmatrix} 0 & 0 \\ 1 & 1 \\ 0 & 0 \\ 1 & 1 \end{bmatrix}$, $\hat{A}_2^T = \begin{bmatrix} 0 & 1 \\ 0 & 0 \\ 0 & 0 \\ 1 & 1 \end{bmatrix}$, $\hat{A}_3^T = \begin{bmatrix} 0 & 1 \\ 0 & 0 \\ 0 & 0 \\ 1 & 1 \end{bmatrix}$.
Application of Procedure 1 to $\hat{A}_1^T$ and $\hat{A}_2^T$ yields the following
reduced matrices:
$\hat{A}_1^T = [1\ 1]$   $\hat{A}_2^T = [\ 0\ ]$.

By Step 3 of Procedure 1, no feasible convex combination of
these matrices exists.
Application of Procedure 1 to $\hat{A}_1^T$ and $\hat{A}_3^T$ yields the following
reduced matrices:
$\hat{A}_1^T = [1\ 1]$   $\hat{A}_3^T = [\ 0\ ]$.
By Step 3 of Procedure 1, no feasible convex combination of
these matrices exists.
Application of Procedure 1 to $\hat{A}_2^T$ and $\hat{A}_3^T$ yields the following
reduced matrices:
$\hat{A}_2^T = [0\ 1]$   $\hat{A}_3^T = [0\ 1]$.
For these reduced matrices, $\hat{A}_2^T$ and $\hat{A}_3^T$, P4 has the following
form: maximize $\lambda_1 + \lambda_2$
subject to $\lambda_1 \leq 0$
$\pi + \lambda_1 \leq 0$
$\lambda_2 \leq 0$
$\pi + \lambda_2 \leq 0$
$\pi$, $\lambda_1$, $\lambda_2$ unrestricted.
P4 has the bounded optimal solution $\lambda_1 = \lambda_2 = 0$.
Hence the assignment vector [$x_2$ = 1, $x_3$ = 0] is infeasible
by the rules of Procedure 2. Go to Step 6 of the algorithm.
Step 6: The assignment vector is now [$x_2$ = 1, $x_3$ = 1]. As this vector
does not include an assignment for every variable, return to
Step 4 of the algorithm.
The tree of possible solutions to P1 now has the form
[Figure: tree of possible solutions to P1.]

Step 4: Select the variable $x_1$ for assignment. The assignment vector
is now [$x_2$ = 1, $x_3$ = 1, $x_1$ = 0]. Apply Procedure 2.
In Procedure 2, zeroing out the column k=1 from $A_1$, $A_2$, and $A_3$
yields
$\hat{A}_1^T = \begin{bmatrix} 0 & 0 \\ 1 & 1 \\ 0 & 1 \\ 1 & 1 \end{bmatrix}$, $\hat{A}_2^T = \begin{bmatrix} 0 & 0 \\ 0 & 0 \\ 1 & 1 \\ 1 & 1 \end{bmatrix}$, $\hat{A}_3^T = \begin{bmatrix} 0 & 0 \\ 0 & 0 \\ 0 & 0 \\ 1 & 1 \end{bmatrix}$.
Application of Procedure 1 to $\hat{A}_1^T$ and $\hat{A}_2^T$ yields the following
reduced matrices:
$\hat{A}_1^T = [\ 1\ ]$   $\hat{A}_2^T = [0\ 0]$.
By Step 3 of Procedure 1, no feasible convex combination of
these matrices exists.
Application of Procedure 1 to $\hat{A}_1^T$ and $\hat{A}_3^T$ yields the following
reduced matrices:
$\hat{A}_1^T = [\ 1\ ]$   $\hat{A}_3^T = [0\ 0]$.
By Step 3 of Procedure 1, no feasible convex combination of
these matrices exists.
Application of Procedure 1 to $\hat{A}_2^T$ and $\hat{A}_3^T$ yields the following
reduced matrices:
$\hat{A}_2^T = [1\ 1]$   $\hat{A}_3^T = [0\ 0]$.
By Step 3 of Procedure 1, no feasible convex combination of
these matrices exists.
Hence the assignment vector [$x_2$ = 1, $x_3$ = 1, $x_1$ = 0] is
feasible by the rules of Procedure 2. Go to Step 5 of the
algorithm.
Step 5: The assignment vector [$x_2$ = 1, $x_3$ = 1, $x_1$ = 0] includes an
assignment for every variable; go to Step 7 of the algorithm.
Step 7: The value V is calculated for this assignment vector, where
V = -1[1(6) + 1(3) + 0(2)] = -9.
As $V^* = -\infty$, go to Step 8 of the algorithm.
Step 8: $V^*$ is set equal to -9, and the values of the variables $x_1$ = 0,
$x_2$ = 1, and $x_3$ = 1 in this assignment vector are stored for
future reference. Go to Step 1 of the algorithm.
Steps 2 and 3 of the algorithm dictate that the algorithm is
terminated at this point since these steps generate the assignment
vector [$x_2$ = 1, $x_3$ = 1, $x_1$ = 1], which is known to be feasible to P1.
V for this assignment vector is -11, which is smaller than $V^*$. Hence
the minimum-cost collection of classifying features is feature 2 and
feature 3, with a cost of 9 units associated with utilizing these
features in a linear pattern classifier.
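The outcome of this example can be cross-checked by brute force. The sketch below is not the implicit-enumeration algorithm of Section 3.4 and does not implement Procedures 1 and 2; it simply tries every subset of the three features. The names `separable` and `cheapest_features` are hypothetical, and the separability test is a heuristic assumption: a subset is rejected when two classes share an identical masked pattern or when a fixed-increment trainer fails to settle within `max_cycles`.

```python
from itertools import product


def separable(classes, max_cycles=200):
    """Heuristic test: can the classes be split by linear discriminants?"""
    # A pattern that appears in two different classes rules out separation.
    seen = {}
    for label, patterns in classes.items():
        for p in patterns:
            if seen.setdefault(tuple(p), label) != label:
                return False
    labels = list(classes)
    samples = [(p, labels.index(l)) for l in labels for p in classes[l]]
    weights = [[0] * len(samples[0][0]) for _ in labels]
    for _ in range(max_cycles):
        changed = False
        for p, j in samples:
            scores = [sum(a * w for a, w in zip(p, wc)) for wc in weights]
            rivals = [z for z in range(len(labels)) if z != j and scores[j] <= scores[z]]
            if rivals:
                changed = True
                for i, a in enumerate(p):
                    weights[j][i] += len(rivals) * a
                    for z in rivals:
                        weights[z][i] -= a
        if not changed:
            return True
    return False  # no convergence within the cap: treated as not separable


def cheapest_features(classes, costs):
    """Return (cost, keep_mask) for the cheapest feature subset that still separates."""
    best = None
    n = len(costs)
    for keep in product([0, 1], repeat=n):
        # Zero out dropped features; keep the trailing constant 1.
        masked = {l: [[a * k for a, k in zip(p[:n], keep)] + [1] for p in pats]
                  for l, pats in classes.items()}
        cost = sum(c * k for c, k in zip(costs, keep))
        if separable(masked) and (best is None or cost < best[0]):
            best = (cost, keep)
    return best


patterns = {
    "X": [[0, 1, 0, 1], [0, 1, 1, 1]],
    "Y": [[0, 0, 1, 1], [1, 0, 1, 1]],
    "Z": [[0, 0, 0, 1], [1, 0, 0, 1]],
}
result = cheapest_features(patterns, costs=[2, 6, 3])
print(result)
```

For the six patterns and costs of this appendix the search keeps features 2 and 3 at a cost of 9 units, in agreement with the worked example. Exhaustive search grows as 2^n in the number of features, so it is only a check for small cases.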

APPENDIX D
TREATMENT ALTERNATIVES FOR CRANIOFACIAL-PAIN PATIENTS
Treatment Application    Treatments
Number
11    Chill Therapy
12    Drug Therapy
13    Fixation
14    Heat Therapy
15    Occlusal Adjustment
16    Physical Therapy
17    Prosthetics
18    Tooth Extraction or Endodontics
21    Drug Therapy and Fixation
22    Drug Therapy and Heat Therapy
23    Drug Therapy and Occlusal Adjustment
24    Drug Therapy and Physical Therapy
25    Drug Therapy and Prosthetics
26    Heat Therapy and Physical Therapy
27    Occlusal Adjustment and Physical Therapy
28    Physical Therapy and Prosthetics
31    Chill Therapy, Drug Therapy, and Physical Therapy
32    Drug Therapy, Fixation, and Heat Therapy
33    Drug Therapy, Fixation, and Physical Therapy
34    Drug Therapy, Heat Therapy, and Physical Therapy
35    Drug Therapy, Occlusal Adjustment, and Physical Therapy
36    Fixation, Heat Therapy, and Physical Therapy
41    Drug Therapy, Fixation, Heat Therapy, and Physical Therapy

APPENDIX E
STABILITY OF TRANSITION-PROBABILITY ESTIMATES
This appendix presents the technique employed to determine the stability
of transition-probability estimates of treatment effectiveness
for patients who occupy the same patient state, but who exhibit varying
combinations of relevant data-vector elements. The analysis shown here
is limited to one treatment alternative, treatment 24, and one diagnostic
classification, Diagnostic Alternative 13, but a similar investigation
was performed for a majority of the other diagnostic classifications and
treatment alternatives. Variability of the data-based estimates (Section
4.1.2) of transition probabilities is analyzed in terms of five factors:
patient's sex, patient's age, duration of patient's pain, nature of
patient's pain (continuous or episodic), and number of replications of the
same treatment alternative.
Statistically, '2-way' contingency tables [29] measure the effects
of these relevant factors. The rows of each table specify the number
of transitions out of Diagnostic Alternative 13 following an application
of treatment 24, and each table's columns specify a value for the factor
being analyzed. A chi-squared statistic $\chi^2$ is employed to test for
independence between the number of transitions and the factor in question.

Analysis by Sex
                      Male    Pre-menopausal    Menopausal or
                              Female            Post-menopausal Female
Transitions     8       0          1                 0
into           13       1         11                 2
               15       0          1                 0
             Well       2         11                 1
$\chi^2 = 1.250$ with 6 degrees of freedom
Hence, the analysis reveals that the sex of the patient is not significant
in determining estimates of transition probabilities out of Diagnostic
Alternative 13 following application of treatment 24, as $\chi^2_{.05,6} = 12.592$.
Analysis by Age Group
                      20-39     40-55
                      Years     Years
Transitions     8       0         1
into           13       8         6
               15       0         1
             Well       7         7
$\chi^2 = 2.286$ with 3 degrees of freedom
Hence, the analysis reveals that age group of the patient is not significant
in determining estimates of transition probabilities out of Diagnostic
Alternative 13 following application of treatment 24, as $\chi^2_{.05,3} = 7.815$.

Analysis by Duration of Pain
                      Less than    From 3 to    More than
                      3 weeks      6 weeks      6 weeks
Transitions     8
into           13
               15
             Well
$\chi^2 = 5.047$ with 6 degrees of freedom
Hence, the analysis reveals that duration of pain is not significant in
determining estimates of transition probabilities out of Diagnostic
Alternative 13 following application of treatment 24, as $\chi^2_{.05,6} = 12.592$.
Analysis by Nature of Pain
                      Continuous    Episodic
Transitions     8         0            1
into           13         8            6
               15         1            0
             Well        11            3
$\chi^2 = 3.964$ with 3 degrees of freedom
Hence, the analysis reveals that nature of pain is not significant in
determining estimates of transition probabilities out of Diagnostic
Alternative 13 following application of treatment 24, as $\chi^2_{.05,3} = 7.815$.

Analysis by Number of Replications
of Treatment 24
                      1st Application    2nd Application    3rd Application
Transitions     8           1                  0                  0
into           13           9                  5                  0
               15           1                  0                  0
             Well           6                  3                  5
$\chi^2 = 8.099$ with 6 degrees of freedom
Hence, the analysis reveals that the number of replications is not significant
in determining estimates of transition probabilities out of Diagnostic
Alternative 13 following application of treatment 24, as
$\chi^2_{.05,6} = 12.592$.
Thus, for Diagnostic Alternative 13, the five factors of patient
variation do not affect transition-probability estimates of the effectiveness
of treatment 24.
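The chi-squared statistics in this appendix can be recomputed from the contingency tables. The following sketch, with a hypothetical helper name `chi_squared`, applies the standard test of independence (expected count = row total x column total / grand total) to the table for number of replications of treatment 24.

```python
def chi_squared(table):
    """Return (statistic, degrees of freedom) for a 2-way contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


# Rows: transitions into 8, 13, 15, Well; columns: 1st, 2nd, 3rd application.
replications = [
    [1, 0, 0],
    [9, 5, 0],
    [1, 0, 0],
    [6, 3, 5],
]
stat, dof = chi_squared(replications)
print(round(stat, 3), dof)
```

The statistic is then compared with the tabled critical value (12.592 for 6 degrees of freedom, as quoted in this appendix); a smaller statistic means independence is not rejected. Note that several expected counts in these tables are well below 5, so the chi-squared approximation is rough.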
The last type of analysis performed, analysis by number of treatment
replications, established treatment-application boundary numbers. If
the analysis revealed no significant effect for differences in treatment
repetitions, then the boundary number for the treatment was set at zero
or one by the reviewing practitioner. Note that a zero boundary-application
number for a treatment alternative implies that a record of that
treatment provides no additional information about the patient's progression
through the care system, and, therefore, the treatment-planning model
does not add a record of the treatments to its patient-state descriptions.
If the analysis revealed a significant effect for treatment repetitions,
the reviewing practitioners examined the data-based estimates of transition
probabilities associated with multiple repetitions of the treatment
and established a boundary application number for the treatment that
reflected their knowledge about the treatment's effectiveness as well
as the information supplied by the data.

APPENDIX F
FLOW CHARTS OF PATIENT-STATE TRANSITIONS
[Flow charts of patient-state transitions.]

APPENDIX G
PATIENT-STATE TREATMENT SELECTIONS

Patient State    Model Selection    Practitioner Selection
1                16                 16
1|16             16                 16
2                24                 24
2|24             24                 24
2|24,24          24                 Refer*+
3                23                 23*
3|12             Refer              12+
3|23             41                 41
3|24             24                 24
3|32             32                 32
3|23,41          Refer              Refer
3|32,32          Refer              Refer
4                35                 35*
4|20             Refer              Refer
4|24             24                 24
4|34             34                 34
4|35             24                 24
4|20,20          Refer              Refer
4|24,24          24                 24
4|24,35          24                 24
4|34,34          34                 34
5                35                 35
5|16             Refer              17+
5|24             17                 17
5|35             24                 24
5|36             24                 24
5|24,35          24                 24
6                24                 24
7                33                 33
7|12             24                 24
7|18             12                 12
7|33             24                 24
7|12,24          16                 16
8                12                 12
8|12             12                 12*
8|18             12                 12
8|24             18                 18
8|12,12          16                 16
8|18,24          24                 24
8|34,41          12                 12
9                12                 12
9|12             12                 Refer+
10               23                 23
10|23            12                 12
10|12,23         12                 12
11               12                 12*
11|12            12                 12*
12               15                 15
12|15            15                 15
12|23            Refer              Refer
12|35            24                 24
12|15,31         Refer              17*+
12|24,35         24                 24
13               12                 12
13|12            12                 12
13|18            24                 24
13|22            34                 34*
13|23            12                 12
13|24            24                 24
13|12,12         23                 23
14               23                 23
14|12            12                 12*
14|23            12                 12
14|24            24                 24
14|26            26                 26
14|35            35                 35
14|41            24                 24
14|12,12         Refer              Refer*
14|12,23         12                 12*
14|24,24         22                 Refer+
14|24,35         24                 24
15               12                 12
15|12            15                 15
15|20            20                 20
15|22            34                 24
15|23            23                 23
15|24            12                 12
15|27            16                 16
15|34            34                 34
15|35            24                 24
15|41            34                 34
15|12,12         12                 12
15|12,23         12                 12
15|16,27         16                 16
15|20,20         20                 Refer*+
15|22,22         12                 12
15|22,34         34                 34*
15|23,23         23                 23
15|24,24         24                 24
15|34,34         34                 34
15|12,22,22      12                 12
15|22,34,34      34                 34*
16               25                 25
17               Refer              Refer
Note: * indicates that the reviewing practitioners made their selection
of treatment from a set of alternatives that did not include
the 'most appropriate' treatment alternative (Section 4.3).
+ indicates a difference between the treatment selections made
by the treatment-planning model and the reviewing practitioners.
The model-generated treatment selections agree with the reviewing
practitioners' selections in 87 out of 94 patient states. This represents a
92.6% agreement between the two sources of treatment selections.
In terms of the craniofacial-pain treatment-planning model's treatment
selections, the patient-referral costs are the most significant of
the model's components. For each of the model's patient states, the
cost of referral out of that state is used in making the decision whether
to continue treatment for a patient, or whether to suggest that he go to
another source of care. If this cost is set too low, then patients who
should be treated in the craniofacial-pain care system are inappropriately
referred out of the system. On the other hand, too high a referral cost
leads the model to suggest that patients remain in this care system when
it would be to their advantage to seek care elsewhere. For these reasons,
this cost was the subject of considerable examination in the building of
the treatment-planning model.
The reviewing practitioners suggested three possible alternative formats
for the cost of referring a patient out of the craniofacial-pain care
system. These were:

Format 1: Referral Cost = [record transferral cost]
+ [practitioner's lost fee]
+ 2*[inconvenience cost associated with
a dental visit]
Format 2: Referral Cost = [fee paid to referral care system]
+ 2*[inconvenience cost associated with
a dental visit]
Format 3: Referral Cost = [fee paid to referral care system]
+ [practitioner's lost fee]
+ [record transferral cost]
+ 2*[inconvenience cost associated with
a dental visit]
where in all three formats the multiple of the inconvenience cost was
suggested by the fact that in the clinical records (Section 3.1) the
median number of visits to the referral care system was two visits. The
treatment-planning model was optimized with referral costs based on each
of these formats. Use of the Format 3 referral costs leads to
model-generated treatment selections that most closely duplicated the selections
of the reviewing practitioners. Hence, this format for patient-referral
costs has been selected for utilization in the treatment-planning model.
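The three referral-cost formats differ only in which components they include, as the short sketch below shows. The function name `referral_cost`, its parameter names, and the dollar amounts are hypothetical illustrations; the text does not give the actual cost figures.

```python
def referral_cost(fmt, referral_fee, lost_fee, record_transfer, inconvenience, visits=2):
    """Referral cost under Format 1, 2, or 3; `visits` is the median of two visits."""
    travel = visits * inconvenience
    if fmt == 1:
        return record_transfer + lost_fee + travel
    if fmt == 2:
        return referral_fee + travel
    if fmt == 3:
        return referral_fee + lost_fee + record_transfer + travel
    raise ValueError("fmt must be 1, 2, or 3")


# Hypothetical cost components, for illustration only.
example = dict(referral_fee=50, lost_fee=30, record_transfer=5, inconvenience=10)
costs_by_format = {fmt: referral_cost(fmt, **example) for fmt in (1, 2, 3)}
print(costs_by_format)
```

Format 3 is the sum of every component, so it is always at least as large as the other two; this is the format the text reports as best matching the practitioners' selections.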

APPENDIX H
APPLICATION OF THE PATIENT-STATE-LABELING AND
OPTIMAL-TREATMENT-SELECTION PROCEDURE
Consider a health-care system with two diagnostic classifications,
'I' and 'J.' In this system treatment $T_1$ or treatment $T_2$ can be given
to a patient in either diagnostic classification. Treatment $T_1$ has a
boundary-level application number (Section 4.1.1) of one, and treatment
$T_2$ may be given only once during a patient's stay in the care system.
Figure 8 presents a pictorial representation of this system.
Using the labeling procedure of Section 4.2, the patient states in
this system are numbered as follows:
1. W
2. R
3. I|$T_1 T_2$
4. J|$T_1 T_2$
5. I|$T_2$
6. J|$T_2$
7. I|$T_1$
8. J|$T_1$
9. I
10. J .
With this labeling of the system's patient states, analysis to determine
the optimal treatment decisions '$k_i$' for each patient-state 'i'
proceeds as follows:

[Figure: patient states and transitions for the two-classification system.]
Two treatments $T_1$ and $T_2$ are available for patients classified
in Alternative I and Alternative J. $T_1$ has a boundary-level
application number of one application and $T_2$ may be given only
once during the patient's stay in the care system.
Note that this figure omits the transition arcs between the
diagnostic-classification-based patient states and the terminal
states well and referred.
FIGURE 8
MULTIPLE-STATE HISTORY-AUGMENTED PROCESS

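$v_1 = 0$
$v_2 = 0$
$v_3$ and $v_4$: find the $k_I$ that maximize
$v_I = r_I^{k_I} + \sum_{J=3}^{4} p_{IJ}^{k_I} v_J$ ,  I=3,4
$v_5 = \max_{k_5} \left[ r_5^{k_5} + \sum_{J=3}^{4} p_{5J}^{k_5} v_J^* \right]$
$v_6 = \max_{k_6} \left[ r_6^{k_6} + \sum_{J=3}^{4} p_{6J}^{k_6} v_J^* \right]$
$v_7$ and $v_8$: find the $k_I$ that maximize
$v_I = r_I^{k_I} + \sum_{J=7}^{8} p_{IJ}^{k_I} v_J + \sum_{J=3}^{6} p_{IJ}^{k_I} v_J^*$ ,  I=7,8
$v_9 = \max_{k_9} \dfrac{r_9^{k_9} + \sum_{J=3}^{8} p_{9J}^{k_9} v_J^*}{1 - p_{99}^{k_9}}$
$v_{10} = \max_{k_{10}} \dfrac{r_{10}^{k_{10}} + \sum_{J=3}^{8} p_{10,J}^{k_{10}} v_J^*}{1 - p_{10,10}^{k_{10}}}$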
where r.
*1
10
z
J=1
*1
u u
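The expressions for states 9 and 10 divide by one minus a self-transition probability. A minimal sketch of that step follows, assuming the usual reading $v_i = \max_k (r_i^k + \sum_{j \neq i} p_{ij}^k v_j^*) / (1 - p_{ii}^k)$, which comes from solving $v_i = r_i^k + p_{ii}^k v_i + \sum_{j \neq i} p_{ij}^k v_j^*$ for $v_i$. The function name `best_treatment`, the data layout, and every number in the example are hypothetical; terminal states are taken to have value zero.

```python
def best_treatment(options, later_values):
    """Pick the treatment that maximizes the value of a state with a self-loop.

    options: {treatment: (expected_reward, self_probability, {next_state: probability})}
    later_values: already-optimized values of the states that can follow.
    Returns (treatment, value).
    """
    best = None
    for treatment, (reward, p_self, transitions) in options.items():
        onward = sum(p * later_values[s] for s, p in transitions.items())
        value = (reward + onward) / (1.0 - p_self)  # requires p_self < 1
        if best is None or value > best[1]:
            best = (treatment, value)
    return best


# Hypothetical numbers: values are negative because they represent costs.
later_values = {7: -10.0, 8: -20.0}
options_state_9 = {
    "T1": (-5.0, 0.2, {7: 0.5}),   # remaining probability leads to W or R (value 0)
    "T2": (-8.0, 0.0, {8: 0.5}),
}
choice, value = best_treatment(options_state_9, later_values)
print(choice, value)
```

Because the states are labeled so that later-numbered states depend only on states already evaluated, a single pass of this kind per state is enough; the coupled pairs (states 3 and 4, states 7 and 8) would need a small simultaneous solution instead.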

BIBLIOGRAPHY
[1] U.S. Department of Health, Education, and Welfare, Cumulated
Index Medicus, Washington, D.C.: U.S. Government Printing
Office (1970-1973).
[2] Ahevne, P., Ryan, G.A., and Walsh, R.J., 1972 Reference Data on
the Profile of Medical Practice, Chicago: Center for Health
Services Research and Development, American Medical Association
(1972).
[3] Bureau of Economic Research and Statistics, "1971 Survey of Dental
Practice," Journal of the American Dental Association,
Vol. 85 (1972) 154-158.
[4] Bruce, R.A., and Yarnall, S.R., "Computer-Aided Diagnosis of
Cardiovascular Disorders," Journal of Chronic Diseases,
Vol. 19 (1966) 473-484.
[5] Schwartz, L., and Chayes, C.M., Facial Pain and Mandibular
Dysfunction, Philadelphia: W. B. Saunders (1968).
[6] TMJ Research Center, Conference on Function and Dysfunction of
the Temporomandibular Joint Complex, Chicago: University of
Illinois (1969).
[7] Mitchell, D.F., The Dental Clinics of North America, Symposium
on Oral Medicine, Philadelphia: W. B. Saunders (1968).
[8] Mitchell, D.F., Standish, S.M., and Fast, T.B., Oral Diagnosis/
Oral Medicine, 2nd Edition, Philadelphia: Lea & Febiger
(1971).
[9] Ledley, R.S., "Practical Problems in the Use of Computers in
Medical Diagnosis," Proceedings of the IEEE, Vol. 57, No. 11
(1969) 1900-1918.
[10] Lincoln, T.L., and Parker, R.D., "Medical Diagnosis Using Bayes
Theorem," Health Services Research, Vol. 2, No. 1 (1967)
34-35.
[11] Bunch, W.H., and Andrew, G.M., "Use of Decision Theory in
Treatment Selection," Clinical Orthopaedics and Related
Research, No. 80 (1971) 39-52.
[12] Boyle, J.A., Greig, W.R., Franklin, D.A., Harden, R.McG.,
Buchanan, W.W., and McGirr, E.M., "Construction of a Model
for Computer-Assisted Diagnosis: Application to the Problem
of Nontoxic Goiter," Quarterly Journal of Medicine, Vol. 35
(1965) 565-588.
[13] Lodwick, G.S., Haun, C.L., Smith, W.E., Keller, R.F., and
Robertson, E.B., "Computer Diagnosis of Primary Bone Tumors:
A Preliminary Report," Radiology, Vol. 80 (1963) 273-275.
[14] Overall, J.E., and Williams, C.M., "Conditional Probability
Program for Diagnosis of Thyroid Function," Journal of the
American Medical Association, Vol. 183, No. 5 (1963)
307-313.
[15] Toronto, A.F., Veasy, L.G., and Warner, H.R., "Evaluation of a
Computer Program for Diagnosis of Congenital Heart Disease,"
Progress in Cardiovascular Diseases, Vol. 5, No. 4 (1963)
362-377.
[16] Wilson, W.J., Templeton, A.W., Turner, W.H., and Lodwick, G.S.,
"The Computer Analysis and Diagnosis of Gastric Ulcers,"
Radiology, Vol. 85 (1965) 1064-1073.
[17] Burbank, F., "A Computer Diagnostic System for the Diagnosis of
Prolonged Undifferentiated Liver Disease," American Journal
of Medicine, Vol. 46 (1969) 401-413.
[18] Collen, M.F., Rubin, L., Neyman, J., Dantzig, G.B., and Siegelaub,
A.B., "Automated Multiphasic Screening and Diagnosis,"
American Journal of Public Health, Vol. 54 (1964) 641-750.
[19] Lipkin, M., Engle, R.L., David, B.J., Zworykin, V.K., Ebald, R.,
Sendrow, M., and Berkley, C., "Digital Computers as an Aid
to Differential Diagnosis," Archives of Internal Medicine,
Vol. 108 (1961) 56-72.
[20] Overall, J.E., and Williams, C.M., "Comparison of Alternative
Computer Models for Thyroid Diagnosis," San Diego Symposium
on Biomedical Engineering, Vol. 3 (1963).
[21] Betaque, N.E., and Gorry, A., "Automated Judgemental Decision-
Making for a Serious Medical Problem," Management Science,
Vol. 17, No. 1 (1971) B421-B434.
[22] Ledley, R.S., "Computer Aids to Clinical Treatment Evaluation,"
Operations Research, Vol. 15 (1967) 694-705.
[23] Meisel, W.S., Computer-Oriented Approaches to Pattern Recognition,
New York: Academic Press (1972).
[24] Nilsson, N.J., Learning Machines, New York: McGraw-Hill (1965).
[25] Howard, R.A., Dynamic Probabilistic Systems, Vol. II: Semi-
Markov and Decision Processes, New York: Wiley (1971).
[26] Rosen, J.B., "Pattern Separation by Convex Programming,"
Journal of Mathematical Analysis and Applications, Vol. 10 (1965)
123-134.
[27] Nelson, G.E., and Levy, D.M., "Selection of Pattern Features
by Mathematical Programming Algorithms," IEEE Transactions
on Systems Science and Cybernetics, Vol. SSC-6 (1970)
20-25.
[28] Balas, E., "An Additive Algorithm for Solving Linear Programs
with Zero-One Variables," Operations Research, Vol. 13
(1965) 517-547.
[29] Freund, J.E., Mathematical Statistics, Englewood Cliffs, N.J.:
Prentice-Hall (1962).

BIOGRAPHICAL SKETCH
Michael Steven Leonard was born February 2, 1947, in Salisbury,
North Carolina. In June, 1965, he was graduated cum laude from Cocoa
High School in Rockledge, Florida. He received the degree of Bachelor
of Industrial Engineering with High Honors from the University of Florida
in June, 1970. In September, 1970, he began graduate work in engineering
at the University of Florida. He received the degree of Master of
Engineering in March, 1972. In June, 1972, he was designated a Distinguished
Military Graduate of the Air Force Reserve Officer Training Corps.
From September, 1970, until the present, his graduate training has been
supported by a National Science Foundation traineeship.
Michael Leonard is married to the former Mary Elizabeth Stewart
of Cocoa, Florida. He holds the reserve commission of Second Lieutenant
in the United States Air Force. He is a member of Lambda Chi Alpha
fraternity; Alpha Pi Mu, Sigma Tau, and Tau Beta Pi honorary fraternities;
and the Operations Research Society of America.

I certify that I have read this study and that in my opinion it
conforms to acceptable standards of scholarly presentation and is fully
adequate, in scope and quality, as a dissertation for the degree of
Doctor of Philosophy.
Kerry E. Kilpatrick, Chairman
Assistant Professor of Industrial and
Systems Engineering
I certify that I have read this study and that in my opinion it
conforms to acceptable standards of scholarly presentation and is fully
adequate, in scope and quality, as a dissertation for the degree of
Doctor of Philosophy.
Thomas B. Fast
Professor and Chairman of the Division
of Oral Diagnosis
I certify that I have read this study and that in my opinion it
conforms to acceptable standards of scholarly presentation and is fully
adequate, in scope and quality, as a dissertation for the degree of
Doctor of Philosophy.
Professor and Chairman, Department of
Basic Dental Sciences

I certify that I have read this study and that in my opinion it
conforms to acceptable standards of scholarly presentation and is fully
adequate, in scope and quality, as a dissertation for the degree of
Doctor of Philosophy.
Richard S. Mackenzie
Professor and Director, Office of Dental
Education
I certify that I have read this study and that in my opinion it
conforms to acceptable standards of scholarly presentation and is fully
adequate, in scope and quality, as a dissertation for the degree of
Doctor of Philosophy.
Associate Professor of Industrial and
Systems Engineering
This dissertation was submitted to the Dean of the College of Engineering
and to the Graduate Council, and was accepted as partial fulfillment of
the requirements for the degree of Doctor of Philosophy.
Dean, Graduate School




with both decision-making processes. These capabilities are discussed
in terms of applications of the models in teaching, research, and in the
practice of dentistry.


With this basic structure for the diagnostic-classification model,
the classified patient data vectors, and the training algorithm presented
in Appendix B, an initial test was performed to verify that the space of
observed patient data vectors was separable by linear discriminant
functions. Application of the modified fixed-increment training algorithm
to the set of 480 data vectors verified this requirement, as the
algorithm terminated in a set of feasible discriminant-function weights.
Using the discriminant functions these constants determine, it is
possible to duplicate the pre-established diagnostic classifications for each
of the patient data vectors.
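Classification with a trained set of linear discriminant functions amounts to evaluating each function on the data vector and choosing the largest value. The sketch below illustrates this with the three weight vectors reported in Figure 7 of Appendix B; the function name `classify` and the zero-based class indices are assumptions made for the example.

```python
def classify(pattern, weights):
    """Return the index of the discriminant function with the largest value."""
    scores = [sum(a * w for a, w in zip(pattern, wj)) for wj in weights]
    return scores.index(max(scores))


# Weight vectors W1, W2, W3 as reported in Figure 7 (Appendix B).
figure7_weights = [[-2, -1, 1], [3, -1, 0], [-1, 2, -1]]
training_patterns = [[0, 0, 1], [1, 0, 1], [0, 1, 1]]
predicted = [classify(p, figure7_weights) for p in training_patterns]
print(predicted)
```

Each training pattern is assigned back to its own class, which is the sense in which the trained weights duplicate the pre-established classifications.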
This first test of the diagnostic classifier established that a
non-parametric classifier could be employed to reproduce the original
classifications for each data vector used in model construction. However,
this test does not reveal how well the classification model will perform
on patient data not employed in developing the discriminant-function
weights. The remainder of this section, and Section 3.3, address the
question of how the diagnostic classifier performs on 'new' patient data
vectors, that is, vectors that have no duplicate in the training sample.
Model training has created a set of weights that, by the definition
of the training procedure, correctly classify every patient data vector
that lies within the bounds of the training-sample pattern-class convex
hulls. Since every data vector is a binary vector, new patient data
vectors must fall outside the convex hulls established by the
training-sample vectors. Yet, if new data vectors have a number of data-vector
elements that are identical to those of the training-sample vectors
with the same diagnostic classification, then this relationship will be
reflected in a 'close proximity,' as measured by a Euclidean-distance


Radiographic Examination
277
Fosca apposition
278 Fossa resorption
279 Articular eminence apposition
280 Articular eminence resorption
281 Evidence of fracture
282 Clinical or radiographic
evidence of pathoses
Emotional Trauma 283 Anxiety 284 Depression
Bruxism or Clenching 285 Yes
Uneven Centric Stops 286 Yes
History of Lengthy Dental Procedures 287 Yes
History of General Anesthesia 288 Yes
Tinnitus 289 Yes
Extraction of Teeth 290 Less than 6 weeks prior to TMJ pain
291 Leaving a space that permits extrusion
Preauricular Pain 292 Yes
Alteration of Inter-Occlusal or Inter-Arch Space 293 Yes
Paresthesia 294 Yes
Luxation or Subluxation 295 Yes


APPENDIX H
APPLICATION OF THE PATIENT-STATE-LABELING AND
OPTIMAL-TREATMENT-SELECTION PROCEDURE
Consider a health-care system with two diagnostic classifications,
'I' and 'J.' In this system treatment $T_1$ or treatment $T_2$ can be
given to a patient in either diagnostic classification. Treatment $T_1$
has a boundary-level application number (Section 4.1.1) of one, and
treatment $T_2$ may be given only once during a patient's stay in the
care system. Figure 8 presents a pictorial representation of this system.
Using the labeling procedure of Section 4.2, the patient states in
this system are numbered as follows:
1. W
2. R
3. $IT_1T_2$
4. $JT_1T_2$
5. $IT_2$
6. $JT_2$
7. $IT_1$
8. $JT_1$
9. I
10. J.
With this labeling of the system's patient states, analysis to
determine the optimal treatment decisions '$k_i$' for each patient state
'i' proceeds as follows:


Rosen [26] has provided a restatement of this assumption in the
requirement that the sets of data vectors corresponding to each diagnostic
alternative have non-intersecting convex hulls. In either form, this is
a fairly restrictive assumption on the dispersion of patient data
vectors (see Section 3.2).
Selecting the 'weights' for each of the discriminant functions is
a process known as 'training.' For the linear non-parametric classifier,
training generates each discriminant function's $w_{jk}$'s by applying a
systematic algorithm to the members of a set of representative patterns
with pre-established classifications. Nilsson [24] discusses several
algorithms suitable for training the craniofacial-pain diagnostic
classifier. In the course of using these algorithms for model
development, a new 'modified fixed-increment' training algorithm was
constructed (see Appendix B). Employing the new algorithm has resulted in
a reduction of approximately 35% in the amount of training time required
to derive the weights for the craniofacial-pain classifier.
Symbolically, the craniofacial-pain diagnostic classifier, with its
set of trained weights, can be represented in the following format:
let $a_i$ = the 296-dimension data vector describing patient 'i'
    $a_{ik}$ = the kth element in the data vector describing patient
          'i', whose value is either zero or one, k=1,2,...,295
          (by definition $a_{i,296} = 1$)
    $C_j$ = diagnostic alternative 'j', j=1,2,...,17
    $d_{ij}$ = the value of the discriminant function for diagnostic
          alternative 'j' generated by the data vector of patient
          'i'
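
The notation above can be illustrated with a minimal sketch. It assumes that the discriminant value is the inner product of the augmented data vector with a weight vector for each alternative, and that the patient is assigned to the alternative with the largest value. The toy dimensions (three symptoms, two alternatives), the weight values, and the function names are hypothetical and are not taken from the trained craniofacial-pain classifier.

```python
def augment(symptoms):
    """Append the constant final element (equal to one) to a 0/1 symptom list."""
    return list(symptoms) + [1]


def discriminants(a, weights):
    """Return one discriminant value per diagnostic alternative for data vector a."""
    return [sum(ak * wk for ak, wk in zip(a, w)) for w in weights]


def classify(a, weights):
    """Return the 1-based index of the alternative with the largest discriminant."""
    d = discriminants(a, weights)
    return max(range(len(d)), key=d.__getitem__) + 1


# Hypothetical weights: one row per alternative, last entry multiplies the constant 1.
toy_weights = [
    [2, -1, 0, -1],
    [-1, 2, 1, 0],
]
patient = augment([1, 0, 0])
print(discriminants(patient, toy_weights), classify(patient, toy_weights))
```

In the actual model the data vector would have 296 elements and there would be 17 weight vectors; the sketch only shows the shape of the computation.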


5. From a mathematical-programming point of view, the symptom-selection
algorithm represents one of a limited number of techniques
capable of solving a problem with non-linear constraints. The algorithm
seeks an optimal assignment of components, where the feasibility of any
assignment is determined by the existence of a set of discriminating
component multipliers. In this more general context, the structure of the
algorithm may be applicable in a variety of problem areas not directly
related to the feature-selection problem. The possibility of employing
the algorithm in this general setting should be investigated.
6. In modeling the treatment-planning process for craniofacial-pain
patients the concept of boundary-level treatment applications was
introduced. Boundary numbers on the effects of repeated treatment
applications are likely to occur in data derived from the care of patients
with a variety of physiological disorders. Further investigations of this
phenomenon may result in more effective methods of predicting which
treatments will have boundary-level application numbers, and more
efficient statistical techniques to determine values for these numbers.
7. The training algorithm developed in the construction of the
craniofacial-pain diagnostic classifier generates a feasible integer
solution to a large number of linear constraints. This algorithm is both
efficient and easily coded for computer applications. An investigation
of the uses of this algorithm in a mathematical-programming setting may
reveal applications in solution techniques for more general integer
programs.
8. Potential applications have been suggested for the
diagnostic-classification and treatment-planning models in teaching, in
research, and in practice. The models and their applications have been
presented so


With this modification, the algorithm would only consider eliminating
these ten high-cost features. Another heuristic approximation to the
optimal collection of features might rank the data-vector elements in
order of descending cost of utilization. Procedure 2 would then be used
to eliminate these components one by one, starting with the item of
highest cost, until the procedure signaled an infeasible solution to P1.
Certainly, other heuristics might also be developed to exploit the
structure of this algorithm.
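
The descending-cost heuristic described above can be sketched as follows. This is only an illustration under stated assumptions: `is_feasible` is a hypothetical stand-in for Procedure 2 (which is not reproduced here), the costs are invented, and a feature vector is represented as a tuple of 0/1 values with 1 meaning the element is retained.

```python
def greedy_drop_by_cost(costs, is_feasible):
    """Drop features in order of descending cost until a drop is infeasible.

    costs       -- cost of using each data-vector element
    is_feasible -- hypothetical oracle playing the role of Procedure 2;
                   takes a tuple of 0/1 flags and returns True or False
    """
    used = [1] * len(costs)
    by_cost = sorted(range(len(costs)), key=lambda i: costs[i], reverse=True)
    for idx in by_cost:
        trial = list(used)
        trial[idx] = 0
        if not is_feasible(tuple(trial)):
            break  # stop at the first infeasible elimination
        used = trial
    return used


# Hypothetical feasibility rule: element 1 is required, plus element 0 or 2.
def toy_rule(x):
    return x[1] == 1 and (x[0] == 1 or x[2] == 1)


print(greedy_drop_by_cost([5, 1, 3], toy_rule))
```

Because the sketch stops at the first infeasible elimination, it may retain cheaper elements that could also have been dropped; that is the price of the heuristic relative to the full enumeration.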
3.5 Model Applications
The structure of the craniofacial-pain diagnostic-classification
model permits model utilization for a variety of purposes. Since the
model is developed in terms of general data-vector and
diagnostic-alternative parameters, these model components can be altered
to suit the application in question. This section presents a brief
discussion of some of the possible applications of the diagnostic
classifier.
In a teaching environment, the diagnostic-classification model with
its set of discriminant weights can be stored for computer-terminal
access. Then, on a set of tutorial example patients, students can compare
their diagnoses with those of the diagnostic model. Moreover, the student
can interact with the classifier in constructing his own 'sample' patients
for the classifier to diagnose. Finally, the student can request the
classifier to relate those discriminant-function weights that the model
employs in considering the 'significance' (Section 3.2) of any one or
group of symptoms.
The effectiveness of new diagnostic tests can be evaluated using the
minimum-cost symptom-selection algorithm. This algorithm provides an
immediate measure of the 'worth' of new research developments. Given a


4.2 Selection of Optimal Treatments
The craniofacial-pain treatment-planning model is transient in the
sense that only two of the model's patient states, well and referred, can
represent the patient's status when he exits the health-care system. In
a stochastic sense, only the terminal states are recurrent as they alone
possess non-zero long-run probabilities of state occupancy. Hence, the
choice of treatment alternatives at each patient state is made with the
goal of minimizing the costs accrued by the patient as he passes through
the diagnostic-alternative-based patient states into one of the recurrent
states.
For notational convenience, in the analytic model the well patient
state is denoted as state 'W' and the referred state as state 'R.' In
modeling the care system for craniofacial-pain patients there is no
justification for providing costs for the transitions from states 'R'
and 'W' to themselves; hence, '$c_{R,R}$' and '$c_{W,W}$' are set equal
to zero. Analytically, the treatment-planning model is made monodesmic,
i.e., having only one recurring state, by defining $p_{R,W}=1$ and
$p_{W,R}=0$. The total number of states, not including states 'W' and
'R,' is denoted by 'S.' With these definitions and the notation
introduced in the previous section, a procedure for selecting the set of
optimal treatment decisions is developed.
Howard [25] has shown that for a monodesmic, transient Markovian
decision model, a set of optimal decisions is defined as those decisions
that maximize the expected-value '$v_i$' of occupying each system-state
'i.' Since the treatment-planning model for craniofacial-pain patients
fits into this category of decision model, a modification of Howard's
algorithm is employed to select optimal treatment regimes. The process of
select-


UF Libraries: Digital Dissertation Project
Cathy Martyniak, Project Coordinator
Christy Shorey, Project Technician
Internet Distribution Consent Agreement
In reference to the following dissertation:
AUTHOR: Leonard, Michael
TITLE: Analytical models for diagnostic classification and treatment
planning for craniofacial pain (record number: 580622)
PUBLICATION DATE: 1973
I, ____________, as copyright holder for the aforementioned
dissertation, hereby grant specific and limited archive and distribution
rights to the Board of Trustees of the University of Florida and its
agents. I authorize the University of Florida to digitize and distribute
the dissertation described above for


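case 2: if
   $a^{(k)} W_j^{(i)} \le a^{(k)} W_z^{(i)} + \alpha$
for a subset B of the p discriminants, $z \in B$, $j \notin B$, let
   $W_j^{(i+1)}$ = [expression illegible in the source scan],
   $W_z^{(i+1)}$ = [expression illegible in the source scan], $z \in B$,
and
   $W_c^{(i+1)} = W_c^{(i)}$ for all $c \notin \{B \cup j\}$,
where $n_B$ = the number of discriminants in the subset B.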
The algorithm is terminated when the values of the $W_j$, j=1,2,...,p,
have not changed during a complete cycle of the t training patterns, i.e.,
when $W_j^{(\theta)} = W_j^{(\theta+1)} = \dots = W_j^{(\theta+t)}$ for
all j, where $\theta$ is the last case 2 pattern examined by the
algorithm.
This algorithm is guaranteed to terminate in a set of feasible
$W_j^*$, j=1,2,...,p, if the training sample is linearly separable and
$\alpha$ and $\beta$ have been appropriately selected. If the training
sample is linearly separable, the algorithm will converge for any fixed
value of $\alpha$ > 0, where $\beta$ is selected appropriately large.
Hence, the algorithm is normally applied to a training sample with
$\alpha$=0 and $\beta$=1. If the algorithm converges, these constants can
be adjusted and the training algorithm reapplied.
The justification for specifying a non-zero $\alpha$ ($\alpha$ = size
of the dead zone) is that as $\alpha$ is increased the accuracy of the
classifier is increased in making classifications of data not used in
developing the discriminant-function weights. For example, with the
craniofacial-pain diagnostic classifier and the test samples discussed in
Section 3.3, the diagnostic model correctly classified approximately 5%
more of the test samples' data vectors when the model was trained with
$\alpha$=30, $\beta$=3 (versus an original training with $\alpha$=0,
$\beta$=1).
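
The role of the dead zone and the increment can be illustrated with a generic multi-class fixed-increment trainer. This sketch is not the modified algorithm of this appendix, whose exact update expressions are not legible here. It assumes an update in which the correct class's weights gain the pattern once per violating rival and each violating rival loses the pattern once, scaled by `beta`. The function name, the `max_cycles` guard, and the toy patterns are hypothetical.

```python
def dot(w, a):
    return sum(wi * ai for wi, ai in zip(w, a))


def train_fixed_increment(patterns, labels, n_classes,
                          alpha=0.0, beta=1.0, max_cycles=1000):
    """Cycle through the patterns until one full cycle makes no correction."""
    n = len(patterns[0])
    weights = [[0.0] * n for _ in range(n_classes)]
    for _ in range(max_cycles):
        changed = False
        for a, j in zip(patterns, labels):
            scores = [dot(w, a) for w in weights]
            # Rivals that the correct class fails to beat by more than alpha.
            rivals = [c for c in range(n_classes)
                      if c != j and scores[j] <= scores[c] + alpha]
            if rivals:
                for k, ak in enumerate(a):
                    # Assumed update rule (see lead-in), scaled by beta.
                    weights[j][k] += beta * len(rivals) * ak
                    for c in rivals:
                        weights[c][k] -= beta * ak
                changed = True
        if not changed:
            return weights
    raise RuntimeError("no feasible weights found within max_cycles")


# Hypothetical binary patterns, each with a constant final element of one.
toy_patterns = [[0, 0, 1], [1, 0, 1], [0, 1, 1]]
toy_labels = [0, 1, 2]
print(train_fixed_increment(toy_patterns, toy_labels, 3))
```

Training with a larger `alpha` forces every training pattern to win its own class by a margin, which is the mechanism the text credits for better accuracy on data outside the training sample.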


the problem of finding process' structure is compounded in the health-care
field by a lack of unifying and consistent nomenclature. In the
health-care field, scholarly literature and historical precedent can serve
as the justification for two or more contradicting sets of terminology for
the same anatomical structure or physiological process. Thus, in
researching the generality of first-order decision-making techniques, the
investigator must consider process variability and nomenclature
inconsistency before he makes any statement about the applicability of
this dissertation's decision-making tools to other health-care
environments.
3. A non-geometric discussion of the criteria for pattern space
separability was presented to provide a means of characterizing
health-care disorders for which diagnostic classification by a linear
pattern classifier might be feasible. Unfortunately, this dissertation's
techniques are heuristic and do not provide an exact reproduction of the
underlying mathematical specifications. Future research in this area
could lead to a precise statement of non-geometric criteria for linear
separability, and thus provide an indirect means for evaluating potential
applications of linear non-parametric classifiers.
4. This dissertation's minimum-cost symptom-selection algorithm
represents a clear departure from previous research in feature selection.
The algorithm's utilization of the convex-hull representation of pattern
space separability makes this development unique in the literature of
feature selection. However, the algorithm's method of checking the
feasibility of potential feature collections is extremely tedious. A more
efficient method to check feature-collection feasibility may be revealed
through future investigations in this area.


APPENDIX E
STABILITY OF TRANSITION-PROBABILITY ESTIMATES
This appendix presents the technique employed to determine the
stability of transition-probability estimates of treatment effectiveness
for patients who occupy the same patient state, but who exhibit varying
combinations of relevant data-vector elements. The analysis shown here
is limited to one treatment alternative, treatment 24, and one diagnostic
classification, Diagnostic Alternative 13, but a similar investigation
was performed for a majority of the other diagnostic classifications and
treatment alternatives. Variability of the data-based estimates (Section
4.1.2) of transition probabilities is analyzed in terms of five factors:
patient's sex, patient's age, duration of patient's pain, nature of
patient's pain (continuous or episodic), and number of replications of the
same treatment alternative.
Statistically, '2-way' contingency tables [29] measure the effects
of these relevant factors. The rows of each table specify the number
of transitions out of Diagnostic Alternative 13 following an application
of treatment 24, and each table's columns specify a value for the factor
being analyzed. A chi-squared statistic $\chi^2$ is employed to test for
independence between the number of transitions and the factor in question.
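
As a brief illustration of the test described above, the following sketch computes the usual Pearson chi-squared statistic and its degrees of freedom for a two-way table of counts. The counts shown are hypothetical and are not taken from the craniofacial-pain data. The function name is an assumption, and every row and column total is assumed to be non-zero.

```python
def chi_squared_statistic(table):
    """Return (statistic, degrees of freedom) for a 2-way table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(col_totals) - 1)
    return statistic, dof


# Hypothetical counts: rows are transition outcomes, columns are factor levels.
toy_counts = [[10, 20],
              [20, 10]]
print(chi_squared_statistic(toy_counts))
```

The statistic would then be compared with a tabulated chi-squared critical value for the computed degrees of freedom; a value below the critical value gives no evidence against independence.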


patients will exhibit symptoms that lead to any one of several
alternative courses of patient care. Altering the occlusion of the natural
teeth is one means of treating craniofacial-pain patients. Although in
many cases minor occlusal abnormalities are only contributing factors to
a patient's pain, attention by the dentist to occlusion is at least
partially successful for a majority of craniofacial-pain patients [8].
However, it is important in early therapy not to alter the occlusion
irreversibly. Treatment by means of tooth extraction or endodontics, jaw
fixation, prosthetic devices, or by topical treatments may also be
suggested by the patient's symptoms. The articular surface of the
mandibular condyle has an excellent reparative capacity [6]. Thus, the use
of sedatives, antibiotics, and muscle relaxants, along with physical
therapy, often leads to patient 'cures' as these treatments ease the
patient's pain and increase jaw mobility while natural restoration of the
joint is in progress. If, after a reasonable length of time (3 to 6
months) the patient's symptoms are not relieved, the dentist may consider
referral to another source of care or therapy such as surgery [7].
Typically, the health-care process for craniofacial-pain patients
may be viewed as following the format of Figure 2 [9]. When a patient
is admitted into the care system, he undergoes a data-collection process.
This involves taking a 'full and pertinent' patient history and a
physical examination of the areas of discomfort. The data gathered consist
of symptoms, signs, medical and/or dental history, physical examination
findings, psychosocial information, and so forth. Once these elements
have been elicited, a diagnosis is attempted. If this is not yet
possible, the severe symptoms are treated and the patient's health state
is monitored.


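where
$$A_i = \begin{bmatrix}
a_{i1}^{1} & a_{i2}^{1} & \cdots & a_{in}^{1} & 1 \\
a_{i1}^{2} & a_{i2}^{2} & \cdots & a_{in}^{2} & 1 \\
\vdots & \vdots & & \vdots & \vdots \\
a_{i1}^{m_i} & a_{i2}^{m_i} & \cdots & a_{in}^{m_i} & 1
\end{bmatrix}$$
$W_i = [w_{i1}, w_{i2}, \ldots, w_{in}, w_{i,n+1}]$
$C^T = [c_1, c_2, \ldots, c_n, 0]$
$X^T = [x_1, x_2, \ldots, x_n, 1]$
and $w_{ij}$ is an unrestricted variable
    $c_j$ is the cost of using feature j
    $x_i$ = 0 if feature i is not used
          1 if feature i is used.
Note: The notation $\Box$ is to be read as element by element
multiplication, i.e., $Q \,\Box\, R = S = [s_{ij}] = [q_{ij} r_{ij}]$.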
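
The element-by-element product in the note, and the way a 0/1 selection vector can switch features off, can be sketched as follows. How problem P1 combines these pieces is not restated here, so `mask_features` and `selection_cost` are hypothetical helpers meant only to show the arithmetic; the numbers are invented.

```python
def hadamard(q, r):
    """Element-by-element product of two equally shaped matrices (lists of rows)."""
    return [[qij * rij for qij, rij in zip(q_row, r_row)]
            for q_row, r_row in zip(q, r)]


def mask_features(pattern_rows, x):
    """Multiply every pattern row element by element with the selection vector x."""
    return hadamard(pattern_rows, [x] * len(pattern_rows))


def selection_cost(c, x):
    """Total cost of the selected features: the inner product of c and x."""
    return sum(cj * xj for cj, xj in zip(c, x))


# Hypothetical two-pattern class with three features and a trailing constant 1.
toy_class = [[1, 1, 0, 1],
             [0, 1, 1, 1]]
toy_x = [1, 0, 1, 1]      # feature 2 is not used
toy_c = [4, 2, 7, 0]      # the final cost entry is zero, as in the definition of C
print(mask_features(toy_class, toy_x), selection_cost(toy_c, toy_x))
```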
3.4.1 Algorithm Development
The algorithm developed to solve problem P1 is an enumerative
algorithm similar in structure to that of Balas [28]. Unfortunately,
the non-linear nature of problem P1's constraints prohibits full
implementation of the more powerful techniques used in implicit
enumeration on linear integer problems. The structure of these constraints
and their effect on the optimization of P1 will be discussed in a
step-by-step development.
The minimum-cost feature-selection algorithm does not solve P1 to
the extent of finding the values of the vectors $W_i$, i=1,2,...,p. This


3.4.2 Statement of the Minimum-Cost Symptom-Selection Algorithm
Step 0: Create the assignment vector (at this point the vector is
        null as there is no variable assignment in the vector).
        Set V*=-∞ and go to Step 4.
Step 1: Start at the right side of the assignment vector and move
        to left, stopping at the first variable assigned a zero
        value. If no variable in the assignment vector has a
        zero assignment, go to Step 2. Otherwise go to Step 3.
Step 2: Calculate V for the assignment vector. If V is greater
        than V*, record the values of the variables in the
        assignment vector as the optimal solution X* to P1. Otherwise,
        record (as the optimal solution X* to P1) the values of the
        variables in the best current solution X. Terminate the
        algorithm.
Step 3: Change the value of the variable isolated in Step 1 to an
        assigned value of one, and eliminate from the assignment
        vector all variable assignments to the right of this new
        assignment. If the assignment vector includes the
        assignment $x_i$=1 for every $x_i$ in X return to Step 2.
        Otherwise go to Step 4.
Step 4: Select a variable $x_i$ that is not an element of the
        assignment vector. Assign this variable the value $x_i$=0 in
        the assignment vector. Use Procedure 2 to check the feasibility
        of this assignment. If the assignment vector is not
        feasible, go to Step 6. Otherwise go to Step 5.
Step 5: If the assignment vector with the new assignment $x_i$=0 does
        not include an assignment for every $x_i$ in X, return to
        Step 4. Otherwise go to Step 7.


I certify that I have read this study and that in my opinion it
conforms to acceptable standards of scholarly presentation and is fully
adequate, in scope and quality, as a dissertation for the degree of
Doctor of Philosophy.
Kerry E. Kilpatrick, Chairman
Assistant Professor of Industrial and
Systems Engineering
I certify that I have read this study and that in my opinion it
conforms to acceptable standards of scholarly presentation and is fully
adequate, in scope and quality, as a dissertation for the degree of
Doctor of Philosophy.
Thomas B. Fast
Professor and Chairman of the Division
of Oral Diagnosis
I certify that I have read this study and that in my opinion it
conforms to acceptable standards of scholarly presentation and is fully
adequate, in scope and quality, as a dissertation for the degree of
Doctor of Philosophy.
I certify that I have read this study and that in my opinion it
conforms to acceptable standards of scholarly presentation and is fully
adequate, in scope and quality, as a dissertation for the degree of
Doctor of Philosophy.
Professor and Chairman, Department of
Basic Dental Sciences


applications, or by considering at any stage of analysis the effects of
a fixed number of future treatments. Validation of the decisions
generated by these models has thus far been limited to checks on the
feasibility of the treatment regimens selected. Unfortunately, the
finite-horizon models either do not consider the possibility of a
patient's prolonged stay in the health-care system, as is the case of the
models with a maximum number of possible treatments, or, where only a
fixed number of future treatments is considered, they provide no more than
a heuristic treatment-selection procedure.
2.4 Uncertain-Duration Treatment Planning
Bunch and Andrew [11] have considered the possibility of prolonged
occupation of the same diagnostic state during the course of a patient's
progression through the care system. In their Markovian representation
of the care system for mid-shaft fractures of the femur, they provide
this modeling refinement. As a consequence of this modification, the
number of treatment decisions made for each patient is a random variable
with no fixed upper bound. Howard's iterative scheme for policy
selection [25] provides the means for choosing the optimal treatment
regimen by selecting treatment alternatives that maximize the relative
'value' of occupying each disease state. Although the Bunch and Andrew
model did not consider return visits to the same disease state, a more
generalized Markovian representation could incorporate that possibility.
Nevertheless, the proximity to reality that this category of transient
Markovian models provides requires considerable effort as holding-time
distributions, treatment 'costs,' and transition probabilities must be
supplied by the analyst for all treatment alternatives at each of the
disease states in the care system.
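
The idea of iterating between evaluating a treatment policy and improving it can be sketched for a small transient model. This is a simplified illustration rather than Howard's algorithm as published: it ignores holding-time distributions, it expresses 'value' as expected total cost to be minimized, and it evaluates a policy by successive approximation instead of solving the linear equations directly. All states, costs, and probabilities are hypothetical. Probability mass missing from a row is taken to lead to an absorbing exit state of zero further cost.

```python
def q_value(i, k, v, costs, trans):
    """Expected total cost of taking decision k in state i, then following v."""
    return costs[i][k] + sum(p * vj for p, vj in zip(trans[i][k], v))


def evaluate(policy, costs, trans, tol=1e-12, max_sweeps=100000):
    """Expected total cost per state under a fixed policy (successive approximation)."""
    v = [0.0] * len(policy)
    for _ in range(max_sweeps):
        new_v = [q_value(i, policy[i], v, costs, trans) for i in range(len(policy))]
        if max(abs(x - y) for x, y in zip(new_v, v)) < tol:
            return new_v
        v = new_v
    raise RuntimeError("policy evaluation did not converge")


def policy_iteration(costs, trans):
    """Alternate evaluation and improvement until the policy stops changing."""
    policy = [0] * len(costs)
    while True:
        v = evaluate(policy, costs, trans)
        improved = []
        for i in range(len(costs)):
            q = [q_value(i, k, v, costs, trans) for k in range(len(costs[i]))]
            best = min(range(len(q)), key=q.__getitem__)
            if q[policy[i]] - q[best] < 1e-9:
                best = policy[i]  # keep the current decision on ties
            improved.append(best)
        if improved == policy:
            return policy, v
        policy = improved


# Hypothetical model: two disease states, two treatments per state.
toy_costs = [[30.0, 10.0], [40.0, 5.0]]
toy_trans = [[[0.1, 0.0], [0.5, 0.0]],   # state 0: probabilities of moving to states 0, 1
             [[0.0, 0.0], [1.0, 0.0]]]   # state 1
print(policy_iteration(toy_costs, toy_trans))
```

In the toy model the cheaper treatment is preferred in both states even though it returns the patient to a disease state more often, because its expected total cost to exit is lower.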


Step 6: If the assignment vector with the assignment $x_i$=1 ($x_i$ is
        the variable selected in Step 4) does not include an assignment
        for every $x_i$ in X, return to Step 4. Otherwise go to Step 7.
Step 7: Calculate V for the assignment vector. If V* is greater
        than V, go to Step 1. Otherwise go to Step 8.
Step 8: Record as the best current solution X the values of the
        variables in this assignment vector. Set V*=V, and return
        to Step 1.
Note that in the course of applying this algorithm all solutions are
considered and the best current solution is replaced only when another
solution has a larger associated value. As the number of possible solutions
is finite, the algorithm must terminate, and at this termination the value
of the optimal solution and its assignments are known. An application of
the minimum-cost symptom-selection algorithm is presented in Appendix C.
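
The enumeration can be sketched as a depth-first search that tries to drop each feature before keeping it. This is a simplified illustration rather than a transcription of Steps 0 through 8. It assumes that the quantity being optimized is total feature cost (to be minimized), that the full feature set is feasible, and that a partial assignment is checked with all unassigned features counted as used. `is_feasible` is a hypothetical stand-in for Procedure 2, and the toy costs and rule are invented.

```python
def min_cost_selection(costs, is_feasible):
    """Return (minimum cost, 0/1 assignment) over all feasible feature subsets.

    Assumes the all-ones assignment is feasible and that dropping further
    features can never repair an infeasible assignment.
    """
    n = len(costs)
    best = {"cost": float("inf"), "x": None}

    def search(prefix):
        if len(prefix) == n:
            cost = sum(c for c, xi in zip(costs, prefix) if xi)
            if cost < best["cost"]:
                best["cost"], best["x"] = cost, list(prefix)
            return
        # Try x_i = 0 first; features not yet assigned count as used.
        trial = tuple(prefix + [0] + [1] * (n - len(prefix) - 1))
        if is_feasible(trial):
            search(prefix + [0])
        # Then x_i = 1, which leaves feasibility unchanged.
        search(prefix + [1])

    search([])
    return best["cost"], best["x"]


# Hypothetical feasibility rule: element 1 is required, plus element 0 or 2.
def toy_feasible(x):
    return x[1] == 1 and (x[0] == 1 or x[2] == 1)


print(min_cost_selection([5, 1, 3], toy_feasible))
```

Skipping the whole subtree below an infeasible zero assignment is what makes the enumeration implicit rather than explicit; without it every one of the 2^n assignments would be visited.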
3.4.3 Computational Considerations
Returning to the setting of diagnostic classification of
craniofacial-pain patients, application of the minimum-cost
symptom-selection algorithm would require an enumeration (explicit or
implicit) over $2^{295}$ possible solutions in order to find the optimal
collection of data-vector elements. As the number of possible solutions is
prohibitively large, heuristic modifications to the symptom-selection
algorithm are required for this application. One possible modification
could employ the fact that only a few of the elements in the patient data
vector have large associated 'costs' for their utilization. In particular,
the eight elements of radiographic data and the two measures of emotional
trauma are significantly more 'costly' to examine than the other items in
the data vector.


TABLE 4
CLASSIFICATION VARIABILITY AMONG DENTAL PRACTITIONERS

                          Diagnostic Classification for
                 Patient 1  Patient 2  Patient 3  Patient 4  Patient 5+
Original
Classification       4         13         15         15          9
Practitioner 1       1          7         15         15          3
Practitioner 2       6         12         15          8          3
Practitioner 3       4         15         15         15         13
Practitioner 4       4         15         15         14          *

* No classification given
+ Patient 5 exhibited a minimal amount of input data (only 17
  non-zero data-vector entries)
These four dental practitioners exhibited 100.0% agreement of the
diagnosis on one of the five patients, and 50.0% agreement on the
diagnostic classification of the remaining four patients.


Note that relation (2) is the requirement for pattern separability
by linear discriminants. Hence, a vector $\hat{X}$ is a component in a
feasible solution $\{\hat{X}, \hat{W}_i\}$ to P1 if and only if there
exist $\hat{W}_i$, i=1,2,...,p, such that (2) holds for all $i \neq j$.
As discussed in Section 3.1, a pattern space is linearly separable, and
hence, feasible $\hat{W}_i$ exist, if and only if the individual pattern
classes have non-intersecting convex hulls. For the pattern vectors
considered in this section, the individual components of each of the
patterns in each pattern class are either zero or one. As there is a
one-to-one correspondence between the individual patterns in a pattern
class and the vertices of the pattern class's convex hull, the convex hull
of a pattern-class $\hat{A}_i$ can be expressed as all convex
combinations of the individual pattern-class vectors $a_i^m$,
m=1,2,...,$m_i$. Consider the following examples of the convex-hull
representation of linear separability.
Assume $a_X^1$ = [1,0], $a_X^2$ = [1,1], $a_Y^1$ = [0,0], and
$a_Y^2$ = [0,1]. Graphically this pattern space can be represented as
[Figure: pattern space with Feature 1 on the horizontal axis and
Feature 2 on the vertical axis, showing segment Y from $a_Y^1$ to
$a_Y^2$, segment X from $a_X^1$ to $a_X^2$, and a discriminating
line $\theta$ between them]
where the line X from $a_X^1$ to $a_X^2$ represents the convex hull of
pattern-class X and the line Y from $a_Y^1$ to $a_Y^2$ represents the
convex hull of pattern-class Y. Since X and Y do not intersect, implying
that the space is linearly separable, it is possible to draw an infinite
number of lines $\theta$ that serve as discriminating hyperplanes.
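
The example can be checked numerically. The sketch below tests whether a candidate line w1*f1 + w2*f2 + b = 0 places every pattern of class X strictly on the positive side and every pattern of class Y strictly on the negative side. Because a linear function that is positive at the end points of a segment is positive along the whole segment, checking the patterns themselves is enough to separate the two convex hulls. The particular lines tried are hypothetical choices.

```python
CLASS_X = [(1, 0), (1, 1)]
CLASS_Y = [(0, 0), (0, 1)]


def separates(w1, w2, b, class_x=CLASS_X, class_y=CLASS_Y):
    """True if the line puts class X strictly positive and class Y strictly negative."""
    def value(p):
        return w1 * p[0] + w2 * p[1] + b
    return (all(value(p) > 0 for p in class_x)
            and all(value(p) < 0 for p in class_y))


# Several different lines all separate the two classes.
for line in [(1, 0, -0.5), (1, 0.2, -0.6), (1, -0.2, -0.4)]:
    print(line, separates(*line))

# A horizontal line through Feature 2 = 0.5 does not.
print((0, 1, -0.5), separates(0, 1, -0.5))
```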


LIST OF TABLES
Tables
1. Survey of Diagnostic-Classification Models
2. Correlation Between Significant Symptoms and
   Discriminant-Function Weights
3. Tests of Diagnostic Classifier Accuracy
4. Classification Variability Among Dental Practitioners
5. Mean Transit Times Through the Craniofacial-Pain Care System


Then $\log\left[\prod_{s_i \in S} P[s_i \mid C_j]\right] = aW_j$, and
decision rule (1) can be restated as
    classify a patient who is characterized by the vector a in the
    jth diagnostic alternative if
    $aW_j > aW_k$ for all $k \neq j$. (2)
Note that decision rule (2) is identical to the decision rule employed
in non-parametric pattern classification.
This equivalence implies that if (1) holds for every preclassified
patient examined, the values $\log P[s_i \mid C_j]$ form a set of feasible
discriminant-function weights. If (1) leads to the correct classification
of a majority of the patients examined, it is logical to assume that there
may be a set of feasible discriminant-function weights. This assumption
was examined using the craniofacial-pain patient data. From the data
vectors classified in Diagnostic Alternatives 13, 14, and 15, a total of
189 patient visits, the $P[s_i \mid C_j]$ were calculated. Each data
vector was then classified with decision rule (1), and 164 of the data
vectors (86.7%) were assigned to their pre-established diagnostic
alternative.
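
The equivalence between the product form of the decision rule and a linear discriminant can be checked on a small example. The sketch assumes that rule (1) compares the product of the conditional symptom probabilities over the symptoms a patient exhibits. The probability values and function names are hypothetical, and all probabilities are taken to be strictly positive so that their logarithms exist.

```python
import math


def log_weights(prob):
    """prob[j][k] = P[symptom k | alternative j]; returns log-probability weights."""
    return [[math.log(p) for p in row] for row in prob]


def discriminant(a, w):
    """Inner product of a 0/1 symptom vector with one weight vector."""
    return sum(ak * wk for ak, wk in zip(a, w))


def product_rule(a, row):
    """Product of the conditional probabilities of the exhibited symptoms."""
    result = 1.0
    for ak, p in zip(a, row):
        if ak:
            result *= p
    return result


def classify(a, weights):
    """Index of the alternative with the largest linear discriminant."""
    scores = [discriminant(a, w) for w in weights]
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical conditional probabilities for two alternatives, three symptoms.
toy_prob = [[0.8, 0.1, 0.5],
            [0.2, 0.7, 0.5]]
toy_w = log_weights(toy_prob)
patient = [1, 0, 1]
print(classify(patient, toy_w), [product_rule(patient, r) for r in toy_prob])
```

Since the logarithm is increasing, ranking the alternatives by the linear discriminant gives the same ordering as ranking them by the product, which is the equivalence the text relies on.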
The second criterion provides a subjective measure of the
feasibility of using a non-parametric pattern classifier. If symptoms for
most of the diagnostic alternatives, associated with the disorder of
interest, can be isolated such that
1. a patient's exhibition of a subset of these symptoms leads
the practitioner to a selection of one of the diagnostic
alternatives, or
2. a patient's exhibition of a subset of these symptoms leads
the practitioner to eliminate from further consideration