Conceptual modeling

Material Information

Title:
Conceptual modeling: an object-oriented approach to requirements specification
Physical Description:
x, 199 leaves : ill. ; 28 cm.
Language:
English
Creator:
Duggins, Sheryl L., 1954-
Publication Date:
1991

Subjects

Genre:
bibliography   ( marcgt )
theses   ( marcgt )
non-fiction   ( marcgt )

Notes

Thesis:
Thesis (Ph. D.)--University of Florida, 1991.
Bibliography:
Includes bibliographical references (leaves 194-198).
Statement of Responsibility:
by Sheryl L. Duggins.
General Note:
Typescript.
General Note:
Vita.

Record Information

Source Institution:
University of Florida
Rights Management:
All applicable rights reserved by the source institution and holding location.
Resource Identifier:
aleph - 001693338
notis - AJA5417
oclc - 25221974
System ID:
AA00003295:00001

Full Text













CONCEPTUAL MODELING:
AN OBJECT-ORIENTED APPROACH TO REQUIREMENTS SPECIFICATION















By

SHERYL L. DUGGINS


A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL
OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT
OF THE REQUIREMENTS FOR THE DEGREE OF
DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA


1991































Copyright 1991

by

Sheryl L. Duggins
















ACKNOWLEDGMENTS


I would like to express my gratitude to Dr. Douglas

Dankel II, my advisor, dissertation supervisor, and

colleague, for his support, suggestions, prompt reading of

the manuscript, and most importantly, the freedom he

allowed me in pursuing my academic interests.

I would also like to express my gratitude to Dr.

Stephen Thebaut, my advisor and committee member, for his

interest, support, encouragement, and thought-provoking

suggestions.

My sincere appreciation is extended to my committee

chairman, Dr. Paul Fishwick, and to my committee members,

Dr. Randy Chow and Dr. Ronald Rasch, for their interest,

assistance, and support.

Finally, I am eternally grateful to my sons, Ryan and

Tyler, for their cooperation and understanding during all of

the hours I was too busy working to spend time with them;

and to my wonderful, loving husband, Dr. Macklin Duggins,

for his encouragement, support, proofreading, suggestions,

assistance, understanding, and patience during these

difficult years when we both were working very hard to

attain our goals of becoming Dr. and Dr. Duggins.


TABLE OF CONTENTS

                                                            page

ACKNOWLEDGMENTS ............................................ iii

LIST OF FIGURES ............................................. vi

ABSTRACT .................................................... ix

CHAPTERS

1   INTRODUCTION ............................................. 1

    1.1  Problem Statement ................................... 8
    1.2  Research Agenda ..................................... 9
    1.3  Dissertation Outline ............................... 10

2   RELATED WORK ............................................ 12

    2.1  Requirements Methodologies ......................... 12
    2.2  Basic Concepts of Conceptual Modeling .............. 14
    2.3  Conceptual Modeling and Object-oriented
         Approaches ......................................... 18
    2.4  Conceptual Modeling of Requirements ................ 20
    2.5  Other Related Work ................................. 24

3   THE TARGET LANGUAGE ..................................... 28

    3.1  The Framework ...................................... 28
    3.2  The Target ......................................... 31

4   THE METHODOLOGY ......................................... 51

    4.1  Identify Objects and Processes ..................... 51
    4.2  Model System Dynamics .............................. 54
    4.3  Model Structure of Entities ........................ 56
    4.4  Model Structure of Processes ....................... 62
    4.5  Refine Entities .................................... 67
    4.6  Refine Processes ................................... 73
    4.7  Annotate to Refine Dynamics ........................ 79
         4.7.1  Model Internal Process Behavior ............. 80
         4.7.2  Model External Process Behavior ............. 88
         4.7.3  Model Entity Behavior ...................... 114
    4.8  Specify Constraints ............................... 115
         4.8.1  Model Internal Process Constraints ......... 116
         4.8.2  Model External Process Constraints ......... 118
         4.8.3  Model Entity Constraints ................... 136
    4.9  Write Formal Requirements in RML' ................. 137
         4.9.1  Translating from Graphics to RML' .......... 140
         4.9.2  RML' Specification ......................... 147
    4.10 Developing a Consistent Model ..................... 153

5   SUMMARY ................................................ 157

    5.1  Evaluation ........................................ 158
    5.2  Limitations of the Study .......................... 159
    5.3  Directions for Future Research .................... 160

APPENDICES

A   RML' SYNTAX ............................................ 162

B   SEMANTICS OF 'IS-A' AND 'IN' ........................... 165

C   THE COMPLETE HOSPITAL/PATIENT MODEL .................... 169

REFERENCES ................................................. 194

BIOGRAPHICAL SKETCH ........................................ 199


















LIST OF FIGURES

Figure 1.  RML' Definitional Property Categories ........... 35
Figure 2.  Classification Levels
Figure 3.  Objects and Processes
Figure 4.  System Dynamics
Figure 5.  Entity Generalization Structure
Figure 6.  Entity Aggregation Structure
Figure 7.  Process Generalization Structure
Figure 8.  Process Aggregation Structure
Figure 9.  Refined Entity Generalization Structure
Figure 10. Refined Entity Aggregation Structure
Figure 11. Refined Process Generalization Structure
Figure 12. Refined Process Aggregation Structure
Figure 13. Process Interface
Figure 14. Non-elementary Process
Figure 15. Elementary Process Decomposition
Figure 16. Preliminary Act-/Pre-/Postconditions
Figure 17. Pre-/Postconditions for Entity Life-Cycle Stages
Figure 18. Branch and Join Junctors
Figure 19. Distributed Branch and Join Junctors
Figure 20. Disjunctive Branch and Join Junctors
Figure 21. Exclusive Disjunctive Branch and Join Junctors
Figure 22. Conjunctive Branch and Join Junctors ........... 100
Figure 23. Preliminary Refined System Dynamics ............ 101
Figure 24. Refined System Dynamics ........................ 104
Figure 25. Sequential Components of Process P ............. 108
Figure 26. Parallel Components of Process P ............... 110
Figure 27. Preliminary Temporal Constraints ............... 114
Figure 28. Preliminary Entity Constraints ................. 115
Figure 29. RML' Constraints and Act-/Pre-/Postconditions .. 117
Figure 30. DFD Metaclass Definition ....................... 121
Figure 31. RML' Object Definition with a Ddf View ......... 123
Figure 32. RML' Object Specification of Ddf Constraints ... 124
Figure 33. BRANCH and JOIN Metaclasses .................... 125
Figure 34. DISTRIBUTED BRANCH and JOIN Metaclasses ........ 126
Figure 35. DISJUNCTIVE BRANCH and JOIN Metaclasses ........ 126
Figure 36. EXCLUSIVE DISJUNCTIVE BRANCH and
           JOIN Metaclasses ............................... 127
Figure 37. CONJUNCTIVE BRANCH and JOIN Metaclasses ........ 127
Figure 38. RML' Object Definition with Junctor
           Constraints .................................... 129
Figure 39. RML' Object Specification of Junctor
           Constraints .................................... 130
Figure 40. RML' Temporal Constraints ...................... 134
Figure 41. RML' Constraints for Entity Life-Cycle
           Stages ......................................... 135
Figure 42. RML' Entity Constraints ........................ 136
Figure 43. Entity Structure Diagram to Translate .......... 143
Figure 44. Entity Object Translation ...................... 143
Figure 45. Process Structure Diagram to Translate ......... 146
Figure 46. Process Object Translation ..................... 147
Figure 47. BE_A_HOSPITAL_PATIENT/EVALUATE ................. 170
Figure 48. ADMIT/PERFORM TESTS ............................ 171
Figure 49. Process Generalization --
           BE_A_HOSPITAL_PATIENT .......................... 172
Figure 50. Entity Generalization .......................... 173
Figure 51. Entity Aggregation ............................. 174
Figure 52. Process Generalization -- ADMIT ................ 175
Figure 53. Process Aggregation -- ADMIT ................... 176
Figure 54. Process Generalization -- EVALUATE ............. 177
Figure 55. Process Aggregation -- EVALUATE ................ 178
Figure 56. Process Aggregation -- TREAT/RELEASE ........... 179

















Abstract of Dissertation Presented to the Graduate School
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Doctor of Philosophy


CONCEPTUAL MODELING:
AN OBJECT-ORIENTED APPROACH TO REQUIREMENTS SPECIFICATION

By

SHERYL L. DUGGINS

August 1991



Chairman: Paul Fishwick
Cochairman: Douglas Dankel II
Major Department: Computer and Information Sciences


This dissertation develops a methodological

approach for requirements analysis and specification that

utilizes conceptual modeling to incrementally develop and

formally specify requirements. The methodology provides a

graphical formalism for specification within an object-

oriented framework. Classification, generalization, and

aggregation are utilized for organizing and identifying the

requirements of a system. The modeling strategy integrates

conceptual modeling concepts such as hierarchical

structures, the role of the environment, and a temporal

structure into an object-oriented framework that emphasizes

information hiding, data abstraction, and inheritance.

This truly unified approach results in requirements that are

reliable, modifiable, and easy to understand.

Requirements Modeling Language Prime (RML'), a formal

requirements specification language, was defined to

simplify the syntax and extend the underlying semantics of

Greenspan's RML. The graphical components of the

methodology are formally defined in RML' and the results of

the methodology are a formal requirements specification

expressed in RML'.

This dissertation addresses the problem of

requirements modeling from a knowledge representation

perspective. One of the major tasks addressed by this

research was to discover what types of knowledge should be

expressed during the various stages of requirements

acquisition and to determine how to best represent that

knowledge. The methodology incorporates this information

and guides the analyst through the entire requirements

modeling task.

















CHAPTER 1
INTRODUCTION


Software development in the 1960s was in its infancy

and was characterized by zealous developers trying to

discover innovative ways to utilize computer technology. In

the late 1960s, the field of software engineering emerged,

based on the idea that an engineering-like discipline could

be applied to building software systems. In the two decades

since, there have been significant advances in the area of

software development techniques. However, current

researchers are still searching for a solution to the

problems associated with the "software crisis."

The term "software crisis" refers to the fact that a

significant percentage of delivered software systems were

completely unsatisfactory, extremely late, far over their

budgets, and poorly suited for the intended users of the

system. The reasons cited to explain these failures focus

on a lack of communication between the developers and the

users, a lack of understanding of the problem and the

environment, and overly optimistic estimates regarding

development time and cost [Was80]. Consequently, many of

the problems associated with the software crisis can be

attributed to deficiencies in the earliest phase of the

software life-cycle.

There is currently no consensus on what to call the

earliest phase of the life-cycle or how it should be

defined. The term requirements analysis and specification

is used in this dissertation for the first life-cycle phase.

This phase consists of two distinct components:

requirements analysis and requirements specification. It

should be noted that, although the following definitions are

representative of the general usage, these terms do not

reflect a standard.1

Requirements analysis is the process of
gathering all of the relevant information to be
used in understanding a problem situation prior to
system development.

Requirements specification is the
documentation of the requirements analysis
[Gree82].

Because the problems associated with the software

crisis originate during requirements analysis and

specification, the ramifications affect all aspects of

software development, but are particularly significant for

maintenance. Errors introduced during the earliest phases

of development are the hardest to correct and take from 1.5

to 3 times the effort of an implementation error to correct

[Yeh84]. In terms of cost, software maintenance constitutes


1The term requirements definition is often used to
describe the analysis process. It is also used by other
authors to define the documentation of the requirements
analysis. Because of this conflicting usage, alternative
terms were chosen.

between 66% and 80% of the life-cycle cost and about two

thirds of that cost can be attributed to misconception--

not identifying the real needs or improper conceptual

modeling [She87].

The lack of thorough attention to requirements analysis

is even more significant in large, complex projects. In two

large command-control systems, 67% and 95% of the software

had to be rewritten after final delivery because of

mismatches with user requirements [Boe73]. In some cases, a

lack of attention to requirements results in entire projects

being cancelled due to the infeasibility of successful

completion. Two expensive examples are the $56 million

Univac-United Airlines reservation system and the $217

million Advanced Logistics System [Boe84].

It has been shown that insufficient effort spent on

requirements analysis and specification can result in

increased maintenance costs and, in extreme cases, can lead

to the total cancellation of projects. However, these

economic ramifications are only part of the problem. The

role played by the requirements specification is so

pervasive that the success of the entire project depends

upon it. This scope is discussed by Yeh and his colleagues

[Yeh84, p. 519-520]:

It is the basis for communication among customers,
users, designers, and implementers of the system,
and unless it represents an informed consensus of
these groups, the project is not likely to be a
success. It must also carry the weight of
contractual relationships between parties that

sometimes become adversaries. In particular, the
design and implementation must be validated
against it. ... In short, because the
requirements phase comes so early in development,
it has a tremendous impact on the quality (or lack
thereof) of the development effort and the final
product.

The requirements phase is often primarily associated

with the task of identifying user requirements. Obviously,

that is one necessary component, but the primary purpose of

requirements analysis and specification is to gain a

thorough understanding of the problem situation and to

communicate that understanding to others [Fre80].

Consequently, researchers have proposed solutions to the

requirements problem that include:

1. constructing conceptual models to aid

understanding;

2. developing better requirements methodologies;

and

3. expressing requirements in formal languages

for unambiguous communication.

Due to the complexity of systems today, an additional

layer of understanding is needed between the real world and

the requirements specification. Many researchers have

proposed constructing a conceptual model during the early

stage of software development to assist the analyst in

understanding the problem situation before the solution

system is described ([Rou79], [Bub80], [Rol82], [Gre84],

[Yeh84], [She87], [Kun89], and [Som89]). For all but

trivial systems, this explicit, precisely defined conceptual

model is needed to share that understanding with the group

of people involved.

The precise contents of the conceptual model vary with

the researcher, but generally it is viewed as an abstract

model of the application reality, i.e., the system within

its environment. It should be designed by the use of

abstraction and negotiations between users. It describes

the users' views of the context of the system, often with

multiple views being significant. The model typically

identifies major user services, important relationships, and

all external properties of the system. It depicts

abstraction, assumptions, and constraints about the

application and usually views the application in an

extended time perspective. In short, the model forms the

semantic basis for the contents of the system [Bub80].

Another proposed solution to the requirements problem

involves developing better requirements methodologies. The

lack of methodological approaches for requirements modeling

is frequently acknowledged in the literature. Shemer

[She87] states that the lack of a framework for

understanding creates confusion and poor performance in

practicing analysts. Without the proper conceptual tools,

analysts focus on the technical aspects of the developed

system rather than on understanding the problem. Shemer

cites Freeman as stating that too frequently, too many

analysts are not producing the results that are needed and,

moreover, are incapable of producing them because analysts

are lacking appropriate intellectual tools.

The inadequacy of current approaches for requirements

modeling is also acknowledged by Yeh and his colleagues

[Yeh84] and is attributed to the fact that most techniques

concentrate on functional requirements and provide weak

structures for expressing them. They state that current

techniques primarily offer tools (predominantly languages),

rather than guidelines for analysis or specification.

Another proposed solution to the requirements problem

focuses on the use of formal specification languages.

Current software requirements documents are usually written

in natural language. Problems resulting from using natural

language for specifying requirements have been widely

reported ([Bel77], [Ros77a], and [Gre84]), and include:

unstated assumptions and undefined concepts; ill-defined and

ambiguous terms; requirements mixed with design and

implementation decisions; poorly organized or fragmented

descriptions; monolithic requirements; and inconsistent and

incomplete descriptions.

One alternative to specifying requirements in natural

language is to express them in a formal requirements

specification language. A formal specification language is

one whose vocabulary, syntax, and semantics are formally

defined, i.e., based on mathematics. Formal languages for

specifying requirements include RML [Gre84], the language

used in the CIAM approach [Gus82], Gist [Gol80], REFINE

[Rea87], and ERAE [Dub89].

It is believed that expressing requirements in a formal

specification language should lead to an unambiguous

requirements specification, and as such, should be ideal as

the basis of a contract between developers and system

procurers. Regardless of that potential, formal

specification is not widely used because of the lack of

familiarity with these new techniques and because formal

methods appear to be difficult to use due to their

proximity to mathematics and logic. Furthermore, as

Sommerville [Som89, p. 93] states, "Unfortunately, the

current state of user experience (and, realistically, the

experience of many software engineers) is such that they

would not understand a specification written in a formal

specification language and would be unwilling to accept it

as a contract basis."

Thus, it is assumed that the widespread utilization of

formal requirements specification languages will not be the

norm for many years. In the meantime, suggestions for

formalizing requirements specification include: the use of

compromise notations that add more structure than that

provided by natural language; tool support for formal

specification development; the use of graphics to structure

specifications; and methods that focus on developing the

specification in an incremental fashion [Som89].


1.1 Problem Statement

Several proposals have been discussed that purport to

improve the current state of requirements analysis and

specification. However, none of these suggestions have

proven effective as practical solutions to the software

crisis. It is this author's contention that the lack of

success is not due to any inherent deficiency in either of

the suggestions, but rather, it is because each solution

addresses only a fragment of the problem and, in so doing,

introduces new problems that must be dealt with. In

essence, the practical issues of how to incorporate these

ideas into the existing framework of software development

have not been addressed.

What is needed is a comprehensive solution that

combines the best features of all of the aforementioned

suggestions into one unified approach to requirements

analysis and specification. This dissertation presents a

methodological approach for requirements analysis and

specification that utilizes conceptual modeling to

incrementally develop and formally specify requirements.

The methodology provides a graphical formalism for

specification within an object-oriented framework. The

formal requirements specification is expressed in RML'.

1.2 Research Agenda

The difficulties involved in requirements analysis and

specification have been discussed, and a number of

fragmentary solutions to the problem have been presented.

The need for a comprehensive approach that integrates the

best features of many solutions into one unified methodology

has been identified. This dissertation develops such a

requirements methodology based on conceptual modeling.

The research consisted of the following tasks:

1. Examining requirements methodologies,

conceptual modeling approaches, object-oriented

approaches, and formal requirements specification

languages;

2. Developing an appropriate methodological

framework;

3. Discovering the types of knowledge represented

in conceptual modeling languages and identifying

RML as a formal language whose semantics represent

the basic underlying principles of conceptual

modeling;

4. Disambiguating, evaluating, and simplifying

the syntax and semantics of RML, and defining RML'

which integrates these simplifications with a few

extensions to strengthen the underlying framework;

5. Analyzing structural dfd-based requirements to

discover the types of knowledge they represent;

6. Comparing the knowledge contained in dfd-based

requirements with the knowledge contained in

conceptual models, object-oriented analysis

approaches, and formal requirements specifications

and identifying the types of additional knowledge

necessary at the specification level;

7. Identifying the major steps of a requirements

methodology based on conceptual modeling that

incrementally develops a formal requirements

specification;

8. Developing strategies/techniques for eliciting

and representing the necessary information;

9. Refining the major steps of the methodology to

incorporate modeling all of the identified

information;

10. Formally defining the constructs used in the

methodology; and

11. Developing guidelines for checking the

consistency of the developing requirements.


1.3 Dissertation Outline

This dissertation is organized into five chapters.

Chapter 2 discusses related work including conceptual

modeling approaches, object-oriented approaches, and

existing requirements methodologies. The target language,

RML', is defined in Chapter 3 and its semantics are

discussed. The requirements methodology is presented in


Chapter 4. A summary, the contributions of this thesis, and

directions for future research are provided in Chapter 5.
















CHAPTER 2
RELATED WORK


The related work is divided into the following

categories:

1. Traditional methodologies for requirements

analysis and specification;

2. Basic concepts of conceptual modeling;

3. Conceptual modeling and object-oriented

approaches;

4. Requirements methodologies based on conceptual

modeling; and

5. Related work in other areas.


2.1 Requirements Methodologies

The most widely used requirements methodologies are

based on data flow diagrams. These include: Structured

Analysis (SA) [Dem79], [War86], [You89]; Structured Systems

Analysis (SSA) [Gan79]; and Structured Analysis and Design

Technique (SADT) [Ros77a, Ros77b].

The varieties of SA and SSA are similar and should be

viewed as interpretations of the same technique. They are

based on the same concepts: data flow diagrams (dfds)

supported with data dictionaries, structured English, and

decision trees and/or tables. The later versions reflect

extensions of the model. Ward and Mellor [War86] added

control information to be able to describe real-time

systems, and Yourdon [You89] added facilities for modeling

data.

SADT looks very different from the other dfd techniques

because it uses different symbols, but actually it bears a

strong resemblance to the other methods. It is larger in

the sense that there are more language features (40 versus 4

in typical dfds) and its methodology is more comprehensive,

since it includes management information such as personnel

roles and reader cycles.

All of the data flow diagram modeling techniques

provide an excellent facility for structuring the initial

problem analysis and understanding the functional

decomposition of the system. Their limitations include

being informal methods; providing a single, functional view

of the system (except for [You89]); giving very little

emphasis to the structure of the data; and relying heavily

on supporting natural language text.

Another approach focuses more on the communication

aspects of the requirements definition problem than on

assisting in understanding the problem being analyzed. This

approach utilizes computer technology for processing and

maintaining all information obtained during requirements

analysis and specification. Two examples are Problem

Statement Language/Analyzer (PSL/PSA) [Tei77] and Software

Requirements Engineering Methodology (SREM) [Alf77]. PSL/

PSA consists of a language for expressing specifications

(PSL) and an analyzer for processing the descriptions (PSA).

SREM applies the concepts of PSL/PSA to the development of

real-time systems. It utilizes the Requirements Statement

Language (RSL) [Bel77] for the expression of software

requirements.


2.2 Basic Concepts of Conceptual Modeling

The seminal work on conceptual modeling of Smith and

Smith [Smi77, Smi80], introduced the idea of using

abstraction to understand complex systems. Their context

was the area of database design, but the underlying

principles were applicable to conceptual modeling in any

domain.

The premise of their Semantic Hierarchy Data Model is

that humans understand complex systems by creating

abstractions that can be named and treated as a whole.

These abstractions allow us to focus on the details of

interest, ignoring other peripheral details. They are

either in the form of abstract objects or abstract

operations.

When modeling systems, we use three abstraction

mechanisms for describing the individuals and their

interrelationships: classification, generalization, and

aggregation.

Classification forms new objects by suppressing the

details of individual instances and emphasizing properties

of the whole. New object types (classes) are formed by

collecting instances. Classification corresponds to the set

theoretic operation of 'membership' and describes the

'instance-of' relationship. For example, classification

would allow one to collect the instances '{Jane, Bob, Sue}'

into a new higher-order type called 'secretary.'

Generalization merges existing types to form new types.

Individual differences between subtypes are ignored and

common traits are emphasized. Generalization corresponds to

the set theoretic operation of 'union' and describes the

'is-a' relationship. For example, the existing employee

types '{secretary, teacher}' can be generalized to form the

new type 'employee.' Generalization implies that every

instance of the sub-type is an instance of the type. Thus,

every instance of 'teacher' is also an instance of

'employee.'

Aggregation forms an object as a relationship among

other objects. It suppresses the details of the components

and emphasizes the details of the relationship as a whole.

Aggregation corresponds to the set theoretic operation of

'cartesian product' and describes the 'is-part-of'

relationship. For example, consider the object types

'person,' 'room,' 'hotel,' and 'date.' A 'reservation'

object can be abstracted from the relationship: 'a person


reserves a room in a hotel for a date.' The individual

objects (e.g., 'room') are components of the aggregate

(i.e., 'reservation'). An instance of a component is said

to be an attribute of an instance of the aggregate. For

example, given the 'reservation' instance: 'Mary reserves

room 212 at the Sheraton for May 12th,' then 'Mary' is the

'person' attribute of the 'reservation' instance.

Classification, generalization, and aggregation are the

basic ways we have of structuring information. When they

are repeatedly applied to objects, hierarchies of new

objects are formed. The application of these three

mechanisms forms the methodological basis for conceptual

modeling.
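
To make the three mechanisms concrete, the following short sketch renders the running examples in Python; the language and all specific names here are ours, chosen purely for illustration, and none of this notation appears in the cited work:

    # Classification: collect instances into a named type ('instance-of').
    secretary = {"Jane", "Bob", "Sue"}
    teacher = {"Ann", "Lee"}        # a second, hypothetical employee type

    # Generalization: merge existing types into a supertype ('is-a');
    # set-theoretically this is union, so every teacher is an employee.
    employee = secretary | teacher

    # Aggregation: form a whole from component objects ('is-part-of');
    # set-theoretically a tuple drawn from person x room x hotel x date.
    reservation = ("Mary", "room 212", "Sheraton", "May 12th")

    assert "Jane" in employee            # via generalization
    assert reservation[0] == "Mary"      # 'Mary' is the 'person' attribute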

There are three primitive operations used to describe

changes in the model: create, destroy, and modify. The

create operation forms an individual, implements its

membership in the types and attribute relationships

specified in the operation, and verifies all type

constraints. The destroy operation removes the individual

from the model and from any classes and attribute

relationships in which it is involved. The modify operation

may involve removals or additions to its class memberships

and attribute relations. Classification may be applied to individuals

(tokens) or types (classes) and is used to form the

classification hierarchy. To continue our earlier example,

classification can be applied to the employee types

'{secretary, teacher}' to form the new higher-order type

'{employee type}.'
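
A minimal sketch of the three operations, continuing in illustrative Python (the Model class and its bookkeeping are our own invention, not part of the Semantic Hierarchy Data Model's notation):

    class Model:
        def __init__(self):
            self.members = {}    # type name -> set of instances

        def create(self, individual, types):
            # form the individual and implement its type memberships;
            # type constraints would also be verified at this point
            for t in types:
                self.members.setdefault(t, set()).add(individual)

        def destroy(self, individual):
            # remove the individual from every class it belongs to
            for instances in self.members.values():
                instances.discard(individual)

        def modify(self, individual, add=(), remove=()):
            # removals or additions to the individual's class memberships
            self.create(individual, add)
            for t in remove:
                self.members.get(t, set()).discard(individual)

    m = Model()
    m.create("Sue", ["secretary"])
    m.modify("Sue", add=["teacher"], remove=["secretary"])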

To use these primitive operations, a predicate language

is needed to specify the involved individuals in terms of

their attributes, categories, and types. Predicates may be

used with these operations to construct function

definitions. These functions characterize the behavioral

semantics of a type just as the structural semantics are

characterized by its subtypes and component types.

These basic ideas form the core of conceptual modeling

approaches. Extensions to this core usually include time as

a significant factor. For example, the "application

universe of discourse" in Gustafson et al. is defined as

consisting of a "time varying set of entities" [Gus82, p.

100]. Type membership as well as attribute values and

relationship functions may vary with time. Entities have an

associated existence at a particular time 't.' The notion

of an event is used to include dynamics into the model.

Events play the same roles as the primitive operations

described above.

Because of the importance of the structural components,

conceptual modeling is sometimes equated with information

modeling (Conceptual Information Modeling or CIM).

Alternatively, conceptual modeling is viewed as consisting

of two sub-tasks: information modeling together with

process modeling.

Information modeling focuses on the static components

and is referred to as a "snapshot" since it models a slice

of reality at a given time. This approach typically avoids

introducing any processing type information, and

consequently, does not utilize any form of top-down

functional decomposition or data flow-based techniques for

analysis. The method focuses on identifying objects,

attributes, relationships, subtypes/supertypes, associated

objects, and constraints as in Bubenko [Bub80] and Gustafson

et al. [Gus82].

The process model depicts the dynamics of the system.

Processes, activities, or events are used to depict dynamic

objects that affect the static aspect of the slice of

reality being modeled. This information may be described in

a state transition diagram illustrating the various states

that the data may be in during its life-cycle, the events

that precipitate state changes, and the actions occurring as

a result of those state changes [Shl88], [Yeh84].
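
A fragment of such a life-cycle model can be sketched as a transition table; the states and events below are hypothetical, loosely anticipating the hospital/patient example used later in this dissertation:

    # (current state, event) -> next state; the actions triggered by
    # each state change would be attached alongside each entry
    transitions = {
        ("admitted", "tests_complete"): "under_evaluation",
        ("under_evaluation", "diagnosis_made"): "in_treatment",
        ("in_treatment", "recovery_confirmed"): "released",
    }

    def next_state(state, event):
        return transitions[(state, event)]

    assert next_state("admitted", "tests_complete") == "under_evaluation"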


2.3 Conceptual Modeling and Object-oriented Approaches

Another concept that is frequently linked with

conceptual modeling is "object-oriented." Object-oriented

in this sense means "data-oriented or data-centered," i.e.,

that modeling the data is of primary concern. While that is

certainly true of an object-oriented view, the intended

usage implies much more and hinges on the notion of an


object itself. As Booch [Boo83, p. 424] states:

In the real world, an object is simply an entity
that is visible or otherwise tangible; we can do
things to an object (like throw a rock) or
objects can have a 'life' of their own (for
example, as in a stream flowing down a
mountainside). In our software, an object is also
any entity that acts or can be acted upon; in a
sense, objects are the computational resources of
our system that parallel (abstract) the objects
from the real world. Objects exist in time
and hence can be created, destroyed, copied,
shared, and updated.

Thus, an object is a data abstraction that can play active

and passive roles. In a passive role, an object has

operations invoked on it and, in an active role, an object

can invoke operations over other related objects [Bro82].

The above view of an object refers to the concepts

associated with object-oriented approaches: abstraction,

information hiding, and data abstraction [Boo83].

Additionally, inheritance is also typically considered an

underlying principle of object-oriented approaches, although

it may not be a "necessary" component [Boo86].

Abstraction refers to organizing information based on

classification, generalization, and aggregation.

Information hiding as defined by Parnas [Par72] suggests

decomposing systems into components that hide or encapsulate

design decisions about abstractions. Data abstraction

refers to defining data types in terms of the operations

that apply to objects of the type, with the constraint that

the values of such objects can be modified and observed only

by the use of the operations [Oxf86]. Inheritance, the

principle of receiving properties or characteristics from an

ancestor, is used to define new data types as extensions or

restrictions of existing types.
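
These ideas can be compressed into a brief sketch (plain Python with invented names; object-oriented languages of the period, such as Smalltalk-80, express the same concepts):

    class Account:                 # data abstraction: a type defined by
        def __init__(self):        # the operations that apply to it
            self._balance = 0      # hidden representation (information hiding)

        def deposit(self, amount):
            self._balance += amount

        def balance(self):         # values are modified and observed only
            return self._balance   # through the type's own operations

    class SavingsAccount(Account):     # inheritance: a new type defined as
        def add_interest(self, rate):  # an extension of an existing one
            self.deposit(self.balance() * rate)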

Based on the above description of object-oriented,

conceptual modeling as previously described is not truly an

object-oriented approach. Conceptual modeling and object-

oriented approaches do have abstraction and inheritance as

common characteristics. However, for conceptual modeling

to be an object-oriented approach, it must additionally

incorporate the concepts of data abstraction and information

hiding. This further implies that the operations/processes

must be encapsulated and associated with their related

objects and, hence, cannot be dealt with separately as is

typically done in the "snapshot" plus process model

approach.


2.4 Conceptual Modeling of Requirements

The work discussed in this section reflects the use of

conceptual modeling as the methodological basis for

requirements analysis. The construction of a model for

understanding the system and its environment is viewed as a

prerequisite to requirements specification. Abstraction

plays a central role in uncovering the objects, functions,

relationships, constraints, and properties of the developing

system. The semantics of the system are typically defined

within a temporal framework.

The methodology for conceptual modeling based on the

Semantic Hierarchy Data Model of Smith and Smith [Smi80] as

described earlier consists of the following:

1. identify the names of all object types and

functions during requirements analysis;

2. use classification, generalization, and

aggregation to determine type/instance, object/

category, and object/component relationships;

3. suppress uninteresting types, categories, and

aggregate objects and determine important naming

conventions;

4. identify conditions for updating object types;

and

5. specify the types, categories, components,

functions, and attribute relationships of each

object.

Although this was a methodology for producing a conceptual

design of a data base, it is representative of methodologies

for conceptual modeling of requirements [Bub80], [Gus82].

Note the importance given to modeling the structure of the

data versus the weak attention given to describing the

functions.

Although the emphasis on data structure is central to

all conceptual modeling approaches, some methodologies give

more attention to the role played by the processes than do

others. For example, the states of entities and events

causing state change are considered important in Yeh et al.

[Yeh84] and to a lesser degree in Shlaer and Mellor [Shl88].

One novel approach that treats processes and entities

uniformly is the Requirements Modeling Language (RML)

[Gre84]. In this approach the requirements modeling is

viewed as a conceptual modeling task, and a formal

requirements specification language (RML) is defined for

expressing the model. The language is unique in that it is

both a formal requirements specification language and a

conceptual modeling language. The semantics of RML are

based on fundamental notions of conceptual modeling and

accepted properties of requirements. A methodological

approach is given that utilizes SADT for initial problem

analysis and RML for the formal specification.

Viewed as a conceptual modeling methodology for

requirements specification, the RML approach inadequately

addresses the pragmatic aspects of how to integrate the

constructs of SADT, conceptual modeling, and RML into a

unified framework. The problems associated with

transforming an initial functional decomposition (the SADT

model) into an object-oriented description (the RML model)

are overlooked, although the fact that the methodology

results in uncharacteristic, "simplistic" RML models is

noted.2



2Typical object-oriented RML models focus on
describing a system in terms of the generalization
abstraction (is-a hierarchies) to utilize inheritance.
However, RML models resulting from the SADT/RML methodology
describe functionally decomposed systems based on
aggregation (is-part-of hierarchies), and consequently, are
not object-oriented and do not utilize inheritance.

Another problem not addressed is the difficulty

involved in describing a system in RML. Since RML is a

formal specification language, it suffers from the problems

associated with formality discussed earlier. To simplify

the modeling task, the methodology should assist in

incrementally developing the specification.

Greenspan's work on RML is significant because it

combines the notions of conceptual modeling and formal

specification into an object-oriented framework for

requirements modeling. The associated problems stem from an

inadequate methodological approach. Solutions to these

problems are investigated in Chapter 3.

Kung [Kun89] also utilizes conceptual modeling as the

basis for requirements modeling. His approach is visual

and formal and models both the static and dynamic aspects of

a piece of reality in one model. He models process behavior

with a notation borrowed from Petri nets and relational

calculus. These models are then translated into Prolog

programs and executed (i.e., an executable specification is

produced).

The steps involved in the execution of a process

behavior model are closely related to the semantics of

conceptual modeling and RML. Much of his work involves

defining and specifying many of the features already found

in conceptual modeling, RML, and various other requirements

techniques. Consequently, it appears Kung did not benefit

from the efforts of others. However, due to these

commonalities, his approach is consistent with the goals of

this research.


2.5 Other Related Work

The position taken in this research is that building a

requirements model is parallel to constructing a knowledge

base about some slice of reality. Both requirements models

and knowledge bases are expected to be natural and direct

representations of the knowledge they embody so that they

can facilitate communication between domain experts, system

builders, and, ultimately, system users. The acquisition of

knowledge from human experts for the purpose of knowledge

representation demands an intense understanding of the

application domain. Understanding the problem domain is an

absolute prerequisite to building correct, reliable systems

and is the goal of requirements analysis. Thus, knowledge

representation techniques that have proved useful can be

utilized for requirements modeling [Bor85].

Semantic networks, a knowledge representation

technique introduced by Quillian [Qui68] as a model of human

memory, are fundamental to this research. Semantic networks

represent the world by constructing a graph-like model of

interrelated objects. These objects can be used to

represent any category of conceptual unit of information

about the world (for example, entities, processes, and

constraints). There is a strong emphasis on organization/

abstraction principles, with 'is-a' hierarchies often

considered synonymous with semantic networks.

The dissertation also has much in common with research

in the database and information system areas. The

introduction of abstraction principles into databases

[Smi77] and of conceptual modeling for database design

[Smi80] had a major influence on the framework. The entity-

relationship database approach [Che76] proved useful for

modeling the structure of objects, and the introduction of

semantic networks into databases by Roussopoulos [Rou79] was

also instructive.

The problems associated with transforming requirements

expressed in dfds into object-oriented design (OOD) are

addressed by Alabiso [Ala88] and Coad and Yourdon [Coa90].

Alabiso presented a transformation from dfds to OOD, which,

unfortunately, was far too simplistic. It maps dfd features

(data, processes, terminals, and stores) into object-

oriented design/programming components (methods, messages,

and classes defined in the Smalltalk-80 programming

language). The additional information that is implicitly

expressed in the dfd as well as additional information

needed to clarify and solidify the intent of the dfd is

ignored. He further states that the related issues of

organization/abstraction principles (which are fundamental

to OOD) are truly "design-time tasks" and should not be

dealt with by a development methodology.

The work of Coad and Yourdon [Coa90] is most related to

our goals. They address the need for an object-oriented

analysis (OOA) methodology as a prerequisite to OOD. Their

methodology emphasizes abstraction mechanisms and data

abstraction as does ours, but, like Alabiso, they introduce

design components during the analysis phase. By focusing on

the OOD components methods and messages, they force the

analyst to make design decisions. Their methodology

concentrates on the data modeling aspects, as is typically

done in OOA/OOD approaches, and neglects the dynamic

aspects. Their methodology has no formal underpinnings, and

the results have to be respecified in a natural language

requirements specification. However, due to the common

framework, their approach serves to reinforce this research.

Further, since they only model the static aspects of

objects, their diagrams are easier to comprehend than those

resulting from our methodology.

This chapter discussed related work including

traditional methodologies for requirements analysis and

specification, the basic concepts of conceptual modeling,

object-oriented and conceptual modeling approaches,

requirements methodologies based on conceptual modeling, and

related work in other areas. The next chapter presents the

framework of our approach and the formal specification

language that is the target of our methodology.
















CHAPTER 3
THE TARGET LANGUAGE


3.1 The Framework

A conceptual modeling approach is to be taken toward

developing a methodology for requirements modeling that

provides a graphical formalism for specification. The

methodology utilizes a layered, incremental approach to

development within an object-oriented framework. The

methodology addresses both requirements analysis and

requirements specification by providing a formal definition

of the constructs identified during the development process.

The formal requirements specification is expressed in RML'.

The basic assumption underlying the methodology is that

humans use abstraction mechanisms (classification,

generalization, and aggregation) for understanding complex

systems. Therefore, those same abstractions should be used

for organizing and identifying the requirements of a system.

Furthermore, applying those abstractions within a framework

that emphasizes information hiding, data abstraction, and

inheritance results in requirements that are reliable,

modifiable, and easy to understand.

Object-oriented approaches and conceptual information

modeling approaches decompose a system based on the 'is-a'

hierarchical structure of the data. The models identify

objects, attributes, constraints, types/subtypes, associated

objects, and relationships. Processes are of secondary

importance in both cases.

Object-oriented approaches specify processes in terms

of the object-oriented design/programming components:

methods and messages. Methods are the processes that apply

to an object. Messages are sent from one object to another

to activate processes and define the interface between

objects.

Top-down functional decomposition approaches view a

system in terms of the 'is-part-of' hierarchical structure

of the functional components. The model consists of

processes, subprocesses, data flows into and out of the

processes, and external process interfaces. The modeling of

the data is of secondary importance and is usually described

textually in a data dictionary rather than within the model.

Functional decomposition methods result in systems that

are difficult to maintain due to their early emphasis on

functions. Functions tend to change over time which

requires changing the system. On the other hand, existing

object-oriented analysis techniques are too design oriented.

They force the requirements analyst to make design decisions

since the same constructs utilized at the design level

(methods and messages) must be specified at the requirements

level.

Current analysis approaches are either function-

oriented or object-oriented and force the analyst to select

one of these views as the overall view of the system.

However, since all systems consist of both data and

processes, and since the goal of requirements modeling is to

understand the complete system, the requirements methodology

should assist in viewing a system from both perspectives.

This would shed the most light on the inner workings of the

system, and consequently, result in more clearly defined

requirements.

This research incorporates an object-oriented

framework based on object-oriented concepts rather than

object-oriented design/programming constructs. Processes

are identified with their associated objects (i.e., data

abstraction) and abstraction is used to define both objects

and processes (i.e., information hiding and inheritance).

The framework integrates concepts from object-oriented and

functional decomposition approaches into one methodology.

The methodology utilizes the three abstraction

mechanisms to model the three types of objects: entities,

processes, and constraints. Thus, there are 'is-a'

hierarchies, 'is-part-of' hierarchies, and 'instance-of'

hierarchies for entities, processes, and constraints. These

hierarchies describe the structure of the model.

Each individual described in the model is an instance

of some class. Data abstraction is used to define these

classes. Thus, objects are defined in terms of the

relationships/processes that apply to instances of the

class. These relationships relate instances of the class to

objects of all three types. Also, the instances of the

class can be accessed only by those identified

relationships/processes.
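
The intent can be suggested with a fragment borrowing the hospital/patient names used later in the dissertation; the Python rendering is only illustrative, and the property names anticipate the RML' definitional property categories of Figure 1:

    class PATIENT:
        # an entity class defined by the relationships/processes that
        # apply to its instances; instances are accessed only through
        # these identified processes (data abstraction)
        producer = "ADMIT"      # the process that creates an instance
        modifier = "TREAT"      # the process that changes an instance
        consumer = "RELEASE"    # the process that destroys an instance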

The graphical presentation is motivated by the "one

picture is worth ten thousand words" idea. Graphics help

the user (and analyst) understand what is being developed by

providing something to be internalized and compared to the

user's existing internal model. This comparison cannot be

made with any of the static, formal notations currently in

use for the specification of requirements [Yeh84].

The components of the methodology are based on familiar

techniques/concepts. The dynamic nature of the system is

modeled with a simplified dfd. The structure of entities,

processes, and constraints is defined using a notation based

on Chen's [Che76] entity-relationship diagram. Processes

are specified based on standard precondition/postcondition

concepts. The dynamics are refined by adding annotations to

specify data flow branches/joins and function activation

rules [Mar88].
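
For instance, the precondition/postcondition view of a process can be sketched as follows (hypothetical process and state names; the assertions stand in for the graphical annotations):

    def release(patient):
        # precondition: must hold at the process's start time
        assert patient["status"] == "treated"
        patient["status"] = "released"
        # postcondition: must hold at the process's end time
        assert patient["status"] == "released"

    release({"status": "treated"})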


3.2 The Target

Requirements Modeling Language (RML) developed by

Greenspan [Gre84] is a conceptual modeling language and a

formal requirements specification language. Due to

ambiguities in the syntax and semantics of RML, a simplified

version has been defined (called RML' due to its proximity to RML)

and is used in the methodology.

The methodological components are formally defined in

RML' and the result of the methodology is a formal

requirements specification expressed in RML'. The

semantics of RML' are integrated into the methodology and

are significant as they represent the basic underlying

principles of conceptual modeling.

In general, the semantics of RML and RML' are very

similar. However, some significant semantic distinctions do

exist between the two languages. These are discussed in the

overview of the semantics of RML' that follows. The RML'

language definition is given in Appendix A.

RML' provides an object-oriented requirements model in

which all information is represented by objects. The

objects are interrelated by properties and grouped into

classes (and metaclasses). There are three types of

objects:

1. entities: the objects in the world,

2. processes: the actions that cause change in

the world, and

3. constraints: what should be true in the

world.

Objects within RML' are related to other objects by

properties (relationships). Properties have three pieces of

information:

1. subject,

2. attribute (also known as the property name),

and

3. value.

There are two kinds of properties: factual and

definitional. Factual properties express factual

information about individual objects (tokens) or classes.

EXAMPLE: < john-smith, age, 23 >

In this example, 'john-smith' is the subject, 'age' is

the attribute, and '23' is the value. Definitional

properties specify generic information that pertains to each

of the instances (or tokens) of a class (or metaclass).

EXAMPLE: ( PERSON, age, AGEVALUE )

In this example 'PERSON' is the subject, 'age' is the

attribute, and 'AGEVALUE' is the value. A factual

property corresponds to each definitional property of a

class (or metaclass). Classes (or metaclasses) induce

factual properties of their instances. In the above

examples, 'john-smith' is an instance of the class 'PERSON,'

'PERSON' has the definitional property 'age,' therefore,

'john-smith' must have an induced factual property 'age'

whose value '23' is an instance of the class 'AGEVALUE.'
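
The induction can be checked mechanically; the following is a small sketch in Python (our own illustrative encoding of the property triples, not RML' syntax):

    # definitional property of the class: ( PERSON, age, AGEVALUE )
    definitional = {("PERSON", "age"): "AGEVALUE"}

    # induced factual property of the token: < john-smith, age, 23 >
    factual = {("john-smith", "age"): 23}

    instance_of = {"john-smith": "PERSON", 23: "AGEVALUE"}

    # the factual value must be an instance of the definitional
    # property's value class
    assert instance_of[factual[("john-smith", "age")]] == \
        definitional[(instance_of["john-smith"], "age")]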

Definitional properties are grouped into categories.

Each object type has a set of associated definitional

property categories. For a complete listing of these

definitional property categories see Figure 1.

RML' supports the three abstraction principles

associated with conceptual modeling: classification,

aggregation, and generalization. The framework of classes,

properties, and abstraction principles applies uniformly to

all three kinds of objects: entities, processes, and

constraints.

Like other conceptual modeling languages, RML' has a

temporal framework. Time is viewed as an infinite and

dense sequence of time points. Every event has an

associated start and end point; constraints are true at a

given time point; and the value of an object's properties as

well as its membership in a class is evaluated with respect

to time. Property categories are used to specify temporal

constraints and to specify default times when properties

will be evaluated.
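
As a sketch of the idea (an illustrative encoding, not RML' machinery), a property's value can be recorded against time points and evaluated at any point t:

    # (time point, attribute, value); the time line itself is dense,
    # but a finite history of changes suffices for evaluation
    history = [(0, "age", 22), (40, "age", 23)]

    def value_at(t, attribute):
        value = None
        for time, attr, val in history:
            if attr == attribute and time <= t:
                value = val        # latest change at or before t wins
        return value

    assert value_at(10, "age") == 22 and value_at(50, "age") == 23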

An object is defined on one of four classification

levels: token (level 0), class (level 1), metaclass (level

2), and metametaclass (level 3). A token is an instance of

one or more classes but has no instances of its own; a class

is an instance of one or more metaclasses and has tokens as

instances; a metaclass is an instance of one or more

metametaclasses and has classes as instances; a

metametaclass has metaclasses as instances.

For organizing the classes, RML' provides a number of
built-in classes and metaclasses. For simplification, these
built-in objects have been kept to a minimum. Our modeler
is provided with the built-in metaclasses: CLASS, which
contains all classes, PROCESS_CLASS, ENTITY_CLASS, and
CONSTRAINT_CLASS. To further describe the classification
hierarchy, the following built-in metametaclasses are
provided: PROCESS_METACLASS, ENTITY_METACLASS, and
CONSTRAINT_METACLASS.


Definitional property categories for an ENTITY:

necpart      a non-null ENTITY that is a permanent attribute
             of an instance
association  an associated ENTITY that might change over time
component    an ENTITY that is a component of an instance
invariant    a CONSTRAINT that is always true of the
             instance
initcond     a CONSTRAINT that is true when an object becomes
             an instance of the class
finalcond    a CONSTRAINT that is true at the time an object
             ceases to be an instance of the class
producer     a PROCESS that produces the instance
consumer     a PROCESS that consumes the instance
modifier     a PROCESS that modifies the instance
constraint   a CONSTRAINT that specifies additional
             conditions about the instance

Definitional property categories for a PROCESS:

input        an ENTITY participating in the PROCESS and
             taken from the given property value class
output       an ENTITY inserted into the given property value
             class by this process
control      an ENTITY participating in the PROCESS but not
             removed from the given property value class by
             this PROCESS
precond      a CONSTRAINT that must be true at the start time
postcond     a CONSTRAINT that must be true at the end time
actcond      a CONSTRAINT which, if it becomes true at any
             point, causes an associated instance of the
             PROCESS to begin at that point
stopcond     a CONSTRAINT which, if it becomes true, causes
             the PROCESS instance to stop at that point
component    a PROCESS that must occur in order for the
             instance to occur
constraint   a CONSTRAINT that specifies additional
             conditions that must be true in order for the
             instance to occur

Definitional property categories for a CONSTRAINT:

argument     an ENTITY that is an argument of the CONSTRAINT
part         a CONSTRAINT that is true when subject is true
suffcond     a CONSTRAINT that is a sufficient condition for
             the instance to be true
defn         a CONSTRAINT class every instance of which gives
             a necessary and sufficient condition for subject
             to be true
constraint   a CONSTRAINT expression with a non-null value

Figure 1. RML' Definitional Property Categories

Ambiguities found in the semantics and syntax of RML

were reflective of underlying structural problems that

needed to be dealt with to solidify the framework of RML'.

Many of the identified problems focused on the intended

semantics associated with using the built-in classes/

metaclasses and the difference between 'in' (instance) and

'is-a.'

The 'is-a' relationship has been used to describe many

different types of relationships [Bra83], [Woo75]. In many

schemes that use 'is-a,' the relationship is used to say

both that an object is an instance of a class (e.g., Sue is

a teacher) and that one class is a subclass of another

(e.g., a teacher is a person). The first case is an

example of classification and is denoted in RML (and RML')

by the 'in' relationship. The second case is an example of

generalization and is the only intended usage of the 'is-a'

relationship used in RML (and RML').

Recall that definitional properties specify generic

information that pertains to each of the instances of a

class or metaclass. Factual properties express factual

information about individual objects or classes.

There are three constraints that are central to the

conceptual modeling framework of RML and RML' that describe

the semantics of the above ideas. They are [Gre84, p. 13]:

Property induction constraint -- Every instance of
a class or metaclass has an induced factual
property corresponding to each definitional
property of the class with a value that is an
instance of the definitional property value or
null.

Extensional ISA constraint -- If C ISA D then
every instance of C is an instance of D.

Intentional ISA constraint -- If C ISA D then
every definitional property i of D is also a
definitional property of C, and furthermore, the
value of property i for C ISA the value for
property i of D.

To clarify the meaning of the above constraints,

consider the following examples:

Let PERSON_CLASS be a metaclass with properties:
cardinality and
average_age.

Let PERSON be a class with properties:
name and
social_security_#.

Let STUDENT be a class with properties:
grade_level and
grade_point_average.

The statement: 'PERSON in PERSON_CLASS' means PERSON
is an instance of PERSON_CLASS, and therefore, by the
property induction constraint, has factual properties
induced for each definitional property of PERSON_CLASS.











Thus, the class PERSON has values for 'cardinality' and
'average_age.'

The statement: 'STUDENT isa PERSON' means every
instance of STUDENT is also an instance of PERSON, and the
definitional properties of PERSON are also definitional
properties of STUDENT, inherited from PERSON. So STUDENT
has properties 'name,' 'social_security_#,' 'grade_level,'
and 'grade_point_average,' and all instances of STUDENT have
factual properties induced for each of these properties.
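To make these semantics concrete, the following sketch
simulates the property induction constraint and the ISA
constraints for this example. It is written in Python purely
for illustration; it is not part of RML', and the class
representation and the property value names other than
'PERSON_NAME' are assumptions of the sketch.

    # Illustrative sketch only: simulates property induction
    # and 'is-a' inheritance for the PERSON/STUDENT example.
    class ModelClass:
        def __init__(self, name, defprops=None, isa=None):
            self.name = name
            self.defprops = dict(defprops or {})  # definitional properties
            self.isa = list(isa or [])            # direct generalizations

        def all_defprops(self):
            # Intentional ISA constraint: every definitional
            # property of a generalization is also a
            # definitional property of the specialization.
            props = {}
            for parent in self.isa:
                props.update(parent.all_defprops())
            props.update(self.defprops)
            return props

    def make_instance(cls, **facts):
        # Property induction constraint: every instance has an
        # induced factual property (with a value or null) for
        # each definitional property of its class.
        return {p: facts.get(p) for p in cls.all_defprops()}

    PERSON = ModelClass('PERSON',
                        {'name': 'PERSON_NAME',
                         'social_security_#': 'SSN_VALUE'})
    STUDENT = ModelClass('STUDENT',
                         {'grade_level': 'LEVEL_VALUE',
                          'grade_point_average': 'GPA_VALUE'},
                         isa=[PERSON])

    print(sorted(make_instance(STUDENT, name='Sue')))
    # -> ['grade_level', 'grade_point_average', 'name',
    #     'social_security_#']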

In cases where there are no required metaclass
properties (e.g., no need for class cardinality), RML allows
the use of a built-in metaclass. Thus, instead of defining
a metaclass in the above example, one could use the built-in
metaclass ENTITY_CLASS and say 'PERSON in ENTITY_CLASS.'
Alternatively, one could say 'PERSON isa ENTITY_CLASS.'

Both usages occur in the RML language description and are

syntactically correct. However, no attempt was made by

Greenspan to distinguish between the two statements. Since

the distinction between these statements is not obvious, and

the resulting semantic implications are even more obscure,

this required disambiguation.

Being an instance of a built-in metaclass (or class)
(e.g., 'PERSON in ENTITY_CLASS') means that factual
properties are induced for each definitional property of the
metaclass (or class). But a built-in class (or metaclass)
has no definitional properties; therefore, no factual
properties are induced. By the 'is-a' constraints, 'PERSON
isa ENTITY_CLASS' means every instance of PERSON is an
instance of ENTITY_CLASS and every definitional property of
ENTITY_CLASS is a definitional property of PERSON. Again,
since built-in classes (and metaclasses) have no
definitional properties, 'PERSON isa ENTITY_CLASS' adds no
new information to the definition of PERSON.

In both cases, no new properties are induced or

inherited. However, the two statements are not equivalent.

Recall that objects are defined on one of four

classification levels. This classification level determines

the level of instances of the object. 'A in B' means that

if B is level n, then A is level n-1. 'Is-a' related

objects have the same classification level (see Figure 2).

The statement 'PERSON in ENTITY_CLASS' implies that

PERSON is an instance of a metaclass, which is a class.

Instances of PERSON are tokens. On the other hand, 'PERSON

isa ENTITY_CLASS' implies that PERSON is a metaclass.

Instances of PERSON are classes. Thus there are semantic

distinctions between the two statements. To complicate this

even a bit more, RML also allows 'PERSON in
ENTITY_METACLASS,' which means that PERSON is an instance of
a metametaclass, which is a metaclass. Again, there would be
no factual properties induced, and thus this statement
would have the same effect as 'PERSON isa ENTITY_CLASS.'
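The level arithmetic behind this distinction can be made
mechanical. The following fragment (Python, for illustration
only; the 'ENTITY_CLASS in ENTITY_METACLASS' declaration is
an assumption used to seed the levels) shows that 'in'
lowers the level of a new object by one while 'is-a'
preserves it:

    # Classification levels as in Figure 2: 3 = metametaclass,
    # 2 = metaclass, 1 = class, 0 = token.
    LEVEL = {'ENTITY_METACLASS': 3}

    def declare_in(a, b, level):
        level[a] = level[b] - 1  # 'A in B': if B is level n, A is n-1

    def declare_isa(a, b, level):
        level[a] = level[b]      # 'A isa B': same classification level

    declare_in('ENTITY_CLASS', 'ENTITY_METACLASS', LEVEL)  # level 2

    via_in, via_isa = dict(LEVEL), dict(LEVEL)
    declare_in('PERSON', 'ENTITY_CLASS', via_in)    # PERSON is a class
    declare_isa('PERSON', 'ENTITY_CLASS', via_isa)  # PERSON is a metaclass
    print(via_in['PERSON'], via_isa['PERSON'])      # -> 1 2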
















level 3   metametaclasses        ENTITY_METACLASS
                                        |
                                        in
                                        |
level 2   metaclasses              PERSON_CLASS
                                        |
                                        in
                                        |
level 1   classes          STUDENT --isa--> PERSON
                                        |
                                        in
                                        |
level 0   tokens                    Mary Jones


Figure 2: Classification Levels












The above illustrates the need for some semantic
distinctions. Since 'is-a' represents the generalization
abstraction, 'is-a' related objects should reflect some type
of specialization. That is not possible if the
generalization is a built-in class (or metaclass).

Alternatively, the 'in' relationship represents the

classification abstraction which is used to describe a

relationship between objects existing at different

classification levels. That is precisely what is occurring

in any relationship involving a built-in class (or
metaclass). Therefore, a constraint was added to RML'
specifying that built-in classes and metaclasses may

participate in 'in' relationships but may not participate in

'is-a' relationships.

Other related structural problems occurring in RML

concern the semantics involved with 'in' and/or 'is-a'

relationships involving more than two objects. Again, these

relationships are used in the RML language description and

are syntactically correct, but are semantically ambiguous.

More importantly, they also appear, perhaps inadvertently,

in the classification/generalization diagrams resulting from

the proposed methodology. Consequently, they must be

precisely defined. The following properties were identified

and added to RML':

1. 'is-a' is a transitive relationship and

definitional properties should be inherited












accordingly.

2. 'in' is not a transitive relationship and

factual properties may be induced only for an

instance of a class or metaclass.

3. If A isa B and B isa C then A isa C and A, B,

and C are 'is-a related.'

4. If A is an instance of B, then all objects

'is-a related' to A are also instances of B and

have induced factual properties corresponding to

each definitional property of B.

See Appendix B for details.
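As a hedged illustration of these properties (Python again,
not RML'), instancehood can be computed by following 'is-a'
links upward until an 'in' declaration is found, which is
exactly how 'PATIENT in PERSON_CLASS' follows from 'PATIENT
isa PERSON' and 'PERSON in PERSON_CLASS' later in the
methodology:

    # Illustrative only: rules 1, 3, and 4 over a tiny model.
    ISA = {'PATIENT': ['PERSON'], 'PERSON': []}  # direct 'is-a' links
    IN = {'PERSON': 'PERSON_CLASS'}              # direct 'in' links

    def instance_of(obj):
        # Follow 'is-a' links transitively (rules 1 and 3); an
        # 'in' declaration found along the way applies to every
        # object 'is-a' related to it (rule 4). 'in' itself is
        # never chained (rule 2).
        todo = [obj]
        while todo:
            o = todo.pop()
            if o in IN:
                return IN[o]
            todo.extend(ISA.get(o, []))
        return None

    print(instance_of('PATIENT'))   # -> PERSON_CLASS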

The most significant difference between RML and RML' is

due to the framework of the methodology and our desire for

uniformity. Specifically, the fact that the framework

integrates an analysis technique based on functional

decomposition with an object-oriented/conceptual modeling

approach requires a synthesis of the two opposing views.

Furthermore, the methodology utilizes the three abstraction

mechanisms to uniformly model the three types of objects.

Recall that functional decomposition approaches view a

system in terms of the 'is-part-of' hierarchical structure

of the processes. Object-oriented approaches and

conceptual modeling approaches decompose a system based on

the 'is-a' hierarchical structure of the data. The analysis

portion of the methodology results in 'is-a' hierarchies and

'is-part-of' hierarchies for both processes and data. The











problem occurs when trying to map those two views onto RML,

which, being a conceptual modeling language, supports

decomposition based on the generalization ('is-a') and

classification ('instance-of') abstraction mechanisms.

RML makes an attempt to support 'is-part-of'

decompositions via the part property categories for entities

and processes. For processes a part property is a process

that must occur in order for the instance to occur. That is

precisely what it should be, but the problem is that one

cannot hierarchically decompose a system based on the 'is-

part-of' structure of the processes using only this one

property category.

For entities a part property is a non-null entity that

is a component of an instance. These are considered

permanent attributes because their values are fixed over the

life-time of the instance. For example, a PERSON might have

part properties 'name' and 'social_security_#.' Again,

there is no way to hierarchically decompose entities based

on their aggregation structure using only this one property

category.

To remedy this situation, RML' has been defined with

the following semantic distinctions:

1. The 'part-of' relationship has been added to

allow for hierarchical decomposition based on

aggregation.











EXAMPLE: EXAMINE part-of EVALUATE


Given that 'EVALUATE' is defined with 'EXAMINE' as

a component, the 'part-of' relationship allows us

to illustrate their hierarchical relationship by

defining 'EXAMINE' as 'part-of' the aggregate

'EVALUATE.'


2. The part property category for processes was

replaced with component and was defined to be a

PROCESS that must occur in order for the instance

to occur.


EXAMPLE: EVALUATE in PROCESS_CLASS with
    component
        examine: EXAMINE

This example illustrates that 'EVALUATE' has

'EXAMINE' as a component. This definition of

'EVALUATE' requires that 'EXAMINE' be defined as

'part-of' 'EVALUATE' due to their hierarchical

relationship.


3. The component property category for entities

was added and defined to be an ENTITY that is a

component of an instance.

4. The part property category for entities was

replaced with necpart and defined to be a non-null

ENTITY that is a permanent attribute of an

instance.











5. Two additional constraints that are central to

the framework of RML' are needed to describe the

semantics of the above ideas. They are:


Extensional PART-OF constraint -- If C PART-OF D
then every instance of C is a component of an
instance of D.

Intentional PART-OF constraint -- If C PART-OF D
then every component i of C is also a component of
D, and furthermore, the value of every component i
of C is PART-OF the value of C for D.


EXPLANATION: The first constraint specifies the
relationship between 'part-of' and component.

EXPLANATION: The second constraint specifies that
components and their values are hierarchically
related. It further states that components are
upwardly inherited by their aggregates.


EXAMPLE: D with
component
a: A
b: B
c: C

C part-of D with
component
r: R
s: S

Thus, D has components: a, b, c, r, s; and

R part-of C, and S part-of C.











6. If C part-of D and B isa D then C part-of B,
and furthermore, C is part-of all objects 'is-a'
related to D.


EXAMPLE: Continuing the above example, add:

X isa Y
Y isa Z
Z isa D

Then, C part-of X, C part-of Y, C part-of Z; and

X, Y, and Z all have components: a, b, c, r, s.
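The combined effect of the PART-OF constraints and property
6 can be sketched as follows (Python, illustrative only; the
dictionaries simply transcribe the running example):

    # Components are upwardly inherited by aggregates
    # (intentional PART-OF), and 'part-of' distributes over
    # 'is-a' related objects (property 6).
    COMPONENTS = {'D': ['a', 'b', 'c'], 'C': ['r', 's']}
    PART_OF = {'C': 'D'}                  # C part-of D
    ISA = {'X': 'Y', 'Y': 'Z', 'Z': 'D'}  # X isa Y isa Z isa D

    def own_components(obj):
        comps = list(COMPONENTS.get(obj, []))
        for part, aggregate in PART_OF.items():
            if aggregate == obj:          # upward inheritance
                comps += own_components(part)
        return comps

    def all_components(obj):
        # 'is-a' specializations inherit the components of their
        # generalizations, so X, Y, and Z all acquire D's
        # components.
        comps = own_components(obj)
        if obj in ISA:
            comps += all_components(ISA[obj])
        return sorted(set(comps))

    print(all_components('X'))   # -> ['a', 'b', 'c', 'r', 's']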


This chapter discussed the framework of our approach

and the syntax and semantics of the target language.

Language extensions were presented as clarifications to

enhance the semantic contents of the methodology. The next

chapter presents the methodology.
















CHAPTER 4
THE METHODOLOGY


This chapter presents the requirements methodology.

The major steps are identified, strategies for refining the

model are explored, and techniques for checking the

consistency of the developing model are examined. To

illustrate the methodology, one example is developed

throughout this chapter. The sample domain is a hospital

from the patient/processing perspective (as opposed to other

perspectives such as management/personnel, patient services,

etc.). Please note that there is nothing intrinsic about
the particular domain selected that links it to the
methodology; it is used purely for illustrative purposes. The

methodology is applicable to requirements modeling in any

domain. Specific limitations are discussed in Chapter 5.

The methodology consists of the following nine steps:

1. Identify objects and processes,

2. Model system dynamics,

3. Model structure of entities,

4. Model structure of processes,

5. Refine entities,

6. Refine processes,

7. Annotate to refine dynamics,











8. Specify constraints, and

9. Write formal requirements in RML'.

A brief overview of each of these steps follows.

1. Identify objects and processes. This step consists

of listing the major objects and processes in the system

being developed. Objects and processes should be clustered

into aggregation hierarchies and specialization hierarchies.

2. Model system dynamics. This step models the

dynamics of the system using a simplified dfd.

Decomposition of both processes and data based on their

aggregation structures occurs at this level. 'Is-part-of'

hierarchies for processes are identified as in any

functional decomposition. The data is also decomposed

("leveled" in SA terminology) as in all dfd-based

techniques.

3. Model structure of entities. A modified entity-

relationship diagram (erd) is used to model the structure of

the entities. Generalization is used to model class/

subclass entity relationships in an Entity Generalization

Structure diagram. Aggregation is used to model object/

component entity relationships in an Entity Aggregation

Structure diagram.

4. Model structure of processes. A modified erd is

used to model the structure of the processes.

Generalization is used to model class/subclass process

relationships in a Process Generalization Structure diagram.











Aggregation is used to model object/component process/

subprocess relationships in a Process Aggregation Structure

diagram.

5. Refine entities. This step consists of two

subtasks: identify related objects and processes; and

specify class/metaclass refinement. The first subtask

identifies associated objects, necessary part objects, and

related processes. The second subtask utilizes the

classification abstraction mechanism to specify 'in' related

class/metaclass instances. Class names of the property

values for all previously defined properties are specified,

and specialized property values are given for subclasses as

appropriate. Any additional attributes associated with

subclasses are identified. This information is

incrementally added to the entity structure diagrams

obtained in step 3.

6. Refine processes. This step consists of two

subtasks: identify input, output, and associated objects;

and specify class/metaclass refinement. The first subtask

involves specifying information in the process erds that was

previously identified on the dfd to emphasize their

semantics in the conceptual modeling language. The second

subtask utilizes the classification abstraction mechanism

to specify 'in' related class/metaclass instances. Class

names of the property values for all previously defined

properties are specified and specialized property values are












given for subclasses as appropriate. Additional properties

associated with subclasses are identified. This

information is incrementally added to the process structure

diagrams obtained in step 4.

7. Annotate to refine dynamics. This step identifies

information that is needed to interpret the meaning of the

dynamic model obtained in step 2. It consists of three

subtasks: model internal process behavior; model external

process behavior; and model entity behavior. The internal

process behavior is modeled by identifying triggering

conditions and pre-/postconditions. The external process

behavior is modeled by specifying constraints describing the

life-cycle of entities, evaluating and annotating the data

flow junctors (i.e., the points on the dfd where data flows

split or join), and evaluating and specifying constraints

describing the temporal relationships of processes. The

entity behavior is modeled by adding information pertaining

to initial conditions, final conditions, and invariants.

These constraints are informally specified in this step and

refined in the next step.

8. Specify constraints. This step involves

incrementally refining and formally specifying the

constraints that were identified in the previous step. It

consists of three subtasks: model internal process

constraints; model external process constraints; and model











entity constraints. A predefined core of reusable objects/

constraints is defined to assist in the formal

specifications.

9. Write formal requirements in RML'. The final step

formalizes all of the information identified in the previous

steps. The relevant information has been identified and

each graphic component has a formal definition. This step

involves simply writing down the equivalent RML' definition

for the constructs specified. Since the first eight steps

of the methodology have already produced a formal

requirements specification, this step is considered

optional.

These nine steps are examined in detail in the

following nine sections.


4.1 Identify Objects and Processes

Whether one is using an object-oriented approach or a

functional approach, the first step is to list the major

objects and/or processes in the system being modeled.

Objects and processes are considered equally important.

Brainstorming is suggested to assist identification. It is

best to alternate between objects and processes to avoid one

dominating view. Thus, list objects, then consider what

processes operate on those objects. Then consider what

processes are in the system, and what objects are needed by

those processes.

Objects and processes should be clustered into











hierarchies. Consider aggregation hierarchies (the

functional approach) for both processes and objects.

Identify specializations of objects and processes.

This step is illustrated in Figure 3. The major

entities and processes in the system are listed.

Specializations of both entities and processes that are

known at this initial stage are listed. In our example,

'PERSON' has the specializations (subclasses) 'PATIENT' and

'CHILD.' These objects are further decomposed into their

subclasses (e.g., 'PATIENT' has the specialization

'SURGICAL_PATIENT'), and these subclasses are similarly

further decomposed as necessary (e.g., 'SURGICAL_PATIENT'

has the specialization 'TRANSPLANT_SURGERY_PATIENT').

Aggregation is used to model the object/component

relationships for both entities and processes. The entity

components that are known at this stage are actually entity

attributes and are written in lower case letters to

illustrate this (e.g., 'PERSON' has 'name' and 'age'

attributes). At this point there is no distinction made

between permanent or temporary attributes.

The process components that are listed correspond to

those resulting from the typical functional decomposition.

For example, the 'EVALUATE' process consists of the subtasks

'EXAMINE,' 'TEST,' and 'ASSESS.'















MAJOR OBJECTS IN SYSTEM

ENTITIES
    PERSON
    PATIENT
    PHYSICIAN
    NURSE
    HOSPITAL_WARD

PROCESSES
    ADMIT
    EVALUATE
    TREAT
    RELEASE

SPECIALIZATIONS
    PERSON
        PATIENT
            SURGICAL_PATIENT
                TRANSPLANT_SURGERY_PATIENT
        CHILD
            CHILD_PATIENT
                SURGICAL_CHILD_PATIENT
    ADMIT
        ADMIT_CHILD_PATIENT
        ADMIT_SURGICAL_PATIENT
            ADMIT_SURGICAL_CHILD_PATIENT

COMPONENTS
    PERSON
        name
        social_security_#
        insurance_#
        age
        address
    ADMIT
        CHECK_ID
        CHOOSE_WARD
        PERFORM_TESTS
            URINALYSIS
            BLOOD_COUNT
            BLOOD_PRESSURE
            TAKE_TEMP
    EVALUATE
        EXAMINE
        TEST
        ASSESS


Figure 3: Objects and Processes











4.2 Model System Dynamics

A simplified dfd is used that consists of the basic

components: processes, data flows, and external sources.

Data stores (files) may be optionally used. Their contents

are modeled explicitly during the structural decomposition

of entities to alleviate the negative criticisms associated

with data stores. The notion of control information found

in Ward and Mellor's [War86] version and SADT [Ros77a,

Ros77b] may either be included at this point or added later

as a refinement. Analysts accustomed to using that

notation will find it difficult to decompose without it,

just as strict SA [Dem79] or SSA [Gan79] advocates will

consider it inappropriate for an initial dfd decomposition.

The point here is that analysts may use the version they

feel comfortable with. The similarities outweigh the

differences. Whatever information is not included during

this initial decomposition will be added during a later step

of the methodology.

Decomposition of both processes and data based on their

aggregation structures occurs at this level. 'Is-part-of'

hierarchies for processes are identified as in any

functional decomposition. The data is also decomposed

("leveled" in SA terminology) as in all dfd-based

techniques.

Figure 4 contains two dfds. The top dfd is modeling

'BE A HOSPITAL PATIENT' which consists of four subprocesses:















[Figure: two data flow diagrams. The top dfd, BE A HOSPITAL
PATIENT, shows a 'person' flowing through the subprocesses
ADMIT, EVALUATE, TREAT, and RELEASE, with controls 'ward,'
'physician,' and 'consulting_physician.' The lower dfd
decomposes EVALUATE into EXAMINE, TEST, and ASSESS.]


Figure 4: System Dynamics











'ADMIT,' 'EVALUATE,' 'TREAT,' and 'RELEASE.' The input to

this system is a 'person' and the output from this system is

also a 'person.' The 'person' becomes a 'patient' by the

'ADMIT' process, an 'evaluated_patient' by the 'EVALUATE'

process, and a 'treated_patient' by the 'TREAT' process.

The controls for this system are 'ward,' 'physician,' and

'consulting_physician.'

The lower dfd decomposes 'EVALUATE' into its

components: 'EXAMINE,' 'TEST,' and 'ASSESS.' This dfd

illustrates the subprocessing that occurs to a patient

during the 'EVALUATE' process.

All processes that are decomposed into components

require a dfd illustrating their decomposition. These dfds

are typically organized in a top-down fashion. The entire

collection of dfds comprises the system dynamics modeled in

this step.

This layer of the model should identify the following:

a. processes/subprocesses;

b. input/output data objects;

c. data flows (including splits & joins); and

d. external entities.
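Although this layer is drawn graphically, the information it
carries is a simple structure over processes, flows, and
controls. A hedged sketch of how the content of Figure 4
could be recorded (Python used only as a notation here; the
dictionary layout is an assumption of the sketch):

    # The top-level dfd of Figure 4, transcribed as data.
    dfd = {
        'process': 'BE_A_HOSPITAL_PATIENT',
        'subprocesses': ['ADMIT', 'EVALUATE', 'TREAT', 'RELEASE'],
        'flows': [            # (source, data item, destination)
            ('external', 'person', 'ADMIT'),
            ('ADMIT', 'patient', 'EVALUATE'),
            ('EVALUATE', 'evaluated_patient', 'TREAT'),
            ('TREAT', 'treated_patient', 'RELEASE'),
            ('RELEASE', 'person', 'external'),
        ],
        'controls': ['ward', 'physician', 'consulting_physician'],
    }
    # Each decomposed process gets a dfd of its own; EVALUATE,
    # for instance, is leveled into EXAMINE, TEST, and ASSESS.
    print(dfd['subprocesses'])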


4.3 Model Structure of Entities

A modified entity-relationship diagram (erd) is used to

model the structure of the entities. Generalization is used

to model class/subclass entity relationships in an Entity

Generalization Structure diagram. Aggregation is used to











model object/component entity relationships in an Entity

Aggregation Structure diagram.

There are two possible interpretations of entity

decompositions based on aggregation, and only one of those

is modeled in this step. In one type, the component objects

are viewed as permanent attributes: objects whose values do

not change over the lifetime of the related object (e.g.,

the 'name' attribute of object 'PERSON'). This type is

treated as an object/necpart decomposition and is not

modeled in this step. It is added to the model during

refinement when related objects and processes are specified.

The other interpretation of entity decomposition based

on aggregation corresponds more closely to the conceptual

modeling interpretation of aggregation and parallels the

aggregation decomposition for process. Given the temporal

framework of conceptual modeling, a component is said to

exist during the life-time of the instance of the aggregate

object. For processes, one can say that all component

subprocesses occur while the aggregate component is

activated. For entities, components may be thought of as

different stages the aggregate entity may be in during its

life-time.

This second interpretation of entity decomposition also

maps onto the object/component type of data leveling that

occurs in dfds. This includes entity decompositions that

reflect changes in the life-cycle of the entity due to the











processes it has undergone. For example, a 'patient' may be

an "input" to an EVALUATE process and a TREAT process. To

reflect the various processing stages of the 'patient,' the

dfd may show 'patient' being an "output" from the processes

as 'evaluated_patient' and 'treated_patient.' Viewing these

stages as components of the aggregate 'patient' would imply

that an instance of 'patient' has attributes 'evaluated' and

'treated.'

The treatment of life-cycle stages of entities requires

further decomposition and the specification of additional

constraints. They are examined in more detail in step 7.

The entities to include in these models come from the

initial list of objects and the dfd. Only those entities

identified that participate in class/subclass or object/

component relationships are included in the erds. Some

entities identified as components in the initial list of

objects may be determined to be associated objects instead

(e.g., 'age' because the value is not fixed over the

lifetime of the PERSON entity).

Since the graphical components of our methodology are

formally defined in the conceptual modeling language RML',

their semantics must be equivalent to the semantics of

RML'. Therefore, the semantics of the underlying formalism

must be reviewed at this point.

Recall that objects are organized hierarchically based

on the three abstraction mechanisms: classification,











generalization, and aggregation. This means that objects

are organized based on 'is-a,' 'is-part-of,' and 'instance-

of' hierarchies. Our graphical representation, like the

underlying RML', has separate 'is-a' and 'is-part-of'

hierarchies but has combined 'instance-of' hierarchies with

the other two as necessary. This results in generalization

('is-a') diagrams and aggregation ('is-part-of') diagrams

for both entities and processes.

Each object has an object definition that includes the

object name followed by one of the abstraction operators

('isa,' 'part-of,' or 'in') and one or more other objects.

Specifically, an object may be in (i.e., be an instance of)

one or more other objects (e.g., 'A in B' and 'A in B, C'

are both syntactically correct); an object may be isa one or

more other objects (e.g., 'A isa B' and 'A isa B, C' are

both syntactically correct); and an object may be part-of

one other object (e.g., 'A part-of B' is correct). (See

Appendix A for the complete syntax.) The semantics of the

abstraction operators are implied in the diagrams.

In generalization diagrams, objects are organized

hierarchically based on the isa abstraction operator. The

diagrams should be read from bottom to top with the

specializations below their generalizations. In aggregation

diagrams, objects are organized hierarchically based on the

part-of abstraction operator. These diagrams should be











read from bottom to top with the components below their

aggregates.

In addition to the overall hierarchical organization of

objects based on the abstraction mechanisms, recall that

objects are related to other objects through properties, and

different object types have different property categories.

Since these relationships are specified within the object

definition in the underlying formal representation, they

must be graphically specified in a similar manner.

Therefore, an object's properties are defined with the

object on the appropriate diagrams.

The Entity Generalization Structure diagram models the

system from an 'is-a' point of view. However, the

components of any items participating in object/component

relationships are also included in the model. This sets up

a link between the two diagrams created in this step. An

asterisk is used to denote that the object is decomposed

further in another diagram.

The Entity Generalization Structure diagram for our

example is given in Figure 5. 'PERSON' is the

generalization of both 'PATIENT' and 'CHILD' (i.e., 'PATIENT

is-a PERSON' and 'CHILD is-a PERSON'). 'CHILD_PATIENT' has
both 'PATIENT' and 'CHILD' as generalizations (i.e.,
'CHILD_PATIENT is-a PATIENT, CHILD'). The only properties shown on

this diagram are component properties because they represent

an aggregation decomposition. The component properties of

'PATIENT' are 'EVALUATED_PATIENT' and 'TREATED_PATIENT.'
















Key:
    * -- decomposed on another diagram
    o -- component


Figure 5: Entity Generalization Structure











Note that 'EVALUATED_PATIENT' is decomposed further in

another diagram.

The Entity Aggregation Structure diagram models the

system from an 'is-part-of' point of view. The entities to

include in this model participated in an object/component

relationship on the previous diagram, although some of the

entities may be the result of further decomposition.

Figure 6 illustrates the Entity Aggregation Structure

diagram for our example. Both 'EVALUATED_PATIENT' and

'TREATED_PATIENT' are 'part-of' related to the aggregate
'PATIENT.' 'EXAMINED_PATIENT' and 'TESTED_PATIENT' are the

result of further decomposition and are hidden with an

asterisk on the Entity Generalization Structure diagram.

This layer of the model should identify the following:

a. the generalization structure of entities; and

b. the aggregation structure of entities.


4.4 Model Structure of Processes

A modified erd is used to model the structure of the

processes. Generalization is used to model class/subclass

process relationships in a Process Generalization Structure

diagram. Aggregation is used to model object/component

process/subprocess relationships in a Process Aggregation

Structure diagram.

The process/subprocess relationships were previously

modeled in the dfd layer. They are repeated here for

consistency. This often entails more information than can































































Figure 6: Entity Aggregation Structure











be expressed in a single diagram. Decompose in layers as

necessary. The final results should include both

aggregation and generalization decompositions for individual

processes. Lower level components may be hidden by using an

asterisk.

The Process Generalization Structure diagram models the

system from an 'is-a' point of view. As in the previous

Entity Generalization Structure diagram, the components of

any objects participating in object/component relationships

are also included in the model.

Figure 7 is the Process Generalization Structure

diagram for our example. It models the subclasses of the

'ADMIT' process. 'ADMIT_SURGICAL_CHILD_PATIENT' has both
'ADMIT_CHILD_PATIENT' and 'ADMIT_SURGICAL_PATIENT' as
generalizations. This diagram shows the component
properties of 'ADMIT' ('CHECK_ID,' 'CHOOSE_WARD,' and
'PERFORM_TESTS') and the additional component properties of
the specializations of 'ADMIT' (e.g., 'ADMIT_CHILD_PATIENT'
has the additional component 'FIND_NURSE').

The Process Aggregation Structure diagram models the

system from an 'is-part-of' point of view. Aggregate

processes are decomposed into their components. In Figure

8, 'ADMIT' is decomposed into its components 'CHECK_ID,'

'CHOOSE_WARD,' and 'PERFORM_TESTS.' Note that 'ADMIT' is
'part-of' 'BE_A_HOSPITAL_PATIENT' as illustrated on the

System Dynamics diagram (Figure 4).
































































Figure 7: Process Generalization Structure
































































Figure 8: Process Aggregation Structure











This layer of the model should identify the following:

a. the generalization structure of processes; and

b. the aggregation structure of processes.


4.5 Refine Entities

This step introduces concepts associated with the

conceptual modeling language. It represents a shift in

perspective from what is typically considered analysis to a

more detailed analysis necessary for formalization. It

consists of two subtasks: identify related objects and

processes; and specify class/metaclass refinement.

Information is incrementally added to the entity structure

diagrams obtained in step 3.

The first subtask identifies related objects and

processes. Related objects are either associated objects or

necessary part objects. Objects associated with the

entities whose values may change over the lifetime of the

related entity are considered temporary attributes.

Necessary parts have values that do not change over the

lifetime of the entity and are considered fixed attributes.

Related processes are those typically identified in

conceptual modeling approaches: processes that create the

entity, modify the entity, or consume the entity. These

processes are identified and their relationships are

specified.

The second subtask utilizes the classification

abstraction mechanism to specify 'in' related class/











metaclass instances. All entities specified previously are

treated as class names. Metaclass names are added and may

be either user-defined or built-in. Metaclasses are built-

in unless there are associated metaclass attributes. For

user-defined metaclasses, associated attributes must be

identified.

All previously defined attributes and relationships

represent properties of the associated entity. Class names

of the property values must be specified for each of these

properties. These class names identify the class that the

value of the related object must come from. All previously

specified components are treated as class names and their

attributes must be specified.

Using multiple inheritance, provide specialized

property values for subclasses as appropriate. Also

identify any relevant additional attributes associated with

subclasses.

The Entity Generalization Structure diagram (Figure 5)

obtained in step 3 is the starting point for the Refined

Entity Generalization Structure diagram (Figure 9) created

in this step. Similarly, information is incrementally added

to the Entity Aggregation Structure diagram (Figure 6) to

create the Refined Entity Aggregation Structure diagram

(Figure 10).

Consider Figure 9, the Refined Entity Generalization

Structure diagram. For 'PERSON,' the associated objects are








[Figure: Refined Entity Generalization Structure diagram,
refining Figure 5 with 'in' links to 'PERSON_CLASS' and
'ENTITY_METACLASS,' associated objects (shown on the right
side of each entity box), and necessary part objects (shown
on the left side). Key: association; necessary part.]


Figure 9: Refined Entity Generalization Structure











'address,' 'age,' and 'insurance_#.' Note that associated

objects are shown on the right side of each entity box for

consistency. 'PERSON' has as necessary part objects 'name,'

and 'social_security_#.' These objects are fixed attributes

with values that do not change over the life-time of

'PERSON.' Note that necessary part objects are shown on the

left side of the entity box. Since the semantics of related

processes correspond to the information contained in the

Refined Entity Aggregation Structure diagram, they are

modeled in that diagram and are discussed below with that

example.

The entities previously specified represent class names

(e.g., 'PERSON' and 'PATIENT'). 'PERSON' is an instance of

the user-defined metaclass 'PERSON_CLASS,' which is an

instance of the built-in metaclass 'ENTITY_METACLASS.' The
associated metaclass attributes 'average_age' and
'cardinality' are identified for the user-defined metaclass
'PERSON_CLASS.'

'PERSON' has 'name,' 'social_security_#,' 'address,'

'age,' and 'insurance_#' properties. Each of these

properties has an associated property value (e.g., the

'name' property of 'PERSON' has the property value
'PERSON_NAME').

The previously specified components 'TREATED_PATIENT'

and 'EVALUATED_PATIENT' represent class names. Their

attributes are 'treat' and 'evaluate' respectively.












Property values for inherited properties of subclasses

are specialized (e.g., the property value of the 'age'

property for 'CHILD' is 'CHILD_AGE_VALUE,' which is a
specialization of 'AGE_VALUE'). 'SURGICAL_PATIENT' has the
additional properties 'blood_type' and 'surgery,' and their
respective property values are 'BLOOD_TYPE' and
'SURGERY_TYPE.'

Figure 10 is the Refined Entity Aggregation Structure

diagram. Any associated objects or necessary part objects

identified on the previous diagram are repeated here (e.g.,

'PATIENT' has the associated objects 'ward,' 'physician,'

and 'consulting_physician') and their property values are

given (e.g., 'ward' has the property value 'HOSPITAL_WARD').

Related processes are identified for each entity

(e.g., 'ADMIT' creates 'PATIENT,' 'EVALUATE' and 'TREAT'

modify 'PATIENT,' and 'RELEASE' consumes 'PATIENT') and

their relationships are specified (e.g., the 'ADMIT' process

is related to 'PATIENT' by the 'register' property).

Related processes for component entities are identified and

their relationships are specified (e.g., 'TREATED_PATIENT'

is created by 'TREAT' and the relationship is described by

the 'treat' property).

Note that there are no 'in' related objects modeled on

the diagram. Since the Refined Entity Generalization

Structure diagram illustrates that 'PATIENT is-a PERSON' and

'PERSON in PERSON_CLASS,' we know that 'PATIENT in










Key: producer, consumer, and modifier links
(distinguished by arrow style)


Figure 10: Refined Entity Aggregation Structure











PERSON_CLASS' by the semantics of 'is-a' and 'in' described

in Appendix B. This inheritance is implied in the Entity

Generalization diagram and is not repeated here.

Finally, the additional properties associated with

component entities are identified and their property values

specified (e.g., 'EVALUATED_PATIENT' has the associated

property 'physician' with the property value 'PHYSICIAN').

This layer of the model should identify the following:

a. associated objects (temporary attributes);

b. necessary part objects (fixed attributes);

c. related processes (created by, modified by,

and consumed by) and relationships;

d. the classification structure of entities;

e. class names of property values;

f. specialized property values or additional

attributes for subclasses based on multiple

inheritance; and

g. metaclass attributes.
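The related-process information gathered in the first
subtask is simple enough to record as a table. A hedged
sketch (Python as notation only) of the links identified for
'PATIENT' in Figure 10:

    # Producer/modifier/consumer links for PATIENT, as in
    # Figure 10.
    related_processes = {
        'PATIENT': {
            'producer': ['ADMIT'],              # creates the entity
            'modifier': ['EVALUATE', 'TREAT'],  # modifies the entity
            'consumer': ['RELEASE'],            # consumes the entity
        },
    }
    print(related_processes['PATIENT']['modifier'])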


4.6 Refine Processes

This step applies concepts associated with the

conceptual modeling language to processes. It consists of

two subtasks: identify input, output, and associated

objects; and specify class/metaclass refinement.

Information is incrementally added to the process structure

diagrams obtained in step 4.











The first subtask identifies input and output objects

and other associated objects. The input and output objects

have already been identified on the dfd and are just copied

onto the process erds. The semantics of input and output in

the conceptual modeling language should be considered.

Input entities participate in the process and are removed

from their property value class by this process (i.e.,

input is consumed). Output entities are inserted into their

given property value class by this activity (i.e., output

is produced).

The associated objects are objects affecting the

process but not consumed by the process. These entities

correspond to control items that may have been identified on

the dfd. If control information was not included on the dfd

then it needs to be added here. Consider entities that

constrain the process in some way and are needed by the

process. Control entities serve as input to the process but

they differ from input entities in that they still exist

after the process terminates.

The second subtask utilizes the classification

abstraction mechanism to specify 'in' related class/

metaclass instances. All processes specified previously

are treated as class names. Metaclass names are added and

may be either user-defined or built-in. Metaclasses are

built-in unless there are associated metaclass attributes.











For user-defined metaclasses, associated attributes must be

identified.

All previously identified related entities represent

properties of the associated process. Class names of the

property values must be specified for each of these

properties. These class names identify the class that the

value of the related object must come from. All previously

specified components are treated as class names and their

attributes must be specified.

Using multiple inheritance, provide specialized

property values for subclasses as appropriate. Also

identify any additional properties associated with

subclasses and specify their property values.

Figure 11 is the Refined Process Generalization

Structure diagram for our example. The input to 'ADMIT' is

'person' and its property value class is 'PERSON.' 'ADMIT'

produces the output 'patient' which is inserted into the

property value class 'PATIENT' by this process. 'ADMIT' has

associated properties 'ward,' 'physician,' and 'consulting_

physician.' These objects are not removed from their

property value classes by this process (e.g., 'physician' is

not removed from 'PHYSICIAN' by this process).

Note that 'ADMIT' has no 'in' related class/metaclasses

specified. This is because 'ADMIT' is 'part-of' the

aggregate process 'BE_A_HOSPITAL_PATIENT,' which is









Key: input and output links
(distinguished by arrow style)


Figure 11: Refined Process Generalization Structure










illustrated on the Refined Process Aggregation Structure

diagram.

The previously identified components represent class

names (e.g., 'CHECK_ID'), and their attributes are specified
(e.g., the attribute of 'CHECK_ID' is 'check_id').

Specialized property values for subclasses are given

(e.g., the property value of the 'patient' property of

'ADMIT_CHILD_PATIENT' is 'CHILD_PATIENT,' which is a
specialization of 'PATIENT'). Additional properties
associated with subclasses are provided and their property
values are identified (e.g., 'ADMIT_SURGICAL_PATIENT' has
the associated properties 'blood_type' and 'surgery' with
property values 'BLOOD_TYPING' and 'SURGERY_TYPE').

The Refined Process Aggregation Structure diagram for

our example is given in Figure 12. 'ADMIT' has components

'CHECK_ID,' 'CHOOSE_WARD,' and 'PERFORM_TESTS.' 'ADMIT' is

'part-of' the aggregate 'BE_A_HOSPITAL_PATIENT,' and that

hierarchical relationship is also included in this model.

The input to 'ADMIT' is 'person' and its property value

class is 'PERSON.' The output from 'ADMIT' is 'patient' and

its property value class is 'PATIENT.' The associated

properties of 'ADMIT' and their property value classes are

copied from the previous diagram because they may be

associated with the component processes. In our example,

'CHOOSE_WARD' has the associated property 'ward' and
































































Figure 12: Refined Process Aggregation Structure











'PERFORM_TESTS' has the associated properties 'physician'

and 'consulting_physician.'

The input and output objects for the components are

specified with their property values (e.g., the input

property of 'CHECK_ID' is 'person' with the property value

class 'PERSON' and the output property is 'identified' with

the property value 'IDENTIFIED_PATIENT').

This layer of the model should identify the following:

a. input and output objects;

b. associated objects (objects affecting the

process but not consumed by the process);

c. the classification structure of processes;

d. class names of property values;

e. specialized property values or additional

attributes for subclasses based on multiple

inheritance; and

f. metaclass attributes.
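The input/output/control semantics described above can be
simulated directly. The following sketch (Python,
illustrative only; the token names are invented) shows an
activation of 'ADMIT' consuming its input, producing its
output, and leaving its control object untouched:

    # Property value classes as sets of tokens.
    classes = {'PERSON': {'sue'}, 'PATIENT': set(),
               'HOSPITAL_WARD': {'ward_3'}}

    def admit(person, ward):
        assert person in classes['PERSON']
        assert ward in classes['HOSPITAL_WARD']
        classes['PERSON'].discard(person)   # input is consumed
        classes['PATIENT'].add(person)      # output is produced
        # 'ward' is control: it still exists in HOSPITAL_WARD
        # after the process terminates.

    admit('sue', 'ward_3')
    print(classes)  # sue moved from PERSON to PATIENT;
                    # ward_3 remains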


4.7 Annotate to Refine Dynamics

This step bridges the gap between analysis and

specification. Analysis techniques are used to build models

that are intentionally vague. This vagueness allows the

analyst to concentrate on the structure of the overall

system and consequently, avoid dealing with the details.

However, at some point those details must be considered.

This is that point. Keep in mind that the information

identified in this layer is still not procedural. There are











no control structures specified. The information supplied

at this level is needed to precisely interpret the meaning

of the dynamic model obtained in step 2. (See Figure 4.)

This step consists of three subtasks: model internal

process behavior; model external process behavior; and model

entity behavior.


4.7.1 Model Internal Process Behavior

The first subtask involves modeling the internal

behavior of the processes. The view is taken that the

function of a process can be described without discussing

the algorithm or procedure that will be used. This

coincides with the notion of information hiding which is

central to the methodology. The assumption is that we can

describe and understand a process in terms of its

constraints without considering procedural details. Pre-/

postconditions are a familiar way to express this

information.

Preconditions describe all the things that must be true

before the process begins operating. Yourdon [You89]

discusses the types of information typically described by

preconditions. These include:

1. What inputs must be available. Even though a

process may have more than one input flow into a

process, they may not all be required to activate

the process. Some of the inputs may be needed

during the process but are not necessary for the











process to begin doing its work. Preconditions

specify which input items are necessary for the

process to begin.

2. What relationships must exist between inputs

or within inputs. A precondition might specify

that two inputs with matching fields must arrive

or that one component of a data item must be

within a certain range.

3. What relationships must exist between inputs

and data stores. A precondition might specify

that there be a record within a data store (or a

token in an associated class) that matches some

aspect of an input data item.

4. What relationships must exist between

different stores or within a single store. A

precondition might stipulate that a record (token)

exists with an attribute that matches an attribute

of another record (token) in a different store

(class).

Postconditions describe what must be true when the

process has finished doing its job. Yourdon [You89] also

discusses the types of information described by

postconditions. These include:

1. The outputs that will be generated or produced

by the process.











2. The relationships that will exist between

output values and the original input values. This

is typically used when an output is a direct

mathematical function of an input value.

3. The relationships that will exist between

output values and values in one or more stores.

This is used when information is retrieved from a

store (class) and used as part of an output of the

process.

4. The changes that will have been made to

stores. This is used when new items have been

added, existing items have been modified, or

existing items have been deleted from stores

(classes).

Due to the inherent vagueness of dfds, a process

specified in a dfd can represent more than one activation,

each having different participating inputs, controls, and

outputs. One task addressed in this step is to identify

the possible combinations of inputs, outputs, and controls.

Each combination is an activation and can be specified by an

activation rule. The combination of inputs, outputs, and

controls form the interface of the process. The following

interface operators will be used to describe the

combinations: + (or), . (and), and x (xor). Activation

rules show which inputs and controls, together with the











appropriate interface operators, produce what combination of

outputs, using interface operators as needed.

Figure 13 shows a process with input, output, and

control.


                C1
                |
                v
I1 ------> +--------+ ------> O1
     .     |        |     x
I2 ------> +--------+ ------> O2


Figure 13: Process Interface


Interface operators have been added to the diagram to

illustrate the combinations. The resulting activation is:

I1 . I2 . C1 --> O1 x O2

This rule states that both input objects and the control
object are needed to activate the process. The described
activation will produce either O1 or O2 but not both.
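The interface operators can be read as a small boolean
algebra over the presence of objects. A hedged sketch
(Python; the encoding of rule sides as functions is an
assumption of the sketch):

    # '.' (and): all must be present; '+' (or): at least one;
    # 'x' (xor): exactly one.
    def triggering_condition(i1, i2, c1):
        return i1 and i2 and c1   # left side: I1 . I2 . C1

    def output_condition(o1, o2):
        return o1 != o2           # right side: O1 x O2

    print(triggering_condition(True, True, True))  # -> True
    print(output_condition(True, False))           # -> True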

Several questions arise based on the above

specification. Marca and McGowan [Mar88] consider the left

side of the rule to be the precondition and the right side

to be the postcondition. Do these adequately represent the

pre-/postconditions? What are the triggering conditions

and how do they differ from the preconditions? What does

the activation rule really tell us about the internal

behavior of this process?











The above questions have been considered and the

following conclusions have been made:

1. The left side of the activation rule indicates

what information is needed for the process to

become activated (i.e., Yourdon's first type of

precondition) and is considered a triggering

condition.

2. The right side of the activation rule

indicates what outputs are produced and is

considered a postcondition (i.e., Yourdon's first

type of postcondition).

3. If any operators other than AND are used in an

activation rule or if the process has more than

one activation rule, then the process is not

elementary and the internal behavior of the

process is not understood. Each non-elementary

process must be decomposed into subprocesses that

are elementary.

4. Preconditions and postconditions typically

specify semantic information (e.g., relationships

between inputs, outputs, and existing classes;

acceptable ranges of data values; and changes to

associated data (classes or stores)) and thus

contain more application dependent information

than that which is specified in an activation











rule. Elementary process decompositions can be

used to identify semantic pre-/postconditions.

To illustrate the above conclusions it is necessary to

examine what a process at the conceptual modeling level

does. In essence, when a process is activated, it takes

instances of the input class(es) and places them in the
output class(es). Associated objects may be altered,

component processes may be activated (if they exist), and

constraints may be asserted. If the triggering conditions

are met and the preconditions are satisfied, the

postconditions will be true and the output will be

produced. In other words, the process works. But if the

process specification is ambiguous, then this process

behavior must be examined more closely to determine what

should happen when the process works.

Consider the following example. Assume the ADMIT

process was specified as in Figure 14.


                         ------> patient
person ------> ADMIT    x
                         ------> rejected_patient


Figure 14: Non-elementary Process


The activation rule is:

person --> patient x rejected_patient.

This means that ADMIT is activated when a person arrives and

produces either a patient or a rejected patient but not











both. Thus the postcondition based on the above activation

rule would not be able to assert what output was produced by

the process (i.e., which property value class the entity was

inserted into).

The above ambiguity is due to the fact that the process

is non-elementary. In order to clarify the intended

behavior of the process, one must decompose the process

until all components are elementary. See Figure 15.


                     PATIENT
                        |
                        v
PERSON ------------> ADMIT ------------> PATIENT

PERSON ------> REJECT_ADMISSION ------> REJECTED_PATIENT


Figure 15: Elementary Process Decomposition


The activation rule for ADMIT is:

PERSON --> PATIENT

and the activation rule for REJECT_ADMISSION is

PERSON --> REJECTED_PATIENT.

These tell us that the triggering condition is the arrival

of an instance of PERSON in both cases. Note that the

control object (PATIENT) is not needed for either process to

become activated. Control information is included in the











triggering condition (i.e., the left side of the activation

rule) if the arrival of that information activates the

process. In some cases the control information represents a

data store (i.e., an existing class) and the precondition

must specify how that data store (class) is being used

(e.g., some attribute of a record (token) must match some

aspect of the input data item). The postconditions simply

assert what output is produced. Even though this is clearer

than in the previous example (i.e., we know that ADMIT

produces a PATIENT), we still don't know under what

conditions a person is admitted or rejected. This type of

semantic information is needed to clarify the function of

each of the above processes and is specified as

preconditions.

Writing these pre-/postconditions involves digging

into the innerworkings of the system being modeled. It may

entail a design decision if this information is not

available. In this example, assume that the only reasons a

person will not be admitted are if they are already a

patient in the hospital or if there is no more room left in

the hospital. Similarly, a person will be admitted if they

are not already a patient in the hospital and there is room

left.

This step identifies the triggering conditions (or

actcond) and pre-/postconditions by stating them in short

phrases. These phrases will be the basis for further












refinement in the next step. The content of these short

phrases typically involves stating or negating related

assertions, specifying the arrival of an object in the case

of activation conditions, and asserting that an object is or

is not in another object in pre-/postconditions.

Continuing the above example, the applicable

constraints are informally specified as in Figure 16.


ADMIT
    actcond:        person arrives
    precondition:   person not already a patient, and
                    room left
    postcondition:  person in hospital

REJECT_ADMISSION
    actcond:        person arrives
    precondition:   person already a patient, or
                    no more room left
    postcondition:  person not in hospital

Figure 16: Preliminary Act-/Pre-/Postconditions
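These informal phrases already determine testable predicates
over the model. As a hedged sketch (Python; the capacity
figure and the membership test are assumptions of the
example):

    hospital, CAPACITY = set(), 100   # assumed hospital capacity

    def pre_admit(person):            # ADMIT precondition
        return person not in hospital and len(hospital) < CAPACITY

    def pre_reject(person):           # REJECT_ADMISSION precondition
        return person in hospital or len(hospital) >= CAPACITY

    def admit(person):                # actcond: a person arrives
        assert pre_admit(person)
        hospital.add(person)
        assert person in hospital     # postcondition

    admit('sue')
    print(pre_reject('sue'))  # -> True: 'sue' is now already
                              #    a patient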


4.7.2 Model External Process Behavior

The second subtask of this step is to model the

external process behavior. This task pertains to making

explicit certain types of information that are implicitly

represented on dfds. This includes specifying constraints

describing the life-cycle stages of entities, evaluating and

annotating the data flow junctors (i.e., the points on the

dfd where data flows split or join), and evaluating and

specifying constraints describing the temporal relationships

of processes. A set of predicates is defined in this

section to be used in the temporal constraints.











The information identified in this step will be

expressed as informal constraints which are formally defined

in the next step.

It is believed that making this information explicit

will improve the quality of the requirements. However, it

is also realized that including too much of this type of

information will be detrimental to the system as it

decreases design flexibility. Therefore, determining which

types of constraints are necessary and which types should be

avoided are essential tasks that must be addressed during

this step. Guidelines for evaluating constraints are

presented to assist the analyst in this task.

The first set of constraints that needs to be modeled

concerns the representation of life-cycle changes of

entities. The problem of entity life-cycle changes is

inherent in a methodology that combines an object-oriented

approach with an analysis technique based on functional

decomposition. Dfds show the processes that comprise a

system and the data that flows into and out of those

processes. This typically includes data items that are

modified somewhat by the processes they undergo. For

example, an EVALUATE process may take a 'patient' as an

"input" and produce 'evaluated_patient' as an "output."

This 'evaluated_patient' may then serve as "input" to a

TREAT process and appear as the "output" 'treated_patient'

(see Figure 4). Each of these different data flows reflects a











different life-cycle stage of the data item 'patient.' That

is, the primary distinction between the data acting as an

input to and output from a process is simply that it has

undergone the process.

It is standard practice to use different names for the

data flowing into and out of a process. The problem occurs

when trying to tie that practice into an object-oriented

framework. Recall that object-oriented approaches decompose

a system based on the 'is-a' structure of the data. Since

these different stages do not reflect generalization

decompositions (i.e., they are not different types of

patients) they do not appear on 'is-a' hierarchies.

Considering them distinct, unrelated objects does not seem

appropriate either. Furthermore, they play a unique role

in that all of the subtypes of the data item also undergo

the same processes and have the same life-cycle stages.

Thus, 'CHILD_PATIENT,' 'SURGICAL_PATIENT,'
'TRANSPLANT_SURGERY_PATIENT,' and 'SURGICAL_CHILD_PATIENT' all undergo

EVALUATE and TREAT processes since they are all 'is-a'

related to 'PATIENT.' Therefore, all instances of each of

these classes will have the life-cycle stages
'evaluated_patient' and 'treated_patient.'

Solutions considered for dealing with the above problem
include using a state transition diagram to model

the different states of an object, viewing the different

data items as the same object but with different attributes,