"Format plus"

Material Information

"Format plus": a microcomputer program for conversion of data files created using "Gather" to files structured for execution by the Statistical Analysis System (SAS)
Series Title:
Bradenton GCREC research report
Gilreath, J. P ( James Preston ), 1947-
Gulf Coast Research and Education Center (Bradenton, Fla.)
Place of Publication:
Bradenton FL
Gulf Coast Research & Education Center, IFAS, University of Florida
Publication Date:
Physical Description:
5 p. : ; 28 cm.


Subjects / Keywords:
Statistics -- Data processing ( lcsh )
Personal computers ( jstor )
Microcomputers ( jstor )
Statistical analysis ( jstor )
government publication (state, provincial, territorial, dependent) ( marcgt )
non-fiction ( marcgt )


General Note:
Caption title.
General Note:
"February, 1985"
Statement of Responsibility:
James P. Gilreath.

Record Information

Source Institution:
University of Florida
Rights Management:
All applicable rights reserved by the source institution and holding location.
Resource Identifier:
62559029 ( OCLC )





Copyright 2005, Board of Trustees, University
of Florida


5007 60th Street East
Bradenton, FL 34203

Bradenton GCREC Research Report BRA1985-12 February 1985


James P. Gilreath

Abstract. A computer program was developed which provides automatic
formatting of data sets and generation of input and procedural statements
for analysis by means of the Statistical Analysis System software resident
in a mainframe computer. The program is written in MBASIC and requires
9.1K bytes of memory. Although written for a Rainbow 100 microcomputer,
it could be modified for execution on other computers.

Research is a time-consuming process which includes data collection
and statistical analysis. In the past, analyses were performed on mechanical
calculators, then electronic calculators, and often required days to complete.
Today most horticultural scientists use statistical programs on mainframe
computers to analyze their data in a matter of minutes. With the advent
of low cost, powerful microcomputers, new opportunities exist for further
improvements in data collection and analysis. The most time-consuming
processes are those of entering data from handwritten sheets into a computer
and the subsequent process of properly organizing the data and providing
the necessary input and procedural statements into a file which the statistical
program can manipulate and execute. In addition to time considerations,
the probability of errors must be considered as this probability increases
each time data are handled.

Recognizing the time-consuming nature of this process, a previously
reported program, called "Gather" (1), was written for a portable microcomputer
to allow the computer to be used for data collection in the field, greenhouse
and laboratory. This program greatly decreases the labor intensive aspect
of transcription and provides a means of submitting data to a host computer
for analyses. However, it is still necessary for the user to provide keyboard
input of treatment and replication identifiers and operating instructions,
such as format parameters and statistical procedures, to the host computer
which will perform the analyses. Automation of these procedures would
further reduce the time required for each analysis. Thus, what once took
days to accomplish could be completed in less than 30 minutes by automation.
Currently no program is available which can accomplish this. This paper
describes a BASIC program, entitled "Format Plus," which was written to
achieve this goal and to serve as a component in a series of programs designed
to completely automate all aspects of data collection and analysis, including
tabulation of results. Format Plus takes the data transmitted from the
portable microcomputer, adds appropriate treatment and replication identifiers,
formats the data and generates the necessary input and operational statements
prior to submission for statistical analysis. The program in its present
form is written primarily for horticultural scientists and adds the necessary
procedural statements for analysis based on randomized complete block design.
Other designs can be easily accommodated by changing a few program lines
or modifying the statements contained in the finished data file prior to
submission for analysis. In any event, the data are properly formatted,
regardless of experimental design.

Although the program was written for a Rainbow 100 (Digital Equipment
Corporation, 200 Baker Avenue, Concord, MA 07142) microcomputer, it can
be implemented for execution on other computers with only minor modifications.
This program requires approximately 9.1K bytes of memory and is written
in MBASIC (Microsoft Corporation, 10700 Northrup Way, Bellevue, WA 98004).
It was preferable to have the program resident in a microcomputer rather
than a mainframe because the author uses a microcomputer to transmit data
to a remote mainframe and mainframe user time is more expensive, especially
if connection involves long distance telecommunications. The program was
written specifically for use with the Statistical Analysis System (SAS),
(SAS Institute Inc., Box 8000, Cary, NC 27511), which is a common mainframe
computer statistical package. The program in its current form is written
for analysis of variance with treatment means ranked by Duncan's new multiple
range test at the 5% level of probability. Providing instructions for
the type of analysis is not as important as is automatically generating
the necessary treatment and replication identifiers and the formatting
statements, because the procedural statements can be changed easily and
require the least amount of time of all the program's functions.
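The procedural statements described above can be illustrated with a short Python sketch (Python is used here only for illustration; the original program is written in MBASIC). It builds SAS statements for an analysis of variance on a randomized complete block design with Duncan's test at the 5% level. The function name `anova_statements` and the variable names `TRT` and `REP` are assumptions, and the statement text is not taken from the original program listing.

```python
def anova_statements(variables, trt="TRT", rep="REP"):
    """Return SAS procedural statements for a randomized complete block ANOVA.

    `variables` lists the dependent variable names. TRT and REP are
    assumed names for the treatment and replication identifiers.
    """
    return [
        "PROC ANOVA;",
        f"  CLASS {trt} {rep};",
        # Blocks (replications) and treatments are the two model effects.
        f"  MODEL {' '.join(variables)} = {trt} {rep};",
        # Rank treatment means with Duncan's multiple range test at 5%.
        f"  MEANS {trt} / DUNCAN ALPHA=0.05;",
        "RUN;",
    ]


if __name__ == "__main__":
    print("\n".join(anova_statements(["GERMRATE"])))
```

Keeping these statements in one small function reflects the point made in the text: the type of analysis is the easiest part to change, whether by editing a few program lines or by editing the finished file.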

Data previously collected with a portable microcomputer executing "Gather"
are downloaded to the Rainbow 100 and stored on a diskette as string data
in ASCII (American Standard Code for Information Interchange) character
codes. These data are hereafter referred to as raw data (Table 1).
Format Plus begins with a title page displayed on the monitor. In an internal
process the program directs the computer to generate the first lines of
SAS code which provide user account number, password and SAS execution
initiation statements (Fig. 1). The user is then prompted to supply the
name of the raw data file to be processed by the program as keyboard input.
In an input/output process the computer reads the raw data file and from
this file determines the length of the treatment and replication string
expressions by means of an internal process consisting of if...then...goto
statements. The maximum length of each data set item is determined and
SAS dependent and independent variable definition statements and input
format statements are generated. This is done by determining the length
of each treatment number, replication number and data set string by means
of a series of subroutines.
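One way to picture the generation of the input format statement is the Python sketch below. It assumes that each field occupies a fixed range of columns, that fields are separated by one blank column, and that a SAS column-input statement is wanted. The function name `input_statement`, the field names `TRT` and `REP`, and the column layout are assumptions for illustration and are not taken from the original MBASIC listing.

```python
def input_statement(fields):
    """Build a SAS column-input statement.

    `fields` is a list of (name, width, decimals) tuples in record order.
    One blank column is assumed between adjacent fields.
    """
    parts = []
    col = 1
    for name, width, decimals in fields:
        end = col + width - 1
        span = str(col) if width == 1 else f"{col}-{end}"
        dec = f" .{decimals}" if decimals else ""
        parts.append(f"{name} {span}{dec}")
        col = end + 2  # skip the blank separator column
    return "INPUT " + " ".join(parts) + ";"


if __name__ == "__main__":
    # Two-digit treatment, one-digit replication, one 5-column variable.
    print(input_statement([("TRT", 2, 0), ("REP", 1, 0), ("GERMRATE", 5, 2)]))
```

The widths passed in would come from the maximum lengths that the program determines while reading the raw data file.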

In the case of the treatment and replication number subroutines, the
value of each variable is compared to the maximum length allowable for
that variable. If the value is less than the maximum, the number of digits
required for that variable to equal the maximum is determined and added
to the variable. If the length of the value equals the maximum allowable
length, the computer proceeds to the next step of the program. For example,
if the data file contained 10 treatments, then treatments 1 through 9 would
require addition of a zero in front of the treatment number in order to
properly format the data. Format Plus determines this and inserts the
needed zero, proceeding through the data file one treatment value at a
time until all treatment values have been processed. Once all values are
processed, the computer creates the appropriate printable string expressions
for treatment and replication variables, then enters a loop
where formatting space requirements for each data set are determined.
The data portion of each set is compared to the maximum space allocated for
that set. The program then enters a subroutine to determine the number of
elements needed, and elements are added as required.
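The zero-padding of treatment and replication numbers can be sketched in a few lines of Python. This is an illustration of the idea, not the original MBASIC subroutine, and the function name `pad_id` is an assumption.

```python
def pad_id(value, maximum):
    """Left-pad an identifier with zeros to the width of the largest one."""
    width = len(str(maximum))
    return str(value).zfill(width)


if __name__ == "__main__":
    # With 10 treatments, treatments 1 through 9 gain a leading zero.
    print([pad_id(t, 10) for t in range(1, 11)])
```

With 10 treatments, treatment 3 becomes "03" while treatment 10 is left unchanged, which matches the example given in the text.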

Once the integers for each data set are properly padded out, decimal points
in the data are located, and their proper location is identified throughout
the data. This is done by checking the raw data of each data set for decimals
and determining from this the appropriate location of the decimal, that
is, the number of digits appearing after the decimal. After the decimal
location is ascertained, the number of digits needed is determined and
added to the data. The appropriate integer and decimal elements are
concatenated, and the data elements are added to the appropriate printable
string expression. The computer then checks for more data elements and
data sets and continues the process until no more are encountered. Missing
data are accounted for by having them entered as "@" in the raw data file.
When Format Plus encounters such an entry, it pads that entry out with blanks
to indicate missing data rather than with zeros, which would represent real values.
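The decimal alignment and missing-data handling can be sketched in Python as follows. The sketch assumes that integer parts are padded on the left with zeros, decimal parts are padded on the right with zeros, and "@" entries become blanks of the full field width. The function names `field_widths` and `format_value` are assumptions and do not come from the original program.

```python
MISSING = "@"  # marker for missing data in the raw file


def field_widths(values):
    """Return (integer digits, decimal digits) needed by the widest value."""
    int_w = dec_w = 0
    for v in values:
        if v == MISSING:
            continue
        whole, _, frac = v.partition(".")
        int_w = max(int_w, len(whole))
        dec_w = max(dec_w, len(frac))
    return int_w, dec_w


def format_value(v, int_w, dec_w):
    """Pad one raw value to a fixed width; missing values become blanks."""
    total = int_w + (dec_w + 1 if dec_w else 0)
    if v == MISSING:
        return " " * total
    whole, _, frac = v.partition(".")
    out = whole.zfill(int_w)
    if dec_w:
        out += "." + frac.ljust(dec_w, "0")
    return out


if __name__ == "__main__":
    raw = ["7", "10.5", "@", "2.25"]
    widths = field_widths(raw)
    print([format_value(v, *widths) for v in raw])
```

Writing blanks for missing entries keeps every record the same length without introducing zeros that the analysis would treat as real observations.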

In the next phase of the program the computer is instructed to check for
alternate file names which contain the same data. If an alternate is
encountered, the file specification and identifier are added to the alternate
file name. The file specification identifies to which disk drive of the
computer the alternate file is to be written. The file identifier specifies
the type of file, such as a raw data, transformed data or backup (copy)
file. The computer is then directed to open the file for output to the
appropriate disk drive. The remaining SAS instructions are added to the
file, and the file is encoded with program statements to close the file
in order to protect it from accidentally being written over (Table 2).

The end of process menu is then printed on the monitor and the user is
instructed to select one of the following functions through keyboard input:
edit another file, catalog the files, create a backup copy of the file,
or terminate the program. If the user chooses to edit another file, that
is, generate the necessary SAS file, then the values for all previously
used variables are cleared from memory and the program returns the computer
to the beginning of the program. If cataloging of files is selected, the
computer is instructed to search for all other existing files of the particular
type specified. This process begins by checking disk drive A for legal
files (files of the specified type). If any are present, their names are
printed on the monitor and the program directs the computer to perform
the same task for disk drive B. Absence of legal files on A sends the
computer directly to drive B. If none are present on drive B or all legal
files on B have been printed, the computer is directed back to menu selection
for additional instructions from the user. Selection of the file backup
procedure changes the file identifier of the file to be transformed to
one with a "bak" suffix, creates the backup copy of the file, then returns
the computer to menu selection. The user can end the program by selecting
"terminate" which exits the computer from MBASIC and returns it to the
CP/M-86/80 operating system. In the event of an error condition,
the program returns the computer to the end of process menu where the user
selects the desired function.
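The renaming step of the backup procedure might look like the following Python sketch. It assumes file names of the form drive:name.identifier, for example "B:TRIAL1.DAT"; that example name and the function name `backup_name` are hypothetical, and only the "bak" suffix comes from the description above.

```python
def backup_name(file_spec):
    """Replace the file identifier with "bak", keeping drive and name."""
    drive, sep, name = file_spec.rpartition(":")
    stem, _, _identifier = name.partition(".")
    return f"{drive}{sep}{stem}.bak"


if __name__ == "__main__":
    print(backup_name("B:TRIAL1.DAT"))
```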

Once the data file has been transformed into the proper file format for
analysis, the file can be submitted to the mainframe computer for analysis
by SAS. Use of this program in conjunction with the data collection program,
"Gather," allows one to obtain completed statistical analysis of experimental
data within 10 minutes of data collection, thereby providing the user with
more time to spend on other things.


The author wishes to express his deepest appreciation to Mr. Gregory V. King
for his valuable technical assistance in writing this program and preparation
of this manuscript.

This computer program was developed as part of a project contributing to
a joint program between the Institute of Food and Agricultural Sciences
of the University of Florida and the Gas Research Institute, Chicago, IL,
entitled "Methane From Biomass and Waste."


1. Gilreath, J. P. 1985. Description of a BASIC program for data collection
using a portable microcomputer. HortScience 20:301.

Table 1. Example of the organization of an input string expression of
a data file for submission to processing by Format Plus with

String expression

Number of data sets or variables
Number of treatments
Number of replications
Name of first variable
Name of second variable
Data for Treatment 1, Rep 1, Variable 1
Data for Treatment 1, Rep 1, Variable 2
Data for Treatment 1, Rep 2, Variable 1
Data for Treatment 1, Rep 2, Variable 2
Data for Treatment 1, Rep 3, Variable 1
Data for Treatment 1, Rep 3, Variable 2
ETC, continuation of data
To end of file
Data for Treatment 6, Rep 3, Variable 1
Data for Treatment 6, Rep 3, Variable 2
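The record order shown in Table 1 can be read with a short Python sketch such as the one below. It assumes one string per record, with data ordered by treatment, then replication, then variable. The names `RawData` and `parse_raw` and the second variable name HEIGHT in the usage example are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RawData:
    variables: list      # variable names, in file order
    treatments: int
    replications: int
    values: dict         # (treatment, rep) -> list of raw strings, one per variable


def parse_raw(lines):
    """Parse records laid out as in Table 1 (one string per record)."""
    records = [ln.strip() for ln in lines if ln.strip()]
    n_vars, n_trts, n_reps = (int(x) for x in records[:3])
    names = records[3:3 + n_vars]
    data = records[3 + n_vars:]
    expected = n_vars * n_trts * n_reps
    if len(data) != expected:
        raise ValueError(f"expected {expected} data records, found {len(data)}")
    values = {}
    it = iter(data)
    for trt in range(1, n_trts + 1):
        for rep in range(1, n_reps + 1):
            values[(trt, rep)] = [next(it) for _ in range(n_vars)]
    return RawData(names, n_trts, n_reps, values)


if __name__ == "__main__":
    sample = ["2", "2", "2", "GERMRATE", "HEIGHT",
              "7", "1.5", "10", "2.25", "@", "3", "12", "4.0"]
    print(parse_raw(sample).values[(1, 2)])
```

Values are kept as strings so that the later padding steps, and the "@" marker for missing data, can be handled without loss of the original digits.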

Table 2. Example of a data file (from Table 1) processed by Format
Plus and ready for analysis by the Statistical Analysis

//JIM JOB (1021,0032,10,14



1 GERMRATE 7 10 2;

String expression