Historic note

Group Title: TREC-H research report - Tropical Research and Education Center-Homestead ; SB-85-4
Title: The pursuit of accuracy in daily temperature data
Full Citation
Permanent Link: http://ufdc.ufl.edu/UF00067847/00001
 Material Information
Title: The pursuit of accuracy in daily temperature data
Series Title: Homestead TREC research report
Physical Description: 9 leaves : ; 28 cm.
Language: English
Creator: Orth, Paul G
TREC (Agency)
Publisher: University of Florida, Agricultural Research and Education Center
Place of Publication: Homestead Fla
Publication Date: 1985
Subject: Temperature measurements   ( lcsh )
Temperature -- Tables   ( lcsh )
Climate -- Florida   ( lcsh )
Genre: government publication (state, provincial, territorial, dependent)   ( marcgt )
non-fiction   ( marcgt )
Statement of Responsibility: Paul G. Orth.
General Note: "December 31, 1985."
 Record Information
Bibliographic ID: UF00067847
Volume ID: VID00001
Source Institution: University of Florida
Rights Management: All rights reserved by the source institution and holding location.
Resource Identifier: oclc - 72819176

Full Text


The publications in this collection do
not reflect current scientific knowledge
or recommendations. These texts
represent the historic publishing
record of the Institute for Food and
Agricultural Sciences and should be
used only to trace the historic work of
the Institute and its staff. Current IFAS
research may be found on the
Electronic Data Information Source
site maintained by the Florida
Cooperative Extension Service.

Copyright 2005, Board of Trustees, University
of Florida

The Pursuit of Accuracy in Daily Temperature Data

Paul G. Orth
Associate Professor
University of Florida, IFAS
Tropical Research and Education Center
Homestead, Florida 33031

Abstract

Maximum and minimum daily temperatures at the Homestead Experiment Station
from four instrumentation sources were compared with each other and with the
officially reported data. The most accurate and reliable data source was the
pre-calibrated data logger-sensor combination housed at the instrumentation
site used for more than 20 years. The manually observed readings were also
accurate but were available for only 19 days of the month of October 1983.
The care required to maintain accuracy and reliability is discussed, and the
complications caused by multiple data sources are indicated.


The Homestead Experiment Station (official Weather Service name) weather
station has been operated continuously since September of 1930 at what is now
called Tropical Research and Education Center, Homestead. It became a part
of the National Weather Service system of cooperative stations in February of
1931. During 1982 decreases in State funding and associated reduction in
personnel made it difficult to continue the taking of routine weather
observations. About the same time there was interest in more detailed
measurements of weather parameters that would be used in integrated pest
management (IPM) programs and crop modeling, and a suitable electronic data
logger was obtained for an IPM program for tree crops. A large amount of
data were recorded, but the IPM program was discontinued before decisions
could be made on meteorological parameters, data format, and other pertinent
aspects. During this period a second data logger was obtained for use as a
back-up and for additional research needs. Before any research projects were
completed the emphasis changed to using the data logger as a gatherer of
routine weather data. From January 1 through June 30 of 1983 the source of
official cooperative weather station data was a combination of the data
logger records and manual observations. Reliance on the automated station
gradually increased during the period.

During the summer of 1983 one data logger was turned over for use in an
automated weather station linked to the IFAS computer in Gainesville. Also,
a new weather station site in a more open location was developed for this
data logger. The other data logger was reprogrammed to achieve more
efficient use of memory space and continued in operation at the old site.
The new site became operational in September of 1983. By October there were
four primary sources of temperature data available to the author, and it was
appropriate to evaluate the quality of each source of data and to examine
some possible implications regarding the collection of daily maximum and
minimum temperatures for use in long term compilation of climatic data.


December 31, 1985

Materials and Methods

October 1983 maximum and minimum temperature data from four sources were
compared along with the data on the form submitted for October to the
cooperative weather station section of the U.S. Weather Service. This
normally is considered the official data record. The first source of data
were readings taken by an observer from official U.S. Weather Service
thermometers. Twenty-three sets of readings were available. These
thermometers were exposed in a standard "cotton region" medium sized shelter
of approximately 0.21 m³ (7.6 ft³).

A second data source was the file automatically compiled by the IFAS computer
from daily data retrievals from the data logger at the new site. This site
was about 100 meters NNW of the older site. The data were retrieved from the
IFAS computer in Gainesville in a program called [WETHR]AWARDS. The data
displayed were in °C and were converted to °F for comparison. They were
summarized, by the computer, from 24 sets of hourly summaries. The
temperature sensor was housed in a small ventilated shelter of about 0.038 m³
(1.35 ft³) internal volume. Data for 29 days were available from this source.
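
The conversion and daily summarization described for this source can be sketched as follows. This is only an illustration: the function names and the sample hourly values are hypothetical, and the sketch assumes each hourly summary carries its own maximum and minimum in °C, which the report does not state.

```python
def c_to_f(temp_c):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return temp_c * 9.0 / 5.0 + 32.0


def daily_extremes_f(hourly_summaries_c):
    """Return (max_f, min_f) for one calendar day.

    hourly_summaries_c is assumed to hold 24 (hourly_max_c, hourly_min_c)
    pairs; this layout is an assumption, not the format of the IFAS file.
    """
    if len(hourly_summaries_c) != 24:
        raise ValueError("expected 24 hourly summaries for one calendar day")
    day_max_c = max(h_max for h_max, _ in hourly_summaries_c)
    day_min_c = min(h_min for _, h_min in hourly_summaries_c)
    return round(c_to_f(day_max_c), 1), round(c_to_f(day_min_c), 1)


# Hypothetical hourly values (not data from the report): a rise, then a fall.
hours = [(20.0 + 0.5 * h, 19.5 + 0.5 * h) for h in range(12)]
hours += [(26.0 - 0.5 * h, 25.5 - 0.5 * h) for h in range(12)]

print(daily_extremes_f(hours))
```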

A third source of data was from the data logger at the old site. The
electronics and sensor had been thoroughly verified during the first six
months of the year. The regression equation programmed into the data logger
was T = 1.811 S + 31.63. T is the calculated temperature in °F and S is the
sensor reading. The program output the maximum and minimum temperature and
time of occurrence every 12 hours. Output every 12 hours was used primarily
because "leaf wetness" data were needed on that basis. However, it was also
useful in separating out a PM minimum that was less than the AM minimum. The
sensor was exposed in close proximity to the thermometers used as the first
data source. The data were automatically transferred to cassette tape.
After the end of the month the tape was read to the IFAS computer through an
interface unit. Sorting of lines of data was accomplished using the editing
program on the computer. The data were then transferred into a statistical
program which was used to finish the sorting. Data for all 31 days were
available from this source.
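
The regression equation programmed into the data logger can be written out directly. In the sketch below only the two coefficients come from the report; the function name and the sample sensor readings are hypothetical. The coefficients are close to the nominal Celsius-to-Fahrenheit conversion (1.8 and 32), which suggests, as an inference rather than a stated fact, that S is roughly in °C.

```python
SLOPE = 1.811       # from the report
INTERCEPT = 31.63   # from the report


def sensor_to_fahrenheit(sensor_reading):
    """Apply T = 1.811 S + 31.63 to a raw sensor reading S."""
    return SLOPE * sensor_reading + INTERCEPT


# Hypothetical sensor readings, shown next to the nominal conversion
# (1.8 S + 32) to see how far the field calibration departs from it.
for s in (20.0, 25.0, 30.0):
    calibrated = sensor_to_fahrenheit(s)
    nominal = 1.8 * s + 32.0
    print(f"S={s:5.1f}  calibrated={calibrated:6.2f}  nominal={nominal:6.2f}")
```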

A fourth potential source of data was the continuous tracings of a
hygrothermograph located in the same shelter as the sensors for data sources
1 and 3. These data were needed only to separate ambiguities in data from
the first three sources and to assist in the study of data where time of
occurrence or some other factor was of interest.

Initially, the long-term calibrated data logger was the data source selected
as a primary reference because of the completeness of the data and the
expected high correlation with the manually observed readings. These data
were compared visually with the observed readings. Since the agreement was
excellent, data from the other two main sources were compared with data from
the selected source as shown below.

Results and Discussion

The data studied are in table 1. Some numbers are supplemented by an
interpretive letter. This letter signifies a factor affecting the number, and
one that is important for understanding a lack of agreement in numbers. These
codes are explained in table 2. Columns 1 and 6 contain the observed (OBSN)
maximum and minimum temperatures read on thermometers supplied by the U.S.
Weather Service. These readings were usually made about 7:30 AM local time and
are the extremes reached since the previously recorded values. The data are
adjusted for date of occurrence (e.g., the maximum read at 7:30 AM on October 2
is entered as the maximum temperature for October 1).
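
The date adjustment for manual readings can be sketched as below. The function name is hypothetical, and leaving the minimum on the observation date is my reading of the report (only the maximum is described as being moved one day earlier). The sample values are the Table 1 entries for columns 1 and 6.

```python
from datetime import date, timedelta


def assign_dates(observation_date, max_read, min_read):
    """Place a morning max/min thermometer reading on calendar dates.

    The maximum read in the morning is credited to the previous day.
    The minimum is left on the observation date (an assumption).
    """
    return {
        "max": (observation_date - timedelta(days=1), max_read),
        "min": (observation_date, min_read),
    }


# Reading taken at about 7:30 AM on October 2, 1983.
entry = assign_dates(date(1983, 10, 2), max_read=87.5, min_read=70.2)
print(entry["max"])  # maximum credited to October 1
print(entry["min"])  # minimum credited to October 2
```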

Table 1. October 1983 temperature extremes (°F) for Homestead Experiment Station
collected from four primary data sources and one derived source.

         Maximum temperature                  Minimum temperature
Column   1       2      3       4     5       6       7      8       9     10
Date
   1     87.5    85.4   87.9    --    88      68.5    69.1   68.7    --    69
   2     89.0    87.0   89.1    --    89      70.2 C  71.9   71.3    --    70 C
   3     85.8    83.9   85.6    86    86      73.1    73.6   73.4    --    73
   4     87.2    86.2   87.3    88    87      73.2    71.7   71.5 F  74    73
   5     86.8 D  84.8   86.2    87    87 D    71.2    71.7   71.3    72    71
   6     90.8    --     91.1    92    91      70.1    --     70.3    71    70
   7     90.0    88.2   90.1    90    88 A    68.9    69.4   68.9    70    69
   8     --      88.3   89.9    90    85 A    --      69.1   68.7    69    69
   9     --      84.9   86.4    --    90 B    68.5    69.4   68.5    69    71 A
  10     86.5    83.7   86.5    87    87      --      70.6   70.0    71    69 B
  11     85.1    83.1   85.3    85    85      69.2    69.8   69.6 E  70    69
  12     89.8    88.3   89.8    90    90      70.5    70.7   70.7    71    71
  13     88.2    87.4   87.9    88    88      71.2 C  72.7   72.5 F  74    71 C
  14     --      86.7   88.9    89    87 A    71.3    71.9   71.5 E  72    71
  15     --      86.8   89.6    90    88 A    --      74.0   73.7    75    73 A
  16     89.7    87.9   90.1    90    90      --      72.7   72.4 F  74    72 A
  17     89.0    86.9   89.5    89    89 E    71.1    71.7   71.4    72    71
  18     85.5 E  83.2   85.2 E  85    86 E    71.2 C  70.0   69.4 F  71    71 C
  19     87.0    85.2   87.2    87    87      68.5    69.2   68.4 E  68    69
  20     88.8    85.8   88.8    89    89      69.2    69.9   69.3    69    69
  21     89.0    87.2   89.0    89    86 A    70.1 C  71.2   70.9    70    70 C
  22     --      86.4   89.1 I  88    86 A    68.1    68.9   68.2    67    73 A
  23     --      86.6   87.9    87    89 B    --      72.9   72.3    72    72 A
  24     88.0    86.8   87.4 E  87    88      --      71.6   71.0    70    68 B
  25     81.0    80.3   81.4    81    81      72.0    70.6   70.4 F  72    72
  26     75.2    74.7   75.1    75    75      68.8    67.4   67.0 F  69    69
  27     82.0    80.9   82.0    82    82      66.8    66.0   65.3 F  66    67
  28     --      80.3   82.5    82    83      62.8    63.3   62.7    62    63
  29     84.5 D  82.3   85.3    84    82 A    --      64.5   63.8 F  66    63 A
  30     --      82.0   83.6    83    85 B    62.0 H  63.4   62.8    62    68 A
  31     84.2    --     84.5    84    84      --      --     66.8    67    62 B

Table 2. Codes and corresponding factors affecting data quality.


A Non-calibrated automated data collection and/or
confusion in assigning data to a 24 hour period

B Incorrectly used three day OBSN maximum for third day maximum
or three day OBSN minimum for third day minimum

C Previous set temperature caused data loss

D Probably inaccurate manual reading

E Small discrepancy between manual and CR21 (appears
exaggerated due to rounding to an integer)

F Calendar day minimum in PM

G Two alternate data sources used to determine value;
discrepancy small

H Read 48 hours or more since thermometers reset

I Reason not known

Columns 2 and 7 contain the daily extremes, calendar day basis, recorded by
one data logger (A-1) and available from the IFAS computer in Gainesville.
Columns 3 and 8 are also daily extremes on a calendar day basis. They are
from the calibrated data logger-sensor system (A-2).

Columns 4 and 9 contain midnight to midnight extremes, adjusted by
calibration, from continuous thermograph traces (GRA). All readily available
readings are reported, but numerical comparisons were not made with the other
data because of the lack of precision in such readings. They were used in
the decision to call the maximum temperature on the 22nd 88 when conflicting
trends were indicated in A-1 and A-2.

Columns 5 and 10 are the values reported to the U.S. Weather Service on the
monthly summary form. However, the maximum temperatures were moved one day
earlier as was done with the manual readings so correct comparisons with
calendar date data could be made.

Columns 11-18 in table 3 are derived from columns 1-10 in table 1 as
described below. Some numbers are supplemented by an interpretive letter as
mentioned above (table 2). When any column is summarized the numbers
identified by letters are not included since, in essence, the number is not
correct and the cause of the error is known with reasonable certainty. The
purpose of the summarization is to quantify normal variation.


Table 3. Selected temperature comparisons, October 1983, for Homestead
Experiment Station.

         Maximum temperature             Minimum temperature
Column   11        13        14          18
Date     (1-3)
   1     -.4
   2     -.1                             -1 C
   3     +.2
   4     -.1
   5     +.6 D               +1 D
   6     -.3
   7     -.1                 -2 A
   8     --                  -5 A
   9     --                  +4 B        +2 A
  10      .0                             -1 B
  11     -.2
  12      .0
  13     +.3                             -2 C
  14     --                  -2 A
  15     --                  -2 A        -1 A
  16     -.4                             -1 A
  17     -.5                 -1 E
  18     +.3 E               +1 E        -1 C
  19     -.2
  20      .0
  21      .0                 -3 A        -1 C
  22     --        +1.1 I    -2 A        +5 A
  23     --                  +1 B         0 A
  24      .0       -.6 E                 -3 B
  25     -.4
  26     +.1
  27      .0
  28     --
  29     -.8 D               -3 A        -3 A
  30     --                  +1 B        +5 A
  31     -.3                             -5 B

Entries legible for the remaining columns, in order of appearance:

Maximum temperature
Column 12: -.5

Minimum temperature
Column 15: -1.1 C, -.1, -.2, -.2, -2.1 C, -.1, --, -.3, -.5 C, -.1, -.8 C,
-.1, --, -.2, +.1, -.8 H
Column 16: +.4, +.6, +.2, +.2, +.4, +.5, +.4, +.4, +.6, +.4, +.2, +.5, +.3,
+.3, +.3, +.6, +.7, +.6, +.3, +.7, +.6, +.6, +.2, +.4, +.7, +.6, +.7, +.6
Column 17: -2.0 F, -.5 E, +.2 E, -.7 F, +.1 E, -.5 F, -2.3 F, -.1 E, -1.6 F,
-2.0 F, -1.5 F, -2.5 F

Column 3 subtracted from column 1 gives column 11. (An exception is
October 24, where it was decided the manual observation would be considered
the more accurate value because of the evidence from two other data sources,
the generally good agreement between columns 1 and 3, and the fact that
adding .1 to column 3 would give a number which when rounded off would equal
the manual observation.) The mean of this column (n = 20) is -0.12 ± 0.05
which is excellent agreement. Standard deviation is 0.21. Small differences
between readings from two data sources are exaggerated and thus noticed when
the rounding off of one number causes it to increase to the next digit, and
the other number rounds off to one digit less. This was mentioned above for
the 24th and also occurred on the 18th. However, such rounding off does not
affect the quality of climatological data. Rounding cancels out over a
period of time, and in addition small random variations occur in observed
daily extreme temperature.
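
The column 11 summary can be recomputed from Table 1 as a consistency check. The sketch below uses the 20 days whose column 11 entry carries no code letter and treats October 24 as a zero difference, following the exception described above. The report does not name the second figure quoted with each mean; treating it as the standard error (standard deviation divided by the square root of n) is an inference that happens to match the reported numbers. Variable names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

# (day, manual maximum col 1, A-2 maximum col 3) in °F from Table 1,
# limited to days whose column 11 entry has no code letter.
PAIRS = [
    (1, 87.5, 87.9), (2, 89.0, 89.1), (3, 85.8, 85.6), (4, 87.2, 87.3),
    (6, 90.8, 91.1), (7, 90.0, 90.1), (10, 86.5, 86.5), (11, 85.1, 85.3),
    (12, 89.8, 89.8), (13, 88.2, 87.9), (16, 89.7, 90.1), (17, 89.0, 89.5),
    (19, 87.0, 87.2), (20, 88.8, 88.8), (21, 89.0, 89.0), (24, 88.0, 87.4),
    (25, 81.0, 81.4), (26, 75.2, 75.1), (27, 82.0, 82.0), (31, 84.2, 84.5),
]

# October 24: the manual value was accepted as correct, so the entry is 0.
OVERRIDES = {24: 0.0}

diffs = [OVERRIDES.get(day, manual - logger) for day, manual, logger in PAIRS]

col11_mean = mean(diffs)
col11_sd = stdev(diffs)                   # sample standard deviation
col11_se = col11_sd / sqrt(len(diffs))    # assumed meaning of the "±" figure

print(f"n={len(diffs)} mean={col11_mean:.2f} sd={col11_sd:.2f} se={col11_se:.2f}")
# prints: n=20 mean=-0.12 sd=0.21 se=0.05
```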

Similar data for minimum temperatures are in column 15 which is column 6
minus column 8 except where an F appears in column 17. An F means that the
daily minimum shown in column 8 came in the afternoon. However, the morning
minimum on the reference data logger-sensor, not shown, corresponded to the
manual reading, column 6, and thus was used to calculate the difference
between systems and was entered in column 15. That same automated morning
minimum was subtracted from the afternoon minimum to give the numbers
identified by F in column 17. Thus those numbers show how much colder the PM
minimum was than the AM minimum. The mean of column 15 (n = 18) is -0.12 ±
0.03, an excellent agreement but statistically different from zero. Standard
deviation is 0.12. The thermometer minimum read on October 30 should have
occurred the morning before when no reading was taken. However, its
difference of 0.8° from the A-2 reading is greater than usual. A reasonable
explanation is that the thermometer was reset with too much slope, and
vibration of the instrument shelter, by wind, caused the marker to slip.
Such slippage has been the only reasonable explanation on a number of
previous occasions.
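
The handling of the 12-hour summaries described above (a calendar-day minimum that falls in the afternoon, code F) can be sketched as follows. The function name and the sample values are hypothetical; the report does not list the morning minimums it used.

```python
def calendar_day_minimum(am_min, pm_min):
    """Combine two 12-hour minimums into a calendar-day minimum.

    Returns (day_min, occurred_pm, pm_minus_am). The last item is the
    kind of figure flagged with F: how much colder the PM minimum was
    than the AM minimum. It is None when the minimum fell in the morning.
    """
    occurred_pm = pm_min < am_min
    day_min = min(am_min, pm_min)
    pm_minus_am = round(pm_min - am_min, 1) if occurred_pm else None
    return day_min, occurred_pm, pm_minus_am


# Hypothetical 12-hour minimums in °F.
print(calendar_day_minimum(am_min=71.7, pm_min=69.4))
print(calendar_day_minimum(am_min=68.5, pm_min=74.0))
```

When the first result is flagged, the morning value (71.7 here) is the one to compare against the manual thermometer reading.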

The AM minimums read on the thermograph, column 9, tend to correlate well
with column 6 except on those days marked with C. On those days, the
thermometer resetting time the previous day was too early. Thus, four times
during the month the minimum thermometer reading was less than the actual
minimum because of when the thermometer was reset. The solution to this
problem is to vary the time of observation slightly as weather changes
dictate and/or make observations 2 or 3 hours after sunrise.

Since manual readings generally were not made over weekends, the Monday AM
reading usually covers a 3 day period. Therefore, without additional
information it is impossible to determine which of the 3 days that extreme
happened. Strictly speaking, then, accurate observations of maximum
temperature were available for 19 days and of minimum temperatures for 15
days in October. All data from manual observations in table 1 were placed on
the correct date since the other data sources made it possible to know which
of the 3 days had the coolest temperature and which the warmest.
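
Placing a multi-day manual extreme on the correct date can be sketched as below. The function name is hypothetical. The logger values are the A-2 entries from Table 1, and the grouping of days assumes that a single reading covered October 7 to 9 for the maximum and October 8 to 10 for the minimum.

```python
def day_of_extreme(logger_by_day, kind):
    """Return the day on which the reference logger recorded the extreme.

    logger_by_day maps day of month to the logger's daily value;
    kind is "max" or "min".
    """
    if kind not in ("max", "min"):
        raise ValueError("kind must be 'max' or 'min'")
    chooser = max if kind == "max" else min
    return chooser(logger_by_day, key=logger_by_day.get)


# A-2 maximums (column 3) and minimums (column 8) from Table 1.
max_day = day_of_extreme({7: 90.1, 8: 89.9, 9: 86.4}, "max")
min_day = day_of_extreme({8: 68.7, 9: 68.5, 10: 70.0}, "min")
print(max_day, min_day)
```

In Table 1 the manual values for this period (90.0 and 68.5) appear on October 7 and October 9, which is where the sketch places them.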

Column 12 (column 2 minus column 3) and column 16 (column 7 minus column 8)
are comparisons between the reference system and the data automatically filed
in the IFAS computer. The mean of column 12 (n = 29) is -1.84 ± 0.14 with a
standard deviation of 0.73. The same summarization of column 16 (n = 29)
gives 0.45 ± 0.04 and a standard deviation of 0.19. The maximum temperature
readings at the newly established site (column 2) were significantly lower
than those in either columns 1 or 3. The minimum temperature readings showed
better agreement but were generally warmer at the new site. Major factors
possibly involved in the discrepancies are sensor shelter type, areal
location, and sensor calibration. Data from the new site could not
substitute for data from the old site.
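
The statement that the new-site maximums were significantly lower can be illustrated by dividing the mean difference by its standard error. The report does not say which test was used, so the one-sample t statistic below is an assumption. The data are columns 2 and 3 of Table 1 for the 29 days on which both are present; names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

# (A-1 maximum col 2, A-2 maximum col 3) in °F from Table 1.
# October 6 and 31 are omitted because A-1 has no value.
NEW_VS_OLD_MAX = [
    (85.4, 87.9), (87.0, 89.1), (83.9, 85.6), (86.2, 87.3), (84.8, 86.2),
    (88.2, 90.1), (88.3, 89.9), (84.9, 86.4), (83.7, 86.5), (83.1, 85.3),
    (88.3, 89.8), (87.4, 87.9), (86.7, 88.9), (86.8, 89.6), (87.9, 90.1),
    (86.9, 89.5), (83.2, 85.2), (85.2, 87.2), (85.8, 88.8), (87.2, 89.0),
    (86.4, 89.1), (86.6, 87.9), (86.8, 87.4), (80.3, 81.4), (74.7, 75.1),
    (80.9, 82.0), (80.3, 82.5), (82.3, 85.3), (82.0, 83.6),
]


def t_statistic(differences):
    """Return (mean, standard error, mean / standard error)."""
    m = mean(differences)
    se = stdev(differences) / sqrt(len(differences))
    return m, se, m / se


col12 = [new - old for new, old in NEW_VS_OLD_MAX]
col12_mean, col12_se, col12_t = t_statistic(col12)
print(f"mean={col12_mean:.2f} se={col12_se:.2f} t={col12_t:.1f}")
```

The mean and standard error reproduce the reported -1.84 and 0.14, and the ratio lies far beyond the value of about 2 that is commonly used as a rough significance threshold.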

Columns 14 and 18 show the agreement between the officially reported extreme
temperatures, columns 5 and 10, and the values which this study indicates
should have been reported. The reasons for disagreement are identified by
letters previously mentioned which also can serve the reader as the key
information needed to reach his/her own conclusions. Some discrepancies are
positive and others are negative. Thus the mean does not fully indicate the
lack of agreement; standard deviation is a better indicator. The mean for
column 14 was -0.32 ± 0.29 with a standard deviation of 1.60. The respective
numbers for column 18 were -0.23 ± 0.34 and 1.87.

Three additional errors appear on the official summary form for October.
On three days the set temperature (i.e., the temperature at time of
observation) is lower than the minimum temperature reported for the next day.
This is impossible because the temperature read on the minimum thermometer is
the lowest temperature since the previous resetting of the thermometer.
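
A check for this kind of internal inconsistency can be sketched as follows. The record layout, the function name, and the sample entries are all hypothetical (the report does not list the set temperatures). The sketch assumes consecutive daily records with the thermometer reset at each observation.

```python
def find_set_temperature_conflicts(records):
    """Return (day, next_day) pairs where the set temperature on one day
    is lower than the minimum reported for the following day.

    records is a list of dicts with keys 'day', 'set_temp', 'min_temp',
    in date order.
    """
    conflicts = []
    for today, tomorrow in zip(records, records[1:]):
        if today["set_temp"] < tomorrow["min_temp"]:
            conflicts.append((today["day"], tomorrow["day"]))
    return conflicts


# Hypothetical form entries in °F (not taken from the report).
form = [
    {"day": 1, "set_temp": 72, "min_temp": 69},
    {"day": 2, "set_temp": 70, "min_temp": 73},  # 73 exceeds day 1 set temp
    {"day": 3, "set_temp": 74, "min_temp": 70},
]

print(find_set_temperature_conflicts(form))  # prints: [(1, 2)]
```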

Summary and Conclusions

The data presented show that it is not easy to maintain a high quality
weather station for collection of temperature data as part of a long term
data base of climatic data. The source of the best quality data for October
1983 was the data logger-sensor combination (columns 3 and 8, table 1) which
had been calibrated against the official thermometers over a period of
several months. This conclusion is based on the excellent correlation
between data from this source and manual observations, the traditional source
of data. Also, this was the only data source available for every day of the
month. There are two ramifications of the relationship between manual and
automated readings. First, a nearly perfect correlation between manual and
automated readings strongly affirms the reliability of the maximum and
minimum thermometers. Supplying accurate thermometers is the responsibility
of the U.S. Weather Service. It is unlikely that if either is inaccurate,
both will be equally inaccurate in magnitude and direction. Thus, when the
data logger sensor, a single sensor, has the same linear regression
relationship with both the maximum thermometer and the minimum thermometer,
it is clear that the thermometers are performing consistently. Accuracy can
be evaluated through comparison with a certified calibrated sensor. Second,
if automated readings are to be the data source for continuation of long term
climatic records, then their consistency with past records must be verified
before the automatic equipment becomes the source of routine data, and there
must be a continuing program to maintain data accuracy.

The regression equation relating the data logger-sensor combination and
official thermometers was developed over several months under actual
operational conditions. To develop such a regression equation requires a
wide range of data points. This only can be achieved, under natural
conditions, over a long period of time in which the weather provides both
high and low maximum and minimum temperatures, ranging through a significant
portion of the temperature range encountered during a year. Calibration
under field conditions avoids the introduction of anomalies that might be
introduced by using equipment for calibration not designed for use in such a
manner (e.g. refrigerators and ovens). Continued occasional manual readings
can be used to monitor the automated system. Sensor fatigue or problems with
the data logger should be suspected if agreement decreases.
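
Developing a regression equation from paired field readings can be sketched with an ordinary least-squares fit. The report does not describe the fitting procedure, so least squares is an assumption. The paired readings below are synthetic, generated from the report's own equation so that the fit can be checked. Function and variable names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired readings")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Synthetic pairs spanning a wide range, as the text recommends.
# Thermometer values are generated from T = 1.811 S + 31.63.
sensor = [10.0, 15.0, 20.0, 25.0, 30.0, 35.0]
thermometer = [1.811 * s + 31.63 for s in sensor]

slope, intercept = fit_line(sensor, thermometer)
print(f"T = {slope:.3f} S + {intercept:.2f}")
```

With real paired readings the points scatter about the line, and a drift in the residuals over time would be the signal to suspect the sensor or the data logger.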

These data, which showed an agreement within 0.3°F (mean + standard
deviation, sign ignored), are thus an example of good temperature data
accuracy. The data also showed a fairly consistent relationship, with A-2
being 0.1°F more than the manual reading for both maximum and minimum
temperature. This is not a great enough difference to require recalibration
or use of a correction factor, but it can be taken into account when necessary.

When manual readings are used to monitor an automated system it is necessary
to reset the thermometers 24 hours before the reading to be compared is made.

The temperature comparisons made indicate the data filed in the IFAS VAX
computer were not consistent with historic temperature records for Homestead
Experiment Station. This is shown by the comparison between A-1 and the
observed temperatures. The latter were made in the same manner and location
as for the historical records. More data must be collected to determine the
importance of the various possible causes of the lack of agreement:
instrument shelter design, calibration, and location.

This study also raised the issue of which 24-hour period to use as standard.
There is no right 24-hour period for daily weather records. The calendar day
basis is subject to the least confusion. However, that system, on occasion,
results in an extreme temperature at midnight ending a day or at 12:01 AM,
starting a day. Such numbers have no significance relative to agriculture.
On the other hand it is easier to use manual readings to verify the accuracy
of automated data on a calendar day basis than it is to use automated data to
fill in missing manual observations routinely collected for an 8AM to 8AM
day. Thus a midnight to midnight 24-hr interval is logical when automated
data collection is used.
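
The effect of the choice of 24-hour period can be illustrated with a small sketch. The function name and the hourly temperatures are hypothetical: a steady fall of 0.25°F per hour, such as might follow a frontal passage, chosen so that the extremes fall at the edges of each window.

```python
from datetime import datetime, timedelta


def window_extremes(readings, start, hours=24):
    """Return (max, min) of readings with start <= time < start + hours."""
    end = start + timedelta(hours=hours)
    temps = [temp for when, temp in readings if start <= when < end]
    if not temps:
        raise ValueError("no readings in the requested window")
    return max(temps), min(temps)


# Hypothetical hourly temperatures over 48 hours, falling steadily.
midnight = datetime(1983, 10, 25)
readings = [(midnight + timedelta(hours=h), 80.0 - 0.25 * h) for h in range(48)]

calendar_day = window_extremes(readings, midnight)
observer_day = window_extremes(readings, midnight + timedelta(hours=8))
print("midnight to midnight:", calendar_day)
print("8 AM to 8 AM:        ", observer_day)
```

The same hourly record yields different daily extremes for the two windows, and in the calendar-day case the maximum falls at midnight, the situation noted above.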

Automated 12 hour data summaries serve to alert the scientist to departures
from the normal pattern of minimum temperature in the morning and maximum in
the afternoon. Data on time of occurrence of extremes can be a useful
addition to the detailed records. It adds corroborating detail, and is
useful in the calibration and interpretation of thermograph charts.

Setting a policy on data quality is difficult because the person in charge of
data collection usually does not know how the data will be used. On the
other hand the data user frequently has no control over the quality of the
data available. Data quality is supervised by the U.S. Weather Service for
cooperative stations in its network. Supervision of data quality at stations
not a part of this system should be done by the sponsoring agency.

Two examples of inaccuracy in the October data are a mean maximum temperature
of 86.2°F not 86.5°F, and a maximum temperature on October 8 of 90°F not 85°F.
The former probably would be of little importance in most research. The
latter error is larger than usually is acceptable.

The primary sources of error and disagreement illustrated in this study were:
(1) manual observations taken too early in the morning resulting in a
resetting of the minimum thermometer sometimes lower than the morning minimum
of the next day, (2) observations covering a span of 48 hours or more with
incorrect assignment of the extreme to a specific 24 hour period, (3) using
data from one site to represent conditions at another site without prior
calibration, and (4) differences in beginning and ending times of 24 hour
periods. The latter manifested itself primarily in a PM minimum temperature
recorded by a data logger being both lower than the AM minimum recorded on
the thermometer and higher than the minimum recorded on the thermometer the
next day.


Collection of meteorological data on a daily basis requires an appropriate
mix of trained personnel and suitable equipment. Automation of max-min
temperature data collection has the primary advantage of personnel
scheduling. Data collection or quality control can be scheduled for a
convenient time once a week, or once a month, etc. whatever best fits the
program requiring the data. A back-up for such a system is also needed, and
a thermograph could be used in this situation. Automation saves time if the
system works with few problems. Computerized data acquisition and management
can save the most in labor input. However, an automated system solely for
the collection of daily maximum and minimum temperatures and rainfall is
probably not cost effective. The start-up costs in labor and equipment are
high, as are maintenance costs. Recovery of missing data can be a time
consuming problem. The automated station can be indispensable for the
collection of detailed data on a variety of parameters. This report has
discussed some of the technical matters to be considered when planning
weather station operation.
