Citation
Statistical and Mechanistic Analysis of Bacterial Water Quality to Evaluate and Inform Food Safety Agricultural Water Regulations

Material Information

Title:
Statistical and Mechanistic Analysis of Bacterial Water Quality to Evaluate and Inform Food Safety Agricultural Water Regulations
Creator:
Vazquez, Kathleen M
Place of Publication:
[Gainesville, Fla.]
Florida
Publisher:
University of Florida
Publication Date:
Language:
English
Physical Description:
1 online resource (74 p.)

Thesis/Dissertation Information

Degree:
Master's (M.S.)
Degree Grantor:
University of Florida
Degree Disciplines:
Agricultural and Biological Engineering
Committee Chair:
MUNOZ-CARPENA,RAFAEL
Committee Co-Chair:
GAO,BIN
Committee Members:
HAVELAAR,ARIE HENDRIK
DANYLUK,MICHELLE D

Subjects

Subjects / Keywords:
agricultural-water -- food-safety -- modeling -- water -- water-quality
Agricultural and Biological Engineering -- Dissertations, Academic -- UF
Genre:
bibliography ( marcgt )
theses ( marcgt )
government publication (state, provincial, territorial, dependent) ( marcgt )
born-digital ( sobekcm )
Electronic Thesis or Dissertation
Agricultural and Biological Engineering thesis, M.S.

Notes

Abstract:
Irrigation water is considered a major pathway to fresh produce for foodborne illness related pathogens. The Food Safety Modernization Act (FSMA) specifies sampling-based methods using Escherichia coli as an indicator organism and the microbial criteria, geometric mean (GM) and statistical threshold value (STV), to regulate agricultural water. We use an extensive dataset on levels of E. coli and other fecal indicator organisms as well as presence or absence of Salmonella and physicochemical parameters in six agricultural irrigation ponds in West Central Florida to evaluate the empirical and theoretical basis of this rule. We find high variability of (log-transformed) E. coli counts, with standard deviations exceeding those assumed in the rule up to threefold. Because of this high variability, twenty samples are insufficient to characterize the bacteriological quality of irrigation ponds, and a rolling dataset using five samples per year to update GM and STV values results in highly uncertain results and delays in detecting a shift in water quality. In these ponds, E. coli was an adequate predictor for the presence of Salmonella in 150 ml samples, with turbidity as a second significant variable. The FSMA agricultural water regulations also lack preventive measures, which are important for controlling the processes that introduce bacterial contamination into water sources. This study develops a simple mechanistic model to predict the microbial quality of agricultural water. This model proved useful to simulate data from a highly variable surface water irrigation pond. The performance of the model was similar or superior to existing pathogen transport models, with a Nash-Sutcliffe efficiency of 0.574 when incorporating observed values uncertainty. Global sensitivity analysis is then used to reveal the most important processes controlling bacterial water quality criteria: aquatic removal rate of bacteria for GM, and bacterial source and transport dynamics for STV.
It was also found that peak E. coli concentration events were mechanistically driven by rainfall/runoff processes. From these findings, we suggest preventive measures enhancing die-off rates through treatment after large runoff-producing rainfall events. Bacterial source characteristics such as wildlife population should be controlled in instances where STV exceeds regulatory limits. Vegetative filter strips, when properly designed and maintained, may also provide an opportunity to mitigate bacterial transfer into the agricultural waters by reducing runoff flow and settling particulate pollutants. ( en )
General Note:
In the series University of Florida Digital Collections.
General Note:
Includes vita.
Bibliography:
Includes bibliographical references.
Source of Description:
Description based on online resource; title from PDF title page.
Source of Description:
This bibliographic record is available under the Creative Commons CC0 public domain dedication. The University of Florida Libraries, as creator of this bibliographic record, has waived all rights to it worldwide under copyright law, including all related and neighboring rights, to the extent allowed by law.
Thesis:
Thesis (M.S.)--University of Florida, 2017.
Local:
Adviser: MUNOZ-CARPENA,RAFAEL.
Local:
Co-adviser: GAO,BIN.
Electronic Access:
RESTRICTED TO UF STUDENTS, STAFF, FACULTY, AND ON-CAMPUS USE UNTIL 2018-06-30
Statement of Responsibility:
by Kathleen M Vazquez.

Record Information

Source Institution:
UFRGP
Rights Management:
Applicable rights reserved.
Embargo Date:
6/30/2018
Classification:
LD1780 2017 ( lcc )


Full Text

STATISTICAL AND MECHANISTIC ANALYSIS OF BACTERIAL WATER QUALITY TO EVALUATE AND INFORM FOOD SAFETY AGRICULTURAL WATER REGULATIONS

By

KATHLEEN M VAZQUEZ

A THESIS PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE

UNIVERSITY OF FLORIDA
2017

© 2017 Kathleen M. Vazquez


To my Mwam Gregg, and Papa

ACKNOWLEDGMENTS

I would like to acknowledge my advisors, Dr. Rafael Muñoz-Carpena and Dr. Arie Havelaar, for their thoughtful encouragement, patience, and insight. This work would not be possible without them. I would also like to thank my parents and grandfather for putting up with me almost getting kicked out of every institution I have ever attended and paying for my education anyway. I would like to thank Natalie Nelson for mentoring me and encouraging me to pursue graduate school. Who knows where I would be without your support and advice. I would like to thank all my wonderful friends for their encouragement, feedback, and moral support during this year and a half. I am grateful to have been in classes with so many smart and inspiring women. I would also like to thank Marena Smith for being a great distraction and climbing partner. Finally, I would like to thank my partner, Jackson Smith, for making me sandwiches and doing the dishes and staying up late making me laugh when it was most needed. I could not have done it without you.

TABLE OF CONTENTS

ACKNOWLEDGMENTS
LIST OF TABLES
LIST OF FIGURES
LIST OF OBJECTS
LIST OF ABBREVIATIONS
ABSTRACT

CHAPTER

1 INTRODUCTION
    Regulatory Process and Background
    Bacterial Transport Modeling and Analysis
    Objectives

2 EVALUATING THE U.S. FSMA PRODUCE SAFETY RULE STANDARD FOR MICROBIAL QUALITY OF AGRICULTURAL WATER FOR PRODUCE GROWING
    Materials and Methods
        Data
        Statistical methods
        Testing suitability of lognormal distributions to fit empirical data
        Testing sufficiency of 20 samples to characterize the pond microbial water quality
        Testing MWQP response time to shifts in pond microbial water quality
        Evaluating E. coli concentrations as predictors of Salmonella presence/absence
    Results
    Discussion

3 A SIMPLE MECHANISTIC BACTERIAL TRANSPORT MODEL TO INFORM FOOD SAFETY REGULATIONS OF AGRICULTURAL WATER QUALITY
    Materials and Methods
        Data
        Conceptual Model and Computer Model Structure
            Bacterial Source Characteristics
            Hydrology/Landscape Characteristics
            Aquatic Removal
        Model Fitness and Parameterization
        Global Sensitivity Analysis
    Results and Discussion
        Model Goodness of Fit
        Global Sensitivity Analysis
    Final Remarks

4 CONCLUSION
    Main Findings
    Limitations and Future Research
    Broader Impacts

APPENDIX: MARKDOWN FILES
LIST OF REFERENCES
BIOGRAPHICAL SKETCH

LIST OF TABLES

2-1 Standard statistics of the lognormal (base e) distribution of E. coli concentrations
2-2 Parameters of statistical distributions fitted to E. coli concentration data
2-3 Geometric Mean (P50) and Statistical Threshold Value (P90) for the full data set
2-4 GM and STV values of 15 subsets
2-5 Logistic regression for Salmonella presence or absence in relation to concentrations of indicator bacteria and physicochemical parameters
3-1 Table of model parameters
3-2 Optimized values of each unknown parameter
3-3 Results of FitEval analysis considering three levels of uncertainty

LIST OF FIGURES

2-1 Conceptual diagram of simulation model used to evaluate responsiveness of microbial criteria to shifts in water quality
2-2 Graphical evaluation of goodness of fit of different distributions
2-3 GM and STV ratios
2-4 Evaluation of responsiveness of produce safety rule to shifts in E. coli concentration
2-5 Probability of Salmonella presence in 150 ml of pond water as a function of the concentration of E. coli and (power transformed) turbidity
3-1 Conceptual diagram of the study site and model processes
3-2 Model structure diagram
3-3 Time series plot of observed and simulated bacterial concentrations
3-4 Cumulative distribution function (CDF) of the NSE under the three conditions evaluated with FitEval
3-5 Results of Morris method of global sensitivity analysis
3-6 Results of Sobol method of global sensitivity analysis

LIST OF OBJECTS

A-1 Chapter 2 R Markdown
A-2 Chapter 3 R Markdown

LIST OF ABBREVIATIONS

AIC    Akaike Information Criterion
AMC    Antecedent Moisture Condition
CN     Curve Number
FSMA   Food Safety Modernization Act
GM     Geometric Mean
GSA    Global Sensitivity Analysis
HACCP  Hazard Analysis and Critical Control Point
HSPF   Hydrologic Simulation Program-Fortran
MC     Microbial Criteria
MLE    Maximum Likelihood Estimation
MM     Method of Moments
MPN    Most Probable Number
NSE    Nash-Sutcliffe Efficiency
PSR    Produce Safety Rule
RMSE   Root Mean Square Error
STV    Statistical Threshold Value
SWAT   Soil and Water Assessment Tool

Abstract of Thesis Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Master of Science

STATISTICAL AND MECHANISTIC ANALYSIS OF BACTERIAL WATER QUALITY TO EVALUATE AND INFORM FOOD SAFETY AGRICULTURAL WATER REGULATIONS

By Kathleen M. Vazquez

December 2017

Chair: Rafael Muñoz-Carpena
Major: Agricultural and Biological Engineering

Irrigation water is considered a major pathway to fresh produce for foodborne illness related pathogens. The Food Safety Modernization Act (FSMA) specifies sampling-based methods using Escherichia coli as an indicator organism and the microbial criteria, geometric mean (GM) and statistical threshold value (STV), to regulate agricultural water. We use an extensive dataset on levels of E. coli and other fecal indicator organisms as well as presence or absence of Salmonella and physicochemical parameters in six agricultural irrigation ponds in West Central Florida to evaluate the empirical and theoretical basis of this rule. We find high variability of (log-transformed) E. coli counts, with standard deviations exceeding those assumed in the rule up to threefold. Because of this high variability, twenty samples are insufficient to characterize the bacteriological quality of irrigation ponds, and a rolling dataset using five samples per year to update GM and STV values results in highly uncertain results and delays in detecting a shift in water quality. In these ponds, E. coli was an adequate predictor for the presence of Salmonella in 150 ml samples, with turbidity as a second significant variable.

The FSMA agricultural water regulations also lack preventive measures, which are important for controlling the processes that introduce bacterial contamination into water sources. This study develops a simple mechanistic model to predict the microbial quality of agricultural water. This model proved useful to simulate data from a highly variable surface water irrigation pond. The performance of the model was similar or superior to existing pathogen transport models, with a Nash-Sutcliffe efficiency of 0.581 when incorporating observed values uncertainty. It was found that peak E. coli concentration events were mechanistically driven by rainfall/runoff processes. From these findings, management activities should focus on high rainfall events. Vegetative filter strips, when properly designed and maintained, may provide an opportunity to mitigate bacterial transfer via runoff into the agricultural waters by reducing flow and settling particulate pollutants. Global sensitivity analysis was used to reveal the most important processes controlling bacterial water quality criteria: aquatic removal rate of bacteria for GM, and bacterial source and transport dynamics for STV. This suggests that when transport via runoff cannot be adequately mitigated, bacterial source characteristics such as wildlife population should be controlled. This will regulate STV values, decreasing transfer risk from events of high concentration.

CHAPTER 1
INTRODUCTION

The Food Safety Modernization Act attempts to establish a new paradigm for food safety regulations in the U.S. based on prevention rather than reaction. The Standards for the Growing, Harvesting, Packing, and Holding of Produce for Human Consumption, or produce safety rule, regulates produce production in an effort to prevent introduction of foodborne illness related pathogens at every step [1]. Growing, harvesting, post-harvest handling, and processing all fall under this rule, with criteria specified for factors that could affect food safety. One production area regulated under this rule is agricultural water. Agricultural water falls under two categories according to the produce safety rule: water involved in direct contact with produce, and water used during growing activities. The former, used for growing sprouts, hand washing, or produce washing or cooling, must be kept free of detectable indicator bacteria. The latter applies to produce other than sprouts. These are the regulations discussed in this work.

Regulatory Process and Background

Water sources used for irrigation of produce must be monitored through periodic sampling and testing for concentration of the indicator organism, Escherichia coli. Water quality regulations frequently use indicator organisms to regulate microbial water quality due to the threat of multiple pathogens and the difficulty and costs associated with enumerating said pathogens [1]. Enumerating a single pathogen will not indicate the likelihood that other pathogens are present. Additionally, testing for even a single pathogen can be cost prohibitive for many growers. Indicator organisms are the common solution to this problem. They provide a lower cost testing strategy and can be chosen to correlate with presence of pathogens of interest. E. coli is chosen for this regulation to signify the presence of fecal contamination. Pathogens of fecal origin are the primary concern of growers using surface waters in rural areas to irrigate produce. This is due to both animal presence (livestock and/or wildlife) and application of biological soil amendments of animal origin in the land areas that drain to these surface water sources.

To evaluate the level of microbial contamination in these surface water sources, a sampling-based approach is specified by the produce safety rule (PSR). The first step required is to establish a microbial water quality profile, or MWQP. This profile consists of 20 water samples, measured for concentrations of E. coli, taken over 2-4 years. These 20 values are updated annually with 5 new water samples from the most recent year. With the addition of these new samples, the 5 oldest water samples are removed, creating a rolling sampling scheme that always contains at least 20 samples from the most recent 4 years. This is the minimum sampling scheme required; while growers are welcome to do more, costs of the required methods will likely limit the amount of excess sample inclusion.

From this sample profile, microbial criteria must be calculated annually and maintained below limits established in the produce safety rule. The microbial criteria are the geometric mean, or GM, and the statistical threshold value, or STV. These are the 50th and 90th percentiles, respectively, of a lognormal distribution fit to the data using the method of moments. These two values are used to capture the average concentration and any peak events that may occur, as both can contribute to food safety risks. According to the PSR, the GM must be maintained below 126 CFU/100 ml and the STV must be maintained below 410 CFU/100 ml. These water quality regulations are based on recreational water standards established by the EPA.
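As a minimal sketch of the calculation described above, the following Python snippet computes the GM and STV of a water quality profile by the method of moments and compares them with the PSR limits. The function names, the use of the sample (n-1) standard deviation, and the choice to treat a value exactly at the limit as compliant are assumptions made for illustration; none of them comes from the rule text or from the thesis code, which was written in R.

```python
import math
import statistics

GM_LIMIT = 126.0   # CFU/100 ml, PSR limit quoted in the text
STV_LIMIT = 410.0  # CFU/100 ml, PSR limit quoted in the text

# Standard normal quantile for the 90th percentile (about 1.2816).
Z90 = statistics.NormalDist().inv_cdf(0.90)


def microbial_criteria(counts):
    """Return (GM, STV) for a list of positive E. coli counts.

    Method of moments: take the mean and standard deviation of the
    log-transformed counts, then back-transform the 50th and 90th
    percentiles of the fitted lognormal distribution.
    """
    logs = [math.log(c) for c in counts]
    mu = statistics.fmean(logs)
    sigma = statistics.stdev(logs)  # sample SD (n-1); an assumption
    gm = math.exp(mu)
    stv = math.exp(mu + Z90 * sigma)
    return gm, stv


def is_compliant(counts):
    """True when both criteria are at or below the PSR limits (assumed)."""
    gm, stv = microbial_criteria(counts)
    return gm <= GM_LIMIT and stv <= STV_LIMIT


if __name__ == "__main__":
    # Hypothetical 20-sample profile, CFU/100 ml (made-up numbers).
    profile = [12, 30, 45, 8, 150, 60, 22, 95, 310, 17,
               40, 75, 5, 210, 33, 58, 120, 9, 64, 27]
    gm, stv = microbial_criteria(profile)
    print(f"GM = {gm:.1f}, STV = {stv:.1f}, compliant = {is_compliant(profile)}")
```

The base of the logarithm does not affect the GM or STV as long as the same base is used for the back-transformation.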
While these standards are said to reflect new scientific knowledge, they have been largely unchanged since the 1986 Ambient Water Quality Criteria for Bacteria was published [2]. In the 1986 rule, recreational water microbial limits were established for several indicator organisms, including E. coli. As in the PSR, two criteria were used to evaluate the mean concentration and peak concentration events. The GM is used in both regulations, but in place of the STV, the 1986 rule uses a single sample maximum. The 1986 rule details the background studies and calculations that inform these limits. These studies involved establishing correlations between indicator bacteria levels in recreation waters and rates of illness of the bathers in the days and weeks afterward, as compared with non-bathers. GM limits are then calculated from that correlation and a target maximum acceptable illness rate. This GM limit is the same as used in the PSR, 126 CFU/100 ml. The single sample maximum is equal to a percentile of a lognormal distribution with mean equal to the GM limit and a standard deviation of 0.4. The percentile used depends on the use of the water, varying between the 75th percentile for a designated beach area and the 95th percentile for water infrequently used for full body contact. The 90th percentile (the value used for lightly used full body contact recreation water) is equal to the STV limit in the PSR, 410 CFU/100 ml [2].

These connections raise some questions, including: if the microbial criteria limits were established for lightly used full body contact recreation water with E. coli concentrations having a standard deviation of 0.4, how well do they control the microbial quality of highly variable surface water used for the irrigation of produce? More specifically, this rule makes assumptions in specifying a sampling scheme, in calculating microbial criteria, and in the selection of an indicator organism. This work explores these assumptions through statistical analysis of an extensive data set. Six surface water irrigation ponds in West Central Florida were studied throughout two growing seasons in a previously published study [3]. This work uses data collected from these ponds to explore how the sampling scheme, microbial criteria, and indicator organism used influence the effectiveness and efficiency of the current regulations.

This regulatory process for agricultural water also lacks preventive measures. For noncompliant water sources, several options are available. Farmers may: (a) identify and control contamination sources or treat water to bring water sources back into compliance; (b) implement a waiting period between irrigation and harvest given a die-off of 0.5 log10 per day for up to four days; (c) implement a waiting period between harvest and end of storage given established die-off rates; or (d) implement methods reducing microbial content (such as washing) [4]. These methods do not specify prevention as a principle of management. Only one of the control measures seeks to control the source of contamination, and this is only to be implemented in the event of a noncompliant source. Compliant sources do not have any preventive recommendations.

A formal system of preventive food safety controls commonly used is the Hazard Analysis and Critical Control Point, or HACCP, system. HACCP principles include conducting a detailed hazard analysis and identifying critical control points [5]. Analyzing the hazards in a system requires developing a detailed understanding of the processes involved. From this understanding, critical control points become evident. These are processes or points in the system where preventive measures can prevent the introduction of food safety risk. These principles could be reflected in agricultural water quality regulations by establishing a process-based understanding of the microbial quality of the water sources and using that understanding to identify preventive measures to be implemented.

Bacterial Transport Modeling and Analysis

One way to provide this process-based understanding is mechanistic bacterial transport modeling. Models that mechanistically predict bacterial concentrations in surface water have been used in many locations with varying results and methods [6]-[8]. One consistent challenge in modeling bacterial concentrations in many surface waters is high variability [8]. Mechanistic models often struggle to predict very high peak events in bacterial concentrations. These peaks are commonly attributed to the stochastic process of deposition by wildlife or livestock directly into surface waters; if they are instead mechanistically driven, predicting them could aid in regulating water sources on a temporal basis. This is particularly important in regulating water used for irrigation of produce, as many of these surface water sources display high variability [3].

Mechanistic models also provide the opportunity for further analysis to reveal mitigation strategies. Global sensitivity analysis (GSA) is one method for understanding which of many parameters are the most important for explaining the predicted values. In this work, a simple mechanistic bacterial transport model is developed and fitted to the data from one highly variable irrigation pond. This model is then analyzed using GSA to identify important process parameters. These process parameters are used to recommend strategies to mitigate the accumulation, transport, or removal of bacteria in these surface waters.

Objectives

The purpose of this research is to use statistical methods to analyze the current food safety regulations concerning agricultural water quality and to use bacterial transport modeling to inform said regulations. The specific objectives, addressed in Chapters 2 and 3, are:

- To use an extensive data set to evaluate (a) the number of samples required, (b) the sampling schedule, (c) the microbial criteria calculation methods, and (d) the indicator organism selection of the FSMA produce safety rule
- To develop a mechanistic bacterial transport model for a highly variable surface water source and to analyze this model using GSA to inform improved preventive food safety regulations
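The single sample maximum reasoning in the Regulatory Process and Background section can be checked numerically. The sketch below takes a percentile of a lognormal distribution whose geometric mean is the GM limit. It assumes that the standard deviation of 0.4 is expressed in base-10 log units; the text does not state the base, but this choice reproduces the 410 CFU/100 ml figure. The function and constant names are illustrative.

```python
from statistics import NormalDist

GM_LIMIT = 126.0  # CFU/100 ml, GM limit quoted in the text
LOG10_SD = 0.4    # standard deviation from the 1986 criteria; base 10 assumed


def single_sample_maximum(percentile: float) -> float:
    """Percentile of a lognormal distribution centred on the GM limit."""
    z = NormalDist().inv_cdf(percentile)  # standard normal quantile
    return GM_LIMIT * 10 ** (z * LOG10_SD)


# 90th percentile, the case the text links to the STV limit.
print(round(single_sample_maximum(0.90)))  # prints 410
```

The same function can be called with 0.75 or 0.95 to explore the other use categories mentioned in the text.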

CHAPTER 2
EVALUATING THE U.S. FSMA PRODUCE SAFETY RULE STANDARD FOR MICROBIAL QUALITY OF AGRICULTURAL WATER FOR PRODUCE GROWING

Standards for the Growing, Harvesting, Packing, and Holding of Produce for Human Consumption were published on November 27, 2015 as part of the implementation of the Food Safety Modernization Act [4]. The rule establishes science-based minimum standards for important factors that affect produce safety, including worker training, health and hygiene, agricultural water, biological soil amendments, domesticated and wild animals, and equipment, tools, and buildings. Agricultural water is recognized as a key source of contamination of produce, and the rule stipulates that all water must be safe and of adequate sanitary quality for its intended use. To ensure such safety, inspection is required of all systems, at least annually, for likely introduction of hazards, as well as adequate maintenance of facilities to reduce potential for contamination of produce. Generic Escherichia coli is used as a measure to assess the potential for fecal contamination, even though the rule notes that E. coli levels do not directly predict the presence of pathogens. When using surface water irrigation during growing activities for covered produce (other than sprouts) using a direct water application method, a microbial water quality profile (MWQP) must be established by analyzing a minimum of 20 samples over 2-4 years for E. coli. Samples should be representative of use and taken as close to, but prior to, harvest. The geometric mean (GM, the antilogarithm of the median or 50th percentile of a lognormal distribution) concentration of E. coli should not exceed 126 CFU per 100 ml, and the statistical threshold value (STV, the antilogarithm of the 90th percentile of a lognormal distribution) should not exceed 410 CFU per 100 ml. If the water does not meet these microbial criteria (MC), its use should be discontinued, or the grower can perform corrective measures.
These corrective measures include re-inspecting the source to identify and control sources of hazards, or treating the water to comply with the criteria. Treatment must be adequate to make water safe, including meeting the microbial criteria, and must be delivered consistently and monitored at an adequate frequency. Alternatively, other corrective measures may be applied, including: a) a time interval between irrigation and harvest, assuming a die-off of 0.5 log10 per day for up to four days; b) time between harvest and end of storage, based on established die-off; or c) processes achieving microbial reductions (such as washing). After the initial survey, the MWQP must be updated by analyzing a minimum of five samples per year and using a rolling data set of at least 20 samples within the previous 4 years. Water use should be modified if microbial criteria are not met, and a new water quality profile is required if significant changes in land use occur.

The rule uses the EPA recreational water quality criteria as a starting point, and does not specifically consider data from agricultural ponds. To provide a basis for evaluating the MC using data from actual agricultural surface waters, Topalcengiz et al. [3] collected data on levels of E. coli and other fecal indicator organisms as well as presence or absence of Salmonella and physicochemical parameters in agricultural waters in West Central Florida. We use this extensive dataset to evaluate the empirical and theoretical basis of the PSR MC for bacteriological quality of agricultural water. Specifically, the objectives of this work are to investigate: a) how well lognormal distributions fit the empirical data and how estimates of GM and STV are affected by censoring of data due to detection limits of the microbial tests; b) if 20 samples are sufficient to adequately characterize the bacteriological quality of irrigation ponds, given the observed variability in counts; c) how sensitive the MC is to shifts in water quality; and d) the predictive ability of E. coli for presence/absence of Salmonella.
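The irrigation-to-harvest waiting period mentioned among the corrective measures lends itself to a short worked example. The sketch below applies the stated die-off of 0.5 log10 per day, capped at four days, to a GM or STV value. Applying the reduction to the calculated criteria in this way, the function names, and returning `None` when four days are not enough are assumptions made for illustration.

```python
import math

DIE_OFF_LOG10_PER_DAY = 0.5  # die-off rate quoted in the text
MAX_DAYS = 4                 # maximum interval quoted in the text


def adjusted_value(value: float, days: int) -> float:
    """Value (CFU/100 ml) after `days` of die-off, capped at MAX_DAYS."""
    effective_days = min(days, MAX_DAYS)
    return value / 10 ** (DIE_OFF_LOG10_PER_DAY * effective_days)


def days_needed(value: float, limit: float):
    """Smallest whole number of days that brings `value` to `limit` or below.

    Returns None when even MAX_DAYS of die-off is insufficient.
    """
    if value <= limit:
        return 0
    days = math.ceil(math.log10(value / limit) / DIE_OFF_LOG10_PER_DAY)
    return days if days <= MAX_DAYS else None


if __name__ == "__main__":
    # Hypothetical STV of 1000 CFU/100 ml against the 410 limit.
    print(days_needed(1000.0, 410.0), adjusted_value(1000.0, 1))
```

With the four-day cap, the largest reduction available is 2 log10, that is, a factor of 100.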


Materials and Methods

Data

Full details of sampling sites and methods are provided in [3]. Briefly, six agricultural ponds in West Central Florida were sampled during the November 2012 to May 2013 and October 2013 to June 2014 growing seasons. Samples were collected weekly, every other day after rain events of 2.0 cm or more within 24 h, and after intensive surface water usage due to the initial planting stage or freeze protection. A total of 540 samples were collected (90 per pond). Turbidity, temperature (air and water), pH, and oxidation-reduction potential (ORP) were measured on site using portable equipment. Indicator bacteria were quantified by most probable number (MPN) methods using commercially available kits to comply with the proposed rule of 2013 [9]. In the final rule, FDA has specified EPA method 1603 to quantify E. coli as colony forming units (CFU)/100 ml. However, the Final Produce Safety Rule [4] permits water testing methods that are equivalent in precision, accuracy, and sensitivity to EPA method 1603, and the MPN method used for data collection here has been EPA validated and approved as equivalent to method 1603 [10]. Salmonella presence/absence in 150 ml was detected by pre-enrichment, enrichment, and detection of the invA gene by PCR.

Statistical methods

All statistical analysis was done in R version 3.3.1 [11]. Data were provided in Excel sheets and imported in R using the XLConnect package. One turbidity measurement was missing for Pond 6 and was replaced by the mean of observed values for this pond. The data are provided as an R object in the Supplementary Materials.

Testing suitability of lognormal distributions to fit empirical data

The PSR suggests fitting a lognormal distribution to the E. coli count data using the method of moments (MM), by calculating the mean and standard deviation of log-transformed


counts. All transformations were based on natural (base e) logarithms. Left censoring was handled by replacing all non-detect values by half the limit of detection (i.e. 0.5 MPN / 100 ml). We also fitted truncated lognormal and several alternative distributions using maximum likelihood estimation (MLE) with the R package fitdistrplus. In addition to the lognormal distribution, we fitted Gamma, Weibull, generalized Pareto and Fréchet distributions (the latter two distributions are available in the actuar R package). The Gamma and Weibull distributions are commonly used in microbial risk assessment to model concentration data [12]. The generalized Pareto distribution and the Fréchet distribution are examples of extreme value distributions, which are used to model datasets with heavier tails. Final distributions were selected based on the Akaike Information Criterion (AIC) and by visual inspection of plots of the fitted and empirical cumulative frequency distributions.

Testing sufficiency of 20 samples to characterize the pond microbial water quality

To evaluate the PSR, subsets of 20 samples from the total set of 90 samples per pond were selected either randomly (without replacement) or evenly spaced, by selecting every fourth sample. GM and STV were calculated for each subset of samples.

Testing MWQP response time to shifts in pond microbial water quality

To study the effects of possible shifts in water quality, we simulated alternative timelines composed of 2 distinct periods after initial establishment of the microbial water quality profile (MWQP) of a compliant pond: a baseline period of four sets of samples with no changes in water quality, followed by scenarios of biological water quality degradation represented by shifts to non-compliant underlying E. coli distributions for the next 10 sets, as demonstrated in Figure 1.
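The method-of-moments calculation described under "Testing suitability of lognormal distributions" can be sketched as follows. This is a minimal illustration, not the analysis code used in this work (which was written in R). It assumes that the STV is the 90th percentile of the fitted lognormal distribution, computed with z = 1.282, that the sample (n - 1) standard deviation is used, and that non-detects are flagged as `None`. The function names and the example values are hypothetical.

```python
import math
import statistics

Z90 = 1.282  # standard normal 90th percentile (assumed multiplier for the STV)


def gm_stv(counts, lod=1.0):
    """Method-of-moments GM and STV from counts in MPN/100 ml.

    Non-detects are passed as None and replaced by half the limit of
    detection (0.5 MPN/100 ml when lod = 1.0), as described in the text.
    """
    logs = [math.log(lod / 2.0 if c is None else c) for c in counts]
    mu = statistics.mean(logs)      # mean of natural-log counts
    sigma = statistics.stdev(logs)  # sample standard deviation of natural-log counts
    return math.exp(mu), math.exp(mu + Z90 * sigma)


def lognormal_from_criteria(gm, stv):
    """Invert the calculation: log-scale mean and SD implied by a GM and STV."""
    mu = math.log(gm)
    sigma = (math.log(stv) - mu) / Z90
    return mu, sigma


# Illustrative (made-up) sample with two non-detects.
sample = [None, 2.0, 5.2, 13.5, 110.0, None, 45.0, 8.6]
gm, stv = gm_stv(sample)

# Log-scale parameters of a pond that just meets the thresholds (GM 126, STV 410).
mu_limit, sigma_limit = lognormal_from_criteria(126.0, 410.0)
```

With these assumptions, `lognormal_from_criteria(126.0, 410.0)` gives values that round to 4.84 and 0.92, the log-scale mean and standard deviation quoted in the Discussion for a pond that just meets the thresholds.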
For the compliant period, we simulated a pond that would meet the GM and STV threshold values consistently, even when accounting for sampling variation. For this purpose, we arbitrarily selected GM = 60 and STV = 191, represented by a lognormal distribution with a log_e mean of ln(60) ≈ 4.09 and a log_e standard deviation of (ln 191 - ln 60)/1.282 ≈ 0.90. We drew 20 samples from this lognormal distribution, representing the initial sampling, to set the MWQP and calculated the sample GM1 and STV1 (Figure 1). To establish the baseline of GM and STV in this compliant pond, we simulated 4 subsequent sets of five samples by randomly sampling sets of 5 samples/year. For each new set, we calculated the sample GM and STV of the 20 most recent samples, i.e. we replaced the oldest 5 samples with the new 5 samples every year (GM2 to GM5 and STV2 to STV5 in Figure 1). Then, four scenarios representing a shift in water quality after the baseline period were simulated by increasing either the mean or the standard deviation of the lognormal distribution. The shifts in the mean were chosen to represent scenarios where the theoretical GM and STV would be exceeded either marginally or by a larger margin. Ten sets of five samples from the shift scenario distributions were added to the dataset. GM and STV of the 20 most recent samples were recalculated for each set, i.e. we again replaced the oldest 5 samples with the new 5 samples for every set (GM6 to GM15 and STV6 to STV15 in Figure 1). This procedure (compliant MWQP/baseline + non-compliant shift scenario) was repeated five times to evaluate random variability in sampling results.

Evaluating E. coli concentrations as predictors of Salmonella presence/absence

Relationships between presence or absence of Salmonella and indicator bacteria or physicochemical parameters were analyzed by logistic regression analysis, using the R glm function. Turbidity values were highly skewed and were normalized by a Box-Cox power transformation: $y = (x^{\lambda} - 1)/\lambda$, where $y$ = transformed vector of turbidity measurements, $x$ = untransformed vector and $\lambda$ = power factor, optimized by maximum likelihood estimation using the R package car [13].
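The rolling-window simulation described under "Testing MWQP response time" can be sketched as below. This is a minimal illustration under stated assumptions rather than the original R code: counts are simulated directly on the natural-log scale, the STV uses z = 1.282, and the function names, seed and data structures are hypothetical. The baseline parameters follow from GM = 60 and STV = 191, and the example shift raises the log mean by 0.9 units, as in the Shift 1 scenario reported in the Results.

```python
import math
import random
import statistics

Z90 = 1.282  # assumed multiplier for the 90th percentile (STV)


def gm_stv_from_logs(log_counts):
    """GM and STV from natural-log counts (method of moments)."""
    mu = statistics.mean(log_counts)
    sigma = statistics.stdev(log_counts)
    return math.exp(mu), math.exp(mu + Z90 * sigma)


def simulate_timeline(rng, base, shift, n_baseline_sets=4, n_shift_sets=10):
    """Simulate one MWQP timeline; returns one (GM, STV) pair per set.

    base and shift are (mean, sd) pairs on the natural-log scale.
    Set 1 is the initial 20-sample MWQP; every later set replaces the
    oldest 5 samples in the 20-sample window with 5 new ones.
    """
    window = [rng.gauss(base[0], base[1]) for _ in range(20)]
    results = [gm_stv_from_logs(window)]
    for k in range(n_baseline_sets + n_shift_sets):
        mu, sigma = base if k < n_baseline_sets else shift
        new_samples = [rng.gauss(mu, sigma) for _ in range(5)]
        window = window[5:] + new_samples
        results.append(gm_stv_from_logs(window))
    return results


rng = random.Random(2016)  # arbitrary seed, for reproducibility only
base = (math.log(60.0), 0.90)         # compliant pond
shift_1 = (math.log(60.0) + 0.9, 0.90)  # mean shifted by 0.9 log_e units
timeline = simulate_timeline(rng, base, shift_1)

# Index of the first set whose rolling GM exceeds 126, or None if it never does.
first_gm_exceedance = next(
    (i + 1 for i, (gm, _) in enumerate(timeline) if gm > 126.0), None
)
```

Repeating `simulate_timeline` with different seeds gives a rough picture of how much the detection delay varies between sampling campaigns.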


Results

A lognormal distribution is an adequate description of the variability in E. coli counts, but may underestimate the occurrence of extreme events

Table 1 shows descriptive statistics of the log_e-transformed E. coli counts in ponds 1-6. Of interest for evaluating the appropriateness of a lognormal distribution are the skewness and kurtosis. A normal distribution has a skewness (a measure of the symmetry of the distribution) of 0 and a kurtosis (a measure of the peakedness of the distribution) of 3. The data show a skewness between 0.3 and 1.0, and a kurtosis between 2.5 and 3.1, indicating the distributions are more skewed and less peaked than a normal distribution. Because these results were obtained by replacing censored values with half the limit of detection, they may underestimate the spread of the distribution. Table 2 shows the mean and standard deviation of a truncated lognormal distribution as fitted by maximum likelihood estimation (MLE), which also provides uncertainty intervals of the estimates. The MLE estimate of the mean of the lognormal distribution is lower than the MM estimate (compare with Table 1), whereas the MLE estimate of the standard deviation is higher than the MM estimate, indicating that accounting for censoring does increase the spread of the fitted distributions. For all ponds, the Gamma and Weibull distributions resulted in poorer models of the data than the other distributions, as indicated by higher AIC values (Table 3). Comparisons of the lognormal and extreme value distributions varied between ponds, with the Fréchet distribution providing the best model for data from four ponds (1, 2, 3 and 6) and the lognormal and generalized Pareto distributions providing the best model for one pond each (5 and 4, respectively). Visual inspection of the fitted distributions (Figure 2) suggests that the tails of the distributions of ponds 1, 4, 5 and 6 are well fitted by a lognormal distribution, with the Fréchet


distribution overestimating the tails of these distributions. Ponds 2 and 3 have particularly heavy tails, which are fitted well by the Pareto and Fréchet extreme value distributions.

All ponds were in compliance with the MC, with varying levels of confidence. Table 3 shows GM and STV values for the full set of 90 samples per irrigation pond calculated according to the PSR. The GM (50th percentile) E. coli counts are around one order of magnitude below the threshold of 126 CFU per 100 ml, while there is high variability in the counts, with the STV (90th percentile) of pond 2 approaching the threshold of 410 CFU per 100 ml. Fitting a lognormal distribution to the data by MLE, while taking censoring into account, results in slightly lower values for GM, whereas the STV values are generally higher, with the STV of pond 2 now exceeding the threshold value. GM values for the extreme value distributions are similar to or slightly lower than those of the fitted lognormal distribution, while the STV values are generally higher. A notable exception is the STV for pond 2 using the Fréchet distribution, which is lower than even the STV calculated according to the PSR. Confidence intervals for GM and particularly STV are relatively wide, even for the full dataset of 90 samples per pond. The critical value for STV lies within the 95% confidence interval for ponds 2 and 4, implying that it cannot be concluded with a high level of confidence whether these ponds are in compliance.

20 samples are not sufficient to characterize the bacteriological quality of irrigation ponds

GM and STV for each subset of 20 randomly selected or evenly spaced samples from the full set of irrigation pond data (90 samples) are shown in Table 4; a graphical summary is provided in Figure 3. The GM value can be over- or underestimated by a factor of approximately three, whereas the STV value can be underestimated by a factor of 8, or overestimated by a factor of more than 4 for these ponds.
Thus our results demonstrate that taking only 20 samples from agricultural surface water is not sufficient for a reliable estimate of the E. coli concentration. As


expected, the variability in STV values is higher than in GM values, and the greater the variability in the observed counts, the greater the variability in the estimates based on 20 samples. The lowest variability was observed for pond 3, for which the standard deviation of the lognormal distribution was lowest at 1.66.

The MC detects shifts in water quality with delays

The theoretical means and standard deviations of the simulated pond and simulated shifts in water quality are shown in Table 5, and results of the simulated sampling campaigns are shown in Figure 4. In all simulated datasets, the results of the baseline pond are within the PSR criteria. Shifting the mean log concentration by 0.9 log_e units (2.5x) while keeping the log standard deviation constant (Shift 1) causes both the GM and STV of the pond to be non-compliant. A gradual increase in calculated GM and STV, based on consecutively adding sets of 5 new samples, is observed, with GM and STV exceeding the criteria 2-5 and 1-6 sets after the shift, respectively. When the mean log concentration is increased by 1.4 log_e units (4x) while keeping the log standard deviation constant (Shift 2), the increase in calculated GM and STV is somewhat more rapid, exceeding the criteria after 1-3 sets. Shifting the standard deviation of the simulated distributions while keeping the mean log concentration constant (Shifts 3 and 4) does not affect the theoretical GM, although there is some drift towards higher values in the simulated data due to occasional high counts. It takes between 3 and 6 sets before the STV is exceeded for a shift of the standard deviation by 0.7 log_e units (2.0x), while the response time is reduced to 1-5 sets for a shift of the standard deviation by 1.7 log_e units (5.5x).
All scenarios show considerable variability in simulated GM and particularly STV, emphasizing that when the water quality changes, it is not possible to obtain a reliable estimate of these statistics when using only 5 additional samples per set.


Salmonella presence or absence can be predicted by E. coli counts and turbidity

Out of 90 samples per pond, there were 4, 7, 2, 8, 2 and 3 samples from which Salmonella was isolated in ponds 1-6, respectively. Boxplots (see supplementary materials) suggest that in Salmonella-positive samples, all indicator bacteria occurred at higher concentrations than in Salmonella-negative samples, but that there were differences between ponds in the strength of the effects. For the physicochemical parameters, less consistent patterns were observed. The skewness and kurtosis of the Box-Cox transformed turbidity values (using a power factor of 0.33) were 0.22 and 3.38, respectively, indicating acceptable transformation to a normal distribution (theoretical skewness 0 and kurtosis 3). Results of univariate logistic regression are shown in Table 5 and confirm the significant effects of indicator bacteria, as well as a significant effect of turbidity. These significant factors were subsequently included in a multivariate model, with only E. coli and turbidity showing significant results. There was no interaction between these two variables (p = 0.51), and no significant effect when ponds were entered as a factor in the model (p between 0.13 and 0.53 for individual ponds). Turbidity has a strong effect on the predicted probability of Salmonella presence as a function of E. coli concentration, as illustrated in a graphical representation of the final model in Figure 5.

Discussion

Our results demonstrate a high variability of E. coli counts in irrigation ponds in West Central Florida. Lognormal distributions provide an acceptable fit to the data in most cases but, importantly, may underestimate the tails (extreme events) for ponds with high variability. In general, the observed variability is much higher than assumed in the PSR, which was based on observations in recreational surface waters.
A pond that would just meet the PSR thresholds would (on the log_e scale) have a mean of 4.84 and a standard deviation of 0.92. Clearly, the mean level of contamination observed in our ponds is lower than the theoretical mean, but the variation


in counts is considerably higher than assumed in the PSR, with standard deviations ranging between 1.66 and 2.78 (see Table 2). Fitting truncated lognormal distributions to the data using maximum likelihood estimation provides the opportunity to take censoring of the data into account without the need for decisions on replacing censored data by an arbitrary number, e.g. half the limit of detection. Our results demonstrate that the actual variability of counts is underestimated when censoring is not properly accounted for. The method of moments proposed in the PSR and implemented in several on-line tools¹ ² requires users to replace censored values by the limit of detection. This results in an overestimation of the GM and an underestimation of the STV, and further development of such tools to appropriately account for censored data is recommended.

We demonstrate that taking 20 samples is not sufficient to reliably characterize the water quality in the investigated agricultural ponds, due to the high variability of E. coli counts. Furthermore, the rolling update of the GM and STV by consecutively adding sets of 5 samples, and recalculation based on the most recent 20 samples, is shown to respond very slowly to shifts in water quality, with increases in E. coli levels being detected only after 1-6 sets, depending on the nature and the magnitude of the shift. Increasing the number of samples might address this problem, but will increase the costs of monitoring. The standard error of the mean of a sampling distribution is defined as the population standard deviation divided by the square root of the number of samples. This implies that to achieve the same statistical precision at different values of the population standard deviation, the number of samples needs to increase by the square of

¹ University of Arizona Cooperative Extension. FSMA Produce Safety Rule Online Calculator. Available at: http://agwater.arizona.edu/onlinecalc/ Accessed July 29, 2016.
² Western Center for Food Safety. 2016. Determining Your Microbiological Water Quality Profile (MWQP) for Untreated Surface Water Used in the Production of Fresh Produce, Version 4.0. Available at: http://ucfoodsafety.ucdavis.edu/files/229168.xlsx Accessed July 29, 2016.


the ratio of the population standard deviations. In our results, the observed (log) standard deviations were up to 3 times higher than the theoretical standard deviation of a pond that would just comply with the PSR. Hence, to obtain the same precision of the GM as aimed for in the PSR, 20 × 3² = 180 samples would be needed in a 2-4 year period. More data are needed from different production areas in the U.S. to further evaluate the variability of E. coli counts in actual agricultural waters. Draper et al. [14] recently published a survey on surface waters used for irrigation in Pennsylvania, and reported similar standard deviations of E. coli counts (2.19 and 2.76 on the log_e scale in two consecutive years, based on 94 and 59 samples from 33 and 21 farms, respectively).

Presence or absence of Salmonella in 150 ml pond water was significantly correlated with all three indicator bacteria measured in this survey (coliform bacteria, E. coli and enterococci) as well as with turbidity. In a multivariate regression model, only E. coli and turbidity were significant predictors of Salmonella presence. The distributions of both predictor variables were highly skewed to the right, indicating the occurrence of rare but extreme contamination events, during which the probability of Salmonella presence was also increased.

We do not believe that increasing the number of samples will be an effective and efficient way to control the microbial quality of agricultural waters. Instead, further data collection and analysis to understand the processes driving the variability of bacteriological quality of irrigation ponds is necessary. Such data may inform preventive or rapid corrective actions that may have a larger impact on produce safety than the current MC, which has considerable statistical limitations and will only lead to action after several years of data collection.
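The sample-size argument made earlier in this discussion (20 × 3² = 180) follows from holding the standard error of the log mean, sigma / sqrt(n), constant. The short sketch below restates that arithmetic; the function name `required_samples` is an assumption for this example, and the second call simply plugs in the largest fitted standard deviation (2.78) and the theoretical value (0.92) quoted in the text.

```python
import math


def required_samples(n_ref, sigma_ref, sigma_obs):
    """Samples needed to keep sigma / sqrt(n) equal to the reference case.

    Setting sigma_obs / sqrt(n) = sigma_ref / sqrt(n_ref) gives
    n = n_ref * (sigma_obs / sigma_ref) ** 2, rounded up to a whole sample.
    """
    return math.ceil(n_ref * (sigma_obs / sigma_ref) ** 2)


# The rounded case used in the text: a threefold higher standard deviation.
n_threefold = required_samples(20, 1.0, 3.0)

# Using the quoted values directly (pond 2 versus a pond just meeting the criteria).
n_pond2 = required_samples(20, 0.92, 2.78)
```

Because the required number of samples grows with the square of the standard deviation ratio, even moderate increases in variability make a fixed 20-sample profile much less precise.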
Such studies should include climate and other factors that may affect contamination, and should preferably be based on mechanistic rather than statistical models of bacterial transport into agricultural waters. This


will allow farmers to understand events that drive peak contamination levels and restrict the use of water or apply appropriate treatment after such events. Such events may occur more or less frequently and be driven by different factors in different production zones. This targeted risk management strategy will be more effective and more efficient than the corrective measures stipulated in the PSR, which requires farmers to either treat their water or apply additional storage time, regardless of the current water quality. Most of the time, the E. coli counts will be sufficiently low to allow use of the water without further restrictions, whereas during peak events, the limited treatment or reduction of bacterial levels on the produce will be insufficient to assure its safety. Farm-to-fork modeling of contamination events is necessary to better understand the impact of peak events on consumer risks and to decide when additional risk management strategies are needed.


Table 2-1. Standard statistics of the lognormal (base e) distribution of E. coli concentrations (MPN per 100 ml) in irrigation ponds

Site     Censored data¹   Mean    Standard deviation   Standard error   Skewness   Kurtosis
Pond 1   31               0.989   1.846                0.195            1.019      3.108
Pond 2   9                2.582   2.658                0.280            0.868      3.007
Pond 3   23               0.948   1.503                0.158            0.671      2.410
Pond 4   6                2.576   1.931                0.204            0.378      2.875
Pond 5   11               2.089   1.867                0.197            0.327      2.488
Pond 6   8                2.100   1.971                0.208            0.553      2.553

¹ Number of values below the level of detection, replaced by half the limit of detection (0.5 MPN / 100 ml)


Table 2-2. Parameters of statistical distributions fitted to E. coli concentration data in six irrigation ponds¹

Site     LN mean        LN SD        GD a         GD b             WB a         WB b
Pond 1   0.68(0.26)³    2.26(0.22)   0.20(0.03)   0.0090(0.0024)   0.39(0.04)   5.14(1.55)
Pond 2   2.50(0.30)     2.78(0.22)   -            -                0.32(0.02)   49.2(17.4)
Pond 3   0.86(0.18)     1.66(0.15)   0.38(0.06)   0.0435(0.0097)   0.55(0.05)   4.97(1.02)
Pond 4   2.56(0.21)     1.96(0.15)   0.31(0.04)   0.0032(0.0006)   0.48(0.04)   34.3(8.04)
Pond 5   2.05(0.21)     1.94(0.16)   0.33(0.04)   0.0071(0.0015)   0.50(0.04)   19.7(4.46)
Pond 6   2.07(0.22)     2.03(0.16)   0.29(0.04)   0.0042(0.0009)   0.46(0.04)   21.6(5.29)

Site     GPD a        GPD b        FD a         FD b         AIC
Pond 1   1.41(0.35)   1.60(0.28)   1.41(0.35)   1.60(0.28)   FD < GPD

Table 2-3. Geometric Mean (P50) and Statistical Threshold Value (P90) for the full data set (90 samples per pond), as computed by the method of moments according to the produce safety rule (PSR), or by fitting different distributions by maximum likelihood estimation, while taking censoring of data into account, with 95% confidence intervals in parentheses.

GM (P50)
Site     PSR   LN           GD           WB           GPD         FD
Pond 1   3     2 (1-3)      3 (1-5)      2 (1-4)      2 (1-3)     2 (1-2)
Pond 2   13    12 (8-20)    -            16 (9-29)    8 (5-12)    8 (5-15)
Pond 3   3     2 (2-3)      3 (2-4)      3 (2-4)      2 (2-3)     2 (2-3)
Pond 4   13    13 (8-21)    26 (16-38)   16 (10-29)   12 (8-18)   10 (7-14)
Pond 5   8     8 (5-10)     13 (8-22)    9 (6-14)     7 (5-11)    6 (5-9)
Pond 6   8     8 (5-10)     16 (10-25)   10 (7-16)    6 (5-11)    6 (4-9)

STV (P90)
Site     PSR   LN              GD              WB               GPD              FD
Pond 1   29    36 (17-78)      69 (33-118)     45 (25-79)       34 (16-69)       32 (15-63)
Pond 2   397   432 (193-865)   -               674 (290-1537)   470 (175-1479)   385 (138-1117)
Pond 3   18    20 (12-31)      25 (17-37)      22 (16-34)       20 (13-30)       20 (10-34)
Pond 4   156   159 (88-299)    287 (120-431)   196 (88-361)     154 (91-281)     227 (109-452)
Pond 5   88    93 (52-147)     136 (63-216)    106 (58-178)     92 (47-159)      126 (71-268)
Pond 6   102   106 (58-154)    207 (80-336)    133 (59-242)     114 (70-230)     117 (59-221)


Table 2-4. GM and STV values of 10 sets of 20 randomly sampled and 5 sets of 20 evenly spaced samples compared to the GM and STV values of the full set of 90 samples per pond.

           Pond 1        Pond 2         Pond 3        Pond 4        Pond 5        Pond 6
Samples    GM    STV     GM    STV      GM    STV     GM    STV     GM    STV     GM    STV
Full set   2.7   28.6    13.2  397.2    2.6   17.7    13.1  155.8   8.1   88.1    8.2   101.8
Random     2     18.7    18    827.2    2.3   13.6    16.5  188.5   5.9   41.7    9.4   86.4
Random     2.1   20.7    10.9  268.5    2.6   21.7    8.4   76      4.8   59.8    4.4   31
Random     3.2   39.8    15.9  238.9    3.4   20.6    13    110.6   11.1  99.8    8.8   148.8
Random     2.6   26.4    12.7  394.6    2.2   10.9    7.7   99.5    12.7  167.1   8.1   163.7
Random     3.1   39.4    17.3  698.3    3.5   25.9    25    213.9   7.2   46.5    11.1  137.3
Random     3.8   53.7    18.1  487.2    2.6   19.2    11    216.8   10.6  107.3   10.3  185.6
Random     3.7   72.5    8.1   170.5    4.5   48.5    30.4  706.1   12.2  180.2   6.6   71.8
Random     3.1   35      18.4  851.8    2.4   16.9    20.2  294.4   8     105.3   6     119.3
Random     3.7   47.4    12    271.5    1.5   5.1     13.3  150.4   5.1   36.4    5.7   64
Random     2.3   20.9    10    209      2.7   16.3    16.3  227.3   18.6  210.5   13.3  276.3
Spaced     2.8   35.6    22.8  645.3    2.5   25.1    25.7  214.7   12.2  144.2   13.7  216.5
Spaced     4.3   40.2    15.3  1141.1   2.8   17.6    13.7  263.7   7.1   84.9    7.2   123.8
Spaced     1.8   18.7    13.9  247.5    2.0   9.8     9.9   117.1   7.7   87.8    8.6   74.4
Spaced     2.4   24.5    6.1   102.9    3.1   22.6    8.3   70.9    6.3   58.9    5.1   48.0
Spaced     3.0   39      24    723.7    2.7   27.2    24.5  210     11.5  140.6   14.0  235.2


Table 2-5. Logistic regression for Salmonella presence or absence in relation to concentrations of indicator bacteria and physicochemical parameters (all ponds)

Predictor                              Odds ratio (95% CI)   p value

Univariate analysis
Coliforms (log_e MPN per 100 ml)       2.32 (1.59-3.43)      1.4 × 10^-5
E. coli (log_e MPN per 100 ml)         1.66 (1.40-1.99)      1.7 × 10^-8
Enterococci (log_e MPN per 100 ml)     2.51 (1.62-3.96)      4.4 × 10^-5
Air temperature (°C)                   1.01 (0.95-1.09)      0.68
Water temperature (°C)                 1.02 (0.93-1.14)      0.61
Conductivity (µS)                      1.00 (0.99-1.00)      0.39
pH                                     0.85 (0.57-1.28)      0.42
ORP (mV)                               1.00 (0.99-1.01)      0.96
Turbidity (FAU)                        1.44 (1.17-1.40)      6.6 × 10^-4

Multivariate analysis (final model)
E. coli (log_e MPN per 100 ml)         3.01 (2.00-4.65)      2.3 × 10^-7
Turbidity (FAU)                        1.29 (1.04-1.16)      0.02

Box-Cox power transformation (power factor 0.33)


Figure 2-1. Conceptual diagram of simulation model used to evaluate responsiveness of microbial criteria to shifts in water quality.


Figure 2-2. Graphical evaluation of goodness of fit of different distributions. Black lines: data; colored lines: fitted distributions as indicated in legend.


Figure 2-3. GM and STV ratios. A) Ratio of GM values of 10 sets of 20 randomly sampled and 5 sets of 20 evenly spaced samples to the GM value of the full set of 90 samples per pond. The red dashed line represents a ratio of 1, i.e. similar values for the 20 and 90 sample results. B) Ratio of STV values of 10 sets of 20 randomly sampled and 5 sets of 20 evenly spaced samples to the STV value of the full set of 90 samples per pond.


Figure 2-3. Continued.


Figure 2-4. Evaluation of responsiveness of produce safety rule to shifts in E. coli concentration. Refer to Figure 1 for simulation approach. Set 1: MWQP; sets 2-5: baseline; sets 6-15: shift scenario. Solid black lines: regulatory thresholds for GM and STV. Dashed red and blue lines: theoretical GM and STV before and after shift. Solid red and blue lines: simulated sampling results.


Figure 2-5. Probability of Salmonella presence in 150 ml of pond water as a function of the concentration of E. coli and (power transformed) turbidity. Note that the original turbidity values can be calculated by the reverse transformation $x = (\lambda y + 1)^{1/\lambda}$, where $y$ = transformed vector of turbidity measurements, $x$ = untransformed vector and $\lambda$ = power factor, optimized by maximum likelihood estimation.
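The power transformation of turbidity and the reverse transformation mentioned in the caption above can be sketched as follows. This assumes the standard Box-Cox form, y = (x^λ - 1)/λ, with the reported power factor of 0.33. The function names and the example turbidity value are hypothetical, and the original analysis estimated λ in R rather than with this code.

```python
import math


def box_cox(x, lam):
    """Standard Box-Cox power transformation of a positive value x."""
    if lam == 0:
        return math.log(x)  # limiting case as lam approaches 0
    return (x ** lam - 1.0) / lam


def inverse_box_cox(y, lam):
    """Reverse transformation: recover x from the transformed value y."""
    if lam == 0:
        return math.exp(y)
    return (lam * y + 1.0) ** (1.0 / lam)


LAMBDA = 0.33  # power factor reported for turbidity

turbidity_fau = 25.0                      # made-up turbidity reading
transformed = box_cox(turbidity_fau, LAMBDA)
recovered = inverse_box_cox(transformed, LAMBDA)
```

Applying `inverse_box_cox` to values on the transformed turbidity axis returns them to the original FAU scale.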


CHAPTER 3
A SIMPLE MECHANISTIC BACTERIAL TRANSPORT MODEL TO INFORM FOOD SAFETY REGULATIONS OF AGRICULTURAL WATER QUALITY

Foodborne illnesses are a significant concern for public health officials and the public alike. In recent years, an increasing number of foodborne illness outbreaks have been associated with fresh produce [1], [15]. One major pathway of pathogens to fresh produce is through agricultural water, including water used for irrigating plants, hand washing surrounding harvest, and post-harvest processing [1], [16]. Ensuring adequate microbial water quality has been a regulatory challenge in public health for some time, with the Food Safety Modernization Act as the newest legislation [2], [17]. The Food Safety Modernization Act (FSMA) aims to take a preventive approach towards the growing, harvesting, packing, and holding of produce, including the microbial quality of agricultural water [1]. Agricultural water used for irrigation is considered here.

New FSMA standards take a sampling-based approach to controlling the microbial water quality, specifying a number of samples to be taken and criteria to be met by the portfolio of sample results [1]. However, sampling-based regulations have some important limitations. Generally, preventive food safety management systems like HACCP (Hazard Analysis and Critical Control Point) first establish a detailed understanding of the system so that preventive measures can be introduced, and then use monitoring (i.e. microbiological testing) to ensure system functioning [5]. This differs from sampling-based regulations, which only use monitoring to ensure system functioning and do not start with preventive measures. More specifically, for highly variable surface water sources, these methods may not adequately or efficiently control microbial water quality. In the produce safety rule, indicator


bacteria, specifically Escherichia coli, are used to represent the likelihood of fecal contamination [1]. Previous work in Central Florida indicates that surface water sources used for irrigation have highly variable levels of E. coli [3]. This variability, coupled with the sampling scheme required, was found to lead to possible over- or underestimation of the critical values used to regulate agricultural water sources [18]. In addition, sampling methods can be slow to respond to large changes in water quality over time [18]. A more mechanistic approach to understanding the temporal dynamics of microbial water quality may shed light on improved management and regulatory strategies. A recent review of current modeling approaches emphasizes the role modeling can play in both mandated and voluntary water quality control measures [19].

Mechanistic bacteria modeling in surface waters has been pursued in many regions with differing levels of complexity and results [6]-[8], [19], [20]. A majority of watershed-scale models summarized in a 2016 review only consider bacteria in manure, as opposed to the soil reservoir [19]. One frequently used model that does include the soil reservoir is the Hydrologic Simulation Program FORTRAN (HSPF) [21]. This program models fecal indicator bacteria sources as a population of bacteria-shedding animals on the area of interest [21]. Both HSPF and the Soil and Water Assessment Tool (SWAT) model transport processes. SWAT is more frequently used in agricultural areas and is the only model included in the 2016 review to consider both manure and soil reservoirs [19]. However, the bacterial transport equations used are more complex, including taking into account different bacteria types, making it less appealing for this application [7].


This study uses existing bacterial source modeling components alongside basic hydrologic and aquatic degradation principles to develop a runoff-driven bacterial transport model for a highly variable irrigation pond in West Central Florida. State-of-the-art model calibration and evaluation techniques that incorporate uncertainty in the bacterial measurements are used to assess the quality of the model. Global sensitivity analysis (GSA) is then used to identify possible strategies to mitigate contamination events. Mechanistic models like the one proposed, which provide better understanding of surface water sources with highly variable microbial quality, can help inform more efficient and adequate real-time methods for controlling water-related foodborne illnesses.

Materials and Methods

Data

Microbiological data used for the development of this model was gathered as part of a larger, previously published dataset on agricultural water [3]. The data was collected from irrigation ponds used for strawberry production in West Central Florida. More detailed location information is withheld due to grower privacy and data protection rules in the sampling protocol. Sample collection was conducted over two growing seasons. Growing season 1 spanned November 2012 to May 2013; growing season 2 spanned October 2013 to June 2014. A total of 90 samples were collected on a weekly basis, with more frequent samples collected after intense rainfall events or periods of water use. For each sample, E. coli was quantified using the most probable number (MPN) method, which the EPA has validated as equivalent to EPA method 1603 [10]. EPA method 1603 or an equivalent method is required by the produce safety rule for enumerating E. coli. For this reason, this paper treats units of MPN and CFU as equivalent for data and model comparison purposes. Meteorological data for the period of sampling was accessed from a Florida Automated Weather Network station approximately 10 miles west [22].


Data was collected from a single irrigation pond in West Central Florida. The pond is situated between agricultural fields and is fed by rainfall and runoff. Adjacent fields collect runoff in small drains that feed the pond through culverts (Fig. 1). The agricultural fields are composed of sandy soils (USDA soil taxonomy classification Ona fine sand and St. Johns fine sand [23]) and utilize black plastic mulch on raised beds during the growing season. There is no use of biological soil amendments in the adjacent fields; however, wildlife is frequently present in the area surrounding both the pond and fields [3]. The pond is connected to a channel which is used during periods of extremely high rainfall to divert excess water. As growing seasons span the dry season in Florida, this channel is not active during the growing season and is not considered in the hydrologic model of the pond.

Conceptual Model and Computer Model Structure

In order to develop the model structure, a conceptual diagram of the system was made (Figure 1). In this conceptual diagram, wildlife on the area of interest contribute to accumulation of indicator bacteria on the soil surface. Rainfall events generating runoff then transport the indicator bacteria to the pond via runoff collection drains at the edge of adjacent fields. Rainfall and runoff events also contribute to the changing volume of water in the pond. Once transported to the pond, bacteria are removed via die-off and sedimentation. Though the pond is unlined, it is of a sufficient depth that seepage is excluded from the water balance and only evapotranspiration is considered. Depth of the pond was also used to justify exclusion of sediment resuspension. The final model structure is shown in Figure 2. All model development was carried out in R version 3.3.1. Details on the model components are provided below.

Bacterial Source Characteristics

To model the accumulation and transport of bacteria, equations from two widely used models, HSPF and SWAT, were considered. Because the HSPF model is often used to specify total maximum daily loadings of contaminants in urban watersheds, it does not focus on biological soil amendments as a source [24]. This is relevant to the study area because of the lack of livestock or biological soil amendment application. Bacterial accumulation, storage, and transport equations from the HSPF model were used because of this relevance and their relative simplicity. Three distinct compartments are considered to model bacterial source characteristics. These are shown in Figure 3-2.

The first model compartment is the HSPF bacterial accumulation. Accumulation of bacteria on the soil surface is modeled as a population of animals on the area of interest, following Equation 3-1 [21]. Rainfall variability has been shown to have a positive effect on small mammal population growth rate [25]. In the dataset, the two growing seasons represent differing levels of rainfall variability that could explain differing magnitudes of peaks in E. coli levels. For this reason, the wildlife population parameter was considered separately for each growing season. Subscript j refers to the growing season, with j = 1, 2.

$$ \mathrm{ACCUM}_j = \frac{F_{prod} \cdot FC_{den} \cdot \mathrm{POP}_j}{\mathrm{HAB}} \tag{3-1} $$

where ACCUM_j [CFU/acre/day] is the accumulation rate, F_prod [g/individual/day] is feces produced, FC_den [CFU/g] is the density of bacteria per gram of feces, POP_j [-] is the number of individuals in the area during growing season j, and HAB [acres] is the total area of interest.

The second bacterial source compartment is the HSPF bacterial surface storage. Die-off of bacteria on the soil surface is considered by specifying a storage limit. Storage approaches this limit as accumulation occurs. Runoff events cause wash-off of bacteria, reducing surface storage in that time step. Storage follows Equation 3-2 [21]. Subscript i denotes the day.

$$ \mathrm{SQO}_i = \mathrm{ACCUM}_j + \mathrm{SQO}_{i-1}\,(1 - \mathrm{SoilDO}_j), \qquad \mathrm{SoilDO}_j = \frac{\mathrm{ACCUM}_j}{\mathrm{SQO}_{lim}} \tag{3-2} $$

where SQO_i [CFU/acre] is the soil surface storage of bacteria, SoilDO_j [day⁻¹] is the die-off rate of bacteria on the soil surface, and SQO_lim [CFU/acre] is the storage limit of bacteria on the soil surface.

Transport from the soil surface is modeled using the HSPF bacterial transport compartment. Runoff depth on a daily scale is required as an input to this compartment (see Hydrology/Landscape Characteristics). A parameter describing how readily bacteria are removed from the soil surface by runoff is also required. It takes the form of the runoff rate that removes 90% of bacteria in one hour, which depends more on landscape characteristics than on bacterial source characteristics. Transport occurs according to Equation 3-3 [21].

$$ \mathrm{SOQO}_i = \mathrm{SQO}_i \left(1 - e^{-Q_i \cdot \mathrm{WSFAC}}\right), \qquad \mathrm{WSFAC} = \frac{2.30}{wsqop} \tag{3-3} $$

where SOQO_i [CFU/acre] is the amount of bacteria washed off the soil surface, Q_i [in] is the runoff depth, wsqop [in/hr] is the runoff rate that removes 90% of stored bacteria in one hour, and WSFAC [-] is a measure of the susceptibility of bacteria to runoff-related wash-off.
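
The three source compartments described above can be sketched in a few lines of Python. This is a minimal illustration, not the thesis code (which was written in R and is not reproduced here). The function names are hypothetical, the storage and wash-off updates assume the standard HSPF forms, the storage limit used in the demonstration is an arbitrary value, and the daily runoff depth is used directly in the wash-off term.

```python
import math

def accumulation_rate(f_prod, fc_den, pop, hab):
    """Daily bacterial accumulation [CFU/acre/day] from wildlife (Equation 3-1)."""
    return f_prod * fc_den * pop / hab

def update_storage(sqo_prev, accum, sqo_lim):
    """One daily step of soil surface storage [CFU/acre].

    Assumes the HSPF form in which the die-off fraction is accum / sqo_lim,
    so storage approaches sqo_lim when no wash-off occurs.
    """
    soil_do = accum / sqo_lim
    return accum + sqo_prev * (1.0 - soil_do)

def washoff(sqo, runoff_depth, wsqop):
    """Bacteria washed off the soil surface [CFU/acre] for a given runoff depth.

    Assumes WSFAC = 2.30 / wsqop, so runoff equal to wsqop removes about 90%.
    """
    wsfac = 2.30 / wsqop
    return sqo * (1.0 - math.exp(-runoff_depth * wsfac))

# Illustrative run: storage builds up over 30 dry days, then one runoff event.
accum = accumulation_rate(f_prod=713.0, fc_den=4.933e6, pop=76.9, hab=13.0)
sqo_lim = 10.0 * accum  # hypothetical storage limit, chosen only for this demo
sqo = 0.0
for _ in range(30):
    sqo = update_storage(sqo, accum, sqo_lim)
removed = washoff(sqo, runoff_depth=0.5, wsqop=1.19)
sqo -= removed  # wash-off reduces the surface storage for that day
```

The constant 2.30 is approximately ln(10), which is why a runoff depth equal to wsqop removes about 90% of the stored bacteria in this formulation.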

Hydrology/Landscape Characteristics

To model runoff from the area of interest, the NRCS curve number method is used. Curve numbers were estimated based on land cover, assuming 60% soil and 40% black plastic mulch. These values were optimized during parameterization. This well-established method uses a single landscape parameter, the curve number, alongside rainfall depth and the antecedent moisture condition (AMC) of the soil to produce runoff depth on a daily time step as input to other model compartments. This model compartment follows Equation 3-4 [26].

$$ Q_i = \begin{cases} \dfrac{(P_i - 0.2S)^2}{P_i + 0.8S}, & P_i > 0.2S \\ 0, & \text{otherwise} \end{cases} \qquad S = \frac{1000}{CN_2} - 10 \tag{3-4} $$

where Q_i [in] is the runoff depth on day i, S [in] is the potential maximum retention, CN_2 is the NRCS curve number for AMC II (unitless), and P_i [in] is the rainfall depth on day i.

To calculate the final concentration of bacteria in the pond, a water budget must also be kept for the pond volume. The pond is maintained at a depth of 15 ft and is assumed to have a 4:1 slope. Initial pond volume is assumed to be at capacity because of the lack of data and because the start of data collection coincided with the end of the wet season in Central Florida. Inputs and outputs of the pond are considered to be runoff, direct rainfall, and evapotranspiration. This process follows Equation 3-5.

$$ \mathrm{PondVol}_i = \mathrm{PondVol}_{i-1} + Q_i A_{land} + (P_i - ET_i)\,A_{pond} \tag{3-5} $$

where PondVol_i is the volume of the pond on day i, Q_i [in] is the runoff depth on day i, A_land [ha] is the land area draining to the pond, A_pond [ha] is the pond area, P_i [mm] is the precipitation on day i, and ET_i [mm] is the evapotranspiration on day i. All terms are converted to consistent units before summing.

Aquatic Removal

All removal of bacteria from the water column was considered to follow a first-order decay process (Equation 3-6). This combines the processes of die-off and settling out of the water column into a single rate constant. For simplicity, this rate was assumed constant throughout the study period.

$$ C_i = C_{i-1}\, e^{-k \Delta t} \tag{3-6} $$

where C_i [CFU/100 ml] is the concentration of bacteria in the pond on day i, k [day⁻¹] is the removal rate, and Δt is the time step, here 1 day.

Model Fitness and Parameterization

Values obtained from the literature, along with appropriate ranges, were used to parametrize the model (Table 3-1). Eight uncertain model inputs (identified in Table 3-1) were selected for the inverse calibration of the model. The procedure consisted of maximizing the Nash-Sutcliffe model efficiency (NSE) [27]. NSE quantifies the model prediction accuracy and is defined as:

$$ \mathrm{NSE} = 1 - \frac{\sum_{t=1}^{N} (O_t - P_t)^2}{\sum_{t=1}^{N} (O_t - \bar{O})^2} $$

where O_t and P_t are the observed and predicted values at time t, \bar{O} is the mean of the observed values, and N is the number of observations. NSE can take values between −∞ and 1, where NSE = 1 indicates a perfect fit and negative values indicate that the model has lower accuracy than the mean of observed values.

The inverse calibration search procedure consisted of a) generating parameter sample sets (each with k = 8 parameters) globally distributed in the parameter space; b) running the model for each sample set; c) calculating the NSE for the simulated time series of E. coli concentrations (CFU) in the pond; and d) identifying the optimal set that provided the maximum value of NSE. The global sampling of the parameters was based on the Morris trajectory-based method [28], [29] presented in the Global Sensitivity Analysis section below.

Because stochastic direct deposition events could influence peak E. coli concentrations, parameterization was also attempted while excluding the top one and top three rainfall events, which resulted in exclusively negative values of the Nash-Sutcliffe efficiency [6], [7]. This indicates that these peak events are not outliers but are important to explaining the bacterial concentration dynamics, and that these dynamics are in part runoff driven and necessary to simulate the observed data.

Uncertainty analysis was conducted for observed values, considering uncertainty introduced through sampling methods. Uncertainty in E. coli quantification using the MPN method is well documented, with tables listing the results of the test, the MPN value, and the 95% confidence limits [30]. These confidence limits were used to find the percentages of each observed value that define the lower and upper bounds. Here, these percentages are 60% and 213% for the lower and upper bounds, respectively. These were used to define a triangular distribution of uncertainty around the observed E. coli concentration values. Similarly, a lognormal distribution of uncertainty was defined by using the 95% confidence limits to find the coefficient of variation of a lognormal distribution, here equal to 64%.
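
Steps a) through d) of the calibration search can be sketched as follows. This is an illustrative sketch and not the calibration code used in the study: it draws uniform random samples instead of Morris trajectories, and `calibrate`, `toy_model`, and `bounds` are hypothetical names. The toy model is a single first-order decay curve standing in for the full pond model.

```python
import math
import random

def nse(observed, simulated):
    """Nash-Sutcliffe efficiency of simulated values against observations."""
    mean_obs = sum(observed) / len(observed)
    sq_err = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    sq_dev = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sq_err / sq_dev

def calibrate(model, observed, bounds, n_samples=500, seed=0):
    """Sample parameter sets, run the model, and keep the set with maximum NSE.

    Assumption: uniform random sampling within bounds (not Morris trajectories).
    """
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_samples):
        params = {name: rng.uniform(lo, hi) for name, (lo, hi) in bounds.items()}
        score = nse(observed, model(params))
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Toy usage: recover a first-order removal rate from synthetic observations.
def toy_model(params):
    return [1000.0 * math.exp(-params["k"] * day) for day in range(10)]

synthetic_obs = toy_model({"k": 0.337})
best, score = calibrate(toy_model, synthetic_obs, {"k": (0.1, 0.9)})
```

A search of this kind scales poorly with the number of parameters, which is one reason a structured design such as the Morris trajectories cited in the text may be preferred for eight inputs.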

The assessment of model performance was performed using the FitEval software [31]. FitEval uses block bootstrapping of the {O_t, P_t} pairs to approximate the underlying distribution of goodness-of-fit statistics (NSE and root mean square error, RMSE). From this, median values and 95% confidence intervals are provided for both NSE and RMSE. NSE provides a dimensionless metric of goodness of fit, and RMSE an indicator of absolute error with the same dimensions as model outputs. The uncertainty in the observed data is accounted for in FitEval by relaxing the sum of square errors used in the NSE and RMSE calculations [32]. These results demonstrate how considering uncertainty affects model fit, with goodness-of-fit metrics increasing when observed value uncertainty is considered. FitEval also classifies model performance [31] using the thresholds defined in Figure 3-4.

Global Sensitivity Analysis

Appropriate parameters from each model compartment were included in the global sensitivity analysis (GSA) of the model. Ranges surrounding the optimized parameters were analyzed using both the Sobol and Morris methods [28], [29]. Both methods quantify the effects of input variance on output variance, both from individual parameters (first order) and from interactions between parameters (total order). The Morris method uses a more efficient sampling process to identify important input parameters with large first and/or total order effects; the Sobol method requires more thorough sampling to quantify effects as a fraction of the total output variance, also known as variance decomposition [33]. Both are used here because both magnitude and fractional representations are useful. Here, the geometric mean (GM) and statistical threshold value (STV) are considered to be the model outputs of interest. These values represent the 50th and 90th percentiles of a lognormal distribution fit to the data and are the microbial criteria used in the rule. The analysis was carried out with the software Simlab, version 2.2.1 [34].

Results and Discussion

Model Goodness of Fit

The optimized parameters of the final calibrated model are presented in Table 3-2, and a time series plot of observed and model-generated values, with the resulting Nash-Sutcliffe efficiency, is shown in Figure 3-3. Inspection of this time series plot suggests limited prediction by the model of peak E. coli concentration events when uncertainty in these values is not considered. However, uncertainty should be considered because high uncertainty is intrinsic to these field measurements; the results of doing so are presented in Table 3-3 and Figure 3-4. Both goodness-of-fit metrics improved when uncertainty was included, either as a triangular or a lognormal distribution. Evaluating lognormal uncertainty resulted in greater model improvement, with a median NSE of 0.581, than evaluating triangular uncertainty, with a median NSE of 0.513.

There have been many attempts to model bacterial concentrations in surface water, with mixed results. A study by Baffaut and Sadeghi fit the Soil and Water Assessment Tool (SWAT) to 7 datasets in 6 locations to predict bacterial concentrations in surface water bodies [6]. A separate study fit both SWAT and HSPF to four different locations [7]. For all model calibrations in both studies, the NSE was used to evaluate the fit of the model to both hydrologic data and bacterial concentration data. For hydrologic data, Nash-Sutcliffe efficiencies were generally high [7]. However, even for locations with such strong hydrologic fits, the majority of bacterial NSE values fell between -6 and 0.387; only one attained an acceptable fit value of 0.73 [6], [7]. Thus, while the NSE of 0.581 achieved by the model developed in this study does not reach the acceptability cutoff, it is considered competitive and in the high range of values obtained in previous studies. The proposed model has the benefits of its mechanistic structure, simplicity, and parsimony, which can serve to inform food safety regulations.

Global Sensitivity Analysis

Global sensitivity analysis was applied to the calibrated model to quantify the effects of parameter inputs on the GM and STV of the predicted dataset. Figures 3-5a and 3-5b show the individual and interaction effects of each parameter on the GM and STV, respectively. Individual effects are presented on the x axis and interaction effects on the y axis. Aquatic removal rate, k, stands out as a significant parameter for both GM and STV. Bacterial density in feces, FC_den, and rate of feces produced per animal, F_prod, have a higher relative importance to STV than to GM. Curve numbers of both land cover types, CN2_mulch and CN2_soil, have a greater effect on the GM.

Figure 3-6 shows the results of the Sobol method of global sensitivity analysis. Plots represent individual effects of each parameter on GM and STV, respectively. From this variance decomposition, groupings of parameters can be analyzed for their potential to ensure adequate water quality. Parameters were grouped according to model processes: bacterial source characteristics, runoff and landscape characteristics, and aquatic removal. Aquatic removal rate, k, dominated the variance effects for both GM and STV. Bacterial source characteristics explained a higher proportion of the variance of the STV than of the GM. Runoff/land cover characteristics, specifically both curve number values, explained a greater proportion of the GM variance.

These results can be used to select appropriate mitigation strategies and best management practices. As aquatic removal rate, k, drives most of the variance of both regulatory criteria, increasing this value would result in more compliant ponds.
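
The two outputs used throughout the sensitivity analysis, GM and STV, can be computed from a concentration series as sketched below. This is a minimal sketch under stated assumptions: concentrations are strictly positive, the lognormal fit uses the mean and sample standard deviation of log10-transformed values, and the function name `gm_and_stv` is hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev

def gm_and_stv(concentrations):
    """Return (GM, STV) as the 50th and 90th percentiles of a fitted lognormal."""
    logs = [math.log10(c) for c in concentrations]
    mu = mean(logs)
    sigma = stdev(logs)  # sample standard deviation of the log10 values
    z90 = NormalDist().inv_cdf(0.90)  # standard normal 90th percentile
    gm = 10.0 ** mu
    stv = 10.0 ** (mu + z90 * sigma)
    return gm, stv

# Example series in CFU/100 ml (made-up values for illustration only).
gm, stv = gm_and_stv([10.0, 100.0, 1000.0])
```

Because the STV depends on both the log mean and the log standard deviation, processes that mainly change the size of peaks can move the STV while leaving the GM nearly unchanged, which is in line with the different parameter rankings reported for the two criteria.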
Die-off rates of microorganisms are known to be influenced by both light and temperature [35]. While temperature may not be easily influenced in low-maintenance surface water irrigation ponds, increasing sunlight availability to ponds could be achieved through appropriate management of vegetation in the surrounding areas. Treatment of water to promote die-off can also be applied. This mitigation strategy might best be applied at targeted times. The failure of the model to fit the dataset when peak concentration events were excluded indicates that these peaks are runoff driven. This suggests that while peak events are not guaranteed following runoff-producing rainfall events, there is some correlation that could inform the timing of water treatment.

Bacterial source characteristics were found to drive a large portion of the variance of the STV. The STV represents the magnitude of spikes in E. coli concentrations and can be important in controlling outbreaks. For irrigation ponds with more pronounced domestic animal presence, the STV could be influenced by excluding animals from the area draining to the pond. Management practices that control wildlife are also an option for irrigation ponds without a known population of domestic animals.

A possible mitigation strategy under the category of runoff/landscape characteristics is the use of vegetative filter strips. Vegetative filter strips are a pollution control measure that involves planting or facilitating vegetation between the pollution source and the area of interest [36]. These well-managed barriers of dense vegetation decrease runoff flow velocity and increase infiltration into the soil. This reduced transport of pollutants results in a net bacterial sink. Vegetative filter strips should not be confused with unimproved field borders, where spontaneous vegetation is allowed to grow unmanaged.
These unimproved borders can actually become pathogen sources by providing animal habitat. While effectiveness depends on many variables, including the proportion of bacteria attached to sediment, sediment size, and vegetation density, a well-constructed vegetative filter strip can remove 80% of bacteria [36]. This is a missed opportunity in the study area, which uses collection drains to route runoff to the ponds. While runoff/landscape characteristics drive a smaller proportion of the variance of both regulatory criteria, vegetative filter strips could still provide an important reduction in both values.

Final Remarks

This study provides a simple mechanistic bacterial transport model fit to E. coli concentration data from a highly variable irrigation pond in West Central Florida. A Nash-Sutcliffe efficiency of 0.581 was achieved for the model when considering uncertainty in observed data, a value competitive with those present in the literature. Global sensitivity analysis of this model provides insight into processes important to controlling the microbial criteria set forth by FSMA. Increasing die-off rates of bacteria in surface water and controlling sources of bacteria will prove the most effective in reducing the GM and STV, respectively. It is also evident that the peak E. coli concentration events are driven by runoff-producing rain events. This finding can help concentrate efforts, such as treatment to promote die-off, on the times when they are most needed, specifically after large rain events.

There are clear limitations to fitting bacterial transport models to concentration data. Goodness-of-fit metrics are frequently low for both HSPF and SWAT model fittings [6]–[8]. While the NSE values attained for the model developed in this study are competitive with those in the literature, increased goodness of fit would improve the reliability of the model for informing food safety regulations and mitigation strategies for growers. Because aquatic removal rate was found to be important in controlling both GM and STV, expanding the model to include dependencies on temperature and solar radiation could improve model performance in future work.

Additional challenges are the differing hydrologic and source characteristics among sources of agricultural water. Because FSMA is a federal regulation, the agricultural water sources it covers will vary significantly. Future research aimed at establishing representative values of wildlife populations across different ecoregions would contribute to improving the applicability of this model. Introducing other bacterial source processes, differing forms of transfer, and a vegetative filter strip model such as VFSMOD-W [37] would also expand possible applications.

Preventive food safety regulations, such as FSMA, seek to create an environment where pathogens are controlled via practices at each production level, resulting in safe products at the consumer level. This principle only goes so far in the current regulations. Specifically, in the agricultural water quality regulations, pathogens are controlled via sampling for indicator organisms and reacting appropriately, but no preventive measures are specified to keep pathogens and indicator organisms from entering the water in the first place [1]. To specify these mitigating practices, the system must be sufficiently well understood. Mechanistic bacterial transport modeling can supply this understanding, provided that models can be fit to data with acceptable accuracy and efficiency. This information can aid growers seeking to attain or maintain compliant water sources, and inform future food safety regulations attempting to take a preventive approach to water-related food safety issues.

Table 3-1. Model parameters. Each parameter range was taken directly from the literature or expanded around values recommended in the literature. Values specified as "Variable" are calculated on a daily time step by the model. Uncertain parameters used in the model calibration (underlined in the original) are marked with an asterisk.

| Parameter | Description | Units | Value/Range | Source |
|---|---|---|---|---|
| Bacterial Source Characteristics | | | | |
| Bacterial Deposition | | | | Bicknell et al. 1996 [21] |
| ACCUM | Accumulation rate | CFU/acre/day | Calculated | |
| F_prod * | Feces produced per day | g/day | 75–1100 | Parker et al. 2013 [38] |
| FC_den * | Density of FC bacteria per gram | CFU/g | 1.40e4–1.20e11 | Parker et al. 2013 [38] |
| POP_j * | Population size in season j (j = 1, 2) | individuals | 1–100 | Estimate |
| Bacterial Soil Surface Storage | | | | Bicknell et al. 1996 [21] |
| HAB | Habitat area | acres | 13 | USDA 2015 [23] |
| SQO_init | Initial storage of FC bacteria in soil | CFU/acre | Variable | Model |
| SoilDO | Die-off rate of FC bacteria in soil | day⁻¹ | Calculated | |
| SQO_lim | Storage limit for FC bacteria in soil | CFU/acre | Calculated | |
| Bacterial Transport with Runoff | | | | Bicknell et al. 1996 [21] |
| SOQO | Amount washed off land | CFU/acre/day | Variable | Model |
| Q | Runoff depth | in | Variable | Model |
| SQO | Storage of FC bacteria in soil | CFU/acre | Variable | Model |
| wsqop * | Runoff rate that removes 90% of stored FC bacteria in one hour | in/hr | 0.6–1.3 | Moyer and Hyer 2003 [39] |
| WSFAC | Susceptibility of FC bacteria to wash-off | [-] | Calculated | |
| Hydrology/Landscape Characteristics | | | | |
| Rainfall/Runoff to Pond | | | | NRCS 1986 [26] |
| Q | Runoff depth | mm | Variable | Model |
| P | Rainfall depth | mm | Variable | Data |
| S | Potential maximum retention | mm | Calculated | |
| CN2_mulch * | Curve number for AMC II, mulch | [-] | 72–98 | NRCS 1986 [26] |
| CN2_soil * | Curve number for AMC II, soil | [-] | 55–72 | NRCS 1986 [26] |
| AMC I/II/III | Antecedent moisture condition (based on previous 5-day rainfall) | [-] | Variable | Data |
| Pond Water Budget | | | | |
| Q | Runoff depth | mm | Variable | Model |
| P | Rainfall depth | mm | Variable | Data |
| ET | Evapotranspiration | mm | Variable | Data |
| A_pond | Pond area | m² | 3564 | USDA 2015 [23] |
| A_land | Land area draining to pond | ha | 5.261 | USDA 2015 [23] |
| Aquatic Removal | | | | |
| Aquatic Die-off and Deposition | | | | Easton et al. 1999 [40] |
| k * | Die-off rate of FC bacteria in water | day⁻¹ | 0.1–0.9 | Easton et al. 1999 [40] |

Table 3-2. Optimized values of each unknown parameter.

| Parameter | Value | Units |
|---|---|---|
| F_prod | 713 | g/day |
| FC_den | 4.933e6 | CFU/g |
| POP_1 | 76.9 | individuals |
| POP_2 | 3.8 | individuals |
| wsqop | 1.19 | in/hr |
| k | 0.337 | day⁻¹ |
| CN2_mulch | 83 | [-] |
| CN2_soil | 61 | [-] |
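
To show how the optimized values in Table 3-2 enter the hydrology and aquatic removal compartments, the sketch below evaluates the curve number runoff equation and first-order removal. It is an illustration under stated assumptions rather than the thesis implementation: rainfall is in inches, no antecedent moisture adjustment is applied, runoff from soil and mulch is area-weighted 60/40, and all function names are hypothetical.

```python
import math

def cn_runoff(rain_in, cn):
    """NRCS curve number runoff depth [in] for a rainfall depth [in]."""
    s = 1000.0 / cn - 10.0  # potential maximum retention [in]
    ia = 0.2 * s            # initial abstraction [in]
    if rain_in <= ia:
        return 0.0
    return (rain_in - ia) ** 2 / (rain_in + 0.8 * s)

def field_runoff(rain_in, cn_soil=61.0, cn_mulch=83.0, frac_soil=0.6):
    """Area-weighted runoff [in]; the weighting scheme is an assumption."""
    return (frac_soil * cn_runoff(rain_in, cn_soil)
            + (1.0 - frac_soil) * cn_runoff(rain_in, cn_mulch))

def decay(conc_prev, k=0.337, dt=1.0):
    """First-order removal over one time step of dt days."""
    return conc_prev * math.exp(-k * dt)

runoff_2in = field_runoff(2.0)  # runoff depth for a 2 in rainfall event
remaining = decay(100.0)        # concentration left after one day
```

With k = 0.337 per day, roughly 71% of the bacteria present in the water column remain after one day, so a peak would fall by about an order of magnitude in about a week if no new loading occurred.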

Table 3-3. Results of FitEval analysis considering three levels of uncertainty. Classification corresponds to the thresholds included in Figure 3-4. Median values are followed by 95% confidence intervals in parentheses.

| Error Distribution | NSE | RMSE | Classification |
|---|---|---|---|
| None | 0.482 (-0.115 to 0.698) | 1670.956 (985.744 to 2687.91) | Unsatisfactory to Acceptable |
| Triangular | 0.513 (-0.085 to 0.74) | 1619.956 (871.704 to 2622.465) | Unsatisfactory to Acceptable |
| Lognormal | 0.581 (-0.115 to 0.813) | 1501.952 (918.136 to 2492.04) | Unsatisfactory to Good |

Figure 3-1. Conceptual diagram of the study site and model processes. Bacterial accumulation on the soil surface is driven by wildlife. Bacterial transport is driven by runoff. Runoff is diverted to the pond via drains and culverts.

Figure 3-2. Model structure diagram. Each model compartment belongs to a model process and produces one model output.

Figure 3-3. Time series plot of observed and simulated bacterial concentrations produced with optimized parameters. This model has an NSE of 0.481.

Figure 3-4. Cumulative distribution function (CDF) of the NSE under the three conditions evaluated with FitEval. Considering uncertainty shifts the CDF towards higher NSE values.

Figure 3-5. Results of the Morris method of global sensitivity analysis. The x axis, μ*, represents the absolute value of first order effects. The y axis, σ, represents interaction effects. The diagonal line provides a threshold to assess the relative dominance between direct effects (below the line) and interaction effects (above the line) for each input.

Figure 3-6. Results of the Sobol method of global sensitivity analysis. A) Variance decomposition of GM. B) Variance decomposition of STV.

CHAPTER 4
CONCLUSION

Main Findings

Analysis of the current food safety regulations concerning the microbial quality of agricultural water showed that higher variability than assumed in the rule can lead to underestimation of peak events for some ponds when fitting lognormal distributions to calculate regulatory microbial criteria. Current regulations assume a standard deviation of 0.92, while the standard deviations of ponds in the study area ranged between 1.66 and 2.78. Consequently, considering 20 samples in the water quality profile is insufficient to characterize the water quality in the investigated agricultural ponds. The rolling sampling schedule also provides a slow response to sudden shifts in water quality, with the GM and STV only reflecting shifts after 1 to 6 simulated years, depending on the nature and magnitude of the shift. These findings suggest the need for a mechanistic approach to understanding these highly variable water sources and a preventive regulatory focus.

To provide this mechanistic understanding, this study provides a simple bacterial transport model fit to E. coli concentration data from a highly variable irrigation pond in West Central Florida. The performance of the model was similar or superior to existing pathogen transport models, with a Nash-Sutcliffe efficiency of 0.581 when incorporating uncertainty in observed values. Global sensitivity analysis was then used to reveal the most important processes controlling bacterial water quality criteria: aquatic removal rate of bacteria for GM, and bacterial source and transport dynamics for STV. It was also found that large peak E. coli concentration events were mechanistically driven by rainfall/runoff processes. From these findings, we suggest preventive measures that enhance die-off rates through treatment after large runoff-producing rainfall events. Bacterial source characteristics such as wildlife population should be controlled in instances where the STV exceeds regulatory limits. Vegetative filter strips, when properly designed and maintained, also provide an opportunity to mitigate bacterial transfer into agricultural waters by reducing runoff flow and settling particulate pollutants.

Limitations and Future Research

The insufficiency of the sampling scheme prescribed by the PSR may seem to indicate that increased sampling frequency should be recommended. However, increased frequency would place a greater sampling cost burden on growers. Additionally, because of the high variability in E. coli levels in the irrigation ponds, the number of samples required to adequately estimate the true mean would be approximately 9 times that required in the current rule. An increase of this magnitude is unreasonable to recommend because of the costs associated with sampling. Therefore, we sought alternative methods of controlling the microbial quality of these ponds through a mechanistic understanding of peak events and the recommendation of preventive control measures.

Fitting a mechanistic bacterial transport model to a highly variable irrigation pond has many limitations, and much complexity was omitted in order to develop a parsimonious model with adequate fit. Bacterial transport in the model is represented by a single wash-off process. This ignores the partitioning of transported bacteria into those in solution and those attached to sediment. Even when these processes are included, as they are in the SWAT model, there is still difficulty predicting the magnitudes of peak E. coli concentration events. Overall, NSE values remain typically low when predicting bacterial concentrations, and increased goodness of fit would improve the reliability of the model for informing food safety regulations and mitigation strategies for growers. Because aquatic removal rate was found to be important in controlling both GM and STV, expanding the model to include dependencies on temperature and solar radiation could improve model performance in future work.

Because of the broad regulatory scope of FSMA, establishing widely applicable preventive controls could pose a challenge. Future work should include additional surveys of surface water sources used for irrigation in major U.S. ecoregions. Establishing variability estimates could focus modeling efforts where they are most needed, as sampling-based approaches prove more problematic with increased variability. Additional information on wildlife populations within agricultural watersheds would also ease parameterization of the model developed in this study. In areas where runoff is the main driver, the implementation of vegetative filter strips is a promising strategy for mitigating bacterial transport to irrigation ponds. VFSMOD-W, a model of transport through vegetative filter strips, could be coupled with the model developed here to better understand the impacts of including well-managed filter strips [37].

Broader Impacts

Preventive food safety regulations, such as FSMA, seek to create an environment where pathogens are controlled via practices at each production level, resulting in safe products at the consumer level. This principle only goes so far in the current regulations. Specifically, in the agricultural water quality regulations, pathogens are controlled via sampling for indicator organisms and reacting appropriately, but no preventive measures are specified to keep pathogens and indicator organisms from entering the water in the first place [1]. This sampling scheme does not stand up to statistical analysis, with both the number of samples required and the frequency with which the sampling profile is updated found to be insufficient to characterize the microbial quality of highly variable surface water.
These findings are timely, as the compliance dates for the PSR Agricultural Water Rule have been delayed to develop guidance and explore ways to simplify the rule [41].

Findings from this study can aid FSMA authors and growers committed to food safety in taking a preventive approach to water-related food safety issues by allowing growers with noncompliant ponds to target specific processes based on the noncompliant microbial criteria. Furthermore, future food safety regulations could take advantage of the finding that in irrigation ponds with no point source for bacterial contamination, runoff may be the primary driver of peak indicator organism concentration events. Understanding and controlling peak events is the most effective way to control foodborne illness risk to consumers of fresh produce, as significant transfer risk is associated with very high concentrations of pathogenic organisms. A sampling scheme based on these findings would seek to use a limited number of water samples wisely, to best capture peak events, and to regulate the use of water at high-risk times (after large runoff-producing rain events). While these findings will not apply to all sources of irrigation water, they are an important step in understanding microbial dynamics and developing smarter food safety regulations.
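
The sample-size statement in Limitations and Future Research can be checked with a short calculation. The sketch assumes that the number of samples needed to estimate a mean to a fixed precision scales with the variance of the log-transformed counts. This is a common rule of thumb and may not be the exact calculation behind the "approximately 9 times" figure in the text; the function name is hypothetical.

```python
def sample_size_ratio(sd_observed, sd_assumed=0.92):
    """Ratio of sample sizes needed for equal precision of the estimated mean.

    Assumes required sample size is proportional to the variance.
    """
    return (sd_observed / sd_assumed) ** 2

# Range of standard deviations reported for the study ponds.
low = sample_size_ratio(1.66)
high = sample_size_ratio(2.78)
```

Under this assumption, the most variable pond (standard deviation 2.78) gives a ratio close to 9, and the least variable pond (1.66) a ratio a little above 3.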

APPENDIX
R MARKDOWN FILES

Object A-1. Chapter 2 R Markdown

The code used to produce the results from Chapter 2, "Evaluating the U.S. FSMA Produce Safety Rule standard for microbial quality of agricultural water for produce growing," is included here as an R Markdown file.

Object A-2. Chapter 3 R Markdown

The code used to produce the results from Chapter 3, "A Simple Mechanistic Bacterial Transport Model to Inform Food Safety Regulations of Agricultural Water Quality," is included here as an R Markdown file.

LIST OF REFERENCES

[1] and Holding of P Fed. Regist. no. 80, pp. 74354–74568, 2015.
[2]
[3] PLoS One vol. 12, no. 4, 2017.
[4] holding of produce fo Fed. Regist. no. 80, pp. 74354–74568, 2015.
[5] Hazard Analysis Critical Control Point (HACCP) 2014. [Online]. Available: http://www.fda.gov/Food/GuidanceRegulation/HACCP/ucm2006801.htm#princ.
[6] Trans. ASABE vol. 53, no. 5, pp. 1585–1594, 2010.
[7] Watershed Examination of In J. Environ. Eng. vol. 139, no. 5, pp. 719–727, 2013.
[8] Scale Agricultural Modeling of Pathogen Indicator Organisms in a Small Scale Agricultural Catchment Hum. Ecol. Risk Assess. no. January, 2013.
[9] Fed. Regist. no. 78, pp. 3503–3646, 2013.
[10] U analysis of pollutants. Sect. 3. Identification of test procedures. Table 1H. List of approved 40 CFR 136 vol. U.S. Envir, 2017.
[11]
[12] Annu Rev Food Sci Technol. no. 6, pp. 479–503, 2015.
[13] J. Fox, S. Weisberg, D. Adler, D. M. Bates, G. Baud Bovy, S. Ellison, D. Firth, M. Friendly, G. Gorjanc, S. Graves, R. Heiberger, R. Laboissiere, G. Mon, D. Murdoch, H. Nilsson, D. Ogle, B. Ripley, W. Venables, A. Zeileis, and R R topics documented p. 167, 2014.
[14] Pennsylvania Surface Water Used for Irrigating P J. Food Prot. vol. 79, no. 6, pp. 902–912, 2016.
[15] Advances in Food and Nutrition Research vol. 57, no. 9. pp. 155–208, 2009.
[16] Food Microbiol. vol. 32, no. 1, pp. 1–19, 2012.
[17] ments for
[18] A. H. Havelaar, K. M. Vazquez, Z. Topalcengiz, R. Muñoz-Carpena, and M. D. Danyluk, agricultural water for J. Food Prot. vol. 80, no. 11, pp. 1832–1841, 2017.
[19] K. H. Cho, Y. A. Pachepsky, D. M. Oliver, R. W. Muirhead, Y. Park, R. S. Quilliam, and derived microorganisms at the watershed scale: Water Res. vol. 100, pp. 38–56, 2016.
[20] Agric Water Manag. vol. 70, no. 1, pp. 1–17, 2004.
[21] B. R. Bicknell, J. C. Imhoff, J. L. Kittle Jr., A. S. Donigan Jr., and R. C. Johanson, EPA no. September, 1996.
[22] University o IFAS Extension.
[23] Natural Resources Conservation Service, United States Department of Agriculture 2015.
[24] D. Chin and D. Sakura scale fate and transport Trans. vol. 52, no. 1, pp. 145–154, 2009.
[25] E. M. Troyer, S. E. Cameron Devitt, M. E. Sunquist, V. R. Goswami, and M. K. Oli, population growth r J. Mammal. vol. 95, no. 2, pp. 421–430, 2014.
[26] USDA Nat. Resour. Conserv. Serv. Conserv. Engineering Div. Tech. Release 55 p. 164, 1986.
[27] J. E. Nash and J. V. Sutcliff J. Hydrol. vol. 10, no. 3, pp. 282–290, 1970.
[28] Math Comput. Simul. vol. 55, no. 1–3, pp. 271–280, 2001.
[29] Technometrics vol. 33, no. 2, pp. 161–174, 1991.
[30] P. Hamilton, B. Jones, J. Largier, C. McGee, M. Noble, K. Orzech, G. Robertson, L. Environ. Prot. Agency p. 342, 2004.
[31] A. Ritter and R. Muñoz Statistical significance for reducing subjectivity in goodness of J. Hydrol. vol. 480, pp. 33–45, 2013.
[32] R. D. Harmel, P. K. Smith, and K. W. Migliaccio, of fit indicators to incorporate both measurement and model uncertainty in model calibration and Trans. ASABE vol. 53, no. 1, pp. 55–63, 2010.
[33] Uncertain. Manag. Simulation Optimization Complex Syst. Algorithms Appl. no. around 30, p. 23, 2014.
[34] European Commission
[35] Off Rates in Urban 18, 2005.
[36] C. Yu, B. Gao, and R. Muñoz J. Hydrol. vol. 434–435, pp. 1–6, Apr. 2012.
[37] R. Muñoz-Carpena, J. E. Parsons, and J. W. J. Hydrol. vol. 214, no. 1–4, pp. 111–129, 1999.
[38] I. D. Parker, R. R. Lopez, R. Padia, M. Gallagher, R. Karthikeyan, J. C. Cathey, N. J. f free ranging mammals in the deposition of Escherichia Wildl. Res. vol. 40, no. 7, pp. 570–577, 2013.
[39] FORTRAN and Bacterial Source Tracking for Development of Fecal Coliform Total Maximum Daily U.S. Dep. Inter. U.S. Geol. Surv. 2003.
[40] Situ Die Off of Indicator Bacteria and Pathogen Univ. Alabama Birmingham pp. 1–4, 1999.
[41] 2017.

BIOGRAPHICAL SKETCH

After completing an a Community College, Kathleen Vazquez graduated from th e ngineering in 2015. In fall of 2017, she completed her Master of Science degree at UF and will continue in a doctoral program in the Agricultural and Biological Engineering Department.