
- Permanent Link:
- http://ufdc.ufl.edu/UF00053817/00001
## Material Information
- Title:
- Analysis and interpretation of on-farm experimentation
- Series Title:
- FSR/E training units; Participation manual volume 3
- Creator:
- Caldwell, John S.
- Taylor, Dan
- Walecka, Lisette
- Farming Systems Support Project
- Affiliation:
- University of Florida -- Farming Systems Support Project -- Institute of Food and Agricultural Sciences
- Place of Publication:
- Gainesville, Fla.
- Publisher:
- Farming Systems Support Project, University of Florida
- Publication Date:
- 1987
- Language:
- English
- Physical Description:
- xi, 391, 51 p. ill. ; 28 cm.
## Subjects
- Subjects / Keywords:
- Agriculture ( LCSH )
- Farm life ( LCSH )
- Farming ( LCSH )
- University of Florida. ( LCSH )
- Agriculture -- Research -- On-farm ( LCSH )
- Farms -- Research ( LCSH )
- Spatial Coverage:
- North America -- United States of America -- Florida
## Notes
- Funding:
- Electronic resources created as part of a prototype UF Institutional Repository and Faculty Papers project by the University of Florida.
## Record Information
- Source Institution:
- University of Florida
- Holding Location:
- University of Florida
- Rights Management:
- The University of Florida George A. Smathers Libraries respect the intellectual property rights of others and do not claim any copyright interest in this item. This item may be protected by copyright but is made available here under a claim of fair use (17 U.S.C. §107) for non-profit research and educational purposes. Users of this work have responsibility for determining copyright status prior to reusing, publishing or reproducing this item for purposes other than what is allowed by fair use or other copyright exemptions. Any reuse of this item in excess of fair use or other copyright exemptions requires permission of the copyright holder. The Smathers Libraries would like to learn more about this item and invite individuals or organizations to contact Digital Services (UFDC@uflib.ufl.edu) with any additional information they can provide.
- Resource Identifier:
- 17763082 ( OCLC )

ANALYSIS AND INTERPRETATION OF ON-FARM EXPERIMENTATION

FSR/E TRAINING UNITS: VOLUME III

Prepared By:
Farming Systems Support Project
International Programs
Institute of Food and Agricultural Sciences
University of Florida
Gainesville, Florida 32611

Technical Editor: John Caldwell, Virginia Polytechnic Institute
Technical Editor, Economics Section: Dan Taylor, Virginia Polytechnic Institute
Coordinating Editor: Lisette Walecka, University of Florida

DECEMBER, 1987

The Farming Systems Support Project (FSSP) is a cooperative agreement between the University of Florida and the United States Agency for International Development, Cooperative Agreement No. DAN-4099-A-00-2083-000, Project number 936-4099.

VOLUME III: ANALYSIS AND INTERPRETATION OF ON-FARM TRIALS

TABLE OF CONTENTS

PREFACE
ACKNOWLEDGEMENTS
VOLUME III: ORGANIZATION OF MANUAL
Unit I: A Framework for Analysis
Unit II: Looking at Data Sets
(II,A) What is a Data Set and What Can it Do?
(II,B) From Domain to Trial Back to Domain: Samples, Populations, and Statistical Inference
(II,C) Statistical Notation
(II,D) Looking at Data: Techniques for Summarizing and Describing Data
Unit III: Analysis Techniques
(III,A,0) Qualitative Effects of Two Treatments: Paired and Unpaired t-tests
(III,A,1) Principles and Procedures of Analysis of Variance (ANOVA) for Site-Specific Simple Designs
(III,A,2) Principles and Procedures of Analysis of Variance (ANOVA) for Designs Used in Regional Trials
(III,A,3) Determining Which Treatments are Different
(III,A,3,a) Unplanned Comparisons: Means Separation Techniques
(III,A,3,b) Planned Comparisons of Factorials
(III,A,3,c) Other Planned Comparisons: Single Degree of Freedom Contrasts
(III,A,3,d) Confounding and Fractional Replication
(III,A,4) A Way to Handle "Damaged" Data: Analysis of Covariance
(III,A,5) Partial Budgeting
(III,A,6) Sensitivity Analysis
(III,A,7) Analysis of On-Farm Trial Results for Consumption/Nutrition
(III,B,1) Linear Regression and the Correlation Coefficient
(III,B,2) Modified Stability Analysis
(III,B,3) Response Surface Analysis
(III,C) Analysis of Alternative Enterprises
Unit IV: Integrated Interpretation of Trial Results
GLOSSARY
LIST OF RESOURCES
STATISTICAL TABLES

PREFACE

One of the major objectives of the Farming Systems Support Project is to provide training and support for training activities in FSR/E methodology. This collection of training units has been produced in response to an absence of available training materials which could be used in training practitioners in the skills necessary for implementing the FSR/E approach to agricultural development. The development, testing, review, and revision process has been rapid due to the demand for training materials and the limited time remaining for the project. These training volumes are not error free.
We encourage your scrutiny. As you work with the training manuals, if you have comments, additions, adaptations, corrections, or suggestions, please let us know.

This collection of training units is not a course. Rather, it is a set of resources which supports FSR/E courses. It is an attempt to provide the trainer and practitioner trainee with a wide variety of resources for teaching and learning specific content and skills needed for implementing FSR/E successfully.

Volume I, Diagnosis in FSR/E, contains nine units for introducing trainees to various diagnostic steps in the FSR/E approach. It stresses, but is not limited to, initial diagnosis. Volume One also contains units which detail on-going, or continuous, diagnosis throughout the FSR/E process. Links between social and biological science disciplines are stressed, as are considerations of intra-household and socio-cultural issues. The final unit focuses on problem identification and prioritization, a step leading toward appropriate trial design.

Volume II, Design Techniques for On-Farm Experimentation, contains five units which detail the farm trial design, layout, and management process. The links between biological and social science disciplines in on-farm research are considered, and, like Volume One, intra-household and socio-cultural issues are addressed. This volume begins by focusing on the importance of establishing clear evaluation criteria before designing a trial and culminates with a discussion of practical implications of managing on-farm experimentation.

Volume III, Analysis and Interpretation of On-Farm Experimentation, establishes a framework for analysis, reinforcing the importance of establishing evaluation criteria early in the design process. It provides basic statistical and analytical techniques useful for on-farm experimentation and ties all volumes together by introducing the concept of integrated analysis.
A Trainer's Manual accompanies these three volumes and provides notes for the trainer on the variety of activities presented in each volume. One of the objectives of this series of training materials is to provide participatory activities which will involve the participant directly in the training in a "hands-on" fashion. It emphasises group discussions and role play as well as other types of activities.

This is the second edition of the training units, and the revisions which were made were based on comments from a variety of sources, including specific reviewers, participants in short courses, individual users of the training manuals, and others. We have tried to address the majority of the concerns voiced. Major revisions included integration of livestock issues and expansion of economic analysis material. Emphasis has also been placed on presenting material in a "how to do" fashion.

The units have not been developed to be exhaustive texts on the topics presented. Rather, they have been developed to convey basic information in a format as complete and concise as possible. It is our hope that both trainers and trainees will search out more information on specific topics covered in the training units.

The learning objectives and key points focus on the essence of the unit or section. A common glossary gives all the definitions in one place, since many terms are used in more than one unit or section. Many units are divided into sub-units, sections and sub-sections, each with its own set of learning objectives, key points, list of terms, and discussion. Suggested learning activities accompany the units or sections, and each activity has separate instructions for trainers.

The units are not thought to be the "final word." Rather, they have been developed as a foundation for developing training units in FSR/E. Your comments, adaptations, additions, and suggested activities are welcomed and encouraged.
The best measure of the usefulness of a product comes from those who use the product. The best way to improve a product is to listen to the users. At the end of this introduction you will find a one page evaluation sheet. We hope that you will use this form to send us your comments. This is not meant to limit your comments (and we encourage detailed comments) but rather to encourage you, the user, to let us know what you think and suggest.

ACKNOWLEDGEMENTS

Throughout the development process of the FSR/E training units, from the planning, writing, initial editing, reviewing, testing, and revising to the final production and second edition, many individuals have been involved. FSSP would like to acknowledge their efforts. The individuals are listed below with their affiliations at the time of their participation.

Technical Editors:

Volume I:
Tim Frankenberger, University of Arizona
Steve Franzel, Development Alternatives, Inc.
Malcolm Odell, Synergy International
Marcia Odell, Synergy International

Volume II:
John Caldwell, Virginia Polytechnic Institute

Volume III:
John Caldwell, Virginia Polytechnic Institute

Volume III, Economic Analysis sections:
Dan Taylor, Virginia Polytechnic Institute

Initial Planning:
Emanuel Acquah, University of Maryland, Eastern Shore
Lorna Butler, Washington State University
Steve Franzel, Development Alternatives, Inc.
Dan Galt, University of Florida, FSSP
James Jones, University of Florida, FSSP
Susan Poats, University of Florida, FSSP
Federico Poey, Agricultural Development Consultants, Inc.
Lisette Walecka, University of Florida, FSSP

Authors:
Jay Artis, Michigan State University
Emanuel Acquah, University of Maryland, Eastern Shore
Kenneth Buhr, University of Florida
Lorna Butler, Washington State University
John Caldwell, Virginia Polytechnic Institute
Cornelia Flora, Kansas State University
Steve Franzel, Development Alternatives, Inc.
Dan Galt, University of Florida, FSSP
Martha Gaudreau, University of Minnesota
John Hammerton, Caribbean Agricultural Research and Development Institute (CARDI)
James Jones, University of Florida, FSSP
Kenneth McDermott, University of Florida, FSSP
James Meiman, Colorado State University
Malcolm Odell, Synergy International
Ramiro Ortiz, Agricultural Development Consultants, Inc.
Donald Osburn, USAID/Washington
Susan Poats, University of Florida, FSSP
Kenneth Sayre, International Agricultural Development Service (IADS)
Jerry Van Sant, Development Alternatives, Inc.
Robert Waugh, Colorado State University
Peter Wotoweic, University of Florida
Peter Hildebrand, University of Florida
Dan Taylor, Virginia Polytechnic Institute
Henk Knipscheer, Winrock International
Al Hagan, University of Missouri
Don Osburn, U.S. Agency for International Development
Marilynn Prehm, Virginia Polytechnic Institute
John Lichte, University of Florida
Jim Oxley, Colorado State University
Mark Kujawa, Colorado State University
John Russell, University of Florida

Contributors:
Ron Knapp, Centro Internacional de Mejoramiento de Maiz y Trigo (CIMMYT)
Dan Minnick, International Rice Research Institute (IRRI)
Robert Tripp, CIMMYT
Janis Timberlake, Virginia Polytechnic Institute
Clive Lightfoot, Cornell University
Ly Tung, Visayas State College of Agriculture, Philippines

Training Consultants:
Kathy Alison, Office of International Cooperation and Development (OICD), USDA
Peg Hively, Office of International Cooperation and Development (OICD), USDA

The FSSP would like to thank the CIMMYT Economics Program and CARDI for permission to include portions of their work in economic analysis and on-farm experimental design, respectively.

Reviewers:

The draft edition of Volume Two, Techniques for Design and Analysis of On-Farm Experimentation, was used for the first time in the FSSP/Gambia Agricultural Diversification workshop on On-Farm Experimentation in May, 1985.
Parts of Volume One, Diagnosis in FSR/E, were used for the first time in the Jamaica Farming Systems Research Workshop, June, 1985. Feedback received during this initial testing was used, along with other feedback, in the revising effort.

Richard Bernsten, Michigan State University, presented the FSSP training units for review at the "Farming Systems Research Socio-Economics Monitoring Tour/Workshop," held September 16-28, 1985, at IRRI, Los Banos, Philippines, at the request of Marlin Van Der Veen, IRRI. Comments from that session, as well as detailed comments by Richard Bernsten, were very useful in revising both volumes. Susan Almy, Rockefeller Foundation, also provided very detailed comments. Additional review comments were made by Peter Hildebrand, University of Florida. Martha Gaudreau, University of Minnesota, played an important role in the revision of the Diagnostic Unit. Klaus Hinkelmann, Virginia Polytechnic Institute, provided valuable consultation on some statistical aspects of the units.

Specific reviewers for the volumes included Hal McArthur (University of Hawaii), Roque de Pedro (Visayas State College of Agriculture, The Philippines), Cornelia Flora (Kansas State University), John Lichte (University of Florida), and Eric Crawford (Michigan State University). Valuable comments were also offered by Janis Jiggins and Federico Poey (AGRIDEC).

The FSSP acknowledges the above contributions and those of others who may have been inadvertently omitted.

I would like to gratefully acknowledge the patience, hard work, and general support of the FSSP secretaries, Lana Bayles, Shirlene Washington, and Jack Weiss, throughout the training unit development process. I would also like to thank Donna Long, secretary senior at Virginia Polytechnic Institute, for her valuable and patient assistance throughout the revision process.
Lisette Walecka
Coordinating Editor
December, 1987

ADDITIONAL TRAINING MATERIALS OF INTEREST

The statistical interpretations and explanations in volume II of this series are based on the statistical tables in Rohlf and Sokal (1969); other tables may be slightly different. We recommend:

F. James Rohlf and Robert R. Sokal, State University of New York at Stony Brook, 1969, Statistical Tables, W. H. Freeman and Company, San Francisco.
CARDI, April, 1984, "On-farm Experimentation: A Manual of Suggested Experimental Procedures."
CIMMYT, revised November, 1985, "Introduction to Economic Analysis of On-Farm Experiments," Draft Workbook, CIMMYT Economics Program.
FSSP, 1985, "Bibliography of Readings in Farming Systems, Volume 1."
Poey, F. et al., 1985, "Anatomy of On-Farm Trials: A Case Study From Paraguay," FSSP.
Hildebrand, P. and F. Poey, 1985, "On-Farm Agronomic Trials in Farming Systems Research and Extension," Lynne Rienner Publishers, Inc., Boulder, Colorado.

The FSSP has developed and initially tested a case study based on Dominican Republic data from the Las Cuevas region which gives trainees the opportunity to interview farmers and develop research priorities.

Intra-Household Dynamics and Farming Systems Research and Extension: Case Studies in Agricultural Development.

The Population Council and the Farming Systems Support Project have developed a set of seven teaching cases which directly address the relationship between an understanding of intra-household dynamics and the design and extension of new technologies for improving farm production. Each case, in two or three sequenced sections, provides trainees with information drawn from actual project experience with which they can analyze the relationship of gender roles and intra-household dynamics to the farming system and make decisions about future project activities.
The seven cases, described below, are accompanied by background papers, a conceptual framework for analyzing the cases, guidelines for studying a case, and teaching guidelines.

ZAMBIA

Based on the work of the Adaptive Research Planning Team in Central Province, Zambia, the material includes initial diagnostic surveys, a labor survey, on-farm trial protocols and results, and special studies on decision making and female headed households. It is a good beginning case and can be used alone or with other cases for either short term training or a longer term classroom situation.

BURKINA FASO

Improvement in the production of staple cereals and other crops was the objective of the Purdue University and the Semi-Arid Food Grain Research and Development project (SAFGRAD) in three villages of the Mossi plateau of Burkina Faso. The case includes initial diagnosis, the results of three years of on-farm trials, and labor studies. This case is particularly suitable for a longer term training situation and for audiences with technical interests.

COLOMBIA

This case covers eight months of an on-farm testing project for varietal and fertilizer technology components conducted by the International Center for Tropical Agriculture (CIAT) and the International Fertilizer Development Center (IFDC) in Pescador, Colombia. The material includes a description of the composition and objectives of the multidisciplinary research team, successive stages of information generated to design and evaluate the experimentation phase, the design of on-farm trials, and the generation of additional information regarding women's activities related to production and consumption in the farming system. This case works well in both short and long term training and with general and technical audiences. It is particularly useful for looking at different disciplinary perspectives towards technology design, innovative approaches in diagnostic research, and the inclusion of consumption considerations.

ST. LUCIA

This case describes diagnostic surveys and a proposed intervention undertaken in the Mabouya Valley of St. Lucia by the Caribbean Agricultural Research and Development Institute (CARDI). The area is dominated by plantation agriculture on the valley floor and small farms and subsistence farms at higher elevations. Seasonal and long term migration of males is characteristic. The three parts (the diagnostic surveys, case profiles, and proposed interventions) may be used in several ways in both short and long term training. It is best used as a second case in a series of cases.

KENYA

This case describes an agroforestry research and extension project undertaken by a non-government organization, CARE/Kenya, with assistance from the International Center for Research in Agroforestry (ICRAF) in the Western Province of Kenya. Diagnostic and extension activities are done with groups and individual farm households. Material includes initial diagnosis, the training for and methodology used by field personnel to ensure that both women and men were included, the results of formal trials and further research, and on-farm design activity. This case is particularly suited for looking at methodologies for working with groups and for applying benefits analysis to technology choice. It is suitable for both short term and long term teaching situations.

INDONESIA

The primary objective of TROPSOIL's multidisciplinary team is the development of techniques for soil management in Sitiung, a transmigration site in Sumatra which includes migrants from Java as well as indigenous peoples. The case includes technical information on soils and forages, procedures and results of the initial sondeo, on-farm trials, time allocation studies, nutrition and income studies, and forage trials. Both ethnic and gender differences influence farmer preferences and technological possibilities. This case is particularly rich and is best used in a long term training situation.
BOTSWANA

This case depicts a project to improve arable production in the Mahalapye District of Botswana, an area with low and erratic rainfall, an economy dominated by cattle, and a high percentage of female headed households. Included are a summary of the technical and socio-economic research during the first three years of the project, with increasing specification of household characteristics and dynamics, the fourth season's trials and farmer evaluations, and additional diagnostic work targeted on poorer, predominantly female households. This case is best used in a longer training situation.

MATERIALS AND SERVICES AVAILABLE

Written Materials: For use by trainers and trainees or for self study.

Volume I: Case material
Background articles on Gender Roles and Farming Systems and on Farming Systems Research;
Conceptual Framework for analyzing household dynamics and farming systems;
Introduction to the Case Study Method; and
Individual Case Studies

Volume II: Analysis and Teaching Notes
Teaching by the Case Study Method and examples;
Best uses for each case; and
Analysis and Teaching Notes for individual cases.

Services
Experienced consultants for training, case writing, or project assistance
One day or two day pre-conference workshops
One week course on Intra-household Dynamics and Agricultural Research and Extension
One week course on developing own case materials
Training of trainers

For more information contact:

Hilary Sims Feldstein, Managing Editor
RFD 1 Box 821
Hancock, New Hampshire 03449
603-525-3772

Dr. Susan Poats
FSSP/University of Florida
3028 McCarty Hall
Gainesville, Florida 32611

EVALUATION (FEEDBACK) TRAINEE

Your comments are encouraged. Please feel free to write your comments and send them to the FSSP at the address listed on the back of this form. Being specific about the unit, sub-unit or sections which you are discussing will assist us in our efforts to provide quality materials.

(optional) NAME:
DATE:
LOCATION:

1. How did you find the units most/least useful?
most:
least:
2. How was the content most useful? relevant?
3. Was the level of presentation appropriate?
4. Was the volume organized appropriately?
5. In future editions, what would you want to see added? expanded? shortened? omitted?
6. How useful were the existing activities provided in the unit?

PLEASE MAKE ANY ADDITIONAL COMMENTS OR SUGGESTIONS. THANK YOU!

INTERNATIONAL PROGRAMS
FSSP (TRAINING UNITS)
3028 MCCARTY HALL
UNIVERSITY OF FLORIDA
GAINESVILLE, FLORIDA 32611

VOLUME III: ORGANIZATION OF MANUAL

This volume presents a general framework for analysis and techniques useful for the analysis and interpretation of on-farm experimentation. Analysis does not begin after design. Rather, as the other two volumes have indicated, plans for analysis must begin very early in the process, and continue to guide you through the planning, design and experimentation process. The ultimate success of the alternative technology is its acceptance and adoption by the farmer. These units provide FSR/E team members with a basic understanding of appropriate techniques and of some underlying concepts of diagnosis.

Unit I. A Framework for Analysis

Establishing a framework for analysis requires planning and should be linked closely with the evaluation criteria and the design of the trial. This unit helps the practitioner to focus on choosing specific analytical tools and on determining the impact of proposed technologies. The ultimate evaluation is acceptability by farmers.

Unit II. Looking at Data Sets

Being able to assess data sets as a preliminary step in analysis is extremely important. This unit will help practitioners to preview the data, make statistical inferences, and summarize and describe data.

Unit III. Analysis Techniques

There are various techniques for analysis. Which one is appropriate depends on the design used.
Interpretation of results includes biological, economic, and social interpretation. Analysis and interpretation help the team and farm households make better decisions for future activities. This unit presents in detail a variety of analysis techniques available to the FSR/E practitioner.

VOLUME III, UNIT I: A FRAMEWORK FOR ANALYSIS

OUTLINE

1. Analysis in FSR/E: An Overview
2. Establishing a Framework for Analysis of On-Farm Experimentation
3. Possible Types of Biological Analysis
4. A Comparison of Two Tools: ANOVA and MSA
5. Choosing Appropriate Economic Analysis Techniques
6. Determining Impact of Proposed Technologies on Other Household Activities
7. Evaluation of Acceptability by Farmers

PARTICIPANT LEVEL

Agricultural research assistant
Extension technology verification assistant

LEARNING OBJECTIVES

After completing this section, participants will be able to:

1. Explain how analysis fits into the process of FSR/E and planning for related activities that support agricultural production.
2. Identify types of analytical tools and the criteria for choosing tools.
3. Explain why it is important to consider risk when evaluating alternatives to current farming practices and discuss several factors influencing variability in yields, costs, prices and farmer practices which influence risk.
4. Explain why economic analysis of on-farm research is an essential complement to biological analysis and choose appropriate economic analysis techniques for evaluating technologies.
5. Discuss important considerations in selecting the individual farmer, the farm-household or other groupings of people as units of analysis.
6. Explain how and why social science perspectives are necessary in the interpretation and evaluation of the results of on-farm trials.
7. Explain why it is essential to incorporate the farmer explicitly into the process of technology evaluation and what methods can be used to achieve this goal.

KEY POINTS

1. Analysis occurs at a point where FSR/E again considers the whole system, and the many different factors in the system which affect acceptability of the specific technology tested in the on-farm trials from which the data are taken.
2. In Farming Systems Research and Extension, the farm household is usually considered the most relevant unit of analysis, with additional attention given to sub-units, including individual household members, and to links between households and others outside these units.
3. Choice of analysis tools affects what type of integrated interpretation is possible, and knowing what type of integrated interpretation is desired can affect the choice of analysis tools.
4. Analysis is not only an assessment of trial results, but also a reprioritization and planning for next year's FSR/E trials and related activities in on-station back-up research, extension, and policy recommendations.
5. In the long run, we must remember that farmers, not plants or animals, adapt and adopt new technologies. FSR/E practitioners must learn to plan, view and evaluate their work from the farmer's perspective, to "see the world through their eyes." It is critical that researchers make the maximum effort to provide technologies which are best suited to the conditions of the farmers.
6. By actively involving farmers at each step of the process (designing, testing, and evaluating alternative solutions to problems), FSR/E practitioners will better understand farmers' perspectives on proposed technological improvements.
TERMS

acceptability index
active evaluation
beneficiaries
commercial
cost
decision-makers
diffusion domain
directed survey
economics
economic
economic analysis
enterprise
farm household
farm enterprise
farmer environment
farmer feedback
household
intra-household
inter-household
investors
market production
net income
passive evaluation
price
recommendation domain
resources
risk
role
stakeholders
subsistence
subsistence production
unit of analysis

DISCUSSION

ANALYSIS IN FSR/E: AN OVERVIEW

The previous training volume in this series focused on how to design and implement on-farm experiments. Volume III now provides the skills necessary to analyse and interpret the results from on-farm experimentation. Though analysis and interpretation are done after a trial is completed, the planning for analysis takes place before the trial is implemented, during the same time that the FSR/E team is designing the layout for on-farm trials. Unit I.C of Volume II discusses how to plan for the evaluation of alternative technologies. It is useful to review this unit prior to continuing your work with Volume III.

In moving towards conducting the analysis of on-farm experiments, it is especially important to keep in mind the seven key questions for determining farmer acceptability. Now, after the trial is concluded, these can be slightly rephrased in the following way to begin the process of analysis:

1. Was the problem to be solved important to farmers?
2. Did farmers understand the trial?
3. Do farmers have the time, inputs, and labor needed by the new technology?
4. Does the proposed technology make sense within the present farming system?
5. Is the mood favorable for investing in new technology in the region?
6. Is the proposed change compatible with local preferences, beliefs, or community sanctions?
7. Do farmers believe the technology will hold up over the long term?
If the team has been able to develop a good rapport with the cooperating farmers and the local community over the course of conducting the on-farm trial, then many of these questions will already have been answered by the time the trial is completed. Sometimes it is necessary to go back to the farmer community after the trial is over to explore further the possible answers to the questions. Diagnostic tools (informal surveys, questionnaires, key informant interviews, group interviews) can be used to help clarify how farmers reacted to a trial or what potential a new technology has in a particular region.

In beginning the process of analysis and interpretation, it is also necessary to examine whether all of the potential stakeholders were involved in the trial. Were women as well as men included as cooperating farmers in the on-farm research? How were they involved? Do their opinions differ from men's on the outcomes of the trial? What incentives exist to encourage men or women to adopt the technology proposed by the trial?

Finally, in constructing a framework for analysis, as outlined in the following sub-section, it is necessary to return to the criteria used by the team to order and select priority problems for on-farm research (see I:XI). Were the criteria used for selecting priority problems and screening possible solutions appropriate? How does the team regard these criteria now that the trial is completed? Would other criteria have been more useful?

These three sets of criteria (the key questions, the stakeholder examination, and the original screening criteria) help to construct an analytical framework to interpret and evaluate the results from on-farm experimentation. We will return to these in Unit IV of this Volume.

Before we can use these sets of criteria, we need to thoroughly examine and analyze the on-farm trial results or data. Data are the numbers and sets of observations and information generated from an on-farm trial or trials.
Field books, at the end of a trial, are full of data collected during the trial. All of the data must be organized so that they can be useful in understanding what happened during the trial. Unit II in this Volume explains how to organize, summarize and describe data for analysis.

Once the data are organized, the process of analysis begins. At this point, though the team has been working in an interdisciplinary mode, it is necessary to draw upon specific disciplinary tools to conduct the analysis. This does not mean that only those persons from the specific discipline can conduct the analysis. Rather, it implies that a series of discrete disciplinary tools must be applied to generate a set of analytical responses before the team can consider all of the responses together within an interdisciplinary framework for final analysis and interpretation. It may be easier to understand this process by using the diagram in Figure III:I.1 below.

[Figure III:I.1. DATA feed into separate AGRONOMIC, ECONOMIC, SOCIAL/CULTURAL, NUTRITION, ANIMAL SCIENCE, FARMER, and POLICY interpretations, which are then brought together in INTEGRATED INTERPRETATION.]

As shown in the diagram, the data are submitted to a series of analytical tools or tests that are drawn from various appropriate disciplines. Depending upon the nature of the problem, the type of trial design, and the kind of data generated, other disciplinary tools or analytical measures may be applied. Each category of analysis represents a series of possible tools that can be used. For example, the category ECONOMIC includes such tools as partial budgeting, sensitivity analysis, or break-even analysis. The AGRONOMIC category includes two sets of tools: those that comprise ANOVA techniques and those that lead to Modified Stability Analysis (MSA). (See the comparison of these tools in sub-section 4 in this discussion.)
SOCIAL analysis may include a variety of qualitative assessments of the data, in addition to quantitative calculations. These will certainly include analysis of the data on labor and time allocations measured in the trials, or analysis of intra- and inter-household responses to changes required by the technology proposed in the trial. NUTRITION may include consideration of the "consumability" of the products of the trial. Did farmers and consumers like the taste of the new varieties? Which varieties have better nutritional features? Under the FARMER category, members of the team may wish to involve the cooperating farmers in the analysis of the various results of the trial. Were certain factors more visibly beneficial from farmers' viewpoints? Why did farmers prefer certain parts of a new technology and not others?

Each of the individual tools utilized from any category of analysis will be followed by interpretation of the results of that particular analytical technique. Once the interpretation of the individual tools or techniques is completed, all of the analysis is brought together in a process we have recently decided to call "INTEGRATED INTERPRETATION". This is a team process requiring equal participation of all team members. The process is best thought of as a time to put all the tool-specific results, analyses, and interpretations on the table, in front of the team, and determine together what the next steps will be.

Will a trial need to be repeated? Is other information needed? Did new research problems arise as a result of the trial? Did farmers point out problems in the course of cooperation that were not identified before the trial began? Did the technology "not work"? Are there new problems that require some basic "on-station" research before proceeding again with on-farm research? Did a new problem arise requiring disciplinary expertise other than that represented on the team? Is there a need for a new team member?
Does the tested technology meet agronomic requirements but lack required social or nutritional features? Does the technology require additional labor which cannot be met by either the farm family or the limited hired labor available locally? These are the kinds of questions which will need to be answered during the integrated interpretation of the individual trial results.

The remaining sections of this unit introduce certain specific considerations about the sets of analytical tools and the framework needed to provide an integrated interpretation of the results provided by analysis. These serve as an introduction to the major section of the Volume, Unit III, which covers the skills needed in order to conduct the most frequently used types of analysis in on-farm experimentation. The Volume concludes with Unit IV, which covers the process of integrated interpretation by the entire team.

2. ESTABLISHING A FRAMEWORK FOR ANALYSIS OF ON-FARM EXPERIMENTS

a. How analysis fits into the FSR/E process

FSR/E is a systematic method of linking research and extension. This method addresses two important problems of the linkage between research and extension:

1. Problems of unexpected events: recommendations from research for extension do not work as we expect, because of different results when farmers use the recommendations (or when farmers do not use the recommendations at all!)
2. Problems of prioritization: there are many more needs for agricultural research than we can meet with limited budget, personnel, and time.

Analysis itself is not the objective of FSR/E. Rather, analysis is part of a process, with the objective of prediction and explanation of acceptability of new technology by farm households. In order for analysis to meet this objective, it is useful to look more closely at how it fits into the whole FSR/E process.

The process of FSR/E can be depicted as a long funnel with a series of screens. Figure III:I.2 shows this depiction.
The figure is organized in two dimensions:

1. Horizontally: The figure shows a timeline moving through the general sequence of activities in FSR/E. The top-most labels under the heading of "Activity" proceed, from left to right, along the timeline: "Diagnosis" to "Design" to "Iterative diagnosis and design" to "Extension".
2. Vertically: The figure shows different types of persons involved in the FSR/E process. The labels to the far left under the heading of "Affiliation" give, from top to bottom, the different types of people who contribute and interact in the activities of FSR/E that occur across the timeline. The order from top to bottom begins with greatest involvement in assessment of acceptability:

"Farm households"
"FSR/E team"
"Extension"
"Backup Department of Agriculture/University specialists" (*)
"Policy makers"

* (NOTE: "Department of Agriculture," abbreviated as "DA," is used for convenience only; the actual name of the part of the government responsible for agricultural research outside of universities varies from country to country.)

[Figure III:I.2. The FSR/E process depicted as a funnel with a series of screens. Activities run across the top; affiliations (farm households, FSR/E team, extension, backup DA/university specialists, policy makers) run down the left side.]

The specific contributions and interactions of the different types of people shown vertically at each point in the horizontal timeline sequence fill in the diagram from left to right.

The degree of focus of these contributions and interactions changes as the diagram moves from left to right. This is shown in two ways in the diagram:

1. The funnel drawn across the diagram;
2. The bottom-most phrases, moving across from the heading at the far left bottom, "Systems focus."

The changes in degree of focus shown in these two ways in the diagram can be summarized as follows. References to specific training volumes and units are included.

The funnel starts out wide at the left, with many problems considered in initial diagnosis from a broad, holistic perspective. These techniques are covered in Volume I.

The funnel narrows as problems are put through prioritization "screens" for design. The screens are based on:

1. Ranking of importance by farm households; and
2. Possibility of solution using ideas from farm household members, the FSR/E team, extension personnel, and backup specialists.

Prioritization is presented in Unit I:IX in Volume I.

Problems that are eliminated from the design of on-farm trials (within the narrow funnel) may be referred to others with linkages to the FSR/E process, such as backup specialists for station research or policy makers who have the power to reduce constraints "outside the farm gate" that limit on-farm research options. These linkages are discussed in Units II:I,B and II:II,A.

The funnel closes when a design priority has been chosen as the intervention focus of the trials. Unit II:I,A presents various ways to classify the resulting design priority.

The funnel remains closed as design and testing activities for the design priority proceed, from left to right through the funnel, from "cooperator selection" all the way to "data collection." Volume II presents techniques for these activities.

A key element in the process within the narrow funnel is the selection of criteria for data collection, at the time of the design survey. These criteria determine what data will be available for analysis. Unit II:I,C presents techniques for selection of these criteria.

The data analysis activities covered in the present volume are at the point where the funnel opens again.
The funnel opens because many different factors go into farm household members' evaluation of the acceptability of the specific new technology compared in the trials. Together with the cooperating farm households, the FSR/E team acts as a "proxy" for all the farm households in the domain, to assess potential acceptability in the domain as a whole. For this purpose, the data from the trial are put through another series of "screens": the various types of biological and socio-economic analysis that represent the different factors affecting acceptability. Unit III:III of this volume presents the tools for these different specific analyses.

At this point, where the funnel begins to open for analysis of data from on-farm trials, we need to consider two questions:

1. What are the different tools of analysis that are available to assess all the different factors affecting acceptability, and which do we use?
2. How can we do an integrated interpretation of the results of several different individual analyses, and for what purpose?

These two questions are closely related. Choice of analysis tools affects what type of integrated interpretation is possible. Conversely, knowing what type of integrated interpretation we want to make, and for what purpose, can lead us to choose one tool of analysis over another, or add another tool of analysis to those we initially might have considered. Establishing a framework for analysis allows us to choose the right tools for the desired integrated interpretation.

b. Types of analysis in FSR/E

At this point, let's expand the portion of the funnel between implementation and data. We will use a concrete example to illustrate both the types of data that a trial can produce, and the corresponding analysis types. Figure III:I.3 shows a trial comparing natural fallow with planted fallows (the interventions) prior to planting of rainfed corn on shifting cultivation land.

The figure consists of three parts.

1. The center top part, labelled at the upper left as "Trial activity." This part consists of a matrix, with a series of vertically arranged rows consisting of farms and treatments, labelled in the first two columns of the center top part. Horizontally, labelled across the top, is the timeline of columns of implementation activities. The intersections of the columns in the timeline and the rows of treatments at each farm represent plots (in this example, or animals if this were an animal trial) where the activities are done, and where data are generated.
2. The middle part, labelled "Data type." This shows the different types of data generated:
   a. As inputs into the trial (labor, purchased inputs), and factors affecting those inputs (gender, seasonality, price variation);
   b. As dependent variables affected by the trial treatments, such as percent fallow cover or soil fertility at different points in time;
   c. As outputs of the trial, both direct (such as corn yield) and indirect (such as income or secondary products), and aspects of the output (who receives the benefits, price changes affecting the benefits).
3. The bottom part, labelled "Analysis type," shows different types of analysis procedures for each type of data, and among some types of data over time.

c. Choosing analysis tools: identifying uses for the results

Figure III:I.3 presents many types of analysis. Knowing how to choose among these is essential. There are several points a team can consider to help choose the right tools for each set of trials.

1. Who will use the results of the trial analysis?

The various people shown on the left-hand side, under "Affiliation," in the funnel diagram are all potential users of the results. The relative importance of the results to these different users may vary.

2. What is the basis of the trial?

Different types of on-farm trials in FSR/E have different objectives.
These objectives depend, in part, on the basis of the trial, as discussed in detail in II:I,A:

(a) Use of product (home, sale, food, feed)
(b) Scale of production (commercial vs. garden)
(c) Basis of the agricultural production system (crop, animal, mixed, monocrop, intercrop, relay, etc.)
(d) Basis of the priority problem (variety, breeds, cultural practices, nutrition, feed resources, etc.)

3. What is the function of the trial in the FSR/E process?

Unit II:I,B presents a 3-step sequence of on-farm trials: exploratory, refinement, and validation. These differ in terms of researcher-farmer management sharing. This also affects trial objectives and what types of data analysis are needed.

Exploratory and refinement on-farm trials are more researcher-managed and less farmer-managed compared to validation trials. They usually do not have immediate use of the new technology by farm households as their objective. Instead, their objectives are to identify biological (especially in exploratory trials) and socio-economic (increasingly in refinement trials) factors that might contribute to acceptability.

Direct observation and measurement of acceptability is especially important in validation trials because immediate use of the new technology by farm households is an objective in validation trials. This immediate use is also easier because validation trials are farmer-managed trials.

Usually more than one trial, and more than one season, are needed to develop an acceptable solution to the priority problems identified in initial diagnosis. Exploratory trials often include many possible new technologies which are eliminated as a result of the trial. Perhaps there are no significant biological responses to P and K fertilization, for example. Screening and elimination of unacceptable new technology is in fact one objective of exploratory trials.
Refinement trials will include fewer technologies, but these are subjected to more thorough socio-economic analysis. More thorough socio-economic analysis can result in elimination of many possible ways of using a new technology. For example, many N fertilizer rates may be shown to be uneconomic. It may even result in elimination of a whole new technology. Perhaps, for example, incompatibility with existing equipment will eliminate all twin-row high density plantings, in spite of earlier promising biological results.

Hence, integrated interpretation of results needs to assess progress towards identification of acceptable technology at pre-determined intervals over time.

Also, the needs of farm households can change from one season to the next. The circumstances they face can also change. Because FSR/E involves the active participation of farm households, these changes can affect trial objectives. Changes in trial objectives in turn affect trial design. This is another reason why integrated interpretation needs to assess progress towards identification of acceptable technology at pre-determined intervals over time.

What these points mean is that integrated interpretation of trial results is really a part of a larger planning process. The funnel diagram is drawn in Figure III:I.2 in a linear form, but at the right-hand side, we are also back to the left-hand side, at a later point in time. Thus, the two ends of the funnel can also be joined to make an on-farm diagnosis, design, and testing loop that is repeated over and over again. In succeeding years, the previous year's trial analysis substitutes for the initial diagnosis of the first year. The point where the two ends join in the loop is thus the very beginning of design, at prioritization. In other words, analysis is input into re-prioritization.
At this point of re-prioritization where the two ends join the loop, there are also other loops branching out to other functions that support agricultural production, such as:

1. On-station backup research;
2. Extension;
3. Policy.

Figure II:II,A.1 in Volume II shows how the FSR/E loop and the on-station backup research loop link at the point of planning and design. Figure III:I.4 expands that diagram to show the other loops as well.

This perspective, that analysis is re-prioritization for both FSR/E and for other activities as well, provides the framework for choosing analysis tools and combining them for integrated interpretation.

[Figure III:I.3. Types of Analysis in FSR/E. Three parts: "Trial activity" (farms and treatments by timeline of activities, from planting of the fallow through sampling dates to corn planting and harvest), "Data type" (inputs, dependent variables such as percent cover and soil fertility, and outputs such as corn yield, secondary products, income, benefits, and consumption), and "Analysis type" (labels recoverable from the figure include sensitivity analysis, partial budgeting, t-test for 2 treatments, ANOVA for 3 or more treatments, correlation, regression, intrahousehold analysis, consumption/nutrition analysis, and linear programming or whole-farm analysis).]

[Figure III:I.4. Linking on-farm research, station research, decision/policy making and extension through the planning for design stage of FSR/E. Four loops (ON-FARM RESEARCH, EXTENSION, DECISION/POLICY-MAKERS, and STATION/BENCH RESEARCH) meet at PLANNING AND DESIGN. SOURCE: Susan Poats, based on an earlier diagram by Collinson, 1982. For the Collinson diagram, see Figure II:II,A.1.]

3. POSSIBLE TYPES OF BIOLOGICAL ANALYSIS

The data in the matrix of a field book page are usually too many to see what has happened in the trial. There are just too many numbers.

Also, there may appear to be differences from one treatment to another in the field book page. However, the numbers in the field book do not tell us which is more likely: that the apparent differences from one treatment to another are really due to the treatments, or that they are just due to random variation.

Likewise, there may appear to be differences from one farm to the next. Again, though, we don't know which is more likely: that those differences are really due to the farms (their different soils, or different management, for example), or just due to random variation.

Statistical analysis is one way we reduce the numbers in the matrix of a field book to fewer numbers that are easier to see. These numbers are various types of summary statistics.
The summary statistics also allow us to test hypotheses about relationships among classifiers (farms and blocks), independent variables (treatments), and dependent variables.

Even before doing analysis, it is possible to calculate some simple summary statistics. It is also possible to make graphs from the original data, or from simple summary statistics. These techniques do not allow us to test hypotheses. However, they can help us get a "feel" for a data set. This is often useful in explaining the results of statistical tests in concrete terms. Also, even when tests indicate no real differences among treatments or farms, these techniques sometimes can give us new ideas for future trials. That is, they can be useful in making new hypotheses.

Which type of summary statistics we calculate depends on 2 considerations:

1. What relationships among the components of the data set do we want to test; and
2. What relationships exist among the classifiers (farms, blocks, and plots) and independent variables (treatments) due to the design used.

Three broad classes of analysis are:

1. Analysis to test for the presence or absence of effects of independent variables and/or classifiers on dependent variables;
2. Analysis to interpret effects;
3. Analysis of relationships among dependent variables.

Analysis of effects.

The fifth key characteristic of a scientific method is accepting or rejecting the hypothesis, based on the data from the testing. Hypotheses in biological analysis involve effects. An effect is a difference in values of a dependent variable that can be explained. The explanation may be association with any or all of the following:

1. Independent variables;
2. Differences among farms;
3. Treatment-by-farm interaction.

Treatment-by-farm interaction means that the effect of treatments on a dependent variable is different depending on which farm one looks at. This is often a very important effect in FSR/E.
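The three sources of explanation listed above can be made concrete with simple summary statistics. The following is only a minimal sketch in Python, not a procedure taken from this manual: the yields, the farm names, and the function names are all hypothetical, and the "interaction residual" used here (cell value minus farm mean minus treatment mean plus grand mean) is a descriptive device for getting a "feel" for the data, not a hypothesis test.

```python
from statistics import mean

# Hypothetical corn yields (t/ha): one row per farm, one value per treatment.
yields = {
    "farm_A": {"natural_fallow": 1.0, "planted_fallow": 2.0},
    "farm_B": {"natural_fallow": 2.0, "planted_fallow": 2.0},
    "farm_C": {"natural_fallow": 3.0, "planted_fallow": 2.0},
}


def treatment_means(table):
    """Mean of each treatment across all farms."""
    treatments = list(next(iter(table.values())))
    return {t: mean(row[t] for row in table.values()) for t in treatments}


def farm_means(table):
    """Mean of each farm across all treatments."""
    return {farm: mean(row.values()) for farm, row in table.items()}


def interaction_residuals(table):
    """Cell value minus what the farm mean and treatment mean alone predict.

    Residuals near zero everywhere suggest no treatment-by-farm interaction;
    large residuals with opposite signs suggest the treatment effect depends
    on the farm.
    """
    t_means = treatment_means(table)
    f_means = farm_means(table)
    grand = mean(f_means.values())  # valid because the table is balanced
    return {
        (farm, t): round(value - (f_means[farm] + t_means[t] - grand), 6)
        for farm, row in table.items()
        for t, value in row.items()
    }


print(treatment_means(yields))
print(farm_means(yields))
print(interaction_residuals(yields))
```

In this made-up data set the two treatment means are identical, yet the residuals for farm_A and farm_C have opposite signs: the planted fallow does better on the low-yielding farm and worse on the high-yielding farm. Whether such a pattern is more likely than random variation still has to be tested with the procedures described in this unit.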
Units II and III of Volume II, on Design, explain treatment-by-farm interaction in more detail.

It is important to note that an effect is only a hypothesis until after the analysis is done. The analysis reduces the data set to summary statistics. The summary statistics tell us which is more likely: our hypothesis of a real effect, or an alternate hypothesis of only random variation and no effect.

Effects may be qualitative, quantitative, or a mixture of both.

a. Qualitative effects

These are effects of factors with levels that do not have a continuous numerical relationship. In the case of effects of an independent variable, the independent variable is discrete. Examples include:

1. Different varieties;
2. Different pesticides;
3. Different methods of land preparation.

Effects of farms can also be qualitative, if no measure of the farm environment is made. In this case, each group of farms is a discrete group. This is entirely analogous to different varieties, or different methods of land preparation, for independent variables.

If farms are treated as discrete groups, then treatment-by-farm interactions for qualitative effects of discrete independent variables are also qualitative.

The type of summary statistic to be calculated depends on the design used. With only 2 treatments and single replications per farm, the t-test procedure can be used. This is the simplest procedure. All other designs require use of one or another ANOVA procedure. ANOVA procedures are more flexible, but also more complex and difficult. (See the comparison of these two techniques in the following sub-section of this discussion.)

b. Quantitative effects

These are effects of factors with levels that do have a continuous numerical relationship. Examples for independent variables include:

1. Fertilizer rate;
2. A regular series of plant in-the-row spacings (e.g., 20, 30, and 40 cm);
3. Feeding rates.
The behavior of the dependent variable is often called a response when the effect of the independent variable is quantitative.

If farm environment is measured, the effect of farms on a dependent variable can also be quantitative.

ANOVA procedures are used to calculate summary statistics to test quantitative effects of independent variables. Regression procedures are used to calculate summary statistics for quantitative effects of farm environment. The latter is called modified stability analysis.

c. Mixtures of qualitative and quantitative effects

Sometimes a set of treatments may have both discrete and continuous treatments. There are 2 cases where this can occur:

1. Nested treatments

These occur when one or more discrete treatments have several rates. A weed control example would be:

(a) Farmer control: hand weeding;
(b) Researcher control: no weeding;
(c) Intervention 1: herbicide 'Hit' at 100 ppm;
(d) Intervention 2: herbicide 'Hit' at 200 ppm;
(e) Intervention 3: herbicide 'Hit' at 300 ppm.

The field book would show only one treatment column, perhaps labelled as "weed control." However, the variable "weed control" really consists of a continuous variable nested within a discrete variable. The rate of herbicide 'Hit' is a continuous variable, but method of weed control (hand weeding vs. no weeding vs. herbicide) is a discrete variable.

2. Factorial treatment sets

One factor may be discrete, and the other factor may be continuous. Some examples would be:

(a) N rate (4 levels) by P rate (3 levels)
(b) Variety by plant spacing.

When the two factors have an interaction, the analysis must determine the response of the dependent variable to the continuous factor separately for each level, or group of levels, of the discrete factor.

Alternatively, a continuous independent variable may have a treatment-by-farm interaction.
Analysis is similar to the case of two independent variable factors: the analysis must determine the response of the dependent variable to the continuous independent variable separately for each group of farms that are different.

ANOVA procedures are used to calculate summary statistics to test mixtures of quantitative and qualitative effects.

Part of a scientific method is explaining effects of an independent variable. If the testing of our hypothesis confirmed the hypothesis, we want to know why the dependent variable had different values for different levels of the independent variable. If we can explain why the effect occurred in the trial, this will increase our confidence in our predictions in the future. This means we want to know which levels contributed most to the effect.

Methods of analysis for interpreting effects are different for qualitative and quantitative effects:

1. Qualitative effects

Here we want to separate out the individual effects of the discrete levels (treatments) of the independent variable. There are 2 types of analysis procedures for this:

a. Separation resulting from the analysis: This includes the least significant difference (LSD) and various means separation procedures. These procedures do not require hypotheses about the levels prior to analysis. They are especially appropriate in exploratory trials, when not enough may be known to establish hypotheses about individual levels (for example, differences among varieties) prior to the trial. They are easier to do, and widely used.

b. Testing of individual treatment effects: This is called single degree-of-freedom contrasts, or orthogonal comparisons. These procedures are based on hypotheses about differences among levels made prior to the testing and analysis. These procedures calculate summary statistics that tell us which is more likely: our hypothesis of a real effect, or an alternate hypothesis of only random variation and no effect.
These procedures thus promote more in-depth design of a trial. They are especially appropriate in refinement trials. However, they are more difficult to do, and have become more common only in recent years.

The acceptance or rejection of the hypothesis of an effect is based on which is more likely: the hypothesis of real differences, or an alternate hypothesis of random variation only. How much is more? We can also calculate different summary statistics for different chances of the hypothesis of real differences being likely. These are called confidence intervals. Apparent differences among levels may disappear if we want to accept our hypothesis of real effects only when it is more highly likely.

2. Quantitative effects

Here we want to fit a line or curve to the individual data points. The curve describes the response of the dependent variable to the continuous independent variable. Regression procedures are used to calculate summary statistics. The summary statistics describe the direction (slope) of the line, or the shape of the curve.

Another way to explain effects of an independent variable is to examine how similar the effects are on different dependent variables. Correlation procedures are used to calculate summary statistics for this purpose. Summary statistics can be used to test the similarity of the effects on the two variables. They can also tell how strong the similarity is.

Which type of summary statistics we calculate depends on 2 considerations:

1. What relationships among the components of the data set do we want to test; and
2. What relationships exist among the classifiers (farms, blocks, and plots) and independent variables (treatments) due to the design used.

4. A COMPARISON OF TWO TOOLS: ANOVA AND MSA

Two choices of basic analytical procedures exist for analysis of biological and economic data from regional trials:

1. t-test (two treatments) and ANOVA (3 or more treatments);
2. Modified Stability Analysis (MSA).
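The two choices listed above can be sketched side by side for the simplest case of two treatments, each grown once on every farm. The following Python sketch is illustrative only. The yields and function names are hypothetical. It assumes that the environmental index for a farm is the mean yield of all treatments on that farm, which is the usual convention for MSA but is not defined in this unit, and it uses a paired t statistic on within-farm differences as one common form of the t-test for this design.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical yields (t/ha) on five farms, one plot per treatment per farm.
farmer_practice = [1.0, 1.5, 2.0, 2.5, 3.0]
intervention = [2.0, 2.2, 2.4, 2.6, 2.8]


def paired_t(a, b):
    """t statistic for the mean within-farm difference (b - a), df = n - 1."""
    diffs = [y - x for x, y in zip(a, b)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


def slope_intercept(x, y):
    """Ordinary least-squares line: y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))  # sum of products
    sxx = sum((xi - mx) ** 2 for xi in x)                     # sum of squares
    slope = sxy / sxx
    return slope, my - slope * mx


def msa_lines(treatments):
    """Regress each treatment on the environmental index.

    Assumption: the index for a farm is the mean of all treatments there.
    Returns the index and a (slope, intercept) pair per treatment.
    """
    names = list(treatments)
    n_farms = len(treatments[names[0]])
    index = [mean(treatments[t][i] for t in names) for i in range(n_farms)]
    return index, {t: slope_intercept(index, treatments[t]) for t in names}


t_stat = paired_t(farmer_practice, intervention)
index, lines = msa_lines(
    {"farmer_practice": farmer_practice, "intervention": intervention}
)
print("paired t:", round(t_stat, 3))
for name, (slope, intercept) in lines.items():
    print(name, "slope:", round(slope, 3), "intercept:", round(intercept, 3))
```

With these made-up numbers the paired t statistic is about 1.89 with 4 degrees of freedom, below the two-sided 5 percent critical value of 2.776, so the t-test alone would not lead us to accept a real difference. The regression lines, however, have clearly different slopes (about 1.43 for the farmer practice and 0.57 for the intervention), suggesting that the intervention does relatively better in poorer environments. This is the kind of treatment-by-farm pattern the comparison in this sub-section is concerned with.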
Each choice has advantages and disadvantages that can be summarized as follows:

1. t-test/ANOVA are more sophisticated and powerful for hypothesis testing, but take longer to learn to use correctly.
2. MSA is better at identifying treatment-by-farm interactions and planning for dissemination of results through extension, and easier to learn how to use, but is less powerful for hypothesis testing.

Both procedures can be used together for analysis of simple treatment structures in regional trials with RCBD replicated across farms only.

Teams may also learn both of these procedures in a sequence over time:

1. Initial phase: basic skills

The team learns how to use MSA before developing an understanding of how it works. The team applies MSA for most designs and treatment structures on its own.

The team is introduced to hypothesis testing through t-test procedures. It relies on subject-matter specialists for analysis in cases where ANOVA is needed.

2. Intermediate phase: adding skills and deepening understanding

The team calls on subject-matter specialists for assistance in those cases where ANOVA is needed. The team begins to develop a deeper understanding of hypothesis testing and ANOVA through real-world, on-the-job training in ANOVA, using its own data.

The team continues to use MSA for most data on its own, either without ANOVA, or together with ANOVA for more complete interpretation of results, as appropriate, depending on the design and data set.

3. Final phase: functioning as an independent research unit

The team uses both MSA and ANOVA on its own, separately or in combination, as most appropriate for each data set. The team calls on subject-matter specialists in special, difficult cases, but it now has the base of skills and understanding of principles to learn how to use each new application that subject-matter specialists introduce to solve special analysis problems.
The team begins to develop new techniques for participatory, integrated assessment of trial biological, economic, and social results, including assessment of secondary and tertiary systems causation and trade-off linkages. In this way, the team pushes forward the state of the art of FSR/E. The team begins to develop new technologies for design and analysis of independent farmer experimentation, also pushing forward the state of the art of FSR/E.

Table III: I.1: Advantages of t-test/ANOVA procedures versus MSA for data from regional trials

Advantages of t-test/ANOVA:
1. Provides rigorous, valid tests of hypotheses.
2. Provides a range of probabilities of making wrong decisions about comparisons among treatments.
3. Can analyze many types of treatment structures.
4. Can identify treatment-by-environment interactions when treatments are also replicated within farms.
5. Promotes disciplined, scientific thinking about hypotheses and statistical inference from a sample to a domain.
6. Can make use of statistical function keys for sums of squares found on most scientific calculators.
7. Cross-checking for treatments and blocks can be done easily on scientific calculators by comparing results obtained by line-by-line data entry and statistical keys versus results obtained by formulas using intermediate sums.

Advantages of MSA:
1. Can identify treatment-by-environment interactions even when treatment means are equal.
2. Can identify treatment-by-environment interactions even when treatments are replicated only across farms, but not within farms.
3. Can be used regardless of the degree of heterogeneity among sites.
4. Promotes thorough investigation into differences among farms in a domain.
5. Promotes thinking ahead to linkage with extension and dissemination of results to different types of farms.
6. Can be combined with confidence intervals to provide a range of probabilities of making wrong decisions about comparisons among treatments.
7. Can be used for simple interpretation without understanding how the procedure works.
8. Is fairly easy to calculate by hand.

Table III: I.2: Disadvantages of t-test/ANOVA procedures versus MSA for data from regional trials

Disadvantages of t-test/ANOVA:
1. Cannot identify treatment-by-environment interactions when treatments are replicated only across farms, but not within farms.
2. Requires sophisticated understanding of principles and procedures of hypothesis testing and statistical inference based on probabilities.
3. May be difficult to interpret more complex treatment structures.
4. May not be possible to use when sites are too heterogeneous.
5. Requires a long series of calculations even for simple designs.
6. Requires several long series of calculations with intermediate decision-points for more complex designs.
7. Many scientific calculators do not have statistical function keys for simultaneous entry of data from RCBD.
8. Incomplete block designs are more easily analyzed by computer, which is too sophisticated and inaccessible for most field teams.

Disadvantages of MSA:
1. Tests of hypotheses may not be valid.
2. Environmental index used as independent variable for regression is highly correlated with the dependent variables.
3. May not be possible to use if the regression of 1 or more treatments on the environmental index is not significant.
4. Understanding how the regression procedure works is easier if comparison is made with ANOVA.
5. Sums of products calculation is tedious, errors are easy to make, and there is no way to cross-check using intermediate sums.
6. Most scientific calculators do not have statistical function keys for sums of products or regression.

5. CHOOSING APPROPRIATE ECONOMIC ANALYSIS TECHNIQUES
The process of choosing the most appropriate economic techniques for analyzing the performance of alternative technologies involves contemplating a number of questions related to the evaluation criteria to be used, to project concerns, and to the characteristics of the techniques themselves:

- What economic evaluation criteria will be used?
- Will this be an ex-ante or ex-post evaluation of the proposed technologies?
- What type of on-farm trial is to be analyzed: exploratory, refinement or verification?
- How ready is the technology for recommendation to farmers?
- How timely and complete must the analysis be?
- What clientele groups will use the results of the analysis (farmers, FSR/E team, station researchers, policy-makers)?
- What sources and types of data are available or required?
- What analytic aids such as calculators or computers are available?
- What is the level of economic expertise of the personnel who will conduct the analysis?

The techniques described in Unit III: III are simplified, partial approaches to the problem of economic analysis of alternative technologies in on-farm trials. They are only part of a larger bag of economic evaluation tools which also includes sensitization to the perspectives of different stakeholders, ongoing personal observations, communication with farmers, more sophisticated economic data analyses at the whole farm level, and monitoring of adoption effects on farm households.

6. DETERMINING IMPACT OF PROPOSED TECHNOLOGIES ON OTHER HOUSEHOLD ACTIVITIES

Just because a technological change may be profitable or desirable in one farm enterprise does not necessarily mean that it will be a favorable change when the overall production, consumption and welfare of the farm household is considered. In the previous sections we have been considering the analysis of technological alternatives only from the point of view of the enterprise in question.
Now we will consider ways to examine the effects of changes in technology on the other activities taking place on or off the farm.

It is not always easy for an FSR/E field team to assess the possible impact of a change in one enterprise on other activities on or off the farm. In the end, it will have to be the farm decision-makers who decide exactly how such adjustments will be made (see next section on Farmer Acceptability). However, this does not mean that the whole farm perspective should be ignored by the field team. Preliminary judgements regarding the technological alternatives under consideration must be made by the field team even before the technology is put into the hands of the farmers for their testing and evaluation.

It is useful for field team members to ask themselves some of the questions posed earlier in the process (II: I,C Planning for Evaluation). These same questions may well have been asked during the design of the alternatives to be tested by on-farm research. But after the research has been completed and is being evaluated, they should be considered again. The relevant questions are:

1. If the amount of a resource required in a farm enterprise is increased by a proposed alternative, where will that increase come from?
2. How will that affect the activity where it is presently used?
3. If use of a resource in one farm enterprise is decreased by an alternative practice, where and how will that freed resource be used?
4. How will that freed resource affect the activity or activities where it will be used?
5. If more of the product in question is produced, what effect will this have on the farm as a whole?

Land use and crop activities calendars (I:V) can be useful for the purpose. For example, from the crop activities calendar as presented in Volume I, one could see that April and May are busy months for land preparation, and much of the planting takes place in late June and early July.
Practices which free time in these months may provide much needed labor for other crops. But practices which require more labor during these periods may not be acceptable because of the other activities which need to be carried out. These possibilities should be evaluated when designing alternatives. They should also be taken into account when analyzing the proposed alternatives.

Another useful means of tracing the possible implications of changes in the practices for one enterprise on other activities on the farm is by using the model of the farming system developed in the initial surveys or characterization. Interactions among the crops, the livestock and the household are particularly evident in these models. Obviously it is not usually possible to quantify the effects, but an evaluation can be made concerning potential problems from making shifts in resource use.

For example, if a new technology requires an added cash expense, it may be possible to examine the model of the farming system to determine where this additional money could come from. What other expenses could be cut and what impact might this have on the welfare of the household? It may not be possible for the field team to answer the question, but it would at least bring the competition for whatever money is available to the attention of the FSR/E team.

Increasingly complex and sophisticated techniques can be used to analyze whole farm impacts of a change in technology. These can include expanded partial budgets, whole farm budgets or linear, curvilinear or stochastic programming. However, these techniques often require much more time and resources than are available to field teams.

It is important to remember that even the most sophisticated methods of analysis are only tools to help field teams evaluate alternative technologies as a proxy for the farmers themselves.
This is why we must be "devil's advocates" in our analysis, to represent as many as possible of the hard questions farmers would ask if they could see our results (see III:II,B,3). It is the farmers, ultimately, who make the decision of whether or not to adopt a technology. So it is necessary for field teams to incorporate farmer and stakeholder evaluation in their analyses.

Fortunately, it is possible for field teams to conduct directed surveys of farmers in an area in order to involve them in the process and to augment other analyses. Some directed surveys can be conducted in a single day if a specific question is being addressed. An example could be related to an outbreak of some insect that the field team was not expecting. In one day it would be possible for the team to ascertain if this is a regular occurrence and what, if anything, the farmers usually do about it. Other, more complex questions may take a few days, but the information can be forthcoming rapidly and with little cost. This type of dialogue with farmers may well be the most important analytical technique available to the FSR/E team.

7. EVALUATION OF ACCEPTABILITY BY FARMERS

In most societies, farmers themselves decide whether or not to use a new technology. However, researchers need a means of predicting the response of farmers to a technology just as they need a means of predicting the response of a crop to fertilizer. If a new variety is involved, seed of the new line must be multiplied if it is anticipated that farmers are going to adopt it. If fertilizer is required, the marketing mechanism must be alerted to have a supply available if farmers are going to want to use it. Researchers can also find out what farmers like and/or dislike about a particular technology at the time they are assessing the acceptability of the technology to the farmers.
During the period farmers are using a new technology for the first time, such as in an on-farm trial, researchers can obtain a passive evaluation from the farmers. That is, they can find out how the farmers feel about the new technology and whether or not they think they might adopt it the next year. They can find out what the farmers like and dislike about the technology and whether or not it fits into their farming system. This information can be obtained with a directed survey from among those farmers participating in on-farm trials and from any neighbors who have observed the trials or participated in field days where trials are located. However, it is a passive evaluation because until the farmers actually put the technology into practice, it is possible for them to change their minds.

An active evaluation can be obtained from the farmers (and made by the researchers) the next year when farmers have had an opportunity, on their own, to adopt or reject (or to modify or continue to experiment with) the new technology. The information can be obtained through another directed survey. Farmers are asked if they are using the new technology and, if so, on what proportion of the appropriate crop they are using it. An index of acceptability (Hildebrand and Poey, p. 122 ff) can be calculated from this information.

The index of acceptability, I, is calculated as follows:

I = (C × A) / 100

In this equation, C is the percentage of the farmers interviewed who used the practice on at least part of their crop the year following the trial. A is the percentage of the area they planted to that crop on which they are using the practice. A is calculated based only on those farmers who are using the technology. For example, if 45 farmers participated in an on-farm trial using the new technology last year and 30 are using it this year, C is 30/45 or 66.7 percent.
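The calculation of C and of the index can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the original procedure: the function names are hypothetical, and the inputs are the worked figures of this section (45 trial farmers, 30 of them using the practice the following year, on 60 percent of their area planted to the crop).

```python
def percent_using(n_interviewed: int, n_using: int) -> float:
    """C: percentage of interviewed farmers who use the practice."""
    if n_interviewed <= 0:
        raise ValueError("at least one farmer must be interviewed")
    return 100.0 * n_using / n_interviewed


def index_of_acceptability(c_percent: float, a_percent: float) -> float:
    """I = (C x A) / 100, with C and A both expressed as percentages."""
    return c_percent * a_percent / 100.0


# Worked figures from this section. A is measured only on the farmers
# who are actually using the technology.
c_example = percent_using(n_interviewed=45, n_using=30)
i_example = index_of_acceptability(c_example, a_percent=60.0)
print(f"C = {c_example:.1f} percent, I = {i_example:.0f}")
```

Keeping C and A as separate inputs makes it easy to see which of the two (number of adopters, or area per adopter) is driving a low or high index.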
If those 30 farmers who are using the technology this year are using it on 60 percent of their area planted to that crop, then A is 60 percent. The index of acceptability, then, is:

I = (66.7 × 60) / 100, or I = 40

Experience has shown that if I is at least 25 and C at least 50, then there is a good possibility that widespread adoption of the technology in the recommendation domain will follow as more and more farmers learn about it.

Researchers can also learn about the technology by asking the farmers why they are or are not using it this year. In the above example, one-third of the farmers who used the technology in the trial the year before did not adopt it this year. The researchers should find out why. It may be that they are in a different recommendation domain.

In conducting an evaluation of acceptability, it is best to interview only those farmers on whose farms the trials were conducted the year before. This is because hands-on learning about the new technology is more desirable than learning from observation or from other sources of information. Those farmers who actually worked with the technology had hands-on learning experiences. Neighbors and other farmers learned about the technology from observation or from information obtained in field days. If a new technology is relatively simple to learn to use, however, and the advantages are obvious, it is possible that neighbors are also using it this year. These farmers should then also be interviewed to obtain information on what they like about the technology and to help define diffusion domains.

ACTIVITIES

ACTIVITY ONE: EVALUATION OF ACCEPTABILITY BY FARMERS

TRAINEE'S NOTES

OBJECTIVE:
After completing this activity, participants will be able to:
Calculate the index of acceptability.

INSTRUCTIONS:
1.
Study the following information:

In 1978, ICTA, the Guatemalan Institute of Agricultural Science and Technology, conducted a directed survey to make an assessment of farmers' active evaluation of improved maize cultivars. Sixteen farmers were interviewed. All of them had been collaborators in on-farm trials with maize cultivars the preceding year. Five improved cultivars were tested against the farmers' own "Criollo" cultivar. In the 1977 trial, the general mean yields of ICTA T-101 and ICTA B1 were superior to all other cultivars:

Cultivar              Mean Yield
ICTA T-101            3.45
ICTA B1               3.34
La Maquina 7422       3.18
Sintetico Amarillo    2.68
NKT 66                2.67
Criollo               2.34

In 1978, of the 16 farmers interviewed, 9 were planting at least one improved cultivar that had been in the trial in 1977. These nine farmers had planted 54 percent of their maize area to these improved cultivars.

2. Calculate the index of acceptability for improved maize as a technology.

UNIT II
LOOKING AT DATA SETS

VOLUME III UNIT II (II,A)
WHAT IS A DATA SET AND WHAT CAN IT DO?

OUTLINE
1. Why Data Sets, and What Types in FSR/E?
2. What Kinds of Components Go into a Biological Data Set?
3. How Do We Construct a Biological Data Set?

PREREQUISITES
Volume II units, with special emphasis on:
I: Introduction B: FSR/E, agricultural research and extension, and the scientific method.
II: III,B: What designs can do.
II: IV,C: How to obtain and handle data from trials.
III: I: A Framework for Analysis

PARTICIPANT LEVEL
Agricultural research assistant
Extension technology verification assistant

LEARNING OBJECTIVES
After completing this section the participants will be able to:
1. Describe 3 types of data set components.
2. Show how to arrange the 3 types of components into a data set structure in matrix form.

KEY POINTS
1. A data set consists of a matrix of columns and rows of labels and numbers, where:
a. Each column is a vertical list of values for one type of classifier, independent variable, or dependent variable; and
b. Each line is a horizontal list of the values of each dependent variable for a unique combination of classifiers and independent variables.
2. The classifiers in a data set are based on the experimental design used, while the values of the independent and dependent variables are obtained from measurements and observations taken or made during the trial.
3. The main objective of analysis of a data set is to test hypotheses about relationships among the classifiers, independent variables, and dependent variables, by reducing the numbers in the matrix to summary statistics.

TERMS
classifier
data
dependent variable
experimental design
hypothesis
independent variable
matrix
science
statistic

DISCUSSION

1. WHY DATA SETS, AND WHAT TYPES IN FSR/E?

Data sets are a very important part of a scientific method. Let's review briefly why a scientific method is useful.

I: Introduction B discusses several problems of agricultural research and extension. One of those problems is the problem of unexpected events: recommendations for changes in agricultural production don't always work as we expect. Either the change doesn't happen the same way with farmers' crops or animals as on the research station, or farm household members do not find the change acceptable. The reason is that we don't fully understand the conditions under which farm households will accept a recommendation.

A scientific method can be useful as a way to document better experiences of conditions under which recommendations do work, and conditions under which recommendations do not work. This can help us better predict when recommendations will work.

Unit I: Introduction B also presents 7 key characteristics of a scientific method. The first key characteristic is a description of the conditions for an event.
The event we are interested in is when a recommendation "works": that is, when a recommendation is accepted by farm households.

The second key characteristic is the formation of a hypothesis about what conditions will result in the occurrence of an event. A hypothesis typically compares several contrasting conditions for the same event. We have an idea about which of the contrasting conditions will in fact result in the event, and which will not.

This leads to the third key characteristic: testing of the hypothesis. We do an on-farm trial to see what actually happens: which contrasting conditions actually result in the event, and which do not.

The fourth characteristic is documentation of how the testing was done. This documentation is done with labels and numbers. A single number is a datum; hence, data are many numbers. A data set is a collection of numbers gathered for a particular purpose. In FSR/E, that purpose is documentation of testing.

A data set uses a special kind of documentation called a matrix. A matrix is an arrangement of labels and numbers in rows and columns. Every label or number in a matrix belongs to one row and to one column at the same time. Thus, each label or number in a matrix shows some relationship between one row and one column.

Different types of data sets are possible in FSR/E:

1. Biological data sets from on-farm trials;
2. Economic data sets from on-farm trials;
3. Consumption data sets from on-farm trials;
4. Integrated data sets from on-farm trials;
5. Socio-economic data sets from directed formal surveys, or from conversion of qualitative observations to quantitative values.

Generally, analysis of an on-farm trial begins with construction and analysis of a biological data set. The results of that analysis are then used in deciding how to construct economic and consumption data sets. Integrated data sets are constructed last, based on the results of analyses of the biological, economic, and consumption data.
They also usually incorporate additional observations on intrahousehold and community social relationships affected by the trial results.

A biological data set from an on-farm trial in FSR/E is a particular kind of matrix. There are 3 reasons for the differences:

1. We take the labels and numbers for our matrix from on-farm trials;
2. We make hypotheses about the relationships among the labels and numbers before we obtain the numbers;
3. We use the matrix to test our hypotheses.

What different types of labels and numbers do we use to make a biological data set from an FSR/E on-farm trial? To answer this question, we first need to look at the components of an FSR/E biological data set.

2. WHAT KINDS OF COMPONENTS GO INTO A BIOLOGICAL DATA SET?

A biological data set from an on-farm trial consists of 3 types of components:

1. Independent variables;
2. Dependent variables;
3. Classifiers.

Let's look at what these components mean, and how they relate to a scientific method.

Independent variables. We have seen that a data set is one way to document testing of conditions for an event: acceptability of recommendations. Part of that documentation is to describe contrasting conditions of a hypothesis about that event.

The first component of a data set is a description of contrasting conditions of agricultural production. In a trial we may have one set of contrasting conditions, or more than one set. We call each set an independent variable. We say "variable" because each is a set of several varying (different) ways to do one step in the production of crops or animals. In Unit II: II,C,1 we called such a set of conditions a factor (such as variety, or N rate), and each particular condition a level (such as variety 'Sigurado' vs. variety 'Baro'; or 0 vs. 100 vs. 200 kg/ha of N). We say "independent" because we control the contrasting conditions. That is, we determine which levels to test, based on diagnosis and checking with farm household members.
Each level is a particular treatment that the team and the farm household members carry out.

Dependent variables. Another part of our documentation is to describe the events that occur under the contrasting conditions of each treatment. Ultimately, the event we are interested in is acceptability of some of the changes in agricultural production we are testing (some of the treatments, such as variety 'Baro' instead of variety 'Sigurado', or 100 kg/ha of N instead of 0 or 200 kg/ha).

Usually, however, we do not measure acceptability directly. Instead, we measure various aspects that contribute to acceptability. Unit II: I presents a framework for identifying which aspects of acceptability are important for farm household members. These are the aspects we observe or measure during the trial. We call these dependent variables.

There are many possible kinds of dependent variables. Unit II: IV,C explains the different types, and how to measure or observe when each happens. Each measurement or observation is recorded in a field book. Some examples of dependent variables include:

1. Amount of output (final weight gain at the time the animal is ready to be used for milk production, crop grain yield, crop stalk yield, etc.). These types of dependent variables are usually primary experimental variables (Unit II: IV,C);
2. Status of the crop or animal prior to obtaining the output (weight gain before the animal is used, plant height in mid-season, etc.). These are usually secondary experimental variables.
3. Characteristics of outputs (protein content, stalk length, fruit diameter, fruit color, cooking time, etc.). These are also usually secondary experimental variables.

Classifiers. Each dependent variable is measured or observed more than once. This is because each treatment is replicated. Unit II: III,B explains why replication is useful: it enables us to measure natural variation.
Comparing natural variation with differences in dependent variables is one way for us to make a decision on whether to accept or reject our hypothesis.

Classifiers are labels for where each measurement is taken, or each observation made, in an on-farm trial. In FSR/E, we can have 2 types of classifiers:

1. Plots and blocks. Plots are individual experimental units: a portion of a field, or a single animal. Blocks are labels for groups of individual experimental units. Unit II: II,B explains these terms in more detail. Unit II: III,C explains why they are important.
2. Farms. Most on-farm trials, especially refinement and validation trials, are regional trials: trials done on more than one farm. Unit II: II and Unit II: III explain more about regional trials.

Costs. We can also break each treatment into the steps of carrying it out in order to achieve the output. Documentation of resources used to carry out each step (for example, money spent to buy the seed for each variety, and who puts out the money; time spent applying each level of fertilizer, and who provides their time to do it, etc.) is also important in assessing acceptability. Unit I: I explains more about how to determine which steps to document. Unit II: IV explains how to do the measurements.

Changes in uses of resources due to treatments can result in changes in outputs of other crops or animals not receiving the treatments. These are another type of cost. Units II: I and II: IV explain more about these costs also.

Cost components are usually not a part of a biological data set. However, the next step after analysis of a biological data set is often construction of an economic data set. The economic data are gathered at the same time as the biological data. Needs for economic data (as well as consumption and social data) may affect plot layout and even treatment choice.
At the same time, economic data sets make use of some of the data from a biological data set, either in original form, or after biological analysis and reduction to summary statistics. Thus, a team should keep in mind the other types of data sets they will gather data for, as they gather data and construct their biological data sets.

3. HOW DO WE CONSTRUCT A BIOLOGICAL DATA SET?

How do we construct a biological data set for analysis? The key is to organize the field book so that data are recorded in a format for analysis. Unit II: IV,C gives a good example:

1. First make columns for each type of classifier:
a. In exploratory or validation trials with many treatments at each farm, sometimes each farm may be listed on a separate page. In constructing the data set for analysis, this becomes the first column. In other cases, the number of treatments per farm is small. There may be only 1 or 2 replications per farm. In these cases, farm will be the first column in the field book data recording pages.
b. The next column usually gives the block (or replication) number. However, if there is only one block per farm (i.e., single-farm replications), farm equals block, and only one column is needed for both.
c. The following column is usually the plot number.
As the description of types of classifiers indicates, the classifiers which are used depend on the experimental design used. Unit II: III explains the different types of designs.

2. Make columns for the independent variables:
These are columns for each factor (treatment type). If there is more than one factor in the trial (such as N and P), each factor will have its own column.

3. Finally, make columns for the dependent variables:
Each type of dependent variable should have its own column. Often there are also other columns which give additional information, such as plot or sample area size, or qualitative observations on pest and disease severity.
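As a minimal sketch of the column layout described above (classifiers first, then independent variables, then dependent variables), a field-book data set can be held as a list of lines, one per plot. The yield figures, the N rate, the column names, and the helper function names are all hypothetical and chosen only for illustration; the variety names are the ones used as examples earlier in this unit.

```python
from collections import defaultdict

# Column groups, in the order the field book lists them.
CLASSIFIERS = ("farm", "block", "plot")
INDEPENDENT = ("variety", "n_rate_kg_ha")
DEPENDENT = ("grain_yield_kg_ha",)

# Hypothetical field-book lines: one line per plot.
field_book = [
    {"farm": 1, "block": 1, "plot": 1, "variety": "Sigurado",
     "n_rate_kg_ha": 100, "grain_yield_kg_ha": 1200},
    {"farm": 1, "block": 1, "plot": 2, "variety": "Baro",
     "n_rate_kg_ha": 100, "grain_yield_kg_ha": 1500},
    {"farm": 2, "block": 1, "plot": 1, "variety": "Sigurado",
     "n_rate_kg_ha": 100, "grain_yield_kg_ha": 1000},
    {"farm": 2, "block": 1, "plot": 2, "variety": "Baro",
     "n_rate_kg_ha": 100, "grain_yield_kg_ha": 1700},
]


def lines_are_unique(rows):
    """True if no two lines share the same combination of classifiers."""
    keys = [tuple(row[name] for name in CLASSIFIERS) for row in rows]
    return len(keys) == len(set(keys))


def mean_by_level(rows, factor, variable):
    """Reduce one dependent variable to a mean per level of one factor."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[factor]].append(row[variable])
    return {level: sum(vals) / len(vals) for level, vals in groups.items()}


variety_means = mean_by_level(field_book, "variety", "grain_yield_kg_ha")
print(lines_are_unique(field_book), variety_means)
```

The uniqueness check mirrors the rule that each line of the field book corresponds to exactly one plot, and the per-level mean is one simple example of reducing the matrix to a summary statistic.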
Usually the columns for dependent variables and other information are arranged in a time sequence order, beginning from planting and ending with harvest and post-harvest variables.

The field book is filled out by writing on each line the value or label for the classifiers, independent variables and dependent variables of each plot or animal. The classifiers are taken from the plot plan or herd list, based on the experimental design used. The value or label for each classifier on a given line will be the same as what is written on the stake for the plot to which that line corresponds, in the case of crops. The values of the independent and dependent variables written on that line are recorded from measurements and observations taken or made on plants in that plot, or on that animal, during the trial.

Figure III: II,A.1: Cropping Systems Program Research Managed Trial Data Compilation Form (column groups: classifiers; independent variables and related measurements; dependent variables and related measurements).

In summary, a biological data set thus consists of a matrix of columns and rows, where:

1. Each type of classifier, each independent variable, and each dependent variable has a unique column; and
2. Each line lists the value for each dependent variable of a unique combination of classifiers and independent variables.

Aside from the additional information not used in analysis, the field book pages themselves form the data set. If the field books are organized correctly, analysis can be done directly from the field book data set.

VOLUME III UNIT II (II,B)
FROM DOMAIN TO TRIAL BACK TO DOMAIN: SAMPLES, POPULATIONS, AND STATISTICAL INFERENCE

OUTLINE
1. Domains, Populations, and Samples
2. Distributions and Statistical Inference
3. The "Devil's Advocate" Null Hypothesis

PREREQUISITES
III: II,A: What is a Data Set and What Can it Do?

PARTICIPANT LEVEL
Agricultural research assistant
Extension technology verification technician

LEARNING OBJECTIVES
After completing this section, the participant will be able to:
1. Distinguish between a sample and a population as applied in FSR/E.
2. Describe the principles and function of statistical inference in FSR/E.
3. Contrast the hypothesis of interest and the "devil's advocate" null hypothesis as applied in FSR/E.

KEY POINTS
1. FSR/E teams work with samples because time and resources are never adequate to describe the whole population.
2. Statistical inference provides rules for making judgements about the whole population using information from only a small part of the population, a sample.
3. A team will have more confidence in its hypothesis of interest if it still stands even after the team attempts to disprove it by using another hypothesis, the "devil's advocate" null hypothesis, that negates the hypothesis of interest.

TERMS
analytical class dependent variable descriptive statistics distribution domain independent variable inference hypothesis normal distribution null hypothesis population sample variable

ACKNOWLEDGEMENT
Section 2 is based, in part, on training materials developed by Ly Tung, Visayas State College of Agriculture (Visca), Baybay, Leyte, the Philippines.

DISCUSSION

1. SAMPLES, POPULATIONS, AND DOMAINS

Usually an FSR/E team is assigned to a fairly large area with many farm households. The area may consist of a municipality with 33 villages, each with 50 to 150 farm households, for a total of over 3,000 households. Or, the area may consist of an extension district covering 3 administrative districts, each with 15 to 20 villages, and a total of nearly 5,000 households.

The team's task is to develop appropriate new technology through on-farm experimentation.
With thousands of farm households, each with a multitude of problems of different value to each member, the task of the team is impossible. The time and resources of the team are not adequate to work on all the problems of every farm household member. FSR/E provides a series of procedures to make the task possible for teams. These procedures involve application of a scientific method. Unit I: Introduction B describes this problem and what constitutes a scientific method in more detail.

Usually the design stage results in focusing on a particular crop or animal for the domain chosen. The team would like to be able to develop a new technology that would be acceptable to all the farm households in the domain. Perhaps, for example, the problem for design is declining yield of maize after the first year on cleared shifting cultivation land. The declining yield seems to be due to increased soil acidity. There are local lime deposits in the area which could be used to raise soil pH. Ideally, then, the team would like to know what the response of all farm households planting maize on cleared shifting cultivation land would be to the use of local lime on their maize on those fields. This would mean testing the local lime on over 2,000 fields.

In reality, however, the team simply does not have the time and resources to work with all these farm households to test the local lime on over 2,000 fields. Instead, in village meetings, the team and the farmers reach agreement that 12 households will participate in a trial using the local lime.

In this example, the team thus wants to use the results from the 12 households to provide new information of use to all 2,000 households. Statistics is a scientific method useful for this type of objective. Statistics calls the 2,000 households a population, and the 12 trial cooperators a sample from the population. This example also shows that the population of interest in FSR/E, in the statistical sense, is the domain.

2. DISTRIBUTIONS AND STATISTICAL INFERENCE

The following is an example of maize yields from a sample of 12 farms:

| Farm number | Yield (Kg/ha) |
|---|---|
| 1 | 136 |
| 2 | 215 |
| 3 | 944 |
| 4 | 2,562 |
| 5 | 639 |
| 6 | 775 |
| 7 | 650 |
| 8 | 983 |
| 9 | 530 |
| 10 | 1,811 |
| 11 | 578 |
| 12 | 817 |

Each value of maize yield is an observation for one of the 12 farms in the sample. These 12 observations are a simple data set. In this data set, we have only one classifier (farm) and one variable (maize yield). Why do we call maize yield a variable? Because the values are not the same from one observation to the next: the values of maize yield vary from one farm to the next.

What do the data from these 12 farms tell us about maize yields on 2,000 farms? We can start by asking if there are any patterns in the values from the 12 farms. To help us find patterns, we can use 2 techniques:

1. ranking;
2. grouping.

First, let us rank the data from lowest to highest yield:

| Farm number | Yield (Kg/ha) |
|---|---|
| 1 | 136 |
| 2 | 215 |
| 9 | 530 |
| 11 | 578 |
| 5 | 639 |
| 7 | 650 |
| 6 | 775 |
| 12 | 817 |
| 3 | 944 |
| 8 | 983 |
| 10 | 1,811 |
| 4 | 2,562 |

Second, let us group the data. One way to group would be in 500 Kg increments (steps). Each group will be a yield class. This will give us the frequency per class: that is, the number of farms in each group:

| Yield class (Kg/ha) | Farm numbers in class | Frequency per class (number of farms) |
|---|---|---|
| 0 - 499 | 1, 2 | 2 |
| 500 - 999 | 9, 11, 5, 7, 6, 12, 3, 8 | 8 |
| 1,000 - 1,499 | | 0 |
| 1,500 - 1,999 | 10 | 1 |
| 2,000 - 2,499 | | 0 |
| 2,500 - 3,000 | 4 | 1 |

We can show this result of grouping by making a bar graph. In the bar graph, we will show the classes on the horizontal X-axis, going from 0 at the left to 3,000 Kg/ha at the right. We will show the frequency per class on the vertical Y-axis, going from 0 at the bottom to 10 farms at the top.

[Figure: bar graph of frequency per class (number of farms, Y-axis, 0 to 10) against yield class (Kg/ha, X-axis, 0 to 3,000).]
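The ranking and grouping steps described above can be sketched in a few lines of Python. This is only an illustration, not part of the original training materials: the variable names are invented for the example, and the yield for farm 3 is taken as 944 Kg/ha, the value consistent with its position in the ranked list.

```python
# Maize yields (Kg/ha) for the 12 sample farms, keyed by farm number.
# Farm 3 is assumed to be 944, matching its place in the ranked list.
yields = {1: 136, 2: 215, 3: 944, 4: 2562, 5: 639, 6: 775,
          7: 650, 8: 983, 9: 530, 10: 1811, 11: 578, 12: 817}

CLASS_WIDTH = 500  # Kg/ha per yield class, as in the text
N_CLASSES = 6      # classes start at 0, 500, 1000, 1500, 2000, 2500

# Technique 1, ranking: farm numbers ordered from lowest to highest yield.
ranked_farms = sorted(yields, key=yields.get)

# Technique 2, grouping: put each ranked farm into its yield class.
# (All yields here are below 3,000, so every farm falls in a listed class.)
farms_in_class = {k * CLASS_WIDTH: [] for k in range(N_CLASSES)}
for farm in ranked_farms:
    lower_bound = (yields[farm] // CLASS_WIDTH) * CLASS_WIDTH
    farms_in_class[lower_bound].append(farm)

# Frequency per class: the number of farms in each group.
frequency = {lower: len(farms) for lower, farms in farms_in_class.items()}

print("Ranked farms:", ranked_farms)
for lower, farms in farms_in_class.items():
    print(f"class starting at {lower:>4} Kg/ha: {frequency[lower]} farms {farms}")
```

The dictionary `frequency` holds the heights of the bars in the bar graph, keyed by the lower limit of each yield class.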
Clearly, this variable, maize yield, has a pattern: there are more values (8 out of 12, or 67% of the sample observations) between 500 and 1,000 Kg/ha than there are values less than 500 Kg/ha (only 2 observations) or more than 1,000 Kg/ha (again only 2 observations). Such a pattern is called a distribution.

One use of statistics is to generate a few numbers that will describe all the values of a sample. For example, here, we would think intuitively that a number between 500 and 1,000 Kg/ha (maybe 750 Kg/ha) would give a fairly good description of these 12 values. This kind of statistics is called descriptive statistics.

Here we want, however, to return to the question of what this sample from 12 farms tells us about the population of 2,000 farms. Most of the values, 67%, were between 500 and 1,000 Kg/ha. Is this representative of the population? Are we ready to infer (extend our judgement) to the whole population, and say we would expect 67% of all 2,000 farms, or 1,333 farms, to have maize yields between 500 and 1,000 Kg/ha?

Let us take another example to make our problem clearer. In a training workshop, weights were measured for 35 participants. The results can be shown in a bar graph:

[Figure: bar graph of number of participants (Y-axis) by weight class (X-axis, 35 to 125 kg).]

We also obtained weights of 35 wrestlers. They looked like this:

[Figure: bar graph of number of wrestlers (Y-axis) by weight class (X-axis, 35 to 125 kg).]

Now, suppose you are given 3 sets of weights of 6 people. You are told that these sets might be taken from the 35 participants, or they might be taken from the 35 wrestlers.
You thus have 3 samples (each set of weights) and 2 populations (one of 35 participants, and one of 35 wrestlers):

| Sample 1 person | Weight (Kg) | Sample 2 person | Weight (Kg) | Sample 3 person | Weight (Kg) |
|---|---|---|---|---|---|
| 1 | [illegible] | 7 | 82 | 13 | 92 |
| 2 | 52 | 8 | 98 | 14 | 71 |
| 3 | 78 | 9 | 87 | 15 | 83 |
| 4 | 61 | 10 | 101 | 16 | 78 |
| 5 | 57 | 11 | 115 | 17 | 77 |
| 6 | 63 | 12 | 99 | 18 | 64 |

Which population would you judge each sample came from? Let's compare the 3 samples with the 2 populations, using the bar graphs:

[Figures: bar graphs of number of persons (Y-axis) by weight class (X-axis, 35 to 125 kg) for the participant population, sample 1, sample 2, and sample 3.]

We would feel confident in saying that sample 1 must be 6 weights from the participant population. Likewise, we would feel confident in saying that sample 2 must be 6 weights from the wrestler population. But what about sample 3? It might be from the participant population, or it might be from the wrestler population.

Let's look a little more closely at sample 3 and the 2 populations. Let's write out the frequencies for each weight class:

| Weight class (Kg) | Participants | Wrestlers | Sample 3 |
|---|---|---|---|
| 35 - 50 | 17% | 0% | 0% |
| 50 - 65 | 54% | 3% | 17% |
| 65 - 80 | 23% | 11% | 50% |
| 80 - 95 | 6% | 29% | 33% |
| 95 - 110 | 0% | 46% | 0% |
| 110 - 125 | 0% | 11% | 0% |
| Total for 50 - 95 | 83% | 43% | 100% |

All the values of sample 3 fall between 50 and 95 Kg. Most of the participants, 83%, also fall between 50 and 95 Kg. So we might say that sample 3 must come from the participants. However, 43% of the wrestlers also fall between 50 and 95 Kg. So there is also a chance that we might be wrong.

Statistical inference is based on principles just like this example. Statistical inference asks: to which population is this sample most similar? What is the chance that this sample is from the first population?
If we decide the chance is high that a sample is from the first population, what is the chance that we have made a wrong judgement, and the sample is really from the second population instead?

In the real world, we never know the populations. However, statistics can tell us the probability that a sample comes from an assumed population. In both of the population bar graphs, we noticed that most values were in the middle, and fewer values were to each side. The distributions were symmetrical (left and right sides similar in shape). In statistics, we usually assume our populations are symmetrical. Such populations are called normal: populations with a normal distribution. They look like this:

[Figure: A Normal Distribution]

The statistical inference techniques introduced in this manual are based on assuming a normal population. These techniques allow us to make judgements about samples, in comparison with assumed populations. Statistical tables give us the probabilities that different measures of a sample come from one or another assumed normal population, just as we looked at the probability of a sample of 50 to 95 Kg weight coming from the participant population. This type of statistics is called analytical statistics.

3. THE "DEVIL'S ADVOCATE" NULL HYPOTHESIS

a. What it is, and why we use it

In the first part of this section, we discussed a situation in which the priority problem for design was declining yield of maize after the first year on cleared shifting cultivation land. The declining yield appeared to be due to increased soil acidity (low pH). The team decided to test the use of lime from local deposits to raise pH.

Before it does the trial, what does the team expect the results to be? What is its best guess about those results? Most likely, its best guess is that the local lime will result in increased yield of maize.
The team, through discussion with farm household members, has most likely chosen this problem for design, instead of other problems, because of a high probability of success of the proposed solution (intervention) (see Unit I: IX for more explanation of how a team looks at probability of success in choosing a design priority).

Before a trial, a team's best guess is a hypothesis. The team may think that there is a high probability of success of the intervention. The team expects that lime will raise pH, which should improve maize yield. Discussions with farm household members suggest that they have the labor to get the lime, haul it, and pound it. The team thinks labor conflicts will not be serious, and the increased food value of the maize will offset the additional labor use.

Nevertheless, the team is not certain of all of this. In particular, the team is not certain that the intervention will be acceptable to farm household members. It is important here to recall that the response which FSR/E assesses is not just the biological response (increased yield of maize), but more fundamentally, the response of farm households to the use of the intervention. Farm household members themselves may be enthusiastic about the trial after the design meetings with the team. Yet they themselves will not really know if they would want to go to all the trouble and expense of digging, hauling, pounding, and applying this local lime every year, until they try it.

This shows a fundamental difference between a trial and a demonstration. In a trial, the acceptability of the intervention (the new practice) is not yet proven. The purpose of the trial is to see if it will be proven. In a demonstration, we say the practice is proven.

In the earlier discussion on samples and populations, we said that we want to use the information from the trial to infer to the population.
The sample was 12 farm households, while the population of the domain was 2,000 farm households. If the 12 farm households do find the local lime acceptable in the trial, how confident will we be in saying the acceptability of local lime is proven for all 2,000 farm households? Are we ready to put our reputations as researchers on the line in a demonstration, in extension meetings, and in the weekly agricultural radio broadcast, and say this is proven? Are the 12 farm households willing to say the local lime is proven to their neighbors who have the same problem of declining maize yields on acid soils? Obviously, this is a very big gamble for both the team and the 12 farm households to take.

How can we increase our confidence in our conclusions before we take this gamble? One way is first to make sure we cannot find any way to disprove our hypothesis of interest, that the local lime is acceptable. If we make every effort to disprove our hypothesis, and it still remains as the best conclusion, then we will have more confidence in it. We will be more willing to take a gamble on an inference from a trial with 12 households to statements we make for 2,000 households.

How do we attempt to disprove our hypothesis? One technique that is standard in statistical analysis is to start with the opposite hypothesis: the hypothesis that there are no differences between the intervention and the current farmer practice. In this example, we start with the hypothesis that there is no difference between using local lime and not using local lime.

Again, note how different this approach is from what we do in a demonstration. In a demonstration, from start to end, we promote the local lime, saying how great a difference we can expect between the maize plot with local lime and the maize plot without local lime.
In a trial, in contrast, we say we don't know, and we are going to look first at the results assuming there is no difference between the maize plot with local lime and the maize plot without local lime.

Since this hypothesis is one of no difference, in statistics we call this the null hypothesis. The word "null" simply means "none": the assumed difference is none. Since we are using this null hypothesis in an attempt to disprove our first hypothesis of interest (that there are differences: that the assumed difference is positive), the null hypothesis is a "devil's advocate" hypothesis. Being the "devil's advocate" means taking the opposite position.

Taking the opposite position and attempting to disprove our first hypothesis is a key characteristic of a scientific method. One step in a scientific method is having others test our results, to build consensus among everyone. We want consensus among all 2,000 farm households, researchers, and extensionists about whether farm households find use of local lime to be acceptable. Obviously, the use of local lime needs to be tested on farms other than the 12 in the trial, to build a really strong consensus.

Using the "devil's advocate" null hypothesis in analysis is the first step towards having others test the results of trials on 12 farms. In essence, the "devil's advocate" null hypothesis takes the place of some of the questions about the results on 12 farms that the other 1,988 farm households may ask. In using the "devil's advocate" null hypothesis, we are trying, in an indirect way, to bring the rest of the domain into the analysis.

b. How it works

To explain how the "devil's advocate" null hypothesis works, we can return to the example of participants and wrestlers. Suppose we are given samples 1 and 3.
Our hypothesis of interest might be that there is a real difference between these 2 samples: that sample 1 comes from the population of the 35 participants, but sample 3 comes from the population of 35 wrestlers. The null hypothesis would then be that both samples 1 and 3 come from the same population, the 35 participants.

The various statistical tests assess the probability that the null hypothesis is true: that both samples 1 and 3 come from the same population. If that probability, based on the statistical test, is low, then we say that we are unable to accept the null hypothesis. Only when the probability of the null hypothesis is low will we be ready to accept an alternate hypothesis: our original hypothesis of interest, that the samples come from different populations. Only then, in this case, will we be ready to accept our original hypothesis that sample 3 in fact comes from the population of 35 wrestlers, rather than the population of 35 participants. We accept our original hypothesis (the alternate hypothesis) only after we have failed to prove the "devil's advocate" null hypothesis using statistical probability.

For the example of maize fields with and without local lime, the null hypothesis for analysis of yields is that the yields from the fields with lime and the yields from the fields without lime both come from the same population. We say that lime really makes no difference and maize fields with lime are the same as maize fields without lime. Only if our statistical analysis indicates a low probability that the null hypothesis is true are we ready to conclude that maize fields with lime are different from maize fields without lime.

How low does the probability of the null hypothesis have to be for us not to accept it? That depends on risk. How exactly do the different statistical tests work? For example, how do we test a null hypothesis for 3 samples, instead of 2?
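To make the logic of the "devil's advocate" null hypothesis concrete, here is a minimal sketch in Python. It is an illustration only and makes two assumptions that are not in the text. First, it uses a permutation test, which is not one of the techniques this manual introduces (those assume a normal population); it is chosen because it needs no statistical tables. Second, it compares samples 2 and 3 rather than samples 1 and 3, because all six weights of those two samples can be read from the table in section 2. The names `sample_2`, `sample_3`, and `gap` are invented for the example.

```python
from itertools import combinations

# Weights (Kg) of samples 2 and 3, as read from the table in section 2.
sample_2 = [82, 98, 87, 101, 115, 99]
sample_3 = [92, 71, 83, 78, 77, 64]

pooled = sample_2 + sample_3
n = len(sample_2)
total = sum(pooled)


def gap(sum_first_group):
    """Absolute difference between the two group sums for one split."""
    return abs(sum_first_group - (total - sum_first_group))


# How far apart the two samples actually are (in summed weight).
observed_gap = gap(sum(sample_2))

# Null hypothesis: all 12 weights come from one population. If so, every
# way of dealing them into two groups of 6 is equally likely.
splits = list(combinations(pooled, n))

# Count the splits whose groups differ at least as much as the real samples.
as_extreme = sum(1 for group in splits if gap(sum(group)) >= observed_gap)
p_value = as_extreme / len(splits)

print(f"observed difference in means: {observed_gap / n:.1f} Kg")
print(f"{as_extreme} of {len(splits)} splits are as extreme; p = {p_value:.4f}")
```

A small value of `p_value` means that a difference as large as the one observed would be rare if both samples came from the same population, which is the situation in which we are unable to accept the null hypothesis.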
The units on analysis techniques in this volume present several techniques and explain how they work.

If we do fail to accept the null hypothesis, and accept our original alternate hypothesis for yield results, it is important to remember that we have not exhausted all possible null hypotheses. Farm household acceptability is based on many components, and yield is only one component. Null hypotheses can also be made about economic returns, or about sharing of costs and benefits among different household members. The more null hypotheses we can test, the better we become at representing the rest of the farm households in the domain. The more null hypotheses we can test, the more confidence we can have in a conclusion that our intervention is more likely to be acceptable than current farmer practice. Our best alternate hypothesis remains the best explanation of the trial results if the different null hypotheses all have low probability.

ACTIVITIES
ACTIVITY ONE: DEVELOPING AN EXAMPLE OF SAMPLES AND POPULATIONS (no participant instructions)
ACTIVITY TWO: DEVELOPING NULL HYPOTHESES FOR TRIALS BASED ON SITE VISITS (no participant instructions)

VOLUME III
UNIT II (II,C)
STATISTICAL NOTATION

OUTLINE
1. Using Summation Symbols
2. Using Variance and Standard Deviation Notation
3. Significance of Sample versus Population Notation

PREREQUISITES:
II: III,B: What Designs Can Do
III: II,B: From Domain to Trial Back to Domain: Samples, Populations, and Statistical Inference

PARTICIPANT LEVEL:
Agricultural research assistant
Extension technology verification technician

LEARNING OBJECTIVES:
After completing this section the participants will be able to:
1. Use summation symbols to calculate sums and sums of squares.
2. Use variance and standard deviation symbols with subscripts for samples and populations.
3. Use bars to show mean values.

KEY POINTS:
1. Summation symbols with the variable squared without parentheses give sums of squares.
2. Summation symbols with parentheses around the symbol and the variable, and the square outside the parentheses, give squares of sums.
3. Greek letters refer to populations, while Roman letters refer to samples.

DEFINITIONS:
population, sample, statistic, variable

DISCUSSION:

1. USING SUMMATION SYMBOLS

Field books contain many numbers. Summary statistics reduce these numbers to a few numbers that characterize the data. To reduce the many numbers to a few, we add up long lists of numbers. For example, we add up all the values for each column of 4 treatments.

Unit II: III,B explains how random variation is measured. Random variation shows deviations (differences) from the mean (average). To keep positive and negative deviations from cancelling, we first square the deviations. Most procedures in statistical analysis use squaring.

It is convenient to explain statistical calculations with formulas. Here is a simple data set:

| Column 1 | Column 2 |
|---|---|
| 10 | 8 |
| 9 | 6 |
| 7 | 3 |
| 5 | 7 |
| 11 | 12 |
| 4 | 3 |

Let's look at two ways to explain some calculations on this data set. First, we could write formulas for the calculations in words. For example:

1. "Square each number in column 1 and add up all the squared numbers."
2. "Then square each number in column 2 and add up all those squared numbers."

We perform these calculations:

1. $10^2 + 9^2 + 7^2 + 5^2 + 11^2 + 4^2$ = (10 x 10) + (9 x 9) + (7 x 7) + (5 x 5) + (11 x 11) + (4 x 4) = 100 + 81 + 49 + 25 + 121 + 16 = 392
2. $8^2 + 6^2 + 3^2 + 7^2 + 12^2 + 3^2$ = (8 x 8) + (6 x 6) + (3 x 3) + (7 x 7) + (12 x 12) + (3 x 3) = 64 + 36 + 9 + 49 + 144 + 9 = 311

There is a simpler way to write this, however. It uses the symbol $\sum$. This is the Greek symbol "sigma." It means add. We can write $\sum X^2$, where $X$ is a variable, and $X^2$ means the value of the variable squared. In our example, we have two variables, one for each column. $\sum X^2$ tells us to do what it took us two sentences of words and ten lines of numbers to do.

Another calculation might be:

1. "Add all the numbers up in column 1 and square that total."
2. "Then add all the numbers up in column 2 and square that total, too."

We add:

1. 10 + 9 + 7 + 5 + 11 + 4 = 46; $46^2$ = 46 x 46 = 2,116
2. 8 + 6 + 3 + 7 + 12 + 3 = 39; $39^2$ = 39 x 39 = 1,521

There is also a simpler way to write this. It also uses $\sum$, but with parentheses: $(\sum X)^2$. Since the $\sum X$ is inside the parentheses, and the square sign is outside the parentheses, we do the addition first, before squaring. $(\sum X)^2$ is thus a square of a sum.

2. USING VARIANCE AND STANDARD DEVIATION NOTATION

Units II: III,B and III: II,B explain the difference between a population and a sample. In FSR/E, a population may be all the farm households belonging to a domain. A sample in this case would be the cooperating farm households from the domain participating in an on-farm trial. In the on-farm trial, we have data only from the sample. From the sample data we can calculate the variance (based on the sum of squared deviations from the mean) and the standard deviation (the square root of the variance).

It is convenient to express this also with symbols. For a sample we use lower-case (small) Roman letters:

$s$ = standard deviation
$s^2$ = variance

The variance and standard deviation can be calculated for many types of data. To show what kind of variance or standard deviation, we often put a small letter to the lower right of the $s$ or $s^2$. This small letter is called a subscript. The subscript usually is the first letter of the word that tells what type of variance or standard deviation it is. Here are some examples:

$s_{\bar{d}}$ = standard deviation of the mean difference
$s_p^2$ = pooled variance

The subscript of the first example has a bar above the d. This indicates a mean. Contrast the following for the first column:

$Y$ = any value: 10, 9, 7, 5, 11, or 4.
$\bar{Y}$ = the mean: (10 + 9 + 7 + 5 + 11 + 4) / 6 = 46 / 6 = 7.67

For the whole population the Greek letter for s is used:

$\sigma^2$ = population variance
$\sigma$ = population standard deviation

3. SIGNIFICANCE OF SAMPLE VERSUS POPULATION NOTATION

We could not calculate the variance, $\sigma^2$, or standard deviation, $\sigma$, for the population unless we had data from every farm in the domain of the trial. However, we will frequently see the Greek symbols for population variance and standard deviation in statistics texts. This is because the sample estimates the population values. That is, the results of the on-farm trials are an estimate of the results we could expect with all farms in the domain. Of course, this is one of the key problems of extension: how to predict whether a new technology will work with farmers. Statistics gives us objective standards to judge how good our estimate is (see Unit II: III,B). This is why statistics can be helpful.

It is why $\sigma^2$ and $\sigma$ are in the end really important, too. We can never know them, but the better we design our trials, with more representative farm households and more real participation by farm household members, the closer we can get to knowing $\sigma^2$ and $\sigma$. That means the closer we can get to solving a key problem of research and extension: how to determine recommendations that will work for the whole population of farm households in a domain.

VOLUME III
UNIT II (II,D)
TECHNIQUES FOR SUMMARIZING AND DESCRIBING DATA

OUTLINE
1. Looking at Data Initially
2. Summarizing Data with Descriptive (Summary) Statistics
3. Looking at Data Again

PREREQUISITES
III: I: A framework for analysis
III: II,A: Constructing a biological data set
III: II,B: Samples, populations, and statistical inference
III: II,C: Statistical notation

PARTICIPANT LEVEL
Agricultural research assistant
Extension technology verification technician

LEARNING OBJECTIVES

ACKNOWLEDGEMENT
Section 1.b.(1), section 2, and section 3 are taken, with modification, from Clive Lightfoot, Visayas State College of Agriculture (Visca), Baybay, Leyte, the Philippines.

DISCUSSION

1. LOOKING AT DATA INITIALLY

A team works with a group of cooperating farm households over a season to carry out a trial. At the end of the trial, the team has a field book with many numbers. These numbers form a data set. As explained in Unit III: II,A, the data set is arranged in a matrix: an arrangement of labels and numbers in rows and columns. In the matrix, each row gives the value of a dependent variable (what we observe and measure: weight gain, yield, etc.) for a unique combination of classifiers (which animal or plot in which block on which farm) and independent variables (the different ways we vary one step in production, with treatments: interventions and controls, such as new breed vs. farmer breed, fertilizer vs. no fertilizer, etc.).

The data in the matrix of a field book page are usually too many to see what has happened in the trial. Also, the numbers are usually entered in the order that the classifiers (farms, blocks, animals, or plots) and treatments appear in the field. As explained in II: III,B, the order in the field is based on randomization of the treatments. Thus the arrangement of the numbers in the field book does not follow any logical order. They are confusing to us.

How can we see through this confusion? How can we find patterns in the data? How can we reduce all the numbers in the field book to fewer numbers that are easier to see? This unit presents several techniques for this.

a. Summary Tables

The first step in looking at data is to put them in logical order. To do this, we make a summary table. A summary table rearranges the data in the field book for one dependent variable into a new, smaller matrix. Thus, many summary tables can be made from a field book.

How do we make a summary table from a field book? Let's take a simple example first. Here are data from 6 farms, with 3 treatments (3 varieties). Note the structure of the data set.
Table III: II,D.1

| Farm | Variety | Stand | Disease resistance rating | Yield |
|---|---|---|---|---|
| V | 2 | 77 | 6 | 0.81 |
| V | 3 | 81 | 5 | 0.84 |
| V | 1 | 78 | 7 | 0.91 |
| VI | 1 | 91 | 10 | 1.15 |
| VI | 3 | 98 | 10 | 1.68 |
| VI | 2 | 93 | 10 | 1.47 |
| II | 3 | 81 | 8 | 0.97 |
| II | 2 | 87 | 8 | 1.07 |
| II | 1 | 84 | 9 | 1.01 |
| IV | 2 | 55 | 7 | 0.82 |
| IV | 1 | 58 | 8 | 0.85 |
| IV | 3 | 56 | 6 | 0.73 |
| I | 2 | 89 | 9 | 1.43 |
| I | 1 | 96 | 9 | 1.29 |
| I | 3 | 95 | 9 | 1.71 |
| III | 3 | 88 | 8 | 1.29 |
| III | 2 | 87 | 7 | 1.23 |
| III | 1 | 89 | 8 | 1.11 |

Farm is the classifier, variety is the independent variable, and stand, disease resistance rating, and yield are the dependent variables.

It is hard to see what these data tell us. Let's start by making a summary table for yield. To make the summary table, we follow 3 steps:

1. Order the classifiers down;
2. Order the treatments of the independent variable across;
3. Rearrange the data values from the field book column of the dependent variable chosen into the new matrix.

Schematically, here is what we do:

Table III: II,D.2

[Schematic: the field book rows of Table III: II,D.1 are grouped by farm. Step 1: order classifiers down. Step 2: order treatments across. Step 3: rearrange the yield data into the new matrix. Stand and disease resistance rating would each have their own summary table.]

| Farm | Variety 1 | Variety 2 | Variety 3 |
|---|---|---|---|
| I | 1.29 | 1.43 | 1.71 |
| II | 1.01 | 1.07 | 0.97 |
| III | 1.11 | 1.23 | 1.29 |
| IV | 0.85 | 0.82 | 0.73 |
| V | 0.91 | 0.81 | 0.84 |
| VI | 1.15 | 1.47 | 1.68 |

b. Scatter Plots

(1) Developing a scatter plot to compare an intervention with farmer control.

Let's take another summary table.
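Before turning to that table, the three steps for building the yield summary table in section a can be sketched in Python. This is only an illustration under stated assumptions: the names `field_book`, `ROMAN_ORDER`, and `summary` are invented for the example, and the yield for farm V, variety 2 is taken as 0.81, the value given in Table III: II,D.2.

```python
# Field book rows in field order: (farm, variety, yield).
# Farm V, variety 2 is assumed to be 0.81, as in Table III: II,D.2.
field_book = [
    ("V", 2, 0.81), ("V", 3, 0.84), ("V", 1, 0.91),
    ("VI", 1, 1.15), ("VI", 3, 1.68), ("VI", 2, 1.47),
    ("II", 3, 0.97), ("II", 2, 1.07), ("II", 1, 1.01),
    ("IV", 2, 0.82), ("IV", 1, 0.85), ("IV", 3, 0.73),
    ("I", 2, 1.43), ("I", 1, 1.29), ("I", 3, 1.71),
    ("III", 3, 1.29), ("III", 2, 1.23), ("III", 1, 1.11),
]

ROMAN_ORDER = ["I", "II", "III", "IV", "V", "VI"]

# Step 1: order the classifiers (farms) down.
farms = sorted({farm for farm, _, _ in field_book}, key=ROMAN_ORDER.index)

# Step 2: order the treatments (varieties) across.
varieties = sorted({variety for _, variety, _ in field_book})

# Step 3: rearrange the yield values into the new farm-by-variety matrix.
summary = {farm: {} for farm in farms}
for farm, variety, yield_value in field_book:
    summary[farm][variety] = yield_value

# Print the summary table, one farm per row.
print("Farm   " + "  ".join(f"V{variety}  " for variety in varieties))
for farm in farms:
    row = "  ".join(f"{summary[farm][variety]:.2f}" for variety in varieties)
    print(f"{farm:<6} {row}")
```

The same three steps, applied to the stand or disease resistance rating column instead of yield, would give the other summary tables that can be made from this field book.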
This summary table is simpler, with just 2 treatments rather than 3:

Table III: II,D.3

| Farm | Experimental treatment | Farmer control |
|---|---|---|
| 1 | 136.1 | 215.5 |
| 2 | 944.4 | 2562.5 |
| 3 | 439.8 | 775.9 |
| 4 | 650.0 | 1183.0 |
| 5 | 330.0 | 1011.6 |
| 6 | 1578.7 | 1417.0 |

We can begin finding patterns by plotting the data for each treatment in a scatter diagram. It is much easier to see what is happening when observations are plotted out rather than listed in columns of numbers. In order to get an upward trend in the plot, the data should be ranked. One way to rank is to use the farmer control plot data. The farms are placed in order of lowest to highest farmer control yield on the bottom or X-axis of the graph, while yield forms the vertical Y-axis. We can then make 2 graphs, one for the yield of the farmer control, and another for the yield of the experimental treatment. Both graphs have the same X-axis, but the points above each farm will be different because the yields of the treatments are different. This helps identify patterns of some low yields, some high yields, and most yields in a middle area. Making a scatter plot will prompt you to explain why some are low and some are high.

[Figure III: II,D.1. Scatter Diagram of Corn Yield. Two panels, farmer control and experimental treatment, each plotting yield (Y-axis) against farm number in the order 1-3-5-4-6-2 (X-axis).]

(2) Developing three types of scatter plots from data with more than two treatments

We can apply the same techniques used in the two treatments example to data sets with more than two treatments. Let's take a different example and apply the same technique. The data are from an area where non-irrigated maize is grown for both human and animal consumption. Some farmers apply chemical fertilizer, while most do not. The data are from a trial designed to determine fertilizer recommendations under farmer conditions.
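The ranking technique from the two treatments example, which the fertilizer example reuses, can be sketched in Python as follows. This is an illustration only: the variable names are invented for the example, and the sketch stops at preparing the points, which could then be drawn by hand or passed to any plotting tool.

```python
# Yields from Table III: II,D.3:
# farm number -> (experimental treatment, farmer control).
yields = {
    1: (136.1, 215.5),
    2: (944.4, 2562.5),
    3: (439.8, 775.9),
    4: (650.0, 1183.0),
    5: (330.0, 1011.6),
    6: (1578.7, 1417.0),
}

# Rank the farms from lowest to highest farmer control yield.
# This order is used on the X-axis of both graphs.
x_order = sorted(yields, key=lambda farm: yields[farm][1])

# Points for each graph: (farm on the X-axis, yield on the Y-axis).
control_points = [(farm, yields[farm][1]) for farm in x_order]
experimental_points = [(farm, yields[farm][0]) for farm in x_order]

print("X-axis order of farms:", "-".join(str(farm) for farm in x_order))
print("Farmer control points:", control_points)
print("Experimental treatment points:", experimental_points)
```

Because the farms are ranked by the farmer control, the control points always rise from left to right, while the experimental treatment points rise only where the treatment follows the same pattern across farms.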
The data for 4 N treatments without P205 (N = 0, N = 50, N = 100, and N = 150 Kg/ha) are shown in Table III: II,D.4. These data are shown as they might appear in a field book. The data in Table III: II,D.4 were in turn taken from a larger data set with 12 treatments. The larger data set is shown in Table III: II,D.5. The liager data set shows results for the same 4 levels of N at 3 levels of P205 (P205 = 0, P205 = 25, and P205 = 50 Kg/ha). Table III: II,D.5 is arranged in a summary table format. The columns show the treatment combinations, and the rows show the farms. The data from Table III: II,D.4 are rearranged to appear as the data for the first 4 columns at the left, with P205 = 0 for each N level. Table III: II,D.5 also shows the average of all 12 treatments at each location, in the last column to the right (labeled "Avg."). This average at each location is called the environmental index for each location. Volume III: II,D page 63 Table III: III,D.4 Observation Farm no Nitroen level Yield 1 7 150 4-.92 2 7 0 4.74 3 7 50 5.41 4 7 100 4.29 5 5 0 1.64 6 5 100 2.08 7 5 150 2.19 8 5 50 1.92 9 2 100 5.14 10 2 50 2.60 11 2 0 1.53 12 2 150 5.32 13 8 0 1.21 14 8 100 1.97 15 8 150 2.23 16 8 50 2.33 17 6 50 2.94 18 6 100 4.14 19 6 150 4.34 20 6 0 1.61 21 1 150 3.76 22 1 0 0.40 23 1 100 3.63 24 1 50 1.24 25 4 50 3.82 26 4 0 2.42 27 4 100 5.23 28 4 150 4.48 29 3 150 4.87 30 3 0 4.15 31 3 100 5.80 32 3 50 4.87 Volume III: II,D page 64 Table III: II,D.5 Complete Data Set from Maize Fertilizer Trial Fertilizer treatment (Kg/ha) N: 0 50 100 150 0 50 100 150 0 50 100 150 Tr P 25 0 0 0 0 25 25 25 25 50 50 50 50Avg. 
T_ 0.40 1.24 3.63 3.76 0.79 2.58 4.23 4.72 1.67 2.51 3.28 3.66 2.71 2 1.53 2.60 5.14 5.32 1.67 3.79 5.10 6.83 1.41 4.13.5.89 6.27 4.14 3 4.15 4.86 5.80 4.87 4.44 5.00 4.97 5.28 5.12 5.66 6.36 6.62 5.18 4 2.42 3.82 5.23 4.48 2.36 4.54 6.26 7.17 1.61 4.41 5.38 6.58 4.52 5 1.64 1.92 2.08 2.19 2.04 3.21 3.12 2.93 1.44 3.44 3.32 3.62 2.68 6 1.61 2.94 4.14 4.34 1.81 3.92 3.61 3.81 1.18 3.89 5.38 4.92 3.46 7 4.74 5.41 4.29 4.92 4.91 5.22 5.38 5.14 5.10 4.88 4.54 5.28 4.98 8 1.21 2.33 1.97 2.23 1.53 2.78 2.49 2.80 1.37 3.51 3.75 4.35 2.53 Avg. 2.21 3.14 3.91 4.01 2.44 3.88 4.40 4.84 2.36 4.06 4.74 5.16 3.76 t t t .t 1 I I I I _I! Data from field book table III: II,D.4 Environmental rearranged in summary table. index for all 12 treatments Perrin et al. 1976 We can use the data in the first four columns of Table III: II,D.5 to illustrate 3 techniques: 1. An intervention (N = 100) compared with the farmer control (N = 0). In this example, as in the previous example, the farmer contol is on the X-axis and the intervention data values are plotted on the Y-axis(see Figure III:II,D.2a). 2. Several treatments (N = 0 and N = 100) compared with the mean of all treatments. Another technique for making a scatter plot is to plot the values of each individual treatment against the values of the environmental index. Figure III:II,D.2b shows how this is done for two of the four treatments (N = 0 and N = 100). The values of the environmental index are plotted on the X-axis. Both of the interventions (N = 100) and the farmer control (N = 0) are now plotted on the Y-axis. The other two interventions (N = 50 and N = 150) could also be plotted. Participants should prepare-another graph and plot these two against the environmental index. How do they compare with the plot of N = 100 and N = 0 versus the environmental index? A line is drawn between the values for N = 0 (the *) and another line is drawn between the values for N = 100 (the -). These lines suggest two trends: a. 
There is higher yield at N = 100 than at N = 0;

b. The yield differences become smaller at higher values of the environmental index (compare the difference at environmental index = 5.18 versus at environmental index = 2.53). This suggests that there is less response to nitrogen at higher levels of the environmental index. Perhaps soils at the higher levels have more organic matter and less need for added nitrogen. The team needs to find out what characteristics of farms at the higher levels of the environmental index are different from characteristics of farms at the lower levels.

These trends are only visual approximations suggested by looking at the data. Testing whether they are real, and determining exactly where the lines should be drawn, requires regression techniques. Unit III: III,B,1 explains how to do regression, and Unit III: III,B,3 explains various ways to interpret the results of this type of regression, called modified stability analysis.

[Figure III: II,D.2a: yield at N = 100 (Y-axis) plotted against the farmer control, N = 0 (X-axis).]

[Figure III: II,D.2b: yields at N = 0 and N = 100 (Y-axis) plotted against the environmental index, the average of 12 treatments (X-axis).]

3. The means of each treatment on the Y-axis compared with the levels of the treatments, ordered from N = 0 to 50 to 100 to 150, on the X-axis. This leads to response curves and dominance analysis. For example, see Unit II: II,C,1, especially Figure II: II,C,1.1, and the example in Shaner pages 126 - 139 (compare Figure 7.2 p. 120 versus Figure 7.3 p. 132, of Shaner).

2. SUMMARIZING DATA WITH DESCRIPTIVE (SUMMARY) STATISTICS

The patterns of high, low, and middle observations can also be summarized mathematically. We can measure 2 types of characteristics:

1. Togetherness: What value best represents all the data values?
2. Apartness: How different are the data values from one another?

a.
Togetherness

We illustrate here 2 measures of togetherness:

1. Mean
2. Median

Means are calculated by summing the yields from all farms and dividing that sum by the total number of farms. The calculation is shown below using the same data as is used for the scatter plot in Section II: I,b, (1) (Scatter plots). The mean for each treatment is obtained by adding up all the values for the treatment and dividing by the number of farms:

Farm    Experimental treatment    Farmer control
1              136.1                   215.5
2              944.4                  2562.5
3              439.8                   775.9
4              650.0                  1183.0
5              330.0                  1011.6
6             1578.7                  1417.0
Sum           4079.0                  7165.5

Mean of experimental treatment:

$$\bar{X} = \frac{\sum X}{n} = \frac{4079.0}{6} = 679.8$$

Mean of farmer control:

$$\bar{X} = \frac{\sum X}{n} = \frac{7165.5}{6} = 1194.2$$

Essentially, the mean estimates the typical score, or the score that best characterizes the data. But where data are very variable, the mean becomes a poor estimate of typical values. Consider the following treatment data, whose mean of 3469 would not be recognized by any of the farmers: for two farmers it would be too low, while for the other two it would be too high. Similarly, in the column of farm control data, the mean of 2504 would be recognized by two farmers but not by those who achieved yields of 3300 or 1133.

n                   Farm control    Experimental treatment
1                      3300               3927
2                      2700               1683
3                      2883               6867
4                      1133               1400
Mean ($\bar{X}$)       2504               3469

The median is the middle data value when the data are ranked from lowest to highest. If the number of data values is even, the median is the average of the two middle data values.

For the farm control,

$$\text{median} = \frac{2700 + 2883}{2} = 2791.5$$

For the experimental treatment,

$$\text{median} = \frac{1683 + 3927}{2} = 2805$$

The median is less affected than the mean by unusual values, such as 6867 for the experimental treatment.

b. Apartness

We illustrate here 3 measures of apartness:

1. Range
2. Variance
3. Standard deviation

Because yield data always vary, that is, they are spread out, it is not enough to describe a distribution just by its mean.
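As a check on the two measures of togetherness described above, the mean and median of the four-farm example can be reproduced with a few lines of Python. This is only a minimal sketch: the variable and function names are illustrative, and the standard-library `statistics` module stands in for the hand calculation.

```python
from statistics import mean, median

# Yields from the four-farm example in the text.
farm_control = [3300, 2700, 2883, 1133]
experimental = [3927, 1683, 6867, 1400]


def summarize(values):
    """Return (mean, median) for a list of yields."""
    return mean(values), median(values)


control_mean, control_median = summarize(farm_control)
exp_mean, exp_median = summarize(experimental)

print(f"Farm control: mean = {control_mean:.1f}, median = {control_median:.1f}")
print(f"Experimental: mean = {exp_mean:.1f}, median = {exp_median:.1f}")
```

The unusual value of 6867 pulls the experimental mean well above the experimental median, while the two medians stay close to each other.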
For example, consider the following two sets of imaginary yield data. Both the experimental treatment and the farmer control have the same mean but very different spread or variation.

                          Experimental treatment    Farmer control
                                  520                    320
                                  560                    640
                                  380                    800
                                  620                    180
                                  420                    560
Sum of X ($\sum X$)              2500                   2500
Mean ($\bar{X}$)                  500                    500
Standard deviation (s)             99                    249
Range                             240                    620

Our problem is how to express this variation in one quantity. The simplest measure is the range. The range is calculated by subtracting the lowest value from the highest value. For the experimental treatment, the range is 620 - 380 = 240. For the farmer control, it is 800 - 180 = 620. There is, however, a problem with the range: it is "undemocratic". It does not consider all the scores, only the two most extreme.

Ideally, you want a measure that considers all the scores. Since we want to measure variability from the center, or mean, we might calculate how much each score deviates from the mean, either positively or negatively. Still, we have a problem. When deviations are added up, positive and negative values cancel each other out, so the total deviation ends up as zero. Squaring a number removes its direction (positive or negative) and makes everything positive, so the squared deviations can be added up without cancelling. The sum of the squared deviations, divided by n - 1, is called the variance.

Variances can be obtained longhand by calculating the deviation of each score from the mean. There is, however, a quicker way to calculate the variance, using the following formula:

$$s^2 = \frac{\sum X^2 - \frac{(\sum X)^2}{n}}{n - 1}$$

Let us use our 1983 SRMU corn data to show how variance is calculated from this formula.
For the experimental treatment:

n          X              X^2
1        136.1          18523.2
2        944.4         891891.4
3        439.8         193424.0
4        650.0         422500.0
5        330.0         108900.0
6       1578.7        2492293.7
Sum     4079.0        4127532.3

$$s^2 = \frac{\sum X^2 - \frac{(\sum X)^2}{n}}{n - 1} = \frac{4127532.3 - \frac{(4079)^2}{6}}{6 - 1}$$

Variance $s^2$ = 270898.4
(Standard Deviation) s = 520.5

For the farmer control,

n          X              X^2
1        215.5          46440.3
2       2562.5        6566406.3
3        775.9         602020.8
4       1183.0        1399489.0
5       1011.6        1023334.6
6       1417.0        2007889.0
Sum     7165.5       11645580.0

$$s^2 = \frac{\sum X^2 - \frac{(\sum X)^2}{n}}{n - 1} = \frac{11645579.8 - \frac{(7165.5)^2}{6}}{6 - 1} = 617636.3$$

(Standard error) s = 785.9

While the variance is useful for statistical tests, it is not so helpful for interpretation. Here, the standard deviation is used. The standard deviation is simply the square root of the variance. This value can be plotted on a scatter diagram along with the mean to give a better picture of the data. Remember, deviation is both positive and negative from the mean, so lines must be drawn above and below the mean. Most of your observations will fall within one standard deviation of the mean; it is the values lying outside that must be explained. So our completed corn scatter diagram will look like the diagram below.

[Figure III: II,D.3: Scatter Diagram with Mean and Standard Errors for 2 Treatments. Two panels, farmer control and experimental treatment; X-axis: farm number; Y-axis: yield. Horizontal lines mark the mean and one standard deviation above and below it: 1194, 1979, and 409 for the farmer control; 679, 1199, and 159 for the experimental treatment.]

Standard deviations of treatments are more properly called standard errors. We could also calculate the variance for all 12 values (both treatments). Its square root is the standard deviation for the whole experiment. The standard deviation for a whole experiment measures the "average" difference of any value from the overall mean. It is a measure of the average variability of all the data points.
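The short-cut variance calculation above can be sketched in Python as follows. This is an illustration rather than part of the original unit: the function name `shortcut_variance` is made up for the example, and the farm 1 experimental yield is taken as 136.1, the value consistent with the column total of 4079.0.

```python
from math import sqrt


def shortcut_variance(values):
    """Variance from the short-cut formula:
    (sum of X^2 - (sum of X)^2 / n) / (n - 1)."""
    n = len(values)
    sum_x = sum(values)
    sum_x2 = sum(x * x for x in values)
    return (sum_x2 - sum_x ** 2 / n) / (n - 1)


# 1983 SRMU corn data, farms 1 to 6.
experimental = [136.1, 944.4, 439.8, 650.0, 330.0, 1578.7]
farmer_control = [215.5, 2562.5, 775.9, 1183.0, 1011.6, 1417.0]

for label, data in (("Experimental treatment", experimental),
                    ("Farmer control", farmer_control)):
    variance = shortcut_variance(data)
    print(f"{label}: variance = {variance:.1f}, s = {sqrt(variance):.1f}")
```

Only the running totals of X and X squared are needed, which is why the same formula suits a hand calculator with a memory key.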
The definitional formula for calculating the standard deviation is:

$$s = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}}$$

For a single treatment, $\bar{X}$ is the treatment mean, while for the whole experiment, $\bar{X}$ is the overall mean. In calculating the standard deviation in the previous example, we used a "short-cut" formula instead:

$$s = \sqrt{\frac{\sum X^2 - \frac{(\sum X)^2}{n}}{n - 1}}$$

Why do we say it is a "short-cut" formula? The reason is that hand held calculators with statistical keys allow us to enter only each data value (each X), and then press one key to obtain s. We can also press 2 other keys if we want to see the values of $\sum X^2$ and $\sum X$. The calculator has obtained all 3 values automatically for us.

For example, take the following data to show how to calculate the standard deviation for a whole experiment, and to understand better what the standard deviation tells us:

              Treatments
            a       b       c
            4       9      16
            3      10      17
            5      11      16
            4      10      15
Totals     16      40      64
Mean        4      10      16

$$\sum X = 4 + 3 + \dots + 15 = 120$$

or,

$$\sum X = 16 + 40 + 64 = 120$$

$$(\sum X)^2 = (120)^2 = 14{,}400$$

$$\sum X^2 = 4^2 + 3^2 + \dots + 15^2 = 1{,}494$$

$$s^2 = \frac{1{,}494 - \frac{14{,}400}{12}}{11} = \frac{294}{11} = 26.7$$

$$s = 5.17$$

With a hand held calculator, enter the data points (4, 3, ... 15) and press the s key. Verify that you get the same result, 5.17. Also verify with the $\sum X^2$ key that you get 1,494, and verify with the $\sum X$ key that you get 120.

One could also calculate s by hand using the first formula. To do this we first need the grand mean, $\bar{X}$:

$$\bar{X} = \frac{16 + 40 + 64}{12} = \frac{120}{12} = 10$$

Next, we construct a table to determine the deviation of each value from the grand mean, that is, each $X - \bar{X}$. We also add the value of each squared deviation, $(X - \bar{X})^2$.

Data value no.     X     Xbar    X - Xbar    (X - Xbar)^2
 (1)               4      10       -6            36
 (2)               3      10       -7            49
 (3)               5      10       -5            25
 (4)               4      10       -6            36
 (5)               9      10       -1             1
 (6)              10      10        0             0
 (7)              11      10        1             1
 (8)              10      10        0             0
 (9)              16      10        6            36
(10)              17      10        7            49
(11)              16      10        6            36
(12)              15      10        5            25
Totals           120     120        0           294

Note that the sum of all the deviations from the mean (the sum of all the $X - \bar{X}$ values in the third column) equals zero.
This makes sense if we think about it: the deviations from the grand mean of all the X values like 15 or 16 that are bigger than the grand mean of $\bar{X}$ = 10 exactly cancel out the deviations from the grand mean of all the X values like 5 or 4 that are smaller than the grand mean of $\bar{X}$ = 10. That is why the mean can represent all these values: it is exactly in the middle of all the bigger values and all the smaller values.

However, cancelling out all the $X - \bar{X}$ deviations does not help us obtain an average deviation. Obviously, the average $X - \bar{X}$ deviation is not zero. In fact, only 2 original data values [data value nos. (6) and (8)] equal 10 and have a deviation of zero. The other original data values have $X - \bar{X}$ deviations that range from -7 to 7.

How do we solve this problem? We solve it by squaring all the deviations. Squaring eliminates all the minuses. This gives us the $(X - \bar{X})^2$ values in the fourth column. Note that the total for this fourth column is 294. This is exactly the same value that we obtained using the second formula's term of:

$$\sum X^2 - \frac{(\sum X)^2}{n}$$

We next divide by 11. This is a weighting factor. It weights for the number of observations minus one, since we cannot randomly assign the last treatment to the last plot. This gives us an unbiased value.

However, we still have a value that looks too large:

$$\frac{294}{11} = 26.7$$

This cannot be the average $X - \bar{X}$ deviation when the range of $X - \bar{X}$ values, as we just saw, is -7 to 7. Why is this value too large? The reason is, of course, that we squared all the $X - \bar{X}$ values. So our last step is to reverse the squaring, by taking the square root:

$$\sqrt{26.7} = 5.17$$

This makes more sense. If we look at the $X - \bar{X}$ values, we can classify the original data values into 2 groups:

Group                                             Data value no.                      Total no. of data values
A. Data values with X - Xbar greater than         (1), (2), (4), (9), (10), (11)               6
   5.17 or less than -5.17.
B. Data values with X - Xbar between 0 and        (3), (5), (6), (7), (8), (12)                6
   5.17 or between 0 and -5.17.

Our standard deviation of 5.17 ignores the minus sign.
It is an average value of the size of the numbers themselves, or the absolute values of the numbers. Now we see that half of the original data values have deviations with absolute values greater than 5.17 (group A), and half have deviations with absolute values less than 5.17 (group B). The standard deviation of 5.17 in fact looks reasonable. This of course is what we obtained with the calculator using the second formula.

The second formula is clearly easier to use than the first formula. It is too tedious to calculate all the $X - \bar{X}$ values by hand. However, we should always remember that what we are doing with the "short-cut" formula and the calculator keys is really obtaining an average deviation by squaring each deviation, adding up the squared deviations, weighting, and then taking the square root.

In ANOVA, we will obtain the standard deviation yet another way. Instead of using either formula, we will use a term called the error mean square (also called residual mean square):

$$s = \sqrt{\text{error mean square}}$$

Unit III,A,1 explains how to derive the error mean square, and how to use s to calculate another measure of variability of the experiment, the coefficient of variation (CV).

3. LOOKING AT DATA AGAIN

a. Calculation of Likely Outcomes: Confidence Intervals

(1) Adding Confidence Intervals to Scatter Plots

What is the range of yield outcomes that farmers can expect, or are likely to get, the next time they do the experimental treatment? Here, our first problem is to select what level of likelihood or confidence the farmer wants: 9 times out of 10, 8 times out of 10, or 5 times out of 10, that is, half the time? As a researcher I might want to be very confident and go for 9 out of 10, but it is the farmers who are going to decide whether to adopt, whether to take the risks. So let them, the farmers, decide on the level of confidence. The researchers' job is to give farmers the best information possible.
In this case, this is a range of confidence levels from which farmers may choose.

There is one caution about these kinds of predictions of likely future outcomes. You must remember that your predictions are made from data gathered in particular circumstances. Thus they only apply to those types of circumstances. If you do an experiment under ideal research station circumstances, you cannot expect your predictions to apply exactly to a farm. It is therefore important that your on-farm experiments cover the range of circumstances farmers are likely to face in the future.

So give your farmers the range of yields at different levels of confidence. The next thing to do is to calculate the ranges or intervals of outcomes, that is, the size of variation or deviation from the mean at different levels of confidence, using the formula:

$$\text{Confidence Interval} = \bar{X} \pm t \sqrt{\frac{s^2}{n}}$$

Again using the 1983 corn data, confidence intervals are calculated thus:

Step one: Obtain from the Student's 't' table the 't' value for 90, 80, 70, 60, and 50% probability at n - 1 degrees of freedom.

Step two: Calculate the deviation from the mean, that is, the standard deviation of the mean (SD; often called the standard error of the mean):

$$\sqrt{\frac{s^2}{n}} = \sqrt{\frac{270898.4}{6}} = 212.5$$

and multiply this deviation by the 't' value for each confidence level.

Step three: Add the deviation value obtained in step two to the mean (679.8 for the experimental treatment).

Step four: Subtract the deviation value obtained in step two from the mean.
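The four steps above can be sketched in Python. The 't' values are hard-coded from this unit's own figures for n - 1 = 5 degrees of freedom, so no statistical library is needed; the function name `confidence_intervals` and the dictionary layout are illustrative assumptions, not part of the original method.

```python
from math import sqrt

# Step one: 't' values at n - 1 = 5 degrees of freedom, as used in this unit.
T_VALUES = {90: 2.01, 80: 1.47, 70: 1.15, 60: 0.92, 50: 0.72}


def confidence_intervals(mean, variance, n, t_values=T_VALUES):
    """Return {confidence level: (lower, upper)} for a treatment mean."""
    sd_of_mean = sqrt(variance / n)          # step two: sqrt(s^2 / n)
    intervals = {}
    for level, t in t_values.items():
        half_width = t * sd_of_mean          # step two: multiply by 't'
        intervals[level] = (mean - half_width,   # step four: subtract
                            mean + half_width)   # step three: add
    return intervals


experimental = confidence_intervals(mean=679.8, variance=270898.4, n=6)
farmer_control = confidence_intervals(mean=1194.2, variance=617636.3, n=6)

for level in sorted(experimental, reverse=True):
    low, high = experimental[level]
    print(f"{level}%: experimental treatment {low:.1f} to {high:.1f}")
```

Because the 't' values are rounded to two decimals and the intermediate result is not rounded, the printed limits may differ from a hand calculation in the last decimal place.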
The results are shown below:

Experimental Treatment

CI level    't' value     SD      Result of step 2     Mean     Result of step 3    Result of step 4
   90         2.01       212.5         427.1           679.8        1106.9              252.7
   80         1.47       212.5         312.4           679.8         992.2              367.4
   70         1.15       212.5         244.4           679.8         924.2              435.4
   60         0.92       212.5         195.5           679.8         875.3              484.3
   50         0.72       212.5         153.0           679.8         832.8              526.8

Farmer Control

CI level    't' value     SD      Result of step 2     Mean     Result of step 3    Result of step 4
   90         2.01       320.8         644.8          1194.2        1839.0              549.4
   80         1.47       320.8         471.6          1194.2        1665.8              722.6
   70         1.15       320.8         368.9          1194.2        1563.1              825.3
   60         0.92       320.8         295.1          1194.2        1489.3              899.1
   50         0.72       320.8         231.0          1194.2        1425.2              963.2

[Figure III: II,D.4: Graph of Confidence Intervals from SRMU 1983 Corn Data. Y-axis: confidence level (50 to 90); X-axis: yield (0 to 1500).]

This graph is used to discuss the farmer's question, "What size of yield can I expect, or am I likely to get, next time I do this practice?"

We interpret these tables and diagrams for the farmers by saying that 9 times out of 10 they will get from the experimental treatment a corn yield between 253 and 1107 kg/ha, while at a riskier level, 7 times out of 10, they will get corn yields between 435 and 924 kg/ha. However, if they repeat their farmer control, then we find that 9 times out of 10 they will get between 549 and 1839 kg/ha.

You will notice from your diagram that at this level of likelihood (90%) there is a great deal of overlap, that is, farmers may get the same yield regardless of which practice they use. Of course, as risk increases the overlap decreases, which prompts the question of whether the mean yields are really different or whether the difference is just due to random chance. Unit III,A,0 explains how to test whether two means are really different or not.

There is an important assumption we make in asking this question: we assume that the only difference between the farmer control and the experimental treatment is our treatment.
This assumption does not hold when we have confounding effects. By confounded we mean linked. For example, suppose the seed we used for the experimental treatment had poor germination but the farmers' seed did not. Here, the experimental treatment was confounded, or linked, with poor germination, while the farmer control was not.

(2) Adding Confidence Intervals to Lines and Curves

Confidence intervals can also be calculated and used to incorporate risk in graphical analysis. In Figure III: II,D.5, 95% confidence intervals have been constructed around the DE curve. The original curve was generated by regression. Techniques for adding confidence intervals to lines and curves are discussed in Hildebrand and Poey.

[Figure III: II,D.5: 95% Confidence Interval around a maize response surface. Y-axis: millet yield (Kg/ha), with Y1 and Y2 marking the limits of the 95% confidence interval; X-axis: nitrogen application (Kg/ha), with level A marked.]

Remember that curve DE is only an estimate of yield response to a treatment. There is some risk that the actual yield a farmer would experience would not be on curve DE, since not all of the treatment yields fell on that line. For example, with Figure III: II,D.5 a farmer could be told that there is a 95% chance that yields will be between Y1 and Y2 Kg/ha if A kg/ha of nitrogen are applied.

b. Interpolating and Extrapolating from Lines and Curves

Figure III: II,D.6 demonstrates how interpolation and extrapolation work. The X's on the graph could represent the yields of various treatments from field trials or information obtained from a survey of farmers, while the curved line (D to E) represents the best estimate of yields which can be obtained from the data (how to obtain such lines will be discussed in the subunit on response surface analysis, III: III,B,3). Nitrogen level B also turns out to be a point on the line DE, which was developed from the data. What happens if less fertilizer is applied?
Without the graphical perspective, the next point which could be considered would be nitrogen level F. However, since line DE has been constructed, the impact of fertilizer level C, or of any point on curve DE for that matter, can be estimated. That is, at nitrogen level C, millet yield Y would be estimated. The estimation of yields between actual observations (F and B in this case) is referred to as interpolation (see Hildebrand and Poey). Interpolation is statistically valid, and can be given a probability of error based on regression and confidence intervals.

Extrapolation projects the impact of treatments beyond the range of the actual treatments. Suppose that a farmer can only use G amount of nitrogen. Rather than say that nothing can be told to this farmer about production with G kg of fertilizer, extrapolation can be used. The dashed line in Figure III: II,D.6 represents an extrapolation, allowing one to project the impact of treatment beyond the range of the actual treatments. Through extrapolation the farmer can be told that a yield of Y2 with G kg of fertilizer may occur.

While extrapolation is a powerful technique, it must be used with caution. Extrapolation is not statistically valid, but is based on qualitative judgement. That is, in order to extrapolate, it must be assumed that the same pattern of response to different levels of fertilizer (line DE) will continue outside of the actual range of treatments. How good or bad an assumption this is will depend on the specific situation. The only general guideline which can be offered is that the further the extrapolation is from the actual observations, the more likely it is that the prediction will be wrong, because the response becomes more likely to diverge from the behavior observed within the range of the observations.
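The difference between interpolating and extrapolating can be illustrated with a small Python sketch. It rests on several assumptions that are not part of the original method: straight-line interpolation between neighbouring treatment means is used as a simple stand-in for the regression curve DE, the data are the average maize yields at P2O5 = 0 from Table III: II,D.5 rather than millet yields, and the function name `interpolate_yield` is hypothetical. The function refuses to estimate outside the range of the actual treatments, since the text treats extrapolation as a matter of qualitative judgement.

```python
def interpolate_yield(points, n_level):
    """Estimate yield at n_level by straight-line interpolation between the
    two neighbouring observed (N level, yield) points.

    Raises ValueError outside the observed range: that would be
    extrapolation, which this sketch deliberately does not attempt.
    """
    pts = sorted(points)
    lowest, highest = pts[0][0], pts[-1][0]
    if not lowest <= n_level <= highest:
        raise ValueError(
            f"N = {n_level} is outside the observed range "
            f"{lowest} to {highest}; this would be extrapolation"
        )
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        if x0 <= n_level <= x1:
            return y0 + (y1 - y0) * (n_level - x0) / (x1 - x0)


# Average yields at P2O5 = 0 from Table III: II,D.5, as (N in Kg/ha, yield).
observed = [(0, 2.21), (50, 3.14), (100, 3.91), (150, 4.01)]

print(round(interpolate_yield(observed, 75), 3))   # between N = 50 and N = 100

try:
    interpolate_yield(observed, 200)               # beyond the last treatment
except ValueError as error:
    print(error)
```

A regression-based curve would also supply a probability of error for each interpolated value, which this straight-line shortcut does not.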
[Figure III: II,D.6: Millet Yield Response to Fertilizer. Y-axis: millet yield (Kg/ha), with Y1 marked; X-axis: nitrogen application (Kg/ha), with levels G, D, F, C, B, and A marked.]

UNIT III

ANALYSIS TECHNIQUES

Volume III Unit III (III, A, 0)

QUALITATIVE EFFECTS OF TWO TREATMENTS: PAIRED AND UNPAIRED T-TESTS

OUTLINE

1. Types of t-tests
2. Applying the paired t-test
3. Applying the unpaired t-test

PREREQUISITES

PARTICIPANT LEVEL

Agricultural research assistant
Extension technology verification assistant

LEARNING OBJECTIVES

After completing this section the participants will be able to:

1. Contrast situations when a paired t-test is appropriate, versus situations when an unpaired t-test is appropriate.
2. Set up tables for summarizing and drawing conclusions from a data set using paired and unpaired t-tests.
3. Perform calculations by hand and with microcomputer to obtain and use paired and unpaired t-test statistics.

KEY POINTS

1. A paired t-test can be used in 2 situations:
a. To compare 2 treatments, where the 2 treatments are present on each farm;
b. To compare before and after values of the same treatment on the same experimental units.

2. An unpaired t-test can be used to compare 2 treatments where the 2 treatments are not always present on each farm.

3. Calculations for unpaired t-tests differ depending on:
a. The number of observations in each treatment;
b. Whether the variances of each treatment are similar or not.

TERMS

farmer control
intervention
t-test
treatment
variance

DISCUSSION

1. TYPES OF T-TESTS

T-tests are used to determine if a qualitative effect of 2 treatments is real. There are 2 types of t-tests:

1. Paired t-tests;
2. Unpaired t-tests.

Situations when a paired t-test is appropriate are different from situations when an unpaired t-test is appropriate.

a. Paired t-tests

In validation testing, usually only 2 treatments are compared: the farmer control, and the intervention. The 2 treatments are present on every farm. This is a situation where a paired t-test is appropriate.
In livestock on-farm trials, each farm often only has a few animals. Each animal represents a large investment of money for the farm household. The animals frequently have important social value as well. Hence, the risk to the farm household of a treatment harming an animal is great. For these reasons, livestock on-farm trials may often only be able to use 1 animal per farm.

When only 1 animal per farm is available for trials with adult animals, it may be useful to compare performance before and after the treatment. For example, milk production might be measured for a month for 1 animal on each farm before introducing a new feed ration. Milk production would then be measured for another month while the new feed ration was given.

This is another situation where a paired t-test can be used. The paired t-test compares before and after values of the same treatment applied to the same experimental units (the same animals).

Of course, this comparison will not be valid if there are other major differences between the 2 months besides the change in feeding rations. For example, if the first month is the end of the hot, dry season, the animals may be stressed by heat and inadequate water. If the second month is the start of the rainy season, temperatures during the day will not be so high, due to cloud cover. Water will become freely available. The animals may appear to do better, but the main reason may be reduced stress due to the change in seasons, rather than the change in rations.

b. Unpaired t-tests

Another livestock trial might compare the 2 feeding rations at the same time. Farms with only 1 animal might be divided into 2 groups. The animals on farms in the first group would be given the new ration. The animals on farms in the second group would be given the farmer control ration. While the 2 groups would be chosen so that farms in both groups were more or less similar, it might not be possible to pair all the farms in each group one-by-one.
This would be a situation where an unpaired t-test would be appropriate.

Sometimes a team may identify independent farmer experimentation. For example, 5 farms may be growing only the standard variety of sweet potato, 5 farms only a new variety, and 8 farms both. The team might take yield samples from plots on all 18 farms. This would give the following data set:

Farm    Standard variety    New variety
  1          100
  2          110
  3           95
  4          112
  5           89
  6                            1-3
  7                            122
  8                             99
  9                            143
 10                            140
 11          102              [missing]
 12           97               114
 13          102               154
 14          113               135
 15          101               131
 16           78                87
 17           83                89
 18          100               125

The 2 varieties could be compared for farms 11 through 18 with a paired t-test. However, perhaps the team considers all 18 farms to be from the same domain. The team thus may want to compare the data for the 2 varieties from all the 18 farms. This is another situation where an unpaired t-test would be appropriate.

Thus, in both of the above examples, the unpaired t-test allows us to compare 2 treatments, where the 2 treatments are not always present on each farm.

2. APPLYING THE PAIRED T-TEST

a. Setting up a table

To apply the paired t-test, we first set up a summary table. The summary table will show the following statistics:

Statistic                               Notation
1. Difference between each value        $X_1 - X_2 = d$
2. Sum of differences                   $\sum (X_1 - X_2) = \sum d$
3. Sum of squares of differences        $\sum (X_1 - X_2)^2 = \sum d^2$
4. Means of each treatment              $\bar{X}_1$, $\bar{X}_2$
5. Mean difference                      $\bar{d}$

An example table is:

Farm              Farmer variety (X1)    New variety (X2)    Difference (d)
1                         90                    75                 15
2                        110                   115                 -5
3                         85                    90                 -5
4                        120                   105                 15
5                        150                   130                 20
6                        140                   115                 25
Sum (of X, d)
Sum of d^2
Mean (of X, d)

b. Completing the table

Next, we have to calculate the values of the statistics. How to do this depends on whether we have:

1. A simple hand calculator with only arithmetic functions (+, -, ×, ÷) and a memory; or
2. A hand calculator with statistical functions.
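Where a computer is available, the same summary statistics can also be obtained with a short script. The following Python sketch fills in the blank rows of the example table above; the variable names are illustrative only, and the sketch stops at the table statistics rather than going on to the t value itself.

```python
# Example yields for the 6 farms in the paired table.
farmer_variety = [90, 110, 85, 120, 150, 140]   # X1
new_variety = [75, 115, 90, 105, 130, 115]      # X2

n = len(farmer_variety)

# 1. Difference between each pair of values: d = X1 - X2.
differences = [x1 - x2 for x1, x2 in zip(farmer_variety, new_variety)]

# 2. Sum of differences and 3. sum of squares of differences.
sum_d = sum(differences)
sum_d2 = sum(d * d for d in differences)

# 4. Means of each treatment and 5. mean difference.
mean_x1 = sum(farmer_variety) / n
mean_x2 = sum(new_variety) / n
mean_d = sum_d / n

print("d values:", differences)
print("Sum of d:", sum_d, " Sum of d^2:", sum_d2)
print(f"Mean X1 = {mean_x1:.1f}, mean X2 = {mean_x2:.1f}, mean d = {mean_d:.1f}")
```

Note that the mean difference equals the difference between the two treatment means, which gives a quick check on the arithmetic.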
Case 1: Simple hand calculator

Here we have to calculate $\sum d^2$ first by using the memory (M+ key), before calculating $\sum d$ and $\bar{d}$:

Step    Find          Keys                        Result
1       Sum of d^2    15 × 15 = 225   M+
                      -5 × -5 = 25    M+
                      ...
                      25 × 25 = 625   M+
                      RM                          1525
2       Sum of d      15 - 5 - ... + 25 =         65
        Mean d        65 ÷ 6 =                    11

$\bar{X}_1$ and $\bar{X}_2$ can be found like $\bar{d}$.