
Assessment of Adolescent Alcohol Use


Material Information

Title:
Assessment of Adolescent Alcohol Use Estimating and Adjusting for Measurement Bias
Physical Description:
1 online resource (98 p.)
Language:
english
Creator:
Livingston, Melvin D, III
Publisher:
University of Florida
Place of Publication:
Gainesville, Fla.
Publication Date:

Thesis/Dissertation Information

Degree:
Doctorate ( Ph.D.)
Degree Grantor:
University of Florida
Degree Disciplines:
Epidemiology
Committee Chair:
Komro, Kelli Ann
Committee Members:
Wagenaar, Alexander C
Xu, Xiaohui
Muller, Keith E

Subjects

Subjects / Keywords:
alcohol -- misclassification -- mode -- recall
Epidemiology -- Dissertations, Academic -- UF
Genre:
Epidemiology thesis, Ph.D.
bibliography   ( marcgt )
theses   ( marcgt )
government publication (state, provincial, territorial, dependent)   ( marcgt )
born-digital   ( sobekcm )
Electronic Thesis or Dissertation

Notes

Abstract:
There is an extensive literature on the deleterious effects of early onset of alcohol use. Preventing and reducing early onset of alcohol use remains a public health priority. Central to epidemiological surveillance, rigorous etiological research, and experimental evaluation of prevention strategies is the valid measurement of alcohol use among adolescents. Although there is a history of literature regarding the reliability and validity of self-report measures of alcohol use among adolescents, there has rarely been rigorous testing of the potential biasing effects of misclassification. The purpose of this study is to evaluate the extent of misclassification due to survey modality and recall bias, as well as available methods for bias correction. These results will then be used to develop a hierarchy of best available approaches to account for bias due to misclassification. While generally applicable, the resulting methods will be discussed in the context of adolescent alcohol use. Methods: For all studies, I used an urban longitudinal sample of adolescents followed from 6th to 12th grade (ages 12-18) derived from the Project Northland Chicago trial. In the first study, generalized estimating equations were used to assess the effect of survey modality at 12th grade on self-reported substance use and associated behaviors. All estimates controlled for 8th grade values of the outcome variables in order to control potential selection effects. In the second study, logistic regression was used to assess predictors of recall bias for age of first alcohol use. An indicator variable for recall bias was constructed by comparing prospective and retrospective measures of age of alcohol use onset. In the third study, I used simulations to test methods that account for misclassification, incorporating information from a validity sub-study. These results were used to make recommendations for analyzing data with misclassified predictors.
General Note:
In the series University of Florida Digital Collections.
General Note:
Includes vita.
Bibliography:
Includes bibliographical references.
Source of Description:
Description based on online resource; title from PDF title page.
Source of Description:
This bibliographic record is available under the Creative Commons CC0 public domain dedication. The University of Florida Libraries, as creator of this bibliographic record, has waived all rights to it worldwide under copyright law, including all related and neighboring rights, to the extent allowed by law.
Statement of Responsibility:
by Melvin D Livingston.
Thesis:
Thesis (Ph.D.)--University of Florida, 2013.
Local:
Adviser: Komro, Kelli Ann.
Electronic Access:
RESTRICTED TO UF STUDENTS, STAFF, FACULTY, AND ON-CAMPUS USE UNTIL 2015-08-31

Record Information

Source Institution:
UFRGP
Rights Management:
Applicable rights reserved.
Classification:
lcc - LD1780 2013
System ID:
UFE0045849:00001





Full Text

PAGE 1

ASSESSMENT OF ADOLESCENT ALCOHOL USE: ESTIMATING AND ADJUSTING FOR MEASUREMENT BIAS By MELVIN LIVINGSTON A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY UNIVERSITY OF FLORIDA 2013

PAGE 2

2013 Melvin Livingston

PAGE 3

To my wife and daughter

PAGE 4

ACKNOWLEDGMENTS

Data for this study were obtained from grants from the National Institute on Alcohol Abuse and Alcoholism and the National Center on Minority Health and Health Disparities (R01 AA013458; R01 AA016549), awarded to Dr. Kelli A. Komro. I thank Karen Alfano, MBA, for survey design and management of data collection, Kian Farbakhsh, M.S., for database design and management, and Cheryl Perry, Ph.D., for her overall contributions to the PNC study. I also thank Amy Tobler, Ph.D., for her management of the PNC Etiology study. I also gratefully acknowledge the participation of students, parents, and community leaders in the Project Northland Chicago trial and follow-up.

I thank my dissertation committee, Drs. Kelli Komro, Alex Wagenaar, Keith Muller, and Xiaohui Xu, for their help and encouragement through the doctoral process. They have all individually contributed to my success. I am a better scientist and writer for having worked with them.

I am grateful to Drs. Kelli Komro and Alex Wagenaar for hiring me into the position of statistical coordinator. This experience has allowed me to greatly improve both my statistical abilities and communication skills. They gave me the freedom and the encouragement to explore different methods and analyses. Working with researchers that are at the top of their fields has allowed me to be involved in cutting-edge research that is shaping the future of prevention efforts. I greatly appreciate the opportunity to have worked with such a talented group of researchers in the Department of Health Outcomes & Policy and look forward to continued collaboration. I would also like to thank Drs. John Kairalla, Stephanie Staras, and Mildred Maldonado-Molina for their timely encouragement and advice throughout the dissertation process.

PAGE 5

I thank my parents for their support throughout my education. They have always provided an environment that allowed me to pursue my intellectual abilities. I thank my brother for always being there for me throughout our lives. I thank my in-laws for pushing me to do what was best even when it was more difficult. Over the past decade, their advice has greatly shifted the direction of my life. I thank my brother-in-law for being awake at every hour of the night so that I always had company while writing.

I would like to thank my wife, Bethany, and our daughter, Ellie. Bethany has been my greatest source of support. Whenever I needed it, she would stay awake to keep me focused (even when that meant lying on a hard office floor while five months pregnant). She was always willing to wake up in the middle of the night and listen if I stumbled in with yet another eureka moment. She has kept me grounded and sane throughout our life together. Without her this dissertation would not have been possible. I am thankful for what a wonderful wife and mother she continues to be. Finally, I want to thank Ellie for always having a smile to greet me at the end of each long and stressful day. I cannot wait to see the person you grow up to be.

PAGE 6

TABLE OF CONTENTS

page

ACKNOWLEDGMENTS .......... 4
LIST OF TABLES .......... 8
LIST OF FIGURES .......... 9
ABSTRACT .......... 10

CHAPTER

1 INTRODUCTION .......... 12
    Measurement Error .......... 15
    Recall Bias .......... 16
    Survey Modality .......... 16
    Measurement Error of Adolescent Alcohol Use .......... 17
    Reliability of Adolescent Alcohol Measure .......... 19
    Validity of Adolescent Alcohol Measure .......... 20
    Survey Modality .......... 22
        Web Versus Paper .......... 22
        School Administered Versus Home Administered .......... 23
    Recall Bias .......... 23
    Next Steps in the Measurement of Adolescent Alcohol Use .......... 24
    Existing Methods to Correct for Misclassification Bias .......... 25
    Method Requiring No External Information .......... 26
        Sensitivity Analysis .......... 26
    Method Requiring External Information .......... 27
        Probabilistic Sensitivity Analysis .......... 27
    Methods Requiring Validation Sub Studies .......... 28
        Regression Calibration Methods .......... 28
        Instrumental Variable Binary Errors in Variables Methods .......... 29
        Multiple Imputation Methods .......... 30
    Research Aims .......... 31
        Aim 1 .......... 31
        Aim 2 .......... 32
        Aim 3 .......... 32

2 PROJECT NORTHLAND CHICAGO STUDY DESIGN .......... 34

3 THE EFFECTS OF SURVEY MODALITY ON ADOLESCENTS' RESPONSES TO ALCOHOL USE ITEMS .......... 36
    Literature Review .......... 36
    Methods .......... 38

PAGE 7

        Study Design .......... 38
        Missing Data .......... 40
        Measures .......... 40
        Statistical Analysis .......... 43
    Results .......... 45
    Discussion .......... 46

4 PREDICTORS OF RECALL ERROR IN SELF-REPORT OF AGE OF ALCOHOL USE ONSET .......... 50
    Literature Review .......... 50
    Methods .......... 52
        Study Design .......... 52
        Measures .......... 54
        Statistical Analysis .......... 57
    Results .......... 59
        Missing Data .......... 59
        Model Results .......... 60
    Discussion .......... 61

5 COMPARING METHODS OF MISCLASSIFICATION CORRECTION IN STUDIES OF ADOLESCENT ALCOHOL USE .......... 68
    Literature Review .......... 68
    Methods .......... 69
        Summary of Available Methods .......... 69
        Simulation Design .......... 72
    Results .......... 74
        Regression Calibration .......... 74
        MIME .......... 75
        PSA .......... 76
    Discussion .......... 76

6 CONCLUSIONS .......... 82
    Accomplishments of the Dissertation .......... 82
    Limitations of the Dissertation .......... 84
    Contributions and Recommendations .......... 85
    Future Research .......... 87

REFERENCES .......... 88

BIOGRAPHICAL SKETCH .......... 98

PAGE 8

LIST OF TABLES

Table                                                                      page

3-1 The effects of survey modality on self-reported alcohol use .......... 49
3-2 The effects of survey modality on self-reported alcohol use among adolescents reporting lifetime alcohol use .......... 49
4-1 Cross-tabulation of prospective and retrospective measures of age at first alcohol use .......... 65
4-2 Final added-in-order model results predicting disagreement between the retrospective and prospective age at first alcohol use .......... 66
4-3 Predictors of the retrospective age at first alcohol measure indicating a later age at first alcohol use .......... 66
5-1 Simulation results comparing methods of misclassification correction .......... 80

PAGE 9

LIST OF FIGURES

Figure                                                                     page

4-1 Hypothesized covariate chronology .......... 67

PAGE 10

Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

ASSESSMENT OF ADOLESCENT ALCOHOL USE: ESTIMATING AND ADJUSTING FOR MEASUREMENT BIAS

By

Melvin Livingston

August 2013

Chair: Kelli A. Komro
Major: Epidemiology

Objective: There is an extensive literature on the deleterious effects of early onset of alcohol use. Preventing and reducing early onset of alcohol use remains a public health priority. Central to epidemiological surveillance, rigorous etiological research, and experimental evaluation of prevention strategies is the valid measurement of alcohol use among adolescents. Although there is a history of literature regarding the reliability and validity of self-report measures of alcohol use among adolescents, there has rarely been rigorous testing of the potential biasing effects of misclassification. The purpose of this study is to evaluate the extent of misclassification due to survey modality and recall bias, as well as available methods for bias correction. These results will then be used to develop a hierarchy of best available approaches to account for bias due to misclassification. While generally applicable, the resulting methods will be discussed in the context of adolescent alcohol use.

Methods: For all studies, I used an urban longitudinal sample of adolescents followed from 6th to 12th grade (ages 12-18) derived from the Project Northland Chicago trial. In the first study, generalized estimating equations were used to assess

PAGE 11

the effect of survey modality at 12th grade on self-reported substance use and associated behaviors. All estimates controlled for 8th grade values of the outcome variables in order to control potential selection effects. In the second study, logistic regression was used to assess predictors of recall bias for age of first alcohol use. An indicator variable for recall bias was constructed by comparing prospective and retrospective measures of age of alcohol use onset. In the third study, I used simulations to test methods that account for misclassification, incorporating information from a validity sub-study. These results were used to make recommendations for analyzing data with misclassified predictors.
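The recall-bias indicator described for the second study can be sketched in a few lines. This is a minimal illustration only, not code from the dissertation; the ages below are invented for the example:

```python
# Sketch of the second study's indicator construction: compare each
# adolescent's prospectively measured age of first alcohol use with the
# age reported retrospectively at a later wave. Illustrative data only.
prospective = [12, 13, 14, 15, 13]
retrospective = [12, 15, 14, 13, 14]

# 1 = the two reports disagree (possible recall error), 0 = they agree
recall_bias = [int(p != r) for p, r in zip(prospective, retrospective)]
print(recall_bias)  # [0, 1, 0, 1, 1]
```

In the dissertation, an indicator constructed this way is then modeled with logistic regression against candidate predictors of recall error.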

PAGE 12

CHAPTER 1
INTRODUCTION

Alcohol use is widespread among American adolescents. While rates of early alcohol initiation have been declining since the early 1990s, rates still remain high. According to 2010 reports from Monitoring the Future (MTF), over 35% of adolescents have initiated alcohol use by 8th grade, over 58% by 10th grade, and 71% by 12th grade; while 16% of adolescents report having been drunk by 8th grade, 37% by 10th grade, and 54% by 12th grade (1). Not only are these rates high, but there is also extensive literature on the harmful effects of alcohol use during adolescence. The literature suggests that early initiation of alcohol use is associated with the initiation of other high-risk behaviors (violence, substance abuse, academic problems, employment problems, and risky sex) (2-5) and consequent health problems (emotional problems, sexually transmitted infections, and injury) (2, 3, 6-8). As such, adolescent alcohol use is commonly considered to be a serious public health problem.

Many different fields of research are necessary to understand, reduce, and prevent any large-scale public health problem, including adolescent alcohol use. Epidemiology can be viewed as a part of this wider group of study known as prevention science (9). Epidemiological and prevention research is of great importance to the development of effective policies for reducing and preventing public health problems. Brownson et al. describe a four-step process for the development of successful policies (10). First, health risks are identified. Next, appropriate interventions are developed and evaluated. Next, evidence from these evaluations is used to develop policies. Finally, these policies are enacted. A key underpinning of this framework is the constant evaluation of evidence at each stage in the process (10, 11). The identification of health

PAGE 13

risks described in the first stage of the Brownson model is classic epidemiology, as this is the primary function of surveillance and observational research. Evaluation continues during the second stage as ineffective interventions are discarded in favor of programs backed by strong evidence. As these interventions are turned into wider policies, continued evaluation is important to check generalizability and continued effectiveness as they go to scale. The cumulative effect of these continued evaluations contributes much to the overall public health knowledge base, improving with each iteration of the policy development cycle (12).

To design the strongest policies and interventions, it is necessary that the underlying evaluations be based on best practices during each stage of development. When this standard is not met, the risk of poor policy decisions increases. The classic example of this is the case of hormone replacement therapy. For years, observational studies provided evidence for the postmenopausal use of hormone replacement therapy as a way for women to minimize bone loss leading to fractures, heart disease, colorectal cancer, and breast cancer. However, evidence published from the Women's Health Initiative (WHI) using randomized control trials found very different results. The WHI found an increase in the rate of breast cancer and cardiovascular events, rather than the expected decrease (13). While this is one of the more dramatic reversals in medical policy, it is not a unique instance. A recent review in the Archives of Internal Medicine found sixteen reversals in evidence-based recommendations in 2009 alone (14). Examples of poor policy decisions based on incomplete evaluation are not found solely in medical research. The Drug Abuse Resistance Education (DARE) project became the most widely adopted program for drug prevention among school-aged children. However,

PAGE 14

meta-analyses over the course of almost two decades of research have continually found it to be ineffective (15). DARE provides not only an example of a reversal in the scientific community, but also the dangers of research findings not being adequately translated into public policy decisions. As of 2004, DARE still received the largest proportion of federal funding compared to any other school-based substance abuse prevention program, despite the fact that the first meta-analysis demonstrating DARE's ineffectiveness was published in 1994 (16).

Other examples of weak research being applied to policy show the dangers of misclassification. A recent policy reversal by the U.S. Preventive Services Task Force (USPSTF) on the recommended age to begin screening for breast cancer is a prime example. Recommendations from the 2002 USPSTF report on breast cancer screening promote the use of screening mammography in women aged 40 and up every one to two years (17). Further research demonstrated that mammographies in women aged 40-49 had higher false positive rates than mammographies in older women. False positive screens resulted in psychological harm and unnecessary treatments for subclinical tumors. In 2009, the USPSTF deemed that the net mortality benefit of screening in women aged 40-49 was not high enough relative to the harms associated with false positive screenings (18). Seemingly more mundane cases of misclassification can also have policy implications. A 2005 study found that race/ethnicity was misclassified in the Surveillance, Epidemiology, and End Results (SEER) registry when compared to self-reported race/ethnicity in 11% of those sampled. More troubling was that the sensitivity of the race/ethnicity measure was found to vary by racial/ethnic group (Non-Hispanic White: 0.96, Black: 0.89, American Indian: 0.06, Hispanic White: 0.69) (19). This is

PAGE 15

particularly troubling considering that SEER data are relied on heavily to plan and evaluate policies for the reduction of racial/ethnic disparities in cancer (20). To adequately provide information for the formulation of sound interventions and policies, special attention must be paid to measurement issues (9). This is of particular importance in fields that rely on self-report measures for evaluation. One such field, and the primary focus of this study, is adolescent alcohol use. While the literature on the prevalence and consequences of adolescent alcohol use is extensive, it commonly relies on retrospective self-report measures. This leaves a bulk of the evidence at risk of bias due to the potential of measurement error.

Measurement Error

Forms of measurement vary across research fields. While research in the biological and physical sciences deals most often with measurement error caused by improper instrument calibration and poor specimen handling, research into adolescent alcohol use is primarily survey driven, with its own unique sets of difficulties. When relying on survey instruments, measurement error can be introduced at any of the research phases: instrument development, protocol development, protocol execution, instrument implementation, data capture, and analysis (21). During instrument development, measurement error can be introduced by poor wording of survey items that leads to a misunderstanding of the question. During protocol development, measurement error can be introduced by not preparing survey administrators to handle both normal and unexpected circumstances in a uniform way. During protocol execution, measurement error can be introduced when survey administrators do not adhere to well-prepared protocols. During instrument implementation, measurement error can be introduced by individual subject characteristics such as subject inability to

PAGE 16

accurately recall the desired information. During data capture, measurement error can be introduced by inaccurate data recording. During analysis, measurement error can be introduced through a variety of mechanisms including incorrect variable recoding, unnecessary categorization, and improper matching. Any of these causes of measurement error can lead to misclassification when dealing with categorical data, and these are just a few examples of the ways error can be introduced at each phase. However, the primary means by which misclassification is introduced is through participant recall and effects of the method of interview (22).

Recall Bias

All areas of research that rely on retrospective memory are subject to recall bias. Many factors have been shown to affect the accuracy of a participant's retrospective memory. These factors include length of time since the event, perceived importance of the event, method of questioning, use of an event calendar, cognitive ability, distress at recall, and personality traits. While decades of research have been done on what affects recall and ways to minimize potential recall biases at data collection, there is no systematic rule for how to best deal with recall bias at the analysis phase.

Survey Modality

Results from the literature on the presence of survey modality effects have been mixed. Many studies have found no evidence of response bias due to survey modality (23-31). One study found a significant difference in perceived anonymity and privacy due to modality, but no overall effect on response bias (32). Other studies have found significant mode effects in a wide variety of settings (33-38). Of particular interest are results from an analysis of the YRBS comparing paper-and-pencil surveys to multiple web survey formats. Denniston et al. found significantly (p<0.01) lower amounts of missing data in

PAGE 17

the paper-and-pencil format (1.5%) compared to in-class web without programmed skip patterns (5.3%), in-class web with programmed skip patterns (4.4%), and "on your own" web without programmed skip patterns (6.4%). Denniston et al. also found greater agreement (p<0.001) with the statement "I am confident that the answers I gave in this survey will never be linked with my name" among paper-and-pencil responders (75.4%) compared to in-class web without programmed skip patterns (66.4%), in-class web with programmed skip patterns (68.5%), and "on your own" web without programmed skip patterns (74.1%) (35). As with recall bias, there does not seem to be a consistent rule for the best way to account for misclassification due to survey modality in the analysis phase.

Measurement Error of Adolescent Alcohol Use

Measurement error of self-report alcohol measures can be discussed in two broad frameworks, the cognitive framework and the situational framework (39). Under the cognitive framework, four different processes are believed to underlie the question and answer process: comprehension, information retrieval, decision-making, and response generation. Error can be introduced in any of these four phases. These errors in turn lead to measurement errors and validity problems in questionnaires. Under the situational framework, the focus is on factors external to the participant. Of primary concern in the situational framework is the presence of others while a participant answers questions, the perceived degree of anonymity and confidentiality of survey responses, and the perceived social desirability of survey responses.

Cognitive errors in the self-report of alcohol use are believed to occur primarily in the comprehension and retrieval phases of the question and answer process. Comprehension errors are believed to occur primarily in the comprehension of the

reference time periods common to self-report measures. This can be seen in the increased reliability of ever-use measures compared to frequency-of-use measures for a given time period in longitudinal studies40, 41. Issues in the information retrieval phase are also of great concern for alcohol research. As many alcohol self-report items request the frequency of use within a specific timeframe, proper retrieval involves both recalling the event and placing it in the proper time. The cognitive burden of this retrieval process increases as the length of the recall timeframe increases42. Assuming that higher self-report rates are more likely to be accurate than lower self-report rates for alcohol use, an example of these errors can be seen in that monthly reported rates multiplied by twelve have been found to significantly exceed reported yearly rates42. Problems with information retrieval may be compounded for alcohol use, since events occurring during alcohol use may be more difficult to recall. In the broader epidemiological literature, the results of these cognitive errors are referred to as recall bias and are introduced during the instrument implementation phase. The potential effects of recall bias can be minimized in the instrument and protocol development phases. Creating questions with shorter reference time periods and using protocol aids, such as having a calendar present, have been shown to reduce recall bias43.

Situational factors affecting the validity and reliability of adolescent alcohol use self-report are thought to be related primarily to the illegality of adolescent alcohol use39. Social desirability bias and fear of reprisal are the driving situational factors at play during the measurement of self-reported adolescent alcohol use. Perceived anonymity, confidentiality, and privacy can be viewed as moderators of the relationship between these situational factors and the validity of self-report responses44. Participants with

higher perceived anonymity, confidentiality, and privacy are less likely to fear reprisal or succumb to social desirability bias, and are thus more likely to provide valid responses44. Decisions made during the protocol development phase regarding the mode by which the survey is administered can therefore affect the perceptions of participants.

Reliability of Adolescent Alcohol Measures

The reliability of alcohol self-report measures has been thoroughly studied. In some evaluations, the reliability of self-reported alcohol use items is improved by combining multiple items into a single continuously measured scale. Williams et al. developed scales for alcohol use tendencies in young adolescents with high internal consistency (Cronbach's alphas: male=0.89, female=0.80) and test-retest reliability (reliability coefficients: male=0.90, female=0.73)45. Similarly, Komro et al. developed an alcohol use scale for use in the evaluation of the DARE Plus program with high internal consistency (Cronbach's alpha=0.83) and test-retest reliability (reliability coefficient=0.88)46. While the use of these scales can increase the overall reliability of self-reported alcohol measures, many studies are limited in the number of alcohol response items, making scales less feasible for some researchers. As a result, it is important to be able to make the best use of single-item self-report measures, and they will be the primary focus of this study. Individual items regarding the frequency of alcohol use have been found to be reliable in terms of both logical consistency47, 48 and test-retest reliability40, 41, 47, 49, 50.

To examine the test-retest reliability of self-report alcohol use items, results from Brener et al. are further reported. In Brener et al., the test-retest reliability of the alcohol use items on the Youth Risk Behavior Survey (YRBS) was examined. A nationally representative sample (n=4,619) of respondents in grades 9-12 from the YRBS

was retested two weeks after the initial survey; kappa values were calculated to measure test-retest reliability. Of the alcohol use items, reported kappa values were highest for ever used alcohol (kappa=81.9). The consistency of other self-report measures, while not as high, was still within acceptable limits: age first drank alcohol <13 years (kappa=65.9), drank alcohol on at least 1 day during the past 30 days (kappa=70.9), and had 5 or more drinks in a row on at least 1 day during the past 30 days (kappa=67.6).

Research into the reliability of age at first alcohol use is less consistent, as these measures commonly rely on retrospective data. Johnson et al. found an absolute mean difference in self-reported age at first alcohol use between measurement occasions of 2 years; these authors suggest that self-reported age at first drink is reliable enough for most epidemiological studies, except when age at first alcohol use is a study's prime concern51. Parra et al. report moderate reliability of self-reported age at first onset, while finding significant trends indicating that participants reported later ages at first drink as they aged52. Engels et al. recommend caution in the use of self-reported age at first drink after finding very low levels of consistent reporting of age at first alcohol use in their high-school-aged sample53. Comparing the self-reported age of onset at the first and third waves, Engels et al. found that 4.6% of respondents reported a consistent age at first alcohol use, 6.1% reported an earlier age at first alcohol use at wave 3, and 89.0% reported a later age at first alcohol use at wave 3.

Validity of Adolescent Alcohol Measures

While reliability studies provide some evidence about the potential validity of alcohol self-report measures, actually establishing validity for self-reported alcohol has

proven more difficult. This is primarily due to the lack of a "gold standard" measure against which to compare self-report assessments of alcohol use. Biological measures of substance use are commonly viewed as the gold standard, as they are seen as more objective than self-report measures54. However, unlike for illicit substances, biological markers for alcohol use have significant limitations. A recent review of alcohol biomarkers found that existing biomarkers target the presence of either chronic heavy drinking or any drinking within a limited time frame55. Blood samples are capable of detecting chronic drinking up to three weeks after drinking has ceased by examining levels of gamma-glutamyl transpeptidase (GGT), carbohydrate-deficient transferrin (CDT), aspartate aminotransferase (AST), or alanine aminotransferase (ALT). Mean corpuscular volume can be used to assess chronic heavy drinking up to several months after drinking cessation. Samples of saliva or urine can be examined for the presence of ethyl sulfate (EtS) or ethyl glucuronide (EtG) to detect any drinking within several days. EtG levels in hair can be used to assess any drinking up to several months after a single episode; it is unknown how many drinks are necessary for reliable detection from hair samples55.

The validity of self-reported alcohol use can be tested with these biomarkers in the limited cases described. A study of 238 males in outpatient treatment for alcohol dependence found relatively low sensitivity (0.4) and high specificity (0.96) of self-reported alcohol relapse using measured GGT as the gold standard56. In twice-weekly testing of outpatients in an alcohol dependence treatment program, self-reported alcohol use was compared to urinary EtG levels; self-report measures were found to have moderate sensitivity (0.67) and high specificity (0.94)57.

While useful in limited contexts, the literature testing the validity of self-reported alcohol use with biomarkers is difficult to apply to adolescent alcohol research, which depends heavily on survey items that assess the frequency of drinking episodes and the amount drunk per episode. No current biomarkers are capable of assessing the validity of these types of variables55. In light of the difficulty of establishing the validity of common self-report alcohol measures, it seems of great importance to incorporate potential measurement uncertainty into analytical methods applied to alcohol research.

Survey Modality

Web Versus Paper

Results on modality effects of web versus paper surveys specific to alcohol questionnaires have been mixed. Some published studies have found little evidence to support the presence of modality effects for web-based surveying compared to paper-based surveying of self-reported alcohol use58-61. Others have found that the interaction of survey mode with questions of high sensitivity can lead to significant survey mode effects62, 63. This may explain the inconsistency in the literature on studies reporting significant modality effects in self-reported alcohol use. Studies that found little evidence of modality effects for self-reported alcohol use occur primarily in adult or college-aged samples, while studies finding significant modality effects62-68 occur in samples drawn from adolescent populations. Since adolescents face greater social pressure not to drink compared to adults, the difference in perceived anonymity of different survey modes may be of greater importance in their responses. It is worth noting that while some of these studies provide strong methodological rigor by randomizing participants to survey mode62-65, few tests of the effects of modality on test-retest reliability have been made. Miller et al. examined test-retest reliability by survey mode in a sample of

college-aged participants, concluding that there was no difference in reliability coefficients by mode61. Additional research is needed to test consistency across modes over time in an adolescent sample.

School Administered Versus Home Administered

Studies comparing school versus home administration of surveys have consistently found higher reported rates of alcohol and drug use in school-administered surveys66-68. Kann et al. found significant differences in reported prevalence rates of problem behaviors between the school-based YRBS and the home-based National Health Interview Survey, including measures of alcohol use. Kann et al. found significant differences in the reported prevalence of the following alcohol measures: lifetime alcohol use (YRBS=79.8%, NHIS=68.5%, t=6.56, p<0.001), episodic heavy drinking (YRBS=28.2%, NHIS=22.5%, t=3.21, p<0.001), and drank alcohol before age 13 (YRBS=33.8%, NHIS=19.7%, t=11.56, p<0.001); significant differences were not found for the measure of self-reported current alcohol use67. The magnitude of Kann et al.'s findings was largest for questions regarding illegal or stigmatized behavior, leading the authors to conclude that the higher rates reported via an anonymous paper-and-pencil survey in school were less affected by social desirability bias than those from a home-based paper-and-pencil survey. This logic is used consistently across other cited articles, but no examinations of test-retest reliability or validity (via biochemical validation) by administration location have been reported. The resulting differences between modes may be attributed to the lack of anonymity in household-based surveys.

Recall Bias

Recall bias is a commonly described source of misclassification in epidemiological studies21, 22, 69. In the study of self-reported alcohol use, recall bias has been

shown to be a significant source of bias when retrospective measures are used52, 70-74. Collins et al. found that retrospective measures tended to underreport the extent of past drinking, and that levels of current drinking were predictive of the extent of recall bias in high school students71. Studies have found the presence of recall bias across a wide range of reference time periods. For example, Simpura et al. found that 18-year recall of alcohol use in adult men tended to overestimate levels of past drinking, while the original level of drinking was predictive of the measurement error70. At the other extreme, Gmel et al. found significant biasing effects in seven-day alcohol use recall in emergency room settings74. The extent of recall bias in self-reported alcohol use can also be affected by the perceived harm the alcohol use may cause. A study of the recall of pregnant women found that women who experienced adverse pregnancy outcomes tended to underreport alcohol use when measured retrospectively72. A longitudinal study of college freshmen reported recall bias to be a factor in the measurement of self-reported age of alcohol use onset52. A cohort (n=410) of college freshmen was followed annually for 11 years. A significant linear trend was observed for the mean self-reported age of alcohol use onset by study year (F=84.75, p<0.001), indicating that increasing the length of the recall window also increases the recalled age of first alcohol use.

Next Steps in the Measurement of Adolescent Alcohol Use

The literature has shown that misclassification in the measurement of adolescent alcohol use is a consistent source of bias75-77. Although other forms of data collection are being advanced in the measurement of alcohol use, adolescent alcohol research remains primarily driven by self-report surveys.
While self-report measures of adolescent alcohol use have been found to be reliable, their validity can still be affected by misclassification due to factors such as recall and survey modality78, 79. Despite

these validity concerns, self-report measures are still the only practical means of obtaining information on the frequency, amount, and age of onset of adolescent alcohol use. Given these limitations, it is critical to examine analytical methods for adjusting estimates based on these measures.

Existing Methods to Correct for Misclassification Bias

There is an often-quoted assumption that if the misclassification can be determined to be non-differential, then any bias present will be towards the null69, 80, 81. Based on this assumption, it is often argued that whether bias is present is irrelevant, since if it is, the results are merely more conservative than they would otherwise be. This assumption is problematic in many practical cases, because several other assumptions must also hold. First, misclassification errors must be assumed independent of errors in other variables. Violation of this assumption is common. For example, when analyzing the effects of early drunkenness, a researcher may wish to control for the effects of early alcohol initiation. In this case, the assumption of independent errors is clearly violated: a participant who is likely to be misclassified on age at first drunkenness, perhaps due to recall bias, is also more likely to be misclassified on age at first drink. Second, the assumption of bias towards the null can be shown to be violated when the misclassified variable consists of more than two categories. Third, the assumption does not account for other types of systematic error (confounding, selection bias, social desirability bias, etc.) and how they might interact with misclassification error. Finally, random error alone could be enough to produce a point estimate that is in fact an overestimate of the true point estimate, even when all assumptions necessary for bias towards the null are met69, 82.
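The attenuation claim can be checked numerically. The following sketch uses hypothetical prevalences and hypothetical sensitivity/specificity values (not PNC estimates) to show the expected odds ratio under nondifferential misclassification of a binary exposure.

```python
# Expected effect of nondifferential misclassification on an odds ratio.
# All input values are hypothetical illustrations, not estimates from PNC.

def classified_exposed(p_exposed, sens, spec):
    """P(classified exposed) = sens*P(exposed) + (1-spec)*P(unexposed)."""
    return sens * p_exposed + (1.0 - spec) * (1.0 - p_exposed)

def odds(p):
    return p / (1.0 - p)

# Hypothetical true exposure prevalence among cases and controls
p_case, p_ctrl = 0.40, 0.20
true_or = odds(p_case) / odds(p_ctrl)

# Same (nondifferential) sensitivity and specificity applied to both groups
sens, spec = 0.70, 0.95
obs_or = (odds(classified_exposed(p_case, sens, spec)) /
          odds(classified_exposed(p_ctrl, sens, spec)))

print(f"true OR = {true_or:.2f}, expected observed OR = {obs_or:.2f}")
```

With these inputs the expected observed odds ratio (about 2.05) is attenuated relative to the true odds ratio (about 2.67). As noted above, no such guarantee holds once errors are dependent, the exposure has more than two categories, or random error is accounted for.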

Methods have been developed to assess or account for the potential bias due to misclassification. Methods for correcting measurement error of continuous exposure variables have also been extensively studied, but are not discussed here. Methods to deal with misclassification of categorical variables have varied greatly in their sophistication and underlying assumptions, including probabilistic sensitivity analyses83, regression calibration84-87, maximum likelihood88-90, a variety of non-parametric methods91, 92, Bayesian methods93-95, and multiple imputation methods96. Unfortunately, it is uncommon to see these methods implemented in epidemiological practice. For some methods this may be due to the complexity of implementation. However, the following methods are easily implemented via existing software: standard sensitivity analysis, probabilistic sensitivity analysis, regression calibration, and multiple imputation for measurement error (MIME). As a result of this accessibility, these methods are discussed in greater detail. Methods based on instrumental variables techniques are also discussed as they relate to regression calibration methods. These methods can be broken into three broad categories: methods requiring no information external to the study at hand, methods requiring some form of external information, and methods requiring validation samples as part of the study at hand.

Methods Requiring No External Information

Sensitivity Analysis

The standard sensitivity analysis is one of the more common approaches to testing the potential effects of misclassification bias. In a sensitivity analysis, values are assumed for the misclassification rates (sensitivity and specificity), and the numbers of exposed and unexposed are back-calculated under these misclassification rates. The data are then reanalyzed in order to derive an adjusted point estimate and confidence

interval. While simple to accomplish in practice, a standard sensitivity analysis is flawed by having no mechanism by which to express the uncertainty of the misclassification rates. Several combinations of misclassification rates can be tried, but it is left up to the analyst to decide which, if any, is the most plausible. The determination of the misclassification rates can be improved when external data are available, such as a validation sub-study. Algebraic methods have also been developed for calculating corrected point estimates when validation information is available but the validation data are imperfect97. However, when validation information is available, the methods discussed below may be preferable.

Methods Requiring External Information

Probabilistic Sensitivity Analysis

Monte Carlo based methods have been developed for sensitivity analysis that account for uncertainty in the misclassification rates by having the analyst specify an external distribution for those rates83. For non-differential misclassification, two external distributions are chosen: one for the specificity and one for the sensitivity of the measure. For differential misclassification, four external distributions are specified: the sensitivity and specificity among cases and the sensitivity and specificity among controls. A single iteration of the simulation draws a set of sensitivity and specificity values from these distributions. These values are then used to calculate the positive predictive value (PPV) and negative predictive value (NPV) of exposure classification. For those initially classified as exposed, the PPV is the probability of being correctly classified, while for those initially classified as unexposed, the NPV is the probability of being correctly classified. These values are then applied to the corresponding individual records. Next, a random number is generated from a uniform(0,1) distribution for each record. If this number is larger than the record's probability of being correctly classified, the record is reclassified. Finally, a logistic regression is run on the newly classified data, and a summary log odds ratio calculated. In order to account for random error, the standard error of the conventional log odds ratio is calculated; then a value is sampled from a standard normal distribution. The product of this standard normal deviate and the conventional standard error is subtracted from the reclassified log odds ratio. This process is then repeated many times, resulting in a distribution of odds ratios adjusted for both random and systematic error. An advantage of this method is that, while it would benefit from independent assessments of the misclassification rates, such data are not required to be on hand. However, how robust the method is to misspecification of the external distributions for the misclassification rates remains an open question.
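The iteration just described can be sketched compactly. The counts and the uniform priors on sensitivity and specificity below are hypothetical, and the 2x2-table log odds ratio stands in for the logistic regression step (equivalent here because there is a single binary exposure and no covariates); none of these simplifications are part of the published method.

```python
import math
import random

# Sketch of a probabilistic sensitivity analysis for nondifferential exposure
# misclassification. Counts and priors are hypothetical; with one binary
# exposure and no covariates, the logistic slope equals the 2x2 log OR.
random.seed(42)

# Classified counts: exposed/unexposed cases, exposed/unexposed controls
A, B, C, D = 120, 180, 80, 220
records = [(1, 1)] * A + [(0, 1)] * B + [(1, 0)] * C + [(0, 0)] * D
se_crude = math.sqrt(1 / A + 1 / B + 1 / C + 1 / D)   # conventional SE of log OR

def log_or(recs):
    a = sum(1 for w, y in recs if (w, y) == (1, 1))
    b = sum(1 for w, y in recs if (w, y) == (0, 1))
    c = sum(1 for w, y in recs if (w, y) == (1, 0))
    d = sum(1 for w, y in recs if (w, y) == (0, 0))
    return math.log(a * d / (b * c))

adjusted = []
for _ in range(500):
    sens = random.uniform(0.65, 0.85)                  # prior for sensitivity
    spec = random.uniform(0.90, 0.99)                  # prior for specificity
    # PPV/NPV computed separately among cases (y=1) and controls (y=0)
    ppv, npv = {}, {}
    for y, n_exp, n_tot in ((1, A, A + B), (0, C, C + D)):
        p_cls = n_exp / n_tot                          # classified-exposed prevalence
        p_true = (p_cls - (1 - spec)) / (sens - (1 - spec))
        p_true = min(max(p_true, 0.0), 1.0)            # keep back-calculation in [0, 1]
        ppv[y] = min(sens * p_true / p_cls, 1.0)
        npv[y] = min(spec * (1 - p_true) / (1 - p_cls), 1.0)
    # Reclassify a record when a uniform(0,1) draw exceeds P(correctly classified)
    reclassified = [(w if random.random() < (ppv[y] if w else npv[y]) else 1 - w, y)
                    for w, y in records]
    # Subtract a standard-normal multiple of the conventional SE for random error
    adjusted.append(log_or(reclassified) - random.gauss(0, 1) * se_crude)

adjusted.sort()
print(f"adjusted OR, 2.5th/50th/97.5th percentiles: "
      f"{math.exp(adjusted[12]):.2f} / {math.exp(adjusted[249]):.2f} / "
      f"{math.exp(adjusted[487]):.2f}")
```

The sorted 500 draws approximate the distribution of odds ratios adjusted for both systematic and random error; the reported percentiles form a simulation interval rather than a conventional confidence interval.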

Methods Requiring Validation Sub-Studies

Regression Calibration Methods

Methods using regression calibration have also been employed in accounting for misclassification85-87. Regression calibration methods work by first fitting a primary logistic regression model in the whole sample, relating the misclassified exposure to the outcome of interest:

E(Y=1 | W=w) = b0 + b1*w

where W is the exposure measured with error and Y is the outcome of interest. The coefficient b1 is then corrected by the formulas:

b1t = b1/a1 and Var(b1t) = (1/a1^2)*Var(b1) + (b1^2/a1^4)*Var(a1)

where a1 is estimated by the regression:

E(X | W=w) = a0 + a1*w

where X is the gold standard measurement of the exposure.
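The correction step reduces to a few lines of arithmetic. The numeric inputs below are hypothetical placeholders for estimates that would come from the primary model and the validation sub-study.

```python
# Regression-calibration correction for a misclassified binary exposure.
# b1 and Var(b1) come from the primary (naive) model; a1 and Var(a1) come
# from the validation sub-study regression E(X | W) = a0 + a1*W.
# The numbers below are hypothetical placeholders.

def calibrate(b1, var_b1, a1, var_a1):
    """Return (b1/a1, Var(b1)/a1^2 + b1^2*Var(a1)/a1^4) per the formulas above."""
    return b1 / a1, var_b1 / a1**2 + (b1**2) * var_a1 / a1**4

b1, se_b1 = 0.40, 0.10      # naive log OR and its standard error
a1, se_a1 = 0.80, 0.05      # calibration slope and its standard error

b1_corr, var_corr = calibrate(b1, se_b1**2, a1, se_a1**2)
print(f"corrected log OR = {b1_corr:.3f}, corrected SE = {var_corr**0.5:.3f}")
```

With these inputs the naive coefficient 0.40 is rescaled to 0.50, and the standard error grows from 0.10 to about 0.13: the second variance term propagates the uncertainty in the calibration slope a1 itself.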

The primary limitation of these regression calibration methods is the need for a validation sub-study in order to make the appropriate estimation corrections; however, when such data are available, these methods provide a viable alternative for dealing with misclassification error.

Instrumental Variable Binary Errors-in-Variables Methods

Errors-in-variables methods for binary data have been developed for the analysis of clinical trials in the context of noncompliance with treatment assignment. The basic instrumental variable method relies on the potential outcomes framework as described by Rubin et al.98. Let R be random assignment to treatment, with R=1 when assigned to treatment and R=0 when assigned to control. Let A be the treatment received, with A=1 when treatment is received and A=0 when it is not. Let Y equal the outcome of interest. Under the assumptions of an instrumental variable analysis, the only groups who provide information about the effects of treatment on the outcome are those who comply with random treatment assignment. The resulting instrumental variable estimator (IVE) is

IVE = [E(Y | R=1) - E(Y | R=0)] / [E(A | R=1) - E(A | R=0)]

which is the standard intent-to-treat estimate divided by the proportion of compliers. The analogous estimate for a misclassified exposure (IVEobs) is formulated by letting X be the gold standard measure of exposure and W be the exposure measure with misclassification. The resulting estimator is

IVEobs = [E(Y | W=1) - E(Y | W=0)] / [E(X | W=1) - E(X | W=0)]

which is the misclassified estimate divided by the proportion misclassified. The proportion misclassified can be estimated from a validation sub-study by the regression

E(X | W=w) = a0 + a1*w

where a1 is the proportion misclassified. Under this formulation it is easy to see that IVEobs is equivalent to the regression calibration method previously described.

Multiple Imputation Methods

Methods using multiple imputation have also been employed in accounting for misclassification96. Multiple imputation for measurement error (MIME) works by first fitting a logistic regression model in the validation sample, relating the misclassified exposure to the gold standard exposure measure:

E(X=1 | W=w, Y=y) = b0 + b1*w + b2*y

where X is the gold standard exposure measure in the validation sample, W is the exposure with measurement error, and Y is the outcome of interest. The term W*Y can be included in the model to account for differential misclassification. From the parameters b0, b1, and b2 and their covariance matrix, draws of the coefficients are made from a multivariate normal distribution for each imputation. For each of k imputations, let Zk equal the imputed exposure measure. Set Zk equal to X for those in the validation sample. For those not in the validation sub-study, draw Zk from Bernoulli(pkwy), where

pkwy = 1/(1 + exp(-(b0k + b1k*w + b2k*y)))

Finally, run the primary analysis in each imputation set and combine the results using standard multiple imputation procedures. The primary drawback of these multiple imputation methods is the necessity of a validation sub-study in order to make the appropriate estimation corrections. Such studies are uncommon, as they can be expensive. However, when such data are available, these methods provide a viable alternative for dealing with misclassification error.
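The MIME procedure can be sketched for a binary exposure. Two simplifications are made here that are not part of the method as published: since W and Y are binary, the saturated imputation model P(X=1 | W, Y) reduces to four cell proportions in the validation sample, and per-imputation Beta draws stand in for the multivariate-normal draws of the logistic coefficients. All counts are hypothetical.

```python
import math
import random

# Sketch of multiple imputation for measurement error (MIME) with a binary
# exposure. Simplifications: the saturated logistic imputation model
# P(X=1 | W, Y) is replaced by validation-sample cell proportions, and Beta
# draws approximate multivariate-normal coefficient draws. Hypothetical data.
random.seed(7)

# Main sample (no gold standard): (w, y) pairs
main = [(1, 1)] * 150 + [(0, 1)] * 250 + [(1, 0)] * 100 + [(0, 0)] * 300
# Validation sub-study: (w, y, x) with gold-standard exposure x
valid = ([(1, 1, 1)] * 45 + [(1, 1, 0)] * 5 + [(0, 1, 1)] * 10 + [(0, 1, 0)] * 40 +
         [(1, 0, 1)] * 28 + [(1, 0, 0)] * 7 + [(0, 0, 1)] * 4 + [(0, 0, 0)] * 61)

def cell(w, y):
    """(# with X=1, cell size) among validation records with this (w, y)."""
    n_x1 = sum(1 for vw, vy, vx in valid if (vw, vy, vx) == (w, y, 1))
    n = sum(1 for vw, vy, _ in valid if (vw, vy) == (w, y))
    return n_x1, n

log_ors = []
for _ in range(20):                                     # 20 imputations
    # Validation records keep their observed gold-standard X
    imputed = [(x, y) for _, y, x in valid]
    for w, y in main:
        n_x1, n = cell(w, y)
        p = random.betavariate(n_x1 + 1, n - n_x1 + 1)  # draw P(X=1 | w, y)
        imputed.append((1 if random.random() < p else 0, y))
    a = sum(1 for x, y in imputed if (x, y) == (1, 1))
    b = sum(1 for x, y in imputed if (x, y) == (0, 1))
    c = sum(1 for x, y in imputed if (x, y) == (1, 0))
    d = sum(1 for x, y in imputed if (x, y) == (0, 0))
    log_ors.append(math.log(a * d / (b * c)))

pooled = sum(log_ors) / len(log_ors)                    # Rubin's-rules point estimate
print(f"MIME-pooled OR: {math.exp(pooled):.2f}")
```

The pooled point estimate is the mean of the per-imputation log odds ratios; a full analysis would also combine within- and between-imputation variances under Rubin's rules.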

Needs Still Present in the Methods Literature

There are a myriad of methods that have been developed to tackle the problem of misclassification in binary variables. Despite this progress, it is rare to see any of these methods proceed past the development phase. One of the reasons for this is the lack of clear guidelines concerning their use, presented in a way that is broadly accessible. Even among the more widely understood methods focused on here, there is still a need for a systematic hierarchy for their use. In order to develop such a hierarchy, it is necessary to first test the relative performance of these methods against one another in varying practical situations.

Research Aims

This study makes use of both real and simulated data. Data from Project Northland Chicago (PNC) consist of a high-risk urban longitudinal sample followed from 6th to 12th grade. This study is unique in that it uses longitudinal data to assess potential sources of misclassification bias. The following specific aims will be addressed:

Aim 1

Assess whether survey modality affects how adolescent participants respond to survey questions concerning alcohol behaviors. Specifically: Do rates of alcohol use (weekly, monthly, yearly) differ between mailed paper surveys and web-based surveys, or between mailed paper surveys and at-school paper surveys? If so, can these differences be attributed to modality effects, or are they the result of self-selection into survey mode? Does reported age at first alcohol onset differ between mailed paper surveys and web-based surveys, or between mailed paper surveys and at-school paper surveys? If so, can

these differences be attributed to modality effects, or are they the result of self-selection into survey mode?

Due to the presence of multiple survey administration modes during the 12th grade follow-up, PNC data were used in Paper #1 to assess the presence and extent of survey modality effects on participant responses regarding alcohol behavior. Few studies have looked at potential modality effects in responses to alcohol use questions among adolescents; studies that have addressed this question have done so primarily in college-aged samples.

Aim 2

Assess the predictors of recall bias for age at first alcohol onset among adolescents. What are the relative contributions of each predictor to changes in misclassification rates? Due to the presence of both retrospective and prospective measures of alcohol initiation, PNC data were used in Paper #2 to assess predictors of recall bias in the context of alcohol use initiation. Few studies have explored the magnitude of recall bias and its predictors in the context of alcohol research, and those that have were restricted in the number of predictors assessed. By providing an understanding of the mechanisms that might underlie recall error, this study will enable researchers to make more sound judgments about the possibility and extent of recall bias in self-report measures of alcohol use in samples of adolescents.

Aim 3

Create a hierarchy of best practices for dealing with misclassification bias in an observational setting. Paper #3 used simulations to test methods that incorporate information from validation sub-studies. Results from Paper #3 are discussed as a formulation of best practices for analyzing data with misclassified predictors. By providing a hierarchy of

best practices, this study will allow researchers to easily determine the best course of action in dealing with misclassification error with available data.

CHAPTER 2
PROJECT NORTHLAND CHICAGO STUDY DESIGN

All data for this dissertation are taken from the Project Northland Chicago (PNC) trial99-101. PNC was a longitudinal group-randomized trial in Chicago public schools aiming to reduce alcohol and drug use in a multiethnic urban setting. Public schools in Chicago were eligible for recruitment if they included 5th through 8th grades, had thirty or more students per grade, and had low rates of mobility (<25%). Sixty-one schools agreed to participate in the study over the course of three years. These schools were combined into twenty-two study units based on their geographic distribution. The study units were created in order to meet previously calculated power requirements of ten study units per condition averaging 200 participants each. Study units were matched on ethnicity, poverty, mobility, reading scores, and mathematics scores, and then randomized to either intervention (n=10) or control (n=12) status.

The original data collection consisted of four waves, lasting from 6th to 8th grade. Any student enrolled in a participating school during the course of the trial was eligible for participation. A baseline survey of sixth graders was conducted in Fall 2002. Three follow-up surveys occurred after the initiation of the intervention: Spring 2003, Spring 2004, and Spring 2005. Of the eligible students, 91% (n=4,259) completed the baseline survey in 2002; 94% (n=4,240) completed the Spring 2003 survey; 93% (n=3,778) completed the Spring 2004 survey; and 95% (n=3,802) completed the Spring 2005 survey. This data collection resulted in a series of repeated cross-sections with an embedded cohort of participants. The cohort follow-up rates were 89% for baseline to

the Spring 2003 survey, 67% for baseline to the Spring 2004 survey, and 61% for baseline to the Spring 2005 survey.

For the 5,711 students who completed at least one survey during the intervention period, attempts were made to recruit them for a long-term follow-up conducted during their 12th grade year (2009)101. Recruitment was carried out in three phases. During phase 1, participants were contacted through the mail, with telephone follow-up reminders for non-responders. Participants who responded during phase 1 were given the option to complete a paper copy of the survey included in the mailing or to complete the survey online. Phase 1 resulted in 2,375 participants completing surveys. For the 2,824 participants who did not respond during phase 1, phase 2 of recruitment attempted to reach them through the school system. Paper surveys administered through the Chicago public schools resulted in 448 further completers. Those who did not have valid addresses or verified school enrollment were tracked as part of phase 3, during which attempts were made to deliver a paper survey by courier service. Phase 3 resulted in 209 further completions. The overall response rate for the long-term follow-up survey was 53.1% (n=3,032).

CHAPTER 3
THE EFFECTS OF SURVEY MODALITY ON ADOLESCENTS' RESPONSES TO ALCOHOL USE ITEMS

Literature Review

Mixed-mode designs have become increasingly common in survey-based research and surveillance systems, as well as in epidemiological and intervention research102-107. The most commonly cited reasons for the use of mixed-mode surveys are increased response rates and decreased study costs108. However, the increase in response rate gained from using mixed modes is not without potential complications: participants' responses may vary by the mode of the survey itself, introducing unfortunate confounding63, 65, 66.

Studies to date examining modality effects of web versus paper surveys specific to alcohol questionnaires have found mixed results. Four published studies found little evidence of modality effects for web-based surveys of self-reported alcohol use compared to paper surveys58-61. Two studies have found that the interaction of survey mode with questions of high sensitivity is more likely to lead to significant survey mode effects62, 63. This may explain the inconsistency in the literature on studies reporting significant modality effects in self-reported alcohol use. Studies that found little evidence of modality effects for self-reported alcohol use occur primarily in adult or college-aged samples, while studies finding significant modality effects occur in samples drawn from adolescent populations62-68. Since adolescents face greater social pressure not to drink compared to adults, the difference in perceived anonymity of different survey modes may be of greater importance in how truthfully adolescents respond.

Studies comparing school versus home-administered surveys have consistently found higher reported rates of alcohol and drug use in school-administered surveys66-

PAGE 37

68. Kann et al. found significant differences in reported prevalence rates of problem behaviors between the school-based Youth Ri sk Behavior Survey (YRBS) and the home-based National Health Interview Survey (NHIS), including m easures of alcohol use. Kann et al. found significant differences in the reported prev alence of the following alcohol measures: lifetime alcohol use (YRBS=79.8%, NHIS=68.5%, t=6.56, p<0.001), episodic heavy drinking (YRBS=28.2%, NHIS=22.5%, t=3.21, p<0.001), and drank alcohol before age 13 (YRBS=33.8%, NHIS=19. 7%, t=11.56, p<0.001); significant differences were not found for se lf-reported current alcohol use 67. The largest differences between the school versus home-based surveys occurred with questions regarding illegal or stigmatized behavior, lead ing the authors to conclude the higher rates reported via an anonymous paper and pencil survey in school to be less affected by social desirability or underreporting bias than a home-based paper and pencil survey. The resulting differences between modes are typically attributed to the perceived lower levels of anonymity in household-based surveys. While much research has been done on the effects of survey modality on adolescent responses to alcohol use items, most of this research has focused on comparisons from national population-based surveys in order to assess error in population-level prevalence estimates 59, 60, 62, 66-68. While national survey comparisons are useful for assessing general populati on-wide effects of su rvey modality and validating population prevalenc es, they do not necessarily demonstrate the presence or absence of modality effects in more specializ ed settings or the potential threat to the interpretation of intervention effects in cont rolled trials. There remains a need to explore effects of survey modality in settings common to adolescent alcohol prevention trials. 37


Previous studies have examined college59, 60 or elementary64, 65 aged students. As a result, little is known about the influence of survey modality in trials of adolescent alcohol prevention. This study examined the effect of survey modality on responses to alcohol use questions in an adolescent population drawn from a recent alcohol prevention trial. Specifically, this study examined differences in responses to self-reported alcohol use items by survey mode, whether differences were the result of modality effects versus self-selection, and whether differences were differential by intervention status.

Methods

Study Design

Data for this study were collected as part of the Project Northland Chicago (PNC) trial99, 100. PNC was a longitudinal group-randomized trial of Chicago public schools to test an alcohol and drug use prevention intervention within a multi-ethnic urban setting. Public schools in Chicago were eligible for recruitment if they included 5th through 8th grade, had thirty or more students per grade, and had low rates of mobility (<25%). Sixty-one schools agreed to participate in the study over the course of three years. These schools were combined into twenty-two study units based on their geographic distribution. These study units were created in order to meet previously calculated power requirements of ten study units per condition averaging 200 participants each. Study units were matched based on ethnicity, poverty, mobility, reading scores, and mathematics scores. Study units were then randomized to either intervention (n=10) or control (n=12) status.

The original data collection consisted of four waves of school-based surveys, following a cohort of youth from 6th to 8th grade. Any student enrolled in a participating school during the course of the trial was eligible for participation. A baseline survey of sixth graders was conducted in Fall 2002. Three follow-up surveys occurred after the initiation of the intervention: Spring 2003, Spring 2004, and Spring 2005. Of the eligible students, 91% (n=4,259) completed the baseline survey in 2002; 94% (n=4,240) completed the Spring 2003 survey; 93% (n=3,778) completed the Spring 2004 survey; and 95% (n=3,802) completed the Spring 2005 survey. The trial design thus resulted in a series of repeated cross-sections with an embedded cohort of participants. The cohort follow-up rates were: 89% from baseline to the Spring 2003 survey, 67% from baseline to the Spring 2004 survey, and 61% from baseline to the Spring 2005 survey.

For the 5,711 students who completed at least one survey during the intervention period, attempts were made to recruit them as part of a long-term follow-up conducted during their 12th grade year (2009)101. Recruitment was carried out in three phases. During phase 1, participants were contacted through the mail, with telephone follow-up reminders for non-responders. Participants who responded during phase 1 were given the option to complete a paper copy of the survey included in the mailing or to complete the survey online. Phase 1 resulted in 2,375 participants completing surveys. For the 2,824 participants who did not respond during phase 1, phase 2 of recruitment attempted to reach them through the school system. Paper surveys were administered through Chicago public schools, resulting in 448 further completers. Those who did not have valid addresses or a verified school enrollment were tracked as part of phase 3. During phase 3, attempts were made to deliver a paper survey by means of a courier service. Phase 3 resulted in 209 further completions. The overall response rate for the long-term follow-up survey was 53.1% (n=3,032).


The analysis sample for this study included a cohort of 2,147 African American (40%), Hispanic (29%), white (16%), and other (15%) youth who completed a survey in both 8th grade and 12th grade. The sample was 45% male, mostly spoke English at home (72%), and was largely low income (78% reported receiving free or reduced-price lunch). Data collection protocols and analyses were approved by Institutional Review Boards at the University of Minnesota and the University of Florida.

Missing Data

Youth who completed both an 8th grade survey and a 12th grade survey were included in the report here. Students lost to follow-up were more likely to be nonwhite (p<0.01) and male (p<0.01); students lost to follow-up were also marginally more likely in the 8th grade to report drinking in the last year (p=0.15), month (p=0.08), and week (p=0.05). There were no significant differences between those lost to follow-up and those with data in 12th grade in the proportion reporting free or reduced-price lunch (p=0.72). Differential loss to follow-up was handled by the inverse probability weighting method. The probability of completion was modeled by means of a logistic regression with race/ethnicity, gender, and 8th grade alcohol use items as predictors of completion. The complete cases were then weighted in the analysis by the inverse of their modeled probability of completion.

Measures

Survey modality. A three-level nominal variable was constructed for survey modality: mailed paper survey, web-based survey, and school-based survey. Participants who returned a paper survey from phase 1 were classified into the mail-based group; participants who returned a paper survey as part of phase 3 were also classified into the mail-based group. Participants from phases 1 and 3 who completed the survey online were classified into the web-based group. Participants who completed the survey during phase 2 were classified into the school-based group. For the subsequent regression analyses this variable was dummy coded with the mail-based group as the reference group.

Alcohol in the last year. A single item was used to measure alcohol use in the last year. Participants were asked: "During the last 12 months, on how many occasions, or times, have you had alcoholic beverages to drink?" Participants could respond: never, 1-2 occasions, 3-5 occasions, 6-9 occasions, 10-19 occasions, 20-39 occasions, or 40 or more occasions.

Alcohol in the last month. A single item was used to measure alcohol use in the last month. Participants were asked: "During the last 30 days, on how many occasions, or times, have you had alcoholic beverages to drink?" Participants could respond: never, 1-2 occasions, 3-5 occasions, 6-9 occasions, 10-19 occasions, 20-39 occasions, or 40 or more occasions.

Alcohol in the last week. A single item was used to measure alcohol use in the last week. Participants were asked: "During the last 7 days, on how many occasions, or times, have you had alcoholic beverages to drink?" Participants could respond: never, 1-2 occasions, 3-5 occasions, 6-9 occasions, 10-19 occasions, 20-39 occasions, or 40 or more occasions.

Age at first alcohol use. Participants were asked, "During the last 12 months, on how many occasions, or times, have you had alcoholic beverages to drink?" during the baseline, 6th, 7th, 8th, and 12th grade surveys. During the 12th grade survey, students


were also asked "A year ago, about how often did you drink alcohol?"; "Two years ago, about how often did you drink alcohol?"; and "Three years ago, about how often did you drink alcohol?" Each participant's age at first alcohol use was estimated by the participant's age at the time of the first survey indicating alcohol use.

Ever use alcohol. Participants were asked, "If you have ever had an alcoholic drink, think back to the last time you drank. How did you get the alcohol?" Participants could respond "You've never had an alcoholic drink" or select one of ten possible sources. Responses to this variable were dichotomized to reflect categories of never had an alcoholic drink and had an alcoholic drink.

Race/ethnicity. During the baseline and intervention period surveys, race/ethnicity was measured by a single item. Participants were asked "How do you describe yourself? Mark all that describe you." Participants could respond by marking "Asian American or Asian Indian"; "Black or African American"; "Latino, Hispanic, or Mexican American"; "Native American or American Indian"; "White, Caucasian, or European American"; or "Other." Race/ethnicity was categorized as either: Asian, Black, Hispanic, Native American, White, or Other.

Free/reduced price lunch. Free or reduced-price lunch was measured at baseline by a single item. Participants were asked "Do you receive free or reduced-price lunches at school?" Response options included: "Yes," "No," or "Don't know." Responses of "Don't know" were recoded as missing.

Gender. Gender was measured at baseline by a single item. Participants were asked whether they were a "Boy" or "Girl." These responses were dummy coded as a single variable with female as the reference group.
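As context for the inverse probability weighting described in the Missing Data section, the sketch below shows the general recipe on simulated data: fit a logistic model for survey completion, then weight completers by the inverse of their fitted completion probability. This is an illustrative Python analogue (the original analysis was conducted in SAS), and all variable names, coefficients, and data are hypothetical.

```python
import numpy as np

def fit_logistic(X, y, iters=25):
    """Logistic regression via Newton-Raphson; X must include an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        w = p * (1.0 - p)
        # Newton step: beta += (X' W X)^-1 X'(y - p)
        beta += np.linalg.solve(X.T @ (X * w[:, None]), X.T @ (y - p))
    return beta

rng = np.random.default_rng(0)
n = 2000
# Hypothetical completion predictors: nonwhite, male, any 8th grade past-year use
nonwhite = rng.integers(0, 2, n)
male = rng.integers(0, 2, n)
use8 = rng.integers(0, 2, n)
X = np.column_stack([np.ones(n), nonwhite, male, use8])
# Simulate lower follow-up completion for nonwhite, male, and drinking students
true_logit = 1.0 - 0.5 * nonwhite - 0.4 * male - 0.3 * use8
completed = rng.random(n) < 1.0 / (1.0 + np.exp(-true_logit))

beta = fit_logistic(X, completed.astype(float))
p_complete = 1.0 / (1.0 + np.exp(-X @ beta))
ipw = np.where(completed, 1.0 / p_complete, 0.0)  # completers are weighted up

# The weighted completer sample approximately recovers the full sample size n
print(round(float(ipw.sum())))
```

Subjects most likely to be lost (here, those resembling non-completers) receive the largest weights, so the reweighted complete-case analysis mimics the full cohort under a missing-at-random assumption.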


Statistical Analysis

The analytical plan followed three steps. First, we tested for overall differences between modalities by including only survey modality as a predictor in a series of regression models. Next, we attempted to isolate the effect of survey modality from potential selection effects by including additional covariates in our modality regressions. Finally, we ran models including an intervention by survey mode interaction term in order to assess whether modality effects were differential by intervention condition. As a sensitivity analysis, we restricted the sample to those reporting some lifetime drinking and re-ran all previous analyses in order to explore whether any observed differences could be due to social desirability of responses. Post-hoc power analyses were also run to verify the ability of our sample to detect scientifically meaningful differences.

Modality regressions. First, the presence of any differences by survey mode was assessed with a series of regression models of the following form:

E(Yi) = β0 + β1Xi

where Yi represents the outcome variable for participant i and Xi represents the indicator variable for survey modality for participant i. To explore potential mode effects while teasing apart potential self-selection into mode, the regression models were specified as:

E(Yi) = β0 + β1Xi + β2Zi + β3Bi

where Yi represents the outcome variable for participant i, Xi represents the indicator variable for survey modality for participant i, Zi represents a set of time-invariant selection factors for participant i (race, gender, and socioeconomic status), and Bi represents the value of the outcome variable at 8th grade for participant i. Controlling for 8th grade values of the outcome variable provides greater control of selection factors compared to the demographic controls present in many studies, yielding a more valid estimate of the modality effects unconfounded by selection factors. Outcomes assessed while controlling for the covariates included: alcohol use in the last week, month, and year. To explore whether mode effects are differential by intervention condition, the regression models were specified as:

E(Yi) = β0 + β1Xi + β2Ti + β3Zi + β4Bi + β5XiTi

where Yi represents the outcome variable for participant i, Xi represents the indicator variable for survey modality for participant i, Ti represents the indicator variable for the intervention group for participant i, Zi represents a set of time-invariant selection factors for participant i (race, gender, and socioeconomic status), and Bi represents the value of the outcome variable at 8th grade for participant i.

As a sensitivity analysis, the analysis was restricted to the sample that reported any lifetime drinking. Since this sample had already reported underage drinking, they might be less likely to be influenced by the social desirability of a negative response. If differences by survey mode persist in this sample, it would be more difficult to attribute them to social desirability bias. All previous models were re-run using the restricted sample. Additionally, the analysis of the effect of survey modality on age at first alcohol use was also conducted on the restricted sample.

Participants in the study are naturally clustered by school, and therefore observations cannot be assumed to be independent. As a result, standard regression techniques can lead to incorrect inferences due to biased standard error estimates. Our analysis takes this correlation into account by means of generalized estimating equations using PROC GENMOD in SAS v9.3. The GENMOD procedure allows us to


easily account for the correlation between participants clustered in schools, while providing flexibility in the distributional assumptions of our outcome variables.

Power analysis. All alcohol outcomes were approximated as Poisson distributed random variables during the power analysis. For these outcomes, G*Power v3.1.3 was used to determine the smallest effect size detectable (in rate ratios) given our sample. The parameters needed for the power analysis of Poisson distributed outcomes were: the sample size (N), the desired power, the desired type-1 error rate, the base event rate (e0), the R2 of the covariates, and the proportion of participants who selected to take the non-mailed paper survey. N is a fixed quantity of our data (N=2,147). Power was set at 0.95 in order to confidently claim no effect. The type-1 error rate was set at 0.05, and e0 was estimated by an intercept-only Poisson regression model for each outcome. The R2 of the other covariates was estimated by a regression excluding the modality effect. The proportion selecting the non-mailed paper survey was calculated directly from the data.

Results

The full analysis sample, including both drinkers and non-drinkers, produced no statistically significant modality effects. No significant differences were found in responses to the alcohol use items when comparing mail-based versus web-based survey administrations. When comparing the school-based group to the mail-based group, a small marginally significant effect of survey modality was observed in responses to the alcohol use in the last year item [RR=1.11, 95% CI: (0.98, 1.25)]. This effect was robust to the inclusion of covariates to control for potential selection effects [RR=1.10, 95% CI: (1.00, 1.21)]. No other modality effects were found comparing the mail-based group to the school-based group in the full sample (Table 3-1).
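For intuition about the Poisson power analysis described above, the same question — what rate ratio is detectable at a given power — can also be approximated by simulation. The sketch below is an illustrative Python stand-in for the G*Power calculation; the base event rate and group proportion are placeholder values, not the estimates used in the study, and it ignores the covariate R2 adjustment.

```python
import math
import random

def detect_power(n=2147, base_rate=1.5, rate_ratio=1.10, prop_exposed=0.3,
                 z_crit=1.96, sims=4000, seed=1):
    """Simulated power of a Wald test for the log rate ratio when comparing
    Poisson-distributed counts between two survey-mode groups."""
    rng = random.Random(seed)
    n1 = round(n * prop_exposed)   # e.g., the non-mailed-paper group
    n0 = n - n1                    # mailed-paper reference group
    rejections = 0
    for _ in range(sims):
        mu0 = n0 * base_rate
        mu1 = n1 * base_rate * rate_ratio
        # Group totals: Poisson(mu) is well approximated by Normal(mu, mu) here
        y0 = max(1, round(rng.gauss(mu0, math.sqrt(mu0))))
        y1 = max(1, round(rng.gauss(mu1, math.sqrt(mu1))))
        log_rr = math.log((y1 / n1) / (y0 / n0))
        se = math.sqrt(1 / y1 + 1 / y0)  # Wald standard error of log(RR)
        if abs(log_rr) / se > z_crit:
            rejections += 1
    return rejections / sims

# Scanning rate_ratio upward until the returned power reaches the target
# (here 0.95) gives the smallest detectable effect for these inputs.
print(detect_power(rate_ratio=1.10))
```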


The regression models testing for a differential modality effect by intervention condition detected no statistically significant intervention by survey mode interactions. Thus, there is no evidence of differential effects of modality by intervention group.

No significant modality effects were detected among students who reported ever drinking alcohol in their lifetimes, regardless of covariate controls. For comparisons of the web- and mail-based groups, the direction and magnitude of the estimates produced no clear pattern and were not statistically significant. In addition, no statistically significant differences were found between the mail- and school-based groups, and the direction and magnitude of the estimates produced no clear pattern (Table 3-2).

We have adequate statistical power to detect changes in alcohol behaviors due to survey modality as low as 5% to 10%, depending on the outcome tested. We therefore have the statistical power to detect scientifically meaningful changes in reported alcohol use due to modality. This is made clear when comparing our detectable effect sizes to the 20% to 50% reductions reported in prior adolescent alcohol prevention trials109.

Discussion

Our results show no appreciable differences in self-reported alcohol use across mailed paper, in-school paper, and web-based surveys of high school-aged participants in a large-scale alcohol prevention trial. These findings support potential benefits of using mixed-modality surveys for long-term follow-up of high school-aged participants. In the PNC follow-up surveys, including multiple modalities increased the response rate of the follow-up survey with no threat of survey modality confounding results. Our results show that the important benefits of added sample size came with little to no measurement error due to multiple survey modalities.


While the results did produce a single marginally significant estimate of the effect of the in-school modality, relative to mailed paper surveys, on participants' self-reported alcohol use in the past year, this single result should be interpreted with caution. Given the number of models tested, this effect could well be due to chance alone. On the other hand, it is possible that the estimate reflects a real association between survey modality and adolescents' self-reported alcohol use. Even if it reflects a true effect, the size of the effect is small enough not to be of substantive importance, especially compared to other causes of measurement error, such as recall bias39. If the estimate is the result of a real association between survey modality and adolescents' self-reported alcohol use, the sensitivity analysis among those reporting some lifetime alcohol use points to social desirability as the likely mechanism for the difference. When the sample was restricted to self-reported lifetime alcohol users, a group demonstrably less influenced by the social desirability of their responses, the marginally significant estimate was eliminated.

Any interpretation of these results should be viewed in the context of both the study's limitations and its strengths. First, the data used were not taken from a trial designed with the purpose of testing the effects of survey modality on participants' responses. Participants were allowed to self-select into the modality of their choice in order to maximize the response rate of the original trial. Because participants were not randomized to their survey modality, we cannot be sure whether the results reflect the true effect of modality or are the result of selection factors. However, all participants had previously responded to the alcohol measurement items using a single survey modality during 8th grade. Since the 8th grade measurements could not be influenced by different survey modalities, controlling for these responses provides the strongest control of selection factors possible in this sample. Second, PNC did not produce a significant intervention effect in reducing self-reported alcohol use. While the results provide the best available evidence that survey modality does not have differential effects by assigned intervention group, it is possible that no association was found due solely to the lack of an intervention effect in the original trial. Despite the problems inherent in using secondary data for this study, the use of data from an established intervention trial also provides a key strength. Because this study is derived from an alcohol prevention trial, its conclusions are more directly applicable to both the interpretation of intervention effects from similar completed trials and the design of future trials, compared to conclusions drawn from similar modality analyses of more general population surveys.

Future research should seek to build on this study in several ways. First, the results should be replicated in trials within different demographic groups and in other settings. The present sample was drawn entirely from a highly urban setting and consisted of a majority-minority population. To increase the generalizability of these findings, they should be replicated using trials performed in rural settings and with populations with other demographic distributions. Second, the results should be replicated in trials that produced a significant intervention effect to provide stronger evidence that potential modality effects are not affected by treatments. Nevertheless, our results provide comforting evidence that adolescent alcohol prevention trials may use multiple survey modalities when appropriate to increase response rates without introducing bias that harms the interpretation of intervention effects.


Table 3-1. The effects of survey modality on self-reported alcohol use

                              Web vs mailed paper                      School vs mailed paper
                              Crude Est (CI)     Adjusted* Est (CI)    Crude Est (CI)     Adjusted* Est (CI)
Poisson distributed outcomes
  Alcohol in last week        0.99 (0.94, 1.04)  1.00 (0.95, 1.05)     1.02 (0.94, 1.11)  1.00 (0.93, 1.09)
  Alcohol in last month       1.02 (0.94, 1.11)  1.02 (0.94, 1.11)     1.05 (0.95, 1.17)  1.06 (0.96, 1.16)
  Alcohol in last year        1.04 (0.95, 1.13)  1.01 (0.92, 1.11)     1.11 (0.98, 1.25)  1.10 (1.00, 1.21)

* Adjusted for race, gender, free or reduced price lunch, and 8th grade values of the outcome variables.

Table 3-2. The effects of survey modality on self-reported alcohol use among adolescents reporting lifetime alcohol use

                              Web vs mailed paper                      School vs mailed paper
                              Crude Est (CI)     Adjusted* Est (CI)    Crude Est (CI)     Adjusted* Est (CI)
Poisson distributed outcomes
  Alcohol in last week        0.98 (0.92, 1.05)  0.99 (0.92, 1.06)     1.00 (0.89, 1.12)  0.98 (0.88, 1.09)
  Alcohol in last month       1.02 (0.92, 1.12)  1.01 (0.91, 1.12)     1.02 (0.90, 1.15)  1.00 (0.89, 1.11)
  Alcohol in last year        1.02 (0.93, 1.12)  1.00 (0.90, 1.10)     1.05 (0.95, 1.15)  1.00 (0.92, 1.09)
Normally distributed outcome
  Age at alcohol onset       -0.07 (-0.20, 0.06) 0.00 (-0.14, 0.14)    0.07 (-0.11, 0.23) 0.13 (-0.05, 0.30)

* Adjusted for race, gender, free or reduced price lunch, and 8th grade values of the outcome variables.


CHAPTER 4
PREDICTORS OF RECALL ERROR IN SELF-REPORT OF AGE OF ALCOHOL USE ONSET

Literature Review

Alcohol use is widespread among American adolescents. While rates of early alcohol initiation have been dropping since the early 1990s, rates still remain high. According to 2010 reports from the Monitoring the Future (MTF) study, over 35% of adolescents have initiated alcohol use by 8th grade, over 58% by 10th grade, and 71% by 12th grade; additionally, 16% of adolescents report having been drunk by 8th grade, 37% by 10th grade, and 54% by 12th grade1. Not only are these rates high, but there is also an extensive literature on the harmful effects of alcohol use during adolescence. The literature strongly suggests that early initiation of alcohol use is associated with higher rates of subsequent high-risk behaviors (violence, substance abuse, academic problems, employment problems, and risky sex)2-5 and consequent health problems (emotional problems, sexually transmitted infections, and injury)2, 3, 6-8. Moreover, early onset increases the risk of alcohol dependence in adulthood. In short, early onset of alcohol use is commonly considered to be a serious public health problem.

An earlier age at first alcohol use has been used to predict a variety of negative outcomes2-8. More recently, age of alcohol use onset has been studied as a way to subtype patients for alcoholism interventions. Kranzler et al. analyzed the effects of sertraline treatment on subgroups defined by age of alcohol use onset, concluding that age of alcohol onset was a clinically useful measure for subtyping alcohol abuse patients for treatment110. The continued use of, and growing reliance on, retrospective age of alcohol use onset measures underscores the importance of understanding the predictors of recall error in age of alcohol use onset. Knowing the characteristics of survey respondents who are more likely to experience error in recall will allow researchers to better interpret findings based on retrospective measures of alcohol use onset.

Because adolescent alcohol use, and early onset of use in particular, is a prominent public health problem, it is important that its effects be studied as rigorously as possible. However, data collected for the purpose of adolescent alcohol research often of necessity rely on retrospective measures, with serious potential for systematic recall error52, 70-74. Collins et al. found that retrospective measures tended to underreport the extent of past drinking, and that levels of current drinking were predictive of the extent of recall bias in high school students71. Studies have found the presence of recall bias across a wide range of reference time periods. For example, Simpura et al. found that 18-year recall of alcohol use in adult men tended to overestimate the levels of past drinking, while finding that the original level of drinking was predictive of the measurement error70. Gmel et al. found significant underreporting bias in seven-day alcohol use recall in emergency room settings74. The extent of recall bias in self-reported alcohol use can also be affected by the perceived harm the alcohol use may cause. A study of recall bias among pregnant women found that women who experienced adverse pregnancy outcomes tended to underreport alcohol use when measured retrospectively compared to women who did not experience adverse pregnancy outcomes72. A longitudinal study of college freshmen reported recall bias to be a factor in the measurement of self-reported age of alcohol use onset52. A cohort (n=410) of college freshmen was followed annually for 11 years. A significant linear trend was observed for the mean self-reported age of alcohol use onset by study year


(F=84.75, p<0.001), indicating that increasing the length of the recall period also increased the self-reported age of first alcohol use.

Few studies have explored the magnitude of recall error and its predictors in the context of adolescent alcohol research. Prior studies on the predictors of recall error in adolescent alcohol use have been restricted in the number of predictors assessed. We used data available from Project Northland Chicago (PNC), a longitudinal intervention trial to reduce adolescent alcohol use, to assess the association of a variety of predictors with recall error in age of alcohol use onset. The cohort embedded in PNC enables us to construct both prospective and retrospective measures of the age of alcohol use onset. This allows for the direct assessment of which participants experienced poor recall, and allows us to explore variables that predict which participants give an inaccurate report leading to potential bias. Furthermore, the direction of any recall error can be assessed by comparing responses on the retrospective versus prospective measures. By improving understanding of the mechanisms that underlie recall error, this study enables researchers to make more sound judgments about the possibility and extent of recall bias when using retrospective self-report measures of age of alcohol use onset. Results will also help better assess the validity of effect sizes from the alcohol use etiological literature, which often relies on retrospective measures of age of alcohol use onset.

Methods

Study Design

Data for this paper were collected as part of the Project Northland Chicago (PNC) trial99, 100. PNC was a longitudinal group-randomized trial of Chicago public schools aiming to reduce alcohol and drug use in a multiethnic, high-poverty urban setting.
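The comparison described above — checking whether the retrospective onset report matches the prospective one, and recording the direction of any disagreement — can be sketched in a few lines. This is an illustrative Python fragment; the category labels are hypothetical stand-ins for the survey's response options, with "never" treated as the latest category.

```python
# Hypothetical onset categories, ordered youngest to oldest (labels illustrative)
CATEGORIES = ["8 or younger", "9-10", "11-12", "13-14", "15-16", "17+", "never"]

def agreement(prospective: str, retrospective: str):
    """Return ('agree'|'disagree', direction), where direction says whether the
    retrospective report places onset earlier or later than the prospective one."""
    p = CATEGORIES.index(prospective)
    r = CATEGORIES.index(retrospective)
    if p == r:
        return "agree", None
    return "disagree", ("earlier" if r < p else "later")

print(agreement("11-12", "11-12"))   # ('agree', None)
print(agreement("11-12", "13-14"))   # ('disagree', 'later')
```

Applied across the cohort, the first element yields the binary agreement indicator used as the outcome, and the second gives the direction of recall error.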


Public schools in Chicago were eligible for recruitment if they included 5th through 8th grade, had thirty or more students per grade, and had low rates of mobility (<25%). Sixty-one schools agreed to participate in the study. Characteristics of students within the study schools were similar to the overall demographics of the Chicago Public Schools district. The PNC sample was 50% male, ethnically diverse (43% black, 29% Hispanic, 13% white, and 15% other), and of low socio-economic status (72% received free or reduced-price lunch). Less than half the students lived with both of their parents (47%), and 74% reported English as the primary language at home.

The core data collection consisted of four waves of school-based surveys, following a cohort of youth from 6th to 8th grade. Any student enrolled in a participating school during the course of the trial was eligible for participation. A baseline survey of sixth graders was conducted in Fall 2002. Three follow-up surveys occurred after the initiation of the intervention: Spring 2003, Spring 2004, and Spring 2005. Of the eligible students, 91% (n=4,259) completed the baseline survey in 2002; 94% (n=4,240) completed the Spring 2003 survey; 93% (n=3,778) completed the Spring 2004 survey; and 95% (n=3,802) completed the Spring 2005 survey. These survey waves resulted in a series of repeated cross-sections with an embedded cohort of participants. The cohort follow-up rates were: 89% from baseline to the Spring 2003 survey, 67% from baseline to the Spring 2004 survey, and 61% from baseline to the Spring 2005 survey.

For the 5,711 students who completed at least one survey during the intervention period, attempts were made to recruit them as part of a long-term follow-up conducted during their 12th grade year (2009)101. Recruitment was carried out in three phases. During phase 1, participants were contacted through the mail, with telephone follow-up reminders for non-responders. Participants who responded during phase 1 were given the option to complete a paper copy of the survey included in the mailing or to complete the survey online. Phase 1 resulted in 2,375 participants completing surveys. For the 2,824 participants who did not respond during phase 1, phase 2 of recruitment attempted to reach them through the school system. Paper surveys were administered through the Chicago public schools, resulting in 448 further completers. Those who did not have valid addresses or a verified school enrollment were tracked as part of phase 3. During phase 3, attempts were made to deliver a paper survey by means of a courier service. Phase 3 resulted in 209 further completions. The overall response rate for the long-term follow-up survey was 53.1% (n=3,032).

In order to assess predictors of recall error, it is necessary to construct a prospective measure of age of alcohol use onset for comparison with the retrospective measure included as part of PNC's long-term follow-up survey. To facilitate this, the sample was restricted to only those who responded to the alcohol use questions in both the original PNC survey and the long-term follow-up survey in 2009. The resulting study cohort included 1,054 African American (43.1%), Hispanic (37.5%), and White (19.5%) youth who completed a survey in 6th grade and 12th grade. The sample was 44.6% male, 69.9% spoke English at home, and 74.5% were low income. Data collection protocols and analyses were approved by Institutional Review Boards at the University of Minnesota and the University of Florida.

Measures

Retrospective measure. Age of alcohol use onset was assessed retrospectively by means of a single item on the 12th grade follow-up survey. Participants were asked "About how old were you when you first started drinking, not


counting small tastes or sips?" Participants could respond: "Never"; "8 years old or younger"; "9 or 10 years old"; "11 or 12 years old"; "13 or 14 years old"; "15 or 16 years old"; or "17 years or older." Prospective measure. The prospective age of alcohol use onset variable was created by combining information from eight variables over the five survey time points. Students were asked, "During the last 12 months, on how many occasions, or times, have you had alcoholic beverages to drink?" during the baseline, 6th, 7th, 8th, and 12th grade surveys. During the 12th grade survey, students were also asked, "A year ago, about how often did you drink alcohol?"; "Two years ago, about how often did you drink alcohol?"; and "Three years ago, about how often did you drink alcohol?" From these responses, age of alcohol use onset was assigned as the participant's age at the time of the first survey on which the respondent indicated using alcohol. This prospective age of alcohol use onset was then categorized to match the categories of the retrospective measure. Due to the inclusion of the recall questions from 12th grade, covering grades 9 to 11 (ages 15 to 17), this measure is not fully prospective. However, it is the most prospective measure possible that matches the categories used in the purely retrospective measure. Agreement measures. The measure of agreement compared the prospective and retrospective measures of age of alcohol use onset. If the two alcohol measures had equal values, the agreement measure was categorized as "agree"; if they were unequal, the agreement measure was categorized as "disagree." A second indicator was coded to show whether the retrospective measure


indicated an earlier or later age of alcohol use onset compared to the prospective measure. Age. Date of birth was recorded on the surveys during grades 6 to 8. Age at 12th grade was calculated from the 2009 survey date and the earlier recorded date of birth. Race/ethnicity. Race/ethnicity was measured with a single item. Participants were asked, "How do you describe yourself? Mark all that describe you." Participants could respond by marking "Asian American or Asian Indian"; "Black or African American"; "Latino, Hispanic, or Mexican American"; "Native American or American Indian"; "White, Caucasian, or European American"; or "Other." Race/ethnicity was categorized as Asian, Black, Hispanic, Native American, White, or Other. Free/reduced price lunch. Free or reduced-price lunch status was measured at baseline by a single item. Participants were asked, "Do you receive free or reduced-price lunches at school?" Participants could respond "Yes," "No," or "Don't know." Responses of "Don't know" were recoded as missing. Family structure. Family structure was measured at baseline by a single item. Participants were asked, "Who do you live with most of the time? Mark only one answer." Participants could respond: "Mother and father together"; "Mother and father equally, at separate households"; "Parent and step-parent"; "Mother mostly"; "Father mostly"; "Grandparent"; "Other relative"; "Foster parents"; or "Other." These responses were dichotomized to reflect categories of "mother and father together" and "other." Baseline and 12th grade alcohol use. A single item was used to measure alcohol use at both baseline and 12th grade. Participants were asked, "During the last


12 months, on how many occasions, or times, have you had alcoholic beverages to drink?" Participants could respond: "never," "1-2 occasions," "3-5 occasions," "6-9 occasions," "10-19 occasions," "20-39 occasions," or "40 or more occasions." These responses were dichotomized to reflect categories of yes or no for baseline alcohol use due to extreme skewness in the distributions. Baseline and 12th grade cigarette use. A single item was used to measure cigarette use at both baseline and 12th grade. Participants were asked, "Have you ever smoked a cigarette?" Participants could respond yes or no for both baseline and 12th grade cigarette use. Baseline and 12th grade marijuana use. A single item was used to measure marijuana use at both baseline and 12th grade. Participants were asked, "During the last 12 months, on how many occasions, or times, if any, have you used marijuana (other names for marijuana are: pot, grass, weed, dope, reefer, blunt, hash, hashish)?" Participants could respond: "never," "1-2 occasions," "3-5 occasions," "6-9 occasions," "10-19 occasions," "20-39 occasions," or "40 or more occasions." These responses were dichotomized to reflect categories of yes or no for both baseline and 12th grade marijuana use due to extreme skewness in the distributions.

Statistical Analysis

Analyses were conducted in two phases. During phase 1, logistic regression was used to examine factors associated with agreement between the retrospective and prospective measures. During phase 2, we assessed the direction of the bias due to each significant predictor. Details of each phase are discussed below. The sampling frame of the parent trial involved recruiting students through their schools; therefore, students were clustered within schools, and observations cannot be


assumed to be independent. As a result, standard regression techniques can lead to incorrect inferences.111 Our analyses take this correlation into account by means of survey logistic regression using PROC SURVEYLOGISTIC in SAS v9.3, where school is treated as the survey cluster. Phase 1. Logistic regression analysis was used with "agree" as the reference group, compared with "disagree." A series of multivariable logistic regressions were estimated to assess the relative importance of each independent variable. An added-in-order procedure was used to eliminate individual terms from the full model. The ordering of the term elimination process was based on a combination of chronology (Fig. 4-1) and model hierarchy rather than any automated statistical procedure.112 The first candidate variable for elimination was the theorized most chronologically proximal risk factor (self-reported marijuana use in 12th grade). Each variable was then tested in sequence from the next most chronologically proximal, ending with the most chronologically distant. A variable was removed from the model only if its p-value exceeded 0.05 and its removal did not suggest confounding, as measured by a change-in-estimate criterion of 10% in any of the remaining variables. Variables retained in this phase were further explored in phase 2. All phase 1 models were run using PROC SURVEYLOGISTIC in SAS v9.3. Phase 2. The retained predictors were tested to see whether they were associated with the direction of bias in self-reported age of alcohol use onset. To test this, the sample was restricted to respondents whose retrospective and prospective measures disagreed. Next, a series of univariate logistic regression models were run of the form:


logit[E(Yi)] = β0 + β1Xi

where Yi represents a binary indicator variable for the retrospective measure reporting a later age at first alcohol use compared to the prospective measure, and Xi represents one of the retained predictors. All phase 2 models were run using PROC SURVEYLOGISTIC in SAS v9.3. Loss to follow-up. Due to the necessary restriction of participants to those with both a baseline and a long-term follow-up survey, it is possible that the results could be affected by selection bias caused by loss to follow-up. To address this concern, all analyses were conducted again using inverse probability of censoring weighting. The probability of completion was modeled by means of a logistic regression with family structure, socioeconomic status, race/ethnicity, gender, and a baseline score of alcohol risk as predictors of completion. The complete cases were then weighted by the inverse of their modeled probability of completion. These weights were then incorporated into the analysis using PROC SURVEYLOGISTIC.

Results

Missing Data

PNC participants who were lost to follow-up were more likely to be older at baseline (t=7.49, p<0.001), non-white (χ2=28.92, p<0.001), and male (χ2=39.12, p<0.001). Additionally, they were more likely to be smokers (χ2=5.15, p=0.02). There were no significant differences between those lost to follow-up and those who completed a 12th grade survey on qualifying for a free or reduced-price lunch, living in a home with both a mother and father, baseline alcohol use, or baseline marijuana use. Weighting the analyses by the probability of cohort completion had no


substantive effect on any of the analyses. Thus, all results presented are from the unweighted analyses.

Model Results

Agreement between the prospective and retrospective measures of age at first alcohol use was found in 39% of respondents. In 40% of respondents, the retrospective measure indicated a later age at first alcohol use, and in 21% an earlier age at first alcohol use. A full cross-tabulation of the retrospective and prospective measures of age of alcohol use onset is shown in Table 4-1. Table 4-2 presents the results from the final model of the previously described added-in-order procedure. The added-in-order procedure produced a model including eligibility for free or reduced-price lunches, baseline alcohol use, alcohol use in 12th grade, and cigarette use in 12th grade as significant predictors. The strongest associations with recall error were found with baseline alcohol use [OR: 6.73, 95% CI: (3.95, 11.47)] and 12th grade alcohol use [OR: 3.90, 95% CI: (2.91, 5.22)]. Table 4-3 presents results from the phase 2 models exploring whether the retained predictors were associated with a particular direction of the recall bias. For both eligibility for free or reduced-price lunches and baseline alcohol use, there was no statistically significant difference in the odds of the retrospective measure indicating a later self-reported age at first alcohol use compared to the prospective measure. There was a statistically significant increase in the odds of the retrospective measure indicating a later self-reported age at first alcohol use for both alcohol use in 12th grade [OR: 23.53, 95% CI: (15.71, 35.24)] and cigarette use in 12th grade [OR: 3.39, 95% CI: (2.35, 4.88)].
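To make the agreement coding concrete, the two indicator variables described under Measures might be constructed as follows. This is an illustrative Python sketch (the actual analyses used SAS); the ordinal category codes and the five example respondents are hypothetical, and handling of "never" responses is omitted for brevity.

```python
import numpy as np

# Ordinal codes for age of first alcohol use categories
# (1 = 8 years old or younger, ..., 6 = 17 years or older).
prospective   = np.array([2, 3, 4, 5, 6])
retrospective = np.array([2, 5, 4, 3, 6])

# Phase 1 outcome: indicator that the two measures disagree.
disagree = (prospective != retrospective).astype(int)

# Phase 2 outcome, among disagreements only: indicator that the retrospective
# measure reports a LATER onset than the prospective measure.
later = (retrospective > prospective).astype(int)[disagree == 1]

print(disagree.tolist(), later.tolist())  # [0, 1, 0, 1, 0] [1, 0]
```

These two indicators correspond to the dependent variables of the phase 1 and phase 2 models, respectively.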


Discussion

This study was designed to assess the validity of retrospective recall of age of alcohol use onset, and to determine predictors and the direction of recall error. Our assumption is that the prospective age of alcohol use onset measure is more accurate than the retrospective measure. This assumption is consistent with the prior literature on retrospective questionnaires, because the cognitive burden of recalling past events increases as the length of the recall timeframe increases.42 Based on this assumption, we treat disagreement between the prospective and retrospective measures of age at first alcohol use as evidence of recall error. Eligibility for free or reduced-price lunch at baseline was significantly associated with an increase in recall error for age at first alcohol use, although the direction of recall error was not consistent. This finding is consistent with a prior study that found that low socioeconomic status respondents were particularly susceptible to memory errors (effect size=0.6).113 Since no consistent direction was found in the recall error associated with eligibility for free or reduced-price lunch, we do not expect this recall error to bias population-level effect estimates. However, researchers and clinicians implementing screenings of low-income individuals should be aware that retrospective measures of age at first alcohol use are more susceptible to recall error among lower-SES adolescents, keeping in mind that one cannot assume that the recall error consistently reflects an earlier actual age of onset than the one reported. Substance use reported at baseline was associated with an increase in recall error for age at first alcohol use. There are two possible interpretations of this result. First, early alcohol use could be responsible for an increase in recall error due to the negative effects of alcohol use on cognitive development.114-116 Another possibility is


that those who reported alcohol use at baseline have greater recall error due to the increased length of the recall period. Our data from PNC do not allow us to directly test which of these explanations, or what combination of them, may be true. However, prior research showing that a longer recall period leads to more recall error found that the reported age at first alcohol use shifted later as the recall period increased. Our results found no difference in the odds of self-reported baseline alcohol users reporting a later age at first alcohol use on the retrospective measure compared to the prospective measure. This inconsistency between our findings and prior studies on the length of first alcohol use recall periods gives some credence to the alternative explanation that the increased recall error is due to the negative cognitive effects of early alcohol use. Substance use reported at the 12th grade follow-up was associated with an increase in recall error for age at first alcohol use. Twelfth-grade substance use items had a consistent and statistically significant pattern of increasing the odds of the retrospective measure showing a later age at first alcohol use. Later substance use is a common outcome in the etiological literature relying on retrospective recall of age at first alcohol use.52,70-74 For example, Gruber et al. used recalled age at first alcohol use to predict later alcohol abuse.117 Our results imply that differential misclassification may be present in Gruber et al.'s and similar studies. This misclassification attenuates effect sizes by shifting relatively more of the alcohol users into later age at first alcohol use categories. The finding that students who drink by 12th grade are more likely to report an inaccurate and later age at first alcohol use may result in an underestimation of the risk of early onset of alcohol use.


The primary limitation of this study is the lack of a gold standard measure for age at first alcohol use with which to compare the retrospective measure. While the variables used to construct the prospective measure during the 6th to 8th grade window have been shown to have high test-retest reliability,40,41,47,49,50 the prospective measure is also composed of variables that employ recall questions covering the gap in surveying between the core 6th to 8th grade PNC surveys and the later long-term follow-up occurring in 12th grade. This problem is mitigated by the fact that the prospective measure is still more accurate than the fully retrospective measure. All of the variables used to create the prospective measure have substantially shorter recall periods than the retrospective measure, which has been found to result in a lower rate of recall error.52 This makes our prospective measure a useful proxy for the purposes of this study; however, future studies should continue to improve upon the measurement of age at first alcohol use. Future research should seek to build on this study in several other ways. First, results should be replicated in data sets with different demographic groups and other settings. The present sample was drawn from an entirely urban setting and included a majority-minority population. To increase the generalizability of these findings, they should be replicated using studies performed in rural settings, among middle- and upper-class students, and in predominately white populations. Second, results should be replicated with data that include a fully prospective measure of age at first alcohol use. While further research is needed, these results provide important practical information facilitating the correct interpretation and implementation of research involving retrospective measures of age at first alcohol use. Most notably, those most at risk of


the negative outcomes associated with early alcohol initiation are also those most likely to misreport their age at first alcohol use.
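The inverse probability of censoring weighting used to probe loss to follow-up (Statistical Analysis, above) can be sketched as follows. This is an illustrative Python sketch, not the actual SAS implementation; the covariates, sample size, and data are hypothetical stand-ins, and the weighted outcome model itself (PROC SURVEYLOGISTIC in the actual analysis) is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical baseline covariates and follow-up completion indicator.
n = 500
male = rng.binomial(1, 0.45, n)
low_income = rng.binomial(1, 0.75, n)
completed = rng.binomial(1, 0.55, n)

# Model P(completion | covariates) with a logistic fit (Newton-Raphson).
X = np.column_stack([np.ones(n), male, low_income])
b = np.zeros(3)
for _ in range(25):
    mu = 1 / (1 + np.exp(-(X @ b)))
    H = (X.T * (mu * (1 - mu))) @ X          # observed information matrix
    b += np.linalg.solve(H, X.T @ (completed - mu))

# Completers are weighted by the inverse of their modeled completion probability;
# non-completers contribute no observations to the weighted analysis.
p_complete = 1 / (1 + np.exp(-(X @ b)))
ipc_weights = np.where(completed == 1, 1 / p_complete, 0.0)
print(ipc_weights[completed == 1].min() >= 1.0)
```

In the actual analysis, these weights were passed to the survey logistic regression so that completers with covariate profiles prone to attrition count more heavily.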


Table 4-1. Cross-tabulation of prospective and retrospective measures of age at first alcohol use. Columns give the prospective measure; rows give the retrospective measure. "Never" = "I have never had an alcoholic drink."

Retrospective measure     Never        <=8 yrs    9-10 yrs   11-12 yrs     13-14 yrs     15-16 yrs    17+ yrs
Never                     263 (25.2%)  0 (0.0%)   0 (0.0%)   92 (8.8%)     69 (6.6%)     13 (1.3%)    3 (0.3%)
8 years old or younger    0 (0.0%)     0 (0.0%)   0 (0.0%)   3 (0.3%)      3 (0.3%)      0 (0.0%)     0 (0.0%)
9 or 10 years old         0 (0.0%)     0 (0.0%)   0 (0.0%)   4 (0.4%)      0 (0.0%)      2 (0.2%)     1 (0.1%)
11 or 12 years old        0 (0.0%)     0 (0.0%)   0 (0.0%)   20 (1.9%)     9 (0.9%)      0 (0.0%)     1 (0.1%)
13 or 14 years old        1 (0.1%)     0 (0.0%)   0 (0.0%)   62 (5.9%)     41 (3.9%)     7 (0.7%)     1 (0.1%)
15 or 16 years old        1 (0.1%)     0 (0.0%)   1 (0.1%)   124 (11.9%)   117 (11.2%)   58 (5.6%)    3 (0.3%)
17 years or older         1 (0.1%)     0 (0.0%)   1 (0.1%)   58 (5.6%)     29 (2.8%)     26 (2.5%)    30 (2.9%)


Table 4-2. Final added-in-order model results predicting disagreement between the retrospective and prospective age at first alcohol use measures

Variable                                  Odds Ratio (95% CI)
Eligible for Free/Reduced Price Lunch     1.60 (1.13, 2.26)
Baseline Alcohol Use                      6.73 (3.95, 11.47)
12th Grade Alcohol Use                    3.90 (2.91, 5.23)
12th Grade Cigarette Use                  1.43 (1.09, 1.87)

Table 4-3. Predictors of the retrospective age at first alcohol use measure indicating a later age at first alcohol use

Variable                                  Odds Ratio (95% CI)
Eligible for Free/Reduced Price Lunch     0.76 (0.52, 1.13)
Baseline Alcohol Use                      1.30 (0.86, 1.98)
12th Grade Alcohol Use                    23.53 (15.71, 35.24)
12th Grade Cigarette Use                  3.39 (2.35, 4.88)


Figure 4-1. Hypothesized covariate chronology. Demographics (gender, race, age, family structure, free/reduced price lunch) precede baseline substance use (alcohol, cigarette, and marijuana use), which precedes 12th grade substance use (alcohol, cigarette, and marijuana use).


CHAPTER 5
COMPARING METHODS OF MISCLASSIFICATION CORRECTION IN STUDIES OF ADOLESCENT ALCOHOL USE

Literature Review

The literature has shown that misclassification in the measurement of adolescent alcohol use is a consistent source of bias. Although other forms of data collection are being advanced in the measurement of alcohol use,57,118 adolescent alcohol research nevertheless remains primarily driven by self-report surveys.55 These self-report measures are typically retrospective, allowing for a greater possibility of misclassification due to recall bias.79 Despite these concerns, self-report measures are still the most practical means of obtaining information on the frequency, amount, and age of adolescent alcohol use.55 However, in circumstances that allow for a subset of study participants' alcohol use to be measured with little or no error, statistical procedures exist that incorporate information from these validation subsamples to reduce bias in the estimated effects of alcohol use measures on various outcomes. While the type of gold standard measure necessary for these designs is rare in the study of adolescent alcohol use, biomarkers for alcohol use continue to improve. As alcohol biomarkers improve, the ability of researchers to include validation sub-samples as part of their study design will also increase. Until validated biomarkers are inexpensive enough to replace survey responses for all study participants, it remains feasible to measure only a sub-sample of participants with a biological measure. As a result, it is important for researchers to understand how to efficiently incorporate validity sub-study data into the design and analysis of adolescent alcohol use outcomes. Existing methods have been developed to account for bias due to binary misclassification. Methods to deal with misclassification of binary variables have varied


greatly in their sophistication and underlying assumptions, including probabilistic sensitivity analyses,83 regression calibration,84-87 maximum likelihood,88-90 a variety of nonparametric methods,91,92 Bayesian methods,93-95 and multiple imputation methods.96 It is uncommon to see these methods implemented in practice. For some methods this may be due to the complexity of implementation. However, several methods are easily implemented via existing software, including regression calibration, multiple imputation for measurement error (MIME), and probabilistic sensitivity analysis (PSA). For researchers to make the best use of validation sub-study data to correct for misclassification, they will need to understand how the available methods correct for misclassification and their utility in the different scenarios experienced in practice. Of particular importance is how the methods perform with respect to sample size and the nature of the misclassification. Methods that perform well only at high sample sizes, or only when the data are non-differentially misclassified with respect to the outcome, may be of limited utility. This study provides a summary of three procedures for misclassification correction that are accessible and easy to implement: regression calibration, MIME, and PSA. Next, we present results of a simulation study on the differential effectiveness of these methods and make recommendations on when each is best used.

Methods

Summary of Available Methods

Regression calibration

Regression calibration works by first fitting a primary logistic regression model in the whole sample relating the misclassified exposure to the outcome of interest:

E(Y=1 | W=w) = b0 + b1*w


where W is the misclassified exposure and Y is the outcome of interest. The coefficient b1 is then corrected by the formulas:

b1t = b1/a1 and Var(b1t) = (1/a1^2)*Var(b1) + (b1^2/a1^4)*Var(a1)

where a1 is estimated by the regression:

E(X | W=w) = a0 + a1*w

where X is the gold standard measurement of the exposure. Unlike the other methods discussed, regression calibration is not compatible with an assumption of differential misclassification. This method can be easily implemented in SAS by hand or by means of the SAS macro %BLIN.

Multiple imputation for measurement error

For binary misclassification, MIME works by first fitting a logistic regression model in the validation sample relating the misclassified exposure to the gold standard exposure measure:

E(X=1 | W=w, Y=y) = b0 + b1*W + b2*Y + b3*W*Y

where X is the gold standard exposure measure in the validation sample, W is the misclassified exposure, and Y is the outcome of interest. The term W*Y is included in the model to account for differential misclassification; we include this interaction in all scenarios in order to account for differences in the sensitivity and specificity by outcome due to random error. From the estimated parameters and their covariance matrix, draws from a multivariate normal distribution for the coefficients are made for each imputation. For each of k imputations, let Zk equal the imputed exposure measure. Set Zk equal to X for those in the validation sample. For those not in the validation sub-sample, draw Zk from Bernoulli(pkwy), where pkwy = 1/(1 + exp(-(b0k + b1k*w + b2k*y + b3k*w*y))). Finally, run the primary analysis in each imputation set and


combine using standard multiple imputation procedures. This method can be easily implemented in SAS by modifying the code provided by Cole et al. Our implementation uses 40 imputations, as suggested by Cole et al.

Probabilistic sensitivity analysis

The probabilistic sensitivity analysis method attempts to recreate the data that would have existed had the exposure variable not been misclassified. It does this by simulating from a set of external distributions for the sensitivity and specificity of the misclassified measure. For non-differential misclassification, two external distributions are chosen: one for the specificity and one for the sensitivity of the measure. For differential misclassification, four external distributions are specified: the sensitivity and specificity among cases, and the sensitivity and specificity among controls. A single iteration of the simulation draws a set of sensitivity and specificity values from these distributions. These values are then used to calculate the positive predictive value (PPV) and negative predictive value (NPV) of exposure classification, which are applied to the corresponding individual records. Next, a random number is generated from a Uniform(0,1) distribution for each record. If this number is larger than the record's probability of being correctly classified, the record is reclassified. Finally, a logistic regression is run on the newly classified data, and a summary log odds ratio is calculated. To account for random error, the standard error of the conventional log odds ratio is calculated; then a value is sampled from a standard normal distribution. The product of this standard normal deviate and the conventional standard error is subtracted from the reclassified log odds ratio. This process is then repeated many times, resulting in a distribution of odds ratios adjusted for both random and systematic error.
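A single reclassification iteration of this procedure can be sketched as follows. This is an illustrative Python sketch rather than the %SENSEMAC macro itself: the triangular supports, the data, and the stratum-wise back-calculation of prevalence are hypothetical choices, and the repetition loop and random-error subtraction step are omitted.

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical misclassified exposure W and binary outcome Y.
n = 1000
y = rng.binomial(1, 0.3, n)
w = rng.binomial(1, 0.4, n)

def one_iteration(w, y, rng):
    """One PSA iteration: draw se/sp by outcome status, reclassify records."""
    w_new = w.copy()
    for grp in (0, 1):  # separate distributions by outcome (differential case)
        se = rng.triangular(0.75, 0.85, 0.95)   # hypothetical sensitivity support
        sp = rng.triangular(0.80, 0.90, 0.99)   # hypothetical specificity support
        idx = y == grp
        # Back-calculate the true prevalence in this stratum, then PPV and NPV.
        prev = (w[idx].mean() + sp - 1) / (se + sp - 1)
        prev = min(max(prev, 0.0), 1.0)
        ppv = se * prev / (se * prev + (1 - sp) * (1 - prev))
        npv = sp * (1 - prev) / (sp * (1 - prev) + (1 - se) * prev)
        # Reclassify records whose uniform draw exceeds P(correctly classified).
        u = rng.uniform(size=idx.sum())
        p_correct = np.where(w[idx] == 1, ppv, npv)
        w_new[idx] = np.where(u > p_correct, 1 - w[idx], w[idx])
    return w_new

w_adj = one_iteration(w, y, rng)
print(w_adj.shape)
```

A full analysis would repeat this iteration many times, refit the logistic regression on each reclassified dataset, and subtract the random-error term described above before summarizing the distribution of adjusted odds ratios.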


The external distributions used by the probabilistic sensitivity method can be specified in a number of ways. Fox, Lash, and Greenland's implementation of this method, the SAS macro %SENSEMAC, allows the misclassification rates to be specified under a uniform, triangular, or trapezoidal distribution.83 Fox and associates' discussion of the uses of this method does not explicitly include situations where a validation sub-study is present. In our implementation using validation sub-samples, the triangular distribution is specified, with the peak determined by the misclassification rates calculated within the validation sample and the upper and lower bounds set to 1.96 standard deviations above and below the peak. We allow for separate sensitivity and specificity distributions by outcome status in all scenarios to allow for differences in the sensitivity and specificity due to random error.

Simulation Design

The purpose of the simulation study is to compare the relative performance of the three correction methods described with respect to bias, standard error, mean squared error, and interval coverage. To achieve this goal, 2,000 simulation datasets were generated for each of four misclassification scenarios at three sample sizes. Two non-differential misclassification scenarios were tested: a high sensitivity and specificity scenario, where both were set to 0.9, and a low sensitivity and specificity scenario, where both were set to 0.6. Two differential misclassification scenarios were also tested: high sensitivity and specificity in the cases with high sensitivity and low specificity in the controls, and low sensitivity and specificity in the cases with low sensitivity and high specificity in the controls. Each of these scenarios was tested at sample sizes of 200, 1000, and 2000. We restricted the number of misclassification scenarios due to the


large amount of computing time required for each scenario. The scenarios were chosen to demonstrate the performance of each method in best- and worst-case scenarios with regard to misclassification rates. Each simulation dataset consists of the following variables: X, Y, V, and W. X is the exposure measured without error, simulated from a Bernoulli(p1) distribution. Y is the outcome measured without error, constructed so that regressing Y on X via logistic regression yields a true odds ratio of 2.0. This is done by first calculating the linear predictor for each record, Lpredi = Xiβ. From Lpred, a probability P is calculated as P = exp(Lpred)/(1 + exp(Lpred)). Next, P is compared to a draw Pu from a Uniform(0,1) distribution. If P > Pu then Y is set to 1; if P < Pu then Y is set to 0. This process is repeated for each record until the desired sample size is achieved. To simulate the presence of a validation sub-study, the variable V is constructed by randomly sampling 30% of the records in a simulation dataset; those sampled have V set to the corresponding value of X, while those not sampled have V set to missing. W represents the misclassified exposure. W is constructed from X by purposefully misclassifying X according to the specified sensitivity and specificity values. For each of the 2,000 simulation datasets in the scenarios described, we estimated the bias, standard error, mean squared error, and interval coverage of each method. The bias for each method was calculated as the difference between the log odds estimate from the correction method and the true log odds. The standard errors of the naïve approach, regression calibration, and MIME are obtained directly


from the model output. The standard error for PSA was calculated by treating the resulting interval as a normal-theory confidence interval and back-calculating the standard error from the upper and lower interval estimates. The mean squared error for each method was calculated by squaring the bias and summing it with the variance calculated from the standard error. Interval coverage was calculated as the proportion of the generated intervals that contained the true log odds. In addition to the misclassification correction procedures described, we also include results of analyses using the misclassified exposure without adjustment for comparison.

Results

Table 5-1 presents results for our simulations of the naïve analysis as well as the analyses for the misclassification correction methods. Overall, the misclassification correction methods performed best at higher sample sizes and lower rates of misclassification. The results for each method are presented in detail below.

Regression Calibration

In terms of MSE, regression calibration outperformed the naïve analysis in the non-differential misclassification scenarios with sample sizes of 1000 and 2000. Regression calibration outperformed both MIME and PSA when the sensitivity and specificity were highest (0.9, 0.9), regardless of sample size. At a sample size of 200, regression calibration produced less bias than the naïve approach; however, it produced much higher standard errors. The resulting MSEs showed worse performance for the regression calibration approach compared to the naïve approach at a sample size of 200.


Unsurprisingly, regression calibration performed poorly in all of the differential misclassification scenarios. Regression calibration produced more biased and less precise estimates than the naïve approach in every differential misclassification scenario. The confidence intervals also showed poor coverage under differential misclassification.

MIME

MIME was unable to produce any estimates when the sample size was 200, due to a lack of convergence in the imputation model. MIME produced practically unbiased estimates in all scenarios with sample sizes of 1000 and 2000. MIME also showed adequate interval coverage in all scenarios. In terms of MSE, MIME outperformed the naïve analysis in all but the high sensitivity and specificity scenario with a sample size of 1000. MIME produced results similar in bias to regression calibration in all of the non-differential misclassification scenarios. For non-differential misclassification, MIME produced smaller standard errors than regression calibration when the sensitivity and specificity were low, and larger standard errors when the sensitivity and specificity were high. As a result, performance measured by MSE favored regression calibration when sensitivity and specificity were high and MIME when sensitivity and specificity were low. MIME performed better than regression calibration on all metrics in all differential misclassification scenarios. In terms of MSE, MIME performed better than PSA in all applicable scenarios. PSA produced less biased estimates in the non-differential scenarios when the sensitivity and specificity were both high. However, MIME consistently produced lower standard errors than PSA in all scenarios.


PSA

PSA was unable to produce estimates when the sample size was 200, due to the sparseness of the crosstab cells used to estimate the sensitivity and specificity of the sample. At sample sizes of 1000 and 2000, PSA consistently produced less biased estimates than the naïve approach. However, due to the large standard errors of the PSA approach, the naïve approach outperformed PSA in terms of MSE in most scenarios. The PSA method had a lower MSE than the naïve approach only in the differential misclassification scenario in which the specificity of the controls was low while the sensitivity of the controls and the sensitivity and specificity of the cases were high. As a result of its large standard errors, the PSA intervals contained the true value 100% of the time.

In the non-differential misclassification scenarios, PSA produced similar or worse bias reduction compared to regression calibration. However, because of the large standard errors resulting from PSA, regression calibration produced lower MSEs in all non-differential scenarios. In the differential misclassification scenarios, PSA produced lower MSEs than regression calibration due to PSA's substantially lower bias.

In all scenarios, PSA produced similar or worse bias reduction compared to MIME. Due to the large standard errors resulting from PSA, MIME consistently performed better than PSA in terms of MSE and interval coverage in all scenarios.

Discussion

To highlight the strengths and weaknesses of each method, recommendations are made based on the structure of the data available. Since the sample size of a study is the parameter perhaps most readily apparent to the researcher,


recommendations are stratified first by sample size and then by the nature of the misclassification. These recommendations provide needed guidance for researchers deciding if and how to use validation subsamples in the design and analysis of their studies.

When sample sizes are low, regression calibration is the only method able to produce an estimate adjusted for misclassification. It is worth noting, however, that regression calibration does not perform well unless the misclassification is non-differential. At lower sample sizes, if the exposure misclassification is expected to be differential, our results indicate that the researcher is better off analyzing the misclassified exposure without any adjustment. At lower sample sizes, when the exposure misclassification is expected to be non-differential, regression calibration produced less biased estimates; however, because of its higher standard errors, regression calibration may best be employed as a sensitivity analysis rather than the primary analysis.

When sample sizes are large enough to enable procedures other than regression calibration, MIME provides a flexible alternative for misclassification correction. When the exposure is misclassified non-differentially, MIME produces estimates similar to regression calibration; in the non-differential scenarios tested, regression calibration performed slightly better than MIME in some instances and slightly worse in others. In the high sample-size differential misclassification scenarios, however, MIME clearly outperformed the other tested methods. MIME has the additional benefit of being easy to modify for misclassification of non-binary variables. Based on our results, though, MIME is unsuitable for misclassification correction when sample sizes are low.
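For reference, the flavor of the PSA procedure evaluated above can be sketched at the summary level: draw sensitivity and specificity from prior distributions (which a validation sub-study can inform), back-correct the observed 2x2 cell counts, and summarize the distribution of corrected estimates with a percentile interval. The counts and Beta priors below are hypothetical, and this sketch omits the record-level correction and random-error step of the full procedure of Fox and colleagues83.

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical observed (misclassified) 2x2 counts.
a, b = 300, 700   # exposed* among cases, exposed* among controls
c, d = 200, 800   # unexposed* among cases, unexposed* among controls

def corrected_count(exposed_obs, n_group, sens, spec):
    """Back-correct an observed exposed count for misclassification."""
    return (exposed_obs - (1 - spec) * n_group) / (sens + spec - 1)

sims = []
for _ in range(5000):
    # Priors for sensitivity and specificity, e.g. informed by a validation
    # sub-study; separate case/control draws would allow differential error.
    sens = rng.beta(90, 10)
    spec = rng.beta(90, 10)
    a1 = corrected_count(a, a + c, sens, spec)
    b1 = corrected_count(b, b + d, sens, spec)
    c1, d1 = (a + c) - a1, (b + d) - b1
    if min(a1, b1, c1, d1) <= 0:
        continue   # discard draws incompatible with the observed data
    sims.append(float(np.log(a1 * d1 / (b1 * c1))))

interval = np.percentile(sims, [2.5, 97.5])   # simulation interval
```

The spread of the simulation interval relative to a conventional confidence interval is exactly the behavior that drove PSA's large MSEs in our results.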


PSA performed unexpectedly poorly in comparison to regression calibration and MIME. While it consistently produced less biased estimates than the naïve approach, the width of PSA's simulation-based interval was much larger than the confidence intervals of the other methods. The intervals produced in our simulation study were similar in width to the example shown by Fox and associates83. One notable difference between the example in Fox et al. and our simulations is the magnitude of the true effect estimate: Fox and associates' true effect was an odds ratio of approximately 3.0, while ours was an odds ratio of 2.0. This difference partially explains why our simulations did not find results as favorable as those shown by Fox. The only situation in which PSA provided estimates comparable to the other methods was when the misclassification rates were already very low. While PSA may be useful purely as a sensitivity analysis, we cannot recommend its use as a method to correct for misclassification.

The primary limitation of all of these procedures is the necessity of unbiased estimates of the relationship between the misclassified exposure and a gold-standard exposure measure. For regression calibration and MIME, this requires a validation subsample in which the misclassified measure can be compared to the gold standard. For PSA, the sensitivity and specificity of the misclassified measure can either be calculated directly from a validation subsample or assumed based on prior literature. When no gold standard exists, there is no guarantee that any of these methods will reduce bias in the estimation of the exposure-outcome relationship. If these methods are used to correct for misclassification when the assumed gold standard is itself misclassified, it is possible to increase bias in the estimation of exposure-outcome


relationships. If the assumed gold standard cannot be taken to be measured without error, then results from the misclassification correction procedures should be treated as sensitivity analyses and not as definitive in and of themselves.

In conclusion, validity subsamples are an efficient way to decrease the bias resulting from commonly misclassified exposure measures. The importance of these methods to adolescent alcohol research will only continue to grow as biomarkers and other validated measurements used to assess alcohol use improve. When sample sizes are high, MIME is a particularly effective way to account for exposure misclassification. Future research should focus on improving accessible methods for misclassification correction for studies with small sample sizes.
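The hierarchy developed in this chapter can be condensed into a small decision rule. This is only an illustrative encoding of the recommendations above; the sample-size cut points reflect the simulated scenarios, which tested n = 200, 1000, and 2000, not validated thresholds.

```python
def recommended_method(n: int, differential: bool) -> str:
    """Illustrative encoding of this chapter's recommendations for
    analyzing a misclassified binary exposure with a validation subsample."""
    if n <= 200:
        if differential:
            # At small n, correction methods either fail or add bias.
            return "analyze the misclassified exposure without adjustment"
        return "regression calibration, as a sensitivity analysis"
    # Roughly n >= 1000 in the simulated scenarios.
    return "MIME (multiple imputation for measurement error)"
```

Intermediate sample sizes were not simulated, so applying this rule between 200 and 1000 requires judgment.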


Table 5-1. Simulation results comparing methods of misclassification correction

                    Cases       Controls     Complete data (naïve)         Regression calibration
                    Sens, Spec  Sens, Spec   Bias   SE    MSE    Cover.    Bias   SE    MSE    Cover.
N=200
  Non-differential  0.6, 0.6    0.6, 0.6    -0.63  0.28  0.48   0.45      -0.16  2.68  11     0.99
                    0.9, 0.9    0.9, 0.9    -0.24  0.33  0.18   0.89      -0.02  0.51  0.35   0.96
  Differential      0.6, 0.6    0.6, 0.9     0.68  0.34  0.58   0.46       4.67  2.64  30.72  0.67
                    0.9, 0.9    0.9, 0.6    -1.44  0.30  2.16   0         -2.36  0.74  6.16   0.08
N=1000
  Non-differential  0.6, 0.6    0.6, 0.6    -0.61  0.13  0.39   0         -0.06  1.06  1.81   0.98
                    0.9, 0.9    0.9, 0.9    -0.24  0.15  0.08   0.64      -0.01  0.23  0.07   0.95
  Differential      0.6, 0.6    0.6, 0.9     0.68  0.15  0.57   0          4.67  1.15  23.12  0
                    0.9, 0.9    0.9, 0.6    -1.43  0.13  2.05   0         -2.37  0.34  5.73   0
N=2000
  Non-differential  0.6, 0.6    0.6, 0.6    -0.60  0.09  0.37   0         -0.03  0.73  0.84   0.98
                    0.9, 0.9    0.9, 0.9    -0.24  0.10  0.07   0.38      -0.01  0.16  0.03   0.95
  Differential      0.6, 0.6    0.6, 0.9     0.68  0.11  0.48   0          4.65  0.81  22.30  0
                    0.9, 0.9    0.9, 0.6    -1.43  0.09  2.04   0         -2.35  0.24  5.58   0

SE = standard error; Cover. = interval coverage.


Table 5-1. Continued

                    Cases       Controls     MIME                          Probabilistic sensitivity analysis
                    Sens, Spec  Sens, Spec   Bias   SE    MSE    Cover.    Bias   SE    MSE    Cover.
N=200
  Non-differential  0.6, 0.6    0.6, 0.6     n/a    n/a   n/a    n/a       n/a    n/a   n/a    n/a
                    0.9, 0.9    0.9, 0.9     n/a    n/a   n/a    n/a       n/a    n/a   n/a    n/a
  Differential      0.6, 0.6    0.6, 0.9     n/a    n/a   n/a    n/a       n/a    n/a   n/a    n/a
                    0.9, 0.9    0.9, 0.6     n/a    n/a   n/a    n/a       n/a    n/a   n/a    n/a
N=1000
  Non-differential  0.6, 0.6    0.6, 0.6     0.00   0.31  0.13   0.96     -0.30   1.81  3.41   1
                    0.9, 0.9    0.9, 0.9    -0.04   0.26  0.09   0.97      0.00   0.43  0.18   1
  Differential      0.6, 0.6    0.6, 0.9     0.01   0.29  0.12   0.95      0.12   1.40  1.97   1
                    0.9, 0.9    0.9, 0.6    -0.06   0.28  0.11   0.96     -0.06   1.15  1.33   1
N=2000
  Non-differential  0.6, 0.6    0.6, 0.6     0.00   0.21  0.06   0.95     -0.21   1.46  2.17   1
                    0.9, 0.9    0.9, 0.9    -0.01   0.18  0.04   0.96      0.00   0.27  0.08   1
  Differential      0.6, 0.6    0.6, 0.9     0.00   0.20  0.06   0.95      0.04   0.98  0.97   1
                    0.9, 0.9    0.9, 0.6    -0.01   0.19  0.05   0.96     -0.01   0.77  0.59   1

SE = standard error; Cover. = interval coverage; n/a = no estimates produced at N=200 (see text).


CHAPTER 6
CONCLUSIONS

Misclassification is a threat to the validity of measurement of adolescent alcohol use that warrants attention. If studies of adolescent alcohol use do not account for misclassification in the analysis and interpretation of their results, the chance of drawing incorrect conclusions increases. The purpose of this dissertation was to evaluate the extent of misclassification due to survey modality and recall bias, and to examine available methods for misclassification correction. Brief summaries of the results of each study are presented below. The results of the dissertation are then used to develop a series of recommendations for the best available practices to account for bias due to misclassification. Finally, future areas of research arising from the results are discussed.

Accomplishments of the Dissertation

The first study of this dissertation assessed the effect of survey modality on responses to alcohol use questions. We found no statistically significant survey modality effects overall and no differential effects of survey modality by assigned intervention group. Based on these results, we concluded that there was little difference in self-reported alcohol use across mailed paper, in-school paper, and web-based surveys among high school-aged participants in a large-scale alcohol prevention trial. These findings demonstrate the potential benefit of using mixed-modality surveys for long-term follow-up of high school-aged participants. In the long-term follow-up of students from the Project Northland Chicago trial, including multiple modalities served to increase the response rate and sample size of the follow-up survey. Our study


demonstrated that this added sample size came with little to no measurement error due to differing survey modalities.

The second study of this dissertation investigated the predictors of misclassification in age at first alcohol use due to recall bias. In this study, we considered disagreement between a prospective and a retrospective measure of age at first alcohol use to be evidence of misclassification. Among participants whose prospective and retrospective measures disagreed, we further assessed whether the prospective measure indicated an earlier or later age of onset than the retrospective measure. We found that eligibility for free or reduced-price lunch in 6th grade, alcohol use in 6th grade, cigarette use in 12th grade, and alcohol use in 12th grade were significantly associated with recall error for age of alcohol use onset. We also found that self-reported substance use (alcohol and tobacco) in 12th grade predicted a later self-reported age at first alcohol use when measured retrospectively. These results indicate that recall error is higher among high-risk youth.

The third study of this dissertation reviewed existing accessible methods for the correction of misclassification and how these methods might be applied in the study of adolescent alcohol use, and presented results from a simulation study so that researchers know when each method is best applied. The methods presented included regression calibration, multiple imputation for measurement error96, and a version of Fox's probabilistic sensitivity analysis83 modified to incorporate data from a validation sub-study. Compared to unadjusted analyses, regression calibration provided less biased estimates of the exposure-outcome relationship when the misclassification was non-differential. At high enough sample sizes, MIME produced unbiased and relatively


efficient corrections for exposure misclassification. PSA provided little utility for correcting misclassification in binary exposure variables. Based on our results, we concluded that at sample sizes of 200, regression calibration was a useful tool for sensitivity analysis with misclassified exposures, while at sample sizes of 1000 and greater, MIME was a flexible procedure that efficiently reduced bias from misclassification of exposure variables.

Limitations of the Dissertation

These studies were not without their limitations. Studies one and two of this dissertation relied solely upon a single data source: the Project Northland Chicago (PNC) trial. Relying solely on data from PNC presents two limitations. First, since the samples used in this dissertation were drawn from PNC, they are most applicable to an urban setting with a majority-minority population. As a result, the generalizability of the results of studies one and two cannot be ascertained without future research. Second, PNC was not specifically designed to test the hypotheses under investigation in studies one and two. With respect to study one, participants were allowed to self-select into the modality of their choice in order to maximize the response rate of PNC's follow-up. Because participants were not randomized to their survey modality, we cannot be sure whether the results reflect a true effect of modality or the result of selection factors. While our analysis in study one reflects a best attempt to isolate modality effects from selection effects, it is possible that our estimates are confounded by unmeasured variables. With respect to study two, the design of PNC did not allow for a completely prospective measure of age at first alcohol use.

Study three had all of its data generated as part of a set of simulations. As a result, study three suffers from the same limitation that all simulation studies share: we


could not reasonably explore all of the situations that researchers may encounter when attempting to apply misclassification correction methods in practice. Of particular concern is how the different misclassification correction methods perform when the validation sub-study data are measured with error and when error in the sub-study is correlated with error in the primary measurement.

Contributions and Recommendations

Despite these limitations, this dissertation contributes new findings to the literature on misclassification of adolescent alcohol use. Collectively, studies one and two highlight where researchers should focus when attempting to minimize misclassification in the design of their studies. Specifically, when using survey data on adolescent alcohol use, the design and validation of the questionnaire appear to be far more important than the mode in which the questionnaire is delivered. The effects seen in study one are small enough not to be of substantive importance when compared to other causes of measurement error, such as recall bias39. This is underscored by the fact that study two found that only 39% of respondents correctly reported their age at first alcohol use using retrospective recall.

In order to both maximize statistical power and minimize measurement error when designing new studies of adolescent alcohol use, we make the following recommendations based on our results. First, researchers should make use of multimodal surveys to increase their response rates. The relative contribution of mixing survey modes to bias in the study of adolescent alcohol use is small compared to other sources of error. However, when possible, researchers should test the assumption of no modality effects in their own data. As adolescent alcohol investigators continue to develop novel ways of delivering surveys to participants, it will be important for


researchers to continue to test for the presence of modality effects when using new technology. This may be of particular importance with the rise of smartphones and tablet computers, as participants' sense of privacy on these more personal devices may differ from that in the survey modes commonly in current use. In intervention trials of adolescent alcohol use, intervention components delivered electronically will be more and more commonly experienced by participants on their phones and tablets. If multimodal surveys that include smartphone or tablet surveys are used in conjunction with electronic delivery of intervention components, then the interaction of survey mode and intervention status should be assessed.

Second, researchers should seek to minimize measurement error by designing studies that limit reliance on retrospective questionnaires. Studies that follow participants' behavior prospectively will greatly reduce the probability of measurement error. When this is not feasible, researchers should be aware of how measurement error may affect their particular sample. Our results indicate that early adolescent alcohol users and current alcohol users are more likely to be a source of measurement error. Researchers whose study samples include these types of participants should present both their primary results and a sensitivity analysis showing the extent of measurement error necessary to invalidate their findings.

Third, as new validated measures of alcohol use become available, researchers should incorporate them into their study designs. If valid measures of alcohol use are available but too costly or intrusive to use on the entire study sample, then researchers should apply the validated measures to a subsample of the study so that this information can be used to correct for measurement error in the full sample. Study three


demonstrates novel ways to incorporate validation data from sub-studies that are efficient from both a statistical and a budgetary point of view. Based on our results, we recommend the use of MIME for misclassification correction using validation sub-study data because of its flexibility in allowing for both differential and non-differential misclassification.

Future Research

One avenue of future research is testing the performance of MIME when predictors of misclassification are known outside of the validation subsample. Study two provides a clear example, in which we assessed predictors of misclassification for age at first alcohol use. When predictors of misclassification are measured, they can easily be included as part of the imputation model during the MIME process. The effect of including misclassification predictors in MIME's imputation model could be tested in a variety of settings. Of particular interest is whether these added variables could make MIME a viable method for misclassification correction when the validation data are themselves measured with error.

In order to improve the measurement of adolescent alcohol use, future research will need to focus on improving biomarkers for alcohol use in the contexts found in the study of adolescent use. While useful in limited contexts, current biomarkers are difficult to apply to adolescent alcohol research, which depends heavily on survey items that assess the frequency of drinking episodes and the amount drunk per episode. No current biomarkers are capable of assessing the validity of these types of variables55. We echo the call made by Freeman and Vrana to improve the state of biomarkers for alcohol use so that research on adolescent alcohol use can continue to grow and improve119.


REFERENCES

1. Johnston LD, O'Malley PM, Bachman JG, Schulenberg JE. Monitoring the Future national results on adolescent drug use: Overview of key findings, 2010. Ann Arbor: Institute for Social Research, The University of Michigan; 2011.

2. Greenblatt JC. Patterns of alcohol use among adolescents and associations with emotional and behavioral problems. Substance Abuse and Mental Health Services Administration; 2000.

3. Hingson R, Heeren T, Levenson S, Jamanka A, Voas R. Age of drinking onset, driving after drinking, and involvement in alcohol related motor-vehicle crashes. Accid Anal Prev. 2002 Jan;34(1):85-92.

4. Hingson R, Heeren T, Zakocs R. Age of drinking onset and involvement in physical fights after drinking. Pediatrics. 2001 Oct;108(4):872-7.

5. Warner LA, White HR. Longitudinal effects of age at onset and first drinking situations on problem drinking. Subst Use Misuse. 2003 Dec;38(14):1983-2016.

6. DiClemente RJ, Lodico M, Grinstead OA, Harper G, Rickman RL, Evans PE, et al. African-American adolescents residing in high-risk urban environments do use condoms: Correlates and predictors of condom use among adolescents in public housing developments. Pediatrics. 1996 Aug;98(2):269-78.

7. Ellickson PL, Tucker JS, Klein DJ. Ten-year prospective study of public health problems associated with early drinking. Pediatrics. 2003 May;111(5 Pt 1):949-55.

8. Epstein L, Tamir A. Health-related behavior of adolescents: Change over time. J Adolescent Health. 1984;5(2):91-5.

9. Coie JD, Watt NF, West SG, Hawkins JD, Asarnow JR, Markman HJ, et al. The science of prevention: A conceptual framework and some directions for a national research program. Am Psychol. 1993 Oct;48(10):1013-22.

10. Brownson RC, Newschaffer CJ, Ali-Abarghoui F. Policy research for disease prevention: Challenges and practical recommendations. American Journal of Public Health. 1997 May;87(5):735-9.

11. Capwell EM, Butterfoss F, Francisco VT. Why evaluate? Health Promotion Practice.
2000;1:15-20.

12. Lezine DA, Reed GA. Political will: A bridge between public health knowledge and action. American Journal of Public Health. 2007 Nov;97(11):2010-3.


13. Rossouw JE, Anderson GL, Prentice RL, LaCroix AZ, Kooperberg C, Stefanick ML, et al. Risks and benefits of estrogen plus progestin in healthy postmenopausal women: Principal results from the Women's Health Initiative randomized controlled trial. JAMA. 2002 Jul 17;288(3):321-33.

14. Prasad V, Gall V, Cifu A. Less is more: The frequency of medical reversal. Arch Intern Med. 2011 Oct 10;171(18):1675-6.

15. West SL, O'Neal KK. Project DARE outcome effectiveness revisited. American Journal of Public Health. 2004 Jun;94(6):1027-9.

16. Ennett ST, Tobler NS, Ringwalt CL, Flewelling RL. How effective is drug abuse resistance education? A meta-analysis of Project DARE outcome evaluations. American Journal of Public Health. 1994 Sep;84(9):1394-401.

17. Berg AO, Allan JD, Frame PS, Homer CJ, Johnson MS, Klein JD, et al. Screening for breast cancer: Recommendations and rationale. Ann Intern Med. 2002 Sep 3;137(5):344-6.

18. Calonge N. Screening for breast cancer: U.S. Preventive Services Task Force recommendation statement (vol 151, pg 716, 2009). Ann Intern Med. 2010 May 18;152(10):688.

19. Gomez SL, Glaser SL. Misclassification of race/ethnicity in a population-based cancer registry. Cancer Causes Control. 2006;17(6):771-81.

20. Center to Reduce Cancer Health Disparities (U.S.). Economic costs of cancer health disparities: Summary of meeting proceedings. Rockville, MD: Dept. of Health & Human Services, National Institutes of Health, National Cancer Institute; 2007.

21. White E, Armstrong BK, Saracci R. Principles of exposure measurement in epidemiology: Collecting, evaluating, and improving measures of disease risk factors. 2nd ed. Oxford; New York: Oxford University Press; 2008.

22. Schlesselman JJ, Stolley PD. Case-control studies: Design, conduct, analysis. New York: Oxford University Press; 1982.

23. Denscombe M.
Web-based questionnaires and the mode effect: An evaluation based on completion rates and data contents of near-identical questionnaires delivered in different modes. Soc Sci Comput Rev. 2006 Sum;24(2):246-54.

24. Eaton DK, Brener ND, Kann L, Denniston MM, McManus T, Kyle TM, et al. Comparison of paper-and-pencil versus Web administration of the Youth Risk Behavior Survey (YRBS): Risk behavior prevalence estimates. Evaluation Rev. 2010 Apr;34(2):137-53.


25. Hancock DR, Flowers CP. Comparing social desirability responding on World Wide Web and paper-administered surveys. ETR&D-Educ Tech Res. 2001;49(1):5-13.

26. Layne BH, DeCristoforo JR, McGinty D. Electronic versus traditional student ratings of instruction. Res High Educ. 1999 Apr;40(2):221-32.

27. Lucia S, Herrmann L, Killias M. How important are interview methods and questionnaire designs in research on self-reported juvenile delinquency? An experimental comparison of Internet vs. paper-and-pencil questionnaires and different definitions of the reference period. Journal of Experimental Criminology. 2007;3:39-64.

28. Olsen DR, Wygant SA, Brown BL. Entering the next millennium with Web-based assessment: Considerations of efficiency and reliability. Conference of the Rocky Mountain Association for Institutional Research. Las Vegas, NV; 1999.

29. Tomsic ML, Hendel DD, Matross RP. A World Wide Web response to student satisfaction surveys: Comparisons using paper and Internet formats. Annual Meeting of the Association for Institutional Research. Cincinnati, OH; 2000.

30. van de Looij-Jansen PM, de Wilde EJ. Comparison of web-based versus paper-and-pencil self-administered questionnaire: Effects on health indicators in Dutch adolescents. Health Serv Res. 2008 Oct;43(5):1708-21.

31. Wu Y, Newfield SA. Comparing data collected by computerized and written surveys for adolescence health research. J School Health. 2007 Jan;77(1):23-8.

32. Bates SC, Cox JM. The impact of computer versus paper-pencil survey, and individual versus group administration, on self-reports of sensitive behaviors. Comput Hum Behav. 2008 May;24(3):903-16.

33. Burr MA, Levin KY, Becher A. Examining web vs. paper mode effects in a federal government customer satisfaction study. Annual Conference of the American Association for Public Opinion Research. Montreal, Canada; 2001.

34. Carini R, Hayek JC, Kuh GD, Kennedy JM, Ouimet JA.
College student responses to Web and paper surveys: Does mode matter? Res High Educ. 2003 Feb;44(1):1-19.

35. Denniston MM, Brener ND, Kann L, Eaton DK, McManus T, Kyle TM, et al. Comparison of paper-and-pencil versus Web administration of the Youth Risk Behavior Survey (YRBS): Participation, data quality, and perceived privacy and anonymity. Comput Hum Behav. 2010 Sep;26(5):1054-60.

36. Dillman DA. Mail and Internet Surveys: The Tailored Design Method. 2nd ed. New York: John Wiley and Sons; 2000.


37. Sax LJ, Gilmartin SK, Bryant AN. Assessing response rates and nonresponse bias in web and paper surveys. Res High Educ. 2003 Aug;44(4):409-32.

38. Turner CF, Ku L, Rogers SM, Lindberg LD, Pleck JH, Sonenstein FL. Adolescent sexual behavior, drug use, and violence: Increased reporting with computer survey technology. Science. 1998 May 8;280(5365):867-73.

39. Brener ND, Billy JOG, Grady WR. Assessment of factors affecting the validity of self-reported health-risk behavior among adolescents: Evidence from the scientific literature. J Adolescent Health. 2003 Dec;33(6):436-57.

40. Bailey SL, Flewelling RL, Rachal JV. The characterization of inconsistencies in self-reports of alcohol and marijuana use in a longitudinal study of adolescents. J Stud Alcohol. 1992 Nov;53(6):636-47.

41. O'Malley PM, Bachman JG, Johnston LD. Reliability and consistency in self-reports of drug use. Int J Addict. 1983;18(6):805-24.

42. Bachman JG, O'Malley PM. When four months equal a year: Inconsistencies in student reports of drug use. Public Opin Quart. 1981;45(4):536-48.

43. Agampodi SB, Fernando S, Dharmaratne SD, Agampodi TC. Duration of exclusive breastfeeding; validity of retrospective assessment at nine months of age. BMC Pediatr. 2011;11:80.

44. Turner CF, Lessler JT, Gfroerer JC, National Institute on Drug Abuse Division of Epidemiology and Prevention Research, Research Triangle Institute. Survey measurement of drug use: Methodological studies. Rockville, MD: National Institute on Drug Abuse, U.S. Dept. of Health and Human Services, Public Health Service; 1992.

45. Williams CL, Toomey TL, McGovern P, Wagenaar AC, Perry CL. Development, reliability, and validity of self-report alcohol-use measures with young adolescents. J Child Adoles Subst. 1995;4(3):17-40.

46. Komro KA, Perry CL, Munson KA, Stigler MH, Farbakhsh K.
Reliability and validity of self-report measures to evaluate drug and violence prevention programs. J Child Adoles Subst. 2004;13(3):17-51.

47. Barnea Z, Rahav G, Teichman M. The reliability and consistency of self-reports on substance use in a longitudinal study. Brit J Addict. 1987 Aug;82(8):891-8.

48. Brown TL. Are adolescents accurate reporters of their alcohol use? Individual Differences Research. 2004;2:17-25.


49. Shillington AM, Clapp JD. Self-report stability of adolescent substance use: Are there differences for gender, ethnicity and age? Drug Alcohol Depen. 2000 Jul 1;60(1):19-27.

50. Brener ND, Kann L, McManus T, Kinchen SA, Sundberg EC, Ross JG. Reliability of the 1999 Youth Risk Behavior Survey questionnaire. J Adolescent Health. 2002 Oct;31(4):336-42.

51. Johnson TP, Mott JA. The reliability of self-reported age of onset of tobacco, alcohol and illicit drug use. Addiction. 2001 Aug;96(8):1187-98.

52. Parra GR, O'Neill SE, Sher KJ. Reliability of self-reported age of substance involvement onset. Psychol Addict Behav. 2003 Sep;17(3):211-8.

53. Engels RCME, Knibbe RA, Drop MJ. Inconsistencies in adolescents' self-reports of initiation of alcohol and tobacco use. Addict Behav. 1997 Sep-Oct;22(5):613-23.

54. Patrick DL, Cheadle A, Thompson DC, Diehr P, Koepsell T, Kinne S. The validity of self-reported smoking: A review and meta-analysis. American Journal of Public Health. 1994 Jul;84(7):1086-93.

55. Litten RZ, Bradley AM, Moss HB. Alcohol biomarkers in applied settings: Recent advances and future research opportunities. Alcohol Clin Exp Res. 2010 Jun;34(6):955-67.

56. Mundle G, Ackermann K, Gunthner A, Munkes J, Mann K. Treatment outcome in alcoholism: A comparison of self-report and the biological markers carbohydrate-deficient transferrin and gamma-glutamyl transferase. Eur Addict Res. 1999 Jun;5(2):91-6.

57. Kip MJ, Spies CD, Neumann T, Nachbar Y, Alling C, Aradottir S, et al. The usefulness of direct ethanol metabolites in assessing alcohol intake in nonintoxicated male patients in an emergency room setting. Alcohol Clin Exp Res. 2008 Jul;32(7):1284-91.

58. Kypri K, Gallagher SJ, Cashell-Smith ML. An Internet-based survey method for college student drinking research. Drug Alcohol Depen. 2004 Oct 5;76(1):45-53.

59. McCabe SE, Boyd CJ, Couper MP, Crawford S, D'Arcy H. Mode effects for collecting alcohol and other drug use data: Web and US mail.
J Stud Alcohol. 2002 Nov;63(6):755-61.

60. McCabe SE, Couper MP, Cranford JA, Boyd CJ. Comparison of Web and mail surveys for studying secondary consequences associated with substance use: Evidence for minimal mode effects. Addict Behav. 2006 Jan;31(1):162-8.


61. Miller ET, Neal DJ, Roberts LJ, Baer JS, Cressler SO, Metrik J, et al. Test-retest reliability of alcohol measures: Is there a difference between Internet-based assessment and traditional methods? Psychol Addict Behav. 2002 Mar;16(1):56-63.

62. Kays K, Gathercoal K, Buhrow W. Does survey format influence self-disclosure on sensitive question items? Comput Hum Behav. 2012;28:251-6.

63. Aquilino WS. Interview mode effects in surveys of drug and alcohol use: A field experiment. Public Opin Quart. 1994 Sum;58(2):210-40.

64. McCabe SE, Boyd CJ, Young A, Crawford S, Pope D. Mode effects for collecting alcohol and tobacco data among 3rd and 4th grade students: A randomized pilot study of Web-form versus paper-form surveys. Addict Behav. 2005 May;30(4):663-71.

65. Wang YC, Lee CM, Lew-Ting CY, Hsiao CK, Chen DR, Chen WJ. Survey of substance use among high school students in Taipei: Web-based questionnaire versus paper-and-pencil questionnaire. J Adolescent Health. 2005 Oct;37(4):289-95.

66. Gfroerer J, Wright D, Kopstein A. Prevalence of youth substance use: The impact of methodological differences between two national surveys. Drug Alcohol Depen. 1997 Jul 25;47(1):19-30.

67. Kann L, Brener ND, Warren CW, Collins JL, Giovino GA. An assessment of the effect of data collection setting on the prevalence of health risk behaviors among adolescents. J Adolescent Health. 2002 Oct;31(4):327-35.

68. Rootman I, Smart RG. A comparison of alcohol, tobacco and drug use as determined from household and school surveys. Drug Alcohol Depen. 1985;16(2):89-94.

69. Rothman KJ, Greenland S, Lash TL. Modern epidemiology. 3rd ed. Philadelphia: Wolters Kluwer Health/Lippincott Williams & Wilkins; 2008.

70. Simpura J, Poikolainen K. Accuracy of retrospective measurement of individual alcohol consumption in men: A reinterview after 18 years. J Stud Alcohol. 1983;44(5):911-7.

71. Collins LM, Graham JW, Hansen WB, Johnson CA.
Agreement between Retrospective Accounts of Substance Use and Earlier Reported Substance Use. Appl Psych Meas. 1985 Sep;9(3):301-9. 72. Feldman Y, Koren G, Mattice D, Shear H, Pellegrini E, Macleod SM. Determinants of Recall and Recall Bias in Studying Drug and Chemical-Exposure in Pregnancy. Teratology. 1989 Jul;40(1):37-45. 93


73. Liu SM, Serdula MK, Byers T, Williamson DF, Mokdad AH, Flanders WD. Reliability of alcohol intake as recalled from 10 years in the past. Am J Epidemiol. 1996 Jan 15;143(2):177-86.
74. Gmel G, Daeppen JB. Recall bias for seven-day recall measurement of alcohol consumption among emergency department patients: Implications for case-crossover designs. Journal of Studies on Alcohol and Drugs. 2007 Mar;68(2):303-10.
75. Caetano R, Babor TF. Diagnosis of alcohol dependence in epidemiological surveys: an epidemic of youthful alcohol dependence or a case of measurement error? Addiction. 2006 Sep;101:111-4.
76. Tourangeau R, Yan T. Sensitive questions in surveys. Psychol Bull. 2007 Sep;133(5):859-83.
77. Henry DB, Kobus K, Schoeny ME. Accuracy and bias in adolescents' perceptions of friends' substance use. Psychol Addict Behav. 2011 Mar;25(1):80-9.
78. Tipping S, Hope S, Pickering K, Erens B, Roth MA, Mindell JS. The effect of mode and context on survey results: Analysis of data from the Health Survey for England 2006 and the Boost Survey for London. BMC Med Res Methodol. 2010 Sep 27;10.
79. Sartor CE, Bucholz KK, Nelson EC, Madden PAF, Lynskey MT, Heath AC. Reporting bias in the association between age at first alcohol use and heavy episodic drinking. Alcohol Clin Exp Res. 2011 Aug;35(8):1418-25.
80. Jurek AM, Greenland S, Maldonado G, Church TR. Proper interpretation of nondifferential misclassification effects: expectations vs observations. Int J Epidemiol. 2005 Jun;34(3):680-7.
81. Olsen J, Christensen K, Murray J, Ekbom A. Information bias. Springer Ser Epidemi. 2010:113-7.
82. Dosemeci M, Wacholder S, Lubin JH. Does nondifferential misclassification of exposure always bias a true effect toward the null value? Am J Epidemiol. 1990 Oct;132(4):746-8.
83. Fox MP, Lash TL, Greenland S. A method to automate probabilistic sensitivity analyses of misclassified binary variables. Int J Epidemiol. 2005 Dec;34(6):1370-6.
84. Spiegelman D, Carroll RJ, Kipnis V. Efficient regression calibration for logistic regression in main study/internal validation study designs with an imperfect reference instrument. Stat Med. 2001 Jan 15;20(1):139-60.


85. Spiegelman D, McDermott A, Rosner B. Regression calibration method for correcting measurement-error bias in nutritional epidemiology. Am J Clin Nutr. 1997 Apr;65(4):1179S-86S.
86. Spiegelman D, Rosner B, Logan R. Estimation and inference for logistic regression with covariate misclassification and measurement error in main study/validation study designs. J Am Stat Assoc. 2000 Mar;95(449):51-61.
87. Thurston SW, Williams PL, Hauser R, Hu H, Hernandez-Avila M, Spiegelman D. A comparison of regression calibration approaches for designs with internal validation data. J Stat Plan Infer. 2005 Apr 1;131(1):175-90.
88. Breslow NE, Holubkov R. Weighted likelihood, pseudo-likelihood and maximum likelihood methods for logistic regression analysis of two-stage data. Stat Med. 1997 Jan 15;16(1-3):103-16.
89. Satten GA, Kupper LL. Inferences about exposure-disease associations using probability-of-exposure information. J Am Stat Assoc. 1993 Mar;88(421):200-8.
90. Spiegelman D, Casella M. Fully parametric and semi-parametric regression models for common events with covariate measurement error in main study/validation study designs. Biometrics. 1997 Jun;53(2):395-409.
91. Cheng J, Small DS, Tan ZQ, Ten Have TR. Efficient nonparametric estimation of causal effects in randomized trials with noncompliance. Biometrika. 2009 Mar;96(1):19-36.
92. Pepe MS, Fleming TR. A nonparametric method for dealing with mismeasured covariate data. J Am Stat Assoc. 1991 Mar;86(413):108-13.
93. Dellaportas P, Stephens DA. Bayesian analysis of errors-in-variables regression models. Biometrics. 1995 Sep;51(3):1085-95.
94. Prescott GJ, Garthwaite PH. A Bayesian approach to prospective binary outcome studies with misclassification in a binary risk factor. Stat Med. 2005 Nov 30;24(22):3463-77.
95. Richardson S, Gilks WR. A Bayesian approach to measurement error problems in epidemiology using conditional-independence models. Am J Epidemiol. 1993 Sep 15;138(6):430-42.
96. Cole SR, Chu HT, Greenland S. Multiple-imputation for measurement-error correction. Int J Epidemiol. 2006 Aug;35(4):1074-81.
97. Brenner H. Correcting for exposure misclassification using an alloyed gold standard. Epidemiology. 1996 Jul;7(4):406-10.


98. Rubin DB. Estimating causal effects of treatments in randomized and nonrandomized studies. J Educ Psychol. 1974;66(5):688-701.
99. Komro KA, Perry CL, Veblen-Mortenson S, Bosma LM, Dudovitz BS, Williams CL, et al. Brief report: The adaptation of Project Northland for urban youth. J Pediatr Psychol. 2004 Sep;29(6):457-66.
100. Komro KA, Perry CL, Veblen-Mortenson S, Farbakhsh K, Toomey TL, Stigler MH, et al. Outcomes from a randomized controlled trial of a multi-component alcohol use preventive intervention for urban youth: Project Northland Chicago. Addiction. 2008 Apr;103(4):606-18.
101. Tobler AL, Komro KA. Contemporary options for longitudinal follow-up: Lessons learned from a cohort of urban adolescents. Eval Program Plann. 2011 May;34(2):87-96.
102. Fleming CB, White HR, Catalano RF. Romantic relationships and substance use in early adulthood: An examination of the influences of relationship type, partner substance use, and relationship quality. J Health Soc Behav. 2010 Jun;51(2):153-67.
103. Goldsmith KA, Kasehagen LJ, Rosenberg KD, Sandoval AP, Lapidus JA. Unintended childbearing and knowledge of emergency contraception in a population-based survey of postpartum women. Matern Child Hlth J. 2008 May;12(3):332-41.
104. Pabst A, Baumeister SE, Kraus L. Alcohol-expectancy dimensions and alcohol consumption at different ages in the general population. Journal of Studies on Alcohol and Drugs. 2010 Jan;71(1):46-53.
105. Wells S, Mihic L, Tremblay PF, Graham K, Demers A. Where, with whom, and how much alcohol is consumed on drinking events involving aggression? Event-level associations in a Canadian national survey of university students. Alcohol Clin Exp Res. 2008 Mar;32(3):522-33.
106. White HR, Fleming CB, Kim MJ, Catalano RF, McMorris BJ. Identifying two potential mechanisms for changes in alcohol use among college-attending and non-college-attending emerging adults. Dev Psychol. 2008 Nov;44(6):1625-39.
107. Komro KA, Perry CL, Veblen-Mortenson S, Farbakhsh K, Toomey TL, Stigler MH, et al. Outcomes from a randomized controlled trial of a multi-component alcohol use preventive intervention for urban youth: Project Northland Chicago. Addiction. 2008 Apr;103(4):606-18.
108. de Leeuw ED. To mix or not to mix data collection modes in surveys. Journal of Official Statistics. 2005;21(2):233-55.


109. Botvin GJ, Kantor LW. Preventing alcohol and tobacco use through life skills training: Theory, methods, and empirical findings. Alcohol Research & Health. 2000;24(4):250-7.
110. Kranzler HR, Feinn R, Armeli S, Tennen H. Comparison of alcoholism subtypes as moderators of the response to sertraline treatment. Alcohol Clin Exp Res. 2012 Mar;36(3):509-16.
111. Liang KY, Zeger SL. Regression analysis for correlated data. Annu Rev Publ Health. 1993;14:43-68.
112. Muller KE, Fetterman BA. Regression and ANOVA: an integrated approach using SAS software. Hoboken, N.J.: J. Wiley; 2003.
113. Farah MJ, Shera DM, Savage JH, Betancourt L, Giannetta JM, Brodsky NL, et al. Childhood poverty: specific associations with neurocognitive development. Brain Res. 2006 Sep 19;1110(1):166-74.
114. White AM, Swartzwelder HS. Age-related effects of alcohol on memory and memory-related brain function in adolescents and adults. Recent Dev Alcohol. 2005;17:161-76.
115. De Bellis MD, Narasimhan A, Thatcher DL, Keshavan MS, Soloff P, Clark DB. Prefrontal cortex, thalamus, and cerebellar volumes in adolescents and young adults with adolescent-onset alcohol use disorders and comorbid mental disorders. Alcohol Clin Exp Res. 2005 Sep;29(9):1590-600.
116. Brown SA, Tapert SF, Granholm E, Delis DC. Neurocognitive functioning of adolescents: effects of protracted alcohol use. Alcohol Clin Exp Res. 2000 Feb;24(2):164-71.
117. Gruber E, DiClemente RJ, Anderson MM, Lodico M. Early drinking onset and its association with alcohol use and problem behavior in late adolescence. Prev Med. 1996 May-Jun;25(3):293-300.
118. Dahl H, Carlsson AV, Hillgren K, Helander A. Urinary ethyl glucuronide and ethyl sulfate testing for detection of recent drinking in an outpatient treatment program for alcohol and drug dependence. Alcohol Alcoholism. 2011 May-Jun;46(3):278-82.
119. Freeman WM, Vrana KE. Future prospects for biomarkers of alcohol consumption and alcohol-induced disorders. Alcohol Clin Exp Res. 2010 Jun;34(6):946-54.


BIOGRAPHICAL SKETCH

Melvin Douglas Livingston III received his B.A. in physics (yes, they do have those) from the University of Florida in the summer of 2007. Immediately following this he entered the M.P.H. program concentrating in epidemiology at the University of Florida. When the opportunity arose to transition to the newly formed Ph.D. program in epidemiology, he enrolled in the program's first cohort. He was a recipient of the 2008 Graduate Alumni Fellowship. Two years into his doctoral work he accepted the full-time position of statistical coordinator in the College of Medicine, Department of Health Outcomes and Policy. His time as a doctoral student and full-time employee gave him the opportunity to lead the analysis on four externally funded grants. This work resulted in co-authorship of eight peer-reviewed articles. He received his Ph.D. in the summer of 2013. Doug will pursue a faculty position where he can continue his research.