Permanent Link: http://ufdc.ufl.edu/AA00008689/00001
 Material Information
Title: Generalized shrinkage F-like statistics for testing an interaction term in gene expression analysis in the presence of heteroscedasticity
Series Title: BMC Bioinformatics
Physical Description: Book
Language: English
Creator: Yang, Jie
Casella, George
McIntyre, Lauren M.
Publisher: BioMed Central
Publication Date: 2011
 Notes
Abstract: Background: Many analyses of gene expression data involve hypothesis tests of an interaction term between two fixed effects, typically tested using a residual variance. In expression studies, the issue of variance heteroscedasticity has received much attention, and previous work has focused on either between-gene or within-gene heteroscedasticity. However, in a single experiment, heteroscedasticity may exist both within and between genes. Here we develop flexible shrinkage error estimators considering both between-gene and within-gene heteroscedasticity and use them to construct F-like test statistics for testing interactions, with cutoff values obtained by permutation. These permutation tests are complicated, and several permutation tests are investigated here. Results: Our proposed test statistics are compared with other existing shrinkage-type test statistics through extensive simulation studies and a real data example. The results show that the choice of permutation procedures has dramatically more influence on detection power than the choice of F or F-like test statistics. When both types of gene heteroscedasticity exist, our proposed test statistics can control preselected type-I errors and are more powerful. Raw data permutation is not valid in this setting. Whether unrestricted or restricted residual permutation should be used depends on the specific type of test statistic. Conclusions: The F-like test statistic that uses the proposed flexible shrinkage error estimator considering both types of gene heteroscedasticity and unrestricted residual permutation can provide a statistically valid and powerful test. Therefore, we recommend that it always be applied in the analysis of real gene expression data to test an interaction term.
General Note: 10.1186/1471-2105-12-427
 Record Information
Source Institution: University of Florida
Holding Location: University of Florida
Rights Management: All rights reserved by the source institution.
System ID: AA00008689:00001


Full Text


RESEARCH ARTICLE (Open Access)

Generalized shrinkage F-like statistics for testing an interaction term in gene expression analysis in the presence of heteroscedasticity

Jie Yang (1)*, George Casella (2,4) and Lauren M McIntyre (2,3,4)

Abstract

Background: Many analyses of gene expression data involve hypothesis tests of an interaction term between two fixed effects, typically tested using a residual variance. In expression studies, the issue of variance heteroscedasticity has received much attention, and previous work has focused on either between-gene or within-gene heteroscedasticity. However, in a single experiment, heteroscedasticity may exist both within and between genes. Here we develop flexible shrinkage error estimators considering both between-gene and within-gene heteroscedasticity and use them to construct F-like test statistics for testing interactions, with cutoff values obtained by permutation. These permutation tests are complicated, and several permutation tests are investigated here.

Results: Our proposed test statistics are compared with other existing shrinkage-type test statistics through extensive simulation studies and a real data example. The results show that the choice of permutation procedures has dramatically more influence on detection power than the choice of F or F-like test statistics. When both types of gene heteroscedasticity exist, our proposed test statistics can control preselected type-I errors and are more powerful. Raw data permutation is not valid in this setting. Whether unrestricted or restricted residual permutation should be used depends on the specific type of test statistic.
Conclusions: The F-like test statistic that uses the proposed flexible shrinkage error estimator considering both types of gene heteroscedasticity and unrestricted residual permutation can provide a statistically valid and powerful test. Therefore, we recommend that it always be applied in the analysis of real gene expression data to test an interaction term.

Background

The regulation of gene expression starts when a cell's DNA is transcribed into mRNA. The simultaneous expression profiles of many genes under different circumstances can provide insight into physiological processes. Using modern technologies in gene expression experiments such as oligonucleotide arrays [1] and cDNA spotted arrays [2], many scientists have made novel discoveries about complex biological processes of yeast [3,4], Drosophila [5], mice [6], humans [7], and other species. Recently one such study also included RNA-seq [8]. Statistical methodologies and issues involved in microarray data analysis have been widely reviewed [9-12], and it is expected that many of the same issues will need to be addressed with RNA-seq.

The analysis of variance (ANOVA) model is a popular statistical modeling method for the analysis of microarrays. Since its introduction by Kerr et al. [13], it has been extensively examined for use in this setting [14-21]. Kerr et al. constructed an ANOVA model that included the gene effect as a fixed effect. This model assumes identically and independently distributed residual errors across genes. The advantage of this model is that the large number of genes involved in a microarray experiment results in huge degrees of freedom for the error estimate, which can lead to a very powerful test.
However, the common assumption of homoscedasticity may not hold true in this setting [22]. One alternative is to use an ANOVA model for each gene, but the resulting test statistics from gene-specific models may have limited power because the biological sample size for each gene in a microarray experiment is usually small. To address this problem of limited power, researchers have proposed other methods for obtaining more information across genes, ranging from a simple equal weighted average of a gene-specific error estimate and the global average of all gene-specific error estimates (the F2 statistic proposed by Wu et al. [19]) to empirical Bayesian modeling of all gene-specific errors [23-26]. Other variations [27-29] used different variance modeling strategies to address the heteroscedasticity problem, but no clear winner has emerged [30]. Huang and Liu [31] extended the test statistics proposed by Cui et al. [28] by assuming a normal distribution on the mean and then deriving an empirical Bayes likelihood ratio test. The resulting test statistic shrinks both the mean and variances.

In addition to the problem of between-gene heteroscedasticity, we must also be concerned with within-gene heteroscedasticity. For example, in the study of simple differential gene expression between a treatment group and a control group, the variance in the treatment arm may differ from that in the control arm. Some approaches to this problem include a general Bayesian framework to model heteroscedastic error in a single generalized linear mixed model setting [32] and a structural model placed on the error variances specific to each gene and treatment combination [33].

*Correspondence: jie.yang@stonybrook.edu
(1) Department of Preventive Medicine, Stony Brook University, Stony Brook, NY 11794, USA
Full list of author information is available at the end of the article.
Yang et al. BMC Bioinformatics 2011, 12:427, http://www.biomedcentral.com/1471-2105/12/427
© 2011 Yang et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
As gene expression studies become more popular, the complexity of the experiments increases. Instead of only simple treatment and control experiments, experiments with two or more factors are being conducted. This increase in experiment complexity has led to many scientific questions involving the hypothesis testing of an interaction between two factors. For example, testing a probe-by-genotype interaction can result in inferences about polymorphism in the probe, such as single nucleotide polymorphism (SNP) and insertion-deletion (indel) [34-37]; testing a probe-by-sex interaction can imply that alternative splicing occurs between male and female subjects [38]; and in pharmacogenomic studies, testing the genotype-drug/treatment or genotype-disease interaction may be of interest [39]. Thus far, all the development of ANOVA methods for microarray studies has focused on tests of main effects.

Here, a generalized shrinkage estimator incorporating both within- and between-gene heteroscedasticities is developed (see Lehmann and Casella [40] for a review of shrinkage estimation). In any given experiment, both within-gene and between-gene heteroscedasticity may exist; thus, taking these possibilities into account should lead to an improved test statistic. Moreover, given the increasing complexity of recent studies and the burgeoning interest in hypotheses that involve interactions, we focus on an improved shrinkage-based F-test for interaction terms.

Methods

Here we develop new shrinkage estimates for the error term and show how to use these estimates to construct F-like statistics. We then estimate the null distribution of these statistics by using permutation tests.

Shrinkage error estimators

Shrinkage error estimators pull individual error estimates toward shrinkage targets, with the amount of shrinkage depending on the variability of individual error estimates [28,40]. Let the gene-specific error estimates for all genes $i$ and subgroups $k$ be $\hat\sigma^2_{1,1}, \ldots, \hat\sigma^2_{1,K}, \ldots, \hat\sigma^2_{I,K}$, $i = 1, \ldots, I$, $k = 1, \ldots, K$, and let $\sigma^2_{ik}$ be the true variance of gene $i$ in group $k$. When the experimental design is balanced, $\hat\sigma^2_{ik}$ is the residual mean square for gene $i$ in group $k$ and $\hat\sigma^2_{ik}/\sigma^2_{ik} \sim \chi^2_\nu/\nu$, where $\nu$ represents the degrees of freedom for the error estimates.
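Because each error estimate is a scaled chi-square variable, the mean and variance of its logarithm depend only on the degrees of freedom $\nu$; these are the two constants that the log-scale shrinkage formulas rely on. The following is a minimal sketch of the standard closed-form expressions, mean $= \psi(\nu/2) + \log(2/\nu)$ and variance $= \psi'(\nu/2)$, where $\psi$ is the digamma function. It is not the authors' code (the article uses simulation-based tabulated values), and the function names are hypothetical.

```python
import math


def digamma(x: float) -> float:
    """Digamma function psi(x) for x > 0 (recurrence plus asymptotic series)."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x          # psi(x) = psi(x + 1) - 1/x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0))
    return result + math.log(x) - 0.5 / x - series


def trigamma(x: float) -> float:
    """Trigamma function psi'(x) for x > 0 (recurrence plus asymptotic series)."""
    result = 0.0
    while x < 6.0:
        result += 1.0 / (x * x)    # psi'(x) = psi'(x + 1) + 1/x^2
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    series = inv * inv2 * (1.0 / 6.0 - inv2 * (1.0 / 30.0 - inv2 / 42.0))
    return result + inv + 0.5 * inv2 + series


def log_chisq_moments(nu: float) -> tuple[float, float]:
    """Mean and variance of log(chi2_nu / nu).

    chi2_nu / 2 follows a Gamma(nu / 2, 1) distribution, whose logarithm has
    mean digamma(nu / 2) and variance trigamma(nu / 2).
    """
    mean = digamma(nu / 2.0) + math.log(2.0 / nu)
    variance = trigamma(nu / 2.0)
    return mean, variance
```

For example, `log_chisq_moments(4)` gives a mean of about -0.270 and a variance of about 0.645, so the log of a residual mean square with four degrees of freedom underestimates the log of the true variance by about 0.27 on average.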
The choices of shrinkage targets in microarray data include the following:

1. Specific values for each gene-group combination
2. Gene-specific values that are the same across all other groups
3. Group-specific values that are the same across genes but different across groups
4. A single point representing the underlying common error

Correspondingly, these targets are correct when (1) there are both within-gene and between-gene heteroscedasticity; (2) there is only between-gene heteroscedasticity; (3) there is only within-gene heteroscedasticity; and (4) all error variances are identical. We now develop a generalized shrinkage error estimator using these four shrinkage targets.

Let $X_{ik} = \log\hat\sigma^2_{ik} - m = \log\sigma^2_{ik} + \log(\chi^2_\nu/\nu) - m$, where $m$ is the mean of $\log(\chi^2_\nu/\nu)$. Then, using the asymptotic normal approximation of $X_{ik}$, the distribution of the $X_{ik}$'s with different shrinkage targets for different gene $i$ and group $k$ combinations is

$$X_{ik} \mid \theta_{ik} \sim N(\theta_{ik}, \sigma^2), \qquad \theta_{ik} \sim N(\mu + \alpha_i + \beta_k, \tau^2), \qquad (1)$$


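where $\theta = (\theta_{1,1}, \ldots, \theta_{1,K}, \ldots, \theta_{I,1}, \ldots, \theta_{I,K})$, $\alpha = (\alpha_1, \ldots, \alpha_I)$ represents the gene-specific mean differences, and $\beta = (\beta_1, \ldots, \beta_K)$ models different means with respect to different classes of the subgroups.

If $\sigma^2$ and $\tau^2$ are known, then the Bayes estimator of $\theta_{ik}$ under the squared error loss is [39]:

$$\hat\theta^{B}_{ik} = \frac{\sigma^2}{\sigma^2 + \tau^2}(\mu + \alpha_i + \beta_k) + \frac{\tau^2}{\sigma^2 + \tau^2} X_{ik}.$$

Here, $\sigma^2$ is the variance of $\log(\chi^2_\nu/\nu)$ and is known [28,40], but $\tau^2$ is not known. However, the marginal distribution of $X_{ik}$ can be used to create an empirical Bayes estimator of $\tau^2$ and hence of $\theta_{ik}$. Marginally, $X_{ik} \sim N(\mu + \alpha_i + \beta_k, \sigma^2 + \tau^2)$, $i = 1, \ldots, I$, $k = 1, \ldots, K$, and, from this model, the least squares estimates $\hat\mu$, $\hat\alpha_i$, $\hat\beta_k$ of $\mu$, $\alpha_i$, $\beta_k$ are the uniformly minimum-variance unbiased estimators. Using the fact that

$$E\left(\frac{IK - (I + K - 1) - 2}{\sum_{i,k}(X_{ik} - \hat X_{ik})^2}\right) = \frac{1}{\sigma^2 + \tau^2},$$

the empirical Bayes estimator for $\tau^2$ is

$$\hat\tau^2 = \frac{\sum_{i,k}(X_{ik} - \hat X_{ik})^2}{IK - (I + K - 1) - 2} - \sigma^2.$$

Then, we can construct the positive-part empirical Bayes estimator [40]:

$$\hat\theta^{EB+}_{ik} = \hat X_{ik} + \left(1 - \frac{[IK - (I + K - 1) - 2]\,\sigma^2}{\sum_{i,k}(X_{ik} - \hat X_{ik})^2}\right)_+ (X_{ik} - \hat X_{ik}), \qquad \hat X_{ik} = \hat\mu + \hat\alpha_i + \hat\beta_k,$$

where $(x)_+ = \max(x, 0)$. The generalized shrinkage error estimate for $\sigma^2_{ik}$ can be obtained through exponentiating $\hat\theta^{EB+}_{ik}$ as follows:

$$\hat\sigma^2_{Gen,ik} = \exp(\hat\theta^{EB+}_{ik}). \qquad (2)$$

Using a similar argument, the generalized shrinkage error estimator with the shrinkage target at each gene is

$$\hat\sigma^2_{Gen\text{-}gene,ik} = \exp(-m + \hat\mu + \hat\alpha_i)\,\exp\left[\left(1 - \frac{[IK - (I - 1) - 2]\,\sigma^2}{\sum_{i,k}(X_{ik} - \hat\mu - \hat\alpha_i)^2}\right)_+ (X_{ik} - \hat\mu - \hat\alpha_i)\right], \qquad (3)$$

with the shrinkage target at each group is

$$\hat\sigma^2_{Gen\text{-}grp,ik} = \exp(-m + \hat\mu + \hat\beta_k)\,\exp\left[\left(1 - \frac{[IK - (K - 1) - 2]\,\sigma^2}{\sum_{i,k}(X_{ik} - \hat\mu - \hat\beta_k)^2}\right)_+ (X_{ik} - \hat\mu - \hat\beta_k)\right], \qquad (4)$$

and with the shrinkage target at the common error, we have

$$\hat\sigma^2_{Gen\text{-}ce,ik} = \exp(-m + \hat\mu)\,\exp\left[\left(1 - \frac{[IK - 3]\,\sigma^2}{\sum_{i,k}(X_{ik} - \hat\mu)^2}\right)_+ (X_{ik} - \hat\mu)\right]. \qquad (5)$$

The shrinkage error estimator proposed by Cui et al. [28] shrinks the gene-specific error estimators toward their common corrected geometric mean. Specifically, the estimator for $\sigma^2_i$ is calculated as

$$\hat\sigma^2_{Cui,i} = \exp\left(-m + \bar X\right)\exp\left[\left(1 - \frac{[I - 3]\,\sigma^2}{\sum_{i'}(X_{i'} - \bar X)^2}\right)_+ (X_i - \bar X)\right], \qquad \bar X = \frac{1}{I}\sum_{i'} X_{i'}, \qquad (6)$$

where $X_i$ is the (log-scale) residual variance estimate from a gene-specific model, and $m$ and $\sigma^2$ are the mean and variance of $\log(\chi^2_\nu/\nu)$. The underlying assumption for this estimator is that there is no between-gene heteroscedasticity, as this estimator shrinks every gene-specific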
error estimator toward one target. Therefore, it will overshrink the gene-specific error estimates when gene heteroscedasticity exists. In comparison, generalized shrinkage error estimators are flexible in terms of incorporating a different type of heteroscedasticity. Some degrees of freedom are used for incorporating the heteroscedasticity. However, the gain is that the error estimator is then closer to the underlying distribution and should lead to better performance of the resultant F-like test statistics, as shown in the Results section.

In formulas (2), (3), (5), and (6), $m$ is the mean and $\sigma^2$ is the variance of a log-transformed chi-square random variable. The simulation-based approximate values of $m$ and $\sigma^2$ can be found in Table 1 of Cui et al. [28]. Pounds [41] gave analytical expressions for these parameters and developed R code for the exact calculation. Here, the simulation-based approximate values were used.

Shrinkage F-like statistics

To construct a statistic for the hypothesis test of no interaction between two fixed effects, the traditional F test is simply the ratio of the mean square of the interaction term (MSI) and the mean square of residuals (MSE). This F-test, referred to as $F_1$ [42], is $F_1 = MSI/MSE = MSI/\hat\sigma^2$. The $F_1$ test corresponding to a specific gene $i$ is denoted by

$$F_{1,i} = \frac{MSI_i}{\hat\sigma^2_i}. \qquad (7)$$
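As an illustration of equation (7), the sketch below computes MSI, MSE and their ratio for one gene in a balanced two-factor layout with replication. It is a simplified sketch, not the authors' SAS code: it assumes a plain two-way fixed-effects layout and ignores any nested replicate effect, and the function name `interaction_f1` and the data layout `y[a][b]` (a list of replicates for level `a` of the first factor and level `b` of the second) are hypothetical.

```python
def interaction_f1(y):
    """Return (F1, MSI, MSE) for a balanced two-way layout with replication.

    y[a][b] is the list of replicate values for level a of the first factor
    and level b of the second factor; every cell must hold the same number
    (at least two) of replicates.
    """
    n_a, n_b, n_rep = len(y), len(y[0]), len(y[0][0])

    # Cell, marginal and grand means.
    cell = [[sum(y[a][b]) / n_rep for b in range(n_b)] for a in range(n_a)]
    row = [sum(cell[a]) / n_b for a in range(n_a)]
    col = [sum(cell[a][b] for a in range(n_a)) / n_a for b in range(n_b)]
    grand = sum(row) / n_a

    # Interaction sum of squares: what remains of the cell means after
    # removing both main effects.
    ss_int = n_rep * sum(
        (cell[a][b] - row[a] - col[b] + grand) ** 2
        for a in range(n_a) for b in range(n_b)
    )
    # Residual sum of squares: replicate scatter around each cell mean.
    ss_err = sum(
        (value - cell[a][b]) ** 2
        for a in range(n_a) for b in range(n_b) for value in y[a][b]
    )

    msi = ss_int / ((n_a - 1) * (n_b - 1))
    mse = ss_err / (n_a * n_b * (n_rep - 1))
    return msi / mse, msi, mse
```

With two levels per factor and three replicates per cell, the interaction has one degree of freedom and the residual has eight, which shows how little information a single gene contributes to its own error estimate.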


The error variance estimator in this test uses data from only gene $i$. In oligonucleotide microarray models, the degrees of freedom for the error estimate can be small because the sample size of RNA is usually small, and hence the power of $F_1$ can be limited.

Following the method of constructing an F-test statistic given by Neter et al. [42], the gene-specific shrinkage F-like statistics for testing an interaction between two fixed effects can be obtained as

$$F_{Gen,i} = \frac{MSI_i}{\sum_k \hat\sigma^2_{Gen,ik}/K}, \quad F_{Gen\text{-}gene,i} = \frac{MSI_i}{\sum_k \hat\sigma^2_{Gen\text{-}gene,ik}/K}, \quad F_{Gen\text{-}grp,i} = \frac{MSI_i}{\sum_k \hat\sigma^2_{Gen\text{-}grp,ik}/K}, \quad F_{Gen\text{-}ce,i} = \frac{MSI_i}{\sum_k \hat\sigma^2_{Gen\text{-}ce,ik}/K}, \quad F_{Cui,i} = \frac{MSI_i}{\hat\sigma^2_{Cui,i}}.$$

When the homoscedastic error assumption is true, the pooled variance estimator, $\hat\sigma^2_{pool}$, can be used to construct an F-like statistic. For a balanced design, the pooled variance estimate is the average of all gene-specific error estimates. This statistic is denoted by $F_3$ using the same notation used by Cui and Churchill [22], who also introduced another shrinkage-type F statistic, $F_2$, which can also borrow information across genes when estimating the residual variances. The statistic $F_2$ uses an equal weighted average of a gene-specific error estimator $\hat\sigma^2_i$ and $\hat\sigma^2_{pool}$. The definitions of $F_{2,i}$ and $F_{3,i}$ are

$$F_{2,i} = \frac{MSI_i}{0.5\,\hat\sigma^2_i + 0.5\,\hat\sigma^2_{pool}}, \qquad F_{3,i} = \frac{MSI_i}{\hat\sigma^2_{pool}}.$$

Permutation tests

For the proposed generalized shrinkage F-like test statistics, the null distributions do not follow any known named distribution. Therefore, an empirical approach such as a permutation test can be used to estimate the null distributions. The permutation test for interaction is complicated, because there is no exact permutation test for such a purpose [43]. We therefore must consider an approximate permutation method for testing an interaction term in a crossed fixed/mixed model [44,45].
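For reference, the statistics whose null distributions are to be approximated can be sketched numerically. The block below follows equation (2) for a balanced gene-by-group table of variance estimates and then forms $F_{Gen}$, $F_2$ and $F_3$ as defined above. It is an illustrative sketch rather than the authors' implementation: the function names are hypothetical, the constants `m` and `v` (mean and variance of the log-transformed chi-square variable) are supplied by the caller, and the shrinkage factor is set to zero when the additive fit is exact, which is an assumption made here to avoid dividing by zero.

```python
import math


def generalized_shrinkage(s2, m, v):
    """Generalized shrinkage variance estimates following equation (2).

    s2[i][k] is the residual mean square for gene i in group k; m and v are
    the mean and variance of log(chi2_nu / nu) for the error degrees of freedom.
    """
    n_gene, n_grp = len(s2), len(s2[0])
    x = [[math.log(s2[i][k]) - m for k in range(n_grp)] for i in range(n_gene)]

    # Least squares fit of the additive model mu + alpha_i + beta_k (balanced case).
    row = [sum(x[i]) / n_grp for i in range(n_gene)]
    col = [sum(x[i][k] for i in range(n_gene)) / n_gene for k in range(n_grp)]
    grand = sum(row) / n_gene
    fit = [[row[i] + col[k] - grand for k in range(n_grp)] for i in range(n_gene)]

    ss = sum((x[i][k] - fit[i][k]) ** 2
             for i in range(n_gene) for k in range(n_grp))
    df = n_gene * n_grp - (n_gene + n_grp - 1) - 2
    # Positive-part shrinkage factor; zero when the fit is exact (assumption).
    factor = max(0.0, 1.0 - df * v / ss) if ss > 0.0 else 0.0

    return [[math.exp(fit[i][k] + factor * (x[i][k] - fit[i][k]))
             for k in range(n_grp)] for i in range(n_gene)]


def f_gen(msi, shrunken_variances):
    """F_Gen for one gene: MSI over the average of its K shrunken variances."""
    return msi / (sum(shrunken_variances) / len(shrunken_variances))


def f2(msi, s2_gene, s2_pool):
    """F2: equal-weight average of the gene-specific and pooled variances."""
    return msi / (0.5 * s2_gene + 0.5 * s2_pool)


def f3(msi, s2_pool):
    """F3: pooled variance only."""
    return msi / s2_pool
```

A shrinkage factor of one leaves each bias-corrected estimate untouched, while a factor of zero replaces it with the additive gene-plus-group target; real data fall between these two extremes.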
Permutation approaches developed previously focused on a single ANOVA model. In the typical gene expression study, thousands of ANOVA models are considered simultaneously. The additional complexity of the shrinkage F-like statistics indicates that Monte Carlo studies are needed to investigate the performance of residual permutation and raw data permutation, with restrictions or not, in a gene-expression analysis. The choice of permutation procedures is critical for assessing the performance of a test statistic.

For all the modified F-like statistics presented in the previous section, the null distributions can only be approximated empirically, but permutation procedures can be used to find the approximate null distribution of all the F and F-like statistics. The important issues in performing a permutation analysis include the choice of the exchangeable units under the null hypothesis, the choice of using restricted permutation or not, and the choice of residual permutation or raw data permutation. These choices influence the power of a test statistic. Residual permutation using residuals from a reduced model and unrestricted raw data permutation can be used to approximate the null distribution of a statistic for testing an interaction term [44]. When using $F_1$ to test an interaction term in a single ANOVA model, the residual permutation leads to a more powerful test than unrestricted raw data permutation [44]. However, in gene expression analysis, thousands of gene-specific ANOVA models are simultaneously considered, and for a particular gene-specific ANOVA model, information from other gene-specific ANOVA models is used to construct the shrinkage error estimate. Hence, both residual permutation and raw data permutation were investigated. Furthermore, both restricted and unrestricted permutations were studied, because the permutation units are exchangeable only within each particular group when within-gene heteroscedasticity is present across those subgroups.

Table 1 Results from raw data permutation

Restricted?  Data set   F1            F2           F3           FCui          FGen         FGen-gene    FGen-grp
YES          null-ce    5.05 (0.07)   5.06 (0.08)  5.12 (0.17)  5.09 (0.10)   5.09 (0.10)  5.05 (0.08)  5.11 (0.10)
YES          null-gh    5.02 (0.07)   5.13 (0.16)  5.26 (0.20)  5.03 (0.07)   5.07 (0.12)  5.03 (0.07)  5.11 (0.16)
YES          null-wgh   4.97 (0.07)   4.96 (0.09)  4.93 (0.18)  4.99 (0.08)   4.99 (0.12)  4.96 (0.09)  5.01 (0.16)
YES          null-bgh   5.02 (0.07)   4.99 (0.17)  5.03 (0.21)  5.02 (0.07)   5.02 (0.15)  5.01 (0.09)  5.03 (0.18)
NO           null-ce    5.10 (0.07)   5.06 (0.08)  5.06 (0.08)  7.4 (0.12)    5.15 (0.09)  5.12 (0.08)  5.08 (0.08)
NO           null-gh    5.08 (0.07)   5.12 (0.16)  5.12 (0.12)  7.4 (0.09)    5.10 (0.11)  5.07 (0.10)  5.12 (0.09)
NO           null-wgh   12.31 (0.10)  7.56 (0.10)  4.61 (0.10)  17.37 (0.14)  5.32 (0.11)  5.07 (0.09)  5.87 (0.11)
NO           null-bgh   12.31 (0.11)  6.63 (0.17)  5.55 (0.19)  15.68 (0.12)  6.30 (0.12)  6.10 (0.11)  6.30 (0.11)

CWER (in percent) obtained from 1,000 permutations with the nominal significance level set at 0.05, with standard errors in parentheses. Nine hundred simulation runs were performed to get the empirical average CWER of all types of F-like test statistics.

Results

The properties of this shrinkage estimator are compared with those of other existing F and F-like statistics that have been proposed and described in the "Shrinkage F-like statistics" section.

Simulation studies

The purpose of these simulation studies was to compare the performances of F1, F2, F3, FCui, FGen, FGen-gene, and FGen-grp in terms of type I error and power and to compare the results of a particular F-like statistic using four different permutation strategies: restricted/unrestricted residual permutation and restricted/unrestricted raw data permutation.

In these simulation studies, 100 genes with two probes for each gene and three replicates from each of two lines were simulated to mimic a split-plot design in a general oligonucleotide microarray experiment. Data were generated from the gene-specific ANOVA model

$$y_{plr} = P_p + L_l + RL_{rl} + PL_{pl} + \varepsilon_{plr}, \qquad p = 1, 2, \quad l = 1, 2, \quad r = 1, 2, 3,$$

where P, L, RL, and PL represent probe, line, replicates from a particular line, and the interaction between probe and line, respectively. Replicates were nested within each line, and RL is usually treated as a random effect during the model-fitting procedure, which results in a correlation between probes from the same biological sample. In the simulated data sets, the correlation between genes was 0. A total of 900 simulation runs were carried out to compare the performances of F1, F2, F3, FCui, FGen, FGen-gene, and FGen-grp based on different permutation procedures.
The four permutations tested were unrestricted residual permutation, restricted residual permutation with respect to each line, unrestricted raw data permutation, and restricted raw data permutation with respect to each line. The residuals permuted were from a reduced fixed model with fixed effects for only line and probe.

Two types of data were simulated: null cases and cases with a probe-by-line interaction at a range of degrees. Null cases included: null-ce, in which all probe-level expression values were simulated from the standard normal distribution; null-gh, in which the gene-specific error variances were simulated from the log-normal distribution with mean 0 and standard deviation 2 on the log scale, mimicking the general heteroscedastic error distribution in typical data sets; null-wgh, in which all genes had the same error structures and the residual error variance of line 1 was 100 times that of line 2; and null-bgh, in which simulated data were modified from null-gh, with the variance of line 1 multiplied by 100. Correspondingly, ce, gh, wgh, and bgh in Figures 1 and 2 were simulated by adding interaction terms to null-ce, null-gh, null-wgh, and null-bgh. Quantitative interaction was assumed, and the differences in the opposite direction were set so that the detection powers for an interaction term based on traditional F statistics and tabled p-values ranged from 0.05 to 0.95.
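The four null scenarios described above can be sketched as a small data generator. This is only an illustration under stated assumptions, not the authors' simulation code: the function name `simulate_null` and its parameters are hypothetical, all main effects are set to zero, errors are drawn independently (no replicate-within-line random effect is generated), and line index 0 plays the role of "line 1".

```python
import math
import random


def simulate_null(scenario, n_genes=100, n_probes=2, n_lines=2, n_reps=3, seed=None):
    """Simulate null data; the result is indexed as data[gene][probe][line][rep].

    scenario is one of "null-ce", "null-gh", "null-wgh" or "null-bgh".
    """
    if scenario not in ("null-ce", "null-gh", "null-wgh", "null-bgh"):
        raise ValueError(f"unknown scenario: {scenario}")
    rng = random.Random(seed)
    between_gene = scenario in ("null-gh", "null-bgh")
    within_gene = scenario in ("null-wgh", "null-bgh")

    data = []
    for _ in range(n_genes):
        # Between-gene heteroscedasticity: log-normal gene-specific variance
        # (mean 0 and standard deviation 2 on the log scale).
        gene_var = rng.lognormvariate(0.0, 2.0) if between_gene else 1.0
        gene = []
        for _ in range(n_probes):
            probe = []
            for line in range(n_lines):
                # Within-gene heteroscedasticity: the first line has 100 times
                # the error variance of the other line.
                line_var = gene_var * (100.0 if within_gene and line == 0 else 1.0)
                sd = math.sqrt(line_var)
                probe.append([rng.gauss(0.0, sd) for _ in range(n_reps)])
            gene.append(probe)
        data.append(gene)
    return data
```

Passing a fixed `seed` makes a run reproducible, which is convenient when the same simulated data set has to be analyzed under several permutation schemes.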
Tables 1 and 2 show the results from 900 simulation runs using raw data permutation and residual permutation, respectively. Data in Table 1 suggest that when both types of gene heteroscedasticity exist, the unrestricted raw data permutation had a greater average comparison-wise error rate (CWER) than residual permutation. Raw data permutation with restriction can control the prespecified CWER in all cases. In Table 2, for the common error cases, all test statistics had the prespecified CWER from both restricted and unrestricted residual permutation. When within-gene heteroscedasticity existed, F1 and FCui had inflated CWER from both residual permutation tests. Restricted residual permutation reduces, but does not solve, this problem. For F2 and F3, only the restricted residual permutation could control the prespecified CWER. For FGen, FGen-gene, and FGen-grp, restricted residual permutation gave conservative results in terms of having CWER smaller than the prespecified level. When the shrinkage target is correctly set, unrestricted residual permutation controls the nominal CWER. As expected, only FGen coupled with unrestricted residual permutation could be used for all cases, because the CWER was always less than the nominal level.

Further simulations to compare the rejection rates were conducted. Only results from residual permutation are shown because it was found that raw data permutation was less powerful than residual permutation. This is consistent with the findings of Anderson and Ter Braak [44]. Figure 1 shows the estimated average null hypothesis rejection rate curves from all F-like statistics and both restricted and unrestricted residual permutation procedures. The x-axis represents the average null hypothesis rejection rate using F1 and the tabulated p-values. The solid line shows that the corresponding statistic controls the prespecified CWER, and the dashed line shows that the corresponding CWER was inflated.
In general, restricted residual permutation is less powerful than unrestricted residual permutation. For example, the power of all statistics from unrestricted residual permutation almost doubled in some cases where heteroscedasticity existed.
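The difference between the two residual permutation schemes compared above can be sketched for a single gene. The block below is a simplified illustration, not the authors' SAS procedure: it assumes a balanced probe-by-line layout `y[probe][line][rep]`, takes residuals from the additive probe + line fit, ignores the replicate-within-line effect, uses the plain $F_1$ statistic, and reports a standard permutation p-value of the form (count + 1)/(B + 1). All function names are hypothetical.

```python
import random


def interaction_f(y):
    """F1 = MSI / MSE for a balanced layout y[probe][line][rep]."""
    n_p, n_l, n_r = len(y), len(y[0]), len(y[0][0])
    cell = [[sum(y[p][l]) / n_r for l in range(n_l)] for p in range(n_p)]
    row = [sum(cell[p]) / n_l for p in range(n_p)]
    col = [sum(cell[p][l] for p in range(n_p)) / n_p for l in range(n_l)]
    grand = sum(row) / n_p
    ss_int = n_r * sum((cell[p][l] - row[p] - col[l] + grand) ** 2
                       for p in range(n_p) for l in range(n_l))
    ss_err = sum((v - cell[p][l]) ** 2
                 for p in range(n_p) for l in range(n_l) for v in y[p][l])
    return (ss_int / ((n_p - 1) * (n_l - 1))) / (ss_err / (n_p * n_l * (n_r - 1)))


def residual_permutation_pvalue(y, n_perm=1000, restricted=False, seed=None):
    """Permutation p-value for the probe-by-line interaction of one gene."""
    rng = random.Random(seed)
    n_p, n_l, n_r = len(y), len(y[0]), len(y[0][0])

    # Reduced (additive) model: fitted value = probe mean + line mean - grand mean.
    cell = [[sum(y[p][l]) / n_r for l in range(n_l)] for p in range(n_p)]
    row = [sum(cell[p]) / n_l for p in range(n_p)]
    col = [sum(cell[p][l] for p in range(n_p)) / n_p for l in range(n_l)]
    grand = sum(row) / n_p
    fit = [[row[p] + col[l] - grand for l in range(n_l)] for p in range(n_p)]
    resid = [[[y[p][l][r] - fit[p][l] for r in range(n_r)]
              for l in range(n_l)] for p in range(n_p)]

    # Exchangeable blocks: one block per line (restricted) or a single block.
    if restricted:
        blocks = [[(p, l, r) for p in range(n_p) for r in range(n_r)]
                  for l in range(n_l)]
    else:
        blocks = [[(p, l, r) for p in range(n_p) for l in range(n_l)
                   for r in range(n_r)]]

    f_obs = interaction_f(y)
    exceed = 0
    for _ in range(n_perm):
        y_star = [[[0.0] * n_r for _ in range(n_l)] for _ in range(n_p)]
        for positions in blocks:
            values = [resid[p][l][r] for p, l, r in positions]
            rng.shuffle(values)  # permute residuals within the block only
            for (p, l, r), e in zip(positions, values):
                y_star[p][l][r] = fit[p][l] + e
        if interaction_f(y_star) >= f_obs:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)
```

Restricting the shuffle to one line at a time keeps each residual within a block that shares its error variance, which is why the restricted scheme remains defensible under within-gene heteroscedasticity but has fewer distinct rearrangements to draw on.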


When the common error assumption is valid, F3 is obviously the most powerful test and the prespecified CWER is controlled. All other F-like statistics performed very similarly in this case. When the shrinkage target was correctly set, the resultant test statistic was the most powerful one. For example, when there was only within-gene heteroscedasticity, FGen-grp was more powerful than FGen and FGen-gene based on either restricted or unrestricted residual permutation. The rejection rate comparison of statistically valid test statistics is further illustrated in Figure 2, where the x-axis is the average rejection rate from using FGen and unrestricted residual permutation. Figure 2 clearly shows that unrestricted residual permutation is more favorable in terms of power. FGen-grp appears to be more powerful than FGen, but when both types of gene heteroscedasticities occur, FGen-grp has inflated CWER.

Figure 1 The comparison of power curves of all F-like test statistics. The x-axis is the average power from analyzing 900 simulated data sets using F1 with tabled p-values. The y-axis is the estimated powers using empirical gene-specific null distributions from 1,000 residual permutations. The upper four plots show the results with restricted residual permutation, while the lower four plots show the results from unrestricted residual permutation. The solid line indicates the empirical average CWER of a statistic is at the prespecified level, and the dashed line shows an inflated empirical average CWER. ce, all genes have common error; gh, only between-gene heteroscedasticity exists; wgh, only within-gene heteroscedasticity exists; bgh, both between-gene and within-gene heteroscedasticity exist.

Figure 2 The comparison of power curves of FGen from unrestricted residual permutation versus other F-like test statistics. Only results from permutation combinations that can control prespecified CWER are used in this figure. The x-axis is the average power after analyzing 900 simulated data sets using FGen and 1,000 unrestricted residual permutations. The y-axis is the estimated power from other F-like test statistics and empirical gene-specific null distributions based on the appropriate permutation. The solid black line corresponds to FGen with unrestricted permutation, and this test always controls prespecified CWER. ce, all genes have common error; gh, only between-gene heteroscedasticity exists; wgh, only within-gene heteroscedasticity exists; bgh, both between-gene and within-gene heteroscedasticity exist; res, restricted permutation; unres, unrestricted permutation.

Table 2 Results from residual permutation

Restricted?  Data set   F1            F2           F3           FCui          FGen         FGen-gene    FGen-grp
YES          null-ce    4.59 (0.07)   4.15 (0.08)  3.63 (0.14)  4.09 (0.09)   4.55 (0.07)  3.23 (0.06)  4.44 (0.08)
YES          null-gh    4.57 (0.07)   4.1 (0.13)   3.95 (0.16)  4.49 (0.07)   4.6 (0.07)   4.61 (0.07)  4.38 (0.07)
YES          null-wgh   6.74 (0.08)   4.33 (0.08)  3.51 (0.14)  6.49 (0.09)   4.38 (0.07)  4.2 (0.07)   2.78 (0.09)
YES          null-bgh   6.74 (0.08)   4.35 (0.16)  4.07 (0.19)  6.58 (0.08)   4.36 (0.07)  4.16 (0.07)  3.64 (0.07)
NO           null-ce    5.1 (0.07)    4.99 (0.08)  4.5 (0.08)   4.59 (0.08)   4.99 (0.07)  4.1 (0.07)   4.68 (0.07)
NO           null-gh    5.1 (0.07)    4.83 (0.1)   4.59 (0.11)  5.08 (0.07)   4.99 (0.07)  5.01 (0.07)  4.95 (0.07)
NO           null-wgh   10.75 (0.09)  8.46 (0.09)  7.6 (0.09)   12.37 (0.11)  5.03 (0.08)  6.43 (0.08)  4.93 (0.08)
NO           null-bgh   10.75 (0.1)   8.38 (0.17)  8.07 (0.19)  10.79 (0.1)   5.02 (0.08)  6.38 (0.08)  6.73 (0.08)

CWER (in percent) obtained from 1,000 permutations with the nominal significance level set at 0.05, with standard errors in parentheses. Nine hundred simulation runs were performed to get the empirical average CWER of all types of F-like test statistics.

Drosophila data

The data used in this study are from a gene expression comparison study between D. melanogaster and D. simulans [46]. Expression of 10 genotypes of each species was measured in male flies. In D. simulans, each genotype was measured separately, and in D. melanogaster, a pool of 10 genotypes was measured. All genotypes (individual or pooled) were independently isolated and hybridized three times. The goal of the original study was to provide a genome-wide approach to identifying candidate genes potentially responsible for adaptation and speciation in D. simulans and D. melanogaster. In this study, we focus on identifying sequence differences between genotypes in D. simulans based on hybridization profiles. Within-gene heteroscedasticity is expected because the genotypes come from different lines. The proposed generalized shrinkage F-like test statistics FGen, FGen-gene, and FGen-grp were compared with F2 and F3 with restricted residual permutation, which could control prespecified CWER for any variance structure in simulation studies. Furthermore, Smyth's moderated F-test statistic [25], both without multiple testing adjustment and with the false discovery rate (FDR) controlled at 5%, was used for comparison. As the main interest is in sequence difference, the focus is on the test of interaction between line and probe. The split-plot model described above is used. SAS program codes are included in the additional files (additional file 1 and additional file 2).

The Drosophila genome has been fully sequenced, and both SNPs and indels can cause a significant interaction term. Thus, the false positive rate and detection power based on SNP/indel sequence information can be calculated for a subset of the data. In the data set, there were 10 lines from D. simulans and three replicates from each line. Each probe set had 14 probes. The 1,285 probe sets containing all "good" probes were selected. A "bad" probe's sequence satisfies one or more of the following criteria: it matches the D. simulans genome multiple times; it cannot be mapped to the FlyBase 4.2.1 genome; or it has no information, such as hitting outside an exon, hitting a poorly aligned region, or hitting a region lacking a sequence. SNP or indel information could be determined in 777 probe sets. For this data set, there was a high degree of within-gene heteroscedasticity: about 22.3% of the probe sets had a difference in line-specific residual variance estimates of 10-fold or more. Therefore, as suggested by the conclusions from simulation studies, unrestricted residual permutation and restricted residual permutation were used for generalized shrinkage F-like test statistics (FGen, FGen-gene, FGen-grp), and restricted residual permutation was used for statistics (F2, F3). The results are shown in Table 3. Consistent with the findings from the simulation studies, FGen had about 30% more detection power, by accounting for the within-gene heteroscedasticity, than the other F-like test statistics (F2, F3). The false discovery rate of FGen was slightly higher than that of F2 and F3. FGen-gene and FGen-grp performed similarly to FGen.
Both of Smyth's moderated F-test statistics, without multiple testing adjustment and with FDR set at 5% for multiple testing adjustment, detected more SNPs and indels but at the expense of a greater FDR than FGen.

Discussion

For gene expression analysis, ANOVA models have been a popular modeling technique. Based on ANOVA models, flexible shrinkage F-like test statistics were developed to account for both the within-gene and between-gene heteroscedasticities. The emphasis here is on testing an interaction term, as this case is of increasing interest to biologists, and there is no clear existing theory on the most powerful, valid approach for such statistics. For all F-like statistics studied here, their null distributions were approximated empirically through permutations. Four different permutation procedures were investigated for eight different F-like statistical tests of the interaction term.

As expected, we found that when an error estimator overshrinks, the resulting F-like statistic cannot control the prespecified CWER. For example, FGen-gene is an over-shrinkage error estimator when there is within-gene heteroscedasticity. As a result, compared with generalized shrinkage F-like statistics, it is not valid when within-gene heteroscedasticity exists. Undershrinkage is also important, as it will lead to a conservative test and lower power. This is clearly demonstrated when the common error can be assumed and the most powerful valid test is FGen-grp.

The most striking result was the impact of the permutation procedures. Although this was not completely unexpected [43-45], the effect of the permutation procedures is dramatic and worthy of special attention.
Unrestricted raw data permutation could not control prespecified CWER when there was within-gene heteroscedasticity. Restricted raw data permutation could be used, but it was less powerful than residual permutation. Also consistent with findings from Anderson and Ter Braak [44], restricted permutations are less powerful than unrestricted permutations. However, unrestricted permutations are valid only for a common error and when between-gene heteroscedasticity exists for our proposed shrinkage statistics; they are not valid in combination with F2, F3, or FCui. For FGen-grp, the unrestricted permutation can also be used in cases having within-gene heteroscedasticity, while only FGen is valid with unrestricted permutation in all cases in terms of controlling prespecified CWER. Interestingly, the power gain from using the correct shrinkage target FGen-grp rather than FGen is far less than that of using unrestricted permutation. The result is that F3 is never the most powerful choice when testing an interaction term.

Table 3 Probe sets with significant "line*probe" terms found by F-like test statistics and appropriate residual permutation procedures and Smyth's moderated F-test statistic

Test statistic   Restricted permutation?   Number of probe sets found   True false discovery rate   Power
F2               Yes                       124                          22.6%                       12.4%
F3               Yes                       187                          29.4%                       17.1%
FGen             No                        453                          29.5%                       41.1%
FGen-gene        No                        455                          28.8%                       41.7%
FGen-grp         No                        474                          28.9%                       43.4%
FGen             Yes                       136                          24.3%                       13.3%
FGen-gene        Yes                       122                          22.1%                       12.2%
FGen-grp         Yes                       116                          21.5%                       11.7%
moderated F-1    N/A                       535                          34.1%                       75.5%
moderated F-2    N/A                       813                          34.4%                       68.8%

The CWER was set to 0.05. Gene-specific cutoff values were obtained from 1,000 permutations. "moderated F-1" and "moderated F-2" represent results from using the moderated F statistic without any multiple testing adjustment and setting FDR to 5%.

The correct shrinkage target can lead to the most powerful test statistic. As one of the reviewers suggested, a statistical test may be applied to help pick the best shrinkage target before obtaining shrinkage error estimates. However, this extra testing step may inflate the CWER of the test statistic when there is gene heteroscedasticity. For example, when there are both types of gene heteroscedasticities, it is possible that the above test suggests only within-gene heteroscedasticities exist, and FGen-grp is shown to inflate the CWER. There is minimal penalty to using the shrinkage estimator we propose, so we recommend setting the shrinkage target in the full space spanned by group and gene and using unrestricted permutation to compensate for the possible power loss from having fewer degrees of freedom left for estimating the errors.

Conclusions

The proposed generalized shrinkage F-like statistic with shrinkage targets located in a space spanned by gene and another group, FGen, with unrestricted residual permutation is always valid in terms of having a prespecified CWER. This statistic has reasonable power in most cases; thus, it is generally recommended to be applied to test an interaction term in the analysis of real gene expression data.

Additional material

Additional file 1: SAS program code 1. SAS program code for analyzing the real data set using residual permutation without restriction.
Additional file 2: SAS program code 2. SAS program code for analyzing the real data set using residual permutation with restriction.

List of abbreviations

CWER: comparison-wise error rate; FDR: false discovery rate; indel: insertion and deletion; SNP: single nucleotide polymorphism.

Acknowledgements

We thank Brandon Walts for identifying true SNP positions; Angela J. McArthur and David R. Galloway for their help in scientific editing; and the associate editor and three reviewers for their constructive comments that much improved this manuscript. This research was supported by NIH 1R01GM077618 (McIntyre) and NIH 1R01GM081704 (Casella).
Author details

1 Department of Preventive Medicine, Stony Brook University, Stony Brook, NY 11794, USA. 2 Department of Statistics, University of Florida, Gainesville, FL 32611, USA. 3 Department of Molecular Genetics and Microbiology, University of Florida, Gainesville, FL 32611, USA. 4 The Genetics Institute, University of Florida, Gainesville, FL 32611, USA.

Authors' contributions

All authors contributed to the design of the overall strategy. JY carried out all the analysis and drafted the manuscript. LMM and GC helped to draft and finalize the manuscript. All authors read and approved the final manuscript.

Received: 10 June 2010 Accepted: 1 November 2011 Published: 1 November 2011

References

1. Fodor SPA: Massively parallel genomics. Science 1997, 277:393-395.
2. Schena M, Shalon D, Heller R, Chai A, Brown PO, Davis RW: Parallel human genome analysis: microarray-based expression monitoring of 1000 genes. Proceedings of the National Academy of Sciences USA 1996, 93:10614-10619.
3. Galitski T, Saldanha AJ, Styles CA, Lander ES, Fink GR: Ploidy regulation of gene expression in yeast. Science 1999, 285:251-254.
4. Tu BP, Kudlicki A, Rowicka M, McKnight SL: Logic of the yeast metabolic cycle: Temporal compartmentalization of cellular processes. Science 2005, 310:1152-1158.
5. White KP, Rifkin SA, Hurban P, Hogness DS: Microarray analysis of Drosophila development during metamorphosis. Science 1999, 286:2179-2184.
6. Chabas D, Baranzini SE, Mitchell D, Bernard CCA, Rittling SR, Denhardt DT, Sobel RA, Lock C, Karpuj M, Pedotti R, Heller R, Oksenberg JR, Steinman L: The influence of the proinflammatory cytokine, Osteopontin, on autoimmune demyelinating disease. Science 2001, 294:1731-1735.
7. Sebat J, Lakshmi B, Troge J, Alexander J, Young J, Lundin P, Maner S, Massa H, Walker M, Chi MY, Navin N, Lucito R, Healy J, Hicks J, Ye K, Reiner A, Gilliam TC, Trask B, Patterson N, Zetterberg A, Wigler M: Large-scale copy number polymorphism in the human genome. Science 2005, 305:525-528.
8. Blekhman R, Marioni JC, Zumbo P, Stephens M, Gilad Y: Sex-specific and lineage-specific alternative splicing in primates. Genome Research 2010, 20(2):180-189.
9. Butte A: The use and analysis of microarray data. Nature Reviews 2002, 1:951-960.
10. Churchill GA: Fundamentals of experimental design for cDNA microarrays. Nature Genetics 2002, 32:490-495.
11. Craig BA, Black MA, Doerge RW: Gene expression data: The technology and statistical analysis. Journal of Agricultural, Biological, and Environmental Statistics 2003, 8(1):1-28.
12. Allison DA, Cui X, Page GP, Sabripour M: Microarray data analysis: from disarray to consolidation and consensus. Nature Reviews Genetics 2006, 7:55-65.
13. Kerr MK, Martin M, Churchill GA: Analysis of variance for gene expression microarray data. Journal of Computational Biology 2000, 7:819-837.
14. Kerr MK, Churchill GA: Statistical design and the analysis of gene expression microarrays. Genetical Research 2001, 77:123-128.
15. Kerr MK, Churchill GA: Experimental design for gene expression microarrays. Biostatistics 2001, 2:183-201.
16. Pritchard CC, Hsu L, Delrow J, Nelson PS: Project normal: Defining normal variation in mouse gene expression. Proceedings of the National Academy of Sciences USA 2001, 98:13266-13271.
17. Wolfinger RD, Gibson G, Wolfinger ED, Bennett L, Hamadeh H, Bushel P, Afshari C, Paules RS: Assessing gene significance from cDNA microarray expression data via mixed models. Journal of Computational Biology 2001, 8(6):625-637.
18. Kerr MK, Afshari CA, Bennett L, Bushel P, Martinez J, Walker NJ, Churchill GA: Statistical analysis of a gene expression microarray experiment with replication. Statistica Sinica 2002, 12:203-217.
19. Wu H, Kerr MK, Cui XQ, Churchill GA: MAANOVA: A software package for the analysis of spotted cDNA microarray experiments. In The analysis of gene expression data: methods and software. Springer; 2002, 313-341.
20. Chu T, Weir B, Wolfinger R: A systematic statistical linear modeling approach to oligonucleotide array experiments. Mathematical Biosciences 2002, 176:35-51.
21. Wayne ML, Pan YJ, Nuzhdin SV, McIntyre LM: Additivity and trans-acting effects on gene expression in male Drosophila simulans. Genetics 2004, 168:1413-1420.
22. Cui X, Churchill GA: Statistical tests for differential expression in cDNA microarray experiments. Genome Biology 2003, 4(4):201.


23. Baldi P, Long AD: A Bayesian framework for the analysis of microarray expression data: Regularized t-test and statistical inferences of gene changes. Bioinformatics 2001, 17:509-519.
24. Lönnstedt I, Speed T: Replicated microarray data. Statistica Sinica 2002, 12:31-46.
25. Smyth GK: Linear models and empirical Bayes methods for assessing differential expression in microarray experiments. Statistical Applications in Genetics and Molecular Biology 2004, 3, No. 1, Article 3.
26. Tong TJ, Wang YD: Optimal shrinkage estimation of variances with applications to microarray data analysis. Journal of the American Statistical Association 2007, 102:113-122.
27. Tusher VG, Tibshirani R, Chu G: Significance analysis of microarrays applied to ionizing radiation response. Proceedings of the National Academy of Sciences USA 2001, 98(9):5116-5121.
28. Cui X, Hwang JTG, Qiu J, Blades NJ, Churchill GA: Improved statistical tests for differential gene expression by shrinking variance components estimates. Biostatistics 2005, 6:59-75.
29. Feng S, Wolfinger RD, Chu TM, Gibson GC, McGraw LA: Empirical Bayes analysis of variance component models for microarray data. Journal of Agricultural, Biological & Environmental Statistics 2006, 1113:197-190.
30. Kim SY, Lee JW, Sohn IS: Comparison of various statistical methods for identifying differential gene expression in replicated microarray data. Statistical Methods in Medical Research 2006, 15:3-20.
31. Hwang JTG, Liu P: Optimal tests shrinking both means and variances applicable to microarray data analysis. Preprint 2007-03, Department of Statistics, Iowa State University, IA; 2007.
32. Kizilkaya K, Tempelman RJ: A general approach to mixed effects modeling of residual variances in generalized linear mixed models. Genetics Selection Evolution 2005, 37:31-56.
33. Jaffrezic F, Marot G, Degrelle S, Isabelle H, Foulley JL: A structural mixed model for variances in differential gene expression studies. Genetical Research 2007, 89(1):19-25.
34. Rostoks N, Borevitz JO, Hedley PE, Russell J, Mudie S, Morris J, Cardle L, Marshall DF, Waugh R: Single-feature polymorphism discovery in the barley transcriptome. Genome Biology 2005, 6:R54.
35. Kirst M, Caldo R, Casati P, Tanimoto G, Walbot V, Wise RP, Buckler ES: Genetic diversity contribution to errors in short oligonucleotide microarray analysis. Plant Biotechnology Journal 2006, 4:489-498.
36. Zhang X, Shiu SH, Cal A, Borevitz JO: Global analysis of genetic, epigenetic and transcriptional polymorphisms in Arabidopsis thaliana using whole genome tiling arrays. PLoS Genetics 2008, 4(3):e1000032.
37. Zhang X, Borevitz JO: Global analysis of allele-specific expression in Arabidopsis thaliana. Genetics 2009, 182(4):943-954.
38. McIntyre LM, Bono LM, Genissel A, Westerman R, Junk D, Telonis-Scott M, Harshman L, Wayne ML, Kopp A, Nuzhdin SV: Sex specific expression of alternative transcripts in Drosophila. Genome Biology 2006, 7:R79.
39. Kelly P, Zhou YH, Whitehead J, Stallard N, Bowman C: Sequentially testing for a gene-drug interaction in a genomewide analysis. Statistics in Medicine 2008, 27:2022-2034.
40. Lehmann EL, Casella G: Theory of Point Estimation. 2nd edition. New York: Springer-Verlag; 1998.
41. Pounds S: Computational enhancement of a shrinkage-based analysis of variance F-test proposed for differential gene expression analysis. Biostatistics 2007, 8(3):505-506.
42. Neter J, Wasserman W, Kutner MH: Applied Linear Statistical Models: Regression, Analysis of Variance, and Experimental Designs. 3rd edition. Irwin, Inc; 1990.
43. Edgington ES: Randomization Tests. 3rd edition. New York: Marcel Dekker; 1995.
44. Anderson MJ, Ter Braak CJF: Permutation tests for multi-factorial analysis of anova. Journal of Statistical Computation and Simulation 2003, 73(2):85-113.
45. Churchill GA, Doerge RW: Naive application of permutation testing leads to inflated type I error rates. Genetics 2008, 178:609-610.
46. Nuzhdin SV, Wayne ML, Harmon KL, McIntyre LM: Common pattern of evolution of gene expression level and protein sequence in Drosophila. Molecular Biology and Evolution 2004, 21:1308-1317.

doi:10.1186/1471-2105-12-427
Cite this article as: Yang et al.: Generalized shrinkage F-like statistics for testing an interaction term in gene expression analysis in the presence of heteroscedasticity. BMC Bioinformatics 2011 12:427.