
A Novel Probabilistic Lower Bound on Mutual Information Applied to a Category-Based Framework for Spike Train Discrimination

Permanent Link: http://ufdc.ufl.edu/UFE0041213/00001

Material Information

Title: A Novel Probabilistic Lower Bound on Mutual Information Applied to a Category-Based Framework for Spike Train Discrimination
Physical Description: 1 online resource (93 p.)
Language: english
Creator: Vanderkraats, Nathan
Publisher: University of Florida
Place of Publication: Gainesville, Fla.
Publication Date: 2009

Subjects

Subjects / Keywords: Computer and Information Science and Engineering -- Dissertations, Academic -- UF
Genre: Computer Engineering thesis, Ph.D.
bibliography   ( marcgt )
theses   ( marcgt )
government publication (state, provincial, territorial, dependent)   ( marcgt )
born-digital   ( sobekcm )
Electronic Thesis or Dissertation

Notes

Abstract: We propose a framework by which the information transmission capacity of feedforward networks of spiking neurons can be measured for any discrete set of stimulus classes, or categories, using a statistical classifier to decode stimulus information. Network performance is measured using mutual information (MI), which is estimated using both a naive approach inspired by Fano's inequality and one which leverages an intermediate, real-valued data representation common in many classification techniques. We establish the latter through a result of independent interest: a novel probabilistic lower bound on the MI between a binary and a continuous random variable. The bound is derived from the Dvoretzky, Kiefer, and Wolfowitz inequality for the difference between an empirical cumulative distribution function and the true distribution function. Our framework is demonstrated on a model of the early human auditory system, using a variety of auditory categories and a phenomenological model of a 400-neuron auditory nerve simulated by the Meddis Inner-Hair Cell model. During classification, in addition to the standard spike count representation for a population of spike trains, we establish the utility of other novel feature spaces. We find that basic networks with random synaptic weights are surprisingly effective at transmitting stimulus information.
General Note: In the series University of Florida Digital Collections.
General Note: Includes vita.
Bibliography: Includes bibliographical references.
Source of Description: Description based on online resource; title from PDF title page.
Source of Description: This bibliographic record is available under the Creative Commons CC0 public domain dedication. The University of Florida Libraries, as creator of this bibliographic record, has waived all rights to it worldwide under copyright law, including all related and neighboring rights, to the extent allowed by law.
Statement of Responsibility: by Nathan Vanderkraats.
Thesis: Thesis (Ph.D.)--University of Florida, 2009.
Local: Adviser: Banerjee, Arunava.

Record Information

Source Institution: UFRGP
Rights Management: Applicable rights reserved.
Classification: lcc - LD1780 2009
System ID: UFE0041213:00001


PAGE 1

A NOVEL PROBABILISTIC LOWER BOUND ON MUTUAL INFORMATION APPLIED TO A CATEGORY-BASED FRAMEWORK FOR SPIKE TRAIN DISCRIMINATION

By

NATHAN D. VANDERKRAATS

A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA

2009

PAGE 2

© 2009 Nathan D. VanderKraats

PAGE 3

ACKNOWLEDGMENTS

As adults, I firmly believe that we owe much of our professional success to all those who have helped mold our individuality from childhood forward. In the interest of brevity, I can only acknowledge a select few here. However, I would like to thank all of those who have played a role in my education both as a student and as a human being, of whom I feel fortunate to have had many.

First, I owe thanks to my family, who have shown me love and support from both near and far: my mom, who overwhelms me with pride and love as only a mother can; my dad, who has been a source of wisdom and encouragement; my grandparents, Ernest and Lilian Hainz, and Frank and Virginia VanderKraats; my little sisters Amy (Sissy) and Megan (Squirt) and their families, in particular my niece Hannah and nephews Cameron and Chase, who have brightened my life; and certainly not least of all Monica, my loving girlfriend, closest friend and confidante for the past four years. Monica jokes that she has earned an honorary associate's degree for listening to my many computer science stories, and I have no doubt she deserves it.

The foundations of my education were well laid by many exceptional elementary and secondary educators, of whom I can only name a few: my music teacher, Lisa Finch, who encouraged me throughout high school; Mike Huskey, an English teacher who took his calling so seriously he sought me out to give me advice before I had even taken a class with him; Harold Hagan, who demonstrated the fun side of mathematics; Larry Wells, my high school principal, who went out of his way for me on several occasions; and Maxine Jinkerson, my middle school gifted teacher who, in addition to teaching me to think creatively in life, introduced me to BASIC, my first programming language.

I am also grateful to my labmates and friends who have contributed professionally to my studies by lending their ears and their advice: Alex Singh, Alina Zare, Ami Gates,

PAGE 4

Jason Chi, Jeremy Bolton, Karthik Gurumoorthy, Neko Fisher, Olivia Prosper, Seth Spain, Subho Sengupta, and Venkat Ramaswamy.

Professionally, I would like to acknowledge the University of Florida High-Performance Computing Center for providing computational resources and support that have contributed to this study.

Finally, I would like to thank my supervisory committee and the faculty of the Computer and Information Science and Engineering department, who displayed continuous diligence and skill throughout my graduate studies.

In particular, I owe a singular debt of gratitude to Arunava Banerjee, my graduate adviser, who is as unique and compelling an individual as I will ever meet in my lifetime. As an adviser, Arunava displayed unending patience as I struggled through the ups and downs of earning a PhD. If ever I was discouraged by a negative result or felt pressured by the demands of graduate life, a meeting with Arunava was the cure. After such an appointment, my ego was so boosted that I often felt as if I was on the verge of curing cancer. As a teacher, Arunava is unrivaled, possessing a combination of encyclopedic knowledge and the ability to distill and disseminate that knowledge instantly to any audience. A student once asked him a deep question about a mathematical homework problem. As his response unfolded, he excitedly transitioned into an impromptu 30-minute lecture on Einstein's theories of relativity! As a mentor, his support extended even beyond the academic realm. When I sought to buy a motorcycle early in graduate school, he sold me the bike he had used as a student for far less than its value, on the one condition that "it must be loved." Whenever the seldom-defeated Florida football team took the field, Arunava was quick to bet a pitcher of beer against the home team: in his own way, a thinly-veiled student stimulus package. In short, I cannot imagine completing my doctoral studies without Arunava's aid, and for that I am forever in his debt.

PAGE 5

TABLE OF CONTENTS

                                                                        page
ACKNOWLEDGMENTS .................................................. 3
LIST OF TABLES .................................................... 7
LIST OF FIGURES ................................................... 8
ABSTRACT ......................................................... 9

CHAPTER

1 BACKGROUND MATERIAL ............................................ 10
  1.1 Introduction ................................................ 10
  1.2 Neuroscience Concepts ....................................... 12
    1.2.1 Basic Neural Dynamics ................................... 12
    1.2.2 Functional Brain Regions ................................ 13
    1.2.3 Neural Modeling ......................................... 14
  1.3 Auditory System Overview .................................... 16
  1.4 Meddis Inner-Hair Cell Model ................................ 17
  1.5 Machine Learning ............................................ 19
    1.5.1 Support Vector Machines ................................. 19
    1.5.2 Information Theory ...................................... 21

2 A NOVEL FRAMEWORK FOR ANALYZING SPIKING NETWORKS ............... 23
  2.1 Introduction ................................................ 23
    2.1.1 Mutual Information ...................................... 23
    2.1.2 Dimensionality Reduction through Classification ......... 24
    2.1.3 Evaluating Feedforward Networks ......................... 25
  2.2 Methods ..................................................... 26
    2.2.1 Modeling the Auditory Periphery ......................... 26
    2.2.2 Measuring Information Transmission Over a Network ....... 27
    2.2.3 Stimulus Categorization ................................. 29
    2.2.4 Description of Auditory Categories ...................... 31
    2.2.5 Classification of Spike Train Responses ................. 33
      2.2.5.1 Prior work in spike train classification ............ 33
      2.2.5.2 Novel feature spaces for spike train populations .... 34
      2.2.5.3 Support vector machine classifier ................... 36
    2.2.6 Spiking Neuron Model .................................... 36

3 A PROBABILISTIC LOWER BOUND ON MI .............................. 37
  3.1 Classification and MI ....................................... 37
    3.1.1 Naïve Connection: The Fano Method ....................... 37
    3.1.2 A Better MI Bound ....................................... 38

PAGE 6

  3.2 A Novel Lower Bound on Mutual Information ................... 39
    3.2.1 The Tube-Unconstrained Solution ......................... 41
    3.2.2 A Discrete Formulation .................................. 44
    3.2.3 Constraints on the Distribution Functions ............... 46
    3.2.4 KKT Conditions for the General Problem .................. 47
    3.2.5 A Constructive Algorithm for the General Solution ....... 49
    3.2.6 The Existence of a Tube-Inactive Solution ............... 51
    3.2.7 The Extended Ratio Satisfiability Problem ............... 54
    3.2.8 The Nonexistence of a Tube-Inactive Solution ............ 56
    3.2.9 Finding the Rightmost Tight Statistic, z_m .............. 57

4 RESULTS ........................................................ 62
  4.1 Independent Evaluation of the MI Bound ...................... 62
  4.2 Baseline Results ............................................ 65
  4.3 Network Results ............................................. 66
    4.3.1 Architecture Descriptions ............................... 66
    4.3.2 One-Layer Network Results ............................... 68
    4.3.3 Two-Layer Network Results ............................... 70

5 DISCUSSION ..................................................... 83
  5.1 Evaluating Network Performance and Our Framework ............ 83
  5.2 Significance of Our Tighter Lower Bound on MI ............... 84
  5.3 Concluding Remarks .......................................... 84

APPENDIX: METHOD SPECIFICS ....................................... 86

REFERENCES ....................................................... 88

BIOGRAPHICAL SKETCH .............................................. 93

PAGE 7

LIST OF TABLES

Table                                                               page

4-1  Baseline results for pitch discrimination ..................... 71
4-2  Baseline results for instrument recognition ................... 72
4-3  Baseline results for instrument recognition ................... 73
4-4  Baseline results for up/down frequency sweeps ................. 73
4-5  Baseline results for noisy pitches and noisy mixtures ......... 74
4-6  Single-layer fwdA2A results for pitches and mixtures .......... 75
4-7  Single-layer fwdA2A results for sweeps ........................ 76
4-8  Single-layer fwdA2A-1nbrA2A results for pitches and mixtures .. 77
4-9  Single-layer fwdA2A-1nbrA2A results for sweeps ................ 78
4-10 Two-layer fwdA2A results for pitches .......................... 78
4-11 Two-layer fwdA2A results for mixtures ......................... 79
4-12 Two-layer fwdA2A results for sweeps ........................... 79
4-13 Two-layer fwdA2A-1nbrA2A results for pitches .................. 80
4-14 Two-layer fwdA2A-1nbrA2A results for mixtures ................. 81
4-15 Two-layer fwdA2A-1nbrA2A results for sweeps ................... 82
A-1  Meddis model parameters for 40 center frequency groups ........ 87

PAGE 8

LIST OF FIGURES

Figure                                                              page

1-1 Simplified drawing of a neuron ................................. 11
1-2 Basic neural dynamics for a single neuron with two inputs ...... 11
2-1 Sound stimuli and baseline AN response for several instruments and sweeps . 28
2-2 Sound stimuli and baseline AN response for two instrument recognition tasks . 29
2-3 Outline of experimental procedure for a sample task ............ 30
2-4 Relative harmonics for each type of harmonic tone (instrument) . 31
2-5 Visualization of a spike train projected into a feature space .. 35
3-1 Sample distributions for two input classes and their DKW tubes . 41
3-2 Relaxation of the DKW tubes .................................... 45
3-3 Visual depiction of the pinning algorithm ...................... 50
4-1 MI results for normally-distributed real-valued class-conditional input data . 63
4-2 MI results for gamma-distributed real-valued class-conditional input data . 63
4-3 MI lower bound estimates as a function of sample size .......... 64
4-4 Two-layer versions of the feedforward networks ................. 67

PAGE 9

Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

A NOVEL PROBABILISTIC LOWER BOUND ON MUTUAL INFORMATION APPLIED TO A CATEGORY-BASED FRAMEWORK FOR SPIKE TRAIN DISCRIMINATION

By

Nathan D. VanderKraats

December 2009

Chair: Arunava Banerjee
Major: Computer Engineering

We propose a framework by which the information transmission capacity of feedforward networks of spiking neurons can be measured for any discrete set of stimulus classes, or categories, using a statistical classifier to decode stimulus information. Network performance is measured using mutual information (MI), which is estimated using both a naïve approach inspired by Fano's inequality and one which leverages an intermediate, real-valued data representation common in many classification techniques. We establish the latter through a result of independent interest: a novel probabilistic lower bound on the MI between a binary and a continuous random variable. The bound is derived from the Dvoretzky, Kiefer, and Wolfowitz inequality for the difference between an empirical cumulative distribution function and the true distribution function.

Our framework is demonstrated on a model of the early human auditory system, using a variety of auditory categories and a phenomenological model of a 400-neuron auditory nerve simulated by the Meddis Inner-Hair Cell model. During classification, in addition to the standard spike count representation for a population of spike trains, we establish the utility of other novel feature spaces. We find that basic networks with random synaptic weights are surprisingly effective at transmitting stimulus information.
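The Dvoretzky-Kiefer-Wolfowitz inequality cited above gives a distribution-free confidence tube around an empirical CDF: with probability at least 1 − δ, the true CDF lies within ε = √(ln(2/δ)/(2n)) of the empirical one, uniformly in x. The following sketch of that tube is my own illustration, not the dissertation's construction; the Gaussian sample and δ = 0.05 are invented for the example.

```python
import math
import random

def dkw_epsilon(n: int, delta: float) -> float:
    """Half-width of the DKW confidence band: with probability >= 1 - delta,
    sup_x |F_n(x) - F(x)| <= epsilon for the true CDF F."""
    return math.sqrt(math.log(2.0 / delta) / (2.0 * n))

def empirical_cdf(samples):
    """Return a function computing the empirical CDF of the samples."""
    xs = sorted(samples)
    n = len(xs)
    def F_n(x):
        # fraction of samples <= x, via linear scan (fine for a sketch)
        return sum(1 for s in xs if s <= x) / n
    return F_n

random.seed(0)
data = [random.gauss(0.0, 1.0) for _ in range(1000)]
F_n = empirical_cdf(data)
eps = dkw_epsilon(len(data), delta=0.05)
# The DKW "tube" [F_n(x) - eps, F_n(x) + eps] contains F(x) for all x
# simultaneously, with probability at least 0.95.
```

With n = 1000 and δ = 0.05 the tube half-width is about 0.043, which is the kind of uniform envelope the dissertation's "DKW tubes" place around each class-conditional distribution.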

PAGE 10

CHAPTER 1
BACKGROUND MATERIAL

1.1 Introduction

The burgeoning field of computational neuroscience represents an exciting intersection of biology, mathematics, and computer modeling. From a biological perspective, computational and theoretical techniques allow the brain to be studied from vantage points that were inaccessible even a decade ago. From a computer science viewpoint, understanding the brain means shedding light on a system that is successful at many of the most difficult computational tasks, such as pattern recognition.

Our study has two central themes. First, we introduce a novel framework for assessing the capabilities of simulated spiking neural networks. In order to accurately model subsystems of the brain, the ability to compare and contrast specific networks is imperative. Our approach uses statistical classification, along with a variety of feature space representations of the neural data, to decode information about natural stimuli. We test this methodology by building a model of the early human auditory system, founded upon a phenomenological model of a 400-neuron auditory nerve simulated by the Meddis Inner-Hair Cell model. For several natural auditory categories, the system is shown to extract high amounts of stimulus information from feedforward networks containing 400 neurons per layer.

In addition, to tighten the results provided by this framework, we describe a significant advancement in measuring the amount of stimulus information in a neural population's response. Using the intermediate information generated by many kinds of statistical classifiers, a novel mathematical bound is derived on the relationship between the stimuli and the downstream response. The bound is shown through information theory to be an improvement over the estimates given by classification results alone. Our result is generalized to any binary classification problem, and so is of independent interest to the field of machine learning.
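The "estimates given by classification results alone" are, for two categories, obtained through Fano's inequality (the naïve method of Chapter 3): a classifier with error rate p_err certifies at least H(S) − H_b(p_err) bits of mutual information between the binary stimulus class S and the response it decoded. The function below is my own minimal rendering of that standard bound, assuming equal class priors; the 10% error rate is an invented example.

```python
import math

def h_binary(p):
    """Binary entropy H_b(p) in bits."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)

def fano_mi_lower_bound(p_err, p_class1=0.5):
    """For a binary stimulus class S decoded with error rate p_err, Fano's
    inequality gives H(S | decoded label) <= H_b(p_err), so by the data
    processing inequality I(S; response) >= H(S) - H_b(p_err) bits."""
    return max(0.0, h_binary(p_class1) - h_binary(p_err))

mi_lb = fano_mi_lower_bound(0.1)  # a 10%-error classifier certifies ~0.53 bits
```

Note how weak this can be: a chance-level classifier (p_err = 0.5) certifies zero bits, which is part of the motivation for the tighter DKW-based bound developed in Chapter 3.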

PAGE 11

[Figure 1-1. Simplified drawing of a neuron, labeling the axon, myelin sheath, soma, axon terminals, dendrites, and a synapse.]

[Figure 1-2. Basic neural dynamics for a single neuron with two inputs, showing input spike trains In1 and In2, the membrane potential relative to the resting potential and threshold voltage, and the output spike train Out.]

PAGE 12

1.2 Neuroscience Concepts

1.2.1 Basic Neural Dynamics

The primary computational unit of the brain is the neuron (Squire et al. 2003; Bear et al. 2007). The average human brain contains 100 billion of these cells, surrounded by supporting glial cells. A simplified anatomical drawing of a neuron is shown in Figure 1-1. Several notable constituents include the axon, a long tendril through which the neuron communicates to other neurons; dendrites, spiderlike filaments that collect information from other neurons; synapses, the junctions between a neuron's axon and other neurons' dendrites; and the soma, the cell body of the neuron. A human brain is a dense network of roughly 100 trillion interneural connections (Squire et al. 2003).

To transmit information, a neuron relies on action potentials, commonly called spikes, that propagate across the neural fibers like waves traversing a taut rope. Owing to differences in the ionic concentrations inside and outside the soma, the neuron has an intrinsic voltage known as the membrane potential. As incoming spikes reach the soma, the membrane potential of the neuron rises. If the potential crosses a particular threshold voltage, the neuron undergoes a series of cellular changes and produces a spike. For a brief time after spiking, named the absolute refractory period, the neuron is unable to produce a spike due to the resetting of the internal ionic balance within the soma. Following the absolute refractory period, the neuron enters a relative refractory period wherein the neuron's ability to spike is drastically reduced. This basic spiking process is illustrated in Figure 1-2.

While the previous summary is cursory, a few details are necessary. First, for a particular neuron, all outgoing spikes have identical, or stereotyped, waveforms, regardless of time or the speed at which the neuron was driven to threshold. The effects of a single incoming spike on a neuron manifest as a quick rise in the membrane potential, followed by an exponential decay back to the resting potential. Because of this

PAGE 13

exponentially-dropping influence, the effects of a spike after a certain period of time can be considered negligible (Banerjee 2001).

Another relevant neural feature is that the magnitude of the effect of one neuron's spike on another neuron is scaled by the synaptic weight of the synapse that connects them. Over time, a neuron's synaptic weights can change, giving greater or lesser importance to its various inputs. These alterations, known as synaptic plasticity, are critical to the formation of memories.

1.2.2 Functional Brain Regions

On a high level, the vertebrate brain (Squire et al. 2003; Bear et al. 2007) can be partitioned into many gross functional areas. For example, sensory systems, such as the visual, auditory, and somatosensory areas, translate the real-world stimuli encountered by an organism into neural signals. The cerebral cortex, commonly termed just the cortex, refers anatomically to the outer layer of the cerebrum. Functionally, the cortex performs much of the high-level processing of the brain, integrating sensory signals as well as being the source of abstract thought.

Different areas of the brain vary in terms of their structural characteristics. For instance, the auditory pathway, a sensory system, transmits auditory information from the ear to the auditory cortex (Altschuler et al. 1991). Broadly, this system can be thought of as a series of neuron clusters that form an information conduit, allowing communication up and down the pathway. As information moves up the pathway, certain spatial features of the signal are preserved, indicating a degree of organization within the clusters. In contrast, cortical areas typically integrate information from many other parts of the brain, making the function of specific cortical neurons much more difficult to understand (Squire et al. 2003). In cortical systems, it has often been observed that spiking rates adhere roughly to Poisson statistics (Rieke et al. 1997). Whether this variability between spikes is used to communicate information, or is simply a product of

PAGE 14

stochastic processes in the brain, has been an issue of much debate (Dayan & Abbott 2001; Mainen & Sejnowski 1995).

1.2.3 Neural Modeling

When modeling a system as complex as the neuron, a balance must be struck between computational efficiency and the completeness of the model. With too many details, simulation of large systems quickly becomes infeasible. With too few, the model's dissimilarity to the brain impedes its usefulness. Consequently, a plethora of neural models are seen throughout the field.

In 1952, Alan Hodgkin and Andrew Huxley proposed a set of ordinary differential equations (ODE) that model the movements of key ions across a neuron's cell membrane (Dayan & Abbott 2001; Gerstner & Kistler 2002). Known as the Hodgkin-Huxley model, this description was a landmark work capable of simulating neural dynamics to a high degree of accuracy.

Although the Hodgkin-Huxley model is quite accurate, the simulation of the set of nonlinear ODEs is computationally prohibitive for experiments involving large neural populations. Due to the stereotyped nature of spikes, a simplified neuron model, termed the leaky or passive integrate-and-fire model, is frequently used in the field. Rather than modeling all of the ion flows across the membrane individually, this model reduces the neuron to a basic circuit containing a resistor and capacitor in parallel driven by a time-varying current (Gerstner & Kistler 2002; Dayan & Abbott 2001):

    τ_m dV/dt = −V(t) + R I(t)

where τ_m is the membrane time constant of the neuron, which affects the speed at which the membrane potential resets to threshold, V(t) denotes the membrane potential, θ denotes the threshold voltage, R is the membrane resistance, and I(t) is the time-dependent current through the membrane. Notably, the leaky integrate-and-fire model does not model the action potentials directly, only the effects of incoming

PAGE 15

spikes on the membrane potential. When a neuron's membrane potential reaches its threshold, a spike must be added to the model and the membrane potential reset to its resting potential. Furthermore, all refractory effects must be interjected into the model supplementally.

Beyond integrate-and-fire, one additional simplification aids the computational efficiency of the neuron model tremendously. Rather than explicitly modeling the neural circuit, one can observe that the membrane potential's response to a stereotyped spike can be described as a function of a few fixed synaptic parameters, assuming that each spike makes an additive contribution to the total potential. The spike response model (SRM) views the membrane potential as a function of only the times of the spikes in the system (MacGregor & Lewis 1977; Gerstner & Kistler 2002). One version of the SRM describes the effect of an inbound spike, called the post-synaptic potential (PSP), as:

    PSP(t, Q, d) = (Q / (d √t)) e^(−α d²/t) e^(−t/β)

where t is the time elapsed since the spike occurred, Q is the synaptic weight, d is the distance (in dimensionless units) of the synapse from the soma, α (in ms) controls the rate of PSP rise, and β (in ms) determines how quickly it falls.

To enforce refractory periods after a spike occurs, an additional term, called the after-hyperpolarization (AHP), provides a large negative term that decays exponentially. The AHP effects are determined completely by the time since the departure of each output spike:

    AHP(t) = −R e^(−t/τ_r)

where t again denotes elapsed time, R determines the maximum drop in potential after a spike, and τ_r governs the rate at which the AHP effect subsides.

Thus for the SRM model, each contribution to a neuron's membrane potential has a simple closed-form solution. The overall membrane potential, then, is the sum of the individual contributions. As with the integrate-and-fire model, spikes must be generated

PAGE 16

explicitly when the potential crosses its threshold. To circumvent both the practical concern that an unbounded number of spikes must be tracked for a long experiment, as well as the theoretical impossibility of simulating an infinite spiking history for the neuron, one can note that the PSP and AHP effects decay exponentially with time. Therefore, as previously alluded, a moving time window can be created for any simulation, where all spikes outside the window are deemed irrelevant. Importantly, threshold models such as these have been shown to have equivalent power to the general Hodgkin-Huxley model, given suitable parameter choices (Kistler et al. 1997).

1.3 Auditory System Overview

In the mammalian auditory system (Squire et al. 2003; Altschuler et al. 1991; Moore 2004; Bear et al. 2007), arriving sound is translated by the inner ear from air pressure waves into neuron action potentials. The bulk of this encoding is handled by the cochlea, which is the familiar shell-shaped apparatus located just behind the tympanic membrane, or eardrum. Within the cochlea lies a long, coiled membrane known as the basilar membrane (BM). The BM is thin and stiff at its base, becoming wider and less rigid as it nears its apex at the center of the coil. Sound entering the base of the BM produces a wave through the membrane, peaking maximally at a location determined by the sound's frequency. As a result, the cochlea is sometimes referred to as a simple Fourier discriminator, separating a sound by its component frequencies into a spatial place code. This frequency separation scheme, called tonotopy, is preserved through many stages of the auditory pathway.

Along the length of the BM runs the Organ of Corti, which contains hair cells that are deflected according to the motion of the BM. There are two types of hair cells: inner and outer. Inner hair cells (IHC) do the work of encoding sounds into spike trains, producing spikes in response to the motion of their hairs, or stereocilia. Outer hair cells (OHC) provide feedback to the cochlea, functioning as a cochlear amplifier that adjusts the magnitude of the physical response of the BM to future sounds. The hair cells make

PAGE 17

connections to dendrites of the spiral ganglion (SG), the axons of which are known collectively as the auditory nerve. The innervation of the SG is not balanced between the two hair cell groups. Each IHC connects to approximately 10-20 unique SG cells, while many OHCs combine to innervate a single SG neuron. Afterwards, the auditory nerve fibers (ANF) project to higher levels of the auditory pathway. It is important to note that all of the incoming sound information passed along to the brain must be present in the auditory nerve.

As a result of the tonotopy inherited from the BM, each IHC, and similarly each ANF, is said to have a certain best frequency (BF). Thus, for the presentation of pure-tone, or single-frequency, stimuli, each ANF responds most strongly to tones at or near its BF. The farther a stimulus gets from this BF, the more the fiber's reaction decays. It should be pointed out that a strong response is not defined simply as the spike rate of the ANF. Rather, there is a significant temporal component to the ANF response. This feature is most easily demonstrated in fibers with low BFs, where a fiber's output becomes phase-locked to the stimulus frequency near the BF. Phase-locking, as the name implies, means that the ANF only transmits spikes near the period of the stimulus frequency.

In addition to its BF, each ANF may be classified by its spontaneous rate (SR). SR denotes a fiber's mean spike rate in the absence of input from its IHCs. Typically, ANFs are divided into three SR groups: high SR, between 18 and 250 Hz, roughly 61% of all ANFs; medium SR, between 0.5 and 18 Hz, around 23% of ANFs; and low SR, less than 0.5 Hz, accounting for 16% of the fibers. High SR fibers, the most common type, have a low threshold for sound intensity. Conversely, the smaller number of low SR fibers have a much higher threshold, and are thus tuned for higher intensity sounds.

1.4 Meddis Inner-Hair Cell Model

Of the many computational models that have been made of the auditory periphery, the Meddis Inner-Hair Cell model is perhaps the most well-known (Meddis 1986,

PAGE 18

1988; Lopez-Poveda et al. 2001; Sumner et al. 2002). The Meddis model accurately describes the BM's motion in response to arbitrary sounds, then simulates the ion and neurotransmitter flow within the hair cells to produce physiologically plausible trains of spikes. Over the years, the simulator has been expanded to fit experimental data from an array of organisms. It has a large degree of parametric flexibility, enabling variation at many stages. The model replicates the behavior of the IHCs only, ignoring the feedback of the under-studied OHCs. However, the effects of the cochlear amplifier on the IHC outputs are encapsulated into the system using nonlinearities in the early processing steps. Therefore, the model is able to match physiological data at any desired dynamic range. Meddis's lab keeps an up-to-date implementation of his model online (Univ. of Essex Dept. of Psychology and Univ. of Cambridge Physiology Dept. 2009).

The first stages of sound processing, from the ear through the cochlea, are implemented in the Meddis model as a series of filters. First, to reproduce the effects of sound passing through the outer and middle ears, a small number of bandpass filters are instituted. Next, the frequency segregation of the cochlea is emulated by way of a bank of auditory filters, termed the cochlear filterbank. Each filter in this bank corresponds to a position on the BM, which will in turn become the optimal frequency for a corresponding IHC. The filterbank contains dual-resonance-nonlinear (DRNL) filters. Each filter consists of two parallel paths, one of which is linear, and the other nonlinear. As mentioned above, the nonlinear component in the DRNL filterbank is necessary to reproduce a variety of nonlinear experimental results that arise naturally due to the cochlear amplifier. For instance, the responses of the low and medium SR ANFs are very sensitive to nonlinearities in the BM response. When the signal reaches this point, it has become a set of BM velocities for each desired BF.

In a biological ear, the stereocilia in each IHC can move in one of two directions in response to the motion of the BM. In one direction, the movement triggers a cascading

PAGE 19

reaction: potassium flows inward into the cell, which causes channels to open and calcium to enter, which in turn causes neurotransmitter to be released by the cell into the synaptic cleft. When BM motion pulls the stereocilia in the other direction, potassium flow, and consequently neurotransmitter release, is shut off. Because this physical process has been extensively studied, the Meddis model simulates it directly. Next, the neurotransmitter's activation of the postsynaptic dendrite is modeled probabilistically, using experimental evidence to choose its parameters. This step also encompasses the ANF's refractory period and contains some uncertainty in the details of neurotransmitter processing in synaptic clefts. Consequently, refining neurotransmitter dynamics has been a main focus in latter revisions to the model.

The Meddis model was originally designed to be truthful to a variety of well-known auditory properties, including rate-intensity, rapid and short-term adaptation, phase-locking, additivity, and recovery (Meddis 1986). Later versions also show accordance with spike statistics on the auditory nerve (Sumner et al. 2002). Additionally, the human parameters of the model were derived directly from existing psychophysical data (Lopez-Poveda et al. 2001).

1.5 Machine Learning

Our study relies on several concepts from machine learning, two of which are surveyed here.

1.5.1 Support Vector Machines

Statistical classification methods categorize high-dimensional data into a small number of classes. Classifiers learn how to make these assignments by observing a training set of labeled data points, after which the trained model can be applied to unseen examples. Data to be classified must be represented as precise, fixed-dimensional points, where each dimension is referred to as a feature and the resulting space is termed the feature space.
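The fixed-dimensional representation just described is how spike-train responses enter the classifiers in this work; the abstract's "standard spike count representation" uses one feature per neuron. The sketch below is my own illustration of that idea, with invented spike times.

```python
# Turning a population of spike trains into one fixed-dimensional feature
# vector, as required by the classifiers described above. The spike-count
# representation (one feature per neuron) is the standard one mentioned in
# the abstract; the spike times here are invented for illustration.

def spike_count_features(population, t_start, t_end):
    """population: list of per-neuron spike-time lists (ms). Returns one
    point whose k-th feature is neuron k's spike count in the window."""
    return [sum(1 for t in spikes if t_start <= t < t_end)
            for spikes in population]

population = [
    [2.0, 15.5, 40.1, 88.0],    # neuron 1
    [],                         # neuron 2 (silent)
    [10.0, 12.0, 99.9, 130.0],  # neuron 3
]
x = spike_count_features(population, 0.0, 100.0)  # -> [4, 0, 3]
```

Every response now lives in a space with one dimension per neuron (400 for the auditory nerve model used here), so a standard classifier can be trained on labeled responses directly.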


Originally developed in the late 1970s (Vapnik 1999), support vector machine (SVM) classifiers have become popular in recent years due to their flexibility and ease of training. In its simplest form, an SVM classifier draws a linear hyperplane through a feature space, partitioning data points into two classes. The hyperplane is represented by a weight vector w that is augmented by the bias. In addition to maximizing its classification accuracy, the SVM attempts to maximize the distance, or margin, between the borderline data points, called support vectors, and the hyperplane. With the concept of margin, an SVM has a natural way of dealing with ambiguous or misclassified training points. A soft margin allows the classifier to make a trade-off between the margin maximization and the amount of training error, effectively allowing some mistakes in order to improve the resulting hyperplane (Burges 1998; Duda et al. 2001). The degree of softness is controlled by a real parameter C. Thus, given n data points in the training set, the SVM solves a convex optimization problem (Boyd & Vandenberghe 2004) with respect to the augmented weight vector w, a vector of n Lagrange multipliers \alpha representing correctly-classified points, and a vector of n slack variables \xi for potential misclassifications. The problem can be solved using the dual formulation, which conveniently does not depend on the slack variables:

    L(\alpha) = \sum_{k=1}^{n} \alpha_k - \frac{1}{2} \sum_{k=1}^{n} \sum_{j=1}^{n} \alpha_k \alpha_j z_k z_j \, y_j^T y_k

where, for all k = 1, ..., n, y_k denotes the k-th data point (augmented with the bias), z_k = \pm 1 denotes the true class label of y_k, and each multiplier \alpha_k is constrained by 0 \leq \alpha_k \leq C and \sum_{k=1}^{n} z_k \alpha_k = 0 (Burges 1998; Duda et al. 2001). Once solved, the optimal hyperplane is given by:

    w = \sum_{k=1}^{n} \alpha_k z_k y_k

where the data points y_k for which \alpha_k \neq 0 are the support vectors (Burges 1998).

Additionally, SVMs can easily be extended into nonlinear classifiers using the so-called kernel trick (Burges 1998). Essentially, a feature space R^d is projected into a


higher dimensional space in which the SVM can perform a linear classification, using a mapping \Phi. Potential mappings are chosen such that there exists a function K where, for all i, j:

    K(y_i, y_j) = \Phi(y_i) \cdot \Phi(y_j)

The kernel is then substituted for the dot product y_j^T y_k in the dual formulation above. Many useful kernel functions have been discovered. The linear SVM is used primarily in this proposal, although a polynomial kernel is occasionally necessary as well. The polynomial kernel is given by:

    K(y_i, y_j) = (y_i \cdot y_j + 1)^d

where d is the degree of the polynomial.

1.5.2 Information Theory

Information theory, pioneered by Claude Shannon in the late 1940s (Shannon 1948; Cover & Thomas 2006), is a widely-applied methodology concerned with quantifying the amount of information present in random variables. The fundamental concept of information theory is entropy, or Shannon entropy, which bears many similarities to the entropy of thermodynamics. Entropy measures the amount of uncertainty in a random variable X, or equivalently, the expected amount of information contained in an observation of X. For instance, consider observing a binary random variable X with true probabilities P(X=0) = 0.999 and P(X=1) = 0.001. Since, for any trial, the outcome is very likely to be a 0, the uncertainty of X is low. On the other hand, consider a random variable Y with a discrete number of equally-likely outcomes, such as the roll of a fair die. In this case, the uncertainty of Y cannot be higher, and the entropy of Y is maximal. The entropy of any discrete random variable X with n outcomes {x_i : i = 1 ... n} is given by:

    H(X) = -\sum_{i=1}^{n} p(x_i) \log_2 p(x_i)


where p(x_i) denotes the probability of outcome x_i and H(X) is measured in bits.

Mutual information (MI) is a measurement of the correlation between two random variables using the notion of entropy. Conceptually, for a discrete random variable X, MI quantifies how much the uncertainty in X is reduced by knowing the value of another random variable Y. The MI between X and Y is defined as:

    I(X; Y) = H(X) - H(X | Y)

where H(X | Y) denotes the conditional entropy of X given Y; that is, how much entropy is present in X given that Y is known.

Alternatively, the MI can be thought of as describing how far the joint distribution p(X=x, Y=y) differs from the product of the marginals, p(X=x) p(Y=y). Recall that p(X=x) p(Y=y) = p(X=x, Y=y) if and only if X and Y are independent. MI can be equivalently formulated as:

    I(X; Y) = \sum_{x,y} p(X=x, Y=y) \log \frac{p(X=x, Y=y)}{p(X=x) \, p(Y=y)}

which exactly describes the difference between the joint distribution and the product of the marginals in terms of the Kullback-Leibler (K-L) divergence, a natural measure of distance between two probability distributions.¹ For noisy communication channels, such as the feedforward networks of spiking neurons considered throughout our study, MI provides a natural platform for discussing how much information the channel's output contains about its input.

¹ Strictly speaking, the K-L divergence is not a true distance metric, since it does not satisfy the triangle inequality.
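These definitions are easy to check on small discrete examples. The sketch below computes entropy and MI directly from the formulas above; the function names and the toy channels are our own illustrations, not part of the dissertation's experiments:

```python
import math

def entropy(probs):
    """Shannon entropy, in bits, of a discrete distribution given as probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def mutual_information(joint):
    """MI in bits from a joint distribution given as a dict {(x, y): probability},
    using the K-L form: sum of p(x,y) * log2(p(x,y) / (p(x) p(y)))."""
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p
    return sum(p * math.log2(p / (px[x] * py[y]))
               for (x, y), p in joint.items() if p > 0)

# The nearly deterministic variable from the text has very low entropy ...
low = entropy([0.999, 0.001])
# ... while a fair die maximizes it at log2(6) bits.
high = entropy([1 / 6] * 6)

# Independent X and Y: joint equals the product of marginals, so MI is 0.
indep = {(x, y): 0.25 for x in (0, 1) for y in (0, 1)}
# A noiseless binary channel (Y = X): the full 1 bit is transmitted.
copy = {(0, 0): 0.5, (1, 1): 0.5}
```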


CHAPTER 2
A NOVEL FRAMEWORK FOR ANALYZING LARGE NETWORKS OF SPIKING NEURONS

2.1 Introduction

In the study of sensory systems, a common experimental setup involves presenting stimuli drawn from a finite number of classes, reading the spike train responses from neurons at some point along the sensory pathway, and then determining how to decode the responses to predict the class of each stimulus (Rieke et al. 1997; Georgopoulos et al. 1982; Lee et al. 2009; Warland et al. 1997; Theunissen et al. 2000; Miller et al. 2002). Typically, the early sensory pathways serve as feedforward networks, propagating information from the sense organs to higher parts of the brain (Squire et al. 2003). While the encoding of information in such situations has been researched extensively (Rieke et al. 1997; Nirenberg et al. 2006; Pouget et al. 2000; Diesmann et al. 1999; Gollisch & Meister 2008; Shadlen & Newsome 1998; Gerstner & Kistler 2002), less focus has been given to the role of the network architectures, the specific interneural connections and synaptic weights, at communicating information. Assessing and comparing architectures is critical for understanding information flow in the brain, and for building realistic models of neural systems. For comparing the performance of various feedforward networks at transmitting information about specific sensory tasks, mutual information (MI) (Shannon 1948; Cover & Thomas 2006) is an ideal metric, offering a precise quantification of the information shared between the stimuli and their responses.

2.1.1 Mutual Information

For small spiking networks, transmissive ability can be assayed through direct estimation of the MI between the stimuli and the spike train responses on the network's output layer (Strong et al. 1998; Shlens et al. 2007; Kennel et al. 2005; Wolpert & Wolf 1995; Miller 1955; Paninski 2003; Pola et al. 2005; Montemurro et al. 2007). These sorts of computations ultimately rely on the construction of probability distributions


that depend on both the input (stimulus) and output (response). Unfortunately, MI is prohibitively difficult to calculate for large neural populations, since the size of the response sample space (i.e., the set of all outcomes) scales exponentially with the size of the population (Paninski 2003; Strong et al. 1998). For example, assuming time binning of the spike train responses with a response length of 40 ms and a generous bin size of 4 ms, a population of 400 neurons corresponds to an astronomical 2^{4000} \approx 1.3 \times 10^{1204} possible spike trains.

One prominent strategy for handling large populations is to avoid the scaling problem by introducing assumptions on the encoding of information within the system. Some authors presuppose an encoding directly (Mehring et al. 2003; Mazurek & Shadlen 2002; Diesmann et al. 1999). Others assume that correlations between output neurons are limited (Nirenberg & Victor 2007; Shlens et al. 2006; Schneidman et al. 2006; Latham & Nirenberg 2005). Commonly, neural encodings are quite complex. Responses to sound stimuli by the auditory nerve, for instance, contain temporal, rate, and population components (Squire et al. 2003; Moore 2004). Furthermore, evidence suggests that high-order statistical dependencies within a population do encode stimulus information in sensory systems (Pillow et al. 2008; Pouget et al. 2000). Therefore, a general framework for analyzing spiking networks should be free of encoding assumptions.

2.1.2 Dimensionality Reduction through Classification

Statistical classification methods are often used to decode sensory information from populations of spike trains (Warland et al. 1997; Serruya et al. 2002; Mesgarani et al. 2008). The performance of a population, that is, how well it conveys information about the stimuli, is quantified simply by the classifier's accuracy combined with the generalization error (Vapnik 1999). A classifier, then, may be viewed as a mapping between the high dimensional space of the response spike train and a binary classification space. A probabilistic lower bound on the mutual information between binary stimulus labels


and the classifier's label predictions can be derived using a method inspired by Fano's inequality (Cover & Thomas 2006).

Many classification techniques operate by projecting high-dimensional data onto a real line, then partitioning it into a finite number of classes (Duda et al. 2001). Discriminant classifiers, for instance, project each point onto a real line denoting distance from the discriminant, then predict a class label based on the projected sign. Clearly, such intermediate representations contain more information about a population's performance than the accuracy alone. To make use of this extra knowledge, we derive a lower bound on the mutual information between a binary and a one-dimensional, continuous random variable. The bound is probabilistic, relying on the Dvoretzky-Kiefer-Wolfowitz (DKW) inequality (Dvoretzky et al. 1956) for the difference between an empirical cumulative distribution function and the true distribution function.

2.1.3 Evaluating Feedforward Networks

In this article, we introduce a framework through which the information transmission capacity of large feedforward networks can be assessed for any given set of stimulus categories, using a classification-based, information-theoretic approach to decoding responses. To quantify network performance, we present a novel lower bound on mutual information for any memoryless communication channel with binary input and one-dimensional real-valued output, such as the confidences inherent to most classification techniques. This result allows for the tightest possible discussion about mutual information across each network without assuming an encoding scheme or explicitly building the joint probability distribution function.

Section 2.2 contains the details of our methodology. Our procedure for network assessment is surveyed in Section 2.2.2, and pictured in Figure 2-3. Sections 2.2.1 and 2.2.3 describe the generation and initial processing of the auditory stimuli. In Section 2.2.5, we discuss the decoding of spike train responses and the specific classifiers and


feature spaces used. Section 3 addresses the lower bounds on mutual information. The naïve approach is described in Section 3.1, and the detailed derivation of our novel lower bound is given in Section 3.2. A quadratic-time algorithm for determining our bound is introduced in Section 3.2.5. Section 4.1 compares the two lower bound methods. Finally, results of testing the framework and comparing network architectures are presented in Sections 4.2 and 4.3.

2.2 Methods

2.2.1 Modeling the Auditory Periphery

An ideal testbed for studying feedforward networks is the ascending auditory pathway, which moves information from the inner ear to the auditory cortex in a primarily feedforward manner (Altschuler et al. 1991). Auditory stimuli are easy to categorize, but complex enough so that no two sound waveforms are identical, even for simple stimulus sets. Furthermore, as previously mentioned, the pathway's encoding of stimuli is quite heterogeneous.

To produce physiologically realistic auditory nerve data from digital sound stimuli, we employ the Meddis Inner-Hair Cell model (Meddis 1986, 1988; Lopez-Poveda et al. 2001; Sumner et al. 2002). The Meddis model offers a large degree of parametric flexibility, such as the ability to emulate the auditory peripheries of several organisms. Experimentally, it has shown adherence to a variety of well-known physiological properties, including rate-intensity, rapid and short-term adaptation, phase-locking, additivity, and recovery (Meddis 1986), and it is faithful to observed spike statistics on the auditory nerve (Sumner et al. 2002). Figures 2-1 and 2-2 illustrate some simulated auditory nerve spike responses for various kinds of auditory input.

All Meddis model simulations are based on a standard human parameter set derived from existing psychophysical data (Lopez-Poveda et al. 2001). We use 40 center frequencies ranging from 100 Hz to 5000 Hz. The center frequencies, as well as their associated bandwidths, increase on an exponential scale over this range.
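The exponentially spaced center frequencies are straightforward to reproduce. A minimal sketch, assuming the endpoints 100 Hz and 5000 Hz are themselves center frequencies (the text does not state the endpoint convention):

```python
import numpy as np

# 40 center frequencies on an exponential scale from 100 Hz to 5000 Hz:
# geometric spacing, i.e., a constant ratio between neighboring CFs.
n_cf = 40
center_freqs = np.geomspace(100.0, 5000.0, n_cf)
ratios = center_freqs[1:] / center_freqs[:-1]  # all equal on an exponential scale

# With 10 fibers simulated per CF, this yields the 400-fiber auditory nerve
# that serves as the baseline for the experiments.
n_fibers = 10 * n_cf
```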


To emulate the physiological trait of multiple auditory nerve fibers innervating each inner-hair cell, 10 nerve fibers are generated from identical stimuli for each center frequency. Notably, the responses for two fibers within the same group will differ due to the probabilistic component of spike generation in the model. These center frequency groups allow for population-coding of the signals at any center frequency. Therefore this simulated auditory nerve, which serves as the foundation for all neural simulations described hereafter, contains a total of 400 spiral ganglion cells responding to each stimulus.

For our study, we select auditory categories that must be discriminated by their frequency spectra, not their absolute amplitudes. Homeostatic mechanisms in a biological auditory pathway maintain non-pathological spike rates despite variations in loudness (Squire et al. 2003; Altschuler et al. 1991; Moore 2004). To ensure that spike trains throughout our model remain decodable in the absence of such controls, we institute two normalization procedures: a coarse adjustment to the input gain level of the Meddis model for different stimulus categories (detailed in Appendix 5.3), and a global threshold adjustment for each of our random-weight networks (described in Section 4.3.1). Further details of our instantiation of the Meddis model are provided in Appendix 5.3.

2.2.2 Measuring Information Transmission Over a Network

In Section 3, we examine a novel technique for measuring the MI between sound stimuli and spike train responses. However, in order to judge the performance of various network architectures, any degradation in the signal due to the Meddis model must be discounted. Network performance can be isolated using a two-stage process, depicted in Figure 2-3. First, the MI is calculated between the stimuli and the simulated auditory nerve, which is the output of the Meddis model. This number serves as a baseline; high values indicate that the signal has not lost much stimulus information while passing through the periphery model. Next, the signal is passed through a network, and the MI


Figure 2-1. Samples of sound stimuli and baseline auditory nerve response for several tasks. For all parts, the lower panel shows the spike train responses of the 400-neuron auditory nerve. A) B) Instruments A and B. Upper panel shows the relative harmonics of each stimulus. Middle panel shows four 50 ms stimuli and corresponding base frequencies. C) Pure tone stimuli and AN response for four 50 ms stimuli, along with each interval's frequency. D) Frequency sweep task. Top panel shows three 100 ms intervals with their base frequencies and up/down direction labels.


Figure 2-2. Samples of sound stimuli and baseline auditory nerve response for two instrument recognition tasks. Upper panels depict four 50 ms stimuli, each subscripted by the fundamental frequency and superscripted by which of the two instruments generated the interval. Lower panels show the spike train responses of the 400-neuron auditory nerve.

is computed between the stimuli and the network's output spike train responses. The difference between the baseline MI and the network MI gives a measurement of the information lost during transmission through the network.

2.2.3 Stimulus Categorization

In order to qualify the concept of network performance, one must first specify the distribution of stimuli on which performance will be judged. Over the range of detectable stimuli, biological sensory systems are known to focus disproportionately on certain signals (Moore 2004; Berry et al. 1997; Guenther et al. 2004). Therefore, measuring channel capacity (Cover & Thomas 2006) or assuming equal-weight stimuli does not give a useful reflection of a network's abilities. Instead, a more meaningful context is provided by evaluating performance on natural tasks, commonly termed categories.

Categories are equivalence classes of stimuli. A category specification partitions the stimulus set according to some higher-level property of the stimuli (Ohl et al. 2001; Freedman et al. 2001; Guenther et al. 2004; Lee et al. 2009; Fuster & Jervey 1981;


Figure 2-3. An outline of the experimental procedure for a sample task, showing four 50 ms intervals of a harmonic tone. A) A base frequency is selected at random. B) The stimulus is generated: here, a harmonic tone (Section 2.2.4). C) The sound is passed into the 400-neuron auditory periphery model (Section 2.2.1). D) The baseline auditory nerve response is passed through a network of spiking neurons. The example connections on the left show the fwdA2A network (Section 4.3.1). E) The MI is measured between the stimulus and the baseline response (left arrows), and then measured between the stimulus and the network response (right arrows). The difference shows information loss through this network.


Figure 2-4. Relative harmonics for each type of harmonic tone, or instrument. Left to right, each peg denotes an increasing harmonic: peg 1 is the fundamental f, peg 2 is the harmonic 2f, etc. A) Instrument A: all eight harmonics have amplitude 1. B) Instrument B: the amplitude of each harmonic is the inverse of the number of that harmonic, as in a harmonic series: 1, 1/2, 1/3, .... C) Instrument C: odd harmonics have amplitude 1, even harmonics have amplitude 0. D) Instrument D: odd harmonics have an amplitude according to the harmonic series, even harmonics have amplitude 0. E) Pure tone stimulus: a fundamental tone with no other harmonics.

Freedman et al. 2003; Miller et al. 2003). For instance, if the stimulus set consists of 10-second sound clips of violin performances, some interesting auditory categories might be what song is being played or what key the music is being performed in. Tasks of this nature typically consist of a small number of stimulus categories. In what follows, we restrict our discussion to binary categories, since any classification problem on a finite number of classes can be expressed as an equivalent series of binary problems.

2.2.4 Description of Auditory Categories

We consider three families of auditory categories, each of which requires attention to a different set of auditory properties for successful discrimination. Examples of each of the category types are exhibited in Figures 2-1 and 2-2.

The first group of categories concerns pitch discrimination on 50 ms sounds that each have a perceived pitch. Several distinct harmonic profiles, termed instruments, are used. Sounds from a given instrument are generated using additive synthesis:


    y(t) = \sum_{i=0}^{n} A_i \sin(2\pi (i+1) \omega t + \phi)    (2–1)

where the 0th harmonic is defined as the fundamental, n denotes the number of harmonic overtones, A_i denotes the amplitude of the i-th harmonic, \omega is the fundamental frequency, and \phi represents the phase shift of the final waveform y(t). As per a human listener, the perceived pitch of each stimulus is defined as the fundamental frequency.

Five distinct harmonic profiles are evaluated, each illustrated in Figure 2-4. For pure tone stimuli, sounds are generated with frequencies ranging from 90 Hz to 5200 Hz in accordance with the Meddis model's optimal response range. Each of the instruments A through D contains seven overtones in addition to the fundamental frequency. To ensure that at least six of the seven harmonics in each stimulus are detectable by our simulated periphery, these tones are generated using fundamental frequencies between 90 Hz and 860 Hz. Sets of 50 ms stimuli are generated successively for 10 seconds, which is appended and prepended by 100 ms of silence. Each stimulus is phase-matched to the preceding interval so that no discontinuities in the physical waveform mark the interval boundaries.

The second category group involves instrument discrimination. For pairs of the instruments A through D, the classifier's task is to predict which instrument produced a given 50 ms stimulus for randomly-pitched tones. Each stimulus is generated by first randomly selecting one of the two instruments, then randomly choosing a base frequency between 90 Hz and 860 Hz, as above. The stimuli are again phase-matched to the previous interval for waveform smoothness.

The final type of auditory category requires the classifier to discriminate stimuli based on a temporal auditory property that evolves during the duration of the stimulus. We consider the set of up and down frequency sweeps, or chirps, of 1000 Hz. To generate each 100 ms stimulus, a starting frequency is selected at random such that the minimum and maximum frequencies for the sweep will be perceptible by the Meddis


model. Each interval is generated by:

    y(t) = \sin\!\left( 2\pi t \left( \omega + \frac{z t}{2} \cdot \frac{1000 \text{ Hz}}{100 \text{ ms}} \right) + \phi \right)    (2–2)

where t ranges from 0 ms to 100 ms, \omega denotes the starting frequency, z = \pm 1 is positive for "up" stimuli and negative for "down" stimuli, and \phi is the initial phase of the sound. As above, the beginning of each interval is phase-matched to the end of the previous interval.

In addition to these stimuli, we investigate the effect of adding noise on the system's ability to discriminate various auditory categories. Representatives of each of the three families above are combined with Gaussian white noise at SNR levels of 20 and 5.

2.2.5 Classification of Spike Train Responses

Statistical classifiers provide an effective way to work with response data that is sparse or high-dimensional, as is often the case with spike train population samples. Each population's response becomes a point in a feature space, and the classifier makes use of the intrinsic structure of the feature space to separate the data into classes. Consequently, ease of classification is directly related to how well the feature space can capture the relevant information in the stimuli.

2.2.5.1 Prior work in spike train classification

One popular technique for decoding stimulus information from spike train populations is linear filtering, which seeks to reconstruct an often continuous stimulus (Warland et al. 1997). When applied to a small number of stimulus categories, linear filtering can be viewed as a linear classifier that reconstructs the stimulus class label (Serruya et al. 2002; Bar-Yosef et al. 2002). Other classification methods that have been applied to populations of neurons include MLP (Wessberg et al. 2000), LVQ (Nicolelis et al. 1998), SVM (Mesgarani et al. 2008; Kim et al. 2006), and a Bayesian method using Kalman filters (Wu et al. 2005).
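The two stimulus constructions above (Equations 2–1 and 2–2) can be sketched directly. The sampling rate, function names, and parameter defaults below are our own assumptions for illustration; they are not specified in the text:

```python
import numpy as np

FS = 44100  # sampling rate in Hz; an assumption, not stated in the text

def harmonic_tone(omega, amps, dur=0.05, phi=0.0, fs=FS):
    """Additive synthesis as in Eq. 2-1: harmonics at (i+1)*omega with amplitudes A_i."""
    t = np.arange(int(dur * fs)) / fs
    return sum(a * np.sin(2 * np.pi * (i + 1) * omega * t + phi)
               for i, a in enumerate(amps))

def sweep(omega, z, dur=0.1, phi=0.0, fs=FS, extent=1000.0):
    """Linear chirp as in Eq. 2-2: frequency moves `extent` Hz over `dur` seconds,
    upward for z = +1 and downward for z = -1, starting at omega."""
    t = np.arange(int(dur * fs)) / fs
    rate = extent / dur  # Hz per second; instantaneous frequency is omega + z*rate*t
    return np.sin(2 * np.pi * t * (omega + z * rate * t / 2) + phi)

# Instrument B's harmonic-series amplitudes (1, 1/2, ..., 1/8) at a 440 Hz fundamental.
tone_b = harmonic_tone(440.0, [1 / (i + 1) for i in range(8)])
up = sweep(500.0, +1)  # a 100 ms "up" chirp from 500 Hz to 1500 Hz
```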


Almost ubiquitously in the literature, raw spike trains are represented as binned spike counts or binned spike rates, sometimes termed peristimulus time histograms. A fixed response window is broken into a number of time bins, then either a count is obtained of the number of spikes in each bin, or a rate is calculated by averaging the counts over many trials. To reduce dimensionality, preprocessing methods are sometimes performed on these raw binned counts (Richmond et al. 1987; Laubach 2004).

2.2.5.2 Novel feature spaces for spike train populations

Precise timing information is known to play a critical role in the encoding of sensory information (Mainen & Sejnowski 1995; Reich et al. 2000). Therefore, in addition to utilizing the standard spike count representation, we define two new feature spaces for spike train populations based on inter-spike intervals (ISI) and spike times. Figure 2-5 shows an example of each representation.

In both of the timing-based representations, each spike that occurs within the designated response window is denoted by a real value. The features are ordered first by the neuron that produced the spike, and then by the ages of the spikes from oldest to youngest. For the spike times feature space, each feature value records the time between a spike and the beginning of the response window. The spike times representation, consequently, is very sensitive to the precise location of the response interval. For the ISI feature space, each feature value denotes the time between a spike and the previous spike by the same neuron. For the first, or oldest, spike by each neuron, the feature value is the time since the beginning of the window. Therefore, perturbing the response window does not greatly alter the ISI representation of the data points.

In order for these timing-based representations to be used as feature spaces, each dimension must have a consistent meaning across all possible spike trains. To that end, we specify a maximum number D of possible spikes for each neuron, which


Figure 2-5. Visualization of a spike train projected as a point in a feature space. Feature space points are then projected into a one-dimensional classification space by the classifier. Inset table gives several feature space representations of the sample spike train, where all numbers are in ms and window size is 50 ms. For spike count, two 25 ms bins are shown. For spike times and ISI, the null value is 100 ms, and the number of features per neuron is 5. (Inset values: spike count (2, 2, 2, 1); spike times (13, 14, 41, 42, 100, 3, 22, 44, 100, 100); ISI (13, 1, 27, 1, 100, 3, 19, 22, 100, 100).)

also sets the total number of features as (D \cdot n), where n is the number of neurons in the population. If a neuron's response for some trial contains more than D spikes, the remaining spikes are ignored. If the response contains fewer than D spikes, the extra features are given a value of \varnothing_d, a fixed null value. The choice of \varnothing_d must be carefully considered, since it affects the distribution of data points in the feature space. For instance, \varnothing_d should not be allowed to collide with any permissible feature value.
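The three representations can be made concrete. The helper names below are ours, and the example reuses the two-neuron spike train from Figure 2-5 (50 ms window, D = 5, null value of twice the window length):

```python
def timing_features(spike_trains, D, window, null_scale=2.0):
    """Spike-times and ISI feature vectors for a population (Section 2.2.5.2).
    Features are ordered by neuron, then oldest spike to youngest; empty slots
    receive the null value, here twice the window length."""
    null = null_scale * window
    times, isis = [], []
    for train in spike_trains:
        kept = list(train[:D])            # spikes beyond D are ignored
        prev = [0] + kept[:-1]            # first ISI: time since window start
        pad = [null] * (D - len(kept))
        times += kept + pad
        isis += [s - p for s, p in zip(kept, prev)] + pad
    return times, isis

def binned_counts(spike_trains, window, n_bins):
    """Standard spike-count representation: per-neuron counts in equal time bins."""
    w = window / n_bins
    return [sum(1 for s in train if k * w <= s < (k + 1) * w)
            for train in spike_trains for k in range(n_bins)]

# The worked example from Figure 2-5: neuron 1 spikes at 13, 14, 41, 42 ms,
# neuron 2 at 3, 22, 44 ms.
trains = [[13, 14, 41, 42], [3, 22, 44]]
t_feats, isi_feats = timing_features(trains, D=5, window=50)
count_feats = binned_counts(trains, window=50, n_bins=2)
```

Running this reproduces the figure's inset table: spike times (13, 14, 41, 42, 100, 3, 22, 44, 100, 100), ISI (13, 1, 27, 1, 100, 3, 19, 22, 100, 100), and spike counts (2, 2, 2, 1).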


Hereafter, we set the value of \varnothing_d to be twice the length of the response window, which places the empty features in a different part of the space than the valid features.

2.2.5.3 Support vector machine classifier

In what follows, we employ support vector machine (SVM) classifiers, which can readily implement many types of discrimination boundaries. We utilize the SVM-Light package (Joachims 1999, 2002), which is freely available online (Joachims 2009). For each experiment, a hold-out set was used along with the training set to determine a reasonable C value for the SVM.

2.2.6 Spiking Neuron Model

For all network simulations, spiking dynamics are governed by a zeroth-order Spike Response Model (SRM) (MacGregor & Lewis 1977; Gerstner & Kistler 2002). Each post-synaptic potential (PSP) is given by:

    PSP(t, Q, d) = \frac{Q}{d \sqrt{t}} \, e^{-d^2 / (\alpha t)} \, e^{-t / \beta}    (2–3)

where t is the time elapsed since the spike occurred, Q is the synaptic weight, d is the distance (in dimensionless units) of the synapse from the soma, \alpha (in ms) controls the rate of PSP rise, and \beta (in ms) determines how quickly it falls.

The after-hyperpolarization potential (AHP) effects of output spikes are determined by:

    AHP(t) = -R e^{-t / \tau_r}    (2–4)

where t again denotes elapsed time, R determines the maximum drop in potential after a spike, and \tau_r (in ms) governs the rate at which the AHP effect subsides.

Unless otherwise noted, the PSP parameter \alpha is set to 6 ms and the PSP parameter \beta is 15 ms. The distance parameter d is randomly generated between 1 and 2, and remains constant once created. For the AHP parameters, R is set to 1000 mV, and \tau_r is 1.6 ms.
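The two kernels are simple to evaluate. In the sketch below, the grouping of the exponents in Eq. 2–3 is our reconstruction from the garbled source (a diffusion-style rise scaled by \alpha, exponential decay scaled by \beta), and the synaptic weight used in the example trace is illustrative:

```python
import math

def psp(t, Q, d, alpha=6.0, beta=15.0):
    """PSP kernel of Eq. 2-3 (form partially reconstructed): zero at t = 0,
    rising at a rate set by alpha and distance d, decaying at rate beta (ms)."""
    if t <= 0.0:
        return 0.0
    return (Q / (d * math.sqrt(t))) * math.exp(-d * d / (alpha * t)) * math.exp(-t / beta)

def ahp(t, R=1000.0, tau_r=1.6):
    """AHP kernel of Eq. 2-4: a drop of magnitude R mV that relaxes with
    time constant tau_r (ms)."""
    return -R * math.exp(-t / tau_r)

# A zeroth-order SRM sums PSP kernels from input spikes and AHP kernels from the
# neuron's own output spikes. A minimal membrane trace for one input spike and one
# output spike, both at t = 0 (Q and d are illustrative values):
u = [psp(t, Q=80.0, d=1.5) + ahp(t) for t in range(1, 51)]
```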


CHAPTER 3
A PROBABILISTIC LOWER BOUND ON MI

For any memoryless communication channel with a binary-valued input and a one-dimensional real-valued output, we introduce a probabilistic lower bound on the mutual information given empirical observations on the channel. The bound is built upon the Dvoretzky-Kiefer-Wolfowitz (DKW) inequality, and is distribution-free. A quadratic time algorithm is described for computing the bound and its corresponding class-conditional distribution functions.

3.1 Classification and MI

3.1.1 Naïve Connection: The Fano Method

A binary classifier maps high-dimensional data onto two predicted classes.¹ Motivated by Fano's inequality (Cover & Thomas 2006), we can derive a lower bound on MI using only a classifier's total accuracy. Consider binary-valued random variables X (input) and Y (output). Since the distribution of X is controlled by the experimenter, the marginal probabilities P(X=0) and P(X=1) are assumed to be fixed a priori such that P(X=0) = P(X=1) = 0.5. The mutual information is then:

    I(X; Y) = 1 - H(X | Y)    (3–1)

which can be written as a function depending solely on the true class-conditional probabilities of error, P(e | X=0) = P(Y=1 | X=0) and P(e | X=1) = P(Y=0 | X=1). Given an empirically-observed set of n/2 test samples for each class, Hoeffding's inequality (Hoeffding 1963) states probabilistic bounds on each true class-conditional test error:

    P\left( \left| \frac{2 S_0}{n} - P(e \mid X=0) \right| \geq \epsilon_h \right) \leq 2 e^{-n \epsilon_h^2}

¹ For these analyses, classifiers are assumed to be pre-trained. All empirically-observed data is therefore unseen and classified by a fixed model.


and

    P\left( \left| \frac{2 S_1}{n} - P(e \mid X=1) \right| \geq \epsilon_h \right) \leq 2 e^{-n \epsilon_h^2}

where, for each class x \in \{0, 1\}, S_x denotes the total number of errors, and the true error P(e | X=x) lies within \epsilon_h of the empirical error with a confidence r = 1 - 2 e^{-n \epsilon_h^2}. For any desired confidence, then:

    \epsilon_h = \sqrt{ \frac{\ln \frac{2}{1-r}}{n} }    (3–2)

When applied to binary random variables, Fano's inequality becomes an equality for the conditional entropy:

    H(X | Y) = H(e)    (3–3)
             = -P(e) \log P(e) - (1 - P(e)) \log(1 - P(e))    (3–4)

where P(e) denotes the total probability of error:

    P(e) = P(e \mid X=0) P(X=0) + P(e \mid X=1) P(X=1)
         \leq \frac{1}{2} \left( \frac{2 S_0}{n} + \epsilon_h + \frac{2 S_1}{n} + \epsilon_h \right)
         = \frac{S_0 + S_1}{n} + \epsilon_h

Thus a lower bound on MI can be calculated using the classifier's accuracy on the entire set of samples.

3.1.2 A Better MI Bound

Although this discrete class reconstruction method can be used to bound MI, distilling the data from a high-dimensional representation to a simple binary value discards considerable information. On the other hand, as discussed previously, computing MI with the full-featured data is often infeasible. Fortunately, many classifiers produce an intermediate representation, mapping each data point to a real line before their final class prediction. Discriminant methods, for instance, measure the distance


from each data point to the discriminant boundary. As another example, a clustering algorithm can assign each data point a real-valued confidence that is a function of the Mahalanobis distance between the point and the cluster centers. The extra information provided by this continuous random variable can be utilized to produce a tighter lower bound.

Since empirical distribution functions are constructed from a finite number of samples, a true lower bound on MI cannot be obtained directly. Rather, we employ an inequality due to Dvoretzky, Kiefer, and Wolfowitz (Dvoretzky et al. 1956), and refined by Massart (Massart 1990), that gives probabilistic bounds on any true distribution function given the empirical distribution function based on the observed samples. A similar problem was recently addressed for maximizing differential entropy on a single variable (Learned-Miller & DeStefano 2008). We propose a method by which a probabilistic lower bound on MI can be constructed using any real-valued class-conditional data.

3.2 A Novel Lower Bound on Mutual Information

Consider a communication channel with the input random variable, X, taking discrete (binary) values, i.e., X \in \{0, 1\}, and the output random variable, Y, taking real values in a bounded range, i.e., Y \in [a, b] : a, b \in R. The channel is modeled by a pair of unknown (conditional) continuous distribution functions P(Y \leq y | X=0) and P(Y \leq y | X=1) with density functions f_0 and f_1 such that:

    F_0(y) \equiv \int_a^y f_0(t) \, dt \equiv P(Y \leq y \mid X=0)

and

    F_1(y) \equiv \int_a^y f_1(t) \, dt \equiv P(Y \leq y \mid X=1)

As before, we assume P(X=0) = P(X=1) = 0.5. However, our results are easily generalized for other values.

Let y^0_1, y^0_2, ..., y^0_{n/2} be a sample of n/2 independent, identically distributed random variables with distribution function F_0(y), and y^1_1, y^1_2, ..., y^1_{n/2} be a sample of n/2 independent, identically distributed random variables with distribution function F_1(y). Also, let \hat{F}_0(y)


and \hat{F}_1(y) be the empirical distribution functions, defined by:

    \hat{F}_0(y) = \frac{2}{n} \sum_{i=1}^{n/2} \mathbb{1}(y^0_i \leq y)

and

    \hat{F}_1(y) = \frac{2}{n} \sum_{i=1}^{n/2} \mathbb{1}(y^1_i \leq y)

where \mathbb{1}(E) represents the indicator function for the event E, and the traditional subscript denoting the number of samples for \hat{F} is understood.

In what follows, we utilize the order statistics² of the combined sample y^0 \cup y^1, denoted by \langle z_i \mid i = 1 ... n \rangle. For notational convenience, we define the points z_0 \equiv a and z_{n+1} \equiv b, so that F_0(z_0) = F_1(z_0) = 0 and F_0(z_{n+1}) = F_1(z_{n+1}) = 1.

The Dvoretzky-Kiefer-Wolfowitz (DKW) inequality places a tight probabilistic bound on the difference between any empirical distribution function and the true distribution function (Dvoretzky et al. 1956; Massart 1990). Using n/2 samples for each distribution, the bounds on F_0 and F_1 are given by:

    P\left( \sup_t \left| \hat{F}_0(t) - F_0(t) \right| > \epsilon_{dkw} \right) \leq 2 e^{-n \epsilon_{dkw}^2}

and

    P\left( \sup_t \left| \hat{F}_1(t) - F_1(t) \right| > \epsilon_{dkw} \right) \leq 2 e^{-n \epsilon_{dkw}^2}

Therefore, given any desired confidence r, the DKW inequality guarantees that the true distributions will lie within the fixed tube drawn \epsilon_{dkw} around the empirical distributions, where:

    \epsilon_{dkw} = \sqrt{ \frac{\ln \frac{2}{1-r}}{n} }    (3–5)

Within this framework, we seek two distribution functions F_0 and F_1 on [z_0, z_{n+1}] that minimize the mutual information I(X; Y) subject to the DKW tube constraints. Since P(X=0) = P(X=1) = 0.5, the entropy H(X) = 1, and:

² The order statistics z_1 \leq z_2 \leq ... \leq z_n of a sample y_1, y_2, ..., y_n are the values of the sample arranged in non-decreasing order.


Figure 3-1. Sample distributions for two input classes and their DKW tubes. The dotted lines represent the empirical distributions for each class, and the solid lines depict the upper and lower tubes.

    I(X; Y) = H(X) - H(X | Y)
            = 1 + \frac{1}{2} \int_{z_0}^{z_{n+1}} \left[ f_0(t) \log \frac{f_0(t)}{f_0(t) + f_1(t)} + f_1(t) \log \frac{f_1(t)}{f_0(t) + f_1(t)} \right] dt
            \equiv 1 + \frac{1}{2} M_R    (3–6)

We focus hereafter on the variable component M_R.

3.2.1 The Tube-Unconstrained Solution

Before undertaking the general problem, we address a much simpler subproblem:

Theorem 3.1. Consider any pair of values c : c \geq a and d : d \leq b such that the solutions for the distribution functions F_0 and F_1 are known at F_0(c), F_0(d), F_1(c), and F_1(d) (we call these four points "pins", and the corresponding interval [c, d] a "pinned interval"). Assuming that F_0 and F_1 are monotonically non-decreasing on [c, d], then, in the absence of further constraints, the solution that minimizes the interval's contribution to the total MI is given by any two curves with the property:

    f_0(t) = v f_1(t) \quad \forall t : c \leq t \leq d    (3–7)


for some v \in \mathbb{R}. In other words, the two solution densities are multiples of one another on the interval [c,d]. Furthermore, the minimum contribution to I(X;Y) by this interval is:

m = \alpha \log\frac{\alpha}{\alpha+\beta} + \beta \log\frac{\beta}{\alpha+\beta}   (3–8)

where \alpha = F_0(d) - F_0(c) and \beta = F_1(d) - F_1(c).

Proof of Theorem 3.1. As stated, let \alpha = F_0(d) - F_0(c), denoting the increase in F_0 between its two pins, and \beta = F_1(d) - F_1(c), denoting F_1's increase between its pins. The minimal MI on [c,d] is given by the curves that minimize the functional:

M_R\big|_c^d = \int_c^d f_0(t) \log\frac{f_0(t)}{f_0(t)+f_1(t)} + f_1(t) \log\frac{f_1(t)}{f_0(t)+f_1(t)} \, dt

subject only to the constraints:

\int_c^d f_0(t)\,dt = \alpha, \quad \int_c^d f_1(t)\,dt = \beta   (3–9)

f_0(t) \ge 0, \quad f_1(t) \ge 0 \quad \forall t: c \le t \le d   (3–10)

Using the objective and these constraints, the Lagrangian integrand function for the resulting calculus of variations problem is:

f_0(t) \log\frac{f_0(t)}{f_0(t)+f_1(t)} + f_1(t) \log\frac{f_1(t)}{f_0(t)+f_1(t)} + \lambda_1 f_0(t) + \lambda_2 f_1(t) - \mu_1(t) f_0(t) - \mu_2(t) f_1(t)   (3–11)

where \lambda_1 and \lambda_2 are constants and \mu_1(t), \mu_2(t) \ge 0. The Hessian of the integrand of M_R\big|_c^d is:

H(M) = \begin{bmatrix} \frac{f_1(t)}{f_0(t)(f_0(t)+f_1(t))} & -\frac{1}{f_0(t)+f_1(t)} \\ -\frac{1}{f_0(t)+f_1(t)} & \frac{f_0(t)}{f_1(t)(f_0(t)+f_1(t))} \end{bmatrix}
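The closed-form minimum of Equation 3–8 is easy to evaluate numerically. A sketch (the function name is ours, and base-2 logarithms are used to match the bit-valued MI elsewhere in the text):

```python
import math

def min_interval_contribution(alpha, beta):
    # Eq. 3-8: minimum contribution of a pinned interval to M_R, where
    # alpha and beta are the increases of F0 and F1 across the interval.
    s = alpha + beta
    return alpha * math.log2(alpha / s) + beta * math.log2(beta / s)
```

For example, min_interval_contribution(0.5, 0.5) = -1, and a single interval carrying all of the mass of both classes yields m = -2, so that I(X;Y) = 1 + (1/2)(-2) = 0, as expected for indistinguishable classes.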


which is positive semi-definite. Therefore the integrand is convex, and consequently the functional M_R\big|_c^d is also convex. Since all the constraints are affine, any extremals of the Lagrangian must be minima. Two Euler-Lagrange equations are:

0 = \log\frac{f_0(t)}{f_0(t)+f_1(t)} + \lambda_1 - \mu_1(t)   (3–12)

0 = \log\frac{f_1(t)}{f_0(t)+f_1(t)} + \lambda_2 - \mu_2(t)   (3–13)

\forall t: c \le t \le d

Complementary slackness requires that:

\mu_1(t) f_0(t) = 0 = \mu_2(t) f_1(t) \quad \forall t: c \le t \le d

Now for any t, if f_0(t) = 0, then the right-hand side of Equation 3–12 must be non-finite, and therefore nonzero. Similarly, f_1(t) must be nonzero in order to satisfy Equation 3–13. Therefore, \mu_1(t) = \mu_2(t) = 0 for all t. Rewriting Equations 3–12 and 3–13 shows that:

v \equiv \frac{f_0(t)}{f_1(t)} = \frac{2^{-\lambda_1}}{1 - 2^{-\lambda_1}} = 2^{\lambda_2 - \lambda_1} \quad \forall t: c \le t \le d

So then, v \in \mathbb{R} is a constant that does not depend on t, and f_0 and f_1 differ by a constant multiple on the pinned interval. Since \lambda_1 and \lambda_2 have no other constraints, it is also clear that any two densities differing by a constant multiple will be equivalent minimizers.

Furthermore, if the multiplier v is such that v f_1(t) = f_0(t), then it is easy to show that \alpha = v\beta, and therefore:

\frac{\alpha}{\alpha+\beta} = \frac{v \int_c^d f_1(t)\,dt}{v \int_c^d f_1(t)\,dt + \int_c^d f_1(t)\,dt} = \frac{v}{v+1} = \frac{f_0(t)}{f_0(t)+f_1(t)} \quad \forall t: c \le t \le d

\frac{\beta}{\alpha+\beta} = \frac{\int_c^d f_0(t)\,dt}{\int_c^d f_0(t)\,dt + v \int_c^d f_0(t)\,dt} = \frac{1}{v+1} = \frac{f_1(t)}{f_0(t)+f_1(t)} \quad \forall t: c \le t \le d


Therefore, the minimum value on the pinned interval is:

m = \min_{\{f_0, f_1\}} \int_c^d f_0(t) \log\frac{f_0(t)}{f_0(t)+f_1(t)} + f_1(t) \log\frac{f_1(t)}{f_0(t)+f_1(t)} \, dt = \alpha \log\frac{\alpha}{\alpha+\beta} + \beta \log\frac{\beta}{\alpha+\beta}

For reasons that will soon be made clear, any solution meeting the requirements of Theorem 3.1 will be known as a tube-unconstrained solution.

3.2.2 A Discrete Formulation

At any point z_i, we denote the (unknown) conditional probability mass for each X \in \{0,1\} by:

f_0^i \equiv \int_{z_{i-1}}^{z_i} f_0(t)\,dt = F_0(z_i) - F_0(z_{i-1}) \quad \text{and} \quad f_1^i \equiv \int_{z_{i-1}}^{z_i} f_1(t)\,dt = F_1(z_i) - F_1(z_{i-1})

Also, we denote the lower DKW tube boundaries at z_i for the two distribution functions as F_0^-(z_i) and F_1^-(z_i), and the upper tube boundaries correspondingly as F_0^+(z_i) and F_1^+(z_i). In order for the distribution functions to be feasible throughout the entire interior of the tubes, the F^- boundaries are raised to 0 whenever they fall below 0, and the F^+ boundaries are lowered to 1 whenever they rise above 1. The tubes have nonzero width at the minimum and maximum order statistics, z_1 and z_n, and are defined as being collapsed at the interval edges such that:

F_0^-(z_0) = F_0^+(z_0) = F_1^-(z_0) = F_1^+(z_0) = 0

and

F_0^-(z_{n+1}) = F_0^+(z_{n+1}) = F_1^-(z_{n+1}) = F_1^+(z_{n+1}) = 1

Using Theorem 3.1, the problem of minimizing MI with respect to two functions can be simplified to a constrained optimization problem with a finite number of constraints. Since the empirical distributions are each defined using samples of n/2 i.i.d. random


Figure 3-2. Relaxation of the DKW tubes. The solid lines denote DKW tubes computed from two empirical distributions. The dotted lines show the piecewise linear versions of the tubes.

variables, the DKW tubes will be step functions on [z_0, z_{n+1}]. By definition, the tubes are flat between any two successive order statistics, z_i and z_{i+1}, without loss of generality. However, we consider a weakened version of the DKW tubes such that successive tube boundaries are piecewise linear, as shown in Figure 3-2. Formally:

F_0^+(t) = c_0 (t - z_i) + F_0^+(z_i) \quad \text{and} \quad F_0^-(t) = c_0 (t - z_i) + F_0^-(z_{i-1})

F_1^+(t) = c_1 (t - z_i) + F_1^+(z_i) \quad \text{and} \quad F_1^-(t) = c_1 (t - z_i) + F_1^-(z_{i-1})

\forall t: z_i \le t \le z_{i+1}

where

c_0 = \frac{F_0^+(z_{i+1}) - F_0^+(z_i)}{z_{i+1} - z_i} \quad \text{and} \quad c_1 = \frac{F_1^+(z_{i+1}) - F_1^+(z_i)}{z_{i+1} - z_i}

Now then, consider any solution to the general problem. The distribution functions will take values within their tubes at F_0(z_i), F_0(z_{i+1}), F_1(z_i), and F_1(z_{i+1}). On the interval [z_i, z_{i+1}], consider the linear solution obtained by drawing one straight line between F_0(z_i) and F_0(z_{i+1}), and another between F_1(z_i) and F_1(z_{i+1}). These lines clearly
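The relaxed tube boundaries are simple linear interpolants between successive step values. A sketch (the helper name is ours; z holds the order statistics and F_hi the upper-tube values at those statistics):

```python
def relaxed_upper_tube(t, z, F_hi, i):
    # Piecewise-linear upper tube boundary between order statistics z[i]
    # and z[i+1] (the Figure 3-2 relaxation):
    #   F0+(t) = c0 * (t - z[i]) + F0+(z[i])
    c = (F_hi[i + 1] - F_hi[i]) / (z[i + 1] - z[i])
    return c * (t - z[i]) + F_hi[i]
```

By construction the interpolant matches the step-function tube at the order statistics themselves and connects them linearly in between.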


lie within the relaxed DKW tubes. Furthermore, since they are linear, this solution necessarily has the property that f_0 and f_1 are multiples on [z_i, z_{i+1}].

Applying Theorem 3.1, the general problem may be simplified by placing pins at all order statistics, yielding a system of 2n variables in lieu of two functions. The functional I(X;Y) can thus be written discretely:

I(X;Y) = 1 + \frac{1}{2} \sum_{i=1}^{n+1} f_0^i \log\frac{f_0^i}{f_0^i+f_1^i} + \frac{1}{2} \sum_{i=1}^{n+1} f_1^i \log\frac{f_1^i}{f_0^i+f_1^i}
= 1 + \frac{1}{2} M(f_0^1, f_0^2, \ldots, f_0^n, f_0^{n+1}, f_1^1, f_1^2, \ldots, f_1^n, f_1^{n+1})   (3–14)

where I is equivalently minimized by the function M, subject to a host of constraints. Finding the minimum of M can now be cleanly posed as a constrained optimization problem (Boyd & Vandenberghe, 2004).

3.2.3 Constraints on the Distribution Functions

For any statistic z_i: 1 \le i \le n, four constraints are imposed by the DKW tubes of F_0 and F_1, and two more ensure the nonnegativity of f_0 and f_1:

g_i^1 = \sum_{j=1}^i f_0^j - F_0^+(z_i) \le 0 \qquad g_i^2 = F_0^-(z_i) - \sum_{j=1}^i f_0^j \le 0

g_i^3 = \sum_{j=1}^i f_1^j - F_1^+(z_i) \le 0 \qquad g_i^4 = F_1^-(z_i) - \sum_{j=1}^i f_1^j \le 0

g_i^5 = -f_0^i \le 0 \qquad g_i^6 = -f_1^i \le 0   (3–15)

Two more constraints necessitate that the total probability under each curve sums to 1:

h^0 = \sum_{j=1}^{n+1} f_0^j - 1 = 0 \qquad h^1 = \sum_{j=1}^{n+1} f_1^j - 1 = 0   (3–16)

Note that the subscripts for the inequality constraints do not include the point z_{n+1}, since these conditions would be redundant.

The Lagrangian for this optimization problem is therefore:

L = M + \sum_{i=1}^n \big( \mu_i^1 g_i^1 + \mu_i^2 g_i^2 + \mu_i^3 g_i^3 + \mu_i^4 g_i^4 + \mu_i^5 g_i^5 + \mu_i^6 g_i^6 \big) + \lambda_0 h^0 + \lambda_1 h^1   (3–17)
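The discrete objective of Equation 3–14 maps directly onto code. A sketch (the function name is ours; zero-mass terms are treated as contributing zero by continuity, and base-2 logarithms give the answer in bits):

```python
import math

def discrete_mi(f0, f1):
    # Eq. 3-14: I(X;Y) = 1 + M/2, where f0[i] and f1[i] are the
    # conditional probability masses on the i-th inter-statistic
    # interval (each sequence sums to 1).
    M = 0.0
    for a, b in zip(f0, f1):
        if a > 0.0:
            M += a * math.log2(a / (a + b))
        if b > 0.0:
            M += b * math.log2(b / (a + b))
    return 1.0 + 0.5 * M
```

Identical mass vectors give 0 bits, while fully disjoint supports give the full 1 bit of class information.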


With respect to the arguments of the objective function M, the constraints g_i^1, g_i^2, g_i^3, g_i^4, g_i^5, g_i^6, h^0, and h^1 are both linear and continuously differentiable. Without loss of generality, the Hessian of M at any z_i is:

H(M) = \begin{bmatrix} \frac{f_1^i}{f_0^i(f_0^i+f_1^i)} & -\frac{1}{f_0^i+f_1^i} \\ -\frac{1}{f_0^i+f_1^i} & \frac{f_0^i}{f_1^i(f_0^i+f_1^i)} \end{bmatrix}

which is positive semi-definite. Since all constraints are affine and the problem is strictly feasible, the Karush-Kuhn-Tucker (KKT) conditions are both necessary and sufficient (Boyd & Vandenberghe, 2004).

3.2.4 KKT Conditions for the General Problem

The first KKT condition specifies that the gradient of the Lagrangian must be zero:

\nabla L = 0   (3–18)

where

\frac{\partial L}{\partial f_0^1} = \log\frac{f_0^1}{f_0^1+f_1^1} - \mu_1^5 + \mu_1^1 + \mu_2^1 + \cdots + \mu_n^1 - \mu_1^2 - \mu_2^2 - \cdots - \mu_n^2 + \lambda_0 = 0

\frac{\partial L}{\partial f_1^1} = \log\frac{f_1^1}{f_0^1+f_1^1} - \mu_1^6 + \mu_1^3 + \mu_2^3 + \cdots + \mu_n^3 - \mu_1^4 - \mu_2^4 - \cdots - \mu_n^4 + \lambda_1 = 0

\frac{\partial L}{\partial f_0^2} = \log\frac{f_0^2}{f_0^2+f_1^2} - \mu_2^5 + \mu_2^1 + \mu_3^1 + \cdots + \mu_n^1 - \mu_2^2 - \mu_3^2 - \cdots - \mu_n^2 + \lambda_0 = 0

\frac{\partial L}{\partial f_1^2} = \log\frac{f_1^2}{f_0^2+f_1^2} - \mu_2^6 + \mu_2^3 + \mu_3^3 + \cdots + \mu_n^3 - \mu_2^4 - \mu_3^4 - \cdots - \mu_n^4 + \lambda_1 = 0

\ldots

\frac{\partial L}{\partial f_0^n} = \log\frac{f_0^n}{f_0^n+f_1^n} - \mu_n^5 + \mu_n^1 - \mu_n^2 + \lambda_0 = 0

\frac{\partial L}{\partial f_1^n} = \log\frac{f_1^n}{f_0^n+f_1^n} - \mu_n^6 + \mu_n^3 - \mu_n^4 + \lambda_1 = 0

\frac{\partial L}{\partial f_0^{n+1}} = \log\frac{f_0^{n+1}}{f_0^{n+1}+f_1^{n+1}} + \lambda_0 = 0

\frac{\partial L}{\partial f_1^{n+1}} = \log\frac{f_1^{n+1}}{f_0^{n+1}+f_1^{n+1}} + \lambda_1 = 0


Notably, with regard to the tube-related constraints \mu_i^1, \mu_i^2, \mu_i^3, and \mu_i^4, the terms of successive partials are upper triangular when viewing the order statistics in increasing order.

The remaining KKT conditions are primal feasibility, dual feasibility, and complementary slackness. Primal feasibility requires the satisfaction of all constraints:

h^0 = h^1 = 0 \qquad g_i^k \le 0 \quad \forall k = 1 \ldots 6, \; \forall i = 1 \ldots n

Dual feasibility enforces nonnegativity of all of the \mu multipliers:

\mu_i^k \ge 0 \quad \forall k = 1 \ldots 6, \; \forall i = 1 \ldots n

Complementary slackness dictates that:

\mu_i^k g_i^k = 0 \quad \forall k = 1 \ldots 6, \; \forall i = 1 \ldots n   (3–19)

As in the proof of Theorem 3.1, the complementary slackness criteria can be used to determine the monotonicity multipliers \mu_i^5 and \mu_i^6. For any i, without loss of generality, if \mu_i^5 is nonzero, then f_0^i = 0 by Equation 3–19. This implies that \partial L / \partial f_0^i is non-finite, contradicting Equation 3–18. A similar argument can be made for \mu_i^6 and f_1^i. Therefore:

\mu_i^5 = \mu_i^6 = 0 \quad \forall i = 1 \ldots n

It is also important to note the symmetries between \mu_i^1 and \mu_i^2, and \mu_i^3 and \mu_i^4. When \mu_i^1 is nonzero, the curve F_0(z_i) is said to be tight against the top of its tube. Since the tube must have nonzero width at any non-boundary point, F_0(z_i) cannot also be tight against the bottom of the tube. Consequently, Equation 3–19 implies that \mu_i^2 is 0. Similarly, when \mu_i^2 is nonzero, \mu_i^1 must be 0, corresponding to F_0(z_i) being tight against the bottom of its tube. If F_0(z_i) lies in the middle of the tube, then \mu_i^1 = \mu_i^2 = 0. An analogous relationship exists between \mu_i^3, \mu_i^4, and F_1. For convenience, we take


advantage of these properties to define two new variable sets:

\mu_i^{12} \equiv \mu_i^1 - \mu_i^2 \quad \text{and} \quad \mu_i^{34} \equiv \mu_i^3 - \mu_i^4

Conceptually, \mu_i^{12} is positive if and only if F_0(z_i) is tight against the top of its tube, negative if and only if F_0(z_i) is tight against the bottom of its tube, and 0 if the curve lies within the tube without the need for slack. \mu_i^{34} is defined similarly with respect to F_1(z_i).

As a consequence of all the above, we observe that \lambda_0 and \lambda_1, as well as \mu_i^{12} and \mu_i^{34} for i = 1 \ldots n, are pairs of dependent variables whose values are governed by the equations:

2^{-\lambda_0 - \sum_{i=1}^{j} \mu_{n-i+1}^{12}} + 2^{-\lambda_1 - \sum_{i=1}^{j} \mu_{n-i+1}^{34}} = 1 \quad \forall j = 0 \ldots n   (3–20)

These equations imply that \lambda_0 and \lambda_1 are strictly positive. For any pair \mu_i^{12} and \mu_i^{34}, it is easy to show that either variable takes a value of 0 if and only if its counterpart also has a value of 0.

3.2.5 A Constructive Algorithm for the General Solution

An optimal solution to the general problem can be obtained by constructing a set of values that satisfy the KKT conditions from Section 3.2.4. Informally, we take advantage of the upper triangular structure of Equation 3–18 to arrive at a feasible solution for the KKT constraints. We propose an algorithm that starts at the bottom of the system and rises to the top, incrementally generating a solution. Figure 3-3 gives a schematic diagram of the algorithm's operation. Beginning from the rightmost point z_{n+1} and working left, the algorithm locates the pinned points at which the distribution functions are tight against the DKW tubes. At termination, the subset of the pinned points for which \mu_i^{12} and \mu_i^{34} are nonzero has been identified, which in turn enables a solution to be determined through repeated application of Theorem 3.1. An overview of the procedure is as follows:


Figure 3-3. Visual depiction of the algorithm, highlighting Pass 3. Starting at statistic z_p, ratios are constructed from right to left until the first unsatisfiable ratio is encountered, v_{[p-8,p]} in this example. Note that the pictured [v_{sign}^-, v_{sign}^+] results from the direction of the pins from Pass 2, F_0(z_p) and F_1(z_p). The interval [v_{int}^-, v_{int}^+] marks the intersection of the satisfiable intervals. Next, the innermost ratio on the same side as the unsatisfiable interval is found, which subsequently determines the next point to be pinned. Here, the innermost ratio is v_{[p-3,p]}^+ = v_{int}^+, and the point to be pinned is therefore z_s = z_{p-4}. The algorithm then proceeds to Pass 4, setting z_p = z_s.


1. Create a variable, z_p, to denote the leftmost determined pin (not including z_0). Set z_p = z_{n+1}. Create two variables, \mu_p^{12} and \mu_p^{34}, to represent the leftmost undetermined slack variables. Assign the variable \lambda_0 to \mu_p^{12}, and the variable \lambda_1 to \mu_p^{34}.

2. Check whether there is a tube-inactive solution on the interval [z_0, z_p] using the method of Sections 3.2.6 and 3.2.7. A tube-inactive solution on an interval [z_a, z_b] is one for which F_0 and F_1 are not tight against their tubes throughout the interior of the interval, so that \mu_i^{12} = \mu_i^{34} = 0 \forall i: a < i < b. In other words, the tube constraints are inactive on the interior of the interval [z_a, z_b]. Such a solution clearly reduces to the tube-unconstrained problem addressed in Theorem 3.1. If a tube-inactive solution exists on [z_0, z_p], find the solution on this segment using Theorem 3.1, and stop. If not, proceed to Step 3.

3. Using the method of Section 3.2.8, find the rightmost statistic, z_k, at which a tube-inactive solution on [z_k, z_p] is not possible, implying that some \mu_i^{12} and \mu_i^{34} pair on [z_{k+1}, z_{p-1}] is nonzero. Through the method described in Section 3.2.9, determine the rightmost statistic, z_m, for which \mu_m^{12} and \mu_m^{34} must be nonzero, and determine the signs of both \mu_m^{12} and \mu_m^{34}. This knowledge in turn defines whether each of the cdfs touches the top or the bottom of its respective tube at z_m, thus pinning the solution at F_0(z_m) and F_1(z_m).

4. Find a tube-inactive solution on [z_m, z_p], thereby solving for \mu_p^{12} and \mu_p^{34}.

5. Set z_p = z_m. Set \mu_p^{12} = \mu_m^{12} and \mu_p^{34} = \mu_m^{34}. Record the signs of \mu_p^{12} and \mu_p^{34} for use in Step 3. Go to Step 2.
3.2.6 The Existence of a Tube-Inactive Solution

By definition, a pinned interval [z_i, z_p] has a tube-inactive solution if the solution curves F_0 and F_1 are monotonically non-decreasing and are not affected by the tube constraints of Equation 3–15. Equivalently, the KKT conditions on the interval are satisfied with:

\mu_{i+1}^{12} = \mu_{i+1}^{34} = \mu_{i+2}^{12} = \mu_{i+2}^{34} = \ldots = \mu_{p-1}^{12} = \mu_{p-1}^{34} = 0

The primal feasibility, dual feasibility, and complementary slackness conditions are therefore trivially satisfied. Consequently, a tube-inactive solution exists on the interval if and only if it is possible to satisfy the zero-gradient conditions:

\log\frac{f_0^{i+1}}{f_0^{i+1}+f_1^{i+1}} = -\mu_p^{12} - C_{(p,n+1]}^{12} \qquad \log\frac{f_1^{i+1}}{f_0^{i+1}+f_1^{i+1}} = -\mu_p^{34} - C_{(p,n+1]}^{34}


\log\frac{f_0^{i+2}}{f_0^{i+2}+f_1^{i+2}} = -\mu_p^{12} - C_{(p,n+1]}^{12} \qquad \log\frac{f_1^{i+2}}{f_0^{i+2}+f_1^{i+2}} = -\mu_p^{34} - C_{(p,n+1]}^{34}

\ldots

\log\frac{f_0^p}{f_0^p+f_1^p} = -\mu_p^{12} - C_{(p,n+1]}^{12} \qquad \log\frac{f_1^p}{f_0^p+f_1^p} = -\mu_p^{34} - C_{(p,n+1]}^{34}

where C_{(p,n+1]}^{12} and C_{(p,n+1]}^{34} are constants fixed by previous iterations of the algorithm:

C_{(p,n+1]}^{12} = \begin{cases} \sum_{j=p+1}^{n} \mu_j^{12} + \lambda_0 & \text{if } p \le n \\ \lambda_0 & \text{else} \end{cases} \qquad C_{(p,n+1]}^{34} = \begin{cases} \sum_{j=p+1}^{n} \mu_j^{34} + \lambda_1 & \text{if } p \le n \\ \lambda_1 & \text{else} \end{cases}

To simplify the problem, the zero-gradient conditions can be rewritten into an equivalent system involving the ratios between f_0 and f_1 at each point z_j. This substitution is made possible by noting that at any z_j, setting v_j = f_0^j / f_1^j means that:

\frac{f_0^j}{f_0^j+f_1^j} = \frac{v_j}{v_j+1} \quad \text{and} \quad \frac{f_1^j}{f_0^j+f_1^j} = \frac{1}{v_j+1}

Also,

v_j \in [v_j^-, v_j^+], \quad \text{where} \quad v_j^- \equiv \frac{\min(f_0^j)}{\max(f_1^j)} \quad \text{and} \quad v_j^+ \equiv \frac{\max(f_0^j)}{\min(f_1^j)}   (3–21)

where monotonicity ensures that v_j^- \ge 0 and v_j^+ is finite. The zero-gradient conditions then become:

\log\frac{v_{i+1}}{v_{i+1}+1} = -\mu_p^{12} - C_{(p,n+1]}^{12} \qquad \log\frac{1}{v_{i+1}+1} = -\mu_p^{34} - C_{(p,n+1]}^{34}

\log\frac{v_{i+2}}{v_{i+2}+1} = -\mu_p^{12} - C_{(p,n+1]}^{12} \qquad \log\frac{1}{v_{i+2}+1} = -\mu_p^{34} - C_{(p,n+1]}^{34}

\ldots

\log\frac{v_p}{v_p+1} = -\mu_p^{12} - C_{(p,n+1]}^{12} \qquad \log\frac{1}{v_p+1} = -\mu_p^{34} - C_{(p,n+1]}^{34}


Recall that by Theorem 3.1, any solution on a pinned interval that satisfies all feasibility and complementary slackness conditions and has the property that the densities f_0 and f_1 are multiples must be an optimal solution on that interval. The existence of a ratio v_j means that the two probability masses at z_j differ by the constant multiplier v_j. Consequently, the issue of finding whether a tube-inactive solution exists on [z_i, z_p] is equivalent to finding whether there exists some ratio, v, that satisfies all of the v_j constraints simultaneously, meaning:

v = v_j \quad \forall j: (i+1) \le j \le p

We call the problem of finding such a v the ratio satisfiability problem.

Substituting the satisfying ratio v, the zero-gradient system simplifies to:

\log\frac{v}{v+1} = -\mu_p^{12} - C_{(p,n+1]}^{12} \qquad \log\frac{1}{v+1} = -\mu_p^{34} - C_{(p,n+1]}^{34}   (3–22)

One final transformation will facilitate an iterative solution to the ratio satisfiability problem. We define each v_{[j,p]} to be the ratio of the probability masses of the two curves on the interval [z_{j-1}, z_p]:

v_{[j,p]} \in [v_{[j,p]}^-, v_{[j,p]}^+], \quad \text{where} \quad v_{[j,p]}^- \equiv \frac{\min(\sum_{l=j}^p f_0^l)}{\max(\sum_{l=j}^p f_1^l)} \quad \text{and} \quad v_{[j,p]}^+ \equiv \frac{\max(\sum_{l=j}^p f_0^l)}{\min(\sum_{l=j}^p f_1^l)}

v_{[j,p]}^- = \frac{\max(0, F_0(z_p) - F_0^+(z_{j-1}))}{F_1(z_p) - F_1^-(z_{j-1})} \qquad v_{[j,p]}^+ = \frac{F_0(z_p) - F_0^-(z_{j-1})}{\max(0, F_1(z_p) - F_1^+(z_{j-1}))}   (3–23)

It is straightforward to show that either set of ratios has a satisfying v if and only if the other ratio set is satisfied by the same v:

v_{i+1} = v_{i+2} = \ldots = v_p = v \iff v_{[i+1,p]} = v_{[i+2,p]} = \ldots = v_{[p,p]} = v

Henceforth, we refer to the ratio satisfiability problem for the v_{[j,p]} ratio set, meaning:

v = v_{[j,p]} \quad \forall j: (i+1) \le j \le p


The ratio satisfiability problem is pictured in Figure 3-3. Algorithmically, it can be solved by computing the intersection of all the v_{[j,p]} intervals between v_{[i+1,p]} and v_{[p,p]}, which is all such ratios on the interval [z_i, z_p]. If the intersection is non-empty:

\bigcap_{j=i+1}^{p} [v_{[j,p]}^-, v_{[j,p]}^+] \ne \emptyset   (3–24)

then the conditions are satisfiable, and the interval [z_i, z_p] has a tube-inactive solution. Otherwise, the solution on the given interval must be tight against the tubes at some intermediate point.

3.2.7 The Extended Ratio Satisfiability Problem

It is clear from the description of the algorithm that \mu_p^{12} \ne 0 and \mu_p^{34} \ne 0 for all algorithmic passes. In the case of the first pass, \mu_p^{12} > 0 and \mu_p^{34} > 0. For all subsequent passes, the signs of \mu_p^{12} and \mu_p^{34} will have already been determined by the previous pass. Therefore, the ratio satisfiability problem must be amended to account for this additional constraint. Let the variable t represent the index of the current pass of the algorithm: t = 1 denotes the first pass, t = 2 the second pass, and so on.

An equivalent condition for the signs in terms of a satisfying ratio v = v(t) becomes clear when examining the zero-gradient conditions from Equation 3–22. The constants C_{(p,n+1]}^{12} and C_{(p,n+1]}^{34} can be unrolled as:

\log\frac{v(t-1)}{v(t-1)+1} = -C_{(p,n+1]}^{12} = -C_{(p+1,n+1]}^{12} - \mu_{p+1}^{12}

\log\frac{1}{v(t-1)+1} = -C_{(p,n+1]}^{34} = -C_{(p+1,n+1]}^{34} - \mu_{p+1}^{34}

The t-th pass zero-gradient conditions from Equation 3–22 are then equivalent to:

\frac{v(t)}{v(t)+1} = \frac{v(t-1)}{v(t-1)+1} \cdot 2^{-\mu_p^{12}} \qquad \frac{1}{v(t)+1} = \frac{1}{v(t-1)+1} \cdot 2^{-\mu_p^{34}}   (3–25)

Now if F_0(z_p) is pinned at the top of its tube, then \mu_p^{12} > 0, and subsequently:

\frac{v(t)}{v(t)+1} < \frac{v(t-1)}{v(t-1)+1}
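The intersection test of Equations 3–23 and 3–24 can be sketched as follows (the function and argument names are ours; edge cases at the tube boundaries are simplified, and the sign-range constraint of Section 3.2.7 is omitted):

```python
import math

def has_tube_inactive_solution(F0_zp, F1_zp, F0_hi, F0_lo, F1_hi, F1_lo):
    # F0_zp, F1_zp: the pinned values F_0(z_p), F_1(z_p).
    # F?_hi[j], F?_lo[j]: upper/lower tube boundaries at z_{j-1}, listed
    # left to right for j = i+1 ... p. Build each ratio interval
    # [v-_{[j,p]}, v+_{[j,p]}] (Eq. 3-23) and intersect them (Eq. 3-24).
    v_lo, v_hi = 0.0, math.inf
    for F0p, F0m, F1p, F1m in zip(F0_hi, F0_lo, F1_hi, F1_lo):
        num_lo = max(0.0, F0_zp - F0p)
        den_lo = F1_zp - F1m              # positive inside the tube
        num_hi = F0_zp - F0m
        den_hi = max(0.0, F1_zp - F1p)
        v_minus = num_lo / den_lo if den_lo > 0.0 else 0.0
        v_plus = num_hi / den_hi if den_hi > 0.0 else math.inf
        v_lo, v_hi = max(v_lo, v_minus), min(v_hi, v_plus)
    return v_lo <= v_hi
```

A non-empty running intersection means some common ratio v satisfies every interval, i.e. a tube-inactive solution exists on [z_i, z_p].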


v(t) < v(t-1)   (3–26)

On the other hand, if F_0(z_p) is pinned at the bottom of its tube, then \mu_p^{12} < 0, and:

v(t) > v(t-1)   (3–27)

Also, it follows from Equation 3–20 that if the solution for one cdf is tight against its tube at any statistic z_p (for z_p \ne z_{n+1}), then the other cdf must also be tight against its tube.

Corollary 1. At any statistic z_p where z_p < z_{n+1} and the solution curves F_0 and F_1 are tight against their tubes, the two curves must be positioned either towards each other or away from each other:

F_0(z_p) = F_0^+(z_p), \; F_1(z_p) = F_1^-(z_p) \quad \text{or} \quad F_0(z_p) = F_0^-(z_p), \; F_1(z_p) = F_1^+(z_p)   (3–28)

Proof of Corollary 1. As shown above (Equation 3–26):

F_0(z_p) = F_0^+(z_p) \Rightarrow v(t) < v(t-1)

A similar result for F_1 can be derived from Equation 3–25, yielding:

F_1(z_p) = F_1^+(z_p) \Rightarrow v(t) > v(t-1)   (3–29)

Hence, if both curves are tight against the tops of their tubes, then no consistent v exists for algorithmic pass t, which contradicts the assumption that this pinned point is part of a global solution. Analogous logic shows that the two curves cannot both be tight against the bottoms of their tubes. Therefore the two curves must be tight either towards each other or away from each other.

So then v(t-1) places a bound on the new ratio v(t), and this bound is equivalent to enforcing the signs of \mu_p^{12} and \mu_p^{34} in the zero-gradient system. Let v^-(t-1) denote the minimum ratio v(t-1) that would have satisfied the interval handled by the previous


pass. Since z_p must be pinned as in Corollary 1, the definitions of Equation 3–23 imply that v(t-1) = v^-(t-1) if and only if F_0(z_p) is pinned at the top of its tube and F_1(z_p) is pinned at the bottom of its tube. Similarly, let v^+(t-1) denote the maximum ratio v(t-1) that would have satisfied the previous interval. Then v(t-1) = v^+(t-1) when F_0(z_p) is pinned down and F_1(z_p) is pinned up. Incorporating Equations 3–26 and 3–27, the extended ratio satisfiability problem can be completed by including a constraint imposed by the range:

[v_{sign}^-, v_{sign}^+] \equiv \begin{cases} [0, \infty) & \text{if } t = 1 \\ [0, v(t-1)) & \text{if } v(t-1) = v^+(t-1) \\ (v(t-1), \infty) & \text{if } v(t-1) = v^-(t-1) \end{cases}   (3–30)

Then a satisfying ratio can be found if and only if:

[v_{sign}^-, v_{sign}^+] \cap \bigcap_{j=i+1}^{p} [v_{[j,p]}^-, v_{[j,p]}^+] \ne \emptyset   (3–31)

3.2.8 The Nonexistence of a Tube-Inactive Solution

Because a satisfying v can be found for some interval if and only if the interval has a tube-inactive solution, the lack of a satisfying v on an interval [z_k, z_p] indicates that no tube-inactive solution exists:

[v_{sign}^-, v_{sign}^+] \cap \bigcap_{j=k+1}^{p} [v_{[j,p]}^-, v_{[j,p]}^+] = \emptyset   (3–32)

During the execution of the algorithm, the statistic z_k must be determined for each pass, relative to the current z_p. The intervals \{ [v_{[j,p]}^-, v_{[j,p]}^+] \mid j = (k+2) \ldots p \} are collectively referred to as the satisfiable intervals. The intersection of the satisfiable intervals and the current range constraint is denoted as:

[v_{int}^-, v_{int}^+] \equiv [v_{sign}^-, v_{sign}^+] \cap \bigcap_{j=k+2}^{p} [v_{[j,p]}^-, v_{[j,p]}^+]   (3–33)
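The range constraint of Equation 3–30 is a small piece of bookkeeping. A sketch (the function name and boolean flag are ours; the open endpoints are treated as closed for simplicity):

```python
import math

def sign_range(t, v_prev=None, prev_at_max=None):
    # Eq. 3-30: admissible range [v-_sign, v+_sign] for the pass-t ratio.
    # v_prev is the satisfying ratio of pass t-1; prev_at_max is True if
    # v(t-1) equaled v+(t-1) (F0 pinned down, F1 pinned up), and False if
    # it equaled v-(t-1) (F0 pinned up, F1 pinned down).
    if t == 1:
        return (0.0, math.inf)
    if prev_at_max:
        return (0.0, v_prev)   # v(t) must lie below v(t-1)
    return (v_prev, math.inf)  # v(t) must lie above v(t-1)
```

A satisfying ratio for the pass then exists exactly when this range intersects the intersection of the satisfiable intervals, as in Equation 3–31.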


The interval [v_{[k+1,p]}^-, v_{[k+1,p]}^+] is referred to as the first unsatisfiable interval since, as the algorithm works left from z_p, z_k is the first point at which the above intersection is empty. In this case, there must be some statistic on [z_{k+1}, z_{p-1}] at which the curves are tight against their tubes. Since the algorithm seeks to place pins at every tight point, we must find the rightmost such point, which we denote z_m. Once z_m has been found for the current pass, the algorithm proceeds to the next pass using z_m as the new z_p.

3.2.9 Finding the Rightmost Tight Statistic, z_m

Identifying z_m, the rightmost statistic on an interval [z_{k+1}, z_{p-1}] whose solution is tight against the tubes, follows from a simple property of the set of all minimum and maximum ratios on the interval. We define the innermost ratio v_{[s+1,p]} and a corresponding innermost statistic z_s as follows:

v_{[s+1,p]} \equiv \begin{cases} v_{int}^- & \text{if } v_{[k+1,p]}^+ < v_{int}^- \\ v_{int}^+ & \text{if } v_{[k+1,p]}^- > v_{int}^+ \end{cases}   (3–34)

and

z_s \equiv \begin{cases} z_{l-1} & \text{if } v_{[k+1,p]}^+ < v_{int}^- \\ z_{r-1} & \text{if } v_{[k+1,p]}^- > v_{int}^+ \end{cases}   (3–35)

where

l = \arg\max_{j \mid (k+2) \le j \le p} v_{[j,p]}^- \quad \text{and} \quad r = \arg\min_{j \mid (k+2) \le j \le p} v_{[j,p]}^+

So by definition, the innermost ratio always lies on the same side of the satisfiable intervals as the first unsatisfiable interval [v_{[k+1,p]}^-, v_{[k+1,p]}^+].

Theorem 3.2 will prove that the innermost statistic and the rightmost tight statistic are equivalent. The theorem relies on the following lemma:

Lemma 1. Given a statistic z_m that is chosen to pin the interval [z_m, z_p] on the t-th algorithmic pass, if z_m is pinned so that:

F_0(z_m) = F_0^+(z_m) \quad \text{and} \quad F_1(z_m) = F_1^-(z_m)


(which is true when v_{[m+1,p]}(t) = v_{[m+1,p]}^-(t)), and if F_0(z_p) \ge F_0^+(z_{m-1}) and v_{[m,p]}^-(t) < v_{[m+1,p]}^-(t), then:

v_{[m,m]}^-(t+1) < v_{[m,p]}^-(t)

Similarly, if z_m is pinned so that:

F_0(z_m) = F_0^-(z_m) \quad \text{and} \quad F_1(z_m) = F_1^+(z_m)

(which is true when v_{[m+1,p]}(t) = v_{[m+1,p]}^+(t)), and if F_1(z_p) \ge F_1^+(z_{m-1}) and v_{[m,p]}^+(t) > v_{[m+1,p]}^+(t), then:

v_{[m,m]}^+(t+1) > v_{[m,p]}^+(t)

In other words, when placing a pin at z_m, if F_0(z_m) is tight against the top of its tube and F_1(z_m) is tight against the bottom, then the minimum ratio v_{[m,m]}^-(t+1), the first new minimum ratio to the left of the pin on the algorithm's next pass, will be less than the old ratio v_{[m,p]}^-(t) as long as v_{[m,p]}^-(t) < v_{[m+1,p]}^-(t). Similarly, if F_0(z_m) is tight against the bottom of its tube and F_1(z_m) is tight against the top, then the maximum ratio v_{[m,m]}^+(t+1) will be greater than the old ratio v_{[m,p]}^+(t) as long as v_{[m,p]}^+(t) > v_{[m+1,p]}^+(t).

Proof of Lemma 1.

v_{[m,p]}^-(t) < v_{[m+1,p]}^-(t)

\frac{F_0(z_p) - F_0^+(z_{m-1})}{F_1(z_p) - F_1^-(z_{m-1})} < \frac{F_0(z_p) - F_0^+(z_m)}{F_1(z_p) - F_1^-(z_m)}

\Rightarrow \frac{F_0^+(z_m) - F_0^+(z_{m-1})}{F_1^-(z_m) - F_1^-(z_{m-1})} < \frac{F_0(z_p) - F_0^+(z_{m-1})}{F_1(z_p) - F_1^-(z_{m-1})}

\Rightarrow v_{[m,m]}^-(t+1) < v_{[m,p]}^-(t)


For the second case,

v_{[m,p]}^+(t) > v_{[m+1,p]}^+(t)

\frac{F_0(z_p) - F_0^-(z_{m-1})}{F_1(z_p) - F_1^+(z_{m-1})} > \frac{F_0(z_p) - F_0^-(z_m)}{F_1(z_p) - F_1^+(z_m)}

\Rightarrow \frac{F_0^-(z_m) - F_0^-(z_{m-1})}{F_1^+(z_m) - F_1^+(z_{m-1})} > \frac{F_0(z_p) - F_0^-(z_{m-1})}{F_1(z_p) - F_1^+(z_{m-1})}

\Rightarrow v_{[m,m]}^+(t+1) > v_{[m,p]}^+(t)

Theorem 3.2. Given some z_k and z_p such that a tube-inactive solution exists on [z_{k+1}, z_p], but not on [z_k, z_p], the rightmost statistic z_m at which the global solution must be tight against the tubes is exactly the innermost statistic z_s.

Proof of Theorem 3.2. Clearly, the ratio v_{[m+1,p]} corresponding to the rightmost statistic z_m must possess three attributes:

1. Feasibility. The ratio must represent two cdfs that touch their tubes, so that either:

F_0(z_m) = F_0^+(z_m) \text{ and } F_1(z_m) = F_1^-(z_m) \quad \text{or} \quad F_0(z_m) = F_0^-(z_m) \text{ and } F_1(z_m) = F_1^+(z_m)   (3–36)

implying that either:

v_{[m+1,p]} = v_{[m+1,p]}^- \quad \text{or} \quad v_{[m+1,p]} = v_{[m+1,p]}^+   (3–37)

2. Backward Consistency. The chosen ratio must satisfy the interval [z_m, z_p], meaning:

v_{[j,p]}^- \le v_{[m+1,p]} \le v_{[j,p]}^+ \quad \forall j: (m+1) \le j \le p   (3–38)

3. Forward Consistency. The ratio must be consistent with a global solution. In other words, it must not contradict the existence of a solution on the remaining interval [z_0, z_m]. The algorithm of Section 3.2.5 can proceed as long as the statistic z_m can be included in the next pass, thereby inductively guaranteeing a solution on [z_0, z_m]:

[v_{sign}^-(t+1), v_{sign}^+(t+1)] \cap [v_{[m,m]}^-(t+1), v_{[m,m]}^+(t+1)] \ne \emptyset   (3–39)


Now, consider the innermost statistic z_s and the innermost ratio v_{[s+1,p]}. This ratio is feasible by definition, and backwards consistent since there is a tube-inactive solution for the entire interval [z_s, z_p].

Furthermore, the ratio can be shown to be forward consistent. First, consider the case when the first unsatisfiable interval lies to the left of the satisfiable intervals, so that v_{[s+1,p]}(t) = v_{int}^-(t) = v_{[s+1,p]}^-(t). We show that Lemma 1 applies for z_m = z_s. By the definition of an innermost ratio, v_{[s,p]}^- < v_{[s+1,p]}^-. Furthermore, we see that F_0^+(z_{s-1}) < F_0(z_p), since if F_0^+(z_{s-1}) \ge F_0(z_p) then F_0^+(z_s) \ge F_0(z_p) and so v_{[s+1,p]}^- = 0, which contradicts the assumption that the first unsatisfiable interval lies to the left of v_{int}^-. By Lemma 1, the new minimum ratio v_{[s,s]}^-(t+1) cannot be greater than the old minimum ratio, v_{[s,p]}^-(t), which is also the bound derived from the signs of the \mu's.

On the other hand, if the first unsatisfiable interval lies to the right of the satisfiable intervals, then v_{[s+1,p]}(t) = v_{int}^+(t) = v_{[s+1,p]}^+(t), v_{[s,p]}^+ > v_{[s+1,p]}^+, and F_1^+(z_{s-1}) < F_1(z_p). Again Lemma 1 applies, stating that the new maximum ratio v_{[s,s]}^+(t+1) cannot be less than the old maximum ratio v_{[s,p]}^+(t).

Therefore, the interval [v_{[s,s]}^-(t+1), v_{[s,s]}^+(t+1)] must be subsumed by the interval [v_{sign}^-(t+1), v_{sign}^+(t+1)]. Since the interval itself must be nonempty, the ratio v_{[s+1,p]} is forward consistent by Equation 3–39.

The innermost statistic z_s thus satisfies the three properties of the rightmost statistic, showing that it is consistent with a global solution as a candidate for z_m. By Theorem 3.1, the unconstrained solution on [z_s, z_p] obtained by pinning only at z_s and z_p must be optimal. Therefore, there can be no other statistic z_j: z_s < z_j < z_p that is tight against its tubes without contradicting Theorem 3.1. So, the statistic z_s corresponding to the innermost ratio is exactly the rightmost statistic z_m.

The algorithm of Section 3.2.5 is thereby proven to construct a solution on [z_0, z_{n+1}]. Since the procedure equivalently satisfies the KKT conditions for the general problem, the solution is guaranteed to be optimal. The running time of the algorithm is O(n^2),


since it moves linearly through the order statistics to determine the pin locations, with exactly one linear backtrack on each interval to find each rightmost z_m.


CHAPTER 4
RESULTS

4.1 Independent Evaluation of the MI Bound

We demonstrate the performance of our lower bound on two well-known distribution families for which the true MI between class labels and classifier outputs is computed numerically. Using a confidence of 0.99, the DKW tube widths are fixed to 2\epsilon_{dkw}, and n again denotes the sample size.

We first generate data for two normally-distributed class-conditional inputs using unit variances. Letting d denote the distance between the means of the two distributions, the true MI is:

1 + \frac{1}{2\sqrt{2\pi}} \int_{-\infty}^{\infty} \Big[ e^{-\frac{(y-d)^2}{2}} \log\frac{e^{yd - \frac{d^2}{2}}}{1 + e^{yd - \frac{d^2}{2}}} - e^{-\frac{y^2}{2}} \log\big(1 + e^{yd - \frac{d^2}{2}}\big) \Big] dy

For n = 200,000, Figure 4-1 compares the lower bound on MI obtained by our result to the true MI, as well as to the lower bound given by the Fano method, for a number of distances.

While our algorithm produces an MI bound without assuming any particular classifier, the Fano method relies on an estimate of the probability of classification error, and thus inherently requires a classifier. The Fano results shown here are computed using the theoretically-optimal discriminant from the true generating distributions, in order to avoid any negative bias stemming from the use of a sub-optimal discriminant. Using the correction from Equation 3–2 and a confidence of 0.99, the Fano method gives a lower bound on MI as in Equation 3–3.

In addition to the normally-distributed inputs, we also sample two gamma distributions with unit scale parameters. The first class is generated with shape parameter 1, and the second class uses an integer shape parameter denoted by the variable k. Thus, both classes are drawn from Erlang distributions, and the true MI is:

1 + \frac{1}{2} \int_0^{\infty} e^{-y} \Big[ \log\frac{(k-1)!}{y^{k-1} + (k-1)!} + \frac{y^{k-1}}{(k-1)!} \log\frac{y^{k-1}}{y^{k-1} + (k-1)!} \Big] dy   (4–1)
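The Gaussian integral above can be checked by direct quadrature. This is a sketch; the function name, integration range, and step count are ours:

```python
import math

def true_mi_gaussians(d, lo=-25.0, hi=30.0, steps=50_000):
    # True MI in bits between the binary class label and the output y of
    # two equiprobable unit-variance Gaussian classes with means 0 and d,
    # computed by trapezoidal integration of the mixture expression.
    norm = 1.0 / math.sqrt(2.0 * math.pi)

    def g(y):
        f0 = norm * math.exp(-0.5 * y * y)
        f1 = norm * math.exp(-0.5 * (y - d) ** 2)
        s = f0 + f1
        out = 0.0
        if f0 > 0.0:
            out += f0 * math.log2(f0 / s)
        if f1 > 0.0:
            out += f1 * math.log2(f1 / s)
        return out

    h = (hi - lo) / steps
    acc = 0.5 * (g(lo) + g(hi))
    for i in range(1, steps):
        acc += g(lo + i * h)
    return 1.0 + 0.5 * h * acc
```

For d = 0 the classes are identical and the MI is 0; true_mi_gaussians(2.0) and true_mi_gaussians(5.0) reproduce the true-MI values of 0.486 and 0.975 bits quoted with Figure 4-3.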


Figure 4-1. MI results for normally-distributed real-valued class-conditional input data. The x-axis denotes distance between the means of the two distributions. The solid line represents the true MI, the dashed line is the lower bound on MI given by the algorithm, and the dotted line shows the lower bound given by the Fano method.

Figure 4-2. MI results for gamma-distributed real-valued class-conditional input data. The x-axis denotes the shape parameter for the second class distribution (the first distribution has a fixed shape parameter of 1). The solid line shows the true MI, the dashed line is the lower bound on MI given by the algorithm, and the dotted line depicts the lower bound given by the Fano method.


Figure 4-3. MI lower bound estimates as a function of sample size for two pairs of Gaussian classes. A) Distance between means = 2; true MI = 0.486. B) Distance between means = 5; true MI = 0.975. The x-axes (scaled logarithmically) show the number of samples used per class. Each point represents the mean lower bound for 10 randomly-generated trials. The bars show a width of two standard deviations for the 10 trials. The upper curves show the bounds given by our algorithm, and the lower curves show the bounds given by the Fano method.

A comparison of the lower bounds on MI and the actual MI using n = 200,000 samples is shown in Figure 4-2.

The performance of any MI-estimation method is dependent on the sample size. Convergence rates of both our method and the Fano method are shown in Figure 4-3, with sample sizes ranging from 2,000 to 1,200,000.


4.2 Baseline Results

Using our experimental framework, a first round of trials measured the MI between various stimulus classes and the auditory nerve response. As mentioned previously, these results serve both to highlight the performance differences between various feature spaces and classifiers and to provide a baseline for later tests of feedforward networks. For each experiment, 14,000 samples were used to train the classifier, 6,000 samples were held out to configure the SVM's C parameter, and 80,000 new samples were used to determine the bound. The baseline results for all auditory categories are listed in Tables 4-1 to 4-5. For purposes of comparison, our MI bound is listed alongside the Fano bound and overall classifier accuracy. For nearly every category, a linear SVM was able to decode a high amount of stimulus information (MI > 0.9 bits) using at least one of the feature spaces. Also, in every case where a linear SVM did not show a high competence, a quadratic SVM run on the same category achieved a lower bound of at least 0.9 bits for some feature space.

Stated differently, for a 50 ms stimulus interval and binary categories, the maximum bit rate through the periphery model is 20 bps. In practice, for 80,000 samples and a confidence of 0.99, as used throughout our experiments, the lower bound does not exceed 0.933 bits, or 18.66 bps, even for perfect classification. A lower bound on MI of 0.9 bits ensures that the system is achieving at least 18 bps. This result indicates that nearly all stimulus information is conveyed by the response.

Examining the results across category families accentuates the differences between feature spaces. For the high/low pitch discrimination tasks, all feature spaces produce a high MI bound, which is somewhat expected due to the tonotopic organization of the auditory nerve. For the instrument recognition tasks, the ISI and spike times feature spaces give significantly higher bounds than the spike count feature spaces using the linear SVM, which suggests that they provide a more linearly-discriminable


representation of these concepts.[1] The frequency sweeps tasks, by contrast, yielded much higher bounds with the spike count spaces than with ISI or spike times.

The effects of adding Gaussian white noise to several representative categories were mixed. Linear discriminability of both high/low and recognition tasks was very tolerant of the addition of noise at either SNR level. The frequency sweep task's linear discriminability was virtually unaltered by a low level of noise. However, when the noise level was increased to an SNR of 5, the MI bounds decreased significantly for all feature spaces. Once again, use of a quadratic SVM produced a maximal lower bound for all noise levels, indicating that nearly all stimulus information was still discernible.

On an additional note, the two timing-based feature spaces, ISI and spike times, often reported similar MI bounds. However, their results were occasionally disparate. For the frequency sweeps task, for instance, the spike times space vastly outperformed the ISI feature space, particularly when a quadratic SVM was used. On the other hand, the ISI feature space demonstrated superior performance for the instrument recognition tasks, especially when the bounds were low.

4.3 Network Results

4.3.1 Architecture Descriptions

To evaluate our framework, we conducted a series of experiments to assess and compare several spiking network architectures. Two natural connectivity schemes were chosen, keeping in mind that tonotopy should be preserved throughout the auditory pathway. Both network types consist of layers of 400 neurons with feedforward neural connections from layer to layer. For each type, both a one-layer and a two-layer version

[1] In general, poor performance by a linear classifier for a specific category does not indicate that stimulus information is absent from the response, as evidenced by the baseline results. However, the extent to which a linear classifier can discriminate various categories is an independently-interesting property. Consequently, trends involving the linear discriminability of categories will be discussed throughout our interpretation of the results.


were tested. The two-layer versions of the networks are depicted in Figure 4-4. Both architectures utilize all-to-all connections between entire center frequency groups. For the first architecture, designated fwdA2A, each group is connected to exactly one group in the subsequent layer. With 10 neurons per group and 40 groups, this yields a total of 4,000 synaptic connections. The second architecture, called fwdA2A-1nbrA2A, connects each group to both its analogous group as well as its two nearest neighboring groups. For the first and last center frequency groups, only one neighbor is used. An intuition behind this second type of network is that it allows the signal to spread itself over the neural population as it propagates, while still respecting the tonotopy condition. The fwdA2A-1nbrA2A architecture contains a total of 11,800 connections between each two layers.

Figure 4-4. Two-layer versions of the networks tested in the text. Each square denotes an entire center frequency group, and the lines connecting squares represent all-to-all connections between two groups. Therefore, for a group size of 10 neurons, each line represents 100 neural connections. A) The fwdA2A network. Each group connects only to its analogous group in subsequent layers. B) The fwdA2A-1nbrA2A network. Each group connects to its analogous group and the analogous groups of its two closest neighbors.

Each test network was created and then fixed throughout all trials. All synaptic weights were randomly generated between 1 and 40, and all synaptic distances were randomly generated between 1 and 2. Since our model has limited amplitude range,
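The two connectivity schemes and the random parameter draws can be sketched as follows. This is an illustrative reconstruction, not the original code: the function name and parameters are ours, and uniform draws are assumed for the weight and distance ranges stated above.

```python
import random

def build_layer_connections(n_groups=40, group_size=10, neighbors=0, seed=0):
    """Generate one layer-to-layer connection list.

    neighbors=0 sketches the fwdA2A scheme (each group projects only to
    its analogous group); neighbors=1 sketches fwdA2A-1nbrA2A (analogous
    group plus the two nearest neighbors, only one at the boundaries)."""
    rng = random.Random(seed)
    conns = []
    for g in range(n_groups):
        targets = [t for t in range(g - neighbors, g + neighbors + 1)
                   if 0 <= t < n_groups]
        for t in targets:
            # all-to-all between the source and target groups of neurons
            for i in range(group_size):
                for j in range(group_size):
                    weight = rng.uniform(1, 40)   # weight in [1, 40] (uniform assumed)
                    distance = rng.uniform(1, 2)  # distance in [1, 2] (uniform assumed)
                    conns.append((g * group_size + i,
                                  t * group_size + j,
                                  weight, distance))
    return conns

print(len(build_layer_connections(neighbors=0)),
      len(build_layer_connections(neighbors=1)))  # 4000 11800
```

The connection counts reproduce the totals stated in the text: 40 group pairs for fwdA2A and 118 group pairs for fwdA2A-1nbrA2A, each expanded all-to-all over 10-neuron groups.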


different types of stimuli can possess wide variations in the spike rates of the output layer. To avoid pathological cases where excessively low or high output spike rates obfuscate information transmission, a normalization procedure was introduced on the total output spike rate for each task. By altering the spiking threshold value for all neurons in the system, the output spike rate can be adjusted without affecting the synaptic connections between tasks. For each task, a target spike rate of 45 Hz is sought over a subset of the task's stimuli. An acceptable threshold level is determined by performing a binary search to move the total output spike rate to within 0.5 Hz of the target. Once this tolerance is achieved, all architecture parameters are frozen and remain constant throughout all network tests. Lastly, the fwdA2A-1nbrA2A networks carry an additional caveat: the two boundary groups have a third fewer synaptic connections than their central counterparts. Therefore, these two groups use a separate threshold that is normalized independently of the central one.

Since all categories tested on the baseline showed high lower bounds on MI, any of them may be used to evaluate networks. It is important to note that we do not need to use the same feature space or classifier when comparing the baseline response with the network output, since the signal's representation may be transformed during propagation through the network. A network can be assessed by relating the best lower bound on the baseline to the best lower bound on the network output, as discussed in Section 2.2.2. Furthermore, any comparisons between feature spaces and classifiers must be made in the context of a particular architecture.

4.3.2 One-Layer Network Results

Results for the two one-layer networks are presented in Tables 4-6 to 4-9. Again, the maximum bit rates through the system are 20 bps for the tasks with 50 ms stimuli, and 10 bps for the tasks using 100 ms stimuli. In practice however, with a confidence of 0.99 and 80,000 samples, perfect classification gives a lower bound estimate of around 0.933 bits, which equates to 18.66 bps for the 50 ms tasks and 9.33 bps for the 100 ms tasks.
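The threshold normalization described in Section 4.3.1 can be sketched as a standard binary search. This is a sketch under assumptions: `measure_rate` is a hypothetical callback standing in for a full network simulation, and the output rate is assumed to decrease monotonically as the threshold rises.

```python
def tune_threshold(measure_rate, target_hz=45.0, tol_hz=0.5,
                   lo=0.0, hi=100.0, max_iter=60):
    """Binary-search a global spiking threshold until the measured output
    rate is within tol_hz of target_hz (45 +/- 0.5 Hz in the text)."""
    for _ in range(max_iter):
        mid = (lo + hi) / 2.0
        rate = measure_rate(mid)
        if abs(rate - target_hz) <= tol_hz:
            return mid
        if rate > target_hz:
            lo = mid  # too many output spikes: raise the threshold
        else:
            hi = mid  # too few output spikes: lower the threshold
    return (lo + hi) / 2.0

# Toy monotone rate curve standing in for a network simulation.
th = tune_threshold(lambda t: 90.0 - t)
print(abs((90.0 - th) - 45.0) <= 0.5)  # True
```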


For the pitch discrimination categories, using a linear SVM, both networks demonstrated lower bounds on MI greater than 0.9 bits for all feature spaces, corresponding to a minimum bit rate of 18 bps. The fwdA2A network performed slightly better than fwdA2A-1nbrA2A. For the instrument recognition tasks, the spike times and ISI feature spaces gave a high lower bound for the linear classifier, while the spike count spaces fared poorly. The frequency sweeps categories were easily discriminable by a linear SVM using the spike count feature spaces, showing an MI of 0.924 bits and a bit rate of 9.24 bps for the 20-bin spike count. As in the baseline results, the linear SVM was not able to discriminate the sweeps categories using either timing-based feature space. For both networks, a quadratic SVM again achieved a near-maximal lower bound for every task, and for almost every feature space. This performance indicates that very little of the stimulus information for any category was lost during transmission of the signal through either network type.

While the stimulus information was preserved through both one-layer networks, the representation of this information was changed by the network. For the one-layer networks using the pitch discrimination categories, performance using a linear SVM was very similar to that of the baseline. For the instrument recognition tasks using a linear SVM, the MI lower bound was significantly higher for the one-layer networks than for the baseline. This result is particularly intriguing in light of the randomly-generated network weights. For the sweeps tasks, the linear classifier on fwdA2A produced a worse bound than the baseline, and the linear SVM on fwdA2A-1nbrA2A gave a higher bound.

Contrasting the two architectures in terms of the linear discriminability of their responses, the fwdA2A networks gave slightly higher lower bounds for the high/low and instrument recognition categories. However, the fwdA2A-1nbrA2A network showed higher bounds for the sweeps task. When the spike count feature spaces were used for the sweeps categories, fwdA2A-1nbrA2A with a linear classifier reported MI lower bounds almost identical to those obtained with a quadratic SVM.


4.3.3 Two-Layer Network Results

Two-layer versions of the fwdA2A and fwdA2A-1nbrA2A networks were also tested. Complete results are displayed in Tables 4-10 to 4-15. All pitch discrimination categories were again easily classified by a linear SVM for both network types, giving lower bounds of at least 0.92 bits, or minimum bit rates of 18.4 bps, regardless of the feature space used. The fwdA2A network did noticeably better with a linear SVM on the instrument recognition tasks than the fwdA2A-1nbrA2A network. For both networks, every instrument recognition task for which linear classification did not achieve a lower bound of 0.9 bits displayed near-maximal performance with a quadratic SVM. Performance on the sweeps tasks was noticeably different for the two networks. While the fwdA2A-1nbrA2A showed extremely high lower bounds with a linear SVM, the lower bounds given by the fwdA2A network with a linear SVM were less than 0.9 bits for all feature spaces. Once again, classification with the quadratic SVM achieved nearly maximal lower bounds for both.


Table 4-1. Baseline results for high/low pitch discrimination tasks

Instrument   Feature Space         SVM Type   Accuracy   Fano MI   MI
Pure tones   Spike Count 2 bins    Linear     99.72%     0.915     0.924
             Spike Count 10 bins   Linear     99.67%     0.912     0.922
             Spike Count 20 bins   Linear     99.66%     0.911     0.922
             Spike Times           Linear     99.75%     0.917     0.926
             ISI                   Linear     99.76%     0.918     0.926
HarmToneA    Spike Count 2 bins    Linear     99.74%     0.916     0.926
             Spike Count 10 bins   Linear     99.70%     0.914     0.924
             Spike Count 20 bins   Linear     99.68%     0.912     0.924
             Spike Times           Linear     99.75%     0.917     0.926
             ISI                   Linear     99.77%     0.918     0.926
HarmToneB    Spike Count 2 bins    Linear     99.69%     0.913     0.924
             Spike Count 10 bins   Linear     99.70%     0.914     0.924
             Spike Count 20 bins   Linear     99.65%     0.911     0.923
             Spike Times           Linear     99.71%     0.914     0.924
             ISI                   Linear     99.72%     0.915     0.925
HarmToneC    Spike Count 2 bins    Linear     99.70%     0.914     0.925
             Spike Count 10 bins   Linear     99.69%     0.913     0.924
             Spike Count 20 bins   Linear     99.66%     0.911     0.923
             Spike Times           Linear     99.69%     0.913     0.925
             ISI                   Linear     99.71%     0.914     0.925
HarmToneD    Spike Count 2 bins    Linear     99.59%     0.907     0.922
             Spike Count 10 bins   Linear     99.54%     0.904     0.921
             Spike Count 20 bins   Linear     99.55%     0.904     0.921
             Spike Times           Linear     99.60%     0.907     0.922
             ISI                   Linear     99.61%     0.908     0.923

Response windows use an offset of 3 ms and a window size of 50 ms. The Meddis input gain level is set to 30 dB for pure tones and 10 dB for the harmonic tones.


Table 4-2. Baseline results for instrument recognition tasks

Mixture   Feature Space         SVM Type    Accuracy   Fano MI   MI
A vs B    Spike Count 2 bins    Linear      94.54%     0.672     0.762
          Spike Count 10 bins   Linear      94.13%     0.654     0.727
          Spike Count 20 bins   Linear      93.75%     0.641     0.712
          Spike Times           Linear      99.08%     0.874     0.898
          ISI                   Linear      99.10%     0.875     0.898
          Spike Count 2 bins    Quadratic   99.97%     0.930     0.931
          Spike Count 10 bins   Quadratic   99.74%     0.915     0.925
          Spike Count 20 bins   Quadratic   98.76%     0.859     0.891
          Spike Times           Quadratic   99.99%     0.931     0.932
          ISI                   Quadratic   99.99%     0.931     0.932
A vs C    Spike Count 2 bins    Linear      83.71%     0.342     0.414
          Spike Count 10 bins   Linear      86.94%     0.419     0.496
          Spike Count 20 bins   Linear      86.54%     0.408     0.483
          Spike Times           Linear      99.54%     0.902     0.915
          ISI                   Linear      99.60%     0.907     0.918
A vs D    Spike Count 2 bins    Linear      96.22%     0.746     0.807
          Spike Count 10 bins   Linear      96.14%     0.740     0.769
          Spike Count 20 bins   Linear      95.95%     0.729     0.754
          Spike Times           Linear      99.28%     0.887     0.906
          ISI                   Linear      99.44%     0.897     0.911
B vs C    Spike Count 2 bins    Linear      90.70%     0.549     0.605
          Spike Count 10 bins   Linear      90.80%     0.553     0.600
          Spike Count 20 bins   Linear      90.52%     0.542     0.587
          Spike Times           Linear      99.61%     0.908     0.918
          ISI                   Linear      99.61%     0.908     0.919

Response windows use an offset of 3 ms and a window size of 50 ms. The Meddis input gain level is set to 20 dB.


Table 4-3. Baseline results for instrument recognition tasks (continued)

Mixture   Feature Space         SVM Type    Accuracy   Fano MI   MI
B vs D    Spike Count 2 bins    Linear      84.05%     0.350     0.428
          Spike Count 10 bins   Linear      85.14%     0.375     0.459
          Spike Count 20 bins   Linear      84.60%     0.363     0.439
          Spike Times           Linear      91.70%     0.560     0.639
          ISI                   Linear      92.34%     0.582     0.660
          Spike Count 2 bins    Quadratic   98.87%     0.862     0.885
          Spike Count 10 bins   Quadratic   96.91%     0.763     0.824
          Spike Count 20 bins   Quadratic   94.96%     0.680     0.753
          Spike Times           Quadratic   98.63%     0.848     0.870
          ISI                   Quadratic   99.16%     0.879     0.900
C vs D    Spike Count 2 bins    Linear      94.64%     0.678     0.796
          Spike Count 10 bins   Linear      94.72%     0.681     0.757
          Spike Count 20 bins   Linear      94.38%     0.665     0.742
          Spike Times           Linear      98.21%     0.826     0.868
          ISI                   Linear      98.42%     0.837     0.874
          Spike Count 2 bins    Quadratic   99.86%     0.922     0.923
          Spike Count 10 bins   Quadratic   99.72%     0.913     0.919
          Spike Count 20 bins   Quadratic   99.24%     0.884     0.903
          Spike Times           Quadratic   99.95%     0.922     0.925
          ISI                   Quadratic   99.96%     0.929     0.931

Response windows use an offset of 3 ms and a window size of 50 ms. The Meddis input gain level is set to 20 dB.

Table 4-4. Baseline results for up/down frequency sweeps tasks

Sweep Range   Feature Space         SVM Type    Accuracy   Fano MI   MI
1000 Hz       Spike Count 4 bins    Linear      99.82%     0.921     0.927
              Spike Count 20 bins   Linear      99.95%     0.930     0.932
              Spike Count 50 bins   Linear      99.94%     0.929     0.931
              Spike Times           Linear      77.57%     0.218     0.287
              ISI                   Linear      72.80%     0.144     0.199
              Spike Times           Quadratic   99.82%     0.922     0.927
              ISI                   Quadratic   90.26%     0.514     0.613

Response windows use an offset of 3 ms and a window size of 100 ms. The Meddis input gain level is set to 10 dB.


Table 4-5. Baseline results for representative tasks with added Gaussian white noise

Category    SNR   Feature Space         SVM Type    Accuracy   Fano MI   MI
High/Low    20    Spike Count 2 bins    Linear      99.75%     0.917     0.926
HarmToneA   20    Spike Count 10 bins   Linear      99.70%     0.914     0.925
            20    Spike Count 20 bins   Linear      99.71%     0.914     0.924
            20    Spike Times           Linear      99.74%     0.916     0.926
            20    ISI                   Linear      99.75%     0.917     0.926
            5     Spike Count 2 bins    Linear      99.70%     0.914     0.925
            5     Spike Count 10 bins   Linear      99.65%     0.911     0.923
            5     Spike Count 20 bins   Linear      99.64%     0.910     0.922
            5     Spike Times           Linear      99.73%     0.916     0.926
            5     ISI                   Linear      99.75%     0.917     0.926
Mixture     20    Spike Count 2 bins    Linear      94.92%     0.690     0.779
A vs B      20    Spike Count 10 bins   Linear      94.39%     0.669     0.744
            20    Spike Count 20 bins   Linear      93.87%     0.645     0.719
            20    Spike Times           Linear      99.07%     0.874     0.897
            20    ISI                   Linear      99.12%     0.877     0.899
            5     Spike Count 2 bins    Linear      95.17%     0.700     0.773
            5     Spike Count 10 bins   Linear      94.60%     0.677     0.743
            5     Spike Count 20 bins   Linear      94.06%     0.658     0.723
            5     Spike Times           Linear      99.39%     0.893     0.908
            5     ISI                   Linear      99.52%     0.900     0.911
1000 Hz     20    Spike Count 4 bins    Linear      99.68%     0.912     0.922
Sweeps      20    Spike Count 20 bins   Linear      99.85%     0.924     0.929
            20    Spike Count 50 bins   Linear      99.83%     0.922     0.927
            20    Spike Times           Linear      76.34%     0.197     0.262
            20    ISI                   Linear      70.55%     0.116     0.166
            20    Spike Count 4 bins    Quadratic   100%       0.933     0.933
            20    Spike Count 20 bins   Quadratic   100%       0.933     0.933
            20    Spike Count 50 bins   Quadratic   100%       0.933     0.933
            20    Spike Times           Quadratic   99.81%     0.921     0.926
            20    ISI                   Quadratic   88.33%     0.458     0.560
            5     Spike Count 4 bins    Linear      94.38%     0.657     0.740
            5     Spike Count 20 bins   Linear      92.64%     0.593     0.688
            5     Spike Count 50 bins   Linear      92.02%     0.572     0.668
            5     Spike Times           Linear      69.67%     0.105     0.143
            5     ISI                   Linear      61.49%     0.033     0.049
            5     Spike Count 4 bins    Quadratic   100%       0.933     0.933
            5     Spike Count 20 bins   Quadratic   99.99%     0.933     0.933
            5     Spike Count 50 bins   Quadratic   99.77%     0.918     0.926
            5     Spike Times           Quadratic   98.18%     0.826     0.868
            5     ISI                   Quadratic   75.50%     0.184     0.255

The Meddis input gain level and response window for each task are the same as for the corresponding noiseless version.


Table 4-6. Single-layer fwdA2A network results for pitch discrimination and instrument recognition tasks

Task        Feature Space         SVM Type   Accuracy   Fano MI   MI
High/Low    Spike Count 2 bins    Linear     99.77%     0.918     0.927
HarmToneA   Spike Count 10 bins   Linear     99.80%     0.920     0.927
            Spike Count 20 bins   Linear     99.70%     0.913     0.924
            Spike Times           Linear     99.80%     0.920     0.927
            ISI                   Linear     99.78%     0.919     0.927
High/Low    Spike Count 2 bins    Linear     99.62%     0.909     0.921
HarmToneB   Spike Count 10 bins   Linear     99.61%     0.908     0.921
            Spike Count 20 bins   Linear     99.53%     0.903     0.920
            Spike Times           Linear     99.64%     0.910     0.923
            ISI                   Linear     99.74%     0.916     0.925
High/Low    Spike Count 2 bins    Linear     99.75%     0.917     0.926
HarmToneC   Spike Count 10 bins   Linear     99.73%     0.916     0.926
            Spike Count 20 bins   Linear     99.65%     0.910     0.923
            Spike Times           Linear     99.73%     0.916     0.926
            ISI                   Linear     99.73%     0.915     0.925
Mixture     Spike Count 2 bins    Linear     99.01%     0.870     0.894
A vs B      Spike Count 10 bins   Linear     98.89%     0.859     0.886
            Spike Count 20 bins   Linear     98.15%     0.823     0.861
            Spike Times           Linear     99.59%     0.906     0.918
            ISI                   Linear     99.66%     0.911     0.920
Mixture     Spike Count 2 bins    Linear     98.79%     0.857     0.885
A vs C      Spike Count 10 bins   Linear     98.83%     0.860     0.888
            Spike Count 20 bins   Linear     97.82%     0.807     0.849
            Spike Times           Linear     99.62%     0.908     0.920
            ISI                   Linear     99.67%     0.912     0.921
Mixture     Spike Count 2 bins    Linear     99.12%     0.878     0.900
B vs C      Spike Count 10 bins   Linear     99.20%     0.882     0.902
            Spike Count 20 bins   Linear     98.70%     0.852     0.882
            Spike Times           Linear     99.86%     0.924     0.929
            ISI                   Linear     99.89%     0.924     0.928

Response windows use an offset of 16 ms and a window size of 50 ms. The gain level for each task is the same as for the corresponding baseline version.


Table 4-7. Single-layer fwdA2A network results for frequency sweeps tasks

Sweep Range   Feature Space         SVM Type    Accuracy   Fano MI   MI
1000 Hz       Spike Count 4 bins    Linear      99.07%     0.875     0.900
Sweeps        Spike Count 20 bins   Linear      99.73%     0.916     0.924
              Spike Count 50 bins   Linear      99.07%     0.875     0.900
              Spike Times           Linear      75.76%     0.188     0.259
              ISI                   Linear      70.96%     0.121     0.176
              Spike Count 4 bins    Quadratic   100.0%     0.933     0.933
              Spike Count 20 bins   Quadratic   100.0%     0.933     0.933
              Spike Count 50 bins   Quadratic   100.0%     0.933     0.933
              Spike Times           Quadratic   99.71%     0.914     0.922
              ISI                   Quadratic   93.66%     0.629     0.717

Response windows use an offset of 16 ms and a window size of 100 ms. The gain level is the same as for the baseline sweeps.


Table 4-8. Single-layer fwdA2A-1nbrA2A network results for pitch discrimination and instrument recognition tasks

Task        Feature Space         SVM Type   Accuracy   Fano MI   MI
High/Low    Spike Count 2 bins    Linear     99.65%     0.910     0.923
HarmToneA   Spike Count 10 bins   Linear     99.67%     0.911     0.923
            Spike Count 20 bins   Linear     99.53%     0.903     0.918
            Spike Times           Linear     99.68%     0.913     0.924
            ISI                   Linear     99.66%     0.911     0.923
High/Low    Spike Count 2 bins    Linear     99.50%     0.901     0.918
HarmToneB   Spike Count 10 bins   Linear     99.53%     0.902     0.919
            Spike Count 20 bins   Linear     99.36%     0.892     0.913
            Spike Times           Linear     99.56%     0.905     0.919
            ISI                   Linear     99.57%     0.905     0.920
High/Low    Spike Count 2 bins    Linear     99.70%     0.914     0.926
HarmToneC   Spike Count 10 bins   Linear     99.69%     0.913     0.926
            Spike Count 20 bins   Linear     99.61%     0.908     0.922
            Spike Times           Linear     99.74%     0.916     0.925
            ISI                   Linear     99.72%     0.915     0.925
Mixture     Spike Count 2 bins    Linear     99.32%     0.888     0.905
A vs B      Spike Count 10 bins   Linear     99.12%     0.876     0.897
            Spike Count 20 bins   Linear     98.33%     0.833     0.868
            Spike Times           Linear     99.57%     0.905     0.917
            ISI                   Linear     99.65%     0.910     0.920
Mixture     Spike Count 2 bins    Linear     98.37%     0.835     0.873
A vs C      Spike Count 10 bins   Linear     98.56%     0.846     0.881
            Spike Count 20 bins   Linear     97.18%     0.779     0.827
            Spike Times           Linear     99.37%     0.893     0.910
            ISI                   Linear     99.36%     0.892     0.910
Mixture     Spike Count 2 bins    Linear     99.42%     0.896     0.911
B vs C      Spike Count 10 bins   Linear     99.45%     0.898     0.913
            Spike Count 20 bins   Linear     99.23%     0.884     0.904
            Spike Times           Linear     99.82%     0.921     0.927
            ISI                   Linear     99.83%     0.922     0.927

Response windows use an offset of 16 ms and a window size of 50 ms. The gain level for each task is the same as for the corresponding baseline version.


Table 4-9. Single-layer fwdA2A-1nbrA2A network results for frequency sweeps tasks

Task      Feature Space         SVM Type    Accuracy   Fano MI   MI
1000 Hz   Spike Count 4 bins    Linear      99.97%     0.932     0.933
Sweeps    Spike Count 20 bins   Linear      100.0%     0.933     0.933
          Spike Count 50 bins   Linear      99.95%     0.931     0.932
          Spike Times           Linear      88.32%     0.457     0.530
          ISI                   Linear      77.46%     0.216     0.290
          Spike Count 4 bins    Quadratic   100.0%     0.933     0.933
          Spike Count 20 bins   Quadratic   100.0%     0.933     0.933
          Spike Count 50 bins   Quadratic   100.0%     0.933     0.933
          Spike Times           Quadratic   99.99%     0.933     0.933
          ISI                   Quadratic   98.62%     0.848     0.880

Response windows use an offset of 16 ms and a window size of 100 ms. The gain level is the same as for the baseline sweeps.

Table 4-10. Two-layer fwdA2A network results for pitch discrimination tasks

Task        Feature Space         SVM Type   Accuracy   Fano MI   MI
High/Low    Spike Count 2 bins    Linear     99.76%     0.916     0.925
HarmToneA   Spike Count 10 bins   Linear     99.70%     0.912     0.924
            Spike Count 20 bins   Linear     99.68%     0.910     0.923
            Spike Times           Linear     99.74%     0.916     0.926
            ISI                   Linear     99.74%     0.914     0.924
High/Low    Spike Count 2 bins    Linear     99.69%     0.911     0.922
HarmToneB   Spike Count 10 bins   Linear     99.63%     0.908     0.921
            Spike Count 20 bins   Linear     99.58%     0.904     0.920
            Spike Times           Linear     99.72%     0.913     0.922
            ISI                   Linear     99.69%     0.911     0.922
High/Low    Spike Count 2 bins    Linear     99.75%     0.915     0.924
HarmToneC   Spike Count 10 bins   Linear     99.68%     0.911     0.924
            Spike Count 20 bins   Linear     99.69%     0.911     0.922
            Spike Times           Linear     99.73%     0.914     0.924
            ISI                   Linear     99.74%     0.914     0.923

The response window size is 50 ms. Response windows for spike count feature spaces use an offset of 23 ms and windows for the timing-based spaces use an offset of 34 ms. The gain level is the same as for the corresponding baseline tasks.


Table 4-11. Two-layer fwdA2A network results for instrument recognition tasks

Task      Feature Space         SVM Type    Accuracy   Fano MI   MI
Mixture   Spike Count 2 bins    Linear      99.23%     0.882     0.903
A vs B    Spike Count 10 bins   Linear      98.89%     0.863     0.888
          Spike Count 20 bins   Linear      97.91%     0.811     0.852
          Spike Times           Linear      99.22%     0.882     0.902
          ISI                   Linear      98.74%     0.955     0.883
Mixture   Spike Count 2 bins    Linear      97.72%     0.801     0.844
A vs C    Spike Count 10 bins   Linear      97.79%     0.806     0.847
          Spike Count 20 bins   Linear      96.64%     0.751     0.806
          Spike Times           Linear      98.08%     0.819     0.857
          ISI                   Linear      97.79%     0.805     0.847
          Spike Count 2 bins    Quadratic   99.96%     0.929     0.930
          Spike Count 10 bins   Quadratic   99.83%     0.920     0.926
          Spike Count 20 bins   Quadratic   99.37%     0.892     0.911
          Spike Times           Quadratic   99.97%     0.930     0.931
          ISI                   Quadratic   99.97%     0.930     0.931
Mixture   Spike Count 2 bins    Linear      98.67%     0.850     0.882
B vs C    Spike Count 10 bins   Linear      98.52%     0.842     0.874
          Spike Count 20 bins   Linear      97.75%     0.802     0.845
          Spike Times           Linear      99.23%     0.883     0.903
          ISI                   Linear      99.20%     0.879     0.898

The response window size is 50 ms. Response windows for spike count feature spaces use an offset of 23 ms and windows for the timing-based spaces use an offset of 34 ms. The gain level is the same as for the corresponding baseline tasks.

Table 4-12. Two-layer fwdA2A network results for frequency sweeps tasks

Task      Feature Space         SVM Type    Accuracy   Fano MI   MI
1000 Hz   Spike Count 4 bins    Linear      97.78%     0.804     0.850
Sweeps    Spike Count 20 bins   Linear      97.94%     0.812     0.856
          Spike Count 50 bins   Linear      94.95%     0.678     0.757
          Spike Times           Linear      71.67%     0.129     0.180
          ISI                   Linear      71.10%     0.116     0.157
          Spike Count 4 bins    Quadratic   100.0%     0.933     0.933
          Spike Count 20 bins   Quadratic   100.0%     0.933     0.933
          Spike Count 50 bins   Quadratic   99.92%     0.926     0.929
          Spike Times           Quadratic   97.44%     0.787     0.841
          ISI                   Quadratic   91.00%     0.537     0.633

The response window size is 100 ms. Response windows for spike count feature spaces use an offset of 23 ms and windows for the timing-based spaces use an offset of 34 ms. The gain level is the same as for the corresponding baseline tasks.


Table 4-13. Two-layer fwdA2A-1nbrA2A network results for pitch discrimination tasks

Task        Feature Space         SVM Type   Accuracy   Fano MI   MI
High/Low    Spike Count 2 bins    Linear     99.61%     0.906     0.919
HarmToneA   Spike Count 10 bins   Linear     99.63%     0.907     0.920
            Spike Count 20 bins   Linear     99.53%     0.901     0.916
            Spike Times           Linear     99.56%     0.903     0.918
            ISI                   Linear     99.55%     0.902     0.916
High/Low    Spike Count 2 bins    Linear     99.41%     0.894     0.913
HarmToneB   Spike Count 10 bins   Linear     99.31%     0.888     0.910
            Spike Count 20 bins   Linear     99.10%     0.876     0.904
            Spike Times           Linear     99.32%     0.888     0.910
            ISI                   Linear     99.28%     0.886     0.909
High/Low    Spike Count 2 bins    Linear     99.63%     0.907     0.920
HarmToneC   Spike Count 10 bins   Linear     99.65%     0.909     0.921
            Spike Count 20 bins   Linear     99.57%     0.905     0.920
            Spike Times           Linear     99.62%     0.907     0.919
            ISI                   Linear     99.59%     0.905     0.919

The response window size is 50 ms. Response windows for spike count feature spaces use an offset of 23 ms and windows for the timing-based spaces use an offset of 34 ms. The gain level is the same as for the corresponding baseline tasks.


Table 4-14. Two-layer fwdA2A-1nbrA2A network results for instrument recognition tasks

Task      Feature Space         SVM Type    Accuracy   Fano MI   MI
Mixture   Spike Count 2 bins    Linear      98.70%     0.853     0.877
A vs B    Spike Count 10 bins   Linear      98.19%     0.825     0.859
          Spike Count 20 bins   Linear      97.11%     0.773     0.819
          Spike Times           Linear      98.62%     0.848     0.876
          ISI                   Linear      98.59%     0.847     0.875
          Spike Count 2 bins    Quadratic   99.94%     0.928     0.930
          Spike Count 10 bins   Quadratic   99.84%     0.921     0.927
          Spike Count 20 bins   Quadratic   99.29%     0.888     0.910
          Spike Times           Quadratic   99.98%     0.930     0.931
          ISI                   Quadratic   99.97%     0.930     0.931
Mixture   Spike Count 2 bins    Linear      96.76%     0.756     0.808
A vs C    Spike Count 10 bins   Linear      95.90%     0.719     0.775
          Spike Count 20 bins   Linear      93.51%     0.624     0.687
          Spike Times           Linear      97.41%     0.787     0.832
          ISI                   Linear      97.21%     0.777     0.823
          Spike Count 2 bins    Quadratic   99.95%     0.928     0.930
          Spike Count 10 bins   Quadratic   99.87%     0.923     0.927
          Spike Count 20 bins   Quadratic   99.54%     0.902     0.914
          Spike Times           Quadratic   99.92%     0.926     0.929
          ISI                   Quadratic   99.93%     0.927     0.929
Mixture   Spike Count 2 bins    Linear      98.93%     0.865     0.885
B vs C    Spike Count 10 bins   Linear      98.84%     0.860     0.882
          Spike Count 20 bins   Linear      98.16%     0.824     0.858
          Spike Times           Linear      98.89%     0.863     0.885
          ISI                   Linear      98.89%     0.863     0.884
          Spike Count 2 bins    Quadratic   99.96%     0.929     0.930
          Spike Count 10 bins   Quadratic   99.83%     0.920     0.926
          Spike Count 20 bins   Quadratic   99.33%     0.889     0.909
          Spike Times           Quadratic   99.97%     0.930     0.931
          ISI                   Quadratic   99.96%     0.929     0.931

The response window size is 50 ms. Response windows for spike count feature spaces use an offset of 23 ms and windows for the timing-based spaces use an offset of 34 ms. The gain level is the same as for the corresponding baseline tasks.


Table 4-15. Two-layer fwdA2A-1nbrA2A network results for frequency sweeps tasks

Task      Feature Space         SVM Type    Accuracy   Fano MI   MI
1000 Hz   Spike Count 4 bins    Linear      99.96%     0.929     0.930
Sweeps    Spike Count 20 bins   Linear      99.97%     0.930     0.931
          Spike Count 50 bins   Linear      99.92%     0.926     0.929
          Spike Times           Linear      87.81%     0.443     0.519
          ISI                   Linear      82.47%     0.312     0.400
          Spike Times           Quadratic   99.97%     0.929     0.930
          ISI                   Quadratic   98.97%     0.868     0.894

The response window size is 100 ms. Response windows for spike count feature spaces use an offset of 23 ms and windows for the timing-based spaces use an offset of 34 ms. The gain level is the same as for the corresponding baseline tasks.


CHAPTER 5
DISCUSSION

5.1 Evaluating Network Performance and Our Framework

The results of our studies of networks of spiking neurons revealed several interesting insights. For one, performance was highly dependent on both the task and the feature space chosen for classification. Therefore, any study that seeks to make thorough comparisons between networks should consider the inclusion of several feature spaces. Although the spike count feature space is used almost exclusively in other literature, a network's capabilities can be expressed more completely by incorporating additional spaces such as the ones described here.

As for the specific networks studied, fwdA2A and fwdA2A-1nbrA2A, both were surprisingly effective at transmitting stimulus information considering that all of their parameters, including synaptic weights, were generated randomly. The one-layer networks showed consistently high lower bounds on MI for all categories, almost always demonstrating a bound of at least 0.92 bits for the linear SVM. For the two-layer networks, performance with a linear classifier was again nearly maximal for the pitch discrimination categories and frequency sweeps categories, with the exception of the two-layer fwdA2A network for the frequency sweep task. In addition, linear classification for both two-layer networks for the instrument recognition tasks often gave bounds larger than 0.9 bits. Furthermore, in all cases where a linear SVM could not achieve 0.9 bits for any feature space, a quadratic SVM run on the task demonstrated a near-maximal lower bound.

In some cases, such as the instrument recognition tasks with the one-layer fwdA2A-1nbrA2A network, the lower bound using the network response was actually greater than the one for the corresponding feature space, task, and classifier on the baseline. It is important to keep in mind that comparisons cannot be made directly between a network's output and the baseline given a specific classifier and feature


space. However, increased performance by linear classifiers using basic feature spaces does suggest that a simple random-weight network may be capable of some amount of signal processing. Transmission through spiking neural networks likely has the effect of cleaning the signal of certain kinds of noise, making linear and polynomial discrimination easier downstream.

We must emphasize that the synaptic weights of each network architecture were fixed throughout testing of all auditory categories. The remarkable performance of random-weight networks at transmitting stimulus information about such a diversity of categories invites further study, perhaps with more complex natural auditory tasks.

5.2 Significance of Our Tighter Lower Bound on MI

Our probabilistic lower bound was shown to be superior to the lower bound obtained by using classifier accuracy alone. Comparisons between the two are most easily made in the context of the results on known distributions of Section 4.1. For both the Gaussian and gamma data, our lower bound showed the most marked improvement on the Fano method when MI was near 0.5 bits. Using 200,000 samples, our bound was higher by over 0.1 bits in the best case.

For the Gaussian model data, both our bound and the Fano bound showed a logarithmic convergence to their maxima with respect to the class sample sizes, as seen in Figure 4-3. However, the bound given by our method becomes noticeably better than that of the Fano method as sample sizes increase.

In addition, applying our novel lower bound to the classification of spike train responses gave estimates that were always at least as high as those given by the naive bound. As in the model distributions, the difference between the lower bounds seems to be most extreme when the MI is near 0.5 bits.

5.3 Concluding Remarks

This new framework provides an objective viewpoint for studying networks of spiking neurons. The methodology provides a valuable tool for building realistic feedforward
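The key ingredient behind the tighter bound is the Dvoretzky-Kiefer-Wolfowitz inequality: P(sup_x |F_n(x) - F(x)| > eps) <= 2 exp(-2 n eps^2), which gives a distribution-free confidence band around an empirical CDF. A minimal sketch of the band half-width follows (the function name is ours; this illustrates only the DKW ingredient, not the full MI bound computation).

```python
import math

def dkw_epsilon(n, alpha):
    """Half-width of the DKW confidence band: with probability at least
    1 - alpha, the empirical CDF from n samples lies within epsilon of
    the true CDF everywhere on the real line."""
    return math.sqrt(math.log(2.0 / alpha) / (2.0 * n))

print(round(dkw_epsilon(200000, 0.01), 5))  # 0.00364
```

At n = 200,000 samples and alpha = 0.01 the band is only about +/-0.0036, and it tightens as O(1/sqrt(n)), consistent with the slow convergence visible in Figure 4-3.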


neural models. In particular, continued use of this framework along with more complex feedforward networks and more natural auditory categories could yield an accurate simulation of the human auditory pathway. Such a model has immense potential to increase understanding of information flow in the auditory system. In addition to the benefit to neuroscience, this aim could also aid the development of new speech recognition techniques.

Furthermore, the probabilistic lower bound derived in our study has immediate applications to any machine learning work that relies on a classifier with an intermediate real-valued representation of data. As discussed earlier, this includes all discriminatory classifiers and clustering methods, two of the most common classification techniques. The lower bound can be generated quickly for large datasets, uses extra information to generate a superior estimate compared to existing techniques relying on accuracy alone, and has a theoretically-sound basis for analyzing empirical data. Therefore, the bound represents a new high-quality tool for the machine learning researcher.


APPENDIX: METHOD SPECIFICS

Implementing the Meddis Inner-Hair Cell model requires the specification of a large number of model parameters. Whenever possible, our choices follow the default model for the human auditory periphery found on the DSAM website (Univ. of Essex Dept. of Psychology and Univ. of Cambridge Physiology Dept. 2009). However, some adjustments had to be made for our purposes.

First, the Meddis model requires that the gain level of input sounds be set manually. A biological auditory nerve has a large dynamic range, which it achieves in part by containing fibers with varying sensitivities to the amplitudes of inbound sounds. In addition, the auditory system maintains a degree of homeostasis through a feedback mechanism: a descending neural pathway that runs in parallel to the ascending auditory pathway, terminating at outer-hair cells on the basilar membrane (Squire et al. 2003; Altschuler et al. 1991; Moore 2004). To focus our study on the frequency aspects of sounds instead of their raw volume levels, our model is restricted to a single fiber type, high spontaneous rate fibers, and dynamic gain adjustments are omitted. All stimuli are generated with the same maximum amplitude, and the Meddis model's input gain parameter is adjusted grossly so that the output spike trains are well-behaved.

The experimental gain adjustment procedure is as follows. For each task, the gain parameter is coarsely tuned to maximize a linear classifier's performance on a single pitch discrimination task using the spike times feature space and a default response window, to a precision of 10 dB. This optimal gain level is held constant throughout all other optimizations and testing procedures. Experimentally, the 0 dB SPL level for the Meddis model corresponds to a gain of between 60 dB and 70 dB, so that a reported gain level of 10 dB, for example, equates to roughly 50 dB SPL.

Furthermore, since the default Meddis model contained only 20 center frequencies, we determined a new list of 40 center frequencies and bandwidths following the same distribution. A complete exposition is given in Table A-1.


Table A-1. Meddis model parameters for 40 center frequency groups

Group   Center Frequency (Hz)   Bandwidth (Hz)
0       100.00                  32.81
1       121.81                  38.22
2       145.30                  43.81
3       170.62                  49.62
4       197.91                  55.65
5       227.31                  61.95
6       258.99                  68.53
7       293.13                  75.43
8       329.92                  82.65
9       369.56                  90.24
10      412.28                  98.22
11      458.32                  106.60
12      507.92                  115.43
13      561.38                  124.72
14      618.98                  134.52
15      681.05                  144.84
16      747.94                  155.74
17      820.02                  167.23
18      897.69                  179.36
19      981.38                  192.18
20      1071.57                 205.71
21      1168.76                 220.01
22      1273.49                 235.12
23      1386.34                 251.09
24      1507.95                 267.97
25      1638.99                 285.82
26      1780.21                 304.71
27      1932.38                 324.68
28      2096.35                 345.80
29      2273.05                 368.16
30      2463.46                 391.81
31      2668.64                 416.84
32      2889.74                 443.32
33      3127.99                 471.36
34      3384.73                 501.03
35      3661.39                 532.45
36      3959.52                 565.71
37      4280.78                 600.92
38      4626.96                 638.19
39      5000.00                 677.67


REFERENCES

Altschuler, R. A., Bobbin, R. P., Clopton, B. M., & Hoffman, D. W. (1991). Neurobiology of hearing: The central auditory system. New York: Raven Press.

Banerjee, A. (2001). On the phase-space dynamics of systems of spiking neurons. I: Model and experiments. Neural Computation, 13, 161-93.

Bar-Yosef, O., Rotman, Y., & Nelken, I. (2002). Responses of neurons in cat primary auditory cortex to bird chirps: Effects of temporal and spectral context. J. Neurosci., 22, 8619-32.

Bear, M. F., Conners, B. W., & Paradiso, M. A. (2007). Neuroscience: Exploring the brain. Baltimore: Lippincott Williams & Wilkins, 3rd ed.

Berry, M. J., Warland, D. K., & Meister, M. (1997). The structure and precision of retinal spike trains. PNAS, 94, 5411-16.

Boyd, S. P., & Vandenberghe, L. (2004). Convex optimization. Cambridge: Cambridge Univ. Press.

Burges, C. J. C. (1998). A tutorial on support vector machines for pattern recognition. Data Min. Knowl. Disc., 2, 121-67.

Cover, T. M., & Thomas, J. A. (2006). Elements of information theory. New Jersey: Wiley, 2nd ed.

Dayan, P., & Abbott, L. F. (2001). Theoretical neuroscience: Computational and mathematical modeling of neural systems. Cambridge: MIT Press.

Diesmann, M., Gewaltig, M. O., & Aertsen, A. (1999). Stable propagation of synchronous spiking in cortical neural networks. Nature, 402, 529-33.

Duda, R. O., Hart, P. E., & Stork, D. G. (2001). Pattern recognition. New York: John Wiley & Sons.

Dvoretzky, A., Kiefer, J., & Wolfowitz, J. (1956). Asymptotic minimax character of the sample distribution function and of the classical multinomial estimator. Ann. Math. Stat., 27, 642-69.

Freedman, D. J., Riesenhuber, M., Poggio, T., & Miller, E. K. (2001). Categorical representation of visual stimuli in the primate prefrontal cortex. Science, 291, 312-6.

Freedman, D. J., Riesenhuber, M., Poggio, T., & Miller, E. K. (2003). A comparison of primate prefrontal and inferior temporal cortices during visual categorization. J. Neuro., 23, 5235-46.

Fuster, J. M., & Jervey, J. P. (1981). Inferotemporal neurons distinguish and retain behaviorally relevant features of visual stimuli. Science, 212, 952-5.


Georgopoulos, A. P., Kalaska, J. F., Caminiti, R., & Massey, J. T. (1982). On the relations between the direction of two-dimensional arm movements and cell discharge in primate motor cortex. J. Neurosci. 2(11), 1527–37.

Gerstner, W., & Kistler, W. (2002). Spiking neuron models: Single neurons, populations, plasticity. Cambridge: Cambridge Univ. Press.

Gollisch, T., & Meister, M. (2008). Rapid neural coding in the retina with relative spike latencies. Science 319, 1108–11.

Guenther, F. H., Nieto-Castanon, A., Ghosh, S. S., & Tourville, J. A. (2004). Representation of sound categories in auditory cortical maps. J. Speech Lang. Hear. R. 47, 46–57.

Hoeffding, W. (1963). Probability inequalities for sums of bounded random variables. J. Am. Stat. Assoc. 58, 13–30.

Joachims, T. (1999). Making large-scale SVM learning practical. In B. Schölkopf, C. Burges, & A. Smola (Eds.), Advances in kernel methods - support vector learning. Cambridge: MIT Press.

Joachims, T. (2002). Learning to classify text using support vector machines. Norwell: Kluwer Press.

Joachims, T. (2009). SVMlight support vector machine. Retrieved November 14, 2009 from http://svmlight.joachims.org/

Kennel, M. B., Shlens, J., Abarbanel, H. D. I., & Chichilnisky, E. J. (2005). Estimating entropy rates with Bayesian confidence intervals. Neural Computation 17, 1531–76.

Kim, H. K., Kim, S. S., & Kim, S. J. (2006). Improvement of spike train decoder under spike detection and classification errors using support vector machine. Med. Biol. Eng. Comput. 44, 124–130.

Kistler, W., Gerstner, W., & van Hemmen, J. L. (1997). Reduction of the Hodgkin-Huxley equations to a threshold model. Neural Computation 9, 1069–100.

Latham, P., & Nirenberg, S. (2005). Synergy, redundancy, and independence in population codes, revisited. J. Neurosci. 25, 5195–5206.

Laubach, M. (2004). Wavelet-based processing of neuronal spike trains prior to discriminant analysis. J. Neurosci. Meth. 134, 159–68.

Learned-Miller, E., & DeStefano, J. (2008). A probabilistic upper bound on differential entropy. IEEE T. Inform. Theory 54, 5223–30.

Lee, J. H., Russ, B. E., Orr, L. E., & Cohen, Y. (2009). Prefrontal activity predicts monkeys' decisions during an auditory category task. Front. Integr. Neurosci. 3(16).


Lopez-Poveda, E. A., O'Mard, L. P., & Meddis, R. (2001). A human nonlinear cochlear filterbank. J. Acoust. Soc. Am. 110, 3107–18.

MacGregor, R. J., & Lewis, E. R. (1977). Neural Modeling. New York: Plenum Press.

Mainen, Z. F., & Sejnowski, T. J. (1995). Reliability of spike timing in neocortical neurons. Science 268, 1503–6.

Massart, P. (1990). The tight constant in the Dvoretzky-Kiefer-Wolfowitz inequality. Ann. Probab. 18, 1269–83.

Mazurek, M. E., & Shadlen, M. N. (2002). Limits to the temporal fidelity of cortical spike rate signals. Nat. Neurosci. 5, 463–71.

Meddis, R. (1986). Simulation of mechanical to neural transduction in the auditory receptor. J. Acoust. Soc. Am. 79, 702–11.

Meddis, R. (1988). Simulation of auditory-neural transduction: Further studies. J. Acoust. Soc. Am. 83, 1056–63.

Mehring, C., Hehl, U., Kubo, M., Diesmann, M., & Aertsen, A. (2003). Activity dynamics and propagation of synchronous spiking in locally connected random networks. Biol. Cybern. 88, 395.

Mesgarani, N., David, S. V., Fritz, J. B., & Shamma, S. A. (2008). Phoneme representation and classification in primary auditory cortex. J. Acoust. Soc. Am. 123, 899–909.

Miller, E. K., Freedman, D. J., & Wallis, J. D. (2003). The prefrontal cortex: Categories, concepts and cognition. Philos. T. R. Soc. Lond. 357, 1123–36.

Miller, G. (1955). Note on the bias of information estimates. In H. Quastler (Ed.), Information theory in psychology II-B, (pp. 95–100). Glencoe, IL: Free Press.

Miller, L. M., Escabi, M. A., Read, H. L., & Schreiner, C. E. (2002). Spectrotemporal receptive fields in the lemniscal auditory thalamus and cortex. J. Neurophysiol. 87, 516–27.

Montemurro, M. A., Senatore, R., & Panzeri, S. (2007). Tight data-robust bounds to mutual information combining shuffling and model selection techniques. Neural Computation 19, 2913–57.

Moore, B. C. J. (2004). An introduction to the psychology of hearing. London: Academic Press, 5th ed.

Nicolelis, M. A. L., Ghazanfar, A. A., Stambaugh, C. R., Oliveira, L. M. O., Laubach, M., Chapin, J. K., Nelson, R. J., & Kaas, J. H. (1998). Simultaneous encoding of tactile information by three primate cortical areas. Nat. Neurosci. 1, 621–30.


Nirenberg, S., Jacobs, A., Fridman, G., Latham, P., Douglas, R., Alam, N., & Prusky, G. (2006). Ruling out and ruling in neural codes. J. Vision 6, 889.

Nirenberg, S., & Victor, J. D. (2007). Analyzing the activity of large populations of neurons: How tractable is the problem? Curr. Opin. Neurobiol. 17, 397–400.

Ohl, F. W., Scheich, H., & Freeman, W. J. (2001). Change in pattern of ongoing cortical activity with auditory category learning. Nature 412, 733–6.

Paninski, L. (2003). Estimation of entropy and mutual information. Neural Computation 15, 1191–1253.

Pillow, J. W., Shlens, J., Paninski, L., Sher, A., Litke, A. M., Chichilnisky, E. J., & Simoncelli, E. P. (2008). Spatio-temporal correlations and visual signalling in a complete neuronal population. Nature 454, 995–9.

Pola, G., Petersen, R. S., Thiele, A., Young, M. P., & Panzeri, S. (2005). Data-robust tight lower bounds to the information carried by spike times of a neuronal population. Neural Computation 17, 1962–2005.

Pouget, A., Dayan, P., & Zemel, R. (2000). Information processing with population codes. Nat. Rev. Neurosci. 1, 125–32.

Reich, D. S., Mechler, F., Purpura, K. P., & Victor, J. D. (2000). Interspike intervals, receptive fields, and information encoding in primary visual cortex. J. Neurosci. 20, 1964–74.

Richmond, B. J., Optican, L. M., Podell, M., & Spitzer, H. (1987). Temporal encoding of two-dimensional patterns by single units in primate inferior temporal cortex. J. Neurophysiol. 57, 132–46.

Rieke, F., Warland, D., de Ruyter van Steveninck, R., & Bialek, W. (1997). Spikes: Exploring the neural code. Cambridge: MIT Press.

Schneidman, E., Berry, M. J. II, Segev, R., & Bialek, W. (2006). Weak pairwise correlations imply strongly correlated network states in a neural population. Nature 440, 1007–12.

Serruya, M. D., Hatsopoulos, N. G., Paninski, L., Fellows, M. R., & Donoghue, J. P. (2002). Instant neural control of a movement signal. Nature 416, 141–2.

Shadlen, M. N., & Newsome, W. T. (1998). The variable discharge of cortical neurons: Implications for connectivity, computation, and information coding. J. Neurosci. 18, 3870–96.

Shannon, C. E. (1948). A mathematical theory of communication. Bell System Technical Journal 27, 379–423, 623–656.


Shlens, J., Field, G. D., Gauthier, J. L., Grivich, M. I., Petrusca, D., Sher, A., Litke, A. M., & Chichilnisky, E. J. (2006). The structure of multi-neuron firing patterns in primate retina. J. Neurosci. 26, 8254–66.

Shlens, J., Kennel, M. B., Abarbanel, H. D. I., & Chichilnisky, E. J. (2007). Estimating information rates with confidence intervals in neural spike trains. Neural Computation 19, 1683–1719.

Squire, L. R., Bloom, F. E., McConnell, S. K., Roberts, J. L., Spitzer, N. C., & Zigmond, M. J. (2003). Fundamental neuroscience. London: Academic Press, 2nd ed.

Strong, S., Koberle, R., de Ruyter van Steveninck, R., & Bialek, W. (1998). Entropy and information in neural spike trains. Phys. Rev. Lett. 80, 197–200.

Sumner, C. J., Lopez-Poveda, E. A., O'Mard, L. P., & Meddis, R. (2002). A revised model of the inner-hair cell and auditory nerve complex. J. Acoust. Soc. Am. 111, 2178–88.

Theunissen, F. E., Sen, K., & Doupe, A. J. (2000). Spectral-temporal receptive fields of nonlinear auditory neurons obtained using natural sounds. J. Neurosci. 20, 2315–31.

Univ. of Essex Dept. of Psychology and Univ. of Cambridge Physiology Dept. (2009). Development system for auditory modelling. Retrieved November 14, 2009 from http://www.pdn.cam.ac.uk/groups/dsam/index.html

Vapnik, V. N. (1999). The Nature of Statistical Learning Theory. Berlin: Springer, 2nd ed.

Warland, D. K., Reinagel, P., & Meister, M. (1997). Decoding visual information from a population of retinal ganglion cells. J. Neurophysiol., (pp. 2336–50).

Wessberg, J., Stambaugh, C. R., Kralik, J. D., Beck, P. D., Laubach, M., Chapin, J. K., Kim, J., Biggs, S. J., Srinivasan, M. A., & Nicolelis, M. A. (2000). Real-time prediction of hand trajectory by ensembles of cortical neurons in primates. Nature 408, 361–5.

Wolpert, D., & Wolf, D. (1995). Estimating functions of probability distributions from a finite set of samples. Phys. Rev. E 52, 6841–54.

Wu, W., Gao, Y., Bienenstock, E., Donoghue, J. P., & Black, M. J. (2005). Bayesian population decoding of motor cortical activity using a Kalman filter. Neural Computation 18, 80–118.


BIOGRAPHICAL SKETCH

Nathan David VanderKraats was raised in the small town of Hillsboro, MO near the St. Louis metropolitan area. He graduated as the valedictorian of his high school class in 1998.

From 1998 to 2003 Nathan carried out his undergraduate studies at the University of Illinois at Urbana-Champaign (UIUC), obtaining a bachelor's degree in computer science. He explored several professional interests, including work as a network technician and special projects member for the UIUC housing department, a 7-month stint in Richardson, TX as a co-op for the HP-UX kernel code development group, and undergraduate thesis work in natural language processing. In addition, Nathan was a 5-year member of the Illinois Varsity Men's Glee Club, a highlight of which included a European tour with performances at St. Peter's Basilica in Vatican City, St. Francis of Assisi Cathedral, and Notre Dame de Paris in France. He graduated from UIUC with highest honors as a James Scholar and Campus Honors Scholar.

In 2003, Nathan relocated to Gainesville, FL to pursue a doctoral degree at the University of Florida with the Department of Computer and Information Science and Engineering. His primary areas of study were computational neuroscience and machine learning, under the advisement of Dr. Arunava Banerjee. Nathan enjoys teaching, and has served as a teaching assistant for several undergraduate courses and as an instructor for a summer discrete mathematics course. Outside of the classroom, he was a two-year member of the University of Florida Rugby Football Club. He attained a master's degree in May 2009 and graduated with his PhD in December 2009.

In his spare time, Nathan is an avid sports fan. His favorite organizations include the St. Louis Cardinals baseball, Illinois Fighting Illini basketball, and Florida Gators football teams.