Kalman Filtering in Reproducing Kernel Hilbert Spaces


Material Information

Title:
Kalman Filtering in Reproducing Kernel Hilbert Spaces
Physical Description:
1 online resource (154 p.)
Language:
english
Creator:
Zhu, Pingping
Publisher:
University of Florida
Place of Publication:
Gainesville, Fla.
Publication Date:
2013

Thesis/Dissertation Information

Degree:
Doctorate (Ph.D.)
Degree Grantor:
University of Florida
Degree Disciplines:
Electrical and Computer Engineering
Committee Chair:
Principe, Jose C
Committee Members:
Wu, Dapeng
Shea, John Mark
Rao, Murali

Subjects

Subjects / Keywords:
Kalman filter -- RKHS
Electrical and Computer Engineering -- Dissertations, Academic -- UF
Genre:
Electrical and Computer Engineering thesis, Ph.D.
bibliography   ( marcgt )
theses   ( marcgt )
government publication (state, provincial, territorial, dependent)   ( marcgt )
born-digital   ( sobekcm )
Electronic Thesis or Dissertation

Notes

Abstract:
There are numerous dynamical system applications that require estimation or prediction from noisy data, including vehicle tracking, channel tracking, economic forecasting, and so on. Many linear algorithms have been developed to deal with these problems under different assumptions and approximations, such as the Kalman filter, the recursive least squares algorithm, and the least mean squares algorithm. However, these linear algorithms cannot solve the nonlinear problems that often occur in real life. To address these nonlinear problems, some nonlinear algorithms have recently been proposed, such as kernelized versions of the linear algorithms. Our research follows this line and seeks to develop novel algorithms using kernel methods to deal with nonlinear problems. Specifically, our goal is to derive the Kalman filter in the reproducing kernel Hilbert space (RKHS), which is a space of functions. In this dissertation, two algorithms are presented to address different nonlinear problems: a novel extended kernel recursive least squares algorithm and a kernel Kalman filter with conditional embeddings in the RKHS. The former is a novel version of the extended kernel recursive least squares algorithm with a more flexible state model, while the latter is a Kalman algorithm constructed in the RKHS based on the conditional embedding operator. The two algorithms are developed under different assumptions and have different applications. In this dissertation, their performance is also tested and compared with other existing algorithms.
General Note:
In the series University of Florida Digital Collections.
General Note:
Includes vita.
Bibliography:
Includes bibliographical references.
Source of Description:
Description based on online resource; title from PDF title page.
Source of Description:
This bibliographic record is available under the Creative Commons CC0 public domain dedication. The University of Florida Libraries, as creator of this bibliographic record, has waived all rights to it worldwide under copyright law, including all related and neighboring rights, to the extent allowed by law.
Statement of Responsibility:
by Pingping Zhu.
Thesis:
Thesis (Ph.D.)--University of Florida, 2013.
Local:
Adviser: Principe, Jose C.
Electronic Access:
RESTRICTED TO UF STUDENTS, STAFF, FACULTY, AND ON-CAMPUS USE UNTIL 2014-05-31

Record Information

Source Institution:
UFRGP
Rights Management:
Applicable rights reserved.
Classification:
lcc - LD1780 2013
System ID:
UFE0045386:00001




Full Text

KALMAN FILTERING IN REPRODUCING KERNEL HILBERT SPACES

By

PINGPING ZHU

A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA

2013

© 2013 Pingping Zhu

ACKNOWLEDGMENTS

I would like to sincerely thank my advisor, Dr. José C. Príncipe, for his support, encouragement and patience in guiding the research. His openness and support of my investigative ideas have helped me to become a better researcher. I would like to thank Dr. Dapeng Wu, Dr. John M. Shea, and Dr. Murali Rao for serving on my committee and for their helpful advice. Their valuable comments and constructive criticism helped improve the quality of this work greatly. Many thanks are due to former lab member Dr. Badong Chen for his advice and the many discussions that made my research a lot easier and happier. Thanks are due to all of the current lab members for their support. Finally, I would like to thank my parents for their encouragement and support.

TABLE OF CONTENTS

ACKNOWLEDGMENTS
LIST OF TABLES
LIST OF FIGURES
ABSTRACT

CHAPTER

1 INTRODUCTION
  1.1 Estimation, Prediction and Smoothing
  1.2 Background
  1.3 Requirements of Proposed Algorithms

2 RELATED WORK
  2.1 Algorithms in the Input Space
    2.1.1 Bayesian Filter
    2.1.2 Kalman Filter
    2.1.3 Nonlinear Kalman Filter
      2.1.3.1 Extended Kalman Filter
      2.1.3.2 Unscented Kalman Filter
      2.1.3.3 Cubature Kalman Filter
    2.1.4 Dual Kalman Filter
    2.1.5 Adaptive Filters
      2.1.5.1 Recursive Least Squares
      2.1.5.2 Extended Recursive Least Squares
  2.2 Algorithms in RKHS
    2.2.1 Foundations of RKHS
    2.2.2 Kernel Recursive Least Squares
      2.2.2.1 KRLS-ALD
      2.2.2.2 QKRLS
    2.2.3 Extended Kernel Recursive Least Squares
    2.2.4 Revisiting the Ex-KRLS Algorithm
      2.2.4.1 Existence of the State Transition Operator
      2.2.4.2 Value of the State Transition Parameter
      2.2.4.3 Relationship with the Kalman Filter
      2.2.4.4 Conclusions about the Ex-KRLS Algorithm
    2.2.5 Kernel Kalman Filter
  2.3 Summary of Related Algorithms

3 A NOVEL EXTENDED KERNEL RECURSIVE LEAST SQUARES
  3.1 Overview
  3.2 Extended Kalman Filter-Kernel Recursive Least Squares Algorithm
  3.3 Formulation of the Novel Extended Kernel Recursive Least Squares
  3.4 Experiments and Results
    3.4.1 Vehicle Tracking
    3.4.2 Rayleigh Channel Tracking
  3.5 Discussion and Conclusion

4 LEARNING NONLINEAR GENERATIVE MODELS OF TIME SERIES WITH A KALMAN FILTER IN RKHS
  4.1 Overview
  4.2 Conditional Embeddings in RKHS
    4.2.1 Hilbert Space Embeddings
    4.2.2 Conditional Embedding Operator
    4.2.3 Dynamical System Model with a Conditional Embedding Operator
  4.3 Kernel Kalman Filter Based on the Conditional Embedding Operator
    4.3.1 Construction of Kalman Filtering in the RKHS
    4.3.2 Kalman Filter Predictor in RKHS
  4.4 Simplifying the Conditional Embedding Operator
    4.4.1 Down-Sampling Training Data
    4.4.2 Quantizing Training Data
    4.4.3 Training Data for Online Operation
  4.5 Experiments and Results
    4.5.1 Estimation of Noisy Time Series
      4.5.1.1 Discussion of Noise Parameters
      4.5.1.2 Comparison of Estimation Results
    4.5.2 Prediction of Noisy Time Series
    4.5.3 Prediction of the Sunspot Dataset
  4.6 Discussion and Conclusion

5 KERNEL KALMAN FILTER BASED ON CONDITIONAL EMBEDDINGS
  5.1 Overview
  5.2 State-Space Model in RKHS
  5.3 Derivation of the Full-blown Kernel Kalman Filter
    5.3.1 Recursive Equations with Given Training Data
    5.3.2 FKKF Algorithm with Generated Training Data
  5.4 Experiment and Result
  5.5 Discussion
  5.6 Conclusion

6 CONCLUSION AND FUTURE WORK
  6.1 Conclusion and Discussion
  6.2 Future Work
    6.2.1 Noise Model in RKHS
    6.2.2 Operator Design
    6.2.3 Application of the Kalman Filter in Kernel Adaptive Algorithms

APPENDIX

A SQUARE-ROOT CUBATURE KALMAN FILTER
B PROOFS OF THEOREM 4.1 AND THEOREM 4.2
C PROOFS OF THEOREM 4-3
D PROOFS OF THEOREM 5.1 AND THEOREM 5.2
E KERNEL RECURRENT SYSTEM TRAINED BY REAL-TIME RECURRENT LEARNING ALGORITHM
  E.1 Introduction
  E.2 Kernel Recurrent Networks
  E.3 On-line Recurrent System Learning
    E.3.1 Kernel Real-Time Recurrent Learning Algorithm
    E.3.2 Teacher Forcing
  E.4 Experiments and Results
  E.5 Conclusion

REFERENCES

BIOGRAPHICAL SKETCH

LIST OF TABLES

3-1 MSE of distance and slope k
3-2 MSE of position
3-3 Performance comparison in Rayleigh fading channel tracking with maximum Doppler frequency f_D = 100 Hz
3-4 Performance comparison in Rayleigh fading channel tracking with maximum Doppler frequency f_D = 500 Hz
3-5 Computational complexity analysis
4-1 MSE of estimation of IKEDA data for different algorithms (α = 1.6 for stable noise)
4-2 Prediction MSE with k = 50
4-3 Prediction MSE with k = 100
4-4 Time complexities
5-1 NMSE of state estimation
6-1 Algorithm requirements checklist
E-1 Using x(1) as inputs to predict x(1), x(2) and x(3)

LIST OF FIGURES

1-1 Relationship of estimation, prediction and modeling
2-1 Dual estimation problem
2-2 Relationship between algorithms
3-1 Trajectory of the vehicle with background
3-2 Trajectory of the vehicle
3-3 Trajectories of the true position and the predictions of the EKF, KRLS, and Ex-KRLS-KF algorithms
3-4 Comparison of the EKF, KRLS, and Ex-KRLS-KF algorithms
3-5 Ensemble learning curves of the LMS-2, RLS, EX-RLS, KRLS, EX-KRLS and EX-KRLS-KF in tracking a Rayleigh fading multipath channel with maximum Doppler frequency f_D = 100 Hz
3-6 Ensemble learning curves of the LMS-2, RLS, EX-RLS, KRLS, EX-KRLS and EX-KRLS-KF in tracking a Rayleigh fading multipath channel with maximum Doppler frequency f_D = 500 Hz
3-7 MSE versus maximum Doppler frequency. Error bars mark one standard deviation above and below the mean
4-1 Generated time series
4-2 Trajectory of the first 200 points of the series IKEDA
4-3 MSE of estimation results (a) with respect to q and r and (b) with respect to two further parameters
4-4 MSE of estimation results versus r/q. Error bars mark one standard deviation above and below the mean
4-5 SNR of estimation results versus input signal. Error bars mark one standard deviation above and below the mean
4-6 State trajectory of the Lorenz system for parameters β = 8/3, σ = 10 and ρ = 28
4-7 10-step prediction MSE with different k
4-8 Prediction MSE with k = 50
4-9 Prediction MSE with k = 100
4-10 Normalized annual sunspot data over the period 1700-2011

4-11 Normalized MSE of annual sunspot data prediction over the period 1980-2011
5-1 True values of the states and corresponding noisy measurements in an exemplar run
5-2 NMSE of estimation results versus number of samples m. Error bars mark one standard deviation above and below the mean
6-1 Research line of algorithms
E-1 State-space model; the feedback part is shown in red

Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

KALMAN FILTERING IN REPRODUCING KERNEL HILBERT SPACES

By

Pingping Zhu

May 2013

Chair: José C. Príncipe
Major: Electrical and Computer Engineering

There are numerous dynamical system applications that require estimation or prediction from noisy data, including vehicle tracking, channel tracking, time series denoising, prediction, estimation, and so on. Many linear algorithms have been developed to deal with these problems under different assumptions and approximations, such as the Kalman filter, the recursive least squares algorithm, and the least mean squares algorithm. However, these linear algorithms cannot solve the nonlinear problems that often occur in real life. To address these nonlinear problems, some nonlinear algorithms have recently been proposed, such as kernelized versions of the linear algorithms. Our research follows this line and seeks to develop novel algorithms using kernel methods to deal with nonlinear problems. Specifically, our goal is to derive the Kalman filter in the reproducing kernel Hilbert space (RKHS), which is a space of functions, to implement signal denoising, prediction and estimation.

In this dissertation, we first analyze and discuss in depth the extended kernel recursive least squares algorithm, and point out the limitation of this algorithm and its close relationship with the Kalman filter in RKHS. Next, we develop a novel extended kernel recursive least squares algorithm based on the nonlinear Kalman filters and kernel recursive least squares algorithms, to improve tracking and prediction performance. However, the nonlinear Kalman filter is implemented in the input space, not the RKHS. Then, we introduce the concepts of embeddings in RKHS and conditional embedding
operators, and develop an algorithm to learn nonlinear generative models of time series based on the embeddings and conditional embedding operators, in order to denoise and predict time series. We present a detailed validation of the method, including an analysis of how to speed up the calculations. This algorithm is a Kalman filter in RKHS with a trivial measurement model, in which the estimated measurement embeddings, as the hidden states in RKHS, are propagated and estimated. Because the embeddings represent distributions of random variables, the models describe the measurement dynamics. Finally, a Kalman filter in RKHS with a non-trivial measurement model is proposed, named the full-blown kernel Kalman filter (FKKF), in which the state embeddings are treated as the hidden states in RKHS. Because of the non-trivial measurement model and the state embeddings, this algorithm is expected to be able to handle more complicated dynamical system problems. We present a simple example that shows its performance.

In conclusion, this dissertation establishes the framework to advance the theory of state models in RKHS, shows their potential, and identifies the difficulties that need to be conquered to establish practical implementations of nonlinear state models.

CHAPTER 1
INTRODUCTION

This dissertation addresses the problem of modeling a dynamical system for estimation and prediction. Applications include vehicle tracking, channel tracking, and time series denoising and prediction. Generally speaking, time series filtering problems like estimation and prediction require system modeling, which is usually described in terms of mathematical equations. Here we consider the dynamical system, which contains state and measurement models defined as:

    x_{i+1} = f_i(x_i, u_i) + n_i    (1-1)
    y_i = h_i(x_i, u_i) + v_i    (1-2)

where x_i ∈ R^{n_x} is the state of the dynamic system at discrete time i, which is defined as the minimal model that is sufficient to uniquely describe the unforced dynamical behavior of the system, but is usually unobservable; u_i ∈ R^{n_u} is the input of the system; y_i ∈ R^{n_y} is the measurement, which is measurable or observable; f_i: R^{n_x} × R^{n_u} → R^{n_x} and h_i: R^{n_x} × R^{n_u} → R^{n_y} are functions used to describe the dynamics of the system; and {n_i} and {v_i} are independent process and measurement noise sequences with zero means and covariances Q_i and R_i, respectively. Here noises are added to the models to reflect the uncertainty of the model and the measurement noise.

1.1 Estimation, Prediction and Smoothing

With the above model, we can approximate the hidden state x_j using all available data {y_k} and {u_k} up to the current time i. When j ≤ i, the operation is called estimation, while when j > i, it is prediction. Here the tasks of estimation and prediction are done with the assumed known model. On the other hand, the task that identifies the underlying model of the system using all information is modeling.

It is clear that these tasks are strongly interdependent. For instance, when the accurate model of the system or the clean (noiseless) signal generated by the model
is available, one can obtain the other by modeling or estimation. Furthermore, if an accurate model and good signal estimates are available, good predictions can be generated by using the estimates as inputs to the model (see Figure 1-1).

Figure 1-1. Relationship of estimation, prediction and modeling

1.2 Background

However, in real cases, neither the accurate model nor the noise-free signal is available, hence the problems are much more challenging. In these cases, some approximate models are assumed based on prior knowledge, and noises are added to reflect the uncertainty of the assumed model. Many algorithms are developed under such assumptions, like the particle filters (PF) [18, 19] and the (nonlinear) Kalman filters (KF) [25, 29, 33, 35, 38, 39].

Although noises are introduced to these algorithms to compensate for the inaccuracy of the model, this is still not satisfactory. Therefore, dual estimation has been studied by some researchers. The task of the dual estimation problem is to estimate hidden states from noisy data and approximate the underlying system model, as in the dual Kalman filter (DKF) [34, 43, 45].

Another methodology is to simplify the system model as a linear model and ignore the state model, normally referred to as adaptive filtering, like the recursive least squares (RLS) [1] algorithm and the least mean squares (LMS) [1] algorithm. But because of the simplification, these algorithms cannot be applied to estimate or predict the hidden
states. These algorithms just focus on the system's inputs and outputs and try to achieve the best linear relationship between them.

Recently, some nonlinear algorithms have also been proposed based on these linear algorithms, all of which are supported by kernel theory, like the kernel least mean squares (KLMS) [15], the kernel recursive least squares (KRLS) [16] and the extended kernel recursive least squares (Ex-KRLS) [17] algorithms. However, most of these algorithms still just focus on the relationship between inputs and outputs, not on the states of the system. Even in the Ex-KRLS algorithm, just a simple random walk model is used to describe the state model in RKHS.

1.3 Requirements of Proposed Algorithms

Kernel methods express nonlinear functions in an inner product form in the reproducing kernel Hilbert space (RKHS); therefore, linear adaptive filter algorithms can be constructed in the RKHS to solve nonlinear adaptive filter problems. Since there is a close relationship between linear adaptive filters and the Kalman filter, we seek to develop a linear dynamical model algorithm, the Kalman filter, in the RKHS to solve nonlinear dynamical problems. Because of the properties of the kernel method, we expect that our proposed algorithms can achieve the following requirements:

1. Implement nonlinear models in the input space, using a linear state model framework in the RKHS; learn the system model from data directly and represent it in the RKHS.

2. Estimate or predict the hidden state and the output of the system model with evaluation in the input space.

3. Be robust in a non-Gaussian noise environment.

Of course, going to the RKHS also brings disadvantages, because we lose access to the state (now a function in RKHS) and the calculations become much more involved and time consuming. In fact, we still do not know if it is possible to design realistic state space models in RKHS. This dissertation is one of the first attempts to do so. We hope the Kalman filter in the RKHS could be another methodology besides the nonlinear
Kalman filters and the particle filter to solve nonlinear dynamical problems. In addition, because the novel algorithms should survive in a non-Gaussian noise environment, we expect them to outperform other existing algorithms for some non-Gaussian cases.

In order to develop algorithms achieving all of these requirements, we study some existing related algorithms and obtain some preliminary achievements. Furthermore, these algorithms will bring us more ideas to design new algorithms.

The rest of the dissertation is organized as follows. Some existing related algorithms are reviewed in Chapter 2, including the algorithms in the input space and the algorithms in the reproducing kernel Hilbert space. In particular, the extended kernel recursive least squares algorithm is analyzed and discussed deeply in Section 2.2.3. The limitation of this algorithm and its close relationship with the Kalman filter in RKHS is stated. Then a novel extended kernel recursive least squares algorithm is proposed in Chapter 3. In Chapter 4, a Kalman filter with a trivial measurement model is developed in RKHS. After that, a more complicated Kalman filter in RKHS is proposed in Chapter 5. Finally, the conclusion and future works are given in Chapter 6.

CHAPTER 2
RELATED WORK

In this chapter, we review some existing algorithms that are applied to noisy time series. These algorithms are categorized into two groups according to the space where each algorithm is developed. The review of these algorithms also gives the motivation of this dissertation.

2.1 Algorithms in the Input Space

2.1.1 Bayesian Filter

When the system model in (1-1) and (1-2) is assumed to be known, Bayesian filters can be applied to estimate or predict the states. In the Bayesian filtering framework, the posterior density of the state is maintained by the system dynamics and new measurements, which provides a complete statistical description of the state of the dynamic system. Maintaining the posterior density includes two steps: prediction and measurement update [19].

Step 1: Prediction.

    p(x_i | D_{i−1}) = ∫ p(x_i | x_{i−1}) p(x_{i−1} | D_{i−1}) dx_{i−1}    (2-1)

where D_{i−1} = {y_j}_{j=1}^{i−1} denotes the history of the measurements up to time (i−1), and p(x_{i−1} | D_{i−1}) is the posterior density at the last time (i−1). The state transition density p(x_i | x_{i−1}) is obtained from (1-1).

Step 2: Measurement update.

    p(x_i | D_i) = p(y_i | x_i) p(x_i | D_{i−1}) / p(y_i | D_{i−1})    (2-2)

The denominator is calculated as

    p(y_i | D_{i−1}) = ∫ p(y_i | x_i) p(x_i | D_{i−1}) dx_i    (2-3)

where the measurement likelihood function p(y_i | x_i) is obtained from (1-2).
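
To make the recursion in (2-1)-(2-3) concrete, the following minimal Python/NumPy sketch (ours, not from the dissertation) evaluates the prediction and measurement-update steps numerically on a one-dimensional grid; the scalar model, the noise variances, and the grid are illustrative assumptions.

    import numpy as np

    # Assumed scalar model: x_{i+1} = 0.9 x_i + n_i, y_i = x_i + v_i,
    # with Gaussian process noise variance q and measurement noise variance r.
    q, r = 0.5, 0.5
    grid = np.linspace(-10.0, 10.0, 401)      # discretized state space
    dx = grid[1] - grid[0]

    def gauss(z, var):
        return np.exp(-0.5 * z**2 / var) / np.sqrt(2.0 * np.pi * var)

    # p(x_i | x_{i-1}) tabulated on the grid: entry [a, b] is the density of
    # moving from grid[b] to grid[a] under the assumed state model
    trans = gauss(grid[:, None] - 0.9 * grid[None, :], q)

    posterior = gauss(grid, 4.0)              # prior p(x_0)
    posterior /= posterior.sum() * dx

    for y in [1.2, 0.7, 1.5]:                 # synthetic measurements
        # Step 1, Eq. (2-1): predictive density by marginalizing the transition
        predictive = (trans @ posterior) * dx
        # Step 2, Eqs. (2-2)-(2-3): multiply by the likelihood and renormalize;
        # the normalizer is exactly p(y_i | D_{i-1})
        posterior = gauss(y - grid, r) * predictive
        posterior /= posterior.sum() * dx
        print("posterior mean:", np.sum(grid * posterior) * dx)

This grid-based evaluation also makes the cost of the exact recursion visible: the work grows with the grid resolution and exponentially with the state dimension, which is one motivation for the sample-based approximations discussed next.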

The Bayesian filtering framework provides a conceptually unified recursive approach for nonlinear filtering problems to maintain the posterior distribution of the hidden state. However, it is practically inefficient, because it involves multi-dimensional integrals and the conditional probability density functions (pdf) are also needed. Furthermore, it is not easy to express arbitrary distribution functions.

The particle filter (PF) algorithm is a sequential Monte Carlo method to implement the Bayesian filter [18, 23]. The PF algorithm represents the required pdf as a set of importance samples (or particles), rather than as a function over the state space. The algorithm recursively propagates and updates these samples for the discrete time problem, and can estimate the hidden state using these samples. However, to represent the pdf, the PF algorithm has to generate a large number of particles for each iteration and resample them, which results in high computational complexity and memory requirements.

2.1.2 Kalman Filter

For the sake of simplification, many algorithms are developed under the Gaussian assumption, i.e. in the space of Gaussian functions. Under these conditions, the predictive density p(x_i | D_{i−1}) and the filter likelihood density p(y_i | x_i) are both Gaussian, which leads to a Gaussian posterior density p(x_i | D_i). The Gaussian distribution is determined by its mean and covariance. Under the Gaussian approximation, the functional recursion of the Bayesian filter reduces to an algebraic recursion operating only on the means and covariances of the various conditional densities encountered in the time and the measurement updates.

Because the Gaussian family is closed under linear transformation, when the system is linear, we have the celebrated Kalman filter (KF) proposed by R. E. Kalman in 1960 [25]. The KF provides a recursive optimal solution to the linear filtering problem, and is widely applied to stationary as well as non-stationary linear
dynamical environments to estimate the current hidden states or to predict hidden states or measurements in the future.

For the linear system, the dynamical model (1-1) and (1-2) can be rewritten as:

    x_{i+1} = F_i x_i + n_i    (2-4)
    y_i = H_i x_i + v_i    (2-5)

where F_i is the transition matrix taking the state x_i from time i to time i+1, and H_i is the measurement matrix. The process noise n_i is assumed to be zero-mean, additive, white, and Gaussian, with the covariance matrix defined by

    E[n_i n_j^T] = Q_i for i = j, and 0 for i ≠ j    (2-6)

Similarly, the measurement noise v_i is assumed to be zero-mean, additive, white, and Gaussian, with the covariance matrix defined by

    E[v_i v_j^T] = R_i for i = j, and 0 for i ≠ j    (2-7)

Suppose that a measurement on a linear dynamical system, described by (2-4) and (2-5), has been made at time i. The requirement is to use the information contained in the new measurement y_i to update the estimate of the unknown hidden state x_i. Let x̂_i^- denote the a priori estimate of the state, which is already available at time i. In the Kalman filter algorithm, the hidden state x_i is estimated as a linear combination of x̂_i^- and the new measurement y_i, in the form

    x̂_i = x̂_i^- + G_i (y_i − H_i x̂_i^-)    (2-8)

where the matrix G_i is called the Kalman gain.
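
As a concrete illustration of (2-4)-(2-8), here is a minimal Python/NumPy sketch of the Kalman predict/update cycle; the constant-velocity model and the noise covariances are assumptions chosen only for this example.

    import numpy as np

    # Assumed illustrative model: state [position, velocity], observed position
    F = np.array([[1.0, 1.0],
                  [0.0, 1.0]])     # transition matrix F_i
    H = np.array([[1.0, 0.0]])     # measurement matrix H_i
    Q = 0.01 * np.eye(2)           # process noise covariance Q_i
    R = np.array([[0.25]])         # measurement noise covariance R_i

    def kalman_step(x_hat, P, y):
        # State estimate and error covariance propagation
        x_prior = F @ x_hat
        P_prior = F @ P @ F.T + Q
        # Kalman gain: G_i = P_i^- H^T (H P_i^- H^T + R)^{-1}
        G = P_prior @ H.T @ np.linalg.inv(H @ P_prior @ H.T + R)
        # State estimate and error covariance update, Eq. (2-8)
        x_post = x_prior + G @ (y - H @ x_prior)
        P_post = (np.eye(len(x_hat)) - G @ H) @ P_prior
        return x_post, P_post

    x_hat, P = np.zeros(2), np.eye(2)
    for y in [0.9, 2.1, 2.8, 4.2]:             # synthetic position readings
        x_hat, P = kalman_step(x_hat, P, np.array([y]))
    print("final estimate [pos, vel]:", x_hat)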

The Kalman filter algorithm is summarized in Algorithm 2-1. The details can be found in [25, 26].

Algorithm 2-1: Kalman filter
  Initialization: For i = 0, set
    x̂_0 = E[x_0]
    P_0 = E[(x_0 − E[x_0])(x_0 − E[x_0])^T]
  Computation: For i = 1, 2, ..., compute:
    State estimate propagation: x̂_i^- = F_i x̂_{i−1}
    Error covariance propagation: P_i^- = F_i P_{i−1} F_i^T + Q_{i−1}
    Kalman gain matrix: G_i = P_i^- H_i^T (H_i P_i^- H_i^T + R_i)^{−1}
    State estimate update: x̂_i = x̂_i^- + G_i (y_i − H_i x̂_i^-)
    Error covariance update: P_i = (I − G_i H_i) P_i^-

Here, P_i^- and P_i are the a priori and the a posteriori covariance matrices, respectively, defined as

    P_i^- = E[(x_i − x̂_i^-)(x_i − x̂_i^-)^T]    (2-9)
    P_i = E[(x_i − x̂_i)(x_i − x̂_i)^T]    (2-10)

2.1.3 Nonlinear Kalman Filter

The Kalman filter has been applied to a wide range of fields like optimal control, statistical signal processing, and econometrics. However, there are many applications in which the assumption of linearity is not satisfied. When the system is nonlinear, specifically, when the functions f and h in (1-1) and (1-2) are nonlinear, the integral below is involved in the prediction and update of the means and covariances [38]:

    I = ∫ f(x) p(x) dx    (2-11)

where f(·) denotes a nonlinear function and p(x) is a Gaussian distribution with respect to x. In order to deal with (2-11), many algorithms have been proposed, including the extended
Kalman filter (EKF), the unscented Kalman filter (UKF) and the cubature Kalman filter (CKF). All of these algorithms are still developed under the assumption of known system and observation models. When these are not known, there is a need to learn them from data. Specifically, when the system and observation are described by parameterized models and these parameters are unknown, the dual Kalman filter (DKF) can be applied to solve the problem, which is discussed in Section 2.1.4.

2.1.3.1 Extended Kalman Filter

To develop the extended Kalman filter (EKF), the basic idea is to linearize the state-space model of (1-1) and (1-2) at each time instant around the most recent state estimate, which is taken to be either x̂_i or x̂_i^-, depending on which particular function is being considered. Once a linear model is obtained, the standard Kalman filter can be applied. Here the noise pdf is still considered Gaussian. The algorithm of the EKF is summarized in Algorithm 2-2. The details can be found in [27, 31].

Algorithm 2-2: Extended Kalman filter
  Initialization: For i = 0, set
    x̂_0 = E[x_0]
    P_0 = E[(x_0 − E[x_0])(x_0 − E[x_0])^T]
  Computation: For i = 1, 2, ..., compute:
    Definitions (linearize the state-space model):
      F_i = ∂f(x)/∂x evaluated at x = x̂_{i−1}
      H_i = ∂h(x)/∂x evaluated at x = x̂_i^-
    State estimate propagation: x̂_i^- = f_i(x̂_{i−1})
    Error covariance propagation: P_i^- = F_i P_{i−1} F_i^T + Q_{i−1}
    Kalman gain matrix: G_i = P_i^- H_i^T (H_i P_i^- H_i^T + R_i)^{−1}
    State estimate update: x̂_i = x̂_i^- + G_i (y_i − h_i(x̂_i^-))
    Error covariance update: P_i = (I − G_i H_i) P_i^-

In the EKF, the nonlinear functions are linearized at the estimated values x̂_i and x̂_i^-. However, the linearization of nonlinear functions results in an inaccurate propagation of
the pdf, and degrades the performance. In order to improve the propagation of the pdf through the nonlinear functions, the unscented Kalman filter (UKF) and the cubature Kalman filter (CKF) have been proposed, which are described in Sections 2.1.3.2 and 2.1.3.3, respectively.

2.1.3.2 Unscented Kalman Filter

The unscented Kalman filter (UKF) is a straightforward extension of the unscented transformation (UT) to the KF. The UT is a method for calculating the statistics of a random variable which undergoes a nonlinear transformation. Consider propagating a random variable x through a nonlinear function y = f(x). Assume x has a symmetric prior density π(x) with mean x̄ and covariance P_x, of which the Gaussian is a special case. To calculate the statistics of y, we form a set of (2L+1) sample points named sigma vectors, together with weights, {X_n, ω_n^(m), ω_n^(c)}_{n=0}^{2L}, according to the following:

    X_0 = x̄,
    X_n = x̄ + (√((L+λ) P_x))_n, n = 1, ..., L,
    X_n = x̄ − (√((L+λ) P_x))_{n−L}, n = L+1, ..., 2L,    (2-12)

    ω_0^(m) = λ/(L+λ),
    ω_0^(c) = λ/(L+λ) + 1 − α² + β,
    ω_n^(m) = ω_n^(c) = 1/(2(L+λ)), n = 1, ..., 2L,    (2-13)

where L is set as n_x, and λ = α²(L+κ) − L is a scaling parameter. The constant α determines the spread of the sigma points around x̄, and is usually set to a small positive value (e.g., 1×10^{−4}). The constant κ is a second scaling parameter, which is usually set to 3 − L, and the constant β is used to incorporate prior knowledge of the distribution of
x (for Gaussian distributions, β = 2 is optimal). (√((L+λ) P_x))_n denotes the n-th column of the matrix square root.

The sigma points and weights satisfy the following moment-matching conditions:

    x̄ = Σ_{n=0}^{2L} ω_n^(m) X_n,
    P_x = Σ_{n=0}^{2L} ω_n^(c) (X_n − x̄)(X_n − x̄)^T.    (2-14)

These sigma vectors are propagated through the nonlinear function

    Y_n = f(X_n), n = 0, ..., 2L,    (2-15)

and the mean and covariance of y can be approximated as

    ȳ ≈ Σ_{n=0}^{2L} ω_n^(m) Y_n,
    P_y ≈ Σ_{n=0}^{2L} ω_n^(c) (Y_n − ȳ)(Y_n − ȳ)^T.    (2-16)

Using these sigma vectors {X_n}_{n=0}^{2L} and {Y_n}_{n=0}^{2L}, we can also estimate the cross covariance, and use them to implement the Kalman filter. The algorithm of the UKF is summarized in Algorithm 2-3. The details can be found in [33, 37].

Algorithm 2-3: Unscented Kalman filter
  Initialization: For i = 0, set
    x̂_0 = E[x_0]
    P_0 = E[(x_0 − E[x_0])(x_0 − E[x_0])^T]
  Computation: For i = 1, 2, ..., setting L = n_x:
    Calculate the sigma points X_{i−1} = [x̂_{i−1}, x̂_{i−1} + γ√(P_{i−1}), x̂_{i−1} − γ√(P_{i−1})] and the weights {ω_n^(m), ω_n^(c)}_{n=0}^{2L} according to (2-13)
    Time update:
      X_{i|i−1} = f(X_{i−1}, u_{i−1})
      x̂_i^- = Σ_{n=0}^{2L} ω_n^(m) X_{n,i|i−1}
      P_i^- = Σ_{n=0}^{2L} ω_n^(c) (X_{n,i|i−1} − x̂_i^-)(X_{n,i|i−1} − x̂_i^-)^T + Q_{i−1}
      Regenerate the sigma points X_{i|i−1} = [X_{i|i−1}, X_{0,i|i−1} + γ√(P_i^-), X_{0,i|i−1} − γ√(P_i^-)], setting L → 2L, and recalculate the weights {ω_n^(m), ω_n^(c)}_{n=0}^{2L} according to (2-13)
      Y_{i|i−1} = h(X_{i|i−1})
      ŷ_i^- = Σ_n ω_n^(m) Y_{n,i|i−1}
    Measurement update:
      P_{y_i y_i} = Σ_n ω_n^(c) (Y_{n,i|i−1} − ŷ_i^-)(Y_{n,i|i−1} − ŷ_i^-)^T + R_{i−1}
      P_{x_i y_i} = Σ_n ω_n^(c) (X_{n,i|i−1} − x̂_i^-)(Y_{n,i|i−1} − ŷ_i^-)^T
      G_i = P_{x_i y_i} P_{y_i y_i}^{−1}
      x̂_i = x̂_i^- + G_i (y_i − ŷ_i^-)
      P_i = P_i^- − G_i P_{y_i y_i} G_i^T
  where γ = √(L+λ).

2.1.3.3 Cubature Kalman Filter

Similarly to the UKF, the CKF is another approximate Bayesian filter built in the Gaussian domain, but it uses a completely different set of deterministic weighted points, the cubature points {ξ_n}, which rests on applying a numerical method known as the cubature rule [40, 41]. The cubature points are located at the intersections of the sphere and its axes. When the hidden state x_i ∈ R^{n_x}, the cubature points and weights are defined as

    ξ_n = √(n_x) e_n, n = 1, 2, ..., n_x;
    ξ_n = −√(n_x) e_{n−n_x}, n = n_x+1, n_x+2, ..., 2n_x,    (2-17)
and

    ω_n = 1/(2n_x), n = 1, 2, ..., 2n_x,    (2-18)

where e_n is [0, ..., 0, 1, 0, ..., 0]^T, in which the 1 is the n-th component.

Using these cubature points, (2-11) can be calculated as

    I_N(f) = ∫_{R^{n_x}} f(x) N(x; 0, I) dx ≈ Σ_{n=1}^{2n_x} ω_n f(ξ_n)    (2-19)

With the cubature points, we can present the CKF algorithm in Algorithm 2-4. More details of the CKF algorithm are discussed in [38, 39].
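
The sigma-point construction (2-12)-(2-16) and the cubature rule (2-17)-(2-19) can both be exercised on a small example. The sketch below is our illustration, not part of the original text; it propagates a two-dimensional Gaussian through an assumed nonlinearity with both point sets and compares the estimated means of the output.

    import numpy as np

    nx = 2
    mean = np.array([1.0, 0.5])
    P = np.array([[0.6, 0.2],
                  [0.2, 0.4]])
    f = lambda x: np.array([np.sin(x[0]), x[0] * x[1]])  # assumed nonlinearity

    # Unscented transform points, Eqs. (2-12)-(2-13), with L = nx
    alpha, kappa = 0.1, 3.0 - nx
    lam = alpha**2 * (nx + kappa) - nx
    S = np.linalg.cholesky((nx + lam) * P)        # a matrix square root
    sigma = np.vstack([mean, mean + S.T, mean - S.T])   # 2L+1 sigma vectors
    wm = np.full(2 * nx + 1, 1.0 / (2.0 * (nx + lam)))
    wm[0] = lam / (nx + lam)                      # mean weights of Eq. (2-13)

    # Cubature points, Eqs. (2-17)-(2-18): sqrt(nx) * [I, -I], equal weights
    D = np.linalg.cholesky(P)
    xi = np.sqrt(nx) * np.hstack([np.eye(nx), -np.eye(nx)])
    cub = (D @ xi).T + mean                       # 2*nx cubature points
    wc = np.full(2 * nx, 1.0 / (2.0 * nx))

    y_ut = sum(w * f(p) for w, p in zip(wm, sigma))   # UT estimate of E[f(x)]
    y_ckf = sum(w * f(p) for w, p in zip(wc, cub))    # cubature estimate
    print("UT mean:", y_ut, "  cubature mean:", y_ckf)

Both rules are exact for low-degree polynomial nonlinearities; the cubature set can be viewed as a sigma-point set with the center point removed and a fixed spread, which is why the CKF avoids the tuning parameters α, β and κ.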

Algorithm 2-4: Cubature Kalman filter
  Initialization: For i = 0, set
    x̂_0 = E[x_0]
    P_0 = E[(x_0 − E[x_0])(x_0 − E[x_0])^T]
  Computation: For i = 1, 2, ...:
    Time update:
      Factorize P_{i−1} = D_{i−1} D_{i−1}^T
      Evaluate the cubature points (n = 1, 2, ..., m): X_{n,i−1} = D_{i−1} ξ_n + x̂_{i−1}, where m = 2n_x
      Evaluate the propagated cubature points (n = 1, 2, ..., m): X*_{n,i} = f(X_{n,i−1}, u_{i−1})
      Estimate the predicted state: x̂_i^- = (1/m) Σ_{n=1}^m X*_{n,i}
      Estimate the predicted error covariance: P_i^- = (1/m) Σ_{n=1}^m X*_{n,i} (X*_{n,i})^T − x̂_i^- (x̂_i^-)^T + Q_{i−1}
    Measurement update:
      Factorize P_i^- = D_i^- (D_i^-)^T
      Evaluate the cubature points (n = 1, 2, ..., m): X^-_{n,i} = D_i^- ξ_n + x̂_i^-
      Evaluate the propagated cubature points (n = 1, 2, ..., m): Y^-_{n,i} = h(X^-_{n,i}, u_i)
      Estimate the predicted measurement: ŷ_i^- = (1/m) Σ_{n=1}^m Y^-_{n,i}
      Estimate the innovation covariance matrix: P_{yy} = (1/m) Σ_{n=1}^m Y^-_{n,i} (Y^-_{n,i})^T − ŷ_i^- (ŷ_i^-)^T + R_i
      Estimate the cross-covariance matrix: P_{xy} = (1/m) Σ_{n=1}^m X^-_{n,i} (Y^-_{n,i})^T − x̂_i^- (ŷ_i^-)^T
      Estimate the Kalman gain: G_i = P_{xy} P_{yy}^{−1}
      Estimate the updated state: x̂_i = x̂_i^- + G_i (y_i − ŷ_i^-)
      Estimate the corresponding error covariance: P_i = P_i^- − G_i P_{yy} G_i^T

There are two basic properties of an error covariance matrix, i) symmetry and ii) positive definiteness, which should be preserved in each update cycle. However, in practice, because of errors introduced by arithmetic operations performed on finite word-length digital computers, these two properties are often lost. In order to avoid this problem, a square-root version of the CKF has been proposed, named the square-root cubature Kalman filter (SCKF). The algorithm of the SCKF is given in Appendix A. This is exactly the algorithm
we applied in later chapters. But for simplicity, we still name it as CKF if no confusion occurs.

2.1.4 Dual Kalman Filter

The (nonlinear) Kalman filter not only provides an efficient method for generating approximate maximum-likelihood estimates of the state of a discrete-time dynamical system, but also allows estimating the parameters of a nonlinear model given clean training data of inputs and outputs. In some cases, we need to estimate the state of a system with a parameterized model, which is not known exactly. When the noiseless state is not available, a dual estimation approach is required, in which both the states of the dynamical system and its parameters are estimated simultaneously, given only noisy observations.

To be more specific, we consider the problem of learning both the hidden states x_i and the parameters w of a discrete-time nonlinear dynamical system, modeled as follows:

    x_{i+1} = f_i(x_i, u_i, w) + n_i    (2-20)
    y_i = h_i(x_i, u_i, w) + v_i    (2-21)

where w is the parameter in the state transition and measurement functions, which is used to determine the system model.

Essentially, to estimate both hidden states and parameters, two (nonlinear) Kalman filters are run concurrently. At each time step, a (nonlinear) Kalman state filter estimates the state using the current model estimate ŵ_i, while the other (nonlinear) Kalman filter estimates the parameters using the current estimate x̂_i. The system is shown schematically in Figure 2-1 [42, 46].

2.1.5 Adaptive Filters

Reconsider the dynamical model in (1-1) and (1-2), where we fix the hidden state x_i, but assume it is still unobservable. In addition, the measurement is equal to the inner product between the hidden state x_i and the input u_i. Under such assumptions we have the
new model:

    x_{i+1} = x_i
    y_i = x_i^T u_i + v_i    (2-22)

Figure 2-1. Dual estimation problem

We can apply a linear adaptive filter to solve this kind of problem. Adaptive filters are self-designing filters, which rely on a recursive algorithm to adapt their weights (the hidden state x). Adaptive filters are commonly classified as linear or nonlinear. If its input-output map obeys the principle of superposition, the filter is said to be linear. Otherwise, the filter is nonlinear.

For linear adaptive filters, there are basically two distinct approaches for deriving recursive algorithms for their operation. The first approach is the stochastic gradient approach, for instance the least mean squares (LMS) algorithm; the second approach is based on the method of least squares, for instance the recursive least squares (RLS) algorithm. The former is simple but is sensitive to the eigenvalue spread of the input, which makes convergence slow; the latter is computationally more expensive, but follows the Wiener solution at each iteration. Because of the close relationship between RLS and KF, here we just focus on the RLS algorithm and its extension.
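
For reference, a minimal sketch of the stochastic gradient (LMS) recursion mentioned above, applied to the model (2-22); the step size and the synthetic data are our assumptions for illustration.

    import numpy as np

    rng = np.random.default_rng(0)
    w_true = np.array([0.5, -0.3, 0.8])   # unknown weights to identify
    x = np.zeros(3)                       # adaptive weight estimate
    eta = 0.05                            # step size (assumed)

    for _ in range(2000):
        u = rng.standard_normal(3)        # regressor input u_i
        y = w_true @ u + 0.05 * rng.standard_normal()  # noisy desired output
        e = y - x @ u                     # a priori error e_i
        x = x + eta * e * u               # LMS update along the gradient
    print("estimated weights:", x)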

2.1.5.1 Recursive Least Squares

With a sequence of training data {u_j, y_j}_{j=1}^i up to time i, the RLS algorithm estimates the weight x_i by minimizing the following cost:

    min_{x_i} Σ_{j=1}^i |y_j − u_j^T x_i|² + λ‖x_i‖²    (2-23)

where u_j is the n_u × 1 regressor input, y_j is the desired response (which is assumed to be a scalar here, but the extension to multi-dimensional outputs is quite straightforward), and λ is the regularization parameter. The advantage of the RLS algorithm is that it is able to calculate the weight x_i recursively from the previous estimate x_{i−1} without solving (2-23) directly. The standard RLS is summarized in Algorithm 2-5, and more details can be found in [1, 2].

Algorithm 2-5: Recursive Least Squares (RLS)
  Initialization: for i = 0, set x_0 = 0, P_0 = λ^{−1} I_m, where m = n_u and I_m is the m×m identity matrix
  Iterate: for i ≥ 1
    r_{e,i} = 1 + u_i^T P_{i−1} u_i
    G_i = P_{i−1} u_i / r_{e,i}
    e_i = y_i − u_i^T x_{i−1}
    x_i = x_{i−1} + G_i e_i
    P_i = P_{i−1} − P_{i−1} u_i u_i^T P_{i−1} / r_{e,i}

The RLS estimation may be viewed as a special case of the Kalman algorithm. A distinguishing feature of the Kalman filter is the notion of state, which provides a dynamical description of the system at a specific instant of time. Generally speaking, adaptive filters can be treated as having a constant hidden state x, the optimal weights of the filter, which needs to be learned from the training data {u_i, y_i} at the i-th iteration. But there is an exception: the extended recursive least squares algorithm (Ex-RLS), which has a state model and time-variant hidden states. Because of this, the Ex-RLS algorithm has a much closer relationship with the Kalman filter.
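
A minimal Python rendering of Algorithm 2-5 (variable names are ours), run on synthetic linear regression data, may help to fix the notation:

    import numpy as np

    def rls(us, ys, lam=1.0):
        """Recursive least squares following Algorithm 2-5."""
        m = us.shape[1]
        x = np.zeros(m)                  # weight estimate x_0 = 0
        P = np.eye(m) / lam              # P_0 = lam^{-1} I_m
        for u, y in zip(us, ys):
            r = 1.0 + u @ P @ u          # r_{e,i}
            G = P @ u / r                # gain G_i
            e = y - u @ x                # a priori error e_i
            x = x + G * e                # weight update
            P = P - np.outer(P @ u, P @ u) / r   # covariance update
        return x

    rng = np.random.default_rng(1)
    us = rng.standard_normal((500, 3))
    ys = us @ np.array([0.5, -0.3, 0.8]) + 0.05 * rng.standard_normal(500)
    print("RLS weights:", rls(us, ys, lam=0.1))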

2.1.5.2 Extended Recursive Least Squares

Although the RLS algorithm has a faster rate of convergence than the least mean squares (LMS) algorithm [1, 2], another well-known linear regression algorithm, in a non-stationary environment the LMS algorithm exhibits a better tracking behavior than the RLS algorithm.

In order to improve the tracking behavior of the RLS algorithm, the Ex-RLS was developed by Haykin [3] based on the state-space model below:

    x_{i+1} = F_i x_i + n_i
    y_i = u_i^T x_i + v_i    (2-24)

where F_i is a known transition matrix, and {y_i}_{i=0}^N and {u_i}_{i=0}^N are the N+1 desired responses and inputs, respectively. For given positive-definite matrices {Π_0, Q_n, R_n} and an initial guess x̄_0, we pose the problem of estimating the initial state vector x_0 and the signals n_0, n_1, ..., n_N in a regularized least-squares manner by solving

    min_{x_0, x_1, ..., x_N} J(x_0, n_1, ..., n_N)    (2-25)

subject to the state-equation constraint

    x_{i+1} = F_i x_i + n_i    (2-26)

Here the cost function J is quadratic in its arguments and is given by

    J = (x_0 − x̄_0)^T Π_0^{−1} (x_0 − x̄_0) + Σ_{i=0}^N n_i^T Q_i^{−1} n_i + Σ_{i=0}^N (y_i − u_i^T x_i)^T R_i^{−1} (y_i − u_i^T x_i)    (2-27)

The solution of (2-25) shown in [3] leads to an iterative procedure that provides recursive estimates of the successive weight vectors x_i, denoted by x̂_i. The algorithm of the Ex-RLS is summarized in Algorithm 2-6. More details can be found in [1, 2].

Algorithm 2-6: Extended Recursive Least Squares (Ex-RLS)
  Initialization: for i = 0, set x_0 = 0, P_0 = Π_0
  Iterate: for i ≥ 1
    R_{e,i} = R_i + u_i^T P_{i−1} u_i
    G_i = F_{i−1} P_{i−1} u_i / R_{e,i}
    e_i = y_i − u_i^T x̂_{i−1}
    x̂_i = F_{i−1} x̂_{i−1} + G_i e_i
    P_i = F_{i−1} P_{i−1} F_{i−1}^T + Q_{i−1} − G_i R_{e,i} G_i^T

Actually, this algorithm can be regarded as the Kalman filter mentioned in the previous subsection with the following (statistical) assumptions on the noise sequences: First, {n_i} and {v_i} are both assumed zero-mean white noise sequences with covariance matrices Q and R. Second, the initial state vector x_0 is assumed random with mean x̄_0 and covariance matrix Π_0. Third, the random variables {n_i, v_i, (x_0 − x̄_0)} are assumed uncorrelated. Simply put, the Ex-RLS algorithm solves a deterministic estimation problem, unlike the Kalman filter, which solves a stochastic estimation problem. As long as we treat the weight matrices Q and R in the cost function (2-27) as the covariance matrices Q and R describing the noises, we can utilize the Kalman filter to solve the Ex-RLS problem.

2.2 Algorithms in RKHS

In this section, we review some algorithms that are developed in the reproducing kernel Hilbert space (RKHS). The RKHS is a function space, in which some nonlinear problems in the input space are represented in linear form. Because of this property, some linear algorithms have been extended as nonlinear algorithms, such as the celebrated support vector machine (SVM) [5-10], kernel principal component analysis (KPCA) [11, 12], and kernel independent component analysis (KICA) [13, 14]. Here, we just focus on some kernel versions of adaptive filters, including the kernel recursive least
squares (KRLS) [16] and the extended kernel recursive least squares (Ex-KRLS) [17]. In addition, we present a kernel Kalman filter [56, 57], which derives a Kalman filter in a subspace of the RKHS.

2.2.1 Foundations of RKHS

In order to present the kernel method more clearly, we first provide some foundations of reproducing kernel Hilbert spaces (RKHS).

A reproducing kernel Hilbert space is a special Hilbert space associated with a kernel such that it reproduces (via an inner product) each function in the space [48-54]. Let H be a Hilbert space of real-valued functions on a domain X, equipped with an inner product ⟨·,·⟩_H, and let k(x, y) be a real-valued bivariate function on X×X. Any nonnegative definite bivariate function is a reproducing kernel because of the following fundamental theorem [49].

Theorem 2.1 (Moore-Aronszajn). Given any nonnegative definite function k(x, y), there exists a uniquely determined (possibly infinite dimensional) Hilbert space H consisting of functions on X such that

  (I) ∀x ∈ X, k(x, ·) ∈ H;
  (II) ∀x ∈ X, ∀f ∈ H, f(x) = ⟨f, k(x, ·)⟩_H.

Then H := H_k is a reproducing kernel Hilbert space associated with the kernel function k(x, y). Property II is called the reproducing property of k(x, y) in H_k.

According to Mercer's theorem [90], there are countably many nonnegative eigenvalues {ζ_i : i ∈ N} and corresponding orthonormal eigenfunctions {ψ_i : i ∈ N} ⊂ H_k such that

    k(x, y) := Σ_{i∈N} ζ_i ψ_i(x) ψ_i(y)^T, (x, y) ∈ X×X,    (2-28)

where the superscript T denotes the transpose. The series above converges absolutely and uniformly on X×X. We have a feature map φ: X → ℓ²(N), defined at each x ∈ X and j ∈ N as

    φ(x)(j) := √(ζ_j) ψ_j(x).    (2-29)
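
The reproducing property of Theorem 2.1 and the feature-map identity (2-29) can be checked numerically for a kernel whose feature map is finite and explicit. The sketch below is ours and uses the homogeneous second-degree polynomial kernel k(x, y) = (x^T y)², whose feature map on R² is known in closed form; the Gaussian kernel behaves the same way with an infinite-dimensional feature map.

    import numpy as np

    def k_poly2(x, y):
        # Homogeneous polynomial kernel of degree 2: k(x, y) = (x^T y)^2
        return float(x @ y) ** 2

    def phi(x):
        # Explicit feature map of k_poly2 on R^2:
        # phi(x) = [x1^2, x2^2, sqrt(2) x1 x2]
        return np.array([x[0]**2, x[1]**2, np.sqrt(2.0) * x[0] * x[1]])

    x = np.array([1.0, 2.0])
    y = np.array([-0.5, 3.0])

    # Kernel trick: the inner product of the feature vectors equals the
    # kernel evaluation, without ever forming the feature map in general.
    print(k_poly2(x, y), phi(x) @ phi(y))        # both print 30.25

    # Reproducing property (Theorem 2.1, II): for f = sum_j a_j k(x_j, .),
    # the evaluation f(x) = <f, k(x, .)> is a weighted sum of kernel values.
    xs = [np.array([0.0, 1.0]), np.array([2.0, -1.0])]
    a = [0.7, -0.2]
    print("f(x) =", sum(aj * k_poly2(xj, x) for aj, xj in zip(a, xs)))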

Therefore, Mercer's representation in (2-28) establishes, for each x, y ∈ X, that

    k(x, y) = ⟨k(x, ·), k(y, ·)⟩_{H_k} = ⟨φ(x), φ(y)⟩_{ℓ²(N)}.    (2-30)

It is easy to check that ℓ²(N) is essentially the same as the RKHS H_k by identifying φ(x) = k(x, ·) [89]. Slightly abusing the notation, we do not distinguish ℓ²(N) and H_k in this dissertation. Alternatively, k(x, ·) can be viewed as the feature map φ(x), which maps each x ∈ X into the high dimensional (or infinite dimensional) space H_k, whose dimensionality n_k is determined by the number of strictly positive eigenvalues. For example, for the Gaussian kernel, which is defined as
maximumnormkkXsubandK(Xsub)isthespacedenedby K(Xsub):= spanfk(x,):x2Xsubg.(2) Inotherwords,withanuniversalkernel,wecanapproximateanyreal-valuedtargetfunctiondenedonacompactspacearbitrarilywellasthenumberofsummandsincreaseswithoutbound.TheGaussiankernelisuniversal,thereforeiswidelyappliedinmanykernelmethodsandwillbeappliedinallalgorithmsproposedinthisdissertation. 2.2.2KernelRecursiveLeastSquares WeconsiderasequenceofinputandoutputsamplesDi=f(u1,y1),...,(ui,yi)g,arisingfromsomeunknownsource.Inthepredictionproblem,oneattemptstondthebestpredictor^yiforyigivenDi)]TJ /F8 7.97 Tf 6.58 0 Td[(1[fuig.Inthiscontext,oneisofteninterestedinon-lineapplications,wherethepredictorisupdatedfollowingthearrivalofeachnewsample.TheKRLSalgorithmassumesafunctionalform,e.g.^yi=f(ui)andminimizesthecostfunctionJ J=minf"iXj=1jyj)]TJ /F4 11.955 Tf 11.96 0 Td[(f(uj)j2+kf()k#.(2) whereisregularizationfactor. InthereproducingkernelHilbertspace(RKHS)denotedbyH,aspaceoffunctions,afunctionf()isexpressedasaninnitedimensionalvector,denotedbyx2H,andtheevaluationofthefunctionf(u)isexpressedastheinnerproductbetweenxand'(u),asbelow: f(u)=hxj'(u)i=xT'(u).(2) where'()mapsuintoH[ 48 50 ].Sothecostfunctionisrewrittenas J=minx"iXj=1yj)]TJ /F7 11.955 Tf 11.95 0 Td[(xT'(uj)2+kxk#.(2) 32

PAGE 33

TheKRLSalgorithmsolvesthecostfunction( 2 )recursivelyandestimatexasalinearcombinationoff'(uj)gij=1, x=iXj=1aj'(uj)=iai(2) where i=['(u1),...,'(ui)] and ai=[a1,...,ai]T. TheKRLSalgorithmissummarizedinAlgorithm2-7.Thedetailscanbefoundin[ 16 17 ]. Algorithm2-7:KernelRecursiveLeastSquares(KRLS) Initialize:S1=(+k(u1,u1)))]TJ /F8 7.97 Tf 6.59 0 Td[(1,a1=S1y1Iteratefori>1hi=[k(ui,u1),...,k(ui,ui)]TJ /F8 7.97 Tf 6.59 0 Td[(1)]Tzi=Si)]TJ /F8 7.97 Tf 6.59 0 Td[(1hiri=+k(ui,ui))]TJ /F7 11.955 Tf 11.95 0 Td[(z(i)ThiSi=r(i))]TJ /F8 7.97 Tf 6.59 0 Td[(1Si)]TJ /F8 7.97 Tf 6.58 0 Td[(1r(i)+ziz(i)T)]TJ /F7 11.955 Tf 9.3 0 Td[(zi)]TJ /F7 11.955 Tf 9.3 0 Td[(zTi1ei=yi)]TJ /F7 11.955 Tf 11.96 0 Td[(hTiai)]TJ /F8 7.97 Tf 6.59 0 Td[(1ai=ai)]TJ /F8 7.97 Tf 6.58 0 Td[(1)]TJ /F7 11.955 Tf 11.95 0 Td[(zir)]TJ /F8 7.97 Tf 6.59 0 Td[(1ieir)]TJ /F8 7.97 Tf 6.58 0 Td[(1iei AlthoughtheKRLScanonlineapproximatetheunderlyingnonlinearfunctionbasedonthetrainingdataset,thecomputationcomplexityO(i2)increaseswiththenumberoftrainingdata.Toreducethetimeandspacecomplexities,someonlinesparsicationalgorithmsareproposed,includingKRLSwithapproximatelineardependency(ALD),quantizedKRLS(QKRLS)andsoon. 2.2.2.1KRLS-ALD Accordingto( 2 ),theapproximatedunderlyingfunction^f()=xisthelinearcombinationofmappedinputsf'(ui)gij=1.Toreducethecomplexity,weneedtoreduce 33

PAGE 34

the number of the mapped inputs. The idea is to sequentially test whether a new feature input is approximately linearly dependent on the dictionary vectors, which are a set of already selected feature inputs. If it is not, we add it to the dictionary. We denote the dictionary by {φ(ũ_j)}_{j=1}^{m_{i−1}} and the new feature input by φ(u_i) at iteration i.

To avoid adding the training sample to the dictionary, we need to find coefficients c̃ = [c̃_1, ..., c̃_{m_{i−1}}] satisfying the approximate linear dependence (ALD) condition

    δ_i := min_{c̃} ‖ Σ_{j=1}^{m_{i−1}} c̃_j φ(ũ_j) − φ(u_i) ‖² ≤ δ_{ALD}    (2-38)

where δ_{ALD} is the ALD threshold parameter determining the level of sparsity. If the ALD condition in (2-38) holds, φ(u_i) can be approximated within a squared error δ_i by some linear combination of the current dictionary members. Expanding (2-38), we have

    δ_i = min_{c̃} { Σ_{k,j=1}^{m_{i−1}} c̃_k c̃_j k(ũ_k, ũ_j) − 2 Σ_{j=1}^{m_{i−1}} c̃_j k(ũ_j, u_i) + k(u_i, u_i) }
        = min_{c̃} { c̃^T K̃_i c̃ − 2 c̃^T Φ̃_i^T φ(u_i) + k(u_i, u_i) }    (2-39)

where Φ̃_i = [φ(ũ_1), ..., φ(ũ_{m_{i−1}})] and K̃_i = Φ̃_i^T Φ̃_i. Solving (2-39) yields the optimal c̃_i and the ALD condition

    c̃_i = K̃_i^{−1} Φ̃_i^T φ(u_i)    (2-40)
    δ_i = k(u_i, u_i) − c̃_i^T Φ̃_i^T φ(u_i)    (2-41)

respectively. If δ_i > δ_{ALD}, then we must expand the current dictionary and m_i = m_{i−1} + 1. More details and discussion about the KRLS-ALD can be found in [16].

2.2.2.2 QKRLS

Unlike the KRLS-ALD algorithm, which throws away the input when the input is approximately linearly dependent on the dictionary vectors, the QKRLS algorithm quantizes and keeps all inputs [84]. To implement online vector quantization (VQ) for the KRLS algorithm, a simple online (sequential) VQ method is applied [83], in which the
codebook is trained sequentially from the input data. The VQ algorithm is presented in Algorithm 2-8.

Algorithm 2-8: Online Vector Quantization
  Initialization: i = 1. Select the quantization size ε and initialize the codebook C_1 = {u_1}.
  Computation: for i > 1:
    1) Compute the distance between u_i and C_{i−1}: dis(u_i, C_{i−1}) = ‖u_i − C_{i−1}^{j*}‖, where j* = argmin_{1 ≤ j ≤ |C_{i−1}|} ‖u_i − C_{i−1}^j‖
    2) If dis(u_i, C_{i−1}) ≤ ε, keep the codebook unchanged, C_i = C_{i−1}, and quantize u_i to the closest code vector: Q[u_i] = C_{i−1}^{j*}
    3) Otherwise, update the codebook: C_i = {C_{i−1}, u_i}
  Here ‖·‖ denotes the Euclidean norm and C_{i−1}^j denotes the j-th element of the codebook C_{i−1}.

Here Q[·] denotes a vector quantization operator, which maps the input into N bins. In the n-th bin, there is one input c_n which represents the other inputs in the same bin. Therefore, using the online VQ method, we can quantize each new φ(u_i) to a φ(c_n). Then, at iteration i we have the codebook C_i = [c_1, ..., c_N] representing N bins. The number of inputs in the n-th bin is M_n. Thus, we have Σ_{n=1}^N M_n = i.

Actually, the QKRLS algorithm attempts to solve the following cost function recursively:

    min_{Ω∈H} [ Σ_{j=1}^i (y_j − Ω^T φ(Q[u_j]))² + ς‖Ω‖²_H ]    (2-42)

where ς is the regularization term. The QKRLS is summarized in Algorithm 2-9. Here, Φ_i = [φ(c_1), ..., φ(c_N)], K_i = Φ_i^T Φ_i, Λ = diag[M_1, ..., M_N], and e_j denotes the j-th standard basis vector.

Algorithm 2-9: Quantized Kernel Recursive Least Squares (QKRLS)
  Initialization: i = 1. Select the quantization size ε and the regularization term ς; initialize C_1 = {u_1}, Λ_1 = [1], S_1 = [k(u_1, u_1) + ς]^{−1}, and a_1 = S_1 y_1.
  Computation: for i > 1:
    1) If dis(u_i, C_{i−1}) ≤ ε (with j the index of the closest code vector):
       keep the codebook unchanged: C_i = C_{i−1};
       update the matrix Λ_i: Λ_i = Λ_{i−1} + e_j e_j^T;
       update S_i: S_i = S_{i−1} − (S_{i−1}^j K_{i−1,j}^T S_{i−1}) / (1 + K_{i−1,j}^T S_{i−1}^j);
       update a_i: a_i = a_{i−1} + S_{i−1}^j (y_i − K_{i−1,j}^T a_{i−1}) / (1 + K_{i−1,j}^T S_{i−1}^j)
    2) If dis(u_i, C_{i−1}) > ε:
       update the codebook: C_i = {C_{i−1}, u_i};
       update the matrix Λ_i: Λ_i = diag[Λ_{i−1}, 1];
       update S_i and a_i:
         h_i = [k(C_{i−1}^1, u_i), ..., k(C_{i−1}^{N_{i−1}}, u_i)]^T
         z_i = S_{i−1} h_i, z_{Λ,i} = S_{i−1} Λ_{i−1} h_i
         r_i = ς + k(u_i, u_i) − h_i^T z_{Λ,i}
         S_i = r_i^{−1} [ S_{i−1} r_i + z_{Λ,i} z_i^T , −z_{Λ,i} ; −z_i^T , 1 ]
         a_i = [ a_{i−1} − z_{Λ,i} r_i^{−1} e_i ; r_i^{−1} e_i ], where e_i = y_i − h_i^T a_{i−1}
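
The online VQ rule of Algorithm 2-8 above is simple enough to state in a few lines. The sketch below is ours, with an assumed quantization size; it builds a codebook from a stream of inputs and tracks the bin counts M_n used by QKRLS.

    import numpy as np

    def online_vq(inputs, eps=0.5):
        """Online vector quantization following Algorithm 2-8."""
        codebook = [np.asarray(inputs[0], dtype=float)]  # C_1 = {u_1}
        counts = [1]                                     # bin sizes M_n
        for u in inputs[1:]:
            u = np.asarray(u, dtype=float)
            d = [np.linalg.norm(u - c) for c in codebook]
            j = int(np.argmin(d))
            if d[j] <= eps:
                counts[j] += 1      # quantize u to the closest code vector
            else:
                codebook.append(u)  # otherwise grow the codebook with u
                counts.append(1)
        return np.array(codebook), counts

    rng = np.random.default_rng(2)
    u_stream = rng.standard_normal((200, 2))
    C, M = online_vq(u_stream, eps=1.0)
    print("codebook size:", len(C), " total count:", sum(M))  # count == 200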

2.2.3 Extended Kernel Recursive Least Squares

In the previous subsection, the function f(·), or we may say x ∈ H, is considered static. However, for a nonlinear time-variant system, the function should be time varying, like the weight vectors in the Ex-RLS algorithm. Therefore, in the spirit of the Ex-RLS algorithm, a kernelized version of it, named the extended kernel recursive least squares (Ex-KRLS) algorithm, was proposed by Liu et al. [17].

In [17] the Ex-KRLS algorithm was derived for a general nonlinear state-space model in RKHS, according to the following theorem.

Theorem 2.2 (Theorem 1 in [17]). Assume a general nonlinear state-space model

    s_{i+1} = g(s_i)
    d_i = h(u_i, s_i) + v_i    (2-43)

where s ∈ S is the original state vector, and g: S → S and h: U×S → R are general continuous nonlinear functions. There exist a transformed input vector φ(u_i) ∈ H_u, a transformed state vector x(s_i) ∈ H_u and a linear operator A ∈ H_u × H_u, such that the
following equations hold:

    x(s_{i+1}) = A x(s_i)    (2-44)
    d_i = φ(u_i)^T x(s_i) + v_i    (2-45)

and

    φ(u_i)^T φ(u_j) = k(u_i, u_j)    (2-46)

with k(·,·) a suitable kernel function.

The proof is given in [17].

Using this theorem, the Ex-KRLS algorithm was developed based on two special cases of the state transition operator. First, the state transition operator is assumed to be A = αI for the tracking model, where α > 0 is a scaling factor and I is the identity operator. Second, the state transition operator is constructed under a finite rank assumption. In [17] only the first case of the state transition operator is implemented. Therefore, we will only focus on the first assumption, which results in the following tracking model:

    x_{i+1} = α x_i + n_i
    d_i = φ(u_i)^T x_i + v_i    (2-47)

where we use x_i instead of x(s_i) for short and n_i is the state noise in the RKHS. It is noteworthy that this noise is omitted in Theorem 2.2, since the noise term makes the proof very complicated. The state-space model in the RKHS is supposed to solve the following problem:

    s_{i+1} = α̃ s_i + ñ_i
    d_i = h(u_i, s_i) + v_i    (2-48)
Algorithm2-10:ExtendedKernelRecursiveLeastSquaresfortheTrackingModel Initializei=1a1=d1 +k(u1,u1),1==(jj2+q),Q1=jj2 [+k(u1,u1)][jj2+q]Iteratei=2,3,...hi=[k(ui,u1),...,k(ui,ui)]TJ /F8 7.97 Tf 6.59 0 Td[(1)]Tzi=Qi)]TJ /F8 7.97 Tf 6.58 0 Td[(1hiri=ii)]TJ /F8 7.97 Tf 6.59 0 Td[(1+k(ui,ui))]TJ /F7 11.955 Tf 11.95 0 Td[(hTiziei=di)]TJ /F7 11.955 Tf 11.96 0 Td[(hTiai)]TJ /F8 7.97 Tf 6.59 0 Td[(1ai=ai)]TJ /F8 7.97 Tf 6.59 0 Td[(1)]TJ /F7 11.955 Tf 11.95 0 Td[(zir)]TJ /F8 7.97 Tf 6.59 0 Td[(1ieir)]TJ /F8 7.97 Tf 6.59 0 Td[(1ieii=i)]TJ /F17 5.978 Tf 5.76 0 Td[(1 jj2+iqi)]TJ /F17 5.978 Tf 5.75 0 Td[(1Qi=jj2 ri(jj2+iqi)]TJ /F17 5.978 Tf 5.76 0 Td[(1)Qi)]TJ /F8 7.97 Tf 6.58 0 Td[(1ri+zizTi)]TJ /F7 11.955 Tf 9.29 0 Td[(zi)]TJ /F7 11.955 Tf 9.29 0 Td[(zTi1 where~isascalarwhichisverycloseto1and~niisthestateprocessingnoiseinthestatespace. Then,theEx-KRLSalgorithmforthetrackingmodelisproposedbyimplementingtheEx-RLSalgorithm[ 1 ]intheRKHS,whichactuallyminimizesthefollowingleastsquarescostfunctioninRKHS: minfx1,n1,...nNg[PNi=1N)]TJ /F5 7.97 Tf 6.59 0 Td[(ijdi)]TJ /F3 11.955 Tf 11.96 0 Td[('(ui)Txij2+Nkx1k2+q)]TJ /F8 7.97 Tf 6.59 0 Td[(1PNi=1N)]TJ /F5 7.97 Tf 6.58 0 Td[(iknik2],subjecttoxi+1=xi+ni (2) whereistheexponentialweightingfactoronthepastdata,istheregularizationparametertocontroltheinitialstate-vectornormandqprovidestrade-offbetweenthemodelingvariationandmeasurementdisturbance.TheEx-KRLSissummarizedinAlgorithm2-10.Thedetailscanbefoundin[ 17 ]. 2.2.4RevisitingtheEx-KRLSAlgorithm Inthissectionwere-thinktheTheorem 2.2 andtheEx-KRLSalgorithm.Afterthatthreeimportantstatementsaregiven.Therstoneisabouttheexistenceofthestate 38
transition operator A. The second one is about the value of the parameter α. Finally, the Ex-KRLS algorithm is connected with the Kalman filter algorithm.

2.2.4.1 Existence of State Transition Operator

With the kernel trick, the measurement function h(s_i, u_i) in the model of Theorem 2.2 is expressed as an inner product in the RKHS:
  h(s_i, u_i) = ⟨h(s_i, ·), φ(u_i)⟩_{H_u} = ⟨x_i, φ(u_i)⟩_{H_u}
where the function h(s_i, ·) is denoted by x_i or x(s_i), which is a parameterized function of s_i. In addition, the dynamics of the function h(s_i, ·) are described by the operator A. In [17], x_i and x_{i+1} are constructed, and then the authors state that the operator A exists.

However, the existence of the parameterized functions cannot guarantee the existence of the operator A. First, we give a counterexample. Consider the state transition function g(s) = s + 1, the measurement function h(u, s) = s²u, and the current hidden state s_i = 1/2; we then have s_{i-1} = -1/2 and s_{i+1} = 3/2. The parameterized measurement functions are h(s_{i-1}, u) = (1/4)u, h(s_i, u) = (1/4)u, and h(s_{i+1}, u) = (9/4)u. Because h(s_{i-1}, u) and h(s_i, u) are the same function, one cannot determine the parameterized function at the next step. Therefore, there exists no operator A in this case.

Then, we formally discuss the existence condition of this operator through constructing the operator A. Observing the state equation of Theorem 2.2, one can find that the operator A only describes the dynamics of x(s_i), but does not describe the dynamics of the hidden state s_i explicitly. If we want to describe the dynamics of the hidden state s_i explicitly in the model, we have to construct the operators that specify the state and measurement
models in the RKHS:
  ψ(s_{i+1}) = G ψ(s_i)
  y_i = ⟨H ψ(s_i), φ(u_i)⟩_{H_u}
where ψ(s_i) ∈ H_s and φ(u_i) ∈ H_u, and G and H are both operators, mapping H_s → H_s and H_s → H_u, respectively. It is noteworthy that H_u and H_s are different RKHS. Comparing this pair of equations with the inner product expression of the measurement function, one can find that
  x_i = H ψ(s_i) ∈ H_u
and
  x_{i+1} = H ψ(s_{i+1}) = H G ψ(s_i).
The operators G and H are determined by the functions g and h, respectively. Actually, the existence of x_i and x_{i+1} has been proven by constructing the operators G and H in [17]. However, the existence of the operator A was not proven.

According to the two expressions above, the operator A exists if and only if the mapping H: ψ(s_i) → x_i is injective, and the operator A should be
  A = H G T.
Here, we denote the mapping from x_i back to ψ(s_i) by T. Moreover, we have
  ψ(s_i) = T x_i = T H ψ(s_i)
holding for all s_i. It means
  T H = I.
In sum, the operator A exists if and only if T exists.
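
To make the counterexample above concrete, the following Python sketch (an illustration added here, not part of the original derivation) evaluates the parameterized measurement functions numerically and shows that two distinct hidden states give the same function h(s, ·) but different successors:

import numpy as np

# Counterexample: g(s) = s + 1 and h(u, s) = s^2 * u.
# h(-1/2, .) and h(1/2, .) coincide as functions of u, yet their
# successors h(g(-1/2), .) and h(g(1/2), .) differ, so no operator A
# can map x(s_i) = h(s_i, .) to x(s_{i+1}).
u = np.linspace(-1.0, 1.0, 5)             # a few probe inputs

def h(s, u):
    return s**2 * u

s_a, s_b = -0.5, 0.5                      # two distinct hidden states
print(np.allclose(h(s_a, u), h(s_b, u)))          # True: same element of H_u
print(np.allclose(h(s_a + 1, u), h(s_b + 1, u)))  # False: different successors

Since x(s_i) alone does not determine x(s_{i+1}) here, the mapping H: ψ(s_i) → x_i is not injective and T does not exist.
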
2.2.4.2 Value of State Transition Parameter

Let us consider the value of the parameter α in the tracking model under the assumption that the state transition operator exists. In such a case, the state transition operator is assumed to be A = αI. Because A = H G T, we have
  H G T = α I.
Next, using T H = I, the following equation is obtained:
  G = α I.
Then, substituting G = αI into the state equation in the RKHS, we have
  ψ(s_{i+1}) = G ψ(s_i) = α ψ(s_i).
Furthermore, according to the property of translation-invariant kernels [86], including the Gaussian kernel, we know that the norm of ψ(s) is constant, i.e.,
  ||ψ(s_i)||_{H_s} = ||ψ(s_{i+1})||_{H_s}.
Finally, combining the last two equations, we have
  α² = 1.
Therefore, the parameter α should be 1, not a free parameter. This means that theoretically we cannot model the state transition operator in the form A = αI for general cases. This state model can only be used for the identity state model where α is set to 1. The Ex-KRLS algorithm proposed in [17] is actually a random walk KRLS algorithm, which is described by the following state-space model in the RKHS:
  x_{i+1} = x_i + n_i
  d_i = φ(u_i)^T x_i + v_i.
That is to say, the problem described by the input-space state model above cannot be solved by the Ex-KRLS algorithm, unless α̃ is equal to 1.

2.2.4.3 Relationship with the Kalman Filter

Because of the close relationship between the Ex-RLS and the Kalman filter, it is easy to note that the Ex-KRLS algorithm can be explained as a Kalman filter in the RKHS. The Ex-KRLS cost function can be rewritten as
  min_{x_1, n_1, ..., n_N} [ x_1^T Π^{-1} x_1 + Σ_{i=1}^N n_i^T Q_i^{-1} n_i + Σ_{i=1}^N (d_i - φ(u_i)^T x_i)^T R_i^{-1} (d_i - φ(u_i)^T x_i) ]
  subject to x_{i+1} = α x_i + n_i
where Π = λ^{-1} I, Q_i = q β^{-i} I and R_i = β^{-i}.

According to the arguments about the Ex-RLS and the Kalman filter in Sections 12.A and 12.B of [1], the Ex-KRLS algorithm summarized in Algorithm 2-10 is equivalent to a Kalman filter in the RKHS, with the state-space model defined by the tracking model. For this Kalman filter in the RKHS, the initial posterior state covariance is Π = λ^{-1} I, and the state process noise n_i ∈ H_u and measurement noise v_i ∈ R are both independent, zero-mean Gaussian processes with covariances Q_i = q β^{-i} I and R_i = β^{-i}.

2.2.4.4 Conclusions about the Ex-KRLS algorithm

In this section, we analyzed the Ex-KRLS algorithm proposed in [17] and pointed out the following. First, the state transition operator A in the RKHS does not exist in the general case; the proof of Theorem 1 in [17] is not always correct. The operator exists only when the mapping from the hidden state s_i to the parameterized measurement function h(s_i, ·) is injective. Second, even if the operator A exists and is assumed to be αI for the tracking model, theoretically the parameter α has to be set to 1, so it is not a free parameter. Therefore, the random walk KRLS model is not a special case of this algorithm, but the only case. Furthermore, the general state transition operator should not be modeled in the form αI in the RKHS, which is very different from the Ex-RLS algorithm. Thus,
the problem described by the general nonlinear state-space model cannot be solved by this Ex-KRLS algorithm. Finally, the Ex-KRLS algorithm is connected with the Kalman filter and explained as a special Kalman filter in the RKHS with an identity state model.

2.2.5 Kernel Kalman Filter

As mentioned in the previous subsection, it is very challenging to estimate the measurement function of two arguments. Therefore, consider the measurement function of only one argument, the hidden state x_i; no input u_i is considered in the problem. There exists a kernel Kalman filter algorithm (KKF), proposed by L. Ralaivola and F. d'Alche-Buc [56, 57], to tackle nonlinear time series processing {y_i}_{i=1}^T, which is built in a high dimensional subspace generated by the training data {y'_j}_{j=1}^N.

The idea of the KKF algorithm is as follows. First, obtain a set of orthonormal basis vectors B = [b_1, ..., b_l] from the l principal axes of the mapped training data set {φ(y'_i)}_{i=1}^N by the kernel PCA (KPCA) [12] method; next, project the mapped measurement φ(y_i) into the subspace H_B spanned by the basis B; then, construct a state-space model in the subspace H_B and implement the Kalman filter to estimate or predict the hidden state in this subspace; finally, map the hidden state back to the input space.

More specifically, the state-space model of this algorithm is expressed as follows:
  s^φ_{i+1} = F s^φ_i + μ^φ_s + v^φ_s
  y^φ_i = s^φ_i + μ^φ_y + v^φ_y
where F is a matrix or operator in H_B × H_B; s^φ_i is the hidden state and y^φ_i is the projection of φ(y_i) onto the subspace H_B; μ^φ_s and μ^φ_y are both vectors in H_B; v^φ_s and v^φ_y are assumed to be zero-mean Gaussian noise vectors with covariances σ_s² I and σ_y² I, respectively. One can find that the measurement model in the KKF algorithm is an identity function of the hidden state.
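
As a rough illustration of the projection step in the KKF, the following Python sketch (hypothetical helper names; uncentered kernel PCA with a Gaussian kernel is assumed for brevity) computes the basis coefficients and the coordinates of a mapped measurement in H_B:

import numpy as np

def gaussian_gram(A, B, sigma):
    """Gram matrix G[i, j] = k(a_i, b_j) for the Gaussian kernel."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma**2))

def kpca_basis(Y_train, sigma, ell):
    """Coefficients A so that b_k = sum_i A[i, k] phi(y'_i), <b_k, b_k> = 1."""
    K = gaussian_gram(Y_train, Y_train, sigma)
    lam, V = np.linalg.eigh(K)
    idx = np.argsort(lam)[::-1][:ell]            # top-ell principal axes
    return V[:, idx] / np.sqrt(np.maximum(lam[idx], 1e-12))

def project(y, Y_train, A, sigma):
    """Coordinates of phi(y) in H_B: <b_k, phi(y)> = (A^T k_y)_k."""
    k_y = gaussian_gram(Y_train, y[None, :], sigma).ravel()
    return A.T @ k_y

The Kalman recursions of the KKF then run on these l-dimensional coordinate vectors.
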
Because B is the orthogonal basis of the subspace H_B, the quantities F, μ^φ_s and μ^φ_y can be expressed as B F̃ B^T, B μ̃_s and B μ̃_y, respectively, where F̃ is an l×l matrix, and μ̃_s and μ̃_y are both l×1 vectors. Therefore, a set of kernel Kalman filtering and kernel smoothing equations can be derived from the traditional Kalman filter and Kalman smoother. Then, the model parameters {F̃, s̃_0, μ̃_s, μ̃_y, σ_s, σ_y} in the kernel Kalman filter and the Kalman smoother are approximated using an Expectation-Maximization (EM) algorithm [58] from the training data. Finally, we find the hidden states in the input space by solving a pre-image problem [57, 59]. The details of the KKF algorithm can be found in [56, 57].

In sum, although the kernel Kalman filter is implemented using the kernel map, it is not the real Kalman filter in the RKHS. It just expresses the Kalman equations in a higher dimensional subspace H_B ⊂ H_y, where H_y is the RKHS defined on the measurement domain. Moreover, because of the very complicated and time-consuming off-line learning procedure of the model parameters, these model parameters cannot be updated simultaneously, which results in poor tracking performance for a time variant system, especially when the range of the training data cannot cover the test data.

2.3 Summary of Related Algorithms

In the previous sections, we reviewed a series of related algorithms for dynamical model problems. To clearly present the framework of these algorithms, we summarize them in Figure 2-2. From the figure, one can find that the linear adaptive filter algorithms mentioned in the dissertation are all expanded to the RKHS as new nonlinear algorithms, except the KKF algorithm, which is just implemented in a high dimensional subspace of the RKHS. Therefore, we intend to reformulate the Kalman filter in the RKHS to obtain a new nonlinear Kalman filter algorithm. This is the goal of our research and is marked red in Figure 2-2. In the following chapters, we will present some algorithms that we develop along this research line.
Figure 2-2. Relationship between algorithms
CHAPTER 3
A NOVEL EXTENDED KERNEL RECURSIVE LEAST SQUARES

3.1 Overview

In Chapter 2, we presented the KRLS and Ex-KRLS algorithms. For both, the measurement functions are constructed in the feature space H in an inner product form, while from the input space point of view the measurement functions are nonlinear, unknown in advance, and must be learned from the data. Compared with the KRLS algorithm, the Ex-KRLS algorithm has a more flexible state model in RKHS. However, as analyzed and discussed in Section 2.2.3, the Ex-KRLS is just a random walk KRLS algorithm, which cannot reflect the real and full system states. In order to have a better description of the time variant dynamical system, we have to use a non-trivial transition operator A. Although in [17] the algorithm has been derived based on a non-trivial transition operator obtained by subspace projection, how to estimate this non-trivial transition operator is still an open question.

Here we take a different and hybrid approach: building the state model in the input space and the observation model in RKHS. The solution describes the system states better, and still uses the state vector x_i to represent the known system state at time i. Similar to the KRLS and Ex-KRLS algorithms, the measurement function h(·,·) is assumed to be unknown and nonlinear, and will be learned from the data in RKHS. The state space model is presented as
  x_{i+1} = f(x_i) + n_i    (known model)
  y_i = h(x_i, u_i) + v_i    (unknown model)
where the state noise n_i and the measurement noise v_i are both in the original spaces. Once the unknown measurement function is obtained, we can use the (nonlinear) Kalman filter formulation to solve this problem just like the Ex-RLS algorithm mentioned before, as long as we assume corresponding positive definite matrices {Π_0, Q, R}. It is noteworthy that the
measurement function can also be a vector-valued function h(·,·) with vector noise v_i, which can be treated as a vector of scalar-valued functions and learned like several scalar-valued functions. Therefore, in this chapter we do not distinguish scalar-valued and vector-valued measurement functions, both denoted by h(·,·) or h(·).

Since the overall state space model is nonlinear, a nonlinear Kalman filter is necessary. In [47] we proposed a similar algorithm, the EKF-KRLS algorithm. The only difference here is that the measurement function is chosen with two arguments, the input u_i and the system state x_i. Therefore, we first introduce the EKF-KRLS algorithm, then present the novel extended kernel recursive least squares algorithm, which is a combination of the KRLS and nonlinear KF algorithms, named the extended kernel recursive least squares based on the Kalman filter (Ex-KRLS-KF). Finally, the experiments and conclusion are presented.

3.2 Extended Kalman Filter-Kernel Recursive Least Squares Algorithm

The problem we want to solve using the extended Kalman filter-kernel recursive least squares (EKF-KRLS) algorithm can be modeled as
  x_{i+1} = f(x_i) + n_i
  y_i = h(x_i) + v_i.
Here we assume the transition function f(·) is known, but the nonlinear function h(·) is unknown. The assumptions on the noises n_i and v_i are the same as mentioned before.

In order to obtain a nonlinear observation model, it will be constructed in the RKHS H. The whole system is reformulated as
  x_{i+1} = f(x_i) + n_i    (input space)
  y_i = ⟨h, φ(x_i)⟩_H + v_i    (H space)

We review the EKF algorithm in Algorithm 2-2 to derive the EKF-KRLS algorithm. Because the state model is in the input space, the things we need to consider are the
Kalman gain matrix G_i and the error covariance update equations. In light of [1], the Kalman gain is defined as
  G_i = E[x_{i+1} ẽ_i^T] R_{ẽ,i}^{-1}
where E[·] denotes the expectation operator, ẽ_i = y_i - ŷ_i = y_i - h(x̂_i^-) and R_{ẽ,i} = E[ẽ_i ẽ_i^T]. According to the orthogonality principle, we have
  R_{ẽ,i} = E[ẽ_i ẽ_i^T]
          = E[(h(x_i) - h(x̂_i^-) + v_i)(h(x_i) - h(x̂_i^-) + v_i)^T]
          = E[(h(x_i) - h(x̂_i^-))(h(x_i) - h(x̂_i^-))^T] + E[v_i v_i^T]
          = E[(h(x_i) - h(x̂_i^-))(h(x_i) - h(x̂_i^-))^T] + R_i
and
  E[x_{i+1} ẽ_i^T] = F_i E[x_i ẽ_i^T] + E[n_i ẽ_i^T]
where F_i = ∂f(x)/∂x |_{x = x̂_i}. The terms E[x_i ẽ_i^T] and E[n_i ẽ_i^T] are given by
  E[x_i ẽ_i^T] = E[(x_i - x̂_i^- + x̂_i^-) ẽ_i^T]
             = E[(x_i - x̂_i^-) ẽ_i^T]    (since x̂_i^- ⊥ ẽ_i^T)
             = E[(x_i - x̂_i^-)(h(x_i) - h(x̂_i^-))^T]    (since (x_i - x̂_i^-) ⊥ v_i)
and
  E[n_i ẽ_i^T] = E[n_i (h(x_i) - h(x̂_i^-) + v_i)^T]
             = E[n_i v_i^T]    (since n_i ⊥ (h(x_i) - h(x̂_i^-))^T)
             = 0    (assuming n_i ⊥ v_i).
Like the error covariance P_i^- = E[(x_i - x̂_i^-)(x_i - x̂_i^-)^T] in the Kalman filter algorithm, we also need to construct E[(x_i - x̂_i^-)(x_i - x̂_i^-)^T]. We employ a first-order Taylor approximation of the nonlinear function h(·) around x̂_i^-. Specifically, h(x_i) is
approximated as follows:
  h(x_i) = h(x̂_i^- + (x_i - x̂_i^-)) ≈ h(x̂_i^-) + (∂h/∂x̂_i^-)(x_i - x̂_i^-).
With the above approximate expression, we can approximate G_i by substituting the expectations above into the definition of the Kalman gain:
  G_i ≈ P_i^- H_i^T (H_i P_i^- H_i^T + R_i)^{-1}
where H_i = ∂h/∂x̂_i^-. The error covariance update equation is the same with the approximated Kalman gain.

Once H_i is estimated at time i, we can apply the Kalman filter algorithm to solve the prediction problem. Up to this point, the derivation is very similar to the EKF algorithm, except that the function h(·) is assumed unknown. In order to obtain H_i = ∂h/∂x̂_i^-, we use the KRLS algorithm mentioned in Chapter 2 to estimate the function h(·) based on the predicted hidden states x̂_j (j ≤ i). At every time step, the EKF algorithm estimates the state using the current estimated function ĥ_i(·), while the KRLS algorithm estimates the unknown function ĥ_{i+1}(·) using all available estimated hidden states. We concatenate these two algorithms to obtain a novel algorithm, the EKF-KRLS algorithm, summarized in Algorithm 3-1.

Here a_0 is a 1×n_y vector if n_y > 1, and a_i for i ≥ 1 is an (i+1)×n_y matrix.

Like the exponentially weighted KRLS algorithm, Q(i-1) is scaled by a forgetting factor β (0 < β ≤ 1) to get Q(i). The reason is that we estimate the measurement function h_i(·) based on the estimated hidden states {x̂_j}_{j=1}^i; however, the hidden states cannot be trusted at the beginning of filtering, so we use the forgetting factor to get rid of the impact of wrong early estimates.

3.3 Formulation of the Novel Extended Kernel Recursive Least Squares

As discussed in Section 2.2.3, it is not easy to build a state model in RKHS, which involves operator learning. To avoid this problem, we still maintain the measurement
Algorithm 3-1: EKF-KRLS

Initialize: for i = 0, set
  x̂_0 = E[x_0], Φ_0 = [k(x̂_0, ·)]
  P_0 = E[(x_0 - E[x_0])(x_0 - E[x_0])^T]
  Q(0) = (λ + k(x̂_0, x̂_0))^{-1}, a(0) = 1_{1×n_y}
Filtering: for i = 1, 2, ...
  State filtering (EKF):
    F_i = ∂f(x)/∂x |_{x = x̂_{i-1}}
    x̂_i^- = F_{i-1} x̂_{i-1}
    ŷ_i = a(i-1)^T Φ_{i-1}^T φ(x̂_i^-)
    P_i^- = F_{i-1} P_{i-1} F_{i-1}^T + S_{i-1}
    H̃_i = a(i-1)^T ∂[Φ_{i-1}^T φ(x)]/∂x |_{x = x̂_i^-}
    G_i = P_i^- H̃_i^T [H̃_i P_i^- H̃_i^T + R_i]^{-1}
    x̂_i = x̂_i^- + G_i (y_i - ŷ_i)
    P_i = (I - G_i H̃_i) P_i^-
  Measurement function update (KRLS):
    Φ_i = [Φ_{i-1}, k(x̂_i, ·)]
    h(i) = Φ_{i-1}^T φ(x̂_i)
    z(i) = Q(i-1) h(i)
    r(i) = λ + k(x̂_i, x̂_i) - z(i)^T h(i)
    Q(i) = r(i)^{-1} [ Q(i-1) r(i) + z(i) z(i)^T ,  -z(i) ;  -z(i)^T ,  1 ]
    e(i) = y_i - a(i-1)^T h(i)
    a(i) = [ a(i-1) - z(i) r(i)^{-1} e(i)^T ;  r(i)^{-1} e(i)^T ]

model in the input space, but learn the measurement function in RKHS. It is similar to the EKF-KRLS algorithm presented in the previous section. To deal with a measurement function with two arguments, the input u_i and the hidden state x_i, we combine them as a new input z_i = [u_i^T, x_i^T]^T and express the measurement function in the RKHS in inner product form as
  h(x_i, u_i) = h̃(z_i) = ⟨h̃(·), φ(z_i)⟩_H
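
For the Gaussian kernel, the linearization H̃_i used in Algorithm 3-1 has a closed form, because the KRLS estimate is an expansion ĥ(z) = Σ_j a_j k(c_j, z) whose gradient is available analytically. The following Python sketch (hypothetical names; a scalar-valued ĥ and the concatenated input z = [u^T, x^T]^T are assumed) illustrates both the evaluation and the Jacobian with respect to the state part of z:

import numpy as np

def h_hat(z, centers, a, sigma):
    """KRLS estimate h_hat(z) = sum_j a_j k(c_j, z), Gaussian kernel."""
    d2 = ((centers - z) ** 2).sum(axis=1)
    return np.exp(-d2 / (2 * sigma**2)) @ a

def H_row(z, centers, a, sigma, nx):
    """Row vector dh_hat/dx at z; the state x occupies the last nx entries.
    Uses d/dz exp(-||c - z||^2 / (2 s^2)) = k(c, z) (c - z) / s^2."""
    d2 = ((centers - z) ** 2).sum(axis=1)
    kz = np.exp(-d2 / (2 * sigma**2))
    grad = ((centers - z) / sigma**2) * (kz * a)[:, None]   # per-center terms
    return grad.sum(axis=0)[-nx:]                           # keep the x-part

For a vector-valued measurement function, the same computation is repeated per output component, stacking the resulting rows into H̃_i.
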
Algorithm 3-2: Extended Kernel Recursive Least Squares with Kalman filter (Ex-KRLS-KF)

Initialization for i = 0:
  Initialize the measurement function h_0(·,·)
  Set s_0, Π_0, Q and R
Filtering for i = 1, 2, ...
  State filtering (nonlinear KF):
    Use any nonlinear Kalman filter to estimate the hidden state x_i based on the initialized or approximated measurement function h_{i-1}(·,·), the input u_i and the measurement y_i
  Measurement function update (KRLS):
    Use KRLS to approximate and update the measurement function h_{i-1}(·,·) based on all the available inputs {u_j}_{j=1}^i, outputs {y_j}_{j=1}^i, and estimated hidden states {x̂_j}_{j=1}^i

Therefore, as long as we have corresponding measurements {y_i}, hidden states {x_i} and inputs {u_i}, we can approximate the measurement function h(·,·) using the KRLS algorithm.

It is observed that the EKF algorithm is not the unique choice for state filtering. Actually, any of the nonlinear Kalman filters described in Section 2.1.3 can be applied to estimate the hidden states. The novel algorithm is summarized in Algorithm 3-2.

In practice, the exponentially-weighted KRLS should be employed for this algorithm. The reason is that we estimate the measurement function h(x_i, u_i) based on the estimated hidden states {x̂_j}_{j=1}^i; however, the hidden states cannot be trusted at the beginning of filtering because of the incorrect initial measurement function. So we use the forgetting factor to attenuate the impact of incorrect early estimates. Furthermore, we can consider discarding some of the early estimated states, or computing with only several recent estimated states in a running window.

Although our algorithm is the combination of the nonlinear Kalman filter and KRLS algorithms, the computational complexity of the Ex-KRLS-KF is comparable with the KRLS and Ex-KRLS algorithms, which is equal to O(i²) at iteration i. Moreover, we
should also consider sparsification and approximate linear dependency (ALD) [16, 17] to restrict the computational complexity.

In this algorithm, the measurement function is approximated based on the inputs, measurements and estimated hidden states. On the other hand, the hidden states are estimated based on the approximated measurement function. Actually, it is the information about the state model that is applied to approximate the unknown measurement function.

We use the KRLS algorithm to approximate the measurement function h(·,·), which is a function of two arguments. However, we can also treat the function as a parameterized function of one argument, the input u_i. Treating the hidden state x_i as the parameter of the function, we have
  h(x_i, u_i) = h_{x_i}(u_i)
which can be rewritten as an inner product in the RKHS:
  h(x_i, u_i) = ⟨h_{x_i}(·), φ(u_i)⟩_H.
Therefore, we have the following two special cases.

By setting the state transition function f(·) as an identity function, we have the KRLS algorithm for the following random walk model:
  x_{i+1} = x_i + n_i
  y_i = h_{x_i}(u_i) + v_i.
Furthermore, by setting the covariance matrix Q as the zero matrix, we recover the KRLS algorithm.

The parameterized measurement function h_{x_i}(·) is the counterpart of x_i in the Ex-KRLS algorithm, which is also unknown in advance. However, the transition behavior
of the function h_{x_i}(·), which is determined by its parameter x_i, is described better than by x_i alone. Therefore this algorithm is more general.

3.4 Experiments and Results

In this section, two experiments are given to evaluate the tracking performance of the Ex-KRLS-KF algorithm. One is vehicle tracking, and the other is Rayleigh fading channel tracking. The performance of our algorithm is compared with other existing algorithms. To demonstrate the generality of the novel algorithm, the EKF and CKF algorithms are applied within our algorithm, respectively.

3.4.1 Vehicle tracking

Because this algorithm is the combination of the nonlinear Kalman filter and the KRLS algorithm, we compare the tracking performance of our algorithm with the nonlinear Kalman filter and the KRLS algorithm by tracking a vehicle in a popular open surveillance data set, PETS2001, which is available at http://www.hitech-projects.com/euprojects/cantata/dataset/cantata/dataset.html. The celebrated extended Kalman filter (EKF) is selected as the nonlinear Kalman filter in this experiment.

Since our goal is to evaluate the tracking performance of these algorithms, we manually mark the vehicle which we want to track. The figures below show the frames and the trajectory of the vehicle.

Figure 3-1. Trajectory of the vehicle with background

In Figure 3-1 the red line is the trajectory of the right front light of the vehicle. One can see that the vehicle travels straight first, then backs up, and finally parks. The
trajectory is presented alone in Figure 3-2. There are 410 frames in the surveillance video. In the figure the vehicle positions P(ε, η) are in a Cartesian coordinate system. The kinematics of the vehicle can be modeled by Ramachandra's model:

  x_{i+1} = [ 1  T  T²/2  0  0  0
              0  1  T     0  0  0
              0  0  1     0  0  0
              0  0  0     1  T  T²/2
              0  0  0     0  1  T
              0  0  0     0  0  1 ] x_i + n_i

Figure 3-2. Trajectory of the vehicle

where the state of the model is x = [ε, ε̇, ε̈, η, η̇, η̈]^T; ε and η denote the positions, and ε̇ and η̇ denote the velocities of the vehicle in the x and y directions, respectively; T is the time interval between two consecutive frames; the process noise n_i is N(0, Q) with covariance matrix Q = diag[0, 0, σ_aε, 0, 0, σ_aη]. The scalar parameters σ_aε and σ_aη are related
to the process noise intensities. n_i is the plant noise that perturbs the acceleration and accounts for both maneuvers and other modeling errors.

A radar is fixed at the origin of the plane and equipped to track the vehicle. Our measurements are the distance ρ between the vehicle and the origin P(0, 0) and the slope k with respect to the origin. The measurement equation is expressed as
  [ ρ_i ]   [ sqrt(ε_i² + η_i²) ]
  [ k_i ] = [ η_i / ε_i          ] + v_i
where the measurement noise v ~ N(0, R) with covariance matrix R.

For these three different algorithms, we have the following settings:

Nonlinear Kalman filter: Using the state model and measurement model given above, we employ the EKF algorithm to predict the next step measurement. The parameters are set as σ_aε = σ_aη = 1000 and R = diag([1, 0.1]).

KRLS: We use the KRLS algorithm to predict the next measurement based on the previous N measurements. For the 2-D vehicle tracking, we actually use 2N data to predict the distance and slope, respectively. The parameter N is set to 6 to obtain the best performance of the KRLS algorithm. We choose the Gaussian kernel in the KRLS algorithm; the kernel size is set to 100 through trials. The forgetting factor is 0.85 to improve the tracking performance of the KRLS algorithm.

Ex-KRLS-KF: For the Ex-KRLS-KF algorithm we only need to know the transition matrix. The function h(·) can be learned from the data. We choose the same transition matrix as the EKF to track the vehicle.

Comparing the radar measurement model with the unknown measurement model assumed by our algorithm, one can find that there is no input in the measurement model in this application. In order to employ the Ex-KRLS-KF algorithm, we can set all the inputs as constants. Actually, if we set all inputs as zeros, we have the EKF-KRLS algorithm [47] of Section 3.2.
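
For concreteness, the state model and the radar measurement equation above can be assembled as in the following Python sketch (illustrative only; the frame interval T = 1 is an assumed placeholder, and σ_aε = σ_aη = 1000 follows the EKF setting):

import numpy as np

T = 1.0                                    # frame interval (assumed value)
blk = np.array([[1.0, T, T**2 / 2],
                [0.0, 1.0, T],
                [0.0, 0.0, 1.0]])
F = np.kron(np.eye(2), blk)                # acts on [eps, eps', eps'', eta, eta', eta'']
Q = np.diag([0.0, 0.0, 1000.0, 0.0, 0.0, 1000.0])
R = np.diag([1.0, 0.1])

def measure(x):
    """Radar output: distance to the origin and slope with respect to it."""
    eps, eta = x[0], x[3]
    return np.array([np.hypot(eps, eta), eta / eps])

The EKF linearizes measure() around the predicted state at every frame, while the Ex-KRLS-KF learns this mapping from data instead.
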
In the Ex-KRLS-KF algorithm, the unknown measurement function h(·,·) is learned using the KRLS algorithm with the previously estimated hidden states, while the hidden states themselves are estimated by the previously approximated function ĥ(·,·). The early estimated hidden states cannot be trusted. Therefore, we need a forgetting factor 0 < β < 1 to give the current hidden states larger weights. We also use a running window to control the updating of the function h(·), which means that we learn the current h(·) only from the previous m estimated hidden states. We set these parameters as β = 0.64 and m = 35. We set the parameters in the covariance matrix of the process noise as Q = diag[0, 0, σ_aε, 0, 0, σ_aη] where σ_aε = σ_aη = 10^{-7}, and the covariance matrix of the measurement noise as R = rI where r = 0.1. We also use the Gaussian kernel in the algorithm, and the kernel size is set to 1000.

All parameters above are chosen to obtain the best performances.

Because the ranges of the distance ρ and the slope k are quite different, we compare their performances separately. Considering that these algorithms have different convergence times, we compare their performances from the 50th frame to the end. Table 3-1 summarizes the prediction performances for the distance ρ and the slope k.

Table 3-1. MSE of distance ρ and slope k
  Algorithm   EKF      KRLS     Ex-KRLS-KF
  distance ρ  0.3180   0.3654   0.1709
  slope k     0.0024   0.0001   0.0000

In order to compare their performances more correctly and visually, we transform the distance ρ and the slope k to the position P(ε, η) using the equations below:
  ε = ρ / sqrt(1 + k²),
  η = k ε.
The trajectories and errors are plotted in Figure 3-3 and Figure 3-4. Table 3-2 summarizes the prediction performances of these three algorithms. The results are the MSE between the predicted position and the true position from the 100th frame to the
end. These results clearly show that the proposed algorithm has the fastest rate of convergence and the best tracking performance; the superiority is very obvious.

Figure 3-3. Trajectories of the true position and the predictions of the EKF, KRLS, and Ex-KRLS-KF algorithms

Table 3-2. MSE of position
  Algorithm  EKF     KRLS    Ex-KRLS-KF
  MSE        1.7130  1.0156  0.5467

3.4.2 Rayleigh channel tracking

We consider the problem of tracking a nonlinear Rayleigh fading multipath channel and compare the performance of our novel Ex-KRLS-KF algorithm with some existing algorithms, including the normalized LMS, RLS, Ex-RLS, KRLS and Ex-KRLS algorithms. In this experiment, we employ the CKF instead of the EKF in the Ex-KRLS-KF algorithm.

The nonlinear Rayleigh fading multipath channel employed here is the cascade of a traditional Rayleigh fading multipath channel and a saturation nonlinearity. In the Rayleigh multipath fading channel, the number of paths is chosen as M = 5, the
same maximum Doppler frequency f_D is used for each path, and the sampling rate is T_s = 0.8 μs. According to [1], the maximum Doppler frequency f_D is related to the speed of the mobile user v and to the carrier frequency f_c as f_D = v f_c / c, where c denotes the speed of light, c = 3×10^8 m/s. Assuming a carrier frequency of f_c = 900 MHz, the Doppler frequency that corresponds to a vehicle moving at the speed of v = 120 km/h is 100 Hz (so it is a slow fading channel with the same fading rate for all the paths).

Figure 3-4. Comparison of the EKF, KRLS, and Ex-KRLS-KF algorithms

The Rayleigh fading multipath channel is simulated with the Matlab Communication System Toolbox [91]. The input signal is a unit power white Gaussian process which is sent through this channel. Then the real part of the output of the Rayleigh channel is corrupted with additive white Gaussian noise with variance σ² = 0.001. Finally the saturation nonlinearity y = tanh(x) is applied to the noisy output of the Rayleigh channel. The whole nonlinear channel is treated as a black box and only the input and output are known.
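
The dissertation generates the channel with the Matlab toolbox; purely as an illustration, the following Python sketch reproduces the same experimental structure with a sum-of-sinusoids (Jakes-style) approximation of the Rayleigh fading taps, which is an assumption on our part rather than the toolbox's exact method:

import numpy as np

rng = np.random.default_rng(0)

def jakes_tap(n, fD, Ts, n_osc=16):
    """One slowly fading complex tap via a sum-of-sinusoids approximation."""
    t = np.arange(n) * Ts
    theta = rng.uniform(0, 2 * np.pi, n_osc)      # arrival angles
    phase = rng.uniform(0, 2 * np.pi, n_osc)
    osc = np.exp(1j * (2 * np.pi * fD * np.cos(theta)[:, None] * t + phase[:, None]))
    return osc.sum(axis=0) / np.sqrt(n_osc)

M, fD, Ts, n = 5, 100.0, 0.8e-6, 1000             # paths, Doppler, sampling, symbols
taps = np.stack([jakes_tap(n, fD, Ts) for _ in range(M)])   # (M, n) channel taps
s = rng.standard_normal(n + M - 1)                # unit-power white Gaussian input
x = np.array([taps[:, i] @ s[i:i + M][::-1] for i in range(n)]).real
y = np.tanh(x + np.sqrt(1e-3) * rng.standard_normal(n))     # noisy, saturated output
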
In this experiment, six algorithms are employed to track the time variant multipath channel. The first one is the normalized LMS algorithm (shown as LMS-2 in the following figures and tables), with regularization factor 10^{-3} and step size 0.25. The second is the RLS algorithm, with regularization parameter 10^{-3} and forgetting factor 0.995. The third is the Ex-RLS algorithm, with state transition matrix F = diag[α_1, ..., α_M] and state noise covariance Q = diag[q_1, ..., q_M], where the parameters α_i and q_i (i = 1, ..., M) are both determined by the maximum Doppler frequency f_D = 100 Hz; the forgetting factor is 0.995 and the regularization parameter is 10^{-3}. All parameters for these algorithms are set according to [1] (page 759). The fourth is the KRLS algorithm, with forgetting factor 0.995 and regularization parameter 0.01. The fifth is the Ex-KRLS algorithm (A = αI, α = 1, q = 10^{-2}, forgetting factor 0.995, regularization parameter 0.01). The last one is the proposed Ex-KRLS-KF algorithm, with the same state model and state noise as the Ex-RLS algorithm, forgetting factor 0.995, regularization parameter 0.01, and measurement noise covariance R = 10^{-6}. For all the kernel methods, the Gaussian kernel is applied with the kernel parameter equal to 1. All the parameters are selected for best results.

We generate 1000 symbols for every experiment and perform 200 Monte Carlo experiments with independent inputs and additive noise. The ensemble learning curves are plotted in Figure 3-5, which clearly shows that our algorithm has the best tracking performance. Because the tracking performances of the RLS and Ex-RLS are too close to be distinguished in the figure, we just plot the Ex-RLS curve. The last 100 values in the learning curves are used to calculate the final mean square error (MSE), which is listed in Table 3-3. From Table 3-3, the proposed algorithm outperforms the other existing algorithms in a statistically significant manner. Observing Figure 3-5 more carefully, one can find that the KRLS, Ex-KRLS and Ex-KRLS-KF algorithms perform very similarly at the beginning stage of tracking (for i ≤ 100) because of the slow fading channel, when these algorithms are learning the nonlinear channel. Then, the performances of these kernel
algorithms differ (for i > 100). The KRLS algorithm cannot always track the dynamical system, and the MSE of the KRLS starts to increase from about the 500th iteration. The Ex-KRLS algorithm can still track and learn this system slowly because of its random walk state model and state noise. The Ex-KRLS-KF algorithm can track and learn this system better because of its more precise state model and well-learned measurement model. Therefore, even for this slow fading channel we still obtain more than 5 dB improvement compared with the Ex-KRLS.

Figure 3-5. Ensemble learning curves of the LMS-2, RLS, Ex-RLS, KRLS, Ex-KRLS and Ex-KRLS-KF in tracking a Rayleigh fading multipath channel with maximum Doppler frequency f_D = 100 Hz

Table 3-3. Performance comparison in Rayleigh fading channel tracking with maximum Doppler frequency f_D = 100 Hz
  Algorithm    MSE (dB)
  LMS-2        -10.8764 ± 0.79662
  RLS          -11.8035 ± 0.54729
  Ex-RLS       -11.8043 ± 0.54741
  KRLS         -16.1064 ± 1.2613
  Ex-KRLS      -18.3312 ± 1.5165
  Ex-KRLS-KF   -23.6590 ± 0.82174
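
The final MSE figures in Table 3-3 (and in Table 3-4 below) are computed from the tails of the ensemble-averaged learning curves; a minimal sketch of that computation:

import numpy as np

def final_mse_db(learning_curve, tail=100):
    """Final MSE in dB: average of the last `tail` values of the curve."""
    return 10 * np.log10(np.mean(learning_curve[-tail:]))
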
In the previous experiment, the maximum Doppler frequency was set as f_D = 100 Hz for a slow fading channel. In order to show the different behaviors of these algorithms, we increase the maximum Doppler frequency to f_D = 500 Hz, which corresponds to a vehicle moving at the speed of v = 600 km/h (the speed of the fastest train in the world is about 570 km/h), to obtain a faster fading channel, and repeat the experiment. Considering that the increased maximum Doppler frequency will result in larger state noise for the Ex-KRLS algorithm, we reset the parameter q = 0.1 in this algorithm. The other parameters are maintained or modified according to the new maximum Doppler frequency.

The performances of these algorithms are plotted in Figure 3-6. The last 100 values in the learning curves are used to calculate the final mean square error (MSE), which is listed in Table 3-4.

Figure 3-6. Ensemble learning curves of the LMS-2, RLS, Ex-RLS, KRLS, Ex-KRLS and Ex-KRLS-KF in tracking a Rayleigh fading multipath channel with maximum Doppler frequency f_D = 500 Hz

From the results in Table 3-4, it is clear that the proposed algorithm achieves the best performance and the improvement is also statistically significant. The KRLS algorithm does
Table 3-4. Performance comparison in Rayleigh fading channel tracking with maximum Doppler frequency f_D = 500 Hz
  Algorithm    MSE (dB)
  LMS-2        -10.9638 ± 0.85232
  RLS          -9.54140 ± 0.44459
  Ex-RLS       -9.82010 ± 0.44807
  KRLS         -5.91970 ± 1.1839
  Ex-KRLS      -14.0677 ± 1.1863
  Ex-KRLS-KF   -21.5085 ± 1.1051

not work in this case. Even the Ex-KRLS algorithm cannot obtain good tracking performance, and its MSE increases slightly as the iterations increase.

To analyze our algorithm further, we plot the MSE of the Ex-KRLS-KF for different maximum Doppler frequencies (from 100 Hz to 1500 Hz) in Figure 3-7. One can find that the MSEs are larger than -10 dB when the maximum Doppler frequencies are higher than 1000 Hz. The reason is that when the maximum Doppler frequency is high, the hidden state changes faster, with the result that the KRLS algorithm is not able to learn the measurement function well. Further, with the inaccurate measurement function, the Ex-KRLS-KF cannot estimate proper hidden states and track the signal well. Therefore, the Ex-KRLS-KF is only suitable for tracking slow fading channels.

3.5 Discussion and Conclusion

In this chapter, a new extended kernel recursive least squares algorithm, Ex-KRLS-KF, is proposed based on the Kalman filter. The KRLS and Ex-KRLS algorithms are also reviewed and compared with the proposed algorithm. The KRLS algorithm constructs its model in a linear form in the RKHS, which is an unknown nonlinear model from the view of the input space. The Ex-KRLS algorithm goes along this line, and constructs its state model in the RKHS as well to achieve better tracking performance. However, since the estimation of the transition operator in the RKHS for a general case is difficult, Ex-KRLS is just implemented practically with a very simple state model, which can be used as exponentially-weighted KRLS and random-walk KRLS (see details in Section 2.2.3). In order to develop an algorithm which can deal
Figure 3-7. MSE versus maximum Doppler frequency. Error bars mark one standard deviation above and below the mean

with a complicated state model, we proposed the Ex-KRLS-KF algorithm, which is a combination of the KRLS and the Kalman filter. The state model is preserved in the original data space, while the measurement model is constructed in the RKHS. Therefore, this algorithm can have a complicated state model and is suitable for any unknown measurement model.

Two applications of this algorithm are presented. In the vehicle tracking problem, our algorithm is compared with the KRLS and EKF algorithms, which are the two parts of our algorithm, to track a vehicle in a surveillance video. The application is a special case for our algorithm: the inputs are set as zeros. The tracking performances show that our algorithm outperforms its components. The nonlinear Rayleigh fading channel problem illustrates that our algorithm is the best one among the other existing algorithms, including the LMS-2, RLS, Ex-RLS, KRLS, and Ex-KRLS algorithms.

All the kernelized algorithms included in this chapter have better tracking performance than the other algorithms in the data space. The superiority is achieved at the cost of higher
computational complexity. In Table 3-5, we summarize the computational complexity of all the algorithms implemented in this chapter.

Table 3-5. Computational complexity analysis
  Algorithm    Computational complexity per iteration
  RLS          O(n_x²)
  Ex-RLS       O(n_x²)
  EKF          O(n_x³)
  UKF          O(n_x³)
  CKF          O(n_x³)
  KRLS         O(i²)
  Ex-KRLS      O(i²)
  Ex-KRLS-KF   O(i²) + O(n_x³)

Here n_x denotes the dimension of the weight vector or the state vector. The computational complexities of the KRLS and Ex-KRLS algorithms grow as the square of the iteration i, and the computational complexities of the non-kernelized algorithms grow as the square or the cube of the state dimension n_x. Because the proposed algorithm is the combination of the KRLS and nonlinear Kalman filter algorithms, its computational complexity is the sum of their computational complexities. Generally, since the sample number is much larger than the state dimension, the computational complexity of the proposed algorithm is comparable with the KRLS and Ex-KRLS algorithms. There are strategies to reduce the computational complexity of KRLS (see for instance ALD [16, 17] and surprise [55]), and they can also be applied to the Ex-KRLS-KF algorithm. Further work towards a constant budget algorithm is a promising line of research.

Because of the close relationship between the Ex-RLS and the Kalman filter, nonlinear Kalman filter algorithms (such as the EKF, UKF, and CKF) are employed to solve the nonlinear Ex-RLS problem. Using the nonlinear Kalman filter with the approximated measurement function, we can predict the outputs of the measurement function. However, how to estimate the hidden state is still an interesting problem. Since the approximation of the measurement function is based on the estimated hidden states,
while the hidden states are estimated based on the measurement function, how to implement this dual learning task still needs more research.
CHAPTER 4
LEARNING NONLINEAR GENERATIVE MODELS OF TIME SERIES WITH A KALMAN FILTER IN RKHS

4.1 Overview

We study a novel methodology to design a learning nonlinear generative model using a Bayesian filtering framework in reproducing kernel Hilbert spaces (RKHS). The goal of this algorithm is to estimate and/or predict a time series from the noisy measurements generated by the underlying unknown autonomous system in Figure 4-1, which can be a nonlinear, non-convergent dynamical system,

Figure 4-1. Generated time series

where {x_i} is the time series in R^{n_d}, which is usually unobservable, and {y_i} are the measurements in R^{n_d}, which are the noisy version of the time series {x_i} corrupted by zero-mean measurement noise {v_i}. The noise {v_i} is independent of the signal {x_i}, and no Gaussian assumption is required about the noise. Because only the measurements are observable, we model a new dynamical system for the measurements in RKHS, which is learned from the given training measurement data, and use the Kalman filter in the RKHS to estimate and/or predict the measurement distribution. Further, we can recover the estimated and/or predicted time series from the corresponding measurement distribution.

This kind of model is very popular and is applied to study many phenomena in science, engineering and economics, and many methodologies have been proposed to solve this
kind of problem. We can categorize them into two groups: direct models and generative models.

Direct approaches make no explicit attempt to model the underlying distributions of the variables and features, and are only interested in optimizing a mapping from the inputs to the desired model outputs. For the autonomous model in Figure 4-1, y_i or the embedded past measurements y_{E,i} = [y_{i-(n_E-1)τ}, y_{i-(n_E-2)τ}, ..., y_{i-τ}, y_i], where n_E and τ are the embedding dimension and the embedding delay respectively according to Takens' embedding theorem [64], are used to predict y_{i+1}, by approximating the function f: y_{i+1} = f(y_i) or f: y_{i+1} = f(y_{E,i}). These direct approaches include neural networks [71], support vector regression (SVR) [65, 67], and kernel adaptive filter algorithms such as kernel recursive least squares (KRLS) [16] and kernel least mean squares (KLMS) [15]. These approaches usually have good performance in predicting noiseless time series. However, they cannot represent the underlying structure of the time series, and their performance is always affected by different kinds of noise environments.

Generative approaches produce a probability density model over all variables in a system and manipulate it to compute classification and regression functions. The celebrated Kalman filter [25] is one of the generative approaches, which is widely used to estimate hidden states for linear system models. The problem is that the Kalman filter is a linear system, so it cannot cope with the diversity of models that generate time series of interest in science and engineering. Nonlinear versions of the Kalman filter were developed to solve nonlinear dynamical system problems, such as the extended Kalman filter (EKF) [27, 29], unscented Kalman filter (UKF) [33, 34] and cubature Kalman filter (CKF) [38, 39] algorithms. More recently, other generative approaches have also been applied to these nonlinear filtering problems, such as the kernel Kalman filter (KKF) [56, 57], the Dynamical System Model with a Conditional Embedding (DSMCE) operator [61] and the kernel Bayes' rule (KBR) [82] algorithms. For the KKF, the Kalman filter is implemented in a high dimensional subspace obtained by the kernel
PCA algorithm [11, 12]. The DSMCE and KBR algorithms are both developed based on the conditional embedding concept in RKHS. All of these generative approaches treat the time series {x_i} or their mappings in the feature space as the hidden states, and attempt to describe the dynamics of the hidden states by an assumed system state-space model or a given hidden state training data set. However, for most time series problems, an accurate system state-space model or a clean hidden state training data set is not available. Therefore, accurate estimation and prediction performance cannot be obtained by these algorithms.

In order to avoid describing the unavailable hidden state dynamics, we use the estimated embedding of the noisy measurements {y_i} in RKHS as the hidden state, instead of the embedding of {x_i}. Then we construct a new state space model in a RKHS using the estimated conditional embedding operator and implement Kalman filtering in this space.

This approach learns the state of the nonlinear dynamical system in a self-organized manner by processing the given measurements. As such, it does not require an assumed state space model, unlike the Kalman filter and its variants. In addition, the embedding represents the distribution of the measurements, and the conditional embedding operator describing the dynamics can be estimated using the training measurement data alone. Therefore, the Kalman filter in the RKHS propagates and updates the measurement distribution, not just the mean and covariance as in its input space counterpart, which enhances the filtering capability to withstand non-Gaussian noise environments. Furthermore, the state and measurement model uncertainty is assumed to be Gaussian noise in the RKHS, which can represent much wider noise types in the input space [12, 57]. Therefore our solution in the RKHS preserves most of the features of the Bayesian filter, while the recursive update structure of the Kalman filter is fully utilized with the advantage of efficient computation. To distinguish our work
from previous attempts, we name our novel algorithm the kernel Kalman filter based on the conditional embedding operator (KKF-CEO).

The rest of the chapter is organized as follows. First, Hilbert space embeddings and conditional embeddings are introduced. Then the conventional Kalman filter is reviewed and our KKF-CEO algorithm is derived, followed by a discussion of some optimization techniques. Experiments comparing the novel algorithm with existing algorithms are then presented. Finally, the discussion and conclusion are given.

4.2 Conditional Embeddings in RKHS

In this section, we introduce the Hilbert space embeddings and the conditional embedding operator in the RKHS, which can be estimated from given training measurements. The estimated conditional embedding operator will be applied in our novel algorithm as the state transition operator.

4.2.1 Hilbert Space Embeddings

The kernel functions mentioned before can map either a deterministic value or a random variable (R.V.) into the RKHS. In other words, we can represent probability distributions by elements in a reproducing kernel Hilbert space with a Hilbert space embedding. We summarize briefly here the concepts of Hilbert space embeddings [61, 62, 82]. In the following, we denote by X and Y random variables with domains X and Y, and refer to instantiations of X and Y by the lower case characters x and y. We endow X and Y with σ-algebras A and B, and denote the spaces of all probability distributions (with respect to A and B) on X and Y by P_X and P_Y, respectively.

Embedding a distribution. We first focus on a single R.V. X with probability distribution P_X(x) on X. Considering that x is mapped into a RKHS F associated with the kernel function k_F by φ(·), the expectation of f(X) can be expressed as an inner
product,
  E[f(X)] = ∫ f(x) dP_X(x) = ∫ ⟨f, φ(x)⟩_F dP_X(x)
          = ⟨ f, ∫ φ(x) dP_X(x) ⟩_F = ⟨f, E[φ(x)]⟩_F = ⟨f, μ_X⟩_F
where f(·) ∈ F and E[·] denotes the expectation operator. Here, the term μ_X is the embedding in the RKHS with respect to the probability distribution P_X(x), which is defined as [61]
  μ_X := E_X[φ(X)].
The embedding is also in the RKHS F as long as E[k_F(X, X)] < ∞, and is referred to as the mean map. Its empirical estimate is
  μ̂_X = (1/m) Σ_{i=1}^m φ(x_i)
where D_X = {x_1, ..., x_m} is a training set assumed to have been drawn i.i.d. from P_X(x). With the embedding μ_X, we can guarantee that distinct distributions are mapped to distinct points in a RKHS for the characteristic kernel class [79].

Definition 1 (Characteristic kernel). Let X be a topological space, P_X ∈ P_X be a Borel probability distribution on X, F a RKHS associated with a measurable, bounded kernel k(·,·) on X, and μ_X ∈ F the embedding with respect to the distribution P_X. Then k(·,·) is said to be characteristic if the map M defined by
  M: P_X → F,  P_X ↦ μ_X
is injective.
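
Thanks to the kernel trick, the empirical mean map never requires the explicit feature map φ(·); evaluating μ̂_X at a point reduces to averaging kernel values. A minimal Python sketch (hypothetical names, Gaussian kernel assumed):

import numpy as np

def gaussian_k(A, b, sigma):
    """k(a_i, b) for each row a_i of A, Gaussian kernel."""
    return np.exp(-((A - b) ** 2).sum(-1) / (2 * sigma**2))

def mean_embedding_eval(X_train, y, sigma):
    """mu_hat_X(y) = <mu_hat_X, phi(y)> = (1/m) sum_i k(x_i, y)."""
    return gaussian_k(X_train, y, sigma).mean()

Correspondingly, E[f(X)] ≈ ⟨f, μ̂_X⟩ = (1/m) Σ_i f(x_i) for any f in the RKHS.
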
There are close relationships between characteristic kernels and universal kernels [77, 86]. Specifically, the Gaussian kernel is universal and characteristic. Therefore, in this chapter the Gaussian kernel is selected to develop our novel algorithm. From now on, only RKHS associated with Gaussian kernels are used.

Cross-covariance operator. Next, we consider two R.V. X and Y with marginal and joint probability distributions P_X, P_Y, and P_XY, respectively. Like φ(·), which maps the R.V. X into the RKHS F, ψ(·) maps the R.V. Y into the RKHS G associated with the kernel function k_G, with the mean map μ_Y with respect to P_Y. Likewise, the expectation of f(X)g(Y) can also be expressed as an inner product in RKHS,
  E[f(X)g(Y)] = E[⟨f, φ(X)⟩_F ⟨g, ψ(Y)⟩_G]
             = E[⟨f ⊗ g, φ(X) ⊗ ψ(Y)⟩_{F⊗G}]
             = ⟨f ⊗ g, E[φ(X) ⊗ ψ(Y)]⟩_{F⊗G}
             = ⟨f ⊗ g, C_XY⟩_{F⊗G} = ⟨f, C_XY g⟩_F
where ⊗ denotes the tensor product operator and F⊗G is also a RKHS, associated with the kernel function k_F k_G. The term C_XY is the uncentered cross-covariance operator, which is defined as
  C_XY = E[φ(X) ⊗ ψ(Y)] = ∫ φ(x) ⊗ ψ(y) dP_XY(x, y) ∈ F⊗G
and C_XX is defined analogously as
  C_XX = E[φ(X) ⊗ φ(X)] = ∫ φ(x) ⊗ φ(x) dP_X(x) ∈ F⊗F.
The cross-covariance operator is a linear operator C_XY: G → F. Since the uncentered cross-covariance operator is determined only by the joint probability distribution P_XY(x, y) on X×Y when the kernel functions are given, it can be treated as the joint distribution embedding μ_XY in the tensor product RKHS F⊗G [82].

Given m pairs of training examples D_XY = {(x_1, y_1), ..., (x_m, y_m)} drawn i.i.d. from P_XY(x, y), the uncentered covariance operator C_XY can be estimated as
  Ĉ_XY = (1/m) Σ_{i=1}^m φ(x_i) ⊗ ψ(y_i) = (1/m) Φ Ψ^T
where we denote the feature matrices by Φ = [φ(x_1), ..., φ(x_m)] and Ψ = [ψ(y_1), ..., ψ(y_m)].

4.2.2 Conditional embedding operator

In [61] the conditional embedding is introduced, which embeds conditional distributions of the form P_{Y|X} into a RKHS. Assume that the conditional expectation E_{Y|X}[g(Y)|X = ·] ∈ F for all g ∈ G. We have the following relation provided by Fukumizu in [62]:
  C_XX E_{Y|X}[g(Y)|X = ·] = C_XY g.
Here the cross-covariance operators C_XX and C_XY map the functions E_{Y|X}[g(Y)|X = ·] ∈ F and g ∈ G into F, respectively.

Using this relation, the conditional expectation of g(Y) given X = x can be expressed as an inner product [62, 82]:
  E_{Y|X}[g(Y)|X = x] = ⟨g, μ_{Y|x}⟩_G
                      = ⟨E_{Y|X}[g(Y)|X = ·], φ(x)⟩_F
                      = ⟨C_XX^{-1} C_XY g, φ(x)⟩_F
                      = ⟨g, C_YX C_XX^{-1} φ(x)⟩_G
where C_XX is assumed to be injective, i.e., the function f ∈ F with C_XX f = C_XY g is unique.
Here the term μ_{Y|x} is the conditional embedding of the R.V. Y given X = x, which is defined as
  μ_{Y|x} := E[ψ(Y)|X = x].
According to the inner product expression above, the conditional embedding μ_{Y|x} can be expressed as
  μ_{Y|x} = C_YX C_XX^{-1} φ(x).
It is noteworthy that the assumption E_{Y|X}[g(Y)|X = ·] ∈ F always holds for finite domains with characteristic kernels, but does not necessarily hold for continuous domains. In the case where it does not hold, we use C_YX C_XX^{-1} φ(x) as an approximation of μ_{Y|x} [61].

Using the law of total expectation in the RKHS, we have
  μ_Y = E_X[μ_{Y|x}] = C_YX C_XX^{-1} μ_X.
Now, we can give the following definition of the conditional embedding operator U_{Y|X}.

Definition 2. The conditional embedding operator U_{Y|X} is defined as
  U_{Y|X} := C_YX C_XX^{-1}.
The conditional embedding operator satisfies three properties:
1. μ_{Y|x} := E_{Y|x}[ψ(Y)|x] = U_{Y|X} k(x, ·);
2. E_{Y|x}[g(Y)|x] = ⟨g, μ_{Y|x}⟩_G;
3. μ_Y = U_{Y|X} μ_X.

The conditional embedding operator can be estimated as
  Û_{Y|X} = Ĉ_YX Ĉ_XX^{-1} = (1/m) Ψ Φ^T [(1/m) Φ Φ^T + ς I]^{-1} = Ψ Φ^T (Φ Φ^T + ς m I)^{-1}
and further, by the matrix inversion lemma [81],
  Û_{Y|X} = Ψ (K + ς m I_m)^{-1} Φ^T
where K = Φ^T Φ, I is an identity operator, I_m denotes an m×m identity matrix, and ς is a regularization term. We must emphasize the significance of the transition to this last form. First, K = Φ^T Φ is computable by the kernel trick. Second, the estimated conditional embedding operator can be applied conveniently to calculate μ_{Y|x} and μ_Y.

The conditional embedding operator links the feature map φ(x) and the conditional embedding μ_{Y|x}, which represents the conditional distribution P_{Y|x}(y) in the RKHS. Therefore, several Bayesian algorithms have been developed using this operator [61, 82], which are constructed from hidden states and training measurement data, to estimate the hidden state embeddings. We take the DSMCE algorithm [61] as an example to show how to use conditional embedding operators to develop dynamical system algorithms.

4.2.3 Dynamical System Model with a Conditional Embedding operator

Consider a dynamical system that generates hidden states and corresponding measurements {x_i, y_i}. The goal of the DSMCE algorithm is to recursively maintain the embedding μ_{x_i} given a new measurement y_i. Therefore the following conditional embedding operator is constructed,
  U_{x_{i+1}|x_i y_{i+1}} = C_{x_{i+1}(x_i y_{i+1})} C_{(x_i y_{i+1})(x_i y_{i+1})}^{-1},
which takes both the previous belief state and a measurement as inputs, and outputs the predictive embedding for P(x_{i+1}|x_i, y_{i+1}). That is,
  μ_{x_{i+1}|x_i y_{i+1}} = U_{x_{i+1}|x_i y_{i+1}} ω((x_i, y_{i+1}))
where ω(·) is the joint feature map for x_i and y_{i+1}. In [61] the joint feature map is approximated by the concatenation of the feature maps φ(x_i) and ψ(y_{i+1}); that is,
  ω((x_i, y_{i+1})) = (φ(x_i)^T, ψ(y_{i+1})^T)^T.
With this approximated joint feature map, the conditional embedding operator U_{x_{i+1}|x_i y_{i+1}} can be equivalently viewed as the concatenation of two operators, U_{x_{i+1}|x_i y_{i+1}} = (T_1, T_2), and the conditional embedding μ_{x_{i+1}|x_i y_{i+1}} is expressed as
  μ_{x_{i+1}|x_i y_{i+1}} = T_1 φ(x_i) + T_2 ψ(y_{i+1}).
Next, taking the expectation of μ_{x_{i+1}|x_i y_{i+1}} with respect to P(x_i|y_{i+1}), one can obtain
  μ_{x_{i+1}|y_{i+1}} = T_1 μ_{x_i|y_{i+1}} + T_2 ψ(y_{i+1}) ≈ T_1 μ_{x_i|y_i} + T_2 ψ(y_{i+1})
where the distribution P(x_i|y_{i+1}) is approximated by a less confident one, P(x_i|y_i). This is the solution to update the embedding μ_{x_i} given a new measurement y_i. Furthermore, given training samples, the operators can be approximated easily. Finally, x̂_i ∈ R^{n_x} is determined as the estimated hidden state by solving the following optimization problem:
  x̂_i = argmin_x ||φ(x) - μ̂_{x_i|y_i}||²_{H_x}.
The DSMCE algorithm is summarized in Algorithm 4-1.

4.3 Kernel Kalman Filter based on Conditional Embedding Operator

In this section, we construct the Kalman filter in the RKHS, using the estimated conditional embedding operator mentioned in the previous section as the state transition operator.

The concept of state is fundamental in the Kalman filter [25]. The state vector, or simply state, denoted by x_i, is defined as the minimal model that is sufficient to uniquely describe the autonomous dynamical behavior of the system; the subscript i denotes discrete time. Typically, the state x_i is unobservable. To estimate it, we use a set of
Algorithm 4-1: Dynamical System Model with Conditional Embedding (DSMCE)

Learning:
  Given training samples D = [(x'_1, y'_1), ..., (x'_{m+1}, y'_{m+1})]
  Φ = [φ(x'_1), ..., φ(x'_m)], Ψ = [φ(x'_2), ..., φ(x'_{m+1})] and Υ = [ψ(y'_2), ..., ψ(y'_{m+1})]
  Compute the matrices K = Φ^T Φ, U = Υ^T Υ and G = Φ^T Ψ
  Compute the matrices T̃_2 = (K + U + λ m I_m)^{-1} and T̃_1 = T̃_2 G
Filtering:
  for i = 0: let α_0 = (1/m) 1_m
  for i > 0:
    Update the estimated embedding: μ̂_{x_i|y_i} = Ψ α_i, with α_i = T̃_1 α_{i-1} + T̃_2 Υ^T ψ(y_i)
    Estimate the hidden state: x̂_i = argmin_x ||φ(x) - μ̂_{x_i|y_i}||²_{H_x}

measurement data, denoted by the vector y_i. The model can be expressed in the standard linear form of Section 2.1.2, where F_i is the given transition matrix taking the state x_i from time i to time i+1, and H_i is the given measurement matrix. The process noise n_i and measurement noise v_i are both assumed to be zero-mean, additive, white Gaussian noise, with covariance matrices Q_i and R_i, respectively. In Section 2.1.2 we presented the traditional Kalman filter equations in our notation.

4.3.1 Construction of Kalman Filtering in the RKHS

In this subsection, we develop an algorithm for the special case of tracking a noisy time series signal generated by the system in Figure 4-1. The system model is written as:
  x_{i+1} = f(x_i) + n_i
  y_i = x_i + v_i
where f is an unknown nonlinear function, n_i is the state processing noise reflecting the state model uncertainty, and v_i is the zero mean measurement noise. Here a simple additive noise measurement model is considered.

Because the state transition function is unknown, the traditional Kalman filter or nonlinear Kalman filters cannot be applied in this case. Likewise, because the hidden states {x_i} are not available in many signal processing cases, we cannot use the DSMCE
[61] and KBR [82] algorithms, in which hidden state training data are necessary to construct the state transition operators.

In order to avoid describing the unavailable hidden state dynamics, we use the estimated embedding of the noisy measurements {y_i} as the hidden state in RKHS, instead of the embedding of {x_i}. Therefore, we can construct the new state transition operators using only the noisy measurements, to describe the dynamics of the measurement embeddings, and develop a Kalman filter in the RKHS. For the rest of this chapter, the estimated measurement embedding is denoted by μ̂_i.

For convenience, we denote the estimated conditional embedding operator by F_i instead of Û, since it now plays the role of the state transition operator, not a matrix. Assuming that we have the time series D = {y'_1, ..., y'_{m+1}} as the training data set, a block of samples taken from the time series {y_i}, which are mapped into the RKHS H_y, the estimated conditional embedding operator F_i can be expressed as
  F_i = Ψ (K + ς m I_m)^{-1} Φ^T
where Φ = [φ(y'_1), ..., φ(y'_m)], Ψ = [φ(y'_2), ..., φ(y'_{m+1})] and K = Φ^T Φ. Here the prime refers to the training data.

We can connect the current estimated measurement embedding μ̂_i with the next step estimated measurement embedding μ̂_{i+1} by
  μ̂_{i+1} = F_i μ̂_i.
Because the operator is estimated based on limited data and on noisy dynamics, we need to add some noise η_i in the RKHS to the predicted next step embedding:
  μ̂_{i+1} = F_i μ̂_i + η_i.
This is the state model in the RKHS.
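
Since F_i only ever acts on embeddings expressed in the span of the training features, it suffices to compute its m×m weight-space representation. A minimal Python sketch (hypothetical helper names; a Gaussian kernel is assumed):

import numpy as np

def gram(A, B, sigma):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma**2))

def fit_transition(Y_train, sigma, reg):
    """Weight-space pieces of F_i = Psi (K + reg*m*I)^(-1) Phi^T, where the
    rows of Y_train are y'_1, ..., y'_(m+1); Phi stacks the first m samples
    and Psi the last m."""
    Phi, Psi = Y_train[:-1], Y_train[1:]
    m = len(Phi)
    K = gram(Phi, Phi, sigma)                  # Phi^T Phi
    L = np.linalg.inv(K + reg * m * np.eye(m))
    T = gram(Phi, Psi, sigma)                  # Phi^T Psi
    return L, T

# If mu = Psi @ w is an embedding expressed in the Psi basis, then
# F_i mu = Psi @ (L @ T @ w): the operator acts on weights as w -> L T w.
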
When we receive a new measurement y_i, we have to build the measurement model in the RKHS to link the measurement and the hidden state, the estimated embedding μ̂_i. We map the new measurement y_i into the RKHS as φ(y_i) ∈ H_y. Because μ̂_i is the estimate of E[φ(Y_i)] and φ(y_i) is an instantiation of the R.V. φ(Y_i), we have
  φ(y_i) = μ̂_i + ν_i
where ν_i is the measurement noise in the RKHS.

Therefore, we have the dynamical state-space model of the measurements in the RKHS:
  μ̂_{i+1} = F_i μ̂_i + η_i
  φ(y_i) = μ̂_i + ν_i.
It is noteworthy that μ̂_i is a R.V. in the RKHS which does not correspond to any R.V. in the input space.

Here η_i and ν_i are both noises in the RKHS, which represent the uncertainty in the estimated state model and the measurement noise (the difference between the estimated expectation μ̂_i and the measurement instantiation φ(y_i)), respectively. In the RKHS we can treat the noises η_i and ν_i as n_k×1 vectors. For the Gaussian kernel, n_k is infinite. If we assume that each component of these vectors is an independent zero-mean Gaussian noise process with covariances
  Q_i = qI
  R_i = rI
like the assumption of the Ex-KRLS algorithm [17], we can use the Kalman filter to estimate the hidden state μ̂_i in the RKHS. Although these two covariances cannot accurately describe the noise in the RKHS, the two scalars q and r reflect the noise intensity. A smaller q is applied when the estimated conditional
embedding operator is approximated more accurately, while a larger r is used when the measurement noise in the input space is larger. In the experiments we will find that it is the ratio q/r that matters, rather than q and r individually.

Comparing the traditional Kalman filter model with the model in the RKHS, we can see that F_i = Ψ(K + ς m I_m)^{-1} Φ^T, H_i = I, Q_i = qI and R_i = rI. Therefore, the recursion to estimate μ_i in the RKHS is written as follows.

Start with μ̂_0 = E[φ(y_0)] or μ̂_0 = φ(y_0), and P_0 = I, like the Kalman filter [26]:
  μ̂_i^- = F_{i-1} μ̂_{i-1}
  P_i^- = F_{i-1} P_{i-1} F_{i-1}^T + Q_{i-1}
  G_i = P_i^- (P_i^- + R_i)^{-1}
  μ̂_i = μ̂_i^- + G_i (φ(y_i) - μ̂_i^-)
  P_i = (I - G_i) P_i^-
where μ̂_i^- denotes the a priori estimate of the measurement embedding.

Because the estimated measurement embedding μ̂_i and the mapped measurement φ(y_i) both lie in a possibly infinite dimensional space H_y, the quantities P_i^-, P_i and G_i are all operators in H_y × H_y, which can be treated as n_k×n_k matrices (n_k is infinite for the Gaussian kernel). Then, this recursion cannot be computed directly using normal matrix calculations. However, by the representer theorem in RKHS [87, 88], the solution always exists in the subspace spanned by the data, which in this case is of dimension m. The issue becomes finding a way to write the operators such that ∞×∞ matrices are avoided. Inspired by [89], the recursion can still be implemented due to the following theorems, which we have proved.

Theorem 4.1. The operators P_i^-, P_i and G_i are of the following forms:
  P_i^- = Ψ P̃_i^- Ψ^T + qI
  P_i = Ψ P̃_i Ψ^T + [qr/(q+r)] I
  G_i = [r/(q+r)] Ψ G̃_i Ψ^T + [q/(q+r)] I
where P̃_i^-, P̃_i and G̃_i are all m×m matrices. Furthermore, these matrices can be computed recursively as follows:
  P̃_1^- = (K + ς m I_m)^{-1} Φ^T Φ [(K + ς m I_m)^{-1}]^T
  G̃_i = [(q+r) I_m + P̃_i^- Ψ^T Ψ]^{-1} P̃_i^-
  P̃_i = [r/(q+r)] P̃_i^- - [r/(q+r)] G̃_i Ψ^T Ψ P̃_i^- - [qr/(q+r)] G̃_i
  P̃_{i+1}^- = (K + ς m I_m)^{-1} Φ^T Ψ P̃_i Ψ^T Φ [(K + ς m I_m)^{-1}]^T
             + [qr/(q+r)] (K + ς m I_m)^{-1} Φ^T Φ [(K + ς m I_m)^{-1}]^T.

Theorem 4.2. The predicted embeddings μ̂_i^- and the estimated embeddings μ̂_i are both combinations of the mapped measurements {φ(y'_i)}_{i=2}^{m+1}:
  μ̂_i^- = Ψ a_i
  μ̂_i = Ψ b_i + [q/(q+r)] φ(y_i)
where a_i = [a_1, ..., a_m]^T and b_i = [b_1, ..., b_m]^T are both m×1 real-valued vectors. Furthermore, these vectors can also be computed recursively as follows:
  a_1 = (K + ς m I_m)^{-1} Φ^T μ̂_0
  b_i = { [r/(q+r)] I_m - [r/(q+r)] G̃_i Ψ^T Ψ } a_i + [r/(q+r)] G̃_i Ψ^T φ(y_i)
  a_i = (K + ς m I_m)^{-1} Φ^T { Ψ b_{i-1} + [q/(q+r)] φ(y_{i-1}) }.
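
The following Python sketch (hypothetical names; Gaussian kernel assumed; μ̂_0 initialized to φ(y_0)) implements the recursions of Theorems 4.1 and 4.2 directly in weight space, anticipating Algorithm 4-2:

import numpy as np

def gram(A, B, sigma):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma**2))

def kkf_ceo(Y_train, y_stream, sigma, reg, q, r):
    """Weight-space KKF-CEO recursion. Rows of Y_train are y'_1 .. y'_(m+1);
    rows of y_stream are the noisy test measurements."""
    Phi, Psi = Y_train[:-1], Y_train[1:]
    m = len(Phi)
    K = gram(Phi, Phi, sigma)                     # Phi^T Phi
    M = gram(Psi, Psi, sigma)                     # Psi^T Psi
    T = gram(Phi, Psi, sigma)                     # Phi^T Psi
    L = np.linalg.inv(K + reg * m * np.eye(m))
    a = L @ gram(Phi, y_stream[0:1], sigma).ravel()   # a_1 with mu_0 = phi(y_0)
    Pm = L @ K @ L.T                                  # P~_1^-
    c = q / (q + r)
    x_hat = []
    for y in y_stream[1:]:
        ky = gram(Psi, y[None, :], sigma).ravel()     # Psi^T phi(y_i)
        G = np.linalg.solve((q + r) * np.eye(m) + Pm @ M, Pm)   # G~_i
        b = (1 - c) * ((np.eye(m) - G @ M) @ a + G @ ky)        # b_i
        x_hat.append(Psi.T @ b + c * y)               # x_hat_i = Y' b_i + c y_i
        P = (1 - c) * (Pm - G @ M @ Pm) - q * (1 - c) * G       # P~_i
        a = L @ (T @ b) + c * (L @ gram(Phi, y[None, :], sigma).ravel())
        Pm = L @ T @ P @ T.T @ L.T + q * (1 - c) * (L @ K @ L.T)
    return np.array(x_hat)

A usage example would be x_est = kkf_ceo(Y_train, Y_noisy, sigma=1.0, reg=1e-3, q=1e-4, r=1e-2), where the chosen q and r are placeholders; consistent with the remark above, it is their ratio that matters most.
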
matrices,suchasand,whichcanbothbetreatedasnkmmatrices.Therefore,thecalculationsfrom( 4 )to( 4 )onlyinvolvenormalmatricescalculationsandinnerproductoperationsbetweenfeaturematricesandvectorslikeTandT'(yi),whichcanbecomputedbythekerneltrickreadily. Thegoalofthisalgorithmistoestimatethesignalxiintheinputspacefromthenoisymeasurements.ConsideringthedenitionoftheembeddingsintheRKHS,weproposeanapproachtoestimatethelteredsignal^xifromtheembedding^i,whichissimilartothePre-Imageproblem[ 59 ],butmuchmoreeffective.From( 4 ),wehave ^xi=E[xi]=E[xi+vi]=E[yi]=E[hfI,'(yi)i]=hfI(),E['(yi)]i=hfI(),^ii=fI(),bi+q q+r'(y)i (4) wherefI=f1,...,fnyisarowvectoroffunctionsandfj(y)=y(j)(j=1,...,ny).y(j)isthejthcomponentofthevectory.Althoughthefunctionfj()maynotbelongtotheRKHS,wecanlearnandapproximatethemfromthetrainingdataY0,whereY0=y02,...,y0m+1.ThelearnedrowvectoroffunctionsfI()canbeexpressedas ^fI()=(T))]TJ /F8 7.97 Tf 6.59 0 Td[(1Y0T.(4) Thentheaboveequationcanbeapproximatedas ^xi=bTiT(T))]TJ /F8 7.97 Tf 6.59 0 Td[(1Y0T+q q+ryi=Y0bi+q q+ryi.(4) Atthispoint,wehavetheKernelKalmanlteralgorithmbasedontheestimatedconditionalembeddingoperator,whichissummarizedinAlgorithm4-2. ThespacecomplexityisO(m2),whiletimecomplexityisO(m3)becausethecomplexityisdominatedbythemultiplicationandinversionofmmmatrices.This 81

PAGE 82

Algorithm4-2:KernelKalmanFilteringbasedonconditionalembeddingoperator(KKF-CEO) Initialization:Fori=0,set='(y01),...,'(y0m),='(y02),...,'(y0m+1),Y0=y02,...,y0m+1,K=T,M=T,T=T,L=(K+&mI))]TJ /F8 7.97 Tf 6.59 0 Td[(10='(y0),P0=I,ai+1=LT0,~P)]TJ /F5 7.97 Tf 0 -8.28 Td[(i+1=LKLT Filtering:Fori=1,...,compute~Gi=h(q+r)Im+~P)]TJ /F5 7.97 Tf 0 -8.28 Td[(iMi)]TJ /F8 7.97 Tf 6.58 0 Td[(1~P)]TJ /F5 7.97 Tf 0 -8.28 Td[(i~Pi=r q+r~P)]TJ /F5 7.97 Tf 0 -8.28 Td[(i)]TJ /F5 7.97 Tf 18.82 4.71 Td[(r q+r~GiM~P)]TJ /F5 7.97 Tf 0 -8.28 Td[(i)]TJ /F5 7.97 Tf 16.63 5.03 Td[(qr q+r~Gibi=hr q+rIm)]TJ /F5 7.97 Tf 18.82 4.71 Td[(r q+r~GiMiai+r q+r~GiT'(yi)^xi=Y0bi+q q+ryiai+1=LTbi+q q+rLT'(yi)~P)]TJ /F5 7.97 Tf 0 -8.27 Td[(i+1=LT~PiTTLT+qr q+rLKLT complexityincreaseswhenmoretrainingdataarerequiredtoconstructtheestimatedconditionalembeddingoperators.Therefore,simplifyingtechniquesareusedtoreducethesizeoftheestimatedconditionalembeddingoperators,whichwillbediscussedinSection 4.4 Although,ouralgorithmandFukuminzu'salgorithms[ 61 82 ]arealldevelopedbasedontheembeddingconcept,theyaretotallydifferentmethods.Thehiddenstateinouralgorithmistheestimatedmeasurementembedding,nottheembeddingoftheunobservablesignal,andthetransitionoperatorisestimatedonlyfrommeasurements.Therefore,ouralgorithmcanbeimplementedusingonlythemeasurementdatafyig,whileFukuminzu'salgorithmsbothrequirethecleansignaldatafxigandthecorrespondingmeasurementdatafyigastrainingdatatoconstructtherespectivestatetransitionandmeasurementoperators.Furthermore,comparingwiththeKKFalgorithms[ 56 57 ],ouralgorithmobtainsthestatetransitionoperatorbyconstructingtheestimatedconditionalembeddingoperatordirectlyfromnoisymeasurements,whiletheKKF 82

PAGE 83

4.3.2 Kalman Filter Predictor in RKHS

In the previous subsection, the KKF-CEO algorithm was presented to denoise measurements based on the available noisy data. In this subsection, we present a predictor based on this algorithm.

Using the KKF-CEO algorithm, we can estimate the current measurement embedding. Instead of applying the estimated conditional embedding operator $F_i$ repeatedly to predict the corresponding measurement embedding, we construct a new conditional embedding operator from the training data set that links the current and future measurement embeddings directly. For the $\ell$-step prediction, the predictive conditional embedding operator $F^P$ is estimated as

  $F^P = \bar\Upsilon (K + \lambda m I_m)^{-1} \Phi^T$   (4)

where $\bar\Upsilon = [\varphi(y'_{1+\ell}), \dots, \varphi(y'_{m+\ell})]$. With this conditional embedding operator, the measurement embedding at time $i+\ell$ can be expressed as

  $\omega_{i+\ell} = F^P \hat\omega_i = \bar\Upsilon (K + \lambda m I_m)^{-1} \Phi^T \left(\Upsilon b_i + \frac{q}{q+r} \varphi(y_i)\right) = \bar\Upsilon c_{i+\ell}$,   (4)

where

  $c_{i+\ell} = (K + \lambda m I_m)^{-1} \Phi^T \left(\Upsilon b_i + \frac{q}{q+r} \varphi(y_i)\right)$.   (4)

Like ( 4 ), the predicted $\hat x_{i+\ell}$ is

  $\hat x_{i+\ell} = Z' c_{i+\ell}$   (4)

where $Z' = [y'_{1+\ell}, \dots, y'_{m+\ell}]$.

The form of the predictive conditional embedding operator $F^P$ is similar to the KRLS algorithm once the system has been learned well. Recalling the KRLS algorithm [ 16 ], the predicted output is

  $o_i = \omega^T \varphi(u_i)$   (4)

where $\omega$ is the weight in the RKHS, learned from the training data, and $u_i$ is the input used to predict the output $o_i$. If the weight $\omega$ is learned from the training input set $[y'_1, \dots, y'_m]$ and the desired set $Z' = [y'_{1+\ell}, \dots, y'_{m+\ell}]$, the weight should be

  $\omega = \Phi (K + \lambda I_m)^{-1} Z'^T$   (4)

and the predicted output given input $y_i$ is

  $o_i = Z' (K + \lambda I_m)^{-1} \Phi^T \varphi(y_i)$.   (4)

The approximated predictive conditional embedding operator $F^P$ has a very similar form to the weight $\omega$, but with different regularization parameters.

However, they are different algorithms. The approximated predictive conditional embedding operator $F^P$ describes how the embedding is propagated from time $i$ to time $i+\ell$, and it operates on the current estimated embedding $\hat\omega_i$, while the weight $\omega$ specifies an underlying function whose variable is the current input $u_i$. If and only if $F^P$ and $\omega$ are both trained on the same training data, the regularization parameters are matched (the KRLS regularizer equals $\lambda m$), and the measurement noise parameter is $r = 0$, do the two algorithms produce the same approximated prediction, because then $b_i = 0$ and

  $c_{i+\ell} = (K + \lambda m I_m)^{-1} \Phi^T \varphi(y_i)$.   (4)

Setting $r = 0$ means that there is no difference between $\omega_i$ and $\varphi(y_i)$, that is, the time series is not contaminated by noise. Therefore, compared with the KRLS algorithm, the KKF-CEO algorithm is more appropriate for non-stationary and noisy environments.
4.4 Simplifying the Conditional Embedding Operator

In the previous section, the KKF-CEO algorithm was developed to estimate and predict the time series from the noisy data. Like the traditional Kalman filter, this algorithm is implemented by maintaining some covariance matrices. However, the sizes of these matrices are all $m \times m$, where $m$ is the size of the training data used to estimate the state transition operator $F_i$, not the dimensionality of the hidden state. Therefore, the time and space complexities of this algorithm are $O(m^3)$ and $O(m^2)$, respectively. The time complexity increases cubically with the number of training data, which may pose a big problem for the application of these algorithms. To reduce the time and space complexities, we have to reduce the number of training samples in the proposed algorithm in a reasonable way. Several techniques are proposed in this section, including downsampling, quantization, and online selection of the training data. These techniques can also be applied to other algorithms with similar operators, like the DSMCE [ 61 ] and KBR [ 82 ] algorithms.

For convenience, $y'_{i+\ell}$ is denoted by $z'_i$ in this section. We get $F_i$ when $\ell = 1$, and $F^P$ when $\ell > 1$.

4.4.1 Downsampling the Training Data

According to ( 4 ) and ( 4 ), the conditional embedding operator is estimated based on i.i.d. training data. Under this assumption, we can uniformly downsample the training data to reduce the size of the estimated conditional embedding operator. With the downsampling rate $L_s$, the sizes of the matrices become $k = \lfloor m / L_s \rfloor$, and the time and space complexities of filtering are $O(k^3)$ and $O(k^2)$, respectively.

4.4.2 Quantizing the Training Data

Quantization techniques can also be applied to reduce the size of the conditional embedding operators. The idea is very similar to the QKLMS [ 83 ], where it has been demonstrated that, for the Gaussian kernel, good results are obtained with many fewer prototypes than samples.

Because the learning of the conditional embedding operators is not necessarily implemented online, we use the kernel K-means algorithm [ 11 ] to quantize the training data set in the RKHS, instead of using the online vector quantization (VQ) method mentioned in [ 83 ], which is implemented in the original training data domain. Compared with the online VQ method, the kernel K-means method has several advantages. First, the size of the quantized data set is easy to control through the parameter $k$, while for the online VQ method this number cannot be determined in advance. Second, the kernel K-means algorithm obtains the codebook by updating it based on the whole data set, which yields better quantization performance, while the online VQ method generates the codebook incrementally and the old quantized centers are not updated when new data arrive. Third, the kernel K-means algorithm is implemented in the RKHS based on the Gram matrix, which is the natural space in which to train the conditional embedding operator, while the online VQ method is controlled by a distance threshold parameter (the quantization size $\varepsilon_U$) defined in the input space, which is not easy to choose.

Considering that the estimated conditional embedding operators are constructed from the mapped measurements $\{\varphi(y'_i)\}$ and $\{\varphi(z'_i)\}$, the kernel K-means algorithm should be implemented on the pairs $\{(\varphi(y'_i), \varphi(z'_i))\}$. There are two options to implement the quantization.

First, implement the kernel K-means algorithm in a direct product space. We use the concatenation of the mapped measurements $\varphi(y'_i)$ and $\varphi(z'_i)$ as a new vector $\psi((y'_i, z'_i)) = (\varphi(y'_i)^T, \varphi(z'_i)^T)^T \in \mathcal{H}_y \times \mathcal{H}_y$, where $\times$ denotes the direct product operator.
The inner product of these new vectors can be calculated as

  $\langle \psi((y'_i, z'_i)), \psi((y'_j, z'_j)) \rangle_{\mathcal{H}_y \times \mathcal{H}_y} = \langle \varphi(y'_i), \varphi(y'_j) \rangle_{\mathcal{H}_y} + \langle \varphi(z'_i), \varphi(z'_j) \rangle_{\mathcal{H}_y}$.   (4)

The Gram matrix of the new vectors $\psi((y'_i, z'_i))$ is denoted $K_\oplus$, which can be expressed as

  $K_\oplus = K_1 + K_2$   (4)

where $K_1$ and $K_2$ are the Gram matrices of $\{\varphi(y'_i)\}$ and $\{\varphi(z'_i)\}$. Therefore, the Gram matrix $K_\oplus$ is nonnegative definite and the space $\mathcal{H}_y \times \mathcal{H}_y$ is an RKHS.

Second, implement the kernel K-means algorithm in a tensor product space. The direct product RKHS $\mathcal{H}_y \times \mathcal{H}_y$ is not a universal space, even if the RKHS $\mathcal{H}_y$ is universal. We can instead consider the tensor product RKHS $\mathcal{H}_y \otimes \mathcal{H}_y$. The inner product in this RKHS is

  $\langle \varphi(y'_i) \otimes \varphi(z'_i), \varphi(y'_j) \otimes \varphi(z'_j) \rangle_{\mathcal{H}_y \otimes \mathcal{H}_y} = \langle \varphi(y'_i), \varphi(y'_j) \rangle_{\mathcal{H}_y} \langle \varphi(z'_i), \varphi(z'_j) \rangle_{\mathcal{H}_y}$,   (4)

and the Gram matrix for this RKHS can be expressed in terms of $K_1$ and $K_2$ as

  $K_\otimes = K_1 .\!* K_2$   (4)

where $.\!*$ denotes the element-by-element product operator.

Because the kernel K-means algorithm can cluster data based only on the Gram matrix, we simply use the new Gram matrix $K_\oplus$ or $K_\otimes$ instead of the original Gram matrix $K$, and then cluster the data into $k$ bins. To quantize the data set, we select one data pair $(\varphi(y^q_i), \varphi(z^q_i))$ as the code center to represent the data in each bin. For the $k$ bins, we have the codebook $\{(\varphi(y^q_i), \varphi(z^q_i))\}_{i=1}^k$.

Based on the clustering results, we can estimate the quantized conditional embedding operator $F^Q$.

Theorem 4.3. Given $m$ pairs of training examples $D_{YZ} = \{(y'_1, z'_1), \dots, (y'_m, z'_m)\}$, which are quantized to $D^Q_{YZ} = \{(y^q_1, z^q_1), \dots, (y^q_k, z^q_k)\}$, the quantized conditional embedding operator can be estimated as

  $F^Q = \Upsilon_Q \Lambda (K_Q \Lambda + \lambda m I_k)^{-1} \Phi_Q^T$   (4)

where $\Phi_Q = [\varphi(y^q_1), \dots, \varphi(y^q_k)]$ and $\Upsilon_Q = [\varphi(z^q_1), \dots, \varphi(z^q_k)]$. Here $\Lambda = \mathrm{diag}[q_1, \dots, q_k]$ is a diagonal matrix, where $q_j$ $(1 \le j \le k)$ is the number of samples clustered into the $j$-th bin by the kernel K-means algorithm, and $K_Q = \Phi_Q^T \Phi_Q$.

The proof of Theorem 4.3 is given in APPENDIX C.

Assume that the number of samples clustered into each bin is the same, say $L_q = m/k$. Then the diagonal matrix is $\Lambda = L_q I_k$, and the quantized conditional embedding operator can be rewritten as

  $F^Q = \Upsilon_Q L_q I_k (K_Q L_q I_k + \lambda m I_k)^{-1} \Phi_Q^T = \Upsilon_Q (K_Q + \lambda k I_k)^{-1} \Phi_Q^T$,   (4)

which is the same as the downsampled conditional embedding operator. This shows that the downsampled conditional embedding operator is a special case of the quantized conditional embedding operator.

According to the properties of the kernel K-means algorithm, the sizes of the matrices are equal to or less than the parameter $k$. Therefore, the time and space complexities of filtering are equal to or less than $O(k^3)$ and $O(k^2)$, respectively.
4.4.3 Training Data for Online Operation

In the previous subsections, the samples are selected from the whole data set in advance to construct the conditional embedding operators, and the operators are fixed during the filtering procedure. However, not all selected samples are equally important for constructing the operators: the information in the current estimated embedding $\hat\omega_i$ should be considered when selecting them. Since the conditional embedding operators are constructed directly from the data, we can update these operators with respect to the current estimated embedding $\hat\omega_i$ rather than fix them.

In order to select the training data online to construct the estimated conditional embedding operator $F_{O,i}$ at iteration $i$, we calculate the distance $d_n$ between the current estimated embedding $\hat\omega_i$ and all the training samples $\{y'_n\}_{n=1}^m$ in the RKHS. We then select the closest $k$ samples $\{y'_{i,j}\}_{j=1}^k$ and the corresponding measurements $\{z'_{i,j}\}_{j=1}^k$ to construct the conditional embedding operator at iteration $i$. The squared distance $d_n^2$ is calculated by substituting ( 4 ):

  $d_n^2 = \|\hat\omega_i - \varphi(y'_n)\|^2 = \|\Upsilon_{i-1} b_i + \frac{q}{q+r} \varphi(y_i) - \varphi(y'_n)\|^2$
  $= b_i^T \Upsilon_{i-1}^T \Upsilon_{i-1} b_i + \left(\frac{q}{q+r}\right)^2 \|\varphi(y_i)\|^2 + \|\varphi(y'_n)\|^2 + \frac{2q}{q+r} b_i^T \Upsilon_{i-1}^T \varphi(y_i) - \frac{2q}{q+r} \varphi(y_i)^T \varphi(y'_n) - 2 b_i^T \Upsilon_{i-1}^T \varphi(y'_n)$,  $(1 \le n \le m)$   (4)

where $\Upsilon_{i-1} = [\varphi(z'_{i-1,1}), \dots, \varphi(z'_{i-1,k})]$ at iteration $i-1$. In fact, to find the $k$ closest samples over the different $y'_n$, we only need to calculate

  $\Delta d_n^2 = \frac{q}{q+r} \varphi(y_i)^T \varphi(y'_n) + b_i^T \Upsilon_{i-1}^T \varphi(y'_n)$,  $(1 \le n \le m)$   (4)

and select the $k$ largest $\Delta d_n^2$ (the remaining terms of $d_n^2$ either do not depend on $n$ or are constant for a translation-invariant kernel). We then use the corresponding training measurement data $\{(\varphi(y'_{i,j}), \varphi(z'_{i,j}))\}_{j=1}^k$ to construct the new feature matrices $\Phi_i$, $\Upsilon_i$ and the online selected conditional embedding operator $F_{O,i}$ at iteration $i$.

For the sample selection part, the time and space complexities are $O(mk^2)$ and $O(m)$, respectively, and for the filtering part they are the same as before. Therefore, the time and space complexities of this method are $O(mk^2) + O(k^3)$ and $O(m) + O(k^2)$, respectively.
4.5 Experiments and Results

In this section, two synthetic experiments and a real-world data simulation are presented to evaluate the estimation and prediction performance of the novel KKF-CEO algorithm in comparison with other existing algorithms.

4.5.1 Estimation of a noisy time series

The experimental data come from the IKEDA chaotic dynamical system [ 68 ], which is related to laser dynamics. The time series is defined from a starting point $x_0 = [x_0(1), x_0(2)]^T$ by

  $w_i = c_1 - \dfrac{c_3}{1 + x_i(1)^2 + x_i(2)^2}$
  $x_{i+1}(1) = c_4 + c_2 (x_i(1) \cos(w_i) - x_i(2) \sin(w_i))$   (4)
  $x_{i+1}(2) = c_2 (x_i(1) \sin(w_i) + x_i(2) \cos(w_i))$

where $c_1$, $c_2$, $c_3$ and $c_4$ are all real-valued parameters. We set $c_1 = 0.4$, $c_2 = 0.84$, $c_3 = 6.0$, $c_4 = 1.0$ and $x_0 = [1, 0]^T$. The first 200 points of the IKEDA series are plotted in Figure 4-2.

Figure 4-2. Trajectory of the first 200 points of the IKEDA series
In this experiment, we add noise to the clean IKEDA data set to obtain the noisy data $\{y_i\}$, and we want to estimate the noiseless data from the noisy data. For all data sets, the data are split into $n_{train} = 201$ training data and $n_{test} = 200$ test data, including the noisy measurements and the corresponding clean signal.

The KKF-CEO algorithm presented in Section 4.3.1 is applied to this estimation problem, and its estimation performance is compared with the performance of the EKF, UKF, CKF and DSMCE algorithms.

4.5.1.1 Discussion of the noise parameters

In order to select suitable parameters in our algorithm to obtain the best estimation performance, we have to discuss how these parameters affect the estimation results. Among these parameters, $q$ and $r$ are the most important, since they reflect the state and measurement noise in the RKHS. Taking Gaussian noise as an example, we plot the estimation results of the 3 dB noisy signal for different pairs of $q$ and $r$ in Figure 4-3A. Here, the mean squared error (MSE) is calculated by

  $\mathrm{MSE} = \frac{1}{n_{test}} \sum_{i=1}^{n_{test}} \|x_i - \hat x_i\|^2$.   (4)

Figure 4-3. MSE of estimation results: (A) with respect to $q$ and $r$; (B) with respect to $\rho$ and $\theta$

The other algorithmic parameters, $\lambda$ and $\varepsilon$, are set to $10^{-3}$ and $10^{-4}$, respectively. The Gaussian kernel is used here, and the kernel size is the median value of the pairwise distances of the training data set.

From Figure 4-3A one can see that the estimation results are similar when the ratios between $q$ and $r$ are similar. Therefore, we re-plot the results with respect to $\rho = \sqrt{q^2 + r^2}$ and $\theta = \tan^{-1}(r/q)$ in Figure 4-3B.

The figure clearly shows that only the ratio between $r$ and $q$ affects the estimation results. Although this statement cannot be verified mathematically from Algorithm 4-2 because of the term $\frac{qr}{q+r}$, it is verified practically as long as the parameters $q$ and $r$ are both large enough for the following approximations to hold (for $i = 1$):

  $\tilde G_i = [(q+r) I_m + \tilde P_i^- M]^{-1} \tilde P_i^- \approx [(q+r) I_m]^{-1} \tilde P_i^-$   (4)
  $\tilde P_i = \frac{r}{q+r} \tilde P_i^- - \frac{r}{q+r} \tilde G_i M \tilde P_i^- - \frac{qr}{q+r} \tilde G_i \approx \frac{r}{q+r} \tilde P_i^-$   (4)
  $b_i = \frac{r}{q+r} \left[(I_m - \tilde G_i M) a_i + \tilde G_i \Upsilon^T \varphi(y_i)\right] \approx \frac{r}{q+r} a_i$   (4)
  $\tilde P_{i+1}^- = L T \tilde P_i T^T L^T + \frac{qr}{q+r} L K L^T \approx \frac{qr}{q+r} L K L^T$.   (4)

Because the elements in $\tilde P_1^-$ and $M$ are all very small, these approximations always hold when $q, r > 10^{-4}$. With these approximations, one can show mathematically that for $i > 1$ the vectors $a_i$ and $b_i$ depend only on the ratio $r/q$. Even if the approximations do not hold in some extreme cases (for example, $q, r < 10^{-20}$), the ratio $r/q$ still primarily controls the quality of the results. To show how the ratio affects the estimation results, we plot the MSE versus the different ratios $r/q = \{1/5, 1/4, 1/3, 1/2, 1, 2, 3, 4, 5\}$ in Figure 4-4. The ratio is a balance on how much the system trusts the estimated state transition operator versus the measurements, which should be determined based on prior knowledge.
Figure 4-4. MSE of estimation results versus $r/q$. Error bars mark one standard deviation above and below the mean.

These results are based on 20 runs. They show that the best estimation performance is obtained when $r/q$ is selected between 2 and 3.

4.5.1.2 Comparison of estimation results

For the EKF, UKF, and CKF algorithms, the system models are all assumed known, which should give these algorithms an advantage; no training data are required to learn the models. For the DSMCE algorithm, because the hidden state in the RKHS, $\mu_x$, is the embedding of the hidden state $x$ and the conditional embedding operators involve the hidden states $\{x_i\}$ and measurements $\{y_i\}$, both the training hidden states and the measurements are required to construct the operators. For our algorithm, because the hidden state in the RKHS, $\hat\mu_y$, is the estimated measurement embedding, only the training measurements are required to construct the state transition operator. In addition, the simplifying techniques discussed in Section 4.4 are not utilized in this experiment.

We study how these different algorithms respond to various noise models. In particular, we scale and add four different noises to get noisy signals with a 3 dB signal-to-noise ratio (SNR): Gaussian noise, Laplacian noise, zero-mean uniform noise and alpha-stable noise [ 69 , 70 ]. In the experiments, the parameters $q$ and $r$ are always set to 1 and 3, respectively, which makes the ratio $r/q = 3$.

For each trial, the algorithms are run 20 times to validate the mean and variance of the results. The MSE between the estimates $\hat x_i$ and the real positions $x_i$ of the system is listed in Table 4-1. All the results in these tables, including Tables 4-2 and 4-3, are in the form mean ± standard deviation.

Table 4-1. MSE of estimation of IKEDA data for different algorithms ($\alpha = 1.6$ for stable noise)

  Algorithm | Normal          | Laplacian       | Zero-mean uniform | Stable noise
  EKF       | 0.3630 ± 0.0448 | 0.3897 ± 0.0407 | 0.3843 ± 0.0308   | 0.3021 ± 0.1232
  UKF       | 0.2639 ± 0.0218 | 0.2719 ± 0.0217 | 0.2696 ± 0.0213   | 0.2465 ± 0.0969
  CKF       | 0.2374 ± 0.0176 | 0.2574 ± 0.0199 | 0.2427 ± 0.0193   | 0.2580 ± 0.0991
  DSMCE     | 0.3918 ± 0.0502 | 0.3555 ± 0.0626 | 0.3945 ± 0.0800   | 0.2319 ± 0.1335
  KKF-CEO   | 0.2253 ± 0.0168 | 0.2121 ± 0.0202 | 0.2384 ± 0.0180   | 0.1461 ± 0.0413

It is clear that the estimation results of the KKF-CEO algorithm outperform the other algorithms in all noise environments. Especially in the stable-noise environment, the advantage of our algorithm is very obvious. Even in the normal-noise environment, our algorithm still obtains the best estimation performance, because the system is nonlinear and the Gaussian-noise assumption is not satisfied in this case.

In the previous experiments, the SNR of the input signal was always 3 dB. In order to figure out how these algorithms respond in noisier environments, we use signals contaminated by different normal noises with SNRs from -15 dB to 3 dB to test them. For the KKF-CEO algorithm, the noise parameter $q = 1$, and $r$ is set to the different values $\{256, 128, 60, 40, 12.5, 6, 3\}$ to obtain good performance according to the different SNRs.
The results are plotted in Figure 4-5.

Figure 4-5. SNR of the estimation results versus the input-signal SNR. Error bars mark one standard deviation above and below the mean.

From Figure 4-5 one can see that the KKF-CEO algorithm has the best estimation performance when SNR ≥ -12 dB. Even in the low-SNR environment (SNR = -12 dB), this algorithm still obtains a good result. Although the EKF, UKF and CKF algorithms obtain acceptable performance when the input signal is less noisy (SNR = 3 dB), they do not work well in larger-noise environments. The performance of the DSMCE algorithm is always inferior to that of the KKF-CEO algorithm (except at SNR = -15 dB), especially in less noisy environments. In such a noisy environment (SNR = -15 dB), the system dynamics is seriously distorted and cannot be learned well from the noisy measurements by our algorithm, while the DSMCE, which uses the clean hidden-state training data to construct operators that preserve the dynamics of the signal, can learn more about the system dynamics.
4.5.2 Prediction of a noisy time series

To evaluate the prediction performance of the algorithm presented in Section 4.3.2, our algorithm is applied to predict the noisy Lorenz time series. The Lorenz attractor, introduced by Edward Lorenz in 1963 [ 74 ], is a dynamical system corresponding to the long-term behavior of a chaotic flow, well recognized by its butterfly shape. The system is nonlinear, three-dimensional and deterministic, and is described by three ordinary differential equations known as the Lorenz equations:

  $\dfrac{dx(1)}{dt} = -\beta x(1) + x(2) x(3)$
  $\dfrac{dx(2)}{dt} = \sigma (x(3) - x(2))$   (4)
  $\dfrac{dx(3)}{dt} = -x(1) x(2) + \rho x(2) - x(3)$.

It was proven by Tucker that for a certain set of parameters the system exhibits chaotic behavior and displays what is today called a strange attractor [ 75 ]. Here, the experimental data are generated by setting $\beta = 8/3$, $\sigma = 10$ and $\rho = 28$. A first-order (Euler) approximation with step size 0.01 is used to obtain the signal $x_i = [x_i(1), x_i(2), x_i(3)]^T$. The Lorenz data are plotted in Figure 4-6.

Figure 4-6. State trajectory of the Lorenz system for parameters $\beta = 8/3$, $\sigma = 10$ and $\rho = 28$

Like the previous experiment, we add Gaussian white noise to the clean Lorenz time-series signal $\{x_i\}$ to obtain the noisy signal $\{y_i\}$ with -15 dB SNR, and we want to predict the noiseless future data from the current noisy data and the training data. For all the data sets, the data are split into $n_{train} = 1500$ training data $\{y'_i, y'_{i+1}, y'_{i+\ell}\}$ and $n_{test} = 200$ test data $\{y_i, x_{i+\ell}\}$, where $\ell$ is the prediction step size.
The KRLS algorithm achieves a very good performance on this prediction problem [ 76 ], outperforming the LMS, RLS, EX-RLS and KLMS algorithms, and the Ex-KRLS algorithm, which is just a random-walk KRLS algorithm, behaves very similarly to the KRLS algorithm. Therefore, we compare the prediction performance of our novel algorithm only with KRLS. For the KRLS algorithm, we just need to learn the underlying function, which is used to predict the future signal from the current noisy measurement $y_i$. For our algorithm, however, we construct the conditional embedding operator as the state transition operator in the RKHS using the training data. If we used the whole training data set directly, the size of the matrices that we would have to maintain in the algorithm would be $m \times m$ with $m = 1500$; the time and space complexities would be so high that the algorithm is not practical. Therefore, the simplifying techniques of Section 4.4 are applied to estimate the current embedding $\omega_i$, while the whole training data set is still used to construct the prediction operator $F^P$. In other words, the KRLS algorithm uses the current measurement $y_i$ to predict $x_{i+\ell}$, while we use the current estimated embedding $\hat\omega_i$ to predict $x_{i+\ell}$. For the KKF-CEOQ algorithm, $K_\oplus$ in ( 4 ) is used to implement the kernel K-means algorithm.

For the KRLS algorithm, the regularization factor is set at $\lambda = 10^{-5}$, $\varepsilon = 10^{-3}$, and the forgetting factor is 1. For our algorithm, $\lambda = 10^{-5}$ and $\varepsilon = 10^{-3}$, and the noise parameters are $q = 0.2$ and $r = 0.5$, which makes the ratio $r/q = 2.5$. The Gaussian kernel is applied in all algorithms, and the kernel parameters are all set to $10^{-3}$.

To figure out how the size $k$ of the maintained matrices affects the prediction performance, we use different $k$ values to predict $x_{i+\ell}$ with $\ell = 10$. The results are plotted in Figure 4-7 for $k = [30, 40, \dots, 100]$. Here the MSE is defined as the error between the predicted $\hat x_{i+\ell}$ and the corresponding clean signal $x_{i+\ell}$.

Figure 4-7. 10-step prediction MSE with different $k$

Here KKF-CEOD, KKF-CEOQ and KKF-CEOO denote the algorithms with downsampled training data, quantized training data and online selected training data, respectively.

The results show two things. First, our algorithms with the different simplifying techniques all outperform the KRLS algorithm. With the same network size $k$, the KKF-CEOO algorithm has the best prediction performance; even for the small network size $k = 30$, the KKF-CEOO algorithm works very well. Second, the differences among the three algorithms decrease as $k$ increases. This makes sense because, when $k = m$, the three algorithms should have the same behavior.
The prediction performances for different prediction steps are also presented. Setting $k = 50$ and $k = 100$, the prediction performances of these algorithms are plotted in Figure 4-8 and Figure 4-9 and tabulated in Table 4-2 and Table 4-3.

Figure 4-8. Prediction MSE with $k = 50$

Table 4-2. Prediction MSE with $k = 50$

  Algorithm | 1-step prediction | 10-step prediction
  KRLS      | 5.4081 ± 0.75559  | 7.6784 ± 1.3396
  KKF-CEOD  | 4.9681 ± 0.94691  | 6.3667 ± 1.2723
  KKF-CEOQ  | 4.2606 ± 0.67791  | 5.8732 ± 1.1518
  KKF-CEOO  | 3.8708 ± 0.47701  | 5.6403 ± 0.93988

Table 4-3. Prediction MSE with $k = 100$

  Algorithm | 1-step prediction | 10-step prediction
  KRLS      | 5.4081 ± 0.75559  | 7.6784 ± 1.3396
  KKF-CEOD  | 4.2162 ± 0.62487  | 5.5421 ± 0.91026
  KKF-CEOQ  | 3.9040 ± 0.54787  | 5.4294 ± 0.95342
  KKF-CEOO  | 3.5771 ± 0.45949  | 5.253 ± 0.88234

The prediction results of the KKF-CEOO algorithm outperform all the other algorithms for all prediction steps when the network sizes $k$ are the same.

Figure 4-9. Prediction MSE with $k = 100$

4.5.3 Prediction of the Sunspot data set

In this subsection, we apply our algorithm to predict real-world data, the sunspot data, and the prediction performance is compared with the KRLS-ALD [ 16 ] and QKRLS [ 84 ] algorithms.

The Wolf annual sunspot count time series has long been a subject of interest in the physics and statistics communities and has also served as benchmark data in time-series prediction problems [ 85 ]. The normalized annual sunspot data over the period 1700-2011 are plotted in Figure 4-10.

The sunspot data are split into training data over the period 1700-1979 and test data over the period 1980-2011. The training data set is used to train the kernel systems to implement 1-step prediction, which is tested on the testing data set. Because the sunspot data $\{x_i\}$ are a one-dimensional time series, we use the embedding vector $x_i = [x_{i-(n_E-1)\tau}, \dots, x_{i-\tau}, x_i]$ to predict $x_{i+1}$, according to Takens' embedding theorem [ 64 ]. In this experiment, these parameters are set as $n_E = 9$ and $\tau = 1$, respectively.
Figure 4-10. Normalized annual sunspot data over the period 1700-2011

Like the previous experiments, the Gaussian kernel is applied in all three kernel algorithms with the same kernel parameter, set to 1. For the KRLS-ALD and QKRLS algorithms, the same forgetting factor 1 and regularization term $\lambda = 10^{-3}$ are applied; the ALD threshold $\delta_{ALD}$ and the quantization size $\varepsilon_U$ are set to different values to obtain different network sizes. For the KKF-CEO algorithm, we choose the same samples as in the KRLS-ALD sparsification dictionary to construct the state transition operator $F$ and the 1-step prediction operator $F^P$. The regularization term is set the same as for the other algorithms, $\lambda = 10^{-3}$. The noise parameters are set as $q = 1$ and $r = 2$ for all network sizes, which makes the ratio $r/q = 2$.

The prediction normalized mean squared error (NMSE) for the different network sizes is plotted in Figure 4-11.

Figure 4-11. Normalized MSE of annual sunspot data prediction over the period 1980-2011

One can see that the advantage of the KKF-CEO algorithm over the other two algorithms is obvious when a larger network size is used. When the network size is small, the transition operator constructed from few samples cannot describe the system dynamics sufficiently; when the training data are sufficient to describe the dynamics, the KKF-CEO algorithm has more of an advantage. That is because real-world data like the sunspot data are typically contaminated with outliers and require treatment by distributions that have longer and/or thicker tails than the normal (Gaussian) distribution. The KKF-CEO algorithm, therefore, is more suitable in such cases.

4.6 Discussion and Conclusion

In this chapter, a Kalman filter for denoising and prediction is implemented in the RKHS to deal with nonlinear estimation problems. The embeddings of the measurements are used as the hidden states in the RKHS; they are propagated by the conditional embedding operator and updated by the newly transformed measurements. This propagation of the embeddings, that is, propagation of the pdf, is what lets the filtering algorithm survive in non-Gaussian noise environments.
Unlike the classical nonlinear Kalman filters, such as the EKF, UKF, and CKF, which require knowledge of the system model, the transition operator here is constructed from the observations directly. Therefore, no hidden-state information or system-model prior knowledge is required. Furthermore, because the Gaussian noise is modeled in the RKHS, the model can cover a very wide range of input-space noises, not only Gaussian noise. In addition, the propagation of the pdf is described using the data directly, not from assumed system models, which makes the algorithm more suitable for complicated nonlinear system functions or low-SNR cases.

The novel algorithm is applied to estimate the noisy IKEDA time series. Its estimation performance under different noise environments is compared with some existing algorithms and outperforms them, especially for non-Gaussian noise. There are three reasons for this advantage. First, the distributions of the measurements are propagated very well via the transition operator, which is constructed from the noisy measurements, no matter what kind of noise is present. Second, the distributions of the measurements are used to calculate the estimates, not just the first- and second-order statistics as in the EKF, UKF and CKF. Third, noises in the RKHS are brought into the algorithm to reflect the uncertainty of the transition operator and the measurement noise in the RKHS.

The algorithm is also utilized to predict the noisy Lorenz time series, where its prediction performance outperforms the KRLS algorithm. For the Lorenz time-series data set, more training data are required to learn the system than for the IKEDA data set, which incurs high complexities. In order to bound the complexities, three techniques, namely downsampling (D), quantization (Q) and online operation (O), are proposed. All of them offer better prediction performance even with a smaller transition operator. The advantage primarily results from the fact that the embeddings of the current measurements, not the measurements themselves, are used to predict the future output. However, compared with the KRLS algorithm, our algorithm incurs higher complexities to maintain the embeddings and calculate the prediction. The time complexities are summarized in Table 4-4.

Table 4-4. Time complexities

  Algorithm | Estimation of embeddings | Prediction
  KRLS      | N/A                      | O(m²)
  KKF-CEOD  | O(k³)                    | O(m²) + O(mk)
  KKF-CEOQ  | O(k³)                    | O(m²) + O(mk)
  KKF-CEOO  | O(k³) + O(mk²)           | O(m²) + O(mk)

From this table, one can see that all the KKF-CEO algorithms have prediction time complexities similar to the KRLS algorithm, because $m > k$. In addition, although the KKF-CEOO algorithm outperforms the other algorithms for the same $k$, it has the highest computational complexity.

The advantage of this algorithm is also shown in the real-world data application, short-term prediction of the sunspot data set. The prediction performance of the KKF-CEO algorithm outperforms the KRLS-ALD and QKRLS algorithms when enough training data are applied.

The Kalman filter is constructed in the RKHS to solve nonlinear estimation and prediction problems. However, because the measurement embeddings are treated as the hidden states in the RKHS, not the hidden states of the physical model, the algorithm is not able to precisely model the dynamics. Although the uncertainty in the model can compensate for this, it will always have limitations in dealing with complicated systems. Therefore, this methodology needs to be extended in the future to develop a Kalman filter in RKHS with a general measurement model that implements hidden-state estimation, which will require the hidden states as training data to approximate the state transition and measurement operators.

Another difficulty that should be further researched is the constrained control of the noise model in RKHS. In fact, so far only the power of the noise is under control, through $q$ and $r$, and this affects all the dimensions of the RKHS. In addition, the ratio $r/q$ has to be determined based on prior knowledge of the noise in the time series to
obtain good results. In many practical cases, one would like to have control over the noise in each dimension of the RKHS, and this can only be done with a tensor-product kernel in the current approach.
CHAPTER 5
KERNEL KALMAN FILTER BASED ON CONDITIONAL EMBEDDINGS

5.1 Overview

In Chapter 4 we developed a novel Kalman filter algorithm in the RKHS based on the conditional embedding operator to deal with nonlinear dynamical systems. However, as discussed in Chapter 4, the measurement embeddings are treated as the hidden states in the RKHS, not the hidden states of the physical model, so the algorithm is not able to precisely model the dynamics. Moreover, the measurement function in that algorithm is assumed to be an identity function, which is too simple for some applications. Therefore, in this chapter we propose another algorithm to deal with nonlinear dynamical systems with a complex measurement model, which is allowed to be any nonlinear function. The state-space model is assumed to be known as

  $x_{i+1} = f(x_i) + n_i$   (5)
  $y_i = h(x_i) + v_i$   (5)

where $f(\cdot)$ and $h(\cdot)$ are known state transition and measurement functions, respectively, and $n_i$ and $v_i$ are state noise and measurement noise, respectively.

This algorithm is also developed based on the conditional embedding operators, but the hidden state in the RKHS will be the state embedding, instead of the measurement embedding. We need the hidden states as training data to construct the state transition and measurement operators, which are both conditional embedding operators. However, in many cases these hidden states are not available as training data, but only through an assumed system model. Therefore, the new algorithm is developed under the assumption that the system model is given: we can use the system model to generate hidden states and corresponding measurements as training data to construct the operators. With these operators, we can estimate the state embeddings directly from system measurements, as done in Chapter 4.

Since the RKHS algorithm is developed under the assumption that the system model (generally the state-space model) is known, which coincides with the assumption of the Kalman filter and the nonlinear Kalman filters, we can compare these algorithms more fairly. Therefore, we name this algorithm the full-blown kernel Kalman filter (FKKF).

The rest of the chapter is organized as follows. In Section 5.2 the state-space model in the RKHS is constructed with the conditional embedding operators. In Section 5.3 the state transition and measurement operators are estimated and the FKKF algorithm is proposed. Then, an experiment is presented in Section 5.4. Following the experiment, the results are discussed in Section 5.5. Finally, the conclusion is given in Section 5.6.

5.2 State-Space Model in RKHS

In this section, we assume that a training data set including hidden states and corresponding measurements is available to construct the state-space model in the RKHS.

Given the training set $D = \{(x'_1, y'_1), \dots, (x'_{m+1}, y'_{m+1})\}$, generated by the system modeled by the equations

  $x_{i+1} = f(x_i)$   (5)
  $y_i = h(x_i)$,   (5)

we map these training data into two RKHSs, $\mathcal{H}_x$ and $\mathcal{H}_y$, as $\varphi(x_i)$ and $\psi(y_i)$. Then we can estimate the conditional embedding operators $F_i \in \mathcal{H}_x \otimes \mathcal{H}_x$ and $H_i \in \mathcal{H}_y \otimes \mathcal{H}_x$ for the state model and the measurement model as below:

  $F_i = \Upsilon (K + \lambda_f m I)^{-1} \Phi^T$   (5)
  $H_i = \Psi (T + \lambda_h m I)^{-1} \Upsilon^T$   (5)

where $\Phi = [\varphi(x'_1), \dots, \varphi(x'_m)]$, $\Upsilon = [\varphi(x'_2), \dots, \varphi(x'_{m+1})]$, $\Psi = [\psi(y'_2), \dots, \psi(y'_{m+1})]$, $K = \Phi^T \Phi$, and $T = \Upsilon^T \Upsilon$.
We denote the state and measurement embeddings by $\mu_x$ and $\mu_y$, respectively, and their estimates by $\hat\mu_x$ and $\hat\mu_y$. Like the conditional embedding operator defined in Chapter 4, these two operators link the estimated embeddings as

  $\hat\mu_{x,i+1} = F_i \hat\mu_{x,i}$   (5)
  $\hat\mu_{y,i} = H_i \hat\mu_{x,i}$.   (5)

Since the samples are generated by ( 5 ) and ( 5 ), the operators $F_i$ and $H_i$ reflect how the state transition and measurement functions $f(\cdot)$ and $h(\cdot)$ propagate the embeddings $\hat\mu_{x,i}$ and $\hat\mu_{y,i}$, respectively.

Considering the state and measurement noise, we add noise terms to ( 5 ) and ( 5 ) and obtain

  $\hat\mu_{x,i+1} = F_i \hat\mu_{x,i} + \eta_i$   (5)
  $\hat\mu_{y,i} = H_i \hat\mu_{x,i} + \eta'_i$   (5)

where $\eta_i \in \mathcal{H}_x$ and $\eta'_i \in \mathcal{H}_y$ are both noises in the RKHS.

Comparing with ( 5 ), one can see that $\hat\mu_{y,i}$ is supposed to play the role of the measurement in the RKHS state-space model. However, $\hat\mu_{y,i}$ is the estimated measurement embedding at iteration $i$, which is unobservable; only the mapped measurement $\psi(y_i)$ can be observed, which is an instantiation of the measurement embedding $\mu_{y,i}$. We link the estimate $\hat\mu_{y,i}$ and the instantiation $\psi(y_i)$ by

  $\psi(y_i) = \hat\mu_{y,i} + \eta''_i$   (5)

where $\eta''_i$ is also a noise in $\mathcal{H}_y$. Then we have the RKHS state-space model

  $\hat\mu_{x,i+1} = F_i \hat\mu_{x,i} + \eta_i$   (5)
  $\psi(y_i) = H_i \hat\mu_{x,i} + \eta'_i + \eta''_i$.   (5)

For convenience, we rewrite the above equations as

  $\hat\omega_{i+1} = F_i \hat\omega_i + \eta_i$
  $\psi(y_i) = H_i \hat\omega_i + \nu_i$   (5)

where $\hat\omega_i = \hat\mu_{x,i}$ and $\nu_i = \eta'_i + \eta''_i$. These noises in the RKHS are also treated as high-dimensional (or infinite-dimensional) vectors and can be assumed to be zero-mean Gaussian with covariances

  $Q_i = qI$   (5)
  $R_i = rI$.   (5)

( 5 ) is the RKHS state-space model that is used to develop the FKKF algorithm in this chapter. For the rest of the chapter, we denote the estimated state embedding by $\hat\omega_i$, which is the hidden state of the RKHS state-space model. In the next section, we derive the FKKF algorithm to estimate the hidden state $\hat\omega_i$ and then obtain the hidden-state estimate $\hat x_i$ from $\hat\omega_i$.

5.3 Derivation of the Full-blown Kernel Kalman Filter

Like the derivation of the KKF-CEO algorithm, we develop the FKKF algorithm based on the RKHS state-space model ( 5 ) in this section. First, we derive the recursive equations following the traditional Kalman filter based on the given training data. Then, we propose the FKKF algorithm with training data generated from the state transition and measurement functions at each iteration.
5.3.1 Recursive Equations with Given Training Data

Comparing the traditional model in ( 2 ) with the RKHS state-space model in ( 5 ), we can see that $F_i = \Upsilon (K + \lambda_f m I)^{-1} \Phi^T$, $H_i = \Psi (T + \lambda_h m I)^{-1} \Upsilon^T$, $Q_i = qI$ and $R_i = rI$. Therefore, the recursion to estimate $\omega_i$ in the RKHS is written as follows. Start with $\hat\omega_0 = E[\varphi(x_0)]$ or $\hat\omega_0 = \varphi(x_0)$ and $P_0 = \varepsilon I$, like the Kalman filter [ 26 ], and iterate

  $\hat\omega_i^- = F_{i-1} \hat\omega_{i-1}$   (5)
  $P_i^- = F_{i-1} P_{i-1} F_{i-1}^T + Q_{i-1}$   (5)
  $G_i = P_i^- H_i^T (H_i P_i^- H_i^T + R_i)^{-1}$   (5)
  $\hat\omega_i = \hat\omega_i^- + G_i (\psi(y_i) - H_i \hat\omega_i^-)$   (5)
  $P_i = (I - G_i H_i) P_i^-$   (5)

where $\hat\omega_i^-$ denotes the a priori estimate of the state embedding. Here the operators satisfy $P_i^- \in \mathcal{H}_x \otimes \mathcal{H}_x$, $P_i \in \mathcal{H}_x \otimes \mathcal{H}_x$, and $G_i \in \mathcal{H}_x \otimes \mathcal{H}_y$. As discussed in Chapter 4, this recursion cannot be computed directly using normal matrix calculations. However, like the KKF-CEO algorithm, the recursion can still be implemented thanks to the following theorems, which we have proved.

Theorem 5.1. The operators $P_i^-$, $P_i$, $G_i$ are of the following forms:

  $P_i^- = \Upsilon \tilde P_i^- \Upsilon^T + qI$   (5)
  $P_i = \Upsilon \tilde P_i \Upsilon^T + qI$   (5)
  $G_i = \Upsilon \tilde G_i \Psi^T$   (5)

where $\tilde P_i^-$, $\tilde P_i$ and $\tilde G_i$ are all $m \times m$ matrices. Furthermore, these matrices can be computed recursively as follows:

  $\tilde P_1^- = \varepsilon \tilde F \Phi^T \Phi \tilde F^T$   (5)
  $\tilde G_i = \left[r I_m + (\tilde P_i^- T \tilde H^T + q \tilde H^T) S \tilde H T\right]^{-1} (\tilde P_i^- T \tilde H^T + q \tilde H^T)$   (5)
  $\tilde P_i = \tilde P_i^- - q \tilde G_i S \tilde H - \tilde G_i S \tilde H T \tilde P_i^-$   (5)
  $\tilde P_i^- = \tilde F (\Phi^T \Upsilon \tilde P_{i-1} \Upsilon^T \Phi + q \Phi^T \Phi) \tilde F^T$  (for $i > 1$)   (5)

where $\tilde F = (K + \lambda_f m I)^{-1}$, $\tilde H = (T + \lambda_h m I)^{-1}$, $S = \Psi^T \Psi$, $F = \Upsilon \tilde F \Phi^T$ and $H = \Psi \tilde H \Upsilon^T$.

Theorem 5.2. The predicted embeddings $\hat\omega_i^-$ and the estimated embeddings $\hat\omega_i$ are both combinations of the mapped states $\{\varphi(x'_i)\}_{i=2}^{m+1}$:

  $\hat\omega_i^- = \Upsilon a_i$   (5)
  $\hat\omega_i = \Upsilon b_i$   (5)

where $a_i = [a_1, \dots, a_m]^T$ and $b_i = [b_1, \dots, b_m]^T$ are both $m \times 1$ real-valued vectors. Furthermore, these vectors can also be computed recursively as follows:

  $a_1 = \tilde F \Phi^T \hat\omega_0$   (5)
  $b_i = a_i + \tilde G_i \Psi^T \psi(y_i) - \tilde G_i S \tilde H T a_i$   (5)
  $a_i = \tilde F \Phi^T \Upsilon b_{i-1}$.   (5)

Proofs of Theorem 5.1 and Theorem 5.2 are presented in APPENDIX D. By these theorems, the operators are all expressed in terms of $m \times m$ matrices and feature matrices, such as $\Phi$, $\Upsilon$ and $\Psi$, which can be treated as $n_k \times m$ matrices. Therefore, the calculations from ( 5 ) to ( 5 ) only involve normal matrix calculations and inner-product operations between feature matrices and vectors, which can be computed readily by the kernel trick.
Once we have estimated the state embedding $\omega_i$, like ( 4 ), we can compute the hidden-state estimate $\hat x_i$ by

  $\hat x_i = E[x_i] = E[\langle f_{I_x}, \varphi(x_i) \rangle] = \langle f_{I_x}(\cdot), E[\varphi(x_i)] \rangle = \langle f_{I_x}(\cdot), \hat\omega_i \rangle = \langle f_{I_x}(\cdot), \Upsilon b_i \rangle$   (5)

where $f_{I_x} = [f_{x,1}, \dots, f_{x,n_x}]$ is a row vector of functions with $f_{x,j}(x) = x(j)$ $(j = 1, \dots, n_x)$, and $x(j)$ is the $j$-th component of the vector $x$. Although the functions $f_{x,j}(\cdot)$ may not belong to the RKHS, we can learn and approximate them from the training data $X'$, where $X' = [x'_2, \dots, x'_{m+1}]$. The learned row vector of functions $\hat f_{I_x}(\cdot)$ can be expressed as

  $\hat f_{I_x}(\cdot) = X' (\Upsilon^T \Upsilon)^{-1} \Upsilon^T$.   (5)

Then the above equation can be approximated as

  $\hat x_i = X' (\Upsilon^T \Upsilon)^{-1} \Upsilon^T \Upsilon b_i = X' b_i$.   (5)

At this point, we can estimate the hidden state $x_i$ using the given training data set $D$, generated by the state transition and measurement functions.

5.3.2 FKKF Algorithm with Generated Training Data

To implement the FKKF algorithm, we need the conditional embedding operators $F_i$ and $H_i$ to construct the state and measurement models in the RKHS. Furthermore, these operators have to be learned from training data, which are generated from the known state-space model functions. The simplest way to generate samples from the state transition and measurement functions is to find the fastest rate of change of the functions and sample densely enough over the whole range. However, this would be wasteful in all the other parts of the domain. Therefore, we propose here to use the local information about the current estimated state embedding to generate new samples and construct the state transition and measurement operators at each iteration. This idea of sample generation is similar to the online selected sampling technique we proposed in Section 4.4.3; the difference is that we generate samples around the previous estimated hidden state $\hat x_{i-1}$, rather than select samples based on the estimated embedding.

At iteration $i$, we generate $m$ i.i.d. samples $X'_{Pre,i} = \{x'_{pre,i,j}\}_{j=1}^m$ around the estimated hidden state $\hat x_{i-1}$; then we feed $X'_{Pre,i}$ to the state transition function $f(\cdot)$ to obtain $X'_i = f(X'_{Pre,i}) = \{x'_{i,j}\}_{j=1}^m$; finally, we feed $X'_i$ to the measurement function $h(\cdot)$ to output $Y'_i = h(X'_i) = \{y'_{i,j}\}_{j=1}^m$. Using these generated training data, we can construct the state transition and measurement operators $F_i$ and $H_i$ at iteration $i$ by

  $F_i = \Upsilon_i \tilde F_i \Phi_i^T$   (5)
  $H_i = \Psi_i \tilde H_i \Upsilon_i^T$   (5)

where $\Phi_i = [\varphi(x'_{Pre,i,1}), \dots, \varphi(x'_{Pre,i,m})]$, $\Upsilon_i = [\varphi(x'_{i,1}), \dots, \varphi(x'_{i,m})]$ and $\Psi_i = [\psi(y'_{i,1}), \dots, \psi(y'_{i,m})]$. The $m \times m$ matrices $\tilde F_i$ and $\tilde H_i$ are expressed as

  $\tilde F_i = (K_i + \lambda_f m I)^{-1}$,   (5)
  $\tilde H_i = (T_i + \lambda_h m I)^{-1}$   (5)

where $K_i = \Phi_i^T \Phi_i$ and $T_i = \Upsilon_i^T \Upsilon_i$.

To generate the samples $X'_{Pre,i}$, we assume a Gaussian distribution with mean $\hat x_{i-1}$ and covariance $\Sigma_i$. It is noteworthy that the choice of the covariance $\Sigma_i$ is affected by the state transition function $f(\cdot)$: if the distribution-propagation behavior of this function changes slowly, we can choose a larger $\Sigma_i$; if not, a smaller $\Sigma_i$ is required.
Algorithm 5-1: Full-blown Kernel Kalman Filtering (FKKF)

Initialization: For $i = 0$, set
  regularization parameters $\lambda_f$ and $\lambda_h$; number of generated training samples $m$;
  initial hidden state $x_0$ and $\omega_0 = \varphi(x_0)$; set $\Upsilon_0 = \varphi(x_0)$ and $b_0 = 1$, so that $\omega_0 = \Upsilon_0 b_0$;
  initial covariance $P_0 = \varepsilon I$.
Filtering: For $i > 0$:
  Generate the training data for iteration $i$:
    draw $m$ i.i.d. samples $X'_{Pre,i} = \{x'_{pre,i,j}\}_{j=1}^m \sim N(\hat x_{i-1}, \Sigma_i)$;
    feed $X'_{Pre,i}$ to ( 5 ) and ( 5 ) to generate the samples $X'_i = \{x'_{i,j}\}_{j=1}^m$ and $Y'_i = \{y'_{i,j}\}_{j=1}^m$;
    form the feature matrices $\Phi_i$, $\Upsilon_i$ and $\Psi_i$.
  Compute:
    $K_i = \Phi_i^T \Phi_i$, $T_i = \Upsilon_i^T \Upsilon_i$ and $S_i = \Psi_i^T \Psi_i$
    $\tilde F_i = (K_i + \lambda_f m I)^{-1}$ and $\tilde H_i = (T_i + \lambda_h m I)^{-1}$
    $a_i = \tilde F_i \Phi_i^T \Upsilon_{i-1} b_{i-1}$
    if $i = 1$: $\tilde P_i^- = \varepsilon \tilde F_i K_i \tilde F_i^T$; else: $\tilde P_i^- = \tilde F_i (\Phi_i^T \Upsilon_{i-1} \tilde P_{i-1} \Upsilon_{i-1}^T \Phi_i + q K_i) \tilde F_i^T$
    $\tilde G_i = \left[r I_m + (\tilde P_i^- T_i \tilde H_i^T + q \tilde H_i^T) S_i \tilde H_i T_i\right]^{-1} (\tilde P_i^- T_i \tilde H_i^T + q \tilde H_i^T)$
    $b_i = a_i + \tilde G_i \Psi_i^T \psi(y_i) - \tilde G_i S_i \tilde H_i T_i a_i$
    $\tilde P_i = \tilde P_i^- - q \tilde G_i S_i \tilde H_i - \tilde G_i S_i \tilde H_i T_i \tilde P_i^-$
  Estimate the hidden state: $\hat x_i = X'_i b_i$.

Because the operators $F_i$ and $H_i$ used at iteration $i$ are built from the data generated at that iteration, $\hat\omega_i^-$ and $\hat\omega_i$ are both expressed in terms of $\Upsilon_i$. Therefore, ( 5 ) and ( 5 ) should be rewritten as

  $\hat\omega_i^- = \Upsilon_i a_i$   (5)
  $\hat\omega_i = \Upsilon_i b_i$   (5)

Likewise, with the operators $F_i$ and $H_i$ constructed from the generated training data at each iteration, the same modifications are needed in the other recursive equations of Theorem 5.1 and Theorem 5.2.

Finally, we have the FKKF algorithm with training data generated at each iteration, summarized in Algorithm 5-1. The computational complexity of the FKKF algorithm is $O(m^3)$, because of the matrix calculations, the same as the estimation part of the KKF-CEO.
5.4 Experiment and Result

The FKKF algorithm proposed in Algorithm 5-1 is our first attempt at developing the full-blown Kalman filter in RKHS. The algorithm is preliminary, not complete. In this section, we apply it to a simple synthetic system problem to show that it works; to obtain a complete algorithm, more study and modification are needed.

The performance of the FKKF algorithm is evaluated on a synthetic scalar estimation problem and compared with some existing algorithms, including the EKF, UKF, CKF and PF.

We consider a nonlinear dynamical system described by the following state-space model:

  $x_{i+1} = f(x_i, i) = \dfrac{x_i}{2} + \dfrac{25 x_i}{1 + x_i^2} + 8 \cos(1.2 i) + n_i$   (5)
  $y_i = h(x_i) = \dfrac{x_i^2}{20} + v_i$   (5)

where $n_i$ and $v_i$ are zero-mean Gaussian random variables with covariances $Q_i$ and $R_i$, which reflect the system uncertainty and measurement noise, respectively. In this experiment, we use $Q_i = 0.1$ and $R_i = 0.01$. The same state-space model has been analyzed before in many publications [ 100 , 103 ].

Given only the noisy measurements $\{y_i\}$ $(i = 1, \dots, 100)$, the different filters are applied to estimate the underlying clean state sequence $\{x_i\}$. For reference, the true states and the corresponding measurements for 100 iterations in an exemplary run are shown in Figure 5-1.

For the nonlinear Kalman filters, the state-space model and the covariances $Q_i$ and $R_i$ are used to estimate the hidden states. The UKF parameters are set as $\alpha = 1$, $\beta = 0$ and $\kappa = 2$. For the PF, we use the generic PF algorithm [ 18 ] (the importance density is assumed to be the transitional prior, and systematic resampling [ 24 ] is used); 200 particles and systematic resampling are applied to implement the estimation.
Figure 5-1. True values of the states and corresponding noisy measurements in an exemplary run: (A) true values of the states; (B) noisy measurements

For the proposed FKKF, Gaussian kernels are applied to map $x_i$ and $y_i$ to the RKHS with the same kernel parameter, set to 1. The other parameters are set as: regularization terms $\lambda_f = 10^{-4}$ and $\lambda_h = 10^{-4}$, initial covariance parameter $\varepsilon = 10^{-2}$ and sample-generation covariance $\Sigma_i = 5$. The most important noise parameters are $q = 0.1$ and $r = 0.01$. These parameters are selected to obtain the best estimation performance for all algorithms.

For each trial, the algorithms are run 100 times to validate the mean and variance of the results. The NMSE between the estimates $\hat x_i$ and the true states $x_i$ of the system is listed in Table 5-1. All the results in the table are in the form mean ± standard deviation.

Table 5-1. NMSE of state estimation

  Algorithm      | NMSE
  EKF            | 0.290480 ± 0.293210
  UKF            | 0.177950 ± 0.259540
  CKF            | 0.832920 ± 0.900670
  PF (m = 200)   | 0.098076 ± 0.217250
  FKKF (m = 30)  | 0.087914 ± 0.045131
  FKKF (m = 200) | 0.050336 ± 0.032467

From Table 5-1 one can conclude that the FKKF algorithm obtains the best estimation performance, with the smallest mean and standard deviation of the NMSE. This means that the FKKF algorithm is also the most robust in this application. Even using only $m = 30$ training samples, the FKKF outperforms the PF with 200 particles.

To analyze how the number of samples $m$ affects the estimation performance, we use different numbers of samples to construct the state transition and measurement operators. The estimation results are plotted in Figure 5-2 for $m = [5, 10, 20, 30, 50, 100, 200]$. It appears that $m = 50$ is a good choice for obtaining better performance with less computational complexity.
Figure 5-2. NMSE of estimation results versus the number of samples m. Error bars mark one standard deviation above and below the mean.

5.5 Discussion

In the previous section, the FKKF and some other existing algorithms were tested on estimating the hidden states generated by a scalar state-space model from the corresponding noisy measurements. The FKKF outperforms the other algorithms with a small number of training samples and shows its robustness. For the other algorithms, in some trials the filters cannot track and estimate the signal, which results in large estimation errors. In this section, we discuss these algorithms and their performance in more detail.

For the nonlinear KF algorithms, the Gaussian approximation is not a sufficient description of the nonlinear and non-Gaussian nature of the example. Once the nonlinear KFs cannot adequately approximate the bimodal nature of the underlying posterior, the Gaussian approximation fails. In particular, the CKF performs poorly; in fact, it does not work in this experiment. This is the first time in this dissertation that the CKF fails to outperform the EKF and UKF. There are several reasons for this result. First, the CKF uses just $2 n_x$ cubature points to describe the underlying distributions, while the UKF uses $2 n_x + 1$ sigma points with weights controlled by the free parameters $\alpha$ and $\kappa$. When the hidden-state dimensionality is large, this difference is very small, but for a scalar problem the difference is huge. Furthermore, the pairs of cubature points defined in ( 2 ) are additive inverses of each other and the measurement function in ( 5 ) is even. Therefore, it is more difficult to describe the underlying distributions.

For the PF algorithm, 200 particles and the corresponding weights are used to describe the underlying distributions. Therefore, even if the distributions are not Gaussian, the PF can still describe them and obtain good results. However, in some cases these propagated samples and weights do not work well. For example, the resampling step, which is applied to reduce the effects of degeneracy, can result in a loss of diversity among the particles, as the resultant sample will contain many repeated points; in this case, the PF also cannot track and estimate the hidden states [ 18 ]. This result is expected, because there is no feedback in the PF and the importance density affects the performance significantly.

For the FKKF algorithm, the distributions represented by the estimated embeddings are propagated at each iteration. Therefore, the nonlinear and non-Gaussian nature of the dynamical system can be described sufficiently well. Furthermore, although the estimated embeddings are also expressed in terms of training samples and weights, the training samples are not propagated directly by the state transition and measurement functions. It is the estimated embedding in the RKHS that is propagated by the conditional embedding operators, which can be treated as the counterparts of the state transition and measurement functions in the RKHS; the training samples are only used to construct these operators. Therefore, there are no issues of resampling and loss of diversity, and the FKKF obtains the best and most robust estimation performance.
5.6 Conclusion

In this chapter, we propose a full-blown kernel Kalman filter (FKKF) algorithm based on the conditional embedding operators. Compared with the KKF-CEO algorithm presented in the previous chapter, the FKKF algorithm uses the estimated state embeddings as the hidden states in the RKHS. In addition, with a non-trivial measurement model in the RKHS, the FKKF algorithm can more precisely model the system dynamics with two conditional embedding operators in the tensor-product RKHSs $\mathcal{H}_x \otimes \mathcal{H}_x$ and $\mathcal{H}_y \otimes \mathcal{H}_x$, which are constructed from samples generated by the system model functions at each iteration.

The FKKF algorithm is tested on estimating the scalar hidden states generated by a synthetic dynamical system from noisy measurements, and its estimation performance is compared with some existing algorithms, including the EKF, UKF, CKF and PF. The proposed algorithm outperforms these algorithms in precision and robustness.

However, more study of this algorithm is needed to improve it. One example is how to accurately model multi-dimensional noise in the RKHS: in this chapter the FKKF is applied to a scalar estimation problem with the very simple noise models $Q_i = qI$ and $R_i = rI$, but for a multi-dimensional dynamical system such noise models are obviously not sufficient for some complicated systems. Furthermore, how to construct efficient state transition and measurement operators with fewer samples is also an interesting problem.
CHAPTER 6
CONCLUSION AND FUTURE WORK

6.1 Conclusion and Discussion

In this work, we set out to develop the Kalman filter in the RKHS to solve nonlinear dynamical system problems. In this research direction, we have proposed three Kalman-based algorithms, namely the extended kernel recursive least squares based on the Kalman filter (Ex-KRLS-KF), the kernel Kalman filter based on the conditional embedding operator (KKF-CEO), and the full-blown kernel Kalman filter (FKKF).

First, we analyzed the extended kernel recursive least squares (Ex-KRLS) algorithm and pointed out its limitation. As an extension of the KRLS algorithm, the Ex-KRLS brings a dynamical state (weight) into the system model in the RKHS, constructs a state transition operator to describe the system dynamics in the RKHS, and implements the algorithm by setting the operator $A = \alpha I$. However, our work shows that this dynamical model in the RKHS cannot describe general dynamical systems, and the Ex-KRLS is just a random-walk KRLS algorithm, because the parameter $\alpha$ has to be 1 theoretically.

Next, we developed the Ex-KRLS-KF algorithm, which is a combination of the KRLS and a nonlinear Kalman filter. To avoid constructing a complicated state model in the RKHS, which involves operator learning, we maintain the state model in the input space while still constructing the measurement model in the RKHS. We learn the unknown measurement model in the RKHS and use it to estimate the hidden state by the nonlinear Kalman filter; we then obtain the output by feeding the estimated hidden state to the learned measurement model. Because of the more precise state model, the tracking performance is better than that of the Ex-KRLS algorithm. Although in this algorithm we obtain the hidden states as an intermediate step to compute the output, we cannot estimate the real hidden states, because they are estimated based on the approximated measurement model, not the real one; it is impossible to estimate the real hidden states using only input and output information.

Then, we presented the KKF-CEO algorithm to estimate the hidden state with a trivial measurement model in the RKHS. To construct a state model in the RKHS, we introduced the concepts of embeddings in the RKHS and conditional embedding operators. The estimated measurement embeddings, which represent the distribution of the measurements, are used as the hidden states in the RKHS and are propagated by the conditional embedding operators. The estimated measurement embedding is linked to the mapped measurement by an additive noise model, and the model uncertainty and measurement error are assumed to be zero-mean Gaussian noise in the RKHS. Based on this state-space model in the RKHS, a Kalman filter in the RKHS is developed to estimate the measurement embedding, as well as the hidden states in the input space. Because Gaussian noise in the RKHS can represent much wider noise types in the input space, this algorithm is robust in non-Gaussian noise environments for denoising and predicting signals. However, its trivial measurement model and its choice of measurement embeddings as the hidden state in the RKHS limit its applications. The algorithm is well suited as a nonlinear model to predict noisy series, as we demonstrate.

Finally, we proposed the FKKF algorithm, also based on the conditional embedding operators. The algorithm assumes that the state-space model is known, like the nonlinear Kalman filters. Using the state transition and measurement functions, we generate training samples at each iteration, including hidden states and corresponding measurements, to construct the conditional embedding operators, which are the counterparts of the state transition and measurement functions in the RKHS. With these operators, we have the full-blown state-space model in the RKHS with the state embeddings as hidden states. Likewise, the model uncertainty and measurement noise are assumed to be zero-mean Gaussian noise in the RKHS. Because of the non-trivial measurement model, which is constructed from the generated samples, this algorithm is named the full-blown Kalman filter in RKHS. In addition, because training samples are generated and distributions are propagated in this algorithm, there is some similarity between this algorithm and the PF.
However, they are totally different algorithms; we have given discussions and explanations of this point in Section 5.5. Moreover, the FKKF algorithm is not complete, and many interesting issues remain to be researched.

If we treat the KRLS as a Kalman algorithm with a static state model in the RKHS, then we can summarize the studied algorithms in a state-space model framework in Figure 6-1, which presents the hidden state, state transition matrices (operators) and measurement matrices (operators) of each algorithm.

Figure 6-1. Research line of algorithms

In Chapter 1, we listed some requirements that we try to achieve with our proposed algorithms. Having presented all the algorithms, we can check whether they achieve these requirements or not. The results are listed in Table 6-1.

Table 6-1. Algorithm requirements checklist

  Algorithm  | Learns system model from data directly | Estimation/prediction of output | Estimation/prediction of hidden state | Robust in a non-Gaussian noise environment
  Ex-KRLS-KF | X | X |   |
  KKF-CEO    | X | X | X | X
  FKKF       | X | X | X | X

It is clear that the KKF-CEO and FKKF algorithms achieve all the requirements, while the Ex-KRLS-KF achieves only some of them, because it is not a Kalman filter in the RKHS but a hybrid approach.

6.2 Future Work

We have presented several algorithms based on the Kalman algorithm in the RKHS. However, there are still many issues that we need to address to improve these algorithms or to develop new ones.

6.2.1 Noise Model in RKHS

It should be noted that the noises in the RKHS in all the algorithms, including the Ex-KRLS, KKF-CEO and FKKF, are assumed to be zero-mean Gaussian. Although this assumption is simple and effective, this noise model cannot represent some specific noises in complicated dynamical systems. In particular, how to describe a multi-dimensional noise with a full-rank covariance matrix in the RKHS is very challenging, and the only suggestion at this time is to use a product kernel that controls some of the axes of the RKHS. Furthermore, because the kernel mapping is nonlinear, the noise is actually not additive; how to reflect the non-additivity of the noise is an open problem.

6.2.2 Operator Design

In the KKF-CEO and FKKF algorithms, the conditional embedding operators are used to construct the state-space model in the RKHS. It is an open question whether new operators with better properties can be found with which to implement the Kalman filter in the RKHS.

6.2.3 Application of the Kalman Filter in Kernel Adaptive Algorithms

Since the traditional (nonlinear) Kalman filters are applied in some adaptive algorithms, we can also consider the kernel Kalman algorithm in kernel adaptive filtering. For example, we have studied the first (to our knowledge) hierarchical kernel filter as a kernel version of recurrent neural networks (RNN), namely the kernel recurrent system (KRS). Moreover, we developed a kernel version of the real-time recurrent learning
(KRTRL) algorithm to train the KRS. The details can be found in APPENDIX E. Because the nonlinear Kalman filters have been used to train RNNs, it is also possible to develop a kernel version of the Kalman filter to train the KRS.
APPENDIXASQUARE-ROOTCUBATUREKALMANFILTER InthisAPPENDIX,wesummarizethesquare-rootcubatureKalmanlter(SCKF)algorithm.Beforegivingthealgorithm,wedenoteageneraltriangularizationalgorithm(e.g.theQRdecomposition)asD=Tria(A),whereDisalowertriangularmatrix.ThematricesAandDarerelatedasfollows:LetRbeanuppertriangularmatrixobtainedfromtheQRdecompositiononAT,thenD=RT.Weusethesymbol/torepresentthematrixrightdivisionoperator.WhenweperformtheoperationA=B,itappliesthebacksubstitutionalgorithmforanuppertriangularmatrixBandtheforwardsubstitutionalgorithmforalowertriangularmatrixB. 126

PAGE 127

AlgorithmA-1:Square-rootCubatureKalmanlter Initialization:Fori=0,set^x0=E[x0]P0=E[(x0)]TJ /F4 11.955 Tf 11.96 0 Td[(E[x0])(x0)]TJ /F4 11.955 Tf 11.96 0 Td[(E[x0])T]FactorizeP0=D0DT0 Computation:Fori=1,2,...,TimeUpdateEvaluatethecubaturepoints(n=1,2,...,m)Xn,i)]TJ /F8 7.97 Tf 6.58 0 Td[(1=Di)]TJ /F8 7.97 Tf 6.59 0 Td[(1n+^xi)]TJ /F8 7.97 Tf 6.59 0 Td[(1,wherem=2nxEvaluatethepropagetecubaturepoints(n=1,2,...,m)Xn,i=f(Xn,i)]TJ /F8 7.97 Tf 6.59 0 Td[(1,ui)]TJ /F8 7.97 Tf 6.58 0 Td[(1).Estimatethepredictedstate^x)]TJ /F5 7.97 Tf 0 -8.28 Td[(i=1 mPmn=1Xn,iEstimatethesquare-rootfactorofthepredictederrorcovarianceD)]TJ /F5 7.97 Tf 0 -8.28 Td[(i=Tria\002XiDQ,i)]TJ /F5 7.97 Tf 6.58 0 Td[(1whereDQ,i)]TJ /F5 7.97 Tf 6.58 0 Td[(1denotesasquare-rootfactorofQi)]TJ /F8 7.97 Tf 6.58 0 Td[(1suchthatQi)]TJ /F8 7.97 Tf 6.58 0 Td[(1=DQ,i)]TJ /F5 7.97 Tf 6.59 0 Td[(1DTQ,i)]TJ /F5 7.97 Tf 6.58 0 Td[(1andtheweighted,centeredmatrixXi=1 p mX1,i)]TJ /F6 11.955 Tf 11.81 0 Td[(^x)]TJ /F5 7.97 Tf 0 -8.28 Td[(iX2,i)]TJ /F6 11.955 Tf 11.8 0 Td[(^x)]TJ /F5 7.97 Tf 0 -8.28 Td[(i...Xm,i)]TJ /F6 11.955 Tf 11.81 0 Td[(^x)]TJ /F5 7.97 Tf 0 -8.28 Td[(iMeasurementUpdateEvaluatethecubaturepoints(n=1,2,...,m)X)]TJ /F5 7.97 Tf 0 -8.28 Td[(n,i=Di)]TJ /F8 7.97 Tf 6.59 0 Td[(1n+^x)]TJ /F5 7.97 Tf 0 -8.28 Td[(i,Evaluatethepropagatedcubaturepoints(n=1,2,...,m)Y)]TJ /F5 7.97 Tf -.33 -8.28 Td[(n,i=h)]TJ /F7 11.955 Tf 5.48 -9.68 Td[(X)]TJ /F5 7.97 Tf 0 -8.28 Td[(n,i,ui.Estimatethepredictedmeasurement^y)]TJ /F5 7.97 Tf -.18 -8.28 Td[(i=1 mPmn=1Y)]TJ /F5 7.97 Tf -.33 -8.28 Td[(n,iEstimatethesquare-rootoftheinnovationcovariancematrixDyy,i=Tria\002YiDR,iwhereDR,i)]TJ /F5 7.97 Tf 6.59 0 Td[(1denotesasquare-rootfactorofRi)]TJ /F8 7.97 Tf 6.59 0 Td[(1suchthatRi)]TJ /F8 7.97 Tf 6.59 0 Td[(1=DR,i)]TJ /F5 7.97 Tf 6.59 0 Td[(1DTR,i)]TJ /F5 7.97 Tf 6.59 0 Td[(1andtheweighted,centeredmatrixYi=1 p mY)]TJ /F8 7.97 Tf -.33 -8.28 Td[(1,i)]TJ /F6 11.955 Tf 11.89 0 Td[(^y)]TJ /F5 7.97 Tf -.18 -8.28 Td[(iY)]TJ /F8 7.97 Tf -.32 -8.28 Td[(2,i)]TJ /F6 11.955 Tf 11.9 0 Td[(^y)]TJ /F5 7.97 Tf -.18 -8.28 Td[(i...Y)]TJ /F5 7.97 Tf -.33 -8.28 Td[(m,i)]TJ /F6 11.955 Tf 11.9 0 Td[(^y)]TJ /F5 7.97 Tf -.18 -8.28 Td[(iEstimatethecross-covariancematrixPxy=XiYTi.wheretheweighted,centeredmatrixXi=1 p mX)]TJ /F8 7.97 Tf 0 -8.28 Td[(1,i)]TJ /F6 11.955 Tf 11.81 0 Td[(^x)]TJ /F5 7.97 Tf 0 -8.28 Td[(iX)]TJ /F8 7.97 Tf 0 -8.28 Td[(2,i)]TJ /F6 11.955 Tf 11.81 0 Td[(^x)]TJ /F5 7.97 Tf 0 -8.28 Td[(i...X)]TJ /F5 7.97 Tf 0 -8.28 Td[(m,i)]TJ /F6 11.955 Tf 11.81 0 Td[(^x)]TJ /F5 7.97 Tf 0 -8.28 Td[(iEstimatetheKalmangainGi=(Pxy=Dyy,i)=Dyy,iEstimatetheupdatestate^xi=^x)]TJ /F5 7.97 Tf 0 -8.27 Td[(i+Gi)]TJ /F7 11.955 Tf 5.48 -9.68 Td[(yi)]TJ /F6 11.955 Tf 11.9 0 Td[(^y)]TJ /F5 7.97 Tf -.18 -8.27 Td[(iEstimatethecorrespondingerrorcovarianceDi=Tria\002Xi)]TJ /F7 11.955 Tf 11.95 0 Td[(GiYiGiDQ,i. 127

APPENDIX B
PROOFS OF THEOREM 4.1 AND THEOREM 4.2

Because of their close relationship, we prove them together.

Proof. Given that F_i = \Phi(K + \lambda m I_m)^{-1}\Phi^T, H_i = I, Q_i = qI and R_i = rI. Starting with \omega_0 = \varphi(y_0) and P_0 = I, and following the Kalman recursion of Chapter 4, the first iteration is as follows.

First, we predict \omega_1^-:

\omega_1^- = F_0\omega_0 = \Phi(K + \lambda m I_m)^{-1}\Phi^T\omega_0 = \Phi a_1,    (B-1)

where a_1 = (K + \lambda m I_m)^{-1}\Phi^T\omega_0.

Then we express P_1^- as

P_1^- = F_0P_0F_0^T + qI = \Phi(K + \lambda m I_m)^{-1}\Phi^T\Phi(K + \lambda m I_m)^{-1}\Phi^T + qI = \Phi\tilde{P}_1^-\Phi^T + qI,    (B-2)

where \tilde{P}_1^- = (K + \lambda m I_m)^{-1}\Phi^T\Phi(K + \lambda m I_m)^{-1}.

For the Kalman gain G_1,

G_1 = P_1^-H_1^T(H_1P_1^-H_1^T + R_1)^{-1}
    = P_1^-(P_1^- + rI)^{-1}
    = [\Phi\tilde{P}_1^-\Phi^T + qI][\Phi\tilde{P}_1^-\Phi^T + (q+r)I]^{-1}
    = (q/(q+r))[\Phi\tilde{P}_1^-\Phi^T + (q+r)I][\Phi\tilde{P}_1^-\Phi^T + (q+r)I]^{-1}
      + (r/(q+r))\Phi\tilde{P}_1^-\Phi^T[\Phi\tilde{P}_1^-\Phi^T + (q+r)I]^{-1}
    = (r/(q+r))\Phi\tilde{P}_1^-\Phi^T[\Phi\tilde{P}_1^-\Phi^T + (q+r)I]^{-1} + (q/(q+r))I.    (B-3)

Further, by the matrix inversion lemma, we have

G_1 = (r/(q+r))\Phi[(q+r)I_m + \tilde{P}_1^-\Phi^T\Phi]^{-1}\tilde{P}_1^-\Phi^T + (q/(q+r))I
    = (r/(q+r))\Phi\tilde{G}_1\Phi^T + (q/(q+r))I,    (B-4)

where \tilde{G}_1 = [(q+r)I_m + \tilde{P}_1^-\Phi^T\Phi]^{-1}\tilde{P}_1^-.

Next we update the covariance matrix:

P_1 = (I - G_1)P_1^-
    = (r/(q+r))\Phi\tilde{P}_1^-\Phi^T - (r/(q+r))\Phi\tilde{G}_1\Phi^T\Phi\tilde{P}_1^-\Phi^T
      - (qr/(q+r))\Phi\tilde{G}_1\Phi^T + (q - q^2/(q+r))I
    = \Phi\tilde{P}_1\Phi^T + (qr/(q+r))I,    (B-5)

where \tilde{P}_1 = (r/(q+r))\tilde{P}_1^- - (r/(q+r))\tilde{G}_1\Phi^T\Phi\tilde{P}_1^- - (qr/(q+r))\tilde{G}_1.

Finally, we update the state \omega_1 as

\omega_1 = \omega_1^- + G_1(\varphi(y_1) - \omega_1^-)
        = \Phi a_1 + [(r/(q+r))\Phi\tilde{G}_1\Phi^T + (q/(q+r))I](\varphi(y_1) - \Phi a_1)
        = \Phi[(r/(q+r))I - (r/(q+r))\tilde{G}_1\Phi^T\Phi]a_1 + (r/(q+r))\Phi\tilde{G}_1\Phi^T\varphi(y_1) + (q/(q+r))\varphi(y_1)
        = \Phi b_1 + (q/(q+r))\varphi(y_1),    (B-6)

where b_1 = [(r/(q+r))I - (r/(q+r))\tilde{G}_1\Phi^T\Phi]a_1 + (r/(q+r))\tilde{G}_1\Phi^T\varphi(y_1).

Here, we have proven Theorem 4.1 and Theorem 4.2 for the i = 1 iteration; we now prove them for i > 1. Assuming that Theorem 4.1 and Theorem 4.2 are both satisfied at the (i-1)th iteration, we have the following equations at the ith iteration:

\omega_i^- = F_{i-1}\omega_{i-1} = \Phi(K + \lambda m I_m)^{-1}\Phi^T[\Phi b_{i-1} + (q/(q+r))\varphi(y_{i-1})] = \Phi a_i,    (B-7)

where a_i = (K + \lambda m I_m)^{-1}\Phi^T[\Phi b_{i-1} + (q/(q+r))\varphi(y_{i-1})], and

P_i^- = F_{i-1}P_{i-1}F_{i-1}^T + qI = F_{i-1}[\Phi\tilde{P}_{i-1}\Phi^T + (qr/(q+r))I]F_{i-1}^T + qI = \Phi\tilde{P}_i^-\Phi^T + qI,    (B-8)

where

\tilde{P}_i^- = (K + \lambda m I_m)^{-1}\Phi^T\Phi\tilde{P}_{i-1}\Phi^T\Phi(K + \lambda m I_m)^{-1} + (qr/(q+r))(K + \lambda m I_m)^{-1}\Phi^T\Phi(K + \lambda m I_m)^{-1}.    (B-9)

Moreover, following the same steps as (B-3) to (B-6), the other equations in Theorem 4.1 and Theorem 4.2 can be obtained for i > 1.
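Because the proof manipulates only finite matrices once \Phi is instantiated, the factored form of the Kalman gain in (B-3)-(B-4) can be verified numerically. A small sanity check with an explicit, finite-dimensional stand-in feature matrix (all names and sizes are our own test choices):

```python
import numpy as np

rng = np.random.default_rng(0)
d, m = 50, 8                          # explicit feature dimension, sample count
q, r, lam = 0.1, 0.2, 1e-2            # noise variances and regularizer lambda
Phi = rng.standard_normal((d, m))     # stand-in feature matrix
K = Phi.T @ Phi                       # Gram matrix K = Phi^T Phi
Im, Id = np.eye(m), np.eye(d)
A = K + lam * m * Im                  # K + lambda*m*I_m

F = Phi @ np.linalg.solve(A, Phi.T)              # F = Phi (K + lam m I)^{-1} Phi^T
P1_minus = F @ F.T + q * Id                      # P_1^- with P_0 = I
G1 = P1_minus @ np.linalg.inv(P1_minus + r * Id) # gain, H_1 = I

Pt = np.linalg.solve(A, K) @ np.linalg.inv(A)    # P~_1^- = A^{-1} K A^{-1}
Gt = np.linalg.solve((q + r) * Im + Pt @ K, Pt)  # G~_1 from (B-4)
assert np.allclose(G1, r / (q + r) * Phi @ Gt @ Phi.T + q / (q + r) * Id)
```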

APPENDIX C
PROOF OF THEOREM 4-3

Proof. Let e_j = [0, 0, ..., 1, ..., 0]^T (1 \le j \le k) be a k-dimensional standard basis (canonical basis) vector. If (y'_i, z'_i) is quantized to (y_j^q, z_j^q), then \varphi(y'_i) and \varphi(z'_i) are quantized as \Phi_Q e_j and \Upsilon_Q e_j, respectively. For convenience, we denote c[i] = j. Then the feature matrices \Phi and \Upsilon are quantized as \Phi_Q E and \Upsilon_Q E, where E = [e_{c[1]}, ..., e_{c[m]}].

Substituting the quantized \Phi and \Upsilon into the expression for F in Chapter 4, we have

F_Q = \Upsilon_Q E(E^T K_Q E + \lambda m I)^{-1}E^T\Phi_Q^T.    (C-1)

Further, by the matrix inversion lemma [81], we have

F_Q = \Upsilon_Q EE^T(K_Q EE^T + \lambda m I)^{-1}\Phi_Q^T = \Upsilon_Q\Lambda(K_Q\Lambda + \lambda m I)^{-1}\Phi_Q^T,    (C-2)

where \Lambda = EE^T is a diagonal matrix whose jth diagonal element q_j is the number of samples clustered into the jth bin by the kernel K-means algorithm.
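The equality of (C-1) and (C-2) can likewise be checked numerically by building E from a list of bin indices; a small sketch (sizes and names are ours):

```python
import numpy as np

rng = np.random.default_rng(1)
m, k, lam = 20, 5, 1e-2                 # samples, quantization bins, regularizer
c = rng.integers(0, k, size=m)          # bin index c[i] of each sample
E = np.eye(k)[:, c]                     # E = [e_c[1], ..., e_c[m]], shape (k, m)

A = rng.standard_normal((k, k))
K_Q = A @ A.T                           # stand-in Gram matrix of the k centers
Ik, Im = np.eye(k), np.eye(m)

# the factor of (C-1) between the feature matrices
inner = E @ np.linalg.inv(E.T @ K_Q @ E + lam * m * Im) @ E.T
# the factor of (C-2), with Lambda = E E^T the diagonal matrix of bin counts
Lam = np.diag(np.bincount(c, minlength=k).astype(float))
assert np.allclose(Lam, E @ E.T)
assert np.allclose(inner, Lam @ np.linalg.inv(K_Q @ Lam + lam * m * Ik))
```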

APPENDIX D
PROOFS OF THEOREM 5.1 AND THEOREM 5.2

Because of their close relationship, we prove them together.

Proof. Given that F_i = \Phi\tilde{F}\Phi^T, H = \Psi\tilde{H}\Phi^T, Q_i = qI and R_i = rI, where \Phi and \Psi denote the feature matrices of the state and measurement spaces. Starting with \omega_0 = \varphi(x_0) and P_0 = I, and following the Kalman recursion of Chapter 5, the first iteration is as follows.

First, we predict \omega_1^-:

\omega_1^- = F_0\omega_0 = \Phi\tilde{F}\Phi^T\omega_0 = \Phi a_1,    (D-1)

where a_1 = \tilde{F}\Phi^T\omega_0.

Then we express P_1^- as

P_1^- = F_0P_0F_0^T + qI = \Phi\tilde{F}\Phi^T\Phi\tilde{F}^T\Phi^T + qI = \Phi\tilde{P}_1^-\Phi^T + qI,    (D-2)

where \tilde{P}_1^- = \tilde{F}\Phi^T\Phi\tilde{F}^T.

For the Kalman gain G_1,

G_1 = P_1^-H^T(HP_1^-H^T + R_1)^{-1}
    = [\Phi\tilde{P}_1^-\Phi^T + qI]\Phi\tilde{H}^T\Psi^T(\Psi\tilde{H}\Phi^T[\Phi\tilde{P}_1^-\Phi^T + qI]\Phi\tilde{H}^T\Psi^T + rI)^{-1}
    = \Phi[\tilde{P}_1^-\Phi^T\Phi\tilde{H}^T + q\tilde{H}^T]\Psi^T(\Psi\tilde{H}\Phi^T\Phi[\tilde{P}_1^-\Phi^T\Phi\tilde{H}^T + q\tilde{H}^T]\Psi^T + rI)^{-1}.    (D-3)

Further, by the matrix inversion lemma, (A - BD^{-1}C)^{-1}BD^{-1} = A^{-1}B(D - CA^{-1}B)^{-1}, applied with A = I_m (the m x m identity matrix), B = [\tilde{P}_1^-\Phi^T\Phi\tilde{H}^T + q\tilde{H}^T]\Psi^T, C = -\Psi\tilde{H}\Phi^T\Phi and D = rI, we have

G_1 = \Phi[rI_m + (\tilde{P}_1^-\Phi^T\Phi\tilde{H}^T + q\tilde{H}^T)\Psi^T\Psi\tilde{H}\Phi^T\Phi]^{-1}[\tilde{P}_1^-\Phi^T\Phi\tilde{H}^T + q\tilde{H}^T]\Psi^T = \Phi\tilde{G}_1\Psi^T,    (D-4)

where \tilde{G}_1 = [rI_m + (\tilde{P}_1^-\Phi^T\Phi\tilde{H}^T + q\tilde{H}^T)\Psi^T\Psi\tilde{H}\Phi^T\Phi]^{-1}[\tilde{P}_1^-\Phi^T\Phi\tilde{H}^T + q\tilde{H}^T].

Next we update the covariance matrix:

P_1 = (I - G_1H)P_1^-
    = P_1^- - G_1HP_1^-
    = \Phi\tilde{P}_1^-\Phi^T + qI - \Phi\tilde{G}_1\Psi^T\Psi\tilde{H}\Phi^T[\Phi\tilde{P}_1^-\Phi^T + qI]
    = \Phi[\tilde{P}_1^- - q\tilde{G}_1\Psi^T\Psi\tilde{H} - \tilde{G}_1\Psi^T\Psi\tilde{H}\Phi^T\Phi\tilde{P}_1^-]\Phi^T + qI
    = \Phi\tilde{P}_1\Phi^T + qI,    (D-5)

where \tilde{P}_1 = \tilde{P}_1^- - q\tilde{G}_1\Psi^T\Psi\tilde{H} - \tilde{G}_1\Psi^T\Psi\tilde{H}\Phi^T\Phi\tilde{P}_1^-.

Finally, we update the state \omega_1 as

\omega_1 = \omega_1^- + G_1(\psi(y_1) - H\omega_1^-)
        = \Phi a_1 + \Phi\tilde{G}_1\Psi^T\psi(y_1) - \Phi\tilde{G}_1\Psi^T\Psi\tilde{H}\Phi^T\Phi a_1
        = \Phi b_1,    (D-6)

where b_1 = a_1 + \tilde{G}_1\Psi^T\psi(y_1) - \tilde{G}_1\Psi^T\Psi\tilde{H}\Phi^T\Phi a_1.

Here, we have proven Theorem 5.1 and Theorem 5.2 for the i = 1 iteration; we now prove them for i > 1. Assuming that Theorem 5.1 and Theorem 5.2 are both satisfied at the (i-1)th iteration, we have the following equations at the ith iteration:

\omega_i^- = F\omega_{i-1} = \Phi\tilde{F}\Phi^T\Phi b_{i-1} = \Phi a_i,    (D-7)

where a_i = \tilde{F}\Phi^T\Phi b_{i-1}, and

P_i^- = FP_{i-1}F^T + qI = \Phi\tilde{F}\Phi^T[\Phi\tilde{P}_{i-1}\Phi^T + qI]\Phi\tilde{F}^T\Phi^T + qI = \Phi\tilde{P}_i^-\Phi^T + qI,    (D-8)

where

\tilde{P}_i^- = \tilde{F}[\Phi^T\Phi\tilde{P}_{i-1}\Phi^T\Phi + q\Phi^T\Phi]\tilde{F}^T.    (D-9)

Moreover, following the same steps as (D-3) to (D-6), the other equations in Theorem 5.1 and Theorem 5.2 can be obtained for i > 1.
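As in Appendix B, the factored form of the gain in (D-3)-(D-4) can be verified with explicit finite-dimensional stand-ins for \Phi and \Psi (a sanity check under our reconstructed notation; all names are ours):

```python
import numpy as np

rng = np.random.default_rng(2)
d1, d2, m = 40, 30, 6            # explicit feature dimensions (test only)
q, r = 0.1, 0.2
Phi = rng.standard_normal((d1, m))
Psi = rng.standard_normal((d2, m))
Ft = rng.standard_normal((m, m))  # F~
Ht = rng.standard_normal((m, m))  # H~
Kp, Ks = Phi.T @ Phi, Psi.T @ Psi

F = Phi @ Ft @ Phi.T
H = Psi @ Ht @ Phi.T
P1m = F @ F.T + q * np.eye(d1)                     # P_1^- with P_0 = I
G1 = P1m @ H.T @ np.linalg.inv(H @ P1m @ H.T + r * np.eye(d2))

Pt = Ft @ Kp @ Ft.T                                # P~_1^-
M = (Pt @ Kp + q * np.eye(m)) @ Ht.T
Gt = np.linalg.solve(r * np.eye(m) + M @ Ks @ Ht @ Kp, M)   # G~_1
assert np.allclose(G1, Phi @ Gt @ Psi.T)
```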

APPENDIX E
KERNEL RECURRENT SYSTEM TRAINED BY THE REAL-TIME RECURRENT LEARNING ALGORITHM

E.1 Introduction

In this appendix, we propose a novel hierarchical kernel adaptive filter algorithm for time series prediction with a recurrent hidden state model, which is learned from the processing data and is able to describe the dynamics of the data.

Since the success of the support vector machine (SVM) [7], the kernel methodology has been applied to many algorithms of machine learning and adaptive filtering. Utilizing the famed kernel trick, these linear methods have been recast in high-dimensional reproducing kernel Hilbert spaces (RKHS) [48-51] to yield more powerful nonlinear extensions, including kernel principal component analysis (KPCA) [11, 12] and kernel independent component analysis (KICA) [13, 14]. Recently, several on-line kernel adaptive filter algorithms have also been developed to solve nonlinear regression problems, such as the kernel recursive least squares (KRLS) [16], kernel least mean squares (KLMS) [15] and kernel recurrent gamma network (KRGN) [92] algorithms.

The KRLS and KLMS algorithms are able to discover the underlying input-output mapping very well in stationary cases. However, because of the absence of hidden states, these algorithms cannot describe the underlying dynamics of the processing data in non-stationary cases, and therefore cannot achieve good performance there. In the KRGN algorithm, although the recursive Gamma filter is implemented in the RKHS, the specific topology only allows local recursion on the past of the input, which makes stability easy to control but does not allow a fully recurrent state. To achieve global recursiveness, the extended kernel recursive least squares (Ex-KRLS) algorithm [17] was proposed, but it can only implement a random walk model in the hidden state. Later, another extended kernel recursive least squares algorithm (Ex-KRLS-KF) was proposed based on the Kalman filter (KF) [76], which can be applied to any known linear or nonlinear hidden state model. For both of these algorithms (Ex-KRLS and Ex-KRLS-KF), however, the hidden state model has to be known in advance, which may not be available in many signal processing problems.

To construct and learn the underlying hidden state model, the idea of recurrent neural networks (RNNs) [96] is adopted. In kernel recurrent systems, the KLMS algorithm is applied instead of the original perceptron algorithm, while making the topology recurrent, as in Jordan's and Elman's networks [97-99]. Furthermore, a kernel version of the real-time recurrent learning (RTRL) algorithm [94, 95] is derived to learn this recurrent network. To obtain a stable system, the teacher forcing technique [93, 94] is applied to the KRTRL, which substitutes the desired response for the dynamics of the hidden state to avoid system instability.

The rest of the appendix is organized as follows. In Section E.2, the kernel recurrent system (KRS) is described. Next, the kernel RTRL (KRTRL) algorithm is derived in Section E.3. Then, the experiment of Lorenz time series prediction is presented to evaluate the proposed algorithm in Section E.4. Finally, discussion and conclusions are given in Section E.5.

E.2 Kernel recurrent networks

In this section, we present a hierarchical general kernel recurrent network architecture, namely the state-space model, and point out that any state-space model can be subsumed by a specific state-space model with a kernel state model and a linear measurement model. Then, in the next section, we will develop a kernel RTRL algorithm to train this specific state-space model.

Fig. E-1 shows the block diagram of a state-space model. u_i \in R^{n_u}, x_i \in R^{n_x} and y_i \in R^{n_y} are the inputs, hidden states and outputs at time i, respectively. The state-space model is

x_{i+1} = f(x_i, u_i),    (E-1)
y_i = h(x_{i+1}),    (E-2)

Figure E-1. State-space model; the feedback part is shown in red.

where f(x_i, u_i) = [f^{(1)}(x_i, u_i), ..., f^{(n_x)}(x_i, u_i)]^T and h(x_{i+1}) = [h^{(1)}(x_{i+1}), ..., h^{(n_y)}(x_{i+1})]^T. In this appendix the superscript (k) denotes the kth component of a vector or the kth column of a matrix.

In order to simplify the computation and reduce the learning time, we modify this state-space model as

[x_{i+1}; y_i] = [f(x_i, u_i); h \circ f(x_i, u_i)],    (E-3)
y_i = [0  I_{n_y}][x_{i+1}; y_i],    (E-4)

where \circ is the composition operator. We denote [x_{i+1}; y_i] by s_{i+1} as a new hidden state, and [0  I_{n_y}] by W_m as a known measurement matrix, where I_{n_y} is an n_y x n_y identity matrix. Furthermore, let g(s_i, u_i) = f(x_i, u_i).

We map s_i and u_i into the RKHSs H_s and H_u as \varphi(s_i) \in H_s and \upsilon(u_i) \in H_u, respectively. Then the new nonlinear state transition weights W_H = [g(\cdot,\cdot); h \circ g(\cdot,\cdot)] are in the same RKHS H_{su} = H_s \otimes H_u, where \otimes is the tensor operator. We denote \varphi(s_i) \otimes \upsilon(u_i) \in H_{su} by \psi(s_i, u_i).
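Since everything below only needs inner products in H_{su}, it is worth noting that these factor into a product of the two component kernels; a minimal sketch, assuming the Gaussian kernels with parameters a_s and a_u used in the next section (the function name is ours):

```python
import numpy as np

def k_su(s1, u1, s2, u2, a_s=1.0, a_u=1.0):
    """Inner product <psi(s1, u1), psi(s2, u2)> in H_su = H_s (x) H_u.

    For tensor-product features psi(s, u) = phi(s) (x) upsilon(u), the
    inner product factors as k_s(s1, s2) * k_u(u1, u2).
    """
    k_s = np.exp(-a_s * np.sum((np.asarray(s1) - np.asarray(s2)) ** 2))
    k_u = np.exp(-a_u * np.sum((np.asarray(u1) - np.asarray(u2)) ** 2))
    return k_s * k_u
```

This factorization is the only way \psi enters the algorithm derived next.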

At this point, we can express general state-space recurrent networks using a special kernel state-space model:

s_{i+1} = W_H^T\psi(s_i, u_i),    (E-5)
y_i = W_m s_{i+1},    (E-6)

where W_H are weights in the RKHS and W_m is a known linear matrix.

E.3 On-line recurrent system learning

In the previous section, nonlinear recurrent systems were reformulated in (E-5) and (E-6). To perform on-line learning in this network, we develop the kernel real-time recurrent learning (KRTRL) algorithm in this section; the teacher forcing technique is also introduced to avoid instability during training.

E.3.1 Kernel real-time recurrent learning algorithm

The kernel real-time recurrent learning (KRTRL) algorithm is developed based on the RTRL algorithm [94], which derives its name from the fact that adjustments are made to the weights of a fully connected recurrent network in real time. The KRTRL algorithm can likewise update the weights in the RKHS, which are actually functions, while the network continues to perform its signal-processing function. The KRTRL is derived with respect to (E-5) and (E-6). Without loss of generality, all kernels involved in the algorithm are Gaussian kernels. For the RKHSs H_u and H_s, the kernels are k_u(x, y) and k_s(x, y) with kernel parameters a_u and a_s, respectively.

To complete the description of this learning process, we need to calculate the gradient of the error surface with respect to \omega_k \in H_{su}, the kth component of W_H. To do this, we first use (E-6) to define the n_y x 1 error vector

e_i = d_i - y_i,    (E-7)

and the instantaneous sum of squared errors at time step i is defined in terms of e_i by

E_i = (1/2)e_i^Te_i.    (E-8)

To implement this on-line learning algorithm, we use the method of steepest descent, which requires knowledge of the gradient \partial E_i/\partial\omega_k, written as

\partial E_i/\partial\omega_k = \partial(e_i^Te_i)/(2\partial\omega_k) = -e_i^T(\partial y_i/\partial\omega_k) = -e_i^T(\partial y_i/\partial s_{i+1})(\partial s_{i+1}/\partial\omega_k).    (E-9)

According to (E-5) and (E-6), we have

\partial y_i/\partial s_{i+1} = W_m,    (E-10)

and

\partial s_{i+1}/\partial\omega_k = \partial(W_H^T\psi(s_i, u_i))/\partial\omega_k = W_H^T(\partial\psi(s_i, u_i)/\partial\omega_k) + I_{n_s}^{(k)}\psi(s_i, u_i)^T,    (E-11)

where I_{n_s} is the n_s x n_s identity matrix and I_{n_s}^{(k)} is the kth column of the identity matrix.

According to the representer theorem, we can express \omega_{k,i}, the kth weight at time i, as a linear combination of \{\psi(s_j, u_j)\}_{j=0}^{i-1}, i.e.,

\omega_{k,i} = \Phi_i c_{k,i},    (E-12)

where \Phi_i = [\psi(s_0, u_0), ..., \psi(s_{i-1}, u_{i-1})] and c_{k,i} \in R^i, and W_H can be expressed as

W_H = \Phi_i C_i,    (E-13)

where C_i = [c_{1,i}, ..., c_{n_s,i}] \in R^{i x n_s}.

Then the first term of the last line in (E-11) is calculated by

W_H^T(\partial\psi(s_i, u_i)/\partial\omega_k) = C_i^T(\partial\Phi_i^T\psi(s_i, u_i)/\partial s_i)(\partial s_i/\partial\omega_k) = 2a_sC_i^TD_iS_i^T(\partial s_i/\partial\omega_k) = \Gamma_i(\partial s_i/\partial\omega_k),    (E-14)

where D_i = diag(\Phi_i^T\psi(s_i, u_i)), S_i = [(s_0 - s_i), ..., (s_{i-1} - s_i)], and \Gamma_i = \partial s_{i+1}/\partial s_i = 2a_sC_i^TD_iS_i^T. Substituting (E-14) into (E-11), we have the following recursive equation:

\partial s_{i+1}/\partial\omega_k = \Gamma_i(\partial s_i/\partial\omega_k) + I_{n_s}^{(k)}\psi(s_i, u_i)^T.    (E-15)

If we assume that \partial s_1/\partial\omega_k = 0, then we can express \partial s_i/\partial\omega_k as

\partial s_i/\partial\omega_k = M_{k,i}\Phi_i^T,    (E-16)

where M_{k,i} is an n_s x i matrix and M_{k,1} = [0, 0, ..., 0]^T. Furthermore, (E-15) can be rewritten as

\partial s_{i+1}/\partial\omega_k = \Gamma_iM_{k,i}\Phi_i^T + I_{n_s}^{(k)}\psi(s_i, u_i)^T = [\Gamma_iM_{k,i}, I_{n_s}^{(k)}]\Phi_{i+1}^T = M_{k,i+1}\Phi_{i+1}^T,    (E-17)

where

M_{k,i+1} = [\Gamma_iM_{k,i}, I_{n_s}^{(k)}].    (E-18)

Substituting (E-10) and (E-17) into (E-9), we have

\partial E_i/\partial\omega_k = -e_i^TW_mM_{k,i+1}\Phi_{i+1}^T,    (E-19)

and we can update \omega_k by

\omega_{k,i+1} = \omega_{k,i} + \eta\Phi_{i+1}(W_mM_{k,i+1})^Te_i,    (E-20)

where \eta is the learning-rate parameter.

Therefore, the feature matrix \Phi_i and the coefficient matrix C_i \in R^{i x n_s} should be updated by

\Phi_{i+1} = [\Phi_i, \psi(s_i, u_i)],    (E-21)

C_{i+1}^{(k)} = c_{k,i+1} = [c_{k,i}^T, 0]^T + \eta M_{k,i+1}^TW_m^Te_i,  (k = 1, 2, ..., n_s).    (E-22)

If we substitute (E-18) into (E-22), we have

M_{k,i+1}^TW_m^Te_i = [\Gamma_iM_{k,i}, I_{n_s}^{(k)}]^TW_m^Te_i = [(M_{k,i}^T\Gamma_i^TW_m^Te_i)^T, (W_m^Te_i)^{(k)}]^T.    (E-23)

At this point, the learning procedure is complete; it is summarized in Algorithm E-1.

The computational complexity of the KRTRL algorithm is O(n_sn_yi + n_s^2i), considering that D_i is a diagonal matrix. The computational complexity increases linearly with the sample number i, like that of the KLMS algorithm.

Algorithm E-1: Kernel Real-Time Recurrent Learning

Initialization (i = 0):
  Input dim.: n_u, state dim.: n_s, output dim.: n_y
  Randomly set u_0, s_0 and s_1
  Gaussian kernel parameters: a_s and a_u
  Feature matrix: \Phi_{i+1} = [\psi(s_i, u_i)]
  Coeff. matrix: randomly set C_{i+1} \in R^{1 x n_s}
  Meas. matrix: W_m \in R^{n_y x n_s}
  Gradient matrices: M_{k,i+1} = 0_{n_s x 1} for k = 1, ..., n_s
  Learning-rate parameter: \eta

Learning (i = 1, 2, ...):
  Forward pass:
    s_{i+1} = C_i^T\Phi_i^T\psi(s_i, u_i),  y_i = W_ms_{i+1},  e_i = d_i - y_i
  Backward pass:
    S_i = [(s_0 - s_i), ..., (s_{i-1} - s_i)]
    D_i = diag(\Phi_i^T\psi(s_i, u_i))
    \Gamma_i = 2a_sC_i^TD_iS_i^T
    M_{k,i+1} = [\Gamma_iM_{k,i}, I_{n_s}^{(k)}] for k = 1, ..., n_s
  Update the weights W_H in the RKHS:
    \Phi_{i+1} = [\Phi_i, \psi(s_i, u_i)]
    C_{i+1}^{(k)} = C_i^{(k)} + \eta M_{k,i}^T\Gamma_i^TW_m^Te_i for k = 1, ..., n_s
    C_{i+1} = [C_{i+1}^T, \eta W_m^Te_i]^T
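Below is a minimal NumPy sketch of Algorithm E-1; the class layout, variable names, and initialization scale are our own choices, and the optional force argument anticipates the teacher-forcing modification derived in Section E.3.2 below:

```python
import numpy as np

class KRTRL:
    """Sketch of Algorithm E-1; Gaussian kernels with parameters a_s, a_u."""

    def __init__(self, n_s, s0, s1, u0, W_m, eta=0.1, a_s=1.0, a_u=1.0):
        self.eta, self.a_s, self.a_u, self.W_m = eta, a_s, a_u, W_m
        self.S = np.atleast_2d(np.asarray(s0, float))  # stored states s_j (rows)
        self.U = np.atleast_2d(np.asarray(u0, float))  # stored inputs u_j (rows)
        self.C = 0.1 * np.random.randn(1, n_s)         # coefficient matrix C_1
        self.M = np.zeros((n_s, n_s, 1))               # M[k] = M_{k,1} = 0
        self.s = np.asarray(s1, float)                 # current hidden state s_i
        self.n_s = n_s

    def _kvec(self, s, u):
        # Phi_i^T psi(s, u): elementwise product of the two kernel vectors
        ks = np.exp(-self.a_s * np.sum((self.S - s) ** 2, axis=1))
        ku = np.exp(-self.a_u * np.sum((self.U - u) ** 2, axis=1))
        return ks * ku

    def step(self, u, d, force=None):
        u, d = np.asarray(u, float), np.asarray(d, float)
        s, n_s = self.s, self.n_s
        kv = self._kvec(s, u)
        s_next = self.C.T @ kv                  # forward pass, (E-5)
        y = self.W_m @ s_next                   # output, (E-6)
        e = d - y                               # error, (E-7)

        St = (self.S - s).T                     # S_i, shape (n_s, i)
        Gamma = 2 * self.a_s * self.C.T @ (kv[:, None] * St.T)   # (E-14)
        # M_{k,i+1} = [Gamma M_{k,i}, I^{(k)}], stacked over k    (E-18)
        self.M = np.concatenate(
            [np.einsum('ab,kbi->kai', Gamma, self.M),
             np.eye(n_s)[:, :, None]], axis=2)

        # coefficient update (E-22): pad C with a zero row, then add
        # eta * M_{k,i+1}^T W_m^T e_i, column by column
        g = self.W_m.T @ e
        C_pad = np.vstack([self.C, np.zeros((1, n_s))])
        self.C = C_pad + self.eta * np.einsum('kai,a->ik', self.M, g)

        # grow the dictionary: Phi_{i+1} = [Phi_i, psi(s_i, u_i)]  (E-21)
        self.S = np.vstack([self.S, s])
        self.U = np.vstack([self.U, u])

        if force is not None:
            # teacher forcing (Section E.3.2): substitute desired values
            # for the hidden-state components and zero those rows of each M_k
            idx = np.arange(np.size(force))
            s_next = s_next.copy()
            s_next[idx] = force
            self.M[:, idx, :] = 0.0

        self.s = s_next
        return y
```

In the experiment of Section E.4, force is the teacher signal [x_i(k), x_{i-1}(k)], applied at every step (t = 1).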

E.3.2 Teacher forcing

The KRTRL algorithm is a gradient-based approach applied to a recurrent system, as can be seen in Fig. E-1. However, unlike the RTRL algorithm, in which the activation functions are fixed in advance and only the weights are updated to adjust the network, the KRTRL constructs the expected KRS by adjusting the functions themselves. Therefore, the stability of the learning system is a major issue for this learning algorithm, and unfortunately, the stability analysis of recurrent functions is so complicated that there is no general method to easily specify stability conditions. Specifically, the updated functions and the calculated gradients both depend on the inputs u_i and the hidden states s_i, which are themselves functions of the learned weights. In the KRTRL algorithm the gradient of the hidden states with respect to the functions, \partial s_{i+1}/\partial\omega_k, is propagated forward by the recursion (E-15) to (E-18), so the propagated gradient cannot always reflect the real contribution of \omega_k to the current cost function, because if the dynamics are unstable the weight update will be wrong. The update made based on the gradient does not guarantee that the function is modified in a proper way, whatever the initial condition or however small the learning rate.

Fortunately, we can apply the teacher forcing technique to avoid this problem and train the KRS in a stable manner. The idea of the teacher forcing technique is to replace, in the training procedure, the actual hidden state s_{i+1}^{(j)} by the corresponding desired response d_i^{(j)} in subsequent computations of the behavior of the network. This can be done at every iteration or periodically to avoid divergence of the states. The teacher forcing technique uses the desired response, which is assumed bounded and stable, to constrain and guide the learning dynamics of the hidden states, following the sequential desired responses and forcing the KRS to converge in a proper way, avoiding system instability.

In order to derive a learning algorithm for this situation, we once again differentiate the cost function with respect to \omega_k. We find that only the term M_{k,i} needs to be modified. In (E-16), \partial s_i/\partial\omega_k = M_{k,i}\Phi_i^T. If s_{i+1}^{(j)} is replaced by d_i^{(j)}, then \partial s_{i+1}^{(j)}/\partial\omega_k should be equal to zero; that is to say, the jth row of the matrix M_{k,i+1} is set to zeros. If all the hidden states s_{i+1}^{(j)} (j = 1, ..., n_s) are substituted by d_i, the system training procedure degrades to a MIMO KLMS algorithm. In addition, the teacher forcing can be applied once every t steps.

E.4 Experiments and results

To evaluate the proposed KRTRL algorithm, we chose to predict the Lorenz time series, and the performance is compared with that of the KLMS algorithm. The system is nonlinear, three-dimensional and deterministic, defined by the set of differential equations given in [17]. Here, the experimental data are generated with the coefficients \beta = 8/3, \sigma = 10 and \rho = 28; a data-generation sketch is given below.
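A minimal sketch of this data generation (the first-order difference scheme described in the next paragraph, with the step size 0.01 used in the experiment; the initial condition is our own assumption):

```python
import numpy as np

def lorenz_series(n, dt=0.01, beta=8.0 / 3, sigma=10.0, rho=28.0,
                  x0=(1.0, 1.0, 1.0)):
    """Generate n samples of the Lorenz system by first-order (Euler)
    differences: x_{i+1} = x_i + dt * f(x_i)."""
    x = np.empty((n, 3))
    x[0] = x0
    for i in range(n - 1):
        dx = sigma * (x[i, 1] - x[i, 0])
        dy = x[i, 0] * (rho - x[i, 2]) - x[i, 1]
        dz = x[i, 0] * x[i, 1] - beta * x[i, 2]
        x[i + 1] = x[i] + dt * np.array([dx, dy, dz])
    return x
```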

PAGE 144

signaltotraintheKRS.Fortesting,thehiddenstatecorrespondingtoxi(k)ischosenasthesystemoutput.Thepredictionperformancesaretabulatedasbelow: TableE-1. Usingx(1)asinputstopredictx(1),x(2)andx(3) Des.AlgorithmsSNRMSE KLMS(=0.1)25.7411.7716(dB)0.000962180.00083813x(1)KLMS(=0.01)12.19332.1166(dB)0.0217710.013692KRS(=0.1)24.39631.6035(dB)0.00127550.00097267KLMS(=0.1)-2.2230.25915(dB)1.68340.12373x(2)KLMS(=0.01)3.4830.27802(dB)0.449980.026276KRS(=0.1)23.40772.4422(dB)0.00512750.002389KLMS(=0.1)-2.50390.22352(dB)1.52960.11027x(3)KLMS(=0.01)3.07120.25711(dB)0.420380.024646KRS(=0.1)17.93621.7771(dB)0.0147370.005885 Fromtheseresults,onecanconcludethatbothalgorithmscanobtaingoodpredictionperformanceswhileusingx(1)topredictx(1).ButtheKLMSalgorithmgetpoorperformanceswhenpredictingx(2)orx(3)timeseriesusingx(1).While,ouralgorithmcanobtaingoodpredictionperformancesinpredictingthetheredimensionalsystem.Thisisbecausethemappingsfromx(1)tox(2)orx(3)aremorecomplexandtheerrorincreaseswithiterationsproportionallytothelargestLyapunovexponentofthesystemthatproducedthetimeseries,whichispositiveinthiscase(Chaoticsystem).TheKRSlearnedbytheKRTRLalgorithmisdesignedtodescribethesystemdynamics,whichcanquantifybetterthedatastructuretoimplementthepredictiontask.Therefore,thereisnosurprisethatourproposedalgorithmcanstillbesuccessful. E.5Conclusion Inthisappendix,weproposedforthersttimeahierarchicalkernelltercalledthekernelrecurrentsystem(KRS)whichcandescribegeneralstate-spacemodelrecurrentnetworksandimplementthenonlinearcomputationsintheRKHS.Furthermore,akernelreal-timerecurrentlearning(KRTRL)algorithmisdevelopedtotraintheKRS.TheteacherforcingtechniqueisappliedtomodifytheKRTRLalgorithmtoimprovethe 144

PAGE 145

The teacher forcing technique was applied to modify the KRTRL algorithm and improve the learning tasks. The KRTRL algorithm was applied to Lorenz time series prediction, and its prediction performance outperforms that of the KLMS algorithm when the input-to-output mappings are non-stationary.

The computational complexity of the KRTRL algorithm is similar to that of the KLMS, increasing linearly with the sample number i. However, because of the recurrent architecture, all of the coefficients are updated when a new sample arrives, unlike in the KLMS algorithm, which only updates the current coefficient. This property also enhances the learning capability of the algorithm. In the experiment, the teacher forcing is applied at every step. We can also increase the interval t, e.g., to t = 2 or 3, to learn the state dynamics better, which however requires a smaller learning rate and a longer training time.

REFERENCES

[1] A. Sayed, Fundamentals of Adaptive Filtering. New York: Wiley, 2003.
[2] S. Haykin, Adaptive Filter Theory, Third ed. Englewood Cliffs, NJ: Prentice-Hall, 1996.
[3] S. Haykin, A. Sayed, J. R. Zeidler, P. Yee, P. C. Wei, "Adaptive tracking of linear time-variant systems by extended RLS algorithm," IEEE Transactions on Signal Processing, vol. 45, no. 5, pp. 1118-1128, May 1997.
[4] S. Haykin, Adaptive Filter Theory (4th ed.). Upper Saddle River, NJ: Prentice Hall.
[5] B. Boser, I. Guyon, and V. Vapnik, "A training algorithm for optimal margin classifiers," Fifth Annual Workshop on Computational Learning Theory, pp. 144, San Mateo, CA: Morgan Kaufmann.
[6] C. Cortes, V. Vapnik, "Support vector networks," Machine Learning, vol. 20, pp. 273.
[7] V. Vapnik, The Nature of Statistical Learning Theory, New York: Springer-Verlag, 1995.
[8] V. Vapnik, Statistical Learning Theory, New York: Wiley, 1998.
[9] N. Cristianini and J. Shawe-Taylor, An Introduction to Support Vector Machines, Cambridge, U.K.: Cambridge Univ. Press, 2000.
[10] C. Burges, "A tutorial on support vector machines for pattern recognition," Data Mining and Knowledge Discovery, vol. 2, pp. 121.
[11] B. Schölkopf, A. Smola, K. Müller, "Nonlinear component analysis as a kernel eigenvalue problem," Neural Computation, vol. 10, pp. 1299-1319, 1998.
[12] S. Mika, B. Schölkopf, A. J. Smola, K. R. Müller, M. Scholz, and G. Rätsch, "Kernel PCA and de-noising in feature spaces," Advances in Neural Information Processing Systems, 1999.
[13] F. R. Bach and M. I. Jordan, "Kernel independent component analysis," Journal of Machine Learning Research, vol. 3, pp. 1-48, 2002.
[14] S. Hao, S. Jegelka, and A. Gretton, "Fast kernel-based independent component analysis," IEEE Transactions on Signal Processing, vol. 57, iss. 9, pp. 3498-3511, 2009.
[15] W. Liu, P. Pokharel, J. C. Príncipe, "The kernel least mean square algorithm," IEEE Transactions on Signal Processing, vol. 56, no. 2, pp. 543-554, Feb. 2008.
[16] Y. Engel, S. Mannor, R. Meir, "The kernel recursive least-squares algorithm," IEEE Transactions on Signal Processing, vol. 52, no. 8, pp. 2275-2285, Aug. 2004.
[17] W. Liu, I. Park, Y. Wang, J. C. Príncipe, "Extended kernel recursive least squares algorithm," IEEE Transactions on Signal Processing, vol. 57, no. 10, pp. 3801-3814, Oct. 2009.
[18] B. Ristic, S. Arulampalam, N. Gordon, Beyond the Kalman Filter: Particle Filters for Tracking Applications, Artech House, 2004.
[19] A. Doucet, N. de Freitas, and N. Gordon, Sequential Monte Carlo Methods in Practice, New York: Springer, 2001.
[20] N. J. Gordon, D. J. Salmond, and A. F. M. Smith, "Novel approach to nonlinear/non-Gaussian Bayesian state estimation," Proc. Inst. Elect. Eng. F, vol. 140, pp. 107, Apr. 1993.
[21] J. Carpenter, P. Clifford, and P. Fearnhead, "Improved particle filter for non-linear problems," IEE Proc. Part F: Radar and Sonar Navigation, vol. 146, pp. 2, February 1999.
[22] W. R. Gilks and C. Berzuini, "Following a moving target - Monte Carlo inference for dynamic Bayesian models," Journal of the Royal Statistical Society, B, vol. 63, pp. 127, 2001.
[23] C. Musso, N. Oudjane, and F. Le Gland, "Improving regularized particle filters," in Sequential Monte Carlo Methods in Practice, New York: Springer, 2001.
[24] G. Kitagawa, "Monte Carlo filter and smoother for non-Gaussian non-linear state space models," Journal of Computational and Graphical Statistics, vol. 5, no. 1, pp. 1, 1996.
[25] R. E. Kalman, "A new approach to linear filtering and prediction problems," Transactions of the ASME, Ser. D, Journal of Basic Engineering, 82, pp. 34-45, 1960.
[26] S. Haykin, Kalman Filtering and Neural Networks. New York: Wiley, 2001.
[27] B. Anderson, J. Moore, Optimal Filtering. Prentice-Hall, 1979.
[28] J. K. Uhlmann, "Algorithms for multiple target tracking," American Scientist, vol. 80(2), pp. 128-141, 1992.
[29] G. Welch, G. Bishop, An Introduction to the Kalman Filter. Technical report, UNC-CH Computer Science Technical Report 95041, 1995.
[30] A. Jazwinski, Stochastic Processes and Filtering Theory. New York: Academic Press, 1970.
[31] P. Maybeck, Stochastic Models, Estimation and Control, vol. 1, New York: Academic Press, 1979.
[32] P. Maybeck, Stochastic Models, Estimation and Control, vol. 2, New York: Academic Press, 1982.
[33] S. J. Julier, J. K. Uhlmann, and H. Durrant-Whyte, "A new approach for filtering nonlinear systems," in Proceedings of the American Control Conference, 1995, pp. 1628.
[34] S. J. Julier and J. K. Uhlmann, "A general method for approximating nonlinear transformations of probability distributions," Technical Report, RRG, Department of Engineering Science, University of Oxford, November 1996.
[35] S. J. Julier and J. K. Uhlmann, "A new extension of the Kalman filter to nonlinear systems," Proc. of AeroSense: The 11th Int. Symp. on Aerospace/Defence Sensing, Simulation and Controls, 1997.
[36] E. A. Wan, R. V. Merwe, and A. T. Nelson, "Dual estimation and the unscented transformation," Advances in Neural Information Processing Systems, no. 12, pp. 666-672, MIT Press, 2000.
[37] E. A. Wan, R. V. Merwe, "The unscented Kalman filter for nonlinear estimation," Proc. of IEEE Symposium 2000 (AS-SPCC), Lake Louise, Alberta, Canada, Oct. 2000.
[38] I. Arasaratnam, S. Haykin, "Cubature Kalman filters," IEEE Transactions on Signal Processing, vol. 54, no. 6, pp. 1254-1269, Jun. 2009.
[39] I. Arasaratnam, S. Haykin, and T. R. Hurd, "Cubature Kalman filtering for continuous-discrete systems: theory and simulations," IEEE Transactions on Signal Processing, vol. 58, no. 6, pp. 4977-4993, Oct. 2010.
[40] A. Stroud, Approximate Calculation of Multiple Integrals, Englewood Cliffs, NJ: Prentice-Hall, 1971.
[41] R. Cools, "Computing cubature formulas: the science behind the art," Acta Numerica, vol. 6, pp. 1, Cambridge, U.K.: Cambridge University Press.
[42] E. A. Wan and A. T. Nelson, "Dual Kalman filtering methods for nonlinear prediction, estimation, and smoothing," in Advances in Neural Information Processing Systems 9. Cambridge, MA: MIT Press, 1997.
[43] E. A. Wan and A. T. Nelson, "Neural dual extended Kalman filtering: applications in speech enhancement and monaural blind signal separation," in Proceedings of IEEE Workshop on Neural Networks for Signal Processing, 1997.
[44] A. T. Nelson and E. A. Wan, "A two-observation Kalman framework for maximum-likelihood modeling of noisy time series," in Proceedings of International Joint Conference on Neural Networks, IEEE/INNS, May 1998.
[45] A. T. Nelson, Nonlinear Estimation and Modeling of Noisy Time-Series by Dual Kalman Filter Methods, PhD thesis, Oregon Graduate Institute of Science and Technology, 2000.
[46] E. A. Wan and A. T. Nelson, "Removal of noise from speech using the dual EKF algorithm," in Proceedings of International Conference on Acoustics, Speech, and Signal Processing, IEEE, May 1998.
[47] P. Zhu, B. Chen, J. C. Príncipe, "Extended Kalman filter using a kernel recursive least squares observer," in Proceedings of the International Joint Conference on Neural Networks, IEEE, pp. 1402-1408, 2011.
[48] A. Berlinet, C. Thomas-Agnan, Reproducing Kernel Hilbert Spaces in Probability and Statistics. Kluwer Academic Publishers, 2004.
[49] N. Aronszajn, "Theory of reproducing kernels," Trans. Amer. Math. Soc., vol. 68, pp. 337-404, 1950.
[50] T. Hofmann, B. Schölkopf, A. J. Smola, "A review of kernel methods in machine learning," Technical Report 156, Max-Planck-Institut für biologische Kybernetik, 2006.
[51] M. Sewell, Kernel Methods, Department of Computer Science, University College London, UK.
[52] B. Schölkopf and A. Smola, Learning with Kernels, Cambridge, MA: MIT Press, 2002.
[53] R. Herbrich, Learning Kernel Classifiers, Cambridge, MA: MIT Press, 2002.
[54] J. Shawe-Taylor and N. Cristianini, Kernel Methods for Pattern Analysis, Cambridge University Press, Cambridge, 2004.
[55] W. Liu, I. Park, J. C. Príncipe, "An information theoretic approach of designing sparse kernel adaptive filters," IEEE Transactions on Neural Networks, vol. 20, no. 12, pp. 1950-1961, Dec. 2009.
[56] L. Ralaivola and F. d'Alché-Buc, "Dynamical modeling with kernels for nonlinear time series prediction," Advances in Neural Information Processing Systems, vol. 16, 2004.
[57] L. Ralaivola and F. d'Alché-Buc, "Time series filtering, smoothing and learning using the kernel Kalman filter," Proceedings of 2005 IEEE International Joint Conference on Neural Networks, pp. 1449-1454, 2005.
[58] A. V. Rosti and M. Gales, "Generalised linear Gaussian models," Technical Report CUED/F-INFENG/TR.420, Cambridge University Engineering Department, 2001.
[59] G. H. Bakır, J. Weston, and B. Schölkopf, "Learning to find pre-images," in Neural Information Processing Systems, vol. 16, 2004.
[60] C. Baker, "Joint measures and cross-covariance operators," Transactions of the American Mathematical Society, 186, pp. 273-289, 1973.
[61] L. Song, J. Huang, A. Smola, K. Fukumizu, "Hilbert space embeddings of conditional distributions with applications to dynamical systems," Proceedings of the 26th International Conference on Machine Learning, Montreal, Canada, 2009.
[62] K. Fukumizu, F. R. Bach and M. I. Jordan, "Dimensionality reduction for supervised learning with reproducing kernel Hilbert spaces," Journal of Machine Learning Research, 5, pp. 73-99, 2004.
[63] B. K. Sriperumbudur, K. Fukumizu and G. R. G. Lanckriet, "On the relation between universality, characteristic kernels and RKHS embeddings of measures," Proceedings of the 13th International Conference on Artificial Intelligence and Statistics (AISTATS) 2010, Chia Laguna Resort, Sardinia, Italy. Vol. 9 of JMLR: W&CP 9.
[64] F. Takens, "Detecting strange attractors in turbulence," in D. A. Rand and L. S. Young, Eds., Dynamical Systems and Turbulence, Warwick, Lecture Notes in Mathematics, 898, 1980.
[65] M. Welling, Support Vector Regression, Department of Computer Science, University of Toronto, Toronto, Canada, 2004.
[66] A. J. Smola and B. Schölkopf, "A tutorial on support vector regression," NeuroCOLT2 Technical Report Series, NC2-TR-1998-030, Oct. 1998.
[67] R. Collobert and S. Bengio, "SVMTorch: support vector machines for large-scale regression problems," J. Machine Learning Res., vol. 1, pp. 143, 2001.
[68] K. Ikeda, "Multiple-valued stationary state and its instability of the transmitted light by a ring cavity system," Opt. Commun., 30, pp. 257, 1979.
[69] M. Shao and C. L. Nikias, "Signal processing with fractional lower order moments: stable processes and their applications," Proc. IEEE, vol. 81, pp. 986, July 1993.
[70] C. L. Nikias and M. Shao, Signal Processing with Alpha-Stable Distributions and Applications, New York: Wiley, July 1995.
[71] S. Haykin and J. C. Príncipe, "Making sense of a complex world," Signal Processing Magazine, vol. 15, no. 3, pp. 66, 1998.
[72] B. Sriperumbudur, A. Gretton, K. Fukumizu, G. Lanckriet and B. Schölkopf, "Injective Hilbert space embeddings of probability measures," in Conference on Learning Theory, 2008.
[73] A. Smola, A. Gretton, L. Song and B. Schölkopf, "A Hilbert space embedding for distributions," in Algorithmic Learning Theory, 2007.
[74] E. N. Lorenz, "Deterministic nonperiodic flow," J. Atmospher. Sci., vol. 20, pp. 130, 1963.
[75] W. Tucker, "A rigorous ODE solver and Smale's 14th problem," Found. Comput. Math., vol. 2, pp. 53, 2002.
[76] P. Zhu, B. Chen and J. C. Príncipe, "A novel extended kernel recursive least squares algorithm," Neural Networks, vol. 32, pp. 349, Aug. 2012.
[77] B. K. Sriperumbudur, K. Fukumizu and G. Lanckriet, "Universality, characteristic kernels and RKHS embedding of measures," Journal of Machine Learning Research, no. 12, pp. 2389, 2011.
[78] C. A. Micchelli, Y. Xu and H. Zhang, "Universal kernels," Journal of Machine Learning Research, no. 12, pp. 2651, 2006.
[79] K. Fukumizu, A. Gretton, X. Sun and B. Schölkopf, "Kernel measures of conditional dependence," Advances in Neural Information Processing Systems, Cambridge, MA: MIT Press, no. 20, pp. 489, 2008.
[80] K. Fukumizu, F. Bach and M. Jordan, "Kernel dimension reduction in regression," Annals of Statistics, no. 5, pp. 1871, 2009.
[81] H. Henderson and S. Searle, "On deriving the inverse of a sum of matrices," SIAM Review, vol. 23, no. 1, pp. 53, 1981.
[82] K. Fukumizu, L. Song and A. Gretton, "Kernel Bayes' rule," Advances in Neural Information Processing Systems, 2011.
[83] B. Chen, S. Zhao, P. Zhu and J. C. Príncipe, "Quantized kernel least mean square algorithm," IEEE Transactions on Neural Networks and Learning Systems, vol. 23, no. 1, pp. 22, Jan. 2012.
[84] B. Chen, S. Zhao, P. Zhu and J. C. Príncipe, "Quantized kernel recursive least squares algorithm," IEEE Transactions on Neural Networks and Learning Systems, accepted.
[85] H. Tong, Nonlinear Time Series: A Dynamical Systems Approach, Oxford University Press, New York, 1990.
[86] C. A. Micchelli, Y. Xu and H. Zhang, "Universal kernels," Journal of Machine Learning Research, no. 12, pp. 2651, 2006.
[87] G. Kimeldorf and G. Wahba, "Some results on Tchebycheffian spline functions," J. Math. Anal. Applic., vol. 33, pp. 82, 1971.
[88] B. Schölkopf, R. Herbrich and A. J. Smola, "A generalized representer theorem," Proceedings of the 14th Annual Conference on Computational Learning Theory, London, UK: Springer-Verlag, pp. 416, 2001.
[89] W. Liu, J. C. Príncipe and S. Haykin, Kernel Adaptive Filtering: A Comprehensive Introduction, John Wiley, pp. 12, 2010.
[90] J. Mercer, "Functions of positive and negative type, and their connection with the theory of integral equations," Phil. Trans. Roy. Soc. London, vol. 209, pp. 415, 1909.
[91] J. Michel, P. Balaban and K. S. Shanmugan, Simulation of Communication Systems, Second Edition, New York: Kluwer Academic/Plenum, 2000.
[92] D. Tuia, G. Camps-Valls, and M. Martinez-Ramon, "Explicit recursivity into reproducing kernel Hilbert spaces," IEEE-ICASSP, pp. 4148, 2011.
[93] R. J. Williams and D. Zipser, "Gradient-based learning algorithms for recurrent networks and their computational complexity," Backpropagation: Theory, Architectures, and Applications, pp. 433, Hillsdale, NJ: Lawrence Erlbaum.
[94] R. J. Williams and D. Zipser, "Experimental analysis of the real-time recurrent learning algorithm," Connection Science, 1, pp. 87, 1989.
[95] R. J. Williams and D. Zipser, "A learning algorithm for continually running fully recurrent neural networks," Connection Science, 1, pp. 270, 1989.
[96] S. Haykin, Neural Networks and Learning Machines (3rd Edition), Prentice Hall, 2008.
[97] J. L. Elman, "Finding structure in time," Cognitive Science, 14, pp. 179, 1990.
[98] J. L. Elman, E. A. Bates, M. Johnson, A. Karmiloff-Smith, D. Parisi, and K. Plunkett, Rethinking Innateness: A Connectionist Perspective on Development, Cambridge, MA: MIT Press.
[99] M. I. Jordan, "Attractor dynamics and parallelism in a connectionist sequential machine," Proceedings of the Eighth Annual Conference of the Cognitive Science Society, pp. 531, 1986.
[100] M. Arulampalam, S. Maskell, N. Gordon, and T. Clapp, "A tutorial on particle filters for online nonlinear/non-Gaussian Bayesian tracking," IEEE Transactions on Signal Processing, vol. 50, no. 2, Feb. 2002.
[101] B. P. Carlin, N. G. Polson, and D. S. Stoffer, "A Monte Carlo approach to nonnormal and nonlinear state-space modeling," J. Amer. Statist. Assoc., vol. 87, no. 418, pp. 493, 1992.
[102] N. Gordon, D. Salmond, and A. F. M. Smith, "Novel approach to nonlinear and non-Gaussian Bayesian state estimation," Proc. Inst. Elect. Eng., F, vol. 140, pp. 107, 1993.
[103] G. Kitagawa, "Monte Carlo filter and smoother for non-Gaussian nonlinear state space models," J. Comput. Graph. Statist., vol. 5, no. 1, pp. 1, 1996.

BIOGRAPHICAL SKETCH

Pingping Zhu was born in Wuhan, China. He received the B.S. degree in electrical and information engineering and the M.S. degree in pattern recognition and intelligent systems in 2006 and 2008, respectively, from Huazhong University of Science and Technology, Wuhan, China. He also received the M.S. and the Ph.D. degrees in electrical and computer engineering in 2010 and 2013, respectively, from the University of Florida, Gainesville, USA. He studied and developed fast (inverse) discrete cosine transform (DCT/IDCT) algorithms for video compression (MPEG-4) and fast modified (inverse) discrete cosine transform (MDCT/IMDCT) algorithms for the audio codec (G.711) at the Institute of Pattern Recognition and Artificial Intelligence, Huazhong University of Science and Technology. In 2009, he joined the Computational NeuroEngineering Laboratory at the University of Florida as a Ph.D. student. His research focuses on statistical signal processing, adaptive filtering, and machine learning.