
A Family of Minimum Renyi's Error Entropy Algorithm for Information Processing

Permanent Link: http://ufdc.ufl.edu/UFE0021428/00001

Material Information

Title: A Family of Minimum Renyi's Error Entropy Algorithm for Information Processing
Physical Description: 1 online resource (155 p.)
Language: english
Creator: Han, Seungju
Publisher: University of Florida
Place of Publication: Gainesville, Fla.
Publication Date: 2007

Subjects

Subjects / Keywords: entropy, fgt, ifgt, mee, meefp, meesas, nmee
Electrical and Computer Engineering -- Dissertations, Academic -- UF
Genre: Electrical and Computer Engineering thesis, Ph.D.
bibliography   ( marcgt )
theses   ( marcgt )
government publication (state, provincial, territorial, dependent)   ( marcgt )
born-digital   ( sobekcm )
Electronic Thesis or Dissertation

Notes

Abstract: Adaptive systems are self-adjusting and seek the optimum in a continuous way, thus becoming less dependent on a priori knowledge. However, the input signal statistics play an important role in selecting the appropriate cost function for optimization. Recently, the error entropy criterion with a nonparametric estimator for Renyi's quadratic definition has been proposed by Principe, Erdogmus and coworkers as an alternative to mean square error (MSE) in supervised adaptation. For instance, minimum error entropy (MEE) has been shown to be a more robust criterion for dynamic modeling and an alternative to MSE in other supervised learning applications using nonlinear systems. The major goal of our research was to extend their work, improving the MEE algorithm and demonstrating its superior performance in many practical applications that concern adaptive signal processing. We proposed four new algorithms: minimum error entropy with self adjusting step-size (MEE-SAS), normalized minimum error entropy (NMEE), fixed-point minimum error entropy (MEE-FP), and fast minimum error entropy (fast MEE) with the fast Gauss transform (FGT) and the improved fast Gauss transform (IFGT). First, MEE-SAS provides a natural 'target' that is available to automatically control the algorithm step size. We attribute the self adjusting step size property of MEE-SAS to its changing curvature, as opposed to MEE, which has a constant curvature. Therefore, MEE-SAS has a faster speed of convergence than the MEE algorithm for the same misadjustment. However, in a non-stationary environment, MEE-SAS loses its tracking ability due to the 'flatness' of the curvature near the optimal solution. We solved this problem by proposing a switching scheme between the MEE and MEE-SAS algorithms for non-stationary scenarios, which effectively combines the speed of MEE-SAS when far from the optimal solution with the tracking ability of MEE when near the solution. Second, NMEE, which aims at minimizing the weight change subject to the constraint of optimal information potential, performs better than MEE with respect to three major points: it is less sensitive to the input power and the kernel size, and it converges faster. Third, MEE-FP utilizes the first order optimality condition of the error entropy and fixed-point iteration. Since this algorithm is a second-order update similar to recursive least squares (RLS), it speeds up convergence irrespective of the eigenvalue spread of the input correlation matrix. The original error entropy criteria estimated using Parzen windowing have a higher computational complexity of O(N^2) compared with MSE, where N is the number of samples in the training set. Therefore, the fourth contribution is the fast MEE methods with FGT and IFGT, which alleviate this problem by accurate and efficient computation of entropy using the Hermite expansion and the Taylor expansion in O(pN), where p is the order of the expansion approximation. Although the MEE cost function is particularly applicable to nonlinear signal processing, in our research we used linear system problems to demonstrate the convergence properties of the new entropy based algorithms and to compare them with their MSE counterparts. In the application chapter we addressed the two main application domains of the proposed algorithms: linear or nonlinear model fitting in the presence of impulsive noise, and nonlinear system identification.
General Note: In the series University of Florida Digital Collections.
General Note: Includes vita.
Bibliography: Includes bibliographical references.
Source of Description: Description based on online resource; title from PDF title page.
Source of Description: This bibliographic record is available under the Creative Commons CC0 public domain dedication. The University of Florida Libraries, as creator of this bibliographic record, has waived all rights to it worldwide under copyright law, including all related and neighboring rights, to the extent allowed by law.
Statement of Responsibility: by Seungju Han.
Thesis: Thesis (Ph.D.)--University of Florida, 2007.
Local: Adviser: Principe, Jose C.
Electronic Access: RESTRICTED TO UF STUDENTS, STAFF, FACULTY, AND ON-CAMPUS USE UNTIL 2009-08-31

Record Information

Source Institution: UFRGP
Rights Management: Applicable rights reserved.
Classification: lcc - LD1780 2007
System ID: UFE0021428:00001



Full Text


... who have encouraged and supported me since I was in the cradle

ACKNOWLEDGMENTS

I express my gratitude to my supervisor, Dr. Jose C. Principe, for his patient guidance and invaluable advice, for numerous discussions and encouragement throughout the course of the research. I also thank all the members of my advisory committee, Dr. John G. Harris, Dr. K. Clint Slatton and Dr. William Hager, for their valuable time and interest in serving on my supervisory committee, as well as their comments. Also, I thank Dr. Deniz Erdogmus for the fruitful discussions on my research. I thank Sudhir Rao, Kyu-Hwa Jeong, Antonio Paiva, Jianwu Xu and Puskal Pokharel, my friends and colleagues at CNEL, whose contributions to this research have been tremendous. I also extend my acknowledgements to all the members of CNEL for their companionship and support throughout the time spent working on my Ph.D. research. Finally, I express my greatest gratitude to my family, especially my father and mother, for their relentless support and love.

TABLE OF CONTENTS

ACKNOWLEDGMENTS
LIST OF TABLES
LIST OF FIGURES
LIST OF ABBREVIATIONS
ABSTRACT

CHAPTER

1 INTRODUCTION
  1.1 Historical Background
  1.2 Motivation and Contribution

2 MINIMUM ERROR ENTROPY (MEE)
  2.1 Minimization of Renyi's Error Entropy
  2.2 Nonparametric Estimator for Renyi's Quadratic Error Entropy
  2.3 MEE Criterion and Gradient Search Algorithm
  2.4 Gradient and Hessian Relation between MSE and Error Entropy
  2.5 Simulations and Discussion
    2.5.1 First Study: Performance Surface of MSE and Error Entropy
    2.5.2 Second Study: System Identification (MA(9)) in Impulsive Noise Environments

3 MINIMUM ERROR ENTROPY WITH SELF ADJUSTING STEP-SIZE (MEE-SAS)
  3.1 MEE-SAS Criterion and Gradient Search Algorithm
  3.2 Structural Analysis of Convergence
  3.3 Simulations and Discussion I
    3.3.1 First Study: Curvature Analysis of MEE and MEE-SAS
    3.3.2 Second Study: System Identification (MA(9))
    3.3.3 Third Study: Chaotic Time Series Prediction
  3.4 Switching Scheme between MEE and MEE-SAS
  3.5 Simulations and Discussion II
    3.5.1 First Study: Chaotic Time Series Prediction
    3.5.2 Second Study: System Identification
    3.5.3 Third Study: Acoustic Echo Cancellation

4 NORMALIZED MINIMUM ERROR ENTROPY (NMEE)
  4.1 NMEE as the Solution to a Constrained Optimization Problem
  4.2 ...
    4.2.1 NMEE vs. NLMS
    4.2.2 NMEE vs. Affine Projection Adaptive (APA) Filter
  4.3 Simulations and Discussion
    4.3.1 First Study: Dependence on Input Power (MA(9))
    4.3.2 Dependence on Kernel Size and Speed of Adaptation (MA(9))
    4.3.3 Second Study: System Identification (AR(1))

5 FIXED-POINT MINIMUM ERROR ENTROPY (MEE-FP)
  5.1 Derivation of MEE-FP
  5.2 Convergence Analysis of MEE-FP
  5.3 Recursive MEE-FP
    5.3.1 Recursive Estimate of the Fixed-Point Update
    5.3.2 Inversion Lemma
  5.4 Computational Complexity
  5.5 Simulations and Discussion
    5.5.1 First Study: System Identification (MA(9))
    5.5.2 Second Study: System Identification (AR(1))
    5.5.3 Third Study: Effect of Eigenvalue Spread (MA(2))

6 FAST MINIMUM ERROR ENTROPY
  6.1 Estimating an Information Potential with the Fast Algorithms
    6.1.1 Fast Gauss Transform (FGT)
    6.1.2 Improved Fast Gauss Transform (IFGT)
  6.2 MEE using FGT and IFGT
  6.3 Simulations and Discussion
    6.3.1 First Study: Entropy Estimation using FGT and IFGT
    6.3.2 Second Study: System Identification (MA(9))

7 ROBUSTNESS OF ENTROPY COST FUNCTION
  7.1 Limitation of MSE Criterion based Algorithm in Impulsive Noise
  7.2 Outlier Rejection Property of MEE
  7.3 Non-Gaussian Noise Models
    7.3.1 Gaussian Mixture (GM) Models
    7.3.2 Alpha-Stable Models
  7.4 Acoustic Echo Cancellation in GM Noise Environments
    7.4.1 System and Noise Model
    7.4.2 Simulations and Discussion I
  7.5 Adaptive Beamforming in Alpha-Stable Noise Environments
    7.5.1 Beamforming Problem
      7.5.1.1 System and Noise Model
      7.5.1.2 Minimum Variance Beamforming
      7.5.1.3 Least Mean Square and p-norm Algorithms
    7.5.2 Minimum Output Entropy (MOE) and MOE-SAS

8 APPLICATIONS
  8.1 Adaptive Wireless Channel Tracking
    8.1.1 Channel Tracking Problem
      8.1.1.1 Wireless Channel Model
      8.1.1.2 System and Noise Model
    8.1.2 Simulations and Discussion I
  8.2 Nonlinear System Identification
    8.2.1 System Model and Training Algorithms
    8.2.2 Simulations and Discussion II

9 CONCLUSIONS AND FUTURE WORK
  9.1 Conclusions
  9.2 Future Work

LIST OF REFERENCES
BIOGRAPHICAL SKETCH

LIST OF TABLES

1-1 Corresponding algorithms based on MSE and error entropy
2-1 Gradient and Hessian relation between MSE and error entropy
3-1 Turning point of curvature for MEE-SAS
9-1 Summary of proposed algorithms

LIST OF FIGURES

2-1 Block diagram of adaptive system
2-2 Performance surface and contour of MSE (J(e) = E[e^2])
2-3 Performance surface and contour plot of error entropy (J(e) = V(0) - V(e))
2-4 Performance surface and contour plot of error entropy (normalized IP) for various choices of kernel size
2-5 Weight SNR of LMS, MLMS and MEE in Gaussian measurement noise
2-6 Weight SNR of LMS, MLMS and MEE in non-Gaussian measurement noise
2-7 Output signals of LMS, MLMS and MEE in non-Gaussian measurement noise
3-1 Performance surface of MEE and MEE-SAS
3-2 Contour and gradient plot of MEE and MEE-SAS for sigma = 0.35
3-3 Contour and gradient difference between MEE and MEE-SAS for various choices of kernel size
3-4 Contour and gradient difference between MEE and MEE-SAS for three different measurement noises
3-5 Average weight error power of MEE and MEE-SAS for critically fitted model
3-6 Average weight error power of MEE and MEE-SAS for underfitted model
3-7 Online prediction learning of MEE and MEE-SAS for Mackey-Glass time series
3-8 Error probability density of MEE and MEE-SAS for last 200 samples
3-9 Two of the six weight tracks for MEE and MEE-SAS
3-10 Prediction performance of MEE and MEE-SAS for the Mackey-Glass time series
3-11 Performance of switching scheme on chaotic time series prediction
3-12 Performance of switching scheme on system identification
3-13 Performance of switching scheme on acoustic echo cancellation
4-1 Average weight error power of MEE, NMEE and NLMS for MA(9) with different input powers (P = 1, 10)
4-2 Average weight error power for MA(9) with different kernel sizes (solid lines: NMEE, dashed lines: MEE)
4-3 ...
4-4 Average weight error power of MEE and NMEE for AR(1) in case of best results
5-1 Convergence of the recursive MEE-FP for MA(9) with different values of the forgetting factor
5-2 Convergence of the recursive MEE-FP for MA(9) with different values of the window length
5-3 Convergence of the recursive MEE-FP for MA(9) with different values of the kernel size
5-4 Convergence performance of the recursive MEE and the recursive MEE-FP for MA(9)
5-5 Convergence performance of the recursive MEE and recursive MEE-FP for AR(1)
5-6 Performance of MEE and MEE-FP for eigenvalue spread (S = 1.22)
5-7 Performance of MEE and MEE-FP for eigenvalue spread (S = 10)
5-8 Performance of RLS and MEE-FP for two different eigenvalue spreads (S = 1.22 and 10)
6-1 Absolute error and running times for a given expansion order (p = 3 and 6) vs. the number of clusters (K)
6-2 Absolute error and running times for a given number of clusters (K = 20) vs. the order of expansion (p)
6-3 Absolute error and running times for a given number of clusters (K = 20) and expansion order (p = 5) vs. the number of samples
6-4 Comparison of MEE, fast MEE with FGT and IFGT for system identification (MA(9))
7-1 Characteristics of rho(x) and d rho(x)/dx
7-2 Samples of Gaussian mixture model with sigma^2 = 1.0, 10^2 and 10^4
7-3 Samples of three alpha-stable processes with alpha = 2.0, 1.5 and 1.0 (beta = 0, gamma = 1/2, delta = 0)
7-4 Block diagram of AEC
7-5 Room impulse response
7-6 ERLE of LMS and MEE at sigma_o^2 = 10^-4 (SNR = 34.22 dB)
7-7 ...
7-8 ERLE and output signals of LMS and MEE at sigma_o^2 = 10 (SNR = -2.52 dB)
7-9 ERLE and output signals of LMS and MEE at sigma_o^2 = 10^8 (SNR = -73.03 dB)
7-10 ERLE of LMS and MEE for epsilon = 0.1 (10%), epsilon = 0.2 (20%), and epsilon = 0.4 (40%) outliers
7-11 BER performance of MOE and MOE-SAS with four different kernel sizes
7-12 Comparisons of the beam pattern at alpha = 2.0 (SNR = 15 dB)
7-13 Comparison of BER performance at different characteristic exponent levels
7-14 Comparisons of the beam pattern at alpha = 1.5 and alpha = 1.0
8-1 A Doppler faded channel realization with the fading rate f_D T_s = 0.001
8-2 Weight SNR in Gaussian measurement noise (p_n = N_C(0, 10^-6))
8-3 Weight SNR at the different SNR (from 17 dB to -33 dB)
8-4 Weight SNR at varsigma^2 = 10^-5 and varsigma^2 = 10^-1
8-5 Identification performance for alpha = 1
8-6 Identification performance for alpha = 2

LIST OF ABBREVIATIONS

AEC      Acoustic Echo Canceller
APA      Affine Projection Adaptive (filter)
AR       AutoRegressive (model)
AWGN     Additive White Gaussian Noise
BER      Bit Error Rate
BPSK     Binary Phase Shift Keying
CMACE    Correntropy Minimum Average Correlation Energy
CNEL     Computational NeuroEngineering Laboratory
ERLE     Echo Return Loss Enhancement
EW-RLS   Exponentially Weighted Recursive Least Square
EX-RLS   EXtended Recursive Least Square
FGT      Fast Gauss Transform
FIR      Finite Impulse Response
ICA      Independent Component Analysis
IFGT     Improved Fast Gauss Transform
i.i.d.   independent and identically distributed
IIR      Infinite Impulse Response
IP       Information Potential
ITL      Information Theoretic Learning
LMF      Least Mean Fourth
LMP      Least Mean P-norm
LMS      Least Mean Square
MA       Moving-Average (model)
MCC      Maximum Correntropy Criterion
MEE      Minimum Error Entropy
MEEF     Minimum Error Entropy with Fiducial points

ABSTRACT

Adaptive systems are self-adjusting and seek the optimum in a continuous way, thus becoming less dependent on a priori knowledge. However, the input signal statistics play an important role in selecting the appropriate cost function for optimization. Recently, the error entropy criterion with a nonparametric estimator for Renyi's quadratic definition has been proposed by Principe, Erdogmus and coworkers as an alternative to mean square error (MSE) in supervised adaptation. For instance, minimum error entropy (MEE) has been shown to be a more robust criterion for dynamic modeling and an alternative to MSE in other supervised learning applications using nonlinear systems.

The major goal of our research was to extend their work, improving the MEE algorithm and demonstrating its superior performance in many practical applications that concern adaptive signal processing. We proposed four new algorithms: minimum error entropy with self adjusting step-size (MEE-SAS), normalized minimum error entropy (NMEE), fixed-point minimum error entropy (MEE-FP), and fast minimum error entropy (fast MEE) with the fast Gauss transform (FGT) and the improved fast Gauss transform (IFGT).

First, MEE-SAS provides a natural 'target' that is available to automatically control the algorithm step size. We attribute the self adjusting step size property of MEE-SAS to its changing curvature, as opposed to MEE, which has a constant curvature. Therefore, MEE-SAS has a faster speed of convergence than the MEE algorithm for the same misadjustment. However, in a non-stationary environment, MEE-SAS loses its tracking ability due to the 'flatness' of the curvature near the optimal solution. We solved this problem by proposing a switching scheme between MEE and MEE-SAS for non-stationary scenarios, which effectively combines the speed of MEE-SAS when far from the optimal solution with the tracking ability of MEE when near the solution.

Second, NMEE, which aims at minimizing the weight change subject to the constraint of optimal information potential, performs better than MEE with respect to three major points: it is less sensitive to the input power and the kernel size, and it converges faster.

Third, MEE-FP utilizes the first order optimality condition of the error entropy and fixed-point iteration. Since this algorithm is a second-order update similar to recursive least squares (RLS), it speeds up convergence irrespective of the eigenvalue spread of the input correlation matrix.

The original error entropy criteria estimated using Parzen windowing have a higher computational complexity of O(N^2) compared with MSE, where N is the number of samples in the training set. Therefore, the fourth contribution is the fast MEE methods with FGT and IFGT, which alleviate this problem by accurate and efficient computation of entropy using the Hermite expansion and the Taylor expansion in O(pN), where p is the order of the expansion approximation.

Although the MEE cost function is particularly applicable to nonlinear signal processing, in our research we used linear system problems to demonstrate the convergence properties of the new entropy based algorithms and to compare them with their MSE counterparts. In the application chapter we addressed the two main application domains of the proposed algorithms: linear or nonlinear model fitting in the presence of impulsive noise, and nonlinear system identification.

CHAPTER 1
INTRODUCTION

1.1 Historical Background

For many years, the adaptive signal processing community has been using mean square error (MSE) as the optimality criterion [1], [2]. The main reason for the wide use of MSE lies in the various analytical and computational simplicities it brings, coupled with the minimization of the error energy, which makes sense in the framework of linear signal processing. However, from a statistical point of view, MSE only takes into account the second order statistics and is therefore only optimal in the case of Gaussian signals and linear filters.

In an effort to take into account higher order statistics, the mean fourth error (MFE) and its family of cost functions were proposed by Walach and Widrow [3]. MFE and its higher order counterparts have faster adaptation for additive noise having a light-tailed probability distribution function (PDF), but are stable only in a very narrow range, and a proper selection of the learning rate is crucial. To overcome this difficulty, a linear combination of the cost functions of the least mean square (LMS) and the least mean fourth (LMF) filters using a single mixing parameter between 0 and 1 has been proposed [4], [5]. Many variations of these filters have already been developed by adaptively estimating the optimal parameter or by recursively estimating the cost function [6].

Shannon [7], [8] was the first to define entropy as the average information of a random process and to establish a profound theory around it, with specific applications and implications for digital communications. Moreover, Alfred Renyi [9], [10] showed that Shannon's entropy was in fact a special case of a more general family of entropies, which is now called Renyi's entropy. However, while Shannon's entropy was widely recognized due to its significant implications in communications theory, Renyi's entropy was not recognized as a useful tool by researchers in engineering and other fields until recently. In the 1990s, some interest developed in Renyi's entropy in different fields, including pattern recognition [11] and cryptology [12].

In the late nineties, Principe and his co-workers at CNEL made the initial attempts to use Renyi's entropy for adaptation [13], [14], [15], [16], [17]. They successfully applied Renyi's entropy and other derivative optimality criteria to problems of blind source separation, dimensionality reduction, feature extraction, etc. Although many others used Shannon's definitions of information theoretic criteria for adaptation processes [18], [19], [20], Principe was the first to introduce the terminology information theoretic learning (ITL) into the adaptive systems literature. Recently, the Renyi's error entropy criterion has been utilized as an alternative to MSE in supervised adaptation by Principe, Erdogmus and coworkers [21], [22]. For instance, minimum error entropy (MEE) has been shown to be a more robust criterion for dynamic modeling [23] and an alternative to MSE in other supervised learning applications using nonlinear systems [21]. More recently, a new generalized correlation function, called correntropy, has been introduced by the CNEL group [24]. Correntropy is a positive definite function, which measures a nonlinear similarity between random variables (or stochastic processes), and it involves high-order statistics

[25]. The maximum correntropy criterion (MCC) has the advantage that it is a local criterion of similarity, and it should be very useful for cases when the measurement noise is non-zero mean, non-Gaussian, with large outliers.

In our research, we extend their work, improving the MEE algorithm and the MCC algorithm and demonstrating their superior performance in many practical applications that concern adaptive signal processing.

1.2 Motivation and Contribution

In many adaptive signal processing problems such as system identification [26], noise canceling [27] and channel equalization [28], the goal is to minimize the difference between the desired and the system outputs by learning mechanisms. These learning mechanisms have three major concerns. The first is the architecture of the adaptive filter, the second is the criterion, and the third is the learning algorithm. To define a suitable architecture for the adaptive filter, we first identify the category of the adaptive signal problem. A suitable criterion is determined depending on the assumptions on the statistical behavior of the input signal. The learning algorithm finds the best possible solution by optimizing the criterion under some constraints. Optimization theory has provided us with a variety of learning techniques possessing different degrees of complexity and robustness.

With the basic adaptive finite impulse response (FIR) filter structure, MSE yields a simple optimization problem, whose analytical solution is provided by the Wiener-Hopf equation [2]. Following this, algorithms for iteratively approximating the optimal solution, including the steepest descent approach (such as LMS, LMF and NLMS) and the second-order optimization techniques (such as RLS), have been proposed and analyzed [2], [29]. The least mean square (LMS) and the recursive least squares (RLS) algorithms are the most widely recognized variants of these algorithms.

In this research, we use the entropy criterion as an alternative to MSE in supervised adaptation and derive several algorithms for this cost function. Table 1-1 shows the corresponding algorithms based on MSE and on the error entropy.

Table 1-1. Corresponding algorithms based on MSE and error entropy

    MSE     Error Entropy
    LMS     MEE (SIG)
    LMF     MEE-SAS
    NLMS    NMEE
    RLS     MEE-FP
            Fast MEE

First, we propose the minimum error entropy with self adjusting step-size (MEE-SAS) algorithm [30] to accelerate the search for the optimal solution. We attribute the self adjusting step size property of MEE-SAS to its changing curvature, as opposed to MEE, which has a constant curvature.

Second, we propose a normalized minimum error entropy (NMEE) algorithm [31]. Following the same rationale that led to the normalized least mean square (NLMS), the weight update adjustment for minimum error entropy (MEE) is constrained by the principle of minimum disturbance. Further, we show that the algorithm not only is insensitive to the power of the input, but is also faster than MEE for the same misadjustment, and that it is less sensitive to the kernel size.

Third, we propose a fixed-point minimum error entropy (MEE-FP) algorithm [32] as an alternative to the minimum error entropy (MEE) algorithm for training adaptive systems. The fixed-point algorithms are different from gradient methods like MEE, and are proven to be faster and step-size free. This characteristic is due to the second order update, similar to the recursive least squares (RLS), that tracks the Wiener solution with every update.

Fourth, we propose fast MEE algorithms using the fast Gauss transform (FGT) [33] and the improved fast Gauss transform (IFGT). We exemplify here the case of the minimum error entropy criterion to train adaptive systems. The FGT and the IFGT reduce the complexity of the estimation from O(N^2) to O(pkN), where p is the order of the Hermite and the Taylor approximation and k the number of clusters utilized in FGT. Further, we show that FGT converges to the actual entropy value rapidly with increasing order p, unlike the Stochastic Information Gradient, the present O(pN) approximation to reduce the computational complexity in ITL.

Fifth, we investigate the impulse noise and outlier rejection capabilities of the minimum error entropy algorithm. While the LMS algorithm is not an appropriate one due to its nonrobustness against "outliers" introduced by impulsive noise, MEE is very robust to impulsive noise due to its M-estimator property, derived from the fact that MEE constrains the error entropy.

Finally, we address the two main application domains of the proposed entropy based algorithms: linear or nonlinear model fitting in the presence of impulsive noise, and nonlinear system identification.

CHAPTER 2
MINIMUM ERROR ENTROPY (MEE)

Many adaptive signal processing problems such as system identification [26], noise canceling [27] and channel equalization [28] are typically solved in the framework of Figure 2-1, where the aim is to minimize the difference between the desired and the system outputs. Minimization of MSE in the criterion block simply constrains the square difference between the original trajectory and the trajectory created by the adaptive system, which does not guarantee the capturing of all the information about the underlying dynamics. In such situations, it is necessary to consider the amount of information lost to the error signal, and it is logical to minimize this information. This is achieved when the error entropy is minimized, for entropy is the average information of a random variable.

2.1 Minimization of Renyi's Error Entropy

Renyi's entropy of order \alpha for the error e is defined as

    H_\alpha(e) = \frac{1}{1-\alpha} \log \int f_{e,w}^{\alpha}(e) \, de    (2-1)

Figure 2-1. Block diagram of adaptive system

Minimizing Renyi's error entropy can be shown to be equivalent to minimizing a divergence between the joint densities of the input-output and the input-desired signal pairs [21]:

    \min_w H_\alpha(e) = \min_w \frac{1}{1-\alpha} \log \int f_{e,w}^{\alpha}(e)\,de
    \;\Longleftrightarrow\; \min_w \iint f_{xy,w}(x,y) \left( \frac{f_{xd}(x,y)}{f_{xy,w}(x,y)} \right)^{1-\alpha} dx\,dy    (2-2)

We recognize this last expression as the Csiszar distance with the convex function chosen to be (\cdot)^{1-\alpha}. Taking the limit of this expression as \alpha \to 1 using L'Hopital's rule, we obtain the Kullback-Leibler divergence.

The evaluation of the error entropy from the training data samples directly borrows from kernel density estimation, also referred to as Parzen windowing, which is a well-understood and useful nonparametric technique. For a given set of i.i.d. error samples {e_1, ..., e_N} drawn from the original distribution p(e), the Parzen window estimate of the distribution, assuming a fixed-size kernel function K_\sigma(e) for simplicity, is given by

    \hat{p}(e) = \frac{1}{N} \sum_{i=1}^{N} K_\sigma(e - e_i)    (2-3)
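As a concrete illustration, equation (2-3) maps directly onto a few lines of code. The following minimal numpy sketch (illustrative only, not the dissertation's code; the function name and the Gaussian kernel choice are ours) evaluates the Parzen estimate at query points:

    import numpy as np

    def parzen_pdf(x, samples, sigma):
        # Parzen window estimate p_hat(x) = (1/N) sum_i K_sigma(x - e_i),
        # with a Gaussian kernel K_sigma (equation 2-3).
        x = np.atleast_1d(np.asarray(x, dtype=float))
        d = x[:, None] - np.asarray(samples, dtype=float)[None, :]
        k = np.exp(-d**2 / (2.0 * sigma**2)) / (sigma * np.sqrt(2.0 * np.pi))
        return k.mean(axis=1)

    # example: estimate the density of 1000 Gaussian error samples at a few points
    rng = np.random.default_rng(0)
    e = rng.normal(0.0, 1.0, 1000)
    print(parzen_pdf([-1.0, 0.0, 1.0], e, sigma=0.3))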

The kernel size may be chosen, for example, by Silverman's rule [34]. For a given kernel function, the Parzen window estimator exhibits the usual properties of a kernel density estimate (the kernel is non-negative and integrates to one). These conditions guarantee that, for analytic probability distribution functions, the Parzen window estimate is asymptotically unbiased and consistent (using a suitable annealing rate for the kernel size).

We will now treat the nonparametric estimation of Renyi's quadratic entropy. It is a special case of this generalized estimator corresponding to \alpha = 2 with a Gaussian kernel function. Without loss of generality, only the Gaussian kernel will be needed in our study. Renyi's quadratic entropy estimator belongs to this family of entropy estimators, with an exact evaluation of the integral. Substituting equation (2-3) in Renyi's entropy definition (2-1) with \alpha = 2, we obtain the following nonparametric kernel entropy estimator,

    \hat{H}_2(e) = -\log \left( \frac{1}{N^2} \sum_{i=1}^{N} \sum_{j=1}^{N} K_{\sigma\sqrt{2}}(e_j - e_i) \right) = -\log V(e)    (2-4)

where V(e) is called the quadratic information potential. Note that the information potential for the continuous random variable can be exactly estimated by the double sum over the samples due to the well-known property that the integral of a product of Gaussians is still a Gaussian, but with larger variance.
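The double-sum estimator (2-4) is likewise straightforward to compute directly. Here is a sketch under the same caveats (illustrative, names assumed), using the fact that the kernel inside the double sum has size sigma*sqrt(2):

    import numpy as np

    def quadratic_ip(e, sigma):
        # V(e) = (1/N^2) sum_i sum_j K_{sigma*sqrt(2)}(e_j - e_i)   (equation 2-4)
        e = np.asarray(e, dtype=float)
        s = sigma * np.sqrt(2.0)                   # width after convolving two kernels
        d = e[:, None] - e[None, :]
        return (np.exp(-d**2 / (2.0 * s**2)) / (s * np.sqrt(2.0 * np.pi))).mean()

    def renyi_quadratic_entropy(e, sigma):
        # H_2(e) = -log V(e)
        return -np.log(quadratic_ip(e, sigma))

    rng = np.random.default_rng(1)
    print(renyi_quadratic_entropy(rng.normal(0, 1.0, 500), sigma=1.0))   # spread errors
    print(renyi_quadratic_entropy(rng.normal(0, 0.1, 500), sigma=1.0))   # concentrated errors: lower entropy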

The Gaussian kernel that results from this convolution is

    K_{\sigma\sqrt{2}}(x) = \frac{1}{2\sigma\sqrt{\pi}} \exp\!\left( -\frac{x^2}{4\sigma^2} \right)

[21]. As stated above, one problem of MEE is that it has degenerate minima because it is insensitive to the mean of the error (constant c). We have two ways to solve the problem. The first way is to correct the bias of the system output so as to yield zero mean error over the training data set just after training ends. The other way is to add an MCC term to the MEE cost function, weighted by a constant between 0 and 1 that balances the two terms. This cost function is called Minimum Error Entropy with Fiducial points (MEEF) [35]. The MEE term minimizes the error entropy and the MCC term anchors the mean of the error at zero.

2.3 MEE Criterion and Gradient Search Algorithm

From equation (2-4), minimizing the entropy is equivalent to maximizing the information potential, since the log is a monotonic function. Therefore, the cost function J(e) for the MEE criterion is given by

    J(e) = \max_w V(e)

Since the information potential is smooth and differentiable because of the Gaussian kernel properties, we can use its gradient vector in a steepest ascent algorithm,

    w(n+1) = w(n) + \mu \nabla V(e)    (2-8)

where \nabla V(e) denotes the gradient of the information potential; assuming a Gaussian kernel, the gradient is

    \nabla V(e) = \frac{1}{2N^2\sigma^2} \sum_{i=1}^{N} \sum_{j=1}^{N} (e_j - e_i)\, K_{\sigma\sqrt{2}}(e_j - e_i)\, (u_j - u_i)    (2-9)

This cost function is not parabolic in the weights. However, the selection of a smooth kernel function with a sufficiently large kernel size allows a quadratic approximation of the cost function to be valid on a neighborhood of the optimum, which motivates us to employ a Taylor series expansion, truncated at the linear term for the gradient, around the optimal weight vector w*:

    V(e) \approx V_{w^*}(e) + \frac{1}{2} \tilde{w}(n)^T R\, \tilde{w}(n)    (2-10)

where \tilde{w}(n) = w^* - w(n) and R := \nabla^2 V(e).

Now that we have a valid quadratic approximation for the cost function around the optimum and a linear approximation for the weight update equations, we can borrow the well-known convergence analysis results from MSE convergence theory, and replace the eigenvalues of the input covariance matrix (the autocorrelation matrix in the FIR filter case) with the eigenvalues of the Hessian matrix for the entropy criterion.

This leads to the following upper bound on the step size of the steepest ascent algorithm for stable convergence to the optimal solution:

    0 < \mu < \frac{1}{|\lambda_k|}    (2-11)

where \lambda_k is the largest eigenvalue of the quadratic approximation to the MEE cost function.

A batch estimation of the gradient over N samples provides a simple estimate, but notice that this procedure is O(N^2). For online training methods, the gradient can instead be estimated stochastically, as discussed next.
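To make the batch update (2-8) with the gradient (2-9) concrete, here is a minimal sketch for an FIR filter (illustrative only; the toy plant, step size and sample counts are assumptions, not values from the text):

    import numpy as np

    def ip_gradient(w, U, d, sigma):
        # Equation (2-9): (1/(2 N^2 sigma^2)) sum_ij (e_j - e_i) K_{sigma*sqrt2}(e_j - e_i) (u_j - u_i)
        e = d - U @ w                                  # errors for all N samples
        N = len(e)
        s = sigma * np.sqrt(2.0)
        de = e[None, :] - e[:, None]                   # e_j - e_i
        k = np.exp(-de**2 / (2 * s**2)) / (s * np.sqrt(2 * np.pi))
        du = U[None, :, :] - U[:, None, :]             # u_j - u_i
        return np.einsum('ij,ijm->m', k * de, du) / (2 * N**2 * sigma**2)

    # steepest ascent on V(e): w(n+1) = w(n) + mu * grad V   (equation 2-8)
    rng = np.random.default_rng(2)
    w_true = rng.normal(size=4)                        # toy plant (assumed)
    U = rng.normal(size=(200, 4))
    d = U @ w_true
    w = np.zeros(4)
    for _ in range(300):
        w += 0.5 * ip_gradient(w, U, d, sigma=1.0)
    print(np.round(w - w_true, 3))                     # weight error shrinks toward 0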

For online training, the stochastic information gradient (SIG) is used:

    \nabla \hat{V}(e(n)) = \frac{1}{2\sigma^2 L} \sum_{i=n-L}^{n-1} (e(n) - e(i))\, K_{\sigma\sqrt{2}}(e(n) - e(i))\, (u(n) - u(i))    (2-12)

[36]. Here the outer summation is dropped to get a stochastic version of the information gradient, and the sum is taken over the most recent L samples at time k. Thus, for a filter order of length M, the complexity of MEE is equal to O(ML) per weight update.

The selection of the kernel size is an important step in estimating the information potential and is critical to the success of these information theoretic criteria. In particular, increasing the kernel size leads to a stretching effect on the performance surface in the weight space, which results in increased accuracy of the quadratic approximation around the optimal point [37]. So, we use a large enough kernel size during the adaptation process to guarantee that the operating point lies in the convex hull, and anneal it during training [21]. In practice, an appropriate kernel size will be chosen by rule of thumb.

2.4 Gradient and Hessian Relation between MSE and Error Entropy

For the MSE criterion, the gradient and the Hessian of the cost are simple expressions of the error and the input, as summarized in Table 2-1. Unlike this, in the present scenario where we are optimizing a nonlinear function of the error (V(e)), such a simple expression is not feasible. Nevertheless, as shown in Table 2-1, the input u(n) is still embedded in the gradient and the Hessian of the cost function. The effect of the input term u(n) still remains in the update equations.

Table 2-1. Gradient and Hessian relation between MSE and error entropy. For MSE (J(e) = e^2(n)/2): \nabla J(e(n)) = -e(n)u(n) and \nabla^2 J(e(n)) = u(n)u(n)^T; for the error entropy, the gradient is (2-9) and the Hessian is its derivative.
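A single online SIG update of the form (2-12) might be sketched as follows (illustrative; the buffer layout, newest sample last, is an assumption). The cost per update is O(ML), as stated above:

    import numpy as np

    def sig_mee_step(w, U_win, d_win, mu, sigma):
        # U_win: (L+1, M) most recent input vectors, newest row last; d_win: (L+1,)
        # SIG drops the outer sum of (2-9): only the newest error is paired
        # with the previous L errors (equation 2-12).
        e = d_win - U_win @ w
        s = sigma * np.sqrt(2.0)
        de = e[-1] - e[:-1]
        du = U_win[-1] - U_win[:-1]
        k = np.exp(-de**2 / (2 * s**2)) / (s * np.sqrt(2 * np.pi))
        grad = (k * de) @ du / (2 * sigma**2 * len(de))
        return w + mu * grad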

2.5 Simulations and Discussion

2.5.1 First Study: Performance Surface of MSE and Error Entropy

Figure 2-2(a) shows the performance surface of MSE. This surface has constant curvature (second derivative or Hessian) and is unbounded. Due to the quadratic nature of the surface, the level sets of the contour plot shown in Figure 2-2(b) become denser as the level increases. On the other hand, the error entropy counterpart plotted in Figure 2-3 shows a different shape characteristic. Near the minimum, the curve behaves as a quadratic curve similar to the MSE, but due to the boundedness of the performance surface the gradient becomes smaller, thus enlarging the interval gap between the level sets as we go farther from the minimum. Figure 2-4 shows the performance surface and the contour of the information potential (IP) for four different kernel sizes. For a smaller kernel size, the surface is like a funnel and the area of saturation is larger. But, near the maximum it is still convex. Further, this attribute is also responsible for the tradeoff between slower speed and robustness to outliers of MEE when far from the solution.

Figure 2-2. Performance surface and contour of MSE (J(e) = E[e^2])

Figure 2-3. Performance surface and contour plot of error entropy (J(e) = V(0) - V(e))

Figure 2-4. Performance surface and contour plot of error entropy (normalized IP) for various choices of kernel size

2.5.2 Second Study: System Identification (MA(9)) in Impulsive Noise Environments

We also compare MEE with a median LMS (MLMS) [38], which is an alternative algorithm to LMS in impulsive noise environments. However, MLMS differs from LMS in that a block of past gradient terms must be stored, as in MEE (SIG) [39].

We consider a moving-average model with transfer function given by equation (2-15) (order = 9). The FIR adaptive filter is selected with equal order (order = 9). The input signal to both the system and the adaptive filter is white Gaussian distributed with zero mean and unit variance. In order to make the results independent of the input and weight initializations, we performed Monte-Carlo simulations with 100 different inputs and 100 different weight initializations for each input. We set the kernel size to 1 and the window length to 100 in MEE, and also fix a window length of 100 in MLMS. We use the weight SNR of the two algorithms as a measure of performance.

The aim of this experiment is to show that it is advantageous to use MEE even in linear systems due to its M-estimator property, which makes it robust to the non-Gaussian noise prevalent in real life scenarios. We illustrate this in two phases. In the first phase we tune all the systems such that they have the same second order characteristics. To achieve this, we use white Gaussian noise with zero mean and 10^-4 variance as the measurement noise and add it to the system output. We select the step size parameters for LMS, MLMS and MEE such that they perform similarly and with the same weight SNR (around 55 dB), as shown in Figure 2-5.
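The weight SNR used as the performance measure can be computed as below. This is a sketch with one common definition; the dissertation's exact normalization is not legible in the scan and is therefore an assumption:

    import numpy as np

    def weight_snr_db(w_true, w_est):
        # Weight SNR: ratio of true-weight power to weight-error power, in dB.
        err = np.asarray(w_true) - np.asarray(w_est)
        return 10.0 * np.log10(np.dot(w_true, w_true) / np.dot(err, err))

    print(weight_snr_db(np.ones(9), np.ones(9) * 0.999))   # ~60 dB for a tiny error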

Figure 2-5. Weight SNR of LMS, MLMS and MEE in Gaussian measurement noise

Figure 2-6. Weight SNR of LMS, MLMS and MEE in non-Gaussian measurement noise

Figure 2-7. Output signals of LMS, MLMS and MEE in non-Gaussian measurement noise

In the second phase, we bring these systems, tuned for equal second order performance, into a non-Gaussian environment. The measurement noise was generated by the heavy-tailed model of equation (2-16). This noise is non-Gaussian with heavy tails, an example being impulsive noise.

Figure 2-6 shows the performance of LMS, MLMS and MEE under this impulsive noise scenario. The performance of MEE (35 dB) is better than that of LMS (17 dB). MEE converges faster than MLMS for the same weight SNR. To understand the robustness of MEE against impulsive noise better, we plot the output signals in Figure 2-7. Even though impulsive noise is present, MEE and MLMS track the signal very well, whereas LMS gives a biased solution as it follows the impulsive noise.

We can conclude that the performance of MEE defaults to that of LMS for Gaussian noise and a linear filter, since for this case MEE cannot improve upon LMS.
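Since equation (2-16) is not legible in this scan, the sketch below generates heavy-tailed noise from an assumed two-component Gaussian mixture; the mixture weight and variances are illustrative parameters only, not the dissertation's values:

    import numpy as np

    def impulsive_noise(n, p_outlier=0.05, var_bg=1e-4, var_out=10.0, seed=None):
        # Background Gaussian noise with occasional large-variance impulses.
        rng = np.random.default_rng(seed)
        noise = rng.normal(0.0, np.sqrt(var_bg), n)
        hits = rng.random(n) < p_outlier
        noise[hits] = rng.normal(0.0, np.sqrt(var_out), hits.sum())
        return noise

    print(np.round(impulsive_noise(10, seed=3), 3))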


CHAPTER 3
MINIMUM ERROR ENTROPY WITH SELF ADJUSTING STEP-SIZE (MEE-SAS)

The MEE cost function can be searched with gradient descent learning [14] or even second-order search methods [40]. One of the difficulties with these search algorithms is the computational complexity that arises due to the estimation of entropy. Stochastic gradient algorithms have been derived to alleviate this problem [36]. We extend the class of search algorithms for the MEE by taking advantage of the fact that the cost maximizes the argument of the logarithm (a quantity called the information potential), which is nonlinearly related to the samples. As will be demonstrated in this chapter, a self adjusting step size can be defined, which requires only an initial step-size selection for a more controlled gradient search (apart from the selection of the kernel size for information potential estimation). This new search algorithm will be called minimum error entropy with self adjusting step-size (MEE-SAS).

We can see that the step size (\mu) controls the behavior of the algorithm, and that two important goals are competing: for fast convergence, one would use a large step size, but to achieve low steady-state MSE, a smaller step size would be better. The ideal step size should decrease or increase as the overall system error decreases or increases. Various schemes for controlling the step size of LMS have been proposed in [41], [42], [43], [44]. These schemes provide a 'measure of error' to control the step size using additional parameters. However, MEE-SAS provides a natural 'target' that is available to automatically control the algorithm step size. One intuitive way to understand the MEE-SAS algorithm is to consider it as a variant of MEE with a variable step size. When the error is large, adaptation is faster; when the error is small, adaptation is slower, resulting in fast convergence with a small steady-state error.

3.1 MEE-SAS Criterion and Gradient Search Algorithm

Since the Gaussian kernel attains its maximum at the origin, V(e) <= V(0) always; hence V(0) provides an upper bound on the achievable V(e). Seen from a different perspective, V(0) is the natural target of the adaptation: the distance V(0) - V(e) to this target can be used to scale the step size of the gradient ascent update (2-8).

This modified search algorithm is named MEE-SAS. The weight update in MEE-SAS becomes

    w(n+1) = w(n) + \mu [V(0) - V(e)] \nabla V(e) = w(n) + \mu(n) \nabla V(e)    (3-1)

where \mu(n) = \mu [V(0) - V(e)]. We can further note that there exists a cost function which gives rise to this gradient descent algorithm, given by

    J(e) = [V(0) - V(e)]^2    (3-2)

Maximizing the information potential is equivalent to minimizing the cost function (3-2). Taking the gradient of this cost function,

    \nabla J(e) = -2 [V(0) - V(e)] \nabla V(e)    (3-3)

gives the gradient descent method of equation (3-1).

Theorem. The stationary points of f(V(e)) and their nature (minima, saddle, maxima) in the w space are the same as those of V(e) if f(\cdot) is strictly monotonic on the range of V(e).

Proof. Since \nabla f(V(e)) = f'(V(e)) \nabla V(e) and f'(V(e)) \ne 0 for a strictly monotonic f, the gradients vanish at exactly the same points, and the nature of each stationary point is preserved. Q.E.D.
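One MEE-SAS step (3-1) thus differs from the MEE step only by the scalar factor V(0) - V(e). A self-contained sketch (illustrative; note that V(0) = 1/(2 sigma sqrt(pi)) for the Gaussian kernel of size sigma*sqrt(2)):

    import numpy as np

    def mee_sas_step(w, U, d, mu, sigma):
        # w(n+1) = w(n) + mu [V(0) - V(e)] grad V(e)   (equation 3-1)
        e = d - U @ w
        N, s = len(e), sigma * np.sqrt(2.0)
        de = e[None, :] - e[:, None]
        k = np.exp(-de**2 / (2 * s**2)) / (s * np.sqrt(2 * np.pi))
        v = k.mean()                                   # V(e), equation (2-4)
        v0 = 1.0 / (2.0 * sigma * np.sqrt(np.pi))      # V(0) = K_{sigma*sqrt2}(0)
        du = U[None, :, :] - U[:, None, :]
        grad = np.einsum('ij,ijm->m', k * de, du) / (2 * N**2 * sigma**2)
        return w + mu * (v0 - v) * grad                # self-adjusting effective step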

In order to continue with the convergence analysis of MEE-SAS, we consider a quadratic approximation of the information potential V(e) by employing a Taylor series expansion, truncated at the linear term for the gradient, around the optimal weight vector:

    V(e) \approx V_{w^*}(e) + \frac{1}{2} \tilde{w}(n)^T R\, \tilde{w}(n)    (3-4)

where the optimal solution is defined as w^* = \arg\max_w V(e), \tilde{w}(n) = w^* - w(n) and R := \nabla^2 V(e).

In fact, when the kernel size (the width of the window function used in the Parzen estimator) tends to infinity, the local minima and maxima of the MEE disappear, leaving a unique, but biased, global minimum. This dilation property of the MEE is shown in [21]. Clearly, any continuous and (twice) differentiable cost function can be represented accurately by a quadratic approximation in some neighborhood of its global optimum. Then, provided that the kernel size is large enough during the adaptation process to guarantee that the operating point lies in the convex hull, one can perform global convergence analyses of the steepest descent algorithm for the MEE and determine upper bounds on the step size of gradient-based optimization techniques to guarantee stability.

Theorem. Assume that V(e) is a quadratic surface with a Taylor series approximation given by V(e) = V_{w^*}(e) + \frac{1}{2}\tilde{w}(n)^T R \tilde{w}(n), where \tilde{w}(n) = w^* - w(n) and R := \nabla^2 V(e). To ensure convergence of the MEE-SAS algorithm, a necessary condition is that the effective step size \mu(n) = \mu[V(0) - V(e)] satisfy 0 < \mu(n) < 2/|\lambda_k| for every eigenvalue \lambda_k of R.

Proof. Subtracting both sides of equation (3-1) from w^* and substituting \nabla V(e) = R\,\tilde{w}(n) and R = Q \Lambda Q^T, we get

    \tilde{w}(n+1) = \tilde{w}(n) - \mu(n) R\, \tilde{w}(n) = Q [I - \mu(n)\Lambda] Q^T \tilde{w}(n)    (3-6)

The weight error along the natural modes (v(n) = Q^T \tilde{w}(n)) is thus given by

    v(n+1) = [I - \mu(n)\Lambda]\, v(n)    (3-7)

The expression for the kth mode then becomes

    v_k(n+1) = (1 - \mu(n)\lambda_k)\, v_k(n)    (3-8)

From equation (3-8), for stability the step size should satisfy the constraint

    |1 - \mu(n)\lambda_k| < 1, \quad \text{i.e.,} \quad 0 < \mu(n) < \frac{2}{|\lambda_k|}    (3-9)

Q.E.D.

One intuitive way to understand the MEE-SAS algorithm is to consider it as a variant of MEE with a variable step size \mu(n) = \mu[V(0) - V(e)]. The term [V(0) - V(e)] automatically regulates the step size, giving acceleration when far away from the optimal solution and reducing the step size as the solution is approached. This intuition can be proved mathematically as follows.

Theorem 4. The Hessian \tilde{R} of the MEE-SAS cost function (3-2) is related to the Hessian R of the MEE cost by

    \tilde{R} = -2[V(0) - V(e)]\, R + 2\, R\, \tilde{w}(n) \tilde{w}(n)^T R    (3-10)

Proof. Differentiating equation (3-2) twice with respect to the weight vector produces \tilde{R} = -2[V(0) - V(e)] \nabla^2 V(e) + 2 \nabla V(e) \nabla V(e)^T, and equation (3-10) is obtained by substituting equation (3-4) and \nabla V(e) = R \tilde{w}(n). Q.E.D.

From equation (3-10), using the eigendecompositions of MEE-SAS (\tilde{R} = \tilde{Q} \tilde{\Lambda} \tilde{Q}^T) and MEE (R = Q \Lambda Q^T), and transforming the coordinates into the natural modes (v(n) = Q^T \tilde{w}(n)), we obtain

    \tilde{Q}\tilde{\Lambda}\tilde{Q}^T = -c\, Q\Lambda Q^T + [\tilde{w}(n)^T Q\Lambda Q^T \tilde{w}(n)]\, Q\Lambda Q^T + 2\, Q\Lambda Q^T \tilde{w}(n)\tilde{w}(n)^T Q\Lambda Q^T
                                        = Q\, [\, -c\Lambda + (v(n)^T \Lambda v(n))\Lambda + 2\Lambda v(n) v(n)^T \Lambda \,]\, Q^T    (3-11)

where c = 2[V(0) - V_{w^*}(e)]. If we can determine the eigendecomposition of the matrix [-c\Lambda + (v(n)^T\Lambda v(n))\Lambda + 2\Lambda v(n)v(n)^T\Lambda], which is denoted by \Gamma D \Gamma^T, where \Gamma is orthonormal and D is diagonal, then equation (3-10) becomes

    \tilde{R} = (Q\Gamma)\, D\, (Q\Gamma)^T    (3-12)

By direct comparison, the eigenvectors and the eigenvalues are determined to be

    \tilde{Q} = Q\Gamma, \qquad \tilde{\Lambda} = D    (3-13)

The entries of \Gamma D \Gamma^T are found as follows: the ith diagonal entry is -c\lambda_i + (\sum_{j=1}^{M} \lambda_j v_j^2)\lambda_i + 2\lambda_i^2 v_i^2, and the (i,j)th entry is 2\lambda_i\lambda_j v_i v_j, where \lambda_i is the ith diagonal entry of \Lambda and v_i the ith entry of v(n).

Consider the special case when we are moving along one of the eigenvectors (v = [0, ..., v_k, ..., 0]^T). Then the expressions simplify to the diagonal matrix

    \tilde{\Lambda} = \mathrm{diag}\big( -c\lambda_1 + \lambda_k\lambda_1 v_k^2,\; -c\lambda_2 + \lambda_k\lambda_2 v_k^2,\; \ldots,\; -c\lambda_k + 3\lambda_k^2 v_k^2,\; \ldots,\; -c\lambda_M + \lambda_k\lambda_M v_k^2 \big)    (3-14)

In real scenarios, there exist modes which converge slower than others due to the eigenvalue spread. If we analyze the convergence along the principal axes of R, it is easy to see that we obtain

    \tilde{\lambda}_j = -2[V(0) - V_{w^*}(e)]\lambda_j + \lambda_k \lambda_j v_k^2, \quad \forall j \ne k    (3-15)

When the weights are close to the optimal solution, v_k^2 \approx 0, and therefore the eigenvalues are proportional to the eigenvalues of the MEE cost, which is quadratic. On the other hand, when the weights are far from the solution, v_k^2 is large; thus the second term dominates, and the eigenvalues are proportional to the square of the original eigenvalues. A consequence of this is that MEE-SAS has the remarkable property of changing curvature. This is attributed to the fact that the eigenvalue \tilde{\lambda}_k of MEE-SAS is quadratically related to the eigenvalues \lambda_k of MEE when the weights are far from the solution and is linearly related

when close. For each natural mode v_k, the relationship between the eigenvalue of MEE-SAS (\tilde{\lambda}_k) and that of MEE (\lambda_k) is

    \tilde{\lambda}_k = -2[V(0) - V_{w^*}(e)]\lambda_k + 3\lambda_k^2 v_k^2 = -\lambda_k (c - 3\lambda_k v_k^2)    (3-16)

where c = 2[V(0) - V_{w^*}(e)]. Since we maximize the cost function V(e) in MEE, the eigenvalues \lambda_k of its Hessian are negative. Similarly, for MEE-SAS, the minimization of its cost function makes \tilde{\lambda}_k positive. The shape of the bowl is quadratic at each natural mode for MEE-SAS, and the turning point of curvature occurs when

    c - 3\lambda_k v_k^2 = 0    (3-18)

From equation (3-18), we obtain specifically

    v_k = \pm\sqrt{\frac{c}{3\lambda_k}}    (3-19)

Using the non-negative property of c and the form of equation (3-19), we get

    0 \le c \le 1    (3-20)

In equation (3-20), c = 0 implies V(0) = V_{w^*}(e), whereas c = 1 implies v_k = 0 (i.e., w = w^*). It is interesting to note that the location of the turning point of curvature depends on c, as seen in equation (3-19), which means that it depends on the achievable final error. The larger the final error, the faster is the convergence.

Table 3-1. Turning point of curvature for MEE-SAS

    c = 0:  v_k = 0
    c > 0:  v_k = \pm\sqrt{c/(3\lambda_k)}

Q.E.D.

Thus, the turning point v_k of curvature is farther from the optimal solution for the zero error adaptation case than for the non-zero error case. Since this point marks the change of curvature from 4th order to 2nd order, this implies that for practical scenarios (i.e., V(0) \ne V_{w^*}(e)) the curvature is going to be 4th order, leading to much faster convergence than MEE for the same initial step size.

3.3 Simulations and Discussion I

3.3.1 First Study: Curvature Analysis of MEE and MEE-SAS

This case study aims to illustrate how the performance surfaces (here represented by their contour and gradient vector plots) of MEE (V(0) - V_w(e)) and MEE-SAS ([V(0) - V_w(e)]^2) are altered as a consequence of changing the kernel size in the estimator. In order to avoid excessive computation time requirements, we have utilized 100 noiseless training samples to obtain the contour and gradient vector plots. The kernel size is set to sigma = 0.1, 0.35, 0.6.

In Figure 3-2, we show that when the current weight is close to the optimal solution, the magnitude of the gradient vector increases quadratically in the radial direction. Note that the gradient vector decreases when far from the solution, since the performance has an upper bound (V(0) - V(e) <= V(0)), unlike MSE (see Figure 3-1). In order to distinguish the gradient relation between MEE and MEE-SAS, we plot the gradient difference between them.

In Figure 3-3, when using a small kernel size (sigma = 0.1), MEE-SAS is superior to MEE with respect to the magnitude of the gradient, while for a large kernel size (sigma = 0.6), MEE is superior to MEE-SAS. We show that the smaller the kernel we use, the larger is the region over which MEE-SAS is superior to MEE.

The case of V(0) \ne V_{w^*}(e) includes two situations: the measurement noise case and the modeling error case. The simulation results of the measurement noise case are similar to those of the modeling error case, so we show only the simulation results for the measurement noise case. We add uniformly distributed noise with three different powers (P = 1, 2, and 3) to the above example.

Figure 3-1. Performance surface of MEE and MEE-SAS

Figure 3-2. Contour and gradient plot of MEE and MEE-SAS for sigma = 0.35

Figure 3-3. Contour and gradient difference between MEE and MEE-SAS for various choices of kernel size (0.1, 0.35, 0.6)

Figure 3-4. Contour and gradient difference between MEE and MEE-SAS for three different measurement noises (noise power 1, 2, and 3)

As seen in Figure 3-4, the higher the noise power, the larger is the region over which MEE-SAS is superior to MEE in terms of gradient magnitude. This means that the point at which the curvature changes from higher than second order to second order is closer to the optimal solution when V(0) \ne V_{w^*}(e) than in the case of V(0) = V_{w^*}(e), as elucidated by Theorem 5. This also means that the larger the final error, the faster is the convergence.

3.3.2 Second Study: System Identification (MA(9))

We follow the system identification framework of [3]. We consider a simple plant identification model with the transfer function given in [3] (order = 9). The input to both the plant and the adaptive filter is white Gaussian noise with unit power. The Gaussian measurement noise has zero mean and variance 10^-3. We analyze this problem for both critically fitted and underfitted models using the stochastic gradient (LMS type adaptation). A standard method of comparing performance in system identification problems is plotting the weight error norm, since this is directly related to the misadjustment [3]. In each case the power of the weight noise (averaged over 125 samples) was plotted versus the number of iterations performed. The adaptive weights were initialized randomly at each instance. Further, in order to make the results independent of the input and weight initializations, we performed Monte-Carlo simulations with 100 different inputs and 100 different weight initializations for each input.

Consider the case where the model of the adaptive filter is equal to that of the plant (model order = 9). In this case, ideally we can exactly track the output of the plant. Figure 3-5 shows the weight misadjustment values for the last 100 samples of error (for practical purposes, we consider misadjustment values less than 10^-3 as zero). Thus, with the same misadjustment values around 5 x 10^-4, it can be observed that MEE-SAS converges in 150 iterations whereas MEE takes 600 iterations to converge. Also, LMF converges in 300 iterations whereas LMS takes 500 iterations to converge.

Figure 3-5. Average weight error power of MEE and MEE-SAS for the critically fitted model

Figure 3-6. Average weight error power of MEE and MEE-SAS for the underfitted model

For the underfitted model, Figure 3-6 shows the averaged weight error power. MEE-SAS takes just 250 iterations to converge with a misadjustment of 1.2 x 10^-3, as compared to MEE, which takes nearly 1000 iterations for the same misadjustment. Also, LMF converges in 190 iterations whereas LMS takes 500 iterations to converge. These results for linear systems are encouraging.

3.3.3 Third Study: Chaotic Time Series Prediction

We used the nonstationary Mackey-Glass (MG) time series, which has a maximum value of 1, to compare the weight tracking ability of the MEE and MEE-SAS algorithms. Due to the online mode of simulation, SIG results in some misadjustment and variation about the optimal solution. In order to compare the two algorithms, we find the step size for each algorithm such that it produces similar probability densities of the error for both cases within a window length of L = 200, as shown in Figure 3-8.

In Figure 3-7, MEE-SAS converges in about 400 iterations whereas MEE needs 700 iterations to achieve the same level of performance. Note the large fluctuations in the information potential curve of MEE as compared to MEE-SAS. To investigate the effect of these large fluctuations, we plot two weight tracks and the predicted outputs of both algorithms in Figures 3-9 and 3-10. The fluctuations in the MEE information potential curve translate into an ability to track the changes in the FIR optimal solution. This is evident from the prediction performance shown in Figure 3-10: MEE performs better especially near the high peaks and variations of the MG signal.
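For reference, an MG series can be generated by Euler integration of the Mackey-Glass delay differential equation. The sketch below is illustrative; its parameters (a = 0.2, b = 0.1, delay tau = 30) are standard choices and an assumption, since the dissertation's exact settings are not legible here:

    import numpy as np

    def mackey_glass(n, tau=30, a=0.2, b=0.1, dt=1.0, seed=4):
        # dx/dt = a*x(t - tau) / (1 + x(t - tau)^10) - b*x(t), Euler integration
        rng = np.random.default_rng(seed)
        hist = int(tau / dt)
        x = np.zeros(n + hist)
        x[:hist] = 1.2 + 0.1 * rng.standard_normal(hist)   # arbitrary initial history
        for t in range(hist, n + hist - 1):
            x_tau = x[t - hist]
            x[t + 1] = x[t] + dt * (a * x_tau / (1.0 + x_tau**10) - b * x[t])
        series = x[hist:]
        return series / np.abs(series).max()               # scale to a maximum of 1

    print(np.round(mackey_glass(5000)[:5], 3))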

Figure 3-7. Online prediction learning of MEE and MEE-SAS for the Mackey-Glass time series

Figure 3-8. Error probability density of MEE and MEE-SAS for the last 200 samples

Figure 3-9. Two of the six weight tracks for MEE and MEE-SAS

Figure 3-10. Prediction performance of MEE and MEE-SAS for the Mackey-Glass time series

The loss of 'sensitivity' of MEE-SAS can be attributed to the extremely small value of [V(0) - V(e)] near the optimal solution, which suppresses the transfer of information from the information potential gradient to the weight vectors. In non-stationary signals, tracking these small changes in the location of the weight vector is crucial for good prediction. Therefore, MEE-SAS still suffers from a tradeoff between speed of convergence and tracking of the optimal solution. A compromise is to use MEE-SAS for faster convergence and then switch to the MEE technique when the information potential is close to unity (which is achieved near the optimal solution). In this way we can double the speed of convergence as well as retain the ability to track the changes in the weight vector.

3.4 Switching Scheme between MEE and MEE-SAS

We therefore propose a switching scheme between MEE and MEE-SAS [30]. The loss of 'sensitivity' of MEE-SAS can be attributed to the extremely small value of the effective step size near the optimal solution, which suppresses the transfer of information from the information potential gradient to the weight vectors. We apply the combined MEE and MEE-SAS algorithm to non-stationary signals where tracking is very important. In order to decide the switching time so as to maximize convergence speed, an analytical criterion needs to be developed. The dynamics of adaptation can be understood in terms of energy minimization in the context of Lyapunov stability theory [45]. The Lyapunov energy function is a method for analyzing the convergence characteristics of dynamic systems. In our case, we are using it to analyze the speed of convergence. Simply, the faster the Lyapunov energy decreases, the faster we are getting towards the optimal solution, especially since our energy function is based on the cost itself.

For MEE-SAS, consider the energy function J_{MEE-SAS}(e) = [V(0) - V(e)]^2 together with the continuous-time learning rule \dot{w} = -\mu_{MEE-SAS}\, \partial J_{MEE-SAS}(e)/\partial w. From this, we obtain the following temporal dynamics for the Lyapunov energy that describes the learning rule:

    \dot{J}_{MEE-SAS}(e) = -4\mu_{MEE-SAS} [V(0) - V(e)]^2 \left\| \frac{\partial V(e)}{\partial w} \right\|^2    (3-25)

On the contrary, the regular MEE rule would have the following energy function and update rule:

    J_{MEE}(e) = V(0) - V(e)    (3-26)

    \dot{w} = -\mu_{MEE}\, \frac{\partial J_{MEE}(e)}{\partial w}    (3-27)

This corresponds to the following temporal dynamics for the minimization of energy:

    \dot{J}_{MEE}(e) = -\mu_{MEE} \left\| \frac{\partial V(e)}{\partial w} \right\|^2    (3-28)

From equations (3-25) and (3-28), the general switching time is determined by

    \dot{J}_{MEE-SAS}(e) = \dot{J}_{MEE}(e) \quad \Longleftrightarrow \quad 4\mu_{MEE-SAS}[V(0) - V(e)]^2 = \mu_{MEE}    (3-29)

Therefore, in the region satisfying the condition |\dot{J}_{MEE-SAS}(e)| > |\dot{J}_{MEE}(e)|, MEE-SAS should be used, since there MEE-SAS converges faster than MEE; otherwise MEE is used. However, the application of the switching decision expression (3-29) to the stochastic

gradient algorithms is easier if we rewrite it to read

    V(e) < V(0) - \frac{1}{2}\sqrt{\frac{\mu_{MEE}}{\mu_{MEE-SAS}}} \quad \Rightarrow \quad \text{use MEE-SAS; otherwise use MEE}    (3-30)

With equation (3-30), we need to check just the information potential at each iteration and compare it with a constant, which is evaluated from the learning rates of MEE and MEE-SAS.

We select the two step sizes such that each algorithm is separately stable. The step size of MEE-SAS is chosen larger for faster convergence, and that of MEE is chosen smaller for low misadjustment. Therefore, the ratio of the step size for MEE to that for MEE-SAS should be small.

In practice, if we redefine V(e) by normalizing it with V(0) to obtain V_N(e) = V(e)/V(0), then V_N(0) = 1. For the same step size of MEE and MEE-SAS, this leads to the following MEE-SAS criterion: [V_N(0) - V_N(e)]^2, which converges faster than the normalized MEE criterion V_N(e) at any point in the weight space where V_N(e) lies below the switching threshold of equation (3-30).
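In code, the test (3-30) is a single comparison per iteration. A sketch (illustrative; mee_step and mee_sas_step stand for plain MEE and MEE-SAS updates such as those sketched earlier, and the step sizes in the demo are the values 0.2 and 2.2 used later in the text):

    import numpy as np

    def switch_decision(v, v0, mu_mee, mu_sas):
        # Equation (3-30): prefer MEE-SAS while V(e) < V(0) - 0.5*sqrt(mu_MEE/mu_SAS),
        # i.e., while the solution is still far from the target V(0).
        return v < v0 - 0.5 * np.sqrt(mu_mee / mu_sas)

    print(switch_decision(0.10, 0.28, 0.2, 2.2))   # True: far from target, use MEE-SAS

    # usage sketch (v and v0 computed as in equation (2-4)):
    # if switch_decision(v, v0, mu_mee, mu_sas):
    #     w = mee_sas_step(w, U, d, mu_sas, sigma)
    # else:
    #     w = mee_step(w, U, d, mu_mee, sigma)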
3.5 Simulations and Discussion II

3.5.1 First Study: Chaotic Time Series Prediction

We used the non-stationary MG time series to compare the weight tracking ability of MEE, MEE-SAS and the switching MEE/MEE-SAS scheme. Due to the online mode of simulation, SIG results in some misadjustment and variation about the optimal solution. We choose a proper kernel size (sigma = 1) based on Silverman's rule and set the window length to L = 50.

In Figure 3-11(a), the drawback of MEE becomes quite evident. MEE takes 800 iterations to converge, compared to MEE-SAS, which converges in about 100 iterations. On the other hand, note the large fluctuations in the information potential curve of MEE as compared to MEE-SAS. To investigate the effect of these large fluctuations, we plot the weight track in Figure 3-11(b). The fluctuations in the information potential curve of MEE translate into an ability to track the changes in the optimal solution of the non-stationary MG time series. Unlike MEE, the loss of tracking ability of MEE-SAS is attributed to the small effective step size near the optimal solution. As seen from Figure 3-11(a), the switching algorithm utilized MEE-SAS to go quickly near the optimal solution and then switched to MEE for tracking the small changes in the solution, thus effectively combining the strengths of both algorithms. This becomes clear in Figure 3-11(c), where the exact switching nature of the new algorithm is depicted.

Figure 3-11. Performance of the switching scheme on chaotic time series prediction. (b) One of the six weight tracks (W3). (c) MEE or MEE-SAS used time.

3.5.2 Second Study: System Identification

The FIR adaptive filter is selected with equal order. The input to both the plant and the adaptive filter is white Gaussian noise with unit variance. We select window length L = 50 and kernel size sigma = 1. The system mismatch (weight error power) is selected as the performance measure. Also, we compare this switching algorithm with the combined LMS/F algorithm, which combines the benefits of the LMS and LMF methods [46]. The step size of MEE-SAS was set to 2.2 for faster convergence and that of MEE was set to 0.2 for good tracking. In the switching algorithm, the ratio of the step size for MEE to MEE-SAS was set to 0.09 (= 0.2/2.2) for this problem. This ratio has to be selected by the designer for each application, which is the same shortcoming as for any of the current switching algorithms. Also, the step size of the combined LMS/F was set to 0.1 for the same misadjustment as the switching scheme between MEE and MEE-SAS.

Figure 3-12(a) shows the weight tracks of MEE, MEE-SAS, the switching algorithm and the combined LMS/F algorithm. Note how quickly MEE-SAS tracks the abrupt change. The ability to adaptively change its step size and track large variations is one of the strengths of MEE-SAS. On the other hand, MEE, even though it takes a long time to track the switching between the subsystems, gives a lower weight error power in the long run, as shown in Figure 3-12(b). This is attributed once again to its ability to track small changes near the optimal solution. Figure 3-12(b) shows how the switching scheme takes advantage of both of them, giving the best performance in terms of weight error power.

PAGE 59

Figure 3-12. Performance of switching scheme on system identification: (c) MEE or MEE-SAS used time.
As shown in figure 3-12(c), in the abruptly changing parts (at the initial and 1000th iterations) the switching algorithm uses MEE-SAS, while to track fine changes it uses the MEE algorithm. Also, the switching algorithm converges faster than the combined LMS/F algorithm for the same misadjustment. We remark that the switching scheme between MEE and MEE-SAS has the same performance as, but heavier computational complexity than, the combined LMS/F for a linear system in Gaussian noise. However, for non-Gaussian noise and nonlinear systems, we expect the switching scheme between MEE and MEE-SAS to perform better than the combined LMS/F, since it employs the signal statistics very effectively.

We next use two different impulse responses of length H = 128, shown in figure 3-13(a). At the 10000th iteration, the acoustic path changed from H1 to H2. Unlike the previous experiment, the system is stationary before and after this abrupt change. The same length is used for all the adaptive filters. The input signal is a uniformly distributed signal with unit variance. The measurement noise is white Gaussian distributed with zero mean and variance 10^-4. We selected a kernel size of σ = 1 based on Silverman's rule and set the window length to L = 200. In order to test the ability of convergence to compensate for the abrupt echo path change, we use the weight SNR as a measure of performance.
Figure 3-13. Performance of switching scheme on acoustic echo cancellation: (b) weight SNR; (c) MEE or MEE-SAS used time; (d) step size of MEE-SAS.

Figure 3-13(b) shows the weight SNR of the three algorithms. The performance of the switching algorithm is the same as that of MEE-SAS in the abruptly changing parts (at the initial and 10000th iterations), while it is the same as that of MEE around the solution. This switching is seen clearly in figure 3-13(c). Notice the rattling effect around the switching time. This is due to the fluctuations in the effective step size of MEE-SAS, shown in figure 3-13(d). Since the switching scheme (3-30) utilizes this information, we continuously switch between the two algorithms around that time. Note also the spike in figure 3-13(d) at the 10000th iteration, reflecting a very fast step-size adjustment which helps MEE-SAS to immediately track the acoustic path change.
The MEE algorithm requires a priori knowledge of the input process statistics (power and dynamic range) to select the learning rate for stability and convergence, and the kernel size for good results. Since this knowledge is usually unavailable, the step size is normally estimated prior to beginning the adaptation process for a given misadjustment or speed of convergence, and the kernel size is estimated by Silverman's rule [34] or similar heuristics [22]. Changes in step size or kernel size when the input power fluctuates should be made for optimal performance, but are seldom performed, leading to sub-optimal performance. The purpose of this chapter is to propose an enhanced MEE algorithm where the selection of η will be independent of the input power and where the effect of the kernel size will be minimized. When the MEE algorithm is modified in this manner, we refer to it as the normalized minimum error entropy (NMEE).

The derivation parallels that of the NLMS algorithm [2]. The criterion of NMEE is formulated as a constrained optimization:

min ‖w(n+1) - w(n)‖²  subject to  V(e_p(n)) = V(0),  (4-2)

where V(e_p(n)) = (1/L) Σ_{i=n-L}^{n-1} κ_√2σ(e_p(n) - e_p(i)) is the information potential of the a posteriori error e_p(n) = d(n) - w(n+1)^T u(n), and κ_√2σ denotes the Gaussian kernel of size √2 σ. Equation (4-2) translates the constraint of optimal performance in terms of the information potential. To solve the constrained optimization problem, we use the method of Lagrange multipliers.
where

∇V(e_p(n)) = ∂V(e_p(n))/∂w = (1/(2σ²L)) Σ_{i=n-L}^{n-1} κ_√2σ(e_p(n) - e_p(i)) (e_p(n) - e_p(i)) {u(n) - u(i)}.

Setting the gradient of the Lagrangian equal to zero and solving for the optimum value, we obtain

w(n+1) = w(n) + (λ/2) ∇V(e_p(n)).  (4-5)

To solve for the unknown multiplier λ, we substitute equation (4-5) into equation (4-2). Doing the substitution, we write

V(e_p(n)) = V(0)
⟺ e_p(n) = e_p(i), n-L ≤ i ≤ n-1
⟺ d(n) - w(n+1)^T u(n) = d(i) - w(n+1)^T u(i)
⟺ d(n) - [w(n) + (λ/2)∇V(e_p(n))]^T u(n) = d(i) - [w(n) + (λ/2)∇V(e_p(n))]^T u(i)
⟺ d(n) - w(n)^T u(n) - d(i) + w(n)^T u(i) = (λ/2)(∇V(e_p(n)))^T {u(n) - u(i)}
⟺ e_a(n) - e_a(i) = (λ/2)(∇V(e_p(n)))^T {u(n) - u(i)}, n-L ≤ i ≤ n-1.

Summing over the window and solving for λ, we obtain

λ = 2 Σ_{i=n-L}^{n-1} {e_a(n) - e_a(i)} / [(∇V(e_p(n)))^T Σ_{i=n-L}^{n-1} {u(n) - u(i)}],  (4-7)

where e_a(i) = d(i) - w(n)^T u(i) is the a priori error signal.
Substituting λ back into (4-5) yields the weight update

w(n+1) = w(n) + (λ/2)∇V(e_p(n)) = w(n) + [Σ_{i=n-L}^{n-1}{e_a(n) - e_a(i)}] ∇V(e_p(n)) / [(∇V(e_p(n)))^T Σ_{i=n-L}^{n-1}{u(n) - u(i)}].  (4-8)

In order to add an extra degree of freedom to the adaptation strategy, one constant η controlling the step size is introduced:

w(n+1) = w(n) + η [Σ_{i=n-L}^{n-1}{e_a(n) - e_a(i)}] ∇V(e_p(n)) / [(∇V(e_p(n)))^T Σ_{i=n-L}^{n-1}{u(n) - u(i)}],  (4-9)

where e_a(i) = d(i) - w(n)^T u(i), for n-L ≤ i ≤ n, is the a priori error, and η is the normalized step size, which can be proven to lie between 0 and 2 for stability. In this update there is an added difficulty, because estimating w(n+1) requires the a posteriori error e_p. We propose to substitute the a posteriori error e_p by the a priori error e_a, because we aim to minimize ‖w(n+1) - w(n)‖². Therefore, we obtain the following weight update for NMEE:

NMEE: w(n+1) = w(n) + η [Σ_{i=n-L}^{n-1}{e_a(n) - e_a(i)}] ∇V(e_a(n)) / [(∇V(e_a(n)))^T Σ_{i=n-L}^{n-1}{u(n) - u(i)}].  (4-10)
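A minimal sketch of one NMEE iteration (4-10) follows, assuming real-valued data and a sliding window of L past samples; the small constant eps added to the denominator is our guard against division by zero (a fix discussed again in the conclusions).

    import numpy as np

    def nmee_step(w, U, d, eta=0.5, sigma=1.0, eps=1e-8):
        """One NMEE update. U: (L+1, M) inputs u(n-L)..u(n); d: matching desired."""
        e_a = d - U @ w                         # a priori errors e_a(i)
        de = e_a[-1] - e_a[:-1]                 # e_a(n) - e_a(i), i = n-L..n-1
        dU = U[-1] - U[:-1]                     # u(n) - u(i)
        k = np.exp(-de**2 / (4 * sigma**2)) * de
        grad = (k[:, None] * dU).sum(axis=0) / (2 * sigma**2 * len(de))
        num = de.sum() * grad                   # extra error term in the numerator
        den = grad @ dU.sum(axis=0) + eps       # (grad)^T sum{u(n)-u(i)}
        return w + eta * num / den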

For MEE we see that the update depends not only on the error and the norm of the input, but also on the kernel size through the information force [14], given by

(1/(2σ²L)) Σ_{i=n-L}^{n-1} κ_√2σ(e_p(n) - e_p(i)) {e_p(n) - e_p(i)}.  (4-13)

This means that we can expect NMEE, when compared to the MEE algorithm, to be less dependent not only on the input power but also on the kernel size, because of the normalization by the information force. Furthermore, we see that as compared to NLMS (4-14), an extra error term appears in the numerator of the NMEE update (4-10), which indicates that the speed of convergence will likely change. All these aspects follow from the particular constraint the MEE solution requires and also from the nonlinear nature of the error gradient.

Consider the spectral (eigen) decomposition of the input autocorrelation matrix,

R = Σ_k λ_k q_k q_k^T.  (4-15)

As seen in equation (4-14), NLMS uses the power of the input u(n) in its update equation. From the spectral estimation view of (4-15), this corresponds to information about the largest eigenvalue only. On the other hand, NMEE takes into account all the eigenvalues (through the R_M matrix) and hence the curvature information. This difference is triggered by the distinct constraints in the two formulations:
NMEE: min ‖w(n+1) - w(n)‖² subject to V(e_p(n)) = V(0)  (4-16)
  ⟺ min ‖w(n+1) - w(n)‖² + λ [V(0) - V(e_p(n))]

NLMS: min ‖w(n+1) - w(n)‖² subject to d(n) - w(n+1)^T u(n) = 0  (4-17)
  ⟺ min ‖w(n+1) - w(n)‖² + λ [d(n) - w(n+1)^T u(n)]  (4-18)

Extending the constraint to the last L samples gives an affine-projection-type formulation:

min ‖w(n+1) - w(n)‖² + Σ_{k=0}^{L-1} λ_k [d(n-k) - w(n+1)^T u(n-k)].  (4-19)
Solving this formulation leads to an update of the form

w(n+1) = w(n) + η A(n)^T R^-1 e(n),  (4-20)

where A(n) collects the last L input vectors and e(n) the corresponding a priori errors [3].

The FIR adaptive filter is selected with equal order. A standard method of comparing the performance in system identification is to plot the weight error norm, since this is directly related to misadjustment [3]. In each case the power of the weight noise (averaged over 125 samples) was plotted versus the number of iterations performed. The input to both the plant and the adaptive filter is white Gaussian noise. In the first experiment, a unit power input was used (1P), whereas in the second experiment an input with ten times the power (10P) was selected. We choose a proper kernel size by using Silverman's rule in MEE (σ_mee = 0.7) and NMEE (σ_nmee = 1), respectively.

Figure 4-1. Average weight error power of MEE, NMEE and NLMS for MA(9) with different input powers (P = 1, 10).

In this perfect identification, we can exactly track the output of the plant. Figure 4-1 shows the plot of the weight error norm for a moving average model. We choose as large a step size as possible within the range of MEE stability. For the unit input power case, both MEE and NMEE converge in 190 iterations with basically the same misadjustment (10^-16 or below). To guarantee the stability of MEE adaptation for the 10x input power, the step size is chosen almost 10 times smaller, while it remains at the same value for NMEE. NMEE takes just 190 iterations to converge, with a misadjustment of 1.562x10^-10, as compared to MEE, which takes nearly 700 iterations with a misadjustment of 1.243x10^-8. We can therefore conclude that the NMEE is insensitive to the input power.

Figure 4-1 also shows the convergence of the NLMS for comparison purposes (misadjustment of 10^-16); we conclude that the NMEE is faster than the NLMS.

We next study the sensitivity to the kernel size, which is normally set by heuristics [22]. The effect of kernel sizes on both MEE and NMEE is shown in figures 4-2(a) and 4-2(b). Figure 4-2(a) shows the weight error power curves with different kernel sizes when the input data is white Gaussian with zero mean, ten times unit variance, and an eigenvalue spread ratio (S) of 1. Figure 4-2(b) shows the results in the case of colored Gaussian input, whose variance is 6 and eigenvalue spread ratio (S) is 550. We observe that the performance of MEE is sensitive to the kernel size and this sensitivity increases when the eigenvalue spread of the input signal increases. However, NMEE shows a much more uniform performance with different kernel sizes, even when the input has a large dynamic range. The misadjustment of NMEE in the worst case is almost the same as that of MEE in the best case. Furthermore, the kernel sizes of 0.7 and 1 for MEE and NMEE respectively are found to be optimal, giving the lowest misadjustments of 6.711x10^-10 and 8.299x10^-14 in the case when S = 1, and 3.865x10^-8 and 1.472x10^-12 when S = 550.

Another important aspect of these experiments is the different speed of convergence between the MEE and the NMEE. Figure 4-2 clearly shows that there are two sets of curves, one for each algorithm. We could interpret this by hypothesizing that the learning rates are not compatible with the same misadjustment. However, a closer look shows that the NMEE is faster than the MEE and provides a smaller misadjustment (8.299x10^-14 versus 6.711x10^-10). Therefore the NMEE converges faster than the MEE.

Specifically, we test the effect of the kernel size on both MEE and NMEE. The input data is white Gaussian distributed with zero mean and variance 10. The measurement noise is white Gaussian with zero mean and variance 10^-3. Also, we set the window length to 100. In order to compare NMEE with MEE, we use the weight error power as the performance measure. The values were averaged after convergence for each kernel size.

Figure 4-2. Average weight error power for MA(9) with different kernel sizes (solid lines: NMEE, dashed lines: MEE): (b) colored Gaussian input with variance 15, eigenvalue spread S = 550.

Figure 4-3. Average weight error power of MEE and NMEE for MA(9) with respect to the kernel size.

As can be observed in figure 4-3, the weight error power of MEE initially decreases with the amplitude of the kernel size but eventually starts to increase again. However, NMEE has the same performance with respect to the amplitude of the kernel size except in the unstable region (σ < 0.5). Therefore, we conclude that NMEE is less sensitive to the kernel size as compared to MEE.

The unknown system is now the first order autoregressive model

H2(z) = 1 / (1 - 0.9 z^-1).  (4-22)

The unknown system H2(z) is approximated by a moving average model with 16 taps. Two sets of white Gaussian noise inputs, with unit variance and ten times unit variance, are applied to both the plant and the adaptive filter. We choose a proper kernel size for MEE (σ_mee = 0.7) and NMEE (σ_nmee = 1), respectively.

Figure 4-4. Average weight error power of MEE and NMEE for AR(1) in case of best results.

Figure 4-4 shows the best results for MEE and NMEE of the average weight error power to identify the first order autoregressive model. The 10x input variance produces the larger misadjustment of 0.1, while the unit input variance has a misadjustment of 0.05. We can observe that the MEE with 10x input variance converges after 1300 iterations. The NMEE results are very different. Indeed, the curves for the unit and 10x power inputs are very close to each other, and converge within 400 iterations. These results show that even when the identification is done with a residual error, the NMEE converges faster and is basically independent of the input power.

The MEE algorithm is based on simple gradient techniques, so it requires a priori knowledge of the input process statistics (power and dynamic range) to select the learning rate for stability and convergence. Since this knowledge is usually unavailable, the step size is normally estimated prior to beginning the adaptation process for a given misadjustment or speed of convergence. This means the algorithm will suffer from the usual tradeoff between misadjustment and speed of convergence. Moreover, like many other gradient methods, it generally converges rather slowly [2], [29].

An effective alternative to gradient methods is fixed-point algorithms, since these algorithms are step-size free and proven to be faster and more stable [47], [48]. This characteristic is due to the second order update, similar to the RLS algorithm, that tracks the Wiener solution with every update [47]. On the other hand, these algorithms are computationally more demanding. Although until recently fixed-point algorithms had been limited to applications with strict requirements in convergence speed, there has been growing interest in this type of algorithm due to recent increases in computational power. Applying second order optimization techniques to MEE by taking advantage of fixed-point algorithms, in this section the fixed-point minimum error entropy (MEE-FP) algorithm is presented. Although fixed-point methods have been applied to the MEE criterion before [48], here we take a systematic approach. In addition to a reinterpretation of the update equation in a way that greatly resembles the Wiener solution, both batch and online modes are considered and the algorithm is studied in depth for its convergence and statistical properties. Some preliminary results of this work were previously presented [32].
The first order optimality condition requires the gradient of the information potential to vanish,

∂V(e)/∂w = 0.  (5-1)

This condition implies that at w*, there exists a local optimum for which e_1 = e_2 = ... = e_N. Computing the gradient of the information potential assuming a Gaussian kernel yields

∂V(e)/∂w = (1/(2σ²N²)) Σ_{i=1}^{N} Σ_{j=1}^{N} κ_√2σ(e_j - e_i) [e_j - e_i] (u_j - u_i),  (5-2)

where e_i = d_i - y_i = d_i - w^H u_i, i = 1, ..., N. Substituting this gradient in equation (5-1) and rearranging into the fixed point form w = F(w) serves as the basis for the iteration algorithm. At iteration k, let the weight vector w_k be the estimate of an optimal solution. Then, the estimate of the new weight vector at the next iteration according to w = F(w) is

w_{k+1} = F(w_k) = R_E^-1(w_k) P_E(w_k),  (5-3)

where R_E(w_k) and P_E(w_k) are the pairwise incremental autocorrelation and crosscorrelation of the input and desired signals, weighted by the error kernels derived from the entropy formulation:

R_E(w_k) = Σ_{i=1}^{N} Σ_{j=1}^{N} κ_√2σ(e_j - e_i) (u_j - u_i)(u_j - u_i)^H,  (5-4)

P_E(w_k) = Σ_{i=1}^{N} Σ_{j=1}^{N} κ_√2σ(e_j - e_i) (d_j - d_i) (u_j - u_i).  (5-5)
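A minimal batch-mode sketch of the fixed-point iteration (5-3)-(5-5) follows, assuming real-valued signals (so transposes replace Hermitian transposes); the toy data at the end only illustrates usage.

    import numpy as np

    def mee_fp_step(w, U, d, sigma=1.0):
        """One fixed-point iteration w <- R_E(w)^{-1} P_E(w) on the full batch."""
        e = d - U @ w
        de = e[:, None] - e[None, :]
        k = np.exp(-de**2 / (4 * sigma**2))           # kernel of size sqrt(2)*sigma
        dU = U[:, None, :] - U[None, :, :]            # pairwise input increments
        dd = d[:, None] - d[None, :]                  # pairwise desired increments
        R_E = np.einsum('ij,ijm,ijn->mn', k, dU, dU)  # entropy-weighted autocorrelation
        P_E = np.einsum('ij,ij,ijm->m', k, dd, dU)    # entropy-weighted crosscorrelation
        return np.linalg.solve(R_E, P_E)

    # a few iterations typically suffice on a linear plant
    rng = np.random.default_rng(1)
    U = rng.standard_normal((200, 4))
    w_true = np.array([1.0, -0.5, 0.25, 0.1])
    d = U @ w_true
    w = np.zeros(4)
    for _ in range(5):
        w = mee_fp_step(w, U, d, sigma=1.0)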

This update rule tracks, at every iteration, the solution of the nonlinear optimality condition; it is conceptually analogous to the RLS update rule that tracks the Wiener solution with every update [2].

The convergence of the iteration can be studied with the contraction mapping theorem [49], which ensures that in a complete metric space, a fixed-point algorithm based on a contractive operator surely converges to a steady state. In this case, the iteration is carried out on an L_w-dimensional hyper-sphere that is a completely metrizable space, and the fixed-point operator F(w_k) is apparently differentiable. As a consequence, if the contraction condition

‖∂F(w_k)/∂w‖ < 1  (5-6)

is satisfied at every iteration, the convergence of the algorithm is guaranteed. Notice that there might be more than one possible solution to the equation w = F(w). However, of these, only stable points (i.e., those for which equation (5-6) holds) will be actual solutions found by the optimization.

Consider an approximation of the weight update function F(w_k) by employing a Taylor series expansion truncated at the linear term around the optimal weight vector. If we denote the weight error in w_k by ŵ_k = w_k - w* and use the fact that F(w*) = w*, we get

ŵ_{k+1} ≈ (∂F(w*)/∂w) ŵ_k.

Consider the fixed-point function F(w_k) of equation (5-3), where w_k = [w_{k1}, w_{k2}, ..., w_{kt}, ..., w_{kM}]^T and M is the filter order. Differentiating F(w_k) with respect to each component w_{kt} and evaluating at w_k = w*, all the Gaussian-kernel derivative terms vanish, since at w_k = w* a local optimum exists for which e_1 = e_2 = ... = e_N and the pairwise error differences are zero. Hence ∂F(w*)/∂w = 0, condition (5-6) holds at the optimum, and the MEE-FP is locally convergent around the solution w*. Q.E.D.

In other words, Theorem 6 states that MEE-FP always converges whenever the initial condition w_0 is chosen sufficiently close to w*.

The fixed-point update of equation (5-3) optimizes the weights at every step considering the whole data set, that is, in what is known as batch-mode. For an online application, however, batch-mode computation is not feasible. An alternative approach would be to estimate at each time the information potential in a window of the previous samples, but the small number of samples makes the estimation less robust. Recursive estimation is a better approach which alleviates these difficulties.

Following the approach in [50], a recursive formula is derived to update the estimates of the entropy weighted delta autocorrelation R_E(w_k) and the entropy weighted delta crosscorrelation P_E(w_k) matrices of equations (5-4) and (5-5) when a new sample is acquired. When a new sample arrives, R_E(w_{k-1}) and P_E(w_{k-1}) are modified using the new input-desired sample pair {u(k), d(k)}. This exact recursion is useful for estimating the entropy weighted delta autocorrelation and crosscorrelation of stationary signals; however, it is not suitable for nonstationary environments due to its increasing memory depth. Thus, a forgetting recursive estimator is used instead:

R_E(w_k) = λ R_E(w_{k-1}) + ((1-λ)/L) Σ_{i=k-L}^{k-1} κ_√2σ(e_k - e_i) (u_k - u_i)(u_k - u_i)^T,  (5-17)

P_E(w_k) = λ P_E(w_{k-1}) + ((1-λ)/L) Σ_{i=k-L}^{k-1} κ_√2σ(e_k - e_i) (d_k - d_i) (u_k - u_i),  (5-18)

where the parameters λ and L are the forgetting factor and window length for the stochastic estimation of R_E(w_k) and P_E(w_k), respectively. These free design parameters will affect the properties of the recursive estimation. Notice that λ here plays the same role as the forgetting factor in RLS.

The weight update in equation (5-3) is costly both computationally and memory-wise. This is because it requires an inversion, at all time instants, of an MxM coefficient matrix R_E(w_k), where M is the filter order. This matrix inversion requires O(M³) operations. Similarly to the approach taken in the RLS algorithm, we can directly derive an update expression for R_E^-1(w_k) based on the inversion lemma, which can be more computationally efficient because it is dependent upon L. First, we need to convert the summation of the recursive estimator in equation (5-17) to the outer-product form

R_E(w_k) = λ R_E(w_{k-1}) + ((1-λ)/L) ΦΦ^T,  (5-19)

where Φ = [q_{k1}, q_{k2}, ..., q_{kL}] and q_i = √(κ_√2σ(e_k - e_i)) (u_k - u_i). Applying the matrix inversion lemma

(A + BCD)^-1 = A^-1 - A^-1 B (C^-1 + D A^-1 B)^-1 D A^-1  (5-20)

with A = λR_E(w_{k-1}), B = ((1-λ)/L)Φ, C = I and D = Φ^T yields a direct recursion for R_E^-1(w_k), equation (5-21).
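Under our reading of the reconstructed recursions (5-17)-(5-18), a minimal online sketch might look as follows; the regularized initialization delta*I is our addition to keep the first inversions well posed.

    import numpy as np

    def recursive_mee_fp(U, d, lam=0.99, L=100, sigma=1.0, delta=1e-3):
        """Online MEE-FP: forgetting recursions for R_E, P_E, then w = R_E^{-1} P_E."""
        N, M = U.shape
        R = delta * np.eye(M)                    # regularized initial estimate
        P = np.zeros(M)
        w = np.zeros(M)
        ws = []
        for k in range(1, N):
            lo = max(0, k - L)
            e = d[lo:k + 1] - U[lo:k + 1] @ w
            de = e[-1] - e[:-1]                  # e(k) - e(i) over the window
            dU = U[k] - U[lo:k]
            dd = d[k] - d[lo:k]
            kk = np.exp(-de**2 / (4 * sigma**2))
            R = lam * R + (1 - lam) * (kk[:, None] * dU).T @ dU / len(de)
            P = lam * P + (1 - lam) * (kk * dd) @ dU / len(de)
            w = np.linalg.solve(R, P)
            ws.append(w.copy())
        return np.array(ws)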

The matrix inversion included in equation (5-21) requires O(L³) operations, where L is the window length.

Compared with the MSE cost, the information potential estimator of equation (2-5) includes an additional summation, showing that the IP depends on all pairwise differences between error samples. Indeed, the pairwise difference between error samples is the key element in the estimation of entropy, since it accounts for the information expressed in interactions between samples of the error, which is neglected by the MSE criterion.

Likewise, comparing the recursive estimator for the autocorrelation matrix in the RLS algorithms (e.g. EW-RLS [2], EX-RLS [51]) with the MEE-FP counterpart (the entropy weighted delta autocorrelation matrix, ER), we again verify the presence of an extra summation. For this reason, while in RLS the estimate of the new autocorrelation is obtained from the current input vector, in the recursive MEE-FP the estimation of the ER matrix must account for the relation of the current input and error with all previous inputs and corresponding errors. In practice, however, to limit the computational complexity this estimation is truncated to a window of the last L samples. Based on this observation, it is natural to expect an increase in the computational complexity of the error entropy based algorithms such as MEE-FP, unless L = 1.

In big-O notation, because RLS involves only matrix multiplications, the computational complexity is O(M²), where M is the filter order. On the other hand, the recursive MEE-FP in the form of equations (5-17) and (5-18) is of order O(M³ + M²L). The O(M²L) term is due to the evaluation of the contribution to the ER matrix and EP vector of the current state (input and error). The O(M³) term relates to the inversion of the ER matrix. In the second form, shown in equation (5-21), the computational complexity is O(M²L + L³).
Consider the situation where M is significantly smaller than L in the recursive MEE-FP. In this case the first form of the algorithm is simpler, and the computational complexity simplifies to O(M²L), since this is the dominant term. Conversely, if L is significantly smaller than M, then the second form of the recursion is more computationally efficient. Similarly, in this case the O(M²L) term dominates the computational complexity. Consequently, for a given value of M, in either case the complexity is higher than that of RLS by a linear factor of L. This factor is the result of the need for an additional summation in the estimation of the information potential and, in this sense, it is not at all an unreasonable increase. For the extreme case L = 1, both RLS and recursive MEE-FP have the same computational complexity, O(M²).

Finally, we analyze the differences in speed and direction of convergence due to the fixed-point update of MEE-FP in comparison to MEE.
We illustrate the recursive MEE-FP in a system identification setting [3]. The FIR adaptive filter is selected with equal order (model order = 9). The input to both the plant and the adaptive filter is white Gaussian noise with unit power. The observation noise is white Gaussian distributed with zero mean and variance 10^-10. The objective is to adapt the weights of an FIR filter so as to emulate the plant as closely as possible. A standard method of comparing the performance in system identification problems is to plot the weight error norm, since this is directly related to misadjustment. In this case, ideally, we can exactly track the output of the plant. Although entropy cost functions are designed for identification of nonlinear systems with nonlinear filters, here we are interested in the adaptation performance only, so we chose a very simple linear model and plant to obtain a convex cost function.

In the first simulation, we investigate the effect of the forgetting factor on the convergence time and the convergence accuracy (variance after convergence) of the recursive MEE-FP. For this purpose, we have utilized this recursion for 2000 iterations. Five different values are used for the forgetting factor: 0.1, 0.9, 0.95, 0.99 and 0.995. The convergence plots of the estimates are shown in figure 5-1. Starting from the same initial estimate, the five recursions converge after approximately 30, 70, 170, 700 and 1500 iterations. As expected, the faster the convergence, the larger the estimation variance. When we evaluate the variances of the estimated values over the last 1000 samples of each convergence curve, we see that smaller forgetting factors result in larger variance; the variances (means) are, respectively, 1.32x10^-22 (3.05x10^-11), 8.78x10^-23 (2.11x10^-11), 4.11x10^-23 (1.41x10^-11), 2.25x10^-24 (3.05x10^-12) and 4.54x10^-25 (2.07x10^-12). In these runs, we have used L = 100 and σ = 1. This result conforms to the well-known general behavior of the forgetting factor in recursive estimates. There is an intrinsic trade-off between speed and variance, which the designer must consider in selecting the forgetting factor.

Figure 5-1. Convergence of the recursive MEE-FP for MA(9) with different values of the forgetting factor.

The second simulation studies the effect of the window length, which approximates the expectation operator. For this purpose, we have fixed the forgetting factor to λ = 0.99 and the kernel size to σ = 1. Three values of L are tried: 10, 100, and 1000. The results of the recursive MEE-FP using these three different window lengths are shown in figure 5-2. As expected, the speed of convergence (at 700 iterations) is not affected by the variations in this parameter. Only the estimation variance after convergence is slightly affected. Specifically, the variances (means) of the estimates for these three cases over the last 1000 iterations of the recursion are 3.76x10^-24 (4.11x10^-12), 2.16x10^-24 (2.99x10^-12) and 3.04x10^-25 (1.20x10^-12). This conforms to the general behavior of the sample mean approximation for expectation: the more samples used, the smaller the variance gets. The trade-off in the selection of this parameter is between the accuracy after convergence and the memory requirement. The larger L gets, the more storage space is required for previous samples in memory; on the other hand, the estimation variance is decreased.

Figure 5-2. Convergence of the recursive MEE-FP for MA(9) with different values of the window length.

The third simulation investigates the effect of kernel size on the convergence performance of the recursive MEE-FP. As we know, Parzen windowing has a bias that increases with larger kernel sizes, whereas its variance increases with smaller kernel sizes. The convergence plots of the recursions for various values of the kernel size are shown in figure 5-3. In all runs, the forgetting factor was fixed to 0.9 and the window length was taken as 100. For the Gaussian kernel function with sizes of 0.001, 0.01, 0.1 and 1, the variances (means) over the last 1000 samples of the recursions turned out to be 2.61x10^-24 (3.94x10^-12) in all cases. The speed and accuracy of convergence is not affected by the variations of the kernel size for this example. On the other hand, as can be observed in the enlarged plot, the smaller the kernel size, the flatter the slope of the weight error curve at the origin, followed by a sharp drop before convergence.

Figure 5-3. Convergence of the recursive MEE-FP for MA(9) with different values of the kernel size.

Figure 5-4. Convergence performance of the recursive MEE and the recursive MEE-FP for MA(9).
Figure 5-4 shows the plot of the weight error norm. The weight misadjustment values (variances) for the last 1000 samples of error are 3.19x10^-11 (3.84x10^-22) and 1.76x10^-11 (8.23x10^-23) for the recursive MEE and the recursive MEE-FP, respectively (for practical purposes, we consider misadjustment values as zero when less than 10^-30). Thus, with the same misadjustment values, it can be observed that MEE-FP converges in 100 iterations whereas MEE takes 1200 iterations to converge.

The unknown system is now the first order autoregressive model

H2(z) = 1 / (1 - 0.9 z^-1).  (5-23)

The unknown system H2(z) is approximated by a moving average model with 16 taps. White Gaussian noise inputs with unit variance are applied to both the plant and the adaptive filter. We have set the window length to 100 and the kernel size to 1. Also, we have fixed the forgetting factors of MEE and fixed-point MEE to 0.6 and 0.95, respectively.

Figure 5-5 shows the best results for MEE and MEE-FP of the average weight error power to identify the first order autoregressive model. We can observe that the recursive MEE-FP takes just 35 iterations to converge, with a misadjustment of 0.0248, as compared to the recursive MEE, which takes nearly 500 iterations with a misadjustment of 0.0254. This result shows that even when the unknown plant is an AR model, the recursive MEE-FP converges faster than the gradient-based recursive MEE.
Figure 5-5. Convergence performance of the recursive MEE and recursive MEE-FP for AR(1).

The effect of the different eigenvalue spreads on both recursive MEE and recursive MEE-FP is shown in figures 5-6 and 5-7. These figures show the weight error power curves and the weight tracks for eigenvalue spread ratios (S) of 1.22 and 10, respectively. To obtain this input data, zero-mean unit-variance white Gaussian noise was fed to a second-order infinite impulse response (IIR) filter (see Haykin [2] for details). The convergence characteristics of both algorithms are compared for the two eigenvalue spreads and two window sizes, L = 1 and L = 100. For comparison, we choose the MEE step size and the MEE-FP forgetting factor such that they create the same misadjustment.
Figure 5-6. Performance of MEE and MEE-FP for eigenvalue spread S = 1.22: (b) weight track on contour of IP surface.
Figure 5-7. Performance of MEE and MEE-FP for eigenvalue spread S = 10: (b) weight track on contour of IP surface.
The important aspect of these experiments is to verify the expected difference in speed and direction of convergence between the MEE and MEE-FP. From figures 5-6 and 5-7, we can observe that the MEE-FP converges much faster than the MEE algorithm, as expected, and the MEE-FP with L = 1 performs comparably to MEE with L = 100. Figure 5-7(b) clearly shows that the weights of MEE move in the direction of the steepest slope, whereas the weights of MEE-FP move directly along the direction of the optimal solution. Put differently, the eigenvalue spread does not affect the speed and direction of the MEE-FP algorithm, unlike MEE, even when L is small.

The next experiment compares the convergence of the MEE-FP and RLS algorithms. The forgetting factor of all algorithms was set to 0.9. Moreover, 100 different initializations were chosen randomly according to the Gaussian distribution with zero mean and variance 10.

From figure 5-8, we can observe that the MEE-FP performs better than the RLS algorithm, and even the MEE-FP with L = 1 converges much faster than the RLS algorithm for the same misadjustment.
Figure 5-8. Performance of RLS and MEE-FP for two different eigenvalue spreads (S = 1.22 and 10): (b) weight error power for S = 10.
Information Theoretic Learning (ITL) is a methodology to non-parametrically estimate entropy and divergence directly from data, with direct applications to adaptive systems training [14]. The centerpiece of the theory is a new estimator for Renyi's quadratic entropy that avoids the explicit estimation of the probability density function. The argument of the logarithm of Renyi's entropy is called the information potential (IP), and since the logarithm is a monotonic function, it is sufficient to use the IP in training [52]. ITL has been used in independent component analysis (ICA) [53], blind equalization [54], clustering [55], and projections that preserve discriminability [56].

One of the difficulties of ITL is that the calculation of the IP is O(N²), which may become prohibitive for large data sets. A stochastic approximation of the IP called the stochastic information gradient (SIG) [37] decreases the complexity to O(N), but slows down training due to the noise in the estimate.

This chapter presents an effort to make the estimation faster and more accurate using the fast Gauss transform (FGT) [57] and the improved fast Gauss transform (IFGT) [58]. The FGT is one of a class of very interesting and important new families of fast evaluation algorithms that have been developed over the past dozen years to enable rapid calculation of approximations, at arbitrary accuracy, to matrix-vector products of the form Ad, where a_ij = φ(x_i - x_j) and φ is a particular special function. These sums first arose in astrophysical observations, where the function φ was the gravitational field. The basic idea is to cluster the sources and target points using appropriate data structures, and to replace the sums with smaller summations that are equivalent to a given level of precision.

The FGT algorithm has successfully accelerated kernel density estimation to linear running time for low-dimensional problems. Unfortunately, the cost of a direct extension of the FGT to higher-dimensional problems grows exponentially with dimension, making it impractical for dimensions above 3. The IFGT was developed to reduce this cost in higher dimensions.

We will use here the FGT algorithm proposed by Greengard and Strain [57], the IFGT algorithm proposed by Yang, Duraiswami and Gumerov [58], and the farthest-point clustering proposed by Gonzalez [59] for evaluating Gaussian sums.

Renyi's quadratic entropy estimator H_R2(X) for a set of discrete data samples x_i ∈ R^d is

H_R2(X) = -log[ (1/N²) Σ_{i=1}^{N} Σ_{j=1}^{N} κ_√2σ(x_i - x_j) ],

where κ_√2σ is the Gaussian kernel of size √2 σ.
The main idea of the FGT [57] is to expand the Gaussian function into multivariate Hermite functions. We apply the FGT idea by using the following expansion for the Gaussian in one dimension:

exp(-(x_i - x_j)²/(4σ²)) = Σ_{n≥0} (1/n!) ((x_j - c)/(2σ))^n h_n((x_i - c)/(2σ)),

where the Hermite function h_n(x) is defined by

h_n(x) = (-1)^n (d^n/dx^n) exp(-x²),

and c is the expansion center. The extension to higher dimensions is done by treating the multivariate Gaussian as a Kronecker product of univariate Gaussians. Following the multi-index notation of the original FGT papers [57], [60], we define the multidimensional Hermite function as

h_α(x) = h_{α1}(x_1) h_{α2}(x_2) ... h_{αd}(x_d),

where x = (x_1, ..., x_d)^T ∈ R^d and α = (α_1, ..., α_d) is a multi-index.
If we truncate each of the Hermite series after p terms, then each of the coefficients C_α is a d-dimensional matrix with p^d terms. The total computational complexity for a single Hermite expansion is O(Nkp^d), where k is the number of clusters. The factor O(p^d) grows exponentially as the dimensionality increases.

The key issue in speeding up the computation of the information potential with the FGT is to reduce the factor p^d in the computational complexity. To reduce this factor, the alternative method introduced by Yang, Duraiswami and Gumerov [58] is to expand the Gaussian function into a multivariate Taylor series. We factorize the Gaussian function as

exp(-‖x_i - x_j‖²/(4σ²)) = exp(-‖x_i - c‖²/(4σ²)) exp(-‖x_j - c‖²/(4σ²)) exp(2(x_i - c)·(x_j - c)/(4σ²)).  (6-9)

For the third term in equation (6-9), we break the entanglement by expanding it into a Taylor series as

exp(2(x_i - c)·(x_j - c)/(4σ²)) = Σ_α (2^|α|/α!) ((x_i - c)/(2σ))^α ((x_j - c)/(2σ))^α,  (6-10)

where the factorial and the length of the multi-index α are defined as α! = α_1!α_2!...α_d! and |α| = α_1 + α_2 + ... + α_d. Thus, the information potential V(x) can be expanded into a Taylor series.
where the coefficients C_α are given by the corresponding source sums over each cluster. To store all the coefficients D_α, we sort the coefficient terms according to a graded lexicographic order. One of the benefits of the graded lexicographic order is that the expansion of multivariate polynomials can be performed efficiently. For a d-variate polynomial of order p, we can store all terms in a vector of length

r_{p,d} = C(p + d, d) = (p + d)! / (d! p!).

Using the entropy estimator of equation (6-2), the information potential for a set of discrete error samples becomes [21]

V(e) = (1/N²) Σ_{i=1}^{N} Σ_{j=1}^{N} κ_√2σ(e_i - e_j).

Minimizing the error entropy in equation (6-1) is equivalent to maximizing the information potential, since the log is a monotonic function. Thus, the weight update of MEE is

w(n+1) = w(n) + η ∇V(e),

with the gradient
∇V(e) = (1/(2σ²N²)) Σ_{i=1}^{N} Σ_{j=1}^{N} κ_√2σ(e_i - e_j) (e_i - e_j) (∂y_j/∂w - ∂y_i/∂w).

For efficient computation, the information potential V_H(e) estimated with the FGT from equation (6-7) is given as

V_H(e) = (1/(2σ√π N²)) Σ_B Σ_{e_i} Σ_{n<p} (1/n!) C_n(B) h_n((e_i - c_B)/(2σ)),

where C_n(B) is defined by

C_n(B) = Σ_{e_j ∈ B} ((e_j - c_B)/(2σ))^n,

with B indexing the clusters and c_B their centers. The gradient of the information potential V_H(e) with respect to the weights is given by the analogous expansion, where ∇C_n(B) collects the derivative terms ∂e_j/∂w of the samples in each cluster. In the Taylor-series expansion, the information potential V_T(e) from equation (6-11) is given as

V_T(e) = (1/(2σ√π N²)) Σ_B Σ_{e_i} exp(-‖e_i - c_B‖²/(4σ²)) Σ_{|α|<p} D_α(B) ((e_i - c_B)/(2σ))^α,

where D_α(B) is defined by

D_α(B) = (2^|α|/α!) Σ_{e_j ∈ B} exp(-‖e_j - c_B‖²/(4σ²)) ((e_j - c_B)/(2σ))^α.
The gradient of V_T(e) follows the same expansion, with ∇D_α(B) defined analogously through the error derivatives.

6.3.1 First Study: Entropy Estimation using FGT and IFGT

The first study compares the direct estimation of the information potential (2-5) with the fast methods using the FGT (6-7) and the IFGT (6-11). We randomly generate the sample points (N) in a unit hypercube according to a uniform distribution. The bandwidth is set to σ = 1. All the algorithms are programmed in MATLAB and were run on a 1.8 GHz PIV PC.

The first experiment examines the performance of the estimation when varying the number of clusters (K) from 1 to 20. We generate N = 5000 points in 3 dimensions and use expansion orders p = 3 and 6. Results are depicted in figure 6-1. Figure 6-1(a) shows the relation between the IP estimation and the number of clusters. As the number of clusters increases, the absolute error decreases quickly at first and then levels off. From figure 6-1(b), we notice that the running time of the computation grows linearly with the number of clusters.

The second experiment examines the estimation performance by varying the expansion order from 2 to 10. We again generate N = 5000 samples in 3 dimensions and fix the number of clusters to K = 20. As can be observed in figure 6-2(a), the absolute error decreases as the expansion order increases.
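To make the expansion concrete, here is a minimal one-dimensional, single-cluster sketch of the Hermite approximation of the information potential against the direct O(N²) sum; a single cluster suffices here only because the toy errors span a few kernel widths, and the truncation order p is illustrative.

    import numpy as np

    def ip_direct(e, sigma=1.0):
        """Direct O(N^2) information potential with kernel size sqrt(2)*sigma."""
        de = e[:, None] - e[None, :]
        return np.exp(-de**2 / (4 * sigma**2)).mean() / (2 * sigma * np.sqrt(np.pi))

    def ip_fgt(e, sigma=1.0, p=12):
        """Single-cluster FGT: exp(-(x-y)^2/4s^2) = sum_n (t^n/n!) h_n(s)."""
        N, c = len(e), e.mean()                  # expansion center
        s = (e - c) / (2 * sigma)
        # Hermite polynomials H_n via recurrence, then h_n(s) = exp(-s^2) H_n(s)
        H = np.zeros((p, N))
        H[0] = 1.0
        if p > 1:
            H[1] = 2 * s
        for n in range(2, p):
            H[n] = 2 * s * H[n - 1] - 2 * (n - 1) * H[n - 2]
        h = np.exp(-s**2) * H                    # Hermite functions
        C = np.array([np.sum(s**n) for n in range(p)])   # source moments C_n
        fact = np.cumprod(np.concatenate(([1.0], np.arange(1, p))))  # n!
        per_target = (C / fact) @ h              # sum_n C_n h_n(s_i) / n!
        return per_target.sum() / (N**2 * 2 * sigma * np.sqrt(np.pi))

    e = 0.5 * np.random.default_rng(2).standard_normal(2000)
    print(ip_direct(e), ip_fgt(e))               # the two estimates agree closely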

Figure 6-1. Absolute error and running times for a given expansion order (p = 3 and 6) vs. the number of clusters (K): (b) running times.
Figure 6-2(b) shows the relation between the expansion order and the running times of the direct method, the FGT and the IFGT. The running time of the fast methods grows exponentially as the order of expansion p increases. The FGT method is slower than the IFGT. Moreover, the FGT method with p > 7 is slower than the direct evaluation.

The third experiment examines the performance of the estimation in 2 and 3 dimensions by varying the number of samples. The number of clusters is set to K = 20 and the order of expansion to p = 5. We compare the absolute error and the running time of the direct method with the two fast methods as a function of the number of samples, from N = 100 to N = 10000, in figure 6-3. The absolute error increases with dimensionality, but not with the number of samples. In figure 6-3(b), we notice that the running time of the direct method grows quadratically with the number of samples, while those of the FGT method and the IFGT method grow linearly.

The second study is a system identification problem trained using the minimization of the error entropy [21]. Although the true advantage of MEE is for nonlinear system identification with nonlinear filters, here the goal is to compare adaptation accuracy and speed, so we elected to use a linear plant and an FIR adaptive filter with the same plant order (zero achievable error). A standard method of comparing the performance in system identification problems is to plot the weight error norm, since this is directly related to misadjustment. In each case the power of the weight noise was plotted versus the number of training epochs.
Figure 6-2. Absolute error and running times for a given number of clusters (K = 20) vs. the order of expansion (p): (b) running times.
Figure 6-3. Absolute error and running times for a given number of clusters (K = 20) and expansion order (p = 5) vs. the number of samples: (b) running times.
As can be observed in figure 6-4(a), all the adaptive methods based on the information potential produce converging filters. However, the spread of convergence and the weight error values of the final epoch are different. The FGT method performs better in training the adaptive system as compared to the IFGT method. The FGT with p = 2 has a better performance when compared with the IFGT with p = 10. The FGT with p = 10 is virtually identical to the direct method.

Figure 6-4(b) shows the plot of the number of clusters during adaptation. Since the error is decreasing at each epoch, the number of clusters gets progressively smaller. In this case, where the achievable error is zero, the number reduces to one cluster after adaptation. The IFGT with p = 2 and 10 goes to one cluster after 12 and 10 epochs respectively, while the FGT method with p = 2 and 10 does so after 5 epochs.
Figure 6-4. Comparison of MEE, fast MEE with FGT and IFGT for system identification (MA(9)): (b) number of clusters.
Many regression methods use least squares for model fitting [61], [62]. A problem frequently encountered in the application of regression is the presence of one or more outliers in the data. Outliers can arise from simple computational or coding mistakes, from including observations from a different population, or from response values that are due to machine failures or transient effects. One outlying observation can destroy least squares estimation, resulting in parameter estimates that do not provide useful information for the majority of the data. The entropy criterion has been utilized as an improvement to least squares estimation in the presence of outliers.

The stochastic gradient of the LMS cost is

∇̂_LMS = ∂e²(k)/∂w = -2 e(k) x(k).  (7-1)

The filter coefficient does not reach the true solution due to the effect of the impulsive noise, and an error remains in the pseudo error signal, resulting in residual error. The residual error e(k) that is not eliminated has a magnitude that is buried in the impulsive noise n(k).

On the other hand, the gradient in MEE is given in equation (7-2). In the subsequent section we show that it is ρ(·) which gives the M-estimator property to MEE.
∇̂_MEE = ∂V(e)/∂w = (1/(2σ²L)) Σ_{i=1}^{L} κ_√2σ(de_{k,k-i}) de_{k,k-i} dx_{k,k-i},  (7-2)

where de_{k,k-i} = e(k) - e(k-i) = (d(k) - d(k-i)) - w^T dx_{k,k-i} and dx_{k,k-i} = x(k) - x(k-i). We can easily change the maximization into a minimization by using the maximum value of the information potential:

max_w V(e) ⟺ min_w [V(0) - V(e)] = min_w (1/(2σ√π L)) Σ_{i=1}^{L} ρ(de_{k,k-i}),  (7-4)

where ρ(x) = 1 - exp(-x²/(4σ²)). In equation (7-4), the function ρ(·) gives the contribution of each residual to the cost function. The ρ(·) of MEE satisfies the property

lim_{x→∞} dρ(x)/dx = 0.  (7-5)
Proof.

lim_{x→∞} dρ(x)/dx = lim_{x→∞} (x/(2σ²)) exp(-x²/(4σ²)) = 0.  (7-6)

Note that this condition holds for the MEE cost function, whereas it does not hold for the LMS cost function, where ρ(e(k)) = e²(k).

Robust approaches in the literature (e.g., [63], [64], [65], and [66]) use explicit non-Gaussian statistical models to describe the impulsive behavior of the noise. One such model for describing impulsive noise is the Gaussian mixture (GM) model, used in [63] and [64]. GM modeling is popular in the signal-processing community, mainly in the context of speech recognition; however, little research effort had been directed at its use for impulsive noise in adaptive filtering.
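The saturation property (7-5)-(7-6) is easy to visualize; a short sketch comparing the per-sample cost ρ and its derivative (the influence function) for LMS and MEE, with σ = 1 as in the experiments below:

    import numpy as np

    sigma = 1.0
    x = np.linspace(-10, 10, 1001)

    rho_lms = x**2                                    # LMS: unbounded cost
    rho_mee = 1 - np.exp(-x**2 / (4 * sigma**2))      # MEE: saturating cost

    infl_lms = 2 * x                                  # grows without bound
    infl_mee = x / (2 * sigma**2) * np.exp(-x**2 / (4 * sigma**2))  # -> 0

    # large residuals (outliers) keep pulling the LMS solution, but are
    # effectively ignored by MEE once |x| >> sigma
    print(infl_lms[-1], infl_mee[-1])                 # 20.0 vs ~7e-11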

Figure 7-1. Characteristics of ρ(x) and dρ(x)/dx: (b) graph of dρ(x)/dx.
Another possible model, which gained increasing popularity in the past decade, is the alpha-stable model, used in [65] and [66]. A drawback of this model is the relative complexity of both the analytical derivations and the filter implementations involved.

The GM noise model is

p_N(n) = (1 - ε) p_B(n) + ε p_O(n),  (7-7)

where both p_B of the base noise and p_O of the outlier noise are zero-mean Gaussian densities with variances σ_B² and σ_O², respectively. The parameter ε can be interpreted as the amount of contamination allowed in the model. The "strength" of the impulsiveness can easily be associated with the ratio of the variances, σ_O²/σ_B².

In contrast to conventional stochastic algorithms like LMS, which depend on the instantaneous error sample only, MEE relies on temporal differences of error samples due to the pairwise nature of the kernel estimator for entropy. This observation leads to linear adaptive filter learning algorithms based on temporal difference statistics that are able to reduce (and even completely eliminate) the bias introduced to the filter solution due to the power of noise in these signals.

We assume that the impulsive noise is a white random variable. The pdf of the difference error is then the convolution

p_dN(dn) = p_N ∗ p_N = (1-ε)² p_B ∗ p_B + 2(1-ε)ε p_B ∗ p_O + ε² p_O ∗ p_O.  (7-8)
Figure 7-2. Samples of the Gaussian mixture model with σ_O² = 1.0, 10² and 10⁴.

The last two terms in equation (7-8) contribute the impulsive part of the noise signal for MEE. The proportion of this impulsive part in the noise is

2(1-ε)ε + ε² = 2ε - ε².  (7-9)

Since 2ε - ε² > ε for ε << 1, the proportion of impulsive noise in MEE is always larger than in LMS. Figure 7-2 shows 1000 samples from three kinds of Gaussian mixture model generated from this model with different outlier variances. It is clear how a larger ratio of the variances σ_O²/σ_B² results in a higher probability of large sample levels (impulses).
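A minimal generator for this two-term Gaussian mixture noise (the sampling mechanics are ours; the parameters mirror the figure):

    import numpy as np

    def gm_noise(n, eps=0.05, var_b=1e-4, var_o=1e4, rng=None):
        """(1-eps)*N(0, var_b) + eps*N(0, var_o): base noise plus rare outliers."""
        rng = rng or np.random.default_rng()
        outlier = rng.random(n) < eps                 # impulse with probability eps
        std = np.where(outlier, np.sqrt(var_o), np.sqrt(var_b))
        return std * rng.standard_normal(n)

    samples = gm_noise(1000, eps=0.05, var_o=1e4)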

There is no closed-form expression for the probability density function of α-stable distributions, but the characteristic function φ(t) is given by

φ(t) = exp{ jδt - γ|t|^α [1 + jβ sign(t) w(t, α)] },  (7-11)

where

w(t, α) = tan(απ/2) for α ≠ 1, and w(t, α) = (2/π) log|t| for α = 1,  (7-12)

sign(t) = 1 if t > 0, 0 if t = 0, -1 if t < 0,  (7-13)

and 0 < α ≤ 2 is the characteristic exponent, -1 ≤ β ≤ 1 the symmetry parameter, γ > 0 the dispersion, and δ the location parameter.
Figure 7-3. Samples of three alpha-stable processes with α = 2.0, 1.5 and 1.0 (β = 0, γ = 1/2, δ = 0).

The case of α = 2, β = 0 corresponds to the Gaussian distribution, while α = 1, β = 0 corresponds to the Cauchy distribution and α = 0.5, β = 1 to the Levy distribution. The density functions in these three cases are given by

Gaussian: f(x) = (1/(2√(πγ))) exp(-(x-δ)²/(4γ)),
Cauchy: f(x) = γ / (π [γ² + (x-δ)²]),
Levy: f(x) = √(γ/(2π)) (x-δ)^(-3/2) exp(-γ/(2(x-δ))),

the last of which is concentrated on (δ, ∞). Figure 7-3 shows 1000 samples from three alpha-stable distributed signals with different characteristic exponents, namely α = 2.0, 1.5 and 1.0. It is clear how a smaller value of α results in a higher probability of large sample levels (impulses).
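For reference, SciPy ships a sampler for this family; the following sketch reproduces the flavor of figure 7-3 (α = 2.0, 1.5, 1.0 with β = 0). The unit scale is illustrative only, since SciPy's scale parameterization differs from the dispersion γ used here.

    import numpy as np
    from scipy.stats import levy_stable

    rng = np.random.default_rng(3)
    for alpha in (2.0, 1.5, 1.0):
        s = levy_stable.rvs(alpha, 0.0, loc=0.0, scale=1.0,
                            size=1000, random_state=rng)
        # smaller alpha -> heavier tails -> more extreme impulses
        print(alpha, np.max(np.abs(s)))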

Figure 7-4. Block diagram of AEC.

7.4.1 System and Noise Model

As shown in figure 7-4, the acoustic echo canceller (AEC) seeks to minimize the contribution of the echo signal r(k) to the power of the error signal e(k) by subtracting an estimate of the echo signal y(k) from the microphone signal d(k). The acoustic coupling between the loudspeaker and the microphone in the room generates echoes. To cancel these echoes, we need to identify the impulse response due to this acoustic coupling. In these simulations, we use the acoustic impulse response of length H = 512 shown in figure 7-5. The same length is used for all the adaptive filters. The sampling rate is 8 kHz. The input signal x(k) is a white, uniformly distributed signal with unit variance. An independent impulsive noise signal n(k) is generated from the Gaussian mixture model and added to the system output y(k).
Figure 7-5. Room impulse response.

We test the robustness of LMS and MEE to the impulsive noise at different SNR levels. The change in SNR is achieved by changing the variance of the impulsive noise (σ_o²) from 10^-4 to 10^8. We use the echo return loss enhancement (ERLE) of the two algorithms as a measure of performance. The parameters were chosen so that both algorithms perform identically under Gaussian noise, as shown in figure 7-6. Thus, with these parameters fixed, we start off LMS and MEE with the same level of performance.
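We assume the usual definition of ERLE as the ratio of microphone-signal power to residual-error power in dB; a minimal smoothed estimate might be computed as follows (the window length is illustrative):

    import numpy as np

    def erle_db(d, e, win=256):
        """Smoothed ERLE: 10*log10( E[d^2] / E[e^2] ) over sliding windows."""
        kernel = np.ones(win) / win
        p_d = np.convolve(d**2, kernel, mode='valid')
        p_e = np.convolve(e**2, kernel, mode='valid') + 1e-12   # avoid log(0)
        return 10 * np.log10(p_d / p_e)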

Figure 7-6. ERLE of LMS and MEE at σ_o² = 10^-4 (SNR = 34.22 dB).

The parameters are

LMS: μ = 1,  (7-20)
MEE: μ = 0.005, L = 50, σ = 1.  (7-21)

Figure 7-7 shows the performance of MEE vs. LMS when ε = 0.05. Since LMS is not robust to impulsive noise, the ERLE performance of LMS decreases linearly with the SNR of the signal. The graph of MEE needs some explanation. Note that the kernel size selected for MEE is σ = 1, and MEE suppresses a signal whose magnitude is very large compared to σ = 1. So when the variance of the impulsive noise is between 10^-4 and 1, MEE considers the noise as ordinary Gaussian noise and shows the same performance as LMS. Since the proportion of this normal noise increases as SNR decreases, the ERLE of MEE drops as a function of decreasing SNR in the initial stages, just like LMS. On the other hand, when σ_o² ≥ 10, the noise signal has an ordinary Gaussian part and an impulsive part. It is at this stage that the M-estimator property of MEE kicks in and effectively removes the impulsive noise. Since MEE is able to completely remove the growing impulsive noise, the ERLE of MEE increases with decreasing SNR for σ_o² ≥ 10.

Figure 7-7. ERLE of LMS and MEE for ε = 0.05 (5%) outliers.

Figure 7-8 shows the performance of MEE and LMS when σ_o² = 10 and ε = 0.05. This corresponds to SNR = -2.52 dB. It is clear that MEE quickly adapts within 1000 iterations and gives an ERLE > 10 dB. To understand the impulsive noise rejection of MEE better, we plot the output signal. Even though impulsive noise is present, MEE tracks the signal very well, whereas LMS gives a biased solution as it follows the impulsive noise. The extreme case is depicted in figure 7-9, where σ_o² = 10^8 and SNR = -73.03 dB. In this scenario, MEE is able to completely reject the impulsive noise. Since the remaining noise is a small amount of ordinary Gaussian noise, MEE almost perfectly tracks the signal and gives an ERLE comparable to that obtained when σ_o² = 10^-4.

To test the robustness of MEE to impulsive noise, we increased ε. Figure 7-10 shows the plots obtained for ε = 0.1, 0.2 and 0.4. MEE still performs very well and gives a high ERLE for low SNR values. Also, the ERLE decreases with increasing ε, which is natural as the proportion of noise increases dramatically. What is impressive about MEE is that it keeps rejecting the impulses even at these high contamination levels.

Figure 7-8. ERLE and output signals of LMS and MEE at σ_o² = 10 (SNR = -2.52 dB): (b) output signals.
Figure 7-9. ERLE and output signals of LMS and MEE at σ_o² = 10^8 (SNR = -73.03 dB): (b) output signals; (c) output signals of MEE.
Adaptive beamforming has a long history [67], [68], and has found numerous applications in radar, sonar, seismology, radio astronomy, medical imaging, speech processing, and wireless communications.

With the assumption of additive Gaussian white noise (AWGN) in antenna systems, the most commonly used adaptive algorithms are the LMS and RLS algorithms [69], [70], [71]. These algorithms are based on second-order statistics. However, the least squares criterion is not an appropriate choice in impulsive noise scenarios. Recently, further research into signal modeling has led to the realization that many natural phenomena can be better represented by distributions of a more impulsive nature. One type of distribution that exhibits heavier tails than the Gaussian is the class of stable distributions introduced by Nikias and Shao [65]. Alpha-stable distributions have been used to model diverse phenomena such as random fluctuations of gravitational fields, economic market indexes [72], and radar clutter [73].

In a statistical learning sense, a better approach would be to constrain directly the information content of non-Gaussian signals rather than simply their energy, if the designer seeks to achieve the best performance in terms of information filtering. In this regard, Renyi's entropy criterion has been utilized as an alternative to MSE in adaptive system modeling [21]. In this section, we apply the minimum Renyi's entropy concept to adaptive beamforming and show its robustness to impulsive noise.

7.5.1.1 System and Noise Model
Figure 7-10. ERLE of LMS and MEE for ε = 0.1 (10%), ε = 0.2 (20%), and ε = 0.4 (40%) outliers: (b) ε = 0.2 (20%); (c) ε = 0.4 (40%).
The snapshot of the array at time k is modeled as

x_k = a(θ) s_k + n_k,  (7-22)

where a(θ) ∈ C^(Mx1) is the steering vector of the array toward direction θ, and n_k is the Mx1 vector of additive white noise. Also, the beamformer output is given by

y_k = w^H x_k,  (7-24)

where w ∈ C^(Mx1) is a vector of weights and H denotes the conjugate transpose. The goal is to satisfy w^H a(θ) = 1 and minimize the effect of the noise (w^H n_k), in which case y_k recovers s_k.

Besides, we also assume that each element of n_k follows a symmetric α-stable (SαS) distribution described by the following characteristic function:

φ(t) = exp(jδt - γ|t|^α),  (7-25)

where α is the characteristic exponent restricted to the values 0 < α ≤ 2, δ (-∞ < δ < ∞) is the location parameter, and γ (γ > 0) is the dispersion of the distribution. The value of α is related to the degree of impulsiveness of the distribution. Smaller values of α correspond to heavier tailed distributions and hence to more impulsive behavior, while as α increases, the tails are lighter and the behavior is less impulsive. The special case of α = 2 corresponds to the Gaussian distribution (N(δ, 2γ)), while α = 1 corresponds to the Cauchy distribution.
The classical approach minimizes the output power subject to a distortionless response:

min_w E[|y_k|²] subject to w^H a(θ) = 1.  (7-26)

The constraint w^H a(θ) = 1 prevents the gain in the direction of the signal from being reduced. This is commonly referred to as Capon's method [69], [74]. Equation (7-26) has an analytical solution given by

w_capon = R_x^-1 a(θ) / (a^H(θ) R_x^-1 a(θ)),  (7-27)

where R_x denotes the covariance matrix of the array output vector. In practical applications, R_x is replaced by the sample covariance matrix

R̂_x = (1/N) Σ_{k=1}^{N} x_k x_k^H,  (7-28)

with N denoting the number of snapshots. Substituting w_capon into equation (7-24), the constrained least squares estimate of the look-direction output is

ŝ_k = w_capon^H x_k = a^H(θ) R_x^-1 x_k / (a^H(θ) R_x^-1 a(θ)).  (7-29)

The solution (7-27) requires knowledge of the second order statistics R_x. In practice, the second-order statistics are usually not known, but with the assumption of ergodicity they can be estimated from available data. Statistics may also change over time. To solve these problems, the weight is typically determined by adaptive algorithms [67]. The beamformer weight adaptation procedure proposed in [75] is based on a gradient-descent constrained LMS algorithm:

w_{k+1} = P [w_k - μ y_k* x_k] + F,  (7-30)

where P = I - a(θ)[a^H(θ)a(θ)]^-1 a^H(θ) and F = a(θ)[a^H(θ)a(θ)]^-1.
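A minimal sketch of the sample-covariance Capon solution (7-27)-(7-28) follows; the half-wavelength uniform linear array geometry is our assumption for illustration.

    import numpy as np

    def steering(theta, M):
        """Steering vector of an M-element half-wavelength ULA (assumed geometry)."""
        return np.exp(-1j * np.pi * np.arange(M) * np.sin(theta))

    def capon(X, theta):
        """X: (M, N) array snapshots. Returns w = R^{-1}a / (a^H R^{-1} a)."""
        M, N = X.shape
        R = X @ X.conj().T / N                     # sample covariance (7-28)
        a = steering(theta, M)
        Ra = np.linalg.solve(R, a)
        return Ra / (a.conj() @ Ra)

    # toy example: target at broadside in white Gaussian noise
    rng = np.random.default_rng(4)
    M, N = 8, 500
    s = np.sign(rng.standard_normal(N))            # BPSK symbols
    X = np.outer(steering(0.0, M), s) + 0.1 * (rng.standard_normal((M, N))
                                               + 1j * rng.standard_normal((M, N)))
    w = capon(X, 0.0)
    print(np.abs(w.conj() @ steering(0.0, M)))     # distortionless response: ~1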

For impulsive noise, a more robust choice is the least mean p-norm (LMP) family of algorithms [76]. When p = 1, the algorithm reduces to the well-known sign LMS algorithm.

The constrained minimum output entropy (MOE) beamformer instead maximizes the information potential of the output:

max_w V(y) subject to w^H a(θ) = 1.  (7-32)

By using the method of Lagrange multipliers, we get the update equation

w_{k+1} = P [w_k + μ ∇V(y)] + F,  (7-33)

where P = I - a(θ)[a^H(θ)a(θ)]^-1 a^H(θ), F = a(θ)[a^H(θ)a(θ)]^-1, and ∇V(y) denotes the gradient of the stochastic information potential (7-34).

A constrained minimum output entropy with self-adjusting step-size (MOE-SAS) algorithm, a variant of MOE, has also been introduced to adjust the array response to a desired signal.
In the simulations, the target of interest is located at angle θ = 0. The impulsive noise n_k is alpha-stable distributed, and the strength of the impulsiveness is controlled by changing the characteristic exponent α from 2 to 0.5.

In the first simulation, we investigate the effect of the kernel size (σ) on the robustness of the MOE and MOE-SAS algorithms. As we know, Parzen windowing has a bias that increases with larger kernel sizes, whereas its variance increases with smaller kernel sizes. Figure 7-11 shows the bit error rate (BER) performance of the MOE and MOE-SAS algorithms with four different values of the kernel size (0.01, 0.1, 1 and 10). We have set the window length (L) to 50 and the step size (μ) to 0.005. The kernel size plays an important role in discriminating signals from impulsive noise. In these runs, we observe that σ = 1 is a proper size.

Next, we test the robustness of the LMS, LMP, MOE and MOE-SAS algorithms to the impulsive noise. In order to make the result independent of the input and noise, we perform Monte Carlo simulations with 100 different inputs and noises.

In a systematic manner, we select the step size such that all algorithms have the same performance for alpha-stable noise with α = 2 (Gaussian noise, SNR = 15 dB), as shown in figure 7-12. We have set the window length (L) to 50 and the kernel size (σ) to 1 in the MOE and MOE-SAS algorithms. Thus, with these parameters fixed, we start off the LMS, LMP and MOE algorithms with the same level of performance.

Figure 7-13 shows the bit error rate (BER) performance of all algorithms at different α levels. Although all algorithms start at the same level under AWGN, the MOE and MOE-SAS algorithms display superior performance for decreasing α, that is, for increasing strength of impulsiveness. Figure 7-14 shows the beam patterns of all algorithms at α = 1.5 and α = 1.0.
Figure 7-11. BER performance of MOE and MOE-SAS with four different kernel sizes: (b) MOE-SAS.
Figure 7-12. Comparison of the beam patterns at α = 2.0 (SNR = 15 dB).

Figure 7-13. Comparison of BER performance at different characteristic exponent levels.
Figure 7-14. Comparison of the beam patterns at α = 1.5 and α = 1.0: (b) α = 1.0.
Channel estimation is a fundamental problem in wireless communications [77], [78]. The reason is that the transmission path is time varying and introduces intersymbol interference (ISI), which imposes severe limitations on the capacity of the communication channel. Channel tracking and equalization (based on channel estimation) are thus paramount to overcome the ill effects of ISI. In mobile communication environments, the fast time-varying characteristics of the channel further increase the stress on the channel tracking method, which must cope with the fast changes in a data-efficient manner.

There are two main strategies for channel estimation: blind identification and pilot-assisted transmission (PAT). The concept of blind channel identification, pioneered by the works of Godard [79] and Sato [80], tries to solve the problem of channel estimation using only prior knowledge about the problem. Many methods have been proposed throughout the literature based, for example, on assumptions of cyclostationarity [81], [82], subspace methods [83], maximum-likelihood via coupling with the Viterbi decoding algorithm [84], and more recently Bayesian tracking methods such as Kalman filtering [85] and particle filtering [86], just to mention a few. Although quite attractive, since no additional information needs to be transmitted, blind methods suffer from two drawbacks: slow convergence and many local extrema [81], [83].

Pilot-assisted transmission (PAT) (or pilot symbol assisted modulation, as coined by Cavers [87]) is a completely different strategy [88]. PAT periodically multiplexes a sequence of symbols known a priori by the receiver into the information sent by the transmitter. The receiver can then utilize these symbol sequences as training data for which the desired response is known. Thus, the key advantages of PAT are that it simplifies the channel estimation problem and allows for simpler and faster adaptive learning methods (see [88] and references within).
In this chapter we assume a PAT-based strategy so that supervised adaptive learning methods can be used. As stated earlier, this is one of the key strengths of PAT. Having set the base strategy, the typical choice for the channel estimation criterion is the mean-square error (MSE), due to the analytical and computational ease in evaluating the criterion and the derived adaptive algorithms [26], [2]. Due to the inherent nonstationarity of the wireless channel in mobile communications, adaptive algorithms which iteratively approximate the optimal solution at each time have been proposed [2], [29], the most commonly used being the least mean squares (LMS) algorithm and the recursive least squares (RLS) algorithm [77]. More specifically, to cope with the fast changing channel, RLS is further equipped with exponential weighting in the cost function (EW-RLS). Additionally, RLS was recently reformulated as a special case of the Kalman filter, which contributes to the understanding of its behavior and limitations in fast varying environments [89]. Moreover, this reformulation enabled a modified algorithm using Kalman variables to formulate the so-called extended RLS algorithm (EX-RLS) [51].

Despite the great effort in addressing the nonstationarity in the channel estimation problem, MSE has remained the criterion of choice. However, MSE accounts only for second order statistics and thus is not an appropriate choice for non-Gaussian distributed data, such as the impulsive noise scenarios often encountered in realistic wireless communications environments [90], [91]. A more descriptive criterion is the entropy of the error signal [14]. Hence, by minimizing the error entropy rather than simply its energy, we are guaranteed to achieve a better mapping performance.
Impulsive measurement noise is common in realistic wireless environments [90], [91]. The fundamental Middleton Class A noise model has been proposed for the impulsive noise commonly generated in an indoor/urban wireless environment [90], [92]. Here, an approximation of this noise model is utilized.

The channel is modeled as a time-varying multipath fading channel [93]. Assuming the multi-path channel fades at the same rate, the channel impulse response at time k can be approximated by the first-order auto-regressive (AR) model

h_k = β h_{k-1} + v_k,  (8-1)

where β = J0(2π f_D T_s), J0(·) is the zeroth-order Bessel function, T_s is the sampling period, f_D is the Doppler frequency, and v_k ∈ N_C(0, σ_v² I), with zero mean and variance matrix σ_v² I, is the complex driving noise of the model. Then, the covariance matrix of v_k is V = (1 - β²)I, where I denotes the identity matrix.
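A minimal simulation of the AR(1) fading model (8-1) follows, with β obtained from the Bessel function as in the text; the per-component noise scaling is our choice so that the total driving-noise variance per tap is (1 - β²).

    import numpy as np
    from scipy.special import j0

    def fading_channel(n_steps, taps=5, fd_ts=0.001, rng=None):
        """AR(1) Rayleigh-fading taps: h_k = beta*h_{k-1} + v_k, var(v) = 1-beta^2."""
        rng = rng or np.random.default_rng()
        beta = j0(2 * np.pi * fd_ts)                # ~0.99999 for fD*Ts = 0.001
        sig_v = np.sqrt((1 - beta**2) / 2)          # per complex component
        h = np.zeros((n_steps, taps), dtype=complex)
        h[0] = (rng.standard_normal(taps) + 1j * rng.standard_normal(taps)) / np.sqrt(2)
        for k in range(1, n_steps):
            v = sig_v * (rng.standard_normal(taps) + 1j * rng.standard_normal(taps))
            h[k] = beta * h[k - 1] + v
        return h

    h = fading_channel(20000)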

The received signal is modeled as

d_k = h_k^H u_k + n_k,  (8-2)

where n_k is additive white noise. The additive measurement noise n_k can be modeled either as a complex-Gaussian distribution (p_n = N_C(0, σ_n²)) or as a two-term Gaussian mixture model. The probability density function (PDF) of the latter noise model has the form

p_n = (1 - ε) N_C(0, ς²) + ε N_C(0, κς²),  (8-3)

where 0 ≤ ε ≤ 1 and κ ≥ 1. The first term, (1 - ε)N_C(0, ς²), represents the normal background noise occurring with probability 1 - ε, whereas εN_C(0, κς²) denotes the presence of an impulsive component occurring with probability ε. It is usually of interest to study the effects of variation in the shape of the distribution on the performance of the system by varying the parameters ε and κ with fixed total noise variance

σ_n² = (1 - ε)ς² + εκς².  (8-4)

Furthermore, we denote the estimated channel vector as w_k. Then, an estimator of the received signal is given by

d̂_k = w_k^H u_k.  (8-5)

The estimation error between the received signal and its estimator is e_k = d_k - d̂_k. An example of a Doppler faded channel realization with fading rate f_D T_s = 0.001 is shown in figure 8-1; then β ≈ 0.99999 and σ_v² = 1.9739x10^-5. If we assume the cellular system is using a 2.1 GHz carrier frequency and transmitting under a GSM/EDGE symbol rate [94], then this rate of fading would be expected for vehicular users operating on a freeway.
Figure 8-1. A Doppler faded channel realization with the fading rate f_D T_s = 0.001.

The FIR adaptive filter was selected with equal order (M = 5), allowing, ideally, for perfect identification of the channel. The input signal to both the wireless channel and the adaptive filter is BPSK. The error is e_k = d_k - w_k^H u_k, and the aim is to minimize the difference between the received signal and the adaptive system output with respect to the channel estimate w_k. In order to make the result independent of the input and weight initializations, Monte Carlo simulations were performed with 100 different inputs and 100 different weight initializations for each input. We use the weight SNR of the algorithms as the measure of performance.
Figure 8-2. Weight SNR in Gaussian measurement noise (p_n = N_C(0, 10^-6)).

We illustrate the results in two phases. In the first phase, the parameters of the MSE based algorithms (LMS, EW-RLS and EX-RLS) and of the entropy based algorithms (MEE, MEE-SAS, NMEE and MEE-FP) were tuned such that they have the same second order characteristics, to provide a common ground for the comparison. To achieve this, we used zero mean white Gaussian additive noise with variance 10^-6 as the measurement noise. The parameters (step size, forgetting factor and window length) were selected for each algorithm such that they would perform similarly, that is, with the same weight SNR (around 20 dB), as shown in figure 8-2. In the case of the entropy based algorithms, the kernel size (σ) was set to 1 and the window length (L) to 50. Also, in the case of the recursive algorithms (EW-RLS, EX-RLS and MEE-FP), the forgetting factor was set to 0.9.

In the second phase, both classes of algorithms were used under a non-Gaussian noise environment. The robustness of all algorithms was tested for different SNR levels.
Figure 8-3. Weight SNR at different SNRs (from 17 dB to -33 dB).

The different SNRs are achieved by changing the variance ς² of the noise model (8-3) from 10^-6 to 10^-1 and fixing ε to 0.1 and κ to 10^6. This noise is non-Gaussian with heavy tails, an example being impulsive noise.

Figure 8-3 shows the performance of the MSE based algorithms and the entropy based algorithms. The weight SNR performance of the MSE based algorithms (LMS, EW-RLS and EX-RLS) decreases almost linearly with respect to SNR, although EX-RLS performs better than LMS and EW-RLS due to its improved tracking ability in this nonstationary environment. The performance of the entropy based algorithms (MEE, MEE-SAS, NMEE and MEE-FP), however, changed little, remaining close to a weight SNR of 20 dB.

Figure 8-4 shows the evolution of the weight SNR over time for all algorithms when ς² = 10^-5 and ς² = 10^-1. These plots correspond to SNR levels of 6.91 dB and -32.90 dB, respectively. In the first case, it can be observed that the entropy based algorithms converge to a weight SNR of 20 dB, compared to around 14 dB for the MSE based algorithms. The second case, when ς² = 10^-1, depicts an extreme situation. In this case, it is remarkable to verify that the performance of the entropy based algorithms is barely affected.
Figure 8-4. Weight SNR at ς² = 10^-5 and ς² = 10^-1.
The nonlinear system to be identified is the benchmark plant of [95], driven by the input u(n) = sin(2πn/10) + sin(2πn/25), with state and output equations

x1(n+1) = (x1(n)/(1 + x1²(n)) + 1) sin(x2(n)),
x2(n+1) = x2(n) cos(x2(n)) + x1(n) exp(-(x1²(n) + x2²(n))/8) + u³(n)/(1 + u²(n) + 0.5 cos(x1(n) + x2(n))),
y(n) = x1(n)/(1 + 0.5 sin(x2(n))) + x2(n)/(1 + 0.5 sin(x1(n))).

This system has previously been identified with time-delay neural networks (TDNNs) trained under the MSE criterion [96]. However, in this research the adaptation criterion is picked to be the minimization of Renyi's quadratic entropy of the error, with MEE, MEE-SAS and NMEE used to train the TDNN; for NMEE the update is normalized by the term

(∇V(e))^T [Σ_{i=1}^{N} Σ_{j=1}^{N} {x_i - x_j}],

where the gradient vector of the information potential is

∇V(e) = (1/(2σ²N²)) Σ_{i=1}^{N} Σ_{j=1}^{N} κ_√2σ(e_i - e_j) [e_i - e_j] (∂y_j/∂w - ∂y_i/∂w),

and y_i is the output and x_i is the input at the linear output processing elements (PEs) of the MLPs. The term ∂y_i/∂w can be computed as in the standard backpropagation algorithm [97].

We test the performance of the entropy based algorithms with two different kernel sizes (σ = 1 and 2), and compare these algorithms with the MSE based algorithm.

In the first case of σ = 1, we plot the normalized information potential (NIP) in figure 8-5(a), which has a maximum value of 1. This helps us to compare the performance between different experiments. As seen here, MEE-SAS and NMEE converge in fewer epochs than the other algorithms.
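For data generation, a minimal simulation of this benchmark plant follows; the two-tone input u(n) is the assumption made in the reconstruction above, and the code is illustrative rather than the experiment script.

    import numpy as np

    def benchmark_plant(n_samples):
        """Nonlinear benchmark of [95], with the input assumed as reconstructed above."""
        n = np.arange(n_samples)
        u = np.sin(2 * np.pi * n / 10) + np.sin(2 * np.pi * n / 25)
        x1, x2 = 0.0, 0.0
        y = np.zeros(n_samples)
        for k in range(n_samples):
            y[k] = x1 / (1 + 0.5 * np.sin(x2)) + x2 / (1 + 0.5 * np.sin(x1))
            x1_next = (x1 / (1 + x1**2) + 1) * np.sin(x2)
            x2_next = (x2 * np.cos(x2) + x1 * np.exp(-(x1**2 + x2**2) / 8)
                       + u[k]**3 / (1 + u[k]**2 + 0.5 * np.cos(x1 + x2)))
            x1, x2 = x1_next, x2_next
        return u, y

    u, y = benchmark_plant(1000)   # training input/desired pairs for the TDNN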

Figures 8-5(b) and 8-5(c) show the test outputs and the error probability densities within the last epoch at σ = 1, respectively. As can be seen from the figures, the entropy based algorithms in the TDNN outperform the MSE based algorithm in the TDNN, which is expected, since MSE is not sufficient for nonlinear dynamic system identification.

In figure 8-6, we show the performance of nonlinear identification with σ = 2. As expected, when the kernel size is large, the curve of the information potential behaves as a quadratic curve similar to the MSE. So MEE and MEE-SAS have the same performance as MSE, while NMEE performs better than the other algorithms, as shown in figure 8-6(c).
Figure 8-5. Identification performance for σ = 1: (b) test outputs; (c) probability density of the error for the last epoch.
Figure 8-6. Identification performance for σ = 2: (b) test outputs; (c) probability density of the error for the last epoch.
First, we proposed an information-theoretic supervised learning criterion for adaptive systems, namely, the minimum error entropy with self adjusting step-size (MEE-SAS). We demonstrated that MEE-SAS extends MEE by using an automatic adaptive step size to accelerate the search for the optimal solution. In the structural analysis part, we analytically found the turning point of curvature for MEE-SAS. It was observed that this contour of points in the cost function curvature depends on the SNR of the signal (provided that we have critical model order). Further, MEE-SAS is expected to perform better than MEE in the case where there is non-zero error (due to modeling error or measurement noise), since the turning point of curvature is then closer to the optimum, leading to faster convergence. For the case where zero error is achievable, the turning point of curvature is farther away from the solution, and so MEE-SAS slows down on the path to the optimal solution. The slow tracking of MEE-SAS beyond the turning point of the curvature, due to the small effective step size, hinders its performance in non-stationary environments. We solved this problem using a switching scheme between MEE and MEE-SAS. Starting with MEE-SAS for faster convergence when far from the solution, the algorithm switches to MEE near the optimal solution for improved tracking ability. Simulation results in a nonstationary scenario showed that the proposed switching algorithm outperforms both the MEE and MEE-SAS algorithms used independently and quickly adapts to the changing environment. However, the selection of the step sizes for MEE and MEE-SAS has to be determined from the data for good performance.
Third, we presented a new error entropy based algorithm, the fixed-point minimum error entropy (MEE-FP), using the first order optimality condition of the error entropy together with a fixed-point iteration. This update rule is analogous to the RLS update rule that tracks the Wiener solution with every update. Moreover, we proved that MEE-FP is locally convergent around the optimal solution, and experimentally verified that for a large range of initial conditions the algorithm is always stable. Furthermore, we derived recursive estimators allowing for on-line estimation and tracking of time-varying solutions in a computationally simple manner. These results extend the range of possible applications for MEE-FP. Also, we showed that MEE-FP converges faster than RLS in a system identification simulation, for reasons that require further investigation.

Information Theoretic Learning, and in particular the minimum error entropy criterion, has recently been proposed as a more principled approach for training adaptive systems.
Fifth,weinvestigatedtherobustnessoftheMEEalgorithm.WeshowedthatMEEperformsbetterthanLMSalgorithminsystemidenticationwithimpulsivenoise.Also,weshowedtheoreticallythatMEEisveryrobusttoimpulsivenoiseduetoitsM-estimatorpropertyderivedfromthefactthatMEEconstrainstheerrorentropy.Althoughthereareotherrobustadaptivealgorithms,withMEEthispropertycomesnaturallyandwiththeselectionofjustoneparameter,thekernelsize. Finally,weappliedtheseEntropybasedalgorithmstowirelesschanneltrackingandnonlinearsystemidentication.WeshowedthattheMEEisagoodalternativeinthechanneltrackingincasesthatthenoiseiswellmodelledbytheMiddletonnoisemodel(wirelessinurbanenvironments).WithMEE-FPbothgoodtrackingandrobustnesstonoiseareachievable.Althoughthereareotheralgorithmsthatcantrackthechannelbetter,theyarenotinsensitivetoimpulsivenoise. WeremarkthattheperformanceofEntropybasedalgorithmsdefaultstothatofMSEbasedalgorithmsfortheGaussiannoiseandlinearlter,sinceforthiscaseEntropybasedalgorithmscannotimproveuponthatofMSEbasedalgorithmsbecausehigher-orderstatistics(abovesecond-order)conveynoadditionalinformationaboutthemappingbeyondthedescriptionalreadyprovidedbyMSE.Ontheotherhand,whenthenoiseisnon-Gaussian(hereimpulsive)andthesystemisnonlinear,higher-orderstatisticswhicharenotaccountedforbyMSEhavetobeutilizedforbetterperformance.Theprimeadvantageofanerrorentropybasedcriterionhowever,suchastheinformationpotential 144


Summary of the proposed algorithms:

MEE-SAS
    Pros: faster convergence than MEE.
    Cons: loss of sensitivity for tracking small changes.
    Solution: switching scheme between MEE and MEE-SAS.

NMEE
    Pros: insensitivity to input power and kernel size; faster convergence than MEE.
    Cons: unstable behavior when the normalization term approaches zero.
    Solution: add a positive constant to the normalization term.

MEE-FP
    Pros: step-size free; faster convergence than MEE and RLS.
    Cons: higher computational complexity than MEE and RLS.

Fast MEE
    Pros: reduced computational complexity.

Entropy-based algorithms
    Pros: more robust to outliers than MSE-based algorithms; applicable to nonlinear systems.
    Cons: higher computational complexity than MSE-based algorithms.

Our research also showed that the dynamics of learning for the MEE cost function with gradient descent yield convergence speeds, for the same misadjustment, very similar to those of the MSE cost function with the corresponding gradient descent algorithms. Therefore, the compromise between these two trade-offs remains basically the same for both cost functions, and there is no point in using MEE costs for Gaussian noise and linear optimization. Indeed, the MEE family of algorithms is much more computationally demanding than the MSE algorithms and requires the selection of an extra parameter, the kernel size, for the estimation of the information potential.
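To make the computational comparison concrete, the sketch below contrasts the direct O(N^2) double-sum estimator of the quadratic information potential with an O(pN) evaluation based on a truncated Hermite expansion about a single center, which is the core idea of the fast Gauss transform; the real FGT additionally partitions the samples into clusters so that each truncated expansion remains accurate, a step omitted here for brevity. The function names and the demonstration parameters are illustrative.

```python
import numpy as np
from math import factorial

def ip_direct(e, sigma):
    # Direct O(N^2) estimator: average of the kernel over all error pairs.
    de = e[:, None] - e[None, :]
    return np.exp(-de**2 / (2 * sigma**2)).mean() / (np.sqrt(2 * np.pi) * sigma)

def ip_hermite(e, sigma, p=12):
    # O(pN) approximation: expand exp(-((e_j - e_i)/delta)^2) in Hermite
    # functions about a single center, accumulate p coefficients over the
    # sources once, then evaluate the truncated series at every target.
    N = len(e)
    delta = np.sqrt(2.0) * sigma        # exp(-u^2/(2 sigma^2)) = exp(-(u/delta)^2)
    s = (e - e.mean()) / delta          # scaled offsets from the center
    C = np.array([(s**n).sum() / factorial(n) for n in range(p)])  # O(pN)
    h = np.empty((p, N))                # Hermite functions h_n(s) by recurrence
    h[0] = np.exp(-s**2)
    if p > 1:
        h[1] = 2 * s * h[0]
    for n in range(2, p):
        h[n] = 2 * s * h[n - 1] - 2 * (n - 1) * h[n - 2]
    G = C @ h                           # approximate Gauss transform, O(pN)
    return G.sum() / (N**2 * np.sqrt(2 * np.pi) * sigma)

e = 0.3 * np.random.default_rng(1).standard_normal(2000)
print(ip_direct(e, 1.0), ip_hermite(e, 1.0))   # the two values agree closely
```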


Second, we need to implement MEE-FP with a TDNN for nonlinear signal processing. In many adaptive problems, we expect MEE-FP with a TDNN to perform better than RLS with a TDNN.

Third, the selection of the optimal kernel size should be studied. In this research, we started from Silverman's rule of thumb; the kernel size used in the simulations is the value that gave the best performance for all the algorithms.
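For reference, a minimal sketch of Silverman's rule of thumb in the form commonly used for a Gaussian kernel on one-dimensional error samples; the robust spread estimate based on the interquartile range is one standard variant, and the exact form used in our simulations may differ.

```python
import numpy as np

def silverman_kernel_size(e):
    # Silverman's rule of thumb for a Gaussian kernel on 1-D samples:
    # sigma = 1.06 * min(std, IQR / 1.34) * N^(-1/5).
    e = np.asarray(e, dtype=float)
    iqr = np.percentile(e, 75) - np.percentile(e, 25)
    spread = min(e.std(ddof=1), iqr / 1.34)
    return 1.06 * spread * len(e) ** (-1 / 5)
```

Such a value serves only as a starting point; the criterion's sensitivity to the kernel size is precisely what motivates this future-work item.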



Seungju Han was born in Gok-Seong, South Korea. He received the B.S. degree in electrical and computer engineering from Sung Kyun Kwan University in 2001 and the M.S. degree from Seoul National University in 2003. Since 2003, he has been a research assistant in the Computational NeuroEngineering Laboratory (CNEL) at the University of Florida, working under Dr. Jose C. Principe on the Information Theoretic Learning (ITL) project. His research interests include nonlinear signal processing, adaptive signal processing, information theory, machine learning, and pattern recognition. He received his Ph.D. degree in 2007.