Citation
Sparse and Deep Learning-Based Nonlinear Control Design with Hypersonic Flight Applications

Material Information

Title:
Sparse and Deep Learning-Based Nonlinear Control Design with Hypersonic Flight Applications
Creator:
Nivison, Scott A
Place of Publication:
[Gainesville, Fla.]
Florida
Publisher:
University of Florida
Publication Date:
2017
Language:
English
Physical Description:
1 online resource (170 p.)

Thesis/Dissertation Information

Degree:
Doctorate (Ph.D.)
Degree Grantor:
University of Florida
Degree Disciplines:
Electrical and Computer Engineering
Committee Chair:
KHARGONEKAR,PRAMOD P
Committee Co-Chair:
WU,DAPENG
Committee Members:
DIXON,WARREN E
FITZ-COY,NORMAN G

Subjects

Subjects / Keywords:
adaptive -- hypersonic -- learning -- lyapunov
Electrical and Computer Engineering -- Dissertations, Academic -- UF
Genre:
bibliography ( marcgt )
theses ( marcgt )
government publication (state, provincial, territorial, dependent) ( marcgt )
born-digital ( sobekcm )
Electronic Thesis or Dissertation
Electrical and Computer Engineering thesis, Ph.D.

Notes

Abstract:
The task of hypersonic vehicle (HSV) flight control is both intriguing and complicated. HSV control requires dealing with interactions between structural, aerodynamic, and propulsive effects that are typically ignored for conventional aircraft. Furthermore, due to the long distance and high-speed requirements of HSVs, the size of the flight envelope becomes quite expansive. This research focuses on the development of sparse and deep neural network-based control methods to solve HSV challenges. The first aspect of the research develops a novel switched adaptive control architecture called sparse neural network (SNN) in order to improve transient performance of flight vehicles with persistent and significant region-based uncertainties. The SNN is designed to operate with small to moderate learning rates in order to avoid high-frequency oscillations due to unmodeled dynamics in the control bandwidth. In addition, it utilizes only a small number of active neurons in the adaptive controller during operation in order to reduce the computational burden on the flight processor. We develop novel adaptive laws for the SNN and derive a dwell time condition to ensure safe switching. We demonstrate the effectiveness of the SNN approach by controlling a sophisticated HSV with flexible body effects and provide a detailed Lyapunov-based stability analysis of the controller. The second aspect of the research develops a training procedure for a robust deep recurrent neural network (RNN) with gated recurrent unit (GRU) modules. This procedure leverages ideas from robust nonlinear control to create a robust and high-performance controller that tracks time-varying trajectories. During optimization, the controller is trained to negate uncertainties in the system dynamics while establishing a set of stability margins. This leads to improved robustness compared to typical baseline controllers for flight systems.
We leverage a recently developed region of attraction (ROA) estimation scheme to verify the stability margins of the flight system. Inspired by the SNN adaptive control research, we develop the concept of a sparse deep learning controller (S-DLC) in order to improve parameter convergence and reduce the computational load on the processor. We demonstrate the effectiveness of each approach by controlling a hypersonic vehicle with flexible body effects. ( en )
General Note:
In the series University of Florida Digital Collections.
General Note:
Includes vita.
Bibliography:
Includes bibliographical references.
Source of Description:
Description based on online resource; title from PDF title page.
Source of Description:
This bibliographic record is available under the Creative Commons CC0 public domain dedication. The University of Florida Libraries, as creator of this bibliographic record, has waived all rights to it worldwide under copyright law, including all related and neighboring rights, to the extent allowed by law.
Thesis:
Thesis (Ph.D.)--University of Florida, 2017.
Local:
Adviser: KHARGONEKAR,PRAMOD P.
Local:
Co-adviser: WU,DAPENG.
Statement of Responsibility:
by Scott A Nivison.

Record Information

Source Institution:
UFRGP
Rights Management:
Applicable rights reserved.
Classification:
LD1780 2017 ( lcc )

Full Text

PAGE 1

SPARSE AND DEEP LEARNING-BASED NONLINEAR CONTROL DESIGN WITH HYPERSONIC FLIGHT APPLICATIONS

By

SCOTT A. NIVISON

A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA

2017

PAGE 2

© 2017 Scott A. Nivison

PAGE 3

I dedicate this dissertation to my wonderful wife, Ivy, and two amazing daughters, Rori and Cassia.

PAGE 4

ACKNOWLEDGEMENTS

First and foremost, I would like to acknowledge my advisor Dr. Pramod Khargonekar. He has provided a tremendous amount of encouragement, support, and motivation during my graduate studies. I feel fortunate to have had the opportunity to learn from him. He has provided a lasting impact that I will not forget.

I would like to thank my committee members, Dr. Dapeng Wu and Dr. Warren Dixon, for their inspiring graduate courses which have contributed greatly to my research. I would also like to thank Dr. Norman Fitz-Coy for his thought-provoking questions, during the oral qualifiers, which increased the quality of the dissertation. I would also like to express my gratitude to Dr. Eugene Lavretsky and Dr. David Jeffcoat for their time, effort, and discussions.

I would also like to acknowledge the support given to me by the Air Force Research Lab (AFRL) and my many co-workers. Specifically, I would like to mention Mr. John K. O'Neal and Mrs. Sharon Stockbridge for their helpfulness and continued support during my graduate program. Lastly, I would like to thank the Department of Defense for its financial support through the Science, Mathematics, and Research for Transformation (SMART) Scholarship program.

PAGE 5

TABLE OF CONTENTS

page

ACKNOWLEDGEMENTS  4
LIST OF TABLES  8
LIST OF FIGURES  9
LIST OF ABBREVIATIONS  12
ABSTRACT  13

CHAPTER

1 INTRODUCTION  15
  1.1 Motivation and Literature Review  15
    1.1.1 Neural Network-Based Model Reference Adaptive Control  17
    1.1.2 Policy Search and Deep Learning  21
      1.1.2.1 Reinforcement learning  21
      1.1.2.2 Robust deep learning controller analysis  24
  1.2 Overall Contribution  25
  1.3 Chapter Descriptions  25
    1.3.1 Chapter 4: Improving Learning of Model Reference Adaptive Controllers: A Sparse Neural Network Approach  26
    1.3.2 Chapter 5: A Sparse Neural Network Approach to Model Reference Adaptive Control with Hypersonic Flight Applications  26
    1.3.3 Chapter 6: Development of a Robust Deep Recurrent Network Controller for Flight Applications  27
    1.3.4 Chapter 7: Development of a Deep and Sparse Recurrent Neural Network Hypersonic Flight Controller with Stability Margin Analysis  27

2 BACKGROUND: DEEP NEURAL NETWORKS  29
  2.1 Deep Multi-layer Neural Networks  32
  2.2 Deep Recurrent Neural Networks  34
    2.2.1 Memory Modules for Recurrent Neural Networks  35
      2.2.1.1 Long short-term memory (LSTM)  35
      2.2.1.2 Gated recurrent unit (GRU)  36
    2.2.2 Deep Recurrent Network Architectures  36
  2.3 Optimization  37
    2.3.1 Gradient Descent  37
    2.3.2 Stochastic Gradient Descent  37
    2.3.3 Broyden-Fletcher-Goldfarb-Shanno (BFGS)  39

PAGE 6

3 BACKGROUND: MODEL REFERENCE ADAPTIVE CONTROL  41
  3.1 Baseline Control of a Flight Vehicle  43
    3.1.1 Linearized Flight Dynamics Model  43
    3.1.2 Baseline Controller  44
    3.1.3 Iterative Design Loop  46
  3.2 Nonlinear and Adaptive Control of a Flight Vehicle  47
    3.2.1 Single-Input Single-Output (SISO) Adaptive Control  47
    3.2.2 Direct Model Reference Adaptive Control (MRAC) with Uncertainties (MIMO)  50
    3.2.3 Robust Adaptive Control Tools  51
    3.2.4 Adaptive Augmentation-Based Controller  53
    3.2.5 Structure of the Adaptive Controller  55

4 IMPROVING LEARNING OF MODEL REFERENCE ADAPTIVE CONTROLLERS: A SPARSE NEURAL NETWORK APPROACH  56
  4.1 Model Reference Adaptive Control Formulation  56
  4.2 Neural Network-Based Adaptive Control  60
    4.2.1 Radial Basis Function (RBF) Adaptive Control  61
    4.2.2 Single Hidden Layer (SHL) Adaptive Control  63
    4.2.3 Sparse Neural Network (SNN) Adaptive Control  64
  4.3 Nonlinear Flight Dynamics Based Simulation  70
  4.4 Results  72
    4.4.1 Single Hidden Layer (SHL)  72
    4.4.2 Radial Basis Function (RBF)  73
    4.4.3 Sparse Neural Network (SNN)  74
  4.5 Summary  78

5 A SPARSE NEURAL NETWORK APPROACH TO MODEL REFERENCE ADAPTIVE CONTROL WITH HYPERSONIC FLIGHT APPLICATIONS  80
  5.1 Augmented Model Reference Adaptive Control Formulation  80
  5.2 Sparse Neural Network Architecture  84
    5.2.1 Sparse Neural Network Control Concept  84
    5.2.2 Sparse Neural Network Algorithm  87
  5.3 Adaptive Control Formulation  88
    5.3.1 Neural Network Adaptive Control Law  91
    5.3.2 Robust Adaptive Control  94
  5.4 Stability Analysis  95
    5.4.1 Robust Control for Safe Switching  96
    5.4.2 Sparse Neural Network Control  104
  5.5 Hypersonic Flight Vehicle Dynamics with Flexible Body Effects  108
  5.6 Adaptive Control Results  112
    5.6.1 Single Hidden Layer (SHL) Neural Network Adaptive Control  114
    5.6.2 Sparse Neural Network (SNN) Adaptive Control  115

PAGE 7

  5.7 Summary  118

6 DEVELOPMENT OF A ROBUST DEEP RECURRENT NEURAL NETWORK CONTROLLER FOR FLIGHT APPLICATIONS  120
  6.1 Deep Learning-Based Flight Control Design  120
  6.2 Flight Vehicle Model  121
  6.3 Deep Recurrent Neural Network Controller  124
    6.3.1 Controller Optimization Procedure  126
    6.3.2 Incremental Training Procedure  129
  6.4 Flight Control Simulation Results  129
    6.4.1 RNN/GRU Controller Optimization  129
    6.4.2 Analysis and Results  131
  6.5 Summary  133

7 DEVELOPMENT OF A DEEP AND SPARSE RECURRENT NEURAL NETWORK HYPERSONIC FLIGHT CONTROLLER WITH STABILITY MARGIN ANALYSIS  135
  7.1 Deep Learning Controller  135
    7.1.1 Controller Architecture  136
    7.1.2 Extension to Sparse Neural Network  137
    7.1.3 Optimization Procedure  139
    7.1.4 Systematic Procedure for Weight Convergence  142
  7.2 Hypersonic Flight Vehicle Model  142
  7.3 Results  146
    7.3.1 Deep Learning Optimization  146
    7.3.2 Hypersonic Flight Control Simulation  148
    7.3.3 Region of Attraction Estimation via Forward Reachable Sets  149
  7.4 Summary  151

8 CONCLUSIONS  153

APPENDIX

A DEVELOPMENT OF BOUNDS FOR THE NEURAL NETWORK ADAPTIVE ERROR (CH 4)  155
B PROJECTION OPERATOR DEFINITIONS (CH 3/4)  157
C DEVELOPMENT OF UPPER BOUNDS BASED ON SPARSE NEURAL NETWORK ADAPTIVE UPDATE LAWS (CH 4)  159
D ALTERNATIVE DWELL-TIME CONDITION APPROACH (CH 4)  161

LIST OF REFERENCES  163
BIOGRAPHICAL SKETCH  170

PAGE 8

LIST OF TABLES

Table  page

4-1 Flight condition to analyze  70
4-2 Tracking error comparison table of RBF, SHL, and SNN  78
4-3 Estimation comparison table of RBF, SHL, and SNN  78
5-1 Range of flexible mode frequencies and temperature of HSV  113
5-2 Tracking performance comparison of SHL versus SNN  118
6-1 Range of initial conditions and uncertainty variables  130
6-2 Cumulative error (CTE), control rate (CCR), and final cost  130
6-3 Range of initial conditions and uncertainty variables  131
7-1 Range of initial conditions and uncertainty variables  146
7-2 Average tracking error (ATE), average control rate (ACR), and final cost  148

PAGE 9

LIST OF FIGURES

Figure  page

2-1 Example of a multi-layer feed-forward neural network  32
2-2 Example of a single hidden layer recurrent neural network  33
2-3 Example of an expanded recurrent neural network  34
2-4 Simplified stacked recurrent neural network (S-RNN) architecture  39
3-1 Standard LQR PI baseline controller architecture  45
4-1 Model reference adaptive control augmentation control architecture  58
4-2 Example single hidden layer (SHL) neural network  60
4-3 Example radial basis function (RBF) network distributed across angle of attack  63
4-4 Adaptive sparse neural network (SNN) segmented flight envelope in one dimension  66
4-5 Adaptive sparse neural network (SNN) segmented flight envelope in two and three dimensions  67
4-6 Single hidden layer with typical connectivity, blended connectivity, and sparse connectivity  69
4-7 Baseline control transient performance when subjected to radial basis function matched uncertainty  71
4-8 Single hidden layer transient analysis by varying the number of total nodes and the learning rate  72
4-9 Radial basis function (RBF) versus single hidden layer (SHL) analysis plots  73
4-10 Radial basis function (RBF) transient analysis by varying the number of nodes and the learning rate  74
4-11 Sparse neural network matched uncertainty comparison by varying the learning rate and the number of shared nodes  75
4-12 Sparse neural network transient analysis by sinusoidal commands with error comparison results  76
4-13 Learning rate comparison using sinusoidal commands with transient results  77
5-1 Sparse neural network segmented flight envelope for hypersonic control in two and three dimensions  85

PAGE 10

5-2 Delaunay diagrams for sparse neural network hypersonic control in two and three dimensions  86
5-3 Neural network controller for hypersonic control with full connectivity and sparse connectivity  87
5-4 Visual Lyapunov function used for stability analysis  107
5-5 Hypersonic baseline controller under significant RBF-based matched uncertainty with resulting tracking performance  114
5-6 Single hidden layer (SHL) hypersonic tracking performance and error tracking  115
5-7 Hypersonic two-dimensional flight envelope partition in bird's eye form and zoomed in  116
5-8 Hypersonic sparse neural network transient performance including tracking performance and error tracking  117
5-9 Hypersonic sparse neural network (SNN) matched uncertainty estimation  118
6-1 Deep learning control block diagram for RNN/GRU  121
6-2 Example Coefficient Polynomial Fit  124
6-3 Two layer stacked recurrent neural network (S-RNN)  125
6-4 Phase portrait analysis for GS and RNN/GRU with uncertainty values u = 0.25, u = 0, = 0, and q = 0  133
6-5 Phase portrait analysis for GS and RNN/GRU with uncertainty values u = 0.75, u = 0, = 0.025, and q = 5  134
6-6 Phase portrait analysis for GS and RNN/GRU with uncertainty values u = 0.5, u = 0, = 0.05, and q = 2.5  134
6-7 Traditional step responses for GS and RNN/GRU with uncertainty values u = 0.5, u = 0, = 0.05, and q = 2.5  134
7-1 Deep learning control closed-loop block diagram  135
7-2 Stacked deep learning controller (DLC) architecture with two hidden layers  137
7-3 Two dimensional SNN segmented flight space for deep learning control  139
7-4 Traditional step responses for DLC and GS with uncertainty values u = 0.6, u = −5, and d = 3  151
7-5 Traditional step responses for DLC and GS with uncertainty values u = 1.5, u = 3, and d = 1  152

PAGE 11

7-6 Phase portrait plots for DLC and GS with uncertainty values u = 0.5, u = −0.5, and d = 0  152
7-7 Region of attraction estimate via forward reachable sets for u = [0.5, 2.0]  152

PAGE 12

LIST OF ABBREVIATIONS

DLC  Deep Learning Controller
GRU  Gated Recurrent Unit
HSV  Hypersonic Vehicle
LQR  Linear Quadratic Regulator
NN  Neural Network
PE  Persistence of Excitation
ROA  Region of Attraction
RBF  Radial Basis Function
RNN  Recurrent Neural Network
SHL  Single Hidden Layer
S-DLC  Sparse and Deep Learning Controller

PAGE 13

Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

SPARSE AND DEEP LEARNING-BASED NONLINEAR CONTROL DESIGN WITH HYPERSONIC FLIGHT APPLICATIONS

By

Scott A. Nivison

December 2017

Chair: Pramod Khargonekar
Major: Electrical and Computer Engineering

The task of hypersonic vehicle (HSV) flight control is both intriguing and complicated. HSV control requires dealing with interactions between structural, aerodynamic, and propulsive effects that are typically ignored for conventional aircraft. Furthermore, due to the long distance and high-speed requirements of HSVs, the size of the flight envelope becomes quite expansive. This research focuses on the development of sparse and deep neural network-based control methods to solve HSV challenges.

The first aspect of the research develops a novel switched adaptive control architecture called sparse neural network (SNN) in order to improve transient performance of flight vehicles with persistent and significant region-based uncertainties. The SNN is designed to operate with small to moderate learning rates in order to avoid high-frequency oscillations due to unmodeled dynamics in the control bandwidth. In addition, it utilizes only a small number of active neurons in the adaptive controller during operation in order to reduce the computational burden on the flight processor. We develop novel adaptive laws for the SNN and derive a dwell time condition to ensure safe switching. We demonstrate the effectiveness of the SNN approach by controlling a sophisticated HSV with flexible body effects and provide a detailed Lyapunov-based stability analysis of the controller.

The second aspect of the research develops a training procedure for a robust deep recurrent neural network (RNN) with gated recurrent unit (GRU) modules. This

PAGE 14

procedure leverages ideas from robust nonlinear control to create a robust and high-performance controller that tracks time-varying trajectories. During optimization, the controller is trained to negate uncertainties in the system dynamics while establishing a set of stability margins. This leads to improved robustness compared to typical baseline controllers for flight systems. We leverage a recently developed region of attraction (ROA) estimation scheme to verify the stability margins of the flight system. Inspired by the SNN adaptive control research, we develop the concept of a sparse deep learning controller (S-DLC) in order to improve parameter convergence and reduce the computational load on the processor. We demonstrate the effectiveness of each approach by controlling a hypersonic vehicle with flexible body effects.

PAGE 15

CHAPTER 1
INTRODUCTION

1.1 Motivation and Literature Review

Hypersonic vehicle (HSV) research could provide a path for safe space exploration and space travel while improving the capability to launch satellites into low Earth orbit. Additionally, HSV research is applicable to numerous military capabilities including increased survivability of flight vehicles and the ability to respond quickly to long distance targets that pose a significant threat [1]. Recent success from hypersonic flight vehicles such as the X-51 relies on conventional aircraft such as the B-52 and a solid rocket to boost the vehicle to high altitude and velocities. Afterwards, the flight vehicle separates from the aircraft, discards the rocket, and uses its actively cooled air-breathing supersonic combustion ramjet (scramjet) engine to accelerate to hypersonic speeds [2, 3]. Other hypersonic flight vehicles, commonly called boost-glide vehicles, use re-entry into the Earth's atmosphere to gain enough speed to become hypersonic [4]. A less documented limitation of hypersonic control is the lack of processing power. For instance, it is estimated that in real time the hypersonic flight vehicle needs to process information an order of magnitude faster than a subsonic platform.

Regardless of the details of the operation, control of hypersonic flight vehicles is a challenging task due to extreme changes in aircraft dynamics during flight and the vastness of the encountered flight envelope [5]. Furthermore, these vehicles operate in environments with strong structural, propulsion, thermal, and control system interactions [6]. Moreover, the initial mass of the HSV can dramatically decrease during flight which significantly impacts the structural modes of the flight vehicle [7]. It is worth noting that flexible body dynamic modeling is also a significant area of interest for extremely large flexible subsonic flight vehicles such as X-HALE (high altitude long endurance). These vehicles possess low-frequency structural vibration modes which can cause large nonlinear body deformations [8]. In addition, these

PAGE 16

endurance flight vehicles operate at high angles of attack where nonlinear effects become prominent uncertainties in the system dynamics. In addition, operating at extremely high temperatures can deteriorate sensors and have a major impact on the pressure distribution on the HSV which affects vehicle stiffness. Unpredictable errors stemming from ablation or thermal expansion can drastically affect model parameters. For example, traveling at such high speeds causes the HSV to be subjected to extreme temperatures due to shock wave radiation and aerodynamic friction. Typically, an ablative material is added to absorb the heat and protect the vehicle surfaces from melting. The ablative material is designed to decompose while keeping the surfaces of the flight vehicle relatively cool. Unfortunately, this ablation process is difficult to model due to the complex structure of the ablative material and the coupling between the ablative material and the flight vehicle surface [9]. Another troublesome source of error could come from elastic effects such as thermal expansion caused by spatially varying temperature distribution which can cause structural deformations [10].

Inspired by the vast flight envelope and computation limitations encountered by hypersonic vehicles, we developed the sparse neural network (SNN) concept for adaptive control (see Chapters 4 and 5). The SNN uses a segmented flight envelope approach to select a small number of active nodes based on operating conditions. This encourages region-based learning while reducing the computational burden on the processor. Additionally, the SNN allows segments to share nodes with one another in order to smooth transition between segments and improve transient performance. Various advancements in the SNN architecture are developed throughout this document with main contributions stated in Section 1.3.

In order to take advantage of the sophisticated hypersonic models, we develop a training procedure for a deep recurrent neural network (see Chapters 6 and 7). Based on our previous work in adaptive control, we extend the SNN concept to a recurrent deep architecture. This results in a sparse deep learning controller (DLC)

PAGE 17

which utilizes a stacked deep recurrent neural network (RNN) architecture with gated recurrent units (GRU). The sparse nature of the controller drastically limits the number of computations required at each time step while still reaping the benefits of its deep architecture. Robustness metrics of the DLC are verified through simulation and region of attraction (ROA) estimation via forward reachable sets. We show the effectiveness of the sparse deep learning controller approach through simulation results.

The research included in this dissertation contributes to the field of HSV control by improving upon and using techniques from Model Reference Adaptive Control (MRAC), policy search, and deep learning. Sections 1.1.1 and 1.1.2 below provide brief overviews of MRAC, deep learning, and policy search. More detail on these topics can be found in the background chapters (Chapter 2 and Chapter 3).

1.1.1 Neural Network-Based Model Reference Adaptive Control

Adaptive control has revolutionized nonlinear control and has brought tremendous improvements to the field in terms of performance and the amount of uncertainty and disturbances that the system can handle. Unfortunately, there are some drawbacks. For instance, typical adaptive nonlinear controllers are not designed seeking specified transient performance requirements (e.g. rise-time, settling-time, etc.) and do not seek to minimize energy expenditure optimally. In addition, it is challenging to design adaptive controllers to perform well when facing significant non-parametric uncertainties, and adaptive systems frequently oscillate and possess slower convergence rates when encountering sizable parametric uncertainties.

In order to address the large parametric uncertainty drawback, there have been two key approaches. Gibson [11] suggested adding a Luenberger-like term to the reference model. Narendra uses the multiple model approach. This approach creates multiple identification models that switch between one another depending on which model is working the best, which is based on some predefined performance criteria. More recently, Narendra [12] uses the previously mentioned indirect adaptive control approach

PAGE 18

utilizing multiple models where the output of all the models is combined to create a more accurate estimate of the parameter vector which results in better tracking performance. In the coming chapters (see Chapters 4 and 5), we present a sparse adaptive control methodology which seeks to reduce oscillations by operating with small learning rates and encouraging region-based learning. Also, Chapters 6 and 7 focus on improving transient performance and minimizing control rates of vehicles with sophisticated dynamics by using a deep recurrent architecture.

As mentioned previously, a main focus of this research is to improve direct model reference adaptive control (MRAC) methods through the use of neural networks. Traditionally, there have been two dominant approaches to neural network-based adaptive flight control: structured neural networks and unstructured neural networks. A structured (linear-in-the-parameters) neural network approach provides adaptive controllers with a universal basis by fixing the inner layer weights and activation functions while updating only the outer layer weights with an adaptive update law [13]. A typical structured neural network approach for flight control utilizes a Gaussian Radial Basis Function (RBF) Neural Network where RBFs are generally spread out across the input space with fixed centers and widths. The drawback of this approach is that the number of RBFs needs to increase significantly as the input space increases [14]. A single hidden layer (SHL) neural network is an unstructured approach (nonlinearly parameterized neural network) where both the inner layer weights and outer layer weights are updated concurrently. Although more complicated, SHL-based adaptive controllers often have better transient performance and are easier to tune compared to the RBF networks [14, 15]. However, RBF neural networks contain local support that allows for a more desirable learning structure [16]. In both approaches, in order to ensure uniform ultimate boundedness of the tracking error and boundedness of the adaptive weights, robust adaptive control modifications (e.g. projection, dead-zone) must be applied to the update laws, which ensure stability even if the persistence of excitation condition is not satisfied [17, 18]. In

PAGE 19

order for an adaptive controller to cancel the uncertainty precisely and have the adaptive weights converge to their ideal values, the states of the system must be persistently exciting (PE) [19]. Unfortunately, this condition is often not met in flight control and is difficult to verify [16, 20].

For both RBF and SHL adaptive systems, the selection of adaptive learning rates and the number of hidden nodes is paramount. Both selections have trade-offs that significantly affect tracking performance and are areas of active research [16]. Recently, a performance comparison between SHL and RBF based controllers in [15] showed that, for a specified constant learning rate, there is an optimal number of hidden nodes such that the norm of the tracking error does not significantly decrease by adding additional nodes to the system. Often for flight control applications, the number of nodes is less than ideal and selected based on computational constraints [21]. Another well-known trade-off in adaptive control, specifically MRAC, is the selection of the learning rates. Higher learning rates correspond to reducing the tracking error more rapidly [13, 17, 22]. However, high learning rates can cause high-frequency oscillations in systems with unmodeled dynamics [23]. Hence, there exists a significant trade-off between robustness and tracking performance.

In the adaptive control community, neural network-based adaptive controllers are predominately initialized to small random numbers then updated according to adaptive update laws discovered through Lyapunov analysis. In other words, we are handcuffed in designing weights for the neural network based on the stability analysis. This methodology is used by two of the most popular authors in neural network adaptive control of flight vehicles, Lavretsky [16] and Calise [24]. It is also worth noting that there have been some results which allow the neural network weights to be designed off-line under certain restrictions [25]. In addition, using indirect adaptive control and multiple model philosophy, Chen and Narendra proposed an alternate approach which switches

PAGE 20

between a nonlinear neural network and a robust linear control design [26], which has more flexibility in design.

Although not the focus of this research, recently there have been many enhancements to the MRAC architecture that aim to improve transient performance without satisfying the PE condition. Concurrent learning is one developed approach which focuses on weight convergence by using current and recorded data during adaptation to improve learning performance and, under certain conditions, guarantees exponential tracking and ideal weight convergence even without satisfying the persistence of excitation (PE) condition [18]. Another approach called L1 adaptive control focuses on instantaneously dominating uncertainty through fast adaptation by using high adaptive gains and employing a low pass filter at the controller output [27]. In addition, there has been much effort to establish stability and performance metrics for adaptive flight control through the use of verification methods [28].

Recently, there has been increased research interest in the area of switched nonlinear systems, fuzzy logic, intelligent control, and neural networks due to numerous breakthroughs in Lyapunov-based stability methods for switched and hybrid systems [29, 30]. For instance, neural network-based fuzzy logic techniques have been developed for MRAC systems which aim to modify the reference model online in order to improve transient performance through the use of a supervisory loop [31]. Additionally, adaptive neural networks have recently been used in both SHL and RBF networks in order to augment switched robust baseline controllers for robot manipulators and unmanned aerial vehicles [32, 33]. Moreover, control methodologies have been developed for hypersonic and highly nonlinear flight vehicles using fuzzy logic and switched tracking controllers [32, 34]. Furthermore, the stability of adaptive RBF networks with dwell time conditions that dynamically add and remove nodes was investigated for systems that include switched dynamics [35, 36].
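To make the concurrent-learning idea mentioned above concrete, the following is a minimal sketch, not the exact update laws of the cited work: the weight derivative combines the instantaneous tracking-error term with terms replayed from a stack of recorded regressor/output pairs, so adaptation continues even when the current trajectory is not persistently exciting. All gains, dimensions, and data here are illustrative assumptions.

```python
import numpy as np

# Minimal concurrent-learning sketch: the weight update combines the
# instantaneous tracking-error term with terms replayed from a stack of
# recorded (regressor, output) pairs, so information keeps driving
# adaptation even without persistent excitation. Generic gradient form
# only; gains, dimensions, and data are illustrative.

def cl_update(W, phi_now, e_now, stack, gamma=1.0, gamma_c=0.5, dt=0.01):
    """One Euler step of
       W_dot = gamma * phi_now * e_now + gamma_c * sum_j phi_j * eps_j,
    where eps_j = y_j - W @ phi_j is the modeling error replayed on the
    recorded pair (phi_j, y_j)."""
    dW = gamma * phi_now * e_now
    for phi_j, y_j in stack:
        dW = dW + gamma_c * phi_j * (y_j - W @ phi_j)
    return W + dt * dW

# Recorded data stack: (regressor, measured uncertainty output) pairs
# whose regressors span the parameter space.
true_W = np.array([1.0, -2.0])   # "ideal" weights, unknown in practice
stack = [(phi, true_W @ phi) for phi in
         (np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([1.0, 1.0]))]

W = np.zeros(2)
for _ in range(2000):
    # phi_now = 0: no current excitation at all; only the stack drives W
    W = cl_update(W, phi_now=np.zeros(2), e_now=0.0, stack=stack)
```

Because the recorded regressors span the parameter space, W is driven toward true_W even though the instantaneous regressor is zero; under the instantaneous-error term alone, W would never move in this scenario.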

PAGE 21

As summarized in Section 1.3, Chapters 4 and 5 develop a sparse neural network switched nonlinear control approach to MRAC. Additionally, Chapter 3 provides additional background and detail on neural network-based MRAC.

1.1.2 Policy Search and Deep Learning

Over the past decade, there have been tremendous breakthroughs in machine learning. Many of these breakthroughs derive from deep learning methods. Deep architectures learn complex representations of data sets in multiple layers where each layer is composed of a number of processing nodes. Deep learning succeeds by discovering complex structure in high dimensional data where each layer strives to find hidden and low dimensional features [37]. In order to do so, deep learning architectures are typically, at least, three to five layers deep. The deeper the design, the more complex of a function the algorithm can learn. Deep learning has become prevalent in industry where several companies (e.g. Apple, Microsoft, Google, Facebook, Adobe) have obtained impressive performance and utility in speech recognition and face recognition tasks and applications (see [37, 38]).

Much of the literature surrounding the utilization of deep learning methods for control of dynamical systems is found in the reinforcement learning field within machine learning. The goal of reinforcement learning is to develop methods to sufficiently train an agent by maximizing a cost function through repeated interactions with its environment [39].

1.1.2.1 Reinforcement learning

The majority of reinforcement learning based methods focus on a dynamic programming based approach to control. These methods are either model-based or model-free. Model-free based methods are typically based on temporal difference learning algorithms such as Q-learning or SARSA where the value function is estimated from repeated interactions with the environment. Alternatively, model-based methods use a model of the system dynamics and dynamic programming to compute a control


policy that minimizes a value function. In both cases, the value function is pre-defined. In small discrete spaces these algorithms work well, but in more realistic scenarios (continuous and large) these methods suffer from the curse of dimensionality [40]. That is, discretization schemes produce results that grow exponentially in the dimension of the space.

Policy search is a subfield of reinforcement learning that seeks to find the best possible parameters of a controller in a particular form (e.g., a deep neural network) such that it optimizes a pre-defined cost function [39]. Optimization of the controller parameters is often performed using gradient-based policy updates, where the gradients are either computed analytically or numerically (e.g., finite difference methods). In order to compute analytic gradients, the cost function, policy, and plant dynamics are required to be differentiable. In contrast to dynamic programming, policy search methods handle high-dimensional spaces well, and they do not suffer from the curse of dimensionality. Similar to dynamic programming, there exist both model-based and model-free policy search methods. Model-free methods learn policy parameters by interacting with the environment through the use of sample trajectories. Model-based policy search methods typically learn the plant dynamics through repeated interactions with the environment and then subsequently use the learned dynamics to train the parameters of the controller internally. Similar to the adaptive control literature, if the trained parameters of the controller are obtained based on internal simulations, the resulting controller often lacks robustness to modeling errors. Adding noise and uncertainties to the system through probabilistic models or direct injection leads to improved robustness and a smoother objective function, which allows the parameters to avoid local minima [39].

The research described in this dissertation is inspired by the recent work of Levine and Koltun [41] and Sutskever [42]. Levine and Koltun created a Guided Policy Search framework that uses trajectory optimization in order to assist policy learning and avoid parameter convergence to poor local minima. Levine recently explored training deep learning architectures using the methodology established in guided policy search [43]


and applied those controllers to sophisticated robotic tasks. Sutskever [42] also explored deep learning-based policy search methods, but his research focused on the benefits of using Hessian-free optimization. He found that including disturbances and time delays in training samples during optimization of his deep recurrent network led to improved robustness for simple systems.

The idea of Lyapunov funnels has also been a main influence on our work. Funnels for control policies have been used in robotics and nonlinear control to provide certificates of convergence from a large set of initial conditions to a specified goal region [44]. Recently, Tedrake et al. [45] used tools from sum-of-squares (SOS) programming to estimate regions of attraction for randomized trees stabilized with LQR feedback. SOS programming leverages convex optimization to allow the control designer to check positivity and negativity of Lyapunov functions for polynomial systems [46]. SOS programming has strong connections to robustness analysis of nonlinear systems and has become a popular tool for analyzing stability for time-invariant systems [47]. In addition, a number of powerful computational tools for SOS programming have been developed, including the direct computation of Lyapunov functions for smooth nonlinear systems [48]. Majumdar and Tedrake [49] used SOS programming to design sequential controllers along preplanned trajectories while explicitly aiming to maximize the size of the funnels during design. One shortcoming of these methods is that they do not explicitly focus on improving the robustness of the controller. In fact, as discussed by the authors, uncertainties and disturbances can cause stability and performance guarantees to be violated in practice [50].

Even though deep learning controllers have shown great promise in completing robotic control tasks, they often require a large number of computations at each time step, do not possess standard training procedures, and have few analysis tools for the resulting control design. In addition, deep learning controllers are often optimized without regard to robustness with the assumption that the optimal policy for the learned


model corresponds to the optimal policy for the true dynamics (i.e., the certainty equivalence assumption) [39]. Chapters 6 and 7 aim to develop methods that combat these limitations.

1.1.2.2 Robust deep learning controller analysis

Traditionally in flight control, gain and phase margins have been required for linear time-invariant (LTI) based control laws for flight systems in order to provide sufficient robustness to uncertainties and unmodeled dynamics. For nonlinear systems, especially adaptive systems, control designers can utilize time delay margins and an alternative form of gain margins as important robustness metrics [28]. Time delay margin can be defined as the amount of time delay that the closed-loop system can handle without resulting in instability. We will assume that the time delay enters the system through the control signal. Although recent research has attempted to establish fundamental methods for computing time delay margin, it is still quite common to compute time delay margin using simulations.

A number of authors have explored robust control synthesis based on ROA estimation. The vast majority of these authors rely on Lyapunov-based methods for ROA estimation (e.g., sum of squares) with controllers in polynomial or linear form. For instance, Dorobantu et al. [47] developed a methodology to train linear controllers to be robust to parametric uncertainties by iterating between sum-of-squares (SOS) and nonlinear optimization. Theis [51] used SOS methods to find control Lyapunov functions (CLFs) for polynomial systems with bounded uncertainties. Kwon et al. [52] used SOS methods to determine how to gain schedule a boosted missile. Even though many popular and successful computational tools for SOS programming have been developed, including the direct computation of Lyapunov functions for smooth nonlinear systems [48], Lyapunov-based methods have a number of drawbacks. For example, Lyapunov-based ROA estimates often lead to results that


are extremely conservative, restricted to polynomial models, and are negligent of system limitations (e.g., actuator saturation) [53].

In Chapter 7, we explore using reachable set analysis to estimate the ROA of an equilibrium point of our closed-loop system. Reachability analysis is used to find the exact or over-approximated set of states that a system can reach, given an initial set of states and inputs. It is well known that the exact reachable set of hybrid and continuous systems can only be computed in special cases. Hence, typical methods over-approximate by using geometric methods. Popular in the safety assurance community, reachable set methods can be used to guarantee the avoidance of unsafe states while assuring convergence to desirable equilibrium points [54].

1.2 Overall Contribution

The goal of the research included in this dissertation was to develop sparse and deep learning-based nonlinear and adaptive control methods which directly target HSV control challenges. In order to do so, we used two separate control methodologies: model reference adaptive control and policy search-based deep learning. Using model reference adaptive control as the framework, we developed a sparse neural network-based adaptive controller which reduces the computational burden on the processor while drastically improving learning and tracking performance on control tasks with persistent region-based uncertainties. Alternatively, we developed an innovative off-line training procedure for a deep learning controller (DLC) that simultaneously trains for performance and robustness. By using a sparsely connected DLC, we show significant improvement in parameter convergence while also reducing the number of computations required at each time step. Both architectures were evaluated and analyzed using a sophisticated hypersonic control model.

1.3 Chapter Descriptions

Following this chapter, we provide two background chapters which aim to provide a sufficient background in deep learning and model reference adaptive control


(MRAC): Chapter 2 and Chapter 3. An overview which includes key contributions of the remaining chapters is given below.

1.3.1 Chapter 4: Improving Learning of Model Reference Adaptive Controllers: A Sparse Neural Network Approach

The contribution of this chapter is a novel approach to adaptive control that uses sparse neural networks (SNNs) to improve learning and control of flight vehicles under persistent and significant uncertainties. The SNN approach is proven to enhance long-term learning and tracking performance by selectively modifying a small portion of weights while operating in each portion of the flight envelope. This results in better controller performance due to the better initialization of the weights after repeated visits to a specific portion of the flight envelope. Flight control simulations show quite impressive results generated by the SNN approach against traditional single hidden layer (SHL) and radial basis function (RBF) based adaptive controllers.

1.3.2 Chapter 5: A Sparse Neural Network Approach to Model Reference Adaptive Control with Hypersonic Flight Applications

This chapter expands on the progress made in the previous chapter. We provide three key contributions that lead to significantly improved tracking performance based on simulation results and a uniformly ultimately bounded (UUB) Lyapunov stability result with a dwell time condition. First, we develop adaptive control terms which mitigate the effect of an unknown control effectiveness matrix on the baseline, adaptive, and robust controllers. Secondly, we derive a robust control term which is used to calculate a dwell time condition for the switched system. The inclusion of the robust control term along with a newly derived dwell time condition is used to ensure safe switching between segments. In our work, the robust control term is only activated when the norm of the tracking error exceeds preset bounds. While inside the error bounds, we disable the robust controller in order to allow the SNN to maximize learning and control the vehicle more precisely. Lastly, we investigate neural network adaptive control laws produced by increasing the order of the Taylor series expansion around the hidden layer


of the matched uncertainty. In addition to the previously mentioned developments, we demonstrate the performance of the SNN adaptive controller using a hypersonic flight vehicle (HSV) model with flexible body effects. For comparison, we also include analysis results for the more conventional SHL approach.

1.3.3 Chapter 6: Development of a Robust Deep Recurrent Network Controller for Flight Applications

The main contribution of this chapter is the development of an optimization procedure to train the parameters of a deep recurrent network controller for control of a highly dynamic flight vehicle. We train our recurrent neural network controller using a set of sample trajectories that contain disturbances, aerodynamic uncertainties, and significant control attenuation and amplification in the plant dynamics during optimization. In addition, we define a piecewise cost function that allows the designer to capture both robustness and performance criteria simultaneously. Inspired by layer-wise training methods in deep learning, we utilize an incremental initialization training procedure for multi-layer recurrent neural networks. Next, we compare the performance of the deep RNN controller to a typical gain-scheduled linear quadratic regulator (LQR) design. Finally, we demonstrate the ability of the controller to negate significant uncertainties in the aerodynamic tables while remaining robust to disturbances and control amplification/attenuation.

1.3.4 Chapter 7: Development of a Deep and Sparse Recurrent Neural Network Hypersonic Flight Controller with Stability Margin Analysis

In this chapter, we present a novel approach for training and verifying a robust deep learning controller (DLC) for a highly dynamic hypersonic flight vehicle. We leverage the sample-based trajectory training methodology established in Chapter 6 for training and optimization of the DLC weights. In order to design a sufficiently robust and high-performing controller, the controller utilizes a training set with a high number of training samples which contain varying commands, uncertainties (e.g., time delay, control amplification/attenuation), and disturbances (e.g., flexible body effects) in each


training sample. The training phase allows the designer to include disturbances and uncertainties that are not explicitly included in the dynamics of the flight vehicle but are anticipated based on region-specific models. Next, we extend the sparse neural network (SNN) adaptive controller architecture developed in Chapter 4 into a deep neural network framework. We use this innovative sparse deep learning controller (S-DLC) architecture to reduce the computational load on the processor and significantly improve parameter convergence. By recognizing connections of the GRU module to feed-forward networks, we develop a systematic training procedure which improves optimization of the DLC. Additionally, robustness metrics of the DLC are verified through simulation by using ROA estimation via forward reachable sets. Lastly, we analyze the results of the optimization and provide simulation results.


CHAPTER 2
BACKGROUND: DEEP NEURAL NETWORKS

Since the establishment of machine learning methods, there has been a group of techniques referred to as supervised learning. The idea behind supervised learning is to adjust a set of parameters (weights) in order to reduce some predefined error (cost) function based on labeled training data. That is, during the training phase, the supervised learning algorithm is fed input data (e.g., images) and known outputs (target values). The idea is to adjust the system weights in order to find a local minimum of the cost function using a chosen optimization method. Typically, researchers employ batch methods (e.g., L-BFGS) or stochastic gradient descent (SGD) to solve unconstrained optimization problems with deep architectures. During implementation, batch methods are far less common due to the fact that they require the entire training set of data in order to compute the value and gradient of the cost function. Another popular research area in the deep learning community is unsupervised learning. Unsupervised learning methods make use of unlabeled data in order to discover structure in the data [37, 55].

Neural networks are a branch of machine learning that became popular in the 1980s. A neural network is a biologically inspired mathematical architecture which can be described as an input-output map. The network contains a large number of neurons, where each neuron is represented by an activation function (e.g., sigmoid, tanh, or linear rectifier). Each neuron activation function produces a single output based on a nonlinear transformation of the input. Neural networks became prevalent in machine learning after the discovery of backpropagation. Backpropagation is an extension of the chain rule and can be implemented on a multi-layer neural network to compute the gradients of that network [55]. In the 1990s, many researchers were led away from neural network research due to insignificant and inconsistent results while using multi-layer neural networks. Soon after, support vector machines (SVMs) were discovered


and were employed more often for several reasons. Most importantly, multi-layer networks often converged to saddle points and local minima, while SVMs were typically formulated as convex optimization problems with global minimums. Secondly, neural networks often have issues over-fitting the data, which leads to low errors during the training phase and large errors during the testing or execution phase. At the time, this led to SVMs outperforming multi-layer neural networks on most tasks [37].

In the early 2000s, improvements in computational power, parallel computing, automatic gradient computation engines, and large databases of labeled training data led to the re-emergence of deep neural networks. Shortly after, in 2006, there was a breakthrough in deep learning research. A successful method for training deep learning architectures was discovered, called greedy layer-wise training. The main idea behind this approach is simple: incrementally train each layer of the network; then, after all of the layers are trained, fine-tune the network by re-tuning all the weights concurrently. In the machine learning community, many initial layers are often pre-trained using unsupervised learning methods. After the network is pre-trained, the weights have a better initialization in the parameter space than if they had been randomly initialized. This typically results in the optimization algorithm converging to a better local minimum because it has some information about the system a priori [55]. For smaller labeled data sets, unsupervised pre-training helps to prevent over-fitting and leads to better generalization [56]. In practice, digital signal processing (DSP) techniques (e.g., Fourier transforms and cepstral analysis) are used to train lower layers of the deep learning network in an unsupervised fashion before end-to-end training. Due to the many breakthroughs in deep neural network training, Gaussian HMM-based statistical methods, which have been used for decades, are being replaced in the most popular commercial speech recognition systems (e.g., Apple, Microsoft, Google, Facebook, Adobe) [37, 57].


Sparse neural network methods have also contributed to the recent success of deep learning. One example is the widespread unsupervised learning method called the sparse autoencoder. For the sparse autoencoder, a neural network is trained to produce an output which is identical to the input while limiting the average activation value for each node in the network. By utilizing this technique, only a small percentage of neurons are impacted when optimizing weights for each training example. This leads to faster convergence and a better classifier for image and text tasks [58]. Another example of sparsity in deep learning can be found in the most successful activation functions. For instance, linear rectifiers as activation functions are more biologically plausible due to their natural sparsity characteristics. This leads to superior performance on learning tasks compared to their sigmoidal or hyperbolic tangent counterparts [58]. Furthermore, in 2015, research on initialization of ReLU-based deep architectures was provided in [59]. After random initialization, linear rectifier hidden units are active approximately 50% of the time. In addition, recently developed sparse techniques such as the maxout function [60] and the closely related channel-out function [61] have found tremendous success on classification problems.

More recently, due to the popularity of deep learning, there has been significant research effort in applying deep learning to solve time-series problems, specifically using recurrent neural networks (RNNs). RNNs are neural networks that process inputs sequentially, where the hidden nodes keep a history of the past states. Recurrent neural networks are most frequently trained using backpropagation through time (BPTT) or real-time recurrent learning (RTRL). In comparison to standard feed-forward neural networks, recurrent neural networks are even more difficult to train and have large susceptibility to local minima and saddle points. This has been attributed to the difficulty that recurrent neural networks have with learning long-range temporal dependencies, also known as the well-studied vanishing gradient problem. Fortunately, there have been notable improvements in training recurrent neural networks. Firstly, long short-term


Figure 2-1. Example of a multi-layer feed-forward neural network.

memory (LSTM) and gated recurrent units (GRU) are modularized architectures for recurrent neural networks which have proven to provide significant benefits in terms of performance by addressing the vanishing gradient problem. Hessian-free optimization was created to address that same issue [42]. Echo-state machines are another alternative which side-steps the vanishing gradient problem.

The rest of this chapter is dedicated to providing a general background in neural networks and deep learning while providing references for the reader. Chapter 4 describes a sparse neural network (SNN) adaptive control architecture inspired by the deep sparse neural network literature. Chapter 6 aims to extend deep learning-based policy search methods to the complicated dynamics of high-speed flight while considering robustness properties of the controller during optimization. Chapter 7 extends the work in Chapter 6 to include hypersonic control and improves the controller's robustness and tracking performance by employing a sparse connectivity scheme.

2.1 Deep Multi-layer Neural Networks

Algorithm 2.1 Forward Propagation
1: a_1 = data
2: for i = 1 : numHidden + 1 do
3:   z_{i+1} = W_i a_i + b_i
4:   a_{i+1} = f(z_{i+1})
5: end for
6: y = a_{i+1}
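A minimal Python sketch of Algorithm 2.1 (our illustration; the tanh activation, layer sizes, and weight values are arbitrary assumptions, not taken from the dissertation):

```python
import math

def forward(weights, biases, a, f=math.tanh):
    """Forward propagation (Algorithm 2.1): z = W a + b, then a = f(z), per layer."""
    for W, b in zip(weights, biases):
        # Matrix-vector product plus bias for this layer.
        z = [sum(w_ij * a_j for w_ij, a_j in zip(row, a)) + b_i
             for row, b_i in zip(W, b)]
        a = [f(z_i) for z_i in z]
    return a

# Tiny illustrative network: 2 inputs -> 2 hidden units -> 1 output.
weights = [[[0.5, -0.25], [0.1, 0.3]], [[1.0, -1.0]]]
biases = [[0.0, 0.1], [0.05]]
print(forward(weights, biases, [1.0, 2.0]))
```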


Algorithm 2.2 Backpropagation
1: for l = numLayers : -1 : 1 do
2:   if l = lastLayer then
3:     δ_l = -(y - labels) ∘ f'(z_l)
4:   else
5:     δ_l = (W_l^T δ_{l+1}) ∘ f'(z_l)
6:   end if
7:   dJ/dW_l = δ_{l+1} (a_l)^T
8:   dJ/db_l = δ_{l+1}
9: end for

The simplest form of multi-layer network is illustrated in Figure 2-1 and is often referred to as a feed-forward multi-layer neural network. The network is trained (see Algorithm 2.1) using an optimization algorithm (e.g., stochastic gradient descent or L-BFGS) and a method called backpropagation (see Algorithm 2.2). Backpropagation determines the derivative of the cost function, usually denoted J, with respect to the weights using the chain rule (i.e., dJ/dW). The network shown in Figure 2-1 has five layers: an input layer, three hidden layers, and an output layer. As stated in the introduction, the deeper the network, the more complex a function the neural network can represent. Generally, for time-series problems, each input node represents one step back in the time history of that signal, which is referred to as a time-delayed neural network. For each node added to the network, more weights are needed to connect to the internal nodes. This results in expanding memory for each time history sample added. Hence, time-delay multi-layer feed-forward networks are generally restricted to a small amount of time history.

Figure 2-2. Example of a single hidden layer recurrent neural network.
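To make the delta recursion of Algorithm 2.2 concrete, here is a small sketch (ours; the single-hidden-layer sizes, the linear output layer, and the squared-error cost are assumptions) that computes dJ/dW and checks one entry against a finite-difference estimate:

```python
import math

def forward(W1, b1, W2, b2, x):
    z1 = [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(W1, b1)]
    a1 = [math.tanh(z) for z in z1]
    z2 = [sum(w * ai for w, ai in zip(row, a1)) + b for row, b in zip(W2, b2)]
    return z1, a1, z2  # linear output layer

def cost(W1, b1, W2, b2, x, y):
    _, _, z2 = forward(W1, b1, W2, b2, x)
    return 0.5 * sum((zi - yi) ** 2 for zi, yi in zip(z2, y))

x, y = [1.0, -0.5], [0.25]
W1, b1 = [[0.2, -0.1], [0.4, 0.3]], [0.0, 0.1]
W2, b2 = [[0.7, -0.6]], [0.05]

# Backpropagation: output-layer delta, then chain back through the hidden layer.
z1, a1, z2 = forward(W1, b1, W2, b2, x)
delta2 = [zi - yi for zi, yi in zip(z2, y)]           # = (prediction - labels)
dJ_dW2 = [[d * ai for ai in a1] for d in delta2]      # dJ/dW2 = delta2 a1^T
delta1 = [sum(W2[k][j] * delta2[k] for k in range(len(delta2)))
          * (1 - a1[j] ** 2) for j in range(len(a1))]  # tanh' = 1 - tanh^2
dJ_dW1 = [[d * xi for xi in x] for d in delta1]       # dJ/dW1 = delta1 x^T

# Finite-difference check on W1[0][0].
eps = 1e-6
W1p = [row[:] for row in W1]; W1p[0][0] += eps
num = (cost(W1p, b1, W2, b2, x, y) - cost(W1, b1, W2, b2, x, y)) / eps
print(abs(num - dJ_dW1[0][0]) < 1e-5)
```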


Figure 2-3. Example of an expanded recurrent neural network.

2.2 Deep Recurrent Neural Networks

Recurrent neural networks (RNNs) have had outstanding success in addressing time-series problems in recent history. An example diagram of an RNN can be seen in Figure 2-2. RNNs have a natural way of incorporating the time history of states into the network structure. Recurrent networks also have the benefit of operating with a much smaller number of parameters than feed-forward networks. The expanded view of the RNN can be seen in Figure 2-3, where the current time step is fed data from previous time steps. Generally, deep recurrent networks attach several hidden layers before resolving the output. Similar to feed-forward networks, the derivative of the cost function with respect to the weights (i.e., dJ/dW) is obtained by using the chain rule after running forward propagation to determine the cost (see Algorithm 2.3). For recurrent networks, this is called backpropagation through time (see Algorithm 2.4). Recently, RNNs have set new benchmarks in speech recognition by using deep bidirectional long short-term memory (LSTM) based networks [62].

Traditional recurrent neural networks often struggle with capturing information from long-term dependencies. This struggle is coined the vanishing gradient problem and also occurs with very deep feed-forward networks [63]. The struggle originates


when calculating the derivative of the cost function with respect to the weights (e.g., BPTT). Regularization, RNN architectures (e.g., long short-term memory (LSTM) or gated recurrent units (GRU)), and optimizers such as Hessian-free optimization have addressed the long-standing issue of vanishing gradients. A similar issue of exploding gradients can also occur during training, but recently a method of clipping the gradient has been proven to mitigate that issue [64].

2.2.1 Memory Modules for Recurrent Neural Networks

2.2.1.1 Long short-term memory (LSTM)

In standard recurrent neural networks, the hidden state is given by

s_t = f(U x_t + W s_{t-1})

where x_t is the current input, s_{t-1} is the previous hidden state, and U, W are tunable parameters of the network. As stated earlier, this structure has difficulty in learning long-term dependencies due to the vanishing gradient problem. In the literature, LSTM is often mislabeled as a new architecture for recurrent neural networks. In fact, LSTM is a module which replaces how the hidden state is updated using a gated mechanism [65]. See the equations stated in Algorithm 2.5, where each gate has corresponding weights that determine how much previous information should be kept and how much should be forgotten [63]. In addition to the gated mechanism, LSTM has an internal memory, c_t, and an output gate, o, which determines how much of the internal memory is provided to the next module. LSTM requires more memory than the traditional implementation and more weights to tune, but is simple to implement and effective.

Algorithm 2.3 Forward Propagation (RNN)
1: for t = 1 : T_f do
2:   for i = 1 : numLayers do
3:     s_t^i = f(U_i a_t^i + W_i s_{t-1}^i + b_i)
4:   end for
5:   y_t = V s_t^i + b_{i+1}
6: end for
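The standard hidden-state recursion above can be sketched in a few lines of Python (ours; the scalar weights are chosen purely for illustration). With a recurrent weight below one in magnitude, the influence of early inputs decays geometrically, which is a scalar view of the vanishing gradient problem:

```python
import math

def rnn_states(U, W, xs, s0=0.0, f=math.tanh):
    """Scalar Elman-style recursion s_t = f(U x_t + W s_{t-1})."""
    s, states = s0, []
    for x in xs:
        s = f(U * x + W * s)
        states.append(s)
    return states

# A single impulse input: its trace through the hidden state shrinks each step,
# since ds_T/ds_0 is a product of factors W * f'(.), each smaller than one here.
states = rnn_states(U=1.0, W=0.5, xs=[1.0, 0.0, 0.0, 0.0])
print([round(s, 4) for s in states])
```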


Algorithm 2.4 Backpropagation through time (BPTT)
1: for t = T_f : -1 : 1 do
2:   dJ/dz_2 = (dJ/dy_t)(dy_t/dz_2)
3:   dJ/dV = (dJ/dz_2)(s_t)^T + dJ/dV
4:   dJ/db_{out} = dJ/dz_2 + dJ/db_{out}
5:   dJ/ds_t = V^T (dJ/dz_2) + dJ/ds_t
6:   for j = numLayers : -1 : 1 do
7:     dJ/dz_1 = (dJ/ds_t^j) ∘ f'(z_1^j)
8:     dJ/dU_j = (dJ/dz_1)(a_t^j)^T + dJ/dU_j
9:     dJ/dW_j = (dJ/dz_1)(s_{t-1}^j)^T + dJ/dW_j
10:     dJ/db_j = dJ/dz_1 + dJ/db_j
11:     dJ/ds_{t-1}^j = W_j^T (dJ/dz_1)
12:   end for
13: end for

2.2.1.2 Gated recurrent unit (GRU)

Gated recurrent units (GRU) are recently introduced modules used in a similar fashion to LSTM, but they require less memory and have a different structure [66]; see Algorithm 2.6. The two gates in the GRU structure determine the trade-off between new information and past information. Notice that the GRU does not have internal memory. We will utilize GRU modules in our deep architecture used for flight control; see Chapters 6 and 7.

2.2.2 Deep Recurrent Network Architectures

In addition to optimization algorithm research, memory modules, and layer-wise training, another significant area of research in the last few years lies in determining the most productive recurrent network architectures. Pascanu et al. [67] explored different recurrent neural network architectures and found that the deep stacked recurrent neural network worked best for the majority of the tasks (see Figure 2-4). Graves et al. [62]


determined deep bidirectional recurrent neural networks to be effective for speech recognition problems. In Chapters 6 and 7, we will provide a more detailed description of the stacked recurrent neural network.

2.3 Optimization

Optimization is a necessary tool for machine learning algorithms in order to minimize a predefined cost function that is specified by the user. A few popular optimization routines for unconstrained optimization problems found in deep learning are described below, but the reader is encouraged to see Ngiam et al. [68] and Sutskever [42] for a more detailed description and comparison.

2.3.1 Gradient Descent

One of the most fundamental and simple ways to solve the function minimization problem described previously is gradient descent. Gradient descent is a first-order optimization algorithm which is easily described by the update below, where a small gain, α, is used to step in the direction of the negative gradient. The smaller the gain, the longer it will take to converge. If the gain is too large, the algorithm may diverge (overstep).

θ_t = θ_{t-1} - α ∇_θ E[J]

2.3.2 Stochastic Gradient Descent

Stochastic gradient descent (SGD) is by far the most popular optimization method used for deep learning due to its speed and effortless implementation. SGD has an advantage over batch methods, like traditional gradient descent, because it does not

Algorithm 2.5 Long Short-Term Memory (LSTM)
i = σ(x_t U_i + s_{t-1} W_i)
f = σ(x_t U_f + s_{t-1} W_f)
o = σ(x_t U_o + s_{t-1} W_o)
g = tanh(x_t U_g + s_{t-1} W_g)
c_t = c_{t-1} ∘ f + g ∘ i
s_t = tanh(c_t) ∘ o
Note: ∘ represents element-wise multiplication
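A scalar sketch of the LSTM update in Algorithm 2.5 (ours; the sigmoid gate nonlinearity and the weight values are illustrative assumptions):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, s_prev, c_prev, U, W):
    """One scalar LSTM step; U and W map each gate name to its weight."""
    i = sigmoid(x * U['i'] + s_prev * W['i'])    # input gate
    f = sigmoid(x * U['f'] + s_prev * W['f'])    # forget gate
    o = sigmoid(x * U['o'] + s_prev * W['o'])    # output gate
    g = math.tanh(x * U['g'] + s_prev * W['g'])  # candidate memory
    c = c_prev * f + g * i                       # internal memory c_t
    s = math.tanh(c) * o                         # hidden state s_t
    return s, c

U = {'i': 0.5, 'f': 0.5, 'o': 0.5, 'g': 1.0}
W = {'i': 0.5, 'f': 0.5, 'o': 0.5, 'g': 1.0}
s, c = 0.0, 0.0
for x in [1.0, -1.0, 0.5]:
    s, c = lstm_step(x, s, c, U, W)
print(-1.0 < s < 1.0)  # hidden state is bounded: s = tanh(c) * o with o in (0, 1)
```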


Algorithm 2.6 Gated Recurrent Unit (GRU)
u = σ(x_t U_u + s_{t-1} W_u)
r = σ(x_t U_r + s_{t-1} W_r)
h = tanh(x_t U_h + (s_{t-1} ∘ r) W_h)
s_t = (1 - u) ∘ h + u ∘ s_{t-1}
Note: ∘ represents element-wise multiplication

require the entire training set to make parameter adjustments. For very large data sets, batch methods become slow. For SGD, the gradient of the parameters is updated using only a single training example (see the update below). In practice, a subset of the original training set is chosen at random to perform the update. SGD has the reputation of leading to a stable convergence at a speedy pace. Unfortunately, SGD comes with drawbacks. For instance, the learning rate has to be chosen by the user and can be difficult to determine. Fortunately, there has been a tremendous amount of research on improving the convergence of SGD. Methods include adaptively changing the learning rate or simply decreasing it based on some pre-defined schedule [55]. In addition, momentum methods can be applied in order to accelerate in the direction of the gradient and consistently reduce the cost function more quickly. Traditional momentum equations are stated in Algorithm 2.7. Unfortunately, this introduces another tunable parameter, the momentum rate, which determines how much gradient information from the past is used on each update.

θ_t = θ_{t-1} - α ∇_θ J(θ; x_i, y_i)

Recently, Nesterov's accelerated gradient algorithm has been shown to improve convergence for deep recurrent neural networks [42]. Nesterov's algorithm [69], seen in Algorithm 2.8, additionally requires a momentum schedule, μ_t, and an adaptive learning rate, α_t.
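A scalar sketch of the GRU update in Algorithm 2.6 (ours; the sigmoid gates and the weight values are illustrative assumptions, not values from the dissertation):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def gru_step(x, s_prev, Uu, Wu, Ur, Wr, Uh, Wh):
    """One scalar GRU step: update gate u, reset gate r, candidate state h."""
    u = sigmoid(x * Uu + s_prev * Wu)          # update gate
    r = sigmoid(x * Ur + s_prev * Wr)          # reset gate
    h = math.tanh(x * Uh + (s_prev * r) * Wh)  # candidate state
    return (1.0 - u) * h + u * s_prev          # blend new and past information

s = 0.0
for x in [1.0, -1.0, 0.5]:
    s = gru_step(x, s, Uu=0.5, Wu=0.5, Ur=0.5, Wr=0.5, Uh=1.0, Wh=1.0)
print(-1.0 < s < 1.0)  # convex blend of bounded terms stays bounded
```

Note that, unlike the LSTM, there is no separate internal memory c_t: the hidden state itself carries the blended information.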


Figure 2-4. Simplified stacked recurrent neural network (S-RNN) architecture.

2.3.3 Broyden-Fletcher-Goldfarb-Shanno (BFGS)

Newton's method is an iterative method that uses the first few terms of the Taylor series expansion to find roots of a function (i.e., where f(x) = 0). In optimization, this concept is used to find the roots of the derivative of the function, f'. That is, Newton's optimization method (see Algorithm 2.9) is used to find the stationary points of f, which are also the local minima and maxima of the function. This is often described as fitting a quadratic function around the point x and then taking steps toward the minimum of that quadratic function. The issue with Newton's method for optimization is that it requires an analytical expression for the Hessian, which is often computationally expensive to compute, and requires the Hessian to be invertible. For these reasons, most second-order methods, called quasi-Newton methods, solve unconstrained optimization problems by estimating the inverse Hessian matrix.

Broyden-Fletcher-Goldfarb-Shanno (BFGS) is a quasi-Newton second-order batch method which can be used for finding extrema. BFGS is often implemented as L-BFGS, which stands for limited-memory BFGS. BFGS is one type of quasi-Newton method that solves unconstrained nonlinear optimization problems by estimating the inverse Hessian (see Algorithms 2.10 and 2.11). A comparison of quasi-Newton methods and details of


implementation and test results can be seen in [70]. We will utilize L-BFGS optimization in Chapters 6 and 7.

Algorithm 2.7 Momentum
v_t = γ v_{t-1} - α ∇_θ J(θ; x_i, y_i)
θ_t = θ_{t-1} + v_t

Algorithm 2.8 Nesterov's Momentum
v_t = μ_{t-1} v_{t-1} - α_{t-1} ∇f(θ_{t-1} + μ_{t-1} v_{t-1})
θ_t = θ_{t-1} + v_t
μ_t = 1 - 3/(t + 5)

Algorithm 2.9 Newton's Method for Optimization
g = ∇f(x_{t-1})
H = ∇²f(x_{t-1})
x_t = x_{t-1} - H^{-1} g

Algorithm 2.10 BFGS Update
g_t = ∇f(x_t) - ∇f(x_{t-1})
Δx_t = x_t - x_{t-1}
ρ_t = 1 / (g_t^T Δx_t)
H^{-1}_{t+1} = (I - ρ_t Δx_t g_t^T) H^{-1}_t (I - ρ_t g_t Δx_t^T) + ρ_t Δx_t Δx_t^T

Algorithm 2.11 Broyden-Fletcher-Goldfarb-Shanno (BFGS) Minimization
min over H^{-1}_t of ||H^{-1}_t - H^{-1}_{t-1}||²
s.t. H^{-1}_t g_t = Δx_t, and H^{-1}_t is symmetric
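The classical momentum update of Algorithm 2.7 can be sketched as follows (ours; the quadratic example cost J(θ) = (θ - 3)² and the gain values are arbitrary assumptions):

```python
def momentum_descent(grad, theta0, alpha, gamma, steps):
    """Algorithm 2.7: v_t = gamma*v_{t-1} - alpha*grad(theta); theta_t = theta_{t-1} + v_t."""
    theta, v = theta0, 0.0
    for _ in range(steps):
        v = gamma * v - alpha * grad(theta)
        theta = theta + v
    return theta

# Minimize J(theta) = (theta - 3)^2, whose gradient is 2*(theta - 3);
# the velocity term accumulates speed along the persistent downhill direction.
theta = momentum_descent(lambda th: 2.0 * (th - 3.0),
                         theta0=0.0, alpha=0.05, gamma=0.9, steps=200)
print(round(theta, 3))  # converges to the minimizer theta = 3
```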


CHAPTER 3
BACKGROUND: MODEL REFERENCE ADAPTIVE CONTROL

Adaptive control was an early innovation in the development of flight control. Aircraft and other flight vehicles are often required to operate in dynamic flight envelopes that span vastly different speeds, altitudes, and dynamic pressures. In addition, the flight vehicle is subjected to numerous dynamic disturbances throughout its flight. In contrast to many robotic systems, flight controllers are often required to follow predetermined trajectories or guidance laws. Therefore, it is beneficial for the system to adhere to pre-defined transient performance metrics. For many types of control systems, these metrics are embedded in baseline controllers [16]. That is, baseline controllers are often required to possess certain performance and robustness properties for flight control applications. Often these controllers are designed using a gain-scheduled controller whose gains are adjusted based on the current operating condition of the flight vehicle. In order to determine controller gains across the flight envelope, the flight vehicle's model is linearized about selected trim points. At each trim point, a linear controller is developed based on transient and robustness criteria. The linear quadratic regulator (LQR) is a proven optimal control technique that gives the control designer independent variables (the Q and R matrices) to tune and tweak performance. The designer usually looks at metrics based on loop shaping, which is a mechanism for designing controllers in the frequency domain. These metrics include margins (gain and phase), singular values, rise time, and sensitivity functions (e.g., the gang of six) [71]. Gain-scheduled LQR controllers often remain robust to time- and state-dependent nonlinear uncertainties that enter through the control channel (matched). Unfortunately, in the presence of these uncertainties and significant nonlinearities, the baseline performance of the system is degraded [16]. In most modern control systems, this degradation is overcome by the use of adaptive controllers.
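As a toy illustration of the Q and R tuning knobs mentioned above (ours; the scalar plant ẋ = a x + b u and the weight values are arbitrary assumptions), the scalar algebraic Riccati equation 2ap - (b²/r)p² + q = 0 can be solved in closed form:

```python
import math

def scalar_lqr(a, b, q, r):
    """Positive root of the scalar Riccati equation, then the gain K = b*p/r."""
    p = r * (a + math.sqrt(a * a + q * b * b / r)) / (b * b)
    return b * p / r  # state feedback u = -K x

# Unstable scalar plant xdot = x + u; a heavier state penalty q yields a larger gain.
k_soft = scalar_lqr(a=1.0, b=1.0, q=1.0, r=1.0)
k_hard = scalar_lqr(a=1.0, b=1.0, q=10.0, r=1.0)
print(k_soft < k_hard and (1.0 - k_soft) < 0)  # closed-loop pole a - b*K is stable
```

Raising q relative to r penalizes state error more than control effort, which pulls the closed-loop pole further left at the price of larger commands; this is the trade the designer tunes at each trim point.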


Generally, adaptive controllers are designed by creating update laws based on the closed-loop errors of the system. Lyapunov-based stability analysis is used to make guarantees in terms of stability, boundedness of adaptive weights, and tracking convergence. Ubiquitous in the aerospace industry, a type of design called model reference adaptive control (MRAC) is used to improve the robust linear baseline controller by adding an adaptive layer. Most commonly, MRAC is implemented so that the adaptive portion of the control is only active when the baseline performance is degraded. Traditionally, there are two different types of MRAC: direct and indirect [19]. For direct adaptive control, controller gains are adapted directly to enforce the closed-loop tracking performance. For indirect adaptive control, the controllers are designed to estimate the unknown plant parameters on-line and then use their estimated values to calculate controller gains. As stated above, adaptive control is used to drive the defined system error to zero even when the parameters of a system vary. These parameters do not necessarily converge to their true values when the error is driven to zero. In order for convergence to occur, the persistence of excitation condition must be met. Stated succinctly, persistence of excitation requires the control input to exhibit a minimum level of spectral variability in order for the parameters to converge to their true values [18].

Traditionally, MRAC problems are conceived assuming the structure of the uncertainty is known and can be linearly parameterized using a collection of known nonlinear continuous functions (the regression matrix) and unknown parameters. For problems where the structure of the uncertainty is unknown, the universal approximation properties of neural networks can be exploited in adaptive controllers to mitigate the uncertainties of the system within certain tolerances over a compact set [16]. One of the most successful practical implementations of a neural network-based adaptive controller was on several variants of the Joint Direct Attack Munition (JDAM), developed by Boeing, which has had numerous successful flight tests [72]. This flight vehicle operates using a direct model reference adaptive control (MRAC) architecture


where robust modifications (e.g., sigma modification) are designed to keep the neural network weights within pre-specified bounds.

This chapter is dedicated to providing a background in model reference adaptive control (MRAC). Chapter 4 will develop a sparse neural network (SNN) architecture for MRAC that encourages long-term learning. Chapter 5 builds on Chapter 4 and provides a Lyapunov stability-based result built on an enforced dwell time condition.

3.1 Baseline Control of a Flight Vehicle

The majority of this section is formulated based on preliminary work published in [73]. The work is based on methodology and derivations in [16] applied to a high-speed flight vehicle. The approach is considered state-of-the-art in the flight control community and is the foundation for the MRAC-based research efforts in this dissertation. The output feedback nature of this section is ignored, as it is not the focus of the dissertation, but it is included in the paper cited above. The control design is based on linearized mathematical models of the aircraft dynamics. The baseline controller is designed to be robust to noise and disturbances. The robustness of the baseline design is augmented with an adaptive controller in the form of a model reference adaptive controller (MRAC).

3.1.1 Linearized Flight Dynamics Model

We assume the flight dynamics can be described by a set of ordinary differential equations

$\dot{x} = f(t, x), \quad x(t_0) = x_0, \quad t \geq t_0,$

which are composed of position, kinematic, translational, and rotational equations of motion (see, for example, Stevens and Lewis [74]).

We are interested in designing a baseline controller using a gain-scheduled linear quadratic regulator (LQR) approach. Hence, it is necessary to linearize the system at various flight conditions based on the modeled dynamics of the vehicle and derive linear


short-period plant matrices $A_p \in \mathbb{R}^{n_p \times n_p}$, $B_p \in \mathbb{R}^{n_p \times m}$, $C_p \in \mathbb{R}^{p \times n_p}$, and $D_p \in \mathbb{R}^{p \times m}$. The flight dynamics are numerically linearized with respect to the states and control inputs around each flight condition, which results in

$A_{p_{i,j}} = \left.\frac{\partial f_i}{\partial x_j}\right|_{x=x^*,\,u=u^*}, \quad B_{p_{i,k}} = \left.\frac{\partial f_i}{\partial u_k}\right|_{x=x^*,\,u=u^*}, \quad C_{p_{i,j}} = \left.\frac{\partial x_i}{\partial x_j}\right|_{x=x^*,\,u=u^*}, \quad D_{p_{i,k}} = \left.\frac{\partial x_i}{\partial u_k}\right|_{x=x^*,\,u=u^*}$

where $i$ and $j$ are indices of the state vector, trim conditions are denoted by asterisks as $x^*$ and $u^*$, and $k$ is the index of the control input. We can now write the augmented short-period linearized model as

$\begin{bmatrix} \dot{e}_I \\ \dot{x}_p \end{bmatrix} = \begin{bmatrix} 0 & C_{reg} \\ 0 & A_p \end{bmatrix}\begin{bmatrix} e_I \\ x_p \end{bmatrix} + \begin{bmatrix} 0 \\ B_p \end{bmatrix}u + \begin{bmatrix} -1 \\ 0 \end{bmatrix}y_{cmd}$

where $C_{reg} \in \mathbb{R}^{m \times n_p}$ selects the regulated state and $y_{cmd} \in \mathbb{R}^m$ is the bounded external command.

3.1.2 Baseline Controller

We can rewrite the model above to include the system uncertainties $\Lambda \in \mathbb{R}^{m \times m}$ and $f(x) \in \mathbb{R}^m$ as

$\begin{bmatrix} \dot{e}_I \\ \dot{x}_p \end{bmatrix} = \begin{bmatrix} 0 & C_{reg} \\ 0 & A_p \end{bmatrix}\begin{bmatrix} e_I \\ x_p \end{bmatrix} + \begin{bmatrix} D_{reg} \\ B_p \end{bmatrix}\Lambda\left(u + f(x)\right) + \begin{bmatrix} -1 \\ 0 \end{bmatrix}y_{cmd}$

where $y = C_{reg}x_p$ and the integral error is defined by $\dot{e}_I = y - y_{cmd}$. The goal is to design a control input, $u$, such that the system output $y$ tracks the command $y_{cmd}$. We can generalize this model as [16]

$\dot{x} = Ax + B\Lambda\left(u + f(x)\right) + B_{ref}y_{cmd}$

where optimal control can be used to produce an asymptotic constant-command-tracking proportional-plus-integral (PI) baseline controller. We start the baseline controller design


Figure 3-1. Standard LQR PI baseline controller architecture.

process by rewriting the equations above into servomechanism form, given by [75]

$\begin{bmatrix} \dot{e} \\ \ddot{x}_p \end{bmatrix} = \begin{bmatrix} 0 & C_c \\ 0 & A_p \end{bmatrix}\begin{bmatrix} e \\ \dot{x}_p \end{bmatrix} + \begin{bmatrix} D_c \\ B_p \end{bmatrix}\mu$

where $\mu = \dot{u}$, $e = y_c - r$ denotes the command tracking variable, and $r$ is the command input. We can write the servomechanism design model in the form

$\dot{z} = \tilde{A}z + \tilde{B}\mu$

where $z$, $\tilde{A}$, and $\tilde{B}$ correspond to the vectors and matrices above.

Optimal linear quadratic (LQ) methods enable a systematic approach to controller design. Linear quadratic regulation (LQR) can be applied in order to obtain a set of linear gains that minimize the quadratic cost defined by

$J = \int_{t_0}^{\infty}\left(z^T Q z + \mu^T R \mu\right)dt$

which is constrained by the design model above. Solving the algebraic Riccati equation (ARE), given by

$\tilde{A}^T P + P\tilde{A} + Q - P\tilde{B}R^{-1}\tilde{B}^T P = 0,$

obtains the set of gains, $K_c$, that are used to define the control law

$u(t) = -K_I\int_{t_0}^{t}e(\tau)\,d\tau - K_{x_p}x_p(t)$

where $K_c = [K_I, K_{x_p}]$. The resulting controller is shown in Figure 3-1.
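The ARE-based gain computation above can be sketched with SciPy; the two-state servomechanism model below (integral error plus one plant state) uses made-up numbers, not a real airframe linearization.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Hypothetical 2-state servomechanism design model (integral error + plant state);
# the numbers are illustrative, not from any real airframe.
A_tilde = np.array([[0.0, 1.0],
                    [0.0, -1.5]])
B_tilde = np.array([[0.0],
                    [2.0]])
Q = np.diag([10.0, 1.0])   # penalize integral error more heavily
R = np.array([[1.0]])

# Solve the ARE  A'P + PA + Q - P B R^-1 B' P = 0, then K_c = R^-1 B' P.
P = solve_continuous_are(A_tilde, B_tilde, Q, R)
K = np.linalg.solve(R, B_tilde.T @ P)        # K_c = [K_I, K_xp]

# The closed-loop matrix A - B K should be Hurwitz.
eigs = np.linalg.eigvals(A_tilde - B_tilde @ K)
```

Because $(\tilde{A}, \tilde{B})$ here is controllable and $Q$, $R$ are positive definite, the LQR closed-loop matrix is guaranteed to have all eigenvalues in the open left half-plane.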


3.1.3 Iterative Design Loop

In order to develop a baseline controller that is robust to disturbances and achieves a sufficient transient response, an iterative loop around the optimal controller is developed. Within this loop, the values within the $Q$ and $R$ matrices are tweaked to adjust performance. Specifically, increasing the penalty for integral error will decrease the rise time, and increasing the penalty on angular rate will decrease oscillations in the step response. At each iteration, the controller increases its gains while minimizing oscillations in the time domain. This loop iterates until the controller meets certain criteria. This includes evaluating the gains in a much more realistic model than the design model, one which includes sensors and actuators. If the design meets certain performance criteria (e.g., rise time, overshoot, undershoot, crossover frequency, gain and phase margins), then the gains are saved and the loop exits. Margins in the MIMO sense are related to the return difference (RD) and stability robustness (SR); see [16] for more details.

It is important to note that singular-value-based stability margins are always more conservative than the single-loop classical margins. It is possible, and can be insightful, to compute the single-input single-output (SISO) stability margins of each channel of a MIMO transfer function while all other channels are closed. This analysis can provide the control designer with some information on where the weaknesses of the design are [73].

Typically, linear control designers evaluate designs using sensitivity analysis. Robustness analysis requires acceptable levels at both the plant input and output. The frequency responses of MIMO transfer functions are computed at each input and output channel with all other channels remaining closed. This includes the gang of six transfer functions [71]. This gang is required to describe how the system reacts to disturbances, noise, and setpoint changes. The gang includes the return difference, stability robustness, sensitivity, and complementary sensitivity functions. Unlike MIMO systems, SISO systems have equivalent loop gains at the plant input and output, where $L$ denotes loop


gain. We define the sensitivity $S$, co-sensitivity $T$, return difference $RD$, and stability robustness $SR$ in the SISO case by

$S = (I + L)^{-1}, \quad T = L(I + L)^{-1}, \quad RD = I + L, \quad SR = I + L^{-1}$

where $I$ is the identity matrix of sufficient size. Generally, designers want $T(s) \approx 1$ at low frequencies for sufficient tracking performance and $T(s) \approx 0$ at high frequencies for sensor noise rejection. Since $T(s) + S(s) = 1$, designers correspondingly want $S(s) \approx 0$ at low frequencies for plant disturbance rejection and $S(s) \approx 1$ at high frequencies. Without proper analysis, designs may lead to process disturbances and sensor noise being amplified in the closed loop.

It is worth noting that most flight dynamics vary drastically depending on the current position, velocity, or orientation. In order to account for these changes while still using a linear controller, gain scheduling is performed [19]. Usually, different metrics are chosen for each trim-point design (e.g., rise time), where the designer attempts to identify trends based on the flight dynamics which enable a more automated design process over the operating regime.

3.2 Nonlinear and Adaptive Control of a Flight Vehicle

In order to illustrate the basis for direct adaptive control, a single-input single-output (SISO) example is shown below, drawn from [16, 76]. The method is then expanded to multi-input multi-output (MIMO) systems while incorporating possible system uncertainties.

3.2.1 Single-Input Single-Output (SISO) Adaptive Control

Consider a dynamical system of the form

$\dot{x}_p(t) = a_p x_p(t) + b_p u(t)$


$\dot{x}_m(t) = a_m x_m(t) + b_m r(t)$

where $a_p$ and $b_p$ are unknown scalar plant parameters, $u$ is the plant input provided by the controller, $r$ is the external command, and $x_p$ is the current value of the state. A reference model is described by the second equation, where $a_m$ and $b_m$ are known constants. The reference model represents the ideal closed-loop performance of the system under nominal conditions. Hence, $x_m$ represents the desired reference model state of the system, where we assume $a_m < 0$. It is necessary for the reference model to be stable in order to prove stability. The goal is to drive $x_p$ to $x_m$ while keeping all states and control inputs of the system bounded. We define the state tracking error and its derivative as

$e(t) = x_p(t) - x_m(t), \quad \dot{e}(t) = \dot{x}_p(t) - \dot{x}_m(t).$

A control solution to the SISO plant problem, with corresponding adaptive laws, takes the form

$u(t) = \hat{k}_p(t)x_p(t) + \hat{k}_r(t)r(t)$

where the existence of the ideal controller gains $k_p$ and $k_r$, given by

$k_p = \frac{a_m - a_p}{b_p}, \quad k_r = \frac{b_m}{b_p},$

is guaranteed for any controllable pair $(a_p, b_p)$. Using the ideal form of the controller gains, we can derive the closed-loop error dynamics in the form

$\dot{e} = a_m e(t) + b_p\tilde{k}_p x_p + b_p\tilde{k}_r r$

where $\tilde{k}_r = \hat{k}_r - k_r$ and $\tilde{k}_p = \hat{k}_p - k_p$.

Generally, the Lyapunov function is chosen to represent the total kinetic energy of all the errors in the system. The goal of this control design is to design a controller


such that the energy of the system is guaranteed to dissipate with time; that is, ideally the derivative of the Lyapunov function would be negative definite. The Lyapunov function candidate for our SISO problem is given by

$V(e, \tilde{k}_p, \tilde{k}_r) = \frac{e^2}{2} + |b_p|\left(\frac{\tilde{k}_p^2}{2\Gamma_p} + \frac{\tilde{k}_r^2}{2\Gamma_r}\right)$

where $|\cdot|$ denotes the absolute value.

Next, we design the adaptive update laws in the form

$\dot{\hat{k}}_p(t) = -\Gamma_p x_p\, e\, \mathrm{sgn}(b_p), \quad \dot{\hat{k}}_r(t) = -\Gamma_r r\, e\, \mathrm{sgn}(b_p)$

where $\Gamma_p, \Gamma_r$ are positive learning rates and the sign of $b_p$ is assumed to be known.

The time derivative of $V$ computed along the trajectories of the error dynamics results in

$\dot{V}(e, \tilde{k}_p, \tilde{k}_r) = e\dot{e} + |b_p|\left(\frac{\tilde{k}_p\dot{\hat{k}}_p}{\Gamma_p} + \frac{\tilde{k}_r\dot{\hat{k}}_r}{\Gamma_r}\right) = a_m e^2 + |b_p|\tilde{k}_p\left(\mathrm{sgn}(b_p)\,x_p e + \frac{\dot{\hat{k}}_p}{\Gamma_p}\right) + |b_p|\tilde{k}_r\left(\mathrm{sgn}(b_p)\,r e + \frac{\dot{\hat{k}}_r}{\Gamma_r}\right) = a_m e^2 \leq 0.$

Substituting the adaptive update laws into the time derivative of $V$ results in a negative semi-definite $\dot{V}$. We assume the reference input is bounded, which implies the control input is bounded. It follows that the states of the system and the adaptive gains $\hat{k}_p, \hat{k}_r$ are bounded. Using Barbalat's Lemma (see [77]), we can prove that the error between the plant states and the designed


reference model states is driven to zero:

$\lim_{t\to\infty} e(t) = 0.$

However, the adaptive parameters do not necessarily converge to their true unknown values unless the persistence of excitation criterion is met.

3.2.2 Direct Model Reference Adaptive Control (MRAC) with Uncertainties (MIMO)

We can extend the previous design to a MIMO system using the methodologies discussed in [16]. The goal is to design $u$ such that $y$ tracks $y_{cmd}$ while operating with the uncertainties $\Lambda$ and $f(x)$. The dynamical system and reference model can be written as

$\dot{x} = Ax + B\Lambda\left(u + f(x)\right), \quad \dot{x}_{ref} = A_{ref}x_{ref} + B_{ref}y_{cmd}$

where $A \in \mathbb{R}^{n\times n}$, $B \in \mathbb{R}^{n\times m}$, and $C \in \mathbb{R}^{p\times n}$ are known. The matrix $\Lambda \in \mathbb{R}^{m\times m}$ is a constant unknown diagonal matrix which we will refer to as the control effectiveness term. The matched uncertainty, $f(x)$, is an unknown continuously differentiable function which is often assumed to be linear in the parameters. In other words, $f(x)$ takes the form

$f(x) = \sum_{i=1}^{N}\theta_i\phi_i(x) = \Theta^T\Phi(x)$

where $\Phi(x) \in \mathbb{R}^N$ is the known regressor vector and the $\theta_i$ are unknown constants or slowly varying parameters. The goal is to find adaptive update laws that drive the plant state dynamics $x$ to the reference state dynamics $x_{ref}$.

Recently it has been shown that adding a Luenberger-like term to the typical open-loop reference model can decrease oscillations and increase transient performance. The closed-loop reference model takes the form [11]:

$\dot{x}_{ref} = A_{ref}x_{ref} + L_v\left(x - x_{ref}\right) + B_{ref}y_{cmd}, \quad y = Cx_{ref}$


where $A_{ref} = A - BK_{LQR}$ is designed to be Hurwitz and $K_{LQR}$ are baseline controller gains that were designed to meet certain performance and robustness criteria.

Consider a direct adaptive controller in the following form:

$u = \hat{k}_x^T x + \hat{k}_r^T y_{cmd} - \hat{\Theta}^T\Phi(x)$

where we can use the form of the controller, along with the known form of the dynamics and the reference model, to obtain the matching conditions for the model reference adaptive controller. The matching conditions are given by

$A + B\Lambda K_x^T = A_{ref}, \quad B\Lambda K_r^T = B_{ref}$

where, for MRAC systems, the ideal gains $K_x$ and $K_r$ must exist for the controller to remain stable. For aerospace systems, this is a typical assumption that is satisfied due to the known form of the system dynamics and the carefully chosen reference model.

After applying Lyapunov stability analysis to the previously mentioned system, we retrieve the following adaptive update laws:

$\dot{\hat{\Theta}} = \Gamma_\Theta\Phi(x)e^T P B, \quad \dot{\hat{k}}_x = -\Gamma_x x e^T P B, \quad \dot{\hat{k}}_r = -\Gamma_r y_{cmd}\,e^T P B$

where the update law for $\hat{\Theta}$ is a function of $P$, which is the unique symmetric positive-definite solution of the algebraic Lyapunov equation $PA + A^T P = -Q$, where $A$ is typically $A_{ref}$. This term appears because $e^T P e$ is a term that is included in the Lyapunov function.

3.2.3 Robust Adaptive Control Tools

Robust adaptive control provides additional tools to the designer which increase performance and stability in the presence of uncertainties. Unmatched uncertainties (e.g., environmental disturbances) are especially troublesome to adaptive control laws.
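The scalar laws of Section 3.2.1 make the preceding discussion concrete: the short forward-Euler simulation below drives the tracking error toward zero even though the gain estimates need not reach their ideal values. The plant, reference model, learning rates, and command are all illustrative choices.

```python
import numpy as np

# Scalar plant x_dot = a_p x + b_p u (parameters unknown to the controller)
# and stable reference model xm_dot = a_m xm + b_m r. All values illustrative.
a_p, b_p = 1.0, 3.0            # open-loop unstable plant
a_m, b_m = -4.0, 4.0
Gp, Gr = 10.0, 10.0            # learning rates Gamma_p, Gamma_r
dt = 1e-3

x = xm = kp = kr = 0.0
for i in range(int(20.0 / dt)):
    t = i * dt
    r = np.sign(np.sin(0.25 * t))          # slow square-wave command
    u = kp * x + kr * r
    e = x - xm
    # Adaptive laws: k_hat_dot = -Gamma * (regressor) * e * sgn(b_p)
    kp += dt * (-Gp * x * e * np.sign(b_p))
    kr += dt * (-Gr * r * e * np.sign(b_p))
    x  += dt * (a_p * x + b_p * u)
    xm += dt * (a_m * xm + b_m * r)
```

At the end of the run, $|x - x_m|$ is small, while $(\hat{k}_p, \hat{k}_r)$ may still differ from the ideal gains $(-5/3,\ 4/3)$ because the square-wave command is not guaranteed to be persistently exciting.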


For instance, estimated parameters can drift slowly due to non-parametric uncertainties when the persistence of excitation condition is not met [16]. This has been shown to lead to sudden divergence and failure.

In order to improve adaptive controllers in the presence of such uncertainties, Ioannou and Kokotovic [78] proposed the $\sigma$-modification, which combats this phenomenon by providing a damping term. Unfortunately, when errors are small, the damping term dominates the adaptive update law and causes the parameter estimates to drift. If persistence of excitation is not met, the errors will increase immediately. Note that the $\sigma$-modification-based adaptive law often takes the form

$\dot{\hat{\Theta}} = \Gamma\left(\Phi(x)e^T P B - \sigma\hat{\Theta}\right)$

where $\sigma$ is a positive constant.

Narendra and Annaswamy [79] proposed the $e$-modification, which mitigates this issue by adding the tracking error to the modification term:

$\dot{\hat{\Theta}} = \Gamma\left(\Phi(x)e^T P B - \sigma\left\|e^T P B\right\|\hat{\Theta}\right).$

Dead-zone is another robust adaptive control modification, one which keeps the parameters from drifting due to noise. Dead-zone performs the desired update unless the error falls below a certain threshold, in which case the adaptive parameters are frozen.

Unfortunately, the $\sigma$- and $e$-modifications can negatively impact tracking performance by slowing adaptation. The projection operator is an alternative method which bounds the adaptation parameters (gains) and protects against integrator windup while still allowing fast adaptation. We will utilize and describe the projection operator in more detail in Chapters 4 and 5.
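The $\sigma$- and $e$-modification laws above differ only in their damping coefficient, which the sketch below makes explicit; the dimensions and numerical values are illustrative.

```python
import numpy as np

def theta_dot(theta_hat, Phi, e, P, B, Gamma, sigma=0.0, mu=0.0):
    """Matrix adaptive law with optional damping modifications.

    Plain law:   Gamma @ (Phi e' P B)
    sigma-mod:   subtracts Gamma * sigma * theta_hat
    e-mod:       subtracts Gamma * mu * ||e' P B|| * theta_hat
    """
    ePB = e.T @ P @ B                          # 1 x m row
    damping = sigma + mu * np.linalg.norm(ePB)
    return Gamma @ (np.outer(Phi, ePB) - damping * theta_hat)

# Illustrative dimensions: n = 2 states, m = 1 input, N = 3 regressor terms.
rng = np.random.default_rng(0)
Phi = rng.standard_normal(3)
e = np.array([[0.2], [-0.1]])
P = np.eye(2)
B = np.array([[0.0], [1.0]])
Gamma = 5.0 * np.eye(3)
theta_hat = rng.standard_normal((3, 1))

plain = theta_dot(theta_hat, Phi, e, P, B, Gamma)
sigma_mod = theta_dot(theta_hat, Phi, e, P, B, Gamma, sigma=0.1)
```

Setting `sigma` and `mu` to zero recovers the plain law; the $e$-modification's damping scales with $\|e^T P B\|$, so it fades as the tracking error shrinks instead of dominating the update at small errors.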


3.2.4 Adaptive Augmentation-Based Controller

In the previous sections, linear optimal control theory was used to create designs at certain flight conditions; these designs determine the reference model toward which the adaptive error is driven. In this section, we show how the baseline controller can be augmented with an adaptive controller to form the overall controller.

The baseline design is the core of the robust and adaptive control framework. It defines the target performance of the controller, which was designed with built-in robustness attributes. Adaptive augmentation is needed to complement the baseline robustness properties. Additionally, the adaptive laws are capable of restoring baseline control performance as changes to the plant take place (i.e., matched uncertainties). For additional details on the derivation, see [16].

The adaptive portion of the controller is used to deal with the system's matched uncertainties. When the baseline controller is augmented with an adaptive component and no uncertainties are present, the adaptive controller's contribution to the overall control will be zero. This is simply because the controller is tracking the reference model perfectly.

For this section we assume the nonlinear plant is in a form similar to the model of Section 3.2.2. We re-write the dynamics as

$\dot{x} = Ax + B\Lambda\left(u + \Theta^T\Phi(x)\right) + B_{ref}y_{cmd}$

where $\Phi(x) \in \mathbb{R}^N$ is the known regressor vector with components $\phi_i(x)$ which are Lipschitz in $x$. We also note that the $\theta_i$ are unknown constants. Notice that the previous techniques could only compensate for linear dependence among unknown parameters (parametric uncertainty).

Neural networks can extend this design to functions that are nonlinear in the parameters. For example, $f(x) = W^T\sigma(V^T x) + \varepsilon$ represents a neural network with ideal weights $W$ and $V$ and reconstruction error $\varepsilon$. When the weights of the inner layer, $V$, are fixed, Lyapunov-based stability analysis results in the following adaptive update law for


the outer-layer weights $W$ [17]:

$\dot{\hat{W}} = \mathrm{Proj}\left(\Gamma_W\,\sigma(V^T x)\,e^T P B\right).$

The adaptive law is derived by assuming that the uncertainties can be approximated on a bounded closed set within a small tolerance using the universal approximation theorem. The universal approximation theorem holds only if the regressor $\sigma(V^T x)$ provides a basis for the system. Commonly, for aerospace applications and structured neural network approaches, radial basis functions are chosen to define the regressor.

Radial basis functions (RBFs) can be written in the following form:

$\phi_i(x, x_c) = e^{-\frac{\|x - x_c\|^2}{2w_i^2}}$

where $x_c \in \mathbb{R}$ defines the center of the RBF and $w_i \in \mathbb{R}^+$ denotes the width of the RBF. The number of RBFs and their spacing are chosen to cover the task space. In the case of flight control, the RBFs would fill the flight envelope. Typically, the weights of the neural network are updated based on the following update law,

$\dot{\hat{\Theta}} = \mathrm{Proj}\left(\Gamma\,\Phi(x)\,e^T P B\right),$

which was formulated using Lyapunov-based stability analysis, where $e = x - x_{ref}$. Notice the notation, where $\Theta$ replaces $W$ and $\Phi(x)$ replaces $\sigma(V^T x)$. The RBF-based adaptive controller is elaborated on in Chapter 4.

The single hidden layer (SHL) neural network is an alternative approach which adaptively updates the inner-layer weights along with the outer-layer weights:

$\dot{\hat{W}} = \Gamma_W\left(\hat{\sigma} - \hat{\sigma}'\hat{V}^T\xi\right)e^T P B, \quad \dot{\hat{V}} = \Gamma_V\,\xi\, e^T P B\,\hat{W}^T\hat{\sigma}'$


where $\xi = [x^T\ 1]^T$, $\hat{\sigma} = [\sigma^T\ 1]^T$, and $\hat{\sigma}'$ is the derivative of the sigmoid function. We will directly compare the performance of the RBF and SHL adaptive controllers in Chapter 4.

3.2.5 Structure of the Adaptive Controller

The final form of the controller uses any of the above neural network-based adaptive controller methodologies along with a baseline controller. In addition to the projection operator, we use dead-zone to combat noise. For a review of Lyapunov-based stability results see [76] and [77]. For this specific proof of stability see [16]. The overall control signal is determined from the sum of the adaptive and baseline components and is given by

$u = u_{bl} + u_{ad}$

where $u_{bl} = -K_{LQR}x$ and

$u_{ad} = -\hat{\Theta}^T\Phi(x) - \hat{K}_{BL}^T u_{bl},$

where the adaptive update laws are given by

$\dot{\hat{K}}_{BL} = \mathrm{Proj}\left(\Gamma_u\, u_{bl}\, e^T P B\right), \quad \dot{\hat{\Theta}} = \mathrm{Proj}\left(\Gamma_\Theta\,\Phi(x)\,e^T P B\right).$
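The projection operator itself is detailed in Chapters 4 and 5; as a rough illustration of what it accomplishes, the sketch below implements a hard norm-ball clamp. This is a simplified stand-in, not the smooth $\mathrm{Proj}(\cdot)$ used in the update laws above.

```python
import numpy as np

def proj_ball(theta_dot, theta_hat, theta_max):
    """Crude stand-in for the projection operator: once the estimate reaches
    the ball ||theta_hat|| <= theta_max, remove the outward (radial) part of
    any update that would push it further out. The smooth Proj(.) used in
    the text blends this transition instead of switching abruptly."""
    nrm = np.linalg.norm(theta_hat)
    outward = float(theta_hat @ theta_dot)
    if nrm >= theta_max and outward > 0.0:
        theta_dot = theta_dot - (outward / nrm**2) * theta_hat
    return theta_dot

# An outward-pointing update at the boundary is flattened onto the sphere's
# tangent plane, so ||theta_hat|| stops growing; inward updates pass through.
theta_hat = np.array([3.0, 4.0])     # norm 5, at the boundary
raw = np.array([1.0, 1.0])           # pushes outward (theta_hat . raw = 7 > 0)
clamped = proj_ball(raw, theta_hat, theta_max=5.0)
```

At the boundary the estimate can keep moving tangentially but its norm cannot grow, which is how windup of the adaptive gains is prevented without freezing adaptation entirely.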


CHAPTER 4
IMPROVING LEARNING OF MODEL REFERENCE ADAPTIVE CONTROLLERS: A SPARSE NEURAL NETWORK APPROACH

Over the last decade, neural network-based model reference adaptive controllers (MRAC) have become ubiquitous in the flight control community due to their ability to adapt quickly during complex maneuvers while undergoing significant uncertainties. Single hidden layer (SHL) and radial basis function (RBF) based neural networks are the most effective and common in adaptive control. Recent machine learning breakthroughs have shown the advantages of using sparse networks for learning-based identification tasks. We show that using sparse networks for adaptive control can reap similar benefits, including improved long-term learning and better tracking, while reducing the computational burden on the controller. Simulation results demonstrate the effectiveness of the proposed controller.

This chapter is based on the published paper [80].

4.1 Model Reference Adaptive Control Formulation

MRAC is an adaptive architecture used to ensure the plant states, $x$, successfully track the chosen reference model states, $x_{ref}$. The reference model is designed to specify the desired closed-loop tracking performance of the system under nominal conditions. We will augment a baseline controller with a neural network-based adaptive controller. The adaptive controller aims to restore degraded baseline closed-loop tracking performance in the presence of matched uncertainties [20]. More specifically, the goal is to design a control input, $u$, such that $y$ tracks $y_{cmd}$ while operating under the uncertainties $\Lambda$ and $f(x)$ [73]. The dynamical system has the form

$\dot{x} = Ax + B\Lambda\left(u + f(x)\right) + B_{ref}y_{cmd}, \quad y = Cx$

where $A \in \mathbb{R}^{n\times n}$, $B \in \mathbb{R}^{n\times m}$, $B_{ref} \in \mathbb{R}^{n\times m}$, and $C \in \mathbb{R}^{p\times n}$ are known matrices. The matrix $\Lambda \in \mathbb{R}^{m\times m}$ is an uncertainty in the form of a constant unknown diagonal matrix. The


diagonal elements of $\Lambda$ are assumed to be strictly positive. The matched uncertainty, $f(x) \in \mathbb{R}^m$, is an unknown continuously differentiable function. We assume an external bounded time-varying command $y_{cmd} \in \mathbb{R}^m$, where the $B_{ref}y_{cmd}$ term in the open-loop dynamics is included when the state vector is augmented with the integrated tracking error, i.e., $x = [e_I^T\ x_p^T]^T \in \mathbb{R}^n$, where $x_p$ represents the original plant states and $\dot{e}_I = y - y_{cmd}$. The reader is referred to Lavretsky [16] for a derivation of this model.

Consider a closed-loop reference model in the form

$\dot{x}_{ref} = A_{ref}x_{ref} + B_{ref}y_{cmd}, \quad y_{ref} = C_{ref}x_{ref}$

where $A_{ref} \in \mathbb{R}^{n\times n}$ and $C_{ref} \in \mathbb{R}^{p\times n}$ are known matrices. We assume the pair $(A_{ref}, B)$ is controllable and $A_{ref}$ is designed to be Hurwitz, where

$A_{ref} = A - BK_{LQR}^T.$

The overall controller takes the following form:

$u = u_{BL} + u_{AD}$

where $u_{AD}$ represents the adaptive control signal and $u_{BL}$ is designed using linear quadratic regulation (LQR) to obtain a proportional-integral (PI) baseline given by

$u_{BL} = -K_{LQR}x = -K_I e_I - K_p x_p,$

which takes into consideration rise time, undershoot, overshoot, and robustness metrics that are useful for applications such as flight control. Optimal linear quadratic (LQ) control methods provide a systematic approach to designing the baseline controller gains, $K_{LQR}$. A block diagram of the overall controller is shown in Figure 4-1.


Figure 4-1. Model reference adaptive control augmentation control architecture.

The state tracking error is defined as

$e = x - x_{ref},$

which is used in the Lyapunov analysis to prove a uniformly ultimately bounded result. By substituting the form of the baseline controller into the dynamics and utilizing the form of the state tracking error, the open-loop tracking error dynamics can be written as

$\dot{e} = A_{ref}e + B\Lambda\left(u_{AD} + f(x) + u_{BL} - \Lambda^{-1}u_{BL}\right)$

where the adaptive portion of the controller, $u_{AD}$, takes the following form:

$u_{AD} = -\hat{\Theta}^T\Phi(x) - \hat{K}_{BL}u_{BL}$

where the $\hat{K}_{BL}u_{BL}$ term accounts for the matched uncertainty $u_{BL} - \Lambda^{-1}u_{BL}$. For this chapter, we will focus entirely on the neural network contribution to the controller, i.e., $\hat{\Theta}^T\Phi(x)$.

For the structured neural network approach, the regression matrix, $\Phi(x)$, contains a predefined set of known nonlinear basis functions (e.g., radial basis functions). For the unstructured approach, $\Phi(x)$ is a vector that holds $N$ activation functions (e.g., sigmoid) along with the associated estimates of the inner-layer weights. For both approaches, $\hat{\Theta}$ is a matrix that contains the estimates of the outer-layer weights of the neural network.
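A structured regressor of the kind described above can be sketched in a few lines; the Gaussian form, the centers, and the angle-of-attack range below are illustrative placeholders, not design values from the text.

```python
import numpy as np

def rbf_regressor(x, centers, width):
    """Structured regressor Phi(x): N Gaussian RBFs plus a trailing 1 that
    pairs with the bias row of the outer-layer weight estimate."""
    phi = np.exp(-np.abs(x - centers) ** 2 / (2.0 * width ** 2))
    return np.append(phi, 1.0)              # shape (N + 1,)

# Centers uniformly spaced over a hypothetical -6..6 degree angle-of-attack range.
centers = np.linspace(-6.0, 6.0, 13)
Phi = rbf_regressor(0.0, centers, width=1.0)
print(Phi.shape)   # (14,)
```

For a scalar input sitting exactly on a center, that RBF evaluates to one and its neighbors fall off smoothly, so only a handful of entries of $\Phi(x)$ are significantly active at any operating point.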


For each neural network-based approach in Section 4.2, we employ robust adaptive control techniques to ensure boundedness of the weights without requiring the PE condition. The projection operator is one such technique; it bounds the adaptation parameters/gains and protects against integrator windup while still allowing fast adaptation [16]. Robust adaptive control provides additional tools to the designer which improve performance and stability in the presence of uncertainties. For instance, estimated parameters can drift slowly due to non-parametric uncertainties when the persistence of excitation condition is not met. This has been shown to lead to sudden divergence and failure in flight vehicles [16].

The update laws presented in Section 4.2 are stated as a function of $P$, which is the unique symmetric positive-definite solution of the algebraic Lyapunov equation

$PA_{ref} + A_{ref}^T P = -Q$

where $Q$ is a tunable positive definite matrix. That is, for any positive definite matrix $Q \in \mathbb{R}^{n\times n}$ there exists a positive definite solution $P \in \mathbb{R}^{n\times n}$ to the Lyapunov equation. Since $e^T P e$ is a term that is included in the Lyapunov function, this term is necessary for the adaptive update law.

Each adaptive update law in Section 4.2 also contains a tunable symmetric positive definite matrix, $\Gamma$, that contains the rates of adaptation. The larger its values, the faster the system changes the weights in order to adapt to the uncertainties in the system [16]. As mentioned in the introduction, the goal is to set the learning rates to a reasonable value to limit high actuator rates and avoid high-frequency unmodeled disturbances. For this chapter, we hold the learning rates constant.

Full stability proofs and analysis for the neural network-based MRAC schemes can be seen in Hovakimyan [14, 22] or Lewis [13, 17, 81]. For results on applications of the previously defined methods in flight control, see Shin [82], Lavretsky [16], McFarland [83], or Chowdhary [18, 84].
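The solution $P$ can be obtained numerically; the sketch below uses SciPy's continuous Lyapunov solver with an illustrative Hurwitz $A_{ref}$.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

# Illustrative Hurwitz A_ref (eigenvalues -1, -2); Q is the tunable
# positive definite design matrix.
A_ref = np.array([[0.0, 1.0],
                  [-2.0, -3.0]])
Q = np.eye(2)

# solve_continuous_lyapunov(a, q) solves a X + X a^H = q, so passing
# A_ref^T and -Q yields  P A_ref + A_ref^T P = -Q.
P = solve_continuous_lyapunov(A_ref.T, -Q)
```

Because $A_{ref}$ is Hurwitz and $Q \succ 0$, the returned $P$ is the unique symmetric positive definite solution, which is exactly what the $e^T P e$ term in the Lyapunov function requires.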


Figure 4-2. Example single hidden layer (SHL) neural network.

In the following sections, single hidden layer (SHL), radial basis function (RBF), and sparse neural network (SNN) approaches to neural network-based adaptive control are presented and analyzed.

4.2 Neural Network-Based Adaptive Control

The most popular neural network-based adaptive control approaches (SHL and RBF) are presented in this section, as well as the novel SNN approach.

A neural network can be described as an input-output map, $NN(x): \mathbb{R}^n \to \mathbb{R}^m$, composed of $N$ neurons, where $m$ is the number of output-layer neurons and $n$ is the number of input-layer neurons. A conventional feed-forward neural network takes the form

$NN(x) = W^T f\left(V^T x + b_V\right) + b_W$

where $x \in \mathbb{R}^n$ denotes the input vector and $W \in \mathbb{R}^{N\times m}$, $V \in \mathbb{R}^{n\times N}$, $b_V \in \mathbb{R}^N$, and $b_W \in \mathbb{R}^m$ represent the ideal weights and biases of the neural network. Each neuron is composed of an activation function (e.g., sigmoid, tanh, or linear rectifier), $f(x): \mathbb{R} \to \mathbb{R}$, which produces a single output based on a nonlinear transformation of the input. An example of a neural network can be seen in Figure 4-2, where each line connecting a


node has a single weight associated with it. The output of the first node of the hidden layer can be mathematically expressed as $N_1 = f(V_1 x + b_{V_1})$.

The objective of the neural network is to adjust the parameters $W$, $V$, $b_V$, and $b_W$ in order to approximate a smooth nonlinear function within specified thresholds. Neural networks satisfy the universal approximation theorem, which has been proved for sigmoidal and radial basis function networks [85]. The universal approximation theorem states that any smooth function, $f(x)$, can be approximated over a compact domain, $x \in X \subset \mathbb{R}^n$, by a single hidden layer neural network. It implies that, given a compact set, $X$, and any $\varepsilon^* > 0$, there exist neural network weights such that

$f(x) = W^T f\left(V^T x + b_V\right) + b_W + \varepsilon, \quad \|\varepsilon\| < \varepsilon^*$

where $\varepsilon$ is referred to as the reconstruction error. For structured neural networks, the reconstruction error bound, $\varepsilon^*$, can be made arbitrarily small by increasing the number of hidden-layer nodes. For the structured RBF network approach, this corresponds to increasing the number of radial basis functions.

For this chapter, in order to be concise and computationally efficient, we will redefine the neural network weight matrices $W$ and $V$ to include the bias terms $b_V$ and $b_W$ of the neural network [16]. That is, $V = [V^T\ b_V]^T \in \mathbb{R}^{(n+1)\times N}$ and $W = [W^T\ b_W]^T \in \mathbb{R}^{(N+1)\times m}$.

4.2.1 Radial Basis Function (RBF) Adaptive Control

The following equation defines a radial basis function (RBF):

$\phi(x, x_c) = e^{-\frac{\|x - x_c\|^2}{2\sigma^2}}$

where $x$ is the input of the activation function, $x_c$ is the center of the RBF, and $\sigma$ denotes the RBF width [16]. For adaptive control applications, a radial basis function is a popular choice of activation function for a structured neural network-based adaptive control approach in which the centers of each RBF are predefined to fill the


operational envelope. The adaptive controller takes the following form:

$u_{AD} = -\hat{\Theta}^T\Phi(x) = -\hat{W}^T\Phi(x)$

where $\Phi(x) = [\phi_1(x), \phi_2(x), \phi_3(x), \ldots, 1]^T \in \mathbb{R}^{N+1}$ is a vector of $N$ radial basis functions [84] with distinct fixed centers, and $\hat{W} \in \mathbb{R}^{(N+1)\times m}$ contains the outer-layer weight and bias estimates that are adjusted using the Lyapunov-based adaptive update law shown below.

In other words, structured neural network controllers have inner-layer weights that are fixed, and standard Lyapunov analysis provides the adaptive update law for the outer-layer weights, $\hat{W}$. When an RBF neural network-based controller is paired with a robust adaptive control tool such as the projection operator, it is well known that the following adaptive update law results in uniform ultimate boundedness of the tracking error:

$\dot{\hat{W}} = \mathrm{Proj}\left(\Gamma_W\,\Phi(x)\,e^T P B\right).$

The RBF-based approach leverages the universal approximation theorem for radial basis functions if the output of the radial basis function vector, $\Phi(x)$, provides a basis [13]. It is important to note that the persistence of excitation (PE) condition is still required to ensure that the adapted weights, $\hat{W}$, converge to their ideal weights, $W$.

An example of an RBF-based regressor used in a structured neural network-based adaptive control approach is seen in Figure 4-3, which is designed for a flight control system whose operating regime is known a priori (e.g., in Figure 4-3, between -6 and 6 degrees angle of attack). The elements in this design are considered to be parametric elements with centers and widths that do not change during operation. One obvious disadvantage is that the controller is semi-global in the sense that the representation of the input vector is significantly reduced outside of the designed envelope. As the width of the RBFs is increased, the hidden-layer sparsity is reduced, but the richness of the representation of the input is increased. A disadvantage of RBF-based


Figure 4-3. Example radial basis function (RBF) network distributed across angle of attack.

representations is that the richness and entanglement of each representation change based on the input. For instance, consider whether the input lies directly between two RBFs or directly on one RBF. Increasing the number of radial basis functions mitigates this problem.

4.2.2 Single Hidden Layer (SHL) Adaptive Control

We will use the following compact notation for the unstructured neural network-based adaptive controller:

$u_{AD} = -\hat{W}^T f\left(\hat{V}^T\xi\right)$

where $\xi = [x\ 1] \in \mathbb{R}^{n+1}$ is the input to the neural network.

For the SHL approach, the stability analysis holds for any chosen squashing function [14]. In this chapter, a sigmoidal activation function is chosen, defined as follows:

$\sigma(x) = \frac{1}{1 + e^{-x}}, \quad \sigma'(x) = \sigma(x)\left(1 - \sigma(x)\right)$

where $\sigma'(x)$ represents the derivative of the sigmoid function written in terms of itself.
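The self-referential form of the sigmoid derivative is easy to verify numerically; the check below compares it against a central finite difference at a few arbitrary points.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_prime(x):
    # Derivative "in terms of itself": sigma'(x) = sigma(x) * (1 - sigma(x)).
    s = sigmoid(x)
    return s * (1.0 - s)

# Spot-check the identity against a central finite difference.
x = np.array([-2.0, 0.0, 1.5])
h = 1e-6
fd = (sigmoid(x + h) - sigmoid(x - h)) / (2.0 * h)
```

Expressing the derivative through the activation's own output is what lets the SHL update laws reuse the forward-pass quantities $\hat{\sigma}$ and $\hat{\sigma}'$ without extra evaluations.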


The following update laws are derived from the Lyapunov stability analysis:
$$\dot{\hat{W}} = \mathrm{Proj}\left(\Gamma_W \left(\sigma(\hat{V}^T \bar{x}) - \sigma'(\hat{V}^T \bar{x})\,\hat{V}^T \bar{x}\right) e^T P B\right)$$
$$\dot{\hat{V}} = \mathrm{Proj}\left(\Gamma_V\, \bar{x}\, e^T P B\, \hat{W}^T \sigma'(\hat{V}^T \bar{x})\right)$$
Similar to the RBF-based approach, these update laws ensure uniform ultimate boundedness of the tracking error. In addition, the single hidden layer (SHL) based approach leverages the universal approximation theorem for sigmoidal functions. RBF-based adaptive controllers are often chosen over SHL networks for flight control applications due to their simplicity and ability to learn. This topic will be further explored in the results section of this chapter.

4.2.3 Sparse Neural Network (SNN) Adaptive Control

As mentioned in the introduction, the ideas in this chapter were inspired by the machine learning community. More specifically, we aim to exploit the benefits of distributed sparse representations of neural networks discovered in the deep learning literature. In contrast to dense representations, sparse representations result in the activation of only a small percentage of the neurons across a variety of input data. During optimization, this prevents the un-learning of weights for tasks with dissimilar input data. Densely distributed networks have the richest representations (i.e., highest precision), while sparsely distributed networks hold the enticing properties of rich representations and non-entangled data [58]. We will explore these trade-offs by creating an approach that segments the flight envelope and enables us to vary the number of neurons in the system.

For flight control problems, we are typically limited in the number of neurons (nodes) by onboard computational processing capabilities [21]. The proposed adaptive architecture is able to increase the total number of nodes by keeping a high percentage of these nodes inactive. We do this by segmenting the flight envelope into regions and distributing a certain number of nodes to each region. Compared to a


typical fully-connected neural network based control scheme, our approach allows for more nodes while reducing the computational burden on the processor by only using a small percentage of the nodes for control at each point in the operating envelope. This strategy allows the network to remember weights that were obtained while operating in each region and recall those weights when the region is revisited. The overall goals of this approach are to improve the long-term learning performance of the controller, especially for repeated flight maneuvers, while avoiding high-frequency oscillations and actuator overuse by operating with small to moderate learning rates.

For the sparse neural network (SNN) approach, we begin by creating a predefined $N_D$-dimensional grid that spans the flight envelope. We define a set of points, $P = \{p_1, \ldots, p_T\}$, and segments, $S = \{s_1, \ldots, s_T\}$, where each point, $p_{i \in I} \in P$, is uniformly spaced on that grid, and $T \in \mathbb{N}$ represents the total number of segments. We let $I = \{1, \ldots, T\}$ be the index set of the sets $S$ and $P$. We allow each dimension of the grid to be represented by a distinct flight condition (e.g., angle of attack, Mach, altitude). We can visualize this concept using a figure similar to a Voronoi diagram in Figures 4-4 and 4-5, where the Voronoi space (operating envelope) is divided into Voronoi regions (segments) [86]. Each segment, $s_{i \in I} \in S$, defines a convex polytope that surrounds each pre-defined point, $p_{i \in I}$, on the grid:
$$s_i = \{x_{op} \in X : D(x_{op}, p_i) \le D(x_{op}, p_j)\ \forall\, i \ne j\}$$
where $x_{op}$ is an arbitrary point in the Voronoi space and $(X, D)$ defines the metric space, in which $X \subseteq \mathbb{R}^{N_D}$ is the non-empty set composed of points that define inputs to a function $D: X \times X \to \mathbb{R}$. In our case, $X$ is comprised of all the possible points in the operating envelope. The function $D(\cdot, \cdot)$ calculates the Euclidean distance between two points. We define the total number of nodes available to the system as $N \in \mathbb{N}$, where the complete set of nodes is included in the set $Y = \{e_1, \ldots, e_N\}$ and $B = \{1, \ldots, N\}$ is the index set for the set $Y$. All nodes in $Y$ are then distributed equally amongst the segments, where we


define the number of nodes per segment as $Q \in \mathbb{N}$, where $Q = N/T$.

Figure 4-4. Adaptive sparse neural network (SNN) segmented flight envelope in one dimension.

We establish a set of nodes for each index, $i$, denoted by $E_{i \in I} = \{e_{Q(i-1)+1}, \ldots, e_{iQ}\}$, where the complete set of nodes can be stated as $Y = \bigcup_{i \in I} E_i$.

At each point in time, $t$, the SNN determines the set of nodes and associated weights, $(\hat{W}_i, \hat{V}_i)$, used in the adaptive controller and adaptive update laws based on the segment, $s_i$, in which the flight vehicle is currently operating. That is, every segment, $s_i$, has a pre-defined set of nodes that is active when operating in that region. The set of active nodes is denoted by $E_{A_i}$, where $\forall i \in I: E_i \subseteq E_{A_i}$. We assume the number of active nodes is restricted due to processing constraints and denoted $R \in \mathbb{N}$, where $R \ge Q$. Notice that if $R = Q = N$, then we are simply using the traditional SHL approach. The simplest implementation of the SNN is achieved by assuming that while operating in the segment $s_i$, only the nodes that were allocated to that segment are active (i.e., $\forall i \in I: E_{A_i} = E_i$). We refer to this extreme case as the pure sparse approach. Notice that for the pure sparse approach, nodes are only activated and updated by the adaptive update laws while operating within a specific region and cannot be used for control or modified outside that segment. For a blended approach where $R > Q$, the adaptive controller will use all nodes allocated to its current segment along with


additional nodes from nearby segments.

Figure 4-5. Adaptive sparse neural network (SNN) segmented flight envelope in A) two and B) three dimensions.

For this approach, the active node list for each segment $i$, $E_{A_{i \in I}}$, is determined by selecting the closest $R$ nodes to each segment's center point, $p_i$. In order to accomplish this task, we uniformly distribute $Q$ points within each segment and assign each node to a distinct point within its assigned region. Next, for each index $i \in I$, we compute the distance between the center point of that segment, $p_i$, and every node in the flight envelope. The indices of the closest $R$ nodes to each center point, $p_{i \in I}$, are then stored in the set $C_{i \in I}$ for later use. Since all the parameters of the SNN approach are pre-defined, the list of active indices is created before run-time. Algorithm 4.1 shows the step-by-step SNN approach for each time the controller is called.

Possible segmented flight spaces for the 1, 2, and 3-dimensional cases can be seen in Figure 4-4 and Figure 4-5. In each figure, an X represents the current operating point of the flight vehicle within the specified flight envelope. In the 1-D case, Figure 4-4, the neural network will determine the active nodes based on a single state, angle of attack (AoA), which is divided into 11 segments. The segments contained by dotted and


dashed white regions show possible active regions obtained by varying the number of shared nodes between segments in the blended approach.

Algorithm 4.1 Sparse Neural Network Execution
1: receive $x(t)$ and the corresponding location in the operating envelope, $x_{op}$
2: determine the index, $i \in I$, of the point $p_i$ in the set $P$ for which the distance between $x_{op}$ and $p_i$ is minimum (i.e., $\arg\min_{i \in I} D(x_{op}, p_i)$)
3: retrieve the set of indices, $C_i$, corresponding to the index $i$
4: form a single hidden layer neural network using the weights, $\hat{W}_i$ and $\hat{V}_i$, associated with the active nodes stored in $E_{A_i}$
5: use the neural network to form an adaptive control signal following (4–20)
6: create an overall control signal following (4–…)
7: apply the overall control signal to the system dynamics (4–…)
8: update the weights $(\hat{W}_i, \hat{V}_i)$ of each node in $E_{A_i}$ according to the adaptive update laws in (4–…) and (4–…)

An example of a typical neural network (RBF and SHL) used for adaptive control is shown in Figure 4-6. In RBF and SHL networks, full connectivity between nodes is assumed at all points in the flight envelope. In contrast, Figure 4-6 also shows a pure sparse network approach where the state domain is divided into five segments, with each segment containing exactly two nodes and no segment sharing any nodes with adjacent segments. For each type of network in Figure 4-6, segments are represented by rectangles while the active nodes for each segment are represented with distinct colors. The active nodes for segments 1-5 are highlighted in red, yellow, green, blue, and purple, respectively. We also include a blended network in the figure. This architecture has the same number of total nodes as the sparse approach, but segments now share one node with each neighboring segment. For instance, N0, N1, N2, N3 are all active when operating in the first segment (red) and N2, N3, N4, N5 are all active in the second segment (yellow). The node N4 is colored yellow and green to signify that the weights of that node are used in the adaptive controller when operating in segments 3 and 4. The sparsity of the network can be adjusted by varying the number of nodes that adjacent segments share, the total number of nodes in the network, and the total number of active nodes.
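The per-call procedure of Algorithm 4.1 can be condensed into a short sketch. The 1-D envelope, segment counts, and weight shapes below are illustrative; the active-index sets $C_i$ are pre-computed before run-time, exactly as described above, by taking the $R$ nodes closest to each center point:

```python
import numpy as np

T, Q, R = 5, 2, 4                         # segments, nodes per segment, active nodes
N = T * Q                                 # total nodes in the network
P = np.linspace(-6.0, 6.0, T)             # segment center points (1-D AoA envelope)
node_pos = np.linspace(-6.0, 6.0, N)      # nodes distributed across the envelope

# Pre-computed active-index sets: the R nodes closest to each center point p_i
C = [np.argsort(np.abs(node_pos - p))[:R] for p in P]

def snn_u_ad(x_op, x_bar, W_hat, V_hat):
    """Steps 1-5 of Algorithm 4.1: locate the segment, slice out the active
    nodes' weights, and form u_AD = -W_i^T sigma(V_i^T x_bar)."""
    i = int(np.argmin(np.abs(P - x_op)))          # step 2: closest center point
    act = C[i]                                    # step 3: pre-stored active indices
    s = 1.0 / (1.0 + np.exp(-(V_hat[:, act].T @ x_bar)))
    return -W_hat[act].T @ s, act

W_hat = np.zeros((N, 1))                  # outer-layer weights, one column per input
V_hat = 0.1 * np.ones((3, N))             # inner-layer weights for x_bar in R^3
u, act = snn_u_ad(-6.0, np.array([0.5, 0.1, 1.0]), W_hat, V_hat)
```

Only the $R$ active columns and rows enter the control computation and the subsequent weight updates, which is precisely the processing saving the text describes.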


Figure 4-6. Single hidden layer with A) typical connectivity, B) blended connectivity, and C) sparse connectivity.

Note that the discrete update for the selection of the nodes used for control causes switching in the closed-loop system. For every point in time, $t$, the SNN controller is operating within a single segment, $s_i$, using a pre-defined set of active nodes, $E_{A_{i \in I}}$, for control. Using compact neural network notation, the form of the adaptive controller while operating in the $i$th segment can be stated as
$$u_{AD} = -\hat{W}_i^T \sigma(\hat{V}_i^T \bar{x})$$
where $\hat{W}_i$ and $\hat{V}_i$ denote the estimates of the outer and inner layer weights updated by the specified adaptive update laws.

The adaptive update laws used in the $i$th interval were derived using Lyapunov stability analysis and are given by
$$\dot{\hat{W}}_i = \mathrm{Proj}\left(\Gamma_W \left(\sigma(\hat{V}_i^T \bar{x}) - \sigma'(\hat{V}_i^T \bar{x})\,\hat{V}_i^T \bar{x}\right) e^T P B\right)$$
$$\dot{\hat{V}}_i = \mathrm{Proj}\left(\Gamma_V\, \bar{x}\, e^T P B\, \hat{W}_i^T \sigma'(\hat{V}_i^T \bar{x})\right)$$
where the adaptive update laws ensure that the tracking error, $e$, and the SNN weight errors, $(\tilde{W}_i, \tilde{V}_i)$, remain bounded while operating in every segment, $\forall i \in I$.

The benefits of this approach are analyzed in Section 4.4.3.
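A forward-Euler discretization of the update laws above can be sketched as follows. This is a minimal illustration, not the dissertation's implementation: a crude norm-ball rescaling stands in for the full projection operator, and all sizes and gains are made up:

```python
import numpy as np

def proj_ball(theta, bound):
    """Crude stand-in for the projection operator: rescale back onto a norm ball."""
    n = np.linalg.norm(theta)
    return theta if n <= bound else theta * (bound / n)

def shl_update(W_hat, V_hat, x_bar, e, P, B, gW, gV, dt, bound=10.0):
    """One Euler step of the SHL update laws for one segment's weights."""
    s = 1.0 / (1.0 + np.exp(-(V_hat.T @ x_bar)))       # sigma(V^T x_bar)
    sp = np.diag(s * (1.0 - s))                         # sigma'(V^T x_bar)
    ePB = (e @ P @ B)[None, :]                          # row vector e^T P B
    dW = gW * (s[:, None] - sp @ (V_hat.T @ x_bar)[:, None]) @ ePB
    dV = gV * x_bar[:, None] @ ePB @ W_hat.T @ sp
    return proj_ball(W_hat + dt * dW, bound), proj_ball(V_hat + dt * dV, bound)

# One update step for a 3-state, 1-input example with 4 hidden nodes
W_hat, V_hat = np.zeros((4, 1)), 0.1 * np.ones((4, 4))
x_bar = np.array([0.1, 0.2, 0.3, 1.0])
e, Pm, B = np.array([0.0, 1.0, 0.0]), np.eye(3), np.array([[0.0], [1.0], [0.0]])
W2, V2 = shl_update(W_hat, V_hat, x_bar, e, Pm, B, gW=0.05, gV=0.025, dt=0.01)
```

Note that with $\hat{W}$ initialized to zero, the inner-layer increment vanishes on the first step, so learning initially proceeds through the outer layer alone.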


4.3 Nonlinear Flight Dynamics Based Simulation

In this section, we compare the performance of the neural network based MRAC controllers described previously. We are interested in a flight vehicle with complicated and dominant disturbances and uncertainties. A linear model of the short-period longitudinal dynamics was extracted from a generic flight vehicle at the specific flight condition stated in Table 4-1 and was implemented in the fashion described in Section 4.1. The extended open-loop dynamics can be stated as
$$\begin{pmatrix} \dot{e}_I \\ \dot{\alpha} \\ \dot{q} \end{pmatrix} = \begin{pmatrix} 0 & 1 & 0 \\ 0 & \frac{Z_\alpha}{V} & 1 + \frac{Z_q}{V} \\ 0 & M_\alpha & M_q \end{pmatrix} \begin{pmatrix} e_I \\ \alpha \\ q \end{pmatrix} + \begin{pmatrix} 0 \\ Z_{\delta_e} \\ M_{\delta_e} \end{pmatrix} \left(\Lambda\, \delta_e + f(x)\right) + \begin{pmatrix} -1 \\ 0 \\ 0 \end{pmatrix} y_{cmd}$$
where $\alpha$ is the angle of attack (AoA), $q$ is the pitch rate, $\delta_e$ is the control input (elevator deflection), $\Lambda$ is a positive constant set to reduce control effectiveness, $f(x)$ represents the matched uncertainty, and $Z_\alpha, Z_q, M_\alpha, M_q, Z_{\delta_e}, M_{\delta_e}$ are the flight vehicle's stability derivatives [16].

Before the neural network based adaptive controllers could be implemented, a well-designed LQR baseline controller with a servomechanism structure was created using the iterative loop described in Dickinson et al. [73]. The servomechanism structure allows for the addition of the integral error of AoA as a state, which provides asymptotic constant command tracking with desirable and predictable robustness properties [16]. The linear optimal controller was used in the model reference adaptive controller (MRAC) to form the closed-loop reference model. The combination of the adaptive controller and baseline controller was used to provide the elevator deflection command to the flight vehicle, as displayed in (4–…).

Table 4-1. Flight condition to analyze
Mach   Altitude (m)   AoA (deg)   AoS (deg)   q̄ (kPa)   Motor
0.7    7000           0           0           14.1       Off
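The extended system above is simple to assemble. The sketch below uses placeholder stability-derivative values (illustrative only, not the Table 4-1 vehicle) and checks two structural facts the design relies on: the servomechanism state adds a pure integrator (an open-loop eigenvalue at zero), and the pair $(A, B)$ must be controllable for the LQR/servo design to go through:

```python
import numpy as np

# Placeholder stability derivatives (illustrative values, not the Table 4-1 vehicle)
V, Z_a, Z_q, M_a, M_q = 300.0, -300.0, 0.0, -2.0, -1.0
Z_de, M_de = -0.1, -3.0

# Extended state [e_I, alpha, q]: AoA integral error prepended to the plant states
A = np.array([[0.0, 1.0,     0.0          ],
              [0.0, Z_a / V, 1.0 + Z_q / V],
              [0.0, M_a,     M_q          ]])
B = np.array([[0.0], [Z_de], [M_de]])
B_ref = np.array([[-1.0], [0.0], [0.0]])

# Controllability matrix [B, AB, A^2 B] for the 3-state extended system
ctrb = np.hstack([B, A @ B, A @ A @ B])
```

Because the first column of $A$ is zero, the integral-error state contributes an exact eigenvalue at the origin, which is what turns constant command-tracking error into a state the LQR can drive to zero.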


Figure 4-7. Baseline control A) transient performance when subjected to B) radial basis function matched uncertainty.

In order to provide a highly nonlinear and significant uncertainty that dominates the dynamics of the system, we created an uncertainty term, $f(x)$, that is the result of the summation of several radial basis functions centered at different angles of attack (AoA). To demonstrate the effectiveness of this disturbance, it was first tested against the well-tuned LQR baseline controller. The disturbances and tracking performance of the LQR baseline controller are shown in Figure 4-7. It is worth noting that the uncertainty which is essentially a control effectiveness term is quickly mitigated in each controller's approach. Hence, it is not analyzed in the results section.

For the simulation, each type of neural network based adaptive controller was tested against a repeated angle of attack maneuver. Each maneuver required the vehicle to increase and decrease its AoA in a sinusoidal fashion. This required the vehicle to spend similar amounts of time in each section of the flight envelope and provided a good learning comparison. In addition to the tracking performance, uncertainty estimates were computed based on the final weights obtained after finishing the simulation. Typically, these controllers are only interested in determining the uncertainty at the current operating point, but fixing the weights and sweeping the input across the whole flight envelope enables a comparison of the ability of the adaptive controllers to learn and remember estimates of uncertainty from previously visited regions of the


flight envelope. Each maneuver will be regarded as a pass in the results section of this chapter.

Figure 4-8. Single hidden layer transient analysis by varying A) the number of total nodes and B) the learning rate.

For all simulations, a constant $dt = 0.01$ sec was used along with a second-order integration method (AB-2). We found that the most effective initial weights of the neural network were similar to those stated in Lewis [13, 17]. That is, $\hat{V}(0)$ is initialized to small random numbers and each element in $\hat{W}(0)$ is initialized to zero.

4.4 Results

In order to understand the advantages of the sparse learning-based approach, SHL and RBF based approaches were implemented and analyzed for comparison using the disturbances generated in the previous section. For these experiments, the value of the constant learning rate, $\Gamma$, is denoted in the following figures as either S ($\Gamma_W = 0.025 I$, $\Gamma_V = 0.0125 I$), M ($\Gamma_W = 0.05 I$, $\Gamma_V = 0.025 I$), or L ($\Gamma_W = 0.10 I$, $\Gamma_V = 0.05 I$), where $I$ denotes an identity matrix of appropriate size. The total number of nodes in the networks tested varied from $N = 10$ to $100$ and is also identified in the subsequent figures.

4.4.1 Single Hidden Layer (SHL)

The results from implementing an SHL based MRAC using the form of the controller specified in (4–…) were consistent with results in the literature. For instance, increasing the


number of nodes in the network or increasing the learning rates greatly improves system tracking performance (see Figure 4-8).

Figure 4-9. Radial basis function (RBF) versus single hidden layer (SHL) analysis plots.

In most cases, the simulation showed that the SHL based adaptive controllers reacted faster than their structured neural network counterparts with the same number of active nodes and similar learning rates. Their disadvantage, however, is that they did not learn or remember the weights that they used for an angle of attack that was previously visited. Hence, traditional SHL adaptive controllers do not improve in tracking performance over multiple passes.

4.4.2 Radial Basis Function (RBF)

In this section, we consider the problem of using a set of radial basis functions to define the regression matrix for a structured neural network approach.

Similar to the SHL, the RBF based approach performed better as the number of nodes in the network increased. The larger learning rate drastically improved transient performance but seemed to have a detrimental effect on the uncertainty estimate for repeated flight maneuvers. This could be due to the fact that when the learning rate is increased, the weights with the lowest impact end up changing significantly. Using a small learning rate with several repeated passes over the same region resulted in the best uncertainty estimate. This point will be elaborated on in the successive section.
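All of the runs compared in this section were advanced with the second-order Adams-Bashforth (AB-2) scheme and $dt = 0.01$ sec noted in the setup. A minimal version is sketched below; seeding the method with one forward-Euler step is a common start-up choice that the text does not specify:

```python
import numpy as np

def integrate_ab2(f, x0, dt, n_steps):
    """Two-step Adams-Bashforth: x_{k+1} = x_k + dt*(1.5*f_k - 0.5*f_{k-1})."""
    x = np.asarray(x0, dtype=float)
    f_prev = f(x)
    x = x + dt * f_prev                   # single Euler step to start the method
    for _ in range(n_steps - 1):
        f_curr = f(x)
        x = x + dt * (1.5 * f_curr - 0.5 * f_prev)
        f_prev = f_curr
    return x

# Second-order accuracy check on xdot = -x over 1 s with the chapter's dt = 0.01 s
x1 = integrate_ab2(lambda x: -x, [1.0], 0.01, 100)
```

Being an explicit multistep method, AB-2 gives second-order accuracy at the cost of a single dynamics evaluation per step, which matters when the dynamics call includes a neural network evaluation.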


Figure 4-10. Radial basis function (RBF) transient analysis by varying A) the number of nodes and B) the learning rate.

The uncertainty estimate at any given alpha is based on the weights obtained at the end of the simulation (i.e., at approximately -3 degrees alpha) and is displayed in Figure 4-9. As mentioned previously, SHL control schemes adapt well for flight control systems but do not possess the architecture to remember weights or estimates of uncertainty for previously visited regions of the flight regime. In contrast, RBF based control schemes do hold the valuable ability to keep reasonable estimates of the uncertainty for the different sectors of the state vector around the neighborhood of the estimate, even when operating outside of that region. Unfortunately, it is difficult to determine the ideal placement of the RBF centers, and there exists a clear trade-off between increasing the size of the RBF widths for transient improvement or decreasing the widths for better learning capabilities. This result is demonstrated more clearly in Figure 4-10, where norm(e) is defined as $||e||_2$, with $e = x - x_{ref}$, and ebar is defined by the equation $\bar{e} = e^T P B$. This gives a different perspective on the simulation results shown previously. It clearly shows how the RBF controller improves with repeated maneuvers, while the SHL has consistent performance on each pass.

4.4.3 Sparse Neural Network (SNN)

Following the approach presented in Section 4.2.3, several different sparse networks were designed and tested using the same dynamics and disturbances described


in the previous sections.

Figure 4-11. Sparse neural network matched uncertainty comparison by varying A) the learning rate and B) the number of shared nodes.

Since the disturbance, $f(x)$, was designed based on a single input variable, $\alpha$, only the 1-D SNN architecture was employed for the simulation results. That is, each SNN partitioned the flight envelope, which is represented only by $\alpha$ in the 1-D case, into $T$ segments. Each sparse network was created with a different amount of sparsity while holding the total number of segments and the number of active nodes constant. As stated previously, the sparseness of each network is determined by the ratio of the number of active nodes, $R$, to the total number of nodes in the network, $N$. In order to compare the new sparse architecture results with the RBF and SHL results, we set the number of active nodes to be equivalent (i.e., $R = 9$) in each case. In other words, the processing load was held constant. We describe the parameters of each SNN architecture in the legend of each figure in the following format: total number of nodes ($N$) - number of active nodes ($R$) - number of nodes per segment ($Q$) - learning rates ($\Gamma$) - total number of segments in the flight envelope ($T$), i.e., $N$-$R$-$Q$-$\Gamma$-$T$.

After running numerous simulations, it became apparent that the SNN architecture is superior to the legacy architectures in many ways. For instance, Figure 4-11 shows the estimates of the uncertainty after the run is complete. Clearly, this is a much better estimate than that obtained with the RBF controller using the same learning rates.
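A disturbance of the kind being estimated here, a sum of radial basis functions centered at several angles of attack (Section 4.3), can be generated as follows. The centers, weights, and widths are illustrative stand-ins, not the dissertation's exact values:

```python
import numpy as np

f_centers = np.array([-4.0, -1.0, 2.0, 5.0])   # bump centers in deg AoA (made up)
f_weights = np.array([ 1.5, -2.0, 1.0, -0.5])  # bump amplitudes (made up)
f_width = 1.2

def f_unc(alpha):
    """Matched uncertainty f(alpha) = sum_k w_k exp(-(alpha - c_k)^2 / (2 s^2))."""
    return float(np.sum(f_weights * np.exp(-(alpha - f_centers) ** 2
                                           / (2.0 * f_width ** 2))))
```

Sweeping `f_unc` over a grid of $\alpha$ values with frozen network weights is exactly how the uncertainty-estimate comparisons in Figure 4-11 are produced.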


Figure 4-12. Sparse neural network transient analysis by A) sinusoidal commands with B) error comparison results.

Comparing Figure 4-12 and Figure 4-13 reveals the improved tracking performance of the sparse neural network (SNN) controller, especially after repeating maneuvers.

Results from Figure 4-13 show that an increased learning rate has beneficial effects in terms of tracking and uncertainty estimates. The most sparse SNN controller (…-9-9-M-91), which dedicated nine nodes to each segment and did not share nodes with adjacent segments, resulted in the best uncertainty estimate but performed poorly in the initial stages of the runs (e.g., Figure 4-12) and had the worst overall tracking performance compared to the rest of the SNN controllers. The blended SNN (…-9-1-M-91) that acquired almost all (i.e., ~88.8%) of its active nodes from other segments had the worst uncertainty estimate but performed better in the initial stages of the runs than the other controllers. The best overall SNN controller took the best aspects of each extreme. For example, the blended controller (i.e., 459-9-5-M-91), which acquired exactly four of its active nodes from adjacent segments, had the best overall tracking performance and a terrific memory of uncertainty estimates. Taking a closer look at this SNN's architecture reveals why it was successful. Firstly, if the controller entered an adjacent segment (e.g., higher angle of attack), four out of the nine active nodes for the new segment had already been updated while operating in the previous segment, which allowed it to perform better than having randomly initialized nodes. The other five nodes


that were not acquired from the previous segment will be updated while operating in the current segment and will be available for other adjacent segments to use.

Figure 4-13. Learning rate comparison using A) sinusoidal commands with B) transient results.

Notice that one node allocated to each segment is only active for one segment, while the other four are active for exactly two adjacent segments. This allows this SNN (…-9-5-M-91) to retain better uncertainty estimates than the SNN (…-9-1-M-91), whose nodes span several segments.

In order to quantifiably demonstrate the superior performance of the sparse neural network over the traditional architectures, the following tables were constructed based on the data obtained previously. Rather than comparing every controller previously tested, we selected the best controller from each category with moderate learning rates that used only 9 active nodes, i.e., the RBF controller (RBF-9-M), the SHL controller (SHL-9-M), and the SNN controller (459-9-5-M-91). Table 4-2 compares the norm of the error in tracking each maneuver using the following equation:
$$e_{TE} = \sum_{t=0}^{T_F} ||e(t)||_2$$
where $T_F$ is defined as the final time of each pass and $e(t) = x(t) - x_{ref}(t)$.
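The per-pass tracking metric above, and the summed squared estimation error used for the uncertainty comparison in Table 4-3, reduce to two short functions over logged histories (the array layout, one row per time step, is an assumption of this sketch):

```python
import numpy as np

def tracking_error(x_hist, x_ref_hist):
    """e_TE = sum_t ||x(t) - x_ref(t)||_2 accumulated over one pass."""
    return float(np.sum(np.linalg.norm(x_hist - x_ref_hist, axis=1)))

def estimation_sse(f_true, f_hat):
    """Summed squared estimation error over a sweep of the flight envelope."""
    return float(np.sum((np.asarray(f_true) - np.asarray(f_hat)) ** 2))
```

Note that $e_{TE}$ sums the 2-norm itself rather than its square, so a single large excursion is penalized linearly rather than quadratically.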


Table 4-2. Tracking error comparison table of RBF, SHL, and SNN
Tracking Error   Pass 1   Pass 2   Pass 3   Pass 4   Pass 5   Pass 6
RBF              88.99    75.55    65.24    64.69    62.32    61.11
SHL              92.68    88.81    85.25    89.21    83.82    85.26
SNN              89.26    42.07    29.84    25.17    24.74    23.98

Table 4-3 shows an uncertainty estimation comparison using the summation of the squared estimation error (SSE) over the whole flight envelope, as seen in
$$e_{UE} = \sum \left(f - \hat{f}\right)^2$$
where $f$ is the true uncertainty in the system and $\hat{f}$ is the estimated uncertainty computed using the frozen weights obtained at the end of the simulation.

Clearly, the SNN is competitive in the first pass and dominates in performance in the following 5 passes. The SHL does not change in performance with each pass. The SNN has a superior uncertainty estimate, which demonstrates that it does a better job of learning state-dependent matched uncertainties than the other architectures.

Table 4-3. Estimation comparison table of RBF, SHL, and SNN
Estimation Error   Total SSE
RBF                4.90
SHL                29.89
SNN                0.11

4.5 Summary

Traditional neural network based adaptive controllers (RBF and SHL) update their weight estimates based solely on the current state vector as input and utilize all nodes for control during every portion of the flight envelope. This leads to poor long-term learning and only slight improvements in transient performance when tracking a repeated command sequence or flight maneuver.

Sparse adaptive controllers only update a small portion of the neurons at each point in the flight envelope, which results in the SNN's ability to remember estimates and weights from previously visited sectors. The blended SNN also has the ability to share nodes


with other segments, which improves the tracking performance of initial passes and smooths transitions between segments. The SNN has superior performance in terms of tracking and estimating uncertainties compared to the single hidden layer (SHL) and radial basis function (RBF) systems on tasks that have consistent uncertainties and disturbances over regions of the flight envelope. Sparse neural networks have the added advantage of requiring only a small number of computations at each point in time due to the high percentage of inactive neurons.


CHAPTER 5
A SPARSE NEURAL NETWORK APPROACH TO MODEL REFERENCE ADAPTIVE CONTROL WITH HYPERSONIC FLIGHT APPLICATIONS

In this chapter, we ensure stability of the closed-loop system using a dwell time condition requirement and demonstrate the sparse neural network (SNN) adaptive control capabilities on a hypersonic flight vehicle (HSV). In addition, we develop a number of improvements to the SNN control scheme, including an adaptive control term used to counteract control degradation. We also explore including higher order Taylor series expansion terms in our adaptive error derivation for additional benefits. Hypersonic control simulation results are used to properly compare the SNN to the more conventional single hidden layer (SHL) approach.

5.1 Augmented Model Reference Adaptive Control Formulation

We start by considering a class of $n$-dimensional multiple input multiple output (MIMO) system dynamics with $m$ inputs in the form:
$$\dot{x} = Ax + B\Lambda\left(u + f(x)\right) + B_{ref}\, y_{cmd}$$
$$y = Cx$$
where $A \in \mathbb{R}^{n \times n}$, $B \in \mathbb{R}^{n \times m}$, and $C \in \mathbb{R}^{p \times n}$ are known matrices. We assume an external bounded time-varying command $y_{cmd} \in \mathbb{R}^m$. The control effectiveness matrix $\Lambda \in \mathbb{R}^{m \times m}$ is an unknown constant diagonal matrix with uncertain diagonal elements, denoted $\lambda_i$, which are assumed to be strictly positive. We assume the control effectiveness matrix $\Lambda$ can be upper bounded by a constant term, which we denote $\bar{\Lambda}$. The matched uncertainty in the system is represented by the continuously differentiable function $f(x) \in \mathbb{R}^m$. We define the state tracking error as $e = x - x_{ref}$ and the output tracking error as $e_y = y - y_{cmd}$. The system state vector, $x$, which appears in (5–…), contains the traditional plant state vector, $x_p$, along with the integral tracking error as a system state, i.e., $x = [e_I^T, x_p^T]^T \in \mathbb{R}^n$. The inclusion of this augmented state vector


creates the $B_{ref}\, y_{cmd}$ term in the extended open-loop dynamics (5–…). See [16] for more details and derivations of this model.

We now discuss the reference model for the MRAC architecture, which defines the ideal closed-loop behavior of the system under nominal conditions. To begin the design, we remove the uncertainties from the system dynamics stated in (5–…). This results in the following ideal extended open-loop dynamics:
$$\dot{x} = Ax + Bu + B_{ref}\, y_{cmd}$$
$$y = Cx.$$
Next, we assume the baseline controller takes the following form:
$$u_{BL} = -K_{LQR}^T x = -K_I^T e_I - K_P^T x_p$$
where $K_{LQR}$ are fixed baseline controller gains, typically designed using systematic optimal linear quadratic (LQ) control methods. The systematic approach to designing LQ gains allows the control designer to take into consideration robustness metrics (e.g., margins, singular values, and loop shaping) and performance metrics (e.g., rise time, overshoot, and undershoot) while utilizing traditional analysis tools (e.g., root locus, Bode plots). For this paper, we assume the baseline controller is in proportional-integral (PI) form, with the gains $K_{LQR}$ consisting of integral ($K_I$) and proportional ($K_P$) gains.

By substituting the form of the baseline controller defined in (5–…) into the ideal extended open-loop dynamics shown in (5–…), we derive the form of the closed-loop reference model dynamics as
$$\dot{x}_{ref} = A_{ref}\, x_{ref} + B_{ref}\, y_{cmd}$$
$$y_{ref} = C_{ref}\, x_{ref}$$
where we assume the pair $(A, B)$ is controllable, $C_{ref} = C \in \mathbb{R}^{p \times n}$ is known, and $A_{ref} = A - B K_{LQR}^T \in \mathbb{R}^{n \times n}$ is designed to be Hurwitz; see (5–…). Note that the proper


design of the baseline controller gains in (5–…) results in a robust linear controller with ideal transient characteristics.

The model matching conditions for the previously defined MRAC system can be described as follows. Given a constant unknown positive definite matrix $\Lambda$ and a Hurwitz matrix $A_{ref}$, there exists a constant unknown gain matrix $K_{BL}$ such that
$$A_{ref} = A - B \Lambda K_{BL}^T.$$
In order to satisfy both the model matching conditions and the closed-loop reference model dynamics in (5–…), we define the ideal gain matrix, $K_{BL}$, as
$$K_{BL} = K_{LQR}\, \Lambda^{-1}$$
where $K_{BL}$ has been proven to exist for any nonsingular matrix $\Lambda$ and controllable pair $(A, B)$ [16].

The overall MRAC design goal is to achieve reasonable bounded tracking of the external time-varying command $y_{cmd}$ and reference states $x_{ref}$ in the presence of the nonlinear matched uncertainty $f(x)$ and the control effectiveness term $\Lambda$. However, since the baseline controller is a fixed-gain controller, unexpected changes in flight dynamics and unmodeled effects create uncertainties which degrade the performance of the controller. The role of the adaptive controller in the MRAC architecture is to reduce, or ideally cancel, the effect that the uncertainties have on the overall dynamics of the system and restore baseline tracking performance. That is, for any bounded time-varying command $y_{cmd}$, the adaptive controller is designed to drive the selected states $x$ to track the reference model states $x_{ref}$ within bounds while keeping the remaining signals bounded. This goal is achieved through an incremental adaptive control architecture in the following form:
$$u = u_{BL} + u_{AD} + u_{RB}$$


where $u_{AD}$ represents the adaptive control signal, $u_{BL}$ is the optimally designed proportional-integral (PI) baseline controller, and $u_{RB}$ is the robust term used to ensure quick convergence to a specified error region for safe switching. The subsequent sections will provide architectural details of the adaptive controller along with Lyapunov-based stability analysis results.

We now state a common Lyapunov-like theorem that will be used to verify uniform and ultimate boundedness of the previously defined system and control scheme. It is worth noting that many of the symbols for the variables in the theorem were chosen to match those in the stability analysis.

Theorem 5.1. Let $\alpha_1$ and $\alpha_2$ be class $\mathcal{K}$ functions, $V$ be a continuously differentiable function, and $X \subset \mathbb{R}^n$ be a domain that contains the origin and on which $V$ is defined. Consider a system in the form:
$$\dot{\xi} = f(t, \xi)$$
where $f$ is piecewise continuous in $t$ and locally Lipschitz in the associated state vector $\xi \in X$. Suppose that
$$\alpha_1(||\xi||) \le V(t, \xi) \le \alpha_2(||\xi||)$$
$$\frac{\partial V}{\partial t} + \frac{\partial V}{\partial \xi} f(t, \xi) \le -W_3(\xi), \quad \forall\, ||\xi|| \ge r_1 > 0$$
for all $t \ge 0$, where $W_3$ is a continuous positive definite function. Suppose further that $r_1 < \alpha_2^{-1}(\alpha_1(r))$, where $r > 0$. Then, for every initial state $\xi(t_0)$ satisfying $||\xi(t_0)|| \le \alpha_2^{-1}(\alpha_1(r))$, there exists a $T \ge 0$ such that the solution of (5–…) satisfies
$$||\xi(t)|| \le \alpha_1^{-1}(\alpha_2(r_1)) = r_2, \quad \forall\, t \ge t_0 + T$$


where we refer to the ultimate bound as $r_2$.

Proof. See [16, 77, 82].

5.2 Sparse Neural Network Architecture

The sparse neural network (SNN) architecture is created by segmenting the flight envelope into regions, where each region is assigned a certain number of neurons with associated neural network (NN) weights based on user selections. During operation within a region, the adaptive controller only utilizes and updates weights belonging to neurons that are active in that region, while the remaining regions and neuron weights are frozen. The following sections aim to expound on the SNN architecture developed in Chapter 4 by providing a detailed Lyapunov stability analysis with simulation results.

Notice that the traditional single hidden layer (SHL) neural network approach can be viewed as a special case of the more extensive SNN concept in which the entire operating domain is considered one passive segment and the adaptive update laws are established using a first-order Taylor series expansion of the matched uncertainty.

5.2.1 Sparse Neural Network Control Concept

In order to introduce the SNN concept, we create a pre-defined $N_D$-dimensional grid that spans the flight envelope. The user selects the dimensions of the grid based on known operating conditions (e.g., velocity, Mach, and altitude). Next, we define a set of segments $S = \{s_1, \ldots, s_T\}$ with center points $P = \{p_1, \ldots, p_T\}$, where $T \in \mathbb{N}$ denotes the total number of segments. We let $I = \{1, \ldots, T\}$ be the index set of the sets $S$ and $P$. Similar to the grid, the center points are provided by the user, and they are not subject to any spacing requirements. For best results, the center points should be chosen to be dense in regions with significantly varying dynamics or regions which contain significant or unknown uncertainties (e.g., high angle of attack). We define the total number of nodes in the neural network as $N \in \mathbb{N}$, where every node is assigned to a specific segment. We define the number of nodes per segment as $Q \in \mathbb{N}$, where $Q = N/T$, and $E_{i \in I}$ is the set of nodes allocated to segment $s_i$. Within each segment, the


user determines the spacing of the nodes (e.g., uniformly distributed), where every node is allocated to a specific position within its assigned segment. Finally, we define the set of active nodes for each segment $s_i$ as $E_{A_i}$.

Figure 5-1. Sparse neural network segmented flight envelope for hypersonic control in A) two and B) three dimensions.

We use a Voronoi diagram to create the $T$ segments using convex polygons, where we denote the index of segment $s_i$ as $i \in I$ [86]. In a Voronoi diagram, each segment is defined by a region surrounding a center point which encloses all points in the $N_D$-dimensional space that are closer to that center point than to any other center point. The Voronoi diagram also allows for an efficient and simple way to partition the flight envelope into segments while still allowing flexibility in selecting the locations of the center points. In addition, by using the nearest neighbor graph generated from Delaunay triangulation, we can efficiently locate the region in which the current operating point lies within the flight space during flight time.

Consider the examples of static segmented flight envelopes in the $N_D = 2$ and $N_D = 3$ dimensional cases. Voronoi and Delaunay diagrams for a 2-D SNN flight controller utilizing angle of attack ($\alpha$) and altitude as operating conditions are shown in Figure 5-1 and Figure 5-2. For the 3-D case, the geometric diagrams are shown in Figure 5-1 and Figure 5-2. The chosen center points for each segment are labeled for the 2-D case. For each case, the neural network will determine the set of active nodes


$E_{A_i}$ used in the adaptive controller based on the index $i$ of the current segment $s_i$ of operation. Each enclosed colored region of the Voronoi diagrams in Figure 5-1 represents the domain of a single segment $s_i$.

Figure 5-2. Delaunay diagrams for sparse neural network hypersonic control in A) two and B) three dimensions.

In terms of implementation, consider the feed-forward neural network diagrams in Figure 5-3. Each diagram contains an arbitrary number (e.g., $N = 10$) of nodes, where the nodes are denoted by N0-N9. Each color indicates a different segment number. All nodes inside a colored segment belong to that segment. For instance, nodes N2 and N3 are allocated to the segment represented by the color yellow. An example of the traditional fully connected neural network architecture used for SHL and RBF adaptive control schemes is also shown in Figure 5-3, where all nodes belong to the same segment. An example of the SNN architecture is shown in Figure 5-3, where there are $N = 10$ nodes with $T = 5$ segments and $Q = 2$ nodes per segment.

For the sake of brevity, the reader is referred to Chapter 4 for a more detailed treatment of the sparse neural network (SNN) adaptive control concept, including its mathematical notation and original implementation results. The next subsection describes the procedure by which the SNN adaptive controller executes during flight.
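The segment-selection rule of this subsection is simply a nearest-center classification over user-chosen (not necessarily uniformly spaced) center points. A 2-D sketch over (AoA, altitude), with made-up centers placed more densely at high angle of attack as the text suggests, is:

```python
import numpy as np

# User-chosen center points over (AoA [deg], altitude [km]); denser at high AoA,
# as suggested for regions with rapidly varying dynamics (values are made up)
P = np.array([[-5.0, 10.0], [0.0, 10.0], [4.0, 10.0],
              [6.0, 10.0], [8.0, 10.0], [10.0, 10.0]])

def segment_index(x_op):
    """Voronoi membership: the segment whose center point is closest to x_op.
    (In practice each axis would be normalized so the units are comparable.)"""
    return int(np.argmin(np.linalg.norm(P - x_op, axis=1)))
```

Because Voronoi membership is equivalent to this argmin, no explicit polygon geometry is needed at run-time; the Delaunay neighbor graph only accelerates the search, as described in the next subsection.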


Figure 5-3. Neural network controller for hypersonic control with A) full connectivity and B) sparse connectivity.

5.2.2 Sparse Neural Network Algorithm

At each point in time, t, the SNN determines the set of active nodes E_A_i, with associated weights (Ŵ_i, V̂_i), to be used in the adaptive controller and adaptive update laws based on the segment number, i, in which the flight vehicle is currently operating. Rather than finding the index of the current segment number by brute force, we instead use the previous operating segment index and the nearest neighbor graph, generated from the Delaunay diagram, to determine the current segment number. That is, we use the nearest neighbor graph to generate a table which stores a list of neighbors for each segment. Each time the controller is called, that list is used to calculate the closest center point. We have found that this approach significantly reduces the burden on the processor.

We now define the number of active nodes, denoted N_act ∈ ℕ. Active nodes define the exact number of neurons that are being used for control at all times. The number of active nodes is a parameter that can be varied by the user based on processing constraints. We refer to the case where N_act is selected to be equal to Q as the pure sparse approach, where only the set of nodes allocated to the segment (E_i) currently in operation is used in the adaptive controller. We now consider the case where the number of active nodes is N_act > Q. That is, the adaptive controller operating in


Algorithm 5.1 Sparse Neural Network Execution
1: receive x(t) and the corresponding location in the operating envelope
2: use previous segment index j and Delaunay diagram to determine the index, i, of the closest operating segment s_i
3: if dwell time condition is met OR inside the error bounds then
4:    retrieve the set of nodes, E_A_i, corresponding to the index i
5:    form a SHL NN using the weights (Ŵ_i, V̂_i) associated with the segment i
6: else
7:    store current operating segment index i
8:    use previous segment index j to form a SHL NN using the weights (Ŵ_j, V̂_j) associated with the segment j
9: end if
10: use the neural network to form an adaptive control signal following (5-18)
11: create an overall control signal following (5-)
12: apply the overall control signal to the system dynamics (5-)
13: update the adaptive weights Ŵ_i, V̂_i, K̂ according to the adaptive update laws stated in (5-), (5-), (5-)

segment s_i must utilize its nodes along with nodes from nearby segments for control. For this blended approach, the active node list for each segment, E_A_i, i ∈ I, is determined by selecting the closest N_act nodes to each segment's center point, p_i. Notice that if N_act = Q = N, then we are resorting to the traditional SHL approach. For the SNN, all the parameters are pre-defined based on user selections, and the list of active nodes is created before run-time.

Algorithm 5.1 shows the step-by-step SNN approach for each time the controller is called.

5.3 Adaptive Control Formulation

In order to develop a neural network approximation of the matched uncertainty, f(x), our stability analysis for the sparse neural network (SNN) adaptive controller leverages the universal neural network approximation theorem [13, 17]. We consider a single hidden layer (SHL) feed-forward neural network which takes the form:

NN(x) = W^T \sigma(V^T x + b_V) + b_W


where x ∈ ℝ^{n×1} is the input vector and W ∈ ℝ^{N×m}, V ∈ ℝ^{n×N}, b_V ∈ ℝ^N, and b_W ∈ ℝ^m represent the ideal weights (W, V) and biases (b_V, b_W) of the neural network. We also define Ŵ, V̂, b̂_V, and b̂_W as the estimates of the ideal weights and biases, with error terms defined as W̃ = Ŵ − W and Ṽ = V̂ − V. Below is the neural network approximation theorem that will be utilized in subsequent sections.

Theorem 5.2. Any smooth function, f(x), can be approximated over a compact domain, x ∈ X, by a single hidden layer neural network with a bounded, monotonically increasing, continuous activation function, σ. That is, for any ε* > 0, there exist neural network weights W, V, b_V, and b_W with N neurons such that

f(x) = W^T \sigma(V^T x + b_V) + b_W + \varepsilon, \qquad \|\varepsilon\| < \varepsilon^*

where ε is called the reconstruction error.

Proof. See [16] or [82].

The objective of the adaptive neural network controller is to adjust the neural network parameters (i.e., Ŵ, V̂, b̂_V, and b̂_W) in order to approximate a smooth nonlinear function within specified thresholds. By redefining the neural network weight matrices as W = [W^T b_W]^T ∈ ℝ^{(N+1)×m} and V = [V^T b_V]^T ∈ ℝ^{(n+1)×N} and the input vector as x̄ = [x^T, 1]^T ∈ ℝ^{(n+1)×1}, we can simplify notation and be more computationally efficient during run-time [16, 80]. For the stability analysis, we will assume that an ideal neural network approximation exists within a known constant tolerance while operating within a compact set, X, with known bounds. That is, we define the compact set X = {x ∈ ℝ^n : ‖x‖ ≤ R̄} and the approximation bound as ‖ε‖ ≤ ε̄, ∀x ∈ X.

For our stability analysis, the SNN controller requires a discrete update for the selection of the nodes used for control, which causes switching in the closed-loop


system. That is, for every point in time, t, the SNN controller is operating within a single segment, s_i, using a pre-defined set of active nodes, E_A_i, i ∈ I, for control. Using compact neural network notation, the form of the adaptive controller while operating in the i-th segment can be stated as

u_{NN} = -\hat{W}_i^T \sigma(\hat{V}_i^T \bar{x})

where Ŵ_i ∈ ℝ^{(N_act+1)×m} and V̂_i ∈ ℝ^{(n+1)×N_act} denote the estimates of the outer and inner layer weights of the ideal neural network. The total adaptive control signal is given by

u_{AD} = u_{NN} + u_K = -\hat{W}_i^T \sigma(\hat{V}_i^T \bar{x}) - \hat{K}(u_{BL} + u_{NN} + u_{RB})

where u_K is designed to negate the control degradation term, Λ. We assume the input vector, x̄, is uniformly bounded by a known positive constant.

The open-loop dynamics can be derived by using the form of the baseline controller in (5-) along with the chosen reference model in (5-) and can be stated as

\dot{x} = A_{ref}\, x + B(u_{AD} + u_{RB}) + B\big((I - \Lambda^{-1})u + \bar{f}(x)\big) + B_{ref}\, y_{cmd}

where I ∈ ℝ^{m×m} is an identity matrix.

Recall that the state tracking error is defined as e = x − x_ref; then the state tracking error dynamics can be stated as

\dot{e} = A_{ref}\, e + B(u_{AD} + \bar{f}(x)) + B u_{RB} + B \Lambda_K u.
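Computationally, evaluating the adaptive control signal for the active segment is a single forward pass through the SHL network. The sketch below is a minimal illustration: the dimensions and the randomly drawn weight estimates are placeholders, whereas in practice Ŵ_i, V̂_i, and K̂ would be the current outputs of the adaptive update laws for segment i.

```python
import numpy as np

def sigmoid(z):
    # Bounded, monotonically increasing activation (0 <= sigma <= 1).
    return 1.0 / (1.0 + np.exp(-z))

def u_adaptive(W_hat, V_hat, K_hat, x, u_bl, u_rb):
    """u_AD = u_NN + u_K with u_NN = -W_hat^T sigma(V_hat^T xbar) and
    u_K = -K_hat (u_BL + u_NN + u_RB); xbar is x augmented with a 1 so the
    bias terms ride along inside the augmented W_hat and V_hat matrices."""
    xbar = np.append(x, 1.0)                       # input vector with bias entry
    sig = np.append(sigmoid(V_hat.T @ xbar), 1.0)  # hidden layer + outer bias
    u_nn = -W_hat.T @ sig
    u_k = -K_hat @ (u_bl + u_nn + u_rb)
    return u_nn + u_k

n, N_act, m = 4, 6, 2                              # illustrative dimensions
rng = np.random.default_rng(0)
W_hat = 0.1 * rng.standard_normal((N_act + 1, m))  # (N_act+1) x m, as in text
V_hat = 0.1 * rng.standard_normal((n + 1, N_act))  # (n+1) x N_act, as in text
K_hat = 0.05 * np.eye(m)
u = u_adaptive(W_hat, V_hat, K_hat,
               x=np.zeros(n), u_bl=np.ones(m), u_rb=np.zeros(m))
```

The matrix shapes mirror the dimensions stated above, so the forward pass is two matrix-vector products per controller call regardless of how many inactive nodes exist in the full network.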


This is obtained by utilizing the form of the open-loop dynamics in (5-) and the state tracking error in (5-), where we define Λ_K = I − Λ^{-1} and f̄(x) = Λ f(x).

The neural network weights are updated according to the following adaptive update laws used in the i-th segment:

\dot{\hat{W}}_i = \mathrm{Proj}\Big( \Gamma_W \big( \sigma(\hat{V}_i^T\bar{x}) - \sigma'(\hat{V}_i^T\bar{x})\,\hat{V}_i^T\bar{x} + \tfrac{1}{2}\,\sigma''(\hat{V}_i^T\bar{x}) \circ \mathrm{diag}(\hat{V}_i^T\bar{x})\,\hat{V}_i^T\bar{x} \big)\, e^T P B \Big)

\dot{\hat{V}}_i = \mathrm{Proj}\Big( \Gamma_V\, \bar{x}\, e^T P B \big( \hat{W}_i^T \sigma'(\hat{V}_i^T\bar{x}) - \tfrac{1}{2}\,\hat{W}_i^T \sigma''(\hat{V}_i^T\bar{x})\, \mathrm{diag}(\hat{V}_i^T\bar{x}) \big) \Big)

where the positive definite matrix P = P^T > 0 satisfies the algebraic Lyapunov equation

P A_{ref} + A_{ref}^T P = -Q

for any positive definite Q = Q^T > 0. The index for the estimated neural network weights (V̂_i, Ŵ_i) is determined based on the flight vehicle's location within the flight envelope. The adaptive law used for canceling the effect of Λ takes the following form [16]:

\dot{\hat{K}} = \mathrm{Proj}\big( \Gamma_K\, (u_{BL} + u_{NN} + u_{RB})\, e^T P B \big).

Note that Γ_W, Γ_V, and Γ_K are diagonal matrices of positive constant learning rates.

5.3.1 Neural Network Adaptive Control Law

In this subsection, we formulate the adaptive update laws stated in (5-24) to (5-25) through the use of a Taylor series expansion and determine an upper bound on the adaptive error.

Recall from Section 5.2 that T refers to the total number of segments in the sparse neural network architecture and N_act denotes the number of active nodes per segment, which is set based on processing constraints. During operation in segment s_i, we form a SHL neural network with the following notation. Let V̂_i = [V̂_{i,(1)}, ..., V̂_{i,(n+1)}]^T and V̂_{i,(a)} = [V̂_{i,(a,1)}, ..., V̂_{i,(a,N_act)}]^T, where V̂_{i,(a,b)} is the inner layer weight from the a-th input node to the b-th hidden node for the i-th segment. Similarly, let


Ŵ_i = [Ŵ_{i,(1)}, ..., Ŵ_{i,(N_act+1)}]^T and Ŵ_{i,(c)} = [Ŵ_{i,(c,1)}, ..., Ŵ_{i,(c,m)}]^T, where Ŵ_{i,(c,d)} is the outer layer neural network weight from the c-th hidden node to the d-th output for the i-th segment.

Now, consider the tracking error dynamics given by (5-) and the update laws in (5-) and (5-). Recall that the neural network portion of the adaptive controller is given by

u_{NN} = -\hat{W}_i^T \sigma(\hat{V}_i^T \bar{x})

while the neural network approximation error for the i-th segment takes the form:

f(x) = W_i^T \sigma(V_i^T \bar{x}) + \varepsilon_i.

We define upper bounds W̄_i and V̄_i on the ideal neural network weights for the i-th interval as ‖W_i‖ < W̄_i, ‖V_i‖ < V̄_i, where they satisfy the following inequalities [14]:

\|\hat{W}_i\| < \|\tilde{W}_i\| + \bar{W}_i, \qquad \|\hat{V}_i\| < \|\tilde{V}_i\| + \bar{V}_i.

We now compute the second-order Taylor series expansion of σ(V_i^T x̄) around V̂_i^T x̄, which yields

\sigma(V_i^T\bar{x}) = \sigma(\hat{V}_i^T\bar{x}) + \sigma'(\hat{V}_i^T\bar{x})\,(V_i^T\bar{x} - \hat{V}_i^T\bar{x}) + \tfrac{1}{2}\,\sigma''(\hat{V}_i^T\bar{x}) \circ (V_i^T\bar{x} - \hat{V}_i^T\bar{x})^2 + O(\tilde{V}_i^T\bar{x})^3
 = \sigma(\hat{V}_i^T\bar{x}) + \sigma'(\hat{V}_i^T\bar{x})\,(V_i^T\bar{x} - \hat{V}_i^T\bar{x}) - \tfrac{1}{2}\,\sigma''(\hat{V}_i^T\bar{x}) \circ (V_i^T\bar{x} - \hat{V}_i^T\bar{x}) \circ (\tilde{V}_i^T\bar{x}) + O(\tilde{V}_i^T\bar{x})^3

where ∘ is used to denote component-wise multiplication and O(Ṽ_i^T x̄)^3 represents the sum of the higher order terms of degree greater than two. The activation function is denoted by σ and is stored as a diagonal matrix, with Jacobian σ′ and Hessian σ″. For this paper, we


use the sigmoid activation function and its derivatives, which can be stated as

\sigma(x) = \frac{1}{1 + e^{-x}}
\sigma'(x) = \frac{e^x}{(e^x + 1)^2} = \sigma(1 - \sigma)
\sigma''(x) = \frac{-e^x(e^x - 1)}{(e^x + 1)^3} = \sigma(1 - \sigma)(1 - 2\sigma)

with bounds

0 \leq \sigma \leq 1, \qquad 0 \leq \sigma' \leq \tfrac{1}{4}, \qquad -\tfrac{1}{6\sqrt{3}} \leq \sigma'' \leq \tfrac{1}{6\sqrt{3}}.

Rearranging terms from (5-) yields

O(\tilde{V}_i^T\bar{x})^3 = \sigma'(\hat{V}_i^T\bar{x})\,\tilde{V}_i^T\bar{x} - \big(\sigma(\hat{V}_i^T\bar{x}) - \sigma(V_i^T\bar{x})\big) + \tfrac{1}{2}\,\sigma''(\hat{V}_i^T\bar{x})\,(V_i^T\bar{x} - \hat{V}_i^T\bar{x}) \circ (\tilde{V}_i^T\bar{x}).

As derived in Appendix A, this directly leads into the bounds on the adaptive error given by

-u_{NN} - f(x) = \hat{W}_i^T \sigma(\hat{V}_i^T\bar{x}) - W_i^T \sigma(V_i^T\bar{x}) - \varepsilon_i
 = \tilde{W}_i^T \big( \sigma(\hat{V}_i^T\bar{x}) - \sigma'(\hat{V}_i^T\bar{x})\,\hat{V}_i^T\bar{x} + \tfrac{1}{2}\,\sigma''(\hat{V}_i^T\bar{x}) \circ (\hat{V}_i^T\bar{x}) \circ (\hat{V}_i^T\bar{x}) \big) + \hat{W}_i^T \sigma'(\hat{V}_i^T\bar{x})\,\tilde{V}_i^T\bar{x} - \tfrac{1}{2}\,\hat{W}_i^T \sigma''(\hat{V}_i^T\bar{x}) \circ (\hat{V}_i^T\bar{x}) \circ (\tilde{V}_i^T\bar{x}) + h_i - \varepsilon_i

where we define h_i as

h_i = \tilde{W}_i^T \big( \sigma'(\hat{V}_i^T\bar{x})\,V_i^T\bar{x} - \tfrac{1}{2}\,\sigma''(\hat{V}_i^T\bar{x}) \circ (\hat{V}_i^T\bar{x}) \circ (V_i^T\bar{x}) \big) + \tfrac{1}{2}\,W_i^T \sigma''(\hat{V}_i^T\bar{x}) \circ (V_i^T\bar{x}) \circ (\tilde{V}_i^T\bar{x}) - W_i^T O(\tilde{V}_i^T\bar{x})^3

which was chosen to include terms containing unknown coefficients (e.g., W_i, V_i) and higher order terms. The terms that are linear in W̃_i, Ṽ_i with known coefficients can be


adapted for [17]. As we show in Appendix A, an upper bound can be established using the definition of h_i from (5-) and the higher order terms definition in (5-) [87, 88]:

\|h_i - \varepsilon_i\| = \big\| \hat{W}_i^T \sigma'(\hat{V}_i^T\bar{x})\,V_i^T\bar{x} - W_i^T \sigma'(\hat{V}_i^T\bar{x})\,\hat{V}_i^T\bar{x} + \tfrac{1}{2}\,W_i^T \sigma''(\hat{V}_i^T\bar{x}) \circ (\hat{V}_i^T\bar{x}) \circ (\hat{V}_i^T\bar{x}) - \tfrac{1}{2}\,\hat{W}_i^T \sigma''(\hat{V}_i^T\bar{x}) \circ (\hat{V}_i^T\bar{x}) \circ (V_i^T\bar{x}) + W_i^T \sigma'\,\tilde{V}_i^T\bar{x} - \varepsilon_i \big\| \leq \Theta_i\, \Phi_i.

We define the Frobenius norm of a matrix X as ‖X‖_F = √(trace(X^T X)) and use the relation ‖XY‖_F ≤ ‖X‖_F ‖Y‖_F to form the following [14]:

\Theta_i(\bar{V}_i, \bar{W}_i) = \max\{\bar{V}_i, \bar{W}_i, \bar{\varepsilon}\}

\Phi_i(\hat{V}_i, \hat{W}_i, \bar{x}) = \|\hat{W}_i^T \sigma'(\hat{V}_i^T\bar{x})\|_F \|\bar{x}\|_F + \|\sigma'(\hat{V}_i^T\bar{x})\,\hat{V}_i^T\bar{x}\|_F + \tfrac{1}{2}\,\|\sigma''(\hat{V}_i^T\bar{x}) \circ (\hat{V}_i^T\bar{x}) \circ (\hat{V}_i^T\bar{x})\|_F + \tfrac{1}{2}\,\|\hat{W}_i^T \sigma''(\hat{V}_i^T\bar{x})\,\mathrm{diag}(\hat{V}_i^T\bar{x})\|_F \|\bar{x}\|_F + 2.

Notice that Θ_i(V̄_i, W̄_i) is a matrix of norms for unknown coefficients and Φ_i(V̂_i, Ŵ_i, x̄) is a matrix of known terms. We denote an upper bound on ‖h_i − ε_i‖ as Ū_i ∈ ℝ^{m×1}. The reader is referred to [88] and [16] for detailed derivations of upper bounds for neural network based adaptive controllers.

5.3.2 Robust Adaptive Control

Next, we discuss the adaptive laws stated in (5-) to (5-), which will be used in the stability analysis. Each adaptive law includes the projection operator (see [16] and [19]) as the chosen robust adaptive control technique, which forces the weight estimates to remain within a known convex bounded set. It is worth noting that we chose not to use the e-modification or σ-modification operators in our adaptive laws due to their adverse effects on adaptation when operating with large


tracking errors [16]. Details regarding the projection operator, including the definitions used in this paper, are provided in Appendix B.

The projection operator ensures that a system starting with any initial weights Θ̂(t_0) within the set Ω_0 will evolve with weights Θ̂(t) within the set Ω_1 for all t ≥ t_0. In our neural network adaptive control set-up, we can use the properties of the projection operator to define upper bounds for the weight matrices Ŵ_i and V̂_i. That is, we assume that the initial conditions of Ŵ_i and V̂_i lie in the compact sets Ω_W0 and Ω_V0, which are defined in the same manner as Ω_0 in Appendix B. Then, we can define the maximum values for the norm of the weight matrices Ŵ_i and V̂_i as [88]

\hat{W}_i^{max} = \max_{\hat{W}_i \in \Omega_{W1}} \|\hat{W}_i(t)\|, \quad \forall t \geq t_0
\hat{V}_i^{max} = \max_{\hat{V}_i \in \Omega_{V1}} \|\hat{V}_i(t)\|.

Similarly, the bound of the adaptive control effectiveness term can be stated as

\hat{K}^{max} = \max_{\hat{K} \in \Omega_{K1}} \|\hat{K}(t)\|, \quad \forall t \geq t_0.

5.4 Stability Analysis

Using the multiple Lyapunov function approach [30], consider the following Lyapunov candidate for each segment, s_i:

V = e^T P e + \tfrac{1}{2}\,\mathrm{trace}(\tilde{V}_i^T \Gamma_V^{-1} \tilde{V}_i) + \tfrac{1}{2}\,\mathrm{trace}(\tilde{W}_i^T \Gamma_W^{-1} \tilde{W}_i) + \tfrac{1}{2}\,\mathrm{trace}\big( (\tilde{K}\Lambda^{1/2})^T \Gamma_K^{-1} (\tilde{K}\Lambda^{1/2}) \big)

where time differentiating along the trajectories of (5-) results in

\dot{V} = \dot{e}^T P e + e^T P \dot{e} + \mathrm{trace}(\tilde{V}_i^T \Gamma_V^{-1} \dot{\tilde{V}}_i) + \mathrm{trace}(\tilde{W}_i^T \Gamma_W^{-1} \dot{\tilde{W}}_i) + \mathrm{trace}(\tilde{K}^T \Gamma_K^{-1} \dot{\tilde{K}}\,\Lambda)
 = -e^T Q_{ref}\, e + 2 e^T P B (u_{AD} + \bar{f}(x)) + 2 e^T P B \Lambda_K (u_{AD} + u_{BL} + u_{RB}) + 2 e^T P B\, u_{RB} + \mathrm{trace}(\tilde{K}^T \Gamma_K^{-1} \dot{\hat{K}}\,\Lambda) + \mathrm{trace}(\tilde{V}_i^T \Gamma_V^{-1} \dot{\hat{V}}_i) + \mathrm{trace}(\tilde{W}_i^T \Gamma_W^{-1} \dot{\hat{W}}_i).


By substituting the previous result into (5-), we have

\dot{V} = -e^T Q_{ref}\, e + \mathrm{trace}(\tilde{V}_i^T \Gamma_V^{-1} \dot{\hat{V}}_i) + \mathrm{trace}(\tilde{W}_i^T \Gamma_W^{-1} \dot{\hat{W}}_i) + \mathrm{trace}(\tilde{K}^T \Gamma_K^{-1} \dot{\hat{K}}\,\Lambda)
 - 2 e^T P B \Big( \tilde{W}_i^T \big( \sigma(\hat{V}_i^T\bar{x}) - \sigma'(\hat{V}_i^T\bar{x})\,\hat{V}_i^T\bar{x} + \tfrac{1}{2}\,\sigma''(\hat{V}_i^T\bar{x}) \circ (\hat{V}_i^T\bar{x}) \circ (\hat{V}_i^T\bar{x}) \big) + \hat{W}_i^T \sigma'(\hat{V}_i^T\bar{x})\,\tilde{V}_i^T\bar{x} - \tfrac{1}{2}\,\hat{W}_i^T \sigma''(\hat{V}_i^T\bar{x}) \circ (\hat{V}_i^T\bar{x}) \circ (\tilde{V}_i^T\bar{x}) + h_i - \varepsilon_i \Big)
 - 2 e^T P B\, \tilde{K} (u_{BL} + u_{NN} + u_{RB}) + 2 e^T P B\, u_{RB}.

By using the form of the adaptive update laws stated in (5-) to (5-27), the projection operator property stated in (B-), the vector property diag(a)b = a ∘ b, and the trace property trace(a + b) = trace(a) + trace(b), we can establish a simplified upper bound of (5-), which is given by

\dot{V} \leq -e^T Q_{ref}\, e - 2 e^T P B (h_i - \varepsilon_i) + 2 e^T P B\, u_{RB}.

Note that the derivations and details are provided in Appendix C.

Since we know that, for all e ∈ ℝ^n,

\lambda_{min}(Q_{ref})\,\|e\|^2 \leq e^T Q_{ref}\, e \leq \lambda_{max}(Q_{ref})\,\|e\|^2,

we can rewrite (5-) using the upper bounds stated in (A-) as

\dot{V} \leq -\lambda_{min}(Q_{ref})\,\|e\|^2 + 2\,\|e\|\,\lambda_{max}(P)\,\|B\|\,\bar{U}_i + 2 e^T P B\, u_{RB}.

5.4.1 Robust Control for Safe Switching

For our problem, switching in the system dynamics is introduced through the use of the sparse neural network adaptive controller. By using a robust control term and a multiple Lyapunov function approach with a strict dwell time condition, we will ensure that switching between different segments in the adaptive controller does not result in instability. Consider the open-loop dynamics stated in (5-) in the form of a family of dynamic systems [29]:

\dot{x} = f_i(x, u)


where i ∈ I is the segment number and f_i is locally Lipschitz.

Now consider the switched system generated from (5-) [89]:

\dot{x} = f_S(x, u)

where a switching signal S : ℝ_+ → I specifies the index of the active segment at time t. For the remainder of this work, we will assume that the switching signal, S, is piecewise constant and right continuous.

Prevalent in the switched system and hybrid systems literature, the average dwell time condition of a switching signal, S, can be stated as

N_S(T, t) \leq N_o + \frac{T - t}{\tau_a}

where the switching signal, S, has an average dwell time of τ_a if there exist two numbers N_o ∈ ℕ and τ_a ∈ ℝ_+ that satisfy the average dwell time condition stated in (5-), and where N_S(T, t) denotes the number of switches on the time interval [t, T).

For the sparse neural network controller, we are interested in calculating a strict dwell time condition for the controller to follow. The dwell time condition holds if there exists a dwell time, T_dwell ∈ ℝ_+, such that

t_{S+1} - t_S \geq T_{dwell}, \quad \forall S

where t_S denotes the time of the S-th switch, with dwelling interval [t_S, t_{S+1}). It can easily be shown that the strict dwell time condition is a special case of the average dwell time condition with N_o = 1 [29, 30]. In the case of the average dwell time condition, some switching intervals can be less than the specified average dwell time, τ_a.

In general, the use of robust control can lead to high gain control and limit the learning performance of adaptive controllers. However, robust control terms can be used effectively in order to ensure safe switching between intervals. This is accomplished by


enabling a robust control term when the system error becomes larger than a predetermined threshold. While the error remains larger than the threshold, the robust control term remains active, and the sparse neural network is required to satisfy a dwelling time requirement, T_dwell, before switching to the next segment. This set-up ensures convergence to the predetermined error bound, where the robust control term is deactivated and the controller performance is then determined based on the sparse adaptive neural network and the baseline controller.

Suppose u_RB takes the form:

u_{RB} = -(1 - f_{RB})\, k_{RB}\, \mathrm{sgn}(e^T P B)

where k_RB > 0 is selected to be the robust control gain and f_RB is used to fade out the effect of this control term. We define the fade-out function, f_RB, by

f_{RB} = \begin{cases} 0, & \|e\| \geq r_{0E} + \delta_{RB} \\ 1, & \|e\| \leq r_{0E} \end{cases}

where r_0E is a design parameter used to define error bounds for the active region of the robust control term and δ_RB is selected to be the length of the fade-out region.

Consider when ‖e‖ ≥ r_0E + δ_RB and the robust control term is active. By substituting (5-) into (5-), we derive the following inequality:

\dot{V} \leq -e^T Q_{ref}\, e - 2 e^T P B \big( h_i - \varepsilon_i + k_{RB}\,\mathrm{sgn}(e^T P B) \big).

Using the equation [16]

e^T P B\, u_{RB} = -k_{RB} \sum_{j=1}^{m} |(e^T P B)_j|

and (5-), where |·| denotes absolute value, results in the following:

\dot{V} \leq -e^T Q_{ref}\, e + 2 \sum_{j=1}^{m} |(e^T P B)_j|\, \big( \|h_i - \varepsilon_i\| - k_{RB} \big).
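The robust term and its fade-out logic are straightforward to express in code. The sketch below is illustrative only: the gain, threshold, and fade-out width are hypothetical values, and the linear ramp between the two regions defined above is an assumed transition shape (the text fixes only the fully-on and fully-off regions).

```python
import numpy as np

def f_rb(e_norm, r0e, delta_rb):
    """Fade-out factor: 0 above the robust region, 1 inside the error bound.
    The linear ramp over [r0e, r0e + delta_rb] is an assumed transition."""
    if e_norm >= r0e + delta_rb:
        return 0.0
    if e_norm <= r0e:
        return 1.0
    return (r0e + delta_rb - e_norm) / delta_rb

def u_robust(e, P, B, k_rb, r0e, delta_rb):
    """u_RB = -(1 - f_RB) * k_RB * sgn(e^T P B)."""
    fade = f_rb(np.linalg.norm(e), r0e, delta_rb)
    return -(1.0 - fade) * k_rb * np.sign(e @ P @ B)

# Large tracking error: the robust term is fully active.
u_rb = u_robust(np.array([10.0, 0.0]), np.eye(2),
                np.array([[1.0], [0.0]]), k_rb=2.0, r0e=1.0, delta_rb=0.5)
```

Inside the error bound the factor (1 − f_RB) vanishes, so the discontinuous sign term, and the chattering it can cause, is removed exactly where the adaptive controller is meant to take over.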


If we select the robust controller gain, k_RB, to satisfy the following inequality:

k_{RB} \geq \bar{U}_i

then (5-) becomes

\dot{V} \leq -e^T Q_{ref}\, e.

Consider the sphere set S_{r_0} ⊂ X:

S_{r_0} = \big\{ (e, \tilde{V}_i, \tilde{W}_i, \tilde{K}) : \|e\| \leq r_{0E} + \delta_{RB} \big\}

where we define the radius associated with the largest Lyapunov function value inside S_{r_0} as

r_0 = \big\| \big[\, r_{0E} + \delta_{RB},\; \tilde{V}_i^{max},\; \tilde{W}_i^{max},\; \tilde{K}^{max} \,\big]^T \big\|

which will be referenced in the later sections.

Without loss of generality, we define the upper bounds for the adaptive error terms in the Lyapunov candidate in (5-) for the i-th segment to be:

k_{\tilde{V}_i} = \max_{\hat{V}_i \in \Omega_{V1}} \tfrac{1}{2}\,\mathrm{trace}(\tilde{V}_i^T \Gamma_V^{-1} \tilde{V}_i)
k_{\tilde{W}_i} = \max_{\hat{W}_i \in \Omega_{W1}} \tfrac{1}{2}\,\mathrm{trace}(\tilde{W}_i^T \Gamma_W^{-1} \tilde{W}_i)
k_{\tilde{K}} = \max_{\hat{K} \in \Omega_{K1}} \tfrac{1}{2}\,\mathrm{trace}\big( (\tilde{K}\Lambda^{1/2})^T \Gamma_K^{-1} (\tilde{K}\Lambda^{1/2}) \big)

where we denote the upper bounds of the adaptive errors as Ṽ_i^{max}, W̃_i^{max}, and K̃^{max}.

Rewriting (5-) in terms of the Lyapunov candidate in (5-), we obtain:

\dot{V} \leq -c_V V + c_V k_T

where k_T = k_{Ṽ_i} + k_{W̃_i} + k_{K̃} and c_V = λ_min(Q_ref)/λ_max(P). Let t_0 and t_F be the initial time and final time while operating in a single segment, i; then (5-) implies that

V(t) \leq k_T + \big( V(t_0) - k_T \big) e^{-c_V t}


for all t ∈ [t_0, t_F].

We now calculate an upper bound for the dwell time of our adaptive system, T_dwell, based on the previous equations. Recall that we use dwell time to define the minimum time for the system to wait before switching to a different segment, which ensures safe switching between segments.

Theorem 5.3. Suppose that there exist continuously differentiable positive definite Lyapunov candidate functions V_i, i ∈ I. Let Δt = [t_0, t_F] represent the complete time segment for which the robust control term is active, i.e., ‖e‖ > r_0E + δ_RB, where t_F is the final time and t_0 is the starting time. Suppose that the complete time segment Δt can be broken into a finite number of time segments N_S, denoted Δt_S, where the subscript S denotes the time segment number of the time segment defined by Δt_S = [t_{S−1}, t_S). Consider a switching sequence {Δt_1, Δt_2, ..., Δt_{N_S}} where exactly one set of nodes i with corresponding neural network weights (V̂_i, Ŵ_i) is active for each time segment in the sequence. If we assume the dwelling time, T_dwell, of the SNN adaptive controller is chosen to satisfy

T_{dwell} \geq \frac{1}{c_V} \ln\left( \frac{k_{NN} + k_T}{k_T} \right)

then the system is guaranteed to enter the sphere set S_{r_0} in finite time with all closed-loop signals bounded.

Proof. Suppose the system is transitioning from segment Δt_S into segment Δt_{S+1} at time t_S. We imagine the set of active nodes for segment Δt_S is p, while the set of active nodes for the segment Δt_{S+1} is c. Also let V_p(t_S) and V_c(t_S) denote the Lyapunov candidate values of the segments p and c, respectively, at the time instant t_S. Our goal is to show that by choosing a dwell time, T_dwell, that satisfies (5-), the Lyapunov candidate values will satisfy V_p(t_{S−1}) > V_c(t_S) and V_p(t_S) > V_c(t_{S+1}). Hence, this guarantees that the system will enter the sphere set S_{r_0} in finite time.

Consider the time instant, t_S, where the switch occurs: the tracking error e_p(t_S) will initially be equivalent to e_c(t_S), but the new set of bounded neural network weights


(Ṽ_c, W̃_c) in the adaptive controller will cause a different Lyapunov result. That is, we define the contribution of the neural network weight errors to the Lyapunov candidates for segments c and p, in the form stated in (5-), as

k_c = \tfrac{1}{2}\,\mathrm{trace}(\tilde{V}_c^T \Gamma_V^{-1} \tilde{V}_c) + \tfrac{1}{2}\,\mathrm{trace}(\tilde{W}_c^T \Gamma_W^{-1} \tilde{W}_c)
k_p = \tfrac{1}{2}\,\mathrm{trace}(\tilde{V}_p^T \Gamma_V^{-1} \tilde{V}_p) + \tfrac{1}{2}\,\mathrm{trace}(\tilde{W}_p^T \Gamma_W^{-1} \tilde{W}_p)

which implies the instantaneous change in the Lyapunov value is upper bounded by

\Delta V = V_c(t_S) - V_p(t_S) = k_c - k_p \leq k_{NN} \leq k_T

where k_NN = k_{Ṽ_i} + k_{W̃_i} is defined in (5-).

If we assume ‖e‖ > r_0E + δ_RB while operating in the c-th segment during the time interval Δt_{S+1}, then (5-) becomes

V_c(t_{S+1}) \leq k_T + \big( V_c(t_S) - k_T \big) e^{-c_V \Delta t_{S+1}}.

By forcing the system to abide by the dwell time requirement, i.e., Δt_S ≥ T_dwell for all S, (5-) becomes

V_c(t_{S+1}) \leq k_T + \big( V_c(t_S) - k_T \big) e^{-c_V T_{dwell}}.

Using the inequality in (5-), we obtain

V_c(t_{S+1}) \leq k_T + \big( V_p(t_S) + k_{NN} - k_T \big) e^{-c_V T_{dwell}}.

Since our goal is to find a T_dwell such that V_c(t_{S+1}) < V_p(t_S), then (5-) must satisfy

k_T + \big( V_p(t_S) + k_{NN} - k_T \big) e^{-c_V T_{dwell}} < V_p(t_S)

which implies that

T_{dwell} > \frac{1}{c_V} \ln\left( \frac{V_p(t_S) + k_{NN} - k_T}{V_p(t_S) - k_T} \right).

By assuming that V_p(t_S) ≥ 2k_T + k_NN, where k_T ≥ k_NN, it follows that

\frac{1}{c_V} \ln\left( \frac{k_{NN} + k_T}{k_T} \right) \geq \frac{1}{c_V} \ln\left( \frac{V_p(t_S) + k_{NN} - k_T}{V_p(t_S) - k_T} \right).

Hence, if T_dwell is selected to satisfy (5-), then (5-) will also be satisfied. It is worth noting that a larger lower bound assumption on V_p(t_S) would result in a smaller dwell time requirement.

Next, let us assume that ‖e‖ > r_0E + δ_RB while operating in the segment p during the time interval Δt_S; then (5-) becomes

V_p(t_S) \leq k_T + \big( V_p(t_{S-1}) - k_T \big) e^{-c_V \Delta t_S} \leq k_T + \big( V_p(t_{S-1}) - k_T \big) e^{-c_V T_{dwell}}

and rearranging terms results in

V_p(t_{S-1}) \geq k_T + \big( V_p(t_S) - k_T \big) e^{c_V T_{dwell}}.

Plugging the lower bound derived in (5-) into (5-) results in the following inequality:

V_p(t_{S-1}) > V_p(t_S) + k_{NN}

which implies that V_c(t_S) < V_p(t_{S−1}). Note also that the assumption V_p(t_S) ≥ 2k_T + k_NN then implies V_p(t_{S−1}) > 2k_T + k_NN. The proof is complete.

Notice that we are interested in finding an error bound, e_R, such that the tracking error is guaranteed to enter the sphere set, S_{r_0}, in finite time, t_F, where

\|e(t)\| = \|x(t) - x_{ref}(t)\| \leq e_R.
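Once P is in hand, the constants of this subsection reduce to a few closed-form evaluations. The sketch below is illustrative, not the dissertation's tooling: A_ref, Q_ref, and the worst-case terms k_T and k_NN are hypothetical numbers, P is obtained from the algebraic Lyapunov equation P A_ref + A_ref^T P = −Q via its vectorized (Kronecker) form, and c_V = λ_min(Q_ref)/λ_max(P) as defined above.

```python
import math
import numpy as np

def lyap_P(A_ref, Q):
    """Solve P A_ref + A_ref^T P = -Q using the Kronecker/vec identity
    (A_ref^T (x) I + I (x) A_ref^T) vec(P) = -vec(Q)."""
    n = A_ref.shape[0]
    I = np.eye(n)
    L = np.kron(A_ref.T, I) + np.kron(I, A_ref.T)
    P = np.linalg.solve(L, -Q.flatten(order="F")).reshape((n, n), order="F")
    return 0.5 * (P + P.T)          # symmetrize against round-off

def dwell_time(c_v, k_nn, k_t):
    """T_dwell >= (1/c_V) ln((k_NN + k_T)/k_T), the bound of Theorem 5.3."""
    return math.log((k_nn + k_t) / k_t) / c_v

A_ref = np.array([[0.0, 1.0], [-2.0, -3.0]])   # illustrative Hurwitz A_ref
Q_ref = np.eye(2)
P = lyap_P(A_ref, Q_ref)
c_V = min(np.linalg.eigvalsh(Q_ref)) / max(np.linalg.eigvalsh(P))
k_T, k_NN = 0.5, 0.3                           # hypothetical worst-case terms
T_dwell = dwell_time(c_V, k_NN, k_T)
e_R = math.sqrt(2.0 * k_T / min(np.linalg.eigvalsh(P)))   # error radius below
```

As the formula suggests, a larger weight-error contribution k_NN lengthens the required dwell time, while faster closed-loop error decay (larger c_V) shortens it.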


By using (5-), (5-), and the upper bound

V(t) \geq e^T P e \geq \lambda_{min}(P)\,\|e\|^2,

we find the following relationship:

\|e(t)\|^2 \leq \frac{V(t_0) - k_T}{\lambda_{min}(P)}\, e^{-c_V t} + \frac{k_T}{\lambda_{min}(P)}

for all t. By using the assumption V(t_0) ≥ 2k_T + k_NN and the dwell time requirement, T_dwell, from (5-), this results in

\|e(t_F)\| \leq \sqrt{\frac{2 k_T}{\lambda_{min}(P)}} = e_R

where r_0E + δ_RB must be selected to satisfy r_0E + δ_RB > e_R.

By the design of the SNN discussed in Section 5.3, it is clear that only one set of nodes is active for each discrete time segment for which the controller is active. In practice, time segments are not designed to be equivalent in length due to the possible variations in segment sizes. Also, the varying processing speed of the on-board processors, along with fast switching, could result in instability. However, by using the robust control term with an error threshold and enforcing the dwelling time condition of (5-), we have V_c(t_{S+1}) < V_p(t_S) until ‖e‖ ≤ r_0E + δ_RB. Notice that if ‖e‖ ≤ r_0E + δ_RB while operating in time segment Δt_S, then the system already belongs to the sphere set S_{r_0} and the robust control term is not active. This process will continue until the end of flight time. It can be shown that all signals in the closed-loop system remain uniformly bounded based on (5-), (5-), (5-), and (5-).

Notice that we can relate the strict dwell time condition derived in (5-) to common derivations in switched systems control by considering the generic form of the switched system shown in (5-) [29, 30, 89]. This derivation is given in Appendix D.


5.4.2 Sparse Neural Network Control

Now consider when the system enters the sphere set S_{r_0}, for which ‖e‖ ≤ r_0E + δ_RB and the robust control term is not active (i.e., u_RB = 0). Starting from (5-), we can write

\dot{V} \leq -\lambda_{min}(Q_{ref})\,\|e\|^2 + 2\,\|PB\|\,\|e\|\,[\bar{U}_i].

After rearranging, the Lyapunov derivative becomes

\dot{V} \leq -\lambda_{min}(Q_{ref}) \left( \|e\| - \frac{\|PB\|\,[\bar{U}_i]}{\lambda_{min}(Q_{ref})} \right)^2 + \frac{(\|PB\|\,[\bar{U}_i])^2}{\lambda_{min}(Q_{ref})}.

Hence, V̇(e, Ṽ_i, W̃_i, K̃) < 0 if

\|e\| > \frac{2\,\|PB\|\,[\bar{U}_i]}{\lambda_{min}(Q_{ref})}

and V̇(e, Ṽ_i, W̃_i, K̃) < 0 outside the sphere set S_{r_1} ⊂ X:

S_{r_1} = \left\{ e \in \mathbb{R}^n : \|e\| \leq \frac{2\,\|PB\|\,[\bar{U}_i]}{\lambda_{min}(Q_{ref})} = r_{1E} \right\}.

Using the result of (5-), we define the radius associated with the largest Lyapunov function value inside S_{r_1} as

r_1 = \left\| \left[ \frac{2\,\|PB\|\,[\bar{U}_i]}{\lambda_{min}(Q_{ref})},\; \tilde{V}_i^{max},\; \tilde{W}_i^{max},\; \tilde{K}^{max} \right]^T \right\|.

The next portion of this analysis details the derivation of an ultimate bound for the previously defined adaptive controller. See Figure 5-4 as a reference for the sets referred to in the stability analysis.

Notice, we can rewrite the Lyapunov function candidate in (5-) as

V = \Theta^T M\, \Theta

where we define the Lyapunov vector as Θ = [e, W̃_i, Ṽ_i, K̃Λ^{1/2}]^T and M is defined as


M = \begin{bmatrix} P & 0 & 0 & 0 \\ 0 & \tfrac{1}{2}\Gamma_W^{-1} & 0 & 0 \\ 0 & 0 & \tfrac{1}{2}\Gamma_V^{-1} & 0 \\ 0 & 0 & 0 & \tfrac{1}{2}\Gamma_K^{-1} \end{bmatrix}.

We define the largest sphere set, S_R, contained in the domain, X, as

S_R = \{ \Theta \in X : \|\Theta\| \leq R \}

with the assumption that R > r_0 > r_2 > r_1.

Using the Lyapunov function candidate in (5-), we can define two class-K functions α_1 and α_2 which define the bounds:

\alpha_1(\|\Theta\|) = \lambda_{min}(P)\,\|e\|^2 + \tfrac{1}{2}\lambda_{min}(\Gamma_W^{-1})\,\|\tilde{W}\|^2 + \tfrac{1}{2}\lambda_{min}(\Gamma_V^{-1})\,\|\tilde{V}\|^2 + \tfrac{1}{2}\lambda_{min}(\Gamma_K^{-1})\,\|\tilde{K}\Lambda^{1/2}\|^2
\alpha_2(\|\Theta\|) = \lambda_{max}(P)\,\|e\|^2 + \tfrac{1}{2}\lambda_{max}(\Gamma_W^{-1})\,\|\tilde{W}\|^2 + \tfrac{1}{2}\lambda_{max}(\Gamma_V^{-1})\,\|\tilde{V}\|^2 + \tfrac{1}{2}\lambda_{max}(\Gamma_K^{-1})\,\|\tilde{K}\Lambda^{1/2}\|^2

where α_1 ≤ V ≤ α_2 and, more generically,

\lambda_{min}(M)\,\|\Theta\|^2 \leq V \leq \lambda_{max}(M)\,\|\Theta\|^2.

Next, we define compact sets Ω_o and Ω_i as [88]:

\Omega_o = \{ \Theta \in S_R : V \leq w_o \triangleq \min_{\|\Theta\| = R} V = R^2\, \lambda_{min}(M) \}
\Omega_i = \{ \Theta \in S_R : V \leq w_i \triangleq \max_{\|\Theta\| = r_1} V = r_1^2\, \lambda_{max}(M) \}

where w_i denotes the maximum value of the Lyapunov function, V, on the edge of S_{r_1}, and w_o is the minimum value of the Lyapunov function, V, on the edge of S_R.


Let us also introduce the smallest sphere set, S_{r_2}, that contains Ω_i:

S_{r_2} = \{ \Theta \in S_R : \|\Theta\| \leq r_2 \}.

Next, we create an annulus set given by

\{ \Theta \in X : w_i \leq V(\Theta) \leq w_o \}

where the time derivative of V is strictly negative definite inside the annulus. For all Θ ∈ Ω_i,

\lambda_{min}(M)\,\|\Theta\|^2 \leq \Theta^T M\, \Theta \leq \lambda_{max}(M)\, r_1^2

which implies

\|\Theta\|^2 \leq \frac{\lambda_{max}(M)}{\lambda_{min}(M)}\, r_1^2 = r_2^2.

Using the relationship derived in (5-) and the definition of r_1 in (5-), we can derive an ultimate bound, r_2, given by

r_2 = \sqrt{\frac{\lambda_{max}(M)}{\lambda_{min}(M)}}\; r_1

where ‖Θ(t)‖ ≤ r_2, ∀t ≥ t_0 + T, which is equivalent to applying the formula shown in (5-) of Theorem 5.1. That is, ‖Θ‖ ≤ α_1^{-1}(α_2(r_1)) = r_2 [77, 82].

Theorem 5.4. Consider the system in (5-) with control effectiveness term Λ and matched uncertainty f(x), the control law stated in (5-), and the sparse neural network (SNN) switching scheme. Then, the adaptive switching weight update laws in (5-24), (5-), and (5-) ensure that the tracking error, e, of (5-) and the SNN weight errors, W̃_i, Ṽ_i, remain uniformly ultimately bounded (UUB) while operating within each segment, ∀i.
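The ultimate bound follows mechanically from M. The sketch below assembles a toy block-diagonal M from illustrative P and learning-rate matrices (all numbers hypothetical, with the 1/2 factors carried over from the weighted trace terms of the Lyapunov candidate) and evaluates r_2.

```python
import numpy as np

def ultimate_bound(P, gam_w, gam_v, gam_k, r1):
    """r_2 = sqrt(lam_max(M) / lam_min(M)) * r_1 with
    M = blkdiag(P, 0.5*Gamma_W^-1, 0.5*Gamma_V^-1, 0.5*Gamma_K^-1).
    The eigenvalues of a block-diagonal M are the union of block spectra."""
    blocks = [P] + [0.5 * np.linalg.inv(g) for g in (gam_w, gam_v, gam_k)]
    eigs = np.concatenate([np.linalg.eigvalsh(b) for b in blocks])
    return float(np.sqrt(eigs.max() / eigs.min()) * r1)

P = np.array([[1.25, 0.25], [0.25, 0.25]])   # illustrative P > 0
gam = 10.0 * np.eye(2)                       # illustrative learning rates
r2 = ultimate_bound(P, gam, gam, gam, r1=0.4)
```

Note the design tension this formula exposes: large learning rates shrink the Γ^{-1} blocks, which spreads the spectrum of M and inflates the ratio λ_max(M)/λ_min(M), so aggressive adaptation loosens the guaranteed ultimate bound.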


Figure 5-4. Visualization of the Lyapunov function sets used for the stability analysis.

Proof. By Theorem 5.3, the robust controller ensures safe switching between segments if the dwell time condition stated in (5-) is met when operating above the error threshold specified in (5-). Through the use of robust control (Section 5.4.1), the projection operator (Section 5.3.2), and the assumption that all trajectories, Θ(t_0), start in Ω_o, convergence to the sphere set S_{r_0} in finite time with bounded weight errors is ensured.

While operating within a single segment inside the sphere set S_{r_0}, (5-) reveals that the tracking error is e ∈ L∞ and the neural network weight errors in the operating region are Ṽ_i, W̃_i ∈ L∞. As stated previously, the input command is required to be y_cmd ∈ L∞ and A_ref is designed to be Hurwitz, which implies x_ref, x ∈ L∞. Since the matrices of the ideal neural network parameters for each region (W_i, V_i) are bounded, the neural network weight errors and estimates are also V̂_i, Ŵ_i ∈ L∞. Since the baseline controller is designed to be u_BL ∈ L∞ and the adaptive controller consists of bounded neural network weight estimates, the overall control signal is u ∈ L∞. Hence, the closed-loop error system is ė ∈ L∞, which implies that V̇ ∈ L∞. Since V̇ is strictly negative in the annulus set and V is lower bounded, V decreases monotonically until the solution enters the set Ω_i in finite time, with ultimate bound r_2.


Hence, this analysis and the previously derived equations establish that if the initial conditions of the state vector, Θ(t_0), lie in Ω_o defined by (5-), then the control law given by (5-), (5-), and (5-) and the adaptation laws stated in (5-), (5-), and (5-) ensure that the state tracking error, e, and the neural network weight errors for each segment, Ṽ_i and W̃_i, in the closed-loop system are uniformly ultimately bounded (UUB).

5.5 Hypersonic Flight Vehicle Dynamics with Flexible Body Effects

In typical subsonic flight control systems, the stiffness of the aircraft body, the benign operating conditions, and the natural frequency separation between the rigid body modes and the flexible body modes allow flexible body effects to be ignored in modeling due to their negligible effect during flight. Since hypersonic flight vehicles operate at extreme temperatures and high speeds, the vehicle's flexibility and state coupling can create dramatic changes in the overall flow field, which causes variations in the pressure distribution on the flight vehicle [6, 90]. In addition, the flexing of the fuselage can cause unexpected control moment effects from the control surfaces. For these reasons, we investigate the performance of the SNN controller versus the traditional SHL approach on a hypersonic vehicle with flexible body effects.

We consider a highly nonlinear hypersonic flight vehicle model with four independent servo-controlled fins (δ_1, δ_2, δ_3, δ_4) oriented in an X-configuration [73]. For convenience, we created virtual fins (aileron δ_a, elevator δ_e, and rudder δ_r) used for control design in the conventional autopilot reference frame. We created a static mapping from the virtual fins to actual fin displacement in order to calculate forces and moments of the flight vehicle (see [73]).

In this research we consider only the longitudinal dynamics of the flight vehicle, which are assumed to be entirely decoupled from the lateral dynamics. We can write the longitudinal 3-DoF equations of motion for a hypersonic flight vehicle in the following form


(see [74] and [6]):

\dot{V}_T = \frac{1}{m}\left(T\cos\alpha - D\right) - g\sin(\theta - \alpha)
\dot{\alpha} = \frac{1}{mV_T}\left(-T\sin\alpha - L\right) + q + \frac{g}{V_T}\cos(\theta - \alpha)
\dot{\theta} = q
\dot{q} = \frac{M}{I_{YY}}
\dot{h} = V_T\sin(\theta - \alpha)
\ddot{\eta}_i = -2\zeta_i\omega_i\dot{\eta}_i - \omega_i^2\eta_i + N_i, \quad i = 1, 2, \ldots, n

where $m$ is the mass of the vehicle, $\theta$ is the pitch angle, $q$ is the pitch rate, $I_{YY}$ is the moment of inertia, and $g$ is gravity. The equations for the $i$th structural mode of the flight vehicle are defined by the natural frequency $\omega_i$, the damping ratio $\zeta_i$, and the generalized force $N_i$. The natural frequencies of the hypersonic vehicle's body modes vary significantly based on the temperature changes experienced throughout flight [7]. Hence, we will consider $\omega_i$ a function of temperature, $T$. The forces and moments acting on the flight vehicle consist of thrust $T$, drag $D$, lift $L$, and pitch moment $M$. If we assume three elastic modes of the flight vehicle are active, the state vector, $x \in \mathbb{R}^{11}$, is given by

x = [V_T, \alpha, \theta, q, h, \eta_1, \dot{\eta}_1, \eta_2, \dot{\eta}_2, \eta_3, \dot{\eta}_3]

where $V_T$ is the true airspeed, $\alpha$ is the angle of attack, and $h$ is the altitude of the flight vehicle.

The axial and normal body forces $(A, N)$ and pitching moment $M$ can be approximated by (see [74] or [73]):

A(t) \approx \frac{1}{2}\rho V_T^2 S C_A
N(t) \approx \frac{1}{2}\rho V_T^2 S C_N
M(t) \approx \frac{1}{2}\rho V_T^2 S c_{ref} C_m


N_i(t) \approx \frac{1}{2}\rho V_T^2 S C_{N_i}, \quad i = 1, 2, \ldots, n

where $\rho$ denotes the air density, $S$ is the reference area, $c_{ref}$ is the mean aerodynamic chord, and we assume zero thrust (i.e., $T = 0$). We assume the following mapping from axial and normal forces $(A, N)$ to lift and drag forces $(L, D)$ used in (5):

L = N\cos\alpha - A\sin\alpha
D = N\sin\alpha + A\cos\alpha.

The axial force coefficient $C_A$, normal force coefficient $C_N$, pitch moment coefficient $C_m$, and the generalized force coefficient $C_{N_i}$ appearing in (5) to (5) take the following form:

C_A = C_{A_{ALT}}(h, \text{Mach}) + C_{A_{AB}}(\alpha, \text{Mach}) + \sum_{j=1}^{4} C_{A_{\delta_j}}(\alpha, \text{Mach}, \delta_j)
C_N = C_{N_0}(\alpha, \text{Mach}) + \sum_{j=1}^{4} C_{N_{\delta_j}}(\alpha, \text{Mach}, \delta_j) + C_{N_q}(\alpha, \text{Mach}, q)
C_m = C_{m_0}(\alpha, \text{Mach}) + \sum_{j=1}^{4} C_{m_{\delta_j}}(\alpha, \text{Mach}, \delta_j) + C_{m_q}(\alpha, \text{Mach}, q)
C_{N_i} = N_{i_{\alpha^2}}\alpha^2 + N_{i_{\alpha}}\alpha + \sum_{j=1}^{4} N_{i_{\delta_j}}\delta_j + \sum_{k=1}^{3} N_{i_{\eta_k}}\eta_k, \quad i = 1, 2, \ldots, n

where the aerodynamic coefficients can be computed using look-up tables based on the flight condition $(\alpha, \text{Mach}, h, q)$, the control inputs $(\delta_1, \delta_2, \delta_3, \delta_4)$, and the flexible body states $(\eta_1, \eta_2, \eta_3)$. The total lift force, drag force, and pitch moment equations can be stated as

L_T = L + L_{flex}
D_T = D + D_{flex}
M_T = M + M_{flex}


where the contributions due to the flexible modes take the form:

L_{flex} = \frac{1}{2}\rho V_T^2 S\left(c_{L_{\eta_1}}\eta_1 + c_{L_{\eta_2}}\eta_2 + c_{L_{\eta_3}}\eta_3\right)
D_{flex} = \frac{1}{2}\rho V_T^2 S\left(c_{D_{\eta_1}}\eta_1 + c_{D_{\eta_2}}\eta_2 + c_{D_{\eta_3}}\eta_3\right)
M_{flex} = \frac{1}{2}\rho V_T^2 S\left(c_{M_{\eta_1}}\eta_1 + c_{M_{\eta_2}}\eta_2 + c_{M_{\eta_3}}\eta_3\right)

where $c_{L_{\eta_1}}, c_{L_{\eta_2}}, c_{L_{\eta_3}}, c_{D_{\eta_1}}, c_{D_{\eta_2}}, c_{D_{\eta_3}}, c_{M_{\eta_1}}, c_{M_{\eta_2}}$, and $c_{M_{\eta_3}}$ are constants.

In order to determine control-oriented models suitable for MRAC, we select a dense set of trim points that suitably fills the potential flight envelope. The flight conditions of the trim points are selected to accurately represent the variations of the modeled dynamics of the flight vehicle throughout the flight envelope. For each flight condition at which a trim point is located, we are interested in determining a longitudinal linear vehicle model which includes the short-period modes and the structural bending modes. In order to accomplish this task, we decouple the short-period and phugoid modes of the vehicle by substituting trim values for $V_T$, $h$, and $\theta$ into (5).

Next, representing the dynamical equations in (5) as a set of ordinary differential equations given by

\dot{x} = f(t, x), \quad x(t_0) = x_0, \quad t \geq t_0

allows us to compute the linear short-period plant matrices $A_p \in \mathbb{R}^{n_p \times n_p}$, $B_p \in \mathbb{R}^{n_p \times m}$, $C_p \in \mathbb{R}^{p \times n_p}$, and $D_p \in \mathbb{R}^{p \times m}$ [73]. That is, the flight model is numerically linearized with respect to the states and control inputs around each flight condition, where the short-period matrices take the form:

A_p(i,j) = \left.\frac{\partial f_i}{\partial x_j}\right|_{x = x^*,\, u = u^*} \qquad B_p(i,k) = \left.\frac{\partial f_i}{\partial u_k}\right|_{x = x^*,\, u = u^*}
C_p(i,j) = \left.\frac{\partial y_i}{\partial x_j}\right|_{x = x^*,\, u = u^*} \qquad D_p(i,k) = \left.\frac{\partial y_i}{\partial u_k}\right|_{x = x^*,\, u = u^*}


where trim conditions are denoted by asterisks as $x^*$ and $u^*$, $i$ and $j$ are indices of the state vector, and $k$ is the index of the control input. For control purposes, we create an additional integral error of tracking state, $e_I$, to include in the linear model. The resulting augmented short-period linearized model takes the form:

\begin{bmatrix} \dot{e}_I \\ \dot{x}_p \end{bmatrix} = \begin{bmatrix} 0 & C_{reg} \\ 0 & A_p \end{bmatrix} \begin{bmatrix} e_I \\ x_p \end{bmatrix} + \begin{bmatrix} 0 \\ B_p \end{bmatrix} u + \begin{bmatrix} -1 \\ 0 \end{bmatrix} y_{cmd}

where $C_{reg} \in \mathbb{R}^{m \times n_p}$ selects the regulated state, and the state vector is given by

x = [e_I, \alpha, q, \eta_1, \dot{\eta}_1, \eta_2, \dot{\eta}_2, \eta_3, \dot{\eta}_3] \in \mathbb{R}^9,

which includes the integral error of tracking, angle of attack, pitch rate, flexible mode positions, and flexible mode rates. We assume the flexible modal coordinates are not measured or used for feedback. The controller output (elevator deflection, $u = \delta_e$) is produced by an actuator. Notice that by introducing the form of the uncertainties in the system, $\Lambda$ and $f(x)$, (5) takes the general form of the MRAC problem shown in (5). For each flight condition, we then determine fixed baseline controller gains using LQ methods as discussed in Section 5.1. For implementation, the complete baseline control signal of the nonlinear flight vehicle is determined by gain-scheduling the linear controller gains $(K_I, K_P)$ by Mach, angle of attack $\alpha$, and altitude $h$.

5.6 Adaptive Control Results

In this section, we present the simulation results of various adaptive controllers (SHL and SNN) on the hypersonic vehicle model while tracking a simple sinusoidal command with relatively slow frequency ($f \approx 0.24$ Hz) between $-3$ and $3$ degrees angle of attack ($\alpha$) for $t_f = 250$ sec. The command requires the vehicle to spend similar amounts of time in each region of the flight envelope while repeating each sinusoidal maneuver approximately six times. We will refer to one complete period (sinusoidal maneuver) as a pass. This simulation was designed to provide an environment for


Table 5-1. Range of flexible mode frequencies and temperature of HSV

                          MIN     MAX
Temp ($^\circ$F)          -100    2000
$\omega_1$ (rad/sec)      15      20
$\omega_2$ (rad/sec)      35      45
$\omega_3$ (rad/sec)      50      60

adaptive control long-term learning and a tracking comparison. During testing, we assumed constant flexible mode damping terms, where $\zeta_1 = \zeta_2 = \zeta_3 = 0.02$, and the learning rate matrices for the adaptive controllers were set to either small (S) ($\Gamma_W = 2.5 \times 10^{-5} I$ and $\Gamma_V = 1.25 \times 10^{-5} I$) or moderate (M) ($\Gamma_W = 5 \times 10^{-5} I$ and $\Gamma_V = 2.5 \times 10^{-5} I$), where $I$ denotes the identity matrix of appropriate size. In order to provide a proper comparison of the SHL and SNN architectures, the same number of active nodes was used in each test case (i.e., $N_{act} = 16$). Also, note that a constant $\Delta t = 0.01$ sec time step with a second-order integration method (AB-2) was used for the simulation.

For the sake of brevity, we selected a region surrounding a specific trim point in the flight envelope to analyze (i.e., $\text{Mach} = 6.0$, $\alpha = 0$, and $h = 14$ km). In addition to varying angle of attack $\alpha$ and pitch rate $q$ in the nonlinear model during simulation, we also assume that temperature varies with angle of attack $\alpha$ (assuming constant velocity), where we use the following relationship between temperature and angle of attack:

\text{Temp} = k_{1t}\alpha^2 + k_{2t}\alpha + k_{3t}

where $k_{1t}, k_{2t}$, and $k_{3t}$ are constants. We also assume a similar relationship between temperature and the natural frequency $\omega_i$ of each flexible mode:

\omega_1 = k_{1a}\text{Temp}^2 + k_{1b}\text{Temp} + k_{1c}
\omega_2 = k_{2a}\text{Temp}^2 + k_{2b}\text{Temp} + k_{2c}
\omega_3 = k_{3a}\text{Temp}^2 + k_{3b}\text{Temp} + k_{3c}


Figure 5-5. Hypersonic baseline controller under (A) significant RBF-based matched uncertainty with (B) resulting tracking performance.

where $k_{1a}, k_{1b}, k_{1c}, k_{2a}, k_{2b}, k_{2c}, k_{3a}, k_{3b}$, and $k_{3c}$ are constants, and the ranges of the temperatures and mode frequencies are shown in Table 5-1.

The uncertainty term, $f(x)$, that was used during the simulation was created using the summation of several radial basis functions (RBFs) centered at various angles of attack $\alpha$. The magnitudes of the RBFs were set to bring the baseline controller to the brink of instability when operating without adaptive control. The spacing of the RBFs was chosen to significantly vary the uncertainty based on angle of attack $\alpha$. The uncertainty term and baseline tracking performance can be seen in Figure 5-5.

5.6.1 Single Hidden Layer (SHL) Neural Network Adaptive Control

We first provide the results of the traditional SHL MRAC adaptive controller with adaptive update laws in the following form [17, 22]:

\dot{\hat{W}} = \mathrm{Proj}\!\left(\Gamma_W\left(\sigma(\hat{V}^T\mu) - \sigma'(\hat{V}^T\mu)\hat{V}^T\mu\right)e^T P B\right)
\dot{\hat{V}} = \mathrm{Proj}\!\left(\Gamma_V\,\mu\,e^T P B\,\hat{W}^T\sigma'(\hat{V}^T\mu)\right)

where $\mu$ denotes the neural network input and we will refer to this controller as SHL-TS1. For this approach, the number of active nodes $N_{act}$ is equal to the number of total nodes $N$. The adaptive laws in the traditional case were derived using a first-order Taylor expansion.
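A discrete-time sketch of one SHL-TS1 update step may clarify how the laws above are exercised in simulation. This is not the dissertation's implementation: Euler integration stands in for the AB-2 scheme, the projection operator is reduced to a plain norm clamp, and all dimensions and gains are illustrative.

```python
import numpy as np

def proj(theta, max_norm=10.0):
    """Crude stand-in for the projection operator: clamp the Frobenius
    norm of the estimate (the dissertation uses a smooth projection)."""
    n = np.linalg.norm(theta)
    return theta if n <= max_norm else theta * (max_norm / n)

def shl_ts1_step(W, V, mu, e, P, B, gW, gV, dt):
    """One Euler step of the first-order-Taylor SHL MRAC laws
    (single control channel, m = 1, tanh activation assumed)."""
    z = V.T @ mu                        # hidden-layer pre-activation V^T mu
    sig = np.tanh(z)                    # sigma(V^T mu)
    dsig = 1.0 - sig ** 2               # sigma'(V^T mu), diagonal entries
    ePB = e @ P @ B                     # e^T P B, shape (1,)
    W_dot = gW * np.outer(sig - dsig * z, ePB)
    V_dot = gV * np.outer(mu, ePB * (W[:, 0] * dsig))
    return proj(W + dt * W_dot), proj(V + dt * V_dot)
```

With the small learning rates of the text ($\Gamma_W = 2.5 \times 10^{-5}$, $\Gamma_V = 1.25 \times 10^{-5}$), each step moves the weight estimates only slightly, which is exactly the behavior sought to avoid high-frequency oscillations.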


Figure 5-6. Single hidden layer (SHL) hypersonic (A) tracking performance and (B) error tracking.

We also consider the case of using the adaptive update laws:

\dot{\hat{W}} = \mathrm{Proj}\!\left(\Gamma_W\left(\sigma(\hat{V}^T\mu) - \sigma'(\hat{V}^T\mu)\hat{V}^T\mu + \sigma''(\hat{V}^T\mu)\,\mathrm{diag}(\hat{V}^T\mu)\,\hat{V}^T\mu\right)e^T P B\right)
\dot{\hat{V}} = \mathrm{Proj}\!\left(\Gamma_V\,\mu\,e^T P B\,\hat{W}^T\left(\sigma'(\hat{V}^T\mu) - \sigma''(\hat{V}^T\mu)\,\mathrm{diag}(\hat{V}^T\mu)\right)\right)

derived using the second-order Taylor series expansion, where we will refer to this adaptive controller as SHL-TS2.

As we see in the tracking plots of Figure 5-6, the SHL adaptive controllers achieve similar performance for each sinusoidal maneuver and do not improve with repeated passes due to the lack of learning capability in the traditional SHL adaptive architecture. Also, note that SHL-TS2 performs similarly to SHL-TS1 while using the same learning rates and active nodes during the simulation. A tracking performance comparison of SHL-TS1 while varying the adaptive learning rates is shown in the error plots of Figure 5-6, where $\|e\|$ is defined as the 2-norm of $e = x - x_{ref}$ and $\bar{e}$ is defined by the following equation: $\bar{e} = e^T P B$.

5.6.2 Sparse Neural Network (SNN) Adaptive Control

Since the frequencies of the flexible modes explicitly depend on temperature and the flexible modes impact the moment and force equations, we will utilize the 2-D SNN


Figure 5-7. Hypersonic two-dimensional flight envelope partition in (A) zoomed-in and (B) bird's-eye form.

architecture shown in Figure 5-7, where each red rectangle signifies the border of a segment in the flight envelope and each blue X indicates the location of a node in the SNN architecture. We utilize the same values for the learning rates and number of active nodes as previously specified. We chose to divide the flight envelope into $T = 4050$ segments with $Q = 4$ nodes allocated to each segment, where each segment operates using 12 nodes. A close-up view of an arbitrary operating segment can be seen in Figure 5-7, where the magenta circles denote the active nodes of the current segment.

Similar to the previous subsection, we will compare the performance of two adaptive controllers utilizing the same architecture (i.e., SNN), learning rates, commands, reference model, and architecture parameters (i.e., number of active nodes $R$, number of total nodes $N$, number of nodes per segment $Q$, and number of total segments $T$), but vary the form of the adaptive update laws. First we consider an adaptive controller with the adaptive update laws exercised in Chapter 4, in the following form:

\dot{\hat{W}}_i = \mathrm{Proj}\!\left(\Gamma_W\left(\sigma(\hat{V}_i^T\mu) - \sigma'(\hat{V}_i^T\mu)\hat{V}_i^T\mu\right)e^T P B\right)
\dot{\hat{V}}_i = \mathrm{Proj}\!\left(\Gamma_V\,\mu\,e^T P B\,\hat{W}_i^T\sigma'(\hat{V}_i^T\mu)\right)


Figure 5-8. Hypersonic sparse neural network transient performance including (A) tracking performance and (B) error tracking.

where we refer to this controller as SNN-TS1. The second type of controller, which we refer to as SNN-TS2, uses the adaptive control laws specified in (5) to (5).

The tracking plots of Figure 5-8 show the superior transient performance of the SNN over the SHL approach due to the improved learning architecture. They also demonstrate the improved learning performance when the learning rates are increased. The controller clearly improves in performance with each repeated maneuver. In addition to the excellent tracking of the SNN controller, the SNN has the ability to retain reasonable estimates of the matched system uncertainty throughout the flight envelope. See Figure 5-9, where the uncertainty estimate for the SNN was obtained after the simulation by sweeping the neural network input $(e_I, \alpha, q)$ across the flight envelope. In this case, we set $e_I = q = 0$ and varied the angle of attack $\alpha$. It is worth noting that we chose $r_0 = 1$, a robust-control error threshold of $0.05$, and $U_i = 5$ for the parameters of the robust control term. However, for this analysis the robust control term never became necessary because the norm of the tracking error never eclipsed the robust control error threshold.

In order to properly compare the performance of the adaptive controllers with various learning rates, denoted SHL-S, SHL-M, SNN-S, and SNN-M, we created a table which quantifies the tracking performance of each controller over each pass. That is, for


Figure 5-9. Hypersonic sparse neural network (SNN) matched uncertainty estimation.

each complete sinusoidal maneuver (i.e., pass), we calculate the norm of the error for that time interval ($TE_p$) using the following equation:

TE_p = \sum_{t = Ts_p}^{Tf_p} \|e_p(t)\|_2, \quad p = 1, \ldots, 6

where $Ts_p$ and $Tf_p$ define the bounds of the time interval and $e_p$ is the tracking error at time $t$, where $e_p(t) = x(t) - x_{ref}(t)$. The results are shown in Table 5-2.

5.7 Summary

By using small learning rates and a relatively small number of neurons, we were able to control a sophisticated HSV model with flexible body effects using the SHL and SNN architectures. The SNN architecture provided superior tracking and learning performance due to its sparse architecture. The innovative adaptive control effectiveness term was proven to nullify the effect of the control degradation on each portion of control. Through the use of a robust control term and a dwell time condition, we were able

Table 5-2. Tracking performance comparison of SHL versus SNN

         TE_1     TE_2     TE_3     TE_4     TE_5     TE_6
SHL-S    145.07   141.14   141.10   141.11   147.05   141.00
SHL-M    136.03   134.68   134.68   134.65   134.65   134.71
SNN-S    114.09   91.79    83.66    78.01    70.13    61.92
SNN-M    100.79   81.29    74.69    50.48    46.09    42.38
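The per-pass metric $TE_p$ can be computed directly from a logged trajectory; a minimal sketch (array shapes and the pass period are assumptions, not taken from the dissertation's data files):

```python
import numpy as np

def pass_errors(x, x_ref, t, period):
    """Per-pass tracking metric TE_p: sum of ||e_p(t)||_2 over each complete
    sinusoidal maneuver (pass), with e_p = x - x_ref.
    x, x_ref are (T, n) state histories; t is the (T,) time vector."""
    e = np.linalg.norm(x - x_ref, axis=1)          # ||e_p(t)|| at each sample
    n_pass = int(np.ceil(t[-1] / period))
    TE = []
    for p in range(n_pass):
        mask = (t >= p * period) & (t < (p + 1) * period)
        TE.append(float(e[mask].sum()))
    return TE
```

For the simulation of this section, `period` would be one full command cycle (roughly $1/0.24$ Hz $\approx 4.2$ s scaled to cover the six maneuvers over 250 s), producing the six entries per row of Table 5-2.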


to provide a complete Lyapunov stability analysis for the adaptive switching laws of the SNN. Future work includes implementing various forms of the sparse adaptive controller on a variety of vehicle dynamics while utilizing different activation functions and learning rates in the adaptive controller.


CHAPTER 6
DEVELOPMENT OF A ROBUST DEEP RECURRENT NEURAL NETWORK CONTROLLER FOR FLIGHT APPLICATIONS

This chapter presents a novel approach for training a deep recurrent neural network (RNN) to control a highly dynamic nonlinear flight vehicle. We analyze the performance of the deep RNN using a time and phase portrait analysis. The superior performance of the deep RNN is demonstrated against a well-tuned gain-scheduled LQR design.

6.1 Deep Learning-Based Flight Control Design

We are interested in designing a deep recurrent neural network controller that ensures the selected plant output state, $y_{sel} \in y$, successfully tracks the reference output, $y_{ref}$. The reference output is produced by an oracle which is designed to specify the desired closed-loop tracking performance of the system under nominal conditions [16]. In other words, the oracle transforms $y_{cmd}$, which is typically generated by a guidance law, into a reference command, $y_{ref}$. By using the oracle to produce the desired ideal behavior, we can transform the control problem into a supervised learning problem. The goal is to design the deep RNN controller to achieve closed-loop tracking performance that is close to the oracle, regardless of numerous aerodynamic uncertainties and nonlinear effects, while remaining robust to disturbances and noise.

For this chapter, a gain-scheduled LQR controller with servomechanism action was designed to provide oracle trajectories. For further details, the reader is referred to [16], [73], and [80].

For a system-level view, we now consider the block diagram shown in Figure 6-1. The flight vehicle plant dynamics are controlled by a deep recurrent neural network (RNN) which takes a vector of states and commands and produces a control command, $u$. The control input to the plant, $u_{act}$, is produced by a second-order actuator with input $u$, natural frequency $\omega_n$, and damping factor $\zeta$.
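The second-order actuator between $u$ and $u_{act}$ can be sketched as a discrete-time simulation. The natural frequency, damping, and time step below are illustrative placeholders, not the dissertation's values:

```python
import numpy as np

def actuator_step(state, u_cmd, wn=50.0, zeta=0.7, dt=0.01):
    """One Euler step of the second-order actuator dynamics
    u_act_ddot = wn^2 (u_cmd - u_act) - 2 zeta wn u_act_dot."""
    u_act, u_rate = state
    u_acc = wn ** 2 * (u_cmd - u_act) - 2.0 * zeta * wn * u_rate
    return np.array([u_act + dt * u_rate, u_rate + dt * u_acc])

def actuator_response(u_cmds, dt=0.01):
    """Drive the actuator with a command sequence; returns the u_act history."""
    state = np.zeros(2)
    out = []
    for u in u_cmds:
        state = actuator_step(state, u, dt=dt)
        out.append(state[0])
    return np.array(out)
```

In the closed-loop simulations, the RNN controller output $u$ would be fed through such a model at every time step before acting on the plant, so the training process implicitly accounts for actuator lag.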


Figure 6-1. Deep learning control block diagram for RNN/GRU.

6.2 Flight Vehicle Model

This section details the plant dynamics block shown in Figure 6-1. We consider a highly nonlinear vehicle model composed of a cylindrical body with four independent servo-controlled fins ($\delta_1, \delta_2, \delta_3, \delta_4$) oriented in an X-configuration [73]. The flight model was obtained from a traditional aerodynamic design tool by sweeping over a large range of flight conditions. Aerodynamic coefficients were computed from simulation data and indexed by angle of attack $\alpha$, angle of side-slip $\beta$, Mach, and altitude $h$. For convenience, we created virtual fins (aileron $\delta_a$, elevator $\delta_e$, and rudder $\delta_r$) used for control design in the conventional autopilot reference frame. We created a static mapping from the virtual fins to actual fin displacement in order to calculate the forces and moments of the flight vehicle (see [73]).

In this study, we consider only the longitudinal dynamics of the flight vehicle, which are assumed to be decoupled from the lateral dynamics. The longitudinal 3-DoF equations of motion for a flight vehicle take the following form (see [6, 74]):

\dot{V}_T = \frac{1}{m}\left(T\cos\alpha - D\right) - g\sin(\theta - \alpha)
\dot{\alpha} = \frac{1}{mV_T}\left(-T\sin\alpha - L\right) + q + \frac{g}{V_T}\cos(\theta - \alpha)
\dot{\theta} = q
\dot{q} = \frac{M}{I_{YY}}
\dot{h} = V_T\sin(\theta - \alpha)


where $m$ is the mass of the vehicle, $I_{YY}$ is the moment of inertia, and $T, D, L$, and $M$ are forces and moments. The state vector, $x \in \mathbb{R}^5$, is given by

x = [V_T, \alpha, \theta, q, h]

where $V_T$ is the true airspeed, $\alpha$ is the angle of attack, $\theta$ is the pitch angle, $q$ is the pitch rate, and $h$ is the altitude of the flight vehicle. We define the output vector as

y = [\alpha, q, A_z, \bar{q}]

where $\bar{q}$ is the dynamic pressure and $A_z$ is the vertical acceleration.

We assume the following mapping from axial and normal forces $(A, N)$ to lift and drag forces $(L, D)$ used in (6) to (6):

L = N\cos\alpha - A\sin\alpha
D = N\sin\alpha + A\cos\alpha

where the body forces $(A, N)$ and pitching moment $M$ can be approximated by (see [73, 74]):

A(t) \approx \frac{1}{2}\rho V_T^2 S C_A
N(t) \approx \frac{1}{2}\rho V_T^2 S C_N
M(t) \approx \frac{1}{2}\rho V_T^2 S c_{ref} C_m

where $\rho$ denotes the air density, $S$ is the reference area, $c_{ref}$ is the mean aerodynamic chord, and we assume zero thrust (i.e., $T = 0$).

The axial force coefficient $C_A$, normal force coefficient $C_N$, and pitch moment coefficient $C_m$ appearing in (6) to (6) take the following form:

C_A = C_{A_{ALT}}(h, M) + C_{A_{AB}}(\alpha, M) + \sum_{i=1}^{4} C_{A_{\delta_i}}(\alpha, M, \delta_i)


C_N = C_{N_0}(\alpha, M) + \sum_{i=1}^{4} C_{N_{\delta_i}}(\alpha, M, \delta_i)
C_m = C_{m_0}(\alpha, M) + \sum_{i=1}^{4} C_{m_{\delta_i}}(\alpha, M, \delta_i) + C_{m_q}(\alpha, M, q) + \Delta_q q + \Delta_\alpha \alpha

where each aerodynamic coefficient component can be computed using a look-up table based on the flight condition $(\alpha, M, h, q)$ and the control inputs $(\delta_1, \delta_2, \delta_3, \delta_4)$. We introduce constant uncertainty parameters $(\Delta_q, \Delta_\alpha)$ in (6) that will be utilized during the control design process and elaborated on in Section 6.3.1.

In order to reduce the computational cost of evaluating each aerodynamic look-up table during run-time, polynomial approximations were produced. The terms selected for approximation include atmospheric variables (e.g., speed of sound and air density), inverse airspeed ($1/V_T$), and each aerodynamic coefficient component. For each term, the order of the polynomial fit was adjusted by evaluating the accuracy and smoothness of the approximation. In addition, the trigonometric functions (e.g., $\sin(\theta - \alpha)$) were approximated using Taylor series expansions. In order to increase the accuracy of the model while reducing the order of the polynomials, we created several polynomial models throughout the flight envelope. The reader is referred to [91] for more details regarding the use of polynomial approximations of aircraft dynamics. An example component of an aerodynamic coefficient that was fitted with a 3rd-order polynomial is shown in Figure 6-2. The mesh represents the value of the coefficient after evaluating the polynomial, while the circles denote the value of the coefficient evaluated at the breakpoints of the look-up tables.

The complete polynomial model for the longitudinal dynamics of the flight vehicle is given by

\dot{x} = f(x, \Lambda_u u_{act} + \Delta_u, \Delta_\alpha, \Delta_q)
y = f(x, \Lambda_u u_{act} + \Delta_u, \Delta_\alpha, \Delta_q)


Figure 6-2. Example coefficient polynomial fit.

where the elevator deflection $\delta_e$ is the control input $u_{act}$, the state vector $x$ is composed of the five states listed in (6), $\Lambda_u$ is a multiplicative control effectiveness term, $\Delta_u$ is a constant additive control uncertainty, and $\Delta_\alpha, \Delta_q$ represent additive uncertainty parameters that directly affect the aerodynamic coefficients. The constant uncertainty parameters $(\Delta_\alpha, \Delta_q, \Lambda_u, \Delta_u)$ are included as inputs to the flight vehicle model for control design purposes and will be elaborated on in Section 6.3.

For clarity of the control design process, we transform the general form of the flight vehicle model, shown in (6), into discrete-time form [42]:

x_{t+1} = f(x_t, \Lambda_u u_{act}(t) + \Delta_u + d_u, \Delta_\alpha, \Delta_q) + \eta_p
y_t = f(x_t, \Lambda_u u_{act}(t) + \Delta_u + d_u, \Delta_\alpha, \Delta_q) + \eta_p

where $x_t$ are the states of the plant at time $t$, $u_{act}(t)$ is the actuator output at time $t$, and $y_t$ is the output vector at time $t$. Notice we also include $\eta_p$ and $d_u$ as plant noise and input disturbance terms, respectively.

6.3 Deep Recurrent Neural Network Controller

For this chapter, the RNN controller block shown in Figure 6-1 utilizes a stacked recurrent neural network (S-RNN) architecture [67]. A diagram of the S-RNN architecture is shown in Figure 6-3, where $u_t$ denotes the control command at time $t$ and $H_i(t)$


Figure 6-3. Two-layer stacked recurrent neural network (S-RNN).

signifies a hidden node in the $i$th layer at time $t$. Each hidden node $H_i$ is a function (or set of functions) that produces a hidden state, $s_t$. For instance, when using the traditional form of a recurrent neural network, the hidden state $s_t = f(Uc_t + Ws_{t-1})$ is calculated at each hidden node using the current input $c_t$, the previous hidden state $s_{t-1}$, and the RNN controller parameters $\theta = [U, W]$. These hidden states are then passed to adjacent nodes as indicated in Figure 6-3.

For this chapter, the input to the RNN controller at time $t$, $c_t$, takes the following form:

c_t = [e_I, \alpha, q, \bar{q}]

where $q$ is the pitch rate, $\bar{q}$ is the dynamic pressure of the vehicle, and $e_I$ is the integral error of tracking, $e_I = \int_0^{t_f}(y_{sel} - y_{cmd})\,dt$.

As mentioned in the introduction, the traditional RNN structure has difficulty in learning long-term dependencies due to the vanishing gradient problem (see [64, 92]). In order to combat this problem and encourage long-term learning, we utilize gated recurrent unit (GRU) modules at the hidden nodes [66]. Algorithm 6.1 states the equations by which the GRU modules operate, where $\theta_i = [U_u, W_u, U_r, W_r, U_h, W_h, b_1, b_2, b_3]$ is a matrix of parameters for each GRU module indexed by $i$, and $z, r, h$ are internal states of the module [63].
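The GRU computation stated in Algorithm 6.1 can be written as a runnable forward step. The sketch below follows the standard GRU form with sigmoid gates; the random initialization merely stands in for a trained $\theta_i$:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(c_t, s_prev, p):
    """One GRU update following Algorithm 6.1; p holds the module
    parameters [U_u, W_u, U_r, W_r, U_h, W_h, b_1, b_2, b_3]."""
    z = sigmoid(c_t @ p["Uu"] + s_prev @ p["Wu"] + p["b1"])   # update gate
    r = sigmoid(c_t @ p["Ur"] + s_prev @ p["Wr"] + p["b2"])   # reset gate
    h = np.tanh(c_t @ p["Uh"] + (s_prev * r) @ p["Wh"] + p["b3"])
    return (1.0 - z) * h + z * s_prev                         # new hidden state s_t

def init_gru(n_in, n_hidden, rng):
    """Small random weights, zero biases (an illustrative initialization)."""
    scale = 1.0 / np.sqrt(n_hidden)
    p = {k: rng.uniform(-scale, scale, (n_in, n_hidden)) for k in ("Uu", "Ur", "Uh")}
    p.update({k: rng.uniform(-scale, scale, (n_hidden, n_hidden)) for k in ("Wu", "Wr", "Wh")})
    p.update({k: np.zeros(n_hidden) for k in ("b1", "b2", "b3")})
    return p
```

Stacking two such modules and appending the linear readout $u_t = s_t V + b_o$ reproduces the two-layer S-RNN controller structure of Figure 6-3.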


Algorithm 6.1 Gated Recurrent Units (GRU)
1: $z = \sigma(c_t U_u + s_{t-1} W_u + b_1)$
2: $r = \sigma(c_t U_r + s_{t-1} W_r + b_2)$
3: $h = \tanh(c_t U_h + (s_{t-1} \circ r) W_h + b_3)$
4: $s_t = (1 - z) \circ h + z \circ s_{t-1}$
5: $\circ$ represents element-wise multiplication

The output nodes of the RNN in Figure 6-3 produce the control commands $u_t$ issued to the actuators at time $t$. The controller output is calculated by

u_t = s_t V + b_o

where $s_t$ is acquired from the output of the last hidden layer's GRU module and $V, b_o$ are parameters of the controller. We can now define $\Theta = [\theta_i, \theta_{i+1}, \ldots, \theta_M, V, b_o]$ to contain all the parameters of the controller, where $M$ denotes the total number of hidden layers.

6.3.1 Controller Optimization Procedure

The goal of the control design is to find a set of parameters, $\Theta^*$, of an RNN/GRU controller that minimizes a predefined cost function of the following form [39]:

J = \sum_{t=0}^{t_f} \gamma^t\, \mathbb{E}\left[\ell(x_t, u_t) \mid \pi_u\right]

where $\ell$ is the immediate cost, $\pi_u$ is the policy (controller) parameterized by $\Theta$, $\Theta^*$ is the ideal set of parameters, $t_f$ is the time duration, and $\gamma$ is a constant discount term.

We define an estimate of the overall cost function with a diverse range of initial conditions, disturbances, uncertainties, and commands as

\tilde{J} = \frac{1}{N}\sum_{i=1}^{N} J_i

where $N$ is the total number of sample trajectories. For each trajectory, we introduce uncertainty in the system through a set of constant parameters $(\Delta_\alpha^i, \Delta_q^i, \Lambda_u^i, \Delta_u^i)$, where $i$ refers to the $i$th trajectory. The non-zero constant parameters $(\Delta_\alpha, \Delta_q)$ create additive


aerodynamic uncertainties, shown explicitly in (6). The constant parameter $\Lambda_u$ acts as a control effectiveness term, and the parameter $\Delta_u$ is an additive matched uncertainty, shown in (6).

In order to create the sample trajectories, we define $R_\alpha, R_q, R_u, R_{\Delta u} \sim U[\text{min}, \text{max}]$ to be a vector of uniformly distributed random variables with fixed min/max values that are determined a priori based on system requirements. We then draw $N$ samples from each random variable and assign each sample to a particular trajectory. More precisely, $\Delta_\alpha = \{\Delta_\alpha^1, \Delta_\alpha^2, \ldots, \Delta_\alpha^N\}$, $\Delta_q = \{\Delta_q^1, \Delta_q^2, \ldots, \Delta_q^N\}$, $\Lambda_u = \{\Lambda_u^1, \Lambda_u^2, \ldots, \Lambda_u^N\}$, and $\Delta_u = \{\Delta_u^1, \Delta_u^2, \ldots, \Delta_u^N\}$ are the sets of samples drawn from the uniform random variables $R_\alpha, R_q, R_u$, and $R_{\Delta u}$, respectively.

Since the goal is to satisfy robustness and performance constraints simultaneously, we create two types of sample trajectories with distinct labels. Each trajectory's label (performance (P) or robustness (R)) determines which elements (e.g., noise, uncertainties, and disturbances) are active during optimization and the form of the cost function we are minimizing. For each performance trajectory (P), the RNN/GRU controller parameters are optimized to track the oracle's nominal trajectory from a variety of initial conditions with penalties on high control rates and tracking error, while small to moderate aerodynamic uncertainties are included in the nonlinear plant. For each robust trajectory (R), the RNN/GRU controller uses a variant of time-varying funnels to track the oracle's nominal trajectory within predefined bounds, while large aerodynamic uncertainties, noise, and disturbances are included in the plant dynamics during optimization. We define a set of time-varying funnels by

\mathcal{F}_u(t) = \{x \in \mathbb{R}^n \mid U(x, t) \leq b_u(t)\}
\mathcal{F}_e(t) = \{x \in \mathbb{R}^n \mid E(x, t) \leq b_e(t)\}

where $b_u(t), b_e(t)$ define the boundaries of the funnels at time $t$. In this chapter, we set $E(x, t) = |e_t|$ and $U(x, t) = |\dot{u}_t|$, where $|\cdot|$ denotes the absolute value, the optimization


tracking error is defined by $e_t = y_{sel} - y_{ref}$, and the estimated control rate is calculated by $\dot{u}_t = (u_t - u_{t-1})/\Delta t$. We also choose to select the time-varying funnel bounds $b_u(t), b_e(t)$ as constants (i.e., $b_u(t) = b_u$ and $b_e(t) = b_e$).

We now define the cost function for each trajectory, $i$, in the following form:

J_i = \sum_{t=0}^{t_f} \gamma^t\, \ell(x_t, u_t)

where the form of the instantaneous performance measurement calculation, $\ell(x_t, u_t)$, is determined based on each trajectory's label. That is, the instantaneous performance measurement is given by

\ell(x_t, u_t) = \begin{cases} k_1 e_t^2 + k_2 f_u^2 & \text{if label} = P \\ k_3 f_e^2 + k_4 f_u^2 & \text{if label} = R \end{cases}

where $k_1, k_2, k_3$, and $k_4$ are positive constants. The funnel tracking error $f_e$ and funnel control rate error $f_u$ at time $t$ are defined by

f_e = \max(|e_t| - b_e, 0)
f_u = \max(|\dot{u}_t| - b_u, 0)

where $b_u, b_e$ are the constant bounds of the funnels.

As an additional goal of the control design, we seek to establish nonlinear stability margins through the use of the control effectiveness term, $\Lambda_u$. We define the stability margin criteria as the amount of amplification/attenuation that a controller can handle before the system becomes unstable. That is, a nonlinear system with an asymptotically stabilizing control law, $u = u_{BL}$, possesses stability margins $(SM_{min}, SM_{max})$, where $-1 \leq SM_{min} \leq SM_{max} \leq \infty$, if for every $SM \in [SM_{min}, SM_{max}]$ the control input $u = (1 + SM)u_{BL}$ also asymptotically stabilizes the system [93]. For sample-based optimization, we relax the asymptotic tracking condition. Instead, we consider a system to have acceptable stability margins, $SM \in [SM_{min}, SM_{max}]$, if each


Algorithm 6.2 Incremental Training Procedure
1: Randomly initialize controller parameters $\Theta$
2: STEP 1: Optimize $\Theta$ for the RNN/GRU using cost function (6) with linear dynamics and labels = P
3: STEP 2: Re-optimize $\Theta$ using nonlinear dynamics, labels = P, with small uncertainties in aerodynamics
4: STEP 3: Re-optimize $\Theta$ using nonlinear dynamics, labels = R, P, with uncertainties, disturbances, and noise

trajectory generated during training remains between the pre-defined funnel bounds for all $t \in [0, t_f]$. That is, the control effectiveness term, $\Lambda_u$, was sampled in the range $(1 + SM_{min}) \leq \Lambda_u \leq (1 + SM_{max})$ in order to establish the stability margins $(SM_{min}, SM_{max})$.

6.3.2 Incremental Training Procedure

Since the optimization problem that we are solving is non-convex, the solution is only guaranteed to converge to a local minimum. For this reason, verification is necessary to guarantee the system adheres to all the previously specified constraints. In order to assist policy learning and avoid parameter convergence to poor local minima, we developed an incremental training procedure detailed in Algorithm 6.2. This procedure was inspired by layer-wise training methods in deep learning [55] and resulted in the best overall results using the RNN/GRU controller architecture. Note, before executing STEP 1, the parameters $\Theta$ were initialized randomly based on methods discussed in [55]. Analysis of the optimized controllers is provided in Section 6.4.

6.4 Flight Control Simulation Results

Our goal for this section is to describe the training details of the RNN/GRU controller and provide a comparison of the performance and robustness of the RNN/GRU controller to that of a typical gain-scheduled (GS) baseline controller.

6.4.1 RNN/GRU Controller Optimization

The RNN/GRU controller parameter optimization was performed using 2,970 sample trajectories. For best results, the sample trajectories were divided into 540


Table 6-1. Range of initial conditions and uncertainty variables

                       MIN    MAX
$\alpha_0$ [deg]       -25    25
$q_0$ [deg/sec]        -75    75
$\text{Mach}_0$        1.0    2.0
$\text{alt}_0$ [km]    7      14

performance trajectories and 2,430 robust trajectories. Depending on the label, each trajectory was subjected to different disturbances, uncertainties, and noise while optimizing the gains of the controller according to the pre-defined cost function described in Section 6.3.1. The initial conditions for each trajectory were selected to lie in the ranges displayed in Table 6-1, where $\alpha_0$ is the initial angle of attack, $q_0$ is the initial pitch rate, $\text{Mach}_0$ is the initial Mach number, and $\text{alt}_0$ is the initial altitude. Each trajectory used in training was $t_f = 2.5$ sec in duration, where the plant dynamics were discretized with $\Delta t = 0.01$ time steps using an AB-2 integration scheme. The controller optimization was performed using angle of attack step command tracking (i.e., $y_{sel} = \alpha$, $y_{cmd} = \alpha_{cmd}$), where we set the bounds of the funnels to $b_e = 0.05$ and $b_u = 1.74$.

Numerous control architectures were explored using the plant dynamics described in the previous sections. We found that a 2-hidden-layer network with 25 hidden nodes was sufficient when using the RNN/GRU architecture. Using three hidden layers slightly increased performance, but required significantly more optimization time and additional sample trajectories to meet requirements. It is interesting to note that several different architectures (e.g., a time-delayed feed-forward network, an RNN) succeeded in certain aspects of optimization but clearly lacked the sophistication necessary to complete this task in its entirety. In fact, training the RNN/GRU controller architecture with the

Table 6-2. Cumulative error (CTE), control rate (CCR), and final cost

                   CTE       CCR       Cost
3-Layer RNN/GRU    339.15    100.16    1.0438
2-Layer RNN        359.28    313.64    2.4970
GS                 1000.21   1180.04


sequential initialization scheme presented in Section 6.3.2 produced the best results. This experiment used the L-BFGS optimization method for training the deep recurrent network. L-BFGS is a quasi-Newton second-order batch method which has produced state-of-the-art results on both regression and classification tasks [55]. Table 6-2 shows the value of the final estimated cost function after completing parameter optimization for several controllers.

As mentioned previously, the minimum and maximum of each uniformly distributed variable $(R_\alpha, R_q, R_u, R_{\Delta u})$ used to produce sample trajectories for controller optimization were pre-defined and are shown in Table 6-3. The bounds of the control effectiveness variable $\Lambda_u$ were set to establish stability margins of $(-0.75, 2.0)$.

6.4.2 Analysis and Results

In order to compare the GS and RNN/GRU controllers, we are interested in determining a nonlinear short-period model of the longitudinal dynamics for the flight vehicle described previously. In order to accomplish this task, we decoupled the short-period and phugoid modes of the vehicle by substituting trim values for $V_T$, $h$, and $\theta$ into (6). The augmented polynomial short-period model is given by

\dot{e}_I = \alpha - \alpha_{cmd}
\dot{\alpha} = f(\alpha, q, \delta_e, \Delta_\alpha, \Delta_q, \Lambda_u, \Delta_u)
\dot{q} = f(\alpha, q, \delta_e, \Delta_\alpha, \Delta_q, \Lambda_u, \Delta_u)

where the state vector includes the integral error of tracking, angle of attack, and pitch rate (i.e., $x = [e_I, \alpha, q]$). The input to the model is the elevator deflection ($u_{act} = \delta_e$) produced

Table 6-3. Range of initial conditions and uncertainty variables

               MIN     MAX
$R_\alpha$     -0.5    0.1
$R_q$          -7      5
$R_u$          -5      5
$\Lambda_u$    0.25    3.0


by the actuator, and the constant uncertainty parameters $(\Delta_\alpha, \Delta_q, \Lambda_u, \Delta_u)$ are included to assess robustness. For the sake of brevity, we selected a region surrounding a specific trim point in the flight envelope to analyze (i.e., $\text{Mach} = 1.7$, $\alpha = 0$, and $h = 14$ km). For this study, we set the commanded angle of attack $\alpha_{cmd}$ to 0.

Next, we obtained various analysis models by selecting different values for the constant uncertainty parameters (i.e., $\Delta_\alpha, \Delta_q, \Lambda_u, \Delta_u$) that lie in the ranges shown in Table 6-3 and substituting them directly into (6). This analysis was performed by simulating 32 trajectories for each analysis model from various initial conditions. The initial conditions were chosen to span a range that slightly exceeded the range used for training (i.e., $\alpha_0 = [-30, 30]$ degrees and $q_0 = [-100, 100]$ degrees per second). In order to visualize a large number of system trajectories for each analysis model, we employ phase portraits and tracking performance plots, shown in Figures 6-4 through 6-7. The analysis models used to generate the plots were chosen to demonstrate the GRU/RNN improvement over the GS controller in extreme operating conditions.

In order to quantify the tracking performance of each controller, we calculate the average tracking error, $ATE = \sum_{i=1}^{N}\sum_{t=1}^{t_f}|e_t|/N$, and the average control rate, $ACR = \sum_{i=1}^{N}\sum_{t=1}^{t_f}|\dot{u}_t|/N$, for each analysis model used to generate Figures 6-4 through 6-7. We define the cumulative tracking error (CTE) as the summation of the ATEs and the cumulative control rate (CCR) as the summation of the ACRs (Table 6-2).

Traditionally, when training a deep neural network for control tasks, the goal is simply to drive the selected state, $y_{sel}$, to the commanded state, $y_{cmd}$. We found that by using performance trajectories (P) and robustness trajectories (R) during training, the deep RNN controller performed significantly better than when trying to solve the typical performance problem (P) for each trajectory regardless of the circumstances. For instance, the optimization sequence was not able to converge to an acceptable solution when using a traditional form of the cost function while including noise and wind gusts in the sample
trajectories. However, controller parameter convergence significantly improved by using the optimization procedure discussed in Section 6.3.1.

One drawback to deep learning is the execution time. We implemented the deep learning program in MATLAB using a 3.1 GHz PC with 256 GB RAM and 10 cores. It took approximately $1.45 \times 10^5$ seconds (or 40 hours) on average to complete Algorithm 6.2 in its entirety. Fortunately, after the first step, each successive step was significantly shorter in execution time. This drawback results in fewer test cases due to the large computation requirement. However, using a platform that is specifically designed for deep learning and utilizing the GPU is likely to decrease the average computation time.

6.5 Summary

We have presented a novel approach for training deep recurrent neural networks for flight control applications. We satisfied robustness and performance metrics concurrently by training the RNN/GRU controller using sample trajectories that contain disturbances, additive aerodynamic uncertainties, and control degradation in the plant dynamics. We demonstrated the performance of the trained RNN/GRU controller on a highly nonlinear agile flight vehicle against a traditional gain-scheduled LQR design.

Figure 6-4. Phase portrait analysis for (A) GS and (B) RNN/GRU with uncertainty values $\lambda_u = 0.25$, $\theta_u = 0$, $\theta_\alpha = 0$, and $\theta_q = 0$
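As a concrete illustration, the ATE/ACR/CTE/CCR bookkeeping defined above reduces to a few lines. The sketch below assumes each trajectory is logged as an array of tracking errors $e_t$ and control rates $\dot{u}_t$; the helper names and array layout are ours, not from the original implementation:

```python
import numpy as np

def average_metrics(errors, control_rates):
    """ATE and ACR for one analysis model: sum |e_t| and |u_dot_t| over
    every logged trajectory, then divide by the number of trajectories N."""
    n = len(errors)
    ate = sum(float(np.abs(e).sum()) for e in errors) / n
    acr = sum(float(np.abs(du).sum()) for du in control_rates) / n
    return ate, acr

def cumulative_metrics(per_model_metrics):
    """CTE and CCR: summations of the per-analysis-model ATEs and ACRs."""
    cte = sum(m[0] for m in per_model_metrics)
    ccr = sum(m[1] for m in per_model_metrics)
    return cte, ccr
```

Each analysis model contributes one (ATE, ACR) pair; summing the pairs across the models of Figures 6-4 through 6-7 gives the cumulative scores reported in Table 6-2.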
Figure 6-5. Phase portrait analysis for (A) GS and (B) RNN/GRU with uncertainty values $\lambda_u = 0.75$, $\theta_u = 0$, $\theta_\alpha = 0.025$, and $\theta_q = 5$

Figure 6-6. Phase portrait analysis for (A) GS and (B) RNN/GRU with uncertainty values $\lambda_u = 0.5$, $\theta_u = 0$, $\theta_\alpha = 0.05$, and $\theta_q = 2.5$

Figure 6-7. Traditional step responses for (A) GS and (B) RNN/GRU with uncertainty values $\lambda_u = 0.5$, $\theta_u = 0$, $\theta_\alpha = 0.05$, and $\theta_q = 2.5$
CHAPTER 7
DEVELOPMENT OF A DEEP AND SPARSE RECURRENT NEURAL NETWORK HYPERSONIC FLIGHT CONTROLLER WITH STABILITY MARGIN ANALYSIS

In this chapter, we extend the sparse neural network (SNN) concept for adaptive control to a deep recurrent neural network. The sparse deep learning controller (S-DLC) is trained using a variant of the optimization procedure described in the previous chapter. In this chapter, we aim to train the S-DLC to establish stability and time delay margins. The stability margins are analyzed around an equilibrium point using region of attraction (ROA) estimation via forward reachable sets. Simulation results demonstrate the effectiveness of the control approach.

7.1 Deep Learning Controller

The deep learning controller described in the subsequent sections is trained to ensure the selected plant output state, $y_{sel} \in y$, successfully tracks the reference output, $y_{ref}$. The reference output is produced by an oracle, which is designed to specify the closed-loop tracking performance of the system under ideal conditions, i.e., no uncertainties or disturbances. Inspired by model reference adaptive control methods, we use a series of linear quadratic regulator (LQR) controllers to provide oracle trajectories [16].

The block diagram of the closed-loop system is shown in Figure 7-1. The diagram shows that the flight vehicle dynamics are controlled by a deep learning controller (DLC), which passes a control command, $u$, to the actuator. The actuator produces control deflections, denoted $u_{act}$, which act on the flight vehicle dynamics.

Figure 7-1. Deep learning control closed-loop block diagram.
7.1.1 Controller Architecture

Similar to our previous work, we will use a stacked recurrent neural network (S-RNN) for the architecture of the DLC (see [67, 94]). In order to combat the well-known vanishing gradient problem and encourage long-term learning, we will employ gated recurrent unit (GRU) modules at the hidden nodes [66].

For the reader's convenience, an example of a two-layer S-RNN architecture is shown in Figure 7-2, where $u_t$ denotes the control command at time $t$ and $H^i_t$ signifies a hidden node in the $i$th layer at time $t$. Each hidden layer, $H^i$, is a set of functions that output a hidden state, $s_t$, at each discrete time $t$. The input of each hidden node, denoted $i_t$ in the equations, is received from the previous layer. For instance, the inputs of $H^1_t$ and $H^2_t$ of Figure 7-2 are $c_t$ and $s^1_t$, respectively.

As mentioned previously, each hidden node (i.e., GRU module) acts according to the following equations:

$z_{GRU} = \sigma(i_t U_u + s_{t-1} W_u + b_1)$
$r_{GRU} = \sigma(i_t U_r + s_{t-1} W_r + b_2)$
$h_{GRU} = \tanh(i_t U_h + (s_{t-1} \circ r_{GRU}) W_h + b_3)$
$s_t = (1 - z_{GRU}) \circ h_{GRU} + z_{GRU} \circ s_{t-1}$

where $z_{GRU}, r_{GRU}, h_{GRU}$ are internal states of the module, $\sigma(\cdot)$ denotes the logistic sigmoid, and $\circ$ denotes element-wise multiplication. Notice that the hidden states are sent to the next layer for use as inputs, $i_t$, as well as fed back to the same node at the next time step, $s_{t-1}$. We define $\theta_i = [U_u, W_u, U_r, W_r, U_h, W_h, b_1, b_2, b_3]$ as the matrix of weights for each GRU module indexed by $i$, where $L \in \mathbb{N}$ denotes the total number of hidden layers [63].

The controller output is calculated from

$u_t = s^L_t V + b_o$
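A minimal sketch of one GRU module update and the output map, in NumPy with the row-vector convention used above; the sigmoid gate nonlinearity is the standard GRU choice, and the parameter-dictionary layout is our own:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(i_t, s_prev, p):
    """One hidden-node update: gates z and r, candidate h, new hidden state s_t.
    i_t: (1, n_in) input from the previous layer; s_prev: (1, n_h) hidden state."""
    z = sigmoid(i_t @ p["Uu"] + s_prev @ p["Wu"] + p["b1"])        # update gate
    r = sigmoid(i_t @ p["Ur"] + s_prev @ p["Wr"] + p["b2"])        # reset gate
    h = np.tanh(i_t @ p["Uh"] + (s_prev * r) @ p["Wh"] + p["b3"])  # candidate state
    return (1.0 - z) * h + z * s_prev                              # element-wise mix

def controller_output(s_last, V, b_o):
    """u_t = s_t^L V + b_o, from the last hidden layer's GRU state."""
    return s_last @ V + b_o
```

Stacking layers amounts to feeding each `gru_step` output forward as the next layer's `i_t` while also storing it as that layer's `s_prev` for the next time step.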
Figure 7-2. Stacked deep learning controller (DLC) architecture with two hidden layers

where $V, b_o$ are weights of the controller and $s^L_t$ is acquired from the output of the last hidden layer's GRU module. Lastly, we define $\Theta = [\theta_i, \theta_{i+1}, \ldots, \theta_L, V, b_o]$ as the matrix that contains all the weights of the controller.

7.1.2 Extension to Sparse Neural Network

The sparse neural network (SNN) was originally created as a switched control system approach in model reference adaptive control (MRAC). It was inspired by recent results in distributed sparse learning methods from the deep learning community. The approach aims to utilize controller memory while reducing the computational load on the processor in order to improve the tracking performance of flight vehicles with persistent and significant uncertainties. The reader is referred to Chapter 4 for more details and a description of the SNN [80].

The basic idea behind the SNN is to segment the flight envelope into regions and activate only a small percentage of neurons while operating in each region. The remaining unused neuron weights are frozen until activated by the controller. We consider an $N_D$-dimensional flight envelope, where the dimension of the flight envelope is determined based on flight conditions which impact the performance of the controller (e.g., Mach and angle of attack). We denote the total number of segments in the flight envelope as $T \in \mathbb{N}$, where $S = \{s_1, \ldots, s_T\}$ and $P = \{p_1, \ldots, p_T\}$ are the sets of segments
Algorithm 7.1 Sparse Neural Network Execution
1: receive $x_t$ and the corresponding location in the operating envelope, $x_{op}$
2: recall the previous segment index $j$ and the nearest neighbor graph
3: determine the current operating segment, i.e., $\mathrm{argmin}_{\forall i \in I}\, dist(x_{op}, p_i)$
4: retrieve the indices for the set of active nodes, $E^A_i$, corresponding to $i$
5: use $E^A_i$ to select the appropriate neural weights, $\Theta$, for use in control

and center points, respectively. For future use, we let $I = \{1, \ldots, T\}$ be the index set of the sets $S$ and $P$.

Let $N \in \mathbb{N}$ be the number of hidden layer nodes for each layer, where we allocate $Q = N/T$ nodes to each segment. The spacing of the nodes within each segment is determined by the user. In this research, we assume the spacing of nodes is the same for each layer. Hence, the active node indices for each layer are the same. We can now establish the set of indices of the hidden layer nodes for each segment, $s_i$, denoted by $E_{i \in I} = \{Q(i-1)+1, \ldots, iQ\}$. In addition, we define a particular position in the $N_D$ space for each hidden node, where we let $D_{i \in I} = \{d_1, \ldots, d_N\}$ be the set of Euclidean distances from the $i$th center point to each hidden node location in the flight space. This calculation is performed a priori based on the location of the center points provided by the user. Finally, we can define the set of indices of the active nodes for each segment $s_i$ as $E^A_i$, where $E_i \subseteq E^A_i$. We define the number of active nodes as $N_{act} \in \mathbb{N}$, where $N_{act} \geq Q$. For each index of a center point, $i$, we can determine $E^A_i$ by finding the closest $N_{act}$ hidden nodes to the center point $p_i$. This can be easily accomplished before run-time using the set of Euclidean distances stored in $D_{i \in I}$. In terms of implementation, the DLC operates according to the process described in Algorithm 7.1, where $dist(\cdot)$ calculates the Euclidean distance between two $N_D$ vectors.

For this chapter, we chose an $N_D = 2$ dimensional SNN flight space to train the DLC. We assume the spacing of nodes is the same for each layer. Hence, the active node indices for each layer are equivalent. The Voronoi diagram of the flight space is shown in Figure 7-3, and the simulation results can be seen in subsequent sections.
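The a priori distance table $D_{i \in I}$ and steps 3–4 of Algorithm 7.1 can be sketched as follows, assuming the segment centers and hidden-node locations are given as arrays in the $N_D$-dimensional flight space (the function names are ours):

```python
import numpy as np

def build_active_sets(centers, node_locations, n_act):
    """Precompute E_i^A for every segment: the indices of the n_act hidden
    nodes closest (Euclidean distance) to center point p_i.  Run once,
    before flight, since the center points are fixed by the user."""
    active_sets = []
    for p in centers:
        d = np.linalg.norm(node_locations - p, axis=1)  # the distance set D_i
        active_sets.append(np.argsort(d)[:n_act])
    return active_sets

def current_segment(x_op, centers):
    """Step 3 of Algorithm 7.1: argmin over i of dist(x_op, p_i)."""
    return int(np.argmin(np.linalg.norm(centers - x_op, axis=1)))
```

At run time the controller looks up `active_sets[current_segment(x_op, centers)]` and loads only those neural weights, leaving the remaining weights frozen in memory.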
Figure 7-3. Two-dimensional SNN segmented flight space for deep learning control. Each enclosed colored region of the Voronoi diagram represents the domain of a single segment $s_i$, where the predetermined center points are labeled.

7.1.3 Optimization Procedure

The goal of the optimization procedure is to find a set of weights, $\Theta$, of an $L$-layer DLC in order to minimize a cost function. For our research, the cost function is given by

$J(\Theta) = \sum_{t=0}^{t_f} \gamma^t\, \mathbb{E}\left[\ell(x_t, u_t) \mid u_\Theta\right]$

where $\mathbb{E}[\cdot]$ denotes expected value, $t_f$ is the time duration, $\gamma$ is a constant discount term, $u_\Theta$ represents the controller parameterized by $\Theta$, and $\ell$ is the immediate cost.

Using concepts established in Chapter 6, we can estimate the overall cost function using a sample trajectory-based approach [39]. The estimated cost function takes the following form:

$\tilde{J}(\Theta) = \frac{1}{N_T}\sum_{i=1}^{N_T} J_i$

where $N_T$ denotes the total number of sample trajectories used during training and $J_i$ represents the cost associated with the $i$th trajectory. The sample trajectories are created using a diverse set of initial conditions, uncertain parameters, and time-varying disturbances. In this research, we define the set of uncertain parameters for the $i$th
trajectory as $P^i_c = (\lambda^i_u, \theta^i_u, \tau^i_d)$, where $\lambda^i_u$, $\theta^i_u$, and $\tau^i_d$ represent control effectiveness, additive uncertainty, and time delay, respectively. Similarly, we define the set of time-varying disturbances for the $i$th trajectory as $P_v = (d^i_u, \eta^i_p)$, where $d^i_u$ and $\eta^i_p$ denote input disturbance and plant noise terms, respectively. We assume that, for each sample trajectory, the set of uncertain parameters is held constant. The disturbances stemmed from a variety of sources (e.g., wind, vibration) and were activated randomly for each sample trajectory. See Section 7.2 for a complete description of the dynamics of the hypersonic model.

To begin the optimization procedure, we define the sets of uncertainty parameters for all sample trajectories as $\Lambda_u = \{\lambda^1_u, \lambda^2_u, \ldots, \lambda^{N_T}_u\}$, $P_u = \{\theta^1_u, \theta^2_u, \ldots, \theta^{N_T}_u\}$, and $T_d = \{\tau^1_d, \tau^2_d, \ldots, \tau^{N_T}_d\}$. The sets of uncertainty parameters are obtained by sampling a set of uniformly distributed random variables. The range of each random variable used in training will be described in Section 7.3.

We utilize the robust training process for recurrent neural networks (RNNs) with GRU modules established in Chapter 6, which is summarized here for convenience [94]. The cost function for the $i$th trajectory is given by

$J_i = \sum_{t=0}^{t_f} \omega_t\, \ell(x_t, u_t)$

where $\omega_t \in [1, 2]$ is used to emphasize the importance of convergence near the final time, $t_f$. We now introduce the form of the instantaneous cost function,

$\ell(x_t, u_t) = \begin{cases} k_1 e_t^2 + k_2 f_{\dot{u}}^2 & \text{if label} = P \\ k_3 f_e^2 + k_4 f_{\dot{u}}^2 & \text{if label} = R \end{cases}$

which is determined based on the $i$th trajectory's label, where $k_1, k_2, k_3$, and $k_4$ are user-defined positive constants. The funnel tracking error $f_e$ and funnel control rate error
$f_{\dot{u}}$ at time $t$ are defined by

$f_e = \max(|e_t| - b_e,\ 0)$
$f_{\dot{u}} = \max(|\dot{u}_t| - b_{\dot{u}},\ 0)$

where $b_{\dot{u}}$ and $b_e$ are the constant bounds of the funnels. We define the instantaneous tracking error as $e_t = y_{sel} - y_{ref}$ and the estimated control rate as $\dot{u}_t = (u_t - u_{t-1})/\Delta t$, where $\Delta t$ denotes the time step duration. The trajectory performance ($P$) and robust ($R$) labels are used to balance robustness and performance goals simultaneously. Robust trajectories are designed to track the oracle's reference trajectory within predefined bounds while including significantly large uncertainties, disturbances, and noise in the nonlinear plant during optimization. Additionally, performance trajectories are optimized to track the oracle's reference trajectories while including small aerodynamic uncertainties in the plant dynamics. Both sets of trajectories are penalized for high control rates during operation. For more details regarding the training procedure, see [94].

Inspired by research in safety assurance for flight systems, we seek to establish a time delay margin and a nonlinear stability margin. This goal was indirectly targeted through the use of the control effectiveness, $\lambda_u$, and time delay, $\tau_d$, uncertainties during DLC training. We say a nonlinear system with an asymptotically stabilizing control law, $u = u_{DLC}$, possesses stability margins $(SM_{min}, SM_{max})$, where $-1 < SM_{min} \leq SM_{max} < \infty$, if for every $\Delta_u \in [SM_{min}, SM_{max}]$ the control input $u = (1 + \Delta_u)u_{DLC}$ also asymptotically stabilizes the system [93]. Similarly, we say a discrete nonlinear system has time delay margins $(TD_{min}, TD_{max})$ if for every $\tau_d \in [TD_{min}, TD_{max}]$ the control input $u_t = u_{t-\tau_d}$ also asymptotically stabilizes the system. Notice, $\tau_d$ denotes the number of units of time delay in the closed-loop system. After optimization of the DLC weights, the closed-loop system's stability margins will be verified for particular equilibrium points using ROA
estimation via forward reachable sets in Section 7.3. The time delay margin will be verified using extensive simulations.

7.1.4 Systematic Procedure for Weight Convergence

We introduce a systematic procedure for training a DLC with improved weight convergence. We begin by redefining the weight matrix for the $i$th layer into two separate matrices, where $\theta^i_{FF} = [U_u, U_r, U_h]$ and $\theta^i_{FB} = [W_u, W_r, W_h]$. Notice that $\theta_{FF}$ includes weights of the deep network primarily associated with the input vector, $i_t$, and $\theta_{FB}$ contains neural network weights associated with the recurrent hidden state, $s_{t-1}$. Then, we define a matrix of neural weights as $\Theta_{FF} = [\theta^i_{FF}, \theta^{i+1}_{FF}, \ldots, \theta^L_{FF}, V, b_o, b_1, b_2, b_3]$, where $\Theta_{FF}$ does not involve weights associated with the recurrent hidden state. By eliminating the recurrent hidden state in the GRU module equations stated previously, the DLC becomes a feed-forward network with all weights contained in $\Theta_{FF}$. Hence, the training procedure first optimizes the set of weights in $\Theta_{FF}$ by disregarding the recurrent hidden state input. Then, the optimization is re-run to optimize the feedback weights, $\Theta_{FB}$, using the frozen feed-forward weights from the previous step. Finally, we again re-optimize all the weights in the system together. For each step of the training procedure described above, we utilize the layer-wise training procedure described in Chapter 6; see [94].

7.2 Hypersonic Flight Vehicle Model

In this research, we consider a highly nonlinear hypersonic flight vehicle model with flexible body effects. For flight vehicles operating at lower speeds, these flexible effects are typically ignored due to the natural frequency separation between the rigid body modes and the flexible body modes [16]. We found it desirable to include such effects due to the unexpected control moments that can be generated from the flexing of the fuselage at such high speeds. In addition, operating at hypersonic speeds with a flexible vehicle can cause changes in the pressure distribution on the flight vehicle, resulting in changes in the overall flow field [6, 90].
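To make the cost of Section 7.1.3 concrete, the per-trajectory cost with final-time weighting and the labeled funnel penalty can be sketched as follows. This is a sketch only: the linear ramp used for $\omega_t$ and all of the names are our own choices.

```python
def funnel_cost(e_t, u_rate, label, k1, k2, k3, k4, b_e, b_udot):
    """Instantaneous cost l(x_t, u_t): performance ('P') trajectories penalize
    the raw tracking error; robust ('R') trajectories penalize only violations
    of the tracking and control-rate funnels."""
    f_e = max(abs(e_t) - b_e, 0.0)        # funnel tracking error
    f_u = max(abs(u_rate) - b_udot, 0.0)  # funnel control-rate error
    if label == "P":
        return k1 * e_t**2 + k2 * f_u**2
    return k3 * f_e**2 + k4 * f_u**2

def trajectory_cost(errors, u_rates, label, k, b_e, b_udot):
    """J_i = sum_t omega_t * l(x_t, u_t), with omega_t ramping from 1 to 2
    to emphasize convergence near the final time."""
    T = len(errors)
    J = 0.0
    for t, (e, du) in enumerate(zip(errors, u_rates)):
        omega = 1.0 + t / max(T - 1, 1)   # omega_t in [1, 2]
        J += omega * funnel_cost(e, du, label, *k, b_e, b_udot)
    return J
```

Averaging `trajectory_cost` over all $N_T$ sample trajectories yields the estimated cost $\tilde{J}(\Theta)$ that the L-BFGS optimizer minimizes.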
The hypersonic flight vehicle was modeled with four independently controlled surfaces $(\delta_1, \delta_2, \delta_3, \delta_4)$ [94]. For control design purposes, we created virtual control surfaces (aileron $\delta_a$, elevator $\delta_e$, and rudder $\delta_r$) which operate in the conventional autopilot reference frame; see [73] for details. We assume the form of a static matrix which describes the mapping from the virtual control surfaces to the actual displacements. This is used inside the model in order to calculate forces and moments on the flight vehicle [73].

We consider only the longitudinal dynamics of the flight vehicle, which are assumed to be decoupled from the lateral dynamics. The well-studied longitudinal 3-degrees-of-freedom (DoF) equations of motion for a hypersonic flight vehicle can be written in the following form [6, 74]:

$\dot{V}_T = \frac{1}{m}\left(T\cos\alpha - D\right) - g\sin(\theta - \alpha)$
$\dot{\alpha} = \frac{1}{mV_T}\left(-T\sin\alpha - L\right) + q + \frac{g}{V_T}\cos(\theta - \alpha)$
$\dot{\theta} = q$
$\dot{q} = \frac{M}{I_{YY}}$
$\dot{h} = V_T\sin(\theta - \alpha)$
$\ddot{\eta}_i = -2\zeta_i\omega_i\dot{\eta}_i - \omega_i^2\eta_i + N_i, \quad i = 1, 2, \ldots, n$

where $m$ is the mass of the vehicle, $T$ is the thrust, $V_T$ is the true airspeed, $\alpha$ is the angle of attack, $h$ is the height (i.e., altitude) of the flight vehicle, $\theta$ is the pitch angle, $q$ is the pitch rate, $I_{YY}$ is the moment of inertia, and $g$ is gravity. For this problem, we consider no engine or rocket booster, which results in zero thrust, i.e., $T = 0$. The $i$th structural mode of the flight vehicle is defined by the natural frequency $\omega_i$, the damping ratio $\zeta_i$, and the generalized force $N_i$. The forces and moments acting on the longitudinal dynamics of the flight vehicle include thrust $T$, drag $D$, lift $L$, and pitching moment $M$.
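The rigid-body rates and flexible-mode dynamics above translate directly into a state-derivative routine. The sketch below takes $T = 0$ by default and leaves the lift, drag, pitching moment, and generalized forces to the caller, since their coefficient build-up is described later; the function name and argument layout are ours:

```python
import numpy as np

def longitudinal_rates(x, L, D, M, N_gen, m, Iyy, zeta, omega, g=9.81, T=0.0):
    """Right-hand side of the 3-DoF longitudinal equations plus n flexible modes.
    x = [V_T, alpha, theta, q, h, eta_1, eta_dot_1, ..., eta_n, eta_dot_n]."""
    VT, alpha, theta, q, h = x[:5]
    xdot = [
        (T * np.cos(alpha) - D) / m - g * np.sin(theta - alpha),   # V_T dot
        (-T * np.sin(alpha) - L) / (m * VT) + q
            + (g / VT) * np.cos(theta - alpha),                    # alpha dot
        q,                                                         # theta dot
        M / Iyy,                                                   # q dot
        VT * np.sin(theta - alpha),                                # h dot
    ]
    for i, Ni in enumerate(N_gen):                                 # flexible modes
        eta, eta_dot = x[5 + 2 * i], x[6 + 2 * i]
        xdot += [eta_dot,
                 -2.0 * zeta[i] * omega[i] * eta_dot - omega[i]**2 * eta + Ni]
    return np.array(xdot)
```

A fixed-step integrator (the chapter later uses AB-2 with $\Delta t = 0.01$ s) can then march this derivative forward to produce sample trajectories.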
For this work, we assume the vehicle operates at a constant temperature. However, it is worth noting that in many cases the natural frequencies of the hypersonic vehicle's body modes can vary based on temperature [7]. We define the state vector, $x \in \mathbb{R}^{11}$, by

$x = [V_T, \alpha, \theta, q, h, \eta_1, \dot{\eta}_1, \eta_2, \dot{\eta}_2, \eta_3, \dot{\eta}_3]$

where we assume that only three elastic modes of the flight vehicle are active. We also assume the following relationship between the axial and normal forces $(A, N)$ and the lift and drag forces $(L, D)$:

$L = N\cos\alpha - A\sin\alpha$
$D = N\sin\alpha + A\cos\alpha$

where $L$ and $D$ are the lift and drag forces used in the equations of motion.

Like many recent works in hypersonic control, we estimate the axial and normal body forces $(A, N)$ and pitching moment $M$ by [73, 74]

$A(t) \approx \frac{1}{2}\rho V_T^2 S C_A$
$N(t) \approx \frac{1}{2}\rho V_T^2 S C_N$
$M(t) \approx \frac{1}{2}\rho V_T^2 S c_{ref} C_m$
$N_i(t) \approx \frac{1}{2}\rho V_T^2 S C_{N_i}, \quad i = 1, 2, \ldots, n$

where $\rho$ denotes air density, $S$ is the reference area, $c_{ref}$ is the mean aerodynamic chord, $C_A$ is the axial force coefficient, $C_N$ is the normal force coefficient, and $C_m$ is the pitch moment coefficient.

The force and moment coefficients are composed of smaller coefficient components (e.g., $C_{A_{AB}}$ and $C_{m_0}$) which are stored in the form of a look-up table (LUT). The inputs of the LUT consist of the flight condition $(\alpha, \mathrm{Mach}, h, q)$, the control inputs $(\delta_1, \delta_2, \delta_3, \delta_4)$, and the flexible body states $(\eta_1, \eta_2, \eta_3)$. We assume the coefficients take the following
form:

$C_A = C_{A_{ALT}}(h, \mathrm{Mach}) + C_{A_{AB}}(\alpha, \mathrm{Mach}) + \sum_{j=1}^{4} C_{A_{\delta_j}}(\alpha, \mathrm{Mach}, \delta_j) + \sum_{k=1}^{3} C_{A_{\eta_k}}\eta_k$

$C_N = C_{N_0}(\alpha, \mathrm{Mach}) + C_{N_q}(\alpha, \mathrm{Mach}, q) + \sum_{j=1}^{4} C_{N_{\delta_j}}(\alpha, \mathrm{Mach}, \delta_j) + \sum_{k=1}^{3} C_{N_{\eta_k}}\eta_k$

$C_m = C_{m_0}(\alpha, \mathrm{Mach}) + C_{m_q}(\alpha, \mathrm{Mach}, q) + \sum_{j=1}^{4} C_{m_{\delta_j}}(\alpha, \mathrm{Mach}, \delta_j) + \sum_{k=1}^{3} C_{m_{\eta_k}}\eta_k$

$C_{N_i} = N_{i_{\alpha^2}}\alpha^2 + N_{i_\alpha}\alpha + \sum_{j=1}^{4} N_{i_{\delta_j}}\delta_j + \sum_{k=1}^{3} N_{i_{\eta_k}}\eta_k$

Since our plant dynamics are called at each iteration of the DLC optimization, it is beneficial to use polynomial approximations of each aerodynamic coefficient stated previously. The lowest order of the polynomial fit was selected based on evaluating accuracy and smoothness metrics. As discussed in the sparse neural network case, we created separate polynomial models for different regions throughout the flight envelope. This allows the accuracy of the model to increase while reducing the order of the polynomials. We refer the reader to our previous work in Chapter 6 and [91] for more details regarding the use of polynomial approximations of aircraft dynamics.

The discrete-time form of the polynomial model of the longitudinal dynamics of the flight vehicle used for optimization of the DLC is given by

$x_{t+1} = f(x_t, \lambda_u u_{act}(t - \tau_d) + \theta_u + d_u) + \eta_p$
$y_t = f(x_t, \lambda_u u_{act}(t - \tau_d) + \theta_u + d_u) + \eta_p$
where $u_{act}$ is the control deflection from the actuator, $x_t = [V_T, \alpha, \theta, q, h]$ is the state vector, which disregards the flexible modes, $y_t$ is the set of output states, $\lambda_u$ is a multiplicative control effectiveness term, $\theta_u$ is a constant additive control uncertainty, and $\eta_p$ and $d_u$ are plant noise and input disturbance terms, respectively, at time $t$. We also include a time delay parameter, $\tau_d \in \mathbb{N}$, which is used during optimization to produce a time delay margin.

7.3 Results

7.3.1 Deep Learning Optimization

We explored several variations of the DLC by varying the number of layers, hidden nodes, and shared nodes (for the S-DLC case). Similar to previous work, we found that using $L = 2$ hidden layers and $N = 12$ hidden nodes resulted in reasonable tracking performance. Unfortunately, by adding an additional layer, the DLC only performed slightly better. However, we did see significant improvement in parameter convergence by using the S-DLC with the segmented flight space shown in Figure 7-3. The S-DLC contained $T = 39$ segments with $Q = 6$ nodes allocated to each segment. The S-DLC operated with $N_{act} = 12$, the same number as the DLC, which aimed to keep the computational burden on both systems equivalent. The S-DLC, with its expanded memory, used $N = 234$ total nodes in the system. We used the well-studied L-BFGS optimization method in order to train the deep recurrent network with GRU modules. L-BFGS is a quasi-Newton second-order batch method which has produced state-of-the-art results on both regression and classification tasks [55].

Table 7-1. Range of initial conditions and uncertainty variables
                          MIN    MAX
  $\alpha_0$ (deg)        -25     25
  $q_0$ (deg/sec)         -75     75
  $\mathrm{Mach}_0$         5      6
  $alt_0$ (km)             12     16
  $\theta_u$ (deg)         -5      5
  $\tau_d$ (samples)        0      5
  $\lambda_u$             0.5      2
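During training, the input-channel uncertainties enter through a simple wrapper around the one-step polynomial model: the commanded deflection is delayed by $\tau_d$ samples, scaled by $\lambda_u$, and offset by $\theta_u$ and the disturbances. A sketch (our own helper, with the nominal step function supplied by the caller):

```python
from collections import deque

def make_uncertain_plant(f_step, tau_d, lam_u, theta_u):
    """Wrap x_{t+1} = f(x_t, u) with a tau_d-sample input delay plus the
    multiplicative/additive control uncertainties used during optimization."""
    buf = deque([0.0] * tau_d)            # pending inputs, oldest first

    def step(x_t, u_act, d_u=0.0, eta_p=0.0):
        if tau_d > 0:
            u_delayed = buf.popleft()     # u_act(t - tau_d)
            buf.append(u_act)
        else:
            u_delayed = u_act
        return f_step(x_t, lam_u * u_delayed + theta_u + d_u) + eta_p

    return step
```

Holding $(\lambda_u, \theta_u, \tau_d)$ fixed per trajectory, as the text describes, means constructing one such wrapper per sampled trajectory.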
For our research, the controller optimization was performed using angle of attack command tracking, i.e., $y_{ref} = \alpha_{ref}$, $y_{cmd} = \alpha_{cmd}$, and $y_{sel} = \alpha$, where we set the bounds of the funnels to $b_e = 0.005$ and $b_{\dot{u}} = 0.9$. The input to the DLC controller at time $t$, $c_t$, takes the following form:

$c_t = [e_I, \alpha, q, \mathrm{Mach}, h]$

where $\alpha$ is the angle of attack of the vehicle, $q$ is the pitch rate, $\mathrm{Mach}$ is the Mach number of the vehicle, $h$ is the height (altitude), and $e_I$ is the integral error of tracking, $e_I = \int_0^{t_f}(y_{sel} - y_{cmd})\,dt$. For the S-DLC case, we reduce the set of states used as inputs to $c_t = [e_I, \alpha, q]$.

The RNN/GRU controller parameter optimization was performed using $N_T = 5{,}760$ sample trajectories. The sample trajectories were divided into 420 performance trajectories and 5,250 robust trajectories. The optimization procedure and associated cost function are described in Section 7.1.3. The plant dynamics were discretized with $\Delta t = 0.01$ second time steps using AB-2 integration. Each trajectory used in training was $t_f = 2.5$ seconds in duration.

As mentioned in Section 7.1.3, each trajectory is defined by its predetermined initial conditions and uncertainty parameters. Individual samples were determined by sampling uniformly distributed random variables between bounds established based on system requirements. For instance, Table 7-1 shows the range of initial conditions used during training, where $\alpha_0$ is the initial angle of attack, $q_0$ is the initial pitch rate, $\mathrm{Mach}_0$ is the initial Mach number, and $alt_0$ is the initial altitude. Similarly, the table also shows the range of each uncertainty parameter in $P_c$, where each parameter is held constant for each sample trajectory. Note that the control effectiveness term, $\lambda_u$, was sampled in the range $(1 + SM_{min}) \leq \lambda_u \leq (1 + SM_{max})$ in order to establish stability margins $(SM_{min}, SM_{max})$, while the time delay term, $\tau_d$, was sampled to establish time delay margins of $(TD_{min}, TD_{max})$. In order to reduce the dimensionality of the polynomial system during optimization, the flexible effects were included as disturbances in our
Table 7-2. Average tracking error (ATE), average control rate (ACR), and final cost
                           ATE      ACR     Cost
  3-Layer S-DLC (S)      185.6    193.4    98.79
  2-Layer DLC (S)        568.4    79.23    572.5
  2-Layer DLC            592.6    88.34    633.1
  GS                    1450.1    71.85       --
  Note: (S) denotes the systematic training procedure

system, and the weights of the controller were not optimized to counteract them. We will verify the stability margins using ROA estimation in Section 7.3.3.

7.3.2 Hypersonic Flight Control Simulation

For simulation results and clarity, we developed an analysis model for the longitudinal dynamics of the flight vehicle. This analysis model was obtained by substituting particular trim values of true airspeed $V_T$, altitude $h$, and pitch angle $\theta$ into the equations of motion. The analysis model is given by

$\dot{e}_I = \alpha - \alpha_{cmd}$
$\dot{\alpha} = f(\alpha, q, \delta_e, \lambda_u, \theta_u, \tau_d)$
$\dot{q} = f(\alpha, q, \delta_e, \lambda_u, \theta_u, \tau_d)$

where $P_c = (\lambda_u, \theta_u, \tau_d)$ is the set of constant system uncertainties, $u_{act} = \delta_e$ is the elevator deflection produced by the actuators, and the state vector includes the integral error of tracking and takes the form $x = [e_I, \alpha, q]$. For the analysis in this chapter, we selected $\mathrm{Mach} = 5.5$, $\alpha = 0$, $h = 14$ km, and $\alpha_{cmd} = 0$ as the trim condition to analyze.

We analyzed a set of $N_S = 64$ trajectories with various initial conditions and uncertainty values. Similar to the procedure used during optimization, we obtained sample vectors for each uncertainty parameter and initial condition state by sampling a uniformly distributed random variable with ranges that slightly exceeded the ranges used during optimization (Table 7-1). For simplicity, we used two previously developed metrics, the average tracking error (ATE) and the average control rate (ACR), to compare the performance of the S-DLC, DLC, and gain-scheduled controller [94]. The metrics are defined
by

$ATE = \frac{k_{TE}}{N_S}\sum_{i=1}^{N_S}\sum_{t=1}^{t_f}|e_t|$
$ACR = \frac{k_{CR}}{N_S}\sum_{i=1}^{N_S}\sum_{t=1}^{t_f}|\dot{u}_t|$

where $e_t$ and $\dot{u}_t$ are the instantaneous tracking error and control rate defined previously. We used the constants $k_{TE}$ and $k_{CR}$ to scale $ATE$ and $ACR$ for comparison purposes. The optimization results are shown in Table 7-2. Time delay margins and closed-loop performance under additive uncertainties were verified for the analysis model described previously using extensive simulations. Phase portrait and tracking performance plots are provided in Figures 7-4 to 7-6 in order to compare the controllers' performance under harsh conditions visually.

7.3.3 Region of Attraction Estimation via Forward Reachable Sets

We are interested in leveraging very recent breakthroughs in ROA estimation of an equilibrium point for the closed-loop flight system via forward reachable sets. This estimation provides a less conservative and non-polynomial approach which is provable and accurate. We use this approach to verify the stability margins of our closed-loop system.

Consider a generic dynamic system of the form:

$\dot{x} = f(x(t), u(t))$

where $u$ is the input, $x$ is the state vector, $t$ is the time, and $f$ is locally Lipschitz continuous. The stability region that we are interested in estimating can be defined by

$S_{ROA}(x_{eq}) = \{x \in \mathbb{R}^n : \lim_{t\to\infty}\phi(x, t) = x_{eq}\}$
where $\phi(x, t)$ is a system trajectory of the dynamics which starts from the initial state $x$. A reachable set is defined as the set of states that satisfies the dynamics within a given period of time, $\tau \in [0, t_f]$, starting from a set of initial states $x \in \mathbb{R}^n$.

The algorithm leveraged in this chapter finds an over-approximation of reachable sets using linear approximation methods and recursive partitioning of the state space. The vast majority of research over the last decade on reachability set methods stems from linear time-invariant (LTI) systems with uncertain inputs [54]. By linearizing the system dynamics about the current operating point and considering the linearization error as an uncertain input to the linearized system, we can apply results in reachability analysis to general nonlinear and hybrid systems; see [53, 54]. This approach has the added benefit of allowing us to use well-studied tools from linear systems theory such as the superposition principle, linear maps, and linear geometric methods. We will not provide intimate details of ROA estimation or reachable set theory. Instead, the reader is referred to [53, 54].

In this chapter, we use a feed-forward-only version of the DLC to analyze the stability margins. Hence, we assume the recurrent hidden state is disconnected, and the DLC operates with only the weights associated with feed-forward connections, i.e., $\Theta_{FF}$. The resulting controller takes the following form:

$\delta_e = f(e_I, \alpha, q)$

where there is no longer a dependence on time. Due to memory constraints, we obtained a high-order polynomial approximation of the feed-forward controller using methods described previously, which is used in this analysis. Consider the 4-dimensional system defined by

$\dot{e}_I = \alpha - \alpha_{cmd}$
$\dot{\alpha} = f(\alpha, q, \delta_e, \lambda_u)$
$\dot{q} = f(\alpha, q, \delta_e, \lambda_u)$
$\dot{\lambda}_u = 0$

where we set $\alpha_{cmd} = 0$. The system has an equilibrium point at the origin. We investigate the domain $D = [-0.5, 0.5] \times [-0.25, 0.25] \times [-0.5, 0.5] \times [0.5, 2.0]$ of the closed-loop system, which is shown in Figure 7-7. The gray boxes in the figure reveal the regions of the state space where initial conditions have guaranteed convergence to the origin. The region of attraction was verified using extensive simulations.

7.4 Summary

We developed a sparsely activated robust deep learning controller for hypersonic flight vehicle control. This controller was trained systematically using a large set of sample trajectories. We included control effectiveness and time delay terms in the sample trajectories during optimization in order to establish stability and time delay margins. These margins were verified for a particular equilibrium point by using a region of attraction method via forward reachable sets. We found the sparse deep learning controller to be superior in terms of both transient performance and robustness compared to a standard fully-connected deep learning controller and a more traditional gain-scheduled controller.

Figure 7-4. Traditional step responses for (A) DLC and (B) GS with uncertainty values $\lambda_u = 0.6$, $\theta_u = -5$, and $\tau_d = 3$
Figure 7-5. Traditional step responses for (A) DLC and (B) GS with uncertainty values $\lambda_u = 1.5$, $\theta_u = 3$, and $\tau_d = 1$

Figure 7-6. Phase portrait plots for (A) DLC and (B) GS with uncertainty values $\lambda_u = 0.5$, $\theta_u = -0.5$, and $\tau_d = 0$

Figure 7-7. Region of attraction estimate via forward reachable sets for $\lambda_u = [0.5, 2.0]$
CHAPTER 8
CONCLUSIONS

This dissertation attacks various problems related to hypersonic control using deep and sparse neural network controllers. The main problems that were addressed include the computational limitations, vast dynamical changes, and flexible body effects of hypersonic vehicles (HSVs). These issues were addressed through two related methodologies. First, we developed a sparsely connected adaptive controller which improves transient performance on control tasks with persistent region-based uncertainties. The controller operates by partitioning the flight envelope into regions and utilizing only a small number of neurons while operating in each region. We improved the controller by including additional adaptive terms to counteract control degradation. We showed that by enforcing a specified dwell time condition and utilizing a robust adaptive term, we could guarantee a uniformly ultimately bounded (UUB) Lyapunov stability result. Simulation studies showed the significant benefits of employing the SNN architecture over traditional radial basis function (RBF) and single hidden layer (SHL) schemes. The second approach that was used to solve HSV control challenges utilized a deep recurrent neural network architecture for the controller. We developed a novel training procedure that simultaneously addressed performance- and robustness-based goals. We analyzed the stability margins of the closed-loop system using recently developed region of attraction (ROA) estimation methods. In addition, we developed a sparse deep learning controller (S-DLC) which led to significantly improved parameter convergence. This convergence resulted in a high-performance robust controller for hypersonic flight. Simulation studies show the effectiveness of the DLC against a traditional gain-scheduled approach with uncertainties in various forms.

Similar to much research in the control literature, we make use of control-oriented (reduced-order) models that make many assumptions about the dynamics of the flight vehicle. For instance, although we include the first several bending modes in our model,
we assume sufficient decoupling between the longitudinal and lateral dynamics of the vehicle and do not consider propulsion or propulsion-based effects. In addition, the architecture developed in Chapters 4 and 5 relies on the adaptive controller to compensate for effects not captured in the linearized gain-scheduled dynamics of the flight vehicle. Hence, it would be interesting to employ the sparse adaptive controller in various control architectures which allow for more realistic models. Similarly, future work could include using the sparse deep network concept and training methodology established in Chapters 6 and 7 to obtain a more accurate flight vehicle model for control.

As mentioned throughout this document, the persistence of excitation (PE) condition is often relied upon for many system identification and adaptive control tasks. This condition requires the input to have sufficient energy in order for the estimated parameters to converge properly. Another interesting avenue of research could be to pursue the benefits of the sparse neural network architecture in system identification tasks.

Future work could also include investigating the effectiveness of the higher-order terms in the neural network adaptive laws. It would be interesting to analyze various control platforms, activation functions, and learning rates in the sparse neural network architecture. Sparse multi-layer neural network adaptive control could also be another direction to pursue.

Finally, the expressive power of deep neural networks is well-known. However, due to their highly nonlinear and elaborate structure, deep neural networks are often difficult to analyze. Another area of interest is to establish more accurate and suitable analysis tools for deep neural network-based controllers.
APPENDIX A
DEVELOPMENT OF BOUNDS FOR THE NEURAL NETWORK ADAPTIVE ERROR (CH. 4)

Using the second-order Taylor series expansion terms generated in Chapter 5, along with the form of the neural network adaptive controller, the adaptive error can be written as

$-(u_{NN} + f(x)) = \hat{W}_i^T\sigma(\hat{V}_i^T x) - W_i^T\sigma(V_i^T x) - \epsilon_i$

Expanding $\sigma(V_i^T x)$ about $\hat{V}_i^T x$, with $\tilde{V}_i = \hat{V}_i - V_i$,

$\sigma(V_i^T x) = \sigma(\hat{V}_i^T x) - \sigma'(\hat{V}_i^T x)\tilde{V}_i^T x + \tfrac{1}{2}\sigma''(\hat{V}_i^T x)(\tilde{V}_i^T x)^2 - O\big((\tilde{V}_i^T x)^3\big)$

Substituting this expansion and grouping terms yields

$-(u_{NN} + f(x)) = \tilde{W}_i^T\big[\sigma(\hat{V}_i^T x) - \sigma'(\hat{V}_i^T x)\hat{V}_i^T x + \tfrac{1}{2}\sigma''(\hat{V}_i^T x)(\hat{V}_i^T x)^2\big] + \hat{W}_i^T\sigma'(\hat{V}_i^T x)\tilde{V}_i^T x - \tfrac{1}{2}\hat{W}_i^T\sigma''(\hat{V}_i^T x)(\hat{V}_i^T x)(\tilde{V}_i^T x) + \tilde{W}_i^T\sigma'(\hat{V}_i^T x)V_i^T x - \tfrac{1}{2}\tilde{W}_i^T\sigma''(\hat{V}_i^T x)(\hat{V}_i^T x)(V_i^T x) + \tfrac{1}{2}W_i^T\sigma''(\hat{V}_i^T x)(V_i^T x)(\tilde{V}_i^T x) - W_i^T O\big((\tilde{V}_i^T x)^3\big) - \epsilon_i$

The bracketed group and the two terms following it are implementable from the current weight estimates and form $h_i$ as defined in Chapter 5. The following upper bound can then be established on the residual:

$\|h_i - \epsilon_i\| = \big\|\tilde{W}_i^T\sigma'(\hat{V}_i^T x)V_i^T x - \tfrac{1}{2}\tilde{W}_i^T\sigma''(\hat{V}_i^T x)(\hat{V}_i^T x)(V_i^T x) + \tfrac{1}{2}W_i^T\sigma''(\hat{V}_i^T x)(V_i^T x)(\tilde{V}_i^T x) - W_i^T O\big((\tilde{V}_i^T x)^3\big) - \epsilon_i\big\|$


After grouping terms and applying the triangle inequality, the bound becomes

\[
\| h_i - \epsilon_i \| \le c_i,
\]

where \(c_i \in \mathbb{R}^+\) is a computable constant. The bound follows because the ideal weights \(W_i\), \(V_i\) and the reconstruction error \(\epsilon_i\) are bounded by assumption, the projection operator of Appendix B keeps the estimates \(\hat{W}_i\), \(\hat{V}_i\) (and hence the errors \(\tilde{W}_i\), \(\tilde{V}_i\)) bounded, the activation vector and its first and second derivatives are bounded, and the NN input \(\bar{x}\) evolves on a compact set.
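As a quick numerical sanity check on the expansion above, the following standalone Python sketch (ours, not from the dissertation) uses a scalar tanh activation in place of the activation vector and a scalar in place of \(\hat{V}_i^T \bar{x}\), and confirms that the second-order Taylor remainder shrinks cubically in the weight error:

```python
import math

# Illustrative check: the second-order expansion
#   sigma(z_hat - e) ~ sigma(z_hat) - sigma'(z_hat) e + 1/2 sigma''(z_hat) e^2
# leaves a remainder of order O(e^3). Here sigma = tanh and e plays the
# role of the scalar weight error Vtilde^T x (hypothetical values).

def sigma(z):
    return math.tanh(z)

def d_sigma(z):
    return 1.0 - math.tanh(z) ** 2

def dd_sigma(z):
    t = math.tanh(z)
    return -2.0 * t * (1.0 - t * t)

def taylor_remainder(z_hat, e):
    """Exact value minus the second-order expansion about z_hat."""
    approx = sigma(z_hat) - d_sigma(z_hat) * e + 0.5 * dd_sigma(z_hat) * e * e
    return abs(sigma(z_hat - e) - approx)

z_hat = 0.0                          # expansion point (stand-in for V_hat^T x)
r1 = taylor_remainder(z_hat, 0.2)
r2 = taylor_remainder(z_hat, 0.1)    # halve the weight error
print(r1 / r2)  # ~8: halving e divides the remainder by roughly 2**3
```

Halving the weight error divides the remainder by close to eight, consistent with the cubic order of the neglected term.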


APPENDIX B
PROJECTION OPERATOR DEFINITIONS (CH. 3/4)

In order to introduce the projection operator, we first define a chosen smooth convex function \(f : \mathbb{R}^N \to \mathbb{R}\), which takes the form [16]:

\[
f(\hat{\theta}) = \frac{(\epsilon_\theta + 1) \| \hat{\theta} \|^2 - \theta_{max}^2}{\epsilon_\theta \, \theta_{max}^2}, \tag{B-1}
\]

where \(\epsilon_\theta \in (0, 1]\) is often referred to as the projection tolerance and \(\hat{\theta} \in \mathbb{R}^N\) is a vector of adaptive weights with an upper bound denoted by \(\theta_{max} \in \mathbb{R}\). The projection tolerance \(\epsilon_\theta\) and the upper bound on the adaptive weights \(\theta_{max}\) are predefined parameters set by the user and used by the adaptive law during run-time. We define the gradient of \(f\) by

\[
\nabla f(\hat{\theta}) = \frac{2 (\epsilon_\theta + 1)}{\epsilon_\theta \, \theta_{max}^2} \hat{\theta} \tag{B-2}
\]

and two convex sets \(\Omega_0\) and \(\Omega_1\):

\[
\Omega_0 = \{ \hat{\theta} : f(\hat{\theta}) \le 0 \} = \Big\{ \hat{\theta} \in \mathbb{R}^N : \| \hat{\theta} \| \le \tfrac{\theta_{max}}{\sqrt{1 + \epsilon_\theta}} \Big\}, \tag{B-3}
\]
\[
\Omega_1 = \{ \hat{\theta} : f(\hat{\theta}) \le 1 \} = \big\{ \hat{\theta} \in \mathbb{R}^N : \| \hat{\theta} \| \le \theta_{max} \big\}. \tag{B-4}
\]

For use in the adaptive laws, the projection operator operates by the following equation [16]:

\[
\mathrm{Proj}(\hat{\theta}, y) =
\begin{cases}
y - \dfrac{\nabla f \, \nabla f^T}{\nabla f^T \nabla f} \, y \, f(\hat{\theta}) & \text{if } f(\hat{\theta}) > 0 \text{ and } y^T \nabla f > 0, \\
y & \text{otherwise},
\end{cases} \tag{B-5}
\]


where \(y \in \mathbb{R}^N\) is a known piecewise-continuous vector. Then, the following useful property will be utilized in the Lyapunov stability analysis:

\[
\tilde{\theta}^T \Gamma^{-1} \big( \mathrm{Proj}(\hat{\theta}, y) - y \big) \le 0, \tag{B-6}
\]

where we define the adaptive weight error as \(\tilde{\theta} = \hat{\theta} - \theta\).
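The operator defined above is straightforward to implement. Below is a minimal Python sketch (our own illustrative names and constants such as `EPS` and `THETA_MAX`, not from the dissertation) together with a numerical check of property (B-6) for the case \(\Gamma = I\):

```python
import math

# Projection tolerance and weight-norm bound (illustrative values).
EPS, THETA_MAX = 0.5, 2.0

def f(theta):
    """Convex function (B-1)."""
    n2 = sum(t * t for t in theta)
    return ((EPS + 1.0) * n2 - THETA_MAX ** 2) / (EPS * THETA_MAX ** 2)

def grad_f(theta):
    """Gradient (B-2)."""
    c = 2.0 * (EPS + 1.0) / (EPS * THETA_MAX ** 2)
    return [c * t for t in theta]

def proj(theta, y):
    """Projection operator (B-5): bends y inward when f > 0 and y points outward."""
    g = grad_f(theta)
    gy = sum(gi * yi for gi, yi in zip(g, y))
    if f(theta) > 0.0 and gy > 0.0:
        gg = sum(gi * gi for gi in g)
        scale = f(theta) * gy / gg
        return [yi - scale * gi for gi, yi in zip(g, y)]
    return y

# Check property (B-6) with Gamma = I: (theta_hat - theta)^T (Proj - y) <= 0
# for an ideal weight theta inside Omega_0 and an estimate outside it.
theta_hat = [1.6, 1.1]   # f(theta_hat) > 0, so the projection is active
theta = [0.5, -0.3]      # ideal weight, ||theta|| <= THETA_MAX / sqrt(1 + EPS)
y = [3.0, 1.0]
p = proj(theta_hat, y)
inner = sum((th - t) * (pi - yi) for th, t, pi, yi in zip(theta_hat, theta, p, y))
print(inner <= 1e-12)  # True
```

When the estimate lies inside \(\Omega_0\), the operator leaves \(y\) unchanged; the modification only activates in the annulus between \(\Omega_0\) and \(\Omega_1\), which is what keeps the adaptive weights bounded without distorting adaptation in the interior.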


APPENDIX C
DEVELOPMENT OF UPPER BOUNDS BASED ON SPARSE NEURAL NETWORK ADAPTIVE UPDATE LAWS (CH. 4)

In the following equations, we show how an upper bound of zero is developed for a number of terms in Chapter 5 by using the chosen form of the adaptive control laws stated in Chapter 5. We refer to the groups of terms that will be upper-bounded as \(V_{impact}\), \(W_{impact}\), and \(K_{impact}\). For instance, the effect of the adaptive law for \(\hat{V}_i\) on \(\dot{V}\) results in:

\[
V_{impact} = -2 e^T P B \hat{W}_i^T \hat{\sigma}_i' \tilde{V}_i^T \bar{x} + e^T P B \hat{W}_i^T \hat{\sigma}_i'' \,\mathrm{diag}(\hat{V}_i^T \bar{x}) \tilde{V}_i^T \bar{x} + 2 \,\mathrm{trace}\big( \tilde{V}_i^T \Gamma_V^{-1} \dot{\hat{V}}_i \big).
\]

Writing the scalar terms in trace form via the cyclic property and substituting the projection-based update law for \(\dot{\hat{V}}_i\) gives

\[
V_{impact} = 2 \,\mathrm{trace}\Big( \tilde{V}_i^T \Gamma_V^{-1} \Big[ \mathrm{Proj}\big( \hat{V}_i, -\Gamma_V \bar{x} e^T P B \hat{W}_i^T \hat{\sigma}_i' \big) - \big( -\Gamma_V \bar{x} e^T P B \hat{W}_i^T \hat{\sigma}_i' \big) \Big] \Big)
+ 2 \,\mathrm{trace}\Big( \tilde{V}_i^T \Gamma_V^{-1} \Big[ \mathrm{Proj}\big( \hat{V}_i, \tfrac{1}{2} \Gamma_V \bar{x} e^T P B \hat{W}_i^T \hat{\sigma}_i'' \,\mathrm{diag}(\hat{V}_i^T \bar{x}) \big) - \tfrac{1}{2} \Gamma_V \bar{x} e^T P B \hat{W}_i^T \hat{\sigma}_i'' \,\mathrm{diag}(\hat{V}_i^T \bar{x}) \Big] \Big) \le 0,
\]

where each trace term is non-positive by the projection property (B-6) applied columnwise. Similarly, the impact of \(\dot{\hat{W}}_i\) yields:

\[
W_{impact} = -2 e^T P B \tilde{W}_i^T \big( \hat{\sigma}_i - \hat{\sigma}_i' \hat{V}_i^T \bar{x} \big) - e^T P B \tilde{W}_i^T \hat{\sigma}_i'' \,\mathrm{diag}(\hat{V}_i^T \bar{x}) \hat{V}_i^T \bar{x} + 2 \,\mathrm{trace}\big( \tilde{W}_i^T \Gamma_W^{-1} \dot{\hat{W}}_i \big)
\]
\[
= 2 \,\mathrm{trace}\Big( \tilde{W}_i^T \Gamma_W^{-1} \Big[ \mathrm{Proj}\big( \hat{W}_i, -\Gamma_W ( \hat{\sigma}_i - \hat{\sigma}_i' \hat{V}_i^T \bar{x} ) e^T P B \big) - \big( -\Gamma_W ( \hat{\sigma}_i - \hat{\sigma}_i' \hat{V}_i^T \bar{x} ) e^T P B \big) \Big] \Big)
+ 2 \,\mathrm{trace}\Big( \tilde{W}_i^T \Gamma_W^{-1} \Big[ \mathrm{Proj}\big( \hat{W}_i, -\tfrac{1}{2} \Gamma_W \hat{\sigma}_i'' \,\mathrm{diag}(\hat{V}_i^T \bar{x}) \hat{V}_i^T \bar{x} \, e^T P B \big) - \big( -\tfrac{1}{2} \Gamma_W \hat{\sigma}_i'' \,\mathrm{diag}(\hat{V}_i^T \bar{x}) \hat{V}_i^T \bar{x} \, e^T P B \big) \Big] \Big) \le 0.
\]


Finally, \(\dot{\hat{K}}\) from Chapter 5 forms the following upper bound:

\[
K_{impact} = 2 e^T P B \big( K - \hat{K} \big) \big( u_{BL} + u_{NN} + u_{RB} \big) + 2 \,\mathrm{trace}\big( \tilde{K}^T \Gamma_K^{-1} \dot{\hat{K}} \big)
= -2 e^T P B \tilde{K} \big( u_{BL} + u_{NN} + u_{RB} \big) + 2 \,\mathrm{trace}\big( \tilde{K}^T \Gamma_K^{-1} \dot{\hat{K}} \big)
\]
\[
= 2 \,\mathrm{trace}\Big( \tilde{K}^T \Gamma_K^{-1} \Big[ \mathrm{Proj}\big( \hat{K}, -\Gamma_K ( u_{BL} + u_{NN} + u_{RB} ) e^T P B \big) - \big( -\Gamma_K ( u_{BL} + u_{NN} + u_{RB} ) e^T P B \big) \Big] \Big) \le 0.
\]
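The key algebraic step in each of the bounds above is rewriting scalar terms such as \(e^T P B \tilde{W}_i^T q\) in trace form via the cyclic property, so they pair with the \(\mathrm{trace}(\tilde{W}_i^T \Gamma_W^{-1} \dot{\hat{W}}_i)\) terms. A small NumPy sketch (illustrative dimensions and variable names of our own choosing) verifies the identity numerically:

```python
import numpy as np

# Verify: e^T P B (W_tilde^T q) == trace(W_tilde^T q (e^T P B)),
# the cyclic-trace identity used throughout Appendix C.

rng = np.random.default_rng(0)
n, m, N = 4, 2, 5                      # state, input, and neuron counts (illustrative)
e = rng.standard_normal(n)
P = rng.standard_normal((n, n))
B = rng.standard_normal((n, m))
W_tilde = rng.standard_normal((N, m))  # weight estimation error
q = rng.standard_normal(N)             # e.g. sigma_hat - sigma_hat' V_hat^T x

ePB = e @ P @ B                        # row vector of size m
scalar_form = ePB @ (W_tilde.T @ q)
trace_form = np.trace(W_tilde.T @ np.outer(q, ePB))
print(np.isclose(scalar_form, trace_form))  # True
```

Because both forms are equal, the update laws can be chosen so that the paired bracketed terms match the projection property (B-6) exactly, forcing each group to zero or below.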


APPENDIX D
ALTERNATIVE DWELL-TIME CONDITION APPROACH (CH. 4)

Suppose that for each time segment there exists \(\mu_S \ge 1\) such that at any switching time \(t_S\) we have

\[
V_{t_{S+1}}(t_S) \le \mu_S V_{t_S}(t_S^-), \tag{D-1}
\]

where \(V_{t_{S+1}}(t_S)\) refers to the Lyapunov candidate value during the \(t_{S+1}\) segment at time \(t_S\). Now consider a function in the following form [30, 89]:

\[
W(t) = e^{c_V t} V_i(t), \tag{D-2}
\]

where from Chapter 5 and (D-2) we can show that

\[
\dot{W}(t) \le e^{c_V t} c_V k_T \tag{D-3}
\]

for every time segment. Consider an interval of time \([t_{S-1}, t_S]\) with one switch, where

\[
W_{t_{S+1}}(t_S) \le \mu_S \Big[ W_{t_S}(t_{S-1}) + \int_{t_{S-1}}^{t_S} e^{c_V t} c_V k_T \, dt \Big] \tag{D-4}
\]

is derived from Chapter 5, (D-1), and (D-3). Rearranging terms and simplifying results in

\[
V_{t_{S+1}}(t_S) \le \mu_S e^{-c_V t_S} \Big[ e^{c_V t_{S-1}} V_{t_S}(t_{S-1}) + \int_{t_{S-1}}^{t_S} e^{c_V t} c_V k_T \, dt \Big]. \tag{D-5}
\]

Using Chapter 5 and since

\[
\int_{t_{S-1}}^{t_S} e^{c_V t} c_V k_T \, dt = k_T \big( e^{c_V t_S} - e^{c_V t_{S-1}} \big), \tag{D-6}
\]

(D-5) becomes

\[
V_{t_{S+1}}(t_S) \le \mu_S \Big[ e^{-c_V T_{dwell}} \big( V_{t_S}(t_{S-1}) - k_T \big) + k_T \Big], \tag{D-7}
\]

which holds for each time segment. If we use the dwell-time bound derived in Chapter 5 and the assumption that \(V_{t_S}(t_{S-1}) > 2 k_T + k_{NN}\), then (D-7) becomes

\[
V_{t_{S+1}}(t_S) \le \mu_S \big( V_{t_S}(t_{S-1}) - k_{NN} \big). \tag{D-8}
\]


Since (D-8) always holds and

\[
V_{t_{S+1}}(t_S) \le \mu_S \big( V_{t_S}(t_S) + k_{NN} \big), \tag{D-9}
\]

where \(\bar{\mu}_S \in \mathbb{R}^+\) is a user-defined constant based on Chapter 5, we can set

\[
\mu_S = \frac{\bar{\mu}_S \, V_{t_S}(t_S)}{k_{NN} + V_{t_S}(t_S)}, \tag{D-10}
\]

which results in

\[
V_{t_{S+1}}(t_S) \le \bar{\mu}_S V_{t_S}(t_S), \tag{D-11}
\]

which is equivalent to the solution derived in Theorem 5.3.
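The contraction in (D-7) can be checked directly: between switches, the worst case of the Lyapunov bound \(\dot{V} \le -c_V V + c_V k_T\) integrates to exactly the right-hand side of (D-7) with \(\mu_S = 1\). The Python sketch below (illustrative constants of our own choosing, not from the dissertation) simulates that worst case over one dwell interval and confirms the bound:

```python
import math

# Between switches, dV/dt <= -c_V (V - k_T) implies
#   V(t0 + T) <= exp(-c_V T) (V(t0) - k_T) + k_T,
# the per-segment contraction used in (D-7). Check it against an
# Euler simulation of the worst-case dynamics dV/dt = -c_V (V - k_T).

c_V, k_T = 2.0, 0.5        # illustrative decay rate and floor
V0, T_dwell = 10.0, 1.0    # initial Lyapunov value and dwell time

V, dt = V0, 1e-5
for _ in range(int(T_dwell / dt)):
    V += dt * (-c_V * (V - k_T))

bound = math.exp(-c_V * T_dwell) * (V0 - k_T) + k_T
print(V <= bound + 1e-6)  # True: the simulated value satisfies the bound
```

As long as \(V_0 > k_T\), the bound strictly decreases the Lyapunov value over each dwell interval, which is what offsets the \(\mu_S\)-fold growth permitted at each switch.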


LIST OF REFERENCES

[1] Z. T. Dydek, A. M. Annaswamy, and E. Lavretsky, "Adaptive control and the NASA X-15-3 flight revisited," IEEE Contr. Syst., vol. 30, no. 3, pp. 32, 2010.
[2] B. McGarry. Air force getting closer to testing hypersonic weapon, engineers say. [Online]. Available: http://www.military.com/daily-news/2015/05/19/air-force-getting-closer-to-testing-hypersonic-weapon.html
[3] R. Aditya, "Direct adaptive control for stability and command augmentation system of an air-breathing hypersonic vehicle," Master's thesis, 2015.
[4] K. Mizokami. Lockheed to build a Mach 20 hypersonic weapon system. [Online]. Available: http://www.popularmechanics.com/military/research/a22970/lockheed-hypersonic-weapon/
[5] M. Creagh, M. Kearney, and P. Beasley, "Adaptive control for a hypersonic glider using parameter feedback from system identification," in AIAA Guidance, Navigation, and Control Conf., 2011, p. 6230.
[6] M. A. Bolender, "An overview on dynamics and controls modelling of hypersonic vehicles," in Proc. Amer. Control Conf., 2009, pp. 2507.
[7] T. Williams, M. Bolender, D. Doman, and O. Morataya, "An aerothermal flexible mode analysis of a hypersonic vehicle," in AIAA Atmos. Flight Mech. Conf., 2006, p. 6647.
[8] C. Cesnik, P. Senatore, W. Su, E. Atkins, C. Shearer, and N. Pitchter, "X-HALE: a very flexible UAV for nonlinear aeroelastic tests," in Proc. Struct. Dyn. Mater. Conf., 2010, p. 2715.
[9] R. Upadhyay, K. Schulz, P. Bauman, R. Stogner, and O. Ezekoye, "Steady-state ablation model coupling with hypersonic flow," in AIAA Aerospace Sciences Meeting Including the New Horizons Forum and Aerospace Exposition, 2010, p. 1176.
[10] N. J. Falkiewicz, C. E. S. Cesnik, A. R. Crowell, and J. J. McNamara, "Reduced-order aerothermoelastic framework for hypersonic vehicle control simulation," AIAA J., vol. 49, no. 8, pp. 1625, 2011.
[11] T. E. Gibson, "Closed-loop reference model adaptive control: with application to very flexible aircraft," Ph.D. dissertation, Massachusetts Inst. of Technology, 2013.
[12] Z. Han and K. S. Narendra, "New concepts in adaptive control using multiple models," IEEE Trans. Autom. Control, vol. 57, no. 1, pp. 78, 2012.
[13] F. Lewis and S. S. Ge, "Neural networks in feedback control systems," Mech. Eng. Handbook, 2005.


[14] N. Hovakimyan, F. Nardi, A. Calise, and N. Kim, "Adaptive output feedback control of uncertain nonlinear systems using single-hidden-layer neural networks," IEEE Trans. Neural Netw., vol. 13, no. 6, pp. 1420, 2002.
[15] R. T. Anderson, G. Chowdhary, and E. N. Johnson, "Comparison of RBF and SHL neural network based adaptive control," in Unmanned Aircraft Systems. Springer, 2008, pp. 183.
[16] E. Lavretsky and K. Wise, Robust and Adaptive Control: With Aerospace Applications. Springer Science & Business Media, 2012.
[17] F. Lewis, "Nonlinear network structures for feedback control," Asian J. Control, vol. 1, no. 4, pp. 205, 1999.
[18] G. Chowdhary and E. Johnson, "Concurrent learning for convergence in adaptive control without persistency of excitation," in IEEE Conf. Decision and Control, 2010, pp. 3674.
[19] P. A. Ioannou and J. Sun, Robust Adaptive Control. Courier Corporation, 2012.
[20] R. Gadient, "Adaptive control with aerospace applications," Ph.D. dissertation, 2013.
[21] Z. T. Dydek, "Adaptive control of unmanned aerial systems," Ph.D. dissertation, Massachusetts Institute of Technology, 2010.
[22] N. Hovakimyan and A. J. Calise, "Adaptive output feedback control of uncertain multi-input multi-output systems using single hidden layer neural networks," in Proc. Amer. Control Conf., vol. 2. IEEE, 2002, pp. 1555.
[23] T. Yucelen and W. M. Haddad, "Low-frequency learning and fast adaptation in model reference adaptive control," IEEE Trans. Autom. Control, vol. 58, no. 4, pp. 1080, 2013.
[24] A. J. Calise, N. Hovakimyan, and M. Idan, "Adaptive output feedback control of nonlinear systems using neural networks," Automatica, vol. 37, no. 8, pp. 1201, 2001.
[25] P. Patre, W. MacKunis, K. Dupree, and W. Dixon, "A new class of modular adaptive controllers, part I: Systems with linear-in-the-parameters uncertainty," in Proc. Amer. Control Conf., 2008, pp. 1208.
[26] L. Chen and K. S. Narendra, "Nonlinear adaptive control using neural networks and multiple models," Automatica, vol. 37, no. 8, pp. 1245, 2001.
[27] C. Cao and N. Hovakimyan, "Design and analysis of a novel adaptive control architecture with guaranteed transient performance," IEEE Trans. Autom. Control, vol. 53, no. 2, pp. 586, 2008.


[28] V. Stepanyan, K. Krishnakumar, N. Nguyen, and L. Van Eykeren, "Stability and performance metrics for adaptive flight control," in AIAA Guidance, Navigation, and Control Conf., 2009.
[29] J. P. Hespanha, "Uniform stability of switched linear systems: Extensions of LaSalle's invariance principle," IEEE Trans. Autom. Control, vol. 49, no. 4, pp. 470, 2004.
[30] D. Liberzon, Switching in Systems and Control. Springer Science & Business Media, 2012.
[31] S. Kamalasadan, "A new generation of adaptive control: An intelligent supervisory loop approach," Ph.D. dissertation, University of Toledo, 2004.
[32] Z. Liu and Y. Wang, "Fuzzy adaptive tracking control within the full envelope for an unmanned aerial vehicle," Chin. J. Aeronaut., vol. 27, no. 5, pp. 1273, 2014.
[33] L. Yu and S. Fei, "Robustly stable switching neural control of robotic manipulators using average dwell-time approach," Trans. Inst. Measurement Control, vol. 36, no. 6, pp. 789, 2014.
[34] Y. Huang, C. Sun, C. Qian, and L. Wang, "Non-fragile switching tracking control for a flexible air-breathing hypersonic vehicle based on polytopic LPV model," Chin. J. Aeronaut., vol. 26, no. 4, pp. 948, 2013.
[35] J. Lian, Y. Lee, S. D. Sudhoff, and S. H. Zak, "Variable structure neural network based direct adaptive robust control of uncertain systems," in Proc. Amer. Control Conf., 2008, pp. 3402.
[36] J. Lian, J. Hu, and S. H. Zak, "Adaptive robust control: A piecewise Lyapunov function approach," in Proc. Amer. Control Conf., 2009, pp. 568.
[37] Y. LeCun, Y. Bengio, and G. E. Hinton, "Deep learning," Nature, vol. 521, pp. 436, 2015.
[38] R. Schneiderman, L. Deng, and V. Sejnoha, "Accuracy, apps advance speech recognition," IEEE Signal. Proc. Mag., vol. 1053, no. 5888/15, 2015.
[39] M. P. Deisenroth, G. Neumann, J. Peters et al., "A survey on policy search for robotics," Foundations and Trends in Robotics, vol. 2, no. 1, pp. 1, 2013.
[40] R. E. Bellman and S. E. Dreyfus, "Applied dynamic programming," 1962.
[41] S. Levine and V. Koltun, "Guided policy search," in Proc. Int. Conf. Mach. Learning, 2013, pp. 1.
[42] I. Sutskever, "Training recurrent neural networks," Ph.D. dissertation, University of Toronto, 2013.
[43] S. Levine, "Exploring deep and recurrent architectures for optimal control," arXiv preprint arXiv:1311.1761, 2013.


[44] R. Tedrake, "LQR-trees: Feedback motion planning on sparse randomized trees," 2009.
[45] R. Tedrake, I. R. Manchester, M. Tobenkin, and J. W. Roberts, "LQR-trees: Feedback motion planning via sums-of-squares verification," Int. J. Robotics Research, 2010.
[46] P. A. Parrilo, "Structured semidefinite programs and semialgebraic geometry methods in robustness and optimization," Ph.D. dissertation, California Institute of Technology, 2000.
[47] A. Dorobantu, L. Crespo, and P. Seiler, Robustness Analysis and Optimally Robust Control Design via Sum-of-Squares, 2012.
[48] A. Papachristodoulou and S. Prajna, "On the construction of Lyapunov functions using the sum of squares decomposition," in IEEE Conf. Decision and Control, vol. 3, 2002, pp. 3482.
[49] A. Majumdar and R. Tedrake, "Robust online motion planning with regions of finite time invariance," in Algorithmic Foundations of Robotics. Springer, 2013, pp. 543.
[50] A. Majumdar, A. A. Ahmadi, and R. Tedrake, "Control design along trajectories with sums of squares programming," in IEEE Int. Conf. Robotics Automat. IEEE, 2013, pp. 4054.
[51] J. Theis, "Sum-of-squares applications in nonlinear controller synthesis," Ph.D. dissertation, University of California, Berkeley, 2012.
[52] M.-W. Seo, H.-H. Kwon, and H.-L. Choi, "Nonlinear missile autopilot design using a density function-based sum-of-squares optimization approach," in IEEE Conf. Control Applicat., 2015, pp. 947.
[53] A. El-Guindy, D. Han, and M. Althoff, "Estimating the region of attraction via forward reachable sets," in Proc. Amer. Control Conf., 2017.
[54] M. Althoff, "Reachability analysis and its application to the safety assessment of autonomous cars," Technische Universität München, 2010.
[55] A. Ng, J. Ngiam, C. Y. Foo, Y. Mai, and C. Suen. UFLDL tutorial. [Online]. Available: http://udl.stanford.edu/tutorial/
[56] Y. Bengio, A. Courville, and P. Vincent, "Representation learning: A review and new perspectives," IEEE Trans. Pattern Anal. Mach. Intell., vol. 35, no. 8, pp. 1798, 2013.
[57] R. Schneiderman, "Accuracy, apps advance speech recognition," IEEE Signal. Proc. Mag., 2015.


[58] X. Glorot, A. Bordes, and Y. Bengio, "Deep sparse rectifier neural networks," in Int. Conf. Artificial Intell. Stat., 2011, pp. 315.
[59] Q. V. Le, N. Jaitly, and G. E. Hinton, "A simple way to initialize recurrent networks of rectified linear units," arXiv preprint arXiv:1504.00941, 2015.
[60] I. J. Goodfellow, D. Warde-Farley, M. Mirza, A. Courville, and Y. Bengio, "Maxout networks," arXiv preprint arXiv:1302.4389, 2013.
[61] Q. Wang and J. JaJa, "From maxout to channel-out: Encoding information on sparse pathways," in Artificial Neural Networks and Machine Learning. Springer, 2014, pp. 273.
[62] A. Graves, A.-r. Mohamed, and G. Hinton, "Speech recognition with deep recurrent neural networks," in IEEE Trans. Acoust., Speech, Signal Process. IEEE, 2013, pp. 6645.
[63] D. Britz. Recurrent neural network tutorial. [Online]. Available: http://www.wildml.com/
[64] R. Pascanu, T. Mikolov, and Y. Bengio, "On the difficulty of training recurrent neural networks," Int. Conf. Mach. Learning, vol. 28, pp. 1310, 2013.
[65] S. Hochreiter and J. Schmidhuber, "Long short-term memory," Neural Comput., vol. 9, no. 8, pp. 1735, 1997.
[66] K. Cho, B. van Merriënboer, C. Gulcehre, D. Bahdanau, F. Bougares, H. Schwenk, and Y. Bengio, "Learning phrase representations using RNN encoder-decoder for statistical machine translation," arXiv preprint arXiv:1406.1078, 2014.
[67] R. Pascanu, C. Gulcehre, K. Cho, and Y. Bengio, "How to construct deep recurrent neural networks," arXiv preprint arXiv:1312.6026, 2013.
[68] J. Ngiam, A. Coates, A. Lahiri, B. Prochnow, Q. V. Le, and A. Y. Ng, "On optimization methods for deep learning," in Int. Conf. Mach. Learning, 2011, pp. 265.
[69] Y. Nesterov, "A method of solving a convex programming problem with convergence rate O(1/k^2)," in Soviet Mathematics Doklady, vol. 27, no. 2, 1983, pp. 372.
[70] A. S. Lewis and M. L. Overton, "Nonsmooth optimization via quasi-Newton methods," Math. Programming, vol. 141, no. 1-2, pp. 135, 2013.


[73]B.Dickinson,S.Nivison,A.Hart,C.Hung,B.Bialy,andS.Stockbridge,Robustand adaptivecontrolofarocketboostedmissile,in Proc.Amer.ControlConf. ,2015,pp. 2520. [74]B.L.StevensandF.L.Lewis, Aircraftcontrolandsimulation .JohnWiley&Sons, 2003. [75]K.A.Wise,Bank-to-turnmissileautopilotdesignusinglooptransferrecovery, J. Guid.Contr.Dynam. ,vol.13,no.1,pp.145,1990. [76]K.S.NarendraandA.M.Annaswamy, Stableadaptivesystems .Courier Corporation,2012. [77]H.K.KhalilandJ.Grizzle, Nonlinearsystems .PrenticehallNewJersey,1996, vol.3. [78]P.IoannouandB.Fidan,Adaptivecontroltutorial,societyforindustrialandapplied mathematics,2006. [79]K.S.NarendraandA.M.Annaswamy,Anewadaptivelawforrobustadaptation withoutpersistentexcitation, IEEETrans.Autom.Control ,vol.32,no.2,pp. 134,1987. [80]S.A.NivisonandP.Khargonekar,Improvinglong-termlearningofmodelreference adaptivecontrollersforightapplications:Asparseneuralnetworkapproach,in AIAAGuidance,Navigation,andControlConf. ,2017,p.1249. [81]F.L.Lewis,A.Yesildirek,andK.Liu,Multilayerneural-netrobotcontrollerwith guaranteedtrackingperformance, IEEETrans.NeuralNetw. ,vol.7,no.2,pp. 388,1996. [82]Y.Shin,Neuralnetworkbasedadaptivecontrolfornonlineardynamicregimes, Ph.D.dissertation,2005. [83]M.B.McFarlandandD.T.Stansbery,Adaptivenonlinearautopilotforanti-air missiles.DTICDocument,Tech.Rep.,1998. [84]G.Chowdhary,H.A.Kingravi,J.P.How,andP.A.Vela,Bayesiannonparametric adaptivecontrolusinggaussianprocesses, IEEETrans.NeuralNetw. ,vol.26, no.3,pp.537,2015. [85]G.Gybenko,Approximationbysuperpositionofsigmoidalfunctions, Math.Control SignalsSyst. ,vol.2,no.4,pp.303,1989. [86]S.J.Russell,P.Norvig,J.F.Canny,J.M.Malik,andD.D.Edwards, Articial intelligence:amodernapproach .PrenticehallUpperSaddleRiver,2003,vol.2. 168


[87] T. Zhang, S. S. Ge, and C. C. Hang, "Design and performance analysis of a direct adaptive controller for nonlinear systems," Automatica, vol. 35, no. 11, pp. 1809, 1999.
[88] N. Kim, "Improved methods in neural network-based adaptive output feedback control, with applications to flight control," Ph.D. dissertation, Georgia Institute of Technology, 2003.
[89] L. Vu, D. Chatterjee, and D. Liberzon, "ISS of switched systems and applications to switching adaptive control," in IEEE Conf. Decision and Control, 2005, pp. 120.
[90] M. A. Bolender and D. B. Doman, "Nonlinear longitudinal dynamical model of an air-breathing hypersonic vehicle," J. Spacecraft Rockets, vol. 44, no. 2, pp. 374, 2007.
[91] A. Chakraborty, P. Seiler, and G. J. Balas, "Nonlinear region of attraction analysis for flight control verification and validation," Control Eng. Pract., vol. 19, no. 4, pp. 335, 2011.
[92] J. Chung, C. Gulcehre, K. Cho, and Y. Bengio, "Empirical evaluation of gated recurrent neural networks on sequence modeling," arXiv preprint arXiv:1412.3555, 2014.
[93] J. W. Curtis and R. W. Beard, "A graphical understanding of Lyapunov-based nonlinear control," in IEEE Conf. Decision and Control, vol. 2, 2002, pp. 2278.
[94] S. A. Nivison and P. P. Khargonekar, "Development of a robust deep recurrent neural network controller for flight applications," in Proc. Amer. Control Conf., 2017, pp. 5336.


BIOGRAPHICAL SKETCH

Scott Nivison was born in Fairbanks, Alaska. After four years of study, he received a Bachelor of Science degree in electrical and computer engineering from the University of Florida in 2009. After the completion of his degree, he began working for the Air Force Research Laboratory, Munitions Directorate, at Eglin AFB, Florida. During his time at the lab, he earned a Master of Science degree in electrical and computer engineering in 2012. He began his pursuit of his Ph.D. in 2013.