
Optimal First Order Methods for a Class of Non-Smooth Convex Optimization with Applications to Image Analysis


Material Information

Title:
Optimal First Order Methods for a Class of Non-Smooth Convex Optimization with Applications to Image Analysis
Physical Description:
1 online resource (190 p.)
Language:
english
Creator:
Ouyang, Yuyuan
Publisher:
University of Florida
Place of Publication:
Gainesville, Fla.
Publication Date:

Thesis/Dissertation Information

Degree:
Doctorate ( Ph.D.)
Degree Grantor:
University of Florida
Degree Disciplines:
Mathematics
Committee Chair:
Chen, Yunmei
Committee Members:
Rao, Murali
Mccullough, Scott A
Mareci, Thomas H
Zhang, Lei

Subjects

Subjects / Keywords:
imaging -- nonsmooth -- optimization -- smooth -- stochastic
Mathematics -- Dissertations, Academic -- UF
Genre:
Mathematics thesis, Ph.D.
bibliography   ( marcgt )
theses   ( marcgt )
government publication (state, provincial, territorial, dependent)   ( marcgt )
born-digital   ( sobekcm )
Electronic Thesis or Dissertation

Notes

Abstract:
This PhD dissertation concerns optimal first order methods in convex optimization and their applications in imaging science. The research is motivated by rapid advances in the technologies for digital data acquisition, which result in a high demand for efficient algorithms to solve non-smooth convex optimization problems. In this dissertation we develop theories and optimal numerical methods for solving a class of deterministic and stochastic saddle point problems, and more general variational inequalities, arising from large-scale data analysis problems. In the first part of this dissertation, we aim to solve a class of deterministic and stochastic saddle point problems (SPP), which has been considered as a framework for ill-posed inverse problems regularized by a non-smooth functional in many data analysis problems, such as image reconstruction in compressed sensing and machine learning. The proposed deterministic accelerated primal-dual (APD) algorithm has the same optimal rate of convergence as the one obtained by Nesterov for a different scheme. We also propose a stochastic APD algorithm that exhibits an optimal rate of convergence. To the best of our knowledge, no stochastic primal-dual algorithms had previously been developed in the literature. In the second part, we consider a class of affine equality constrained convex composite optimization problems, which can be solved by the alternating direction method of multipliers (ADMM). The problem class of interest is also closely related to the SPP studied in the first part of this dissertation. We propose two novel accelerated linearized ADMM methods, namely the accelerated linearized ADMM (AL-ADMM) and the accelerated linearized preconditioned ADMM (ALP-ADMM) methods, and prove that the accelerated methods exhibit a better rate of convergence than their unaccelerated counterparts, in both theory and experiments.
In the third part, we consider a broader class of problems, the variational inequalities (VI), which includes the previous two parts as special cases. We demonstrate that, if we identify and decompose the potential functional components of a VI and treat them differently in the design of the VI solution methods, our numerical method can achieve a better rate of convergence. We propose an Accelerated Prox-Method (AC-PM) for solving a class of deterministic and stochastic VI. Both the deterministic and stochastic AC-PM algorithms achieve the optimal rate of convergence. In the last part, we introduce an application of the total variation (TV) and wavelet regularization framework to Diffusion Weighted Imaging (DWI). The proposed framework is able to simultaneously reconstruct and regularize the Orientation Distribution Function (ODF), and extract better directional information from noisy DWI data. We show that the TV-wavelet framework for DWI is a special class of SPP, and use a primal-dual method to solve the ODF reconstruction problem.
General Note:
In the series University of Florida Digital Collections.
General Note:
Includes vita.
Bibliography:
Includes bibliographical references.
Source of Description:
Description based on online resource; title from PDF title page.
Source of Description:
This bibliographic record is available under the Creative Commons CC0 public domain dedication. The University of Florida Libraries, as creator of this bibliographic record, has waived all rights to it worldwide under copyright law, including all related and neighboring rights, to the extent allowed by law.
Statement of Responsibility:
by Yuyuan Ouyang.
Thesis:
Thesis (Ph.D.)--University of Florida, 2013.
Local:
Adviser: Chen, Yunmei.
Electronic Access:
RESTRICTED TO UF STUDENTS, STAFF, FACULTY, AND ON-CAMPUS USE UNTIL 2015-08-31

Record Information

Source Institution:
UFRGP
Rights Management:
Applicable rights reserved.
Classification:
lcc - LD1780 2013
System ID:
UFE0045919:00001




OPTIMAL FIRST ORDER METHODS FOR A CLASS OF NON-SMOOTH CONVEX OPTIMIZATION WITH APPLICATIONS TO IMAGE ANALYSIS

By

YUYUAN OUYANG

A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA

2013

© 2013 Yuyuan Ouyang

To my family.

ACKNOWLEDGMENTS

I would like to express my deepest appreciation to the many people who helped me through my graduate study and offered numerous help in the preparation of this dissertation. I am heartily thankful to my advisor, Dr. Yunmei Chen, for her encouragement, guidance and support from the initial to the final level of my study. She not only mentored me in how to develop understanding and conduct research in imaging science, but also gave me generous help and wise suggestions on my further career. I am deeply grateful to Dr. Guanghui Lan, for introducing me to the field of nonlinear optimization and sharing with me his great knowledge. He has given me invaluable advice and tremendous support as I moved from basic ideas to completed studies. I would like to thank Dr. Murali Rao, Dr. Lei Zhang, Dr. Scott McCullough and Dr. Thomas Mareci for serving as my doctoral committee members, and for helping me to the completion of this dissertation. Finally, and most importantly, I would like to express my special thanks to my parents and my wife for their endless love. Without their encouragement, understanding and patience this dissertation would not have happened.

TABLE OF CONTENTS

ACKNOWLEDGMENTS
LIST OF TABLES
LIST OF FIGURES
ABSTRACT

CHAPTER

1 BACKGROUND
  1.1 Basic Convex Optimization and Optimal First Order Methods
    1.1.1 Smooth Convex Optimization
    1.1.2 Saddle Point Problems and Variational Inequalities
  1.2 Total Variation, Image Analysis and Related Optimization Problems
  1.3 Outline of The Dissertation

2 OPTIMAL SCHEMES FOR A CLASS OF SADDLE POINT PROBLEMS
  2.1 Introduction
    2.1.1 Deterministic SPP
    2.1.2 Stochastic SPP
    2.1.3 Main Results
  2.2 Accelerated Primal-Dual Methods for Deterministic SPP
  2.3 Stochastic APD Methods for Stochastic SPP
  2.4 Convergence Analysis
    2.4.1 Convergence Analysis for Deterministic APD Algorithm
    2.4.2 Convergence Analysis for Stochastic APD Algorithm
  2.5 Application: Partially Parallel Imaging
  2.6 Concluding Remarks of This Chapter

3 ACCELERATION OF LINEARIZED ALTERNATING DIRECTION METHOD OF MULTIPLIERS
  3.1 Introduction
    3.1.1 Notations and Terminologies
    3.1.2 Alternating Direction Method of Multipliers and Its Variants
    3.1.3 Accelerated Methods for AECO and UCO Problems
    3.1.4 Main Results
  3.2 An Accelerated Linearized ADMM Framework
    3.2.1 Gap Functions
    3.2.2 Proposed Framework
    3.2.3 Convergence Results for Problems of Type I
    3.2.4 Convergence Results for Problems of Type II
    3.2.5 More General Choices of The Weighting Parameters
  3.3 Convergence Analysis
    3.3.1 Convergence Analysis for Problems of Type I
    3.3.2 Convergence Analysis for Problems of Type II
  3.4 Numerical Examples
    3.4.1 Comparison of Linearized ADMM Algorithms
    3.4.2 Comparison with Other Algorithms
  3.5 Concluding Remarks of This Chapter

4 OPTIMAL SCHEMES FOR A CLASS OF HEMIVARIATIONAL INEQUALITY PROBLEMS
  4.1 Introduction
    4.1.1 Deterministic HVI
    4.1.2 Stochastic HVI
    4.1.3 Main Results
  4.2 Accelerated Prox-Method for Deterministic HVI
  4.3 Accelerated Prox-Method for Stochastic HVI
  4.4 Convergence Analysis
    4.4.1 Convergence Analysis for Deterministic AC-PM
    4.4.2 Convergence Analysis for Stochastic AC-PM
  4.5 Concluding Remarks of This Chapter

5 DIFFUSION WEIGHTED IMAGING
  5.1 Introduction
  5.2 Spherical Harmonic Series for ODF Reconstruction
    5.2.1 Spherical Harmonics Series
    5.2.2 SHS Approximation of Funk-Radon Transform
  5.3 Model Description
    5.3.1 Least Squares Energy
    5.3.2 Angular Regularization
    5.3.3 Spatial Regularization
    5.3.4 Proposed Model
    5.3.5 Discrete Form of The Proposed Model
  5.4 Numerical Scheme
    5.4.1 Primal-Dual Formulation
    5.4.2 Primal-Dual Scheme for Solving the Proposed Model
  5.5 Experimental Results
    5.5.1 Synthetic Results
    5.5.2 Real Data
  5.6 Concluding Remarks of This Chapter

REFERENCES

BIOGRAPHICAL SKETCH

LIST OF TABLES

3-1 The rate of convergence of ADMM-type algorithms for solving problems of type I
3-2 The rate of convergence of the primal residuals of ADMM-type algorithms for solving problems of type II
3-3 The rate of convergence of the feasibility residuals of ADMM-type algorithms for solving problems of type II
3-4 Comparison of objective values of linearized ADMM algorithms for solving ( 3 ) with instance Bernoulli and parameter 0.005
3-5 Comparison of normalized RMSEs of linearized ADMM algorithms for solving ( 3 ) with instance Bernoulli and parameter 0.005
3-6 Comparison of objective values of linearized ADMM algorithms for solving ( 3 ) with instance Gaussian and parameter 0.005
3-7 Comparison of normalized RMSEs of linearized ADMM algorithms for solving ( 3 ) with instance Gaussian and parameter 0.005
3-8 Comparison of objective values of linearized ADMM algorithms for solving ( 3 ) with instance Gaussian and parameter $10^{-5}$
3-9 Comparison of normalized RMSEs of linearized ADMM algorithms for solving ( 3 ) with instance Gaussian and parameter $10^{-5}$
3-10 Comparison of the performance of linearized ADMM algorithms for solving ( 3 ), when the parameter is set to 28
5-1 Comparison of computational time (in seconds) resulting from three models under SNR = 15, 20, 25 and 30 respectively
5-2 Comparison of computational time (in seconds) resulting from three models under SNR = 15, 20, 25 and 30 respectively
5-3 Comparison of RMSE resulting from three models under SNR = 15, 20, 25 and 30 respectively
5-4 Comparison of SSD resulting from three models under SNR = 15, 20, 25 and 30 respectively

LIST OF FIGURES

2-1 The mask of the k-space data acquisition
2-2 The sensitivity maps of the eight receiver coils
2-3 Comparison of Nesterov, APD, PD, OS, SBB and SBB with line search
2-4 Comparison of the reconstructed image from the APD algorithm and the ground truth
3-1 The reconstructed images of the Bernoulli instance from different algorithms
3-2 The reconstructed images of the Gaussian instance from different algorithms
3-3 The reconstructed images of the PPI instance from different algorithms
3-4 The cerebellum part of reconstructed images of the PPI instance from different algorithms
3-5 Comparison of the performance of AL-ADMM-4, L-ADMM, NESTA and APD in terms of objective value and normalized RMSE
5-1 The simulated region of fiber crossings
5-2 The performance of the proposed model while varying one parameter and fixing the other two
5-3 The image of spherical harmonic coefficients $a_2(x), a_3(x), \dots, a_R(x)$ of ODF 2
5-4 The region of interest in real data
5-5 Comparison of the ODF reconstruction results from real data

Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

OPTIMAL FIRST ORDER METHODS FOR A CLASS OF NON-SMOOTH CONVEX OPTIMIZATION WITH APPLICATIONS TO IMAGE ANALYSIS

By

Yuyuan Ouyang

August 2013

Chair: Yunmei Chen
Major: Mathematics

This PhD dissertation concerns optimal first order methods in convex optimization and their applications in imaging science. The research is motivated by rapid advances in the technologies for digital data acquisition, which result in a high demand for efficient algorithms to solve non-smooth convex optimization problems. In this dissertation we develop theories and optimal numerical methods for solving a class of deterministic and stochastic saddle point problems, and more general variational inequalities, arising from large-scale data analysis problems.

In the first part of this dissertation, we aim to solve a class of deterministic and stochastic saddle point problems (SPP), which has been considered as a framework for ill-posed inverse problems regularized by a non-smooth functional in many data analysis problems, such as image reconstruction in compressed sensing and machine learning. The proposed deterministic accelerated primal-dual (APD) algorithm has the same optimal rate of convergence as the one obtained by Nesterov for a different scheme. We also propose a stochastic APD algorithm that exhibits an optimal rate of convergence. To the best of our knowledge, no stochastic primal-dual algorithms had previously been developed in the literature.

In the second part, we consider a class of affine equality constrained convex composite optimization problems, which can be solved by the alternating direction method of multipliers (ADMM). The problem class of interest is also closely related to the SPP studied in the first part of this dissertation. We propose two novel accelerated linearized ADMM methods, namely the accelerated linearized ADMM (AL-ADMM) and the accelerated linearized preconditioned ADMM (ALP-ADMM) methods, and prove that the accelerated methods exhibit a better rate of convergence than their unaccelerated counterparts, in both theory and experiments.

In the third part, we consider a broader class of problems, the variational inequalities (VI), which includes the previous two parts as special cases. We demonstrate that, if we identify and decompose the potential functional components of a VI and treat them differently in the design of the VI solution methods, our numerical method can achieve a better rate of convergence. We propose an Accelerated Prox-Method (AC-PM) for solving a class of deterministic and stochastic VI. Both the deterministic and stochastic AC-PM algorithms achieve the optimal rate of convergence.

In the last part, we introduce an application of the total variation (TV) and wavelet regularization framework to Diffusion Weighted Imaging (DWI). The proposed framework is able to simultaneously reconstruct and regularize the Orientation Distribution Function (ODF), and extract better directional information from noisy DWI data. We show that the TV-wavelet framework for DWI is a special class of SPP, and use a primal-dual method to solve the ODF reconstruction problem.

CHAPTER 1
BACKGROUND

In this chapter, we introduce the basics of convex optimization, total variation (TV) and some image analysis, which motivate our research. In Section 1.1, we review the iteration complexity results of smooth convex optimization problems, the variational inequalities (VI) and the saddle point problems (SPP). We introduce TV based image analysis in Section 1.2, and discuss the formulation of TV based imaging problems as convex optimization problems. We provide the outline of this thesis in Section 1.3.

1.1 Basic Convex Optimization and Optimal First Order Methods

In this section, we introduce the basics of convex optimization and the complexity theory. Let $\mathcal{X}$ be a finite dimensional vector space with norm $\|\cdot\|$ and inner product $\langle\cdot,\cdot\rangle$. In general, a convex optimization problem has the following form:

  $\min_{x\in X} f(x)$  s.t. $h_i(x)\le 0$, $i=1,\dots,l$,

where the objective function $f(x):\mathcal{X}\to\mathbb{R}$ and the functional constraints $h_i(x):\mathcal{X}\to\mathbb{R}$ are convex functions, and the feasible set $X\subseteq\mathcal{X}$ is a convex set. Throughout the thesis, we consider a basic form of convex optimization, without the functional constraints:

  $\min_{x\in X} f(x)$.  (1-1)

Also, we will only consider iterative first-order methods for solving (1-1), i.e., at any iterate point $x$, the only information we use is the function value $f(x)$ and its gradient/subgradient $f'(x)\in\partial f(x)$. In particular, we may refer to the assumption of first-order oracles, which assumes that we can only access information from an oracle $\mathcal{O}$: for any test point $x$, the oracle $\mathcal{O}$ outputs the first-order information about $f(\cdot)$ at $x$. For example, in the black-box assumption, it is assumed that the oracle $\mathcal{O}$ only outputs $f(x)$ and one subgradient $f'(x)$. Under the assumption of oracles, the efficiency of any first-order method can be evaluated by the number of calls to the oracle.
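The oracle model just described can be made concrete with a small counting wrapper; the class name and the $\ell_1$ example below are our own illustrative choices, not constructions from the dissertation.

```python
import numpy as np

# A minimal sketch of the black-box first-order oracle O described above:
# for any test point x it returns only f(x) and one (sub)gradient f'(x),
# and the efficiency of a method is measured by the number of oracle calls.

class FirstOrderOracle:
    def __init__(self, f, grad_f):
        self.f, self.grad_f = f, grad_f
        self.calls = 0                      # number of oracle calls so far

    def __call__(self, x):
        self.calls += 1
        return self.f(x), self.grad_f(x)

# Example: f(x) = ||x||_1 is non-smooth; sign(x) is one valid subgradient.
oracle = FirstOrderOracle(lambda x: np.sum(np.abs(x)), np.sign)

x = np.array([1.0, -2.0, 0.5])
fx, gx = oracle(x)
# One call made; nothing about f is revealed beyond the pair (f(x), f'(x)).
assert oracle.calls == 1 and fx == 3.5
```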

1.1.1 Smooth Convex Optimization

Throughout this subsection, we assume that $f(x)$ is convex and continuously differentiable with Lipschitz continuous gradient over $X$, i.e., $\|\nabla f(x)-\nabla f(y)\|_* \le L\|x-y\|$ for all $x,y\in X$, where $\|\cdot\|_*$ is the conjugate norm. Supposing that there exists a minimizer $x^*$, for any approximate solution $x\in X$, we measure the quality of $x$ by $f(x)-f(x^*)$. If

  $f(x)-f(x^*)\le\varepsilon$,

then we say that $x$ is an $\varepsilon$-solution of problem (1-1).

In a seminal work [70], Nesterov presented a method that is able to compute an $\varepsilon$-solution of (1-1) in

  $O\left(\sqrt{\frac{L}{\varepsilon}\|x_1-x^*\|^2}\right)$

iterations. In other words, if $\{x_t\}_{t=1}^N$ is the sequence of iterates, then the rate of convergence of Nesterov's method is $O(1/N^2)$:

  $f(x_N)-f(x^*)\le O\left(\frac{L\|x_1-x^*\|^2}{N^2}\right)$.

Nesterov showed the optimality of the method in the following sense: under the black-box assumption, for any $N$ and any $L$, there exists a function $f_N$ with Lipschitz constant $L$, such that for any first order method, the $N$-th iterate $x_N$ always satisfies

  $f_N(x_N)-f_N^* \ge \frac{3L\|x_1-x^*\|^2}{32(N+1)^2}$,  (1-2)

where $x^*$ is the minimizer of $f_N$ and $f_N^*=f_N(x^*)$. By (1-2), the $O(1/N^2)$ rate is unbeatable among first-order methods, thus Nesterov's method is optimal.

Nesterov's method was further studied in several papers and has many variants, for example, [4, 53, 71, 72, 91]. We review a variant of Nesterov's method in [4, 91].
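The Lipschitz-gradient condition above can be checked numerically; the quadratic instance below is our own hypothetical example, using the fact that for $f(x)=\frac{1}{2}x^TAx-b^Tx$ with symmetric $A$ the smallest valid constant is $L=\lambda_{\max}(A)$.

```python
import numpy as np

# A quick numerical check (our example, not the text's) of the condition
# ||grad f(x) - grad f(y)|| <= L ||x - y|| for the Euclidean norm, on the
# quadratic f(x) = 1/2 x^T A x - b^T x, where grad f(x) = A x - b and the
# smallest valid constant is L = lambda_max(A).

rng = np.random.default_rng(0)
M = rng.standard_normal((6, 6))
A = M @ M.T                         # symmetric positive semidefinite
b = rng.standard_normal(6)

grad_f = lambda x: A @ x - b
L = np.linalg.eigvalsh(A)[-1]       # largest eigenvalue of A

ok = all(
    np.linalg.norm(grad_f(x) - grad_f(y)) <= L * np.linalg.norm(x - y) + 1e-9
    for x, y in ((rng.standard_normal(6), rng.standard_normal(6))
                 for _ in range(100))
)
assert ok
```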

To begin with, we define the distance generating functions and the Bregman divergence, which is a generalization of the Euclidean distance. The Bregman distance is a very useful tool which is related to the geometric properties of the set $X$.

Definition 1. A continuous convex function $\omega:X\to\mathbb{R}$ is a distance generating function of modulus $\alpha>0$ with respect to $\|\cdot\|$ if the following holds:

1. The set $X^o := \{x\in X \mid \partial\omega(x)\ne\emptyset\}$ is nonempty and convex.

2. $\omega(\cdot)$ is strongly convex on $X^o$: for all $x,x'\in X^o$, $\langle\nabla\omega(x')-\nabla\omega(x),\,x'-x\rangle \ge \alpha\|x'-x\|^2$.

Definition 2. Given a distance generating function $\omega(\cdot)$, the function $V:X^o\times X\to\mathbb{R}_+$ with expression

  $V(u,x) := \omega(u)-\omega(x)-\langle\nabla\omega(x),\,u-x\rangle$

is called a Bregman divergence (or a prox-function), associated with $\omega(\cdot)$.

Throughout this dissertation, we may use either the name Bregman divergence or prox-function to describe the function $V(\cdot,\cdot)$ in Definition 2. From Definition 1, we can observe from the strong convexity of $\omega(\cdot)$ that

  $\frac{\alpha}{2}\|x-x_1\|^2 \le V(x,x_1)$, for all $x_1\in X^o$, $x\in X$.

In the context of this dissertation, we assume that $\omega(\cdot)$ is fixed, thus $V(\cdot,\cdot)$ is fixed.

For a given point $x\in X^o$, the prox-mapping associated with $\omega(\cdot)$ and $V(\cdot,\cdot)$ is defined as

  $P_x(\xi) := \arg\min_{u\in X}\ \langle\xi,\,u-x\rangle + V(u,x)$.

The simplest example of a prox-mapping is under the Euclidean setting. If we assume that $\|\cdot\|$ is induced by $\langle\cdot,\cdot\rangle$, and set $\omega(\cdot):=\frac{1}{2}\|\cdot\|^2$, then $\omega(\cdot)$ is a distance generating function

of modulus $\alpha=1$. Furthermore, we have $V(u,x)=\frac{1}{2}\|x-u\|^2$, and

  $P_x(\xi) = \arg\min_{u\in X}\ \langle\xi,\,u-x\rangle + \frac{1}{2}\|x-u\|^2$.  (1-3)

We are now ready to present in Algorithm 1.1 below a variant of Nesterov's optimal method in [4, 91].

Algorithm 1.1 Nesterov's optimal method in [91].
1: Choose $x_1\in X$.
2: For $t=1,2,\dots,N-1$, calculate
   $x^{md}_t = \frac{t-1}{t+1}x^{ag}_t + \frac{2}{t+1}x_t$,
   $x_{t+1} = P_{x_t}\left(\frac{\alpha t}{2L}\nabla f(x^{md}_t)\right)$,
   $x^{ag}_{t+1} = \frac{t-1}{t+1}x^{ag}_t + \frac{2}{t+1}x_{t+1}$.
3: Output $x^{ag}_{t+1}$.

The main result concerning the rate of convergence of Algorithm 1.1 is summarized in the following theorem.

Theorem 1.1. Assume that $f(x)$ is a convex, continuously differentiable function and that $\nabla f(\cdot)$ has Lipschitz constant $L$. Let $V(\cdot,\cdot)$ be a Bregman divergence with respect to a distance generating function of modulus $\alpha$. For any $t\ge 1$, the iterate $x^{ag}_{t+1}$ of Algorithm 1.1 satisfies

  $f(x^{ag}_{t+1}) - f^* \le \frac{4LV(x^*,x_1)}{\alpha t(t+1)}$,

where $x^*\in X$ is a minimizer of $f(x)$ and $f^* := f(x^*)$.

1.1.2 Saddle Point Problems and Variational Inequalities

Consider a special class of the convex optimization problem (1-1), in which

  $f(x) = \max_{y\in Y}\Phi(x,y)$,

where $Y$ is a convex set in a finite dimensional vector space $\mathcal{Y}$, and $\Phi(\cdot,\cdot)$ is a convex-concave function. We can formulate our optimization problem as a saddle point problem (SPP)

  $\min_{x\in X}\max_{y\in Y}\ \Phi(x,y)$.  (1-4)

We say that $(x^*,y^*)$ is a pair of saddle points if for all $x\in X$ and $y\in Y$, we have

  $\Phi(x^*,y) \le \Phi(x^*,y^*) \le \Phi(x,y^*)$.  (1-5)

For example, if we consider the following affine equality constrained optimization (AECO) problem

  $\min_{x\in X,\,w\in W}\ G(x)+F(w)$  s.t. $Bw-Kx=b$,  (1-6)

where $\mathcal{W}$ is a finite dimensional vector space, $G(\cdot):X\to\mathbb{R}$ and $F(\cdot):W\to\mathbb{R}$ are finite valued, convex, proper and lower semi-continuous functions, and $K:\mathcal{X}\to\mathcal{Y}$, $B:\mathcal{W}\to\mathcal{Y}$ are bounded linear operators, then (1-6) is equivalent to the following SPP:

  $\min_{x\in X,\,w\in W}\max_{y\in\mathcal{Y}}\ G(x)+F(w)-\langle y,\,Bw-Kx-b\rangle$.

Now let us consider the case when $\Phi(x,y)$ is continuously differentiable both with respect to $x$ and with respect to $y$. If we let $Z=X\times Y$, then in view of (1-5) we see that solving (1-4) becomes solving for $z^*\in Z$ that satisfies

  $\langle H(z^*),\,z-z^*\rangle \ge 0$, for all $z\in Z$,  (1-7)

where

  $H(z) := \begin{bmatrix} \frac{\partial}{\partial x}\Phi(x,y) \\ -\frac{\partial}{\partial y}\Phi(x,y) \end{bmatrix}$, for all $z=(x,y)\in Z$.  (1-8)

A problem of type (1-7) is called a variational inequality (VI) problem. In particular, given a finite dimensional vector space $\mathcal{Z}$, a convex set $Z\subseteq\mathcal{Z}$ and a function $H:Z\to\mathcal{Z}$, the VI

problem with respect to $H(\cdot)$ aims to solve for $z^*\in Z$ such that either (1-7) holds, or

  $\langle H(z),\,z-z^*\rangle \ge 0$  (1-9)

holds. A solution $z^*$ that satisfies (1-7) is called a strong solution, and a solution that satisfies (1-9) is called a weak solution. It is worth noting that if $H(\cdot)$ is monotone, i.e., $\langle H(u)-H(v),\,u-v\rangle \ge 0$ for all $u,v\in Z$, then a strong solution is also a weak solution; if $H(\cdot)$ is continuous, then a weak solution is also a strong solution. As an example, we can see that $H(\cdot)$ is both monotone and continuous in (1-8), hence if $(x^*,y^*)$ is a saddle point for the SPP (1-4), then $z^*=(x^*,y^*)$ is both a strong and a weak solution for the VI with respect to $H(\cdot)$ in (1-8).

Nemirovski proposed a prox-method in [65] to solve (1-7). The scheme of the prox-method is described below in Algorithm 1.2.

Algorithm 1.2 Nemirovski's prox-method for solving (1-7)
1: Choose $r_1\in Z$. Set $w_1=r_1$.
2: For $t=1,2,\dots,N-1$, calculate
   $w_{t+1} = P_{r_t}(\gamma_t H(r_t))$,
   $r_{t+1} = P_{r_t}(\gamma_t H(w_{t+1}))$.
3: Output $\left[\sum_{t=1}^N \gamma_t\right]^{-1}\sum_{t=1}^N \gamma_t w_t$.

It should be noted that, if the prox-mapping in Algorithm 1.2 is under the Euclidean setting in (1-3), then Algorithm 1.2 is equivalent to Korpelevich's extragradient method [47]. In one result of [65], Nemirovski showed that when $Z$ is bounded and $H(\cdot)$ is monotone and Lipschitz continuous with constant $L_H$, the rate of convergence of

Algorithm 1.2 for solving (1-7) is

  $O(1)\,\frac{L_H}{N}$,  (1-10)

where $L_H$ is the Lipschitz constant of $H(\cdot)$. Under the Euclidean setting, Monteiro et al. showed in [62] that, for unbounded $Z$, the rate of convergence of the extragradient method is also (1-10). If we assume a black-box assumption for VI, i.e., that there exists an oracle that outputs $H(x)$ for any test point $x$, then Algorithm 1.2 is an optimal method. This is because it is possible to construct a function $H(\cdot)$ such that the rate of convergence of any first order method for solving (1-7) is no better than (1-10) [67, 68].

One remarkable study of a class of SPP was conducted by Nesterov in [72], in which the following SPP is studied:

  $\min_{x\in X}\max_{y\in Y}\ G(x)+\langle Kx,y\rangle - J(y)$,  (1-11)

where $X$ and $Y$ are bounded sets, $G(x)$ is a convex, continuously differentiable function with Lipschitz continuous gradient with constant $L_G$, $K$ is an operator with norm $L_K$, and $J(y)$ is a general convex function. We can study (1-11) from either a minimization perspective, or a VI perspective. On one hand, if we let

  $f(x) = G(x)+\max_{y\in Y}\ \langle Kx,y\rangle - J(y)$,

then $f$ can possibly be non-smooth, and the problem (1-11) is a non-smooth convex optimization problem. On the other hand, if $J\equiv 0$, letting $Z=X\times Y$ and defining

  $H(z) = \begin{bmatrix} \nabla G(x)+K^T y \\ -Kx \end{bmatrix}$, for all $z=(x,y)\in X\times Y$,  (1-12)

then the SPP (1-11) is equivalent to a VI problem with respect to $H(\cdot)$ in (1-7), and we can show that $L_H=L_G+L_K$. It is interesting to observe that from either perspective, problem (1-11) provides more first-order information than the mere

black-box assumption. In fact, under the black-box assumption, the bound on the rate of convergence for solving the general non-smooth convex problem (1-11) is $O(1/\sqrt{N})$ (see, e.g., [72]), and as we mentioned after Algorithm 1.2, the $O(L_H/N)$ rate is optimal for solving general VI. However, Nesterov showed in [72] that by utilizing a smoothing technique, it is possible to develop a numerical scheme that achieves the following rate of convergence for solving (1-11):

  $O\left(\frac{L_G}{N^2}+\frac{L_K}{N}\right)$.

An important reason for Nesterov's better rate of convergence is that Nesterov's scheme in [72] exploits the structural information of the SPP as well as the smoothness of $G(x)$. In other words, it is possible to achieve a better rate of convergence if one has more information than the black-box assumption.

1.2 Total Variation, Image Analysis and Related Optimization Problems

The application of Total Variation (TV) based image analysis originates from the image denoising model by Rudin, Osher and Fatemi in [85]. The ROF model is defined as

  $\min_{u\in L^1(\Omega)}\ \lambda\int_\Omega |Du| + \frac{1}{2}\int_\Omega |u(x)-f(x)|^2\,dx$,

where $\Omega\subseteq\mathbb{R}^d$ is a $d$-dimensional image domain, $f\in L^1(\Omega)$ is the noisy input, $u\in L^1(\Omega)$ is the recovered image, and $\lambda$ is the regularization parameter. If $u\in W^{1,1}(\Omega)$, then the total variation term reduces to

  $\int_\Omega |Du| = \int_\Omega |\nabla u(x)|\,dx$.

In discrete form, the ROF model can be formulated as

  $\min_{x\in\mathbb{R}^n}\ \lambda\sum_{i=1}^n \|D_i x\| + \frac{1}{2}\|x-f\|^2$,

where $x$ is the vector form of the sought image, $n=|\Omega|$ and $D_i:\mathbb{R}^n\to\mathbb{R}^d$ is the finite difference operator at index $i$ of the image domain. The TV regularization technique

has been proven effective in imaging science, since it is capable of smoothing noise away while preserving the edges of the recovered image. In general, a TV based image analysis model is formulated as

  $\min_{x\in\mathbb{R}^n}\ \sum_{i=1}^n \|D_i x\| + G(x)$,  (1-13)

where $G(x)$ is a term that describes the data fidelity. The problem in (1-13) is in general non-smooth due to the total variation regularization term, therefore by classical complexity theory (see, e.g., [72]), the bound on the rate of convergence of any first-order method is $O(1/\sqrt{N})$ under the black-box assumption. However, we can observe that problem (1-13) contains special structural information, and it is no longer necessary to assume the black-box assumption. In the rest of this section, we describe two equivalent formulations of problem (1-13), and show that problem (1-13) is a special instance of the AECO problem (1-6), the SPP (1-4), and also the VI problem (1-7).

Firstly, if we introduce one additional variable $w\in\mathbb{R}^{dn}$ such that $w=Dx$, where

  $w = \begin{pmatrix} w_1 \\ w_2 \\ \vdots \\ w_n \end{pmatrix}$, $w_i = D_i x\in\mathbb{R}^d$ for all $i$, and $D = \begin{pmatrix} D_1 \\ D_2 \\ \vdots \\ D_n \end{pmatrix}$,  (1-14)

then problem (1-13) is equivalent to

  $\min_{x\in\mathbb{R}^n,\,w\in\mathbb{R}^{dn}}\ G(x)+\sum_{i=1}^n \|w_i\|$  s.t. $w=Dx$,

which belongs to the class of AECO problems. Secondly, using the dual formulation of norms, we have a primal-dual formulation of the TV model:
$$\min_{x\in\mathbb{R}^n}\ \sum_{i=1}^n\|D_ix\|+G(x)=\min_{x\in\mathbb{R}^n}\max_{y_i\in\mathbb{R}^d,\,\|y_i\|\le1}\ \sum_{i=1}^n\langle D_ix,y_i\rangle+G(x)=\min_{x\in\mathbb{R}^n}\max_{y\in Y}\ \langle Dx,y\rangle+G(x),$$
where
$$Y=\left\{(y_1^T,\ldots,y_n^T)^T\mid y_i\in\mathbb{R}^d,\ \|y_i\|\le1\ \text{for all }i=1,\ldots,n\right\},$$
and $D$ is defined as above. We can observe that this formulation is not only a special instance of the SPP, but also a special instance of the VI problem.

1.3 Outline of The Dissertation

This thesis is organized as follows. In Chapter 2, we discuss optimal schemes for solving the class of SPPs. In [72], Nesterov proposed an optimal method for solving this class from a smoothing perspective; it should be noted that in [72] either $X$ or $Y$ is assumed to be bounded. We present an accelerated primal-dual (APD) method which has the same optimal rate of convergence as Nesterov's smoothing method when $X$ and $Y$ are bounded. Furthermore, if either $X$ or $Y$ is unbounded, the proposed APD method still achieves the optimal rate of convergence. We also consider the stochastic case, where a stochastic oracle $\mathcal{SO}$ supplies noisy first-order information; the stochastic version of the proposed APD method attains the optimal rate of convergence for solving the stochastic SPP. In Chapter 3, we present an acceleration framework for linearized ADMM that is able to solve the AECO problem. We conduct convergence analysis of the proposed accelerated algorithms in terms of both the primal and feasibility residuals, and show that the proposed accelerated methods can efficiently solve the AECO problem when $\nabla G$ has a large Lipschitz constant (as large as $O(N)$). In Chapter 4, we extend the development to optimal methods for solving a class of VIs that includes the class of the

SPPs and AECO problems as special instances. We show that we can significantly accelerate the solution of VI problems through the identification, decomposition and special treatment of the potential functional components, which has not yet been studied in the literature. In Chapter 5, we apply the TV-wavelet regularization framework to the simultaneous reconstruction and regularization of Orientation Distribution Functions (ODF) in Diffusion Weighted Imaging (DWI), and apply a primal-dual method to solve the ODF reconstruction problem.
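Before moving to Chapter 2, the discrete objects of Section 1.2 can be made concrete in a short sketch. The code below evaluates the discrete ROF objective $\sum_i\|D_ix\|+\frac{\lambda}{2}\|x-f\|^2$ using forward differences with zero boundary conditions (one common discretization choice), and projects a dual variable onto the set $Y$ of the primal-dual formulation. The function names are ours, for illustration only, and are not part of the dissertation.

```python
import numpy as np

def tv_and_fidelity(x, f, lam):
    """Discrete ROF objective: sum_i ||D_i x|| + (lam/2) ||x - f||^2,
    with D_i the forward-difference operator at pixel i (zero past the
    image boundary)."""
    dx = np.zeros_like(x)
    dy = np.zeros_like(x)
    dx[:, :-1] = x[:, 1:] - x[:, :-1]       # horizontal forward differences
    dy[:-1, :] = x[1:, :] - x[:-1, :]       # vertical forward differences
    tv = np.sqrt(dx ** 2 + dy ** 2).sum()   # isotropic TV: sum of ||D_i x||_2
    return tv + 0.5 * lam * np.sum((x - f) ** 2)

def project_dual(y):
    """Projection onto Y = {y : ||y_i|| <= 1 for all i}: each row of y is
    one dual block y_i in R^d, scaled back to the unit ball if needed."""
    return y / np.maximum(1.0, np.linalg.norm(y, axis=1, keepdims=True))
```

For a constant image the TV term vanishes and only the fidelity term contributes, which gives a quick sanity check of the discretization.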

CHAPTER 2
OPTIMAL SCHEMES FOR A CLASS OF SADDLE POINT PROBLEMS

2.1 Introduction

Let $\mathcal{X}$ and $\mathcal{Y}$ denote finite-dimensional vector spaces equipped with an inner product $\langle\cdot,\cdot\rangle$ and norm $\|\cdot\|$, and let $X\subseteq\mathcal{X}$, $Y\subseteq\mathcal{Y}$ be given closed convex sets. The basic problem of interest within this chapter is the saddle point problem (SPP) described in Chapter 1:
$$\min_{x\in X}\ \Big\{f(x):=\max_{y\in Y}\ G(x)+\langle Kx,y\rangle-J(y)\Big\}.$$
Here, $G(x)$ is a general convex and continuously differentiable function such that, for some $L_G\ge0$,
$$G(y)-G(x)-\langle\nabla G(x),y-x\rangle\le\frac{L_G}{2}\|y-x\|^2,\quad\forall x,y\in X,$$
$K:\mathcal{X}\to\mathcal{Y}$ is a linear operator with induced norm $L_K=\|K\|$, and $J:\mathcal{Y}\to\mathbb{R}$ is a relatively simple, proper, convex, lower semi-continuous (l.s.c.) function, in the sense that the subproblem involving $J$ is easy to solve; we will use the word "simple" in this sense throughout this dissertation. In particular, if $J$ is the convex conjugate of some convex function $F$ and $Y\equiv\mathcal{Y}$, then the SPP is equivalent to the primal problem
$$\min_{x\in X}\ G(x)+F(Kx).$$
Problems of these types have recently found many applications in data analysis, especially in image processing and machine learning. In many of these applications, $G(x)$ is a convex data fidelity term, while $F(Kx)$ is a certain regularization, e.g., total variation [85], low rank tensor [45, 88], overlapped group lasso [41, 58], and graph regularization [41, 87].

We focus on first-order methods for solving both the deterministic SPP, where exact first-order information on $f$ is available, and the stochastic SPP, where we only have access

to inexact information about $f$. Let us start by reviewing a few existing first-order methods in both cases.

2.1.1 Deterministic SPP

Since the objective function $f$ of the SPP is nonsmooth in general, traditional nonsmooth optimization methods, e.g., subgradient methods, would exhibit an $O(1/\sqrt{N})$ rate of convergence when applied to it [67], where $N$ denotes the number of iterations. However, following the breakthrough paper by Nesterov [72], much research effort has been devoted to the development of more efficient methods for solving this problem.

(1) Smoothing techniques. In [72], Nesterov proposed to approximate the nonsmooth objective function $f$ by a smooth one with Lipschitz-continuous gradient. The smooth approximation is then minimized by an accelerated gradient method [70, 71]. Nesterov demonstrated in [72] that, if $X$ and $Y$ are compact, then the rate of convergence of this smoothing scheme can be bounded by
$$O\left(\frac{L_G}{N^2}+\frac{L_K}{N}\right),$$
which significantly improves the previous bound $O(1/\sqrt{N})$. This rate of convergence is actually optimal, based on the following observations:

a) There exists a function $G$ with Lipschitz continuous gradient such that, for any first-order method, the rate of convergence for solving $\min_{x\in X}G(x)$ is at most $O(L_G/N^2)$ [71].

b) There exist $b\in Y$, where $Y$ is a convex compact subset of $\mathbb{R}^m$ for some $m>0$, and a bounded linear operator $K$, such that for any first-order method, the rate of convergence for solving $\min_{x\in X}\max_{y\in Y}\langle Kx,y\rangle-J(y):=\min_{x\in X}\max_{y\in Y}\langle Kx-b,y\rangle$ is at most $O(L_K/N)$ [65, 68].

Nesterov's smoothing technique has been extensively studied (see, e.g., [4, 8, 22, 48, 51, 69, 75, 91]). Observe that in order to properly apply these smoothing techniques, we need to assume either $X$ or $Y$ to be bounded.

(2) Primal-dual methods. While Nesterov's smoothing scheme and its variants rely on a smooth approximation to the original problem, primal-dual methods work directly with the original saddle-point problem. This type of method was first presented by Arrow et al. [2] and named the primal-dual hybrid gradient (PDHG) method in [102]. The results in [15, 28, 102] showed that the PDHG algorithm, if employed with well-chosen stepsize policies, exhibits very fast convergence in practice, especially for some imaging applications. Recently, Chambolle and Pock [15] presented a unified form of primal-dual algorithms, and demonstrated that, with a properly specified stepsize policy and averaging scheme, these algorithms can also achieve the $O(1/N)$ rate of convergence. They also discussed possible ways to extend primal-dual algorithms to deal with the case when either $X$ or $Y$ is unbounded. In their original work, Chambolle and Pock assume $G$ to be relatively simple, so that the subproblems can be solved efficiently. With little additional effort one can show that, by linearizing $G$ at each step, their method can also be applied to the case when $G$ is a general smooth convex function, and the rate of convergence of this modified algorithm is given by
$$O\left(\frac{L_G+L_K}{N}\right).$$
It should be noted, however, that although both this bound and Nesterov's are $O(1/N)$, Nesterov's bound has a significantly better dependence on $L_G$. More specifically, Nesterov's smoothing scheme allows a very large Lipschitz constant $L_G$ (as big as $O(N)$) without affecting the rate of convergence (up to a constant factor of 2). This is desirable in many data analysis applications (e.g., image processing), where $L_G$ is usually significantly bigger than $L_K$. Note that primal-dual methods are also related to the Douglas-Rachford splitting method [26] and a pre-conditioned version of

the alternating direction method of multipliers (ADMM) [31]. We will revisit the ADMM method in Chapter 3.

(3) Extragradient methods for the variational inequality (VI) reformulation. Motivated by Nesterov's work, Nemirovski presented a mirror-prox method, obtained by modifying Korpelevich's extragradient algorithm [46], for solving a more general class of variational inequalities [65] (see also [44]). Similar to the primal-dual methods mentioned above, extragradient methods update iterates in both the primal space $\mathcal{X}$ and the dual space $\mathcal{Y}$, and do not require any smoothing technique. The difference is that each iteration of an extragradient method requires an extra gradient descent step. Nemirovski's method, when specialized to the SPP, also exhibits the $O((L_G+L_K)/N)$ rate of convergence, which, in view of our previous discussion, is not optimal in terms of its dependence on $L_G$. It can be shown that, in some special cases (e.g., when $G$ is quadratic), one can write explicitly the (strongly concave) dual function of $G(x)$ and obtain a result similar to Nesterov's rate, e.g., by applying an improved algorithm in [44]. However, this approach would increase the dimension of the problem and cannot be applied to a general smooth function $G$. It should also be noted that, while Nemirovski's initial work only considers the case when both $X$ and $Y$ are bounded, Monteiro and Svaiter [62] recently showed that extragradient methods can deal with unbounded sets $X$ and $Y$ by using a slightly modified termination criterion. We will revisit VI problems in Chapter 4.

2.1.2 Stochastic SPP

While the deterministic SPP has been extensively explored, the study of stochastic first-order methods for the stochastic SPP is still quite limited. In the stochastic setting, we assume that there exists a stochastic oracle ($\mathcal{SO}$) that can provide unbiased estimators of the gradient operators $\nabla G(x)$ and $(-Kx,K^Ty)$. More specifically, at the $i$-th call to $\mathcal{SO}$ with input $(x_i,y_i)\in X\times Y$, the oracle will output the stochastic gradient

$(\hat{\mathcal{G}}(x_i),\hat{\mathcal{K}}_x(x_i),\hat{\mathcal{K}}_y(y_i))\equiv(\mathcal{G}(x_i,\xi_i),\mathcal{K}_x(x_i,\xi_i),\mathcal{K}_y(y_i,\xi_i))$ such that
$$\mathbb{E}[\hat{\mathcal{G}}(x_i)]=\nabla G(x_i),\qquad\mathbb{E}\begin{pmatrix}-\hat{\mathcal{K}}_x(x_i)\\\hat{\mathcal{K}}_y(y_i)\end{pmatrix}=\begin{pmatrix}-Kx_i\\K^Ty_i\end{pmatrix}.$$
Here $\{\xi_i\in\mathbb{R}^d\}_{i=1}^\infty$ is a sequence of i.i.d. random variables. In addition, we assume that, for some $\sigma_{x,G},\sigma_y,\sigma_{x,K}\ge0$, the following assumption holds:

A1. $\mathbb{E}[\|\hat{\mathcal{G}}(x_i)-\nabla G(x_i)\|^2]\le\sigma_{x,G}^2$, $\mathbb{E}[\|\hat{\mathcal{K}}_x(x_i)-Kx_i\|^2]\le\sigma_y^2$, and $\mathbb{E}[\|\hat{\mathcal{K}}_y(y_i)-K^Ty_i\|^2]\le\sigma_{x,K}^2$.

Sometimes we simply denote $\sigma_x:=\sqrt{\sigma_{x,G}^2+\sigma_{x,K}^2}$ for the sake of notational convenience. The stochastic SPP often appears in machine learning applications. For example, in the primal formulation above, $G(x)$ (resp., $F(Kx)$) can denote a smooth (resp., nonsmooth) expected convex loss function. It should also be noted that the deterministic SPP is a special case of the above setting with $\sigma_x=\sigma_y=0$.

In view of the classic complexity theory for convex programming [43, 67], a lower bound on the rate of convergence for solving the stochastic SPP is given by
$$O\left(\frac{L_G}{N^2}+\frac{L_K}{N}+\frac{\sigma_x+\sigma_y}{\sqrt{N}}\right),$$
where the first two terms follow from the discussion of the deterministic case and the last term follows from Sections 5.3 and 6.3 of [67]. However, to the best of our knowledge, there does not exist an optimal algorithm in the literature which exhibits exactly this rate of convergence, although there are a few general-purpose stochastic optimization algorithms which possess different nearly optimal rates of convergence when applied to the above stochastic SPP.

(1) Mirror-descent stochastic approximation (MD-SA). The MD-SA method developed by Nemirovski et al. in [66] originates from the classical stochastic approximation (SA) of Robbins and Monro [82]. The classical SA mimics the simple gradient descent method by replacing exact gradients with stochastic gradients, but can only be applied to solve strongly convex problems (see also Polyak [77] and Polyak and Juditsky [78], and

Nemirovski et al. [66] for an account of the earlier development of SA methods). By properly modifying the classical SA, Nemirovski et al. showed in [66] that the MD-SA method can optimally solve general nonsmooth stochastic programming problems. The rate of convergence of this algorithm, when applied to the stochastic SPP, is given by (see Section 3 of [66])
$$O\left((L_G+L_K+\sigma_x+\sigma_y)\frac{1}{\sqrt{N}}\right).$$
However, the above bound is significantly worse than the lower bound stated earlier in terms of its dependence on both $L_G$ and $L_K$.

(2) Stochastic mirror-prox (SMP). In order to improve the rate of convergence of the MD-SA method, Juditsky et al. [43] developed a stochastic mirror-prox method, which is the counterpart of Nemirovski's mirror-prox method for solving general variational inequalities. The stochastic mirror-prox method, when specialized to the above stochastic SPP, yields a rate of convergence given by
$$O\left(\frac{L_G+L_K}{N}+\frac{\sigma_x+\sigma_y}{\sqrt{N}}\right).$$
Note, however, that the above bound is still significantly worse than the lower bound in terms of its dependence on $L_G$.

(3) Accelerated stochastic approximation (AC-SA). More recently, Lan presented in [50] (see also [32, 33]) a unified optimal method for solving smooth, nonsmooth and stochastic optimization problems by developing a stochastic version of Nesterov's method [70, 71]. The AC-SA algorithm developed in [50], when applied to the aforementioned stochastic SPP, possesses the rate of convergence
$$O\left(\frac{L_G}{N^2}+(L_K+\sigma_x+\sigma_y)\frac{1}{\sqrt{N}}\right).$$
However, since the nonsmooth term in $f$ has certain special structure, the above bound is still significantly worse than the lower bound in terms of its dependence on $L_K$. It should be noted that some improvement of AC-SA has been

made by Lin et al. [54] by applying the smoothing technique to the SPP. However, such an improvement works only for the case when $Y$ is bounded and $\sigma_y=\sigma_{x,K}=0$. Otherwise, the rate of convergence of the AC-SA algorithm will depend on the variance of the stochastic gradients computed for the smooth approximation problem, which is usually unknown and difficult to characterize (see Section 3 for more discussion).

Therefore, none of the stochastic optimization algorithms mentioned above achieves the lower bound on the rate of convergence for the stochastic SPP.

2.1.3 Main Results

Our contribution mainly consists of the following three aspects. Firstly, we present a new primal-dual type method, namely the accelerated primal-dual (APD) method, that achieves the optimal rate of convergence for the deterministic SPP. The basic idea of this algorithm is to incorporate a multi-step acceleration scheme into the primal-dual method in [15]. We demonstrate that, without requiring the application of the smoothing technique, this method achieves the same optimal rate of convergence as Nesterov's smoothing scheme. We also show that the cost per iteration of APD is comparable to that of Nesterov's smoothing scheme. Hence our method can efficiently solve problems with a big Lipschitz constant $L_G$.

Secondly, in order to solve the stochastic SPP, we develop a stochastic counterpart of the APD method, namely stochastic APD, and demonstrate that it actually achieves the lower bound on the rate of convergence. Therefore, this algorithm exhibits an optimal rate of convergence for the stochastic SPP not only in terms of its dependence on $N$, but also on a variety of problem parameters including $L_G$, $L_K$, $\sigma_x$ and $\sigma_y$. To the best of our knowledge, this is the first time that such an optimal algorithm has been developed for the stochastic SPP in the literature. In addition, we investigate the stochastic APD method in more detail, e.g., by developing the large-deviation results associated with its rate of convergence.

Finally, for both the deterministic and stochastic SPP, we demonstrate that the developed APD algorithms can deal with the situation when either $X$ or $Y$ is unbounded, as long as a saddle point of the SPP exists. We incorporate into the APD method the termination criterion employed by Monteiro and Svaiter [63] for solving variational inequalities, and generalize it for solving the stochastic SPP. In both the deterministic and stochastic cases, the rate of convergence of the APD algorithms depends on the distance from the initial point to the set of optimal solutions.

2.2 Accelerated Primal-Dual Methods for Deterministic SPP

Our goal in this section is to present an accelerated primal-dual method for the deterministic SPP and discuss its main convergence properties. To facilitate the reader, we defer the proofs of our main results to Section 2.4.1.

The study of first-order primal-dual methods for nonsmooth convex optimization has been mainly motivated by total variation based image processing problems (e.g., [11, 15, 28, 38, 76, 102]). Algorithm 2.1 shows a primal-dual method summarized in [15] for solving a special case of the SPP, where $Y=\mathbb{R}^m$ for some $m>0$ and $J(y)=F^*(y)$ is the convex conjugate of a convex and l.s.c. function $F$.

Algorithm 2.1 Primal-dual method for solving deterministic SPP
1: Choose $x_1\in X$, $y_1\in Y$. Set $\bar{x}_1=x_1$.
2: For $t=1,\ldots,N$, calculate
$$y_{t+1}=\arg\min_{y\in Y}\ \langle-K\bar{x}_t,y\rangle+J(y)+\frac{1}{2\tau_t}\|y-y_t\|^2,$$
$$x_{t+1}=\arg\min_{x\in X}\ G(x)+\langle Kx,y_{t+1}\rangle+\frac{1}{2\eta_t}\|x-x_t\|^2,$$
$$\bar{x}_{t+1}=\theta_t(x_{t+1}-x_t)+x_{t+1}.$$
3: Output $\bar{x}^N=\frac{1}{N}\sum_{t=1}^Nx_t$, $\bar{y}^N=\frac{1}{N}\sum_{t=1}^Ny_t$.

The convergence of the sequence $\{(x_t,y_t)\}$ in Algorithm 2.1 has been studied in [11, 15, 28, 38, 76] for various choices of $\theta_t$ and under different conditions on the

stepsizes $\eta_t$ and $\tau_t$. However, the rate of convergence of this algorithm has only been discussed by Chambolle and Pock in [15]. More specifically, they assume that constant stepsizes are used, i.e., $\eta_t=\eta$, $\tau_t=\tau$ and $\theta_t=\theta$ for some $\eta,\tau,\theta>0$ and all $t\ge1$. If $\eta\tau L_K^2\le1$, where $L_K=\|K\|$, then the output $(\bar{x}^N,\bar{y}^N)$ possesses a rate of convergence of $O(1/N)$ for $\theta=1$, and of $O(1/\sqrt{N})$ for $\theta=0$, in terms of the partial duality gap (the duality gap restricted to a bounded domain; see the gap function defined below).

One possible limitation of [15] is that both $G$ and $J$ need to be simple enough so that the two subproblems in Algorithm 2.1 are easy to solve. To make Algorithm 2.1 applicable to more practical problems, we consider more general cases where $J$ is simple but $G$ may not be. In particular, we assume that $G$ is a general smooth convex function satisfying the Lipschitz smoothness condition stated in Section 2.1. In this case, we can replace $G$ in the primal subproblem by its linear approximation $G(x_t)+\langle\nabla G(x_t),x-x_t\rangle$, so that the primal step becomes
$$x_{t+1}=\arg\min_{x\in X}\ \langle\nabla G(x_t),x\rangle+\langle Kx,y_{t+1}\rangle+\frac{1}{2\eta}\|x-x_t\|^2.$$
In what follows, we will refer to this modified algorithm as the linearized version of Algorithm 2.1. With some extra effort we can show that, for $t=1,\ldots,N$ and suitably chosen constant stepsizes, this linearized version retains the $O((L_G+L_K)/N)$ rate of convergence discussed earlier.
Algorithm 2.2 Accelerated primal-dual method for deterministic SPP
1: Choose $x_1\in X$, $y_1\in Y$. Set $x_1^{ag}=x_1$, $y_1^{ag}=y_1$, $\bar{x}_1=x_1$.
2: For $t=1,2,\ldots,N-1$, calculate
$$x_t^{md}=(1-\beta_t^{-1})x_t^{ag}+\beta_t^{-1}x_t,$$
$$y_{t+1}=\arg\min_{y\in Y}\ \langle-K\bar{x}_t,y\rangle+J(y)+\frac{1}{\tau_t}V_Y(y,y_t),$$
$$x_{t+1}=\arg\min_{x\in X}\ \langle\nabla G(x_t^{md}),x\rangle+\langle x,K^Ty_{t+1}\rangle+\frac{1}{\eta_t}V_X(x,x_t),$$
$$x_{t+1}^{ag}=(1-\beta_t^{-1})x_t^{ag}+\beta_t^{-1}x_{t+1},$$
$$y_{t+1}^{ag}=(1-\beta_t^{-1})y_t^{ag}+\beta_t^{-1}y_{t+1},$$
$$\bar{x}_{t+1}=\theta_{t+1}(x_{t+1}-x_t)+x_{t+1}.$$
3: Output $x_N^{ag}$, $y_N^{ag}$.
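The steps of Algorithm 2.2 above can be sketched in code. This is a minimal illustration under the Euclidean setting ($V_X$, $V_Y$ taken as squared half-distances) for an unconstrained primal set, assuming the dual prox step is available as `prox_J_star`; the stepsize pattern $\beta_t=(t+1)/2$, $\theta_t=(t-1)/t$, $\eta_t=t/(2L_G+tL_K)$, $\tau_t=1/L_K$ mirrors the bounded-case choice analyzed later, with the diameter ratio $D_Y/D_X$ set to 1 purely for illustration. It is a sketch, not the dissertation's implementation.

```python
import numpy as np

def apd(grad_G, K, prox_J_star, x0, y0, N, L_G, L_K):
    """Sketch of the accelerated primal-dual (APD) iteration."""
    x, y, x_bar = x0.copy(), y0.copy(), x0.copy()
    x_ag, y_ag = x0.copy(), y0.copy()
    for t in range(1, N):
        beta = (t + 1) / 2.0
        eta = t / (2.0 * L_G + t * L_K)
        tau = 1.0 / L_K
        x_md = (1 - 1 / beta) * x_ag + (1 / beta) * x     # "middle" point
        y_new = prox_J_star(y + tau * (K @ x_bar), tau)   # dual prox step
        x_new = x - eta * (grad_G(x_md) + K.T @ y_new)    # linearized primal step
        x_ag = (1 - 1 / beta) * x_ag + (1 / beta) * x_new # aggregation
        y_ag = (1 - 1 / beta) * y_ag + (1 / beta) * y_new
        x_bar = x_new + (t / (t + 1.0)) * (x_new - x)     # theta_{t+1} = t/(t+1)
        x, y = x_new, y_new
    return x_ag, y_ag
```

As a degenerate check, when $K=0$ the scheme reduces to an accelerated gradient method on $G$, so $x_N^{ag}$ should approach the minimizer of $G$.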

fortheAPDalgorithmisaboutthesameasthatforthelinearizedversionofAlgorithm 2.1 InordertoanalyzetheconvergenceofAlgorithm 2.2 ,itisnecessarytointroduceanotiontocharacterizethesolutionsof( 2 ).Specically,denotingZ=XY,forany~z=(~x,~y)2Zandz=(x,y)2Z,wedene Q(~z,z):=[G(~x)+hK~x,yi)]TJ /F8 11.955 Tf 19.26 0 Td[(J(y)])]TJ /F5 11.955 Tf 11.95 -.17 Td[([G(x)+hKx,~yi)]TJ /F8 11.955 Tf 19.27 0 Td[(J(~y)].(2) Itcanbeeasilyseenthat~zisasolutionofproblem( 2 ),ifandonlyifQ(~z,z)0forallz2Z.Therefore,ifZisbounded,itissuggestivetousethegapfunction g(~z):=maxz2ZQ(~z,z)(2) toassessthequalityofafeasiblesolution~z2Z.Infact,wecanshowthatf(~x))]TJ /F8 11.955 Tf 12.21 0 Td[(fg(~z)forall~z2Z,wherefdenotestheoptimalvalueofproblem( 2 ).However,ifZisunbounded,theng(~z)isnotwell-denedevenforanearlyoptimalsolution~z2Z.Hence,inthesequel,wewillconsidertheboundedandunboundedcaseseparately,byemployingaslightlydifferenterrormeasureforthelattersituation. ThefollowingtheoremdescribestheconvergencepropertiesofAlgorithm 2.2 whenZisbounded. Theorem2.1. SupposethatforsomeX,Y>0, supx1,x22XVX(x1,x2)2Xandsupy1,y22YVY(x1,x2)2Y.(2) Alsoassumethattheparameterst,t,t,tinAlgorithm 2.2 arechosensuchthatforallt1,1=1,t+1)]TJ /F5 11.955 Tf 11.95 0 Td[(1=tt+1, (2)0
Thenforallt1, g(zagt+1)1 tt2X+1 tt2Y.(2) Therearevariousoptionsforchoosingtheparameterst,t,tandtsuchthat( 2 )( 2 )hold.Belowweprovidesuchanexample. Corollary1. Supposethat( 2 )holds.InAlgorithm 2.2 ,iftheparametersaresettot=t+1 2,t=t)]TJ /F5 11.955 Tf 11.95 0 Td[(1 t,t=Xt 2LG+tLKDY=DXandt=YDY LKDX, (2) whereDX:=Xp 2=XandDY:=Yp 2=Y,thenforallt2,g(zagt)2LGD2X t(t)]TJ /F5 11.955 Tf 11.96 0 Td[(1)+2LKDXDY t. (2) Proof. Itsufcestoverifythattheparametersin( 2 )satises( 2 )( 2 )inTheorem 2.1 .Itiseasytocheckthat( 2 )and( 2 )hold.Furthermore,X t)]TJ /F8 11.955 Tf 13.15 8.09 Td[(LG t)]TJ /F8 11.955 Tf 13.15 8.09 Td[(L2Kt Y=2LG+tLKDY=DX t)]TJ /F5 11.955 Tf 16.12 8.09 Td[(2LG t+1)]TJ /F8 11.955 Tf 13.15 8.09 Td[(LKDY DX0, so( 2 )holds.Therefore,by( 2 ),forallt1wehaveg(zagt)1 t)]TJ /F7 7.97 Tf 6.58 0 Td[(1t)]TJ /F7 7.97 Tf 6.58 0 Td[(12X+1 t)]TJ /F7 7.97 Tf 6.58 0 Td[(1t)]TJ /F7 7.97 Tf 6.59 0 Td[(12Y=4LG+2(t)]TJ /F5 11.955 Tf 11.95 0 Td[(1)LKDY=DX Xt(t)]TJ /F5 11.955 Tf 11.96 0 Td[(1)X 2D2X+2LKDX=DY YtY 2D2Y=2LGD2X t(t)]TJ /F5 11.955 Tf 11.96 0 Td[(1)+2LKDXDY t. Clearly,inviewof( 2 ),therateofconvergenceofAlgorithm 2.2 appliedtoproblem( 2 )isoptimalwhentheparametersarechosenaccordingto( 2 ).AlsoobservethatweneedtoestimateDY=DXtousetheseparameters.However,itshouldbepointedoutthatreplacingtheratioDY=DXin( 2 )byanypositiveconstantonlyresultsanincreaseintheRHSof( 2 )byaconstantfactor. 33

Now,westudytheconvergencepropertiesoftheAPDalgorithmforthecasewhenZ=XYisunbounded,byusingaperturbation-basedterminationcriterionrecentlyemployedbyMonteiroandSvaiterandappliedtoSPP[ 61 63 ].Thisterminationcriterionisbasedontheenlargementofamaximalmonotoneoperator,whichisrstintroducedin[ 13 ].Oneadvantageofusingthiscriterionisthatitsdenitiondoesnotdependontheboundednessofthedomainoftheoperator.Morespecically,asshownin[ 62 63 ],therealwaysexistsaperturbationvectorvsuchthat ~g(~z,v):=maxz2ZQ(~z,z))-222(hv,~z)]TJ /F8 11.955 Tf 11.96 0 Td[(zi(2) iswell-dened,althoughthevalueofg(~z)in( 2 )maybeunboundedwhenZisunbounded.Inthefollowingresult,weshowthattheAPDalgorithmcancomputeanearlyoptimalsolution~zwithasmallresidue~g(~z,v),forasmallpurterbationvectorv(i.e.,kvkissmall).Inaddition,ourderivediterationcomplexityboundsareproportionaltothedistancefromtheinitialpointtothesolutionset. Theorem2.2. Letfzagtg=f(xagt,yagt)gbetheiteratesgeneratedbyAlgorithm 2.2 withVX(x,xt)=kx)]TJ /F8 11.955 Tf 12.71 0 Td[(xtk2=2andVY(y,yt)=ky)]TJ /F8 11.955 Tf 12.71 0 Td[(ytk2=2.Assumethattheparameterst,t,tandtsatisfy( 2 ),t=t)]TJ /F7 7.97 Tf 6.59 0 Td[(1 t=t)]TJ /F7 7.97 Tf 6.59 0 Td[(1 t, (2)X t)]TJ /F8 11.955 Tf 13.15 8.08 Td[(LG t)]TJ /F8 11.955 Tf 13.15 8.08 Td[(L2Kt pY0, (2) forallt1andforsome0
where(^x,^y)isasolutionpairofproblem( 2 )and D:=r k^x)]TJ /F8 11.955 Tf 11.96 0 Td[(x1k2+1 1k^y)]TJ /F8 11.955 Tf 11.95 0 Td[(y1k2.(2) Belowwesuggestaspecicparametersettingwhichsatises( 2 ),( 2 )and( 2 ). Corollary2. InAlgorithm 2.2 ,ifNisgivenandtheparametersaresettot=t+1 2,t=t)]TJ /F5 11.955 Tf 11.95 0 Td[(1 t,t=t+1 2(LG+NLK),andt=t+1 2NLK (2) thenthereexistsvNthatsatises( 2 )with "N10LG^D2 N2+10LK^D2 NandkvNk15LG^D N2+16LK^D N,(2) where^D=p k^x)]TJ /F8 11.955 Tf 11.96 0 Td[(x1k2+k^y)]TJ /F8 11.955 Tf 11.95 0 Td[(y1k2. Proof. Fortheparameterst,t,t,tin( 2 ),itisclearthat( 2 ),( 2 )holds.Furthermore,letp=1=4,foranyt=1,...,N)]TJ /F5 11.955 Tf 11.96 0 Td[(1,wehave1 t)]TJ /F8 11.955 Tf 13.15 8.09 Td[(LG t)]TJ /F8 11.955 Tf 13.15 8.09 Td[(L2Kt p=2LG+2LKN t+1)]TJ /F5 11.955 Tf 16.13 8.09 Td[(2LG t+1)]TJ /F5 11.955 Tf 13.15 8.09 Td[(2L2K(t+1) LKN2LKN t+1)]TJ /F5 11.955 Tf 13.15 8.09 Td[(2LK(t+1) N0, thus( 2 )holds.ByTheorem 2.2 ,inequalities( 2 )and( 2 )hold.Notingthattt,in( 2 )and( 2 )wehaveD^Dandk^x)]TJ /F8 11.955 Tf 11.95 0 Td[(x1k+k^y)]TJ /F8 11.955 Tf 11.96 0 Td[(y1kp 2^D,hencekvt+1kp 2^D tt+(1+p 4=3)^D tt+2LK^D t,"t+1(2)]TJ /F8 11.955 Tf 11.96 0 Td[(p)^D2 tt(1)]TJ /F8 11.955 Tf 11.96 0 Td[(p)=7^D2 3tt. Alsonotethatby( 2 ),1 N)]TJ /F7 7.97 Tf 6.58 0 Td[(1N)]TJ /F7 7.97 Tf 6.59 0 Td[(1=4(LG+LKN) N2=4LG N2+4LK N. Usingtheabovethreerelationsandthedenitionoftin( 2 ),weobtain( 2 )aftersimpliyingtheconstants. 35

Itisinterestingtonoticethat,iftheparametersinAlgorithm 2.2 aresetto( 2 ),thenbothresidues"NandkvNkin( 2 )reducetozerowithapproximatelythesamerateofconvergence(uptoafactorof^D).AlsoobservethatinTheorem 2.2 andCorollary 2 ,wexVX(,)andVY(,)toberegulardistancefunctionsratherthanmoregeneralBregmandivergences.ThisisduetofactthatweneedtoapplytheTriagularinequalityassociatedwithp VX(,)andp VY(,),whilesuchaninequalitydoesnotnecessarilyholdforBregmandivergencesingeneral. 2.3StochasticAPDMethodsforStochasticSPP OurgoalinthissectionistopresentastochasticAPDmethodforstochasticSPP(i.e.,problem( 2 )withastochasticoracle)anddemonstratethatitcanactuallyachievethelowerboundin( 2 )ontherateofconvergenceforstochasticSPP.Inordertofacilitatethereaders,weputtheproofsofourmainresultsinSection 2.4.2 ThestochasticAPDmethodisastochasticcounterpartoftheAPDalgorithminSection 2.2 ,obtainedbysimplyreplacingthegradientoperators)]TJ /F8 11.955 Tf 9.3 0 Td[(Kxt,rG(xmdt)andKTyt+1,usedin( 2 )and( 2 ),withthestochasticgradientoperatorscomputedbytheSO,i.e.,)]TJ /F5 11.955 Tf 11.46 2.53 Td[(^Kx(xt),^G(xmdt)and^Ky(yt+1),respectively.ThisalgorithmisformallydescribedasinAlgorithm 2.3 Algorithm2.3StochasticAPDmethodforstochasticSPP Modify( 2 )and( 2 )inAlgorithm 2.2 toyt+1=argminy2Yh)]TJ /F5 11.955 Tf 16.12 2.52 Td[(^Kx(xt),yi+J(y)+1 tVY(y,yt), (2)xt+1=argminx2Xh^G(xmdt),xi+hx,^Ky(yt+1)i+1 tVX(x,xt). (2) AfewmoreremarksaboutthedevelopmentoftheabovestochasticAPDmethodareinorder.Firstly,observethat,althoughprimal-dualmethodshavebeenextensivelystudiedforsolvingdeterministicsaddle-pointproblems,itseemsthatthesetypesofmethodshavenotyetbeengeneralizedforstochasticSPPintheliterature.Secondly,as 36

notedinSection 2.1.1 ,onepossiblewaytosolvestochasticSPPistoapplytheAC-SAalgorithmin[ 50 ]toacertainsmoothapproximationof( 2 )byNesterov[ 72 ].However,therateofconvergenceofthisapproachwilldependonthevarianceofthestochasticgradientscomputedforthesmoothapproximationproblem,whichisusuallyunkownanddifculttocharacterize.Ontheotherhand,thestochasticAPDmethoddescribedaboveworksdirectlywiththeoriginalproblemwithoutrequringtheapplicationofthesmoothingtechnique,anditsrateofconvergencewilldependonthevarianceofthestochasticgradientoperatorscomputedfortheoriginalproblem,i.e.,2x,G,2yand2x,KinAssumption A1 .Wewillshowthatitcanachieveexactlythelowerboundin( 2 )ontherateofconvergenceforstochasticSPP. SimilarlytoSection 2.2 ,weusethetwogapfunctionsg()and~g(,),respectively,denedin( 2 )and( 2 )astheterminationcriteriaforthestochasticAPDalgorithm,dependingonwhetherthefeasiblesetZ=XYisboundedornot.Sincethealgorithmisstochasticinnature,forbothcasesweestablishitsexpectedrateofconvergenceintermsofg()or~g(,),i.e.,theaveragerateofconvergenceovermanyrunsofthealgorithm.Inaddition,weshowthatifZisbounded,thentheconvergenceoftheAPDalgorithmcanbestrengthenedunderthefollowinglight-tailassumptiononSO: A2. EexpfkrG(x))]TJ /F5 11.955 Tf 14.06 2.52 Td[(^G(x)k2=2x,Ggexpf1g,EexpfkKx)]TJ /F5 11.955 Tf 14.13 2.52 Td[(^Kx(x)k2=2ygexpf1gandEexpfkKTy)]TJ /F5 11.955 Tf 14.12 2.53 Td[(^Ky(y)k2=2x,Kgexpf1g. ItiseasytoseethatAssumption A2 impliesAssumption A1 byJensen'sinequality. Theorem 2.3 belowsummarizestheconvergencepropertiesofAlgorithm 2.3 whenZisbounded.NotethatthefollowingquanitywillbeusedinthestatementofthisresultandtheconvergenceanalysisoftheAPDalgorithms(seeSection 2.4 ): t=8><>:1,t=1,)]TJ /F7 7.97 Tf 6.59 0 Td[(1tt)]TJ /F7 7.97 Tf 6.58 0 Td[(1,t2.(2) 37

Theorem2.3. Supposethat( 2 )holdsforsomeX,Y>0.Alsoassumethatforallt1,theparameterst,t,tandtinAlgorithm 2.3 satisfy( 2 ),( 2 ),andqX t)]TJ /F8 11.955 Tf 13.15 8.09 Td[(LG t)]TJ /F8 11.955 Tf 13.15 8.09 Td[(L2Kt pY0 (2) forsomep,q2(0,1).Then, a) UnderAssumption A1 ,forallt1, E[g(zagt+1)]Q0(t),(2) whereQ0(t):=1 tt2t t2X+2t t2Y+1 2tttXi=1(2)]TJ /F8 11.955 Tf 11.95 0 Td[(q)ii (1)]TJ /F8 11.955 Tf 11.95 0 Td[(q)X2x+(2)]TJ /F8 11.955 Tf 11.95 0 Td[(p)ii (1)]TJ /F8 11.955 Tf 11.96 0 Td[(p)Y2y. (2) b) Underassumption A2 ,forall>0andt1,Probfg(zagt+1)>Q0(t)+Q1(t)g3expf)]TJ /F4 11.955 Tf 15.28 0 Td[(2=3g+3expf)]TJ /F4 11.955 Tf 15.28 0 Td[(g, (2) whereQ1(t):=1 tt p 2xX p X+yY p Y!vuut 2tXi=12i+1 2tttXi=1(2)]TJ /F8 11.955 Tf 11.96 0 Td[(q)ii (1)]TJ /F8 11.955 Tf 11.96 0 Td[(q)X2x+(2)]TJ /F8 11.955 Tf 11.96 0 Td[(p)ii (1)]TJ /F8 11.955 Tf 11.96 0 Td[(p)Y2y. (2) Weprovidebelowaspecicchoiceoftheparameterst,t,tandtforthestochasticAPDmethodforthecasewhenZisbounded. Corollary3. Supposethat( 2 )holdsandletDXandDYbedenedinCorolloary 1 .InAlgorithm 2.3 ,ifN1isgivenandtheparametersaresettot=t+1 2,t=t)]TJ /F5 11.955 Tf 11.95 0 Td[(1 t,t=2XDXt 6LGDX+3LKDY(N)]TJ /F5 11.955 Tf 11.96 0 Td[(1)+3xNp N)]TJ /F5 11.955 Tf 11.96 0 Td[(1,andt=2YDYt 3LKDX(N)]TJ /F5 11.955 Tf 11.95 0 Td[(1)+3yNp N)]TJ /F5 11.955 Tf 11.95 0 Td[(1. (2) 38

ThenunderAssumption A1 ,wehaveE[g(zagN)]6LGD2X N(N)]TJ /F5 11.955 Tf 11.96 0 Td[(1)+6LKDXDY N+4(xDX+yDY) p N)]TJ /F5 11.955 Tf 11.95 0 Td[(1=:C0(N). (2) Ifinaddition,Assumption A2 holds,thenforall>0,wehaveProbfg(zagN)>C0(N)+C1(N)g3expf)]TJ /F4 11.955 Tf 15.28 0 Td[(2=3g+3expf)]TJ /F4 11.955 Tf 15.28 0 Td[(g, (2) whereC1(N)=3(xDX+yDY) p N)]TJ /F5 11.955 Tf 11.95 0 Td[(1. (2) Proof. Firstwecheckthattheparametersin( 2 )satisfytheconditionsinTheorem 2.3 .Theinequalities( 2 )and( 2 )canbecheckedeasily.Furthermore,forallt=1,...,N)]TJ /F5 11.955 Tf 11.96 0 Td[(1,settingp=q=2=3wehaveqX t)]TJ /F8 11.955 Tf 13.15 8.09 Td[(LG t)]TJ /F8 11.955 Tf 13.15 8.09 Td[(L2Kt pY2LGDX+LKDY(N)]TJ /F5 11.955 Tf 11.96 0 Td[(1) DXt)]TJ /F5 11.955 Tf 16.12 8.09 Td[(2LG t+1)]TJ /F8 11.955 Tf 30.28 8.09 Td[(L2KDYt LKDX(N)]TJ /F5 11.955 Tf 11.96 0 Td[(1)0, thus( 2 )holds,andhenceTheorem 2.3 holds. Toshow( 2 )and( 2 ),itsufcestoshowthatC0(N)Q0(N)]TJ /F5 11.955 Tf 13.14 0 Td[(1)andC1(N)Q1(N)]TJ /F5 11.955 Tf 12.2 0 Td[(1).Observethatby( 2 )and( 2 ),wehavet=t.Also,observethatN)]TJ /F7 7.97 Tf 6.59 0 Td[(1Xi=1i2(N)]TJ /F5 11.955 Tf 11.95 0 Td[(1)N2=3, thus1 N)]TJ /F7 7.97 Tf 6.58 0 Td[(1N)]TJ /F7 7.97 Tf 6.58 0 Td[(1Xi=1ii2XDX 3(N)]TJ /F5 11.955 Tf 11.96 0 Td[(1)3=2NxN)]TJ /F7 7.97 Tf 6.59 0 Td[(1Xi=1i22XDXN 9xp N)]TJ /F5 11.955 Tf 11.95 0 Td[(1,1 N)]TJ /F7 7.97 Tf 6.58 0 Td[(1N)]TJ /F7 7.97 Tf 6.58 0 Td[(1Xi=1ii2YDY 3(N)]TJ /F5 11.955 Tf 11.95 0 Td[(1)3=2NyN)]TJ /F7 7.97 Tf 6.59 0 Td[(1Xi=1i22YDYN 9yp N)]TJ /F5 11.955 Tf 11.96 0 Td[(1. 39

Applytheaboveboundsto( 2 )and( 2 ),wegetQ0(N)]TJ /F5 11.955 Tf 11.96 0 Td[(1)2 N6LGDX+3LKDY(N)]TJ /F5 11.955 Tf 11.95 0 Td[(1)+3Np N)]TJ /F5 11.955 Tf 11.96 0 Td[(1x XDX(N)]TJ /F5 11.955 Tf 11.96 0 Td[(1)X 2D2X+3LKDX(N)]TJ /F5 11.955 Tf 11.95 0 Td[(1)+3Np N)]TJ /F5 11.955 Tf 11.95 0 Td[(1y YDY(N)]TJ /F5 11.955 Tf 11.95 0 Td[(1)Y 2D2Y+22x X2XDXN 9xp N)]TJ /F5 11.955 Tf 11.96 0 Td[(1+22y Y2YDYN 9yp N)]TJ /F5 11.955 Tf 11.96 0 Td[(1C0(N), andQ1(N)]TJ /F5 11.955 Tf 11.96 0 Td[(1)2 N(N)]TJ /F5 11.955 Tf 11.96 0 Td[(1)xDX+yDY p 2r 2(N)]TJ /F5 11.955 Tf 11.96 0 Td[(1)N2 3+42x XN2XDXN 9xp N)]TJ /F5 11.955 Tf 11.96 0 Td[(1+42y YN2YDYN 9yp N)]TJ /F5 11.955 Tf 11.95 0 Td[(1C1(N), so( 2 )and( 2 )holds. Comparingtherateofconvergenceestablishedin( 2 )withthelowerboundin( 2 ),wecanclearlyseethatthestochasticAPDalgorithmisanoptimalmethodforsolvingthestochasticsaddle-pointproblems.Morespecically,inviewof( 2 ),thisalgorithmallowsustohaveverylargeLipschitzconstantsLG(asbigasO(N3 2))andLK(asbigasO(p N))withoutsignicantlyaffectingitsrateofconvergence. WenowpresenttheconvergenceresultsforthestochasticAPDmethodappliedtostochasticsaddle-pointproblemswithpossiblyunboundedfeasiblesetZ.Itappearsthatthesolutionmethodsofthesetypesofproblemshavenotbeenwell-studiedintheliterature. Theorem2.4. Letfzagtg=f(xagt,yagt)gbetheiteratesgeneratedbyAlgorithm 2.2 withVX(x,xt)=kx)]TJ /F8 11.955 Tf 12.71 0 Td[(xtk2=2andVY(y,yt)=ky)]TJ /F8 11.955 Tf 12.71 0 Td[(ytk2=2.Assumethattheparameterst,t,tandtinAlgorithm 2.3 satisfy( 2 ),( 2 )and( 2 )forallt1andsomep,q2(0,1),thenthereexistsaperturbationvectorvt+1suchthat E[~g(zagt+1,vt+1)]1 tt6)]TJ /F5 11.955 Tf 11.95 0 Td[(4p 1)]TJ /F8 11.955 Tf 11.95 0 Td[(pD2+5)]TJ /F5 11.955 Tf 11.95 0 Td[(3p 2)]TJ /F5 11.955 Tf 11.95 0 Td[(2pC2=:"t+1(2) 40

foranyt1.Moreover,wehaveE[kvt+1k]2k^x)]TJ /F8 11.955 Tf 11.95 0 Td[(x1k tt+2k^y)]TJ /F8 11.955 Tf 11.95 0 Td[(y1k tt+p 2D2+C22 tt+1 ttr 1 1r 1 1)]TJ /F8 11.955 Tf 11.96 0 Td[(p+1+2LK t, (2) where(^x,^y)isapairofsolutionsforproblem( 2 ),Disdenedin( 2 )andC:=vuut tXi=12i2x 1)]TJ /F8 11.955 Tf 11.96 0 Td[(q+tXi=1ii2y 1)]TJ /F8 11.955 Tf 11.96 0 Td[(p. (2) BelowwespecializetheresultsinTheorem 2.4 bychoosingasetofparameterssatisfying( 2 ),( 2 )and( 2 ). Corollary4. InAlgorithm 2.3 ,ifNisgivenandtheparametersaresettot=t+1 2,t=t)]TJ /F5 11.955 Tf 11.95 0 Td[(1 t,t=3t 4,andt=t (2) where =2LG+2LK(N)]TJ /F5 11.955 Tf 11.96 0 Td[(1)+Np N)]TJ /F5 11.955 Tf 11.96 0 Td[(1=~Dforsome~D>0,=r 9 42x+2y,(2) thenthereexistsvNthatsatises( 2 )with"N36LGD2 N(N)]TJ /F5 11.955 Tf 11.96 0 Td[(1)+36LKD2 N+D18D=~D+3~D=D p N)]TJ /F5 11.955 Tf 11.96 0 Td[(1, (2)E[kvNk]50LGD N(N)]TJ /F5 11.955 Tf 11.96 0 Td[(1)+LKD(55+3~D=D) N+(6+25D=~D) p N)]TJ /F5 11.955 Tf 11.95 0 Td[(1, (2) whereDisdenedin( 2 ). Proof. Fortheparametersin( 2 ),itisclearthat( 2 )and( 2 )hold.Inaddition,lettingp=1=4andq=3=4,thenforallt=1,...,N)]TJ /F5 11.955 Tf 11.95 0 Td[(1,wehaveq t)]TJ /F8 11.955 Tf 13.15 8.09 Td[(LG t)]TJ /F8 11.955 Tf 13.15 8.09 Td[(L2Kt p= t)]TJ /F5 11.955 Tf 16.13 8.09 Td[(2LG t+1)]TJ /F5 11.955 Tf 13.15 8.09 Td[(4L2Kt 2LG+2LK(N)]TJ /F5 11.955 Tf 11.96 0 Td[(1) t)]TJ /F5 11.955 Tf 13.15 8.09 Td[(2LG t)]TJ /F5 11.955 Tf 27.31 8.09 Td[(2L2Kt LK(N)]TJ /F5 11.955 Tf 11.96 0 Td[(1)0, 41


thus (2) holds. By Theorem 2.4, we get (2) and (2). Note that $\eta_t/\tau_t = 3/4$, hence
\[ \frac{\|\hat{x}-x_1\|}{\beta_{N-1}\eta_{N-1}} \le \frac{D}{\beta_{N-1}\eta_{N-1}}, \quad\text{and}\quad \frac{\|\hat{y}-y_1\|}{\beta_{N-1}\tau_{N-1}} \le \frac{1}{\beta_{N-1}\tau_{N-1}}\sqrt{\frac{4}{3}}\,D = \frac{\sqrt{3/4}\,D}{\beta_{N-1}\eta_{N-1}}, \]
so in (2) and (2) we have
\[ \varepsilon_N \le \frac{1}{\beta_{N-1}\eta_{N-1}}\left(\frac{20}{3}D^2 + \frac{17}{6}C^2\right), \quad (2) \]
\[ \mathbb{E}[\|v_N\|] \le \frac{(2+\sqrt{3})D}{\beta_{N-1}\eta_{N-1}} + \sqrt{2D^2+C^2}\,\frac{3+\sqrt{3/4}}{\beta_{N-1}\eta_{N-1}} + \frac{2L_K\sqrt{2D^2+C^2}}{\beta_{N-1}}. \quad (2) \]
By (2) and the fact that $\sum_{i=1}^{N-1} i^2 \le N^2(N-1)/3$, we have
\[ C = \sqrt{\sum_{i=1}^{N-1}\frac{9\sigma_x^2 i^2}{4\eta^2} + \sum_{i=1}^{N-1}\frac{\sigma_y^2 i^2}{\eta^2}} \le \sqrt{\frac{N^2(N-1)}{3\eta^2}\left(\frac{9\sigma_x^2}{4}+\sigma_y^2\right)} = \frac{N\sigma\sqrt{N-1}}{\sqrt{3}\,\eta}. \]
Applying the above bound to (2) and (2), and using the fact that $\sqrt{2D^2+C^2} \le \sqrt{2}D + C$, we obtain
\[ \begin{aligned} \varepsilon_N &\le \frac{8\eta}{3N(N-1)}\left(\frac{20}{3}D^2 + \frac{17\sigma^2N^2(N-1)}{18\eta^2}\right) \\ &\le \frac{320L_GD^2}{9N(N-1)} + \frac{320L_K(N-1)D^2}{9N(N-1)} + \frac{160N\sqrt{N-1}\,\sigma D^2/\tilde{D}}{9N(N-1)} + \frac{68\sigma\tilde{D}}{27\sqrt{N-1}} \\ &\le \frac{36L_GD^2}{N(N-1)} + \frac{36L_KD^2}{N} + \frac{\sigma D\left(18D/\tilde{D}+3\tilde{D}/D\right)}{\sqrt{N-1}}, \end{aligned} \]


and
\[ \begin{aligned} \mathbb{E}[\|v_N\|] &\le \frac{1}{\beta_{N-1}\eta_{N-1}}\left(2D+\sqrt{3}D+3\sqrt{2}D+\frac{\sqrt{6}D}{2}+3C+\frac{\sqrt{3}C}{2}\right) + \frac{2\sqrt{2}L_KD}{\beta_{N-1}} + \frac{2L_KC}{\beta_{N-1}} \\ &\le \frac{16L_G+16L_K(N-1)+8N\sqrt{N-1}\,\sigma/\tilde{D}}{3N(N-1)}\left(2+\sqrt{3}+3\sqrt{2}+\frac{\sqrt{6}}{2}\right)D + \frac{8\sigma}{3\sqrt{N-1}}\left(\sqrt{3}+\frac{1}{2}\right) + \frac{4\sqrt{2}L_KD}{N} + \frac{4L_K\tilde{D}}{\sqrt{3}N} \\ &\le \frac{50L_GD}{N(N-1)} + \frac{L_KD\left(55+3\tilde{D}/D\right)}{N} + \frac{\sigma\left(6+25D/\tilde{D}\right)}{\sqrt{N-1}}. \end{aligned} \]

Observe that the parameter settings in (2) and (2) are more complicated than the ones in (2) for the deterministic unbounded case. In particular, for the stochastic unbounded case, we need to choose a parameter $\tilde{D}$, which is not required in the deterministic case. Clearly, the optimal selection of $\tilde{D}$ minimizing the right-hand side of (2) is $\sqrt{6}D$. Note, however, that the value of $D$ will be very difficult to estimate in the unbounded case, and hence one often has to resort to a suboptimal selection of $\tilde{D}$. For example, if $\tilde{D}=1$, then the right-hand sides of (2) and (2) become $O(L_GD^2/N^2 + L_KD^2/N + \sigma D^2/\sqrt{N})$ and $O(L_GD/N^2 + L_KD/N + \sigma D/\sqrt{N})$, respectively.

2.4 Convergence Analysis

Our goal in this section is to prove the main results presented in Sections 2 and 3, namely, Theorems 2.1, 2.2, 2.3 and 2.4.

2.4.1 Convergence Analysis for the Deterministic APD Algorithm

In this section, we prove Theorems 2.1 and 2.2, which describe the convergence properties of the deterministic APD algorithm for the bounded and unbounded SPPs, respectively.
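Before turning to the proofs, the stepsize policy of Corollary 4 can be sanity-checked numerically. The sketch below (Python; the helper name `apd_stepsizes` is hypothetical, and the formulas for $\beta_t$, $\theta_t$, $\eta_t$, $\tau_t$ and $\eta$ are assumed as reconstructed above) evaluates the stepsizes and verifies the condition $q/\eta_t - L_G/\beta_t - L_K^2\tau_t/p \ge 0$ with $\alpha_X=\alpha_Y=1$, $p=1/4$ and $q=3/4$ for every $t=1,\ldots,N-1$:

```python
import math

def apd_stepsizes(t, N, L_G, L_K, sigma_x, sigma_y, D_tilde):
    """Stepsize policy of Corollary 4 (as reconstructed; D_tilde > 0 is the
    user-chosen parameter of the unbounded stochastic case)."""
    sigma = math.sqrt(9.0 / 4.0 * sigma_x ** 2 + sigma_y ** 2)
    eta_hat = 2 * L_G + 2 * L_K * (N - 1) + N * math.sqrt(N - 1) * sigma / D_tilde
    beta_t = (t + 1) / 2.0             # beta_t = (t+1)/2
    theta_t = (t - 1) / float(t)       # theta_t = (t-1)/t
    eta_t = 3.0 * t / (4.0 * eta_hat)  # eta_t = 3t/(4*eta)
    tau_t = t / eta_hat                # tau_t = t/eta
    return beta_t, theta_t, eta_t, tau_t

# Check q/eta_t - L_G/beta_t - L_K^2 * tau_t / p >= 0 for all t = 1, ..., N-1
# with p = 1/4, q = 3/4 (illustrative problem constants below).
N, L_G, L_K, sx, sy, Dt = 100, 10.0, 5.0, 1.0, 1.0, 1.0
p, q = 0.25, 0.75
ok = all(
    q / eta - L_G / beta - L_K ** 2 * tau / p >= 0
    for (beta, theta, eta, tau) in
    (apd_stepsizes(t, N, L_G, L_K, sx, sy, Dt) for t in range(1, N))
)
print(ok)  # prints True
```

This mirrors the verification carried out in the proof of Corollary 4, where the inequality is shown to hold because $\eta \ge 2L_G + 2L_K(N-1)$.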


Before proving Theorem 2.1, we first establish two technical results: Proposition 2.1 shows some important properties of the function $Q(\cdot,\cdot)$ in (2), and Lemma 1 establishes a bound on $Q(z^{ag}_{t+1},z)$.

Proposition 2.1. Assume that $\beta_t\ge 1$ for all $t$. If $z^{ag}_{t+1}=(x^{ag}_{t+1},y^{ag}_{t+1})$ is generated by Algorithm 2.2, then for all $z=(x,y)\in Z$,
\[ \beta_tQ(z^{ag}_{t+1},z) - (\beta_t-1)Q(z^{ag}_t,z) \le \langle\nabla G(x^{md}_t), x_{t+1}-x\rangle + \frac{L_G}{2\beta_t}\|x_{t+1}-x_t\|^2 + [J(y_{t+1})-J(y)] + \langle Kx_{t+1},y\rangle - \langle Kx,y_{t+1}\rangle. \quad (2) \]

Proof. By equations (2) and (2), $x^{ag}_{t+1}-x^{md}_t = \beta_t^{-1}(x_{t+1}-x_t)$. Using this observation, together with the smoothness and convexity of $G(\cdot)$, we have
\[ \begin{aligned} \beta_tG(x^{ag}_{t+1}) &\le \beta_tG(x^{md}_t) + \beta_t\langle\nabla G(x^{md}_t), x^{ag}_{t+1}-x^{md}_t\rangle + \frac{\beta_tL_G}{2}\|x^{ag}_{t+1}-x^{md}_t\|^2 \\ &\le \beta_tG(x^{md}_t) + \beta_t\langle\nabla G(x^{md}_t), x^{ag}_{t+1}-x^{md}_t\rangle + \frac{L_G}{2\beta_t}\|x_{t+1}-x_t\|^2 \\ &= \beta_tG(x^{md}_t) + (\beta_t-1)\langle\nabla G(x^{md}_t), x^{ag}_t-x^{md}_t\rangle + \langle\nabla G(x^{md}_t), x_{t+1}-x^{md}_t\rangle + \frac{L_G}{2\beta_t}\|x_{t+1}-x_t\|^2 \\ &= (\beta_t-1)\left[G(x^{md}_t) + \langle\nabla G(x^{md}_t), x^{ag}_t-x^{md}_t\rangle\right] + \left[G(x^{md}_t) + \langle\nabla G(x^{md}_t), x-x^{md}_t\rangle\right] + \langle\nabla G(x^{md}_t), x_{t+1}-x\rangle + \frac{L_G}{2\beta_t}\|x_{t+1}-x_t\|^2 \\ &\le (\beta_t-1)G(x^{ag}_t) + G(x) + \langle\nabla G(x^{md}_t), x_{t+1}-x\rangle + \frac{L_G}{2\beta_t}\|x_{t+1}-x_t\|^2. \end{aligned} \quad (2) \]
Moreover, by (2) and the convexity of $J(\cdot)$, we have
\[ \beta_tJ(y^{ag}_{t+1}) - \beta_tJ(y) \le (\beta_t-1)J(y^{ag}_t) + J(y_{t+1}) - \beta_tJ(y) = (\beta_t-1)[J(y^{ag}_t)-J(y)] + J(y_{t+1}) - J(y). \]


By (2), (2), (2) and the two inequalities above, we obtain
\[ \begin{aligned} &\beta_tQ(z^{ag}_{t+1},z) - (\beta_t-1)Q(z^{ag}_t,z) \\ &= \beta_t\left[G(x^{ag}_{t+1}) + \langle Kx^{ag}_{t+1},y\rangle - J(y)\right] - \beta_t\left[G(x) + \langle Kx,y^{ag}_{t+1}\rangle - J(y^{ag}_{t+1})\right] \\ &\quad - (\beta_t-1)\left\{\left[G(x^{ag}_t) + \langle Kx^{ag}_t,y\rangle - J(y)\right] - \left[G(x) + \langle Kx,y^{ag}_t\rangle - J(y^{ag}_t)\right]\right\} \\ &= \beta_tG(x^{ag}_{t+1}) - (\beta_t-1)G(x^{ag}_t) - G(x) + \beta_tJ(y^{ag}_{t+1}) - \beta_tJ(y) - (\beta_t-1)[J(y^{ag}_t)-J(y)] \\ &\quad + \langle K(\beta_tx^{ag}_{t+1}-(\beta_t-1)x^{ag}_t), y\rangle - \langle Kx, \beta_ty^{ag}_{t+1}-(\beta_t-1)y^{ag}_t\rangle \\ &\le \langle\nabla G(x^{md}_t), x_{t+1}-x\rangle + \frac{L_G}{2\beta_t}\|x_{t+1}-x_t\|^2 + J(y_{t+1}) - J(y) + \langle Kx_{t+1},y\rangle - \langle Kx,y_{t+1}\rangle. \end{aligned} \]

Lemma 1 establishes a bound on $Q(z^{ag}_{t+1},z)$ for all $z\in Z$, which will be used in the proofs of both Theorems 2.1 and 2.2.

Lemma 1. Let $z^{ag}_{t+1}=(x^{ag}_{t+1},y^{ag}_{t+1})$ be the iterates generated by Algorithm 2.2. Assume that the parameters $\beta_t$, $\theta_t$, $\eta_t$ and $\tau_t$ satisfy (2), (2) and (2). Then, for any $z\in Z$, we have
\[ \beta_t\gamma_tQ(z^{ag}_{t+1},z) \le B_t(z,z[t]) + \gamma_t\langle K(x_{t+1}-x_t), y-y_{t+1}\rangle - \gamma_t\left(\frac{\alpha_X}{2\eta_t} - \frac{L_G}{2\beta_t}\right)\|x_{t+1}-x_t\|^2, \quad (2) \]
where $\gamma_t$ is defined in (2), $z[t]:=\{(x_i,y_i)\}_{i=1}^{t+1}$ and
\[ B_t(z,z[t]) := \sum_{i=1}^t\left\{\frac{\gamma_i}{\eta_i}\left[V_X(x,x_i)-V_X(x,x_{i+1})\right] + \frac{\gamma_i}{\tau_i}\left[V_Y(y,y_i)-V_Y(y,y_{i+1})\right]\right\}. \quad (2) \]

Proof. First of all, we explore the optimality conditions of iterations (2) and (2). Applying Lemma 2 in [32] to (2), we have
\[ \langle -K\bar{x}_t, y_{t+1}-y\rangle + J(y_{t+1}) - J(y) \le \frac{1}{\tau_t}V_Y(y,y_t) - \frac{1}{\tau_t}V_Y(y_{t+1},y_t) - \frac{1}{\tau_t}V_Y(y,y_{t+1}) \le \frac{1}{\tau_t}V_Y(y,y_t) - \frac{\alpha_Y}{2\tau_t}\|y_{t+1}-y_t\|^2 - \frac{1}{\tau_t}V_Y(y,y_{t+1}), \quad (2) \]


where the last inequality follows from the fact that, by the strong convexity of $d_Y(\cdot)$ and (2),
\[ V_Y(y_1,y_2) \ge \frac{\alpha_Y}{2}\|y_1-y_2\|^2, \quad \text{for all } y_1,y_2\in Y. \quad (2) \]
Similarly, from (2) we can derive that
\[ \langle\nabla G(x^{md}_t), x_{t+1}-x\rangle + \langle x_{t+1}-x, K^Ty_{t+1}\rangle \le \frac{1}{\eta_t}V_X(x,x_t) - \frac{\alpha_X}{2\eta_t}\|x_{t+1}-x_t\|^2 - \frac{1}{\eta_t}V_X(x,x_{t+1}). \quad (2) \]
Our next step is to establish a crucial recursion for Algorithm 2.2. It follows from (2), (2) and (2) that
\[ \begin{aligned} &\beta_tQ(z^{ag}_{t+1},z) - (\beta_t-1)Q(z^{ag}_t,z) \\ &\le \langle\nabla G(x^{md}_t), x_{t+1}-x\rangle + \frac{L_G}{2\beta_t}\|x_{t+1}-x_t\|^2 + [J(y_{t+1})-J(y)] + \langle Kx_{t+1},y\rangle - \langle Kx,y_{t+1}\rangle \\ &\le \frac{1}{\eta_t}V_X(x,x_t) - \frac{1}{\eta_t}V_X(x,x_{t+1}) - \left(\frac{\alpha_X}{2\eta_t}-\frac{L_G}{2\beta_t}\right)\|x_{t+1}-x_t\|^2 + \frac{1}{\tau_t}V_Y(y,y_t) - \frac{1}{\tau_t}V_Y(y,y_{t+1}) - \frac{\alpha_Y}{2\tau_t}\|y_{t+1}-y_t\|^2 \\ &\quad - \langle x_{t+1}-x, K^Ty_{t+1}\rangle + \langle K\bar{x}_t, y_{t+1}-y\rangle + \langle Kx_{t+1},y\rangle - \langle Kx,y_{t+1}\rangle. \end{aligned} \quad (2) \]
Also observe that by (2), we have
\[ \begin{aligned} &-\langle x_{t+1}-x, K^Ty_{t+1}\rangle + \langle K\bar{x}_t, y_{t+1}-y\rangle + \langle Kx_{t+1},y\rangle - \langle Kx,y_{t+1}\rangle \\ &= \langle K(x_{t+1}-x_t), y-y_{t+1}\rangle - \theta_t\langle K(x_t-x_{t-1}), y-y_{t+1}\rangle \\ &= \langle K(x_{t+1}-x_t), y-y_{t+1}\rangle - \theta_t\langle K(x_t-x_{t-1}), y-y_t\rangle - \theta_t\langle K(x_t-x_{t-1}), y_t-y_{t+1}\rangle. \end{aligned} \]


Multiplying both sides of (2) by $\gamma_t$, using the above identity and the fact that $\gamma_t\theta_t=\gamma_{t-1}$ due to (2), we obtain
\[ \begin{aligned} &\beta_t\gamma_tQ(z^{ag}_{t+1},z) - (\beta_t-1)\gamma_tQ(z^{ag}_t,z) \\ &\le \frac{\gamma_t}{\eta_t}V_X(x,x_t) - \frac{\gamma_t}{\eta_t}V_X(x,x_{t+1}) + \frac{\gamma_t}{\tau_t}V_Y(y,y_t) - \frac{\gamma_t}{\tau_t}V_Y(y,y_{t+1}) + \gamma_t\langle K(x_{t+1}-x_t), y-y_{t+1}\rangle - \gamma_{t-1}\langle K(x_t-x_{t-1}), y-y_t\rangle \\ &\quad - \gamma_t\left(\frac{\alpha_X}{2\eta_t}-\frac{L_G}{2\beta_t}\right)\|x_{t+1}-x_t\|^2 - \frac{\alpha_Y\gamma_t}{2\tau_t}\|y_{t+1}-y_t\|^2 - \gamma_{t-1}\langle K(x_t-x_{t-1}), y_t-y_{t+1}\rangle. \end{aligned} \quad (2) \]
Now, applying the Cauchy–Schwarz inequality to the last term in (2), using the notation $L_K=\|K\|$ and noticing that $\gamma_{t-1}/\gamma_t = \theta_t \le \min\{\eta_{t-1}/\eta_t, \tau_{t-1}/\tau_t\}$ from (2), we have
\[ \begin{aligned} -\gamma_{t-1}\langle K(x_t-x_{t-1}), y_t-y_{t+1}\rangle &\le \gamma_{t-1}\|K(x_t-x_{t-1})\|\,\|y_t-y_{t+1}\| \le L_K\gamma_{t-1}\|x_t-x_{t-1}\|\,\|y_t-y_{t+1}\| \\ &\le \frac{L_K^2\gamma_{t-1}^2\tau_t}{2\alpha_Y\gamma_t}\|x_t-x_{t-1}\|^2 + \frac{\alpha_Y\gamma_t}{2\tau_t}\|y_t-y_{t+1}\|^2 \le \frac{L_K^2\gamma_{t-1}\tau_{t-1}}{2\alpha_Y}\|x_t-x_{t-1}\|^2 + \frac{\alpha_Y\gamma_t}{2\tau_t}\|y_t-y_{t+1}\|^2. \end{aligned} \quad (2) \]
Noting that $\gamma_{t+1}\theta_{t+1}=\gamma_t$, by (2) we have $(\beta_{t+1}-1)\gamma_{t+1}=\beta_t\gamma_t$. Combining the above two relations with inequality (2), we get the following recursion for Algorithm 2.2:
\[ \begin{aligned} &(\beta_{t+1}-1)\gamma_{t+1}Q(z^{ag}_{t+1},z) - (\beta_t-1)\gamma_tQ(z^{ag}_t,z) = \beta_t\gamma_tQ(z^{ag}_{t+1},z) - (\beta_t-1)\gamma_tQ(z^{ag}_t,z) \\ &\le \frac{\gamma_t}{\eta_t}V_X(x,x_t) - \frac{\gamma_t}{\eta_t}V_X(x,x_{t+1}) + \frac{\gamma_t}{\tau_t}V_Y(y,y_t) - \frac{\gamma_t}{\tau_t}V_Y(y,y_{t+1}) + \gamma_t\langle K(x_{t+1}-x_t), y-y_{t+1}\rangle - \gamma_{t-1}\langle K(x_t-x_{t-1}), y-y_t\rangle \\ &\quad - \gamma_t\left(\frac{\alpha_X}{2\eta_t}-\frac{L_G}{2\beta_t}\right)\|x_{t+1}-x_t\|^2 + \frac{L_K^2\gamma_{t-1}\tau_{t-1}}{2\alpha_Y}\|x_t-x_{t-1}\|^2, \quad \forall t\ge 1. \end{aligned} \]
Applying the above inequality inductively and assuming that $x_0=x_1$, we conclude that
\[ \begin{aligned} &(\beta_{t+1}-1)\gamma_{t+1}Q(z^{ag}_{t+1},z) - (\beta_1-1)\gamma_1Q(z^{ag}_1,z) \\ &\le B_t(z,z[t]) + \gamma_t\langle K(x_{t+1}-x_t), y-y_{t+1}\rangle - \gamma_t\left(\frac{\alpha_X}{2\eta_t}-\frac{L_G}{2\beta_t}\right)\|x_{t+1}-x_t\|^2 - \sum_{i=1}^{t-1}\gamma_i\left(\frac{\alpha_X}{2\eta_i} - \frac{L_G}{2\beta_i} - \frac{L_K^2\tau_i}{2\alpha_Y}\right)\|x_{i+1}-x_i\|^2, \end{aligned} \]


which, in view of (2) and the facts that $\beta_1=1$ and $(\beta_{t+1}-1)\gamma_{t+1}=\beta_t\gamma_t$ by (2), implies (2).

We are now ready to prove Theorem 2.1, which is an immediate consequence of Lemma 1.

Proof of Theorem 2.1. Let $B_t(z,z[t])$ be defined in (2). First note that by the definition of $\theta_t$ in (2) and relation (2), we have $\theta_t = \gamma_{t-1}/\gamma_t \ge \eta_{t-1}/\eta_t$ and hence $\gamma_{t-1}/\eta_{t-1} \ge \gamma_t/\eta_t$. Using this observation and (2), we conclude that
\[ \begin{aligned} B_t(z,z[t]) &= \frac{\gamma_1}{\eta_1}V_X(x,x_1) - \sum_{i=1}^{t-1}\left(\frac{\gamma_i}{\eta_i}-\frac{\gamma_{i+1}}{\eta_{i+1}}\right)V_X(x,x_{i+1}) - \frac{\gamma_t}{\eta_t}V_X(x,x_{t+1}) + \frac{\gamma_1}{\tau_1}V_Y(y,y_1) - \sum_{i=1}^{t-1}\left(\frac{\gamma_i}{\tau_i}-\frac{\gamma_{i+1}}{\tau_{i+1}}\right)V_Y(y,y_{i+1}) - \frac{\gamma_t}{\tau_t}V_Y(y,y_{t+1}) \\ &\le \frac{\gamma_1}{\eta_1}\Omega_X^2 - \sum_{i=1}^{t-1}\left(\frac{\gamma_i}{\eta_i}-\frac{\gamma_{i+1}}{\eta_{i+1}}\right)\Omega_X^2 - \frac{\gamma_t}{\eta_t}V_X(x,x_{t+1}) + \frac{\gamma_1}{\tau_1}\Omega_Y^2 - \sum_{i=1}^{t-1}\left(\frac{\gamma_i}{\tau_i}-\frac{\gamma_{i+1}}{\tau_{i+1}}\right)\Omega_Y^2 - \frac{\gamma_t}{\tau_t}V_Y(y,y_{t+1}) \\ &= \frac{\gamma_t}{\eta_t}\Omega_X^2 - \frac{\gamma_t}{\eta_t}V_X(x,x_{t+1}) + \frac{\gamma_t}{\tau_t}\Omega_Y^2 - \frac{\gamma_t}{\tau_t}V_Y(y,y_{t+1}). \end{aligned} \quad (2) \]
Now applying the Cauchy–Schwarz inequality to the inner-product term in (2), we get
\[ \gamma_t\langle K(x_{t+1}-x_t), y-y_{t+1}\rangle \le L_K\gamma_t\|x_{t+1}-x_t\|\,\|y-y_{t+1}\| \le \frac{L_K^2\gamma_t\tau_t}{2\alpha_Y}\|x_{t+1}-x_t\|^2 + \frac{\alpha_Y\gamma_t}{2\tau_t}\|y-y_{t+1}\|^2. \quad (2) \]
Using the above two relations, (2), (2) and (2), we have
\[ \begin{aligned} \beta_t\gamma_tQ(z^{ag}_{t+1},z) &\le \frac{\gamma_t}{\eta_t}\Omega_X^2 - \frac{\gamma_t}{\eta_t}V_X(x,x_{t+1}) + \frac{\gamma_t}{\tau_t}\Omega_Y^2 - \frac{\gamma_t}{\tau_t}\left[V_Y(y,y_{t+1}) - \frac{\alpha_Y}{2}\|y-y_{t+1}\|^2\right] - \gamma_t\left(\frac{\alpha_X}{2\eta_t}-\frac{L_G}{2\beta_t}-\frac{L_K^2\tau_t}{2\alpha_Y}\right)\|x_{t+1}-x_t\|^2 \\ &\le \frac{\gamma_t}{\eta_t}\Omega_X^2 + \frac{\gamma_t}{\tau_t}\Omega_Y^2, \quad \forall z\in Z, \end{aligned} \quad (2) \]
which, together with (2), then clearly implies (2).
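The Cauchy–Schwarz/Young step used above, bounding the bilinear term $\langle K(x_{t+1}-x_t), y-y_{t+1}\rangle$ by $\frac{L_K^2\tau}{2\alpha_Y}\|x_{t+1}-x_t\|^2 + \frac{\alpha_Y}{2\tau}\|y-y_{t+1}\|^2$, can be illustrated numerically. The following sketch (Python; the $2\times 2$ matrix $K$ and all numeric values are illustrative assumptions) estimates $L_K=\|K\|$ by power iteration and checks the inequality on random vectors:

```python
import math
import random

random.seed(0)
K = [[1.0, 2.0], [0.0, 1.5]]  # an illustrative 2x2 linear operator

def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(2)) for i in range(2)]

def norm(v):
    return math.sqrt(sum(c * c for c in v))

def operator_norm(M, iters=200):
    # power iteration on M^T M to estimate the spectral norm ||M||
    v = [1.0, 1.0]
    for _ in range(iters):
        w = matvec(M, v)                                               # M v
        u = [sum(M[i][j] * w[i] for i in range(2)) for j in range(2)]  # M^T (M v)
        n = norm(u)
        v = [c / n for c in u]
    return norm(matvec(M, v))

L_K = operator_norm(K)
alpha_Y, tau = 1.0, 0.7
ok = True
for _ in range(1000):
    dx = [random.uniform(-1, 1) for _ in range(2)]  # plays x_{t+1} - x_t
    dy = [random.uniform(-1, 1) for _ in range(2)]  # plays y - y_{t+1}
    lhs = sum(a * b for a, b in zip(matvec(K, dx), dy))
    rhs = (L_K ** 2 * tau / (2 * alpha_Y)) * norm(dx) ** 2 \
        + (alpha_Y / (2 * tau)) * norm(dy) ** 2
    ok = ok and (lhs <= rhs + 1e-9)
print(ok)  # prints True
```

The inequality holds for every $\tau>0$; the proofs exploit this freedom by choosing $\tau=\tau_t$ so that the quadratic terms telescope against the Bregman-divergence terms.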


Our goal in the remaining part of this subsection is to prove Theorem 2.2, which summarizes the convergence properties of Algorithm 2.2 when $X$ or $Y$ is unbounded. We will first prove a technical result which specializes the results of Lemma 1 to the case when (2), (2) and (2) hold.

Lemma 2. Let $\hat{z}=(\hat{x},\hat{y})\in Z$ be a saddle point of (2). If $V_X(x,x_t)=\|x-x_t\|^2/2$ and $V_Y(y,y_t)=\|y-y_t\|^2/2$ in Algorithm 2.2, and the parameters $\beta_t$, $\theta_t$, $\eta_t$ and $\tau_t$ satisfy (2), (2) and (2), then

a) \[ \|\hat{x}-x_{t+1}\|^2 + \frac{\eta_t(1-p)}{\tau_t}\|\hat{y}-y_{t+1}\|^2 \le \|\hat{x}-x_1\|^2 + \frac{\eta_t}{\tau_t}\|\hat{y}-y_1\|^2, \quad \text{for all } t\ge 1. \quad (2) \]

b) \[ \tilde{g}(z^{ag}_{t+1},v_{t+1}) \le \frac{1}{2\beta_t\eta_t}\|x^{ag}_{t+1}-x_1\|^2 + \frac{1}{2\beta_t\tau_t}\|y^{ag}_{t+1}-y_1\|^2 =: \delta_{t+1}, \quad \text{for all } t\ge 1, \quad (2) \]

where $\tilde{g}(\cdot,\cdot)$ is defined in (2) and
\[ v_{t+1} = \left(\frac{1}{\beta_t\eta_t}(x_1-x_{t+1}),\ \frac{1}{\beta_t\tau_t}(y_1-y_{t+1}) + \frac{1}{\beta_t}K(x_{t+1}-x_t)\right). \quad (2) \]

Proof. It is easy to check that the conditions of Lemma 1 are satisfied. By (2), relation (2) in Lemma 1 becomes
\[ \beta_tQ(z^{ag}_{t+1},z) \le \frac{1}{2\eta_t}\|x-x_1\|^2 - \frac{1}{2\eta_t}\|x-x_{t+1}\|^2 + \frac{1}{2\tau_t}\|y-y_1\|^2 - \frac{1}{2\tau_t}\|y-y_{t+1}\|^2 + \langle K(x_{t+1}-x_t), y-y_{t+1}\rangle - \left(\frac{1}{2\eta_t}-\frac{L_G}{2\beta_t}\right)\|x_{t+1}-x_t\|^2. \quad (2) \]
To prove (2), observe that
\[ \langle K(x_{t+1}-x_t), y-y_{t+1}\rangle \le \frac{L_K^2\tau_t}{2p}\|x_{t+1}-x_t\|^2 + \frac{p}{2\tau_t}\|y-y_{t+1}\|^2, \quad (2) \]
where $p$ is the constant in (2). By (2) and the above two inequalities, we get
\[ \beta_tQ(z^{ag}_{t+1},z) \le \frac{1}{2\eta_t}\|x-x_1\|^2 - \frac{1}{2\eta_t}\|x-x_{t+1}\|^2 + \frac{1}{2\tau_t}\|y-y_1\|^2 - \frac{1-p}{2\tau_t}\|y-y_{t+1}\|^2. \]
Letting $z=\hat{z}$ above, and using the fact that $Q(z^{ag}_{t+1},\hat{z})\ge 0$, we obtain (2).


Now we prove (2). Noting that
\[ \begin{aligned} \|x-x_1\|^2 - \|x-x_{t+1}\|^2 &= 2\langle x_{t+1}-x_1, x\rangle + \|x_1\|^2 - \|x_{t+1}\|^2 \\ &= 2\langle x_{t+1}-x_1, x-x^{ag}_{t+1}\rangle + 2\langle x_{t+1}-x_1, x^{ag}_{t+1}\rangle + \|x_1\|^2 - \|x_{t+1}\|^2 \\ &= 2\langle x_{t+1}-x_1, x-x^{ag}_{t+1}\rangle + \|x^{ag}_{t+1}-x_1\|^2 - \|x^{ag}_{t+1}-x_{t+1}\|^2, \end{aligned} \quad (2) \]
we conclude from (2) and (2) that for any $z\in Z$,
\[ \begin{aligned} &\beta_tQ(z^{ag}_{t+1},z) - \langle K(x_{t+1}-x_t), y^{ag}_{t+1}-y\rangle - \frac{1}{\eta_t}\langle x_1-x_{t+1}, x^{ag}_{t+1}-x\rangle - \frac{1}{\tau_t}\langle y_1-y_{t+1}, y^{ag}_{t+1}-y\rangle \\ &\le \frac{1}{2\eta_t}\left(\|x^{ag}_{t+1}-x_1\|^2 - \|x^{ag}_{t+1}-x_{t+1}\|^2\right) + \frac{1}{2\tau_t}\left(\|y^{ag}_{t+1}-y_1\|^2 - \|y^{ag}_{t+1}-y_{t+1}\|^2\right) + \langle K(x_{t+1}-x_t), y^{ag}_{t+1}-y_{t+1}\rangle - \left(\frac{1}{2\eta_t}-\frac{L_G}{2\beta_t}\right)\|x_{t+1}-x_t\|^2 \\ &\le \frac{1}{2\eta_t}\left(\|x^{ag}_{t+1}-x_1\|^2 - \|x^{ag}_{t+1}-x_{t+1}\|^2\right) + \frac{1}{2\tau_t}\left(\|y^{ag}_{t+1}-y_1\|^2 - \|y^{ag}_{t+1}-y_{t+1}\|^2\right) + \frac{p}{2\tau_t}\|y^{ag}_{t+1}-y_{t+1}\|^2 - \left(\frac{1}{2\eta_t}-\frac{L_G}{2\beta_t}-\frac{L_K^2\tau_t}{2p}\right)\|x_{t+1}-x_t\|^2 \\ &\le \frac{1}{2\eta_t}\|x^{ag}_{t+1}-x_1\|^2 + \frac{1}{2\tau_t}\|y^{ag}_{t+1}-y_1\|^2. \end{aligned} \]
The results in (2) and (2) immediately follow from the above inequality and (2).

We are now ready to prove Theorem 2.2.

Proof of Theorem 2.2. We have established the expressions of $v_{t+1}$ and $\delta_{t+1}$ in Lemma 2; it suffices to estimate the bounds on $\|v_{t+1}\|$ and $\delta_{t+1}$. It follows from the definition of $D$, (2) and (2) that for all $t\ge 1$,
\[ \|\hat{x}-x_{t+1}\| \le D \quad\text{and}\quad \|\hat{y}-y_{t+1}\| \le D\sqrt{\frac{\tau_1}{\eta_1(1-p)}}. \]


Now by (2), we have
\[ \begin{aligned} \|v_{t+1}\| &\le \frac{1}{\beta_t\eta_t}\|x_1-x_{t+1}\| + \frac{1}{\beta_t\tau_t}\|y_1-y_{t+1}\| + \frac{L_K}{\beta_t}\|x_{t+1}-x_t\| \\ &\le \frac{1}{\beta_t\eta_t}\left(\|\hat{x}-x_1\|+\|\hat{x}-x_{t+1}\|\right) + \frac{1}{\beta_t\tau_t}\left(\|\hat{y}-y_1\|+\|\hat{y}-y_{t+1}\|\right) + \frac{L_K}{\beta_t}\left(\|\hat{x}-x_{t+1}\|+\|\hat{x}-x_t\|\right) \\ &\le \frac{1}{\beta_t\eta_t}\left(\|\hat{x}-x_1\|+D\right) + \frac{1}{\beta_t\tau_t}\left(\|\hat{y}-y_1\| + D\sqrt{\frac{\tau_1}{\eta_1(1-p)}}\right) + \frac{2L_K}{\beta_t}D \\ &= \frac{1}{\beta_t\eta_t}\|\hat{x}-x_1\| + \frac{1}{\beta_t\tau_t}\|\hat{y}-y_1\| + D\left[\frac{1}{\beta_t\eta_t}\left(1+\sqrt{\frac{\eta_1}{\tau_1(1-p)}}\right) + \frac{2L_K}{\beta_t}\right]. \end{aligned} \]
To estimate the bound on $\delta_{t+1}$, consider the sequence $\{\gamma_t\}$ defined in (2). Using the fact that $(\beta_{t+1}-1)\gamma_{t+1}=\beta_t\gamma_t$ due to (2) and (2), and applying (2) and (2) inductively, we have
\[ x^{ag}_{t+1} = \frac{1}{\beta_t\gamma_t}\sum_{i=1}^t\gamma_ix_{i+1}, \qquad y^{ag}_{t+1} = \frac{1}{\beta_t\gamma_t}\sum_{i=1}^t\gamma_iy_{i+1}, \qquad\text{and}\qquad \frac{1}{\beta_t\gamma_t}\sum_{i=1}^t\gamma_i = 1. \quad (2) \]
Thus $x^{ag}_{t+1}$ and $y^{ag}_{t+1}$ are convex combinations of the sequences $\{x_{i+1}\}_{i=1}^t$ and $\{y_{i+1}\}_{i=1}^t$. Using these relations and (2), we have
\[ \begin{aligned} \delta_{t+1} &= \frac{1}{2\beta_t\eta_t}\|x^{ag}_{t+1}-x_1\|^2 + \frac{1}{2\beta_t\tau_t}\|y^{ag}_{t+1}-y_1\|^2 \\ &\le \frac{1}{\beta_t\eta_t}\left(\|\hat{x}-x^{ag}_{t+1}\|^2 + \|\hat{x}-x_1\|^2\right) + \frac{1}{\beta_t\tau_t}\left(\|\hat{y}-y^{ag}_{t+1}\|^2 + \|\hat{y}-y_1\|^2\right) \\ &= \frac{1}{\beta_t\eta_t}\left[D^2 + \|\hat{x}-x^{ag}_{t+1}\|^2 + \frac{\eta_t(1-p)}{\tau_t}\|\hat{y}-y^{ag}_{t+1}\|^2 + \frac{\eta_tp}{\tau_t}\|\hat{y}-y^{ag}_{t+1}\|^2\right] \\ &\le \frac{1}{\beta_t\eta_t}\left[D^2 + \frac{1}{\beta_t\gamma_t}\sum_{i=1}^t\gamma_i\left(\|\hat{x}-x_{i+1}\|^2 + \frac{\eta_t(1-p)}{\tau_t}\|\hat{y}-y_{i+1}\|^2 + \frac{\eta_tp}{\tau_t}\|\hat{y}-y_{i+1}\|^2\right)\right] \\ &\le \frac{1}{\beta_t\eta_t}\left[D^2 + \frac{1}{\beta_t\gamma_t}\sum_{i=1}^t\gamma_i\left(D^2 + \frac{\eta_tp}{\tau_t}\cdot\frac{\tau_1}{\eta_1(1-p)}D^2\right)\right] = \frac{(2-p)D^2}{\beta_t\eta_t(1-p)}. \end{aligned} \]
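The convex-combination identity for the aggregate iterates, used in the bound on $\delta_{t+1}$ above, can be verified numerically. The sketch below (Python; scalar stand-ins for the iterates, and the weight sequence $\gamma_t=t$, which is one choice consistent with $(\beta_{t+1}-1)\gamma_{t+1}=\beta_t\gamma_t$ when $\beta_t=(t+1)/2$) checks that the aggregate recursion $x^{ag}_{t+1}=\beta_t^{-1}x_{t+1}+(1-\beta_t^{-1})x^{ag}_t$ reproduces the weighted average $(1/(\beta_t\gamma_t))\sum_i\gamma_ix_{i+1}$:

```python
import random

random.seed(1)
T = 50
x = {t: random.random() for t in range(1, T + 2)}  # scalar stand-ins for x_t

beta = lambda t: (t + 1) / 2.0
x_ag = x[1]  # x^{ag}_1; its value is irrelevant because beta_1 = 1
for t in range(1, T + 1):
    # aggregate update: x^{ag}_{t+1} = beta_t^{-1} x_{t+1} + (1 - beta_t^{-1}) x^{ag}_t
    x_ag = x[t + 1] / beta(t) + (1.0 - 1.0 / beta(t)) * x_ag

# weighted-average form: (1/(beta_T * gamma_T)) * sum_i gamma_i x_{i+1} with gamma_i = i,
# since beta_T * gamma_T = T(T+1)/2 = sum of the weights
avg = sum(i * x[i + 1] for i in range(1, T + 1)) / sum(range(1, T + 1))
print(abs(x_ag - avg) < 1e-9)  # prints True
```

Since the weights are nonnegative and sum to one, the aggregate iterates stay in the (convex) feasible set, which is what allows the diameter bound $D$ to be applied to them.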
2.4.2 Convergence Analysis for the Stochastic APD Algorithm

In this subsection, we prove Theorems 2.3 and 2.4, which describe the convergence properties of the stochastic APD algorithm presented in Section 3.


Let $\hat{G}(x^{md}_t)$, $\hat{K}_x(\bar{x}_t)$ and $\hat{K}_y(y_{t+1})$ be the outputs of the SO at the $t$-th iteration of Algorithm 2.3. Throughout this subsection, we denote
\[ \Delta^t_{x,G} := \hat{G}(x^{md}_t) - \nabla G(x^{md}_t), \quad \Delta^t_{x,K} := \hat{K}_y(y_{t+1}) - K^Ty_{t+1}, \quad \Delta^t_y := -\hat{K}_x(\bar{x}_t) + K\bar{x}_t, \quad \Delta^t_x := \Delta^t_{x,G} + \Delta^t_{x,K}, \quad\text{and}\quad \Delta^t := (\Delta^t_x, \Delta^t_y). \]
Moreover, for a given $z=(x,y)\in Z$, let us denote $\|z\|^2 = \|x\|^2+\|y\|^2$, and its associated dual norm for $\Delta=(\Delta_x,\Delta_y)$ by $\|\Delta\|_*^2 = \|\Delta_x\|_*^2+\|\Delta_y\|_*^2$. We also define the Bregman divergence $V(z,\tilde{z}) := V_X(x,\tilde{x}) + V_Y(y,\tilde{y})$ for $z=(x,y)$ and $\tilde{z}=(\tilde{x},\tilde{y})$.

Before proving Theorem 2.3, we first estimate a bound on $Q(z^{ag}_{t+1},z)$ for all $z\in Z$. This result is analogous to Lemma 1 for the deterministic APD method.

Lemma 3. Let $z^{ag}_t=(x^{ag}_t,y^{ag}_t)$ be the iterates generated by Algorithm 2.3. Assume that the parameters $\beta_t$, $\theta_t$, $\eta_t$ and $\tau_t$ satisfy (2), (2) and (2). Then, for any $z\in Z$, we have
\[ \beta_t\gamma_tQ(z^{ag}_{t+1},z) \le B_t(z,z[t]) + \gamma_t\langle K(x_{t+1}-x_t), y-y_{t+1}\rangle - \gamma_t\left(\frac{q\alpha_X}{2\eta_t}-\frac{L_G}{2\beta_t}\right)\|x_{t+1}-x_t\|^2 + \sum_{i=1}^t\gamma_i\Lambda_i(z), \quad (2) \]
where $\gamma_t$ and $B_t(z,z[t])$, respectively, are defined in (2) and (2), $z[t]=\{(x_i,y_i)\}_{i=1}^{t+1}$ and
\[ \Lambda_i(z) := -\frac{(1-q)\alpha_X}{2\eta_i}\|x_{i+1}-x_i\|^2 - \frac{(1-p)\alpha_Y}{2\tau_i}\|y_{i+1}-y_i\|^2 - \langle\Delta^i, z_{i+1}-z\rangle. \quad (2) \]

Proof. Similarly to (2) and (2), we conclude from the optimality conditions of (2) and (2) that
\[ \langle -\hat{K}_x(\bar{x}_t), y_{t+1}-y\rangle + J(y_{t+1}) - J(y) \le \frac{1}{\tau_t}V_Y(y,y_t) - \frac{\alpha_Y}{2\tau_t}\|y_{t+1}-y_t\|^2 - \frac{1}{\tau_t}V_Y(y,y_{t+1}), \]


and
\[ \langle\hat{G}(x^{md}_t), x_{t+1}-x\rangle + \langle x_{t+1}-x, \hat{K}_y(y_{t+1})\rangle \le \frac{1}{\eta_t}V_X(x,x_t) - \frac{\alpha_X}{2\eta_t}\|x_{t+1}-x_t\|^2 - \frac{1}{\eta_t}V_X(x,x_{t+1}). \]
Now we establish an important recursion for Algorithm 2.3. Observing that Proposition 2.1 also holds for Algorithm 2.3, and applying the above two inequalities to (2) in Proposition 2.1, similarly to (2) we have
\[ \begin{aligned} &\beta_t\gamma_tQ(z^{ag}_{t+1},z) - (\beta_t-1)\gamma_tQ(z^{ag}_t,z) \\ &\le \frac{\gamma_t}{\eta_t}V_X(x,x_t) - \frac{\gamma_t}{\eta_t}V_X(x,x_{t+1}) + \frac{\gamma_t}{\tau_t}V_Y(y,y_t) - \frac{\gamma_t}{\tau_t}V_Y(y,y_{t+1}) + \gamma_t\langle K(x_{t+1}-x_t), y-y_{t+1}\rangle - \gamma_{t-1}\langle K(x_t-x_{t-1}), y-y_t\rangle \\ &\quad - \gamma_t\left(\frac{\alpha_X}{2\eta_t}-\frac{L_G}{2\beta_t}\right)\|x_{t+1}-x_t\|^2 - \frac{\alpha_Y\gamma_t}{2\tau_t}\|y_{t+1}-y_t\|^2 - \gamma_{t-1}\langle K(x_t-x_{t-1}), y_t-y_{t+1}\rangle \\ &\quad - \gamma_t\langle\Delta^t_{x,G}+\Delta^t_{x,K}, x_{t+1}-x\rangle - \gamma_t\langle\Delta^t_y, y_{t+1}-y\rangle, \quad \forall z\in Z. \end{aligned} \quad (2) \]
By the Cauchy–Schwarz inequality and (2), for all $p\in(0,1)$,
\[ \begin{aligned} -\gamma_{t-1}\langle K(x_t-x_{t-1}), y_t-y_{t+1}\rangle &\le \gamma_{t-1}\|K(x_t-x_{t-1})\|\,\|y_t-y_{t+1}\| \le L_K\gamma_{t-1}\|x_t-x_{t-1}\|\,\|y_t-y_{t+1}\| \\ &\le \frac{L_K^2\gamma_{t-1}^2\tau_t}{2p\alpha_Y\gamma_t}\|x_t-x_{t-1}\|^2 + \frac{p\alpha_Y\gamma_t}{2\tau_t}\|y_t-y_{t+1}\|^2 \le \frac{L_K^2\gamma_{t-1}\tau_{t-1}}{2p\alpha_Y}\|x_t-x_{t-1}\|^2 + \frac{p\alpha_Y\gamma_t}{2\tau_t}\|y_t-y_{t+1}\|^2. \end{aligned} \quad (2) \]
By (2), (2), (2) and (2), we can develop the following recursion for Algorithm 2.3:
\[ \begin{aligned} &(\beta_{t+1}-1)\gamma_{t+1}Q(z^{ag}_{t+1},z) - (\beta_t-1)\gamma_tQ(z^{ag}_t,z) = \beta_t\gamma_tQ(z^{ag}_{t+1},z) - (\beta_t-1)\gamma_tQ(z^{ag}_t,z) \\ &\le \frac{\gamma_t}{\eta_t}V_X(x,x_t) - \frac{\gamma_t}{\eta_t}V_X(x,x_{t+1}) + \frac{\gamma_t}{\tau_t}V_Y(y,y_t) - \frac{\gamma_t}{\tau_t}V_Y(y,y_{t+1}) + \gamma_t\langle K(x_{t+1}-x_t), y-y_{t+1}\rangle - \gamma_{t-1}\langle K(x_t-x_{t-1}), y-y_t\rangle \\ &\quad - \gamma_t\left(\frac{q\alpha_X}{2\eta_t}-\frac{L_G}{2\beta_t}\right)\|x_{t+1}-x_t\|^2 + \frac{L_K^2\gamma_{t-1}\tau_{t-1}}{2p\alpha_Y}\|x_t-x_{t-1}\|^2 + \gamma_t\Lambda_t(z), \quad \forall z\in Z. \end{aligned} \]


Applying the above inequality inductively and assuming that $x_0=x_1$, we obtain
\[ \begin{aligned} &(\beta_{t+1}-1)\gamma_{t+1}Q(z^{ag}_{t+1},z) - (\beta_1-1)\gamma_1Q(z^{ag}_1,z) \\ &\le B_t(z,z[t]) + \gamma_t\langle K(x_{t+1}-x_t), y-y_{t+1}\rangle - \gamma_t\left(\frac{q\alpha_X}{2\eta_t}-\frac{L_G}{2\beta_t}\right)\|x_{t+1}-x_t\|^2 \\ &\quad - \sum_{i=1}^{t-1}\gamma_i\left(\frac{q\alpha_X}{2\eta_i}-\frac{L_G}{2\beta_i}-\frac{L_K^2\tau_i}{2p\alpha_Y}\right)\|x_{i+1}-x_i\|^2 + \sum_{i=1}^t\gamma_i\Lambda_i(z), \quad \forall z\in Z. \end{aligned} \quad (2) \]
Relation (2) then follows immediately from the above inequality, (2) and (2).

We also need the following technical result, whose proof is based on Lemma 2.1 of [66].

Lemma 4. Let $\eta_i$, $\tau_i$ and $\gamma_i$, $i=1,2,\ldots$, be given positive constants. For any $z_1\in Z$, if we define $z^v_1=z_1$ and
\[ z^v_{i+1} = \mathop{\mathrm{argmin}}_{z=(x,y)\in Z}\ \left\{-\eta_i\langle\Delta^i_x, x\rangle - \tau_i\langle\Delta^i_y, y\rangle + V(z,z^v_i)\right\}, \quad (2) \]
then
\[ \sum_{i=1}^t\gamma_i\langle -\Delta^i, z^v_i - z\rangle \le B_t(z,z^v[t]) + \sum_{i=1}^t\frac{\gamma_i\eta_i}{2\alpha_X}\|\Delta^i_x\|_*^2 + \sum_{i=1}^t\frac{\gamma_i\tau_i}{2\alpha_Y}\|\Delta^i_y\|_*^2, \quad (2) \]
where $z^v[t] := \{z^v_i\}_{i=1}^t$ and $B_t(z,z^v[t])$ is defined in (2).

Proof. Noting that (2) implies $z^v_{i+1} = (x^v_{i+1}, y^v_{i+1})$, where
\[ x^v_{i+1} = \mathop{\mathrm{argmin}}_{x\in X}\left\{-\eta_i\langle\Delta^i_x, x\rangle + V_X(x,x^v_i)\right\} \quad\text{and}\quad y^v_{i+1} = \mathop{\mathrm{argmin}}_{y\in Y}\left\{-\tau_i\langle\Delta^i_y, y\rangle + V_Y(y,y^v_i)\right\}, \]


from Lemma 2.1 of [66] we have
\[ V_X(x,x^v_{i+1}) \le V_X(x,x^v_i) - \eta_i\langle\Delta^i_x, x-x^v_i\rangle + \frac{\eta_i^2\|\Delta^i_x\|_*^2}{2\alpha_X}, \qquad V_Y(y,y^v_{i+1}) \le V_Y(y,y^v_i) - \tau_i\langle\Delta^i_y, y-y^v_i\rangle + \frac{\tau_i^2\|\Delta^i_y\|_*^2}{2\alpha_Y}, \]
for all $i\ge 1$. Thus
\[ \frac{\gamma_i}{\eta_i}V_X(x,x^v_{i+1}) \le \frac{\gamma_i}{\eta_i}V_X(x,x^v_i) - \gamma_i\langle\Delta^i_x, x-x^v_i\rangle + \frac{\gamma_i\eta_i\|\Delta^i_x\|_*^2}{2\alpha_X}, \qquad \frac{\gamma_i}{\tau_i}V_Y(y,y^v_{i+1}) \le \frac{\gamma_i}{\tau_i}V_Y(y,y^v_i) - \gamma_i\langle\Delta^i_y, y-y^v_i\rangle + \frac{\gamma_i\tau_i\|\Delta^i_y\|_*^2}{2\alpha_Y}. \]
Adding the above two inequalities together, and summing them up from $i=1$ to $t$, we get
\[ 0 \le B_t(z,z^v[t]) - \sum_{i=1}^t\gamma_i\langle\Delta^i, z-z^v_i\rangle + \sum_{i=1}^t\frac{\gamma_i\eta_i\|\Delta^i_x\|_*^2}{2\alpha_X} + \sum_{i=1}^t\frac{\gamma_i\tau_i\|\Delta^i_y\|_*^2}{2\alpha_Y}, \]
so (2) holds.

We are now ready to prove Theorem 2.3.

Proof of Theorem 2.3. Firstly, applying the bounds in (2) and (2) to (2), we get
\[ \begin{aligned} \beta_t\gamma_tQ(z^{ag}_{t+1},z) &\le \frac{\gamma_t}{\eta_t}\Omega_X^2 - \frac{\gamma_t}{\eta_t}V_X(x,x_{t+1}) + \frac{\gamma_t}{\tau_t}\Omega_Y^2 - \frac{\gamma_t}{\tau_t}V_Y(y,y_{t+1}) + \frac{\alpha_Y\gamma_t}{2\tau_t}\|y-y_{t+1}\|^2 \\ &\quad - \gamma_t\left(\frac{q\alpha_X}{2\eta_t}-\frac{L_G}{2\beta_t}-\frac{L_K^2\tau_t}{2\alpha_Y}\right)\|x_{t+1}-x_t\|^2 + \sum_{i=1}^t\gamma_i\Lambda_i(z) \\ &\le \frac{\gamma_t}{\eta_t}\Omega_X^2 + \frac{\gamma_t}{\tau_t}\Omega_Y^2 + \sum_{i=1}^t\gamma_i\Lambda_i(z), \quad \forall z\in Z. \end{aligned} \quad (2) \]


By (2), we have
\[ \begin{aligned} \gamma_i\Lambda_i(z) &= -\frac{(1-q)\alpha_X\gamma_i}{2\eta_i}\|x_{i+1}-x_i\|^2 - \frac{(1-p)\alpha_Y\gamma_i}{2\tau_i}\|y_{i+1}-y_i\|^2 + \gamma_i\langle\Delta^i, z-z_{i+1}\rangle \\ &= -\frac{(1-q)\alpha_X\gamma_i}{2\eta_i}\|x_{i+1}-x_i\|^2 - \frac{(1-p)\alpha_Y\gamma_i}{2\tau_i}\|y_{i+1}-y_i\|^2 + \gamma_i\langle\Delta^i, z_i-z_{i+1}\rangle + \gamma_i\langle\Delta^i, z-z_i\rangle \\ &\le \frac{\gamma_i\eta_i}{2(1-q)\alpha_X}\|\Delta^i_x\|_*^2 + \frac{\gamma_i\tau_i}{2(1-p)\alpha_Y}\|\Delta^i_y\|_*^2 + \gamma_i\langle\Delta^i, z-z_i\rangle, \end{aligned} \quad (2) \]
where the last relation follows from Young's inequality. For all $i\ge 1$, letting $z^v_1=z_1$ and $z^v_{i+1}$ be as in (2), we conclude from (2) and Lemma 4 that, $\forall z\in Z$,
\[ \begin{aligned} \sum_{i=1}^t\gamma_i\Lambda_i(z) &\le \sum_{i=1}^t\left[\frac{\gamma_i\eta_i}{2(1-q)\alpha_X}\|\Delta^i_x\|_*^2 + \frac{\gamma_i\tau_i}{2(1-p)\alpha_Y}\|\Delta^i_y\|_*^2 + \gamma_i\langle\Delta^i, z^v_i-z_i\rangle + \gamma_i\langle -\Delta^i, z^v_i-z\rangle\right] \\ &\le B_t(z,z^v[t]) + \underbrace{\frac{1}{2}\sum_{i=1}^t\left[\frac{(2-q)\gamma_i\eta_i}{(1-q)\alpha_X}\|\Delta^i_x\|_*^2 + \frac{(2-p)\gamma_i\tau_i}{(1-p)\alpha_Y}\|\Delta^i_y\|_*^2\right] + \sum_{i=1}^t\gamma_i\langle\Delta^i, z^v_i-z_i\rangle}_{U_t}, \end{aligned} \quad (2) \]
where, similarly to (2), we have $B_t(z,z^v[t]) \le \gamma_t\Omega_X^2/\eta_t + \gamma_t\Omega_Y^2/\tau_t$. Using the above inequality, (2), (2) and (2), we obtain
\[ \beta_t\gamma_t\,g(z^{ag}_{t+1}) \le \frac{2\gamma_t}{\eta_t}\Omega_X^2 + \frac{2\gamma_t}{\tau_t}\Omega_Y^2 + U_t. \quad (2) \]
Now it suffices to bound the quantity $U_t$, both in expectation (part a)) and in probability (part b)).

We first show part a). Note that by our assumptions on the SO, at iteration $i$ of Algorithm 2.3 the random noise $\Delta^i$ is independent of $z_i$ and $z^v_i$, and hence $\mathbb{E}[\langle\Delta^i, z^v_i-z_i\rangle] = 0$. In addition, Assumption A1 implies that $\mathbb{E}[\|\Delta^i_x\|_*^2] \le \sigma_{x,G}^2+\sigma_{x,K}^2 = \sigma_x^2$ (noting that $\Delta^i_{x,G}$ and $\Delta^i_{x,K}$ are independent at iteration $i$), and $\mathbb{E}[\|\Delta^i_y\|_*^2] \le \sigma_y^2$. Therefore,
\[ \mathbb{E}[U_t] \le \frac{1}{2}\sum_{i=1}^t\left[\frac{(2-q)\gamma_i\eta_i\sigma_x^2}{(1-q)\alpha_X} + \frac{(2-p)\gamma_i\tau_i\sigma_y^2}{(1-p)\alpha_Y}\right]. \quad (2) \]
Taking expectation on both sides of (2) and using the above inequality, we obtain (2).
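The variance decomposition used in part a), namely $\mathbb{E}\|\Delta^i_{x,G}+\Delta^i_{x,K}\|^2 = \sigma_{x,G}^2+\sigma_{x,K}^2$ for independent zero-mean noises, can be illustrated by a small Monte-Carlo experiment (Python; the Gaussian noise model and the standard deviations are illustrative assumptions, not part of Assumption A1):

```python
import random

random.seed(7)
n = 200_000
acc = acc_g = acc_k = 0.0
for _ in range(n):
    d_g = random.gauss(0.0, 0.3)  # plays Delta_{x,G}; second moment 0.09
    d_k = random.gauss(0.0, 0.4)  # plays Delta_{x,K}; second moment 0.16
    acc += (d_g + d_k) ** 2
    acc_g += d_g ** 2
    acc_k += d_k ** 2
lhs = acc / n                 # sample estimate of E||Delta_{x,G} + Delta_{x,K}||^2
rhs = acc_g / n + acc_k / n   # sum of the individual second-moment estimates
print(abs(lhs - rhs) < 0.01)  # prints True: the cross term vanishes in expectation
```

Without independence one would only have the weaker bound $\mathbb{E}\|\Delta^i_x\|^2 \le 2\sigma_{x,G}^2+2\sigma_{x,K}^2$, which is the factor of $2$ that reappears in the probability bound of part b).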


ix,Kareindepdentatiterationi),andE[kiyk2]2y.Therefore,E[Ut]1 2tXi=1(2)]TJ /F8 11.955 Tf 11.96 0 Td[(q)ii2x (1)]TJ /F8 11.955 Tf 11.96 0 Td[(q)X+(2)]TJ /F8 11.955 Tf 11.96 0 Td[(p)ii2y (1)]TJ /F8 11.955 Tf 11.96 0 Td[(p)Y. (2) Takingexpectationonbothsidesof( 2 )andusingtheaboveinequality,weobtain( 2 ). Wenowshowthatpartb)holds.NotethatbyourassumptionsonSOandthedenitionofzvi,thesequencesfhix,G,xvi)]TJ /F8 11.955 Tf 12.19 0 Td[(xiigi1isamartingale-differencesequence.Bythewell-knownlarge-deviationtheoremformatrigale-differencesequence(e.g.,Lemma2of[ 49 ]),andthefactthatE[expY2ihix,G,xvi)]TJ /F8 11.955 Tf 11.95 0 Td[(xii2=)]TJ /F5 11.955 Tf 5.48 -9.68 Td[(22i2Y2x,G]E[expYkix,Gk2kxvi)]TJ /F8 11.955 Tf 11.95 0 Td[(xik2=)]TJ /F5 11.955 Tf 5.48 -9.68 Td[(22Y2x,G]E[expkix,Gk2V(xvi,xi)=)]TJ /F5 11.955 Tf 5.47 -9.69 Td[(2Y2x,G]E[expkix,Gk2=2x,G]expf1g, weconcludethatProb8<:tXi=1ihix,G,xvi)]TJ /F8 11.955 Tf 11.96 0 Td[(xii>x,GXvuut 2 XtXi=12i9=;expf)]TJ /F4 11.955 Tf 15.27 0 Td[(2=3g,8>0. Byusingasimilarargument,wecanshowthat,8>0,Prob8<:tXi=1ihiy,yvi)]TJ /F8 11.955 Tf 11.96 0 Td[(yii>yYvuut 2 YtXi=12i9=;expf)]TJ /F4 11.955 Tf 15.27 0 Td[(2=3g,Prob8<:tXi=1ihix,K,x)]TJ /F8 11.955 Tf 11.95 0 Td[(xii>x,KXvuut 2 XtXi=12i9=;expf)]TJ /F4 11.955 Tf 15.27 0 Td[(2=3g. 57

PAGE 58

Using the previous three inequalities and the fact that $\sigma_{x,G}+\sigma_{x,K}\le\sqrt2\,\sigma_x$, we have, for all $\lambda>0$,
\[
\mathrm{Prob}\Big\{\sum_{i=1}^t\gamma_i\langle\Delta_i, z_i^v-z_i\rangle > \lambda\Big[\frac{\sqrt2\,\sigma_x\Omega_X}{\sqrt{\alpha_X}}+\frac{\sigma_y\Omega_Y}{\sqrt{\alpha_Y}}\Big]\sqrt{2\textstyle\sum_{i=1}^t\gamma_i^2}\Big\}
\le \mathrm{Prob}\Big\{\sum_{i=1}^t\gamma_i\langle\Delta_i, z_i^v-z_i\rangle > \lambda\Big[\frac{(\sigma_{x,G}+\sigma_{x,K})\Omega_X}{\sqrt{\alpha_X}}+\frac{\sigma_y\Omega_Y}{\sqrt{\alpha_Y}}\Big]\sqrt{2\textstyle\sum_{i=1}^t\gamma_i^2}\Big\}
\le 3\exp\{-\lambda^2/3\}. \quad(2)
\]
Now let $S_i:=(2-q)\gamma_i\eta_i/[(1-q)\alpha_X]$ and $S:=\sum_{i=1}^t S_i$. By the convexity of the exponential function, we have
\[
\mathbb{E}\Big[\exp\Big\{\frac1S\sum_{i=1}^t S_i\|\Delta_i^{x,G}\|^2/\sigma_{x,G}^2\Big\}\Big] \le \mathbb{E}\Big[\frac1S\sum_{i=1}^t S_i\exp\big\{\|\Delta_i^{x,G}\|^2/\sigma_{x,G}^2\big\}\Big] \le \exp\{1\},
\]
where the last inequality follows from Assumption A2. Therefore, by Markov's inequality, for all $\lambda>0$,
\[
\mathrm{Prob}\Big\{\sum_{i=1}^t\frac{(2-q)\gamma_i\eta_i}{(1-q)\alpha_X}\|\Delta_i^{x,G}\|^2 > (1+\lambda)\sigma_{x,G}^2\sum_{i=1}^t\frac{(2-q)\gamma_i\eta_i}{(1-q)\alpha_X}\Big\}
= \mathrm{Prob}\Big\{\exp\Big\{\frac1S\sum_{i=1}^t S_i\|\Delta_i^{x,G}\|^2/\sigma_{x,G}^2\Big\}\ge\exp\{1+\lambda\}\Big\} \le \exp\{-\lambda\}.
\]
Using a similar argument, we can show that
\[
\mathrm{Prob}\Big\{\sum_{i=1}^t\frac{(2-q)\gamma_i\eta_i}{(1-q)\alpha_X}\|\Delta_i^{x,K}\|^2 > (1+\lambda)\sigma_{x,K}^2\sum_{i=1}^t\frac{(2-q)\gamma_i\eta_i}{(1-q)\alpha_X}\Big\} \le \exp\{-\lambda\},
\]
\[
\mathrm{Prob}\Big\{\sum_{i=1}^t\frac{(2-p)\gamma_i\tau_i}{(1-p)\alpha_Y}\|\Delta_i^{y}\|^2 > (1+\lambda)\sigma_{y}^2\sum_{i=1}^t\frac{(2-p)\gamma_i\tau_i}{(1-p)\alpha_Y}\Big\} \le \exp\{-\lambda\}.
\]
Combining the previous three inequalities, we obtain
\[
\mathrm{Prob}\Big\{\sum_{i=1}^t\frac{(2-q)\gamma_i\eta_i}{(1-q)\alpha_X}\|\Delta_i^{x}\|^2 + \sum_{i=1}^t\frac{(2-p)\gamma_i\tau_i}{(1-p)\alpha_Y}\|\Delta_i^{y}\|^2 > (1+\lambda)\Big[\sigma_x^2\sum_{i=1}^t\frac{(2-q)\gamma_i\eta_i}{(1-q)\alpha_X}+\sigma_y^2\sum_{i=1}^t\frac{(2-p)\gamma_i\tau_i}{(1-p)\alpha_Y}\Big]\Big\} \le 3\exp\{-\lambda\}. \quad(2)
\]
Our result now follows directly from (2), (2), (2) and (2). $\square$

In the remaining part of this subsection, our goal is to prove Theorem 2.4, which describes the convergence rate of Algorithm 2.3 when $X$ and $Y$ are both unbounded. As in the proof of Theorem 2.2, we first specialize the result of Lemma 3 under (2), (2) and (2). The following lemma is analogous to Lemma 2.

Lemma 5. Let $\hat z=(\hat x,\hat y)\in Z$ be a saddle point of (2). If $V_X(x,x_t)=\|x-x_t\|^2/2$ and $V_Y(y,y_t)=\|y-y_t\|^2/2$ in Algorithm 2.3, and the parameters $\beta_t$, $\eta_t$, $\tau_t$ and $\gamma_t$ satisfy (2), (2) and (2), then

a) for all $t\ge1$,
\[
\|\hat x-x_{t+1}\|^2+\|\hat x-x_{t+1}^v\|^2+\frac{\eta_t(1-p)}{\tau_t}\|\hat y-y_{t+1}\|^2+\frac{\eta_t}{\tau_t}\|\hat y-y_{t+1}^v\|^2 \le 2\|\hat x-x_1\|^2+\frac{2\eta_t}{\tau_t}\|\hat y-y_1\|^2+\frac{2\eta_t}{\gamma_t}U_t, \quad(2)
\]
where $(x_{t+1}^v,y_{t+1}^v)$ and $U_t$ are defined in (2) and (2), respectively.

b) for all $t\ge1$,
\[
\tilde g(z_{t+1}^{ag},v_{t+1}) \le \frac{1}{\beta_t\eta_t}\|x_{t+1}^{ag}-x_1\|^2+\frac{1}{\beta_t\tau_t}\|y_{t+1}^{ag}-y_1\|^2+\frac{1}{\beta_t\gamma_t}U_t =: \varepsilon_{t+1}, \quad(2)
\]
where $\tilde g(\cdot,\cdot)$ is defined in (2) and
\[
v_{t+1}=\Big(\frac{1}{\beta_t\eta_t}\big(2x_1-x_{t+1}-x_{t+1}^v\big),\ \frac{1}{\beta_t\tau_t}\big(2y_1-y_{t+1}-y_{t+1}^v\big)+\frac{1}{\beta_t}K(x_{t+1}-x_t)\Big). \quad(2)
\]

Proof. Applying (2), (2) and (2) to (2) in Lemma 3, we get
\[
\beta_t\gamma_t Q(z_{t+1}^{ag},z) \le \bar B(z,z_{[t]})+\frac{p\gamma_t}{2\tau_t}\|y-y_t\|^2+\bar B(z,z_{[t]}^v)+U_t,
\]
where, for all $z=(x,y)$ and $\tilde z=(\tilde x,\tilde y)\in Z$, $\bar B(\cdot,\cdot)$ is defined as
\[
\bar B(z,\tilde z):=\frac{\gamma_t}{2\eta_t}\|x-x_1\|^2-\frac{\gamma_t}{2\eta_t}\|x-\tilde x\|^2+\frac{\gamma_t}{2\tau_t}\|y-y_1\|^2-\frac{\gamma_t}{2\tau_t}\|y-\tilde y\|^2,
\]
thanks to (2). Now letting $z=\hat z$, and noting that $Q(z_{t+1}^{ag},\hat z)\ge0$, we get (2).
On the other hand, if we only apply (2) and (2) to (2) in Lemma 3, then we get
\[
\beta_t\gamma_t Q(z_{t+1}^{ag},z) \le \bar B(z,z_{[t]})+\gamma_t\langle K(x_{t+1}-x_t),y-y_{t+1}\rangle+\bar B(z,z_{[t]}^v)+U_t.
\]
Applying (2) and (2) to $\bar B(z,z_{[t]})$ and $\bar B(z,z_{[t]}^v)$ in the above inequality, we get (2). $\square$

With the help of Lemma 5, we are ready to prove Theorem 2.4.

Proof of Theorem 2.4. Let $\varepsilon_{t+1}$ and $v_{t+1}$ be defined in (2) and (2), respectively. Also let $C$ and $D$, respectively, be defined in (2) and (2). It suffices to estimate $\mathbb{E}[\|v_{t+1}\|]$ and $\mathbb{E}[\varepsilon_{t+1}]$. First, it follows from (2), (2) and (2) that
\[
\mathbb{E}[U_t] \le \frac{\gamma_t}{2\eta_t}C^2. \quad(2)
\]
Using the above inequality, (2), (2) and (2), we have
\[
\mathbb{E}[\|\hat x-x_{t+1}\|^2]\le 2D^2+C^2, \qquad \mathbb{E}[\|\hat y-y_{t+1}\|^2]\le\big(2D^2+C^2\big)\frac{\eta_1}{\tau_1(1-p)},
\]
which, by Jensen's inequality, imply that
\[
\mathbb{E}[\|\hat x-x_{t+1}\|]\le\sqrt{2D^2+C^2}, \qquad \mathbb{E}[\|\hat y-y_{t+1}\|]\le\sqrt{2D^2+C^2}\,\sqrt{\frac{\eta_1}{\tau_1(1-p)}}.
\]
Similarly, we can show that
\[
\mathbb{E}[\|\hat x-x_{t+1}^v\|]\le\sqrt{2D^2+C^2}, \qquad \mathbb{E}[\|\hat y-y_{t+1}^v\|]\le\sqrt{2D^2+C^2}\,\sqrt{\frac{\eta_1}{\tau_1}}.
\]
Therefore, by (2) and the above four inequalities, we have
\[
\mathbb{E}[\|v_{t+1}\|] \le \mathbb{E}\Big[\frac{1}{\beta_t\eta_t}\big(\|x_1-x_{t+1}\|+\|x_1-x_{t+1}^v\|\big)+\frac{1}{\beta_t\tau_t}\big(\|y_1-y_{t+1}\|+\|y_1-y_{t+1}^v\|\big)+\frac{L_K}{\beta_t}\|x_{t+1}-x_t\|\Big]
\]
\[
\le \mathbb{E}\Big[\frac{1}{\beta_t\eta_t}\big(2\|\hat x-x_1\|+\|\hat x-x_{t+1}\|+\|\hat x-x_{t+1}^v\|\big)+\frac{1}{\beta_t\tau_t}\big(2\|\hat y-y_1\|+\|\hat y-y_{t+1}\|+\|\hat y-y_{t+1}^v\|\big)+\frac{L_K}{\beta_t}\big(\|\hat x-x_{t+1}\|+\|\hat x-x_t\|\big)\Big]
\]
\[
\le \frac{2\|\hat x-x_1\|}{\beta_t\eta_t}+\frac{2\|\hat y-y_1\|}{\beta_t\tau_t}+\sqrt{2D^2+C^2}\Big[\frac{2}{\beta_t\eta_t}+\frac{1}{\beta_t\tau_t}\sqrt{\frac{\eta_1}{\tau_1}}\Big(\sqrt{\frac{1}{1-p}}+1\Big)+\frac{2L_K}{\beta_t}\Big],
\]
thus (2) holds.

Now let us estimate a bound on $\varepsilon_{t+1}$. By (2), (2), (2) and (2), we have
\[
\mathbb{E}[\varepsilon_{t+1}] = \mathbb{E}\Big[\frac{1}{\beta_t\eta_t}\|x_{t+1}^{ag}-x_1\|^2+\frac{1}{\beta_t\tau_t}\|y_{t+1}^{ag}-y_1\|^2\Big]+\frac{1}{\beta_t\gamma_t}\mathbb{E}[U_t]
\]
\[
\le \mathbb{E}\Big[\frac{2}{\beta_t\eta_t}\big(\|\hat x-x_{t+1}^{ag}\|^2+\|\hat x-x_1\|^2\big)+\frac{2}{\beta_t\tau_t}\big(\|\hat y-y_{t+1}^{ag}\|^2+\|\hat y-y_1\|^2\big)\Big]+\frac{C^2}{2\beta_t\eta_t}
\]
\[
= \frac{1}{\beta_t\eta_t}\,\mathbb{E}\Big[2D^2+2\|\hat x-x_{t+1}^{ag}\|^2+\frac{2\eta_t(1-p)}{\tau_t}\|\hat y-y_{t+1}^{ag}\|^2+\frac{2\eta_t p}{\tau_t}\|\hat y-y_{t+1}^{ag}\|^2+\frac{C^2}{2}\Big]
\]
\[
\le \frac{1}{\beta_t\eta_t}\Big\{\frac{2}{\beta_t}\sum_{i=1}^t\gamma_i\Big[\mathbb{E}\|\hat x-x_{i+1}\|^2+\frac{\eta_t(1-p)}{\tau_t}\mathbb{E}\|\hat y-y_{i+1}\|^2+\frac{\eta_t p}{\tau_t}\mathbb{E}\|\hat y-y_{i+1}\|^2\Big]+2D^2+\frac{C^2}{2}\Big\}
\]
\[
\le \frac{1}{\beta_t\eta_t}\Big\{\frac{2}{\beta_t}\sum_{i=1}^t\gamma_i\Big[2D^2+C^2+\frac{\eta_t p}{\tau_t}\cdot\frac{\eta_1}{\tau_1(1-p)}\big(2D^2+C^2\big)\Big]+2D^2+\frac{C^2}{2}\Big\}
= \frac{1}{\beta_t\eta_t}\Big[\frac{6-4p}{1-p}D^2+\frac{5-3p}{2-2p}C^2\Big].
\]
Therefore (2) holds. $\square$
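To make the saddle point setting of this chapter concrete, the following is a minimal primal-dual iteration for a small instance of $\min_x\max_y G(x)+\langle Kx,y\rangle$ in the spirit of the primal-dual method of [15]. This is an illustrative sketch with synthetic data and hypothetical step sizes, not the APD scheme of Algorithm 2.2 itself; it takes $G(x)=\|x-c\|^2/2$ and $Y$ the $\ell_\infty$ unit ball, so the primal problem is $\min_x \|x-c\|^2/2+\|Kx\|_1$.

```python
import numpy as np

# Generic primal-dual iteration for min_x max_{||y||_inf <= 1} G(x) + <Kx, y>,
# with G(x) = ||x - c||^2 / 2.  Synthetic data; hypothetical step sizes.
rng = np.random.default_rng(8)
m, n = 15, 10
K = rng.normal(size=(m, n))
c = rng.normal(size=n)
L = np.linalg.norm(K, 2)
sigma = tau = 0.95 / L                    # step sizes with sigma * tau * ||K||^2 < 1

x = np.zeros(n)
y = np.zeros(m)
x_bar = x.copy()
for _ in range(10_000):
    y = np.clip(y + sigma * (K @ x_bar), -1.0, 1.0)        # dual step + projection
    x_new = (x - tau * (K.T @ y) + tau * c) / (1.0 + tau)  # prox of tau * G
    x_bar = 2.0 * x_new - x                                # extrapolation
    x = x_new

# duality gap: primal value minus the dual value max_y <c, K^T y> - ||K^T y||^2/2
primal = 0.5 * np.linalg.norm(x - c) ** 2 + np.abs(K @ x).sum()
dual = c @ (K.T @ y) - 0.5 * np.linalg.norm(K.T @ y) ** 2
gap = primal - dual
assert -1e-9 <= gap < 1e-2                # the gap vanishes at the saddle point
```

Since $y$ stays feasible after the projection, weak duality guarantees a nonnegative gap, and the iterates drive it to zero; the gap therefore doubles as a practical stopping criterion.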
2.5 Application: Partially Parallel Imaging

In this section, we apply the deterministic APD method to the image reconstruction problem in partially parallel imaging (PPI), which is an emerging technique in magnetic resonance imaging (MRI). MRI is a non-invasive and non-ionizing medical imaging technique that uses magnetic fields and radio waves to produce images of internal structures and organs of the body. By applying radio frequency magnetic fields to the body, the spatial information is recovered by Fourier analysis of the data collected from the receiver coils. MRI is well known for its ability to display good contrast between different soft tissues, such as the brain and the heart. PPI is an MRI technique that accelerates the data acquisition process by collecting parallel partial Fourier data (k-space data) from multiple coil arrays. Sensitivity encoding (SENSE) is one of the most commonly used PPI methods, and it is based on the following assumption:
\[
MF(S_ju)=f_j. \quad(2)
\]
In (2), $M$ is the mask for the partial k-space data collection, $F$ is the 2D Fourier transform, $S_j$ is the sensitivity map of the $j$-th coil, $u$ is the observed image, and $f_j$ is the k-space data collected from the $j$-th coil. In the data acquisition process, the $j$-th coil collects partial k-space data of $S_ju$ and produces the k-space signal $f_j$. The sensitivity map $S_j$ describes the impact of the position of the coil on the image data: at positions closer to the coil receiver, the image signal may be stronger than at positions farther away. In discrete form, we can formulate (2) as
\[
MFS_jx=f_j, \quad(2)
\]
where $x\in\mathbb{R}^n$ is the vector form of the image $u$ with $n$ pixels, $S_j\in\mathbb{C}^{n\times n}$ is the matrix of the sensitivity map, $F\in\mathbb{C}^{n\times n}$ is the discrete Fourier transform matrix, and the mask $M\in\mathbb{R}^{n\times n}$ is a diagonal binary matrix describing the portion of scanned data in the k-space. In SENSE, it is assumed that the $S_j$'s describe the impact of coil sensitivity on the
image signal due to the different positions of pixels, and that the matrices $S_j$ are diagonal. In fact, since $\mathrm{diag}\,S_j$ and $\mathrm{diag}\,M$ are of the same dimension as $x$, we can also visualize $\mathrm{diag}\,M$ and $\mathrm{diag}\,S_j$ as images (see Figures 2-1 and 2-2).

In view of (2) and the TV based image reconstruction model (1), the following TV regularization model can be used for recovering the image $x$:
\[
\min_{x\in\mathbb{R}^n} f(x):=\frac12\sum_{j=1}^k\|MFS_jx-f_j\|^2+\lambda\sum_{i=1}^n\|D_ix\|. \quad(2)
\]
By (1), we have an equivalent deterministic SPP formulation
\[
\min_{x\in X}\max_{y\in Y} G(x)+\langle Kx,y\rangle, \quad(2)
\]
where $X=\mathbb{R}^n$, $Y=\{(y_1^T,\ldots,y_n^T)^T\,|\,y_i\in\mathbb{R}^2,\ \|y_i\|\le1\ \text{for all } i=1,\ldots,n\}$, $K=\lambda D$, and
\[
G(x):=\frac12\sum_{j=1}^k\|MFS_jx-f_j\|^2.
\]
We are now ready to solve (2) by Algorithm 2.2 with the stepsize policy in Corollary 2. For the Lipschitz constant $L_G$, since $M$ is a binary diagonal matrix, $F$ is orthonormal and $S_j$ is diagonal, we have $\|M\|\le1$, $\|F\|=1$, and
\[
\|\nabla G(x_1)-\nabla G(x_2)\|=\Big\|\sum_{j=1}^k S_j^*F^*MFS_j(x_1-x_2)\Big\|\le\sum_{j=1}^k\|S_j\|^2\|x_1-x_2\|,
\]
hence we have $L_G\le\sum_{j=1}^k\max\{|\mathrm{diag}\,S_j|^2\}$. In the experiment, we choose the aggressive value $L_G=\sum_{j=1}^k\max\{|\mathrm{diag}\,S_j|^2\}/3$. For $L_K$, it is shown in [14] that $\|D^\top D\|\le8$ for the two dimensional finite difference operator $D$, so we can set $L_K=2\sqrt2\lambda$.

In the remainder of this section we show the results of the PPI reconstruction based on Algorithm 2.2. We compare the performance to that of Nesterov's smoothing
technique (Nesterov) in [72], the primal-dual (PD) algorithm in [15], the operator splitting (OS) algorithm in [57], and two variants of OS with Barzilai-Borwein stepsize (SBB and SBB with line search) in [100]. The PPI data set is a radial brain data set acquired on a 1.5T Siemens Symphony system (Siemens Medical Solutions, Erlangen, Germany). The acquisition parameters were FOV = 220 mm$^2$, slice thickness 5 mm, TR = 53.5 ms, TE = 3.4 ms, and flip angle 75$^\circ$. The full k-space data is of size $256\times256$ with 8 coil receivers, and we simulated the PPI scan by undersampling the full data using a Poisson distributed mask with a sampling ratio of 24.28%. The images of the mask and the sensitivity maps are shown in Figure 2-1 and Figure 2-2.

In the TV based image reconstruction model in (2), we set the TV regularization parameter $\lambda=10^{-4}$. To compare the efficiency of the different optimization algorithms, we plot the normalized RMSE and the objective function values in (2) versus CPU time in Figure 2-3.

The top plot in Figure 2-3 shows the normalized RMSE versus CPU time, and the bottom shows the objective function values versus CPU time. We can see that our APD method has similar performance to Nesterov's smoothing technique, and both of them outperform the other algorithms. The comparison between the reconstructed image from the APD algorithm and the ground truth is shown in Figure 2-4.

2.6 Concluding Remarks of This Chapter

We present the APD method, which incorporates a multi-step acceleration scheme into the primal-dual method in [15]. We show that this algorithm can achieve the optimal rate of convergence for solving both deterministic and stochastic SPP. In particular, the stochastic APD algorithm seems to be the first optimal algorithm for solving this important class of stochastic saddle-point problems in the literature. For both deterministic and stochastic SPP, the developed APD algorithms can deal with either bounded or unbounded feasible sets as long as a saddle point of SPP exists. In
the unbounded case, the rate of convergence of the APD algorithms will depend on the distance from the initial point to the set of optimal solutions.

Figure 2-1. The mask of the k-space data acquisition.

Figure 2-2. The sensitivity maps of the eight receiver coils.
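The SENSE forward model (2) and the Lipschitz bound $L_G\le\sum_j\max\{|\mathrm{diag}\,S_j|^2\}$ derived in Section 2.5 can be sketched numerically. The snippet below uses synthetic sensitivity maps and a random mask (not the radial brain data set), and verifies that the exact Lipschitz constant of $\nabla G$ respects the bound:

```python
import numpy as np

# Minimal SENSE-type forward model M F S_j x with synthetic sensitivities
# and mask (hypothetical data, not the Siemens acquisition of Section 2.5).
rng = np.random.default_rng(3)
side, k = 16, 4                                   # image side length, coil count
n = side * side
S = rng.uniform(0.2, 1.0, size=(k, n))            # synthetic |diag S_j|
mask = (rng.random(n) < 0.3).astype(float)        # binary k-space sampling mask M

def fft2(v):
    return np.fft.fft2(v.reshape(side, side), norm="ortho").ravel()

def ifft2(v):
    return np.fft.ifft2(v.reshape(side, side), norm="ortho").ravel()

def grad_G(x):
    # grad G(x) = sum_j S_j^* F^* M F S_j x  (M binary diagonal, so M^*M = M)
    return sum(S[j] * ifft2(mask * fft2(S[j] * x)) for j in range(k))

LG_bound = float(sum(S[j].max() ** 2 for j in range(k)))

# Assemble the Hermitian PSD Hessian column by column; its spectral norm is
# the sharp Lipschitz constant of grad G and must respect the bound above.
hess = np.column_stack([grad_G(np.eye(n)[:, i]) for i in range(n)])
LG_sharp = np.linalg.norm(hess, 2)
assert 0.0 < LG_sharp <= LG_bound + 1e-6
```

In practice the sharp constant is often noticeably smaller than the conservative bound, which is one motivation for the "aggressive" choice of $L_G$ reported in the experiments.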
Figure 2-3. Comparison of Nesterov, APD, PD, OS, SBB and SBB with line search. (a) Normalized RMSE vs CPU time. (b) Objective function value vs CPU time.
Figure 2-4. Comparison of the reconstructed image from the APD algorithm and the ground truth. (a) Ground truth. (b) Reconstructed image from APD. (c) Reconstruction error of APD.
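The bound $\|D^\top D\|\le8$ used for $L_K$ in Section 2.5 can be checked numerically. The sketch below uses forward differences with periodic boundary (which may differ from the exact discretization of [14]) and estimates the operator norm by power iteration:

```python
import numpy as np

side = 32

def D(x):
    # 2D forward differences with periodic boundary, stacked (horizontal, vertical)
    u = x.reshape(side, side)
    dh = np.roll(u, -1, axis=1) - u
    dv = np.roll(u, -1, axis=0) - u
    return np.concatenate([dh.ravel(), dv.ravel()])

def Dt(y):
    # adjoint of D for the periodic discretization above
    dh = y[:side * side].reshape(side, side)
    dv = y[side * side:].reshape(side, side)
    return (np.roll(dh, 1, axis=1) - dh + np.roll(dv, 1, axis=0) - dv).ravel()

# power iteration on D^T D
rng = np.random.default_rng(4)
x = rng.normal(size=side * side)
x /= np.linalg.norm(x)
for _ in range(300):
    y = Dt(D(x))
    lam = np.linalg.norm(y)
    x = y / lam

assert lam <= 8.0 + 1e-6      # ||D^T D|| <= 8, i.e. ||D|| <= 2 sqrt(2)
assert lam > 7.5              # and the constant is essentially tight
```

For the periodic case the eigenvalues of $D^\top D$ are $4\sin^2(\pi k/n)+4\sin^2(\pi l/n)$, so the estimate approaches 8 as the grid grows, confirming that the constant $2\sqrt2$ cannot be improved.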
CHAPTER 3
ACCELERATION OF LINEARIZED ALTERNATING DIRECTION METHOD OF MULTIPLIERS

3.1 Introduction

Assume that $\mathcal{W}$, $\mathcal{X}$ and $\mathcal{Y}$ are finite dimensional vector spaces equipped with inner product $\langle\cdot,\cdot\rangle$, norm $\|\cdot\|$ and conjugate norm $\|\cdot\|_*$. Our problem of interest in this chapter is the following affine equality constrained optimization (AECO) problem in (1):
\[
\min_{x\in X,\,w\in\mathcal{W}} H(x)+F(w), \quad \text{s.t. } Bw-Kx=b, \quad(3)
\]
where $X\subseteq\mathcal{X}$ is a closed convex set, $H(\cdot):\mathcal{X}\to\mathbb{R}$ and $F(\cdot):\mathcal{W}\to\mathbb{R}$ are finite valued, convex, proper and lower semi-continuous functions, and $K:\mathcal{X}\to\mathcal{Y}$, $B:\mathcal{W}\to\mathcal{Y}$ are bounded linear operators.

In this chapter, we assume that $H(\cdot)$ can be decomposed into two parts:
\[
H(x):=G(x)+J(x),\quad\forall x\in\mathcal{X}, \quad(3)
\]
where $J(\cdot)$ is relatively simple, so that the following class of optimization problems involving $J(\cdot)$ can be solved efficiently:
\[
\min_{x\in X} \frac{\eta}{2}\|x-c\|^2+J(x),\quad\text{where } c\in\mathcal{X},\ \eta\in\mathbb{R}, \quad(3)
\]
and $G(\cdot)$ is not a simple function, but has good first-order structure. In particular, we assume that $G(\cdot)$ is continuously differentiable and that there exists $L_G>0$ such that
\[
G(x_2)-G(x_1)-\langle\nabla G(x_1),x_2-x_1\rangle \le \frac{L_G}{2}\|x_2-x_1\|^2,\quad\forall x_1\in X,\ x_2\in X. \quad(3)
\]
We also assume that $F(\cdot)$ is relatively simple, so that the optimization problem
\[
\min_{w\in\mathcal{W}} \frac{\eta}{2}\|w-c\|^2+F(w),\quad\text{where } c\in\mathcal{W},\ \eta\in\mathbb{R}, \quad(3)
\]
can be solved efficiently. Noticing that the solution efficiency requirement on $F(\cdot)$ in (3) is similar to the requirement on $J(\cdot)$ in (3), and similar to what we introduced in Chapter 2, we will simply say that functions with this solution efficiency assumption, such as $F(\cdot)$ and $J(\cdot)$, are simple.

One special case of the AECO problem in (3) is when $B=I$ and $b=0$. In this situation, problem (3) is equivalent to the following unconstrained composite optimization (UCO) problem:
\[
\min_{x\in X} f(x):=H(x)+F(Kx). \quad(3)
\]
It should be noted that (3) is also a special instance of the SPP discussed in Chapter 2. The AECO and UCO problems have found numerous applications in machine learning and image processing. In most applications, $H(\cdot)$ is known as the fidelity term and $F(\cdot)$ is the regularization term. For example, consider the following two dimensional total variation based image reconstruction problem
\[
\min_{x\in\mathbb{F}^n} \frac12\|Ax-c\|^2+\lambda\|Dx\|_{2,1}, \quad(3)
\]
where the field $\mathbb{F}$ is either $\mathbb{R}$ or $\mathbb{C}$, and the total variation seminorm $\|Dx\|_{2,1}$ is defined as follows: $D:\mathbb{F}^n\to\mathbb{F}^{2n}$ is the two dimensional discrete gradient operator, and
\[
\|y\|_{2,1}:=\sum_{i=1}^n\sqrt{y_i^2+y_{i+n}^2}. \quad(3)
\]
Notice that the operator $D$ in (3) has already been introduced in (1). In fact, problem (3) is a special instance of (1).

By the discussion after (1), setting $H(x):=\|Ax-c\|^2/2$, $F(\cdot):=\lambda\|\cdot\|_{2,1}$, $K=D$, $X=\mathcal{X}=\mathbb{F}^n$ and $\mathcal{W}=\mathbb{F}^{2n}$, problem (3) becomes a UCO problem in (3). In particular, if for any constant $\eta>0$ the matrix inversion $(\eta I+A^*A)^{-1}$ can be calculated efficiently, then (3) can be solved efficiently, and we can assume that $G=0$ and $J(x):=\|Ax-c\|^2/2$ in (3). On the other hand, if the calculation of the inverse
$(\eta I+A^*A)^{-1}$ is time-consuming, then we can assume that $J=0$ and $G(x):=\|Ax-c\|^2/2$ in (3), with $L_G=\|A^*A\|=\lambda_{\max}(A^*A)$ in (3).

3.1.1 Notations and Terminologies

The main interest of this chapter is to compute approximate solutions of (3) and (3). In this subsection, we give the definitions of their approximate solutions, as well as other necessary assumptions, notations and terminologies that will be used throughout this chapter.

By the method of Lagrangian multipliers, problem (3) is equivalent to the following saddle point problem:
\[
\min_{x\in X,\,w\in\mathcal{W}}\max_{y\in\mathcal{Y}} H(x)+F(w)-\langle y,Bw-Kx-b\rangle. \quad(3)
\]
We assume that there exists an optimal solution $(w^*,x^*)$ of (3) and that there exists $y^*\in\mathcal{Y}$ such that $z^*:=(w^*,x^*,y^*)\in Z$ is a saddle point of (3), where $Z:=\mathcal{W}\times X\times\mathcal{Y}$. We also use the notation $Z:=\mathcal{W}\times X\times Y$ if a set $Y\subseteq\mathcal{Y}$ is declared readily. We use $f^*:=H(x^*)+F(w^*)$ to denote the optimal objective value of problem (3). Since (3) is a special case of (3), we will also use $f^*$ to denote the optimal value $H(x^*)+F(Kx^*)$.

Our goal in this chapter is to find approximate solutions of (3) and (3). With the assumptions and notations introduced above, the approximate solution of (3) is defined as follows:

Definition 1. A pair $(\bar w,\bar x)\in\mathcal{W}\times X$ is called an $(\varepsilon,\delta)$-solution of (3) if
\[
H(\bar x)+F(\bar w)-f^*\le\varepsilon, \quad\text{and}\quad \|B\bar w-K\bar x-b\|\le\delta.
\]
We say that $(\bar w,\bar x)$ has primal residual $\varepsilon$ and feasibility residual $\delta$. In particular, if $(\bar w,\bar x)$ is an $(\varepsilon,0)$-solution, then we simply say that it is an $\varepsilon$-solution.

The feasibility residual $\delta$ in Definition 1 measures the violation of the equality constraint, and the primal residual $\varepsilon$ measures the gap between the objective value of the approximate solution and the optimal value of (3). It should be noted that for an
$(\varepsilon,\delta)$-solution $(\bar w,\bar x)$ where $\delta>0$, it is possible that $H(\bar x)+F(\bar w)-f^*<0$. However, if $(\bar w,\bar x)$ is an $\varepsilon$-solution, then we always have $H(\bar x)+F(\bar w)-f^*\ge0$.

In the remainder of this subsection, we introduce some notations that will be used throughout this chapter. The following distance constants will be used for simplicity:
\[
D_{w,B}:=\|B(w_1-w^*)\|,\quad D_{x,K}:=\|K(x_1-x^*)\|,\quad D_x:=\|x_1-x^*\|,\quad D_y:=\|y_1-y^*\|,
\]
\[
D_{X,K}:=\sup_{x_1,x_2\in X}\|Kx_1-Kx_2\|,\quad\text{and}\quad D_S:=\sup_{s_1,s_2\in S}\|s_1-s_2\|\ \text{for any compact set } S. \quad(3)
\]
For example, for any compact set $Y\subseteq\mathcal{Y}$, we use $D_Y$ to denote the diameter of $Y$. In addition, since sequences appear frequently in the convergence analysis throughout this chapter, for notational simplicity we use $x_{[t]}$ to denote the sequence $\{x_i\}_{i=1}^t$, where the $x_i$'s can be either real numbers or points in any vector space. We also equip the sequence notation with a few operations. Firstly, if $V_1,V_2$ are any vector spaces, $v_{[t+1]}\subseteq V_1$ is any sequence in $V_1$ and $A:V_1\to V_2$ is any operator, we use $Av_{[t+1]}$ to denote the sequence $\{Av_i\}_{i=1}^{t+1}$. Secondly, if $\theta_{[t]},\eta_{[t]}\subseteq\mathbb{R}$ are any real valued sequences and $L\in\mathbb{R}$ is any real number, then $\theta_{[t]}-L\eta_{[t]}$ denotes $\{\theta_i-L\eta_i\}_{i=1}^t$. Finally, we denote by $\eta_{[t]}^{-1}$ the reciprocal sequence $\{\eta_i^{-1}\}_{i=1}^t$ for any non-zero real valued sequence $\eta_{[t]}$.

3.1.2 Alternating Direction Method of Multipliers and Its Variants

In this chapter, we consider the acceleration of the alternating direction method of multipliers (ADMM) algorithm for solving (3). In this subsection, we give a brief introduction to the ADMM scheme and some of its variants.

The ADMM method for solving equality constrained optimization problems originates from the work on the augmented Lagrangian method (ALM) by Hestenes [40] and Powell [80]. The idea of the ALM (originally called the method of multipliers in [40, 80]; see also the textbooks, e.g., [9, 10, 73]) is to solve the following augmented
Lagrangian formulation of (3):
\[
\min_{x\in X,\,w\in\mathcal{W}}\max_{y\in\mathcal{Y}} H(x)+F(w)-\langle y,Bw-Kx-b\rangle+\frac{\rho}{2}\|Bw-Kx-b\|^2, \quad(3)
\]
where $\rho$ is a penalty parameter. The ALM is a special case of the Douglas-Rachford splitting method [26, 30, 55], which is itself an instance of the proximal point algorithm [27, 84]. The ADMM algorithm [31, 34] can be seen as an alternating method for solving (3) that minimizes over $x$ and $w$ alternately and then updates the Lagrange multiplier $y$ (see [12] for a comprehensive explanation of ALM, ADMM and other algorithms). In compressive sensing and imaging science, the class of Bregman iterative methods is an application of the ALM and the ADMM. In particular, the Bregman iterative method [37] is equivalent to the ALM, and the split Bregman method [36] is equivalent to the ADMM. We describe the scheme of ADMM in Algorithm 3.1.

In Algorithm 3.1, we assume that the regularization function $F(\cdot)$ and the operator $B$ are both simple, so that the optimization problem in (3) can be solved efficiently. While this assumption is reasonable in most applications, in many cases the other optimization problem in (3) may not be solved easily, since $H(\cdot)$ has a component $G(\cdot)$ which may not be simple, and the matrix $A$ may be a large full matrix. In such situations, linearization of $G(\cdot)$ and linearization of the quadratic penalty term $\|Bw_t-Kx-b\|^2$ may be considered. We call the variant that linearizes $G(\cdot)$ the linearized ADMM (L-ADMM), and the variant that linearizes $\|Bw_t-Kx-b\|^2$ the preconditioned ADMM (P-ADMM). If both $G(\cdot)$ and $\|Bw_t-Kx-b\|^2$ are linearized, we name the scheme the linearized preconditioned ADMM (LP-ADMM). The schemes of L-ADMM, P-ADMM and LP-ADMM are listed in Algorithms 3.2, 3.3 and 3.4.

There have been several works on the convergence analysis and applications of ADMM, L-ADMM, and P-ADMM.

1) UCO problems. Due to the various applications of UCO problems in imaging science, there have been many studies on primal-dual type algorithms for solving the
Algorithm 3.1 The alternating direction method of multipliers (ADMM) for solving (3)
1: Choose $x_1\in X$, $w_1\in\mathcal{W}$ and $y_1\in\mathcal{Y}$.
2: For $t=1,\ldots,N-1$, update
\[
x_{t+1}=\operatorname*{argmin}_{x\in X}\ H(x)-\langle y_t,Bw_t-Kx-b\rangle+\frac{\rho}{2}\|Bw_t-Kx-b\|^2 \quad(3)
\]
\[
\phantom{x_{t+1}}=\operatorname*{argmin}_{x\in X}\ H(x)+\langle y_t,Kx\rangle+\frac{\rho}{2}\|Bw_t-Kx-b\|^2,
\]
\[
w_{t+1}=\operatorname*{argmin}_{w\in\mathcal{W}}\ F(w)-\langle y_t,Bw-Kx_{t+1}-b\rangle+\frac{\rho}{2}\|Bw-Kx_{t+1}-b\|^2 \quad(3)
\]
\[
\phantom{w_{t+1}}=\operatorname*{argmin}_{w\in\mathcal{W}}\ F(w)-\langle y_t,Bw\rangle+\frac{\rho}{2}\|Bw-Kx_{t+1}-b\|^2,
\]
\[
y_{t+1}=y_t-\rho(Bw_{t+1}-Kx_{t+1}-b). \quad(3)
\]

Algorithm 3.2 L-ADMM for solving (3)
Modify (3) in Algorithm 3.1 to
\[
x_{t+1}=\operatorname*{argmin}_{x\in X}\ \langle\nabla G(x_t),x\rangle+J(x)+\langle y_t,Kx\rangle+\frac{\rho_t}{2}\|Bw_t-Kx-b\|^2+\frac{\eta_t}{2}\|x-x_t\|^2, \quad(3)
\]
where $\nabla G(x_t)$ is the gradient of $G(\cdot)$ at $x_t$.

Algorithm 3.3 P-ADMM for solving (3)
Modify (3) in Algorithm 3.1 to
\[
x_{t+1}=\operatorname*{argmin}_{x\in X}\ H(x)+\langle y_t,Kx\rangle-\rho_t\langle Bw_t-Kx_t-b,Kx\rangle+\frac{\eta_t}{2}\|x-x_t\|^2. \quad(3)
\]

Algorithm 3.4 LP-ADMM for solving (3)
Modify (3) in Algorithm 3.1 to
\[
x_{t+1}=\operatorname*{argmin}_{x\in X}\ \langle\nabla G(x_t),x\rangle+J(x)+\langle y_t,Kx\rangle-\rho_t\langle Bw_t-Kx_t-b,Kx\rangle+\frac{\eta_t}{2}\|x-x_t\|^2, \quad(3)
\]
where $\nabla G(x_t)$ is the gradient of $G(\cdot)$ at $x_t$.
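As a concrete sketch of Algorithm 3.1 (illustrative code, not taken from the dissertation), consider the AECO instance $\min \frac12\|Ax-c\|^2+\tau\|w\|_1$ s.t. $w-x=0$, i.e. $B=K=I$ and $b=0$, with synthetic data. Here the $x$-update is a linear solve and the $w$-update is soft-thresholding, and the result can be cross-checked against a proximal gradient (ISTA) solution of the equivalent UCO problem:

```python
import numpy as np

def soft(v, t):
    # proximal map of t * ||.||_1: the w-update of Algorithm 3.1 when B = I
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def admm(A, c, tau, rho=1.0, iters=2000):
    n = A.shape[1]
    x, w, y = np.zeros(n), np.zeros(n), np.zeros(n)
    M = A.T @ A + rho * np.eye(n)
    Atc = A.T @ c
    for _ in range(iters):
        x = np.linalg.solve(M, Atc - y + rho * w)  # x-update: H(x)+<y,x>+(rho/2)||w-x||^2
        w = soft(x + y / rho, tau / rho)           # w-update
        y = y - rho * (w - x)                      # multiplier update
    return x, w, y

def ista(A, c, tau, iters=5000):
    # proximal gradient reference for min (1/2)||Ax-c||^2 + tau*||x||_1
    L = np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        x = soft(x - A.T @ (A @ x - c) / L, tau / L)
    return x

rng = np.random.default_rng(5)
A = rng.normal(size=(20, 10))
c = rng.normal(size=20)
tau = 0.5

x, w, y = admm(A, c, tau)
obj = lambda v: 0.5 * np.linalg.norm(A @ v - c) ** 2 + tau * np.abs(v).sum()

assert np.linalg.norm(w - x) < 1e-6                # feasibility residual vanishes
assert abs(obj(w) - obj(ista(A, c, tau))) < 1e-4   # both reach the same optimum
```

The penalty parameter `rho = 1.0` is a hypothetical choice; in practice it trades off the speed at which the feasibility residual and the primal residual decrease.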
UCO problems. In [15, 28], the relationship between ADMM, P-ADMM and other algorithms for solving the UCO problem in (3) is studied, including the extragradient method [47, 79], the Douglas-Rachford splitting method [26] and the Arrow-Hurwicz-Uzawa method [2, 102]. In particular, it is shown in [15] that if the parameters $\rho_t$ and $\eta_t$ in (3) are constant, i.e., $\rho_t\equiv\rho$ and $\eta_t\equiv\eta$, then P-ADMM for solving the UCO problem is equivalent to an instance of a primal-dual method, and if $\|K\|^2\rho/\eta\le1$, then P-ADMM solves the UCO problem with the rate of convergence (see the discussion on Algorithm 1 with $\theta=1$ in [15])
\[
O\Big(\frac{\|K\|D^2}{N}\Big), \quad(3)
\]
where $N$ is the number of iterations and $D$ depends on the distances $D_x$ and $D_y$. There are also several works concerning the tuning of the stepsize $\eta_t$ in L-ADMM, including [17, 100, 101].

2) AECO problems. One important paper on the convergence analysis of ADMM is [61], in which ADMM is treated as an instance of the block-decomposition hybrid proximal extragradient (BD-HPE) framework, and it is proved that the rate of convergence of ADMM for solving AECO is
\[
O\Big(\frac{D^2}{N}\Big), \quad(3)
\]
where $D$ depends on $B$, $D_x$ and $D_y$. The convergence analysis of ADMM and P-ADMM for solving the AECO problem in (3) is also studied in [38], in which it is assumed that both the primal and dual spaces of the saddle point problem (3) are bounded, and the result is based on the variational inequality formulation of (3). In a study of the stochastic ADMM in [74], a convergence result based on a linear combination of the primal residual and the feasibility residual is established; in the deterministic case in [74], the rate of convergence of ADMM and L-ADMM
for solving the AECO problem when $X$ is compact is
\[
H(\bar x_N)+F(\bar w_N)-H(x^*)-F(w^*)+\rho\|B\bar w_N-K\bar x_N-b\|^2 \le O\Big(\frac{L_GD_X^2+D_{y,B}^2}{N}\Big),\quad\forall\rho>0, \quad(3)
\]
where $(\bar x_N,\bar w_N)$ is the average of all the iterates of the ADMM algorithm. The result in (3) is stronger than the result in [38], in the sense that both the primal residual and the feasibility residual are included in (3), while in [38] there is no discussion of the feasibility residual. However, it should be noted that since the primal residual $H(\bar x_N)+F(\bar w_N)-H(x^*)-F(w^*)$ in (3) can be negative, (3) does not give the exact rate of convergence of the feasibility residual.

It should be noted that for all the convergence results introduced in this subsection, the order of the rate of convergence is $O(1/N)$. In the following subsection, we introduce some acceleration techniques for ADMM-type algorithms.

3.1.3 Accelerated Methods for AECO and UCO Problems

In his seminal paper [70], Nesterov studied an optimal first-order method for solving smooth optimization problems. In particular, Nesterov demonstrated that the optimal rate for minimizing a convex continuously differentiable function $G(x)$ that satisfies (3) is
\[
O\Big(\frac{L_GD_x^2}{N^2}\Big). \quad(3)
\]
In another seminal study [72], Nesterov demonstrated that an optimal method can be applied to non-smooth optimization with an $O(1/N)$ rate of convergence, which outperforms the traditional non-smooth subgradient methods by an order of magnitude. In particular, the rate of convergence of Nesterov's smoothing technique applied to the UCO problem is
\[
O\Big(\frac{L_GD_x^2}{N^2}+\frac{\|K\|D_xD_Y}{N}\Big), \quad(3)
\]
where $Y$ is the dual space of the UCO problem. It is worth noting that in [72] $X$ is assumed compact, and hence the rate of convergence there depends on $D_X$ instead of $D_x$ in (3). However, the analysis in [72] is also applicable to the case when $X$ is unbounded, yielding (3). Following the breakthrough in [72], much effort has been devoted to the development of more efficient first-order methods for non-smooth optimization (see, e.g., [4, 8, 22, 51, 52, 69, 75, 91]). It should be noted that the boundedness of $Y$ is important for the convergence analysis of Nesterov's smoothing technique. In [19], an accelerated primal-dual (APD) method for solving the UCO problem is proposed, which incorporates a multi-step acceleration scheme into a primal-dual method. The proposed method in [19] achieves the $O(L_G/N^2+\|K\|/N)$ rate of convergence without using the smoothing technique, and does not require any boundedness of the dual space.

Better acceleration results can be obtained if more assumptions are enforced on the AECO and UCO problems. We give a list of such acceleration results.

1) Special instances. For the UCO problem, if $K=I$ and $H(x)$ is also simple, an accelerated method with skipping steps is proposed (see Algorithm 7 in [35]) that achieves the $O(1/N^2)$ rate of convergence, which is better than (3), at the cost of evaluating objective function values in each iteration. If $K\ne I$, the algorithm in [35] requires that the function $F(K\cdot)$ is simple, i.e., that the problem
\[
\min_{x\in X}\frac{\eta}{2}\|x-c\|^2+F(Kx),\quad\text{where } c\in\mathcal{X},\ \eta\in\mathbb{R} \quad(3)
\]
can be solved efficiently.

2) Excessive gap technique. The excessive gap technique is proposed in [69], in which it is assumed that $H(x)$ is also simple. If $H(\cdot)$ is strongly convex, it is shown in [69] that the rate of convergence of the excessive gap technique can be accelerated to $O(1/N^2)$ for solving the UCO problem.
3) Strong convexity. A primal-dual method for solving the UCO problem is studied in [15]. The authors showed that P-ADMM is equivalent to their proposed method, and furthermore, if either $H(\cdot)$ or $F(\cdot)$ is uniformly convex, then the rate of convergence of P-ADMM applied to the UCO problem can be accelerated to $O(1/N^2)$. In addition, if both $H(\cdot)$ and $F(\cdot)$ are uniformly convex (hence the objective function in (3) is continuously differentiable), the proposed method in [15] converges linearly. For the ADMM algorithm applied to the AECO problem when both $H(x)$ and $F(x)$ are strongly convex, an acceleration method is proposed in [36] that achieves the $O(1/N^2)$ rate of convergence.

3.1.4 Main Results

Our contribution in this chapter mainly consists of the following aspects. Firstly, we present two novel accelerated ADMM-type methods, namely the accelerated linearized ADMM (AL-ADMM) and the accelerated linearized preconditioned ADMM (ALP-ADMM), that solve both the AECO and UCO problems without requiring the application of the smoothing technique. The proposed accelerated methods can achieve a similar rate of convergence as in (3), hence they can efficiently solve both types of problems with large Lipschitz constant $L_G$ (as large as $O(N)$). As by-products, we also show that the rates of convergence of ADMM, P-ADMM, L-ADMM and LP-ADMM are all of order $O(1/N)$.

Secondly, we classify the AECO and UCO problems into two types: Type I and Type II. For problems of Type II, our convergence analysis is performed on both the primal and feasibility residuals. We impose more assumptions on problems of Type I, and we demonstrate that under proper assumptions, there always exist approximate solutions to Type I problems that have zero feasibility residual.

Finally, we demonstrate that the proposed framework can deal with the situation when either the primal or the dual space of the AECO and UCO problems is unbounded, as
long as a saddle point of problem (3) exists. The rate of convergence will depend on the distance from the initial point to the set of optimal solutions.

3.2 An Accelerated Linearized ADMM Framework

In this section, we introduce an accelerated linearized ADMM framework for solving the AECO problem (3) and the UCO problem (3). The ADMM, L-ADMM, P-ADMM and LP-ADMM algorithms are special cases of the proposed framework. In addition, we demonstrate that the proposed framework includes two accelerated linearized algorithms that accelerate the rates of convergence of the linearized variants, namely L-ADMM and LP-ADMM, respectively. To start with, we introduce two types of gap functions in Section 3.2.1, and the AECO and UCO problems are classified into two classes based on these two types of gap functions. Later, we describe the scheme of the accelerated linearized ADMM framework in Section 3.2.2, and then provide the main convergence results in Sections 3.2.3 and 3.2.4, based on the two gap functions discussed in Section 3.2.1.

3.2.1 Gap Functions

In this subsection, we introduce two types of gap functions that can be used to analyze the performance of ADMM-type algorithms for computing approximate solutions of AECO and UCO problems.

Our gap functions originate from the theory of saddle point problems, which has been visited in Chapter 2. For any $\tilde z=(\tilde w,\tilde x,\tilde y)\in Z$ and $z=(w,x,y)\in Z$, we define the function
\[
Q(\tilde w,\tilde x,\tilde y;w,x,y):=\big[H(x)+F(w)-\langle\tilde y,Bw-Kx-b\rangle\big]-\big[H(\tilde x)+F(\tilde w)-\langle y,B\tilde w-K\tilde x-b\rangle\big]. \quad(3)
\]
For simplicity, we may also use the notation $Q(\tilde z;z):=Q(\tilde w,\tilde x,\tilde y;w,x,y)$, and under different situations, we may use the notations $Q(\tilde z;w,x,y)$ or $Q(\tilde w,\tilde x,\tilde y;z)$ with the same meaning. We can see that $Q(z^*;z)\ge0$ and $Q(z;z^*)\le0$ for all $z\in Z$, where $z^*$
is a saddle point of (3), as assumed in Section 3.1.1. By the theory of saddle point problems, for compact sets $W\subseteq\mathcal{W}$, $X\subseteq\mathcal{X}$, $Y\subseteq\mathcal{Y}$, the duality gap function
\[
\sup_{\tilde w\in W,\,\tilde x\in X,\,\tilde y\in Y} Q(\tilde w,\tilde x,\tilde y;w,x,y) \quad(3)
\]
measures the accuracy of an approximate solution $(w,x,y)$ to the saddle point problem
\[
\min_{x\in X,\,w\in\mathcal{W}}\max_{y\in\mathcal{Y}} H(x)+F(w)-\langle y,Bw-Kx-b\rangle.
\]
However, our problem of interest (3) has the saddle point formulation (3), in which the feasible set $(\mathcal{W},X,\mathcal{Y})$ is unbounded. Therefore, we need to make slight modifications to the gap function in (3) in order to measure the accuracy of approximate solutions to (3).

For any closed set $Y\subseteq\mathcal{Y}$, and for any $z\in Z$ and $v\in\mathcal{Y}$, we define the following gap function:
\[
g_Y(v,z):=\sup_{\tilde y\in Y} Q(w^*,x^*,\tilde y;z)+\langle v,\tilde y\rangle, \quad(3)
\]
where $Q$ is defined in (3). In addition, we define
\[
g_Y(z):=g_Y(0,z)=\sup_{\tilde y\in Y} Q(w^*,x^*,\tilde y;z). \quad(3)
\]
If $Y=\mathcal{Y}$, we will omit the subscript $Y$ and simply use the notations $g(v,z)$ and $g(z)$.

The convergence analysis throughout this chapter is based on the properties of the gap functions $g_Y(v,z)$ and $g_Y(z)$. In Propositions 1 and 2 below, we describe the relationship between the gap functions and approximate solutions.

Proposition 1. For any $Y\subseteq\mathcal{Y}$, if $g_Y(Bw-Kx-b,z)\le\varepsilon<\infty$ and $\|Bw-Kx-b\|\le\delta$, where $z=(w,x,y)\in Z$, then $(w,x)$ is an $(\varepsilon,\delta)$-solution of (3). In particular, if $g(v,z)\le\varepsilon<\infty$ and $\|v\|\le\delta$, then $v=Bw-Kx-b$.
Proof. By( 3 )and( 3 ),forallv2YandYY,wehavegY(v,z)=sup~y2Y[H(x)+F(w))-222(h~y,Bw)]TJ /F8 11.955 Tf 11.95 0 Td[(Kx)]TJ /F8 11.955 Tf 11.96 0 Td[(bi])]TJ /F5 11.955 Tf 11.96 0 Td[([H(x)+F(w)]+hv,~yi=H(x)+F(w))]TJ /F8 11.955 Tf 11.96 0 Td[(f+sup~y2Yh)]TJ /F5 11.955 Tf 14.23 0 Td[(~y,Bw)]TJ /F8 11.955 Tf 11.95 0 Td[(Kx)]TJ /F8 11.955 Tf 11.96 0 Td[(b)]TJ /F8 11.955 Tf 11.95 0 Td[(vi. FromtheaboveweseethatgY(Bw)]TJ /F8 11.955 Tf 13.09 0 Td[(Kx)]TJ /F8 11.955 Tf 13.09 0 Td[(b,z)=H(x)+F(w))]TJ /F8 11.955 Tf 13.09 0 Td[(f,henceifkBw)]TJ /F8 11.955 Tf 12.4 0 Td[(Kx)]TJ /F8 11.955 Tf 12.4 0 Td[(bk,then(w,z)isan(",)-solution.Inaddition,wecanalsoseethatg(v,z)=1ifv6=Bw)]TJ /F8 11.955 Tf 11.96 0 Td[(Kx)]TJ /F8 11.955 Tf 11.95 0 Td[(b,henceg(v,z)<1impliesthatv=Bw)]TJ /F8 11.955 Tf 11.96 0 Td[(Kx)]TJ /F8 11.955 Tf 11.95 0 Td[(b. Itisworthnotingthatinthegapfunctiong(v,z),thevariablevisalwaysthefeasibilityresidualoftheapproximatesolution(w,x).InProposition 2 below,wedemonstratethatunderproperassumptions,thereexistsapproximatesolutionsthathaszerofeasibilityresidual. Proposition2. AssumethatBisanone-to-onelinearoperatorsuchthatBW=Y,andF()isLipschitzcontinuous.IfwedeneY:=(B))]TJ /F7 7.97 Tf 6.58 0 Td[(1domF,thenYisbounded,andifgY(z)",thenthepair(~w,x)isan"-solutionof( 3 ),where~w=(B))]TJ /F7 7.97 Tf 6.59 0 Td[(1(Kx+b). Proof. 
We can see that $\tilde w$ is well defined since $BW=Y$. Also, since $F(\cdot)$ is finite valued and Lipschitz continuous, by Corollary 13.3.3 in [83] we know that $\operatorname{dom}F^*$ is bounded, and hence $Y^*$ is bounded. In addition, as $B\tilde w-Kx-b=0$, we have $Bw-Kx-b=Bw-B\tilde w$, and
\begin{align*}
g_{Y^*}(z)&=\sup_{\tilde y\in Y^*}\;\big[H(x)+F(w)-\langle\tilde y,Bw-Kx-b\rangle\big]-\big[H(x^*)+F(w^*)\big]\\
&=H(x)+F(w)-f^*+\sup_{\tilde y\in Y^*}\;\langle-\tilde y,\;Bw-B\tilde w\rangle\\
&=H(x)+F(\tilde w)-f^*+\sup_{\tilde y\in Y^*}\;\big[F(w)-F(\tilde w)-\langle B^*\tilde y,\;w-\tilde w\rangle\big].
\end{align*}
If $B^*Y^*\cap\partial F(\tilde w)\ne\emptyset$, then applying the convexity of $F(\cdot)$ to the equation above we have
\[
g_{Y^*}(z)\ge H(x)+F(\tilde w)-f^*,
\]


thus $(\tilde w,x)$ is an $\varepsilon$-solution. To finish the proof, it suffices to show that $B^*Y^*\cap\partial F(\tilde w)\ne\emptyset$. Observing that
\[
\sup_{w^*\in B^*Y^*}\langle\tilde w,w^*\rangle-F^*(w^*)=\sup_{w^*\in\operatorname{dom}F^*}\langle\tilde w,w^*\rangle-F^*(w^*)=\sup_{w^*\in W}\langle\tilde w,w^*\rangle-F^*(w^*),
\]
and that $Y^*$ is closed, we can conclude that there exists $B^*\tilde y\in B^*Y^*$ such that $B^*\tilde y$ attains the supremum of the function $\langle\tilde w,w^*\rangle-F^*(w^*)$ with respect to $w^*$. By Theorem 23.5 in [83], we have $B^*\tilde y\in\partial F(\tilde w)$, and hence $\partial F(\tilde w)\cap B^*Y^*\ne\emptyset$.

In view of Propositions 1 and 2, we use the following terminology throughout this chapter:

For the AECO and UCO problems in (3) and (3), if $B$ is one-to-one, $BW=Y$ and $F(\cdot)$ is Lipschitz continuous, then we say that they are problems of type I. Otherwise, we say that they are problems of type II.

A few remarks are in order on the differences between these two types of problems. Firstly, one major difference is that the set $Y^*$ is bounded for problems of type I, while $Y^*=Y$ is unbounded for problems of type II. We will see in Sections 3.2.3 and 3.2.4 that different parameter settings are proposed depending on the boundedness of $Y^*$. Secondly, due to the assumptions made for problems of type I, we see from Proposition 2 that there always exist approximate solutions with zero feasibility residual. Finally, if $X=\mathbb{R}^n$, $W=\mathbb{R}^k$ and $Y=\mathbb{R}^m$, then for problems of type I we have $k=m$ and $B\in\mathbb{R}^{m\times m}$ is an invertible matrix.

3.2.2 Proposed Framework

In this section, we propose an accelerated linearized ADMM framework for solving the AECO and UCO problems. The ADMM, L-ADMM, P-ADMM and LP-ADMM algorithms are special instances of the proposed framework. In addition, two novel accelerated ADMM methods can be derived from the proposed framework, namely the accelerated linearized ADMM (AL-ADMM) and the accelerated linearized preconditioned


ADMM (ALP-ADMM). AL-ADMM and ALP-ADMM have better rates of convergence than L-ADMM and LP-ADMM, respectively, in terms of the dependence on the Lipschitz constant $L_G$.

The proposed accelerated ADMM framework is presented in Algorithm 3.5.

Algorithm 3.5 A general framework for ADMM-type algorithms
1: Choose $x_1\in X$, $w_1\in W$ and $y_1\in Y$. Set $x^{ag}_1=x_1$ and $w^{ag}_1=w_1$.
2: For $t=1,\dots,N-1$, first update
\begin{align*}
x^{md}_t&=(1-\gamma_t)x^{ag}_t+\gamma_t x_t,\\
x_{t+1}&=\operatorname*{argmin}_{x\in X}\;\langle\nabla G(x^{md}_t),x\rangle+J(x)-\chi\rho_t\langle Bw_t-Kx_t-b,Kx\rangle\\
&\qquad\qquad+\frac{(1-\chi)\rho_t}{2}\|Bw_t-Kx-b\|^2+\langle y_t,Kx\rangle+\frac{\eta_t}{2}\|x-x_t\|^2,\\
w_{t+1}&=\operatorname*{argmin}_{w\in W}\;F(w)-\langle y_t,Bw\rangle+\frac{\lambda_t}{2}\|Bw-Kx_{t+1}-b\|^2,\\
y_{t+1}&=y_t-\theta_t(Bw_{t+1}-Kx_{t+1}-b),\\
x^{ag}_{t+1}&=(1-\gamma_t)x^{ag}_t+\gamma_t x_{t+1},\\
w^{ag}_{t+1}&=(1-\gamma_t)w^{ag}_t+\gamma_t w_{t+1},\\
y^{ag}_{t+1}&=(1-\gamma_t)y^{ag}_t+\gamma_t y_{t+1}.
\end{align*}
3: Output $z^{ag}_N=(w^{ag}_N,x^{ag}_N)$.

In Algorithm 3.5, the superscript "ag" stands for "aggregate", and "md" stands for "middle". The binary constant $\chi$ in (3) is either $0$ or $1$. If $\chi=0$, then (3) is equivalent to
\[
x_{t+1}=\operatorname*{argmin}_{x\in X}\;\langle\nabla G(x^{md}_t),x\rangle+J(x)+\langle y_t,Kx\rangle+\frac{\rho_t}{2}\|Bw_t-Kx-b\|^2+\frac{\eta_t}{2}\|x-x_t\|^2,
\]


in which $G(x)$ is linearized at the point $x^{md}_t$. We call Algorithm 3.5 with $\chi=0$ the accelerated linearized ADMM (AL-ADMM). If $\chi=1$, then (3) becomes
\[
x_{t+1}=\operatorname*{argmin}_{x\in X}\;\langle\nabla G(x^{md}_t),x\rangle+J(x)-\langle\rho_t(Bw_t-Kx_t-b),Kx\rangle+\langle y_t,Kx\rangle+\frac{\eta_t}{2}\|x-x_t\|^2,
\]
and comparing the above with (3), we see that the least squares term $\|Bw_t-Kx-b\|^2$ is also linearized, at the point $x_t$. We name Algorithm 3.5 with $\chi=1$ the accelerated linearized preconditioned ADMM (ALP-ADMM).

Several remarks on Algorithm 3.5 are in order. Firstly, for any $t>1$, the aggregate points $w^{ag}_{t+1}$, $x^{ag}_{t+1}$ and $y^{ag}_{t+1}$ are weighted sums of all the previous iterates $w_{[t+1]}$, $x_{[t+1]}$ and $y_{[t+1]}$. If the weights $\gamma_t\equiv1$, then the aggregate points coincide with the current iterates $w_{t+1}$, $x_{t+1}$ and $y_{t+1}$. Secondly, ADMM, P-ADMM, L-ADMM and LP-ADMM are special cases of Algorithm 3.5 with $\rho_t=\lambda_t=\theta_t$ and $\gamma_t\equiv1$. In fact, if $\chi=0$ in Algorithm 3.5, then Algorithm 3.5 becomes L-ADMM, and if in addition $G\equiv0$ and $\eta_t\equiv0$, then Algorithm 3.5 becomes ADMM. On the other hand, if $\chi=1$, then Algorithm 3.5 becomes LP-ADMM, and if in addition $G\equiv0$, Algorithm 3.5 becomes P-ADMM. However, we will show in this chapter that, with a proper selection of the weighting sequence $\{\gamma_t\}_{t\ge1}$, it is possible to accelerate the convergence rate of Algorithm 3.5 with respect to its dependence on $L_G$, yielding the AL-ADMM and ALP-ADMM algorithms. Thirdly, the parameters $\rho_t$, $\lambda_t$ and $\theta_t$ in Algorithm 3.5 are not necessarily equal, which is different from the original ADMM in Algorithm 3.1. While the case $\rho_t=\lambda_t=\theta_t$ is a special case of Algorithm 3.5, our intuition in considering more parameters in iterations (3), (3) and (3) is to gain more flexibility in adjusting them in pursuit of a better rate of convergence. In fact, we will demonstrate that $\rho_t$, $\lambda_t$ and $\theta_t$ can be predetermined by the problem parameters $L_G$, $\|K\|$ and some distance constants, and that although we introduce more parameters, there is no increase in the number of free parameters that require fine tuning. Finally, it should be noted that the iteration (3) is only for convergence analysis purposes, and there is no need to compute it in practice.
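To make the framework concrete, the loop of Algorithm 3.5 can be sketched in a few lines of code. The sketch below is a minimal, illustrative implementation of the $\chi=0$ (AL-ADMM) branch on a toy quadratic instance with $B=I$ and $J\equiv0$; the function name and toy data are invented for illustration, and the stepsize roles ($\gamma_t$ weights, $\rho_t=\lambda_t$ penalties, $\theta_t$ dual stepsize, $\eta_t$ proximal stepsize) follow the reconstruction used in this chapter, not a reference implementation from the dissertation.

```python
import numpy as np

def al_admm_quadratic(K, b, c, N):
    """Sketch of the chi = 0 (AL-ADMM) loop for the toy instance
    min_{x,w} 0.5*||x - c||^2 + 0.5*||w||^2  s.t.  w - K x = b  (B = I, J = 0)."""
    m, n = K.shape
    LG = 1.0                                   # Lipschitz constant of grad G
    normK2 = np.linalg.norm(K, 2) ** 2         # ||K||^2 (spectral norm squared)
    x, w, y = np.zeros(n), np.zeros(m), np.zeros(m)
    x_ag, w_ag = x.copy(), w.copy()
    for t in range(1, N):
        gamma = 2.0 / (t + 1)                  # weighting sequence gamma_t
        rho = lam = (N - 1) / t                # penalties rho_t = lambda_t
        theta = t / (N - 1)                    # dual stepsize theta_t
        eta = (2 * LG + normK2 * (N - 1)) / t  # proximal stepsize eta_t
        x_md = (1 - gamma) * x_ag + gamma * x
        # x-update: G linearized at x_md, augmented term kept quadratic (chi = 0)
        A = rho * (K.T @ K) + eta * np.eye(n)
        rhs = eta * x - (x_md - c) + rho * (K.T @ (w - b)) - K.T @ y
        x_new = np.linalg.solve(A, rhs)
        # w-update: argmin_w 0.5*||w||^2 - <y, w> + lam/2 * ||w - K x_new - b||^2
        w_new = (y + lam * (K @ x_new + b)) / (1 + lam)
        # dual update and aggregate (weighted-average) iterates
        y = y - theta * (w_new - K @ x_new - b)
        x_ag = (1 - gamma) * x_ag + gamma * x_new
        w_ag = (1 - gamma) * w_ag + gamma * w_new
        x, w = x_new, w_new
    return x_ag, w_ag

# Illustrative data (not from the dissertation)
K = np.array([[0.5, -0.2, 0.1], [0.0, 0.3, 0.4]])
b = np.array([0.1, -0.2])
c = np.array([1.0, 0.5, -0.5])
x_ag, w_ag = al_admm_quadratic(K, b, c, N=1000)
```

On this strongly convex toy instance the aggregate iterate approaches the closed-form minimizer of $\tfrac12\|x-c\|^2+\tfrac12\|Kx+b\|^2$, and the feasibility residual $\|w^{ag}_N-Kx^{ag}_N-b\|$ shrinks with $N$, in line with the rates discussed below.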


We list the main convergence results for Algorithm 3.5 in Sections 3.2.3, 3.2.4 and 3.2.5. The proofs of all the theorems throughout this section will be given in Section 3.3. In Sections 3.2.3 and 3.2.4, we present the main convergence properties of the accelerated ADMM framework in Algorithm 3.5 for problems of type I and type II, respectively. In Section 3.2.5, we discuss more general choices of the weighting parameter sequence $\gamma_{[N]}$.

3.2.3 Convergence Results for Problems of Type I

In this subsection, we list three theorems that describe the rate of convergence of Algorithm 3.5 for solving problems of type I. Since $B$ is one-to-one and $\operatorname{dom}F^*$ is bounded for problems of type I, throughout this section we denote
\[
Y^*:=(B^*)^{-1}\operatorname{dom}F^*.
\]
In Theorems 1 and 2 below, we provide two convergence results for Algorithm 3.5. Theorem 1 describes the rate of convergence of the ADMM, L-ADMM, P-ADMM and LP-ADMM algorithms, and Theorem 2 provides the rate of convergence of the AL-ADMM and ALP-ADMM algorithms.

Theorem 1. In Algorithm 3.5, if the parameters are set to $\gamma_t\equiv1$, $\rho_t=\lambda_t=\theta_t\equiv\alpha$ and $\eta_t\equiv\eta\ge L_G+\chi\alpha\|K\|^2$, and the initial value $y_1\in Y^*$, then
\[
H(\bar x_{t+1})+F(\tilde w_{t+1})-f^*\le\frac{1}{2t}\left(\eta D_{x^*}^2+\alpha D_{w^*,B}^2-\chi\alpha D_{x^*,K}^2+\frac{D_{Y^*}^2}{\alpha}\right),
\]
where
\[
\bar x_{t+1}:=\frac{1}{t}\sum_{i=2}^{t+1}x_i\quad\text{and}\quad\tilde w_{t+1}:=B^{-1}(K\bar x_{t+1}+b).
\]
In particular, if $Bw_1=Kx_1+b$, $\alpha$ satisfies
\[
\alpha=\frac{D_{Y^*}}{\chi\|K\|D_{x^*}+(1-\chi)D_{x^*,K}},
\]
and $\eta=L_G+\chi\alpha\|K\|^2$, then
\[
H(\bar x_{t+1})+F(\tilde w_{t+1})-f^*\le\frac{L_GD_{x^*}^2}{2t}+\frac{2\chi\|K\|D_{x^*}D_{Y^*}+2(1-\chi)D_{x^*,K}D_{Y^*}}{2t}.
\]


A few remarks are in order for Theorem 1. Firstly, this theorem is a unified statement of the rate of convergence of the ADMM, L-ADMM, P-ADMM and LP-ADMM algorithms. For clarity, the rates of convergence of these algorithms for solving problems of type I are listed in Table 3-1. Secondly, although $\alpha$ is best tuned at the value in (3), for any $\alpha$ the rate of convergence given by Theorem 1 is always of order $O(1/t)$. Thirdly, the preconditioned ADMM methods (P-ADMM and LP-ADMM) are slower than the ADMM methods without preconditioning (ADMM and L-ADMM) by a constant factor, since $\|K\|D_{x^*}\ge D_{x^*,K}$ in (3). However, it should be noted that in practice the per-iteration cost of solving the optimization problem (3) with $\chi=0$ is usually higher than with $\chi=1$. Therefore, the trade-off between a better rate of convergence and a cheaper iteration cost should be taken into consideration in practice. Finally, the pair $(\tilde w_{t+1},\bar x_{t+1})$ is an approximate solution with zero feasibility residual, since $B\tilde w_{t+1}-K\bar x_{t+1}-b=0$.

In Theorem 2, we show that if the weighting sequence $\{\gamma_t\}_{t\ge1}$ is chosen properly, the rate of convergence of Algorithm 3.5 can be improved, in terms of the dependence on $L_G$, for the AECO and UCO problems.

Theorem 2. In Algorithm 3.5, if the total number of iterations $N$ is fixed, and the parameters are set to
\[
\gamma_t=\frac{2}{t+1},\quad\rho_t=\lambda_t=\theta_t=\frac{\alpha(N-1)}{t},\quad\eta_t=\frac{2L_G+\chi\alpha(N-1)\|K\|^2}{t},
\]
then
\[
H(x^{ag}_N)+F(\tilde w^{ag}_N)-f^*\le\frac{2L_GD_{x^*}^2}{N(N-1)}+\frac{\alpha D_{w^*,B}^2}{N}+\frac{\chi\alpha\|K\|^2D_{x^*}^2}{N}-\frac{\chi\alpha D_{x^*,K}^2}{N}+\frac{D_{Y^*}^2}{\alpha N},
\]
where $\tilde w^{ag}_N:=B^{-1}(Kx^{ag}_N+b)$. In particular, if $Bw_1=Kx_1+b$ and $\alpha$ is defined by (3), then
\[
H(x^{ag}_N)+F(\tilde w^{ag}_N)-f^*\le\frac{2L_GD_{x^*}^2}{N(N-1)}+\frac{2\chi\|K\|D_{x^*}D_{Y^*}+2(1-\chi)D_{x^*,K}D_{Y^*}}{N}.
\]
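A quick arithmetic check makes the role of $L_G$ in the bound of Theorem 2 transparent: since $L_G$ only enters through the $N(N-1)$ denominator, the bound still decays like $O(1/N)$ even when $L_G$ grows like $O(N)$. The snippet below evaluates the special-case bound for $\chi=1$ with placeholder constants $D_{x^*}=D_{Y^*}=\|K\|=1$ (illustrative values, not from the dissertation):

```python
def thm2_bound(N, LG, Dx=1.0, DY=1.0, normK=1.0):
    # Special-case bound of Theorem 2 (chi = 1):
    #   2*LG*Dx^2/(N*(N-1)) + 2*normK*Dx*DY/N
    return 2.0 * LG * Dx**2 / (N * (N - 1)) + 2.0 * normK * Dx * DY / N

# Setting LG = N: N * bound stays bounded, so the rate is still O(1/N).
scaled = [N * thm2_bound(N, LG=N) for N in (10, 100, 1000)]
```

As the scaled values show, $N$ times the bound approaches a constant, i.e., the $O(1/N)$ rate is unaffected by an $O(N)$-sized Lipschitz constant.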


Table 3-1. The rate of convergence of ADMM-type algorithms for solving problems of type I

\begin{tabular}{lll}
 & No preconditioning & Preconditioned\\
ADMM & $O\!\left(\frac{D_{x^*,K}D_{Y^*}}{N}\right)$ & $O\!\left(\frac{\|K\|D_{x^*}D_{Y^*}}{N}\right)$\\
Linearized ADMM & $O\!\left(\frac{L_GD_{x^*}^2+D_{x^*,K}D_{Y^*}}{N}\right)$ & $O\!\left(\frac{L_GD_{x^*}^2+\|K\|D_{x^*}D_{Y^*}}{N}\right)$\\
Accelerated & $O\!\left(\frac{L_GD_{x^*}^2}{N^2}+\frac{D_{x^*,K}D_{Y^*}}{N}\right)$ & $O\!\left(\frac{L_GD_{x^*}^2}{N^2}+\frac{\|K\|D_{x^*}D_{Y^*}}{N}\right)$
\end{tabular}

In view of (3), the rates of convergence of the accelerated linearized ADMM algorithms (namely AL-ADMM and ALP-ADMM) are better than the results in Theorem 1, in terms of their dependence on $L_G$. For the accelerated linearized ADMM methods, the constant $L_G$ can be as large as $O(N)$ without affecting the rate of convergence. The comparison between the accelerated algorithms and the algorithms without acceleration, in terms of rate of convergence, is shown in Table 3-1. Also, we can see from (3) that AL-ADMM is better than ALP-ADMM in terms of rate of convergence by a constant factor. However, as in the remarks after Theorem 1, the trade-off between a better rate and a cheaper iteration cost has to be considered in practice.

Comparing Theorems 1 and 2, we can see that the total number of iterations $N$ is required for the analysis in Theorem 2. In Theorem 3 below, we show that if $X$ is bounded, then the requirement on $N$ in Theorem 2 can be removed.

Theorem 3. If $X$ is bounded, and the parameters of Algorithm 3.5 are set to
\[
\gamma_t=\frac{2}{t+1},\quad\rho_t=\lambda_t=\alpha,\quad\theta_t=\frac{\alpha(t-1)}{t},\quad\eta_t=\frac{2L_G+\chi\alpha\|K\|^2(t-1)}{t},
\]
then
\[
H(x^{ag}_{t+1})+F(\tilde w^{ag}_{t+1})-f^*\le\frac{2L_GD_X^2}{t(t+1)}+\frac{\alpha D_{w^*,B}^2}{t+1}+\frac{\chi\alpha\|K\|^2D_X^2}{t+1}-\frac{\chi\alpha D_{X,K}^2}{t+1}+\frac{D_{Y^*}^2}{\alpha(t+1)},
\]


where $\tilde w^{ag}_{t+1}:=B^{-1}(Kx^{ag}_{t+1}+b)$. In particular, if $Bw_1=Kx_1+b$ and
\[
\alpha=\frac{D_{Y^*}}{\chi\|K\|D_X+(1-\chi)D_{X,K}},
\]
then
\[
H(x^{ag}_{t+1})+F(\tilde w^{ag}_{t+1})-f^*\le\frac{2L_GD_X^2}{t(t+1)}+\frac{2\chi\|K\|D_XD_{Y^*}+2(1-\chi)D_{X,K}D_{Y^*}}{t+1}.
\]
The parameter setting in (3) differs from the original ADMM scheme in Algorithm 3.1, in that the dual stepsize parameter $\theta_t$ is now slightly different from $\rho_t$ and $\lambda_t$. However, comparing Theorem 3 with Theorems 1 and 2, we can see that there is no additional free parameter that requires tuning. In fact, the tuning of the constant $\alpha$ in Theorem 3 is in some sense easier, since the distance constants $D_X$ and $D_{X,K}$ in (3) may be estimated easily in some applications, while the constants $D_{x^*}$ and $D_{x^*,K}$ in (3) are unknown in most cases.

In [19], an accelerated primal-dual (APD) method is proposed for solving a class of saddle point problems. It is interesting to see the connection between the APD method and the ALP-ADMM method. In [15, 28], it is shown that primal-dual methods and preconditioned ADMM methods are equivalent by Moreau decomposition (see, e.g., [21, 64, 83]). Similarly, we can also show the relationship between APD and ALP-ADMM, based on Lemma 1 below, which is a direct consequence of Moreau decomposition:

Lemma 1. If $B$ is one-to-one and $BW=Y$ in the AECO problem, $y_1\in Y^*$, and $\theta_t=\lambda_t$ in Algorithm 3.5, then for all $t>1$ the iterate $y_t\in Y^*$, and
\[
y_{t+1}=\operatorname*{argmin}_{y\in Y^*}\;F^*(B^*y)+\frac{1}{2\theta_t}\left\|y-(y_t+\theta_tKx_{t+1}+\theta_tb)\right\|^2.
\]


Proof. By (3), (3) and Moreau's decomposition theorem, for all $t>1$ we have
\begin{align*}
y_{t+1}&=y_t-\theta_t(Bw_{t+1}-Kx_{t+1}-b)\\
&=(y_t+\theta_tKx_{t+1}+\theta_tb)-\theta_tB\operatorname*{argmin}_{w\in W}\;F(w)+\frac{\theta_t}{2}\left\|Bw-Kx_{t+1}-b-\frac{y_t}{\theta_t}\right\|^2\\
&=(y_t+\theta_tKx_{t+1}+\theta_tb)-\theta_t\operatorname*{argmin}_{u\in Y}\;F(B^{-1}u)+\frac{\theta_t}{2}\left\|u-\frac{1}{\theta_t}(\theta_tKx_{t+1}+\theta_tb+y_t)\right\|^2\\
&=\operatorname*{argmin}_{y\in Y^*}\;F^*(B^*y)+\frac{1}{2\theta_t}\left\|y-(y_t+\theta_tKx_{t+1}+\theta_tb)\right\|^2\;\in\;Y^*,
\end{align*}
and hence (3) holds.

By Lemma 1, when ALP-ADMM is applied to UCO problems of type I with $\theta_t=\lambda_t$ and $J(\cdot)\equiv0$, the iterations (3), (3) and (3) are equivalent to
\begin{align*}
x_{t+1}&=\operatorname*{argmin}_{x\in X}\;\langle\nabla G(x^{md}_t),x\rangle-\rho_t\langle w_t-Kx_t,Kx\rangle+\langle y_t,Kx\rangle+\frac{\eta_t}{2}\|x-x_t\|^2,\\
y_{t+1}&=\operatorname*{argmin}_{y\in Y^*}\;F^*(y)+\frac{1}{2\theta_t}\|y-(y_t+\theta_tKx_{t+1})\|^2.
\end{align*}
Noticing from (3) that $w_t-Kx_t=(y_{t-1}-y_t)/\theta_{t-1}$, the above iterations are equivalent to
\begin{align*}
\bar y_t&=y_t-\frac{\rho_t}{\theta_{t-1}}(y_{t-1}-y_t),\\
x_{t+1}&=\operatorname*{argmin}_{x\in X}\;\langle\nabla G(x^{md}_t),x\rangle+\langle\bar y_t,Kx\rangle+\frac{\eta_t}{2}\|x-x_t\|^2,\\
y_{t+1}&=\operatorname*{argmin}_{y\in Y^*}\;F^*(y)-\langle Kx_{t+1},y\rangle+\frac{1}{2\theta_t}\|y-y_t\|^2.
\end{align*}
In particular, if we use the parameter settings in (3) and (3), then the main iterates of ALP-ADMM are equivalent to
\begin{align*}
\bar y_t&=y_t-\frac{t-1}{t}(y_{t-1}-y_t),\\
x_{t+1}&=\operatorname*{argmin}_{x\in X}\;\langle\nabla G(x^{md}_t),x\rangle+\langle\bar y_t,Kx\rangle+\frac{2L_G+(t-1)\|K\|D_{Y^*}/D_X}{2t}\|x-x_t\|^2,\\
y_{t+1}&=\operatorname*{argmin}_{y\in Y^*}\;F^*(y)-\langle Kx_{t+1},y\rangle+\frac{\|K\|D_X}{2D_{Y^*}}\|y-y_t\|^2.
\end{align*}
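The Moreau decomposition behind Lemma 1 states that $v=\operatorname{prox}_{\tau f}(v)+\tau\operatorname{prox}_{f^*/\tau}(v/\tau)$ for any convex $f$ and $\tau>0$. A minimal numerical sanity check of this identity, using $f=\|\cdot\|_1$ (whose prox is soft-thresholding, and whose conjugate is the indicator of the $\ell_\infty$ unit ball, with projection by clipping as its prox); the function names are illustrative:

```python
import numpy as np

def prox_l1(v, tau):
    # prox of tau * ||.||_1: componentwise soft-thresholding
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def prox_l1_conjugate(v):
    # the conjugate of ||.||_1 is the indicator of the l_inf unit ball,
    # so its prox (for any positive stepsize) is projection onto [-1, 1]^n
    return np.clip(v, -1.0, 1.0)

rng = np.random.default_rng(0)
v = rng.normal(size=6)
tau = 0.7
# Moreau decomposition: v = prox_{tau f}(v) + tau * prox_{f*/tau}(v/tau)
recomposed = prox_l1(v, tau) + tau * prox_l1_conjugate(v / tau)
```

The same identity, applied to $F\circ B^{-1}$ and its conjugate $F^*(B^*\cdot)$, is exactly the mechanism that turns the $w$-update and dual update of ALP-ADMM into the dual proximal step of Lemma 1.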


On the other hand, in view of (2.20)¹ in [19], one example of the parameter settings of the APD iterations is
\begin{align*}
\bar x_t&=x_t-\frac{t-1}{t}(x_{t-1}-x_t),\\
y_{t+1}&=\operatorname*{argmin}_{y\in Y^*}\;F^*(y)-\langle K\bar x_t,y\rangle+\frac{\|K\|D_X}{2D_{Y^*}}\|y-y_t\|^2,\\
x_{t+1}&=\operatorname*{argmin}_{x\in X}\;\langle\nabla G(x^{md}_t),x\rangle+\langle y_{t+1},Kx\rangle+\frac{2L_G+(t-1)\|K\|D_{Y^*}/D_X}{2t}\|x-x_t\|^2.
\end{align*}
We can see that the ALP-ADMM method is analogous to the APD algorithm in [19]. In view of (3), we can also write the above APD iterations as
\begin{align*}
\bar x_t&=x_t-\frac{t-1}{t}(x_{t-1}-x_t),\\
y_{t+1}&=\operatorname*{argmin}_{y\in Y^*}\;F^*(y)-\langle K\bar x_t,y\rangle+\frac{1}{2\alpha}\|y-y_t\|^2,\\
x_{t+1}&=\operatorname*{argmin}_{x\in X}\;\langle\nabla G(x^{md}_t),x\rangle+\langle y_{t+1},Kx\rangle+\frac{2L_G+(t-1)\alpha\|K\|^2}{2t}\|x-x_t\|^2,
\end{align*}
where $\alpha$ depends on $\|K\|$, $D_X$ and $D_{Y^*}$.

3.2.4 Convergence Results for Problems of Type II

In this subsection, we describe the rate of convergence of Algorithm 3.5 for solving problems of type II. The results in this subsection differ from those in Section 3.2.3, since we need to take the feasibility residual into account for problems of type II.

We start with the convergence analysis of the ADMM, L-ADMM, P-ADMM and LP-ADMM algorithms.

¹There is a slight difference between the parameter setting we describe here and the parameter setting (2.20) in [19]. The stepsize for the $x$-iteration in (2.20) of [19] is $\frac{2L_G+t\|K\|D_{Y^*}/D_X}{2t}$, while here we use $\frac{2L_G+(t-1)\|K\|D_{Y^*}/D_X}{2t}$. It is easy to show that the convergence result in Corollary 2.2 of [19] holds for either parameter setting.


Theorem 4. In Algorithm 3.5, if $\gamma_t\equiv1$, $\rho_t=\lambda_t=\theta_t\equiv\alpha$ and $\eta_t\equiv\eta\ge L_G+\chi\alpha\|K\|^2$, then
\begin{align*}
H(\bar x_{t+1})+F(\bar w_{t+1})-f^*&\le\frac{1}{2t}\left(\eta D_{x^*}^2+\alpha D_{w^*,B}^2-\chi\alpha D_{x^*,K}^2+\frac{\|y_1\|^2}{\alpha}\right),\\
\|B\bar w_{t+1}-K\bar x_{t+1}-b\|^2&\le\frac{2}{t^2}\left(\frac{2D_{y^*}^2}{\alpha^2}+\frac{\eta D_{x^*}^2}{\alpha}+D_{w^*,B}^2-\chi D_{x^*,K}^2\right),
\end{align*}
where $\bar x_{t+1}:=\frac{1}{t}\sum_{i=2}^{t+1}x_i$ and $\bar w_{t+1}:=\frac{1}{t}\sum_{i=2}^{t+1}w_i$. In particular, if $y_1=0$, $Bw_1=Kx_1+b$, $\alpha=1$ and $\eta=L_G+\chi\|K\|^2$, then
\begin{align*}
H(\bar x_{t+1})+F(\bar w_{t+1})-f^*&\le\frac{1}{2t}\left(L_GD_{x^*}^2+\chi\|K\|^2D_{x^*}^2+(1-\chi)D_{x^*,K}^2\right),\\
\|B\bar w_{t+1}-K\bar x_{t+1}-b\|^2&\le\frac{2}{t^2}\left(L_GD_{x^*}^2+2D_{y^*}^2+\chi\|K\|^2D_{x^*}^2+(1-\chi)D_{x^*,K}^2\right).
\end{align*}
From Theorem 4 we see that for the ADMM, L-ADMM, P-ADMM and LP-ADMM algorithms, the rates of convergence of both the primal residual and the feasibility residual are of order $O(1/t)$. The detailed rate of convergence of each algorithm is listed in Tables 3-2 and 3-3. Unlike the setting of $\alpha$ in Section 3.2.3, here we simply set $\alpha=1$, since there is no optimal value of $\alpha$ that minimizes both bounds in (3) and (3). It should also be noted from (3) that the rate of convergence of the feasibility residual is
\[
\|B\bar w_{t+1}-K\bar x_{t+1}-b\|\le O\!\left(\frac{\sqrt{L_G}D_{x^*}+D_{y^*}+\chi\|K\|D_{x^*}+(1-\chi)D_{x^*,K}}{N}\right).
\]
In Theorem 5 below, we show that there exists a weighting sequence $\{\gamma_t\}_{t\ge1}$ that improves the rate of convergence of Algorithm 3.5 in terms of the dependence on $L_G$.

Theorem 5. In Algorithm 3.5, if the total number of iterations is set to $N$, the initial values satisfy $Bw_1=Kx_1+b$ and $y_1=0$, and the parameters are set to
\[
\gamma_t=\frac{2}{t+1},\quad\rho_t=\lambda_t=\frac{N-1}{t},\quad\theta_t=\frac{t}{N-1},\quad\eta_t=\frac{2L_G+\chi\|K\|^2(N-1)}{t},
\]


Table 3-2. The rate of convergence of the primal residuals of ADMM-type algorithms for solving problems of type II

\begin{tabular}{lll}
 & No preconditioning & Preconditioned\\
ADMM & $O\!\left(\frac{D_{x^*,K}^2}{N}\right)$ & $O\!\left(\frac{\|K\|^2D_{x^*}^2}{N}\right)$\\
Linearized ADMM & $O\!\left(\frac{L_GD_{x^*}^2+D_{x^*,K}^2}{N}\right)$ & $O\!\left(\frac{L_GD_{x^*}^2+\|K\|^2D_{x^*}^2}{N}\right)$\\
Accelerated & $O\!\left(\frac{L_GD_{x^*}^2}{N^2}+\frac{D_{x^*,K}^2}{N}\right)$ & $O\!\left(\frac{L_GD_{x^*}^2}{N^2}+\frac{\|K\|^2D_{x^*}^2}{N}\right)$
\end{tabular}

then
\begin{align*}
H(x^{ag}_N)+F(w^{ag}_N)-f^*&\le\frac{2L_GD_{x^*}^2}{N(N-1)}+\frac{\chi\|K\|^2D_{x^*}^2}{N}+\frac{(1-\chi)D_{x^*,K}^2}{N},\\
\|Bw^{ag}_N-Kx^{ag}_N-b\|^2&\le\frac{16L_GD_{x^*}^2}{N^2(N-1)}+\frac{8\chi\|K\|^2D_{x^*}^2}{N^2}+\frac{8(1-\chi)D_{x^*,K}^2}{N^2}+\frac{16D_{y^*}^2}{N^2}.
\end{align*}
It should be noted that in the parameter setting (3), the dual stepsize parameter $\theta_t$ is different from $\rho_t$ and $\lambda_t$, which differs from the original ADMM method in Algorithm 3.1. It is also different from the parameter setting in (3) for the accelerated linearized algorithms applied to problems of type I, in which the parameters $\rho_t$, $\lambda_t$ and $\theta_t$ share the same value.

From (3), the rate of convergence of the feasibility residual of the accelerated scheme is
\[
\|Bw^{ag}_N-Kx^{ag}_N-b\|\le O\!\left(\frac{\sqrt{L_G}D_{x^*}}{N^{3/2}}+\frac{D_{y^*}+\chi\|K\|D_{x^*}+(1-\chi)D_{x^*,K}}{N}\right).
\]
Comparing (3) and (3) with (3) and (3), we can clearly see the advantage of the accelerated linearized ADMM. In particular, the AL-ADMM and ALP-ADMM algorithms allow a very large Lipschitz constant $L_G$ (as big as $O(N)$) without affecting the rate of convergence of either the primal residual in (3) or the feasibility residual in (3).


Table 3-3. The rate of convergence of the feasibility residuals of ADMM-type algorithms for solving problems of type II

\begin{tabular}{lll}
 & No preconditioning & Preconditioned\\
ADMM & $O\!\left(\frac{D_{x^*,K}+D_{y^*}}{N}\right)$ & $O\!\left(\frac{\|K\|D_{x^*}+D_{y^*}}{N}\right)$\\
Linearized ADMM & $O\!\left(\frac{\sqrt{L_G}D_{x^*}+D_{x^*,K}+D_{y^*}}{N}\right)$ & $O\!\left(\frac{\sqrt{L_G}D_{x^*}+\|K\|D_{x^*}+D_{y^*}}{N}\right)$\\
Accelerated & $O\!\left(\frac{\sqrt{L_G}D_{x^*}}{N^{3/2}}+\frac{D_{x^*,K}+D_{y^*}}{N}\right)$ & $O\!\left(\frac{\sqrt{L_G}D_{x^*}}{N^{3/2}}+\frac{\|K\|D_{x^*}+D_{y^*}}{N}\right)$
\end{tabular}

3.2.5 More General Choices of the Weighting Parameters

In this subsection, we provide other choices of the weighting parameter sequence $\gamma_{[N]}$.

The following quantity will be used throughout this chapter to describe the convergence rate of Algorithm 3.5:
\[
\Gamma_t=\begin{cases}1 & \text{when } t=1 \text{ or } \gamma_t\equiv1,\\ (1-\gamma_t)\Gamma_{t-1} & \text{when } t>1.\end{cases}
\]
Throughout this subsection, we assume that the sequences $\{\gamma_t\}_{t\ge1}$ and $\{\Gamma_t\}_{t\ge1}$ satisfy (3) and the following conditions:
\[
\gamma_1=1,\qquad \frac{1-\gamma_{t+1}}{\gamma_{t+1}^2}\le\frac{1}{\gamma_t^2},\qquad 0<\frac{\gamma_t^2}{\Gamma_t}\le c \ \text{ for some constant } c>0.
\]
The condition in (3) is inspired by Nesterov's parameter settings for accelerated gradient methods ([70, 71, 91]). In the following proposition, we provide some properties of sequences $\{\gamma_t\}_{t\ge1}$ and $\{\Gamma_t\}_{t\ge1}$ that satisfy (3).


Proposition 3. For any positive sequences $\{\gamma_t\}_{t\ge1}$ and $\{\Gamma_t\}_{t\ge1}$ that satisfy (3) and (3), both $\{\gamma_t/\Gamma_t\}_{t\ge1}$ and $\{\gamma_t^2/\Gamma_t\}_{t\ge1}$ are non-decreasing, and $\{\Gamma_t\}_{t\ge1}$ is non-increasing.

Proof. By (3) and (3) we have
\[
\Gamma_{t+1}=(1-\gamma_{t+1})\Gamma_t\le\Gamma_t,\qquad \frac{\Gamma_{t+1}}{\gamma_{t+1}^2}=\frac{(1-\gamma_{t+1})\Gamma_t}{\gamma_{t+1}^2}\le\frac{\Gamma_t}{\gamma_t^2},
\]
i.e., $\{\Gamma_t\}_{t\ge1}$ is non-increasing and $\{\gamma_t^2/\Gamma_t\}_{t\ge1}$ is non-decreasing. Hence $\{\gamma_t^2/\Gamma_t^2\}_{t\ge1}$ is non-decreasing, and so is $\{\gamma_t/\Gamma_t\}_{t\ge1}$.

Clearly, if
\[
c=2,\qquad \gamma_t=\frac{2}{t+1},\qquad \Gamma_t=\frac{2}{t(t+1)},\qquad \forall t\ge1,
\]
then (3) and (3) hold. The above weighting parameter setting is exactly the one used in Theorems 2, 3 and 5. It is also worth noting that if $(1-\gamma_{t+1})/\gamma_{t+1}^2=1/\gamma_t^2$ in (3), then
\[
c=1,\qquad \gamma_1=1,\qquad \gamma_{t+1}=\frac{\sqrt{\gamma_t^4+4\gamma_t^2}-\gamma_t^2}{2},\qquad \Gamma_t=\gamma_t^2,\qquad \forall t\ge1,
\]
is another choice that satisfies (3) and (3). As pointed out in [91], the choice in (3) decreases the fastest among all choices of weighting parameters that satisfy (3).

In Theorems 6, 7 and 8 below, we provide example weighting parameter settings of Algorithm 3.5 that generalize the settings in Theorems 2, 3 and 5.

Theorem 6. In Theorem 2, if the parameters are set to
\[
\rho_t=\lambda_t=\theta_t=\frac{\alpha\gamma_{N-1}\Gamma_t}{\gamma_t\Gamma_{N-1}},\qquad \eta_t=\frac{\Gamma_t\left(cL_G+\alpha\|K\|^2\gamma_{N-1}/\Gamma_{N-1}\right)}{\gamma_t},
\]
where $c$, $\gamma_t$ and $\Gamma_t$ satisfy (3), then Theorem 2 still holds. In particular, the parameter setting in (3) is a special case of (3).
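Both example choices of $(\gamma_t,\Gamma_t)$ above can be verified numerically against the recursion $\Gamma_t=(1-\gamma_t)\Gamma_{t-1}$ and the Nesterov-type condition in (3). A small self-check (the function names are illustrative):

```python
import math

def choice_one(T):
    # gamma_t = 2/(t+1), Gamma_t = 2/(t*(t+1)): the setting of Theorems 2, 3 and 5
    return ([2.0 / (t + 1) for t in range(1, T + 1)],
            [2.0 / (t * (t + 1)) for t in range(1, T + 1)])

def choice_two(T):
    # gamma_1 = 1, gamma_{t+1} = (sqrt(gamma_t^4 + 4*gamma_t^2) - gamma_t^2)/2,
    # Gamma_t = gamma_t^2: the condition holds with equality
    g = [1.0]
    for _ in range(T - 1):
        gt = g[-1]
        g.append((math.sqrt(gt**4 + 4 * gt**2) - gt**2) / 2)
    return g, [gt**2 for gt in g]

def satisfies_conditions(gamma, Gamma, tol=1e-12):
    ok = abs(gamma[0] - 1.0) < tol                                   # gamma_1 = 1
    for t in range(1, len(gamma)):
        ok &= abs(Gamma[t] - (1 - gamma[t]) * Gamma[t - 1]) < tol    # recursion
        ok &= (1 - gamma[t]) / gamma[t]**2 <= 1 / gamma[t - 1]**2 + tol
    return ok
```

The check confirms, for instance, that the closed-form pair $\gamma_t=2/(t+1)$, $\Gamma_t=2/(t(t+1))$ indeed satisfies the recursion exactly, while the second choice satisfies the condition with equality.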


Theorem 7. In Theorem 3, if the parameters are set to
\[
\rho_t=\lambda_t=\alpha,\qquad \theta_t=\frac{\alpha\gamma_{t-1}\Gamma_t}{\gamma_t\Gamma_{t-1}},\qquad \eta_t=\frac{\Gamma_t\left(cL_G+\alpha\|K\|^2\gamma_{t-1}/\Gamma_{t-1}\right)}{\gamma_t},
\]
where $c$, $\gamma_t$ and $\Gamma_t$ satisfy (3) (assuming that $\gamma_0=\Gamma_0=1$), then Theorem 3 still holds. In particular, the parameter setting in (3) is a special case of (3).

Theorem 8. In Theorem 5, if the parameters are set to
\[
\rho_t=\lambda_t=\frac{\gamma_{N-1}\Gamma_t}{\gamma_t\Gamma_{N-1}},\qquad \theta_t=\frac{\gamma_t\Gamma_{N-1}}{\gamma_{N-1}\Gamma_t},\qquad \eta_t=\frac{\Gamma_t\left(cL_G+\|K\|^2\gamma_{N-1}/\Gamma_{N-1}\right)}{\gamma_t},
\]
where $c$, $\gamma_t$ and $\Gamma_t$ satisfy (3), then Theorem 5 still holds. In particular, the parameter setting in (3) is a special case of (3).

Clearly, Theorems 2, 3 and 5 are special instances of Theorems 6, 7 and 8, in the sense that $\gamma_t$ and $\Gamma_t$ in Theorems 2, 3 and 5 are given by (3).

3.3 Convergence Analysis

In this section, we prove the main convergence results described in Section 3.2.2, namely Theorems 1, 4, 6, 7 and 8. We do not need to prove Theorems 2, 3 and 5, since they are instances of Theorems 6–8.

We start with Lemmas 2 and 3, which are the foundations of all the convergence analysis throughout this chapter.

Lemma 2.
For all $y\in Y$, the iterates $\{z^{ag}_t\}_{t\ge1}=\{(w^{ag}_t,x^{ag}_t,y^{ag}_t)\}_{t\ge1}$ of Algorithm 3.5 satisfy
\begin{align*}
&\frac{1}{\Gamma_t}Q(w^*,x^*,y;z^{ag}_{t+1})-\sum_{i=2}^{t}\left(\frac{1-\gamma_i}{\Gamma_i}-\frac{1}{\Gamma_{i-1}}\right)Q(w^*,x^*,y;z^{ag}_i)\\
&\le\mathcal B_t(x^*,x_{[t+1]},\eta_{[t]})+\mathcal B_t(y,y_{[t+1]},\theta^{-1}_{[t]})+\mathcal B_t(Bw^*,Bw_{[t+1]},\rho_{[t]})-\chi\mathcal B_t(Kx^*,Kx_{[t+1]},\rho_{[t]})\\
&\quad-\sum_{i=1}^{t}\frac{\gamma_i(\lambda_i-\rho_i)}{2\Gamma_i}\|Bw_{i+1}-Kx^*-b\|^2+\sum_{i=1}^{t}\frac{\gamma_i(\lambda_i-\rho_i)}{2\Gamma_i}\|K(x_{i+1}-x^*)\|^2\\
&\quad-\sum_{i=1}^{t}\frac{\gamma_i(\lambda_i-\theta_i)}{2\Gamma_i\theta_i^2}\|y_i-y_{i+1}\|^2-\sum_{i=1}^{t}\frac{\gamma_i}{2\Gamma_i}\left(\eta_i-L_G\gamma_i-\chi\rho_i\|K\|^2\right)\|x_i-x_{i+1}\|^2,
\end{align*}


where the map $\mathcal B_t(\cdot,\cdot,\cdot)$ is defined as follows: for any point $v$ and any sequence $v_{[t+1]}$ in any vector space $V$, and any real-valued sequence $\xi_{[t]}$,
\[
\mathcal B_t(v,v_{[t+1]},\xi_{[t]}):=\sum_{i=1}^{t}\frac{\gamma_i\xi_i}{2\Gamma_i}\left(\|v_i-v\|^2-\|v_{i+1}-v\|^2\right).
\]
Proof. To start with, we prove an important property of the function $Q(\cdot,\cdot)$ under Algorithm 3.5. By equations (3) and (3), $x^{ag}_{t+1}-x^{md}_t=\gamma_t(x_{t+1}-x_t)$. Using this observation as well as the smoothness and convexity of $G(\cdot)$, we have
\begin{align*}
G(x^{ag}_{t+1})&\le G(x^{md}_t)+\langle\nabla G(x^{md}_t),x^{ag}_{t+1}-x^{md}_t\rangle+\frac{L_G}{2}\|x^{ag}_{t+1}-x^{md}_t\|^2\\
&=G(x^{md}_t)+\langle\nabla G(x^{md}_t),x^{ag}_{t+1}-x^{md}_t\rangle+\frac{L_G\gamma_t^2}{2}\|x_{t+1}-x_t\|^2\\
&=G(x^{md}_t)+(1-\gamma_t)\langle\nabla G(x^{md}_t),x^{ag}_t-x^{md}_t\rangle+\gamma_t\langle\nabla G(x^{md}_t),x_{t+1}-x^{md}_t\rangle+\frac{L_G\gamma_t^2}{2}\|x_{t+1}-x_t\|^2\\
&=(1-\gamma_t)\big[G(x^{md}_t)+\langle\nabla G(x^{md}_t),x^{ag}_t-x^{md}_t\rangle\big]+\gamma_t\big[G(x^{md}_t)+\langle\nabla G(x^{md}_t),x-x^{md}_t\rangle\big]\\
&\qquad+\gamma_t\langle\nabla G(x^{md}_t),x_{t+1}-x\rangle+\frac{L_G\gamma_t^2}{2}\|x_{t+1}-x_t\|^2\\
&\le(1-\gamma_t)G(x^{ag}_t)+\gamma_tG(x)+\gamma_t\langle\nabla G(x^{md}_t),x_{t+1}-x\rangle+\frac{L_G\gamma_t^2}{2}\|x_{t+1}-x_t\|^2,\qquad\forall x\in X.
\end{align*}
Also, by (3) we have
\begin{align*}
&Q(z;z^{ag}_{t+1})-(1-\gamma_t)Q(z;z^{ag}_t)\\
&=\big[H(x^{ag}_{t+1})+F(w^{ag}_{t+1})-\langle y,Bw^{ag}_{t+1}-Kx^{ag}_{t+1}-b\rangle\big]-\big[H(x)+F(w)-\langle y^{ag}_{t+1},Bw-Kx-b\rangle\big]\\
&\quad-(1-\gamma_t)\big[H(x^{ag}_t)+F(w^{ag}_t)-\langle y,Bw^{ag}_t-Kx^{ag}_t-b\rangle\big]+(1-\gamma_t)\big[H(x)+F(w)-\langle y^{ag}_t,Bw-Kx-b\rangle\big],
\end{align*}


and hence, by (3), (3), (3) and the convexity of $J(\cdot)$ and $F(\cdot)$, we conclude that
\begin{align*}
&Q(z;z^{ag}_{t+1})-(1-\gamma_t)Q(z;z^{ag}_t)\\
&\le G(x^{ag}_{t+1})-(1-\gamma_t)G(x^{ag}_t)-\gamma_tG(x)+J(x^{ag}_{t+1})-(1-\gamma_t)J(x^{ag}_t)-\gamma_tJ(x)\\
&\quad+F(w^{ag}_{t+1})-(1-\gamma_t)F(w^{ag}_t)-\gamma_tF(w)-\gamma_t\langle y,Bw_{t+1}-Kx_{t+1}-b\rangle+\gamma_t\langle y_{t+1},Bw-Kx-b\rangle\\
&\le\gamma_t\Big[\langle\nabla G(x^{md}_t),x_{t+1}-x\rangle+J(x_{t+1})-J(x)+F(w_{t+1})-F(w)\\
&\qquad\quad-\langle y,Bw_{t+1}-Kx_{t+1}-b\rangle+\langle y_{t+1},Bw-Kx-b\rangle+\frac{L_G\gamma_t}{2}\|x_{t+1}-x_t\|^2\Big].
\end{align*}
Next, we examine the optimality conditions of (3) and (3): for all $x\in X$ and $w\in W$, we have
\begin{align*}
&\langle\nabla G(x^{md}_t)+\eta_t(x_{t+1}-x_t),x_{t+1}-x\rangle+J(x_{t+1})-J(x)-\langle\rho_t(Bw_t-K\tilde x_t-b)-y_t,\,K(x_{t+1}-x)\rangle\le0,\\
&F(w_{t+1})-F(w)+\langle\lambda_t(Bw_{t+1}-Kx_{t+1}-b)-y_t,\,B(w_{t+1}-w)\rangle\le0,
\end{align*}
where $\tilde x_t:=\chi x_t+(1-\chi)x_{t+1}$.
Observing from (3) that $Bw_{t+1}-Kx_{t+1}-b=(y_t-y_{t+1})/\theta_t$ and
\[
Bw_t-K\tilde x_t-b=\frac{1}{\theta_t}(y_t-y_{t+1})-K(\tilde x_t-x_{t+1})+B(w_t-w_{t+1}),
\]
the optimality conditions become
\begin{align*}
&\langle\nabla G(x^{md}_t)+\eta_t(x_{t+1}-x_t),x_{t+1}-x\rangle+J(x_{t+1})-J(x)+\left\langle\left(\frac{\rho_t}{\theta_t}-1\right)(y_t-y_{t+1})-y_{t+1},\,-K(x_{t+1}-x)\right\rangle\\
&\qquad+\rho_t\langle K(\tilde x_t-x_{t+1}),K(x_{t+1}-x)\rangle+\rho_t\langle B(w_t-w_{t+1}),\,-K(x_{t+1}-x)\rangle\le0,\quad\text{and}\\
&F(w_{t+1})-F(w)+\left\langle\left(\frac{\lambda_t}{\theta_t}-1\right)(y_t-y_{t+1})-y_{t+1},\,B(w_{t+1}-w)\right\rangle\le0.
\end{align*}


Therefore,
\begin{align*}
&\langle\nabla G(x^{md}_t),x_{t+1}-x\rangle+J(x_{t+1})-J(x)+F(w_{t+1})-F(w)-\langle y,Bw_{t+1}-Kx_{t+1}-b\rangle+\langle y_{t+1},Bw-Kx-b\rangle\\
&\le\langle\eta_t(x_t-x_{t+1}),x_{t+1}-x\rangle+\langle y_{t+1}-y,Bw_{t+1}-Kx_{t+1}-b\rangle\\
&\quad-\left\langle\left(\frac{\rho_t}{\theta_t}-1\right)(y_t-y_{t+1}),\,-K(x_{t+1}-x)\right\rangle-\left\langle\left(\frac{\lambda_t}{\theta_t}-1\right)(y_t-y_{t+1}),\,B(w_{t+1}-w)\right\rangle\\
&\quad+\rho_t\langle K(x_{t+1}-\tilde x_t),K(x_{t+1}-x)\rangle+\rho_t\langle B(w_{t+1}-w_t),\,-K(x_{t+1}-x)\rangle.
\end{align*}
Three observations on the right-hand side of (3) are in order. Firstly, by (3) we have
\begin{align*}
&\langle\eta_t(x_t-x_{t+1}),x_{t+1}-x\rangle+\langle y_{t+1}-y,Bw_{t+1}-Kx_{t+1}-b\rangle\\
&=\eta_t\langle x_t-x_{t+1},x_{t+1}-x\rangle+\frac{1}{\theta_t}\langle y_{t+1}-y,y_t-y_{t+1}\rangle\\
&=\frac{\eta_t}{2}\left(\|x_t-x\|^2-\|x_{t+1}-x\|^2\right)-\frac{\eta_t}{2}\|x_t-x_{t+1}\|^2+\frac{1}{2\theta_t}\left(\|y_t-y\|^2-\|y_{t+1}-y\|^2-\|y_t-y_{t+1}\|^2\right),
\end{align*}
and secondly, by (3) we can see that
\begin{align*}
&\left\langle\left(\frac{\rho_t}{\theta_t}-1\right)(y_t-y_{t+1}),K(x_{t+1}-x)\right\rangle-\left\langle\left(\frac{\lambda_t}{\theta_t}-1\right)(y_t-y_{t+1}),\,\frac{1}{\theta_t}(y_t-y_{t+1})+(Kx_{t+1}-Kx)\right\rangle\\
&=\frac{\lambda_t-\rho_t}{\theta_t}\langle y_t-y_{t+1},\,-K(x_{t+1}-x)\rangle-\frac{\lambda_t-\theta_t}{\theta_t^2}\|y_t-y_{t+1}\|^2\\
&=\frac{\lambda_t-\rho_t}{2}\left(\frac{1}{\theta_t^2}\|y_t-y_{t+1}\|^2+\|K(x_{t+1}-x)\|^2-\left\|\frac{1}{\theta_t}(y_t-y_{t+1})+K(x_{t+1}-x)\right\|^2\right)-\frac{\lambda_t-\theta_t}{\theta_t^2}\|y_t-y_{t+1}\|^2\\
&=\frac{\lambda_t-\rho_t}{2}\left(\frac{1}{\theta_t^2}\|y_t-y_{t+1}\|^2+\|K(x_{t+1}-x)\|^2-\|Bw_{t+1}-Kx-b\|^2\right)-\frac{\lambda_t-\theta_t}{\theta_t^2}\|y_t-y_{t+1}\|^2,
\end{align*}


where the last equality is from
\[
B(w_{t+1}-w)=\frac{1}{\theta_t}(y_t-y_{t+1})+(Kx_{t+1}-Kx)-(Bw-Kx-b).
\]
Thirdly, from (3) we have
\begin{align*}
&\rho_t\langle K(x_{t+1}-\tilde x_t),K(x_{t+1}-x)\rangle+\rho_t\langle B(w_{t+1}-w_t),\,-K(x_{t+1}-x)\rangle\\
&=-\frac{\chi\rho_t}{2}\left(\|K(x_t-x)\|^2-\|K(x_{t+1}-x)\|^2-\|K(x_t-x_{t+1})\|^2\right)\\
&\quad+\frac{\rho_t}{2}\left(\|Bw_t-Kx-b\|^2-\|Bw_{t+1}-Kx-b\|^2+\|Bw_{t+1}-Kx_{t+1}-b\|^2-\|Bw_t-Kx_{t+1}-b\|^2\right)\\
&\le-\frac{\chi\rho_t}{2}\left(\|K(x_t-x)\|^2-\|K(x_{t+1}-x)\|^2\right)+\frac{\chi\rho_t\|K\|^2}{2}\|x_t-x_{t+1}\|^2\\
&\quad+\frac{\rho_t}{2}\left(\|Bw_t-Kx-b\|^2-\|Bw_{t+1}-Kx-b\|^2\right)+\frac{\rho_t}{2\theta_t^2}\|y_t-y_{t+1}\|^2-\frac{\rho_t}{2}\|Bw_t-Kx_{t+1}-b\|^2.
\end{align*}
(3)

Applying (3)-(3) to (3), we get
$$
\begin{aligned}
&\frac{1}{\Gamma_t}Q(z;z^{ag}_{t+1})-\frac{1-\alpha_t}{\Gamma_t}Q(z;z^{ag}_t)\\
&\quad\le\frac{\alpha_t\eta_t}{2\Gamma_t}\big(\|x_t-x\|^2-\|x_{t+1}-x\|^2\big)+\frac{1}{2\tau_t}\big(\|y_t-y\|^2-\|y_{t+1}-y\|^2\big)-\frac{\tau_t-\rho_t}{2\tau_t^2}\|y_t-y_{t+1}\|^2\\
&\qquad+\frac{\rho_t}{2}\|Bw_t-Kx-b\|^2-\frac{\rho_t}{2}\|Bw_{t+1}-Kx-b\|^2-\frac{\rho_t}{2}\big(\|K(x_t-x)\|^2-\|K(x_{t+1}-x)\|^2\big)\\
&\qquad+\Big\langle\frac{\alpha_t}{\tau_{t-1}}(y_t-y_{t+1}),\,Bw-Kx-b\Big\rangle+\frac{\tau_t-\rho_t}{2}\|K(x_{t+1}-x)\|^2-\frac{\rho_t}{2}\|Bw_t-Kx_{t+1}-b\|^2\\
&\qquad-\frac{1}{2}\big(\eta_t-L_G\alpha_t-\tau_t\rho_t\|K\|^2\big)\|x_t-x_{t+1}\|^2.\qquad(3)
\end{aligned}
$$
Letting $w=w^\star$ and $x=x^\star$ in the above, observing from (3) that $\Gamma_{t-1}(1-\alpha_t)=\Gamma_t$, in view of (3) and applying the above inequality inductively, we get (3).

It is worth noting that the inequality (3) is nearly equivalent to the optimality conditions of (3) and (3), since the only inequality relationship throughout the
derivation is in (3), in which we use $\|Kx_t-Kx_{t+1}\|^2\le\|K\|^2\|x_t-x_{t+1}\|^2$. If $\rho=0$, then (3) is equivalent to the optimality conditions of (3) and (3).

There are two major consequences of Lemma 2. Firstly, if $\alpha_t\equiv1$ for all $t$, then the left-hand side of (3) becomes $\sum_{i=2}^{t+1}Q(z;z^{ag}_i)$. Secondly, if $\alpha_t\in[0,1)$ for all $t$, then in view of (3), the left-hand side of (3) is $Q(z;z^{ag}_{t+1})/\Gamma_t$. The first case will be used for the analysis of ADMM, P-ADMM, L-ADMM and LP-ADMM in Theorems 1 and 4, and the second case will be used for the analysis of AL-ADMM and ALP-ADMM in Theorems 6-8.

In the next lemma, we consider two possible bounds on the map $\mathcal B_t(\cdot,\cdot,\cdot)$ that appears in Lemma 2.

Lemma 3. Suppose that $V$ is any vector space and $\mathcal V\subseteq V$ is any convex set. For any $v\in V$, $v_{[t+1]}\subset V$ and $\eta_{[t]}\subset\mathbb R$, we have the following:

a) If the sequence $\{\alpha_i\eta_i/\Gamma_i\}$ is decreasing, then
$$\mathcal B_t(v,v_{[t+1]},\eta_{[t]})\le\frac{\eta_1}{2\Gamma_1}\|v_1-v\|^2-\frac{\alpha_t\eta_t}{2\Gamma_t}\|v_{t+1}-v\|^2.\qquad(3)$$

b) If the sequence $\{\alpha_i\eta_i/\Gamma_i\}$ is increasing, $\mathcal V$ is bounded and $v_{[t+1]}\subset\mathcal V$, then
$$\mathcal B_t(v,v_{[t+1]},\eta_{[t]})\le\frac{\alpha_t\eta_t}{2\Gamma_t}D_{\mathcal V}^2-\frac{\alpha_t\eta_t}{2\Gamma_t}\|v_{t+1}-v\|^2.\qquad(3)$$

Proof. By (3) we have
$$\mathcal B_t(v,v_{[t+1]},\eta_{[t]})=\frac{\eta_1}{2\Gamma_1}\|v_1-v\|^2-\sum_{i=1}^{t-1}\Big(\frac{\alpha_i\eta_i}{2\Gamma_i}-\frac{\alpha_{i+1}\eta_{i+1}}{2\Gamma_{i+1}}\Big)\|v_{i+1}-v\|^2-\frac{\alpha_t\eta_t}{2\Gamma_t}\|v_{t+1}-v\|^2.$$
If the sequence $\{\alpha_i\eta_i/\Gamma_i\}$ is decreasing, then the above equation implies (3). If the sequence $\{\alpha_i\eta_i/\Gamma_i\}$ is increasing, $\mathcal V$ is bounded and $v_{[t+1]}\subset\mathcal V$, then from the above equation we have
$$
\begin{aligned}
\mathcal B_t(v,v_{[t+1]},\eta_{[t]})&\le\frac{\eta_1}{2\Gamma_1}D_{\mathcal V}^2-\sum_{i=1}^{t-1}\Big(\frac{\alpha_i\eta_i}{2\Gamma_i}-\frac{\alpha_{i+1}\eta_{i+1}}{2\Gamma_{i+1}}\Big)D_{\mathcal V}^2-\frac{\alpha_t\eta_t}{2\Gamma_t}\|v_{t+1}-v\|^2\\
&=\frac{\alpha_t\eta_t}{2\Gamma_t}D_{\mathcal V}^2-\frac{\alpha_t\eta_t}{2\Gamma_t}\|v_{t+1}-v\|^2,
\end{aligned}
$$
hence (3) holds.

We are now ready to analyze the convergence of Algorithm 3.5. In Sections 3.3.1 and 3.3.2 below, we derive the convergence results of Algorithm 3.5 for solving problems of type I and type II, respectively.

3.3.1 Convergence Analysis for Problems of Type I

In this subsection, we prove Theorems 1, 6 and 7, which are the convergence results of Algorithm 3.5 for solving problems of type I. Since $B$ is one-to-one for this type of problems, we use the notation $Y$ as defined in (3) throughout this subsection.

Proof of Theorem 1. Since $\alpha_t\equiv1$, by (3), (3) and (3) we have $x^{ag}_t=x_t$, $w^{ag}_t=w_t$ and $y^{ag}_t=y_t$, and by (3) we have $\Gamma_t=1$. Applying the parameter settings to (3) in Lemma 3, we have
$$
\begin{aligned}
\mathcal B_t(x,x_{[t+1]},\eta_{[t]})&\le\frac{\eta}{2}\big(\|x_1-x\|^2-\|x_{t+1}-x\|^2\big)=\frac{\eta}{2}\big(D_x^2-\|x_{t+1}-x\|^2\big),\\
\mathcal B_t(Bw,Bw_{[t+1]},\rho_{[t]})&\le\frac{\rho}{2}\big(\|Bw_1-Bw\|^2-\|Bw_{t+1}-Bw\|^2\big)\le\frac{\rho D_{w,B}^2}{2},\\
-\mathcal B_t(Kx,Kx_{[t+1]},\rho_{[t]})&=\mathcal B_t(Kx,Kx_{[t+1]},-\rho_{[t]})\le-\frac{\rho}{2}\big(\|Kx_1-Kx\|^2-\|Kx_{t+1}-Kx\|^2\big)\le-\frac{\rho}{2}\big(D_{x,K}^2-\|K\|^2\|x_{t+1}-x\|^2\big).
\end{aligned}
$$
In addition, by Lemma 1 we get $y_{[t+1]}\subset Y$, thus by (3) in Lemma 3 we also have
$$\mathcal B_t(y,y_{[t+1]},\tau^{-1}_{[t]})\le\frac{1}{2\tau}\big(D_Y^2-\|y_{t+1}-y\|^2\big)\le\frac{D_Y^2}{2\tau},\quad\forall y\in Y.$$
Noticing that by the parameter settings, (3) in Lemma 2 is
$$\sum_{i=2}^{t+1}Q(w,x,y;z_i)\le\mathcal B_t(x,x_{[t+1]},\eta_{[t]})+\mathcal B_t(y,y_{[t+1]},\tau^{-1}_{[t]})+\mathcal B_t(Bw,Bw_{[t+1]},\rho_{[t]})-\mathcal B_t(Kx,Kx_{[t+1]},\rho_{[t]}),$$
and applying all the calculation above, we get
$$\sum_{i=2}^{t+1}Q(w,x,y;z_i)\le\frac{\eta}{2}D_x^2+\frac{1}{\tau}D_Y^2+\rho D_{w,B}^2-\rho D_{x,K}^2-\frac12\big(\eta-\rho\|K\|^2\big)\|x_{t+1}-x\|^2\le\frac{\eta}{2}D_x^2+\frac{1}{\tau}D_Y^2+\rho D_{w,B}^2-\rho D_{x,K}^2,\quad\forall y\in Y.$$
Now for all $y\in Y$, by the convexity of $Q(x,w,y,\cdot)$,
$$Q(w,x,y;\bar z_{t+1})\le\frac{1}{t}\sum_{i=2}^{t+1}Q(w,x,y;z_i),\quad\text{where }\bar z_{t+1}:=\frac{1}{t}\sum_{i=2}^{t+1}z_i,$$
we conclude from (3), Proposition 2 and the two inequalities above that (3) holds. Inequality (3) follows immediately by direct substitution and the fact that $D_{x,K}=D_{w,B}$ when $Bw_1=Kx_1+b$.

Proof of Theorem 6. From the parameter settings, we can see that the sequences $\{\alpha_t\eta_t/\Gamma_t\}$ and $\{\alpha_t\rho_t/\Gamma_t\}$ are constant sequences, so by (3) in Lemma 3 we have
$$
\begin{aligned}
\mathcal B_t(x,x_{[t+1]},\eta_{[t]})&\le\frac{\eta_1}{2}\|x_1-x\|^2-\frac{\alpha_t\eta_t}{2\Gamma_t}\|x_{t+1}-x\|^2=\frac{\eta_1}{2}D_x^2-\frac{\alpha_t\eta_t}{2\Gamma_t}\|x_{t+1}-x\|^2,\\
\mathcal B_t(Bw,Bw_{[t+1]},\rho_{[t]})&\le\frac{\rho_1}{2}\|Bw_1-Bw\|^2-\frac{\alpha_t\rho_t}{2\Gamma_t}\|Bw_{t+1}-Bw\|^2\le\frac{\rho_1D_{w,B}^2}{2},\\
-\mathcal B_t(Kx,Kx_{[t+1]},\rho_{[t]})&=\mathcal B_t(Kx,Kx_{[t+1]},-\rho_{[t]})\le-\frac{\rho_1}{2}\|Kx_1-Kx\|^2+\frac{\alpha_t\rho_t}{2\Gamma_t}\|Kx_{t+1}-Kx\|^2\le-\frac{\rho_1}{2}D_{x,K}^2+\frac{\alpha_t\rho_t\|K\|^2}{2\Gamma_t}\|x_{t+1}-x\|^2.
\end{aligned}
$$
On the other hand, from Proposition 3 we know that $\{\alpha_t/(\Gamma_t\tau_t)\}$ is an increasing sequence, and by Lemma 1 we get $y_{[t+1]}\subset Y$, hence by (3) in Lemma 3 we have
$$\mathcal B_t(y,y_{[t+1]},\tau^{-1}_{[t]})\le\frac{\alpha_t}{2\Gamma_t\tau_t}\big(D_Y^2-\|y_{t+1}-y\|^2\big)\le\frac{\alpha_tD_Y^2}{2\Gamma_t\tau_t},\quad\forall y\in Y.$$
In addition, by the parameter setting and (3) we also have
$$\eta_t-L_G\alpha_t-\tau_t\rho_t\|K\|^2\ge0.$$
Applying all the inequalities above to (3) in Lemma 2, we get
$$
\begin{aligned}
\frac{1}{\Gamma_t}Q(w,x,y;z^{ag}_{t+1})&\le\mathcal B_t(x,x_{[t+1]},\eta_{[t]})+\mathcal B_t(y,y_{[t+1]},\tau^{-1}_{[t]})+\mathcal B_t(Bw,Bw_{[t+1]},\rho_{[t]})-\mathcal B_t(Kx,Kx_{[t+1]},\rho_{[t]})\\
&\le\frac{\eta_1}{2}D_x^2+\frac{\alpha_t}{2\Gamma_t\tau_t}D_Y^2+\frac{\rho_1}{2}D_{w,B}^2-\frac{\rho_1}{2}D_{x,K}^2-\frac{\alpha_t}{2\Gamma_t}\big(\eta_t-\tau_t\rho_t\|K\|^2\big)\|x_{t+1}-x\|^2\\
&\le\frac{1}{2}\Big(L_GD_x^2+\frac{\|K\|^2\rho_{N-1}D_x^2}{\Gamma_{N-1}}\Big)+\frac{\alpha_{N-1}D_Y^2}{2\Gamma_{N-1}\tau_{N-1}}+\frac{\rho_{N-1}D_{w,B}^2}{2\Gamma_{N-1}}-\frac{\rho_{N-1}D_{x,K}^2}{2\Gamma_{N-1}},\quad\forall y\in Y.\qquad(3)
\end{aligned}
$$
Thus, at $t=N-1$, using (3) and Proposition 2 we have
$$H(x^{ag}_N)+F(\tilde w^{ag}_N)-f^\star\le\frac{\Gamma_{N-1}}{2}\Big(L_GD_x^2+\frac{\|K\|^2\rho_{N-1}D_x^2}{\Gamma_{N-1}}+\frac{\alpha_{N-1}D_Y^2}{\Gamma_{N-1}\tau_{N-1}}+\frac{\rho_{N-1}D_{w,B}^2}{\Gamma_{N-1}}-\frac{\rho_{N-1}D_{x,K}^2}{\Gamma_{N-1}}\Big)\le\frac{2L_GD_x^2}{N(N-1)}+\frac{\rho\|K\|^2D_x^2}{N}+\frac{D_Y^2}{\rho N}+\frac{\rho D_{w,B}^2}{N}-\frac{\rho D_{x,K}^2}{N}.$$
Inequality (3) follows immediately by substituting (3) into the above and the fact that $D_{x,K}=D_{w,B}$ when $Bw_1=Kx_1+b$.
It should be noted from the last inequality of (3) that the rate of convergence of the accelerated linearized ADMM algorithms actually depends on the rate at which $\tau_t$ and $\Gamma_t$ tend to 0. From this point of view, the weighting parameter $\alpha_t$ in (3) should be the best choice, since it decreases the fastest among all choices of weighting parameters that satisfy (3).
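To make the dependence on $\Gamma_t$ concrete, here is one standard choice of weighting parameters; the specific formula $\alpha_t=2/(t+1)$ is an illustrative assumption, not quoted from (3):

```latex
% With the illustrative choice \alpha_1 = 1 and \alpha_t = 2/(t+1) for t \ge 2,
% the relation \Gamma_t = (1-\alpha_t)\Gamma_{t-1}, \Gamma_1 = 1, telescopes to
\Gamma_t \;=\; \prod_{i=2}^{t} (1-\alpha_i)
        \;=\; \prod_{i=2}^{t} \frac{i-1}{i+1}
        \;=\; \frac{2}{t(t+1)} \;=\; O(1/t^2).
```

A weighting sequence of this type is what produces an $O(1/N^2)$ leading term of the form $2L_GD_x^2/(N(N-1))$ in bounds of this kind.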
Proof of Theorem 7. Based on the parameter setting and (3), we make a few observations about the right-hand side of (3) in Lemma 2. Firstly, we have $\rho_t=\delta\tau_t$,
$$\frac{\alpha_t\rho_t}{\Gamma_t}=\frac{\delta\alpha_t\tau_t}{\Gamma_t}=\frac{\delta\alpha_{t+1}\tau_{t+1}}{\Gamma_{t+1}}=\frac{\alpha_{t+1}\rho_{t+1}}{\Gamma_{t+1}},$$
and
$$\eta_t-L_G\alpha_t-\tau_t\rho_t\|K\|^2\ge0.$$
Secondly, by the above result, (3), and the fact that $D_{w,B}=D_{x,K}$, we have
$$
\begin{aligned}
&\mathcal B_t(Bw,Bw_{[t+1]},\rho_{[t]})-\sum_{i=1}^t\frac{\rho_i(\rho_i-\delta\tau_i)}{2\Gamma_i}\|Bw-Bw_{t+1}\|^2\\
&\quad=\frac{\rho_1}{2}\|Bw_1-Bw\|^2-\sum_{i=1}^{t-1}\Big(\frac{\alpha_i\rho_i}{2\Gamma_i}-\frac{\alpha_{i+1}\rho_{i+1}}{2\Gamma_{i+1}}\Big)\|Bw_{i+1}-Bw\|^2-\frac{\alpha_t\rho_t}{2\Gamma_t}\|Bw_{t+1}-Bw\|^2\\
&\quad\le\frac{\rho_1}{2}\|K(x_1-x)\|^2\le\frac{\rho_1}{2}D_{X,K}^2.
\end{aligned}
$$
Thirdly, by Lemma 1, $y_{[t+1]}\subset Y$, and by (3), $\{\alpha_t/(\Gamma_t\tau_t)\}_{t\ge1}$ is a non-decreasing sequence, hence from (3) we have
$$\mathcal B_t(y,y_{[t+1]},\tau^{-1}_{[t]})\le\frac{\alpha_t}{2\Gamma_t\tau_t}D_Y^2,\quad\forall y\in Y.$$
Finally, noting that $\{\alpha_t\eta_t/\Gamma_t\}_{t\ge1}=\{(L_G+\|K\|^2\rho_{t-1})/\Gamma_{t-1}\}_{t\ge1}$ is also non-decreasing, by (3) we have
$$\mathcal B_t(x,x_{[t+1]},\eta_{[t]})\le\frac{\alpha_t\eta_t}{2\Gamma_t}D_X^2-\frac{\alpha_t\eta_t}{2\Gamma_t}\|x_{t+1}-x\|^2.$$
Applying all the above observations to (3) in Lemma 2 with $x=x^\star$ and $w=w^\star$, for all $y\in Y$ we have
$$
\begin{aligned}
\frac{1}{\Gamma_t}Q(w,x,y;z^{ag}_{t+1})&\le\frac{\alpha_t\eta_t}{2\Gamma_t}D_X^2-\frac{\alpha_t\eta_t}{2\Gamma_t}\|x_{t+1}-x\|^2+\frac{\alpha_t}{2\Gamma_t\tau_t}D_Y^2+\frac{\rho_1}{2}\|K(x_1-x)\|^2\\
&\qquad-\mathcal B_t(Kx,Kx_{[t+1]},\rho_{[t]})+\sum_{i=1}^t\frac{\rho_i(\rho_i-\delta\tau_i)}{2\Gamma_i}\|K(x_{i+1}-x)\|^2\\
&=\frac{\alpha_t\eta_t}{2\Gamma_t}D_X^2-\frac{\alpha_t\eta_t}{2\Gamma_t}\|x_{t+1}-x\|^2+\frac{\alpha_t}{2\Gamma_t\tau_t}D_Y^2+(1-\delta)\frac{\rho_1}{2}\|K(x_1-x)\|^2\\
&\qquad+\sum_{i=1}^{t-1}\Big(\frac{\alpha_i[\rho_i-(1-\delta)\tau_i]}{2\Gamma_i}-\frac{\alpha_{i+1}\rho_{i+1}}{2\Gamma_{i+1}}\Big)\|K(x_{i+1}-x)\|^2+\frac{\alpha_t[\rho_t-(1-\delta)\tau_t]}{2\Gamma_t}\|K(x_{t+1}-x)\|^2\\
&=\frac{\alpha_t\eta_t}{2\Gamma_t}D_X^2-\frac{\alpha_t\eta_t}{2\Gamma_t}\|x_{t+1}-x\|^2+\frac{\alpha_t\rho_t}{2\Gamma_t}\|K(x_{t+1}-x)\|^2+\frac{\alpha_t}{2\Gamma_t\tau_t}D_Y^2+(1-\delta)\frac{\rho_1}{2}\|K(x_1-x)\|^2\\
&\qquad-(1-\delta)\sum_{i=1}^{t}\Big(\frac{\alpha_i\rho_i}{2\Gamma_i}-\frac{\alpha_{i+1}\rho_{i+1}}{2\Gamma_{i+1}}\Big)\|K(x_{i+1}-x)\|^2+\frac{\alpha_t(\rho_t-\delta\tau_t)}{2\Gamma_t}\|K(x_{t+1}-x)\|^2.
\end{aligned}
$$
Noting that $\{\alpha_t\rho_t/\Gamma_t\}_{t\ge1}=\{\rho_{t-1}/\Gamma_{t-1}\}_{t\ge1}$ is non-decreasing and using the inequality above and Proposition 2, we conclude that
$$
\begin{aligned}
\frac{1}{\Gamma_t}g_Y(z^{ag}_{t+1})&\le\frac{1}{\Gamma_t}\sup_{\tilde y\in Y}Q(w,x,\tilde y;z^{ag}_{t+1})\\
&\le\frac{\alpha_t\eta_t}{2\Gamma_t}D_X^2-\Big(\frac{\alpha_t\eta_t}{2\Gamma_t}-\frac{\|K\|^2\rho_{t-1}}{2\Gamma_{t-1}}\Big)\|x_{t+1}-x\|^2+\frac{\alpha_t}{2\Gamma_t\tau_t}D_Y^2+(1-\delta)\frac{\rho_1}{2}D_{X,K}^2\\
&\qquad-(1-\delta)\sum_{i=1}^{t}\Big(\frac{\alpha_i\rho_i}{2\Gamma_i}-\frac{\alpha_{i+1}\rho_{i+1}}{2\Gamma_{i+1}}\Big)D_{X,K}^2+\frac{\alpha_t(\rho_t-\delta\tau_t)}{2\Gamma_t}\|K(x_{t+1}-x)\|^2\\
&\le\frac{\alpha_t\eta_t}{2\Gamma_t}D_X^2+\frac{\alpha_t}{2\Gamma_t\tau_t}D_Y^2+(1-\delta)\frac{\alpha_t\rho_t}{2\Gamma_t}D_{X,K}^2+\frac{\alpha_t(\rho_t-\delta\tau_t)}{2\Gamma_t}D_{X,K}^2\\
&\le\frac{L_G}{2}D_X^2+\frac{\rho_{t-1}}{2\Gamma_{t-1}}\big(\|K\|^2D_X^2-D_{X,K}^2\big)+\frac{\alpha_t}{2\Gamma_t\tau_t}D_Y^2+\frac{\alpha_t\rho_t}{2\Gamma_t}D_{X,K}^2\\
&\le\frac{L_G}{2}D_X^2+\frac{\rho_t\|K\|^2D_X^2}{2\Gamma_t}+(1-\delta)\frac{\rho_t}{2\Gamma_t}D_{X,K}^2+\frac{\alpha_t}{2\Gamma_t\tau_t}D_Y^2.
\end{aligned}
$$
Finally, by the inequality above, (3) and Proposition 2, we conclude that
$$H(x^{ag}_{t+1})+F(\tilde w^{ag}_{t+1})-f^\star\le g_Y(z^{ag}_{t+1})\le\frac{2L_GD_X^2}{t(t+1)}+\frac{\rho\|K\|^2D_X^2}{t+1}+\frac{(1-\delta)\rho D_{X,K}^2}{t+1}+\frac{D_Y^2}{\rho(t+1)},$$
and (3) follows immediately.

3.3.2 Convergence Analysis for Problems of Type II

In this subsection, we prove the main convergence results of Algorithm 3.5 for solving problems of type II, namely, Theorems 4 and 8.

Proof of Theorem 4. Similar to the proof of Theorem 1, but replacing the estimation of $\mathcal B_t(y,y_{[t+1]},\tau^{-1}_{[t]})$ by
$$\mathcal B_t(y,y_{[t+1]},\tau^{-1}_{[t]})=\frac{1}{2\tau}\big(\|y_1-y\|^2-\|y_{t+1}-y\|^2\big),$$
for all $y\in Y$ we get
$$
\begin{aligned}
Q(w,x,y;\bar z_{t+1})&\le\frac{1}{2t}\Big(\eta D_x^2+\frac{1}{\tau}\big(\|y_1-y\|^2-\|y_{t+1}-y\|^2\big)+\rho D_{w,B}^2-\rho D_{x,K}^2\Big)\qquad(3)\\
&\le\frac{1}{2t}\Big(\eta D_x^2+\frac{1}{\tau}\|y_1\|^2+\rho D_{w,B}^2-\rho D_{x,K}^2\Big)-\Big\langle\frac{1}{\tau t}(y_1-y_{t+1}),\,y\Big\rangle,\qquad(3)
\end{aligned}
$$
where $\bar z_{t+1}=\frac{1}{t}\sum_{i=2}^{t+1}z_i$. Noting that $Q(z^\star,\bar z_{t+1})\ge0$, by (3) we get
$$\|y_{t+1}-y\|^2\le\tau\eta D_x^2+D_y^2+2\tau\rho D_{w,B}^2-2\tau\rho D_{x,K}^2,$$
hence if we let $v_{t+1}=(y_1-y_{t+1})/(\tau t)$, then we have
$$\|v_{t+1}\|^2\le\frac{2}{\tau^2t^2}\big(\|y_1-y\|^2+\|y_{t+1}-y\|^2\big)\le\frac{2}{t^2}\Big(\frac{\eta D_x^2}{\tau}+\frac{2D_y^2}{\tau^2}+\frac{\rho D_{w,B}^2-\rho D_{x,K}^2}{\tau}\Big),$$
and by (3) we also have
$$g(v_{t+1},\bar z_{t+1})\le\frac{1}{2t}\Big(\eta D_x^2+\frac{1}{\tau}\|y_1\|^2+\rho D_{w,B}^2-\rho D_{x,K}^2\Big).$$
By the two inequalities above and Proposition 1, we get (3) and (3). The results (3) and (3) follow immediately.

Proof of Theorem 5. From the parameter setting, it is clear that the sequences $\{\alpha_i\eta_i/\Gamma_i\}_{i\ge1}$, $\{\alpha_i/(\Gamma_i\tau_i)\}_{i\ge1}$ and $\{\alpha_i\rho_i/\Gamma_i\}_{i\ge1}$ are all constant sequences, hence by (3) we have
$$
\begin{aligned}
\mathcal B_t(x,x_{[t+1]},\eta_{[t]})&=\frac{\eta_1}{2}\big(\|x_1-x\|^2-\|x_{t+1}-x\|^2\big)\le\frac{\eta_1}{2}\big(D_x^2-\|x_{t+1}-x\|^2\big),\\
\mathcal B_t(y,y_{[t+1]},\tau^{-1}_{[t]})&=\frac{1}{2\tau_1}\big(\|y_1-y\|^2-\|y_{t+1}-y\|^2\big),\quad\forall y\in Y,\\
\mathcal B_t(Bw,Bw_{[t+1]},\rho_{[t]})&=\frac{\rho_1}{2}\big(\|Bw_1-Bw\|^2-\|Bw_{t+1}-Bw\|^2\big)\le\frac{\rho_1}{2}D_{w,B}^2,\\
-\mathcal B_t(Kx,Kx_{[t+1]},\rho_{[t]})&=-\frac{\rho_1}{2}\big(\|Kx_1-Kx\|^2-\|Kx_{t+1}-Kx\|^2\big)\le-\frac{\rho_1}{2}\big(D_{x,K}^2-\|K\|^2\|x_{t+1}-x\|^2\big).
\end{aligned}
$$
In addition, by the parameter setting and (3) we also see that $\rho_t\le\tau_t$, and
$$\eta_t-L_G\alpha_t-\tau_t\rho_t\|K\|^2\ge0.$$
Applying all the calculations above to (3) in Lemma 2, and noticing from $Bw_1=Kx_1+b$ that $D_{w,B}=D_{x,K}$, we have
$$
\begin{aligned}
\frac{1}{\Gamma_t}Q(w,x,y;z^{ag}_{t+1})&\le\frac{\eta_1}{2}D_x^2-\frac{\rho_1}{2}D_{x,K}^2-\frac{1}{2}\big(\eta_1-\rho_1\|K\|^2\big)\|x_{t+1}-x\|^2+\frac{\rho_1}{2}D_{w,B}^2+\frac{1}{2\tau_1}\big(\|y_1-y\|^2-\|y_{t+1}-y\|^2\big)\\
&\le\frac{\eta_1}{2}D_x^2+(1-\delta)\frac{\rho_1}{2}D_{x,K}^2+\frac{1}{2\tau_1}\big(\|y_1-y\|^2-\|y_{t+1}-y\|^2\big),\quad\forall y\in Y.
\end{aligned}
$$
Two consequences of the above estimation can be derived. Firstly, since $Q(z^\star;z^{ag}_{t+1})\ge0$, we have
$$\|y_{t+1}-y\|^2\le\tau_1\eta_1D_x^2+(1-\delta)\tau_1\rho_1D_{x,K}^2+D_y^2,$$
and
$$\frac{\Gamma_t^2}{\tau_1^2}\|y_1-y_{t+1}\|^2\le\frac{2\Gamma_t^2}{\tau_1^2}\big(\|y_1-y\|^2+\|y_{t+1}-y\|^2\big)\le\frac{2\Gamma_t^2}{\tau_1}\big(\eta_1D_x^2+(1-\delta)\rho_1D_{x,K}^2\big)+\frac{4\Gamma_t^2}{\tau_1^2}D_y^2.$$
Secondly, since $\|y_1-y\|^2-\|y_{t+1}-y\|^2=\|y_1\|^2-\|y_{t+1}\|^2-2\langle y_1-y_{t+1},y\rangle\le-2\langle y_1-y_{t+1},y\rangle$,
$$\frac{1}{\Gamma_t}Q(w,x,y;z^{ag}_{t+1})+\frac{1}{\tau_1}\langle y_1-y_{t+1},y\rangle\le\frac{\eta_1}{2}D_x^2+(1-\delta)\frac{\rho_1}{2}D_{x,K}^2,\quad\forall y\in Y.$$
Letting $v_{t+1}:=\Gamma_t(y_1-y_{t+1})/\tau_1$, from the two inequalities above and (3) we have
$$\|v_N\|^2\le2\Gamma_{N-1}\Big(L_GD_x^2+\frac{\|K\|^2\rho_{N-1}D_x^2}{\Gamma_{N-1}}+(1-\delta)\frac{\rho_{N-1}D_{x,K}^2}{\Gamma_{N-1}}\Big)+\frac{4\Gamma_{N-1}^2D_y^2}{\tau_1^2}\le\frac{16L_GD_x^2}{N^2(N-1)}+\frac{8\rho\|K\|^2D_x^2}{N}+\frac{8(1-\delta)\rho D_{x,K}^2}{N}+\frac{16D_y^2}{\rho N},$$
$$g(v_N,z^{ag}_N)\le\frac{\Gamma_{N-1}}{2}\Big(L_GD_x^2+\frac{\|K\|^2\rho_{N-1}D_x^2}{\Gamma_{N-1}}+(1-\delta)\frac{\rho_{N-1}D_{x,K}^2}{\Gamma_{N-1}}\Big)\le\frac{2L_GD_x^2}{N(N-1)}+\frac{\rho\|K\|^2D_x^2}{N}+\frac{(1-\delta)\rho D_{x,K}^2}{N}.$$
By Proposition 1, we get (3) and (3).
3.4 Numerical Examples

In this section, we consider some numerical examples of the proposed methods. The problem of interest in this section is the total variation (TV) regularization image reconstruction problem (1), which we write as:
$$\min_{x\in\mathbb R^n}G(x)+\lambda\|Dx\|_{2,1},\qquad(3)$$
where $G(x)$ is convex, continuously differentiable and satisfies (3). We assume that the finite difference operator $D$ satisfies the periodic boundary condition, so the optimization problem in (3) can be solved easily by utilizing the Fourier transform (see
[95]). The function $\|\cdot\|_{2,1}$ is simple, and the optimization problem in (3) can be solved by soft thresholding.

We consider three instances of $G(\cdot)$:

Bernoulli. In the Bernoulli instance, $G(x)$ is of the form
$$G(x)=\frac{1}{2}\|Ax-f\|^2,\qquad(3)$$
where the measurement matrix $A=\{A_{i,j}\}\in\mathbb R^{m\times n}$ is generated by an i.i.d. Bernoulli distribution. In particular, for any element $A_{i,j}$,
$$A_{i,j}=\begin{cases}1/\sqrt m&\text{with probability }1/2,\\-1/\sqrt m&\text{with probability }1/2.\end{cases}$$
In the Bernoulli instance, we generate the measurement $f$ by
$$f=Ax_{\mathrm{true}}+\varepsilon,\qquad(3)$$
where the ground truth image $x_{\mathrm{true}}\in\mathbb R^n$ is a $64\times64$ Shepp-Logan phantom (see Figure 3-1), and the sample rate of measurement is 50%, hence $n=4096$ and $m=2048$. The noise $\varepsilon\sim N(0,0.001I)$ is i.i.d. normally distributed. The Lipschitz constant in this instance is $L_G=\lambda_{\max}(A^TA)\approx5.8$. We set the regularization parameter in (3) to $\lambda=0.005$.

Gaussian. In this instance $G(x)$ is also defined by (3), but the measurement matrix $A=\{A_{i,j}\}\in\mathbb R^{m\times n}$ is generated by an i.i.d. Gaussian distribution: $A_{i,j}\sim N(0,\tfrac1m)$. Similar to the Bernoulli instance, we also set $n=4096$, $m=2048$ and generate the measurements $f$ by (3) using the same ground truth image and the same standard deviation of the noise. The Lipschitz constant in this case is $L_G=\lambda_{\max}(A^TA)\approx8$, which is larger than in the Bernoulli instance. The regularization parameter in (3) is also set to $\lambda=0.005$.

PPI. The PPI is the instance that was introduced in Section 2.5, which is an experiment of partially parallel imaging (PPI) reconstruction for magnetic resonance imaging (MRI). In this instance, $G(x)$ represents the data fidelity of PPI reconstruction. In particular,
$$G(x)=\frac{1}{2}\sum_{j=1}^{n_{ch}}\|MFS_jx-f_j\|^2,\qquad(3)$$
where $n_{ch}$ is the number of sensors, $F\in\mathbb C^{n\times n}$ is a 2D discrete Fourier transform matrix, $S_j\in\mathbb C^{n\times n}$ is a complex-valued diagonal matrix, and $M\in\mathbb R^{n\times n}$ is a binary diagonal matrix with $m$ non-zero elements. It should be noted that $x$, $\operatorname{diag}M$ and $\operatorname{diag}S_j$ are two-dimensional images. In particular, $\operatorname{diag}S_j$ is the sensitivity encoding map that describes the sensitivity of the $j$-th sensor, and $\operatorname{diag}M$ is the mask that describes the partial selection of $k$-space data. Figures 2-1 and 2-2 show the images of $\operatorname{diag}M$ and $\operatorname{diag}S_j$. The Lipschitz constant in this instance is set to
$$L_G=\sum_{j=1}^{n_{ch}}\|M\|\,\|F\|\,\|S_j\|=\sum_{j=1}^{n_{ch}}\|\operatorname{diag}S_j\|_\infty.$$
Using the estimation above, we set the Lipschitz constant of the PPI instance to $L_G\approx3.6$. Our ground truth image is a $256\times256$ human brain image, and there are 15910 non-zero elements in the mask $M$, hence the sampling rate in this instance is around 25%. The measurements $f_j$ are generated by
$$f_j=MFS_j\big(x_{\mathrm{true}}+\varepsilon^{re}_j+\varepsilon^{im}_j\sqrt{-1}\big),\quad j=1,\ldots,n_{ch},$$
where $\varepsilon^{re}_j,\varepsilon^{im}_j\sim N(0,0.0005I)$ are independently generated noises. We set the regularization parameter in (3) to $\lambda=10^{-5}$.

3.4.1 Comparison of Linearized ADMM Algorithms

In this subsection, we compare the performance of all the linearized ADMM algorithms discussed in this chapter. Although problem (3) is a problem of Type I, we can still use the parameter settings for Type II problems, since we are interested in the performance of different parameter settings. We label the accelerated linearized ADMM algorithms AL-ADMM-1 through AL-ADMM-4 and ALP-ADMM-1 through ALP-ADMM-4, depending on the parameter settings. The corresponding parameter settings of the AL-ADMM and ALP-ADMM algorithms are listed as follows:

AL-ADMM-1 and ALP-ADMM-1 use the parameters in (3).
AL-ADMM-2 and ALP-ADMM-2 use the parameters in (3).
AL-ADMM-3 and ALP-ADMM-3 use the parameters in (3).
AL-ADMM-4 and ALP-ADMM-4 use the parameters in (3), where $\alpha_t$ and $\Gamma_t$ satisfy (3).
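A minimal sketch of the PPI data-fidelity term $G(x)=\frac12\sum_j\|MFS_jx-f_j\|^2$ and its gradient, assuming the sensitivity maps, mask and measurements are stored as NumPy arrays; the array layout and the orthonormal FFT scaling are illustrative assumptions, not the exact experimental setup:

```python
import numpy as np

def ppi_grad(x, sens, mask, f):
    """Value and gradient of G(x) = 0.5 * sum_j ||M F S_j x - f_j||^2.

    x:    (N, N) complex image; sens: (n_ch, N, N) sensitivity maps (diag S_j);
    mask: (N, N) boolean k-space mask (diag M); f: (n_ch, N, N) measurements.
    """
    val, grad = 0.0, np.zeros_like(x)
    for sj, fj in zip(sens, f):
        # residual r = M F S_j x - f_j
        r = mask * np.fft.fft2(sj * x, norm="ortho") - fj
        val += 0.5 * np.linalg.norm(r) ** 2
        # adjoint: S_j^H F^H M^T r
        grad += np.conj(sj) * np.fft.ifft2(mask * r, norm="ortho")
    return val, grad
```

With the orthonormal FFT normalization, the adjoint of the forward map is exactly the conjugate-sensitivity-weighted inverse FFT used above, which is why no extra scaling factors appear.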
It should be noted that problem (3) is a problem with an unbounded feasible set, but the parameter settings AL-ADMM-2, ALP-ADMM-2, AL-ADMM-4 and ALP-ADMM-4 are for problems with bounded feasible sets. The reason we choose these parameters is that they are independent of the total number of iterations $N$. In fact, in our experiments those parameter settings still perform well. In our iterations, we set the maximum iteration $N$ to 200. The following conclusions regarding the comparison of the linearized ADMM algorithms can be drawn from the numerical results.

Dependence on the constant $\rho$. In Tables 3-4 through 3-9, we compare the performance of the linearized ADMM algorithms after 200 iterations, under different choices of $\rho$ ranging from $2^6$ to $2^{12}$. From the numerical results, we can see that the linearized ADMM algorithms are not very sensitive to the choice of the constant $\rho$. In fact, when $\rho$ ranges from $2^7$ to $2^9$, the performance of the linearized ADMM algorithms does not change drastically. This coincides with our remark after Theorem 1 that the $O(1/t)$ rate of convergence is guaranteed for any choice of $\rho$. However, fine-tuning of $\rho$ is still useful for improving the solution quality of ADMM.

Accelerated vs. unaccelerated algorithms. From Tables 3-4 through 3-10, we can safely draw the conclusion that the accelerated algorithms (AL-ADMM and ALP-ADMM) outperform the unaccelerated algorithms (L-ADMM and LP-ADMM). In particular, the advantage of the accelerated algorithms can be seen clearly from the columns of normalized RMSE in Table 3-10. In Table 3-10 we report the performance of the linearized ADMM algorithms on reconstructing the image in the three instances, in terms of both objective value and normalized RMSE, at iterations $N/3$, $2N/3$ and $N$. In terms of normalized RMSE, we can see that L-ADMM and LP-ADMM perform the worst on the Gaussian instance, and the best on the PPI instance. This coincides with the settings of the Lipschitz constants $L_G$ of the three instances, in which Gaussian has the highest $L_G$ and PPI has the lowest $L_G$. However, the effect of $L_G$ on AL-ADMM and ALP-ADMM is not significant. In fact, the normalized RMSEs of the accelerated
algorithms are better in both the Bernoulli and Gaussian instances. For the PPI instance, we also see from Figure 3-3 that AL-ADMM-4 outperforms L-ADMM in reconstructing the cerebellum part of the brain image with better resolution.

AL-ADMM vs. ALP-ADMM. Among the accelerated algorithms, we can see that AL-ADMM outperforms ALP-ADMM. This coincides with our convergence results, since the distance constant in the convergence result of AL-ADMM is $D_{x,K}$, which is smaller than the $\|K\|D_x$ of ALP-ADMM.

Parameter settings of accelerated linearized ADMM. We use four types of parameter settings for the accelerated linearized ADMM in our experiments. We can see the performance of these parameter settings for AL-ADMM from Table 3-10. The order of their performance, from best to worst, is: AL-ADMM-4, AL-ADMM-2, AL-ADMM-1, and AL-ADMM-3. Since the choice of weighting parameters $\alpha_t$ in (3) decreases the fastest among all the choices of weighting parameters that satisfy (3) (see the discussion after the proof of Theorem 6), it is reasonable to anticipate that AL-ADMM-4 performs slightly better than AL-ADMM-2. It is very interesting to see that the parameter settings intended for bounded $X$, namely AL-ADMM-4 and AL-ADMM-2, are better than the parameter settings intended for unbounded $X$, namely AL-ADMM-1 and AL-ADMM-3. We give a plausible explanation of this observation: without the boundedness of the feasible set, to ensure the global convergence of Algorithm 3.5, it is necessary to choose more conservative parameter settings. On the other hand, aggressive parameter settings may result in better performance in practice, as long as the algorithm converges. Among the four parameter settings, AL-ADMM-3 is intended for the case when both $X$ and $\operatorname{dom}F$ are unbounded, hence it is the most conservative. AL-ADMM-1 is intended for the case when $X$ is unbounded but $\operatorname{dom}F$ is bounded, hence it is better than AL-ADMM-3, but worse than AL-ADMM-4 and AL-ADMM-2. Similar behavior can also be observed in the performance of ALP-ADMM.
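The Lipschitz constants $L_G=\lambda_{\max}(A^TA)$ quoted for these instances can be estimated numerically, e.g., by a few power iterations on $A^TA$; this is a generic sketch, not the exact procedure used in the experiments:

```python
import numpy as np

def lipschitz_const(A, iters=100, seed=0):
    """Estimate L_G = lambda_max(A^T A) by power iteration on A^T A."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(A.shape[1])
    v /= np.linalg.norm(v)
    lam = 0.0
    for _ in range(iters):
        w = A.T @ (A @ v)      # one application of A^T A
        lam = np.linalg.norm(w)  # converges to the dominant eigenvalue
        v = w / lam
    return lam
```

The same routine applies to any linear operator for which matrix-vector products with $A$ and $A^T$ are available, which is how one would handle the PPI forward map without forming it explicitly.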
3.4.2 Comparison with Other Algorithms

In this subsection, we compare the performance of the proposed AL-ADMM and ALP-ADMM algorithms with two other algorithms that have a similar rate of convergence (3). In the previous literature, there are two other algorithms that are able to achieve the rate of convergence (3): Nesterov's method in [72] (NESTA)², and the accelerated primal-dual (APD) method in [19]. It should be noted that both NESTA and APD require a constant which needs to be fine-tuned. We use the APD iterations in (3)-(3), which share the constant $\rho$ with the ADMM algorithms. On the other hand, NESTA considers a smoothed version of problem (3):
$$\min_{x\in X}H(x)+\max_{y\in Y}\langle Kx,y\rangle-F(y)-\frac{\mu}{2}\|y\|^2,$$
where $\mu$ is a constant that depends on some distance constants. If the maximum iteration $N$ is given, the optimal choice of $\mu$ is given in Theorem 3 in [72]. In view of (3) and Theorem 3 in [72], we set $\mu=2/(\rho N)$ in our experiments.

We make a few observations about the proposed algorithms, NESTA and APD based on the numerical results obtained in Tables 3-4 - 3-10 and Figures 3-1 - 3-3. Firstly, from Tables 3-4 - 3-9, we see that the performance of AL-ADMM, ALP-ADMM, NESTA and APD is similar. However, it is interesting to observe that NESTA performs the best when $\mu$ is sufficiently fine-tuned. In fact, we can see from Table 3-7 that NESTA outperforms the other algorithms drastically when $\rho=2^8$, but when $\rho=2^7$ it is worse than AL-ADMM-1, and when $\rho=2^9$ NESTA performs badly. Secondly, from Table 3-10 and Figures 3-1 - 3-2 we can see that for the Bernoulli and Gaussian instances, AL-ADMM-4

² The name NESTA is from [8]. It should be noted that the authors in [8] propose a continuation scheme to fine-tune the smoothing parameter in NESTA, which is dependent on the distance constants $D_x$ and $D_{\operatorname{dom}F}$. The continuation scheme can also be applied to linearized ADMM algorithms for fine-tuning. However, since we are focusing on the theoretical development of accelerated linearized ADMMs, we only consider the case with no continuation.
and APD outperform NESTA at early iterations t when t <
Table 3-4. Comparison of objective values of linearized ADMM algorithms for solving (3) with instance Bernoulli and λ = 0.005

Algorithm    ρ=2^6  2^7   2^8   2^9   2^10  2^11  2^12
AL-ADMM-1    1.72   1.71  1.72  1.72  1.72  1.74  1.76
AL-ADMM-2    1.73   1.71  1.71  1.71  1.72  1.72  1.73
AL-ADMM-3    1.79   1.74  1.73  1.72  1.72  1.73  1.76
AL-ADMM-4    1.73   1.71  1.71  1.71  1.72  1.72  1.72
ALP-ADMM-1   1.72   1.72  1.74  1.76  1.81  1.91  2.11
ALP-ADMM-2   1.73   1.71  1.71  1.71  1.72  1.73  1.76
ALP-ADMM-3   1.79   1.75  1.73  1.75  1.80  1.90  2.11
ALP-ADMM-4   1.73   1.71  1.71  1.71  1.72  1.73  1.76
L-ADMM       2.39   2.25  2.19  2.16  2.15  2.14  2.15
LP-ADMM      2.40   2.26  2.19  2.17  2.18  2.21  2.29
NESTA        1.73   1.71  1.70  1.71  1.79  2.06  2.81
APD          1.73   1.71  1.71  1.72  1.72  1.73  1.76

All algorithms run 200 iterations, under choices of ρ = 2^6, ..., 2^12. The objective value of the ground truth is 1.71. The numbers in red are of the best performance among all the algorithms, under selected ρ, i.e., minimal within the corresponding column.

Table 3-5. Comparison of normalized RMSEs of linearized ADMM algorithms for solving (3) with instance Bernoulli and λ = 0.005

Algorithm    ρ=2^6   2^7     2^8     2^9     2^10    2^11    2^12
AL-ADMM-1    2.23%   2.14%   2.11%   2.17%   2.37%   2.87%   4.00%
AL-ADMM-2    2.62%   2.28%   2.13%   2.06%   2.05%   2.11%   2.26%
AL-ADMM-3    5.82%   3.90%   2.96%   2.54%   2.50%   2.88%   3.98%
AL-ADMM-4    2.61%   2.26%   2.11%   2.04%   2.03%   2.08%   2.23%
ALP-ADMM-1   2.36%   2.40%   2.66%   3.32%   4.76%   7.80%   14.11%
ALP-ADMM-2   2.64%   2.30%   2.16%   2.14%   2.22%   2.51%   3.38%
ALP-ADMM-3   5.81%   3.97%   3.19%   3.32%   4.62%   7.69%   14.06%
ALP-ADMM-4   2.62%   2.28%   2.15%   2.12%   2.20%   2.49%   3.36%
L-ADMM       23.29%  21.47%  20.52%  20.06%  19.88%  19.95%  20.29%
LP-ADMM      23.35%  21.60%  20.78%  20.59%  20.94%  22.01%  24.20%
NESTA        3.12%   2.33%   2.01%   2.11%   3.24%   15.61%  36.58%
APD          2.65%   2.31%   2.17%   2.15%   2.23%   2.53%   3.40%

All algorithms run 200 iterations, under choices of ρ = 2^6, ..., 2^12. The numbers in red are of the best performance among all the algorithms, under selected ρ, i.e., minimal within the corresponding column.
Table 3-6. Comparison of objective values of linearized ADMM algorithms for solving (3) with instance Gaussian and λ = 0.005

Algorithm    ρ=2^6  2^7   2^8   2^9   2^10  2^11  2^12
AL-ADMM-1    1.72   1.72  1.73  1.73  1.74  1.75  1.78
AL-ADMM-2    1.74   1.72  1.72  1.72  1.73  1.73  1.74
AL-ADMM-3    1.79   1.74  1.72  1.72  1.73  1.75  1.78
AL-ADMM-4    1.74   1.72  1.72  1.72  1.73  1.73  1.74
ALP-ADMM-1   1.73   1.73  1.75  1.78  1.84  1.97  2.23
ALP-ADMM-2   1.74   1.72  1.72  1.73  1.74  1.77  1.83
ALP-ADMM-3   1.79   1.74  1.74  1.77  1.83  1.96  2.23
ALP-ADMM-4   1.74   1.72  1.72  1.73  1.74  1.76  1.83
L-ADMM       2.74   2.68  2.65  2.64  2.64  2.63  2.63
LP-ADMM      2.74   2.68  2.66  2.65  2.65  2.67  2.70
NESTA        1.73   1.71  1.71  1.78  1.98  2.34  2.79
APD          1.74   1.72  1.72  1.73  1.74  1.77  1.83

All algorithms run 200 iterations, under choices of ρ = 2^6, ..., 2^12. The objective value of the ground truth is 1.71. The numbers in red are of the best performance among all the algorithms, under selected ρ, i.e., minimal within the corresponding column.

Table 3-7. Comparison of normalized RMSEs of linearized ADMM algorithms for solving (3) with instance Gaussian and λ = 0.005

Algorithm    ρ=2^6   2^7     2^8     2^9     2^10    2^11    2^12
AL-ADMM-1    5.85%   5.55%   5.60%   5.98%   6.86%   8.74%   12.77%
AL-ADMM-2    7.77%   6.32%   5.73%   5.55%   5.66%   6.11%   7.29%
AL-ADMM-3    13.15%  8.75%   6.77%   6.31%   6.89%   8.70%   12.73%
AL-ADMM-4    7.69%   6.23%   5.64%   5.46%   5.56%   6.00%   7.15%
ALP-ADMM-1   6.30%   6.41%   7.31%   9.39%   13.79%  22.77%  39.01%
ALP-ADMM-2   7.83%   6.52%   6.00%   6.11%   6.90%   9.20%   16.22%
ALP-ADMM-3   13.47%  9.25%   8.21%   9.64%   13.85%  22.79%  39.00%
ALP-ADMM-4   7.75%   6.42%   5.91%   6.01%   6.79%   9.06%   15.95%
L-ADMM       58.87%  58.25%  57.92%  57.76%  57.66%  57.61%  57.56%
LP-ADMM      58.90%  58.31%  58.05%  58.02%  58.19%  58.63%  59.53%
NESTA        7.35%   5.95%   2.78%   12.32%  29.82%  47.52%  61.47%
APD          7.85%   6.54%   6.02%   6.13%   6.91%   9.22%   16.24%

All algorithms run 200 iterations, under choices of ρ = 2^6, ..., 2^12. The numbers in red are of the best performance among all the algorithms, under selected ρ, i.e., minimal within the corresponding column.


Table 3-8. Comparison of objective values of linearized ADMM algorithms for solving (3) with instance Gaussian and parameter value $10^{-5}$

| Algorithm | $2^6$ | $2^7$ | $2^8$ | $2^9$ | $2^{10}$ | $2^{11}$ | $2^{12}$ |
|---|---|---|---|---|---|---|---|
| AL-ADMM-1 | 2796.26 | 2796.44 | 2796.89 | 2798.00 | 2800.97 | 2808.24 | 2821.37 |
| AL-ADMM-2 | 2796.14 | 2796.19 | 2796.29 | 2796.54 | 2797.15 | 2798.79 | 2802.87 |
| AL-ADMM-3 | 2796.12 | 2796.14 | 2796.18 | 2796.28 | 2796.51 | 2797.13 | 2798.82 |
| AL-ADMM-4 | 2793.47 | 2793.52 | 2793.63 | 2793.88 | 2794.53 | 2796.25 | 2800.53 |
| ALP-ADMM-1 | 2796.26 | 2796.44 | 2796.89 | 2798.00 | 2800.97 | 2808.24 | 2821.37 |
| ALP-ADMM-2 | 2796.14 | 2796.19 | 2796.29 | 2796.54 | 2797.15 | 2798.79 | 2802.87 |
| ALP-ADMM-3 | 2796.12 | 2796.14 | 2796.18 | 2796.28 | 2796.51 | 2797.13 | 2798.83 |
| ALP-ADMM-4 | 2793.47 | 2793.52 | 2793.63 | 2793.88 | 2794.53 | 2796.25 | 2800.53 |
| L-ADMM | 9884.10 | 9884.13 | 9884.21 | 9884.35 | 9884.65 | 9885.05 | 9883.02 |
| LP-ADMM | 9884.10 | 9884.14 | 9884.21 | 9884.35 | 9884.65 | 9885.06 | 9883.02 |
| NESTA | 2796.20 | 2796.32 | 2796.60 | 2797.31 | 2799.23 | 2804.23 | 2815.71 |
| APD | 2796.14 | 2796.19 | 2796.29 | 2796.54 | 2797.15 | 2798.79 | 2802.87 |

All algorithms run 200 iterations, under the choices $2^6,\ldots,2^{12}$ of the tested parameter. The objective value of the ground truth is 4175.23. The best performance among all the algorithms under each selected parameter value is the minimum within the corresponding column.

Table 3-9. Comparison of normalized RMSEs of linearized ADMM algorithms for solving (3) with instance Gaussian and parameter value $10^{-5}$

| Algorithm | $2^6$ | $2^7$ | $2^8$ | $2^9$ | $2^{10}$ | $2^{11}$ | $2^{12}$ |
|---|---|---|---|---|---|---|---|
| AL-ADMM-1 | 6.02% | 6.01% | 5.99% | 5.95% | 5.85% | 5.74% | 5.63% |
| AL-ADMM-2 | 6.03% | 6.02% | 6.01% | 5.99% | 5.95% | 5.88% | 5.73% |
| AL-ADMM-3 | 6.03% | 6.02% | 6.02% | 6.01% | 5.99% | 5.95% | 5.87% |
| AL-ADMM-4 | 6.06% | 6.05% | 6.04% | 6.02% | 5.98% | 5.90% | 5.75% |
| ALP-ADMM-1 | 6.02% | 6.01% | 5.99% | 5.95% | 5.85% | 5.74% | 5.63% |
| ALP-ADMM-2 | 6.03% | 6.02% | 6.01% | 5.99% | 5.95% | 5.88% | 5.73% |
| ALP-ADMM-3 | 6.03% | 6.02% | 6.02% | 6.01% | 5.99% | 5.95% | 5.87% |
| ALP-ADMM-4 | 6.06% | 6.05% | 6.04% | 6.02% | 5.98% | 5.90% | 5.75% |
| L-ADMM | 12.71% | 12.71% | 12.71% | 12.71% | 12.71% | 12.71% | 12.71% |
| LP-ADMM | 12.71% | 12.71% | 12.71% | 12.71% | 12.71% | 12.71% | 12.71% |
| NESTA | 6.02% | 6.00% | 5.98% | 5.93% | 5.83% | 5.63% | 5.34% |
| APD | 6.03% | 6.02% | 6.01% | 5.99% | 5.95% | 5.88% | 5.73% |

All algorithms run 200 iterations, under the choices $2^6,\ldots,2^{12}$ of the tested parameter. The best performance among all the algorithms under each selected parameter value is the minimum within the corresponding column.


Table 3-10. Comparison of the performance of linearized ADMM algorithms for solving (3), when the parameter is set to $2^8$

| Algorithm | $t$ | Obj. (Bernoulli) | Obj. (Gaussian) | Obj. (PPI) | Rel. err. (Bernoulli) | Rel. err. (Gaussian) | Rel. err. (PPI) |
|---|---|---|---|---|---|---|---|
| AL-ADMM-1 | 66 | 1.92 | 2.10 | 4295.36 | 9.08% | 34.84% | 8.45% |
|  | 133 | 1.74 | 1.78 | 2931.89 | 3.09% | 10.19% | 5.62% |
|  | 200 | 1.72 | 1.73 | 2796.89 | 2.11% | 5.60% | 5.99% |
| AL-ADMM-2 | 66 | 1.87 | 2.07 | 4296.62 | 7.79% | 34.22% | 8.46% |
|  | 133 | 1.73 | 1.76 | 2932.03 | 3.00% | 10.25% | 5.63% |
|  | 200 | 1.71 | 1.72 | 2796.29 | 2.13% | 5.73% | 6.01% |
| AL-ADMM-3 | 66 | 2.03 | 2.18 | 4296.70 | 14.06% | 39.39% | 8.46% |
|  | 133 | 1.77 | 1.78 | 2932.10 | 4.77% | 12.14% | 5.64% |
|  | 200 | 1.73 | 1.72 | 2796.18 | 2.96% | 6.77% | 6.02% |
| AL-ADMM-4 | 66 | 1.86 | 2.04 | 4146.91 | 7.63% | 32.50% | 8.23% |
|  | 133 | 1.73 | 1.76 | 2921.23 | 2.94% | 9.95% | 5.62% |
|  | 200 | 1.71 | 1.72 | 2793.63 | 2.11% | 5.64% | 6.04% |
| ALP-ADMM-1 | 66 | 2.12 | 2.35 | 4295.37 | 14.12% | 45.82% | 8.45% |
|  | 133 | 1.79 | 1.84 | 2931.89 | 4.47% | 14.21% | 5.62% |
|  | 200 | 1.74 | 1.75 | 2796.89 | 2.66% | 7.31% | 5.99% |
| ALP-ADMM-2 | 66 | 1.88 | 2.10 | 4296.62 | 7.75% | 36.16% | 8.46% |
|  | 133 | 1.73 | 1.77 | 2932.03 | 3.11% | 10.97% | 5.63% |
|  | 200 | 1.71 | 1.72 | 2796.29 | 2.16% | 6.00% | 6.01% |
| ALP-ADMM-3 | 66 | 2.19 | 2.40 | 4296.71 | 18.62% | 48.37% | 8.46% |
|  | 133 | 1.79 | 1.81 | 2932.10 | 5.40% | 15.28% | 5.64% |
|  | 200 | 1.73 | 1.74 | 2796.18 | 3.19% | 8.21% | 6.02% |
| ALP-ADMM-4 | 66 | 1.87 | 2.07 | 4146.91 | 7.61% | 34.50% | 8.23% |
|  | 133 | 1.73 | 1.77 | 2921.23 | 3.04% | 10.65% | 5.62% |
|  | 200 | 1.71 | 1.72 | 2793.63 | 2.15% | 5.91% | 6.04% |
| L-ADMM | 66 | 3.51 | 3.42 | 31030.21 | 50.54% | 73.58% | 17.04% |
|  | 133 | 2.69 | 2.95 | 14681.41 | 34.05% | 65.13% | 14.30% |
|  | 200 | 2.19 | 2.65 | 9884.21 | 20.52% | 57.92% | 12.71% |
| LP-ADMM | 66 | 3.52 | 3.42 | 31030.21 | 50.73% | 73.67% | 17.04% |
|  | 133 | 2.70 | 2.96 | 14681.41 | 34.29% | 65.25% | 14.30% |
|  | 200 | 2.19 | 2.66 | 9884.21 | 20.78% | 58.05% | 12.71% |
| NESTA | 66 | 2.47 | 2.67 | 4297.94 | 28.64% | 58.26% | 8.46% |
|  | 133 | 1.73 | 1.86 | 2931.73 | 3.39% | 21.53% | 5.62% |
|  | 200 | 1.70 | 1.71 | 2796.60 | 2.01% | 2.78% | 5.98% |
| APD | 66 | 1.88 | 2.10 | 4296.62 | 7.84% | 36.24% | 8.46% |
|  | 133 | 1.74 | 1.77 | 2932.03 | 3.14% | 11.00% | 5.63% |
|  | 200 | 1.71 | 1.72 | 2796.29 | 2.17% | 6.02% | 6.01% |

The best performance among all the algorithms at each selected iteration count $t$ is the minimum among all the values in the column associated with that $t$.
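The column-wise comparison described in the table notes can be reproduced programmatically. The snippet below is a small sketch (not part of the original experiments) that encodes the Table 3-4 objective values and recovers the per-column best objective:

```python
# Objective values from Table 3-4 (Bernoulli instance), one row per algorithm,
# columns corresponding to parameter values 2^6, ..., 2^12.
table_3_4 = {
    "AL-ADMM-1":  [1.72, 1.71, 1.72, 1.72, 1.72, 1.74, 1.76],
    "AL-ADMM-2":  [1.73, 1.71, 1.71, 1.71, 1.72, 1.72, 1.73],
    "AL-ADMM-3":  [1.79, 1.74, 1.73, 1.72, 1.72, 1.73, 1.76],
    "AL-ADMM-4":  [1.73, 1.71, 1.71, 1.71, 1.72, 1.72, 1.72],
    "ALP-ADMM-1": [1.72, 1.72, 1.74, 1.76, 1.81, 1.91, 2.11],
    "ALP-ADMM-2": [1.73, 1.71, 1.71, 1.71, 1.72, 1.73, 1.76],
    "ALP-ADMM-3": [1.79, 1.75, 1.73, 1.75, 1.80, 1.90, 2.11],
    "ALP-ADMM-4": [1.73, 1.71, 1.71, 1.71, 1.72, 1.73, 1.76],
    "L-ADMM":     [2.39, 2.25, 2.19, 2.16, 2.15, 2.14, 2.15],
    "LP-ADMM":    [2.40, 2.26, 2.19, 2.17, 2.18, 2.21, 2.29],
    "NESTA":      [1.73, 1.71, 1.70, 1.71, 1.79, 2.06, 2.81],
    "APD":        [1.73, 1.71, 1.71, 1.72, 1.72, 1.73, 1.76],
}
cols = [2 ** k for k in range(6, 13)]
# best objective value (column minimum) for each tested parameter value
best = {c: min(row[i] for row in table_3_4.values()) for i, c in enumerate(cols)}
assert best[2 ** 8] == 1.70   # attained by NESTA
assert best[2 ** 12] == 1.72  # attained by AL-ADMM-4
```

This makes the trend visible at a glance: NESTA is competitive for small parameter values but degrades for large ones, while AL-ADMM-4 stays near the ground-truth objective 1.71 across the whole range.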


Figure 3-1. The reconstructed images of the Bernoulli instance from different algorithms: (a) ground truth, (b) L-ADMM, (c) AL-ADMM-4, (d) NESTA.


Figure 3-2. The reconstructed images of the Gaussian instance from different algorithms: (a) ground truth, (b) L-ADMM, (c) AL-ADMM-4, (d) NESTA.


Figure 3-3. The reconstructed images of the PPI instance from different algorithms: (a) ground truth, (b) L-ADMM, (c) AL-ADMM-4, (d) NESTA.


Figure 3-4. The cerebellum part of the reconstructed images of the PPI instance from different algorithms: (a) ground truth, (b) L-ADMM, (c) AL-ADMM-4, (d) NESTA.


Figure 3-5. Comparison of the performance of AL-ADMM-4, L-ADMM, NESTA and APD in terms of objective value and normalized RMSE: (a) objective value, Bernoulli; (b) normalized RMSE, Bernoulli; (c) objective value, Gaussian; (d) normalized RMSE, Gaussian; (e) objective value, PPI; (f) normalized RMSE, PPI.


CHAPTER 4
OPTIMAL SCHEMES FOR A CLASS OF HEMIVARIATIONAL INEQUALITY PROBLEMS

4.1 Introduction

Let $E$ be a finite dimensional vector space with inner product $\langle\cdot,\cdot\rangle$ and norm $\|\cdot\|$, and let $Z$ be a non-empty closed convex set in $E$. Our problem of interest is to find $u^*\in Z$ that solves the following hemivariational inequality (HVI) problem:
$$G(u)-G(u^*)+\langle H(u),u-u^*\rangle+J(u)-J(u^*)\ge 0,\quad\forall u\in Z.$$
Here $G(\cdot)$ is a general continuously differentiable function whose gradient has Lipschitz constant $L_G$, i.e.,
$$0\le G(w)-G(v)-\langle\nabla G(v),w-v\rangle\le\frac{L_G}{2}\|w-v\|^2,\quad\forall w,v\in Z,$$
$H:Z\to E$ is a monotone operator with Lipschitz constant $L_H$, that is, for all $w,v\in Z$,
$$\langle H(w)-H(v),w-v\rangle\ge 0,\qquad\|H(w)-H(v)\|\le L_H\|w-v\|,$$
and $J(\cdot)$ is a relatively simple convex function. We denote the above problem by HVI$(Z;G,H,J)$, and say that $u^*$ is a weak solution of HVI$(Z;G,H,J)$. On the other hand, we say that $u^*$ is a strong solution of HVI$(Z;G,H,J)$ if it satisfies
$$G(u)-G(u^*)+\langle H(u^*),u-u^*\rangle+J(u)-J(u^*)\ge 0,\quad\forall u\in Z.$$

The HVI problem is an extension of the well-studied variational inequality (VI) problem. For any mapping $F:Z\to E$, the VI problem, denoted by VI$(Z;F)$, is to find $u^*\in Z$ that satisfies either the strong formulation
$$\langle F(u^*),u-u^*\rangle\ge 0,\quad\forall u\in Z,$$
or the weak formulation
$$\langle F(u),u-u^*\rangle\ge 0,\quad\forall u\in Z.$$
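Before moving on, the assumptions on $H$ can be made concrete with a small numerical sketch. The example below is an illustration added here (not part of the dissertation's experiments): it checks that a linear operator $H(u)=Mu$ with a skew-symmetric matrix $M$ is monotone with Lipschitz constant $L_H=\|M\|_2$.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
B = rng.standard_normal((n, n))
M = B - B.T                     # skew-symmetric, so H(u) = M u is monotone

def H(u):
    return M @ u

L_H = np.linalg.norm(M, 2)      # spectral norm = Lipschitz constant of u -> M u

for _ in range(100):
    w, v = rng.standard_normal(n), rng.standard_normal(n)
    # monotonicity: <H(w) - H(v), w - v> >= 0 (exactly 0 for skew-symmetric M)
    assert (H(w) - H(v)) @ (w - v) >= -1e-10
    # Lipschitz continuity: ||H(w) - H(v)|| <= L_H ||w - v||
    assert np.linalg.norm(H(w) - H(v)) <= L_H * np.linalg.norm(w - v) + 1e-10
```

Skew-symmetric operators of this kind are exactly what arise from the saddle point problems of the earlier chapters, which is why the monotone-plus-gradient structure of HVI$(Z;G,H,J)$ covers them.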


Solutions of the strong and weak formulations above are often called the strong and weak solutions of VI$(Z;F)$, respectively. Note that if $F$ is monotone and continuous, the weak solutions of VI$(Z;F)$ must also be strong solutions, and vice versa. Furthermore, if $J=0$, it can be shown that HVI$(Z;G,H,0)$ is equivalent to VI$(Z;F)$ with $F=\nabla G+H$. In addition, since $\nabla G$ and $H$ are continuous and monotone by our assumptions, the weak and strong solutions of HVI$(Z;G,H,0)$ are equivalent. Therefore, HVI$(Z;G,H,0)$ can be treated as a VI problem VI$(Z;F)$ with the additional structural information that $F$ is composed of a gradient $\nabla G$ and a monotone operator $H$.

Two types of HVI problems will be considered in this chapter, depending on the first order information of $G$ and $H$. The first case, called deterministic HVI, is when the first order information is exact. On the other hand, if the first order information is inexact, the problem becomes a stochastic HVI. The goal of this chapter is to develop first order methods for solving both the deterministic and the stochastic HVI. We start by reviewing some background for both problems.

4.1.1 Deterministic HVI

For simplicity of the discussion, we consider $J=0$ in the remainder of this subsection, and consider the HVI problem HVI$(Z;G,H,0)$. As we pointed out previously, this problem is equivalent to VI$(Z;F)$ with $F=\nabla G+H$.

For a general Lipschitz-continuous $F$ with Lipschitz constant $L_F$, it is shown in [67] that the rate of convergence for solving VI$(Z;F)$ by any first order method cannot be better than
$$O\!\left(\frac{L_F}{N}\right).$$
Most existing VI solvers aim to achieve this bound. In [65], Nemirovski developed a prox-method that computes a weak solution of VI$(Z;F)$ for compact $Z$. A special case of the prox-method is equivalent to Korpelevich's extragradient algorithm [47], which achieves the above rate of convergence. In [62], Monteiro and Svaiter showed that the extragradient method achieves this rate for unbounded $Z$ as well. Therefore,


if we use the VI formulation of HVI$(Z;G,H,0)$ and apply the extragradient method to VI$(Z;F)$, the rate of convergence is bounded by
$$O\!\left(\frac{L_G+L_H}{N}\right),$$
where $N$ is the total number of iterations.

One drawback of the VI formulation VI$(Z;F)$ is that it does not impose any structural information on $F$. However, as pointed out in Nesterov's remarkable paper [72], the structural information of a convex optimization problem is critical for improving the performance of numerical implementations. In fact, by utilizing the smoothing technique in [72], it can be shown that when $H$ is linear, the rate of convergence for solving HVI$(Z;G,H,J)$ is bounded by
$$O\!\left(\frac{L_G}{N^2}+\frac{L_H}{N}\right).$$
This rate is optimal, by the following observations:

a) If $H=0$, the rate of convergence for minimizing $G(u)+J(u)$ cannot be better than $O(L_G/N^2)$ [70, 71];

b) If $G=0$ and $J=0$, the rate of convergence for computing a weak solution of VI$(Z;H)$ cannot be better than $O(L_H/N)$ [68].

It should be noted that the rate $O(L_G/N^2+L_H/N)$ is significantly better than $O((L_G+L_H)/N)$ due to its improved dependence on $L_G$. More specifically, the former rate allows a very large Lipschitz constant $L_G$ (as big as $O(N)$) without affecting the rate of convergence. Therefore, it is clearly more favorable when $L_G$ is significantly larger than $L_H$.

4.1.2 Stochastic HVI

In the stochastic setting, we assume that there exist stochastic oracles $\mathcal{SO}_G$ and $\mathcal{SO}_H$ that provide unbiased samples of the first order operators $\nabla G(u)$ and $H(u)$ at any test point $u\in Z$. More specifically, we assume that at the $i$-th call of $\mathcal{SO}_G$ and $\mathcal{SO}_H$


with input $z\in Z$, the oracles $\mathcal{SO}_G$ and $\mathcal{SO}_H$ output stochastic first order information $\mathcal G(z,\xi_i)$ and $\mathcal H(z,\zeta_i)$ respectively, such that $E[\mathcal G(x,\xi_i)]=\nabla G(x)$, $E[\mathcal H(x,\zeta_i)]=H(x)$, and

A1. $E\big[\|\mathcal G(x,\xi_i)-\nabla G(x)\|^2\big]\le\sigma_G^2$, $\quad E\big[\|\mathcal H(x,\zeta_i)-H(x)\|^2\big]\le\sigma_H^2$,

where $\xi_i$ and $\zeta_i$ are independently distributed random variables. It should be noted that the deterministic HVI is a special case of the stochastic HVI with $\sigma_G=\sigma_H=0$. To distinguish the stochastic HVI from the deterministic HVI, we will use HVI$_S(Z;G,H,J)$ to denote the problem in the stochastic setting. Correspondingly, when $J=0$ we use VI$_S(Z;F)$ to denote the VI formulation of the stochastic HVI, where $F=\nabla G+H$.

Following the discussion of the optimal deterministic rate and the complexity theory for stochastic optimization [43, 67], an optimal rate of convergence for solving the stochastic HVI is given by
$$O\!\left(\frac{L_G}{N^2}+\frac{L_H}{N}+\frac{\sigma_G+\sigma_H}{\sqrt N}\right).$$
However, to the best of our knowledge, this optimal rate has not been attained in the previous literature. For HVI$_S(Z;G,H,0)$, a nearly optimal rate of convergence can be achieved by applying the stochastic mirror-prox method in [43] to VI$_S(Z;F)$, which yields the following rate of convergence:
$$O\!\left(\frac{L_G+L_H}{N}+\frac{\sigma_G+\sigma_H}{\sqrt N}\right).$$
However, as we pointed out in Section 4.1.1, the VI formulation does not utilize the structural information that $F$ contains a gradient component. It should be noted that if $H=0$, there have been several studies on the stochastic optimization problem $\min_{u\in Z}G(u)+J(u)$ that utilize the structural information (smoothness of $G$), and the following optimal rate of convergence is achieved (see [32, 33, 50, 98]):
$$O\!\left(\frac{L_G}{N^2}+\frac{\sigma_G}{\sqrt N}\right).$$
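The advantage of the structured bounds over the unstructured one is easy to quantify. The following arithmetic sketch (with illustrative constants chosen here, not taken from the dissertation) compares the unstructured rate $(L_G+L_H)/N$ with the structured rate $L_G/N^2+L_H/N$ for an instance with a large $L_G$:

```python
# Illustrative constants: a very stiff smooth term and a mild monotone term.
L_G, L_H, N = 1.0e6, 1.0e2, 1000

vi_rate = (L_G + L_H) / N                 # extragradient on VI(Z; grad G + H)
structured_rate = L_G / N**2 + L_H / N    # smoothness of G exploited

assert abs(vi_rate - 1000.1) < 1e-9
assert abs(structured_rate - 1.1) < 1e-9
# Moving L_G from an O(1/N) term to an O(1/N^2) term wins by orders of magnitude:
assert structured_rate < vi_rate / 100
```

In this example the structured bound is almost a thousand times smaller, which is exactly the regime $L_G\gg L_H$ discussed above.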


4.1.3 Main Results

Our main results in this chapter consist of the following three aspects. Firstly, we present a novel method, namely the accelerated prox-method (AC-PM), that achieves the optimal rate of convergence for computing a weak solution of HVI$(Z;G,H,J)$. The basic idea of AC-PM is to exploit the structural information of HVI$(Z;G,H,J)$ by incorporating into the prox-method of [65] a multi-step acceleration scheme that utilizes the smoothness of $G(\cdot)$. We demonstrate that AC-PM achieves the optimal rate $O(L_G/N^2+L_H/N)$, and hence can efficiently solve HVI problems with a large Lipschitz constant $L_G$.

Secondly, we develop a stochastic counterpart of AC-PM, namely stochastic AC-PM, to compute a weak solution of HVI$_S(Z;G,H,J)$, and demonstrate that it actually achieves the optimal rate $O(L_G/N^2+L_H/N+(\sigma_G+\sigma_H)/\sqrt N)$. Therefore, this algorithm exhibits an optimal rate of convergence for stochastic HVI not only in terms of its dependence on $N$, but also on a variety of problem parameters including $L_G$, $L_H$, $\sigma_G$ and $\sigma_H$. To the best of our knowledge, this is the first time that such an optimal algorithm has been developed for stochastic HVI in the literature. In addition, we investigate the stochastic HVI method in more detail, e.g., by developing the large-deviation results associated with the rate of convergence of stochastic AC-PM.

Finally, for both deterministic and stochastic HVI, we demonstrate that AC-PM can deal with the case when $Z$ is unbounded, as long as a strong solution of HVI$(Z;G,H,J)$ exists. We incorporate into AC-PM the termination criterion employed by Monteiro and Svaiter [62, 63] for solving variational and hemivariational inequalities posed as monotone inclusion problems. In both the deterministic and stochastic cases, when $Z$ is unbounded, the rate of convergence of AC-PM depends on the distance from the initial point to the set of strong solutions.


4.2 Accelerated Prox-Method for Deterministic HVI

The purpose of this section is to present the accelerated prox-method (AC-PM) for computing a weak solution of HVI$(Z;G,H,J)$, and to present its main convergence results.

Recall that a Bregman divergence $V(\cdot,\cdot)$ is a map defined in Definition 2. If $V(\cdot,\cdot)$ is associated with a distance generating function of modulus $\alpha$, then
$$\frac{\alpha}{2}\|z-z_1\|^2\le V(z,z_1),\quad\forall z,z_1\in Z.$$
With a fixed prox-function $V(\cdot,\cdot)$, for any $z\in Z$ and $\zeta\in E$ we define a modified prox-mapping:
$$P^J_z(\zeta):=\operatorname*{argmin}_{u\in Z}\ \langle\zeta,u-z\rangle+V(u,z)+J(u).$$
It should be noted that $P^J_z(\zeta)$ is different from the prox-mapping in (1.1.1), due to the function $J(\cdot)$. Throughout this chapter, we assume that $J(\cdot)$ is simple, so that the minimization problem defining $P^J_z(\zeta)$ can be solved efficiently.

Our concept of the accelerated prox-method (AC-PM) is inspired by the accelerated scheme for smooth optimization in [72] and by the prox-method for VI in [43, 65]. The proposed algorithm is as follows:

Algorithm 4.1 AC-PM for computing a weak solution of HVI$(Z;G,H,J)$
1: Choose $r_1\in Z$. Set $w_1=r_1$, $w^{ag}_1=r_1$.
2: For $t=1,2,\ldots,N-1$, calculate
$$w^{md}_t=(1-\alpha_t)w^{ag}_t+\alpha_t r_t,$$
$$w_{t+1}=P^{\gamma_tJ}_{r_t}\big(\gamma_tH(r_t)+\gamma_t\nabla G(w^{md}_t)\big),$$
$$r_{t+1}=P^{\gamma_tJ}_{r_t}\big(\gamma_tH(w_{t+1})+\gamma_t\nabla G(w^{md}_t)\big),$$
$$w^{ag}_{t+1}=(1-\alpha_t)w^{ag}_t+\alpha_tw_{t+1}.$$
3: Output $w^{ag}_N$.

In Algorithm 4.1, "md" stands for "middle", and "ag" stands for "aggregated". It should be noted that if $\alpha_t\equiv 1$, $G=0$ and $J=0$, then Algorithm 4.1 for solving HVI$(Z;0,H,0)$ is equivalent to the prox-method in [65] for solving VI$(Z;H)$, which is


equivalent to the extragradient method in [47] if $V(\cdot,\cdot)$ is the Euclidean distance. On the other hand, if $H=0$, then the two prox-mapping steps of Algorithm 4.1 produce the same optimizer $w_{t+1}=r_{t+1}$, and Algorithm 4.1 is equivalent to a version of Nesterov's accelerated method for solving $\min_{u\in Z}G(u)+J(u)$ (see, for example, Algorithm 1 in [91]). The idea of Nesterov's accelerated method ([70, 72], see also [91]) is to make full use of the smoothness of $G(\cdot)$ to achieve the optimal rate of convergence for smooth optimization. Inspired by Nesterov's concept, we will demonstrate that, by utilizing the structural information that $G(\cdot)$ is smooth, we can achieve the optimal rate of convergence $O(L_G/N^2+L_H/N)$ for solving the deterministic HVI.

In order to analyze the convergence of Algorithm 4.1, we introduce a notion to characterize the weak solutions of HVI$(Z;G,H,J)$. For all $\tilde u,u\in Z$, we define
$$Q(\tilde u,u):=G(\tilde u)-G(u)+\langle H(u),\tilde u-u\rangle+J(\tilde u)-J(u).$$
Clearly, $\tilde u$ is a weak solution of HVI$(Z;G,H,J)$ if and only if $Q(\tilde u,u)\le 0$ for all $u\in Z$. To study the convergence properties of AC-PM, we use a perturbation-based termination criterion recently employed by Monteiro and Svaiter [62, 63], which is based on the enlargement of a maximal monotone operator first introduced in [13]. It should be noted that we have used the SPP version of this perturbation-based termination criterion in the analysis of Chapter 2. For the HVI case, we say that the pair $(\tilde v,\tilde u)\in E\times Z$ is a $(\rho,\varepsilon)$-approximate solution of HVI$(Z;G,H,J)$ if $\|\tilde v\|\le\rho$ and $\tilde g(\tilde u,\tilde v)\le\varepsilon$, where the gap function $\tilde g(\cdot,\cdot)$ is defined by
$$\tilde g(\tilde u,\tilde v):=\sup_{u\in Z}\ Q(\tilde u,u)-\langle\tilde v,\tilde u-u\rangle.$$
We call $\tilde v$ the perturbation vector associated with $\tilde u$. One advantage of employing this termination criterion is that the convergence analysis does not depend on the


boundedness of $Z$. Furthermore, if $\tilde u$ satisfies $g(\tilde u)\le\varepsilon$, where
$$g(\tilde u):=\tilde g(\tilde u,0)=\sup_{u\in Z}Q(\tilde u,u),$$
then $(0,\tilde u)$ is a $(0,\varepsilon)$-solution. In such a case, we simply say that $\tilde u$ is an $\varepsilon$-solution.

The convergence properties of Algorithm 4.1 are presented in the following theorems and corollaries; the detailed proofs are given in Section 4.4. We will demonstrate that, when $Z$ is bounded, $w^{ag}_{t+1}$ is an $\varepsilon_{t+1}$-solution of HVI$(Z;G,H,J)$, where $\varepsilon_{t+1}$ depends on the diameter of $Z$ and, as $t$ increases, approaches $0$ at the optimal rate $O(L_G/N^2+L_H/N)$. On the other hand, if $Z$ is unbounded, then there exists $v_{t+1}$ such that $(w^{ag}_{t+1},v_{t+1})$ is a $(\rho_{t+1},\varepsilon_{t+1})$-solution, where the residuals $\rho_{t+1}$ and $\varepsilon_{t+1}$ depend on the distance from the initial point $r_1$ to a strong solution, and both residuals approach $0$ at the same optimal rate.

The following theorem describes the convergence property of Algorithm 4.1 when $Z$ is bounded.

Theorem 4.1. Suppose that
$$\sup_{z_1,z_2\in Z}V(z_1,z_2)=\Omega_Z^2.$$
If the parameters $\alpha_t,\gamma_t$ in Algorithm 4.1 are chosen such that $\alpha_1=1$ and, for all $t>1$,
$$0\le\alpha_t<1,\qquad\alpha-L_G\alpha_t\gamma_t-\frac{L_H^2\gamma_t^2}{\alpha}\ge 0,\qquad\frac{\alpha_t}{\Gamma_t\gamma_t}\le\frac{\alpha_{t+1}}{\Gamma_{t+1}\gamma_{t+1}},$$
where $\{\Gamma_t\}$ is defined by
$$\Gamma_t=\begin{cases}1,&t=1,\\(1-\alpha_t)\Gamma_{t-1},&t>1,\end{cases}$$
then
$$g(w^{ag}_{t+1})\le\frac{\alpha_t}{\gamma_t}\Omega_Z^2.$$
There are various ways to choose parameters $\alpha_t,\gamma_t$ that satisfy the above conditions. In the following corollary, we give one such example.
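A minimal numerical sketch of Algorithm 4.1 may also help fix ideas. The instance below is an illustration assumed for this sketch (not one of the dissertation's experiments): the unconstrained Euclidean case $Z=E=\mathbb R^n$ with $V(u,z)=\|u-z\|^2/2$ and $J=0$, a smooth term $G(u)=\frac12\|Ku-b\|^2$, a monotone term $H(u)=Mu$ with skew-symmetric $M$, and the parameter choice of Corollary 6.

```python
import numpy as np

rng = np.random.default_rng(1)
n, N = 5, 1000
K = np.eye(n) + 0.1 * rng.standard_normal((n, n))   # well-conditioned toy data
b = rng.standard_normal(n)
B = rng.standard_normal((n, n))
M = 0.5 * (B - B.T)                                  # skew-symmetric => H monotone

grad_G = lambda u: K.T @ (K @ u - b)
H = lambda u: M @ u
L_G = np.linalg.norm(K.T @ K, 2)
L_H = np.linalg.norm(M, 2)

# A strong solution here satisfies grad G(u*) + H(u*) = 0.
u_star = np.linalg.solve(K.T @ K + M, K.T @ b)

r = np.zeros(n)
w_ag = r.copy()
for t in range(1, N):
    alpha_t = 2.0 / (t + 1)                          # combination weight
    gamma_t = t / (2.0 * (L_G + L_H * N))            # Corollary 6 stepsize
    w_md = (1 - alpha_t) * w_ag + alpha_t * r        # "middle" point
    g = grad_G(w_md)
    w = r - gamma_t * (H(r) + g)                     # Euclidean prox step, J = 0
    r = r - gamma_t * (H(w) + g)                     # extragradient-type correction
    w_ag = (1 - alpha_t) * w_ag + alpha_t * w        # "aggregated" point

err0 = np.linalg.norm(u_star)                        # error of the start point r_1 = 0
err = np.linalg.norm(w_ag - u_star)
assert err < 0.5 * err0                              # aggregated iterate approaches u*
```

With a Euclidean prox-function, each prox-mapping reduces to a plain gradient-style step, so the whole iteration costs two operator evaluations of $H$ and one of $\nabla G$ per round, matching the oracle counting discussed later in this chapter.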


Corollary 5. Suppose that $\sup_{z_1,z_2\in Z}V(z_1,z_2)=\Omega_Z^2$. If the parameters $\{\alpha_t\},\{\gamma_t\}$ in AC-PM are
$$\alpha_t=\frac{2}{t+1},\qquad\gamma_t=\frac{\alpha t}{2(L_G+L_Ht)},$$
then for all $u\in Z$,
$$Q(w^{ag}_{t+1},u)\le\left(\frac{2L_G}{t(t+1)}+\frac{2L_H}{t+1}\right)D_Z^2,\qquad\text{where}\quad D_Z:=\Omega_Z\sqrt{\frac{2}{\alpha}}.$$

Proof. From the above choice of $\alpha_t$ we can calculate that $\Gamma_t=\frac{2}{t(t+1)}$. Then
$$\frac{\alpha_t}{\Gamma_t\gamma_t}=\frac{2(L_G+L_Ht)}{\alpha}\le\frac{\alpha_{t+1}}{\Gamma_{t+1}\gamma_{t+1}}.$$
Moreover,
$$\alpha-L_G\alpha_t\gamma_t-\frac{L_H^2\gamma_t^2}{\alpha}=\alpha-\frac{\alpha L_G}{L_G+L_Ht}\cdot\frac{t}{t+1}-\frac{\alpha L_H^2t^2}{4(L_G+L_Ht)^2}\ge\alpha-\frac{\alpha L_G}{L_G+L_Ht}-\frac{\alpha L_Ht}{L_G+L_Ht}=0.$$
Thus the conditions of Theorem 4.1 hold. Hence, applying Theorem 4.1 with the above parameter setting and using the definition of $D_Z$, we obtain the claimed bound.

Clearly, in view of the bound above, when the parameters are chosen according to Corollary 5, the rate of convergence of Algorithm 4.1 for computing a weak solution of HVI$(Z;G,H,J)$ is optimal. In fact, as long as $Z$ is bounded, there is no need to know the exact values of $\Omega_Z$ or $D_Z$. However, if $Z$ is unbounded, then $\Omega_Z$ and $D_Z$ can possibly be infinite, and the bound of Corollary 5 may not be meaningful. In such a case, we suggest using the perturbation-based gap function $\tilde g(\cdot,\cdot)$. We present the following theorem for the convergence property of Algorithm 4.1 when $Z$ is unbounded.
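The two stepsize conditions of Theorem 4.1 can also be checked numerically for the Corollary 5 choice. The snippet below does so for a range of $t$, with $\alpha=1$ and illustrative constants $L_G,L_H$ (the constants, and the symbol names, are assumptions of this sketch):

```python
L_G, L_H, alpha = 50.0, 3.0, 1.0      # illustrative constants, modulus alpha = 1

def params(t):
    a_t = 2.0 / (t + 1)                           # alpha_t of Corollary 5
    g_t = alpha * t / (2.0 * (L_G + L_H * t))     # gamma_t of Corollary 5
    G_t = 2.0 / (t * (t + 1))                     # Gamma_t = (1 - alpha_t) Gamma_{t-1}
    return a_t, g_t, G_t

prev_ratio = 0.0
for t in range(1, 200):
    a_t, g_t, G_t = params(t)
    # condition 1: alpha - L_G * alpha_t * gamma_t - L_H^2 * gamma_t^2 / alpha >= 0
    assert alpha - L_G * a_t * g_t - L_H**2 * g_t**2 / alpha >= -1e-12
    # condition 2: alpha_t / (Gamma_t * gamma_t) is non-decreasing in t;
    # for this choice it equals 2 (L_G + L_H t) / alpha.
    ratio = a_t / (G_t * g_t)
    assert ratio >= prev_ratio - 1e-12
    prev_ratio = ratio
```

A check of this kind is a convenient sanity test whenever one experiments with alternative stepsize schedules for Algorithm 4.1.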


Theorem 4.2. Suppose that $V(r,z):=\|z-r\|^2/2$ for any $r,z\in Z$. Also assume that the parameters $\alpha_t,\gamma_t$ in Algorithm 4.1 are chosen such that $\alpha_1=1$ and, for all $t>1$,
$$0\le\alpha_t<1,\qquad L_G\alpha_t\gamma_t+L_H^2\gamma_t^2\le c^2\ \text{for some}\ c<1,\qquad\frac{\alpha_t}{\Gamma_t\gamma_t}=\frac{\alpha_{t+1}}{\Gamma_{t+1}\gamma_{t+1}},$$
where $\Gamma_t$ is defined as in Theorem 4.1. Then for all $t>1$ there exist $v_{t+1}\in E$ and $\varepsilon_{t+1}\ge 0$ such that $\tilde g(w^{ag}_{t+1},v_{t+1})\le\varepsilon_{t+1}$ and
$$\|v_{t+1}\|\le\frac{2\alpha_tD}{\gamma_t},\qquad\varepsilon_{t+1}\le\frac{2\alpha_t(1+\theta_t)D^2}{\gamma_t},$$
where $u^*$ is a strong solution of HVI$(Z;G,H,J)$, and
$$D:=\|r_1-u^*\|,\qquad\theta_t:=\sqrt{\frac{c^2\Gamma_t}{1-c^2}\max_{1\le i\le t}\frac{\alpha_i}{\Gamma_i}}.$$

Below we provide a specific setting of the parameters $\alpha_t$ and $\gamma_t$ that satisfies the conditions of Theorem 4.2.

Corollary 6. Suppose that $V(r,z):=\|z-r\|^2/2$ for any $r,z\in Z$ and $L_H>0$. In Algorithm 4.1, if $N$ is given and the parameters $\alpha_t$ and $\gamma_t$ are set to
$$\alpha_t=\frac{2}{t+1},\qquad\gamma_t=\frac{t}{2(L_G+L_HN)},$$
then there exists $v_N\in E$ such that $\tilde g(w^{ag}_N,v_N)\le\varepsilon_N$ and
$$\|v_N\|\le\left(\frac{8L_G}{N(N-1)}+\frac{8L_H}{N-1}\right)D,\qquad\varepsilon_N\le\left(\frac{8L_G}{N(N-1)}+\frac{8L_H}{N-1}\right)(1+\theta_{N-1})D^2,$$
where $u^*$ is a strong solution of HVI$(Z;G,H,J)$, $D$ is defined in Theorem 4.2, and
$$\theta_{N-1}\le\sqrt{\frac{2L_G}{L_HN^2}+\frac{2}{N}}.$$


Proof. It is easy to check that $\Gamma_t=\frac{2}{t(t+1)}$ satisfies its defining recursion, and that $\frac{\alpha_t}{\Gamma_t\gamma_t}=2(L_G+L_HN)$ for all $t$. Furthermore,
$$L_G\alpha_t\gamma_t+L_H^2\gamma_t^2=\frac{L_Gt}{(L_G+L_HN)(t+1)}+\frac{L_H^2t^2}{4(L_G+L_HN)^2}\le\frac{L_G}{L_G+L_HN}+\frac{L_H^2N^2}{4(L_G+L_HN)^2}=\frac{(2L_G+L_HN)^2}{4(L_G+L_HN)^2}=:c^2.$$
From the above we can see that $c<1$, and
$$\frac{c^2}{1-c^2}=\frac{4L_G^2+4L_GL_HN+L_H^2N^2}{4L_GL_HN+3L_H^2N^2}\le\frac{L_G}{L_HN}+1,$$
so
$$\theta_{N-1}=\sqrt{\frac{c^2}{1-c^2}}\sqrt{\Gamma_{N-1}\max_{1\le i\le N-1}\frac{\alpha_i}{\Gamma_i}}\le\sqrt{\frac{L_G}{L_HN}+1}\cdot\sqrt{\frac{2}{N}}=\sqrt{\frac{2L_G}{L_HN^2}+\frac{2}{N}}.$$
Finally, we get the claimed bounds by substituting the values of $\alpha_{N-1}$ and $\gamma_{N-1}$ into Theorem 4.2.

Several remarks are in order for Theorem 4.2 and Corollary 6. Firstly, although the existence of a strong solution $u^*$ is required, no information on either $u^*$ or $D$ is needed for choosing the parameters $\alpha_t$ and $\gamma_t$, as shown in Corollary 6. Secondly, both residuals $\|v_N\|$ and $\varepsilon_N$ converge at the same rate (up to a constant factor $D$). Lastly, it is only for simplicity that we assume $V(r,z)=\|z-r\|^2/2$; similar results can be achieved under the mild assumptions that $\nabla\omega$ is Lipschitz continuous and that $\sqrt{V(\cdot,\cdot)}$ is a metric.

4.3 Accelerated Prox-Method for Stochastic HVI

In this section, we focus on solving the stochastic HVI, that is to say, the HVI problem with a stochastic oracle. We will demonstrate that the stochastic counterpart of Algorithm 4.1 achieves the optimal rate of convergence $O(L_G/N^2+L_H/N+(\sigma_G+\sigma_H)/\sqrt N)$.

The stochastic AC-PM is obtained by replacing the first-order operators $H(r_t)$, $H(w_{t+1})$ and $\nabla G(w^{md}_t)$ in Algorithm 4.1 by their stochastic counterparts $\mathcal H(r_t,\zeta_{2t-1})$,


$\mathcal H(w_{t+1},\zeta_{2t})$ and $\mathcal G(w^{md}_t,\xi_t)$ respectively, obtained by calling the stochastic oracles $\mathcal{SO}_H$ and $\mathcal{SO}_G$. The proposed stochastic version of the accelerated prox-method is as follows:

Algorithm 4.2 AC-PM for computing a weak solution of HVI$_S(Z;G,H,J)$
Modify the two prox-mapping steps in Algorithm 4.1 to
$$w_{t+1}=P^{\gamma_tJ}_{r_t}\big(\gamma_t\mathcal H(r_t,\zeta_{2t-1})+\gamma_t\mathcal G(w^{md}_t,\xi_t)\big),$$
$$r_{t+1}=P^{\gamma_tJ}_{r_t}\big(\gamma_t\mathcal H(w_{t+1},\zeta_{2t})+\gamma_t\mathcal G(w^{md}_t,\xi_t)\big).$$

It should be noted that each iteration makes two calls of $\mathcal{SO}_H$ and one call of $\mathcal{SO}_G$. However, if we assume that $J=0$ and use the stochastic prox-method in [43] to solve the VI formulation VI$(Z;\nabla G+H)$, each iteration would make two calls of $\mathcal{SO}_H$ and two calls of $\mathcal{SO}_G$. Therefore, the cost per iteration of AC-PM is in fact less than that of the stochastic prox-method in [43].

Similarly to Section 4.2, we use the gap function $g(\cdot)$ for the case when $Z$ is bounded, and the gap function $\tilde g(\cdot,\cdot)$ for the case when $Z$ is unbounded. For both cases we establish the rate of convergence of the gap functions in terms of their expectation, i.e., the average rate of convergence over many runs of the algorithm. Furthermore, we demonstrate that if $Z$ is bounded, then we can also establish the rate of convergence of $g(\cdot)$ in the probability sense, under the following light-tail assumption:

A2. For the $i$-th call of the oracles $\mathcal{SO}_G$ and $\mathcal{SO}_H$ with any input $u\in Z$,
$$E\big[\exp\{\|\nabla G(u)-\mathcal G(u,\xi_i)\|^2/\sigma_G^2\}\big]\le\exp\{1\},\qquad E\big[\exp\{\|H(u)-\mathcal H(u,\zeta_i)\|^2/\sigma_H^2\}\big]\le\exp\{1\}.$$

It should be noted that Assumption A2 implies Assumption A1 by Jensen's inequality.

The following theorem shows the convergence property of Algorithm 4.2 when $Z$ is bounded.


Theorem 4.3. Suppose that $\sup_{z_1,z_2\in Z}V(z_1,z_2)=\Omega_Z^2$. Also assume that the parameters $\alpha_t,\gamma_t$ in Algorithm 4.2 satisfy $\alpha_1=1$ and, for all $t\ge 1$,
$$q\alpha-L_G\alpha_t\gamma_t-\frac{3L_H^2\gamma_t^2}{\alpha}\ge 0\ \text{for some}\ q\in(0,1),\qquad\frac{\alpha_t}{\Gamma_t\gamma_t}\le\frac{\alpha_{t+1}}{\Gamma_{t+1}\gamma_{t+1}},$$
where $\Gamma_t$ is defined as in Theorem 4.1. Then:

a) Under Assumption A1, for all $t>1$,
$$E\big[g(w^{ag}_{t+1})\big]\le Q_0(t),\qquad\text{where}\quad Q_0(t):=\frac{2\alpha_t}{\gamma_t}\Omega_Z^2+\left(4\sigma_H^2+\Big(1+\frac{1}{2(1-q)}\Big)\sigma_G^2\right)\frac{\Gamma_t}{\alpha}\sum_{i=1}^t\frac{\alpha_i\gamma_i}{\Gamma_i}.$$

b) Under Assumption A2, for all $\lambda>0$ and $t>1$,
$$\mathrm{Prob}\big\{g(w^{ag}_{t+1})>Q_0(t)+\lambda Q_1(t)\big\}\le 2\exp\{-\lambda^2/3\}+3\exp\{-\lambda\},$$
where
$$Q_1(t):=\Gamma_t(\sigma_G+\sigma_H)\Omega_Z\sqrt{\frac{2}{\alpha}\sum_{i=1}^t\Big(\frac{\alpha_i}{\Gamma_i}\Big)^2}+\left(4\sigma_H^2+\Big(1+\frac{1}{2(1-q)}\Big)\sigma_G^2\right)\frac{\Gamma_t}{\alpha}\sum_{i=1}^t\frac{\alpha_i\gamma_i}{\Gamma_i}.$$

We present below a specific parameter setting of $\alpha_t$ and $\gamma_t$ that satisfies the conditions of Theorem 4.3.

Corollary 7. Suppose that $\sup_{z_1,z_2\in Z}V(z_1,z_2)=\Omega_Z^2$. If the stepsizes $\{\alpha_t\},\{\gamma_t\}$ in Algorithm 4.2 are set to
$$\alpha_t=\frac{2}{t+1},\qquad\gamma_t=\frac{\alpha t}{4L_G+3L_Ht+\tilde\sigma(t+1)\sqrt t/D_Z},$$
where $\tilde\sigma:=2\sqrt{\sigma_H^2+\sigma_G^2}$ and $D_Z:=\Omega_Z\sqrt{2/\alpha}$, then under Assumption A1 we have
$$E\big[g(w^{ag}_{t+1})\big]\le\frac{8L_GD_Z^2}{t(t+1)}+\frac{6L_HD_Z^2}{t+1}+\frac{8(\sigma_G+\sigma_H)D_Z}{\sqrt t}=:C_0(t).$$


Furthermore, if Assumption A2 is satisfied, then for all $\lambda>0$ we have
$$\mathrm{Prob}\big\{g(w^{ag}_{t+1})>C_0(t)+\lambda C_1(t)\big\}\le 2\exp\{-\lambda^2/3\}+3\exp\{-\lambda\},\qquad\text{where}\quad C_1(t)=\frac{6(\sigma_G+\sigma_H)D_Z}{\sqrt t}.$$

Proof. It is easy to check that $\alpha_t=\frac{2}{t+1}$ and $\Gamma_t=\frac{2}{t(t+1)}$ satisfy the recursion defining $\Gamma_t$, and that $\frac{\alpha_t}{\Gamma_t\gamma_t}\le\frac{\alpha_{t+1}}{\Gamma_{t+1}\gamma_{t+1}}$. In addition, in view of the choice of $\gamma_t$, we have $L_G\alpha_t\gamma_t\le\frac{\alpha t}{2(t+1)}$ and $\gamma_t^2\le\frac{\alpha^2}{9L_H^2}$, hence
$$\frac{5\alpha}{6}-L_G\alpha_t\gamma_t-\frac{3L_H^2\gamma_t^2}{\alpha}\ge\frac{5\alpha}{6}-\frac{\alpha t}{2(t+1)}-\frac{\alpha}{3}\ge 0.$$
Therefore, the conditions of Theorem 4.3 hold with constant $q=5/6$, and thus Theorem 4.3 applies. In order to finish the proof, it suffices to show that $Q_0(t)\le C_0(t)$ and $Q_1(t)\le C_1(t)$. Observe that $\frac{\alpha_t}{\Gamma_t}=t$ and $\gamma_t\le\frac{\alpha D_Z}{\tilde\sigma\sqrt t}$, thus
$$\sum_{i=1}^t\frac{\alpha_i\gamma_i}{\Gamma_i}=\sum_{i=1}^ti\gamma_i\le\frac{\alpha D_Z}{\tilde\sigma}\sum_{i=1}^t\sqrt i\le\frac{2\alpha D_Z}{3\tilde\sigma}(t+1)^{3/2}.$$
With $q=5/6$ we have $4\sigma_H^2+\big(1+\frac{1}{2(1-q)}\big)\sigma_G^2=4\sigma_H^2+4\sigma_G^2=\tilde\sigma^2$. Using the facts that $(t+1)/t\le 2$ and $\sum_{i=1}^ti^2\le t(t+1)^2/3$, we have
$$Q_0(t)=\frac{2\alpha_t}{\gamma_t}\Omega_Z^2+\tilde\sigma^2\,\frac{\Gamma_t}{\alpha}\sum_{i=1}^t\frac{\alpha_i\gamma_i}{\Gamma_i}\le\frac{8L_GD_Z^2}{t(t+1)}+\frac{6L_HD_Z^2}{t+1}+\frac{2\tilde\sigma D_Z}{\sqrt t}+\frac{4\tilde\sigma D_Z}{3\sqrt t}\sqrt{\frac{t+1}{t}}\le C_0(t),$$


and
$$Q_1(t)=\Gamma_t(\sigma_G+\sigma_H)\Omega_Z\sqrt{\frac{2}{\alpha}\sum_{i=1}^t\Big(\frac{\alpha_i}{\Gamma_i}\Big)^2}+\tilde\sigma^2\,\frac{\Gamma_t}{\alpha}\sum_{i=1}^t\frac{\alpha_i\gamma_i}{\Gamma_i}\le\frac{2(\sigma_G+\sigma_H)D_Z}{t(t+1)}\sqrt{\sum_{i=1}^ti^2}+\frac{4\tilde\sigma D_Z}{3\sqrt t}\sqrt{\frac{t+1}{t}}\le\frac{2(\sigma_G+\sigma_H)D_Z}{\sqrt{3t}}+\frac{4\tilde\sigma D_Z}{3\sqrt t}\sqrt{\frac{t+1}{t}}\le C_1(t).$$

In view of the rates of convergence established in Corollary 7, we can clearly see that the stochastic AC-PM achieves the optimal rate of convergence both in expectation and in probability. More specifically, this algorithm allows $L_G$ to be as large as $O(t^{3/2})$ without significantly affecting the convergence properties.

In the following theorem, we demonstrate the convergence property of Algorithm 4.2 for solving the stochastic problem HVI$_S(Z;G,H,J)$ when $Z$ is unbounded. It seems that this case has not been well studied in the previous literature.

Theorem 4.4. Suppose that $V(r,z):=\|z-r\|^2/2$ for any $r,z\in Z$. If the parameters $\alpha_t,\gamma_t$ in Algorithm 4.2 are chosen such that $\alpha_1=1$ and, for all $t>1$,
$$0\le\alpha_t<1,\qquad L_G\alpha_t\gamma_t+3L_H^2\gamma_t^2\le c^2\ \text{for some}\ c^2<q<1,\qquad\frac{\alpha_t}{\Gamma_t\gamma_t}=\frac{\alpha_{t+1}}{\Gamma_{t+1}\gamma_{t+1}},$$
then for all $t>1$ there exist a perturbation vector $v_{t+1}$ and a residual $\varepsilon_{t+1}\ge 0$ such that $\tilde g(w^{ag}_{t+1},v_{t+1})\le\varepsilon_{t+1}$, and
$$E[\|v_{t+1}\|]\le\frac{\alpha_t}{\gamma_t}\sqrt{4D^2+2C^2},\qquad E[\varepsilon_{t+1}]\le\frac{3\alpha_t}{\gamma_t}\left(D^2+(2D^2+C^2)\max\Big\{1,\frac{c^2}{q-c^2}\Big\}\right)+\frac{18\Gamma_t^2\sigma_H^2}{\gamma_t^2}\sum_{i=1}^t\gamma_i^3+\frac{\alpha_t}{2\gamma_t}C^2,$$


where $u^*$ is a strong solution of HVI$(Z;G,H,J)$, $D:=\|r_1-u^*\|$, and
$$C:=\sqrt{\left(8\sigma_H^2+\Big(2+\frac{1}{1-q}\Big)\sigma_G^2\right)\sum_{i=1}^t\gamma_i^2}.$$

4.4 Convergence Analysis

In this section, we focus on proving the main convergence results of Sections 4.2 and 4.3, namely, Theorems 4.1, 4.2, 4.3 and 4.4.

4.4.1 Convergence Analysis for Deterministic AC-PM

In this subsection, we prove Theorems 4.1 and 4.2 of Section 4.2, which are the main convergence properties of Algorithm 4.1 for solving the deterministic problem HVI$(Z;G,H,J)$.

To prove the convergence of the deterministic AC-PM, we first present some technical results. Propositions 4.1 and 4.2 describe important properties of the prox-mapping $P^J_r(\cdot)$ used in the two prox-mapping steps of Algorithm 4.1. Proposition 4.3 provides a recursion for the function $Q(\cdot,\cdot)$ defined in Section 4.2. With the help of Propositions 4.1, 4.2 and 4.3, we estimate a bound on $Q(\cdot,\cdot)$ in Lemma 6.

Proposition 4.1. For any $r\in Z$ and $\zeta\in E$, if $w=P^J_r(\zeta)$, then for all $u\in Z$ we have
$$\langle\zeta,w-u\rangle+J(w)-J(u)\le V(r,u)-V(r,w)-V(w,u).$$

Proof. This proposition was stated in [32]. See Lemma 2 in [32] for the proof.

Proposition 4.2. Given $r,w,y\in Z$ and $\zeta,\vartheta\in E$ that satisfy
$$w=P^J_r(\zeta),\qquad y=P^J_r(\vartheta),$$
and
$$\|\vartheta-\zeta\|^2\le L^2\|w-r\|^2+M^2,$$


then for all $u\in Z$ we have
$$\langle\vartheta,w-u\rangle+J(w)-J(u)\le V(r,u)-V(y,u)-\Big(\frac{\alpha}{2}-\frac{L^2}{2\alpha}\Big)\|r-w\|^2+\frac{M^2}{2\alpha},$$
and
$$V(y,w)\le\frac{L^2}{\alpha^2}V(r,w)+\frac{M^2}{2\alpha}.$$

Proof. Applying Proposition 4.1 to the two prox-mappings defining $w$ and $y$, for all $u\in Z$ we have
$$\langle\zeta,w-u\rangle+J(w)-J(u)\le V(r,u)-V(r,w)-V(w,u),$$
$$\langle\vartheta,y-u\rangle+J(y)-J(u)\le V(r,u)-V(r,y)-V(y,u).$$
Specifically, letting $u=y$ in the first relation, we have
$$\langle\zeta,w-y\rangle+J(w)-J(y)\le V(r,y)-V(r,w)-V(w,y).$$
Adding this inequality to the second relation above, we obtain
$$\langle\vartheta,y-u\rangle+\langle\zeta,w-y\rangle+J(w)-J(u)\le V(r,u)-V(y,u)-V(r,w)-V(w,y),$$
which is equivalent to
$$\langle\vartheta,w-u\rangle+J(w)-J(u)\le\langle\vartheta-\zeta,w-y\rangle+V(r,u)-V(y,u)-V(r,w)-V(w,y).$$
Applying the Cauchy–Schwarz inequality and Young's inequality to the right-hand side, and using the modulus bound $V(w,y)\ge\frac{\alpha}{2}\|w-y\|^2$, we get
$$\begin{aligned}\langle\vartheta,w-u\rangle+J(w)-J(u)&\le\|\vartheta-\zeta\|\,\|w-y\|+V(r,u)-V(y,u)-V(r,w)-\frac{\alpha}{2}\|w-y\|^2\\&\le\frac{1}{2\alpha}\|\vartheta-\zeta\|^2+\frac{\alpha}{2}\|w-y\|^2+V(r,u)-V(y,u)-V(r,w)-\frac{\alpha}{2}\|w-y\|^2\\&=\frac{1}{2\alpha}\|\vartheta-\zeta\|^2+V(r,u)-V(y,u)-V(r,w).\end{aligned}$$
The first claimed inequality follows immediately from the above by applying the assumed bound on $\|\vartheta-\zeta\|^2$ and the modulus bound $V(r,w)\ge\frac{\alpha}{2}\|r-w\|^2$.


Moreover, observe that if we let $u=w$ in the relation for $y$ and $u=y$ in the last inequality above, respectively, then
$$\langle\vartheta,y-w\rangle+J(y)-J(w)\le V(r,w)-V(r,y)-V(y,w),$$
$$\langle\vartheta,w-y\rangle+J(w)-J(y)\le\frac{1}{2\alpha}\|\vartheta-\zeta\|^2+V(r,y)-V(r,w).$$
Adding the two inequalities above, and using the assumed bound on $\|\vartheta-\zeta\|^2$ together with $V(r,w)\ge\frac{\alpha}{2}\|r-w\|^2$, we have
$$0\le\frac{1}{2\alpha}\|\vartheta-\zeta\|^2-V(y,w)\le\frac{L^2}{2\alpha}\|r-w\|^2+\frac{M^2}{2\alpha}-V(y,w)\le\frac{L^2}{\alpha^2}V(r,w)+\frac{M^2}{2\alpha}-V(y,w),$$
and thus the second claimed inequality holds.

Proposition 4.3. Assume that $0\le\alpha_t<1$ for all $t>1$. For any sequences $\{r_t\}_{t\ge 1},\{w_t\}_{t\ge 1}\subset Z$, if the sequences $\{w^{ag}_t\}$ and $\{w^{md}_t\}$ are generated by the two combination steps of Algorithm 4.1, then for all $u\in Z$,
$$\begin{aligned}Q(w^{ag}_{t+1},u)-(1-\alpha_t)Q(w^{ag}_t,u)\le\;&\alpha_t\langle\nabla G(w^{md}_t),w_{t+1}-u\rangle+\frac{L_G\alpha_t^2}{2}\|w_{t+1}-r_t\|^2\\&+\alpha_t\langle H(w_{t+1}),w_{t+1}-u\rangle+\alpha_tJ(w_{t+1})-\alpha_tJ(u).\end{aligned}$$

Proof. Observing from the two combination steps that $w^{ag}_{t+1}-w^{md}_t=\alpha_t(w_{t+1}-r_t)$, and using the smoothness and convexity of $G(\cdot)$, for all $u\in Z$ we get
$$\begin{aligned}G(w^{ag}_{t+1})&\le G(w^{md}_t)+\langle\nabla G(w^{md}_t),w^{ag}_{t+1}-w^{md}_t\rangle+\frac{L_G}{2}\|w^{ag}_{t+1}-w^{md}_t\|^2\\&=(1-\alpha_t)\big[G(w^{md}_t)+\langle\nabla G(w^{md}_t),w^{ag}_t-w^{md}_t\rangle\big]+\alpha_t\big[G(w^{md}_t)+\langle\nabla G(w^{md}_t),u-w^{md}_t\rangle\big]\\&\quad+\alpha_t\langle\nabla G(w^{md}_t),w_{t+1}-u\rangle+\frac{L_G\alpha_t^2}{2}\|w_{t+1}-r_t\|^2\\&\le(1-\alpha_t)G(w^{ag}_t)+\alpha_tG(u)+\alpha_t\langle\nabla G(w^{md}_t),w_{t+1}-u\rangle+\frac{L_G\alpha_t^2}{2}\|w_{t+1}-r_t\|^2.\end{aligned}$$


Applying( 4 )and( 4 )totheaboveinequality,andusingthefactthatH()ismonotone,wehaveQ(wagt+1,u))]TJ /F5 11.955 Tf 11.95 0 Td[((1)]TJ /F4 11.955 Tf 11.95 0 Td[(t)Q(wagt,u)=G(wagt+1))]TJ /F5 11.955 Tf 11.95 0 Td[((1)]TJ /F4 11.955 Tf 11.95 0 Td[(t)G(wagt))]TJ /F4 11.955 Tf 11.96 0 Td[(tG(u)+hH(u),wagt+1)]TJ /F8 11.955 Tf 11.96 0 Td[(ui)]TJ /F5 11.955 Tf 19.26 0 Td[((1)]TJ /F4 11.955 Tf 11.95 0 Td[(t)hH(u),wagt)]TJ /F8 11.955 Tf 11.95 0 Td[(ui+J(wagt+1))]TJ /F5 11.955 Tf 11.95 0 Td[((1)]TJ /F4 11.955 Tf 11.95 0 Td[(t)J(wagt))]TJ /F4 11.955 Tf 11.95 0 Td[(tJ(u)G(wagt+1))]TJ /F5 11.955 Tf 11.95 0 Td[((1)]TJ /F4 11.955 Tf 11.95 0 Td[(t)G(wagt))]TJ /F4 11.955 Tf 11.96 0 Td[(tG(u)+thH(u),wt+1)]TJ /F8 11.955 Tf 11.96 0 Td[(ui+tJ(wt+1))]TJ /F4 11.955 Tf 11.96 0 Td[(tJ(u)thrG(wmdt),wt+1)]TJ /F8 11.955 Tf 11.95 0 Td[(ui+LG2t 2kwt+1)]TJ /F8 11.955 Tf 11.96 0 Td[(rtk2+thH(wt+1),wt+1)]TJ /F8 11.955 Tf 11.95 0 Td[(ui+tJ(wt+1))]TJ /F4 11.955 Tf 11.95 0 Td[(tJ(u). ThefollowinglemmaestimatesaboundforQ(wagt+1,u),andwillbeappliedintheproofofbothTheorem 4.1 andTheorem 4.2 Lemma6. SupposethattheparametersftginAlgorithm 4.1 satises1=1and0t<1forallt>1,andletthesequencef)]TJ /F9 7.97 Tf 6.77 -1.8 Td[(tgbedenedin( 4 ).Thentheiteratesrt,wt,wagtofAlgorithm 4.1 satises1 )]TJ /F9 7.97 Tf 6.77 -1.79 Td[(tQ(wagt+1,u)Bt(u,r[t]))]TJ /F9 7.97 Tf 18.69 14.95 Td[(tXi=1i 2)]TJ /F9 7.97 Tf 13.05 -1.79 Td[(ii)]TJ /F8 11.955 Tf 11.96 0 Td[(LGii)]TJ /F8 11.955 Tf 13.15 8.09 Td[(L2H2i kri)]TJ /F8 11.955 Tf 11.96 0 Td[(wi+1k2,8u2Z, (4)wherer[t]:=frigt+1i=1,andBt(u,r[t]):=tXi=1i )]TJ /F9 7.97 Tf 6.78 -1.8 Td[(ii(V(ri,u))]TJ /F8 11.955 Tf 11.96 0 Td[(V(ri+1,u)). (4) Proof. 
First, applying Proposition 4.2 to iterations ( 4 ) and ( 4 ) by setting $r=r_t$, $w=w_{t+1}$, $y=r_{t+1}$, $\vartheta=\eta_t H(r_t)+\eta_t\nabla G(w^{md}_t)$, $\varsigma=\eta_t H(w_{t+1})+\eta_t\nabla G(w^{md}_t)$, $J=\eta_t J$, $L=L_H\eta_t$ and $M=0$, then from ( 4 ) we have that for any $u\in Z$,
\[
\eta_t\langle H(w_{t+1})+\nabla G(w^{md}_t), w_{t+1}-u\rangle + \eta_t J(w_{t+1}) - \eta_t J(u) \le V(r_t,u) - V(r_{t+1},u) - \Bigl(\frac12 - \frac{L_H^2\eta_t^2}{2}\Bigr)\|r_t-w_{t+1}\|^2.
\]


Now applying the above inequality to ( 4 ) in Proposition 4.3, we have
\begin{align*}
Q(w^{ag}_{t+1},u) - (1-\alpha_t)Q(w^{ag}_t,u) &\le \alpha_t\langle\nabla G(w^{md}_t), w_{t+1}-u\rangle + \tfrac{L_G\alpha_t^2}{2}\|w_{t+1}-r_t\|^2 + \alpha_t\langle H(w_{t+1}), w_{t+1}-u\rangle + \alpha_t J(w_{t+1}) - \alpha_t J(u) \\
&\le \frac{\alpha_t}{\eta_t}\bigl[V(r_t,u)-V(r_{t+1},u)\bigr] - \frac{\alpha_t}{2\eta_t}\bigl(1 - L_G\alpha_t\eta_t - L_H^2\eta_t^2\bigr)\|r_t-w_{t+1}\|^2,
\end{align*}
thus
\[
\frac{1}{\Gamma_t}Q(w^{ag}_{t+1},u) - \frac{1-\alpha_t}{\Gamma_t}Q(w^{ag}_t,u) \le \frac{\alpha_t}{\Gamma_t\eta_t}\bigl[V(r_t,u)-V(r_{t+1},u)\bigr] - \frac{\alpha_t}{2\Gamma_t\eta_t}\bigl(1 - L_G\alpha_t\eta_t - L_H^2\eta_t^2\bigr)\|r_t-w_{t+1}\|^2.
\]
Noticing that $\alpha_1=1$, and in addition, in view of ( 4 ) we get $\frac{1-\alpha_t}{\Gamma_t}=\frac{1}{\Gamma_{t-1}}$ when $t>1$. Hence we can apply the above inequality recursively to get ( 4 ).
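The recursive step at the end of the proof (divide by $\Gamma_t$, use $(1-\alpha_t)/\Gamma_t = 1/\Gamma_{t-1}$, and sum) can be sanity-checked numerically. The sketch below assumes arbitrary illustrative sequences $\{\alpha_t\}$ and $\{b_t\}$; it is not part of the dissertation's algorithm, only a check of the telescoping pattern.

```python
import random

# Minimal numeric check of the telescoping used above: if
#   a_{t+1} <= (1 - alpha_t) * a_t + b_t   with alpha_1 = 1,
#   Gamma_1 = 1 and Gamma_t = (1 - alpha_t) * Gamma_{t-1} for t > 1,
# then dividing by Gamma_t and summing gives a_{t+1} / Gamma_t <= sum_i b_i / Gamma_i.

def check_recursion(t_max=40, seed=1):
    rng = random.Random(seed)
    alpha, Gamma = {1: 1.0}, {1: 1.0}
    for t in range(2, t_max + 1):
        alpha[t] = rng.uniform(0.0, 0.99)        # any 0 <= alpha_t < 1
        Gamma[t] = (1.0 - alpha[t]) * Gamma[t - 1]
    b = {t: rng.uniform(0.0, 1.0) for t in range(1, t_max + 1)}
    a = {1: rng.uniform(0.0, 1.0)}
    for t in range(1, t_max + 1):                # run the recursion with equality (tight case)
        a[t + 1] = (1.0 - alpha[t]) * a[t] + b[t]
    for t in range(1, t_max + 1):
        lhs = a[t + 1] / Gamma[t]
        rhs = sum(b[i] / Gamma[i] for i in range(1, t + 1))
        assert lhs <= rhs + 1e-9 * rhs           # telescoped bound holds (with equality here)
    return True

assert check_recursion()
```

Note that $\alpha_1 = 1$ makes the initial value $a_1$ drop out of the telescoped sum, which is exactly why the recursion in the proof can start from an arbitrary first iterate.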
We are now ready to prove Theorem 4.1, which describes the convergence property of deterministic AC-PM when $Z$ is bounded. This follows immediately from Lemma 6.

Proof of Theorem 4.1. In view of ( 4 ) and ( 4 ), to prove ( 4 ) it suffices to show that $B_t(u,r_{[t]}) \le \frac{\alpha_t}{\Gamma_t\eta_t}\Omega_Z^2$ for all $u\in Z$. In fact, for any sequence $\{r_i\}_{i=1}^{t+1}$ in the bounded set $Z$, applying ( 4 ) and ( 4 ) to ( 4 ) we have
\[
B_t(u,r_{[t]}) \le \frac{1}{\Gamma_1\eta_1}V(r_1,u) - \sum_{i=1}^{t-1}\Bigl(\frac{\alpha_i}{\Gamma_i\eta_i} - \frac{\alpha_{i+1}}{\Gamma_{i+1}\eta_{i+1}}\Bigr)V(r_{i+1},u) \le \frac{1}{\Gamma_1\eta_1}\Omega_Z^2 - \sum_{i=1}^{t-1}\Bigl(\frac{\alpha_i}{\Gamma_i\eta_i} - \frac{\alpha_{i+1}}{\Gamma_{i+1}\eta_{i+1}}\Bigr)\Omega_Z^2 = \frac{\alpha_t}{\Gamma_t\eta_t}\Omega_Z^2, \quad\forall u\in Z, \tag{4}
\]
thus ( 4 ) holds.

In the remaining part of this subsection, we will focus on proving Theorem 4.2, which summarizes the convergence property of deterministic AC-PM when $Z$ is unbounded.
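The key step of the proof above, namely that the partial-summation bound on $B_t$ telescopes to a single term when the coefficients $\beta_i = \alpha_i/(\Gamma_i\eta_i)$ are non-decreasing and $0 \le V \le \Omega_Z^2$, can be checked numerically. The concrete choices below ($\beta_i = i$, $\Omega_Z = 1$, random $V$ values) are illustrative assumptions.

```python
import random

# Numeric check: if beta_1 <= beta_2 <= ... and 0 <= V_i <= omega2, then
#   sum_{i=1}^t beta_i * (V_i - V_{i+1}) <= beta_t * omega2,
# which is the telescoping bound on B_t used in the proof of Theorem 4.1.

def telescoped_bound_holds(t, omega2=1.0, seed=0):
    rng = random.Random(seed)
    beta = [float(i) for i in range(1, t + 1)]            # any non-decreasing positive sequence
    V = [rng.uniform(0.0, omega2) for _ in range(t + 1)]  # V_1, ..., V_{t+1}, all in [0, omega2]
    lhs = sum(beta[i] * (V[i] - V[i + 1]) for i in range(t))
    return lhs <= beta[-1] * omega2 + 1e-12

assert all(telescoped_bound_holds(t, seed=s) for t in (1, 5, 50) for s in range(20))
```

Rearranging the sum by parts, $\sum_i \beta_i(V_i - V_{i+1}) = \beta_1 V_1 + \sum_{i=2}^t(\beta_i-\beta_{i-1})V_i - \beta_t V_{t+1}$, and bounding each $V$ by $\Omega_Z^2$ gives exactly $\beta_t\Omega_Z^2$, which is why the inequality never fails in the simulation.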


Proof of Theorem 4.2. By the assumption of the theorem, and applying ( 4 ) to ( 4 ), we get
\[
B_t(u,r_{[t]}) = \frac{\alpha_t}{2\Gamma_t\eta_t}\|r_1-u\|^2 - \frac{\alpha_t}{2\Gamma_t\eta_t}\|r_{t+1}-u\|^2.
\]
Applying ( 4 ) and the above inequality to ( 4 ) in Lemma 6, we get
\[
Q(w^{ag}_{t+1},u) \le \frac{\alpha_t}{2\eta_t}\|r_1-u\|^2 - \frac{\alpha_t}{2\eta_t}\|r_{t+1}-u\|^2 - \frac{\alpha_t}{2\eta_t}\sum_{i=1}^t\bigl(1-c^2\bigr)\|r_i-w_{i+1}\|^2. \tag{4}
\]
Noticing that
\[
\frac12\|r_1-u\|^2 - \frac12\|r_{t+1}-u\|^2 = \frac12\|r_1\|^2 - \frac12\|r_{t+1}\|^2 - \langle r_1-r_{t+1},u\rangle = \frac12\|r_1-w^{ag}_{t+1}\|^2 - \frac12\|r_{t+1}-w^{ag}_{t+1}\|^2 + \langle r_1-r_{t+1}, w^{ag}_{t+1}-u\rangle, \tag{4}
\]
and applying the above observation to ( 4 ), we get
\[
Q(w^{ag}_{t+1},u) - \frac{\alpha_t}{\eta_t}\langle r_1-r_{t+1}, w^{ag}_{t+1}-u\rangle \le \frac{\alpha_t}{2\eta_t}\|r_1-w^{ag}_{t+1}\|^2 - \frac{\alpha_t}{2\eta_t}\|r_{t+1}-w^{ag}_{t+1}\|^2 - \frac{\alpha_t}{2\eta_t}\bigl(1-c^2\bigr)\sum_{i=1}^t\|r_i-w_{i+1}\|^2.
\]
Therefore, if we let
\[
v_{t+1} := \frac{\alpha_t}{\eta_t}(r_1-r_{t+1}), \qquad \varepsilon_{t+1} := \frac{\alpha_t}{2\eta_t}\|r_1-w^{ag}_{t+1}\|^2 - \frac{\alpha_t}{2\eta_t}\|r_{t+1}-w^{ag}_{t+1}\|^2 - \frac{\alpha_t}{2\eta_t}\bigl(1-c^2\bigr)\sum_{i=1}^t\|r_i-w_{i+1}\|^2, \tag{4}
\]
then we have $Q(w^{ag}_{t+1},u) - \langle v_{t+1}, w^{ag}_{t+1}-u\rangle \le \varepsilon_{t+1}$ for all $u\in Z$. It should be noted that $\varepsilon_{t+1}\ge 0$ holds trivially by letting $u=w^{ag}_{t+1}$ in this inequality. Hence we get $\tilde g(w^{ag}_{t+1},v_{t+1}) \le \varepsilon_{t+1}$, and it suffices to estimate bounds on $\|v_{t+1}\|$ and $\varepsilon_{t+1}$.

If there exists a strong solution $u^*$ of HVI$(Z;G,H,J)$, then $Q(w^{ag}_{t+1},u^*)\ge 0$, and by ( 4 ) we have
\[
\|r_1-u^*\|^2 - \|r_{t+1}-u^*\|^2 - \sum_{i=1}^t\bigl(1-c^2\bigr)\|r_i-w_{i+1}\|^2 \ge 0.
\]


Let $D$ be defined in ( 4 ); then we have the following two inequalities:
\[
\|r_{t+1}-u^*\| \le D, \tag{4}
\]
\[
\sum_{i=1}^t\|r_i-w_{i+1}\|^2 \le \frac{D^2}{1-c^2}. \tag{4}
\]
By ( 4 ), we can get a bound on $\|v_{t+1}\|$ immediately:
\[
\|v_{t+1}\| \le \frac{\alpha_t}{\eta_t}\bigl(\|r_1-u^*\| + \|r_{t+1}-u^*\|\bigr) \le \frac{2\alpha_t}{\eta_t}D.
\]
To estimate a bound for $\varepsilon_{t+1}$, we first explore the definition of the aggregate point $w^{ag}_{t+1}$. By ( 4 ) and ( 4 ), we have
\[
\frac{1}{\Gamma_t}w^{ag}_{t+1} = \frac{1}{\Gamma_{t-1}}w^{ag}_t + \frac{\alpha_t}{\Gamma_t}w_{t+1}, \quad\forall t>1.
\]
Using the assumption that $w^{ag}_1=w_1$, we get
\[
w^{ag}_{t+1} = \Gamma_t\sum_{i=1}^t\frac{\alpha_i}{\Gamma_i}w_{i+1}, \tag{4}
\]
where by ( 4 ) we have
\[
\Gamma_t\sum_{i=1}^t\frac{\alpha_i}{\Gamma_i} = 1. \tag{4}
\]
Therefore, $w^{ag}_{t+1}$ is a convex combination of the iterates $w_2,\ldots,w_{t+1}$. If we define another aggregate point as
\[
r^{ag}_{t+1} = \Gamma_t\sum_{i=1}^t\frac{\alpha_i}{\Gamma_i}r_{i+1}, \tag{4}
\]
then by ( 4 ) we have
\[
\|r^{ag}_{t+1}-r_1\| = \Bigl\|\Gamma_t\sum_{i=1}^t\frac{\alpha_i}{\Gamma_i}r_{i+1} - r_1\Bigr\| \le \Gamma_t\sum_{i=1}^t\frac{\alpha_i}{\Gamma_i}\|r_{i+1}-r_1\| \le \Gamma_t\sum_{i=1}^t\frac{\alpha_i}{\Gamma_i}\bigl(\|r_{i+1}-u^*\| + \|r_1-u^*\|\bigr) \le 2D.
\]
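The aggregate-point identities above can be checked numerically: unrolling the recursion $w^{ag}_{t+1} = (1-\alpha_t)w^{ag}_t + \alpha_t w_{t+1}$ reproduces the weighted sum $\Gamma_t\sum_i(\alpha_i/\Gamma_i)w_{i+1}$, whose weights sum to one. The specific choice $\alpha_t = 2/(t+1)$ is an illustrative assumption (a standard accelerated stepsize), not taken from the omitted algorithm statement.

```python
import random

# Check that, with alpha_1 = 1, Gamma_1 = 1 and Gamma_t = (1 - alpha_t) * Gamma_{t-1},
# the recursion wag_{t+1} = (1 - alpha_t) * wag_t + alpha_t * w_{t+1} unrolls to the
# convex combination  wag_{t+1} = Gamma_t * sum_i (alpha_i / Gamma_i) * w_{i+1}.

def check_aggregate(t_max=20, seed=0):
    rng = random.Random(seed)
    w = [rng.uniform(-1.0, 1.0) for _ in range(t_max + 2)]   # scalar iterates w[1..t_max+1]
    alpha, Gamma = {1: 1.0}, {1: 1.0}
    for t in range(2, t_max + 1):
        alpha[t] = 2.0 / (t + 1)                             # assumed illustrative stepsize
        Gamma[t] = (1.0 - alpha[t]) * Gamma[t - 1]
    wag = w[1]                                               # wag_1 = w_1
    for t in range(1, t_max + 1):
        wag = (1.0 - alpha[t]) * wag + alpha[t] * w[t + 1]
        coeffs = [Gamma[t] * alpha[i] / Gamma[i] for i in range(1, t + 1)]
        assert abs(sum(coeffs) - 1.0) < 1e-12                # Gamma_t * sum alpha_i/Gamma_i = 1
        unrolled = sum(Gamma[t] * alpha[i] / Gamma[i] * w[i + 1] for i in range(1, t + 1))
        assert abs(wag - unrolled) < 1e-12                   # recursion matches the unrolled sum
    return True

assert check_aggregate()
```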


By the above inequality and ( 4 ), we can estimate a bound for $\varepsilon_{t+1}$:
\begin{align*}
\varepsilon_{t+1} &\le \frac{\alpha_t}{2\eta_t}\|r_1-w^{ag}_{t+1}\|^2 - \frac{\alpha_t}{2\eta_t}\|r_{t+1}-w^{ag}_{t+1}\|^2 = \frac{\alpha_t}{2\eta_t}\bigl[2\langle r_{t+1}-r_1, w^{ag}_{t+1}-r_1\rangle - \|r_1-r_{t+1}\|^2\bigr] \\
&\le \frac{\alpha_t}{2\eta_t}\bigl[2\|r_{t+1}-r_1\|\,\|w^{ag}_{t+1}-r_1\| - \|r_1-r_{t+1}\|^2\bigr] \\
&\le \frac{\alpha_t}{2\eta_t}\bigl[2\|r_{t+1}-r_1\|\bigl(\|w^{ag}_{t+1}-r^{ag}_{t+1}\| + \|r^{ag}_{t+1}-r_1\|\bigr) - \|r_1-r_{t+1}\|^2\bigr] \le \frac{\alpha_t}{2\eta_t}\bigl[2\delta_t(\gamma_t+2D) - \delta_t^2\bigr],
\end{align*}
where we define $\delta_t := \|r_{t+1}-r_1\|$ and $\gamma_t := \|w^{ag}_{t+1}-r^{ag}_{t+1}\|$. By ( 4 ) we get $\delta_t \le \|r_{t+1}-u^*\| + \|r_1-u^*\| \le 2D$; thus for the scalar quadratic polynomial $2\delta_t(\gamma_t+2D)-\delta_t^2$ in the variable $\delta_t$, the maximum over $0\le\delta_t\le 2D$ is achieved at $\delta_t=2D$. Therefore,
\[
\varepsilon_{t+1} \le \frac{\alpha_t}{2\eta_t}\bigl(4D^2 + 4D\gamma_t\bigr) = \frac{2\alpha_t}{\eta_t}\bigl(D^2 + \gamma_t D\bigr). \tag{4}
\]
Now it suffices to estimate a bound on $\gamma_t$. Applying ( 4 ) in Proposition 4.2 with $r=r_t$, $w=w_{t+1}$, $y=r_{t+1}$, $\vartheta=\eta_t H(r_t)+\eta_t\nabla G(w^{md}_t)$, $\varsigma=\eta_t H(w_{t+1})+\eta_t\nabla G(w^{md}_t)$, $J=\eta_t J$, $L=L_H\eta_t$ and $M=0$, we have $\|r_{t+1}-w_{t+1}\|^2 \le L_H^2\eta_t^2\|r_t-w_{t+1}\|^2$. Therefore, noticing that $L_H^2\eta_t^2\le c^2$ by ( 4 ), and applying ( 4 ), ( 4 ), ( 4 ) and ( 4 ), we get
\[
\gamma_t^2 = \|w^{ag}_{t+1}-r^{ag}_{t+1}\|^2 \le \Gamma_t\sum_{i=1}^t\frac{\alpha_i}{\Gamma_i}\|w_{i+1}-r_{i+1}\|^2 \le \Gamma_t\sum_{i=1}^t\frac{\alpha_i}{\Gamma_i}c^2\|w_{i+1}-r_i\|^2 \le \Gamma_t\max_{1\le i\le t}\Bigl\{\frac{\alpha_i}{\Gamma_i}\Bigr\}\sum_{i=1}^t c^2\|w_{i+1}-r_i\|^2 \le \frac{c^2D^2}{1-c^2}\,\Gamma_t\max_{1\le i\le t}\Bigl\{\frac{\alpha_i}{\Gamma_i}\Bigr\}.
\]
Finally, applying the above inequality to ( 4 ), we get ( 4 ).

4.4.2 Convergence Analysis for Stochastic AC-PM

In this section, we prove the convergence properties of the stochastic AC-PM presented in Section 4.3, namely, Theorems 4.3 and 4.4.

Throughout this section, we will use the following notation to describe the inexactness of the first order information from $\mathcal{SO}_H$ and $\mathcal{SO}_G$. At the $t$-th iteration,


let $\mathcal{H}(r_t,\xi_{2t-1})$, $\mathcal{H}(w_{t+1},\xi_{2t})$ and $\mathcal{G}(w^{md}_t,\zeta_t)$ be the outputs of the stochastic oracles. We denote
\[
\Delta_H^{2t-1} := \mathcal{H}(r_t,\xi_{2t-1}) - H(r_t), \quad \Delta_H^{2t} := \mathcal{H}(w_{t+1},\xi_{2t}) - H(w_{t+1}), \quad \Delta_G^{t} := \mathcal{G}(w^{md}_t,\zeta_t) - \nabla G(w^{md}_t). \tag{4}
\]
To start with, we present a technical result on a bound of $Q(w^{ag}_{t+1},u)$ for all $u\in Z$. The following lemma is analogous to Lemma 6 for deterministic AC-PM, and will be applied in the proofs of Theorems 4.3 and 4.4.

Lemma 7. Suppose that the parameters $\{\alpha_t\}$ in Algorithm 4.1 satisfy $\alpha_1=1$ and $0\le\alpha_t<1$ for all $t>1$, and let the sequence $\{\Gamma_t\}$ be defined in ( 4 ). Then the iterates $r_t,w_t,w^{ag}_t$ of Algorithm 4.2 satisfy
\[
\frac{1}{\Gamma_t}Q(w^{ag}_{t+1},u) \le B_t(u,r_{[t]}) - \sum_{i=1}^t\frac{\alpha_i}{2\Gamma_i\eta_i}\bigl(q - L_G\alpha_i\eta_i - 3L_H^2\eta_i^2\bigr)\|r_i-w_{i+1}\|^2 + \sum_{i=1}^t\Lambda_i(u), \quad\forall u\in Z, \tag{4}
\]
where $B_t(u,r_{[t]})$ is defined in ( 4 ), and
\[
\Lambda_i(u) := \frac{3\alpha_i\eta_i}{2\Gamma_i}\bigl(\|\Delta_H^{2i}\|^2 + \|\Delta_H^{2i-1}\|^2\bigr) - \frac{(1-q)\alpha_i}{2\Gamma_i\eta_i}\|r_i-w_{i+1}\|^2 - \frac{\alpha_i}{\Gamma_i}\langle\Delta_H^{2i}+\Delta_G^{i}, w_{i+1}-u\rangle. \tag{4}
\]
Proof. Observe from ( 4 ) that
\[
\|\mathcal{H}(w_{t+1},\xi_{2t}) - \mathcal{H}(r_t,\xi_{2t-1})\|^2 \le \bigl(\|H(w_{t+1})-H(r_t)\| + \|\Delta_H^{2t}\| + \|\Delta_H^{2t-1}\|\bigr)^2 \le 3\bigl(\|H(w_{t+1})-H(r_t)\|^2 + \|\Delta_H^{2t}\|^2 + \|\Delta_H^{2t-1}\|^2\bigr) \le 3\bigl(L_H^2\|w_{t+1}-r_t\|^2 + \|\Delta_H^{2t}\|^2 + \|\Delta_H^{2t-1}\|^2\bigr).
\]
From the observation above, applying Proposition 4.1 to the prox-mappings ( 4 ) and ( 4 ) with the settings $r=r_t$, $w=w_{t+1}$, $y=r_{t+1}$, $\vartheta=\eta_t\mathcal{H}(r_t,\xi_{2t-1})+\eta_t\mathcal{G}(w^{md}_t,\zeta_t)$, $\varsigma=\eta_t\mathcal{H}(w_{t+1},\xi_{2t})+\eta_t\mathcal{G}(w^{md}_t,\zeta_t)$, $J=\eta_t J$, $L^2=3L_H^2\eta_t^2$ and $M^2=3\eta_t^2(\|\Delta_H^{2t}\|^2+\|\Delta_H^{2t-1}\|^2)$, then according to ( 4 ), for any $u\in Z$ we have
\[
\eta_t\langle\mathcal{H}(w_{t+1},\xi_{2t})+\mathcal{G}(w^{md}_t,\zeta_t), w_{t+1}-u\rangle + \eta_t J(w_{t+1}) - \eta_t J(u) \le V(r_t,u)-V(r_{t+1},u) - \Bigl(\frac12 - \frac{3L_H^2\eta_t^2}{2}\Bigr)\|r_t-w_{t+1}\|^2 + \frac{3\eta_t^2}{2}\bigl(\|\Delta_H^{2t}\|^2+\|\Delta_H^{2t-1}\|^2\bigr).
\]
Consequently, applying the above inequality to ( 4 ) in Proposition 4.3, in view of ( 4 ) we have
\begin{align*}
Q(w^{ag}_{t+1},u) - (1-\alpha_t)Q(w^{ag}_t,u) &\le \alpha_t\langle\nabla G(w^{md}_t),w_{t+1}-u\rangle + \frac{L_G\alpha_t^2}{2}\|w_{t+1}-r_t\|^2 + \alpha_t\langle H(w_{t+1}),w_{t+1}-u\rangle + \alpha_t J(w_{t+1}) - \alpha_t J(u) \\
&= \alpha_t\langle\mathcal{H}(w_{t+1},\xi_{2t})+\mathcal{G}(w^{md}_t,\zeta_t), w_{t+1}-u\rangle + \alpha_t J(w_{t+1}) - \alpha_t J(u) + \frac{L_G\alpha_t^2}{2}\|w_{t+1}-r_t\|^2 - \alpha_t\langle\Delta_H^{2t}+\Delta_G^{t}, w_{t+1}-u\rangle,
\end{align*}
and then
\begin{align*}
Q(w^{ag}_{t+1},u) - (1-\alpha_t)Q(w^{ag}_t,u) &\le \frac{\alpha_t}{\eta_t}\bigl(V(r_t,u)-V(r_{t+1},u)\bigr) - \frac{\alpha_t}{2\eta_t}\bigl(1 - L_G\alpha_t\eta_t - 3L_H^2\eta_t^2\bigr)\|r_t-w_{t+1}\|^2 \\
&\quad + \frac{3\alpha_t\eta_t}{2}\bigl(\|\Delta_H^{2t}\|^2+\|\Delta_H^{2t-1}\|^2\bigr) - \alpha_t\langle\Delta_H^{2t}+\Delta_G^{t},w_{t+1}-u\rangle \\
&= \frac{\alpha_t}{\eta_t}\bigl(V(r_t,u)-V(r_{t+1},u)\bigr) - \frac{\alpha_t}{2\eta_t}\bigl(q - L_G\alpha_t\eta_t - 3L_H^2\eta_t^2\bigr)\|r_t-w_{t+1}\|^2 + \Gamma_t\Lambda_t(u),
\end{align*}
which is equivalent to
\[
\frac{1}{\Gamma_t}Q(w^{ag}_{t+1},u) - \frac{1-\alpha_t}{\Gamma_t}Q(w^{ag}_t,u) \le \frac{\alpha_t}{\Gamma_t\eta_t}\bigl(V(r_t,u)-V(r_{t+1},u)\bigr) - \frac{\alpha_t}{2\Gamma_t\eta_t}\bigl(q - L_G\alpha_t\eta_t - 3L_H^2\eta_t^2\bigr)\|r_t-w_{t+1}\|^2 + \Lambda_t(u).
\]
Notice that $\alpha_1=1$, and also in view of ( 4 ) we get $\frac{1-\alpha_t}{\Gamma_t}=\frac{1}{\Gamma_{t-1}}$ when $t>1$. Hence, by applying the above inequality recursively, and noticing ( 4 ), we can get ( 4 ).


We also need the following technical result for proving Theorems 4.3 and 4.4.

Lemma 8. Suppose that the sequence $\{\alpha_t\}$ satisfies $\alpha_1=1$ and $0\le\alpha_t<1$ for all $t>1$, and the sequence $\{\Gamma_t\}$ is defined by ( 4 ). For any $w_1\in Z$ and any sequence $\{\Delta_t\}\subset\mathcal{E}$, if we define $w^v_1=w_1$ and
\[
w^v_{i+1} = \operatorname*{argmin}_{u\in Z}\;-\eta_i\langle\Delta_i,u\rangle + V(w^v_i,u), \quad\forall i\ge 1, \tag{4}
\]
where $\eta_i$ is positive for all $i$, then
\[
\sum_{i=1}^t\frac{\alpha_i}{\Gamma_i}\langle-\Delta_i, w^v_i-u\rangle \le B_t(u,w^v_{[t]}) + \sum_{i=1}^t\frac{\alpha_i\eta_i}{2\Gamma_i}\|\Delta_i\|^2, \quad\text{for all } u\in Z. \tag{4}
\]
Proof. Applying Proposition 4.1 with $r=w^v_i$, $w=w^v_{i+1}$, $\varsigma=-\eta_i\Delta_i$ and $J=0$, for all $u\in Z$ we have
\[
-\eta_i\langle\Delta_i, w^v_{i+1}-u\rangle \le V(w^v_i,u) - V(w^v_i,w^v_{i+1}) - V(w^v_{i+1},u).
\]
On the other hand, by the Cauchy-Schwarz inequality, Young's inequality and ( 4 ) we have
\[
-\eta_i\langle\Delta_i, w^v_i-w^v_{i+1}\rangle \le \eta_i\|\Delta_i\|\,\|w^v_i-w^v_{i+1}\| \le \frac{\eta_i^2}{2}\|\Delta_i\|^2 + \frac12\|w^v_i-w^v_{i+1}\|^2 \le \frac{\eta_i^2}{2}\|\Delta_i\|^2 + V(w^v_i,w^v_{i+1}).
\]
Adding the two inequalities above and multiplying by $\alpha_i/(\Gamma_i\eta_i)$, we get
\[
-\frac{\alpha_i}{\Gamma_i}\langle\Delta_i, w^v_i-u\rangle \le \frac{\alpha_i\eta_i}{2\Gamma_i}\|\Delta_i\|^2 + \frac{\alpha_i}{\Gamma_i\eta_i}\bigl(V(w^v_i,u)-V(w^v_{i+1},u)\bigr).
\]
Finally, summing from $i=1$ to $t$ and using ( 4 ), we get ( 4 ).

We are now ready to prove Theorem 4.3.

Proof of Theorem 4.3. First, applying ( 4 ) and ( 4 ) to ( 4 ) in Lemma 7, we have
\[
\frac{1}{\Gamma_t}Q(w^{ag}_{t+1},u) \le \frac{\alpha_t}{\Gamma_t\eta_t}\Omega_Z^2 + \sum_{i=1}^t\Lambda_i(u), \quad\forall u\in Z. \tag{4}
\]
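In the Euclidean case $V(r,w)=\frac12\|w-r\|^2$, the auxiliary update ( 4 ) in Lemma 8 reduces to a projected step $w^v_{i+1} = \Pi_Z(w^v_i + \eta_i\Delta_i)$. The sketch below assumes that setting, with a one-dimensional interval $Z=[lo,hi]$ (an illustrative special case, not the general prox-mapping of the dissertation), and checks the closed form against a grid search.

```python
# Euclidean instance of the auxiliary update in Lemma 8: minimizing
#   -eta * <Delta, u> + 0.5 * (u - w)^2   over an interval Z = [lo, hi]
# gives the projected (clipped) step u* = clip(w + eta * Delta, lo, hi).

def prox_step(w, eta, delta, lo=-1.0, hi=1.0):
    u = w + eta * delta
    return min(max(u, lo), hi)

def objective(u, w, eta, delta):
    return -eta * delta * u + 0.5 * (u - w) ** 2

def grid_argmin(w, eta, delta, lo=-1.0, hi=1.0, n=200001):
    # brute-force minimizer over a fine grid, used only to validate prox_step
    best_u, best_val = lo, objective(lo, w, eta, delta)
    step = (hi - lo) / (n - 1)
    for k in range(1, n):
        u = lo + k * step
        val = objective(u, w, eta, delta)
        if val < best_val:
            best_u, best_val = u, val
    return best_u

for w, eta, delta in [(0.2, 0.5, 0.3), (0.9, 1.0, 0.8), (-0.5, 2.0, -0.7)]:
    assert abs(prox_step(w, eta, delta) - grid_argmin(w, eta, delta)) < 1e-4
```

The second and third test points land on the boundary of $Z$, illustrating why the argmin in ( 4 ) stays inside the feasible set even when the unconstrained step $w+\eta\Delta$ leaves it.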


Next, we make the following observation on $\Lambda_i(u)$ in ( 4 ):
\begin{align*}
\sum_{i=1}^t\Lambda_i(u) &= \sum_{i=1}^t\frac{3\alpha_i\eta_i}{2\Gamma_i}\bigl(\|\Delta_H^{2i}\|^2+\|\Delta_H^{2i-1}\|^2\bigr) - \sum_{i=1}^t\frac{(1-q)\alpha_i}{2\Gamma_i\eta_i}\|r_i-w_{i+1}\|^2 - \sum_{i=1}^t\frac{\alpha_i}{\Gamma_i}\langle\Delta_H^{2i}+\Delta_G^{i}, w_{i+1}-u\rangle \\
&= -\sum_{i=1}^t\frac{\alpha_i}{\Gamma_i}\langle\Delta_H^{2i}+\Delta_G^{i}, w^v_i-u\rangle + \sum_{i=1}^t\frac{\alpha_i}{\Gamma_i}\Bigl(-\frac{1-q}{2\eta_i}\|r_i-w_{i+1}\|^2 - \langle\Delta_G^{i}, w_{i+1}-r_i\rangle\Bigr) \\
&\quad + \sum_{i=1}^t\frac{3\alpha_i\eta_i}{2\Gamma_i}\bigl(\|\Delta_H^{2i}\|^2+\|\Delta_H^{2i-1}\|^2\bigr) - \sum_{i=1}^t\frac{\alpha_i}{\Gamma_i}\langle\Delta_G^{i}, r_i-w^v_i\rangle - \sum_{i=1}^t\frac{\alpha_i}{\Gamma_i}\langle\Delta_H^{2i}, w_{i+1}-w^v_i\rangle.
\end{align*}
Letting $w^v_1=w_1$, and for all $i\ge 1$ defining $w^v_{i+1}$ as in ( 4 ) with the setting $\Delta_i = \Delta_H^{2i}+\Delta_G^{i}$, and applying Lemma 8 and Young's inequality to the above identity, we get
\[
\sum_{i=1}^t\Lambda_i(u) \le B_t(u,w^v_{[t]}) + U_t, \tag{4}
\]
where
\begin{align*}
U_t &:= \sum_{i=1}^t\frac{\alpha_i\eta_i}{2\Gamma_i}\|\Delta_H^{2i}+\Delta_G^{i}\|^2 + \sum_{i=1}^t\frac{\alpha_i\eta_i}{2(1-q)\Gamma_i}\|\Delta_G^{i}\|^2 + \sum_{i=1}^t\frac{3\alpha_i\eta_i}{2\Gamma_i}\bigl(\|\Delta_H^{2i}\|^2+\|\Delta_H^{2i-1}\|^2\bigr) \\
&\quad - \sum_{i=1}^t\frac{\alpha_i}{\Gamma_i}\langle\Delta_G^{i}, r_i-w^v_i\rangle - \sum_{i=1}^t\frac{\alpha_i}{\Gamma_i}\langle\Delta_H^{2i}, w_{i+1}-w^v_i\rangle. \tag{4}
\end{align*}
Consequently, we conclude from ( 4 ), ( 4 ), ( 4 ) and ( 4 ) that
\[
\frac{1}{\Gamma_t}Q(w^{ag}_{t+1},u) \le \frac{2\alpha_t}{\Gamma_t\eta_t}\Omega_Z^2 + U_t, \quad\forall u\in Z,
\]
which is equivalent to
\[
g(w^{ag}_{t+1}) \le \frac{2\alpha_t}{\eta_t}\Omega_Z^2 + \Gamma_t U_t. \tag{4}
\]
To finish the proof, it suffices to bound $U_t$, both in expectation and in probability.


We prove part a) first. Note that by our assumptions on the stochastic oracles and in view of ( 4 ), ( 4 ) and ( 4 ), during the $i$-th iteration of Algorithm 4.2 the random noise $\Delta_H^{2i}$ is independent of $w_{i+1}$ and $w^v_i$, and $\Delta_G^{i}$ is independent of $r_i$ and $w^v_i$; hence $\mathbb{E}[\langle\Delta_G^{i}, r_i-w^v_i\rangle] = \mathbb{E}[\langle\Delta_H^{2i}, w_{i+1}-w^v_i\rangle] = 0$. In addition, Assumption A1 implies that $\mathbb{E}[\|\Delta_G^{i}\|^2]\le\sigma_G^2$, $\mathbb{E}[\|\Delta_H^{2i-1}\|^2]\le\sigma_H^2$ and $\mathbb{E}[\|\Delta_H^{2i}\|^2]\le\sigma_H^2$ (noting that $\Delta_G^{i}$, $\Delta_H^{2i-1}$ and $\Delta_H^{2i}$ are independent at iteration $i$). Therefore, taking expectation in ( 4 ) we have
\begin{align*}
\mathbb{E}[U_t] &\le \mathbb{E}\Bigl[\sum_{i=1}^t\frac{\alpha_i\eta_i}{\Gamma_i}\bigl(\|\Delta_H^{2i}\|^2+\|\Delta_G^{i}\|^2\bigr) + \sum_{i=1}^t\frac{\alpha_i\eta_i}{2(1-q)\Gamma_i}\|\Delta_G^{i}\|^2 + \sum_{i=1}^t\frac{3\alpha_i\eta_i}{2\Gamma_i}\bigl(\|\Delta_H^{2i}\|^2+\|\Delta_H^{2i-1}\|^2\bigr)\Bigr] \\
&\le \sum_{i=1}^t\frac{\alpha_i\eta_i}{\Gamma_i}\Bigl[4\sigma_H^2 + \Bigl(1+\frac{1}{2(1-q)}\Bigr)\sigma_G^2\Bigr]. \tag{4}
\end{align*}
Taking expectation on both sides of ( 4 ), and using the above estimate of $\mathbb{E}[U_t]$, we obtain ( 4 ).

Next we prove part b). Observe that the sequence $\{\langle\Delta_G^{i}, r_i-w^v_i\rangle\}_{i\ge1}$ is a martingale-difference sequence and hence satisfies the large-deviation theorem (see, e.g., Lemma 2 of [ 49 ]). Therefore, using the fact that
\[
\mathbb{E}\Bigl[\exp\Bigl\{\frac{\bigl(\alpha_i\Gamma_i^{-1}\langle\Delta_G^{i}, r_i-w^v_i\rangle\bigr)^2}{2\bigl(\sigma_G\alpha_i\Gamma_i^{-1}\Omega_Z\bigr)^2}\Bigr\}\Bigr] \le \mathbb{E}\Bigl[\exp\Bigl\{\frac{\|\Delta_G^{i}\|^2\|r_i-w^v_i\|^2}{2\sigma_G^2\Omega_Z^2}\Bigr\}\Bigr] \le \mathbb{E}\bigl[\exp\bigl\{\|\Delta_G^{i}\|^2/\sigma_G^2\bigr\}\bigr] \le \exp\{1\},
\]
we conclude from the large-deviation theorem that
\[
\operatorname{Prob}\Bigl\{\sum_{i=1}^t\frac{\alpha_i}{\Gamma_i}\langle\Delta_G^{i}, r_i-w^v_i\rangle > \lambda\sigma_G\Omega_Z\sqrt{2\sum_{i=1}^t\Bigl(\frac{\alpha_i}{\Gamma_i}\Bigr)^2}\Bigr\} \le \exp\{-\lambda^2/3\}. \tag{4}
\]
By a similar argument we also have
\[
\operatorname{Prob}\Bigl\{\sum_{i=1}^t\frac{\alpha_i}{\Gamma_i}\langle\Delta_H^{2i}, w_{i+1}-w^v_i\rangle > \lambda\sigma_H\Omega_Z\sqrt{2\sum_{i=1}^t\Bigl(\frac{\alpha_i}{\Gamma_i}\Bigr)^2}\Bigr\} \le \exp\{-\lambda^2/3\}. \tag{4}
\]


Now let $S_i = \alpha_i\eta_i/\Gamma_i$ and $S = \sum_{i=1}^t S_i$. By Assumption A2 and the convexity of the exponential function,
\[
\mathbb{E}\Bigl[\exp\Bigl\{\frac1S\sum_{i=1}^t S_i\|\Delta_G^{i}\|^2/\sigma_G^2\Bigr\}\Bigr] \le \mathbb{E}\Bigl[\frac1S\sum_{i=1}^t S_i\exp\bigl\{\|\Delta_G^{i}\|^2/\sigma_G^2\bigr\}\Bigr] \le \exp\{1\}.
\]
Therefore, by Markov's inequality we have
\[
\operatorname{Prob}\Bigl\{\Bigl(1+\frac{1}{2(1-q)}\Bigr)\sum_{i=1}^t\frac{\alpha_i\eta_i}{\Gamma_i}\|\Delta_G^{i}\|^2 > (1+\lambda)\sigma_G^2\Bigl(1+\frac{1}{2(1-q)}\Bigr)\sum_{i=1}^t\frac{\alpha_i\eta_i}{\Gamma_i}\Bigr\} \le \exp\{-\lambda\}. \tag{4}
\]
Using a similar argument, we can also demonstrate that
\[
\operatorname{Prob}\Bigl\{\sum_{i=1}^t\frac{3\alpha_i\eta_i}{2\Gamma_i}\|\Delta_H^{2i-1}\|^2 > (1+\lambda)\frac{3\sigma_H^2}{2}\sum_{i=1}^t\frac{\alpha_i\eta_i}{\Gamma_i}\Bigr\} \le \exp\{-\lambda\}, \tag{4}
\]
\[
\operatorname{Prob}\Bigl\{\sum_{i=1}^t\frac{5\alpha_i\eta_i}{2\Gamma_i}\|\Delta_H^{2i}\|^2 > (1+\lambda)\frac{5\sigma_H^2}{2}\sum_{i=1}^t\frac{\alpha_i\eta_i}{\Gamma_i}\Bigr\} \le \exp\{-\lambda\}. \tag{4}
\]
Finally, we conclude from ( 4 )-( 4 ) that ( 4 ) holds.

In the remaining part of this subsection, we will focus on proving Theorem 4.4, which describes the rate of convergence of Algorithm 4.2 for solving the stochastic HVI$(Z;G,H,J)$ when $Z$ is unbounded.

Proof of Theorem 4.4. First, applying ( 4 ) and ( 4 ) to ( 4 ) in Lemma 7, we have
\[
\frac{1}{\Gamma_t}Q(w^{ag}_{t+1},u) \le B_t(u,r_{[t]}) - \frac{\alpha_t}{2\Gamma_t\eta_t}\sum_{i=1}^t\bigl(q-c^2\bigr)\|r_i-w_{i+1}\|^2 + B_t(u,w^v_{[t]}) + U_t, \quad\forall u\in Z. \tag{4}
\]
In addition, using the Euclidean assumption and applying ( 4 ) to ( 4 ), we get
\begin{align*}
B_t(u,r_{[t]}) &= \frac{\alpha_t}{2\Gamma_t\eta_t}\sum_{i=1}^t\bigl(\|r_i-u\|^2 - \|r_{i+1}-u\|^2\bigr) = \frac{\alpha_t}{2\Gamma_t\eta_t}\bigl(\|r_1-u\|^2 - \|r_{t+1}-u\|^2\bigr) \tag{4}\\
&= \frac{\alpha_t}{2\Gamma_t\eta_t}\bigl(\|r_1-w^{ag}_{t+1}\|^2 - \|r_{t+1}-w^{ag}_{t+1}\|^2 + 2\langle r_1-r_{t+1}, w^{ag}_{t+1}-u\rangle\bigr), \tag{4}
\end{align*}


where the last equality is from ( 4 ). By a similar argument, and using the setting $w^v_1=r_1$, we have
\begin{align*}
B_t(u,w^v_{[t]}) &= \frac{\alpha_t}{2\Gamma_t\eta_t}\bigl(\|r_1-u\|^2 - \|w^v_{t+1}-u\|^2\bigr) \tag{4}\\
&= \frac{\alpha_t}{2\Gamma_t\eta_t}\bigl(\|r_1-w^{ag}_{t+1}\|^2 - \|w^v_{t+1}-w^{ag}_{t+1}\|^2 + 2\langle r_1-w^v_{t+1}, w^{ag}_{t+1}-u\rangle\bigr). \tag{4}
\end{align*}
We conclude from ( 4 ), ( 4 ) and ( 4 ) that
\[
Q(w^{ag}_{t+1},u) - \langle v_{t+1}, w^{ag}_{t+1}-u\rangle \le \varepsilon_{t+1}, \tag{4}
\]
where
\[
v_{t+1} := \frac{\alpha_t}{\eta_t}\bigl(2r_1 - r_{t+1} - w^v_{t+1}\bigr), \tag{4}
\]
\[
\varepsilon_{t+1} := \frac{\alpha_t}{2\eta_t}\Bigl(2\|r_1-w^{ag}_{t+1}\|^2 - \|r_{t+1}-w^{ag}_{t+1}\|^2 - \|w^v_{t+1}-w^{ag}_{t+1}\|^2 - \sum_{i=1}^t\bigl(q-c^2\bigr)\|r_i-w_{i+1}\|^2\Bigr) + \Gamma_t U_t. \tag{4}
\]
It is easy to see that the residual $\varepsilon_{t+1}$ is positive by setting $u=w^{ag}_{t+1}$ in ( 4 ). Hence $\tilde g(w^{ag}_{t+1},v_{t+1}) \le \varepsilon_{t+1}$. To finish the proof, it suffices to estimate bounds for $\mathbb{E}[\|v_{t+1}\|]$ and $\mathbb{E}[\varepsilon_{t+1}]$.

Suppose that $u^*$ is a strong solution of HVI$(Z;G,H,J)$; then $Q(w^{ag}_{t+1},u^*)\ge 0$. Therefore, if we let $u=u^*$ in ( 4 ), ( 4 ) and ( 4 ), we can get
\[
2\|r_1-u^*\|^2 - \|r_{t+1}-u^*\|^2 - \|w^v_{t+1}-u^*\|^2 - \sum_{i=1}^t\bigl(q-c^2\bigr)\|r_i-w_{i+1}\|^2 + \frac{2\Gamma_t\eta_t}{\alpha_t}U_t \ge 0,
\]
and by using the notation $D$ defined in ( 4 ) we have
\[
\|r_{t+1}-u^*\|^2 + \|w^v_{t+1}-u^*\|^2 + \sum_{i=1}^t\bigl(q-c^2\bigr)\|r_i-w_{i+1}\|^2 \le 2D^2 + \frac{2\Gamma_t\eta_t}{\alpha_t}U_t. \tag{4}
\]


In addition, according to ( 4 ), ( 4 ) and ( 4 ) we observe that
\[
\mathbb{E}[U_t] \le \sum_{i=1}^t\frac{\alpha_i\eta_i}{\Gamma_i}\Bigl[4\sigma_H^2 + \Bigl(1+\frac{1}{2(1-q)}\Bigr)\sigma_G^2\Bigr] = \frac{\alpha_t}{2\Gamma_t\eta_t}C^2, \tag{4}
\]
where the last equality follows from the stepsize conditions of the theorem and the definition of $C$. Taking expectation in ( 4 ), and applying the inequality above, we have
\[
\mathbb{E}[\|r_{t+1}-u^*\|^2] + \mathbb{E}[\|w^v_{t+1}-u^*\|^2] + \sum_{i=1}^t\bigl(q-c^2\bigr)\mathbb{E}[\|r_i-w_{i+1}\|^2] \le 2D^2 + C^2. \tag{4}
\]
We are now ready to estimate $\mathbb{E}[\|v_{t+1}\|]$. By ( 4 ) and ( 4 ) we get
\[
\mathbb{E}[\|v_{t+1}\|^2] \le \frac{\alpha_t^2}{\eta_t^2}\mathbb{E}\bigl[2\|r_1-r_{t+1}\|^2 + 2\|r_1-w^v_{t+1}\|^2\bigr] \le \frac{\alpha_t^2}{\eta_t^2}\bigl(4D^2+2C^2\bigr),
\]
hence by Jensen's inequality we get
\[
\mathbb{E}[\|v_{t+1}\|] \le \frac{\alpha_t}{\eta_t}\sqrt{4D^2+2C^2}.
\]
Our remaining goal is to estimate a bound for $\mathbb{E}[\varepsilon_{t+1}]$. Applying Proposition 4.1 to the prox-mappings ( 4 ) and ( 4 ) with the settings $r=r_t$, $w=w_{t+1}$, $y=r_{t+1}$, $\vartheta=\eta_t\mathcal{H}(r_t,\xi_{2t-1})+\eta_t\mathcal{G}(w^{md}_t,\zeta_t)$, $\varsigma=\eta_t\mathcal{H}(w_{t+1},\xi_{2t})+\eta_t\mathcal{G}(w^{md}_t,\zeta_t)$, $J=\eta_t J$, $L^2=3L_H^2\eta_t^2$ and $M^2=3\eta_t^2(\|\Delta_H^{2t}\|^2+\|\Delta_H^{2t-1}\|^2)$, then according to ( 4 ) we have
\[
\frac12\|r_{t+1}-w_{t+1}\|^2 \le \frac{3L_H^2\eta_t^2}{2}\|r_t-w_{t+1}\|^2 + \frac{3\eta_t^2}{2}\bigl(\|\Delta_H^{2t}\|^2+\|\Delta_H^{2t-1}\|^2\bigr) \le \frac{c^2}{2}\|r_t-w_{t+1}\|^2 + \frac{3\eta_t^2}{2}\bigl(\|\Delta_H^{2t}\|^2+\|\Delta_H^{2t-1}\|^2\bigr),
\]


where the last inequality is from ( 4 ). Now, defining $r^{ag}_{t+1}$ by ( 4 ), and by ( 4 ), ( 4 ), ( 4 ), ( 4 ) and the inequality above, we have
\begin{align*}
\varepsilon_{t+1} - \Gamma_t U_t &\le \frac{\alpha_t}{\eta_t}\|r_1-w^{ag}_{t+1}\|^2 \le \frac{\alpha_t}{\eta_t}\bigl(\|r_1-u^*\| + \|u^*-r^{ag}_{t+1}\| + \|r^{ag}_{t+1}-w^{ag}_{t+1}\|\bigr)^2 \\
&\le \frac{3\alpha_t}{\eta_t}\bigl(\|r_1-u^*\|^2 + \|u^*-r^{ag}_{t+1}\|^2 + \|r^{ag}_{t+1}-w^{ag}_{t+1}\|^2\bigr) \\
&\le \frac{3\alpha_t}{\eta_t}\Bigl[D^2 + \Gamma_t\sum_{i=1}^t\frac{\alpha_i}{\Gamma_i}\bigl(\|r_{i+1}-u^*\|^2 + \|w_{i+1}-r_{i+1}\|^2\bigr)\Bigr] \\
&\le \frac{3\alpha_t}{\eta_t}\Bigl[D^2 + \Gamma_t\sum_{i=1}^t\frac{\alpha_i}{\Gamma_i}\bigl(\|r_{i+1}-u^*\|^2 + c^2\|w_{i+1}-r_i\|^2 + 3\eta_i^2(\|\Delta_H^{2i}\|^2+\|\Delta_H^{2i-1}\|^2)\bigr)\Bigr] \\
&\le \frac{3\alpha_t}{\eta_t}\Bigl[D^2 + \Gamma_t\sum_{i=1}^t\frac{\alpha_i}{\Gamma_i}\bigl(\|r_{i+1}-u^*\|^2 + (q-c^2)\|w_{i+1}-r_i\|^2\bigr)\max\Bigl\{1,\frac{c^2}{q-c^2}\Bigr\} + \Gamma_t\sum_{i=1}^t\frac{3\alpha_i\eta_i^2}{\Gamma_i}\bigl(\|\Delta_H^{2i}\|^2+\|\Delta_H^{2i-1}\|^2\bigr)\Bigr] \\
&\le \frac{3\alpha_t}{\eta_t}\Bigl[D^2 + \Bigl(2D^2+\frac{2\Gamma_t\eta_t}{\alpha_t}U_t\Bigr)\max\Bigl\{1,\frac{c^2}{q-c^2}\Bigr\} + \Gamma_t\sum_{i=1}^t\frac{3\alpha_i\eta_i^2}{\Gamma_i}\bigl(\|\Delta_H^{2i}\|^2+\|\Delta_H^{2i-1}\|^2\bigr)\Bigr].
\end{align*}
Taking expectation in the above inequality, applying ( 4 ), ( 4 ) and ( 4 ), and by Assumption A1, we have
\[
\mathbb{E}[\varepsilon_{t+1}] \le \frac{3\alpha_t}{\eta_t}\Bigl[D^2 + \bigl(2D^2+C^2\bigr)\max\Bigl\{1,\frac{c^2}{q-c^2}\Bigr\}\Bigr] + \frac{18\alpha_t\sigma_H^2\Gamma_t}{\eta_t}\sum_{i=1}^t\frac{\alpha_i\eta_i^2}{\Gamma_i} + \frac{\alpha_t}{2\eta_t}C^2.
\]
4.5 Concluding Remarks of This Chapter

We present a novel accelerated prox-method (AC-PM) for solving a class of deterministic and stochastic hemivariational inequality (HVI) problems. The basic idea of this algorithm is to incorporate a multi-step acceleration scheme into the prox-method in [ 43 65 ]. For both the deterministic and stochastic HVI, AC-PM achieves the optimal rate of convergence, not only in terms of its dependence on the number of iterations, but also on a variety of problem parameters. To the best of our knowledge, this is the first time that such an optimal algorithm has been developed for both deterministic


and stochastic HVI in the literature. Furthermore, we show that the developed AC-PM scheme can deal with the situation when the feasible region is unbounded, as long as a strong solution of the HVI exists. In the unbounded case, we incorporate the modified termination criterion employed by Monteiro and Svaiter in solving HVI problems posed as monotone inclusions, and demonstrate that the rate of convergence of AC-PM depends on the distance from the initial point to the set of strong solutions. In particular, in the unbounded case of the deterministic HVI, the AC-PM scheme achieves the optimal rate without requiring any knowledge of the distance from the initial point to the set of strong solutions.


CHAPTER 5
DIFFUSION WEIGHTED IMAGING

5.1 Introduction

Diffusion Weighted Magnetic Resonance Imaging (DW-MRI, or DWI for short) has been widely implemented as a non-invasive method to quantify water diffusion in tissues. Under the hypothesis that the preferred orientations of water diffusion coincide with the fiber directions, DWI can determine the directionality of neuronal fiber bundles, which yields information on structural connections in the brain [ 6 42 59 60 ].

Water diffusion within tissue depends on the microstructure of the tissue. The average water diffusion probability density function (PDF) $P(\mathbf{r})$ at a specific voxel, for a displacement $\mathbf{r}$ over an experimental diffusion time, is related to the DWI measurements $S(\mathbf{q})$ [ 86 ] by a Fourier transform
\[
S(\mathbf{q}) = S_0\int_{\mathbb{R}^3}P(\mathbf{r})\,e^{-i\mathbf{q}\cdot\mathbf{r}}\,d\mathbf{r}, \tag{5}
\]
where $S(\mathbf{q})$ is the attenuation of the MR signal with respect to the diffusion sensitizing gradient $\mathbf{q}$, and $S_0$ is the MRI signal in the absence of any gradient. The PDF $P(\mathbf{r})$ provides valuable information on the tissue microstructure. Since water diffusion is more likely to happen along the direction of fiber tissue, the direction $\mathbf{r}$ of maximum diffusion probability $P(\mathbf{r})$ coincides closely with the fiber direction. However, for in vivo applications it is not feasible to reconstruct the diffusion PDF $P(\mathbf{r})$ from the MR signals $S(\mathbf{q})/S_0$ using the complex Fourier transform, since a stable inverse Fourier transform requires a large number of measurements of $S(\mathbf{q})$ over a wide range of $\mathbf{q}\in\mathbb{R}^3$.

Diffusion Tensor Imaging (DTI) is a well-known classical MRI technique used to explore fiber tissue information in the brain. There has been a large amount of work on DTI that employs a second order, positive definite, symmetric diffusion tensor $D$ to represent the local tissue structure [ 5 7 ]. DTI implicitly assumes that the probability


density function of the displacement of water diffusion is Gaussian with mean zero and covariance matrix $D$. The fractional anisotropy (FA), defined using the eigenvalues of $D$, has become the most widely used measure of diffusion anisotropy in white matter. DTI has been shown to be a valuable tool in handling voxels with only one fiber, and studies have shown the increasing clinical utility of DTI in the investigation of neuronal axon fiber integrity in white brain matter. However, it has been recognized that the single Gaussian model is inappropriate for assessing multiple fiber tract orientations when complex tissue structure is found within a voxel [ 29 94 97 ].

In order to overcome these difficulties several approaches have been taken. Tuch et al. proposed the high angular resolution diffusion imaging (HARDI) method, in which the acquisition makes the diffusion sensitizing gradients sample on the surface of a sphere [ 93 94 ]. In [ 92 ] Tuch introduced Q-ball imaging (QBI), which is a HARDI technique, and used the orientation distribution function (ODF) to describe the orientational structure of fiber tissue. The local maxima of the ODFs imply the most probable fiber directions. In deterministic fiber tracing methods, such as streamline algorithms, the local maxima of the ODFs are taken as the fiber directions. In statistical fiber tracing methods, such as Markov Chain Monte Carlo (MCMC) based algorithms, the ODFs can be used as the probability density functions of the fiber orientation.

The original definition of the ODF is of the form
\[
\psi_1(\mathbf{u}) = \frac{1}{Z}\int_0^\infty P(r\mathbf{u})\,dr,
\]
where $P(r\mathbf{u})$ is the same as in equation ( 5 ), $r=|\mathbf{r}|$, and $\mathbf{u}=\mathbf{r}/r$. With a proper normalization constant $Z$, the ODF $\psi_1(\mathbf{u})$ is a probability density function defined on the unit sphere. Tuch also showed in [ 92 ] that the ODF can be approximated directly from the raw HARDI signal $S(\mathbf{u})$ on a single unit sphere of q-space by the Funk-Radon


transform (FRT) $\mathcal{G}$:
\[
\psi_1(\mathbf{u}) \approx \frac{1}{Z}\,\mathcal{G}[S](\mathbf{u}),
\]
where $\mathcal{G}[S](\mathbf{u})$ is defined as
\[
\mathcal{G}[S](\mathbf{u}) = \int_{|\mathbf{w}|=1}\delta(\mathbf{u}^T\mathbf{w})\,S(\mathbf{w})\,d\mathbf{w},
\]
and $\delta$ is the Dirac delta function.

In [ 1 89 ] it is pointed out that, if we represent the orientation of the unit vector $\mathbf{u}$ using spherical coordinates $(\theta,\phi)$, then
\[
\int_{\mathbb{R}^3}P(\mathbf{r})\,d\mathbf{r} = \int_0^\pi\!\!\int_0^{2\pi}\!\!\int_0^\infty P(r\mathbf{u})\,r^2\sin(\theta)\,dr\,d\phi\,d\theta = \int_0^\pi\!\!\int_0^{2\pi}\Bigl(\int_0^\infty P(r\mathbf{u})\,r^2\,dr\Bigr)\sin(\theta)\,d\phi\,d\theta,
\]
and thus the marginal PDF on the unit sphere should be represented by
\[
\psi_2(\mathbf{u}) = \int_0^\infty P(r\mathbf{u})\,r^2\,dr.
\]
The definition of $\psi_2(\mathbf{u})$ was actually proposed earlier, in Wedeen et al. [ 96 ], as a weighted radial summation. Compared to $\psi_2(\mathbf{u})$, the definition of $\psi_1$ drops the Jacobian factor $r^2$, so $\psi_1$ does not represent a true probability density function, and in practice the orientation information is blurred in the ODF estimation by $\psi_1$. On the other hand, since $\psi_2$ is a probability density function, it does not require the normalization factor $Z$ anymore.

Tristan-Vega et al. [ 89 ] use the properties of the Fourier transform and propose to estimate $\psi_2(\mathbf{u})$ based on
\[
\psi_2(\mathbf{u}) \approx C\,\mathcal{G}\Bigl[\frac{\Delta S(\mathbf{q})}{S_0}\Bigr](\mathbf{u}) = C\,\mathcal{G}\Bigl[\frac{1}{q^2S_0}\frac{\partial}{\partial q}\Bigl(q^2\frac{\partial S(\mathbf{q})}{\partial q}\Bigr) + \frac{1}{q^2S_0}\Delta_b S(\mathbf{q})\Bigr](\mathbf{u}),
\]


where $C$ is a constant, $q=|\mathbf{q}|$, and $\Delta_b$ is the Laplace-Beltrami operator. Aganj et al. [ 1 ] showed that
\[
\mathcal{G}\Bigl[\frac{1}{q^2S_0}\frac{\partial}{\partial q}\Bigl(q^2\frac{\partial S(\mathbf{q})}{\partial q}\Bigr)\Bigr] \approx -2,
\]
and developed a simple relationship between $\psi_2(\mathbf{u})$ and the signal intensity on the unit sphere:
\[
\psi_2(\mathbf{u}) \approx \frac{1}{4\pi} + \frac{1}{16\pi^2}\,\mathcal{G}\bigl[\Delta_b\tilde S\bigr](\mathbf{u}), \tag{5}
\]
where $\tilde S(\mathbf{q}) = \ln\bigl(-\ln\bigl(S(\mathbf{q})/S_0\bigr)\bigr)$.

In equation ( 5 ), the ODF $\psi_2(\mathbf{u})$ is estimated in each individual voxel, and no connection between neighboring points is considered. This can result in errors in the ODF estimation when the data is noisy. There has been some work on the spatial regularization of ODF results. H-E. Assemlal et al. [ 3 ] presented a variational framework for $\psi_1(\mathbf{u})$. The model in their work is adaptable to the Rician distribution of MRI noise and is able to use neighboring information through total variation (TV) based minimization. Similar methods have been proposed for the regularization of DTI [ 20 81 90 ], the apparent diffusion coefficient (ADC) [ 18 ], and HARDI data [ 60 ]. However, there is still difficulty in incorporating TV based regularization into ODF estimation. One big problem is the computational complexity caused by the non-differentiability of the TV norm. In many TV based smoothing algorithms a regularized TV norm is used to avoid non-smoothness problems, so that gradient descent methods can be applied. The drawback of using a regularized TV norm is that the result is sensitive to the regularization parameter, and it takes longer to reach convergence.

Recently, several methods have been developed to solve the TV denoising problem efficiently with the exact (not approximated) TV norm. They include using the dual formulation [ 16 ], variable splitting and continuation [ 95 99 ], split Bregman [ 37 ], primal-dual formulations [ 15 28 102 103 ], and various forms of operator splitting [ 55 57 ]. Other

PAGE 160

alternativestoTVbasedregularizersarealsoconsideredformagneticresonanceimagereconstruction.OneofthemistheuseofL1sparsityunderawavelettransform.IthasbeenexploitedthatMRimagesaresparsebothinthespatialnitedifferencesdomainandunderwavelettransform[ 56 ].ThesepropertieshavebeensuccessfullyappliedinMRreconstructionsincompressivesensing[ 56 ]. InthisthesiswefocusonthejointestimationandregularizationoftheODF2.ThepurposeofthisthesisistoprovideaframeworkthatsimultaneouslyestimateandsmooththeODFsfromtheHARDIdata,andafastrobustnumericalalgorithmtogetthemodelsolutions.Inspiredbythepreviousworkontheregularizationfor1,weapplytheangularandspatialregularizationframeworkontheODFmodelfor2,whichhasnotbeenimplementedpreviously.Furthermore,wearetherstonesthatconsiderthecombinationoftotalvariationandwaveletbasedregularizationasthespatialregularizationonODF.Wealsoadapttheprimal-dualnumericalalgorithmforsolvingcombinedtotalvariationandwaveletbasedregularizationmodelsintheestimationoftheODF.Moreover,unliketheworkin[ 1 23 25 89 92 ],wheretheestimationofODFisdoneafterthereconstructionofSisperformed,weintroduceadirectestimationandsmoothingmodelonODFsinthehopetoreducetheaccumulationofestimationerrorinthecalculation. Experimentalresultsandcomparisonsprovidedinthisworkindicatetheefciencyoftheproposedmethod. 5.2SpericalHarmonicSeriesforODFReconstruction TheQ-BallImagingschemeforsolvingODF1in[ 92 ]requiresveryhighloadofcalculationontheFunk-RadonTransform(FRT).AsimplifciationwasprovidedbyDescoteauxetal.[ 24 ],inwhichtheHARDIdataisrepresentedbysphericalharmonicseries(SHS).ByintroducingSHS,thecalculationoftheFRTismuchsimpler.Descoteauxetal.implementedtheSHSonthecalculationof1[ 23 25 ].TheuseofSHSisalsoappliedbyAganjetal.andVegaetal.[ 1 89 ]intheestimationofODF2. 160


5.2.1 Spherical Harmonic Series

A spherical harmonic function, denoted as Y_l^m(θ, φ), is of the form

  Y_l^m(θ, φ) = sqrt( (2l+1)/(4π) · (l−m)!/(l+m)! ) P_l^m(cos θ) e^{imφ},

where P_l^m is the associated Legendre polynomial. Here l is called the order of the spherical harmonic function, m is the phase factor, and m = −l, ..., 0, ..., l. The function Y_l^m(θ, φ) is defined on the unit sphere, and the set of all spherical harmonic functions is an orthonormal basis of the complex functions defined on the unit sphere.

For real functions defined on the unit sphere, the orthogonal set of the real spherical harmonic basis is usually used. For l an even number, choosing k = 0, 2, 4, ..., l and m = −k, ..., 0, ..., k, a modified spherical harmonic basis Y_j can be defined by

  Y_j = √2 Re(Y_k^m)  if −k ≤ m < 0,
  Y_j = Y_k^0         if m = 0,
  Y_j = √2 Im(Y_k^m)  if 0 < m ≤ k.

An important property of the spherical harmonics is that they are eigenfunctions of the Laplace-Beltrami operator Δ_b:

  Δ_b Y_j(u) = −l_j(l_j + 1) Y_j(u),

where l_j is the order of Y_j(u). Descoteaux et al. [24] used this property in the angular regularization of the ODF ψ₁. Aganj et al. [1] applied this property in the process of deriving the analytical solution of equation (5.1) for ψ₂.

5.2.2 SHS Approximation of the Funk-Radon Transform

In diffusion MRI, at a fixed voxel, the acquired signal intensities S(u) and the Funk-Radon transform G[S](u) are real-valued functions defined on the unit sphere, and thus they can be approximated by the real spherical harmonic basis. Descoteaux et al. [24] proved that if S(u) can be approximated as

  S(u) = Σ_{j=1}^R c_j Y_j(u),

then the Funk-Radon transform G[S](u) can be approximated by

  G[S](u) = Σ_{j=1}^R 2π P_{l_j}(0) c_j Y_j(u),

where l_j is the order of the modified spherical harmonic function Y_j, and P_{l_j}(0) is the Legendre polynomial of degree l_j evaluated at 0, i.e.,

  P_{l_j}(0) = 0                                              if l_j is odd,
  P_{l_j}(0) = (−1)^{l_j/2} · (1·3·5···(l_j−1)) / (2·4·6···l_j)  if l_j is even.

Aganj et al. [1] showed that if the signal is represented using the SHS, i.e.,

  S̃(u) = ln(−ln(S(u)/S₀)) = Σ_{j=1}^R c_j Y_j(u),    (5.2)

then

  ψ₂(u) = Σ_{j=1}^R a_j Y_j(u),    (5.3)

where

  a_j = 1/(2√π)                              if j = 1,
  a_j = −(1/(8π)) P_{l_j}(0) l_j(l_j+1) c_j   if j > 1.    (5.4)
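The mapping above between the signal SHS coefficients c_j and the ODF coefficients a_j is straightforward to compute once the orders l_j are known. A minimal sketch in Python, using the closed form of P_l(0) given above (the function names are illustrative only, not from the dissertation's implementation):

```python
import math

def legendre_at_zero(l):
    # P_l(0): 0 for odd l, (-1)^(l/2) * (1*3*...*(l-1)) / (2*4*...*l) for even l.
    if l % 2 == 1:
        return 0.0
    sign = (-1.0) ** (l // 2)
    num = 1.0
    for k in range(1, l, 2):      # 1 * 3 * ... * (l - 1)
        num *= k
    den = 1.0
    for k in range(2, l + 1, 2):  # 2 * 4 * ... * l
        den *= k
    return sign * num / den

def odf_coefficients(c, orders):
    """Map SHS coefficients c_j of S~ to ODF coefficients a_j,
    with a_1 = 1/(2*sqrt(pi)) fixed since Y_1 is constant."""
    a = [1.0 / (2.0 * math.sqrt(math.pi))]
    for cj, lj in zip(c[1:], orders[1:]):
        a.append(-legendre_at_zero(lj) * lj * (lj + 1) * cj / (8.0 * math.pi))
    return a
```

For example, with c = [1.0, 1.0] and orders [0, 2], the second ODF coefficient evaluates to 3/(8π).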


In equation (5.4), a₁ = 1/(2√π) is due to the fact that Y₁(u) ≡ 1/(2√π). In fact, from equation (5.1),

  Σ_{j=2}^R a_j Y_j(u) = 1/(16π²) G[Δ_b S̃](u).

Aganj et al. [1] propose to estimate ψ₂(u) in two steps: first estimate the SHS coefficients c_j of S̃(u) by least squares from equation (5.2), and then calculate the SHS coefficients a_j of ψ₂(u) by the relationship in equation (5.4).

5.3 Model Description

In this section we present a model that is able to simultaneously estimate and smooth the ODF ψ₂(u), where the smoothing is performed with respect to both the spatial variable x and the angular variables (θ, φ).

The data fidelity term in the proposed model is based on equation (5.4). Instead of voxel-by-voxel least squares fitting of S̃(u) as in [1], we start by assuming that

  ψ₂(x, u) = Σ_{j=1}^R a_j(x) Y_j(u),  x ∈ Ω,    (5.5)

where Ω is the image domain. The goal is simultaneously estimating and regularizing ψ₂(x, u) from the data S(x, u) and S₀(x). By the linear expansion described in equation (5.5), this problem reduces to the estimation and regularization of the coefficients a_j(x), where j = 1, 2, ..., R and x ∈ Ω. Moreover, from equation (5.4) we already have a₁(x) ≡ 1/(2√π), ∀x ∈ Ω.

Our model consists of four terms: a least squares energy, an angular regularization energy, a total variation regularization energy, and a wavelet L1 sparsity regularization energy.

5.3.1 Least Squares Energy

We first present a least squares type energy for the estimation of the coefficients a_j(x) in this subsection, as the data fitting term in our energy functional. Using the relations in equations (5.2), (5.4), and (5.5), we have

  S̃(x, u) = c₁(x) Y₁(u) − Σ_{j=2}^R a_j(x) · 8π [P_{l_j}(0) l_j(l_j+1)]⁻¹ Y_j(u),

where c₁(x) is the coefficient of Y₁(u) in the SHS representation of S̃(x, u). In fact, from the orthogonality of the real SHS, we have Y₁(u) ≡ 1/(2√π) and

  ∫_{∂B₁} Y_j(u) du = 0,  ∀j > 1,

therefore

  c₁(x) Y₁(u) ≡ 1/(4π) ∫_{∂B₁} S̃(x, u) du,

and

  1/(4π) ∫_{∂B₁} S̃(x, u) du − S̃(x, u) = Σ_{j=2}^R a_j(x) · 8π [P_{l_j}(0) l_j(l_j+1)]⁻¹ Y_j(u).    (5.6)

Now if we let

  F(x, u) = 1/(4π) ∫_{∂B₁} S̃(x, u) du − S̃(x, u),

and let

  Ỹ_j(u) = 8π [P_{l_j}(0) l_j(l_j+1)]⁻¹ Y_j(u),

then from equation (5.6) we have

  F(x, u) = Σ_{j=2}^R a_j(x) Ỹ_j(u),    (5.7)

where F(x, u) can be calculated directly from the signal data S(x, u). Therefore we define the least squares energy as

  E₁(a₂, ..., a_R) = ½ ∫_Ω ∫_{∂B₁} ( F(x, u) − Σ_{j=2}^R a_j(x) Ỹ_j(u) )² du dx,    (5.8)

where ∂B₁ denotes the unit sphere.
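The quantity F(x, u) above can be computed directly from the measured signal. A small sketch, assuming M gradient directions per voxel and approximating the spherical integral by a quadrature rule with uniform weights (a simplifying assumption made here for illustration; in practice the weights depend on the sampling scheme):

```python
import numpy as np

def signal_to_F(S, S0, weights=None):
    """Compute F(x,u) = (1/4pi) * integral of S~ over the sphere - S~(x,u),
    where S~ = ln(-ln(S/S0)). S has shape (M, N): M directions, N voxels.
    `weights` are quadrature weights summing to 4*pi; uniform by default."""
    M, N = S.shape
    if weights is None:
        weights = np.full(M, 4.0 * np.pi / M)
    # Clip the ratio so that ln(-ln(.)) stays defined for noisy data.
    ratio = np.clip(S / S0, 1e-8, 1.0 - 1e-8)
    S_tilde = np.log(-np.log(ratio))
    mean_term = (weights @ S_tilde) / (4.0 * np.pi)   # shape (N,)
    return mean_term[None, :] - S_tilde
```

As a sanity check, a signal that is constant over the sphere yields F ≡ 0, since its spherical mean equals S̃ itself.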


Although the DWI data has Rician noise, we use the least squares fidelity for simplicity, since primal-dual optimization schemes are well studied, especially for least squares fidelity terms. For Rician noise, it is also possible to use a likelihood based fidelity term, and then use a general primal-dual scheme to solve the problem.

5.3.2 Angular Regularization

Descoteaux et al. [24] proposed an angular regularization on the signal S(x, u) by minimizing the following term:

  ∫_{∂B₁} (Δ_b S(x, u))² du,    (5.9)

where Δ_b denotes the Laplace-Beltrami operator on the unit sphere. This angular regularization can reduce the effect of noise in the ODF estimation, especially when higher order spherical harmonics are used in the representation of the ODF. Given the eigenfunction property of the spherical harmonics, it is very easy to evaluate the Laplace-Beltrami operator acting on any function represented in the SHS basis. Inspired by [24], we apply the Laplace-Beltrami operator Δ_b to ψ₂(x, u). Then, we define

  E₂(a₂, ..., a_R) = ½ ∫_{∂B₁} (Δ_b ψ₂(x, u))² du
                   = ½ ∫_{∂B₁} ( Σ_{j=1}^R a_j(x) Δ_b Y_j(u) )² du
                   = ½ Σ_{j=2}^R a_j²(x) l_j²(l_j+1)²,    (5.10)

where Δ_b Y₁(u) = 0 since Y₁(u) ≡ 1/(2√π).

There are two advantages to including the angular regularization. First, since the weights l_j²(l_j+1)² are larger for higher order coefficients, the angular regularization tends to suppress the values of a_j²(x) when j is large, which helps in reducing the fake ODF maxima caused by noise. This effect is well studied in [24]. Second, the energy functional E₂ is strongly convex with respect to the coefficients a₂(x), ..., a_R(x), which will provide faster convergence and better robustness for the numerical scheme that solves the proposed model.

There is a slight difference between our energy functionals E₁, E₂ and the energy functionals in the work of Descoteaux et al. In [24, 25], where the least squares fitting and angular smoothing are applied to S(x, u), the ODF ψ₁(x, u) is calculated from the smoothed S(x, u) through the coefficients in the SH representation of S. In this thesis the estimation and smoothing are applied directly to the ODFs ψ₂(x, u).

5.3.3 Spatial Regularization

Minimizing the energy functional

  E₃(a₂, ..., a_R) = Σ_{j=2}^R ∫_Ω |∇_x a_j(x)| dx    (5.11)

is the total variation (TV) based regularization, which is a technique used widely in MRI reconstruction. The idea of applying TV to the spherical harmonic representation of diffusion MRI is from [3, 18].

In the proposed model we consider one other spatial regularization energy functional,

  E₄(a₂, ..., a_R) = Σ_{j=2}^R ∫_Ω |W[a_j](y)| dy,    (5.12)

where W : Ω → Ω is a wavelet transform operator. This is a sparsity constraint on the images a_j over the domain Ω.

The purpose of the spatial regularization is to enhance the images a_j(x) and remove image noise by regularizing the sparsity of the a_j(x) in the finite difference domain and the wavelet domain. The combination of total variation and wavelet based regularization has been proven to be very effective in MRI, since most MR images have been shown to be sparse in both the finite difference domain and the wavelet domain [56]. In TV based image restoration, the restored image often has sharper edges, but with possible staircase effects, while in wavelet based image restoration, the restored image is smoother. In [56], Lustig et al. introduced this regularization technique in multi-channel fast MRI reconstruction. In fact, the proposed model can be seen as an extension of the method by Lustig et al. to diffusion MRI with a different least squares energy, which represents the correlation between the multiple channels. Other studies of the combined total variation and wavelet regularization technique can be found in [39, 57].

Inspired by those works, we combine both total variation and wavelet regularization for analyzing diffusion MRI.

5.3.4 Proposed Model

In this section we present our model. Define the following energy functional:

  E(a₂, ..., a_R) = E₁(a₂, ..., a_R) + λ E₂(a₂, ..., a_R) + μ E₃(a₂, ..., a_R) + ν E₄(a₂, ..., a_R),

where E₁ is the least squares energy defined in equation (5.8), E₂ is the energy for angular regularization defined in (5.10), E₃ is the TV regularization energy in (5.11), and E₄ is the wavelet L1 sparsity regularization in (5.12). The parameters λ, μ, ν are the balancing weights for the angular regularization, TV based regularization, and wavelet based regularization, respectively. Our model estimates the coefficients a₂, ..., a_R by minimizing the energy functional E(a₂, ..., a_R).

5.3.5 Discrete Form of the Proposed Model

In this section we provide the discrete form of our model. Let F ∈ R^{M×N} be the matrix representing the discrete form of the MR signal information F(x, u), where M is the total number of sensitizing gradients applied to acquire the data in Q-ball imaging, and N is the total number of voxels in the image domain. Let A ∈ R^{(R−1)×N} be the matrix of the discrete form of the a_j(x), and B ∈ R^{M×(R−1)} be the matrix for the real spherical harmonic basis functions Ỹ_j(u). We can rewrite the least squares energy E₁ as

  E₁(A) = ½ ‖BA − F‖²_F,

where ‖·‖_F is the Frobenius norm.

Also, write A = (A₂, A₃, ..., A_R)ᵀ. Then, for each j = 2, 3, ..., R, A_j ∈ R^{1×N} is the row vector for the image defined by the function a_j(x), x ∈ Ω. Under this notation our model can be written as minimizing the following energy function E(A):

  E(A) = ½ ‖BA − F‖²_F + (λ/2) ‖ΛA‖²_F + μ Σ_{j=2}^R ‖A_jᵀ‖_TV + ν Σ_{j=2}^R ‖W(A_jᵀ)‖₁,    (5.13)

where Λ is the diagonal matrix with diagonal entries l_j(l_j+1), W denotes the discrete wavelet transform operator, and ‖·‖_TV is the total variation of an image.

5.4 Numerical Scheme

To minimize the energy function in (5.13), we adapt the modified primal-dual hybrid gradient algorithm proposed by Esser, Zhang and Chan [28], with a slight modification to cope with the wavelet regularization term. The primal-dual scheme is also equivalent to a special case of the primal-dual algorithms discussed in [15].

5.4.1 Primal-Dual Formulation

In [102] a primal-dual hybrid gradient (PDHG) scheme was developed for linear inversion problems with only TV regularization. Here we extend the scheme to problems consisting of more regularization terms:

  min_{x ∈ Rⁿ} H(x) + Σ_{i=1}^n ‖D_i x‖₂ + ‖Wx‖₁.    (5.14)

Here H(x) is a closed proper convex function, and D_i ∈ R^{2×n}, W ∈ R^{n×n} are linear operators acting on x. For the norms ‖·‖₂ and ‖·‖₁, we have

  ‖D_i x‖₂ = max_{p_i ∈ R², ‖p_i‖₂ ≤ 1} ⟨D_i x, p_i⟩,  ‖Wx‖₁ = max_{q ∈ Rⁿ, ‖q‖_∞ ≤ 1} ⟨Wx, q⟩.
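The discrete energy E(A) can be evaluated directly; a sketch follows, with the wavelet transform left abstract (the identity is used as a placeholder when no transform is supplied, purely for illustration; it is not the Daubechies transform used in the experiments). The TV term uses forward differences with a replicated boundary, one common discretization:

```python
import numpy as np

def tv_norm(img):
    # Isotropic TV with forward differences; the last row/column is replicated,
    # so the boundary differences vanish.
    dx = np.diff(img, axis=0, append=img[-1:, :])
    dy = np.diff(img, axis=1, append=img[:, -1:])
    return np.sum(np.sqrt(dx ** 2 + dy ** 2))

def energy(A, B, F, orders, lam, mu, nu, shape, W=None):
    """E(A) = 0.5||BA-F||_F^2 + (lam/2)||Lambda A||_F^2
              + mu * sum_j TV(A_j) + nu * sum_j ||W A_j||_1.
    A: (R-1, N); each row reshapes to a 2D image of `shape`.
    W: wavelet transform callable (identity placeholder if None)."""
    if W is None:
        W = lambda img: img
    Lam = np.array([l * (l + 1) for l in orders], dtype=float)
    fit = 0.5 * np.linalg.norm(B @ A - F, 'fro') ** 2
    ang = 0.5 * lam * np.sum((Lam[:, None] * A) ** 2)
    tv = mu * sum(tv_norm(row.reshape(shape)) for row in A)
    l1 = nu * sum(np.abs(W(row.reshape(shape))).sum() for row in A)
    return fit + ang + tv + l1
```

For a constant A with F = BA and λ = ν = 0 the energy is zero, since the data fit vanishes and a constant image has zero total variation.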


Therefore, if we let

  D = (D₁; D₂; ...; D_n; W) ∈ R^{3n×n},  p = (p₁; p₂; ...; p_n; q) ∈ R^{3n},

then

  Σ_{i=1}^n ‖D_i x‖₂ + ‖Wx‖₁ = max_{p ∈ X} ⟨Dx, p⟩,

where

  X = { p ∈ R^{3n} : ‖(p_{2i−1}, p_{2i})ᵀ‖₂ ≤ 1, ∀i = 1, ..., n, and ‖(p_{2n+1}, p_{2n+2}, ..., p_{3n})ᵀ‖_∞ ≤ 1 }.

Then the minimization problem in equation (5.14) becomes a min-max problem:

  min_{x ∈ Rⁿ} max_{p ∈ X} H(x) + ⟨Dx, p⟩.

By [28, 102], the PDHG scheme is as follows:

(Dual Step)
  p^{k+1} = argmax_{p ∈ X} ⟨Dx^k, p⟩ − (1/(2τ_k)) ‖p − p^k‖₂²
          = argmin_{p ∈ X} ⟨−Dx^k, p⟩ + (1/(2τ_k)) ‖p − p^k‖₂²
          = argmin_{p ∈ X} (1/(2τ_k)) ‖p − (p^k + τ_k Dx^k)‖₂²
          = Π_X(p^k + τ_k Dx^k),

(Primal Step)
  x^{k+1} = argmin_{x ∈ Rⁿ} ⟨Dx, p^{k+1}⟩ + H(x) + (1/(2θ_k)) ‖x − x^k‖₂²,

where Π_X denotes the projection onto the set X, and τ_k and θ_k are step sizes.


A modified PDHG scheme is proposed in [28], obtained by changing the iteration of p^{k+1} in the dual step to

  p^{k+1} = argmin_{p ∈ X} ⟨−Dy^k, p⟩ + (1/(2τ_k)) ‖p − p^k‖₂² = Π_X(p^k + τ_k Dy^k),

where

  y^k = (1 + θ_k/θ_{k−1}) x^k − (θ_k/θ_{k−1}) x^{k−1}.

In fact, if {θ_k} is a constant sequence, then y^k = 2x^k − x^{k−1}. In this case, the modified PDHG algorithm is also a special case of the primal-dual algorithms studied in [15]. The convergence analysis is discussed in [15, 28].

5.4.2 Primal-Dual Scheme for Solving the Proposed Model

The variable x in the minimization problem (5.14) is a vector in Rⁿ, while in our proposed model the variable A is a matrix. However, since the Frobenius norm is an entry-wise matrix norm, we can easily adapt our model to a vector form.

Assume A = {a_{i,j}} ∈ R^{(R−1)×N} = (A₂, A₃, ..., A_R)ᵀ and F = {s_{i,j}} ∈ R^{M×N}. Here each A_i can be treated as the vector form of a 2D image. Now let x ∈ R^{(R−1)N×1} and s ∈ R^{MN×1} be the vector forms of A and F obtained by using dictionary order, i.e.,

  x = (a_{2,1}, ..., a_{R,1}, a_{2,2}, ..., a_{R,2}, ..., a_{2,N}, ..., a_{R,N})ᵀ,
  s = (s_{1,1}, ..., s_{M,1}, s_{1,2}, ..., s_{M,2}, ..., s_{1,N}, ..., s_{M,N})ᵀ.

Then we have

  ½ ‖BA − F‖²_F + (λ/2) ‖ΛA‖²_F = ½ ‖B′x − s‖₂² + (λ/2) ‖Λ′x‖₂²,

where B′ = diag(B, B, ..., B) ∈ R^{MN×(R−1)N} and Λ′ = diag(Λ, Λ, ..., Λ) ∈ R^{(R−1)N×(R−1)N}. Now define D_i ∈ R^{2×(R−1)N} to be the discrete form of the gradient operator acting on x at voxel i, define W to be the operator that performs the 2D discrete wavelet transform on each A_i, and let

  H(x) = ½ ‖B′x − s‖₂² + (λ/2) ‖Λ′x‖₂².
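For a concrete picture of the modified PDHG iteration with constant step sizes (so the extrapolation is y^k = 2x^k − x^{k−1}), the following toy sketch applies it to a four-point 1D analogue, min_x ½‖x − f‖² + ‖Dx‖₁, with the regularization weight absorbed into D; the step sizes and iteration count are illustrative choices satisfying τθ‖D‖² < 1, not values from the dissertation:

```python
import numpy as np

f = np.array([0.0, 0.0, 1.0, 1.0])
n = f.size
mu = 0.5
# Forward-difference operator scaled by the regularization weight mu.
D = mu * (np.eye(n - 1, n, k=1) - np.eye(n - 1, n))

def obj(x):
    return 0.5 * np.sum((x - f) ** 2) + np.sum(np.abs(D @ x))

tau, theta = 0.9, 0.9                 # step sizes; tau*theta*||D||^2 < 1
x = np.zeros(n); x_prev = x.copy(); p = np.zeros(n - 1)
for _ in range(1000):
    y = 2.0 * x - x_prev                       # y^k = 2 x^k - x^{k-1}
    p = np.clip(p + tau * (D @ y), -1.0, 1.0)  # dual step: projection onto X
    x_prev = x
    # Primal step: closed-form minimizer of <Dx,p> + 0.5||x-f||^2
    # + (1/(2*theta))||x - x^k||^2.
    x = (theta * f + x - theta * (D.T @ p)) / (1.0 + theta)
```

On this instance the minimizer is x* = (0.25, 0.25, 0.75, 0.75) with objective value 0.375, which the iteration approaches.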


Then the minimization problem

  min_{x ∈ R^{(R−1)N}} H(x) + Σ_{i=1}^{(R−1)N} ‖D_i x‖₂ + ‖Wx‖₁

is equivalent to our model in (5.13), where the regularization weights μ and ν are absorbed into the operators: each D_i is μ times the discrete gradient at voxel i, and W is ν times the discrete wavelet transform.

To be consistent in notation, we write the primal-dual scheme here using our original notation with matrices.

Define D : R^{(R−1)×N} → R^{(R−1)×N×2} to be the discrete form of the gradient operator, and W : R^{(R−1)×N} → R^{(R−1)×N} to be the discrete wavelet transform operator:

  DA = D(A₂, ..., A_R)ᵀ = (DA₂ᵀ, ..., DA_Rᵀ),  WA = W(A₂, ..., A_R)ᵀ = (WA₂ᵀ, ..., WA_Rᵀ).

Notice that in the above the A_i are actually the vector forms of 2D images. Thus D is a 2D gradient operator, and W is a 2D wavelet transform operator.

In the dual step, we have

  P^{k+1} = Π_{X₁}(P^k + τ_k DY^k),  Q^{k+1} = Π_{X₂}(Q^k + τ_k WY^k),    (5.15)

where P^k ∈ R^{(R−1)×N×2}, Y^k, Q^k ∈ R^{(R−1)×N}, and

  X₁ = { (P_{i,j}) ∈ R^{(R−1)×N×2} : ‖P_{i,j}‖₂ ≤ 1, ∀i = 2, ..., R, ∀j = 1, ..., N },
  X₂ = { (Q_{i,j}) ∈ R^{(R−1)×N} : |Q_{i,j}| ≤ 1, ∀i = 2, ..., R, ∀j = 1, ..., N }.

For any P = (P_{i,j}) ∈ R^{(R−1)×N×2} and Q = (Q_{i,j}) ∈ R^{(R−1)×N}, we can write the projections component-wise as follows:

  (Π_{X₁}(P))_{i,j} = P_{i,j} / max(‖P_{i,j}‖₂, 1),  (Π_{X₂}(Q))_{i,j} = Q_{i,j} / max(|Q_{i,j}|, 1).
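The component-wise projections above translate directly into code; a brief sketch:

```python
import numpy as np

def proj_X1(P):
    """Project each 2-vector P[i, j, :] onto the unit 2-ball:
    P_ij -> P_ij / max(||P_ij||_2, 1)."""
    norms = np.maximum(np.linalg.norm(P, axis=-1, keepdims=True), 1.0)
    return P / norms

def proj_X2(Q):
    """Project each entry onto [-1, 1]: Q_ij -> Q_ij / max(|Q_ij|, 1)."""
    return Q / np.maximum(np.abs(Q), 1.0)
```

Entries already inside the constraint set are left unchanged, while larger entries are rescaled onto its boundary.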


In the primal step, the optimality condition for A^{k+1} is

  Dᵀ P^{k+1} + Wᵀ Q^{k+1} + Bᵀ(BA^{k+1} − F) + λΛ²A^{k+1} + (1/θ_k)(A^{k+1} − A^k) = 0.

Thus, we can write

  A^{k+1} = ( θ_k(BᵀB + λΛ²) + I )⁻¹ ( A^k − θ_k Dᵀ P^{k+1} − θ_k Wᵀ Q^{k+1} + θ_k Bᵀ F ),
  Y^{k+1} = (1 + θ_{k+1}/θ_k) A^{k+1} − (θ_{k+1}/θ_k) A^k.    (5.16)

Here Dᵀ : R^{(R−1)×N×2} → R^{(R−1)×N} is in fact the discrete form of the negative divergence operator.

Finally, we write our primal-dual scheme as follows:

Algorithm 5.1. Primal-Dual Scheme for solving (5.13)
  A⁰ ← 0, P⁰ ← 0, Q⁰ ← 0, Y⁰ ← 0
  repeat
    Iterate P^{k+1}, Q^{k+1} by (5.15)
    Iterate A^{k+1}, Y^{k+1} by (5.16)
  until the convergence condition is met

5.5 Experimental Results

To verify the effectiveness of the proposed model and numerical algorithm, in this section we provide our experimental results and compare with the algorithm in [1], which incorporates least squares estimation and angular regularization over the signal information F(x, u) defined in equation (5.7). For the 2D wavelet transforms, we use a level-2 Daubechies-6 wavelet transform. To perform the transform in our program we use the Rice Wavelet Toolbox (RWT).
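The closed-form primal update above can be checked against its optimality condition numerically; a sketch with random toy data (sizes and values are arbitrary, and G stands in for Dᵀ P^{k+1} + Wᵀ Q^{k+1}):

```python
import numpy as np

rng = np.random.default_rng(0)
M, Rm1, N = 5, 3, 6                      # gradients, R-1 coefficients, voxels
B = rng.standard_normal((M, Rm1))
F = rng.standard_normal((M, N))
lam, theta = 0.1, 0.5
Lam = np.diag([6.0, 20.0, 42.0])          # l(l+1) for orders l = 2, 4, 6
G = rng.standard_normal((Rm1, N))         # stands for D^T P^{k+1} + W^T Q^{k+1}
A_k = rng.standard_normal((Rm1, N))

# Closed-form primal update A^{k+1}.
lhs = theta * (B.T @ B + lam * Lam @ Lam) + np.eye(Rm1)
A_next = np.linalg.solve(lhs, A_k - theta * G + theta * (B.T @ F))

# Residual of the optimality condition; should vanish.
resid = (G + B.T @ (B @ A_next - F) + lam * Lam @ Lam @ A_next
         + (A_next - A_k) / theta)
```

The residual is zero up to floating-point error, confirming that the linear solve implements the optimality condition of the primal step.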


5.5.1 Synthetic Results

The aim of this experiment is to examine the accuracy and robustness to noise of the proposed model in the reconstruction of ODFs. We generate the diffusion weighted signal S using a bi-Gaussian model. For each gradient direction u, we generate the signal intensity by

  S(u) = ½ exp(−b uᵀ D₁ u) + ½ exp(−b uᵀ D₂ u),

where D₁ and D₂ are diffusion tensor profiles with eigenvalues [1700, 300, 300] × 10⁻⁶, and b = 3000. The eigenvectors of the tensor profiles D₁ and D₂ are chosen to simulate a system of two crossing fiber bundles in a domain of 32×32 voxels. The region of the simulated fiber crossings is shown in Figure 5-1. 55 gradient directions are used, and Rician random noise is added to the signal at different signal-to-noise ratios (SNR): 15, 20, 25 and 30.

Figure 5-1. The simulated region of fiber crossings.

To examine the accuracy of the proposed model on the directional structures of the reconstructed ODFs, we estimate the regularized ODFs and then compare the fiber directions with the true values. For comparison, we also apply the scheme by Aganj et al. [1]. For each voxel, the estimated fiber directions are taken to be the local maxima of the estimated ODF that surpass a certain threshold (we use 0.5 here), as suggested in [24]. The true fiber directions are taken to be the eigenvectors corresponding to the largest eigenvalues of the tensor profiles D₁ and D₂ in the bi-Gaussian model. We calculate the angular differences in fiber directions between the estimated ODFs and the true ODFs. The estimation error is represented as the root mean square error (RMSE) of the angular differences in fiber directions: for estimated fiber directions {d_i^e}_{i=1}^N and true fiber directions {d_i^t}_{i=1}^N,

  RMSE = sqrt( Σ_{i=1}^N g(d_i^e, d_i^t)² / N ),    (5.17)

where for any vectors d^e and d^t, g(d^e, d^t) denotes the angle between them (in degrees). The parameters are optimized for the best fiber direction estimation. The results are presented in Table 5-3. We can see that the RMSE is effectively reduced by our model, and the combination of TV smoothing and wavelet smoothing provides the best estimation.

To show how the regularization parameters affect the performance of the proposed model, we apply the model to two data sets while varying one parameter and fixing the other two. In Figure 5-2, from top to bottom, the graphs are the RMSE of the fiber direction estimation results from the proposed model with varying TV regularization parameter μ (λ and ν fixed), varying ν (λ and μ fixed), and varying λ (μ and ν fixed), respectively. We can see from the first row that when λ and ν are fixed, the RMSE decreases significantly as the TV regularization parameter μ varies from 0 to 0.7, while the RMSE decreases only slightly as the wavelet regularization parameter ν varies from 0 to 0.3 (λ and μ fixed, second row). This shows that the TV regularization is dominant within the spatial regularization. On the other hand, the RMSE decreases greatly as the angular regularization parameter λ varies from 0 to 0.001 (μ and ν fixed, third row), while it barely changes as λ varies from 0.001 to 0.008. This shows that the angular regularization is important, while insensitive to the choice of λ when λ > 0.001.

Figure 5-2 provides a guideline for choosing the regularization parameters. The x-axis denotes the choices of the parameters, and the y-axis denotes the directional RMSE defined in equation (5.17). The left column is the performance on the data set with SNR = 25 (the best choice of parameters is μ = 0.7, ν = 0.3 and λ = 0.004). The right column is the performance on the data set with SNR = 30 (the best choice of parameters is μ = 0.7, ν = 0.3 and λ = 0.006). Since the angular regularization is insensitive to the choice of λ when λ > 0.001, we can choose a constant small λ for most diffusion MRI problems. Furthermore, since the TV regularization is dominant among the spatial regularization, we can set the wavelet regularization parameter ν = 0 first and fine tune the TV regularization parameter μ. After we obtain a desirable range of μ, we can start tuning ν to suppress the staircase effect. In most of our synthetic and practical experiments, we find the above guideline useful.

Next we show that the proposed model provides a more accurate estimation of the spherical harmonic coefficients a₂(x), a₃(x), ..., a_R(x) of the ODFs. In Figure 5-3 the spherical harmonic coefficients of the ODF estimation are shown as 2D images. The estimation is performed on the synthetic data with SNR 20. In the first row are the images of the coefficients from the different models. The first column is the ground truth, the second column is the result by Aganj et al. [1], the third column is our model with only TV used in the spatial regularization, and the last column is our model with both TV and wavelet in the spatial regularization. For each image of coefficients, the images a₂(x), a₃(x), ..., a_R(x) are arranged in top-to-bottom and left-to-right order with a₂(x) at the top left corner. The second row is the zoomed-in image of a₁₃(x) (the region inside the red box in the first row), where the staircase effect of the TV regularization can be observed. We can see from Figure 5-3 that if only TV regularization is used, a staircase effect appears. On the other hand, the estimation result obtained by implementing both TV and wavelet regularization is effectively improved.

To quantify the performance of the proposed model under different noise levels, we compare the estimated SHS coefficients of the ODFs with the ground truth. We calculate a set of spherical harmonic coefficients a₁ᵗ(x), a₂ᵗ(x), ..., a_Rᵗ(x) on the synthetic data with no noise, and use these coefficients as the ground truth. Based on the ground truth, we compare the performance of the different models by the sum of squares of the deviation (SSD) of the estimated coefficients {a_i(x)}_{i=1}^R from the ground truth {a_iᵗ(x)}_{i=1}^R. The SSD is defined as

  SSD = Σ_{i=1}^R Σ_{x ∈ Ω} ( a_i(x) − a_iᵗ(x) )².

The comparison by SSD is listed in Table 5-4. From Table 5-4 we can clearly see that the combination of both TV and wavelet regularization provides the best estimation of the spherical harmonic coefficients.

The comparison of computational times is shown in Table 5-1. Our codes are written in MATLAB and run on a Linux (version 2.6.38) computer with a 2.67 GHz Intel i5 CPU and 8 GB memory. We can see that although our model requires more computational time, due to the efficient numerical scheme the total computational load is still reasonable for the data set with 45×32×32 spherical harmonic coefficients. We also perform the ODF estimation on a larger domain of 64×64 voxels, and from Table 5-2 we can see that the computational load is still reasonable for solving a set of 45×64×64 spherical harmonic coefficients.

5.5.2 Real Data

We apply the proposed model to a set of real experimental data. The DWI data is obtained on a SIEMENS 3.0 Tesla scanner, with repetition time (TR) = 9835 ms, echo time (TE) = 96 ms, field of view (FOV) = 170.1 mm × 204.8 mm, b = 1000, and M = 30. The smoothing parameters are R = 15, λ = 0.006, μ = ν = 0.2. The region of interest (ROI) is shown in Figure 5-4, and the estimated ODF results are presented in Figure 5-5. From Figure 5-5 we can see that with the proposed model the noise in the fiber directions is effectively reduced, and a clear track of the fiber directions can be seen.
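The directional RMSE used above can be sketched as follows; treating fiber directions as sign-invariant (so that d and −d give an angle of 0) is our reading of the metric, not stated explicitly in the text:

```python
import math

def angle_deg(de, dt):
    """Angle (in degrees) between two direction vectors, treating
    directions as sign-invariant via |<de, dt>|."""
    dot = abs(sum(a * b for a, b in zip(de, dt)))
    ne = math.sqrt(sum(a * a for a in de))
    nt = math.sqrt(sum(b * b for b in dt))
    c = min(1.0, dot / (ne * nt))   # clamp to guard against round-off
    return math.degrees(math.acos(c))

def rmse(est, true):
    """Root mean square of angular differences over matched pairs."""
    n = len(est)
    return math.sqrt(sum(angle_deg(e, t) ** 2
                         for e, t in zip(est, true)) / n)
```

For a single pair of perpendicular directions the RMSE is 90 degrees; for identical (or opposite) directions it is 0.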


5.6 Concluding Remarks of This Chapter

We propose a model for regularization in the estimation of ODFs. The model performs simultaneous angular and spatial regularization of ODF fields. The angular regularization uses the Laplace-Beltrami operator; for the spatial regularization, we use total variation and the wavelet transform. The implemented numerical method is recently developed and very fast. We demonstrate the drawback of using only angular regularization, and the advantage of incorporating both angular and spatial regularization, in our synthetic experiments. With our model we can achieve better orientational information for the reconstructed ODF fields.


Table 5-1. Comparison of computational time (in seconds) resulting from three models under SNR = 15, 20, 25 and 30, respectively

  Model                              SNR=15  SNR=20  SNR=25  SNR=30
  ψ₂ by equations (5.2)-(5.4)         0.27    0.26    0.27    0.30
  ψ₂ by proposed model, ν = 0         0.87    0.77    0.76    0.96
  ψ₂ by proposed model, ν > 0         1.43    1.48    1.70    1.66

  The data set is on a domain of 32×32 voxels.

Table 5-2. Comparison of computational time (in seconds) resulting from three models under SNR = 15, 20, 25 and 30, respectively

  Model                              SNR=15  SNR=20  SNR=25  SNR=30
  ψ₂ by equations (5.2)-(5.4)         1.03    1.06    1.06    1.05
  ψ₂ by proposed model, ν = 0         3.58    3.67    3.55    3.35
  ψ₂ by proposed model, ν > 0         5.13    5.29    5.37    5.16

  The data set is on a domain of 64×64 voxels.

Table 5-3. Comparison of RMSE resulting from three models under SNR = 15, 20, 25 and 30, respectively

  Model                              SNR=15  SNR=20  SNR=25  SNR=30
  ψ₂ by equations (5.2)-(5.4)         5.87    4.70    3.89    3.37
  ψ₂ by proposed model, ν = 0         1.68    1.43    1.42    1.41
  ψ₂ by proposed model, ν > 0         1.33    1.32    1.30    1.24

  The angular difference is measured in degrees.

Table 5-4. Comparison of SSD resulting from three models under SNR = 15, 20, 25 and 30, respectively

  Model                              SNR=15  SNR=20  SNR=25  SNR=30
  ψ₂ by equations (5.2)-(5.4)        15.40   11.30    9.99    8.49
  ψ₂ by proposed model, ν = 0         9.72    7.05    6.28    5.99
  ψ₂ by proposed model, ν > 0         7.18    6.62    6.04    5.90


Figure 5-2. The performance of the proposed model while varying one parameter and fixing the other two. The x-axis denotes the choices of the parameters, and the y-axis denotes the directional RMSE defined in equation (5.17). From top to bottom: RMSE under different choices of the TV regularization parameter μ (with λ and ν fixed), the wavelet regularization parameter ν (with λ and μ fixed), and the angular regularization parameter λ (with μ and ν fixed), respectively. The left column is the performance on the data set with SNR = 25 (the best choice of parameters is μ = 0.7, ν = 0.3 and λ = 0.004). The right column is the performance on the data set with SNR = 30 (the best choice of parameters is μ = 0.7, ν = 0.3 and λ = 0.006).


Figure 5-3. The images of the spherical harmonic coefficients a₂(x), a₃(x), ..., a_R(x) of the ODF ψ₂. The images are estimated from a synthetic data set with SNR 20. In the first row are the images of the coefficients from the different models. The first column is the ground truth, the second column is the result by Aganj et al. [1], the third column is our model with only TV used in the spatial regularization, and the last column is our model with both TV and wavelet in the spatial regularization. For each image of coefficients, the images a₂(x), a₃(x), ..., a_R(x) are arranged in top-to-bottom and left-to-right order with a₂(x) at the top left corner. The second row is the zoomed-in image of a₁₃(x) (the region inside the red box in the first row), where the staircase effect of the TV regularization can be observed.

Figure 5-4. The region of interest in the real data.


(a) No spatial smoothing
(b) Smoothed using the proposed framework

Figure 5-5. Comparison of the ODF reconstruction results from real data.


REFERENCES [1] I.Aganj,C.Lenglet,G.Sapiro,E.Yacoub,K.Ugurbil,andN.Harel.Reconstructionoftheorientationdistributionfunctioninsingle-andmultiple-shellq-ballimagingwithinconstantsolidangle.MagneticResonanceinMedicine,64(2):554,2010. [2] K.Arrow,L.Hurwicz,andH.Uzawa.StudiesinLinearandNon-linearProgram-ming.StanfordMathematicalStudiesintheSocialSciences.StanfordUniversityPress,1958. [3] H.-E.Assemlal,D.Tschumperle,andL.Brun.FibertrackingonHARDIdatausingrobustodfelds.InImageProcessing,2007.ICIP2007.IEEEInternationalConferenceon,volume3,pages344.IEEE,2007. [4] A.AuslenderandM.Teboulle.Interiorgradientandproximalmethodsforconvexandconicoptimization.SIAMJournalonOptimization,16(3):697,2006. [5] P.J.Basser,J.Mattiello,D.Lebihan,etal.Estimationoftheeffectiveself-diffusiontensorfromtheNMRspinecho.JournalofMagneticResonance-SeriesB,103(3):247,1994. [6] P.J.Basser,S.Pajevic,C.Pierpaoli,J.Duda,andA.Aldroubi.InvivobertractographyusingDT-MRIdata.Magneticresonanceinmedicine,44(4):625,2000. [7] P.J.Basser,C.Pierpaoli,etal.Microstructuralandphysiologicalfeaturesoftissueselucidatedbyquantitative-diffusion-tensorMRI.Journalofmagneticresonance.SeriesB,111(3):209,1996. [8] S.Becker,J.Bobin,andE.Candes.NESTA:afastandaccuraterst-ordermethodforsparserecovery.SIAMJournalonImagingSciences,4(1):1,2011. [9] D.P.Bertsekas.ConstrainedOptimizationandLagrangeMultiplierMethods.AcademicPress,1982. [10] D.P.Bertsekas.Nonlinearprogramming.AthenaScientic,1999. [11] S.BonettiniandV.Ruggiero.Ontheconvergenceofprimaldualhybridgradientalgorithmsfortotalvariationimagerestoration.JournalofMathematicalImagingandVision,pages1,2012. [12] S.Boyd,N.Parikh,E.Chu,B.Peleato,andJ.Eckstein.Distributedoptimizationandstatisticallearningviathealternatingdirectionmethodofmultipliers.Founda-tionsandTrendsRinMachineLearning,3(1):1,2011. [13] R.S.Burachik,A.N.Iusem,andB.F.Svaiter.Enlargementofmonotoneoperatorswithapplicationstovariationalinequalities.Set-ValuedAnalysis,5(2):159,1997. 182

PAGE 183

[14] A.Chambolle.Analgorithmfortotalvariationminimizationandapplications.JournalofMathematicalImagingandVision,20(1):89,2004. [15] A.ChambolleandT.Pock.Arst-orderprimal-dualalgorithmforconvexproblemswithapplicationstoimaging.JournalofMathematicalImagingandVision,40(1):120,2011. [16] A.Chambolle.Analgorithmfortotalvariationminimizationandapplications.JournalofMathematicalimagingandvision,20(1-2):89,2004. [17] Y.Chen,W.Hager,F.Huang,D.Phan,X.Ye,andW.Yin.FastalgorithmsforimagereconstructionwithapplicationtopartiallyparallelMRimaging.SIAMJournalonImagingSciences,5(1):90,2012. [18] Y.Chen,W.Guo,Q.Zeng,andY.Liu.Anonstandardsmoothinginreconstructionofapparentdiffusioncoefcientprolesfromdiffusionweightedimages.InverseProbl.Imaging,2(2):205,2008. [19] Y.Chen,G.Lan,andY.Ouyang.Optimalprimal-dualmethodsforaclassofsaddlepointproblems.UCLACAMreport13-31,2013. [20] O.Christiansen,T.-M.Lee,J.Lie,U.Sinha,andT.F.Chan.Totalvariationregularizationofmatrix-valuedimages.InternationalJournalofBiomedicalImaging,2007,2007. [21] P.L.CombettesandV.R.Wajs.Signalrecoverybyproximalforward-backwardsplitting.MultiscaleModeling&Simulation,4(4):1168,2005. [22] A.d'Aspremont.Smoothoptimizationwithapproximategradient.SIAMJournalonOptimization,19(3):1171,2008. [23] M.Descoteaux,E.Angelino,S.Fitzgibbons,andR.Deriche.Apparentdiffusioncoefcientsfromhighangularresolutiondiffusionimaging:Estimationandapplications.MagneticResonanceinMedicine,56(2):395,2006. [24] M.Descoteaux,E.Angelino,S.Fitzgibbons,andR.Deriche.Regularized,fast,androbustanalyticalq-ballimaging.MagneticResonanceinMedicine,58(3):497,2007. [25] M.Descoteaux,R.Deriche,T.Knosche,andA.Anwander.Deterministicandprobabilistictractographybasedoncomplexbreorientationdistributions.MedicalImaging,IEEETransactionson,28(2):269,2009. [26] J.DouglasandH.Rachford.Onthenumericalsolutionofheatconductionproblemsintwoandthreespacevariables.TransactionsoftheAmericanmathematicalSociety,82(2):421,1956. 183

PAGE 184

[27] J.EcksteinandD.P.Bertsekas.OntheDouglasRachfordsplittingmethodandtheproximalpointalgorithmformaximalmonotoneoperators.MathematicalProgramming,55(1-3):293,1992. [28] E.Esser,X.Zhang,andT.Chan.Ageneralframeworkforaclassofrstorderprimal-dualalgorithmsforconvexoptimizationinimagingscience.SIAMJournalonImagingSciences,3(4):1015,2010. [29] L.R.Frank.Anisotropyinhighangularresolutiondiffusion-weightedMRI.MagneticResonanceinMedicine,45(6):935,2001. [30] D.Gabay.Applicationsofthemethodofmultiplierstovariationalinequalities.InM.FortinandR.Glowinski,editors,AugmentedLagrangianMethods:ApplicationstotheNumericalSolutionofBoundary-ValueProblems,volume15ofStudiesinMathematicsandItsApplications,pages299331.Elsevier,1983. [31] D.GabayandB.Mercier.Adualalgorithmforthesolutionofnonlinearvariationalproblemsvianiteelementapproximation.Computers&MathematicswithApplications,2(1):17,1976. [32] S.GhadimiandG.Lan.Optimalstochasticapproximationalgorithmsforstronglyconvexstochasticcompositeoptimization,PartII:shrinkingproceduresandoptimalalgorithms.Manuscript2010-4,DepartmentofIndustrialandSystemsEngineering,UniversityofFlorida,Gainesville,FL32611,USA,2010.SIAMJournalonOptimization(underthird-roundreview). [33] S.GhadimiandG.Lan.Optimalstochasticapproximationalgorithmsforstronglyconvexstochasticcompositeoptimization,PartI:agenericalgorithmicframework.SIAMJournalonOptimization,22:1469,2012. [34] R.GlowinskiandA.Marroco.Surl'approximation,parelementsnisd'ordreun,etlaresolution,parpenalisation-dualited'uneclassedeproblemesdedirichletnonlineaires.ESAIM:MathematicalModellingandNumericalAnalysis-ModelisationMathematiqueetAnalyseNumerique,9(R2):41,1975. [35] D.GoldfarbandS.Ma.Fastmultiple-splittingalgorithmsforconvexoptimization.SIAMJournalonOptimization,22(2):533,2012. [36] T.Goldstein,B.O'Donoghue,andS.Setzer.Fastalternatingdirectionoptimizationmethods.CAMreport,pages12,2012. [37] T.GoldsteinandS.Osher.ThesplitBregmanmethodforL1-regularizedproblems.SIAMJournalonImagingSciences,2(2):323,2009. 
[38] B.HeandX.Yuan.OntheO(1/n)convergencerateoftheDouglas-Rachfordalternatingdirectionmethod.SIAMJournalonNumericalAnalysis,50(2):700,2012. 184

PAGE 185

[39] L.He,T.C.Chang,S.Osher,T.Fang,andP.Speier.MRimagereconstructionbyusingtheiterativerenementmethodandnonlinearinversescalespacemethods.UCLACAMReport06,2006. [40] M.R.Hestenes.Multiplierandgradientmethods.JournalofOptimizationTheoryandApplications,4(5):303,1969. [41] L.Jacob,G.Obozinski,andJ.-P.Vert.Grouplassowithoverlapandgraphlasso.InProceedingsofthe26thInternationalConferenceonMachineLearning,2009. [42] D.K.Jones,A.Simmons,S.C.Williams,andM.A.Horseld.Non-invasiveassessmentofaxonalberconnectivityinthehumanbrainviadiffusiontensorMRI.MagneticResonanceinMedicine,42(1):37,1999. [43] A.Juditsky,A.Nemirovski,andC.Tauvel.Solvingvariationalinequalitieswithstochasticmirror-proxalgorithm.Manuscript,GeorgiaInstituteofTechnology,Atlanta,GA,2008. [44] A.JuditskyandA.Nemirovski.Firstordermethodsfornonsmoothconvexlarge-scaleoptimization,II:utilizingproblemsstructure.OptimizationforMachineLearning,pages149,2011. [45] T.G.KoldaandB.W.Bader.Tensordecompositionsandapplications.SIAMReview,51(3):455,2009. [46] G.Korpelevich.ExtrapolationgradientmethodsandrelationtomodiedLagrangians.EkonomikaiMatematicheskieMetody,19:694,1983.inRussian;EnglishtranslationinMatekon. [47] G.Korpelevich.Theextragradientmethodforndingsaddlepointsandotherproblems.Matecon,12:747,1976. [48] G.Lan,Z.Lu,andR.D.C.Monteiro.Primal-dualrst-ordermethodswithO(1=")iteration-complexityforconeprogramming.MathematicalProgramming,126:1,2011. [49] G.Lan,A.Nemirovski,andA.Shapiro.Validationanalysisofmirrordescentstochasticapproximationmethod.MathematicalProgramming,134(2):425,2012. [50] G.Lan.Anoptimalmethodforstochasticcompositeoptimization.MathematicalProgramming,133(1-2):365,2012. [51] G.Lan.Bundle-leveltypemethodsuniformlyoptimalforsmoothandnon-smoothconvexoptimization.Manuscript,DepartmentofIndustrialandSystemsEngineer-ing,UniversityofFlorida,Gainesville,FL,2013. 185


[52] G. Lan, Z. Lu, and R. D. Monteiro. Primal-dual first-order methods with O(1/ε) iteration-complexity for cone programming. Mathematical Programming, 126(1):1, 2011.
[53] G. Lan and R. D. Monteiro. Iteration-complexity of first-order augmented Lagrangian methods for convex programming. Manuscript, School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta (May, 2009), 2009.
[54] Q. Lin, X. Chen, and J. Pena. A smoothing stochastic gradient method for composite optimization. Manuscript, Carnegie Mellon University, 2011.
[55] P.-L. Lions and B. Mercier. Splitting algorithms for the sum of two nonlinear operators. SIAM Journal on Numerical Analysis, 16(6):964, 1979.
[56] M. Lustig, D. Donoho, and J. M. Pauly. Sparse MRI: The application of compressed sensing for rapid MR imaging. Magnetic Resonance in Medicine, 58(6):1182, 2007.
[57] S. Ma, W. Yin, Y. Zhang, and A. Chakraborty. An efficient algorithm for compressed MR imaging using total variation and wavelets. In Computer Vision and Pattern Recognition, 2008. CVPR 2008. IEEE Conference on, pages 1. IEEE, 2008.
[58] J. Mairal, R. Jenatton, G. Obozinski, and F. Bach. Convex and network flow optimization for structured sparsity. The Journal of Machine Learning Research, 12:2681, 2011.
[59] T. McGraw, B. Vemuri, Y. Chen, M. Rao, and T. Mareci. DT-MRI denoising and neuronal fiber tracking. Medical Image Analysis, 8(2):95, 2004.
[60] T. McGraw, B. Vemuri, E. Ozarslan, Y. Chen, and T. Mareci. Variational denoising of diffusion weighted MRI. Inverse Problems and Imaging, 35(4):625, 2009.
[61] R. Monteiro and B. Svaiter. Iteration-complexity of block-decomposition algorithms and the alternating direction method of multipliers.
[62] R. Monteiro and B. Svaiter. On the complexity of the hybrid proximal extragradient method for the iterates and the ergodic mean. Manuscript, School of ISyE, Georgia Tech, Atlanta, GA, 30332, USA, March 2009.
[63] R. Monteiro and B. Svaiter. Complexity of variants of Tseng's modified F-B splitting and Korpelevich's methods for hemivariational inequalities with applications to saddle-point and convex optimization problems. SIAM Journal on Optimization, 21(4):1688, 2011.
[64] J.-J. Moreau. Décomposition orthogonale d'un espace hilbertien selon deux cônes mutuellement polaires. (French). C. R. Acad. Sci. Paris, 255:238, 1962.


[65] A. Nemirovski. Prox-method with rate of convergence O(1/t) for variational inequalities with Lipschitz continuous monotone operators and smooth convex-concave saddle point problems. SIAM Journal on Optimization, 15:229, 2004.
[66] A. Nemirovski, A. Juditsky, G. Lan, and A. Shapiro. Robust stochastic approximation approach to stochastic programming. SIAM Journal on Optimization, 19:1574, 2009.
[67] A. Nemirovski and D. Yudin. Problem complexity and method efficiency in optimization. Wiley-Interscience Series in Discrete Mathematics. John Wiley, XV, 1983.
[68] A. Nemirovsky. Information-based complexity of linear operator equations. Journal of Complexity, 8(2):153, 1992.
[69] Y. Nesterov. Excessive gap technique in nonsmooth convex minimization. SIAM Journal on Optimization, 16(1):235, 2005.
[70] Y. E. Nesterov. A method for unconstrained convex minimization problem with the rate of convergence O(1/k^2). Doklady AN SSSR, 269:543, 1983. Translated as Soviet Math. Docl.
[71] Y. E. Nesterov. Introductory Lectures on Convex Optimization: A Basic Course. Kluwer Academic Publishers, Massachusetts, 2004.
[72] Y. Nesterov. Smooth minimization of non-smooth functions. Mathematical Programming, 103(1):127, 2005.
[73] J. Nocedal and S. J. Wright. Numerical Optimization. Springer Science+Business Media, 2006.
[74] H. Ouyang, N. He, L. Tran, and A. G. Gray. Stochastic alternating direction method of multipliers. In Proceedings of the 30th International Conference on Machine Learning (ICML-13), pages 80, 2013.
[75] J. Pena. Nash equilibria computation via smoothing techniques. Optima, 78:12, 2008.
[76] T. Pock, D. Cremers, H. Bischof, and A. Chambolle. An algorithm for minimizing the Mumford-Shah functional. In Computer Vision, 2009 IEEE 12th International Conference on, pages 1133. IEEE, 2009.
[77] B. Polyak. New stochastic approximation type procedures. Automat. i Telemekh., 7:98, 1990.
[78] B. Polyak and A. Juditsky. Acceleration of stochastic approximation by averaging. SIAM J. Control and Optimization, 30:838, 1992.


[79] L. D. Popov. A modification of the Arrow-Hurwicz method for search of saddle points. Mathematical Notes, 28(5):845, 1980.
[80] M. J. D. Powell. A method for nonlinear constraints in minimization problems. In Optimization (Sympos., Univ. Keele, Keele, 1968), pages 283. Academic Press, London, 1969.
[81] A. Ramírez-Manzanares and M. Rivera. Brain nerve bundles estimation by restoring and filtering intra-voxel information in diffusion tensor MRI. IEEE Workshop on VLSM, Proceedings, Oct 11-12, pages 73, 2003.
[82] H. Robbins and S. Monro. A stochastic approximation method. Annals of Mathematical Statistics, 22:400, 1951.
[83] R. T. Rockafellar. Convex Analysis. Princeton University Press, Princeton, NJ, 1970.
[84] R. T. Rockafellar. Monotone operators and the proximal point algorithm. SIAM Journal on Control and Optimization, 14(5):877, 1976.
[85] L. Rudin, S. Osher, and E. Fatemi. Nonlinear total variation based noise removal algorithms. Physica D: Nonlinear Phenomena, 60(1):259, 1992.
[86] E. Stejskal and J. Tanner. Spin diffusion measurements: spin echoes in the presence of a time-dependent field gradient. The Journal of Chemical Physics, 42(1):288, 1965.
[87] R. Tibshirani, M. Saunders, S. Rosset, J. Zhu, and K. Knight. Sparsity and smoothness via the fused lasso. Journal of Royal Statistical Society: B, 67(1):91, 2005.
[88] R. Tomioka, T. Suzuki, K. Hayashi, and H. Kashima. Statistical performance of convex tensor decomposition. Advances in Neural Information Processing Systems, 25, 2011.
[89] A. Tristan-Vega, C.-F. Westin, and S. Aja-Fernandez. Estimation of fiber orientation probability density functions in high angular resolution diffusion imaging. NeuroImage, 47(2):638, 2009.
[90] D. Tschumperle and R. Deriche. Variational frameworks for DT-MRI estimation, regularization and visualization. In Computer Vision, 2003. Proceedings. Ninth IEEE International Conference on, pages 116. IEEE, 2003.
[91] P. Tseng. On accelerated proximal gradient methods for convex-concave optimization. Submitted to SIAM Journal on Optimization, 2008.
[92] D. S. Tuch. Q-ball imaging. Magnetic Resonance in Medicine, 52(6):1358, 2004.


[93] D. S. Tuch, T. G. Reese, M. R. Wiegell, and V. J. Wedeen. Diffusion MRI of complex neural architecture. Neuron, 40(5):885, 2003.
[94] D. Tuch, R. Weisskoff, J. Belliveau, and V. Wedeen. High angular resolution diffusion imaging of the human brain. In Proceedings of the 7th Annual Meeting of ISMRM, Philadelphia, volume 321, 1999.
[95] Y. Wang, J. Yang, W. Yin, and Y. Zhang. A new alternating minimization algorithm for total variation image reconstruction. SIAM Journal on Imaging Sciences, 1(3):248, 2008.
[96] V. J. Wedeen, P. Hagmann, W.-Y. I. Tseng, T. G. Reese, and R. M. Weisskoff. Mapping complex tissue architecture with diffusion spectrum magnetic resonance imaging. Magnetic Resonance in Medicine, 54(6):1377, 2005.
[97] V. Wedeen, T. Reese, D. Tuch, M. Weigel, J. Dou, R. Weiskoff, and D. Chessler. Mapping fiber orientation spectra in cerebral white matter with Fourier transform diffusion MRI. In Proceedings of the 8th Annual Meeting of ISMRM, Denver, page 82, 2000.
[98] L. Xiao. Dual averaging methods for regularized stochastic learning and online optimization. Journal of Machine Learning Research, pages 2543, 2010.
[99] J. Yang, Y. Zhang, and W. Yin. A fast alternating direction method for TVL1-L2 signal reconstruction from partial Fourier data. Selected Topics in Signal Processing, IEEE Journal of, 4(2):288, 2010.
[100] X. Ye, Y. Chen, and F. Huang. Computational acceleration for MR image reconstruction in partially parallel imaging. Medical Imaging, IEEE Transactions on, 30(5):1055, 2011.
[101] X. Ye, Y. Chen, W. Lin, and F. Huang. Fast MR image reconstruction for partially parallel imaging with arbitrary k-space trajectories. IEEE Transactions on Medical Imaging, 30(3):575, 2011.
[102] M. Zhu and T. Chan. An efficient primal-dual hybrid gradient algorithm for total variation image restoration. UCLA CAM Report, pages 08, 2008.
[103] M. Zhu, S. J. Wright, and T. F. Chan. Duality-based algorithms for total-variation-regularized image restoration. Computational Optimization and Applications, 47(3):377, 2010.


BIOGRAPHICAL SKETCH

Yuyuan Ouyang was born in Yongzhou, China. He received his bachelor's degree in information and computational science from the School of Science (later the School of Mathematics and Systems Science) at Beihang University, Beijing, China, in July 2005. In August 2005, he enrolled in the Department of Mathematics and Statistics at the University of Calgary, Calgary, Alberta, Canada, and earned a Master of Science degree in July 2007 under the guidance of Dr. Anatoliy Swishchuk. During the following years, he studied mathematical imaging and convex optimization under his advisor, Dr. Yunmei Chen. In summer 2013, he received his Ph.D. from the Department of Mathematics, University of Florida.