Citation
Mathematical Programming Approaches in Classification and Risk Management

Material Information

Title:
Mathematical Programming Approaches in Classification and Risk Management
Creator:
BUGERA, VLADIMIR A. ( Author, Primary )
Copyright Date:
2008

Subjects

Subjects / Keywords:
Algorithms ( jstor )
Assets ( jstor )
Credit cards ( jstor )
Credit risk ( jstor )
Datasets ( jstor )
Linear programming ( jstor )
Mathematical independent variables ( jstor )
Mathematical monotonicity ( jstor )
Modeling ( jstor )
Utility functions ( jstor )

Record Information

Source Institution:
University of Florida
Holding Location:
University of Florida
Rights Management:
Copyright Vladimir A. Bugera. Permission granted to the University of Florida to digitize, archive and distribute this item for non-profit research and educational purposes. Any reuse of this item in excess of fair use or other copyright exemptions requires permission of the copyright holder.
Embargo Date:
8/31/2014
Resource Identifier:
656216713 ( OCLC )

Full Text

PAGE 3

I thank my supervisor (Stanislav Uryasev) and the members of my Ph.D. committee for their help and guidance. I am also grateful to my parents for their constant support.

PAGE 4

Page
ACKNOWLEDGMENTS .... iii
LIST OF TABLES .... vi
LIST OF FIGURES .... vii
ABSTRACT .... viii
CHAPTER
1 INTRODUCTION .... 1
2 CREDIT CARD SCORING WITH QUADRATIC UTILITY FUNCTION .... 4
2.1 Introduction .... 4
2.2 Background .... 9
2.2.1 Separation by Linear Surfaces: Two Classes .... 9
2.2.2 Separation by Quadratic Surfaces: Two Classes .... 11
2.2.3 Separation by Linear Surfaces: Three Classes .... 12
2.3 Description of Methodology .... 13
2.3.1 Approach .... 13
2.3.2 Classification with a Linear Utility Function .... 15
2.3.3 Classification with Quadratic Utility Function .... 16
2.3.4 Data Format .... 17
2.3.5 Monotonicity Constraints .... 18
2.3.6 Non-Monotonicity .... 21
2.4 Credit Card Scoring .... 22
2.4.1 Dataset and Problem Statement .... 22
2.4.2 Numerical Experiments .... 24
2.5 Analysis .... 27
2.6 Concluding Remarks .... 29
3 CLASSIFICATION USING OPTIMIZATION: APPLICATION TO CREDIT RATINGS OF BONDS .... 37
3.1 Introduction .... 38
3.2 Description of Methodology .... 40
3.3 Constraints .... 47
3.3.1 Feasibility Constraints (F-Constraints) .... 47
3.3.2 Monotonicity Constraints (M-Constraints) .... 49

PAGE 5

3.3.3 .... 50
3.3.4 Gradient Monotonicity Constraints (GM-Constraints) .... 51
3.3.5 Risk Constraints (R-Constraints) .... 51
3.3.6 Monotonicity of Separating Function with Respect to Class Number (MSF-Constraints) .... 52
3.3.7 Model Squeezing Constraints (MSQ-Constraints) .... 53
3.3.8 Level Squeezing Constraints (LSQ-Constraints) .... 53
3.4 Choosing Model Flexibility .... 54
3.5 Error Estimation .... 59
3.6 Bond Classification Problem .... 60
3.7 Description of Data .... 61
3.8 Numerical Experiments .... 62
3.9 Concluding Remarks .... 64
4 TRACKING VOLUME-WEIGHTED AVERAGE PRICE .... 73
4.1 Introduction .... 73
4.2 Background and Preliminary Remarks .... 74
4.3 Model Design .... 78
4.3.1 Best Sample .... 79
4.4 Evaluation of Model Performance .... 79
4.5 Numerical Experiments .... 81
4.5.1 Experiments .... 81
4.5.2 Results and Discussion .... 82
4.6 Conclusions .... 83
5 REGULATORY IMPACTS ON RISK-RETURN EFFICIENT CREDIT PORTFOLIOS .... 89
5.1 Introduction .... 89
5.2 Optimization Approach .... 90
5.2.1 Definition of the CVaR Risk Measure .... 90
5.2.2 Formulation of the Optimization Model .... 91
5.2.3 Risk-Return Analysis of the Portfolio Assets .... 94
5.3 Application Example .... 95
5.4 Conclusion .... 96
REFERENCES .... 99
BIOGRAPHICAL SKETCH .... 103

PAGE 6

Table, page
2-1 Codification of data in the credit card scoring case study .... 31
2-2 Considered models, classes of utility functions, and data codifications .... 31
2-3 Calculation results for the considered models .... 32
2-4 Distribution of mispredicted applications .... 32
3-1 Data format of an entry of the dataset .... 65
3-2 Sizes of the considered datasets .... 66
3-3 List of the considered models .... 66
3-4 Calculation results for dataset A .... 67
3-5 Comparison of best found model with reference model for sets W, X, Y and Z .... 68
4-1 Performance of stock-based model .... 84
4-2 Performance of stock-and-index-based model .... 85
4-3 Best sample for stock-based model, S=800 .... 86
4-4 Best sample for stock-based model, S=500 .... 86
4-5 Best sample for stock-and-index-based model, S=800 .... 87
4-6 Best sample for stock-and-index-based model, S=500 .... 87

PAGE 7

Figure, page
2-1 Classification by a discriminant hyperplane .... 33
2-2 Inability of perfect classification by a discriminant hyperplane .... 33
2-3 The objects $z_l$ and $y_i$ are misclassified .... 33
2-4 Classification by quadratic separation surfaces .... 34
2-5 Three-class classification by linear hyperplanes .... 34
2-6 In-sample and out-of-sample errors for different models .... 35
2-7 In-sample and out-of-sample performance vs. size of training dataset .... 36
2-8 In-sample and out-of-sample performance for different models .... 36
3-1 Classification by separating functions .... 69
3-2 Nature of risk constraint .... 69
3-3 In-sample and out-of-sample errors for model A .... 70
3-4 In-sample and out-of-sample errors for model B .... 70
3-5 In-sample and out-of-sample errors for model C .... 71
3-6 In-sample and out-of-sample errors for model D .... 71
3-7 In-sample and out-of-sample errors for various models .... 72
4-1 Percentages of remaining volume vs. percentages of total volume .... 88
4-2 Daily volume distributions .... 88
5-1 Efficient lines and portfolio RORACs of the optimization problems (P) and (P') .... 97
5-2 Impact of the regulatory constraint on the optimal portfolio structures .... 98
5-3 Portfolio structures of the optimal portfolio in (P) and (P') .... 98

PAGE 8

Our study developed novel approaches to solving and analyzing challenging problems of financial engineering and risk management, such as classification and portfolio optimization, and developing trading strategies. Application areas of the considered problems include credit card scoring, ratings of financial instruments, credit portfolio management, and realization of volume-weighted average price.

A large part of our study considered mathematical programming approaches for classification in financial applications. We developed a general framework of classification based on optimization of a penalty function measuring misclassification. Based on the given information, we formulated a linear programming problem. The optimal solution defines the parameters for the model to classify new objects. As an application of the general methodology, we considered two practical problems arising in finance: credit card scoring and credit rating of bonds.

Next, we examined several new dynamic trading algorithms for realizing the volume-weighted average price. The considered algorithms used a generalized linear regression model based on mean-absolute deviation.

PAGE 9


PAGE 10

Fueled by the extraordinary development of computers and information technology, risk management has become a core activity in different areas, including finance. Generally speaking, risk management is the decision-making process involving the consideration of political, social, economic and engineering factors with relevant risk assessment related to a potential hazard in order to develop, analyze and compare regulatory options and select the optimal response to the hazard. Essentially, risk management is the combination of three steps: risk evaluation, emission and exposure control, and risk monitoring. In finance, risk management focuses on the fundamental principles of corporate finance and investment science (such as cash flow streams, arbitrages, pricing of financial instruments, interest rate term structures, portfolio management, and derivative securities).

Classification is one of the key tools used in risk management. According to a commonly accepted definition, classification is the process of categorizing objects according to their qualities (or extrinsic information attributed to them). In risk management, classification algorithms are commonly used for scoring credit applications, developing credit ratings of financial instruments and firms, analyzing multiple investment alternatives, and so on.

We develop novel optimization approaches to risk management applications (namely, credit ratings, trading strategies, and portfolio optimization under regulatory constraints).

Chapters 2 and 3 are devoted to the general methodology for classifying objects. Using mathematical programming techniques, we construct a penalty

PAGE 11

function that measures misclassification. The penalty function is formed by quadratic separating surfaces, dividing the object space into several areas. Elements from one area are supposed to belong to the same class. We adapt the penalty minimization problem to a linear programming problem to find optimal coefficients of the quadratic separating functions. To adjust the flexibility of the model, and to avoid overfitting, we impose various constraints on the linear programming problem.

The classification procedure includes two phases. In the first phase, the classification rule is developed based on "in-sample" information by using a dataset with known classification. In the second phase, we apply the obtained classification rule and validate it on an "out-of-sample" dataset. In Chapter 2, the methodology is applied to the credit card scoring problem. Credit scoring essentially represents the assessment of credit applications based on their different risk characteristics [26, 22]. The credit card applications are classified into three groups: "accept application," "more analysis is needed," and "reject application." A new feature of our approach is incorporating the experts' judgment into the model. For instance, the following preference is included by introducing an additional constraint: "give more preference to customers with higher incomes." Numerical experiments show that including constraints based on expert judgments improves the performance of the algorithm.

In Chapter 3, we apply the approach to the multi-class problem of bond ratings. In this study we replicate (approximate) Standard and Poor's credit rating. The bonds are classified according to their risk characteristics. We demonstrate that the classification procedure reliably extracts the information from a given dataset with known classification (in-sample classification), and then uses this information to classify new objects (out-of-sample classification). Although the approach was applied to finance applications and the obtained

PAGE 12

results are data specific, we conclude that the proposed methodology is a promising classification technique. By changing the flexibility of the model, a wide range of datasets can be processed, from very small to extremely large. The developed algorithm is a general technique, and can be applied to other engineering areas.

Chapter 4 introduces a new methodology for developing efficient trading strategies. Volume-Weighted Average Price (VWAP) is one of the commonly used trade-evaluation benchmarks in the stock market. A widely used method of VWAP trading is to trade the order according to the volume distribution of the market. In Chapter 4, we develop a dynamic trading strategy based on the forecast of the market's volume distribution using the techniques of generalized linear regression.

Chapter 5 considers another application of risk management. Efficient credit portfolio management is a key success factor of banking, and the credit risk from both internal and regulatory points of view must be considered. We introduce an optimization approach for managing a credit portfolio that maximizes the expected return, subject to internal and regulatory risk constraints. Using a simplified bank portfolio, we examine the impact of the regulatory risk limitation rules on the optimal solutions.

PAGE 13

We considered a general approach for classifying objects, using mathematical programming algorithms

2.1 Introduction

We consider a general approach for classifying objects and illustrate it with the credit card scoring problem. Classification can be defined by a classification function, assigning to each object some categorical value called the class number. However, this classification function has a very inconvenient property: it is discontinuous (and so cannot be used directly for classifying new objects). We reduce the

PAGE 14

classification problem to evaluating a continuous utility function from some general class of functions. This function is used for separating objects belonging to different sets. Values of the utility function for objects from one class should be in the same range. "The best" utility function in some class is found by minimizing the error of misclassification. Depending on the class of utility functions, this may be quite a difficult problem from an optimization point of view. However, if one is looking for a utility function that is a linear combination of other functions (possibly nonlinear with respect to indicator variables), the problem can be formulated as a linear programming problem. Mangasarian et al. [25] used this approach for failure-discriminant analysis with linear utility functions (applications to breast cancer diagnosis). Their utility function is linear in control parameters and indicator variables. Zopounidis et al. [44, 45] and Pardalos et al. [29] used a linear utility function for trichotomous classifications of credit card applications. Konno and Kobayashi [20] and Konno et al. [21] considered utility functions that are quadratic in indicator parameters and linear in control variables. The approach was tested with the classification of enterprises and breast cancer diagnosis. Konno and Kobayashi [20] and Konno et al. [21] imposed convexity constraints on utility functions to avoid disconnectedness of the discriminant regions. Like Konno and Kobayashi [20] and Konno et al. [21], we also use a function that is quadratic in indicator parameters. But instead of convexity constraints, we consider monotonicity constraints reflecting experts' opinions. Extending some ideas by Zopounidis et al. [44, 45] and Pardalos et al. [29], we consider a multi-class classification with many level sets of the utility function. Although our approach can classify objects into an arbitrary number of classes, in this chapter we consider the trichotomous (i.e., three-class) classification.

We focus on a numerical validation of the proposed algorithm. We classified a dataset of credit card applications submitted to a bank in Greece. This dataset

PAGE 15

was earlier considered by Damaskos [12] and Zopounidis et al. [45]. We investigated the impact of model flexibility on classification characteristics of the algorithm. We compared the performance of several classes of quadratic and linear utility functions with various constraints. Experiments showed the importance of imposing constraints adjusting the "flexibility" of the model to the size of the dataset. We studied "in-sample" and "out-of-sample" characteristics of the suggested algorithms. Our classification approach minimized the empirical risk (that is, the error of misclassification) on the training set (in-sample error). Nevertheless, the objective of the procedure was to classify objects outside of the training set with minimal error (out-of-sample error). The in-sample error is never greater than the out-of-sample error. Similar issues were addressed by Vapnik [42].

Broadly speaking, the classification problem can be regarded as a problem of data mining, or knowledge data discovery. During the last 50 years, a wide set of different methodologies was proposed for data discovery. Data mining techniques can be divided into five classes [8]: predictive modeling (predicting a specific attribute based on the other attributes in the data); clustering (grouping similar data records into subsets); dependency modeling (modeling a joint probability function of the process); data summarization (finding summaries of parts of the data); and change/deviation detection (accounting for sequence information in data records).

The credit card scoring problem is a particular case of a consumer lending problem using financial risk forecasting techniques [39]. Scoring models are divided into two types: 1) models (or techniques) helping creditors to decide whether or not to grant credit to consumers who applied for credit; 2) behavior-scoring models helping to decide how to deal with existing customers. We focus on the first type of scoring models.

PAGE 16

In credit scoring, a decision on issuing credit to a client is based on the application for credit and a report obtained from a credit-reporting agency. Also, information on previous applications and their performance is available; we call this information in-sample information. The creditor uses in-sample information together with applicant information to make a decision.

Credit scoring is essentially a way of separating (recognizing) specific subgroups in a population of objects (such as applications for credit) that have significantly different credit-risk characteristics. Starting with ideas on discriminating among groups (introduced by Fisher [15]), many different approaches were developed using statistical and operations research methods. The statistical tools include discriminant analysis (linear and logistic regressions) and recursive-partitioning algorithms (classification and decision trees). The operations research techniques primarily include mathematical programming methods, such as linear programming. Also, several new nonparametric and artificial intelligence approaches were recently developed. They include ubiquitous neural networks, expert systems, genetic algorithms, and nearest-neighbor methods [39].

A common weakness of many credit-scoring approaches is that they do not provide clear explanations of the "reasons" for preferring some objects versus others. Capon [10] considered this the main drawback of many credit-scoring algorithms. Also, many implementation issues [17] need to be addressed before using any credit-scoring model:

A credit-scoring classification problem can be defined as a decision process that has an input (answers to the application-form questions and various information obtained from the credit reference bureau) and an output (separating applications

PAGE 17

into "goods" and "bads") [39]. The objective of credit scoring is to find a rule that separates "goods" and "bads" with the smallest percentage of misclassifications. Note that perfect classification is impossible for several reasons. For instance, there could be errors in the sample data. Moreover, it is possible that some "good" applications and "bad" applications have exactly the same information in all data fields (i.e., not enough information is available to make a correct decision). Statistical learning theory [42] states that, for a model, the optimal prediction (i.e., out-of-sample classification with minimal misclassification) is achieved when the in-sample error is close to the out-of-sample error.

Relatively simple statistical approaches using linear scoring functions (Bayesian decision rule, discriminant analysis, and linear regression) became the most popular for classification problems. The Bayesian decision rule works especially well when the distribution of "goods" and "bads" can be described by multivariate normal distributions with a common covariance matrix; it reduces the problem to a linear decision rule. If the covariances of these populations are different, then it leads to a quadratic decision rule. However, Titterington [40] pointed out that in many cases the quadratic scoring function appears less robust than the linear one. Fisher [15] used discriminant analysis to find a linear combination of variables that separates the groups in the best way. His approach does not require an assumption of normality. The other way leading to linear discriminating functions is linear regression. Myers and Forgy [28] compared regression and discriminant analysis in credit scoring applications. Although there were many critics of discriminant and regression analysis [13, 16, 10], empirical experience shows that these linear scoring techniques are very competitive with other, more sophisticated approaches.

The other important method of linear discrimination is logistic regression. Earlier, this approach was associated with computational difficulties of maximum

PAGE 18

likelihood estimation, but nowadays this is not a problem because of readily available high computing power. Wiginton [43] was among the first to apply logistic regression to credit scoring; currently this approach is widely accepted.

Classification trees and expert systems represent another class of approaches. Classification trees typically are used in statistical, artificial intelligence, and machine learning applications. Makowski [23] was among the first to suggest using classification trees in credit scoring. Coffman [11] showed that classification trees perform better than discriminant analysis does when there are interactions among variables.

2.2 Background

2.2.1 Separation by Linear Surfaces: Two Classes

Mangasarian et al. [25] considered the failure discriminant approach with the linear utility function (applications to breast cancer diagnosis). Let objects $A_i$, $i = 1, \dots, m$, belong to the first class, and objects $B_l$, $l = 1, \dots, h$, belong to the second class. Also, let $a_i \in R^n$, $b_l \in R^n$ be vectors of indicators (characteristics) corresponding to $A_i$, $B_l$. If there exists a vector $(c, c_0) \in R^{n+1}$ such that

$$c^T a_i > c_0, \quad i = 1, \dots, m, \qquad (2.1)$$

$$c^T b_l < c_0, \quad l = 1, \dots, h, \qquad (2.2)$$

then we call $H(c, c_0) = \{x \in R^n \mid c^T x = c_0\}$ a discriminant hyperplane (Figure 2-1). A discriminant hyperplane may not exist (Figure 2-2). If we introduce a boundary with thickness $\delta$ between classes, the conditions in Equations (2.1) and (2.2) can be equivalently rewritten as

$$c^T a_i \ge c_0 + \delta, \quad i = 1, \dots, m; \qquad c^T b_l \le c_0 - \delta, \quad l = 1, \dots, h.$$

PAGE 19

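Let us define subspaces

$$H^+(c, c_0) = \{x \in R^n \mid c^T x > c_0\}, \qquad H^-(c, c_0) = \{x \in R^n \mid c^T x < c_0\}.$$

An object $A_i$ such that $a_i \in H^-(c, c_0)$ is a misclassified object of the first type. Also, an object $B_l$ such that $b_l \in H^+(c, c_0)$ is a misclassified object of the second type.

Let $y_i$ be the distance of $a_i \in H^-(c, c_0)$ from the hyperplane $H(c, c_0 + \delta)$, and let $z_l$ be the corresponding distance for $b_l \in H^+(c, c_0)$ (Figure 2-3). Classification is performed by minimizing a weighted sum of the misclassification distances $y_i$ and $z_l$. This is a linear programming problem (2.7), in which positive weights are associated with the misclassification of the first and the second types of objects.

Problem (2.7) has an optimal solution $(c, c_0, y_1, \dots, y_m, z_1, \dots, z_h)$, and the hyperplane $H(c, c_0)$ is the discriminant hyperplane with minimal error (for the given pair of weights). An out-of-sample object $g$ is classified by verifying whether it belongs to the set $H^+(c, c_0)$ or to the set $H^-(c, c_0)$. If $g \in H^+(c, c_0)$, we say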

PAGE 20

that $g$ belongs to the first class; and if $g \in H^-(c, c_0)$, it belongs to the second class. In the case when $g$ is on the separating hyperplane, $g \in H(c, c_0)$, it is not clear to which class the object belongs. We can assume, for simplicity, that object $g$, in this case, belongs to the first class. Equivalently, we can say that the object $g$ belongs to the first class if the utility function $U(c, c_0; x) = c^T x - c_0$ is nonnegative on the object $g$, i.e., $U(c, c_0; g) \ge 0$. If $U(c, c_0; g) < 0$, the object $g$ belongs to the second class.

2.2.2 Separation by Quadratic Surfaces: Two Classes

To improve the precision of discrimination, Konno and Kobayashi [20] and Konno et al. [21] proposed using quadratic surfaces instead of hyperplanes. Quadratic separation surfaces are more flexible than linear surfaces (Figure 2-4).

Let $Q(D, c, c_0)$ be a quadratic surface,

$$Q(D, c, c_0) = \{x \in R^n \mid x^T D x + c^T x = c_0\},$$

where $D$ is an $n \times n$ matrix.

The classification error is minimized by a linear programming problem in which $D \in R^{n \times n}$, $c \in R^n$, $c_0 \in R^1$, $y_i \in R^1$, $i = 1, \dots, m$, and $z_l \in R^1$, $l = 1, \dots, h$, are variables to be determined. Variables $y_i$ and $z_l$ represent distances of the misclassified objects from the quadratic surface $Q(D, c, c_0)$. The quadratic utility function has a significantly larger number of variables than the linear utility function. Also, in the case of quadratic utility, the separating surface $Q(D, c, c_0)$ may be

PAGE 21

non-convex (hyperbolic surface), which leads to disconnected discriminant regions. This may poorly represent the actual structure of financial data, since usually there are monotonic tendencies/preferences with respect to (w.r.t.) variables. To circumvent this difficulty, Konno and Kobayashi [20] and Konno et al. [21] imposed a positive semi-definiteness constraint on matrix $D$. This results in a semi-definite programming problem, which is solved by a cutting plane algorithm (Konno et al. [21]). Similar to the linear case, the utility function $U(D, c, c_0; x) = x^T D x + c^T x - c_0$ is used to classify out-of-sample objects. We say that an object $g$ belongs to the first class if $U(D, c, c_0; g) \ge 0$; otherwise, $g$ belongs to the second class.

2.2.3 Separation by Linear Surfaces: Three Classes

Zopounidis et al. [44, 45] and Pardalos et al. [29] used linear utility functions in the credit card scoring problem (classification of credit card applications). Compared to the approaches described in Sections 2.2.1 and 2.2.2, objects are classified into three rather than two classes. With the utility function, credit card applications are assigned to three classes (trichotomous model): the class of acceptable applications; the class of uncertain applications, for which more investigations are required; and the class of rejected applications (Figure 2-5). Zopounidis et al. [44, 45] considered three models: UTADIS I, UTADIS II and UTADIS III. These models differ only in the form of their penalty function.
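The two-class decision rule of Section 2.2.2 can be sketched in a few lines of Python. This is only an illustration of evaluating $U(D, c, c_0; x) = x^T D x + c^T x - c_0$ and applying the sign test; the function names and the coefficient values below are hypothetical and are not taken from the study.

```python
def quadratic_utility(D, c, c0, x):
    """Evaluate U(D, c, c0; x) = x^T D x + c^T x - c0 on plain Python lists."""
    n = len(x)
    quad = sum(D[k][l] * x[k] * x[l] for k in range(n) for l in range(n))
    lin = sum(c[k] * x[k] for k in range(n))
    return quad + lin - c0


def classify_two_class(D, c, c0, g):
    """Return 1 (first class) if U >= 0, otherwise 2 (second class)."""
    return 1 if quadratic_utility(D, c, c0, g) >= 0 else 2


# Hypothetical coefficients, chosen only for illustration:
# the separating surface is the unit circle x1^2 + x2^2 = 1.
D = [[1.0, 0.0], [0.0, 1.0]]
c = [0.0, 0.0]
c0 = 1.0

print(classify_two_class(D, c, c0, [2.0, 0.0]))  # outside the circle
print(classify_two_class(D, c, c0, [0.5, 0.5]))  # inside the circle
```

Setting every entry of `D` to zero recovers the linear rule of Section 2.2.1, and a point lying exactly on the surface is assigned to the first class, as in the text.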

PAGE 22

Coefficients of the separating hyperplanes are calculated with a linear optimization problem, similar to (2.7).

2.3 Description of Methodology

This section considers a general approach for classifying objects using a linear programming algorithm. Here, we also discuss the data format of the classification as well as controlling model flexibility by imposing monotonicity constraints on the developed optimization model.

2.3.1 Approach

Let us consider a classification problem with $J$ classes. Suppose that there is a set of objects $I = \{1, \dots, m\}$ with known classifications. Each object is represented by a point in an $n$-dimensional Euclidean space $R^n$. This set of points, $X = \{x^i \mid i \in I\}$, is called a training set. Suppose that a decomposition $\{I_j\}_{j=1}^{J}$ ($I_{j_1} \cap I_{j_2} = \emptyset$ for $j_1 \ne j_2$; $I = \bigcup_{j=1}^{J} I_j$) defines the classification of the $m$ objects. A point $x^i$ belongs to the class $k$ if $i \in I_k$.

Let us consider a utility function $U(x)$ defined on the set $X = \{x^i \mid i \in I\}$ and a set of thresholds $U_{TH} = \{u_0, u_1, \dots, u_{J-1}, u_J\}$ such that $u_{j-1}$
PAGE 23

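A search of a classification utility function in a predefined class, such as the linear or quadratic classes introduced earlier, is a more complicated problem (compared to the discrete utility function considered above). Let a class of utility functions $U(x)$ be parameterized by a vector parameter. Suppose that the training set $X = \{x^i \mid i \in I\}$ is split into classes $\{I_j\}_{j=1}^{J}$. Let us denote by $F(\sigma^+, \sigma^-)$ the total penalty for all misclassified data points. For the training set $X$, optimization problem (2.12) finds "the best" utility function $U(x)$ in the prespecified class of functions. The penalty function is nondecreasing w.r.t. the classification errors $\sigma_i^-$, $\sigma_i^+$, $i \in I$. Large deviations from a perfect classification imply large penalties. Various penalty functions are considered in the literature. Here, we consider the linear penalty function

$$F(\sigma^+, \sigma^-) = \sum_{i \in I} \left( \lambda_i^+ \sigma_i^+ + \lambda_i^- \sigma_i^- \right), \qquad (2.13)$$

where $\lambda_i^-$, $\lambda_i^+$ are penalty coefficients for data point $i$. In this paper we assume $\lambda_i^- = \lambda_i^+ = 1$ for all $i \in I$.

Let an object $x^i$ belong to the class $I_j$. The mathematical programming problem (2.12) implies that $\sigma_i^+ = \max\{0, U(x^i) - u_j\}$ and $\sigma_i^- = \max\{0, u_{j-1} + \delta - U(x^i)\}$. If the value of the utility function $U(x)$ on $x^i$ from class $I_j$ exceeds the upper threshold $u_j$, then the error $\sigma_i^+$ equals the difference between $U(x^i)$ and the upper threshold $u_j$; otherwise, $\sigma_i^+$ equals zero. Similarly, if the value of the utility $U(x)$ on $x^i$ is below $u_{j-1} + \delta$, then $\sigma_i^-$ equals the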

PAGE 24

difference between $u_{j-1} + \delta$ and $U(x^i)$. With a small parameter $\delta$, we introduced a separation area between class layers in order to uniquely identify the class of an object.

In problem (2.12), we optimize the penalty function w.r.t. the parameter vector that parameterizes the class of utility functions. Further, we consider two types of utility functions:

a) Linear function w.r.t. the indicator vector parameter $x$ and linear w.r.t. the control vector variable $c$:

$$U_c(x) = c^T x = \sum_{k=1}^{n} c_k x_k, \quad c \in R^n. \qquad (2.14)$$

b) Quadratic function w.r.t. the indicator vector parameter $x$ and linear w.r.t. the control variables $D$ and $c$:

$$U_{D,c}(x) = x^T D x + c^T x = \sum_{k=1}^{n} \sum_{l=1}^{n} d_{kl} x_k x_l + \sum_{k=1}^{n} c_k x_k, \qquad (2.15)$$

where $D$ is an $n \times n$ matrix and $c \in R^n$.

2.3.2 Classification with a Linear Utility Function

With linear penalty (2.13) and linear utility function (2.14), problem (2.12) can be rewritten in the form (2.16),

PAGE 25

where $C$ is a feasible set of coefficients of the linear utility function $U_c(x) = \sum_{k=1}^{n} c_k x_k$. When $C$ is a polyhedron (a set specified by a finite number of linear inequalities), (2.16) is a linear programming problem (LP).

2.3.3 Classification with Quadratic Utility Function

Let us consider a quadratic utility function (2.17). Further, we will use this representation of a quadratic function.
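The error terms and the linear penalty described in Section 2.3.1 can be sketched as follows. The sketch assumes unit penalty coefficients, thresholds $u_0, \dots, u_J$ with $u_0 = -\infty$ and $u_J = +\infty$ (an assumption made here so that the lowest and highest classes are one-sided), and hypothetical function names and data; it only evaluates the penalty for given utility values and does not solve the linear program itself.

```python
def classification_errors(utility_value, j, thresholds, delta):
    """Errors of one object of class j (1-based) for thresholds u_0..u_J.

    sigma_plus  = max(0, U - u_j)                 (above the upper threshold)
    sigma_minus = max(0, u_{j-1} + delta - U)     (below the lower threshold)
    """
    upper = thresholds[j]
    lower = thresholds[j - 1]
    sigma_plus = max(0.0, utility_value - upper)
    sigma_minus = max(0.0, lower + delta - utility_value)
    return sigma_plus, sigma_minus


def linear_penalty(utilities, classes, thresholds, delta):
    """Sum of all errors, i.e. the linear penalty with unit coefficients."""
    total = 0.0
    for u_val, j in zip(utilities, classes):
        sigma_plus, sigma_minus = classification_errors(u_val, j, thresholds, delta)
        total += sigma_plus + sigma_minus
    return total


INF = float("inf")
thresholds = [-INF, 1.0, 2.0, INF]  # hypothetical u_0..u_3 for J = 3 classes
delta = 0.1                         # hypothetical separation parameter
utilities = [0.5, 1.05, 2.5, 1.5]   # hypothetical utility values U(x^i)
classes = [1, 2, 3, 1]              # known class of each object

print(linear_penalty(utilities, classes, thresholds, delta))
```

In this toy data the second object lies inside the separation area just above $u_1$, and the fourth object overshoots the upper threshold of its class, so only these two contribute to the penalty.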

PAGE 26

With linear penalty (2.13) and quadratic utility function (2.17), problem (2.12) can be rewritten in the form (2.18). If $C$ is a polyhedron, then all constraints in problem (2.18) are linear w.r.t. the variables $D$, $c$, $u$ and the classification errors $\sigma^-$, $\sigma^+$, so (2.18) is a linear programming problem that can be solved using standard techniques.

2.3.4 Data Format

With our approach, objects are points in the space of indicator (characteristic) variables. For example, in the credit scoring problem considered in this paper, a credit card application is an object and the "marital status" field in the application is an indicator variable. For example, we can code this field as follows: 0 = divorced, 1 = single, and 2 = married/widowed (we make no distinction between married and widowed persons).

Suppose we have a set of $m$ objects, and each object is described by a vector $x = (x_1, \dots, x_n)$ of indicator variables (or indicators for short). The set of objects is a set of points $X = \{x^i \mid i \in I\}$ in $n$-dimensional space, where $I = \{1, \dots, m\}$. Each indicator has its own set of possible values and can take integer or continuous values. Supposing that the marital status is the third indicator, the set

PAGE 27

of possible values of this indicator is $\{0, 1, 2\}$, and for a widowed person $x_3 = 2$. In the case of discrete indicators, we suppose that the codification begins with 0 and takes successive integers. In some cases, a continuous variable may be replaced by a set of discrete indicators. In particular, if the utility function "is not sufficiently flexible" in some variable, a continuous indicator corresponding to this variable may be replaced by a set of discrete indicators. Suppose that we want to discretize the $i$-th indicator variable of an object $x = (x_1, \dots, x_n)$. Let us suppose that after rescaling the $i$-th indicator takes integer values in the interval $(0, k)$. We can replace the $i$-th component of the vector $\tilde{x}$ by a subvector $\tilde{x}_i = (x_{i,1}, x_{i,2}, \dots, x_{i,k-1})$ of discrete indicators defined by formula (2.19). In other words, we split the continuous indicator into a set of discrete indicators. There are many alternative ways to discretize a continuous indicator. For example, for a continuous indicator such as age, we can introduce the following codification:

Age: 18-20, 21-24, 25-29, 30-55, 56-59, 60-75, 76-84
Code: 0, 1, 2, 3, 4, 5, 6, 7, 8

Further, using formula (2.19), we can convert the discrete representation presented in the table to eight discrete indicators.

2.3.5 Monotonicity Constraints

The considered utility functions, especially the quadratic utility function, may be too flexible (have too many degrees of freedom) for datasets with a small number of data points. Imposing additional constraints may reduce excessive model flexibility. Konno and Kobayashi [20] and Konno et al. [21] imposed convexity constraints on the quadratic function to make it convex. Here, we consider a

PAGE 28

different type of constraints, which is driven by expert judgments. We consider utility functions that are monotonic w.r.t. some indicator variables. In finance applications, monotonicity w.r.t. some indicators follows from "engineering" considerations. For instance, in the credit scoring problem, we expect that the utility function is monotonic w.r.t. the income level of an applicant. Our numerical experiments showed that constraints such as monotonicity constraints may significantly improve the out-of-sample performance of the algorithm.

For a linear utility function, the condition on monotonicity w.r.t. an indicator variable is quite simple. The linear function $U_c(x) = c^T x = \sum_{i=1}^{n} c_i x_i$ increases w.r.t. $x_i$ if and only if $c_i > 0$.

For example, for a credit card scoring problem, we can impose a monotonicity constraint on the variable corresponding to the marital status indicator. To incorporate this preference, problem (2.16) is extended by adding the non-negativity linear constraint $c_3 \ge 0$.

Compared to a linear function, it is more difficult to impose monotonicity constraints on quadratic functions (w.r.t. indicator variables). We consider a subclass of monotonic quadratic functions with non-negative elements of matrix $D$ and vector $c$. Such a function is monotonic w.r.t. each variable on $R_+ = \{x \mid x_i \ge 0 \ \forall i \in \{1, \dots, n\}\}$. These functions are called rough monotonic functions. We should remember that in order to use such a class of functions, indicator variables must be non-negative and of increasing preference. Although

PAGE 29

this subclass represents only a small part of the variety of monotonic functions on $R_+$, it possesses good predictive capabilities for the considered applications.

Another approach is to impose monotonicity constraints for the points from a given dataset $X$. Suppose that we want to incorporate information that the utility function is increasing w.r.t. variable $x_i$. The monotonicity condition for a continuous function, written for the quadratic utility function (2.17), gives constraint (2.21). If we impose constraint (2.21) at all points of the dataset $X = \{x^j \mid j \in \{1, \dots, m\}\}$, we get a set of linear constraints that makes the utility function close to monotonic.

Using a similar approach, we can impose a more general type of monotonicity. Suppose that we know that some set of points $Y = \{y^i \mid i \in \{1, \dots, r\}\}$ is ordered according to our preference, and we want to find a utility function that is monotonic w.r.t. this preference on the set $Y$. We can impose on the utility function constraints requiring its values on $Y$ to follow this ordering.

PAGE 30

For the quadratic utility function (2.17), these constraints are quadratic w.r.t. indicator variables, but they are linear w.r.t. the parameters we want to find.

2.3.6 Non-Monotonicity

While for some indicators we can incorporate monotonicity constraints, for others it may be beneficial to use other properties, such as convexity and concavity. Let us consider the continuous indicator 'AGE' mentioned earlier. Suppose that middle age is preferable to young and old ages.

(1) We can assume that the utility function is concave w.r.t. variable $x_1$ (assume that this variable corresponds to 'AGE'). This can be assured by calculating the second derivative of the utility function w.r.t. $x_1$ and including in LP (2.18) the constraint that this derivative be non-positive.


If the utility function is quadratic in $x$, the resulting optimization problem is linear in control variables.

(2) We can change the coding of the age data field as follows:

Age:  18-20  21-24  25-29  30-55  56-59  60-75  76-84
Code: 0  1  2  3  4  3  2  1  0

With this coding, the utility function is monotonic w.r.t. indicator 'AGE'.

2.4 Credit Card Scoring

We explain our approach with the credit card scoring problem, which was considered earlier in [12], [45].

2.4.1 Dataset and Problem Statement

The dataset consists of 150 credit card applications submitted to the National Bank of Greece during the period 1995-1996. Each credit card application includes 25 fields. However, only seven fields selected by a credit manager were included in the analysis (see Table 2-1): [...] Damaskos [12] and Zopounidis et al. [45], we made two minor modifications in the codification of the data. This does not affect the results, but simplifies the approach. First, we have measured the profession field on a ten-point scale (fractional values are considered in Damaskos [12] and Zopounidis et al. [45]). Second, the minimal value for all fields is equal to 0 (the minimal value equals 1 in Damaskos [12] and Zopounidis et al. [45]).
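The recoding in item (2) above assigns codes that rise to a peak at the preferred bracket and then fall. A minimal sketch of such a tent-shaped code follows, assuming the brackets are simply numbered from youngest to oldest; the function name `tent_code` and the bracket indexing are illustrative and do not come from the text.

```python
def tent_code(index, n_brackets):
    """Return the code of the bracket at 0-based position `index`.

    The code grows by one per bracket up to the middle bracket and then
    shrinks by one per bracket, so the middle (most preferred) bracket
    receives the highest code. Hypothetical helper, not from the source.
    """
    if not 0 <= index < n_brackets:
        raise ValueError("index out of range")
    return min(index, n_brackets - 1 - index)


# Nine brackets reproduce the code row 0 1 2 3 4 3 2 1 0.
codes = [tent_code(i, 9) for i in range(9)]
print(codes)
```

After this recoding a higher code always means a more preferred age bracket, so an ordinary monotonicity constraint on the recoded variable expresses the "middle age is best" preference.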


impossible to declare a business phone number (e.g., retired applicant, shipment, etc.). These professions were included in the "not required" field.

Although the annual income of the applicants could be considered an important criterion in the credit card evaluation, it was not considered in this case study because of the two following reasons. [...]

Table 2-1 summarizes the fields and their values.

We know the class of each credit card application. The 150 applications are split into three classes: [...]


This trichotomous (i.e., three-group) classification is more flexible than the two-group classification. It allows a credit analyst to define an uncertain area, indicating that some credit card applications need to be reconsidered in the second stage of the analysis in order to determine their characteristics.

Table 2-1 maps applications to the set of points $X = \{x^1,..,x^{150}\}$ in $R^7$ space. This set is decomposed into three subsets $I_1$, $I_2$, and $I_3$: $X = I_1 \cup I_2 \cup I_3$; $I_i \cap I_j = \emptyset$, $i \ne j$. With our approach, we want to find thresholds $u_1$, $u_2$ and a utility function $U(x)$ that classifies the set $X = \{x^1,..,x^{150}\}$, i.e., [...]

We consider utility functions that are linear in decision variables and linear or quadratic in indicator variables (fields in Table 2-1). To find a utility function classifying the dataset with minimal error, we solve linear programming problem (3.16) or (3.18).

2.4.2 Numerical Experiments

To compare the classification capabilities of different models, we considered several classes of utility functions in combination with different codification schemes; see Table 2-2. Discretized data is obtained by converting the original "continuous" variable into a set of "discrete" variables according to the rule discussed in Section 2.3.4. For the models with a linear utility function we consider monotonic


constraints; for the models with a quadratic utility function we consider the "rough" monotonicity constraints discussed in Section 2.3.5.

To estimate the performance of the considered models, we used the "leave-one-out" cross-validation scheme. For a discussion of this scheme and other cross-validation approaches, see, for instance, Efron and Tibshirani [14]. Let us denote by $m = 150$ the number of applications in the dataset. For each model (defined by the class of the utility function and the codification according to Table 2-2), we conducted $m$ experiments. By excluding applications $x^i$ one by one from the set $X$, we constructed $m$ training sets, [...] (3.16) or (3.18) optimization problems and found optimal parameters of the utility function $U(x)$ and thresholds $U_{TH}$. Also, we computed the number of misclassified applications $M_i$ from the set $Y_i$. Let us introduce the variable [...] Denote by $E_{in\text{-}sample}$ an estimate of the in-sample error, which is calculated as follows: [...] In the last formula, the ratio $M_i$


Denote by $E_{out\text{-}of\text{-}sample}$ an estimate of the out-of-sample error. It is calculated as the ratio of the total number of misclassified out-of-sample objects in $m$ experiments to the number of experiments: [...] For the experiment we have chosen [...] $= 1$, $s = 0.00001$. The experiments were conducted on a Pentium III, 700 MHz, in a C/C++ environment. Linear programming problems were solved by the CPLEX 7.0 package. Calculation results are provided in Table 2-3. Table 2-4 presents the distribution of prediction results.

Figure 2-6 presents in-sample and out-of-sample errors for different models. Models with quadratic utility functions and no constraints, CQN and DQN, have zero in-sample error. Among the considered models, these models are the most "flexible". The smallest out-of-sample error is obtained by DQM, the algorithm with a rough monotonic quadratic utility function. Also, DLN has a relatively small out-of-sample error. Figure 2-6 shows that the best forecasting results (out-of-sample characteristics) are achieved by models that fit the data with large values of misclassification error (in-sample characteristics). Moreover, only the DQM model does not make a 2-class mistake, in which an application from the third class is predicted to be in the first class.

It is worth mentioning that discretizing continuous fields dramatically increases the number of variables in optimization problem (3.12) and the computing time. Imposing constraints on utility functions also increases the computing time. However, for all considered optimization problems, the computing time is less than 1 second. The CPLEX LP solver, which was used in this study, can solve large-scale problems with thousands of data points.
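The leave-one-out estimate of the out-of-sample error described above can be sketched in a few lines. This is only an illustration of the bookkeeping: the LP-based utility function is replaced by a deliberately simple stand-in classifier (nearest class mean on a one-dimensional score), and all function names are hypothetical.

```python
def leave_one_out_error(points, labels, fit, predict):
    """Fraction of points misclassified when each is held out in turn."""
    m = len(points)
    misses = 0
    for i in range(m):
        # Training set: every point except the i-th one.
        train_x = points[:i] + points[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        model = fit(train_x, train_y)
        if predict(model, points[i]) != labels[i]:
            misses += 1
    return misses / m


def fit_class_means(xs, ys):
    """Stand-in 'model': the mean score of each class."""
    sums, counts = {}, {}
    for x, y in zip(xs, ys):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}


def predict_nearest_mean(means, x):
    """Assign the class whose mean score is closest to x."""
    return min(means, key=lambda y: abs(x - means[y]))


scores = [0.1, 0.2, 0.3, 1.1, 1.2, 1.3, 2.1, 2.2, 2.3]
classes = [1, 1, 1, 2, 2, 2, 3, 3, 3]
e_out = leave_one_out_error(scores, classes, fit_class_means, predict_nearest_mean)
print(e_out)
```

Any classifier exposing a `fit` and a `predict` step can be plugged in the same way; with $m = 150$ this loop solves 150 training problems, one per held-out application.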


2.5 Analysis

The considered models are defined by the class of utility functions and the codification of the training data. We considered linear and quadratic utility functions with and without monotonicity constraints on coefficients. The main characteristics of a classification model are its in-sample and out-of-sample errors. A detailed discussion of the theoretical background of classification algorithms (in-sample vs. out-of-sample) is beyond the scope of this paper. Readers interested in these issues can find relevant information, for instance, in the book by Vapnik [42]. Theoretically, classification models demonstrate the following characteristics; see Figure 2-7. For small training sets, the model fits the data with zero in-sample error, whereas the expected out-of-sample error is large. As the size of the training set increases, the expected in-sample error diverges from zero (the model cannot exactly fit the training data). Increasing the size of the training set leads to a larger expected in-sample error and a smaller expected out-of-sample error. For sufficiently large datasets, the in-sample and out-of-sample errors are quite close. In this case, we say that the model is saturated with data.

We say that the class of models A is more flexible than the class of models B if class A includes class B. For instance, models with quadratic utility functions are more flexible than models with linear utility functions. Including constraints on the utility function reduces the flexibility of the class. Figure 2-8 illustrates theoretical in/out-of-sample characteristics for two classes of models with different flexibilities. For small training sets, the less flexible model gives a smaller expected out-of-sample error (compared to the more flexible model) and, consequently, predicts better than the more flexible model. However, the more flexible models outperform the less flexible models for large training sets (more flexible models have a smaller expected out-of-sample error compared to less flexible models). This


is because the less flexible model requires less data for saturation than the more flexible model.

Let us discuss the calculation results obtained with the models considered in this case study. Table 2-3 shows that for models with a quadratic utility function, the in-sample error is almost always [...]

In addition to the type of the utility function, the constraints and the data format contribute to the flexibility of a model. A model becomes less flexible if we impose additional constraints on the utility function. The experiments confirm this statement: the models without constraints have better in-sample characteristics (smaller in-sample error) than the same models with constraints. The other component of the flexibility of a model is the data format: models with discrete data are


more flexible than ones with continuous data, because many discrete fields represent one continuous field.

Finalizing the discussion, we want to emphasize that the choice of the class of utility functions plays a critical role in the out-of-sample performance of the algorithm. Excessive flexibility of the model may lead to "overfitting" and poor prediction characteristics of the model.

2.6 Concluding Remarks

We have studied a general methodology for classifying objects. It is based on finding an "optimal" classification utility function belonging to a prespecified class of functions. We have considered linear and quadratic utility functions (in indicator parameters) with monotonicity constraints. The constraints on utility functions were motivated by expert judgments. Selecting a proper class of utility functions and imposing constraints play a critical role in the success of the suggested methodology. For small training sets, flexible models perform quite poorly, and reducing flexibility (imposing constraints) may improve the forecasting capabilities of the model. With constraints, we can adjust the flexibility of the model to the size of the training dataset. For the considered classes of utility functions, the optimal utility function can be found using linear programming techniques. Linear programming leads to fast algorithms, which can be used for large datasets.

We have applied the developed methodology to a credit card scoring problem. The case study was performed with a dataset of credit card applications submitted to the National Bank of Greece. We found that for the considered dataset, the best out-of-sample characteristics (smallest out-of-sample error) are delivered by quadratic utility functions with positivity constraints on the control variables (coefficients of the utility function). These constraints make the utility function monotone w.r.t. indicator variables. The results of numerical experiments are in


line with the general considerations about the comparative performance of models having different flexibilities.

This study was focused on the evaluation of computational characteristics of the suggested approach. Although the obtained results are data specific, we can conclude that the approach leads to quite robust classification techniques. By changing the flexibility of the model we can handle a wide range of datasets, from very small to extremely large.


Table 2-1: Codification of data in the credit card scoring case study

Table 2-2: Considered models, classes of utility functions, and data codifications

Model identifier   Type of data in fields 1-4   Type of utility function   Type of monotonicity
CLN                Non-discretized              Linear                     No
CLM                Non-discretized              Linear                     Monotonic
CQN                Non-discretized              Quadratic                  No
CQM                Non-discretized              Quadratic                  Rough
DLN                Discretized                  Linear                     No
DLM                Discretized                  Linear                     Monotonic
DQN                Discretized                  Quadratic                  No
DQM                Discretized                  Quadratic                  Rough


Table 2-3: Calculation results for the considered models

         In-Sample           Out-of-Sample
Model    AVR      STDV       AVR       STDV
CLN      4.04%    0.19%      10.00%    30.10%
CLM      5.29%    0.33%      10.00%    30.10%
CQN      0%       0%         11.33%    11.33%
CQM      2.65%    0.22%      10.00%    30.10%
DLN      1.98%    0.36%      8.67%     28.23%
DLM      3.33%    0.26%      10.00%    30.10%
DQN      0.01%    0.11%      18.00%    38.55%
DQM      4.06%    0.57%      7.33%     26.16%

Table 2-4: Distribution of mispredicted applications

Actual class       1    1    1    2    2    2    3    3    3
Predicted class    1    2    3    1    2    3    1    2    3
Model              Number of applications
CLN               21    3    0    3   46    3    1    5   68
CLM               21    3    0    3   46    3    1    5   68
CQN               21    3    0    5   43    4    1    4   69
CQM               22    2    0    3   45    4    1    5   68
DLN               20    4    0    1   48    3    1    4   69
DLM               20    4    0    4   45    3    1    3   70
DQN               17    7    0    9   38    5    1    5   68
DQM               23    1    0    2   47    3    0    5   69
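The two tables can be cross-checked against each other. The sketch below recomputes an error rate and its sample standard deviation from the counts in Table 2-4, under the assumption (not stated explicitly in the text) that those counts are the leave-one-out predictions and that STDV is the sample standard deviation of the per-application 0/1 error indicator. The helper names are illustrative.

```python
from math import sqrt

# Counts per (actual, predicted) pair, in the column order of Table 2-4:
# (1,1) (1,2) (1,3) (2,1) (2,2) (2,3) (3,1) (3,2) (3,3)
TABLE_2_4 = {
    "CLN": [21, 3, 0, 3, 46, 3, 1, 5, 68],
    "CLM": [21, 3, 0, 3, 46, 3, 1, 5, 68],
    "CQN": [21, 3, 0, 5, 43, 4, 1, 4, 69],
    "CQM": [22, 2, 0, 3, 45, 4, 1, 5, 68],
    "DLN": [20, 4, 0, 1, 48, 3, 1, 4, 69],
    "DLM": [20, 4, 0, 4, 45, 3, 1, 3, 70],
    "DQN": [17, 7, 0, 9, 38, 5, 1, 5, 68],
    "DQM": [23, 1, 0, 2, 47, 3, 0, 5, 69],
}
CORRECT = (0, 4, 8)  # positions where actual == predicted


def error_rate(counts):
    """Share of applications whose predicted class differs from the actual one."""
    total = sum(counts)
    wrong = total - sum(counts[i] for i in CORRECT)
    return wrong / total


def sample_std(counts):
    """Sample standard deviation (n - 1 denominator) of the 0/1 error indicator."""
    n = sum(counts)
    p = error_rate(counts)
    return sqrt(p * (1 - p) * n / (n - 1))


for name, counts in TABLE_2_4.items():
    print(name, f"{100 * error_rate(counts):.2f}%", f"{100 * sample_std(counts):.2f}%")
```

Under these assumptions the recomputed error rates agree with the out-of-sample AVR column of Table 2-3 for every model, and the recomputed deviations agree with the STDV column for every model except CQN, where the formula gives roughly 31.8% rather than the printed 11.33%.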


Figure 2-1: Classification by a discriminant hyperplane

Figure 2-2: Inability of perfect classification by a discriminant hyperplane

Figure 2-3: The objects $z_l$ and $y_i$ are misclassified


Figure 2-4: Classification by quadratic separation surfaces

Figure 2-5: Three-class classification by linear hyperplanes


Figure 2-6: In-sample and out-of-sample errors for different models (C/D = continuous/discrete data, Q/L = quadratic/linear utility function, M/N = with/without monotonicity constraints)


Figure 2-7: In-sample and out-of-sample performance vs. size of training dataset

Figure 2-8: In-sample and out-of-sample performance for different models


In this chapter we extend the approach discussed in the previous chapter to a multi-class classification method. Using mathematical programming techniques, we minimize a penalty function (measuring misclassification), which is constructed with quadratic separating surfaces dividing the space into several areas. Elements from one area are supposed to belong to the same class. We have formulated a linear programming problem for finding optimal coefficients of the quadratic separating functions. To adjust the flexibility of the model and to avoid overfitting we have imposed various constraints on the linear programming problem.

The classification procedure includes two phases. In the first phase, the classification rule is developed based on "in-sample" information by using a dataset with known classification. In the second phase, we applied the obtained classification rule and validated it on an "out-of-sample" dataset.

We applied the suggested approach to the problem of rating bonds and replicated (approximated) the Standard and Poor's credit rating. The bonds are classified according to their risk characteristics. We demonstrated that the classification procedure reliably extracts the information from a given dataset with known classification (in-sample classification) and then uses this information to classify new objects (out-of-sample classification). Although the approach is validated with a finance application, it is quite general and can be applied to other engineering areas.


3.1 Introduction

The chapter considers a general approach for classifying objects into several classes and applies it to a bond-rating problem.

Mangasarian et al. [25] used a utility function for failure discriminant analysis (with applications to breast cancer diagnosis). The utility function was considered to be linear in control parameters and indicator variables, and it was found by minimizing the error of misclassification. Zopounidis et al. [44] and [45] and Pardalos et al. [29] used linear utility functions for trichotomous classifications of credit card applications. Konno and Kobayashi [20] and Konno et al. [21] considered utility functions that are quadratic in indicator parameters and linear in decision parameters. The approach was tested with the classification of enterprises and breast cancer diagnosis. Konno and Kobayashi [20] and Konno et al. [21] imposed convexity constraints on utility functions in order to avoid discontinuity of discriminant regions. Similar to Konno and Kobayashi [20] and Konno et al. [21], Bugera et al. [9] applied a quadratic utility function to trichotomous classification, but instead of convexity constraints, monotonicity constraints reflecting experts' opinions were used. The approach by Bugera et al. [9] is closely related to ideas by Zopounidis et al. [44] and [45] and Pardalos et al. [29]; it considers a multi-class classification with several level sets of the utility function, where every level set corresponds to a separate class.

This chapter extends the classification approach considered by Bugera et al. [9]. Several innovations improving the efficiency of the algorithm are suggested: A set of utility functions, called separating functions, is used for classification. The separating functions are quadratic in indicator variables and linear in decision variables. A set of optimal separating functions is found by minimizing the misclassification error. The problem is formulated as a linear programming problem with respect to (w.r.t.) decision variables.


Controlling the flexibility of the model with constraints is crucially important for the suggested approach. Quadratic separating functions (depending upon the problem dimension) may have a very large number of free parameters. Therefore, a tremendously large dataset may be needed to "saturate" the model with data. Constraints reduce the number of degrees of freedom of the model and adjust the "flexibility" of the model to the size of the dataset.

The chapter is focused on a numerical validation of the proposed algorithm. We rated a set of international bonds using the proposed algorithm. The dataset for the case study was provided by the research group of the Risk Solutions branch of Standard and Poor's, Inc. We investigated the impact of model flexibility on the classification characteristics of the algorithm and compared the performance of several models with different types of constraints. Experiments showed the importance of constraints adjusting the flexibility of the model. We studied "in-sample" and "out-of-sample" characteristics of the suggested algorithm. At the first stage of the algorithm, we minimized the empirical risk, that is, the error of misclassification on a training set (in-sample error). However, the real objective of the algorithm is to classify objects outside the training set with a minimal error (out-of-sample error). The in-sample error is always no greater than the out-of-sample error. Similar issues were studied in Statistical Learning Theory, Vapnik [42].

The chapter is organized as follows. Section 3.2 provides a general description of the approach. Section 3.3 considers constraints applied to the studied optimization problem. Section 3.4 discusses techniques for choosing model flexibility. Section 3.5 explains how the error estimation was done for the models. Section 3.6 discusses the bond-rating problem used for testing the methodology. Section 3.7 describes the datasets used in the study. Section 3.8 presents the results


of computational experiments and analyzes the obtained results. We finalize the chapter with concluding remarks in Section 3.9.

3.2 Description of Methodology

The object space is a set of elements (objects) to be classified. Each element of the space has $n$ quantitative characteristics describing properties of the considered objects.

We represent objects by $n$-dimensional vectors and the object space by an $n$-dimensional set $R^n$. The classification problem assigns elements of the object space to several classes so that each class consists of elements with similar properties. The classification is based on available prior information. In our approach, the prior information is provided by a set of objects $S = \{\vec{x}^1,..,\vec{x}^m\}$ with known classification (in-sample dataset). The purpose of the methodology is to develop a classification algorithm that assigns a class to a new object based on the in-sample information.

Let us consider a classification problem with objects having $n$ characteristics and $J$ classes. Since each object is represented by a vector in a multi-dimensional space $R^n$, the classification can be defined by an integer-valued function $f_0(\vec{x})$, $\vec{x} \in R^n$. The value of the function defines the class of an object. We call $f_0(\vec{x})$ a classification function. This function splits the object space into $J$ non-intersecting areas:

$R^n = \bigcup_{i=1}^{J} F_i$, $F_i \cap F_j = \emptyset$, $i \ne j$, (3.1)

where each area $F_i$ consists of elements belonging to the corresponding class $i$:

$F_i = \{\vec{x} \in R^n \mid f_0(\vec{x}) = i\}$. (3.2)

We can approximate the classification function $f_0(\vec{x})$ using optimization methods. Let $F(\vec{x})$ be a cumulative distribution function of objects in the object


space $R^n$, and let $\Theta$ be a parameterized set of discrete-valued functions. Then the function $f_0(\vec{x})$ can be approximated by solving the following minimization problem:

$\min_{f \in \Theta} \int_{R^n} Q(f(\vec{x}) - f_0(\vec{x}))\, dF(\vec{x})$, (3.3)

where $Q(f(\vec{x}) - f_0(\vec{x}))$ is a penalty function defining the value of misclassification for a single object $\vec{x}$, and $\Theta$ is a parameterized set of approximating functions. The optimal solution $f(\vec{x})$ of optimization problem (3.3) gives an approximation of the function $f_0(\vec{x})$. The main difficulty in solving problem (3.3) is the discontinuity of the functions $f(\vec{x})$ and $f_0(\vec{x})$, which leads to non-convex optimization. To circumvent this difficulty, we reformulate problem (3.3) in a convex optimization setting.

Let us consider a classification function $f(\vec{x})$ defining a classification on the object space $R^n$. Suppose we have a set of continuous functions $U_0(\vec{x}),...,U_J(\vec{x})$. We call them separating functions for the classification function $f(\vec{x})$ if for every object $\vec{x}$ from class $j = f(\vec{x})$:

values of functions with numbers lower than $j$ are positive:

$U_0(\vec{x}),...,U_{j-1}(\vec{x}) > 0$; (3.4)

values of functions with numbers higher than or equal to $j$ are negative or zero:

$U_j(\vec{x}),...,U_J(\vec{x}) \le 0$. (3.5)

If we know the functions $U_0(\vec{x}),...,U_J(\vec{x})$, we can classify objects according to the following rule: [...]

The class number of an object $\vec{x}$ can be determined by the number of positive values of the functions: an object in class $j$ has exactly $j$ positive separating functions. Figure 3-1 illustrates this property.
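The counting property just stated can be turned into a small classification routine. The sketch below assumes that classes are numbered 1 to J and that the values of the J+1 separating functions at an object are already available as a list; the function name and the use of -1 for "no class can be assigned" are illustrative choices.

```python
def classify(values):
    """Return the class implied by separating-function values, or -1.

    `values` holds [U_0(x), ..., U_J(x)]. An object belongs to class j
    when exactly the first j values are positive and all remaining
    values are non-positive. If the positive values do not form such a
    leading block, no class can be assigned and -1 is returned.
    """
    j = sum(1 for v in values if v > 0)
    leading_block = all(v > 0 for v in values[:j])
    # Valid classes are 1..J, where J = len(values) - 1.
    if leading_block and 1 <= j <= len(values) - 1:
        return j
    return -1


print(classify([2.0, 1.0, -1.0, -3.0]))   # first two values positive
print(classify([1.0, -1.0, 0.5, -2.0]))   # positives are not a leading block
```

The second call shows the situation in which the separating surfaces are inconsistent at a point: the number of positive values is 2, but they are not the first two, so the rule gives no answer.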


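Suppose we can represent any classification function $f(\vec{x})$ from the parameterized set of functions by a set of separating functions. By constructing a penalty for the classification $f_0(\vec{x})$ using separating functions, we can formulate optimization problem (3.3) with respect to the parameters of the separating functions.

Suppose the classification function $f(\vec{x})$ is defined by a set of separating functions $U_0(\vec{x}),...,U_J(\vec{x})$. We denote by $D_{f_0}(U_0,..,U_J)$ an integral function measuring the deviation of the classification (implied by the separating functions $U_0(\vec{x}),...,U_J(\vec{x})$) from the true classification defined by $f_0(\vec{x})$:

$D_{f_0}(U_0,..,U_J) = \int_{R^n} Q(U_0(\vec{x}),..,U_J(\vec{x}), f_0(\vec{x}))\, dF(\vec{x})$, (3.7)

where $Q(U_0(\vec{x}),..,U_J(\vec{x}), f_0(\vec{x}))$ is a penalty function.

Further, we consider the following penalty function:

$Q(U_0(\vec{x}),..,U_J(\vec{x}), f_0(\vec{x})) = \sum_{k=0}^{f_0(\vec{x})-1} \lambda_k \left(-U_k(\vec{x})\right)^+ + \sum_{k=f_0(\vec{x})}^{J} \lambda_k \left(U_k(\vec{x})\right)^+$, (3.8)

where $(y)^+ = \max(0, y)$, and $\lambda_k$, $k = 0,..,J$ are positive parameters. The penalty function equals zero if $U_k(\vec{x}) \ge 0$ for $k = 0,..,f_0(\vec{x})-1$, and $U_k(\vec{x}) \le 0$ for $k = f_0(\vec{x}),..,J$. If the classification defined by the separating functions $U_0(\vec{x}),...,U_J(\vec{x})$ according to rule (3.6) coincides with the true classification $f_0(\vec{x})$, then the value of penalty function (3.8) equals zero for any object $\vec{x}$. If the penalty is positive for an object $\vec{x}$, then the separating functions misclassify this object.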
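The behaviour of the penalty described above (zero when every separating function has the right sign for the object's true class, positive otherwise) can be sketched as follows. The exact formula is inferred from the later problem (3.15) with the small constant set to zero, so it should be read as an assumption; the argument names are illustrative.

```python
def penalty(values, true_class, weights):
    """Penalty of one object given [U_0(x), ..., U_J(x)].

    Functions with index below `true_class` should be positive, so a
    negative value u contributes weight * (-u). Functions with index at
    or above `true_class` should be non-positive, so a positive value u
    contributes weight * u.
    """
    def positive_part(y):
        return max(0.0, y)

    low = sum(w * positive_part(-u)
              for u, w in zip(values[:true_class], weights[:true_class]))
    high = sum(w * positive_part(u)
               for u, w in zip(values[true_class:], weights[true_class:]))
    return low + high


unit = [1.0, 1.0, 1.0, 1.0]
print(penalty([2.0, 1.0, -1.0, -3.0], 2, unit))   # all signs correct
print(penalty([2.0, -0.5, -1.0, -3.0], 2, unit))  # U_1 has the wrong sign
```

Because each term is the positive part of a function that is linear in the unknown coefficients, the penalty is piecewise linear in those coefficients, which is what later allows the reduction to linear programming.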


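Therefore, the penalty function defined by (3.8) provides the condition of correct classification by the separating functions: [...]

The choice of penalty function (3.8) is motivated by the possibility of building an efficient algorithm for the optimization of integral (3.7) when the separating functions are linear w.r.t. control parameters. In this case, optimization problem (3.3) can be reduced to linear programming.

After introducing the separating functions we reformulate optimization problem (3.3) as follows:

$\min_{U_0,..,U_J} D_{f_0}(U_0,..,U_J)$. (3.10)

This optimization problem finds optimal separating functions. We can approximate the function $D_{f_0}(U_0,..,U_J)$ by sampling objects according to the cumulative distribution function $F(\vec{x})$. Assuming that $\vec{x}^1,..,\vec{x}^m$ are some sample points, the approximation of $D_{f_0}(U_0,..,U_J)$ is given by:

$\tilde{D}^m_{f_0}(U_0,..,U_J) = \frac{1}{m} \sum_{i=1}^{m} Q(U_0(\vec{x}^i),..,U_J(\vec{x}^i), f_0(\vec{x}^i))$. (3.11)

Therefore, the approximation of deviation function (3.7) equals:

$\tilde{D}^m_{f_0}(U_0,..,U_J) = \frac{1}{m} \sum_{i=1}^{m} \left( \sum_{k=0}^{f_0(\vec{x}^i)-1} \lambda_k \left(-U_k(\vec{x}^i)\right)^+ + \sum_{k=f_0(\vec{x}^i)}^{J} \lambda_k \left(U_k(\vec{x}^i)\right)^+ \right)$. (3.12)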


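To avoid possible ambiguity when the value of a separating function equals 0, we introduced a small positive constant $\delta$ inside each term of the penalty function:

$Q(U_0(\vec{x}),..,U_J(\vec{x}), f_0(\vec{x})) = \sum_{k=0}^{f_0(\vec{x})-1} \lambda_k \left(-U_k(\vec{x}) + \delta\right)^+ + \sum_{k=f_0(\vec{x})}^{J} \lambda_k \left(U_k(\vec{x}) + \delta\right)^+$. (3.13)

The penalty function equals zero if $U_k(\vec{x}) \ge \delta$ for $k = 0,..,f_0(\vec{x})-1$, and $U_k(\vec{x}) \le -\delta$ for $k = f_0(\vec{x}),..,J$. The approximation of deviation function (3.7) becomes

$\tilde{D}^m_{f_0}(U_0,..,U_J) = \frac{1}{m} \sum_{i=1}^{m} \left( \sum_{k=0}^{f_0(\vec{x}^i)-1} \lambda_k \left(-U_k(\vec{x}^i) + \delta\right)^+ + \sum_{k=f_0(\vec{x}^i)}^{J} \lambda_k \left(U_k(\vec{x}^i) + \delta\right)^+ \right)$. (3.14)

Further, we parameterized each separating function by a $K$-dimensional vector $\vec{\alpha} \in A \subset R^K$. Therefore, a set of separating functions $U_{(\vec{\alpha}^0,..,\vec{\alpha}^J)}(\vec{x}) = \{U_{\vec{\alpha}^0},..,U_{\vec{\alpha}^J}\}$ is determined by a set of vectors $\vec{\alpha}^0,..,\vec{\alpha}^J$. With this parameterization we reformulated problem (3.10) as follows:

$\min_{\vec{\alpha}^0,..,\vec{\alpha}^J \in A} \sum_{i=1}^{m} \left( \sum_{k=0}^{f_0(\vec{x}^i)-1} \lambda_k \left(-U_{\vec{\alpha}^k}(\vec{x}^i) + \delta\right)^+ + \sum_{k=f_0(\vec{x}^i)}^{J} \lambda_k \left(U_{\vec{\alpha}^k}(\vec{x}^i) + \delta\right)^+ \right)$, (3.15)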
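The effect of the small positive constant can be seen on a value that sits exactly at zero. The following sketch evaluates the sample-average deviation with a margin, following the structure of (3.15); the names `delta`, `weights`, and both function names are illustrative, and the formula is a reading of the text rather than a verified transcription.

```python
def margin_penalty(values, true_class, weights, delta):
    """Penalty of one object with a margin `delta` inside each term."""
    total = 0.0
    for k, (u, w) in enumerate(zip(values, weights)):
        # Below the true class the function should be at least delta;
        # at or above it the function should be at most -delta.
        signed = -u if k < true_class else u
        total += w * max(0.0, signed + delta)
    return total


def empirical_deviation(sample_values, sample_classes, weights, delta):
    """Average margin penalty over a sample of objects."""
    m = len(sample_values)
    return sum(
        margin_penalty(v, c, weights, delta)
        for v, c in zip(sample_values, sample_classes)
    ) / m


sample = [[2.0, 1.0, -1.0, -3.0], [2.0, 0.0, -1.0, -3.0]]
classes = [2, 2]
unit = [1.0, 1.0, 1.0, 1.0]
print(empirical_deviation(sample, classes, unit, 0.0))  # boundary value goes unpenalized
print(empirical_deviation(sample, classes, unit, 0.5))  # boundary value is penalized
```

With a zero margin the second object, whose $U_1$ equals 0, incurs no penalty even though the rule would not count that function as positive; with a positive margin the same object contributes to the deviation.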


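where $A \subset R^K$. By introducing new variables $\sigma_i^j$, we reduced problem (3.15) to a mathematical programming problem with a linear objective function:

$\min \sum_{i=1}^{m} \sum_{j=0}^{J} \lambda_j \sigma_i^j$
subject to
$\sigma_i^j \ge -U_{\vec{\alpha}^j}(\vec{x}^i) + \delta$, $j = 0,..,f_0(\vec{x}^i)-1$,
$\sigma_i^j \ge U_{\vec{\alpha}^j}(\vec{x}^i) + \delta$, $j = f_0(\vec{x}^i),..,J$,
$\vec{\alpha}^0,..,\vec{\alpha}^J \in A$,
$\sigma_i^0,..,\sigma_i^J \ge 0$. (3.16)

Further, we consider that the separating functions are linear in control parameters $\alpha_1,..,\alpha_K$. In this case the separating functions can be represented in the following form:

$U_{\vec{\alpha}^j}(\vec{x}) = \sum_{k=1}^{K} \alpha_k^j g_k(\vec{x})$. (3.17)

In this case, optimization problem (3.16) for finding optimal separating functions can be reduced to the following linear programming problem:

$\min \sum_{i=1}^{m} \sum_{j=0}^{J} \lambda_j \sigma_i^j$
subject to
$\sigma_i^j + \sum_{k=1}^{K} \alpha_k^j g_k(\vec{x}^i) \ge \delta$, $j = 0,..,f_0(\vec{x}^i)-1$,
$\sigma_i^j - \sum_{k=1}^{K} \alpha_k^j g_k(\vec{x}^i) \ge \delta$, $j = f_0(\vec{x}^i),..,J$,
$\vec{\alpha}^0,..,\vec{\alpha}^J \in A$,
$\sigma_i^0,..,\sigma_i^J \ge 0$. (3.18)

Further, we consider quadratic (in indicator variables) separating functions:

$U_j(\vec{x}) = \sum_{k=1}^{n} \sum_{l=1}^{n} a_{kl}^j x_k x_l + \sum_{k=1}^{n} b_k^j x_k + c^j$, $j = 0,..,J$. (3.19)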


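Optimization problem (3.18) with quadratic separating functions is reformulated as follows:

$\min_{a,b,c,\sigma} \sum_{i=1}^{m} \sum_{j=0}^{J} \lambda_j \sigma_i^j$
subject to
$\sigma_i^j + \sum_{k=1}^{n} \sum_{l=1}^{n} a_{kl}^j x_k^i x_l^i + \sum_{k=1}^{n} b_k^j x_k^i + c^j \ge \delta$, $j = 0,..,f_0(\vec{x}^i)-1$,
$\sigma_i^j - \sum_{k=1}^{n} \sum_{l=1}^{n} a_{kl}^j x_k^i x_l^i - \sum_{k=1}^{n} b_k^j x_k^i - c^j \ge \delta$, $j = f_0(\vec{x}^i),..,J$,
$\sigma_i^0,..,\sigma_i^J \ge 0$. (3.20)

Although there are $J+1$ separating functions in problem (3.20), only $J-1$ functions are essential for the classification. The functions $\sum_{k=1}^{n} \sum_{l=1}^{n} a_{kl}^0 x_k x_l + \sum_{k=1}^{n} b_k^0 x_k + c^0$ [...] (3.20). However, these boundary functions can be used for adjusting the flexibility of the classification model. In the next section, we will show how to use these functions for imposing the so-called "squeezing" constraints.

For the case $J = 2$ with only two classes, problem (3.20) finds a quadratic surface $\sum_{k=1}^{n} \sum_{l=1}^{n} a_{kl}^1 x_k x_l + \sum_{k=1}^{n} b_k^1 x_k + c^1 = 0$ dividing the object space $R^n$ into two areas. After solving optimization problem (3.20) we expect that a majority of objects from the first class will belong to the first area. On these points the function $\sum_{k=1}^{n} \sum_{l=1}^{n} a_{kl}^1 x_k x_l + \sum_{k=1}^{n} b_k^1 x_k + c^1$ is positive. Similarly, for a majority of objects from the second class the function $\sum_{k=1}^{n} \sum_{l=1}^{n} a_{kl}^1 x_k x_l + \sum_{k=1}^{n} b_k^1 x_k + c^1$ is negative.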
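The reason problem (3.20) stays a linear program is that a quadratic separating function, although quadratic in the indicator variables, is linear in its coefficients. The sketch below makes this explicit by writing the function as a dot product between a coefficient vector and a vector of monomial features; the helper names are illustrative.

```python
def quadratic_features(x):
    """Monomials x_k * x_l (all ordered pairs), then x_k, then the constant 1."""
    n = len(x)
    feats = [x[k] * x[l] for k in range(n) for l in range(n)]
    feats += list(x)
    feats.append(1.0)
    return feats


def flatten_parameters(a, b, c):
    """Coefficients in the same order as quadratic_features."""
    return [a_kl for row in a for a_kl in row] + list(b) + [c]


def separating_value(a, b, c, x):
    """Direct evaluation of sum a_kl x_k x_l + sum b_k x_k + c."""
    n = len(x)
    quad = sum(a[k][l] * x[k] * x[l] for k in range(n) for l in range(n))
    lin = sum(b[k] * x[k] for k in range(n))
    return quad + lin + c


a = [[1.0, 0.5], [0.5, 2.0]]
b = [-1.0, 3.0]
c = 0.25
x = [1.0, 2.0]

direct = separating_value(a, b, c, x)
as_dot = sum(p * g for p, g in zip(flatten_parameters(a, b, c), quadratic_features(x)))
print(direct, as_dot)
```

For a fixed data point the feature vector is a list of known numbers, so every constraint of the problem is a linear inequality in the coefficients and the auxiliary variables.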


For the case with $J > 2$, the geometrical interpretation of optimization problem (3.20) refers to the partition of the object space $R^n$ into $J$ areas by $J-1$ non-intersecting [...]

3.3 Constraints

The considered separating functions, especially the quadratic functions, may be too flexible (have too many degrees of freedom) for datasets with a small number of data points. Imposing additional constraints may reduce excessive model flexibility. In this section, we will discuss different types of constraints applied to the model.

Konno and Kobayashi [20] and Konno et al. [21] considered convexity constraints on indicator variables of a quadratic utility function. Bugera et al. [9] imposed monotonicity constraints on the model to incorporate expert preferences.

Constraints play a crucial role in developing the classification model because they reduce excessive flexibility of a model for small training datasets. Moreover, a classification with multiple separating functions may not be possible for the majority of objects if appropriate constraints are not imposed.

3.3.1 Feasibility Constraints (F-Constraints)

For classification with multiple separating functions we may potentially come to a possible intersection of separating surfaces. This may lead to the inability of the approach to classify some objects. To circumvent this difficulty, we introduce feasibility constraints that keep the separating functions apart from each other. It


makes it possible to classify any new point by rule (3.6). In general, these constraints have the form:

$U_j(\vec{x}) \le U_{j-1}(\vec{x})$, $j = 1,..,J$, $\vec{x} \in W$, (3.21)

where $W$ is a set on which we want to achieve feasibility. We do not consider $W = R^n$, because it leads to "parallel" separating surfaces, which were studied in the previous work by Bugera et al. [9]. In this chapter, we consider a set $W$ with a finite number of elements. In particular, we consider the set $W$ being the training set plus the set of objects we want to classify (out-of-sample dataset). In this case, constraints (3.21) can be rewritten as

$U_j(\vec{x}^i) \le U_{j-1}(\vec{x}^i)$, $j = 1,..,J$, $i = 1,..,m$, (3.22)

where $m$ is the number of objects in the set $W$. The fact that we use out-of-sample points does not cause any problems because we use the data without knowing their class numbers.

Since classification without feasibility constraints may lead to an inability to classify new objects (especially for small training sets), we will always include feasibility constraints (3.22) in classification problem (3.16). For quadratic


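separating functions, classification problem (3.20) with feasibility constraints can be rewritten as

$\min_{a,b,c,\sigma} \sum_{i=1}^{m} \sum_{j=0}^{J} \lambda_j \sigma_i^j$
subject to
$\sigma_i^j + \sum_{k=1}^{n} \sum_{l=1}^{n} a_{kl}^j x_k^i x_l^i + \sum_{k=1}^{n} b_k^j x_k^i + c^j \ge \delta$, $j = 0,..,f_0(\vec{x}^i)-1$,
$\sigma_i^j - \sum_{k=1}^{n} \sum_{l=1}^{n} a_{kl}^j x_k^i x_l^i - \sum_{k=1}^{n} b_k^j x_k^i - c^j \ge \delta$, $j = f_0(\vec{x}^i),..,J$,
$\sum_{k=1}^{n} \sum_{l=1}^{n} a_{kl}^j x_k^i x_l^i + \sum_{k=1}^{n} b_k^j x_k^i + c^j \le \sum_{k=1}^{n} \sum_{l=1}^{n} a_{kl}^{j-1} x_k^i x_l^i + \sum_{k=1}^{n} b_k^{j-1} x_k^i + c^{j-1}$, $i = 1,..,m$, $j = 1,..,J$,
$\sigma_i^0,..,\sigma_i^J \ge 0$. (3.23)

3.3.2 Monotonicity Constraints (M-Constraints)

We use monotonicity constraints to incorporate the preference for greater values of indicator variables. In financial applications, monotonicity with respect to some indicators follows from "engineering" considerations. For instance, in the bond rating problem considered in this chapter, we have indicators for which greater values lead to higher ratings of bonds. If we enforce the monotonicity of the separating functions with respect to these indicators, objects with greater values of the indicators will have higher ratings. For a smooth function $h(\vec{x})$, the monotonicity constraints can be written as the constraint on the non-negativity of the first partial derivative:

$\frac{\partial h(\vec{x})}{\partial x_i} \ge 0$. (3.24)

For the case of linear separating functions,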


the monotonicity constraints are [...]

3.3.3 Positivity Constraints for Quadratic Separating Functions (P-Constraints)

Unlike the case of linear functions, imposing exact monotonicity constraints on a quadratic function [...] is a more complicated issue (indeed, in general, a quadratic function is not monotonic on the whole space $R^n$). Instead of imposing exact monotonicity constraints we consider the following constraints (we call them "positivity constraints"): [...]

Bugera et al. [9] demonstrated that the positivity constraints can be easily included in the linear programming formulation of the classification problem. They do not significantly increase the computational time of the classification procedure, but provide robust results for small training datasets and datasets with missing or erroneous information. These constraints impose monotonicity with respect to variables $x_i$, $i = 1,..,n$ on the positive part $R_+ = \{\vec{x} \in R^n \mid x_k \ge 0,\ k = 1,..,n\}$ of the object space $R^n$.
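Why non-negative coefficients yield monotonicity on the positive part of the space can be checked numerically. The sketch below assumes that the positivity constraints amount to requiring every quadratic coefficient $a_{kl}$ and every linear coefficient $b_k$ to be non-negative, which matches the description of rough monotonic functions in the previous chapter; the example coefficients and the grid are arbitrary.

```python
from itertools import product


def gradient(a, b, x):
    """Gradient of sum a_kl x_k x_l + sum b_k x_k + c at the point x.

    The derivative w.r.t. x_s is sum_l (a_sl + a_ls) * x_l + b_s.
    """
    n = len(x)
    return [
        sum((a[s][l] + a[l][s]) * x[l] for l in range(n)) + b[s]
        for s in range(n)
    ]


a = [[1.0, 2.0], [0.0, 3.0]]   # all entries non-negative
b = [0.5, 0.0]                 # all entries non-negative
grid = [0.0, 0.5, 1.0, 2.0]    # sample points with non-negative coordinates

monotone_on_grid = all(
    g >= 0.0
    for x in product(grid, repeat=2)
    for g in gradient(a, b, list(x))
)
print(monotone_on_grid)
print(gradient(a, b, [-1.0, -1.0]))  # outside the positive part the sign can flip
```

Each partial derivative is a sum of non-negative coefficients times non-negative coordinates, so it cannot be negative on $R_+$. The last line shows that the guarantee does not extend to points with negative coordinates.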


3.3.4 Gradient Monotonicity Constraints (GM-Constraints)

Another way to enforce monotonicity of quadratic separating functions is to restrict the gradient of the separating functions on some set of objects $X$ (for example, on a set of objects combining in-sample and out-of-sample points): [...] where [...] In constraint (3.29), the constants indexed by $s = 1,..,K$ are nonnegative.

3.3.5 Risk Constraints (R-Constraints)

Another important constraint that we apply to the model is the risk constraint. The risk constraint restricts the average value of the penalty function for misclassified objects. For this purpose we use the concept of Conditional Value-at-Risk (CVaR). The optimization approach for CVaR, introduced by Rockafellar and Uryasev [32], was further developed in Rockafellar and Uryasev [33] and Rockafellar et al. [34]. Suppose $X$ is a training set and for each object from this set the class number is known. In other words, a discrete-valued function $f(x)$ assigns the class for each object from set $X$. Let $J$ be the total number of classes. The function $f(x)$ splits the set $X$ into a set of subsets $\{X_1,..,X_J\}$, [...]


Let $I_j$ be the number of elements (objects) in the set $X_j$. We define the CVaR constraints as follows: [...] where $\varsigma_j^+$ and $\varsigma_j^-$ are free variables, and [...] are parameters.

For an optimal solution of optimization problem (3.16) with constraints (3.32), the left-hand part of the inequality is an average of $U_j(\vec{x})$ for the largest [...]% of objects from the $j$th class; the right-hand part of the inequality is an average of $U_j(\vec{x})$ for the smallest [...]% of objects from the $(j+1)$th class. We will call these values the CVaR largest of the $j$th class and the CVaR smallest of the $(j+1)$th class, correspondingly. The general sense of the constraints is the following: the CVaR largest of the $j$th class is smaller, by at least [...], than the CVaR smallest of the $(j+1)$th class.

3.3.6 Monotonicity of Separating Function with Respect to Class Number (MSF-Constraints)

By introducing these constraints, we move apart the values of the separating functions for objects belonging to different classes. Moreover, the constraints imply monotonicity of the separating functions with respect to the index denoting the class number. The constraint is set for every pair of objects belonging to neighboring classes. For every $p$ and $q$ such that $f_0(\vec{x}^p) = f_0(\vec{x}^q) + 1$, the following constraint is imposed on the optimization model: [...]


where the constants indexed by $i$, $p$, $q$ are non-negative. Another way to impose monotonicity on the separating functions with respect to the class number is to treat these quantities as variables and include them in the objective function: [...]

3.3.7 Model Squeezing Constraints (MSQ-Constraints)

Squeezing constraints (implemented together with feasibility constraints) efficiently adjust the flexibility of the model and control the out-of-sample versus in-sample performance of the algorithm. The constraints have the following form: [...] With these constraints we bound the variation of values of different separating functions on each object. Another way to squeeze the spread of the separating functions is to introduce a penalty coefficient for the difference of the functions in (3.35). In this case the objective function of problem (3.16) can be rewritten as: [...] The advantage of the squeezing constraints is that they are easy to implement in the linear programming framework.

3.3.8 Level Squeezing Constraints (LSQ-Constraints)

This type of constraint is very similar to the model squeezing constraints. The difference is that instead of squeezing the boundary functions $U_0(\vec{x})$ and $U_J(\vec{x})$,


we bound the absolute deviation of values of the separating functions from their mean values on each class of objects. The constraints have the following form: [...] where the constants indexed by $j$ and $k$ are positive, and $I_k$ is the number of objects $\vec{x}$ in class $k$. Similar to constraints (3.35), these constants can be considered as variables and included in the objective function: [...]

3.4 Choosing Model Flexibility

This section explains our approach to adjusting the flexibility of classification models without strict definitions and mathematical details.

The considered constraints form various models based on the optimization of quadratic separating functions. The models differ in the type of constraints imposed on the separating functions.

The major characteristics of a particular classification model are its in-sample and out-of-sample errors. To find a classification model we solve optimization problem (3.16) constructed for a training dataset. The error of misclassification of a classification model on the training dataset is called the "in-sample" error. The error achieved by the constructed classification model on objects outside of the training set is called the "out-of-sample" error. The misclassification error can be expressed in various ways. We measure the misclassification (if it is not specified otherwise) by the percentage of objects on which the computed model gives a wrong classification.

Theoretically, classification models demonstrate the following characteristics (see Figure 2-7 of the previous chapter). For small training sets, the model fits the


data with zero in-sample error, whereas the expected out-of-sample error is large. As the size of the training set increases, the expected in-sample error diverges from zero (the model cannot exactly fit the training data). Increasing the size of the training set leads to a larger expected in-sample error and a smaller expected out-of-sample error. For sufficiently large datasets, the in-sample and out-of-sample errors are quite close. In this case, we say that the model is saturated with data.

We say that the class of models A is more flexible than the class of models B if the class A includes the class B. Imposing constraints reduces the flexibility of the model. Figure 2-8 of the previous chapter illustrates theoretical in/out-of-sample characteristics for two classes of models with different flexibilities. For small training sets, the less flexible model gives a smaller expected out-of-sample error (compared to the more flexible model) and, consequently, predicts better than the more flexible model. However, the more flexible models outperform the less flexible models for large training sets (more flexible models have a smaller expected out-of-sample error compared to less flexible models). This is because the less flexible model requires less data for saturation than the more flexible model.

A more flexible model may need a large training dataset to provide a good prediction, while a less flexible model is saturated with data more quickly than a more flexible model. Therefore, for small datasets, less flexible models tend to outperform (out-of-sample) more flexible models. However, for large datasets, more flexible models outperform (out-of-sample) less flexible ones.

We demonstrate these considerations with the following example. For illustration purposes we consider four different models (A, B, C and D) with different levels of flexibility. We applied these models to the classification of a dataset containing 278 in-sample data points and 128 out-of-sample data points. The in-sample dataset is used to construct linear programming problem (3.20). The solution of this problem is a set of separating quadratic functions (3.19). The


in-sample and out-of-sample classification is done using the obtained separating functions.

Figures 3-3, 3-4, 3-5, and 3-6 demonstrate the in-sample and out-of-sample performances of the considered models. The horizontal axis of each graph corresponds to the object number; objects are ordered according to their actual class number. The vertical axis corresponds to the class number calculated by the model. The solid line on the graph represents the actual class number of an object. The round-point line represents the calculated class. The left graphs show in-sample classification: the class is computed by the separating functions for the objects from the in-sample dataset. The right graphs show out-of-sample classification: the class is computed by the separating functions for the objects from the out-of-sample dataset.

Model A uses "parallel" quadratic separating functions. This case corresponds to the classification model considered by Bugera et al. [9], where one utility function was used instead of multiple separating functions. Classification with the utility function can be interpreted as classification with separating functions (3.15) with the following constraints on the functions $U(\vec{x}) = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} x_i x_j + \sum_{i=1}^{n} b_i x_i + c$:

Constraints (3.39) make the separating functions parallel in the sense that the difference between the functions remains the same for all points of the object space.

Figure 3-3 shows that Model A has large errors for both in-sample and out-of-sample calculations. According to Figure 3-2, when a model has approximately the same in-sample and out-of-sample errors, the data saturation


occurs. Therefore, the out-of-sample performance cannot be improved by reducing the flexibility of the model. A more flexible model should be considered.

Model B uses quadratic separating functions without any constraints. This model is more flexible than Model A because it has more control variables. According to Figure 3-4, the in-sample error equals zero for Model B. However, in contrast to the in-sample behavior, the out-of-sample error is larger for Model B than for Model A. Moreover, Model B is not able to assign a class to many of the out-of-sample objects (in the figure, an unpredictable result corresponds to the negative class number -1). The separating functions may intersect, and some areas of the object space are impossible to classify. For objects from these areas, the class-membership conditions are not satisfied for any $j$. This makes the result of the out-of-sample classification uninterpretable.

Model C is obtained by adding the feasibility constraints to Model B. Since Model C is less flexible than Model B, the in-sample error becomes greater than in Model B (see Figure 3-5). On the other hand, the model has a better out-of-sample performance than Models A and B. Moreover, a feasibility constraint makes classification possible for any out-of-sample object. Since there is a significant discrepancy between the in-sample and out-of-sample


performances of the model, it is reasonable to impose more constraints on the model.

Model D is obtained by adding CVaR risk constraints to Model C. Figure 3-6 shows that this model has a higher in-sample error than Model B, but its out-of-sample performance is the best among the considered models. Moreover, the risk constraint reduces the number of many-class misclassification jumps. Whereas Model C has 8 out-of-sample objects for which the misprediction exceeds 5 classes, Model D has only one object with a 5-class misprediction. So, Model D has the best out-of-sample performance among all the considered models. This has been achieved by choosing an appropriate flexibility of the model.

A model with low flexibility may fail to fit an in-sample dataset well. For models with a high in-sample error (such as Model A), more flexibility can be added by introducing more control variables into the model. In classification models with separating functions, feasibility constraints play a crucial role because they make classification always possible for out-of-sample objects. Excessive flexibility of the model may lead to "overfitting" and poor prediction characteristics. To remove excessive flexibility from the considered models, various types of constraints (such as risk constraints) can be imposed. The choice of the types of constraints, as well as the choice of the class of separating functions, plays a critical role in achieving good out-of-sample performance of the algorithm.


3.5 Error Estimation

To estimate the performance of the considered models, we use the "leave-one-out" cross-validation scheme. For a description of this scheme and other cross-validation approaches, see, for instance, Efron and Tibshirani [14]. Let us denote by $m$ the number of objects in the considered dataset. For each model (defined by the classes of constraints imposed on optimization problem (3.20)), we performed $m$ experiments. By excluding objects $\vec{x}_i$ one by one from the set $X$, we constructed $m$ training sets $Y_i = X \setminus \{\vec{x}_i\}$, $i = 1, \dots, m$. For each set $Y_i$, we solved optimization problem (3.20) with the appropriate set of constraints and found the optimal parameters of the separating functions $\{U_0(\vec{x}), \dots, U_J(\vec{x})\}$. Further, we computed the number of misclassified objects $M_i$ from the set $Y_i$. Let us introduce the variable

The in-sample error estimate $E_{\text{in-sample}}$ is calculated by the following formula:

where $M_i$ is the number of misclassified objects in the set $Y_i$. In the last formula, the ratio $M_i$


The out-of-sample error estimate $E_{\text{out-of-sample}}$ is defined by the ratio of the total number of misclassified out-of-sample objects in $m$ experiments to the number of experiments:

3.6 Bond Classification Problem

We have tested the approach on a bond classification problem. Bonds represent the most liquid class of fixed-income securities. A bond is an obligation of the bond issuer to pay cash flow to the bondholder according to the rules specified at the time the bond is issued. A bond pays a specific cash flow (face value) at the time of maturity. In addition, a bond may pay periodic coupon payments.

Although a bond generates a prespecified cash flow stream, it may default if the issuer gets into financial difficulties or becomes bankrupt. To characterize this risk, bonds are rated by several rating organizations (Standard & Poor's and Moody's are the major rating companies). A bond rating evaluates the possibility of default of a bond issuer based on the issuer's financial condition and profit potential. The assignment of a rating class is mostly based on the issuer's financial status. A rating organization evaluates this status using expert opinions and formal models based on various factors, including financial ratios such as the ratio of debt to equity, the ratio of current assets to current liabilities, and the ratio of cash flow to outstanding debt.

According to Standard & Poor's, bond ratings start at AAA for bonds having the highest investment quality and end at D for bonds in payment default. The rating may be modified by a plus or minus to show relative standing within


the category. Typically, a bond with a lower rating has a lower price than a bond generating the same cash flow but having a higher rating.

In this study, we have replicated Standard & Poor's ratings of bonds using the suggested classification methodology. The rating replication problem is reduced to a classification problem with 20 classes.

3.7 Description of Data

For the computational verification of the proposed methodology, we used several datasets (A, W, X, Y, Z) provided by the research group of the Risk Solutions branch of Standard and Poor's, Inc. The datasets contain quantitative information about several hundred companies rated by Standard and Poor's, Inc. Each entry in the datasets has fourteen fields that correspond to certain parameters of a specific firm in a specific year. The first two fields are the company name and the year when the rating was calculated. We used these fields as identifiers of objects for classification. The next eleven fields contain quantitative information about the financial performance of the considered company. These fields are used for decision-making in the classification process. The last field is the credit rating of the company assigned by Standard and Poor's, Inc. Table 3-1 represents the information used for classification.

We preprocessed the data and rescaled all quantitative characteristics into the interval [-1, 1]. Since the total number of rating classes equals 20, we used integers from 1 to 20 to represent the credit rating of an object. The ratings are arranged according to the credit quality of the objects: a greater value corresponds to a better credit rating.

In the case study, we considered 5 different datasets (A, W, X, Y, Z) to verify the proposed methodology for different sizes of the input data. Each set was split into an in-sample set and an out-of-sample set. The first one was used for developing a model, and the second one was used for the verification of the


out-of-sample performance of the developed model. Table 3-2 contains information about the sizes of the considered datasets.

3.8 Numerical Experiments

For dataset A, we performed computational experiments with 16 models generated by all possible combinations of four different types of constraints. For each of the datasets W, X, Y, and Z, we found a model with the best out-of-sample performance.

For dataset A, we applied optimization problem (3.20) with Feasibility Constraints (F). In addition, we applied all possible combinations of the four types of constraints listed in Table 3-3.

The numerical experiments were conducted on a Pentium III 1.2 GHz machine in a C/C++ environment. The linear programming problems were solved by the CPLEX 7.0 package. The calculation results for the 16 models for dataset A are presented in Table 3-4 and Figure 3-7.

Figure 3-7 represents the in-sample and out-of-sample errors for different models. For each model, there are three columns representing the computational results. The first column corresponds to the percentage of in-sample classification error, the second column corresponds to the percentage of out-of-sample classification error, and the third column represents the average out-of-sample misprediction expressed


in percent (100% misprediction corresponds to the case when the difference between the actual and calculated classes equals 1).

In Figure 3-7, the models are ordered by average out-of-sample error. Models 0000 and 0010 have zero in-sample error. These two models are the most "flexible": they fit the data with a zero objective in (3.20). The smallest average out-of-sample error is obtained by model 0101 with MSF and LSQ constraints. Table 3-4 shows that this model also has the lowest maximal misprediction. Models 0100 and 0110 have the minimal number of out-of-sample mispredictions (column Error(1)), but the maximal mispredictions (column MAX) for these models are high. It is worth mentioning that imposing constraints on the separating functions increases the computing time. However, for all the considered optimization problems, the classification procedure did not take more than 5 minutes.

For datasets W, X, Y, and Z, we imposed various constraints and selected the models with the best performance. For each dataset, we chose the best-performing model and compared its performance with the reference model used by the Risk Solutions group at Standard and Poor's; see Table 3-5. For each dataset, the table contains the following performance information about the best chosen model and the reference model: the average error (expressed in number of classes); the standard deviation; the percentage of correct out-of-sample predictions; the distribution of out-of-sample predictions (expressed as the percentage of out-of-sample predictions with a deviation less than or equal to i classes for i = 1, 2, 3); and the description of the constraints included in the best model.


The results of this analysis show that the proposed classification methodology with separating functions is competitive with the reference model available from the Risk Solutions group at Standard and Poor's. Although the proposed models do not provide better results than the reference model in all cases, the proposed algorithms have a better or at least comparable performance for the small datasets Z and W.

3.9 Concluding Remarks

We extended the methodology discussed in the previous chapter to the case of multi-class classification. The extended approach is based on finding "optimal" separating functions belonging to a prespecified class of functions. We have considered models with quadratic separating functions (in indicator parameters) with different types of constraints. As before, selecting a proper class of constraints plays a crucial role in the success of the suggested methodology: by imposing constraints, we adjust the flexibility of a model to the size of the training dataset. The advantage of the considered methodology is that the optimal classification model can be found by linear programming techniques, which can be used efficiently for large datasets.

We have applied the developed methodology to a bond rating problem. We have found that, for the considered dataset, the best out-of-sample characteristics (the smallest out-of-sample error) are delivered by the model with MSF and LSQ constraints on the control variables (coefficients of the separating function).

The study was focused on the evaluation of the computational characteristics of the suggested approach. As before, despite the fact that the obtained results are data specific, we can conclude that the developed methodology is a robust classification technique suitable for a variety of engineering applications.


Table 3-1: Data format of an entry of the dataset

Identifier         Quantitative characteristics                    Class
1) Company Name    1) Industry Sector                              1) Credit rating
2) Year            2) EBIT interest coverage (times)
                   3) EBITDA interest coverage (times)
                   4) Return on capital (%)
                   5) Operating income/sales (%)
                   6) Free operating cash flow/total debt (%)
                   7) Funds from operations/total debt (%)
                   8) Total debt/capital (%)
                   9) Sales (mil)
                   10) Total equity (mil)
                   11) Total assets (mil)

The first column contains two fields identifying the object. The second column has eleven fields containing quantitative information about the company. These fields are used for decision-making in the classification process. The last field is the credit rating of the company, and is presented in the last column of the table.


Table 3-2: Sizes of the considered datasets

Dataset                 A     W     X     Y     Z
Size of Set             406   205   373   187   108
Size of In-Sample       278   172   315   157   89
Size of Out-of-Sample   128   33    58    30    19

Table 3-3: List of the considered models

                    Types of Constraints Applied
Model Identifier    GM    MSF   R     LSQ
0000                -     -     -     -
0001                -     -     -     YES
0010                -     -     YES   -
0011                -     -     YES   YES
0100                -     YES   -     -
0101                -     YES   -     YES
0110                -     YES   YES   -
0111                -     YES   YES   YES
1000                YES   -     -     -
1001                YES   -     -     YES
1010                YES   -     YES   -
1011                YES   -     YES   YES
1100                YES   YES   -     -
1101                YES   YES   -     YES
1110                YES   YES   YES   -
1111                YES   YES   YES   YES


Table 3-4: Calculation results for dataset A

Model   Error(In)   AVR    Correct   Error(1)   Error(2)   Error(3)   Error(4)   MAX
0000    0.00%       1.18   37.37%    62.63%     23.23%     9.09%      3.03%      18
0001    16.00%      1.54   30.30%    67.68%     35.35%     22.22%     13.13%     13
0010    0.00%       1.28   39.39%    60.61%     24.24%     10.10%     5.05%      17
0011    26.70%      1.53   25.25%    74.75%     37.37%     18.18%     12.12%     7
0100    13.50%      1.18   44.44%    55.56%     25.25%     12.12%     7.07%      18
0101    17.00%      1.03   39.39%    60.61%     25.25%     12.12%     4.04%      5
0110    13.60%      1.18   44.44%    55.56%     25.25%     12.12%     7.07%      18
0111    27.30%      1.24   38.38%    61.62%     30.30%     13.13%     5.05%      16
1000    16.90%      1.44   36.36%    63.64%     26.26%     11.11%     10.10%     16
1001    17.50%      1.19   33.33%    66.67%     29.29%     11.11%     7.07%      6
1010    22.60%      1.20   42.42%    57.58%     26.26%     12.12%     8.08%      14
1011    22.50%      1.15   37.37%    62.63%     29.29%     11.11%     7.07%      7
1100    33.10%      1.25   36.36%    63.64%     24.24%     14.14%     7.07%      18
1101    34.50%      1.19   34.34%    65.66%     24.24%     14.14%     7.07%      10
1110    35.70%      1.53   40.40%    59.60%     30.30%     19.19%     9.09%      18
1111    36.70%      1.22   42.42%    57.58%     28.28%     18.18%     8.08%      10

Error(In) = percentage of in-sample misclassifications; AVR = average error (in number of classes); Correct = percentage of correct out-of-sample predictions; Error(1) = percentage of out-of-sample mispredictions; Error(i) = percentage of out-of-sample prediction errors of i classes or more; MAX = maximal error for out-of-sample classification.
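The out-of-sample measures reported in Table 3-4 can be computed directly from pairs of actual and calculated class numbers. The following is a minimal sketch in Python, assuming that classes are coded as integers (1 to 20 in this study) and that an error is the absolute difference between the actual and the calculated class. The function name `prediction_metrics` and the dictionary layout are illustrative choices, not part of the original study.

```python
from typing import Dict, Sequence


def prediction_metrics(
    actual: Sequence[int], predicted: Sequence[int], max_i: int = 4
) -> Dict[str, float]:
    """Summarize class mispredictions in the style of the Table 3-4 legend.

    Assumption: the error of one object is |actual class - calculated class|.
    """
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    deviations = [abs(a - p) for a, p in zip(actual, predicted)]
    n = len(deviations)

    metrics: Dict[str, float] = {
        # AVR: average error, in number of classes
        "AVR": sum(deviations) / n,
        # Correct: percentage of objects with the exact class
        "Correct": 100.0 * sum(d == 0 for d in deviations) / n,
        # MAX: largest misprediction, in number of classes
        "MAX": float(max(deviations)),
    }
    # Error(i): percentage of objects mispredicted by i classes or more
    for i in range(1, max_i + 1):
        metrics[f"Error({i})"] = 100.0 * sum(d >= i for d in deviations) / n
    return metrics
```

With this definition, `Correct` and `Error(1)` always sum to 100%, and `Error(i)` cannot increase as `i` grows.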


Table 3-5: Comparison of best found model with reference model for sets W, X, Y and Z

Dataset          W                 X                 Y                 Z
In-Sample        172               315               157               89
Out-of-Sample    33                58                30                19
Model            Best     Refer.   Best     Refer.   Best     Refer.   Best     Refer.
AVR              1.82     1.18     1.67     1.02     1.93     1.13     2.21     2.32
STDV             1.72     0.92     2.45     1.07     2.60     1.50     2.20     2.50
Correct          33.3%    21.2%    24.1%    37.9%    30.0%    50.0%    15.8%    21.1%
1-Class area     42.4%    72.7%    53.4%    70.7%    60.0%    70.0%    42.1%    63.2%
2-Class area     69.7%    87.9%    86.2%    94.8%    80.0%    83.3%    78.9%    68.4%
3-Class area     84.8%    100.0%   94.8%    96.6%    86.7%    86.7%    84.2%    68.4%
4-Class area     93.9%    100.0%   96.6%    98.3%    86.7%    96.7%    84.2%    78.9%
Constraints      Feasibility Feasibility Feasibility Feasibility MSF MSF MSF CVaR MSF

For each dataset the table has two columns. The first column corresponds to the best found model; the second column corresponds to the reference model from Risk Solutions. Each column contains information about the out-of-sample performance of the corresponding model.

AVR = average error (in number of classes); STDV = standard deviation; Correct = percentage of correct out-of-sample predictions; i-Class area = percentage of out-of-sample predictions with a deviation less than or equal to i classes; Constraints = description of the constraints included in the best model.


Figure 3-1: Classification by separating functions

Figure 3-2: Nature of risk constraint


Figure 3-3: In-sample and out-of-sample errors for model A

Figure 3-4: In-sample and out-of-sample errors for model B


Figure 3-5: In-sample and out-of-sample errors for model C

Figure 3-6: In-sample and out-of-sample errors for model D


Figure 3-7: In-sample and out-of-sample errors for various models


Volume-weighted average price (VWAP) is one of the commonly used trade evaluation benchmarks in the stock market. A widely used method of VWAP trading is to trade an order according to the market volume distribution of the stock. We develop a dynamic trading strategy based on a forecast of the market volume distribution of stocks using regression techniques.

4.1 Introduction

The purpose of VWAP trading is to obtain a volume-weighted price of transactions higher than the market VWAP. An investor may act in various ways when seeking VWAP execution of an order. The investor can make a contract with a broker who guarantees selling or buying orders at the daily VWAP. Since the broker assumes the risk of failing to achieve a better average price than VWAP, commissions are quite large.

An order may be sent to electronic systems where it is executed at the daily VWAP price (such orders are called VWAP crosses). These orders are matched electronically before the beginning of a trading day and executed during or after trading hours. VWAP crosses normally have low transaction costs; however, the


price of execution is not known in advance, and there is a possibility that the order will not be executed.

The most recent approach to VWAP trading is participation in automated VWAP trading, where a trading period is divided into small intervals and the order is distributed as closely as possible to the market's daily volume distribution; that is, the order is traded with minimal market impact. This strategy provides a good approximation to the market VWAP, although it generally fails to reach the benchmark. More intelligent systems perform careful projections of the market volume distribution and expected price movements and use this information in trading. A more detailed survey of VWAP trading can be found in [27].

Although the VWAP benchmark has gained popularity, very few studies concerning VWAP strategies are available. Several studies [7, 19] consider block trading with optimal splitting of the order to optimize the expected execution cost. Konishi [18] developed a static VWAP trading strategy that minimizes the expected execution error with respect to the market realization of VWAP. A static strategy is determined for the whole trading period and does not change as new information arrives.

In this chapter, we develop dynamic VWAP strategies. We consider liquid stocks and small orders, which have a negligible impact on the prices and volumes of the market. We reduce the problem of tracking VWAP to forecasting the volumes of the stock traded in the market during a day. We split a trading day into small intervals and estimate the market volume consecutively for each interval using linear regression. An order is traded proportionally to the dynamically forecasted market volume during each interval.

4.2 Background and Preliminary Remarks

Consider the case when only one stock is available for trading. If a transaction of some number of trading units of the stock occurs at some time at price $p$, we denote this


transaction by a triple of its time, its volume, and its price $p$. Let the set of all transactions in the market during a day contain $K$ such triples, indexed by $k = 1, \dots, K$. Then the VWAP of the stock is

Suppose that a trading day is split into $N$ equal intervals $\{(t_{n-1}, t_n] \mid n = 1, \dots, N\}$, $t_n = (n/N)T$, where $T$ is the length of the day. Then, the total volume traded during time period $(t_{n-1}, t_n]$ is

The corresponding expression for the daily VWAP of the stock is

where the per-interval price can be thought of as the average market price during the $n$th interval.

Consider an order to sell $X$ units of a stock during a trading day. We assume that an execution of this order does not affect the prices and the market volume distribution of the stock, which is reasonable for relatively small volumes $X$. We describe a trading strategy by the sequence $\{x_n \mid n = 1, \dots, N\}$, $\sum_{n=1}^{N} x_n = 1$.


The definition in terms of proportions is more appropriate because VWAP depends on the proportions $V_i / \sum_{n=1}^{N} V_n$ rather than on the volumes $V_i$. The values $x_n$ are assumed to be nonnegative (i.e., the trader is not allowed to buy stock).

We assume that we can perform transactions at the average market price during a small interval (about 5 min). With this assumption, equation (4.5) implies that a possible way to meet the market VWAP of the stock is to trade the order proportionally to the market volume during each interval, yielding the same daily distribution of the traded volume as the market one. This can be done as follows. For each interval of a day, we forecast the volume of the stock that will be traded in the market during this interval. Based on this forecast, we decide on the fraction of the order that should be traded during this interval. Suppose, for example, that $v_i$ is the forecast of the proportion of the daily volume of the stock traded in the market during interval $i$ ($i = 1, \dots, N$). The daily volume distribution is $(v_1, \dots, v_N)$. To get the same distribution of the traded volume of the order $X$ during the day, we should trade $v_k X$ shares of the order during the $k$th interval.

However, in our strategy we make a forecast of the market volume during an interval independently of the previous forecasts. So, direct estimation of the proportions of the market volume does not guarantee that the obtained proportions $v_1, \dots, v_N$ will sum to one. To avoid this problem, we forecast the fractions of the remaining volume that has not yet been traded at the current time. Suppose that $(V_1, V_2, \dots, V_N)$ is the distribution of volume (in number of shares) during a day. In


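terms of fractions of the daily volume, this distribution can be presented as $(v_1, v_2, \dots, v_N)$, $v_k = V_k / \sum_{n=1}^{N} V_n$. In terms of fractions of the remaining volume, it can be presented as $(w_1, w_2, \dots, w_N)$, where $w_k$ is the fraction of the volume not yet traded before the $k$th interval that is traded during the $k$th interval. Figure 4-1 demonstrates the two presentations of the volume distribution. Note that $w_N$ is always equal to 1. There is a one-to-one correspondence between the vectors $(v_1, \dots, v_N)$ and $(w_1, \dots, w_N)$; the relation between them is given by the formulas

and

Equations (4.7) follow from the following equality: $w_i (1 - w_{i-1}) \cdots (1 - w_{i-m}) = V_i / \sum_{n=i-m}^{N} V_n$.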
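The correspondence between fractions of the total daily volume and fractions of the remaining volume can be illustrated with a short Python sketch. It assumes that $w_k$ is the volume of interval $k$ divided by the volume traded from interval $k$ to the end of the day, which is consistent with $w_N = 1$. The function names are hypothetical, and `trade_schedule` is only an illustration of trading an order in proportion to forecasted remaining-volume fractions, not the author's implementation.

```python
from typing import List, Sequence


def total_to_remaining(volumes: Sequence[float]) -> List[float]:
    """Map per-interval volumes (shares or fractions of the daily total) to
    fractions of the remaining volume: w_k = V_k / (V_k + ... + V_N)."""
    w: List[float] = []
    tail = 0.0
    # Accumulate from the last interval backwards so that the tail sum for
    # the last interval is exactly V_N and hence w_N = 1.
    for volume in reversed(volumes):
        tail += volume
        # Intervals followed only by zero volume get w = 0 (edge case).
        w.append(volume / tail if tail > 0 else 0.0)
    w.reverse()
    return w


def remaining_to_total(w: Sequence[float]) -> List[float]:
    """Inverse map: v_k = w_k * (1 - w_1) * ... * (1 - w_{k-1})."""
    v: List[float] = []
    untraded = 1.0  # fraction of the daily volume not yet traded
    for wk in w:
        v.append(wk * untraded)
        untraded *= 1.0 - wk
    return v


def trade_schedule(order_size: float, w_forecast: Sequence[float]) -> List[float]:
    """Trade, in each interval, the forecasted fraction of what is left."""
    trades: List[float] = []
    left = order_size
    for wk in w_forecast:
        trade = wk * left
        trades.append(trade)
        left -= trade
    return trades
```

Because each interval trades a fraction of whatever is left, independent forecasts of $w_k$ cannot over-allocate the order, and setting the last fraction to 1 completes it.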


4.3 Model Design

We assume that historical trading data for the last $S$ days are available, where each day is split into $N$ equal intervals. Suppose we want to forecast the fraction of the remaining volume $w_k$ that will be traded in the market during the $k$th interval. In order to forecast $w_k$, we use the information about volumes and prices of the stock represented by covariates $p^{1}_{(k-l),s}, \dots, p^{P}_{(k-l),s}$, $l = 1, \dots, L$, where $p^{1}_{i}, \dots, p^{P}_{i}$ are some values calculated in the $i$th interval and $L$ is the number of the preceding intervals. If $k$

By solving the problem, the optimal values of the regression parameters are obtained. The forecast of $w^0_k$ is then made by expression (4.8).

4.3.1 Best Sample

It is reasonable to choose for regression the "nearest" scenarios in the sense of similarity of historical days to the current day. Since for each day we are interested in the values of the variables $p^{1}_{(k-l),s}, \dots, p^{P}_{(k-l),s}$, $l = 1, \dots, L$, we define the "distance" between the current day and scenario $s$ in the following way:

After calculating the distances to all $S$ scenarios, we choose the $S_{best}$ nearest scenarios (in the sense of (4.12)) and use them in optimization problem (4.11). Doing so, we eliminate "outliers" whose market behavior is unusual with respect to the current day, which improves the accuracy of the prediction.

4.4 Evaluation of Model Performance

The model was verified with the historical prices of IBM stock for the period April 1997 - August 2002. Each day is split into 78 five-minute intervals (daily trading hours are 9:30 AM - 4:00 PM). For some experiments, besides prices and volumes of the IBM stock, we also used prices and volumes of the index SPY.

We evaluated the performance of the model by applying it to the historical dataset and forecasting the volume distribution of the IBM stock for a period of 100 trading days (March 1, 2002 - August 3, 2002). We considered only those historical days that we call "admissible". A day is "admissible" if this day and the previous day are full trading days starting and ending at the usual hours, and there are no trading interruptions during these days. To forecast the trading volumes for one day, a set of scenarios for the last $S$ admissible days was used. We compared the forecasted distributions with the actual ones and found the estimation error by averaging the estimation errors for each interval over all output days.


We estimate the prediction error of the model by the mean absolute deviation:

where $\{(v^k_1, \dots, v^k_N) \mid k = 1, \dots, K\}$ and $\{(\hat{v}^k_1, \dots, \hat{v}^k_N) \mid k = 1, \dots, K\}$ are the actual and forecasted trading volumes of the stock, and $K$ is the number of days used for the forecast.

We compare our models with the "average daily volumes" (ADV) strategy. This very simple strategy provides a good approximation to VWAP and is considered to be one of the most accurate trading strategies for VWAP realization. Suppose a set of historical volumes of the market is

Denote

$V_n = \sum_{s=1}^{S} V_n^s, \quad V_{total} = \sum_{n=1}^{N} V_n.$ (4.15)

Then the average volume distribution is

$(v_1, \dots, v_N), \quad v_n = V_n / V_{total}.$

An example of the average volume distribution versus the actual volume evolution is presented in Figure 4-2. It can be seen that the daily volume exhibits the "U-shape" and that the average distribution provides a good approximation to the daily volume evolution.

For the dataset described above, we calculated the average volume distribution over $S$ admissible days. The estimation error of the ADV strategy was calculated using (4.13). The relative gain in accuracy of the regression algorithm


was judged by the reduction in MAD relative to $MAD_{ADV}$, expressed in percent (4.17).

Also, we compared the models' accuracy in VWAP realization. We calculated VWAP for all 100 days and compared the errors similarly to (4.13) and (4.17).

4.5 Numerical Experiments

4.5.1 Experiments

We performed the following experiments.

Experiment 1 (Table 4-1). We use the following covariates for regression (4.8):

$\ln V$ and $\ln(P_{close}/P_{open})$, (4.18)

where $V$ is the volume of the stock traded in the market during an interval, and $P_{open}$ and $P_{close}$ are the open and close prices of the stock during this interval. We performed the experiment for various numbers of "admissible" historical days ($S = 100, 200, \dots, 800$) and various numbers of regression periods ($L = 1, 2, \dots, 5$).

Experiment 2 (Table 4-2). In addition to stock prices and volumes, we also considered volumes and prices of the index SPY. Similar to Experiment 1, the following information is used for the regression:

$\ln V$, $\ln(P_{close}/P_{open})$, $\ln V^{SPY}$, and $\ln(P^{SPY}_{close}/P^{SPY}_{open})$, (4.19)

where $V^{SPY}$, $P^{SPY}_{open}$, and $P^{SPY}_{close}$ are the volume, open price, and close price of the index during the interval, respectively. Similar to Experiment 1, we considered different numbers of "admissible" historical days ($S = 100, 200, \dots, 800$) and different numbers of regression periods ($L = 1, 2, \dots, 5$).

Experiment 3 (Tables 4-3 and 4-4). We use covariates (4.18) for regression and choose the $S_{best}$ "nearest" scenarios from the $S$ historical days. That is, we use $S = 500$


and $S = 800$ and change the size of the best sample $S_{best}$. The number of regression periods is fixed: $L = 2$.

Experiment 4 (Tables 4-5 and 4-6). We use covariates (4.19) for the regression and choose the $S_{best}$ "nearest" scenarios with the same values of $S$, $S_{best}$, and $L$ as in Experiment 3.

4.5.2 Results and Discussion

For the same number of regression periods $L$, models that use information about both the stock and the index have twice as many covariates in regression model (4.8) as the models based on stock data only. As the number of covariates in the regression model increases, the model becomes more flexible and more scenarios are needed to achieve a better accuracy of estimations.

This can be seen from the computational results. From Table 4-1 With respect to the relative gain in VWAP, there is no clear pattern; however, all the considered models provide a significant relative gain (15-20%) in VWAP realization compared to the ADV strategy.

In Table 4-2 In Tables 4-3, 4-4, 4-5, 4-6


(the best combinations of "best-sample"/"history" are 350/500 and 400/800 for the models based on stock information, and 400/500 and 700/800 for the models based on stock and index information). The most accurate realizations of VWAP correspond to the "full-history" models (i.e., the best-sample technique is not used for these models).

4.6 Conclusions

In this study, we designed several VWAP trading strategies based on dynamic forecasting of the market volume distribution. We showed that the proposed methodology improves the accuracy of the volume forecast compared to the commonly used ADV strategy. Also, we demonstrated that the developed models can be used in VWAP tracking. They provide a better VWAP realization than the ADV strategy.
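As an illustration of the ADV benchmark used in this comparison, the sketch below pools historical interval volumes into an average volume distribution, following (4.15), and computes a relative gain in percent. It assumes that the gain is the reduction in MAD relative to the ADV value, an assumption that agrees in sign and magnitude with the gains reported in Tables 4-1 to 4-6 up to rounding. The function names are hypothetical.

```python
from typing import List, Sequence


def average_volume_distribution(history: Sequence[Sequence[float]]) -> List[float]:
    """ADV distribution: pool the volumes of S historical days per interval
    (V_n = sum over days) and normalize by the pooled total (V_total)."""
    if not history:
        raise ValueError("history must contain at least one day")
    n_intervals = len(history[0])
    if any(len(day) != n_intervals for day in history):
        raise ValueError("all days must have the same number of intervals")

    pooled = [sum(day[n] for day in history) for n in range(n_intervals)]
    total = sum(pooled)
    if total <= 0:
        raise ValueError("total historical volume must be positive")
    return [volume / total for volume in pooled]


def relative_gain(mad_adv: float, mad_model: float) -> float:
    """Relative gain in accuracy over ADV, in percent (assumed form of 4.17).

    Positive when the model error is smaller than the ADV error.
    """
    return 100.0 * (mad_adv - mad_model) / mad_adv
```

A negative gain, as for some small-sample models in the tables, simply means that the regression forecast had a larger MAD than the ADV benchmark.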


Table 4-1: Performance of stock-based model

#Scen. Model ADV Number of Regr. Periods 1 2 3 4 5
800 Volumes MAD,% 35.270 34.19 34.21 34.31 Gain,% 3.05 3.00 2.73 VWAP MAD,% 0.100 0.084 0.084 0.084 Gain,% 16.17 16.18 15.95 STDV,% 0.186 0.156 0.157 0.158 0.158 0.158
700 Volumes MAD,% 35.270 34.09 34.14 34.25 Gain,% 3.34 3.20 2.88 VWAP MAD,% 0.100 0.084 0.084 0.084 Gain,% 15.90 15.91 16.220 STDV,% 0.186 0.157 0.158 0.159 0.159 0.159
600 Volumes MAD,% 35.270 34.08 34.09 34.20 Gain,% 3.37 3.35 3.04 VWAP MAD,% 0.100 0.083 0.082 0.082 Gain,% 16.89 17.41 17.29 STDV,% 0.186 0.155 0.156 0.155 0.155 0.155
500 Volumes MAD,% 35.270 34.11 34.17 34.23 Gain,% 3.30 3.11 2.94 VWAP MAD,% 0.100 0.082 0.082 0.081 0.081 17.30 18.01 18.61 18.46 0.186 0.154 0.154 0.154 0.154 0.153
400 Volumes MAD,% 35.270 34.18 34.25 34.40 Gain,% 3.09 2.88 2.46 VWAP MAD,% 0.100 0.082 0.081 0.081 0.081 17.90 18.28 18.93 18.95 0.186 0.152 0.151 0.151 0.150 0.150
300 Volumes MAD,% 35.270 34.41 34.73 34.85 Gain,% 2.44 1.54 1.19 VWAP MAD,% 0.100 0.083 0.082 0.082 Gain,% 17.20 17.89 17.90 STDV,% 0.186 0.153 0.151 0.151 0.151 0.151
200 Volumes MAD,% 35.270 35.00 35.22 35.50 Gain,% 0.78 0.16 -0.64 VWAP MAD,% 0.100 0.082 0.082 0.082 Gain,% 17.43 17.47 17.67 STDV,% 0.186 0.154 0.154 0.153 0.155 0.155
100 Volumes MAD,% 35.270 35.95 36.61 37.15 Gain,% -1.92 -3.79 -5.32 VWAP MAD,% 0.100 0.084 0.083 0.084 0.084 0.084 Gain,% 15.84 15.55 16.16 STDV,% 0.186 0.158 0.160 0.160


Table 4-2: Performance of stock-and-index-based model

#Scen. Model ADV Number of Regr. Periods 1 2 3 4 5
800 Volumes MAD,% 35.270 34.15 34.29 34.45 Gain,% 3.17 2.77 2.33 VWAP MAD,% 0.100 0.082 0.081 0.080 0.080 17.69 18.32 19.31 19.35 0.186 0.154 0.154 0.153 0.152 0.151
700 Volumes MAD,% 35.270 34.18 34.38 34.53 Gain,% 3.09 2.53 2.11 VWAP MAD,% 0.100 0.082 0.081 0.080 Gain,% 18.00 19.14 20.199 STDV,% 0.186 0.154 0.153 0.152 0.151 0.151
600 Volumes MAD,% 35.270 34.11 34.50 34.64 Gain,% 3.30 2.19 1.80 VWAP MAD,% 0.100 0.081 0.080 0.080 0.079 18.52 19.45 19.45 21.04 0.186 0.153 0.152 0.152 0.149 0.148
500 Volumes MAD,% 35.270 34.41 34.66 34.82 Gain,% 2.43 1.73 1.27 VWAP MAD,% 0.100 0.081 0.081 0.079 0.079 18.48 19.23 20.43 20.85 0.186 0.152 0.152 0.149 0.147 0.145
400 Volumes MAD,% 35.270 34.67 34.92 35.07 Gain,% 1.72 1.00 0.58 VWAP MAD,% 0.100 0.082 0.081 0.080 0.080 17.72 18.34 19.28 19.48 0.186 0.152 0.150 0.150 0.149 0.146
300 Volumes MAD,% 35.270 34.99 35.36 35.76 Gain,% 0.78 -0.24 -1.38 VWAP MAD,% 0.100 0.082 0.082 0.080 0.080 17.50 18.07 19.46 19.66 0.186 0.151 0.149 0.149 0.148 0.147
200 Volumes MAD,% 35.270 35.66 36.17 36.79 Gain,% -1.11 -2.56 -4.31 VWAP MAD,% 0.100 0.082 0.082 0.081 0.080 17.83 17.69 18.85 20.13 0.186 0.152 0.152 0.152 0.151 0.149
100 Volumes MAD,% 35.270 37.91 39.37 40.65 Gain,% -7.47 -11.6 -15.3 VWAP MAD,% 0.100 0.084 0.083 0.083 Gain,% 15.35 16.34 16.36 STDV,% 0.186 0.154 0.149 0.152 0.152 0.155


Table 4-3: Best sample for stock-based model, S=800

Volumes VWAP
MAD,% Gain,% MAD,% Gain,% STDV,%
ADV 35.27 0.0997 0.1856
Best Sample 3.428 16.00
780 33.95 3.729 0.0840 15.78 0.1570
750 33.93 3.803 0.0841 15.71 0.1576
700 33.87 3.962 0.0843 15.43 0.1582
600 33.80 4.162 0.0852 14.60 0.1601
500 33.78 4.231 0.0855 14.23 0.1612
33.77 4.265 13.51 0.1624
300 33.80 4.164 0.0864 13.33 0.1623
200 33.89 3.905 0.0864 13.35 0.1589
100 34.28 2.815 0.0872 12.55 0.1608
50 34.90 1.052 0.0875 12.28 0.1643

Table 4-4: Best sample for stock-based model, S=500

Volumes VWAP
MAD,% Gain,% MAD,% Gain,% STDV,%
ADV 35.27 0.0997 0.1856
Best Sample 3.689 18.01
480 33.87 3.975 0.0818 17.96 0.1533
450 33.82 4.121 0.0820 17.78 0.1534
400 33.77 4.251 0.0824 17.40 0.1547
33.77 4.253 16.79 0.1546
300 33.78 4.236 0.0837 16.10 0.1564
200 33.90 3.895 0.0836 16.21 0.1576
150 34.05 3.446 0.0838 15.97 0.1565
100 34.24 2.917 0.0852 14.57 0.1581
50 35.03 0.672 0.0865 13.28 0.1643


Table 4-5: Best sample for stock-and-index-based model, S=800

Volumes VWAP
MAD,% Gain,% MAD,% Gain,% STDV,%
ADV 35.27 0.0997 0.1856
Best Sample 3.469 18.32
780 33.95 3.741 0.0821 17.67 0.1546
750 33.90 3.894 0.0828 16.98 0.1566
33.83 4.074 16.73 0.1575
600 33.85 4.021 0.0841 15.70 0.1590
500 33.92 3.840 0.0849 14.88 0.1602
400 33.99 3.617 0.0847 15.02 0.1610
300 34.15 3.182 0.0856 14.14 0.1614
200 34.39 2.494 0.0869 12.90 0.1624
100 35.45 -0.523 0.0907 9.08 0.1697
50 37.71 -6.904 0.0952 4.50 0.1710

Table 4-6: Best sample for stock-and-index-based model, S=500

Volumes VWAP
MAD,% Gain,% MAD,% Gain,% STDV,%
ADV 35.27 0.0997 0.1856
Best Sample 3.106 19.23
480 34.08 3.373 0.0816 18.18 0.1532
450 34.05 3.468 0.0819 17.92 0.1538
34.03 3.503 17.86 0.1539
350 34.11 3.293 0.0834 16.40 0.1553
300 34.12 3.269 0.0837 16.06 0.1571
200 34.42 2.411 0.0858 13.94 0.1618
150 34.77 1.407 0.0869 12.88 0.1620
100 35.64 -1.037 0.0894 10.37 0.1671
50 37.67 -6.794 0.0960 3.78 0.1706


Figure 4-1: Percentages of remaining volume vs. percentages of total volume

Figure 4-2: Daily volume distributions


Efficient credit portfolio management is a key success factor of bank management. Discussions of the new capital adequacy proposals by the Basle Committee on Banking Supervision highlight the necessity to consider credit risk management from both the internal and the regulatory point of view. We introduce an optimization approach for the credit portfolio that maximizes expected returns subject to internal and regulatory risk constraints. With a simplified bank portfolio, we examine the impact of the regulatory risk limitation rules on the optimal solutions.

5.1 Introduction

Efficient credit portfolio management is a key success factor of bank management. In an adverse market environment with intensifying competition, banks are exposed to increasing risks and decreasing return margins of their credit portfolios, while bank shareholders are demanding higher risk premiums for their invested capital. The ability to identify risk-return optimal portfolios becomes a fundamental element of credit portfolio management. The recent discussions of the Basle Committee on Banking Supervision highlight the necessity to manage credit risk simultaneously from an internal and a regulatory perspective.


In this chapter, we give a survey of a new optimization algorithm that determines risk-return efficient credit portfolios under internal and regulatory credit risk constraints. We formulate the optimization problem for the credit portfolio based on the risk measure Conditional Value-at-Risk and derive risk-return ratios for the optimal portfolios (Section 5.2). With an application example, we analyze the risk-return structure of an optimal portfolio. We examine the impact of the regulatory risk limitation rules and visualize how they may lead to inefficiencies in credit portfolio management (Section 5.3).

5.2 Optimization Approach

5.2.1 Definition of the CVaR Risk Measure

The risk measure Value-at-Risk (VaR), commonly applied in finance, lacks the sub-additivity property when return distributions are not normal. This means that diversifying the portfolio may increase portfolio VaR. A similar percentile risk measure, Conditional Value-at-Risk (CVaR), does not have this drawback. The term Conditional Value-at-Risk was introduced by Rockafellar and Uryasev [32]. For continuous distributions, CVaR is equal to the conditional expectation beyond VaR [32]. For general distributions, however, it is a weighted average of VaR and the conditional expectation beyond VaR [33]. CVaR can be applied to measure loss risk from any asymmetric and discontinuous loss distribution with discrete probabilities, and it obeys the property of coherence [1, 31, 33], a set of axioms that a risk measure should meet from the point of view of a regulator [3]. CVaR has been shown to be appropriate for credit portfolio risk measurement [31, 32, 33].

Let $x = (x_1, \ldots, x_n)^T$ be a vector of positions of credit assets of a portfolio, and $y = (y_1, \ldots, y_n)^T$ be the vector of the corresponding market prices. For continuous distributions, we define the CVaR deviation $\mathrm{CVaR}_\alpha(L(x,y))$ of the portfolio loss risk as

$$
\mathrm{CVaR}_\alpha(L(x,y)) = E\big[\, L(x,y) \;\big|\; L(x,y) \ge \mathrm{VaR}_\alpha(L(x,y)) \,\big], \qquad (5.1)
$$

where the loss function $L(x,y)$ is the difference between the expected value of the portfolio and its uncertain value, i.e. $L(x,y) = E[y]^T x - y^T x$, and $\mathrm{VaR}_\alpha(L(x,y))$ is the $\alpha$-quantile of the loss function $L(x,y)$.

5.2.2 Formulation of the Optimization Model

The optimization problem models basic goals of credit portfolio management. It maximizes the expected returns of the credit portfolio under internal and regulatory loss risk limits [37].

From the bank's internal perspective, credit risks are limited by the economic capital, i.e. the capital resources that the bank can apply to cover occurring credit losses. The economic capital is often defined as a subset of the bank's equity. Where national law allows the accumulation of hidden reserves, these are commonly counted as elements of the economic capital, as they can be released to cover occurring losses. At the same time, the bank needs to limit its credit risk from a regulatory perspective. We consider the loss risk limitation rules set forth by the Basle Committee on Banking Supervision: first, we study the prevailing rules of 'Basle I' [4, 5] and then extend our study by analyzing the effects of the 'new' credit risk weights of the Basle II rules [6].

Banks are charged capital to cover the credit risks of their bank book; this charge is limited by the maximum amount of regulatory capital applicable to cover these risks. In this study, we concentrate on the credit portfolio of the bank book. The credit risk of the bank book is limited by the 'tier 1', i.e. the core capital, and the 'tier 2', i.e. the supplementary capital. The tier 1 capital mainly consists of the core capital of the bank, plus further components. The tier 2 capital includes supplementary capital elements, such as the allowance for loan loss reserves and various long-term debt instruments, such as subordinated debt; see [4] and also [41], p. 119. Our studies will be extended by an integrated analysis of market and credit risk dependent assets under internal and regulatory loss risk limitations.

The capital constraints limit the absolute expected profits the bank is able to achieve in the planning period. The less economic and regulatory capital is available, i.e. the less risk the bank is able to take, the lower the expected profits achievable in the business period.

We assume a planning horizon of one year and obtain a one-period optimization model. The exposures of the assets represent the decision variables. For planning purposes it suffices to consider aggregate positions, e.g., depending on the organizational structure of the bank, product or customer segments that are assigned to the same profit center.

Let $x$ be the decision vector and $\mu = (\mu_1, \ldots, \mu_n)^T$ the vector of the expected returns of the single assets. We maximize the expected portfolio return $\mu(x) = \mu^T x$. The internal loss risk is measured by the CVaR deviation of the portfolio loss according to equation 5.1 and is constrained by the maximal amount of economic capital available, denoted as ec_cap_max. Based on the optimization algorithm of Rockafellar and Uryasev [32], the CVaR constraint is approximated by a set of linear constraints, leading to a linear optimization problem. To implement the algorithm, we use as input data a sample of market price scenarios $y^1, \ldots, y^K$ of the vector $y$. In the application example in the next section, these market price scenarios are generated by a Monte Carlo simulation according to the CreditMetrics approach of J.P. Morgan. The regulatory credit risk is measured by the regulatory risk-based capital ratios, reg_cap = (reg_cap_1, ..., reg_cap_n)^T, and is limited by the available regulatory core and supplementary capital, denoted by reg_cap_max. The set of feasible solutions is defined by upper and lower position bounds, the vectors low_bound and up_bound. We solve the following linear optimization model:

$$
\begin{aligned}
\max_{x,\,z,\,q}\quad & \mu(x) = \mu^T x = \sum_{j=1}^{n} \mu_j x_j \\
\text{s.t.}\quad & \text{Constraint \#1: Internal Risk Constraint} \\
& q + \frac{1}{1-\alpha}\,\frac{1}{K} \sum_{k=1}^{K} z_k \le \mathit{ec\_cap\_max}, \\
& L(x, y^k) - q \le z_k, \quad k = 1, \ldots, K, \\
& z_k \ge 0, \quad k = 1, \ldots, K, \qquad q \text{ is a free variable}, \\
& \text{Constraint \#2: Regulatory Risk Constraint} \\
& \mathit{reg\_cap}^T x \le \mathit{reg\_cap\_max}, \\
& \text{Constraint \#3: Boundaries of the Feasible Solutions} \\
& \mathit{low\_bound} \le x \le \mathit{up\_bound}.
\end{aligned}
\qquad (5.2)
$$

In order to analyze the effects of the regulatory risk constraint on the optimal portfolios, we consider the optimization models (P) and (P'), without and with the regulatory risk constraint, respectively: (P) maximizes $\mu^T x$ subject to Constraints #1 and #3 of (5.2), while (P') is the full model (5.2) with Constraints #1, #2 and #3.
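The role of the auxiliary variables q and z_k in Constraint #1 of model (5.2) can be checked numerically for a fixed portfolio. The following is a minimal sketch, not the implementation used in this study, and it does not solve (5.2). It assumes a small hand-made scenario set; all numbers and the function names `losses`, `ru_function`, `cvar_via_linearization` and `cvar_tail_mean` are illustrative. With z_k = max(L(x, y^k) - q, 0), the smallest value of the left-hand side of Constraint #1 over q should coincide with the average of the largest losses in the sample, here the 2 largest of K = 10.

```python
def losses(x, scenarios):
    """Loss L(x, y^k) = E[y]^T x - (y^k)^T x for each price scenario y^k."""
    n, K = len(x), len(scenarios)
    mean_y = [sum(y[j] for y in scenarios) / K for j in range(n)]
    expected_value = sum(mean_y[j] * x[j] for j in range(n))
    return [expected_value - sum(y[j] * x[j] for j in range(n)) for y in scenarios]


def ru_function(q, loss, alpha):
    """Left-hand side of Constraint #1 with z_k = max(L_k - q, 0)."""
    K = len(loss)
    return q + sum(max(l - q, 0.0) for l in loss) / ((1.0 - alpha) * K)


def cvar_via_linearization(loss, alpha):
    """Minimize over q; the function is piecewise linear and convex in q,
    so a minimizer can be found among the breakpoints q = L_k."""
    return min(ru_function(q, loss, alpha) for q in loss)


def cvar_tail_mean(loss, tail_count):
    """Direct estimate: average of the tail_count largest sample losses."""
    return sum(sorted(loss)[-tail_count:]) / tail_count


# Hypothetical data: K = 10 price scenarios for n = 2 assets.
scenarios = [(100, 50), (102, 51), (98, 49), (101, 47), (99, 52),
             (97, 50), (103, 53), (100, 45), (95, 48), (105, 55)]
x = [1.0, 2.0]                               # fixed positions (assumed)
tail_count = 2                               # scenarios in the tail
alpha = 1.0 - tail_count / len(scenarios)    # confidence level 0.8

loss = losses(x, scenarios)
cvar_lp = cvar_via_linearization(loss, alpha)
cvar_direct = cvar_tail_mean(loss, tail_count)
print(cvar_lp, cvar_direct)
```

In the full model the positions x are decision variables as well, so the minimization over q is carried out implicitly by the linear programming solver; the sketch only illustrates why bounding the linear expression from above also bounds the sample CVaR deviation.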


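5.2.3 Risk-Return Analysis of the Portfolio Assets

The contributions of the single assets to the overall portfolio risk and return represent basic information for the risk-return analysis of the optimal portfolios. The return contribution $\mu_j(x)$ of the j-th asset to the portfolio $x$ is given by the j-th term of the return function, i.e. $\mu_j(x) = \mu_j x_j$, $j = 1, \ldots, n$.

We apply the Euler allocation principle to derive the risk contributions of the single assets (see [4] and [36]): the risk contribution $r_j(x)$ of the j-th asset is based on the partial derivative of the portfolio risk measure with respect to the j-th asset. It corresponds to the conditional expected loss of the j-th component in the tail of the portfolio loss distribution and can be estimated from the given sample of market prices as the mean of the losses of the j-th asset in the tail of the loss distribution [36]. We obtain the following risk contribution $r_j(x)$ of the j-th asset, $j = 1, \ldots, n$:

$$
r_j(x) = E\big[\, L_j(x,y) \;\big|\; L(x,y) \ge \mathrm{VaR}_\alpha(L(x,y)) \,\big],
$$

where

$$
L_j(x,y) = \big(E[y_j] - y_j\big)\, x_j .
$$

We define the risk-return ratios of single assets, the return on risk-adjusted capital $\mathrm{RORAC}_j(x)$ of the j-th asset and the return on equity $\mathrm{RoE}_j(x)$, i.e. the return on the regulatory capital, of the j-th asset as

$$
\mathrm{RORAC}_j(x) = \frac{\mu_j(x)}{r_j(x)}, \quad j = 1, \ldots, n, \qquad (5.7)
$$

$$
\mathrm{RoE}_j(x) = \frac{\mu_j x_j}{\mathit{reg\_cap}_j\, x_j} = \frac{\mu_j}{\mathit{reg\_cap}_j}, \quad j = 1, \ldots, n. \qquad (5.8)
$$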
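The sample estimate described above, the mean loss of each asset over the tail scenarios of the portfolio loss, can be sketched as follows. This is an illustration under stated assumptions rather than the code used in this study: the scenarios, positions, expected returns `mu`, capital ratios `reg_cap` and the helper names are all hypothetical, and the tail is taken to be the `tail_count` scenarios with the largest portfolio loss.

```python
def asset_losses(x, scenarios):
    """Per-asset losses (E[y_j] - y_j^k) * x_j, one row per scenario k."""
    n, K = len(x), len(scenarios)
    mean_y = [sum(y[j] for y in scenarios) / K for j in range(n)]
    return [[(mean_y[j] - y[j]) * x[j] for j in range(n)] for y in scenarios]


def risk_contributions(x, scenarios, tail_count):
    """Mean loss of each asset over the scenarios with the largest portfolio loss."""
    rows = asset_losses(x, scenarios)
    # The portfolio loss of a scenario is the sum of its per-asset losses.
    tail = sorted(rows, key=sum)[-tail_count:]
    return [sum(row[j] for row in tail) / tail_count for j in range(len(x))]


# Hypothetical data: K = 10 price scenarios for n = 2 assets.
scenarios = [(100, 50), (102, 51), (98, 49), (101, 47), (99, 52),
             (97, 50), (103, 53), (100, 45), (95, 48), (105, 55)]
x = [1.0, 2.0]            # positions (assumed)
mu = [0.5, 1.0]           # expected return per unit of position (assumed)
reg_cap = [0.02, 0.08]    # regulatory capital ratios (assumed)
tail_count = 2

r = risk_contributions(x, scenarios, tail_count)
# Ratios of single assets; r[j] is nonzero for this data set.
rorac = [mu[j] * x[j] / r[j] for j in range(len(x))]
roe = [mu[j] / reg_cap[j] for j in range(len(x))]
print(r, sum(r), rorac, roe)
```

By construction the contributions add up to the tail mean of the portfolio loss, which is the additivity that the Euler allocation is meant to provide. In this made-up data set the second asset has the higher RORAC while the first has the higher RoE, so the two ratios can rank assets differently.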


5.3 Application Example

The credit portfolio of ABC Bank consists of three typical credit assets: asset 1 represents high-quality bonds (rating AA), asset 2 mortgage loans (rating BB) and asset 3 retail loans (rating B). The regulatory capital is used at 94% and cannot be increased in the next business year. The internal risk (CVaR) level may be varied to some extent according to the risk policy of the bank. The initial portfolio uses 48 units of the economic capital. Our goal is to investigate how the risk-return relations of the initial credit portfolio can be improved and how the regulatory risk constraint affects the optimal portfolios. We applied the optimization models (P) and (P') with different CVaR levels. First, we generated the efficient frontiers and analyzed the overall portfolio risk-return relations. Next, we analyzed the risk-return structures of the single assets of the optimal portfolios.

As shown in Figure 5-1, the regulatory constraint becomes active at the CVaR level of 39.9 units. Compared to the initial portfolio, at the given capital levels (ec_cap_max = 48, reg_cap_max = 10), the expected portfolio returns can be improved by 0.07 units in (P') and by 0.23 units in (P), keeping the same CVaR deviation level. This means that without the regulatory constraint an additional profit of 0.16 units could be gained. The portfolio RORAC, defined as the expected return $\mu(x)$ divided by the CVaR deviation $\mathrm{CVaR}_\alpha(L(x,y))$ of the portfolio $x$, increases from 6.09% to 6.29% in (P') and to 6.63% in (P).

We also observe that ABC Bank can generate higher portfolio RORACs by lowering the level of internal risk and keeping the initial portfolio return. The maximal RORAC of 6.82% can be reached on the interval ec_cap_max = [34.9, 37.7], where the regulatory constraint is not active. However, implementing a RORAC-optimizing strategy would require reducing the credit volumes and absolute returns. This might conflict with other corporate goals and may not be supported by the shareholders.


In order to analyze the risk-return structure of optimal portfolios, we first examine the positions of single assets, which are represented in Figure 5-2. The narrow and broad lines represent the positions of single assets in the solutions of (P) and (P'), respectively. Starting from the minimal CVaR portfolio, the assets are increased in the optimal solutions in the order of descending RORACs, as defined in 5.7. Although the RORACs of the single assets of the optimum portfolios $x^*$ differ slightly along the efficient line, their ranking remains constant, i.e. $\mathrm{RORAC}_3(x^*) \le \mathrm{RORAC}_2(x^*) \le \mathrm{RORAC}_1(x^*)$. When the regulatory constraint becomes active, we observe the effect of capital arbitrage: assets with higher RoEs are preferred to assets with higher RORACs, and the overall portfolio level of risk is increased (with the same expected return we can achieve lower risk without the regulatory constraint).

In order to analyze the effect of capital arbitrage more closely, we examine the risk-return structure of the optimal portfolio at the initial CVaR level of 48 units, as shown in Figure 5-3. Without the regulatory risk constraint, the position of asset 1 with the highest RORAC is increased by 50%, that of asset 2 by 28.3%, and the position of asset 3 with the lowest RORAC is reduced by 21.5%. In (P'), asset 1, showing the lowest RoE, is increased less than in (P). The position of asset 3 with the highest RoE is increased, while the position of asset 2, with higher RORAC but lower RoE than asset 3, is reduced. The riskier assets are weighted higher in (P'), resulting in lower returns at the given CVaR level and a suboptimal use of the economic capital (Figure 5-1).

5.4 Conclusion

We have introduced an algorithm that maximizes the expected returns of a credit portfolio subject to internal and regulatory risk constraints. The algorithm is based on the new risk measure CVaR, which is appropriate for credit portfolio risk measurement, and the resulting problem can be solved by linear programming methods.


The optimization model allows calculating intervals of efficient use of both capital resources, the available economic and regulatory capital, and of highest portfolio RORACs. It identifies "unrealized" profits due to the regulatory risk constraint. We conducted risk-return analyses of single assets of the optimal portfolios and found evidence of capital arbitrage, which leads to suboptimal portfolios under the regulatory risk limitation rule, as assets of higher RoE but higher risk are weighted higher than assets of lower risk and higher RORACs.

Figure 5-1: Efficient lines and portfolio RORACs of the optimization problems (P) and (P')

Figure 5-2: Impact of the regulatory constraint on the optimal portfolio structures

Figure 5-3: Portfolio structures of the optimal portfolio in (P) and (P')

[1] Acerbi, C. and D. Tasche (2002): On the coherence of expected shortfall. Journal of Banking and Finance, 26(7), 1487-1503.
[2] Andersson, F., Mausser, H., Rosen, D., and S. Uryasev (2001): Credit risk optimization with conditional value-at-risk. Mathematical Programming, 26 (2002), Series B, 89: 273-291.
[3] Artzner, P., Delbaen, F., Eber, J.-M., and D. Heath (1999): Coherent Measures of Risk, Mathematical Finance, Vol. 9, No. 3, pp. 203-228.
[4]
[5]
[6]
[7] Bertsimas, D., and A. Lo (1998): Optimal Control of Execution Costs. Journal of Financial Markets, 1, 1-50.
[8] Bradley, P.S., Fayyad, U.M. and O.L. Mangasarian (1999): Mathematical programming for data mining: formulations and challenges, INFORMS Journal on Computing, 11(3), 217-238.
[9] Bugera, V., Konno, H., and S. Uryasev (2003): Credit Cards Scoring with Quadratic Utility Function, J. of Multi-Criteria Decision Analysis, 11(4-5), 197-211.
[10] Capon, N. (1982): Credit scoring systems: a critical analysis. Journal of Marketing 46, 82-91.
[11] Coffman, J.Y. (1986): The proper role of tree analysis in forecasting the risk behavior of borrowers, Management Decision Systems, Atlanta, MDS Reports 3-7.
[12] Damaskos, X.S. (1997): Decision models for the evaluation of credit cards: application of the multicriteria method ELECTRE TRI, Master Thesis, Technical University of Crete, Chania, Greece (in Greek).
[13] Eisenbeis, R.A. (1978): Problems in applying discriminant analysis in credit scoring models. Journal of Banking and Finance 2, 205-219.
[14] Efron, B. and R.J. Tibshirani (1994): An introduction to the bootstrap. Chapman & Hall, New York.
[15] Fisher, R.A. (1936): The use of multiple measurements in taxonomic problems. Annals of Eugenics 7, 179-188.
[16] Hand, D.J., Oliver, J.J., and A.D. Lunn (1996): Discriminant analysis when the classes arise from a continuum. Pattern Recognition 31, 641-650.
[17] Henley, W.E. (1995): Statistical aspects of credit scoring. PhD thesis, Open University.
[18] Konishi, H. (2002): Optimal Slice of a VWAP Trade. Journal of Financial Markets, 5 (2002), 197-221.
[19] Konishi, H., and N. Makimoto (2001): Optimal Slice of a Block Trade. Journal of Risk, 3(4), 33-51.
[20] Konno, H., and H. Kobayashi (2000a): Failure discrimination and rating of enterprises by semi-definite programming, Asia-Pacific Financial Markets, 7, 261-273.
[21] Konno, H., Gotoh, J., and T. Uho (2000b): A cutting plane algorithm for semi-definite programming problems with applications to failure discrimination and cancer diagnosis, Tokyo Institute of Technology, Center for Research in Advanced Financial Technology, Working Paper 00-5, May.
[22] Lewis, E.M. (1992): An introduction to Credit Scoring. Fair, Isaac and Co., Inc., San Rafael.
[23] Makowski, P. (1985): Credit scoring branches out. Credit World 75, 30-37.
[24] Mangasarian, O.L. (1965): Linear and nonlinear separation of patterns by linear programming. Operations Research 13, 444-452.
[25] Mangasarian, O., Street, W. and W. Wolberg (1995): Breast cancer diagnosis and prognosis via linear programming, Operations Research, 43, 570-577.
[26] Mays, E. (ed) (2001): Handbook of Credit Scoring. AMACOM, New York.
[27] Madhavan, A. (2002): VWAP Strategies, Technical Report, http://www.itginc.com/.
[28] Myers, J.H., and E.W. Forgy (1963): The development of numerical credit evaluation systems. Journal of the American Statistical Association 58 (September), 799-806.
[29] Pardalos, P.M., Michalopoulos, M. and C. Zopounidis (1997): On the use of multi-criteria methods for the evaluation of insurance companies in Greece. In New Operational Approaches for Financial Modeling, Zopounidis C. (ed); Physica-Verlag: Berlin-Heidelberg; 271-283.
[30] Patrik, G., Bernegger, S., Ruegg, M.B. (1999): The use of risk adjusted capital to support business decision making, in: Casualty Actuarial Society (Hrsg.), Casualty Actuarial Society Forum, Spring 1999 Edition, Baltimore.
[31] Pflug, G.Ch. (2000): Some Remarks on the Value-at-Risk and the Conditional-Value-at-Risk, in: Uryasev, S. (Ed.), Probabilistic Constrained Optimization: Methodology and Applications, Kluwer Academic Publishers, pp. 272-281.
[32] Rockafellar, R.T. and S. Uryasev (2000): Optimization of Conditional Value-At-Risk. The Journal of Risk, Vol. 2, No. 3, 21-41.
[33] Rockafellar, R.T. and S. Uryasev (2002): Conditional Value-at-Risk for General Loss Distributions. Journal of Banking and Finance, 26/7, 1443-1471.
[34] Rockafellar, R.T., Uryasev, S. and M. Zabarankin (2002): Deviation Measures in Risk Analysis and Optimization. Research Report 2002-7. ISE Dept., University of Florida, December 2002.
[35] Rockafellar, R.T., Uryasev, S. and M. Zabarankin (2002): Deviation Measures in Generalized Linear Regression. Research Report 2002-9. ISE Dept., University of Florida, December 2002.
[36] Tasche, D. (1999): Risk Contributions and Performance Measurement, Working Paper, Technische Universitaet Muenchen.
[37] Theiler, U. (2002): Optimization Approach for the Risk-Return-Management of the Bank Portfolio, Wiesbaden (in German).
[38] Theiler, U., Bugera, V., Revenko, A., and S. Uryasev (2002): Regulatory Impacts on Credit Portfolio Management, Operations Research Proceedings 2002, Selected Papers of the Symposium on Operations Research (OR 2002), Berlin.
[39] Thomas, L.C. (2000): A survey of credit and behavioral scoring: forecasting financial risk of lending to consumers. International Journal of Forecasting 16 (2000), 149-172.
[40] Titterington, D.M. (1992): Discriminant analysis and related topics. In Credit scoring and credit control, Thomas, L.C., Crook, J.N., and D.B. Edelman (eds.), Oxford University Press, Oxford, pp. 53-73.
[41]
[42] Vapnik, V. (1998): Statistical Learning Theory. John Wiley & Sons, Inc, New York.
[43] Wiginton, J.C. (1980): A note on the comparison of logit and discriminant models of consumer credit behaviour. Journal of Financial and Quantitative Analysis 15, 757-770.
[44] Zopounidis, C. and M. Doumpos (1997): Preference disaggregation methodology in segmentation problems: the case of financial distress. In New Operational Approaches for Financial Modeling, Zopounidis C. (ed.); Physica-Verlag: Berlin-Heidelberg; 417-439.
[45] Zopounidis, C., Pardalos, P., Doumpos, M. and T. Mavridou (1998): Multicriteria decision aid in credit cards assessment. In Managing in Uncertainty: Theory and Practice, C. Zopounidis, P. Pardalos (eds.), Kluwer Academic Publishers, New York, 163-178.


Vladimir Bugera was born on February 20, 1977, in Voronezh, Russia. In 1994, he completed his high school education at High School #15 in Voronezh. He received his bachelor's and master's degrees in applied mathematics and physics from Moscow Institute of Physics and Technology in Moscow, Russia, in 1998 and 2000, respectively. In August 2000, he began his doctoral studies in the Industrial and Systems Engineering Department at the University of Florida. He finished his Ph.D. in quantitative finance in August 2004.