KLU: A High Performance Sparse Linear Solver for Circuit Simulation Problems

1003 F20101123_AAAUYM palamadai_e_Page_58.txt
92563b560a22c2f085020cc3d217b9d1
6ee4236266ee35e9c9bc4202107442dbedf13d4b
16722 F20101123_AAAUTP palamadai_e_Page_08.pro
31806962a057503798d7de9a3b9f6ea5
f22dc680087b8134f17381784d70253a01204a25
6207 F20101123_AAAVAO palamadai_e_Page_17thm.jpg
67522dede5db81e6bff2293706ce3fb6
f25bf3ce74d5155b9df4c5a5cad932eae442b2ae
1044029 F20101123_AAAUOT palamadai_e_Page_38.jp2
c23d1863cb54b4f3ea30b09bd87ef169
a7ae5a0619b0793c3a9daffa2f14a34eba4212fd
2546 F20101123_AAAUYN palamadai_e_Page_59.txt
2f8f8561609b75fc5fbd45f5c40c6690
286d490b856833e1f6b856065c81b7192f42bd6c
42473 F20101123_AAAUTQ palamadai_e_Page_09.pro
c368c1b0e603b2310a95ce61be5053c8
440b62b6c6aa801e4ab49553f74490f24b0a051e
24235 F20101123_AAAVAP palamadai_e_Page_17.QC.jpg
004e89bf7dcc86ed431471ae8800bb30
448d1e2c384ca90a6079a6a05849558376f1835c
715385 F20101123_AAAUOU palamadai_e_Page_40.jp2
871ae8a02227f679a6502179f224d143
ffdbe9cdb57dfb33f609042e65ae6aace024670c
1702 F20101123_AAAUYO palamadai_e_Page_60.txt
be30a00c9cb471c573afd471a429326c
f98f7a29af0c4413be944cc877ee974e0abc6b3e
9968 F20101123_AAAUTR palamadai_e_Page_10.pro
2df75b12b895739be24209986f1e4c67
8385f499356ed4db78590822d67e9b04111a16c1
1035253 F20101123_AAAUOV palamadai_e_Page_41.jp2
58bab9d8d4ade6cfd867085d9073bb9b
5e007e141769936a6ce0db50a2d971ea2c5e0178
1072 F20101123_AAAUYP palamadai_e_Page_61.txt
de02c86d2c623c359c5749cf9f4d4141
51d6ce50eb13bbb3e634ff0ffdbb75e885f1ac73
822800 F20101123_AAAUJY palamadai_e_Page_20.jp2
304117b7c14baf7f7d681516c5c463b2
8b1d858ec0a78b83c29698ad17f7146e587996d1
47007 F20101123_AAAUTS palamadai_e_Page_11.pro
30129f572fd13d55f6a9b63236ed593f
81120994aba134187cd54df13d90d76dfdc53f57
5338 F20101123_AAAVAQ palamadai_e_Page_18thm.jpg
72b4187581c126dcf1b1816951a17b54
8f62976ee99e687ae097c3356e390da173fe5f43
F20101123_AAAUOW palamadai_e_Page_42.jp2
de10433e604badf5ede0ce699b183ac2
cd08ee5cea58d952e1e9f45386830a2cb7a83dab
999 F20101123_AAAUYQ palamadai_e_Page_62.txt
5e06dc76d5dc890ad2595280e4ea7003
c1ae0297ab03c2bfa6dc9e83163ea2b0bee53aa2
519613 F20101123_AAAUJZ palamadai_e_Page_62.jp2
60dc892e7bbbac0238286aa5f97db7d4
bb85e1fa97615e41a639f60847936dc19a0c0ce4
53734 F20101123_AAAUTT palamadai_e_Page_13.pro
a5ce498ea62c56d779a58ff137fdb2b0
34b78c23057f7679c99900df4e7cb58ed425932a
19952 F20101123_AAAVAR palamadai_e_Page_18.QC.jpg
7a54ccd009b8f0dd3f39c6dc7c70536e
c9925616d02a97c6ebd3cbca9cfed1cb3971fcf3
1051956 F20101123_AAAUOX palamadai_e_Page_43.jp2
2a0f868ce8c8db7709a78cdd2c005d7c
c2a50e99c2c1714c47ae2e2554a4e38dcb54f6b4
967 F20101123_AAAUYR palamadai_e_Page_63.txt
52bbb53dfac8f1c5a5b8d2c3fc1fe76a
395b8ce8fc61de62c42a39d775f1568046b18d99
25703 F20101123_AAAUTU palamadai_e_Page_15.pro
888b1f3296b090be5ebab47ea208cc54
be195ce35fa825652b2714fac71fbac80680b4fd
4826 F20101123_AAAVAS palamadai_e_Page_19thm.jpg
61bb6cc15c5f0927be7753f6f0ddb06a
e7920a91094ec4ec758b49fe86f54079c5556f5d
775511 F20101123_AAAUOY palamadai_e_Page_44.jp2
0b6ebaa56653f2351490920e8a9fef57
7ed3806c537cbfc83ea1469a6d9a82b88d5790e2
974 F20101123_AAAUYS palamadai_e_Page_64.txt
0b0930dd3ee70e6ee7a2ce7e11f963ff
a3bc44ec7a2cac5a1755bfde3407f000cfae12d2
26386 F20101123_AAAUTV palamadai_e_Page_16.pro
867a67a0ce3f024223511e2d8df925b8
be2e0d05fa094979ae57397243a149581dfb9b9b
6518 F20101123_AAAVAT palamadai_e_Page_20thm.jpg
625fcc716c8e40c6953c2ec800a18033
18de662c83bbc386f33917f1ddb444b4072e4ffe
687676 F20101123_AAAUOZ palamadai_e_Page_45.jp2
20ab4cedd9c7dd7e06481806caf35e08
c138deebae87ccaebe17c8e5286d63c9c776d37a
962 F20101123_AAAUYT palamadai_e_Page_65.txt
59b1335532414e6e64591704f3fb1277
c7bc0830400aaeb571e00b0705099c8ca23c769b
38253 F20101123_AAAUTW palamadai_e_Page_17.pro
16bbcb4a90c8c4baed19fe65aff2434e
f234097302f63bded1ae0e7cf714a8a964f52239
73591 F20101123_AAAUMA palamadai_e_Page_44.jpg
9defd060b0a7fa0ecfdb041ce63c670e
a4346aef31da8d6c803ac828506c4b21a987433f
25371 F20101123_AAAVAU palamadai_e_Page_20.QC.jpg
76a5461734d32b6d7fc8997ca6d6e793
2e18559840126b1cdb4705197a8d93f8670f2ed3
1279 F20101123_AAAUYU palamadai_e_Page_66.txt
d03d7cd0f819f3e861a946b567794748
e0f091ef61d4d6c26e255fd7d5bb5a39259b9eed
30285 F20101123_AAAUTX palamadai_e_Page_18.pro
d645197eb6ef7bf7b8872e62072e2dcc
f6470f6f04fc513bf019085d2ec3985d23eac1f5
62395 F20101123_AAAUMB palamadai_e_Page_45.jpg
faeb23927dfc16e4a91271baff044961
ad60b012283a82eba283c534755af85955a377d1
6986 F20101123_AAAVAV palamadai_e_Page_21thm.jpg
59a1c4c788601252804588397d4ca3e3
826ab853493bbd52ca25ffce09466fc50b3d4122
839 F20101123_AAAUYV palamadai_e_Page_67.txt
7f81fb1cbb81925536c93a6d301856a0
f77c9204649a410df607814b42dfb3846198b107
F20101123_AAAURA palamadai_e_Page_20.tif
d90fd7db338ddbbba0fc60c2fa6eaef9
c414aca0eca52027a334284684dd3c4bd5f3f784
22212 F20101123_AAAUTY palamadai_e_Page_19.pro
eadc573726e75c4a1ef0b57fe64f8a68
5717865a788c8d4ea99768fa1d024bf60f858f98
64300 F20101123_AAAUMC palamadai_e_Page_46.jpg
36a9af94e02805dce6979b1e4bb160f6
c8c3716463dd3ba61edde1e84b484d40a526b5a8
29831 F20101123_AAAVAW palamadai_e_Page_21.QC.jpg
ce2f43372d317cb1ff63985c962a9a19
6d213c15a4ab18414ccf092cd03494e0487ee811
1349 F20101123_AAAUYW palamadai_e_Page_68.txt
b4cf342e9218f044d40da0af4b29758d
b9ddfc496507787f4714778eb58150e91519cc81
F20101123_AAAURB palamadai_e_Page_21.tif
eecd637ba04adfa31afa54466edc9727
540047dc7e3393087aa598aec372f60dbcd1a68a
38406 F20101123_AAAUTZ palamadai_e_Page_20.pro
03b7dd784a8f0dfa993d36f056dd29a3
4230d3146d75853d4c1064f4f9854eb593227e76
78026 F20101123_AAAUMD palamadai_e_Page_47.jpg
352648e97a222c083f13a9ced9a3137a
bafa51c9140d796fd839648b9410596ddcc4f86d
6486 F20101123_AAAVAX palamadai_e_Page_22thm.jpg
86c64f1cbc356861b9c9cdd6b15f6b1b
dc4eb3f639603cb68f817cc5365ae29ed27361cd
1121 F20101123_AAAUYX palamadai_e_Page_69.txt
00be9384d1dea4a610dd57cb0985eff3
47e27da8dc2ffc87a7937aa5cb3b94faa34835e3
100241 F20101123_AAAUME palamadai_e_Page_48.jpg
c560402cd45a5a99390707762a08311c
c6853f66450e36f119f3712d062ea566855a338b
26813 F20101123_AAAVAY palamadai_e_Page_22.QC.jpg
7c7d4793d153224ab15b672d3cb0f31e
028594d56173a6cd40a1dac92e5c6954c33bc353
33125 F20101123_AAAUWA palamadai_e_Page_73.pro
1b9cc67adceb8191b023afe29242d995
331653a62838f05dc4a40e1ba811fc58e334b7a2
1241 F20101123_AAAUYY palamadai_e_Page_70.txt
ec4524f55d34db33cd26c8b2d8d762e8
c946f93ad0428f207392a596539d80c7d517344e
F20101123_AAAURC palamadai_e_Page_22.tif
02f4a17065cf9b21a545ec6e1a7d12d9
785f2bd778172765159e29c42a1243bd3ce13b8b
108788 F20101123_AAAUMF palamadai_e_Page_49.jpg
ea70b190e5a8115f2943a9cc905cee2e
79288092cf2c11113be1bffe6adb04121abcf1bd
6376 F20101123_AAAVAZ palamadai_e_Page_23thm.jpg
b7cae9dbd20321f444eced0651f86453
2cbef3b17010e5494c2d9ba3e3ab3a8a2c6aa2b3
29313 F20101123_AAAUWB palamadai_e_Page_74.pro
34f48d952ba4876c5a3a1e239f0fc9d3
0886f2eb7e05aa24739349f173453fc78e2187a5
1321 F20101123_AAAUYZ palamadai_e_Page_71.txt
1065bfc177d6e028f23b585602ff6c44
c7baec8d4102482d0ae55051b471c9f8b5ccea31
F20101123_AAAURD palamadai_e_Page_23.tif
64ff2ece20072aa40b86daffd86aad37
d5143a932d4ad8a5c5f3f00c30cc5247313fed42
91726 F20101123_AAAUMG palamadai_e_Page_50.jpg
3db0e4d5064e89669d278cd594e11911
700f735a91b95a23ff788ad33f04388a7f8ec8b8
31112 F20101123_AAAUWC palamadai_e_Page_75.pro
e0fc64f8ea1f73d78c96453135eb2fa2
1bc0beb73c0714a29446f18d3d14489f0797aa0a
F20101123_AAAURE palamadai_e_Page_24.tif
cf3ccde9edeb8f2ac738569d552cf272
61879fb6b2b865765b1cef41297bd82acd80477e
99644 F20101123_AAAUMH palamadai_e_Page_51.jpg
d19f97f99e7d814423e17d83c577a8c8
4b6375d60b2a033c2797d559abd10a309491b99b
31181 F20101123_AAAVDA palamadai_e_Page_49.QC.jpg
0953038a39765d9fe68561431f40e02e
ab2075683646ae6f00323ed30a470293f71e146d
21029 F20101123_AAAUWD palamadai_e_Page_76.pro
3ae05c18fd239c805ca4774b29f26a65
2fc587263c8baac6eb0c6987d34e6335dcb308f3
F20101123_AAAURF palamadai_e_Page_25.tif
b672023e8f031e0afa85d4316da81633
e6b65a6f900d394b181c1af45168b82236701005
98243 F20101123_AAAUMI palamadai_e_Page_52.jpg
2fb9b0e9cb16095c6a8479638a140e14
817f89c6b8db068f269341952197dccab35729b1
5526 F20101123_AAAVDB palamadai_e_Page_50thm.jpg
24b3cdd7bb97c62ad7d590231180cb51
30c071293d088741c667c4b067ccfc1b00bc5f2f
49803 F20101123_AAAUWE palamadai_e_Page_77.pro
e88236e217b21dfcb36e827d3ba52e58
540fd5915f0c92de3b5736b6b69df17668ac2594
F20101123_AAAURG palamadai_e_Page_26.tif
7cb5ee0c62d8e050f338924ad74fdd84
4e2a1b633d8f14a4b4979511bfe97ef2a1c8e869
65318 F20101123_AAAUMJ palamadai_e_Page_53.jpg
428c653d55a947ab58062b5793b98a7a
70f53a44b0b334cd5d64170b2d6de420b4ecdb03
23472 F20101123_AAAVDC palamadai_e_Page_50.QC.jpg
8be8cdff08d26b3d21be6330ed53d9c0
0eabd6b6bf02258bd58de7c16a53cdaf5cc1e121
50244 F20101123_AAAUWF palamadai_e_Page_78.pro
8117ca07f20d51db584c64fe69f6f1f1
1c397637b22de12c3120a3a26b7ce90a6f096120
F20101123_AAAURH palamadai_e_Page_27.tif
22208b6351adf3927cc873f50ba9a56f
7af5fb2478a830cb9c44b6535ee15fcd872ab239
95738 F20101123_AAAUMK palamadai_e_Page_54.jpg
a75a1c0e4e25ef5f5b8f90e026271de1
fa94060794c6ae7c38e15528e9a245150950106f
7101 F20101123_AAAVDD palamadai_e_Page_51thm.jpg
ca91efd5358c49d64445eedb59310d68
f8584dd0f391a6c080fe963b7e96e43bb320044f
15343 F20101123_AAAUWG palamadai_e_Page_79.pro
238bf7f9e4b9e18d3c4ec95260e0048c
bbec9afe34a1625a8c6b2b5789d3a26f5ef4d17f
F20101123_AAAURI palamadai_e_Page_28.tif
f53ab405f5c5048775a07dcb288c9112
b2f14b004730583b96b685dd605e21d80c958c68
99723 F20101123_AAAUML palamadai_e_Page_55.jpg
c4660c428b783e1bef4e93f25e5db65d
c8de4e3d34ca6a436344ec90e7b956153bf1b5e3
29472 F20101123_AAAVDE palamadai_e_Page_51.QC.jpg
752a1249bb78484df37791f1dc9dab9c
3d27ec0fa058ba6d015f5b91d301cf064de2fc40
F20101123_AAAURJ palamadai_e_Page_29.tif
946fa7a43c121fa5a37e6ac1f9301a5a
95c84c02445a5f00818018afca2100da1baa471b
115125 F20101123_AAAUMM palamadai_e_Page_56.jpg
c554fa409000d34310655e7515a67d30
e3ccbe18517acf303b8c2bfafe57c140aaba7ade
7466 F20101123_AAAVDF palamadai_e_Page_52thm.jpg
ee382e23565f3fb86a3bba648e1131e0
56e28d16118d247dc06e2a13f4f07c490d6519da
433 F20101123_AAAUWH palamadai_e_Page_01.txt
b3c5ede586e4085a507fc59a4078c5dc
8cb986fa52a2c0fe060dae769c28771769e0cc86
F20101123_AAAURK palamadai_e_Page_30.tif
93accecc8ee1edf0cfb325e2b8f1b02f
c08c21c2c96c68c4a1c86a8a8faee833b9eab851
105505 F20101123_AAAUMN palamadai_e_Page_57.jpg
8404e3bcb59feef4af248cdf19311699
1796b4e8e7bf9e331a1775da37e26ecceb169088
31300 F20101123_AAAVDG palamadai_e_Page_52.QC.jpg
c7245eaa8b23ec94f396dfb7a6b45a12
da7cf7d32d0f4ead188f6db858c270be2eeae02a
F20101123_AAAURL palamadai_e_Page_31.tif
efb592284e53ccc28f500bc165b431fb
59e49708e6b153968170d3602ce9fb840f19dc41
53490 F20101123_AAAUMO palamadai_e_Page_58.jpg
4b98d8d7f969508cb2f36e8eb03ff28b
ecb2cf3ed74fe110ff6b5a4e3168caf724fd8ef1
139 F20101123_AAAUWI palamadai_e_Page_02.txt
e065f04bd52be042d8064b88aa9ca4b7
34ff3fe2620f7daba42dc9874216bd2616f6d7cc
4176 F20101123_AAAVDH palamadai_e_Page_53thm.jpg
f6c1a8e943182258ce7e82a463072462
ae54a328da76b767376b9f6f1ad3399a9e35ec0d
F20101123_AAAURM palamadai_e_Page_32.tif
b30cbbc568f463e0ff3f7daefbe5b50b
5d2604634612b7c9f620c0eb310892a2083987ac
101447 F20101123_AAAUMP palamadai_e_Page_59.jpg
ce24584476b8009626311d8d08f63901
d09d67f67dafaa948507971771bf75ce4805ea84
135 F20101123_AAAUWJ palamadai_e_Page_03.txt
0d1288572844677d2c4929ce86a04f2f
e84256396b451464768e4d03a46f9af27cb8e02a
17423 F20101123_AAAVDI palamadai_e_Page_53.QC.jpg
a8576145b7410548afb217b72a816c52
edb924ba8b913198a022c1bcb4937ea10a256768
F20101123_AAAURN palamadai_e_Page_33.tif
7f302b1757383d7b4833f04d253ec36d
5a42703a49646cb80e9afd80478ba66493d22033
77594 F20101123_AAAUMQ palamadai_e_Page_60.jpg
b6c48239eb1a43ddf36c680b2ea3aea8
1a40e2bb2e9c16b63a5db00225cb80013359bc1e
813 F20101123_AAAUWK palamadai_e_Page_04.txt
2d7a41570aa6a5da9054375f977a6dd8
810d86add42536e5c3eae4e975224401e8c82308
7626 F20101123_AAAVDJ palamadai_e_Page_54thm.jpg
48f0c5c71da89906371d2ce859004c41
ec244488530ed7a5307976f600bc2321a8b9f5d0
F20101123_AAAURO palamadai_e_Page_34.tif
38e0447bf4d9f59ca324506b34992bdb
8e72e746cc29628f3f6d387d9152779cc9fe8986
50026 F20101123_AAAUMR palamadai_e_Page_62.jpg
8436eed98c54b6ff63da787d4ff7f1f8
c848ebe87cd72cee94cf065955bb7535b9b021bd
2181 F20101123_AAAUWL palamadai_e_Page_05.txt
50186ef4a4f427ad53f530dae7fa1aa5
fd3ec5933506f68250fd9cc12d79daa4096b6c7b
30626 F20101123_AAAVDK palamadai_e_Page_54.QC.jpg
215413f8bd4df6983a69a498c51f5340
54c6111ebfa13574756c291c5294efc5d05e816d
F20101123_AAAURP palamadai_e_Page_35.tif
a8967c8636a3d15da8a0267f145e6fcf
4069846dc89b1f962ee5a3b2e41c727399c6b857
43638 F20101123_AAAUMS palamadai_e_Page_63.jpg
8e7125f199e7aa2376296fa2f3a15917
65169f95f4f19648019bf7a918307fe58fb47b1b
1220 F20101123_AAAUWM palamadai_e_Page_06.txt
8e1b3e54265b440d07cc5739622fbec3
08495ef7768dc782318698c327588f398824c0f6
7692 F20101123_AAAVDL palamadai_e_Page_55thm.jpg
eb92cc3d7f5a12375ca855ee9005c6fa
4b9fda025bd9fff30bae4c5440119012c85f1b28
F20101123_AAAURQ palamadai_e_Page_36.tif
d2316d65f1c54a93c29cdcd5acb1b542
f44a2953db0fd45ca927c4fa058af025b580f0ba
45935 F20101123_AAAUMT palamadai_e_Page_64.jpg
80146a65a6e33aa5c97de21949ef5d1f
7f1aed3daeae54964ae3476b31c4d8f6864415a4
769 F20101123_AAAUWN palamadai_e_Page_07.txt
61f18b29c643583c55ee3d59c7f0f6cb
4d6c981dd1501bb05dfe09a11eb2cfce41edcffc
31377 F20101123_AAAVDM palamadai_e_Page_55.QC.jpg
d79a424b5a39919dd127b6c64d934665
9d6b9dd341519cbf99183b6bf4a73de39bb087b4
F20101123_AAAURR palamadai_e_Page_37.tif
dce0be09bdb65a869f73723d44cbfbcf
9ef8f096eac892b93929b47d8001786e208dff61
43286 F20101123_AAAUMU palamadai_e_Page_65.jpg
0ac88bca9ad7aff8821e5b16f3c31877
dbd8f098a12fd5a4e7815c76e3b2e1c62127aaeb
793 F20101123_AAAUWO palamadai_e_Page_08.txt
bd87a6aaf9e8813184918365e715c444
27f2b96c87e34d42d2ca71af7919f680f95342e6
7376 F20101123_AAAVDN palamadai_e_Page_56thm.jpg
dfe82edd364ec924e1de83bf7f1cb3e3
abd1c69f896f988b9d068440616262017f2fbb47
F20101123_AAAURS palamadai_e_Page_38.tif
f7164c17170c325b77ff180ef36c2c16
5337de926d37eb38ba73271f196163405432fd1c
60910 F20101123_AAAUMV palamadai_e_Page_66.jpg
1daabb488484c71f48802e0aa4f5f082
3840a009255c4c966e03a470aa015a4b9161acd3
1853 F20101123_AAAUWP palamadai_e_Page_09.txt
1c270db8c3e7dd67e36b42e8add201a6
384a05d9f2e5e1cab919d608e2a706b17ae4687c
30013 F20101123_AAAVDO palamadai_e_Page_56.QC.jpg
5551a237fee1c81706b90972928366a1
90aaaeded79d35b944fea014f3ca980d209e4064
F20101123_AAAURT palamadai_e_Page_39.tif
2b4528eb20cdf33c74ce73c716f89411
f442cb2d0eb31a070d00192f64abc53fda7a118c
40455 F20101123_AAAUMW palamadai_e_Page_67.jpg
504d7505f9e246cc2a2a2db707a5e9b4
43816765bfc1804f823c6a35d10c3e941220739b
399 F20101123_AAAUWQ palamadai_e_Page_10.txt
0eef3626168cf28233b389a5da0a3903
c05eb7d74dcf139457c9d7e92c1f406aef3b39a3
6043 F20101123_AAAVDP palamadai_e_Page_57thm.jpg
72c6cec0f1afe9ba13f39dc7c1a0a638
d4cbe83bd307b8d720de885fbe9d3aef8b1e2aac
F20101123_AAAURU palamadai_e_Page_40.tif
7e68fe21e134e040da4aa1c79f7d263f
da0f148121025baeeafa01a67023dc3326982c41
61465 F20101123_AAAUMX palamadai_e_Page_68.jpg
95a9cd09db65aef9f7eb93b48c1efdc1
f32140b6adcdb30e3ed6ac3aec4e68d1208a0045
1926 F20101123_AAAUWR palamadai_e_Page_11.txt
f13b9565a0c42a68b2fc38c221bd9c50
056fe00cfb509ba6f29397729088934aff6fff3b
26587 F20101123_AAAVDQ palamadai_e_Page_57.QC.jpg
9e65a2ff84c73dec86451e1e038187ef
dd841873602b62dd331795340b8206c3de282f43
F20101123_AAAURV palamadai_e_Page_41.tif
1ef804561ace051f127589eef3c36033
940dd6e388b840730275a4d4304a87fd2a5647d7
50403 F20101123_AAAUMY palamadai_e_Page_69.jpg
7f69e8adef30731538d942db2a897951
033f6f3e3ddd9235b23fb2c9669a2618e2ef5898
2064 F20101123_AAAUWS palamadai_e_Page_12.txt
e54afe84f8e1131bee883e439360393a
b6e753493cf7dbdd8be2aa6816b13a02a46c041b
4820 F20101123_AAAVDR palamadai_e_Page_58thm.jpg
4467d2efface0366286fef5301db5e96
ea109130a768847b9b90cfbf94f990fa39301e00
F20101123_AAAURW palamadai_e_Page_42.tif
7977b9570aaa13066eb713fb62443a73
5f77b8c086cc7556d4f663bc84e7c7af8faa49e7
44814 F20101123_AAAUKA palamadai_e_Page_14.pro
1882e9be1d7157758d035611a55ba83f
09df2eabd07928f3f1ef81468fde08b5a1f13df2
54550 F20101123_AAAUMZ palamadai_e_Page_70.jpg
cf892fa3db1753af12ececab841e29fe
79561c4045428049950f7e3953a52274563daad2
2117 F20101123_AAAUWT palamadai_e_Page_13.txt
6d751919fe3da7d0002ed14689b4ebf0
c006582831b20e7319850f0a4299a94b512a8026
18717 F20101123_AAAVDS palamadai_e_Page_58.QC.jpg
4736de751ca7e203c6f20e7b4a6b2737
2fd67f303f7172d716b1aa9e080a5b820379a0b6
F20101123_AAAURX palamadai_e_Page_43.tif
6ed967f30c05725587a4fdd199e55866
0defd92137c209680dfc5d3e3ce856580eb10b15
855733 F20101123_AAAUKB palamadai_e_Page_39.jp2
889366221732292151ea57f2510f350b
248590b0157c14984e93a889c542821374626c17
1788 F20101123_AAAUWU palamadai_e_Page_14.txt
9e82dac3737674ff8b68c18dd648b80e
9fddf01468e53c35c7bc01ea160d20a3aec50eae
F20101123_AAAURY palamadai_e_Page_44.tif
63e1514abf5c832a46a97d59bf82def5
a83b53e8fd2287142027039fc32a82e8a9abf49a
26868 F20101123_AAAUKC palamadai_e_Page_09.QC.jpg
ba041698b04ba92ffcaa69bdb690a9cd
53f8422b420fe9545cc295df0ee1d6b08ac1b7b0
1458 F20101123_AAAUWV palamadai_e_Page_15.txt
4638b24fd8f3a2dd09547fad8643d107
ca993f9e5cbbbe298e0133ddc6ea6858e0edf9b2
6415 F20101123_AAAVDT palamadai_e_Page_59thm.jpg
ba58dc6d1f071e9163607675e3d4ce69
f21b13b75540e4b49cdeda2242c1c155223fd8a1
F20101123_AAAURZ palamadai_e_Page_45.tif
e0f2a2edfd0aee994849ee58faa65ee2
94adc93f7016757113da2b87ee41ba312f318e88
87213 F20101123_AAAUKD palamadai_e_Page_09.jpg
1c1f5a3f417155d316871411caa7911e
a99f249f4184e762e591ae5c0b1fb5b57a7b247f
1298 F20101123_AAAUWW palamadai_e_Page_16.txt
c9f0e581930b0e118b94f0331936ad09
3b4cace67959527cbfd290f9848599faeb74aa2b
713596 F20101123_AAAUPA palamadai_e_Page_46.jp2
96d6cd379ad318754cab8b4830190b6b
e18d2430d7d1c2c3c747af7cb51a8ffac806bbc2
27549 F20101123_AAAVDU palamadai_e_Page_59.QC.jpg
2330d927d1250f8e14e669cde7bf6004
05cf3c301a04327da8a4160c3e73ab1097b01172
52194 F20101123_AAAUKE palamadai_e_Page_12.pro
daf699817329f8a15d1003816486e17a
cbf8a60a0f814e8615390ef173c2839896e330b0
1759 F20101123_AAAUWX palamadai_e_Page_17.txt
38a2512f3e9975eeacc6fcbb9b95a73d
096762b5cd61dcb9b4a4a33a23b3e1a1012f0e26
819363 F20101123_AAAUPB palamadai_e_Page_47.jp2
ce73aebda0cacfb57e7f869c0af9c06e
41357cb4b0923f31b01a18758073c5c5937411ad
4671 F20101123_AAAVDV palamadai_e_Page_60thm.jpg
d27741f89a10b0ce998f09032afb9934
4d9e39d2908b25fff7754b0fbd854a966637e5e2
16924 F20101123_AAAUKF palamadai_e_Page_19.QC.jpg
88a3f41c6b6c6eaf4fe0de9ddb973561
9f111511dab8ba81c274b6567df6bdf8d60199ca
45212 F20101123_AAAUUA palamadai_e_Page_21.pro
20f9d1ae8f63dd64c269eab9fb27428e
42ca09d27a7db68ca9de59f71d570d410255d5dc
1402 F20101123_AAAUWY palamadai_e_Page_18.txt
1023d01119b07f23b9b22ec562ac1f8c
20919c06e02e3313cf49126b2583609e85923f0d
1051909 F20101123_AAAUPC palamadai_e_Page_48.jp2
a3442a7ed7b1e97ac34134cbb4504e06
cb12d133e755f0f1a14c05550e9ca9fb6426b5b7
20081 F20101123_AAAVDW palamadai_e_Page_60.QC.jpg
1c38d8fa7ceb8068478ac4132c0340e0
6d7ec3b26b029fad4a14ecb230db0c522f5feefd
46479 F20101123_AAAUKG palamadai_e_Page_61.jpg
c33dc4a7ea944599e4f968c79143a661
6561d03cdd9c25ae0b6caf41eb889031f39e1311
42168 F20101123_AAAUUB palamadai_e_Page_22.pro
bf9c3e30b0c9cea715d8bb97e6ea7ae7
5e2cb9c99be8b84c5368d6d1f02d8c29175ae11c
1202 F20101123_AAAUWZ palamadai_e_Page_19.txt
797b6b4a259362f42bbe9476f0dde235
b442db603f0955fecbddfb146f9b567ad1a5ffdb
1051933 F20101123_AAAUPD palamadai_e_Page_49.jp2
7bd0a4796904feaf73075ae9a2ba11ac
a8220acd4646b0fc96c606f2b169c296ff9ceecb
4224 F20101123_AAAVDX palamadai_e_Page_61thm.jpg
6745677309d9b96a42dbc84283c42dfc
6fde319f7f0f2293dfb176d2a12e244c7ad22f08
121382 F20101123_AAAUKH UFE0011721_00001.xml
0491080103846f0dd1c940ae58df91dd
b1e38bd8363803cceccd1262cdeeb518ebfd401f
35588 F20101123_AAAUUC palamadai_e_Page_23.pro
4d4b04d339c12c3e52dc0b478dd472ab
f0fb3bfa4405706e9c5c300afa58d255e260d4a9
1006591 F20101123_AAAUPE palamadai_e_Page_50.jp2
4aab2ade51f589768e68b164a488f94f
9756deec8059f4aa5987d7be85556b8da1d0fdd6
15582 F20101123_AAAVDY palamadai_e_Page_61.QC.jpg
670a040bf887243a5087a2b3d1d91e90
1b92dd217a230ac750fc3e9ef0cdae89f74f953d
25312 F20101123_AAAVBA palamadai_e_Page_23.QC.jpg
e826c38137c2af72460ce5153cd3d296
090b2d678333d8e442b3d65544f95e4b9ec6ea47
1502 F20101123_AAAUZA palamadai_e_Page_72.txt
ac638e2611d0ed381f836cb71c67c508
6e12e9980779709aa02b2dd76df9c29de1e50632
39670 F20101123_AAAUUD palamadai_e_Page_24.pro
f91a0c0782826b08f3c2f01b23eedbe6
d4ed5f3c9737769092ce3fbeee2f6facf01bff00
1051973 F20101123_AAAUPF palamadai_e_Page_51.jp2
90e035255bade46243d8a45d15413ac1
2fdd1566bc298a2e49eb88fdcb1d4a9987d79cc0
4488 F20101123_AAAVDZ palamadai_e_Page_62thm.jpg
57b7fe447b6e521555fd18ebd8080043
2b6018c95f856049e7e7f2cfead4690e3e3b85db
6455 F20101123_AAAVBB palamadai_e_Page_24thm.jpg
05c05422103ef87e84b9e7fab9f2020b
6f1502455aaf7c66826bf650f115757a7a496efa
1461 F20101123_AAAUZB palamadai_e_Page_73.txt
32098a235d184298de9bafa417bbbeee
976dd94448966221b1c0025d34ff90f7c69b2864
51033 F20101123_AAAUUE palamadai_e_Page_25.pro
1fc0af7a0e700cc13d2e02103b6d2a8e
a6e02bc99202d40df0a7e4116917f087b317aab0
F20101123_AAAUPG palamadai_e_Page_52.jp2
e2c312ab383de72f948e343aa8d4365c
1a23f121b044144694186dd4d65f63ebab0840b3
26320 F20101123_AAAVBC palamadai_e_Page_24.QC.jpg
68721fdc4154563b999561be52078cdb
b5702053b2c46b71b1c31deddcd176f9daff73ff
1335 F20101123_AAAUZC palamadai_e_Page_74.txt
ee956dc869b98438ac03598a13af4c3c
5899dcca91f7a5b6d769067a5797e9c26cd08392
25436 F20101123_AAAUKK palamadai_e_Page_01.jpg
7a9359349e83885298315cc4f671f10a
fe5f9d80c71ebae61d1e2f54113093fe079d328f
691235 F20101123_AAAUPH palamadai_e_Page_53.jp2
da2e2b5bdecceb465c82d6c12757212f
80c16ae8a1bae408a825f8e2c19579009b44b70a
7327 F20101123_AAAVBD palamadai_e_Page_25thm.jpg
acbaf545865aafc0c823d28c37bac979
33dac20af6f27eec3dc0ea15aec177cefc873316
1373 F20101123_AAAUZD palamadai_e_Page_75.txt
c5b65979fe1c65a0c2a5e8195daa628d
d6dcd48becaef39855f2ed0cad6db0a3396443c8
5583 F20101123_AAAUKL palamadai_e_Page_02.jpg
0d2a671abf168979960f2446b8f41a87
39d58e0b01b4b26c937fe1d20ce96dc96456bcab
29800 F20101123_AAAUUF palamadai_e_Page_26.pro
93668a99d257ac5dfd0ef09ed817e8d9
402dab00e8ba17e52a7043cd7dd44a51eb4e3bbd
1051910 F20101123_AAAUPI palamadai_e_Page_54.jp2
904aa53286bb05cef85c514f0fbc2f52
55815942d0e32519e1a9e449875bad10b175de90
30802 F20101123_AAAVBE palamadai_e_Page_25.QC.jpg
ad37d117d801b7a3b0a45783e1e278d9
6967185cdb8c9b08c94130d8d2cb6a99ab04a4a9
934 F20101123_AAAUZE palamadai_e_Page_76.txt
ce7432b91526b6df011d03bef09806eb
5c6531f0ed9a37a7d5d921524fbcfb21813fac42
6706 F20101123_AAAUKM palamadai_e_Page_03.jpg
ec65d23846f12db64f6696af3e1157b4
3c45bd8a12a1739d33c348ca69f3333213a3bc52
47311 F20101123_AAAUUG palamadai_e_Page_27.pro
cb600785eacb3790e46cd156940cb335
0d38fe72f5552ba9c1e4e2d3bcbe9d932a559a34
1051963 F20101123_AAAUPJ palamadai_e_Page_55.jp2
19d72b12579bb1f69461ea34c04562e3
34299a9879b74fff4002a1955c079d501152d860
5904 F20101123_AAAVBF palamadai_e_Page_26thm.jpg
6d068849ae59ed126678d2084739bb4c
b360d77053bc1ce591e971853df1f0e73e47f690
41416 F20101123_AAAUKN palamadai_e_Page_04.jpg
74eea3a531ca4f74da7a927373756c8f
945cbbbc86b83afb90860a4e0c5e36dfd622588a
31940 F20101123_AAAUUH palamadai_e_Page_28.pro
62626ee6862f10c96909dca9d411b1b2
2313bc0b16d218bd88e4abc681e291b17b809dd3
F20101123_AAAUPK palamadai_e_Page_56.jp2
27cc129274a7025353723c2ad7446953
c6404f0e6161b7031c99c46fa357a9afe40eea18
2028 F20101123_AAAUZF palamadai_e_Page_77.txt
3117f72045cf91bbbc2c717507e0509b
aca686897b68d54dae79b0c2146fcc007ff205b0
21935 F20101123_AAAVBG palamadai_e_Page_26.QC.jpg
9caa89c7478d78e2d80cc6044a5fced5
f4a57527c0e0092a1b632507f4831154d727e8f8
93897 F20101123_AAAUKO palamadai_e_Page_05.jpg
57795ed6fbe4f0e1d59fe6b2b0987aab
e3c0df77fb8f72a025c5bb2d5b82543dd8be76ea
51335 F20101123_AAAUUI palamadai_e_Page_29.pro
b20b89313b1d0195c30f5154a21ddea4
7b032b86a368de1f4a9ec32e81882e9836193c9e
1051951 F20101123_AAAUPL palamadai_e_Page_57.jp2
89ea42be2319aaca31f2625cd1a64de7
5e92472e978d64e3bfae2fec36071260d2b5266b
2042 F20101123_AAAUZG palamadai_e_Page_78.txt
2faedc0a55af0b7ccee2c125980a815e
2a1c3edee213686d2893fb779a19f9734853b148
7191 F20101123_AAAVBH palamadai_e_Page_27thm.jpg
2ad008e807348859662291231b9c9f9b
223c0455fd69d738f56a459293969e2221a69e2b
76505 F20101123_AAAUKP palamadai_e_Page_06.jpg
268611b76c025e98dc96c59aba01d5ca
edb13de21c0b2e8c5924aeda8707a366ed21b557
50037 F20101123_AAAUUJ palamadai_e_Page_30.pro
e12f31ea705f83f03e28ecbcc7540fcb
bc5ff0cd809c3ae69cce1e920b28dfbde77f8b0d
518239 F20101123_AAAUPM palamadai_e_Page_58.jp2
3ed3941bc65949360344d17219949249
84f0d77ab74ae777e7a4626f46589509e157fa40
654 F20101123_AAAUZH palamadai_e_Page_79.txt
87b49b8921236e7ad8c3c02422a0effb
b330db26b5c9bc78da394adbba431d3338d30953
30369 F20101123_AAAVBI palamadai_e_Page_27.QC.jpg
fe52b9d339a3bb2f145da79f4f9fa4ae
ff3f5e968da491a00e7cd6408be7b76f3a3ddd62
43103 F20101123_AAAUKQ palamadai_e_Page_07.jpg
a9817ff4351e47b622dd07be2c494360
62f510bb675223fc2b595cae6761e0ff18effb0c
5819 F20101123_AAAUUK palamadai_e_Page_31.pro
bfd881ecd96e84d842c18906c16e1ff1
3525c9402d2b60c636a664020af3f9ba8f79f535
1051916 F20101123_AAAUPN palamadai_e_Page_59.jp2
c57036c231a440df0305ab9c13b60384
868595df261159e95d87d96dab32e891016863dc
332380 F20101123_AAAUZI palamadai_e.pdf
93847475bb7b1143f35e7a92883e274e
32d26c40fac93c279723c879726943ec986a1d71
6244 F20101123_AAAVBJ palamadai_e_Page_28thm.jpg
097a87cae7a9f5ba4dab67fdd2833398
b12bbccced302b353881fddd965b0cf93a1bad8d
36493 F20101123_AAAUKR palamadai_e_Page_08.jpg
66a0fc56d3e40dfcec5f4f77fdbc6b0c
f4002599575730fb8e88d620ecc6a75613111399
40478 F20101123_AAAUUL palamadai_e_Page_32.pro
11150158dcb9f9ca9e39cdc4f1a9e550
1948c78ee6f62895ad1d2d2a08d23b36bf7d7759
851776 F20101123_AAAUPO palamadai_e_Page_60.jp2
625ad45dd70ea5ec48a8a488c6a4879e
1b8c1b64f7d6fb5e6f4e79b08bc992c12277963b
2190 F20101123_AAAUZJ palamadai_e_Page_01thm.jpg
960245d3dc228a1a1cbeaca95bcf6c3c
4a962d30f98a6f9492e63b6029009fe7804534d1
23455 F20101123_AAAVBK palamadai_e_Page_28.QC.jpg
017137843a10f82002b7602dabea34de
c5a6b52d41f82945844d60df18caa6c34341eb71
21537 F20101123_AAAUKS palamadai_e_Page_10.jpg
ba1fef4243eb6e7065da90a8d61c518a
8f46ad98c5427f8ff685138100935e52e47fadf7
45503 F20101123_AAAUUM palamadai_e_Page_33.pro
87a49ee6802fa84e806e41e01a536ec1
ab434ec630bbfc15b26b9325d38e6b8e2527a654
488158 F20101123_AAAUPP palamadai_e_Page_61.jp2
de03c18b37de65dca93875a569aeaa22
1b95ce775cf3d4438c363ec77d5efdeac9d71fd5
7949 F20101123_AAAVBL palamadai_e_Page_29thm.jpg
fc096bc251601682f20fe69f428a9718
4b3b31219a0c0f8e85a4f003ef15dd913d113ab9
7719 F20101123_AAAUZK palamadai_e_Page_01.QC.jpg
88ae19e61d6afae1aa9c356c5a9f6241
ec78abe3a82d2f7014633b9c7bcdfc94aa6a9a24
92371 F20101123_AAAUKT palamadai_e_Page_11.jpg
4eb3b6f21b4194bec6e0ad7c89b00fcd
5fbf5a0c79639d70ba0636d94a6574b46c231719
28160 F20101123_AAAUUN palamadai_e_Page_34.pro
caa2ea4a6842b8e4377afd9dac2afc66
3e8c60c472788ed7866a58b5a0c0215e9d972c3f
462587 F20101123_AAAUPQ palamadai_e_Page_63.jp2
08be7a8a45a5f1c26f24fc7c4cee85f9
32040fbdf9d7204c44cbf65c1d63b791399a4222
32224 F20101123_AAAVBM palamadai_e_Page_29.QC.jpg
ca68f4ee53bc046a46ddd69097ced4da
05a72d396da78ffaaafcbd6912454abd47da3b92
689 F20101123_AAAUZL palamadai_e_Page_02thm.jpg
f59063a7d6b9155db43f3b2566f50575
32abd34342b5062c28a84c7bd9fd637d07be313d
101913 F20101123_AAAUKU palamadai_e_Page_12.jpg
7c5212c21dd8e6896265f4d048d8de58
e51638ee8d08ef438f44897a04cce699723620aa
46157 F20101123_AAAUUO palamadai_e_Page_35.pro
8c9690f9b4066a70ac8b57d81b7bded8
76b4ba95a5a253c56b548e2a131a723e7fe86063
471483 F20101123_AAAUPR palamadai_e_Page_64.jp2
c5fd2c78338373b20d94e47ea53936f2
3b784d4e5c3b52cfc689851055de93b589d77400
7819 F20101123_AAAVBN palamadai_e_Page_30thm.jpg
d675f0c815ddebebda0b06d8870305f7
b650a9994314b89a82136a0dffcd6ecb806edbb5
F20101123_AAAUZM palamadai_e_Page_02.QC.jpg
e3ec3983e070f243e53742455da2e0a1
f834b45539505b36ecb6cf705a24de5eb18336ec
104273 F20101123_AAAUKV palamadai_e_Page_13.jpg
02a93d67c8e4cc0c980ce8013cd6cbea
dfe1f95fdece44e73a0b23b2dbc6ecffcb927895
40411 F20101123_AAAUUP palamadai_e_Page_36.pro
e6bc9c0d8f46455a006edd9d8f331036
bd40e13d67b74544f72f5538500b09440b5687af
456713 F20101123_AAAUPS palamadai_e_Page_65.jp2
fad4032f923909cae01f4ebd7db08b06
c7103f3857cc805485cb2be863b637cd26c08e20
KLU--A HIGH PERFORMANCE SPARSE LINEAR SOLVER FOR CIRCUIT SIMULATION PROBLEMS

By

EKANATHAN PALAMADAI NATARAJAN

A THESIS PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE

UNIVERSITY OF FLORIDA

2005

Copyright 2005 by Ekanathan Palamadai Natarajan

I dedicate this work to my mother Savitri, who has been a source of inspiration and support to me.

ACKNOWLEDGMENTS

I would like to thank Dr. Timothy Davis, my advisor, for introducing me to the area of sparse matrix algorithms and linear solvers. I started only with my background in numerical analysis and algorithms, a year and a half back. The insights and knowledge I have gained since then in the area and in implementing a sparse linear solver like KLU would not have been possible but for Dr. Davis' guidance and help. I thank him for giving me an opportunity to work on KLU.

I would like to thank Dr. Jose Fortes and Dr. Arunava Banerjee for their support and help and for serving on my committee.

I would like to thank the CISE administrative staff for helping me at different times during my master's research work.

TABLE OF CONTENTS

ACKNOWLEDGMENTS
LIST OF TABLES
LIST OF FIGURES
ABSTRACT

CHAPTER

1 INTRODUCTION

2 THEORY: SPARSE LU
   2.1 Dense LU
   2.2 Sparse LU
   2.3 Left Looking Gaussian Elimination
   2.4 Gilbert-Peierls' Algorithm
       2.4.1 Symbolic Analysis
       2.4.2 Numerical Factorization
   2.5 Maximum Transversal
   2.6 Block Triangular Form
   2.7 Symmetric Pruning
   2.8 Ordering
   2.9 Pivoting
   2.10 Scaling
   2.11 Growth Factor
   2.12 Condition Number
   2.13 Depth First Search
   2.14 Memory Fragmentation
   2.15 Complex Number Support
   2.16 Parallelism in KLU

3 CIRCUIT SIMULATION: APPLICATION OF KLU
   3.1 Characteristics of Circuit Matrices
   3.2 Linear Systems in Circuit Simulation
   3.3 Performance Benchmarks
   3.4 Analyses and Findings
   3.5 Alternate Ordering Experiments

   3.6 Experiments with UF Sparse Matrix Collection
       3.6.1 Different Ordering Schemes in KLU
       3.6.2 Timing Different Phases in KLU
       3.6.3 Ordering Quality among KLU, UMFPACK and Gilbert-Peierls
       3.6.4 Performance Comparison between KLU and UMFPACK

4 USER GUIDE FOR KLU
   4.1 The Primary KLU Structures
       4.1.1 klu_common
       4.1.2 klu_symbolic
       4.1.3 klu_numeric
   4.2 KLU Routines
       4.2.1 klu_analyze
       4.2.2 klu_analyze_given
       4.2.3 klu_*factor
       4.2.4 klu_*solve
       4.2.5 klu_*tsolve
       4.2.6 klu_*refactor
       4.2.7 klu_defaults
       4.2.8 klu_*rec_pivot_growth
       4.2.9 klu_*estimate_cond_number
       4.2.10 klu_free_symbolic
       4.2.11 klu_free_numeric

REFERENCES

BIOGRAPHICAL SKETCH

LIST OF TABLES

3-1 Comparison between KLU and SuperLU on overall time and fill-in
3-2 Comparison between KLU and SuperLU on factor time and solve time
3-3 Ordering results using BTF+AMD in KLU on circuit matrices
3-4 Comparison of ordering results produced by BTF+AMD, AMD, MMD
3-5 Fill-in with four different schemes in KLU
3-6 Time in seconds spent in different phases in KLU
3-7 Fill-in among KLU, UMFPACK and Gilbert-Peierls
3-8 Performance comparison between KLU and UMFPACK

LIST OF FIGURES

2-1 Nonzero pattern of x when solving Lx = b
2-2 A matrix permuted to BTF form
2-3 A symmetric pruning scenario
2-4 A symmetric matrix and its graph representation
2-5 The matrix and its graph representation after one step of Gaussian elimination
2-6 A doubly bordered block diagonal matrix and its corresponding vertex separator tree

Abstract of Thesis Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Master of Science

KLU--A HIGH PERFORMANCE SPARSE LINEAR SOLVER FOR CIRCUIT SIMULATION PROBLEMS

By

Ekanathan Palamadai Natarajan

August 2005

Chair: Dr. Timothy A. Davis
Major Department: Computer and Information Science and Engineering

The thesis work focuses on KLU, a sparse high performance linear solver for circuit simulation matrices. During the circuit simulation process, one of the key steps is to solve sparse systems of linear equations of very high order. KLU targets solving these systems with efficient ordering mechanisms and high performance factorization and solve algorithms.

KLU uses a hybrid ordering strategy that comprises an unsymmetric permutation to ensure a zero free diagonal, a symmetric permutation to block upper triangular form and a fill reducing ordering such as approximate minimum degree. The factorization is based on Gilbert-Peierls' left-looking algorithm with partial pivoting. KLU also includes features like symmetric pruning to cut down symbolic analysis costs. It offers to solve up to four right hand sides in a single solve step. In addition, it offers transpose and conjugate transpose solve capabilities and important diagnostic features to estimate the reciprocal pivot growth of the factorization and the condition number of the input matrix.

The algorithm is implemented in the C language, with MATLAB interfaces as well. The MATLAB interfaces enable a user to invoke KLU routines from within

the MATLAB environment. The implementation was tested on circuit matrices and the results determined. KLU achieves superior fill-in quality with its hybrid ordering strategy and achieves a good performance speed-up when compared with existing sparse linear solvers for circuit problems. The thesis highlights the work being done on exploiting parallelism in KLU as well.

CHAPTER 1
INTRODUCTION

Sparse is beautiful. Solving systems of linear equations of the form Ax = b is a fundamental and important area of high performance computing. The matrix A is called the coefficient matrix and b the right hand side vector. The vector x is the solution to the equation. There are a number of methods available for solving such systems. Some of the popular ones are Gaussian elimination, QR factorization using Householder transformations or Givens rotations, and Cholesky factorization. Gaussian elimination with partial pivoting is the most widely used algorithm for solving linear systems because of its stability and better time complexity. Cholesky can be used only when A is symmetric positive definite.

Some systems that are solved comprise a dense coefficient matrix A. By dense, we mean most of the elements in A are nonzero. There are high performance subroutines such as the BLAS [1, 2, 3, 4, 5] that can maximize flop count for such dense matrices. The interesting systems are those where the coefficient matrix A happens to be sparse. By sparse, we mean the matrix has few nonzero entries (hereafter referred to simply as 'nonzeros'). The adjective 'few' is not well-defined, as we will see in chapter two. When matrices tend to be sparse, we need to find effective ways to store the matrix in memory, since we want to avoid storing the zeros of the matrix. Storing only the nonzeros of the matrix has consequences in the factorization algorithm as well. One typical example is that we do not know beforehand how nonzeros will appear in the L and U factors when we factorize the matrix. While we avoid storing the zeros, we also want to achieve good time complexity when solving sparse systems. If the time spent to

solve sparse systems remains the same as for dense systems, we have not done any better.

KLU stands for Clark Kent LU, since it is based on Gilbert-Peierls' algorithm, a non-supernodal algorithm, which is the predecessor to SuperLU, a supernodal algorithm. KLU is a sparse high performance linear solver that employs hybrid ordering mechanisms and elegant factorization and solve algorithms. It achieves a high quality fill-in rate and beats many existing solvers in run time, when used for matrices arising in circuit simulation.

There are several flavors of Gaussian elimination. A left-looking Gaussian elimination algorithm factorizes the matrix left to right, computing columns of L and U. A right-looking version factorizes the matrix from top-left to bottom-right, computing a column of L and a row of U at each step. Both have their advantages and disadvantages. KLU uses a left looking algorithm called Gilbert-Peierls' algorithm. It comprises a graph theoretical symbolic analysis phase that identifies the nonzero pattern of each column of the L and U factors, and a left-looking numerical factorization phase with partial pivoting that calculates the numerical values of the factors. KLU uses symmetric pruning to cut down the symbolic analysis cost. We shall look at these features in detail in chapter two.

A critical issue in linear solvers is ordering. Ordering means permuting the rows and columns of a matrix, so that the fill-in in the L and U factors is reduced to a minimum. A fill-in is defined as a nonzero appearing in either of the matrices L or U, while the element in the corresponding position in A is a zero: L(i,j) or U(i,j) is a fill-in if A(i,j) is a zero. Fill-in has obvious consequences in memory, in that the factorization algorithm could create dense L and U factors that exhaust available memory. A good ordering algorithm yields a low fill-in in the factors. Finding the ordering that gives minimal fill-in is an NP complete problem. So

ordering algorithms use heuristics. KLU accommodates multiple ordering schemes like AMD, COLAMD and any user generated permutation.

There are other orderings for different purposes. For example, one could order a matrix to ensure that it has no zeros on the diagonal; otherwise, the Gaussian elimination would fail. Another ordering scheme could reduce the factorization work. KLU employs two such orderings, namely an unsymmetric ordering that ensures a zero free diagonal and a symmetric ordering that permutes the matrix into a block upper triangular form (BTF), which restricts factorization to only the diagonal blocks.

One of the key steps in the circuit simulation process is solving sparse linear systems. These systems originate from solving large systems of nonlinear equations using Newton's method and integrating large stiff systems of ordinary differential equations. These systems are of very high dimensions, and a considerable fraction of simulation time is spent on solving them. Often the solve phase tends to be a bottleneck in the simulation process. Hence high performance sparse solvers that optimize memory usage and solution time are critical components of circuit simulation software. Some of the popular solvers in use in circuit simulation tools are Sparse1.3 and SuperLU. Sparse1.3 is used in the SPICE circuit simulation package, and SuperLU uses a supernodal factorization algorithm. Experimental results of KLU indicate that it is 1000 times faster than Sparse1.3 and 1.5 to 3 times faster than SuperLU.

Circuit matrices show some unique properties. They have a nearly zero free diagonal. They have a roughly symmetric pattern but unsymmetric values. They are highly sparse and often have a few dense rows and columns. These dense rows/columns arise from voltage sources and current sources in the circuit. Circuit matrices show good amenability to BTF ordering. Though the nonzero pattern of the original matrix is unsymmetric, the nonzero pattern of the blocks produced by BTF

ordering tends to be symmetric. Since circuit matrices are extremely sparse, sparse matrix algorithms such as SuperLU [6] and UMFPACK [7, 8] that employ dense BLAS kernels are often inappropriate. Another unique characteristic of circuit matrices is that employing a good ordering strategy keeps the L and U factors sparse. However, as we will see in the experimental results, typical ordering strategies can lead to high fill-in.

In circuit simulation problems, typically the circuit matrix template is generated once and only the numerical values of the matrix change. In other words, the nonzero pattern of the matrix does not change. This implies that we need to order and factor the matrix only once, to generate the ordering permutations and the nonzero patterns of the L and U factors. For all subsequent matrices, we can use the same information and need only to recompute the numerical values of the L and U factors. This process of skipping the analysis and factor phases is called refactorization. Refactorization leads to a significant reduction in run time.

Because of the unique characteristics of circuit matrices and their amenability to BTF ordering, KLU is a method well-suited to circuit simulation problems. KLU has been implemented in the C language. It offers a set of APIs for the analysis phase, factor phase, solve phase and refactor phase. It also offers the ability to solve up to four right hand sides in a single solve step. In addition, it offers transpose solve and conjugate transpose solve features, and diagnostic tools like a pivot growth estimator and a condition number estimator. It also offers a MATLAB interface for the API, so that KLU can be used from within the MATLAB environment.

CHAPTER 2
THEORY: SPARSE LU

2.1 Dense LU

Consider the problem of solving the linear system of n equations in n unknowns:

$$
\begin{aligned}
a_{11} x_1 + a_{12} x_2 + \cdots + a_{1n} x_n &= b_1 \\
a_{21} x_1 + a_{22} x_2 + \cdots + a_{2n} x_n &= b_2 \\
&\;\;\vdots \\
a_{n1} x_1 + a_{n2} x_2 + \cdots + a_{nn} x_n &= b_n
\end{aligned}
\tag{2-1}
$$

or, in matrix notation,

$$
\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & & & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
=
\begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix}
\tag{2-2}
$$

that is, $Ax = b$, where $A = (a_{ij})$, $x = (x_1, x_2, \ldots, x_n)^T$ and $b = (b_1, \ldots, b_n)^T$.

A well-known approach to solving this equation is Gaussian elimination. Gaussian elimination consists of a series of eliminations of the unknowns $x_i$ from the original system. Let us briefly review the elimination process. In the first step, the first equation of (2-1) is multiplied by $-\frac{a_{21}}{a_{11}}, -\frac{a_{31}}{a_{11}}, \ldots, -\frac{a_{n1}}{a_{11}}$ and added to the second through n-th equations of (2-1), respectively. This eliminates $x_1$ from the second through the n-th equations. After

the first step, (2-2) becomes

$$
\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
0 & a_{22}^{(1)} & \cdots & a_{2n}^{(1)} \\
\vdots & & & \vdots \\
0 & a_{n2}^{(1)} & \cdots & a_{nn}^{(1)}
\end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
=
\begin{bmatrix} b_1 \\ b_2^{(1)} \\ \vdots \\ b_n^{(1)} \end{bmatrix}
\tag{2-3}
$$

where $a_{22}^{(1)} = a_{22} - \frac{a_{21}}{a_{11}} a_{12}$, $a_{32}^{(1)} = a_{32} - \frac{a_{31}}{a_{11}} a_{12}$ and so on. In the second step, $x_2$ will be eliminated by a similar process of computing multipliers and adding the multiplied second equation to the third through n-th equations. After n-1 eliminations, the matrix A is transformed into an upper triangular matrix U. The upper triangular system is then solved by back-substitution.

An equivalent interpretation of this elimination process is that we have factorized A into a lower triangular matrix L and an upper triangular matrix U, where

$$
L =
\begin{bmatrix}
1 & 0 & 0 & \cdots & 0 \\
\frac{a_{21}}{a_{11}} & 1 & 0 & \cdots & 0 \\
\frac{a_{31}}{a_{11}} & \frac{a_{32}^{(1)}}{a_{22}^{(1)}} & 1 & \cdots & 0 \\
\vdots & & & & \\
\frac{a_{n1}}{a_{11}} & \frac{a_{n2}^{(1)}}{a_{22}^{(1)}} & \frac{a_{n3}^{(2)}}{a_{33}^{(2)}} & \cdots & 1
\end{bmatrix}
\tag{2-4}
$$

and

$$
U =
\begin{bmatrix}
a_{11} & a_{12} & a_{13} & \cdots & a_{1n} \\
0 & a_{22}^{(1)} & a_{23}^{(1)} & \cdots & a_{2n}^{(1)} \\
0 & 0 & a_{33}^{(2)} & \cdots & a_{3n}^{(2)} \\
\vdots & & & & \\
0 & 0 & 0 & \cdots & a_{nn}^{(n-1)}
\end{bmatrix}
\tag{2-5}
$$

Column k of the lower triangular matrix L consists of the multipliers obtained during step k of the elimination process, with their sign negated.
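The elimination process just described is compact enough to sketch in C. The following is a minimal dense, in-place LU factorization without pivoting (pivoting is discussed later in this chapter); the function name and the fixed 3-by-3 size are illustrative only and are not part of KLU.

```c
#define N 3

/* Dense LU without pivoting: factor A in place. After the call, the
 * upper triangle of A holds U and the strict lower triangle holds the
 * multipliers a_ik / a_kk, i.e., the entries of L (whose diagonal is 1). */
static void dense_lu(double A[N][N])
{
    for (int k = 0; k < N; k++) {
        for (int i = k + 1; i < N; i++) {
            A[i][k] /= A[k][k];                 /* multiplier l_ik      */
            for (int j = k + 1; j < N; j++) {
                A[i][j] -= A[i][k] * A[k][j];   /* eliminate x_k, row i */
            }
        }
    }
}
```

For A = [2 1 1; 4 3 3; 8 7 9] this yields L = [1 0 0; 2 1 0; 4 3 1] and U = [2 1 1; 0 1 1; 0 0 2]; storing both factors in the same array works because the unit diagonal of L need not be stored.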

Mathematically, $Ax = b$ can be rewritten as

$$
(LU)x = b \qquad L(Ux) = b \tag{2-6}
$$

Substituting $Ux = y$ in (2-6), we have

$$
Ly = b \tag{2-7}
$$
$$
Ux = y \tag{2-8}
$$

By solving these two triangular systems (the first lower triangular, the second upper triangular), we find the solution to the actual system. The reason for triangularizing the system is to avoid finding the inverse of the original coefficient matrix A. Finding the inverse is at least thrice as expensive as Gaussian elimination in the dense case and often leads to more inaccuracies.

2.2 Sparse LU

A sparse matrix is defined as one that has few nonzeros in it. The quantification of the adjective 'few' is not specified. The decision as to what kind of algorithm to use (sparse or dense) depends on the fill-in properties of the matrices. However, sparse matrices typically have O(n) nonzero entries.

Dense matrices are typically represented by a two dimensional array. The zeros of a sparse matrix should not be stored if we want to save memory. This fact makes a two dimensional array unsuitable for representing sparse matrices. Sparse matrices are represented with a different kind of data structure. They can be represented in two different data structures, viz. column compressed form or row compressed form.

A column compressed form consists of three vectors Ap, Ai and Ax. Ap consists of column pointers. It is of length n+1. The start of column k of the input matrix is given by Ap[k].

Ai consists of the row indices of the elements. This is a zero based data structure, with row indices in the interval [0,n). Ax consists of the actual numerical values of the elements. Thus the elements of a column k of the matrix are held in Ax[Ap[k] ... Ap[k+1]), and the corresponding row indices are held in Ai[Ap[k] ... Ap[k+1]). Equivalently, a row compressed format stores a row pointer vector Ap, a column indices vector Ai and a value vector Ax. For example, the matrix

    [ 5 0 0
      4 2 0
      3 1 8 ]

when represented in column compressed format will be

    Ap: 0 3 5 6
    Ai: 0 1 2 1 2 2
    Ax: 5 4 3 2 1 8

and when represented in row compressed format will be

    Ap: 0 1 3 6
    Ai: 0 0 1 0 1 2
    Ax: 5 4 2 3 1 8

Let nnz represent the number of nonzeros in a matrix of dimension n × n. A dense matrix representation needs n^2 memory to represent the matrix. A sparse matrix representation reduces this to O(n + nnz), and typically nnz ≪ n^2.
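As a concrete illustration, here is the example matrix above held in column compressed form in C, together with the standard traversal pattern over its columns. The sparse matrix-vector product shown is a hypothetical helper, not a KLU routine; it touches only the stored nonzeros, which is the whole point of the format.

```c
/* The 3-by-3 example matrix
 *     [ 5 0 0
 *       4 2 0
 *       3 1 8 ]
 * in column compressed form.                                        */
static const int    Ap[] = { 0, 3, 5, 6 };        /* column pointers  */
static const int    Ai[] = { 0, 1, 2, 1, 2, 2 };  /* row indices      */
static const double Ax[] = { 5, 4, 3, 2, 1, 8 };  /* numerical values */

/* y = y + A*x. The nonzeros of column k live in Ax[Ap[k] ... Ap[k+1]),
 * with their row indices at the same positions of Ai, so each column is
 * scattered into y in time proportional to its nonzero count.        */
static void csc_gaxpy(int n, const double *x, double *y)
{
    for (int k = 0; k < n; k++) {
        for (int p = Ap[k]; p < Ap[k + 1]; p++) {
            y[Ai[p]] += Ax[p] * x[k];
        }
    }
}
```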

2.3 Left Looking Gaussian Elimination

Let us derive a left looking version of Gaussian elimination. Let an input matrix A of order n × n be represented as a product of two triangular matrices L and U. Let

$$
\begin{bmatrix}
A_{11} & a_{12} & A_{13} \\
a_{21} & a_{22} & a_{23} \\
A_{31} & a_{32} & A_{33}
\end{bmatrix}
=
\begin{bmatrix}
L_{11} & 0 & 0 \\
l_{21} & 1 & 0 \\
L_{31} & l_{32} & L_{33}
\end{bmatrix}
\begin{bmatrix}
U_{11} & u_{12} & U_{13} \\
0 & u_{22} & u_{23} \\
0 & 0 & U_{33}
\end{bmatrix}
\tag{2-9}
$$

where the uppercase $A_{ij}$ are blocks, the lowercase $a_{12}, a_{21}, a_{23}, a_{32}$ are vectors, and $a_{22}$ is a scalar. The dimensions of the different elements in the matrices are as follows:

$A_{11}, L_{11}, U_{11}$ are $k \times k$ blocks;
$a_{12}, u_{12}$ are $k \times 1$ vectors;
$A_{13}, U_{13}$ are $k \times (n-(k+1))$ blocks;
$a_{21}, l_{21}$ are $1 \times k$ row vectors;
$a_{22}, u_{22}$ are scalars;
$a_{23}, u_{23}$ are $1 \times (n-(k+1))$ row vectors;
$A_{31}, L_{31}$ are $(n-(k+1)) \times k$ blocks;
$a_{32}, l_{32}$ are $(n-(k+1)) \times 1$ vectors;
$A_{33}, L_{33}, U_{33}$ are $(n-(k+1)) \times (n-(k+1))$ blocks.

From (2-9), we can arrive at the following set of equations:

$$
\begin{aligned}
L_{11} U_{11} &= A_{11} && (2\text{-}10) \\
L_{11} u_{12} &= a_{12} && (2\text{-}11) \\
L_{11} U_{13} &= A_{13} && (2\text{-}12) \\
l_{21} U_{11} &= a_{21} && (2\text{-}13) \\
l_{21} u_{12} + u_{22} &= a_{22} && (2\text{-}14) \\
l_{21} U_{13} + u_{23} &= a_{23} && (2\text{-}15) \\
L_{31} U_{11} &= A_{31} && (2\text{-}16) \\
L_{31} u_{12} + l_{32} u_{22} &= a_{32} && (2\text{-}17) \\
L_{31} U_{13} + l_{32} u_{23} + L_{33} U_{33} &= A_{33} && (2\text{-}18)
\end{aligned}
$$

From (2-11), (2-14) and (2-17), we can compute the second block column of L and U (that is, column k+1), assuming we have already computed $L_{11}$, $l_{21}$ and $L_{31}$. We first solve the lower triangular system (2-11) for $u_{12}$. Then, we solve for $u_{22}$ using (2-14) by computing the sparse dot product

$$
u_{22} = a_{22} - l_{21} u_{12} \tag{2-19}
$$

Finally, we solve (2-17) for $l_{32}$ as

$$
l_{32} = \frac{1}{u_{22}} \left( a_{32} - L_{31} u_{12} \right) \tag{2-20}
$$

This step of computing the second block column of L and U can be considered equivalent to solving a lower triangular system as follows:

$$
\begin{bmatrix}
L_{11} & 0 & 0 \\
l_{21} & 1 & 0 \\
L_{31} & 0 & I
\end{bmatrix}
\begin{bmatrix}
u_{12} \\ u_{22} \\ l_{32} u_{22}
\end{bmatrix}
=
\begin{bmatrix}
a_{12} \\ a_{22} \\ a_{32}
\end{bmatrix}
\tag{2-21}
$$

This mechanism of computing column k of L and U by solving a lower triangular system Lx = b is the key step in a left-looking factorization algorithm. As we will see later, Gilbert-Peierls' algorithm revolves around solving this lower triangular system. The algorithm is called a left-looking algorithm since column k of L and U is computed using the already computed columns 1...k-1 of L. In other words, to compute column k of L and U, one looks only at the already computed columns 1...k-1 of L, which are to the left of the currently computed column k.

2.4 Gilbert-Peierls' Algorithm

Gilbert and Peierls [9] proposed an algorithm for Gaussian elimination with partial pivoting, in time proportional to the flop count of the elimination, to factor an arbitrary nonsingular sparse matrix A as PA = LU. If flops(LU) is the number

of nonzero multiplications performed when multiplying the two matrices L and U, then Gaussian elimination uses exactly flops(LU) multiplications and divisions to factor a matrix A into L and U.

Given an input matrix and assuming no partial pivoting, it is possible to predict the nonzero pattern of its factors. However, with partial pivoting, it is not possible to predict the exact nonzero pattern of the factors beforehand. Finding an upper bound is possible, but the bound can be very loose [10]. Note that computing the nonzero pattern of L and U is a necessary part of Gaussian elimination involving sparse matrices, since we represent them not with two dimensional arrays but with sparse data structures.

Gilbert-Peierls' algorithm aims at computing the nonzero pattern of the factors and the numerical values in a total time proportional to O(flops(LU)). It consists of two stages for determining every column of L and U. The first stage is a symbolic analysis stage that computes the nonzero pattern of column k of the factors. The second stage is the numerical factorization stage, which involves solving the lower triangular system Lx = b that we discussed in the section above.

2.4.1 Symbolic Analysis

A sparse Gaussian elimination algorithm with partial pivoting cannot know the exact nonzero structure of the factors ahead of all numerical computation, simply because partial pivoting at column k can introduce new nonzeros in columns k+1 ... n. Solving Lx = b must be done in time proportional to the number of flops performed. Consider a simple column-oriented algorithm in MATLAB notation for solving Lx = b as follows:

    x = b
    for j = 1:n
        if x(j) ~= 0
            x(j+1:n) = x(j+1:n) - L(j+1:n,j) * x(j)
        end

    end

The above algorithm takes time O(n + number of flops performed). The O(n) term looks harmless, but Lx = b is solved n times in the factorization A = LU, leading to an unacceptable O(n^2) term in the work to factorize A into L times U. To remove the O(n) term, we must replace the algorithm with

    x = b
    for each j for which x(j) ~= 0
        x(j+1:n) = x(j+1:n) - L(j+1:n,j) * x(j)
    end

This reduces the O(n) term to O(η(b)), where η(b) is the number of nonzeros in b. Note that b is a column of the input matrix A. Thus, to solve Lx = b, we need to know the nonzero pattern of x before we compute x itself. Symbolic analysis helps us determine the nonzero pattern of x.

Let us say we are computing column k of L and U. Let $G = G(L_k)$ be the directed graph of L with k-1 vertices representing the already computed k-1 columns. $G(L_k)$ has an edge $j \to i$ iff $l_{ij} \neq 0$. Let $B = \{i \mid b_i \neq 0\}$ and $X = \{i \mid x_i \neq 0\}$. Then the elements of X are given by

$$
X = \mathrm{Reach}_{G(L_k)}(B) \tag{2-22}
$$

The nonzero pattern X is computed by determining the vertices that are reachable from the vertices of the set B. The reachability problem can be solved using a classical depth first search in $G(L_k)$ from the vertices of the set B. If $b_j \neq 0$, then $x_j \neq 0$. In addition, if $l_{ij} \neq 0$, then $x_i \neq 0$ even if $b_i = 0$. This is because the term $l_{ij} x_j$ contributes a nonzero in the equation when we solve for $x_i$. During the depth first search, Gilbert-Peierls' algorithm computes a topological order of X. This topological ordering is useful for eliminating the unknowns in the numerical factorization step.
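A minimal sketch of this reachability computation in C: a depth first search over the graph of L, started from every nonzero row index of b, marks exactly the pattern X of (2-22). The recursive formulation and the array names are illustrative; a production implementation uses an explicit stack and also records the topological order as vertices finish.

```c
/* G(L) in column compressed pattern form: column j of L stores the row
 * indices i of its off-diagonal nonzeros l_ij in Li[Lp[j] ... Lp[j+1]),
 * i.e., the targets of the edges j -> i.                              */
static void dfs(int j, const int *Lp, const int *Li, int *mark)
{
    mark[j] = 1;                      /* x_j is (structurally) nonzero */
    for (int p = Lp[j]; p < Lp[j + 1]; p++) {
        if (!mark[Li[p]]) {
            dfs(Li[p], Lp, Li, mark);
        }
    }
}

/* Mark X = Reach(B), where Bi[0..nb) lists the nonzero rows of b. */
static void reach(const int *Lp, const int *Li,
                  const int *Bi, int nb, int *mark)
{
    for (int k = 0; k < nb; k++) {
        if (!mark[Bi[k]]) {
            dfs(Bi[k], Lp, Li, mark);
        }
    }
}
```

For instance, if L has edges 0 → 2, 1 → 3 and 2 → 3, and b is nonzero only in row 0, the search marks rows 0, 2 and 3 but not row 1.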

Figure 2-1: Nonzero pattern of x when solving Lx = b

The row indices vector Li of columns 1...k-1 of L represents the adjacency list of the graph $G(L_k)$. The depth first search takes time proportional to the number of vertices examined plus the number of edges traversed.

2.4.2 Numerical Factorization

Numerical factorization consists of solving the system (2-21) for each column k of L and U. Normally, we would solve for the unknowns in (2-21) in increasing order of the row index. The row indices/nonzero pattern computed by depth first search are not necessarily in increasing order. Sorting the indices would increase the time complexity above our O(flops(LU)) goal. However, the requirement of eliminating unknowns in increasing order can be relaxed to a topological order of the row indices. An unknown $x_i$ can be computed once all the unknowns $x_j$ on which it depends have been computed. This is obvious when we write out the equations comprising a lower triangular solve. Theoretically, the unknowns can be solved in any topological order. The depth first search algorithm gives one such topological order, which is sufficient for our case. In our example, the depth first search would have finished exploring vertex i before it finishes exploring vertex j.

Hence a topological order given by depth first search would have j appearing before i. This is exactly what we need.

Gilbert-Peierls' algorithm starts with an identity L matrix. The entire left looking algorithm can be summarized in MATLAB notation as follows:

    L = I
    for k = 1:n
        x = L \ A(:,k)    % (partial pivoting on x can be done here)
        U(1:k,k) = x(1:k)
        L(k:n,k) = x(k:n) / U(k,k)
    end

where x = L \ b denotes the solution of a sparse lower triangular system. In this case, b is the k-th column of A. The total time complexity of Gilbert-Peierls' algorithm is O(η(A) + flops(LU)), where η(A) is the number of nonzeros in the matrix A and flops(LU) is the flop count of the product of the matrices L and U. Typically flops(LU) dominates the complexity, hence the claim of factorizing in time proportional to the flop count.

2.5 Maximum Transversal

Duff [11, 12] proposed an algorithm for determining the maximum transversal of a directed graph. The purpose of the algorithm is to find a row permutation that minimizes the zeros on the diagonal of the matrix. For nonsingular matrices, the algorithm ensures a zero free diagonal. KLU employs Duff's [11, 12] algorithm to find an unsymmetric permutation of the input matrix that yields a zero free diagonal. A matrix cannot be permuted to have a zero free diagonal if and only if it is structurally singular. A matrix is structurally singular if there is no permutation of its nonzero pattern that makes it numerically nonsingular.
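The zero free diagonal permutation that this section develops can be pictured with a small augmenting-path sketch in C. This is the generic bipartite matching idea underlying transversal algorithms, not Duff's tuned implementation: rows are matched to columns, and a depth first search re-assigns earlier matches along a reassignment chain when needed. All names are illustrative, and the sketch assumes n does not exceed a small fixed bound.

```c
#define MAXN 64   /* small fixed bound, for the sketch only */

/* Row-wise pattern: the columns with a nonzero in row i are listed in
 * Ci[Rp[i] ... Rp[i+1]). match[j] is the row assigned to column j, or -1.
 * Returns 1 if an augmenting path starting at row i was found.        */
static int augment(int i, const int *Rp, const int *Ci,
                   int *match, int *visited)
{
    for (int p = Rp[i]; p < Rp[i + 1]; p++) {
        int j = Ci[p];
        if (visited[j]) continue;
        visited[j] = 1;
        /* Use column j if it is free, or if its current row can be
         * re-assigned to some other column (the reassignment chain). */
        if (match[j] < 0 || augment(match[j], Rp, Ci, match, visited)) {
            match[j] = i;
            return 1;
        }
    }
    return 0;
}

/* Size of a maximum transversal: one augmenting search per row. */
static int max_transversal(int n, const int *Rp, const int *Ci, int *match)
{
    int visited[MAXN];
    int count = 0;
    for (int j = 0; j < n; j++) match[j] = -1;
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) visited[j] = 0;
        count += augment(i, Rp, Ci, match, visited);
    }
    return count;
}
```

For a structurally nonsingular matrix the returned count equals n, and match then encodes a row permutation that places a nonzero on every diagonal position.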


A transversal is defined as a set of nonzeros, no two of which lie in the same row or column, that forms the diagonal of the permuted matrix. A transversal of maximum length is the maximum transversal. Duff's maximum transversal algorithm represents the matrix as a graph with each vertex corresponding to a row of the matrix. An edge i_k -> i_{k+1} exists in the graph if A(i_k, j_{k+1}) is a nonzero and A(i_{k+1}, j_{k+1}) is an element in the transversal set. A path between vertices i_0 and i_k consists of a sequence of nonzeros (i_0, j_1), (i_1, j_2), ..., (i_{k-1}, j_k), where the current transversal includes (i_1, j_1), (i_2, j_2), ..., (i_k, j_k). If there is a nonzero in position (i_k, j_{k+1}), and no nonzero in row i_0 or column j_{k+1} is currently on the transversal, the algorithm increases the transversal by one: it adds the nonzeros (i_r, j_{r+1}), r = 0, 1, ..., k to the transversal and removes the nonzeros (i_r, j_r), r = 1, 2, ..., k from it. This adding and removing of nonzeros is called a reassignment chain, or augmenting path. A vertex or row is said to be assigned if a nonzero in that row has been chosen for the transversal. Augmenting paths are constructed by a depth first search from an unassigned row i_0 of the matrix, continuing until a vertex i_k is reached where the path terminates because A(i_k, j_{k+1}) is a nonzero and column j_{k+1} is unassigned. The search then backtracks to i_0, adding and removing transversal elements, thus constructing an augmenting path. Duff's maximum transversal algorithm has a worst case time complexity of O(nτ), where τ is the number of nonzeros in the matrix and n is the order of the matrix. In practice, however, the time complexity is close to O(n + τ).
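The augmenting-path idea can be sketched in a few lines. This is a simplified illustration of the technique, not Duff's implementation (it uses recursion and a per-row restart rather than his cheap-assignment and lookahead refinements); the row-wise adjacency list `A_cols` and the function name are assumptions:

```python
def max_transversal(A_cols):
    """Grow a transversal one augmenting path at a time. A_cols[i] lists the
    columns holding a nonzero in row i. Returns match: column -> assigned row,
    so (match[j], j) is a transversal nonzero for every matched column j."""
    match = {}                                   # column -> assigned row

    def augment(i, seen):
        for j in A_cols[i]:
            if j not in seen:
                seen.add(j)
                # a free column ends the path; otherwise try to reroute the
                # row currently holding column j (the reassignment chain)
                if j not in match or augment(match[j], seen):
                    match[j] = i
                    return True
        return False

    for i in range(len(A_cols)):
        augment(i, set())
    return match
```

For a structurally nonsingular matrix every row ends up assigned, i.e. the permuted matrix has a zero free diagonal.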
The maximum transversal problem can also be cast as a maximal matching problem on bipartite graphs; we state it here only for comparison. The maximal matching problem is stated as follows.


Figure 2-2: A matrix permuted to BTF form

Given an undirected graph G = (V, E), a matching is a subset of the edges M ⊆ E such that for all vertices v ∈ V, at most one edge of M is incident on v. A vertex v ∈ V is matched if some edge in M is incident on v; otherwise v is unmatched. A maximal matching of maximum cardinality is a matching M such that for any matching M', we have |M| ≥ |M'| (strictly speaking, such a matching is a maximum matching; a matching that merely cannot be extended is maximal). A maximal matching can be built incrementally by picking an arbitrary edge e in the graph, deleting every edge that shares a vertex with e, and repeating until the graph runs out of edges.

2.6 Block Triangular Form

A block (upper) triangular matrix is similar to an upper triangular matrix, except that the diagonal entries of the former are square blocks instead of scalars. Figure 2-2 shows a matrix permuted to BTF form. Converting the input matrix to block triangular form is important because:

1. The part of the matrix below the block diagonal requires no factorization effort.


2. The diagonal blocks are independent of each other; only the blocks need to be factorized. For example, in Figure 2-2 the subsystem A44 x4 = b4 can be solved independently for x4, and x4 can then be eliminated from the overall system. The system A33 x3 = b3 - A34 x4 is then solved for x3, and so on.

3. The off-diagonal nonzeros do not contribute to any fill-in.

Finding a symmetric permutation of a matrix to its BTF form is equivalent to finding the strongly connected components of a graph. A strongly connected component of a directed graph G = (V, E) is a maximal set of vertices C ⊆ V such that for every pair of vertices u and v in C, we have both u -> v and v -> u; that is, u and v are reachable from each other. The algorithm employed in KLU for the symmetric permutation of a matrix to BTF form is based on Duff and Reid's [13, 14] algorithm, which provides an implementation of Tarjan's [15] algorithm for determining the strongly connected components of a directed graph. The algorithm has a time complexity of O(n + τ), where n is the order of the matrix and τ is the number of off-diagonal nonzeros in the matrix. The algorithm essentially consists of doing a depth first search from unvisited nodes in the graph. It uses a stack to keep track of the nodes being visited, and maintains a path of the nodes. When all edges in the path have been explored, it generates strongly connected components from the top of the stack. Duff's algorithm differs from the method proposed by Cormen, Leiserson, Rivest and Stein [16], who suggest doing a depth first search on G, computing G^T, and then running a depth first search on G^T, visiting vertices in decreasing order of their finish times from the first depth first search (the topological order from the first depth first search).
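The two-pass method of Cormen et al. just described can be sketched as follows (a minimal illustration; the adjacency-list representation and function name are assumptions, and Tarjan's one-pass algorithm used by KLU avoids building G^T entirely):

```python
def scc_two_pass(adj):
    """Strongly connected components by the two-pass (Kosaraju-style) method:
    DFS on G records finish order; DFS on the transpose G^T in decreasing
    finish time then yields one component per second-pass tree."""
    n = len(adj)
    order, seen = [], [False] * n

    def dfs(u, graph, out):
        seen[u] = True
        for v in graph[u]:
            if not seen[v]:
                dfs(v, graph, out)
        out.append(u)                        # record finish time implicitly

    for u in range(n):
        if not seen[u]:
            dfs(u, adj, order)

    radj = [[] for _ in range(n)]            # build G^T
    for u in range(n):
        for v in adj[u]:
            radj[v].append(u)

    seen = [False] * n
    comps = []
    for u in reversed(order):                # decreasing finish time
        if not seen[u]:
            comp = []
            dfs(u, radj, comp)
            comps.append(comp)
    return comps
```

Permuting the matrix so that each component's rows and columns are contiguous yields the BTF form.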


Figure 2-3: A symmetric pruning scenario

2.7 Symmetric Pruning

Eisenstat and Liu [17] proposed a method called symmetric pruning to exploit structural symmetry in cutting down the symbolic analysis time. The cost of the depth first search can be reduced by pruning unnecessary edges in the graph of L, G(L). The idea is to replace G(L) by a reduced graph of L. Any graph H can be used in place of G(L), provided that a path from i to j exists in H if and only if one exists in G(L). If A is symmetric, the symmetric reduction is just the elimination tree. The symmetric reduction is a subgraph of G(L); it has fewer edges than G(L) and is easy to compute by taking advantage of symmetry in the structure of the factors L and U. Even though the symmetric reduction removes edges, it still preserves the paths between vertices of the original graph. Figure 2-3 shows a symmetric pruning example. If l_ij ≠ 0 and u_ji ≠ 0, then we can prune the edges j -> s where s > i. The reason is that for any a_jk ≠ 0, a_sk will fill in from column j of L for s > k.


The just computed column i of L is used to prune earlier columns. Any future depth first search from vertex j will not visit vertex s, since s would already have been visited via i. Note that every column is pruned only once. KLU employs symmetric pruning to speed up the depth first search in the symbolic analysis stage.

2.8 Ordering

It is widely accepted practice to precede the factorization step of a sparse linear system by an ordering phase. The purpose of the ordering is to generate a permutation P that reduces the fill-in in the factorization phase of PAP^T. A fill-in is defined as a nonzero in a position (i, j) of the factor that was zero in the original matrix. In other words, we have a fill-in if L_ij ≠ 0 where A_ij = 0. The permuted matrix PAP^T created by the ordering incurs much less fill-in during factorization than the unpermuted matrix A. The ordering mechanism typically takes into account only the structure of the input matrix, not the numerical values stored in it. Partial pivoting during factorization changes the row permutation P, and hence could potentially increase the fill-in beyond what was estimated by the ordering scheme. We shall see more about pivoting in the following sections. If the input matrix A is unsymmetric, then a permutation of the matrix A + A^T can be used. Various minimum degree algorithms can be used for ordering. Popular ordering schemes include approximate minimum degree (AMD) [18, 19] and column approximate minimum degree (COLAMD) [20, 21], among others. COLAMD orders the matrix AA^T without forming it explicitly. After permuting an input matrix A into BTF form using the maximum transversal and BTF orderings, KLU attempts to factorize each of the diagonal blocks. It applies the fill reducing ordering algorithm on each block before factorizing it. KLU supports both approximate minimum degree and column approximate


minimum degree. Moreover, any given ordering algorithm can be plugged into KLU without much effort. Work is being done on integrating a nested dissection ordering strategy into KLU as well. Of the various ordering schemes, AMD gives the best results on circuit matrices. AMD finds a permutation P to reduce fill-in for the Cholesky factorization of PAP^T (or of P(A + A^T)P^T if A is unsymmetric). AMD assumes no numerical pivoting, and it attempts to reduce an optimistic estimate of fill-in. COLAMD is an unsymmetric ordering scheme that computes a column permutation Q to reduce fill-in for the Cholesky factorization of (AQ)^T AQ. COLAMD attempts to reduce a "pessimistic" estimate (upper bound) of fill-in. Nested dissection is another ordering scheme; it creates a permutation such that the input matrix is transformed into block diagonal form with vertex separators. This is a popular ordering scheme. However, it is unsuitable for circuit matrices when applied to the matrix as such; it can be used on the blocks generated by the BTF pre-ordering.

The idea behind a minimum degree algorithm is as follows. A structurally symmetric matrix A can be represented by an equivalent undirected graph G(V, E) with vertices corresponding to row/column indices. An edge i - j exists in G if A_ij ≠ 0. Consider Figure 2-4. If the matrix is factorized with vertex 1 as the pivot, then after the first Gaussian elimination step the matrix is transformed as in Figure 2-5. This first step of elimination can be considered equivalent to removing node 1 and all its edges from the graph, and adding edges to connect all the nodes adjacent to 1. In other words, the elimination has created a clique of the nodes adjacent to the eliminated node. Note that there are as many fill-ins in the reduced matrix as there are edges added in the clique formation. In the above example, we have


Figure 2-4: A symmetric matrix and its graph representation

Figure 2-5: The matrix and its graph representation after one step of Gaussian elimination


chosen the wrong node as pivot, since node 1 has the maximum degree. If instead we had chosen a node with minimum degree, say 3 or 5, as the pivot, there would have been zero fill-in after the elimination, since both 3 and 5 have degree 1. This is the key idea in a minimum degree algorithm: it generates a permutation such that a node of minimum degree is eliminated in each step of Gaussian elimination, thus ensuring minimal fill-in. The algorithm does not examine the numerical values in the node selection process. It could happen that during partial pivoting, a node other than the one suggested by the minimum degree algorithm must be chosen as pivot because of its numerical magnitude. That is exactly why the fill-in estimate produced by the ordering algorithm can be less than the fill-in experienced in the factorization phase.

2.9 Pivoting

Gaussian elimination fails when the diagonal element in the input matrix happens to be zero. Consider a simple 2×2 system,

[ 0     a12 ] [ x1 ]   [ b1 ]
[ a21   a22 ] [ x2 ] = [ b2 ]     (2-23)

When solving the above system, Gaussian elimination computes the multiplier a21/a11, multiplies row 1 by this multiplier and subtracts it from row 2, thus eliminating the coefficient a21 from the matrix. This step obviously fails, since a11 is zero. Now let us see a classical case where the diagonal element is nonzero but close to zero.

A = [ 0.0001   1 ]
    [ 1        1 ]     (2-24)

The multiplier is 1/0.0001 = 10^4. The factors L and U are


L = [ 1      0 ]     U = [ 0.0001   1     ]
    [ 10^4   1 ]         [ 0        -10^4 ]     (2-25)

The element u22 has the actual value 1 - 10^4. However, assuming four digit arithmetic, it is rounded off to -10^4. Note that the product of L and U is

L·U = [ 0.0001   1 ]
      [ 1        0 ]     (2-26)

which is different from the original matrix. The reason for this problem is that the computed multiplier is so large that, when the small element a22 with value 1 is added to it, the tiny value present in a22 is obscured. We can solve these problems with pivoting. In the above two examples, we could interchange rows 1 and 2 to resolve the problem. This mechanism of interchanging rows (and columns) and picking a large element as the diagonal, to avoid numerical failures or inaccuracies, is called pivoting. To pick a numerically large element as the pivot, we can either look at the elements in the current column, or look at the entire submatrix (across both rows and columns). The former is called partial pivoting and the latter complete pivoting. For dense matrices, partial pivoting adds O(n^2) comparisons to Gaussian elimination, and complete pivoting adds O(n^3) comparisons. Complete pivoting is expensive and hence generally avoided, except in special cases. KLU employs partial pivoting with diagonal preference: as long as the diagonal element is at least a constant threshold times the largest element in the column, the diagonal is chosen as the pivot. This constant threshold is called the pivot tolerance.

2.10 Scaling

The case where small elements in the matrix get obscured during the elimination process, and the accuracy of the results gets skewed because of numerical addition,


is not completely overcome by the pivoting process. Let us see an example of this case. Consider the 2×2 system

[ 10   10^5 ] [ x1 ]   [ 10^5 ]
[ 1    1    ] [ x2 ] = [ 2    ]     (2-27)

When we apply Gaussian elimination with partial pivoting to the above system, the entry a11 is the largest in the first column and hence continues to be the pivot. After the first step of elimination, assuming four digit arithmetic, we have

[ 10   10^5  ] [ x1 ]   [ 10^5  ]
[ 0    -10^4 ] [ x2 ] = [ -10^4 ]     (2-28)

The solution from the above elimination is x1 = 0, x2 = 1. However, the correct solution is close to x1 = 1, x2 = 1. If we divide each row of the matrix by the largest element in that row (and the corresponding element of the right hand side as well) prior to Gaussian elimination, we have

[ 10^-4   1 ] [ x1 ]   [ 1 ]
[ 1       1 ] [ x2 ] = [ 2 ]     (2-29)

Now if we apply partial pivoting, we have

[ 1       1 ] [ x1 ]   [ 2 ]
[ 10^-4   1 ] [ x2 ] = [ 1 ]     (2-30)

And after an elimination step, the result would be


[ 1   1          ] [ x1 ]   [ 2            ]
[ 0   1 - 10^-4  ] [ x2 ] = [ 1 - 2×10^-4  ]     (2-31)

which yields the correct solution x1 = 1, x2 = 1. This process of balancing out the numerical enormity or obscurity of each row or column is termed scaling. In the above example, we scaled with respect to the maximum value in each row, which is row scaling. Another variant is to scale with respect to the sum of the absolute values of all elements in a row. In column scaling, we scale with respect to the maximum value in a column, or the sum of the absolute values of all elements in a column. Row scaling can be considered equivalent to finding an invertible diagonal matrix D such that all the rows of D^-1 A have equally large numerical values. Once we have such a D, the solution of the original system Ax = b is equivalent to solving the system Ãx = b̃, where Ã = D^-1 A and b̃ = D^-1 b. Equilibration is another popular term for scaling. In KLU, the diagonal elements of the scaling matrix D are either the largest elements in the rows of the original matrix, or the sums of the absolute values of the elements in the rows. Scaling can also be turned off, if the simulation environment does not need it. Though scaling offers better numerical results when solving systems, it is not mandatory; its usage depends on the data values that constitute the system, and if the values are already balanced, scaling might not be necessary.

2.11 Growth Factor

The pivot growth factor is a key diagnostic estimate in determining the stability of Gaussian elimination, and stability of numerical algorithms is an important factor in determining the accuracy of the solution. The study of stability is done by a process


called roundoff error analysis, which comprises two subtypes: forward error analysis and backward error analysis. If the computed solution x̃ is close to the exact solution x, then the algorithm is said to be forward stable. If the algorithm computes an exact solution to a nearby problem, then the algorithm is said to be backward stable. Backward stability is the most widely used technique in studying the stability of systems. Often the data generated for solving systems have impurities in them, or are distorted by a small amount. Under such circumstances we want the algorithm to produce an exact solution to this nearby problem; hence the relevance of backward stability. The pivot growth factor is formally defined as

ρ = max_k ( max_{ij} |a^(k)_{ij}| ) / ( max_{ij} |a_{ij}| )     (2-32)

where a^(k)_{ij} is an entry in the reduced matrix A^(k) after the kth elimination step. From (2-32), we find that if the entries of the reduced matrix grow arbitrarily, we have a high growth factor. This arbitrary growth in turn leads to inaccuracies in the results. Consider the following 2×2 system.

[ 10^-4   1 ] [ x1 ]   [ 1 ]
[ 1       1 ] [ x2 ] = [ 2 ]     (2-33)

After one step of Gaussian elimination, assuming four digit arithmetic, we have the reduced system

[ 10^-4   1     ] [ x1 ]   [ 1     ]
[ 0       -10^4 ] [ x2 ] = [ -10^4 ]     (2-34)

Solving the system yields x1 = 0, x2 = 1, which is different from the actual solution x1 = 1, x2 = 1. The pivot growth factor of the above system is


ρ = max(1, 10^4) / 1 = 10^4. Thus a large pivot growth clearly indicates the inaccuracy in the result. Partial pivoting generally avoids large growth factors. In the above example, if we had applied partial pivoting, we would have got the correct results. But this is not assured, and there are cases where partial pivoting might not result in an acceptable growth factor. This necessitates estimating the growth factor as a diagnostic tool to detect cases where Gaussian elimination could be unstable. The pivot growth factor is usually calculated in terms of its reciprocal, to avoid numerical overflow problems when the value is very large. Equation (2-32) is hard to compute, since it involves calculating the maximum of the reduced matrix after every step of elimination. Other definitions of the reciprocal growth factor that are easy to compute are as follows:

1/ρ = min_j ( max_i |a_{ij}| / max_i |u_{ij}| )     (2-35)

1/ρ = ( max_{ij} |a_{ij}| ) / ( max_{ij} |u_{ij}| )     (2-36)

Equation (2-35) is the definition implemented in KLU, and it is column scaling invariant. It helps unmask a large pivot growth that could be totally masked by column scaling.

2.12 Condition Number

The growth factor is a key estimate in determining the stability of the algorithm. The condition number is a key estimate in determining the amenability, or conditioning, of a given problem. It is not guaranteed that a highly stable algorithm yields accurate results for every problem it can solve; the conditioning of the problem has a dominant effect on the accuracy of the solution. Note that while stability deals with the algorithm, conditioning deals with the problem itself. In practical applications like circuit simulation, the data of


a problem come from experimental observations. Typically such data have a factor of error, impurity, or noise associated with them. Roundoff errors and discretization errors also contribute impurities to the data. Conditioning of a problem deals with determining how the solution of the problem changes in the presence of such impurities. The preceding discussion shows that one often solves problems not with the original data but with perturbed data. The analysis of the effect of a perturbation of the problem on the solution is called perturbation analysis. It helps determine whether a given problem produces a small or a huge variation in the solution when perturbed. Let us see what we mean by well and ill conditioned problems. A problem is said to be ill conditioned if a small relative error in the data leads to a large relative error in the solution, irrespective of the algorithm employed. A problem is said to be well conditioned if a small relative error in the data does not lead to a large relative error in the solution. Accuracy of the computed solution is of primary importance in numerical analysis. The stability of the algorithm and the conditioning of the given problem are the two factors that directly determine accuracy. A highly stable algorithm, well armored with scaling, partial pivoting and other safeguards, cannot be guaranteed to yield an accurate solution to an ill-conditioned problem. A backward stable algorithm applied to a well-conditioned problem, however, should yield a solution close to the exact solution. This follows from the definitions of backward stability and well-conditioning: backward stability assures an exact solution to a nearby problem, and a well-conditioned problem assures that the computed solution to perturbed data is relatively close to the exact solution of the actual problem.


Mathematically, let X be some problem, and let X(d) be the solution to the problem for some input d. Let δd denote a small perturbation in the input d. Now, if the relative error in the solution, |X(d + δd) - X(d)| / |X(d)|, exceeds the relative error in the input, |δd| / |d|, then the problem is ill conditioned; otherwise it is well conditioned. The condition number is a measure of the conditioning of the problem: it shows whether a problem is well or ill conditioned. For linear system problems of the form Ax = b, the condition number is defined as

Cond(A) = ||A|| ||A^-1||     (2-37)

Equation (2-37) is arrived at by theory that deals with perturbations either in the input matrix A, or in the right hand side b, or in both. Equation (2-37) can be defined with respect to any norm, viz. the 1-, 2- or ∞-norm. The system Ax = b is said to be ill-conditioned if the condition number from (2-37) is quite large; otherwise it is said to be well-conditioned. A naive way to compute the condition number would be to compute the inverse of the matrix, compute the norms of the matrix and its inverse, and form their product. However, computing the inverse is at least three times as expensive as solving the linear system Ax = b and hence should be avoided. Hager [22] developed a method for estimating ||A^-1||_1 and the corresponding 1-norm condition number. Hager proposed an optimization approach for estimating ||A^-1||_1. The 1-norm of a matrix is formally defined as


||A||_1 = max_{x ≠ 0} ||Ax||_1 / ||x||_1     (2-38)

Hager's algorithm can be briefly described as follows. For A ∈ R^{n×n}, a convex function is defined as

F(x) = ||Ax||_1     (2-39)

over the convex set S = { x ∈ R^n : ||x||_1 ≤ 1 }. Then ||A||_1 is the global maximum of (2-39). The algorithm involves computing Ax and A^T x, that is, computing matrix-vector products. When we want to compute ||A^-1||_1, this involves computing A^-1 x and (A^-1)^T x, which is equivalent to solving Ax = b and A^T x = b. We can use KLU to efficiently solve these systems. Higham [23] presents refinements to Hager's algorithm and restricts the number of iterations to five. Higham further presents a simple device, taking the higher of the estimates from this device and from Hager's algorithm to ensure the estimate is large enough. The device involves solving the linear system Ax = b, where b_i = (-1)^{i+1} (1 + (i-1)/(n-1)), i = 1, 2, ..., n; the final estimate is chosen as the maximum of the estimate from Hager's algorithm and 2||x||_1 / (3n). KLU's condition number estimator is based on Higham's refinement of Hager's algorithm.

2.13 Depth First Search

As we discussed earlier, the nonzero pattern of the kth column of L is determined by the reachability of the row indices of the kth column of A in the graph of L.


The reachability is determined by a depth-first search traversal of the graph of L. The topological order for the elimination of variables when solving the lower triangular system Lx = b is also determined by the depth-first search traversal. A classical depth first search algorithm is recursive. One of the major problems in a recursive implementation of depth-first search is stack overflow. Each process is allocated a stack space upon execution; when there is a large number of recursive calls, the stack space is exhausted and the process terminates abruptly. This is a definite possibility in the context of our depth-first search algorithm when we have a dense column in a matrix of very high dimension. The solution to stack overflow caused by recursion is to replace recursion with iteration. With an iterative, non-recursive function, the entire depth first search happens in a single function stack frame. The iterative version uses an array of row indices called pstack. When descending to an adjacent node during the search, the row index of the next adjacent node is stored in pstack at the position (row/column index) corresponding to the current node. When the search returns to the current node, we know that we next need to descend into the node stored in pstack at the position corresponding to the current node. Using this extra O(n) memory, the iterative version completes the depth first search in a single function stack frame. This is an important improvement over the recursive version, since it avoids the stack overflow problem that would otherwise be a bottleneck when solving high dimension systems.

2.14 Memory Fragmentation

The data structures for L and U are the ones used to represent sparse matrices. These comprise three vectors:

1. Vector of column pointers
2. Vector of row indices


3. Vector of numerical values

Overall, six vectors are needed for the two matrices L and U. Of these, the two vectors of column pointers are of pre-known size, namely the size of a block. The remaining four vectors of row indices and numerical values depend on the fill-in estimated by AMD. However, AMD gives an optimistic estimate of fill-in; hence we need to dynamically grow the memory for these vectors during the factorization phase if we determine that the fill-in is higher than estimated. The partial pivoting strategy can alter the row ordering determined by AMD and hence is another source of fill-in beyond AMD's estimate. Dynamically growing these four vectors suffers from the problem of external memory fragmentation. In external fragmentation, free memory is scattered across the memory space: a request for more memory fails because no contiguous free region of the requested size is available, even though the scattered free areas together would have sufficed. In the context of our problem, a request to grow the four vectors can either fail because of external fragmentation, or succeed when there is enough free space available. When we reallocate or grow memory, there are two types of success cases. In the first case, called cheap reallocation, there is enough free memory abutting the vector: the memory occupied by the vector is simply extended, its end boundary moves, and its start boundary remains the same. In the second case, called costly reallocation, there is not enough free memory abutting the vector: fresh memory of the new size is allocated in another region, the contents are copied from the old location, and the old location is freed. With four vectors to grow, there is a failure case because of external fragmentation and a costly success case because of costly reallocation.
To reduce the chance of the failure case and avoid the costly success case, we have coalesced the four vectors into a single vector. This new data structure is byte aligned on a double boundary. For


every column of L and U, the vector contains the row indices and numerical values of L, followed by the row indices and numerical values of U. Multiple integer row indices are stored in a single double location. The actual number of integers that can be stored in a double location varies with the platform and is determined dynamically. The common technique of using an integer pointer to address a location aligned on a double boundary is employed to retrieve or save the row indices. In addition to this coalesced data structure containing the row indices and numerical values, two more length vectors of size n are needed to hold the length of each column of L and U. These length vectors are preallocated once and need not be grown dynamically. Some memory management schemes never do cheap reallocation; under such schemes, the new data structure serves only to reduce external fragmentation.

2.15 Complex Number Support

KLU supports complex matrices and complex right hand sides. KLU also supports solving the transpose system A^T x = b for real matrices and the conjugate transpose system A^H x = b for complex matrices. Initially it relied on the C99 language support for complex numbers. However, the C99 specification is not supported across operating systems; for example, earlier versions of Sun Solaris do not support C99. To avoid these compatibility issues, KLU no longer relies on C99 and has its own complex arithmetic implementation.

2.16 Parallelism in KLU

When solving a system Ax = b using KLU, we use the BTF pre-ordering to convert A into block upper triangular form. We apply AMD on each block and factorize the blocks one after the other, serially. Alternatively, nested dissection can be applied to each block. Nested dissection ordering converts a block to a doubly bordered block diagonal form.
A doubly bordered block diagonal form is similar to a block upper triangular form, but it also has nonzeros in the subdiagonal region. These


nonzeros form a horizontal strip resembling a border. Similarly, the nonzeros in the region above the diagonal form a corresponding vertical strip. The doubly bordered block diagonal form can be thought of as a separator tree.

Figure 2-6: A doubly bordered block diagonal matrix and its corresponding vertex separator tree

Factorization of the block then involves a post-order traversal of the separator tree. The nodes in the separator tree can be factorized in parallel. The factorization of a node additionally involves computing the Schur complement of its parent and of its ancestors in the tree. Once all the children of a node have updated its Schur complement, the node is ready to be factorized, and it in turn computes the Schur complement of its parent and its ancestors. The factorization and computation of Schur complements proceed in post-order fashion, and the process stops at the root. Parallelism can help reduce the factorization time, and it gains importance in the context of multiprocessor systems. Work is being done to enable parallelism in KLU.
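The Schur complement update at the heart of this scheme can be illustrated on a two-block system. This is a dense sketch under assumed names, not KLU's parallel code; in the separator-tree setting, each node would apply this elimination against its parent's and ancestors' blocks:

```python
import numpy as np

def schur_solve(A11, A12, A21, A22, b1, b2):
    """Eliminate the leading diagonal block and solve the trailing system
    through its Schur complement S = A22 - A21 A11^{-1} A12."""
    Y = np.linalg.solve(A11, A12)            # A11^{-1} A12
    y1 = np.linalg.solve(A11, b1)            # A11^{-1} b1
    S = A22 - A21 @ Y                        # Schur complement update
    x2 = np.linalg.solve(S, b2 - A21 @ y1)   # solve the reduced system
    x1 = y1 - Y @ x2                         # back-substitute into block 1
    return np.concatenate([x1, x2])
```

Independent subtree blocks touch disjoint data until the Schur update of a shared ancestor, which is what makes the per-node factorizations parallelizable.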

PAGE 45

CHAPTER 3
CIRCUIT SIMULATION: APPLICATION OF KLU

The KLU algorithm comprises the following steps:

1. Unsymmetric permutation to block upper triangular form. This consists of two steps: (a) an unsymmetric permutation to ensure a zero-free diagonal using maximum transversal, and (b) a symmetric permutation to block upper triangular form by finding the strongly connected components of the graph.

2. Symmetric permutation of each block (say A) using AMD on A + A^T, or an unsymmetric permutation of each block using COLAMD on AA^T. These permutations are fill-in reducing orderings on each block.

3. Factorization of each scaled block using Gilbert-Peierls' left-looking algorithm with partial pivoting.

4. Solve the system using block back-substitution, accounting for the off-diagonal entries. The solution is re-permuted to bring it back to the original order.

Let us first derive the final system that we need to solve, taking into account the different permutations, the scaling and the pivoting. The original system to solve is

Ax = b    (3-1)

Let R be the diagonal matrix with the scale factors for each row. Applying scaling, we have

RAx = Rb    (3-2)

Let P' and Q' be the row and column permutation matrices that combine the permutations for maximum transversal and the block upper triangular form. Applying these permutations together, we have

P'RAQ'Q'^T x = P'Rb    [Q'Q'^T = I, the identity matrix]    (3-3)

Let P and Q be the row and column permutation matrices that combine the P' and Q' mentioned above with the symmetric permutation produced by AMD and the partial pivoting row permutation produced by factorization. Now,

PRAQQ^T x = PRb, or (PRAQ)Q^T x = PRb    (3-4)

The matrix PRAQ consists of two parts, viz. the diagonal blocks that are factorized and the off-diagonal elements that are not factorized. Writing

PRAQ = LU + F

where LU represents the factors of all the blocks collectively and F represents the entire off-diagonal region, equation (3-4) becomes

(LU + F)Q^T x = PRb    (3-5)

x = Q(LU + F)^{-1}(PRb)    (3-6)

Equation (3-6) consists of two steps: a block back-substitution, i.e., computing (LU + F)^{-1}(PRb), followed by applying the column permutation Q. The block back-substitution in (LU + F)^{-1}(PRb) looks cryptic and can be better explained as follows. Consider a simple 3-by-3 block system

[ L11*U11   F12       F13     ] [ X1 ]   [ B1 ]
[ 0         L22*U22   F23     ] [ X2 ] = [ B2 ]    (3-7)
[ 0         0         L33*U33 ] [ X3 ]   [ B3 ]

The equations corresponding to the above system are:

L11*U11*X1 + F12*X2 + F13*X3 = B1    (3-8)
L22*U22*X2 + F23*X3 = B2             (3-9)
L33*U33*X3 = B3                      (3-10)

In block back-substitution, we first solve (3-10) for X3 and then eliminate X3 from (3-9) and (3-8) using the off-diagonal entries. Next, we solve (3-9) for X2 and eliminate X2 from (3-8). Finally, we solve (3-8) for X1.

3.1 Characteristics of Circuit Matrices

Circuit matrices exhibit certain unique characteristics that make KLU particularly relevant to them. They are very sparse; because of this high sparsity, BLAS kernels are not applicable. Circuit matrices often have a few dense rows/columns that originate from voltage or current sources. These dense rows/columns are effectively removed by the BTF permutation. Circuit matrices are unsymmetric, but their nonzero pattern is roughly symmetric. They are easily permutable to block upper triangular form. Besides, they have a zero-free or nearly zero-free diagonal. Another peculiar feature of circuit matrices is that the nonzero pattern of each block, after permutation to block upper triangular form, is more symmetric than that of the original matrix. Typical ordering strategies cause high fill-in when applied to the original matrix, whereas when applied to the blocks, they lead to less fill-in.
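Returning to equations (3-8) through (3-10): with 1-by-1 blocks standing in for the factored diagonal blocks L_kk U_kk (so each is a scalar a_kk), the block back-substitution can be sketched as follows. This is a toy illustration of the elimination order, not KLU's implementation.

```c
#include <assert.h>
#include <math.h>

/* Block back-substitution for the 3-block system (3-8)-(3-10), scalar case:
 * solve the last block first, then eliminate its solution from the earlier
 * equations through the off-diagonal entries f_ij. */
static void block_back_substitute(double a[3],        /* diagonal "blocks" */
                                  double f12, double f13, double f23,
                                  double b[3] /* in: rhs, out: solution */) {
    b[2] /= a[2];                 /* solve (3-10) for X3 */
    b[1] -= f23 * b[2];           /* eliminate X3 from (3-9) */
    b[0] -= f13 * b[2];           /* eliminate X3 from (3-8) */
    b[1] /= a[1];                 /* solve (3-9) for X2 */
    b[0] -= f12 * b[1];           /* eliminate X2 from (3-8) */
    b[0] /= a[0];                 /* solve (3-8) for X1 */
}

static int demo_ok(void) {
    /* system: 2*x1 + x2 + x3 = 6 ; 3*x2 + x3 = 5 ; 4*x3 = 4 */
    double a[3] = {2, 3, 4}, b[3] = {6, 5, 4};
    block_back_substitute(a, 1, 1, 1, b);
    /* x3 = 1, x2 = (5-1)/3 = 4/3, x1 = (6-1-4/3)/2 = 11/6 */
    return fabs(b[2] - 1.0) < 1e-12 &&
           fabs(b[1] - 4.0 / 3.0) < 1e-12 &&
           fabs(b[0] - 11.0 / 6.0) < 1e-12;
}
```

Note that no subdiagonal blocks ever enter the elimination, which is exactly why the off-diagonal entries F cause no fill-in.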

The efficiency of the permutation to block upper triangular form shows in the fact that the entire subdiagonal region of the matrix requires zero work, and the off-diagonal elements do not cause any fill-in since they are not factorized.

3.2 Linear Systems in Circuit Simulation

The linear systems in the circuit simulation process originate from solving large systems of nonlinear equations using Newton's method and from integrating large stiff systems of ordinary differential equations. These linear systems consist of the coefficient matrix A, the unknowns vector x and the right hand side b. During the course of simulation, the matrix A retains the same nonzero pattern (it is structurally unchanged) and only undergoes changes in numerical values. Thus the initial analysis phase (permutation to ensure a zero-free diagonal, block triangular form and minimum degree ordering on the blocks) and the factorization phase (which involves symbolic analysis, partial pivoting and symmetric pruning) can be restricted to the initial system alone. Subsequent systems A'x = b, where only the numerical values of A' differ from A, can be solved using a mechanism called refactorization. Refactorization simply means using the same row and column permutations (comprising the entire analysis phase and partial pivoting) computed for the initial system to solve the subsequent systems that have changes only in numerical values. Refactorization substantially reduces run time, since the analysis time and the factorization time spent on symbolic analysis, partial pivoting and pruning are avoided. The nonzero patterns of the factors L and U are the same as for the initial system; only numerical factorization using the pre-computed nonzero pattern and partial pivoting order is required. The solve step follows the factorization/refactorization step. KLU accommodates solving multiple right hand sides in a single solve step. Up to four right hand sides can be solved in a single step.
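As a minimal illustration of the refactorization idea, consider a dense 2-by-2 system standing in for the sparse blocks: the pivot ordering chosen while factorizing the first matrix is reused verbatim for a second matrix with the same pattern, skipping the pivot search (just as KLU's refactorization skips all symbolic work). The function names are mine, not KLU's API.

```c
#include <assert.h>
#include <math.h>

/* Factor PA = LU for a 2x2 A.  If choose_pivot is set, pick the pivot row
 * by partial pivoting; otherwise reuse the ordering already in perm
 * (the "refactorization" path: numerical work only). */
static void factor2(double A[2][2], int perm[2], int choose_pivot,
                    double *l21, double U[2][2]) {
    if (choose_pivot) {
        perm[0] = fabs(A[1][0]) > fabs(A[0][0]) ? 1 : 0;
        perm[1] = 1 - perm[0];
    }
    U[0][0] = A[perm[0]][0]; U[0][1] = A[perm[0]][1];
    *l21 = A[perm[1]][0] / U[0][0];
    U[1][1] = A[perm[1]][1] - *l21 * U[0][1];
}

static void solve2(int perm[2], double l21, double U[2][2],
                   double b[2], double x[2]) {
    double y0 = b[perm[0]], y1 = b[perm[1]] - l21 * y0;  /* forward solve */
    x[1] = y1 / U[1][1];
    x[0] = (y0 - U[0][1] * x[1]) / U[0][0];              /* back solve */
}

static int refactor_ok(void) {
    double A[2][2] = {{1, 2}, {3, 4}},      /* initial matrix */
           A2[2][2] = {{2, 1}, {4, 3}};     /* same pattern, new values */
    double U[2][2], l21, x[2], b[2] = {4, 10};
    int perm[2];
    factor2(A, perm, 1, &l21, U);           /* full factorization */
    factor2(A2, perm, 0, &l21, U);          /* refactor: reuse pivot order */
    solve2(perm, l21, U, b, x);
    return fabs(2 * x[0] + x[1] - 4) < 1e-12 &&   /* check A2 x = b */
           fabs(4 * x[0] + 3 * x[1] - 10) < 1e-12;
}
```

The reused pivot order remains numerically acceptable as long as the new values do not change which entries are large, which is typically the case across Newton iterations and time steps.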

3.3 Performance Benchmarks

During my internship at a circuit simulation company, I did performance benchmarking of KLU vs. SuperLU in the simulation environment. The benchmarks were run on a representative set of examples, and the results are tabulated below (the size of the matrix created in simulation is shown in parentheses).

Table 3-1: Comparison between KLU and SuperLU on overall time and fill-in

                     Overall time                  Nonzeros(L+U)
Netlist              KLU       SuperLU   Speedup   KLU      SuperLU
Problem1 (301)       1.67      1.24      0.74      1808     1968
Problem2 (1879)      734.96    688.52    0.94      13594    13770
Problem3 (2076)      56.46     53.38     0.95      16403    16551
Problem4 (7598)      89.63     81.85     0.91      52056    54997
Problem5 (745)       18.13     16.84     0.93      4156     4231
Problem6 (1041)      1336.50   1317.30   0.99      13198    13995
Problem7 (33)        0.40      0.32      0.80      157      176
Problem8 (10)        4.46      1.570     0.35      40       41
Problem9 (180)       222.26    202.29    0.91      1845     1922
Problem10 (6833)     6222.20   6410.40   1.03      56322    58651
Problem11 (1960)     181.78    179.50    0.99      13527    13963
Problem12 (200004)   6.25      8.47      1.35      500011   600011
Problem13 (20004)    0.47      0.57      1.22      50011    60011
Problem14 (40004)    0.97      1.31      1.35      100011   120011
Problem15 (100000)   1.76      2.08      1.18      299998   499932
Problem16 (7602)     217.80    255.88    1.17      156311   184362
Problem17 (10922)    671.10    770.58    1.15      267237   299937
Problem18 (14842)    1017.00   1238.10   1.22      326811   425661
Problem19 (19362)    1099.00   1284.40   1.17      550409   581277
Problem20 (24482)    3029.00   3116.90   1.03      684139   788047
Problem21 (30202)    2904.00   3507.40   1.21      933131   1049463

The circuits Problem16-Problem21 are TFT LCD arrays, similar to memory circuits. These examples were run at least twice with each algorithm (KLU or SuperLU) to get consistent results. The results are tabulated in tables 3-1, 3-2 and 3-3. The "overall time" in table 3-1 comprises the analysis, factorization and solve time.

Table 3-2: Comparison between KLU and SuperLU on factor time and solve time

                     Factor time per iter.   Factor    Solve time per iter.   Solve
Netlist              KLU        SuperLU      speedup   KLU        SuperLU     speedup
Problem1 (301)       0.000067   0.000084     1.26      0.000020   0.000019    0.92
Problem2 (1879)      0.000409   0.000377     0.92      0.000162   0.000100    0.61
Problem3 (2076)      0.000352   0.000317     0.90      0.000122   0.000083    0.68
Problem4 (7598)      0.001336   0.001318     0.99      0.000677   0.000326    0.48
Problem5 (745)       0.000083   0.000063     0.76      0.000035   0.000022    0.62
Problem6 (1041)      0.000321   0.000406     1.26      0.000079   0.000075    0.95
Problem7 (33)        0.000004   0.000004     0.96      0.000003   0.000002    0.73
Problem8 (10)        0.000001   0.000001     0.89      0.000001   0.000001    0.80
Problem9 (180)       0.000036   0.000042     1.16      0.000014   0.000011    0.76
Problem10 (6833)     0.001556   0.001530     0.98      0.000674   0.000365    0.54
Problem11 (1960)     0.000663   0.000753     1.14      0.000136   0.000122    0.90
Problem12 (200004)   0.103900   0.345500     3.33      0.030640   0.041220    1.35
Problem13 (20004)    0.005672   0.020110     3.55      0.001633   0.002735    1.67
Problem14 (40004)    0.014430   0.056080     3.89      0.004806   0.006864    1.43
Problem15 (100000)   0.168700   0.283700     1.68      0.018600   0.033610    1.81
Problem16 (7602)     0.009996   0.017620     1.76      0.001654   0.001439    0.87
Problem17 (10922)    0.018380   0.030010     1.63      0.002542   0.001783    0.70
Problem18 (14842)    0.024020   0.046130     1.92      0.003187   0.002492    0.78
Problem19 (19362)    0.054730   0.080280     1.47      0.005321   0.003620    0.68
Problem20 (24482)    0.121400   0.122600     1.01      0.006009   0.004705    0.78
Problem21 (30202)    0.124000   0.188700     1.52      0.009268   0.006778    0.73

3.4 Analyses and Findings

The following are my inferences based on the results. Most of the matrices in these experiments are small, of the order of a few thousand. Fill-in is much better with KLU: the BTF ordering combined with the AMD ordering on each of the blocks does a good job of reducing the fill-in count. The improvement in fill-in averages around 6% for many examples.

Table 3-3: Ordering results using BTF+AMD in KLU on circuit matrices

                     Nonzeros   Nonzeros   No. of    Max block   Nonzeros
Netlist    Size      in A       in L+U     blocks    size        off-diagonal
Problem1   301       1484       1808       7         295         89
Problem2   1879      12926      13594      19        1861        4307
Problem3   2076      15821      16403      13        2064        6689
Problem4   7598      48922      52056      13        7586        19018
Problem5   745       3966       4156       128       426         1719
Problem6   1041      9654       13198      67        975         2608
Problem7   33        153        157        7         27          50
Problem8   10        39         40         5         6           16
Problem9   180       1503       1845       19        162         661
Problem10  6833      43250      56322      507       6282        12594
Problem11  1960      11187      13527      58        1715        2959
Problem12  200004    500011     500011     200003    2           300005
Problem13  20004     50011      50011      20003     2           30005
Problem14  40004     100011     100011     40003     2           60005
Problem15  100000    299998     299998     1         100000      0
Problem16  7602      32653      156311     103       7500        251
Problem17  10922     46983      267237     123       10800       301
Problem18  14842     63913      326811     143       14700       351
Problem19  19362     83443      550409     163       19200       401
Problem20  24482     105573     684139     183       24300       451
Problem21  30202     130303     933131     203       30000       501

There is no fill-in in the Problem12, Problem13, Problem14 and Problem15 netlists with KLU. This is quite significant, in that there is no memory overhead in these examples. In the case of circuits Problem16-Problem21, the gain in fill-in with KLU ranges from 6% in the Problem19 example to 24% in the Problem18 example.

The gain in fill-in translates into faster factorization, because fewer nonzeros imply less work. The factorization time is thus expected to be low, and this turns out to be true in most of the cases (a factorization speedup of 1.6x in the Problem16-Problem21 examples and 3x in the Problem12-Problem14 examples). For some cases, like Problem2 and Problem10, the factorization time remains the same for both KLU and SuperLU.

The solve phase turns out to be slower in KLU; the off-diagonal nonzero handling probably accounts for the extra time spent there. One way of reducing the solve overhead in KLU would be to solve multiple right hand sides at the same time: in a single solve iteration, 4 equations are solved. On the whole, the overall time speedup is 1.2 for the Problem16-Problem21 and Problem12-Problem14 examples. For the others, the overall time is almost the same between the two algorithms. BTF is not able to find many blocks in most of the matrices; there tends to be a single large block, and the remaining blocks are singletons. But the AMD ordering does a good job in getting the fill-in count reduced. The off-diagonal nonzero count is not high.

3.5 Alternate Ordering Experiments

Different ordering strategies were employed to analyze the fill-in behaviour. The statistics using the different ordering schemes are listed in table 3-4. COLAMD is not listed in the table; it typically gives a poor ordering and causes more fill-in than AMD, MMD and BTF+AMD. AMD alone gives relatively higher fill-in compared to BTF+AMD in most of the matrices. However, AMD alone gives mixed results in comparison with MMD: it matches or outperforms MMD in fill-in on the Problem12-Problem14 and Problem16-Problem21 matrices, but it gives

Table 3-4: Comparison of ordering results produced by BTF+AMD, AMD and MMD

                     Nonzeros   Fill-in     Fill-in   Fill-in
Netlist              in A       BTF+AMD     AMD       MMD
Problem1 (301)       1484       1808        1928      1968
Problem2 (1879)      12926      13594       13857     13770
Problem3 (2076)      15821      16403       18041     16551
Problem4 (7598)      48922      52056       57975     54997
Problem5 (745)       3966       4156        5562      4231
Problem6 (1041)      9654       13198       14020     13995
Problem7 (33)        153        157         178       176
Problem8 (10)        39         40          41        41
Problem9 (180)       1503       1845        1968      1922
Problem10 (6833)     43250      56322       133739    58651
Problem11 (1960)     11187      13527       14800     13963
Problem12 (200004)   500011     500011      600011    600011
Problem13 (20004)    50011      50011       60011     60011
Problem14 (40004)    100011     100011      120011    120011
Problem15 (100000)   299998     299998      299998    499932
Problem16 (7602)     32653      156311      165264    184362
Problem17 (10922)    46983      267237      255228    299937
Problem18 (14842)    63913      326811      387668    425661
Problem19 (19362)    83443      550409      451397    581277
Problem20 (24482)    105573     684139      718891    788047
Problem21 (30202)    130303     933131      839226    1049463

poor fill-in for the rest of the circuits when compared with MMD. AMD alone beats BTF+AMD in fill-in for some of the examples, viz. Problem17, Problem19 and Problem21. Overall, to sum up, BTF+AMD is the best ordering strategy to use.

3.6 Experiments with the UF Sparse Matrix Collection

There are a number of circuit matrices in the UF sparse matrix collection. Different experiments were done with these matrices as well, on different parameters:
1. ordering quality with different ordering schemes in KLU
2. timing of the different phases of KLU
3. ordering quality among KLU, UMFPACK and Gilbert-Peierls' algorithm
4. performance comparison between KLU and UMFPACK.

UMFPACK is an unsymmetric multifrontal sparse solver that uses an unsymmetric COLAMD ordering or an AMD ordering, automatically selecting which method to use based on the matrix characteristics. Gilbert-Peierls' algorithm is available in MATLAB as an LU factorization scheme. These experiments were done on Mandrake 10.0 Linux, on an Intel Pentium M processor with a clock frequency of 1400 MHz and 768 MB of RAM.

3.6.1 Different Ordering Schemes in KLU

There are six different ordering schemes possible in KLU. The three fill reducing schemes are AMD, COLAMD and a user specified ordering. These three fill reducing orderings can be combined with a BTF preordering or no preordering, hence six different schemes. However, in this experiment the user specified ordering is not considered, which leaves us with four different schemes. Table 3-5 lists the fill (the number of nonzeros in L+U plus the number of off-diagonal elements) for each of these ordering schemes. From table 3-5 we find that BTF+AMD gives consistently better fill-in across different circuit matrices. However, there are observable aberrations. For

example, with the circuit Bomhof/circuit_2, AMD and COLAMD give better fill-in than BTF+AMD. These results confirm again that BTF+AMD is the best ordering scheme to use for circuit matrices.

3.6.2 Timing the Different Phases in KLU

These experiments show the time spent in the different phases of the algorithm. A BTF pre-ordering followed by an AMD fill-reducing ordering is employed. As mentioned earlier, there are four different phases:

1. Analysis phase: comprises the pre-ordering and the fill reducing ordering.

2. Factor phase: comprises the factorization part of the algorithm, including the symbolic analysis, partial pivoting and symmetric pruning steps.

3. Refactor phase: comprises the factorization that reuses the already computed partial pivoting permutation and the nonzero pattern of the L and U matrices. There is no symbolic analysis, partial pivoting or symmetric pruning in the refactor phase.

4. Solve phase: comprises the solve part of the algorithm. Solving a single right hand side was experimented with.

When given a set of matrices with the same nonzero pattern, the analysis and factor phases are done only once. The refactor phase is then repeated for the remaining matrices, and the solve phase is repeated as many times as there are systems to solve. Table 3-6 contains the timing results. The analysis phase consumes most of the time spent in the algorithm. Refactor time is typically 3 or 4 times smaller than factor time, and 8 times smaller than analysis time plus factor time put together. The solve phase consumes the least fraction of the time spent.

3.6.3 Ordering Quality among KLU, UMFPACK and Gilbert-Peierls

Table 3-7 compares the ordering quality among KLU using BTF+AMD, UMFPACK using COLAMD or AMD, and Gilbert-Peierls' using AMD. We can

Table 3-5: Fill-in with four different schemes in KLU

                                 Nonzeros   BTF +      BTF +
Matrix                           in A       AMD        COLAMD     AMD        COLAMD
Sandia/adder_dcop_01 (1813)      11156      13525      13895      18848      21799
Sandia/adder_trans_01 (1814)     14579      20769      36711      24365      119519
Sandia/fpga_dcop_01 (1220)       5892       7891       8118       8869       12016
Sandia/fpga_trans_01 (1220)      7382       10152      12776      10669      21051
Sandia/init_adder1 (1813)        11156      13525      13895      18848      21799
Sandia/mult_dcop_01 (25187)      193276     226673     228301     2176328    1460322
Sandia/oscil_dcop_01 (430)       1544       2934       3086       3078       3295
Sandia/oscil_trans_01 (430)      1614       2842       3247       2897       3259
Bomhof/circuit_1 (2624)          35823      44879      775363     44815      775720
Bomhof/circuit_2 (4510)          21199      40434      89315      36197      36196
Bomhof/circuit_3 (12127)         48137      86718      98911      245336     744245
Grund/b2_ss (1089)               3895       9994       9212       26971      9334
Grund/b_dyn (1089)               4144       11806      10597      33057      10544
Grund/bayer02 (13935)            63307      889914     245259     1365142    307979
Grund/d_dyn (87)                 230        481        461        619        494
Grund/d_ss (53)                  144        302        292        382        298
Grund/meg1 (2904)                58142      232042     184471     1526780    378904
Grund/meg4 (5860)                25258      42398      310126     43250      328144
Grund/poli (4008)                8188       12200      12208      12238      12453
Grund/poli_large (15575)         33033      48718      48817      49806      51970
Hamm/add20 (2395)                13151      19554      34636      19554      34636
Hamm/add32 (4960)                19848      28754      36030      28754      36030
Hamm/bcircuit (68902)            375558     1033240    1692668    1033240    1692668
Hamm/hcircuit (105676)           513072     731634     2623852    736080     4425310
Hamm/memplus (17758)             99147      137030     3282586    137030     3282586
Hamm/scircuit (170998)           958936     2481122    6410286    2481832    6427526
Rajat/rajat03 (7602)             32653      163913     235111     172666     236938
Rajat/rajat04 (1041)             8725       12863      80518      18618      158000
Rajat/rajat05 (301)              1250       1926       2053       2101       3131
Rajat/rajat11 (135)              665        890        978        944        1129
Rajat/rajat12 (1879)             12818      15308      273667     15571      128317
Rajat/rajat13 (7598)             48762      58856      90368      64791      5234287
Rajat/rajat14 (180)              1475       1994       2249       2105       2345

Table 3-6: Time (in seconds) spent in the different phases in KLU

                                 Analysis   Factor     Refactor   Solve
Matrix                           time       time       time       time
Sandia/adder_dcop_01 (1813)      0.0028     0.0028     0.0007     0.0003
Sandia/adder_trans_01 (1814)     0.0045     0.0038     0.0013     0.0003
Sandia/fpga_dcop_01 (1220)       0.0018     0.0015     0.0004     0.0002
Sandia/fpga_trans_01 (1220)      0.0022     0.0017     0.0005     0.0002
Sandia/init_adder1 (1813)        0.0028     0.0028     0.0007     0.0003
Sandia/mult_dcop_01 (25187)      0.2922     0.0522     0.0196     0.0069
Sandia/oscil_dcop_01 (430)       0.0008     0.0006     0.0002     0.0001
Sandia/oscil_trans_01 (430)      0.0008     0.0006     0.0002     0.0001
Bomhof/circuit_1 (2624)          0.0098     0.0085     0.0053     0.0006
Bomhof/circuit_2 (4510)          0.0082     0.0064     0.0034     0.0006
Bomhof/circuit_3 (12127)         0.0231     0.0174     0.0056     0.0020
Grund/b2_ss (1089)               0.0031     0.0018     0.0005     0.0001
Grund/b_dyn (1089)               0.0033     0.0021     0.0007     0.0002
Grund/bayer02 (13935)            0.0584     0.2541     0.2070     0.0090
Grund/d_dyn (87)                 0.0002     0.0001     0.0000     0.0000
Grund/d_ss (53)                  0.0001     0.0001     0.0000     0.0000
Grund/meg1 (2904)                0.0178     0.0853     0.0590     0.0033
Grund/meg4 (5860)                0.0157     0.0094     0.0028     0.0009
Grund/poli (4008)                0.0017     0.0010     0.0004     0.0003
Grund/poli_large (15575)         0.0064     0.0045     0.0018     0.0014
Hamm/add20 (2395)                0.0056     0.0044     0.0014     0.0003
Hamm/add32 (4960)                0.0084     0.0074     0.0019     0.0006
Hamm/bcircuit (68902)            0.3120     0.2318     0.1011     0.0257
Hamm/hcircuit (105676)           0.2553     0.1920     0.0658     0.0235
Hamm/memplus (17758)             0.0576     0.0358     0.0157     0.0036
Hamm/scircuit (170998)           0.8491     0.6364     0.3311     0.0622
Rajat/rajat03 (7602)             0.0152     0.0440     0.0306     0.0034
Rajat/rajat04 (1041)             0.0029     0.0023     0.0008     0.0002
Rajat/rajat05 (301)              0.0005     0.0005     0.0001     0.0000
Rajat/rajat11 (135)              0.0002     0.0002     0.0001     0.0000
Rajat/rajat12 (1879)             0.0038     0.0027     0.0008     0.0002
Rajat/rajat13 (7598)             0.0122     0.0105     0.0033     0.0011
Rajat/rajat14 (180)              0.0004     0.0003     0.0001     0.0000

infer from the results that KLU produces better orderings than UMFPACK and Gilbert-Peierls' algorithm. For KLU, the following MATLAB code determines the fill:

opts = [0.1 1.2 1.2 10 1 0 0 0] ;
[x info] = klus(A, b, opts, []) ;
fill = info(31) + info(32) + info(8) ;

For UMFPACK, the snippet is

[L U P Q] = lu(A) ;
fill = nnz(L) + nnz(U) ;

For Gilbert-Peierls', the code is

Q = amd(A) ;
[L U P] = lu(A(Q,Q), 0.1) ;
fill = nnz(L) + nnz(U) ;

3.6.4 Performance Comparison between KLU and UMFPACK

This experiment compares the total time spent in the analysis, factor and solve phases by the two algorithms. The results are listed in table 3-8. KLU outperforms UMFPACK in time. For KLU, the following snippet in MATLAB is used:

tic ; [x info] = klus(A, b, opts) ; t1 = toc ;

For UMFPACK, the following code in MATLAB is used to find the total time:

tic ; x = A \ b ; t2 = toc ;

Table 3-7: Fill-in among KLU, UMFPACK and Gilbert-Peierls

Matrix                           nnz        KLU        UMFPACK    Gilbert-Peierls
Sandia/adder_dcop_01 (1813)      11156      13525      14658      18825
Sandia/adder_trans_01 (1814)     14579      20769      20769      24365
Sandia/fpga_dcop_01 (1220)       5892       7891       8106       8869
Sandia/fpga_trans_01 (1220)      7382       10152      10152      10669
Sandia/init_adder1 (1813)        11156      13525      14658      18825
Sandia/mult_dcop_01 (25187)      193276     226673     556746     1300902
Sandia/oscil_dcop_01 (430)       1544       2934       2852       3198
Sandia/oscil_trans_01 (430)      1614       2842       3069       2897
Bomhof/circuit_1 (2624)          35823      44879      44879      44815
Bomhof/circuit_2 (4510)          21199      40434      35107      38618
Bomhof/circuit_3 (12127)         48137      86718      84117      245323
Grund/b2_ss (1089)               3895       9994       8309       22444
Grund/b_dyn (1089)               4144       11806      9642       41092
Grund/bayer02 (13935)            63307      889914     259329     973093
Grund/d_dyn (87)                 230        481        442        523
Grund/d_ss (53)                  144        302        268        395
Grund/meg1 (2904)                58142      232042     151740     1212904
Grund/meg4 (5860)                25258      42398      42398      43250
Grund/poli (4008)                8188       12200      12200      12239
Grund/poli_large (15575)         33033      48718      48745      49803
Hamm/add20 (2395)                13151      19554      19554      19554
Hamm/add32 (4960)                19848      28754      28754      28754
Hamm/bcircuit (68902)            375558     1033240    1033240    1033240
Hamm/hcircuit (105676)           513072     731634     730906     736080
Hamm/memplus (17758)             99147      137030     137030     137030
Hamm/scircuit (170998)           958936     2481122    2481122    2481832
Rajat/rajat03 (7602)             32653      163913     163913     172666
Rajat/rajat04 (1041)             8725       12863      12860      18613
Rajat/rajat05 (301)              1250       1926       1944       2101
Rajat/rajat11 (135)              665        890        890        944
Rajat/rajat12 (1879)             12818      15308      15308      15571
Rajat/rajat13 (7598)             48762      58856      58856      64791
Rajat/rajat14 (180)              1475       1994       1994       2105

Table 3-8: Performance comparison between KLU and UMFPACK

Matrix                           KLU        UMFPACK
Sandia/adder_dcop_01 (1813)      0.0116     0.0344
Sandia/adder_trans_01 (1814)     0.0112     0.0401
Sandia/fpga_dcop_01 (1220)       0.0050     0.0257
Sandia/fpga_trans_01 (1220)      0.0054     0.0203
Sandia/init_adder1 (1813)        0.0109     0.0323
Sandia/mult_dcop_01 (25187)      1.2383     1.1615
Sandia/oscil_dcop_01 (430)       0.0022     0.0070
Sandia/oscil_trans_01 (430)      0.0019     0.0074
Bomhof/circuit_1 (2624)          0.0232     0.1223
Bomhof/circuit_2 (4510)          0.0201     0.0522
Bomhof/circuit_3 (12127)         0.0579     0.1713
Grund/b2_ss (1089)               0.0066     0.0168
Grund/b_dyn (1089)               0.0072     0.0175
Grund/bayer02 (13935)            0.6089     0.3565
Grund/d_dyn (87)                 0.0005     0.0014
Grund/d_ss (53)                  0.0003     0.0010
Grund/meg1 (2904)                0.1326     0.1018
Grund/meg4 (5860)                0.0571     0.1111
Grund/poli (4008)                0.0050     0.0121
Grund/poli_large (15575)         0.0208     0.0497
Hamm/add20 (2395)                0.0123     0.0506
Hamm/add32 (4960)                0.0201     0.0738
Hamm/bcircuit (68902)            0.7213     1.8823
Hamm/hcircuit (105676)           0.7313     2.7764
Hamm/memplus (17758)             0.1232     0.8140
Hamm/scircuit (170998)           1.9812     7.3448
Rajat/rajat03 (7602)             0.0793     0.1883
Rajat/rajat04 (1041)             0.0068     0.0284
Rajat/rajat05 (301)              0.0014     0.0046
Rajat/rajat11 (135)              0.0007     0.0023
Rajat/rajat12 (1879)             0.0087     0.0355
Rajat/rajat13 (7598)             0.0330     0.1229
Rajat/rajat14 (180)              0.0010     0.0032

CHAPTER 4
USER GUIDE FOR KLU

4.1 The Primary KLU Structures

4.1.1 klu_common

This is a control structure that contains both the input control parameters for KLU and the output statistics computed by the algorithm. It is a mandatory parameter for all KLU routines. Its contents are listed as follows:

tol
Type: double. Input parameter: pivot tolerance for diagonal preference. Default value is 0.001.

growth
Type: double. Input parameter: reallocation growth size of the LU factors. Default value is 1.2.

initmem_amd
Type: double. Input parameter: initial memory size with AMD. Initial memory size = initmem_amd * nnz(L) + n. Default value is 1.2.

initmem
Type: double. Input parameter: initial memory size with COLAMD. Initial memory size = initmem * nnz(A) + n. Default value is 10.

btf
Type: int.

Input parameter: whether or not to use the BTF pre-ordering. Default value is 1 (use BTF).

ordering
Type: int. Input parameter: which fill reducing ordering to use. 0 = AMD, 1 = COLAMD, 2 = user P and Q. Default is 0.

scale
Type: int. Input parameter: which scaling strategy to use. 0 = none, 1 = sum, 2 = max. Default is 0.

singular_proc
Type: int. Input parameter: whether to stop on singularity or continue. 0 = stop, 1 = continue. Default is 0.

status
Type: int. Output parameter that indicates the result of the KLU function call. The value is KLU_OK (0) if OK and < 0 if an error occurred. The error values are KLU_SINGULAR (-1), KLU_OUT_OF_MEMORY (-2) and KLU_INVALID (-3).

nrealloc
Type: int. Output parameter. Contains the number of reallocations of L and U.

singular_col
Type: int. Output parameter. Contains the column number of the singular column, if any.

noffdiag
Type: int.

Output parameter. Contains the number of off-diagonal pivots chosen.

4.1.2 klu_symbolic

This structure encapsulates the information related to the analysis phase. The members of the structure are listed as follows:

symmetry
Type: double. Contains the symmetry of the largest block.

est_flops
Type: double. Contains the estimated factorization flop count.

lnz
Type: double. Contains the estimated nonzeros in L, including the diagonal.

unz
Type: double. Contains the estimated nonzeros in U, including the diagonal.

Lnz
Type: double. Array of size n, but only Lnz[0..nblocks-1] is used. Contains the estimated number of nonzeros in each block.

n
Type: int. Contains the size of the input matrix A, where A is n-by-n.

nz
Type: int. Contains the number of entries in the input matrix.

P

Type: int. Array of size n. Contains the row permutation from the ordering.

Q
Type: int. Array of size n. Contains the column permutation from the ordering.

R
Type: int. Array of size n+1, but only R[0..nblocks] is used. Contains the start and end column/row index for each block. Block k goes from R[k] to R[k+1]-1.

nzoff
Type: int. Contains the number of nonzeros in the off-diagonal blocks.

nblocks
Type: int. Contains the number of blocks.

maxblock
Type: int. Contains the size of the largest block.

ordering
Type: int. Contains the ordering used (0 = AMD, 1 = COLAMD, 2 = GIVEN).

do_btf
Type: int. Indicates whether or not the BTF preordering was requested.

The members symmetry, est_flops, lnz, unz and Lnz are computed only when AMD is used. The remaining members are computed for all orderings.

4.1.3 klu_numeric

This structure encapsulates the information related to the factor phase. It contains the LU factors of each block, the pivot row permutation, and the entries in the off-diagonal blocks, among others. Its contents are listed as follows:

umin
Type: double. Contains the minimum absolute diagonal entry in U.

umax
Type: double. Contains the maximum absolute diagonal entry in U.

nblocks
Type: int. Contains the number of blocks.

lnz
Type: int. Contains the actual number of nonzeros in L, excluding the diagonal.

unz
Type: int. Contains the actual number of nonzeros in U, excluding the diagonal.

Pnum
Type: int. Array of size n. Contains the final pivot permutation.

Pinv
Type: int. Array of size n. Contains the inverse of the final pivot permutation.

Lbip
Type: int **.

Array of size nblocks. Each element is an array of size block size + 1, containing the column pointers for the L factor of the corresponding block.

Ubip
Type: int **. Array of size nblocks. Each element is an array of size block size + 1, containing the column pointers for the U factor of the corresponding block.

Lblen
Type: int **. Array of size nblocks. Each element is an array of size block size, containing the column lengths for the L factor of the corresponding block.

Ublen
Type: int **. Array of size nblocks. Each element is an array of size block size, containing the column lengths for the U factor of the corresponding block.

LUbx
Type: void **. Array of size nblocks. Each element is an array containing the row indices and numerical values of the LU factors of the corresponding block. The diagonals of the LU factors are not stored here.

Udiag
Type: void **. Array of size nblocks. Each element is an array of size block size, containing the diagonal values of the U factor of the corresponding block.

Singleton
Type: void. Array of size nblocks. Contains the singleton values.

Rs

Type: double. Array of size n. Contains the row scaling factors.

scale
Type: int. Indicates the scaling strategy used (0 = none, 1 = sum, 2 = max).

Work
Type: void. Permanent workspace used for factorization and solve. It is of size MAX(4n numerical values, n numerical values + 6*maxblock ints).

worksize
Type: size_t. Contains the size (in bytes) of the Work space allocated above.

Xwork
Type: void. An alias into Numeric->Work.

Iwork
Type: int. An integer alias into Xwork + n.

Offp
Type: int. Array of size n+1. Contains the column pointers for the off-diagonal elements.

Offi
Type: int. Array of size number of off-diagonal entries. Contains the row indices of the off-diagonal elements.

Offx

Type: void. Array of size number of off-diagonal entries. Contains the numerical values of the off-diagonal elements.

4.2 KLU Routines

The user callable KLU routines in the C language are explained in this section. The following guidelines are applicable to all routines, except when explicitly stated otherwise in the description of a routine:

1. All the arguments are required, except when explicitly stated as optional. If optional, a user can pass NULL for the corresponding argument.

2. The control input/output argument "Common", of type "klu_common *", is a required argument for all routines.

3. All arguments other than the above mentioned control argument "Common" are input arguments and are not modified.

4.2.1 klu_analyze

klu_symbolic *klu_analyze
(
    int n,
    int Ap [ ],
    int Ai [ ],
    klu_common *Common
) ;

This routine orders the matrix using BTF (if specified) and the specified fill reducing ordering. It returns a pointer to a klu_symbolic structure that contains the ordering information. All arguments are required.

- n: Size of the input matrix A, where A is n-by-n.
- Ap: Array of column pointers for the input matrix. Size n+1.

PAGE 69

- Ai: Array of row indices. Size number of nonzeros in A.
- Common: The control input/output structure.

4.2.2 klu_analyze_given

klu_symbolic *klu_analyze_given
(
    int n,
    int Ap [ ],
    int Ai [ ],
    int Puser [ ],
    int Quser [ ],
    klu_common *Common
) ;

This routine orders the matrix using BTF if specified and the given Puser and Quser as the fill reducing ordering. If Puser and Quser are NULL, then the natural ordering is used. It returns a pointer to a klu_symbolic structure that contains the ordering information. Arguments:

- n: Size of the input matrix A, where A is n*n. Required.
- Ap: Array of column pointers for the input matrix. Size n+1. Required.
- Ai: Array of row indices. Size number of nonzeros in A. Required.
- Puser: Optional row permutation to use.
- Quser: Optional column permutation to use.
- Common: The control input/output structure.

4.2.3 klu_*factor

klu_numeric *klu_factor
(
    int Ap [ ],
    int Ai [ ],
    double Ax [ ],
    klu_symbolic *Symbolic,
    klu_common *Common
) ;

This routine factors a real matrix. There is a complex version of this routine, klu_z_factor, that factors a complex matrix and has the same function declaration as klu_factor. Both use the results of a call to klu_analyze. Returns a pointer to a klu_numeric structure if successful, NULL otherwise. All the arguments are required.

- Ap: Array of column pointers for the input matrix. Size n+1.
- Ai: Array of row indices. Size number of nonzeros in A.
- Ax: Array of numerical values. Size number of nonzeros in A. In the complex case, the array should consist of the real and imaginary parts of each numerical value as adjacent pairs.
- Symbolic: The structure that contains the results from a call to klu_analyze.
- Common: The control input/output structure. The status field in Common is set to indicate whether the routine was successful or not.

4.2.4 klu_*solve

void klu_solve
(
    klu_symbolic *Symbolic,
    klu_numeric *Numeric,
    int ldim,
    int nrhs,
    double B [ ],
    klu_common *Common
) ;
This routine solves a real system. There is a complex version of this routine, klu_z_solve, that solves a complex system and has the same function declaration as klu_solve. Both use the results of a call to klu_analyze and klu_*factor. The return type is void. The rhs vector B is overwritten with the solution. All arguments are required.

- Symbolic: The structure that contains the results from a call to klu_analyze.
- Numeric: The structure that contains the results from a call to klu_*factor.
- ldim: The leading dimension of the right hand side B.
- nrhs: The number of right hand sides being solved.
- B: The right hand side. It is a vector of length ldim*nrhs. It can be real or complex depending on whether a real or complex system is being solved. If complex, the real and imaginary parts of each rhs numerical value must be stored as adjacent pairs. It is overwritten with the solution.
- Common: The control input/output structure.

4.2.5 klu_*tsolve

void klu_tsolve
(
    klu_symbolic *Symbolic,
    klu_numeric *Numeric,
    int ldim,
    int nrhs,
    int conj_solve,
    double B [ ],
    klu_common *Common
) ;
This routine is similar to klu_solve except that it solves a transpose system A'x = b. There is again a complex version of this routine, klu_z_tsolve, for solving complex systems, with the same function declaration as klu_tsolve. It also offers a conjugate transpose solve for the complex system, A^H x = b. The return type is void. The rhs vector B is overwritten with the solution. All arguments are required. The descriptions of all arguments except conj_solve are the same as those for klu_*solve. The argument conj_solve is relevant only for the complex case. It takes two values: 1 = CONJUGATE TRANSPOSE SOLVE, 0 = TRANSPOSE SOLVE.

4.2.6 klu_*refactor

void klu_refactor
(
    int Ap [ ],
    int Ai [ ],
    double Ax [ ],
    klu_symbolic *Symbolic,
    klu_numeric *Numeric,
    klu_common *Common
) ;

This routine does a refactor of the matrix using the previously computed ordering information in the Symbolic object and the nonzero pattern of the LU factors in the Numeric object. It assumes the same partial pivoting order computed in Numeric. It changes only the numerical values of the LU factors stored in the Numeric object. It has a complex version, klu_z_refactor, with the same function prototype to handle complex cases. The return type is void. The numerical values of the LU factors in the Numeric parameter are updated. All arguments are required.
- Ap: Array of column pointers for the input matrix. Size n+1.
- Ai: Array of row indices. Size number of nonzeros in A.
- Ax: Array of numerical values. Size number of nonzeros in A. In the complex case, the array should consist of the real and imaginary parts of each numerical value as adjacent pairs.
- Symbolic: The structure that contains the results from a call to klu_analyze.
- Numeric: Input/output argument. The structure contains the results from a call to klu_*factor. The numerical values of the LU factors are overwritten with the ones for the current matrix being factorized.
- Common: The control input/output structure. The status field in Common is set to indicate whether the routine was successful or not.

4.2.7 klu_defaults

void klu_defaults
(
    klu_common *Common
) ;

This routine sets the default values for the control input parameters of the klu_common object. The default values are listed in the description of the klu_common structure. A call to this routine is required unless the user sets the control input parameters explicitly. The return type is void. The argument Common is required. The control input parameters in Common are set to default values.

4.2.8 klu_*rec_pivot_growth

double klu_rec_pivot_growth
(
    int Ap [ ],
    int Ai [ ],
    double Ax [ ],
    klu_symbolic *Symbolic,
    klu_numeric *Numeric,
    klu_common *Common
) ;

This routine computes the reciprocal pivot growth of the factorization algorithm. The complex version of this routine, klu_z_rec_pivot_growth, handles complex matrices and has the same function declaration. The pivot growth estimate is returned. All arguments are required.

- Ap: Array of column pointers for the input matrix. Size n+1.
- Ai: Array of row indices. Size number of nonzeros in A.
- Ax: Array of numerical values. Size number of nonzeros in A. In the complex case, the array should consist of the real and imaginary parts of each numerical value as adjacent pairs.
- Symbolic: The structure that contains the results from a call to klu_analyze.
- Numeric: The structure that contains the results from a call to klu_*factor.
- Common: The control input/output structure. The status field in Common is set to indicate whether the routine was successful or not.

4.2.9 klu_*estimate_cond_number

double klu_estimate_cond_number
(
    int Ap [ ],
    double Ax [ ],
    klu_symbolic *Symbolic,
    klu_numeric *Numeric,
    klu_common *Common
) ;

This routine computes the condition number estimate of the input matrix. As before, the complex version of this routine, klu_z_estimate_cond_number, has the same function declaration and handles complex matrices. The condition number estimate is returned. All arguments are required.

- Ap: Array of column pointers for the input matrix. Size n+1.
- Ax: Array of numerical values. Size number of nonzeros in A. In the complex case, the array should consist of the real and imaginary parts of each numerical value as adjacent pairs.
- Symbolic: The structure that contains the results from a call to klu_analyze.
- Numeric: The structure that contains the results from a call to klu_*factor.
- Common: The control input/output structure. The status field in Common is set to indicate whether the routine was successful or not.

4.2.10 klu_free_symbolic

void klu_free_symbolic
(
    klu_symbolic **Symbolic,
    klu_common *Common
) ;

This routine deallocates or frees the contents of the klu_symbolic object. The Symbolic parameter must be a valid object computed by a call to klu_analyze or klu_analyze_given. The return type is void. All arguments are required.
- Symbolic: Input/output argument. Must be a valid object computed by a call to klu_analyze or klu_analyze_given. If NULL, the routine just returns.
- Common: The control input/output structure.

4.2.11 klu_free_numeric

void klu_free_numeric
(
    klu_numeric **Numeric,
    klu_common *Common
) ;

This routine frees the klu_numeric object computed by a call to the klu_factor or klu_z_factor routines. It resets the pointer to the klu_numeric object to NULL. There is a complex version of this routine, klu_z_free_numeric, with the same function declaration to handle the complex case. The return type is void. All arguments are required.

- Numeric: Input/output argument. The contents of the klu_numeric object are freed. The pointer to the klu_numeric object is set to NULL.
- Common: The control input/output structure.


REFERENCES

[1] C. L. Lawson, R. J. Hanson, D. Kincaid, and F. T. Krogh, Basic linear algebra subprograms for FORTRAN usage, ACM Trans. Math. Soft. 5: 308-323, 1979.

[2] J. J. Dongarra, J. Du Croz, S. Hammarling, and R. J. Hanson, An extended set of FORTRAN basic linear algebra subprograms, ACM Trans. Math. Soft. 14: 1-17, 1988.

[3] J. J. Dongarra, J. Du Croz, S. Hammarling, and R. J. Hanson, Algorithm 656: An extended set of FORTRAN basic linear algebra subprograms, ACM Trans. Math. Soft. 14: 18-32, 1988.

[4] J. J. Dongarra, J. Du Croz, I. S. Duff, and S. Hammarling, A set of level 3 basic linear algebra subprograms, ACM Trans. Math. Soft. 16: 1-17, 1990.

[5] J. J. Dongarra, J. Du Croz, I. S. Duff, and S. Hammarling, Algorithm 679: A set of level 3 basic linear algebra subprograms, ACM Trans. Math. Soft. 16: 18-28, 1990.

[6] James W. Demmel, Stanley C. Eisenstat, John R. Gilbert, Xiaoye S. Li, and Joseph W. H. Liu, A supernodal approach to sparse partial pivoting, SIAM J. Matrix Analysis and Applications 20(3): 720-755, 1999.

[7] Timothy A. Davis and I. S. Duff, An unsymmetric-pattern multifrontal method for sparse LU factorization, SIAM J. Matrix Analysis and Applications 18(1): 140-158, 1997.

[8] Timothy A. Davis, Algorithm 832: UMFPACK, an unsymmetric-pattern multifrontal method with a column pre-ordering strategy, ACM Trans. Math. Software 30(2): 196-199, 2004.

[9] John R. Gilbert and Tim Peierls, Sparse partial pivoting in time proportional to arithmetic operations, SIAM J. Sci. Stat. Comput. 9(5): 862-873, 1988.

[10] A. George and E. Ng, An implementation of Gaussian elimination with partial pivoting for sparse systems, SIAM J. Sci. Statist. Comput. 6(2): 390-409, 1985.

[11] Iain S. Duff, On algorithms for obtaining a maximum transversal, ACM Transactions on Mathematical Software 7(3): 315-330, 1981.
[12] Iain S. Duff, Algorithm 575: Permutations for a zero-free diagonal, ACM Transactions on Mathematical Software 7(3): 387-390, 1981.

[13] Iain S. Duff and John K. Reid, Algorithm 529: Permutations to block triangular form, ACM Trans. on Mathematical Software 4(2): 189-192, 1978.

[14] Iain S. Duff and John K. Reid, An implementation of Tarjan's algorithm for the block triangular form of a matrix, ACM Trans. on Mathematical Software 4(2): 137-147, 1978.

[15] R. E. Tarjan, Depth first search and linear graph algorithms, SIAM J. Computing 1: 146-160, 1972.

[16] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein, Introduction to Algorithms, Second Edition, MIT Press, Cambridge, 2001.

[17] S. C. Eisenstat and J. W. H. Liu, Exploiting structural symmetry in a sparse partial pivoting code, SIAM J. Sci. Comput. 14(1): 253-257, 1993.

[18] P. R. Amestoy, T. A. Davis, and I. S. Duff, An approximate minimum degree ordering algorithm, SIAM J. Matrix Anal. Applic. 17(4): 886-905, 1996.

[19] P. R. Amestoy, T. A. Davis, and I. S. Duff, Algorithm 837: AMD, an approximate minimum degree ordering algorithm, ACM Transactions on Mathematical Software 30(3): 381-388, 2004.

[20] Timothy A. Davis, John R. Gilbert, Stefan I. Larimore, and Esmond G. Ng, A column approximate minimum degree ordering algorithm, ACM Transactions on Mathematical Software 30(3): 353-376, 2004.

[21] Timothy A. Davis, John R. Gilbert, Stefan I. Larimore, and Esmond G. Ng, Algorithm 836: COLAMD, a column approximate minimum degree ordering algorithm, ACM Transactions on Mathematical Software 30(3): 377-380, 2004.

[22] W. W. Hager, Condition estimates, SIAM J. Sci. Stat. Comput. 5(2): 311-316, 1984.

[23] Nicholas J. Higham, FORTRAN codes for estimating the one-norm of a real or complex matrix, with applications to condition estimation, ACM Trans. on Mathematical Software 14(4): 381-396, 1988.
BIOGRAPHICAL SKETCH

Ekanathan was born in Tirunelveli, India, on October 2, 1977. He received his Bachelor of Technology degree in chemical engineering from Anna University, Chennai, India, in May 1998. He worked with Infosys Technologies at Brussels, Belgium, as a programmer/analyst from 1998 till 2001, and with SAP AG at Walldorf, Germany, as a software developer from 2001 till 2003. Currently he is pursuing his Master of Science degree in computer science at the University of Florida. His interests and hobbies include travel, reading novels and magazines, music and movies.


Permanent Link: http://ufdc.ufl.edu/UFE0011721/00001

Material Information

Title: KLU--A High Performance Sparse Linear Solver for Circuit Simulation Problems
Physical Description: Mixed Material
Copyright Date: 2008

Record Information

Source Institution: University of Florida
Holding Location: University of Florida
Rights Management: All rights reserved by the source institution and holding location.
System ID: UFE0011721:00001












KLU-A HIGH PERFORMANCE SPARSE LINEAR SOLVER
FOR CIRCUIT SIMULATION PROBLEMS















By

EKANATHAN PALAMADAI NATARAJAN


A THESIS PRESENTED TO THE GRADUATE SCHOOL
OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT
OF THE REQUIREMENTS FOR THE DEGREE OF
MASTER OF SCIENCE

UNIVERSITY OF FLORIDA


2005

































Copyright 2005

by

Ekanathan Palamadai Natarajan
















I dedicate this work to my mother Savitri who has been a source of inspiration

and support to me.















ACKNOWLEDGMENTS

I would like to thank Dr. Timothy Davis, my advisor, for introducing me

to the area of sparse matrix algorithms and linear solvers. I started only with

my background in numerical analysis and algorithms, a year and a half ago. The

insights and knowledge I have gained since then in the area and in implementing

a sparse linear solver like KLU would not have been possible but for Dr. Davis'

guidance and help. I thank him for giving me an opportunity to work on KLU. I

would like to thank Dr. Jose Fortes and Dr. Arunava Banerjee for their support

and help and for serving on my committee.

I would like to thank CISE administrative staff for helping me at different

times during my master's research work.















TABLE OF CONTENTS
page

ACKNOWLEDGMENTS ................... ...... iv

LIST OF TABLES ...................... ......... vii

LIST OF FIGURES ................... ......... viii

ABSTRACT ....................... ........... ix

CHAPTER

1 INTRODUCTION .................... ....... 1

2 THEORY: SPARSE LU ........................... 5

2.1 D ense LU . . . . . . . 5
2.2 Sparse LU . . . . . . . 7
2.3 Left Looking Gaussian Elimination ........ ........ 8
2.4 Gilbert-Peierls' Algorithm ......... ........ ..... 10
2.4.1 Symbolic Analysis .......... ........ ..... 11
2.4.2 Numerical Factorization ......... ........ .. 13
2.5 Maximum Transversal ................... ... 14
2.6 Block Triangular Form .......... ............. 16
2.7 Symmetric Pruning ................... .... 18
2.8 Ordering .................. .......... 19
2.9 Pivoting .................. .......... 22
2.10 Scaling ........ ...... ................. 23
2.11 Growth Factor ......... .......... ...... 25
2.12 Condition Number ............... .... .. 27
2.13 Depth First Search ............... ..... .. 30
2.14 Memory Fragmentation ............... .. .. 31
2.15 Complex Number Support ................ . .. 33
2.16 Parallelism in KLU ............... ..... .. 33

3 CIRCUIT SIMULATION: APPLICATION OF KLU . .... 35

3.1 Characteristics of Circuit Matrices ..... . . ..... 37
3.2 Linear Systems in Circuit Simulation . . ..... 38
3.3 Performance Benchmarks ................ .... .. 39
3.4 Analyses and Findings ............. .. .. .. 41
3.5 Alternate Ordering Experiments ... . . 42










3.6 Experiments with UF Sparse Matrix Collection .. ........
3.6.1 Different Ordering Schemes in KLU .. ...........
3.6.2 Timing Different Phases in KLU .. ............
3.6.3 Ordering Quality among KLU, UMFPACK and Gilbert-Peierls
3.6.4 Performance Comparison between KLU and UMFPACK .


4 USER GUIDE FOR KLU

4.1 The Primary KLU Structures
4.1.1 klu_common
4.1.2 klu_symbolic
4.1.3 klu_numeric
4.2 KLU Routines
4.2.1 klu_analyze
4.2.2 klu_analyze_given
4.2.3 klu_*factor
4.2.4 klu_*solve
4.2.5 klu_*tsolve
4.2.6 klu_*refactor
4.2.7 klu_defaults
4.2.8 klu_*rec_pivot_growth
4.2.9 klu_*estimate_cond_number
4.2.10 klu_free_symbolic
4.2.11 klu_free_numeric

REFERENCES

BIOGRAPHICAL SKETCH















LIST OF TABLES
Table page

3-1 Comparison between KLU and SuperLU on overall time and fill-in .39

3-2 Comparison between KLU and SuperLU on factor time and solve time 40

3-3 Ordering results using BTF+AMD in KLU on circuit matrices .. 41

3-4 Comparison of ordering results produced by BTF+AMD, AMD, MMD 43

3-5 Fill-in with four different schemes in KLU . . 46

3-6 Time in seconds, spent in different phases in KLU . .... 47

3-7 Fill-in among KLU, UMFPACK and Gilbert-Peierls . ... 49

3-8 Performance comparison between KLU and UMFPACK . ... 50















LIST OF FIGURES
Figure page

2-1 Nonzero pattern of x when solving Lx = b . . ...... 13

2-2 A matrix permuted to BTF form ....... .. 16

2-3 A symmetric pruning scenario ................... .. .. 18

2-4 A symmetric matrix and its graph representation . .... 21

2-5 The matrix and its graph representation after one step of Gaussian
elimination ................... 21

2-6 A doubly bordered block diagonal matrix and its corresponding ver-
tex separator tree .................. ......... .. 34















Abstract of Thesis Presented to the Graduate School
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Master of Science

KLU-A HIGH PERFORMANCE SPARSE LINEAR SOLVER
FOR CIRCUIT SIMULATION PROBLEMS

By

Ekanathan Palamadai Natarajan

August 2005

Chair: Dr. Timothy A. Davis
Major Department: Computer and Information Science and Engineering

The thesis work focuses on KLU, a sparse high performance linear solver for

circuit simulation matrices. During the circuit simulation process, one of the key

steps is to solve sparse systems of linear equations of very high order. KLU targets

solving these systems with efficient ordering mechanisms and high performance

factorization and solve algorithms. KLU uses a hybrid ordering strategy that

comprises an unsymmetric permutation to ensure zero free diagonal, a symmetric

permutation to block upper triangular form and a fill reducing ordering such as

approximate minimum degree.

The factorization is based on Gilbert-Peierls' left-looking algorithm with

partial pivoting. KLU also includes features like symmetric pruning to cut down

symbolic analysis costs. It offers to solve up to four right hand sides in a single

solve step. In addition, it offers transpose and conjugate transpose solve capabili-

ties and important diagnostic features to estimate the reciprocal pivot growth of

the factorization and condition number of the input matrix.

The algorithm is implemented in the C language with MATLAB interfaces as

well. The MATLAB interfaces enable a user to invoke KLU routines from within









the MATLAB environment. The implementation was tested on circuit matrices

and the results determined. KLU achieves superior fill-in quality with its hybrid

ordering strategy and achieves a good performance speed-up when compared with

existing sparse linear solvers for circuit problems. The thesis highlights the work

being done on exploiting parallelism in KLU as well.















CHAPTER 1
INTRODUCTION

Sparse is beautiful. Solving systems of linear equations of the form Ax = b

is a fundamental and important area of high performance computing. The matrix

A is called the coefficient matrix and b the right hand side vector. The vector x is

the solution to the equation. There are a number of methods available for solving

such systems. Some of the popular ones are Gaussian elimination, QR factorization

using Householder transformations or Givens rotations and Cholesky factorization.

Gaussian elimination with partial pivoting is the most widely used algorithm for

solving linear systems because of its stability and better time complexity. Cholesky

can be used only when A is symmetric positive definite.

Some systems that are solved comprise a dense coefficient matrix A. By dense,

we mean most of the elements in A are nonzero. There are high performance

subroutines such as the BLAS [1, 2, 3, 4, 5] that can maximize flop count for

such dense matrices. The interesting systems are those where the coefficient

matrix A happens to be sparse. By sparse, we mean the matrix has few nonzero

entries(hereafter referred to simply as 'nonzeros'). The adjective 'few' is not well-

defined as we will see in chapter two. When matrices tend to be sparse, we need

to find out effective ways to store the matrix in memory since we want to avoid

storing zeros of the matrix. When we store only the nonzeros in the matrix, it has

consequences in the factorization algorithm as well. One typical example would be

we do not know before hand how nonzeros would appear in the L and U factors

when we factorize the matrix. While we avoid storing the zeros, we also want to

achieve good time complexity when solving sparse systems. If the time spent to









solve sparse systems remains same as for dense systems, we have not done any

better.

KLU stands for Clark Kent LU, since it is based on Gilbert-Peierls' algorithm,

a non-supernodal algorithm, which is the predecessor to SuperLU, a supernodal

algorithm. KLU is a sparse high performance linear solver that employs hybrid

ordering mechanisms and elegant factorization and solve algorithms. It achieves

a high quality fill-in rate and beats many existing solvers in run time, when used for

matrices arising in circuit simulation.

There are several flavours of Gaussian elimination. A left-looking Gaussian

elimination algorithm factorizes the matrix left to right computing columns of L

and U. A right-looking version factorizes the matrix from top-left to bottom-right

computing a column of L and a row of U. Both have their advantages and

disadvantages. KLU uses a left looking algorithm called Gilbert-Peierls' algorithm.

Gilbert-Peierls' comprises a graph theoretical symbolic analysis phase that iden-

tifies the nonzero pattern of each column of L and U factors and a left-looking

numerical factorization phase with partial pivoting that calculates the numerical

values of the factors. KLU uses Symmetric Pruning to cut down symbolic analysis

cost. We shall look in detail on these features in chapter two.

A critical issue in linear solvers is Ordering. Ordering means permuting the

rows and columns of a matrix, so that the fill-in in the L and U factors is reduced

to a minimum. A fill-in is defined as a nonzero appearing in either of the matrices

L or U, while the element in the corresponding position in A is a zero. Lij or Uij

is a fill-in if Aij is a zero. Fill-in has obvious consequences in memory in that

the factorization algorithm could create dense L and U factors that can exhaust

available memory. A good ordering algorithm yields a low fill-in in the factors.

Finding the ordering that gives minimal fill-in is an NP-complete problem. So









ordering algorithms use heuristics. KLU accommodates multiple ordering schemes

like AMD, COLAMD and any user generated permutation.

There are other orderings for different purposes. For example, one could order

a matrix to ensure that it has no zeros on the diagonal. Otherwise, the Gaussian

elimination would fail. Another ordering scheme could reduce the factorization

work. KLU employs two such orderings, namely an unsymmetric ordering that

ensures a zero free diagonal and a symmetric ordering that permutes the matrix

into a block upper triangular form (BTF) that restricts factorization to only the

diagonal blocks.

One of the key steps in the circuit simulation process is solving sparse linear

systems. These systems originate from solving large systems of non linear equations

using Newton's method and integrating large stiff systems of ordinary differential

equations. These systems are of very high dimensions and a considerable fraction of

simulation time is spent on solving these systems. Often the solve phase tends to

be a bottleneck in the simulation process. Hence high performance sparse solvers

that optimize memory usage and solution time are critical components of circuit

simulation software. Some of the popular solvers in use in circuit simulation tools

are Sparse1.3 and SuperLU. Sparse1.3 is used in the SPICE circuit simulation package

and SuperLU uses a supernodal factorization algorithm. Experimental results of

KLU indicate that it is 1000 times faster than Sparse1.3 and 1.5 to 3 times faster

than SuperLU.

Circuit matrices show some unique properties. They have a nearly zero free

diagonal. They have a roughly symmetric pattern but have unsymmetric values.

They are highly sparse and often have a few dense rows and columns. These dense

rows/columns arise from voltage sources and current sources in the circuit. Circuit

matrices show good amenability to BTF ordering. Though the nonzero pattern of

original matrix is unsymmetric, the nonzero pattern of blocks produced by BTF









ordering tend to be symmetric. Since circuit matrices are extremely sparse, sparse

matrix algorithms such as SuperLU [6] and UMFPACK [7, 8] that employ dense

BLAS kernels are often inappropriate. Another unique characteristic of circuit

matrices is that employing a good ordering strategy keeps the L and U factors

sparse. However as we will see in experimental results, typical ordering strategies

can lead to high fill-in.

In circuit simulation problems, typically the circuit matrix template is gen-

erated once and the numerical values of the matrix alone change. In other words,

the nonzero pattern of the matrix does not change. This implies that we need

to order and factor the matrix once to generate the ordering permutations and

the nonzero patterns of L and U factors. For all subsequent matrices, we can use

the same information and need only to recompute the numerical values of the

L and U factors. This process of skipping analysis and factor phases is called

refactorization. Refactorization leads to a significant reduction in run time.

Because of the unique characteristics of circuit matrices and their amenability

to BTF ordering, KLU is a method well-suited to circuit simulation problems. KLU

has been implemented in the C language. It offers a set of API for the analysis

phase, factor phase, solve phase and refactor phase. It also offers the ability to

solve up to four right hand sides in a single solve step. In addition, it offers

transpose solve, conjugate transpose solve features and diagnostic tools like pivot growth

estimator and condition number estimator. It also offers a MATLAB interface for

the API so that KLU can be used from within the MATLAB environment.















CHAPTER 2
THEORY: SPARSE LU

2.1 Dense LU

Consider the problem of solving the linear system of n equations in n un-

knowns:


a11 x1 + a12 x2 + ... + a1n xn = b1
a21 x1 + a22 x2 + ... + a2n xn = b2
...                                                      (2-1)
an1 x1 + an2 x2 + ... + ann xn = bn


or, in matrix notation,


[ a11 a12 ... a1n ] [ x1 ]   [ b1 ]
[ a21 a22 ... a2n ] [ x2 ] = [ b2 ]                      (2-2)
[ ...             ] [ ...]   [ ...]
[ an1 an2 ... ann ] [ xn ]   [ bn ]


Ax = b

where A = (aij), x = (x1, x2, ..., xn)^T and b = (b1, ..., bn)^T. A well-known approach

to solving this equation is Gaussian elimination. Gaussian elimination consists of a

series of eliminations of unknowns xi from the original system. Let us briefly review

the elimination process. In the first step, the first equation of 2-1 is multiplied

by -a21/a11, -a31/a11, ..., -an1/a11 and added to the second through nth equations of 2-1

respectively. This would eliminate x1 from the second through the nth equations. After










the first step, 2-2 would become

[ a11    a12      ...   a1n     ]
[ 0      a22(1)   ...   a2n(1)  ]                        (2-3)
[ ...                           ]
[ 0      an2(1)   ...   ann(1)  ]


where a22(1) = a22 - (a21/a11) a12, a32(1) = a32 - (a31/a11) a12, and so on. In the second

step, x2 will be eliminated by a similar process of computing multipliers and adding

the multiplied second equation with the third through nth equations. After n-1

eliminations, the matrix A is transformed to an upper triangular matrix U. The

upper triangular system is then solved by back-substitution.

An equivalent interpretation of this elimination process is that we have

factorized A into a lower triangular matrix L and an upper triangular matrix U

where


L = [ 1          0              0    ...  0 ]
    [ a21/a11    1              0    ...  0 ]
    [ a31/a11    a32(1)/a22(1)  1    ...  0 ]            (2-4)
    [ ...                                   ]
    [ an1/a11    an2(1)/a22(1)  ...       1 ]

and

U = [ a11   a12      a13      ...   a1n       ]
    [ 0     a22(1)   a23(1)   ...   a2n(1)    ]
    [ 0     0        a33(2)   ...   a3n(2)    ]          (2-5)
    [ ...                                     ]
    [ 0     0        0        ...   ann(n-1)  ]

The column k of the lower triangular matrix L consists of the multipliers

obtained during step k of the elimination process, with their sign negated.









Mathematically, Ax = b can be rewritten as


(LU)x = b

L(Ux) = b (2-6)


Substituting Ux = y in 2-6, we have


Ly = b (2-7)

Ux = y (2-8)


By solving these two lower triangular systems, we find the solution to the actual

system.

The reason for triangularizing the system is to avoid finding the inverse of

the original coefficient matrix A. Finding the inverse is at least three times as expensive as

Gaussian elimination in the dense case and often leads to more inaccuracies.

2.2 Sparse LU

A sparse matrix is defined as one that has few nonzeros in it. The quan-

tification of the adjective 'few' is not specified. The decision as to what kind of

algorithm to use (sparse or dense) depends on the fill-in properties of the matrices.

However, sparse matrices typically have O(n) nonzero entries. Dense matrices

are typically represented by a two dimensional array. The zeros of a sparse matrix

should not be stored if we want to save memory. This fact makes a two dimensional

array unsuitable for representing sparse matrices. Sparse matrices are represented

with a different kind of data structure. They can be represented in two different

data structures, viz. column compressed form or row compressed form.

A column compressed form consists of three vectors Ap, Ai and Ax. Ap

consists of column pointers. It is of length n+ 1. The start of column k of the input

matrix is given by Ap [k].









Ai consists of row indices of the elements. This is a zero based data structure

with row indices in the interval [0,n). Ax consists of the actual numerical values of

the elements.

Thus the elements of a column k of the matrix are held in
Ax[Ap[k] ... Ap[k+1]). The corresponding row indices are held in
Ai[Ap[k] ... Ap[k+1]).

Equivalently, a row compressed format stores a row pointer vector Ap, a

column indices vector Ai and a value vector Ax. For example, the matrix

5 0 0

4 2 0

3 1 8

when represented in column compressed format will be

Ap: 0 3 5 6

Ai: 0 1 2 1 2 2

Ax: 5 4 3 2 1 8

and when represented in row compressed format will be

Ap: 0 1 3 6

Ai: 0 0 1 0 1 2

Ax: 5 4 2 3 1 8

Let nnz represent the number of nonzeros in a matrix of dimension n × n.
Then in a dense matrix representation, we will need n^2 memory to represent the
matrix. In a sparse matrix representation, we reduce it to O(n + nnz), and typically
nnz is much smaller than n^2.
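For illustration, the three vectors of the column compressed form can be built from a dense array with a few lines of code; the helper name dense_to_csc is a hypothetical name for this sketch, not part of KLU.

```python
# Hypothetical helper (not part of KLU): build the column compressed
# vectors Ap, Ai, Ax described above from a dense matrix stored as a
# list of rows.
def dense_to_csc(A):
    n = len(A)
    Ap, Ai, Ax = [0], [], []
    for j in range(n):               # walk the columns
        for i in range(n):           # walk the rows within column j
            if A[i][j] != 0:
                Ai.append(i)         # row index of the entry
                Ax.append(A[i][j])   # numerical value of the entry
        Ap.append(len(Ai))           # column j+1 starts here
    return Ap, Ai, Ax

# The example matrix from the text.
Ap, Ai, Ax = dense_to_csc([[5, 0, 0],
                           [4, 2, 0],
                           [3, 1, 8]])
# Reproduces Ap: 0 3 5 6, Ai: 0 1 2 1 2 2, Ax: 5 4 3 2 1 8
```

The row compressed form of the same matrix is obtained by running the identical loop over the transpose.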

2.3 Left Looking Gaussian Elimination

Let us derive a left looking version of Gaussian elimination. Let an input

matrix A of order n × n be represented as a product of two triangular matrices L

and U.













[ A11  a12  A13 ]   [ L11   0    0  ] [ U11  u12  U13 ]
[ a21  a22  a23 ] = [ l21   1    0  ] [  0   u22  u23 ]        (2-9)
[ A31  a32  A33 ]   [ L31  l32  L33 ] [  0    0   U33 ]

where the capital Aij are blocks, the lower case a12, a21, a23 and a32 are vectors,
and a22 is a scalar. The dimensions of the different elements in the matrices are as
follows:

A11, L11, U11 are k × k blocks

a12, u12 are k × 1 column vectors

A13, U13 are k × (n - k - 1) blocks

a21, l21 are 1 × k row vectors

a22, u22 are scalars

a23, u23 are 1 × (n - k - 1) row vectors

A31, L31 are (n - k - 1) × k blocks

a32, l32 are (n - k - 1) × 1 column vectors

A33, L33, U33 are (n - k - 1) × (n - k - 1) blocks.

From (2-9), we can arrive at the following set of equations:

L11 U11 = A11        (2-10)

L11 u12 = a12        (2-11)

L11 U13 = A13        (2-12)

l21 U11 = a21        (2-13)

l21 u12 + u22 = a22        (2-14)

l21 U13 + u23 = a23        (2-15)

L31 U11 = A31        (2-16)

L31 u12 + l32 u22 = a32        (2-17)

L31 U13 + l32 u23 + L33 U33 = A33        (2-18)









From (2-11), (2-14) and (2-17), we can compute the (k+1)st column of L
and U, assuming we have already computed L11, l21 and L31. We first solve the
lower triangular system (2-11) for u12. Then, we solve for u22 using (2-14) by
computing the sparse dot product

u22 = a22 - l21 u12        (2-19)

Finally we solve (2-17) for l32 as

l32 = (a32 - L31 u12) / u22        (2-20)

This step of computing the (k+1)st column of L and U can be considered
equivalent to solving a lower triangular system as follows:

[ L11   0   0 ] [   u12   ]   [ a12 ]
[ l21   1   0 ] [   u22   ] = [ a22 ]        (2-21)
[ L31   0   I ] [ l32 u22 ]   [ a32 ]

This mechanism of computing column k of L and U by solving a lower

triangular system L x = b is the key step in a left-looking factorization algorithm.

As we will see later, Gilbert-Peierls' algorithm revolves around solving this lower

triangular system. The algorithm is called a left-looking algorithm since column

k of L and U is computed by using the already computed columns 1...k-1 of L.

In other words, to compute column k of L and U, one looks only at the already

computed columns 1...k-1 in L, that are to the left of the currently computed

column k.

2.4 Gilbert-Peierls' Algorithm

Gilbert and Peierls [9] proposed an algorithm for Gaussian elimination with partial
pivoting in time proportional to the flop count of the elimination, to factor an
arbitrary nonsingular sparse matrix A as PA = LU. If flops(LU) is the number









of nonzero multiplications performed when multiplying two matrices L and U, then

Gaussian elimination uses exactly flops(LU) multiplications and divisions to factor

a matrix A into L and U. Given an input matrix and assuming no partial pivoting,

it is possible to predict the nonzero pattern of its factors. However with partial

pivoting, it is not possible to predict the exact nonzero pattern of the factors

beforehand. Finding an upper bound is possible, but the bound can be very loose

[10]. Note that computing the nonzero pattern of L and U is a necessary part of

Gaussian elimination involving sparse matrices since we do not use two dimensional

arrays for representing them but sparse data structures. Gilbert-Peierls' algorithm

aims at computing the nonzero pattern of the factors and the numerical values in a

total time proportional to flops(LU).

It consists of two stages for determining every column of L and U. The first

stage is a symbolic analysis stage that computes the nonzero pattern of the column

k of the factors. The second stage is the numerical factorization stage that involves

solving the lower triangular system Lx = b, that we discussed in the section above.

2.4.1 Symbolic Analysis

A sparse Gaussian elimination algorithm with partial pivoting cannot know

the exact nonzero structure of the factors ahead of all numerical computation,

simply because partial pivoting at column k can introduce new nonzeros in columns

k+1 ... n. Solving Lx = b must be done in time proportional to the number of flops

performed. Consider a simple column-oriented algorithm in MATLAB notation for

solving Lx = b as follows:

x = b
for j = 1:n
    if x(j) ~= 0
        x(j+1:n) = x(j+1:n) - L(j+1:n,j) * x(j)
    end
end

The above algorithm takes time O(n + number of flops performed). The O(n)
term looks harmless, but Lx = b is solved n times in the factorization of A = LU,
leading to an unacceptable O(n^2) term in the work to factorize A into L times U.

To remove the O(n) term, we must replace the algorithm with

x = b
for each j for which x(j) ~= 0
    x(j+1:n) = x(j+1:n) - L(j+1:n,j) * x(j)
end

This would reduce the O(n) term to O(η(b)), where η(b) is the number of nonzeros

in b. Note that b is a column of the input matrix A. Thus to solve Lx = b, we need

to know the nonzero pattern of x before we compute x itself. Symbolic analysis

helps us determine the nonzero pattern of x.

Let us say we are computing column k of L and U. Let G = G(Lk) be
the directed graph of L with k - 1 vertices representing the already computed
k - 1 columns. G(Lk) has an edge j → i iff lij ≠ 0. Let β = {i | bi ≠ 0} and
X = {i | xi ≠ 0}. Now the set X is given by

X = Reach_G(Lk)(β)        (2-22)

The nonzero pattern X is computed by determining the vertices that
are reachable from the vertices of the set β. The reachability problem can be solved
using a classical depth first search in G(Lk) from the vertices of the set β. If bj ≠ 0,
then xj ≠ 0. In addition, if xj ≠ 0 and lij ≠ 0, then xi ≠ 0 even if bi = 0. This is
because the term lij xj contributes a nonzero to the equation when we solve for xi.
During the depth first search, Gilbert-Peierls' algorithm computes the topological
order of X. This topological ordering is useful for eliminating unknowns in the
numerical factorization step.













Figure 2-1: Nonzero pattern of x when solving Lx = b

The row indices vector Li of columns 1 ... k - 1 of L represents the adjacency
list of the graph G(Lk). The depth first search takes time proportional to the

number of vertices examined plus the number of edges traversed.
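The reachability computation above can be sketched in a few lines; here G(L) is given as a hypothetical adjacency dict (an edge j → i for every nonzero lij below the diagonal), not KLU's actual data structure.

```python
# Sketch of X = Reach(beta) over G(L), with G(L) as a hypothetical
# adjacency dict: an edge j -> i for every nonzero lij below the diagonal.
def reach(graph, beta):
    visited, post = set(), []
    def dfs(j):
        visited.add(j)
        for i in graph.get(j, []):
            if i not in visited:
                dfs(i)
        post.append(j)               # record j once fully explored
    for j in beta:                   # beta = row indices of b's nonzeros
        if j not in visited:
            dfs(j)
    post.reverse()                   # reverse postorder = a topological order
    return post

# Column 0 of L has nonzeros in rows 1 and 3; column 1 has one in row 2.
pattern = reach({0: [1, 3], 1: [2]}, [0])
# pattern holds rows 0, 1, 2, 3, already in a topological order
```

The reverse postorder returned here is exactly the elimination order the numerical factorization step needs.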

2.4.2 Numerical Factorization

Numerical factorization consists of solving the system (2-21) for each column
k of L and U. Normally we would solve for the unknowns in (2-21) in the

increasing order of the row index. The row indices/nonzero pattern computed by

depth first search are not necessarily in increasing order. Sorting the indices would

increase the time complexity above our O(flops(LU)) goal. However, the require-

ment of eliminating unknowns in increasing order can be relaxed to a topological

order of the row indices. An unknown xi can be computed once all the unknowns
xj on which it depends have been computed. This is obvious when we write the

equations comprising a lower triangular solve. Theoretically, the unknowns can be

solved in any topological order. The depth first search algorithm gives one such

topological order which is sufficient for our case. In our example, the depth first

search would have finished exploring vertex i before it finishes exploring vertex j.









Hence a topological order given by depth first search would have j appearing before

i. This is exactly what we need.

Gilbert-Peierls' algorithm starts with an identity L matrix. The entire left

looking algorithm can be summarized in MATLAB notation as follows:

L = I
for k = 1:n
    x = L \ A(:,k)
    % (partial pivoting on x can be done here)
    U(1:k,k) = x(1:k)
    L(k:n,k) = x(k:n) / U(k,k)
end

where x = L\b denotes the solution of a sparse lower triangular system. In

this case, b is the kth column of A. The total time complexity of Gilbert-Peierls'

algorithm is O(η(A) + flops(LU)), where η(A) is the number of nonzeros in the matrix A

and flops(LU) is the flop count of the product of the matrices L and U. Typically

flops(LU) dominates the complexity and hence the claim of factorizing in time

proportional to the flop count.
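A dense Python analogue of the loop above may make the data flow concrete; this is an illustrative sketch (no pivoting, dense storage), not the sparse implementation the text describes.

```python
# Dense analogue of the left-looking loop above: column k of L and U is
# obtained from a forward solve against the already computed columns
# 1..k-1 of L. Illustrative sketch only: no pivoting, dense storage.
def left_looking_lu(A):
    n = len(A)
    L = [[float(i == j) for j in range(n)] for i in range(n)]  # L = I
    U = [[0.0] * n for _ in range(n)]
    for k in range(n):
        x = [A[i][k] for i in range(n)]       # x = A(:,k)
        for j in range(k):                    # forward solve x = L \ A(:,k)
            for i in range(j + 1, n):
                x[i] -= L[i][j] * x[j]
        for i in range(k + 1):                # U(1:k,k) = x(1:k)
            U[i][k] = x[i]
        for i in range(k + 1, n):             # L(k:n,k) = x(k:n) / U(k,k)
            L[i][k] = x[i] / U[k][k]
    return L, U
```

Note that the forward solve only touches columns to the left of k, which is what makes the algorithm "left-looking".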

2.5 Maximum Transversal

Duff [11, 12] proposed an algorithm for determining the maximum transversal

of a directed graph. The purpose of the algorithm is to find a row permutation that

minimizes the zeros on the diagonal of the matrix. For nonsingular matrices, the

algorithm ensures a zero free diagonal. KLU employs Duff's [11, 12] algorithm

to find an unsymmetric permutation of the input matrix to determine a zero-

free diagonal. A matrix cannot be permuted to have a zero free diagonal if and

only if it is structurally singular. A matrix is structurally singular if there is no

permutation of its nonzero pattern that makes it numerically nonsingular.









A transversal is defined as a set of nonzeros, no two of which lie in the same

row or column, on the diagonal of the permuted matrix. A transversal of maximum

length is the maximum transversal.

Duff's maximum transversal algorithm consists of representing the matrix as
a graph with each vertex corresponding to a row in the matrix. An edge ik → ik+1
exists in the graph if A(ik, jk+1) is a nonzero and A(ik+1, jk+1) is an element in the
transversal set. A path between vertices i0 and ik would consist of a sequence of
nonzeros (i0, j1), (i1, j2), ..., (ik-1, jk), where the current transversal includes
(i1, j1), (i2, j2), ..., (ik, jk). If there is a nonzero in position (ik, jk+1) and no nonzero
in row i0 or column jk+1 is currently on the transversal, the algorithm increases the
transversal by one by adding the nonzeros (ir, jr+1), r = 0, 1, ..., k to the transversal
and removing the nonzeros (ir, jr), r = 1, 2, ..., k from the transversal. This adding
and removing of nonzeros to and from the transversal is called a reassignment chain
or augmenting path.

A vertex or row is said to be assigned if a nonzero in the row is chosen for

the transversal. The process of constructing augmenting paths is done by doing

a depth first search from an unassigned row i0 of the matrix and continuing until a
vertex ik is reached where the path terminates because A(ik, jk+1) is a nonzero and
column jk+1 is unassigned. Then the search backtracks to i0, adding and removing

transversal elements thus constructing an augmenting path.

Duff's maximum transversal algorithm has a worst case time
complexity of O(nτ), where τ is the number of nonzeros in the matrix and n is the
order of the matrix. However in practice, the time complexity is close to O(n + τ).

The maximum transversal problem can be cast as a maximum matching
problem on bipartite graphs. This is only to make a comparison. The maximum
matching problem is stated as follows.










[ A11  A12  A13  A14 ] [ x1 ]   [ b1 ]
[  0   A22  A23  A24 ] [ x2 ] = [ b2 ]
[  0    0   A33  A34 ] [ x3 ]   [ b3 ]
[  0    0    0   A44 ] [ x4 ]   [ b4 ]

Figure 2-2: A matrix permuted to BTF form

Given an undirected graph G = (V, E), a matching is a subset of the edges
M ⊆ E such that for all vertices v ∈ V, at most one edge of M is incident on
v. A vertex v ∈ V is matched if some edge in M is incident on v; otherwise, v is
unmatched. A maximum matching is a matching of maximum cardinality, that is, a
matching M such that for any matching M', we have |M| ≥ |M'|.
A maximal matching, one that cannot be extended by adding another edge, can be
built incrementally by picking an arbitrary edge e in the graph, deleting any edge
that shares a vertex with e and repeating until the graph is out of edges.
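The incremental construction just described can be sketched as follows; edges are (u, v) tuples over arbitrary vertex labels, and the function name is illustrative.

```python
# Greedy sketch of building a maximal matching: take any remaining edge,
# then drop every edge that shares a vertex with it, and repeat.
def maximal_matching(edges):
    matched, M = set(), []
    for u, v in edges:                # pick an arbitrary remaining edge
        if u not in matched and v not in matched:
            M.append((u, v))          # keep the edge in the matching
            matched.update((u, v))    # this deletes all edges touching u or v
    return M
```

A maximal matching built this way is not necessarily maximum; augmenting paths, as in Duff's algorithm, are what close that gap.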
2.6 Block Triangular Form

A block (upper) triangular matrix is similar to an upper triangular matrix
except that the diagonal in the former consists of square blocks instead of scalars.
Figure 2-2 shows a matrix permuted to the BTF form.

Converting the input matrix to block triangular form is important in that,
1. The part of the matrix below the block diagonal requires no factorization
effort.









2. The diagonal blocks are independent of each other. Only the blocks need to
be factorized. For example, in figure 2-2, the subsystem A44 x4 = b4 can be
solved independently for x4, and x4 can be eliminated from the overall system.
The system A33 x3 = b3 - A34 x4 is then solved for x3 and so on.

3. The off diagonal nonzeros do not contribute to any fill-in.

Finding a symmetric permutation of a matrix to its BTF form is equivalent

to finding the strongly connected components of a graph. A strongly connected

component of a directed graph G = (V, E) is a maximal set of vertices C C V such

that for every pair of vertices u and v in C, there are paths both from u to v and from v to u. The

vertices u and v are reachable from each other.

The algorithm employed in KLU for symmetric permutation of a matrix to a
BTF form is based on Duff and Reid's [13, 14] algorithm. Duff and Reid provide

an implementation for Tarjan's [15] algorithm to determine the strongly connected

components of a directed graph. The algorithm has a time complexity of O(n + τ),
where n is the order of the matrix and τ is the number of off-diagonal nonzeros in

the matrix.

The algorithm essentially consists of doing a depth first search from unvisited

nodes in the graph. It uses a stack to keep track of the nodes being visited and
maintains a path of the visited nodes. When all edges of the nodes in the path are
explored, it generates strongly connected components from the top of the stack.

Duff's algorithm is different from the method proposed by Cormen, Leiserson,

Rivest and Stein [16]. They suggest doing a depth first search on G, computing G^T
and then running a depth first search on G^T on vertices in the decreasing order of
their finish times from the first depth first search (the topological order from the
first depth first search).
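The two-pass method of Cormen et al. can be sketched as follows; the function name and edge-list input format are illustrative assumptions.

```python
# Sketch of the two-pass SCC method described above: a DFS on G records
# finish times, then a DFS on the transpose G^T in decreasing finish-time
# order labels one strongly connected component per search tree.
def strongly_connected_components(n, edges):
    graph = [[] for _ in range(n)]
    gt = [[] for _ in range(n)]
    for u, v in edges:
        graph[u].append(v)
        gt[v].append(u)                   # transpose graph G^T

    finish, seen = [], [False] * n
    def dfs1(u):
        seen[u] = True
        for v in graph[u]:
            if not seen[v]:
                dfs1(v)
        finish.append(u)                  # appended in order of finish time
    for u in range(n):
        if not seen[u]:
            dfs1(u)

    comp = [-1] * n
    def dfs2(u, c):
        comp[u] = c
        for v in gt[u]:
            if comp[v] == -1:
                dfs2(v, c)
    c = 0
    for u in reversed(finish):            # decreasing finish time
        if comp[u] == -1:
            dfs2(u, c)                    # everything reached is in u's SCC
            c += 1
    return comp                           # component label per vertex
```

Duff and Reid's implementation of Tarjan's algorithm achieves the same result in a single depth first search.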











Figure 2-3: A symmetric pruning scenario

2.7 Symmetric Pruning

Eisenstat and Liu [17] proposed a method called Symmetric Pruning to exploit

structural symmetry for cutting down the symbolic analysis time. The cost of

depth first search can be cut down by pruning unnecessary edges in the graph of

L, G(L). The idea is to replace G(L) by a reduced graph of L. Any graph H can be

used in place of G(L), provided that a path from i to j exists in H iff it exists in G(L). If A

is symmetric, then the symmetric reduction is just the elimination tree.

The symmetric reduction is a subgraph of G(L). It has fewer edges than G(L)

and is easier to compute by taking advantage of symmetry in the structure of

the factors L and U. Even though the symmetric reduction removes edges, it still

preserves the paths between vertices of the original graph.

Figure 2-3 shows a symmetric pruning example.

If lij ≠ 0 and uji ≠ 0, then we can prune the edges j → s, where s > i. The reason
behind this is that for any asj ≠ 0 with s > i, the entry asi will fill in from column j of L.









The just computed column i of L is used to prune earlier columns. Any future

depth first search from vertex j will not visit vertex s, since s would have been

visited via i already.

Note that every column is pruned only once. KLU employs symmetric pruning

to speed up the depth first search in the symbolic analysis stage.

2.8 Ordering

It is a widely used practice to precede the factorization step of a sparse

linear system by an ordering phase. The purpose of the ordering is to generate a

permutation P that reduces the fill-in in the factorization phase of PAP^T. A fill-in

is defined as a nonzero in a position (i, j) of the factor that was zero in the original

matrix. In other words, we have a fill-in if Lij ≠ 0 where Aij = 0.

The permuted matrix PAP^T created by the ordering incurs much less fill-in

in factorization phase than the unpermuted matrix A. The ordering mechanism

typically takes into account only the structure of the input matrix, without

considering the numerical values stored in the matrix. Partial pivoting during

factorization changes the row permutation P and hence could potentially increase

fill-in as opposed to what was estimated by the ordering scheme. We shall see more

about pivoting in the following sections.

If the input matrix A is unsymmetric, then the permutation of the matrix

A + A^T can be used. Various minimum degree algorithms can be used for ordering.

Some of the popular ordering schemes include approximate minimum degree (AMD)
[18, 19] and column approximate minimum degree (COLAMD) [20, 21], among others.
COLAMD orders the matrix A^T A without forming it explicitly.

After permuting an input matrix A into BTF form using the maximum

transversal and BTF orderings, KLU attempts to factorize each of the diagonal

blocks. It applies the fill reducing ordering algorithm on the block before factoriz-

ing it. KLU supports both approximate minimum degree and column approximate









minimum degree. Besides, any given ordering algorithm can be plugged into

KLU without much effort. Work is being done on integrating a Nested Dissection

ordering strategy into KLU as well.

Of the various ordering schemes, AMD gives the best results on circuit matrices.

AMD finds a permutation P to reduce fill-in for the Cholesky factorization of

PAP^T (or of P(A + A^T)P^T, if A is unsymmetric). AMD assumes no numerical

pivoting. AMD attempts to reduce an optimistic estimate of fill-in.

COLAMD is an unsymmetric ordering scheme that computes a column
permutation Q to reduce fill-in for the Cholesky factorization of (AQ)^T (AQ).
COLAMD attempts to reduce a "pessimistic" estimate (upper bound) of fill-in.

Nested Dissection is another ordering scheme that creates a permutation

such that the input matrix is transformed into block diagonal form with vertex

separators. This is a popular ordering scheme. However, it is unsuitable for

circuit matrices when applied to the matrix as such. It can be used on the blocks

generated by BTF pre-ordering.

The idea behind a minimum degree algorithm is as follows: A structurally

symmetric matrix A can be represented by an equivalent undirected graph G(V, E)

with vertices corresponding to row/column indices. An edge (i, j) exists in G if
Aij ≠ 0.

Consider the figure 2-4. If the matrix is factorized with vertex 1 as the pivot,

then after the first Gaussian elimination step, the matrix would be transformed as

in figure 2-5.

This first step of elimination can be considered equivalent to removing node

1 and all its edges from the graph and adding edges to connect all nodes adjacent
to 1. In other words, the elimination has created a clique of the nodes adjacent

to the eliminated node. Note that there are as many fill-ins in the reduced matrix

as there are edges added in the clique formation. In the above example, we have


















Figure 2-4: A symmetric matrix and its graph representation



Figure 2-5: The matrix and its graph representation after one step of Gaussian
elimination









chosen the wrong node as pivot, since node 1 has the maximum degree. Instead, if
we had chosen a node with minimum degree, say 3 or 5, as pivot, then there would
have been zero fill-in after the elimination, since both 3 and 5 have degree 1.

This is the key idea in a minimum degree algorithm. It generates a permuta-

tion such that a node with minimum degree is eliminated in each step of Gaussian

elimination, thus ensuring a minimal fill-in. The algorithm does not examine the

numerical values in the node selection process. It could happen that during partial

pivoting, a node other than the one suggested by the minimum degree algorithm

must be chosen as pivot because of its numerical magnitude. That's exactly the

reason why the fill-in estimate produced by the ordering algorithm could be less

than that experienced in the factorization phase.
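The greedy process just described can be sketched naively as follows, with the graph given as adjacency sets; real codes such as AMD use quotient graphs and approximate degrees rather than this explicit clique formation, so this is illustrative only.

```python
# Naive minimum degree sketch: repeatedly eliminate a node of least
# degree and connect its neighbors into a clique (the fill-in edges).
def minimum_degree_order(adj):
    adj = {v: set(nbrs) for v, nbrs in adj.items()}  # work on a copy
    order = []
    while adj:
        v = min(adj, key=lambda u: len(adj[u]))      # node of least degree
        nbrs = adj.pop(v)                            # eliminate v
        order.append(v)
        for u in nbrs:
            adj[u].discard(v)
        for u in nbrs:                               # clique of v's neighbors
            for w in nbrs:
                if u != w:
                    adj[u].add(w)
    return order

# A star-like example in the spirit of figure 2-4: node 1 has maximum
# degree, nodes 3 and 5 have degree 1 and are eliminated first.
order = minimum_degree_order({1: {2, 3, 4, 5}, 2: {1, 4},
                              3: {1}, 4: {1, 2}, 5: {1}})
```

Counting the edges added inside the clique loop gives exactly the fill-in estimate the ordering predicts.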

2.9 Pivoting

Gaussian elimination fails when the diagonal element in the input matrix

happens to be zero. Consider a simple 2 × 2 system,


[ 0    a12 ] [ x1 ]   [ b1 ]
[ a21  a22 ] [ x2 ] = [ b2 ]        (2-23)

When solving the above system, Gaussian elimination computes the multiplier

-a21/a11, multiplies row 1 with this multiplier and adds it to row 2, thus
eliminating the coefficient element a21 from the matrix. This step obviously would
fail, since a11 is zero. Now let's see a classical case when the diagonal element is

nonzero but close to zero.


A = [ 0.0001  1 ]        (2-24)
    [ 1       1 ]

The multiplier is -1/0.0001 = -10^4. The factors L and U are











L = [ 1     0 ]        U = [ 0.0001    1    ]        (2-25)
    [ 10^4  1 ]            [ 0        -10^4 ]

The element u22 has the actual value 1 - 10^4. However, assuming a four digit
arithmetic, it would be rounded off to -10^4. Note that the product of L and U is

L U = [ 0.0001  1 ]        (2-26)
      [ 1       0 ]

which is different from the original matrix. The reason for this problem is that the

multiplier computed is so large that, when the scaled row is added, it obscures the
small element a22 with value 1; the tiny value present in a22 is lost.

We can solve these problems with pivoting. In the above two examples,

we could interchange rows 1 and 2, to solve the problem. This mechanism of

interchanging rows (and columns) and picking a large element as the diagonal, to

avoid numerical failures or inaccuracies is called pivoting. To pick a numerically

large element as pivot, we could look at the elements in the current column or we

could look at the entire submatrix (across both rows and columns). The former is

called partial pivoting and the latter is called complete pivoting.

For dense matrices, partial pivoting adds a time complexity of O(n^2)
comparisons to Gaussian elimination and complete pivoting adds O(n^3) comparisons.

Complete pivoting is expensive and hence is generally avoided, except for special

cases. KLU employs partial pivoting with diagonal preference. As long as the

diagonal element is at least a constant threshold times the largest element in the

column, we choose the diagonal as the pivot. This constant threshold is called pivot

tolerance.
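The diagonal-preference rule can be sketched as follows; the function name and the sample tolerance value are illustrative assumptions, not KLU's actual interface.

```python
# Sketch of threshold partial pivoting with diagonal preference: keep
# the diagonal entry as pivot whenever it is at least `tol` (the pivot
# tolerance) times the largest magnitude in the column.
def choose_pivot(col, k, tol=0.001):
    # col[k:] holds the candidate entries of column k on and below the diagonal
    largest = max(range(k, len(col)), key=lambda i: abs(col[i]))
    if abs(col[k]) >= tol * abs(col[largest]):
        return k            # diagonal entry passes the threshold: keep it
    return largest          # otherwise fall back to the largest entry
```

Preferring the diagonal keeps the row permutation close to the one the fill-reducing ordering assumed, at a small cost in numerical safety controlled by the tolerance.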

2.10 Scaling

The case where small elements in the matrix get obscured during the elimination
process and the accuracy of the results gets skewed because of numerical addition









is not completely overcome by the pivoting process. Let us see an example of this

case.

Consider the 2 × 2 system


[ 1  10^4 ] [ x1 ]   [ 10^4 ]
[ 1   1   ] [ x2 ] = [  2   ]        (2-27)

When we apply Gaussian elimination with partial pivoting to the above
system, the entry a11 is as large as any other entry in the first column and hence
continues to be the pivot. After the first step of elimination, assuming a four digit
arithmetic, we would have

[ 1  10^4  ] [ x1 ]   [ 10^4  ]
[ 0  -10^4 ] [ x2 ] = [ -10^4 ]        (2-28)

The solution from the above elimination is x1 = 0, x2 = 1. However, the correct
solution is close to x1 = 1, x2 = 1.

If we divide each row of the matrix by the largest element in that row (and the
corresponding element in the right hand side as well) prior to Gaussian elimination,

we would have


[ 10^-4  1 ] [ x1 ]   [ 1 ]
[ 1      1 ] [ x2 ] = [ 2 ]        (2-29)

Now if we apply partial pivoting we would have,


[ 1      1 ] [ x1 ]   [ 2 ]
[ 10^-4  1 ] [ x2 ] = [ 1 ]        (2-30)

And after an elimination step, the result would be











[ 1   1         ] [ x1 ]   [ 2             ]
[ 0   1 - 10^-4 ] [ x2 ] = [ 1 - 2 × 10^-4 ]        (2-31)

which yields the correct solution x1 = 1, x2 = 1. The process of balancing out
the numerical disparity across rows or columns is termed scaling.

In the above example, we have scaled with respect to the maximum value in a row

which is row scaling. Another variant would be to scale with respect to the sum of

absolute values of all elements across a row.

In column scaling, we would scale with respect to the maximum value in a
column or the sum of absolute values of all elements in a column.

Row scaling can be considered equivalent to finding an invertible diagonal

matrix D1 such that all the rows in the matrix D1^-1 A have equally large numerical

values.

Once we have such a D1, the solution of the original system Ax = b is
equivalent to solving the system A'x = b', where A' = D1^-1 A and b' = D1^-1 b.

Equilibration is another popular term used for scaling.

In KLU, the diagonal elements of the diagonal matrix D1 are either the largest

elements in the rows of the original matrix or the sum of the absolute values of the

elements in the rows. Besides, scaling can be turned off as well, if the simulation
environment does not need scaling. Scaling, though it offers better numerical results
when solving systems, is not mandatory. Its usage depends on the data values that
constitute the system, and if the values are already balanced, scaling might not be

necessary.
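A minimal sketch of row scaling as described above (dividing each row and the matching right hand side entry by the row's largest magnitude); the helper name is hypothetical.

```python
# Row scaling sketch: divide each row of A (and the corresponding entry
# of b) by the largest absolute value in that row, i.e. apply D1^-1.
def row_scale(A, b):
    for i in range(len(A)):
        d = max(abs(x) for x in A[i])     # the ith diagonal entry of D1
        A[i] = [x / d for x in A[i]]
        b[i] = b[i] / d
    return A, b
```

Applied to the system (2-27), this reproduces the scaled system (2-29).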

2.11 Growth Factor

Pivot growth factor is a key diagnostic estimate in determining the stability

of Gaussian elimination. Stability of Numerical Algorithms is an important factor

in determining the accuracy of the solution. Study of stability is done by a process









called Roundoff Error analysis. Roundoff error analysis comprises two sub types

called Forward error analysis and Backward error analysis. If the computed

solution x' is close to the exact solution x, then the algorithm is said to be Forward

stable. If the algorithm computes an exact solution to a nearby problem, then the

algorithm is said to be Backward stable. Backward stability is the most widely used

technique in studying stability of systems. Often the data generated for solving

systems have impurity in them or they are distorted by a small amount. Under

such circumstances we are interested that the algorithm produce an exact solution

to this nearby problem and hence the relevance of backward stability assumes

significance.

Pivot growth factor is formally defined as

ρ = ( max_k max_{i,j} |a_ij^(k)| ) / ( max_{i,j} |a_ij| )        (2-32)

where a_ij^(k) is an entry in the reduced matrix A^(k) after the kth elimination step.

From (2-32), we find that if the entries of the reduced matrix grow arbitrarily,

we would have a high growth factor. This arbitrary growth would again lead to

inaccuracies in the results. Consider the following 2 × 2 system.



[ 10^-4  1 ] [ x1 ]   [ 1 ]
[ 1      1 ] [ x2 ] = [ 2 ]        (2-33)

After one step of Gaussian elimination without pivoting, assuming four digit
arithmetic, we would have the reduced system

[ 10^-4  1        ] [ x1 ]   [ 1        ]
[ 0      1 - 10^4 ] [ x2 ] = [ 2 - 10^4 ]        (2-34)

Solving the system yields x1 = 0, x2 = 1, which is different from the actual
solution x1 ≈ 1, x2 ≈ 1. The pivot growth factor of the above system is









ρ = max(1, 10^4) / 1 = 10^4
Thus a large pivot growth clearly indicates the inaccuracy in the result. Partial

pivoting generally avoids large growth factors. In the above example, if we had
applied partial pivoting, we would have obtained the correct results. But this is not

assured and there are cases where partial pivoting might not result in an acceptable

growth factor. This necessitates the estimation of the growth factor as a diagnostic

tool to detect cases where Gaussian elimination could be unstable.

Pivot growth factor is usually calculated in terms of its reciprocal, to avoid
numerical overflow problems when the value is very large. (2-32) is a harder
equation to compute, since it involves calculating the maximum of the reduced
matrix after every step of elimination. Other definitions of the reciprocal growth
factor that are easier to compute are as follows:

1/ρ = min_j ( max_i |a_ij| / max_i |u_ij| )        (2-35)

1/ρ = ( max_{i,j} |a_ij| ) / ( max_{i,j} |u_ij| )        (2-36)
Equation (2-35) is the definition implemented in KLU and it is a column

scaling invariant. It helps unmask a large pivot growth that could be totally

masked because of column scaling.

2.12 Condition Number

Growth factor is a key estimate in determining the stability of the algorithm.

Condition number is a key estimate in determining the amenability or conditioning

of a given problem. It is not guaranteed that a highly stable algorithm can yield

accurate results for all problems it can solve. The conditioning of the problem has

a dominant effect on the accuracy of the solution.

Note that while stability deals with the algorithm, conditioning deals with

the problem itself. In practical applications like circuit simulation, the data of









a problem come from experimental observations. Typically such data have a

factor of error or impurities or noise associated with them. Roundoff errors and

discretization errors also contribute to impurities in the data. Conditioning of a

problem deals with determining how the solution of the problem changes in the

presence of impurities.

The preceding discussion shows that one often deals with solving problems

not with the original data but that with perturbed data. The analysis of effect

of perturbation of the problem on the solution is called Perturbation analysis. It

helps in determining whether a given problem produces a little or huge variation

in solution when perturbed. Let us see what we mean by well or ill conditioned

problems.

A problem is said to be ill conditioned if a small relative error in data leads

to a large relative error in solution irrespective of the algorithm employed. A

problem is said to be well conditioned if a small relative error in data does not

lead to a large relative error in solution.

Accuracy of the computed solution is of primary importance in numerical

analysis. Stability of the algorithm and the Conditioning of the given problem are

the two factors that directly determine accuracy. A highly stable algorithm, well
armored with scaling, partial pivoting and other concepts, cannot be guaranteed to

yield an accurate solution to an ill-conditioned problem.

A backward stable algorithm applied to a well-conditioned problem should

yield a solution close to the exact solution. This follows from the definitions of

backward stability and well-conditionedness, where backward stability assures an exact

solution to a nearby problem and well-conditioned problem assures that the

computed solution to perturbed data is relatively close to the exact solution of the

actual problem.









Mathematically, let X be some problem. Let X(d) be the solution to the

problem for some input d. Let δd denote a small perturbation in the input d. Now
if the relative error in the solution

|X(d + δd) - X(d)| / |X(d)|

exceeds the relative error in the input

|δd| / |d|

then the problem is ill conditioned, and well conditioned otherwise.

Condition number is a measure of the conditioning of the problem. It shows

whether a problem is well or ill conditioned. For the linear system problems of the

form Ax = b, the condition number is defined as


Cond(A) = ||A|| ||A^-1||        (2-37)

Equation (2-37) is arrived at by theory that deals with perturbations either

in the input matrix A or the right hand side b or both the matrix and right hand

side. Equation (2-37) can be defined with respect to any norm, viz. the 1, 2 or ∞ norm. The

system Ax = b is said to be ill-conditioned if the condition number from (2-37) is

quite large. Otherwise it is said to be well-conditioned.

A naive way to compute the condition number would be to compute the

inverse of the matrix, compute the norm of the matrix and its inverse and compute

the product. However, computing the inverse is at least three times as expensive as

solving the linear system Ax = b and hence should be avoided.

Hager [22] developed a method for estimating ||A^-1||_1 and the
corresponding 1-norm condition number. Hager proposed an optimization approach
for estimating ||A^-1||_1. The 1-norm of a matrix is formally defined as











||A||_1 = max_j Σ_i |a_ij|        (2-38)

Hager's algorithm can be briefly described as follows. For A ∈ R^{n×n}, a convex
function is defined as

F(x) = ||Ax||_1        (2-39)

over the convex set

S = {x ∈ R^n : ||x||_1 ≤ 1}

Then ||A||_1 is the global maximum of (2-39) over S.

The algorithm involves computing the matrix-vector products Ax and A^T x.

When we want to estimate ||A^-1||_1, it instead involves computing A^-1 x and

(A^-1)^T x, which is equivalent to solving Ax = b and A^T x = b. We can use KLU to

efficiently solve these systems.

Higham [23] presents refinements to Hager's algorithm and restricts the

number of iterations to five. Higham further presents a simple device, using

the higher of the estimates from this device and from Hager's algorithm to ensure the

estimate is large enough. This device involves solving the linear system Ax = b

where

b_i = (-1)^(i+1) * (1 + (i-1)/(n-1)), i = 1, 2, ..., n

The final estimate is chosen as the maximum of the estimate from Hager's

algorithm and 2||x||_1 / (3n).
KLU's condition number estimator is based on Higham's refinement of Hager's

algorithm.
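A sketch of the estimation loop, in the spirit of Hager's method with Higham's five-iteration cap, is shown below. A 2-by-2 lower triangular test matrix and dense forward/back substitution stand in for KLU's sparse solves; all names and values are illustrative, and this is not KLU's actual code.

```c
#include <assert.h>
#include <math.h>

#define N 2

/* Lower triangular test matrix A (hypothetical example data). */
static const double A[N][N] = { {2.0, 0.0}, {1.0, 4.0} };

/* Solve A*x = b by forward substitution. */
static void solve(double x[N], const double b[N]) {
    for (int i = 0; i < N; i++) {
        double s = b[i];
        for (int j = 0; j < i; j++) s -= A[i][j] * x[j];
        x[i] = s / A[i][i];
    }
}

/* Solve A'*x = b by back substitution. */
static void solve_t(double x[N], const double b[N]) {
    for (int i = N - 1; i >= 0; i--) {
        double s = b[i];
        for (int j = i + 1; j < N; j++) s -= A[j][i] * x[j];
        x[i] = s / A[i][i];
    }
}

/* Hager-style estimator for ||A^-1||_1, capped at five iterations. */
double est_inv_norm1(void) {
    double x[N], y[N], xi[N], z[N], est = 0.0;
    for (int i = 0; i < N; i++) x[i] = 1.0 / N;
    for (int iter = 0; iter < 5; iter++) {
        solve(y, x);                          /* y = A^-1 x  */
        est = 0.0;
        for (int i = 0; i < N; i++) est += fabs(y[i]);
        for (int i = 0; i < N; i++) xi[i] = (y[i] >= 0.0) ? 1.0 : -1.0;
        solve_t(z, xi);                       /* z = A^-T xi */
        int jmax = 0;
        for (int i = 1; i < N; i++) if (fabs(z[i]) > fabs(z[jmax])) jmax = i;
        double ztx = 0.0;
        for (int i = 0; i < N; i++) ztx += z[i] * x[i];
        if (fabs(z[jmax]) <= ztx) break;      /* local maximum reached */
        for (int i = 0; i < N; i++) x[i] = 0.0;
        x[jmax] = 1.0;                        /* restart from unit vector e_j */
    }
    return est;
}
```

For this matrix, ||A^-1||_1 = 0.625 exactly, and the loop reaches it in two iterations; multiplying by ||A||_1 would give the condition number estimate.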

2.13 Depth First Search

As we discussed earlier, the nonzero pattern of the kth column of L is determined

by the reachability of the row indices of the kth column of A in the graph of L.









The reachability is determined by a depth-first search traversal of the graph of L.

The topological order for elimination of variables when solving the lower triangular

system Lx = b is also determined by the depth-first search traversal. A classical

depth first search algorithm is a recursive one. One of the major problems in a

recursive implementation of depth-first search is stack overflow. Each process is

allocated a stack space upon execution. When there is a large number of recursive

calls, the stack space is exhausted and the process terminates abruptly. This is a

definite possibility in the context of our depth-first search algorithm when we have

a dense column of a matrix of a very high dimension.

The solution to stack overflow caused by recursion is to replace recursion

by iteration. With an iterative or non-recursive function, the entire depth first

search happens in a single function stack. The iterative solution uses an array

of row indices called pstack. When descending to an adjacent node during the

search, the row index of the next adjacent node is stored in the pstack at the

position (row/column index) corresponding to the current node. When the search

returns to the current node, we know that we next need to descend into the node

stored in the pstack at the position corresponding to the current node. Using this

extra O(n) memory, the iterative version completes the depth first search in a

single function stack.

This is an important improvement from the recursive version since it avoids

the stack overflow problem that would have been a bottleneck when solving high

dimension systems.
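A sketch of the non-recursive search follows. The graph of L is stored column-wise in the usual sparse style (column pointers Lp and row indices Li); the tiny example pattern and the names stack/pstack are illustrative, not KLU's actual data structures.

```c
#include <assert.h>

#define N 4
/* Column-wise pattern of a hypothetical 4x4 lower factor L (diagonal
   omitted): column j lists the rows below j that are nonzero. */
static const int Lp[N + 1] = {0, 2, 3, 4, 4};
static const int Li[]      = {1, 3, 2, 3};

/* Non-recursive DFS from 'start'. Nodes are written into topo[] from
   position head-1 downward at finish time; returns the new head. */
int dfs(int start, int visited[N], int topo[N], int head) {
    int stack[N], pstack[N], top = 0;
    stack[0] = start;
    while (top >= 0) {
        int j = stack[top];
        if (!visited[j]) { visited[j] = 1; pstack[top] = Lp[j]; }
        int done = 1;
        for (int p = pstack[top]; p < Lp[j + 1]; p++) {
            int i = Li[p];
            if (!visited[i]) {          /* descend; remember resume point */
                pstack[top] = p + 1;
                stack[++top] = i;
                done = 0;
                break;
            }
        }
        if (done) {                     /* all descendants finished */
            top--;
            topo[--head] = j;
        }
    }
    return head;
}
```

After the call, topo[head..N-1], read left to right, is a valid topological order: each node appears before every node reachable from it.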

2.14 Memory Fragmentation

The data structures for L and U are the ones used to represent sparse matri-

ces. These comprise three vectors.

1. Vector of column pointers

2. Vector of row indices

3. Vector of numerical values

There are, overall, six vectors needed for the two matrices L and U. Of these, the

two vectors of column pointers are of known size, namely the size of a block.

The remaining four vectors of row indices and numerical values depend on the fill-

in estimated by AMD. However, AMD gives an optimistic estimate of fill-in. Hence

we need to dynamically grow memory for these vectors during the factorization

phase if we determine that the fill-in is higher than estimated. The partial pivoting

strategy can alter the row ordering determined by AMD and hence is another

source of higher fill-in than the estimate from AMD.

Dynamically growing these four vectors suffers from the problem of external

memory fragmentation. In external fragmentation, free memory is scattered in

the memory space. A call for more memory fails because of non-availability of

contiguous free memory space. If the scattered free memory areas were contiguous,

the memory request would have succeeded. In the context of our problem, the

memory request to grow the four vectors could either fail if we run into external

fragmentation or succeed when there is enough free space available.

When we reallocate or grow memory, there are two types of success cases. In

the first case called cheap reallocation, there is enough free memory space abutting

the four vectors. Here the memory occupied by a vector is just extended or its end

boundary is increased. The start boundary remains the same. In the second case

called costly reallocation, there is not enough free memory space abutting a vector.

Hence a fresh memory is allocated in another region for the new size of vector and

the contents are copied from old location. Finally the old location is freed.

With four vectors to grow, there is a failure case because of external fragmen-

tation and a costly success case because of costly reallocation. To reduce the failure

case and avoid the costly success case, we have coalesced the four vectors into a

single vector. This new data structure is byte aligned on double boundary. For

every column of L and U, the vector contains the row indices and numerical values

of L followed by the row indices and numerical values of U. Multiple integer row

indices are stored in a single double location. The actual number of integers that

can be stored in a double location varies with platform and is determined dynam-

ically. The common technique of using an integer pointer to point to a location aligned

on a double boundary is employed to retrieve or save the row indices.

In addition to this coalesced data structure containing the row indices and

numerical values, two more length vectors of size n are needed to contain the length

of each column of L and U. These length vectors are preallocated once and need

not be grown dynamically.
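The size arithmetic for this coalesced layout can be sketched as follows; the helper names are hypothetical. The point is that several integer row indices are packed into each double-sized unit, with the count taken from the platform's type sizes at run time, so that the numerical values that follow stay aligned on a double boundary.

```c
#include <assert.h>
#include <stddef.h>

/* Units (doubles) needed to hold n integer row indices, rounded up so the
   numerical values that follow stay aligned on a double boundary. */
static size_t index_units(size_t n) {
    size_t ipd = sizeof(double) / sizeof(int);   /* platform-dependent */
    return (n + ipd - 1) / ipd;
}

/* Doubles needed for one column of the coalesced structure:
   L row indices, L values, then U row indices, U values. */
size_t column_units(size_t lnz, size_t unz) {
    return index_units(lnz) + lnz + index_units(unz) + unz;
}
```

On a platform with 8-byte doubles and 4-byte ints, a column with 3 entries in L and 2 in U needs 2 + 3 + 1 + 2 = 8 doubles.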

Some memory management schemes never do cheap reallocation. In such

schemes, the new data structure serves to reduce external fragmentation only.

2.15 Complex Number Support

KLU supports complex matrices and complex right hand sides. KLU also

supports solving the transpose system ATx = b for real matrices and solving the

conjugate transpose system AHx = b for complex matrices. Initially it relied on the

C99 language support for complex numbers. However the C99 specification is not

supported across operating systems. For example, earlier versions of Sun Solaris do

not support C99. To avoid these compatibility issues, KLU no longer relies on C99

and has its own complex arithmetic implementation.
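A self-contained complex multiply and divide might look like the sketch below; the type and function names are made up and do not reflect KLU's internals. The division uses Smith's scaling, a common precaution in hand-rolled complex arithmetic that avoids overflow in the intermediate products.

```c
#include <assert.h>
#include <math.h>

/* Illustrative complex type; the real implementation may differ. */
typedef struct { double r, i; } zdouble;

static zdouble zmul(zdouble a, zdouble b) {
    zdouble c;
    c.r = a.r * b.r - a.i * b.i;
    c.i = a.r * b.i + a.i * b.r;
    return c;
}

/* Division using Smith's scaling to avoid overflowing intermediates. */
static zdouble zdiv(zdouble a, zdouble b) {
    zdouble c;
    if (fabs(b.r) >= fabs(b.i)) {
        double t = b.i / b.r, den = b.r + b.i * t;
        c.r = (a.r + a.i * t) / den;
        c.i = (a.i - a.r * t) / den;
    } else {
        double t = b.r / b.i, den = b.i + b.r * t;
        c.r = (a.r * t + a.i) / den;
        c.i = (a.i * t - a.r) / den;
    }
    return c;
}
```

For example, (1 + 2i)(3 + 4i) = -5 + 10i, and dividing that product by 3 + 4i recovers 1 + 2i.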

2.16 Parallelism in KLU

When solving a system Ax = b using KLU, we use BTF pre-ordering to

convert A into a block upper triangular form. We apply AMD on each block and

factorize each block one after the other serially. Alternatively, nested dissection can

be applied to each block. Nested dissection ordering converts a block to a doubly

bordered block diagonal form. A doubly bordered block diagonal form is similar to

a block upper triangular form, but has nonzeros in the sub-diagonal region as well. These










[Figure: a doubly bordered block diagonal matrix (left) and its separator tree (right)]

Figure 2-6: A doubly bordered block diagonal matrix and its corresponding vertex
separator tree

nonzeros form a horizontal strip resembling a border. Similarly the nonzeros in the

region above the diagonal form a corresponding vertical strip.

The doubly bordered block diagonal form can be thought of as a separator
tree. Factorization of the block then involves a post-order traversal of the separator
tree. The nodes in the separator tree can be factorized in parallel. The factorization
of a node would additionally involve computing the Schur complement of
its parent and of its ancestors in the tree. Once all the children of a node have
updated its Schur complement, the node is ready to be factorized, and it in turn
computes the Schur complements of its parent and its ancestors. The factorization
and computation of Schur complements is done in a post-order traversal fashion,
and the process stops at the root.
Parallelism can help in reducing the factorization time. It gains importance in

the context of multiprocessor systems. Work is being done to enable parallelism in
KLU.















CHAPTER 3
CIRCUIT SIMULATION: APPLICATION OF KLU

The KLU algorithm comprises the following steps:

1. Unsymmetric Permutation to block upper triangular form. This consists of

two steps.

(a) unsymmetric permutation to ensure a zero-free diagonal using maximum

transversal.

(b) symmetric permutation to block upper triangular form by finding the

strongly connected components of the graph.

2. Symmetric permutation of each block (say A) using AMD on A + A^T, or an

unsymmetric permutation of each block using COLAMD on AA^T. These

permutations are fill-in reducing orderings on each block.

3. Factorization of each scaled block using Gilbert-Peierls' left looking algorithm

with partial pivoting.

4. Solve the system using block-back substitution and account for the off-

diagonal entries. The solution is re-permuted to bring it back to original

order.

Let us first derive the final system that we need to solve, taking into account

the different permutations, scaling and pivoting. The original system to solve is

Ax = b (3-1)


Let R be the diagonal matrix with the scale factors for each row. Applying scaling,

we have

RAx = Rb (3-2)









Let P' and Q' be the row and column permutation matrices that combine the

permutations for maximum transversal and the block upper triangular form

together. Applying these permutations together, we have


P'RAQ'Q'^T x = P'Rb [Q'Q'^T = I, the identity matrix] (3-3)


Let P and Q be row and column permutation matrices that combine the P' and Q'

mentioned above with the symmetric permutation produced by AMD and the

partial pivoting row permutation produced by factorization. Now,


PRAQQ^T x = PRb

or

(PRAQ)Q^T x = PRb (3-4)

The matrix (PRAQ) consists of two parts viz. the diagonal blocks that are factor-

ized and the off-diagonal elements that are not factorized.

(PRAQ) = LU + F where LU represents the factors of all the blocks

collectively and F represents the entire off-diagonal region. Equation (3-4) now

becomes

(LU + F)Q^T x = PRb (3-5)

x = Q(LU + F)^-1 (PRb) (3-6)

Equation (3-6) consists of two steps: a block back-substitution, i.e. computing

(LU + F)^-1 (PRb), followed by applying the column permutation Q.

The block back-substitution in (LU + F)^-1 (PRb) looks cryptic and can be

better explained as follows: consider a simple 3-by-3 block system











[ L11U11  F12     F13    ] [ X1 ]   [ B1 ]
[ 0       L22U22  F23    ] [ X2 ] = [ B2 ]        (3-7)
[ 0       0       L33U33 ] [ X3 ]   [ B3 ]

The equations corresponding to the above system are:


L11U11 * X1 + F12 * X2 + F13 * X3 = B1 (3-8)

L22U22 * X2 + F23 * X3 = B2 (3-9)

L33U33 * X3 = B3 (3-10)


In block back substitution, we first solve (3-10) for X3, and then eliminate X3

from (3-9) and (3-8) using the off-diagonal entries.

Next, we solve (3-9) for X2 and eliminate X2 from (3-8).

Finally, we solve (3-8) for X1.
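The three steps above can be sketched with scalar stand-ins, where each diagonal block LkkUkk collapses to a single number, F holds the off-diagonal entries, and the numerical values are made up for the example.

```c
#include <assert.h>
#include <math.h>

#define NB 3
/* Hypothetical 1x1 blocks: diagonal factors and off-diagonal entries. */
static const double D[NB]     = {2.0, 4.0, 5.0};   /* Lkk*Ukk collapsed  */
static const double F[NB][NB] = { {0, 1.0, 3.0},
                                  {0, 0,   2.0},
                                  {0, 0,   0  } }; /* strictly upper     */

/* Block back-substitution: solve the k-th diagonal block, then subtract
   its contribution from every earlier right-hand side. */
void block_back_substitute(double x[NB], const double b[NB]) {
    double r[NB];
    for (int k = 0; k < NB; k++) r[k] = b[k];
    for (int k = NB - 1; k >= 0; k--) {
        x[k] = r[k] / D[k];                  /* "solve" the 1x1 block */
        for (int i = 0; i < k; i++)
            r[i] -= F[i][k] * x[k];          /* off-diagonal update   */
    }
}
```

For b = (13, 14, 10) this yields x = (2.25, 2.5, 2), which satisfies all three block equations.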

3.1 Characteristics of Circuit Matrices

Circuit matrices exhibit certain unique characteristics that make KLU more

relevant to them. They are very sparse. Because of their high sparsity, BLAS

kernels are not applicable. Circuit matrices often have a few dense rows/columns

that originate from voltage or current sources. These dense rows/columns are

effectively removed by BTF permutation.

Circuit matrices are unsymmetric, but the nonzero pattern is roughly symmet-

ric. They are easily permutable to block upper triangular form. Besides, they have

zero-free or nearly zero-free diagonal. Another peculiar feature of circuit matrices is

that the nonzero pattern of each block after permutation to block upper triangular

form, is more symmetric than the original matrix. Typical ordering strategies

applied to the original matrix cause high fill-in, whereas when applied to the blocks

they lead to less fill-in.









The efficiency of the permutation to block upper triangular form shows in the

fact that the entire sub-diagonal region of the matrix requires zero work and the off-diagonal

elements do not cause any fill-in, since they are not factorized.

3.2 Linear Systems in Circuit Simulation

The linear systems in the circuit simulation process originate from solving large

systems of nonlinear equations using Newton's method and integrating large stiff

systems of ordinary differential equations. These linear systems consist of the

coefficients matrix A, the unknowns vector x and the right hand side b.

During the course of simulation, the matrix A retains the same nonzero

pattern (it is structurally unchanged) and only undergoes changes in numerical val-

ues. Thus the initial analysis phase (permutation to ensure zero-free diagonal,

block triangular form and minimum degree ordering on blocks) and factorization

phase (that involves symbolic analysis, partial pivoting and symmetric pruning) can

be restricted to the initial system alone.

Subsequent systems A'x = b where only the numerical values of A' are different

from A, can be solved using a mechanism called refactorization. Refactorization

simply means to use the same row and column permutations (comprising entire

analysis phase and partial pivoting) computed for the initial system, for solving the

subsequent systems that have changes only in numerical values. Refactorization

substantially reduces run time since the analysis time and factorization time spent

on symbolic analysis, partial pivoting, pruning are avoided. The nonzero pattern

of the factors L and U is the same as for the initial system. Only numerical

factorization using the pre-computed nonzero pattern and partial pivoting order, is

required.

The solve step follows the factorization/refactorization step. KLU accommodates

solving multiple right-hand sides in a single solve step. Up to four right-hand sides

can be solved in a single step.
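The factor-once, refactor-many pattern can be illustrated with a toy dense LU; the function names are hypothetical and this is not KLU's interface. factor() chooses a pivot order and records it; refactor() reuses that order on a matrix with new values but the same pattern, skipping the pivot search (and, in the sparse case, all symbolic work). Reusing pivots assumes the new values keep those pivots acceptably large.

```c
#include <assert.h>
#include <math.h>

#define N 2

/* Factor A in place into L\U with partial pivoting; record the chosen
   row order in perm so it can be reused by refactor(). */
static void factor(double A[N][N], int perm[N]) {
    int rows[N];
    for (int i = 0; i < N; i++) rows[i] = i;
    for (int k = 0; k < N; k++) {
        int p = k;                            /* pick the largest pivot */
        for (int i = k + 1; i < N; i++)
            if (fabs(A[rows[i]][k]) > fabs(A[rows[p]][k])) p = i;
        int t = rows[k]; rows[k] = rows[p]; rows[p] = t;
        for (int i = k + 1; i < N; i++) {
            A[rows[i]][k] /= A[rows[k]][k];
            for (int j = k + 1; j < N; j++)
                A[rows[i]][j] -= A[rows[i]][k] * A[rows[k]][j];
        }
    }
    for (int k = 0; k < N; k++) perm[k] = rows[k];
}

/* Refactor: same elimination, but the pivot order is taken as given. */
static void refactor(double A[N][N], const int perm[N]) {
    for (int k = 0; k < N; k++)
        for (int i = k + 1; i < N; i++) {
            A[perm[i]][k] /= A[perm[k]][k];
            for (int j = k + 1; j < N; j++)
                A[perm[i]][j] -= A[perm[i]][k] * A[perm[k]][j];
        }
}
```

A driver would call factor() on the first system of a simulation run and refactor() on every subsequent system with the same pattern.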









3.3 Performance Benchmarks


During my internship at a circuit simulation company, I did performance

benchmarking of KLU vs SuperLU in the simulation environment. The perfor-

mance benchmarks were run on a representative set of examples. The results

of these benchmarks are tabulated as follows (the size of the matrix created in

simulation is shown in parentheses).

Table 3-1: Comparison between KLU and SuperLU on overall time and fill-in

Overall time Nonzeros(L+U)
Netlist KLU SuperLU Speedup KLU SuperLU
Problems (301) 1.67 1.24 0.74 1808 1968
Problem (1879) 734.96 1.- ". 0.94 13. i1 13770
Problem (2076) 56.46 53.38 0.95 16403 16551
Problem (7598) 89.63 81.85 0.91 52056 54997
Problem (745) 18.13 16.84 0.93 4156 4231
Problem (1041) 1336.50 1317.30 0.99 13198 1.'' r".
Problem (33) 0.40 0.32 0.80 157 176
Problem (10) 4.46 1.570 0.35 40 41
Problem (180) 222.26 202.29 0.91 1845 1922
Probleml0(6833) 6222.20 6410.40 1.03 56322 ..1
Problemll (1960) 181.78 179.50 0.99 13527 13963
Probleml2 (200004) 6.25 8.47 1.35 500011 600011
Probleml3 (20004) 0.47 0.57 1.22 50011 60011
Probleml4 (40004) 0.97 1.31 1.35 100011 120011
Probleml5 (100000) 1.76 2.08 1.18 299998 499932
Probleml6 (7602) 217.80 255.88 1.17 156311 184362
Probleml7(10922) 671.10 770.58 1.15 267237 299937
Probleml8 (14842) 1017.00 1238.10 1.22 326811 425661
Probleml9 (19362) 1099.00 1284.40 1.17 550409 581277
Problem20 (24482) 3029.00 3116.90 1.03 684139 788047
Problem21 (30202) 2904.00 3507.40 1.21 933131 1049463


The circuits Problem16-Problem21 are TFT LCD arrays similar to memory

circuits. These examples were run at least twice with each algorithm employed, viz.

KLU or SuperLU, to get consistent results. The results are tabulated in tables 3-1,

3-2 and 3-3. The "overall time" in table 3-1 comprises analysis, factorization


and solve time.



















Table 3-2: Comparison between KLU and SuperLU on factor time and solve time


Factor time Factor Solve time Solve
per iteration speedup per iteration speedup
Netlist KLU SuperLU KLU SuperLU
Problems (301) 0.000067 0.000084 1.26 0.000020 0.000019 0.92
Problem (1879) 0.000409 0.000377 0.92 0.000162 0.000100 0.61
Problem (2076) 0.000352 0.000317 0.90 0.000122 0.000083 0.68
Problem (7598) 0.001336 0.001318 0.99 0.000677 0.000326 0.48
Problem (745) 0.000083 0.000063 0.76 0.000035 0.000022 0.62
Problem (1041) 0.000321 0.000406 1.26 0.000079 0.000075 0.95
Problem (33) 0.000004 0.000004 0.96 0.000003 0.000002 0.73
Problem (10) 0.000001 0.000001 0.89 0.000001 0.000001 0.80
Problem (180) 0.000036 0.000042 1.16 0.000014 0.000011 0.76
Probleml0(6833) 0.001556 0.001530 0.98 0.000674 0.000365 0.54
Problemll (1960) 0.000663 0.000753 1.14 0.000136 0.000122 0.90
Probleml2 (200004) 0.103900 0.345500 3.33 0.11 I',I. 0 0.041220 1.35
Probleml3 (20004) 0.005672 0.020110 3.55 0.001633 0.002735 1.67
Probleml4 (40004) 0.014430 0.056080 3.89 0.004806 0.iiii.11 .1 1.43
Probleml5 (100000) 0.168700 0.283700 1.68 0.018600 0.033610 1.81
Probleml6 (7602) 0.009996 0.017620 1.76 0.001654 0.001439 0.87
Probleml7(10922) 0.018380 0.030010 1.63 0.002542 0.001783 0.70
Probleml8 (14842) 0.024020 0.046130 1.92 0.003187 0.002492 0.78
Probleml9 (19362) 0.054730 0.080280 1.47 0.005321 0.003620 0.68
Problem20 (24482) 0.121400 0.122600 1.01 0.006009 0.004705 0.78
Problem21 (30202) 0.124000 0.188700 1.52 0.009268 0.006778 0.73









3.4 Analyses and Findings


The following are my inferences based on the results: Most of the matrices in

these experiments are small, of the order of a few thousand unknowns.

Fill-in is much better with KLU. The BTF ordering combined with the

AMD ordering on each of the blocks does a good job of reducing the fill-in

count. The improvement in fill-in averages around 6% for many

examples.

Table 3-3: Ordering results using BTF+AMD in KLU on circuit matrices

Nonzeros Nonzeros No of Max Block Nonzeros
Netlist Size in A in L+U Blocks size off diagonal
Problems 301 1484 1808 7 295 89
Problem 1879 12926 13 .i 1 19 1861 4307
Problem 2076 15821 16403 13 2064 6689
Problem 7598 48922 52056 13 7586 19018
Problem 745 3966 4156 128 426 1719
Problem 1041 9654 13198 67 975 2608
Problem 33 153 157 7 27 50
Problem 10 39 40 5 6 16
Problem 180 1503 1845 19 162 661
ProblemlO 6833 43250 56322 507 6282 12594
Problemll 1960 11187 13527 58 1715 2959
Probleml2 200004 500011 500011 200003 2 300005
Probleml3 20004 50011 50011 20003 2 30005
Probleml4 40004 100011 100011 40003 2 60005
Probleml5 100000 299998 299998 1 100000 0
Probleml6 7602 32653 156311 103 7500 251
Probleml7 10922 46983 267237 123 10800 301
Probleml8 14842 63913 326811 143 14700 351
Probleml9 19362 83443 550409 163 19200 401
Problem20 24482 105573 684139 183 24300 451
Problem21 30202 130303 933131 203 30000 501


There is no fill-in in the Problem12, Problem13, Problem14 and Problem15

netlists with KLU. This is quite significant in that there is no memory overhead

in these examples. In the case of circuits Problem16-Problem21, the gain in fill-in

with KLU ranges from 6% in the Problem19 example to 24% in the Problem18

example.

The gain in fill-in translates into faster factorization because fewer nonzeros

imply less work. The factorization time is thus expected to be low. This turns out

to be true in most of the cases (factorization speedup of 1.6x in the Problem16-

Problem21 examples and 3x for the Problem12-Problem14 examples). For some cases

like Problem2 and Problem10, the factorization time remains the same for both KLU

and SuperLU.

The solve phase turns out to be slow in KLU. The off-diagonal nonzero

handling probably accounts for the extra time spent in the solve phase.

One way of reducing the solve overhead in KLU would be to solve multiple

right-hand sides at the same time: in a single solve iteration, four right-hand sides are solved.

On the whole, the overall time speedup is about 1.2x for the Problem16-Problem21

examples and the Problem12-Problem14 examples. For the others, the overall time is

almost the same between the two algorithms.

BTF is not able to find many blocks for most of the matrices: there

tends to be a single large block and the remaining blocks are singletons. But the AMD

ordering does a good job of reducing the fill-in count. The off-diagonal

nonzero count is not high.

3.5 Alternate Ordering Experiments

Different ordering strategies were employed to analyze the fill-in behaviour.

The statistics using different ordering schemes are listed in table 3-4. COLAMD

is not listed in the table: it typically gives a poor ordering and causes more

fill-in than AMD, MMD and AMD+BTF. AMD alone gives relatively higher fill-in

compared to AMD+BTF on most of the matrices. However, AMD alone gives mixed

results in comparison with MMD. It matches or outperforms MMD in fill-in on the

Problem12-Problem14 and Problem16-Problem21 matrices. However, it gives




















Table 3-4: Comparison of ordering results produced by BTF+AMD, AMD, MMD

Nonzeros Fill-in Fill-in Fill-in
Netlist in A BTF+AMD AMD MMD
Problems (301) 1484 1808 1928 1968
Problem (1879) 12926 13-.11 13857 13770
Problem (2076) 15821 16403 18041 16551
Problem (7598) 48922 52056 57975 54997
Problem (745) 3966 4156 5562 4231
Problem (1041) 9654 13198 14020 1.''.
Problem (33) 153 157 178 176
Problem (10) 39 40 41 41
Problem (180) 1503 1845 1968 1922
Probleml0(6833) 43250 56322 133739 ."..-.1
Problemll (1960) 11187 13527 14800 13963
Probleml2 (200004) 500011 500011 600011 600011
Probleml3 (20004) 50011 50011 60011 60011
Probleml4 (40004) 100011 100011 120011 120011
Probleml5 (100000) 299998 299998 299998 499932
Probleml6 (7602) 32653 156311 165264 184362
Probleml7(10922) 46983 267237 255228 299937
Probleml8 (14842) 63913 326811 387668 425661
Probleml9 (19362) 83443 550409 451397 581277
Problem20 (24482) 105573 684139 718891 788047
Problem21 (30202) 130303 933131 839226 1049463









poorer fill-in for the rest of the circuits when compared with MMD. AMD alone beats

AMD+BTF in fill-in for some of the examples, viz. Problem17, Problem19 and

Problem21. Overall, BTF+AMD is the best ordering strategy to use.

3.6 Experiments with UF Sparse Matrix Collection

There are a number of circuit matrices in the UF sparse matrix collection.

Different experiments were done with these matrices as well on different parameters

like

1. fill-in with different ordering schemes
2. timing of different phases of KLU

3. ordering quality among KLU, UMFPACK and Gilbert-Peierls
4. performance comparison between KLU and UMFPACK.

UMFPACK is an unsymmetric multifrontal sparse solver and uses an unsymmetric

COLAMD ordering or an AMD ordering, automatically selecting which method

to use based on the matrix characteristics. Gilbert-Peierls' algorithm is available

in MATLAB as an LU factorization scheme. These experiments were done on a

Mandrake 10.0 Linux OS, on an Intel Pentium M processor with a clock frequency of 1400

MHz and 768 MB of RAM.

3.6.1 Different Ordering Schemes in KLU

There are six different ordering schemes possible in KLU. The three fill

reducing schemes are AMD, COLAMD and User Specified Ordering. These

three fill reducing orderings can be combined with a BTF preordering or no

preordering, hence six different schemes. However, in this experiment user-specified

ordering is not considered. That leaves us with four different schemes. Table

3-5 lists the fill (number of nonzeros in L+U plus the number of off-diagonal

elements) for each of these ordering schemes.

From table 3-5, we find that BTF+AMD gives consistently better fill-in

across different circuit matrices. However there are observable aberrations. For









example, with the circuit Bomhof/circuit-2, AMD and COLAMD give better fill-in

than BTF+AMD. These results confirm again that BTF+AMD is the best

ordering scheme to use for circuit matrices.

3.6.2 Timing Different Phases in KLU

These experiments show the time spent in different phases of the algorithm.

BTF pre-ordering followed by an AMD fill-reducing ordering is employed. As

mentioned earlier, there are four different phases.

1. Analysis phase: This comprises the pre-ordering and fill reducing ordering.

2. Factor phase: This comprises the factorization part of the algorithm. It

includes the symbolic analysis phase, partial pivoting, symmetric pruning

steps.

3. Refactor phase: This comprises the part where we do a factorization using the

already pre-computed partial pivoting permutation and the nonzero pattern

of the L and U matrices. There is no symbolic analysis, partial pivoting and

symmetric pruning in refactor phase.

4. Solve phase: This comprises the solve phase of the algorithm. Solving a single

right-hand side was tested.

When given a set of matrices with the same nonzero pattern, the analysis

and factor phases are done only once. The refactor phase is then repeated for the

remaining matrices. Solve phase is repeated as many times as there are systems

to solve. Table 3-6 contains the timing results. The analysis phase consumes most

of the time spent in the algorithm. Refactor time is typically 3 or 4 times smaller

than factor time, and 8 times smaller than analysis time and factor time put

together. The solve phase consumes the smallest fraction of the time.

3.6.3 Ordering Quality among KLU, UMFPACK and Gilbert-Peierls

Table 3-7 compares the ordering quality among KLU using BTF+AMD,

UMFPACK using COLAMD or AMD and Gilbert-Peierls' using AMD. We can












Table 3-5: Fill-in with four different schemes in KLU


Nonzeros BTF + BTF +
Matrix in A AMD COLAMD AMD COLAMD
Sandia/adder_dcop_01 (1813) 11156 13525 13895 18848 21799
Sandia/adder_trans_01 (1814) 14579 20769 36711 24365 119519
Sandia/fpga_dcop_01 (1220) 5892 7891 8118 S .' 12016
Sandia/fpga_trans_01 (1220) 7382 10152 12776 10669 21051
Sandia/init_adder1 (1813) 11156 13525 13895 18848 21799
Sandia/mult_dcop_01 (25187) 193276 226673 228301 2176328 1460322
Sandia/oscil_dcop_01 (430) 1544 2934 3086 3078 3295
Sandia/oscil_trans_01 (430) 1614 2842 3247 2897 3259
Bomhof/circuitl (2624) 35823 44879 775363 44815 775720
Bomhof/circuit_2 (4510) 21199 40434 89315 36197 36196
Bomhof/circuit_3 (12127) 48137 86718 98911 245336 744245
Grund/b2_ss (1089) 3895 9994 9212 26971 9334
Grund/b_dyn (1089) 4144 11806 10597 33057 10544
Grund/bayer02 (1;'i ;') 63307 889914 245259 1365142 307979
Grund/d_dyn (87) 230 481 461 619 494
Grund/d_ss (53) 144 302 292 382 298
Grund/megl (2904) 58142 232042 184471 1526780 378904
Grund/meg4 (5860) 25258 42 ;', 310126 43250 328144
Grund/poli (4008) 8188 12200 12208 12238 12453
Grund/poli_large (15575) 33033 48718 48817 49806 51970
Hamm/add20 (2395) 13151 19554 3 1. ;i 19554 3 1. ;i
Hamm/add32 (4960) 19848 28754 36030 28754 36030
Hamm/bcircuit (68902) 375558 1033240 1692668 1033240 1692668
Hamm/hcircuit (105676) 513072 731634 2.2 ;-. 2 736080 4425310
Hamm/memplus (17758) 99147 137030 3282586 137030 3282586
Hamm/scircuit (170998) '."-'i ;i. 2481122 64102-. 2481832 6427526
R. i.l /rajat03 (7602) 32653 163913 235111 172666 236938
R. i.I /rajat04 (1041) 8725 12863 80518 18618 158000
R.,i.l /rajat05 (301) 1250 1926 2053 2101 3131
R.ii.I /rajatll (135) 665 890 978 944 1129
R. i.l /rajatl2 (1879) 12818 15308 273667 15571 128317
R. i.l /rajatl3 (7598) 48762 5 '"i 90368 64791 5234287
R. i.I /rajatl4 (180) 1475 1994 2249 2105 2345













Table 3-6: Time in seconds, spent in different phases in KLU
Analysis Factor Refactor Solve
Matrix time time time time
Sandia/adder_dcop_01 (1813) 0.0028 0.0028 0.0007 0.0003
Sandia/adder_trans_01 (1814) 0.0045 0.0038 0.0013 0.0003
Sandia/fpga_dcop_01 (1220) 0.0018 0.0015 0.0004 0.0002
Sandia/fpgatrans_01 (1220) 0.0022 0.0017 0.0005 0.0002
Sandia/init_adderl (1813) 0.0028 0.0028 0.0007 0.0003
Sandia/mult_dcop_01 (25187) 0.2922 0.0522 0.0196 0.0069
Sandia/oscil_dcop_01 (430) 0.0008 0.0006 0.0002 0.0001
Sandia/oscil_trans_01 (430) 0.0008 0.0006 0.0002 0.0001
Bomhof/circuit_l (2624) 0.0098 0.0085 0.0053 0.0006
Bomhof/circuit_2 (4510) 0.0082 0.0064 0.0034 0.0006
Bomhof/circuit_3 (12127) 0.0231 0.0174 0.0056 0.0020
Grund/b2_ss (1089) 0.0031 0.0018 0.0005 0.0001
Grund/b_dyn (1089) 0.0033 0.0021 0.0007 0.0002
Grund/bayer02 (1:;' ;.) 0.0584 0.2541 0.2070 0.0090
Grund/d_dyn (87) 0.0002 0.0001 0.0000 0.0000
Grund/d_ss (53) 0.0001 0.0001 0.0000 0.0000
Grund/megl (2904) 0.0178 0.0853 0.0590 0.0033
Grund/meg4 (5860) 0.0157 0.0094 0.0028 0.0009
Grund/poli (4008) 0.0017 0.0010 0.0004 0.0003
Grund/polilarge (15575) 0.0064 0.0045 0.0018 0.0014
Hamm/add20 (2395) 0.0056 0.0044 0.0014 0.0003
Hamm/add32 (4960) 0.0084 0.0074 0.0019 0.0006
Hamm/bcircuit (68902) 0.3120 0.2318 0.1011 0.0257
Hamm/hcircuit (105676) 0.2553 0.1920 0.0658 0.0235
Hamm/memplus (17758) 0.0576 0.0358 0.0157 0.0036
Hamm/scircuit (170998) 0.8491 0.6364 0.3311 0.0622
R.ii. ,/rajat03 (7602) 0.0152 0.0440 0.0306 0.0034
R.ii. ,/rajat04 (1041) 0.0029 0.0023 0.0008 0.0002
R.,i. I/rajat05 (301) 0.0005 0.0005 0.0001 0.0000
R.,i. I/rajatll (135) 0.0002 0.0002 0.0001 0.0000
R.,i. ,/rajatl2 (1879) 0.0038 0.0027 0.0008 0.0002
R.,i. ,/rajatl3 (7598) 0.0122 0.0105 0.0033 0.0011
R.,i. ,/rajatl4 (180) 0.0004 0.0003 0.0001 0.0000









infer from the results that KLU produces better ordering than UMFPACK and

Gilbert-Peierls' algorithm. For KLU, the following MATLAB code determines the

fill.

opts = [0.1 1.2 1.2 10 1 0 0 0 ] ;

[x info] = klus(A,b,opts, []) ;

fill = info(31) + info(32) + info(8) ;

For UMFPACK, the snippet is

[L U P Q] = lu(A) ;

fill = nnz(L) + nnz(U) ;

For Gilbert-Peierls' the code is

Q = amd(A) ;

[L U P] = lu(A(Q,Q), 0.1) ;

fill = nnz(L) + nnz(U) ;

3.6.4 Performance Comparison between KLU and UMFPACK

This experiment compares the total time spent in the analysis, factor and solve

phases by the algorithms. The results are listed in table 3-8. KLU outperforms

UMFPACK in time. For KLU, the following snippet in MATLAB is used:

tic

[x info] = klus(A,b,opts)

tl = toc ;

For UMFPACK, the following code in MATLAB is used to find the total time:

tic

x = A\b ;

t2 = toc ;













Table 3-7: Fill-in among KLU, UMFPACK and Gilbert-Peierls


Matrix nnz KLU UMFPACK Gilbert-Peierls
Sandia/adder_dcop_01 (1813) 11156 13525 14658 18825
Sandia/adder_trans_01 (1814) 14579 20769 20769 24365
Sandia/fpga_dcop_01 (1220) 5892 7891 8106 I '
Sandia/fpgatrans_01 (1220) 7382 10152 10152 10669
Sandia/iit_adder1 (1813) 11156 13525 14658 18825
Sandia/mult_dcop_01 (25187) 193276 226673 556746 1300902
Sandia/oscil_dcop_01 (430) 1544 2934 2852 3198
Sandia/oscil_trans_01 (430) 1614 2842 3069 2897
Bomhof/circuit-_ (2624) 35823 44879 44879 44815
Bomhof/circuit_2 (4510) 21199 40434 35107 38618
Bomhof/circuit_3 (12127) 48137 86718 84117 245323
Grund/b2_ss (1089) 3895 9994 8309 22444
Grund/b_dyn (1089) 4144 11806 9642 41092
Grund/bayer02 (l:;' ;,) 63307 889914 259329 973093
Grund/d_dyn (87) 230 481 442 523
Grund/dss (53) 144 302 268 395
Grund/megl (2904) 58142 232042 151740 1212904
Grund/meg4 (5860) 25258 42 ;' 42 ;'- 43250
Grund/poli (4008) 8188 12200 12200 12239
Grund/polilarge (15575) 33033 48718 48745 49803
Hamm/add20 (2395) 13151 19554 19554 19554
Hamm/add32 (4960) 19848 28754 28754 28754
Hamm/bcircuit (68902) 375558 1033240 1033240 1033240
Hamm/hcircuit (105676) 513072 731634 730906 736080
Hamm/memplus (17758) 99147 137030 137030 137030
Hamm/scircuit (170998) 1 .' ;i, 2481122 2481122 2481832
R.i i. /rajat03 (7602) 32653 163913 163913 172666
R.i i. ,/rajat04 (1041) 8725 12863 12860 18613
R.,i. ,/rajat05 (301) 1250 1926 1944 2101
R.ii. ,/rajatll (135) 665 890 890 944
R. i., /rajatl2 (1879) 12818 15308 15308 15571
R. i., /rajatl3 (7598) 48762 5'.r, 5 '". 64791
R. i.1 /rajatl4 (180) 1475 1994 1994 2105













Table 3-8: Performance comparison between KLU and UMFPACK


Matrix (n)                      KLU      UMFPACK
Sandia/adder_dcop_01 (1813)     0.0116   0.0344
Sandia/adder_trans_01 (1814)    0.0112   0.0401
Sandia/fpga_dcop_01 (1220)      0.0050   0.0257
Sandia/fpga_trans_01 (1220)     0.0054   0.0203
Sandia/init_adder1 (1813)       0.0109   0.0323
Sandia/mult_dcop_01 (25187)     1.2383   1.1615
Sandia/oscil_dcop_01 (430)      0.0022   0.0070
Sandia/oscil_trans_01 (430)     0.0019   0.0074
Bomhof/circuit_1 (2624)         0.0232   0.1223
Bomhof/circuit_2 (4510)         0.0201   0.0522
Bomhof/circuit_3 (12127)        0.0579   0.1713
Grund/b2_ss (1089)              0.0066   0.0168
Grund/b_dyn (1089)              0.0072   0.0175
Grund/bayer02 (13935)           0.6089   0.3565
Grund/d_dyn (87)                0.0005   0.0014
Grund/dss (53)                  0.0003   0.0010
Grund/meg1 (2904)               0.1326   0.1018
Grund/meg4 (5860)               0.0571   0.1111
Grund/poli (4008)               0.0050   0.0121
Grund/poli_large (15575)        0.0208   0.0497
Hamm/add20 (2395)               0.0123   0.0506
Hamm/add32 (4960)               0.0201   0.0738
Hamm/bcircuit (68902)           0.7213   1.8823
Hamm/hcircuit (105676)          0.7313   2.7764
Hamm/memplus (17758)            0.1232   0.8140
Hamm/scircuit (170998)          1.9812   7.3448
Rajat/rajat03 (7602)            0.0793   0.1883
Rajat/rajat04 (1041)            0.0068   ?
Rajat/rajat05 (301)             0.0014   0.0046
Rajat/rajat11 (135)             0.0007   0.0023
Rajat/rajat12 (1879)            0.0087   0.0355
Rajat/rajat13 (7598)            0.0330   0.1229
Rajat/rajat14 (180)             0.0010   0.0032















CHAPTER 4
USER GUIDE FOR KLU

4.1 The Primary KLU Structures

4.1.1 klu_common

This is a control structure that contains both input control parameters for

KLU as well as output statistics computed by the algorithm. It is a mandatory

parameter for all the KLU routines. Its contents are listed as follows:

* tol

Type: double

Input parameter for pivot tolerance for diagonal preference. Default value is 0.001.

* growth

Type: double

Input parameter for reallocation growth size of the LU factors. Default value is 1.2.

* initmem_amd

Type: double

Input parameter for initial memory size with AMD.

Initial memory size = initmem_amd * nnz(L) + n. Default value is 1.2.

* initmem

Type: double

Input parameter for initial memory size with COLAMD.

Initial memory size = initmem * nnz(A) + n. Default value is 10.

* btf

Type: int









Input parameter to use BTF pre-ordering or not. Default value is 1 (use BTF).

* ordering

Type: int

Input parameter to specify which fill reducing ordering to use.

0 = AMD, 1 = COLAMD, 2 = user-given P and Q. Default is 0.

* scale

Type: int

Input parameter to specify which scaling strategy to use. 0 = none, 1 = sum, 2 = max. Default is 0.

* singular_proc

Type: int

Input parameter to specify whether to stop on singularity or continue.

0 = stop, 1 = continue. Default is 0.

* status

Type: int

Output parameter that indicates the result of the KLU function call. Values

are KLU_OK (0) on success and < 0 on error. The error values are KLU_SINGULAR

(-1), KLU_OUT_OF_MEMORY (-2), and KLU_INVALID (-3).

* nrealloc

Type: int

Output parameter. Contains the number of reallocations of L and U.

* singular_col

Type: int

Output parameter. Contains the index of the singular column, if any.

* noffdiag

Type: int










Output parameter. Contains the number of off-diagonal pivots chosen.
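
The two initial-memory formulas above (for initmem_amd with AMD and initmem with COLAMD) can be sketched as a short calculation. The helper names and the sizes used below are invented for illustration, not taken from KLU itself:

```c
#include <assert.h>

/* Initial LU memory estimate when AMD is used:
   initmem_amd * nnz(L) + n, where nnz(L) is the analysis-phase estimate. */
static double initial_size_amd(double initmem_amd, double nnzL, int n)
{
    return initmem_amd * nnzL + n;
}

/* Initial LU memory estimate when COLAMD is used:
   initmem * nnz(A) + n, since COLAMD provides no nnz(L) estimate,
   so the (larger) default multiplier is applied to nnz(A) instead. */
static double initial_size_colamd(double initmem, double nnzA, int n)
{
    return initmem * nnzA + n;
}
```

If the estimate is too small, KLU grows the factors by the growth parameter, which is what the nrealloc statistic counts.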

4.1.2 klu_symbolic

This structure encapsulates the information related to the analysis phase. The

members of the structure are listed as follows:

symmetry

Type: double

Contains the symmetry of the largest block

est_flops

Type: double

Contains the estimated factorization flop count

lnz

Type: double

Contains the estimated nonzeros in L including diagonals

unz

Type: double

Contains the estimated nonzeros in U including diagonals

Lnz

Type: double *

Array of size n, but only Lnz[0..nblocks-1] is used. Contains the estimated

number of nonzeros in each block

n

Type: int

Contains the size of input matrix A where A is n-by-n

nz

Type: int

Contains the number of entries in input matrix

*P









Type: int *

Array of size n. Contains the row permutation from ordering

*Q

Type: int *

Array of size n. Contains the column permutation from ordering

*R

Type: int *

Array of size n+1, but only R[0..nblocks] is used. Contains the start and end

column/row index for each block. Block k goes from R[k] to R[k+1]-1.

nzoff

Type: int

Contains the number of nonzeros in off-diagonal blocks

nblocks

Type: int

Contains the number of blocks

maxblock

Type: int

Contains the size of largest block

ordering

Type: int

Contains the ordering used (0 = AMD, 1 = COLAMD, 2 = GIVEN)

do_btf

Type: int

Indicates whether or not BTF preordering was requested

The members symmetry, est_flops, lnz, unz, and Lnz are computed only when AMD is

used. The remaining members are computed for all orderings.
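
The R array is the key to iterating over the diagonal blocks of the BTF form: block k spans columns/rows R[k] through R[k+1]-1. A minimal sketch of that convention, using a made-up R for a 9-by-9 matrix with three blocks (the data and helper names are illustrative only):

```c
#include <assert.h>

/* Size of block k, given the Symbolic->R array of size nblocks+1:
   block k covers indices R[k] .. R[k+1]-1. */
static int block_size(const int R[], int k)
{
    return R[k + 1] - R[k];
}

/* Size of the largest block (what Symbolic->maxblock records). */
static int max_block(const int R[], int nblocks)
{
    int k, best = 0;
    for (k = 0; k < nblocks; k++)
    {
        int sz = block_size(R, k);
        if (sz > best) best = sz;
    }
    return best;
}
```

With R = {0, 1, 4, 9}, the three blocks have sizes 1, 3, and 5, so maxblock is 5; singleton blocks (size 1) are the ones KLU solves directly without factorization.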










4.1.3 klu_numeric

This structure encapsulates information related to the factor phase. It contains

the LU factors of each block, pivot row permutation, and the entries in the off-

diagonal blocks among others. Its contents are listed as follows:

umin

Type: double

Contains the minimum absolute diagonal entry in U

umax

Type: double

Contains the maximum absolute diagonal entry in U

nblocks

Type: int

Contains the number of blocks

lnz

Type: int

Contains actual number of nonzeros in L excluding diagonals

unz

Type: int

Contains actual number of nonzeros in U excluding diagonals

Pnum

Type: int *

Array of size n. Contains the final pivot permutation

Pinv

Type: int *

Array of size n. Contains the inverse of the final pivot permutation

Lbip

Type: int **









Array of size nblocks. Each element is an array of size block size + 1. Each

element contains the column pointers for the L factor of the corresponding block

* Ubip

Type: int **

Array of size nblocks. Each element is an array of size block size + 1. Each

element contains the column pointers for the U factor of the corresponding block

* Lblen

Type: int **

Array of size nblocks. Each element is an array of size block size. Each

element contains the column lengths for the L factor of the corresponding block

* Ublen

Type: int **

Array of size nblocks. Each element is an array of size block size. Each

element contains the column lengths for the U factor of the corresponding block

* LUbx

Type: void **

Array of size nblocks. Each element is an array containing the row indices

and numerical values of LU factors of the corresponding block. The diagonals

of LU factors are not stored here

* Udiag

Type: void **

Array of size nblocks. Each element is an array of size block size. Each

element contains the diagonal values of the U factor of the corresponding block

* Singleton

Type: void *

Array of size nblocks. Contains the singleton values

* Rs










Type: double *

Array of size n. Contains the row scaling factors

* scale

Type: int

Indicates the scaling strategy used.

(0 = none, 1 = sum, 2 = max)

* Work

Type: void *

Permanent workspace used for factorization and solve. It is of size

MAX (4n numerical values, n numerical values + 6*maxblock ints)

* worksize

Type: size_t

Contains the size (in bytes) of Work allocated above

* Xwork

Type: void *

This is an alias into Numeric->Work

* Iwork

Type: int *

An integer alias into Xwork + n

* Offp

Type: int *

Array of size n+1. Contains the column pointers for off-diagonal elements.

* Offi

Type: int *

Array of size number of off-diagonal entries. Contains the row indices of

off-diagonal elements

* Offx










Type: void *

Array of size number of off-diagonal entries. Contains the numerical values of

off-diagonal elements

4.2 KLU Routines

The user callable KLU routines in the C language are explained in this section.

The following guidelines are applicable to all routines except when explicitly stated

otherwise in the description of a routine.

1. All arguments are required except when explicitly stated as optional. If

optional, a user can pass NULL for the corresponding argument.

2. The control input/output argument "Common" of type "klu_common *" is

a required argument for all routines.

3. All arguments other than the above mentioned control argument "Common"

are input arguments and are not modified.

4.2.1 klu_analyze
klu_symbolic *klu_analyze
(
int n,
int Ap [ ],
int Ai [ ],
klu_common *Common
) ;

This routine orders the matrix using BTF if specified and the fill reducing

ordering specified.

Returns a pointer to the klu_symbolic structure that contains the ordering

information.

All arguments are required

n: Size of the input matrix A where A is n*n.

Ap: Array of column pointers for the input matrix. Size n+1.










Ai: Array of row indices. Size number of nonzeros in A.

Common: The control input/output structure.
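
klu_analyze takes the matrix in compressed sparse column form, which is what the Ap/Ai pair above encodes: entries of column j occupy positions Ap[j] through Ap[j+1]-1 of Ai (row indices) and Ax (values). A sketch of that layout for a made-up 3-by-3 matrix; the matrix and all values are purely illustrative:

```c
#include <assert.h>

#define N   3   /* the toy matrix is N-by-N        */
#define NNZ 5   /* its number of nonzero entries   */

/* Toy matrix:   [ 5 0 1 ]
                 [ 0 6 0 ]
                 [ 2 0 7 ]                          */

/* Column pointers: column j occupies Ap[j] .. Ap[j+1]-1; Ap[N] == NNZ. */
static const int    Ap[N + 1] = { 0, 2, 3, 5 };
/* Row index of each entry, listed column by column.                    */
static const int    Ai[NNZ]   = { 0, 2, 1, 0, 2 };
/* Numerical value of each entry, parallel to Ai.                       */
static const double Ax[NNZ]   = { 5.0, 2.0, 6.0, 1.0, 7.0 };

/* Number of entries in column j. */
static int col_count(int j) { return Ap[j + 1] - Ap[j]; }
```

klu_analyze only needs Ap and Ai, since the analysis phase is purely symbolic; Ax is first consumed by klu_factor.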

4.2.2 klu_analyze_given

klu_symbolic *klu_analyze_given

(
int n,
int Ap [ ],
int Ai [ ],

int Puser [ ],
int Quser [ ],
klu_common *Common

) ;

This routine orders the matrix using BTF if specified and the given Puser and

Quser as the fill reducing ordering. If Puser and Quser are NULL, then the natural

ordering is used.

Returns a pointer to the klu_symbolic structure that contains the ordering

information.

Arguments

n: Size of the input matrix A where A is n*n. Required.

Ap: Array of column pointers for the input matrix. Size n+1. Required.

Ai: Array of row indices. Size number of nonzeros in A. Required.

Puser: Optional row permutation to use.

Quser: Optional column permutation to use.

Common: The control input/output structure.

4.2.3 klu_*factor

klu_numeric *klu_factor
(
int Ap [ ],
int Ai [ ],
double Ax [ ],
klu_symbolic *Symbolic,
klu_common *Common
) ;

This routine factors a real matrix. There is a complex version of this routine

klu_z_factor, that factors a complex matrix and has the same function declaration as

klu_factor. Both use the results of a call to klu_analyze.

Returns a pointer to klu_numeric if successful, NULL otherwise.

All the arguments are required.

Ap: Array of column pointers for the input matrix. Size n+1.

Ai: Array of row indices. Size number of nonzeros in A.

Ax: Array of numerical values. Size number of nonzeros in A. In the

complex case, the array should consist of the real and imaginary parts of

each numerical value as adjacent pairs.

Symbolic: The structure that contains the results from a call to

klu_analyze.

Common: The control input/output structure. The status field in

Common is set to indicate if the routine was successful or not.
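
The adjacent-pair storage of complex values described for Ax can be sketched with two small accessors (the helper names are invented for illustration): an array of k complex entries is passed as a double array of length 2k, with entry i stored at positions 2i (real part) and 2i+1 (imaginary part).

```c
#include <assert.h>

/* Entry i of a complex array stored as adjacent (real, imag) pairs. */
static double real_part(const double Ax[], int i) { return Ax[2 * i]; }
static double imag_part(const double Ax[], int i) { return Ax[2 * i + 1]; }
```

So a complex matrix with nnz entries needs an Ax of 2*nnz doubles, e.g. {1.0, -2.0, ...} for a first entry of 1 - 2i.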

4.2.4 klu_*solve

void klu_solve
(
klu_symbolic *Symbolic,
klu_numeric *Numeric,
int ldim,
int nrhs,
double B [ ],
klu_common *Common
) ;










This routine solves a real system. There is a complex version of this routine

klu_z_solve, that solves a complex system and has the same function declaration as

klu_solve. Both use the results of a call to klu_analyze and klu_*factor.

Return type is void. The rhs vector B is overwritten with the solution.

All Arguments are required.

Symbolic: The structure that contains the results from a call to

klu_analyze.

Numeric: The structure that contains the results from a call to

klu_*factor.

ldim: The leading dimension of the right hand side B.

nrhs: The number of right hand sides being solved.

B: The right hand side. It is a vector of length ldim * nrhs. It can

be real or complex depending on whether a real or complex system

is being solved. If complex, the real and imaginary parts of the rhs

numerical value must be stored as adjacent pairs. It is overwritten with

the solution.

Common: The control input/output structure.
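
Assuming the Fortran-style column-major layout that a leading dimension implies (an assumption here, since the thesis does not spell the layout out), entry i of right-hand side j of the B array can be sketched as:

```c
#include <assert.h>

/* Right-hand sides stored column by column: B has length ldim * nrhs,
   and entry i of right-hand side j lives at B[j * ldim + i]. */
static double rhs_entry(const double B[], int ldim, int i, int j)
{
    return B[j * ldim + i];
}
```

For example, with ldim = 3 and nrhs = 2, B = {1, 2, 3, 10, 20, 30} holds the vectors (1, 2, 3) and (10, 20, 30); after klu_solve these same slots hold the two solution vectors.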

4.2.5 klu_*tsolve

void klu_tsolve
(
klu_symbolic *Symbolic,
klu_numeric *Numeric,
int ldim,
int nrhs,
int conj_solve,
double B [ ],
klu_common *Common
) ;










This routine is similar to klu_solve except that it solves a transpose system

A'x = b. This routine solves a real system. Again there is a complex version of

this routine, klu_z_tsolve, for solving complex systems, with the same function

declaration as klu_tsolve. It also offers a conjugate transpose solve for the

complex system A^H x = b.

Return type is void. The rhs vector B is overwritten with the solution.

All arguments are required. The descriptions of all arguments except

conj_solve are the same as those for klu_*solve. The argument conj_solve is

relevant only for the complex case. It takes two values: 1 = CONJUGATE

TRANSPOSE SOLVE, 0 = TRANSPOSE SOLVE.
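
The only difference between the two complex modes is how matrix entries are read: entry (i, j) of A' is entry (j, i) of A, while entry (i, j) of A^H is the complex conjugate of entry (j, i). A toy sketch with a dense row-major array (not KLU's actual packed storage):

```c
#include <assert.h>
#include <complex.h>

/* Entry (i,j) of the conjugate transpose A^H of an n-by-n matrix A,
   where A is stored row-major: A(r,c) = A[r*n + c].                 */
static double complex herm_entry(const double complex A[], int n,
                                 int i, int j)
{
    /* A^H(i,j) = conj(A(j,i)) */
    return conj(A[j * n + i]);
}
```

For a real matrix the conjugation is a no-op, which is why conj_solve only matters in the complex case.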

4.2.6 klu_*refactor

void klu_refactor
(
int Ap [ ],
int Ai [ ],
double Ax [ ],
klu_symbolic *Symbolic,
klu_numeric *Numeric,
klu_common *Common
) ;

This routine refactorizes the matrix using the previously computed ordering

information in the Symbolic object and the nonzero pattern of the LU factors in

the Numeric object. It assumes the same partial pivoting order computed in Numeric. It

changes only the numerical values of the LU factors stored in Numeric object. It

has a complex version klu_z_refactor with the same function prototype to handle

complex cases.

Return type is void. The numerical values of LU factors in Numeric parame-

ter are updated.

All arguments are required.










Ap: Array of column pointers for the input matrix. Size n+1.

Ai: Array of row indices. Size number of nonzeros in A.

Ax: Array of numerical values. Size number of nonzeros in A. In the

complex case, the array should consist of the real and imaginary parts of

each numerical value as adjacent pairs.

Symbolic: The structure that contains the results from a call to

klu_analyze.

Numeric: Input/output argument. The structure contains the results

from a call to klu_*factor. The numerical values of LU factors are

overwritten with the ones for the current matrix being factorized.

Common: The control input/output structure. The status field in

Common is set to indicate if the routine was successful or not.
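
Refactorization applies when a sequence of matrices shares one nonzero pattern and only the numerical values change, as happens across Newton iterations in circuit simulation: the matrices share Ap and Ai and differ only in Ax, so one klu_factor call followed by klu_refactor per new value set suffices. The precondition can be sketched as a pattern-equality check (the helper name is invented for illustration):

```c
#include <assert.h>

/* Returns 1 if two n-by-n CSC matrices have identical nonzero patterns,
   i.e. the same column pointers and the same row indices, so that a
   factorization of one can be numerically reused for the other.        */
static int same_pattern(const int Ap1[], const int Ai1[],
                        const int Ap2[], const int Ai2[], int n)
{
    int j, p;
    for (j = 0; j <= n; j++)
        if (Ap1[j] != Ap2[j]) return 0;
    for (p = 0; p < Ap1[n]; p++)
        if (Ai1[p] != Ai2[p]) return 0;
    return 1;
}
```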

4.2.7 klu_defaults
void klu_defaults
(
klu_common *Common
) ;

This routine sets the default values for the control input parameters of

the klu_common object. The default values are listed in the description of the

klu_common structure. A call to this routine is required unless the user sets the

control input parameters explicitly.

Return type is void.

The argument Common is required. The control input parameters in Com-

mon are set to default values.

4.2.8 klu_*rec_pivot_growth
double klu_rec_pivot_growth
(
int Ap [ ],
int Ai [ ],
double Ax [ ],
klu_symbolic *Symbolic,
klu_numeric *Numeric,
klu_common *Common
) ;

This routine computes the reciprocal pivot growth of the factorization

algorithm. The complex version of this routine, klu_z_rec_pivot_growth, handles

complex matrices and has the same function declaration.

The pivot growth estimate is returned.

All arguments are required.

Ap: Array of column pointers for the input matrix. Size n+1.

Ai: Array of row indices. Size number of nonzeros in A.

Ax: Array of numerical values. Size number of nonzeros in A. In the

complex case, the array should consist of the real and imaginary parts of

each numerical value as adjacent pairs.

Symbolic: The structure that contains the results from a call to

klu_analyze.

Numeric: The structure that contains the results from a call to

klu_*factor.

Common: The control input/output structure. The status field in

Common is set to indicate if the routine was successful or not.
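
The thesis does not spell out the exact formula here, but one common definition of reciprocal pivot growth (used, for example, by SuperLU) is the minimum over columns j of max_i |A(i,j)| / max_i |U(i,j)|: a value near 1 means the factorization grew the entries little, while a tiny value signals potential numerical trouble. A sketch under that assumption, taking the per-column maxima as precomputed inputs:

```c
#include <assert.h>
#include <math.h>

/* Reciprocal pivot growth, one common definition (illustrative only):
   min over columns j of (max_i |A(i,j)|) / (max_i |U(i,j)|).
   absmaxA[j] and absmaxU[j] hold the column-wise absolute maxima.  */
static double rec_pivot_growth(const double *absmaxA,
                               const double *absmaxU, int n)
{
    double r = INFINITY;  /* running minimum over the columns */
    int j;
    for (j = 0; j < n; j++)
    {
        double g = absmaxA[j] / absmaxU[j];
        if (g < r) r = g;
    }
    return r;
}
```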

4.2.9 klu_*estimate_cond_number

double klu_estimate_cond_number

(
int Ap [ ],
double Ax [ ],
klu_symbolic *Symbolic,
klu_numeric *Numeric,
klu_common *Common
) ;

This routine computes the condition number estimate of the input matrix. As

before, the complex version of this routine, klu_z_estimate_cond_number, has the

same function declaration and handles complex matrices.

The condition number estimate is returned.

All arguments are required.

Ap: Array of column pointers for the input matrix. Size n+1.

Ax: Array of numerical values. Size number of nonzeros in A. In the

complex case, the array should consist of the real and imaginary parts of

each numerical value as adjacent pairs.

Symbolic: The structure that contains the results from a call to

klu_analyze.

Numeric: The structure that contains the results from a call to

klu_*factor.

Common: The control input/output structure. The status field in

Common is set to indicate if the routine was successful or not.

4.2.10 klu_free_symbolic
void klu_free_symbolic
(
klu_symbolic **Symbolic,
klu_common *Common
) ;

This routine deallocates or frees the contents of the klu_symbolic object. The

Symbolic parameter must be a valid object computed by a call to klu_analyze or

klu_analyze_given.

Return type is void.

All arguments are required.










Symbolic: Input/Output argument. Must be a valid object computed

by a call to klu_analyze or klu_analyze_given. If NULL, the routine just

returns.

Common: The control input/output structure.

4.2.11 klu_free_numeric

void klu_free_numeric
(
klu_numeric **Numeric,
klu_common *Common
) ;

This routine frees the klu_numeric object computed by a call to the klu_factor

or klu_z_factor routines. It resets the pointer to the klu_numeric object to NULL.

There is a complex version of this routine, klu_z_free_numeric, with the same

function declaration to handle the complex case.

Return type is void.

All arguments are required.

Numeric: Input/Output argument. The contents of the klu_numeric

object are freed. The pointer to the klu_numeric object is set to NULL.

Common: The control input/output structure.















REFERENCES


[1] C. L. Lawson, R. J. Hanson, D. Kincaid, and F. T. Krogh, Basic linear
algebra subprograms for FORTRAN usage, ACM Trans. Math. Soft., 5:
308-323, 1979.

[2] J. J. Dongarra, J. Du Croz, S. Hammarling, and R. J. Hanson, An extended
set of FORTRAN basic linear algebra subprograms, ACM Trans. Math. Soft.,
14: 1-17, 1988.

[3] J. J. Dongarra, J. Du Croz, S. Hammarling, and R. J. Hanson, Algorithm
656: An extended set of FORTRAN basic linear algebra subprograms, ACM
Trans. Math. Soft., 14: 18-32, 1988.

[4] J. J. Dongarra, J. Du Croz, I. S. Duff, and S. Hammarling, A set of level 3
basic linear algebra subprograms, ACM Trans. Math. Soft., 16: 1-17, 1990.

[5] J. J. Dongarra, J. Du Croz, I. S. Duff, and S. Hammarling, Algorithm 679: A
set of level 3 basic linear algebra subprograms, ACM Trans. Math. Soft., 16:
18-28, 1990.

[6] James W. Demmel, Stanley C. Eisenstat, John R. Gilbert, Xiaoye S. Li and
Joseph W. H. Liu, A supernodal approach to sparse partial pivoting, SIAM J.
Matrix Analysis and Applications, 20(3): 720-755, 1999.

[7] Timothy A. Davis and I. S. Duff, An unsymmetric-pattern multifrontal method
for sparse LU factorization, SIAM J. Matrix Analysis and Applic., 18(1):
140-158, 1997.

[8] Timothy A. Davis, Algorithm 832: UMFPACK, an unsymmetric-pattern
multifrontal method with a column pre-ordering strategy, ACM Trans. Math.
Software, 30(2): 196-199, 2004.

[9] John R. Gilbert and Tim Peierls, Sparse partial pivoting in time proportional
to arithmetic operations, SIAM J. Sci. Stat. Comput., 9(5): 862-873, 1988.

[10] A. George and E. Ng, An implementation of Gaussian elimination with
partial pivoting for sparse systems. SIAM J. Sci. Statist. Comput., 6(2):
390-409, 1985.

[11] Iain S. Duff, On algorithms for obtaining a maximum transversal, ACM
Transactions on Mathematical Software, 7(3): 315-330, 1981.









[12] Iain S. Duff, Algorithm 575: permutations for a zero-free diagonal, ACM
Transactions on Mathematical Software, 7(3): 387-390, 1981.

[13] Iain S. Duff and John K. Reid, Algorithm 529: permutations to block
triangular form, ACM Trans. on Mathematical Software, 4(2): 189-192, 1978.

[14] Iain S. Duff and John K. Reid, An implementation of Tarjan's algorithm
for the block triangular form of a matrix, ACM Trans. on Mathematical
Software, 4(2): 137-147, 1978.

[15] R. E. Tarjan, Depth first search and linear graph algorithms, SIAM J.
Computing, 1: 146-160, 1972.

[16] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, Clifford Stein,
Introduction to Algorithms, Second Edition 2001, MIT Press, Cambridge.

[17] S. C. Eisenstat and J. W. H. Liu, Exploiting structural symmetry in a sparse
partial pivoting code, SIAM J. Sci. Comput., 14(1): 253-257, 1993.

[18] P. R. Amestoy, T. A. Davis, and I. S. Duff, An approximate minimum degree
ordering algorithm, SIAM J. Matrix Anal. Applic., 17(4): 886-905, 1996.

[19] P.R. Amestoy, T.A. Davis, and I.S. Duff, Algorithm 837: AMD, an ap-
proximate minimum degree ordering algorithm, ACM Transactions on
Mathematical Software, 30(3): 381-388, 2004.

[20] Timothy A. Davis, John R. Gilbert, Stefan I. Larimore, and Esmond G.
Ng, A column approximate minimum degree ordering algorithm, ACM
Transactions on Mathematical Software, 30(3): 353-376, 2004.

[21] Timothy A. Davis, John R. Gilbert, Stefan I. Larimore, and Esmond G. Ng,
Algorithm 836: COLAMD, a column approximate minimum degree ordering
algorithm, ACM Transactions on Mathematical Software, 30(3): 377-380,
2004.

[22] W. W. Hager, Condition estimates, SIAM J. Sci. Stat. Comput., 5(2): 311-316,
1984.

[23] Nicholas J. Higham, Fortran codes for estimating the one-norm of a real or
complex matrix, with applications to condition estimation, ACM Trans. on
Mathematical Software, 14(4): 381-396, 1988.















BIOGRAPHICAL SKETCH

Ekanathan was born in Tirunelveli, India, on October 2, 1977. He received

his Bachelor of Technology degree in chemical engineering from Anna University,

Chennai, India, in May 1998. He worked with Infosys Technologies in Brussels,

Belgium, as a programmer/analyst from 1998 to 2001, and with SAP AG in Walldorf,

Germany, as a software developer from 2001 to 2003. Currently he is pursuing his

Master of Science degree in computer science at the University of Florida. His interests

and hobbies include travel, reading novels and magazines, music and movies.