Efficient Scheduling Techniques and Systems for Grid Computing

7892 F20110218_AAARVX in_j_Page_135thm.jpg
3f239fc99bc2e87251278d597f75e893
d54fc664f521b7856456383a53270a09bee5a911
25551 F20110218_AAARXA in_j_Page_129.QC.jpg
95e99d00c7fb41528bdb7ae0d5dd59b2
bfdd62cdb6c355ea2d86f7f23981dc6aefd2f47d
F20110218_AAAQTK in_j_Page_008.tif
d2faa99de6a1e1bcf253a641f0587c65
70050013dd1acaf67df3676103e03c0265558b70
8036 F20110218_AAASCH in_j_Page_014thm.jpg
6b3e4ffb54e78bb04218f1ac03b5d0bb
e88183fa7ccd9bf835cfe707f2779888c06c8478
33093 F20110218_AAASBT in_j_Page_140.QC.jpg
d6f52ad51ce61905e4d235ac4ec8d916
5b02b3918adbb8cc070a2cf0b0b44394054afe83
25321 F20110218_AAARWN in_j_Page_044.QC.jpg
1c0041b62509a4f4886c88d673913534
32eb65969886dd9ea1bb094cb9357ee0dee8734c
4053 F20110218_AAARVY in_j_Page_005thm.jpg
d4966f666ab9e5c91752b45003d9f5bd
7eb324d72569825ad900991ea39a20cca22f860c
5715 F20110218_AAARXB in_j_Page_086thm.jpg
f84a5244bfe81befc1f58a0a39cede9e
78f4ac295d7c13bfa2f229f576f8ffc716d6ab0f
F20110218_AAAQTL in_j_Page_009.tif
b83c9f89c43a0a7d5610ac91d4bccffa
2b893b887717223c0110c1f5ed34a881fb068bc0
7776 F20110218_AAASCI in_j_Page_016thm.jpg
d8532113bb38a3a69c9a573170bcba66
c3a6b20560fdcdaf8e1595735ac9f08f82d8c4e3
37841 F20110218_AAASBU in_j_Page_142.QC.jpg
f15b810c8d590d6279246ef32b58ff96
d87f31edb3a0a73ba48a63b8a1a310c50c70feec
8507 F20110218_AAARWO in_j_Page_093thm.jpg
771081d6d1b3897d59349c89d2757874
052e4fa13c447c1ce6d648a33b7fbf40458d1174
27377 F20110218_AAARVZ in_j_Page_050.QC.jpg
291a1b5ab02e9e16ddf967e71f74044b
ab8e5a5f1ed6691fcaad0e7095d329256d77d686
F20110218_AAAQUA in_j_Page_024.tif
1e43e585ff52f5b98503001d605e9325
fe96599de10ce305b0dea592f9cf52a74a80916b
21708 F20110218_AAARXC in_j_Page_122.QC.jpg
5e629407d7e89c3a0d7be31226e8001d
d9383f85446dd623747fe6e9a1cbc5efff23bbb8
F20110218_AAAQTM in_j_Page_010.tif
1a205887ce1c0776e6a6e0dcc6f63b0a
18fa36a6afcbbe8e4e96f8537ef72a409be6ce37
7681 F20110218_AAASCJ in_j_Page_017thm.jpg
f93004d0431855e16b400830012f548d
3e0f4bff2792282b644791042a9a3372d32802fc
34190 F20110218_AAASBV in_j_Page_143.QC.jpg
08887254865a53a20954304416bbd165
12d65127c79f83d0b97560de1796b6bc45a223f5
8932 F20110218_AAARWP in_j_Page_033thm.jpg
93f8d64ac19e844d882380e2fc0fb8b2
1ddb9ef8bfaa80c8cef61bcdf362aa7db071dc6b
F20110218_AAAQUB in_j_Page_025.tif
806a544563e88963942e7f1485664d3e
43f87fb8ca74a9bbad4fb996ea9d4138ca15b2a3
7010 F20110218_AAARXD in_j_Page_133thm.jpg
58bea2472f07bf4b5639eacbcb999671
87cab4512c67d8aee4100e1f4dbd624f43029e7c
F20110218_AAAQTN in_j_Page_011.tif
8e8456127a9902827bfa9ecb40899048
0988cbb3390785da27fe106c09776f88ee08706a
8189 F20110218_AAASCK in_j_Page_018thm.jpg
0764fea0fd3ffd52a974f34e9a5e4c4b
ed821b9c5085997e52efd2714dc407b323b99876
15226 F20110218_AAASBW in_j_Page_144.QC.jpg
0ba353a8969e0ee0ad71bfc886a0cd09
17167f271317e95d82e39bd9083c374bf7089234
7888 F20110218_AAARWQ in_j_Page_037thm.jpg
5af02b1d5d6c1fff886e883f41f42dba
a74bc57a1e69c4ef366b22a60f73f72e15715f44
20386 F20110218_AAAQSZ in_j_Page_075.QC.jpg
0de32552ac82075850d7ebaab825f264
3e2c40c2a7da5c02588e848d70b8844148397d9d
F20110218_AAAQUC in_j_Page_026.tif
19832bf8fae2838f747094615be74486
b5c1cc80aa855db8c1f55dd69069834022d2bcba
32968 F20110218_AAARXE in_j_Page_051.QC.jpg
68db72bc164354860bfbc3df687beaa2
af8ae1067b7f493ba28ec0b95b43450bebd18c20
F20110218_AAAQTO in_j_Page_012.tif
799a025c137d710f05fa7c710d51d9a1
835cd79192197dda3980f79216c3687ecc0920ef
7837 F20110218_AAASCL in_j_Page_019thm.jpg
cbfd54f15a3aef5a9813f6c45d4294a5
cc749094ff845ef2cbc35cf4b265dbc7db16ab54
5470 F20110218_AAASBX in_j_Page_145.QC.jpg
0a85d70b5eec74ef6af53b9dde42fc41
fe8ee11079c68ad89821372258651fcfed791459
8134 F20110218_AAARXF in_j_Page_051thm.jpg
6f6bd03ca66395e26241f7878fdd6c9d
aa660c5919362b6ed97103e706d8e77b5405d246
8206 F20110218_AAARWR in_j_Page_069thm.jpg
6e1c6afd8a13a93056d9c3a5b75e6eb1
600c71f648d523c0a99a06b8628bf0d3a6fbf3a7
F20110218_AAAQUD in_j_Page_027.tif
340753c954668a9ad4e16d1a3d0e63a5
b8148a4e4e70c022d9d24bb2030ad604759678bb
F20110218_AAAQTP in_j_Page_013.tif
e0ce59e4862101e6190c603cd5e0dcee
274091a4cb1f3338a68a1d46c8754b392916a624
7877 F20110218_AAASDA in_j_Page_038thm.jpg
249e34ca9ec88f45a7981130257b637d
3d1fdf6ee2278ab38439d1306dd1d3a1a93c480d
7439 F20110218_AAASCM in_j_Page_020thm.jpg
e296f71ade2989271ead4ffef8bbc1f7
5b4664084b9efd70a5839b73b066cb4189f9841a
578 F20110218_AAASBY in_j_Page_002thm.jpg
d858e00347fedca14275ce6634b44a4f
4f60577a5de3f5d2349008d89378b2f2a096df9f
2914 F20110218_AAARXG in_j_Page_008thm.jpg
13cc8a078a6ee4caa2b163305e72aea6
43c89103383405101ad5e8f2a8ae370d8253de5a
F20110218_AAAQUE in_j_Page_028.tif
3e908aee373a4ec531be8a65316c26bf
dd334548fa6e5be4fd89cdbaf38c2cee21430bff
F20110218_AAAQTQ in_j_Page_014.tif
ec3be6123baaf7c0d54749892fe3356a
75620263d355455097ac1db004289cea16592855
33168 F20110218_AAARWS in_j_Page_072.QC.jpg
1e82c53bcf136f1df606449e145fd078
0226188ab1cfbe518b04dbfac7cc4837d20ed20d
7797 F20110218_AAASDB in_j_Page_039thm.jpg
47e83a0b8c521d72aa8c55205d7c8085
d5932d325f9f365f406cce2e56a5d37645bf4a00
7982 F20110218_AAASCN in_j_Page_021thm.jpg
cb02b1cb6ec3941fd20ddde20615fa41
99e83b20ea706906c924d21a5ff22c2852ade333
728 F20110218_AAASBZ in_j_Page_003thm.jpg
e368d6962e61e3d1bb1f0b7952ee0e67
0bcba982d81ab1fe81578f49f0ef8ed8f1cf87f6
2047 F20110218_AAARXH in_j_Page_001thm.jpg
b235ce9dfc63460483d2de4f51ea31b3
dff8cc44135a1783899b458b245bb227c350b421
F20110218_AAAQUF in_j_Page_029.tif
3b0dcc62358479ec5f16f219faae25ec
00ec5cb587f1f2e7a017edfba3a8f31f39e032b8
7267 F20110218_AAASDC in_j_Page_041thm.jpg
c4b17741f5d87d1ee813923bbcf4c8c1
23a2eb8548fe58a54f3a7b626e1e6987efb4a124
7806 F20110218_AAASCO in_j_Page_022thm.jpg
75303e9eede2ef9d9fd5ed85786f5411
6eebbdac295c23e3600621eb8a77a5d30939b5a4
17255 F20110218_AAARXI in_j_Page_005.QC.jpg
cfb703194e8585b7eda06e8f9d696f2d
ee8964dff0b8518e370dadbee6fee8b1820dafa7
1799 F20110218_AAARAA in_j_Page_035.txt
f70084b36c542f4d4d7da95b37247425
e3723f508e1b7b050ffbdd5a6fc7f74e9087d1df
F20110218_AAAQUG in_j_Page_030.tif
537c4b0d3efc41e2e404fa61bcf6e9e9
0434ab543c442479b153640d7c77d6aa3f6e7104
F20110218_AAAQTR in_j_Page_015.tif
8749b0235a79c4eedf496a0a54c1e131
6616f1482b9e916131cffb03da3590720ddfd8e4
7975 F20110218_AAARWT in_j_Page_042thm.jpg
37f631b543a3a307210fe2042437121e
3b43a4040f18fbc93f297afd7bac8b4009292631
6963 F20110218_AAASDD in_j_Page_044thm.jpg
7ef18f1debc90f681e84851c0653209a
aba2d46ed9c5f81d97887d0d8b2e5ef5e76ce7d7
7841 F20110218_AAASCP in_j_Page_023thm.jpg
43cbe52f5d3b2fdae19cb27cefb76306
c4cbc76531c4d3efb52df42841dde332d20dde03
7542 F20110218_AAARXJ in_j_Page_118thm.jpg
e1113a0bb7f671dc32e7ba3f0e77d015
3dc26b6ac2c25c59b5d504e030ea81588a0e2251
2361 F20110218_AAARAB in_j_Page_036.txt
dfb63e2a2c7a3587f84349d88712ab81
b1ff4dcc4c606d36b9143f3c2b25c23c75003dc4
F20110218_AAAQUH in_j_Page_031.tif
b07a9d091133fb591c7b7019bcbd54ec
bb3b21e75fc8ad0240fb740eb59b07b6bc4411a8
F20110218_AAAQTS in_j_Page_016.tif
1cc544abc95263fb098ac6297fd18dbd
f561e18ed4298348ccab686c4e81620e747ab40d
35430 F20110218_AAARWU in_j_Page_141.QC.jpg
edfb739dae0f90c5ee73748bdc948c2e
30189e0587378240466dfe09f4b1947c6319932e
6920 F20110218_AAASDE in_j_Page_045thm.jpg
94ab0c3728a50bf7fae64d97587f2d98
2bfb446adf21376a16f32bcfd96c561509d6ba24
7834 F20110218_AAASCQ in_j_Page_024thm.jpg
f010109e550135e7990b641b4cfb3850
973f0db72ce7527ac77100d4a5b16a1cde97f22a
22506 F20110218_AAARXK in_j_Page_137.QC.jpg
bc388ca8a2e8c173c8a7178784e1c3f6
e1992b1ef71bc3841327fbdcfe182abf78514b4e
2033 F20110218_AAARAC in_j_Page_037.txt
df1b92ea4cc749eab5a18536d34a6615
0c4586f375b58787da41cf8d59f78ea7a3f009a9
F20110218_AAAQUI in_j_Page_032.tif
74f4bc51212e41df1293698c2f384bd0
c70d9aa1ff7982111a5ac74bbfed9bc5bd1f733f
F20110218_AAAQTT in_j_Page_017.tif
061d74cd92a0fc0ba3043feebca49e50
23d7012d275055d2b6af7c67a8657888cd9f3e2b
32494 F20110218_AAARWV in_j_Page_021.QC.jpg
143428df973685c22d4c9f4360377342
f228c1a0a71eb9de0e1223afa42292af15ab2549
7208 F20110218_AAASDF in_j_Page_046thm.jpg
071b789fc7290e5e562f805c4cbe72cd
ceece3bfd2b11bc15a74c53d477c5750f10a50ae
7425 F20110218_AAASCR in_j_Page_026thm.jpg
5c056fd2b2c98235df092bf3e869b812
0d6591d5e2a4897cb09e6caec07767380455dfa4
7655 F20110218_AAARXL in_j_Page_027thm.jpg
347feade79745fe61631222a20706dee
95f2e8e4ddd714e95fed4517772b28e057df134b
1870 F20110218_AAARAD in_j_Page_038.txt
6208c2fa18d8880443c048ef7ab2e4c5
c3d210b38699ffce5476e949a2a1e7cf9e2b2fc3
F20110218_AAAQUJ in_j_Page_033.tif
d3eb56460c9fa5ac3c65a73a08953b77
f72c6284b40d8a2dd9f32588e135331630f218e1
F20110218_AAAQTU in_j_Page_018.tif
23486fa3a01d947c929d79d4236024ed
f7351d0e26d4cf2c730aae7e53059deaa5fe9666
28702 F20110218_AAARWW in_j_Page_048.QC.jpg
4e3fde2a476b24b4d7bc77b99e7f7b67
451763b0f50451e3873fc83fd3426070b353b410
7750 F20110218_AAASDG in_j_Page_047thm.jpg
594c1061c1a8780682c052114e0cbd90
485e2a9e384856aa9aed535e526ad158824f02f7
7882 F20110218_AAASCS in_j_Page_028thm.jpg
19b0b27c1ebecb4d2ba8fb85aeb057fc
5adbfa1aa18207d7fe0591031291ea107f802f86
29767 F20110218_AAARYA in_j_Page_020.QC.jpg
f3569fd72037847cc868e61b79b6249e
40b044834da49c6261a344e805be4a87f135ad7c
229865 F20110218_AAARXM UFE0013834_00001.xml
a4ce12f56636da1c21ef38843de10e56
c5f3cbb426ed691e5feb0550d9df451bec3762f9
F20110218_AAAQUK in_j_Page_034.tif
d0d3ac9e43596ac22b7e2bed274d5913
299e0dcdb3cb1c58be9edf2a26c40d0d9ae71f7c
F20110218_AAAQTV in_j_Page_019.tif
c53f8b228adcdf95ef8de8256aeeb200
d8799dbaf7fdbc7de4e50c06941538e130ff56ac
7368 F20110218_AAARWX in_j_Page_015thm.jpg
265695ad38248d35a6b7ffc7a1fc12ed
53e85d3d4324e2a13354791df69695e28fec24e7
1999 F20110218_AAARAE in_j_Page_039.txt
7740bd0df5c9db815d204b7c1e1b67b6
0a8fb2aacf4941d3bd3d42e3c0c41dde76cdcdff
F20110218_AAASDH in_j_Page_048thm.jpg
2d6e749651a4f65cb33522dd48151c97
2469df42afdc4f2a57b8b475e6f84ced741d406a
7364 F20110218_AAASCT in_j_Page_029thm.jpg
9e64a247148082450702c048cb638549
0144db98455c291cee0944c09070fb2034be8c6d
30362 F20110218_AAARYB in_j_Page_022.QC.jpg
16d88be50e8bd0cf1e14e022cc40bf7e
381e556598b729e5ee848f74f2478839a61785bc
7343 F20110218_AAARXN in_j_Page_001.QC.jpg
734ef255426bf34ff863375b445ac8af
3e7ee560dbbbb3137d544c9f4083cc72b09fca46
F20110218_AAAQUL in_j_Page_035.tif
bc17839cfd033da1dd3091e80c00c0ef
29a6b1a54d8e39cd3f936a730a077ae884559658
F20110218_AAAQTW in_j_Page_020.tif
f1232608e1948897e1bd2fbfb837ce0f
231bb74cb6ccc5d59bbf49193214932501fbc323
6674 F20110218_AAARWY in_j_Page_091thm.jpg
4593192b5d14fb3c5530af6022cf6ba9
a8a35c56ef205cbd32c4fe7b6bf3cf2ae35bcb54
1924 F20110218_AAARAF in_j_Page_040.txt
d2a17b5c442c03c51c6b18f9111801d9
bbff7c93b6ae5873cbaebc4bee1d3e3eb761bffc
8091 F20110218_AAASDI in_j_Page_049thm.jpg
97a4223aafa2be6ad3b4ef3f25cfe587
59b78ecb5d6f9a2526e3186649e562374cc69fdf
7475 F20110218_AAASCU in_j_Page_030thm.jpg
b6c62fbde84cc24e2ba8c93b4ce0fcd6
fe0726fdbdc389d39e2816f68eaff126f056db5b
30742 F20110218_AAARYC in_j_Page_024.QC.jpg
a80162470a6a88a0593b9ec9c47293db
d32cd0c82bb41f4e4b5dad692fdc2dc0426430fe
1506 F20110218_AAARXO in_j_Page_002.QC.jpg
e1d5951fc64e53e4d00dba012e06b561
bea307d23e57d52d0ab839a27e92e31c933f97f0
F20110218_AAAQVA in_j_Page_050.tif
8d65fdcef12a23ef1d5a3e9fab041091
a7f537681db008beac764acc76955ed28a088f14
F20110218_AAAQUM in_j_Page_036.tif
74744f373e7b8e0ebc16e5aaefc783e3
9e451623d8815257b454eab05bdf71100beecd0b
F20110218_AAAQTX in_j_Page_021.tif
91081d0f40061358606f937131ba6111
04795e0fa1d29704ed23ae4ac54abae47c21f046
6922 F20110218_AAARWZ in_j_Page_078thm.jpg
2f3ff485e3e5f355b451d19483612463
6fc008a81a40355079eb6761efe4597277e8588f
1594 F20110218_AAARAG in_j_Page_041.txt
3af0658a019125090559f0c34ee7e300
c1afd2f1bdd0093c654918e7a7cc8f14c5ab4cde
7477 F20110218_AAASDJ in_j_Page_050thm.jpg
ca7f2963f71cefb0f9e01ca0b3da6619
65f1bd283ba656eb426748066bb1d9383c2ea51c
7939 F20110218_AAASCV in_j_Page_031thm.jpg
da1db32c5b9ce3b4de87428694d2af3d
729b2fc97f058d0096e5b75c57e38a6682ba0a8a
30401 F20110218_AAARYD in_j_Page_025.QC.jpg
209db7142619f90142008e4ef70bbf76
5c1a9f1e31e856aa83d9a3a49308d42eeac618f8
1843 F20110218_AAARXP in_j_Page_003.QC.jpg
1b88ec0cfd5165df357c6a44431a0562
fff322e00313f0ec3a98d73c8c89098006aea2a5
F20110218_AAAQVB in_j_Page_051.tif
9c86a12d562ba197fe50a99eeedc69ae
98abdb35ab5c42516211db34618896336a3c62d8
F20110218_AAAQUN in_j_Page_037.tif
471422bc581b8799dfbabc58085edc2c
0e3aad97e6e8299f7df344fd48f3560f7efdcec9
F20110218_AAAQTY in_j_Page_022.tif
e95b3ba45750643dbb38c0c070c4a562
ef0f04b8ae75573a06a5fb7cec76685c189bb883
1876 F20110218_AAARAH in_j_Page_042.txt
286edda4d6d5fa27a769752f272c8d00
a4613a90ae76db6c6433b368362a1b127cef2684
8337 F20110218_AAASDK in_j_Page_052thm.jpg
3e233114e4802cf56ecd8783dacf59d7
e8c4c122a881b1435b8f8ca8cb5b8b071901fc69
7973 F20110218_AAASCW in_j_Page_032thm.jpg
bea5cc5b441040b580f2b922461044ab
16f1a8c1c61f5138423beef4b9ca43de5c1a6e90
29777 F20110218_AAARYE in_j_Page_026.QC.jpg
2bcff6fa03e4e4c6840b5f75503ba4d1
cf09f0b6e27d16eadfbf24501e2996e9da7e4684
4598 F20110218_AAARXQ in_j_Page_004.QC.jpg
3ee34323d728dcee521fd1a47ba9d67a
9033dac80cf06320d7e61e6295bc682810a84c73
F20110218_AAAQVC in_j_Page_052.tif
2a813a51fb52ccf57834c3161f242bd0
b1463097404a9396d8b046e31f54272ba0f4a9a4
F20110218_AAAQUO in_j_Page_038.tif
86b0ebe2f12e5a24963d5b50b778c658
6b43e8f849d9393c1e1541c197cf525fa01cc256
F20110218_AAAQTZ in_j_Page_023.tif
3d4b676b3322a538fbed8621b11f4f10
ac114b1f7ff3215a7b4e074e7c3032ad1077930e
1824 F20110218_AAARAI in_j_Page_043.txt
a30a244b7378a44e3f572e396f0fd4fc
9953c984cf5ce5e80a4b6dba5929d6015bf56186
8131 F20110218_AAASDL in_j_Page_053thm.jpg
1ec22d4d40d0dd4f35d4b330f687c9d7
2ad62dd00891dc05fa68aed18f4b15fc9fd9d822
6876 F20110218_AAASCX in_j_Page_034thm.jpg
b801d66d49eff1ef3c937b80ec6b610b
642db2556354903bb249e8ba4989dbd8c20bd9fe
32046 F20110218_AAARYF in_j_Page_027.QC.jpg
ff662e829283ba7e882390b1b5fd8798
4c00b1b22ac7b4866815915229981b23a309a200
25833 F20110218_AAARXR in_j_Page_006.QC.jpg
91ee63aa00bd3d7a57c503b43a4d8727
ccc3a7afd6034976ec821435faade3065e47b4e8
F20110218_AAAQVD in_j_Page_053.tif
caaec0ba730a7b8552070f2ad95358a1
accd588c43ae671148a42edb10f025d10e9804b8
F20110218_AAAQUP in_j_Page_039.tif
4d5a4c2b1516bbc8d59d2eb67ebd5c9b
95c70f6b0df69b22ab24c6f7b74a604f6bfbf8aa
1625 F20110218_AAARAJ in_j_Page_044.txt
f973a3d614f188a529014bcc1daaa3bc
f2e7b0a18691496049efef63a3672ffa27dd94a2
7639 F20110218_AAASEA in_j_Page_071thm.jpg
43bad36630133e424751e8b7546b97ae
f5ac2c6b3f045d91235490e568575254ca560b58
8434 F20110218_AAASDM in_j_Page_054thm.jpg
08e75ce85132e9c6e2adbb451abd76cb
256996df82df5872617680331537d89168110de7
7405 F20110218_AAASCY in_j_Page_035thm.jpg
c05f5242b44cb2cd9a6de213f0d26cf4
218f06f162a4e837e275183010516ed2dcafabf3
30726 F20110218_AAARYG in_j_Page_028.QC.jpg
68855110234a03ac6f1506d1ee7de4de
f3e17cc34c9ab11c3a62a0ab7ddff76ab38abe80
12992 F20110218_AAARXS in_j_Page_007.QC.jpg
ab3c8540004bfdf58b91b7d90c7eb0fe
281fd6572f427feb7c5c481cbb5494b5bc2a0e5a
F20110218_AAAQVE in_j_Page_054.tif
156ce70a680f99690f5de88e02ddd3a2
1fbc80167ded3e76474eb96771711f76cd7e6583
F20110218_AAAQUQ in_j_Page_040.tif
cd6868148a95c5c9a17c251029216b05
033825cd4d7ade7f24b76c8f7b13f57402309a13
1658 F20110218_AAARAK in_j_Page_045.txt
4ecb8427f9a449062cd98505b33d9277
1a5243f9c26f1ea892b1fd639768b98f2db0c4ec
8362 F20110218_AAASEB in_j_Page_072thm.jpg
4b653d3e78cb7dc230c323552b012bd8
f4db5ad7f0d3bf247646cf287d731db60c62eac5
2309 F20110218_AAASDN in_j_Page_055thm.jpg
4b3242a2750785eb98961c02c5554dba
170718eb3efabb65a015f37e77f04d8af6f0adb4
7641 F20110218_AAASCZ in_j_Page_036thm.jpg
e76af27ee6970bcebbcfe252f0946a58
7fadbc2a30388d33a488ad8970202198d8de1857
29156 F20110218_AAARYH in_j_Page_029.QC.jpg
719e8ebe894da58f64b0943844fe7421
e116b3319a7d4830c6f90677a2687de2cee31a3c
10427 F20110218_AAARXT in_j_Page_008.QC.jpg
4f80626ed017a52423e11a5cc6ce3888
9925e49994395f704c7aa7e29a4275a8508a7188
F20110218_AAAQVF in_j_Page_055.tif
74132ec5af07e82fbe9cc9405eab90f6
1786f084c8c22809ab1eefe3360771e8265f0a91
F20110218_AAAQUR in_j_Page_041.tif
26479b438905baf0d05ecfe12705d552
9ae8d87858576c326d73f389dc54fbff39817800
1698 F20110218_AAARAL in_j_Page_046.txt
e78ea028d6b08f4fe7d1be935a3c27ef
a471bc84e0f56511592bc960d50a010fe496a364
7840 F20110218_AAASEC in_j_Page_073thm.jpg
5b93d726c18a3763fbf7fa8e98b60c89
6067a2d44d003614ea8e194187271e74b7fb3df0
7059 F20110218_AAASDO in_j_Page_056thm.jpg
ffb87450ce065279451384a07e9e9984
f38a85699df2994e2ec2a7cc4f469597a58dd02d
29297 F20110218_AAARYI in_j_Page_030.QC.jpg
126523aeec1d4827dc402ab0f1ba4f7c
c933010d8858ae0d041b1997f82dc9085e594743
F20110218_AAAQVG in_j_Page_056.tif
1cbefcd4c06792f9a86137894fa46a8e
d08a01f24a3a58b0d35233ae8ef915b0ffde9b71
1869 F20110218_AAARBA in_j_Page_061.txt
19f2c04926eb6a7352ea913f34daba6a
795e258705c39481b6da1f47d65d5948646b5b02
1894 F20110218_AAARAM in_j_Page_047.txt
21d6057a38d83c1aa71f316a6ca05c77
88ccf81453d461eff75f4828c3ec15f3947e751b
5500 F20110218_AAASED in_j_Page_075thm.jpg
55e67c0a7670be450a3a8a2190091598
4ba75e038df1bff57dbb29fbea6bec7a807752e7
7902 F20110218_AAASDP in_j_Page_057thm.jpg
618176f232c7bf8ff445e5850624e2c9
3152e5dc973f9f4f14679a2d13b254a93c32774b
33947 F20110218_AAARYJ in_j_Page_032.QC.jpg
8d52331633ebeca61450860cece9d555
538a6bed6ed0082d7d53192a24d39454696e3e27
31815 F20110218_AAARXU in_j_Page_012.QC.jpg
e75c9acf92fb60acf750a510bc948aa1
4672770e3b8fa59daccb43f75d3426f45536897f
F20110218_AAAQVH in_j_Page_057.tif
8cadebeb943d0596ab10f8418b837e3c
3dd30be02fac46055af5eb9eed0633e9805f907c
F20110218_AAAQUS in_j_Page_042.tif
dbb7ed216a0854dee06b31b5f088b8ce
5a04c108b9fe55f4143b347acfaef6804190210e
2232 F20110218_AAARBB in_j_Page_062.txt
96267227d99449c585e3eb26cce87323
2e4abd57847811aa3f8a785675318fe3f1371469
1676 F20110218_AAARAN in_j_Page_048.txt
1d86b0bceae5374d902a22d54e712d4c
40fff8fd0fbbe04c676a08254ccc825900157035
7575 F20110218_AAASEE in_j_Page_076thm.jpg
0236cfe06577876712492ea2e20b80bd
713b7541ec6e19f7aa22a2ef2dd68d513bb5f741
7791 F20110218_AAASDQ in_j_Page_058thm.jpg
63fb3231e289d3eed5a6a00a259766da
85b43185c7cf7a50559f59e003b9e82334462fdb
41364 F20110218_AAARYK in_j_Page_033.QC.jpg
64246476d5373f32ccb60b036014ed51
736221f80a16561bfe462ec6cccea68d7c7df8ad
24634 F20110218_AAARXV in_j_Page_013.QC.jpg
54bada3e05c1aef76905f89f01b2f62d
a6cc6a9911ec1d48cd671a18287464f43286b27c
F20110218_AAAQVI in_j_Page_058.tif
97d514e125e68ea5c47e9a4efc791c31
637a99651e00a79990418736c21fd9aa8e4ac53f
F20110218_AAAQUT in_j_Page_043.tif
2194680e3d425a5c48d88110e9809ea8
b9ee76fb7b7a7d628121da41d0222ef57666ab6f
2777 F20110218_AAARBC in_j_Page_063.txt
b62ca69e7c7077982fc217c24fea8b4c
85d807ca05dadf072eac12bac7bcc61113d6c85a
1974 F20110218_AAARAO in_j_Page_049.txt
2934c214dcf8e26aab3711257e9f5092
0c2c882173cd64971ad24867febd6e3b4659b6cb
6871 F20110218_AAASEF in_j_Page_077thm.jpg
c232a621a88b36badfb2ac1a43177f11
e763e8146cbc883efdbf6ab691e80094364d3f1e
8297 F20110218_AAASDR in_j_Page_059thm.jpg
41889c0ababa3ff1f6e717540b3e9770
615100707998723f8c0b566e2c5e64ee5c23c0a6
28876 F20110218_AAARYL in_j_Page_034.QC.jpg
8835a45efbbed3d2e58f07f01149ab1a
0661a00a4ec81fe67c3b79e8860aa22a33ceb588
32474 F20110218_AAARXW in_j_Page_014.QC.jpg
4ed3b9880eb5090a624dee06798544a7
f307b7438d983a9b981ee78f86e87523dac831f7
F20110218_AAAQVJ in_j_Page_059.tif
8a66cf85612b28d3941dbe66c055cb58
8f47ed72d7d7925fe8083c2de87e396041f75ea0
F20110218_AAAQUU in_j_Page_044.tif
48c6a85a0728770ef0191cb91c630d3c
fab1916f0ad773d72ef75537d363e7606df80f61
2011 F20110218_AAARBD in_j_Page_064.txt
1c87cda3ac4b94742254f5fb7f2c578f
cb38770ae71e18a17a47f67c05b0bd57ae27e3fa
1557 F20110218_AAARAP in_j_Page_050.txt
734e6e101cf33d5d1deba26252abf167
a2eaa9c87c885766b785e27b3f0768b0aacf809c
5447 F20110218_AAASEG in_j_Page_080thm.jpg
9d3411305ff726bb67e0479ca2fb7b9c
5f07ff6f2d81d98b985884bd8defafe00d096347
7512 F20110218_AAASDS in_j_Page_061thm.jpg
2e07913f322dd0ced052c1216b4b6a90
acd39b0c66cf49577df9d90ab555443576be7273
9210 F20110218_AAARZA in_j_Page_055.QC.jpg
812b15969ea19a88a5f42b9049afa838
ea366f73bfad9605d956e140d202b6136eb40a55
28466 F20110218_AAARYM in_j_Page_035.QC.jpg
5e6053171f7bc3c3d11abc2c33f826cf
d90f789c16eb19049fba0b8ebc402f766c3c4ce7
28136 F20110218_AAARXX in_j_Page_015.QC.jpg
b5c7fd1a8ad08a7daf401fe8bb71d2c6
b724b9efbc9d49244e82b6ce6c81a49f9d98f868
F20110218_AAAQVK in_j_Page_060.tif
10f8693af4d1ba506cec8418ff4648a9
309e91e51959ddb81ab7f876c9a95c3742cabb50
F20110218_AAAQUV in_j_Page_045.tif
1bc4178edc69d9d08f0d1e1a753e9a04
55a2afd587045771602acba79ef50e5b0b49cbd5
2517 F20110218_AAARBE in_j_Page_065.txt
d3b764c72d3c71b4bc44d68d0bbbab33
b4dd7b26ef0eb216771245905661adeed10e2738
2020 F20110218_AAARAQ in_j_Page_051.txt
34cc171cf05a32806a869af2b75b31b2
03852dfb8a382d2efa6cd40ef5c9d8b84387e5b8
4516 F20110218_AAASEH in_j_Page_081thm.jpg
625073be401c511a2f250a894e39b5e4
d110bce669d1409fe26ac15ef6c6b2d92bacfb1e
8289 F20110218_AAASDT in_j_Page_062thm.jpg
affc5b7504d174352a9d9301ba44de0c
a75fab24ba5dff84d86b443250794858fd664abc
27331 F20110218_AAARZB in_j_Page_056.QC.jpg
fef654153b1c96799520299cb0e46ec7
271733454722225d00ac536f188b0375ee3416b3
32580 F20110218_AAARYN in_j_Page_036.QC.jpg
bed57a6f9e353175f48d3c1f553222f2
984e21cc7e21c0c640b39c6954c6bcf35ed6a99f
32285 F20110218_AAARXY in_j_Page_018.QC.jpg
de29361867a1605510fab4e345ed967d
5422df1642a86c483f3a75ab93114bfb9ba7c41a
F20110218_AAAQVL in_j_Page_061.tif
a324651acbf73f307a9070b25a335070
46f93da6f7f4b57165a1aadb37ec4062771fd1c9
F20110218_AAAQUW in_j_Page_046.tif
33f795e5f1803de11cde091f58b7a923
874a06ae1dcc493906cf750b1fe7569474049df2
1764 F20110218_AAARBF in_j_Page_066.txt
13dc754594cdde670614062ce090417a
4f8e0c3f6b80c392890019db318ee494721195dc
1771 F20110218_AAARAR in_j_Page_052.txt
406b85c3f6eac376680d72e8adb92690
64858837775a0e743275241827ea0c90a0497f54
5328 F20110218_AAASEI in_j_Page_082thm.jpg
e05560600baf776d4e53931039b4696c
59a18cf8ab556cce22745f0982f3216687319d6e
8520 F20110218_AAASDU in_j_Page_063thm.jpg
e5b990e700645c04072d1e769ceb60b0
4a17eba59334ac683621bb2d9915653dd4ad69ed
32709 F20110218_AAARZC in_j_Page_057.QC.jpg
3f72dac3547f3cd3c9871edc8998ad35
215f4be3f732448a2cb3eb51faa95c6f7c8aab81
33052 F20110218_AAARYO in_j_Page_037.QC.jpg
b3a1332df0c71ac33903d3ad95f0ea86
f7853e18fc93be9b26cf1e27b4ead554140eb7ed
32458 F20110218_AAARXZ in_j_Page_019.QC.jpg
d49f035c11745c69ac9e367a213faf84
9f9e2ac2a7b13ce184cf788455f7f259a97b6aa6
F20110218_AAAQVM in_j_Page_062.tif
cc9d116745aff7978e172701cc112ef2
c20d3640fe8d6e9ee7f2f2020504d6aab372785b
F20110218_AAAQUX in_j_Page_047.tif
25fc703e63cf362d9078b15d1cdd5bcb
bf85c99e0bec4a0e26ebe0627e8b6a27d21185cd
2157 F20110218_AAARBG in_j_Page_067.txt
e76823b973e058571e29d51b8cf2bfc8
c84d0ae7f2bbfca2b4b31bf47cf782fcc752b7dc
F20110218_AAAQWA in_j_Page_076.tif
244f9fa405174aa52692bc39cf6d47d6
ca83750187ceeb96da646c7dbf7ad5464a1c6771
1951 F20110218_AAARAS in_j_Page_053.txt
c7aacacb8e117e2839ec6d1cb985ad26
bf163b3633c9f9cc6e965b075e359bdc4e1940a5
5208 F20110218_AAASEJ in_j_Page_083thm.jpg
58c8566c3781aa4d9b3c9fc0e7b31c2f
e2b33b368b86e34b1d6c88080a8d40c4d989b9b1
7953 F20110218_AAASDV in_j_Page_064thm.jpg
0c195fa7210dfd5208eae39a40642bca
7c8147823d3d4f5d4c60e10026da7b5b95f56120
31248 F20110218_AAARZD in_j_Page_058.QC.jpg
616cfa62c27b40f1ce2e86da3f0e6f38
200d32c88c87b368d269c510100700dcf4553a27
31421 F20110218_AAARYP in_j_Page_038.QC.jpg
7550f547363d5307edcac40b009cbb96
078302c35a9e3a6fcba9f160605cf646c19d8796
F20110218_AAAQVN in_j_Page_063.tif
29526db11b0a294315a4f6d61cb1e4f1
4e8fe00a5300729d297295114292d058daa25d0d
F20110218_AAAQUY in_j_Page_048.tif
2fdf3a5b39ae73635cd2b51560c41260
0790a5ff81b8df6f8c8898a582c194b51da47e41
1954 F20110218_AAARBH in_j_Page_068.txt
b86b2cfbb6d0c6aaa50adb8f4f40033d
d4332c4a79888916c8c78372ab8db67a2f8340e9
F20110218_AAAQWB in_j_Page_077.tif
7f49b70a88cbbc129d8f18074354cd50
feae704c7d6f9cc6e9d55c52308df170d551fe8b
665 F20110218_AAARAT in_j_Page_054.txt
d2b9f77952699c3efee546e7d0e111ec
d2ecda83e6a17800a7b76fd1a15f1599e0ca973a
7023 F20110218_AAASEK in_j_Page_084thm.jpg
aec9e0af05e4ffaf70020dfe63134849
da957a8f914b422664f355fa597abdbedd2fe021
8488 F20110218_AAASDW in_j_Page_065thm.jpg
78369bac19b71f1eef6f16d5a181b7bd
5f7165848a522b0b8d37472e792c9a9faaa5e64f
33588 F20110218_AAARZE in_j_Page_059.QC.jpg
a972e2f156ceda463de1f2941c2dfb21
130ae890ca673fe740ce58b57df42402e091253e
32534 F20110218_AAARYQ in_j_Page_039.QC.jpg
7f37e91d841865e0f657001fc53f8ffe
3567eea091dcd1c5f3454ebd055985b3775d5d44
F20110218_AAAQVO in_j_Page_064.tif
95f40dc2b5a1375b412e568e7de7d7d3
f3b48912c2c6e5f5d4557def87e7ffdf4c2e3fbb
F20110218_AAAQUZ in_j_Page_049.tif
a87163a6856cc6a37a906eab6f9c82b8
21e3377ccc87da519cb59bb07c2f3955cc06054d
2234 F20110218_AAARBI in_j_Page_069.txt
4ff08ec4ef26ac861817d86e3a57d755
001d41aa13e2eec2e16e24fae9034c4d3f59a0de
F20110218_AAAQWC in_j_Page_078.tif
1c0e151e14553afc8036dbed81e73509
52e81380adcc7ce8a1b79fb89418f7e64acbdb34
482 F20110218_AAARAU in_j_Page_055.txt
a197eb37c8f22433d5b366fcf2e963d1
d7508f777ab2e98b4fe617943566b1a90bb9f8db
6499 F20110218_AAASFA in_j_Page_105thm.jpg
14ffa36846cb4dd90c3239b66cbf85e8
a7cec3a4a6a7e9951a79f05a8ea43a717d469fdb
4605 F20110218_AAASEL in_j_Page_085thm.jpg
929291f6f939f9f8b62a1d907f3506d2
d894f09599360df57eba74a714864ae05a38e23c
8223 F20110218_AAASDX in_j_Page_066thm.jpg
d9154f5457eab5d0089d300348a5507f
d70dc6348ac4acec89be5bdb60da9d94d5916b05
28408 F20110218_AAARZF in_j_Page_060.QC.jpg
a342cb6811f74d41d34cfca67f25ea00
c8b91df21b9a6469e447f22fba30347d7993dcfc
32367 F20110218_AAARYR in_j_Page_040.QC.jpg
b2a94665fd0540f5d70944d8756e0d2c
6a730ef913abe6d61d359a8feeb0dc9b8eb1488d
F20110218_AAAQVP in_j_Page_065.tif
7eddf6fa15acd03ab2a03331207ce8f5
59a2c135021379c3d91f699b734f6808e7981bd2
1897 F20110218_AAARBJ in_j_Page_070.txt
42707755be351f84ca78c4c6d330d6d3
ccfd626dcb5dc0dcdb11310154242b444bdc7071
F20110218_AAAQWD in_j_Page_079.tif
b1e6a10caaa6890240e83779ce82485c
2b61825140b86193ec31cf4b7eda2f4da72baea1
1704 F20110218_AAARAV in_j_Page_056.txt
8376e22e77bfd7f5a2ffec7ead6add18
ba5c3c3a2de15d51b105844784e7818b4aa1920a
7747 F20110218_AAASEM in_j_Page_087thm.jpg
e6e859a608dc094ccc6d77fdabef55c1
a546cdcbf03ba87c0d4525c2c59ca58210f1144b
8318 F20110218_AAASDY in_j_Page_067thm.jpg
d6afde8decac0b8f73d0b17411d3f9f1
4eaf02e2bb9e3bffa8c5b81d29260045d93479c2
31211 F20110218_AAARZG in_j_Page_061.QC.jpg
471596fff052ef1ba4b875f02fd8a7a8
21da13eaf2303b174d4c7993326af3b8e50e8207
28330 F20110218_AAARYS in_j_Page_041.QC.jpg
f9d4aa52d7e3d7d6491533acf2da6526
630442d842f1b4c928b81e7a278e03c38228a588
F20110218_AAAQVQ in_j_Page_066.tif
6330d79c2198e06d19e776f7230d975d
5b99f18ad2c20784fe85938f039b374f65bb98fa
F20110218_AAARBK in_j_Page_071.txt
57e4bc429c163ac4abdc1fe405866cc1
46d4459ad18a6e9514e40640adbfa2d40e458cf6
F20110218_AAAQWE in_j_Page_080.tif
09fa8c47c24ab4d21f0b5670f545d4cc
b8a7a4f8f0ccbd6ec2cc93b7170fcfabc6898840
1984 F20110218_AAARAW in_j_Page_057.txt
a82478551571d183c7483a8c54e0376f
6dcf35065c4eed2884c4ee32ba227e3033479db8
8658 F20110218_AAASFB in_j_Page_106thm.jpg
f074770860e064d643eae23624ed9837
5435ac9c6407174807a536671c5f5eb802616b69
8154 F20110218_AAASEN in_j_Page_088thm.jpg
1560b7c6808f7caa1834a548142a1918
e678baed2d6dc91b725902afa968773a5d577087
7300 F20110218_AAASDZ in_j_Page_068thm.jpg
3b9ff43c4c8da512cc16f813e6b5d108
c4cdb6f150d4fd466a05fb1e3f222c78baeb6ea9
34043 F20110218_AAARZH in_j_Page_062.QC.jpg
f2e77a0978cf22bc32ebf5ac094a80af
4156cc2f332188e9ae6876c2924756da10588225
30292 F20110218_AAARYT in_j_Page_043.QC.jpg
5bc06a885dfbd1c1991633612a30f9be
3e276220925763e9c5f024d33650c556c1d6cdec
F20110218_AAAQVR in_j_Page_067.tif
b5bdb982a82340e09872418e5eca0d54
e5cb48f739731b215e81e16ab110fea930e52156
1918 F20110218_AAARBL in_j_Page_072.txt
8d480b0401313f0c8a3623e8e984b3a6
5845c34a42f8a9707efaf4a2bc38679ed9763ee3
F20110218_AAAQWF in_j_Page_081.tif
986d381920396458de23b3c0fa2e452e
515bad77c4817fbca65fb6f6aa65ee48f8cc319e
1884 F20110218_AAARAX in_j_Page_058.txt
b6ffaa8d7e1e38da35ad2d530d5c9807
bebb8355482389fcacd0a7fa36ee3741bcddb3ff
8038 F20110218_AAASFC in_j_Page_107thm.jpg
78a49780d066b7507897aa1ea1348d53
2fddea6db3c3b15eceab4cb2cb3113a13c428bbb
3648 F20110218_AAASEO in_j_Page_089thm.jpg
af6513db378d530f1ab0158c92fb2df6
a3a05c230109434df8b580bd0b84ea17d2eba66a
32941 F20110218_AAARZI in_j_Page_064.QC.jpg
b242a25e4f77ad2665321621cf461643
804ac2b928ee83bd4f5efedf32c9c2bb131a5357
27225 F20110218_AAARYU in_j_Page_045.QC.jpg
db0b8ffe37e8ed70a310dc4009ceeebc
a5c35cebf55107e26ba45a56d14f029dc06f9ada
F20110218_AAAQVS in_j_Page_068.tif
775e49eeefdd06d3dc84672364d8a763
2d63ac7eccf0d239d29d1ac6dd5f20efad6782f2
2129 F20110218_AAARCA in_j_Page_087.txt
5872f292b121e5efb7a58ae12ac89429
10c9625dd90d7d80a2e82428425c7e4e84166c47
1913 F20110218_AAARBM in_j_Page_073.txt
7b76c775302012c67236e7e10fccf618
136b8bf696f5b8731468271b704d653bf71a8daf
F20110218_AAAQWG in_j_Page_082.tif
f8d14273b698be6960a63ca6dc0e5f3f
246d6a0697de0bba0d12468a20f05a9d45ace85c
1988 F20110218_AAARAY in_j_Page_059.txt
3963fc59e773b03d702887c81ed55e70
fbb2b6db0ee4fd47a68c1e937bd0d4d9849156c9
7410 F20110218_AAASFD in_j_Page_108thm.jpg
c16d8d6e17fca3bbd7023c371004760a
1e5e3b44d014d500197aa1a006fce716841fea5b
7594 F20110218_AAASEP in_j_Page_090thm.jpg
d6c42bdd2d7262f4f2ae87c383739854
5943aa84e58765d87bff6985a500ed7aba29544d
33993 F20110218_AAARZJ in_j_Page_065.QC.jpg
14933333c5e9f9b01684016b1538da83
2cf7bace46bcad2e282d628f67351f33a1fdbc05
2164 F20110218_AAARCB in_j_Page_088.txt
7ac4f42127f582c8ad0216d6582606c6
bbf03494f05bb9a39d12170cb416e45824d0f3c8
1354 F20110218_AAARBN in_j_Page_074.txt
317e587372da15e47b359b4da04c8e2f
60b877fea91d01a344cc126f168378e5d3549ae3
F20110218_AAAQWH in_j_Page_083.tif
873743725708e2f4d241f857bff48894
caa6faba0ad725d18266c16a8c23b23194db7d06
1038 F20110218_AAARAZ in_j_Page_060.txt
e889c367cd85b031016c32cbc99a55b2
a623cb09d241d2ff794d04f93fa9e69b292bec4c
7967 F20110218_AAASFE in_j_Page_109thm.jpg
f949ccece7487b2ac7b1d08d9da0d484
8dc5e3daf2831dae6dd9d90e61de5f70651560e5
6788 F20110218_AAASEQ in_j_Page_092thm.jpg
fab76d9aa64f47a070ff3cd0fd734101
8db01ff774d6bc3c4d79857ee9871374354e66d3
30795 F20110218_AAARZK in_j_Page_066.QC.jpg
cb350f429d6bbae0c995313197048b6f
59da3359a0f1ad662ae889a3309dbcd85f67acbe
F20110218_AAARYV in_j_Page_046.QC.jpg
ee87633fa4efb105bcdf6f4d2e97cfdf
0301fa855b272d080c9571cb58f333638e1f0c6e
F20110218_AAAQVT in_j_Page_069.tif
1bbfbfded09ab2958eb8f59c7b96e2be
be2514563285895094babe95d5bbfcc663c367f8
783 F20110218_AAARCC in_j_Page_089.txt
d65f426f427b06b9b537c5c5abf85bd0
3b2ae7464743e079dfadbe12b4108c993b171fa9
1544 F20110218_AAARBO in_j_Page_075.txt
892e19cf90db5a1da3fdfec700d2c8f5
7d53819b695492d322f17f56af9a26751cb286db
F20110218_AAAQWI in_j_Page_084.tif
1a2fa6996512b49122e3f8392980534f
fb5f4a5ef51fc46481a271c45079905ab82fe6a1
7255 F20110218_AAASFF in_j_Page_110thm.jpg
2d616e332bb2e0c1106471b8beed071f
60758da480a17b3126a3b1b9a69188aa4c746bf3
7117 F20110218_AAASER in_j_Page_095thm.jpg
a79f5838e34969e3b34a774f3b655d8f
801b59ab8c89fd843d56779ae06c5b517428c11b
33206 F20110218_AAARZL in_j_Page_067.QC.jpg
aaed4af73bb4eb981ade2cc61dcc5497
8605837bda770c9c010b14faf326db6835e3cb2b
30335 F20110218_AAARYW in_j_Page_047.QC.jpg
2815b66c7e2533137314e093a2998c9a
c9d132441999263c28d32e72a6e1d597545bf2e4
F20110218_AAAQVU in_j_Page_070.tif
90da3658121fa7bd28fcca62794f42e7
adb5c6328edb796a65cdbf0758bf69e586a5d80e
F20110218_AAARCD in_j_Page_090.txt
a11e2c51d7e1a7c2e378577af0e004a3
5aa4d51816004e5018dbe88076b28e42e8e30707
F20110218_AAARBP in_j_Page_076.txt
85fc75eaaeb449fca4622733c497ed2d
33a3197ae08f9e911cf4e15f5c8d76ed7ff0abcf
F20110218_AAAQWJ in_j_Page_085.tif
cea212d8176bd727d7e6ac41b7e9ea5b
98b12c61c56880ede17e4cf70b0a384b1f6c0cb0
6565 F20110218_AAASFG in_j_Page_111thm.jpg
443b8c1e042ef7178e4f717e32848092
db53108d038c2847e2b04972aaf08e4a052e7e80
7241 F20110218_AAASES in_j_Page_096thm.jpg
df276379d6070a05fc8729335d6e4913
8868c8371b7e99e2d90763fb11e9d5aa3816aaa2
29558 F20110218_AAARZM in_j_Page_068.QC.jpg
71de5ed81ada5ba977073e4b4836c54f
4e206fba0b78e1afa63c5f3dc6680c274c176f8a
32080 F20110218_AAARYX in_j_Page_049.QC.jpg
7a2fd253e5e1dbec75ff2fd72837ae27
997191a9a762aa40d6e0caeae4a23ab15c675376
F20110218_AAAQVV in_j_Page_071.tif
d7db69a48cbbb63807c1b8e4c527622a
d2a65135d48c2d238562cfde716d9d1b960ffbc2
1590 F20110218_AAARCE in_j_Page_091.txt
c08470450f21365cac10604a1573bdaf
91223a0886d01740c5c5b08223e6b245a5a9f6dc
1522 F20110218_AAARBQ in_j_Page_077.txt
11305de52d3258b5b6687e3e50836098
0bcdcd9a41496108c55d939af5a325a8b05e20ab
F20110218_AAAQWK in_j_Page_086.tif
f4f83bec76bb7c41c3f64d61140f182b
3e1cda39c0bc1a6b888d60863c0cca762eb1b3e7
7313 F20110218_AAASFH in_j_Page_114thm.jpg
0344fd8fa7c2591dfe3b6dbbd0af489c
334150e45e5157f40c8ece85fdc8ce58dee12f9e
7919 F20110218_AAASET in_j_Page_098thm.jpg
e03c6365156d540fbe176e872ef17958
4c008d6d01de6dfc91d530c2ff6545a9624e6772
33635 F20110218_AAARZN in_j_Page_069.QC.jpg
deb2f1954ea491f4d3f39103556c4e96
b33e2836489cc4a37d1cb1e9ca5545758bb3c6e7
30826 F20110218_AAARYY in_j_Page_052.QC.jpg
6c6a5b6f583cacf50066ed619328e553
e193b64433525648b0a479e6867109c725e4d13f
F20110218_AAAQVW in_j_Page_072.tif
10f5bd120cdd12bf01b07c955e7d2238
ace1b8520f50b5cf940068b04f973b06c0f3aeb0
1537 F20110218_AAARCF in_j_Page_092.txt
37220e4bdf6edd095253e7867c7cd1b2
c33237dc7fc1314b9948df39a4574274b71d5a6c
1627 F20110218_AAARBR in_j_Page_078.txt
a125c577cb8761778c1a543080cb5049
25b405dab593360573d7b26e143f76df21e5f760
F20110218_AAAQWL in_j_Page_087.tif
130faa3b8d598f446316ddde3f650154
00215f02d02f3b21816ff851323b898057c0cdb1
7963 F20110218_AAASFI in_j_Page_115thm.jpg
e5315453dd9be57c80ed34e9a251b19c
b00bf54771a65e6aba8dd5e7470e1fc01b6c587b
8295 F20110218_AAASEU in_j_Page_099thm.jpg
83854e3ae0547974efe0226a7c539b69
2abcb45c121de0002f9f1ae8d25206e5a2f71973
32315 F20110218_AAARZO in_j_Page_070.QC.jpg
e0a66cc8532f076548b2e0fa753aa7e2
5119c7662bca01b0cf6d472f52c88273f99ccccd
30544 F20110218_AAARYZ in_j_Page_054.QC.jpg
639dde9fab45c17c5ce305b900cdf05c
d7cd000d21c4750eac10a0a117be4669d5195599
F20110218_AAAQVX in_j_Page_073.tif
0778d27511f148f37b0ed25bf64b7bcb
a747036e2bb298b9bd2734dd996d85979485e4c0
2583 F20110218_AAARCG in_j_Page_093.txt
ab71f2de125a274e16e5b8715f14d5af
0a6468d57de919cf2f30759337a3df7be206ef0f
F20110218_AAAQXA in_j_Page_102.tif
3cb5a5541a9ab203fc4c9aff80624bdd
b7aa317c5cf6eb5c940f9d4681843d210d0efe52
1142 F20110218_AAARBS in_j_Page_079.txt
7651dc6167fd7cf19563a5f8bd2e7ed4
cbd0432151ae3df1f5a1ee748b0ba166d79e0890
F20110218_AAAQWM in_j_Page_088.tif
56f6ab6ae3e646359a73ce3d1a63f11d
c4614ce3f3198db835851a3a193daeee106a4e3f
6935 F20110218_AAASFJ in_j_Page_116thm.jpg
41a27cfc64208eee3c52d7729db91bb9
b6ae3fcdf1a295075e0f30d91daf1e0038f31116
6750 F20110218_AAASEV in_j_Page_100thm.jpg
84a35b7516cf3af79a15cd2851f7c0e0
630272e9dac9b870b07e7b6595626d7ebaf3a113
31576 F20110218_AAARZP in_j_Page_073.QC.jpg
912a1f86336780e95492fe7c1aa47cdc
f1ad37e2d3f16d1955cf049d601755fae15e65de
F20110218_AAAQVY in_j_Page_074.tif
f8a846390974b9196489f8fb5fa14673
9e249f4cb4c95e29b67e8bbedecf76625d3627d1
F20110218_AAARCH in_j_Page_094.txt
6e605b437c5808c88696b0a71601e91c
b55f25318038ac8143ff0b87068294706a4d7100
F20110218_AAAQXB in_j_Page_103.tif
94823e5b2e0b03dd90d53c7f0117f8b5
0b3da07d68c67fe5e0b014ee4fafbf0c642a24cf
1281 F20110218_AAARBT in_j_Page_080.txt
1b72c0659b20a3387d13a0023c611706
3c6b00290eb7ded272213d2c639251113d86c94f
F20110218_AAAQWN in_j_Page_089.tif
35067bbc758bb61eca6cd1c9e2a94307
ed82f4e26f527481e0fa4ae1d1cf14097afe8c3a
7013 F20110218_AAASFK in_j_Page_117thm.jpg
01b6474ddd0bb8ef8525ea167721efd7
d3c34bf4d5ec86fb6e6f7696d2313b718156a999
7236 F20110218_AAASEW in_j_Page_101thm.jpg
14b3afd91ad8cee9f5a5dc7289394c77
e58706b8cfae2105f5bc0ff248e4c86fe0311633
19492 F20110218_AAARZQ in_j_Page_074.QC.jpg
1b42e97080d254208267014325ae04ec
1c437f88591800589f154a9c5fde9f9df30ebc0b
F20110218_AAAQVZ in_j_Page_075.tif
d1ff46ae32dbb97794133aded58bc289
58e89f97f4c5e06e3e36ec1284a4c55abde54e38
1710 F20110218_AAARCI in_j_Page_095.txt
d01054ecb171ab6ae4136bc7e7f9d183
da00e93f3a2d7b8f8d715f5dd7de848d2bd69bff
F20110218_AAAQXC in_j_Page_104.tif
44db897f42729fd724ecaae249be314d
10e2963168f09a1798aa68a7eb4fb028f4b39c2a
738 F20110218_AAARBU in_j_Page_081.txt
d34f4fed1f73275924d44c7c0da9dd1b
fba0dec65a8f564f25c09d4c86f0f9b5f7afa82a
F20110218_AAAQWO in_j_Page_090.tif
1c94ea5655f8a948dda8061c89b95c54
73c3efc981bee0746ded0ecb96d3d62e2aa85d9a
8820 F20110218_AAASGA in_j_Page_142thm.jpg
2b90939ac76d15ae95f893801aae981a
8e55ee5dd53c1bf8c2cead62425e338705fba79e
7596 F20110218_AAASFL in_j_Page_119thm.jpg
9cae74b6f0c222af741507620bb350a2
d8c8c973f6df1021564a3bd2deaa661563acdafa
8280 F20110218_AAASEX in_j_Page_102thm.jpg
5146d2d0391f9e9616e1759ee0227aeb
f09e3f1b6c49b3c5c13cf486ede8108c01c73b79
29552 F20110218_AAARZR in_j_Page_076.QC.jpg
3da5e1de06c5773c8dd7d2f6faff22d4
7cc46489631521049f3d8427aa838df1473416db
1688 F20110218_AAARCJ in_j_Page_096.txt
e71d79e4dbbe4fd7bcdf78d24f29ca15
3b8cfa28728d51c5eff050ae934e70ff508b4ad7
F20110218_AAAQXD in_j_Page_105.tif
5e1f6115f805d25ff216f9c5b4b12d79
d9f332db69390a69dd09c5b757f43166215ab86a
1479 F20110218_AAARBV in_j_Page_082.txt
63dd908d56075b78766e3a9b1ae8fa89
47a9e73c47cc361511bc0531de708f7257799303
F20110218_AAAQWP in_j_Page_091.tif
5f0ab044bcd85bda59825f8e1bc17c64
21771ad42dcbfd7f660eb75ac32fb23ef025c73f
8371 F20110218_AAASGB in_j_Page_143thm.jpg
df94bc1165c33f4bfdfbf41d6bcadeba
e30ff5d989d50be56d4a3e906a0b18d96673e80c
7251 F20110218_AAASFM in_j_Page_120thm.jpg
0234ba8b146fd797f41be7a50b82567a
6f71d1d62ed97db41b027c1b4729695c2e13f6f0
6401 F20110218_AAASEY in_j_Page_103thm.jpg
550899c16e06cbfe5a2c34a55e07b8ac
ec75d3fb1b20d79f2d12f133d8c5f5c1d2ffa677
26552 F20110218_AAARZS in_j_Page_077.QC.jpg
f36cbea0b7e974102729d283c13d77df
31df2cfae2b1325f7321990a4989f62a915c326f
1783 F20110218_AAARCK in_j_Page_097.txt
8f8b25b7595b77ad414fe7f81b5ffb06
c555519973e2039a71bd2110539d62ccfb4ec6b8
F20110218_AAAQXE in_j_Page_106.tif
07d366e6c856e4340ede4f3ad89ea0b6
3a4ac73c6340283084b6dce24b4334fca91813d5
873 F20110218_AAARBW in_j_Page_083.txt
eab0264932b887f642d82108bbc2d290
331e887bb227181fc98d60d7cd129a36b08d3c29
F20110218_AAAQWQ in_j_Page_092.tif
83b18db733e228f90c8ae45283875d3e
166988517cace3ddc9b90db526dcbd38e0b62d60
6345 F20110218_AAASFN in_j_Page_121thm.jpg
9145fe59f05232500c78917280d7ce2d
fa177939fd6223d919af4af24e4f31fb76a6fb29
8382 F20110218_AAASEZ in_j_Page_104thm.jpg
555a7de36bc784a53336a91be6f48235
14e9b97e9eb1e7721160f7b857aa7133f9070f30
28418 F20110218_AAARZT in_j_Page_078.QC.jpg
b0a450a6319e2090e00601f0ec149175
d2a97dbca3d213fef6ce8a57da8cdf8b2ff51399
2006 F20110218_AAARCL in_j_Page_098.txt
129a1b574134ea39c3d30d49dbd967be
b913cede550d3b5f574d7e079150e06eacd40808
F20110218_AAAQXF in_j_Page_107.tif
439b9650b684ba8ee19dd7014bf95bf7
86f57df69b02ec45b4c1a854b0a8445d067793f2
1629 F20110218_AAARBX in_j_Page_084.txt
5a8fadf90171d2ebfbe8a02c76cd664b
c597566c0102adaf92c2340032fd9bcf988faf9b
F20110218_AAAQWR in_j_Page_093.tif
b373de95546c8c754c03d043d16144b2
48ec963d76b1082b5bc3b37cca485130621c5a34
3630 F20110218_AAASGC in_j_Page_144thm.jpg
078b83115227d902de4941e1225c9e5f
fda437ac6aaf8c0dc0f2094158a4622a145e8f16
6010 F20110218_AAASFO in_j_Page_122thm.jpg
6bfff85518ac830d631e0a5b4dc9ae98
1b28581f0c414e656ab703f90bab7359d67e0c52
18941 F20110218_AAARZU in_j_Page_079.QC.jpg
3ba635702b47f88af94adc407112c052
cf6668c4e01df4d8a95ae8474f888babe29364b4
1849 F20110218_AAARCM in_j_Page_099.txt
e3607b7be47c107f92b7462e03dfda6a
f7e1fb0e4739ac97383ff46dd9998af9da13888f
F20110218_AAAQXG in_j_Page_108.tif
5da80187d34c9bd6407c7adf9ab41157
e60be82399de2aa60659fc6de786257cedc044e8
900 F20110218_AAARBY in_j_Page_085.txt
1e1a10e78a3fcbcfd2d2f05a4e77d0f6
710e83070e2a32995dd15907ab64cc1075d5ea37
F20110218_AAAQWS in_j_Page_094.tif
835d6adaf823c6072f197afb62196528
fd8a08d83ffbd374aa3fcf86e6284e3cd87d82ba
1389 F20110218_AAARDA in_j_Page_113.txt
d5c27c96fcc0227ef0a65f87ad71992c
8ec0d7d54c39527deed209aff997f74102b56f21
F20110218_AAASGD in_j_Page_145thm.jpg
397954cb519ab5eb4352c94bed026a10
d54bad741172fbbb41c7c6367a63739b9fbcf08c
6441 F20110218_AAASFP in_j_Page_123thm.jpg
6c4ce2e900013b86af3237f41290ce93
1a37d38c4250c7d27f44c327542842e2d600df44
22175 F20110218_AAARZV in_j_Page_080.QC.jpg
60d2ff896eab6ca42c3517f5f7671343
5ca93f9f608e3c4cf0c4d0e1787a953cd7d2157f
1664 F20110218_AAARCN in_j_Page_100.txt
5bf2dd547da427f41dd5e77e5a2524cc
ccc4181b35c3b9e4d23fbba05f1a27c8ad7055dd
F20110218_AAAQXH in_j_Page_109.tif
f7cd0aaf55876cdece93ba133acb469e
f76a9185039b8aeaaca4476b35a7b5e83aa65009
1347 F20110218_AAARBZ in_j_Page_086.txt
39bd80343a9263c86659df11d304d3d1
5d37282b519b2ffa5d735edadd4fb87606917f61
F20110218_AAAQWT in_j_Page_095.tif
0af78ef1e98956084efcf15135367bc5
dc242d51d9f7b64435de57dfcfa84c59032f524a
2032 F20110218_AAARDB in_j_Page_114.txt
fa03ef19660d93633f85d15ea71fd9e1
e673d4646f13284657ee8abcad52406c4582e67f
1268337 F20110218_AAASGE in_j.pdf
6a436b24b3eb6a82cae04a9c13a6c6b1
d0ead510a97a1a0c3f6d8f9c01fefb4122f79ffb
7941 F20110218_AAASFQ in_j_Page_124thm.jpg
52482a2b6df77eea18625072ca7bb69d
3d128e0f6b67bd97c835a6566691dd768ad997fb
1827 F20110218_AAARCO in_j_Page_101.txt
4295b94a717040d986f324a3440c8689
d64474aa86181ab9b682fae4435467e850e86a3d
F20110218_AAAQXI in_j_Page_110.tif
9f1cdd48c3cedbbdf49007847d4b62db
000a1a264032a68a38e0a978f9aa825055d18c5b
2064 F20110218_AAARDC in_j_Page_115.txt
4ff63b5c817040abff20a3408527700f
6ab18db5ee0b0cc094b017dc6bf500f89cc5de65
6780 F20110218_AAASFR in_j_Page_126thm.jpg
be100187465ec2282388301d07ff998a
5b66c2f96e8ae948268663af2014c9eac7413bc7
14487 F20110218_AAARZW in_j_Page_081.QC.jpg
d42c86efba2fe0d1f877832b713dcd07
08f1f2a12434f62d62c07cadff38cb263e87a510
F20110218_AAARCP in_j_Page_102.txt
fe06e01744def12972bc0082ae42327f
854e767bfc3bd8bf7a8401c4c128e754c4be51b6
F20110218_AAAQXJ in_j_Page_111.tif
272c0787a8a968f5d22133834a4e11d6
b6b227d379b01b40f3803debb921b8fa994417e5
F20110218_AAAQWU in_j_Page_096.tif
6d37398043a3418a1fb42adae1cc27a9
f49223b55a32963830b52cd0342d308219cf69f0
1697 F20110218_AAARDD in_j_Page_116.txt
644924d86900378ffd547ba3b0383b81
d6795e0046b1f3ca473be80518f5908323a3dfbe
7917 F20110218_AAASFS in_j_Page_128thm.jpg
55c9bd9dc87ce0996a90ceea41fc24f9
d56c636696778887adce0057b0602668d08f10f3
18283 F20110218_AAARZX in_j_Page_082.QC.jpg
696a6fc97f7e977528911482fd274c89
f8d28e36180ad540129b149a1267db2a08426996
1446 F20110218_AAARCQ in_j_Page_103.txt
45faf3ef63ef99067c8185a8a60b8c0d
98923bd54a06a037449ffe6ac7792d6b09030904
F20110218_AAAQXK in_j_Page_112.tif
4fa89d512152bb5771091fb098a6be81
59384462fa4e6cd9e3f9382026a4ebc8f11de922
F20110218_AAAQWV in_j_Page_097.tif
5a8b59a6309d9815034617bf2c30de73
d4b3d5cd684e692ba91dea5811a857053549f883
1716 F20110218_AAARDE in_j_Page_117.txt
4f4fd645c22e3dab41b65c7aee5fb005
253c9eb1da3057471e0467426669fce42d4364c3
6372 F20110218_AAASFT in_j_Page_129thm.jpg
188a74d465f660156bb38aac176e2b20
d69c5a4759ec9790cc16cfe8eaa23e6531abe89a
18756 F20110218_AAARZY in_j_Page_083.QC.jpg
37d857974455fb0a92aed02e350ed3cd
ee24d884d7ef01f6272b774e379e18732876322c
2083 F20110218_AAARCR in_j_Page_104.txt
a080a5b97682a263c0562a905baf4caa
868e7933c3419b9c56e1010d7aaaf8e5ea841e4d
F20110218_AAAQXL in_j_Page_113.tif
58e366f0f18a7e7a5caf36b3b3571ae1
398f44b9430b704a1f02ca8aa194ee179efe6de7
F20110218_AAAQWW in_j_Page_098.tif
bb47fa9f0f6a5302a2c5d7896c13adcf
a16d1097d0eea3daae8606e174f43d7e5b3d6e7a
1864 F20110218_AAARDF in_j_Page_118.txt
38d4ced0f2527c0ea680369fc8212709
a040f06e6bc049e70c6b9a2c5a9a707784d627a0
8151 F20110218_AAASFU in_j_Page_131thm.jpg
bdbd052728481548dd6425fc659ce240
50816a267045349d83e5ad09c6ea69d3fd397ab0
27927 F20110218_AAARZZ in_j_Page_084.QC.jpg
3d048343f76344cc9b6bbdce5a8e8cdb
ad1bfe03b7e13f36d66fd4675a6d591a1ca0a00e
F20110218_AAAQYA in_j_Page_128.tif
4eea3f05fd19e18e529ea8a04b9b30e7
560163dd0c3d2c5c87903e03c853d8b3deb0b950
1734 F20110218_AAARCS in_j_Page_105.txt
1c9c309bf10fc2370fc5ac4ea9b7db3d
e7dc456677a7cb935d670fb6132ee802391caad8
F20110218_AAAQXM in_j_Page_114.tif
dafa57db1ce7b770f6a1cd0749e2d8c8
63744cd0b80ea06ec45a6d1f9559b583d3f85acc
F20110218_AAAQWX in_j_Page_099.tif
d81aed03c2a665c3e15f7864eb1fa432
cf3891594f901b6eab1b7059e8299993d22c17e7
1809 F20110218_AAARDG in_j_Page_119.txt
2daf1903f79acbd039d13616fbe8a8db
f1cf6cead2009baa59bec0a8493e83ec4adec9a1
7459 F20110218_AAASFV in_j_Page_136thm.jpg
ad69dbd9b8646a35a9574ee81e025ccb
40b4f8ba67284ba2554d8fa60e5d129cacd710db
F20110218_AAAQYB in_j_Page_129.tif
4a240c6638369df4a2b6f93d699f9041
8f223b62c6ba936d31153869140eb73e4cb3e9a2
2127 F20110218_AAARCT in_j_Page_106.txt
85a73d74f2a994f62ecef828095e16f4
f24976864c4befdd37fe031739cf452e8f67a6aa
F20110218_AAAQXN in_j_Page_115.tif
e7f55c4fd516d173f50d58699cc51ff7
9682107a9551042a765bf53df9783e51921aefdf
F20110218_AAAQWY in_j_Page_100.tif
c1695671129802640b52d50135e331bc
cf30389c1f9413a8e6061e427c7617480ed05bfe
2175 F20110218_AAARDH in_j_Page_120.txt
f06a2577bd19c729ea177bc409529f1b
f780e65ea45d64479d6a6946c0a275443e4913bc
7187 F20110218_AAASFW in_j_Page_138thm.jpg
ef44d7b18b24cafc4ec3d92f1b6b3009
d94356af06c22fa7be77c77275b8746ef1e88f48
F20110218_AAAQYC in_j_Page_130.tif
ba87431a982383be9e99c874ad801ce9
7cd9e8d3494456161588999cc55d7256884883ae
1854 F20110218_AAARCU in_j_Page_107.txt
7bc261751c5f60f4324f3c1f6c57d7dc
8ddb6a65772cec351d5dced6153f72c3de50b3d5
F20110218_AAAQXO in_j_Page_116.tif
67a84b83523bfb6e3134d1a7aba3df38
72a7e0335651e306e84e614f35f15e4684cc70bb
F20110218_AAAQWZ in_j_Page_101.tif
7782693dd744f0d3373798d215654f6d
2df78c06a134968eb9a9d1a54db3a522f936c3a6
1593 F20110218_AAARDI in_j_Page_121.txt
ea9bad03c7c477ea107c0ca1f45c327f
5f2b60d06bde7bc1005dc238d2dd7390911611fb
8426 F20110218_AAASFX in_j_Page_139thm.jpg
ec3b6b78f85d6d75897f75c070a542c6
c4c409ef4acf031fd0e9b718eca98cd77eda96f8
F20110218_AAAQYD in_j_Page_131.tif
828d675c096242b557f3648ae30c9acd
a10e2a09ed5d4203c94a816f2d74158297b922bb
1851 F20110218_AAARCV in_j_Page_108.txt
df3adfba385f7386295e51653b945d95
f1bb3b1b95d08b2f6dd493dddbec182cd65dc431
F20110218_AAAQXP in_j_Page_117.tif
da81a79c9d97a2723bb4359ad695eefc
9f3d4d63a80ac216f6e594465d5b8ce8eeea1f26
1785 F20110218_AAARDJ in_j_Page_122.txt
fb775568301940c7e83d6649a133094c
cab5a8045dd5fb54d3b69e217cc64c93677c4a61
7864 F20110218_AAASFY in_j_Page_140thm.jpg
8ba0c73a20a5dc67aa0b316a91815afc
c23fea26c64ede3fd728a4750104ca708f2a3cab
1960 F20110218_AAARCW in_j_Page_109.txt
7d8acb5b5ae4099fdd05b7585b0084e1
ae22a3f4d66fd01e6992e0883f1ea31d86edf406
F20110218_AAAQXQ in_j_Page_118.tif
9921cb0bad0de0bb05d6f18035d02e2b
93468d145f906e03e058fbd36722201c7f50dfb3
1722 F20110218_AAARDK in_j_Page_123.txt
1fe9dd2c3e6cb3df4cb538c61edd1789
42659e862cd40769c1cf39ff5fd16c7cafcd7791
F20110218_AAAQYE in_j_Page_132.tif
9bc7d2d30d1e7a7e7ae0bc24a01dfa9a
7e498282ff154cd1e71d8f49482034c2324aef99
8821 F20110218_AAASFZ in_j_Page_141thm.jpg
debd483152aa063b239c543b9b927480
64ba4e1298cded7c1b84c4289a950a2402247020
1717 F20110218_AAARCX in_j_Page_110.txt
f3d47d43c965c2794e050beb06d353af
4e697336d27fabdbd1c4a80cf02338c032f515cd
F20110218_AAAQXR in_j_Page_119.tif
c7eb158425bf7a721c95893620531ca7
344b7ed289af86182ec89a3c18c320a90e33c14f
2292 F20110218_AAAREA in_j_Page_139.txt
33ed0dce9743e85592f2fe29f00a42e1
cf9057c61e82f98c5ca8240c0851e3cfbe131f3c
F20110218_AAARDL in_j_Page_124.txt
0290848a29bc8ed40057ec09414aa12e
677ef041090a0f8aa05a532d146d3a1fdb29080f
F20110218_AAAQYF in_j_Page_133.tif
10c880e8341b89459e847148b807908d
efed5e7e765ae227ddc7687c5cd13da58bcb7c5a
1920 F20110218_AAARCY in_j_Page_111.txt
434112e70dea68171809e7a8afb7a956
282f4f1e1469a1e3110b2435cd0bc08598579e94
F20110218_AAAQXS in_j_Page_120.tif
4a189e095d5ccd46ab40058a855410fa
41f3956e63de3ecd0288ed64a76901df57c6234a
1955 F20110218_AAARDM in_j_Page_125.txt
820415fd882a983bad46a7911cf6c35f
cb44e2e54940927b90a8f875e38c90a28d1bfeba
F20110218_AAAQYG in_j_Page_134.tif
d2e837f75bdac6d4304cde2dd23b9b9c
f5e1169a9a9f287f52aae116a2a5c08a73489e73
2356 F20110218_AAARCZ in_j_Page_112.txt
79ccacf7f1b5702411d68caef6bcb595
6e6fc46686e388a2d254ff8815c4a02b5f0e8715
F20110218_AAAQXT in_j_Page_121.tif
82ca1d75762935d82a81df6f263e2310
72451e97b4a12562c8897befb74a62d5a61eba4c
2295 F20110218_AAAREB in_j_Page_140.txt
13898a12cfa67c554ade85b3b75c994b
1fec5f7beefb2981a788ae0b8aace6ef77cc8b08
1850 F20110218_AAARDN in_j_Page_126.txt
9aeab852552a2cdc9ed1ddc6414ecfc7
1f2e7887fc53b486adaafa5264a8f802fc1717f3
F20110218_AAAQYH in_j_Page_135.tif
1470725a82e33dc0edc73218e83c1877
fa859ae1c2ad6a3604b09e69e518be4e6f231900
F20110218_AAAQXU in_j_Page_122.tif
35bb3cb233c33e23476c7b5190a89dd9
20e8f646e8b2fb2e83d8072d2d65b6390d0ee2bb
2300 F20110218_AAAREC in_j_Page_141.txt
d898e5b06d1410f890e56b4144fdfa61
b615d4eeb5468888d73e4367982611b8dd86efea
1428 F20110218_AAARDO in_j_Page_127.txt
d70e43466bbd167e642d1eb1c2649a0b
09497cf5fe777b58578758b6f31219e55f8ceb68
F20110218_AAAQYI in_j_Page_136.tif
d2b0f882339b8f1d607895d2669c057e
15d3aa32e54fab6069e2685942ab6d3862bd3066
2696 F20110218_AAARED in_j_Page_142.txt
0574388f2029ce64d332b0a33996face
f48c5bbd364c8b4bb80af1c6945b8c166302bfa4
1860 F20110218_AAARDP in_j_Page_128.txt
b8c31a5cd1b989751383f20fdb60c0fd
348723d550993e212a6b7ffd60aa3457883cdd2d
F20110218_AAAQYJ in_j_Page_137.tif
0201ee77f3993fd422be48b594cb9666
0793160bf6fc85854e979f5584133f0fd000f9a5
F20110218_AAAQXV in_j_Page_123.tif
0fb1b608b2febe385ef2873b3c8c24e3
2a6b7d0503a24a1109528581073db2caa90960e4
2423 F20110218_AAAREE in_j_Page_143.txt
71f52440a9c8b46fa6deb0c44b8db4dd
0481fbb9beeb8e94081c6a2e46923eff8ed02919
1499 F20110218_AAARDQ in_j_Page_129.txt
ef043a4fd14845354a9a0275dc94b480
e25313ce8a8912b2addf4b191d7ba295c65828c6
F20110218_AAAQYK in_j_Page_138.tif
e1b3a8abe264bc9ba53a0f4352af9adc
a409399df9668e136976ac1a6e9ef7ee7a39db8f
F20110218_AAAQXW in_j_Page_124.tif
531d7523396bffe51549daba77654b1c
22e1b0d5465ed269ba4a65de34a6032ace9234b5
1049 F20110218_AAAREF in_j_Page_144.txt
8e94cfbd9f3c994771cb223cbc338eae
fa85ad8b95cbd2b1788651ab06617ab91b8c6a04
1427 F20110218_AAARDR in_j_Page_130.txt
ca0e4f73e099937636ddc8f26c54680f
b935b5e09048740cc8746e84794de858e2568303
F20110218_AAAQYL in_j_Page_139.tif
971173e03521c91067542e1e48992af9
4b4df954f84e2ab842d9474bb7e83538501b3021
F20110218_AAAQXX in_j_Page_125.tif
42cb3ec5d87acd411d7ee7ce9b181d79
a633c174f862e90d6302701c71bf9338391b3579
290 F20110218_AAAREG in_j_Page_145.txt
3cd35d34cc0bddedab16c407816535b0
EFFICIENT SCHEDULING TECHNIQUES AND SYSTEMS FOR GRID COMPUTING

By

JANG-UK IN

A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA

2006


Copyright 2006 by Jang-uk In


This document is dedicated to the graduate students of the University of Florida.


ACKNOWLEDGMENTS

I thank my parents, my wife, and all my co-workers. Especially heartfelt thanks go to Dr. Sanjay Ranka, chair of my committee, for his advice and support.


TABLE OF CONTENTS

ACKNOWLEDGMENTS
LIST OF TABLES
LIST OF FIGURES
ABSTRACT

CHAPTER

1 INTRODUCTION
    Grid Computing
    Grid Resource Management Middleware: SPHINX
    Efficient Scheduling Techniques
    Scheduling Systems
        The Portable Batch System (PBS)
        Maui
        LSF
        EZ-Grid
        Resource Broker
        Pegasus
        Condor
        PROPHET
    Data Analysis Systems
    Scheduling Algorithms
        Iterative List Scheduling
        Dynamic Critical Path Scheduling
        Reliability Cost Driven Scheduling
        Heterogeneous Earliest Finish Time Scheduling
        Dynamic Level Scheduling
        Optimal Assignment with Sequential Search
    Contributions
    Outline

2 GRID POLICY FRAMEWORK
    Key Features
    Policy Space
        Resource Provider
        Resource Property
        Time
        Three-dimensional Policy Space
    A Solution Strategy for Policy-based Scheduling
        Model Parameters
        Choice of an Objective Function and Optimization Metric
        Quality of Service Constraints
    Simulation Results
    Future Works

3 SPHINX: POLICY-BASED WORKFLOW SCHEDULING
    Requirements of a Grid-scheduling Infrastructure
        Information Requirements
        System Requirements
    Highlights of SPHINX Architecture
    SPHINX Client
    SPHINX Server
    Data Replication Service
    Grid Monitoring Interface
    Relationship with Other Grid Research
        Grid Information Services
        Replica and Data Management Services
        Job Submission Services
        Virtual Data Services
        Future Planners and Schedulers
    Experiments and Results
        Scheduling Algorithms
        Test-bed and Test Procedure
        Performance Evaluation of Scheduling Algorithms
        Effect of Feedback Information
        Comparison of Different Scheduling Algorithms with Feedback
        Effects of Policy Constraints on the Scheduling Algorithms
        Fault Tolerance and Scheduling Latency
    Conclusion and Future Research

4 POLICY-BASED SCHEDULING TECHNIQUES FOR WORKFLOWS
    Motivation
    Problem Definition and Related Works
    Scheduling Algorithm Features
    Notation and Variable Definition
    Optimization Model
        Profit Function for Single Workflow Scheduling
        Profit Function for Multiple Workflow Scheduling
        Objective Function and Constraints
    Policy-based Scheduling Algorithm and SPHINX
        Iterative Policy-based Scheduling Algorithm
        Scheduling Algorithm on SPHINX
    Experiment and Simulation Results
        Network Configuration and Test Application
        List Scheduling with the Mean Value Approach
        The Simulated Performance Evaluation with Single DAG
        The Simulated Performance Evaluation with Multiple DAGs
        The Performance Evaluation with Single DAG on OSG
        The Test Application
        The Performance Evaluation with Multiple DAGs on OSG
        The Algorithm Sensitivity to the Estimated Job Execution Time
    Conclusion and Future Work

5 CONCLUSIONS

LIST OF REFERENCES

BIOGRAPHICAL SKETCH


LIST OF TABLES

1-1  The existing scheduling systems and the scheduling properties
3-1  Finite automation of SPHINX scheduling status management
3-2  SPHINX client functionalities
3-3  SPHINX server functions for resource allocation
3-4  SPHINX APIs for accessing data replicas through RLS service
3-5  Database table schemas for accessing resource-monitoring information
3-6  Grid sites that are used in the experiment
3-7  SPHINX server configurations


LIST OF FIGURES

1-1  A grid with three subsystems
2-1  Examples of resource provider and request submitter hierarchies
2-2  Hierarchical policy definition example
2-3  Policy based scheduling simulation results
2-4  Policy based scheduling simulation results with highly biased resource usage
2-5  Policy based scheduling simulation results with highly biased workload
3-1  Sphinx scheduling system architecture
3-2  Overall structure of control process
3-3  Effect of utilization of feedback information
3-4  Performance of scheduling algorithms with 300 jobs and without any policy
3-5  Performance of scheduling algorithms with 600 jobs and without any policy
3-6  Performance of scheduling algorithms with 1200 jobs and without any policy
3-7  Site-wise distribution of completed jobs vs. avg. job completion time
3-8  Performance of the policy-based scheduling algorithm
3-9  Number of timeouts in the different algorithms
3-10 Sphinx scheduling latency: average scheduling latency
4-1  An example workflow in Directed Acyclic Graph (DAG)
4-2  An example for job prioritization and processor assignment
4-3  The iteration policy-based scheduling algorithm on heterogeneous resources
4-4  The constraint and clustering effect on DAG completion
4-5  Average DAG completion time with 500 DAGs
4-6  Average DAG completion time with the different scheduling algorithms (1)
4-7  Average DAG completion time with the different scheduling algorithms (2)
4-8  The scheduling performance of the different scheduling algorithms
4-9  The scheduling performance evaluation of the scheduling algorithms
4-10 The scheduling performance comparison when the CCR is changed
4-11 The scheduling performance comparison when the link density is changed
4-12 The performance of the multiple DAG scheduling
4-13 The performance of the multiple DAG scheduling with the simple scheduling
4-14 The performance of the multiple DAG scheduling algorithms
4-15 The single DAG scheduling sensitivity to the job execution time estimation
4-16 The multiple DAG scheduling sensitivity to the job execution time estimation


Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

EFFICIENT SCHEDULING TECHNIQUES AND SYSTEMS FOR GRID COMPUTING

By

Jang-uk In

August 2006

Chair: Sanjay Ranka
Major Department: Computer and Information Science and Engineering

This dissertation discusses policy-based scheduling techniques on heterogeneous resources for grid computing. The proposed scheduling algorithm has the following features, which can be utilized in a grid computing environment. First, the algorithm supports resource usage constrained scheduling. A grid consists of resources that are owned by decentralized institutions. Second, the algorithm performs optimization-based scheduling. It provides an optimal solution to the grid resource allocation problem. Third, the algorithm assumes that the set of resources is distributed geographically and is heterogeneous in nature. Fourth, the scheduling dynamically adjusts to the grid status. It tracks the current workload of the resources. The performance of the proposed algorithm is evaluated with a set of predefined metrics. In addition to simulation results demonstrating the superior performance of the policy-based scheduling, a set of experiments is performed on the Open Science Grid (OSG).

In this dissertation we also discuss a novel framework for policy-based scheduling in resource allocation of grid computing. The framework has several features. First, the scheduling strategy can control the request assignment to grid resources by adjusting resource usage accounts or request priorities. Second, efficient resource usage management is achieved by assigning usage quotas to intended users. Third, the scheduling method supports reservation-based grid resource allocation. Fourth, the quality of service (QOS) feature allows special privileges to various classes of requests, users, groups, etc. This framework is incorporated as part of the SPHINX scheduling system that is currently under development at the University of Florida. Experimental results are provided to demonstrate the usefulness of the framework.

A grid consists of high-end computational, storage, and network resources that, while known a priori, are dynamic with respect to activity and availability. Efficient scheduling of requests to use grid resources must adapt to this dynamic environment while meeting administrative policies. In this dissertation, we describe a framework called SPHINX that can administer grid policies and schedule complex and data-intensive scientific applications. We present experimental results for several scheduling strategies that effectively utilize the monitoring and job-tracking information provided by SPHINX. These results demonstrate that SPHINX can effectively schedule work across a large number of distributed clusters that are owned by multiple units in a virtual organization in a fault-tolerant way, in spite of the highly dynamic nature of the grid and complex policy issues. The novelty lies in the use of effective monitoring of resources and job execution tracking in making scheduling decisions and in providing fault tolerance, something that is missing in today's grid environments.


CHAPTER 1
INTRODUCTION

Grid computing is increasingly becoming a popular way of achieving high performance computing for many scientific and commercial applications. The realm of grid computing is not limited to that of parallel or distributed computing, as it requires the management of disparate resources and different policies over multiple organizations. Our research studies grid computing and related technologies. We propose novel grid resource management middleware and efficient scheduling techniques.

This chapter discusses grid computing issues and technologies. Specifically, we discuss the major differences between this new computing paradigm and existing parallel and distributed computing. We introduce new concepts and terminologies defined in grid computing. We then present the proposed scheduling system and techniques, and discuss how they affect and contribute to the computing community.

Grid Computing

Data generated by scientific applications are now routinely stored in large archives that are geographically distributed. Rather than observing the data directly, a scientist effectively peruses these data archives to find nuggets of information [1]. Typical searches require multiple weeks of computing time on a single workstation. The scientific applications that have these properties are discussed in detail in the upcoming sections of this chapter.


Grid computing has become a popular way of providing high performance computing for many data intensive, scientific applications. Grid computing allows a number of competitive and/or collaborative organizations to share mutual resources, including documents, software, computers, data and sensors, and computationally intensive applications, to seamlessly process data [2, 3]. The realm of grid computing is beyond parallel or distributed computing in terms of requiring the management of a large number of heterogeneous resources with varying, distinct policies controlled by multiple organizations.

Most scientific disciplines used to be either empirical or theoretical. In the past few decades, computational science has become a new branch in these disciplines. In the past, computational science was limited to the simulation of complex models. However, in recent years it also encapsulates information management. This has happened because of the following trends: (1) large amounts of data are available from scientific and medical equipment, (2) the cost of storage has decreased substantially, and (3) the development of Internet technologies allows the data to be accessible to any person at any location.

The applications developed by scientists on this data tend to be both computationally and data intensive. An execution may require tens of days on a single workstation. In many cases it would not be feasible to complete this execution on a single workstation due to extensive memory and storage requirements. The computational grid addresses these and many other issues by allowing a number of competitive and/or collaborative organizations to share resources in order to perform one or more tasks. The resources that can be shared include documents, software, computers, data and sensors. The grid is defined by its pioneers [4] as follows:

    The real and specific problem that underlies the Grid concept is coordinated resource sharing and problem solving in dynamic, multi-institutional virtual organizations. The sharing that we are concerned with is not primarily file exchange but rather direct access to computers, software, data and other resources, as is required by a range of collaborative problem-solving and resource-brokering strategies emerging in industry, science and engineering.

The owner of a resource can choose the amount, duration, and schedule of the resources available to different users (see Figure 1-1). These policies can vary over time, impacting the available resources for a given application. A core requirement for the success of these environments will be a middleware that schedules different resources to maximize the overall efficiency of the system.

Figure 1-1. A grid with three subsystems, each providing restricted access to a subset of applications. (For example, participants in Project 1 can run program A; participants in Project 2 can run program B, read data D, and use certain machines during the night.)

Realizing the potential of grid computing requires the efficient utilization of resources. The execution of user applications must simultaneously satisfy both job execution constraints and system usage policies. Although many scheduling techniques for various computing systems exist [5-11], traditional scheduling systems are inappropriate for scheduling tasks onto grid resources for the following main reasons.
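Before turning to those reasons, the access restrictions illustrated in Figure 1-1 can be made concrete with a small sketch. The project names, programs, data item, and time window below are hypothetical and only meant to show the shape of such per-project usage policies.

    from datetime import time

    # Hypothetical access policies for the subsystems in Figure 1-1.  Each project
    # maps to the programs it may run, the data it may read, and an optional
    # wall-clock window during which the machines may be used.
    POLICIES = {
        "project1": {"programs": {"A"}, "data": set(), "window": None},
        "project2": {"programs": {"B"}, "data": {"D"}, "window": (time(20, 0), time(6, 0))},
    }

    def allowed(project, program=None, dataset=None, when=None):
        """Return True if the request is permitted by the project's policy."""
        policy = POLICIES.get(project)
        if policy is None:
            return False
        if program is not None and program not in policy["programs"]:
            return False
        if dataset is not None and dataset not in policy["data"]:
            return False
        window = policy["window"]
        if when is not None and window is not None:
            start, end = window
            if start > end:  # window wraps around midnight (night-time usage)
                in_window = when >= start or when <= end
            else:
                in_window = start <= when <= end
            if not in_window:
                return False
        return True

    print(allowed("project2", program="B", when=time(23, 30)))   # True
    print(allowed("project1", program="B"))                      # False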


Although parallel or distributed systems address one or more of these characteristics, they do not address all of them in a cohesive manner for grids.

A virtual organization (VO) [12] is a group of consumers and producers united in their secure use of distributed high-end computational resources towards a common goal. Actual organizations, distributed nationwide or worldwide, participate in one or more VOs by sharing some or all of their resources. The grid resources in a VO are geographically distributed and heterogeneous in nature. These grid resources have decentralized ownership and different local scheduling policies dependent on their VO. The grid resources may participate in a VO in a non-dedicated way, which means the resources accept incoming requests from several different remote sources. The dynamic load and availability of the resources require mechanisms for discovering and characterizing their status continually.

The second major challenge in the grid-computing environment relates to the planning and scheduling of data analyses. The factors that guide the development of a plan include user requirements, global and local policy, and overall state. User requirements may include not only the virtual data request but also optimization criteria such as completion in the shortest time or usage of the fewest computing resources. Any plan is necessarily constrained by resource availability, and consequently we must obtain all available state information. This complicates planning, as the global system state can be large and determining future system states can be difficult. The complex interrelationships among different data representations (procedural vs. declarative), data locations (archived vs. cached), policies (local vs. global), and computations (different user queries, background tasks, etc.) make planning and scheduling a challenging and rewarding problem.

New techniques are required for representing complex requests, for constructing request representations via the composition of representations for virtual data components, for representing and evaluating large numbers of alternative evaluation strategies, and for dealing with uncertainty in resource properties.

A virtual data grid must be able to allocate storage, computer, and network resources to requests in a fashion that satisfies global and local policies. Global policy includes community-wide policies governing how resources dedicated to a particular collaboration should be prioritized and allocated. Local policies are site-specific constraints governing when and how external users can use local resources and the conditions under which local use has priority over remote use. The execution of a plan will fail if it violates either global or local policy. Hence we require mechanisms for representing policies and new resource discovery techniques that can take policy information into account.

The purpose of planning and scheduling is to optimize the response to a query for virtual data given global and local policy constraints. Different optimization criteria may be applied to a PVDG request: minimize execution time, maximize reliability, minimize use of a particular resource, etc. For a given metric, optimization is driven by resource characteristics and availability. The dynamic nature of the grid coupled with complex policy issues poses interesting challenges for harnessing the resources in an efficient manner. In our research, we study the key features of grid resource management systems and their performance on the Open Science Grid (OSG) [13], a worldwide consortium of university resources consisting of 2000+ CPUs.
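To make the global-versus-local policy distinction described above concrete, the following sketch shows one way a planner might test a candidate placement against both policy layers before considering it further. The policy predicates, attribute names, and site names are illustrative assumptions, not part of any existing PVDG interface.

    # A placement is admissible only if every global (VO-wide) policy and every
    # local (site-specific) policy accepts it.  Policies are modeled here as
    # plain predicates over (user, job, site); a real system would consult a
    # policy service instead.
    def admissible(user, job, site, global_policies, local_policies):
        if not all(p(user, job, site) for p in global_policies):
            return False
        return all(p(user, job, site) for p in local_policies.get(site, []))

    # Illustrative policies.
    def cms_priority(user, job, site):
        # Global rule: only the CMS collaboration may use more than 100 CPUs at once.
        return job["cpus"] <= 100 or user["vo"] == "cms"

    def night_only(user, job, site):
        # Local rule: external users may run at this site only between 20:00 and 06:00.
        return user["local"] or job["start_hour"] >= 20 or job["start_hour"] < 6

    policies_global = [cms_priority]
    policies_local = {"uf_hpc": [night_only]}

    job = {"cpus": 32, "start_hour": 22}
    user = {"vo": "cms", "local": False}
    print(admissible(user, job, "uf_hpc", policies_global, policies_local))  # True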

Grid Resource Management Middleware: SPHINX

Efficient scheduling of requests to use grid resources must adapt to the dynamic grid computing environment while meeting administrative policies. Our research defines the necessary requirements of such a scheduler and proposes a framework called SPHINX. The scheduling middleware can administer grid policies, and schedule complex and data intensive scientific applications. The SPHINX design allows for a number of functional modules to flexibly plan and schedule workflows representing multiple applications on the grids. It also allows for performance evaluation of multiple algorithms for each functional module.

We present early experimental results for SPHINX that effectively utilize other grid infrastructure such as workflow management systems and execution systems. These results demonstrate that SPHINX can effectively schedule work across a large number of distributed clusters that are owned by multiple units in a virtual organization. The results also show that SPHINX can overcome the highly dynamic nature of the grid and complex policy issues to utilize grid resources, which is an important requirement for executing large production jobs on the grid. These results show that SPHINX can effectively

- Reschedule jobs if one or more of the sites stops responding due to system downtime or slow response time.
- Improve the total execution time of an application using information available from monitoring systems as well as its own monitoring of job completion times.
- Manage policy constraints that limit the use of resources.

The Virtual Data Toolkit (VDT) [13] supports the execution of workflow graphs. SPHINX, working with VDT, is in the primary stages of exhibiting interactive remote data access, demonstrating interactive workflow generation and collaborative data analysis using virtual data and data provenance. Also, any algorithms we develop will potentially be used by a wide user community of scientists and engineers. SPHINX is meant to be inherently customizable, serving as a modular "workbench" for CS researchers, a platform for easily exchanging planning modules and integrating diverse middleware technology. It will also deliver a reliable and scalable software architecture for solving general-purpose distributed data intensive problems.

Efficient Scheduling Techniques

The dynamic and heterogeneous nature of the grid coupled with complex resource usage policy issues poses interesting challenges for harnessing the resources in an efficient manner. In our research, we present novel policy-based scheduling techniques and their performance on OSG. The execution and simulation results show that the proposed algorithm can effectively

1. Allocate grid resources to a set of applications under the constraints presented with resource usage policies.
2. Perform optimized scheduling on heterogeneous resources using an iterative approach and binary integer programming (BIP).
3. Improve the completion time of workflows in integration with the job execution tracking modules of the SPHINX scheduling middleware.

The proposed policy-based scheduling algorithm differs from existing work in the following ways.

Policy constrained scheduling: The decentralized grid resource ownership restricts the resource usage of a workflow. The algorithm makes scheduling decisions based on resource usage constraints in a grid computing environment.

Optimized resource assignment: The proposed algorithm makes an optimal scheduling decision utilizing the Binary Integer Programming (BIP) model. The BIP approach solves the scheduling problem to provide the best resource allocation to a set of workflows subject to constraints such as resource usage.

Scheduling on heterogeneous resources: The algorithm uses a novel mechanism to handle the different computation times of a job on various resources. The algorithm iteratively modifies resource allocation decisions for better scheduling based on these different computation times instead of taking a mean value over them. This approach has also been applied to iterative list scheduling [1].

Dynamic scheduling: In order to handle dynamically changing grid environments, the algorithm uses a dynamic scheduling scheme rather than a static scheduling approach. A scheduling module makes the resource allocation decision for a set of schedulable jobs. The status of a job is defined as schedulable when it satisfies the following two conditions. Precedence constraint: all the preceding jobs are finished, and the input data of the job is available locally or remotely. Scheduling priority constraint: a job is considered to have higher priority than others when the job is critical to completing the whole workflow with a better completion time.

Future scheduling: Resource allocation to a schedulable job impacts the workload on the selected resource. It also affects scheduling decisions for future schedulable jobs. The algorithm pre-schedules all the unready jobs to detect the impact of the current decision on the total workflow completion time.

When the scheduling algorithm is integrated with the SPHINX scheduling middleware, it performs efficient scheduling in the policy-constrained grid environment. The performance is demonstrated in the experimental section.
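As a rough illustration of the BIP-style formulation described above (and only an illustration: the actual model, objective, and constraints are presented in Chapter 4), the sketch below enumerates binary job-to-site assignments for a tiny instance and keeps the best one that respects per-site usage quotas. An exhaustive search stands in for a proper integer-programming solver, and all numbers are invented.

    from itertools import product

    # Hypothetical inputs: estimated execution time of each job on each site,
    # and a per-site usage quota (maximum number of jobs a VO may place there).
    exec_time = {("j1", "siteA"): 4, ("j1", "siteB"): 7,
                 ("j2", "siteA"): 5, ("j2", "siteB"): 3,
                 ("j3", "siteA"): 6, ("j3", "siteB"): 6}
    quota = {"siteA": 1, "siteB": 2}
    jobs = ["j1", "j2", "j3"]
    sites = ["siteA", "siteB"]

    best, best_cost = None, float("inf")
    # Binary decision x[j][s]; each job is assigned to exactly one site.
    for choice in product(sites, repeat=len(jobs)):
        assignment = dict(zip(jobs, choice))
        # Policy (quota) constraint: no site receives more jobs than its quota.
        if any(sum(1 for s in assignment.values() if s == site) > quota[site]
               for site in sites):
            continue
        # Objective: minimize the total execution time of the assigned jobs.
        cost = sum(exec_time[(j, s)] for j, s in assignment.items())
        if cost < best_cost:
            best, best_cost = assignment, cost

    print(best, best_cost)   # {'j1': 'siteA', 'j2': 'siteB', 'j3': 'siteB'} 13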


Table 1-1. The existing scheduling systems and the scheduling properties. This table shows the conventional scheduling systems and whether each scheduling property is present. (Columns: adaptive scheduling, co-allocation, fault-tolerant, policy-based, QoS support, flexible interface; rows: Nimrod-G, Maui/Silver, PBS, EZ-grid, Prophet, LSF. The mark "v" means that a system has the corresponding property.)

Scheduling Systems

In this section we present the currently existing scheduling systems and their properties. Table 1-1 shows a set of scheduling systems and which of the given properties each one has. In the table, the mark "v" indicates that a system has the given property. We consider the following set of system properties. By adaptive scheduling we mean that the resource allocation decision is not finalized until the real job submission happens; the scheduling decision may change based on resource status and availability after the initial decision is made. Co-allocation means that a request may be allocated several different resources. A real application requires different kinds of resources, such as CPU and storage, and a co-allocation supporting scheduler allocates the required resources to the job. Fault-tolerant scheduling means that a job is rescheduled after its execution fails on a remote resource. In a dynamic grid environment, execution failures are likely to happen fairly often, so the scheduling system is required to monitor job execution and reschedule it. Policy-based scheduling supports the heterogeneous resource ownership in grid computing; this topic is discussed in detail in the following section. Quality of service (QOS) is expressed with a deadline and other application requirements. A scheduling system should make resource allocation decisions that take these requirements into account.

An ideal system provides a flexible interface to other modules, such as monitoring and scheduling, to allow the replacement of existing modules in the system with other customized modules.

The Portable Batch System (PBS)

The Portable Batch System is a batch job and computer system resource management package designed to support queuing and execution of batch jobs on heterogeneous clusters of resources. PBS offers several scheduling systems to support various resource allocation methods, such as Round Robin, First In First Out (FIFO), Load Balancing, Priority-based and Dedicated Times [15]. The PBS configuration consists of several modules: the PBS client, server, scheduler and job execution clusters, which run the PBS MOM daemon. In the PBS system a job is submitted along with a resource specification on one of the front-ends, handed to the server, scheduled, run by the MOMs in the execution clusters, and has its output placed back on the front end [16].

PBS works quite well for handling batch processing. However, as mentioned in the previous section, grid computing requires much more delicate resource management and refined request scheduling in a dynamically changing heterogeneous environment. The proposed resource allocation strategy addresses these issues by importing the concept of policy- and reservation-based scheduling for Quality of Service (QOS). The proposed scheduler also supports fully interactive request submissions for negotiating the level of QOS requirements according to the current and estimated near-future grid weather after the user makes a submission.
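As a toy illustration of two of the queue-ordering methods listed above, FIFO and priority-based scheduling, the sketch below orders a small batch queue under either policy. This is generic illustrative code, not PBS's actual implementation or interface.

    from collections import namedtuple

    Job = namedtuple("Job", "name submit_time priority")

    queue = [Job("j1", submit_time=0, priority=5),
             Job("j2", submit_time=1, priority=9),
             Job("j3", submit_time=2, priority=1)]

    def order(queue, policy="fifo"):
        """Return jobs in the order a FIFO or priority-based scheduler would start them."""
        if policy == "fifo":
            return sorted(queue, key=lambda j: j.submit_time)
        if policy == "priority":
            return sorted(queue, key=lambda j: j.priority, reverse=True)
        raise ValueError(policy)

    print([j.name for j in order(queue, "fifo")])      # ['j1', 'j2', 'j3']
    print([j.name for j in order(queue, "priority")])  # ['j2', 'j1', 'j3']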


Maui

Maui is an advanced job scheduler for use on clusters and supercomputers. It is an optimized and configurable tool capable of supporting a large array of scheduling policies, dynamic priorities and extensive reservations. The Maui scheduler can act as a policy engine, which allows site administrators control over when and how resources are allocated to jobs [17]. The policies serve to control how and when jobs start. They include job prioritization, fairness policies and scheduling policies. The Quality of Service (QOS) feature allows a site to grant special privileges to particular users by providing additional resources, access to special capabilities and improved job prioritization. Maui also provides an advanced reservation infrastructure allowing sites to control exactly when, how and by whom resources are used. Every reservation consists of three major components: a set of resources, a timeframe and an access control list. The scheduler makes certain that the access control list is not violated during the reservation's timeframe on the resources listed [18, 19].

Even though Maui is a highly optimized and configurable scheduler capable of supporting scheduling policies, extensive reservations and dynamic priorities, it has limitations in scheduling distributed workloads to be executed across independent resources in a grid. A grid scheduling system must support global optimization in addition to local best scheduling. The proposed scheduling framework supports global optimization as well as local best fit by considering resource usage reservations and QOS requirements in the scheduling. The hierarchical architectural view of a grid in policy enforcement makes it possible to implement extensive and scalable resource allocation in the proposed scheduler.
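The three reservation components described above map naturally onto a small data structure. The sketch below is a generic illustration of that idea, not Maui's actual interface or configuration syntax; names and dates are made up.

    from dataclasses import dataclass
    from datetime import datetime

    @dataclass
    class Reservation:
        resources: set        # e.g., node names covered by the reservation
        start: datetime       # timeframe start
        end: datetime         # timeframe end
        acl: set              # users/groups allowed to use the reservation

        def admits(self, user, node, when):
            """True if the user may run on the node at the given time."""
            return (user in self.acl
                    and node in self.resources
                    and self.start <= when < self.end)

    res = Reservation(resources={"node01", "node02"},
                      start=datetime(2006, 8, 1, 20, 0),
                      end=datetime(2006, 8, 2, 6, 0),
                      acl={"cms_prod"})
    print(res.admits("cms_prod", "node01", datetime(2006, 8, 1, 23, 0)))  # True
    print(res.admits("guest", "node01", datetime(2006, 8, 1, 23, 0)))     # False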


LSF

LSF is a suite of application resource management products that schedule, monitor, and analyze the workload for a network of computers. LSF supports sequential and parallel applications running as interactive and batch jobs. The LSF package includes LSF Batch, LSF JobScheduler, LSF MultiCluster, LSF Make and LSF Analyzer, all running on top of the LSF Base system. LSF is a loosely coupled cluster solution for heterogeneous systems. There are several scheduling strategies available in LSF. They include Job Priority Based Scheduling, Deadline Constraints Scheduling, Exclusive Scheduling, Preemptive Scheduling and Fairshare Scheduling. Multiple LSF scheduling policies can co-exist in the same system [20]. Even though LSF supports several different scheduling strategies, most of them do not provide enough ways for users to specify requirements and preferences in resource allocation. The proposed scheduling strategy supports user interaction in resource allocation decisions by allowing QOS specification.

EZ-Grid

EZ-Grid is used to promote efficient job execution and controlled resource sharing across sites. It provides the policy framework to help resource providers and administrators enforce fine-grained usage policies based on authorization for the use of their resources [21]. The framework automates policy-based authorization or access control and accounting for job execution in computing grids. A major difference between this policy engine and our proposed framework is that our framework utilizes a hierarchically defined policy along three dimensions consisting of resource providers, request submitters and time, and uses submitters' Quality of Service requirements for resource allocation.
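To give a feel for the three-dimensional policy just mentioned (it is treated formally in Chapter 2), the sketch below indexes usage quotas by (resource provider, request submitter, time period). The provider names, VO names, and quota values are invented purely for illustration.

    # A hierarchical policy can be flattened into (provider, submitter, period) keys.
    # Here a quota is the fraction of a provider's CPUs a submitter may use in a period.
    policy = {
        ("uf_hpc",  "cms",   "weekday"): 0.30,
        ("uf_hpc",  "cms",   "weekend"): 0.60,
        ("uf_hpc",  "atlas", "weekday"): 0.10,
        ("caltech", "cms",   "weekday"): 0.50,
    }

    def quota(provider, submitter, period, default=0.0):
        """Return the allowed share; fall back to a default when no entry exists."""
        return policy.get((provider, submitter, period), default)

    print(quota("uf_hpc", "cms", "weekend"))     # 0.6
    print(quota("caltech", "atlas", "weekday"))  # 0.0 (no allocation defined)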


Resource Broker

The Resource Broker (RB) from the European Data Grid project provides a matchmaking service for individual jobs: given a job description file, it finds the resources that best match the user's request [22, 23]. The RB makes the scheduling decision based only on the information of individual user authentication and individual job execution requirements. Current plans suggest supporting different scheduling strategies. Our work goes beyond this by specifically accounting for VO policy constraints and VO-wide optimization of throughput via constraint matchmaking and intelligent scheduling algorithms. In addition, the proposed scheduler is designed to provide estimates of execution time so that the user may determine if a request fits within the user's deadlines. Finally, by considering the DAG as a whole, the middleware will be able to intelligently pre-determine any necessary data staging.

Pegasus

Pegasus [24] is a configurable system that can map and execute DAGs on a grid. Currently, Pegasus has two configurations. The first is integrated with the Chimera Virtual Data System. The Pegasus system receives an abstract DAG file from Chimera. Pegasus uses these dependencies to develop a concrete DAG by making use of two catalogs: the replica catalog, which provides a list of existing data components, and a transformation catalog, which stores a list of available executable components. With information from these catalogs, the Pegasus system maps the input abstract job descriptions onto grid resources. Then it adds additional jobs to provide the necessary data movement between dependent jobs. This final concrete plan is submitted to the grid execution system, DAGMan, which manages its execution.


In its second configuration, the Pegasus system performs both the abstract and concrete planning simultaneously and independently of Chimera. This use of Pegasus takes a metadata description of the user's required output products. It then uses AI planning techniques to choose a series of data movement and job execution stages that aims to optimally produce the desired output. The result of the AI planning process is a concrete plan (similar to the concrete plan in the first configuration) that is submitted to DAGMan for execution.

The framework presented in this document is distinct from the Pegasus work in many ways. For example, instead of optimizing plans benefiting individual users, the proposed framework, SPHINX, allows for globally optimized plans benefiting the VO as a whole. In addition, Pegasus currently provides advanced forward planning of static workflows. The work presented in this document is designed to dynamically plan workflows by modifying groups of jobs within a DAG (sub-DAGs) and, depending on the nature of the grid, controlling the release of those sub-DAGs to execution systems such as Condor-G/DAGMan. SPHINX is meant to be inherently customizable, serving as a modular "workbench" for CS researchers, a platform for easily exchanging planning modules and integrating diverse middleware technology. As a result, by including Pegasus planning modules in the SPHINX server, the resulting scheduling system would be enhanced by taking full advantage of the knowledge management and AI planning provided by Pegasus, while retaining the flexible dynamic workflow and just-in-time job planning provided by SPHINX.


Condor

The proposed scheduling system, SPHINX, utilizes the stable execution control and maintenance provided by the Condor system [25, 26]. The Condor Team continues to develop Condor-G and DAGMan. Recently, to improve its just-in-time planning ability, DAGMan has been extended to provide a call-out to a customizable, external procedure just before job execution. This call-out functionality allows a remote procedure to modify the job description file and alter where and how the job will be executed. SPHINX envisages using the call-out feature in DAGMan for just-in-time error recovery and corrective just-in-time planning. However, as DAGMan and SPHINX increase in functionality, DAGMan itself could become a scheduling client and communicate through this and other callouts to the scheduling server directly.

PROPHET

Prophet is a system that automatically schedules data parallel Single Program Multiple Data (SPMD) programs in workstation networks [27]. In particular, Prophet uses application and resource information to select the appropriate type and number of workstations, divide the application into component tasks, distribute data across workstations, and assign tasks to workstations. To this end, Prophet automates the scheduling process for SPMD applications to obtain reduced completion time. In addition, Prophet uses network resource information and application information to guide the scheduling process. Finally, Prophet is unique in that it addresses the problems of workstation selection, partitioning and placement together. The SPHINX system, in contrast, provides functionality for scheduling jobs from multiple users concurrently based on the policy and priorities of these jobs in a dynamically changing resource environment.


Data Analysis Systems

There are many other international Grid projects underway in other scientific communities. These can be categorized as integrated Grid systems, core and user-level middleware, and application-driven efforts. Some of these are customized for the special requirements of the HEP community. Others do not accommodate the data intensive nature of the HEP Grids and focus upon the computational aspect of Grid computing.

EGEE [28] middleware, called gLite [29], is a service-oriented architecture. The gLite Grid services aim to facilitate interoperability among Grid services and frameworks like JClarens and allow compliance with standards, such as OGSA [30], which are also based on the SOA principles.

Globus [31] provides a software infrastructure that enables applications to handle distributed heterogeneous computing resources as a single virtual machine. Globus provides basic services and capabilities that are required to construct a computational Grid. Globus is constructed as a layered architecture upon which the higher-level JClarens Grid services can be built.

Legion [32] is an object-based meta-system that provides a software infrastructure so that a system of heterogeneous, geographically distributed, high-performance machines can interact seamlessly. Several of the aims and goals of both projects are similar, but compared to JClarens the set of methods of an object in Legion is described using an Interface Definition Language.

The Gridbus [32] toolkit project is engaged in the design and development of cluster and Grid middleware technologies for service-oriented computing. It uses Globus libraries and is aimed at data intensive sciences, and these features make Gridbus conceptually equivalent to JClarens.


UNICORE [33] provides a uniform interface for job preparation, and seamless and secure access to computing resources. Distributed applications within UNICORE are defined as multipart applications where the different parts may run on different computer systems asynchronously, like the GAE services, or they can be sequentially synchronized.

NASA's IPG [34] is a network of high performance computers, data storage devices, scientific instruments, and advanced user interfaces. Due to its data-centric nature and OGSA compliance, IPG services can potentially interoperate with GAE services.

WebFlow [35], a framework for wide-area distributed computing, is based on a mesh of Java-enhanced Apache web servers, running servlets that manage and coordinate distributed computation, and it is architecturally closer to JClarens.

The NetSolve [36] system is based around loosely coupled, distributed systems, connected via a LAN or WAN. NetSolve clients can be written in multiple languages, as in JClarens, and the server can use any scientific package to provide its computational software.

The Gateway system offers a programming paradigm implemented over a virtual web of accessible resources [37]. Although it provides portal behavior like JClarens and is based on SOA, its design is not intended to support data intensive applications.

The GridLab project [38] will produce a set of Grid services and toolkits providing capabilities such as dynamic resource brokering, monitoring, data management, security, information, adaptive services and more. GAE Services can access and interoperate with GridLab services due to its SOA-based nature.


The Internet computing projects, such as SETI@Home [39] and Distributed.Net [40], which build Grids by linking many end-user PCs across the internet, are primarily number crunching projects that lack the large data management features of HEP Grids.

The Open Grid Services Architecture (OGSA) framework, the Globus-IBM vision for the convergence of web services and Grid computing, has been taken over by the Web Services Resource Framework (WSRF) [41]. WSRF is inspired by the work of the Global Grid Forum's Open Grid Services Infrastructure (OGSI) [42]. The developers of the Clarens Web Services Framework are closely following these developments.

Scheduling Algorithms

This section discusses several existing scheduling algorithms. Although the referenced algorithms work well in the traditional high performance computing environment, they do not perform in a satisfactory manner under the characteristics of grids discussed in the previous section.

Iterative List Scheduling

This work [54] introduces an iterative list-scheduling algorithm to deal with scheduling on heterogeneous computing systems. The main idea in this iterative scheduling algorithm is to improve the quality of the schedule in an iterative manner using results from previous iterations. Although the algorithm can potentially produce a shorter schedule length, it does not support resource usage policies. It is a static scheduling algorithm, which assumes an unchanged or stable computing environment. In the dynamic and policy-constrained grid computing environment the algorithm may not perform well, as the simulation results in this dissertation show.
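The following sketch captures the flavor of an iterative list-scheduling loop on a toy instance of independent tasks and heterogeneous machines: each pass re-ranks the tasks using the machine assignments found in the previous pass and keeps a new schedule only when it improves the makespan. It is a simplified illustration under invented numbers, not the algorithm of [54].

    exec_time = {"t1": {"m1": 4, "m2": 9},
                 "t2": {"m1": 8, "m2": 3},
                 "t3": {"m1": 6, "m2": 7},
                 "t4": {"m1": 5, "m2": 2}}
    machines = ["m1", "m2"]

    def build_schedule(priority):
        """List scheduling: take tasks in priority order and place each on the
        machine that gives it the earliest finish time."""
        ready = {m: 0 for m in machines}
        placement = {}
        for t in sorted(exec_time, key=priority, reverse=True):
            m = min(machines, key=lambda m: ready[m] + exec_time[t][m])
            placement[t] = m
            ready[m] += exec_time[t][m]
        return placement, max(ready.values())      # (assignment, makespan)

    # First pass: rank tasks by their mean execution time across machines.
    mean_rank = lambda t: sum(exec_time[t].values()) / len(machines)
    best_placement, best_makespan = build_schedule(mean_rank)

    # Further passes: rank tasks by their time on the machine chosen previously,
    # and keep a new schedule only when it shortens the makespan.
    for _ in range(3):
        chosen = dict(best_placement)
        placement, makespan = build_schedule(lambda t: exec_time[t][chosen[t]])
        if makespan < best_makespan:
            best_placement, best_makespan = placement, makespan

    print(best_placement, best_makespan)   # makespan 10 for this small instance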


Dynamic Critical Path Scheduling

The authors of [55] propose a static scheduling algorithm for allocating task graphs to fully connected multiprocessors. It minimizes the makespan subject to precedence constraints, which are determined by the critical path of the task graph. This homogeneous CPU-based scheduling algorithm assumes that the scheduler can manage the scheduling priority of jobs on a processor. This may not be true in a grid environment, in which the resources have decentralized ownership and different local scheduling policies dependent on their VO.

Reliability Cost Driven Scheduling

The work in [56] describes a two-phase scheme to determine a schedule of tasks with precedence constraints that employs a reliability measure as one of the objectives in a real-time and heterogeneous distributed system. The static algorithm schedules real-time tasks for maximized reliability. The utility function of the algorithm finds a processor with the earliest start time for the jobs in an application. In the presence of policy constraints the algorithm may not be able to find the proper resource allocation for the application.

Heterogeneous Earliest Finish Time Scheduling

The algorithm in [57] focuses on the appropriate selection of the weights for the nodes and edges of a directed acyclic graph, and experiments with a number of different schemes for computing these weights. The proposed technique uses the mean value approach to find the length of the produced schedule. Instead of the mean value approach, our proposed algorithm uses an iterative approach over the heterogeneous resources; the experimental results compare the two schemes. The off-line and priority-based scheduling may not be feasible in the grid-computing environment.
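To illustrate the mean value approach that HEFT-style algorithms use for node weights (a simplified reading of [57], with made-up numbers), the sketch below computes the upward rank of each task in a small DAG from its mean execution time across resources and then orders the tasks by that rank.

    # upward_rank(t) = mean execution time of t
    #                  + max over successors s of (comm(t, s) + upward_rank(s))
    mean_exec = {"A": 5.0, "B": 4.0, "C": 6.0, "D": 3.0}
    succ = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
    comm = {("A", "B"): 2.0, ("A", "C"): 1.0, ("B", "D"): 2.0, ("C", "D"): 3.0}

    ranks = {}
    def upward_rank(t):
        if t not in ranks:
            tail = max((comm[(t, s)] + upward_rank(s) for s in succ[t]), default=0.0)
            ranks[t] = mean_exec[t] + tail
        return ranks[t]

    order = sorted(succ, key=upward_rank, reverse=True)
    print(order)   # ['A', 'C', 'B', 'D']; rank(A) = 18.0, so A is considered first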


Dynamic Level Scheduling

This scheduling algorithm [58] matches a job and a processor in an exhaustive way. The job is on the critical path of a directed acyclic graph (DAG), and the job starts on the processor at the earliest time. The algorithm uses the mean value approach in a heterogeneous CPU resource environment. The static and mean value-based scheduling may not produce a good scheduling result in policy-based grid computing.

Optimal Assignment with Sequential Search

The authors of [59] describe two algorithms based on the A* technique. The first is a sequential algorithm that reduces the search space. The second proposes to lower the time complexity by running the assignment algorithm in parallel, and achieves significant speedup. The exhaustive and sequential search for the optimal assignment may not be feasible for a large tree search space, even though their modified algorithm generates random solutions and prunes the tree. Our proposed algorithm performs optimal assignment in a different scheme. We utilize sub-tree and iterative concepts instead of considering the whole tree and all heterogeneous resources.

Contributions

In this section we describe the main contributions of our research to scheduling systems and techniques in Grid computing.

Policy-driven request planning and scheduling of computational resources: We define and implement the mechanisms for representing and enforcing both local and global policy constraints. A grid scheduler needs to be able to allocate the resources to requests in a fashion that satisfies global and local constraints. Global constraints include community-wide policies governing how resources dedicated to a particular collaboration should be prioritized and allocated; local constraints include site-specific policies governing when external users can use local resources. We develop mechanisms for representing and enforcing constraints. We also develop policy-aware scheduling middleware and algorithms. We develop a novel framework for policy-based scheduling in resource allocation of grid computing. The framework has several features.

First, the scheduling strategy can control the request assignment to grid resources by adjusting resource usage accounts or request priorities. Second, efficient resource usage management is achieved by assigning usage quotas to intended users. Third, the scheduling method supports reservation-based grid resource allocation. Fourth, the Quality of Service (QOS) feature allows special privileges to various classes of requests, users, groups, etc.

A fault-tolerant scheduling system in a dynamic Grid environment: Two major characteristics of Grid computing we can point out are decentralized resource ownership and dynamic availability of resources. Shared by multiple Virtual Organizations, a Grid is meant to be an environment in which organizations mutually share resources. The composition of the grid is not homogeneous, to say the least. This makes it difficult to guarantee expected performance in any given execution environment. Another important factor is that, due to the dynamic availability of resources, the presence of unplanned downtimes of certain resources in the Grid makes scheduling decisions non-trivial, as a job planned on a site may never complete. These reasons make it very cumbersome for an application user to effectively use a grid. An application user usually throttles the jobs across the grid. The decision of how many jobs to send to a site is usually based on some static information like the number of CPUs available on the sites. However, the site with more CPUs might already be overloaded, or this particular production manager (or his VO proxy, to be precise) might have a relegated priority at that remote site. As a result jobs might get delayed or even fail to execute. In such events, the application user has to re-submit the failed jobs. An experienced user may rely on his or her past experience and submit jobs to sites which have been more reliable in the past. However, the site which was working well before may have its own priority work to be done this time, thus temporarily relegating this user's priority. The proposed scheduling middleware, named SPHINX, will effectively overcome the highly dynamic nature of the grid to harness grid resources. The system is equipped with an advanced job execution tracking module and is incorporated with other monitoring systems to maintain information about data and resource availability.

Scheduling workflow model: An application scientist typically solves his or her problem as a series of transformations. Each transformation may require one or more inputs and may generate one or more outputs. The inputs and outputs are predominantly files. The sequence of transformations required to solve a problem can be effectively modeled as a Directed Acyclic Graph (DAG) for many practical applications of interest that this work is targeting. Most existing scheduling systems assume that a user application consists of a single job, meaning that there are no precedence relationships in the workflow. In our scheduling system and algorithm development we use different types of workflows and resources. The research focuses on single task workflows initially. They are the simplest workflows, of interactive or batch type. Then, we extend the research to simple DAG-based workflows, which reflect a variety of real world scenarios. Finally, our scheduling algorithms deal with multiple DAGs while achieving objective functions simultaneously.
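As a small illustration of the DAG workflow model just described (and of the notion of a schedulable job used earlier in this chapter), the sketch below represents a workflow as a mapping from each job to its parents and reports the jobs whose predecessors have all finished. The job names are arbitrary examples, not taken from any particular application.

    # A workflow as a DAG: each job maps to the set of jobs it depends on.
    workflow = {
        "stage_in":  set(),
        "reco":      {"stage_in"},
        "analysis":  {"reco"},
        "histogram": {"reco"},
        "merge":     {"analysis", "histogram"},
    }

    def schedulable(workflow, finished):
        """Jobs whose precedence constraint is satisfied: every parent has finished
        (and the job itself has not run yet)."""
        return [job for job, parents in workflow.items()
                if job not in finished and parents <= finished]

    print(schedulable(workflow, finished=set()))                  # ['stage_in']
    print(schedulable(workflow, finished={"stage_in", "reco"}))   # ['analysis', 'histogram']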

PAGE 34

In order to test and evaluate the proposed data analysis prototype we deploy the framework across a grid testbed, Grid2003/Open Science Grid, which consists of more than 25 sites providing more than 2000 CPUs. On this testbed we exhibit remote data access, workflow generation and collaborative data analysis using virtual data and data provenance, as well as non-trivial examples of policy-based scheduling of requests in a resource-constrained grid environment.

Outline

The dissertation is organized in the following manner. We discuss policy-based scheduling for single applications in Chapter 2. The chapter presents a novel policy-based scheduling framework for obtaining a sub-optimal scheduling solution on Grid resources. Specifically, we also discuss resource allocation in the multi-dimensional policy space. We discuss the workflow-centric scheduling middleware, SPHINX, with simulation and experiment results in Chapter 3. In that chapter we introduce the core features of the SPHINX architecture, and discuss distributed data analysis utilizing SPHINX as well as fault-tolerant scheduling. It also discusses incorporating the scheduling system with other services such as JClarens and MonALISA. In Chapter 4 we present policy-based scheduling algorithms and supporting infrastructure for scheduling single and multiple workflows. We discuss scheduling techniques that are aware of given resource usage constraints and completion deadlines, and present simulation and experiment results on the Open Science Grid (OSG). We conclude the dissertation in Chapter 5 with future research plans.


CHAPTER 2
GRID POLICY FRAMEWORK

In this chapter we discuss a novel framework for policy-based scheduling in resource allocation for grid computing. The framework has several features. First, the scheduling strategy can control the request assignment to grid resources by adjusting resource usage accounts or request priorities. Second, efficient resource usage management is achieved by assigning usage quotas to intended users. Third, the scheduling method supports reservation-based grid resource allocation. Fourth, a Quality of Service (QoS) feature allows special privileges to various classes of requests, users, groups, etc. This framework is incorporated as part of the SPHINX scheduling system that is discussed in the next chapter. A set of experimental results is provided to demonstrate the usefulness of the framework.

Petascale Virtual Data Grids (PVDGs) will need to be able to satisfy all job requests by allocating storage, computer and network resources in a fashion that satisfies both global and local constraints. Global constraints include community-wide policies governing how resources should be prioritized and allocated. Local constraints include site-specific control over when external users are allowed to use local resources. As such, PVDG computing requires mechanisms for representing and enforcing policy-based scheduling techniques. Policies, including authentication, authorization, and application constraints, are important factors for maintaining resource ownership and security. The set of possible constraints on job execution can be varied, and can change significantly over time.
These constraints may include values that differ for each job, for example RAM requirements or the connectivity needed, or constraints that are static for a specific job type, such as the operating system or architecture. Policies may include any information that should be specified to ensure that a job is matched to appropriate resources [60, 61]. The proposed policy-based scheduling framework can achieve local and global optimisation in resource allocation by providing the following features:

1. Control the request assignment to grid resources by adjusting resource usage accounts or request priorities. This feature allows the scheduler to balance workloads across resources in a grid, resulting in better overall utilization and turn-around time.

2. Support reservation-based grid resource allocation. The scheduling concept makes it possible for the scheduler to assign multiple dependent requests in an optimally synchronized manner. To support this feature, the resources that participate in the scheduling should allow requests to be executed with a reserved amount of resource usage in a specific time frame. In addition to the resource-side reservation, the request must be completed within the reservation in terms of usage duration and amount.

3. Allow variable Quality of Service (QoS) privileges to various classes of requests, users, groups, etc. Beyond a basic QoS level, a request submitter can make a specific QoS request according to the privileges assigned to the class to which the submitter or the request belongs. The request submitter passes a request to the scheduler with the QoS specification. A resource allocation considering the QoS feature should be interactive between the submitter and the scheduler: they communicate with each other to adjust the QoS requirements.

Managing resource usage accounts prevents resource cycle waste and resource over-usage. Resource providers assign usage quotas to intended resource users. The scheduler monitors resource usage by keeping track of quota changes. It saves resource cycles by assigning more requests to idle resources, while reducing the number of requests assigned to over-used resources.
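To make the quota-based accounting described above concrete, the following minimal Java sketch tracks the quota a provider grants to a submitter and the portion already consumed, so that a scheduler can prefer resources with idle quota. The class, method names and values are hypothetical illustrations, not part of SPHINX, which keeps this state in database tables.

import java.util.HashMap;
import java.util.Map;

// Minimal sketch of per-(resource, submitter) usage-quota accounting.
public class QuotaLedger {
    // key: "resource|submitter" -> allotted quota (e.g., CPU-hours)
    private final Map<String, Double> allotted = new HashMap<>();
    // key: "resource|submitter" -> quota already consumed
    private final Map<String, Double> used = new HashMap<>();

    private static String key(String resource, String submitter) {
        return resource + "|" + submitter;
    }

    public void grant(String resource, String submitter, double quota) {
        allotted.put(key(resource, submitter), quota);
    }

    public void charge(String resource, String submitter, double amount) {
        used.merge(key(resource, submitter), amount, Double::sum);
    }

    // Remaining quota = allotted - used, clamped to [0, allotted].
    public double remaining(String resource, String submitter) {
        String k = key(resource, submitter);
        double q = allotted.getOrDefault(k, 0.0);
        double u = used.getOrDefault(k, 0.0);
        return Math.max(0.0, Math.min(q, q - u));
    }

    public static void main(String[] args) {
        QuotaLedger ledger = new QuotaLedger();
        ledger.grant("cluster1", "user1", 100.0);   // 100 CPU-hours allotted
        ledger.charge("cluster1", "user1", 35.0);   // a scheduled request consumes 35
        System.out.println(ledger.remaining("cluster1", "user1")); // prints 65.0
    }
}

A scheduler built on such a ledger would favor the (resource, submitter) pairs with the largest remaining quota, which is exactly the behaviour the framework formalizes later with linear programming.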


Key Features

The proposed policy-based scheduling framework can achieve local and global optimization in resource allocation by providing the following features.

Control the request assignment to grid resources by adjusting resource usage accounts or request priorities. This feature allows the scheduler to balance workloads across resources in a grid, resulting in better overall utilization and turn-around time. Heterogeneous grid resources have different capabilities for executing submitted requests. Statically adjusting the resource usage quotas of users can support workload balance across the resources. A scheduler also considers the dynamically changing workload on each machine when it makes resource allocation decisions. The updated workload information is available from a resource-monitoring module in real time. The proposed framework incorporates the quota and monitoring information into the resource allocation decision. The information is formulated in LP functions; we discuss this feature in detail in the next section.

Support reservation-based grid resource allocation. The suggested framework makes it possible for a scheduler to assign multiple dependent requests to resources in an optimally synchronized manner. This feature is made possible by reservation-driven scheduling. To support it, the resources that are available for scheduling should guarantee that requests can be executed with a reserved amount of resource usage in a specific time period. The feature also supports global Quality of Service (QoS) satisfaction for a workload consisting of several dependent tasks. In order to make an advanced reservation, a scheduler needs to estimate the request execution time and resource usage amount. A user can provide this information when she/he submits a request; alternatively, a scheduler can make the estimation based on the functions of a prediction module, such as benchmarking or historical execution records.
In the proposed framework, optimisation functions make resource allocation decisions considering the advanced reservation together with policy and anticipated execution information.

Allow variable Quality of Service (QoS) privileges to various classes of requests, users, groups, etc. A user submits a request with its QoS requirement or preference. The QoS information may contain reservation-specific requirements such as the amount of resource (CPU, storage, etc.) and a period of time. A user can also require a specific deadline for request execution, or best-effort execution. The QoS privileges differ among user classes: a class of users may have privileges for requiring more QoS than other groups do. A scheduler considering the QoS feature should interact with a request submitter to adjust the QoS achievement level. In the proposed system we assume that users specify the amount of required resource usage. The framework makes scheduling decisions using optimisation functions that satisfy the given QoS and policy constraints.

Support interactive scheduling with request submitters. A scheduler interacts with submitters in order to negotiate the QoS achievement level. In cases where the scheduler cannot achieve the service level that a submitter specifies, or where the scheduler can suggest a different QoS specification for better request execution performance, the scheduler sends the submitter alternatives to the original QoS requirement. After receiving the alternatives, the submitter considers them and sends her/his decision back to the scheduler. The decision may include accepting or denying the alternatives, or requesting another scheduling effort with a different QoS requirement.
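The interactive negotiation just described can be pictured as a simple offer/counter-offer exchange. The Java sketch below is only an illustration of that loop; the QosSpec type, the propose method and all numeric values are invented for this example and do not correspond to SPHINX interfaces.

// Simplified sketch of the QoS negotiation loop between a submitter and the scheduler.
public class QosNegotiation {

    // A QoS request: required CPU-hours and a completion deadline (hours from now).
    static class QosSpec {
        final double cpuHours;
        final double deadlineHours;
        QosSpec(double cpuHours, double deadlineHours) {
            this.cpuHours = cpuHours;
            this.deadlineHours = deadlineHours;
        }
    }

    // Scheduler side: either accept the spec or propose a relaxed alternative.
    static QosSpec propose(QosSpec requested, double earliestFeasibleHours) {
        if (requested.deadlineHours >= earliestFeasibleHours) {
            return requested; // achievable as asked
        }
        // Counter-offer: same work, later deadline that the scheduler can guarantee.
        return new QosSpec(requested.cpuHours, earliestFeasibleHours);
    }

    public static void main(String[] args) {
        QosSpec asked = new QosSpec(50.0, 10.0);        // 50 CPU-hours within 10 hours
        QosSpec offer = propose(asked, 16.0);           // scheduler can only finish in 16
        boolean accepted = offer.deadlineHours <= 20.0; // submitter's own tolerance
        System.out.println("offered deadline: " + offer.deadlineHours
                + " h, accepted: " + accepted);
    }
}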


Managing resource usage accounts prevents resource cycle waste and resource over-usage. In addition to workload balancing, the resource management system can control resource usage by adjusting the usage quotas of intended resource users. The scheduler monitors resource usage by keeping track of quota changes. It saves resource cycles by assigning more requests to idle resources, while reducing the number of requests assigned to over-used resources.

Policy Space

Grid computing requires collaborative resource sharing within a Virtual Organization (VO) and between different VOs. Resource providers and request submitters who participate within a VO share resources by defining how resource usage takes place in terms of what, where, who, and when it is allowed. Accordingly, we assume that policies may be represented in a three (plus one) dimensional space consisting of resource provider (and property), request submitter, and time. We further assume that quota limits are descriptive enough to express how a resource may be used. By exploiting the relational character of policy attributes, a policy description space may be conveniently represented as a hierarchical tree. Indeed, the heterogeneity of the underlying systems, the difficulty of obtaining and maintaining global state information, and the complexity of the overall task all suggest a hierarchical approach to resource allocation. Further, such a hierarchical approach allows for a dynamic and flexible environment in which to administer policies. Three of the dimensions in the policy space, consisting of resource provider, request submitter and time, are modelled as hierarchical categorical policy attributes expressed in terms of quotas. An extra dimension, resource property, is modelled as a simple categorical attribute attached to the hierarchical resource provider dimension.
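A categorical-tree dimension of this policy space can be modelled directly as a small recursive data structure. The Java sketch below is a hypothetical illustration (names and the attached quota value are invented); the same shape serves the resource provider, request submitter and time dimensions discussed in the following sections.

import java.util.ArrayList;
import java.util.List;

// Sketch of one categorical-tree dimension of the policy space.
public class PolicyNode {
    final String name;
    final List<PolicyNode> children = new ArrayList<>();
    Double quotaLimit; // optional quota attached at this level (null = none)

    PolicyNode(String name) { this.name = name; }

    PolicyNode addChild(String childName) {
        PolicyNode child = new PolicyNode(childName);
        children.add(child);
        return child;
    }

    // Collect the leaf nodes under this node (the granularity a scheduler acts on).
    List<PolicyNode> leaves() {
        List<PolicyNode> out = new ArrayList<>();
        if (children.isEmpty()) { out.add(this); return out; }
        for (PolicyNode c : children) out.addAll(c.leaves());
        return out;
    }

    public static void main(String[] args) {
        PolicyNode grid = new PolicyNode("Grid_1");
        PolicyNode domain1 = grid.addChild("Domain_1");
        PolicyNode cluster1 = domain1.addChild("Cluster_1");
        cluster1.addChild("Machine_1");
        cluster1.addChild("Machine_2");
        grid.addChild("Domain_2");

        domain1.quotaLimit = 500.0; // a quota defined at the domain level (illustrative)
        System.out.println("leaves under Grid_1: " + grid.leaves().size());
    }
}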


Administrators, resource providers or requesters who participate in a VO then define resource usage policies (in terms of quotas) involving various levels of this hierarchical space. In the following sections we discuss the hierarchical representation of each dimension.

Resource Provider

A resource provider is defined as an entity which shares some particular physical resource within the context of a grid. Physical resources participating in a grid are often naturally organised into hierarchical groups consisting of both physical and logical views. In the example shown in Figure 2-1, a hypothetical grid consists of many domains, each containing one or many clusters of machines. The representation in Sphinx is generalised such that arbitrary levels in the resource provider hierarchy may be added to maintain scalability. For example, the resource provider domain level might represent a set of gatekeepers at particular sites, or it might represent a sub-grid, itself containing local domains. It is common to assume that each compute-cluster has a local resource scheduling system (such as Condor, PBS, or LSF, to name a few) to serve remote requests, but there may be compute-clusters which do expose individual machines to the grid for direct access. The hierarchical dimension (categorical tree) representing resource providers is given the symbol $R_{H,provider}$. Sphinx performs global scheduling to allocate remote resources for requests across an entire grid. As such, Sphinx specifies resource allocation down to the resource provider level at which either a local resource scheduler or a sub-grid scheduler exists.

Resource Property

Each resource provider is augmented by a list of non-hierarchical properties, such as CPU, memory, storage space, bandwidth, etc. At any level in the resource provider hierarchy, quota limits for each resource property are appropriately aggregated.
The categorical list representing resource properties is given the symbol $R_{property}$.

Figure 2-1. Examples of resource provider and request submitter hierarchies

Request Submitter

A request submitter is defined as an entity which consumes resources within the context of a grid. As in the case of resource providers, request submitters are naturally structured in a hierarchical manner. In the example shown in Figure 2-1, a typical VO might be described by several groups of users, each with possible proxies (e.g. jobs) acting on their behalf. Again, the representation in Sphinx is generalised so that arbitrary levels in the request submitter hierarchy may be added to maintain scalability or convenient logical views of a VO. For instance, a group may consist of individual grid users (each with a particular certificate) or other sub-groups, each containing sub-sub-groups or users. In analogy with the resource provider hierarchy, the deepest levels in the request submitter hierarchy represent entities possessing a grid certificate or proxy. In general, a request for grid resources may be submitted (by a sanctioned user with a certificate) from any level. This enables the pooling of account quotas to service high-priority requests emanating from any particular level in the request submitter hierarchy (such as a group or even the VO itself). The hierarchical dimension (categorical tree) representing request submitters is given the symbol $S_H$.
Time

Time is used to allow for possible dynamism in policies in which, for example, maximum quota limits may change in a well- and pre-defined way (e.g. periodically) for request submitters. In addition, in order to provide a quality of service, the scheduler will need to plan and test, at various time frames in the future, possible resource allocation and de-allocation strategies. From a scheduling point of view, recent times are more important than times in the far future. Hence, Sphinx models the hierarchical nature of time by forming time frames using an adaptive logarithmic mesh, allowing for a more finely or coarsely grained description of policies, depending on present or future scheduling requirements. The categorical list representing time is given the symbol $T$.

Three-dimensional Policy Space

As the policy space of quota limits defined above is comprised of categorical trees or lists, we construct a hierarchical policy tensor by taking the triple cross product of the hierarchical dimensions:

$Q_H = R_H \times S_H \times T$

where $R_H = R_{H,provider} \times R_{property}$ represents resource providers (with property attributes), $S_H$ represents request submitters, and $T$ time. Each element of $Q_H$ represents an allotted quota limit. There are several ways to extract meaningful policy views of such a space. For example, $Q_H$ represents a volume of all possible quotas from all hierarchical levels of providers applied to all hierarchical levels of submitters and using all time scales. However, in the context of scheduling, a description of policies corresponding to particular aggregate levels of providers, submitters, and times is often more germane than using all possible hierarchical views.
One simple example is that of a grid site (resource provider) which routes all incoming jobs to a compute-cluster via a local scheduler. In such a situation, enforcing fine-grained policies for each machine is not possible using a grid-wide scheduler (such as Sphinx), but enforcing site-aggregate policies is possible. Hence, one is more often interested in a particular view which collapses parts of the hierarchical policy space volume onto a sub-space, or aggregate surface, containing high-level or fine-grained policy descriptions, depending on the need. In this spirit, we first determine the tree level of interest for each branch (thereby defining leaf-nodes of differing granularity) of each dimension and then form the resulting triple cross product of leaf-nodes:

$Q_L = R_L \times S_L \times T$

Such a construction of $Q_L$, which is not unique, essentially allows for a flexible and efficient description of global and local policies. Various policy definitions are possible from the combination of leaf nodes at different levels of the Resource-, Entity- and Time-policy hierarchy trees. Possible combinations from the trees make hierarchical policies in the three dimensions: Resource, Entity and Time. From the hierarchy trees of Figure 2-2, the combination (u2, r2, t3) means that there is a policy for User_1 on Cluster_1 in the time unit of a week; for example, User_1 can use one week of CPU time on machines of Cluster_1 during the month of July. (u5, r1, t2) means that Group_2 can use resources of Domain_1 for three months in a year. (u1, r5, t1) and (u1, r1, t2) define policies for VO_1 on Domains 1 and 2 of Grid_1 in the time units of year and month, respectively.
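The leaf-node combinations read off Figure 2-2 can be stored as entries keyed by (submitter, provider, time unit). The Java sketch below is only an illustration of that bookkeeping: the container class is hypothetical and the quota values are invented, since the text does not specify them numerically.

import java.util.HashMap;
import java.util.Map;

// Sketch of leaf-node policy entries (u, r, t) -> quota, following the
// combinations discussed around Figure 2-2. Quota values are invented.
public class LeafPolicyTable {
    private final Map<String, Double> quotas = new HashMap<>();

    void define(String submitter, String provider, String timeUnit, double quota) {
        quotas.put(submitter + "," + provider + "," + timeUnit, quota);
    }

    Double lookup(String submitter, String provider, String timeUnit) {
        return quotas.get(submitter + "," + provider + "," + timeUnit);
    }

    public static void main(String[] args) {
        LeafPolicyTable t = new LeafPolicyTable();
        t.define("User_1", "Cluster_1", "Week", 168.0);   // policy at (user, cluster, week) granularity
        t.define("Group_2", "Domain_1", "Month", 2160.0); // policy at (group, domain, month) granularity
        t.define("VO_1", "Domain_2", "Year", 8760.0);     // policy at (VO, domain, year) granularity
        System.out.println(t.lookup("User_1", "Cluster_1", "Week")); // 168.0
    }
}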


Figure 2-2. Hierarchical policy definition example

A Solution Strategy for Policy-based Scheduling

Given that a leaf-node tensor of policies has been defined as described in the previous section, we derive and apply an accounting tensor $A$ which constrains a search for an optimal solution $x$ for allocating resources in support of the requirements from a particular request $b$, using the well-known method of Linear Programming (LP). Specifically, we seek to minimize an objective function $f$ (representing some heuristic knowledge of the grid):

$\min[f(x)]$ subject to $A(t) \cdot x \ge b$

where $\cdot$ denotes the inner product of $A(t)$ with $x$.

Model Parameters

Definition: Let $Q$ be a tensor representing a leaf-node view of allotted quota limits (including pre-defined changes over time), and let $U(t)$ be a tensor representing the amount of quota actually used at current time $t$. We then define $A(t)$ to be a tensor representing the amount of quota still available at current time $t$,

$A(t) = Q - U(t)$,

subject to the constraint that the remaining available quota always lies within the maximum allotted quota:
$\forall\, i,j,k,l:\quad 0 \le A_{ijkl}(t) \le Q_{ijkl}$

where $i$ indexes all resource properties, $j$ indexes all (leaf-node) resource providers, $k$ indexes all (leaf-node) request submitters and $l$ (logarithmically) indexes time. Note that $t$ and $T$ refer to different uses of time. The functional dependence of $U(t)$ and $A(t)$ on the current time $t$ explicitly recognizes the fact that $U(t)$ and $A(t)$ are updated in real time according to actual resource usage monitoring information. This is distinguished from the $T$ dependence of $Q$, for example, which is not updated in real time but which does define quota limits and their pre-defined (e.g. possibly periodic) future variation by indexing $T$ relative to the present. In particular, $Q_{ijkl}$ represents present quota limits for $l = 0$ and future quota limits for $l > 0$.

Definition: Let $W(t)$ be a tensor representing the normalised (current and forecasted) grid weather such that

$W(t)_{ijk} = \text{(amount of resource in use)} / \text{(resource size)}$

where $i$ indexes all resource properties, $j$ indexes all (leaf-node) resource providers, and $k$ (logarithmically) indexes future steps in time from the present ($k = 0$). The estimated impact on grid weather from future resource allocation may be recursively obtained from the current grid weather $W(t)_{ij0}$ by the relation

$W(t)_{ij(k+1)} = W(t)_{ijk} + \Delta W(t)_{ijk}$

where $\Delta W(t)_{ijk}$ represents any future allocations (or de-allocations) of resources which are to be scheduled on property $i$ of resource $j$ during the $k$-th interval of time away from the present.
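The two bookkeeping rules just defined are easy to mirror in code. The Java sketch below shows them for a single (property, provider, submitter, time) cell; the class and method names are hypothetical and the numbers are illustrative.

// Sketch of the available-quota and grid-weather updates defined above.
public class ModelParameters {

    // Remaining quota A(t) = Q - U(t), constrained to 0 <= A <= Q.
    static double availableQuota(double allotted, double used) {
        return Math.max(0.0, Math.min(allotted, allotted - used));
    }

    // Forecast grid weather: W[k+1] = W[k] + dW[k], where dW[k] is the load
    // added (or removed) by allocations planned for the k-th future interval.
    static double[] forecastWeather(double current, double[] plannedDeltas) {
        double[] w = new double[plannedDeltas.length + 1];
        w[0] = current;
        for (int k = 0; k < plannedDeltas.length; k++) {
            w[k + 1] = w[k] + plannedDeltas[k];
        }
        return w;
    }

    public static void main(String[] args) {
        System.out.println(availableQuota(100.0, 35.0));               // 65.0
        double[] w = forecastWeather(0.40, new double[] {0.10, -0.05});
        System.out.println(w[1] + ", " + w[2]);                        // 0.5, 0.45
    }
}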


Definition: Let $b$ be a tensor representing the required amount of resource requested, where $b_{ij}$ is indexed by $i$ over all resource properties and $j$ over all (leaf-node) request submitters.

Definition: The Atomic Job Criterion (AJC), for some particular resource provider $J$, request submitter $K$, and time frame $L$, is defined to be

$(\exists\, i:\ A_{iJKL}(t) < b_{iK}) \;\Rightarrow\; (\forall\, i:\ x_{iJKL} = 0)$

and states that if there exists a resource property $i$ which cannot satisfy the requested resource amount, then the corresponding resource provider $J$ is removed from the space of feasible resource providers during the particular time frame $L$. This condition reduces the space of resource providers to include only those sites that are capable of accommodating the full request (i.e. the ability to satisfy all requirements declared in $b$). For jobs that are not atomic, for example splittable jobs suited to data parallelism, one would not impose such a stringent constraint as the AJC. In this chapter, we will only consider jobs that cannot be split into smaller sub-jobs.

We wish to find a solution tensor $x = [x_{ijkl}]$ which provides the optimal location to service the request within the reduced space of feasible providers, where $i$ indexes resource properties, $j$ indexes (leaf-node) resource providers, $k$ indexes (leaf-node) request submitters, and $l$ (logarithmically) indexes time.

Choice of an Objective Function and Optimization Metric

Several sophisticated objective functions and heuristic algorithms will be explored in the future. The simple choice of objective function here is one in which the preferred location to service the request is that which least impacts both the grid weather and the request submitter's account usage:
$\min[f_{KL}(x)] = \min\Big[\sum_{i,j,k,l} W_{ijl}\, x_{ijkl}\, \delta_{lL}\, \delta_{kK}\Big]$ subject to $\sum_{j,l} A_{ijkl}\, x_{ijkl} \ge b_{ik}$

where $\delta_{mn}$ is the Kronecker delta function (which is unity if $m = n$ and zero otherwise), $K$ corresponds to the particular request submitter, and $L$ is some particular (logarithmic) time view corresponding to possible variation in quota limits. Such a simple strategy only provides greedy scheduling decisions for request submitter $K$ within a certain time frame $L$, but it does attempt to improve the flexibility of future scheduling choices by disfavouring resource providers in which the remaining quota for submitter $K$ would be largely consumed. It may be shown that, for the case in which policies are constant for all times $l$, the above simple objective function $f_{KL}(x)$ is minimised when

$x_{ijKL} = 0$ (non-optimal provider of a resource property),
$x_{ijKL} = b_{iK} / A_{ijKL}$ (unique optimal provider of a resource property),
$x_{ijKL} = (b_{iK} / A_{ijKL}) / N$ ($N$ identical optimal providers of a resource property).

Hence, non-zero entries of $x_{ijkl}$ are interpreted as representing the fraction of quota to be used from the remaining available quota for resource property $i$ at provider $j$ for request submitter $k$ during time interval $l$; $x$ itself represents the solution which minimally impacts grid weather and which minimally uses the remaining available quota for request submitter $K$. (Note: in the rest of this chapter we suppress the functional time-dependent notation of $U(t)$, $A(t)$ and $W(t)$ for the sake of clarity. It is understood, however, that $U$, $A$, and $W$ are updated in real time according to present conditions.)
While using the objective function $f_{KL}$ as defined above, one is nearly assured of finding a unique resource provider favouring any resource property individually; unfortunately, one is not assured of finding a unique resource provider which is favoured for all resource properties together. One effective way to distinguish between resource providers $j$ which already optimally satisfy at least one of the requirements from request submitter $K$ is to define a secondary optimisation metric based upon quota usage for all resource properties at a particular resource provider. That is, over every resource provider $j$ containing at least one non-zero resource property entry in the LP solution vector ($x_{jKL} \ne 0$), calculate the minimum length:

$\min_{j} \Big[ \sqrt{ \textstyle\sum_i \left( b_{iK} / A_{ijKL} \right)^2 } \Big]$

This simply states that the favoured resource provider is that particular $j$ which minimises the overall use of quotas, considering all required properties corresponding to request submitter $K$ (and time window $L$). Used in connection with the objective function above, this algorithm chooses a unique resource provider (up to multiple sites with identical lengths) from the set of resource providers with favoured properties. Finally, if there are multiple resource providers, each with identical and minimal overall quota usage, a unique solution is made by randomly choosing one of the providers.

Quality of Service Constraints

Quality of service (QoS) constraints are supplied by a request submitter $K$ in addition to, and in support of, a particular request requirement tensor $b$, and may be expressed (for resource property $i$ at resource provider $j$) in the form

$z_{ijK} < f_{ijK} < Z_{ijK}$
where $z_{ijK}$ ($Z_{ijK}$) is a lower (upper) limit and $f_{ijK}$ is some function of $Q$, $U$, $W$, $x$, and/or $b$. In this chapter, we consider only a "greedy" realisation of quality of service, in the sense that a quality of service is guaranteed to a request submitter by considering only that requester's individual desires, while disregarding other unscheduled requests in the system. This implies a time ordering and prioritization of requests. A more "socialised" quality of service, realised by simultaneously considering the requester's individual desires as well as the overall cost (i.e. the impact placed on all other unscheduled requests currently in the system), will appear in a forthcoming chapter. One simple example of a quality of service constraint that we investigate here is that of a submitter-supplied deadline for request completion, or end date $D_E$. Let us assume that the element $b_{0K}$ in the request requirement tensor $b$ represents a usage requirement on CPU time (e.g. in SI2000-hours) from request submitter $K$. Then let $C_j(b_{0K})$ represent the estimated wall-clock completion time on resource $j$. In order to meet the desired quality of service at any particular resource $j$, the request must begin to be serviced on or before the start date

$D_S = D_E - C_j(b_{0K})$

Such a request for quality of service may be interpreted in the context of the constraint form above as

$z_{0jK} = \mathrm{date}[l] < D_E - C_j(b_{0K}) = f_{0jK}$

where $j$ represents a (leaf-node) resource, $l$ represents (logarithmic) planning steps in time away from the present ($l = 0$), and $\mathrm{date}[l]$ converts discrete, relative time steps in $l$ to an absolute date. By determining the latest possible start date over all resource providers $j$,

$\mathrm{date}[P] = \max_j \big[\, D_E - C_j(b_{0K}) \,\big]$

one defines a feasible period of time (relative to now), $l < P$, in which the request may be serviced and still meet the specified QoS. Such a QoS constraint may be simply
imposed by restricting the sum over all future time intervals $l$ to just the feasible time period $l < P$:

$\min[f_{KL}(x)] = \min\Big[\sum_{i,j,k,\,(l<P)} W_{ijl}\, x_{ijkl}\, \delta_{kK}\Big]$
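The deadline computation above amounts to a one-pass maximum over the providers' estimated completion times. The Java sketch below illustrates it with invented numbers; the method name and the hour-based time scale are assumptions made only for this example.

// Sketch of the deadline constraint above: for each provider j with estimated
// completion time C_j, the request must start by D_E - C_j; the feasible
// planning window keeps only time steps before the latest such start date.
public class DeadlineWindow {

    static double latestStart(double deadline, double[] completionTimes) {
        double latest = Double.NEGATIVE_INFINITY;
        for (double c : completionTimes) {
            latest = Math.max(latest, deadline - c);
        }
        return latest;
    }

    public static void main(String[] args) {
        double deadline = 48.0;                 // D_E: 48 hours from now
        double[] cj = { 10.0, 30.0, 20.0 };     // estimated completion time per provider
        double p = latestStart(deadline, cj);   // latest feasible start: 38.0 hours
        double[] timeSteps = { 0.0, 8.0, 16.0, 32.0, 40.0 };
        for (double l : timeSteps) {
            if (l < p) {
                System.out.println("time step " + l + " h is inside the feasible window");
            }
        }
    }
}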

Simulation Results

The policy-based scheduling that has been described in mathematical form is simulated in MATLAB. The simulation assumes ten resources are available for request assignment by the scheduling strategy. The first graph in Figure 2-3 shows the workload that each of the ten resources carries in five time frames before any resource assignment for a request is made. The workload information for a resource in a time frame can be obtained from the estimated load status on the resource, using information such as the current grid weather and the request execution history in a Grid. Resources with IDs 3 and 9 have a higher initial workload than the other resources, while resources with IDs 1, 5 and 8 carry lighter loads than the others at the beginning of the series of resource allocations. The second graph presents the resource usage quota change. The dotted line in the graph shows the initial usage quotas that are available to a request submitter before the request submission, while the columns represent the quotas that remain on the resources after the resource assignment. The third graph shows the request distribution among the resources after the policy-based scheduler completes the resource allocation for 70 requests. Each column in the graph represents the number of requests assigned to a resource in the time frames. The fourth graph shows the workload distribution on the resources after the scheduler completes the resource allocation. These results show that the resource allocation depends not only on the workload but also on the resource usage quota for a request submitter. The policy-based scheduler achieves the resource assignment optimisation by considering the evenly consumed usage quota and the workload balance among the resources.
For example, we can see that the resource with ID 1 has been assigned the largest number of requests in the simulation because it has less overall workload across all the time frames, and the request submitter is given a high initial usage quota on that resource. In the case of request assignment to the resource with ID 8, even though the overall workload on the resource is less than on other resources, a small number of requests is assigned to it because the quota for the submitter on that resource is smaller than on other resources.

Figure 2-4. Policy based scheduling simulation results with highly biased resource usage quotas (in the legends of the graphs, TFi means the i-th time frame, and InitQ represents the initially given resource usage quotas for a request submitter). A) Evenly distributed workload on the resources. B) Highly biased resource usage quotas. C) Resource allocation distribution. D) Changed resource workload.

A resource provider who manages the resource usage policies for the request submitters may change the policies to control resource usage. For instance, the provider can increase the usage quota on less loaded resources to increase their utilization, while decreasing the quota on overloaded resources to prevent the scheduler from assigning additional requests to them. From the results, the quota on the resource with ID 3 should be decreased to reduce its workload in future time frames.
In the opposite case, the quota on the resource with ID 8 should be increased because the resource has been less loaded than the others. Increasing the quota causes the scheduler to assign a larger number of requests to that resource. The results in Figures 2-4 and 2-5 show how the resource usage quotas and the resource workload distribution affect resource allocation. Figure 2-4a shows the case where the workloads are evenly distributed over the five time frames before a policy-based scheduler starts a series of resource allocations. Figure 2-4b shows that the resource usage quotas for a request submitter are highly biased, in the sense that the resource with ID 1 provides the highest quota, whereas the resource with ID 10 allows the submitter only the smallest quota. Given these initial conditions, the scheduler allocates the resources to requests following the biased quota allowance, as seen in Figure 2-4c. Because the given workloads are the same on all resources in all the time frames, the unevenly distributed quotas alone determine the resource allocation distribution. Figure 2-4d presents the changed resource workload after the completed resource allocation. Figure 2-5 presents a case in which the resource workloads are highly biased, while the resource usage quotas are the same on each resource in all the time frames at the time of resource allocation. With this initial resource condition, the request assignment is also highly biased, following the resource workload distribution. The results presented above show that a policy-based scheduler or resource providers can control the workload on Grid resources by properly assigning resource usage quotas to request submitters, taking into consideration the load statuses to which the resources have been exposed.
The scheduler can also achieve load-balanced request assignment on the resources by utilizing the resource usage quotas.

Figure 2-5. Policy based scheduling simulation results with highly biased workload on resources (in the legends of the graphs, TFi means the i-th time frame, and InitQ represents the initially given resource usage quotas for a request submitter). A) Highly biased workload on the resources. B) Resource usage quota change. C) Resource allocation distribution. D) Changed resource workload.

Future Works

We have developed a novel framework for policy-based scheduling and quality of service in grid computing. This framework uses a unique hierarchical representation of the key variables required for scheduling, providing an effective and flexible representation. It also allows for a linear programming based solution strategy that is extensible to the inclusion of additional constraints and variables. Linear programming methods are well studied in the literature, and several packages are available that provide fast and effective solutions based on the type and number of variables involved. Initial experimental results demonstrate the usefulness of the framework and of linear programming based solution methods for effective scheduling in multiple situations involving policy and quality of service requirements.
We are currently integrating a scheduling engine based on this framework into SPHINX, and will present experimental results of the above policy framework on actual grids, as well as the execution overhead of the solution strategy, in the final version of this chapter.


CHAPTER 3
SPHINX: POLICY-BASED WORKFLOW SCHEDULING

A grid consists of high-end computational, storage, and network resources that, while known a priori, are dynamic with respect to activity and availability. Efficient scheduling of requests to use grid resources must adapt to this dynamic environment while meeting administrative policies. In this chapter, I first discuss the necessary requirements of a grid scheduler. I then present a scheduling framework called SPHINX that incorporates these unique grid characteristics and implements the policy-based scheduling technique. The chapter also presents methods for integrating this framework with related infrastructure for workflow management and execution. I present early experimental results showing that SPHINX effectively utilizes other grid infrastructure such as workflow management systems and execution systems. These results demonstrate that SPHINX can effectively schedule work across a large number of distributed clusters that are owned by multiple units in a virtual organization.

Requirements of a Grid-scheduling Infrastructure

A grid is a unique computing environment. To efficiently schedule jobs, a grid scheduling system must have access to important information about the grid environment and its dynamic usage. Additionally, the scheduling system must meet certain fault tolerance and customizability requirements. This section outlines the different types of information the scheduling framework must utilize and the requirements a scheduler must satisfy.


Information Requirements

A core requirement for scheduling in the dynamic grid environment is to successfully map tasks onto a dynamically changing resource environment while maximizing the overall efficiency of the system. The scheduling algorithm that performs this mapping must consider several factors when making its decision. Seven factors significantly affect this scheduling decision.

Execution time estimation. Because of the heterogeneous nature of grid resources, their real execution performance differs from the optimal performance characterized by analytic benchmarking [6]. However, the real execution time can be effectively estimated, even on heterogeneous grid resources, by statistically analyzing the performance of past executions [62]. Along with the past execution information, several methods such as statistical analysis, neural networks or data mining can be used to estimate the execution time of a task [63] (a small illustrative sketch of such an estimator follows below).

Usage policies. Policies, including authentication, authorization, and application constraints, are important factors for maintaining resource ownership and security. The set of possible constraints on job execution can be varied and can change significantly over time. These constraints can include values that differ for each job, for example RAM requirements or the connectivity needed, or constraints that are static for a specific job type, such as the operating system or architecture. Policies may include any information that should be specified to ensure that a job is matched to appropriate resources [60, 61].

Grid weather. The scheduling system must keep track of the dynamically changing load and availability of grid resources. In addition, faults and failures of grid resources are certain to occur. The state of all critical grid components must be monitored, and the information should be available to the scheduling system.
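The execution-time estimation factor described above can be made concrete with a simple statistic over past runs, in the spirit of the average-and-variance scheme SPHINX's prediction engine uses as its initial implementation. The Java sketch below is illustrative only; the class name and the sample run times are invented.

import java.util.ArrayList;
import java.util.List;

// Minimal history-based execution-time estimator: running mean and variance
// of past run times for one (application, site) pair.
public class RuntimeEstimator {
    private final List<Double> pastRuntimes = new ArrayList<>();

    void record(double seconds) { pastRuntimes.add(seconds); }

    double mean() {
        double sum = 0.0;
        for (double r : pastRuntimes) sum += r;
        return pastRuntimes.isEmpty() ? 0.0 : sum / pastRuntimes.size();
    }

    double variance() {
        double m = mean(), acc = 0.0;
        for (double r : pastRuntimes) acc += (r - m) * (r - m);
        return pastRuntimes.isEmpty() ? 0.0 : acc / pastRuntimes.size();
    }

    public static void main(String[] args) {
        RuntimeEstimator est = new RuntimeEstimator();
        est.record(120.0);
        est.record(140.0);
        est.record(130.0);
        System.out.println("estimate: " + est.mean() + " s, variance: " + est.variance());
    }
}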


Resource descriptions. Due to the heterogeneous nature of the grid, descriptions of grid resource properties are vital. Such descriptions include configuration information such as pre-installed application software, execution environment information such as paths to local scratch space, as well as hardware information. In addition, grid resources may often only be available to users at an aggregate or logical level, hiding the actual physical resource used to execute a job. Hence, the ability to categorize as well as to both finely and coarsely describe grid resource properties is important.

Replica management. The scheduling system must arrange for the necessary input data of any task to be present at its execution site. Individual data locations can have different performance characteristics and access control policies. A grid replica management service must discover these characteristics and provide a list of the available replicas so the scheduler can make a replica selection.

Past and future dependencies of the application. Grid task submission is often expressed as a set of dependent subtasks and modeled as a Directed Acyclic Graph (DAG). In this case, the subtasks are represented by nodes in a graph, and the dependencies by branches. When allocating resources to the subtasks, inter-task dependencies affect the required data movement among resources. Utilizing this dependency information can generate a provably optimal scheme for communicating shared data among subtasks [64].

Global job descriptions. A grid is primarily a collaborative system. It is not always ideal within the grid to schedule a particular task or group of tasks for maximum efficiency. A good balance needs to be struck between the requirements of each user and the overall efficiency.
It is important for a grid scheduling system to have a global list of pending jobs so that it can optimize scheduling to minimize this global cost.

System Requirements

While the kinds of information above should be available to the system for efficient grid scheduling, the following requirements must be satisfied in order to provide efficient scheduling services to a grid Virtual Organization (VO) community.

Distributed, fault-tolerant scheduling. Clearly, scheduling is a critical function of the grid middleware. Without a working scheduling system (human or otherwise), all processing on the grid would quickly cease. Thus, any scheduling infrastructure must be strongly fault-tolerant and recoverable in the inevitable case of failures. This need for fault tolerance consequently gives rise to a need for a distributed scheduling system. Centralized scheduling leaves the grid system prone to a single point of failure. Distributing the scheduling functionality among several agents is essential to providing the required fault tolerance.

Customizability. Many different VOs will interact within the grid environment, and each of these VOs will have different application requirements. The scheduling system must be customizable enough to provide each organization with the flexibility to optimize the system for its particular needs.

Extensibility. The architecture of the scheduling system should be extensible to allow for the inclusion of higher-level modules into the framework. Higher-level modules could help map domain-specific queries onto more generic scheduling problems and domain-specific constraints onto generic scheduling constraints.

Interoperability with other scheduling systems. Any single scheduling system is unlikely to provide a unique solution for all VOs. In order to allow cooperation at the level of VOs, for example in a hierarchy of VOs or among VO peers, the scheduling system within any single VO should be able to route jobs to or accept jobs from external VOs, subject to policy and grid information constraints.
Inter-VO cooperation is an architectural choice reflecting a trade-off between synchronization and flexibility of low-level middleware choices and configurations across large organizations.

Quality of service. Multiple qualities of service may be desirable, as there are potentially different types of users. There are users running small jobs who care about quick turnaround time or interactive behavior from the underlying system. On the other hand, large production runs may acceptably be executed as batch jobs. Users may set deadlines by which submitted jobs should be completed. The scheduling system should be able to provide these differential QoS features for administrators and users.

Figure 3-1. Sphinx scheduling system architecture


Highlights of SPHINX Architecture

SPHINX is a novel scheduling middleware for a dynamically changing and heterogeneous grid environment. In addition to the infrastructure requirements for grid scheduling described in the previous section, SPHINX focuses on several key functionalities in its architecture, shown in Figure 3-1, for efficient and fault-tolerant scheduling service.

Easily accessible system. SPHINX is modeled as an agent-based scheduling system consisting of two parties, the client and the server. The separation supports easy system accessibility and adaptability. The client is a lightweight, portable scheduling agent that represents the server for processing scheduling requests. It provides an abstraction layer to the service, and supports a customized interface to accommodate user-specific functionalities.

Automated procedure and modular architecture. SPHINX consists of multiple modules that perform a series of refinements on a scheduling request. The procedure begins in a start state, and the final state is annotated as finished, which indicates that the resource allocation has been made for the request. Each module takes a request and changes its state according to the module's functionality. The system easily modifies or extends the scheduling automation by making the necessary changes to a module without affecting the logical structure of other modules. Table 3-1 shows all the states for jobs and DAGs defined in the current SPHINX prototype.

Robust and recoverable system. The SPHINX server adopts a database infrastructure to manage the scheduling procedure. Database tables support inter-process communication among the scheduling modules in the system. A module reads the scheduling state of a request from the tables, edits the state, and writes the modification back to the tables.
This design also supports fault tolerance by making the system easily recoverable from internal component failure.

User interactive system. SPHINX supports user interaction in its resource allocation procedure. A user submits a request with a quality of service (QoS) requirement. The requirement may specify a resource usage amount and period. It is challenging to satisfy such a specification in a dynamically changing grid environment. SPHINX facilitates the scheduling decision by negotiating QoS achievement with the user.

Table 3-1. Finite automaton of SPHINX scheduling status management.

DAG states:
  Unreduced   - The DAG has yet to be processed by the DAG reducer.
  Unpredicted - The completion time of the DAG has not been estimated by the prediction engine.
  Unaccepted  - The completion time has been estimated for this DAG, and the server is waiting for the client to accept this estimation for final processing.
  Unfinished  - The DAG has begun processing but has not completed.
  Remove      - All jobs within this DAG have completed and can be removed from the queue during the next cleanup cycle.

Job states:
  Unpredicted - The completion time of this job has not been estimated by the prediction engine.
  Unaccepted  - The DAG containing this job has not been accepted for final execution by the client.
  Unplanned   - The job has been accepted for scheduling by this server, but the planner has not created an execution plan for this job.
  Unsent      - The job is planned, but has not been sent to the client for execution.
  Unfinished  - The job is running on a remote machine and must be tracked by the tracking system.
  Remove      - The job has finished execution and no longer must be tracked by the tracking system. If the parent DAG for this job is also finished, the job may be removed from the tracking system during the next cleanup phase.
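The job states of Table 3-1 form a simple progression that each server module advances one step at a time. The Java sketch below illustrates this idea with an ordered enum; the strictly linear ordering is an assumption made for the illustration, while the real system keeps these states in database tables.

// Sketch of the job-state progression of Table 3-1 as an ordered enum.
public class JobLifecycle {
    enum JobState { UNPREDICTED, UNACCEPTED, UNPLANNED, UNSENT, UNFINISHED, REMOVE }

    // Advance a job to the next state in the predefined order.
    static JobState advance(JobState current) {
        JobState[] order = JobState.values();
        int next = Math.min(current.ordinal() + 1, order.length - 1);
        return order[next];
    }

    public static void main(String[] args) {
        JobState s = JobState.UNPREDICTED;
        while (s != JobState.REMOVE) {
            JobState n = advance(s);
            System.out.println(s + " -> " + n);
            s = n;
        }
    }
}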


Platform independent, interoperable system. A scheduling system should expect interaction with systems on various kinds of platforms in a heterogeneous environment. SPHINX adopts XML-based communication protocols such as SOAP and XML-RPC to satisfy this requirement. Specifically, it uses the communication protocol named Clarens [65] to incorporate grid security.

Table 3-2. SPHINX client functionalities for interactive job scheduling and execution tracking.

  Execution request
    API: job_submit (String file_loc, String sender) - submits an execution request to the SPHINX client.
    Parameters: file_loc: abstract DAG file location; sender: request sender information.

  Scheduling request
    API: send_request (String dagXML, String msgType, String sender) - the SPHINX client sends this request to the server.
    Parameters: dagXML: DAG in XML format; msgType: request type; sender: request sender information.

  Admission control
    APIs: send_msg (String msgXML, String msgType); send_msg (String msgXML, String msgType, String sender) - SPHINX and the user interact for resource allocation.
    Parameters: msgXML: message in XML format; msgType: message type; sender: message sender information.

  Submission request
    APIs: createSubmission (String jobInfo) - the SPHINX client creates a submission file; submit_job (String rescAlloc, String submitter) - the SPHINX client sends the file to DAGMan/Condor-G.
    Parameters: jobInfo: scheduling decision information; rescAlloc: job submission file; submitter: job submitter information.

  Execution tracking
    APIs: updateStatus (int jobId); String status = getJobStatus (int jobId) - update the status of the job with information from a grid resource management service.
    Parameters: jobId: ID of a job that is currently running on a grid resource (the ID is assigned by the grid resource management system); status: the status of the job, as provided by the grid resource management system.

SPHINX Client

The SPHINX client interacts with both the scheduling server, which allocates resources for task execution, and a grid resource management system such as DAGMan/Condor-G [4]. To begin the scheduling procedure, a user passes an execution request to the SPHINX client. The request is in the form of an abstract DAG that is produced by a workflow planner such as the Chimera Virtual Data System [66].
The abstract plan describes the logical I/O dependencies within a group of jobs. The client sends a scheduling request to the server with a message containing the DAG and client information. After receiving the resource allocation decision from the server, the client creates an appropriate request submission file according to the decision and submits the file to the grid resource management system. In order to achieve a user's quality of service (QoS) requirement, SPHINX implements interactive resource allocation. The client, as a scheduling agent, negotiates the QoS satisfaction level with the user. SPHINX presents the user with a resource allocation decision, including the estimated execution time and the resource reservation period and amount, according to the current grid resource status. The user then decides whether to accept the suggestion. A basic QoS negotiation feature is implemented in the current system, while a more detailed and sophisticated version is under development. The tracking module in the client keeps track of the execution status of submitted jobs. If the execution is held or killed on a remote site, the client reports the status change to the server and requests re-planning of the killed or held jobs. The client also sends a job cancellation message to the remote sites on which the held jobs are located. Table 3-2 shows these functionalities and the SPHINX APIs.

SPHINX Server

The SPHINX server, as the major component of the system, performs several functions. The current version of the server supports the following. First, it decides how best to allocate resources to complete the requests. Second, it maintains catalogs of data, executables and their replicas. Third, it provides estimates for the completion time of the requests on these resources. Fourth, the server monitors the status of its resources.
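The client-side sequence described above can be summarised using the API names of Table 3-2. The Java sketch below is a stand-alone illustration: the stub class and its return values are invented here; only the method names follow the table, and the real client exchanges messages with the SPHINX server and with DAGMan/Condor-G.

// Stand-alone illustration of the client-side sequence, with stubbed behaviour.
public class ClientFlowSketch {

    static class SphinxClientStub {
        String job_submit(String fileLoc, String sender) {
            return "<dag>...</dag>";           // read the abstract DAG from fileLoc
        }
        String send_request(String dagXML, String msgType, String sender) {
            return "allocation-decision";      // server returns a resource allocation
        }
        String createSubmission(String jobInfo) {
            return "job.submit";               // build a submission file from the decision
        }
        void submit_job(String rescAlloc, String submitter) {
            System.out.println("submitted " + rescAlloc + " for " + submitter);
        }
    }

    public static void main(String[] args) {
        SphinxClientStub client = new SphinxClientStub();
        String dag = client.job_submit("/tmp/analysis.dag", "user1");
        String decision = client.send_request(dag, "scheduling", "user1");
        String submitFile = client.createSubmission(decision);
        client.submit_job(submitFile, "user1");
    }
}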


The SPHINX server completes the resource allocation procedure by moving requests through the predefined states in Table 3-1. As mentioned in the previous section, each of the functions in SPHINX is developed in a modular fashion. Each module performs its corresponding function on a DAG or a job, and changes its state to the next one according to the predefined order of states. We discuss each of the modules in detail below. Table 3-3 summarizes the SPHINX server functions described in this section.

Table 3-3. SPHINX server functions for resource allocation.

  Message handling module
    APIs: String incMsg = inc_msg_wrapper (); out_msg_wrapper (String msg); msg_parsing (String msg); msg_send (String msg, String msgType, String dest, String sender) - these functions send, receive and parse messages; the message handling module is the gateway to the SPHINX server.
    Parameters: incMsg: incoming message; msg: message in XML format; msgType: message type; dest: receiver information; sender: sender information.

  DAG reducer
    API: dag_reducing (int dagId) - for each job in the DAG, call the replica management service to check whether all outputs of the job exist; if they do, reduce the job and all its predecessors.
    Parameters: dagId: ID of a DAG in the DAG table whose state is unreduced.

  Prediction engine
    API: String est_info = exec_prediction (int jobId) - provides estimated information such as execution time and resource usage (CPU, storage, etc.).
    Parameters: est_info: estimated data in XML string format; jobId: ID of a job in the job table.

  Planner
    API: Planning (int jobId, String strategy) - allocates resources to jobs according to the scheduling strategy.
    Parameters: jobId: ID of the job to be planned; strategy: scheduling algorithm.

Control Process. The main function of the control process is to launch the necessary server-side service modules to process resource allocation for a request. Stateful entities such as DAGs and jobs are operated on and modified by the scheduling modules. This architecture for the SPHINX server, in which the control process awakens modules for processing stateful entities, provides an extensible and easily configurable system for future work.
Figure 3-2 shows the overall structure of the control process in SPHINX. The controller checks the state of jobs that are currently in the scheduling procedure. If the controller finds a job in one of the states, it invokes the corresponding service module to handle the job.

Figure 3-2. Overall structure of control process.

Message Handling Module. The message handling function provides a layer of abstraction between the internal representation of the scheduling procedure and external processes. Additionally, the function is responsible for maintaining a list of all currently connected clients, for ensuring that the list is kept accurate, and for directing I/O from the various internal components to these clients. The server maintains database tables for storing incoming and outgoing messages. The control process invokes incoming or outgoing message interfaces to the tables for retrieving, parsing and sending the messages.

DAG Reducer. The DAG reducer reads an incoming DAG and eliminates previously completed jobs from the DAG. Such jobs can be identified with the use of a replica catalog. The DAG reducer simply checks for the existence of the output files of each job; if they all exist, the job and all predecessors of the job can be deleted. The reducer consults the replica location service for the existence and location of the data.

Prediction Engine. The prediction engine provides estimates of resource use. It estimates the resource requirements of a job based upon historical information, the size of the input if any, and/or user-provided requirements. In the first implementation, this is constrained to overall execution time by application and site.
A simple average and variance calculation is used to provide an initial estimation scheme; this method could be made more intelligent and robust in future implementations. When the prediction engine is called, it selects a DAG for prediction, estimates the completion time of each job, and finally, from this data, estimates the total completion time of the DAG.

Planner. The planner module creates an execution plan for each job whose input data are available. According to the data dependencies, a job must wait until all its predecessors have finished generating their output files. Creating the execution plan involves several steps:

1. Choose a set of jobs that are ready for execution according to input data availability.

2. Decide the optimal resources for the job execution. The planner makes a resource allocation decision for each of the ready jobs. The scheduling is based on resource status and usage policy information, job execution prediction, and the I/O dependency information described in Section 3.

3. Decide whether it is necessary to transfer input files to the execution site. If necessary, choose the optimal transfer source for the input files.

4. Decide whether the output files must be copied to persistent storage. If necessary, arrange for those transfers.

After the execution plan for a job has been created, the planner creates an outgoing message with the planning information and passes the message to the message handling module.

Tracking Module. The job tracking module is responsible for keeping the tracking/prediction database on the SPHINX server current. In this first version, the primary functions of the tracking module are to check for job completion, update the job state in the tracking/prediction database, and, when the job completes, add the execution time to the job's historical data in the prediction tables. In later versions, additional functions can be added. For example, the tracking module could also track job resource use in real time.
It could then enforce resource usage policies by killing and re-queuing jobs that overstep their estimated bounds. Of course, the tracking module can also monitor a host of additional prediction information and record it in the prediction tables.

Table 3-4. SPHINX APIs for accessing data replicas through the RLS service. In the table, PFN (pfn) represents a physical file name and lfn a logical file name.

  Vector pfns = getPFN (String lfn) - returns the PFN mappings for the given lfn.
    Parameters: pfns: a list of physical file names; lfn: logical file name.

  CreateMapping (String lfn, String pfn) - creates a mapping for the given lfn and pfn in the RLS service database.
    Parameters: lfn: logical file name; pfn: physical file name.

Data Replication Service

The data replication service is designed to provide efficient replica management, and SPHINX provides an interface to the service. The Globus Replica Location Service (RLS) [67] provides both replica information and index servers in a hierarchical fashion. In addition, GridFTP [68] is a Grid Security Infrastructure (GSI) enabled FTP protocol that provides the necessary security and file transfer functions. Initial implementations of SPHINX make use of both RLS and GridFTP for replica and data management. In this initial implementation, the detection and replication algorithms are based on a memory paging technique. Once a hot spot has been identified, an appropriate replication site is chosen; initially, the site is chosen at random. As future versions are developed, both the method of hot spot identification and the replication location selection will be improved using better replica management systems as well as better algorithms. Table 3-4 shows the SPHINX APIs for accessing replica information through RLS.
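The replica lookup step can be illustrated with the two call names of Table 3-4. The Java sketch below uses an in-memory catalog as a stand-in, so it is only an illustration of the interface shape; the real implementation delegates these calls to the Globus Replica Location Service, whose client API is not reproduced here.

import java.util.ArrayList;
import java.util.List;

// Illustration of the replica-lookup interface shape from Table 3-4,
// backed by a simple in-memory list instead of the real RLS service.
public class ReplicaCatalogSketch {
    private final List<String[]> mappings = new ArrayList<>(); // {lfn, pfn}

    void CreateMapping(String lfn, String pfn) {
        mappings.add(new String[] { lfn, pfn });
    }

    List<String> getPFN(String lfn) {
        List<String> pfns = new ArrayList<>();
        for (String[] m : mappings) {
            if (m[0].equals(lfn)) pfns.add(m[1]);
        }
        return pfns;
    }

    public static void main(String[] args) {
        ReplicaCatalogSketch rc = new ReplicaCatalogSketch();
        rc.CreateMapping("run42.dat", "gsiftp://site-a.example.org/data/run42.dat");
        rc.CreateMapping("run42.dat", "gsiftp://site-b.example.org/data/run42.dat");
        // The scheduler would pick one replica from the returned list.
        System.out.println(rc.getPFN("run42.dat").get(0));
    }
}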


Grid Monitoring Interface

The resource allocation decisions made by the planner and the replication site selection made by the data replication service depend on the information provided through the monitoring interface of SPHINX. As such, the interface provides a buffer between external monitoring services (such as MDS, GEMS, VO-Ganglia, MonALISA, and Hawkeye [69,70,71]) and the SPHINX scheduling system. In order to accommodate the wide variety of grid monitoring services, the interface is developed as an SDK so that specific implementations are easily constructed. We use an interface to MonALISA for accessing resource monitoring information (Table 3-5).

Table 3-5. Database table schemas for accessing resource-monitoring information through MonALISA.

  SITE_FARM_MAP
      Fields: site_name varchar(255) not null; farm_name varchar(100) not null
      Function: stores mappings from grid resource IP to MonALISA farm name.

  ML_SNAPSHOT
      Fields: site_name varchar(255) not null; mfunction varchar(100) not null; value double
      Function: stores the latest monitoring value for the given site and function.

  ML_SNAPSHOT_P2P
      Fields: site_source varchar(255); site_dest varchar(255); mfunctions varchar(100); value double
      Function: stores the latest monitoring value for the given function between the given two sites.

The grid monitoring service will be used to track resource-use parameters including CPU load, disk usage, and bandwidth; in addition, a grid monitoring service could also collect policy information provided by each site, including resource cost functions and VO resource use limits. Additional interface modules will be developed to gather VO-centric policy information, which may be published and maintained by a VO in a centralized repository.

External databases. Replica information is likely to be tied closely to a particular VO, and it is necessary to incorporate external replica services. Two external


58 catalogs are considered in this chapter: the transformation catalog (TC) and the replica catalog (RC). Because the functionality is similar, both the RC and the TC may use similar or identical technology. The TC maintains information on transformations (these should be provided by workflow management systems such as Chimera). The TC maps a logical executable name to one or many physical executable names on particular grid resources. The RC maintains information on data produced within, or uploaded to a grid. The RC maps a logical file name to one or many physical file names on particular grid resources. Internal database. The prediction engine queries the Prediction Tables (PT) to get estimated information for job execution, including execution time, CPU and disk usage, and bandwidth. In the PT, a vector of input parameters and a site with a vector of benchmarks represents a job. Thus, for every execution of every set of input parameters (job) on every set of benchmarks (site), the resources consumed by the job are recorded. By querying this database of historical processing information, the prediction engine can estimate the time required to execute a given job on a given set of resources. The Job Tables (JT) maintains a persistent queue of submitted jobs. It monitors their progress, providing information on the status and location of job execution. In addition, all DAG (or job) state information in the scheduling system is incorporated into the job database. The tracking system within the scheduling server accesses the grid-monitoring interface and maintains the information in the JT. Relationship with Other Grid Research In this section, we discuss how SPHINX system research and development (R&D) interact with other ongoing R&D works.


59 Grid Information Services Several systems are available for gathering information including resources access policies, resource characteristics and status, and required application environments. The current version of Globus Toolkit includes the Monitoring and Discovering Service (MDS) that can, in combination with other tools such as Ganglia, MonALISA and GEMS, gather this and other grid information. This system or others can be easily connected to the SPHINX system using the Grid Monitoring SDK. Replica and Data Management Services The Globus Replica Location Service (RLS) [67] provides both replica information and index servers in a hierarchal fashion. In addition, GridFTP is Grid Security Infrastructure (GSI) enabled FTP protocol that provides necessary security and file transfer functions. Initial implementations of SPHINX will both make use of RLS and GridFTP for replica and data management. The Network STorage (NeST) service provides a common abstracted layer for data access in grid environments [72]. In addition, the Stork Data Placement (DaP) scheduler manages and monitors data placement jobs in grid environments [73]. As the SPHINX work develops, both NeST and Stork can be incorporated into the framework to provide robust, transparent management of data movement. Job Submission Services The GriPhyN VDT includes Condor-G as a grid submission and execution service. Condor-G uses the Globus GRAM API to interact with Globus resources [74], and provides many execution management functions, including restarting failed jobs, gathering execution log information, and tracking job progress. In addition, DAGMan is a job execution engine that manages the execution of a group of dependent jobs


60 expressed as a DAG. By leveraging the DAGMan job submission protocol, the scheduling system can dynamically alter the granularity of the job submission and planning to most efficiently process the jobs in its queue. For instance, if the prediction information indicates that a large group of fast jobs is waiting in the queue and the planner can assume the grid status will remain relatively stable during their execution, it can make a full-ahead plan for the whole group, construct a DAGMan submission DAG and pass the entire structure to DAGMan to manage. Conversely, if there are many long running jobs in the queue, the planner is able to release them one at a time to DAGMan to take full advantage of just-in-time scheduling. Virtual Data Services The SPHINX scheduler is currently designed to process abstract DAGs, as provided by the Chimera Virtual Data System, for grid execution. In particular, this work is fully integrated with the Chimera Virtual Data and Transformation Catalogs. These integration points provide SPHINX with the ability to flexibly and dynamically schedule large DAG workflows. Future Planners and Schedulers One of the main purposes of SPHINX research and development is to develop a robust and complete scheduling framework where further research and development can continue. Thus, it is essential that this initial development work should provide a feature for integrating new planning technology to the framework easily. Such integration is provided in two ways. First, the current SPHINX is developed in a modularized fashion. Modules in the system are developed, and tested independently. A single current SPHINX component can be exported into other systems, or new modules can be plugged into SPHINX system easily. Second, the planning module within the scheduling server


interacts only with the internal database API. Through this interface, any planning or strategy module can easily be added to the scheduling server. A collection of such planning modules could be provided, with the scheduling server choosing the most appropriate one based on the composition of the input tasks.

Table 3-6. Grid sites used in the experiment. CalTech is the California Institute of Technology, UFL the University of Florida, and UCSD the University of California at San Diego.

  Site Name   Site Address               # of Processors   Processor Type
  CalTech     citcms.cacr.caltech.edu    Four              Dual
  UFL         ufloridadgt.phys.ufl.edu   Three             Dual
  UFL         ufloridaigt.phys.ufl.edu   Nine              Dual
  UCSD        uscmstb0.ucsd.edu          Three             Singular

Experiments and Results

The aim of the experiments we performed was to test the effectiveness of the scheduler as compared to the way things are done today on the grid. We also compared and contrasted the different scheduling algorithms. We evaluated two different aspects of grid systems:

Importance of feedback information: The feedback provides execution status information for previously submitted jobs on grid sites. The scheduling algorithms can utilize this information to determine a set of reliable sites on which to schedule jobs. Sites having more cancelled jobs than completed jobs are marked unreliable. This strategy is not used in the algorithms marked as without-feedback.

Importance of monitoring information: The monitoring systems report the performance of different components (especially compute resources) on the grid. This information is updated frequently and can potentially be used for effective scheduling. We wanted to determine the value of this information in effective scheduling.
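As a concrete illustration of the feedback rule above (a site with more cancelled than completed jobs is treated as unreliable), the following is a minimal sketch; the class and counter names are illustrative stand-ins for the per-site statistics reported by the job tracker.

import java.util.List;
import java.util.ArrayList;

public class FeedbackFilter {

    static class SiteFeedback {
        final String site;
        final int completed;
        final int cancelled;
        SiteFeedback(String site, int completed, int cancelled) {
            this.site = site; this.completed = completed; this.cancelled = cancelled;
        }
        // A site is unreliable when it has cancelled more jobs than it has completed.
        boolean isReliable() { return cancelled <= completed; }
    }

    /** Keeps only the sites the feedback information marks as reliable. */
    static List<String> reliableSites(List<SiteFeedback> feedback) {
        List<String> result = new ArrayList<>();
        for (SiteFeedback f : feedback) {
            if (f.isReliable()) result.add(f.site);
        }
        return result;
    }

    public static void main(String[] args) {
        List<SiteFeedback> feedback = List.of(
                new SiteFeedback("citcms.cacr.caltech.edu", 42, 3),
                new SiteFeedback("uscmstb0.ucsd.edu", 5, 17));   // more cancellations than completions
        System.out.println("schedulable sites: " + reliableSites(feedback));
    }
}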


Scheduling Algorithms

The round-robin scheduling algorithm tries to submit jobs in the order of the sites in a given list. All sites are scheduled to execute jobs without considering the status of the sites. If some sites are not available to execute the planned jobs, then the jobs are cancelled and planned onto the next site in the list.

The algorithm based on the number of CPUs with feedback utilizes resource-scheduling information about previously submitted jobs kept in a local SPHINX server. After determining a load rate for each site with the following formula, it plans jobs onto the least loaded sites:

\[ \mathrm{rate}_i = \frac{\mathrm{planned\_jobs}_i + \mathrm{unfinished\_jobs}_i}{\mathrm{CPUs}_i} \]

where rate_i is the load rate of site i, planned_jobs_i is the number of jobs planned onto site i, unfinished_jobs_i is the number of running jobs on site i, and CPUs_i is the number of CPUs on site i.

The queue length based scheduling algorithm makes the scheduling decision based on the lengths of the job queues at the remote sites, as provided by a monitoring module. This algorithm utilizes the feedback information to determine the job execution site. The scheduler selects a site with the smallest load rate according to the following formula.


\[ \mathrm{rate}_i = \frac{\mathrm{queued\_jobs}_i + \mathrm{running\_jobs}_i + \mathrm{planned\_jobs}_i}{\mathrm{CPUs}_i} \]

where rate_i is the load rate of site i, queued_jobs_i is the number of waiting jobs on site i, running_jobs_i is the number of jobs currently assigned to CPUs on site i, planned_jobs_i is the number of jobs planned onto site i by the local scheduler, and CPUs_i is the number of CPUs on site i.

The job completion time based scheduling algorithm utilizes the job completion rate information passed on by the job tracker module from the client to the server. In the absence of job completion rate information, SPHINX schedules jobs with the round-robin technique until it has that information for the remote sites; it thus uses a hybrid approach to compensate for the unavailability of information. The site having the minimum average job completion time is chosen for the next schedulable job:

\[ \text{choose site } i \text{ such that } \mathrm{avgComp}_i = \min_{1 \le k \le n,\ A_k = 1} \mathrm{avgComp}_k \]

where n is the number of grid sites, avgComp_i is the average job completion time on site i computed using the feedback information, and A_k = 1 if site k is available and 0 otherwise.

Policy-constrained scheduling puts resource usage constraints on each of the algorithms.
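The following is a minimal sketch of the completion-time-based (hybrid) selection just described: it picks the site with the smallest average job completion time and falls back to round-robin while no feedback is available. Class and method names are illustrative, not SPHINX code.

import java.util.List;
import java.util.Map;

public class CompletionTimeSelector {

    private int roundRobinIndex = 0;

    /**
     * @param sites          candidate execution sites (already filtered for availability)
     * @param avgCompSeconds average job completion time per site, from tracker feedback
     */
    String selectSite(List<String> sites, Map<String, Double> avgCompSeconds) {
        String best = null;
        double bestTime = Double.MAX_VALUE;
        for (String site : sites) {
            Double t = avgCompSeconds.get(site);      // null means no feedback yet for this site
            if (t != null && t < bestTime) {
                bestTime = t;
                best = site;
            }
        }
        if (best != null) return best;
        // Hybrid fallback: no completion-time feedback available yet, so use round-robin.
        String site = sites.get(roundRobinIndex % sites.size());
        roundRobinIndex++;
        return site;
    }

    public static void main(String[] args) {
        CompletionTimeSelector selector = new CompletionTimeSelector();
        List<String> sites = List.of("citcms.cacr.caltech.edu", "ufloridaigt.phys.ufl.edu");
        System.out.println(selector.selectSite(sites, Map.of("ufloridaigt.phys.ufl.edu", 212.0)));
    }
}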


For example, the revised round-robin scheduling algorithm with resource usage constraints schedules a job onto a site s such that

\[ \mathrm{quota}_{i,s} \ge \mathrm{required}_{i,s} \quad \text{for every property } i, \]

where quota_{i,s} is the usage quota of property i on site s given to a user, and required_{i,s} is the required amount of property i on site s specified by the user.

Test-bed and Test Procedure

We used Grid3 as the test-bed for this experiment. Grid3 currently has more than 25 sites across the US and Korea, which collectively provide more than 2000 CPUs. The resources are used by seven different scientific applications, including three high energy physics simulations and four data analyses in high energy physics, bio-chemistry, astrophysics and astronomy. It is critical to test the performance of the scheduling algorithms in the same grid environment because resource availability changes dynamically. Each of the scheduling algorithms was therefore executed multiple times on multiple instances of SPHINX servers running concurrently, so that pair-wise or group-wise performance could be compared. These servers were started at the same time so that they competed for the same set of grid resources. We felt this was the fairest way to compare the performance of different algorithms in a dynamically changing environment. Table 3-7 shows the configuration of the machines on which we set up the SPHINX clients and servers.

Table 3-7. SPHINX server configurations.

  Name                   CPU (MHz)   RAM (MB)   OS
  dimitri.dnsalias.net   2 x 800     512        RH 7.3
  Julian.dnsalias.net    2 x 3000    2000       Fedora Core 2

However, the availability and performance of each grid site change dynamically, so the number of sites available during different experiments varies. Each scheduling algorithm also chooses a different set of sites for submitting jobs according to its planning decision.


The performance comparisons below represent our best efforts to eliminate these effects by executing the algorithms in the pair-wise or group-wise manner described above.

Figure 3-3. Effect of the utilization of feedback information: average DAG completion time (30 DAGs x 10 jobs/DAG) for the #-of-CPUs-based and round-robin algorithms, each with and without feedback.

Performance Evaluation of Scheduling Algorithms

In order to compare the performance of the algorithms, we submitted 30, 60 and 120 DAGs, each of which has 10 jobs in a random structure. Each job simulates a simple execution that takes two or three input files and spends one minute before generating an output file. The size of the output file is different for each job, and the file is located on the execution site by default. Including the time to transfer remotely located input files onto the site, each job is expected to take about three or four minutes to complete. The depth of the DAGs is up to five, while the maximum number of independent jobs in a level is four or five.

Effect of Feedback Information

We start by studying the impact of using feedback information. We compare the average DAG completion times for the round-robin and #-of-CPUs-based scheduling


algorithms with and without feedback. The graph in Figure 3-3 plots the average DAG completion time for four algorithms: round-robin with feedback, round-robin without feedback, the number-of-CPUs-based algorithm with feedback, and the number-of-CPUs-based algorithm without feedback. Feedback is basically used for flagging an execution site as faulty. This is done by considering the number of completed and cancelled jobs on that site, as reported to the SPHINX server by the job tracker(s). The feedback information includes the status of job execution on the execution sites, such as hanged, killed, or completed. A scheduler should be able to utilize this information to determine a set of available sites on which to schedule jobs. Without this feedback the scheduler keeps submitting jobs to unreliable sites, resulting in many jobs being cancelled and rescheduled. As shown in the figure, the average DAG completion time with the feedback information is lower than without it by about 20-29%. Round-robin scheduling without feedback is basically what a grid user would use today for executing his jobs on the grid. This experiment demonstrates how critical the feedback information is. It also brings to the fore the faultiness of the grid environment and how fault tolerance can be achieved using SPHINX.

Comparison of Different Scheduling Algorithms with Feedback

The aim of this experiment is to demonstrate the dynamic nature of the grid and to evaluate which monitored parameter works best for making scheduling decisions.


Figure 3-4. Performance of scheduling algorithms with 300 jobs and without any policy constraints: (A) average DAG completion time (30 DAGs x 10 jobs/DAG) and (B) average job execution and idle time. The round-robin algorithm distributes the jobs equally among all available sites; the number-of-CPUs-based algorithm considers the relative number of CPUs on each site (static); the queue-length-based approach considers the job queue information (dynamic) from the monitoring system; while the completion-time-based strategy uses the average job completion rates to select the execution site.


Figure 3-4(A) shows the performance of the four different algorithms. The completion-time-based scheduling algorithm (hybrid) performs better than the other cases by about 17% in terms of average DAG completion time. This is because the algorithm schedules jobs to the sites that complete assigned jobs faster than other sites. Figure 3-4(B) presents job execution and idle time information. Jobs scheduled by the completion-time algorithm are executed faster than those scheduled by the other algorithms by about 50%, and their execution waiting (idle) time is lower by about 60%. The same experiment was repeated with 600 and then 1200 jobs instead of just 300 jobs. Figure 3-5 gives the result for the 600-job experiment. Here, we observe that the job completion rate based approach performs comparatively better against the other algorithms than it did in the 300-job experiment: its performance is about 33% to 50% better than the other scheduling strategies. This is because the algorithm gets smarter as the scheduling proceeds, with more reliable job completion time information, and makes the planning decisions for jobs effectively using the wider knowledge base.


Figure 3-5. Performance of scheduling algorithms with 600 jobs and without any policy constraints (60 DAGs x 10 jobs/DAG): (A) the average DAG completion time for each of the algorithms, and (B) the average job execution time and the idle time for each of the algorithms.


Figure 3-6. Performance of scheduling algorithms with 1200 jobs and without any policy constraints (120 DAGs x 10 jobs/DAG): (A) average DAG completion time and (B) average job execution and idle time.

The results follow the same trend as the 300- and 600-job experiments, thus exhibiting scalability. It is worth noting that the absolute average DAG completion times of the two experiments should not and cannot be compared, as the average load on the grid was different during these experiments.


Figure 3-7. Site-wise distribution of completed jobs vs. average job completion time: (A) job completion time based scheduling and (B) #-of-CPUs-based scheduling. In the job completion time based approach (A), the number of jobs scheduled on a site is inversely proportional to its average job completion time.

Figure 3-7(A) verifies that the job completion rate based approach indeed scheduled more jobs to sites having the least average job completion time and vice versa.


Other algorithms do not follow this trend, e.g., the number-of-CPUs-based algorithm shown in Figure 3-7(B). The result shows that simple workload information, such as the number of CPUs and running jobs (as used in round-robin and simple load-balancing techniques), is not good enough to estimate the performance of dynamic and customized grid resources. The job completion rate approach keeps track of the job completion time and utilizes this information to estimate the near-future execution environment on the grid sites. This seems to be a much better predictor of actual performance on the different sites. As monitoring systems mature, and if the local sites make their performance measures more transparent and accurate, the effectiveness of monitoring information in effective scheduling may improve. However, the data provided by extant monitoring systems and sites does not seem to be very useful.

Effects of Policy Constraints on the Scheduling Algorithms

Figure 3-8(A) and (B) show the performance of the scheduling algorithms constrained by a resource usage quota policy. A user's remaining usage quota defines the list of sites available to him for submitting jobs, from which the scheduling algorithm recommends the execution site. The results obtained are similar to those without policy. The results underline the ability of SPHINX to do policy-based scheduling. Even in the presence of policy constraints, SPHINX is able to achieve a scheduling efficiency similar to that in a constraint-free grid environment.


Figure 3-8. Performance of the policy-based scheduling algorithms (120 DAGs x 10 jobs/DAG): (A) average DAG completion time, and (B) average job execution and idle time. In each of the scheduling algorithms, policy constraints are applied to get the pool of feasible sites before using the scheduling strategy.


Figure 3-9. Number of timeouts in the different algorithms (120 DAGs x 10 jobs/DAG).

Fault Tolerance and Scheduling Latency

Figure 3-9 gives the number of times jobs were rescheduled in each of the scheduling strategies. Note that without any feedback information the number of resubmissions is very high (2258), as compared to 125 in the job completion rate based hybrid approach. The graph in Figure 3-10(A) presents the scheduling latency of the SPHINX scheduling system under different workloads of concurrently submitted workflows. Scheduling latency is defined as the time taken from job submission until the scheduling decision is made and the job is submitted to the targeted resource. Each line represents a different number of concurrently submitted workflows, and we vary the job arrival rate per minute. SPHINX shows stable performance up to a workload of 13 jobs per minute for the different numbers of concurrently submitted DAGs.


The graph in Figure 3-10(B) shows the scheduling latency of the cluster-based scheduling algorithm. The cluster size is set to three, meaning that any three independently schedulable jobs in a workflow are planned together in a single scheduling iteration. The scheduling algorithm performs multiple iterations of job planning to determine the optimal resource allocation. The workflow arrival rate at the scheduling system is determined by the number of submitted workflows per minute, and the workload is determined by the number of jobs in a workflow. The total number of submitted DAGs is 100. The algorithm shows good tolerance to the arrival rate up to nine workflows per minute for the different workloads of 14, 12, 10 and 8 jobs per workflow.

Figure 3-10. SPHINX scheduling latency: (A) average scheduling latency for various numbers of DAGs (20, 40, 80 and 100) with different job arrival rates per minute, and (B) scheduling latency of the cluster-based algorithm for different numbers of workflows per minute and workloads of 14, 12, 10 and 8 jobs per DAG.

Conclusion and Future Research

This chapter introduces techniques and infrastructure for fault-tolerant scheduling of jobs across the grid in a dynamic environment. In addition to the SPHINX architecture, which is robust, recoverable, modular, re-configurable and fault-tolerant, the novel contributions of this chapter to the state of the art are the effective use of monitored information for efficient scheduling, without requiring human interference, in a highly dynamic grid environment. These results show that SPHINX can effectively:


1. Reschedule jobs if one or more of the sites stops responding due to system downtime or slow response time.

2. Improve the total execution time of an application using information available from monitoring systems as well as its own monitoring of job completion times.

3. Manage policy constraints that limit the use of resources.

These results demonstrate the effectiveness of SPHINX in overcoming the highly dynamic nature of the grid and complex policy issues to harness grid resources, an important requirement for executing large production jobs on the grid. We are investigating novel scheduling methods to reduce the turnaround time. We are also developing methods to schedule jobs with variable quality of service requirements. The latest updates on SPHINX are available at http://www.griphyn.org/sphinx

A novel grid scheduling framework has been proposed in this chapter and an initial implementation presented. Resource scheduling is a critical issue in executing large-scale data-intensive applications in a grid. Due to the characteristics of grid resources, we believe that traditional scheduling algorithms are not suitable for grid computing. This document outlines several important characteristics of a grid scheduling framework, including execution time estimation, dynamic workflow planning, enforcement of policy and QoS requirements, VO-wide optimization of throughput, and a fully distributed, fault-tolerant system. Our proposed system, SPHINX, currently implements many of the characteristics outlined above and provides distinct functionalities, such as dynamic workflow planning and just-in-time scheduling in a grid environment. It can leverage existing monitoring and execution management systems. In addition, the highly customizable client-server framework can easily accommodate user-specific functionality or integrate other scheduling algorithms, enhancing the resulting system. This is due to a flexible architecture that allows for the concurrent development of modules that can effectively


manipulate a common representation of the application workflows. The workflows are stored persistently in a database using this representation, allowing for the development of a variety of reporting abilities for effective grid administration. The development of SPHINX is still in progress, and we plan to include several additional core functionalities for grid scheduling. One such important functionality is that of estimating the resources required for task execution, enabling SPHINX to realistically allocate resources to a task. In addition, we plan to investigate scheduling and replication strategies that consider policy constraints and quality of service, and to include them in SPHINX to improve scheduling accuracy and performance.


CHAPTER 4
POLICY-BASED SCHEDULING TECHNIQUES FOR WORKFLOWS

This chapter discusses policy-based scheduling techniques on heterogeneous resources for grid computing. The proposed scheduling algorithm has the following features, which can be utilized in grid computing environments. First, the algorithm supports resource usage constrained scheduling. Second, the algorithm performs optimization-based scheduling; it provides an optimal solution to the grid resource allocation problem. Third, the algorithm assumes that the set of resources is distributed geographically and is heterogeneous in nature. Fourth, the scheduling algorithm dynamically adjusts to the grid status by tracking the current workload of the resources. The performance of the proposed algorithm is evaluated with a set of predefined metrics. In addition to presenting simulation results that show the out-performance of the policy-based scheduling, a set of experiments is performed on Open Science Grid (OSG).

Motivation. Grid computing is recognized as one of the most powerful vehicles for high performance computing for data-intensive scientific applications. It has the following unique characteristics compared to traditional parallel and distributed computing. First, grid resources are geographically distributed and heterogeneous in nature. Research and development organizations, distributed nationwide or worldwide, participate in one or more virtual organizations (VOs). A VO is a group of resource consumers and providers united in their secure use of distributed high-end computational resources towards a common goal. Second, these grid resources have decentralized ownership and different


local scheduling policies that depend on their VO. Third, the dynamic load and availability of the resources require mechanisms for discovering and characterizing their status continually. The dynamic and heterogeneous nature of the grid, coupled with complex resource usage policy issues, poses interesting challenges for harnessing the resources in an efficient manner. In this chapter, we present novel policy-based scheduling techniques and their performance on Open Science Grid (OSG), a worldwide consortium of university resources consisting of 2000+ CPUs. The execution and simulation results show that the proposed algorithm can effectively:

1. Allocate grid resources to a set of applications under the constraints presented by resource usage policies.

2. Perform optimized scheduling on heterogeneous resources using an iterative approach and binary integer programming (BIP).

3. Improve the completion time of workflows in integration with the job execution tracking modules of the SPHINX scheduling middleware.

Problem Definition and Related Works

An application scientist typically solves his problem as a series of transformations. Each transformation may require one or more inputs and may generate one or more outputs. The inputs and outputs are predominantly files. The sequence of transformations required to solve a problem can be effectively modeled as a Directed Acyclic Graph (DAG) for many practical applications of interest that this work is targeting.
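As a small illustration of this workflow model, the sketch below shows one way such a DAG of file-linked tasks might be represented; the class and field names are assumptions for illustration, not part of any system described here.

import java.util.List;
import java.util.ArrayList;

public class WorkflowDag {

    static class Task {
        final int id;
        final List<Task> predecessors = new ArrayList<>();   // tasks whose outputs this task consumes
        final List<String> inputFiles = new ArrayList<>();
        final List<String> outputFiles = new ArrayList<>();
        Task(int id) { this.id = id; }
        // A task is ready once every predecessor has completed.
        boolean isReady(List<Task> completed) { return completed.containsAll(predecessors); }
    }

    public static void main(String[] args) {
        Task t1 = new Task(1);
        Task t2 = new Task(2);
        t2.predecessors.add(t1);              // t2 consumes an output file of t1
        t1.outputFiles.add("stage1.out");
        t2.inputFiles.add("stage1.out");
        System.out.println("t2 ready before t1 completes? " + t2.isReady(new ArrayList<>()));
    }
}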


Figure 4-1. An example workflow as a Directed Acyclic Graph (DAG). The figure shows a workflow consisting of eight jobs; the number on an edge represents a communication time. For simplicity we assume that the communication time is identical on different networks. (Adapted from [54])

Figure 4-1 describes a DAG consisting of eight tasks. It is useful to define an exit task: the completion of this task implies that the workflow has been executed. Task 8 is the exit task. A scheduling algorithm aims to minimize the workflow completion time obtained by the assignment of the tasks of the DAG to processors. A scheduling algorithm that is efficient and suitable in the target environment should be able to exploit the inherent heterogeneity in the processor and network resources. Most of the existing scheduling algorithms perform the mapping of tasks to processors in two stages:

Create a priority-based ordering of the tasks. The priority of a task is based on its impact on total completion time. This requires determining the critical path, which in turn requires that the execution time of each task is available or can be estimated. The exact definition of the critical path depends on the algorithm. Some algorithms use the longest path from the given task to calculate the critical path of a task. Others use the longest path from the start node to the given task to calculate its critical path.


Use the priority-based ordering created in the previous step to map the tasks so that the total completion time is minimized. The process is performed one task at a time, in an order based on the priority, and is incremental in nature. However, once a task is assigned to a given processor, it is generally not remapped.

The above approach has the following limitations for the target heterogeneous environment:

1. The amount of time required for a task varies across different processors (due to heterogeneity). Thus estimating the priority based on the cost of the critical path is difficult (this measure or related measures are used by most of the algorithms in determining a task's priority). Adaptations of these algorithms for heterogeneous processors use the average or median processing time of subsequent tasks to estimate the critical path. However, this may not be an accurate reflection of the actual execution time. In fact, one processor may execute task A faster than task B, while the reverse may be true for another processor; this may be due to differential amounts of memory, cache sizes, processor types, etc.

2. The tasks are assigned one at a time. Assuming that k processors are available at a given stage, this may not result in an optimal assignment. Clearly, one can call these algorithms k times sequentially to achieve the same goal. However, this may not result in the optimal allocation of the tasks on the k available processors. Mapping a large number of tasks simultaneously on the available processors can allow for more efficient matching.

3. The tasks are assigned without any policy constraints. The policy constraints can restrict the subset of processors that can be assigned to a given task. This needs to be taken into account while making the scheduling decisions.

Past research on task scheduling in DAGs has mainly focused on algorithms for a homogeneous environment. Scheduling algorithms such as the Dynamic Critical Path (DCP) algorithm [55] that show good performance in a homogeneous environment may not be efficient for a heterogeneous environment, because the computation time of a task may depend on the processor to which the task is mapped. Several scheduling algorithms for a heterogeneous environment have been recently proposed. Most of them are based on static list scheduling heuristics to minimize the execution time of DAGs. Examples of


these algorithms include the Dynamic Level Scheduling (DLS) algorithm [58] and the Heterogeneous Earliest Finish Time (HEFT) algorithm [57]. The DLS algorithm selects a task to schedule and a processor on which the task will be executed at each step. It has two features that can have an adverse impact on its performance. First, it uses the median of the processing times across all the processors for a given task to determine a critical task. Second, it uses the earliest start time to select a processor for a task to be scheduled. These may not be effective for a heterogeneous environment. For instance, suppose that processor A and processor B are the only available processors for the assignment of task i, and assume that processor A becomes free slightly earlier than processor B (based on the mapping of tasks so far). Then task i is assigned to processor A, since processor A can start executing it earlier than processor B. However, if processor B can finish task i earlier than processor A, the selection of processor B for the task should result in a better mapping. Also, the time required to execute the scheduling algorithm (the cost of scheduling) is relatively high, as the priorities of the remaining tasks need to be recalculated at each step of the iteration, and the number of iterations is proportional to the total number of tasks.

The HEFT algorithm reduces the cost of scheduling by using pre-calculated priorities of tasks in scheduling. The priority of each task is computed using an upward rank, which is also used in our proposed algorithm. Also, it employs finding the earliest finish time for the selection of a processor, which is shown to be more suitable for a heterogeneous environment. Although it has been shown to have good performance in the experiments presented in [57], it can be improved by using better estimations of the critical path. The iterative list scheduling [54] improves the quality of the schedule in an


iterative manner using results from previous iterations. It only assigns one task at a time and does not support resource usage policies. It is a static scheduling algorithm, which assumes an unchanged or stable computing environment. In the dynamic and policy-constrained grid computing environment the algorithm may not perform well; this is supported by the simulation results presented in this chapter. We address the above issues in the proposed algorithm and demonstrate that it is effective in satisfying the workflow completion deadline. The proposed algorithm consists of three main steps:

1. Selection of tasks (or a task).

2. Selection of processors for the selected tasks.

3. Assignment of the selected tasks to the selected processors based on policy constraints.

A subset of independent tasks with similar priority is selected for simultaneous scheduling. The tasks in the selected subset are scheduled optimally (i.e., to minimize completion time) based on policy constraints. Further, the derived schedule is iteratively refined, using the mapping defined in the previous iteration to determine the cost of the critical path. This estimation, in general, should be better than using the average computation time on any processor. We utilize a resource usage reservation technique to schedule multiple DAG workflows. The technique facilitates satisfying the workflow completion deadlines. When the scheduling algorithm is integrated with the SPHINX scheduling middleware, it performs efficient scheduling in the policy-constrained grid environment. The performance is demonstrated in the experimental section.


84 Scheduling Algorithm Features The proposed policy-based scheduling algorithm is different from the existing works in the following perspectives. Policy constrained scheduling: Decentralized grid resource ownership restricts the resource usage of a workflow. The algorithm makes scheduling decisions based on resource usage constraints in a grid-computing environment. Optimized resource assignment: The proposed algorithm makes an optimal scheduling decision utilizing the Binary Integer Programming (BIP) model. The BIP approach solves the scheduling problem to provide the best resource allocation to a set of workflows subject to constraints such as resource usage. The scheduling on heterogeneous resources: The algorithm uses a novel mechanism to handle different computation times of a job on various resources. The algorithm iteratively modifies resource allocation decisions for better scheduling based on the different computation times instead of taking a mean value of the time. This approach has also been applied to the Iterative list scheduling [54]. Dynamic scheduling: In order to encounter a dynamically changing grid environment the algorithm uses a dynamic scheduling scheme rather than a static scheduling approach. A scheduling module makes the resource allocation decision for a set of schedulable jobs. The status of a job is defined as schedulable when it satisfies the following two conditions. Precedence constraint: all the precedent jobs are finished, and the input data of the job is available locally or remotely.


Scheduling priority constraint: a job is considered to have higher priority than others when the job is critical to completing the whole workflow with a better completion time.

Future scheduling: The resource allocation to a schedulable job impacts the workload on the selected resource. It also affects the scheduling decisions for future schedulable jobs. The algorithm pre-schedules all the unready jobs to detect the impact of the current decision on the total workflow completion time.

When the scheduling algorithm is integrated with the SPHINX scheduling middleware, it performs efficient scheduling in the policy-constrained grid environment. The performance is demonstrated in the experimental section.

Notation and Variable Definition

In this section we define the notation and variables that are used in the proposed scheduling algorithm.

comp_ij : the computation time of job i on processor j -- (1)
comm_pj : the communication time from processor p to processor j -- (2)

The computation or execution time of a job on a processor (comp_ij) is not identical among a set of processors in a heterogeneous resource environment. The algorithm sets the initial execution time of a job to the mean of the different times on the set of available processors. The time is updated with the execution time on a specific processor as the algorithm changes the scheduling decision based on the total workflow completion time. The data transfer or communication time between any two processors (comm_pj) also differs in this environment.

prec_i : the set of jobs preceding job i -- (3)
succ_i : the set of jobs succeeding job i -- (4)


An application or workflow is in the form of a directed acyclic graph (DAG). Each job i has a set of preceding (prec_i) and succeeding (succ_i) jobs in the DAG. The dependency is represented by the input/output file relationships.

Avail_ij : the available time of processor j for job i -- (5)

EST_ij : the earliest start time of job i on processor j -- (6)
\[ \mathrm{EST}_{ij} = \max\Big\{ \mathrm{Avail}_{ij},\ \max_{k \in \mathrm{prec}_i} \big( \mathrm{EFT}_{k} + \mathrm{comm}_{pj} \big) \Big\} \]

EFT_ij : the earliest finish time of job i on processor j -- (7)
\[ \mathrm{EFT}_{ij} = \mathrm{EST}_{ij} + \mathrm{comp}_{ij} \]

compLen_i : the workflow completion length from job i -- (8)
\[ \mathrm{compLen}_i = \mathrm{comp}_{i} + \max_{k \in \mathrm{succ}_i} \big( \mathrm{comm}_{pk} + \mathrm{compLen}_k \big) \]

The algorithm keeps track of the availability of a processor for executing a job (Avail_ij). We assume a non-preemptive processing model in which all the jobs in a processor's queue must be completed before a new job is started. In the grid scheduling middleware, SPHINX obtains this information from a grid monitoring system such as MonALISA or GEMS. The algorithm computes the earliest start time of a job on each processor (EST_ij). A job can start its execution on a processor only after two conditions are satisfied: first, the processor must be available (Avail_ij) to execute the job; second, all the preceding jobs must be completed, possibly on other processors, and their outputs delivered (max_{k in prec_i}(EFT_k + comm_pj)). The earliest finish time of a job on a processor (EFT_ij) is defined by the earliest start time (EST_ij) and the job completion or execution time on the processor (comp_ij). The workflow completion length from a job to the end of the DAG (compLen_i) is defined recursively from the bottom of the DAG up to job i. The value is used to decide the critical path in the workflow. The makespan of the critical path is used as a criterion to terminate the algorithm: the algorithm completes the scheduling when there is no improvement in the DAG completion time.
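To make the definitions in (5)-(8) concrete, the following is a minimal sketch that computes EST, EFT, and compLen for a small chain of jobs. It assumes a single uniform communication cost and an always-available processor; the class and method names are illustrative and not part of SPHINX.

import java.util.List;
import java.util.ArrayList;
import java.util.Map;
import java.util.HashMap;

public class DagTimeEstimates {

    static class Job {
        final String name;
        final List<Job> preds = new ArrayList<>();
        final List<Job> succs = new ArrayList<>();
        final double execTime;      // comp_i: estimated execution time (mean value in the first iteration)
        Job(String name, double execTime) { this.name = name; this.execTime = execTime; }
    }

    static void link(Job from, Job to) { from.succs.add(to); to.preds.add(from); }

    /** EST = max( processor available time, max over predecessors of (EFT_k + comm) ). */
    static double est(Job job, double availTime, Map<Job, Double> eft, double comm) {
        double ready = availTime;
        for (Job p : job.preds) ready = Math.max(ready, eft.get(p) + comm);
        return ready;
    }

    /** compLen_i = comp_i + max over successors of (comm + compLen_k); just comp_i for the exit job. */
    static double compLen(Job job, double comm) {
        double longest = 0.0;
        for (Job s : job.succs) longest = Math.max(longest, comm + compLen(s, comm));
        return job.execTime + longest;
    }

    public static void main(String[] args) {
        Job j1 = new Job("J1", 77), j2 = new Job("J2", 68), j8 = new Job("J8", 65);
        link(j1, j2); link(j2, j8);
        Map<Job, Double> eft = new HashMap<>();
        double comm = 10.0, avail = 0.0;
        for (Job j : List.of(j1, j2, j8)) {              // topological order for this simple chain
            double start = est(j, avail, eft, comm);
            eft.put(j, start + j.execTime);              // EFT = EST + comp
        }
        System.out.println("EFT(J8) = " + eft.get(j8) + ", compLen(J1) = " + compLen(j1, comm));
    }
}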


Deadline_d(i) : the workflow completion deadline of the DAG d(i) to which job i belongs
estCompTime_d(i) : the estimated workflow completion time of the DAG d(i) to which job i belongs

We also define the deadline and the estimated completion time of a workflow. They are used when we devise a profit function for multiple workflow scheduling. The scheduling algorithm assumes that the workflow deadline (Deadline_d(i)) is submitted by the user or is generated by the scheduling system. The scheduling algorithm computes the estimated workflow completion time (estCompTime_d(i)) considering the current workload on the grid resources. The time changes over the iterations of the scheduling due to the dynamic grid resource status.

  ExecTime       J1   J2   J3   J4   J5   J6   J7   J8
    P1           70   68   78   89   30   66   25   94
    P2           84   49   96   26   88   86   21   36

  Policy: a mark (v) in the original figure indicates which jobs each processor is
  allowed to execute (six of the eight jobs per processor).

  AvgExecTime    77   68   87   89   88   76   21   65
  compLen       517  354  354  340  199  199  181   65

  Prioritization: J1, J3, J2, J4, J6, J5, J7, J8
  Assignment:     P1, P2, P1, P1, P1, P2, P2, P1

Figure 4-2. An example of job prioritization and processor assignment for the workflow in Figure 4-1. The example presents a procedure to prioritize the set of jobs in the workflow from Figure 4-1 based on the scheduling function (p_ij). It also shows the processor assignment to the jobs based on the earliest finish time of a job on a processor (EFT_ij).

Figure 4-2 shows an example of the procedure to assign the jobs in a workflow to a set of processors (P1 and P2) based on the workflow prioritization techniques described in this section. It presents the heterogeneous execution times of the jobs on each of the processors (ExecTime).


Each job, such as J1, J2 or J8, has a different execution time on the two processors, P1 and P2. The figure also shows the resource usage restrictions of the jobs: the mark (v) indicates that a processor is allowed to execute the corresponding job; otherwise, the job cannot be run on that processor. In the first iteration of the resource allocation procedure, the algorithm uses the average execution time (AvgExecTime) computed from the different times on the processors. In subsequent iterations the algorithm uses the specific execution time on the processor selected in the previous iteration. The algorithm computes the critical path length from each job to the bottom job of the workflow. The critical path length is calculated based on the definition of compLen_i in (8). The values of the critical path length are used to prioritize the set of jobs in non-increasing order (Prioritization). After sorting the jobs, the algorithm allocates a set of available processors to the jobs subject to the resource usage constraints. The assignment is determined with an optimization model that is discussed in the next sections.

Optimization Model

We devise a Binary Integer Programming (BIP) model to find an optimal solution to the scheduling problem in the proposed algorithm. In this section, we first define two scheduling profit functions, for single workflow scheduling and multiple workflow scheduling respectively. Next, we discuss the optimization model utilizing the profit functions for the scheduling problem.


Profit Function for Single Workflow Scheduling

\[ p(i,j) = \frac{\mathrm{compLen}_i}{\mathrm{EFT}_{ij}}, \qquad \mathrm{EFT}_{ij} \neq 0 \quad \text{-- (9)} \]

where p(i,j) is the profit when job i is assigned to processor j.

The scheduling profit function for single workflow scheduling is intended to generate a higher profit value when a job on the critical path of a DAG is scheduled to a processor that is able to complete the job at the earliest possible time. compLen_i is calculated by computing the critical path value from the job to the bottom job of the DAG. We can assume that a job with a higher critical path value is more urgent for completing the DAG as soon as possible. For a job i, the function tries to find a processor with the smallest completion time for the job. As described above, the earliest completion time of a job on a processor is determined by computing the processor available time and the estimated execution time of the job. The objective function of the optimization model utilizes the profit function to select a proper processor for a job in the schedulable job list.

Profit Function for Multiple Workflow Scheduling

\[ p(i,j) = \frac{\mathrm{compLen}_i}{\mathrm{EFT}_{ij}} \cdot \frac{1}{\mathrm{deadline}_{d(i)} - \mathrm{estCompTime}_{d(i)}} \]

where EFT_ij is not 0 and, if deadline_d(i) - estCompTime_d(i) is not positive, the difference is replaced by a small constant e with 0 < e <= 1.

The profit function p(i,j) is defined for multiple workflow scheduling. Given a job i and a processor j, the function p computes the scheduling profit of assigning the job to the processor. The scheduling system assumes that workflow d is submitted


with a predefined deadline (deadline_d(i)), where job i belongs to workflow d. The deadlines of the given multiple workflows may be different. The profit function is used to prioritize a set of independent jobs in terms of the critical path value of job i in workflow d (compLen_i / EFT_ij) and the remaining time to the deadline (deadline_d(i) - estCompTime_d(i)). The scheduling algorithm gives higher priority to the workflow that is most urgent to meet its deadline. The urgency is defined by the value of the remaining time. Due to the difference in the deadlines of the multiple workflows, some jobs in one workflow may be more urgent to meet their deadline than jobs in another workflow. The profit function maximizes the profit value by assigning higher profit to the job that is more critical for meeting the workflow deadline.

Objective Function and Constraints

The scheduling algorithm determines the resource allocation to a set of independent jobs with a Binary Integer Programming (BIP) optimization model. It tries to assign the jobs to a set of processors that will provide the jobs with the earliest finish times, subject to the current resource load and the assignment constraints. The order of assignment of the jobs is decided by the urgency to meet the workflow deadline and the criticality of finishing the workflow in a short time, utilizing the profit functions and the job prioritization procedure of the algorithm. The assignment is limited by a set of constraints such as the resource usage constraint. The BIP optimization model is devised in the following format. The objective function (Max sum_ij p_ij x_ij) is to maximize the profit function values for a set of jobs and processors. The profit functions for single or multiple workflow scheduling are defined in the previous section. The objective function is constrained by several constraints:


\[
\begin{aligned}
\text{Max} \quad & \sum_i \sum_j p_{ij}\, x_{ij} \\
\text{s.t.} \quad & b_{ij}\, x_{ij} \le q_{ij} && \text{for each job } i \text{ and processor } j \quad \text{(Policy)} \\
& \sum_j x_{ij} = 1 && \text{for each job } i \quad \text{(Assignment)} \\
& \sum_i x_{ij} \le t_j && \text{for each processor } j \quad \text{(Load)} \\
& x_{ij} = 0 \text{ or } 1 && \text{(Binary)}
\end{aligned}
\]

where b_ij is the resource usage requirement of job i on processor j, q_ij is the resource usage quota of job i on processor j, and t_j is the limit on the number of jobs assigned to processor j.

The resource usage constraint (b_ij x_ij <= q_ij) is described with two values, the quota (q_ij) and the requirement (b_ij), for a job i and a processor j. This quantitative model is flexible enough to express policies for various resource types. The assignment constraint (sum_j x_ij = 1) makes sure that a job is not divided or assigned onto more than one processor. Each processor should not be loaded with a set of assigned jobs beyond its predefined quota t_j (sum_i x_ij <= t_j); currently the load is defined by the number of assigned jobs. The BIP model (x_ij = 0 or 1) is implemented with available BIP solvers. The optimization guarantees the optimal processor assignment for the set of jobs with respect to the minimum resource requirement time.

Policy-based Scheduling Algorithm and SPHINX

In this section we discuss the policy-based scheduling algorithm in detail. This section also discusses the integration of the algorithm into the grid scheduling middleware, SPHINX, to run test applications on Open Science Grid.
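Because each scheduling cluster handed to the solver contains only a few independent jobs, the optimization above can also be illustrated by exhaustive search. The sketch below is not a BIP solver and not the SPHINX implementation; it simply enumerates all processor assignments for a small job cluster, rejects those violating the policy (b_ij <= q_ij) and load (at most t_j jobs per processor) constraints, and returns the assignment that maximizes the total profit.

public class BruteForceBip {

    static int[] best;
    static double bestProfit;

    /** Returns best[i] = processor index chosen for job i, or null if no feasible assignment exists. */
    static int[] solve(double[][] p, double[][] b, double[][] q, int[] t) {
        int jobs = p.length;
        best = null;
        bestProfit = Double.NEGATIVE_INFINITY;
        search(new int[jobs], 0, p, b, q, t);
        return best;
    }

    static void search(int[] assign, int job, double[][] p, double[][] b, double[][] q, int[] t) {
        if (job == assign.length) {
            int[] load = new int[t.length];
            double profit = 0.0;
            for (int i = 0; i < assign.length; i++) {
                int j = assign[i];
                if (b[i][j] > q[i][j]) return;        // policy constraint violated
                if (++load[j] > t[j]) return;         // load constraint violated
                profit += p[i][j];
            }
            if (profit > bestProfit) { bestProfit = profit; best = assign.clone(); }
            return;
        }
        for (int j = 0; j < p[0].length; j++) {       // each job goes to exactly one processor
            assign[job] = j;
            search(assign, job + 1, p, b, q, t);
        }
    }

    public static void main(String[] args) {
        double[][] p = {{3.2, 1.1}, {0.9, 2.4}};      // profit of job i on processor j
        double[][] b = {{1, 1}, {1, 1}};              // resource requirement
        double[][] q = {{1, 0}, {1, 1}};              // quota: job 0 is not allowed on processor 1
        int[] t = {2, 2};                             // per-processor job limit
        int[] x = solve(p, b, q, t);
        System.out.println("job 0 -> P" + x[0] + ", job 1 -> P" + x[1]);
    }
}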


Iterative Policy-based Scheduling Algorithm

The proposed scheduling algorithm uses an iterative approach to improve the resource allocation decision in a heterogeneous grid resource environment. It also makes an optimal scheduling decision by solving the scheduling problem modeled in Binary Integer Programming (BIP). Using the BIP model, a set of independent jobs is scheduled at the same time, so that the scheduling can be optimal in the resource usage constrained environment. The algorithm uses the mean value approach to make an initial scheduling decision; in other words, the scheduling decision in the first scheduling iteration is made with the mean value of the job execution time over the processors. The scheduling algorithm then modifies the initial schedule in an iterative way. In each iteration, the execution time of a job is replaced with its specific value on the processor selected in the previous iteration as the iterative scheduling proceeds. The iteration is terminated when there is no improvement in the DAG completion time. The algorithm (Figure 4-3) is presented with a detailed description below.

A workflow formatted as a DAG consists of several jobs. The algorithm generates a list of the unscheduled jobs in the DAG at time t (1). At the time when the workflow is submitted, the list consists of all the jobs in the workflow; as subsets of the jobs complete, the list shrinks to fewer jobs than at the initial scheduling time. The algorithm initially sets the job execution time to the mean value over the heterogeneous resources (2). As the execution time on the processors is different for each job, the mean value of the execution time may be different for each job in the list. In the iterative scheduling approach the best DAG completion time (BestSL) in the series of iterations is selected, and the scheduling decision of the corresponding iteration is applied to the resource allocation (3, 4).


Construct a job list L_unscheduled with the unscheduled jobs in a workflow        --(1)
for each job i in L_unscheduled
    Set exec_i to the mean value of the execution times on the processors        --(2)
end for
Initialize BestSL and SL with a very large number                                --(3)
while SL < BestSL do                                                             --(4)
    BestSL = SL
    for each job i in L_unscheduled
        Compute compLen_i                                                        --(5)
    end for
    Sort L_unscheduled in non-increasing order of compLen_i                      --(6)
    Dynamically cluster L_unscheduled on the basis of job dependency             --(7)
    for each sub-list L_schedulable
        for each job i in L_schedulable
            for each processor j in the processor list L_processors
                Compute EFT_ij                                                   --(8)
            end for
        end for
        Call optSolver(L_schedulable, L_processors, compLen_i, EFT_ij)           --(9)
        Assign each job i in L_schedulable onto its selected processor           --(10)
        Update each processor's workload and available time                      --(11)
    end for
    Set SL to EFT_ep, where EFT_ep is the EFT of the exit job e on its processor p_e  --(12)
    for each job i in L_unscheduled
        Set exec_i to the execution time on p_i                                  --(13)
    end for
end while

Figure 4-3. The iterative policy-based scheduling algorithm on heterogeneous resources. The algorithm utilizes an iterative scheduling scheme to deal with heterogeneous resources. It also performs optimized scheduling by solving the policy-based scheduling problem modeled in Binary Integer Programming (BIP).


The scheduling length (SL) in each iteration represents the DAG completion time under the resource allocation decision made in that iteration, and it is compared with the best scheduling length (BestSL) computed so far. The critical path length of each job in the list (compLen_i) is computed in (5), as defined in the previous section. Based on the values of the critical path length, the jobs in the list are sorted in non-increasing order; the order of the jobs in the sorted list represents the criticality of the jobs for completing the workflow (6). After sorting the jobs, the list is clustered into a set of subsets (7). The size of each subset is dynamic, depending on the precedence relationships among the jobs in the list: given the cluster size k (>= 1), the clustering is performed based on the given size k and the precedence relationships. In other words, the size of a cluster is less than or equal to k, and the jobs in a cluster must be independent. The algorithm formulates the optimization scheduling problem with each subset of jobs. Before calling the optimization solver function, the algorithm computes the earliest completion time (EFT_ij) of each job on each processor (8), as defined in the previous section. The optimization function is called with the parameters: a scheduling cluster, a set of processors, compLen_i and EFT_ij (9). The optimization function is defined with the BIP model described in the previous section. The function returns a selected processor for each job in the cluster, and the jobs are assigned to the processors (10). After the assignment, the workload on each processor is updated with the assigned job information, such as the execution finish times of the assigned jobs (11). This information is used in the next scheduling iteration and affects the assignment decisions in future scheduling. For the workload information, the algorithm maintains the number of jobs and the next available time on each processor. The next available time of a processor defines the time when the processor finishes the currently running job and is ready to execute the next job.
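As an illustration of the dynamic clustering in step (7), the following minimal sketch groups a priority-sorted job list into clusters of at most k jobs such that no job in a cluster directly depends on another. The interface and method names are assumptions made for illustration, and the direct-predecessor check is a simplification of the full dependency test.

import java.util.List;
import java.util.ArrayList;
import java.util.Set;

public class JobClustering {

    interface Job {
        Set<? extends Job> predecessors();   // direct predecessors of this job in the DAG
    }

    /** True if neither job is a direct predecessor of the other. */
    static boolean independent(Job a, Job b) {
        return !a.predecessors().contains(b) && !b.predecessors().contains(a);
    }

    /** Walk the priority-sorted list, closing a cluster when it is full or a dependency appears. */
    static List<List<Job>> cluster(List<Job> sortedByPriority, int k) {
        List<List<Job>> clusters = new ArrayList<>();
        List<Job> current = new ArrayList<>();
        for (Job job : sortedByPriority) {
            boolean compatible = current.size() < k;
            for (Job other : current) {
                if (!independent(job, other)) { compatible = false; break; }
            }
            if (!compatible && !current.isEmpty()) {
                clusters.add(current);
                current = new ArrayList<>();
            }
            current.add(job);
        }
        if (!current.isEmpty()) clusters.add(current);
        return clusters;
    }
}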

PAGE 107

95 a processor defines the time when the processor finishes the currently running job, and is ready for executing the next job. In a scheduling iteration the algorithm computes workflow completion time resulting from the processor allocation decision to the schedulable job list (12). The completion time is defined with the longest job completion time in the workflow. In other words, it is defined with the earliest finish time of the exit job in the workflow. The algorithm updates the job execution time with the time on a selected processor in this iteration (13). In the next iteration the execution time of a job is deal with the time set in this step. Scheduling Algorithm on SPHINX We implement the proposed scheduling algorithm, and integrate it into the grid scheduling middleware, SPHINX. SPHINX performs dynamic scheduling with respect to the frequently changed load and availability of the grid resources. The scheduling system is integrated with a grid resource monitoring system such as MonALISA or GEMS to maintain grid resource status and workload information. The SPHINX job monitoring module keeps track of the job execution status on the sites. After making the scheduling decisions about a workflow based on the proposed scheduling algorithm, SPHINX makes the practical resource allocation only to a set of schedulable jobs in the workflow. A schedulable job has the entire parent jobs completed and the inputs available in a grid network. The job also has high priority to complete the workflow in the sorted job list in the algorithm. SPHINX also uses the processor status monitoring information to organize a set of available processors. The scheduling algorithm uses the set of processors to allocate resources to the jobs.

PAGE 108
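As a companion to step (7) described above, the following is a minimal sketch of dynamic clustering under the two stated rules: a cluster holds at most k jobs, and only mutually independent jobs share a cluster. The depends predicate and the greedy cluster-closing rule are assumptions; the dissertation's exact clustering procedure may differ.

# A minimal sketch of the dynamic clustering in step (7), assuming `sorted_jobs`
# is already ordered by criticality and `depends(a, b)` returns True when jobs a
# and b are related by a (transitive) precedence constraint in the DAG.
def cluster_jobs(sorted_jobs, k, depends):
    clusters, current = [], []
    for job in sorted_jobs:
        independent = all(not depends(job, other) for other in current)
        if len(current) < k and independent:
            current.append(job)          # job joins the current schedulable cluster
        else:
            clusters.append(current)     # close the cluster and start a new one
            current = [job]
    if current:
        clusters.append(current)
    return clusters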

Experiment and Simulation Results

The performance of the proposed policy-based scheduling algorithm is evaluated both with simulation and with execution on the Open Science Grid (OSG). In this section we discuss the performance on a set of test applications in order to compare the algorithm with other scheduling algorithms that use the mean-value approach in a heterogeneous resource environment.

Network Configuration and Test Application

OSG is a grid computing infrastructure that supports scientific computing. It consists of more than 25 sites and collectively provides more than 2000 CPUs. The resources are used by seven different scientific applications, including three high-energy physics simulations and four data analyses in high-energy physics, biochemistry, astrophysics and astronomy.

In order to compare the performance of the algorithms, we generate a set of test workflows in directed acyclic graph (DAG) format. Each workflow simulates a simple application that takes input files and generates an output file. The size of the output file differs from job to job, and the file is located on the execution site by default. The structure of a DAG, such as its depth and width, is set with different values.

We set two kinds of parameters for the simulation: system parameters and workflow parameters. The system parameters consist of the number of processors and the resource usage constraints on the processors. For the simulation we set the number of processors to 20, and the constraints to 10%, 50% and 90%. A 10% resource constraint means that resource usage is limited to 90% of the total resources, while a 90% constraint means that only 10% of the total resources are available to execute a user's jobs.
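To make the constraint interpretation concrete, a hypothetical helper (not part of the simulator) that maps a usage constraint to the number of usable processors could look like this:

# Illustrative only: with a 20-processor pool, a 90% usage constraint leaves
# 2 processors available, matching the interpretation given above.
def available_processors(total_processors, constraint_percent):
    # A c% usage constraint leaves (100 - c)% of the processors available.
    return int(total_processors * (100 - constraint_percent) / 100)

# available_processors(20, 10) -> 18; available_processors(20, 90) -> 2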

The workflow parameters of the simulation describe a workflow in DAG format. They consist of the number of jobs, the height of the DAG, the number of jobs in each level, the amount of input data for each job, the communication-to-computation ratio (CCR) and the request type. The request types represent different workflow types: all the jobs in a DAG may request the same amount of CPU resources while requiring different processing times. We generate a virtual workflow pool with these parameters; each DAG in the pool has different parameter values and therefore represents a different workflow.

List Scheduling with the Mean Value Approach

In the experiment the list-scheduling algorithm is compared with the policy-based scheduling algorithm in terms of DAG completion time. The list scheduling uses the mean-value approach to decide the execution time of a job on heterogeneous resources. The list-scheduling algorithm sorts the jobs of a workflow in decreasing order of their remaining completion time, which is calculated from each job to the end of the workflow. In other words, the algorithm gives higher scheduling priority to the jobs that affect the workflow completion time most critically. The algorithm schedules the jobs in the sorted list one by one, assigning each job to the processor on which it can finish earliest; the resource usage policies constrain these assignments.

Taking the mean value for the job execution time may not reflect the actual job execution on the heterogeneous resources, which results in non-optimal schedules for the list-scheduling approach. Policy constraints can also drive the resource assignment in list scheduling toward unreasonable decisions. This is mainly caused by the fact that scheduling the jobs in the sorted list one by one gives no chance to consider the policy constraints across multiple jobs at once. The scheme may therefore produce a long DAG completion time by allocating a job to the wrong processor in terms of its earliest finish time (EFT_ij).

The workload-based algorithm considers the number of CPUs on a grid site and the current workload on the site. The workload is determined by maintaining the previous assignment information in a local scheduling system; the workload of a site consists of the number of jobs planned on the site and the number of jobs currently being executed on the site:

workload_i = (num_planned_i + num_submitted_i) / num_CPU_i

where
num_planned_i: the number of planned jobs on site i
num_submitted_i: the number of submitted jobs on site i
num_CPU_i: the number of CPUs on site i

The Simulated Performance Evaluation with Single DAG

The first simulation evaluates the performance of the algorithms as the resource usage constraint varies. The constraint is defined by the ratio of the restricted number of processors to the total number of processors; the policy constraint limits job execution to the available set of processors. The graphs in Figure 4-4 show the average DAG completion time under different constraints. Each graph shows the completion time when the number of tried DAGs is 20, 200, 500 or 700. As mentioned in the previous section, two kinds of parameters are defined in the simulation, system-related and workflow-related.
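A minimal sketch of this workload metric follows, assuming the workload-based (CPU-based) algorithm then favors the site with the smallest value; the function names are illustrative, not SPHINX APIs.

# A sketch of the workload metric defined above; names are illustrative.
def site_workload(num_planned, num_submitted, num_cpu):
    # workload_i = (num_planned_i + num_submitted_i) / num_CPU_i
    return (num_planned + num_submitted) / num_cpu

def pick_site(sites):
    """sites: dict mapping site id -> (num_planned, num_submitted, num_cpu).
    Returns the least loaded site (assumed selection rule)."""
    return min(sites, key=lambda s: site_workload(*sites[s]))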

[Figure 4-4 consists of four panels plotting average DAG completion time (sec.) against the resource usage constraint (%) for 20, 200, 500 and 700 tried DAGs, with one curve per algorithm: ListSched, Cluster=1, Cluster=3 (and Cluster=5 in the 20-DAG panel).]

Figure 4-4. The constraint and clustering effect on DAG completion. The graphs show the DAG completion time under different resource usage constraints. The constraints are defined by the ratio of available processors to total processors, and each constraint specifies the set of available processors on which a job is allowed to run.

The system-related parameters include the number of processors and the resource usage constraint. In the simulation of Figure 4-4 we use 20 processors, and the constraint is varied from 10% to 90% in increments of 20%. The workflow-related parameters include the number of jobs (10, 20, 40 or 80), the number of levels of the DAG (2, 4, 8 or 10), the link density (10%-100%) and the CCR (1, 4 or 8).

[Figure 4-5 consists of four panels plotting average DAG completion time (sec.) against the resource usage constraint (%) for 500 DAGs: two panels for link densities of 10% and 100%, and two panels for CCR values of 1 and 4, each with curves for ListSched, Clust=1 and Clust=3.]

Figure 4-5. Average DAG completion time with 500 DAGs when two workflow parameters are varied. The link density is 10% or 100%, the CCR is 1 or 4, and the resource usage constraint is 10% or 90%.

From the DAG pool we select a different number of tried DAGs for each set of simulation results: 20, 200, 500 or 700. We believe that a larger number of tried DAGs gives more stable results for the average workflow completion time, so in the next sets of simulations we focus on 500 and 700 DAGs, for which the results are more consistent.

For the results shown in Figure 4-5 we simulate scheduling while two DAG properties, the CCR and the link density, are controlled with specific values: the CCR is 1 or 4, and the link density is 10% or 100%. The other workflow and system properties are chosen randomly from the following ranges:

The number of jobs per DAG: 10, 20, 40 or 80
The number of levels of a DAG (height): 1, 2 or 10

The number of processors: 20
The resource usage constraints: 10% or 90%

In this simulation the workflow types are defined by two features: the communication-to-computation ratio (CCR) and the link density. Based on the CCR, the communication time is defined by

Communication time = (Computation time × CCR) / Communication rate

We simulate two kinds of DAGs in terms of CCR: with a large CCR a DAG is more communication oriented than with a small CCR. The other parameter that specifies a workflow is the link density, defined by the number of inputs a job in a child level takes from the jobs in the parent level; it determines the relationships between jobs within a DAG. With a link density of 20%, a job takes input data from 20% of the jobs in its parent levels.

The first two graphs in Figure 4-5 show the performance of the algorithms when the link density between jobs differs. The number of inputs of each job defines the link density; in this experiment the number of outputs is set to one. As the number of inputs increases, the DAG completion time increases. One of the main reasons is that a job with multiple inputs takes longer to become ready to run on a processor than a job with a small number of inputs.
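A small sketch of the communication-time relation above; the communication rate is treated here as an assumed normalizing parameter.

# A sketch of: Communication time = (Computation time * CCR) / Communication rate
def communication_time(computation_time, ccr, comm_rate=1.0):
    return (computation_time * ccr) / comm_rate

# With CCR = 4 and comm_rate = 1, a DAG spends roughly four times as long
# communicating as computing, i.e. it is communication oriented.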

[Figure 4-6 consists of four panels plotting average DAG completion time (sec.) against the resource usage constraint (%) for 700 DAGs at link densities of 10%, 30%, 60% and 100%, each with curves for ListSched, Cluster=1 and Cluster=3.]

Figure 4-6. Average DAG completion time for the different scheduling algorithms, the list scheduling and the cluster-based scheduling. The number of tried DAGs is 700. The results show the performance of the algorithms when the link density is changed to 10%, 30%, 60% and 100%.

The second two graphs in Figure 4-5 show the effect of the CCR on the DAG completion time. For a communication-oriented workflow with a CCR of 4, the policy-based scheduling tolerates the constraints well compared to the list scheduling: the DAG completion time with the cluster-based scheduling is more stable than with the list scheduling as the resource usage constraint changes from 10% to 90%. The policy-based scheduling shows consistent performance across job types, meaning that its performance is stable for different types of workflows, whether communication oriented or computation oriented.

[Figure 4-7 consists of two panels plotting average DAG completion time (sec.) against the resource usage constraint (%) for 700 DAGs at CCR = 1 and CCR = 4, each with curves for ListSched, Cluster=1 and Cluster=3.]

Figure 4-7. Average DAG completion time for the different scheduling algorithms, the list scheduling and the cluster-based scheduling. The number of tried DAGs is 700. The results show the performance of the algorithms when the communication-to-computation ratio (CCR) is changed from 1 to 4.

The simulation results in Figure 4-6 show the performance of the different scheduling algorithms when the link density is changed to 10%, 30%, 60% and 100%. The number of tried DAGs is 700 in this simulation; with this larger number of DAGs the simulation gives more stable and reliable results. As in the 500-DAG simulation, the cluster-based scheduling gives better scheduling results than the list scheduling in the 700-DAG simulation. The other system and workflow properties are the same as in the 500-DAG simulation.

For the simulation results in Figure 4-7 we run the simulation for two communication-to-computation ratios (CCR), 1 and 4. We compute the average completion time of the 700 DAGs for resource usage policy constraints of 10%, 50% and 90% and for three scheduling algorithms: the list scheduling, the clustering with cluster size 1 and the clustering with cluster size 3. The value ranges of the other system and workflow related properties are the same as in the 500-DAG simulation.

[Figure 4-8 consists of three panels, one per resource usage constraint (10%, 50%, 90%), plotting the fraction of DAGs whose deadline miss ratio is <= x against the deadline miss ratio x, with curves for ListSched, Clust=1 and Clust=3.]

Figure 4-8. The scheduling performance of the different scheduling algorithms, the list scheduling and the cluster-based scheduling, with multiple workflows. The simulation uses 20 sets of workflows, each consisting of 50 DAGs. The deadline miss ratio is the DAG completion time divided by the best DAG completion time, where the best DAG completion time is the smallest DAG completion time achievable under the given system and workflow related properties.

The Simulated Performance Evaluation with Multiple DAGs

In this section we simulate the performance of the scheduling algorithms with multiple workflows. The workflows are randomly selected from the DAG pool, which contains DAGs with different values of the system and workflow related parameters, such as the number of jobs in a DAG, the number of input data sets of a job, the number of output data sets of a job, the height of a DAG and the communication-to-computation ratio (CCR). The value ranges of the parameters are listed below.

The number of jobs in a DAG: 10, 20, 40, 60 or 80
The link density: 10%, 20% or 100%
The number of outputs: 1

The height of a DAG: 1, 2 or 10
The CCR: 1, 2, 4, 8 or 16
The number of processors: 20
The resource usage constraints: 10%, 50% or 90%

We define a performance metric for the simulation, the deadline miss ratio, as the ratio of a DAG's completion time to its deadline. We simulate the performance of each DAG with three different algorithms, and for each algorithm we take the ratio of the DAG completion time to the deadline as that DAG's deadline miss ratio. The metric is defined by two factors, the workflow completion time and the given workflow completion deadline, where the workflow completion time is the largest job completion time in the DAG:

missRatio_d = compTime_d / deadline_d

where
compTime_d: the execution end time minus the execution start time of DAG d
deadline_d: the completion deadline of DAG d

The completion deadline of a workflow is set from an admission value and an estimated completion time of the workflow. The admission value controls the deadline miss rate: with a lower admission value the deadline becomes tighter, resulting in a higher miss rate, while with a higher admission value the miss rate becomes lower. The estimated workflow completion time is computed from the estimated current load on the grid resources, the load factor of the workflow and the constraints on the workflow execution. In our simulation we set the admission value so that the deadline equals the estimated DAG completion time under the given resource load.

Deadline = (1 + a) × EstComp

where
a: the admission value
EstComp: the estimated DAG completion time

To implement the simulation setup we devised a table holding dynamic resource load information. This table gives the best DAG completion time under the current resource load, and in our setup that completion time serves as the deadline, as described above. The main purpose of the resource load table is to keep track of the current resource load and to determine the estimated workflow completion time under that load. With multiple workflows we also use the table to make future resource usage reservations: given sequential workflows, an earlier workflow has higher priority for a given resource set than a later one, so resources loaded by one workflow are not available to the workflows that follow. The table maintains this resource usage reservation information.

The graphs in Figure 4-8 present the performance of the scheduling algorithms with multiple DAGs. The simulation uses 20 sets of workflows, each consisting of 50 DAGs. The graphs show the fraction of the 50 DAGs whose deadline miss ratio is less than the value on the x-axis. Most of the 50 DAGs complete their jobs with a deadline miss ratio below 1.2 when the resource usage constraint is 10% or 50%. The results with a 90% constraint are similar for both scheduling approaches because, under such a tight constraint, the cluster-based scheduling has little opportunity to make the best scheduling decision for a job; in other words, the list scheduling result differs little from the cluster-based result because of the tight 90% constraint.
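A minimal sketch of the deadline and deadline-miss-ratio definitions above; the variable names are illustrative only.

# Deadline = (1 + a) * EstComp
def deadline(estimated_completion, admission_value):
    return (1 + admission_value) * estimated_completion

# missRatio_d = compTime_d / deadline_d, with compTime_d = end - start
def deadline_miss_ratio(end_time, start_time, dag_deadline):
    return (end_time - start_time) / dag_deadline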

The Performance Evaluation with Single DAG on OSG

In addition to the performance simulation we present an experiment that evaluates performance on OSG. The experiment measures the performance of the cluster-based scheduling algorithm and compares it with the list-scheduling algorithm and the simpler scheduling algorithms described in the previous section. In this experiment a resource is a grid site on OSG; specifically, the experiment uses CPU resources to run a set of workflows, so a resource in this section refers to the CPU resources of a grid site. For the single DAG scheduling experiment we use the following system and workflow related parameter setup.

The number of DAGs: 150
The number of jobs per DAG: 4, 8, 16 or 32
The height of a DAG: 1, 2, 4 or 8
The communication delay: 1
The link density: 10%, 50% or 100%
The number of outputs: 1

The Test Application

In the experiment we use the Condor-G job specification language to describe an application, and we use the Condor and Globus infrastructure to submit jobs to remote processors and to maintain processor load information on OSG. A job described in the Condor job specification language takes a set of predefined input data, which may be located on the local execution site or on a remote site. After receiving all of its input data the job executes for a predefined number of minutes and generates a 1 KB text file. This file is the job's output and is located on the execution site. The set of input data is specified in the job description language and is transmitted from wherever the files are located to the job execution site.

In this experiment we use the SPHINX job execution monitor, which keeps track of job execution status information on the grid sites. If a job fails on a remote site, the monitor detects the failure and reschedules the job, keeping the failure information in its local database. A site with a large number of failures is eliminated from the list of sites available for job execution.

[Figure 4-9 consists of two panels: DAG completion time (sec.) versus the resource usage constraint (10%, 50%, 90%), and DAG completion time versus the link density (10%, 50%, 100%) at a 50% constraint, each with curves for Round-robin, CPU-based and Clustering.]

Figure 4-9. The scheduling performance evaluation of the Round-robin, CPU-based and cluster-based algorithms with single workflow scheduling. The resource usage constraint is varied over 10%, 50% and 90%.

The first experiment result in Figure 4-9 presents the average DAG completion time for the Round-robin, CPU-based and cluster-based algorithms under resource usage constraints of 10%, 50% and 90%. The second graph shows the results for different link density settings, 10%, 50% and 100%, when the resource constraint is fixed at 50%. In the first graph the average DAG completion time with the cluster-based scheduling (cluster size 3) remains stable as the constraint increases, while the completion time increases dramatically with the Round-robin and CPU-based algorithms as the constraint becomes tighter.

[Figure 4-10 consists of three panels, one per CCR value (1, 2, 4), plotting average DAG completion time (sec.) against the resource usage constraint (10%, 50%, 90%) with curves for ListSched and Clust=3.]

Figure 4-10. The scheduling performance comparison between the list scheduling and the cluster-based scheduling when the CCR is changed over the values 1, 2 and 4. The resource usage constraint is 10%, 50% or 90%.

We also compare the scheduling performance of the cluster-based scheduling algorithm against the list-scheduling algorithm. The workflow types used in the experiment are categorized by their CCRs and link densities. The graphs in Figure 4-10 show the average DAG completion time for CCR values of 1, 2 and 4. With the larger CCRs the cluster-based scheduling outperforms the list-scheduling algorithm, while there is little difference in performance when the CCR is 1. The results show that the cluster-based scheduling algorithm works better than the list-scheduling algorithm for communication-oriented workflows, while the list-based scheduling outperforms the cluster-based scheduling for computation-oriented workflows.

[Figure 4-11 consists of three panels, one per link density (10%, 50%, 100%), plotting average DAG completion time (sec.) against the resource usage constraint (10%, 50%, 90%) with curves for ListSched and Clust=3.]

Figure 4-11. The scheduling performance comparison between the list scheduling and the cluster-based scheduling when the link density is changed over the values 10%, 50% and 100%. The resource usage constraint is 10%, 50% or 90%.

The graphs in Figure 4-11 show the average DAG completion time for the different link densities. The link density is defined as

Link_density_i = (parentJobs_i / upperJobs_i) × 100

where
parentJobs_i: the number of parent jobs of job i
upperJobs_i: the number of jobs in the level above job i

The graphs show the scheduling performance of the two algorithms when the resource usage constraint is changed over 10%, 50% and 90%. There is no large performance difference between the list-scheduling algorithm and the cluster-based scheduling algorithm.
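A one-line sketch of the link-density definition above, with illustrative parameter names.

# Link_density_i = (parentJobs_i / upperJobs_i) * 100
def link_density(parent_jobs, upper_jobs):
    return 100.0 * parent_jobs / upper_jobs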

[Figure 4-12 consists of three panels, one per resource usage constraint (10%, 50%, 90%), plotting the fraction of DAGs whose deadline miss ratio is <= x against the deadline miss ratio x, with curves for Clustering, CPU-based and Roundrobin.]

Figure 4-12. The performance of multiple DAG scheduling with the simple scheduling algorithms (Round-robin and CPU-based) and the cluster-based scheduling algorithm. The graphs show the fraction of DAGs that meet the deadline (miss ratio equal to 1) and the fraction that miss the deadline (ratio greater than 1). Each graph shows the result when the resource usage policy is 10%, 50% or 90%.

The Performance Evaluation with Multiple DAGs on OSG

In the next sets of experiments we perform the multiple DAG scheduling experiment on OSG. We compare and analyze the performance of the cluster-based scheduling, the list scheduling, the Round-robin and the CPU-based algorithms. To specify the workflow type we control the workflow-related parameters, namely the link density and the CCR. To make the experiment results reliable and stable we repeat the experiment 10 times; in each trial the set of multiple DAGs consists of 50 DAGs randomly selected from the DAG pool. The value ranges of the system-related and workflow-related parameters are listed below.

The number of data sets: 10
The number of DAGs per set: 50
The number of jobs per DAG: 4, 8, 16 or 32
The height of a DAG: 2, 4 or 8
The communication delay: 1
The link density: 10%, 50% or 100%
The number of outputs: 1

The graphs in Figure 4-12 show the performance of the multiple DAG scheduling with the simple scheduling algorithms (Round-robin and CPU-based) and the cluster-based scheduling algorithm: the fraction of DAGs that meet the deadline (miss ratio equal to 1) and the fraction that miss it (ratio greater than 1), for resource usage policies of 10%, 50% and 90%. The fraction of DAGs that miss the deadline is dramatically smaller with the cluster-based scheduling than with the simple Round-robin and CPU-based scheduling when the resource usage constraint is 50% or 90%. At a 10% constraint the performance of the two scheduling types is similar: with such a loose constraint the simple algorithms can still choose suitable processors to finish the jobs, and with many available processors the performance is not very sensitive to the scheduling strategy.

The experiment results in Figure 4-13 present the performance of the multiple DAG scheduling when the workflow types differ in link density. As mentioned in the previous section, with a large link density the jobs within a workflow are intensively linked, and the DAG takes longer to complete than a DAG with loosely connected jobs because the start time of a highly linked job may be delayed while it waits for its parent jobs to complete.

[Figure 4-13 consists of three panels, one per link density (10%, 50%, 100%), plotting the fraction of DAGs whose deadline miss ratio is <= x against the deadline miss ratio x, with curves for Clustering, CPU-based and Roundrobin.]

Figure 4-13. The performance of multiple DAG scheduling with the simple scheduling algorithms (Round-robin and CPU-based) and the cluster-based scheduling algorithm. The graphs show the fraction of DAGs that meet the deadline (miss ratio equal to 1) and the fraction that miss the deadline (ratio greater than 1). Each graph shows the result when the link density is 10%, 50% or 100%, while the resource usage constraint is fixed at 50%.

The graphs in Figure 4-13 show these results for link densities of 10%, 50% and 100% at the fixed resource usage constraint of 50%.

[Figure 4-14 consists of three panels, one per resource usage constraint (10%, 50%, 90%), plotting the fraction of DAGs whose deadline miss ratio is <= x against the deadline miss ratio x, with curves for ListSched and Cluster=3.]

Figure 4-14. The performance of the multiple DAG scheduling algorithms, the list scheduling and the cluster-based scheduling. The graphs show the fraction of DAGs that meet the deadline (deadline miss ratio = 1) or miss the deadline (ratio > 1). Each graph shows the performance when the resource usage constraint is 10%, 50% or 90%.

For the DAGs with high link density the cluster-based scheduling shows better performance than the simple scheduling in terms of the deadline miss ratio. The performance difference between the scheduling algorithms for loosely connected DAGs (link density equal to 10%) is not significant, because that type of workflow can be scheduled with any scheduling strategy and still complete within a reasonable time (1 <= deadline miss ratio <= 2).

We also compare the cluster-based scheduling with the list scheduling. The graphs in Figure 4-14 present the performance of these two multiple DAG scheduling algorithms: the fraction of DAGs that meet the deadline (deadline miss ratio = 1) or miss it (ratio > 1), for resource usage constraints of 10%, 50% and 90%.

[Figure 4-15 consists of three panels, one per estimation inaccuracy ratio (0.0, 0.5, 1.0), plotting average DAG completion time (sec.) against the resource usage constraint (10%, 50%, 90%) with curves for ListSched, Clust=1 and Clust=3.]

Figure 4-15. The single DAG scheduling sensitivity to the job execution time estimation. The graphs show the average DAG completion time for three scheduling algorithms, the list scheduling, clusterSize=1 and clusterSize=3, for different estimation inaccuracy ratios as the resource usage constraint is changed over 10%, 50% and 90%.

The Algorithm Sensitivity to the Estimated Job Execution Time

The execution time of a job in a heterogeneous resource environment is hard to estimate, yet it is critical to the performance of scheduling algorithms: the execution time of a job on each processor must be estimated reasonably well during the scheduling decision procedure. The simulation results in the previous section show the performance of the different algorithms when the estimates are precise. In this section we compare the relative performance of the algorithms when the estimates may be inaccurate.

We set up a function to simulate an estimated job execution time. Given the execution time of a job on a heterogeneous site and an estimation inaccuracy ratio, the function generates a perturbed execution time. The inaccuracy ratio determines the difference between the execution time used by the algorithm and the actual time used during the simulation. For example, if the execution time is 100 seconds and the inaccuracy ratio is 0.3, the estimated execution time is 70 or 130 seconds. The function is defined as

estimatedTime_ij = execTime_ij + random(-1, 1) × execTime_ij × rat

where
estimatedTime_ij: the estimated execution time of job i on processor j
execTime_ij: the given execution time of job i on processor j
random(-1, 1): a random selection between -1 and 1
rat: the ratio of the estimation inaccuracy (0 <= rat <= 1)

We simulated the algorithms above to investigate their performance for inaccuracy ratios of 0.0, 0.5 and 1.0, for both single DAG scheduling and multiple DAG scheduling. The graphs in Figure 4-15 show the average DAG completion time for the single DAG scheduling algorithms. We choose estimation inaccuracy ratios of 0.0, 0.5 and 1.0, and to compare the robustness of the proposed scheduling algorithms we use three different algorithms: ClusterSize=3, ClusterSize=1 and ListSched. The inaccuracy ratio expresses how far the estimated time may deviate from the given execution time: a value of 0.5 means the estimated time can differ from the given time by 50%, while 0 means the estimated time equals the given time. We plot the DAG completion times for resource usage constraints of 10%, 50% and 90%.

The graphs show the effect of the inaccuracy on the performance of a scheduling algorithm. The clustering-based algorithm with cluster size 3 outperforms the other algorithms when the estimated time equals the given time, that is, when the inaccuracy ratio is 0.0. Although the clustering algorithm with cluster size 3 remains better than the other two algorithms, the performance improvement decreases as the inaccuracy ratio increases to 1.0, and the absolute performance of all the algorithms decreases when the estimates are not accurate. The results show the negative impact of inaccurate execution time estimation; they also show that the clustering-based approach is relatively better even when the estimates are not accurate.
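A sketch of the perturbation function above. The worked example (100 seconds with ratio 0.3 yielding 70 or 130 seconds) suggests the random factor is exactly plus or minus one; a uniform draw in [-1, 1] is another possible reading of random(-1, 1).

import random

# estimatedTime_ij = execTime_ij + random(-1, 1) * execTime_ij * rat
# The +/-1 choice matches the 70-or-130-second example in the text;
# random.uniform(-1.0, 1.0) would be the alternative interpretation.
def estimated_time(exec_time, ratio):
    return exec_time + random.choice([-1, 1]) * exec_time * ratio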

[Figure 4-16 consists of three panels, one per estimation inaccuracy ratio (0.0, 0.5, 1.0), plotting the fraction of DAGs whose deadline miss ratio is <= x against the deadline miss ratio x, with curves for ListSched, Clust=1 and Clust=3.]

Figure 4-16. The multiple DAG scheduling sensitivity to the job execution time estimation. The graphs show, for three scheduling algorithms (list scheduling, clusterSize=1 and clusterSize=3), the fraction of DAGs with a deadline miss ratio equal to or less than a given value, for different estimation inaccuracy ratios and for deadline miss ratios ranging from 1 to 2.6.

We also perform the sensitivity test with the multiple DAG scheduling algorithms. The simulation results in Figure 4-16 show the fraction of a given set of DAGs whose deadline miss ratio is equal to or less than a given value. The estimation inaccuracy ratio is set to 0.0, 0.5 or 1.0. We use 20 sets of 50 DAGs that are randomly selected with different CCRs, numbers of jobs and DAG structures, and the resource usage constraint is set to 50%.

The graph with the inaccuracy ratio equal to 0 shows that the cluster-based scheduling algorithm with cluster size 3 outperforms the other algorithms in terms of the deadline miss ratio: more than 60% of the given sets of DAGs complete within the deadline with this algorithm, while the other algorithms do not perform as well. However, the performance difference shrinks when the estimation inaccuracy ratio is 0.5 or 1.0, and fewer than 40% of the DAGs meet their deadline with any of the algorithms when the estimates are inaccurate. Thus, the absolute performance of all the algorithms decreases when the estimation times are not accurate. It is important to note that the clustering-based approach is still better than list scheduling even when the inaccuracy ratio is 1.0, which demonstrates the robustness of the approach. The simulation results for single and multiple DAGs also show the critical importance of job execution time estimation to algorithm performance: accurate estimation leads to better absolute and relative performance of intelligent scheduling algorithms.

Conclusions

In this chapter we introduce a novel policy-based scheduling algorithm. It allocates grid resources to an application under the constraints expressed by resource usage policies, and it performs optimized scheduling on heterogeneous resources using an iterative approach and binary integer programming (BIP). The algorithm improves the completion time of an application in combination with the job execution tracking and history modules of the SPHINX scheduling middleware. The implementation of the algorithm in SPHINX makes it possible to schedule jobs onto OSG according to policy-based scheduling decisions.

CHAPTER 5
CONCLUSIONS

In this chapter we conclude the dissertation and discuss future work. We review the current research issues in Grid computing and our research hypotheses for these problems, give an overview of our solutions, and present future research directions to improve the current techniques and software.

Executing data intensive applications on large grids requires allocating dynamically changing storage, computer and network resources in a fashion that satisfies both global and local constraints. Global constraints include community-wide policies governing how resources should be prioritized and allocated. Local constraints include site-specific control over when external users are allowed to use local resources. We will develop algorithms and software that use heterogeneous and dynamically changing resource status information to execute workflows in a grid environment. One of the main focuses of this research is to develop algorithms that gracefully handle the impact of monitoring latency (the time between collecting information from each sensor and the time when this information is available in collective form to the decision making process) on the quality of workflow scheduling in an adaptive resource environment. The algorithms also resolve resource scheduling issues by using efficient operations research methodologies for optimization subject to resource usage policy constraints. Achieving global optimization of workflow execution has been researched in high performance application development for many years, yet it remains an open issue. Especially in a grid computing environment, in which heterogeneous and autonomous resources and requests have various usage and execution requirements, it is critical to satisfy overall quality of service (QOS) requirements such as workflow completion deadlines.

The research also focuses on developing a novel scheduling framework for achieving globally optimized resource allocation. The proposed strategy satisfies global QOS requirements while performing locally best-effort scheduling. The resource allocation is driven by resource usage policies for the intended users and by on-time scheduling decisions that reflect the dynamically changing grid environment. We tackle the issue of fault-tolerant scheduling by using the job execution tracking and rescheduling functionality in the system. Another issue that we tackle is distributed resource management: the research develops a distributed framework based on resource content information characterized by system properties.

Policies, including authentication, authorization and application constraints, are important factors for maintaining resource ownership and security. The set of possible constraints on job execution can vary widely and change significantly over time, and policies may include any information that should be specified to ensure that a job is matched to appropriate resources. This research develops a novel framework for policy-based scheduling, and the study investigates efficient resource usage management and prioritization for the scheduling strategies.

In order to develop reliable and scalable software for solving general-purpose distributed data intensive problems, the research investigates possible architectures that combine existing grid services. We will deploy the prototype across Grid testbeds to demonstrate the efficiency of the system. It will exhibit interactive remote data access, interactive workflow generation and collaborative data analysis using virtual data and data provenance, as well as non-trivial examples of policy-based scheduling of requests in a resource constrained grid environment.

The development of scheduling algorithms and software that effectively use monitoring information, along with information about the jobs already scheduled, can provide reasonable guarantees for completing workflows within deadlines even though the available resources may change over time. A key portion of our research will be to understand and model the rate of change of total resource availability and its impact on completion times. We also argue that a scheduling system based on resource usage policies and request preferences such as deadlines is sufficient to satisfy specific scheduling requests for QOS and to achieve workload balance across a grid. In this research we investigate efficient architectures for a distributed resource management framework that coordinates access to shared resources among autonomous instances.

QOS is classified along two general dimensions, soft QOS and hard QOS, and each dimension has multiple properties such as the resource usage amount or the workflow execution deadline. We assume that adjusting resource usage accounts or request priorities can control the assignment of requests to grid resources. This allows the scheduler to balance workloads across grid resources, resulting in better overall utilization and turnaround time. In addition to workload balance, a resource management system can manage resource usage by adjusting the usage quotas of intended resource users. The scheduler monitors resource usage by keeping track of quota changes; it saves resource cycles by assigning more requests to idle resources while reducing the number of requests assigned to overused resources.

We also argue that implementing a prototype of distributed high-level services supporting grid-enabled data analysis within research communities can satisfy the need of globally dispersed scientists in hundreds of collaborative teams to rapidly access and analyze massive data. The research investigates the associated complex behavior of such an end-to-end system. In particular, the prototype integrates several existing grid services for distributed data analysis, and we develop performance metrics to demonstrate the application's usefulness. Our scheduling algorithms are implemented in the application and show the impact of monitoring information on performance.

We perform simulation to understand the impact of monitoring information and its latency on effective scheduling in an adaptive resource environment. The simple schemes include round robin, guided self-scheduling and similar approaches. Our goal is to demonstrate that the newly developed algorithms have significantly superior performance, in terms of meeting deadlines and other performance measures, compared to these simple approaches. The research also presents simulation to demonstrate the efficiency of the proposed distributed scheduling architecture. The simulation shows the performance of request scheduling on three different network topologies: content-based, geography-based and centralized scheduling networks. The content-based network is constructed according to our research arguments, the geography-based network is a traditional distributed network based on the geographical distance between resources, and in the centralized service a single server takes charge of scheduling requests to all the resources in a grid.

In the simulation and experiments we use different types of workflows and resources. The research focuses initially on single-task workflows, the simplest workflows, which are interactive or batch type. We then extend the research to simple DAG-based workflows, which reflect a variety of real world scenarios. Finally, our scheduling algorithms deal with multiple DAGs while achieving an objective function simultaneously. The experiments are performed in a heterogeneous resource environment, where a resource type such as CPU, storage or network has a set of resources with different performance. In order to test and evaluate the proposed data analysis prototype we deploy the framework across a grid testbed named Grid2003, which consists of more than 25 sites providing more than 2000 CPUs, and exhibit remote data access, workflow generation and collaborative data analysis using virtual data and data provenance, as well as non-trivial examples of policy-based scheduling of requests in a resource constrained grid environment.

LIST OF REFERENCES 1. J. Gary, A. Szalay, The World Wide Telescope: An Archetype for Online Science. Microsoft Research Technical Report, 75(4), 6, August 2001. 2. P. Avery, I. Foster, The GriPhyN Project: Towards Petascale Virtual-Data Grids. The 2000 NSF Information and Technology Research Program, Arlington, VA, 2000. 3. I. Foster, C. Kesselman, S. Tuecke, The Anatomy of the Grid: Enabling Scalable Virtual Organizations. International J. Supercomputer Applications, 15(3), 2150, 2001. 4. J. Frey, T. Tannenbaum, M. Livny, I. Foster, S. Tuecke, Condor-G: A Computation Management Agent for Multi-Institutional Grids. Proceedings of Tenth International Symposium on High Performance Distributed Computing, 7-9, San Francisco, CA, 2001. 5. A. Gerasoulis, and T. Yang, On the Granularity and Clustering of Directed Acyclic Task Graphs, IEEE Trans. Parallel and Distributed Systems 5(9), 951-967, 1994. 6. A. Ghafoor, and J. Yang, A Distributed Heterogeneous Supercomputing Management System, Computer, 26(6), 78-86, June 1993. 7. G. Karypis, and V. Kumar, A Fast and High Quality Multilevel Scheme for Partitioning Irregular Graphs, Technical Report, Department of Computer Science, University of Minnesota, 1995. 8. M. Kaddoura, and S. Ranka. Runtime Support for Parallelization of Data-Parallel Applications on Adaptive and Nonuniform Environments, Journal of Parallel and Distributed Computing, Special Issue on Workstation Clusters and Network-based Computing, 163-168, June 1997. 9. Y. Kwok, and I. Ahmad, Static Scheduling Algorithms for Allocating Directed Task Graphs to Multiprocessors, ACM Computing Surveys, 31(4), 406-471 December 1999. 10. S. Ranka, M. Kaddoura, A. Wang, and G. C. Fox, Heterogeneous Computing on Scalable Heterogeneous Systems, in Proceedings of the SC93 Conference, 763-764, Phoenix, AZ, November 1993. 126

PAGE 139

127 11. T. Yang, and A. Gerasoulis, DSC: Scheduling Parallel Tasks on an Unbounded Number of Processors, IEEE Trans. Parallel and Distributed Systems, 5(9), 951-967, 1994. 12. I. Foster, The Grid2003 Production Grid: Principles and Practice, in Proceedings of 13th IEEE International Symposium on High-Performance Distributed Computing, 236-245, Honolulu, HI, 2004. 13. Open Science Grid, May 1 st 2006, http://www.opensciencegrid.org/index.php?option=com_frontpage&elMenu=Home Date Last Visited: March 6, 2006. 14. Virtual Data Toolkit, May, 2004, http://www.cs.wisc.edu/vdt/documentation.html Date Last Visited: March 6, 2006. 15. A. Bayucan, Globus-enabled PBS: the PBS-Globus Interface, Globus retreat 2000, Pittsburgh, PA, August 2000 16. S. Bittner, Selecting and Implementing the PBS Scheduler on an SGI Onyx2/Origin 2000, Cray Users Group, May 1999 17. Maui Scheduler Documentation, Maui High Performance Computing Center, 1999 http://www.hpc2n.umu.se/doc/maui/index.html Date Last Visited: March 6, 2006. 18. Q. Snell, M. Clement, D. Jackson, C. Gregory, The Performance Impact of Advance Reservation Meta-scheduling, IPDPS 2000 Scheduling Workshop, 83-85, Cancun, Mexico, May, 2000 19. Maui Scheduler Administrators Guide Version 3.2, Supercluster Research and Development Group, http://www.supercluster.org/mauidocs/mauiadmin.shtml Date Last Visited: March 6, 2006. 20. LSF Administrators Guide, Version 4.1, Platform Computing Corporation, February 2001 21. B. Sundaran, B. M. Chapman, Policy Engine: A Framework for Authorization, Accounting Policy Specification and Evaluation in Grids, 2nd International Conference on Grid Computing, Nov. 2001 22. B. Segal, Grid Computing: The European Data Project. In IEEE Nuclear Science Symposium, Lyon, France, October 2000. 23. M. Ruda, Integrating GRID Tools to Build a Computing Resource Broker: Activities of DataGrid WP1. CHEP 2001, Beijing, September 2001. 24. E. Deelman, J. Blythe, Y. Gil, C. Kesselman, Pegasus: Planning for Execution in Grids. Technical Report GriPhyN-2002-20, November 2002.

PAGE 140

128 25. D. Thain, T. Tannenbaum, M. Livny, Condor and the Grid, in Fran Berman, Anthony J.G. Hey, Geoffrey Fox, editors, Grid Computing: Making The Global Infrastructure a Reality, John Wiley, 2003. 26. J. Basney, M. Livny, Deploying a High Throughput Computing Cluster, High Performance Cluster Computing Rajkumar Buyya, Editor, Vol. 1, Chapter 5, Prentice Hall PTR, May 1999. 27. J. Weismann, Prophet: Automated Scheduling of SPMD Programs in Workstation Networks, Concurrency: Practice and Experience, 11(6), 301-321, May, 1999. 28. Enabling Grids for E-science, April 14 th 2006, http://egee-intranet.web.cern.ch/egee-intranet/index.htm Date Last Visited: March 6, 2006. 29. Lightweight Middleware for Grid Computing, April 14 th 2006, http://glite.web.cern.ch/glite/ Date Last Visited: March 6, 2006. 30. Toward Open Grid Services Architecture, http://www.globus.org/ogsa Date Last Visited: March 6, 2006. 31. I. Foster, C. Kesselman Globus: A Metacomputing Infrastructure Toolkit, International Journal of Supercomputer Applications 1997; 11(2):115. 32. R. Buyya The Gridbus Toolkit: Enabling Grid Computing and Business, http://www.gridbus.org Date Last Visited: March 6, 2006. 33. J. Almond D. Snelling, UNICORE: Uniform Access to Supercomputing as an Element of Electronic Commerce, Future Generation Computer Systems 1999; 15:539. 34. W. Johnston, D. Gannon, B. Nitzberg, Grids as Production Computing Environments: The Engineering Aspects of NASAs Information Power Grid, Eighth IEEE International Symposium on High Performance Distributed Computing, Redondo Beach, CA, August 1999. IEEE Computer Society Press: Los Alamitos, CA, 1999. 35. E. Akarsu, G. Fox, W. Furmanski, T. Haupt, WebFlowHigh-level Programming Environment and Visual Authoring Toolkit for High Performance Distributed Computing. SC98: High Performance Networking and Computing, Orlando, FL,1998. 36. H. Casanova, J. Dongarra NetSolve: A Network Server for Solving Computational Science Problems. International Journal of Supercomputing Applications and High Performance Computing, 11(3), 1997

PAGE 141

129 37. E. Akarsu, G. Fox, T. Haupt, A. Kalinichenko, K. Kim P. Sheethaalnath, C.Youn, Using Gateway System to Provide a Desktop Access to High Performance Computational Resources. The 8th IEEE International Symposium on High Performance Distributed Computing (HPDC-8), Redondo Beach, CA, August 1999. 38. Gridlab. March 6 th 2006, http://gridlab.org Date Last Visited: March 6, 2006. 39. SETI@Home. http://setiathome.ssl.berkeley.edu/ Date Last Visited: March 6, 2006. 40. Distributed.Net. January 24 th 2006, http://www.distributed.net/ Date Last Visited: March 6, 2006. 41. The WS-Resource Framework. http://www.globus.org/wsrf/ Date Last Visited: March 6, 2006. 42. Global Grid Forum. http://www.ggf.org/ogsi-wg Date Last Visited: March 6, 2006. 43. D. Doval, D. OMahony, Overlay Networks, A Scalable Alternative for P2P, IEEE Internet Computing, August 2003. 44. S. Ratnasamy, P. Francis, M. Handley, R. Karp, S. Shenker, A Scalable Content-Addressable Network, ACM SIGCOMM, 2001. 45. I. Stoica, R. Morris, D. Karger, M. F. Kaashoek, H. Balakrishman, Chord: A Scalable Peer-to-peer Loopup Service for Internet Applications, ACM SIGCOMM, 2001. 46. B. Zhao, J. Kubiatowicz, A. Joseph, Tapestry: An Infrastructure for Fault-tolerant Wide-area Location and Routing, Technical report, U. C. Berkeley, 2001 47. A. Crespo, H. Garcia-Molina, Semantic Overlay Networks for P2P Systems, Technical report, Stanford University, Jan. 2003. 48. W. Hoschek, A Unified Peer-to-Peer Database Framework for Scalable Service and Resource Discovery, Proc. of the International IEEE/ACM Workshop on Grid Computing, Baltimore, USA, Nov. 2002. Springer Verlag. 49. D. Bradley, Condor-G Matchmaking in USCMS, Condor technical report, University of Wisconsin, Nov. 2003 50. A. Carzaniga, A.L. Wolf, Content-based Networking: A New Communication Infrastructure, NSF Workshop on an infrastructure for Mobile and Wireless Systems, Scottsdale, AZ, October, 2001 51. A. Carzaniga, M.J. Rutherford, A.L. Wolf, A Routing Scheme for Content-based Networking, Proceedings of IEEE INFOCOMM 2004, Hong Kong China, March, 2004.

PAGE 142

130 52. R. Chand, P. Felber, A Scalable Protocol for Content-based Routing in Overlay Networks, Proceedings of the IEEE International Symposium on Network Computing and Applications, Cambridge, MA, April, 2003. 53. M. Aron, D. Sanders, P. Druschel, W. Zwaenepoel, Scalable Content-aware Request Distribution in Cluster-based Network Servers, Proceedings of the 2000 Annual Usenix Technical Conference, San Diego, CA, June, 2000. 54. G.Q. Liu, K.L. Poh, and M. Xie, Iterative List Scheduling for Heterogeneous Computing, Journal of Parallel and Distributed Computing, 65(5), 654-665, 2005. 55. Y. K. Kwok, I. Ahmad, Dynamic Critical-path Scheduling: An Effective Technique for Allocating Task Graphs to Multiprocessors, IEEE Transactions on Parallel and Distributed Systems, 7(5), 506-521, 1996. 56. X. Qin, H. Jiang, Reliability-driven Scheduling for Real-time Tasks with Precedence Constraints in Heterogeneous Distributed Systems, In Proceedings of the International Conference Parallel and Distributed Computing and Systems 2000 (PDCS 2000), Las Vegas, USA, November 6-9, 2000. 57. H. Zhao, and R. Sakellariou, An Experimental Investigation into the Rank Function of the Heterogeneous Earliest Finish Time Scheduling Algorithm, Euro-Par 2003, LNCS 2790, Springer 2003. 58. G. C. Sih, and E.A. Lee, Dynamic-level Scheduling for Heterogeneous Processor Networks. In Proceedings of the Second IEEE Symposium on Parallel and Distributed Systems, 1990. 59. M. Kafil, and I. Ahmad, Optimal Task Assignment in Heterogeneous Distributed Computing Systems, Concurrency, IEEE 6(3), 42-50, 1998. 60. R. Ramen, M. Livny, M. Solomon, Matchmaking: Distributed Resource Management for High Throughput Computing, Proceedings of the Seventh IEEE International Symposium on High Performance Distributed Computing July 28-31, 1998, Chicago, IL 61. J. M. Schopf, Ten Actions When Superscheduling. Scheduling Working Group, Informational Category, Global Grid Forum. 62. Y. A. Li, J.K. Antonio, H. J. Siegel, M. Tan, D.W. Watson, Determining the Execution Time Distribution for a Data Parallel Program in a Heterogeneous Computing Environment. Journal of Parallel and Distributed Computing 44 (1), 35-52, 1997. 63. M. Iverson, F. Ozguner, L. C. Potter, Statistical Prediction of Task Execution Times Through Analytic Benchmarking for Scheduling in a Heterogeneous Environment. Proceedings of 8th Heterogeneous Computing workshop (HCW), Apr. 1999, San Juan, Puerto Rico.


64. M. Tan, H. J. Siegel, J. K. Antonio, Y. A. Li, Minimizing the Application Execution Time Through Scheduling of Subtasks and Communication Traffic in a Heterogeneous Computing System, IEEE Transactions on Parallel and Distributed Systems, 8(8), August 1997.
65. C. Steenberg, Clarens, http://clarens.sourceforge.net. Date Last Visited: March 6, 2006.
66. I. Foster, J. Voeckler, M. Wilde, Y. Zhao, Chimera: A Virtual Data System for Representing, Querying, and Automating Data Derivation, The 14th International Conference on Scientific and Statistical Database Management (SSDBM 2002), 2002.
67. A. Chervenak, Giggle: A Framework for Constructing Scalable Replica Location Services, Proceedings of the SC2002 Conference, November 2002.
68. T. Sandholm, J. Gawor, Globus Toolkit 3 Core: A Grid Service Container Framework, http://www-unix.globus.org/toolkit/documentation.html. Date Last Visited: March 6, 2006.
69. P. Raman, A. George, M. Radlinski, R. Subramaniyan, GEMS: Gossip-Enabled Monitoring Service for Heterogeneous Distributed Systems, GriPhyN Technical Report 2002-19, December 17, 2002.
70. T. Tannenbaum, Hawkeye, Condor Week, Paradyn, March 2002.
71. X. Zhang, J. Freschl, J. Schopf, A Performance Study of Monitoring and Information Services for Distributed Systems, Proceedings of HPDC, August 2003.
72. J. Bent, Flexibility, Manageability, and Performance in a Grid Storage Appliance, Proceedings of the Eleventh IEEE Symposium on High Performance Distributed Computing, Edinburgh, Scotland, July 2002.
73. T. Kosar, M. Livny, Stork Data Placement (DaP) Scheduler, http://www.cs.wisc.edu/condor/stork/. Date Last Visited: March 6, 2006.
74. J. Frey, T. Tannenbaum, M. Livny, I. Foster, S. Tuecke, Condor-G: A Computation Management Agent for Multi-Institutional Grids, Proceedings of the Tenth IEEE Symposium on High Performance Distributed Computing (HPDC-10), IEEE Press, 5:237-246, 2002.
75. F. Rademakers, R. Brun, ROOT: An Object-Oriented Data Analysis Framework, Linux Journal, 51, July 1998.
76. H. B. Newman, I. C. Legrand, P. Galvez, R. Voicu, C. Cirstoiu, MonALISA: A Distributed Monitoring Service Architecture, CHEP 2003, La Jolla, California, March 2003.


77. J. Blythe, E. Deelman, Y. Gil, The Role of Planning in Grid Computing, Proceedings of the 13th International Conference on Automated Planning and Scheduling (ICAPS), Trento, Italy, June 9-13, 2003.
78. R. Raman, M. Livny, M. Solomon, Matchmaking: Distributed Resource Management for High Throughput Computing, Proceedings of the Seventh IEEE International Symposium on High Performance Distributed Computing, Chicago, IL, July 28-31, 1998.
79. K. Ranganathan, I. Foster, Decoupling Computation and Data Scheduling in Distributed Data Intensive Applications, International Symposium for High Performance Distributed Computing (HPDC-11), Edinburgh, Scotland, July 2002.
80. M. Carman, F. Zini, L. Serafini, Towards an Economy-Based Optimization of File Access and Replication on a Data Grid, Proceedings of the 2nd IEEE/ACM International Symposium on Cluster Computing and the Grid (CCGRID 2002), 340-345, Berlin, Germany, May 2002.


BIOGRAPHICAL SKETCH

Jang-uk In obtained his Master of Science in computer science in December 1999. He obtained his Bachelor of Science in computer science from Hannam University, Taejon, South Korea, in February 1997.














EFFICIENT SCHEDULING TECHNIQUES AND SYSTEMS FOR GRID
COMPUTING














By

JANG-UK IN


A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL
OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT
OF THE REQUIREMENTS FOR THE DEGREE OF
DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA


2006




























Copyright 2006

by

Jang-uk In

































This document is dedicated to the graduate students of the University of Florida.
















ACKNOWLEDGMENTS

I thank my parents, my wife, and all my co-workers. Especially heartfelt thanks go to

Dr. Sanjay Ranka, chair of my committee, for his advice and support.





















TABLE OF CONTENTS





ACKNOWLEDGMENTS

LIST OF TABLES

LIST OF FIGURES

ABSTRACT

CHAPTER

1 INTRODUCTION

    Grid Computing
    Grid Resource Management Middleware: SPHINX
    Efficient Scheduling Techniques
    Scheduling Systems
        The Portable Batch System (PBS)
        Maui
        LSF
        EZ-Grid
        Resource Broker
        Pegasus
        Condor
        PROPHET
    Data Analysis Systems
    Scheduling Algorithms
        Iterative List Scheduling
        Dynamic Critical Path Scheduling
        Reliability Cost Driven Scheduling
        Heterogeneous Earliest Finish Time Scheduling
        Dynamic Level Scheduling
        Optimal Assignment with Sequential Search
    Contributions
    Outline

2 GRID POLICY FRAMEWORK

    Key Features
    Policy Space
        Resource Provider
        Resource Property
        Time
        Three-dimensional Policy Space
    A Solution Strategy for Policy-based Scheduling
        Model Parameters
        Choice of an Objective Function and Optimization Metric
        Quality of Service Constraints
    Simulation Results
    Future Works

3 SPHINX: POLICY-BASED WORKFLOW SCHEDULING

    Requirements of a Grid-scheduling Infrastructure
        Information Requirements
        System Requirements
    Highlights of SPHINX Architecture
        SPHINX Client
        SPHINX Server
        Data Replication Service
        Grid Monitoring Interface
    Relationship with Other Grid Research
        Grid Information Services
        Replica and Data Management Services
        Job Submission Services
        Virtual Data Services
        Future Planners and Schedulers
    Experiments and Results
        Scheduling Algorithms
        Test-bed and Test Procedure
        Performance Evaluation of Scheduling Algorithms
        Effect of Feedback Information
        Comparison of Different Scheduling Algorithms with Feedback
        Effects of Policy Constraints on the Scheduling Algorithms
        Fault Tolerance and Scheduling Latency
    Conclusion and Future Research

4 POLICY-BASED SCHEDULING TECHNIQUES FOR WORKFLOWS

    Motivation
    Problem Definition and Related Works
    Scheduling Algorithm Features
    Notation and Variable Definition
    Optimization Model
        Profit Function for Single Workflow Scheduling
        Profit Function for Multiple Workflow Scheduling
    Objective Function and Constraints
    Policy-based Scheduling Algorithm and SPHINX
        Iterative Policy-based Scheduling Algorithm
        Scheduling Algorithm on SPHINX
    Experiment and Simulation Results
        Network Configuration and Test Application
        List Scheduling with the Mean Value Approach
        The Simulated Performance Evaluation with Single DAG
        The Simulated Performance Evaluation with Multiple DAGs
        The Performance Evaluation with Single DAG on OSG
        The Test Application
        The Performance Evaluation with Multiple DAGs on OSG
        The Algorithm Sensitivity to the Estimated Job Execution Time
    Conclusion and Future Work

5 CONCLUSIONS

LIST OF REFERENCES

BIOGRAPHICAL SKETCH


















LIST OF TABLES


Table

1-1  The existing scheduling systems and the scheduling properties

3-1  Finite automaton of SPHINX scheduling status management

3-2  SPHINX client functionalities

3-3  SPHINX server functions for resource allocation

3-4  SPHINX APIs for accessing data replicas through the RLS service

3-5  Database table schemas for accessing resource-monitoring information

3-6  Grid sites that are used in the experiment

3-7  SPHINX server configurations

















LIST OF FIGURES


Figure

1-1  A grid with three subsystems

2-1  Examples of resource provider and request submitter hierarchies

2-2  Hierarchical policy definition example

2-3  Policy based scheduling simulation results

2-4  Policy based scheduling simulation results with highly biased resource usage

2-5  Policy based scheduling simulation results with highly biased workload

3-1  SPHINX scheduling system architecture

3-2  Overall structure of control process

3-3  Effect of utilization of feedback information

3-4  Performance of scheduling algorithms with 300 jobs and without any policy

3-5  Performance of scheduling algorithms with 600 jobs and without any policy

3-6  Performance of scheduling algorithms with 1200 jobs and without any policy

3-7  Site-wise distribution of completed jobs vs. average job completion time

3-8  Performance of policy-based scheduling algorithm

3-9  Number of timeouts in the different algorithms

3-10 SPHINX scheduling latency: average scheduling latency

4-1  An example workflow in Directed Acyclic Graph (DAG)

4-2  An example for job prioritization and processor assignment

4-3  The iterative policy-based scheduling algorithm on heterogeneous resources

4-4  The constraint and clustering effect on DAG completion

4-5  Average DAG completion time with 500 DAGs

4-6  Average DAG completion time with the different scheduling algorithms (1)

4-7  Average DAG completion time with the different scheduling algorithms (2)

4-8  The scheduling performance of the different scheduling algorithms

4-9  The scheduling performance evaluation of the scheduling algorithms

4-10 The scheduling performance comparison when the CCR is changed

4-11 The scheduling performance comparison when the link density is changed

4-12 The performance of the multiple DAG scheduling

4-13 The performance of the multiple DAG scheduling with the simple scheduling

4-14 The performance of the multiple DAG scheduling algorithms

4-15 The single DAG scheduling sensitivity to the job execution time estimation

4-16 The multiple DAG scheduling sensitivity to the job execution time estimation
















Abstract of Dissertation Presented to the Graduate School
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Doctor of Philosophy

EFFICIENT SCHEDULING TECHNIQUES AND SYSTEMS FOR GRID
COMPUTING

By

Jang-uk In

August 2006

Chair: Sanjay Ranka
Major Department: Computer and Information Science and Engineering

This dissertation discusses policy-based scheduling techniques on heterogeneous

resources for grid computing. The proposed scheduling algorithm has the following

features, which can be utilized on the grid computing environment. First, the algorithm

supports the resource usage constrained scheduling. A grid consists of the resources that

are owned by decentralized institutions. Second, the algorithm performs the

optimization-based scheduling. It provides an optimal solution to the grid resource

allocation problem. Third, the algorithm assumes that a set of resources is distributed

geographically and is heterogeneous in nature. Fourth, the scheduling dynamically

adjusts to the grid status. It tracks the current workload of the resources. The

performance of the proposed algorithm is evaluated with a set of predefined metrics. In

addition to showing the simulation results for the out-performance of the policy-based

scheduling, a set of experiments is performed on open science grid (OSG). In this

dissertation we discuss a novel framework for policy based scheduling in resource









allocation of grid computing. The framework has several features. First, the scheduling

strategy can control the request assignment to grid resources by adjusting resource usage

accounts or request priorities. Second, efficient resource usage management is achieved

by assigning usage quotas to intended users. Third, the scheduling method supports

reservation based grid resource allocation. Fourth, the quality of service (QOS) feature

allows special privileges to various classes of requests, users, groups, etc. This

framework is incorporated as part of the SPHINX scheduling system that is currently

under development at the University of Florida. Experimental results are provided to

demonstrate the usefulness of the framework. A grid consists of high-end computational,

storage, and network resources that, while known a priori, are dynamic with respect to

activity and availability. Efficient scheduling of requests to use grid resources must adapt

to this dynamic environment while meeting administrative policies. In this dissertation,

we describe a framework called SPHINX that can administer grid policies and schedule

complex and data intensive scientific applications. We present experimental results for

several scheduling strategies that effectively utilize the monitoring and job-tracking

information provided by SPHINX. These results demonstrate that SPHINX can

effectively schedule work across a large number of distributed clusters that are owned by

multiple units in a virtual organization in a fault-tolerant way in spite of the highly

dynamic nature of the grid and complex policy issues. The novelty lies in the use of

effective monitoring of resources and job execution tracking in making scheduling

decisions and fault tolerance, something which is missing in today's grid environments.















CHAPTER 1
INTRODUCTION

Grid computing is increasingly becoming a popular way of achieving high

performance computing for many scientific and commercial applications. The realm of

grid computing is not limited to the one of parallel computing or distributed computing,

as it requires management of disparate resources and different policies over multiple

organizations.

Our research studies grid computing and related technologies. We propose novel

grid resource management middleware and efficient scheduling techniques. This chapter

discusses grid computing issues and technologies. Specifically, we discuss the major

difference between the new computing paradigm and existing parallel and distributed

computing. We introduce new concepts and terminologies defined in grid computing.

We then present the proposed scheduling system and technologies, and discuss how they

affect and contribute to the computing community.

Grid Computing

Data generated by scientific applications are now routinely stored in large archives

that are geographically distributed. Rather than observing the data directly, a scientist

effectively peruses these data archives to find nuggets of information [1]. Typical

searches require multiple weeks of computing time on a single workstation. The

scientific applications that have these properties are discussed in detail in the upcoming

sections of this chapter.









Grid computing has become a popular way of providing high performance

computing for many data intensive, scientific applications. Grid computing allows a

number of competitive and/or collaborative organizations to share mutual resources,

including documents, software, computers, data and sensors and computationally

intensive applications to seamlessly process data [2, 3]. The realm of grid computing is

beyond parallel or distributed computing in terms of requiring the management of a large

number of heterogeneous resources with varying, distinct policies controlled by multiple

organizations.

Most scientific disciplines used to be either empirical or theoretical. In the past few

decades, computational science has become a new branch in these disciplines. In the past

computational science was limited to simulation of complex models. However in recent

years it also encapsulates information management. This has happened because of the

following trends: (1) Large amounts of data are available from scientific and medical

equipment, (2) the cost of storage has decreased substantially, and (3) the development of

Internet technologies allows the data to be accessible to any person at any location.

The applications developed by scientists on this data tend to be both

computationally and data intensive. An execution may require tens of days on a single

workstation. In many cases it would not be feasible to complete this execution on a single

workstation due to extensive memory and storage requirements. Computational grid

addresses these and many other issues by allowing a number of competitive and/or

collaborative organizations to share resources in order to perform one or more tasks. The

resources that can be shared include documents, software, computers, data and sensors.

The grid is defined by its pioneers [4] as follows:










The real and specific problem that underlies the Grid concept is coordinated
resource sharing and problem solving in dynamic, multi-institutional virtual
organization. The sharing that we are concerned with is not primarily file
exchange but rather direct access to computers, software, data and other
resources, as is required by a range of collaborative problem solving resource
brokering strategies emerging in industry, science and engineering.

The owner of a resource can choose the amount, duration, and schedule of the

resources available to different users (see Figure 1-1). These policies can vary over time,

impacting the available resources for a given application. A core requirement for success

of these environments will be a middleware that schedules different resources to

maximize the overall efficiency of the system.


Figure 1-1. A grid with three subsystems, each providing restricted access to a subset of
applications (for example, participants in Project 1 can run program A; participants in
Project 2 can run program B, read data D, and use certain machines during the night).

Realizing the potential of grid computing requires the efficient utilization of

resources. The execution of user applications must simultaneously satisfy both job

execution constraints and system usage policies. Although many scheduling techniques

for various computing systems exist [5-11], traditional scheduling systems are

inappropriate for scheduling tasks onto grid resources for the following main reasons.









Although parallel or distributed systems address one or more of these characteristics, they

do not address all of them in a cohesive manner for grids.

Virtual organization (VO) [12] is a group of consumers and producers united in

their secure use of distributed high-end computational resources towards a common goal.

Actual organizations, distributed nationwide or worldwide, participate in one or more

VO's by sharing some or all of their resources. The grid resources in a VO are

geographically distributed and heterogeneous in nature. These grid resources have

decentralized ownership and different local scheduling policies dependent on their VO.

The grid resources may participate in a VO in a non-dedicated way, which means the

resources accept incoming requests from several different remote sources. The dynamic

load and availability of the resources require mechanisms for discovering and

characterizing their status continually.

The second major challenge in the grid-computing environment relates to the

planning and scheduling of data analyses. The factors that guide the development of a

plan include user requirements, global and local policy, and overall state. User

requirements may include not only the virtual data request but also optimization criteria

such as completion in the shortest time or usage of the fewest computing resources. Any

plan is necessarily constrained by resource availability, and consequently, we must obtain

all available state information. This complicates planning, as the global system state can

be large and determining future system states can be difficult.

The complex interrelationships among different data representations (procedural vs.

declarative), data locations (archived vs. cached), policies (local vs. global), and










computations (different user queries, background tasks, etc.) make planning and

scheduling a challenging and rewarding problem.

New techniques are required for representing complex requests, for constructing

request representations via the composition of representations for virtual data

components, for representing and evaluating large numbers of alternative evaluation

strategies, and for dealing with uncertainty in resource properties.

A virtual data grid must be able to allocate storage, computer, and network

resources to requests in a fashion that satisfies global and local policies. Global policy

includes community-wide policies governing how resources dedicated to a particular

collaboration should be prioritized and allocated. Local policies are site-specific

constraints governing when and how external users can use local resources and the

conditions under which local use has priority over remote use. The execution of a plan

will fail if it violates either global or local policy. Hence we require mechanisms for

representing policies and new resource discovery techniques that can take into account

policy information.

The purpose of planning and scheduling is to optimize the response to a query for

virtual data given global and local policy constraints. Different optimization criteria may

be applied to a PVDG request: minimize execution time, maximize reliability, minimize

use of a particular resource, etc. For a given metric, optimization is driven by resource

characteristics and availability.

The dynamic nature of the grid coupled with complex policy issues poses

interesting challenges for harnessing the resources in an efficient manner. In our research,

we study the key features of grid resource management systems and their performance on









Open Science Grid (OSG) [13], a worldwide consortium of university resources

consisting of 2000+ CPUs.

Grid Resource Management Middleware: SPHINX

Efficient scheduling of requests to use grid resources must adapt to the dynamic

grid computing environment while meeting administrative policies. Our research defines

the necessary requirements of such a scheduler and proposes a framework called

SPHINX. The scheduling middleware can administer grid policies, and schedule

complex and data intensive scientific applications. The SPHINX design allows for a

number of functional modules to flexibly plan and schedule workflows representing

multiple applications on the grids. It also allows for performance evaluation of multiple

algorithms for each functional module. We present early experimental results for

SPHINX that effectively utilize other grid infrastructure such as workflow management

systems and execution systems. These results demonstrate that SPHINX can effectively

schedule work across a large number of distributed clusters that are owned by multiple

units in a virtual organization. The results also show that SPHINX can overcome the

highly dynamic nature of the grid and complex policy issues to utilize grid resources,

which is an important requirement for executing large production jobs on the grid. These

results show that SPHINX can effectively

* Reschedule jobs if one or more of the sites stops responding due to system
downtime or slow response time.
* Improve total execution time of an application using information available from
monitoring systems as well as its own monitoring of job completion times.
* Manage policy constraints that limit the use of resources.

Virtual Data Toolkit (VDT) [13] supports execution of workflow graphs. SPHINX

working with VDT is in the primary stages of exhibiting interactive remote data access,









demonstrating interactive workflow generation and collaborative data analysis using

virtual data and data provenance. Also, any algorithms we develop will potentially be

used by a wide user community of scientists and engineers. SPHINX is meant to be

inherently customizable, serving as a modular "workbench" for CS researchers, a

platform for easily exchanging planning modules and integrating diverse middleware

technology. It will also deliver reliable and scalable software architecture for solving

general-purpose distributed data intensive problems.

Efficient Scheduling Techniques

The dynamic and heterogeneous nature of the grid coupled with complex resource

usage policy issues poses interesting challenges for harnessing the resources in an

efficient manner. In our research, we present novel policy-based scheduling techniques

and their performance on OSG. The execution and simulation results show that the

proposed algorithm can effectively

1. Allocate grid resources to a set of applications under the constraints presented with
resource usage policies.
2. Perform optimized scheduling on heterogeneous resources using an iterative
approach and binary integer programming (BIP).
3. Improve the completion time of workflows in integration with the job execution
tracking modules of the SPHINX scheduling middleware.

The proposed policy-based scheduling algorithm is different from the existing

works from the following perspectives.

Policy constrained scheduling: The decentralized grid resource ownership

restricts the resource usage of a workflow. The algorithm makes scheduling decisions

based on resource usage constraints in a grid computing environment.

Optimized resource assignment: The proposed algorithm makes an optimal

scheduling decision utilizing the Binary Integer Programming (BIP) model. The BIP










approach solves the scheduling problem to provide the best resource allocation to a set of

workflows subject to constraints such as resource usage.
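
To make the BIP idea concrete, the following minimal Python sketch casts a tiny assignment problem in 0/1 form and solves it by brute-force enumeration rather than by a production integer-programming solver. The job names, site names, run-time estimates, and per-site quotas are invented for illustration and are not the formulation developed later in this dissertation.

    # A minimal sketch of policy-constrained assignment as a 0/1 (binary integer)
    # program, solved by brute-force enumeration.  The job names, site names,
    # estimated run times, and per-site usage quotas below are illustrative only.
    from itertools import product

    jobs = ["j1", "j2", "j3"]
    sites = ["siteA", "siteB"]

    # est[j][s]: estimated completion time of job j on site s (hypothetical numbers).
    est = {"j1": {"siteA": 4, "siteB": 6},
           "j2": {"siteA": 5, "siteB": 3},
           "j3": {"siteA": 7, "siteB": 4}}

    # Policy constraint: at most quota[s] of these jobs may be placed on site s.
    quota = {"siteA": 2, "siteB": 2}

    def solve():
        best_cost, best_assign = float("inf"), None
        # Each candidate assigns one site per job (i.e., x[j][s] = 1 for that site).
        for choice in product(sites, repeat=len(jobs)):
            load = {s: 0 for s in sites}
            for s in choice:
                load[s] += 1
            if any(load[s] > quota[s] for s in sites):
                continue  # violates the resource-usage policy constraint
            cost = sum(est[j][s] for j, s in zip(jobs, choice))
            if cost < best_cost:
                best_cost, best_assign = cost, dict(zip(jobs, choice))
        return best_assign, best_cost

    if __name__ == "__main__":
        assignment, cost = solve()
        print(assignment, cost)

In practice the enumeration is replaced by a BIP solver, but the structure of the problem, binary placement variables, a cost objective, and policy constraints on resource usage, is the same.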

The scheduling on heterogeneous resources: The algorithm uses a novel

mechanism to handle different computation times of a job on various resources. The

algorithm iteratively modifies resource allocation decisions for better scheduling based

on different computation times instead of taking a mean value over the time. This

approach has also been applied in iterative list scheduling [54].

Dynamic scheduling: In order to handle the dynamically changing grid

environments, the algorithm uses a dynamic scheduling scheme rather than a static

scheduling approach. A scheduling module makes the resource allocation decision to a

set of schedulable jobs. The status of a job is defined as schedulable when it satisfies the

following two conditions.

* Precedence constraint: all the preceding jobs are finished, and the input data of the
job is available locally or remotely.
* Scheduling priority constraint: a job is considered to have higher priority than
others when it is critical to completing the whole workflow with a better
completion time.

Future scheduling: Resource allocation to a schedulable job impacts the workload

on the selected resource. It also affects scheduling decisions of future schedulable jobs.

The algorithm pre-schedules all the unready jobs to detect the impact of the current

decision on the total workflow completion time.
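
A minimal sketch of this look-ahead idea follows: each candidate site for a ready job is evaluated by tentatively pre-scheduling the remaining, not-yet-ready jobs with a simple greedy rule and comparing the estimated workflow finish times. The example DAG, run-time estimates, and greedy pre-scheduling rule are assumptions made for illustration, not the exact procedure used by the proposed algorithm.

    # Simplified look-ahead ("future scheduling") sketch: tentatively place a ready
    # job on each site, greedily pre-schedule the not-yet-ready jobs, and keep the
    # placement with the smallest estimated workflow finish time.
    est = {                     # est[job][site]: estimated run time (hypothetical)
        "a": {"s1": 3, "s2": 5},
        "b": {"s1": 6, "s2": 4},
        "c": {"s1": 2, "s2": 2},
    }
    deps = {"a": [], "b": ["a"], "c": ["a", "b"]}   # c runs after a and b

    def finish_time(order, placement):
        """Estimated finish time of the workflow if the jobs run in the given
        (topological) order on the sites chosen in `placement`."""
        site_free, done = {}, {}
        for job in order:
            site = placement[job]
            start = max([site_free.get(site, 0)] + [done[d] for d in deps[job]])
            done[job] = start + est[job][site]
            site_free[site] = done[job]
        return max(done.values())

    def schedule_ready_job(job, unready):
        """Pick a site for the ready `job` by tentatively pre-scheduling the unready
        jobs (greedily, on their individually fastest site) for each candidate."""
        best_site, best_makespan = None, float("inf")
        for site in est[job]:
            placement = {job: site}
            for r in unready:
                placement[r] = min(est[r], key=est[r].get)
            makespan = finish_time([job] + unready, placement)
            if makespan < best_makespan:
                best_site, best_makespan = site, makespan
        return best_site, best_makespan

    if __name__ == "__main__":
        # Job "a" is the only schedulable job; "b" and "c" are pre-scheduled to
        # estimate the impact of each choice on the workflow completion time.
        print(schedule_ready_job("a", ["b", "c"]))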

When the scheduling algorithm is integrated with the SPHINX scheduling

middleware, it performs efficient scheduling in the policy-constrained grid environment.

The performance is demonstrated in the experimental section.












Table 1-1. The existing scheduling systems and the scheduling properties. This table
shows the conventional scheduling systems and the scheduling properties they
provide. The mark "v" means that a system has the corresponding property.
Systems       Adaptive     Co-          Fault-     Policy-   QoS       Flexible
              scheduling   allocation   tolerant   based     support   interface
Nimrod-G      v v
Maui/Silver   v v v v
PBS           v v
EZ-Grid       v v v
Prophet       v
LSF           v v v v

Scheduling Systems

In this section we present the currently existing scheduling systems and their

properties. Table 1-1 lists a set of scheduling systems and indicates which properties

each one provides; the mark "v" means that a system has the corresponding property. In

the table we specify a set of system properties. With adaptive scheduling we mean that

the resource allocation decision is not finalized until the real job submission happens.

The scheduling decision will change based on resource status and availability after the

initial decision is made. Co-allocation means that a request may be allocated with several

different resources. A real application requires different kinds of resources, such as CPU

and storage. A co-allocation-supporting scheduler allocates the required resources to the

job. Fault-tolerant scheduling means that a job is rescheduled after its execution failure

on a remote resource. In a dynamic grid environment execution failures are likely to

happen fairly often. The scheduling system is required to monitor job execution and

reschedule it. Policy-based scheduling supports the heterogeneous resource ownership in

grid computing. This topic is discussed in detail in the following section. Quality of

service (QOS) is specified with deadlines and other application requirements. A









scheduling system should make resource allocation decisions that take these requirements into account.

An ideal system provides a flexible interface to other modules such as monitoring and

scheduling to allow the replacement of existing modules in the system with other

customized modules.

The Portable Batch System (PBS)

Portable Batch System is a batch job and computer system resource management

package designed to support queuing and execution of batch jobs on heterogeneous

clusters of resources. PBS offers several scheduling systems to support various resource

allocation methods, such as Round Robin, First In First Out (FIFO), Load Balancing,

Priority-based and Dedicated Times [15]. The PBS configuration consists of several

modules, the PBS client, server, scheduler and job execution clusters which run the PBS

MOM daemon. In the PBS system a job is submitted along with a resource specification

on one of the front-ends, handed to the server, scheduled, run by the MOMs in the

execution clusters, and has output placed back on the front end [16]. PBS works quite

well for handling batch processing.

However, as mentioned in the previous section, grid computing requires much

more delicate resource management and refined request scheduling in a dynamically

changing heterogeneous environment. The proposed resource allocation strategy

achieves solutions to the issues by importing the concept of policy- and reservation-based

scheduling for Quality of Service (QOS). The proposed scheduler also supports fully

interactive request submissions for negotiating the level of QOS requirements according

to the current and estimated near future grid weather after the user makes a submission.









Maui

Maui is an advanced job scheduler for use on clusters and supercomputers. It is an

optimized and configurable tool capable of supporting a large array of scheduling

policies, dynamic priorities and extensive reservations. The Maui scheduler can act as a

"policy engine," which allows site administrators control over when and how resources

are allocated to jobs [17]. The policies serve to control how and when jobs start. They

include job prioritization, fairness policies and scheduling policies. The Quality of

Service (QOS) feature allows a site to grant special privileges to particular users by

providing additional resources, access to special capabilities and improved j ob

prioritization. Maui also provides an advanced reservation infrastructure allowing sites to

control exactly when, how and by whom resources are used. Every reservation consists

of three major components, a set of resources, a timeframe and an access control list.

The scheduler makes certain that the access control list is not violated during the

reservation's timeframe on the resources listed [18,19]. Even though Maui is a highly

optimized and configurable scheduler capable of supporting scheduling policies,

extensive reservations and dynamic priorities, it has limitations in scheduling distributed

workloads to be executed across independent resources in a grid.

A grid scheduling system must support global optimization in addition to a local

best scheduling. The proposed scheduling framework supports the global optimization as

well as local best fit by considering resource usage reservations and QOS requirements in

the scheduling. The hierarchical architecture view of a grid in policy enforcement makes

it possible to implement extensive and scalable resource allocation in the proposed

scheduler.









LSF

LSF is a suite of application resource management products that schedule, monitor,

and analyze the workload for a network of computers. LSF supports sequential and

parallel applications running as interactive and batch jobs. The LSF package includes

LSF Batch, LSF JobScheduler, LSF MultiCluster, LSF Make and LSF Analyzer all

running on top of the LSF Base system. LSF is a loosely coupled cluster solution for

heterogeneous systems. There are several scheduling strategies available in LSF. They

include Job Priority Based Scheduling, Deadline Constraints Scheduling, Exclusive

Scheduling, Preemptive Scheduling and Fairshare Scheduling. Multiple LSF scheduling

policies can co-exist in the same system [20].

Even though LSF supports several different scheduling strategies, most of them do

not provide enough ways for users to specify requirements and preferences in resource

allocation. The proposed scheduling strategy supports user interaction in resource

allocation decisions by allowing QOS specification.

EZ-Grid

EZ-Grid is used to promote efficient job execution and controlled resource sharing

across sites. It provides the policy framework to help resource providers and

administrators enforce fine-grained usage policies based on authorization for the use of

their resources [21]. The framework automates policy-based authorization or access

control and accounting for job execution in computing grids.

A major difference between the policy engine and our proposed framework is that

our framework utilizes a hierarchically defined policy along three dimensions consisting

of resource providers, request submitters and time, and uses submitters' Quality of

Service requirements for resource allocation.









Resource Broker

The Resource Broker (RB) from the European Data Grid project provides a

matchmaking service for individual jobs: given a job description file, it finds the

resources that best match the users' request [22, 23]. The RB makes the scheduling

decision based only on the information of individual user authentication and individual

job execution requirements.

Current plans suggest supporting different scheduling strategies. Our work goes

beyond this by specifically accounting for VO policy constraints and VO-wide

optimization of throughput via constraint matchmaking and intelligent scheduling

algorithms. In addition, the proposed scheduler is designed to provide estimates of

execution time so that the user may determine if a request fits within the user's deadlines.

Finally, by considering the DAG as a whole, the middleware will be able to intelligently

pre-determine any necessary data staging.

Pegasus

Pegasus [24] is a configurable system that can map and execute DAGs on a grid.

Currently, Pegasus has two configurations. The first is integrated with the Chimera

Virtual Data System. The Pegasus system receives an abstract DAG file from Chimera.

Pegasus uses these dependencies to develop a concrete DAG by making use of two

catalogs, the replica catalog that provides a list of existing data components, and a

transformation catalog that stores a list of available executable components. With

information from these catalogs, the Pegasus system maps the input abstract job

descriptions onto grid resources. Then it adds additional jobs to provide the necessary

data movement between dependent jobs. This final concrete plan is submitted to the grid

execution system, DAGMan, which manages its execution.









In its second configuration, the Pegasus system performs both the abstract and

concrete planning simultaneously and independently of Chimera. This use of Pegasus

takes a metadata description of the user's required output products. It then uses AI

planning techniques to choose a series of data movement and job execution stages that

aims to optimally produce the desired output. The result of the AI planning process is a

concrete plan (similar to the concrete plan in the first configuration) that is submitted to

DAGMan for execution.

The framework presented in this document is distinct from the Pegasus work in

many ways. For example, instead of optimizing plans benefiting individual users, the

proposed framework, SPHINX allows for globally optimized plans benefiting the VO as

a whole. In addition, Pegasus currently provides advanced forward planning of static

workflows. The work presented in this document is designed to dynamically plan

workflows by modifying groups of jobs within a DAG (sub-DAGs) and, depending on

the nature of the grid, controlling the release of those sub-DAGs to execution systems

such as Condor-G/DAGMan.

SPHINX is meant to be inherently customizable, serving as a modular "workbench"

for CS researchers, a platform for easily exchanging planning modules and integrating

diverse middleware technology. As a result, by including Pegasus planning modules in

the SPHINX server, the resulting scheduling system would be enhanced by taking full

advantage of knowledge management and AI planning, provided by Pegasus, while

providing the flexible dynamic workflow and just-in-time job planning provided by

SPHINX.









Condor

The proposed scheduling system, SPHINX utilizes the stable execution control and

maintenance provided by the Condor system [25, 26]. The Condor Team continues to

develop Condor-G and DAGMan. Recently, to improve its just-in-time planning ability,

DAGMan has been extended to provide a call-out to a customizable, external procedure

just before job execution. This call-out functionality allows a remote procedure to

modify the job description file and alter where and how the job will be executed.

SPHINX envisages using the call-out feature in DAGMan for just-in-time error recovery

and corrective just-in-time planning. However, as DAGMan and SPHINX increase in

functionality, DAGMan itself could become a scheduling client and communicate

through this and other callouts to the scheduling server directly.

PROPHET

Prophet is a system that automatically schedules data parallel Single Process

Multiple Data (SPMD) programs in workstation networks [27]. In particular, Prophet

uses application and resource information to select the appropriate type and number of

workstations, divide the application into component tasks, distribute data across

workstations, and assign tasks to workstations. To this end, Prophet automates the

scheduling process for SPMD applications to obtain reduced completion time. In

addition, Prophet uses network resource information and application information to guide

the scheduling process. Finally, Prophet is unique in that it addresses the problems of

workstation selection, partitioning and placement together. The SPHINX system provides

functionality for scheduling jobs from multiple users concurrently based on the policy

and priorities of these jobs in a dynamically changing resource environment.









Data Analysis Systems

There are many other international Grid projects underway in other scientific

communities. These can be categorized as integrated Grid systems, core and user-level

middleware, and application-driven efforts. Some of these are customized for the special

requirements of the HEP community. Others do not accommodate the data intensive

nature of the HEP Grids and focus upon the computational aspect of Grid computing.

EGEE [28] middleware, called gLite [29], is a service-oriented architecture. The

gLite Grid services aim to facilitate interoperability among Grid services and frameworks

like JClarens and allow compliance with standards, such as OGSA [30], which are also

based on the SOA principles.

Globus [31] provides a software infrastructure that enables applications to handle

distributed heterogeneous computing resources as a single virtual machine. Globus

provides basic services and capabilities that are required to construct a computational

Grid. Globus is constructed as a layered architecture upon which the higher-level

JClarens Grid services can be built.

Legion [32] is an object-based "meta-system" that provides a software

infrastructure so that a system of heterogeneous, geographically distributed, high-

performance machines can interact seamlessly. Several of the aims and goals of both

projects are similar, but compared to JClarens the set of methods of an object in Legion

are described using Interface Definition Language.

The Gridbus [32] toolkit project is engaged in the design and development of

cluster and Grid middleware technologies for service-oriented computing. It uses Globus

libraries and is aimed at data intensive sciences; these features make Gridbus

conceptually equivalent to JClarens.










UNICORE [33] provides a uniform interface for job preparation, and seamless and

secure access to computing resources. Distributed applications within UNICORE are

defined as multipart applications where the different parts may run on different computer

systems asynchronously like the GAE services, or they can be sequentially synchronized.

NASA's IPG [34] is a network of high performance computers, data storage

devices, scientific instruments, and advanced user interfaces. Due to its data-centric

nature and OGSA compliance, IPG services can potentially interoperate with GAE

services.

WebFlow [35], a framework for wide-area distributed computing, is based on a

mesh of Java-enhanced Apache web servers, running servlets that manage and coordinate

distributed computation, and it is architecturally closer to JClarens.

The NetSolve [36] system is based around loosely coupled, distributed systems,

connected via a LAN or WAN. NetSolve clients can be written in multiple languages, as

in JClarens, and the server can use any scientific package to provide its computational

software.

The Gateway system offers a programming paradigm implemented over a virtual

web of accessible resources [37]. Although it provides portal behavior like JClarens and

is based on SOA, its design is not intended to support data intensive applications.

The GridLab project [38] will produce a set of Grid services and toolkits providing

capabilities such as dynamic resource brokering, monitoring, data management, security,

information, adaptive services and more. GAE Services can access and interoperate with

GridLab services due to its SOA based nature.









The Internet computing projects, such as SETI@Home [39] and Distributed.Net

[40], which build Grids by linking many end-user PCs across the internet, are primarily

number crunching projects that lack the large data management features of HEP Grids.

The Open Grid Services Architecture (OGSA) framework, the Globus-IBM vision

for the convergence of web services and Grid computing, has been taken over by Web

Services Resource Framework (WSRF) [41]. WSRF is inspired by the work of the Global

Grid Forum's Open Grid Services Infrastructure (OGSI) [42]. The developers of the

Clarens Web Services Framework are closely following these developments.

Scheduling Algorithms

This section discusses several existing scheduling algorithms. Although the

referenced algorithms work well in the traditional high performance computing

environment, they do not perform in a satisfactory manner with the characteristics of

grids discussed in the previous section.

Iterative List Scheduling

This work [54] introduces an iterative list-scheduling algorithm to deal with

scheduling on heterogeneous computing systems. The main idea in this iterative

scheduling algorithm is to improve the quality of the schedule in an iterative manner

using results from previous iterations. Although the algorithm can potentially produce a

shorter schedule length, it does not support resource usage policies. It is a static

scheduling algorithm, which assumes an unchanged or stable computing environment. In

the dynamic and policy-constrained grid computing environment the algorithm may not

perform well, as the simulation results in this dissertation show.
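
A rough sketch of the iterative-improvement idea (not the exact algorithm of [54]) is shown below: a simple list scheduler is run repeatedly, and each pass re-ranks tasks using the processor-specific execution times implied by the previous pass's placement, stopping when the makespan no longer improves. The task graph and timing numbers are hypothetical.

    # Rough sketch of iterative list scheduling: rerun a greedy list scheduler,
    # feeding each pass the placement found by the previous pass, until the
    # schedule length stops shrinking.  All numbers are made up for illustration.
    w = {"t1": {"p1": 4, "p2": 8},      # w[task][proc]: execution time estimate
         "t2": {"p1": 6, "p2": 3},
         "t3": {"p1": 5, "p2": 5}}
    preds = {"t1": [], "t2": ["t1"], "t3": ["t1", "t2"]}
    tasks = ["t1", "t2", "t3"]

    def list_schedule(priority):
        """Greedy list scheduling: repeatedly pick the highest-priority ready task
        and place it on the processor giving the earliest finish time."""
        proc_free, finish, placement = {}, {}, {}
        unscheduled = set(tasks)
        while unscheduled:
            ready = [t for t in unscheduled if all(p in finish for p in preds[t])]
            task = max(ready, key=lambda t: priority[t])
            data_ready = max([finish[p] for p in preds[task]] + [0])
            proc = min(w[task],
                       key=lambda p: max(data_ready, proc_free.get(p, 0)) + w[task][p])
            start = max(data_ready, proc_free.get(proc, 0))
            finish[task] = start + w[task][proc]
            proc_free[proc] = finish[task]
            placement[task] = proc
            unscheduled.remove(task)
        return placement, max(finish.values())

    def iterative_list_schedule():
        # First pass: rank tasks by their mean execution time over all processors.
        priority = {t: sum(w[t].values()) / len(w[t]) for t in tasks}
        placement, makespan = list_schedule(priority)
        while True:
            # Next pass: rank tasks by the time on the processor chosen last time.
            priority = {t: w[t][placement[t]] for t in tasks}
            new_placement, new_makespan = list_schedule(priority)
            if new_makespan >= makespan:    # stop when no further improvement
                return placement, makespan
            placement, makespan = new_placement, new_makespan

    if __name__ == "__main__":
        print(iterative_list_schedule())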









Dynamic Critical Path Scheduling

Authors [55] propose a static scheduling algorithm for allocating task graphs to

fully connected multiprocessors. It minimizes make-span subject to precedence

constraint, which is determined by the critical path of the task graph. The homogeneous

CPU-based scheduling algorithm assumes that the scheduler could manage the

scheduling priority of jobs in a processor. This may not be true in a grid environment in

which the resources have decentralized ownership and different local scheduling policies

dependent on their VO.

Reliability Cost Driven Scheduling

The work [56] describes a two-phase scheme to determine a scheduling of tasks

with precedence constraints that employs a reliability measure as one of the objectives in

a real-time and heterogeneous distributed system. The static algorithm schedules real-

time tasks for maximized reliability. The utility function of the algorithm finds a

processor with the earliest start time for jobs in an application. In the presence of the

policy constraint the algorithm may not be able to find the proper resource allocation to

the application.

Heterogeneous Earliest Finish Time Scheduling

The algorithm [57] focuses on the appropriate selection of the weight for the nodes

and edges of a directed acyclic graph, and experiments with a number of different

schemes for computing these weights. The proposed technique uses the mean value

approach to find the length of the produced schedule. Instead of the mean value

approach, our proposed algorithm uses an iterative approach over heterogeneous resources.

The experimental results compare the two schemes. The off-line and priority-based

scheduling may not be feasible in the grid-computing environment.
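
For contrast with the iterative approach, the mean-value weighting can be sketched as follows: each task's weight is its execution time averaged over the available processors, and its upward rank is that weight plus the largest rank among its successors (communication costs are omitted here for brevity). The DAG and timings are hypothetical, and the sketch is not the exact rank function studied in [57].

    # Mean-value upward rank, sketched for a tiny hypothetical DAG.  Each task
    # weight is the mean of its execution times over the processors; communication
    # costs are omitted for brevity.
    from functools import lru_cache

    w = {"a": {"p1": 4, "p2": 8},   # w[task][proc]: execution time estimate
         "b": {"p1": 6, "p2": 2},
         "c": {"p1": 3, "p2": 5}}
    succs = {"a": ["b", "c"], "b": [], "c": []}

    def mean_weight(task):
        times = w[task].values()
        return sum(times) / len(times)

    @lru_cache(maxsize=None)
    def upward_rank(task):
        # Rank = the task's mean weight + the largest rank among its successors;
        # a higher rank means the task is scheduled earlier.
        return mean_weight(task) + max((upward_rank(s) for s in succs[task]), default=0)

    if __name__ == "__main__":
        for t in sorted(w, key=upward_rank, reverse=True):
            print(t, upward_rank(t))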










Dynamic Level Scheduling

This scheduling algorithm [58] matches a job and a processor in an exhaustive way.

The job is on the critical path of a directed acyclic graph (DAG), and the job starts on the

processor at the earliest time. The algorithm uses the mean value approach on a

heterogeneous CPU resource environment. The static and mean value-based scheduling

may not produce a good scheduling result in policy-based grid computing.

Optimal Assignment with Sequential Search

The authors [59] describe two algorithms based on the A* technique. The first is a

sequential algorithm that reduces the search space. The second lowers the time

complexity by running the assignment algorithm in parallel, and achieves significant

speedup. The exhaustive and sequential search for the optimal assignment may not be

feasible for a large tree search space even though their modified algorithm generates

random solutions and prunes the tree. Our proposed algorithm performs optimal

assignment in a different scheme. We utilize a sub-tree and iterative concepts instead of

considering the whole tree and all heterogeneous resources.

Contributions

In this section we describe the main contributions of our research to scheduling

systems and techniques in Grid computing.

*Policy-driven request planning and scheduling of computational resources: We
define and implement the mechanisms for representing and enforcing both local
and global policy constraints. A grid scheduler needs to be able to
allocate the resources to requests in a fashion that satisfies global and local
constraints. Global constraints include community-wide policies governing how
resources dedicated to a particular collaboration should be prioritized and allocated;
local constraints include site-specific policies governing when external users can
use local resources. We develop mechanisms for representing and enforcing
constraints. We also develop policy-aware scheduling middleware and algorithms.
We develop a novel framework for policy based scheduling in resource allocation
of grid computing. The framework has several features. First, the scheduling









strategy can control the request assignment to grid resources by adjusting resource
usage accounts or request priorities. Second, efficient resource usage management
is achieved by assigning usage quotas to intended users. Third, the scheduling
method supports reservation-based grid resource allocation. Fourth, the Quality of
Service (QOS) feature allows special privileges to various classes of requests,
users, groups, etc.

* A fault-tolerant scheduling system in a dynamic Grid environment: Two major
characteristics of Grid computing are decentralized resource ownership and the
dynamic availability of resources. Shared by multiple Virtual Organizations, a
Grid is meant to be an environment in which organizations mutually share
resources. The composition of the grid is not homogeneous, to say the
least. This makes it difficult to guarantee expected performance in any given
execution environment. Another important factor is that, due to the dynamic
availability of resources, the presence of 'unplanned downtimes' of certain
resources in the Grid makes scheduling decisions non-trivial, as a job planned on a
site may never complete. These reasons make it very cumbersome for an
application user to effectively use a grid. An application user usually throttles the
jobs across the grid. The decision of how many jobs to send to a site is usually
based on some static information like the number of CPUs available on the sites.
However, the site with more CPUs might already be overloaded, or this particular
production manager (or his VO proxy, to be precise) might have a relegated
priority at that remote site. As a result jobs might get delayed or even fail to
execute. In such events, the application user has to re-submit the failed jobs. An
experienced user may rely on his/her past experience and submit jobs to sites which
have been more reliable in the past. However, the site which was working well
before may have its own priority work to be done this time, thus temporarily
relegating this user's priority. The proposed scheduling middleware, named
SPHINX, will effectively overcome the highly dynamic nature of the grid to
harness grid resources. The system is equipped with an advanced job execution
tracking module and is integrated with other monitoring systems to maintain
information on data and resource availability.

* Scheduling workflow model: An application scientist typically solves his problem
as a series of transformations. Each transformation may require one or more inputs
and may generate one or more outputs. The inputs and outputs are predominantly
files. The sequence of transformations required to solve a problem can be
effectively modeled as a Directed Acyclic Graph (DAG) for many practical
applications of interest that the proposal is targeting. Most existing scheduling
systems assume that a user application consists of a single job. It means that there
is no precedence relationship within a workflow. In the scheduling system and
algorithm development we use different types of workflows and resources. The
research focuses on single task workflows initially. They are the simplest
workflows that are interactive or batch type. Then, we extend the research to
simple DAG-based workflows, which will reflect a variety of real world scenarios.
Finally, our scheduling algorithms deal with multiple DAGs in achieving obj ective
functions simultaneously. The experiments of the algorithms are performed in









heterogeneous resource environment. A resource type such as a CPU has a set of
resources with different performance.

* In order to test and evaluate the proposed data analysis prototype, we deploy the framework across a grid testbed named Grid2003/Open Science Grid, which consists of more than 25 sites providing more than 2000 CPUs, and exhibit remote data access, workflow generation and collaborative data analysis using virtual data and data provenance, as well as non-trivial examples of policy-based scheduling of requests in a resource-constrained grid environment.

Outline

The dissertation is organized in the following manner. We discuss policy-based scheduling for single applications in Chapter 2. The chapter presents a novel policy-based scheduling framework for obtaining a sub-optimal scheduling solution on Grid resources. Specifically, we also discuss resource allocation in the multi-dimensional policy space. We discuss the workflow-centric scheduling middleware, SPHINX, with simulation and experiment results in Chapter 3. In that chapter we introduce the core features of the SPHINX architecture, and discuss distributed data analysis utilizing SPHINX and fault-tolerant scheduling. It also discusses incorporating the scheduling system with other services such as JClarens and MonALISA. In Chapter 4 we present policy-based scheduling algorithms and supporting infrastructures for scheduling single and multiple workflows. We discuss scheduling techniques with awareness of given resource usage constraints and completion deadlines. We present simulation and experiment results on the Open Science Grid (OSG). We conclude the dissertation in Chapter 5 with future research plans.















CHAPTER 2
GRID POLICY FRAMEWORK

In this chapter we discuss a novel framework for policy-based scheduling in resource allocation for grid computing. The framework has several features. First, the scheduling strategy can control request assignment to grid resources by adjusting resource usage accounts or request priorities. Second, efficient resource usage management is achieved by assigning usage quotas to intended users. Third, the scheduling method supports reservation-based grid resource allocation. Fourth, a Quality of Service (QoS) feature allows special privileges to various classes of requests, users, groups, etc. This framework is incorporated as part of the SPHINX scheduling system that is discussed in the next chapter. A set of experimental results is provided to demonstrate the usefulness of the framework.

Petascale Virtual Data Grids (PVDGs) will need to be able to satisfy all job requests by allocating storage, computer and network resources in a fashion that satisfies both global and local constraints. Global constraints include community-wide policies governing how resources should be prioritized and allocated. Local constraints include site-specific control over when external users are allowed to use local resources. As such, PVDG computing requires mechanisms for representing and enforcing policy-based scheduling techniques.

Policies, including authentication, authorization, and application constraints, are important factors for maintaining resource ownership and security. The set of possible constraints on job execution can be varied, and can change significantly over time. These constraints may include different values for each job, for example RAM requirements or connectivity needed, or constraints that are static for a specific job type, such as the operating system or architecture. Policies may include any information that should be specified to ensure that a job is matched to appropriate resources [60, 61].

The proposed policy-based scheduling framework can achieve local and global optimisation in resource allocation by providing the following features:

1. Control the request assignment to grid resources by adjusting resource usage accounts or request priorities. This feature allows the scheduler to balance workloads across resources in a grid, resulting in better overall utilization and turnaround time.

2. Support reservation-based grid resource allocation. The scheduling concept makes it possible for the scheduler to assign multiple dependent requests in an optimally synchronized manner. To support this feature, the resources that participate in the scheduling should allow requests to be executed with a reserved amount of resource usage in a specific time. In addition to the resource-side reservation, the request must be completed within the reservation in terms of usage duration and amount.

3. Allow variable Quality of Service (QoS) privileges to various classes of requests, users, groups, etc. Beyond a basic level of QoS, a request submitter can make a specific QoS request according to the privileges assigned to the class to which the submitter or the request belongs. The request submitter passes a request to the scheduler with the QoS specification. Resource allocation considering the QoS feature should be interactive between the submitter and the scheduler; they communicate with each other to adjust the QoS requirements.

Managing resource usage accounts prevents resource cycle waste and resource over-usage. Resource providers assign usage quotas to intended resource users. The scheduler monitors resource usage by keeping track of quota changes. It saves resource cycles by assigning more requests to idle resources, while reducing the number of requests assigned to over-used resources.










Key Features

The proposed policy-based scheduling framework can achieve local and global optimization in resource allocation by providing the following features.

Control the request assignment to grid resources by adjusting resource usage accounts or request priorities. This feature allows the scheduler to balance workloads across resources in a grid, resulting in better overall utilization and turnaround time. Heterogeneous grid resources have different capabilities for executing submitted requests. Statically adjusting the resource usage quotas of users can support workload balance across the resources. A scheduler also considers the dynamically changing workload on each machine when it makes a resource allocation decision. The updated workload information is available from a resource-monitoring module in real time. The proposed framework incorporates the quota and monitoring information into the resource allocation decision. The information is formulated in LP functions. We will discuss this feature in detail in the next section.

Support reservation-based grid resource allocation. The suggested framework makes it possible for a scheduler to assign multiple dependent requests to resources in an optimally synchronized manner. This feature is made possible by reservation-driven scheduling. To support the feature, the resources available to the scheduler should guarantee that requests are executed with a reserved amount of resource usage in a specific time period. The feature also supports global Quality of Service (QoS) satisfaction for a workload consisting of several dependent tasks. In order to make an advanced reservation, a scheduler needs to estimate the request execution time and resource usage amount. A user can provide this information when she/he submits a request. Alternatively, a scheduler can make the estimation based on functions of a prediction module such as benchmarking or historical execution records. In the proposed framework, optimisation functions make the resource allocation considering the advanced reservation along with policy and anticipated execution information.

Allow variable Quality of Service (QoS) privileges to various classes of requests, users, groups, etc. A user submits a request with its QoS requirement or preference. The QoS information may contain reservation-specific requirements such as the amount of resource (CPU, storage, etc.) and a period of time. A user can also require a specific deadline for request execution, or best-effort execution. The QoS privileges differ among the user classes; a class of users may have privileges for requiring more QoS than other groups do. A scheduler considering the QoS feature should interact with a request submitter to adjust the QoS achievement level. In the proposed system we assume that users specify the amount of required resource usage. The framework makes scheduling decisions using optimisation functions for satisfying the given QoS requirements and policy constraints.

Support interactive scheduling with request submitters. A scheduler interacts with the submitters in order to negotiate the QoS achievement level. In cases where the scheduler cannot achieve the service level that a submitter specifies, or where the scheduler can suggest a different QoS specification for better performance of request execution, the scheduler sends the submitter alternatives to the original QoS requirement. After receiving the alternatives, the submitter considers them and sends her/his decision back to the scheduler. The decision may include accepting or denying the alternatives, or requesting another scheduling attempt with a different QoS requirement.










Managing resource usage accounts prevents resource cycle waste and resource over-usage. In addition to workload balancing, the resource management system can control resource usage by adjusting the usage quotas of intended resource users. The scheduler monitors resource usage by keeping track of quota changes. It saves resource cycles by assigning more requests to idle resources, while reducing the number of requests assigned to over-used resources.

Policy Space

Grid computing requires collaborative resource sharing within a Virtual Organization (VO) and between different VOs. Resource providers and request submitters who participate within a VO share resources by defining how resource usage takes place in terms of where, what, who, and when it is allowed. Accordingly, we assume that policies may be represented in a three (plus one) dimensional space consisting of resource provider (and property), request submitter, and time. We further assume that quota limits are descriptive enough to express how a resource may be used.

By exploiting the relational character of policy attributes, a policy description space may be conveniently represented as a hierarchical tree. Indeed, the heterogeneity of the underlying systems, the difficulty of obtaining and maintaining global state information, and the complexity of the overall task all suggest a hierarchical approach to resource allocation. Further, such a hierarchical approach allows for a dynamic and flexible environment in which to administer policies.

Three of the dimensions in the policy space, consisting of resource provider, request submitter and time, are modelled as hierarchical categorical policy attributes expressed in terms of quotas. An extra dimension, resource property, is modelled as a simple categorical attribute attached to the hierarchical resource provider dimension. Administrators, resource providers or requesters who participate in a VO then define resource usage policies (in terms of quotas) involving various levels of this hierarchical space. In the following sections we discuss the hierarchical representation of each dimension.

Resource Provider

A resource provider is defined as an entity which shares some particular physical resource within the context of a grid. Physical resources participating in a grid are often naturally organised into hierarchical groups consisting of both physical and logical views. In the example shown in Figure 2-1, a hypothetical grid consists of many domains, each containing one or many clusters of machines. The representation in Sphinx is generalised such that arbitrary levels in the resource provider hierarchy may be added to maintain scalability. For example, the resource provider domain level might represent a set of gatekeepers at particular sites, or it might represent a sub-grid, itself containing local domains. It is common to assume that each compute-cluster has a local resource scheduling system (such as Condor, PBS, or LSF, to name a few) to serve remote requests, but there may be compute-clusters which do expose individual machines to the grid for direct access. The hierarchical dimension (categorical tree) representing resource providers is given the symbol R_{H,provider}.

Sphinx performs global scheduling to allocate remote resources for requests across an entire grid. As such, Sphinx specifies resource allocation down to the resource provider level at which either a local resource scheduler or a sub-grid scheduler exists.

Resource Property

Each resource provider is augmented by a list of non-hierarchical properties, such as CPU, memory, storage space, bandwidth, etc. At any level in the resource provider hierarchy, quota limits for each resource property are appropriately aggregated. The categorical list representing resource properties is given the symbol R_property.


Grid (Level 1) VO
Domain (Level 2) Group
Cluster (Level 3) User
Machine (Level 4) Job

Figure 2-1. Examples of resource provider and request submitter hierarchies

Request Submitter

A request submitter is defined as an entity which consumes resources within the context of a grid. As in the case of resource providers, request submitters are naturally structured in a hierarchical manner. In the example shown in Figure 2-1, a typical VO might be described by several groups of users, each with possible proxies (e.g. jobs) acting on their behalf. Again, the representation in Sphinx is generalised so that arbitrary levels in the request submitter hierarchy may be added to maintain scalability or convenient logical views of a VO. For instance, a group may consist of individual grid users (each with a particular certificate) or other sub-groups, each containing sub-sub-groups or users. In analogy with the resource provider hierarchy, the deepest levels in the request submitter hierarchy represent entities possessing a grid certificate or proxy. In general, a request for grid resources may be submitted (by a sanctioned user with a certificate) from any level. This enables the pooling of account quotas to service high-priority requests emanating from any particular level in the request submitter hierarchy (such as a group or even the VO itself). The hierarchical dimension (categorical tree) representing request submitters is given the symbol S_H.

Time

Time is used to allow for possible dynamism in policies in which, for example, maximum quota limits may change in a well- and pre-defined way (e.g. periodically) for request submitters. In addition, in order to provide a quality of service, the scheduler will need to plan and test, at various time frames in the future, possible resource allocation and de-allocation strategies. From a scheduling point of view, recent times are more important than times in the far future. Hence, Sphinx models the hierarchical nature of time by forming time frames using an adaptive logarithmic mesh, allowing for a more finely or coarsely grained description of policies, depending on present or future scheduling requirements. The categorical list representing time is given the symbol T.

Three-dimensional Policy Space

As the policy space of quota limits defined above is comprised of categorical trees or lists, we construct a hierarchical policy tensor by taking the triple cross product of each hierarchical dimension:

Q_H = R_H x S_H x T

where R_H = R_{H,provider} x R_property represents resource providers (with property attributes), S_H represents request submitters, and T represents time. Each element of Q_H represents an allotted quota limit. There are several ways to extract meaningful policy views of such a space. For example, Q_H represents a "volume" of all possible quotas from all hierarchical levels of providers applied to all hierarchical levels of submitters and using all time scales. However, in the context of scheduling, a description of policies corresponding to particular aggregate levels of providers, submitters, and times is often more germane than using all possible hierarchical views. One simple example is that of a grid site (resource provider) which routes all incoming jobs to a compute-cluster via a local scheduler. In such a situation, enforcing fine-grained policies for each machine is not possible using a grid-wide scheduler (such as Sphinx), but enforcing site-aggregate policies is possible.

Hence, one is more often interested in a particular view which collapses parts of the hierarchical policy space "volume" onto a sub-space, or aggregate "surface," containing high-level or fine-grained policy descriptions, depending on the need. In this spirit, we first determine the tree level of interest for each branch (thereby defining leaf-nodes of differing granularity) of each dimension and then form the resulting triple cross product of leaf-nodes:

Q_L = R_L x S_L x T_L

Such a construction of Q_L, which is not unique, essentially allows for a flexible and efficient description of global and local policies. Various policy definitions are possible from the combination of leaf nodes at different levels of the Resource, Entity and Time policy hierarchy trees. Possible combinations from the trees make hierarchical policies in the three dimensions: Resource, Entity and Time. From the Figure 2-2 hierarchy trees, the combination (u2, r2, t3) means that there is a policy for User_1 on Cluster_1 in the time unit of Week. For example, User_1 can use one week of CPU time of machines on Cluster_1 during the month of July. (u5, r1, t2) means that Group_2 can use resources of Domain_1 for three months in a year. (u1, r5, t1) and (u1, r1, t2) define policies for VO_1 on Domain_1 and Domain_2 of Grid_1 in the time units of year and month, respectively.










Figure 2-2. Hierarchical policy definition example, showing a resource policy hierarchy (Grid_1; Domain_1 (r1), Domain_2 (r5), Domain_3; Cluster_1 (r2), Cluster_2; Machine_1 (r3), Machine_2 (r4)), a time policy hierarchy (Year (t1), Month (t2), Week (t3), Day (t4)), and a user policy hierarchy (VO_1 (u1), VO_2; Group_1, Group_2 (u5); User_1 (u2), User_2 (u3), User_3 (u4)).
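To make the hierarchical quota space concrete, the following minimal Python sketch stores a few of the Figure 2-2 leaf-node policies as entries keyed by (submitter, resource, time-unit) tuples, with parent links used to fall back to coarser policies. The dictionary layout, the CPU-hour values, and the find_quota helper are illustrative assumptions for exposition, not the representation used inside Sphinx.

    # Illustrative only: a toy representation of leaf-node policy quotas keyed by
    # (submitter, resource, time-unit) combinations from the Figure 2-2 example.

    # Parent links let a lookup fall back to a coarser policy (user -> group -> VO).
    SUBMITTER_PARENT = {"User_1": "Group_1", "Group_1": "VO_1", "Group_2": "VO_1", "VO_1": None}
    RESOURCE_PARENT = {"Machine_1": "Cluster_1", "Cluster_1": "Domain_1",
                       "Domain_1": "Grid_1", "Domain_2": "Grid_1", "Grid_1": None}

    # Quota limits (in CPU-hours, as an assumed unit) for a few leaf-node policies.
    QUOTA = {
        ("User_1", "Cluster_1", "Week"): 168,    # policy (u2, r2, t3)
        ("Group_2", "Domain_1", "Month"): 2000,  # policy (u5, r1, t2)
        ("VO_1", "Domain_2", "Year"): 50000,     # policy (u1, r5, t1)
        ("VO_1", "Domain_1", "Month"): 6000,     # policy (u1, r1, t2)
    }

    def find_quota(submitter, resource, time_unit):
        """Return the most specific quota, walking up the submitter and resource trees."""
        s = submitter
        while s is not None:
            r = resource
            while r is not None:
                if (s, r, time_unit) in QUOTA:
                    return QUOTA[(s, r, time_unit)]
                r = RESOURCE_PARENT.get(r)
            s = SUBMITTER_PARENT.get(s)
        return None  # no policy defined for this combination

    if __name__ == "__main__":
        # User_1 has no machine-level policy, so the Cluster_1 weekly quota applies.
        print(find_quota("User_1", "Machine_1", "Week"))   # -> 168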

A Solution Strategy for Policy-based Scheduling

Given that a leaf-node tensor of policies has been defined as described in the previous section, we derive and apply an accounting tensor A which constrains a search for an optimal solution x for allocating resources in support of the requirements from a particular request b, using the well-known method of Linear Programming (LP). Specifically, we seek to minimize an objective function f (representing some heuristic knowledge of the grid):

min [ f(x) ]

subject to A(t) . x >= b

where A(t) . x denotes the inner product of A(t) with x.
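This LP form can be handed to any standard solver. The following minimal sketch, using scipy.optimize.linprog, minimizes a linear objective c . x subject to A . x >= b for a small made-up accounting matrix; the coefficient values and the flattening of the tensors into a single vector are assumptions for illustration, not the framework's actual data.

    # A minimal sketch of the LP form "min f(x) subject to A(t).x >= b" using SciPy.
    # In the framework, A would be the (flattened) available-quota tensor and c would
    # encode heuristic grid knowledge; here the numbers are invented.
    import numpy as np
    from scipy.optimize import linprog

    c = np.array([0.7, 0.3, 0.9])        # objective coefficients (e.g. weather-weighted cost)
    A = np.array([[40.0, 25.0, 60.0]])   # available quota per candidate provider (one property)
    b = np.array([10.0])                 # required amount of that property

    # linprog solves "min c.x s.t. A_ub.x <= b_ub", so A.x >= b becomes -A.x <= -b.
    res = linprog(c, A_ub=-A, b_ub=-b, bounds=[(0, 1)] * len(c), method="highs")

    print(res.x)    # fraction of remaining quota to draw from each provider
    print(res.fun)  # minimised objective value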

Model Parameters

Definition: Let Q be a tensor representing a leaf-node view of allotted quota limits (including pre-defined changes over time), and let U(t) be a tensor representing the amount of quota actually used at current time t. We then define A(t) to be a tensor representing the amount of quota still available at current time t by A(t) = Q - U(t), subject to the constraint that the remaining available quota always lies within the maximum allotted quota:

for all i, j, k, l:  0 <= A(t)_{ijkl} <= Q_{ijkl}

where i indexes all resource properties, j indexes all (leaf-node) resource providers, k indexes all (leaf-node) request submitters and l (logarithmically) indexes time. Note that t and l refer to different uses of time. The functional dependence of U(t) and A(t) on current time t explicitly recognizes the fact that U(t) and A(t) are updated in real time according to actual resource usage monitoring information. This is distinguished from the l dependence of Q, for example, which is not updated in real time but which does define quota limits and their pre-defined (e.g. possibly periodic) future variation by indexing l relative to the present. In particular, Q_{ijkl} represents present quota limits for l = 0 and future quota limits for l > 0.

Definition: Let W(t) be a tensor representing the normalised (current and forecasted) grid weather such that W(t)_{ijk} = (amount of resource in use) / (resource size), where i indexes all resource properties, j indexes all (leaf-node) resource providers, and k (logarithmically) indexes future steps in time from the present (k = 0). The estimated impact on grid weather from future resource allocation may be recursively obtained from the current grid weather, W(t)_{ij0}, by the relation W(t)_{ij,k+1} = W(t)_{ij,k} + δW(t)_{ij,k}, where δW(t)_{ij,k} represents any future allocations (or de-allocations) of resources which are to be scheduled on property i of resource j during the k-th interval of time away from the present.









Definition: Let b be a tensor representing the required amount of resource requested, where b_{ik} is indexed by i for all resource properties and k for all (leaf-node) request submitters.

Definition: The Atomic Job Criterion (AJC), for some particular resource provider J, request submitter K, and time frame L, is defined to be

( there exists i : A_{iJKL}(t) < b_{iK} )  =>  ( x_{iJKL} = 0 for all i )

and states that if there exists a resource property i which cannot satisfy the requested resource amount, then the corresponding resource provider J is removed from the space of feasible resource providers during the particular time frame L. This condition reduces the space of resource providers to include only those sites that are capable of accommodating the full request (i.e. the ability to satisfy all requirements declared in b). For jobs that are not atomic, for example split-able jobs suited to data parallelism, one would not impose such a stringent constraint as the AJC. In this chapter, we will only consider jobs that cannot be split into smaller sub-jobs.
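The AJC can be applied before the LP is solved by zeroing out every provider that cannot cover the full request. The sketch below, in plain NumPy, assumes A has shape (properties, providers) for a fixed submitter K and time frame L; the shapes and values are illustrative assumptions, not SPHINX internals.

    # Illustrative Atomic Job Criterion filter: a provider j stays feasible only if
    # A[i, j] >= b[i] for every resource property i; otherwise it is removed entirely.
    import numpy as np

    # Available quota A[i, j] for 3 properties (e.g. CPU, disk, memory) x 4 providers,
    # and the requested amounts b[i]; values are made up for illustration.
    A = np.array([[50.0, 10.0, 80.0, 30.0],
                  [20.0, 25.0,  5.0, 40.0],
                  [16.0,  8.0, 32.0, 64.0]])
    b = np.array([30.0, 10.0, 8.0])

    feasible = np.all(A >= b[:, None], axis=0)   # True where every property can be satisfied
    A_reduced = A[:, feasible]                   # reduced space of feasible providers

    print(feasible)    # [ True False False  True]
    print(A_reduced)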

We wish to find a solution tensor x = [x_{ijkl}] which provides the optimal location to service the request within the reduced space of feasible providers, where i indexes resource properties, j indexes (leaf-node) resource providers, k indexes (leaf-node) request submitters, and l (logarithmically) indexes time.

Choice of an Objective Function and Optimization Metric

Several sophisticated objective functions and heuristic algorithms will be explored in the future. The simple choice of objective function here is one in which the preferred location to service the request is that which least impacts both grid weather and the request submitter's account usage:

min[ f_KL(x) ] = min[ Σ_{ijkl} W_{ij} x_{ijkl} δ_{lL} δ_{kK} ]

subject to:  Σ_{jl} A_{ijkl} x_{ijkl} >= b_{iK}

where δ_{mn} is the Kronecker delta function (which is unity if m = n and zero otherwise), K corresponds to the particular request submitter, and L is some particular (logarithmic) time view corresponding to possible variation in quota limits. Such a simple strategy only provides "greedy" scheduling decisions for request submitter K within a certain time frame L, but does attempt to improve the flexibility of future scheduling choices by disfavouring resource providers in which the remaining quota for submitter K would be largely consumed.

It may be shown that, for the case in which policies are constant for all times l, the above simple objective function f_KL(x) is minimised when

x_{ijKL} = 0                          (non-optimal provider of a resource property)
x_{iJKL} = b_{iK} / A_{iJKL}          (unique optimal provider J of a resource property)
x_{iJKL} = b_{iK} / (N A_{iJKL})      (N identical optimal providers of a resource property)

Hence, non-zero entries of x_{ijkl} are interpreted as representing the fraction of quota to be used from the remaining available quota for resource property i at provider j for request submitter k during time interval l; x itself represents the solution which minimally impacts grid weather and which minimally uses the remaining available quota for request submitter K. While using the objective function f_KL as defined above, one is nearly assured of finding a unique resource provider favouring any resource property individually; one is unfortunately not assured of finding a unique resource provider which is favoured for all resource properties together.

(Note: in the rest of this chapter, we suppress the functional time-dependent notation of U(t), A(t), and W(t) for the sake of clarity. It is understood, however, that U, A, and W are updated in real time according to present conditions.)

One effective way to distinguish between resource providers j which already optimally satisfy at least one of the requirements from request submitter K is to define a secondary optimisation metric based upon quota usage for all resource properties at a particular resource provider. That is, for every resource provider j containing at least one non-zero resource property entry in the LP solution vector x_{jKL}, calculate the minimum length:

for all j with x_{jKL} != 0 :  min_j [ sqrt( Σ_i ( b_{iK} + U_{ijKL} )^2 ) ]

This simply states that the favoured resource provider is the particular j which minimises the overall use of quotas, considering all required properties corresponding to request submitter K (and time window L). Used in connection with the objective function above, this algorithm chooses a unique resource provider (up to multiple sites with identical lengths) from the set of resource providers with favoured properties. Finally, if there are multiple resource providers, each with identical and minimal overall quota usage, a unique solution is made by randomly choosing one of the providers.
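A small sketch of this tie-breaking step, under the reconstruction above: among the providers with a non-zero entry in the LP solution, pick the one whose combined quota-usage length is smallest, breaking any remaining tie at random. The data layout and values are assumptions for exposition.

    # Illustrative secondary metric: among providers favoured by the LP solution,
    # choose the one minimising sqrt(sum_i (b[i] + U[i, j])**2), i.e. the smallest
    # overall quota usage once the new request is added; ties are broken at random.
    import random
    import numpy as np

    b = np.array([30.0, 10.0, 8.0])                 # requested amounts per property
    U = np.array([[100.0, 40.0],                    # quota already used, per property,
                  [ 60.0, 55.0],                    # for the two providers the LP favoured
                  [ 20.0, 70.0]])
    candidates = [0, 1]                             # providers with non-zero LP entries

    lengths = {j: float(np.sqrt(np.sum((b + U[:, j]) ** 2))) for j in candidates}
    best = min(lengths.values())
    winners = [j for j, length in lengths.items() if np.isclose(length, best)]
    chosen = random.choice(winners)                 # unique choice even with identical lengths

    print(lengths, "->", chosen)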

Quality of Service Constraints

Quality of service (QoS) constraints are supplied by a request submitter K in addition to, and in support of, a particular request requirement tensor b, and may be expressed (for resource property i at resource provider j) in the form:

z_{ijK} <= f_K <= Z_{ijK}

where z_{ijK} (Z_{ijK}) is a lower (upper) limit and f_K is some function of Q, U, W, x, and/or b.

In this chapter, we consider only a "greedy" realisation of quality of service in the sense that a quality of service is guaranteed to a request submitter by only considering that requester's individual desires, while disregarding other unscheduled requests in the system. This implies a time ordering and prioritization of requests. A more "socialised" quality of service, realized by simultaneously considering the requester's individual desires as well as the overall cost (i.e. the impact placed on all other unscheduled requests currently in the system), will appear in a forthcoming chapter.

One simple example of a quality of service constraint that we investigate here is that of a submitter-supplied deadline for request completion, or end date D_E. Let us assume that the element b_{0K} in the request requirement tensor b represents a usage requirement on CPU time (e.g. in SI2000-hours) from request submitter K. Then let C_j(b_{0K}) represent the estimated wall clock completion time on resource j. In order to meet the desired quality of service at any particular resource j, the request must begin to be serviced on or before the start date D_S = D_E - C_j(b_{0K}). Such a request for quality of service may be interpreted in the context of the constraint form above as z_{0jK} = date[l] <= D_E - C_j(b_{0K}), where j represents a (leaf-node) resource, l represents (logarithmic) planning steps in time away from the present (l = 0), and date[l] converts discrete, relative time steps l to an absolute date. By determining the latest possible start date over all resource providers j,

date[P] = max_j [ D_E - C_j(b_{0K}) ]

one defines a feasible period of time (relative to now), l <= P, in which the request may be serviced and still meet the specified QoS. Such a QoS constraint may be simply imposed by restricting the sum over all future time intervals l to just the feasible time period l <= P:

min[ f_KL(x) ] = min[ Σ_{ijkl} W_{ij} x_{ijkl} δ_{lL} δ_{kK} ]

subject to:  Σ_j Σ_{l <= P} A_{ijkl} x_{ijkl} >= b_{iK}

If a solution to this Linear Programming model is found, then it is the one which chooses the best provider to service the request from among the feasible space of resource providers and within the feasible time period, guaranteeing a simple QoS. It may not be possible to guarantee such a QoS, in which case the request submitter must either modify the requested requirements in b or the desired completion date D_E.
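A minimal sketch of the deadline check: given per-provider completion-time estimates C_j(b_{0K}) and an end date D_E, compute the latest feasible start date and the set of planning steps that fall within it. The date mesh below is a made-up example of an adaptive logarithmic mesh, and the site names are hypothetical; neither is SPHINX's actual configuration.

    # Illustrative deadline-QoS window: the request must start by D_E - C_j on provider j;
    # date[P] = max_j (D_E - C_j) gives the latest feasible start overall, and only the
    # time steps l with date[l] <= date[P] are kept in the restricted LP sums.
    from datetime import datetime, timedelta

    D_E = datetime(2005, 7, 31)                                       # submitter-supplied end date
    C = {"site_A": timedelta(days=3), "site_B": timedelta(days=10)}   # estimated completion times

    latest_start = {j: D_E - c for j, c in C.items()}
    date_P = max(latest_start.values())            # latest feasible start over all providers

    # A made-up logarithmic mesh of future planning steps (relative to "now").
    now = datetime(2005, 7, 1)
    mesh = [now + timedelta(days=d) for d in (1, 2, 4, 8, 16, 32)]
    feasible_steps = [l for l, d in enumerate(mesh) if d <= date_P]

    print(date_P)           # 2005-07-28, set by site_A
    print(feasible_steps)   # indices l of time frames allowed in the restricted LP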


Figure 2-3. Policy based scheduling simulation results (In the legends of the graphs TFi
means the i-th time frame, and InitQ represents the initially given resource
usage quotas for a request submitter) A) Initial resource workload B)
Resource usage quota C) Resource allocation distribution D) Changed
resource workload.









Simulation Results

The policy-based scheduling described above in mathematical form is simulated in MATLAB. The simulation assumes that ten resources are available for request assignment by the scheduling strategy.

The first graph in Figure 2-3 shows the workload that each of the ten resources has in five time frames before any resource assignment for a request is made. The workload information for a resource in a time frame can be obtained from the estimated load status on the resource, using information such as the current grid weather and the request execution history in a Grid. Resources with IDs 3 and 9 have a higher initial workload than the other resources, while resources with IDs 1, 5 and 8 have lower loads than the others at the beginning of the series of resource allocations. The second graph presents the resource usage quota change. The dotted line in the graph shows the initial usage quotas that are available to a request submitter before the request submission, while the columns represent the quotas that remain on the resources after the resource assignment. The third graph shows the request distribution among the resources after the policy-based scheduler completes the resource allocation for 70 requests. Each column in the graph represents the number of requests assigned to a resource in the time frames. The fourth graph shows the workload distribution on the resources after the scheduler completes the resource allocation.

These results show that the resource allocation depends not only on the workload but also on the resource usage quota for a request submitter. The policy-based scheduler achieves the resource assignment optimisation by considering both evenly consumed usage quotas and the workload balance among the resources. For example, we can see that the resource with ID 1 has been assigned the largest number of requests in the simulation because it has less overall workload across all the time frames, and the request submitter is given a high initial usage quota on the resource. In the case of request assignment to the resource with ID 8, even though the overall workload on the resource is less than on other resources, a small number of requests is assigned to it because the quota for the submitter on that resource is smaller than on other resources.

Figure 2-4. Policy based scheduling simulation results with highly biased resource usage
quotas (In the legends of the graphs TFi means the i-th time frame, and InitQ
represents the initially given resource usage quotas for a request submitter).
A) Evenly distributed workload on the resources. B) Highly biased resource
usage quotas. C) Resource allocation distribution. D) Changed resource
workload.

A resource provider who manages the resource usage policies for the request submitters may change the policies to control resource usage. For instance, the provider can increase the usage quota on less loaded resources to make them better utilized, while decreasing the quota on overloaded resources to prevent the scheduler from assigning additional requests to those resources. From the results, the quota on the resource with ID 3 should be decreased to reduce its workload in future time frames. Conversely, the quota on the resource with ID 8 should be increased because the resource has been less loaded than the others. Increasing the quota causes the scheduler to assign a larger number of requests to the resource.

The results in Figures 2-4 and 2-5 show how the resource usage quotas and the resource workload distribution affect the resource allocation. Figure 2-4a shows the case in which the workloads are evenly distributed in the five time frames before a policy-based scheduler starts a series of resource allocations. Figure 2-4b shows that the resource usage quotas for a request submitter are highly biased, in the sense that the resource with ID 1 provides the highest quota, whereas the resource with ID 10 allows the submitter to use the resource with the smallest quota. Given these initial conditions, the scheduler allocates the resources to requests following the biased quota allowance, as seen in Figure 2-4c. Because the given workloads are the same on all the resources in all the time frames, the unevenly distributed quotas alone determine the resource allocation distribution. Figure 2-4d presents the changed resource workload after the resource allocation is completed.

Figure 2-5 presents a case in which the resource workloads are highly biased, while the resource usage quotas are the same on every resource in all the time frames at the time of resource allocation. With the initially given resource conditions, the request assignment is also highly biased, following the resource workload distribution.

The results presented above show that a policy-based scheduler or resource providers can control the workload on Grid resources by properly assigning resource usage quotas to request submitters with consideration of the load status to which the resources have been exposed. The scheduler can achieve load-balanced request assignment on the resources by utilizing the resource usage quotas.

Figure 2-5. Policy based scheduling simulation results with highly biased resource workloads (In the legends of the graphs TFi means the i-th time frame, and InitQ represents the initially given resource usage quotas for a request submitter). A) Initial resource workload. B) Resource usage quota. C) Resource allocation distribution. D) Changed resource workload.

Future Work











Solution techniques for Linear Programming models are well studied in the literature, and several packages are available that provide fast and effective solutions based on the types and number of variables involved. Initial experimental results demonstrate the usefulness of the framework and of linear programming based solution methods for effective scheduling in multiple situations involving policy and quality of service requirements. We are currently integrating a scheduling engine based on this framework into SPHINX, and we present experimental results of the policy framework on actual grids, as well as the execution overhead of the solution strategy, in the following chapter.















CHAPTER 3
SPHINX: POLICY-BASED WORKFLOW SCHEDULING

A grid consists of high-end computational, storage, and network resources that,

while known a priori, are dynamic with respect to activity and availability. Efficient

scheduling of requests to use grid resources must adapt to this dynamic environment

while meeting administrative policies.

In this chapter, I first discuss the necessary requirements of a grid scheduler. Then I present a scheduling framework called SPHINX that incorporates these unique grid characteristics and implements the policy-based scheduling technique. The chapter also presents methods for integrating this framework with related infrastructure for workflow management and execution. I present early experimental results showing that SPHINX effectively utilizes other grid infrastructure such as workflow management and execution systems. These results demonstrate that SPHINX can effectively schedule work across a large number of distributed clusters that are owned by multiple units in a virtual organization.

Requirements of a Grid-scheduling Infrastructure

A grid is a unique computing environment. To efficiently schedule jobs, a grid scheduling system must have access to important information about the grid environment and its dynamic usage. Additionally, the scheduling system must meet certain fault tolerance and customizability requirements. This section outlines the different types of information the scheduling framework must utilize and the requirements a scheduler must satisfy.









Information Requirements

A core requirement for scheduling in the dynamic grid environment is to successfully map tasks onto a dynamically changing resource environment while maximizing the overall efficiency of the system. The scheduling algorithm that performs this mapping must consider several factors when making its decision. Seven factors significantly affect this scheduling decision:

Execution time estimation. Because of the heterogeneous nature of grid

resources, their real execution performance differs from the optimal performance

characterized by analytic benchmarking [6]. However, the real execution time can be

effectively estimated, even on heterogeneous grid resources, by statistically analyzing the

performance during the past executions [62]. Along with the past execution information,

several methods such as statistical analysis, neural networks or data mining, can be used

to estimate the execution time of a task [63].

Usage policies. Policies, including authentication, authorization, and application

constraints are important factors for maintaining resource ownership and security. The

set of possible constraints on job execution can be various and can change significantly

over time. These constraints can include different values for each job, for example RAM

requirements or connectivity needed, or constraints that are static for a specific job type,

such as the operating system or architecture. Policies may include any information that

should be specified to ensure that a job is matched to appropriate resources [60, 61].

Grid weather. The scheduling system must keep track of the dynamically changing load and availability of grid resources. In addition, faults and failures of grid resources are certain to occur. The state of all critical grid components must be monitored, and the information should be available to the scheduling system.










Resource descriptions. Due to the heterogeneous nature of the grid, descriptions

of grid resource properties are vital. Such descriptions include configuration information

such as pre-installed application software, execution environment information such as

paths to local scratch spaces, as well as hardware information. In addition, such grid

resources may often only be available to users at an aggregate or logical level, hiding the

actual, physical resource used to execute a job. Hence, the ability to categorize as well as

to both finely and coarsely describe grid resource properties is important.

Replica management. The scheduling system must arrange for the necessary

input data of any task to be present at its execution site. Individual data locations can

have different performance characteristics and access control policies. A grid replica

management service must discover these characteristics and provide a list of the available

replicas for the scheduler to make a replica selection.

Past and future dependencies of the application. Grid task submission is often expressed as a set of dependent subtasks and modeled as a Directed Acyclic Graph (DAG). In this case, the subtasks are represented by nodes in a graph, and the dependencies by branches. When allocating resources to the subtasks, inter-task dependencies affect the required data movement among resources. Utilizing this dependency information can generate a provably optimal scheme for communicating shared data among subtasks [64].
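As an illustration of this structure, the short sketch below encodes a hypothetical four-job analysis as a DAG of file dependencies and derives a valid execution order; the job names are invented for exposition and do not come from any particular workflow system.

    # Illustrative DAG of dependent subtasks: nodes are jobs, edges follow the files
    # a job consumes from its predecessors. graphlib is in the standard library (3.9+).
    from graphlib import TopologicalSorter

    # job -> set of jobs whose output files it needs (names are hypothetical)
    dag = {
        "extract_A": set(),
        "extract_B": set(),
        "merge":     {"extract_A", "extract_B"},
        "analyze":   {"merge"},
    }

    order = list(TopologicalSorter(dag).static_order())
    print(order)   # e.g. ['extract_A', 'extract_B', 'merge', 'analyze']

    # A scheduler can also ask which jobs are ready right now (no pending predecessors):
    ready = [j for j, deps in dag.items() if not deps]
    print(ready)   # ['extract_A', 'extract_B']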

Global job descriptions. A grid is primarily a collaborative system. It is not

always ideal within the grid to schedule a particular task or group of tasks for maximum

efficiency. A good balance needs to be struck between the requirements of each user and









the overall efficiency. It is important for a grid scheduling system to have a global list of

pending jobs so that it can optimize scheduling to minimize this global cost.

System Requirements

While the kinds of information above should be available to the system for efficient

grid scheduling, the following requirements must be satisfied in order to provide efficient

scheduling services to a grid Virtual Organization (VO) community.

Distributed, fault-tolerant scheduling. Clearly, scheduling is a critical function

of the grid middleware. Without a working scheduling system (human or otherwise), all

processing on the grid would quickly cease. Thus, any scheduling infrastructure must be

strongly fault-tolerant and recoverable in the inevitable case of failures. This need for

fault tolerance consequently gives rise to a need for a distributed scheduling system.

Centralized scheduling leaves the grid system prone to a single point of failure.

Distributing the scheduling functionality between several agents is essential to providing

the required fault tolerance.

Customizability. Many different VOs will interact within the grid environment, and each of these VOs will have different application requirements. The scheduling system must be customizable enough to allow each organization the flexibility to optimize the system for its particular needs.

Extensibility. The architecture of the scheduling system should be extensible to

allow for the inclusion of higher level modules into the framework. Higher level

modules could help map domain specific queries onto more generic scheduling problems

and map domain specific constraints onto generic scheduling constraints.

Interoperability with other scheduling systems. Any single scheduling system is unlikely to provide a unique solution for all VOs. In order to allow cooperation at the level of VOs, for example in a hierarchy of VOs or among VO peers, the scheduling system within any single VO should be able to route jobs to, or accept jobs from, external VOs subject to policy and grid information constraints. Inter-VO cooperation is an architectural choice reflecting a tradeoff between synchronization and flexibility of low-level middleware choices and configurations across large organizations.

Quality of service. Multiple qualities of service may be desirable, as there are potentially different types of users. There are users who run small jobs and care about quick turnaround time or interactive behavior from the underlying system. On the other hand, large production runs may be acceptably executed as batch jobs. Users may set deadlines by which submitted jobs should be completed. The scheduling system should be able to provide these differential QoS features for administrators and users.


Figure 3-1. Sphinx scheduling system architecture










Highlights of SPHINX Architecture

The scheduling system SPHINX is a novel scheduling middleware for dynamically changing and heterogeneous grid environments. In addition to the infrastructure requirements for grid scheduling described in the previous section, SPHINX focuses on several key functionalities in its architecture, shown in Figure 3-1, for efficient and fault-tolerant scheduling service.

Easily accessible system. SPHINX is modeled with an agent-based scheduling

system consisting of two parties, the client and the server. The separation supports easy

system accessibility and adaptability. The client is a lightweight portable scheduling

agent that represents the server for processing scheduling requests. It provides an

abstract layer to the service, and supports a customized interface to accommodate user

specific functionalities.

Automated procedure and modulated architecture. SPHINX consists of multiple modules that perform a series of refinements on a scheduling request. The procedure begins from a 'start' state, and the final state should be annotated with 'finished', which indicates that the resource allocation has been made for a request. Each module takes a request and changes its state according to the module's functionality. The system can easily modify or extend the scheduling automation by making the necessary changes to a module without affecting the logical structure of the other modules. Table 3-1 shows all the states for jobs and DAGs defined in the current SPHINX prototype.

Robust and recoverable system. The SPHINX server adopts a database infrastructure to manage the scheduling procedure. Database tables support inter-process communication among the scheduling modules in the system. A module reads the scheduling state of a request from the tables, edits the state, and writes the modification back to the tables. This also supports fault tolerance by making the system easily recoverable from internal component failure.

User interactive system. SPHINX supports user interaction in its resource allocation procedure. A user submits a request with a quality of service (QoS) requirement. The requirement may specify a resource usage amount and period. It is challenging to satisfy such a specification in a dynamically changing grid environment. SPHINX facilitates the scheduling decision by negotiating the QoS achievement level with the user.

Table 3-1. Finite automaton of SPHINX scheduling status management.
DAG States   Description
Unreduced    The DAG has yet to be processed by the DAG reducer.
Unpredicted  The completion time of the DAG has not been estimated by the prediction engine.
Unaccepted   The completion time has been estimated for this DAG, and the server is waiting for the client to accept this estimation for final processing.
Unfinished   The DAG has begun processing but has not completed.
Remove       All jobs within this DAG have completed and can be removed from the queue during the next cleanup cycle.
Job States   Description
Unpredicted  The completion time of this job has not been estimated by the prediction engine.
Unaccepted   The DAG containing this job has not been accepted for final execution by the client.
Unplanned    The job has been accepted for scheduling by this server, but the planner has not created an execution plan for this job.
Unsent       The job is planned, but has not been sent to the client for execution.
Unfinished   The job is running on a remote machine and must be tracked by the tracking system.
Remove       The job has finished execution and no longer must be tracked by the tracking system. If the parent DAG for this job is also finished, the job may be removed from the tracking system during the next cleanup phase.
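The job-state progression in Table 3-1 can be captured, for illustration, as a small finite-state table. The Python layout below is a sketch of the idea only; the transition triggers noted in the comments paraphrase the table and are not the database representation SPHINX actually uses.

    # Illustrative encoding of the job states in Table 3-1 and their progression.
    from enum import Enum

    class JobState(Enum):
        UNPREDICTED = "unpredicted"
        UNACCEPTED = "unaccepted"
        UNPLANNED = "unplanned"
        UNSENT = "unsent"
        UNFINISHED = "unfinished"
        REMOVE = "remove"

    # Each scheduling module advances a job one step along this chain.
    NEXT = {
        JobState.UNPREDICTED: JobState.UNACCEPTED,   # prediction engine has run
        JobState.UNACCEPTED:  JobState.UNPLANNED,    # client accepted the estimate
        JobState.UNPLANNED:   JobState.UNSENT,       # planner created an execution plan
        JobState.UNSENT:      JobState.UNFINISHED,   # plan sent to the client for execution
        JobState.UNFINISHED:  JobState.REMOVE,       # tracking system saw the job finish
    }

    def advance(state: JobState) -> JobState:
        return NEXT.get(state, state)   # REMOVE is terminal

    print(advance(JobState.UNPLANNED))  # JobState.UNSENT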










Platform independent, interoperable system. A scheduling system should expect interaction with systems on various kinds of platforms in a heterogeneous environment. SPHINX adopts XML-based communication protocols such as SOAP and XML-RPC to satisfy this requirement. In particular, it uses the communication protocol named Clarens [65] to incorporate grid security.

Table 3-2. SPHINX client functionalities for interactive job scheduling and execution tracking
Functions SPHINX API's Parameters
Execution job submit (String file loc, String sender) file~loc: abstract dag file
request //Client submits this request to SPHINX client location
sender: request sender
information
Scheduling sendrequest (String dagXML, String dagXML: dag in XML format
request msgType, String sender) msgType: request type
//SPHINX client sends this request to server sender: request sender
information
Admission send msg (String msgXML, String msgType) msgXML: message in XML
control send msg (String msgXML, String msgType, format
String sender) msgType: message type
//SPHINX and user interact for resource sender: message sender info.
allocation
Submission createSubmission (String joblnfo) Joblnfo: scheduling decision
request //SPHINX client create a submission file. information
submit job (String rescAlloc, String submitter) rescAlloc: job submission file
//SPHINX client send the file to Submitter: job submitter
DAGMan/Condor-G information
Execution updateStatus (int jobld) jobld: ID of ajob that is
tracking String status = getJobStatus (int jobld) currently running on a grid
//Update the status of the job with the resource. The ID is assigned by
information //from a grid resource management the grid resource management
service system.
status: the status of job, which
the grid resource management
sstem provides.

SPHINX Client

The SPHINX client interacts with both the scheduling server that allocates resources for task execution and a grid resource management system such as DAGMan/Condor-G [4]. To begin the scheduling procedure, a user passes an execution request to the SPHINX client. The request is in the form of an abstract DAG that is produced by a workflow planner such as the Chimera Virtual Data System [66]. The abstract plan describes the logical I/O dependencies within a group of jobs. The client sends a scheduling request to the server with a message containing the DAG and client information. After receiving the resource allocation decision from the server, the client creates an appropriate request submission file according to the decision. The client submits the file to the grid resource management system.

In order to achieve a user's quality of service (QoS) requirement, SPHINX implements interactive resource allocation. The client, as a scheduling agent, negotiates the QoS satisfaction level with the user. SPHINX presents the user with a resource allocation decision, including the estimated execution time and the resource reservation period and amount according to the current grid resource status. The user then decides whether to accept the suggestion. A basic QoS negotiation feature is developed in the current system, while a more detailed and sophisticated version is under development.

The tracking module in the client keeps track of the execution status of submitted jobs. If the execution is held or killed on a remote site, the client reports the status change to the server and requests re-planning of the killed or held jobs. The client also sends a job cancellation message to the remote sites on which the held jobs are located. Table 3-2 shows the functionalities and SPHINX APIs.

SPHINX Server

The SPHINX server, as a major component of the system, performs several functions. The current version of the server supports the following. First, it decides how best to allocate resources to complete the requests. Second, it maintains catalogs of data, executables and their replicas. Third, it provides estimates for the completion time of the requests on these resources. Fourth, the server monitors the status of its resources. The SPHINX server completes the resource allocation procedure by changing the status of requests through the predefined states in Table 3-1.

As mentioned in the previous section, each of the functions in SPHINX is developed in a modular fashion. Each module performs its corresponding function on a DAG or a job, and changes the state to the next one according to the predefined order of states. We discuss each of the modules in detail in the following sections. Table 3-3 shows the SPHINX server functions described in this section.

Table 3-3. SPHINX server functions for resource allocation
Modules SPHINX API' s Parameters
Message String incrusg = inc msg wrapper () incrusg: incoming
Handling out msg wrapper (String msg) message
Module msggparsing (String msg) msg: message in XML
msg_send (String msg, String msgType String dest, format
String sender) msgType: message type
//These functions are to send, receive and parse dest: receiver information
messages. //The message handling module is a sender: sender information
gateway to SPHINX server
DAG dag~reducing (int dagId) dagId: ID of a dag in dag
Reducer //For each of all the jobs in the dag, call replica table. The state of the dag
management //service to check if all the outputs of the is unreduced
job exist. If exist, //then reduce the job and all the
precedence of the job.
Prediction String est~info = execgprediction(int jobld) est info: estimated data in
Engine //This function provides estimated information such as XML string format
//execution time, resource usage (CPU, storage etc.) jobld: ID of ajob in job
table
Planner Planning (int jobld, String strategy) Jobld: ID of job to be
//It is to allocate resources to jobs according to planned
scheduling //strategy. strategy: scheduling
algorithm

Control Process. The main function of the control process is to launch the necessary server-side service modules to process resource allocation for a request. Stateful entities such as DAGs and jobs are operated on and modified by the scheduling modules. This architecture for the SPHINX server, in which the control process awakens modules for processing stateful entities, provides an extensible and easily configurable system for future work. Figure 3-2 shows the overall structure of the control process in SPHINX. The controller checks the state of jobs that are currently in the scheduling procedure. If the controller finds a job in one of the states, it invokes the corresponding service module to handle the job.






Figure 3-2. Overall structure of control process.
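A minimal sketch of this control loop, with the module dispatch expressed as a state-to-handler table; the polling interval, handler names and in-memory job queue are assumptions for illustration, not SPHINX internals (which keep this state in database tables).

    # Illustrative control-process loop: poll the job queue and hand each job in a
    # known state to the corresponding service module. Handler names are hypothetical.
    import time

    def run_prediction(job): job["state"] = "unaccepted"
    def run_planner(job):    job["state"] = "unsent"
    def run_submitter(job):  job["state"] = "unfinished"

    HANDLERS = {
        "unpredicted": run_prediction,
        "unplanned":   run_planner,
        "unsent":      run_submitter,
    }

    def control_process(job_queue, cycles=3, poll_seconds=0.0):
        for _ in range(cycles):
            for job in job_queue:
                handler = HANDLERS.get(job["state"])
                if handler is not None:
                    handler(job)          # the module advances the job's state
            time.sleep(poll_seconds)      # a real server would poll on a longer interval

    jobs = [{"id": 1, "state": "unpredicted"}, {"id": 2, "state": "unplanned"}]
    control_process(jobs)
    print(jobs)   # job 2 reaches 'unfinished'; job 1 waits in 'unaccepted' for the client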

Message Handling Module. The message handling function is to provide a layer

of abstraction between the internal representation of the scheduling procedure and the

external processes. Additionally, the function is responsible for maintaining a list of all

the currently connected clients, for ensuring that the list is kept accurate and for directing

I/O from the various internal components to these various clients. The server maintains

database tables for storing incoming and outgoing messages. Control process invokes

incoming or outgoing message interfaces to the tables for retrieving, parsing and sending

the messages.

DAG Reducer. The DAG reducer reads an incoming DAG and eliminates previously completed jobs in the DAG. Such jobs can be identified with the use of a replica catalog. The DAG reducer simply checks for the existence of the output files of each job; if they all exist, the job and all of its predecessors can be deleted. The reducer consults the replica location service for the existence and location of the data.
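A sketch of the reduction idea, using a plain set of logical file names in place of the replica catalog; the data layout, file names and reduce_dag helper are assumptions for exposition, not the RLS interface.

    # Illustrative DAG reduction: if every output file of a job already has a replica,
    # the job, and transitively everything upstream of it, need not be re-executed.
    catalog = {"merged.dat"}   # logical file names known to the replica "catalog"

    jobs = {
        "extract_A": {"outputs": ["raw_A.dat"], "parents": []},
        "extract_B": {"outputs": ["raw_B.dat"], "parents": []},
        "merge":     {"outputs": ["merged.dat"], "parents": ["extract_A", "extract_B"]},
        "analyze":   {"outputs": ["result.dat"], "parents": ["merge"]},
    }

    def reduce_dag(jobs, catalog):
        # First pass: jobs whose outputs all exist already.
        done = {n for n, j in jobs.items() if all(o in catalog for o in j["outputs"])}
        # Second pass: everything upstream of a completed job is also unnecessary.
        changed = True
        while changed:
            changed = False
            for n in done.copy():
                for p in jobs[n]["parents"]:
                    if p not in done:
                        done.add(p)
                        changed = True
        return {n: j for n, j in jobs.items() if n not in done}

    print(reduce_dag(jobs, catalog))   # only 'analyze' remains to be scheduled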

Prediction Engine. The prediction engine provides estimates of resource use. It estimates the resource requirements of the job based upon historical information, the size of the input (if any), and/or user-provided requirements. In the first implementation, this is constrained to overall execution time by application and site. A simple average and variance calculation is used to provide an initial estimation scheme; this method could be made more intelligent and robust in future implementations. When the prediction engine is called, it selects a DAG for prediction, estimates the completion time of each job, and finally, from this data, estimates the total completion time of the DAG.
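A minimal sketch of the average-and-variance scheme described above, keyed by (application, site); the history values and names are invented, and the table would live in the prediction database rather than in memory.

    # Illustrative completion-time prediction: per (application, site), keep the past
    # execution times and report their mean and spread as the estimate.
    from statistics import mean, stdev

    history = {
        ("reco", "site_A"): [102.0, 98.5, 110.2, 95.0],
        ("reco", "site_B"): [240.0, 251.3, 238.7],
    }

    def estimate(app, site):
        samples = history.get((app, site), [])
        if len(samples) < 2:
            return None                      # not enough history to estimate
        return mean(samples), stdev(samples)

    print(estimate("reco", "site_A"))   # e.g. roughly (101.4, 6.5) seconds

    # A DAG-level estimate can then be assembled by combining the per-job estimates
    # along the DAG's dependency structure.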

Planner. The planner module creates an execution plan for a job whose input data are available. According to the data dependencies, a job should wait until all of its predecessors have finished generating their output files. Creating the execution plan includes several steps:

1. Choose a set of jobs that are ready for execution according to input data availability.

2. Decide the optimal resources for the job execution. The planner makes a resource allocation decision for each of the ready jobs. The scheduling is based on resource status and usage policy information, job execution prediction, and the I/O dependency information described in Section 3.

3. Decide whether it is necessary to transfer input files to the execution site. If necessary, choose the optimal transfer source for the input files.

4. Decide whether the output files must be copied to persistent storage. If necessary, arrange for those transfers.

After the execution plan for a job has been created, the planner creates an outgoing message with the planning information and passes the message to the message-handling module.

Tracking Module. The job-tracking module is responsible for keeping the tracking/prediction database on the SPHINX server current. In this first version, the primary functions of the tracking module are to check for job completion, update the job state in the tracking/prediction database, and, when the job completes, add the execution time to the job's historical data in the prediction data tables. In later versions, additional functions can be added. For example, the tracking module could also track job resource










use in real time. It could then enforce resource usage policies by killing and re-queuing

jobs that overstep estimated bounds. Of course, the tracking module can also monitor a

host of additional prediction information and record it in the prediction tables.

Table 3-4. SPHINX APIs for accessing data replicas through the RLS service. In the table, PFN or pfn represents a physical file name, and lfn means a logical file name.

API                                             Parameters
Vector pfns = getPFN(String lfn)                pfns: a list of physical file names
  // Returns the PFN mappings for the given     lfn: logical file name
  // lfn.
CreateMapping(String lfn, String pfn)           lfn: logical file name
  // Creates a mapping for the given lfn and    pfn: physical file name
  // pfn in the RLS service database.
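Reading the table left to right, a client-side call sequence might look like the following sketch. The rls handle, the interface wrapper, and the file names are hypothetical; only the two method signatures come from Table 3-4.

    import java.util.Vector;

    // Hypothetical wrapper exposing the two RLS calls listed in Table 3-4.
    interface ReplicaLocationService {
        Vector<String> getPFN(String lfn);             // PFN mappings for a logical file name
        void CreateMapping(String lfn, String pfn);    // register a new lfn -> pfn mapping
    }

    class RlsExample {
        static String locateOrRegister(ReplicaLocationService rls) {
            String lfn = "example_output.dat";                       // illustrative names only
            Vector<String> pfns = rls.getPFN(lfn);
            if (pfns != null && !pfns.isEmpty()) {
                return pfns.firstElement();                          // reuse an existing replica
            }
            String pfn = "gsiftp://some.site.edu/data/example_output.dat";  // assumed location
            rls.CreateMapping(lfn, pfn);                             // otherwise register the new copy
            return pfn;
        }
    }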

Data Replication Service

The data replication service is designed to provide efficient replica management.

SPHINX provides an interface to the service. The Globus Replica Location Service

(RLS) [67] provides both replica information and index servers in a hierarchical fashion. In addition, GridFTP [68] is a Grid Security Infrastructure (GSI)-enabled FTP protocol that provides the necessary security and file transfer functions. Initial implementations of SPHINX will make use of both RLS and GridFTP for replica and data management.

In this initial implementation, the detection and replication algorithms are based on

a memory paging technique. Once a hot spot has been identified, an appropriate

replication site is chosen; initially, the site will be chosen at random. As future versions

are developed, both the method of hot spot identification and replication location

selection will be modified using improved replica management systems as well as better

algorithms. Table 3-4 shows the SPHINX APIs for accessing replica information through RLS.










Grid Monitoring Interface

The resource allocation decision made by the planner and the replication site selection made by the data replication service depend on the information provided through the monitoring interface of SPHINX. As such, the interface provides a buffer between external monitoring services (such as MDS, GEMS, VO-Ganglia, MonALISA, and Hawkeye [69,70,71]) and the SPHINX scheduling system. In order to accommodate the wide variety of grid monitoring services, the interface is developed as an SDK so that specific implementations are easily constructed. We use an interface to MonALISA for accessing resource monitoring information (Table 3-5).

Table 3-5. Database table schemas for accessing resource-monitoring information through MonALISA

Table             Fields                            Function
SITEFARMMAP       site_name varchar(255) not null   Stores mappings from grid resource
                  farm_name varchar(100) not null   IP to MonALISA farm name.
ML_SNAPSHOT       site_name varchar(255) not null   Stores the latest monitoring value
                  function varchar(100) not null    for the given site and function.
                  value double
ML_SNAPSHOT_P2P   site_source varchar(255)          Stores the latest monitoring value
                  site_dest varchar(255)            for the given function between the
                  function varchar(100)             given two sites.
                  value double
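Given these schemas, the monitoring interface can read the latest value of a metric for a site with a single query. The JDBC sketch below is illustrative only: the connection URL and credentials are assumptions, while the table and column names come from Table 3-5.

    import java.sql.*;

    // Sketch: read the latest monitored value for a given site and function from
    // the ML_SNAPSHOT table of Table 3-5. Connection details are placeholders.
    class MonitoringQuery {
        static Double latestValue(String siteName, String function) throws SQLException {
            String url = "jdbc:mysql://localhost:3306/sphinx";   // assumed database location
            try (Connection con = DriverManager.getConnection(url, "sphinx", "secret");
                 PreparedStatement ps = con.prepareStatement(
                     "SELECT value FROM ML_SNAPSHOT WHERE site_name = ? AND `function` = ?")) {
                ps.setString(1, siteName);
                ps.setString(2, function);
                try (ResultSet rs = ps.executeQuery()) {
                    return rs.next() ? rs.getDouble("value") : null;  // null if no snapshot yet
                }
            }
        }
    }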

A grid monitoring service will be used to track resource-use parameters including CPU load, disk usage, and bandwidth. In addition, a grid monitoring service could also collect policy information provided by each site, including resource cost functions and VO resource use limits. Additional interface modules will be developed to gather VO-centric policy information, which may be published and maintained by a VO in a centralized repository.

External databases. Replica information is likely to be tied closely to a

particular VO, and it is necessary to incorporate external replica services. Two external









catalogs are considered in this chapter: the transformation catalog (TC) and the replica

catalog (RC). Because the functionality is similar, both the RC and the TC may use

similar or identical technology. The TC maintains information on transformations (these

should be provided by workflow management systems such as Chimera). The TC maps a

logical executable name to one or many physical executable names on particular grid

resources. The RC maintains information on data produced within, or uploaded to a grid.

The RC maps a logical file name to one or many physical file names on particular grid

resources.

Internal database. The prediction engine queries the Prediction Tables (PT) to

get estimated information for job execution, including execution time, CPU and disk

usage, and bandwidth. In the PT, a job is represented by a vector of input parameters, and a site by a vector of benchmarks. Thus, for every execution of every set of input parameters (job) on every set of benchmarks (site), the resources consumed by the job are recorded. By querying this database of historical processing information, the prediction engine can estimate the time required to execute a given job on a given set of resources.

The Job Tables (JT) maintain a persistent queue of submitted jobs. They monitor job progress, providing information on the status and location of job execution. In addition, all DAG (or job) state information in the scheduling system is incorporated into the job database. The tracking system within the scheduling server accesses the grid-monitoring interface and maintains the information in the JT.


Relationship with Other Grid Research

In this section, we discuss how SPHINX system research and development (R&D) interacts with other ongoing R&D efforts.









Grid Information Services

Several systems are available for gathering information, including resource access policies, resource characteristics and status, and required application environments. The current version of the Globus Toolkit includes the Monitoring and Discovery Service (MDS), which can, in combination with other tools such as Ganglia, MonALISA and

GEMS, gather this and other grid information. This system or others can be easily

connected to the SPHINX system using the Grid Monitoring SDK.

Replica and Data Management Services

The Globus Replica Location Service (RLS) [67] provides both replica information

and index servers in a hierarchical fashion. In addition, GridFTP is a Grid Security Infrastructure (GSI)-enabled FTP protocol that provides the necessary security and file transfer functions. Initial implementations of SPHINX will make use of both RLS and GridFTP for replica and data management.

The Network STorage (NeST) service provides a common abstracted layer for data

access in grid environments [72]. In addition, the Stork Data Placement (DaP) scheduler

manages and monitors data placement jobs in grid environments [73]. As the SPHINX

work develops, both NeST and Stork can be incorporated into the framework to provide

robust, transparent management of data movement.

Job Submission Services

The GriPhyN VDT includes Condor-G as a grid submission and execution service.

Condor-G uses the Globus GRAM API to interact with Globus resources [74], and

provides many execution management functions, including restarting failed jobs, gathering execution log information, and tracking job progress. In addition, DAGMan is a job execution engine that manages the execution of a group of dependent jobs










expressed as a DAG. By leveraging the DAGMan job submission protocol, the scheduling system can dynamically alter the granularity of job submission and planning to most efficiently process the jobs in its queue. For instance, if the prediction information indicates that a large group of fast jobs is waiting in the queue and the planner can assume the grid status will remain relatively stable during their execution, it can make a full-ahead plan for the whole group, construct a DAGMan submission DAG and pass the entire structure to DAGMan to manage. Conversely, if there are many long-running jobs in the queue, the planner is able to release them one at a time to DAGMan to take full advantage of "just-in-time" scheduling.

Virtual Data Services

The SPHINX scheduler is currently designed to process abstract DAGs, as

provided by the Chimera Virtual Data System, for grid execution. In particular, this work

is fully integrated with the Chimera Virtual Data and Transformation Catalogs. These

integration points provide SPHINX with the ability to flexibly and dynamically schedule

large DAG workflows.

Future Planners and Schedulers

One of the main purposes of SPHINX research and development is to develop a

robust and complete scheduling framework where further research and development can

continue. Thus, it is essential that this initial development work provide a feature for integrating new planning technology into the framework easily. Such integration is provided in two ways. First, SPHINX is developed in a modularized fashion. Modules in the system are developed and tested independently. A single SPHINX component can be exported into other systems, or new modules can be plugged into the SPHINX system easily. Second, the planning module within the scheduling server










interacts only with the internal database API. Through this interface, any planning or

strategy module can be easily added to the scheduling server. A collection of such planning modules could be provided, with the scheduling server choosing the most appropriate one based on the composition of the input tasks.

Table 3-6. Grid sites used in the experiment. CalTech represents the California Institute of Technology, UFL the University of Florida, and UCSD the University of California at San Diego.

Site Name    Site Address                  # of Processors    Processor Type
CalTech      citems.cacr.caltech.edu       Four               Dual
UFL          ufloridadgt.phys.ufl.edu      Three              Dual
UFL          ufloridaigt.phys.ufl.edu      Nine               Dual
UCSD         usemstb0.ucsd.edu             Three              Single

Experiments and Results

The aim of the experiments we performed was to test the effectiveness of the

scheduler as compared to the way things are done today on the Grid. We also compared

and contrasted the different scheduling algorithms. There were two different features of

grid systems that we evaluated:

Importance of feedback information: The "feedback" provides execution status

information of previously submitted jobs on grid sites. The scheduling algorithms can

utilize this information to determine a set of reliable sites on which to schedule jobs. Sites having more cancelled jobs than completed jobs are marked unreliable. This strategy is not used in the algorithms marked as without-feedback.

Importance of Monitoring Information: The monitoring systems provide performance information about different components (especially compute resources) on the grid. This

information is updated frequently and can be potentially used for effective scheduling.

We wanted to determine the value of this information in effective scheduling.










Scheduling Algorithms

The round-robin scheduling algorithm tries to submit jobs in the order of the sites in a given list. All sites are scheduled to execute jobs without considering the status of the sites. If some sites are not available to execute the planned jobs, then those jobs are cancelled and planned onto the next site in the list.

The algorithm based on the number of CPUs with feedback utilizes resource-scheduling information about previously submitted jobs in a local SPHINX server. After determining a load rate for each site with the following formula, it plans jobs onto the least loaded sites.

rate_i = (planned_jobs_i + unfinished_jobs_i) / CPUs_i

where
rate_i: the load rate on site i
planned_jobs_i: the number of jobs planned onto site i
unfinished_jobs_i: the number of running jobs on site i
CPUs_i: the number of CPUs on site i

The queue-length based scheduling algorithm makes the scheduling decision based on the lengths of the job queues at the remote sites, provided by a monitoring module. This algorithm also utilizes the feedback information to determine the job execution site. The scheduler selects the site with the smallest load rate according to the following formula.










rate_i = (queued_jobs_i + running_jobs_i + planned_jobs_i) / CPUs_i

where
rate_i: the load rate on site i
queued_jobs_i: the number of waiting jobs on site i
running_jobs_i: the number of jobs currently assigned to CPUs on site i
planned_jobs_i: the number of jobs planned onto site i by the local scheduler
CPUs_i: the number of CPUs on site i
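Both load-rate formulas reduce to the same selection rule: compute a per-site rate and pick the minimum. A compact Java sketch, with illustrative field names, follows.

    import java.util.List;

    // Sketch of queue-length based site selection: the site with the smallest
    // load rate (queued + running + planned jobs per CPU) is chosen.
    class SiteLoad {
        String name;
        int queuedJobs, runningJobs, plannedJobs, cpus;

        double rate() {
            return (queuedJobs + runningJobs + plannedJobs) / (double) cpus;
        }
    }

    class QueueLengthScheduler {
        static SiteLoad pickSite(List<SiteLoad> sites) {
            SiteLoad best = null;
            for (SiteLoad s : sites) {
                if (s.cpus <= 0) continue;                 // skip sites reporting no CPUs
                if (best == null || s.rate() < best.rate()) best = s;
            }
            return best;                                   // least-loaded site, or null if none usable
        }
    }

The #-of-CPUs-based variant is the same loop with the rate computed from planned and unfinished jobs only.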


The job completion time-based scheduling algorithm utilizes the job completion rate information passed from the client to the server by the job tracker module. In the absence of job completion rate information, SPHINX schedules jobs using the round-robin technique until it has that information for the remote sites. Thus, it uses a hybrid approach to compensate for the unavailability of information. The site having the minimum average job completion time is chosen for the next schedulable job.



site = argmin over available sites i (1 <= i <= n) of Avg_comp_i

where
n: the number of grid sites
Avg_comp_i: the average job completion time on site i, computed using the feedback information
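In code, the hybrid rule amounts to falling back to round-robin until average completion times are known for the candidate sites; the sketch below uses illustrative names and is one possible reading of the rule above.

    import java.util.List;
    import java.util.Map;

    // Hybrid completion-time scheduling: use round-robin until average job
    // completion times (from feedback) are known, then pick the available site
    // with the smallest average completion time.
    class CompletionTimeScheduler {
        private int roundRobinIndex = 0;

        String pickSite(List<String> availableSites, Map<String, Double> avgCompletionTime) {
            String best = null;
            for (String site : availableSites) {
                Double avg = avgCompletionTime.get(site);
                if (avg == null) {
                    // Missing feedback for some site: fall back to round-robin.
                    return availableSites.get(roundRobinIndex++ % availableSites.size());
                }
                if (best == null || avg < avgCompletionTime.get(best)) best = site;
            }
            return best;
        }
    }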

Policy-constrained scheduling puts resource usage constraints on each of the

algorithms. For example, the next formula shows a revised round robin scheduling

algorithm with resource usage constraints.










choose site s such that quota_{i,s} >= required_{i,s} for every property i

where
quota_{i,s}: the usage quota of property i on site s given to a user
required_{i,s}: the required amount of property i on site s specified by the user
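A policy check of this form can be implemented as a simple filter applied before any of the above strategies; the names below are illustrative.

    import java.util.*;

    // Sketch of the policy constraint: a site is feasible only if, for every
    // resource property, the user's remaining quota covers the required amount.
    class PolicyFilter {
        // quotas:   site -> (property -> remaining quota for this user)
        // required: property -> amount required by the job
        static List<String> feasibleSites(Map<String, Map<String, Double>> quotas,
                                          Map<String, Double> required) {
            List<String> feasible = new ArrayList<>();
            for (Map.Entry<String, Map<String, Double>> site : quotas.entrySet()) {
                boolean ok = true;
                for (Map.Entry<String, Double> req : required.entrySet()) {
                    double quota = site.getValue().getOrDefault(req.getKey(), 0.0);
                    if (quota < req.getValue()) { ok = false; break; }
                }
                if (ok) feasible.add(site.getKey());
            }
            return feasible;   // the chosen strategy (e.g., round-robin) then runs over this list
        }
    }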

Test-bed and Test Procedure

We use Grid3 as the test-bed for this experiment. Grid3 currently has more than 25 sites across the US and Korea, which collectively provide more than 2000 CPUs. The resources are used by seven different scientific applications, including three high energy physics simulations and four data analyses in high energy physics, bio-chemistry, astrophysics and astronomy.

It is critical to test the performance of the scheduling algorithms in the same grid environment because resource availability changes dynamically. Each of these scheduling algorithms was therefore executed multiple times on multiple instances of SPHINX servers running concurrently, to compare pair-wise or group-wise performance. These servers were started at the same time so that they competed for the same set of grid resources. We felt this was the fairest way to compare the performance of different algorithms in a dynamically changing environment. Table 3-7 shows the configuration of the machines on which we set up the SPHINX clients and servers.

Table 3-7. SPHINX server configurations
Name                    CPU (MHz)    RAM (MB)    OS
dimitri.dnsalias.net    2x800        512         RH 7.3
Julian.dnsalias.net     2x3000       2000        Fedora Core 2

However, the availability and performance of each grid site changes dynamically. This causes the number of sites available during different experiments to vary. Each scheduling algorithm also chooses a different set of sites for submitting jobs according to its planning decision. The performance comparisons below represent our best efforts to eliminate these effects by executing the algorithms in the pair-wise or group-wise manner described above.


[Figure: bar chart of average DAG completion time (30 DAGs x 10 jobs/DAG) comparing the #-of-CPUs-based and round-robin algorithms, each with and without feedback.]

Figure 3-3. Effect of utilization of feedback information.

Performance Evaluation of Scheduling Algorithms

In order to compare the performance of the algorithms, we submit 30, 60 and 120 DAGs, each of which has 10 jobs in a random structure. Each job simulates a simple execution that takes two or three input files and spends one minute before generating an output file. The size of the output file differs for each job, and the file is located on the execution site by default. Including the time to transfer remotely located input files onto the site, each job is expected to take about three or four minutes to complete. The depth of the DAGs is up to five, while the maximum number of independent jobs in a level is four or five.

Effect of Feedback Information

We start by studying the impact of using feedback information. We compare the average DAG completion times for the round-robin and #-of-CPUs-based scheduling









algorithms with and without feedback. The graph shown in Figure 3-3 plots the average DAG completion time for four algorithms: round-robin with feedback, round-robin without feedback, the number-of-CPUs-based scheduling algorithm with feedback and the number-of-CPUs-based scheduling algorithm without feedback. Feedback is basically used for flagging an execution site as faulty. This is done by considering the number of completed and cancelled jobs on that site as reported to the SPHINX server by the job tracker(s).

The feedback information includes the status of job execution, such as hung, killed, or completed, on the execution sites. A scheduler should be able to utilize this information to determine a set of available sites on which to schedule jobs. Without this feedback the scheduler keeps submitting jobs to unreliable sites, resulting in many jobs being cancelled and rescheduled. As shown in the figure, the average DAG completion time with the feedback information is lower than without it by about 20-29%.

Round-robin scheduling without feedback is essentially what a grid user would do today when executing jobs on the grid. This experiment therefore demonstrates how critical the feedback information is. It also brings to the fore the faultiness of the grid environment and how fault tolerance can be achieved using SPHINX.

Comparison of Different Scheduling Algorithms with Feedback

The aim of the experiment is to demonstrate the dynamic nature of the grid and to

evaluate which monitored parameter works best for making scheduling decisions.












[Figure: two bar charts for 30 DAGs x 10 jobs/DAG: (A) average DAG completion time and (B) average job execution and idle time, for the completion-time based, queue-length based, #-of-CPUs based and round-robin algorithms.]


Figure 3-4. Performance of scheduling algorithms with 300 jobs and without any policy constraints (A) & (B). The round-robin algorithm distributes the jobs equally among all available sites; the number-of-CPUs-based algorithm considers the relative number of CPUs on each site (static); the queue-length based approach considers the job queue information (dynamic) from the monitoring system; while the completion-time based strategy uses the average job completion rates to select the execution site.









Figure 3-4(A) shows the performance of the four different algorithms. The completion time-based scheduling algorithm (hybrid) performs better than the other cases by about 17% in terms of average DAG completion time. This is because the scheduling algorithm schedules jobs to the sites that complete assigned jobs faster than other sites. Figure 3-4(B) presents job execution and idle time information. Jobs scheduled by the completion time algorithm execute about 50% faster than those scheduled by the other algorithms, and their execution waiting or idle time is lower by about 60%.

The same experiment was repeated with 600 and then 1200 jobs instead of just 300 jobs. Figure 3-5 gives the result for the 600-jobs experiment. Here, we observe that the job completion rate based approach performs even better relative to the other algorithms than it did in the 300-jobs experiment: its performance is from ~33% to ~50% better than the other scheduling strategies. This is because the algorithm gets smarter as scheduling proceeds, with more reliable job completion time information, and makes the planning decisions for jobs effectively using the wider knowledge base.










[Figure: two charts for 60 DAGs x 10 jobs/DAG: DAG completion time and job execution/idle time, for the completion-time based, queue-length based, #-of-CPUs based and round-robin algorithms.]


Figure 3-5. Performance of scheduling algorithms with 600 jobs and without any policy constraints. (A) The average DAG completion time for each of the algorithms, and (B) the average job execution time and the idle time for each of the algorithms.












[Figure: two charts for 120 DAGs x 10 jobs/DAG: (A) average DAG completion time and (B) average job execution and idle time, for the completion-time based, queue-length based, #-of-CPUs based and round-robin algorithms.]


Figure 3-6. Performance of scheduling algorithms with 1200 jobs and without any policy constraints (A) & (B). The results follow the same trend as the 300- and 600-jobs experiments, thus exhibiting scalability.


Here it is worth noting that the absolute average DAG completion times for the two experiments should not and cannot be compared, as the average load on the grid is different during these experiments.















[Figure: site-wise number of completed jobs plotted against average job completion time, (A) for the completion-time based algorithm and (B) for the #-of-CPUs based algorithm.]


Figure 3-7. Site-wise distribution of completed jobs vs. average job completion time (A) & (B). In the job completion time based approach (A), the number of jobs scheduled on a site is inversely proportional to its average job completion time.





Figure 3-7(A) verifies that the job completion rate based approach indeed scheduled more jobs to the sites with the lowest average job completion time, and vice versa. The other algorithms do not follow this trend, e.g., the number-of-CPUs based algorithm shown in Figure 3-7(B).

The result shows that simple workload information such as the number of CPUs and running jobs (as used in round-robin and simple load-balancing techniques) is not good enough to estimate the performance of dynamic and customized grid resources. The job completion rate approach keeps track of the job completion time and utilizes this information to estimate the near-future execution environment on the grid sites. This appears to be a much better predictor of actual performance on the different sites.

As monitoring systems mature, and if the local sites make their performance measures more transparent and accurate, the effectiveness of monitoring information in scheduling may improve. However, the data provided by current monitoring systems and sites does not seem to be very useful for this purpose.

Effects of Policy Constraints on the Scheduling Algorithms

Figures 3-8(A) and (B) show the performance of the scheduling algorithms constrained by a resource usage quota policy. A user's remaining usage quota defines the list of sites available to that user for submitting jobs, from which the scheduling algorithm recommends the execution site.

The results obtained are similar to those without policy. They underline the ability of SPHINX to do policy-based scheduling. Even in the presence of policy constraints, SPHINX achieves a scheduling efficiency similar to that in a constraint-free grid environment.








[Figure: two charts for 120 DAGs x 10 jobs/DAG under policy constraints: (A) average DAG completion time and (B) average job execution and idle time, for the completion-time based, queue-length based, #-of-CPUs based and round-robin algorithms.]


Figure 3-8. Performance of the policy-based scheduling algorithms. (A) Average DAG completion time, and (B) average job execution and idle time. In each of the scheduling algorithms, policy constraints are applied to get the pool of feasible sites before using the scheduling strategy.













[Figure: number of timeouts (120 DAGs x 10 jobs/DAG) for the completion-time based, queue-length based, #-of-CPUs based, round-robin, and #-of-CPUs based without feedback algorithms.]



Figure 3-9. Number of timeouts in the different algorithms

Fault Tolerance and Scheduling Latency

Figure 3-9 gives the number of times jobs were rescheduled in each of the scheduling strategies. Note that without any feedback information, the number of resubmissions is very high (2258) as compared to 125 in the job completion rate based hybrid approach.

The graph in Figure 3-10(A) presents the scheduling latency of the SPHINX scheduling system under different workloads of concurrently submitted workflows. Scheduling latency is defined as the time taken from job submission until the scheduling decision is made and the job is submitted to the targeted resource. Each line represents a different number of concurrently submitted workflows, and we vary the job arrival rate per minute. SPHINX shows stable performance up to a workload of 13 jobs per minute across the different numbers of concurrently submitted DAGs.











The graph in Figure 3-10(B) shows the scheduling latency of the cluster-based scheduling algorithm. The cluster size is set to three, which means that any three independently schedulable jobs in a workflow are planned together in a single scheduling iteration. The scheduling algorithm performs multiple iterations of job planning to determine the optimal resource allocation. The workflow arrival rate at the scheduling system is determined by the number of submitted workflows per minute, and the workload is determined by the number of jobs in a workflow. The total number of submitted DAGs is 100. The algorithm shows good tolerance to arrival rates of up to nine workflows per minute for the different workloads of 14, 12, 10 or 8 jobs.

[Figure: scheduling latency plots: (A) average scheduling latency versus job arrival rate per minute for 20, 40, 80 and 100 DAGs; (B) latency of the cluster-based algorithm versus the number of workflows per minute for workloads of 14, 12, 10 and 8 jobs per DAG.]


Figure 3-10. SPHINX scheduling latency: average scheduling latency for various numbers of DAGs (20, 40, 80 and 100) with different arrival rates per minute.

Conclusion and Future Research

This chapter introduces techniques and infrastructure for fault-tolerant scheduling of jobs across the grid in a dynamic environment. In addition to the SPHINX architecture, which is robust, recoverable, modular, re-configurable and fault-tolerant, the novel contribution of this chapter to the state of the art is the effective use of monitored information for efficient scheduling, without requiring human intervention, in a highly dynamic grid environment. These results show that SPHINX can effectively:









4. Reschedule jobs if one or more of the sites stops responding due to system downtime or slow response time.
5. Improve the total execution time of an application using information available from monitoring systems as well as its own monitoring of job completion times.
6. Manage policy constraints that limit the use of resources.

These results demonstrate the effectiveness of SPHINX in overcoming the highly dynamic nature of the grid and complex policy issues to harness grid resources, an important requirement for executing large production jobs on the grid.

We are investigating novel scheduling methods to reduce the turnaround time. We are also developing methods to schedule jobs with variable Quality of Service requirements. The latest updates on SPHINX are available at http://www.griphyn.org/sphinx.

A novel scheduling framework for grid computing has been proposed in this chapter, and an initial implementation has been presented. Resource scheduling is a critical issue in

executing large-scale data intensive applications in a grid. Due to the characteristics of

grid resources, we believe that traditional scheduling algorithms are not suitable for grid

computing. This document outlines several important characteristics of a grid scheduling

framework including execution time estimation, dynamic workflow planning,

enforcement of policy and QoS requirements, VO-wide optimization of throughput, and a

fully distributed, fault tolerant system.

Our proposed system, SPHINX, currently implements many of the characteristics

outlined above and provides distinct functionalities, such as dynamic workflow planning

and just-in-time scheduling in a grid environment. It can leverage existing monitoring

and execution management systems. In addition, the highly customizable client-server

framework can easily accommodate user specific functionality or integrate other

scheduling algorithms, enhancing the resulting system. This is due to a flexible

architecture that allows for the concurrent development of modules that can effectively









manipulate a common representation for the application workflows. The workflows are stored persistently in a database using this representation, allowing for the development of a variety of reporting capabilities for effective grid administration.

The development of SPHINX is still in progress, and we plan to include several

additional core functionalities for grid scheduling. One such important functionality is

that of estimating the resources required for task execution, enabling SPHINX to

realistically allocate resources to a task. In addition, we plan to investigate scheduling

and replication strategies that consider policy constraints and quality of service, and

include them in SPHINX to improve scheduling accuracy and performance.















CHAPTER 4
POLICY-BASED SCHEDULING TECHNIQUES FOR WORKFLOWS

This chapter discusses policy-based scheduling techniques on heterogeneous

resources for grid computing. The proposed scheduling algorithm has the following

features, which can be utilized in grid computing environments. First, the algorithm

supports the resource usage constrained scheduling. Second, the algorithm performs the

optimization-based scheduling. It provides an optimal solution to the grid resource

allocation problem. Third, the algorithm assumes that a set of resources is distributed

geographically and is heterogeneous in nature. Fourth, the scheduling algorithm

dynamically adjusts to the grid status by tracking the current workload of the resources.

The performance of the proposed algorithm is evaluated with a set of predefined metrics.

In addition to simulation results demonstrating the superior performance of the policy-based scheduling, a set of experiments is performed on the Open Science Grid (OSG).

Motivation

Grid computing is recognized as one of the most powerful vehicles for high

performance computing for data-intensive scientific applications. It has the following unique characteristics compared to traditional parallel and distributed computing. First, grid resources are geographically distributed and heterogeneous in nature. Research and development organizations, distributed nationwide or worldwide, participate in one or more virtual organizations (VOs). A VO is a group of resource consumers and providers

united in their secure use of distributed high-end computational resources towards a

common goal. Second, these grid resources have decentralized ownership and different









local scheduling policies dependent on their VO. Third, the dynamic load and

availability of the resources require mechanisms for discovering and characterizing their

status continually.

The dynamic and heterogeneous nature of the grid coupled with complex resource

usage policy issues poses interesting challenges for harnessing the resources in an

efficient manner. In this paper, we present novel policy-based scheduling techniques and

their performance on Open Science Grid (OSG), a worldwide consortium of university

resources consisting of 2000+ CPUs. The execution and simulation results show that the

proposed algorithm can effectively:

8. Allocate grid resources to a set of applications under the constraints presented with
resource usage policies.
9. Perform optimized scheduling on heterogeneous resources using an iterative
approach and binary integer programming (BIP).
10. Improve the completion time of workflows in integration with j ob execution
tracking modules of SPHINX scheduling middleware.

Problem Definition and Related Works

An application scientist typically solves his problem as a series of transformations.

Each transformation may require one or more inputs and may generate one or more

outputs. The inputs and outputs are predominantly files. The sequence of transformations

required to solve a problem can be effectively modeled as a Directed Acyclic Graph (DAG) for many practical applications of interest that this work targets.


























Figure 4-1. An example workflow as a Directed Acyclic Graph (DAG). The figure shows a workflow consisting of eight jobs. The number on an edge represents a communication time. For simplicity we assume that the communication time is identical on different networks. (Adapted from [54])
Figure 4-1 describes a DAG consisting of 8 tasks. It is useful to define an exit task: the completion of this task implies that the workflow has been executed. Task 8 represents the exit task. A scheduling algorithm aims to minimize the workflow completion time obtained by the assignment of the tasks of the DAG to processors. A scheduling algorithm that is efficient and suitable in the target environment should be able to exploit the inherent heterogeneity in the processor and network resources. Most of the existing scheduling algorithms perform the mapping of tasks to processors in two stages:

First, create a priority-based ordering of the tasks. The priority of a task is based on its impact on the total completion time. This requires determining the critical path, which in turn requires that the execution time of each task is available or can be estimated. The exact definition of the critical path depends on the algorithm. Some algorithms use the longest path from the given task to the end of the DAG to calculate the critical path of a task. Others use the longest path from the start node to the given task.










Second, use the priority-based ordering created in the previous step to map the tasks so that the total completion time is minimized. The process is performed one task at a time in an order based on the priority and is incremental in nature. However, once a task is assigned to a given processor, it is generally not remapped.

The above approach has the following limitations for the target heterogeneous

environment:

1. The amount of time required for a task varies across different processors (due to heterogeneity). Thus, estimating the priority based on the cost of the critical path is difficult (this measure or related measures are used by most of the algorithms in determining a task's priority). Adaptations of these algorithms for heterogeneous processors use the average or median processing time of subsequent tasks to estimate the critical path. However, this may not be an accurate reflection of the actual execution time. In fact, one processor may execute task A faster than task B, while the reverse may be true for another processor; this may be due to differing amounts of memory, cache sizes, processor types, etc.

2. The tasks are assigned one at a time. Assuming that k processors are available at a given stage, this may not result in an optimal assignment. Clearly, one can call these algorithms k times sequentially to achieve the same goal. However, this may not result in the optimal allocation of the tasks on the k available processors. Mapping a large number of tasks simultaneously on the available processors can allow for more efficient matching.

3. The tasks are assigned without any policy constraints. The policy constraints can restrict the subset of processors that can be assigned to a given task. This needs to be taken into account while making the scheduling decisions.

Past research on task scheduling in DAGs has mainly focused on algorithms for a

homogeneous environment. Scheduling algorithms such as Dynamic Critical Path (DCP)

algorithm [55] that show good performance in a homogeneous environment may not be

efficient for a heterogeneous environment because the computation time of a task may be

dependent on the processor to which the task is mapped. Several scheduling algorithms

for a heterogeneous environment have been recently proposed. Most of them are based on

static list scheduling heuristics to minimize the execution time of DAGs. Examples of









these algorithms include Dynamic Level Scheduling (DLS) algorithm [58] and

Heterogeneous Earliest Finish Time (HEFT) algorithm [57].

The DLS algorithm selects a task to schedule and a processor where the task will be

executed at each step. It has two features that can have an adverse impact on its

performance. First, it uses the median of processing time across all the processors for a

given task to determine a critical task. Secondly, it uses the earliest start time to select a

processor for a task to be scheduled. These may not be effective for a heterogeneous

environment. For instance, suppose that processor A and processor B are the only

available processors for the assignment of task i. Assume that processor A becomes free

slightly earlier than processor B (based on the mapping of tasks so far). Then task i is

assigned to processor A, since processor A can execute it earlier than processor B.

However, if processor B can finish task i earlier than processor A, the selection of processor B for the task would result in a better mapping. Also, the time required to execute the scheduling algorithm (the cost of scheduling) is relatively high, as the priorities of the remaining tasks need to be recalculated at each step of the iteration. The number of iterations is proportional to the total number of tasks.

The HEFT algorithm reduces the cost of scheduling by using pre-calculated

priorities of tasks in scheduling. The priority of each task is computed using an upward

rank, which is also used in our proposed algorithm. Also, it employs finding the earliest finish time for the selection of a processor, which is shown to be more suitable for a heterogeneous environment. Although it has been shown to have good performance in the experiments presented in [57], it can be improved by using better estimations of the critical path. The Iterative list scheduling [54] improves the quality of the schedule in an









iterative manner using results from previous iterations. It only assigns one task at a time and does not support resource usage policies. It is a static scheduling algorithm, which assumes an unchanged or stable computing environment. In the dynamic and policy-constrained grid computing environment the algorithm may not perform well; this is supported by the simulation results presented in this chapter.

We address the above issues in the proposed algorithm and demonstrate that it is

effective in satisfying the workflow completion deadline. The proposed algorithm

consists of three main steps:

* Selection of tasks (or a task)
* Selection of processors for the selected tasks
* Assignment of selected tasks to selected processors based on policy constraints.

A subset of independent tasks with similar priority is selected for simultaneous

scheduling. The tasks in the selected subset are scheduled optimally (i.e., to minimize

completion time) based on policy constraints. Further, the derived schedule is iteratively refined, using the mapping defined in the previous iteration to determine the cost of the critical path. This estimation, in general, should be better than using the average computation time over all processors. We utilize the resource usage reservation technique to schedule multiple DAG workflows. The technique facilitates meeting the workflow completion deadline.

When the scheduling algorithm is integrated with the SPHINX scheduling

middleware it performs efficient scheduling on the policy-constrained grid environment.

The performance is demonstrated in the experimental section.









Scheduling Algorithm Features

The proposed policy-based scheduling algorithm differs from existing work in the following respects.

Policy constrained scheduling: Decentralized grid resource ownership restricts the

resource usage of a workflow. The algorithm makes scheduling decisions based on

resource usage constraints in a grid-computing environment.

Optimized resource assignment: The proposed algorithm makes an optimal

scheduling decision utilizing the Binary Integer Programming (BIP) model. The BIP

approach solves the scheduling problem to provide the best resource allocation to a set of workflows subject to constraints such as resource usage.

Scheduling on heterogeneous resources: The algorithm uses a novel mechanism to handle the different computation times of a job on various resources. The algorithm iteratively modifies resource allocation decisions for better scheduling based on the different computation times instead of taking the mean value of those times. This approach has also been applied in the Iterative list scheduling [54].

Dynamic scheduling: In order to cope with a dynamically changing grid environment, the algorithm uses a dynamic scheduling scheme rather than a static scheduling approach. A scheduling module makes the resource allocation decision for a set of schedulable jobs. The status of a job is defined as schedulable when it satisfies the following two conditions.

Precedence constraint: all of the preceding jobs are finished, and the input data of the job are available locally or remotely.

Scheduling priority constraint: a job is considered to have higher priority than others when the job is critical to completing the whole workflow with a better completion time.

Future scheduling: The resource allocation to a schedulable job impacts the

workload on the selected resource. It also affects the scheduling decision of future

schedulable jobs. The algorithm pre-schedules all the unready jobs to detect the impact

of the current decision on the total workflow completion time.

When the scheduling algorithm is integrated with the SPHINX scheduling

middleware it performs efficient scheduling on the policy-constrained grid environment.

The performance is demonstrated in the experimental section.

Notation and Variable Definition

In this section we define the notations and variables that are used in the proposed

scheduling algorithm.

comp_{i,j}: computation time of job i on processor j --- (1)
comm_{p,j}: communication time from processor p to processor j --- (2)

The computation or execution time of a job on a processor (comp_{i,j}) is not identical across a set of processors in a heterogeneous resource environment. The algorithm sets the initial execution time of a job to the mean of the different times over the set of available processors. The time is updated with the execution time on a specific processor as the algorithm changes the scheduling decision based on the total workflow completion time. The data transfer or communication time between any two processors (comm_{p,j}) also differs in this environment.

prec_i: the set of preceding jobs of job i --- (3)
succ_i: the set of succeeding jobs of job i --- (4)









An application or workflow is in the format of a directed acyclic graph (DAG). Each job i has a set of preceding (prec_i) and succeeding (succ_i) jobs in the DAG. The dependency is represented by the input/output file relationship.

Avail_{i,j}: the available time of processor j for job i --- (5)

EST_{i,j}: earliest start time of job i on processor j --- (6)
EST_{i,j} = max{ Avail_{i,j}, max_{k in prec_i} (EFT_{k,p} + comm_{p,j}) }

EFT_{i,j}: earliest finish time of job i on processor j --- (7)
EFT_{i,j} = EST_{i,j} + comp_{i,j}

compLen_i: workflow completion length from job i --- (8)
compLen_i = comp_{i,j} + max_{k in succ_i} (comm_{j,p_k} + compLen_k)

The algorithm keeps track of the availability of a processor (Avail_{i,j}) to execute a job. We assume a processing model in which all the jobs in a processor queue must be completed before a new job starts; that is, a non-preemptive model. In the grid scheduling middleware, SPHINX obtains this information from a grid monitoring system such as MonALISA or GEMS. The algorithm computes the earliest start time of a job on each processor (EST_{i,j}). A job can start its execution on a processor only after satisfying two conditions: first, the processor must be available (Avail_{i,j}) to execute the job; second, all the preceding jobs must be completed (max_{k in prec_i} (EFT_{k,p} + comm_{p,j})) on the same processor or on other processors. The earliest finish time of a job on a processor (EFT_{i,j}) is defined by the earliest start time (EST_{i,j}) and the job completion or execution time on the processor (comp_{i,j}). The workflow completion time from a job to the end of the DAG (compLen_i) is defined recursively from the bottom of the DAG up to job i. The value is used to decide the critical path in the workflow. The makespan of the critical path is used as a criterion to terminate the algorithm: the algorithm completes the scheduling when there is no improvement in the DAG completion time.
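The quantities (5)-(8) can be computed directly from the DAG. The Java sketch below does so with hypothetical lookup tables standing in for SPHINX's internal state; it assumes j is the processor currently assigned to job i and ignores queue effects other than processor availability.

    import java.util.*;

    // Sketch of the quantities (5)-(8) under assumed data structures.
    class DagMetrics {
        // comp.get(i).get(j): computation time of job i on processor j
        // comm[p][j]: communication time from processor p to processor j
        // assignedProc.get(k): processor currently assigned to job k
        static double est(int i, int j, Map<Integer, List<Integer>> prec,
                          Map<Integer, Map<Integer, Double>> comp, double[][] comm,
                          Map<Integer, Double> eft, Map<Integer, Integer> assignedProc,
                          double[] avail) {
            double ready = avail[j];                       // Avail_{i,j}: processor j free for job i
            for (int k : prec.getOrDefault(i, Collections.emptyList())) {
                int p = assignedProc.get(k);               // processor running predecessor k
                ready = Math.max(ready, eft.get(k) + comm[p][j]);
            }
            return ready;                                  // EST_{i,j}, equation (6)
        }

        static double eft(int i, int j, double est, Map<Integer, Map<Integer, Double>> comp) {
            return est + comp.get(i).get(j);               // EFT_{i,j}, equation (7)
        }

        // compLen_i, equation (8): remaining critical-path length from job i,
        // computed bottom-up over the successors and memoized per job.
        static double compLen(int i, int j, Map<Integer, List<Integer>> succ,
                              Map<Integer, Map<Integer, Double>> comp, double[][] comm,
                              Map<Integer, Integer> assignedProc,
                              Map<Integer, Double> memo) {
            if (memo.containsKey(i)) return memo.get(i);
            double longest = 0;
            for (int k : succ.getOrDefault(i, Collections.emptyList())) {
                int pk = assignedProc.get(k);
                longest = Math.max(longest,
                    comm[j][pk] + compLen(k, pk, succ, comp, comm, assignedProc, memo));
            }
            double value = comp.get(i).get(j) + longest;
            memo.put(i, value);
            return value;
        }
    }

In the first iteration, where no processor has been assigned yet, the per-job average execution time would be used in place of comp.get(i).get(j), consistent with the description above.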










Deadline_{d(i)}: the workflow completion deadline of DAG d to which job i belongs
estCompTime_{d(i)}: the estimated workflow completion time of DAG d to which job i belongs

We also define the deadline and the estimated completion time of a workflow. They are used when we devise a profit function for multiple workflow scheduling. The scheduling algorithm assumes that the workflow deadline (Deadline_{d(i)}) is submitted by a user or is generated by the scheduling system. The scheduling algorithm computes the estimated workflow completion time (estCompTime_{d(i)}) considering the current workload on grid resources. This time changes across the iterations of the scheduling due to the dynamic grid resource status.

ExecTime        J1    J2    J3    J4    J5    J6    J7    J8
P1              70    68    78    89    30    66    25    94
P2              84    49    96    26    88    86    21    36

Policy          J1    J2    J3    J4    J5    J6    J7    J8
P1              v     v     v     v           v           v
P2              v           v           v     v     v     v

Job             J1    J2    J3    J4    J5    J6    J7    J8
AvgExecTime     77    68    87    89    88    76    21    65
compLen         517   354   354   340   199   199   181   65

Prioritization  J1    J3    J2    J4    J6    J5    J7    J8
Assignment      P1    P2    P1    P1    P1    P2    P2    P1



Figure 4-2. An example of job prioritization and processor assignment for the workflow in Figure 4-1. The example presents a procedure to prioritize the set of jobs in the workflow from Figure 4-1 based on the scheduling priority function. It also shows the processor assignment to the jobs based on the earliest finish time of a job on a processor (EFT_{i,j}).

Figure 4-2 shows an example of the procedure to assign the jobs in a workflow to a set of processors (P1 and P2) based on the workflow prioritization techniques described in this section. It presents the heterogeneous execution times of the jobs on each of the processors (ExecTime). Each job, such as J1, J2 or J8, has a different execution time on the processors P1 and P2. The figure also shows the resource usage restrictions of the jobs. The mark (v) indicates that a processor allows a job to be executed on it; otherwise, the job cannot run on that processor. In the first iteration of the resource allocation procedure the algorithm uses the average execution time (AvgExecTime), computed from the execution times on the processors where the job is allowed to run. In subsequent iterations the algorithm uses the specific execution time on the processor selected in the previous iteration. The algorithm computes the critical path length from each job to the bottom job of the workflow. The critical path length is calculated according to the definition of compLen_i in (8). The values of the critical path length are used to prioritize the set of jobs in non-increasing order (Prioritization). After sorting the jobs, the algorithm allocates the set of available processors to the jobs subject to the resource usage constraints. The assignment is determined with an optimization model that is discussed in the next sections.

Optimization Model

We devise a Binary Integer Programming (BIP) model to find an optimal solution to the scheduling problem in the proposed algorithm. In this section, we first define two

scheduling profit functions for single workflow scheduling and multiple workflow

scheduling respectively. Next, we discuss the optimization model utilizing the profit

functions for the scheduling problem.
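Before the profit functions are introduced, the general shape of such an assignment BIP can be sketched as follows. This is a generic formulation written in LaTeX, not necessarily the exact model used in SPHINX: x_{i,j} is a binary variable indicating that job i is assigned to processor j, and profit_{i,j} stands for whichever scheduling profit function is chosen.

    \begin{align*}
    \max_{x} \quad & \sum_{i \in J} \sum_{j \in P} \mathit{profit}_{i,j}\, x_{i,j} \\
    \text{s.t.}    & \sum_{j \in P} x_{i,j} = 1 \quad \text{for each selected job } i
                     && \text{(each job goes to exactly one processor)} \\
                   & x_{i,j} = 0 \quad \text{if policy forbids job } i \text{ on processor } j
                     && \text{(resource usage constraints)} \\
                   & x_{i,j} \in \{0, 1\}
                     && \text{(binary assignment variables)}
    \end{align*}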