
Distributed Configuration Management for Reconfigurable Cluster Computing


DISTRIBUTED CONFIGURATION MANAGEMENT FOR RECONFIGURABLE CLUSTER COMPUTING

By

AJU JACOB

A THESIS PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE

UNIVERSITY OF FLORIDA

2004


Copyright 2004 by Aju Jacob


This document is dedicated to God for giving me the strength to finish it.


ACKNOWLEDGMENTS

I thank Dr. Alan George, Ian Troxel, Raj Subramaniyan, Burt Gordon, Hung-Hsun Su, Dr. Sarp Oral, and all other members of the HCS Lab at the University of Florida for their guidance and direction. I thank Dr. Herman Lam, Dr. Renato J. Figueiredo, and Dr. Jose A. B. Fortes for serving on my thesis committee. I thank my parents and sister for their support and encouragement. I would also like to thank Alpha-Data, Tarari, and Celoxica for their RC boards. I thank Dolphin Inc. for its donation of SCI cards. I thank Intel and Cisco for their donations of cluster resources.


TABLE OF CONTENTS

ACKNOWLEDGMENTS ..... iv
LIST OF TABLES ..... vi
LIST OF FIGURES ..... vii
ABSTRACT ..... ix
1 INTRODUCTION ..... 1
2 BACKGROUND ..... 5
3 CONFIGURATION MANAGER DESIGN ..... 14
  3.1 Configuration Management Modules and Interfaces ..... 14
  3.2 Configuration Manager Initialization and Configuration Determination ..... 15
  3.3 Configuration Manager's Operating Layers ..... 16
  3.4 Board Interface Module ..... 19
  3.5 Distributed Configuration Manager Schemes ..... 20
4 EXPERIMENTAL SETUP ..... 23
  4.1 Configuration Manager Scheme Protocol ..... 23
  4.2 Experimental Setup ..... 27
5 EXPERIMENTAL RESULTS ..... 30
6 PROJECTED SCALABILITY ..... 38
  6.1 Completion Latency Projections ..... 38
  6.2 Hierarchy Configuration Managers ..... 40
  6.3 Consumed Bandwidth Projections ..... 43
7 CONCLUSIONS AND FUTURE WORK ..... 49
LIST OF REFERENCES ..... 53
BIOGRAPHICAL SKETCH ..... 57


LIST OF TABLES

Table                                                                        page
1 Completion Latency Projection Equations for System Hierarchies ......... 42
2 Control Consumption Equations for each Management Scheme ......... 45
3 Bandwidth Equations for System Hierarchies ......... 46
4 System Configurations for Given Constraints over System Sizes ......... 48


LIST OF FIGURES

Figure                                                                       page
1 The CARMA Framework ......... 3
2 Celoxica's RC1000 Architecture ......... 8
3 Tarari CPP Architecture ......... 8
4 RAGE System Data Flow ......... 11
5 Imperial College Framework ......... 12
6 CARMA's Configuration Manager Overview ......... 15
7 Configuration Manager's Layered Design ......... 17
8 Illustration of Relocation (Transformation) and Defragmentation ......... 18
9 Board Interface Modules ......... 20
10 Distributed Configuration Management Schemes ......... 21
11 Master-Worker Configuration Manager Scheme ......... 24
12 Client-Server Configuration Manager Scheme ......... 25
13 Pure Peer-to-Peer Configuration Manager Scheme ......... 26
14 Experimental Setup of SCI nodes ......... 28
15 Completion Latency for Four Workers in MW ......... 30
16 Completion Latency for Four Clients in CS ......... 31
17 Completion Latency for Four Peers in PPP ......... 31
18 Completion Latency of Four Workers with Master in Adjacent SCI Ring ......... 33
19 Completion Latency of Four Clients with Server in Adjacent SCI Ring ......... 33
20 Completion Latency of Three Clients with Server in the Same SCI Ring ......... 34


21 Completion Latency of Eight Clients in CS ......... 35
22 Completion Latency of MW Scheme for 2, 4, 8, and 16 nodes ......... 36
23 Completion Latency of CS Scheme for 2, 4, 8, and 16 nodes ......... 36
24 Completion Latency of Worst-Case PPP Scheme for 2, 4, 8, and 16 nodes ......... 37
25 Completion Latency Projections for Worst-Case PPP, Typical-Case PPP, and CS ......... 39
26 Four Layered Hierarchies Investigated ......... 40
27 Optimal Group Sizes for each Hierarchy as System Size Increases ......... 43
28 Completion Latency Projections with Optimal Group Sizes up to 500 Nodes ......... 44
29 Completion Latency Projections with Optimal Group Sizes ......... 44
30 Network Bandwidth Consumed over Entire Network per Request ......... 47


Abstract of Thesis Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Master of Science

DISTRIBUTED CONFIGURATION MANAGEMENT FOR RECONFIGURABLE CLUSTER COMPUTING

By Aju Jacob

December 2004

Chair: Alan D. George
Major Department: Electrical and Computer Engineering

Cluster computing offers many advantages as a highly cost-effective and often scalable approach for high-performance computing (HPC) in general, and most recently as a basis for hardware-reconfigurable systems, known as Reconfigurable Computing (RC) systems. To achieve the full performance of reconfigurable HPC systems, a run-time configuration manager is required. Centralized configuration services are a natural starting point but tend to limit performance and scalability. For large-scale RC systems, the configuration manager must be optimized for the system topology and management scheme. This thesis presents the design of a configuration manager within the Comprehensive Approach to Reconfigurable Management Architecture (CARMA) framework created at the University of Florida, in addition to several distributed configuration management schemes that leverage high-speed networking. The experimental results from this thesis highlight the effects of the design of this configuration manager and provide a comprehensive performance analysis of three of the


proposed management schemes. In addition, the experiments explore and compare the scalability of these management schemes. These results show that the configuration manager designs impose little overhead on the system when the system is unstressed. Finally, larger system sizes are explored with an analytical model and scalability projections that investigate performance beyond available testbeds. The model shows that a hierarchical management scheme is needed for the configuration manager to provide lower bandwidth consumption and completion latency.


CHAPTER 1 INTRODUCTION

Traditional computing spans a variety of applications due to its high flexibility, but is rather inefficient dealing with fine-grain data manipulation. Reconfigurable Computing (RC) attempts to perform computations in hardware structures to increase performance while still maintaining flexibility. Research has shown that certain applications, such as cryptanalysis, pattern matching, data mining, etc., reap performance gains when implemented in an RC system. However, this approach is valid only if there is an efficient method to reconfigure the RC hardware that does not overshadow the performance gains of RC. Loading a different configuration file alters the computation of an RC device, typically a Field-Programmable Gate Array (FPGA). To increase the efficiency of RC systems, the RC device is typically coupled with a General-Purpose Processor (GPP). GPPs are adept at control and I/O functions while the RC hardware handles fine-grain computations well. In a typical Commercial-off-the-Shelf (COTS) environment, the RC hardware takes the form of FPGA(s) on boards connected to the host processor through the PCI bus. Computation, control communication, and configurations are transferred over the PCI bus to the board. This coupling of RC board(s) to the GPP in a conventional computing node yields a single-node COTS RC system. A trend toward high-performance cluster computing based largely on COTS technologies has recently developed within the reconfigurable computing community. The creation of a 48-node COTS RC cluster at the Air Force Research Laboratory in


2 Rome, NY [1] is evidence of this trend. Indeed, High-Performance Computing (HPC) has the potential to provide a cost-effective, scalable platform for high-performance parallel RC. However, while the addition of RC hardware has improved the performance of many stand-alone applications, providing a versatile multi-user and multitasking environment for clusters of RC and conventional resources imposes additional challenges. To address these challenges in HPC/RC designs, the HCS Research Lab at the University of Florida proposes the Comprehensive Approach to Reconfigurable Management Architecture (CARMA) framework [2, 3, 4]. CARMA provides a framework to develop and integrate key components as shown in Figure 1. With CARMA, the RC group at Florida seeks to specifically address key issues such as: dynamic RC-hardware discovery and management; coherent multitasking in a versatile multi-user environment; robust job scheduling and management; fault tolerance and scalability; performance monitoring down into the RC hardware; and automated application mapping into a unified management tool. This thesis focuses on the configuration management portion of CARMAs RC cluster management module in order to highlight the design and development of distributed configuration management schemes. Many of the new challenges CARMA addresses bear a striking resemblance to traditional HPC problems, and will likely have similar solutions, however others have very little correspondence whatsoever. The task of dynamically providing numerous configurations to distributed RC resources in an efficient manner is one such example. At first glance, changing computation hardware during execution in RC systems has no


traditional HPC analog. However, future development of an RC programming model to allow the reuse of configuration files through run-time libraries resembles the concept of code reuse in tools such as the International Mathematical and Statistical Library (IMSL). This collection of mathematical functions abstracts the low-level coding details from developers by providing a means to pass inputs between predefined code blocks [5]. While tools like IMSL provide this abstraction statically, future RC programming models will likely include run-time library access. Core developers are providing the basis for such libraries [6].

Figure 1. The CARMA Framework. This thesis focuses on Configuration Management, located in the RC Cluster Management block.

While the programming model aspect of configuration files could relate to traditional HPC, other aspects may not. For example, one might equate the transmission of configuration files to data staging by assuming they are simply another type of data necessary to be loaded onto a set of nodes at run-time. One reason this approach does not


4 hold is the adaptable nature of configuration files. Recent work has suggested an adaptive-algorithms approach to RC in which tasks may require the run-time use of hundreds of configuration-file versions [7]. Traditional HPC systems today are not designed to recompile code at run-time much less do so hundreds of times per job. Another reason configuration files differ is that, while relatively small, the amount of high-speed cache dedicated to their storage is typically small, if existent. While some custom FPGAs allow up to four configurations to be stored on chip [8], systems that do not allocate on-chip or on-board memory cannot preemptively stage configurations at all. As demonstrated, configuration management and other issues need to be considered to achieve a versatile platform for RC-based HPC, especially if such systems may one day include grid-level computation where communication latencies can lead to significant performance penalties. The remaining chapters of this thesis are organized as follows. Chapter 2 provides a background of past work in configuration management services while Chapter 3 describes the design and development of CARMAs configuration manager. Chapter 4 provides a detailed discussion of the experimental setup while Chapter 5 presents and analyses the experimental results. Chapter 6 presents the projected scalability of CARMAs configuration manager. Finally, Chapter 7 describes conclusions and future work.


CHAPTER 2 BACKGROUND Traditionally, computational algorithms are executed primarily in one of two ways: with an Application-Specific Integrated Circuit (ASIC) or with a software-programmed General-Purpose Processor (GPP) [9]. ASICs are fabricated to perform a particular computation and cannot be altered, however, ASIC designs provide fast and efficient execution. GPPs offer far greater flexibility because changing software instructions changes the functionality of the GPP. This flexibility comes at the overhead cost of fetching and decoding instructions. RC is intended to achieve the flexibility of GPPs and the performance of ASICs simultaneously. Loading a different configuration file alters the computation that an RC device, typically a FPGA, performs. Moreover, RC devices, such as FPGAs, are composed of logic-blocks that operate at hardware speeds. Configuration files dictate the algorithm or function that FPGA logic-blocks perform. The FPGA-configuration file is a binary representation of the logic design, used to configure the FPGA. Configuration file sizes have increased considerably over the last ten years. Xilinxs Virtex-series FPGAs have configuration-file sizes that range from 80 kB files to 5.3 MB for its high-end Virtex-II Pro [10]. The configuration-file size in general is a constant for any given device, regardless of the amount of logic in it, since every bit in the device gets programmed in full-chip configurations, whether it is used or not. However, the Virtex series supports partial reconfiguration, which is a mechanism where only logic on part of the FPGA is changed. Virtex-series chips can perform partial configuration without shutting down or disturbing processing on other regions of the 5


6 FPGA. Albeit partial configuration is supported at the chip level, there exist no COTS-based boards that have software-API support for partial reconfiguration. Applications that exhibit parallelism and require fine-grain data manipulations are accelerated greatly by RC solutions. Data encryption is one such area that has benefited by RC. Serpent [11], DES [12], and Elliptic-Curve Cryptography [13] applications have all shown speedups as implemented in FPGAs when compared to conventional processors. Another application area that exhibits significant speedup when using RC is space-based processing. Hyperspectral imaging, a method of spectroscopy from satellites, can show frequency details that reveal images not visible to the human eye. Hyperspectral sensors can now acquire data in hundreds of frequency windows, each less than 10 nanometers in width, yielding relatively large data cubes for space-based systems (i.e. over 32 Mbytes) [14]. It is predicted that an RC implementation would have tremendous speedup over the todays conventional processor using Matlab. As mentioned in Chapter 1, the Air Force Research Laboratory in Rome, NY has created a 48-node RC cluster [1], thereby combining HPC and RC. Another example of the combination of HPC and RC exists at Virginia Tech with their 16-node Tower of Power [15]. More recently, the HCS Lab at the University of Florida has implemented an 9-node RC cluster [2]. A System-Area Network (SAN), more specifically Myrinet in Rome and Virginia Tech and Scalable Coherent Interconnect (SCI) in Floridas RC cluster, interconnects the nodes of these clusters. Myrinet and SCI as well as other High-Performance Networks (HPNs) including QsNet, and Infiniband, provide a high-throughput, low-latency network between HPC nodes. These networks are ideal to


7 transfer latency-critical configuration files, which can be several MBs for high-end FPGAs. In addition to the cost advantages of COTS-based cluster computing, COTS-based RC boards facilitate the creation of RC systems. There are a variety of RC boards commercially available, varying in FPGA size, on-board memory, and software support. The most common interface for COTS-based RC boards is PCI. Two PCI-based RC boards presented in this chapter are Celoxicas RC1000 [16] and Tararis Content Processing Platform (CPP) [17]. The RC1000 board provides high-performance, real-time processing capabilities and provides dynamically reconfigurable solutions [16]. Figure 2 depicts the architecture of the RC1000. The RC1000 is a standard PCI-bus card equipped with a Xilinx Virtex device and SRAM memory directly connected to the FPGA. The board is equipped with two industry-standard PMC connectors for directly connecting other processors and I/O devices to the FPGA. Furthermore, a 50-pin unassigned header is provided for either inter-board communication or connecting custom interfaces. Configuration of the RC1000 is through the PCI bus directly from the host. Tarari has developed the dynamically reprogrammable content-processing technology to tackle the compute-intensive processing and flexibility requirements of the Internet-driven marketplace [17]. Figure 3 depicts the architecture of the Tarari CPP. The Tarari CPP has two Content-Processing Engines (CPE), each of which is a Xilinx Virtex-II FPGA. In addition, a third FPGA is a Content-Processing Controller (CPC), which handles PCI and inter-CPE communication, as well as configuration and on-board memory access. The DDR SDRAM is addressed by both CPEs, thus creating a shared


memory scheme for inter-FPGA communication. The two CPEs enable parallelism and more complex processing, compared to the RC1000. The Tarari boards have the ability to store configurations in on-board memory, thereby decreasing the configuration latency by eliminating costly PCI-bus transfers.

Figure 2. Celoxica's RC1000 Architecture.

Figure 3. Tarari CPP Architecture.
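For a sense of scale, the short calculation below estimates the raw time to move bitstreams of the sizes mentioned earlier (80 kB to 5.3 MB) across a 64-bit, 66 MHz PCI bus and a 5.3 Gb/s SCI link, the bus and network used later in this thesis. It is only a back-of-the-envelope sketch: the peak rates are theoretical and ignore driver, protocol, and FPGA configuration-port overhead, so real latencies are higher.

#include <stdio.h>

/* Back-of-the-envelope estimate of raw configuration-transfer time.
 * Peak rates are theoretical bus/link maxima. */
int main(void)
{
    const double sizes_mb[] = { 0.08, 5.3 };      /* 80 kB and 5.3 MB bitstreams */
    const double pci_mb_s   = 528.0;              /* 64-bit, 66 MHz PCI peak     */
    const double sci_mb_s   = 5.3 * 1000.0 / 8.0; /* 5.3 Gb/s SCI link peak      */

    for (int i = 0; i < 2; i++)
        printf("%.2f MB bitstream: PCI %.2f ms, SCI %.2f ms\n",
               sizes_mb[i],
               sizes_mb[i] / pci_mb_s * 1000.0,
               sizes_mb[i] / sci_mb_s * 1000.0);
    return 0;
}

Even under these optimistic assumptions, a full high-end bitstream costs on the order of 10 ms per transfer, which is why configuration traffic is treated as pure overhead throughout this work.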


9 FPGA configuration is one of the most critical processing components that must be handled with care to ensure an RC speedup over GPP. As mentioned in Chapter 1, configuring RC resources is pure overhead in any RC system and thus has the potential to overshadow RC-performance gains. Reusing RC hardware during a processs execution, known as Run-Time Reconfiguration (RTR), has substantial performance gains over static configuration. One such example of this performance gain is demonstrated by RRANN at BYU. RRANN implements a backpropagation training algorithm using three time-exclusive FPGA configurations. RRANN demonstrated that RTR was able to increase the functional density by 500% compared to FPGA-based implementation not using RTR [18]. The dynamic allocation of RC resources results in multiple configurations per FPGA and consequently yields additional overhead compared to static systems. Dynamic allocation of configurations on a distributed system requires the RC system to maintain a dynamic list of where configuration files reside in the system. Furthermore, RTR systems must handle coordination between configurations, allowing the system to progress from one configuration to the next as quickly as possible. Moreover, methods such as configuration compression, transformation, defragmentation and caching can further reduce configuration overhead [9]. For example, using configuration compression technique, presented in [19], results in a savings of 11-41% in memory usage. The use of transformation and defragmentation has been shown to greatly reduce the configuration overhead encountered in RC, by a factor of 11 [20]. Configuration caching, in which configurations are retained on the chip or on the board until they are required again, also significantly reduces the reconfiguration


10 overhead [21]. A well-designed RC system should be able to handle these overhead-reduction methods efficiently. Some RC systems implement a Configuration Manager (CM) to handle the issues that arise from RTR and configuration overhead reduction. There have been a few noteworthy designs upon which CARMAs configuration manager builds, two of which are discussed in detail: RAGE from the University of Glasgow [22] and a reconfiguration manager from Imperial College, UK [23]. The RAGE run-time reconfiguration system was developed in response to management methods that cater to one application and to one hardware setup [23]. The RAGE system provides a high-level interface for applications to perform complex reconfiguration and circuit-manipulation tasks. Figure 4 shows the dataflow of the RAGE system. A Virtual Hardware Manager (VHM) orchestrates the system by accepting application descriptions. The VHM requests circuit transforms from the Transform Manager if configurations do not currently fit in the FPGA. The VHM also manages the circuit store by converting hardware structures submitted by applications into circuits. The configuration manager loads circuits onto devices, in addition to passing state information and interrupts to the VHM. The device driver handles the board-specific functions and hides the programming interface of the FPGA from higher levels. The functionalities handled by the device driver include writing and reading to and from the FPGA, setting clock frequencies, and even monitoring the FPGA. Imperial College developed its configuration manager to exploit compile-time information, yet remain flexible enough to be deployed in hardware or software on both partial and non-partial reconfigurable FPGAs. The reconfiguration manager framework


from Imperial College is shown in Figure 5. This framework is composed of three main components: the Monitor, Loader, and Configuration Store. When applications on the system advance to the next configuration, they notify the monitor. The monitor maintains the current state of the system, including which FPGAs are in use and with what configurations. In some applications the state of the system can be determined at compile time, thereby reducing the complexity of the monitor. The loader, upon receiving a request from the monitor, loads the chosen configuration onto the FPGA using board-specific API functions. The loader retrieves the needed configuration file from the configuration store. The configuration store contains a directory of configuration files available to the system, in addition to the configuration data itself. A transform agent could be employed to compose configurations at run-time that fit appropriately into the FPGA.

Figure 4. RAGE System Data Flow [22].


Figure 5. Imperial College Framework [23].

The recent trend toward COTS-based distributed RC clusters requires the deployment of a distributed configuration manager such as the CARMA configuration manager located in the RC Cluster Management box in Figure 1. CARMA's execution


13 manager is analogous to the VHM in RAGE, since both components coordinate the execution of configuration commands. CARMAs CM extends the configuration store of [23] by maintaining a distributed configuration store. The distributed store requires new methods for transporting and accounting configuration files. Furthermore, transformation of configuration files presented in both [22] and [23] can be implemented in the CARMA configuration manager. CARMAs configuration manager employs the high degree of device independence of the RAGE as well as the functional capability of [23]s Loader. Furthermore, CARMAs configuration manager supports multiple boards, some of which include those previously presented in this chapter, with the Board Interface Module (BIM). The BIM is functionally similar to the device driver of RAGE in that it handles low-level details of board configuration and communication. CARMAs configuration manager extends the Monitor in Shiraz et al. [23] to bring robust, scalable, and highly responsive monitoring down into the FPGAs resources by the use of Gossip-Enabled Monitoring Service (GEMS) [24] developed at Florida. A more detailed description of CARMAs configuration manager is given in Chapter 3.


CHAPTER 3 CONFIGURATION MANAGER DESIGN CARMAs configuration manager incorporates a modular-design philosophy from both the RAGE [22] and the reconfiguration manager from Imperial College [23]. CARMAs configuration manager separates the typical CM operations of configuration determination, management, RC-hardware independence, and communication into separate modules for fault tolerance and pipelining. CARMA establishes a task-based flow of RC-job execution. Consequently, the CARMAs configuration manager encompasses different operating layers, which carry out sub-tasks to complete configuration of the RC Hardware. The CARMA configuration manager supports and improves on current levels of board independence and heterogeneity. In addition, CARMAs configuration manager institutes distributed configuration management to increase scalability, which results in the emergence of multiple management and communication schemes. A description of each of these features follows. Configuration Management Modules and Interfaces Figure 6 shows an overview of CARMAs configuration manager with its modules interconnected within a node and between nodes. All modules have been developed as separate processes, rather than inter-related threads, in order to increase the fault tolerance of the system. The execution manager handles the configuration determination while the configuration manager module handles the management of configuration files. The Board Interface Module (BIM) implements board independence to the application and to higher layers. A communication module handles all inter-node communication, 14


including both the control network and the configuration-file transfer network. The communication module is interchangeable and can be tailored for specific System-Area Networks (SANs).

Figure 6. CARMA's Configuration Manager Overview. The figure shows functional modules and inter-node and intra-node communication.

Although the control and file-transfer communication can reside on the same network, the current implementation leverages SAN interconnects for large file transfers. TCP sockets (e.g., over IP over Gigabit Ethernet) comprise the control network, while SCI currently serves as the data network for configuration-file transfers. Modules within a node use a form of inter-process communication (i.e., message queues) to pass requests and status.

Configuration Manager Initialization and Configuration Determination

At initialization, the CM creates a directory of available RC boards, and BIMs are forked off for each board to provide access. After the RC boards have been initialized, the configuration file-caching array is initialized. Next, the CM attempts to retrieve network information. Due to its distributed nature, the CM requires the network


16 information of other CMs in order to communicate. The CM creates a network object from a file, which contains network information such as the IP address and SCI ID of nodes. Finally, the CM waits for transactions from the execution manager. Configuration determination is completed once the execution manager receives a task that requires RC hardware. A configuration transaction request is then sent to the CM. From the execution managers point-of-view, it must provide the CM with information regarding the configuration file associated with the task it is preparing to execute. The CM loads the configuration file on the target RC hardware in what is called a configuration transaction. Although the configuration transaction is the primary service of the CM, the CM also performs release transactions. The execution manager invokes release transactions when tasks have completed and the RC hardware can be released. Releasing the RC hardware allows it to be configured for another task, however the previous configuration is not erased in an attempt to take advantage of temporal locality of configuration use. Configuration Managers Operating Layers A functional description of how CARMA manages configuration files and executes configuration transactions is given in Figure 7. As described before, the CM receives configuration requests from the execution manager. Upon receiving a request, the File-Location layer attempts to locate the configuration file in a manner depending on the management scheme used. A more detailed description of CARMAs distributed management schemes is provided later in this chapter. The File-Transport layer packages the configuration file and transfers it over the data network. The File-Managing layer is responsible for managing FPGA resource access and defragmentation [9], as well as configuration caching [21], relocation and transformation [30]. Furthermore, the File


Managing layer provides configuration information to the monitor (not shown) for scheduling, distributing, and debugging purposes. The File-Loading layer uses a library of board-specific functions to configure and control the RC board(s) in the system and provide hardware independence to higher layers.

Figure 7. Configuration Manager's Layered Design. This figure shows the layers inside the CM block, which implement the location and management of configuration files in CARMA.

The transformation of configuration files and the CM's distributed nature require a configuration store that is dynamic in both content and location. File location begins by searching the node's local configuration-file cache. If there is a miss, a query is sent to a remote CM. Locating a configuration file varies in complexity depending on the management scheme, since in some schemes there is a global view of the RC system, while in others there is not.
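As a concrete illustration of the file-location step just described, the sketch below checks a node-local cache first and falls back to a remote query on a miss. The structures and function names are hypothetical stand-ins rather than CARMA's actual interfaces, and the remote query is stubbed out.

#include <stdio.h>
#include <string.h>

#define CACHE_SLOTS 16

/* Hypothetical node-local directory of cached configuration files. */
struct config_cache {
    char entries[CACHE_SLOTS][64];   /* configuration-file names */
    int  count;
};

/* Placeholder for a remote query; a real CM would contact other nodes
 * over the control network according to the active management scheme. */
static int query_remote_cm(const char *name, char *owner, size_t len)
{
    (void)name;
    snprintf(owner, len, "node-7");  /* pretend a remote CM answered */
    return 1;
}

/* File-location layer: local cache first, remote CMs on a miss. */
static int locate_config(const struct config_cache *cache,
                         const char *name, char *owner, size_t len)
{
    for (int i = 0; i < cache->count; i++) {
        if (strcmp(cache->entries[i], name) == 0) {
            snprintf(owner, len, "local");
            return 1;                /* cache hit: no network traffic needed */
        }
    }
    return query_remote_cm(name, owner, len);
}

int main(void)
{
    struct config_cache cache = { .entries = { "fir_filter.bit" }, .count = 1 };
    char owner[32];

    if (locate_config(&cache, "des_crack.bit", owner, sizeof owner))
        printf("des_crack.bit found at %s\n", owner);
    return 0;
}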


Due to CARMA's initial deployment on cluster-based systems, the CM typically has access to SANs, which are widely deployed in clusters. SANs, such as SCI [31] and Myrinet [32], provide high-speed communication ideally suited for latency-dependent services such as configuration-file transport. To further diminish the transportation latency, the CM can exploit collective-communication mechanisms such as multicast supported by SANs [33, 34]. The CM's file-managing layer would deal with configuration-file caching and transformation (sometimes called relocation) of configuration files, in addition to defragmentation of the FPGA. Caching of configuration files is implemented by storing recently used configuration files in memory located on the RC board. For RC boards that do not support on-board configuration-file storage, the files can be stored in a RAM disk. CARMA's configuration manager currently does not support relocation as described in [9] and shown in Figure 8a, because current COTS-based RC-board software does not support partial reconfiguration. Defragmentation, shown in Figure 8b, is also not supported in the current CARMA version due to the inability to partially configure the RC boards.

Figure 8. Illustration of Relocation (Transformation) and Defragmentation [9]: a) relocation or transformation; b) defragmentation.
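The caching behavior described above can be approximated with a small least-recently-used table. The sketch below tracks file names only and uses a fixed number of slots (some devices hold up to four on-chip configurations); it is a simplified illustration, not CARMA's cache, which would also manage the bitstream data in on-board memory or a RAM disk.

#include <stdio.h>
#include <string.h>

#define SLOTS 4   /* e.g., a board that retains four configurations */

/* A tiny LRU cache of configuration-file names. */
struct lru_cache {
    char name[SLOTS][64];
    long stamp[SLOTS];    /* last-use counter per slot */
    long clock;
};

static void touch(struct lru_cache *c, int i) { c->stamp[i] = ++c->clock; }

/* Returns 1 on a hit; on a miss, evicts the least recently used slot. */
static int cache_lookup(struct lru_cache *c, const char *name)
{
    int victim = 0;
    for (int i = 0; i < SLOTS; i++) {
        if (strcmp(c->name[i], name) == 0) { touch(c, i); return 1; }
        if (c->stamp[i] < c->stamp[victim]) victim = i;
    }
    snprintf(c->name[victim], sizeof c->name[victim], "%s", name);
    touch(c, victim);
    return 0;
}

int main(void)
{
    struct lru_cache c = {0};
    const char *jobs[] = { "fft.bit", "des.bit", "fft.bit" };
    for (int i = 0; i < 3; i++)
        printf("%s: %s\n", jobs[i], cache_lookup(&c, jobs[i]) ? "hit" : "miss");
    return 0;
}

A hit avoids both the network transfer and, on boards with on-board storage, the PCI transfer, which is where the temporal locality mentioned earlier pays off.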


19 Board Interface Module A key feature of CARMAs configuration manager is that it provides board independence to higher layers. Board independence has not effectively been implemented in todays RC run-time management tools. CARMAs file-loading layer achieves this board independence with the creation of a BIM for each board. The BIM provides both the application and the CMs higher layers a module that translates generic commands into board-specific instructions. Each board type supported by CARMAs CM has a specific BIM tailored using that boards API. Figure 9 depicts the communication between the application and RC hardware through the BIM. At initialization, the CM spawns off a BIM for each of the boards within the node. The BIM remains dormant until the application requires use of the board, at which time the CM uses the BIM to configure the board. The application then sends data destined to the RC hardware to the BIM. The BIM then forwards the information in an appropriate format, and using the board-specific API, passes it to the board. After the application is finished accessing the board, the BIM goes back to its dormant state. Although the primary feature the BIM provides is board independence, the BIM also yields other advantageous features. As described the BIM provides access to the RC board for a local application, however the BIM also allows seamless and secure access to the RC board from remote nodes. Furthermore, the use of the BIM increases the reliability of the system, since applications do not access the boards directly. A security checkpoint could be established inside the BIM to screen data and configuration targeted to the RC board. However, these additional features do come at a slight overhead cost


(roughly 10 µs), decreased control of the board by the application, and minor code additions to the application.

Figure 9. Board Interface Modules.

Distributed Configuration Manager Schemes

In order to provide a scalable, fault-tolerant configuration management service for thousands of nodes, the CARMA configuration manager is fully distributed. The CARMA service is targeted for systems of 64 to 2,000 nodes and above, such as Sandia's Cplant [35] and Japan's Earth Simulator [36]. Such large-scale systems and grids will likely include RC hardware one day. In creating the distributed CM model, four distributed management schemes are proposed: Master-Worker (MW), Client-Server (CS), Pure Peer-to-Peer (PPP), and Hybrid Peer-to-Peer (HPP). Figure 10 illustrates these four schemes. While CARMA configuration management modules exist in different forms on various nodes in the four schemes, in all cases the CMs still use the communication module to communicate with one another.
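Before turning to the individual schemes in Figure 10, the sketch below illustrates the board-independence idea behind the BIM described above: a per-board table of function pointers behind one generic interface. The vendor back-ends here are stubs, not the actual Celoxica or Tarari APIs.

#include <stdio.h>

/* Generic board operations the CM's file-loading layer relies on.
 * Each supported board type supplies its own implementations. */
struct bim_ops {
    const char *board;
    int (*configure)(const char *bitstream);
    int (*write_data)(const void *buf, int len);
};

/* Stubbed board-specific back-ends (stand-ins for vendor API calls). */
static int rc1000_configure(const char *bs) { printf("RC1000 <- %s\n", bs); return 0; }
static int rc1000_write(const void *b, int n) { (void)b; printf("RC1000 write %d B\n", n); return 0; }
static int cpp_configure(const char *bs) { printf("Tarari CPP <- %s\n", bs); return 0; }
static int cpp_write(const void *b, int n) { (void)b; printf("CPP write %d B\n", n); return 0; }

static const struct bim_ops boards[] = {
    { "rc1000", rc1000_configure, rc1000_write },
    { "cpp",    cpp_configure,    cpp_write    },
};

/* Generic entry point: higher CM layers never see board-specific calls. */
static int bim_configure(const struct bim_ops *b, const char *bitstream)
{
    return b->configure(bitstream);
}

int main(void)
{
    bim_configure(&boards[1], "adder.bit");   /* same call works for any board */
    boards[1].write_data("data", 4);
    return 0;
}

The indirection through the table is where the small per-call overhead mentioned above comes from; in exchange, applications and the CM's upper layers stay unchanged when a new board type is added.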


Figure 10. Distributed Configuration Management Schemes: a) Master-Worker; b) Client-Server; c) Pure Peer-to-Peer; d) Hybrid Peer-to-Peer.

The MW scheme (Figure 10a) is a centralized scheme where the master maintains a global view of the system and has full control over job scheduling and configuration management. This scheme is representative of currently proposed CMs discussed in Chapter 2. While a centralized scheme is easy to implement, there will be performance limitations due to poor scalability for systems with a large number of nodes. The other three schemes in Figure 10 assume a distributed job-scheduling service. For the CS scheme (Figure 10b), local CMs request and receive configurations from a server. Although this scheme is likely to exhibit better performance than MW for a given number of nodes, there will also be scalability limitations as the number of nodes is increased.


The PPP scheme (Figure 10c) contains fully distributed CMs where there is no central view of the system. This scheme is similar to the Gnutella file-sharing network [37] and is described academically by Schollmeier at TUM in Munich, Germany [38]. This scheme will likely provide better latency performance since hot spots have been removed from the system. However, the bandwidth consumed by this scheme would likely be unwieldy when the number of nodes is rather large. The HPP scheme (Figure 10d) attempts to decrease the bandwidth consumed by PPP by consolidating configuration-file location information in a centralized source. The HPP scheme resembles the Napster file-sharing network [39]. The HPP scheme is also a low-overhead version of CS because local CMs receive a pointer from the broker to the client node that possesses the requested configuration. This scheme will likely further reduce the server bottleneck by reducing service time. Having multiple servers/brokers may further ease the inherent limitations in these schemes. A hierarchical, layered combination of these four schemes will likely be needed to provide scalability up to thousands of nodes. For example, nodes might first be grouped using CS or HPP, and then these groups might be grouped using PPP. The layered-group concept has been found to dramatically improve scalability of HPC services for thousands of nodes [24]. Hierarchical layering and variations of these schemes are presented with analytical scalability projections in Chapter 6. The following chapter details the experimental setup used to evaluate and experimentally compare the MW, CS, and PPP schemes up to 16 nodes; HPP has been reserved for future study.
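To make the PPP lookup pattern concrete before it is measured in the following chapters, the loop below contacts peers one at a time until a copy of the requested file is reported, which is why the worst case grows with the number of peers contacted. The peer list and the query routine are hypothetical placeholders for TCP queries between CMs.

#include <stdio.h>
#include <string.h>

/* Stand-in for a TCP query to one peer's CM: returns 1 if that peer's
 * configuration store holds the requested file. */
static int peer_has_file(const char *peer, const char *file)
{
    /* For illustration, only the last peer owns the file (worst case). */
    return strcmp(peer, "node4") == 0 && strcmp(file, "des.bit") == 0;
}

/* PPP lookup: contact peers in turn; stop at the first positive answer.
 * Best case touches one peer, worst case touches all of them. */
static const char *ppp_locate(const char *peers[], int n, const char *file)
{
    for (int i = 0; i < n; i++)
        if (peer_has_file(peers[i], file))
            return peers[i];
    return NULL;
}

int main(void)
{
    const char *peers[] = { "node2", "node3", "node4" };
    const char *owner = ppp_locate(peers, 3, "des.bit");
    printf("des.bit owner: %s\n", owner ? owner : "not found");
    return 0;
}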


CHAPTER 4 EXPERIMENTAL SETUP To investigate the performance of the CM in the MW, CS, and PPP schemes, a series of experiments are conducted. The objectives of these performance experiments are to determine the overhead imposed on the system by the CM, determine the components of latency in configuration transactions, and to provide a quantitative comparison between MW, CS, and PPP schemes. The performance metric used to compare these schemes is defined as the completion latency from the time a configuration request is received from the execution manager until the time that configuration is loaded onto the FPGA. Configuration Manager Protocols Figures 11, 12, and 13 illustrate the individual actions that compose a configuration transaction indicating the points at which latency has been measured for MW, CS, and PPP, respectively. The experimental setup includes system sizes of 2, 4, 8, and 16 nodes. In the MW and CS schemes, one node serves as the master or server while the remaining nodes are worker or client nodes. In the PPP scheme all the nodes are peer nodes. The Trigger block in Figures 11, 12, and 13 acts in place of the CARMAs execution manager and stimulates the system in a controlled periodic manner. A MW configuration transaction is composed of the following components as shown in Figure 11. The interval from 1 to 2 is defined as Creation Time and is the time it takes to create a configuration-request data structure. The interval from 2 to 3 is defined as Proxy Queue Time and is the time the request waits in the proxy queue until it 23


can be sent over a TCP connection to the worker, while the Proxy Processing Time is the time it takes the Proxy to create a TCP connection to the worker and is the interval from 3 to 4. These connections are established and destroyed with each transaction because maintaining numerous connections is not scalable. The interval from 4 to 5 is defined as Request Transfer Time and is the time it takes to send a configuration request of 148 bytes over TCP. This delay was observed to average 420 µs using TCP/IP over Gigabit Ethernet, with a variance of less than 1%.

Figure 11. Master-Worker Configuration Manager Scheme.

The interval from 5 to 6 is defined as Stub Processing Time and is the time it takes the Stub to read the TCP socket and place the request in the CM queue, while the interval from 6 to 7 is defined as the CM Queue Time and is the time the request waits in the CM queue until the CM removes it. The CM Processing Time is the time required to accept the configuration request and determine how to obtain the needed configuration file and is the interval from 7 to 8. The interval from 8 to 9 is defined as File Retrieval Time, the


time it takes to acquire the configuration file over SCI, including connection setup and tear-down, whereas the interval from 9 to 10 is defined as CM-HW Processing Time and is the time to create a request with the configuration file and send the request to the BIM. Finally, the interval from 10 to 11 is defined as HW Configuration Time and is the time it takes to configure the FPGA.

Figure 12. Client-Server Configuration Manager Scheme.

CS configuration transactions are composed of the following components, as shown in Figure 12. Note that jobs (via the Trigger block) are created on the client nodes in the CS scheme rather than on the master as in the centralized MW scheme. The intervals from 1 to 2, 2 to 3, and 3 to 4 are the Creation Time, CM Queue Time, and CM Processing Time, respectively, and are defined as in MW. The interval from 4 to 5 is defined as File Retrieval Time, which is the time required for a client to send a request to the server and receive the file in response. The intervals from 5 to 6 and 6 to 7 are the CM-HW Processing Time and HW Configuration Time, respectively, and are defined as in MW.
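As noted for the MW protocol, control connections are established and destroyed with each transaction rather than held open. The sketch below shows what one such per-transaction request might look like in C; the message layout, address, and port are illustrative and do not reproduce CARMA's actual 148-byte request format.

#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

/* Illustrative request message; the real CM exchanges a 148-byte structure
 * whose exact layout is not reproduced here. */
struct config_request {
    char     file_name[64];
    uint32_t board_id;
    uint32_t job_id;
};

/* Open a control connection, send one request, and close it again.
 * Connections are not kept open between transactions. */
static int send_request(const char *ip, int port, const struct config_request *req)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) { perror("socket"); return -1; }

    struct sockaddr_in addr = {0};
    addr.sin_family = AF_INET;
    addr.sin_port   = htons(port);
    inet_pton(AF_INET, ip, &addr.sin_addr);

    if (connect(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        write(fd, req, sizeof *req) != (ssize_t)sizeof *req) {
        perror("request");
        close(fd);
        return -1;
    }
    close(fd);   /* torn down immediately after the transaction */
    return 0;
}

int main(void)
{
    struct config_request req = { .board_id = 0, .job_id = 42 };
    snprintf(req.file_name, sizeof req.file_name, "des.bit");
    return send_request("127.0.0.1", 5000, &req) == 0 ? 0 : 1;
}

The connection setup and tear-down are what the Proxy Processing and Request Transfer intervals capture in the measurements that follow.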


Figure 13. Pure Peer-to-Peer Configuration Manager Scheme.

PPP configuration transactions are composed of the following components, as shown in Figure 13. Note that jobs (via the Trigger block) are created on the requesting peer nodes in the same way as in the CS scheme. The intervals from 1 to 2, 2 to 3, and 3 to 4 are the Creation Time, CM Queue Time, and CM Processing Time, respectively, and are defined as they are in the two previous schemes. On the requesting peer, the interval from 4 to 5 is defined as File Retrieval Time, which is the time required for one peer to send a request to another peer and receive the file in response. The File Retrieval Time includes the time it takes for the requesting peer to contact each peer individually over TCP to request the needed configuration file. In order to explore the best-case and worst-case scenarios of configuration-file distribution, the experiment was performed in two forms. In the worst-case scenario, the requesting node must contact all other peers for the needed configuration file, and the last peer contacted is the one that has the file. In the best-case PPP, the first peer the requesting node contacts contains the configuration file. The intervals from 5 to 6 and 6 to 7 are the CM-HW


Processing Time and HW Configuration Time, respectively, and are defined as in the two previous schemes. On the responding peer, the interval from A to B is defined as Query Response Time, which is the time required for a peer to determine if it has the requested configuration file. The stub sends a message to the configuration store to send the file to the requesting node, in addition to responding back to the requesting node that the configuration file was found. The interval from B to C is defined as the Configuration Store Queue Time and is the time the request waits in the configuration-store queue until the configuration store can service it. The time for the configuration store to establish an SCI connection to the requesting peer and send the configuration file is referred to as Configuration Store Processing Time and is defined as the interval between C and D.

Experimental Setup

Experiments are performed with one instance of the CM running on each node, with the host system consisting of a Linux server with dual 2.4 GHz Xeon processors and 1 GB of DDR RAM. The control network between nodes is switched Gigabit Ethernet. The data network is 5.3 Gb/s SCI connected in a 2D torus with Dolphin D337 cards using the SISCI software release 2.1. Each computing node contains a Tarari HPC board (i.e., CPX2100) [17] housed in a 66 MHz, 64-bit PCI bus slot. For this study, the number of nodes is varied over 2, 4, 8, and 16 nodes. The 2D topology of the SCI network is shown in Figure 14. In experiments with the MW and CS schemes, the CM is executed with one node serving as master/server and the others as computing nodes. In the 2-, 4-, 8-, and 16-node cases the master/server always resides in the same node.


Figure 14. Experimental Setup of SCI nodes. Nodes are labeled Master (M), Servers (S), Workers (W), Clients (C), and Peers (P).

The interval between requests, known as the Configuration Request Interval, is chosen as the independent variable and is varied from 1 s to 80 ms. The value of 80 ms is selected as the lower bound because this value is determined to be the minimum-delay time to retrieve a Virtex-II 1000 configuration file (~20 ms) and configure a Tarari board (~60 ms). The configuration time of the board remains constant throughout the experiments and is a function of the board's API, independent of CARMA's CM. In each trial, 20 requests were submitted with constant periodicity by the trigger, each requesting a different configuration file. Thus the trigger mimics the execution manager's ability to handle configuration requests for multiple tasks that are running in the system. Since the execution manager is serialized, the trigger only sends one request at a time and is designed to do so in a periodic manner. This setup mimics the multitasking ability of the


execution manager for 20 tasks running on the node, each requiring a different configuration file. The timestamps for measuring all intervals are taken using the C function gettimeofday(), which provides a 1 µs resolution. Each experiment for each of the three schemes was performed at least three times, and the values have been found to have a variance of at most 11%. The following chapter presents the final run of each experiment as representative results.
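The timing instrumentation can be reduced to a few lines of C: two gettimeofday() samples bracket the measured interval, and the difference is reported at microsecond granularity. The placeholder loop below merely stands in for an actual configuration transaction.

#include <stdio.h>
#include <sys/time.h>

/* Microseconds elapsed between two gettimeofday() samples. */
static long elapsed_us(const struct timeval *start, const struct timeval *end)
{
    return (end->tv_sec - start->tv_sec) * 1000000L +
           (end->tv_usec - start->tv_usec);
}

int main(void)
{
    struct timeval t1, t2;

    gettimeofday(&t1, NULL);
    for (volatile long i = 0; i < 10 * 1000 * 1000; i++)
        ;                            /* placeholder for one configuration transaction */
    gettimeofday(&t2, NULL);

    printf("completion latency: %.3f ms\n", elapsed_us(&t1, &t2) / 1000.0);
    return 0;
}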


CHAPTER 5 EXPERIMENTAL RESULTS

Each of the completion-latency components is measured across all computing nodes in the MW, CS, and PPP schemes, and the results are summarized along with maximum, minimum, and average values in Figures 15, 16, and 17, respectively. All schemes have four computing nodes, and the master/server is located in an adjacent SCI ring. PPP is performed in the worst-case scenario, as defined in Chapter 4. The results for request intervals larger than 250 ms are not shown because these values are found to be virtually identical to the 250 ms trial for all schemes. Values in this region are found to be relatively flat because the master/server/peer is able to service requests without imposing additional delays. In this stable region, requests are only delayed by fixed values that are the sum of File Retrieval Time and HW Configuration Time, which is approximately equal to 130 ms, 140 ms, and 190 ms for MW, CS, and PPP, respectively.

Figure 15. Completion Latency for Four Workers in MW.


Figure 16. Completion Latency for Four Clients in CS.

Figure 17. Completion Latency for Four Peers in PPP.

Several observations can be made from the values in Figures 15, 16, and 17. In MW, CS, and PPP, the major components of completion latency are CM Queue Time, File Retrieval Time, and HW Configuration Time, the largest of these naturally being CM Queue Time as resources become overloaded (i.e., decreasing configuration request interval). The remaining components combined are in the worst case less than 2% of the total completion latency, and thus are not shown in Figures 15, 16, and 17. The data


32 demonstrates that the CM design in CARMA imposes virtually no overhead on the system in all cases. Furthermore, the data shows that when the request interval is large, File Retrieval Time and HW Configuration Time are the dominant factors. CM Queue Time and File Retrieval Time are the most dominant components of completion latency when the request interval is small. In Figure 15, CM Queue Time increases rapidly while File Retrieval Time grows steadily. In Figure 16, both CM Queue Time and File Retrieval Time grow steadily. However, in the PPP scheme, as seen in Figure 17, File Retrieval Time holds steady and CM Queue Time increases only after the configuration request interval is less than the File Retrieval Time. This phenomenon is a result of how the schemes are designed. Since PPP is designed with no single point of contention, the only factor in its queuing delay is CM-processing time dominated by File Retrieval Time. When the arrival rate is greater than the processing time, the CM Queue Time increases steadily. In the MW and CS schemes the File Retrieval Time is dependent on a centralized master or server, respectively. Another useful observation shown in Figures 15, 16 and 17 is the spread between the maximum and minimum values. In MW and PPP the spread is relatively small, at worst a 14% deviation from the average for MW and at worst 23% deviation for PPP. CS on the other hand experiences a worst-case 98% deviation from the average. The following figures investigate this phenomenon between the MW and CS schemes. Figure 18 and 19 present a per-node view of MW and CS respectively, which shows how the two schemes differ significantly. In CS, there is a greater variability in the completion latency each node experiences due to server access. Nodes closer to the server on the SCI-torus experience smaller file-transfer latencies, which allow them to


receive and therefore request the next configuration file faster. Completion latencies in the MW scheme have less variability between nodes because all requests are serialized in the master. Therefore, no one node receives an advantage based on proximity.

Figure 18. Completion Latency of Four Workers with Master in Adjacent SCI Ring.

Figure 19. Completion Latency of Four Clients with Server in Adjacent SCI Ring.

Figure 20 illustrates the experiment where the server is moved into the same SCI ring as 3 clients. In the experimental results shown in Figure 19, the server resides on the


adjacent SCI ring, requiring a costly dimension switch to transfer files to clients. The data shows that the client closest to the server experienced a File Retrieval Time lower than that of other nodes at the stressed configuration-request interval of 100 ms. However, in the experiment shown in Figure 20 all the clients experienced similar File Retrieval Times at the stressed configuration-request interval of 100 ms.

Figure 20. Completion Latency of Three Clients with Server in the Same SCI Ring.

This dependency on client-to-server proximity is further illustrated in Figure 21, which shows the CS scheme for 8 nodes. Figure 21 shows results from the 8-node case where 3 clients reside on the same SCI ring as the server and 4 clients are on the adjacent ring. The spread between nodes is approximately 500 ms at a configuration-request interval of 80 ms. Figures 22, 23, and 24 present the scalability of MW, CS, and PPP (worst-case), respectively. Figure 22 shows that MW is not scalable since completion latency dramatically increases as the number of nodes increases, on average tripling from 2 to 4 nodes and from 4 to 8 nodes.


Figure 21. Completion Latency of Eight Clients in CS.

Notably, the 16-node MW trial incurs a queuing delay when the configuration-request interval is 1 s, which no other scheme incurs. These results clearly show that the traditional centrally managed schemes developed to date cannot be used for even relatively small distributed systems. Figure 23 shows that CS has a low completion latency for small numbers of nodes (i.e., 2 and 4 nodes) but increases rapidly for larger numbers of nodes (i.e., 8 and 16). For all system sizes, the completion latency of CS was roughly only 10% of that of MW. Figure 24 reveals that worst-case PPP has a constant increase in completion latency over 2, 4, 8, and 16 nodes. PPP has a higher completion latency for small numbers of nodes (i.e., 2 and 4 nodes), being roughly 1.5 times that of CS. However, for larger numbers of nodes (i.e., 16) PPP has a lower average completion latency than CS, measuring close to 500 ms lower at configuration-request intervals less than 200 ms.


Figure 22. Completion Latency of MW Scheme for 2, 4, 8, and 16 nodes.

Figure 23. Completion Latency of CS Scheme for 2, 4, 8, and 16 nodes.


Figure 24. Completion Latency of Worst-Case PPP Scheme for 2, 4, 8, and 16 nodes.

Figures 22-24 show that Pure Peer-to-Peer and CS provide better scalability than MW. However, larger testbeds are needed to investigate the scalability of these schemes at larger system sizes. As described earlier in this chapter, PPP has no single point of contention or hot spot; thus, File Retrieval Time is relatively constant regardless of the configuration-request interval. Furthermore, having a fully distributed CM service also provides a great deal of fault tolerance. The next chapter provides an analytical scalability study of MW, CS, and PPP based on the experimental data shown above, as well as possible extensions to these schemes.


CHAPTER 6 PROJECTED SCALABILITY

The current trend of cluster-based RC systems scaling to larger numbers of nodes will likely continue as RC technology advances. One day RC computing clusters might rival the size and computing power of Sandia National Laboratories' Cplant [35], a large-scale parallel-computing cluster. Coupling parallel-computing power of this level with RC hardware yields a daunting system-management task. CARMA's configuration manager is one solution for allocating and staging configuration files in such large-scale distributed systems; however, its performance at such system sizes should be explored. Based on the results presented in the previous chapter, the CS and PPP schemes scale the best up to 16 nodes. Taking the experimental results for 2-, 4-, 8-, and 16-node systems, the completion latencies for larger systems can be projected.

Completion Latency Projections

Since searching for configuration files in large-scale RC systems has no precedent, current lookup algorithms available for PPP networks can be used to estimate the number of nodes contacted in a query. The Gnutella lookup algorithm is a good starting point because of its simplicity and low overhead. Another lookup algorithm for PPP networks, Yappers, proposed by the CS department at Stanford University [40], uses hash tables; however, this would impose heavier query processing. CARMA's PPP scheme would lie in the middle of these algorithms. A Gnutella-style algorithm would contact all nodes, while Yappers using the minimum number of hash buckets, resulting in minimum overhead, would contact roughly 25% of the nodes.


For the projections presented in this thesis, typical-case PPP is assumed to be 25% of worst-case (i.e., all nodes contacted) PPP. A projection to larger node counts is done with curves of the form y = mx + b for PPP and y = mx^2 + b for CS. These curves are chosen because they are the best-fit curves to the experimental data presented in Chapter 5. Both m and b are calculated mathematically, and their values are found to be m = 187.37 and b = 138.6 for PPP, and m = 10.85 and b = 122.4 for CS. Figure 25 shows the results for CS, worst-case PPP (i.e., all nodes contacted), and typical-case PPP (i.e., 25% of nodes contacted) projected to 4096 nodes, since Dolphin's SCI hardware scales to system sizes of 4096 [41]. The values used for the projection are at a configuration-request interval of 100 ms, since it is at this point (see Figures 16-18) that the CM is adequately stressed.

Figure 25. Completion Latency Projections for Worst-Case PPP, Typical-Case PPP, and CS. Note: Logarithmic Scales. (Axes: System Size, 2 to 4096 nodes, vs. Completion Latency (ms).)
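To make the projection concrete, the following minimal Python sketch evaluates the fitted curves at the projected system sizes. It assumes the curve forms and coefficients quoted above (values in milliseconds); the function names are chosen here for illustration and are not part of CARMA.

    # Sketch of the completion-latency projection curves (values in ms),
    # assuming the best-fit forms and coefficients quoted above.
    def ppp_worst_latency(n):
        # Worst-case PPP (all nodes contacted): linear fit y = m*x + b
        return 187.37 * n + 138.6

    def ppp_typical_latency(n):
        # Typical-case PPP is assumed to be 25% of worst-case PPP
        return 0.25 * ppp_worst_latency(n)

    def cs_latency(n):
        # CS: quadratic fit y = m*x^2 + b
        return 10.85 * n ** 2 + 122.4

    if __name__ == "__main__":
        for n in (2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096):
            print(n, round(cs_latency(n)), round(ppp_worst_latency(n)),
                  round(ppp_typical_latency(n)))

Under these assumed fits the CS curve grows quadratically while both PPP curves grow only linearly, which reproduces the crossover behavior discussed below for Figure 25.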


Figure 25 shows that CS has greater completion latencies than worst-case PPP when group sizes are greater than 12, and greater than typical-case PPP when group sizes are greater than 6. Another observation from Figure 25 is that the schemes do not scale well for system sizes larger than 32. The completion latency of the CS scheme is of the order of 10^8 seconds for a system size of 4096, while a 32-node system yields completion latencies of a more reasonable 2 seconds. The data shows that in order to achieve reasonable completion latencies and scale to larger systems, CM nodes should be grouped into a layered hierarchy.

Hierarchy Configuration Managers

For this analytical study a two-layered hierarchy is chosen, where each layer can be implemented with either the CS or the typical-case PPP scheme. Typical-case PPP is used to more closely model a real system. Consequently, the following four permutations are investigated: PPP-over-CS, CS-over-CS, CS-over-PPP, and PPP-over-PPP, shown in Figure 26.

Figure 26. Four Layered Hierarchies Investigated: a) PPP over PPP, b) PPP over CS, c) CS over PPP, d) CS over CS.


The lower layer is divided into groups, where one node in each group is designated the head node. If the lower-layer scheme is CS, the server is the head node; if the lower-layer scheme is PPP, one peer is designated as the head node. The head node of each group then searches for and transfers configuration files in the upper layer. In PPP-over-PPP, shown in Figure 26a, a requesting node contacts the nodes in its group via the PPP scheme described earlier. Normally a node can find the requested configuration file within its group; when it cannot, it contacts the head node of its group, which then retrieves the configuration file from the upper layer. This outcome is known as a configuration miss. On a configuration miss, the head node of the requesting group contacts the head nodes of other groups in search of the requested configuration file, as dictated by the PPP scheme. When the requesting head node contacts other head nodes, each contacted head node searches for the configuration file within its group using the PPP scheme. In PPP-over-CS, shown in Figure 26b, a requesting node contacts the server for its group. When the server does not have the requested configuration file, it contacts other servers using the PPP scheme. Figure 26c shows CS-over-PPP, in which the nodes search for configuration files within their groups using PPP. When a configuration file is not found within the group, the head node of the group contacts the server in the upper layer. Figure 26d shows CS-over-CS, in which nodes contact the group's server, which then contacts the upper-layer server on a configuration miss. Using these hierarchical protocols, projection equations for completion latency can be derived; they are shown in Table 1.


Table 1. Completion Latency Projection Equations for System Hierarchies.
  PPP-over-PPP:  P·PPP(g) + Q·[P·PPP(n/g) + PPP(g)]
  PPP-over-CS:   CS(g) + Q·[P·PPP(n/g)]
  CS-over-PPP:   P·PPP(g) + Q·CS(n/g)
  CS-over-CS:    CS(g) + Q·CS(n/g)

The functions PPP(x) and CS(x) represent the average completion latency experienced by one configuration request in a group of x nodes using the PPP and CS schemes, respectively. These functions are projected beyond 16 nodes and were presented earlier in this chapter. The variable n represents the number of nodes in the system, while g represents the group size. Groupings of 8, 16, 32, 64, and 128 are investigated, since these group sizes are reasonable for system sizes up to 4096 nodes. P represents the percentage of nodes contacted in the PPP scheme; since typical-case PPP is used, P has a value of 25%. Q is the configuration-miss rate and is assumed to occur 10% of the time based on measurements of distributed file systems in [42]. Although research into configuration caching is left for future work, the value of 10% is reasonable for this RC system. Using these equations and varying both group size and system size, a matrix of values is calculated. The optimal group size in each hierarchy as the system size increases is determined and shown in Figure 27.
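The sketch below illustrates one way this matrix calculation can be carried out: it evaluates the Table 1 equations (as reconstructed above) over a set of candidate group sizes and reports the latency-optimal group size for each hierarchy. The parameter values (P = 0.25, Q = 0.10) and projection curves follow the assumptions stated above; the helper names are illustrative and not part of CARMA.

    # Sketch: evaluate the Table 1 hierarchy equations and report the
    # latency-optimal group size per hierarchy (assumed P, Q, and fits).
    P, Q = 0.25, 0.10                       # fraction contacted, miss rate

    def PPP(x):                             # projected worst-case PPP latency, ms
        return 187.37 * x + 138.6

    def CS(x):                              # projected CS latency, ms
        return 10.85 * x ** 2 + 122.4

    HIERARCHIES = {
        "PPP-over-PPP": lambda n, g: P * PPP(g) + Q * (P * PPP(n / g) + PPP(g)),
        "PPP-over-CS":  lambda n, g: CS(g) + Q * (P * PPP(n / g)),
        "CS-over-PPP":  lambda n, g: P * PPP(g) + Q * CS(n / g),
        "CS-over-CS":   lambda n, g: CS(g) + Q * CS(n / g),
    }

    # Group size 4 is included here since it appears in Figure 27 and Table 4.
    GROUP_SIZES = (4, 8, 16, 32, 64, 128)

    def optimal_group(name, n):
        # Return (group size, projected latency in ms) with the lowest latency.
        return min(((g, HIERARCHIES[name](n, g)) for g in GROUP_SIZES if g <= n),
                   key=lambda pair: pair[1])

    for name in HIERARCHIES:
        print(name, optimal_group(name, 4096))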


Figure 27 shows that PPP-over-CS has a gradual change of optimal group sizes, ending with an optimal group size of 8 nodes for system sizes > 512, while CS-over-PPP has a rapid change of optimal group sizes over system sizes, ending with an optimal group size of 128 nodes for system sizes > 2048. Another observation is that the upper-layer scheme determines the optimal group size. For instance, a 128-node group size minimizes the high completion latency of CS in the CS-over-PPP scheme, while an 8-node group size exploits the low completion latency of PPP in the PPP-over-CS scheme. Figures 28 and 29 show the completion latency of the hierarchies using the optimal group size at the appropriate system sizes.

Figure 27. Optimal Group Sizes for each Hierarchy as System Size Increases (latency-focused optimal group sizes for each hierarchy at system sizes from 4 to 4096 nodes).

The data in Figure 28 shows that for system sizes of up to 512, PPP-over-CS with groups of 4 has the lowest completion time. Furthermore, for system sizes of 512 to 1024 nodes, PPP-over-PPP with groups of 8 has the lowest completion time, and for system sizes of 1024 to 4096, PPP-over-PPP with groups of 16 has the lowest completion time, as shown in Figure 29.

Consumed Bandwidth Projections

While the PPP-over-PPP hierarchy is projected to achieve the lowest completion latency at large scale, the control-communication bandwidth between nodes could be substantial. This section presents analytical control-bandwidth calculations for the previously presented layered hierarchies. The data-network bandwidth utilization for all the schemes is no worse than 7.7% and remains constant regardless of system size, and therefore is not investigated.


Figure 28. Completion Latency Projections with Optimal Group Sizes up to 500 Nodes. (Axes: System Size (# of nodes) vs. Completion Latency (ms).)

Figure 29. Completion Latency Projections with Optimal Group Sizes. (Axes: System Size (# of nodes) vs. Completion Latency (ms).)

Using the bandwidth formula of Equation (1), presented in [24], the control-bandwidth values can be determined.


The parameters of Equation (1) are as follows: B is the bandwidth per node, ℓ is the number of layers in the system, L_i is the total number of bytes transferred in the i-th layer, and f_i is the configuration-request frequency of the i-th layer.

  B = Σ (i = 1 to ℓ) L_i · f_i                    (1)

Since ℓ is set to 2, and L_i and f_i are constant for both layers, the formula is simplified. Furthermore, when adapting Equation (1) to the CM protocols, Equation (2) is produced.

  B = [S_1(g) + Q · S_2(n/g)] · f                 (2)

The additional parameters are as follows: Q is the configuration-miss rate and is again assumed to be 10%, and the functions S_1(x) and S_2(x) describe the average amount of data used in the lower and upper layers, respectively. As before, the variable n represents the number of nodes in the system and g represents the group size, while f_i simplifies to the system's configuration-request frequency, f. As in the completion-latency projections, groupings of 8, 16, 32, 64, and 128 are investigated.

Table 2 shows the data-consumption equations for each management scheme based on its protocol. The parameter P represents the average percentage of nodes contacted in finding the configuration file. The parameter L_f is the total number of bytes transferred when the CM fails to find the configuration file, and L_s is the total number of bytes when the CM succeeds in finding the configuration file. The parameter L_r is the total number of bytes sent to the server to request a configuration file.

Table 2. Control Consumption Equations for each Management Scheme.
  PPP:  PPP_bandwidth(n) = P · n · (L_f + L_s) · (n - 2) / 2
  CS:   CS_bandwidth(n) = n · L_r / 2


Using the setup previously described in Chapter 6, the parameter values for Equation (2) are obtained as follows. P is set to 25% of the nodes in the system. L_f and L_s are equal in the PPP scheme, since a node responds to a query by returning the query message to the requesting node. The message size is 74 bytes and breaks down as 14 bytes for the Ethernet header, 20 bytes for the TCP header, 20 bytes for the IP header, and 20 bytes of payload. The parameter L_r totals 74 bytes as well, with the same breakdown as L_f and L_s. Finally, f is set to correspond to the configuration-request interval of 100 ms used in the completion-latency calculations. Performing the appropriate substitutions of the data-consumption equations from Table 2 into the general bandwidth Equation (2) produces the bandwidth equations for each of the four hierarchies, shown in Table 3.

Table 3. Bandwidth Equations for System Hierarchies.
  PPP-over-PPP:  [P·g·L_f + Q·(P·(n/g))·(P·g·(L_f + L_s))] · f
  PPP-over-CS:   [L_r + Q·(P·(n/g))·(L_f + L_s)] · f
  CS-over-PPP:   [P·g·(L_f + L_s) + Q·L_r] · f
  CS-over-CS:    [L_r + Q·L_r] · f

Using the equations in Table 3, the bandwidth consumed by each hierarchy over the entire network per request is calculated.
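As an illustration, the sketch below evaluates the Table 3 expressions (as reconstructed above) with the 74-byte control messages and the 100 ms request interval, i.e., f = 10 requests per second. The equation forms, parameter values, and function names are assumptions carried over from the reconstruction above rather than code taken from CARMA.

    # Sketch: projected control bandwidth per hierarchy from Table 3,
    # assuming 74-byte control messages and a 100 ms request interval.
    P, Q = 0.25, 0.10                       # fraction contacted, miss rate
    f = 1.0 / 0.100                         # requests per second
    L_f = L_s = L_r = 74                    # bytes per control message

    def consumed_bandwidth(hierarchy, n, g):
        # Bytes consumed per second over the network for the given hierarchy.
        if hierarchy == "PPP-over-PPP":
            per_request = P*g*L_f + Q * (P*(n/g)) * (P*g*(L_f + L_s))
        elif hierarchy == "PPP-over-CS":
            per_request = L_r + Q * (P*(n/g)) * (L_f + L_s)
        elif hierarchy == "CS-over-PPP":
            per_request = P*g*(L_f + L_s) + Q*L_r
        elif hierarchy == "CS-over-CS":
            per_request = L_r + Q*L_r
        else:
            raise ValueError(hierarchy)
        return per_request * f

    # Example: a 4096-node system with 32-node groups.
    for name in ("PPP-over-PPP", "PPP-over-CS", "CS-over-PPP", "CS-over-CS"):
        print(name, round(consumed_bandwidth(name, 4096, 32)), "bytes/s")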


Figure 30 shows the calculated results using the optimal group size found for completion latency at the appropriate system sizes. As seen in Figure 30, PPP-over-PPP incurs the most bandwidth and increases rapidly at larger system sizes. PPP-over-CS also increases with system size; however, it consumes only 12% of the bandwidth of PPP-over-PPP. Hierarchies with CS in the upper layer have constant bandwidth, as can be seen from the bandwidth consumption of CS-over-CS and CS-over-PPP. However, CS-over-CS consumes only roughly 3% of the bandwidth consumed by CS-over-PPP.

Figure 30. Network Bandwidth Consumed over Entire Network per Request. (Axes: System Size (# of nodes) vs. Network Overhead (Mb/s).)

From Figures 25, 28, 29, and 30, the optimal node configuration for an RC system is derived. Table 4 presents latency-bound, bandwidth-bound, and best-overall system configurations. It should be noted that the bandwidth-bound constraint category excludes hierarchies with average completion-latency values greater than 5 seconds, since the low-bandwidth benefits of these hierarchies cannot overcome their high completion latency. Table 4 provides a summary of the projection results and can be used to make CM setup decisions in future systems. For small system sizes (< 32), the latency-bound and bandwidth-bound categories have the same configurations. However, as the number of nodes increases, the latency-bound category prefers hierarchies containing the PPP scheme, while the bandwidth-bound category prefers the CS scheme. Furthermore, as the system size increases, so do the group sizes.


The best-overall category attempts to reduce the bandwidth penalty of PPP schemes by choosing hierarchies that provide lower bandwidth requirements with minimal increase in latency. Thus, this category follows the latency-bound category up to 512 nodes, since PPP-over-PPP consumes a significantly greater amount of bandwidth than the other hierarchies at system sizes greater than 512 (see Figure 30).

Table 4. System Configurations for Given Constraints over System Sizes.
  Latency bound:    < 8 nodes: Flat CS;  8 to 32: CS-over-CS, group size 4;  32 to 512: PPP-over-CS, group size 4;  512 to 1024: PPP-over-PPP, group size 8;  1024 to 4096: PPP-over-PPP, group size 16.
  Bandwidth bound*: < 8 nodes: Flat CS;  8 to 32: CS-over-CS, group size 4;  32 to 512: CS-over-CS, group size 8;  512 to 1024: PPP-over-CS, group size 8;  1024 to 4096: PPP-over-CS, group size 8.
  Best overall:     < 8 nodes: Flat CS;  8 to 32: CS-over-CS, group size 4;  32 to 512: PPP-over-CS, group size 4;  512 to 1024: PPP-over-CS, group size 8;  1024 to 4096: PPP-over-CS, group size 8.
  *Schemes with completion-latency values greater than 5 seconds excluded.
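Table 4 can be applied as a simple lookup from the dominant system constraint and the system size to a recommended CM configuration; the sketch below encodes it directly. The boundary handling (e.g., whether a size of exactly 32 falls in the lower or upper range) is ambiguous in the table and is an assumption here, as are all identifiers.

    # Sketch: Table 4 as a lookup from (constraint, system size) to a
    # recommended configuration: (hierarchy, group size or None for flat CS).
    TABLE_4 = {
        "latency":   [(8, ("Flat CS", None)), (32, ("CS-over-CS", 4)),
                      (512, ("PPP-over-CS", 4)), (1024, ("PPP-over-PPP", 8)),
                      (4096, ("PPP-over-PPP", 16))],
        "bandwidth": [(8, ("Flat CS", None)), (32, ("CS-over-CS", 4)),
                      (512, ("CS-over-CS", 8)), (1024, ("PPP-over-CS", 8)),
                      (4096, ("PPP-over-CS", 8))],
        "overall":   [(8, ("Flat CS", None)), (32, ("CS-over-CS", 4)),
                      (512, ("PPP-over-CS", 4)), (1024, ("PPP-over-CS", 8)),
                      (4096, ("PPP-over-CS", 8))],
    }

    def recommend(constraint, n):
        # Return the recommended (hierarchy, group size) for an n-node system.
        for upper_bound, config in TABLE_4[constraint]:
            if n <= upper_bound:
                return config
        raise ValueError("projections in this study stop at 4096 nodes")

    print(recommend("overall", 256))        # -> ('PPP-over-CS', 4)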


CHAPTER 7
CONCLUSIONS AND FUTURE WORK

Traditional computing is inefficient at fine-grain data manipulation and highly parallel operations; RC arose to provide the flexibility and performance needed for today's HPC applications. First-generation RC systems typically couple FPGA boards with general-purpose processors, thus merging the flexibility of general-purpose processors with the performance of an ASIC. RC achieves this flexibility through the use of configuration files and partial reconfiguration, which decreases configuration-file sizes. RC has increased the performance of a variety of applications, such as cryptology, including encryption and decryption, and space-based processing, such as hyperspectral imaging. Recent trends extend RC systems to clusters of machines interconnected with SANs. The HCS Lab at Florida has developed a 9-node RC cluster to examine middleware and service issues. The Air Force has a 48-node RC cluster, while an earlier cluster exists at Virginia Tech. These clusters use High-Performance Networks (HPNs), which provide them with a high-throughput, low-latency network. Such networks are ideally suited to transfer latency-critical configuration files among nodes. Following the COTS mentality of cluster-computing networks, COTS-based RC boards are typically used to reduce cost. Two such RC boards, among many others, are Celoxica's RC1000 and Tarari's Content Processing Platform.


Such heterogeneous and distributed systems warrant an efficient method of deploying and managing configuration files that does not overshadow the performance gains of RC computing. The Comprehensive Approach to Reconfigurable Management Architecture (CARMA) framework seeks to specifically address key issues in RC management. While some configuration-file management issues could relate to traditional HPC, other aspects may not. FPGA configuration is a critical processing component that must be handled with care to ensure an RC speedup over a traditional system. Reusing the RC hardware during a process's execution, known as Run-Time Reconfiguration (RTR), increases system performance compared to static configuration. Furthermore, methods such as configuration compression, transformation, defragmentation, and caching can reduce configuration overhead further. RC systems implement a Configuration Manager (CM) to handle the issues that arise from RTR and configuration-overhead reduction. CARMA's configuration manager builds upon a few noteworthy designs, two of which are the RAGE from the University of Glasgow and the reconfiguration manager from Imperial College, UK. In response to the recent trend toward COTS-based distributed RC clusters, the HCS Lab presents a modular and distributed configuration-manager middleware. CARMA's execution manager, analogous to the VHM in RAGE, coordinates the execution of configuration commands. The configuration-manager module manages the configuration files using a layered architecture. The BIM, like the device driver of RAGE, handles low-level details of board configuration and communication. CARMA's configuration manager extends the configuration store of Imperial College's reconfiguration manager by maintaining a distributed store, and thus requires a communication module. Since CARMA is fully distributed and targeted for thousands of nodes, various distributed file-management schemes are investigated.


These include Master-Worker (MW), which mimics the fully centralized job schedulers of today's RC designs, which farm jobs and data out to workers, and Client-Server (CS), which is similar to FTP servers. Also included are Pure Peer-to-Peer (PPP), like the Gnutella file-sharing network, and Hybrid Peer-to-Peer (HPP), like the Napster file-sharing network. In order to investigate completion latency and its components, experiments were performed in which the configuration-request interval was varied. Additionally, the scalability of the CM up to 16 nodes was investigated. The data gathered shows that File Retrieval Time and HW Configuration Time are the dominant factors when request intervals are large. Furthermore, CM Queue Time is naturally the most dominant factor when request intervals are small. It is also observed that the design of CARMA's configuration manager imposes very little overhead on the system. In regard to scalability, the MW scheme is shown not to be scalable beyond 4 nodes. CS performs best for a small number of nodes (2-8), while PPP is found to be the most scalable up to 16 nodes. Analytical models were used to extrapolate latency for larger systems and to predict scalability for future testbeds. These calculations showed that in order to scale systems to thousands of nodes, hierarchical schemes are required. Since experimental results showed that CS and PPP are the only scalable schemes beyond 8 nodes, they are permuted into two-layered hierarchical schemes. These hierarchies include PPP-over-PPP, PPP-over-CS, CS-over-PPP, and CS-over-CS, in which the lower-layer nodes are grouped. The data showed that in a heterogeneous hierarchy, the upper-layer scheme determines the optimal group size. Moreover, for system sizes ranging from 32 to 512, PPP-over-CS with a group size of 4 yields the best completion-latency performance. For larger system sizes (512 to 4096), PPP-over-PPP with a group size of 16 yields the best completion-latency performance.


In addition to completion latency, the bandwidth consumed by these hierarchies is investigated. The results showed that PPP-over-CS with a group size of 8 yields the best results, not only in terms of bandwidth but also in overall scalability, for large systems (> 512). Directions for future work include improving the CM schemes, such as the existing PPP scheme, and implementing and evaluating the HPP scheme. In addition, caching of configuration files should be investigated, in particular caching algorithms, miss rates, and configuration-file locality. Moreover, the analytical projections made in this thesis should be validated in future work. Bandwidth and utilization experiments should be conducted on larger systems as well. Another direction for future work is to extend the features of CARMA's configuration manager by supporting new SANs and RC boards and by incorporating advanced configuration file-management techniques (e.g., defragmentation).


LIST OF REFERENCES

[1] V. Ross, "Heterogeneous HPC Computing," presented at 35th Government Microcircuit Applications and Critical Technology Conference, Tampa, FL, April 2003.

[2] A. Jacob, I. Troxel, and A. George, "Distributed Configuration Management for Reconfigurable Cluster Computing," presented at International Conference on Engineering of Reconfigurable Systems and Algorithms, Las Vegas, NV, June 2004.

[3] I. Troxel and A. George, UF-HCS RC Group Q3 Progress Report [online] 2004, http://www.hcs.ufl.edu/prj/rcgroup/teamHome.php (Accessed: May 25, 2004).

[4] I. Troxel, A. Jacob, A. George, R. Subramaniyan, and M. Radlinski, "CARMA: A Comprehensive Management Framework for High-Performance Reconfigurable Computing," to appear in Proc. of 7th International Conference on Military and Aerospace Programmable Logic Devices, Washington, DC, September 2004.

[5] Visual Numerics, Inc., IMSL Mathematical & Statistical Libraries [online] 2004, http://www.vni.com/products/imsl/ (Accessed: March 25, 2004).

[6] Xilinx, Inc., Common License Consortium for Intellectual Property [online] 2004, http://www.xilinx.com/ipcenter/ (Accessed: February 13, 2004).

[7] A. Derbyshire and W. Luk, "Compiling Run-Time Parameterisable Designs," presented at 1st IEEE International Conference on Field-Programmable Technology, Hong Kong, China, December 2002.

[8] Chameleon Systems, Inc., CS2000 Reconfigurable Communications Processor, Family Product Brief [online] 2000, http://www.chameleonsystems.com (Accessed: May 12, 2003).

[9] K. Compton and S. Hauck, "Reconfigurable Computing: A Survey of Systems and Software," ACM Computing Surveys, vol. 34, no. 2, June 2002, pp. 171-210.

[10] Xilinx, Inc., Virtex Xilinx-II Series FPGAs [online] 2004, http://www.support.xilinx.com/publications/matrix/virtex_color.pdf (Accessed: February 13, 2004).


[11] A. J. Elbirt and C. Paar, "An FPGA Implementation and Performance Evaluation of the Serpent Block Cipher," in Proc. of 8th International Symposium on Field Programmable Gate Arrays, Monterey, CA, February 2000, pp. 33.

[12] J. R. Hauser and J. Wawrzynek, "Garp: A MIPS Processor with a Reconfigurable Coprocessor," in Proc. of 5th IEEE Symposium on Field-Programmable Custom Computing Machines, Napa Valley, CA, April 1997, pp. 12.

[13] K. H. Leung, K. W. Ma, W. K. Wong, and P. H. Leong, "FPGA Implementation of a Microcoded Elliptic Curve Cryptographic Processor," in Proc. of 8th IEEE Symposium on Field-Programmable Custom Computing Machines, Napa Valley, CA, April 2000, pp. 68.

[14] G. Peterson and S. Drager, "Accelerating Defense Applications Using High Performance Reconfigurable Computing," presented at 35th Government Microcircuit Applications and Critical Technology Conference, Tampa, FL, April 2003.

[15] M. Jones, L. Scharf, J. Scott, C. Twaddle, M. Yaconis, K. Yao, P. Athanas, and B. Schott, "Implementing an API for Distributed Adaptive Computing Systems," presented at 34th IEEE International Conference on Communications, Vancouver, Canada, April 1999.

[16] Celoxica Inc., RC1000 Development Platform Product Brief [online] 2004, http://www.celoxica.com/techlib/files/CEL-W0307171KKP-51.pdf (Accessed: February 12, 2004).

[17] Tarari Inc., High-Performance Computing Processors Product Brief [online] 2004, http://www.tarari.com/PDF/HPC-BP.pdf (Accessed: February 12, 2004).

[18] B. L. Hutchings and M. J. Wirthlin, "Implementation Approaches for Reconfigurable Logic Applications," in Proc. of 5th International Workshop on Field Programmable Logic and Applications, Oxford, England, August 1995, pp. 419-428.

[19] A. Dandalis and V. Prasanna, "Configuration Compression for FPGA-based Embedded Systems," in Proc. of 9th International Symposium on Field Programmable Gate Arrays, Monterey, CA, February 2001, pp. 173-182.

[20] K. Compton, J. Cooley, S. Knol, and S. Hauck, "Configuration Relocation and Defragmentation for FPGAs," presented at 8th IEEE Symposium on Field-Programmable Custom Computing Machines, Napa Valley, CA, April 2000.

[21] Z. Li, K. Compton, and S. Hauck, "Configuration Caching for FPGAs," presented at 8th IEEE Symposium on Field-Programmable Custom Computing Machines, Napa Valley, CA, April 2000.


[22] J. Burns, A. Donlin, J. Hogg, S. Singh, and M. de Witt, "A Dynamic Reconfiguration Run-Time System," presented at 5th Annual IEEE Symposium on Custom Computing Machines, Los Alamitos, CA, April 1997.

[23] N. Shiraz, W. Luk, and P. Cheung, "Run-Time Management of Dynamically Reconfigurable Designs," in Proc. of 9th International Workshop on Field-Programmable Logic and Applications, Tallinn, Estonia, September 1998, pp. 59-68.

[24] R. Subramaniyan, P. Raman, A. George, and M. Radlinski, "GEMS: Gossip-Enabled Monitoring Service for Scalable Heterogeneous Distributed Systems," White Paper, currently in journal review [online] 2003, http://www.hcs.ufl.edu/pubs/GEMS2003.pdf (Accessed: April 14, 2004).

[25] K. Sistla, A. George, and R. Todd, "Experimental Analysis of a Gossip-based Service for Scalable, Distributed Failure Detection and Consensus," Cluster Computing, vol. 6, no. 3, July 2003, pp. 237-251 (in press).

[26] D. Collins, A. George, and R. Quander, "Achieving Scalable Cluster System Analysis and Management with a Gossip-based Network Service," presented at IEEE Conference on Local Computer Networks, Tampa, FL, November 2001.

[27] K. Sistla, A. George, R. Todd, and R. Tilak, "Performance Analysis of Flat and Layered Gossip Services for Failure Detection and Consensus in Scalable Heterogeneous Clusters," presented at IEEE Heterogeneous Computing Workshop at the International Parallel and Distributed Processing Symposium, San Francisco, CA, April 2001.

[28] S. Ranganathan, A. George, R. Todd, and M. Chidester, "Gossip-Style Failure Detection and Distributed Consensus for Scalable Heterogeneous Clusters," Cluster Computing, vol. 4, no. 3, July 2001, pp. 197-209.

[29] M. Burns, A. George, and B. Wallace, "Simulative Performance Analysis of Gossip Failure Detection for Scalable Distributed Systems," Cluster Computing, vol. 2, no. 3, July 1999, pp. 207-217.

[30] Z. Li and S. Hauck, "Configuration Prefetching Techniques for Partial Reconfigurable Coprocessor with Relocation and Defragmentation," presented at 10th International Symposium on Field Programmable Gate Arrays, Monterey, CA, February 2002.

[31] D. Gustavson and Q. Li, "The Scalable Coherent Interface (SCI)," IEEE Communications, vol. 34, no. 8, August 1996, pp. 52-63.

[33] S. Oral and A. George, "Multicast Performance Analysis for High-Speed Torus Networks," presented at 27th IEEE Conference on Local Computer Networks via the High-Speed Local Networks Workshop, Tampa, FL, November 2002.


[34] S. Oral and A. George, "A User-level Multicast Performance Comparison of Scalable Coherent Interface and Myrinet Interconnects," presented at 28th IEEE Conference on Local Computer Networks via the High-Speed Local Networks Workshop, Bonn/Königswinter, Germany, October 2003.

[35] R. Brightwell and L. Fisk, "Scalable Parallel Application Launch on Cplant," in Proc. of the ACM/IEEE Conference on Supercomputing, Denver, CO, November 2001, p. 40.

[36] The Earth Simulator Center [online] 2004, http://www.es.jamstec.go.jp/esc/eng/ (Accessed: June 24, 2004).

[37] P. Kirk, Gnutella Stable 0.4 [online] 2003, http://rfc-gnutella.sourceforge.net/developer/stable/index.html (Accessed: June 24, 2004).

[38] R. Schollmeier, "A Definition of Peer-to-Peer Networking for the Classification of Peer-to-Peer Architectures and Applications," presented at the IEEE International Conference on Peer-to-Peer Computing, Linköping, Sweden, August 2001.

[39] Napster, LLC, Napster.com [online] 2004, http://www.napster.com (Accessed: April 14, 2004).

[40] P. Ganesan, Q. Sun, and H. Garcia-Molina, "YAPPERS: A Peer-to-Peer Lookup Service over Arbitrary Topology," presented at 22nd Annual Joint Conference of the IEEE Computer and Communications Societies, San Francisco, CA, April 2003.

[41] Dolphin Interconnect Solutions, Inc., Dolphin Interconnect Solutions [online] 2004, http://www.dolphinics.com/ (Accessed: April 15, 2004).

[42] M. Baker and J. Ousterhout, "Availability in the Sprite Distributed File System," ACM Operating Systems Review, vol. 25, no. 2, April 1991, pp. 95-98.


BIOGRAPHICAL SKETCH

Aju Jacob received two bachelor's degrees, one in computer engineering and the other in electrical engineering, from the University of Florida. He is presently a graduate student in the Department of Electrical and Computer Engineering at the University of Florida, working on his Master of Science in electrical and computer engineering with an emphasis in computer systems and networks. Currently, Aju is a research assistant in the High-performance Computing and Simulation Research Laboratory at the University of Florida, where his research focuses on reconfigurable computing. Previously he spent a summer doing research in the Information Systems Laboratory at the University of South Florida; that research focused on the design and implementation of an FPGA-based CICQ switch architecture. Aju has gained valuable industry experience from internships and projects. As part of the IPPD program, he worked in a team environment for Texas Instruments on a project that involved designing and implementing a high-speed test interface. He also gained experience during an internship in AOL Time Warner's Web Fulfillment Division, where he designed and implemented web pages and a reporting system for e-business and was exposed to the latest web design tools and technologies.


Permanent Link: http://ufdc.ufl.edu/UFE0007181/00001

Material Information

Title: Distributed Configuration Management for Reconfigurable Cluster Computing
Physical Description: Mixed Material
Copyright Date: 2008

Record Information

Source Institution: University of Florida
Holding Location: University of Florida
Rights Management: All rights reserved by the source institution and holding location.
System ID: UFE0007181:00001



all their guidance and direction. I thank Dr. Herman Lam, Dr. Renato J. Figueiredo, and

Dr. Jose A. B. Fortes for serving on my thesis committee. I thank my parents and sister

for the support and encouragement. I would also like to thank Alpha-Data, Tarari, and

Celoxica for their RC Boards. I thank Dolphin Inc. for its donation of SCI cards. I thank

Intel and Cisco for their donation of Cluster resources.
















TABLE OF CONTENTS

page

A C K N O W L E D G M E N T S ................................................................................................. iv

LIST OF TABLES ............................... .............. ....... ....... vi

L IST O F F IG U R E S .... ...... ................................................ .. .. ..... .............. vii

ABSTRACT .............. ......................................... ix

1 IN TRODU CTION ................................................. ...... .................

2 B A C K G R O U N D ............................................................................ .. .............. .

3 CONFIGURATION MANAGER DESIGN............................................................14

3.1 Configuration Management Modules and Interfaces................... ........... 14
3.2 Configuration Manager Initialization and Configuration Determination ........15
3.3 Configuration Manager's Operating Layers.................................................16
3.4 B oard Interface M odule .......................................................... ............. 19
3.5 Distributed Configuration Manger Schemes .................................................20

4 E X PE R IM E N TA L SE TU P ............................................................. .....................23

4.1 Configuration Manager Scheme Protocol......................................................23
4.2 E xperim ental Setup ...........................................................................27

5 EXPERIMENTAL RESULTS ............................................................................30

6 PROJECTED SCALABILITY..... ..... ........................................ 38

6.1 Completion Latency Projections............... .......... ............. ............... 38
6.2 Hierarchy Configuration M managers ...................................... ............... 40
6.3 Consumed Bandwidth Projections............... ........... ......... ..... ............. 43

7 CONCLUSIONS AND FUTURE WORK.................... ......................49

L IST O F R E F E R E N C E S ......... ................. ...................................................................53

B IO G R A PH IC A L SK E TCH ..................................................................... ..................57



v
















LIST OF TABLES

Table page

1 Completion Latency Projection Equations for System Hierarchies......................42

2 Control Consumption Equations for each Management Scheme...........................45

3 Bandwidth Equations for System Hierarchies ............................... ............... .46

4 System Configurations for Given Constraints over System Sizes.........................48
















LIST OF FIGURES


Figure p

1 T he C A R M A Fram ew ork........................................................................... .............3

2 Celoxica's RC 1000 Architecture. ........................................................ ........... 8

3 Tarari CPP Architecture. ...... ........................... .........................................

4 RA G E System D ata Flow ..................................................................................... 11

5 Im perial College Fram ew ork. .............................................................................12

6 CARMA's Configuration Manager Overview. .....................................................15

7 Configuration Manager's Layered Design. .........................................................17

8 Illustration of Relocation (Transformation) and Defragmentation ......................... 18

9 B oard Interface M odules. .............................................. .............................. 20

10 Distributed Configuration Management Schemes. .............................................21

11 Master-Worker Configuration Manager Scheme ..................................................24

12 Client-Server Configuration Manager Scheme...................................................... 25

13 "Pure" Peer-to-Peer Configuration Manager Scheme ..........................................26

14 Experim ental Setup of SCI nodes. ........................................ ....................... 28

15 Completion Latency for Four Workers in MW......................................................30

16 Completion Latency for Four Clients in CS.................................. ............... 31

17 Completion Latency for Four Peers in PPP. .................................. ...............31

18 Completion Latency of Four Workers with Master in Adjacent SCI Ring..............33

19 Completion Latency of Four Clients with Server in Adjacent SCI Ring ...............33

20 Completion Latency of Three Clients with Server in the Same SCI Ring..............34









21 Completion Latency of Eight Clients in CS .......................................................35

22 Completion Latency of MW Scheme for 2, 4, 8, and 16 nodes.............................36

23 Completion Latency of CS Scheme for 2, 4, 8, and 16 nodes. .............................36

24 Completion Latency of Worst-Case PPP Scheme for 2, 4, 8, and 16 nodes.....3....37

25 Completion Latency Projections for Worst-Case PPP, Typical-Case PPP, and CS 39

26 Four Layered Hierarchies Investigated. ...................................... ............... 40

27 Optimal Group Sizes for each Hierarchy as System Size Increases ......................43

28 Completion Latency Projections with Optimal Group Sizes up to 500 Nodes........44

29 Completion Latency Projections with Optimal Group Sizes ................................44

30 Network Bandwidth Consumed over Entire Network per Request .......................47















Abstract of Thesis Presented to the Graduate School
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Master of Science

DISTRIBUTED CONFIGURATION MANAGEMENT
FOR RECONFIGURABLE CLUSTER COMPUTING

By

Aju Jacob

December 2004

Chair: Alan D. George
Major Department: Electrical and Computer Engineering

Cluster computing offers many advantages as a highly cost-effective and often

scalable approach for high-performance computing (HPC) in general, and most recently

as a basis for hardware-reconfigurable systems, known as Reconfigurable Computing

(RC) systems. To achieve the full performance of reconfigurable HPC systems, a run-

time configuration manager is required. Centralized configuration services are a natural

starting point but tend to limit performance and scalability. For large-scale RC systems,

the configuration manager must be optimized for the system topology and management

scheme. This thesis presents the design of a configuration manager within the

Comprehensive Approach to Reconfigurable Management Architecture (CARMA)

framework created at the University of Florida, in addition to several distributed

configuration management schemes that leverage high-speed networking. The

experimental results from this thesis highlight the effects of the design of this

configuration manager and provide a comprehensive performance analysis of three of the









proposed management schemes. In addition, the experiments explore and compare the

scalability of these management schemes. These results show the configuration manager

designs have little overhead on the system, when the system is unstressed. Finally, larger

system-sizes are explored with an analytical model and scalability projections that

investigate performance beyond available testbeds. The model shows that a hierarchical

management scheme is needed for the configuration manager to provide lower bandwidth

consumption and completion latency.














CHAPTER 1
INTRODUCTION

Traditional computing spans a variety of applications due to its high flexibility, but

is rather inefficient dealing with fine-grain data manipulation. Reconfigurable

Computing (RC) attempts to perform computations in hardware structures to increase

performance while still maintaining flexibility. Research has shown that certain

applications, such as cryptanalysis, pattern matching, data mining, etc., reap performance

gains when implemented in an RC system. However, this approach is valid only if there

is an efficient method to reconfigure the RC hardware that does not overshadow the

performance gains of RC.

Loading a different configuration file alters the computation of an RC device,

typically an Field-Programmable Gate Array (FPGA). To increase the efficiency of RC

systems, the RC device is typically coupled with a General-Purpose Processor (GPP).

GPPs are apt at control and I/O functions while the RC hardware handles fine-grain

computations well. In a typical Commercial-off-the-Shelf (COTS) environment, the RC

hardware takes the form of FPGA(s) on boards connected to the host processor through

the PCI bus. Computation, control communication, and configurations are transferred

over the PCI bus to the board. This coupling of RC boards) to the GPP in a conventional

computing node yields a single-node COTS RC system.

A trend toward high-performance cluster computing based largely on COTS

technologies has recently developed within the reconfigurable computing community.

The creation of a 48-node COTS RC cluster at the Air Force Research Laboratory in









Rome, NY [1] is evidence of this trend. Indeed, High-Performance Computing (HPC)

has the potential to provide a cost-effective, scalable platform for high-performance

parallel RC. However, while the addition of RC hardware has improved the performance

of many stand-alone applications, providing a versatile multi-user and multitasking

environment for clusters of RC and conventional resources imposes additional

challenges.

To address these challenges in HPC/RC designs, the HCS Research Lab at the

University of Florida proposes the Comprehensive Approach to Reconfigurable

Management Architecture (CARMA) framework [2, 3, 4]. CARMA provides a

framework to develop and integrate key components as shown in Figure 1. With

CARMA, the RC group at Florida seeks to specifically address key issues such as:

dynamic RC-hardware discovery and management; coherent multitasking in a versatile

multi-user environment; robust job scheduling and management; fault tolerance and

scalability; performance monitoring down into the RC hardware; and automated

application mapping into a unified management tool. This thesis focuses on the

configuration management portion of CARMA's RC cluster management module in

order to highlight the design and development of distributed configuration management

schemes.

Many of the "new" challenges CARMA addresses bear a striking resemblance to

traditional HPC problems, and will likely have similar solutions, however others have

very little correspondence whatsoever. The task of dynamically providing numerous

configurations to distributed RC resources in an efficient manner is one such example.

At first glance, changing computation hardware during execution in RC systems has no









traditional HPC analog. However, future development of an RC programming model to

allow the reuse of configuration files through run-time libraries resembles the concept of

code reuse in tools such as the International Mathematical and Statistical Library (IMSL).

This collection of mathematical functions abstracts the low-level coding details from

developers by providing a means to pass inputs between predefined code blocks [5].

While tools like IMSL provide this abstraction statically, future RC programming models

will likely include run-time library access. Core developers are providing the basis for

such libraries [6].


F T Applications

User Algorithm Mapping
Interface

l-a i RC Cluster
S Management

Control Performance Middleware
Network Monitoring API


Ste oa nt ode s Data RC Fabric
:To Other m
S Network API
Nodes

RC Node
COTS
T RC Fabric
Processor 1__


Figure 1. The CARMA Framework. This Thesis focuses on Configuration Management
located in the RC Cluster Management block.

While the programming model aspect of configuration files could relate to

traditional HPC, other aspects may not. For example, one might equate the transmission

of configuration files to data staging by assuming they are simply another type of data

necessary to be loaded onto a set of nodes at run-time. One reason this approach does not









hold is the adaptable nature of configuration files. Recent work has suggested an

adaptive-algorithms approach to RC in which tasks may require the run-time use of

hundreds of configuration-file versions [7]. Traditional HPC systems today are not

designed to recompile code at run-time much less do so hundreds of times per job.

Another reason configuration files differ is that, while relatively small, the amount

of high-speed cache dedicated to their storage is typically small, if existent. While some

custom FPGAs allow up to four configurations to be stored on chip [8], systems that do

not allocate on-chip or on-board memory cannot preemptively "stage" configurations at

all. As demonstrated, configuration management and other issues need to be considered

to achieve a versatile platform for RC-based HPC, especially if such systems may one

day include grid-level computation where communication latencies can lead to significant

performance penalties.

The remaining chapters of this thesis are organized as follows. Chapter 2 provides

a background of past work in configuration management services while Chapter 3

describes the design and development of CARMA's configuration manager. Chapter 4

provides a detailed discussion of the experimental setup while Chapter 5 presents and

analyses the experimental results. Chapter 6 presents the projected scalability of

CARMA's configuration manager. Finally, Chapter 7 describes conclusions and future

work.














CHAPTER 2
BACKGROUND

Traditionally, computational algorithms are executed primarily in one of two ways:

with an Application-Specific Integrated Circuit (ASIC) or with a software-programmed

General-Purpose Processor (GPP) [9]. ASICs are fabricated to perform a particular

computation and cannot be altered, however, ASIC designs provide fast and efficient

execution. GPPs offer far greater flexibility because changing software instructions

changes the functionality of the GPP. This flexibility comes at the overhead cost of

fetching and decoding instructions. RC is intended to achieve the flexibility of GPPs and

the performance of ASICs simultaneously. Loading a different configuration file alters

the computation that an RC device, typically a FPGA, performs. Moreover, RC devices,

such as FPGAs, are composed of logic-blocks that operate at hardware speeds.

Configuration files dictate the algorithm or function that FPGA logic-blocks

perform. The FPGA-configuration file is a binary representation of the logic design, used

to configure the FPGA. Configuration file sizes have increased considerably over the last

ten years. Xilinx's Virtex-series FPGAs have configuration-file sizes that range from 80

kB files to 5.3 MB for its high-end Virtex-II Pro [10]. The configuration-file size in

general is a constant for any given device, regardless of the amount of logic in it, since

every bit in the device gets programmed in full-chip configurations, whether it is used or

not. However, the Virtex series supports partial reconfiguration, which is a mechanism

where only logic on part of the FPGA is changed. Virtex-series chips can perform partial

configuration without shutting down or disturbing processing on other regions of the









FPGA. Albeit partial configuration is supported at the chip level, there exist no COTS-

based boards that have software-API support for partial reconfiguration.

Applications that exhibit parallelism and require fine-grain data manipulations are

accelerated greatly by RC solutions. Data encryption is one such area that has benefited

by RC. Serpent [11], DES [12], and Elliptic-Curve Cryptography [13] applications have

all shown speedups as implemented in FPGAs when compared to conventional

processors. Another application area that exhibits significant speedup when using RC is

space-based processing. Hyperspectral imaging, a method of spectroscopy from

satellites, can show frequency details that reveal images not visible to the human eye.

Hyperspectral sensors can now acquire data in hundreds of frequency windows, each less

than 10 nanometers in width, yielding relatively large data cubes for space-based systems

(i.e. over 32 Mbytes) [14]. It is predicted that an RC implementation would have

tremendous speedup over the today's conventional processor using Matlab.

As mentioned in Chapter 1, the Air Force Research Laboratory in Rome, NY has

created a 48-node RC cluster [1], thereby combining HPC and RC. Another example of

the combination of HPC and RC exists at Virginia Tech with their 16-node "Tower of

Power" [15]. More recently, the HCS Lab at the University of Florida has implemented

an 9-node RC cluster [2]. A System-Area Network (SAN), more specifically Myrinet in

Rome and Virginia Tech and Scalable Coherent Interconnect (SCI) in Florida's RC

cluster, interconnects the nodes of these clusters. Myrinet and SCI as well as other High-

Performance Networks (HPNs) including QsNet, and Infiniband, provide a high-

throughput, low-latency network between HPC nodes. These networks are ideal to









transfer latency-critical configuration files, which can be several MBs for high-end

FPGAs.

In addition to the cost advantages of COTS-based cluster computing, COTS-based

RC boards facilitate the creation of RC systems. There are a variety of RC boards

commercially available, varying in FPGA size, on-board memory, and software support.

The most common interface for COTS-based RC boards is PCI. Two PCI-based RC

boards presented in this chapter are Celoxica's RC1000 [16] and Tarari's Content

Processing Platform (CPP) [17].

The RC1000 board provides high-performance, real-time processing capabilities

and provides dynamically reconfigurable solutions [16]. Figure 2 depicts the

architecture of the RC1000. The RC1000 is a standard PCI-bus card equipped with a

Xilinx Virtex device and SRAM memory directly connected to the FPGA. The board is

equipped with two industry-standard PMC connectors for directly connecting other

processors and I/O devices to the FPGA. Furthermore, a 50-pin unassigned header is

provided for either inter-board communication or connecting custom interfaces.

Configuration of the RC 1000 is through the PCI bus directly from the host.

Tarari has developed the dynamically reprogrammable content-processing

technology to tackle the compute-intensive processing and flexibility requirements of the

Internet-driven marketplace [17]. Figure 3 depicts the architecture of the Tarari CPP.

The Tarari CPP has two Content-Processing Engines (CPE), each of which is a Xilinx

Virtex-II FPGA. In addition, a third FPGA is a Content-Processing Controller (CPC),

which handles PCI and inter-CPE communication, as well as configuration and on-board

memory access. The DDR SDRAM is addressed by both CPEs, thus creating a shared-










memory scheme for inter-FPGA communication. The two CPEs enable parallelism and

more complex processing, compared to the RC-1000. The Tarari boards have the ability

to store configurations in on-board memory thereby decreasing the configuration latency

by eliminating costly PCI-bus transfers.


Figure 2. Celoxica's RC1000 Architecture.


1MB SRAM
I I I I


L-
E
D-


L
E
D


1MB SRAM


PCI bus


Figure 3. Tarari CPP Architecture.


1MB SRAM
I I II


1MB SRAM]









FPGA configuration is one of the most critical processing components that must be

handled with care to ensure an RC speedup over GPP. As mentioned in Chapter 1,

configuring RC resources is pure overhead in any RC system and thus has the potential to

overshadow RC-performance gains. Reusing RC hardware during a process's execution,

known as Run-Time Reconfiguration (RTR), has substantial performance gains over

static configuration. One such example of this performance gain is demonstrated by

RRANN at BYU. RRANN implements a backpropagation training algorithm using three

time-exclusive FPGA configurations. RRANN demonstrated that RTR was able to

increase the functional density by 500% compared to FPGA-based implementation not

using RTR [18].

The dynamic allocation of RC resources results in multiple configurations per

FPGA and consequently yields additional overhead compared to static systems. Dynamic

allocation of configurations on a distributed system requires the RC system to maintain a

dynamic list of where configuration files reside in the system. Furthermore, RTR

systems must handle coordination between configurations, allowing the system to

progress from one configuration to the next as quickly as possible.

Moreover, methods such as configuration compression, transformation,

defragmentation and caching can further reduce configuration overhead [9]. For

example, using configuration compression technique, presented in [19], results in a

savings of 11-41% in memory usage. The use of transformation and defragmentation has

been shown to greatly reduce the configuration overhead encountered in RC, by a factor

of 11 [20]. Configuration caching, in which configurations are retained on the chip or on

the board until they are required again, also significantly reduces the reconfiguration









overhead [21]. A well-designed RC system should be able to handle these overhead-

reduction methods efficiently.

Some RC systems implement a Configuration Manager (CM) to handle the issues

that arise from RTR and configuration overhead reduction. There have been a few

noteworthy designs upon which CARMA's configuration manager builds, two of which

are discussed in detail: RAGE from the University of Glasgow [22] and a

reconfiguration manager from Imperial College, UK [23].

The RAGE run-time reconfiguration system was developed in response to

management methods that cater to one application and to one hardware setup [23]. The

RAGE system provides a high-level interface for applications to perform complex

reconfiguration and circuit-manipulation tasks. Figure 4 shows the dataflow of the

RAGE system. A Virtual Hardware Manager (VHM) orchestrates the system by

accepting application descriptions. The VHM requests circuit transforms from the

Transform Manager if configurations do not currently fit in the FPGA. The VHM also

manages the circuit store by converting hardware structures submitted by applications

into circuits. The configuration manager loads circuits onto devices, in addition to

passing state information and interrupts to the VHM. The device driver handles the

board-specific functions and hides the programming interface of the FPGA from higher

levels. The functionalities handled by the device driver include writing and reading to

and from the FPGA, setting clock frequencies, and even monitoring the FPGA.

Imperial College developed its configuration manager to exploit compile-time

information, yet remain flexible enough to be deployed in hardware or software on both

partial and non-partial reconfigurable FPGAs. The reconfiguration manager framework









from Imperial College is shown in Figure 5. This framework is composed of three main

components: the Monitor, Loader, and Configuration Store. When applications on the

system advance to the next configuration, they notify the monitor. The monitor

maintains the current state of the system including which FPGAs are in use and with

what configurations. In some applications the state of the system can be determined at

compile time, thereby reducing the complexity of the monitor. The loader, upon

receiving a request from the monitor, loads the chosen configuration onto the FPGA

using board-specific API functions. The loader retrieves the needed configuration file

from the configuration store. The configuration store contains a directory of

configuration files available to the system, in addition to the configuration data itself. A

transform agent could be employed to compose configuration at run-time that fit

appropriately into the FPGA.


Applications

Operands' Circuit
Results Circuit
Rep.
Transform.
Virtual Circuit Manager

Manager
Circuit Results
Store CircuitV
State Device
Driver

Config. Prog.
Manager Data Address/
Data

SFPGA Hardware


Figure 4. RAGE System Data Flow [22].

Imperial College developed its configuration manager to exploit compile-time

information, yet remain flexible enough to be deployed in hardware or software on both









partial and non-partial reconfigurable FPGAs. The reconfiguration manager framework

from Imperial College is shown in Figure 5. This framework is composed of three main

components: the Monitor, Loader, and Configuration Store. When applications on the

system advance to the next configuration, they notify the monitor. The monitor

maintains the current state of the system including which FPGAs are in use and with

what configurations. In some applications the state of the system can be determined at

compile time, thereby reducing the complexity of the monitor. The loader, upon

receiving a request from the monitor, loads the chosen configuration onto the FPGA

using board-specific API functions. The loader retrieves the needed configuration file

from the configuration store. The configuration store contains a directory of

configuration files available to the system, in addition to the configuration data itself. A

transform agent could be employed to compose configurations at run-time that fit

appropriately into the FPGA.



Figure 5. Imperial College Framework [23].

The recent trend toward COTS-based distributed RC clusters requires the

deployment of a distributed configuration manager such as the CARMA configuration

manager located in the RC Cluster Management box in Figure 1. CARMA's execution









manager is analogous to the VHM in RAGE, since both components coordinate the

execution of configuration commands. CARMA's CM extends the configuration store of

[23] by maintaining a distributed configuration store. The distributed store requires new

methods for transporting and accounting configuration files. Furthermore, transformation

of configuration files presented in both [22] and [23] can be implemented in the CARMA

configuration manager. CARMA's configuration manager employs the high degree of

device independence of RAGE as well as the functional capability of the Loader in [23].

Furthermore, CARMA's configuration manager supports multiple boards, some of which

include those previously presented in this chapter, with the Board Interface Module

(BIM). The BIM is functionally similar to the device driver of RAGE in that it handles

low-level details of board configuration and communication. CARMA's configuration

manager extends the Monitor in Shiraz et al. [23] to bring robust, scalable, and highly

responsive monitoring down into the FPGA's resources through the Gossip-Enabled

Monitoring Service (GEMS) [24-29] developed at Florida. A more detailed description

of CARMA's configuration manager is given in Chapter 3.














CHAPTER 3
CONFIGURATION MANAGER DESIGN

CARMA's configuration manager incorporates a modular-design philosophy from

both the RAGE [22] and the reconfiguration manager from Imperial College [23].

CARMA's configuration manager separates the typical CM operations of configuration

determination, management, RC-hardware independence, and communication into

separate modules for fault tolerance and pipelining. CARMA establishes a task-based

flow of RC-job execution. Consequently, CARMA's configuration manager

encompasses different operating layers, which carry out sub-tasks to complete

configuration of the RC Hardware. The CARMA configuration manager supports and

improves on current levels of board independence and heterogeneity. In addition,

CARMA's configuration manager institutes distributed configuration management to

increase scalability, which results in the emergence of multiple management and

communication schemes. A description of each of these features follows.

Configuration Management Modules and Interfaces

Figure 6 shows an overview of CARMA's configuration manager with its modules

interconnected within a node and between nodes. All modules have been developed as

separate processes, rather than inter-related threads, in order to increase the fault

tolerance of the system. The execution manager handles the configuration determination

while the configuration manager module handles the management of configuration files.

The Board Interface Module (BIM) provides board independence to the application

and to higher layers. A communication module handles all inter-node communication,









including both the control network and the configuration-file transfer network. The

communication module is interchangeable and can be tailored for specific System-Area

Networks (SANs).
Figure 6. CARMA's Configuration Manager Overview. The figure shows Functional Modules and inter-node and intra-node Communication.

Although the control and file transfer communication can reside on the same

network, the current implementation leverages SAN interconnects for large file transfers.

TCP sockets (e.g. over IP over Gigabit Ethernet) comprise the control network, while SCI

currently serves as the data network for configuration file transfers. Modules within a

node use a form of inter-process communication (i.e. message queues) to pass requests

and status.
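
As a concrete illustration of this intra-node mechanism, the following minimal C sketch passes a configuration request between two modules through a message queue; the structure, file name, and choice of a System V queue are illustrative assumptions, since the exact mechanism used by CARMA is not detailed here.

```c
/* Minimal sketch of intra-node module communication. The thesis only
 * states that message queues are used; a System V queue is shown here
 * purely for illustration, and the structure below is hypothetical. */
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

struct cfg_request { long mtype; char bitfile[64]; };

int main(void) {
    int q = msgget(IPC_PRIVATE, IPC_CREAT | 0600);   /* private queue */
    if (q == -1) return 1;

    struct cfg_request req = { 1, "" }, got;
    strncpy(req.bitfile, "fft.bit", sizeof req.bitfile - 1);

    /* Execution manager -> CM: enqueue a configuration request. */
    msgsnd(q, &req, sizeof req.bitfile, 0);

    /* CM side: dequeue the request and act on it. */
    msgrcv(q, &got, sizeof got.bitfile, 1, 0);
    printf("CM received a request for %s\n", got.bitfile);

    msgctl(q, IPC_RMID, NULL);                       /* remove the queue */
    return 0;
}
```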

Configuration Manager Initialization and Configuration Determination

At initialization, the CM creates a directory of available RC boards and BIMs are

forked off for each board to provide access. After the RC boards have been initialized,

the configuration file-caching array is initialized. Next, the CM attempts to retrieve

network information. Due to its distributed nature, the CM requires the network









information of other CMs in order to communicate. The CM creates a network object

from a file, which contains network information such as the IP address and SCI ID of

nodes. Finally, the CM waits for transactions from the execution manager.
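
A simplified C sketch of this initialization sequence is given below; detect_boards(), spawn_bim(), and the cache array are hypothetical stand-ins for the actual CARMA routines, which are not reproduced in this thesis.

```c
/* Simplified sketch of the initialization sequence described above.
 * detect_boards(), spawn_bim(), and the cache array are hypothetical
 * stand-ins; the actual CARMA routines are not reproduced here. */
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

#define MAX_BOARDS  4
#define CACHE_SLOTS 32

struct board       { int id; pid_t bim_pid; };
struct cache_entry { char file[64]; int valid; };

static int detect_boards(struct board *b) {
    b[0].id = 0;          /* placeholder: a real CM probes the board API */
    return 1;
}

static pid_t spawn_bim(int board_id) {
    pid_t pid = fork();   /* one BIM process per board */
    if (pid == 0) {
        printf("BIM for board %d ready\n", board_id);
        _exit(0);         /* placeholder BIM exits immediately */
    }
    return pid;
}

int main(void) {
    struct board boards[MAX_BOARDS];
    struct cache_entry cache[CACHE_SLOTS] = {{ "", 0 }};  /* file cache */
    int nboards = detect_boards(boards);

    for (int i = 0; i < nboards; i++)        /* fork a BIM per board */
        boards[i].bim_pid = spawn_bim(boards[i].id);

    /* Network information (IP address, SCI ID of peer nodes) would be
     * read from a file here, after which the CM waits for transactions
     * from the execution manager. */
    for (int i = 0; i < nboards; i++)
        waitpid(boards[i].bim_pid, NULL, 0);
    (void)cache;
    return 0;
}
```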

Configuration determination is completed once the execution manager receives a

task that requires RC hardware. A configuration transaction request is then sent to the

CM. From the execution manager's point-of-view, it must provide the CM with

information regarding the configuration file associated with the task it is preparing to

execute. The CM loads the configuration file on the target RC hardware in what is called

a configuration transaction. Although the configuration transaction is the primary service

of the CM, the CM also performs release transactions. The execution manager invokes

release transactions when tasks have completed and the RC hardware can be released.

Releasing the RC hardware allows it to be configured for another task; however, the

previous configuration is not erased in an attempt to take advantage of temporal locality

of configuration use.
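
The two transaction types can be summarized with the following illustrative C sketch, in which a release marks the board as free without erasing its loaded configuration; the data structures shown are assumptions rather than the actual CARMA types.

```c
/* Illustrative sketch of the two CM transaction types; the structures
 * are assumptions, not the actual CARMA data types. */
#include <stdio.h>
#include <string.h>

enum txn_type { TXN_CONFIGURE, TXN_RELEASE };

struct txn      { enum txn_type type; int board; char bitfile[64]; };
struct rc_board { int busy; char loaded[64]; };

static void handle_txn(struct rc_board *b, const struct txn *t) {
    if (t->type == TXN_CONFIGURE) {
        /* Skip the physical load if the requested file is already on the
         * board (temporal locality of configuration use). */
        if (strcmp(b->loaded, t->bitfile) != 0) {
            strncpy(b->loaded, t->bitfile, sizeof b->loaded - 1);
            printf("loading %s onto board %d\n", t->bitfile, t->board);
        }
        b->busy = 1;
    } else {              /* TXN_RELEASE                                  */
        b->busy = 0;      /* board may be reused ...                      */
        /* ... but the previous configuration is deliberately not erased. */
    }
}

int main(void) {
    struct rc_board board = { 0, "" };
    struct txn cfg = { TXN_CONFIGURE, 0, "fft.bit" };
    struct txn rel = { TXN_RELEASE,   0, "" };
    handle_txn(&board, &cfg);
    handle_txn(&board, &rel);
    handle_txn(&board, &cfg);   /* second configure reuses the loaded file */
    return 0;
}
```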

Configuration Manager's Operating Layers

A functional description of how CARMA manages configuration files and executes

configuration transactions is given in Figure 7. As described before, the CM receives

configuration requests from the execution manager. Upon receiving a request, the File-

Location layer attempts to locate the configuration file in a manner depending on the

management scheme used. A more detailed description of CARMA's distributed

management schemes is provided later in this chapter. The File-Transport layer packages

the configuration file and transfers it over the data network. The File-Managing layer is

responsible for managing FPGA resource access and defragmentation [9], as well as

configuration caching [21], relocation and transformation [30]. Furthermore, the File-









Managing layer provides configuration information to the monitor (not shown) for

scheduling, distributing and debugging purposes. The File-Loading layer uses a library

of board-specific functions to configure and control the RC board(s) in the system and

provide hardware independence to higher layers.


Figure 7. Configuration Manager's Layered Design. This Figure shows the Layers inside the CM block, which implement the Location and Management of Configuration Files in CARMA.

The transformation of configuration files and the CM's distributed nature requires a

configuration store that is dynamic in both content and location. File location begins by

searching the node's local configuration-file cache. If there is a miss, a query is sent to a

remote CM. Locating a configuration file varies in complexity depending on the

management scheme, since in some schemes there is a global view of the RC system,

while in others there is not.
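
A minimal C sketch of this lookup order is shown below; query_remote_cm() is a stand-in for the control-network query that the real CM would send to another node.

```c
/* Illustrative sketch of the File-Location step: search the local
 * configuration-file cache first, then query a remote CM on a miss.
 * query_remote_cm() is a stand-in for the control-network message. */
#include <stdio.h>
#include <string.h>

#define CACHE_SLOTS 8

struct cache_entry { char file[64]; int valid; };

static int query_remote_cm(const char *file) {
    printf("querying a remote CM for %s\n", file);
    return 1;   /* pretend some remote node holds the file */
}

static int locate_config(const struct cache_entry *cache, const char *file) {
    for (int i = 0; i < CACHE_SLOTS; i++)
        if (cache[i].valid && strcmp(cache[i].file, file) == 0)
            return 0;                        /* local cache hit        */
    return query_remote_cm(file) ? 1 : -1;   /* remote hit / not found */
}

int main(void) {
    struct cache_entry cache[CACHE_SLOTS] = {{ "fir.bit", 1 }};
    printf("fir.bit -> %d\n", locate_config(cache, "fir.bit"));  /* local  */
    printf("fft.bit -> %d\n", locate_config(cache, "fft.bit"));  /* remote */
    return 0;
}
```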









Due to CARMA's initial deployment on cluster-based systems, the CM typically

has access to SANs which are widely deployed in clusters. SANs, such as SCI [31] and

Myrinet [32], provide high-speed communication ideally suited for latency-dependent

service such as configuration-file transport. To further diminish the transportation

latency, the CM can exploit collective-communication mechanisms such as multicast

supported by SANs [33, 34].

The CM's file-managing layer would deal with configuration-file caching and

transformation (sometimes called relocation) of configuration files, in addition to

defragmentation of the FPGA. Caching of configuration files is implemented by storing

recently used configuration files in memory located on the RC board. For RC boards that

do not support on-board configuration file storage, the files can be stored in RAM disk.

CARMA's configuration manager currently does not support relocation as described in

[9] and shown in Figure 8a, because current COTS-based RC-board software does not

support partial reconfiguration. Defragmentation, shown in Figure 8b, is also not

supported in the current CARMA version due to the inability to partially configure the

RC boards.

Figure 8. Illustration of Relocation (Transformation) and Defragmentation [9]: a) Relocation or Transformation; b) Defragmentation.









Board Interface Module

A key feature of CARMA's configuration manager is that it provides board

independence to higher layers. Board independence has not effectively been

implemented in today's RC run-time management tools. CARMA's file-loading layer

achieves this board independence with the creation of a BIM for each board. The BIM

provides both the application and the CM's higher layers a module that translates generic

commands into board-specific instructions. Each board type supported by CARMA's

CM has a specific BIM tailored using that board's API.
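
One common way to realize such a translation layer, sketched below in C, is a per-board table of function pointers; the tarari and rc1000 entries are illustrative stubs and do not reflect the vendors' actual APIs.

```c
/* One way to realize the generic-to-board-specific translation: a table
 * of function pointers selected per board type. The "tarari" and
 * "rc1000" entries are illustrative stubs, not the vendors' APIs. */
#include <stdio.h>

struct board_api {
    const char *name;
    int (*configure)(const char *bitfile);
    int (*write_data)(const void *buf, int len);
};

static int tarari_cfg(const char *f)       { printf("Tarari: load %s\n", f); return 0; }
static int tarari_wr(const void *b, int n) { (void)b; printf("Tarari: write %d B\n", n); return 0; }
static int rc1000_cfg(const char *f)       { printf("RC1000: load %s\n", f); return 0; }
static int rc1000_wr(const void *b, int n) { (void)b; printf("RC1000: write %d B\n", n); return 0; }

static const struct board_api boards[] = {
    { "tarari", tarari_cfg, tarari_wr },
    { "rc1000", rc1000_cfg, rc1000_wr },
};

int main(void) {
    /* The CM picks the BIM entry for the board found at initialization;
     * higher layers then issue only the generic calls. */
    const struct board_api *bim = &boards[0];
    bim->configure("image_filter.bit");
    bim->write_data("abcd", 4);
    return 0;
}
```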

Figure 9 depicts the communication between the application and RC hardware

through the BIM. At initialization, the CM spawns off a BIM for each of the boards

within the node. The BIM remains dormant until the application requires use of the

board, at which time the CM uses the BIM to configure the board. The application then

sends data destined to the RC hardware to the BIM. The BIM then forwards the

information in an appropriate format, and using the board-specific API, passes it to the

board. After the application is finished accessing the board, the BIM goes back to its

dormant state.

Although the primary feature the BIM provides is board independence, the BIM

also yields other advantageous features. As described, the BIM provides access to the RC

board for a local application; however, the BIM also allows seamless and secure access to

the RC board from remote nodes. Furthermore, the use of the BIM increases the

reliability of the system, since applications do not access the boards directly. A security

checkpoint could be established inside the BIM to screen data and configuration targeted

to the RC board. However, these additional features do come at a slight overhead cost








(roughly 10 µs), decreased control of the board by the application, and

minor code additions to the application.

Figure 9. Board Interface Modules.

Distributed Configuration Manager Schemes

In order to provide a scalable, fault-tolerant configuration management service for

thousands of nodes one day, the CARMA configuration manager is fully distributed.

The CARMA service is targeted for systems of 64 to 2,000 nodes and above, such as

Sandia's Cplant [35] and Japan's Earth Simulator [36]. Such large-scale systems and

grids will likely include RC hardware one day. In creating the distributed CM model,

four distributed management schemes are proposed: Master-Worker (MW), Client-Server

(CS), "Pure" Peer-to-Peer (PPP), and "Hybrid" Peer-to-Peer (HPP). Figure 10 illustrates

these four schemes. While CARMA configuration management modules exist in

different forms on various nodes in the four schemes, in all cases the CMs still use the

communication module to communicate with one another.











Figure 10. Distributed Configuration Management Schemes: a) Master-Worker, b) Client-Server, c) "Pure" Peer-to-Peer, d) "Hybrid" Peer-to-Peer.

The MW scheme (Figure 10a) is a centralized scheme where the master maintains a


global view of the system and has full control over job scheduling and configuration


management. This scheme is representative of currently proposed CMs discussed in


Chapter 2. While a centralized scheme is easy to implement, there will be performance


limitations due to poor scalability for systems with a large number of nodes. The other


three schemes in Figure 10 assume a distributed job-scheduling service. For the CS

scheme, (Figure 10b) local CMs request and receive configurations from a server.


Although this scheme is likely to exhibit better performance than MW for a given number


of nodes, there will also be scalability limitations as the number of nodes is increased.









The PPP scheme (Figure 10c) contains fully distributed CMs where there is no

central view of the system. This scheme is similar to the Gnutella file-sharing network

[37] and is described academically by Schollmeier at TUM in Munich, Germany [38].

This scheme will likely provide better latency performance since hot spots have been

removed from the system. However, the bandwidth consumed by this scheme would

likely be unwieldy when the number of nodes is rather large. The HPP scheme (Figure

10d) attempts to decrease the bandwidth consumed by PPP by consolidating

configuration-file location information in a centralized source. The HPP scheme

resembles the Napster file-sharing network [39]. The HPP scheme is also a low-overhead

version of CS because local CMs receive a pointer from the broker to the client-node that

possesses the requested configuration. This scheme will likely further reduce the server

bottleneck by reducing service time. Having multiple servers/brokers may further ease

the inherent limitations of these schemes.

A hierarchical-layered combination of these four schemes will likely be needed to

provide scalability up to thousands of nodes. For example, nodes might be first grouped

using CS or HPP and then these groups might be grouped using PPP. The layered group

concept has been found to dramatically improve scalability of HPC services for

thousands of nodes [24]. Hierarchical layering and variation of these schemes are

presented with analytical-scalability projections in Chapter 6. The following chapter

details the experimental setup used to evaluate and experimentally compare the MW, CS

and PPP schemes up to 16 nodes; HPP has been reserved for future study.














CHAPTER 4
EXPERIMENTAL SETUP

To investigate the performance of the CM in the MW, CS, and PPP schemes, a

series of experiments are conducted. The objectives of these performance experiments

are to determine the overhead imposed on the system by the CM, determine the

components of latency in configuration transactions, and to provide a quantitative

comparison between MW, CS, and PPP schemes. The performance metric used to

compare these schemes is defined as the completion latency from the time a

configuration request is received from the execution manager until the time that

configuration is loaded onto the FPGA.

Configuration Manager Protocols

Figures 11, 12, and 13 illustrate the individual actions that compose a configuration

transaction indicating the points at which latency has been measured for MW, CS, and

PPP, respectively. The experimental setup includes system sizes of 2, 4, 8, and 16 nodes.

In the MW and CS schemes, one node serves as the master or server while the remaining

nodes are worker or client nodes. In the PPP scheme all the nodes are peer nodes. The

Trigger block in Figures 11, 12, and 13 acts in place of CARMA's execution manager

and stimulates the system in a controlled periodic manner.

A MW configuration transaction is composed of the following components as

shown in Figure 11. The interval from 1 to 2 is defined as Creation Time and is the time

it takes to create a configuration-request data structure. The interval from 2 to 3 is

defined as Proxy Queue Time and is the time the request waits in the proxy queue until it









can be sent over a TCP connection to the worker, while the Proxy Processing Time is the

time it takes the Proxy to create a TCP connection to the worker and is the interval from 3

to 4. These connections are established and destroyed with each transaction because

maintaining numerous connections is not scalable. The interval from 4 to 5 is defined as

Request Transfer Time and is the time it takes to send a configuration request of 148

bytes over TCP. This delay was observed to average 420 µs using TCP/IP over Gigabit

Ethernet with a variance of less than 1%.


Figure 11. Master-Worker Configuration Manager Scheme.






The interval from 5 to 6 is defined as Stub Processing Time and is the time it takes

the Stub to read the TCP socket and place it in the CM queue, while the interval from 6 to

7 is defined as the CM Queue Time and is the time the request waits in the CM queue

until the CM removes it. The CM Processing Time is the time required to accept the

configuration request and determine how to obtain the needed configuration file and is

the interval from 7 to 8. The interval from 8 to 9 is defined as File Retrieval Time, the









time it takes to acquire the configuration file over SCI, including connection setup and

tear down, whereas the interval from 9 to 10 is defined as CM-HW Processing Time and

is the time to create a request with the configuration file and send the request to the BIM.

Finally, the interval from 10 to 11 is defined as HW Configuration Time and is the time it

takes to configure the FPGA.

Figure 12. Client-Server Configuration Manager Scheme.

CS configuration transactions are composed of the following components as shown

in Figure 12. Note that jobs (via the Trigger block) are created on the client nodes in the

CS scheme rather than on the master in the centralized MW scheme. The intervals from

1 to 2, 2 to 3 and 3 to 4 are the Creation Time, CM Queue Time and CM Processing

Time, respectively and are defined as in MW. The interval from 4 to 5 is defined as File

Retrieval Time, which is the time required for a client to send a request to the server and

receive the file in response. The intervals from 5 to 6 and 6 to 7 are the CM-HW

Processing Time and HW Configuration Time, respectively and are defined as in MW.


























Figure 13. "Pure" Peer-to-Peer Configuration Manager Scheme.

PPP configuration transactions are composed of the following components as

shown in Figure 13. Note that jobs (via the Trigger block) are created on the requesting

peer nodes in the same way as the CS scheme. The intervals from 1 to 2, 2 to 3 and 3 to

4 are the Creation Time, CM Queue Time and CM Processing Time, respectively and are

defined as they are in the two previous schemes. On the requesting peer, the interval

from 4 to 5 is defined as File Retrieval Time, which is the time required for one peer to

send a request to another peer and receive the file in response. The File Retrieval Time

includes the time it takes for the requesting peer to contact each peer individually over

TCP to request the needed configuration file. In order to explore the best-case and worst-

case scenarios of configuration file distribution, the experiment was performed in two

forms. The worst-case scenario was one in which the requesting node must contact all

other peers for the needed configuration file and the last peer is the one that has the

configuration file. In the best-case PPP, the first peer the requesting node contacts

contains the configuration file. The intervals from 5 to 6 and 6 to 7 are the CM-HW









Processing Time and HW Configuration Time, respectively and are defined as in the two

previous schemes.

On the responding side of the peer, the interval from A to B is defined as Query

Response Time, which is the time required for a peer to determine if it has the requested

configuration file. The stub sends a message to the configuration store to send the file to

the requesting node, in addition to responding back to the requesting node that the

configuration file was found. The interval from B to C is defined as the Configuration

Store Queue Time, and is the time the request waits in the configuration store queue until

the configuration store can service it. The time for the configuration store to establish an

SCI connection to the requesting peer and send the configuration file is referred to as the

Configuration Store Processing Time and is defined as the interval between C and D.

Experimental Setup

Experiments are performed with one instance of the CM running on a node, with

the host system consisting of a Linux server with dual 2.4GHz Xeon processors and 1GB

of DDR RAM. The control network between nodes is switched Gigabit Ethernet. The

data network is 5.3 Gb/s SCI connected in a 2D torus with Dolphin D337 cards using the

SISCI-software release 2.1. Each computing node contains a Tarari HPC board (i.e.

CPX2100) [17] housed in a 66MHz, 64-bit PCI bus slot.

For this study, the number of nodes is varied from 2, 4, 8, and 16 nodes. The 2D

topology of the SCI network is shown in Figure 14. In experiments with the schemes of

MW and CS, the CM is executed with one node serving as master/server and the others

as computing nodes. In the 2, 4, 8, and 16-node cases the master/server always resides in

the same node.









Figure 14. Experimental Setup of SCI nodes. Nodes are labeled Master (M), Servers (S), Workers (W), Clients (C), and Peers (P).
The interval between requests, known as Configuration Request Interval, is chosen

as the independent variable and is varied from 1 s down to 80 ms. The value of 80 ms is selected

as the lower bound because this value is determined to be the minimum-delay time to

retrieve a Virtex-II 1000 configuration file (~20 ms) and configure a Tarari board

(~60 ms). The configuration of the board holds constant throughout the experiments, and

is a function of the board's API, independent of CARMA's CM. In each trial, 20

requests were submitted with constant periodicity by the trigger, each requesting a

different configuration file. Thus the trigger mimics the execution manager's ability to

handle configuration requests for multiple tasks that are running in the system. Since the

execution manager is serialized, the trigger only sends one request at a time and is

designed to do so in a periodic manner. This setup mimics the multitasking ability of the








execution manager for 20 tasks running on the node each requiring a different

configuration file. The timestamps for measuring all intervals are taken using the C

function gettimeofday(), which provides a 1 µs resolution. Each experiment for each of

the 3 schemes was performed at least three times, and the values have been found to have

a variance of at most 11%. The following chapter presents the final run of each

experiment as representative results.
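
The measurement approach can be illustrated with the short C sketch below, which brackets a placeholder operation with gettimeofday() calls and reports the elapsed time in microseconds.

```c
/* Sketch of the interval measurement: gettimeofday() timestamps taken
 * around a step of a configuration transaction; usleep() stands in for
 * the step itself (here roughly the HW Configuration Time). */
#include <stdio.h>
#include <sys/time.h>
#include <unistd.h>

static long elapsed_us(struct timeval a, struct timeval b) {
    return (b.tv_sec - a.tv_sec) * 1000000L + (b.tv_usec - a.tv_usec);
}

int main(void) {
    struct timeval t1, t2;
    gettimeofday(&t1, NULL);
    usleep(60000);                     /* placeholder for the measured step */
    gettimeofday(&t2, NULL);
    printf("interval: %ld us\n", elapsed_us(t1, t2));
    return 0;
}
```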
















CHAPTER 5
EXPERIMENTAL RESULTS

Each of the completion-latency components is measured across all computing

nodes in the MW, CS, and PPP schemes, and are summarized along with maximum,

minimum, and average values in Figures 15, 16, and 17, respectively. All schemes have

four computing nodes, and the master/server is located in an adjacent SCI ring. PPP is

performed in the worst-case scenario, as defined in Chapter 4. The results for request

intervals larger than 250 ms are not shown because these values are found to be virtually

identical to the 250 ms trial, for all schemes. Values in this region are found to be

relatively flat because the master/server/peer is able to service requests without imposing

additional delays. In this stable region, requests are only delayed by fixed values that are

the sum of File Retrieval Time and HW Configuration Time, which is approximately

equal to 130 ms, 140 ms, and 190 ms for MW, CS, and PPP, respectively.


Figure 15. Completion Latency for Four Workers in MW.












Figure 16. Completion Latency for Four Clients in CS.


Figure 17. Completion Latency for Four Peers in PPP.

Several observations can be made from the values in Figures 15, 16, and 17. In

MW, CS, and PPP, the major components of completion latency are CM Queue Time,

File Retrieval Time and HW Configuration Time, the largest of these naturally being CM

Queue Time as resources become overloaded (i.e. decreasing configuration request

interval). The remaining components combined are in the worst case less than 2% of the

total completion latency, and thus are not shown in Figures 15, 16, and 17. The data









demonstrates that the CM design in CARMA imposes virtually no overhead on the

system in all cases. Furthermore, the data shows that when the request interval is large,

File Retrieval Time and HW Configuration Time are the dominant factors. CM Queue

Time and File Retrieval Time are the most dominant components of completion latency

when the request interval is small. In Figure 15, CM Queue Time increases rapidly while

File Retrieval Time grows steadily. In Figure 16, both CM Queue Time and File

Retrieval Time grow steadily. However, in the PPP scheme, as seen in Figure 17, File

Retrieval Time holds steady and CM Queue Time increases only after the configuration

request interval is less than the File Retrieval Time. This phenomenon is a result of how

the schemes are designed. Since PPP is designed with no single point of contention, the

only factor in its queuing delay is the CM processing time, which is dominated by File Retrieval Time.

When the arrival rate is greater than the processing time, the CM Queue Time increases

steadily. In the MW and CS schemes the File Retrieval Time is dependent on a

centralized master or server, respectively. Another useful observation shown in Figures

15, 16 and 17 is the spread between the maximum and minimum values. In MW and PPP

the spread is relatively small, at worst a 14% deviation from the average for MW and at

worst 23% deviation for PPP. CS on the other hand experiences a worst-case 98%

deviation from the average. The following figures investigate this phenomenon between

the MW and CS schemes.

Figures 18 and 19 present a per-node view of MW and CS, respectively, which

shows how the two schemes differ significantly. In CS, there is a greater variability in

the completion latency each node experiences due to server access. Nodes closer to the

server on the SCI-torus experience smaller file-transfer latencies, which allow them to










receive and therefore request the next configuration file faster. Completion latencies in

the MW scheme have less variability between nodes because all requests are serialized in

the Master. Therefore, no one node receives an advantage based on proximity.


Figure 18. Completion Latency of Four Workers with Master in Adjacent SCI Ring.


Figure 19. Completion Latency of Four Clients with Server in Adjacent SCI Ring.

Figure 20 illustrates the experiment where the server is moved into the same SCI

ring as 3 clients. In the experimental results shown in Figure 19, the server resides on the










adjacent SCI ring, requiring a costly dimension switch to transfer files to clients. The data

shows that the client closest to the server experienced a File Retrieval Time lower than

that of other nodes at the stressed configuration-request interval of 100 ms. However, in

the experiment shown in Figure 20 all the clients experienced similar File Retrieval Time

at the stressed configuration-request interval of 100 ms.

Figure 20. Completion Latency of Three Clients with Server in the Same SCI Ring.

This dependency on the client-to-server proximity is further illustrated in Figure

21, which shows the CS scheme for 8-nodes. Figure 21 shows results from the 8-node

case where 3 clients reside on the same SCI ring as the server and 4 clients are on the

adjacent ring. The spread between nodes is approximately 500 ms at a configuration-

request interval of 80 ms.

Figures 22, 23, and 24 present the scalability of MW, CS, and PPP (worst-case),

respectively. Figure 22 shows that MW is not scalable since completion latency

dramatically increases as the number of nodes increases, tripling on average from 2 to 4

nodes and again from 4 to 8 nodes.










Figure 21. Completion Latency of Eight Clients in CS.

Particularly, the 16-node MW trial incurs a queuing delay when the configuration-

request interval is 1 s, which no other scheme incurs. These results clearly show that

traditional centrally managed schemes developed to date cannot be used for even

relatively small distributed systems. Figure 23 shows that CS has a low completion

latency for small numbers of nodes (i.e., 2 and 4 nodes), but rapidly increases for

larger numbers of nodes (i.e. 8 and 16). For all system sizes, the completion latency of

CS was roughly only 10% of MW. Figure 24 reveals that worst-case PPP has a constant

increase in completion latency over 2, 4, 8, and 16 nodes. PPP has higher completion

latency for small numbers of nodes (i.e. 2 and 4 nodes) being roughly 1.5 times that of

CS. However, for larger numbers of nodes (i.e. 16) PPP has a lower average completion

latency than CS, measuring close to 500 ms lower at configuration-request intervals less

than 200 ms.












Figure 22. Completion Latency of MW Scheme for 2, 4, 8, and 16 nodes.


Figure 23. Completion Latency of CS Scheme for 2, 4, 8, and 16 nodes.











Figure 24. Completion Latency of Worst-Case PPP Scheme for 2, 4, 8, and 16 nodes.

Figures 22-24 show that "Pure" Peer-to-Peer and CS provide better scalability than

MW. However, larger testbeds are needed to investigate scalability of these schemes at

larger system sizes. As described earlier in this chapter, PPP has no single point of

contention or hot spot, thus File Retrieval Time is relatively constant regardless of

configuration-request interval. Furthermore, having a fully distributed CM service also

provides a great deal of fault tolerance. The next chapter will provide an

analytical scalability study of MW, CS, and PPP based on the experimental data shown

above as well as possible extensions to these schemes.














CHAPTER 6
PROJECTED SCALABILITY

The current trend of cluster-based RC systems scaling to larger numbers of nodes

will likely continue to progress, as RC technology advances. One day RC computing

clusters might rival the size and computing power of Sandia National Laboratories' Cplant

[35], a large-scale parallel-computing cluster. Coupling parallel-computing power of this

level with RC hardware yields a daunting task of system management. CARMA's

configuration manager is one solution for allocating and staging configuration files in

such large-scale distributed systems; however, its performance at such system sizes

should be explored. Based on the results presented in the previous chapter, CS and PPP

schemes scale the best up to 16 nodes. Taking the experimental results for 2-, 4-, 8-, and

16-node systems, the completion latencies for larger systems can be projected.

Completion Latency Projections

Since searching for configuration files in large-scale RC systems has no precedent,

using current lookup algorithms available for PPP networks can yield an approximate

number of nodes contacted in a query. The Gnutella lookup algorithm is a good starting

point because of its simplicity and low overhead. Another lookup algorithm for PPP

networks, Yappers, proposed by the CS department at Stanford University [40], uses hash

tables; however, this would yield weighty query processing. CARMA's PPP scheme

would lie in the middle of these algorithms. A Gnutella-style algorithm would contact all

nodes, and Yappers using the minimum number of hash buckets, resulting in minimum

overhead, would contact roughly 25% of nodes.










For the projections presented in this thesis, a typical-case PPP is assumed to be

25% of worst-case (i.e. all nodes contacted) PPP. A projection to larger node sizes is

done with curves of the form y = m x + b for PPP and y = m x² + b for CS. These


curves are chosen for each since they are best-fit curves to the experimental data

presented in Chapter 5. Both m and b are calculated mathematically and their values are

found to be m=187.37 and b=138.6 for PPP, and m=10.85 and b=122.4 for CS. Figure 25

shows the results for CS, worst-case PPP (i.e. all nodes contacted), and typical-case PPP

(i.e. 25% of nodes contacted) projected to 4096 nodes, since Dolphin's SCI hardware

scales to system sizes of 4096 [41]. The values used for the projection are at a

configuration-request interval of 100 ms, since it is at this point (see Figures 16-18) that

the CM is adequately stressed.
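
For reference, the short C sketch below evaluates these fitted curves at doubling system sizes; the assumption that the fits return completion latency in milliseconds (as in the Chapter 5 figures) and the 25% typical-case factor follow the discussion above.

```c
/* Evaluates the best-fit projection curves quoted above; latencies are
 * taken to be in milliseconds, as in the Chapter 5 figures, and the
 * typical-case PPP value is 25% of the worst case. Illustrative only. */
#include <stdio.h>

static double ppp_worst(double n) { return 187.37 * n + 138.6; }    /* y = m x + b   */
static double cs(double n)        { return 10.85 * n * n + 122.4; } /* y = m x^2 + b */

int main(void) {
    for (int n = 2; n <= 4096; n *= 2)
        printf("%4d nodes: CS %.3e  PPP worst %.3e  PPP typical %.3e (ms)\n",
               n, cs(n), ppp_worst(n), 0.25 * ppp_worst(n));
    return 0;
}
```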


Figure 25. Completion Latency Projections for Worst-Case PPP, Typical-Case PPP, and CS. Note: Logarithmic Scales.

Figure 25 shows that CS has greater completion latencies than worst-case PPP

when group sizes are greater than 12 and greater than typical-case PPP when group sizes









are greater than 6. Another observation from Figure 25 is that the schemes do not scale

well for system sizes larger than 32. The completion latency of the CS scheme is of the

order of 10^8 seconds for a system size of 4096, while a 32-node system yields completion

latencies of a more reasonable 2 seconds. The data shows that in order to achieve

reasonable completion latencies and scale to larger systems, CM nodes should be

grouped into a layered hierarchy.

Hierarchical Configuration Managers

For this analytical study a two-layered hierarchy is chosen, where each layer could

be implemented in either CS or typical-case PPP schemes. Typical-case PPP is used to

more closely model a real system. Consequently the following four permutations are

investigated, PPP-over-CS, CS-over-CS, CS-over-PPP, and PPP-over-PPP, shown in

Figure 26.


Figure 26. Four Layered Hierarchies Investigated: a) PPP-over-PPP, b) PPP-over-CS, c) CS-over-PPP, d) CS-over-CS.





The lower layer is divided into groups, where one node in each group is designated

the head node. The server is the head node if the lower-layer scheme is CS; if the

lower-layer scheme is PPP, then one peer is designated as the head node. The head node of

each group then searches and transfers configuration files in the upper layer. In PPP-

over-PPP shown in Figure 26a, a requesting node contacts the nodes in its group via the

PPP scheme described. Normally a node can find the requested configuration file within

its group; however, when it cannot, it contacts the head node of its group, which then

retrieves the configuration file from the upper layer. This outcome is known as a

configuration miss. On a configuration miss, the head node of the requesting group

contacts the head nodes of other groups in search of the requested configuration file as

dictated by the PPP scheme. When the requesting head node contacts other head nodes,

each contacted head node searches for the configuration file within its group using the

PPP scheme. In PPP-over-CS shown in Figure 26b, a requesting node contacts the server

for its group. When the server does not have the requested configuration file, it contacts

other servers using the PPP scheme. Figure 26c shows CS-over-PPP in which the nodes

search for configuration files within the groups using PPP. When a configuration file is

not found within the group the head node of the group contacts the server in the upper

layer. Figure 26d shows CS-over-CS in which nodes contact the group's server, which

then contacts the upper-layer server on a configuration miss. Using these hierarchical

protocols, projection equations for completion latency can be derived and are shown in

Table 1.









Table 1. Completion Latency Projection Equations for System Hierarchies.
Hierarchy       Completion Latency Projection Equation
PPP-over-PPP    P x PPP(g) + Q x P x (PPP(n/g) + PPP(g))
PPP-over-CS     CS(g) + Q x P x PPP(n/g)
CS-over-PPP     P x PPP(g) + Q x CS(n/g)
CS-over-CS      CS(g) + Q x CS(n/g)



The functions PPP(x) and CS(x) represent the average completion latency

experienced by one configuration request in a group of x nodes using the PPP and CS

schemes, respectively. These functions are projected beyond 16 nodes and were

presented earlier in this chapter. The variable n represents the number of nodes in the

system, while g represents the group size. Groupings of 8, 16, 32, 64, and 128 are

investigated, since these group sizes are reasonable in system sizes up to 4096 nodes. P

represents the percentage of nodes contacted in the PPP scheme. Since we are using

typical-case PPP, P has a value of 25%. Q is the configuration-miss rate and is assumed

to occur 10% of the time based on measurement of distributed file systems in [42].

Although research into configuration caching is left for future work, the value of 10% is

reasonable for this RC system. Using these equations and varying both group size and

system size, a matrix of values is calculated. The optimal group size in each hierarchy as

the system size increases is determined and shown in Figure 27.
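
A hedged C sketch of this calculation is given below; it uses the Table 1 equations as reconstructed here (reading the garbled operand as n/g), the fitted curves from earlier in this chapter, P = 25%, and Q = 10%, with an arbitrary example system and group size.

```c
/* Hedged sketch evaluating the Table 1 equations as reconstructed above,
 * with the garbled operand read as n/g. PPP(x) is the worst-case fit
 * (P supplies the contacted fraction) and CS(x) is the quadratic fit;
 * the system and group sizes below are arbitrary examples. */
#include <stdio.h>

static double PPP(double x) { return 187.37 * x + 138.6; }
static double CS(double x)  { return 10.85 * x * x + 122.4; }

int main(void) {
    const double P = 0.25;   /* fraction of peers contacted (typical case) */
    const double Q = 0.10;   /* assumed configuration-miss rate            */
    const double n = 1024.0, g = 16.0;

    printf("PPP-over-PPP: %.0f ms\n", P * PPP(g) + Q * P * (PPP(n / g) + PPP(g)));
    printf("PPP-over-CS : %.0f ms\n", CS(g) + Q * P * PPP(n / g));
    printf("CS-over-PPP : %.0f ms\n", P * PPP(g) + Q * CS(n / g));
    printf("CS-over-CS  : %.0f ms\n", CS(g) + Q * CS(n / g));
    return 0;
}
```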

Figure 27 shows PPP-over-CS has a gradual change of optimal group sizes ending

with an optimal group size of 8 nodes for system sizes > 512, while CS-over-PPP has a

rapid change of optimal group sizes over system sizes ending with optimal group size of

128 nodes for system sizes > 2048. Another observation is that the upper-layer scheme

determines the optimal group size. For instance, a 128-node group size minimizes the









high completion latency of CS in the CS-over-PPP scheme, and an 8-node group size

preserves the low completion latency of PPP in the PPP-over-CS scheme. Figures 28 and 29

show the completion latency of the hierarchies using the optimal group size at the

appropriate system sizes.

Figure 27. Optimal Group Sizes for each Hierarchy as System Size Increases.

The data in Figure 28 shows that for system sizes of up to 512, PPP-over-CS with

groups of 4 has the lowest completion time. Furthermore, for system sizes of 512 to 1024

nodes, PPP-over-PPP with groups of 8 has the lowest completion time, and for system

sizes of 1024 to 4096, PPP-over-PPP with groups of 16 has the lowest completion time,

shown in Figure 29.

Consumed Bandwidth Projections

While the PPP-over-PPP hierarchy is projected to achieve the lowest completion

latency at large scale, the control-communication bandwidth between nodes could be

weighty. This section presents analytical control-bandwidth calculations for previously

presented layered hierarchies. The data-network bandwidth utilization for all the schemes

is no worse than 7.7%, and remains constant regardless of system size and therefore is

not investigated.















Figure 28. Completion Latency Projections with Optimal Group Sizes up to 500 Nodes.


Figure 29. Completion Latency Projections with Optimal Group Sizes.

Using the bandwidth formula, Equation (1), presented in [24], the control-bandwidth


values can be determined. The parameters of Equation (1) are as follows: B is the


bandwidth-per-node, X is the number of layers in the system, Li is the total number of









bytes transferred in the ith layer, while fi is the configuration-request frequency of the ith

layer. Since X is set to 2 and Li and fi are constant for both layers, the formula is

simplified. Furthermore, when adapting Equation (1) to the CM protocols, Equation (2)

is produced. The additional parameters are as follows: Q is the configuration-miss rate

and is again assumed to be 10%, and the functions S1(x) and S2(x) represent functions

describing the average amount of data used in the lower and upper layers, respectively.

As before, the variable n represents the number of nodes in the system and g represents

the group size, while fi simplifies to the system's configuration-request frequency, f. As in

the completion-latency projections, groupings of 8, 16, 32, 64, and 128 are investigated.



B = Σ(i=1 to X) Li x fi                                                      (1)

B = (S1(g) + Q x S2(n/g)) x f                                                (2)

Table 2 shows the data-consumption equations for each management scheme based

on its protocol. The parameter P represents the average percentage of nodes contacted in

finding the configuration file. The parameter Lf is the total number of bytes transferred

when the CM fails to find the configuration file, and Ls is the total number of bytes when

the CM succeeds in finding the configuration file. The parameter Lr is the total number

of bytes sent to the server to request a configuration file.

Table 2. Control Consumption Equations for each Management Scheme.
Management Scheme    Data Consumption Equation
PPP                  PPP_bandwidth(n) = P x (n - 2) x Lf + Ls,  n > 2
CS                   CS_bandwidth(n) = Lr,  n > 2









Using the setup previously described in Chapter 6, Equation (2) is obtained. P is

set to 25% of the nodes in the system. The Lf and Ls are equal in the PPP scheme since a

node responds to a query by returning the query message to the requesting node. The

message size is 74 bytes and breaks down as 14 bytes for Ethernet header, 20 bytes for

TCP header, 20 bytes for IP, and 20 bytes of payload. The parameter Lr totals 74 bytes as

well, with the same breakdown as Lf and Ls. Finally, f is set to correspond to the

configuration request interval of 100 ms used in the completion latency calculations.

Performing the appropriate substitutions of the data consumption equations from Table 2

into the general bandwidth Equation (2) produces the bandwidth equations for each of the

four hierarchies, shown in Table 3.

Table 3. Bandwidth Equations for System Hierarchies.
Hierarchy       Consumed Bandwidth Equation
PPP-over-PPP    (P x g x Lf + Q x ((P x (n/g) x g + P x g) x Lf) + Ls) x f
PPP-over-CS     (Lr + Q x (P x (n/g) x Lf + Ls)) x f
CS-over-PPP     ((P x g x Lf + Ls) + Q x Lr) x f
CS-over-CS      (Lr + Q x Lr) x f
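
As a worked instance, the short C sketch below evaluates the CS-over-CS entry as reconstructed above, using the 74-byte request message, Q = 10%, and the 100 ms request interval; the result is per-node control bandwidth and is illustrative only.

```c
/* Worked instance of Equation (2) for the CS-over-CS hierarchy as
 * reconstructed in Table 3: B = (Lr + Q*Lr) * f. Uses the 74-byte
 * request message, Q = 10%, and the 100 ms request interval; the
 * result is per-node control bandwidth and is illustrative only. */
#include <stdio.h>

int main(void) {
    const double Lr = 74.0;       /* bytes per request message             */
    const double Q  = 0.10;       /* configuration-miss rate               */
    const double f  = 1.0 / 0.1;  /* requests per second (100 ms interval) */
    printf("CS-over-CS control bandwidth: %.1f bytes/s per node\n",
           (Lr + Q * Lr) * f);
    return 0;
}
```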


Using the equations in Table 3, the bandwidth consumed by each hierarchy over

the entire network per request is calculated. Figure 30 shows the calculated results using

the optimal group size found for completion latency at the appropriate system sizes. As

seen in Figure 30, PPP-over-PPP incurs the most bandwidth and increases rapidly over

larger system sizes. PPP-over-CS also increases with system sizes, however it consumes

only 12% of the bandwidth of PPP-over-PPP. It can be observed that those hierarchies

with CS in the upper layer have constant bandwidth, as can be seen by the bandwidth










consumption of CS-over-CS and CS-over-PPP. However, CS-over-CS consumes only

roughly 3% of bandwidth consumed by CS-over-PPP.


Figure 30. Network Bandwidth Consumed over Entire Network per Request.

From Figures 25, 28, 29 and 30 the optimal node configuration for an RC system is

derived. Table 4 presents latency-bound, bandwidth-bound, and best overall system

configurations. It should be noted that the bandwidth-bound constraint category excludes

hierarchies with average completion latency values greater than 5 seconds. The low

bandwidth benefits of these hierarchies cannot overcome their high completion latency.

Table 4 provides a summary of the projection results and can be used to make CM

setup decisions in future systems. For small system sizes (< 32) the latency-bound and

bandwidth-bound categories have the same configurations. However, as the number of

nodes increases the latency-bound category prefers hierarchies containing the PPP

scheme, while the bandwidth-bound category prefers the CS scheme. Furthermore, as the

system size increases so do the group sizes. The best-overall category attempts to reduce










the bandwidth penalty of PPP schemes by choosing hierarchies that provide lower

bandwidth requirements with minimal increase in latency. Thus, this category follows

the latency-bound category until 512, since PPP-over-PPP consumes a significantly

greater amount of bandwidth versus other hierarchies at system sizes greater than 512

(see Figure 30).

Table 4. System Configurations for Given Constraints over System Sizes.
                      System Size (number of nodes)
Constraint        < 8        8 to 32       32 to 512     512 to 1024    1024 to 4096
Latency-bound     Flat CS    CS-over-CS    PPP-over-CS   PPP-over-PPP   PPP-over-PPP
                             group size 4  group size 4  group size 8   group size 16
Bandwidth-bound*  Flat CS    CS-over-CS    CS-over-CS    PPP-over-CS    PPP-over-CS
                             group size 4  group size 8  group size 8   group size 8
Best overall      Flat CS    CS-over-CS    PPP-over-CS   PPP-over-CS    PPP-over-CS
                             group size 4  group size 4  group size 8   group size 8

* Schemes with completion latency values greater than 5 seconds excluded.
















CHAPTER 7
CONCLUSIONS AND FUTURE WORK

Traditional computing is inefficient at fine-grain data manipulation and highly

parallel operations, thus RC arose to provide the flexibility and performance needed for

today's HPC applications. First-generation RC systems typically entail FPGA boards

coupled with general-purpose processors, thus merging the flexibility of general-purpose

processors with the performance of an ASIC. RC achieves this flexibility with the use of

configuration files and partial reconfiguration, which decreases configuration file sizes.

RC has increased the performance of a variety of applications, such as cryptology,

including encryption and decryption, and space-based processing, such as hyperspectral

imaging.

Recent trends extend RC systems to clusters of machines interconnected with

SANs. The HCS Lab at Florida has developed a 9-node RC cluster to examine

middleware and service issues. The Air Force has a 48-node RC cluster while an earlier

cluster exists at Virginia Tech. These clusters use High-Performance Networks (HPNs),

which provide them with a high-throughput, low-latency network. These networks are

ideally suited to transfer latency-critical configuration files among nodes. Following the

COTS mentality of cluster-computing networks, COTS-based RC boards are typically

used to reduce cost. Two such RC boards are Celoxica's RC1000 and Tarari's Content

Processing Platform, among many others.

Such heterogeneous and distributed systems warrant an efficient method of

deploying and managing configuration files that does not overshadow the performance









gains of RC computing. The Comprehensive Approach to Reconfigurable Management

Architecture (CARMA) framework seeks to specifically address key issues in RC

management. While some configuration-file management issues could relate to

traditional HPC, other aspects may not. FPGA configuration is a critical processing

component, which must be handled with care to ensure an RC speedup over a traditional

system. Reusing the RC hardware during the process's execution, known as Run-Time

Reconfiguration (RTR), increases system performance compared to static configuration.

Furthermore, methods such as configuration compression, transformation,

defragmentation, and caching can reduce configuration overhead further.

RC systems implement a Configuration Manager (CM) to handle the issues that

arise from RTR and configuration overhead reduction. CARMA's configuration

manager builds upon a few noteworthy designs, two of which are the RAGE from the

University of Glasgow, and the reconfiguration manager from Imperial College, UK. In

response to the recent trend toward COTS-based distributed RC clusters, the HCS lab

presents a modular and distributed configuration manager middleware. CARMA's

execution manager, analogous to the VHM in RAGE, coordinates the execution of

configuration commands. The configuration manager module manages the configuration

files using a layered architecture. The BIM, like the device driver of RAGE, handles

low-level details of board configuration and communication. CARMA's configuration

manager extends the configuration store of Imperial College's reconfiguration manager

by maintaining a distributed store, and thus requires a communication module.

Since CARMA is fully distributed and targeted for thousands of nodes, various

distributed file-management schemes are investigated. These include Master-Worker









(MW), which mimics the fully centralized job schedulers of today's RC designs that

farm jobs and data to workers, and Client-Server (CS), which is similar to FTP servers.

Also included are "Pure" Peer-to-Peer (PPP), like the Gnutella file-sharing network, and

"Hybrid" Peer-to-Peer (HPP), like the Napster file-sharing network.

In order to investigate completion latency and its components, experiments were

performed in which the configuration request interval was varied. Additionally, the

scalability to 16 nodes of the CM is investigated. The data gathered shows File Retrieval

Time and HW Configuration Time are the dominant factors when request intervals are

large. Furthermore, CM Queue Time is naturally the most dominant factor when request

intervals are small. It is also observed that CARMA's configuration manager design

imposes very little overhead on the system. In regard to scalability, the MW scheme is

shown to not be scalable beyond 4 nodes. CS performs best for a small number of nodes

(2-8), while PPP is found to be the most scalable to 16 nodes.

Analytical models were used to extrapolate latency for larger systems and to predict

scalability for future testbeds. These calculations showed that in order to scale systems to

thousands of nodes, hierarchical schemes are required. Since experimental results

showed that CS and PPP are the only scalable schemes beyond 8 nodes, they are

permutated into two-layered hierarchical schemes. These hierarchies include PPP-over-

PPP, PPP-over-CS, CS-over-PPP, CS-over-CS, in which the lower-layer nodes are

grouped. The data showed that in a heterogeneous hierarchy, the upper-layer scheme

determines optimal group size. Moreover, for system sizes ranging from 32 to 512, PPP-

over-CS with group size of 4 yields the best completion-latency performance. For larger

system sizes (512 to 4096), PPP-over-PPP with a group size of 16 yields the best









completion-latency performance. In addition to completion latency, the bandwidth

consumed by these hierarchies is investigated. The results showed that PPP-over-CS

with a group size of 8 yields the best results, not only in terms of bandwidth but also in

overall scalability, for large systems (> 512).

Directions for future work include improving CM schemes, such as the existing

PPP scheme, and implementing and evaluating the HPP scheme. In addition, caching

configuration files should be investigated, in particular caching algorithms, miss rates,

and configuration file locality. Moreover, the analytical projections made in this thesis

should be validated in a future work. Bandwidth and utilization experiments should be

conducted on larger systems as well. Another direction of future work would be to

extend the features of CARMA's configuration manager by supporting new SANs and

RC boards, and incorporating advanced configuration file-management techniques (e.g.

defragmentation).















LIST OF REFERENCES


[1] V. Ross, "Heterogeneous HPC Computing," presented at 35th Government
Microcircuit Applications and Critical Technology Conference, Tampa, FL, April
2003.

[2] A. Jacob, I. Troxel, and A. George, "Distributed Configuration Management for
Reconfigurable Cluster Computing," presented at International Conference on
Engineering of Reconfigurable Systems and Algorithms, Las Vegas, NV, June
2004.

[3] I. Troxel, and A. George, "UF-HCS RC Group Q3 Progress Report" [online] 2004,
http://www.hcs.ufl.edu/prj/rcgroup/teamHome.php (Accessed: May 25, 2004).

[4] I. Troxel, A. Jacob, A. George, R. Subramaniyan, and M. Radlinski, "CARMA: A
Comprehensive Management Framework for High-Performance Reconfigurable
Computing," to appear in Proc. of 7th International Conference on Military and
Aerospace Programmable Logic Devices, Washington, DC, September 2004.

[5] Visual Numerics, Inc., "IMSL Mathematical & Statistical Libraries" [online] 2004,
http://www.vni.com/products/imsl/ (Accessed: March 25, 2004).

[6] Xilinx, Inc., "Common License Consortium for Intellectual Property" [online]
2004, http://www.xilinx.com/ipcenter/ (Accessed: February 13, 2004).

[7] A. Derbyshire and W. Luk, "Compiling Run-Time Parameterisable Designs,"
presented at 1st IEEE International Conference on Field-Programmable
Technology, Hong Kong, China, December 2002.

[8] Chameleon Systems, Inc., "CS2000 Reconfigurable Communications Processor,
Family Product Brief" [online] 2000, http://www.chameleonsystems.com
(Accessed: May 12, 2003).

[9] K. Compton and S. Hauck, "Reconfigurable Computing: A Survey of Systems and
Software," ACM Computing Surveys, vol. 34, no. 2, June 2002, pp. 171-210.

[10] Xilinx, Inc., "Xilinx Virtex-II Series FPGAs" [online] 2004,
http://www.support.xilinx.com/publications/matrix/virtex_color.pdf (Accessed:
February 13, 2004).

[11] A. J. Elbirt and C. Paar, "An FPGA Implementation and Performance Evaluation
of the Serpent Block Cipher," in Proc. of 8th International Symposium on Field
Programmable Gate Arrays, Monterey, CA, February 2000, pp. 33-40.

[12] J. R. Hauser and J. Wawrzynek, "Garp: A MIPS Processor with a Reconfigurable
Coprocessor," in Proc. of 5th IEEE Symposium on Field-Programmable Custom
Computing Machines, Napa Valley, CA, April 1997, pp. 12-21.

[13] K. H. Leung, K. W. Ma, W. K. Wong, and P. H. Leong, "FPGA Implementation of
a Microcoded Elliptic Curve Cryptographic Processor," in Proc. of 8th IEEE
Symposium on Field-Programmable Custom Computing Machines, Napa Valley,
CA, April 2000, pp. 68-76.

[14] G. Peterson and S. Drager, "Accelerating Defense Applications Using High
Performance Reconfigurable Computing," presented at 35th Government
Microcircuit Applications and Critical Technology Conference, Tampa, FL, April
2003.

[15] M. Jones, L. Scharf, J. Scott, C. Twaddle, M. Yaconis, K. Yao, P. Athanas and B.
Schott, "Implementing an API for Distributed Adaptive Computing Systems,"
presented at 34th IEEE International Conference on Communications, Vancouver,
Canada, April 1999.

[16] Celoxica Inc., "RC1000 Development Platform Product Brief" [online] 2004,
http://www.celoxica.com/techlib/files/CEL-W0307171KKP-51.pdf (Accessed:
February 12, 2004).

[17] Tarari Inc., "High-Performance Computing Processors Product Brief" [online]
2004, http://www.tarari.com/PDF/HPC-BP.pdf (Accessed: February 12, 2004).

[18] B. L. Hutchings and M. J. Wirthlin, "Implementation Approaches for Reconfigurable
Logic Applications," in Proc. of 5th International Workshop on Field
Programmable Logic and Applications, Oxford, England, August 1995, pp. 419-
428.

[19] A. Dandalis and V. Prasanna, "Configuration Compression for FPGA-based
Embedded Systems," in Proc. of 9th International Symposium on Field
Programmable Gate Arrays, Monterey, CA, February 2001, pp. 173-182.

[20] K. Compton, J. Cooley, S. Knol, and S. Hauck, "Configuration Relocation and
Defragmentation for FPGAs," presented at 8th IEEE Symposium on Field-
Programmable Custom Computing Machines, Napa Valley, CA, April 2000.

[21] Z. Li, K. Compton, and S. Hauck, "Configuration Caching for FPGAs," presented
at 8th IEEE Symposium on Field-Programmable Custom Computing Machines,
Napa Valley, CA, April 2000.

[22] J. Burns, A. Donlin, J. Hogg, S. Singh, and M. de Witt, "A Dynamic
Reconfiguration Run-Time System," presented at 5th Annual IEEE Symposium on
Custom Computing Machines, Los Alamitos, CA, April 1997.

[23] N. Shirazi, W. Luk, and P. Cheung, "Run-Time Management of Dynamically
Reconfigurable Designs," in Proc. of 9th International Workshop on Field-
Programmable Logic and Applications, Tallinn, Estonia, September 1998, pp. 59-
68.

[24] R. Subramaniyan, P. Raman, A. George, and M. Radlinski, "GEMS: Gossip-
Enabled Monitoring Service for Scalable Heterogeneous Distributed Systems,"
white paper, currently in journal review [online] 2003,
http://www.hcs.ufl.edu/pubs/GEMS2003.pdf (Accessed: April 14, 2004).

[25] K. Sistla, A. George, and R. Todd, "Experimental Analysis of a Gossip-based
Service for Scalable, Distributed Failure Detection and Consensus," Cluster
Computing, vol. 6, no. 3, July 2003, pp. 237-251 (in press).

[26] D. Collins, A. George, and R. Quander, "Achieving Scalable Cluster System
Analysis and Management with a Gossip-based Network Service," presented at
IEEE Conference on Local Computer Networks, Tampa, FL, November 2001.

[27] K. Sistla, A. George, R. Todd, and R. Tilak, "Performance Analysis of Flat and
Layered Gossip Services for Failure Detection and Consensus in Scalable
Heterogeneous Clusters," presented at IEEE Heterogeneous Computing Workshop
at the Intnl. Parallel and Distributed Processing Symposium, San Francisco, CA,
April, 2001.

[28] S. Ranganathan, A. George, R. Todd, and M. Chidester, "Gossip-Style Failure
Detection and Distributed Consensus for Scalable Heterogeneous Clusters," Cluster
Computing, vol. 4, no. 3, July 2001, pp. 197-209.

[29] M. Burns, A. George, and B. Wallace, "Simulative Performance Analysis of Gossip
Failure Detection for Scalable Distributed Systems," Cluster Computing, vol. 2, no.
3, July 1999, pp. 207-217.

[30] Z. Li, S. Hauck, "Configuration Prefetching Techniques for Partial Reconfigurable
Coprocessor with Relocation and Defragmentation," presented at 10th International
Symposium on Field Programmable Gate Arrays, Monterey, CA, February 2002.

[31] D. Gustavson, Q. Li, "The Scalable Coherent Interface (SCI)," IEEE
Communications, vol. 34, no. 8, August 1996, pp. 52-63.

[33] S. Oral and A. George, "Multicast Performance Analysis for High-Speed Torus
Networks," presented at 27th IEEE Conference on Local Computer Networks via
the High-Speed Local Networks Workshop, Tampa, FL, November 2002.

[34] S. Oral and A. George, "A User-level Multicast Performance Comparison of
Scalable Coherent Interface and Myrinet Interconnects," presented at 28th IEEE
Conference on Local Computer Networks via the High-Speed Local Networks
Workshop, Bonn/Königswinter, Germany, October 2003.

[35] R. Brightwell and L. Fisk, "Scalable Parallel Application Launch on Cplant," in
Proc. of the ACM/IEEE Conference on Supercomputing, Denver, CO, November
2001, p. 40.

[36] "The Earth Simulator Center" [online] 2004, http://www.es.jamstec.go.jp/esc/eng/
(Accessed: June 24, 2004).

[37] P. Kirk, "Gnutella Stable 0.4" [online] 2003 http://rfc-gnutella.sourceforge.net/
developer/stable/index.html (Accessed: June 24, 2004).

[38] R. Schollmeier, "A Definition of Peer-to-Peer Networking for the Classification of
Peer-to-Peer Architectures and Applications," presented at the IEEE International
Conference on Peer-to-Peer Computing, Linköping, Sweden, August 2001.

[39] Napster, LLC, "Napster.com," [online] 2004, http://www.napster.com (Accessed:
April 14, 2004).

[40] P. Ganesan, Q. Sun, and H. Garcia-Molina, "YAPPERS: A Peer-to-Peer Lookup
Service over Arbitrary Topology," presented at 22nd Annual Joint Conf. of the
IEEE Computer and Communications Societies, San Francisco, CA, April 2003.

[41] Dolphin Interconnect Solutions Inc., "Dolphin Interconnect Solutions" [online]
2004, http://www.dolphinics.com/ (Accessed: April 15, 2004).

[42] M. Baker and J. Ousterhout, "Availability in the Sprite Distributed File System,"
ACM Operating Systems Review, vol. 25, no. 2, April 1991, pp. 95-98.

BIOGRAPHICAL SKETCH

Aju Jacob received two bachelor's degrees, one in computer engineering and the other in electrical engineering, from the University of Florida. He is presently a graduate student in the Department of Electrical and Computer Engineering at the University of Florida, working toward his Master of Science in electrical and computer engineering with an emphasis in computer systems and networks.

Currently, Aju is a research assistant in the High-performance Computing and Simulation Research Laboratory at the University of Florida, where his research focuses on reconfigurable computing. Previously, he spent a summer doing research in the Information Systems Laboratory at the University of South Florida on the design and implementation of an FPGA-based CICQ switch architecture.

Aju has gained valuable industry experience through internships and projects. As part of the IPPD program, he worked in a team environment for Texas Instruments on a project that involved designing and implementing a high-speed test interface. He also gained experience during an internship in AOL Time Warner's Web Fulfillment Division, where he designed and implemented web pages and a reporting system for e-business and was exposed to the latest web design tools and technologies.