
Remote Sensing and Imaging in a Reconfigurable Computing Environment



REMOTE SENSING AND IMAGING IN A RECONFIGURABLE COMPUTING ENVIRONMENT

By

VIKAS AGGARWAL

A THESIS PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE

UNIVERSITY OF FLORIDA

2005


This document is dedicated to my parents and my sister.


ACKNOWLEDGMENTS

I would first like to thank the Almighty for giving me an opportunity to come this far in life. I also wish to thank the Department of ECE at UF and all of its professors for their words of wisdom; Dr. Alan George and Dr. Kenneth Slatton for their infinite support, guidance, and encouraging words; and all the members of the High-performance Computing and Simulation Lab and the Adaptive Signal Processing Lab for their technical support and friendship. I also take this opportunity to thank my parents for their nurture and support, and my sister for always encouraging me whenever I was down. I hope I can fulfill all their expectations in life.


TABLE OF CONTENTS

ACKNOWLEDGMENTS
LIST OF TABLES
LIST OF FIGURES
ABSTRACT

CHAPTER

1 INTRODUCTION

2 BACKGROUND AND RELATED RESEARCH
   2.1 Reconfigurable Computing
      2.1.1 The Era of Programmable Hardware Devices
      2.1.2 The Enabling Technology for RC: FPGAs
   2.2 Remote-Sensing Test Application: Data Fusion
      2.2.1 Data Acquisition Process
      2.2.2 Data Fusion: Multiscale Kalman Filter and Smoother
   2.3 Related Research

3 FEASIBILITY ANALYSIS AND SYSTEM ARCHITECTURE
   3.1 Issues and Trade-offs
   3.2 Fixed-point vs. Floating-point Arithmetic
   3.3 One-dimensional Time-tracking Kalman Filter Design
   3.4 Full System Architecture

4 MULTISCALE FILTER DESIGN AND RESULTS
   4.1 Experimental Setup
   4.2 Design Architecture #1
   4.3 Design Architecture #2
   4.4 Design Architecture #3
   4.5 Performance Projection on Other Systems
      4.5.1 Nallatech's BenNUEY Motherboard (with BenBLUE-II daughter card)
      4.5.2 Honeywell Reconfigurable Space Computer (HRSC)


5 CONCLUSIONS AND FUTURE WORK

LIST OF REFERENCES

BIOGRAPHICAL SKETCH


LIST OF TABLES

3.1 Post-place-and-route results showing the resource utilization for a 32×32-bit divider and the maximum frequency of operation
3.2 Quantization error introduced by fixed-point arithmetic
3.3 Performance results tabulated over 600 data points (slices occupied: 2%)
4.1 Performance comparison of a processor with an FPGA configured with design #1
4.2 Comparison of the execution times on the FPGA with a Xeon processor, and resource requirements of the hardware configuration for design #2
4.3 Components of the execution time on an FPGA for processing eight rows of the input image using design #3
4.4 Components of the total execution time on the FPGA for processing a single scale of input data with different levels of concurrent processing


LIST OF FIGURES

2.1 Various computation platforms on the flexibility and performance spectrum
2.2 A simplistic view of an FPGA and a logic cell inside that forms the basic building block
2.3 Block diagram description of the RC1000 board. Courtesy: RC1000 reference manual
2.4 The steps involved in data acquisition and data processing
2.5 Quad-tree data structure
3.1 Data generation and validation
3.2 Set of sequential processing steps involved in the 1-D Kalman filter
3.3 Differences between Matlab-computed estimates and Handel-C fixed-point estimates using eight bits of precision after the binary point for 100 data points
3.4 Processing path for (a) calculation of parameters and (b) calculation of the estimates
3.5 High-level system block diagram
4.1 Simulated observations corresponding to the 256×256 resolution scale
4.2 Block diagram of the architecture of design #1
4.3 Block diagram for the architecture of both the Kalman filter and smoother
4.4 Error statistics of the output obtained from the fixed-point implementation
4.5 Performance improvement and change in resource requirements with increase in concurrent computation
4.6 Block diagram of the architecture of design #2
4.7 Performance improvement and change in resource requirements with increase in the number of concurrently processed pixels


4.8 Block diagram of the architecture of design #3
4.9 Improvement in performance with increase in concurrent computations
4.10 Error statistics for the outputs obtained after a single scale of filtering
4.11 Block diagram for the hardware architecture of the Nallatech BenNUEY board with a BenBLUE-II extension card
4.12 Block diagram of the hardware architecture of the HRSC board


Abstract of Thesis Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Master of Science

REMOTE SENSING AND IMAGING IN A RECONFIGURABLE COMPUTING ENVIRONMENT

By

Vikas Aggarwal

December 2005

Chair: Kenneth C. Slatton
Cochair: Alan D. George
Major Department: Electrical and Computer Engineering

In recent years, there has been significant improvement in the sensors employed for data collection. This improvement has greatly increased the amount of data involved, rendering conventional techniques of data collection, dissemination, and ground-based processing impractical in several situations. The possibility of on-board processing has opened new doors for real-time applications and has reduced the demands on the bandwidth of the downlink. Reconfigurable computing (RC), a rising paradigm in high-performance computing, could serve as the enabling technology for such systems, where conventional computing resources are constrained by factors such as size, power, and cost.

This work explores the possibility of deploying reconfigurable systems in remote sensing applications. As a case study, a data fusion application, which combines the information obtained from multiple sensors of different resolutions, is used to perform a feasibility analysis. The conclusions drawn from different design architectures for the test application are used to identify the limitations of current systems and to propose future systems enabled with RC resources.




CHAPTER 1
INTRODUCTION

Recent advances in sensor technology, such as increased resolution, frame rate, and number of channels, have resulted in a tremendous increase in the amount of data available for imaging applications, such as airborne and space-based remote sensing of the Earth, biomedical imaging, and computer vision. Raw data collected from the sensor must usually undergo significant processing before it can be properly interpreted. The need to maximize processing throughput, especially on embedded platforms, has in turn driven the need for new processing modalities. Today's data acquisition and dissemination systems need to perform more processing than their predecessors to support real-time applications and to reduce the bandwidth demands on the downlink. Though cluster-based computing resources are the most widely used platform at ground stations, several factors, such as space, cost, and power, make them impractical for on-board processing. FPGA-based reconfigurable systems are emerging as low-cost solutions that offer enormous computational potential in both the cluster-based and embedded systems arenas.

Remote sensing systems are employed in many different forms, covering the gamut from compact, power- and weight-limited satellite and airborne systems to much larger ground-based systems. The enormous number of sensors involved in the data collection process places heavy demands on the I/O capabilities of the system. The problem can be approached from two directions: first, performing data compression on-board before transmitting the data; or second, performing some on-board computation and transmitting the processed data.


The target computation system must be capable of processing multiple data streams in parallel at a high rate to support real-time applications, which further increases the complexity of the problem. The nature of the problem demands that the processing system not only be capable of high performance but also deliver excellent performance per unit cost (where the cost includes several factors such as power, space, and system price). A plethora of publications have demonstrated success in porting several remote sensing and image processing applications to FPGA-based platforms [1-5]. Some researchers have also created high-level algorithm development environments that expedite the porting and streamlining of such application code [6-8]. However, understanding the special needs of this class of applications, and analyzing existing platforms to determine their viability as future computation engines for remote sensing systems, warrants further research and examination. Identifying the components missing from current platforms that are essential for such systems is the focus of this work. A remote sensing application is used to illustrate the process.

In this work, a data fusion application has been chosen as representative of the class of remote sensing applications because it incorporates a wide variety of features that stress different aspects of the target computation system. The recent interest in sensor research has led to a multitude of sensors in the market, which differ drastically in their phenomenology, accuracy, resolution, and quantity of data. Traditionally, Interferometric Synthetic Aperture Radar (InSAR) has been employed for mapping extended areas of terrain at moderate resolution. Airborne Laser Swath Mapping (ALSM) has been increasingly employed to map local elevations at high resolution over smaller regions.


A multiscale estimation framework can then be employed to fuse the data obtained from such different sensors of different resolutions to produce improved estimates over large coverage areas while maintaining high resolution locally. The nature of the processing involved imposes an enormous computational burden; hence, the target system must be equipped with substantial computational capability.

Early processing systems fell into two separate camps. The first camp saw a need to accommodate wide varieties of applications, with multiple processes running concurrently on the same system, and therefore chose General-Purpose Processors (GPPs) to serve its needs. The other camp preferred to improve the speed of the application and chose to leverage the performance advantages of Application-Specific Integrated Circuits (ASICs). Over time, these two camps drifted further apart in terms of processing ability, flexibility, and cost. Meanwhile, due to the technological advancements of the past decade, Reconfigurable Computing (RC) has garnered a great deal of attention from both the academic community and industry. Reconfigurable systems fuse the merits of both camps and have proven to be a promising alternative. RC has demonstrated speed improvements on the order of 10 to 100 in comparison to GPPs for certain application domains such as image and signal processing [2, 4, 5, 9]. An even more remarkable aspect of this relatively new programming paradigm is that the performance improvements are obtained at less than two-thirds of the cost of conventional processors. FPGA-enabled reconfigurable processing platforms have even outperformed ASICs in market domains, including signal processing and cryptography, where ASICs and DSPs have been the dominant modalities for decades.


The past decade has seen tremendous growth in RC technology, but it is still in its infancy. The development tools, target system architectures, and even the processes for porting applications need to mature before they can support meaningful accomplishments. However, RC-based designs have already shown performance speedups in application domains, such as image processing, that require processing patterns similar to many remote-sensing applications. Conventional processor-based resources cannot be employed in such applications because of their inherent limitations of size, power, and weight, which RC-based systems can overcome. The structure of imaging algorithms lends itself to a high degree of parallelism that can be exploited in hardware by FPGAs. The computations are often data-parallel, require little control, operate on large data sets (effectively infinite streams), and involve raw sensor data elements of modest bit width, all of which make them amenable to RC. However, this class of applications has three characteristics that make it challenging. First, it involves many arithmetic operations (e.g., multiply-accumulates and trigonometric functions) on real and/or complex data. Second, it requires significant memory support, not just in capacity but also in the bandwidth that can be sustained. Third, the scale of computation is large, requiring possibly hundreds of parallel operations and high-bandwidth interconnections to meet real-time constraints. These challenges must be addressed if RC systems are to significantly impact future remote-sensing systems.

This work explores the possibility of deploying reconfigurable systems in remote-sensing applications, using the chosen test case for feasibility analysis. The conclusions drawn from the different design architectures for the test application are used to identify the limitations of current systems and to propose solutions that enable future systems with RC resources.


The structure of the remaining document is as follows. Chapter 2 presents a brief background on reconfigurable computing, with FPGAs as the enabling technology, and on data fusion using multiscale Kalman filters and smoothers. It also discusses related research in the field of reconfigurable computing as applied to remote sensing and similar application domains. Chapter 3 presents tests performed for the initial study and feasibility analysis. The experiments are based on designs of the 1-D Kalman filter, which forms the heart of the calculations involved in the data fusion application. Chapter 4 presents a sequence of revisions to the 2-D filter designs developed to solve the problem, along with the associated methodologies. Each of these designs builds on the limitations identified in the previous design and proposes a better solution under the given system constraints. Their performance is compared with baseline C code running on a Xeon processor. Several graphs and tables derived from the results are also presented. The final architecture in the chapter emulates the performance of an ideal system, and consequently outperforms the processor-based solution by more than an order of magnitude. Chapter 5 summarizes the research and the findings of this work. It draws conclusions based on the results and observations presented in the previous chapters, and it gives some directions for work beyond this thesis.


CHAPTER 2
BACKGROUND AND RELATED RESEARCH

This work involves an interdisciplinary study of remote-sensing applications and a new paradigm in the field of high-performance computing: Reconfigurable Computing. The fast development times offered by reconfigurable devices, their density, and advanced features such as optimized compact hardwired cores, programmable interconnections, memory arrays, and communication interfaces have made them a very attractive option for both terrestrial and space-/air-borne applications. There are multiple advantages to equipping future systems with reconfigurable computation engines. First, they help in overcoming the limited bandwidth of the downlink. Second, they create the possibility of providing several real-time applications on-board. Third, they can be used for feedback mechanisms that change the data collection strategy in response to the quality of the received data, or that change the instrumentation planning policy. This thesis aims at designing different architectures for a test application to analyze various features of the existing platforms and suggest necessary improvements. The hardware designs for the FPGA are implemented using the Handel-C language (with DK-3 as its integrated development environment) to expedite the process compared to the conventional HDL design flow. This chapter provides a comprehensive discussion of different aspects of reconfigurable computing, along with a brief description of the application. To summarize the existing research, a brief review of the relevant prior work in this field and other related fields is also presented in this chapter.


2.1 Reconfigurable Computing

This section presents some history on the germination, progress, and recent explosion of this relatively new computing paradigm. It describes the rise in the usage of programmable logic devices in general over time and concludes with a detailed discussion of the enabling technology for RC: FPGAs.

2.1.1 The Era of Programmable Hardware Devices

From the infancy of Application-Specific Integrated Circuits (ASICs), designers could foresee a need for chips with specialized hardware that would provide enormous computational potential. The state of the art of IC fabrication technology limited the amount of logic that could be packed into a single chip. During the 1980s and 90s the fabrication technology matured, many fabrication processes improved drastically, and the era of VLSI began. The high development and fabrication costs started dropping as the 90s saw an explosion in the demand for such products. The 1980s and 90s also saw the birth of the "killer microprocessors" [10], which started capturing a large portion of the market. The faster time to market motivated many to forgo ASICs and adopt general-purpose processors (GPPs) or special-purpose processors such as digital signal processors (DSPs). While this approach provided a great deal of success in several application domains with relative simplicity, it was never able to match the performance of specialized hardware in the high-performance computing (HPC) community. Real-time systems and other HPC systems with heavy computational demands still had to revert to ASIC implementations. To overcome the limitations of high non-recurring engineering (NRE) costs and long development times, an alternative methodology was developed: Programmable Logic Devices (PLDs). PLDs started playing a major role in the early 90s. Since they provided faster design cycles and mitigated the initial costs, they were soon adopted as inexpensive prototyping tools for design exploration in ASIC-based systems.


As the technology matured, the application of PLDs expanded beyond their role as placeholders into essential components of final systems [11]. Due to their ability to be programmed in the field, developers could foresee PLDs playing a major role in HPC, where they could offer many advantages over conventional GPPs and ASICs. The GPP and the ASIC have existed at two extremes of the spectrum of computational resources. The key trade-off has been that of flexibility, where GPPs have held the lead in the market, versus performance, where ASICs have overshadowed them. PLDs (also known as RC engines because of their ability to be reprogrammed) have made a strong impact in the market by providing the best of both worlds.

Figure 2.1. Various computation platforms on the flexibility and performance spectrum


2.1.2 The Enabling Technology for RC: FPGAs

As gate density improved further, a particular family of PLDs, namely FPGAs, became an especially attractive option for researchers. An FPGA consists of an array of configurable logic elements along with a fully programmable interconnect fabric capable of providing complex routing between these logic elements [12]. Figure 2.2 presents an oversimplified structure of an FPGA. The routing resources, represented in the diagram by the horizontal and vertical wires that run between the Configurable Logic Blocks (CLBs), consume over 75% of the area on the chip. The flexible nature of an FPGA's architecture provides a platform onto which applications can be mapped efficiently. The ability to reconfigure the logic cells and the connections allows the behavior of the system to be modified even after deployment. This feature has important implications for supporting multiple applications, as an FPGA-based system can eliminate the need to create a new system each time a new application is ported.

Figure 2.2. A simplistic view of an FPGA and a logic cell inside that forms the basic building block: CLBs (LUTs plus flip-flops) sit in a fabric of programmable connections, surrounded by I/O blocks that provide access to the external pins [13].


Modern FPGAs embed a host of advanced processing blocks, such as hardware multipliers and processor cores, to make them more amenable to complex processing applications. One of the several advantages that FPGAs offer over conventional processors is that they are massively parallel computing machines that lend themselves well to applications with inherent fine-grain parallelism. Because the farm of CLBs can operate completely independently, a large number of operations can take place on-chip simultaneously, unlike in most other computing devices. This capacity for concurrent computation, together with the high memory bandwidth offered by internal RAMs, gives FPGAs an edge over DSPs for several signal processing applications. Highly pipelined designs help further in overlapping and hiding latencies at the various processing steps. The execution time of control-system software is difficult to predict on modern processors because of caches, virtual memory, pipelining, and several other issues that make the worst-case performance significantly different from the average case. In contrast, the execution time on FPGAs is deterministic, which is an important factor for time-critical applications.

Although FPGAs provide a powerful platform for efficiently creating highly optimized hardware configurations of applications, the process of configuration generation can be quite labor-intensive. Hence, researchers have been looking for alternative ways of porting applications with relative ease. Several graphical tools and higher-level programming tools are being developed by vendors to speed up the design cycle of porting an application to an FPGA. This thesis makes use of one such high-level programming tool called Handel-C (which started as a project at Oxford University and was later developed into a commercial tool by Celoxica Inc.) [8] to enable fast prototyping and architectural analysis.


Handel-C provides an extension, and somewhat of a superset, of standard ANSI C, including additional constructs for communication channels and parallelization pragmas, while removing support for many ANSI C standard libraries and other functionality such as recursion and compound assignments. DK, the development environment that supports Handel-C, provides floating-point and fixed-point libraries. The compiler can produce synthesizable VHDL or an EDIF netlist and supports functional simulation. Handel-C and its corresponding development environment have been used previously in numerous other projects, including image processing algorithms [14] and other HPC benchmarks [2].

The most common way in which RC-based systems exist today is as extensions to conventional processors. The FPGAs are integrated with memory and other essential components on a single board, which then attaches to the host processor through an interconnect such as a Peripheral Component Interconnect (PCI) bus. This work makes use of an RC1000 board, a PCI-based board developed by Celoxica Inc. and equipped with a VirtexE 2000 FPGA. Hardware configurations for the board can be generated using the DK development environment or by following the more conventional VHDL design path.

Figure 2.3 shows a block diagram of the RC1000 board. The card consists of 8 MB of memory organized into four independently accessible banks of 2 MB each. The memory is accessible both to the FPGA and to any other device on the PCI bus. The FPGA has two clock inputs on the global clock buffer. The first pin derives its clock from a programmable clock source or an external clock, whereas the second pin derives its clock from a second programmable clock source or the PCI bus clock.


The board supports a variety of data transfers over the PCI bus, ranging from bit- and byte-wide register transfers to DMA transfers into one of the memory banks. Recently, researchers have felt the need to develop stand-alone FPGA systems that can be accessed over traditional network interfaces. Such an autonomous system with an embedded processor and an FPGA [15] (a novel concept developed by researchers at the HCS Lab, known as the Network-Attached Reconfigurable Computer or NARC) offers a very cost-effective solution for the embedded computing world, especially where power and space are at a premium.

Figure 2.3. Block diagram description of the RC1000 board, showing the four 512k×32 SRAM banks, the PCI-PCI bridge, two PMC slots, clocks and control, and the Xilinx FPGA. Courtesy: RC1000 reference manual [16].


2.2 Remote-Sensing Test Application: Data Fusion

In the past couple of decades, there has been a tremendous amount of research in sensor technology. This research has resulted in rapid advancement of the related technologies and a plethora of sensors in the market that differ significantly in the quality of the data they collect. One of the most important applications that has attracted overwhelming attention in remote sensing is that of mapping topography and building digital elevation maps of regions of the Earth using different kinds of sensors. These maps are then employed by researchers in different disciplines for various scientific applications (e.g., in oceanography for estimating ocean surface heights and the behavior of currents, and in geodesy for estimating the Earth's gravitational equipotential). Traditionally, satellite-based systems equipped with sensors like InSAR and Topographic Synthetic Aperture Radar (TOPSAR) have been employed to map extended areas of topography, but these sensors lack high accuracy and produce images of moderate resolution over the region of interest. Recently, ALSM has emerged as an important technology for remotely sensing topography. The ALSM sensor provides extremely accurate, high-resolution maps of local elevations, but it operates through a very exhaustive process that limits the coverage to smaller regions. Because of the varying nature of the data produced by these sensors, researchers have been developing algorithms that fuse the information obtained at different resolutions into a single elevation map. A multiscale estimation framework developed by Fieguth [17] has been employed extensively over the past decade for performing efficient statistical analysis, interpolation, and smoothing. This framework has also been adopted to fuse ALSM and InSAR data of different resolutions to produce improved estimates over large coverage areas while maintaining high resolution locally.


2.2.1 Data Acquisition Process

Before delving into the mathematics of the estimation algorithm, a brief description of the current data acquisition process is presented to aid in the understanding of this work and the motivation behind it. Collecting ALSM data involves flying the sensor in an aircraft over the region of interest. As the aircraft travels along each flight line, the sensor scans the region below in a raster-scan fashion, in a direction orthogonal to the movement of the aircraft.

Figure 2.4. The steps involved in data acquisition and data processing

With the help of several other position sensors and movement measurement instruments, the XYZ coordinates of the mapped topography are generated and stored on disk in ASCII format. These XYZ coordinates are then used to generate a dense 3-D point cloud of irregularly spaced points. Gridding this 3-D data set in the X and Y directions at varying grid spacings yields 2-D maps of corresponding resolutions.
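As a concrete illustration of the gridding step, the following C sketch bins irregularly spaced (X, Y, Z) points onto a regular grid and averages the returns that fall into each cell. The grid dimensions, extents, and the simple cell-averaging rule are assumptions chosen for illustration, not the exact procedure used in the ALSM processing chain.

    /* Hypothetical gridding sketch: average all returns falling in a cell.
       Cells with no returns are left at zero.                             */
    #define NX 256
    #define NY 256

    void grid_points(const double *x, const double *y, const double *z,
                     int npts, double x0, double y0, double dx, double dy,
                     double grid[NY][NX])
    {
        static int count[NY][NX];              /* returns per cell         */
        for (int r = 0; r < NY; r++)
            for (int c = 0; c < NX; c++) { grid[r][c] = 0.0; count[r][c] = 0; }

        for (int i = 0; i < npts; i++) {
            int c = (int)((x[i] - x0) / dx);   /* column index along X     */
            int r = (int)((y[i] - y0) / dy);   /* row index along Y        */
            if (r >= 0 && r < NY && c >= 0 && c < NX) {
                grid[r][c] += z[i];
                count[r][c]++;
            }
        }
        for (int r = 0; r < NY; r++)           /* cell average             */
            for (int c = 0; c < NX; c++)
                if (count[r][c] > 0) grid[r][c] /= count[r][c];
    }

Running the same binning at a coarser spacing (larger dx and dy over the same extent) produces the lower-resolution maps referred to above.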


These images are then employed in the multiscale estimation framework, together with the SAR images, to fuse the data sets and produce improved maps of topography. Because of the lack of processing capability on the aircraft, these operations cannot be performed in real time; hence, the data are stored on disk and processed offline at ground stations. Several applications could be made possible if on-board processing facilities were available on the aircraft. Such a real-time system would offer several advantages over conventional systems. For example, it could be used to change the data collection strategy, or to repeat the process over selected regions, in response to the quality of the data obtained. RC-based platforms, as described in the previous section, are a natural fit for deployment in such systems.

Although this work deals closely with an aircraft-based target system, the issues involved are generic and apply to most other remote sensing systems, with some exceptions. As a result, some issues are not addressed in this work. For example, radiation effects, which have important implications for satellite-based systems (where some form of redundancy must be provided to overcome single-event upsets, or SEUs), do not affect an aircraft-based system. While it would be desirable to design a complete system that could be deployed on-board an aircraft, doing so would entail a plethora of implementation issues that divert the focus from the more interesting research aspects of this work. Hence, instead of building an end-to-end system, this work focuses on the data fusion application employed in the overall process (Figure 2.4) and uses it as a test case to analyze the feasibility of deploying RC-based systems in the remote sensing arena. The following subsection describes this data fusion algorithm.


2.2.2 Data Fusion: Multiscale Kalman Filter and Smoother

The multiscale models that are the focus of this thesis were proposed by Fieguth et al. [17] and provide a scale-recursive framework for estimating topographies at multiple resolutions. This multiresolution estimation framework enables highly efficient statistical analysis, interpolation, and smoothing of extremely large data sets. The framework also enjoys a number of other advantages not shared by other statistical methods. In particular, the algorithm's complexity grows only linearly with the number of leaf nodes. Additionally, the algorithm provides interpolated estimates at multiple resolutions, along with the corresponding error variances, which are useful in assessing the accuracy of the estimates. For these reasons, and many more, researchers have adopted this algorithm for various remote-sensing applications.

Multiscale Kalman smoothers modeled on fractional Brownian motion are defined on index sets organized as multi-level quad-trees, as shown in Figure 2.5. The multiscale estimation is initiated with a fine-to-coarse sweep up the quad-tree that is analogous to Kalman filtering with an added merge step. This fine-to-coarse sweep is followed by a coarse-to-fine sweep down the quad-tree that corresponds to Kalman smoothing.

Figure 2.5. Quad-tree data structure, where m = 0, 1, 2, ... represents the scale.
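Although the thesis does not spell out a storage layout, one convenient software view of this structure keeps each scale m as a dense 2^m x 2^m array, which makes the parent and child relations simple index arithmetic. A hypothetical C sketch:

    /* Dense quad-tree layout (an assumption for illustration): scale m is a
       2^m x 2^m image, so node (r, c) at scale m has its parent at
       (r/2, c/2) in scale m-1 and its q = 4 children at (2r+i, 2c+j),
       i, j in {0, 1}, in scale m+1.                                        */
    #include <stddef.h>

    static size_t node_index(int m, int r, int c)
    {
        return ((size_t)r << m) + (size_t)c;   /* row-major within scale m  */
    }

    static void parent_of(int r, int c, int *pr, int *pc)
    {
        *pr = r >> 1;                          /* one scale coarser          */
        *pc = c >> 1;
    }

    static void child_of(int r, int c, int i, int j, int *cr, int *cc)
    {
        *cr = 2 * r + i;                       /* one scale finer            */
        *cc = 2 * c + j;
    }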


The statistical process defined on the tree is related to the observation process and has the coarse-to-fine scale mapping defined as follows:

x(s) = A(s) x(s\bar{\gamma}) + B(s) w(s)    (1)

y(s) = C(s) x(s) + v(s)    (2)

where
s represents an abstract index for a node on the tree
\bar{\gamma} represents the lifting operator, so s\bar{\gamma} represents the parent of s
x(s) represents the state variable
y(s) represents the observation (LIDAR or InSAR)
A(s) represents the state transition operator
B(s) represents the stochastic detail scaling function
C(s) represents the measurement-state relation
w(s) represents the white process noise
v(s) represents the white measurement noise
\alpha represents the lowering operator, so s\alpha_n represents the n-th child of s
q represents the order of the tree, i.e., the number of descendants a parent has

The process noise w(s) is Gaussian with zero mean and variance given by the following relations:

E[w(s) w^T(t)] = Q \delta_{s,t}    (3)

w(s) \sim N(0, Q)    (4)

The prior distribution of the state at the root node is given by

x(0) \sim N(0, P_0)    (5)


The parameters A(s), B(s), and C(s) that define the model need to be chosen appropriately to match the process being modeled. The state transition operator A(s) was chosen to be 1, creating a model in which each child node is the true value of the parent node offset by a small value dependent on the process noise. The parameter B(s) is obtained using power-spectral matching or fractal-dimension classification methods. The measurement-state relation C(s) was assigned the value 1 for all pixels, representing the case in which observations are present at all pixels without any data dropout.

Corresponding to any choice of the downward model, an upward model on the tree can be defined as in Fieguth et al. [17]:

x(s\bar{\gamma}) = F(s) x(s) + \bar{w}(s)    (6)

y(s) = C(s) x(s) + v(s)    (7)

F(s) = P_{s\bar{\gamma}} A^T(s) P_s^{-1}    (8)

\bar{Q}(s) = E[\bar{w}(s) \bar{w}^T(s)] = (1 - P_{s\bar{\gamma}} A^T(s) P_s^{-1} A(s)) P_{s\bar{\gamma}}    (9)

where P_s is the covariance of the state, defined as P_s = E[x(s) x^T(s)].

Now the algorithm can proceed with the two steps outlined above (upward and downward sweeps) after initializing each leaf node with the prior values:

\hat{x}(s|s+) = 0    (10)

P(s|s+) = P_s    (11)

A) Upward sweep

The operations involved in the upward sweep are very similar to those of a 1-D Kalman filter, which forms the heart of the computation and can be perceived as running along the scale at each pixel, with an additional merge step after every iteration.


The calculations performed at each node are as follows:

V(s) = C(s) P(s|s+) C^T(s) + R(s)    (12)

K(s) = P(s|s+) C^T(s) V^{-1}(s)    (13)

P(s|s) = [I - K(s) C(s)] P(s|s+)    (14)

\hat{x}(s|s) = \hat{x}(s|s+) + K(s) [y(s) - C(s) \hat{x}(s|s+)]    (15)

The Kalman filter prediction step is then applied at all nodes except the leaf nodes, which were initialized as described above:

\hat{x}(s|s\alpha_i) = F(s\alpha_i) \hat{x}(s\alpha_i|s\alpha_i)    (16)

P(s|s\alpha_i) = F(s\alpha_i) P(s\alpha_i|s\alpha_i) F^T(s\alpha_i) + \bar{Q}(s\alpha_i)    (17)

This yields a prediction estimate of the parent node from each descendant (i = 1, ..., q), and these predictions are then merged into a single estimate to be used in the measurement update step:

\hat{x}(s|s+) = P(s|s+) \sum_{i=1}^{q} P^{-1}(s|s\alpha_i) \hat{x}(s|s\alpha_i)    (18)

P^{-1}(s|s+) = (1 - q) P_s^{-1} + \sum_{i=1}^{q} P^{-1}(s|s\alpha_i)    (19)

This process is iterated over all the scales m until the root node is reached, generating estimates at multiple scales based on the statistical information acquired from the descendant layers. The completion of the upward sweep yields a smoothed estimate \hat{x}_s(0) = \hat{x}(0|0) at the root node.
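To make the upward sweep concrete, the following C sketch implements Equations (12)-(19) for the scalar states used in this thesis. It is an illustrative software model (with variable names mirroring the equations), not the Handel-C hardware description.

    /* Scalar model of one upward-sweep step (Eqs. 12-19), q = 4 children. */
    typedef struct {
        double x_prior, P_prior;   /* x^(s|s+), P(s|s+) */
        double x_upd,   P_upd;     /* x^(s|s),  P(s|s)  */
    } Node;

    /* Measurement update, Eqs. (12)-(15). */
    void measurement_update(Node *n, double y, double C, double R)
    {
        double V = C * n->P_prior * C + R;                 /* (12) */
        double K = n->P_prior * C / V;                     /* (13) */
        n->P_upd = (1.0 - K * C) * n->P_prior;             /* (14) */
        n->x_upd = n->x_prior + K * (y - C * n->x_prior);  /* (15) */
    }

    /* Prediction of the parent from child i, Eqs. (16)-(17). */
    void predict_parent(const Node *child, double F, double Qbar,
                        double *x_pred, double *P_pred)
    {
        *x_pred = F * child->x_upd;                        /* (16) */
        *P_pred = F * child->P_upd * F + Qbar;             /* (17) */
    }

    /* Merge the q child predictions into the parent prior, Eqs. (18)-(19). */
    void merge(Node *parent, const double x_pred[4], const double P_pred[4],
               double P_s /* prior state covariance of the parent */)
    {
        const double q = 4.0;
        double inv_P = (1.0 - q) / P_s;                    /* (19) */
        double acc   = 0.0;
        for (int i = 0; i < 4; i++) {
            inv_P += 1.0 / P_pred[i];
            acc   += x_pred[i] / P_pred[i];
        }
        parent->P_prior = 1.0 / inv_P;
        parent->x_prior = parent->P_prior * acc;           /* (18) */
    }

Note that the measurement updates of the four children, and indeed of all pixels within a scale, are mutually independent; this is the parallelism exploited by the hardware designs of Chapter 4.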

B) Downward sweep

The smoothed estimates for the remaining nodes are computed by propagating the information back down the tree in a smoothing step:

\hat{x}_s(s) = \hat{x}(s|s) + J(s) [\hat{x}_s(s\bar{\gamma}) - \hat{x}(s\bar{\gamma}|s)]    (20)

P_s(s) = P(s|s) + J(s) [P_s(s\bar{\gamma}) - P(s\bar{\gamma}|s)] J^T(s)    (21)

J(s) = P(s|s) F^T(s) P^{-1}(s\bar{\gamma}|s)    (22)

where \hat{x}_s(s) represents the smoothed estimates and P_s(s) the corresponding error variances.

It is worth mentioning here that the set of computations outlined above, despite being closely coupled, contains two independent processing paths. This fact is exploited further in Chapters 3 and 4, where the designs targeting the hardware are explored.

2.3 Related Research

There has been an abundance of publications over the past decade by researchers who have tried accelerating various image processing algorithms on FPGAs. There has also been a good deal of academic and industrial effort in deploying dual-paradigm systems (which make use of both conventional processing resources and RC-based resources) in space to improve on-board performance. However, only a limited amount of work has been done on understanding the nature of the processing involved in such applications, in order to identify their specialized demands on the target systems.

One of the earliest works on designing a Kalman filter for an FPGA was by Lee and Salcic in 1997 [18], in which they accelerated a Kalman tracking filter for multi-target tracking radar systems. They achieved an order-of-magnitude improvement with designs spread over six 8000-series Altera chips, in comparison to previous attempts [19-22] that targeted transputers, digital signal processors, and linear arrays for obtaining improved performance over software implementations.


In an application note from Celoxica Inc. [23], Chappel, Macarthur et al. present a system implementation for boresighting of sensor apertures using a Kalman filter for sensor fusion. The system utilizes a COTS-based FPGA that embeds a 32-bit softcore processor to perform the filtering operation. Their work serves as a classic example of developing a low-cost solution for embedded systems using FPGAs. There have also been other works [24-25] that perform Kalman filtering on an FPGA to solve similar problems, such as the implementation of a state-space controller and real-time video filtering. In [25], Turney, Reza, and Delva astutely pre-compute certain parameters to reduce the resource requirements of the algorithm.

The floating-point calculations involved in signal processing algorithms are not amenable to FPGA, or hardware, implementations in general, so researchers have resorted to fixed-point implementations and have been exploring ways to mitigate the errors thus induced. In [18], Lee and Salcic normalize certain coefficients by the process variance, either to maximize data accuracy with a fixed number of bits or to minimize the resource requirements for a certain level of accuracy. There has also been plenty of work on the algorithmic side to overcome such effects. In [26], Scharf and Siggurdsson present a study of scaling rules and round-off noise variances in a fixed-point implementation of a Kalman filter.

The 1-D Kalman filter involves heavily sequential processing steps and hence cannot fully exploit the fine-grain parallelism available in FPGAs. The multiscale Kalman filter, by contrast, involves independent operations on multiple pixels of an image and offers a high degree of parallelism (DoP) that is representative of the class of image processing algorithms.


The possibility of operating on multiple pixels in parallel has motivated many researchers to target different imaging algorithms to FPGAs. Researchers [2, 4, 5, 9] have presented several examples illustrating the performance improvements obtained by porting imaging algorithms, such as the 2-D Fast Fourier Transform, image classification, filtering, 2-D convolution, and edge detection, onto FPGA-based platforms. Dawood, Williams, and Visser have developed a complete system [27] for performing image compression using FPGAs on-board a satellite to reduce the bandwidth demands on the downlink. Several researchers have even developed high-level environments [6-7] to provide application programmers with a much easier interface for targeting FPGAs in imaging applications. They achieve this goal by developing a parameterized library of kernels commonly employed in signal and image processing; these cores can then be instantiated from a high-level environment as needed by the application.

Employing RC technology in the remote sensing arena is not a new concept, and several attempts have been made previously to take advantage of this technology. Buren, Murray, and Langley [28] have developed a reconfigurable computing board for high-performance computing in space using SRAM-based FPGAs. They address the special needs of such systems and identify some key components that are essential to success, such as high-speed dedicated memories for the FPGAs, high I/O bandwidth, and support for periodic reloading to mitigate radiation effects. In [3], Arribas and Macia have developed an FPGA board for a real-time vision development system tailored to the embedded environment. Besides the academic research community, industry has also shown keen interest in the field.


Honeywell has developed the Honeywell Reconfigurable Space Computer (HRSC) [29] board as a prototype of the RC adaptive processing cell concept for satellite-based processing. The HRSC incorporates system-level SEU mitigation techniques and facilitates the implementation of design-level techniques. In addition to hardware development research, an abundance of work has been done on developing hardware configurations for various remote-sensing applications. In [1], a SAR/GMTI range compression algorithm was developed for an FPGA-based system. Sivilotti, Cho et al. [30] developed an automatic target detection application for SAR images to meet the high bandwidth and performance requirements of the application. Other works [27, 31-33] discuss the issues involved in porting similar applications, such as geometric global positioning and sonar processing, onto FPGAs. This thesis aims to further the existing research in this field by developing a multiscale estimation application for an FPGA-enabled system and exploring different architectures to meet the system requirements.


CHAPTER 3
FEASIBILITY ANALYSIS AND SYSTEM ARCHITECTURE

This chapter presents a discussion of some of the issues involved and some initial experiments performed for feasibility analysis. The results of these tests influenced the choice of design parameters and the architectural decisions made for the hardware designs of the algorithm presented in the next chapter.

3.1 Issues and Trade-offs

Most signal processing algorithms executed on conventional platforms employ double-precision, floating-point arithmetic. As pointed out earlier, such floating-point arithmetic is not amenable to hardware implementation on FPGAs, as it has large logic-area requirements (a more detailed comparison is presented in the next subsection). Carrying out these processing steps in fixed-point arithmetic is desirable but introduces quantization error, which, if not controlled, can lead to errors large enough to defeat the purpose of hardware acceleration. Hence, there exists an important trade-off between the number of bits used and the amount of logic area required. There are techniques that mitigate these effects by modifying certain parts of the algorithm itself. Examples of such techniques include normalizing different parameters to reduce the dynamic range of the variables, and using variable bit precisions for different parameters, with more bits for the more significant variables. This work makes use of the fixed- and floating-point libraries available in Celoxica's DK package. To perform the experimental analysis, simulated data are generated using Matlab (the values for the simulated data were chosen to closely represent the true values of data acquired from the sensors). Hence, a procedure is required to verify the models in the FPGA against the data generated in Matlab.


Text files are used in this work as the vehicle for this task. Since Matlab produces double-precision, floating-point data while the FPGA requires fixed-point values, extra processing is needed to perform the conversion.

Figure 3.1. Data generation and validation: a Matlab simulation writes a .txt file containing the data, which drives the Handel-C/VHDL hardware model.

Another important issue, which can become a significant hurdle, arises from the nature of the processing involved in the Kalman filtering algorithm. The Kalman filter equations are recursive in nature and require the estimate from the current iteration to begin the next state calculation. This behavior is clearly visible from Figure 3.2, which shows the processing steps in a time-tracking filter. Starting from initial prior values, each iteration performs a measurement update,

K_k = P_k H^T (H P_k H^T + R_k)^{-1}
\hat{x}_k \leftarrow \hat{x}_k + K_k (z_k - H \hat{x}_k)
P_k \leftarrow (I - K_k H) P_k

followed by a time update,

\hat{x}_{k+1} = \Phi \hat{x}_k
P_{k+1} = \Phi P_k \Phi^T + Q_k

Figure 3.2. Set of sequential processing steps involved in the 1-D Kalman filter, where \hat{x}_k represents the state variable, P_k the associated error covariance, z_k the observation input, Q_k the process noise variance, \Phi the state transition operator, and R_k the measurement noise variance.
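For reference, the recurrence of Figure 3.2 can be written as the following C model (scalar form, double precision, with placeholder constants and dummy observations); the hardware version replaces these operations with fixed-point arithmetic.

    /* Scalar 1-D Kalman filter iteration from Figure 3.2 (illustrative). */
    #include <stdio.h>

    int main(void)
    {
        const double Phi = 1.0;    /* state transition operator            */
        const double H   = 1.0;    /* measurement-state relation           */
        const double Q   = 0.01;   /* process noise variance (example)     */
        const double R   = 1.0;    /* measurement noise variance (example) */

        double z[600];             /* observations (dummy values here)     */
        for (int k = 0; k < 600; k++) z[k] = 1.0;

        double x = 0.0, P = 1.0;   /* initial prior values                 */
        for (int k = 0; k < 600; k++) {
            /* measurement update */
            double K = P * H / (H * P * H + R);
            x = x + K * (z[k] - H * x);
            P = (1.0 - K * H) * P;
            /* time update: x and P feed the next iteration, which is the
               data dependence that prevents pipelining the hardware      */
            x = Phi * x;
            P = Phi * P * Phi + Q;
        }
        printf("final estimate %f, error variance %f\n", x, P);
        return 0;
    }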


This problem cannot be mitigated by pipelining the different stages, because of the data dependency that exists from the last stage of the pipeline to the first stage. Although this can be a major roadblock for 1-D tracking algorithms, the situation is much better in the multiscale filtering algorithm because of the presence of an additional dimension in which parallelism can be exploited along the scale. Multiple pixels in the same scale can be processed in parallel, as they are completely independent of each other. The number of such parallel paths ultimately depends on the amount of resources required by the processing path of each pixel pipeline.

Another interesting but subtle trade-off exists between the memory requirements and the logic-area requirements. The hardware architecture of the algorithm can be made to reuse some resources on the chip (e.g., the instantiated arithmetic operators, especially modules such as the multipliers and dividers, which consume excessive area) by saving intermediate results in the on-board SRAM. This approach decreases the logic-area demands of the algorithm, but at the cost of increased memory requirements and extra clock cycles for computing each estimate.

3.2 Fixed-point vs. Floating-point Arithmetic

A study was performed to compare the resource demands posed by floating-point operations with those of fixed-point and integer operations of equal bit widths. Since division is a very expensive operation in hardware, it yields the most meaningful differences and was hence chosen as the test case. The IEEE single-precision format was employed for floating-point division. A Xilinx VirtexII chip was chosen as the target platform for the experiment, to exploit the multiplier components present in the device (and also partly because the current version of the tool does not support floating-point division on any other platform). Table 3.1 compares the resource requirements for the different cases.


Table 3.1. Post-place-and-route results showing the resource utilization for a 32×32-bit divider and the maximum frequency of operation. Target chip: VirtexII 6000, package ff1152, speed grade -4.

                                    Integers     Fixed-point (16 bits      IEEE single-precision
                                    (32 bits)    before/after the point)   floating-point (32 bits)
    Slices (total 33,792)           19 (1%)      84 (1%)                   487 (1%)
    18-bit x 18-bit multipliers     3 (2%)       6 (4%)                    4 (2%)
    (total 144)
    Max. frequency                  63.2 MHz     50.5 MHz                  97.5 MHz

The high costs involved in the floating-point operations are clearly visible from the table. The high frequency obtained for the floating-point unit, which appears to be an anomaly, merely reflects the efficiency of the cores used by the library in the tool. Having identified the cost savings obtained by resorting to fixed-point operations, we need to understand the error introduced in the process. To analyze this error, multiple designs of the 1-D filter were developed in Handel-C with different bit widths for the fixed-point implementation. Although simulation was used to generate the outputs, the designs were made to closely represent hardware models, such that minimal changes could translate them into hardware configurations. The simulation outputs were compared with Matlab's double-precision floating-point results. The mean square error (MSE) between the filter estimates and the actual expected outputs was used as the metric for comparison, as shown in Table 3.2.


Table 3.2. Quantization error introduced by fixed-point arithmetic (averages are over 100 data points). The first column gives the number of bits before and after the binary point.

    Fixed-point   MSE in Matlab        MSE in Handel-C         Mean square error   Max. abs. error
    precision     (double-precision)   (% error from Matlab)   from Matlab         from Matlab
    8.8           0.6490               0.6294 (3%)             9.515e-5            0.0631
    8.5           0.6490               2.7179 (300%)           2.7995              3.4405
    8.3           0.6490               3.6348 (460%)           3.1135              3.92

Although the maximum and average tolerable levels of error depend largely on the nature of the application, it is clearly evident from Table 3.2 that eight bits of precision after the binary point yields reasonable performance, with less than 0.5% maximum absolute error. As is also visible from the table, the accuracy decreases rapidly as the bit width is reduced. The number of bits before the binary point was kept constant, as it is dictated by the dynamic range of the data set and not by the nature of the processing. These observations led to the selection of eight bits of precision after the binary point for the hardware architecture of the multiscale filter. Figure 3.3 depicts the same information by plotting the difference between the results obtained from the Matlab floating-point and Handel-C fixed-point implementations (with 8-bit precision) for each data point of the time series. It is worth mentioning that the quantization-error values presented in the graphs and tables are specific to the Kalman filter and depend on the type of processing involved (multiplication and division have the worst effects on quantization error). The recursive nature of the processing over the scales in a multiscale filter may accumulate the error at each scale and lead to larger error values than those presented in this section.
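A minimal sketch of the selected 8.8 fixed-point representation is shown below; it is written in C for illustration and is not the DK library implementation.

    /* Q8.8 format: 8 bits before and 8 bits after the binary point,
       stored in a 16-bit two's-complement integer (one unit = 2^-8).    */
    #include <math.h>
    #include <stdint.h>

    typedef int16_t q8_8;

    static q8_8   to_q8_8(double v)  { return (q8_8)lrint(v * 256.0); }
    static double from_q8_8(q8_8 v)  { return v / 256.0; }

    /* Multiply with a 32-bit intermediate, then drop 8 fractional bits;
       the truncating shift loses up to one LSB per multiply, which is
       how the quantization error of Table 3.2 accumulates.              */
    static q8_8 mul_q8_8(q8_8 a, q8_8 b)
    {
        return (q8_8)(((int32_t)a * (int32_t)b) >> 8);
    }

Converting a double to Q8.8 and back, e.g. from_q8_8(to_q8_8(0.6490)), reproduces the value only to within 2^-9, the rounding granularity that underlies the 8.8 row of Table 3.2.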


Figure 3.3. Differences (absolute error per iteration) between Matlab-computed estimates and Handel-C fixed-point estimates using eight bits of precision after the binary point, for 100 data points.

3.3 One-dimensional Time-tracking Kalman Filter Design

A 1-D time-tracking filter (represented by the equations in Figure 3.2) was designed for the RC1000 FPGA board with a VirtexE 2000 chip (more details in Chapter 2). The design could not be pipelined because of the worst-case data dependence from the output of the current iteration to the input of the next iteration. Performing all of the computation in hardware proved exorbitantly bulky and occupied about 12% of the slices for one instantiation of the filter. In addition, the long combinational delay introduced by the division operator led to a considerably low design frequency of about 3.9 MHz. To make these designs of any practical importance, we need to overcome these problems or find a way to mitigate their effect. Revisiting the algorithm and taking a closer look at the equations involved yields some interesting information.


Figure 3.4. Processing path for (a) calculation of the parameters and (b) calculation of the estimates.

The block diagram of the computation steps reveals the existence of two independent processing chains. This fact has important implications for future hardware designs, as it allows a reduction in the actual amount of computation that must be performed in hardware. The calculation of the estimate error covariance (P_pnew) and the filter gain (K) is completely independent of the observation input and the generated estimates, and hence can be done offline, even prior to data collection. The pre-computation of these filter parameters has multiple advantages. It reduces the logic-area requirements from about 12% to under 3% and eliminates the division operator from the critical path, increasing the maximum design frequency to about 90 MHz. In addition, it allows the possibility of changing the filter parameters by merely replacing the set of pre-computed parameters, introducing a sort of virtual reconfiguration capability in which the behavior of the filter changes without reconfiguring the FPGA. These benefits, however, come at the cost of extra memory for storing the filter parameters, the trade-off that was mentioned earlier.
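The effect of this pre-computation on the online datapath can be sketched in C as follows (scalar form): only the estimate recurrence of Figure 3.4(b) remains, the gain sequence K[k] is streamed from memory after being computed offline via Figure 3.4(a), and no division appears in the loop.

    /* Online estimate path with pre-computed gains (illustrative).        */
    void filter_online(const double *z, const double *K, int n,
                       double Phi, double H, double *x_out)
    {
        double x = 0.0;                      /* initial prior estimate     */
        for (int k = 0; k < n; k++) {
            double x_pred = Phi * x;                  /* time update       */
            x = x_pred + K[k] * (z[k] - H * x_pred);  /* measurement update */
            x_out[k] = x;
        }
    }

Swapping in a different pre-computed gain table changes the filter's behavior without touching the hardware configuration, which is the virtual reconfiguration mentioned above.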


The reduction in area is an important consideration for the 2-D filter design, as it allows a larger number of pixels to be processed in parallel. With these modifications, a 1-D filter was developed for the RC1000 board, and performance experiments were conducted with data sets containing 600 data points. The latency incurred in transferring all the values, one byte at a time, over the PCI bus hampers the performance. To overcome this limitation, DMA is used to transfer all the data values to the on-board SRAM. The performance results for both the DMA and non-DMA cases are shown in Table 3.3 below and compared against Matlab results. The timing results in Matlab varied over multiple trials (which is attributed to the way Matlab works internally and handles memory). For this reason, further experiments were performed using a C-based solution for obtaining software execution times.

Table 3.3. Performance results tabulated over 600 data points (slices occupied: 2%)

    Code version                       Execution time   MSE
    Matlab (running on P4 @ 3 GHz)     0-16 ms          0.6563
    FPGA (non-DMA)                     49.5 ms          0.6526
    FPGA (DMA)                         1 ms             0.6526

    Components of the FPGA execution time (DMA case):
    DMA write of 600 data values: 64 us
    DMA read of 600 values: 44 us
    Computation time for 600 values: 285 us

3.4 Full System Architecture

Figure 3.5 depicts a high-level block diagram of the system architecture. To provide more functionality and enhance usability, a display mechanism may be included in the final system, depending on the needs of the application. Since this work focuses on just one part of the entire system (data fusion/estimation), the inputs are not obtained directly from the sensors.


Instead, they go through several stages of processing before being converted into a form compatible with the system shown in Figure 3.5. Most of these processing stages are currently performed at ground-based stations, but some of the processing is also performed on board the aircraft. Similarly, the output could be used directly to provide visual feedback, or it may need to pass through further post-processing stages before being in a directly usable format.

Figure 3.5. High-level system block diagram (raw image input, pre-processing, filter logic with parameter and input memory banks on the FPGA, fixed-point to real-number conversion, and a video controller feeding filtered image data to a display unit; filter parameters are staged from off-board memory).

As shown in the diagram, the system depends heavily on on-board memory, which should preferably be organized in multiple independently addressable banks. The filter parameters corresponding to a chosen filter model are stored in one or more memory banks. These filter parameters define the behavior of the algorithm, and hence multiple sets of such parameters can be stored in off-board memory and transferred into on-board memory as needed.


The input image coming from one of the pre-processing blocks is distributed by a memory controller into one of multiple memory banks reserved for input. Providing multiple input banks allows the input transfer time to be overlapped with the computation time of the previous set of data. Spreading the filter parameters across multiple banks also reduces the computation time by allowing multiple parameters to be read in parallel. The test system on which all the experiments are performed consists of a PCI-based card residing in a conventional Linux server. Hence it does not exactly mirror the system just outlined, and it involves some additional issues such as limited memory and input/output transfer latencies over the PCI bus. The goals of this work include identifying limitations in the current systems that hamper performance and, on the basis of the experimental results, speculating on additional features that could enhance the efficiency of the system.
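The overlap of input transfer with computation that the multiple input banks enable is a classic ping-pong (double-buffering) scheme. The C sketch below traces only the ordering of operations on the host side; dma_write and fpga_process are hypothetical stand-ins for the board services, not the RC1000 API, and in a real system the two would proceed concurrently rather than as sequential calls.

    #include <stdio.h>

    #define TILE 4   /* values per bank fill; kept tiny so the trace is short */

    static void dma_write(int bank, int off) {
        printf("  fill bank %d from offset %d\n", bank, off);
    }
    static void fpga_process(int bank) {
        printf("  compute on bank %d\n", bank);
    }

    int main(void)
    {
        int total = 16, bank = 0;
        dma_write(bank, 0);                       /* prime the first bank     */
        for (int off = 0; off < total; off += TILE) {
            fpga_process(bank);                   /* compute on one bank...   */
            if (off + TILE < total)               /* ...while the other fills */
                dma_write(bank ^ 1, off + TILE);
            bank ^= 1;                            /* swap roles               */
        }
        return 0;
    }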


CHAPTER 4
MULTISCALE FILTER DESIGN AND RESULTS

This chapter presents the different hardware designs developed for the multiscale Kalman filter. Each design explores an opportunity to further improve performance and builds on the shortcomings discovered in its predecessor. Results of the timing experiments are presented and analyzed to assess performance bottlenecks.

4.1 Experimental Setup

Before presenting the hardware designs developed for the RC1000 target platform, a brief discussion of the input data set is in order, to set up the experiments for results and analysis. The test data were generated by simulation in Matlab to emulate a Digital Elevation Map (DEM) obtained over a topographical region.

Figure 4.1. Simulated observations corresponding to the 256 × 256 resolution scale.


The highest-resolution observation was chosen to have a support of 256 × 256 pixels and represents a data set corresponding to one generated by an ALSM sensor. This image resolution gives rise to nine scales in the quad-tree structure. Another set of observations was generated at a coarser scale, with a support of 128 × 128, representing data generated from INSAR. Figure 4.1 depicts the finer-scale data set, which can be seen to have four different levels of roughness in different regions of the image. This structure was chosen to incorporate data corresponding to different kinds of terrain, such as plain grasslands (smooth) and sparse forests (rough), into a single data set. The simulated observations were created by following the fractional Brownian model and populating the nodes of the quad tree starting from the root node, using the equations from Section 2.2.1. As in the 1-D case, pre-computation is employed here as well to reduce the resource requirements. The filter parameters needed for the online computation of the estimates, namely K(s), C(s), F(s), P(s|s), and P(s|s-), are therefore also generated using the equations in Section 2.2.1. The parameters, along with the observation set, required approximately 870 KB of memory when represented in the 8.8 fixed-point format. The small footprint of the data fed to the FPGA provides several opportunities for exploiting the on-board memory in different ways, as demonstrated later in this chapter; some designs may require additional storage because of details of their architecture. It is also worth noting that although the structure of the computations in the 2-D filter is similar to the 1-D case, it has extra operations due to the merge step that combines the child nodes into a parent pixel.
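The nine-scale count and the quoted footprint follow from simple quad-tree arithmetic, as the short C check below shows; the five-parameters-per-node assumption matches the list above, but the exact per-node layout is not spelled out in the text.

    #include <stdio.h>

    int main(void)
    {
        long nodes = 0;
        int scales = 0;
        /* supports of 2^k x 2^k from 256 x 256 down to the 1 x 1 root */
        for (int side = 256; side >= 1; side /= 2) {
            nodes += (long)side * side;
            scales++;
        }
        /* assumption: five parameters per node, each a 16-bit 8.8 value */
        long bytes = nodes * 5 * 2;
        printf("%d scales, %ld nodes, about %.0f KB of parameters\n",
               scales, nodes, bytes / 1000.0);  /* 9 scales, 87381 nodes, ~874 KB */
        return 0;
    }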


4.2 Design Architecture #1

The block diagram representing the hardware configuration (for the RC1000 card) of this design is shown in Figure 4.2. Pre-processed filter parameters are transferred via DMA into one of the SRAM banks. Although this transfer goes over the PCI bus and incurs a substantial latency in the experimental system, the overhead is reported separately in the results because an actual target system might have a faster transfer mechanism. Moreover, the latency is a one-time cost and can be avoided by preloading the parameters of the desired filter model; the input latency is therefore only visible in the virtual-reconfiguration case, where a new set of parameters must be transferred into the on-board memory.

Figure 4.2. Block diagram of the architecture of design #1.

This design exploits the data parallelism available in the algorithm by pipelining the processing over multiple pixels. The pipeline performs the computation for four pixels simultaneously, which lends itself well to the quad-tree structure of the application. The major performance hurdle is encountered at the memory port: since each stage of the pipeline requires some parameters to be read from memory, a resource conflict exists.


Hence, a stall is required after every stage of computation, which largely breaks down the pipeline and its associated advantages. Table 4.1 compares the processing time for the algorithm on a 2.4 GHz Xeon processor with the time on the RC1000 board. The FPGA on the board is clocked at 30 MHz; better performance could be obtained by raising the design clock, which can be achieved with faster, more advanced chips and by further optimizing the design.

Table 4.1. Performance comparison of a processor with an FPGA configured with design #1.

Execution time on | Single scale (256 × 256) | Multiple scales (down to 4 × 4)
RC1000 | 9.85 ms | 13.49 ms
2.4 GHz Xeon processor | 15.14 ms | 20.5 ms

Resource utilization: 3286 of 19200 slices (17%); memory: approx. 850 KB (filter parameters) plus approx. 170 KB (outputs). DMA latency for sending the data over the PCI bus: one scale (650 KB): 3.1 ms; all scales (approx. 870 KB): 3.9 ms.

The times are compared for both single-scale and multi-scale processing. The computation is terminated when the image support reduces to just 4 × 4, because beyond that point the overheads dominate the actual computation time. The resources occupied by this configuration are listed with the table. With just 17% of the logic used for processing four pixels, enough area remains to increase the amount of concurrent computation by adding more pixels to the processing chain. The values in the table show that the FPGA-based filter performs about 1.5 times faster than the conventional processor. In the embedded-system arena, absolute performance matters less than performance per unit cost, which is considered the better metric for comparison. Similarly, the raw performance improvement may not amount to an order-of-magnitude speedup, but it comes at about one hundredth of the running cost of a competing system.


The lessons learned from this design point to memory bandwidth as a crucial factor in obtaining better performance for this application: the resource hazards need to be eliminated to take full advantage of the pipelined structure. The preceding discussion covered the Kalman filtering part of the application, which populates the nodes of the tree going upwards. The application also involves a smoothing step that generates estimates while traversing the tree from top to bottom. Recursive application of the filtering pipeline generates sets of estimates at multiple scales, which are then used by the computations in this second step. The structure of the calculations is similar to the filtering step, but some intermediate data values must be saved in addition to the outputs. This further increases the memory bandwidth demands on the system; however, since these calculations begin only after completion of the upward step, the two do not conflict. Figure 4.3 shows the modifications required to incorporate these effects. The additional data are stored in the otherwise empty memory banks, which allows them to be read in parallel by the smoothing pipeline without any stall cycles.

Figure 4.3. Block diagram for the architecture of both the Kalman filter and smoother.


Another set of parameters (also described by the equations in Section 2.2.1) is required for the smoothing operations and is stored in the same memory bank as the other parameters. The design shown was spread across two chips by implementing the two operational pipelines as independent designs. This was achieved by creating two separate FPGA configuration files and reconfiguring the FPGA on the RC1000 board with the second file after completion of the upward sweep, in effect emulating a multi-FPGA system. This technique allows for higher computational potential and also opens the possibility of pipelining the upward and downward sweeps over multiple data sets at a higher conceptual level. For this part of the experiment, the observations were limited to the finest scale (representing the LIDAR data), which implies that no additional statistical information is incorporated in the filtering step except at the finest scale. Hence, the smoothed estimates could be obtained by smoothing from one scale coarser down to the observation scale. Having observations at multiple scales would carry more than one advantage in several cases: it not only increases the amount of available computation to be exploited but also helps mitigate the precision effects that tend to accumulate over the scales.

Figure 4.4. Error statistics of the output obtained from the fixed-point implementation. (Maximum absolute error from Matlab: 0.4249; MSE: 0.0119; maximum error percentage: below 2% in most cases.)


The entire application, consisting of both the filtering and the smoothing step, requires a total of about 23% of the slices (17% for filtering plus 6% for smoothing) while processing four pixels simultaneously. Figure 4.4 compares the outputs obtained from the hardware version with the Matlab double-precision, floating-point results. Because of the similarity in structure of the two processing steps, and the extra memory demands imposed by including both, the follow-on designs focus on just the filtering part of the application. A simple way to improve on the previous filter design is to extend the architecture to process more pixels in parallel, in effect filling up the unused area on the chip. What hinders the performance gain is the set of input parameters required to process the additional pixels: increasing the number of concurrent pixels increases the memory I/O demands, so extra stall cycles are needed to read the input values. These stall cycles, a major overhead, become a dominant part of the computation time and quickly saturate the performance of the entire system. The issue can be understood by taking a closer look at the operational pipeline in the Handel-C code. The main loop of the application takes 17 cycles per pass for 4 pixels, of which just 7 cycles perform actual computation. The clock cycles (CCs) required are therefore:

4-pixel pipeline: 17 × 128 = (7 + 10) × (256/2)
8-pixel pipeline: 27 × 64 = (7 + 20) × (256/4)
16-pixel pipeline: 47 × 32 = (7 + 40) × (256/8)

Hence a 4n-fold replication of the pipeline requires f(n) = (7 + 10n) × 256/(2n) cycles, and the slope of the curve is the derivative f'(n) = -896/n^2, which flattens toward zero as n grows.
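Evaluating this cycle model numerically makes the saturation plain; a small C sketch using the constants quoted above (the 32-pixel point is an extrapolation of the same formula):

    #include <stdio.h>

    int main(void)
    {
        /* Design #1 model: 7 compute cycles plus 10 stall cycles per group */
        /* of four pixels, and 256/(2n) passes for a 4n-pixel pipeline.     */
        for (int n = 1; n <= 8; n *= 2) {
            int cycles = (7 + 10 * n) * (256 / (2 * n));
            printf("%2d pixels: %4d cycles\n", 4 * n, cycles);
        }
        return 0;   /* prints 2176, 1728, 1504, 1392: the gain flattens out */
    }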


The same information is conveyed graphically in Figure 4.5. As expected, the resource requirements increase linearly with the pixel count, but the performance does not. These limitations need to be circumvented in the next design through a better pipeline with fewer stall cycles, in effect exposing more parallelism and hiding the input latency. Two memory banks were left unused in the current design; they could be employed to increase the memory bandwidth.

Figure 4.5. Performance improvement and change in resource requirements with increasing concurrent computation. (FPGA resource utilization: 4 pixels, 17% of slices; 8 pixels, 31%; 16 pixels, 64%.)

4.3 Design Architecture #2

This design provides the FPGA-based engine with a higher memory bandwidth by using all four on-board memory banks (32 × 4 = 128 bits per clock cycle) in an attempt to eliminate the resource hazards of the previous design. The filter parameters are now spread evenly across all the banks, so a simultaneous read of all four memory ports provides all the inputs needed to process a single pixel. The available data parallelism is again exploited by pipelining the processing of independent pixels.


Without stalls, the pipeline produces one estimate every clock cycle. The constraint in this architecture is that even all four memory banks together can supply the inputs for only one pixel calculation per cycle, so simultaneous operation on multiple pixels again requires stall cycles to be introduced.

Figure 4.6. Block diagram of the architecture of design #2.

Timing experiments were performed with the same data set and compared against the processing time of C code running on the Xeon processor. The results are presented in Table 4.2 along with the resource consumption.

Table 4.2. Comparison of the execution times on the FPGA with the Xeon processor, and resource requirements of the hardware configuration for design #2.

Execution time on | Single scale (256 × 256) | Multiple scales (down to 4 × 4)
RC1000 | 7.39 ms | 9.86 ms
2.4 GHz Xeon processor | 15.14 ms | 20.5 ms
Speedup | 2.04 | 2.07

Resource utilization: 963 of 19200 slices (5%); memory: approx. 1 MB (filter parameters). DMA latency for sending the data over the PCI bus: one scale (650 KB): 10.2 ms; all scales (approx. 870 KB): 11.4 ms. The amount of data that must be transferred is slightly larger because of details of the architecture.
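One way to picture the striping is sketched in C below. The exact packing on the RC1000 is not documented here, so this layout, two 16-bit 8.8 values per 32-bit bank word with the upper halves of the last two banks zero-padded, is an assumption consistent with the padding noted just below.

    #include <stdint.h>
    #include <stdio.h>

    /* Illustrative per-pixel inputs: five 8.8 parameters plus one        */
    /* observation; the names follow the parameter list in Section 4.1.   */
    typedef struct { uint16_t K, C, F, Pss, Psb, z; } pixel_in;

    static void pack(const pixel_in *p, uint32_t bank[4])
    {
        bank[0] = (uint32_t)p->K | ((uint32_t)p->C   << 16);
        bank[1] = (uint32_t)p->F | ((uint32_t)p->Pss << 16);
        bank[2] = (uint32_t)p->Psb;      /* upper 16 bits zero-padded     */
        bank[3] = (uint32_t)p->z;        /* upper 16 bits zero-padded     */
    }

    int main(void)
    {
        pixel_in p = {0x0100, 0x0100, 0x0100, 0x0200, 0x0180, 0x0155};
        uint32_t bank[4];
        pack(&p, bank);
        /* one synchronous read of all four ports = 4 x 32 = 128 bits,    */
        /* everything needed to start one pixel down the pipeline         */
        for (int b = 0; b < 4; b++)
            printf("bank %d word: 0x%08X\n", b, (unsigned)bank[b]);
        return 0;
    }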


The values show a modest improvement over the previous design, with a speedup of about two. This improvement is attained with just one pixel in the pipeline, as a result of which the logic area required drops to just 5% of the chip. The memory used by the design increases slightly because some zero padding is required in the higher-order bits of the third and fourth banks, which are otherwise left unused. An attempt was made to further improve performance by introducing more pixels into the same pipeline. However, this requires one stall cycle for every extra pixel read in the input stage of the pipeline. Again a resource hazard exists due to the memory access, which eventually saturates the system performance at some number of concurrently processed pixels. A mathematical analysis similar to that for the previous design proves this point. A closer look at the operational pipeline reveals that each estimate actually takes 3 cycles to compute (the extra cycles stem from a conflict at the fourth memory bank, which serves as both input and output for the filtered estimates). The analysis follows, with the main loop taking 3 cycles per pass:

1-pixel pipeline: 3 × 256
2-pixel pipeline: 4 × 128 = 4 × (256/2)

Hence an (n+1)-pixel pipeline requires f(n) = (3 + n) × 256/(n + 1) cycles. The slope is given by the derivative f'(n) = -512/(n + 1)^2. Since the slope is not constant, the speedup is sub-linear and flattens off quadratically; indeed, as n grows, f(n) approaches 256 cycles, bounding the speedup at three times the single-pixel case.


Figure 4.7 depicts the same information in a graph drawn from the experimental data. The stalls prevent a linear speedup with the number of pixels. These two designs also reiterate the subtle trade-off between memory requirements and logic area: all of the stall cycles could be avoided by reducing the memory demands and computing all of the parameters online, but that would require more resources. This design also emphasized the criticality of memory bandwidth; the next design attempts to emulate a higher-bandwidth system in order to overcome this limitation.

Figure 4.7. Performance improvement and change in resource requirements with increasing number of concurrently processed pixels. (FPGA resource utilization: 1 pixel, 5% of slices; 2 pixels, 11%.)

4.4 Design Architecture #3

This design uses the on-chip block RAM (BRAM) resources to buffer the input, and the pipelines then work out of these buffers. Treating the BRAMs as buffers allows emulation of a more advanced RC system that is rich in memory bandwidth: a number of such buffers can feed multiple non-stalling pipelines and hence increase the concurrent computation. The architecture of this design is depicted in Figure 4.8. The input buffers mean the design incurs some additional latency, because of the cycles required to fill them.


Since the limited on-chip storage is much smaller than the total amount of input data, multiple iterations are required to process the entire data set. The goal of this hardware design is to estimate the performance of a much more advanced RC system, which is done by separating the buffering time from the actual computation time. These individual components of the total time are tabulated in Tables 4.3 and 4.4.

Figure 4.8. Block diagram of the architecture of design #3.

Designs were developed for one, two, and four pipelines, and timing experiments were performed for all of them. The components of the total time were observed by maintaining a timer on the host processor and having the FPGA signal the completion of each processing stage, at which point the timer was read. Since each design operated on a different number of pixels, the times were extrapolated (where needed) to the processing of eight rows of the input image, to provide a common basis for comparison. Each input buffer has a capacity of 1 KB and can therefore hold 512 values in the 8.8 fixed-point format.


Table 4.3. Components of the execution time on the FPGA for processing eight rows of the input image using design #3.

Time (µs) | 1 pipeline | 2 pipelines | 4 pipelines
Input DMA (entire image) | 10254 | 10345 | 10752
Transfer into BRAM | 80 | 78 | 78
Computation | 132 | 66 | 34
Transfer output from BRAM | 56 | 56 | 54
Output DMA (entire 128 × 128 output image) | 614 | 599 | 745

This structure allowed each buffer to hold the data for processing exactly two rows of the high-resolution (256 × 256) input image. The times listed in Table 4.3 as transfers into and out of the BRAM represent the overhead incurred by the buffering; they are almost the same for all the designs on the common basis of comparison (eight input image rows in our experiment). The designs differ in the amount of computation performed concurrently, which is visible in the row marked Computation.

Table 4.4. Components of the total execution time on the FPGA for processing a single scale of input data with different levels of concurrent processing, and the resource utilization.

Time (µs) | 1 pipeline | 2 pipelines | 4 pipelines
Input DMA (entire image) | 10254 | 10345 | 10752
Output DMA (entire 128 × 128 output image) | 614 | 599 | 745
Computation | 4224 | 2112 | 1088
Transfer time to and from BRAMs | 4352 | 4288 | 4224
Total time (including transfers to/from BRAMs) | 8576 | 6400 | 5312
Execution on 2.4 GHz Xeon processor | 15140 | |

FPGA resource utilization: 1 pipeline, 5% of slices and 8% of BRAM; 2 pipelines, 11% and 17%; 4 pipelines, 22% and 35%.
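The single-scale totals in Table 4.4 can be re-derived from the eight-row measurements in Table 4.3, since a 256-row image takes 32 iterations; a quick C check for the one-pipeline column:

    #include <stdio.h>

    int main(void)
    {
        int iters = 256 / 8;              /* 8 rows per iteration           */
        int fill = 80, compute = 132, drain = 56;   /* us, from Table 4.3   */

        int bram_us = (fill + drain) * iters;       /* 4352 us              */
        int comp_us = compute * iters;              /* 4224 us              */
        printf("BRAM transfers %d us, computation %d us, total %d us\n",
               bram_us, comp_us, bram_us + comp_us); /* matches Table 4.4   */
        return 0;
    }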


With a non-stalling pipeline, a linear speedup is achieved by adding more pixels to the processing pipeline. Hence the performance of a system with sufficient memory bandwidth can be estimated by discarding the buffering overhead. These values are noted in Table 4.4, which compares the total execution time for processing a single scale (256 × 256) on the RC1000 against the host processor. The values listed in Table 4.4 give a good appraisal of the performance improvement attainable on a high-memory-bandwidth system. The resource consumption of the different designs is also listed: in addition to similar logic area requirements, this design uses the block RAMs heavily. The design could evidently be extended, within the available resources, to the concurrent processing of about 10 pixels. The speedups obtained over the Xeon processor (neglecting the buffering time) are shown in the graph in Figure 4.9, where the values have been extrapolated for an eight-pixel design. The values are presented both including and excluding the buffering time.

Figure 4.9. Improvement in the performance with increase in concurrent computations.
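The extrapolation behind the eight-pipeline point follows from the linear-speedup assumption; a C sketch against the 15140 µs Xeon baseline of Table 4.4, taking the one-pipeline computation time as the base (the numbers are idealized, so they sit slightly above the measured four-pipeline point):

    #include <stdio.h>

    int main(void)
    {
        double xeon_us = 15140.0;        /* single scale on the Xeon       */
        double base_us = 4224.0;         /* computation, 1 pipeline        */
        for (int p = 1; p <= 8; p *= 2) {
            double t = base_us / p;      /* non-stalling pipelines scale   */
            printf("%d pipelines: %5.0f us, speedup %.1fx\n",
                   p, t, xeon_us / t);
        }
        return 0;   /* 8 pipelines: 528 us, about 28.7x over the Xeon      */
    }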


When the buffering time is included, the results degenerate to the case of design #2, and the performance improvement from increasing the amount of concurrent computation tapers off quickly. When the buffering times are discarded, however, a linear speedup is obtained by adding more computation. The last bar (8 pixels) in the graph shows that more than an order of magnitude improvement is attainable for good designs on systems with an appropriate amount of resources. Figure 4.10 presents the error statistics, comparing the outputs generated by the FPGA for this design with the Matlab-computed estimates. The mean square error has a satisfactorily low value of about 0.01, with the percentage error below 2% in most cases.

Figure 4.10. Error statistics for the outputs obtained after a single scale of filtering. (Maximum absolute error from Matlab: 0.4249; MSE: 0.0119.)

4.5 Performance Projection on Other Systems

This section gauges the performance attainable on some other existing RC platforms. The projections are deliberately simplistic: they are derived from the system demands posed by the application, as identified in the previous experiments, and from the resources available on these platforms. The projection is useful in appraising the true performance advantages that can be obtained from an advanced RC-enabled remote-sensing system.


4.5.1 Nallatech's BenNUEY Motherboard (with BenBLUE-II daughter card)

Figure 4.11 shows a high-level architecture diagram of the board. It carries three Xilinx FPGAs, each a VirtexII 6000 (speed grade -4). These more advanced chips have larger logic resources, more block RAMs, and additional specialized hardware modules such as dedicated multipliers. The VirtexII series also provides much faster interconnect than the VirtexE series. Together these factors account for roughly a 2x to 4x improvement in design clock frequency when moving a typical design from a VirtexE to a VirtexII. The system also supports a higher memory capacity, providing 12 MB of storage with a bandwidth of 192 bits per clock cycle; the almost twofold increase in memory bandwidth should yield a corresponding linear speedup of about 2x.

Figure 4.11. Block diagram of the hardware architecture of Nallatech's BenNUEY board with a BenBLUE-II extension card (a Xilinx Spartan-II PCI FPGA; BenNUEY user FPGA and BenBLUE-II primary and secondary FPGAs, each a Xilinx VirtexII 6000, -4; ZBT SSRAM banks of 2 MB and 4 MB; a 64-bit/66 MHz PCI local bus; and an inter-FPGA communications bus). Courtesy: BenNUEY Reference Guide [34].


Hence, this target system could provide performance approximately 4x to 8x better than the RC1000 board.

4.5.2 Honeywell Reconfigurable Space Computer (HRSC)

Figure 4.12 shows a high-level block diagram of the HRSC board. It comprises four Xilinx FPGAs, two of them Virtex 1000s and the other two from the VirtexII series (VirtexII 2000). The larger amount of resources and the faster chips should allow higher-frequency designs, for an improvement of approximately 2x over the RC1000 designs. The system is also very rich and flexible in its memory resources and in the way the memory is interconnected to the FPGAs. All the memories are dual-ported, which gives more flexibility in input buffering as well as higher bandwidth.

Figure 4.12. Block diagram of the hardware architecture of the HRSC board (two Virtex 1000 processing elements and two VirtexII 2000s, dual-port 256K x 32 memory banks, a Virtex configuration manager with configuration cache, PLX 9054 cPCI and PMC interfaces, and two PMC slots) [29].


A single HRSC board supports approximately 1.75 GB of storage with a bandwidth of 448 bits per clock cycle. This high bandwidth should suit the design well and should yield a performance enhancement of about 4x. The HRSC board, specifically tailored for space-based computing, was designed with similar considerations in mind and should therefore deliver a system-level performance improvement on the order of 8x over the RC1000 card. This chapter presented a detailed discussion of the various experiments conducted and the results obtained. The next chapter summarizes the document, drawing conclusions and outlining directions in which the work could be pursued further.


CHAPTER 5
CONCLUSIONS AND FUTURE WORK

This chapter summarizes the findings of this research and draws conclusions based on the results and observations presented in the previous chapters. It also discusses some open issues and tasks that could be pursued to build a complete system and extend the analysis. Most current remote-sensing systems rely on conventional processors to perform computation offline after data acquisition and lack any real-time processing capability. Future systems will need on-board high-performance computing facilities in order to perform increasing levels of computation, and in such systems performance per unit cost is a much more important metric than raw performance. This work has demonstrated a strong case for the relevance of reconfigurable computing to fast on-board computation. Several hardware configurations were developed, extending the prior work on 1-D Kalman filtering to the 2-D case. The results have shown over 25 times speedup in computation time in ideal scenarios compared to conventional GPP-based solutions, a figure bound to increase further with more advanced and faster FPGAs; more than two orders of magnitude improvement can be expected on such advanced systems. Systems capable of yielding such high performance, coupled with attributes such as low power, cost, and space requirements, are a strong match for remote-sensing systems and can make airborne, real-time processing possible. A great deal of prior work exists on mapping image-processing algorithms onto FPGA-based systems, but its application to the remote-sensing world is still in its infancy.


The results obtained in this work show promise and call for further research and investigation. This work also highlighted some key issues in the design of remote-sensing applications for RC systems. First, floating-point operations are wasteful of resources when implemented in hardware, whereas fixed-point operations are far more economical; algorithms therefore need to be redesigned to mitigate quantization effects and tested to ensure that they provide the desired level of accuracy. Second, remote-sensing applications place high demands on the memory bandwidth of the system. To be successful, target systems must therefore support high memory bandwidth in addition to providing large storage capacity. Multi-port memories provide added advantages, allowing reads and writes to proceed in parallel and offering an effective mechanism for hiding input latencies. Third, the ability to process data in parallel is a key attribute for meeting real-time requirements in a remote-sensing scenario, where multiple sensors produce large amounts of data at a high rate. Different designs that exploit a high degree of parallelism and adapt to the available resources were developed and analyzed as part of this work. The designs presented here use pre-computed parameters that describe the behavior of the filter. Hence they offer the ability of virtual reconfiguration, a novel concept developed and introduced through the course of this research, in which the set of filter parameters in memory can be changed to adapt the filter's behavior. This functionality has important implications in both the remote-sensing and RC arenas. Most remote-sensing systems need to adapt their processing methods in response to changing data statistics, and although current RC systems can support dynamic reconfiguration of FPGAs, the configuration times are very slow; virtual reconfiguration provides a faster mechanism for achieving the same effect.


To keep the work tractable, the scope of the project was limited: only one part of the entire process was treated in detail, leaving several other modules open. Most importantly, the raw sensor inputs must go through several stages of processing before they are converted into a form compatible with the designs developed in this work; the design and development of this pre-processing module needs to be addressed before an actual system can be built. Several lessons were learned about the design of the target hardware system and about how different resources affect performance, and designing such a hardware system based on the results and conclusions of this work is a challenging task that demands attention. There has also been recent interest in developing stand-alone, FPGA-based systems that operate without a host processor; such systems offer a very cost-effective solution for embedded environments, and adapting them to the needs of remote-sensing applications could have a substantial effect on future research in this direction. This work also projected the performance of some advanced systems, and these projections need to be verified with experimental results. Because the focus of this work was to analyze the feasibility of deploying RC systems in the remote-sensing arena and to develop designs supporting these claims, the designs developed here are not optimal. With more time and optimization the designs could yield better performance, for example by increasing the design clock frequencies or by providing direct DMA into the block RAMs to hide the input and output latencies. As pointed out earlier, the use of fixed-point arithmetic is very important for economical hardware configurations, but it leads to quantization errors that can be large enough to defeat the purpose of acceleration in some cases.


There are several ways in which these quantization effects can be mitigated by changes to the algorithm, such as normalizing all the parameters or providing observations at multiple scales to let the filter overcome the accumulation of errors. Exploring these options is essential for the success of the application.


LIST OF REFERENCES

1. Integrated Sensors Inc., SAR/GMTI Range Compression Implementation in FPGAs, Application Note, Utica, NY, 2005.

2. V. R. Daggu and M. Venkatesan, Design and Implementation of an Efficient Reconfigurable Architecture for Image Processing Algorithms using Handel-C, M.S. Thesis, Dept. of Electrical and Computer Engineering, University of Nevada, Las Vegas.

3. P. C. Arribas and F. M. Macia, FPGA Board for Real Time Vision Development System, Devices, Circuits and Systems 2002. Proceedings of the Fourth IEEE International Caracas Conference on 17-19 Apr. 2002. Pages: T021-1 to T021-6.

4. P. McCurry, F. Morgan and L. Kilmartin, Xilinx FPGA Implementation of an Image Classifier for Object Detection Applications, Image Processing 2001. Proceedings of 2001 International Conference on 7-10 Oct. 2001. Vol. 3, Pages: 346-349.

5. I. S. Uzun, A. Amira and A. Bouridane, FPGA Implementations of Fast Fourier Transforms for Real-time Signal and Image Processing, Field-Programmable Technology (FPT) 2003. Proceedings of 2003 IEEE International Conference on 15-17 Dec. 2003. Pages: 102-109.

6. B. A. Draper, J. R. Beveridge, A. P. W. Böhm, C. Ross and M. Chawathe, Accelerated Image Processing on FPGAs, Image Processing 2003. IEEE Transactions on Dec. 2003. Vol. 12, Issue 12, Pages: 1543-1551.

7. K. Benkrid, D. Crookes, A. Bouridane, P. Con and K. Alotaibi, A High Level Software Environment for FPGA Based Image Processing, Image Processing And Its Applications 1999. Seventh International Conference on 13-15 Jul. 1999. Vol. 1, Pages: 112-116.

8. Celoxica Ltd., Handel-C Language Reference Manual, 2004. http://www.celoxica.com/techlib/files/CEL-W0410251JJ4-60.pdf. Last accessed: Aug. 2005.

9. N. Shirazi, P. M. Athanas and A. L. Abbott, Implementation of a 2-D Fast Fourier Transform on an FPGA-Based Custom Computing Machine, Field-Programmable Logic and Applications 1995. Proceedings of the Fifth International Workshop on Sept. 1995. Vol. 975, Pages: 282-292.


10. R. Hartenstein, A Decade of Reconfigurable Computing: A Visionary Retrospective, Design, Automation and Test in Europe 2001. Proceedings of conference and exhibition on 13-16 Mar. 2001. Pages: 642-649.

11. K. Compton and S. Hauck, Reconfigurable Computing: A Survey of Systems and Software, ACM Computing Surveys 2002. Vol. 34(2), Pages: 171-210.

12. S. Gould, B. Worth, K. Clinton and E. Millham, An SRAM-Based FPGA Architecture, Custom Integrated Circuits Conference 1996. Proceedings of IEEE Conference on 5-8 May 1996. Pages: 243-246.

13. Xilinx Inc., http://www.xilinx.com.

14. V. Aggarwal, I. Troxel and A. George, Design and Analysis of Parallel N-Queens on Reconfigurable Hardware with Handel-C and MPI, Military and Aerospace Programmable Logic Devices (MAPLD) 2004. International Conference on 8-10 Sept. 2004.

15. C. Conger, I. Troxel, D. Espinosa, V. Aggarwal and A. George, NARC: Network-Attached Reconfigurable Computing for High-performance, Network-based Applications, Military and Aerospace Programmable Logic Devices (MAPLD) 2005. International Conference on 8-10 Sept. 2005. (to appear).

16. Celoxica Ltd., RC1000 Hardware Reference Manual, version 2.3, 2001. Document Number: RM-1120-0.

17. P. W. Fieguth, W. C. Karl, A. S. Willsky and C. Wunsch, Multiresolution Optimal Interpolation and Statistical Analysis of TOPEX/POSEIDON Satellite Altimetry, Geoscience and Remote Sensing 1995. IEEE Transactions on Mar. 1995. Vol. 33, Issue 2, Pages: 280-292.

18. C. R. Lee and Z. Salcic, A Fully-hardware-type Maximum-parallel Architecture for Kalman Tracking Filter in FPGAs, Information, Communications and Signal Processing (ICICS) 1997. Proceedings of 1997 International Conference on 9-12 Sept. 1997. Vol. 2, Pages: 1243-1247.

19. L. P. Maguire and G. W. Irwin, Transputer Implementation of Kalman Filters, Control Theory and Applications 1991. IEE Proceedings D on Jul. 1991. Vol. 138, Issue 4, Pages: 355-362.

20. J. M. Jover and T. Kailath, A Parallel Architecture for Kalman Filter Measurement Update and Parameter Estimation, Automatica (Journal of IFAC) 1986. Vol. 22, Issue 1, Pages: 43-58. Tarrytown, NY, USA.

21. R. S. Baheti, D. R. O'Hallaron and H. R. Itzkowitz, Mapping Extended Kalman Filters onto Linear Arrays, Automatic Control 1990. IEEE Transactions on Dec. 1990. Vol. 35, Issue 12, Pages: 1310-1319.


22. I. D'Antone, L. Fortuna, G. Muscato and G. Nunnari, Arithmetic Constraints in Kalman Filter Implementation by using IMS A100 Devices, Implementation Problems in Digital Control 1989. IEE Colloquium on 9 May 1989. Pages: 8/1-8/7.

23. S. Chappell, A. Macarthur, D. Preston, D. Olmstead and B. Flint, Exploiting FPGAs for Automotive Sensor Fusion, Application note, May 2004. http://www.celoxica.com/techlib/files/CEL-W04061612DV-296.pdf. Last accessed: Aug. 2005.

24. B. Garbergs and B. Sohlberg, Implementation of a State Space Controller in an FPGA, Electrotechnical Conference MELECON 1998. Ninth Mediterranean Conference on 18-20 May 1998. Vol. 1, Pages: 566-569.

25. R. D. Turney, A. M. Reza and J. G. R. Delva, FPGA Implementation of Adaptive Temporal Kalman Filter for Real Time Video Filtering, Acoustics, Speech, and Signal Processing (ICASSP) 1999. IEEE International Conference on 15-19 Mar. 1999. Vol. 4, Pages: 2231-2234.

26. L. Scharf and S. Sigurdsson, Fixed-Point Implementation of Fast Kalman Predictors, Automatic Control 1984. IEEE Transactions on Sept. 1984. Vol. 29, Issue 9, Pages: 850-852.

27. A. S. Dawood, J. A. Williams and S. J. Visser, On-board Satellite Image Compression Using Reconfigurable FPGAs, Field-Programmable Technology (FPT) 2002. Proceedings of IEEE International Conference on 16-18 Dec. 2002. Pages: 306-310.

28. D. V. Buren, P. Murray and T. Langley, A Reconfigurable Computing Board for High Performance Processing in Space, Aerospace Conference 2004. Proceedings of 2004 IEEE Conference on 6-13 Mar. 2004. Vol. 4, Pages: 2316-2326.

29. J. Ramos and I. A. Troxel, A Case Study in HW/SW Codesign and Project Risk Management: The Honeywell Reconfigurable Space Computer (HRSC), Military and Aerospace Programmable Logic Devices (MAPLD) 2004. International Conference on 8-10 Sept. 2004.

30. R. Sivilotti, Y. Cho, Wen-King Su, D. Cohen and B. Bray, Scalable Network Based FPGA Accelerators for an Automatic Target Recognition Application, FPGAs for Custom Computing Machines (FCCM) 1998. Proceedings of IEEE Symposium on 15-17 Apr. 1998. Pages: 282-283.

31. P. Graham and B. Nelson, Frequency-Domain Sonar Processing in FPGAs and DSPs, FPGAs for Custom Computing Machines (FCCM) 1998. Proceedings of IEEE Symposium on 15-17 Apr. 1998. Pages: 306-307.

32. T. Hamamoto, S. Nagao and K. Aizawa, Real-Time Objects Tracking by Using Smart Image Sensors and FPGA, Image Processing 2002. Proceedings of International Conference on 24-28 June 2002. Vol. 3, Pages: III-441 to III-444.


33. A. Utgikar and G. Seetharaman, FPGA Implementable Architecture for Geometric Global Positioning, Field-Programmable Technology (FPT) 2003. Proceedings of IEEE International Conference on 15-17 Dec. 2003. Pages: 451-455.

34. Nallatech Inc., BenNUEY Reference Guide, Issue 10, 2004. Document Number: NT107-0123.


BIOGRAPHICAL SKETCH

Vikas Aggarwal received a Bachelor of Science degree in electronics and communication engineering from the department of ECE at Guru Gobind Singh Indraprastha University, India, in August of 2003. He moved to the United States to pursue graduate studies in the department of Electrical and Computer Engineering at the University of Florida. Vikas has been a graduate research assistant under the direction of Dr. Clint Slatton in the Adaptive Signal Processing Lab and under Dr. Alan George in the High-Performance Computing and Simulation Lab. As a graduate assistant he has worked on numerous projects in two relatively different fields: reconfigurable computing, and adaptive signal processing techniques as applied to remote-sensing applications.


0c454bc0bb3e396ade3c887c28006e76
e90485410b504c6da9eb3f948cd9b2041d28c692
F20101119_AACTHP aggarwal_v_Page_30.tif
8c50db902032068fe50941c3af505b7b
722c79865bf566d3bfeab36dee9556a14f18e713
1778 F20101119_AACTRK aggarwal_v_Page_16.txt
e4e1cf9412e0b18c47477dea504d51b8
9b3b1ff279914c6757b3306c70287b47a0384676
36922 F20101119_AACTWI aggarwal_v_Page_55.pro
9e432e655512f379f9d191188e4927c6
7a4e9bd15300fa8bd6083abcbba3b3a75589bf9b
F20101119_AACTMN aggarwal_v_Page_02.tif
25f6281be2362eee9595977deb886cb6
e8c13c78992de16ccebd6883502e2d56a167e330
51394 F20101119_AACTHQ aggarwal_v_Page_63.pro
0528838a82a399c762b1b04a303fa5f9
c3663169a48eba04cbf157cbeca5f4c8b825684e
63380 F20101119_AACTRL aggarwal_v_Page_67.pro
34c6aea600a1dd2d3eaebb10afc4a27c
9ee431db0fb84122e33160d5f699dd8c3eb5128e
35178 F20101119_AACTWJ aggarwal_v_Page_57.pro
f64cf81c410e521afd930a6bac5a90e3
d08fc836d8dfe9555b1615df58d488bc55659b3a
F20101119_AACTMO aggarwal_v_Page_15.tif
1dc1e0d9188d145e19d4019c8d38da06
b42011aca2c33c2d4edcb7c598cde959d0450cb2
1857 F20101119_AACTHR aggarwal_v_Page_50.txt
b0f8c2ef9673e0d8db8ee7dd6a714588
ac4eff12df3cc8fc7ca1a75f8d2afc77b123d8dc
37846 F20101119_AACTWK aggarwal_v_Page_58.pro
b6e09d995506b8149732314f55d8526b
a83721852386baf4c7112fb5d10bf7ff9ae9eb1a
1437 F20101119_AACTMP aggarwal_v_Page_48.txt
2f04298001cfee5c3a071ea6c4279d0f
be884f711b1dd685b698202f270a5b20986b2ab7
949086 F20101119_AACTHS aggarwal_v_Page_46.jp2
45a345583f7e3e15eb170da92c2307a0
5992452370a8c776ba74c0341c616816ee0c110c
6202 F20101119_AACTRM aggarwal_v_Page_58thm.jpg
4ec67889090796f78b5ff344aeea4241
99e7ec64f26a03a9394db8bfce8831ebc1320ccd
49613 F20101119_AACTWL aggarwal_v_Page_64.pro
92b58efe96ca64a2a7ea23c603a78d92
0fa9f2316840d5e36f4fe61f6afa8df9a1bf642f
45275 F20101119_AACTMQ aggarwal_v_Page_34.pro
a2d567af27d20e8267275dd21a5b016a
9b319e480b057059090357a0c86ae062c19d71eb
3966 F20101119_AACTHT aggarwal_v_Page_70thm.jpg
7f27ce6972ffed9d7cfbfee85717ad95
c427f88c815a65a33caaac50c554de91654eda40
F20101119_AACTRN aggarwal_v_Page_39.tif
e901d1199930c3eb567ed961e0cd149a
ca9c6cfc5a98d2f86a0a4691aa212dcd0601e012
433 F20101119_AACTWM aggarwal_v_Page_01.txt
60f40cf1ab51a7d507b8ff67d094d84d
f0fa2a14a4a447309d195523476588e3e7c01c9c
56808 F20101119_AACTMR aggarwal_v_Page_18.jpg
7592bbdbbc794fe771ae494aab017b45
20f1df8f05283319b21fce079c010686296e1ae5
731 F20101119_AACTHU aggarwal_v_Page_61.txt
4b66d2975900a3b46b883599ec311363
8baf0253f45591ea75b7718c3967fc4ce8d9d2f8
30227 F20101119_AACTRO aggarwal_v_Page_33.pro
c27c0caf94d3a4e9e6a4130f4cf3f57f
cbc3f96280015a622988761825961c977dbe8c89
1250 F20101119_AACTWN aggarwal_v_Page_06.txt
ec05ee19772a83cb85d3de07647aa6a4
dcab2f2fb634b4ed59c51b7661f9290524fd34e3
2205 F20101119_AACTMS aggarwal_v_Page_38.txt
8a899ceb4249d8caa0e988e2692dd85e
c76704723ac4edc71eb312e687b4df76f6ab03d9
28858 F20101119_AACTHV aggarwal_v_Page_60.pro
72ee442eedb796e99d795d2493f27d35
dfe1d32c857731785c2ba21abb14b2be291ef589
1200 F20101119_AACTRP aggarwal_v_Page_33.txt
b46543c2a873016d6aa92d0c33a23e6c
f7e6d6dee8bf7028ecf462905053771d2fb49dca
599 F20101119_AACTWO aggarwal_v_Page_08.txt
ad333320b4498da77e7cc05e7cdb3f6c
30eb7171e0aaf42e5a4566886097fe71898f5e42
50899 F20101119_AACTMT aggarwal_v_Page_32.pro
a60628b17690ebe83e957c6b707ee400
6ca0f587c905f2615eb25e1bbf4bcf75f291588d
6604 F20101119_AACTHW aggarwal_v_Page_45thm.jpg
e066cb60a515df7fad81032cfa06e382
2bd5892650bba0bfa50ec2fcd4e8a227880a0697
10956 F20101119_AACTRQ aggarwal_v_Page_10.jp2
0a39d9f67418d30d9a247f26b666aa3d
5dfdb9abe56bf0156fa292f31a7c77f1ab49894e
1719 F20101119_AACTWP aggarwal_v_Page_09.txt
42b785b86db6af00c79ffc51ce9e43a2
11b60ca91de1da2a76d1d4215284a0bd4fba18fd
71876 F20101119_AACTMU aggarwal_v_Page_59.jpg
0151d21dc17dbf317533c3bd920f5402
96d8bdd92bcc6652ebea580455987c3235faddf0
F20101119_AACTHX aggarwal_v_Page_11.tif
a0f8544865ef68f5d7adecaedfee50b0
728ba50b6aa5bf6f1b0ef1bacf6b3992b7baab55
68114 F20101119_AACTRR aggarwal_v_Page_68.pro
b2fb92fab3f828a0f7337d815ed3ba1b
325fdab4a7703285f773b02d4b7c40a3d4f191f7
1814 F20101119_AACTWQ aggarwal_v_Page_11.txt
c3fa20175ecd1db4e7f8b633ff884c2d
fe23112d2cf31fa9fcce24da3e80a076d460982e
91326 F20101119_AACTHY aggarwal_v_Page_53.jp2
75e771f3dd814ccda2f4edc3102960d6
390ba6a734ace2b0441119c24b31814c0df4bef4
114655 F20101119_AACTRS aggarwal_v_Page_66.jp2
fe73f2514b5f4f0fefa7fb98ed37ef44
8e5e12639b53f32012f25f2618047532049b134f
20567 F20101119_AACTMV aggarwal_v_Page_58.QC.jpg
f3836dc716f39104ff67d7c9803cb912
b33362ff5808a916e5a9350d43df7ff23490d054
72143 F20101119_AACTHZ aggarwal_v_Page_25.jpg
bedb0d5558c6a39773f396677ae57721
a1b908ee8338d2b56e789c1cf627341058e4b925
2028 F20101119_AACTRT aggarwal_v_Page_47.txt
45558a6eccac78bd62aee132e5007462
414313901ec10c7c70d102c4c1a732e7d7e58c5d
107983 F20101119_AACTMW aggarwal_v_Page_20.jp2
3e896cecdd3551c8e83ccb2f1de29ad6
3de9fdd7bf3aca8273b4450421533d958a62bc47
2043 F20101119_AACTWR aggarwal_v_Page_14.txt
1bc54091e9e4a2df8dcdfe91c2cbe3e1
930bd5540e3d37a6e23a5bce0bf2ed16a7a1acc7
6787 F20101119_AACTRU aggarwal_v_Page_12thm.jpg
44ace846af3ffcc453c2fa8e7c8ca51a
2747a676f94add2ac2bcadb5cdfe4f1b17b79d01
6033 F20101119_AACTMX aggarwal_v_Page_37thm.jpg
566c88099cfbc277d84ebbbb3e84f1c3
020d585a6d0eb9a1bf69e384528fe786f9e303a7
1758 F20101119_AACTWS aggarwal_v_Page_22.txt
52d2b5d8f5652c79c384a3095d870ae0
76d347b5b953af980bce4a063fa7d9fdd0acda87
69702 F20101119_AACTRV aggarwal_v_Page_33.jp2
21c17642d7cafafbc33a5251177d10a7
1526fd4f0a975ec02ef581dee4c6259366589f68
103254 F20101119_AACTKA aggarwal_v_Page_41.jp2
be84aca46a9f2cdd72396cafdabc038d
a8e7e48bccdf097b69cad8adbc6ce2a7e85b1924
23791 F20101119_AACTMY aggarwal_v_Page_45.QC.jpg
1b28ad531525631de78903a5f4cdfe94
8102876efcc4846318ab86a7742b38c8c924af95
1952 F20101119_AACTWT aggarwal_v_Page_25.txt
ed0e5e899a1b0cb1c486ba977be2fbf4
dfb2dacc8f4d329014bbd7952cc19ff214046746
F20101119_AACTRW aggarwal_v_Page_24.tif
7ebbfddeb4e8d8ebb7929c12bf014a31
d7f3dcf01407b111a65625c9a3fe20ac10a9d0a5
23858 F20101119_AACTKB aggarwal_v_Page_60.QC.jpg
29f5d8abe36f07726d6add3d2fdd01c5
141ff0f4f2bde77076443d8df26ba8e47751d0cf
5539 F20101119_AACTMZ aggarwal_v_Page_50thm.jpg
3c53e32498fd3941c43f39b6364fe003
c502c3285c34167dbc98fb5f50c73324cc035245
1586 F20101119_AACTWU aggarwal_v_Page_29.txt
d29a7e19b3995978d6fbc486b8f85dfb
f663e006cb79489620c4e6db503909763d1be41c
109244 F20101119_AACTRX aggarwal_v_Page_64.jp2
8bec7f837ddd4627b9964ac34cde9d3e
edcd3c310d040c236854fb81fb7b0f91b63d4ecc
6760 F20101119_AACTKC aggarwal_v_Page_54thm.jpg
cdaf7496905db5184b17310578073a49
8201d6ff8e067eec8e90364a12ea35fb6608fa03
1867 F20101119_AACTWV aggarwal_v_Page_34.txt
aeb8a3e407db423db40256fa8087ad10
660b79b13d0bccdb66eee82e73386db97819a898
F20101119_AACTPA aggarwal_v_Page_16.tif
5bec2d5ffd296cf199133fb9d5070b98
d283df977907931a80ec3f4982d09717e62fb6c9
19205 F20101119_AACTRY aggarwal_v_Page_24.QC.jpg
ade5dd6a880c2e22134714d665d42b23
72abaf21fa3b60eb29746618546469725141eaeb
1700 F20101119_AACTKD aggarwal_v_Page_28.txt
56a78c7b9ba725cd8bcb1f83335bde2c
bdd34bcece21afc0ce2c7c35c47371192049cb0e
1583 F20101119_AACTWW aggarwal_v_Page_40.txt
335386fa24fd3866abc41bbe65552d68
01ba5eab72e5eddfbe69aef353eca80299d3b87d
6736 F20101119_AACTPB aggarwal_v_Page_55thm.jpg
b2e419e549c8474b51b369e52c66f61f
b9d854398fdf57104020788604eee395dfb68509
114602 F20101119_AACTRZ aggarwal_v_Page_32.jp2
d42fcf87741a1e648cf8c183ea26701c
a0179aea20033e26544ffffa87503788b04f680d
76300 F20101119_AACTKE aggarwal_v_Page_32.jpg
a9ff870293b0ea6ac53c5a8cd08b2527
16ac6bd3c785a290a6979432e93cc1963d2e1b92
1919 F20101119_AACTWX aggarwal_v_Page_42.txt
979b4efd27800569ed9e6c740d63cf57
1865c9bf94533fed0f2110a0db10389346e4b30c
6139 F20101119_AACTPC aggarwal_v_Page_11thm.jpg
ae6a1f76723bb184f6bea881ac9d32df
bf1e84200cf1290893762fdd9c8798928d7e1554
1071 F20101119_AACTWY aggarwal_v_Page_43.txt
e0321fcaeb1d521d176b62f8dca26102
758e88422d29b810781c2e4a1eccd06383b482a3
74294 F20101119_AACTPD aggarwal_v_Page_56.jpg
90cf534247ed7c57513fcc5938f63b3a
9c7e2be03862acd22d8eb3abba5c325e477050c5
5216 F20101119_AACTKF aggarwal_v_Page_28thm.jpg
ca9c251b98df89dcbfbf8aa919a3b11f
151fdcd97e058eb9519be885fdb4ec607a4e1ebd
91638 F20101119_AACTUA aggarwal_v_Page_15.jp2
37f6306d0876344396c8858a735441b5
91310fbcd0a5f3a11e8a5967e786dca175945d07
1846 F20101119_AACTWZ aggarwal_v_Page_46.txt
9c6383deca650331b8f65b9f308426aa
c7ead089efb4ae0cb22fd397b86f86e85505de82
4250 F20101119_AACTPE aggarwal_v_Page_43thm.jpg
333269f0cb28651917ed498556211de0
10f50c4c419520ec249e8bbb0b9fbf146b125f4f
25249 F20101119_AACTKG aggarwal_v_Page_23.QC.jpg
b7ac8e16e5600af13130a2ac1afcf25a
762093fb12a7fba90a05af5cc2eadef75005198d
96893 F20101119_AACTUB aggarwal_v_Page_16.jp2
735a93840ed46c4de0aaa7a495640786
a51e8749f5f84ab9b0dc673d762d60b21a3bbef9
F20101119_AACTPF aggarwal_v_Page_28.tif
ef8cd4ad11d3cf2f2f8467f2630688ab
0f8a3727a00614c6a5d8df8e15be9aafc02a112f
23439 F20101119_AACTKH aggarwal_v_Page_56.QC.jpg
71e637bfedbe9db8e5430cb0fdfb7eb2
1b8bc850431015a4596cceb362a8a64caea678fc
1051932 F20101119_AACTUC aggarwal_v_Page_21.jp2
0b4d91dfc95c56355860663f27d60925
ccc12e77db2f2281c4ba566758dd535e2d4f45c6
5237 F20101119_AACTPG aggarwal_v_Page_04thm.jpg
7714076ff635415dfd950e06352bb115
e4e8ffc72efac9f42ac1328ceb02d793cdfd06fd
75836 F20101119_AACTKI aggarwal_v_Page_55.jpg
56a2df39aac02a0720da99079156e345
95f50b9b9ba7045031756bd42fc1565e682789d1
115463 F20101119_AACTUD aggarwal_v_Page_23.jp2
7f7dbb65fc9fa2fcfbea11aff3221ce5
f2f6e8d93da4cba5cd61ecc020e1dcce43ecbc4e
1647 F20101119_AACTPH aggarwal_v_Page_15.txt
e1be28be7798ff87760ea160868e5651
71302e76d6d7a5d10d4d5bbef09b4b224abc5798
17585 F20101119_AACTKJ aggarwal_v_Page_28.QC.jpg
92f1454593544726f022bc98c7c03c6f
5f49bd04b3f7dbdfb4418ebbc58770028ab07c83
109072 F20101119_AACTUE aggarwal_v_Page_25.jp2
f884a494bbc101b1eef84f0168ca33a5
5a0182a2d3207a03ef1b74e0d7024dcee0fb408a
F20101119_AACTPI aggarwal_v_Page_44.tif
41c8d8419eaa2127172dad17c55ffcce
789c765a531f1d7d3b176015f360975d558075d1
80733 F20101119_AACTKK aggarwal_v_Page_66.jpg
ea63bf6e517c93043f8fb2a6ddc55045
833ef8481bdc3017a47ae876b7e49ff026a22fa6
922278 F20101119_AACTUF aggarwal_v_Page_26.jp2
7a1cf9ca594a9670896a63a62aff0e81
6eac604844c07faf20492939c03df46a7586f077
2075 F20101119_AACTPJ aggarwal_v_Page_23.txt
5623fe180ef6732a5be6323251b23c5b
aad24633dc93a8a3eb6531913aac84d7d73a703d
23786 F20101119_AACTKL aggarwal_v_Page_55.QC.jpg
a0e2cf611cf875e777e20d2f7e13404f
4b14138cc893d2ff436ea919689e6f8fc45d55b2
68020 F20101119_AACTUG aggarwal_v_Page_29.jp2
ceac9626c4eed11ad8037324fa4bf29b
88892f8095d2353f9672125e364423ae89db4644
77617 F20101119_AACTKM aggarwal_v_Page_19.jpg
77f60c1ca384e23adb1c41e772d757d5
df46758b397c5a9a9c2db18e9455bd4a2cb03eb9
99069 F20101119_AACTUH aggarwal_v_Page_34.jp2
a54cf11cf9f1c33ccc9c18caff948c9d
6855760804cdc64ac5fbf4fb796f8a1360ee0d69
37254 F20101119_AACTPK aggarwal_v_Page_70.jpg
6a7d9ea9aceb5cc928ac67a9ad4a68c7
cfcfb950513bdf8771800dd7da3e767ac2c46506
6772 F20101119_AACTKN aggarwal_v_Page_31thm.jpg
10d22d82860ff64a5a166df17c379c03
abbefae4610bcc01fb6fbad4bf85cce0e599e4d9
94619 F20101119_AACTUI aggarwal_v_Page_37.jp2
b04b4b775a04f2ef6bee6f80a288c944
bc20709d13ea92fbda926a56c4bb9dc8227fde57
110074 F20101119_AACTPL aggarwal_v_Page_31.jp2
5830a840c7f035a70fb70c3a0be72358
27c44b9cf08dfb71c09f2b99c038bfb520eed92e
43322 F20101119_AACTKO aggarwal_v_Page_37.pro
5df22b8dcb48fefef38c1957b2226dea
cc2b60f63402d59d860e5bf481543574ca197c30
826063 F20101119_AACTUJ aggarwal_v_Page_39.jp2
d6f13639e4cea01c2a11d0d644f788d8
2afc28a9fdb42ed66402ea477a0f93c0928756ce
1051907 F20101119_AACTPM aggarwal_v_Page_44.jp2
1e4c93e46d5ad2325dec1fdcec8d3d23
9c212aab6ae6b3daac44ab0dc7256ffb21a616ca
39346 F20101119_AACTKP aggarwal_v_Page_51.pro
36fa60ee3f4cf0d62fdd8a805b6db54a
b70f8b3ab3c431a80531d6a0a4b5a489a9f11ca7
79906 F20101119_AACTUK aggarwal_v_Page_42.jp2
0eb30684b29f56696cda7d8546e1d844
d848c670fc3e2695e81a92198de5c7514015823b
22212 F20101119_AACTPN aggarwal_v_Page_69.jp2
f2e2eb2badda18bd84a74b8b9192e539
cac73132ab9a46da70d7feac64fb1bcc50599c2a
25214 F20101119_AACTKQ aggarwal_v_Page_14.QC.jpg
dd6f464da86009c4b5cc8809b034ad24
adf886643972da67905aeb20d7e973397267af8a
60656 F20101119_AACTUL aggarwal_v_Page_43.jp2
dfa42004433a2b39fc595c4b8af8cb28
708a1257960e5670e7131f86b92fb3e0866b1d29
61856 F20101119_AACTPO aggarwal_v_Page_53.jpg
f051ab6f765179301e240763dfbea0fc
e8747f4cd444b874d5b19609c8f27df4565b1d75
F20101119_AACTKR aggarwal_v_Page_58.tif
b19868cd255788a0e45e14f6c1d0ca18
d3195e0028facee219576256b28c1e1af9ee084f
129947 F20101119_AACTUM aggarwal_v_Page_47.jp2
979215d7d00d6f2d030af51e02d855b0
6bde3fc8a89d13ec56045b89f7bfa8c4b9dc574a
1718 F20101119_AACTPP aggarwal_v_Page_52.txt
a41553905c72ad764fee97677fbe6d50
a4c469679b53ce0e1fc3ed0e35f2078d7b8a9626
6320 F20101119_AACTKS aggarwal_v_Page_57thm.jpg
e3cabb1e0b9db4553f5565a67b3dd9e5
f621b854d1a12a01b8dbac134789d84c4bf52bbb
112531 F20101119_AACTUN aggarwal_v_Page_50.jp2
4af850e854793b70464e21e94bc6099a
59af0810d2e7a285b393c5a4229000c381720de2
14480 F20101119_AACTPQ aggarwal_v_Page_27.QC.jpg
15e71af5c5245d5795136c06604b6132
b48033e6da6ae431c6c06c9ceddc67310e571d25
38101 F20101119_AACTKT aggarwal_v_Page_50.pro
144c2572b297e45e48f8b8e1237f3816
5752d0533ae92b7f78cd8930aeb235b452288932
844337 F20101119_AACTUO aggarwal_v_Page_51.jp2
bd64ddbec6ade41e0bd79071ebd899a7
6c3630bfe0ca06956f62f41c092fc214e66f9cb8
2387 F20101119_AACTKU aggarwal_v_Page_59.txt
31e2271b2dc376130a9f0906e1cc6513
0d064f6c4721ad61e5010fd887bdb6b456ac6bc2
F20101119_AACTPR aggarwal_v_Page_10.tif
782bd67ca310e0f7ea30d1fc39a47ff0
6b16e5971f511e21ea1668bd05989eb5f505cff6
1825 F20101119_AACTKV aggarwal_v_Page_62.txt
c60f9a6ebd36a89607f7c3bda8e6876c
b136738047cae6a98c170fc99ce5d320d5eaa8e1
43904 F20101119_AACTUP aggarwal_v_Page_61.jp2
2fce4fc82cb3691d62b7186ea27b22d0
3bd6ce39ec7f83847e4ed5f4afc3d81d284e661f
56989 F20101119_AACTPS aggarwal_v_Page_44.jpg
6eefb5b8afdb8ae23f2997f04486b438
81f18af5506f1bbe1ac8da1da569596ee285e975
53952 F20101119_AACTKW aggarwal_v_Page_66.pro
65aa817ee21bf7187b8b32660ec34b88
f0fb6b8875d21bd53c46ecb424b2734598d14118
133962 F20101119_AACTUQ aggarwal_v_Page_67.jp2
58f564bb6662cd6d6c907740a19cd328
422ee048f2643de4b9deba4e3612f2012f40d716
2146 F20101119_AACTFZ aggarwal_v_Page_69thm.jpg
eb298536fe2b575e745350ad5904fd82
d54169a6d5779ebb805cc357482f138881ff4b33
30074 F20101119_AACTPT aggarwal_v_Page_27.pro
c24b60a334dd99dad9d32d6f3e6dcdca
42eac4ee60e30225557956282049098f00047f86
59195 F20101119_AACTKX aggarwal_v_Page_09.jpg
5ba926a53ed446bccdc045b505252d60
f7ea2d8f3ba4ed3fe545d8abf5fbd0d7c0e49a76
F20101119_AACTUR aggarwal_v_Page_01.tif
34bed465caa8c3c578af57900dfe25ea
40ef1dbb03ccf14343c78c0f1d4823deda0284ac
F20101119_AACTPU aggarwal_v_Page_66.tif
0ae6fe32fdff939cd0ba5ae8eb8e14b2
de9446bdcab4c3679b1795cd6f1f6723eb938d10
5705 F20101119_AACTIA aggarwal_v_Page_15thm.jpg
b517c99f55f6c6ededad3e390464da1f
402d79b505209076726803c975c1dfb155957a96
5934 F20101119_AACTKY aggarwal_v_Page_51thm.jpg
7054ded0aed4802ffce1e815f61c60f7
a47dd45783ff55618c7191a3b6e15b422b1cfdf9
F20101119_AACTUS aggarwal_v_Page_04.tif
d939b396d33a8e67638e59808ed02995
1325d2e2d86c50429bcb7416bc55aca4419a65b7
F20101119_AACTPV aggarwal_v_Page_13.tif
452886627ad472f768ebd92052e371b4
de8a1230db7728d9d03f41cee19f6faf16370241
F20101119_AACTIB aggarwal_v_Page_45.tif
4dfae1d089ceb944b81e5e47b6421d66
4c94b9375c86c71b6da2f39147714935b4545301
2801 F20101119_AACTKZ aggarwal_v_Page_68.txt
92cc840951fd9950893324c23d9b3f8a
bed79d301dbc4fe20bd77464e6ed89db56f9a59e
F20101119_AACTUT aggarwal_v_Page_12.tif
92bea155cde5c4e9ff77fe0c4d55716c
77269bf2ddaefd94b2972cfa4e01526f07f47470
49578 F20101119_AACTPW aggarwal_v_Page_21.pro
8c643e74cf813343048aa894aabbf848
5751dca23178d82e35e27e86e9df9bd8142f4b43
1051981 F20101119_AACTIC aggarwal_v_Page_04.jp2
80302de3390d2bc746f05a87d55ff666
53a3b16dea38d56827ea7b3fca8c2692d300a383
F20101119_AACTUU aggarwal_v_Page_20.tif
90bfc6882913907b4bbb0b8761b1e27a
57a827dce7c8fa937fec6e9486ad0612bf167f07
95007 F20101119_AACTPX aggarwal_v_Page_56.jp2
48da58a97b9f87cbf2f156f1ce635756
0389f21c5f3aadf5c1a7e0abf099ff877ae9e51b
F20101119_AACTUV aggarwal_v_Page_26.tif
509d74c4e53043e6df150c9521f40035
d88859bb592fc1341719e00a4208051e1a55266f
791577 F20101119_AACTNA aggarwal_v_Page_40.jp2
775236bacf49bcc438d59485833f61ef
579533dc8298762755318d4c6c9caf4fb2533183
1817 F20101119_AACTPY aggarwal_v_Page_37.txt
ef695415658555dc788c16285469554f
724e4e16f5dabbe1787b31c9e84bb8fd8e0b3c7f
49228 F20101119_AACTID aggarwal_v_Page_22.jpg
41b07c5cf66d071f1423a4bb6af470a9
972e21e2520235618f3efed10722d10e6389bd03
F20101119_AACTUW aggarwal_v_Page_27.tif
4abd01b0201a2bd1a5785b091f6504a8
db82ea418b752083a83f486dc855b327038e8802
400 F20101119_AACTNB aggarwal_v_Page_69.txt
8325adcc5302f0f100c469cd24eeef35
459b413b36f23b0da0a6744bb70a943524a688dc
2026 F20101119_AACTPZ aggarwal_v_Page_36.txt
6d1a7303e0a052d4b776848d63494be9
53dfd9228e622878ab60856ea90266acbf085866
F20101119_AACTUX aggarwal_v_Page_31.tif
698b90ec139ffc201bb9063f89b899c1
12934ea8b7ebc271a56c4bb34cd29a06a56119d6
5989 F20101119_AACTNC aggarwal_v_Page_40thm.jpg
eb6dd805bd5b7bb27fb3b14af2e8deb6
8e27cffaa1d342d835911e41903308fb968b20d6
21019 F20101119_AACTIE aggarwal_v_Page_70.pro
f16aef3a34ecd9bdf2c5493e47143fa6
ca13bdbdf6e52e173919e2390e3d39dde33df964
23969 F20101119_AACTSA aggarwal_v_Page_38.QC.jpg
3d584a99d9309a3dc9cc801a4fb25ddc
7b686ec195fa63a5696ef0e550511082c42432b9
F20101119_AACTUY aggarwal_v_Page_35.tif
6030e78af48a6c1d0ee2419d5e7ec291
43ed91c95270e969243ec72870f7d681531544be
2434 F20101119_AACTND aggarwal_v_Page_07.txt
38b677e9b3639343b1cdbe4a846a83e1
a2d1962bbdd41a685d2ead061501ec7e116633f8
1899 F20101119_AACTIF aggarwal_v_Page_20.txt
749a7c8820c4a2b773195eb2de513d20
e61281e61f738fb8f58603952ec4458ef8540660
F20101119_AACTSB aggarwal_v_Page_56.tif
7b8205d569d1893b0c9c4c35d8613ab1
15aa9d09dbab76598ff8c67b7c70e67fd89adbfe
F20101119_AACTUZ aggarwal_v_Page_36.tif
4f153d5ecd234e5afc4871e455d329f7
9682c299a092ac81de9fd1b9fde855fcc3215076
2234 F20101119_AACTNE aggarwal_v_Page_66.txt
3b1f34982697b7d92b174f6bc3e5697a
eca29ef1b9b7bcfc15f774d2be4a5dd4dcc39830
69271 F20101119_AACTIG aggarwal_v_Page_52.jpg
116389b6208dd3ae4d90f1c5c96c31ed
50f181f93f696f723e296f587a398bcf15dbdcb7
47107 F20101119_AACTSC aggarwal_v_Page_56.pro
6d8d5ac8e47120f5803cdc0d6080f50c
6ea39a3e2baf2a6ae493088d3515764c884f0e8f
68889 F20101119_AACTNF aggarwal_v_Page_26.jpg
8c94b27561b24be2efb2ca5bbb68de82
5bca3bd8d179c7706b619eacb82566c32232d3be
20728 F20101119_AACTIH aggarwal_v_Page_52.QC.jpg
450aaa587aa2a2d871e3f87cc293ff82
6fa352412837c7c5247d233b345bb74115a0c7ad
2099 F20101119_AACTXA aggarwal_v_Page_51.txt
e275c23ba3d2a631a3ac57e6cc7cf137
2bb3635208368257ac7c51fb78bbcf0768913a9b
72327 F20101119_AACTSD aggarwal_v_Page_41.jpg
490129bdf68966877bf75f4350518111
6b4a599a0059cc90dfaa39e7168c0e74ae605dd5
145848 F20101119_AACTNG aggarwal_v_Page_68.jp2
4807375f72cc4daaad27b5d41857603f
912613104934a721e6adca4e32fbdafd78ad1e16
1261 F20101119_AACTII aggarwal_v_Page_19.txt
50dcbe15ac350df10f8db4be9c832d93
fcb22c9803f6b52e06e3b80e02fb176c614c14e6
1626 F20101119_AACTXB aggarwal_v_Page_58.txt
a0e09f95af12f2817c99eaceaa97976d
c97dee064050f0f3fbf707c0ec0caaa64db46a3a
111418 F20101119_AACTSE aggarwal_v_Page_17.jp2
df7e14619a40d9b7330bf0eb3225cbf1
fdc08b94790aee9a0f3ff8e25d8b7f06ce58ae60
75847 F20101119_AACTNH aggarwal_v_Page_35.jp2
c4dfcebd17b1f1dbdd5ebc36ccceddc8
6df72089373e102daf0064e10eb565db8f3e84a6
8480 F20101119_AACTIJ aggarwal_v_Page_65.QC.jpg
089f48654d0c2733e85a16111b9a4ede
96474fc5f8981501360dfc5ca89c91c4d50da5c1
1168 F20101119_AACTXC aggarwal_v_Page_60.txt
e0ca6f7b6a1ef24998e7eaac503f58be
a03e20e8a32dc1dd116fe48b3f59b250b18edabb
17437 F20101119_AACTSF aggarwal_v_Page_03.pro
5808f827157b125ee662910faa24daac
6e6d58da6cce0447cf552bfba98c43d18fc01f2c
82509 F20101119_AACTIK aggarwal_v_Page_21.jpg
9c3bba1d35771a66cf507be148faeadd
082f9b51038d58602126d6a23dee1c4f673eb95b
2022 F20101119_AACTXD aggarwal_v_Page_63.txt
4d3297008d02e37183d0c0cf0f854575
39e1bf0e3f8de1689c232e729b520f0da662bb3d
1019729 F20101119_AACTSG aggarwal_v_Page_59.jp2
51d26ac29c078e4ba42360e510360368
c25cd7177736de91e948eab5600eff7a2cedd2f7
25458 F20101119_AACTNI aggarwal_v_Page_44.pro
6c4dc504717cbdd1872403418e8c30d7
1b2fc0a4f199fa965407beb3c4fbeb675d276593
31600 F20101119_AACTIL aggarwal_v_Page_03.jpg
a9a2db18a7ea279e8cb6baac6d0aa878
9479f4e96fcedc6ea01f75961f90cf23b4af41f4
584 F20101119_AACTXE aggarwal_v_Page_65.txt
5dc052d4eb942fc1ce6b4f4e25f07bcc
0f4aa9cad0e67155ee2034aefded06601fb97f63
617589 F20101119_AACTSH aggarwal_v_Page_08.jp2
ad68b377789fbd421cf89cca6fe371d3
a587d98686216a65418e5bd1b065bd4742d25ba7
6285 F20101119_AACTNJ aggarwal_v_Page_56thm.jpg
e37659290c300f0b50a16aacb1cad727
54986ddd5e8c28e3140ed9317082060f2860b602
6306 F20101119_AACTIM aggarwal_v_Page_62thm.jpg
7061b927a7676807a9358f8a963132a1
2389f740440c9796269f9c268379faa51fc0124d
2620 F20101119_AACTXF aggarwal_v_Page_67.txt
5d63ea32e852aa979d83e09dbd72ac03
4f43cf965541b1d66a755eb3a36ca7ab3dd32e74
120 F20101119_AACTSI aggarwal_v_Page_02.txt
3ea0bb0a055d5841fc87bf132e4055bd
f66ad1006265264e9ad35765a4020780dd3a0450
953307 F20101119_AACTNK aggarwal_v_Page_54.jp2
ec4ca7521340bfc4cf02b4ed57f6f7e7
d6a4f96da0af1d8875cf8e045da7a6fbd2cf423f
12475 F20101119_AACTIN aggarwal_v_Page_70.QC.jpg
9cb4273afa24aa93e136c23b427c6fa5
7290b5be5a226d985f84b15a06bb02b48602aa71
891 F20101119_AACTXG aggarwal_v_Page_70.txt
c82f13b6b65409ebef61a81c3ea81249
30d4a43e4e687a7133f3a57f93328b0521ca47f6
28347 F20101119_AACTSJ aggarwal_v_Page_68.QC.jpg
3cc9f035d0be08a46d361afa4bff15de
f2bc95cc8c88a252413e5bd0453cbe20e7fc7da4
48604 F20101119_AACTNL aggarwal_v_Page_33.jpg
f69a51429941504ac7c91eb3dc4db711
16ab43baf9667f75e791bc0cc2f669cc6bf9c7e1
20668 F20101119_AACTIO aggarwal_v_Page_57.QC.jpg
c275845f6887b018ff1b1e72aec06a2b
016b71bbfc5c7c7a4248d992659ceeeda4dfc41f
1082361 F20101119_AACTXH aggarwal_v.pdf
84d2f30962a3d38789005d2ed77c9ed8
340318d9b7f8fe32d9cf9500b5cc2e4db7487ec5
43133 F20101119_AACTSK aggarwal_v_Page_27.jpg
6f4cfe0b6e5be0de9304364b964fcf4d
96ef7b3ec0be101599e8e159b33759dec4d17b17
50584 F20101119_AACTNM aggarwal_v_Page_31.pro
d21421d744a9d792abecaecbac1f37c5
52a3e71afc9e53a2341d4915848fd73a414ebfdb
1016995 F20101119_AACTIP aggarwal_v_Page_55.jp2
79600a36faad9eb0fb599073bf0f441d
88618a389a60b46d8b338e8c253b68cdb9f60bd3
21894 F20101119_AACTXI aggarwal_v_Page_34.QC.jpg
0b5e5a5d5e19505943434b9df86d4426
46946d41ac00e86a2de4561632b9c83563844ab6
70008 F20101119_AACTSL aggarwal_v_Page_45.jpg
b5508dc4646d0964718bc9abd7e77b38
9bd4a886b47813c6037e3ed7ad39adf326587147
1616 F20101119_AACTNN aggarwal_v_Page_55.txt
7850a0f710e26514291666873a5b6b60
9aa3c2a233524115299f027fbb87bbec6a1249fa
43462 F20101119_AACTIQ aggarwal_v_Page_16.pro
e53a78b77c917cfe52e20032a6a0b171
fc7d8dcce051bc7a50a76814fc7fb540258df8b1
14543 F20101119_AACTXJ aggarwal_v_Page_43.QC.jpg
3d02013d7903284c95863038547f140f
2a08fca7feb8d2dcf1701e536022455974e98613
11018 F20101119_AACTSM aggarwal_v_Page_02.jpg
7c15170422fc7ee261d36ff6c8f74ee8
00eff5b1be70de300cb63df7d46baf5faeb62135
F20101119_AACTNO aggarwal_v_Page_57.tif
525928c6a9d68866308aaff2dec63ea3
de4665c89d2ae29030f0aa578a73cd179f9c0744
65662 F20101119_AACTIR aggarwal_v_Page_57.jpg
d5bc677086b0327dc5eec2085a2a1f7a
a072ae6758909edbb9c2c47a9d1c49b661129f69
7328 F20101119_AACTXK aggarwal_v_Page_21thm.jpg
235ab6fa4fa3f178d7e9ea741f7eb49d
841b11b323b34f56e4ab3a77181a3e577046efa1
45315 F20101119_AACTNP aggarwal_v_Page_47.pro
b88d1b8cef76140aed15b1f3f9fbc3fa
dbf8988d798a2edee49dbecc05e2e36993a5d095
23211 F20101119_AACTIS aggarwal_v_Page_41.QC.jpg
0c290f7625a5c2b2086da7177f61c9cc
e184f2eb976b21d329b1165e0d14ae7a4c045817
22634 F20101119_AACTXL aggarwal_v_Page_54.QC.jpg
fad89676c06cbc31ae0120c1168e2133
f4a0a1c43e7c529eb6ccdee5f37613663c541714
3308 F20101119_AACTNQ aggarwal_v_Page_02.QC.jpg
e28e753f65e8bf7016e61184fedd5f8a
67db51931200b11b88a2608ef1927e31618d912a
14696 F20101119_AACTIT aggarwal_v_Page_29.QC.jpg
9710136f40578d7c0c1cbb542c928232
4caa8935c8d8b81470c695727f44639be7d1afa2
2649 F20101119_AACTSN aggarwal_v_Page_65thm.jpg
56b17a15387146c42819bd030a04b3dc
626044259559e121c51026f8377016d9fdc0f134
5837 F20101119_AACTXM aggarwal_v_Page_53thm.jpg
c287c7bfbfb366906c61be19d0cfad52
a9a89d0b3aa1b286642152f898cc683ee19b9c2b
41377 F20101119_AACTNR aggarwal_v_Page_15.pro
ccf79ca01dc8381baa90a3ebddeca1c9
f84246b6cb937c4634c909a81c29d59548264080
37602 F20101119_AACTIU aggarwal_v_Page_26.pro
dc370d9c79f1409be6c6858ea28d4ac3
12e0a2c0aec55ed1a0be1b23fc576ac09c4132ee
7188 F20101119_AACTSO aggarwal_v_Page_67thm.jpg
ddfee55bfca0ccee2ba68e240af2199c
ec912a36ca4cf62b4bc785aaea6b309df8a63e1b
23523 F20101119_AACTXN aggarwal_v_Page_49.QC.jpg
bace8791cb9fef93d8f99991513c6139
e2114dbbb74c51f3735e9bb63b0652e4d19c088b
2302 F20101119_AACTNS aggarwal_v_Page_56.txt
43d43b223a6d1ec4de4b22a4c89f9972
620038473082e31f3dba95a2bc1993dd167517cf
112268 F20101119_AACTIV aggarwal_v_Page_63.jp2
59acc58410e5e1205c82bd404f1c464e
423dc912778b971e3bf6366f79e0377aed2d3649
24013 F20101119_AACTSP aggarwal_v_Page_64.QC.jpg
52a379e253fddf32e9d15e5e5c5fd121
ffea1224f4de94b6ff649ffd8a9c62472a0ef8b3
23246 F20101119_AACTXO aggarwal_v_Page_66.QC.jpg
58cc58f430e46cdf253b01b79c5da375
d81cc20f9e8587cda5307157b92307dbfc64fdf9
F20101119_AACTNT aggarwal_v_Page_47.tif
e0355d0638f69ee3f026d8fdcafa03c6
5e90dd244ee8ce1fb239c13d0f84ae559b32d695
20703 F20101119_AACTIW aggarwal_v_Page_37.QC.jpg
1b4f43c4d3c238ab0a9ed698cdf69724
b123fc2b8493a2c6d39f5c10d2fd1de988aaf8db
64753 F20101119_AACTSQ aggarwal_v_Page_16.jpg
468d5991eb3291a0ee7aa628b791134e
a54608bc555204dfe889bafc23b2cecd45a0b379
4277 F20101119_AACTXP aggarwal_v_Page_10.QC.jpg
7deadebcfb2dd057544e21417eee75d8
9d10b162ab7a1f731072881d50c36362a6d25dcd
35461 F20101119_AACTNU aggarwal_v_Page_42.pro
46cc83f41d9514231282c2b3c1a94d1c
b5b944467dfbd68c49f67f52f2e03ccc27e444db
73260 F20101119_AACTIX aggarwal_v_Page_13.jpg
a259e8e1a10dc3a4b942e518df9c339c
3b326fdee4d7b9b667d7f99b58e8d2c57bee3b78
73611 F20101119_AACTSR aggarwal_v_Page_36.jpg
e97d2ee3f9ade5afbbf297519e6c2edf
7b651f0d41041d387de44ab7b7b4c21f53de2f88
3206 F20101119_AACTXQ aggarwal_v_Page_03thm.jpg
80e735fd30d03d8140b98860f09670ac
69ff635a24bbbbb0815b7e567ac1c13a51aeb97e
98488 F20101119_AACTNV aggarwal_v_Page_68.jpg
5a1d2f0620c943e8799f339bb5314d74
d6ae40ec295b7cdf6a5552a76c34e38632498653
5369 F20101119_AACTGA aggarwal_v_Page_35thm.jpg
2c083fa93247aab01e8b2b5a29499553
f71e9803916e5aeed00922dab2fb662b4a9dc303
F20101119_AACTIY aggarwal_v_Page_08.tif
404fdc31162ae91cb82c43c1f1157e35
b6760efde17084057cd7f76b1a5d680c765bbb31
F20101119_AACTSS aggarwal_v_Page_29.tif
8b87b3c4c4528d378605e522bdd5c6ef
8fd9f22144a547793dd427056a781e16d703dd59
24058 F20101119_AACTXR aggarwal_v_Page_20.QC.jpg
fa87c4467b26383782ddfe96aa710db7
28faebd26c2f34b2e78113b0658d78aba7863698
111290 F20101119_AACTNW aggarwal_v_Page_36.jp2
91bdac57196cab3b39f94a123d70e951
e9d36454e107fce696ef8324fa3a3dc1bbc385d1
75761 F20101119_AACTIZ aggarwal_v_Page_17.jpg
3997edbaf096892de66d10ef537f2352
fb2e89076db71a42ca827a7ef8fa81c83c794837
20276 F20101119_AACTST aggarwal_v_Page_30.QC.jpg
c2e96a18399cfec8054ed767720c66a0
5c214761cbde610aed609bab4df81510cf61d0c0
826427 F20101119_AACTNX aggarwal_v_Page_57.jp2
c27cacfb535c084011b787c81d2f7c53
1f60ebd4878c01bf3a3e31d5c1c4a8ea0351652e
63428 F20101119_AACTGB aggarwal_v_Page_22.jp2
fe4dc1c2fde54843683078bcde7936fa
82d023ede5e6952b25b2d49823e11ae09d83b806
51066 F20101119_AACTSU aggarwal_v_Page_70.jp2
89e8aaa56b9c186b8c5b907fa98f3f31
ed26652b17b5db57084004d52ac7b4a2189e04c3
16163 F20101119_AACTXS aggarwal_v_Page_33.QC.jpg
141a15d6ad1ef07b14bfc4c1513f39e9
93d5c7e7bd26a39ae69416df65af8e294d3793fc
F20101119_AACTLA aggarwal_v_Page_46.tif
ddb72e93aa0ac6b5c37aed428418dbbb
b00ad987fee2d07412400213a03da5b0d69e05f4
49745 F20101119_AACTNY aggarwal_v_Page_13.pro
ce909b6d39335231562ff51e585fbf2c
0c3453a8301a3cab3b554aaf5347b81762918a38
1498 F20101119_AACTGC aggarwal_v_Page_27.txt
b2a4a51840180e5e8e8825d161d052a7
65b1e260ef2a6f0cc065c5642b89b5711ef9f61b
F20101119_AACTSV aggarwal_v_Page_55.tif
b1732206b4cf148105351a5816a3b497
59c6b29fa7a4f3d28951fee6a537e3ea0a45ec80
7973 F20101119_AACTXT aggarwal_v_Page_08.QC.jpg
748d05c4d7adc97f2459cc5d85e01351
6097bb33892a9793ccf82d4d5c31afd35b0b6cbf
F20101119_AACTNZ aggarwal_v_Page_23.tif
5ff8602f39b31172aebb407c276e243b
f0a9803f3e0ffdbe62612b0d76f507f6835e9498
1154 F20101119_AACTGD aggarwal_v_Page_18.txt
f1b7b1b1d06c7d20af9a2a33bd4f95a9
8583334527e44b1fc8300109da25a159f592e362
1051973 F20101119_AACTSW aggarwal_v_Page_19.jp2
cba87ce50d40dc64add415b8c89a3055
6c7065f11bf11a866d5b3a2fc0b9644cb84769ba
6999 F20101119_AACTLB aggarwal_v_Page_23thm.jpg
c9228ab346bea0692e4178eb10e28cb3
6bb4c74387109d4c5520cd8001695af8fe3cc3b7
6810 F20101119_AACTXU aggarwal_v_Page_20thm.jpg
edd89d511b7f74d681e5e310586ade4b
2660c90dbc13cc373101b5fa24d31b5d1294def3
6916 F20101119_AACTGE aggarwal_v_Page_60thm.jpg
8a00fabc08dd5c8683bfdb5652e115bc
f9fa8aee42872544c307e15474d551d04553a2f2
60431 F20101119_AACTSX aggarwal_v_Page_39.jpg
c73a04d22f5e910a083c92ddcb633622
cd911125dc57cf2fc0d44f8aa877a73fa4d4d103
5765 F20101119_AACTLC aggarwal_v_Page_07thm.jpg
39eb8401e940f4de405ae91cbcd5c374
df3a900e7e8bc9556c68aec104d57e77b436efdf
20376 F20101119_AACTXV aggarwal_v_Page_15.QC.jpg
2ba930761234e8e23db86f1d88643f46
75c451c788e80dee217e79fb6241685d0dafd293
F20101119_AACTGF aggarwal_v_Page_34.tif
e79e737a5c099bdfe2b983e940f1d871
b54c53379e79310f950dbeb9f232287875681f97
6550 F20101119_AACTQA aggarwal_v_Page_41thm.jpg
dc584116d923105a73c4048f5522f385
cefb7a3212c260c07f0aaeedce4346f1c73c784c
77100 F20101119_AACTSY aggarwal_v_Page_28.jp2
52bc1a63f1af312c1e9ef0bc5aa7ff9c
dc45e57e358dfc3052a8d25c5c60da911a941a5a
20932 F20101119_AACTLD aggarwal_v_Page_69.jpg
bb728baef4d40123e4f1a52e8892a321
d118842fa82ccef0275a3ef40fb0db6ac6c50d67
20298 F20101119_AACTXW aggarwal_v_Page_40.QC.jpg
53152c8f6513268e8b30043e1c941d27
03533d183e61ba4b088e21870159508f9c519c4f
64321 F20101119_AACTGG aggarwal_v_Page_51.jpg
4561877a515fa5d82550a3eedb68ea6c
8f4ff2ac6870dbb483f73eab7348ef4d5709af14
1409 F20101119_AACTQB aggarwal_v_Page_02thm.jpg
bf0158764b21b51dee52e811098dabbe
545d659e3d147e462efd6f5a03240b1f6a3b9c41
28812 F20101119_AACTSZ aggarwal_v_Page_22.pro
4102497626767b34aeb81ebad09d8d20
5aaadf2ffc17aadcd6a7510feeda08623d0b6dba
5769 F20101119_AACTLE aggarwal_v_Page_42thm.jpg
dba8de3d167ed3e2a8ef7301b0de1a52
c31e8ea5f0dbb7d98e9a6dd3ee23d345fe3210b5
23916 F20101119_AACTXX aggarwal_v_Page_48.QC.jpg
e02852e03abf1e563539b17079ee9548
2b6bc2398ad9000eb9b46f1f7ccdd7039ce9f611
6106 F20101119_AACTGH aggarwal_v_Page_69.QC.jpg
b5da6f5188f949401681758ae0fcc389
d72996392998cba28abf6597e71f3cac78d004d7
1550 F20101119_AACTQC aggarwal_v_Page_39.txt
7740b5fb209bd753626b28c1fda0ac7d
fd7e5a290b3b2eb7eb013cc25583d07406866ee3
24075 F20101119_AACTLF aggarwal_v_Page_12.QC.jpg
4e6f950afade05e8b5c010c6b7e19b8a
62fdd3357aaf37adab035dd18951050cb18442e0
5614 F20101119_AACTXY aggarwal_v_Page_52thm.jpg
50b54a1abc80cc53a24420657f04a97c
2144395c4b84ff41c4c926b7242416e4eaf8ca27
F20101119_AACTVA aggarwal_v_Page_37.tif
e2261688fcda2faae0c8020e2c956f9b
f92d6dc6e6341552e50cf6d7b08171ea7f3bbe36
106031 F20101119_AACTGI aggarwal_v_Page_45.jp2
7deb0ef0f31349b0ee37a6adf59c0d14
c6bc5b3fd643d164eb95ce361f103d2d3f00e9c8
1864 F20101119_AACTQD aggarwal_v_Page_49.txt
34dc34b8f408c1267c8a64a6d5b44e60
858e20989130bb72fce86ad26678a74334ba084b
13393 F20101119_AACTXZ aggarwal_v_Page_06.QC.jpg
928184e3ac773a57465e75ae23f7697a
8d3e2130cf3879fea44b599b2cddf117a585d6c0
F20101119_AACTVB aggarwal_v_Page_38.tif
f7a1ccbf6a23fc0d7aa3ec3cf12f52fe
69d8d8cdb75937ce70a5d778ae95080a158930cd
32239 F20101119_AACTGJ aggarwal_v_Page_65.jp2
89fae6bc531daa565295e66da2e1162e
f73f7ac21bb809364a880565a6247dcd173821b3
6317 F20101119_AACTQE aggarwal_v_Page_34thm.jpg
a41482c41d8a97eaabc811160b1307d5
866e416d254ddebf9b5b8fdce8acbcf4a5bf8aa5
F20101119_AACTLG aggarwal_v_Page_22.tif
34d8363eae34509591b8f6f7a46e3f50
3232d29541ae37a2097d418622d5f5fb226ace13
F20101119_AACTVC aggarwal_v_Page_41.tif
70c28309fc7b43ff572ae1a98f265c0b
05822a60089b5702e5a95892d756cf063f1a5b36
111289 F20101119_AACTGK aggarwal_v_Page_12.jp2
2b5a1037d97887b9abdbcf668bf8a0d8
8a986f95610bd7528ad53b5da71f982c7aae745a
59664 F20101119_AACTQF aggarwal_v_Page_50.jpg
ee22366a6b21793ae4f2f89acb57883f
855edf61ae5e314d4ffefcd8812b2b84bf3793cc
72346 F20101119_AACTLH aggarwal_v_Page_46.jpg
8fcd968f273421efec13e5fcbd83a024
748985b951b7a4dbc5eb3ad39e92587ef1165c11
F20101119_AACTVD aggarwal_v_Page_43.tif
f2dcc2b9701a7c55b9592c135330141b
d370afb47f0b8905dc919388efc6c0d2b807c775
21621 F20101119_AACTGL aggarwal_v_Page_16.QC.jpg
2fad68c20f547b5924b2fe707df5baee
57f5c5b9d53719999d6e732958657cf9b690369b
51124 F20101119_AACTQG aggarwal_v_Page_36.pro
8836b9f0ac3f5f33e3497db3e24c8418
696e8a82d3455abfbd8aec2d323632f1dd565740
242161 F20101119_AACTLI aggarwal_v_Page_05.jp2
ee460a8762a9a18f05b400b129828678
c6cb8c7e69d6226e23a8402c7e62a04e16df1b8f
F20101119_AACTVE aggarwal_v_Page_48.tif
b3f54df16ed42c19cf297759ca6d79b0
d61969c7226df9148f5bd00fe418fe9d83ee0403
72298 F20101119_AACTGM aggarwal_v_Page_04.jpg
dcefcd39b1d7d13b5abcda9825400eac
ad685e6f5adbd32b92ee8de3454cba73671d5069
76934 F20101119_AACTQH aggarwal_v_Page_07.jpg
d0f1827a15261f8ae61f6448291f90c3
6815d54451012908fa98550d60505382a3e97648
7605 F20101119_AACTLJ aggarwal_v_Page_01.pro
9c25629713b7e617e37dbf39210889b7
c6fe80bca69a389453fb9d51ea44aedbb921c81c
F20101119_AACTVF aggarwal_v_Page_49.tif
be88a9091b5869340117a50e82f67081
d66ddd6bba39bd1114e42e568bb10cd72a0a87bd
39923 F20101119_AACTGN aggarwal_v_Page_46.pro
eb135fd6f5bf2ef27039f0561b7ac9bb
53febcc177990d0d0a6ce93e25a4a364c8b02988
6160 F20101119_AACTQI aggarwal_v_Page_30thm.jpg
a7fee807a75f43bcb737443928abb8df
c5f4f4af3797c46612e71aa91e27e04e6b3651b7
F20101119_AACTLK aggarwal_v_Page_09.tif
9a3d28b7e29056a4ae5008ae5499fcf8
9f0830efa9e4dc573dbf9e7bc3f60d8a6f10bd29
F20101119_AACTVG aggarwal_v_Page_52.tif
65dd929215b4af180db990ff75432c34
dc9ec7b4ee65a7e0ad9bc70fc44b4dae60ba83a6
1994 F20101119_AACTGO aggarwal_v_Page_31.txt
b2e2d50cdc1cb8ed4fe377a2006ec71b
de433a61a48c82bc5d837a2870f9213165ebc5d6
20050 F20101119_AACTQJ aggarwal_v_Page_42.QC.jpg
5d5fce79a64495fa725903fa82a442cf
697df684a38b40938fea4315692ff2ca4ed15260
13656 F20101119_AACTLL aggarwal_v_Page_65.pro
1867c9e1cce4c40ee05ae0d5ed37da59
eb1417d6a41c1578dbc46a7c09a03fd2a322b256
F20101119_AACTVH aggarwal_v_Page_53.tif
ca036e48d2eba38c1842db95a05cf9a7
d270dc39e48c170547468b48d915a1b900d14a2e
4539 F20101119_AACTGP aggarwal_v_Page_27thm.jpg
e64779588a923656c2067fcd9f0aa06f
066047025c7a15684d73d89ce026508dfdf83db2
36147 F20101119_AACTQK aggarwal_v_Page_28.pro
e74db7f0f7aa3d8bb92b8ccde8fc472f
00ad094fd255e74cc937e059795676d55626ef72
23794 F20101119_AACTLM aggarwal_v_Page_19.QC.jpg
ee441a30db1ac226f3771311086bb9d3
9cc088f07e2b14198ea71478eea6fb6489e966e6
F20101119_AACTVI aggarwal_v_Page_54.tif
23c396ec1674f4686deb19eec3596a02
8ba60a339f763de8474997913ff454091361c146
1016785 F20101119_AACTGQ aggarwal_v_Page_48.jp2
d50dc574fda8e69ad160336214df1337
d842a144363478b4fae32d1df8059c802d28919e
16633 F20101119_AACTLN aggarwal_v_Page_22.QC.jpg
bbe53980345e5553f5fc3af3e8a30fbc
ed3bababbf6477e77e34d1be7ce1b2c2e3991a26
F20101119_AACTVJ aggarwal_v_Page_60.tif
e4426aa332fa280e07d3c99f5d5c30f4
5680fcdbc3eb8a790193b2272a3ccbeeca9f982f
22759 F20101119_AACTGR aggarwal_v_Page_47.QC.jpg
890a4143bd67d7d1a8ec28104d49581a
89eef41a4a16cbfca7aaeae883aa45f7729016f7
14966 F20101119_AACTQL aggarwal_v_Page_08.pro
35944f7ca52b51800f8e13849c76f9cb
0d08d43ba15666c6f323ce4b17d4ea1092c53a94
23639 F20101119_AACTLO aggarwal_v_Page_25.QC.jpg
66b5fc5b34c62a683ca7c9062fdcbe88
12d3c2b6a60f4eb7ae5ae49449c3248dc8c43106
17523 F20101119_AACTGS aggarwal_v_Page_35.QC.jpg
ae7d909b4a464f5c6a32025e07711bb2
66cb7691c8f03486ee7233c3b8bb72240c606afd
4852 F20101119_AACTQM aggarwal_v_Page_05.QC.jpg
0b47b319db7431a5e2660993f2873c03
eaeff40a4d4ff62016222a3c8302cceb7bac118d
40758 F20101119_AACTLP aggarwal_v_Page_03.jp2
4c2672e9fa0efe778bd401695cb97178
3f8beb650d25aa915bb773302c91f9a69fea4580
F20101119_AACTVK aggarwal_v_Page_63.tif
e9382d286dc80f7d2bb0a5ba917499f3
b89166e41b254f732eff19e1d41e49a5b4793017
F20101119_AACTVL aggarwal_v_Page_64.tif
a2040f84e41b98c07956852975cb287f
900330f9c9701eb1db7c316d50f38cb1aa159aaf
F20101119_AACTGT aggarwal_v_Page_17.tif
e346576ae328611ca2c3b3e3613524a3
e7a6f16f019cd68f478744b4e9ffc3267114cce8
26159 F20101119_AACTQN aggarwal_v_Page_21.QC.jpg
8439557933df3fed1ecf2c70cea39274
1e50caad2b4422c9cb44db4b47d5324cd141fe37
F20101119_AACTLQ aggarwal_v_Page_61.tif
10480b147e3d31dd927e32e4c2db90c5
ad2394c484d0f12fdac4b309119cf9acae59f35a
F20101119_AACTVM aggarwal_v_Page_65.tif
1322e863ce527b52764b99f39bd616f8
d7d4bf67f9bfb7fdbff0012937e2969bd0f64087
75397 F20101119_AACTGU aggarwal_v_Page_38.jpg
7b5ca79663d1e5be842e9202a6270db3
44ad67a1f065b9ae8fc8c376119794759537ecd0
2020 F20101119_AACTQO aggarwal_v_Page_41.txt
ab025a26a25a63ea529a655e95fde989
a752e9567905cd7fc682f9d4365eafa3a719378f
24042 F20101119_AACTLR aggarwal_v_Page_31.QC.jpg
f09340d6a0520025da0bc8fb11d02f68
7314a75505539edd77830a800fd940fff18e3a5f
F20101119_AACTVN aggarwal_v_Page_69.tif
8e09341bdddf42112f7c7c3b29f17bf0
1a88cd6f0427ee1e152eb59f142d4cd08feae8ea
516 F20101119_AACTGV aggarwal_v_Page_05.txt
7524915b4d4334963f445bcabc0113ed
94f63453153097e4fc0a02ac209ce812f0ed04d4
6229 F20101119_AACTQP aggarwal_v_Page_47thm.jpg
2eaf5478526c970de5ba573907b1e875
595e1bc8d9dfb00950c2613bcf5f2ddf100393f9
46345 F20101119_AACTLS aggarwal_v_Page_06.jpg
8bd39a4097449b291d3f18303376e9bb
663ac1056d6ff0ee9f42ab4a9cbb59ee7e1783e1
1717 F20101119_AACTVO aggarwal_v_Page_02.pro
919d03a107c031e307b8188db51100aa
5e9a931580f6043b9c436b6e124c0853ae66f804
F20101119_AACTGW aggarwal_v_Page_33.tif
c299da2b0b856e5300893c23f4bf7281
e83c85db61f8843d1db5b35a734174244a00e946
6358 F20101119_AACTQQ aggarwal_v_Page_59thm.jpg
dc78ec411a901d0d36525ee802e32d7c
ac8fd7f623a530f07b372fb80caca9e7a4c722a2
40468 F20101119_AACTLT aggarwal_v_Page_52.pro
728fac72615f1ee4391d9b3f3894c4f3
e19655421ddcfa2e186d46db54c8aeb5480857d9
72589 F20101119_AACTVP aggarwal_v_Page_04.pro
4249aff4311b55717b957b5bacd74986
7dd0c9b828a07949ac76f415893c17f6dc5b0824
20565 F20101119_AACTGX aggarwal_v_Page_51.QC.jpg
353f0aafd440b78eca421f08b24f1b70
17eef3e7ff862c2eb2689321a810ff734a36a555
114557 F20101119_AACTQR aggarwal_v_Page_14.jp2
90c57078f630db6d4ce429dc4f70a5c7
fd0cbf98386f5562d6661309fffac06fa52f0641
F20101119_AACTLU aggarwal_v_Page_70.tif
5059b64fd554ffd115a2e7a6d1ad0b21
d9f65e1d53da5ad0b0543001d339bbf3bc10fce1
75327 F20101119_AACTGY aggarwal_v_Page_49.jpg
aa9e80e7f1eba2823bf2de1abcdbced3
ef66c5de08c5679cf47f967b9b4598d7b8b445c7
5549 F20101119_AACTQS aggarwal_v_Page_39thm.jpg
5b43d9b65d05625c0212a18f825a5e88
78df1cd3ecc15c2816b87d9b6ff6626bee13e062
8679 F20101119_AACTLV aggarwal_v_Page_69.pro
a69783685cf7d130d8d316ac4e45536c
8e96095f9a58e5e349af7ebea52ed46b8af4582a
29742 F20101119_AACTVQ aggarwal_v_Page_06.pro
df738e916a436c85922f26096581debf
12c839e2104ea21c00e37d58fbbe6416ebd982c7
3603 F20101119_AACTGZ aggarwal_v_Page_06thm.jpg
140ff5f5dbede688ce3232b5d3d22fd5
ba0aa90fcad50349f8f3714774ba9401f2615187
5514 F20101119_AACTQT aggarwal_v_Page_18thm.jpg
9bdfb76a9a0d85b1e08beacf4fa26126
0d2b53acd09c1cf5ac5e786d5e3548f02c38330f
1865 F20101119_AACTLW aggarwal_v_Page_53.txt
f1ca301e3c2cb756d206aa98babb1b02
cbef133c8e8d6cb97dcbb035c9fd0dd007aae9cd
37610 F20101119_AACTVR aggarwal_v_Page_09.pro
4a4bf1af01c3b17fa56920ea0f5a7f86
5152ffc20a953009575e07f0094d7831395618ad
F20101119_AACTQU aggarwal_v_Page_59.tif
6e57efa1abaee547f905392ff939c3c4
fdfe6045f80205957b9c95e00cf5d626fc195c87
42300 F20101119_AACTLX aggarwal_v_Page_30.pro
63aa113a0ba06e94b75169b5e1ae2abd
75b1a4d86c4d023a6550b8d8315d70200d97319d
3448 F20101119_AACTVS aggarwal_v_Page_10.pro
9b545481a3ddf7bf7727950ac523ad70
67257bf911a4c9900d33548c5c6876bb32520c0d
46073 F20101119_AACTQV aggarwal_v_Page_29.jpg
14f1962adf3f7ce6a2f6e3638e557f58
b9a64349498b25a30028fb5019f42860bdad36e3
1990 F20101119_AACTJA aggarwal_v_Page_12.txt
576bd2ecdefec9554cde4374f8b8b8a6
1f968cc7f5d3bfa648a5908f02c6b5fdd9d827ce
1051832 F20101119_AACTLY aggarwal_v_Page_58.jp2
cf6526c60d5333a3ec6a752dda56dc46
cc6accc46ee61caa3b9ec9e1f66a6d4fe08207b2
44165 F20101119_AACTVT aggarwal_v_Page_11.pro
0d81a8a8188c32b9a6ce55981494c919
cddbd2076dc16c49f3b1569e62ffb6dfdc21e8b0
71829 F20101119_AACTQW aggarwal_v_Page_60.jpg
5ff36c882fe1b43195e2a94c85ca0493
64fbd055f266a636eae39de4fd92cd54aacc0f0c
1148 F20101119_AACTJB aggarwal_v_Page_24.txt
a0e6d408174e82aee8609676d8836d7e
01a387c931ab6193d5ccc7f022952778594e95ce
44038 F20101119_AACTLZ aggarwal_v_Page_49.pro
683df560893caf31b81162b64bffa4c2
886b2c519fa75c98f0fac4560463cf8d34383d43


5 CONCLUSIONS AND FUTURE WORK.............................................................52

LIST OF REFERENCES.........................................................................................56

BIOGRAPHICAL SKETCH.....................................................................................60


v

LIST OF TABLES


Table                                                                                      page

3.1 Post place and route results showing the resource utilization for a 32x32-bit
      divider and the maximum frequency of operation...............................................27

3.2 Quantization error introduced by fixed-point arithmetic......................................28

3.3 Performance results tabulated over 600 data points (slices occupied: 2%)..........31

4.1 Performance comparison of a processor with an FPGA configured with design
      #1..........................................................................................................................37

4.2 Comparison of the execution times on the FPGA with a Xeon processor and
      resource requirements of the hardware configuration for design #2...................42

4.3 Components of the execution time on an FPGA for processing eight rows of the
      input image using design #3.................................................................................46

4.4 Components of the total execution time on the FPGA for processing a single
      scale of input data with different levels of concurrent processing......................46

LIST OF FIGURES


Figure                                                                                     page

2.1 Various computation platforms on the flexibility and performance spectrum........8

2.2 A simplistic view of an FPGA and a logic cell inside that forms the basic
      building block..........................................................................................................9

2.3 Block diagram description of the RC1000 board. Courtesy: RC1000 reference
      manual...................................................................................................................12

2.4 The steps involved in data acquisition and data processing..................................14

2.5 Quad-tree data structure.........................................................................................16

3.1 Data generation and validation..............................................................................25

3.2 Set of sequential processing steps involved in the 1-D Kalman filter...................25

3.3 Differences between Matlab-computed estimates and Handel-C fixed-point
      estimates using eight bits of precision after the binary point for 100 data points.29

3.4 Processing path for (a) calculation of the parameters and (b) calculation of the
      estimates................................................................................................................30

3.5 High-level system block diagram..........................................................................32

4.1 Simulated observations corresponding to the 256x256 resolution scale...............34

4.2 Block diagram of the architecture of design #1.....................................................36

4.3 Block diagram of the architecture of both the Kalman filter and smoother..........38

4.4 Error statistics of the output obtained from the fixed-point implementation........39

4.5 Performance improvement and change in resource requirements with increase
      in concurrent computation....................................................................................41

4.6 Block diagram of the architecture of design #2.....................................................42

4.7 Performance improvement and change in resource requirements with increase
      in the number of concurrently processed pixels...................................................44

4.8 Block diagram of the architecture of design #3.....................................................45

4.9 Improvement in performance with increase in concurrent computations.............47

4.10 Error statistics for the outputs obtained after a single scale of filtering..............48

4.11 Block diagram of the hardware architecture of Nallatech's BenNUEY board
      with a BenBLUE-II extension card.......................................................................49

4.12 Block diagram of the hardware architecture of the HRSC board........................50

Abstract of Thesis Presented to the Graduate School
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Master of Science

REMOTE SENSING AND IMAGING
IN A
RECONFIGURABLE COMPUTING ENVIRONMENT

By

Vikas Aggarwal

December 2005

Chair: Kenneth C. Slatton
Cochair: Alan D. George
Major Department: Electrical and Computer Engineering

In recent years, there has been a significant improvement in the sensors employed
for data collection. This has greatly increased the amount of data involved, rendering
the conventional techniques of data collection, dissemination, and ground-based
processing impractical in several situations. The possibility of on-board processing has
opened new doors for real-time applications and reduced the bandwidth demands on the
downlink. Reconfigurable computing (RC), an emerging paradigm in the field of
high-performance computing, could be employed as the enabling technology for such
systems, where conventional computing resources are constrained by the factors
described later.

This work explores the possibility of deploying reconfigurable systems in remote
sensing applications. As a case study, a data fusion application, which combines the
information obtained from multiple sensors of different resolutions, is used to perform a
feasibility analysis. The conclusions drawn from the different design architectures for
the test application are used to identify the limitations of current systems and to propose
future systems enabled with RC resources.


CHAPTER 1
INTRODUCTION

Recent advances in sensor technology, such as increased resolution, frame rate, and
number of channels, have resulted in a tremendous increase in the amount of data
available for imaging applications, such as airborne and space-based remote sensing of
the Earth, biomedical imaging, and computer vision. Raw data collected from the sensor
must usually undergo significant processing before it can be properly interpreted. The
need to maximize processing throughput, especially on embedded platforms, has in turn
driven the need for new processing modalities. Today's data acquisition and
dissemination systems must perform more processing than their predecessors to support
real-time applications and to reduce the bandwidth demands on the downlink. Though
cluster-based computing resources are the most widely used platform at ground stations,
several factors, such as space, cost, and power, make them impractical for on-board
processing. FPGA-based reconfigurable systems are emerging as low-cost solutions that
offer enormous computational potential in both the cluster-based and embedded systems
arenas.

Remote sensing systems are employed in many different forms, covering the gamut
from compact, power- and weight-limited satellite and airborne systems to much larger
ground-based systems. The enormous number of sensors involved in the data collection
process places heavy demands on the I/O capabilities of the system. The problem can be
approached from two directions: performing on-board data compression before
transmission, or performing some of the computation on board and transmitting the
processed data. To support real-time applications, the target computation system must be
capable of processing multiple data streams in parallel at a high rate, which further
increases the complexity of the problem. The nature of the problem demands a
processing system that is not only capable of high performance but also able to deliver
excellent performance per unit cost, where cost includes factors such as power, space,
and system price.
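
To make the bandwidth argument concrete, the following back-of-the-envelope
sketch compares the raw output rate of a notional multi-sensor payload against a
notional downlink budget. Every parameter here (frame size, radiometric depth, frame
rate, number of sensors, and link rate) is assumed purely for illustration and is not a
figure from this thesis:

    # Hypothetical illustration: raw sensor data rate vs. a downlink budget.
    # All parameters below are assumed for illustration only.

    rows, cols = 1024, 1024     # frame size in pixels
    bits_per_pixel = 16         # radiometric depth
    frame_rate = 30             # frames per second
    num_sensors = 4             # concurrent data streams

    raw_rate = rows * cols * bits_per_pixel * frame_rate * num_sensors  # bits/s
    downlink = 100e6            # assumed 100 Mbit/s downlink budget

    print(f"raw sensor rate: {raw_rate / 1e6:.1f} Mbit/s")
    print(f"downlink budget: {downlink / 1e6:.1f} Mbit/s")
    print(f"required on-board data reduction: {raw_rate / downlink:.1f}x")

Under these assumed numbers the payload produces roughly 2 Gbit/s, about twenty
times the link capacity, which is precisely the gap that on-board compression or
on-board computation must close.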

A plethora of publications has demonstrated success in porting remote sensing and
image processing applications to FPGA-based platforms [1-5]. Some researchers have
also created high-level algorithm development environments that expedite the porting
and streamlining of such application code [6-8]. However, understanding the special
needs of this class of applications, and analyzing existing platforms to determine their
viability as future computation engines for remote sensing systems, warrants further
research and examination. Identifying components that are essential for such systems
but missing from current platforms is the focus of this work. A data fusion application
has been chosen as representative of the class of remote sensing applications because it
incorporates a wide variety of features that stress different aspects of the target
computation system.

The recent interest in sensor research has led to a multitude of sensors on the market, which differ drastically in their phenomenology, accuracy, resolution and

quantity of data. Traditionally, Interferometric Synthetic Aperture Radar (InSAR) has

been employed for mapping extended areas of terrain with moderate resolution. Airborne

Laser Swath Mapping (ALSM) has been increasingly employed to map local elevations









at high resolution over smaller regions. A multiscale estimation framework can then be

employed to fuse the data obtained from such different sensors having different

resolutions to produce improved estimates over large coverage areas while maintaining

high resolution locally. The nature of the processing involved imposes enormous

computation burden on the system. Hence, the target system should be equipped with

enormous computation potential.

Since their inception, early processing systems have fallen into two separate camps.

The first camp saw a need to accommodate wide varieties of applications with multiple

processes running concurrently on the same system, and therefore chose General-Purpose

Processors (GPP) to serve their needs. The other camp preferred to improve on the speed

of the application and chose to leverage the performance advantages of the Application-

Specific Integrated Circuits (ASICs). Over a period of time, these two camps strayed

further apart in terms of processing abilities, flexibility and costs involved.

Meanwhile, due to the technological advancements over the past decade,

Reconfigurable Computing (RC) has garnered a great deal of attention from both the

academic community and industry. Reconfigurable systems have tried to fuse the merits

of both camps and have proven to be a promising alternative. RC has demonstrated speedups on the order of 10 to 100 in comparison to GPPs for certain application domains such as image and signal processing [2, 4, 5, 9]. An even

more remarkable aspect of this relatively new programming paradigm is that the

performance improvements are obtained at less than two thirds of the cost of

conventional processors. FPGA-enabled reconfigurable processing platforms have even

outperformed the ASICs in market domains including signal processing and cryptography









where ASICs and DSPs have been the dominant modalities for decades. The past decade has seen tremendous growth in RC technology, but the technology is still in its infancy. The

development tools, target system architectures and even processes for porting the

applications need to mature before they can make meaningful accomplishments.

However, RC-based designs have already shown performance speedups in

application domains such as image processing which require similar processing patterns

to many remote-sensing applications. Conventional processor-based resources cannot be

employed in such applications because of their inherent limitations of size, power and

weight, which RC-based systems can overcome. The structure of imaging algorithms

lends itself to a high degree of parallelism that can be exploited in hardware by the

FPGAs. The computations are often data-parallel, require little control, and operate on large data sets (effectively infinite streams) whose raw sensor data elements do not have large bit widths, making them amenable to RC. However, this class of applications has

three characteristics which make them challenging. First, they involve many arithmetic

operations (e.g. multiply-accumulates and trigonometric functions) on real and/or

complex data. Second, they require significant memory support, not just in the capacity

of memory but also in the bandwidth that can be supported. Third, the scale of

computation is large, requiring possibly hundreds of operations in parallel and

high-bandwidth interconnections to meet real-time constraints. These challenges must be

addressed if RC systems are to significantly impact future remote-sensing systems. This

work explores the possibility of deploying reconfigurable systems in remote-sensing

applications using the chosen test case for feasibility analysis. The conclusions drawn

from different design architectures for the test applications are used to identify the









limitations of the current systems and propose solutions to enable future systems with RC

resources.

The structure of the remaining document is as follows. Chapter 2 presents a brief

background on reconfigurable computing using FPGAs as the enabling technologies and

data fusion using multiscale Kalman filters/smoothers. It also presents a discussion on

the related research in the field of reconfigurable computing as applied to remote sensing

and similar application domains. Chapter 3 presents some tests performed for initial

study and feasibility analysis. The experiments are based on designs of the 1-D Kalman

filter, which forms the heart of the calculations involved in the data fusion application.

Chapter 4 presents a sequence of revisions to 2-D filter designs developed to solve the

problem along with the associated methodologies. Each of these designs builds on the

limitations identified in the previous design and proposes a better solution under the

given system constraints. Their performance is compared with a baseline C code running

on a Xeon processor. Several graphs and tables that are derived from the results are also

presented. The final architecture in the chapter emulates the performance of an ideal

system, allowing it to outperform the processor-based solution by achieving over an

order of magnitude speedup. Chapter 5 summarizes the research and the findings of this

work. It draws conclusions based on the results and observations presented in the

previous chapters. It also gives some future directions for work beyond this thesis.














CHAPTER 2
BACKGROUND AND RELATED RESEARCH

This work involves an interdisciplinary study of remote-sensing applications and a

new paradigm in the field of high-performance computing: Reconfigurable Computing.

The fast development times of reconfigurable devices, their density, and advanced features such as optimized compact hardwired cores, programmable interconnects, memory

arrays, and communication interfaces have made them a very attractive option for both

terrestrial and space-/air-borne applications. There are multiple advantages of equipping

future systems with reconfigurable computation engines. First, they help in overcoming

the limited bandwidth problem on the downlink. Second, they create a possibility of

providing several real-time applications on-board. Third, they can also be used for

feedback mechanisms to change data collection strategy in response to the quality of the

received data or for changing instrumentation planning policy.

This thesis aims at designing different architectures for a test application to analyze

various features of the existing platforms and suggest necessary improvements. The

hardware designs for the FPGA are implemented using the Handel-C language (with DK-

3 as its integrated development environment) to expedite the process as compared to the

conventional HDL design flow. This chapter provides a comprehensive discussion on

different aspects of reconfigurable computing with a brief description of the application.

To summarize the existing research, a brief review of the relevant prior work in this field

and other related fields is also presented in this chapter.









2.1 Reconfigurable Computing

This section presents some history on the germination, progress and recent

explosion of this relatively new computing paradigm. It also describes the rise over time in the usage of programmable logic devices in general and concludes with a

detailed discussion on the enabling technology for RC: FPGAs.

2.1.1 The Era of Programmable Hardware Devices

From the infancy of Application-Specific Integrated Circuits (ASICs) the designers

could foresee a need for chips with specialized hardware that would provide enormous

computational potential. The state of the art of IC fabrication technology limited the

amount of logic that could be packed into a single chip. During the 1980s and 90s the

fabrication technology matured and saw drastic improvements in many fabrication

processes, and the era of VLSI began. The high development and fabrication costs

started dropping as the 90s saw an explosion in the demand for such products.

The 1980s and 90s also saw the birth of the "killer microprocessors" [10], which started

capturing a big portion of the market. The faster time to market for the products

motivated many to forgo ASIC and adopt general-purpose processors (GPP) or special-

purpose processors such as digital signal processors (DSPs). While this approach

provided a great deal of success in several application domains with relative simplicity, it

was never able to match the performance of the specialized hardware in the high-

performance computing (HPC) community. Real-time systems and other HPC systems

with heavy computational demands still had to revert to ASIC implementations. To

overcome the limitation of high non-recurring engineering (NRE) costs and long

development times, an alternative methodology was developed: Programmable Logic

Devices (PLDs). PLDs started playing a major role in the early 90s. Since they provided









faster design cycles and mitigated the initial costs, they were soon adopted as inexpensive

prototyping tools to perform design exploration for ASIC-based systems. As the

technology matured, the application of PLDs expanded beyond their role as "place

holders" into essential components of the final systems [11].

Due to their ability to be programmed in the field, developers could foresee the

PLDs playing a major role in HPC where they could offer many advantages over

conventional GPPs and ASICs. The GPP and the ASIC have existed at two extremes on

the spectrum of computational resources. The key trade-off has been that of flexibility,

where GPPs have held the lead in the market, and performance, where the ASICs have

overshadowed the former. PLDs (also known as RC engines because of their ability to be

reprogrammed) have made a strong impact in the market by providing the best of both worlds.









Figure 2.1. Various computation platforms on the flexibility and performance spectrum









2.1.2 The Enabling Technology for RC: FPGAs

As gate density further improved, a particular family of PLDs, namely FPGAs,

became an especially attractive option for researchers. An FPGA consists of an array of

configurable logic elements along with a fully programmable interconnect fabric capable

of providing complex routing between these logic elements [12]. Figure 2.2 presents an

oversimplified structure of an FPGA. The routing resources represented in the diagram

by horizontal and vertical wires that run between the Configurable Logic Blocks (CLBs)

consume over 75% of the area on the chip. The flexible nature of an FPGA's architecture

provides a platform upon which applications could be mapped efficiently. The ability to

reconfigure the logic cells and the connections allows for modifying the behavior of the

system even after deployment. This feature has important implications for supporting

multiple applications, as an FPGA-based system could eliminate the need of creating a

new system each time a new application is to be ported.


Figure 2.2. A simplistic view of an FPGA and a logic cell inside that forms the basic
building block [13].









Modern FPGAs are embedded with a host of advanced processing blocks such as

hardware multipliers and processor cores to make them more amenable for complex

processing applications.

One of the several advantages that FPGAs offer over conventional processors is

that they are massively parallel computing machines that lend themselves well to applications with inherent fine-grain parallelism. Because a "farm" of CLBs can operate completely independently, a large number of operations can take place on-chip simultaneously, unlike in most other computing devices. The ability to compute concurrently, together with the high memory bandwidth offered by internal RAMs, gives FPGAs an edge over

DSPs for several signal processing applications. Highly pipelined designs help further in

overlapping and hiding latencies at various processing steps. The execution time for the

control system software is difficult to predict on modern processors because of caches,

virtual memory, pipelining and several other issues which make the worst-case

performance significantly different from the average case. In contrast, the execution time

on the FPGAs is deterministic in nature, which is an important factor for time-critical

applications.

Although FPGAs provide a powerful platform for efficiently creating highly

optimized hardware configurations of applications, the process of configuration

generation can be quite labor-intensive. Hence, researchers have been looking for

alternative ways of porting applications with relative ease. Several graphical tools and

higher-level programming tools are being developed by vendors that speed up the design

cycle of porting an application on the FPGA. This thesis makes use of one such high-

level programming tool called Handel-C (started as a project at Oxford University and









later developed into a commercial tool by Celoxica Inc.) [8] to enable fast prototyping

and architectural analysis. Handel-C provides an extension and somewhat of a superset

of standard ANSI C including additional constructs for communication channels and

parallelization pragmas while simultaneously removing support for many ANSI C

standard libraries and other functionality such as recursion and compound assignments.

DK, the development environment that supports Handel-C, provides floating-point and

fixed-point libraries. The compiler can produce synthesizable VHDL or an EDIF netlist

and supports functional simulation. Handel-C and its corresponding development

environment have been used previously in numerous other projects including image

processing algorithms [14] and other HPC benchmarks [2].

The most common way in which RC-based systems exist today is as extensions to

conventional processors. The FPGAs are integrated with memory and other essential

components on a single board which then attaches to the host processor through some

interconnect such as a Peripheral Component Interconnect (PCI) bus.

This work makes use of an RC1000 board, a PCI-based board developed by

Celoxica Inc. equipped with a VirtexE 2000 FPGA. Hardware configurations for the

board can be generated using the DK development environment or following the more

conventional design path of VHDL. Figure 2.3 shows a block diagram of the RC1000

board. The card consists of 8MB of memory that is organized into four independently

accessible banks of 2MB each. The memory is accessible both to the FPGA and any

other device on the PCI bus. The FPGA has two clock inputs on the global clock buffer.

The first pin derives its clock from a programmable clock source or an external clock,

whereas the second pin derives its clock from the second programmable clock source or











the PCI bus clock. The board supports a variety of data transfers over the PCI bus,


ranging from bit- and byte-wide register transfers to DMA transfers into one of the


memory banks.
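As a rough illustration of the resulting host-side flow, the sketch below uses hypothetical wrapper functions (they stand in for, but are not, the actual RC1000 driver calls): the host DMA-transfers the inputs into a bank, signals the FPGA, and reads the results back.

    #include <stdint.h>
    #include <stddef.h>

    /* Hypothetical host-side wrappers -- illustrative only, not the
     * real RC1000 API. */
    void dma_write_bank(int bank, const void *buf, size_t bytes);
    void dma_read_bank(int bank, void *buf, size_t bytes);
    void signal_fpga(void);      /* e.g., a status-register write   */
    void wait_for_fpga(void);    /* poll/interrupt until completion */

    void run_filter(const int16_t *in, int16_t *out, size_t n)
    {
        dma_write_bank(0, in, n * sizeof *in);   /* inputs to bank 0  */
        signal_fpga();                           /* start the design  */
        wait_for_fpga();                         /* block until done  */
        dma_read_bank(1, out, n * sizeof *out);  /* results, bank 1   */
    }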


Recently, researchers have felt the need for developing stand-alone FPGA systems


that could be accessed over traditional network interfaces. Such an autonomous system


with an embedded processor and an FPGA [15] (a novel concept developed by


researchers at the HCS Lab known as the Network-Attached Reconfigurable Computer or


NARC) offers a very cost-effective solution for the embedded computing world,


especially where power and space are at a premium.


[Diagram: host PCI bus through a PCI-PCI bridge and PLX PCI9080 interface, clocks and control logic, four 512K x 32 SRAM banks surrounding the Xilinx V2000E (BG560 package), isolation logic, linear +3v3/+2v5 regulator, and auxiliary I/O]



Figure 2.3. Block diagram description of RC1000 board. Courtesy: RC1000 reference
manual [16].









2.2 Remote-Sensing Test Application: Data Fusion

In the past couple of decades, there has been a tremendous amount of research in

sensor technology. This research has resulted in a rapid advancement of related

technologies and a plethora of sensors in the market that differ significantly in the quality

of data they collect. One of the most important applications that have attracted

overwhelming attention in remote sensing is that of mapping topographies and building

digital elevation maps of regions of earth using different kinds of sensors. These maps

are then employed by researchers in different disciplines for various scientific

applications (e.g., in oceanography for estimating ocean surface heights and the behavior of currents, in geodesy for estimating the Earth's gravitational equipotential, etc.).

Traditionally, satellite-based systems equipped with sensors like InSAR and

Topographic Synthetic Aperture Radar (TOPSAR) had been employed to map extended

areas of topography. But, these sensors lacked high accuracy and produced images of

moderate resolution over the region of interest. Recently, ALSM has emerged as an

important technology for remotely sensing topographies. The ALSM sensor provides extremely accurate, high-resolution maps of local elevations, but operates through a very exhaustive process which limits the coverage to smaller regions.

Because of the varying nature of the data produced by these sensors, researchers

have been developing algorithms that fuse information obtained at different resolutions

into a single elevation map. A multiscale estimation framework developed by Feiguth

[17] has been employed extensively over the past decade for performing efficient

statistical analysis, interpolation, and smoothing. This framework has also been adopted

to fuse ALSM and InSAR data having different resolutions to produce improved

estimates over large coverage areas while maintaining high resolution locally.









2.2.1 Data Acquisition Process

Before delving into the mathematics of the estimation algorithm, a brief description

of the current process of data acquisition is presented to aid in the understanding of this

work and the motivation behind it. The process of collecting ALSM data involves flying

the sensor in an aircraft over the region of interest. As the aircraft travels through each

flight line, the sensor scans the regions of interest below in a raster scan fashion in a

direction orthogonal to the movement of the aircraft.
























Figure 2.4. The steps involved in data acquisition and data processing

With the help of several other position sensors and movement measurement

instruments, the XYZ coordinates of mapped topography are generated and stored on

disks in ASCII format. These XYZ coordinates are then used to generate a dense 3-D

point cloud of irregularly spaced points. This 3-D data set, when gridded in the X and Y directions at varying grid spacings, results in 2-D maps of corresponding resolutions.
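As a concrete illustration of this gridding step, the following minimal C sketch (an illustration only; the averaging rule and all names are assumptions, not the processing chain actually used) bins irregularly spaced points onto a regular grid by averaging the elevations falling in each cell:

    #include <math.h>

    #define N 256   /* grid resolution, e.g. the finest ALSM scale */

    /* Bin npts irregularly spaced (x, y, z) points onto an N x N grid
     * by averaging the elevations that fall into each cell. Empty
     * cells are marked NAN; a real pipeline would interpolate them.
     * (x0, y0) is the grid origin and cell is the grid spacing. */
    void grid_points(const double *x, const double *y, const double *z,
                     int npts, double x0, double y0, double cell,
                     double grid[N][N])
    {
        static int count[N][N];

        for (int r = 0; r < N; r++)
            for (int c = 0; c < N; c++) {
                grid[r][c] = 0.0;
                count[r][c] = 0;
            }

        for (int i = 0; i < npts; i++) {
            int c = (int)((x[i] - x0) / cell);  /* column from X */
            int r = (int)((y[i] - y0) / cell);  /* row from Y    */
            if (r < 0 || r >= N || c < 0 || c >= N)
                continue;
            grid[r][c] += z[i];
            count[r][c]++;
        }

        for (int r = 0; r < N; r++)
            for (int c = 0; c < N; c++)
                grid[r][c] = count[r][c] ? grid[r][c] / count[r][c]
                                         : NAN;
    }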









These images are then employed in the multiscale estimation framework with the SAR

images to fuse the data sets and produce improved maps of topography. Because of the

lack of processing capabilities on aircraft, these operations cannot be performed in real-

time and hence data is stored on disk and processed offline in ground stations. Several

applications could be made possible if on-board processing facilities were made available

on the aircraft. Such a real-time system would offer several advantages over the

conventional systems. For example, it could be used to change the data collection

strategy or repeat the process over certain selected regions in response to the quality of

data obtained. RC-based platforms as described in the previous section form a perfect fit

for being deployed in such systems. Although this work closely deals with an aircraft-

based target system, the issues involved are very generic and apply to most other remote

sensing systems with some exceptions. As a result some issues might not be addressed in

this work, for example the effect of radiations which have important implications on

satellite-based systems, requiring some kind of redundancy is provided to overcome

single event upsets (SEUs), do not affect an aircraft-based system. While it is desirable

to design a complete system that could be deployed on-board an aircraft, doing so would

entail a plethora of implementations issues that divert the focus from the more interesting

aspects of this work explored through research. Hence, instead of building an end-to-end

system, this work will focus on the data fusion application employed in the entire process

(Figure 2.4) and will use it as a test case to analyze the feasibility of deploying RC-based

systems in the remote sensing arena. The following sub-section provides a description of

this data fusion algorithm.









2.2.2 Data Fusion: Multiscale Kalman Filter and Smoother

The multiscale models which are the focus of this thesis were proposed by Fieguth

et al. [17] and provide a scale-recursive framework for estimating topographies at

multiple resolutions. This multiresolution estimation framework enables

highly efficient statistical analysis, interpolation, and smoothing of extremely large data

sets. The framework also enjoys a number of other advantages not shared by other

statistical methods. In particular, the algorithm has complexity that grows only linearly

with the number of leaf nodes. Additionally, the algorithm provides interpolated estimates at

multiple resolutions along with the corresponding error variances that are useful in

assessing the accuracy of the estimates. For these reasons, and many more, researchers

have adopted this algorithm for various remote-sensing applications.

Multiscale Kalman smoothers modeled on fractional Brownian motion are defined on index sets, organized as multi-level quad-trees as shown in Figure 2.5. The

multiscale estimation is initiated with a fine-to-coarse sweep up the quad-tree that is

analogous to Kalman filtering with an added merge step. This fine-to-coarse sweep up

the quad-tree is followed by a coarse-to-fine sweep down the quad-tree that corresponds

to Kalman smoothing.





Figure 2.5. Quad-tree data structure where m represents the scale.
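When each scale of the quad-tree is stored as a dense 2-D array, the parent/child relations used below reduce to simple index arithmetic. A small C sketch (the index conventions are illustrative assumptions, not taken from the thesis):

    /* Quad-tree index arithmetic for dense per-scale arrays.
     * Scale m holds (1 << m) x (1 << m) pixels; m = 0 is the root.
     * The parent of node (r, c) at scale m is (r/2, c/2) at scale
     * m - 1; its q = 4 children live at (2r + dr, 2c + dc), m + 1. */
    typedef struct { int m, r, c; } node_t;

    node_t parent(node_t s)        /* "s gamma-bar" in the text */
    {
        node_t p = { s.m - 1, s.r / 2, s.c / 2 };
        return p;
    }

    node_t child(node_t s, int i)  /* "s alpha_i", i = 0..3 */
    {
        node_t a = { s.m + 1, 2 * s.r + i / 2, 2 * s.c + i % 2 };
        return a;
    }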









The statistical process defined on the tree is related to the observation process and has the

coarse-to-fine scale mapping defined as follows

x(s) = A(s) x(s\bar{\gamma}) + B(s) w(s)    (1)

y(s) = C(s) x(s) + v(s)    (2)

where

    s             represents an abstract index for a node on the tree
    \bar{\gamma}  represents the lifting operator; s\bar{\gamma} is the parent of s
    x(s)          represents the state variable
    y(s)          represents the observation (LIDAR or InSAR)
    A(s)          represents the state transition operator
    B(s)          represents the stochastic detail scaling function
    C(s)          represents the measurement-state relation
    w(s)          represents the white process noise
    v(s)          represents the white measurement noise
    \bar{\alpha}  represents the lowering operator; s\bar{\alpha}_i is the i-th child of s
    q             represents the order of the tree, i.e., the number of descendants a parent has

The process noise w(s) is Gaussian with zero mean and covariance given by the following relation:

E[w(s) w^T(t)] = I \delta_{s,t}    (3)

w(s) ~ N(0, I)    (4)

The prior covariance at the root node is given by

x_0 = x(0) ~ N(0, P_0)    (5)









The parameters A(s), B(s), C(s) that define the model need to be chosen appropriately to

match the process being modeled. The state transition operator was chosen to be '1' to create a model where each child node is the true value of the parent node offset by a small value dependent on the process noise. The parameter B(s) is obtained using power spectral matching or fractal dimension classification methods. The measurement-state relation matrix C(s) was assigned as '1' for all pixels to represent the case where

observations are present at all pixels without any data dropout.

Corresponding to any choice of the downward model, an upward model on the tree

can be defined as in Fieguth et al. [17]

x(s\bar{\gamma}) = F(s) x(s) + \bar{w}(s)    (6)

y(s) = C(s) x(s) + v(s)    (7)

F(s) = P_{s\bar{\gamma}} A^T(s) P_s^{-1}    (8)

E[\bar{w}(s) \bar{w}^T(s)] = P_{s\bar{\gamma}} (I - A^T(s) P_s^{-1} A(s) P_{s\bar{\gamma}}) = \bar{Q}(s)    (9)

where P_s is the covariance of the state, defined as E[x(s) x^T(s)].

Now, the algorithm can proceed with the two steps outlined above (upward and

downward sweep) after initializing each leaf node with prior values,

\hat{x}(s | s+) = 0    (10)

P(s | s+) = P_s    (11)

A) Upward sweep

The operations involved in the upward sweep are very similar to a 1-D Kalman filter

which forms the heart of the computation and can be perceived as running along the scale









on each pixel with an additional merge step after every iteration. The calculations

performed at each node are as follows

V(s) = C(s) P(s | s+) C^T(s) + R(s)    (12)

K(s) = P(s | s+) C^T(s) V^{-1}(s)    (13)

P(s | s) = [I - K(s) C(s)] P(s | s+)    (14)

\hat{x}(s | s) = \hat{x}(s | s+) + K(s) [y(s) - C(s) \hat{x}(s | s+)]    (15)

The Kalman filter prediction step is then applied at all nodes except the leaf nodes which

were initialized as mentioned above

\hat{x}(s | s\alpha_i) = F(s\alpha_i) \hat{x}(s\alpha_i | s\alpha_i)    (16)

P(s | s\alpha_i) = F(s\alpha_i) P(s\alpha_i | s\alpha_i) F^T(s\alpha_i) + \bar{Q}(s\alpha_i)    (17)

This leads to a prediction estimate of the parent node from each descendant (i = 1, ..., q); these are then merged into a single estimate value to be used in the measurement update

step.


\hat{x}(s | s+) = P(s | s+) \sum_{i=1}^{q} P^{-1}(s | s\alpha_i) \hat{x}(s | s\alpha_i)    (18)

P(s | s+) = [(1 - q) P_s^{-1} + \sum_{i=1}^{q} P^{-1}(s | s\alpha_i)]^{-1}    (19)

This process is iterated over all the scales (m) until the root node is reached, generating estimates at multiple scales based on the statistical information acquired from the descendant layers. The completion of the upward sweep yields a smoothed estimate, \hat{x}^s(0) = \hat{x}(0 | 0), at the root node.
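Since the model used in this work chooses A(s) = C(s) = 1 with a scalar state per pixel, Equations (12)-(19) collapse to a handful of scalar operations per node. The following C sketch (a simplification under those assumptions, not the hardware design itself) shows the update, predict, and merge sequence for one parent and its q = 4 children:

    /* Measurement update at node s (Eqs. 12-15), scalar case with
     * C(s) = 1: x and P hold the prior x^(s|s+), P(s|s+) on entry
     * and the updated x^(s|s), P(s|s) on return. */
    void ms_update(double y, double R, double *x, double *P)
    {
        double V = *P + R;          /* innovation variance, Eq. (12) */
        double K = *P / V;          /* Kalman gain,         Eq. (13) */
        *x += K * (y - *x);         /* state update,        Eq. (15) */
        *P *= (1.0 - K);            /* covariance update,   Eq. (14) */
    }

    /* Predict each of the q = 4 children up to the parent (Eqs. 16-17)
     * and merge the predictions into the parent prior (Eqs. 18-19).
     * xc/Pc are the children's updated values; F and Qbar come from
     * the upward model; Ps is the prior variance of the parent state. */
    void ms_predict_merge(const double xc[4], const double Pc[4],
                          double F, double Qbar, double Ps,
                          double *x_prior, double *P_prior)
    {
        double sum_xP = 0.0, sum_invP = 0.0;
        for (int i = 0; i < 4; i++) {
            double xp = F * xc[i];               /* Eq. (16) */
            double Pp = F * Pc[i] * F + Qbar;    /* Eq. (17) */
            sum_xP   += xp / Pp;
            sum_invP += 1.0 / Pp;
        }
        *P_prior = 1.0 / ((1.0 - 4.0) / Ps + sum_invP);  /* Eq. (19) */
        *x_prior = *P_prior * sum_xP;                    /* Eq. (18) */
    }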

B) Downward Sweep









The smoothed estimates for the remaining nodes are computed by propagating the

information back down the tree in a smoothing step.

\hat{x}^s(s) = \hat{x}(s | s) + J(s) [\hat{x}^s(s\bar{\gamma}) - \hat{x}(s\bar{\gamma} | s)]    (20)

P^s(s) = P(s | s) + J(s) [P^s(s\bar{\gamma}) - P(s\bar{\gamma} | s)] J^T(s)    (21)

J(s) = P(s | s) F^T(s) P^{-1}(s\bar{\gamma} | s)    (22)

where

    \hat{x}^s(s)  represents the smoothed estimates
    P^s(s)        represents the corresponding error variances

It is worth mentioning here that the set of computations outlined above, despite being

closely coupled, contains two independent processing paths. This fact is further exploited in

Chapters 3 and 4 where the designs for targeting the hardware are explored.

2.3 Related Research

There has been an abundance of publications over the past decade by researchers

who have tried accelerating various image processing algorithms on FPGAs. There has

also been a good deal of academic and industry effort in deploying such dual-paradigm

systems (which make use of both conventional processing resources and RC-based

resources) in space for improving on-board performance. However, only a limited

amount of work has been done in understanding the nature of processing involved in such

applications to identify their specialized demands from the target systems.

One of the earliest works on designing a Kalman filter for an FPGA was by Lee

and Salcic in 1997 [18], where they attempted to accelerate a Kalman tracking filter for

multi-target tracking radar systems. They achieved an order of magnitude improvement

with their designs that spread over six 8000-series Altera chips in comparison to previous









attempts in [19-22] that targeted transputers, digital signal processors, and linear arrays

for obtaining improved performance over software implementations. In an application

note from Celoxica Inc. [23], Chappel, Macarthur et al. present a system implementation

for boresighting of sensor apertures using a Kalman filter for sensor fusion. The system

utilizes a COTS-based FPGA which embeds a 32-bit softcore processor to perform the

filtering operation. Their work serves as a classical example of developing a low-cost

solution for embedded systems using FPGAs. There have also been other works [24-25]

that perform Kalman filtering using an FPGA for solving similar problems such as

implementation of a state space controller and real-time video filtering. In [25], Turney,

Reza, and Delva have astutely performed pre-computation of certain parameters to reduce

the resource requirements of the algorithm.

The floating-point calculations involved in signal processing algorithms are not

amenable to FPGA or hardware implementations in general, so researchers have resorted

to fixed-point implementations and have been exploring ways to mitigate the errors hence

induced. In [18], Lee and Salcic employ normalization of certain coefficients involved

by the process variance to maximize data accuracy with a fixed number of bits or

minimize the resource requirements for a certain level of accuracy. There has also been

plenty of work done on the algorithmic side to overcome such effects. In [26], Scharf

and Siggurdsson present a study of scaling rules and round-off noise variances in a fixed-point implementation of a Kalman filter.

The 1-D Kalman filter involves heavy sequential processing steps and hence cannot

fully exploit the fine-grain parallelism available in FPGAs. The multiscale Kalman filter

by contrast involves independent operations on multiple pixels in an image and offers a









high degree of parallelism (DoP) that is representative of the class of image processing

algorithms. The possibility of operating on multiple pixels in parallel has motivated

many researchers to target different imaging algorithms on FPGAs. Researchers [2, 4, 5,

9] have presented several examples illustrating the performance improvements obtained

by porting imaging algorithms like 2-D Fast Fourier Transform, image classification,

filtering, 2-D convolution and edge detection on the FPGA-based platforms. Dawood,

Williams and Visser have developed a complete system [27] for performing image

compression using FPGAs on-board a satellite to reduce bandwidth demands on the

downlink. Several researchers have even developed a high-level environment [6-7] to

provide the application programmers a much easier interface for targeting FPGAs for

imaging applications. They achieve this goal by developing a parameterized library of

commonly employed kernels in signal/image processing, and these cores can then be

instantiated from a high-level environment as needed by the application.

Employing RC technology in the remote sensing arena is not a new concept and

several attempts have been made previously to take advantage of this technology. Buren,

Murray and Langley in their work [28] have developed a reconfigurable computing board

for high-performance computing in space using SRAM-based FPGAs. They have

addressed the special needs of such systems and identified some key components that are essential to success, such as high-speed dedicated memories for the FPGAs, high I/O bandwidth, and support for periodic reloading to mitigate radiation effects. In [3], Arribas and Macia have developed an FPGA board for a

real-time vision development system that has been tailored for the embedded

environment. Besides the academic research community, industries have also shown









keen interest in the field. Honeywell has developed a "Honeywell Reconfigurable Space

Computer" (HRSC) [29] board as a prototype of the RC adaptive processing cell concept

for satellite-based processing. The HRSC incorporates system-level SEU mitigation

techniques and facilitates the implementation of design-level techniques. In addition to

hardware development research, an abundance of work has been done on developing

hardware configurations for various remote-sensing applications. In [1], a SAR/GMTI

range compression algorithm has been developed for an FPGA-based system. Sivilotti,

Cho et al., in their work in [30], have developed an automatic target detection application

for SAR images to meet the high bandwidth and performance requirements of the

application. Other works in [27, 31-33] discuss the issues involved in porting similar

applications, such as geometric global positioning and sonar processing, to FPGAs.

This thesis aims to further the existing research in this field by developing a multiscale

estimation application for an FPGA-enabled system and exploring different architectures

to meet the system requirements.














CHAPTER 3
FEASIBILITY ANALYSIS AND SYSTEM ARCHITECTURE

This chapter presents a discussion on some of the issues involved and some initial

experiments performed for feasibility analysis. The results of these tests influenced the

choice of design parameters and architectural decisions made for the hardware designs of

the algorithm presented in the next chapter.

3.1 Issues and Trade-offs

Most signal processing algorithms executed on conventional platforms employ

double-precision, floating-point arithmetic. As pointed out earlier, such floating-point

arithmetic is not amenable to hardware implementations on FPGAs, as it has large

logic area requirements (a more detailed comparison is presented in the next subsection).

Carrying out these processing steps in fixed-point arithmetic is desirable but introduces

quantization error which, if not controlled, can lead to errors large enough to defeat the

purpose of hardware acceleration. Hence, there exists an important trade-off between the

number of bits used and the amount of logic area required. There are techniques that

mitigate these effects by modifying certain parts of the algorithm itself. Examples of

such techniques include normalizing different parameters to reduce the dynamic range of variables and using variable bit precision for different parameters, with more bits for more significant variables. This work makes use of fixed- and floating-point libraries

available in Celoxica's DK package. To perform experimental analysis, the simulated

data is generated using Matlab (the values for the simulated data were chosen to closely

represent the true values of data acquired from the sensors). Hence, a procedure is









required to verify the models in the FPGA with the data generated in Matlab. Text files

are used in this work as a vehicle to carry out the task. Since Matlab produces double-

precision, floating-point data while the FPGA requires fixed-point values, extra

processing is needed to perform the conversion.



[Diagram: Matlab simulation -> .txt file containing data -> Handel-C/VHDL hardware]

Figure 3.1. Data generation and validation

Another important issue that can become a significant hurdle arises from the nature

of the processing involved in the Kalman filtering algorithm. The Kalman filter

equations are recursive in nature and require the current iteration estimate value to begin

the next state calculation. This behavior is clearly visible from Figure 3.2 which shows

the processing steps in a time-tracking filter.

Measurement update:
    K_k = P_k^- H_k^T (H_k P_k^- H_k^T + R_k)^{-1}
    \hat{x}_k = \hat{x}_k^- + K_k (z_k - H_k \hat{x}_k^-)
    P_k = (I - K_k H_k) P_k^-
Time update (seeded with initial prior values):
    \hat{x}_{k+1}^- = \Phi_k \hat{x}_k
    P_{k+1}^- = \Phi_k P_k \Phi_k^T + Q_k

where x_k represents the state variable, P_k the associated error covariance, z_k the observation input, Q_k the process noise variance, R_k the measurement noise variance, and \Phi_k the state transition operator.

Figure 3.2. Set of sequential processing steps involved in 1-D Kalman filter
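In software, this recursion is a short sequential loop. A minimal scalar C sketch (parameter names are illustrative; this is not the baseline code used for the timing results) follows:

    /* Scalar 1-D time-tracking Kalman filter (the recursion of
     * Figure 3.2). phi: state transition; h: measurement relation;
     * q, r: process/measurement noise variances; x0, p0: priors. */
    void kalman_1d(const double *z, double *xhat, int n,
                   double phi, double h, double q, double r,
                   double x0, double p0)
    {
        double x = x0, p = p0;               /* initial prior values */
        for (int k = 0; k < n; k++) {
            /* measurement update */
            double K = p * h / (h * p * h + r);
            x = x + K * (z[k] - h * x);
            p = (1.0 - K * h) * p;
            xhat[k] = x;
            /* time update: the next iteration needs these results,
             * which is the data dependency that defeats pipelining */
            x = phi * x;
            p = phi * p * phi + q;
        }
    }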









This problem cannot be mitigated by pipelining the different stages, because of the data

dependency that exists from the last stage of the pipeline to the first stage. Although this

can be a major roadblock for 1-D tracking algorithms, the situation can be made much

better in the multiscale filtering algorithm because of the presence of an additional

dimension where the parallelism can be exploited along the scale. Multiple pixels in the

same scale can be processed in parallel as they are completely independent of each other.

The number of such parallel paths would ultimately depend on the amount of resources

required by the processing path of each pixel pipeline. Another interesting but subtle

trade-off exists between the memory requirements and the logic area requirements.

The hardware architecture of the algorithm could be made to reuse some resources on the

chip (e.g. the arithmetic operations that are instantiated could be reused, especially

modules such as multipliers and the dividers which consume excessive area) by saving

the intermediate results in the on-board SRAM. This approach decreases the logic area

demands of the algorithm, but at the cost of increased memory requirements and also the

extra clock cycles required for computing each estimate.

3.2 Fixed-point vs. Floating-point arithmetic

A study was performed to compare the resource demands posed by floating-point

operations as opposed to fixed-point and integer operations of equal bit widths. Since a

division operation is a very expensive operation in hardware, it yields more meaningful

differences and was hence chosen as the test case. IEEE single-precision format was

employed for floating-point division. A Xilinx VirtexII chip was chosen as the target

platform for the experiment to exploit the multiplier components present in the device

(and also partly due to the fact that the current version of the tool does not support









floating-point division on any other platform). Table 3.1 compares the resource

requirements for the different cases.

Table 3.1. Post place and route results showing the resource utilization for a 32 x 32 bit
divider and the maximum frequency of operation.

    Target chip: VirtexII 6000    Package: ff1152    Speed grade: -4

                              Integers    Fixed-point (16 bits       IEEE single-precision
                              (32 bits)   before/after binary pt.)   floating-point (32 bits)
    Slices (total 33,792)     19 (1%)     84 (1%)                    487 (1%)
    18 x 18-bit multipliers
      (total 144)             3 (2%)      6 (4%)                     4 (2%)
    Max. frequency            63.2 MHz    50.5 MHz                   97.5 MHz

The high costs involved in the floating-point operations are clearly visible from the table.

The high frequency obtained for the floating-point unit, which appears as an anomaly,

merely represents the efficiency of the cores used by the library in the tool. Once the cost

savings obtained by resorting to the fixed-point operations have been identified, we need

to understand the error introduced through this process. To analyze this error, multiple

designs of the 1-D filter were developed with different bit widths of fixed-point

implementations in Handel-C. Although simulation was adopted to generate the outputs,

the designs were made to closely represent hardware models such that minimal changes

could translate them into hardware configurations. The simulation outputs were

compared with Matlab's double-precision floating results. Mean square error (MSE)

between the filter estimates and the actual expected outputs were used as a metric for

comparison as shown in Table 3.2.









Table 3.2. Quantization error introduced by fixed-point arithmetic (averages are over 100
data points). First column defines the number of bits before and after the
binary point.
    Fixed-point    MSE in Matlab        MSE in Handel-C         MSE from      Max. abs. error
    precision      (double-precision)   (% error from Matlab)   Matlab        from Matlab
    8.8            0.6490               0.6294 (3%)             9.515e-5      0.0631
    8.5            0.6490               2.7179 (300%)           2.7995        3.4405
    8.3            0.6490               3.6348 (460%)           3.1135        3.92

Although the maximum and average tolerable level of error is largely dependent on the

nature of the application, it is clearly evident from Table 3.2 that eight bits of precision

after the binary point yields reasonable performance with less than 0.5% of maximum

absolute error. As is also visible from the table, the accuracy decreases rapidly with

reduction in bit width. The number of bits required before the binary point was kept

constant as it is dictated by the dynamic range of the data set and not by the nature of

processing. These observations influenced the selection of eight bits of precision after the

binary point for the hardware architecture of the multiscale filter. Figure 3.3 depicts the

same information by plotting the difference between the results obtained from Matlab

floating-point and Handel-C fixed-point implementations (with 8 bit precision) for each

data point of the time series. It is worth mentioning here that the quantization error

values presented in the graphs and tables are specific to the Kalman filter and depend on

the type of processing involved (the multiplication and division having the worst effects

on quantization errors). The recursive nature of processing involved over the scales in a

multiscale filter may tend to accumulate the error at each scale and lead to larger error

values than those presented in this section.
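For reference, arithmetic in the 8.8 format used above can be emulated in C with 16-bit integers scaled by 2^8; the sketch below (an illustration, not Celoxica's fixed-point library) makes the source of the quantization error in Table 3.2 explicit:

    #include <stdint.h>

    /* 8.8 fixed point: 8 integer bits, 8 fraction bits (scale = 256). */
    typedef int16_t fx8_8;

    #define FX_ONE 256                      /* 1.0 in 8.8 format */

    static fx8_8  fx_from_double(double v) { return (fx8_8)(v * FX_ONE); }
    static double fx_to_double(fx8_8 v)    { return (double)v / FX_ONE;  }

    /* Widen to 32 bits for the intermediate product, then shift back.
     * The >> 8 discards fraction bits: this is the quantization error. */
    static fx8_8 fx_mul(fx8_8 a, fx8_8 b)
    {
        return (fx8_8)(((int32_t)a * (int32_t)b) >> 8);
    }

    /* Pre-scale the dividend so the quotient keeps 8 fraction bits. */
    static fx8_8 fx_div(fx8_8 a, fx8_8 b)
    {
        return (fx8_8)(((int32_t)a << 8) / (int32_t)b);
    }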












Figure 3.3. Differences between Matlab computed estimates and Handel-C fixed point
estimates using eight bits of precision after the binary point for 100 data
points.

3.3 One-dimensional Time-tracking Kalman Filter Design

A 1-D time tracking filter (represented by the Equations in Figure 3.2) was

designed for the RC1000 FPGA board with a VirtexE 2000 chip (more details in Chapter

2). The design could not be pipelined because of the existence of worst-case data

dependence from the output of the current iteration to the input of the next iteration.

Performing all of the computation in hardware proved exorbitantly bulky and occupied

about 12% of the slices for one instantiation of the filter. In addition, the long

combinational delay introduced by the division operator led to a considerably low

design frequency of about 3.9 MHz. To make these designs of any practical importance,

we need to overcome these problems or find a way to mitigate their effect. Revisiting the

algorithm and taking a closer look at the equations involved yields some interesting

information.











Figure 3.4. Processing path for (a) calculation of parameters and (b) calculation of the
estimates.

The block diagram of computation steps reveals the existence of two independent

processing chains. This fact has important implications for future hardware designs as it

allows a reduction of the actual amount of computation that needs to be performed in

hardware. The calculation of the estimate error covariance (Ppnew) and the filter gain (K)

is completely independent of observation input and generated estimates and hence can be

done offline even prior to data collection. The pre-computation of these filter parameters

has multiple advantages. It reduces the logic area requirements from about 12% to under

3% and eliminates the division operator from the critical path increasing the maximum

design frequency to about 90 MHz. In addition, it also allows the possibility of changing

the filter parameters by merely replacing the set of the pre-computed parameters,

introducing a sort of virtual reconfiguration capability where the behavior of the filter

changes without reconfiguring the FPGA. But these benefits come at the cost of extra









memory requirements for storage of the filter parameters, a trade-off that was mentioned

earlier. The reduction in area is an important consideration for the 2-D filter design as it

allows a larger number of pixels to be processed in parallel. With these modifications a

1-D filter was developed for the RC1000 board and performance experiments were

conducted with data sets containing 600 data points. The latency incurred for transferring

all the values, one byte at a time, over the PCI bus hampers the performance. To

overcome this limitation, DMA is used to transfer all the data values onto the on-board

SRAM. The performance results for both the DMA and non-DMA case are shown in

Table 3.3 below and compared against Matlab results. The timing results in Matlab

yielded variations in multiple trials (which is attributed to the way Matlab works

internally and handles memory). For this reason further experiments were performed

using a C-based solution for obtaining software execution times.

Table 3.3. Performance results tabulated over 600 data points (slices occupied: 2%)

    Code version                      Execution time    MSE
    Matlab (running on P4 @ 3 GHz)    0-16 ms           0.6563
    FPGA (non-DMA)                    49.5 ms           0.6526
    FPGA (DMA)                        1 ms              0.6526

    Components of FPGA execution time (DMA case):
        DMA write of 600 data values:     64 us
        DMA read of 600 values:           44 us
        Computation time for 600 values: 285 us
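The pre-computation described above can be expressed directly in software. The following C sketch (a simplified scalar version with illustrative names) tabulates the gain and covariance sequences before any observation arrives, since neither depends on the data:

    /* Offline pre-computation of the 1-D filter parameters
     * (Figure 3.4a). Neither K nor P depends on the observations,
     * so the whole sequence can be tabulated before data collection;
     * the FPGA then evaluates only the estimate path (Figure 3.4b). */
    void precompute_params(double *K, double *Pseq, int n,
                           double phi, double h, double q, double r,
                           double p0)
    {
        double p = p0;
        for (int k = 0; k < n; k++) {
            K[k] = p * h / (h * p * h + r);  /* gain for step k       */
            p = (1.0 - K[k] * h) * p;        /* updated covariance    */
            Pseq[k] = p;
            p = phi * p * phi + q;           /* time update for k + 1 */
        }
        /* K[] and Pseq[] would then be quantized to 8.8 fixed point
         * and transferred into an SRAM bank ahead of the data stream. */
    }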

3.4 Full System Architecture

Figure 3.5 depicts a high-level block diagram of the system architecture. To

provide more functionality and enhance the usability, a display mechanism may be

included in the final system depending on the needs of the application. Since this work

focuses on just a part of the entire system (data fusion/estimation), the inputs are not










directly obtained from the sensors. Instead, they go through several stages of processing

before being converted into a form compatible with the system shown in

Figure 3.5. Most of these processing stages are also performed in ground-based stations

currently, but some of the processing is also performed on-board the aircraft. Similarly,

the output could be used directly for providing visual feedback or may need to pass

through some further post processing stages before being in a directly usable format.


[Diagram: pre-processing supplies the raw image and filter parameters to on-board memory feeding the filter logic on the FPGA; the output passes through fixed-point to real-number conversion, with filtered image data sent to a video controller and display unit]




Figure 3.5. High-level system block diagram.

As shown in the diagram, the system heavily depends on on-board memory which

preferably should be organized in multiple independently addressable banks. The filter

parameters corresponding to a chosen filter model will be stored in one/multiple memory

bank(s). These filter parameters define the behavior of the algorithm and hence multiple

sets of such parameters could be stored on off-board memory and transferred into on-









board memory as needed. The input image coming from one of the pre-processing

blocks is distributed in one of the multiple memory blocks reserved for input by a

memory controller. The provision of multiple input banks allows overlapping the input

data transfer time with the computation time of the previous set of data. Spreading the

filter parameters in multiple banks also aids in reducing the computation time by

allowing multiple parameters to be read in parallel. The test system on which all the

experiments are performed consists of a PCI-based card residing in a conventional Linux

server. Hence, it may not exactly mirror the system just outlined, and could involve some

additional issues such as limited memory, input/output transfer latencies over the PCI

bus, etc. The goals of this work include identifying limitations in the current systems that

hamper the performance, and proposing additional features that can enhance the

efficiency of the system, on the basis of the results obtained from experiments.














CHAPTER 4
MULTISCALE FILTER DESIGN AND RESULTS

This chapter presents different hardware designs developed for the multiscale

Kalman filter. Each subsequent design explores the opportunity to further improve

performance and builds on the shortcomings discovered in the previous design. Results

of the timing experiments are presented and analyzed to assess performance bottlenecks.

4.1 Experimental Setup

Before presenting the hardware designs developed for the RC1000 target platform,

a brief discussion about the input data set that was used is presented in order to set the

experiment up for results and analysis. The test data was generated by simulation from

Matlab to emulate the Digital Elevation Map (DEM) obtained over a topographical

region.













Figure 4.1. Simulated observations corresponding to 256 x 256 resolution scale.









The highest resolution observation was chosen to have a support of 256 x 256 pixels and

represents the data set corresponding to the one generated by an ALSM sensor. This

image resolution hence gives rise to nine scales in the quad tree structure. Another set of

observations was generated for a coarser scale, having a support of 128 x 128, representing the data generated from InSAR. Figure 4.1 depicts the (finer-scale) data set,

which can be seen to have four varying levels of roughness for different regions of the

image. Such a structure was chosen to incorporate the data corresponding to different

kinds of terrains such as plain grasslands (smooth) and sparse forests (rough) in a single

data set. This structure of the simulated observation was created by following the

fractional Brownian model and populating the nodes in the quad tree structure starting

from the root node using the equations from Section 2.2.2. As with the case of 1-D filtering,

pre-computation is employed in this case as well to reduce the resource requirements.

Hence, the filter parameters needed for the online computation of estimates are also

generated using the equations in Section 2.2.2 (namely K(s), C(s), F(s), P(s|s+), and P(s|s\alpha)).

The parameters along with the observation set required approximately 870 KB of

memory when represented in the 8.8 fixed-point format. The small footprint of data, fed

as input to the FPGA, provides several opportunities of exploiting the on-board memory

in different ways as demonstrated further in the chapter. Some designs might require

additional storage because of details of the architecture. It is also worth noting that the

structure of computations in the 2-D filter, though similar to the 1-D case, has extra

operations due to the merge step involved for combining the child nodes into a parent

pixel.









4.2 Design Architecture #1

The block diagram representing the hardware configuration (as for RC1000 card)

of this design is shown in Figure 4.2. Pre-processed filter parameters are transferred via

DMA into one of the SRAM banks. Although the transfer has to go over the PCI bus and

incurs a substantial latency in the experimental system, this overhead has been separated

in the results because the actual target system might have faster means of such transfer.

Besides, the latency is just a one-time effect and can be avoided by preloading the parameters of the desired filter model. Hence, the input latency will only be visible in the virtual reconfiguration cases where a new set of parameters needs to be transferred into the

on-board memory.


[Diagram: DMA engine connecting the host (or data source) to the SRAM banks, which feed a multi-stage pixel-processing pipeline over 16-bit data paths]




Figure 4.2. Block diagram of the architecture of design #1.

This design basically exploits the data parallelism available in the algorithm by pipelining

the processing over multiple pixels. The pipeline performs the computation for four

pixels simultaneously as this lends itself well to the quad tree structure of the application.

The major performance hurdle is experienced at the memory port. Since each stage of

the pipeline processing requires some parameters to be read from memory, a resource









conflict exists. Hence, a stall is required at every stage of the pipeline, which breaks down the pipelining and its associated advantages to a large extent. Table 4.1 compares the execution time for the algorithm on a 2.4 GHz Xeon with the time on the RC board. The FPGA on the board is clocked at 30 MHz; better performance can be obtained by increasing the clock frequency, which can be achieved by using more advanced chips and by further optimizing the design.

Table 4.1. Performance results of an FPGA with an N = 4 pipeline configured with design #1.

    Execution time on    Single scale (256 x 256)    All scales (till 4 x 4)
    RC1000               15.14 ms                    ...
    2.4 GHz Xeon         ...                         ...

    Resource utilization: 17% of the slices
    Memory: ~850 KB (observations and filter parameters)
    DMA latency for the data over the PCI bus: 1 scale: 3.1 ms; all scales: 3... ms
The times are presented for both a single scale and all scales of processing. The computation is terminated when the image reduces to just 4x4 pixels because the overheads dominate the actual processing time. The amount of resources occupied by this configuration is also listed beside the table. With just 17% of logic utilization for the processing of four pixels, enough area is left to increase the amount of concurrent computation by instantiating more pixels in the processing chain. The values in the table show that the FPGA-based filter runs about 1.5 times faster than a conventional processor. In the embedded systems arena, the absolute performance of a system is less relevant, and performance per unit cost is considered a better metric for comparison. Similarly, the raw performance numbers might not show an order of magnitude improvement, but they do come at about one hundredth of the running cost of a competing system. The lessons learned from this design point to the fact that memory

bandwidth is a crucial factor for obtaining better performance for the application. The

resource hazards need to be eliminated to take complete advantage of the pipelined

structure.

The previous discussion related to the Kalman filtering step of the application, which populates the nodes on the tree going upward. This application also involves a

smoothing step to generate the estimates while traversing the tree from top to bottom.

Recursive application of the filtering pipeline generates multiple sets of estimates at

different scales, which are then used by the computations in the second step. The

structure of calculations involved is similar to the filtering step, but requires some

intermediate data values to be saved in addition to the outputs as in the previous case.

This further increases the memory bandwidth demands of the system, but since these

calculations only begin after the completion of the upwards step, they are not in conflict.

Figure 4.3 shows the modifications required to incorporate these effects. The additional

data has been stored in the empty memory banks, which allows them to be read in

parallel for the "smoothing" pipeline without any stall cycles.


Figure 4.3. Block diagram for the architecture of both the Kalman filter and smoother.









Another set of parameters (also described by the equations in Section 2.2.2) is required for the smoothing operations and is stored in the same memory bank with the other parameters. The design shown was spread across two chips by implementing both of the

operational pipelines as independent designs. This result was achieved by creating two

separate FPGA configuration files and reconfiguring the FPGA on the RC1000 board with

the second file after the completion of the upward sweep step, in effect emulating a multi-

FPGA system. This technique allows for higher computational potential and also

provides the possibility of pipelining the upward and downward sweep operations for

multiple data sets on a higher conceptual level. For this part of the experiment, the

observations were just limited to the finest scale (representing the LIDAR data) which

implies that no additional statistical information is incorporated in the filtering step

except for the finest scale. Hence, the smoothed estimates could be obtained by

performing the smoothing starting just one scale above the observation scale. Observations at multiple scales would offer more than one advantage: they not only increase the amount of computation available for concurrent execution but also help mitigate the precision errors that tend to accumulate across scales in the application.
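As a rough illustration of this two-configuration flow, the host-side control logic can be sketched in C as follows. This is a minimal sketch only: the board-interface functions (load_bitstream, dma_write, dma_read, run_and_wait) are hypothetical stand-ins for the RC1000 host-library calls, not its actual API.

    #include <stdio.h>
    #include <stddef.h>

    /* Hypothetical board-interface stubs; a real host program would call
     * the RC1000 support library here instead. */
    static void load_bitstream(const char *file) { printf("configure FPGA: %s\n", file); }
    static void dma_write(const void *buf, size_t n) { (void)buf; printf("DMA to SRAM: %zu bytes\n", n); }
    static void dma_read(void *buf, size_t n)  { (void)buf; printf("DMA from SRAM: %zu bytes\n", n); }
    static void run_and_wait(void) { printf("run design until done\n"); }

    int main(void)
    {
        static short observations[256 * 256];  /* finest-scale data in 8.8 fixed point */
        static short estimates[256 * 256];     /* smoothed estimates read back */

        /* Upward sweep: Kalman filtering from the finest scale toward the root. */
        load_bitstream("filter.bit");
        dma_write(observations, sizeof observations);
        run_and_wait();

        /* Swap configurations and run the downward (smoothing) sweep; the
         * intermediate values remain in the on-board SRAM banks between sweeps. */
        load_bitstream("smoother.bit");
        run_and_wait();

        dma_read(estimates, sizeof estimates);
        return 0;
    }

Because only the FPGA is reconfigured between the two sweeps, the SRAM contents persist, which is what makes the single-board emulation of a two-FPGA system possible.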










[Figure panels: Matlab-generated estimates, FPGA-generated estimates, absolute error.
Maximum absolute error from Matlab: 0.4249; MSE: 0.0119; maximum error percentage:
less than 2% in most cases.]

Figure 4.4. Error statistics of the output obtained from the fixed-point implementation.









The entire design, consisting of both the filtering and the smoothing pipelines, processes four pixels simultaneously; the filtering pipeline alone accounts for 17% of the slices. Figure 4.4 compares the results obtained from the hardware version with the MATLAB double-precision floating-point results.

Because of the similarity in the structure of the two pipelines, and the extra memory demands placed on the system by including both of them, the following designs focus on the filtering part of the application. A projection of the performance attainable from the filter design is created by scaling the architecture to process more pixels in parallel, in effect filling up the unused area on the chip. The problem that hinders the performance gain is the set of resource hazards that arise: increasing the number of concurrently processed pixels increases the memory I/O demands. As a result, extra stall cycles are needed to read the input values. These stall cycles, which are a pure overhead, become a dominant part of the computation time and eventually saturate the performance of the entire system. This issue can be clearly understood by taking a closer look at the cycle counts derived from the Handel-C code:

Main loop of the application takes 17 cycles per iteration for 4 pixels, of which
just 7 cycles perform actual computation (the remaining 10 are stall cycles).

Therefore, the clock cycles (CCs) required for execution are:
4-pixel pipeline:  17 x 128 = (7 + 10) x (512/4)
8-pixel pipeline:  27 x 64  = (7 + 20) x (512/8)
16-pixel pipeline: 47 x 32  = (7 + 40) x (512/16)

Hence, a 4n-pixel pipeline requires (7 + 10n) x (512/(4n)) = f(n) cycles.

Slope of the curve: f'(n) = -896/n^2

Since the slope is not constant, the gain from each doubling of the pipeline width diminishes quadratically.
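A few lines of ordinary C are enough to check this model numerically. The sketch below simply evaluates f(n) for the measured pipeline widths and reports the cycles per pixel, which is where the saturation shows up; the constants mirror the loop analysis above (7 compute cycles plus 10 stall cycles per 4 pixels, over a 512-pixel workload) and are not taken from the hardware design itself.

    #include <stdio.h>

    /* Cycle model for design #1: a 4n-pixel pipeline spends 7 compute cycles
     * plus 10n stall cycles per loop iteration, and needs 512/(4n) iterations
     * to cover a 512-pixel workload. */
    static double cycles(int n) { return (7.0 + 10.0 * n) * (512.0 / (4.0 * n)); }

    int main(void)
    {
        for (int n = 1; n <= 8; n *= 2) {   /* n = 8 extrapolates beyond the tests */
            double c = cycles(n);
            printf("%2d-pixel pipeline: %6.0f cycles (%.2f cycles/pixel)\n",
                   4 * n, c, c / 512.0);
        }
        return 0;
    }

The per-pixel cost falls from 4.25 cycles at 4 pixels to about 2.7 at 32 pixels, approaching the 2.5-cycle floor set by the stalls alone; this is why the speedup curve in Figure 4.5 flattens even as the resource usage grows linearly.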









The same information is also conveyed in graphical form in Figure 4.5. As expected, the

resource requirements increase linearly with the pixel count but the performance does

not. These limitations need to be circumvented in the next design to further improve the

performance with a better pipeline that has fewer stall cycles, in effect exposing more

parallelism and hiding the input latency. There are two memory banks that were not

utilized in the current design which could be employed to increase the memory

bandwidth.


[Figure: Performance Analysis. X-axis: # of pixels processed in parallel (4, 8, 16);
curves: execution time and speedup vs. processor.]

FPGA Resource Utilization (slices occupied):
4 pixels : 17%
8 pixels : 31%
16 pixels : 64%

Figure 4.5. Performance improvement and change in resource requirements with increase
in concurrent computation.

4.2 Design Architecture #2

This design provides the FPGA-based engine with a higher memory bandwidth by

using all four on-board memory banks (i.e. 32x4 = 128 bits per CC) in an attempt to

eliminate the resource hazards present in the previous design. The filter parameters are

now evenly spread across all the banks and hence a simultaneous read of all memory

ports provides all the needed inputs for processing a single pixel. The available data









parallelism is again exploited by pipelining the processing of independent pixels.

Without the existence of stalls, the pipeline produces a single estimate every clock cycle.

The constraint in the architecture comes from the fact that even all the memory banks

together can only support the inputs for one pixel calculation. Simultaneous operation on

multiple pixels requires some stall cycles to be introduced in the design again.
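To make the data layout concrete, the sketch below shows one way the per-pixel inputs could be packed across four 32-bit words, one from each bank, so that a single simultaneous read of all four ports yields everything needed for one pixel. The bank widths match the RC1000's four 32-bit SRAM banks, but the particular field assignment is illustrative, not the design's documented packing.

    #include <stdint.h>
    #include <stddef.h>
    #include <stdio.h>

    #define N_PIXELS (256 * 256)

    /* Four on-board SRAM banks, each 32 bits wide: one word per pixel per bank. */
    static uint32_t bank0[N_PIXELS], bank1[N_PIXELS], bank2[N_PIXELS], bank3[N_PIXELS];

    /* Illustrative packing: two 16-bit (8.8 fixed-point) values per word. */
    struct pixel_inputs {
        int16_t obs, gain;      /* bank 0: observation and precomputed gain */
        int16_t var, a;         /* bank 1: variance and state-transition term */
        int16_t q, r;           /* bank 2: process and measurement noise terms */
        int16_t prev_estimate;  /* bank 3 doubles as estimate input and output */
    };

    static struct pixel_inputs fetch(size_t i)
    {
        struct pixel_inputs p;
        /* In hardware all four ports are read in the same clock cycle. */
        p.obs  = (int16_t)(bank0[i] >> 16);  p.gain = (int16_t)bank0[i];
        p.var  = (int16_t)(bank1[i] >> 16);  p.a    = (int16_t)bank1[i];
        p.q    = (int16_t)(bank2[i] >> 16);  p.r    = (int16_t)bank2[i];
        p.prev_estimate = (int16_t)bank3[i];
        return p;
    }

    int main(void)
    {
        struct pixel_inputs p = fetch(0);
        printf("pixel 0 observation (raw 8.8 bits): %d\n", p.obs);
        return 0;
    }

Because one pixel consumes the full 128 bits per cycle, a second concurrent pixel must steal a read slot, which is exactly the stall analyzed below.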


[Figure: host/DMA data source feeding the FPGA processing engine, which reads its
inputs in parallel from the four on-board SRAM banks.]

Figure 4.6. Block diagram of the architecture of design #2.

Timing experiments were performed with the same set of data and compared

against the processing time for C code running on a Xeon processor. These results are

presented in Table 4.2 along with resource consumption information.

Table 4.2. Comparison of the execution times on the FPGA with the Xeon processor and
resource requirements of the hardware configuration for design #2.
Execution time on Single scale (256 x 256) Multiple scales (till 4 x 4)

RC1000 7.39ms 9.86ms

2.4 GHz Xeon Processor 15.14ms 20.5ms

Speedup 2.04 2.07


Resource Utilization:
Slices : 963 out of 19200 (5%)
Memory : 1MB approx. (filter
parameters)


DMA latency for sending the data over the PCI bus:
1 scale (650KB) : 10.2ms
All scales (870KB approx) : 11.4ms
* the amount of data that needs to be transferred is slightly larger because of
the architecture details









The values show a minor improvement over the previous design with a speedup of two.

This improvement is attained with just one pixel being processed by the pipeline, as a

result of which the logic area required by the design comes down to just 5% of the chip.

The amount of memory used by the design increased slightly because zero padding is required in the higher-order bits of the third and fourth banks, which are otherwise left unused. An

attempt is made to further improve the performance by introducing more pixels in the

same pipeline. However, this requires introduction of one stall cycle for every extra pixel

data read in the input stage of the pipeline. Again, a resource hazard exists due to the

memory access, which eventually saturates the system performance with concurrent

computation of a certain number of pixels. A mathematical analysis similar to the

previous design could be employed to prove this fact as well. A closer look at the

operational pipeline reveals that each estimate actually takes 3 cycles for computation

(extra cycles are required because of a conflict at the fourth memory bank that serves as

both input and output for the filtered estimates) and the analysis follows:

Main loop of the application takes 3 cycles per estimate.

Therefore, the clock cycles (CCs) required for execution are:
1-pixel pipeline: 3 x 256
2-pixel pipeline: 4 x 128 = 4 x (256/2)

Hence, an (n+1)-pixel pipeline requires:

(3 + n) x (256/(n+1)) = f(n)

Slope of the graph can be found from the derivative: f'(n) = -512/(n+1)^2

Since the slope is not constant, the speedup is sub-linear and flattens off
quadratically.
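The same kind of numerical check as in design #1 applies here; only the cycle formula changes. A minimal sketch, assuming 3 base cycles per estimate, one extra stall cycle per additional pixel, and a 256-pixel row:

    #include <stdio.h>

    /* Cycle model for design #2: an (n+1)-pixel pipeline costs (3 + n) cycles
     * per iteration and needs 256/(n+1) iterations per 256-pixel row. */
    static double cycles2(int n) { return (3.0 + n) * 256.0 / (n + 1.0); }

    int main(void)
    {
        for (int n = 0; n <= 7; n++)
            printf("%d-pixel pipeline: %.0f cycles per row\n", n + 1, cycles2(n));
        return 0;   /* 768, 512, 427, 384, ... flattening toward the 256-cycle floor */
    }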









Figure 4.7 depicts this same information by means of a graph drawn from the

experimental data. These stalls impede a linear speedup with the number of pixels.

These two designs also re-iterate the subtle trade-off between the memory requirements

and the logic area requirements. All of these stall cycles can be avoided by reducing the

memory demands and computing all of the parameters online, but doing so would require more

resources. This design also emphasized the criticality of the memory bandwidth. The

next design will attempt to emulate a higher bandwidth system in order to overcome this

limitation.


[Figure: Performance Analysis. X-axis: number of pipelines (1, 2, 4); curves:
execution time on RC1000 and speedup over Xeon.]

FPGA Resource Utilization (slices occupied, by pixels processed concurrently):
1 pixel : 5%
2 pixels : 11%


Figure 4.7. Performance improvement and change in resource requirements with increase
in number of concurrently processed pixels.

4.3 Design Architecture #3

This design uses the on-chip block RAM (BRAM) resources to buffer the input and

then work with these buffers for providing inputs to the pipeline. Treating BRAMs as

buffers allows emulation of a more advanced RC system which is rich in memory

bandwidth. A number of such buffers could be employed to have multiple non-stalling

pipelines and hence increase the concurrent computations. The architecture for this

design is depicted in Figure 4.8. The existence of input buffers means the design incurs









some additional latency, because of the cycles required to fill the buffers. Since there is

limited on-chip storage which is much less than the total amount of input data, multiple

iterations are required to process the entire data set. The goal of this hardware design is

to estimate the performance of a much more advanced RC system which is done by

separating the buffering time from the actual computation time. These individual

components of the total time have been tabulated in Tables 4.3 and 4.4.
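This measurement strategy can be mimicked on the host in plain C: time the buffer-fill and compute phases separately so that the buffering overhead can later be subtracted out. The sketch below is a host-side model of that instrumentation, not the Handel-C design itself; a simple 4-to-1 average stands in for the actual filter arithmetic, and only the buffer size (1KB) and row geometry (256-pixel rows) follow the text.

    #include <stdio.h>
    #include <string.h>
    #include <time.h>

    #define ROW 256                 /* width of the high-resolution image */
    #define BUF_VALUES 512          /* a 1KB buffer holds 512 16-bit values */

    static short image[8 * ROW];    /* eight input rows, as in the experiments */
    static short buffer[BUF_VALUES];
    static short output[(8 * ROW) / 4];

    int main(void)
    {
        double t_fill = 0.0, t_compute = 0.0;

        /* Each buffer load covers exactly two input rows (2 x 256 = 512 values). */
        for (int pass = 0; pass < (8 * ROW) / BUF_VALUES; pass++) {
            clock_t t0 = clock();
            memcpy(buffer, image + pass * BUF_VALUES, sizeof buffer);   /* fill */
            clock_t t1 = clock();
            for (int i = 0; i < BUF_VALUES; i += 4)                     /* compute */
                output[(pass * BUF_VALUES + i) / 4] = (short)
                    ((buffer[i] + buffer[i+1] + buffer[i+2] + buffer[i+3]) / 4);
            clock_t t2 = clock();
            t_fill    += (double)(t1 - t0) / CLOCKS_PER_SEC;
            t_compute += (double)(t2 - t1) / CLOCKS_PER_SEC;
        }
        printf("buffering: %.6f s, computation: %.6f s\n", t_fill, t_compute);
        return 0;
    }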


[Figure: host/DMA data source feeding the FPGA engine, with on-chip BRAM input
buffers in front of the processing pipelines and the four on-board SRAM banks.]

Figure 4.8. Block diagram of the architecture of design #3.

The designs were developed for a varying number of pipelines (one, two and four) and

the timing experiments were performed for all these designs. The components of the

total time were observed by maintaining a timer on the host processor and sending signals

from the FPGA to trigger timer readings at the completion of the different processing

stages. Since each design operated on a different number of pixels, the times for all the

designs were extrapolated (wherever needed) for processing of eight rows of input image

to have a common comparison basis. Each input buffer has a capacity of 1KB and could

therefore hold 512 values in the 8.8 fixed-point format.
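For reference, the 8.8 format stores each value in 16 bits, with eight integer and eight fractional bits, which is why a 1KB buffer holds exactly 512 of them. A minimal sketch of the conversion and of a fixed-point multiply follows; the rounding choice is illustrative rather than the design's documented behavior.

    #include <stdint.h>
    #include <stdio.h>

    typedef int16_t fx8_8;   /* 8 integer bits, 8 fractional bits */

    static fx8_8 to_fx(double x)  { return (fx8_8)(x * 256.0 + (x >= 0 ? 0.5 : -0.5)); }
    static double to_dbl(fx8_8 x) { return x / 256.0; }

    /* A 16x16 product carries 16 fractional bits; shift back down to 8.
     * Assumes an arithmetic right shift for negative values. */
    static fx8_8 fx_mul(fx8_8 a, fx8_8 b) { return (fx8_8)(((int32_t)a * b) >> 8); }

    int main(void)
    {
        fx8_8 a = to_fx(1.5), b = to_fx(-2.25);
        printf("1.5 * -2.25 = %f in 8.8 arithmetic\n", to_dbl(fx_mul(a, b)));
        return 0;
    }

Each multiply discards the low eight fractional bits, which is the kind of quantization that accumulates across scales and produces the error statistics reported in Figures 4.4 and 4.10.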









Table 4.3. Components of the execution time on the FPGA for processing eight rows of
input image using design #3.
Time (µs) With 1 pipeline With 2 pipelines With 4 pipelines

Input DMA (for entire image) 10254 10345 10752

Transfer into BRAM 80 78 78

Computation 132 66 34

Transfer output from BRAM 56 56 54

Output DMA (for entire output image 128 x 128) 614 599 745


This structure allowed each buffer to store data for processing of exactly two input rows

of the high-resolution input image (256 x 256). The time listed in Table 4.3 as the transfer

time in and out of the BRAM represents the overhead incurred because of the buffering

and is almost the same for all the designs for a common basis of comparison (8 input

image rows for our experiment). The designs differ in the amount of computation

performed concurrently which is visible in the row marked as computation time.

Table 4.4. Components of the total execution time on the FPGA for processing a single
scale of input data with different levels of concurrent processing, and the
resource utilization.
Time (µs) With 1 pipeline With 2 pipelines With 4 pipelines

Input DMA (for entire image) 10254 10345 10752

Output DMA (for entire output image 128 x 128) 614 599 745

Computation 4224 2112 1088

Transfer time to and from BRAMs 4352 4288 4224

Total time (including transfer to/from BRAMs) 8576 6400 5312

Execution on 2.4 GHz Xeon processor 15140


FPGA Resource Utilization: Slices occupied BRAM used
1 pipeline 5% 8%
2 pipelines 11% 17%
4 pipelines 22% 35%











With a non-stalling pipeline a linear speedup is achieved by adding more pixels in the

processing pipeline. Hence the performance of a system with sufficient memory

bandwidth can be estimated by discarding the buffering overhead. These values are

noted in Table 4.4 which compares the total execution time for processing of a single

scale (256 x256) on the RC1000 versus the host processor.

The values listed in Table 4.4 provide a good appraisal of the performance

improvement attainable on a high memory bandwidth system. The resource consumption

by the different designs is also listed. In addition to similar logic area requirements, this

design also uses block RAMs heavily. It is evident that the design can be extended to

satisfy the resource requirements for concurrent processing of about 10 pixels. The

speedups thus obtained over the Xeon processor are shown in the graph in Figure 4.9, where the values have been extrapolated for the "eight-pixel" design and are presented both including and excluding the buffering time.


[Figure: Performance Comparison. X-axis: # of processing pipelines (1, 2, 4, 8);
bars show speedup over the Xeon processor with and without the buffering time.]

Figure 4.9. Improvement in the performance with increase in concurrent computations.








The results including the buffering time degenerate to the case of design #2 and as a

result the performance improvement obtained by increasing the amount of concurrent

computation starts tapering off quickly. But, when the buffering times are discarded, a

linear speedup is obtained by adding more computation.

The last bar (with 8 pipelines) in the graph shows that more than an order of magnitude improvement is attainable for good designs on systems with an appropriate amount of resources: halving the four-pipeline computation time of 1088 µs once more gives roughly 544 µs, a speedup of about 28x over the 15.14ms processor time. Figure 4.10 presents the error statistics by comparing the outputs

generated by the FPGA for this design with Matlab-computed estimates. The mean

square error has a satisfactorily low value of about 0.01, with the percentage error being less than

2% in most cases.


[Figure panels: Matlab-generated estimates, FPGA-generated estimates, absolute error.
Maximum absolute error from Matlab: 0.4249; MSE: 0.0119.]

Figure 4.10. Error statistics for the outputs obtained after a single scale of filtering.

4.4 Performance Projection on Other Systems

This subsection gauges the performance attainable on some of the other existing

RC platforms. These projections are simplistic in nature: they are derived from the system demands posed by the application, as identified in the previous experiments, and from the resources available on these RC platforms. Such a projection is useful in appraising












the true performance advantages that can be obtained from an advanced RC-enabled

remote sensing system.

4.4.1 Nallatech's BenNUEY Motherboard (with BenBLUE-II daughter card)

Figure 4.11 shows a high-level architecture diagram of the board. It consists of

three Xilinx FPGAs, each one of them being a VirtexII 6000 (speed grade -4). These

advanced chips have larger logic resources, more block RAMs and additional specialized

hardware modules such as dedicated multipliers. The VirtexII series also provides much

faster interconnects than the VirtexE series. All these factors account for about a 2x to 4x

improvement in the design clock frequency in moving from a VirtexE FPGA to a VirtexII

for most designs. The system also supports a higher memory capacity providing 12MB

of storage with a bandwidth of 192 bits every clock cycle. Almost a twofold increase in

the memory bandwidth should yield a corresponding linear speedup of about 2x.


[Figure: four ZBT SSRAM banks (2 MB, 2 MB, 4 MB, 4 MB) attached to the BenNUEY user
FPGA (Xilinx Virtex-II 6000, -4); a PCI interface FPGA (Xilinx Spartan-II) connects
over a 32-bit, 40 MHz communications bus; the BenBLUE-II primary and secondary FPGAs
(both Xilinx Virtex-II 6000, -4) attach via a 64-bit, 66 MHz local bus and a 159-I/O
inter-FPGA communications bus with a user-defined clock.]

Figure 4.11. Block diagram for the hardware architecture of Nallatech's BenNUEY
board with a BenBLUE-II extension card. Courtesy: BenNUEY Reference
Guide [34].









Hence, this target system could provide performance approximately 4x to 8x

better when compared to the RC1000 board.

4.4.2 Honeywell Reconfigurable Space Computer (HRSC)

Figure 4.12 shows a high-level block diagram of the HRSC board. It is comprised

of four Xilinx FPGAs, two of them being Virtex-1000 and the other two from the

VirtexII series (VirtexII 2000). The presence of a larger amount of resources and faster

chips should lead to higher frequency designs, leading to an improvement of

approximately 2x over the RC1000 designs. The system is also very rich and flexible in

the amount of memory resources and the way the memory is interconnected to all the

FPGAs. All the memories are dual-ported and provide more flexibility in input buffering

as well as providing higher bandwidth.

[Figure omitted: hardware architecture of the HRSC board.]

Figure 4.12. Block diagram of the hardware architecture of the HRSC board [29].









A single HRSC board supports approximately 1.75GB of storage capacity with a

bandwidth of 448 bits per clock cycle. This high bandwidth should be more amenable to

the design and should yield a performance enhancement of about 4x. The HRSC board,

specifically tailored for space-based computing, was designed with similar considerations

and would therefore lead to a system-level performance improvement on the order of 8x

over the RC1000 card.
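Both projections follow the same simple model: an assumed clock-frequency factor multiplied by an assumed memory-bandwidth factor, each relative to the RC1000 baseline. The sketch below only makes that model explicit; the factors come from the discussion above, not from measurements on either board.

    #include <stdio.h>

    /* Projected speedup = clock-frequency factor x memory-bandwidth factor,
     * both relative to the RC1000 baseline (assumed multiplicative model). */
    static double project(double clock_factor, double bandwidth_factor)
    {
        return clock_factor * bandwidth_factor;
    }

    int main(void)
    {
        printf("BenNUEY + BenBLUE-II: %.0fx to %.0fx\n", project(2, 2), project(4, 2));
        printf("HRSC:                 %.0fx\n", project(2, 4));
        return 0;
    }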

This chapter presented a detailed discussion of the various experiments conducted

and the results obtained. The next chapter summarizes the document by providing a set

of conclusions and the future directions in which the work could be pursued.














CHAPTER 5
CONCLUSIONS AND FUTURE WORK

This chapter summarizes the findings of this research and draws conclusions based

on the results and observations presented in the previous chapters. It also discusses some

unfinished issues and tasks that can be pursued further to build a complete system and do

further analysis.

Most of the current remote-sensing systems rely on conventional processors for

performing computation offline after data acquisition and lack any real-time processing

capabilities. Future systems will need on-board high-performance computing facilities in

order to be able to perform increasing levels of computation. In such systems,

performance per unit cost is a much more important metric than raw performance. This

work has demonstrated a strong case depicting the relevance of reconfigurable computing

for performing fast on-board computation. Several hardware configurations have been

developed to extend the prior work on 1-D Kalman filtering to a 2-D case. The results

have shown over 25 times speedup in computation time in ideal scenarios compared to

the conventional GPP-based solutions, a figure that is bound to increase further with more advanced and faster FPGAs; more than two orders of magnitude improvement can be obtained on such advanced systems. This level of performance, coupled with other attributes such as low power, cost, and space requirements, makes RC systems a strong match for remote sensing and can make airborne, real-time

processing possible. There has been a great deal of prior work done in the field of

mapping image processing algorithms on FPGA-based systems, but its application to the









remote sensing world is still in its infancy. The results obtained in this work show

promise and demand further research and investigation.

This work also highlighted some key issues involved in the design of remote-

sensing applications for RC systems. Firstly, floating-point operations are wasteful of

resources when designed for hardware, whereas fixed-point operations are more

amenable in such cases. Therefore, algorithms need to be redesigned to mitigate the

quantization effects and tested to ensure that they provide a desired level of accuracy.

Secondly, remote-sensing applications pose high demands on the memory bandwidth of

the system. Hence, to be successful, target systems should support a high memory bandwidth in addition to providing a large storage capacity. Use of multi-port

memories provides added advantages, allowing both reads and writes to take place in

parallel and providing an effective mechanism of hiding input latencies. Thirdly, the

ability of processing data in parallel is a key attribute for meeting real-time requirements

in a remote-sensing scenario, where multiple sensors produce large amounts of data at a

high rate. Different designs have been developed and analyzed as a part of this work that

exploit a high degree of parallelism and adapt to the available resources.

The designs presented in this work for the application use pre-computed parameters

which describe the behavior of the filter. Hence, they offer the ability of "virtual

reconfiguration," which is a novel concept developed and introduced through the course

of this research, where the set of filter parameters in the memory could be changed to

adapt the filter behavior. This functionality has important implications in both the remote

sensing and RC arenas. Most remote-sensing systems need to adapt their processing

methods in response to the change in data statistics. Although current RC systems can









support dynamic reconfiguration of FPGAs, the configuration times are very slow; virtual reconfiguration therefore provides a faster mechanism to achieve the same effect.
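As a sketch of the idea, swapping the precomputed parameter tables in the on-board memory changes the filter's behavior without touching the FPGA configuration. The fragment below models this on the host side; the DMA routine is a hypothetical placeholder for the board library, and the flat parameter layout is illustrative.

    #include <stddef.h>
    #include <stdio.h>

    #define N_PIXELS (256 * 256)

    /* Two precomputed parameter sets describing different filter behaviors,
     * e.g., tuned to different data statistics (illustrative). */
    static short params_set_a[N_PIXELS * 4];
    static short params_set_b[N_PIXELS * 4];

    /* Hypothetical stand-in for the host library's DMA-to-SRAM call. */
    static void dma_to_sram(const void *src, size_t bytes)
    {
        (void)src;
        printf("reload %zu bytes of filter parameters\n", bytes);
    }

    /* "Virtual reconfiguration": the bitstream stays loaded; only the filter
     * parameters in the on-board memory banks are replaced. */
    static void virtually_reconfigure(const short *params)
    {
        dma_to_sram(params, N_PIXELS * 4 * sizeof(short));
    }

    int main(void)
    {
        virtually_reconfigure(params_set_a);   /* first data statistics */
        virtually_reconfigure(params_set_b);   /* adapt without reloading the FPGA */
        return 0;
    }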

To keep the work tractable, the scope of the project was limited; only one part of the entire process was dealt with in detail, leaving several other modules for future work. Most importantly,

the raw sensor inputs have to go through several stages of processing before being

converted into a form compatible with the designs developed in this work. The design

and development of the pre-processing module needs to be addressed before an actual

system can be developed.

There were several lessons learned about the design of the target hardware system and

how different resources affect the performance. The design of such a hardware system

based on the results and conclusions of this work is a challenging task that demands

attention. There has been recent interest in developing stand-alone, FPGA-based systems

which can exist without a host processor. Such a system offers a very cost-effective

solution for embedded environments. Adapting such systems to the needs of remote-

sensing application can have a monumental effect on the future research in this direction.

This work also projects the attainable performance on some advanced systems; these projections need to be verified with experimental results.

The focus of this work was to analyze the feasibility of deploying RC systems in the remote sensing arena and to develop designs supporting these claims; as a result, the designs developed in this work are not optimal. With more time and optimization the

designs could yield better performance, for example by increasing the clock frequencies

of the designs, or providing direct DMA into the block RAMs to hide the input and

output latencies. As pointed out earlier, the use of fixed-point is very important for








economical hardware configurations, but it leads to quantization errors that can be high

enough to defeat the purpose of acceleration in some cases. There are several ways in

which these quantization effects can be mitigated by changes in the algorithm, such as

normalization of all the parameters or providing observations at multiple scales to allow

the filter to overcome the effect of accumulation of errors. Exploring all the ways to

achieve this goal is essential for the success of the application.
















LIST OF REFERENCES


1. Integrated Sensors Inc., "SAR/GMTI Range Compression Implementation in
FPGAs," Application Note, Utica, NY, 2005.

2. V. R. Daggu and M. Venkatesan, "Design and Implementation of an Efficient
Reconfigurable Architecture for Image Processing Algorithms using Handel-C,"
M.S. Thesis, Dept. of Electrical and Computer Engineering, University of Nevada,
Las Vegas.

3. P. C. Arribas, F. M. Macia, "FPGA Board for Real Time Vision Development
System," Devices, Circuits and Systems 2002. Proceedings of the Fourth IEEE
International Caracas Conference on 17-19 Apr. 2002. Pages: T021-1 to T021-6.

4. P. McCurry, F. Morgan and L. Kilmartin, "Xilinx FPGA Implementation of an
Image Classifier for Object Detection Applications," Image Processing 2001.
Proceedings of 2001 International Conference on 7-10 Oct. 2001. Vol. 3, Pages:
346-349.

5. I. S. Uzun and A.A.A. Bouridane, "FPGA Implementations of Fast Fourier
Transforms for Real-time Signal and Image Processing," Field-Programmable
Technology (FPT) 2003. Proceedings of 2003 IEEE International Conference on
15-17 Dec. 2003. Pages: 102-109.

6. B. A. Draper, J. R. Beveridge, A. P. W. Bohm, C. Ross and M. Chawathe,
"Accelerated Image Processing on FPGAs," Image Processing 2003. IEEE
Transactions on Dec. 2003. Vol. 12, Issue 12, Pages: 1543-1551.

7. K. Benkrid, D. Crookes, A. Bouridane, P. Con and K. Alotaibi, "A High Level
Software Environment for FPGA Based Image Processing," Image Processing And
Its Applications 1999. Seventh International Conference on 13-15 Jul. 1999. Vol.
1, Pages: 112-116.

8. Celoxica Ltd., "Handel-C Language Reference Manual," 2004. http://www.
celoxica.com/techlib/files/CEL-W0410251JJ4-60.pdf. Last accessed: Aug. 2005.

9. N. Shirazi, P. M. Athanas and A. L. Abbott, "Implementation of a 2-D Fast Fourier
Transform on a FPGA-Based Custom Computing Machine," Field-Programmable
Logic and Applications 1995. Proceedings of the fifth International Workshop on
Sept. 1995. Vol. 975, Pages: 282-292.









10. R. Hartenstein, "A Decade of Reconfigurable Computing: A Visionary
Retrospective," Design, Automation and Test in Europe, 2001. Proceedings of
conference and exhibition on 13-16 Mar. 2001. Pages: 642-649.

11. K. Compton and S. Hauck, "Reconfigurable Computing: A Survey of Systems and
Software", ACM Computing Surveys 2002. Vol. 34(2), Pages: 171-210.

12. S. Gould, B. Worth, K. Clinton and E. Millham, "An SRAM-Based FPGA
Architecture," Custom Integrated Circuits Conference 1996. Proceedings of IEEE
Conference on 5-8 May 1996. Pages: 243-246.

13. http://www.xilinx.com.

14. V. Aggarwal, I. Troxel and A. George, "Design and Analysis of Parallel N-Queens
on Reconfigurable Hardware with Handel-C and MPI," Military and Aerospace
Programmable Logic Devices (MAPLD) 2004. International Conference on 8-10
Sept. 2004.

15. C. Conger, I. Troxel, D. Espinosa, V. Aggarwal and A. George, "NARC: Network-
Attached Reconfigurable Computing for High-performance, Network-based
Applications," Military and Aerospace Programmable Logic Devices (MAPLD)
2005. International Conference on 8-10 Sept. 2005. (to appear).

16. Celoxica Ltd. "RC1000 Reference Hardware Reference Manual," version 2.3,
2001. Document Number: RM-1120-0.

17. P. W. Fieguth, W. C. Carl, A. S. Willsky and C. Wunsch, "Multiresolution Optimal
Interpolation and Statistical Analysis of TOPEX/POSEIDON Satellite Altimetry,"
Geoscience and Remote Sensing 1995. IEEE Transactions on Mar. 1995. Vol.
33, Issue 2, Pages 280-292.

18. C. R. Lee and Z. Salcic, "A Fully-hardware-type Maximum-parallel Architecture
for Kalman Tracking Filter in FPGAs," Information, Communications and Signal
Processing (ICICS) 1997. Proceedings of 1997 International Conference on 9-12
Sept. 1997. Vol. 2, Pages: 1243-1247.

19. L. P. Maguire and G. W. Irwin, "Transputer Implementation of Kalman Filters,"
Control Theory and Applications 1991, IEE Proceedings D on Jul. 1991. Vol. 138,
Issue 4, Pages 355-362.

20. J. M. Jover and T. Kailath, "A Parallel Architecture for Kalman Filter
Measurement Update and Parameter Estimation," Automatica (Journal of IFAC)
1986. Vol. 22, Issue 1, Pages: 43-58. Tarrytown, NY, USA.

21. R. S. Baheti, D. R. O'Hallaron and H. R. Itzkowitz, "Mapping Extended Kalman
Filters onto Linear Arrays," Automatic Control 1990. IEEE Transactions on Dec.
1990. Vol. 35, Issue 12, Pages: 1310-1319.









22. I. D'Antone, L. Fortuna, G. Muscato and G. Nunnari, "Arithmetic Constraints in
Kalman Filter Implementation by using IMS A100 Devices," Implementation
Problems in Digital Control 1989. IEE Colloquium on 9 May 1989. Pages: 8/1-
8/7.

23. S. Chappell, A Macarthur, D. Preston, D. Olmstead and B. Flint, "Exploiting
FPGAs for Automotive Sensor Fusion," Application note May 2004. http://www.
celoxica.com/techlib/files/CEL-W04061612DV-296.pdf. Last accessed: Aug 2005.

24. B. Garbergs and B. Sohlberg, "Implementation of a State Space Controller in a
FPGA," Electrotechnical Conference MELECON 1998. Ninth Mediterranean
Conference on 18-20 May 1998. Vol. 1, Pages: 566-569.

25. R. D. Turney, A. M. Reza and J. G. R. Delva, "FPGA Implementation of Adaptive
Temporal Kalman Filter for Real Time Video Filtering," Acoustics, Speech, and
Signal Processing, (ICASSP) 1999. IEEE International Conference on 15-19 Mar.
1999. Vol. 4, Pages: 2231-2234.

26. L. Scharf, and S. Sigurdsson, "Fixed-Point Implementation of Fast Kalman
Predictors," Automatic Control 1984. IEEE Transactions on Sept. 1984. Vol. 29,
Issue 9, Pages: 850-852.

27. A. S. Dawood, J. A. Williams and S. J. Visser, "On-board Satellite Image
Compression Using Reconfigurable FPGAs," Field-Programmable Technology
(FPT) 2002. Proceedings of IEEE International Conference on 16-18 Dec. 2002.
Pages: 306-310.

28. D. V. Buren, P. Murray and T. Langley, "A Reconfigurable Computing Board for
High Performance Processing in Space," Aerospace Conference 2004. Proceedings
of 2004 IEEE Conference on 6-13 Mar. 2004. Vol. 4, Pages: 2316-2326.

29. J. Ramos and I. A. Troxel, "A Case Study in HW/SW Codesign and Project Risk
Management: The Honeywell Reconfigurable Space Computer (HRSC)," Military
and Aerospace Programmable Logic Devices (MAPLD) 2004. International
Conference on 8-10 Sept. 2004.

30. R. Sivilotti, Y. Cho, Wen-King Su, D. Cohen and B. Bray, "Scalable Network
Based FPGA Accelerators for an Automatic Target Recognition Application,"
FPGAs for Custom Computing Machines (FCCM) 1998. Proceedings of IEEE
Symposium on 15-17 April 1998. Pages: 282-283.

31. P. Graham and B. Nelson, "Frequency-Domain Sonar Processing in FPGAs and
DSPs," FPGAs for Custom Computing Machines (FCCM) 1998. Proceedings of
IEEE Symposium on 15-17 April 1998. Pages: 306-307.

32. T. Hamamoto, S. Nagao and K. Aizawa "Real-Time Objects Tracking by Using
Smart Image Sensors and FPGA," Image Processing 2002. Proceedings of
International Conference on 24-28 June 2002. Vol. 3, Pages: III-441 to III-444.








33. A. Utgikar and G. Seetharaman, "FPGA Implementable Architecture for Geometric
Global Positioning," Field-Programmable Technology (FPT) 2003. Proceedings of
IEEE International Conference on 15-17 Dec. 2003. Pages: 451-455

34. Nallatech Inc. "BenNUEY Reference Guide," Issue 10, 2004. Document Number:
NT107-0123.















BIOGRAPHICAL SKETCH

Vikas Aggarwal received a Bachelor of Science degree in electronics and

communication engineering from the department of ECE at Guru Gobind Singh

Indraprastha University, India, in August of 2003. He then moved to the United States to

pursue his graduate studies in the department of Electrical and Computer Engineering at

the University of Florida.

Vikas has been a paid graduate research assistant under the direction of Dr. Clint

Slatton in the Adaptive Signal Processing Lab and under Dr. Alan George in the High-

Performance Computing and Simulation Lab. Since becoming a paid graduate assistant

he has worked on numerous projects in two relatively different fields: reconfigurable computing, and adaptive signal processing techniques as applied to remote sensing

applications.