Group Title: CARMA : a comprehensive management framework for high-performance reconfigurable computing
Title: Presentation
Permanent Link: http://ufdc.ufl.edu/UF00094757/00002
 Material Information
Title: Presentation
Physical Description: Book
Language: English
Creator: Troxel, Ian A.
Jacob, Aju M.
George, Alan D.
Subramaniyan, Raj
Radlinkski, Matthew A.
Publisher: Troxel et al.
Place of Publication: Gainesville, Fla.
Publication Date: 2004
 Record Information
Bibliographic ID: UF00094757
Volume ID: VID00002
Source Institution: University of Florida
Holding Location: University of Florida
Rights Management: All rights reserved by the source institution and holding location.



Full Text
UNIVERSITY OF FLORIDA


CARMA: A Comprehensive Management Framework for High-Performance Reconfigurable Computing



Ian A. Troxel, Aju M. Jacob, Alan D. George,
Raj Subramaniyan, and Matthew A. Radlinski

High-performance Computing and Simulation (HCS) Research Laboratory
Department of Electrical and Computer Engineering
University of Florida
Gainesville, FL


#197 MAPLD 2004


Troxel





CARMA Motivation

* Key missing pieces in RC for HPC
  - Dynamic RC fabric discovery and management
  - Coherent multitasking, multi-user environment
  - Robust job scheduling and management
  - Design for fault tolerance and scalability
  - Heterogeneous system support
  - Device-independent programming model
  - Debug and system health monitoring
  - System performance monitoring into the RC fabric
  - Increased RC device and system usability

* Our proposed Comprehensive Approach to Reconfigurable Management Architecture (CARMA) attempts to unify existing technologies as well as fill in missing pieces


CARMA Framework Overview

CARMA seeks to integrate:
* Applications
  - Graphical user interface
  - Flexible programming model
* User interface and algorithm mapping
  - COTS application mapper(s): Handel-C, Impulse-C, Viva, System Generator, etc.
  - Graph-based job description: DAGMan, Condensed Graphs, etc.
* RC cluster management
  - Robust management tool
  - Distributed, scalable job scheduling
  - Checkpointing, rollback and recovery
  - Distributed configuration management
* Middleware and performance monitoring
  - Multilevel monitoring service (GEMS) covering networks, hosts, and boards
  - Monitoring down into the RC fabric
  - Device-independent middleware API
* RC fabric
  - Multiple types of RC boards: COTS PCI (many), network-attached, Pilchard
  - Multiple high-speed networks: SCI, Myrinet, GigE, InfiniBand, etc.

[Figure: CARMA layer diagram — applications, algorithm mapping, RC cluster management, middleware/monitoring API, and RC fabric on each node, with control and data networks linking to other nodes]








Application Mapper Evaluation


Evaluating on the basis of ease of use, performance, hardware device independence, programming model, parallelization support, resource targeting, network support, stand-alone mapping, etc.

* C-based tools
  - Celoxica SDK (Handel-C)
    - Provides access to in-house boards: ADM-XRC (x1), Tarari (x4), RC1000 (x4)
    - Good deal of success after lessons learned
    - Hardware design focused
  - Impulse Accelerated Technologies Impulse-C
    - Provides an option for hardware independence
    - Built upon open-source Streams-C from LANL
    - Supports ANSI standard C
* Graphical tools
  - StarBridge Systems Viva
  - Nallatech Fuse / DIMEtalk
  - Annapolis Micro Systems CoreFire
* Xilinx ISE compulsory
* Evaluating the role of JBits, System Generator, and XHWIF

Evaluations still ongoing; the programming model is a fundamental issue to be addressed.

[Figure: Streams-C tool flow (c/o LANL) — simulator and hardware-synthesis paths through an analysis/scheduling compiler, host software process with sequence info, datapath ops with hardware library, and RTL processing elements]










CARMA Interface

* Simple graphical user interface
  - Preliminary basis for graphical user interface via the Simple Web Interface Link Library (SWILL) from the University of Chicago*
  - User view for authentication and job submission/status
  - Administration view for system status and maintenance
* Applications supported
  - Single or multiple tasks per job (via CARMA DAGs**)
  - CARMA registered (via CARMA API and DAGs) or not
    - Registration provides security and fault tolerance
  - Sequential and parallel (hand-coded or via MPI)
* C-based application mappers supported
  - CARMA middleware API provides architecture independence
    - Any code that can link to the CARMA API library can be executed (Handel-C and ADM-XRC API tested to date)
    - Bit files must be registered with the CARMA Configuration Manager (CM)
  - All other mappers can use "not CARMA registered" mode
  - Plans for linking Streams/Impulse-C, System Generator, et al.

* http://systems.cs.uchicago.edu/swill/
** Similar to Condor DAGs
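The slides describe CARMA DAGs as Condor-DAGMan-like, multi-task job descriptions. The sketch below is a hypothetical illustration of that idea — the TASK/PARENT syntax, task names, and file names are assumptions, not the actual CARMA format — showing how a job manager could parse such a description and derive a dispatch order.

```python
# Hypothetical, DAGMan-style job description and a dispatch-order sketch.
# Syntax and names are illustrative assumptions, not the CARMA format.
from collections import defaultdict, deque

DAG_SPEC = """
TASK add      ADD.exe        # "CPU-only" task
TASK addone   AddOne.bit     # RC task (registered bitfile)
TASK collect  ADD.exe
PARENT add CHILD addone
PARENT addone CHILD collect
"""

def parse_dag(spec):
    tasks, edges = {}, defaultdict(list)
    for line in spec.splitlines():
        line = line.split("#")[0].strip()   # drop comments and blanks
        if line.startswith("TASK"):
            _, name, exe = line.split()
            tasks[name] = exe
        elif line.startswith("PARENT"):
            _, parent, _, child = line.split()
            edges[parent].append(child)
    return tasks, edges

def schedule_order(tasks, edges):
    """Topological order: one valid order in which a JM could dispatch tasks."""
    indeg = {t: 0 for t in tasks}
    for parent in edges:
        for child in edges[parent]:
            indeg[child] += 1
    ready = deque(t for t in tasks if indeg[t] == 0)
    order = []
    while ready:
        t = ready.popleft()
        order.append(t)
        for child in edges[t]:
            indeg[child] -= 1
            if indeg[child] == 0:
                ready.append(child)
    return order

print(schedule_order(*parse_dag(DAG_SPEC)))  # ['add', 'addone', 'collect']
```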









CARMA User Interface


[Screenshot: "Welcome to the CARMA User Interface" web page (www.hcs.ufl.edu) in a Netscape browser, with two links: "Users Click Here" and "Admins Click Here"]





CARMA Job Manager (JM)

* Prototyping effort (CARMA interoperability)
  - Completed first version of CARMA JM
  - Task-based execution via Condor-like DAGs
  - Separate processes and message queues for fault tolerance
  - Checkpointing enabled, with rollback in progress
  - Links to all other CARMA components
  - Fully distributed multi-node operation with job/task migration
  - Links to CARMA monitor and GEMS to make scheduling decisions
  - Tradeoff studies and analyses underway

* External extensions to COTS tools (COTS plug and play)
  - Expand upon preliminary work @ GWU/GMU*
  - Striving for "plug and play" approach to JM
  - CARMA Monitor provides board info. (via ELIM, LSF's External Load Information Manager)
  - Working to link to CARMA CM
  - Tradeoff studies and analysis underway
  - Integration of other CARMA components in progress

[Figures: CARMA DAG example with tasks Hyper.1 through Hyper.5; GWU/GMU job-management-system diagram, c/o GWU/GMU]

* Kris Gaj, Tarek El-Ghazawi, et al., "Effective Utilization and Reconfiguration of Distributed Hardware Resources Using Job Management Systems," Reconfigurable Architectures Workshop 2003, Nice, France, April 2003.
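The JM above survives restarts by journaling its progress. A minimal sketch of that checkpoint-and-rollback idea — the file layout, class, and method names are assumptions for illustration, not the CARMA implementation:

```python
# Sketch of checkpoint/restart: completed task IDs are journaled so a
# restarted JM process resumes where it left off. Names are illustrative.
import json, os, tempfile

class CheckpointedJM:
    def __init__(self, ckpt_path):
        self.ckpt_path = ckpt_path
        self.done = set()
        if os.path.exists(ckpt_path):            # recovery path on restart
            with open(ckpt_path) as f:
                self.done = set(json.load(f))

    def run(self, tasks):
        executed = []
        for t in tasks:
            if t in self.done:                   # skip work already completed
                continue
            executed.append(t)                   # (a real JM forks the task)
            self.done.add(t)
            with open(self.ckpt_path, "w") as f: # journal after each task
                json.dump(sorted(self.done), f)
        return executed

ckpt = os.path.join(tempfile.mkdtemp(), "jm.ckpt")
jm1 = CheckpointedJM(ckpt)
jm1.run(["t1", "t2"])               # suppose the process dies here
jm2 = CheckpointedJM(ckpt)          # restarted JM reloads the checkpoint
print(jm2.run(["t1", "t2", "t3"]))  # ['t3']
```

Journaling after every task keeps the window of repeated work to at most one task, at the cost of a write per completion.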







CARMA CM Design

* Builds upon previous design concepts*
* Execution Manager (EM)
  - Forks tasks from JM and returns results to JM
  - Requests and releases configurations
* Configuration Manager (CM)
  - Manages configuration transport and caching
  - Loads and unloads configurations via the BIM
* Board Interface Module (BIM)
  - Provides board independence
  - Allows for configuration temporal-locality benefits
  - Configures and interfaces with a diverse set of RC boards: numerous PCI-based boards, various interfaces for network-attached RC
  - Instantiated at startup; the CM spawns a BIM for each board
  - Provides hardware independence to higher layers
  - Separate BIM for each supported board
  - Simple standard interface to boards for remote nodes
  - Enhances security by authenticating data and configurations
* Communication Module
  - Handles all inter-node communication

[Figure: on the local node, the Execution Manager, Configuration Manager, and Communication Module interact over inter-process communication; the CM directs each spawned BIM via the CARMA Board Interface Language, and each BIM configures its RC board through board-specific communication and the board API; the control network carries communication and file transfers to remote nodes]


* U. of Glasgow (Rage), Imperial College in UK, U. Washington, among others
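The CM's "configuration temporal locality" benefit comes from keeping recently used bitfiles resident: reconfiguring an FPGA is expensive, so a repeated request for a loaded configuration should skip the BIM entirely. A minimal LRU-cache sketch of that idea — class and method names are illustrative assumptions, not the CARMA API:

```python
# Sketch of CM-side configuration caching with LRU eviction.
# Names and the slot model are illustrative, not the CARMA API.
from collections import OrderedDict

class ConfigCache:
    def __init__(self, slots):
        self.slots = slots          # configurations the node can keep loaded
        self.loaded = OrderedDict() # bitfile -> True, in LRU order
        self.reconfigs = 0          # expensive BIM configure calls issued

    def request(self, bitfile):
        if bitfile in self.loaded:          # temporal-locality hit
            self.loaded.move_to_end(bitfile)
            return "hit"
        if len(self.loaded) >= self.slots:  # evict least recently used
            self.loaded.popitem(last=False)
        self.loaded[bitfile] = True         # a real CM would call the BIM here
        self.reconfigs += 1
        return "miss"

cm = ConfigCache(slots=2)
hits = [cm.request(b) for b in
        ["AddOne.bit", "NQueens.bit", "AddOne.bit", "Blowfish.bit", "AddOne.bit"]]
print(hits, cm.reconfigs)  # ['miss', 'miss', 'hit', 'miss', 'hit'] 3
```

With two slots, the repeated AddOne.bit requests hit the cache, so five requests cost only three reconfigurations.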







Distributed CM Management Schemes

* Four schemes investigated for distributed configuration management:
  - Master-Worker (MW): jobs submitted centrally; the master maintains a global view of the system at all times
  - Client-Server (CS): jobs submitted locally; the server maintains a global view of the system at all times
  - Client-Broker (CB): jobs submitted locally; a broker mediates requests between clients and servers
  - Simple Peer-to-Peer (SPP): jobs submitted locally; peers exchange requests, tasks, configurations, results, and statistics directly

[Figures: node diagrams for each scheme showing jobs, requests, tasks, configurations, results, and statistics flowing among the JM, CM, and monitor modules over the network]

Note: more in-depth results for the distributed CM appeared at ERSA'04.






CM System Recommendations

Scalability projected up to 4096 nodes:
* Performed analytic scalability analysis based on 16-node experimental results
  - Dual 2.4 GHz Xeons and a Tarari CPX2100 HPC board in a 64/66 PCI slot
  - Gigabit Ethernet and 5.3 Gbps Scalable Coherent Interface (SCI) as control and data networks, respectively
* A flat system of 4096 nodes has very high completion times (~5 minutes for SPP and ~83 hours for CS)
* A layered hierarchy is needed for reasonable completion times (~2.5 seconds for SPP over SPP at 4096 nodes)
* CS reduces network traffic by sacrificing response time; SPP improves response time by increasing network utilization

[Table: recommended schemes by system size — flat CS, CS over CS (group sizes 4 and 8), SPP over CS (group sizes 4 and 8), and SPP over SPP (group sizes 8 and 16)]


Conclusions
* CARMA CM design imposes very little overhead on the system
* Hierarchical scheme needed to scale to systems of thousands of nodes (traditional MW will not work)
* Multiple servers for the CS scheme don't reduce the server bottleneck for system sizes greater than 32
* SPP over CS (group size 8) gives the best overall performance for systems larger than 512 nodes

Note: schemes with completion-latency values greater than 5 seconds were excluded.
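The layering argument above can be made concrete with simple arithmetic (the numbers here are illustrative, not from the paper's analytic model): grouping nodes under lower-layer servers shrinks the population the top layer must coordinate, which is why "SPP over CS with group size 8" behaves so differently from a flat 4096-node system.

```python
# Back-of-envelope sketch of hierarchical grouping: with group size g,
# each layer coordinates only 1/g as many endpoints as the layer below.
def hierarchy(nodes, group_size):
    """Return the population coordinated at each layer, top-down from flat."""
    layers = [nodes]
    while layers[-1] > group_size:
        assert layers[-1] % group_size == 0, "sketch assumes even groups"
        layers.append(layers[-1] // group_size)
    return layers

# At 4096 nodes with group size 8, the layer above the leaves spans only
# 512 group servers rather than 4096 raw nodes.
print(hierarchy(4096, 8))  # [4096, 512, 64, 8]
```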








CARMA Monitoring Services

* Monitoring service
  - Statistics Collector
    - Gathers local and remote information
    - Updates GEMS* and local values
  - Query Processor
    - Processes task scheduling requests from JM
    - Maintains local information
  - Round-Robin Database (RRD)
    - Compact way to store performance logs
    - Supports simple query interface
  - CARMA Diagnostic
    - System watchdog alerts based on defined heuristics of failure conditions
    - Provides system monitoring and debug
* Initial monitor version is complete
* Studying FPGA monitoring options
* Increasing the scheduling options
* Tradeoff studies and analyses underway

Initial CARMA Monitor parameters:
A) Stats from JM, ExMan, ConMan, BIM, and board: dynamic statistics (push or pull) and static statistics (pull)
B) Stats from remote nodes via GEMS
C) Statistics Collector passes info to the RRD from local and remote modules via the Query Processor
D) JM queries RRD for resource information to make scheduling decisions
E) The CARMA diagnostic tool performs system administration, debug, and optimization

[Figure: monitoring data flow among the JM, ExMan, ConMan, BIMs, FPGA boards, Statistics Collector, Query Processor, RRD, and GEMS, with links to other nodes]


* Gossip-Enabled Monitoring Service (GEMS): developed by the HCS Lab for robust, scalable, multilevel monitoring of resource health and performance. For more info see http://www.hcs.ufl.edu/gems
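The RRD's appeal is that performance logs stay bounded: old samples are overwritten instead of letting the log grow. A minimal ring-buffer sketch of that idea — the class and its query interface are illustrative assumptions, not the CARMA monitor's API:

```python
# Sketch of a round-robin (fixed-size, overwrite-oldest) performance log.
# Interface names are illustrative, not the CARMA monitor's API.
from collections import deque

class RoundRobinDB:
    def __init__(self, samples):
        self.buf = deque(maxlen=samples)  # oldest sample dropped on overflow

    def record(self, metric, value):
        self.buf.append((metric, value))

    def query(self, metric):
        """Simple query interface: most recent value for a metric, or None."""
        for name, value in reversed(self.buf):
            if name == metric:
                return value
        return None

rrd = RoundRobinDB(samples=3)
for i in range(5):
    rrd.record("cpu_util", i)     # only the last 3 samples survive
print(rrd.query("cpu_util"), len(rrd.buf))  # 4 3
```

A scheduler polling `query` always sees the freshest sample, while the buffer's footprint stays constant regardless of uptime.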






CARMA End-to-End Service Description

* Functionality demonstrated to date
  - Graphical user interface
  - Job/task scheduling based on board requirements and configuration temporal locality
  - Parallel and serial jobs
  - CARMA registered and non-registered tasks
  - Remote execution and result retrieval
  - Configuration caching and management
  - Mixed RC and "CPU-only" tasks
  - Heterogeneous board execution (3 types thus far)
  - System and RC device monitoring
  - Inter-node communication via SCI or TCP/IP/GigE
  - Fault-tolerant design: processes can be restarted while running
* Virtually no system impact from CARMA overhead, despite use of unoptimized code
  - Less than 5 MB RAM per node
  - Less than 0.1% processor utilization on a 2.4 GHz Xeon server
  - Less than 200 Kbps network utilization

CARMA execution stages:
1) User submits job
2) JM performs a task schedule request, and the monitor replies with an execution location
3) JM forwards tasks to the local or a remote ExMan
4) If a task requires an RC board, ExMan sends a configuration request to the local CM
5) The CM finds the file and configures the board
6) The user's task is forked (runs on processor)
7) The user's task accesses RC boards via the BIM
8) Task results are forwarded to the originating JM
9) Job results are forwarded to the originating user
Note: all modules update the monitor
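The nine stages above can be traced as a simple hand-off sequence. This toy walk-through stubs out the modules entirely — only the ordering of hand-offs is meaningful, and the RC-specific steps drop out for "CPU-only" tasks:

```python
# Toy trace of the CARMA execution stages; module behavior is stubbed,
# only the hand-off order between modules is modeled.
def run_job(task_needs_board=True):
    trace = ["user: submit job",
             "JM: task schedule request",
             "monitor: reply with execution location",
             "JM: forward task to ExMan"]
    if task_needs_board:
        trace += ["ExMan: configuration request to local CM",
                  "CM: locate bitfile and configure board"]
    trace += ["ExMan: fork user task on processor"]
    if task_needs_board:
        trace += ["task: access RC board via BIM"]
    trace += ["ExMan: forward task results to originating JM",
              "JM: forward job results to originating user"]
    return trace

for step in run_job():
    print(step)
```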








CARMA Framework Verification

* Several test jobs executed concurrently
  - Parallel Add Test composed of
    - ADD.exe, a "CPU-only" task to add two numbers
    - AddOne.bit, an RC task to increment an input value
  - Parallel N-Queens Test composed of
    - ADD.exe, a "CPU-only" task to add two numbers
    - NQueens.bit, an RC1000 task to calculate a subset of the total number of solutions for an NxN board
    - 4 RC1000s and 4 Tararis communicating via MPI
  - Parallel Sieve of Eratosthenes (on Tarari)
  - Parallel Monte Carlo Pi Generator (on Tarari)
  - Blowfish encrypt/decrypt (on ADM-XRC)

[Figure: example system setup showing the user interface driving the N-Queens Test (NQueens.bit, result 92) and the Parallel Add Test across nodes]

Note: these simple applications were used to test CARMA's functionality, while CARMA's services have wider applicability to problems of greater size and complexity.
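For reference, the full computation that NQueens.bit partitions across boards can be sketched in software with standard backtracking (the "92" on the slide is the total solution count for an 8x8 board). This counts all solutions rather than the per-board subset the RC task computes:

```python
# Software reference for the N-Queens solution count; the RC1000 task in
# the test computes only a subset of this total per board.
def nqueens(n):
    def place(row, cols, diag1, diag2):
        if row == n:
            return 1
        total = 0
        for c in range(n):
            if c in cols or (row - c) in diag1 or (row + c) in diag2:
                continue  # square attacked by an earlier queen
            total += place(row + 1, cols | {c},
                           diag1 | {row - c}, diag2 | {row + c})
        return total
    return place(0, frozenset(), frozenset(), frozenset())

print(nqueens(8))  # 92
```

Partitioning is natural here — each board can own a disjoint set of first-row columns and sum its subtotals via MPI, matching the test's 4-board setup.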





Conclusions

* First working version of CARMA complete and tested
* Numerous features supported
  - Simple GUI front-end interface
  - Coherent multitasking, multi-user environment
  - Dynamic RC fabric discovery and management
  - Robust job scheduling and management
  - Fault-tolerant and scalable services by design
  - Performance monitoring down into the RC fabric
  - Heterogeneous board support with hardware independence
  - Linking to COTS job management service
* Initial testing shows the framework to be sound, with very little overhead imposed upon the system








Future Work and Acknowledgements


* Continue to fill in additional CARMA features
  - Include support for other boards, application mappers, and languages
  - Complete JM rollback feature and finish linkage to LSF
  - Include broker and caching mechanisms for the peer-to-peer distributed CM scheme
  - Include more intelligent scheduling algorithms (e.g., Last Release Time)
  - Expand RC device monitoring and include debug and optimization mechanisms
  - Enhance security, including secure data transfer and authentication
  - Deploy on a large-scale test facility

* Develop CARMA instantiations for other RC domains
  - Distributed shared-memory machines with RC (e.g., SGI Altix)
  - Embedded RC systems (e.g., satellite/aircraft systems, munitions)

* We wish to thank the following for supporting this research:
  - Department of Defense
  - Xilinx
  - Celoxica
  - Alpha Data
  - Tarari
  - Key vendors of our HPC cluster resources (Intel, AMD, Cisco, Nortel)





