Title: Fixed and reconfigurable multi-core device characterization for HPEC
Full Citation
Permanent Link: http://ufdc.ufl.edu/UF00094676/00001
 Material Information
Title: Fixed and reconfigurable multi-core device characterization for HPEC
Physical Description: Book
Language: English
Creator: Williams, Jason
George, Alan D.
Richardson, Justin
Gosrani, Kunal
Suresh, Siddarth
Publisher: University of Florida
Place of Publication: Gainesville, Fla.
Publication Date: September, 2008
Copyright Date: 2008
 Notes
General Note: HPEC 2008 (High Performance Embedded Computing)
 Record Information
Bibliographic ID: UF00094676
Volume ID: VID00001
Source Institution: University of Florida
Holding Location: University of Florida
Rights Management: All rights reserved by the source institution and holding location.



Full Text






CHREC
NSF Center for High-Performance
Reconfigurable Computing
[Figure: device taxonomy with Homogeneous and Heterogeneous categories and a Reconfigurability label; remaining labels not recoverable]

[Figure: four processing elements (PE1, PE2, PE3, PE4), each running Prg-A]












130 nm FMC: Ambric Am2045, ClearSpeed CSX600, Freescale MPC7447
90 nm RMC: Altera Stratix-II EP2S180, ElementCXI ECA-64, Mathstar Arrix FPOA, Raytheon MONARCH, Tilera TILE64, Xilinx Virtex-4 LX200, Xilinx Virtex-4 SX55
90 nm FMC: Freescale MPC8640D, IBM Cell BE
65 nm RMC: Altera Stratix-III EP3SL340, Altera Stratix-III EP3SE260, Xilinx Virtex-5 LX330T, Xilinx Virtex-5 SX95T
45 nm FMC: Intel Atom N270
40 nm RMC: Altera Stratix-IV EP4SE530







[Equations: not recoverable from the extracted text]






$$\mathrm{IMB}_{cache} = \sum_i \frac{\%hitrate \times N_i \times P_i \times W_i \times f_i}{8 \times CPA_i}$$

$$\mathrm{IMB}_{block} = \sum_i \frac{N_i \times P_i \times W_i \times f_i}{8 \times CPA_i}$$

%hitrate: hit-rate scale factor
N_i: # of blocks of element i
P_i: # of ports or simultaneous accesses supported by element i
W_i: width of datapath
f_i: memory operating frequency, variable for FPGAs
CPA_i: # of clock cycles per memory access
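
As a rough illustration of how the memory bandwidth terms defined above combine, the sketch below sums hit rate x blocks x ports x width x frequency / (8 x cycles per access) over a device's memory elements. The class and function names are hypothetical, not taken from the poster. The example input uses the Virtex-5 SX95T entry from the FPGA Device Features table later in this document (488 72-bit dual port blocks @ 550 MHz). It further assumes that "dual port" means two simultaneous accesses, that a block memory needs one cycle per access with a hit-rate factor of 1, and that bits/8 times MHz is read as MB/s.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class MemoryElement:
    """One class of on-chip memory (hypothetical container for the legend's symbols)."""
    blocks: int                      # N_i: number of blocks of this element
    ports: int                       # P_i: ports or simultaneous accesses
    width_bits: int                  # W_i: datapath width in bits
    freq_mhz: float                  # f_i: memory operating frequency in MHz
    cycles_per_access: float = 1.0   # CPA_i: assumed 1 for block memories
    hit_rate: float = 1.0            # %hitrate: assumed 1.0 when no cache is involved


def internal_memory_bandwidth(elements: Iterable[MemoryElement]) -> float:
    """Sum the per-element bandwidth terms; result is in MB/s under the stated unit assumption."""
    total = 0.0
    for e in elements:
        # Dividing by 8 converts bits to bytes; dividing by CPA_i accounts for multi-cycle accesses.
        total += (e.hit_rate * e.blocks * e.ports * e.width_bits * e.freq_mhz
                  / (8.0 * e.cycles_per_access))
    return total


# Virtex-5 SX95T block memory as listed in the device table (dual port assumed to mean P_i = 2).
sx95t = [MemoryElement(blocks=488, ports=2, width_bits=72, freq_mhz=550.0)]
print(internal_memory_bandwidth(sx95t))
```

For a cache, the same function can be called with `hit_rate` below 1.0 and `cycles_per_access` set to the cache's access latency, which scales the contribution of that element down accordingly.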










Device | Bit-level Raw | Bit-level Sustain. | 16-bit Int. Raw | 16-bit Int. Sustain. | 32-bit Int. Raw | 32-bit Int. Sustain. | SPFP Raw | SPFP Sustain. | DPFP Raw | DPFP Sustain.
Arrix FPOA | 6144 | 6144 | 384 | 384 | | | | | |
CSX600 | 1536 | 1536 | 24 | 24 | 24 | 24 | 24 | 24 | 24 | 24
MPC7447 | 288 | 288 | 17 | 17 | | | | | |
MPC8640D | 576 | | 34 | 34 | 19 | 18 | | | |









[Figure: two panels plotted against Parallel Operations (RMC) and Parallel Operations (FMC), x-axis 0 to 500,000. RMC series: V4 LX200, V4 SX55, EP2S180, MONARCH, ECA-64, FPOA, TILE64 (90 nm FPGAs and non-FPGAs); V5 LX330T, V5 SX95T, EP3SL340, EP3SE260 (65 nm FPGAs); EP4SE530 (40 nm). FMC series: Am2045, CSX600, MPC7447 (130 nm); Cell, MPC8640D (90 nm); Atom N270 (45 nm).]


[Figure: two panels plotted against Parallel Operations (RMC) and Parallel Operations (FMC), x-axis 0 to 3,500. RMC series: V4 LX200, V4 SX55, EP2S180, MONARCH, ECA-64, FPOA, TILE64, V5 LX330T, V5 SX95T, EP3SL340, EP3SE260, EP4SE530. FMC series: Am2045, CSX600, MPC7447, Cell, MPC8640D, Atom N270.]


[Figure: two panels plotted against Parallel Operations (RMC) and Parallel Operations (FMC), x-axis 0 to 1,400. RMC series: V4 LX200, V4 SX55, EP2S180, MONARCH, ECA-64, FPOA, TILE64, V5 LX330T, V5 SX95T, EP3SL340, EP3SE260, EP4SE530. FMC series: Am2045, CSX600, MPC7447, Cell, MPC8640D, Atom N270.]













[Figure: two panels plotted against Parallel Operations (RMC) and Parallel Operations (FMC). RMC series: V4 LX200, V4 SX55, EP2S180, MONARCH (90 nm); V5 LX330T, V5 SX95T, EP3SL340, EP3SE260 (65 nm); EP4SE530 (40 nm). FMC series: CSX600, MPC7447 (130 nm); Cell, MPC8640D (90 nm); Atom N270 (45 nm).]


[Figure: two panels plotted against Parallel Operations (RMC) and Parallel Operations (FMC), x-axis 0 to 500. RMC series: V4 LX200, V4 SX55, EP2S180 (90 nm); V5 LX330T, V5 SX95T, EP3SL340, EP3SE260 (65 nm); EP4SE530 (40 nm). FMC series: CSX600, MPC7447 (130 nm); Cell, MPC8640D (90 nm); Atom N270 (45 nm).]


[Figure: two panels. One is plotted against Hit Rate (0% to 100%) with series MPC7447 L1 D+I, MPC7447 L2, MPC8640D L1 D+I, MPC8640D L2, TILE64 L1 D+I, TILE64 L2. The other is plotted against Achievable Frequency (MHz) with series CSX600, Cell LS, Am2045, V4 SX55, V4 LX200, EP2S180, FPOA, MONARCH, ECA-64, EP3SL340, EP3SE260, V5 LX330T, V5 SX95T, EP4SE530.]















[Diagram labels: Arithmetic, Memory Operations, Long-Term Goals, Degree of Parallelism, Density, Device, Intensity, Bandwidth]


2D Convolution (I = image size and s = filter size)
For I = 512; s = 3; Computational Intensity = 9.9
For I = 512; s = 7; Computational Intensity = 8.9
For I = 512; s = 15; Computational Intensity = 8.5
CFAR Computational Intensity = 2.1
Radix-4 FFT Computational Intensity = 4.7
Direct Form FIR Computational Intensity = 4.1
Matrix Multiply Computational Intensity = 2.0




Bit-level CDW EP4SE530 EP4SE530 Am2045 V4 LX200


16-bit Integer CDW V5 SX95T V5 SX95T Am2045 V4 SX55

32-bit Integer CDW V5 SX95T V5 SX95T Am2045 V4 SX55

SPFP CDW V5 SX95T V5 SX95T Cell V4 SX55

DPFP CDW EP4SE530 EP4SE530 CSX600 CSX600


IMB EP4SE530 EP4SE530 Am2045 EP2S180


















Process | Device | Cores | Instructions Issued/Core | Datapath Width (bits) | Frequency (MHz) | Power (W) | On-chip Memory
130 nm | Am2045 | 360 | 3+1 | 32 | 350 | 15 | 45 brics ea. w/ 8 SRAM banks
130 nm | CSX600 | 1+96 | 1 | 64 | 250 | 10 | I, D caches, 96 32-bit banks SRAM
130 nm | MPC7447 | 1+1 | 1+2 Int, 2+1 SPFP, 3 DPFP | 32/128 | 1000 | 10 | L1-I, L1-D: 4 words/access @ 2 cycles/access, L2: 8 words/access @ 9 cycles/access
90 nm | Cell BE | 1+8 | 2+1 | 64/128 | 3200 | 70 | L1-I, L1-D, L2 (PPE), 8 128-bit LS banks (SPEs)
90 nm | MPC8640D | 2+2 | 1+2 Int, 2+1 SPFP, 3 DPFP | 32/128 | 1000 | 14 | Ea. core: L1-I, L1-D: 4 words/access @ 2 cycles/access, L2: 8 words/access @ 11.5 cycles/access
45 nm | Atom N270 | | | 64/128 | 1600 | Unknown |

FPGA Device Features

Process | Device | LUTs | DSPs | Max. Frequency (MHz) | Min. Power (W) | Max. Power (W) | On-chip Memory
90 nm | Stratix-II EP2S180 | 143,520 | 768 | 500 | 3.26 | 30 | 9 128-bit dual port blocks @ 420 MHz, 768 32-bit dual port blocks @ 550 MHz, 930 16-bit dual port blocks @ 500 MHz
90 nm | Virtex-4 SX55 | 49,152 | 512 | 500 | 110 | | 48 72-bit dual port blocks @ 600 MHz, 864 32-bit dual port blocks @ 580 MHz
90 nm | Virtex-4 LX200 | 178,176 | 96 | 500 | 1.27 | 23 | 48 72-bit dual port blocks @ 600 MHz, 1040 32-bit dual port blocks @ 580 MHz
65 nm | Stratix-III EP3SE260 | 203,520 | 768 | 550 | 2.11 | 25 | 320 32-bit dual port blocks @ 500 MHz
65 nm | Stratix-III EP3SL340 | 270,400 | 576 | 550 | 2.83 | 32 | 336 32-bit dual port blocks @ 500 MHz
65 nm | Virtex-5 SX95T | 58,800 | 640 | 550 | 1.89 | 10 | 488 72-bit dual port blocks @ 550 MHz
65 nm | Virtex-5 LX330T | 207,360 | 192 | 550 | 3.43 | 27 | 648 72-bit dual port blocks @ 550 MHz
40 nm | Stratix-IV EP4SE530 | 424,960 | 1,024 | | | | 64 72-bit dual port blocks @ 600 MHz, 1280 32-bit dual port blocks @ 600 MHz

90 nm RMC

Device | | Frequency (MHz) | Min. Power (W) | Max. Power (W) | On-chip Memory
ElementCXI ECA-64 | 64 16-bit hetero. elements | | | | 4 16-bit memory units, 5 simultaneous operations
Mathstar Arrix FPOA | 256 16-bit ALUs, 64 16x16 MACs | 1000 | 18.82 @ 25% | 46.25 @ 100% | 80 32-bit dual port banks @ 1 GHz, 12 72-bit single port banks @ 500 MHz
Raytheon MONARCH | 6 32-bit RISC processor cores, 12 256-bit Arithmetic Clusters | 333 | 6.7 | 33 | 31 memory clusters, 4 memories/cluster, dual ported, 32 bits wide
Tilera TILE64 | 64 32-bit 3 issue VLIW processor cores | | | | 64 32-bit L1 I, D caches, Unified L2 cache @ 7 cycle access

Device | Bit-Op | 16-bit Int. | 32-bit Int. | SPFP | DPFP
Stratix-II EP2S180 | 500 | 420 | 410 | 286 | 148
Stratix-III EP3SE260 | 550 | 273 | 400 | 329 | 195
Stratix-III EP3SL340 | 550 | 273 | 400 | 329 | 195
Stratix-IV EP4SE530 | 550 | 243 | 291 | 241 | 184
Virtex-4 SX55 | 500 | 249 | 344 | 274 | 185
Virtex-4 LX200 | 500 | 249 | 344 | 274 | 185
Virtex-5 SX95T | 550 | 378 | 463 | 357 | 237
Virtex-5 LX330T | 550 | 378 | 463 | 357 | 237



