

## Requirements for Scalable Application Specific Processing in Commercial HPEC

Steve Miller, Chief Engineer

Igniting Innovation and Leadership



scm 9/28/2004

## The 3 Single-Paradigm Architectures



| Scalar        | Vector  | App-Specific        |
|---------------|---------|---------------------|
| Intel Itanium | Cray X1 | Graphics - GPU      |
| SGI MIPS      | NEC SX  | Signals - DSP       |
| IBM Power     |         | Programmable - FPGA |
| Sun SPARC     |         | Other ASICs         |
| HP PA         |         |                     |




### Paradigms to Applications





## **Architectural Challenges**



- Hardware
  - Bandwidth to/from System
  - Scalability
- Software
  - Compilers/Languages
  - Debuggers
  - APIs




## Multi-Paradigm Computing UltraViolet

[Figure: multi-paradigm system diagram; legible labels: Scalar, Vector]



### Terascale to Petascale Data Set: Bring Function to Data

#### **Scalable Shared Memory**

- Globally addressable
- Thousands of ports
- Flat & high bandwidth
- Flexible & configurable

[Figure labels: Scalar, Reconfigurable, Graphics]

### Software



### Provide for HDL modules

- Integrated environment with debugger
- Highest performance
- Leverage 3rd-party standard language tools
  - Celoxica, Impulse Accelerated Technologies, Mitrion, Mentor Graphics
- Developed an FPGA-aware version of GDB
  - Capable of debugging the FPGA and system software
  - Capable of handling multiple CPUs and multiple FPGAs
- Developed RASC Abstraction Layer (RASCAL)



### Software Overview



| Layer        | Components                                                                                     |
|--------------|------------------------------------------------------------------------------------------------|
| User Space   | Application, Debugger (GDB), Abstraction Layer Library, Device Manager, Download Utilities     |
| Linux Kernel | Alg                                                                                            |
| Hardware     | COP                                                                                            |




## **Abstraction Layer: Algorithm API**


The Abstraction Layer's algorithm API mirrors the COP API with a few additions that enable wide scaling,







- Direct connection to NUMAlink4: 6.4 GB/s per connection
- Fast system-level reprogramming of FPGA
- Atomic memory operations: same set as system CPUs
- Hardware barriers
- Configurations up to 8191 NUMA/FPGA connections
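
The per-connection bandwidth and the maximum connection count can be combined into a rough upper bound on aggregate bandwidth. This is also a way to see why the deck talks about bringing function to terascale and petascale data sets. The sketch below is a back-of-the-envelope calculation only. It assumes decimal gigabytes, that every connection sustains its full 6.4 GB/s at the same time, and no contention or protocol overhead; none of these assumptions is stated in the slides. The function names `aggregate_bandwidth` and `stream_time` are illustrative and not part of any SGI API.

```python
# Back-of-the-envelope peak figures built from the two numbers on the slide.
GB = 1e9  # assumption: decimal gigabytes; the slide does not say GB vs GiB

LINK_BANDWIDTH = 6.4 * GB  # bytes/s per NUMAlink4 connection (from the slide)
MAX_CONNECTIONS = 8191     # largest configuration (from the slide)


def aggregate_bandwidth(connections: int, per_link: float = LINK_BANDWIDTH) -> float:
    """Theoretical peak in bytes/s if all connections run at full rate at once."""
    if not 1 <= connections <= MAX_CONNECTIONS:
        raise ValueError(f"connections must be in 1..{MAX_CONNECTIONS}")
    return connections * per_link


def stream_time(data_bytes: float, connections: int) -> float:
    """Seconds to move data_bytes once, ignoring contention and overhead."""
    return data_bytes / aggregate_bandwidth(connections)


if __name__ == "__main__":
    for n in (1, 64, MAX_CONNECTIONS):
        peak = aggregate_bandwidth(n)
        print(
            f"{n:5d} connections: {peak / 1e12:8.3f} TB/s peak, "
            f"1 TB in {stream_time(1e12, n):10.3f} s, "
            f"1 PB in {stream_time(1e15, n):12.1f} s"
        )
```

Under these idealized assumptions a single connection needs a little over two and a half minutes to stream one terabyte, while the full 8191-connection configuration peaks at roughly 52 TB/s. Real throughput would depend on topology, memory placement, and the FPGA algorithm itself.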



# **MOATB Block Diagram**


