

# Adaptive beamforming using QR in FPGA

Richard Walke, Real-time System Lab, Advanced Processing Centre, S&E Division



### Contents

- 1 Architecture of adaptive beamformer
- 2 FPGA components
  - Digital receiver
  - QR processor for adaptive weight calculation
- 3 Design methodology
- 4 Demonstration overview
- 5 Conclusions



### Section 1 Architecture of adaptive beamformer



#### Architecture of adaptive beamformer Adaptive beamformer



### Section 2 FPGA components




### FPGA Components Software configurable FIR

- Software programmable parameters include:
  - filter length
  - decimation ratio
  - complex/real arithmetic
  - number of channels
  - time-varying filtering (inter- and intra-pulse)
- Performance: 20-30 GOPS on XC2V6000-5, 100 GOPS on Virtex-II Pro (2003)
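The parameters listed above (filter length, decimation ratio, complex or real arithmetic, number of channels) can be illustrated with a minimal software sketch. This is not the FPGA core itself; the function names `fir_decimate` and `fir_multichannel` are hypothetical, and time-varying coefficients are left out for brevity.

```python
def fir_decimate(samples, taps, decimation=1):
    """Filter `samples` with `taps`, keeping every `decimation`-th output.

    Works for real or complex data because Python arithmetic is generic.
    Only the retained outputs are computed, so the work per input sample
    falls as the decimation ratio rises.
    """
    if decimation < 1:
        raise ValueError("decimation must be >= 1")
    outputs = []
    # Start at the first index where the filter window is fully populated.
    for n in range(len(taps) - 1, len(samples), decimation):
        acc = 0
        for k, h in enumerate(taps):
            acc += h * samples[n - k]
        outputs.append(acc)
    return outputs


def fir_multichannel(channels, taps, decimation=1):
    """Apply the same filter independently to each channel."""
    return [fir_decimate(ch, taps, decimation) for ch in channels]
```

The filter length is simply `len(taps)`, so changing any of the listed parameters means changing an argument rather than the structure of the code, which is the software analogue of a run-time configurable core.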



### FPGA Components Weight calculation using QR

- Building block for a range of adaptive algorithms
  - Sample matrix inversion (SMI)
  - Soft constraints
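A minimal sketch of how QR can serve as the building block for SMI-style weights, under stated assumptions: each row of the data matrix `X` is one snapshot across the array, the beamformer output for a snapshot `x` is `sum(x[i] * w[i])`, and the weights solve `(X^H X) w = s` up to a scale factor. The function names are hypothetical, and the slides do not specify which formulation the processor uses.

```python
import math


def qr_triangularise(rows, n):
    """Absorb snapshot rows one at a time into an n x n upper-triangular R.

    Each row is annihilated against R with complex Givens rotations, so
    that R^H R equals X^H X without ever forming the covariance matrix.
    """
    R = [[0j] * n for _ in range(n)]
    for row in rows:
        x = [complex(v) for v in row]
        for i in range(n):
            a, b = R[i][i], x[i]
            r = math.sqrt(abs(a) ** 2 + abs(b) ** 2)
            if r == 0.0:
                continue  # nothing to rotate
            c1, c2 = a.conjugate() / r, b.conjugate() / r
            for j in range(i, n):
                rij, xj = R[i][j], x[j]
                R[i][j] = c1 * rij + c2 * xj      # rotated R row
                x[j] = (-b * rij + a * xj) / r    # x[i] becomes zero
    return R


def smi_weights(R, steering):
    """Solve (R^H R) w = s by forward then back substitution.

    Assumes R has a non-zero diagonal (full-rank data).
    """
    n = len(R)
    u = [0j] * n
    for i in range(n):  # forward substitution: R^H u = s
        acc = complex(steering[i]) - sum(R[j][i].conjugate() * u[j] for j in range(i))
        u[i] = acc / R[i][i].conjugate()
    w = [0j] * n
    for i in reversed(range(n)):  # back substitution: R w = u
        acc = u[i] - sum(R[i][j] * w[j] for j in range(i + 1, n))
        w[i] = acc / R[i][i]
    return w
```

Working on the data matrix rather than the explicit covariance matrix is what gives the good numerical properties usually associated with QR.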



### FPGA Components QR decomposition



#### FPGA Components Features of QR

- Good numerical properties. Arithmetic choices:
  - CORDIC: shift-add
  - Fixed-point: multiply-add
  - Floating-point: higher dynamic range, which allows algorithms with fewer operations and a shorter wordlength. The smallest of the three options!
- Highly parallel (Givens rotations)
  - Suits FPGA
  - Need to reduce parallelism for many applications!
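To illustrate the shift-add option, here is a minimal integer CORDIC sketch for a real-valued Givens rotation. It is an assumption-laden illustration, not the design from the slides (which favour floating-point): the function names are hypothetical, the input must have `x >= 0`, and complex data would need additional stages. The vectoring step plays the role of a boundary cell, and replaying its direction bits plays the role of an internal cell.

```python
import math


def cordic_vectoring(x, y, iterations=16):
    """Rotate the integer vector (x, y) onto the x-axis with shifts and adds.

    Requires x >= 0. Returns the scaled magnitude, the residual y, and the
    per-iteration rotation directions.
    """
    directions = []
    for i in range(iterations):
        d = -1 if y >= 0 else 1  # always rotate towards y == 0
        x, y = x - d * (y >> i), y + d * (x >> i)
        directions.append(d)
    return x, y, directions


def cordic_rotate(x, y, directions):
    """Apply a previously recorded rotation to another vector."""
    for i, d in enumerate(directions):
        x, y = x - d * (y >> i), y + d * (x >> i)
    return x, y


def cordic_gain(iterations):
    """Constant magnitude growth introduced by the micro-rotations."""
    g = 1.0
    for i in range(iterations):
        g *= math.sqrt(1.0 + 2.0 ** (-2 * i))
    return g
```

For example, `cordic_vectoring(30000, 40000)` drives `y` close to zero and leaves `x` near 50000 times the gain (about 1.647), which a real design would fold into later scaling.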



### FPGA Components Obtaining lower-levels of parallelism



### FPGA Components Novel mapping of QR





Linear systolic array



#### FPGA Components FPGA implementation

Number of Ops



XCV3200E-8: 139 14-bit floating-point operators @ 160 MHz = 22 GFLOPS
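As a quick check of this figure, assuming each of the 139 operators completes one floating-point operation per clock cycle (an assumption; the slide does not state the operator throughput):

$$
139 \times 160 \times 10^{6}\ \text{operations/s} = 22.24 \times 10^{9}\ \text{operations/s} \approx 22\ \text{GFLOPS}
$$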

### FPGA Components QR processor - main features

| Size | Mantissa<br>wordlength | Clock | Resource usage (XC2V6000) | Utilisation (XC2V6000) | Operations | Power | |
|------|------------------------|-------|---------------------------|------------------------|------------|-------|-----|
| 1 Boundary<br>3 Internal | 14-bit | 101 MHz <sup>5</sup> | Mults: 32 (22%)<br>RAMs <sup>3</sup>: 34 (23%)<br>LUTs: 15K (22%)<br>FFs: 16K (23%) | 23% | 6 GFLOPS | 2.24 W <sup>4</sup> | |
| 1 Boundary<br>12 Internal | 14-bit | 100 MHz <sup>2</sup> | | 82% <sup>2</sup> | 20.3 GFLOPS | 8 W <sup>6</sup> | |
| 1 Boundary<br>9 Internal | 17-bit | 97 MHz | | 74% | 15 GFLOPS | 7 W <sup>6</sup> | |
| Pentium <sup>™</sup> 4 2 GHz <sup>1</sup> | | | | | 4 GFLOPS | 70 W | ×50 |

1. Estimated (based on data from Richard Linderman)
2. Estimated (design too large for PC)
3. Also depends upon number of inputs
4. Obtained via XPower
5. For XC2V6000-5
6. Extrapolated
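The table invites a performance-per-watt comparison. The sketch below computes GFLOPS per watt for each row and the ratio to the Pentium 4 baseline. The dictionary keys and function names are hypothetical labels; the numbers are copied from the table. The slide does not say what the ×50 entry measures, so reading it as an approximate efficiency ratio is only one plausible interpretation.

```python
# (GFLOPS, watts) as listed in the table above.
DESIGNS = {
    "QR 1+3, 14-bit": (6.0, 2.24),
    "QR 1+12, 14-bit": (20.3, 8.0),
    "QR 1+9, 17-bit": (15.0, 7.0),
    "Pentium 4 2GHz": (4.0, 70.0),
}


def gflops_per_watt(gflops, watts):
    """Energy efficiency of one design."""
    return gflops / watts


def efficiency_ratios(designs, baseline):
    """Efficiency of every design relative to the named baseline."""
    base = gflops_per_watt(*designs[baseline])
    return {name: gflops_per_watt(*gw) / base for name, gw in designs.items()}


if __name__ == "__main__":
    for name, ratio in efficiency_ratios(DESIGNS, "Pentium 4 2GHz").items():
        print(f"{name}: {ratio:.1f}x")
```

With these figures the three QR designs come out at roughly 37 to 47 times the baseline efficiency.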



### Section 3 Heterogeneous design methodology



### Heterogeneous design methodology GEDAE™

- Graphically specify system
  - primitive functions in C
  - executable specification
- Auto-code generation
  - parallel programme constructed by GEDAE
- Currently no support for FPGAs
  - but the model is highly compatible

![](_page_20_Figure_8.jpeg)

### Heterogeneous design methodology Core-based methodology

- Cores used for key functions
  - FFT, QR, FIR filter ...
  - Parallelism built in (manually)
  - Parameterised
- Automatically generated system
  - communications inserted
- Architectural exploration
  - Compaan translates Matlab nested loop programs (NLP) to VHDL
  - RTL output in future version

![](_page_23_Figure_9.jpeg)

![](_page_23_Picture_10.jpeg)

### Section 4 Adaptive beamformer demonstration overview

![](_page_24_Picture_1.jpeg)

### Demonstration overview System mapping

![](_page_25_Figure_1.jpeg)

### Conclusion

- FPGAs
  - Performance dependent upon level of optimisation
  - Floating-point is realistic
  - 10x compute improvement
  - 5-20x power improvement
- Design is main issue
  - Hardware design: High levels of parallelism required
  - Core-based design approaches offer interim solution
  - Architectural synthesis tools are emerging

![](_page_26_Picture_10.jpeg)

### Acknowledgements

- The project to develop a core-based design methodology is a collaboration between:
  - QinetiQ Ltd
    - poc: rlwalke@qinetiq.com
  - BAE SYSTEMS ATC, Gt Baddow
    - poc: ian.alston@baesystems.com
- Contributions have been made by John McAllister under contract with the Queen's University of Belfast.
- This work was sponsored by the United Kingdom Ministry of Defence Corporate Research Programme.

![](_page_27_Picture_8.jpeg)