



### **CEARCH**

## Cognition Enabled ARCHitecture

Stephen Crago and Janice McMahon, USC/ISI

Chris Archer<sup>1</sup>, Krste Asanovic<sup>2</sup>, Richard Chaung<sup>3</sup>, Keith Goolsbey<sup>4</sup>, Mary Hall<sup>5</sup>, Christos Kozyrakis<sup>6</sup>, Kunle Olukotun<sup>6</sup>, Una-May O'Reilly<sup>2</sup>, Rick Pancoast<sup>7</sup>, Viktor Prasanna<sup>8</sup>, Rodric Rabbah<sup>2</sup>, Steve Ward<sup>2</sup>, Donald Yeung<sup>9</sup>

### September 20, 2006

<sup>1</sup>Northrop Grumman, <sup>2</sup>MIT, <sup>3</sup>Army I2WD, <sup>4</sup>Cycorp, <sup>5</sup>USC/ISI, <sup>6</sup>Stanford University, <sup>7</sup>Lockheed Martin, <sup>8</sup>USC, <sup>9</sup>University of Maryland

The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the Defense Advanced Research Projects Agency (DARPA) or the U.S. Government. Effort sponsored by the Defense Advanced Research Projects Agency (DARPA) through the Department of the Interior National Business Center under grant number NBCH104009.



## **Outline**



- Project Goals
- Architecture Characteristics
- Application Examples
- Summary



## **CEARCH Goals**



- Develop a computer architecture that supports cognitive information processing
  - Computer architecture: a set of hardware and system software interfaces and implementations
- Support real-time, embedded cognitive processing requirements through an efficient, high-performance computer architecture
- Identify algorithms and improved algorithm implementations that can leverage the CEARCH computer architecture
- CEARCH is not a cognitive architecture project
  - Cognitive architecture: a computational model (usually expressed in software) for a complete cognitive system that may or may not be based on human psychology



# CEARCH and Cognitive Architectures



- The CEARCH computer architecture will run a *variety* of cognitive architectures efficiently
  - Multiple cognitive architectures are important
    - No single consensus on cognitive architectures
    - Important to support emerging cognitive architecture research: each IPTO program in this domain has its own cognitive architecture
    - Different domains may require different cognitive architectures
  - Support for a variety of cognitive architectures
    - A wide range of cognitive algorithms drives the CEARCH architecture to ensure coverage
    - Adaptivity and scalability are emphasized to support the dynamic processing requirements critical to all cognitive architectures
- The CEARCH computer architecture has some characteristics of a cognitive system
  - Introspection and self-management: knows what it is doing and how to process efficiently
  - Learns how to process more efficiently over time
  - Supports inexact computations when optimality is not feasible or possible
  - Robust processing in the context of faults



### **CEARCH Team**



## Program Lead

- Steve Crago (Co-PI, ISI)
- Janice McMahon (Co-PI, ISI)
- Bob Parker (ISI)



## Military Requirements & Applications

- Janice McMahon (ISI)
- Steve Crago (ISI)
- UAV Sensor Fusion
  - Chris Archer (NG)
  - Mark Akey (NG)
  - Kirk Dunkelberger (NG)
- Threat Analysis and Planning
  - Rick Pancoast (LM)
  - Jim Kilian (LM)
- UGS Sensor Fusion

## Cognitive Algorithms Definition

- Janice McMahon (ISI)
- Probabilistic Reasoning and Learning
  - Sebastian Thrun (Stanford)
  - Daphne Koller (Stanford)
  - Gary Bradski (Intel)
- Evolutionary/Machine Learning
  - Una-May O'Reilly (MIT)
  - Leslie Kaelbling (MIT)
- Knowledge Base Reasoning and Learning
  - Keith Goolsbey (Cycorp)
  - Michael Witbrock (Cycorp)

## Computing Architectures Integration & Mapping

- Steve Crago (ISI)
- Janice McMahon (ISI)
- InfiniT Processor and Run-Time System
  - Krste Asanovic (MIT)
  - Rodric Rabbah (MIT)
  - Steve Ward (MIT)
- Transactional Memory
  - Kunle Olukotun (Stanford)
  - Christos Kozyrakis (Stanford)
- Soft Computing Architectures
  - Don Yeung (ISI, UMd)
- Compiler with Learning
  - Mary Hall (ISI)
- Parallelization
  - Viktor Prasanna (USC)
  - Cauligi Raghavendra (USC)













## **CEARCH Project Overview**







## **Scenario Summary**

- UGS Urban Situational Awareness
- Shipboard Threat Analysis and Planning
- Multi-UAV Sen…
- UAV-based Behavior Spotting



| Kernel | Example scenario requirement | Example architectural drivers |
|--------|------------------------------|-------------------------------|
| Probabilistic Relational Model (Learn, Infer) | 1-2 Tera-updates / sec on large graphs | Probabilistic computation |
| SATisfiability-based Planner | 1 Giga-Boolean-inferences / sec | Parallel tree traversal |
| Support Vector Machine Classification | 2 Tera-ops (variable-precision floating point) / sec | Flexible caching for sparse vectors |
| Information-form Data Association Tracking | 2 Tera-ops (probability calculations) / sec | Parallel sparse matrix calculations |
| Symbolic Reasoning and Learning | 313K problem trees per second | Symbolic matching, irregular memory accesses |
| System | | Rapid high-level reorganization and responsivity |

- Cognitive reasoning and learning techniques require new computing platforms to enable new real-time, embedded capabilities and missions
- Must combine orders of magnitude performance/efficiency improvement with ability to respond rapidly to the needs of dynamic environments






# Why Do We Need Hardware for Cognitive Systems?



- Introspective and Self-Managing Computing
  - Must support introspective information flow from applications to hardware (and back) to support cognitive resource management and introspective applications
- Scalable Web of Cognitive Virtual Processing Elements
  - Efficient, high-performance computation required to support real-time reasoning and learning requirements
  - Must be adaptable and able to support a variety of cognitive processing paradigms (graphs, symbolic reasoning, etc.) and dynamic requirements
- Multi-level Soft Computing
  - Support for probabilistic and inexact data types and computation pervasive in system (processing, memory, communication, programming model, run-time system)
- Adaptive Memory System
  - Unpredictable, irregular memory accesses and large working sets
  - Driven by parallel computation, dynamic resource allocation, and fundamental characteristics of algorithms and data



## Introspection and Self-Management



- System must adapt to unpredictability in cognitive systems
  - Dynamic scenarios lead to dynamic and unpredictable changes in processing requirements
  - Cognitive processing too complex to be managed by programmer
    - Cognitive algorithms provide means for system to manage itself
  - Faults are unavoidable at this scale
- Introspection required to support autonomous adaptability
  - Processing: precision, performance required, operation mixes, efficiency of functional units
  - Memory and Communication: access/communication patterns, cache hit rates, working set sizes, precision required, bandwidth/latency trade-offs, protection
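
As a rough illustration of introspection driving adaptation, the Python sketch below monitors a cache's recent hit rate and grows the allocation when the rate falls below a target. Everything here is an assumption made for the example: the `IntrospectiveCache` class, the LRU replacement, the sliding window, and the doubling rule are hypothetical and are not taken from the CEARCH design.

```python
from collections import OrderedDict, deque


class IntrospectiveCache:
    """Hypothetical LRU cache that observes its own hit rate and resizes itself."""

    def __init__(self, capacity=2, window=8, target=0.5, max_capacity=64):
        self.capacity = capacity
        self.max_capacity = max_capacity
        self.target = target                  # assumed minimum acceptable hit rate
        self._lines = OrderedDict()           # LRU order: oldest entry first
        self._recent = deque(maxlen=window)   # 1 for a hit, 0 for a miss

    def hit_rate(self):
        """Hit rate over the most recent accesses (0.0 if nothing observed yet)."""
        return sum(self._recent) / len(self._recent) if self._recent else 0.0

    def access(self, addr):
        hit = addr in self._lines
        self._recent.append(1 if hit else 0)
        if hit:
            self._lines.move_to_end(addr)
        else:
            self._lines[addr] = True
            if len(self._lines) > self.capacity:
                self._lines.popitem(last=False)   # evict least recently used
        self._adapt()
        return hit

    def _adapt(self):
        """Double the allocation when a full window shows a poor hit rate."""
        window_full = len(self._recent) == self._recent.maxlen
        if window_full and self.hit_rate() < self.target and self.capacity < self.max_capacity:
            self.capacity = min(self.capacity * 2, self.max_capacity)
            self._recent.clear()   # measure the new configuration afresh


cache = IntrospectiveCache(capacity=2)
for i in range(40):
    cache.access(i % 4)            # working set of 4 lines, initially too large
print(cache.capacity, cache.hit_rate())
```

The cyclic access pattern misses on every access until the monitor reacts, after which the working set fits and the observed hit rate recovers.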





# Scalable Web of Cognitive Virtual Processing Elements



- Cognitive processing requires massive fine-grained parallelism with highly efficient processing elements
- Cognitive processing elements differ from general-purpose computing, scientific computing, and signal processing elements
  - Processing granularity highly variable and dynamic
  - Cognitive systems and scenarios lead to dynamic code and data movement and load balancing
  - Density of parallelism must be much higher to do real-time reasoning and learning in complex scenarios



**Parallelism With Varying Granularity and Computation Types** 



## **Multi-Level Soft Computing**



- Exploit the tolerance for imprecision, uncertainty, partial truth, and approximation to achieve tractability, robustness and low solution cost\*
  - Optimality or exactness infeasible in cognitive application domains
  - Input data has imprecision and inaccuracy
  - Robustness needed to handle transient and persistent faults
- Exploitation of soft computing for performance gains changes architecture at all levels
  - Processor: data types, functional units, circuit design
  - Memory: local and shared lossy memory protocols, latency reduction
  - Communication: lossy protocols, QoS tuning
  - System software: data types, communication of precision trade-offs to programmer, resource management

\*http://www.soft-computing.de/def.html

**Performance Improvements From Message Dropping** (figure; series: No Dropping, Policy #1, Policy #2)
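
To make the idea of a lossy communication protocol concrete, here is a minimal Python sketch in which an update message is dropped when it differs from the last delivered value by less than a tolerance. The deck does not define Policy #1 or Policy #2, so the `DroppingChannel` class, the `tolerance` parameter, and the threshold rule are illustrative assumptions rather than the CEARCH policies.

```python
from dataclasses import dataclass, field


@dataclass
class DroppingChannel:
    """Hypothetical lossy channel: suppresses updates that change little."""
    tolerance: float                                # assumed precision/traffic knob
    delivered: dict = field(default_factory=dict)   # last value seen by the receiver
    sent: int = 0
    dropped: int = 0

    def send(self, key, value):
        """Deliver `value` unless it is within `tolerance` of the last delivery."""
        last = self.delivered.get(key)
        if last is not None and abs(value - last) < self.tolerance:
            self.dropped += 1       # receiver keeps using the stale value
            return False
        self.delivered[key] = value
        self.sent += 1
        return True


def run(tolerance, updates):
    """Push a sequence of (key, value) updates through a fresh channel."""
    channel = DroppingChannel(tolerance)
    for key, value in updates:
        channel.send(key, value)
    return channel


updates = [("a", 0.50), ("a", 0.505), ("a", 0.60), ("b", 0.1), ("b", 0.1)]
exact = run(0.0, updates)     # a tolerance of zero never drops
lossy = run(0.01, updates)    # small changes are suppressed
print(exact.sent, lossy.sent, lossy.dropped)
```

The receiver trades a bounded staleness (at most `tolerance` per key in this sketch) for reduced traffic, which is the kind of precision trade-off the slide attributes to soft computing.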



## **Adaptive Memory System**



- Cognitive processing leads to poor memory system behavior in traditional memory systems
  - Some algorithms have irregular and hard-to-predict access patterns
  - Working sets can be very large because of the complexity of scenarios
  - Dynamic resource allocation and fine-grained parallelism lead to more global memory accesses and locality challenges
- Memory system requirements
  - Flexible allocation among cognitive processing elements
  - Fine-grained protection
  - Flexible commit policies
  - Inexpensive roll-back for fault tolerance and race conditions between parallel compute elements
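
The commit and roll-back requirements can be illustrated with a small software analogue. The Python sketch below buffers writes inside a transaction, validates the versions it read at commit time, and rolls back by discarding the buffer. It is a single-threaded model under assumed names (`TransactionalStore`, `Transaction`, `Conflict`); it is not the hardware transactional memory the project proposes.

```python
class Conflict(Exception):
    """Raised when data a transaction read was overwritten before it committed."""


class TransactionalStore:
    """Hypothetical store with optimistic, version-checked transactions."""

    def __init__(self):
        self._data = {}      # address -> committed value
        self._version = {}   # address -> number of commits to that address

    def begin(self):
        return Transaction(self)


class Transaction:
    def __init__(self, store):
        self._store = store
        self._reads = {}     # address -> version observed at first read
        self._writes = {}    # address -> buffered value, invisible until commit

    def read(self, addr, default=0):
        if addr in self._writes:
            return self._writes[addr]
        self._reads.setdefault(addr, self._store._version.get(addr, 0))
        return self._store._data.get(addr, default)

    def write(self, addr, value):
        self._writes[addr] = value

    def rollback(self):
        """Roll-back is cheap: nothing was published, so just drop the buffers."""
        self._reads.clear()
        self._writes.clear()

    def commit(self):
        stale = [a for a, seen in self._reads.items()
                 if self._store._version.get(a, 0) != seen]
        if stale:
            self.rollback()
            raise Conflict(stale[0])
        for addr, value in self._writes.items():
            self._store._data[addr] = value
            self._store._version[addr] = self._store._version.get(addr, 0) + 1


store = TransactionalStore()
t1, t2 = store.begin(), store.begin()
t1.write("x", t1.read("x") + 1)
t2.write("x", t2.read("x") + 1)   # races with t1 on the same address
t1.commit()
try:
    t2.commit()
except Conflict:
    print("t2 rolled back")
```

Because writes stay private until commit, a conflicting or faulty transaction can be abandoned without undoing anything in shared state.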






## **CEARCH Architecture Layers**



### **Programming Model**

- Abstraction barriers provide scalable low-level performance with high-level specifications
- Goal-based performance and resource allocation allows computation to be in part selected by system
- Soft computing semantics

### **Runtime System**

- Learning and reasoning-based goal-oriented instrumentation and compilation
- Adaptive and introspective hierarchical resource allocation for processing, memory, and communication

### **Hardware Architecture**

- Millions of introspective virtual processing elements running on thousands of hardware engines
- Adaptive memory for efficient data access and sharing
- Soft computing support
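
One way to picture many virtual processing elements sharing far fewer hardware engines is a cooperative scheduler. In the Python sketch below each virtual element is a generator (a stored thread of execution) and a fixed number of engine slots each advance one ready element per round. The `virtual_pe` workload and the round-robin `run_on_engines` scheduler are assumptions chosen for illustration, not the CEARCH runtime.

```python
from collections import deque


def virtual_pe(pe_id, steps):
    """A hypothetical virtual processing element: yields after each unit of work."""
    total = 0
    for i in range(steps):
        total += pe_id * i     # stand-in for a small reasoning or learning step
        yield
    return total


def run_on_engines(pes, num_engines):
    """Multiplex virtual PEs over `num_engines` slots, one step per slot per round."""
    ready = deque(pes.items())            # (pe_id, generator) pairs awaiting an engine
    results, rounds = {}, 0
    while ready:
        rounds += 1
        for _ in range(min(num_engines, len(ready))):
            pe_id, gen = ready.popleft()
            try:
                next(gen)
                ready.append((pe_id, gen))    # store the thread until an engine is free
            except StopIteration as done:
                results[pe_id] = done.value   # element finished; record its result
    return results, rounds


results, rounds = run_on_engines({i: virtual_pe(i, 3) for i in range(8)}, num_engines=2)
print(results, rounds)
```

The results do not depend on the number of engines; only the number of rounds does, which is the property that lets the mapping of virtual elements to hardware change at run time.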



### Programming Model for the Algorithm: "The Bridge"

- Language expresses the algorithm and algorithm goals
- Architecture-independent and malleable code

### Programming Model for Introspection: "The Engine Room"

- Can analyze the program ("reflection" interface)
- Can find information about the resources/architecture
- Provide rules for
  - Scheduling and resource allocation
  - Learning and adaptation
  - Soft computing and fault tolerance
- Rules provided by
  - Default policies
  - Overwritten by creating generic rules
  - Or custom rules for an application
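
The layering of default policies, generic rules, and application-specific rules can be sketched as a lookup chain. In the Python below the policy names and values are invented for illustration, and the precedence (custom over generic over default) is an assumption; the slides state only that defaults can be overwritten by generic or custom rules.

```python
from collections import ChainMap

# Hypothetical default policies; the keys mirror the rule categories on the slide.
DEFAULT_POLICIES = {
    "scheduling": "round_robin",
    "adaptation": "off",
    "fault_tolerance": "retry",
}


def resolve_policies(generic_rules=None, custom_rules=None):
    """Return the effective policies: custom shadows generic, which shadows defaults."""
    layered = ChainMap(custom_rules or {}, generic_rules or {}, DEFAULT_POLICIES)
    return dict(layered)


effective = resolve_policies(
    generic_rules={"scheduling": "work_stealing"},
    custom_rules={"fault_tolerance": "rollback"},
)
print(effective)
```

A programmer who supplies no rules gets the defaults unchanged, so the introspection interface stays optional for applications that do not need it.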



### **CEARCH Hardware Architecture**





- Stored processor: millions of scalable cognitive virtual processing elements (stored threads) for dynamic parallel reasoning and learning; soft computing
- Multi-level cognitive memory; stored processor working sets
- Processor and memory allocation and precision; reasoning and learning requirements; fault tolerance
- **Adaptive** transactional **Mondriaan** memory: parallel reasoning and learning data accesses, soft coherence, speculation, locality management, cell sharing, isolation and protection






## **Spotting Behaviors OODA Loop**









**HPEC 2006** 

## **Introspection and Self-Management**








## Scalable Web of Virtual Processing Elements







## **Multi-Level Soft Computing**







## **Adaptive Memory**







## **CEARCH Application Speedups**







## LBP Performance Improvement










## **CEARCH Summary**



- CEARCH is a dynamic, self-managing architecture for cognitive processing, uniquely suited to complex environments
  - Driven by cognitive system and algorithm characteristics
  - Dynamically organizes resources to optimize performance, power, and reliability
  - Adaptation and introspection in both hardware and software
- CEARCH has unique features that efficiently support cognitive applications and provide capabilities not possible with today's COTS architectures
  - Stored processor
  - Adaptive, transactional memory
  - Soft computation
  - Introspection and run-time policy control support
- Preliminary architecture evaluation indicates
  - High performance potential
  - Well suited to cognitive applications and soft computing