

# SiFive Performance P270

The **SiFive® Performance™** P270 processor is a 64-bit, Linux-capable, RISC-V Vector enabled application processor, offering best-in-class performance and efficiency for the broadest range of workloads that combine control and vector computation. The P270 pairs a 256-bit RISC-V Vector implementation with an 8-stage, dual-issue, in-order scalar pipeline and a decoupled vector ALU pipeline.

Building on the robust foundations of the comprehensive **SiFive® Essential™** U7-Series portfolio, the multi-core, multi-cluster capable design of the P270 extends the high-end capability with additional features specifically targeting combined control and vector computation applications. The finely tuned combination of out-of-the-box features and design scalability lets designers strike the optimal balance of power, performance, and area while achieving the fastest time to market.

## SiFive Performance P270 Key Features

- 64-bit RISC-V ISA, including the RISC-V Vector extension version 1.0
- 8-stage dual-issue superscalar in-order pipeline for scalar computation
- Coherent multi-core, multi-cluster processor configuration options, with up to 8 cores
- Linux-capable applications processor
- Memory parallelism provides cache miss tolerance
- Multi-layer caching support for optimum data movement
- Decoupled scalar and vector pipelines for optimum parallel execution of scalar and vector computation
- Virtual memory support with up to 48-bit addressing and precise exceptions
- High performance, flexible interfaces to external system memories and peripherals for easier system integration
- INT8, INT16, INT32, and INT64 integer; FP16, FP32, and FP64 floating-point; and Q8.8 to Q15 fixed-point data types
- 256-bit vector register length (VLEN), with 128-bit data ALU width (DLEN)
- 128-bit Vector ALU and Load/Store architecture
- High performance vector memory subsystem
- Vector data stride L2 prefetcher unit
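
As a minimal illustration of the fixed-point formats named in the list above, the following plain C sketch shows how Q8.8 (8 integer bits, 8 fractional bits) and Q15 (15 fractional bits) values can be encoded and multiplied. It assumes signed 16-bit containers; the function names `to_fixed16` and `q15_mul` are hypothetical and not part of any SiFive API.

```c
#include <stdint.h>

/* Convert a real value to 16-bit fixed point with `frac_bits` fractional
   bits (8 for Q8.8, 15 for Q15), rounding to nearest and saturating. */
int16_t to_fixed16(double x, int frac_bits)
{
    double scaled = x * (double)(1 << frac_bits);
    scaled += (scaled >= 0.0) ? 0.5 : -0.5;      /* round half away from zero */
    if (scaled > 32767.0)  return INT16_MAX;     /* saturate high */
    if (scaled < -32768.0) return INT16_MIN;     /* saturate low */
    return (int16_t)scaled;
}

/* Multiply two Q15 values with rounding and saturation.
   Assumes >> on a negative int32_t is an arithmetic shift, which is
   implementation-defined in C but provided by mainstream compilers. */
int16_t q15_mul(int16_t a, int16_t b)
{
    int32_t p = (int32_t)a * (int32_t)b;         /* Q30 product */
    p = (p + (1 << 14)) >> 15;                   /* round, rescale to Q15 */
    if (p > INT16_MAX) p = INT16_MAX;            /* only -1.0 * -1.0 overflows */
    return (int16_t)p;
}
```

For example, 0.5 in Q15 is 16384, and multiplying it by itself with `q15_mul` gives 8192, which is 0.25 in Q15.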



## Comprehensive Linux Support

SiFive provides and supports Linux FPGA bitstreams for all SiFive® Intelligence™, SiFive Performance, and SiFive Essential application processors. The Linux FPGA platform, based on the Xilinx VCU118, provides *a fast, hassle-free way* to experience a complete Linux environment.

It takes just three simple steps to get started:

1. Download and flash the Xilinx VCU118 FPGA with the supplied bitstream for your chosen SiFive processor
2. Download and flash the SiFive Linux BSP onto an SD card
3. Boot Linux

## Multi-Core, Multi-Cluster

The P270 processor can be configured with up to 4 cores in a coherent multi-core cluster and with up to 2 coherent clusters, giving a maximum of 8 cores in a multi-cluster coherent Core Complex. In the multi-core, multi-cluster configuration, cache coherency is managed in the Shared System Cache (SSC), which has a dedicated 512KB or 1MB of memory per bank, up to 8MB in total. The Coherent System Fabric manages both coherency and a crossbar network that gives all cores full connectivity to the shared ports. A multi-cluster design may be configured with a second port to support the bandwidth requirements of multiple clusters.
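
The core-count limits described above can be captured as a small configuration check. This is only a sketch: the `p270_config` structure and both function names are hypothetical, and it assumes every cluster holds the same number of cores, which this brief does not state.

```c
#include <stdbool.h>

/* Limits quoted in the text: up to 4 cores per cluster, up to 2 clusters. */
enum {
    P270_MAX_CORES_PER_CLUSTER = 4,
    P270_MAX_CLUSTERS = 2
};

/* Hypothetical description of a Core Complex configuration. */
struct p270_config {
    unsigned clusters;          /* number of coherent clusters */
    unsigned cores_per_cluster; /* assumed identical in every cluster */
};

/* True when the configuration stays within the quoted limits. */
bool p270_config_is_valid(const struct p270_config *cfg)
{
    return cfg->clusters >= 1 &&
           cfg->clusters <= P270_MAX_CLUSTERS &&
           cfg->cores_per_cluster >= 1 &&
           cfg->cores_per_cluster <= P270_MAX_CORES_PER_CLUSTER;
}

/* Total cores in the Core Complex. */
unsigned p270_total_cores(const struct p270_config *cfg)
{
    return cfg->clusters * cfg->cores_per_cluster;
}
```

A configuration of 2 clusters with 4 cores each passes the check and yields the 8-core maximum; 3 clusters would be rejected.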



## Memory System and Caches

The P270 memory system is designed for scalability and flexibility, so that it can be tuned to the application workload. A 32KB Level 1 instruction cache and a similarly sized Level 1 data cache, both 4-way set associative, combine with a private 256KB Level 2 cache to deliver high performance while minimizing power and area. In multi-core systems, a Level 3 cache can be configured as 1MB, 2MB, or 4MB, with a multi-cluster option of 8MB.



## P270 Ports

These ports provide interfaces to external system memories and peripherals; they are shared across all processors in a multi-core, multi-cluster configuration.

| Port | Interface | Description |
|-----------------------------|-----------------------------|-------------|
| Memory Port<br>1 or 2 ports | Arm® AMBA® AXI4™<br>128-bit | <ul> <li>Interface to the memory that offers the highest performance</li> <li>The only cacheable memory region; supports data and instruction accesses</li> <li>Supports up to 128 outstanding transfers per memory port</li> </ul> |
| System Port<br>1 or 2 ports | AXI4<br>64-bit | <ul> <li>Typically used for high-bandwidth, uncached memory or devices</li> </ul> |
| Peripheral Port<br>1 port | AXI4<br>64-bit | <ul> <li>Interface to lower-speed peripherals</li> <li>Supports code execution</li> <li>Supports the RISC-V standard Atomic (A) extension</li> </ul> |
| Front Port<br>1 or 2 ports | AXI4<br>128-bit | <ul> <li>Lets external initiators access Core Complex devices and ports</li> <li>Transactions through the Front Port are coherent with core caches</li> </ul> |

## RISC-V Vectors (RVV)

The RISC-V Vector (V) ISA standard extension enables processor cores based on the RISC-V instruction set architecture to process data arrays alongside traditional scalar operations, unifying vector and scalar capabilities in a single application processor. The P270 processor implements a 256-bit vector register length (VLEN) and fully supports the vector extension standard, including dynamic, variable vector length operations. The data width (DLEN) of the vector ALU and load/store architecture is 128 bits.
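
To make the VLEN and DLEN figures concrete, the short C sketch below computes how many elements of a given width fit in one 256-bit vector register and in one 128-bit datapath slice. The constant and function names are illustrative only, and the calculation is plain arithmetic; it makes no claim about instruction timing on the P270.

```c
/* Widths quoted in the text above, in bits. */
enum {
    P270_VLEN_BITS = 256,  /* vector register length */
    P270_DLEN_BITS = 128   /* vector ALU and load/store data width */
};

/* Elements of width sew_bits (8, 16, 32, or 64) in one vector register. */
unsigned elems_per_vreg(unsigned sew_bits)
{
    return P270_VLEN_BITS / sew_bits;
}

/* Elements of width sew_bits that fit in one DLEN-wide slice. */
unsigned elems_per_dlen(unsigned sew_bits)
{
    return P270_DLEN_BITS / sew_bits;
}
```

For instance, a single register holds 32 INT8 values, 8 FP32 values, or 4 FP64 values, and a 128-bit slice covers half of each.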

With RISC-V vectors, P270 offers:

- A single Vector ISA (ratified at version 1.0 by RISC-V International in 2021), which greatly simplifies software development across the full range of vector processors
- A vector-length-agnostic architecture, which allows library and application code investment to be reused across multiple generations and broad ranges of processor implementations
- Dynamic (runtime) modification of vector parameters for the most efficient computation on a processor, so that code can be tuned to the computational needs of each target application
- Support for LMUL (vector Length **MUL**tiplier), which groups multiple vector registers so that a single instruction operates on all of them, giving more efficient vector throughput with fewer software instructions. P270 supports LMUL up to 8, giving an effective software vector length of up to 2048 bits (8 × 256 bits).
- An extensive range of vector data types and sizes, including FP16, FP32, and FP64 floating point, INT8 to INT64 integer, and Q8.8 to Q15 fixed point, supporting a wide range of application requirements
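
The vector-length-agnostic style and LMUL grouping described in the list above are commonly expressed as a strip-mined loop. The sketch below is a generic SAXPY (`y[i] += a * x[i]`) written with the standard RVV C intrinsics; it is an illustration, not SiFive-supplied code. It assumes a toolchain that provides `<riscv_vector.h>` with `__riscv_`-prefixed intrinsics, and falls back to a scalar loop elsewhere so the file still builds on other targets.

```c
#include <stddef.h>

#if defined(__riscv_v_intrinsic) && __riscv_v_intrinsic >= 11000
#include <riscv_vector.h>

/* y[i] += a * x[i], strip-mined with 32-bit elements and LMUL = 8. */
void saxpy(size_t n, float a, const float *x, float *y)
{
    for (size_t vl; n > 0; n -= vl, x += vl, y += vl) {
        /* Ask the hardware how many elements it will process this pass. */
        vl = __riscv_vsetvl_e32m8(n);
        vfloat32m8_t vx = __riscv_vle32_v_f32m8(x, vl);
        vfloat32m8_t vy = __riscv_vle32_v_f32m8(y, vl);
        vy = __riscv_vfmacc_vf_f32m8(vy, a, vx, vl);  /* vy += a * vx */
        __riscv_vse32_v_f32m8(y, vy, vl);
    }
}
#else
/* Scalar fallback for targets without the RVV intrinsics. */
void saxpy(size_t n, float a, const float *x, float *y)
{
    for (size_t i = 0; i < n; ++i)
        y[i] += a * x[i];
}
#endif
```

Nothing in the loop hard-codes the register width. With 32-bit elements and LMUL = 8 on a 256-bit VLEN, each pass can cover up to 8 × 256 / 32 = 64 elements, and the same source would simply process a different number per pass on an implementation with another VLEN.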



## Advanced Software Development Capabilities

SiFive Freedom Studio, built on top of the popular Eclipse IDE, is the fastest way to get started programming SiFive hardware. Freedom Studio is packaged with a pre-built tool suite and example projects, and includes comprehensive support for SiFive Insight Advanced Trace and Debug capabilities.

For SiFive vector-accelerated processors, there is an advanced LLVM-based C compiler that performs autovectorization of C code. This enables software developers to map their C algorithms onto SiFive vector processors quickly and efficiently. Additionally, for developers with legacy SIMD code, the SiFive Recode technology can be used to quickly migrate software targeted at SIMD instruction sets to RISC-V Vectors (RVV).
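
As a rough illustration of the kind of C code an autovectorizing compiler can handle well, the sketch below averages two 8-bit buffers element by element. The function name `average_u8` is hypothetical. The loop has a simple trip count, no cross-iteration dependencies, and `restrict`-qualified pointers that promise the buffers do not overlap, which are the properties vectorizers generally look for. With upstream Clang, an RVV-enabled target such as `-march=rv64gcv` at `-O3` is the usual way to request vector code; the options used by SiFive's own compiler are not stated in this brief.

```c
#include <stddef.h>
#include <stdint.h>

/* dst[i] = rounded average of a[i] and b[i].
   restrict tells the compiler the three buffers never alias. */
void average_u8(size_t n,
                const uint8_t *restrict a,
                const uint8_t *restrict b,
                uint8_t *restrict dst)
{
    for (size_t i = 0; i < n; ++i) {
        /* Operands promote to int, so the sum cannot overflow. */
        dst[i] = (uint8_t)((a[i] + b[i] + 1) >> 1);
    }
}
```

Because the source contains no intrinsics, the same function also compiles unchanged for scalar-only targets.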



## Broad Application Coverage

The P270 is ideal for the broadest range of applications that require high-throughput, single-thread performance while operating within specific power and area constraints:

- Enterprise Switching/Routing/Storage, Smart NICs
- Edge Analytics, Big-Data Analytics
- Image Processing, Object Detection, Recognition
- Autonomous Machines
- Edge Compute