Datasheet
Tensilica FloatingPoint DSP Family
Specially designed for floating-point processing with exceptional PPA
The Cadence Tensilica FloatingPoint family of high-performing digital signal processors (DSPs) is specially designed for floating-point-centric processing while providing exceptional power, performance, and area (PPA). The Tensilica FloatingPoint DSPs offer a wide range of software-compatible scalability, from 128-bit vector width to 1024-bit vector width. This scalability, combined with the configurability of the FloatingPoint DSPs, gives SoC designers the flexibility to design for a broad spectrum of applications, ranging from energy-efficient solutions for battery-operated devices to high-performance computing (HPC).
Overview
Optimized with various performance-enhancing features, Tensilica FloatingPoint DSPs offer outstanding performance per unit area and performance per unit power in floating-point computation for a wide range of applications. Based on the Tensilica Xtensa 32-bit RISC micro-architecture, the family (Figure 1) comprises the Tensilica FloatingPoint KP1 DSP, the Tensilica FloatingPoint KP6 DSP, the Tensilica FloatingPoint KQ7 DSP, and the Tensilica FloatingPoint KQ8 DSP.
Features
Tensilica FloatingPoint DSPs | KP1 | KP6 | KQ7 | KQ8 |
---|---|---|---|---|
Xtensa Platform | LX | LX | NX | NX |
Vector Width (b) | 128 | 512 | 512 | 1024 |
Xtensa LX Secure Mode | ✓ | ✓ | | |
8b/16b/32b/64b ALU Ops | ✓ | ✓ | ✓ | ✓ |
Scalable, Configurable, and Extensible
The highly scalable Tensilica FloatingPoint DSP family offers the SoC designer peace of mind when designing a solution that meets their PPA budget envelope. For energy-sensitive applications, the FloatingPoint KP1 DSP offers an ultra-low energy consumption solution. The FloatingPoint KP6 DSP provides balanced high performance in a small area, yielding excellent performance per unit area. If much higher performance and clock speed are required, the FloatingPoint KQ7 and KQ8 DSPs (Figure 2) present superior vector floating-point operational throughput. All of the Tensilica FloatingPoint DSPs have a common ISA, making software portability and migration easy.
The Tensilica FloatingPoint DSPs offer easy, checkbox-style configurability for pre-verified instruction options. This simple approach to defining a DSP core results in seamless integration of the selected features into the hardware, the compiler, the modeling tools, and the verification scripts. These capabilities let the solution designer build an optimized, custom DSP with minimal to no impact on the development schedule, compared to what a change in hardware design would typically incur.
The performance of Tensilica FloatingPoint DSPs can be further enhanced and differentiated using the TIE language. Custom operations defined through the Verilog-like TIE language are automatically integrated and recognized by the Xtensa tool chain. The FloatingPoint DSPs can also be extended to support custom interfaces, such as queues and ports, for efficient connection to external hardware blocks. These custom interfaces can be defined to match the interfaces of existing third-party IP. As a result, the FloatingPoint DSPs can access hardware offload accelerators in deterministic single- or multi-cycle operations, greatly reducing power consumption without impacting the shared system bus.
High-Performance Floating-Point Processing, Energy Efficiency, and Small Area Footprint
Floating-point numbers are commonly used in most technical and engineering computations. Some designers select a floating-point format because it handles the dynamic range of the data values easily, and some simply choose to run the floating-point code generated by signal processing modeling tools. Running the floating-point code produced by modeling tools speeds time to market and reduces the scope of the project, because the code does not have to be converted to a fixed-point version.
In applications that process large or unpredictable data sets, using floating-point numbers in the computation is no longer a convenience, but a requirement. In other applications, the floating-point format simply does a better job than computation in fixed-point numbers. In a motor control application, for example, systems using floating-point numbers can control speed and torque more accurately and efficiently, resulting in better performance and greater energy efficiency than a system using fixed-point numbers.
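The dynamic-range argument above can be illustrated with a minimal C sketch. It is not taken from the datasheet or from any Cadence library: the Q15 format (16-bit signed fixed point with 15 fractional bits) is chosen here only as a common fixed-point example, and the helper names `to_q15`, `from_q15`, and `q15_roundtrip_rel_error` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: convert a real value to Q15 fixed point
 * (1 sign bit, 15 fractional bits), saturating at the int16_t limits. */
int16_t to_q15(float x) {
    float scaled = x * 32768.0f;
    if (scaled > 32767.0f) return INT16_MAX;   /* value too large: clips */
    if (scaled < -32768.0f) return INT16_MIN;  /* value too small: clips */
    /* Round to nearest by adding half a step before truncation. */
    return (int16_t)(scaled >= 0.0f ? scaled + 0.5f : scaled - 0.5f);
}

/* Hypothetical helper: convert a Q15 value back to a real value. */
float from_q15(int16_t q) {
    return (float)q / 32768.0f;
}

/* Relative error introduced by storing x in Q15 and reading it back.
 * A result of 1.0 means the value was lost entirely (stored as zero). */
float q15_roundtrip_rel_error(float x) {
    assert(x != 0.0f);  /* relative error is undefined for zero */
    float diff = from_q15(to_q15(x)) - x;
    if (diff < 0.0f) diff = -diff;
    return diff / (x < 0.0f ? -x : x);
}
```

Under these assumptions, a mid-range value such as 0.5 survives the round trip exactly, a small value such as 1.0e-6 falls below the Q15 step size and is stored as zero, and a value such as 3.0 saturates. A single-precision float represents all three with its full 24-bit significand, which is why fixed-point designs usually need per-signal scaling that floating-point code avoids.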
The high-performance software tools accompanying Tensilica FloatingPoint DSPs provide superior auto-vectorization of scalar code, so that it makes effective use of the vector floating-point units. The FloatingPoint DSPs also offer a vector data type and an N-way programming model that make scaling between different SIMD widths easy. With the support of the optimized Eigen library, NatureDSP library, SLAM (Simultaneous Localization and Mapping) library, and math library, the FloatingPoint DSPs provide an easy programming environment, making porting and migrating floating-point software much easier.
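As a rough illustration of the kind of scalar code an auto-vectorizing compiler can typically map onto vector floating-point units, the sketch below shows a plain C99 loop. It uses only standard C, not Cadence vector data types or intrinsics, and the function name `saxpy` is chosen here for illustration. Whether a given compiler and core configuration actually vectorizes it depends on the toolchain and options, which the datasheet does not specify.

```c
#include <assert.h>
#include <stddef.h>

/* Computes y[i] = a * x[i] + y[i] for i in [0, n).
 * The restrict qualifiers promise that x and y do not overlap, which is
 * the information a vectorizer generally needs to process several
 * elements per iteration instead of one. */
void saxpy(size_t n, float a, const float *restrict x, float *restrict y) {
    assert(x != NULL && y != NULL);
    for (size_t i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

Because the loop is written without reference to a particular vector width, the same source can in principle be rebuilt for a 128-bit, 512-bit, or 1024-bit configuration without modification.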
The Tensilica FloatingPoint DSP family was specially designed to provide cost-effective and energy-efficient DSP solutions in high-performance floating-point-centric computation. Whether you are looking for an ultra-low energy and small area-cost floating-point DSP solution, or you need a super-high-performance floating-point compute engine for your complex mathematical models, you have the flexibility to find a suitable DSP solution from the Tensilica FloatingPoint DSP family.
Toolchain
Tensilica FloatingPoint DSPs are delivered with a complete set of software tools. The toolset includes a high-performance C/C++ compiler with automatic vectorization and instruction bundling to support the VLIW pipeline in the DSP. This comprehensive toolset also includes the linker, assembler, debugger, profiler, and graphical visualization.
A comprehensive instruction set simulator (ISS) allows you to quickly simulate and evaluate performance. When working with large systems or lengthy test vectors, the fast, functional Tensilica TurboXim simulator option achieves speeds that are 40X to 80X faster than the ISS for efficient software development and functional verification.
Tensilica Xtensa SystemC (XTSC) and C-based Xtensa Modeling Protocol (XTMP) system modeling are available for full-chip simulations. Pin-level XTSC offers co-simulation of SystemC and RTL-level offload accelerator blocks for fast, cycle-accurate simulations.
The Tensilica FloatingPoint DSPs support all major back-end EDA flows and represent the ultimate in customizable DSPs from Cadence, the leader in scalable, configurable, and extensible solutions for advanced floating-point signal processing. This proven development environment for both hardware and software reduces time to market and risk, and provides maximum flexibility in designing a broad range of applications using floating-point formats.
For more information, visit IP.