Adapteva, a privately held semiconductor technology startup, recently announced that it is sampling the fourth generation of its Epiphany multicore architecture. The release makes the performance and energy efficiency of the Epiphany platform available as a co-processing solution for industries that need the next step forward in parallel computing. Adapteva’s new OpenCL SDK, announced alongside it, provides a portable API for accessing the platform's compute capabilities, accelerating applications in markets ranging from gaming and entertainment to scientific and medical software.
The first video above demonstrates the performance of the Epiphany-IV 28nm multicore silicon. It shows the Epiphany processor outperforming an x86 processor on a parallel matrix multiplication demo written entirely in ANSI-C.
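To make the idea of a parallel matrix multiplication concrete, here is a minimal sketch in plain C. It is not the code from Adapteva's demo and uses no Epiphany SDK calls. The function names `matmul_band` and `matmul_partitioned` are hypothetical, and the even split of rows across cores is just one simple partitioning scheme chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Compute rows [row_begin, row_end) of C = A * B for n x n row-major
   matrices. Each band writes a disjoint part of C, so a separate core
   could own each band without synchronizing with the others. */
static void matmul_band(const float *a, const float *b, float *c,
                        size_t n, size_t row_begin, size_t row_end)
{
    size_t i, j, k;
    for (i = row_begin; i < row_end; i++) {
        for (j = 0; j < n; j++) {
            float sum = 0.0f;
            for (k = 0; k < n; k++) {
                sum += a[i * n + k] * b[k * n + j];
            }
            c[i * n + j] = sum;
        }
    }
}

/* Hypothetical driver: splits the rows evenly across `cores` workers.
   Here the bands run one after another on the host; on a multicore
   device each loop iteration would be one core's share of the work. */
static void matmul_partitioned(const float *a, const float *b, float *c,
                               size_t n, size_t cores)
{
    size_t core, rows_per_core;
    assert(cores > 0 && n % cores == 0);
    rows_per_core = n / cores;
    for (core = 0; core < cores; core++) {
        matmul_band(a, b, c, n,
                    core * rows_per_core, (core + 1) * rows_per_core);
    }
}
```

Calling `matmul_partitioned(a, b, c, 4, 2)` on two 4 x 4 matrices gives each of two workers a band of two rows. Because the bands are independent, the same split would apply unchanged to a larger core count.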
The Epiphany microprocessor architecture is a highly scalable shared memory architecture, featuring up to 4,096 processors on a single chip, connected through a high-bandwidth on-chip network. Each processor node combines a fully featured floating point RISC processor built from scratch for multicore processing, a high-bandwidth local memory system, and an extensive set of built-in hardware features for multicore communication. The resulting performance boost is coupled with Adapteva’s low-power design and standard C programming model, bringing an unprecedented level of real-time processing to performance- and power-constrained mobile devices like smartphones and tablet computers, as well as improving performance levels for an array of other parallel computing platforms.
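One way to picture a shared memory architecture in which every core has its own local memory is a flat global address with the core's grid position encoded in the upper bits. The sketch below assumes a 32-bit address with a 6-bit row, a 6-bit column, and a 20-bit local offset. That split is consistent with a 4,096-core ceiling (64 x 64), but the field widths and all function names are illustrative assumptions, not details taken from this announcement.

```c
#include <assert.h>
#include <stdint.h>

/* ASSUMED layout (not stated in the announcement): the top 12 bits of a
   32-bit global address select the core (6-bit row, 6-bit column) and
   the low 20 bits are an offset into that core's local memory. */
#define CORE_ROW_BITS 6u
#define CORE_COL_BITS 6u
#define OFFSET_BITS   20u

/* Build a global address from a core's grid position and a local offset. */
static uint32_t global_address(uint32_t row, uint32_t col, uint32_t offset)
{
    assert(row < (1u << CORE_ROW_BITS));
    assert(col < (1u << CORE_COL_BITS));
    assert(offset < (1u << OFFSET_BITS));
    return (row << (OFFSET_BITS + CORE_COL_BITS))
         | (col << OFFSET_BITS)
         | offset;
}

/* Recover the individual fields from a global address. */
static uint32_t address_row(uint32_t addr)
{
    return addr >> (OFFSET_BITS + CORE_COL_BITS);
}

static uint32_t address_col(uint32_t addr)
{
    return (addr >> OFFSET_BITS) & ((1u << CORE_COL_BITS) - 1u);
}

static uint32_t address_offset(uint32_t addr)
{
    return addr & ((1u << OFFSET_BITS) - 1u);
}
```

Under this assumed layout, any core can read or write another core's memory with an ordinary load or store to the right address, which is what lets a standard C pointer stand in for inter-core communication.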
- Complete multicore solution featuring a high performance microprocessor ISA, Network-On-Chip, and distributed memory system
- Fully-featured ANSI-C programmable GNU/Eclipse based tool chain
- Scalable to thousands of cores and TFLOPS of performance on a single chip
- 1GHz superscalar RISC processor cores
- IEEE Floating Point Instruction Set
- Shared memory architecture with up to 128KB memory at each processor node
- Zero startup-cost message passing
- Vector Interrupt Controller
- Distributed Multicore Multidimensional DMAs
- 32 GB/sec local memory bandwidth per core
- 8 GB/sec network bandwidth per processor
- 72 GFLOPS/Watt energy efficiency
- Processor tile size of 0.5mm^2 at 65nm, 0.128mm^2 at 28nm
- Out-of-the-box floating point C programs enable significantly faster time to market and lower development costs compared to ASIC or FPGA based solutions.
- Up to 100X advantage in energy efficiency compared to traditional multicore floating point processors offers breakthrough improvements in battery life, cost of ownership, and reliability.
- Unparalleled performance, as much as 5 TFLOPS on a single chip, enables a new set of high performance applications.
- Low latency zero-overhead inter-core communication simplifies parallel programming.
- Scalable architecture allows code reuse across a wide range of markets and applications, from smartphones all the way to leading-edge supercomputers.
- Are your customers complaining that their mobile device runs out of battery too fast?
- Do you lack the money, team, or time needed to convert your floating point C-based reference application to a fixed point FPGA/ASIC hardware implementation?
- Do you have a killer app in mind that won’t become practical until 2016 based on existing mobile processor roadmaps?
A demonstration of the energy efficiency of the Epiphany-IV 28nm 64-core silicon. The video demonstrates that the Epiphany-IV achieves a world-best energy efficiency of 50 GFLOPS/Watt.
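The per-core figures quoted in the feature list scale linearly with core count, so chip-level numbers can be estimated with simple arithmetic. The sketch below uses only values stated in this announcement (0.128mm^2 per tile at 28nm, 32 GB/sec local memory bandwidth per core). The function names are hypothetical, and the results are back-of-the-envelope estimates that ignore I/O, pads, and any other area outside the core tiles.

```c
#include <assert.h>

/* Figures quoted in this announcement. */
#define TILE_AREA_MM2_28NM  0.128 /* processor tile area at 28nm */
#define LOCAL_BW_GB_PER_SEC 32.0  /* local memory bandwidth per core */

/* Total area of the core tiles alone (no I/O, pads, or other logic). */
static double total_tile_area_mm2(unsigned cores)
{
    return (double)cores * TILE_AREA_MM2_28NM;
}

/* Sum of the local memory bandwidth of all cores. */
static double aggregate_local_bw_gb_per_sec(unsigned cores)
{
    return (double)cores * LOCAL_BW_GB_PER_SEC;
}

/* Power implied by a throughput at a given efficiency:
   watts = GFLOPS / (GFLOPS per watt). */
static double implied_power_watts(double gflops, double gflops_per_watt)
{
    assert(gflops_per_watt > 0.0);
    return gflops / gflops_per_watt;
}
```

For the 64-core part, `total_tile_area_mm2(64)` gives about 8.2mm^2 of core tiles and `aggregate_local_bw_gb_per_sec(64)` gives 2,048 GB/sec. At the 4,096-core limit the tiles alone would occupy about 524mm^2 at 28nm. Similarly, `implied_power_watts(5000.0, 72.0)` comes to roughly 69 W, though that figure holds only if the 5 TFLOPS and 72 GFLOPS/Watt peaks applied at the same time, which the announcement does not claim.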
High Performance Applications:
- Would you benefit from reducing your processing latencies to microseconds and still being able to program in ANSI-C?
- Do you lack the electrical and cooling infrastructure needed to operate a state of the art high performance system?
- Are you only seeing 10-15% of the advertised maximum performance of your current vendor’s manycore solution?
- Are you frustrated with the steep learning curve and proprietary development environments of existing floating point accelerator technologies?