IBM ČVUT Student Research Project 2006
DSP Algorithms for CBE Architecture
Jan Kryl (krylj1@fel.cvut.cz)
Petr Kurtin (kurtip1@fel.cvut.cz)

In this talk…
• Introduction to DSP Algorithms
• Overview of CBE Architecture
• Implementation
  – DSP Library
  – JPEG Library
• Design issues
• Concluding notes

DSP Algorithms
• Application fields
  – Audio signal processing (MP3, Vorbis compression)
  – Digital image processing (JPEG compression)
  – Digital video (closely related to digital image processing)
  – Communications, navigation, radar, GPS
• Signal representation
  – Time, spatial, frequency, autocorrelation or wavelet domain

DSP Algorithms (cont.)
• Domain transformation algorithms
  – Discrete Cosine Transform (image and audio compression)
• The filter is the essential unit in DSP
  – FIR filters (no feedback)
  – IIR filters (feedback, so they can be unstable)
• Frequency analysis
  – Fourier transforms, resolution, spectral leakage and windowing

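The difference between the two filter families can be seen on an impulse response. The following is a minimal sketch in portable Python, not code from the project; the function names `fir_filter` and `iir_one_pole` and the coefficient values are illustrative assumptions.

```python
def fir_filter(coeffs, samples):
    """Direct-form FIR: each output is a weighted sum of current and past inputs only."""
    out = []
    for n in range(len(samples)):
        acc = 0.0
        for k, b in enumerate(coeffs):
            if n - k >= 0:  # inputs before the start of the signal count as zero
                acc += b * samples[n - k]
        out.append(acc)
    return out


def iir_one_pole(a, samples):
    """First-order IIR: y[n] = x[n] + a * y[n-1]; the previous output is fed back."""
    out = []
    prev = 0.0
    for x in samples:
        prev = x + a * prev
        out.append(prev)
    return out


impulse = [1.0] + [0.0] * 7

# The FIR response ends once the coefficients run out (finite length).
print(fir_filter([0.5, 0.5], impulse))
# With |a| < 1 the IIR response decays; with |a| > 1 it keeps growing.
print(iir_one_pole(0.5, impulse))
print(iir_one_pole(2.0, impulse))
```

The growing response in the last call is the instability that the feedback path makes possible; an FIR filter has no such path, so its response to a bounded input always stays bounded.
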
Cell Processor
• In fact, it is not a single processor but one PPE and eight SPEs
• PPE (Power Processor Element)
  – 64-bit, general-purpose, PowerPC-compliant processor
  – Runs the OS, manages the SPEs
• SPE (Synergistic Processor Element)
  – SIMD coprocessor specialized for computation-intensive tasks

Cell Processor (cont.)
• The PPE and SPEs are connected through the high-speed EIB (Element Interconnect Bus)
• Development
  – Various kinds of libraries (math, game, audio, ...)
  – Runs Linux
  – The complete Cell hardware can be emulated by the Full System Simulator
  – Development and porting of applications is easy

DSP Library
• The SPUC (Signal Processing using C++) library has been ported to use the Cell interface
  – Basic building blocks (FIR, IIR, allpass and Lagrange interpolation filters, NCO, CORDIC rotator, …)
  – Communication functions (timing, phase and frequency discriminators for BPSK/QPSK signals)
  – Various adaptive equalizer classes
• Our example
  – Shows operations on matrices

JPEG Library from IJG

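Baseline JPEG compression is built around the Discrete Cosine Transform applied to 8x8 pixel blocks. The sketch below is a one-dimensional, textbook DCT-II over eight samples, written for illustration only; it is not the IJG implementation, and the function name `dct_ii` is an assumption.

```python
import math


def dct_ii(samples):
    """Orthonormal DCT-II of a short sequence (O(n^2) textbook form)."""
    n = len(samples)
    out = []
    for k in range(n):
        total = sum(
            x * math.cos(math.pi * (i + 0.5) * k / n)
            for i, x in enumerate(samples)
        )
        # The DC term uses a different scale so the transform is orthonormal.
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * total)
    return out


# A perfectly flat row of eight pixels: all energy lands in coefficient 0.
flat = [100.0] * 8
coeffs = dct_ii(flat)
print([round(c, 6) for c in coeffs])
```

For smooth image regions most of the signal energy ends up in the first few coefficients, which is what lets the encoder quantize the remaining ones coarsely.
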
Design Issues
• How to maximize utilization of the SPUs?
  – Select the best model (RPC, threads)
  – Transfer bigger clusters of data, and less often
  – Take advantage of already running threads (don't spawn new threads needlessly)
• Minimize the necessary changes to library code

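Two of the points above, moving data in larger blocks and reusing workers that are already running, can be illustrated with a host-side analogy. This is a sketch using Python's standard thread pool, not the Cell SDK; `process_chunk`, `split_into_chunks` and `run_batched` are hypothetical names, and the eight workers merely mirror the eight SPEs.

```python
from concurrent.futures import ThreadPoolExecutor


def process_chunk(chunk):
    # Stand-in for the work one SPE would do on one transferred block.
    return [2 * x for x in chunk]


def split_into_chunks(data, chunk_size):
    # Fewer, larger chunks mean fewer hand-offs between host and workers.
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def run_batched(pool, data, chunk_size):
    results = []
    # pool.map returns results in submission order.
    for part in pool.map(process_chunk, split_into_chunks(data, chunk_size)):
        results.extend(part)
    return results


data = list(range(16))

# The pool is created once and reused for both passes, so no new
# threads are spawned for the second batch of work.
with ThreadPoolExecutor(max_workers=8) as pool:
    first = run_batched(pool, data, chunk_size=4)
    second = run_batched(pool, first, chunk_size=8)

print(first)
print(second)
```

On real hardware the best chunk size would have to be measured, since it is bounded by what fits in an SPE's local memory.
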
Design Issues (cont.)
• Plan
  – Cross-compile the library for the PPC architecture
  – Rewrite generic C code using AltiVec C intrinsics
  – Rewrite AltiVec intrinsics as SPU intrinsics and add SPU thread activation code to the library
  – Test the library
• The plan was carried out successfully

And the results?
• Original BMP image before compression
  – size 7678 bytes
• JPEG (75% quality)
  – size 2362 bytes
• JPEG (15% quality)
  – size 1272 bytes

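The three file sizes above translate into compression ratios of roughly 3.25:1 at 75% quality and 6.04:1 at 15% quality. A short sketch of that arithmetic follows; the helper names `compression_ratio` and `space_saving` are illustrative, and only the byte counts come from the slide.

```python
original_size = 7678                 # BMP, bytes
jpeg_sizes = {75: 2362, 15: 1272}    # JPEG quality (%) -> size in bytes


def compression_ratio(original, compressed):
    """How many times smaller the compressed file is."""
    return original / compressed


def space_saving(original, compressed):
    """Fraction of the original size that was removed."""
    return 1.0 - compressed / original


for quality, size in jpeg_sizes.items():
    ratio = compression_ratio(original_size, size)
    saving = space_saving(original_size, size)
    print(f"quality {quality}%: ratio {ratio:.2f}:1, saving {saving:.1%}")
```
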
Summary
• Possible future extensions
  – Make the compilation and installation process friendlier
  – There are certainly other ways to do the same thing; try them and compare performance
  – Port the whole JPEG library, not just the compression part
  – Change the library in a backward-compatible manner
• Project pages (password required)
  – http://service.felk.cvut.cz/courses/36SPA/prj/36SPA23
• Documentation, Sources, Binaries