IBM - CVUT Student Research Projects Viterbi decoder on STI CELL processor Michal Blažek (blazem2@fel.cvut.cz)
Viterbi algorithm introduction
• Finds the most likely state trajectory given the HMM model and the observation sequence.
• Works in several steps:
1. Initialization, for $1 \le i \le N$:
   $\delta_1(i) = \pi_i \cdot b_i(O_1)$
2. Recursion, for $2 \le t \le T$, $1 \le j \le N$:
   $\delta_t(j) = \max_{1 \le i \le N} [\delta_{t-1}(i) \cdot a_{ij}] \cdot b_j(O_t)$
   $\psi_t(j) = \arg\max_{1 \le i \le N} [\delta_{t-1}(i) \cdot a_{ij}]$
3. Termination:
   $P^* = \max_{1 \le i \le N} [\delta_T(i)]$
   $q_T^* = \arg\max_{1 \le i \le N} [\delta_T(i)]$
4. Backtracking, for $t = T-1, T-2, \ldots, 1$:
   $q_t^* = \psi_{t+1}(q_{t+1}^*)$
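The four steps can be sketched in a few lines of plain Python. This is a minimal reference sketch for a discrete HMM, not the Cell implementation described in these slides; the function name `viterbi` and the list-of-lists layout of `pi`, `a` and `b` are assumptions made for illustration.

```python
def viterbi(pi, a, b, obs):
    """Return (best_prob, best_path) for a discrete HMM.

    pi[i]   : initial probability of state i
    a[i][j] : transition probability from state i to state j
    b[j][o] : probability of emitting symbol o in state j
    obs     : observed symbol indices O_1..O_T (assumed layout)
    """
    n = len(pi)

    # 1. Initialization: delta_1(i) = pi_i * b_i(O_1)
    delta = [pi[i] * b[i][obs[0]] for i in range(n)]
    psi = []  # one list of best predecessors per time step t = 2..T

    # 2. Recursion over the remaining observations
    for o_t in obs[1:]:
        new_delta, back = [], []
        for j in range(n):
            # psi_t(j): predecessor i that maximizes delta_{t-1}(i) * a_ij
            best_i = max(range(n), key=lambda i: delta[i] * a[i][j])
            back.append(best_i)
            new_delta.append(delta[best_i] * a[best_i][j] * b[j][o_t])
        delta = new_delta
        psi.append(back)

    # 3. Termination: best final state and its probability
    last = max(range(n), key=lambda i: delta[i])
    best_prob = delta[last]

    # 4. Backtracking: follow the stored predecessors from T down to 1
    path = [last]
    for back in reversed(psi):
        path.append(back[path[-1]])
    path.reverse()
    return best_prob, path
```

For long sequences the repeated products underflow, so a practical version would usually work with log-probabilities and replace the products by sums.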
ALF introduction (Accelerated Library Framework)
• Programming environment for data-parallel and task-parallel applications.
• Supports the MPMD (multiple-program-multiple-data) programming model.
• Schedules tasks according to the dependencies between them.
• Two components: host runtime and accelerator runtime.
• Input + output + parameters = work block
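The work-block idea (input + output + parameters) can be pictured as a small data structure. The sketch below is plain Python and does not use the ALF API; `WorkBlock`, `make_blocks` and `init_kernel` are hypothetical names. As an illustration, the kernel computes the Viterbi initialization step for one slice of states.

```python
from dataclasses import dataclass, field


@dataclass
class WorkBlock:
    """Hypothetical stand-in for a work block: input + output + parameters."""
    inputs: dict                                 # data the accelerator reads
    params: dict                                 # small values describing the slice
    output: list = field(default_factory=list)   # data the accelerator writes


def make_blocks(pi, b, first_obs, block_size):
    """Host side: cut the state range into blocks of at most block_size states."""
    return [
        WorkBlock(
            inputs={"pi": pi[s:s + block_size], "b": b[s:s + block_size]},
            params={"start": s, "first_obs": first_obs},
        )
        for s in range(0, len(pi), block_size)
    ]


def init_kernel(wb):
    """Accelerator side: delta_1(i) = pi_i * b_i(O_1) for the states in one block."""
    o1 = wb.params["first_obs"]
    wb.output = [p * row[o1] for p, row in zip(wb.inputs["pi"], wb.inputs["b"])]
    return wb
```

Each block carries everything its kernel needs, so blocks can be handed to any free accelerator and the host only has to reassemble the outputs using the `start` parameter.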
Program realization
• The algorithm is divided into four separate TASKs that form a simple pipeline.
• A dependency is defined between each pair of neighbouring TASKs so that they run in the right order.
• TASKs that use ALF are computed in parallel on 1 to 6 SPEs.
• The program uses the “Host data partitioning” method to partition data across the accelerators (SPEs).
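One way to picture host data partitioning is shown below, using one recursion time step as the example: the host splits the range of target states into contiguous chunks, and each worker computes its chunk independently. This is a structural sketch only. It uses Python's `ThreadPoolExecutor` as a stand-in for the SPEs, makes no ALF calls, and the names `partition`, `recursion_chunk` and `recursion_step` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(n_items, n_workers):
    """Split range(n_items) into at most n_workers contiguous (start, stop) chunks."""
    base, extra = divmod(n_items, n_workers)
    chunks, start = [], 0
    for w in range(n_workers):
        stop = start + base + (1 if w < extra else 0)
        if stop > start:          # skip empty chunks when workers outnumber items
            chunks.append((start, stop))
        start = stop
    return chunks


def recursion_chunk(delta_prev, a, b, obs_t, start, stop):
    """Compute delta_t(j) and psi_t(j) for target states j in [start, stop)."""
    n = len(delta_prev)
    delta, psi = [], []
    for j in range(start, stop):
        best_i = max(range(n), key=lambda i: delta_prev[i] * a[i][j])
        psi.append(best_i)
        delta.append(delta_prev[best_i] * a[best_i][j] * b[j][obs_t])
    return delta, psi


def recursion_step(delta_prev, a, b, obs_t, n_workers=6):
    """One time step with the target states partitioned across the workers."""
    chunks = partition(len(delta_prev), n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        futures = [pool.submit(recursion_chunk, delta_prev, a, b, obs_t, s, e)
                   for s, e in chunks]
        results = [f.result() for f in futures]   # collected in chunk order
    delta = [d for part, _ in results for d in part]
    psi = [p for _, part in results for p in part]
    return delta, psi
```

The chunks within one time step are independent of each other, while step t needs the complete result of step t-1, which is the kind of ordering constraint a dependency between pipeline stages expresses. Python threads here only model the structure; they do not give a real speedup for this kind of computation.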
Conclusion and results
• The program works correctly.
• The most difficult part (the recursion) is not yet implemented with ALF; so that results can be checked, it currently runs on the PPU only (using SIMD).
• No optimization or tuning has been done yet.
• Rough comparison of computation time on the same data set (HMM-500x500x500, seq-500):
– This implementation: 10.5 s
– Old implementation (1 SPE, libspe2, SIMD): 18.3 s
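The two timings can be turned into a speedup figure with a short calculation. The helper name `speedup` is an assumption for illustration, and since the measurements are only a rough comparison, the result should be read as an indication rather than a benchmark.

```python
def speedup(old_seconds, new_seconds):
    """Ratio of the old run time to the new run time."""
    return old_seconds / new_seconds


old_s, new_s = 18.3, 10.5  # run times quoted in the comparison above

print(f"speedup: {speedup(old_s, new_s):.2f}x")          # prints "speedup: 1.74x"
print(f"time saved: {(old_s - new_s) / old_s:.1%}")      # prints "time saved: 42.6%"
```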