Session Border Control in the Cloud
Accelerating Virtual Network Functions with GPUs
Kevin Riley, CTO & EVP of Advanced R&D
Our Application and the GPU Opportunity
SBCs: What Are They?
SBC Application
• Secures and interworks unified communications components
• Deployed in the service provider core, at the edge, and on customer premises
• Application decomposes into a control plane and a media plane
Media Evolution
• Transcoding was historically implemented on purpose-built hardware
• Migration to CPU and cloud infrastructure is the current state of the art
Challenges
• Transcoding inefficiencies inhibit cloud migration at scale
• Enhanced security capabilities are ill-suited to CPUs
[Diagram: Control Plane; Media Transcoding; Media Security & Forwarding]
Ribbon Communications Confidential and Proprietary
A Break-Out Strategy is Needed for Cloud SBC
How to unlock cloud performance on par with purpose-built hardware?
Observation 1: GPUs are Pervasive in the Data Center
Observation 2: GPU Performance is Gapping CPU
[Figure: two charts covering 2008 to 2017, each comparing NVIDIA GPU and x86 CPU. Left: peak double-precision FLOPS (GFLOPS). Right: peak memory bandwidth (GB/s). Source: Nvidia's presentation, via Market Realist.]
Mapping GPUs to the SBC Application
For a stable transcoding system, the processing of all channels must complete within the codec frame time. CPUs and DSPs process channels sequentially, so per-channel processing time must be kept low.
[Figure: channels 0 through N processed one after another along the time axis, all within one codec frame time.]
Challenges
• Speech codecs generally employ various types of recursive filters, which are ill-suited to parallelization.
• Even if parts of the operation are parallelized, the overall speed-up is limited by Amdahl's law.
• A GPU core is less powerful than a CPU core.
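The Amdahl's law limit mentioned above can be illustrated with a short sketch. The function name and the 50% parallel fraction are assumptions chosen for illustration; the slides do not state what fraction of a speech codec can be parallelized.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Overall speed-up when only `parallel_fraction` of the work is
    spread across `workers` and the rest stays serial (Amdahl's law)."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


# Hypothetical case: half of a codec's work parallelizes, spread across
# 1664 GPU cores. The serial half caps the speed-up just below 2x.
print(round(amdahl_speedup(0.5, 1664), 2))
```

Under that assumed fraction, adding cores barely helps a single channel, which is why parallelizing inside one codec instance is unattractive.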
Mapping GPUs to the SBC Application (cont.)
New Approach
• GPU cores are less powerful, but they are plentiful.
• Performance is better when adjacent GPU threads perform similar jobs.
• Offload the entire encode/decode operation for a single channel onto a single GPU thread, and ensure the processing time is less than the frame time.
[Figure: traditional approach, channels 0 through N processed sequentially within one codec frame time; new approach, channels 0 through N processed in parallel within one codec frame time.]
G729A transcoding (encode + decode), with a 10 ms frame time, takes approximately 35 µs for one channel on an E5-2690v2 processor. Processing channels sequentially, a CPU can therefore achieve approximately 285 transcodes per frame time (10 ms / 35 µs).
When per-channel processing is offloaded to a single GTX970 thread, it takes approximately 6 ms (initial prototype). However, in those 6 ms all 1664 channels are processed in parallel (the GTX970 has 1664 cores).
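The capacity arithmetic on this slide can be reproduced with a minimal sketch. The constants are the figures quoted above; the function names and the single-batch-per-frame model are assumptions made for illustration, not a description of Ribbon's implementation.

```python
# Figures quoted on the slide; the function names are hypothetical.
FRAME_TIME_US = 10_000      # G729A codec frame time (10 ms)
CPU_PER_CHANNEL_US = 35     # one encode + decode on an E5-2690v2
GPU_BATCH_TIME_US = 6_000   # one channel per GTX970 thread, initial prototype
GPU_THREADS = 1664          # GTX970 core count, one channel per core


def sequential_capacity(frame_time_us: int, per_channel_us: int) -> int:
    """Channels a sequential processor can finish inside one frame time."""
    return frame_time_us // per_channel_us


def parallel_capacity(frame_time_us: int, batch_time_us: int, threads: int) -> int:
    """Channels finished per frame when every thread handles one channel.

    Assumes a single batch per frame: if the batch misses the frame
    deadline, no channel is served in time.
    """
    return threads if batch_time_us <= frame_time_us else 0


cpu_channels = sequential_capacity(FRAME_TIME_US, CPU_PER_CHANNEL_US)
gpu_channels = parallel_capacity(FRAME_TIME_US, GPU_BATCH_TIME_US, GPU_THREADS)
print(cpu_channels, gpu_channels)  # prints "285 1664"
```

The point of the model is that the GPU thread is far slower per channel (6 ms against 35 µs), yet the deadline that matters is the frame time, so the slower thread still qualifies as long as the whole batch finishes inside 10 ms.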
Observation 3: GPUs are Ideally Suited for SBC Media
SBC Stall Point
[Three-column comparison; the column headings did not survive extraction.]
Column 1
• Virtualization on COTS hardware
• Codecs sourced from Intel IPP and third-party vendors.
Column 2
• High scale on COTS hardware
• Codecs ported and optimized in-house using reference source code.
Column 3
• High scale on customized hardware.
• Codecs sourced from third-party vendors.
GPU Technology Delivers Disruptive Performance Gains
• 3.5x transcoding compared to DSP
• 9x transcoding compared to CPU
• <2x power consumption
GPU vs CPU
[Figure: two bar charts comparing the M60 and V100 GPUs against CPU for the codecs G729A, EVRC-9.3, EVRCB-9.3, AMR-12.2 and AMRWB-6.6. Top: sessions multiplier, with bar labels ranging from 320% to 1458%. Bottom: sessions per watt, with bar labels ranging from 81% to 543%. The assignment of individual values to bars did not survive extraction.]
Ribbon & Nvidia Thought Leadership with Transcoding on GPUs
• Nvidia GPUs provide massively parallel processing for calculation-intensive tasks, which suits simultaneous media transcoding across multiple codec types.
• Nvidia is delivering improved performance with reduced power requirements at unmatched velocity.
• The Ribbon software framework uses the CPU for media handling and overhead, with transcoding moved to the GPU.
Continuing to Disrupt with GPU
[Diagram: Control Plane; Media Transcoding; Media Security & Forwarding]
Co-Processor Model
• Media transcoding ideally maps
• Most compute-intensive component
Front-End Processor Model
• Direct NIC/GPU memory copy unlocks further disruption
• Enables further SBC media plane acceleration
• Unlocks enhanced DPI/pattern-matching security capability