Visual Language Perception from Videos
MOHIT GUPTA ADVISOR: AMITABHA MUKERJEE
Introduction and Motivation
Humans process and store what they perceive in a highly abstracted, condensed format. For e.g. Computers on the other
become a valid question for a computer
maximizing the common pitch-subtitle boundaries
face
vocals
References
[1] Tran, Luan, et al. "Pitch reduced patterns relative to photolithography features." U.S. Patent No. 7,253,118. 7 Aug. 2007.
[2] Swe, Ei Mon Mon, and Moe Pwint. "An Efficient Approach for Classification of Speech and Music." Advances in Multimedia Information Processing-PCM 2008. Springer Berlin Heidelberg, 2008. 50-60.
[3] Cotton, Courtenay. "A Three-Feature Speech/Music Classification System." (2006).
[4] Shah, Sejal, and Archana Bhise. "Fast Speaker Recognition using Efficient Feature Extraction Technique." International Journal of Computer Science 2.
[5] Hossen, Abdulnasir, and Said Al-Rawahi. "A Text–Independent Speaker Identification System Based on the Zak Transform." Signal Processing an International Journal (SPIJ) 4.2: 68.
[6] Zhao, Xianyu, et al. "SVM-based speaker verification by location in the space of reference speakers." Acoustics, Speech and Signal Processing,