9/23/2009

Search Engine for Shoah Foundation
Presented by Ali Khodaei (khodaei@usc.edu)

Team Members
• Ali Khodaei
• Kaveh Shahabi
• Sangeetha U Santharam

Project Motivation
• Existence of a huge set of useful data
  – Over 50,000 video testimonies
  – Each divided into one-minute segments
  – Each segment tagged with a set of keywords
• Good amount of spatial and textual data
• Lack of a location-based search engine
  – Lack of an interface to ask for spatial data
  – Lack of a ranking/scoring function to rank/score documents based on space and text simultaneously

Project Definition
• Robust, efficient and interactive search engine ranking testimonies based on a combination of
  – Textual (regular) keywords
  – Spatial keywords
• This search engine finds and ranks the most textually and spatially relevant testimonies (segments) according to
  – query keywords
  – query location

Input
• Query Keywords
  – Set of keywords entered as text
• Query Location
  – A region drawn on the map OR
  – A spatial keyword entered as text

Output
[no text recovered for this slide]
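The Project Definition slide describes ranking segments by a combination of textual and spatial relevance to the query keywords and query location. The following is a minimal Python sketch of that idea, not the project's ranking function. Every name in it (`BoundingBox`, `Segment`, `textual_score`, `spatial_score`, `rank`, `text_weight`) is hypothetical, the two score formulas are simple placeholders chosen for illustration, and the query region is assumed to be an axis-aligned rectangle.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    """Hypothetical query region: an axis-aligned rectangle drawn on the map."""
    min_lat: float
    min_lon: float
    max_lat: float
    max_lon: float

    def contains(self, lat, lon):
        return (self.min_lat <= lat <= self.max_lat
                and self.min_lon <= lon <= self.max_lon)


@dataclass(frozen=True)
class Segment:
    """Hypothetical one-minute segment with its keyword tags and locations."""
    segment_id: str
    keywords: frozenset   # lowercase textual keywords tagging the segment
    locations: tuple      # (lat, lon) pairs attached to the segment


def textual_score(segment, query_keywords):
    """Placeholder score: fraction of query keywords that tag the segment."""
    if not query_keywords:
        return 0.0
    return len(segment.keywords & query_keywords) / len(query_keywords)


def spatial_score(segment, region):
    """Placeholder score: fraction of the segment's locations inside the region."""
    if not segment.locations:
        return 0.0
    inside = sum(1 for lat, lon in segment.locations if region.contains(lat, lon))
    return inside / len(segment.locations)


def rank(segments, query_keywords, region, text_weight=0.5):
    """Return (segment_id, score) pairs, best first, dropping zero scores.

    The combined score is a weighted sum of the two placeholder scores;
    text_weight is an assumed tuning parameter between 0 and 1.
    """
    query = frozenset(k.lower() for k in query_keywords)
    scored = []
    for seg in segments:
        score = (text_weight * textual_score(seg, query)
                 + (1.0 - text_weight) * spatial_score(seg, region))
        if score > 0:
            scored.append((seg.segment_id, score))
    # Sort by descending score, breaking ties by segment id for stable output.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored
```

Raising `text_weight` toward 1 favors keyword matches, and lowering it favors segments whose locations fall inside the query region.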
System Components
[Diagram. Recoverable labels: SHOA DB (raw); data extraction and cleansing; formatted DB; index structure, created one time; web service with read-only index access; video DB (load video); web application (server side) and GUI (client side), which together handle sessions, user interactions, and events; the mid tier consists of all the core functionalities.]

Tasks
1- Data tier
  – Data Cleansing
    • Understand / format / standardize the data
  – Geocoding / GeoTagging
    • Find missing lat/long information for some of the spatial keywords
    • Assign appropriate geographical information to each testimony/segment
  – Index Construction
    • Create inverted files for regular keywords
    • Create inverted files for spatial keywords

Tasks
2- Middle tier
  – Intelligent web-services
    • Talk to interface
      – Receive input (query parameters)
      – Send output (query result)
    • Talk to data tier
      – Get data
      – Access index
      – Access video database
    • Perform necessary operations
      – Process data
      – Calculate scores
      – Format the results

Tasks
3- Interface (GUI)
  – User-friendly interface to receive input from the user
    • Textbox for textual keywords
    • Map interface to draw/show query location
      – A textbox can be used to input a location's name
  – Displays the result dynamically and interactively
    • Results should change on-the-fly based on map location
  – Provides a mechanism to show the testimonies from the interface
    • Show testimonies on the same page
    • Link to a new page for showing the testimonies

Tasks
4- Research/Algorithm
  – Hybrid index structure
    • Captures spatial and textual keywords (probably using inverted files) simultaneously and efficiently
  – Relevance ranking function
    • Formulas for spatial and textual scores
    • A combined scoring function with different weights for different features
  – Spatial representation of each segment and/or testimony's spatial data

Break-down + Schedule
• Data tier
  – Understand / format / cleanse (/geocode) / transfer the data
    • 4 weeks Sangy + Ali
  – Come up with index structure schema for the middle layer
    • 2 weeks Ali
  – Create/implement the actual index structure
    • 4 weeks Ali + Sangy
  – Integration/extra,..
    • 1 week Ali
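The Index Construction task calls for inverted files for regular keywords and for spatial keywords, and the Research/Algorithm slide suggests a hybrid structure built on them. The sketch below shows one simple way such inverted files could be built and queried together. It is an illustration under assumptions, not the project's index: the function names (`build_inverted_file`, `lookup_all`, `hybrid_search`) are hypothetical, spatial keywords are treated as plain place-name strings, and all keywords are matched case-insensitively with AND semantics.

```python
from collections import defaultdict


def build_inverted_file(tagged_segments):
    """Build an inverted file: keyword -> sorted list of segment ids.

    tagged_segments maps a segment id to the keywords tagging it.
    Keywords are lowercased so lookups are case-insensitive (an assumption).
    """
    postings = defaultdict(set)
    for segment_id, keywords in tagged_segments.items():
        for keyword in keywords:
            postings[keyword.lower()].add(segment_id)
    return {keyword: sorted(ids) for keyword, ids in postings.items()}


def lookup_all(inverted_file, keywords):
    """Return the set of segment ids tagged with every given keyword.

    An empty keyword list yields an empty set.
    """
    result = None
    for keyword in keywords:
        ids = set(inverted_file.get(keyword.lower(), ()))
        result = ids if result is None else result & ids
    return result if result is not None else set()


def hybrid_search(text_index, spatial_index, text_keywords, spatial_keywords):
    """Return sorted ids of segments matching all textual and all spatial keywords."""
    text_hits = lookup_all(text_index, text_keywords)
    spatial_hits = lookup_all(spatial_index, spatial_keywords)
    return sorted(text_hits & spatial_hits)
```

Keeping two separate inverted files and intersecting their postings is the simplest arrangement; a true hybrid index would merge the two lookups into one structure, which is the research question the slides raise.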
Break-down + Schedule
• Research / Algorithm
  – Spatial representation of each segment and/or testimony's spatial data
    • 1.5 weeks Ali + Sangy
  – Relevance ranking function, formulas for spatial and textual scores
    • 2.5 weeks Ali

Break-down + Schedule
• Middle layer development
  – Creating prototypes / connectivity to the interface
    • 3 weeks Kaveh
  – [1.5 weeks wait for data tier]
  – Create code for ranking function
    • 2.5 weeks Kaveh
  – Create code for video
    • 2 weeks Kaveh
  – Integration/testing
    • 1 week Kaveh

Break-down + Schedule
• Web-development
  – Static/complete GUI (no functionality), Sangy
    • 3 weeks
  – Adding functionality, Sangy + Kaveh
    • 2 weeks
  – Adding Ajax and dynamic features, Kaveh + Ali
    • 4 weeks
  – Integration/test, Kaveh + Sangy + Ali
    • 1 week

Tasks for Sangy
[Gantt chart, time axis in weeks (2 to 12). Tasks shown: Data understanding / cleansing; Data format / Geocode; Static/complete GUI; Middle layer functionality; Implement Spatial Index; Integration / Testing.]

Tasks for Kaveh
[Gantt chart, time axis in weeks (2 to 12). Tasks shown: Prototyping mid tier; Adding functionality to middle layer/interface; Coding: searching & ranking; Coding: Video Functionality; Ajax/dynamic features; Integration / Testing.]

Tasks for Ali
[Gantt chart, time axis in weeks (2 to 12). Tasks shown: Data understanding / cleansing / geo-tagging; Index structure schema for the middle layer; Relevance ranking function; Implement Spatial Index; Ajax/dynamic features; Integration / Testing.]
Milestones and Deliverables
[Timeline chart, time axis in weeks (2 to 12). Milestones shown:]
• 10/06/09 Prototype
• 10/30/09 Working Model with full functionality, including Indexing/Ranking
• 11/18/09 Complete GUI with AJAX and Video embedding

Deliverables
1) Prototype of system having a static (non-functional) interface – 4th week
2) System with actual ranking/index structure and end-to-end functionality – 9th week
3) (2) + Ajax + video embedding – 11th week

Resources
• Data
  – Provided by Shoah Foundation
    • Data stored in Sybase tables
    • Needs to be cleansed, formatted and transferred
• Software
  – MS Visual Studio .Net
  – Oracle 10g +
• Hardware
  – Windows Server (+IIS)