Proximal Graphical Event Model
IBM Research
Debarun Bhattacharjya, Dharmashankar Subramanian, Tian Gao

Objective: To learn statistical and causal relationships between event types, in the form of graphical models, from event datasets

[Figure: example event types: home health visit, hospital admission, prescription refill]

Event datasets: Occurrences of various event types over time
• Examples: web logs; customer transactions; network notifications; political events; financial events; insurance claims; health episodes; other medical events
• Notation: $\mathbf{D} = \{(l_i, t_i)\}_{i=1}^{N}$, with $l_i \in L$ and $|L| = M$
– Assume the dataset is temporally ordered between time $t_0 = 0 \le t_1$ and $t_{N+1} = T \ge t_N$
– Note that there are $M$ event types/labels and $N$ events in the dataset
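As a concrete illustration of the event-dataset notation, here is a minimal sketch that stores a dataset as a time-ordered list of (label, timestamp) pairs over an observation period that starts at 0. The helper name `make_event_dataset`, the `horizon` argument, and the toy health events are assumptions made for this example; they do not come from the slides.

```python
from typing import List, Tuple

Event = Tuple[str, float]  # (event label, occurrence time)


def make_event_dataset(events: List[Event], horizon: float) -> List[Event]:
    """Return the events sorted by time, checking they lie in [0, horizon].

    Hypothetical helper: `horizon` plays the role of the end of the
    observation period.
    """
    ordered = sorted(events, key=lambda e: e[1])
    if ordered and (ordered[0][1] < 0.0 or ordered[-1][1] > horizon):
        raise ValueError("event times must lie within [0, horizon]")
    return ordered


# Toy data, invented for illustration only.
raw = [
    ("prescription refill", 7.5),
    ("hospital admission", 2.0),
    ("home health visit", 4.0),
    ("hospital admission", 9.0),
]
dataset = make_event_dataset(raw, horizon=10.0)
labels = sorted({label for label, _ in dataset})

print(len(labels), "event labels,", len(dataset), "events")
```

Keeping the list sorted by time matches the temporal-ordering assumption above and makes window lookups straightforward.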
Proximal Graphical Event Model (PGEM)

[Figure: example graph over nodes A, B, C with edge windows w_aa, w_ab, w_ac, w_bc]

• PGEM = $\{G, W, \Lambda\}$: a graph, a set of (time) windows, one on each edge, and conditional intensity parameters
• Assumption: The intensity of an event label (node) depends on whether or not its parents have happened at least once in their respective recent histories

Formally, denoting a node $X$'s parents as $\mathbf{U}$:
• $G = (L, E)$, where $L$ is the event label set
• There is a window for every edge: $W = \{w_x : \forall X \in L\}$, where $w_x = \{w_{zx} : \forall Z \in \mathbf{U}\}$
• There is an intensity parameter for every node $X$ and for every instantiation $\mathbf{u}$ of its parent occurrences: $\Lambda = \{\lambda_{x \mid \mathbf{u}}\}$
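The assumption that a node's intensity depends only on whether each parent occurred at least once in its recent window can be sketched as a lookup: compute one boolean per parent, then index a table of intensities. The function names, the half-open window convention `[t - w, t)`, and all numbers below are assumptions for illustration, not values from the slides.

```python
from typing import Dict, List, Tuple

Event = Tuple[str, float]  # (event label, occurrence time)


def parent_state(events: List[Event], windows: Dict[str, float], t: float) -> Tuple[bool, ...]:
    """One boolean per parent (in sorted label order): did it occur in [t - w, t)?

    Assumption: the window is half-open, so an event at exactly time t
    does not count toward its own recent history.
    """
    return tuple(
        any(label == parent and t - w <= s < t for label, s in events)
        for parent, w in sorted(windows.items())
    )


def intensity_at(events, windows, rates, t):
    """Look up the conditional intensity for the parent state active at time t."""
    return rates[parent_state(events, windows, t)]


# Toy example: node C has parents A (window 2.0) and B (window 4.0).
events = [("A", 1.0), ("B", 3.0), ("A", 6.0)]
windows_for_c = {"A": 2.0, "B": 4.0}
rates_for_c = {  # hypothetical intensities, keyed by (A recent?, B recent?)
    (False, False): 0.1,
    (False, True): 0.5,
    (True, False): 0.8,
    (True, True): 2.0,
}

print(parent_state(events, windows_for_c, 4.0))
print(intensity_at(events, windows_for_c, rates_for_c, 4.0))
```

With k parents there are 2^k possible parent states, so the intensity table for a node grows exponentially in its number of parents.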
Parameter and Structure Learning

Learning problem: Given an event dataset $\mathbf{D}$, learn PGEM = $\{G, W, \Lambda\}$
• Log-likelihood: $\log \mathcal{L}(\mathbf{D}) = \sum_{X \in L} \sum_{\mathbf{u}} \left( -\lambda_{x \mid \mathbf{u}} D(\mathbf{u}) + N(x; \mathbf{u}) \ln \lambda_{x \mid \mathbf{u}} \right)$
– $N(x; \mathbf{u})$: number of times $X$ is observed while the condition $\mathbf{u}$ is true in the relevant windows
– $D(\mathbf{u})$: duration over the entire time period where the condition $\mathbf{u}$ is true
• For a given graph, finding the optimal (MLE) conditional intensities is easy once the windows are given, but finding the optimal windows is hard!
• Contribution 1: Analysis and proof that reduce the window search to a finite set that is algorithmically constructed.
• Contribution 2: A method to search over graph structures, with some theoretical results on efficient search and a consistency justification
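To make the "easy" part concrete, the sketch below computes, for one node with fixed windows, the count of its events under each parent state and the total duration each state is active, then takes the standard piecewise-constant maximum-likelihood rate (count divided by duration). The function names, the half-open window convention `[t - w, t)`, and the toy data are assumptions for illustration; the window search and the structure search are not covered.

```python
import math
from collections import defaultdict


def parent_state(events, windows, t):
    """One boolean per parent: did it occur in [t - w, t)? (assumed convention)"""
    return tuple(
        any(label == parent and t - w <= s < t for label, s in events)
        for parent, w in sorted(windows.items())
    )


def sufficient_stats(events, node, windows, horizon):
    """Counts of `node` per parent state and duration of each parent state."""
    # The parent state can only change when a parent occurs or its window expires.
    cuts = {0.0, horizon}
    for label, s in events:
        if label in windows:
            cuts.add(min(s, horizon))
            cuts.add(min(s + windows[label], horizon))
    cuts = sorted(cuts)

    durations = defaultdict(float)
    for a, b in zip(cuts, cuts[1:]):
        # The state is constant between cuts, so evaluate it at the midpoint.
        durations[parent_state(events, windows, (a + b) / 2.0)] += b - a

    counts = defaultdict(int)
    for label, t in events:
        if label == node:
            counts[parent_state(events, windows, t)] += 1
    return counts, durations


def mle_and_loglik(counts, durations):
    """Rate = count / duration per state, plus the resulting log-likelihood."""
    rates, loglik = {}, 0.0
    for state, d in durations.items():
        n = counts.get(state, 0)
        rates[state] = n / d
        if n > 0:
            loglik += n * math.log(rates[state])
        loglik -= rates[state] * d
    return rates, loglik


# Toy example: node C with a single parent A and window 2.0 (invented data).
events = [("A", 1.0), ("C", 2.0), ("A", 5.0), ("C", 5.5), ("C", 9.0)]
counts, durations = sufficient_stats(events, "C", {"A": 2.0}, horizon=10.0)
rates, loglik = mle_and_loglik(counts, durations)
print(dict(counts), dict(durations), rates, loglik)
```

This sketch assumes all event times are strictly positive and within the horizon. It rescans the event list for every interval, which is quadratic in the number of events; that is acceptable for illustration but not for large datasets.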
Results: Synthetic Datasets

Wed Dec 5, 5:00 – 7:00 pm, Room 210 & 230 AB #6