Modeling Intention in Email:
Speech Acts, Information Leaks and User Ranking Methods
Vitor R. Carvalho
Carnegie Mellon University
William Cohen, Tom Mitchell, Jon Elsas, Ramnath Balasubramanyan
The most successful e-communication application.
– Great tool to collaborate, especially in different time zones.
– Very cheap, fast, convenient and robust. It just works.
Increasingly popular
– Clinton adm. left 32 million emails to the National Archives [Shipley & Schwalbe, 2007]
– Bush adm.: more than 100 million in 2009 (expected)
Visible impact
– Office workers in the U.S. spend at least 25% of the day on email, not counting handheld use
[Dabbish & Kraut, CSCW-2006] [Bellotti et al., HCI-2005]
From: Benjamin Han
To: Vitor Carvalho
Subject: LTI Student Research Symposium

Hey Vitor
When exactly is the LTI SRS submission deadline?
Also, don’t forget to ask Eric about the SRS webpage.
Thanks.
Ben

– Prioritize email by “intention”
– Help keep track of your tasks: commitments, reminders, answers, etc.
– Better integration with to-do
Add Task: follow up on “request for screen shots” by [2] days before?
– “next Wed” (12/5/07)
– “end of the week” (11/30/07)
– “Sunday” (12/2/07)
Highlighted spans: Request, Time/date
[Cohen, Carvalho & Mitchell, EMNLP-04]
– Verb-noun pair (e.g., propose meeting, request information). Not all pairs make sense.
– A single message may contain multiple acts.
– … than all possible speech acts in English.
– … usage of email (e.g., delivery of files).
[Figure: act taxonomy. Verbs: Commissive, Directive, Deliver, Commit, Request, Propose, Amend. Nouns: Activity, Delivery, Ongoing, Event, Meeting, Other, Opinion, Data.]
Data: Carnegie Mellon MBA students competition
– Semester-long project for CMU MBA students. Total of 277 students, divided into 50 teams (4 to 6 students/team). Rich in task negotiation.
– 1700+ messages (from 5 teams) were manually labeled. One of the teams was double labeled, and the inter-annotator agreement ranges from 0.72 to 0.83 (Kappa) for the most frequent acts.
Features:
– N-grams: 1-gram, 2-gram, 3-gram, 4-gram and 5-gram
– Pre-Processing
   – Remove signature files and quoted lines (in-reply-to) [Jangada package]
   – Entity normalization and substitution patterns: “Sunday”…“Monday” → [day], [number]:[number] → [hour], “me, her, him, us or them” → [me], “after, before, or during” → [time], etc.
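The substitution patterns and n-gram features listed above can be illustrated with a minimal sketch. The regular expressions, the tokenizer and the function names `normalize` and `ngrams` are assumptions made for illustration; the actual preprocessing (including the Jangada signature and quote removal) is not reproduced here.

```python
import re

# Hypothetical substitution patterns modeled on the examples in the slide.
SUBSTITUTIONS = [
    (re.compile(r"\b(?:sunday|monday|tuesday|wednesday|thursday|friday|saturday)\b", re.I), "[day]"),
    (re.compile(r"\b\d{1,2}:\d{2}\b"), "[hour]"),
    (re.compile(r"\b(?:me|her|him|us|them)\b", re.I), "[me]"),
    (re.compile(r"\b(?:after|before|during)\b", re.I), "[time]"),
]

def normalize(text):
    """Replace entity mentions with placeholder tokens."""
    for pattern, token in SUBSTITUTIONS:
        text = pattern.sub(token, text)
    return text

def ngrams(text, max_n=5):
    """Return all 1-grams up to max_n-grams of the normalized text."""
    tokens = re.findall(r"\[\w+\]|\w+", normalize(text).lower())
    feats = []
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            feats.append(" ".join(tokens[i:i + n]))
    return feats

print(normalize("Call me before Sunday at 10:30"))
```

Normalizing first means that phrases such as "before Sunday" and "after Monday" share the same n-gram, "[time] [day]", which is the kind of generalization the substitution patterns aim for.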
[ Carvalho & Cohen, HLT-ACTS-06] [ Cohen, Carvalho & Mitchell, EMNLP-04]
[Figure: precision vs. recall curves comparing 1g (1716 msgs) with 1g+2g+3g+PreProcess.]
5-fold cross-validation over 1716 emails, SVM with linear kernel
Ciranda: Java package for Email Speech Act Classification
Example of Email Thread Sequence [Carvalho & Cohen, SIGIR-05]
[Figure: thread messages labeled with acts such as Request, Propose, Deliver and Commit.]
– Strong correlation between previous and next message’s acts
– … act has little or no correlation with other acts
– Both Context and Content have predictive value for email act classification
Context: Collective classification problem [Carvalho & Cohen, SIGIR-05]
[Figure: a Current Message linked to its Parent Message and Child Message, with acts such as Commit, Request and Deliver.]
– Joint probability distribution is approximated with a set of conditional distributions that can be learned.
– Conditional probabilities are calculated for each node given its Markov blanket:
  $$\Pr(X) = \prod_i \Pr(X_i \mid \mathrm{Blanket}(X_i))$$
  [Heckerman et al., JMLR-00] [Neville & Jensen, JMLR-07]
– Inference: Temperature-driven Gibbs sampling
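To make the inference step concrete, here is a toy sketch of temperature-driven Gibbs sampling over a small thread. Everything specific in it is an assumption: the three-act label set, the `content` scores (standing in for a content-only classifier), the `compat` table (standing in for learned parent/child dependencies) and the linear cooling schedule. It illustrates the general technique, not the model from the paper.

```python
import random

ACTS = ["Request", "Commit", "Deliver"]  # hypothetical label set

def conditional(node, labels, content, compat, parent):
    """Unnormalized Pr(X_i | Blanket(X_i)): content score times neighbor compatibility."""
    scores = []
    for act in ACTS:
        s = content[node][act]
        p = parent[node]
        if p is not None:                      # parent message in the thread
            s *= compat[(labels[p], act)]
        for child, cp in enumerate(parent):    # child messages of this node
            if cp == node:
                s *= compat[(act, labels[child])]
        scores.append(s)
    return scores

def annealed_gibbs(content, compat, parent, sweeps=50, t0=2.0, seed=0):
    rng = random.Random(seed)
    labels = [max(ACTS, key=c.get) for c in content]    # content-only start
    for sweep in range(sweeps):
        temp = max(t0 * (1 - sweep / sweeps), 0.05)     # assumed cooling schedule
        for node in range(len(labels)):
            scores = conditional(node, labels, content, compat, parent)
            weights = [s ** (1.0 / temp) for s in scores]
            labels[node] = rng.choices(ACTS, weights=weights)[0]
    return labels

compat = {(a, b): 0.1 for a in ACTS for b in ACTS}
compat[("Request", "Deliver")] = 0.8   # a Request is often answered by a Deliver
content = [
    {"Request": 0.8, "Commit": 0.1, "Deliver": 0.1},
    {"Request": 0.4, "Commit": 0.3, "Deliver": 0.3},
]
print(annealed_gibbs(content, compat, parent=[None, 0]))
```

In this toy thread the second message looks most like a Request on content alone, but the context of its parent pulls it toward Deliver once the temperature is low.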
[Figure: bar chart of Kappa values (%) for Baseline vs. Collective classification on the acts Commissive, Commit, Meeting, Directive, Request, Propose, Deliver and dData.]
– Modest improvements
– Only on acts related to negotiation: Request, Commit, Propose, Meet, Commissive, etc.
– “Sparse” links
Kappa values with and without collective classification, averaged over four team test sets in the leave-one-team-out experiment.
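Since the results are reported as Kappa values, a short sketch of Cohen's kappa may help when reading them: it is the observed agreement corrected for the agreement expected by chance. The function name and the toy labels are illustrative only.

```python
from collections import Counter

def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa between two label sequences of equal length."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement: product of the marginal label frequencies.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Toy example: 1 = message has the act, 0 = it does not.
print(cohen_kappa([1, 1, 0, 0], [1, 0, 0, 0]))
```

The sketch does not guard against the degenerate case where chance agreement equals 1 (both sequences constant and identical), which would divide by zero.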
[Kushmerick & Khousainov, IJCAI-05]
[Leuski, SIGIR-04] [Carvalho, Wu & Cohen, CEAS-07]
[Feng et al., HLT/NAACL-06]
[Dredze, Lau & Kushmerick, IUI-06]
http://www.sophos.com/
Email Leak: email accidentally sent to wrong person [Carvalho & Cohen, SDM-07]
– … names, aliases, etc.
– … completion of email addresses
– Keyboard settings
Disastrous consequences: expensive lawsuits, brand reputation damage, negotiation setbacks, etc.
No labeled data … this kind of data?
[Carvalho & Cohen, SDM-07]
1. Create simulated/artificial email recipients
2. Build model for (msg.recipients): train classifier on real data to detect synthetically created outliers (added to the true recipient list).
   Features: textual (subject, body), network features (frequencies, co-occurrences, …)
3. Detect outlier and warn user based …
– Frequent typos, same/similar last names, identical/similar first names, aggressive auto-completion of addresses, etc.
– On each trial, one of the msg recipients is randomly chosen and an outlier is generated according to:
[Flowchart with branches 1, 2 and 3; example address: Marina.wang@enron.com]
   – Generate a random email address NOT in Address Book
   – Else: randomly select an address book entry
– For each user, ~10% most recent sent messages were used as test
– Some basic preprocessing
– Remove repeated messages and inconsistencies
– List provided by Corrada-Emmanuel from UMass
– Messages were represented as the union of BOW of body and BOW of subject
– Create a “TfIdf centroid” for each user in Address Book. For testing, rank according to cosine similarity between test msg and each centroid.
– Given a test msg, get 30 most similar msgs in training set. Rank according to “sum of similarities” of a given user on the 30-msg set.
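The two text-only rankers described above (a TfIdf centroid per address-book user, and the sum of similarities over the 30 nearest training messages) can be sketched as follows. The IDF formula, the unweighted centroid (a plain sum of message vectors, with no negative examples) and all function names are simplifying assumptions, not the exact formulation used in the experiments.

```python
import math
from collections import Counter, defaultdict

def build_idf(train_docs):
    """train_docs: list of token lists. Smoothed IDF (an assumed variant)."""
    n = len(train_docs)
    df = Counter(t for doc in train_docs for t in set(doc))
    return {t: math.log(1 + n / c) for t, c in df.items()}

def vectorize(tokens, idf):
    return {t: c * idf[t] for t, c in Counter(tokens).items() if t in idf}

def cosine(u, v):
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def centroid_rank(test_vec, train):
    """train: list of (vector, recipients). Rank users by cosine to their centroid."""
    centroids = defaultdict(lambda: defaultdict(float))
    for vec, recipients in train:
        for user in recipients:
            for t, w in vec.items():
                centroids[user][t] += w
    scores = {u: cosine(test_vec, c) for u, c in centroids.items()}
    return sorted(scores, key=scores.get, reverse=True)

def knn_rank(test_vec, train, k=30):
    """Rank users by the sum of similarities over the k most similar messages."""
    nearest = sorted(train, key=lambda m: cosine(test_vec, m[0]), reverse=True)[:k]
    scores = defaultdict(float)
    for vec, recipients in nearest:
        sim = cosine(test_vec, vec)
        for user in recipients:
            scores[user] += sim
    return sorted(scores, key=scores.get, reverse=True)

docs = [(["budget", "report"], ["alice"]),
        (["budget", "meeting"], ["alice"]),
        (["party", "friday"], ["bob"])]
idf = build_idf([tokens for tokens, _ in docs])
train = [(vectorize(tokens, idf), recips) for tokens, recips in docs]
query = vectorize(["budget", "report"], idf)
print(centroid_rank(query, train), knn_rank(query, train, k=2))
```

For leak detection, a recipient who ranks near the bottom of such a list is a candidate outlier.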
[Figure: bar chart of accuracy for Random, Rocchio and KNN-30.]
Average Accuracy in 10 trials: On each trial, a different set
1. Frequency features
– Number of received, sent and sent+received messages (from this user)
2. Co-Occurrence Features
– Number of times a user co-occurred …
3. Auto features
– For each recipient R, find Rm (=address with max score from 3g-address list of R), then use score(R)-score(Rm) as feature.
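As an illustration of the frequency and co-occurrence features, the sketch below counts them from a list of sent messages. The input format (one recipient list per message) and the function names are assumptions; received-message counts and the 3-gram "auto" feature are not covered.

```python
from collections import Counter
from itertools import combinations

def network_counts(sent_messages):
    """sent_messages: list of recipient lists from one user's sent folder."""
    sent_to = Counter()    # how often each address was a recipient
    cooccur = Counter()    # how often two addresses shared a message
    for recipients in sent_messages:
        unique = sorted(set(recipients))
        sent_to.update(unique)
        cooccur.update(combinations(unique, 2))
    return sent_to, cooccur

def cooccurrence_feature(candidate, others, cooccur):
    """Total co-occurrence count of a candidate with the other recipients."""
    return sum(cooccur[tuple(sorted((candidate, o)))]
               for o in others if o != candidate)

sent_to, cooccur = network_counts([["a", "b"], ["a", "b", "c"], ["c"]])
print(sent_to["a"], cooccurrence_feature("c", ["a", "b"], cooccur))
```

A recipient with a low co-occurrence count relative to the rest of the recipient list is the kind of signal these features are meant to expose.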
Combine with text-only scores using perceptron-based reranking, trained on simulated leaks.
[Diagram labels: Network Features, Text-based Feature]
[ Carvalho & Cohen, SDM-07]
[Figure: Leak Prediction accuracy for Random, TfIdf, Knn30, Knn30+Frequency, Knn30+Cooccur1, Knn30+Cooccur_to and Knn30+All Above.]
– Grep for “mistake”, “sorry” or “accident”
– Note: must be from one of the Enron users
– Examples: “Sorry. Sent this to you by mistake.”, “I accidentally sent you this reminder”
– … alex.perkins@…
– kitchen-l/sent items/497: it has 44 recipients, leak is rita.wynne@…
– The proposed algorithm was able to find these two leaks
CC an important collaborator: a manager, a colleague, a contractor, an intern, etc. [Carvalho & Cohen, ECIR-2008]
– More frequent than expected (from Enron Collection)
– Communication delays, deadlines can be missed, opportunities wasted, costly misunderstandings, task delays
CC/BCC prediction
… means “two recipients were addressed in the same message in the training set”
[Figure: MAP of Frequency, Recency, TFIDF, KNN and Perceptron on the TOCCBCC and CCBCC tasks, 36 Enron users.]
[Carvalho & Cohen, ECIR-08] 44000+ queries, avg. ~1267 queries/user
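The recipient-ranking results are reported as MAP (mean average precision). A small sketch of the metric, with illustrative function names and toy data, where each query is one message and the relevant items are its true recipients:

```python
def average_precision(ranked, relevant):
    """Average of the precision values at the rank of each relevant item."""
    hits, total = 0, 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0

def mean_average_precision(queries):
    """queries: list of (ranked candidate list, set of true recipients)."""
    return sum(average_precision(r, rel) for r, rel in queries) / len(queries)

print(average_precision(["a", "b", "c", "d"], {"a", "c"}))
```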
Rankings combined by Reciprocal Rank:
$$RR(i) = \sum_{q \in \mathrm{Rankings}} \frac{1}{\mathrm{rank}_q(i)}$$
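A minimal sketch of combining several rankings by summed reciprocal rank follows. It assumes that a candidate missing from a ranking contributes nothing to the sum and that ties are broken alphabetically; neither choice comes from the source.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings):
    """rankings: list of ranked candidate lists, best candidate first."""
    scores = defaultdict(float)
    for ranking in rankings:
        for position, candidate in enumerate(ranking, start=1):
            scores[candidate] += 1.0 / position
    # Highest fused score first; alphabetical order breaks ties.
    return sorted(scores, key=lambda c: (-scores[c], c))

fused = reciprocal_rank_fusion([["a", "b", "c"],
                                ["b", "a", "c"],
                                ["b", "c", "a"]])
print(fused)
```

Because only ranks are used, the base rankers (for example frequency, recency, TFIDF and KNN) do not need comparable score scales.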
[Figure: MAP of Freq, Rec, M1-uc, M2-uc, TFIDF, KNN and Fusion on TOCCBCC (thread), CCBCC (thread), TOCCBCC and CCBCC.]
[Carvalho & Cohen, ECIR-08]
Mozilla Thunderbird plug-in (Cut Once)
Leak warnings: hit x to remove recipient
Suggestions: hit + to add
Timer: msg is sent after 10 sec by default
[Joachims, KDD-02] [Cao et al., ICML-07]
– Online, scalable.
[Elsas, Carvalho & Carbonell, WSDM-08] [Freund et al, 2003]
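[Figure: for a query q, documents d1, d2, d3, d4, d5, d6, ..., dT are ordered by Rank.]
Each document d_i is scored with a linear function of its m features:
$$f(d_i) = \langle \mathbf{w}, d_i \rangle = w_1 x_{i1} + w_2 x_{i2} + \dots + w_m x_{im}$$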
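The linear scoring function can be illustrated with a few lines of code: documents are sorted by their score, and a pair of a relevant and a non-relevant document counts as a misrank when the relevant one does not score strictly higher. Feature values, weights and function names below are made up for the example.

```python
def score(w, x):
    """f(d) = <w, d> for a feature vector x."""
    return sum(wi * xi for wi, xi in zip(w, x))

def rank(w, docs):
    """docs: dict name -> feature vector. Highest score first."""
    return sorted(docs, key=lambda name: score(w, docs[name]), reverse=True)

def misranks(w, pairs):
    """Count (relevant, non_relevant) feature pairs ordered wrongly by w."""
    return sum(score(w, r) <= score(w, nr) for r, nr in pairs)

w = [1.0, 0.5]
docs = {"d1": [0.2, 0.1], "d2": [0.9, 0.0], "d3": [0.1, 1.0]}
print(rank(w, docs))
```

Pairwise learners work on exactly these (relevant, non-relevant) pairs, which is why the number of misranks is the natural training error.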
[Elsas, Carvalho & Carbonell, 2008] [Collins, 2002; Gao et al, 2005]
– bound on the number of misranks
– Voting, averaging, committee, pocket, etc.
– General update rule:
  $$\mathbf{w}^{t+1} = \mathbf{w}^{t} + d_R - d_{NR}$$
– Here: Averaged version of perceptron
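A minimal sketch of an averaged ranking perceptron built on the update rule above: whenever a relevant document d_R does not outscore a non-relevant document d_NR, the weight vector moves by d_R - d_NR, and the returned model is the average of the weight vectors over all steps. The number of epochs, the shuffling and the zero initialization are assumptions.

```python
import random

def averaged_rank_perceptron(pairs, dim, epochs=10, seed=0):
    """pairs: list of (d_R, d_NR) feature vectors of length dim."""
    rng = random.Random(seed)
    w = [0.0] * dim
    w_sum = [0.0] * dim
    steps = 0
    for _ in range(epochs):
        order = list(pairs)
        rng.shuffle(order)
        for d_r, d_nr in order:
            margin = sum(wi * (a - b) for wi, a, b in zip(w, d_r, d_nr))
            if margin <= 0:                       # misrank: apply the update
                w = [wi + a - b for wi, a, b in zip(w, d_r, d_nr)]
            w_sum = [s + wi for s, wi in zip(w_sum, w)]
            steps += 1
    return [s / steps for s in w_sum]

pairs = [([1.0, 0.0], [0.0, 1.0]), ([2.0, 1.0], [1.0, 2.0])]
print(averaged_rank_perceptron(pairs, dim=2))
```

Averaging makes the final model less sensitive to the last few updates than the plain perceptron, at no extra cost per step.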
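RankSVM [Joachims, KDD-02], [Herbrich et al, 2000]
$$\min_{\mathbf{w}} L_{ranksvm} = \frac{1}{2}\lVert \mathbf{w} \rVert^2 + C \sum_{i \in RP} \varepsilon_i$$
subject to $\varepsilon_i \ge 0$, $\langle \mathbf{w}, d_R - d_{NR} \rangle_i \ge 1 - \varepsilon_i$, $RP = \{(d_R, d_{NR})\}$
Equivalent to:
$$\min_{\mathbf{w}} L_{ranksvm} = \lambda \lVert \mathbf{w} \rVert^2 + \sum_{RP} \left[ 1 - \langle \mathbf{w}, d_R - d_{NR} \rangle \right]_+ , \quad \text{where } \lambda = \frac{1}{2C}$$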
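The unconstrained form of the RankSVM objective (a pairwise hinge loss plus an L2 penalty) is easy to evaluate directly. The sketch below only computes the objective for a given weight vector; it is not a solver, and the names are illustrative.

```python
def ranksvm_objective(w, pairs, lam):
    """lam * ||w||^2 + sum over pairs of [1 - <w, d_R - d_NR>]_+"""
    hinge = 0.0
    for d_r, d_nr in pairs:
        margin = sum(wi * (a - b) for wi, a, b in zip(w, d_r, d_nr))
        hinge += max(0.0, 1.0 - margin)   # zero once the pair clears the margin
    return lam * sum(wi * wi for wi in w) + hinge

pairs = [([1.0, 0.0], [0.0, 1.0]),    # margin 2.0, no loss
         ([0.5, 0.5], [0.5, 0.0])]    # margin -0.5, loss 1.5
print(ranksvm_objective([1.0, -1.0], pairs, lam=0.1))
```

Note that the hinge grows linearly with how badly a pair is misranked, so a single noisy pair can contribute an arbitrarily large loss.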
[Elsas, Carvalho & Carbonell, WSDM-08]
[Figures: loss plotted against the pairwise score of a relevant (R) and a non-relevant (NR) document.]
$$1 - \mathrm{sigmoid}(x) = 1 - \frac{1}{1 + e^{-\sigma x}} = \frac{e^{-\sigma x}}{1 + e^{-\sigma x}}$$
Robust to outliers
Base Ranker → base ranking model → Sigmoid Rank → final model
– Base Ranker: e.g., RankSVM, Perceptron, ListNet, etc.
– Sigmoid Rank: non-convex. Minimize (a very close approximation for) the empirical error: number of misranks. Robust to outliers (label noise).
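$$\mathrm{sigmoid}(x) = \frac{1}{1 + e^{-\sigma x}}$$
$$L_{SigmoidRank}(\mathbf{w}) = \sum_{RP} \left[ 1 - \mathrm{sigmoid}(\langle \mathbf{w}, d_R - d_{NR} \rangle) \right] + \lambda \lVert \mathbf{w} \rVert^2$$
Gradient descent with step size $\eta_k$:
$$\mathbf{w}^{(k+1)} = \mathbf{w}^{(k)} - \eta_k \nabla L_{SigmoidRank}(\mathbf{w}^{(k)})$$
$$\nabla L_{SigmoidRank}(\mathbf{w}^{(k)}) = 2\lambda \mathbf{w}^{(k)} - \sigma \sum_{RP} \mathrm{sigmoid}(\langle \mathbf{w}^{(k)}, d_R - d_{NR} \rangle) \left[ 1 - \mathrm{sigmoid}(\langle \mathbf{w}^{(k)}, d_R - d_{NR} \rangle) \right] (d_R - d_{NR})$$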
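The second stage can be sketched as plain gradient descent on a sigmoid-based pairwise loss, started from the weights of a base ranker. The gradient used here is the derivative of that loss under the sigmoid definition with slope sigma; the constant step size `eta`, the default hyperparameters and the starting vector are assumptions for illustration.

```python
import math

def sigmoid(x, sigma=1.0):
    return 1.0 / (1.0 + math.exp(-sigma * x))

def sigmoid_rank_loss(w, pairs, lam, sigma=1.0):
    """lam * ||w||^2 + sum over pairs of 1 - sigmoid(<w, d_R - d_NR>)."""
    loss = lam * sum(wi * wi for wi in w)
    for d_r, d_nr in pairs:
        margin = sum(wi * (a - b) for wi, a, b in zip(w, d_r, d_nr))
        loss += 1.0 - sigmoid(margin, sigma)
    return loss

def sigmoid_rank(w0, pairs, lam=0.01, sigma=1.0, eta=0.1, epochs=30):
    """Gradient descent from the base ranker's weights w0."""
    w = list(w0)
    for _ in range(epochs):
        grad = [2.0 * lam * wi for wi in w]
        for d_r, d_nr in pairs:
            diff = [a - b for a, b in zip(d_r, d_nr)]
            s = sigmoid(sum(wi * di for wi, di in zip(w, diff)), sigma)
            coeff = sigma * s * (1.0 - s)
            grad = [g - coeff * di for g, di in zip(grad, diff)]
        w = [wi - eta * g for wi, g in zip(w, grad)]
    return w

pairs = [([1.0, 0.0], [0.0, 1.0]), ([2.0, 1.0], [1.0, 2.0])]
w0 = [0.1, 0.0]                      # stand-in for a base ranker's weights
w = sigmoid_rank(w0, pairs)
print(sigmoid_rank_loss(w0, pairs, 0.01), sigmoid_rank_loss(w, pairs, 0.01))
```

Because the loss is bounded per pair, a badly misranked (possibly mislabeled) pair contributes at most 1, which is the robustness property the slides attribute to the sigmoid stage. The loss is non-convex, so the result depends on the starting point.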
[Figure: MAP on TOCCBCC and CCBCC for Frequency, Recency, TFIDF, KNN, Percep, Percep+Sigmoid, RankSVM, RankSVM+Sigmoid, ListNet and ListNet+Sigmoid, annotated with relative gains and p-values. 36 Enron users.]
44000+ queries, avg. ~1267 queries/user
[Figure: AUC on TOCCBCC and CCBCC for Percep, Percep+Sigmoid, RankSVM, RankSVM+Sigmoid, ListNet and ListNet+Sigmoid, annotated with relative gains and p-values. 36 Enron users.]
[Figure: R-Precision on TOCCBCC and CCBCC for Percep, Percep+Sigmoid, RankSVM, RankSVM+Sigmoid, ListNet and ListNet+Sigmoid, annotated with relative gains. 36 Enron users.]
[Figure: MAP values on the CCBCC task for the 36 Enron users, one scatter plot per base ranker: Perceptron vs. Perceptron+Sigmoid, RankSVM vs. RankSVM+Sigmoid, ListNet vs. ListNet+Sigmoid.]
[Wang & Cohen, ICDM-2007]
[Figure: MAP on SEAL-1, SEAL-2 and SEAL-3 for Percep, Percep+Sigmoid, RankSVM, RankSVM+Sigmoid, ListNet and ListNet+Sigmoid, annotated with relative gains.]
[18 features, ~120/60 train/test splits, ~half relevant]
[Liu et al, SIGIR-LR4IR 2007]
[Figure: MAP on Ohsumed, Trec3 and Trec4 for Percep, Percep+Sigmoid, RankSVM, RankSVM+Sigmoid, ListNet and ListNet+Sigmoid, annotated with relative gains.]
[# queries/# features: (106/25) (50/44) (75/44)]
[Figure: training AUC vs. epoch (gradient descent iteration) for perceptron+sigmoid, rankSVM+sigmoid, ListNet+sigmoid and Random+sigmoid. TOCCBCC, Enron user lokay-m.]
– A few steps to convergence
– … good starting point
(empirical risk minimization framework)
[… al., 2000] [Levin et al., 03]
… with larger number of acts.
Email is new domain
… discovery [Bennett & Carbonell, 2005], Activity classification [Dredze et al., 2006], Task-focused email summary [Corston-Oliver et al, 2004], Predicting Social Roles [Leuski, 2004], etc.
– [Boufaden et al., 2005]: … privacy breaches (student names, student grades, IDs).
– [Pal & McCallum, 2006], [Dredze et al., 2008]: CC prediction problem, recipient prediction based on summary keywords
– Expert Search in Email: … 2006], [Balog et al, 2006], [Soboroff, Craswell, de Vries (TREC-Enterprise 2005-06-07…)]