Do Developers Feel Emotion? An Exploratory Analysis of Emotions (PowerPoint Presentation)


SLIDE 1

Do Developers Feel Emotion? An Exploratory Analysis of Emotions.

SLIDE 2

Motivation

  • Feelings and emotions dictate our actions and decisions to a large extent.

  • Developers' potential and productivity can be fully unlocked only if people feel safe and happy.

  • It is important to support managers and project leaders in detecting emotions.

SLIDE 3

Final Goal

  • Building a tool for automatic emotion detection. As a first step:

  • Can emotions actually be detected from issue reports?

  • If so, can humans actually agree on the identified emotions?

SLIDE 4

Our Approach

  • A significant sample of developers’ comments from the Apache issue repository was analyzed based on Parrott’s emotional framework.

  • Can human raters, without any training, agree on the presence of emotions in issue reports?

  • Does training improve the agreement of human raters?

  • Does context improve the agreement of human raters?

SLIDE 5

Related Work

Ahmed Hassan et al. tried to answer these questions:

  • What is the personality type of OSS developers?

  • Does the language and attitude of a developer change as he or she moves from being a current to a departing developer?

SLIDE 6

Related Work

  • Guzman et al. proposed an approach to improve emotional awareness in software development teams by means of quantitative emotion summaries.

  • Their approach automatically extracts and summarizes emotions expressed in collaboration artifacts by combining probabilistic topic modelling with lexical sentiment analysis techniques.

SLIDE 7

Emotion Mining

  • Emotion mining tries to identify the presence of emotions like joy or fear.

  • Sentiment analysis evaluates a given emotion as being positive or negative.

SLIDE 8

Emotion Mining in Software Engineering

  • Applied to text artifacts, it can be used to provide hints on factors responsible for joy and satisfaction, or fear and anger, among developers.

  • It provides a different perspective to interpret productivity and job satisfaction.

SLIDE 9

Parrott’s Framework

SLIDE 10

Issue Tracking System

  • A repository used by software companies to organize software maintenance and evolution.

  • Team members submit and discuss issues, including bugs and feature requests, ask for advice or share opinions.

  • It might reveal how committers feel towards a bug, feature, project or even their colleagues.

  • Each issue is characterized by several attributes like: priority, status, type (improvement, perfective maintenance, new feature, corrective maintenance, adaptive maintenance).

SLIDE 11

Experimental Setup

  • Goal: understand the kinds of emotions found in issue reports.

  • Four authors rated issue reports from open source systems.

  • The identified emotions and the raters’ agreement were analyzed.

SLIDE 12

Dataset

  • Issue repository of the Apache Software Foundation.

  • It hosts 117 open source projects, ranging from large, long-lived ones to small ones, providing representative data.

SLIDE 13

Dataset

  • Issue reports from the 19th of October 2000 until July 2013.

  • Developers’ comments + issue report attributes.

  • No distinction between bugs, new features, and enhancements.

  • Granularity: issue comment level.

  • A large enough number of issue comments to obtain a 95% confidence level.
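The 95% confidence level mentioned above corresponds to the standard sample-size formula for estimating a proportion. A minimal sketch (the z-score, margin of error, and conservative proportion are textbook defaults, not values taken from the slides):

```python
import math

def sample_size(z=1.96, margin=0.05, p=0.5):
    """Minimum sample size for estimating a proportion p with a given
    z-score (1.96 for 95% confidence) and margin of error."""
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)

# 95% confidence, 5% margin of error, most conservative proportion:
n = sample_size()  # 385 comments
```

With these defaults the formula yields roughly the sample sizes used in the study (e.g. the 384 and 400 comments mentioned later).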

SLIDE 14

Emotion Mining

  • Each rater identified the emotions associated with each comment according to Parrott’s six primary emotions: love, joy, surprise, anger, sadness, fear.

  • Ratings were personal, based on a common understanding of Parrott’s framework.

  • No ground truth exists: agreement is considered as correct, with agreement determined by majority vote.
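The majority-vote aggregation described above can be sketched as follows (the function name and the binary present/absent encoding are illustrative assumptions):

```python
def majority_vote(ratings):
    """Aggregate per-rater binary ratings (1 = emotion present,
    0 = absent) for one comment; a strict majority is required,
    so ties count as absent."""
    return 1 if 2 * sum(ratings) > len(ratings) else 0

# Three of four raters marked the emotion as present:
label = majority_vote([1, 1, 0, 1])  # 1 (present)
```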

SLIDE 15

Examples

  • “I'm not so convinced that moving all the static methods out is useful” (Fear).

  • “How is a bunch of static methods on a utility class easier than a bunch of static methods within the HtmlCalendarRenderer better?” (Anger)

  • “The risk of introducing new bugs for no great benefit” (Fear).

  • “Previously almost all these helper methods were private; this patch makes them all public [...]” (Neutral)

SLIDE 16

Measuring Agreement

  • Degree of inter-rater agreement
  • Cohen’s kappa for two raters
  • Fleiss’ kappa for more than two raters
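Cohen’s kappa compares the observed agreement between two raters with the agreement expected by chance from each rater’s label frequencies. A minimal sketch for binary per-emotion ratings (the function name and label encoding are assumptions):

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters over the same items
    (e.g. 1 = emotion present in a comment, 0 = absent)."""
    assert len(rater_a) == len(rater_b)
    n = len(rater_a)
    # Observed agreement: fraction of items labelled identically.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement, from each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    return 1.0 if p_e == 1.0 else (p_o - p_e) / (1.0 - p_e)
```

Fleiss’ kappa generalizes the same observed-versus-chance comparison to more than two raters.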
SLIDE 17

Question 1

  • Can human raters, without any training, agree on the presence of emotions in issue reports?

  • Motivation: emotion mining from software development artifacts is not trivial, since they consist of unstructured data, are relatively short, and are written in an informal way.

SLIDE 18

Question 1: Approach

  • 400 issue report comments were arbitrarily assigned to two of the raters.

  • Each author selected the emotions that were present in the comment.

  • Once all comments had been annotated, the four files were collected and analyzed using Cohen’s kappa.

SLIDE 19

Question 1: Results

  • In 41% of the comments, the raters agreed on all 6 emotions, whereas 85% of comments do not contain any emotion.

  • Only for Love did the raters achieve more than slight agreement (a moderate value).

  • 6.5% agreed on the presence of a particular emotion (Love); 96.75% agreed on the absence of one (Surprise).

SLIDE 20

Result

  • While some emotions obtain higher agreement than others, only one emotion obtained moderate agreement, and raters agree the most on the absence of an emotion.

SLIDE 21

Question 2

  • Does training improve the agreement of human raters on the presence of emotions in issue reports?

  • Motivation: without thorough training, raters achieve only a slight agreement. This leads to the current question.

SLIDE 22

Question 2: Approach

  • Each rater compiled a list of generic expressions he or she felt insecure about.

  • A general example and the corresponding emotion were added for each expression.

  • 144 expressions were obtained.

  • A meeting was held for discussion.

  • A replication and refinement study was performed.

SLIDE 23

Question 2: Replication and Refinement Study

  • Replicated our study of RQ1 on a second sample.

  • The refinement study revisited the 235 comments of RQ1 with at least one emotion disagreement; all four authors decided about the occurrence of each emotion.

  • Why was the refinement done?
SLIDE 24

Question 2: Results

  • In 65% of the comments, the raters agreed on all 6 emotions.

  • Four out of six emotions improved from slight to fair agreement: Joy, Anger, Sadness and Fear.

  • 4.17% agreed on the presence of an emotion (Love).

  • 72.76% of comments obtained agreement by at least 3 raters.

SLIDE 25

Result

  • Training improves the overall agreement on emotions, as well as the agreement for most of the individual emotions. Love, joy and sadness are the most common emotions.

SLIDE 26

Question 3

  • Does context improve the agreement of human raters on the presence of emotions in issue reports?

  • Motivation: the previous experiments can be compared to eavesdropping on a group and catching just one phrase.

  • Due to the technical and unstructured nature of software development artifacts, the impact of context might be different than in literary English.

SLIDE 27

Question 3: Example

  • Sentence: “yeah right”
  • “moving to java 8 we solve all problems”
  • “breaking backward compatibility is risky”

SLIDE 28

Question 3: Approach

  • Experiment with two steps:

  • Replication of the RQ2 study: 384 comments, two raters.

  • The same analysis, with the context of those comments included.

SLIDE 29

Question 3: Results

  • Adding context reduces rater agreement for love.

  • More raters change their mind for comments with context.

  • Context seems to make raters doubt their rating, introducing more disagreement.

SLIDE 30

Discussion

  • A. Impact of Context:

  • At first, our findings seem counter-intuitive.

  • Using a simple yes/no decision as a rating is too great a simplification; multiple ratings should be used instead.

  • B. Do Emotions Really Matter for Issue Reports?

  • Our findings suggest there is a link between emotions and software development: reports with the “love” emotion tend to have a lower number of comments and a shorter fixing time.

SLIDE 31

Threats to Validity

  • Internal validity: we rely on the presence of a causal relationship between a developer’s emotions and what he or she writes in issue report comments.

  • Construct validity: ambiguity of messages and subjectivity of emotions. To reduce these threats:

  • Parrott’s framework was adopted;

  • the framework was explained and clarified;

  • each comment was analyzed by at least two authors.

SLIDE 32

Threats to Validity

  • External validity: replication of this work on other open source systems and on commercial projects is needed to confirm our findings.

  • Reliability validity: no ground truth exists against which to compare our findings; different groups of raters should, overall, obtain the same results.

SLIDE 33

Conclusion

  • Software development, as a collaborative activity of developers, is influenced by human emotions.

  • Issue reports do express emotions towards design choices, maintenance activities or colleagues.

  • Love, joy and sadness are easier to agree on.

  • Emotion mining can improve through training.

  • Some challenges, like the impact of context, need to be studied further, on more data sources and systems.
