Towards Automatically Setting Language Bias in Relational Learning

Jose Picado, Arash Termehchy, Alan Fern, Sudhanshu Pathak
Information and Data Management and Analytics (IDEA) Lab

Design a drug to treat HIV
What is the structure of compounds that have anti-HIV activity?
A compound has anti-HIV activity if it has the following substructure:
[Figure: substructure with atoms N, O, N, supplied by an oracle]

Relational learning can learn a definition for anti-HIV
Training data:
compound(compId, atomId): (c1, a1), (c2, a10)
atom(atomId, element): (a1, N), (a2, O)
bond(atomId1, atomId2, type): (a1, a2, single), (a2, a3, single)
anti-HIV(compId): c1, c3
no-anti-HIV(compId): c2, c4
Output of the relational learning algorithm:
anti-HIV(x) :- compound(x,u), atom(u,N), compound(x,v), atom(v,O), compound(x,w), atom(w,N), bond(u,v,single), bond(v,w,single).

Benefits of relational learning
• Leverage the structure of data and learn over complex schemas with multiple tables
• Automatic feature extraction and selection
• Results are interpretable (Datalog)
Example data:
compound(compId, atomId): (c1, a1), (c2, a10)
atom(atomId, element): (a1, N), (a2, O)
bond(atomId1, atomId2, type): (a1, a2, single), (a2, a3, single)
Output of the relational learning algorithm:
anti-HIV(x) :- compound(x,u), atom(u,N), compound(x,v), atom(v,O), compound(x,w), atom(w,N), bond(u,v,single), bond(v,w,single).

How relational learning works
What is the definition of the advisedBy relation?
paperAuthor(paperId, authorId): (p1, f1), (p1, s1), (p2, s3), (p2, f3), …
professor(id, position): (f1, faculty), (f2, faculty), (f3, adjunct)
student(id, phase, year): (s1, post_quals, 3), (s2, pre_quals, 2), (s3, post_prelims, 5)
advisedBy(studId, profId): (s1, f1), (s3, f3)
not-advisedBy(studId, profId): (s2, f3), (s1, f3)
Relational learning algorithm → ?

Generic relational learning algorithm
Schema: professor(id, position), paperAuthor(paperId, authorId), student(id, phase, year)
Scoring function: f = P - N (P: positive examples covered, N: negative examples covered)
Search tree root: advisedBy(x,y) :-
Current clause: advisedBy(x,y) :- true.

Generic relational learning algorithm
Scoring function: f = P - N (P: positive examples covered, N: negative examples covered)
Candidate literals for advisedBy(x,y) :- with their scores: paperAuthor(z,x) f=1, professor(y,z) f=0, a third candidate f=-1
Current clause: advisedBy(x,y) :- true.

Generic relational learning algorithm
Scoring function: f = P - N (P: positive examples covered, N: negative examples covered)
Candidate literals for advisedBy(x,y) :- with their scores: paperAuthor(z,x) f=1, professor(y,z) f=0, a third candidate f=-1
The best-scoring literal, paperAuthor(z,x), is added.
Current clause: advisedBy(x,y) :- paperAuthor(z,x).

Generic relational learning algorithm
Scoring function: f = P - N (P: positive examples covered, N: negative examples covered)
Level 1 candidates: paperAuthor(z,x) f=1, professor(y,z) f=0, a third candidate f=-1
Level 2 candidates under paperAuthor(z,x): student(x,v,w) f=1, paperAuthor(z,y) f=2, a third candidate f=0
Current clause: advisedBy(x,y) :- paperAuthor(z,x).

Generic relational learning algorithm
Scoring function: f = P - N (P: positive examples covered, N: negative examples covered)
Level 1 candidates: paperAuthor(z,x) f=1, professor(y,z) f=0, a third candidate f=-1
Level 2 candidates under paperAuthor(z,x): student(x,v,w) f=1, paperAuthor(z,y) f=2, a third candidate f=0
The best-scoring literal, paperAuthor(z,y), is added.
Current clause: advisedBy(x,y) :- paperAuthor(z,x), paperAuthor(z,y).

Generic relational learning algorithm
Scoring function: f = P - N (P: positive examples covered, N: negative examples covered)
Level 1 candidates: paperAuthor(z,x) f=1, professor(y,z) f=0, a third candidate f=-1
Level 2 candidates under paperAuthor(z,x): student(x,v,w) f=1, paperAuthor(z,y) f=2, a third candidate f=0
Level 3 candidates under paperAuthor(z,y): f=2, f=1, f=1 (no improvement)
Final clause: advisedBy(x,y) :- paperAuthor(z,x), paperAuthor(z,y).
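
The greedy search walked through on these slides can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the database holds only the rows visible on the advisedBy slide, the candidate pool is fixed and hand-written (real systems generate candidates from the language bias), and the names DB, covers, score and learn are hypothetical.

```python
# Toy database: only the rows visible on the advisedBy slide (assumption).
DB = {
    "paperAuthor": {("p1", "f1"), ("p1", "s1"), ("p2", "s3"), ("p2", "f3")},
    "professor": {("f1", "faculty"), ("f2", "faculty"), ("f3", "adjunct")},
    "student": {("s1", "post_quals", 3), ("s2", "pre_quals", 2),
                ("s3", "post_prelims", 5)},
}
POSITIVES = [("s1", "f1"), ("s3", "f3")]   # advisedBy(studId, profId)
NEGATIVES = [("s2", "f3"), ("s1", "f3")]   # not-advisedBy(studId, profId)

# Hypothetical fixed candidate pool; a literal is (relation, variable names).
CANDIDATES = [
    ("paperAuthor", ("z", "x")),
    ("paperAuthor", ("z", "y")),
    ("professor", ("y", "u")),
    ("student", ("x", "v", "w")),
]

def covers(body, binding):
    """True if some extension of `binding` satisfies every literal in `body`."""
    if not body:
        return True
    (rel, args), rest = body[0], body[1:]
    for row in DB[rel]:
        extended = dict(binding)
        # Bind unbound variables; reject the row if a bound variable disagrees.
        if all(extended.setdefault(var, val) == val
               for var, val in zip(args, row)):
            if covers(rest, extended):
                return True
    return False

def score(body):
    """f = P - N: covered positives minus covered negatives."""
    p = sum(covers(body, {"x": x, "y": y}) for x, y in POSITIVES)
    n = sum(covers(body, {"x": x, "y": y}) for x, y in NEGATIVES)
    return p - n

def learn(candidates):
    """Greedily add the best-scoring literal until the score stops improving."""
    body, best = [], score([])          # start from advisedBy(x,y) :- true.
    while True:
        scored = [(score(body + [lit]), lit)
                  for lit in candidates if lit not in body]
        if not scored:
            break
        top_score, top_lit = max(scored, key=lambda pair: pair[0])
        if top_score <= best:           # no improvement: stop
            break
        body, best = body + [top_lit], top_score
    return body, best

clause_body, clause_score = learn(CANDIDATES)
```

On this toy data the search adds paperAuthor(z,x) and then paperAuthor(z,y), matching the clause on the slides. The scores for other candidates depend on the pool chosen here and need not match the slide's numbers.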

Learned definition
What is the definition of the advisedBy relation?
paperAuthor(paperId, authorId): (p1, f1), (p1, s1), (p2, s3), (p2, f3), …
professor(id, position): (f1, faculty), (f2, faculty), (f3, adjunct)
student(id, phase, year): (s1, post_quals, 3), (s2, pre_quals, 2), (s3, post_prelims, 5)
advisedBy(studId, profId): (s1, f1), (s3, f3)
not-advisedBy(studId, profId): (s2, f3), (s1, f3)
Output of the relational learning algorithm:
advisedBy(x,y) :- paperAuthor(z,x), paperAuthor(z,y).

Hypothesis space in relational learning algorithms is huge
• Hypothesis space: all Datalog definitions containing relations in the schema
• Current solution: users must set language bias to restrict the hypothesis space
Schema: professor(id, position), paperAuthor(paperId, authorId), student(id, phase, year)
Candidate literals for advisedBy(x,y) :- include paperAuthor(x,x), paperAuthor(z,x), paperAuthor(z,y), paperAuthor(x,y), paperAuthor(z,v), professor(x,z), professor(x,y), student(x,v,w), student(x,y,z), …

Syntactic bias restricts the structure of learned Datalog definitions
• Which relations to query?
• Which relations to join and over which attributes?
• Should an attribute be a constant or a variable?
Schema: professor(id, position), paperAuthor(paperId, authorId), student(id, phase, year)
Join paperId with professor id?
advisedBy(x,y) :- paperAuthor(z,x), professor(z,v).
Constant or variable?
advisedBy(x,y) :- professor(y,z), professor(y,faculty).  (z is a variable, faculty is a constant)

Predicate definitions
• Assign types to each attribute in every relation
• Only attributes with the same type can join
Schema: professor(id, position), paperAuthor(paperId, authorId), student(id, phase, year)
attribute: type
professor[id]: professor
professor[position]: position
paperAuthor[paperId]: paper
paperAuthor[authorId]: student
paperAuthor[authorId]: professor
student[id]: student
…

Predicate definitions
• Assign types to each attribute in every relation
• Only attributes with the same type can join
attribute: type
professor[id]: professor
professor[position]: position
paperAuthor[paperId]: paper
paperAuthor[authorId]: student
paperAuthor[authorId]: professor
student[id]: student
…
Input to the algorithm:
professor(professor,position)
paperAuthor(paper,student)
paperAuthor(paper,professor)
student(student,phase,year)
…
Example clause: advisedBy(x,y) :- paperAuthor(z,x), professor(z,v).
Here z joins paperAuthor[paperId] (type paper) with professor[id] (type professor), so the types do not match.
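
The same-type join rule can be checked mechanically. The sketch below is an illustration under assumptions, not the interface of any particular learner: ATTRIBUTE_TYPES and well_typed are hypothetical names, an attribute may carry several types (as paperAuthor[authorId] does on the slide), and a variable is accepted only if all the attribute positions it occupies share at least one type.

```python
# Types per (relation, attribute position), copied from the slide's table.
# student[phase] and student[year] follow the slide's student(student,phase,year).
ATTRIBUTE_TYPES = {
    ("professor", 0): {"professor"},
    ("professor", 1): {"position"},
    ("paperAuthor", 0): {"paper"},
    ("paperAuthor", 1): {"student", "professor"},
    ("student", 0): {"student"},
    ("student", 1): {"phase"},
    ("student", 2): {"year"},
}

def well_typed(body):
    """Return True if every variable is used only at positions sharing a type.

    `body` is a list of (relation, variable_names) literals. Head variables
    are not typed here (assumption: the target relation has no declared types).
    """
    types_of = {}
    for relation, args in body:
        for position, var in enumerate(args):
            types = ATTRIBUTE_TYPES[(relation, position)]
            # Intersect with the types seen so far for this variable.
            types_of[var] = types_of[var] & types if var in types_of else set(types)
            if not types_of[var]:
                return False
    return True

# paperAuthor[paperId] has type paper, professor[id] has type professor.
bad_join = [("paperAuthor", ("z", "x")), ("professor", ("z", "v"))]
# Both uses of z are at paperAuthor[paperId].
good_join = [("paperAuthor", ("z", "x")), ("paperAuthor", ("z", "y"))]
```

Calling well_typed(bad_join) rejects the clause that joins paperId with professor id, while well_typed(good_join) accepts the join over paperId.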

Mode definitions
• Define the mode to call relations and create literals
• Each attribute can be:
– an existing variable (+)
– an existing or new variable (-)
– a constant (#)
Schema: professor(id, position), paperAuthor(paperId, authorId), student(id, phase, year)
Input to the algorithm:
professor(+,-)
professor(-,+)
professor(+,#)
…
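
To make the three mode symbols concrete, the following sketch expands one mode definition into the literals it allows. It is a simplified illustration: literals_from_mode is a hypothetical helper, fresh variables are named new0, new1, … by position (assumed not to clash with existing names), constants are written in double quotes, and type checking is ignored.

```python
from itertools import product

def literals_from_mode(relation, mode, existing_vars, constants):
    """Enumerate the literals permitted by one mode definition.

    mode          -- string of '+', '-', '#', one symbol per attribute
    existing_vars -- variables already in the clause
    constants     -- dict: attribute position -> constants allowed there
    """
    choices = []
    for position, symbol in enumerate(mode):
        if symbol == "+":       # must reuse an existing variable
            choices.append(list(existing_vars))
        elif symbol == "-":     # existing variable or one new variable
            choices.append(list(existing_vars) + [f"new{position}"])
        elif symbol == "#":     # must be a constant
            choices.append([f'"{c}"' for c in constants[position]])
        else:
            raise ValueError(f"unknown mode symbol: {symbol!r}")
    return [(relation, args) for args in product(*choices)]

# Clause so far: advisedBy(x,y) :- ...  so x and y are the existing variables.
plus_minus = literals_from_mode("professor", "+-", ["x", "y"], {})
plus_hash = literals_from_mode("professor", "+#", ["x", "y"],
                               {1: ["faculty", "adjunct"]})
```

Under these assumptions professor(+,-) yields six literals, such as professor(x,new1) and professor(y,x), and professor(+,#) yields four, such as professor(y,"faculty"). A tighter mode therefore directly shrinks the set of literals the search has to score.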

Predicate and mode definitions are the “black magic” of relational learning
• All relational learning algorithms require syntactic bias
• Manually written by the user
Cycle: Rewrite → Learn → Evaluate
• Difficult and time-consuming
• Requires expertise
• Trial-and-error

Many lines of code to specify definitions
movies(+movieid,-title,-year)
movies2composers(+movieid,-composer)
certificates(+movieid,#country,#certificate)
movies2genres(+movieid,-genreid)
movies2composers(-movieid,+composer)
countries(+countryid,-country)
movies2prodcompanies(+movieid,-prodcompanyid)
composers(+composer,-name)
countries(+countryid,#country)
movies2costdes(+movieid,-costdes)
runningtimes(+movieid,-time)
movies2colors(+movieid,-colorid)
movies2costdes(-movieid,+costdes)
runningtimes(+movieid,#time)
movies2directors(+movieid,-director)
costdesigners(+costdes,-name)
akatitles(+movieid,-languageid,-title)
movies2directors(-movieid,+director)
movies2editors(+movieid,-editor)
akanames(+name,-name)
movies2producers(+movieid,-producer)
movies2editors(-movieid,+editor)
altversions(+movieid,-text)
movies2producers(-movieid,+producer)
editors(+editor,-name)
business(+movieid,-text)
producers(+producer,-name)
movies2misc(+movieid,-misc)
plots(+movieid,-text)
directors(+director,-name)
misc(+misc,-name)
biographies(+bio,-name,-text)
colorinfo(+colorid,-color)
movies2proddes(+movieid,-proddes)
distributors(+movieid,-name)
colorinfo(+colorid,#color)
movies2proddes(-movieid,+proddes)
mpaaratings(+movieid,-text)
movies2writers(+movieid,-writer)
proddesigners(+proddes,-name)
mpaaratings(+movieid,#text)
movies2writers(-movieid,+writer)
genres(+genreid,-genre)
releasedates(+movieid,-countryid,-date)
writers(+writer,-name)
genres(+genreid,#genre)
releasedates(+movieid,-countryid,#date)
movies2actors(+movieid,-actor,-character)
prodcompanies(+prodcompanyid,-prodcompany)
technical(+movieid,-text)
actors(+actor,-name,-sex)
technical(+movieid,#text)
actors(+actor,-name,#sex)
ratings(+movieid,-rank,-votes)
language(+languageid,-language)
movies2cinematgrs(+movieid,-cinemat)
certificates(+movieid,-country,-certificate)
language(+languageid,#language)
movies2cinematgrs(-movieid,+cinemat)
certificates(+movieid,#country,-certificate)
movies2languages(+movieid,-languageid)
cinematgrs(+cinemat,-name)
certificates(+movieid,-country,#certificate)
movies2countries(+movieid,-countryid)

AutoMode: automatically induce syntactic bias
• Leverage information in the schema and content of the database
[Figure: AutoMode (Exact IND Discovery, Approximate IND Discovery) → predicate and mode definitions → relational learning algorithm]

AutoMode: generate predicate definitions
• Use inclusion dependencies (referential integrity constraints) to find types of attributes
• Key idea: the most frequently used joins are the ones over the attributes that participate in an IND
– E.g., primary-key to foreign-key relationship
professor(id, position): (f1, faculty), (f2, faculty), (f3, adjunct)
taughtBy(courseId, profId, term): (c1, f1, Fall16), (c2, f2, Fall16)
taughtBy[profId] ⊆ professor[id]
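
An inclusion dependency such as taughtBy[profId] ⊆ professor[id] can be tested directly against the data. The sketch below is a naive illustration, not AutoMode's discovery procedure: ind_error and holds are hypothetical names, and the error measure used for the approximate case (the fraction of distinct left-hand values missing on the right) is an assumption, since the slides do not define one.

```python
# Rows copied from the slide; columns are addressed by position.
DB = {
    "professor": [("f1", "faculty"), ("f2", "faculty"), ("f3", "adjunct")],
    "taughtBy": [("c1", "f1", "Fall16"), ("c2", "f2", "Fall16")],
}

def column(db, relation, position):
    """Distinct values of one attribute."""
    return {row[position] for row in db[relation]}

def ind_error(db, lhs, rhs):
    """Fraction of distinct lhs values that do not appear in rhs.

    lhs and rhs are (relation, attribute position) pairs.
    """
    left, right = column(db, *lhs), column(db, *rhs)
    if not left:
        return 0.0  # an empty column is trivially included
    return len(left - right) / len(left)

def holds(db, lhs, rhs, max_error=0.0):
    """Exact IND when max_error is 0; approximate IND otherwise."""
    return ind_error(db, lhs, rhs) <= max_error

PROF_ID = ("professor", 0)
TAUGHT_PROF = ("taughtBy", 1)

exact = holds(DB, TAUGHT_PROF, PROF_ID)               # taughtBy[profId] ⊆ professor[id]
reverse = holds(DB, PROF_ID, TAUGHT_PROF)             # f3 teaches no course
reverse_approx = holds(DB, PROF_ID, TAUGHT_PROF, max_error=0.5)
```

When the dependency holds, the two attributes are candidates for sharing a type, which is what lets a learner join taughtBy and professor over them.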
