UC Berkeley CrowdDB: Answering queries with crowdsourcing ! Michael Franklin + , Donald Kossmann*, Tim Kraska + Sukriti Ramesh*, and Reynold Xin + + UC Berkeley * ETH Zurich 1
! !"#$%&'()*+ " ,-'%./)*+ # /#0!/+ 9#2,5:)+,00);+<=/2+0#%.&+%.+)#!48#.+)0,-/+ Algorithms Watson/IBM search Machines People 1//2+&/-'.%34/)+&#+25.,(%-,!!5+0%-6+7/)&+ #0/$,8."+0#%.&+ amplab 2
9'/+>?@+A,7+ Algorithms Watson/IBM search Machines People ?,6/+)/.)/+#B+2,&,+,&+)-,!/+75+8"'&!5+ %.&/"$,8."+,!"#$%&'()*+(,-'%./)*+,.2+0/#0!/+ 3 amplab
C$#D2EF+ Algorithms Watson/IBM search Crowd DB Machines People ?,6/+)/.)/+#B+2,&,+,&+)-,!/+75+8"'&!5+ %.&/"$,8."+,!"#$%&'()*+(,-'%./)*+,.2+0/#0!/+ 4 amplab
EFG',$2+H4/$%/)+ $%&'()*+,(&-. !//0-11. "(02-3.$('. I##"!/+ I##"!/0!/=*+?&.J+K%/D+C>+ LMNOF.+ P.&!J+F4)%./))+?,-'%./)+ >$(#.6*+1Q+ LMOOF.+ ?%-$#)#R+ S/2(#.2*+T>+ LMUOF.+ SELECT Market_Cap From Companies Where Company_Name = “IBM” Number of Rows: 0 Problem: Entity Resolution E amplab 5
EFG',$2+H4/$%/)+ $%&'()*+,(&-. !//0-11. "(02-3.$('. I##"!/+ I##"!/0!/=*+?&.J+K%/D+C>+ LMNOF.+ P.&!J+F4)%./))+?,-'%./)+ >$(#.6*+1Q+ LMOOF.+ ?%-$#)#R+ S/2(#.2*+T>+ LMUOF.+ SELECT Market_Cap From Companies Where Company_Name = “Apple” Number of Rows: 0 Problem: Closed World Assumption C amplab 6
EFG',$2+H4/$%/)+ SELECT Top_1(Image) From Pictures Where Theme = “Business Success” Number of Rows: 0 Problem: M Missing Intelligence amplab 7
V,)5+H4/$%/)+ $%&'()*+,(&-. !//0-11. "(02-3.$('. I##"!/+ I##"!/0!/=*+?&.J+K%/D+C>+ LMNOF.+ P.&!J+F4)%./))+?,-'%./)+ >$(#.6*+1Q+ LMOOF.+ ?%-$#)#R+ S/2(#.2*+T>+ LMUOF.+ SELECT Market_Cap From Companies Where Company_Name = “IBM” $200Bn Number of Rows: 1 amplab 8
@$/W5+V,)5+H4/$%/)+ $%&'()*+,(&-. !//0-11. "(02-3.$('. I##"!/+ I##"!/0!/=*+?&.J+K%/D+C>+ LMNOF.+ P.&!J+F4)%./))+?,-'%./)+ >$(#.6*+1Q+ LMOOF.+ ?%-$#)#R+ S/2(#.2*+T>+ LMUOF.+ SELECT Market_Cap From Companies Where Company_Name = “The Cool Software Company” $2xxBn Number of Rows: 1 amplab 9
C$#D2)#4$-%."+ amplab 10
?%-$#&,)6%."+X+K%$&4,!%Y/2+Z4(,.)+ • " C4$$/.&+!/,2/$;+>(,Y#.+?/-',.%-,!+94$6+ • " S/34/)&#$)+0!,-/+Z4(,.+P.&/!!%"/.-/+9,)6)+[ZP9)\+ – " ?%.%(4(+0$%-/;+LOJON+ – " ]&'/$+0,$,(/&/$);+^#B+$/0!%-,)+[,))%".(/.&)\*+/=0%$,8#.*++ 41-0.5)3-06(7- *_ . – " >@PG7,)/2;+`-$/,&/Z%&[\a*+`"/&>))%".(/.&)[\a*+ `,00$#b/>))%".(/.&)[\a*+`B#$-/V=0%$/[\a+ – " S/34/)&#$)+,00$#b/+c#7)+,.2+0,5(/.&+ • " T#$6/$)+[,J6J,J+`&4$6/$)a\+-'##)/+c#7)*+2#+&'/(*+"/&+ 0,%2+ amplab 11
C$#D2EF++ CrowdSQL Results 41-.38-.70%9/.3%.()19-0. :;<8(0/.=>-0?-1 ! CrowdDB Parser + MetaData T'/$/+&#+4)/+&'/+-$#D2;+ • " UI Optimizer UI Editor – " Z,b/+0/#0!/+<.2+2,&,+ Creation – " Z,b/+0/#0!/+2#+`B4YY5a+ -#(0,$%)#.)+ Executor UI Template Manager Statistics • " V34,!%&5+[',)'\+,.2+]$2/$%."+[)#$&\+ Files Access Methods HIT Management T'/$/+%&+2#/).:&+(,6/+)/.)/;+ • " – " Z,b/+0/#0!/+2#+34%-6+)#$&+#B+)/&)+ [%J/J*+94$6P&\+ – " >.5&'%."+&'/+-#(04&/$+,!$/,25+ Disk 1 2#/)+D/!!++ Disk 2 amplab 12
C$#D2dHA+ee+dHAfff+ • " $@%1-/<A%0@/. ! ! .B'-)<A%0@/. – " dHA)+-!#)/2GD#$!2+,))4(08#.+%)+,+!%/+ – " P.g4/.-/)+34/$5+/=/-48#.+)&$,&/"%/)h#08#.)+ – " >&+&'/+(#(/.&*+D/+#.!5+,!!#D+,+$/)&$%-&/2+)/&+#B+34/$%/)+,",%.)&+ -$#D2G)#4$-/2+&,7!/)+ – " I#,!;+V=0!#$/+&'/+#0/.GD#$!2+,)+(4-'+,)+0#))%7!/+[/J"J*+,.)D/$+,+ 34/$5+,)+7/)&+,)+0#))%7!/+"%b/.+,+-/$&,%.+742"/&\++ • " $(78?)C. – " Vb/$5+$/)4!&+B$#(+&'/+-$#D2+%)+)&#$/2+ ! +9##+/=0/.)%b/+.#&+&#+2#+)#+ – " H4/$%/)+4)/+)&#$/2+$/)4!&)+D'/./b/$+0#))%7!/+ – " >.)D/$)+-',."/+7,)/2+#.+&'/+34/$5+'%)&#$5+,.2+-,-'/+7/',b%#$+ – " I#,!;+]i/$+(#$/+-#.&$#!+#b/$+-,-'%."*+99A*+_+ • " !)19-0.D>(@?3*. – " Z4(,.G%.04&G&#!/$,.&+34/$5+0$#-/))%."+ – " >&+&'/+(#(/.&+,+)%(0!/+)/&+#B+'/4$%)8-)++ – " I#,!;+-#.<2/.-/+%.&/$b,!)*+%&/$,8b/+%(0$#b/(/.&*_+ amplab 13
C$#D2dHA+ ::E.FG3-)1?%)1 ; !+(9:.(1+0*:'0(31<2.''' !+(9:.(1+0*:'-,=3*.' CREATE C CROWD TABLE department ( CREATE TABLE company ( university STRING, name STRING PRIMARY KEY, department STRING, hq_address C CROWD STRING); phone_no STRING) PRIMARY KEY (university, department ) ; :"E.FG3-)1?%)1H.. !+(9:&;1,38''' !"#$%#"%&"'()*+,-(+.'/01++*2-34'5%6.78' SELECT * SELECT p FROM picture FROM companies WHERE subject = "Golden Gate Bridge" ~ “Big Blue” + WHERE Name ~ ORDER BY C CROWDORDER(p, "Which pic shows better %subject"); amplab 14
j)/$+P.&/$B,-/+I/./$,8#.+ • " >+-!/,$+jP+%)+6/5+&#+$/)0#.)/+8(/+,.2+,.)D/$+ 34,!%&5J+ • " T/+-,.+!/b/$,"/+&'/+dHA+d-'/(,+&#+,4&#G "/./$,&/+jP++[/J"J*+]$,-!/+k#$()*+/&-J\+ amplab 15
jP+D%&'+C#.&/=&+ amplab 16
H4/$5+]08(%Y,8#.+,.2+V=/-48#.+ CREATE C CROWD TABLE department ( dVAVC9+l++ name STRING PRIMARY KEY kS]?+@S]kVdd]S+0*+EV@>S9?V19+2+ phone_no STRING); TZVSV+2J.,(/+e+0J2/0+ CREATE C CROWD TABLE professor ( >1E+0J.,(/+e`?%-',/!+mJ+C,$/5a+ name STRING PRIMARY KEY e-mail STRING dep STRING REF department(name) Please fi ll out the missing ); department data Department CS ! ! name="Carey" Phone p.dep=d.name MTJoin Submit (Dep) p.dep = d.name Department ! p.dep=d.name Please fi ll out the missing ! name= professor data MTProbe Name Carey Professor Department "Carey" (Professor) E-Mail name=Carey Department Professor Submit (b) Logical plan (c) Logical plan (d) Physical plan before optimization after optimization amplab 17
H4/$5+]08(%Y,8#.+ Please fi ll out the missing department data CS Department ! name="Carey" ! Phone p.dep=d.name MTJoin Submit (Dep) p.dep = d.name Department ! p.dep=d.name Please fi ll out the missing ! name= professor data MTProbe Carey Name Professor Department "Carey" (Professor) E-Mail name=Carey Department Professor Submit (b) Logical plan (c) Logical plan (d) Physical plan before optimization after optimization I>@-.J(1-/.%'K&?L-0. @/$B#$(,.-/+P.)%"'n4!+H4/$5+A,."4,"/+[@PHA\+&/-'.%34/)++ • " &#+2/,!+D%&'+#0/.GD#$!2+,))4(08#.+ d%(0!/+)/&+#B+$4!/)+&#+0%-6+&'/+7/)&+0!,.+ • " d%(0!/+'/4$%)8-)+&#+)/&+&'/+-$#D2+0,$,(/&/$)+++ • " [/J"J*+$/0!%-,8#.+B,-&#$*+0$%-/+0/$+ZP9*+/&-J\+ amplab 18 +
E/,!%."+D%&'+&'/+]0/.GT#$!2+ Focus Right Now: PK Queries 1 1 ! ! name="Carey" dVAVC9+l++ p.dep=d.name 1*1 kS]?+@S]kVdd]S+0*++ ! ! name= +++EV@>S9?V19+2+ Department ! TZVSV+0J2/0+e+2J.,(/+ p.dep=d.name ! "Carey" ! >1E+0J.,(/+e`C,$/5a+ 1 Professor Department + Professor 10 Soon dVAVC9+l++ 10 Stop After (10) kS]?+@S]kVdd]S+0*++ ! ! p.dep=d.name +++EV@>S9?V19+2+ ! p.dep=d.name 10 10*1 TZVSV+0J2/0+e+2J.,(/+ ! ! AP?P9+O*+NO++ Professor Department Professor Department ! Never dVAVC9+l++ ! p.dep=d.name kS]?+@S]kVdd]S+0*++ ! ! +++EV@>S9?V19+2+ Professor Department TZVSV+0J2/0+e+2J.,(/+ amplab 19
H4/$5+]08(%Y,8#.+ Please fi ll out the missing department data CS Department ! name="Carey" ! Phone p.dep=d.name MTJoin Submit (Dep) p.dep = d.name Department ! p.dep=d.name Please fi ll out the missing ! name= professor data MTProbe Carey Name Professor Department "Carey" (Professor) E-Mail name=Carey Department Professor Submit (b) Logical plan (c) Logical plan (d) Physical plan before optimization after optimization • " C$/,&/)+4)/$+%.&/$B,-/+&/(0!,&/)+ • " d/!/-&+0'5)%-,!+#0/$,&#$)+ • " 1/D+H4/$5+]0/$,&#$);+ – " C$#D2+]0/$,&#$);+?9@$#7/*+?9m#%.*+?9k4.-8#.+ – " ]&'/$;+d9]@+>k9VS+[%J/J*+!%(%&\+ amplab 20 +
?9@$#7/+ • " d%(%!,$+&#+,+&,7!/G)-,.+D%&'+0$/2%-,&/+04)'G 2#D.)+ • " F,&-'/)+)/b/$,!+c#7)+%.&#+#./+ZP9+ • " P))4/)+,)+(,.5+$/34/)&)+%.+0,$,!!/!+,)+0#))%7!/++ "M#0%J- [@$#B/))#$\+ [7,)/2+#.+&'/+-,$2%.,!%&5+0$/2%-8#.\+ 1,(/eC,$/5+ • " E#/)+)%(0!/+34,!%&5+-#.&$#!+[34#$4(+b#&/)\+ • " `C,-'/)a+&'/+$/)4!&+%.)%2/+&'/+-#$$/)0#.2%."+ &,7!/+ ! +34/$%/)+',b/+)%2/G/i/-&)++ Please fi ll out the missing Please fi ll out the missing professor data Please fi ll out the missing professor data company data! Name Carey N ame Carey IBM Name E-Mail Department Headquarter CS Department name address Department E-Mail Phone Submit Submit Submit Crowd Column & Crowd Column & Denormalization Crowd Colums Crowd Colums 21 w/o foreign keys with foreign keys amplab
Recommend
More recommend