YOW! Data Respecting Privacy with ‘Look alike’ data sets Tim Garnsey - Director - Verge Labs
“Data is the new oil” Clive Humby
What makes refining hard? 1. Corporate competitive advantage 2. People’s privacy
Competitive advantage Generating content in the process User looks for content Making content attractive
People’s privacy
Analysing data 1. Care about relationships within subjects (in aggregate) 2. Care about relationships across subjects (in aggregate) 3. Don’t care about subjects
Analysing data
Date        Name      Business                            Stars  Lat      Long      City
2017-06-15  Nicole    Thelma's Filipino Restaurant        5      36.0631  -115.038  Henderson NV
2015-09-23  Matt      Cabo's Mexican Cuisine and Cantina  1      35.0854  -80.8476  Charlotte NC
2016-05-03  Maddie    The Parlor                          3      33.5094  -112.04   Phoenix AZ
2017-08-08  Nick      DC Steak House                      1      33.3024  -111.842  Chandler AZ
2017-07-16  Keith     Kneaders Bakery & Cafe              1      33.6266  -111.896  Scottsdale AZ
2013-02-10  Mike      Hickory Tavern                      4      35.1019  -80.9912  Charlotte NC
2016-08-23  Michael   Wendy's Noodle Cafe                 3      36.1275  -115.225  Las Vegas NV
2017-02-10  Petrina   Clones 4 Patients ONLY              1      36.2858  -115.285  Las Vegas NV
2016-01-03  Maddie    Chicks With Spiritual Gifts         5      33.4689  -112.07   Phoenix AZ
2016-10-19  Sunggin   Cholla Prime Steakhouse & Lounge    4      33.454   -111.886  Scottsdale AZ
2012-01-30  Maribeth  Urban Cookies Bakeshop              5      33.4742  -112.065  Glendale AZ
2016-11-04  Jaime     In-N-Out Burger                     5      33.508   -112.266  Phoenix AZ
Analysing data
Date        Name     Business                  Stars  Lat      Long      City
2017-11-22  Angela   Fat Tuesday               5      36.1095  -115.171  Las Vegas NV
2013-09-18  Kenny    Yanni's Gyros             5      36.0119  -115.136  Las Vegas NV
2012-12-07  Joyce    Panera Bread              4      33.5805  -112.122  Glendale AZ
2017-12-04  Kenny    Scallywags                5      43.6878  -79.3945  Toronto ON
2012-11-15  Abby     Tip of the Tail Grooming  1      41.4317  -81.8867  North Olmsted OH
2011-02-26  Laura    Thai Express              1      43.6132  -79.5558  Etobicoke ON
2017-10-11  Liane    Armando's Mexican Food    4      33.6842  -112.107  Phoenix AZ
2017-05-17  Laura    Momo Hair Salon           5      43.7039  -79.3979  Toronto ON
2014-01-02  Caitlin  Chopstix Express          3      36.1425  -115.209  Las Vegas NV
2017-03-04  Han      Take Over Lease           5      40.4403  -79.9863  Pittsburgh PA
2016-10-21  Will     Betty's Flower Shop       1      36.2384  -115.155  North Las Vegas NV
2013-08-19  Steve    Rise Biscuits & Donuts    3      33.4923  -111.924  Scottsdale AZ
Analysing data 1. Care about relationships within subjects (in aggregate) 2. Care about relationships across subjects (in aggregate) 3. Don’t care about subjects … really need a “very statistically similar” data set
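One way to make "very statistically similar" concrete is to compare only aggregates of two data sets and never match individual rows. The sketch below is an illustration, not the method used in the talk: the helper names (`pearson`, `aggregate_profile`, `profile_gap`) and the choice of statistics are assumptions, and the rows are a handful of (stars, lat, long) values copied from the two sample tables on the earlier slides.

```python
from statistics import mean, pstdev

def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)
    return cov / (pstdev(xs) * pstdev(ys))

def aggregate_profile(rows):
    """Summarise (stars, lat, long) rows using aggregates only (hypothetical helper)."""
    stars, lats, longs = zip(*rows)
    return {
        "mean_stars": mean(stars),
        "mean_lat": mean(lats),
        "corr_lat_long": pearson(lats, longs),
    }

def profile_gap(a, b):
    """Largest absolute difference between two aggregate profiles."""
    return max(abs(a[k] - b[k]) for k in a)

# A few (stars, lat, long) rows taken from each of the two sample tables.
sample_a = [(5, 36.0631, -115.038), (1, 35.0854, -80.8476),
            (3, 33.5094, -112.04), (1, 33.3024, -111.842)]
sample_b = [(5, 36.1095, -115.171), (5, 36.0119, -115.136),
            (4, 33.5805, -112.122), (5, 43.6878, -79.3945)]

gap = profile_gap(aggregate_profile(sample_a), aggregate_profile(sample_b))
print(f"largest aggregate gap: {gap:.3f}")
```

A small gap on the statistics an analyst cares about suggests the second set could stand in for the first; which statistics matter depends on the analysis.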
Copying machines
Use GANs … 0.3 0.6 0.1
Use differential privacy …
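As a minimal sketch of the idea behind differential privacy, the Laplace mechanism adds calibrated noise to an aggregate query so that no single record changes the answer much. The slide does not say which mechanism was used, so the Laplace mechanism, the `private_count` helper and the `epsilon` value here are assumptions chosen for illustration. The star ratings are the Stars column of the first sample table.

```python
import random

def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponentials."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)

def private_count(values, predicate, epsilon, rng):
    """Noisy count of matching values (hypothetical helper).

    Adding or removing one record changes a count by at most 1,
    so the sensitivity is 1 and the noise scale is 1 / epsilon.
    """
    true_count = sum(1 for v in values if predicate(v))
    return true_count + laplace_noise(1.0 / epsilon, rng)

rng = random.Random(0)
stars = [5, 1, 3, 1, 1, 4, 3, 1, 5, 4, 5, 5]
noisy = private_count(stars, lambda s: s == 5, epsilon=0.5, rng=rng)
print(f"noisy number of 5-star reviews: {noisy:.2f}")
```

Smaller `epsilon` means more noise and stronger privacy; larger `epsilon` means a more accurate but less private answer.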
… and join them together 0.3 0.6 0.1
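The slide does not spell out how the two are joined. One common approach, assumed here purely for illustration, is to train the model with differentially private gradient updates: clip each example's gradient to a maximum norm, then add Gaussian noise before averaging. The function names and parameters below (`clip`, `private_mean_gradient`, `max_norm`, `noise_multiplier`) are hypothetical, and the sketch covers only the gradient step, not a full GAN.

```python
import math
import random

def clip(grad, max_norm):
    """Scale a gradient vector down so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    factor = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * factor for g in grad]

def private_mean_gradient(per_example_grads, max_norm, noise_multiplier, rng):
    """Clip each example's gradient, sum, add Gaussian noise, then average."""
    clipped = [clip(g, max_norm) for g in per_example_grads]
    n, dim = len(clipped), len(clipped[0])
    sigma = noise_multiplier * max_norm  # noise scaled to the clipping bound
    noisy_sum = [
        sum(g[i] for g in clipped) + rng.gauss(0.0, sigma)
        for i in range(dim)
    ]
    return [s / n for s in noisy_sum]

rng = random.Random(0)
grads = [[3.0, 4.0], [0.3, 0.4]]  # made-up per-example gradients
print(private_mean_gradient(grads, max_norm=1.0, noise_multiplier=1.1, rng=rng))
```

Clipping bounds how much any one record can influence an update, and the noise hides what influence remains.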
Check-in data
Zoom
Tips and tricks There is a whole universe of material here, but getting started is easy: 1. Blogs, Stack Overflow and Wikipedia 2. Convert everything to floating point (cities, text) 3. Start small (there is always something that is valuable enough)
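For tip 2, a minimal sketch of turning a categorical column such as City into floating-point vectors is one-hot encoding. This is one of several possible encodings and an assumption here, not necessarily the one used in the talk; `make_one_hot` is a hypothetical helper. The decoder picks the highest-scoring position, so it also works on soft scores such as those a model might emit.

```python
def make_one_hot(categories):
    """Build encode/decode functions for a categorical column (hypothetical helper)."""
    labels = sorted(set(categories))
    index = {label: i for i, label in enumerate(labels)}

    def encode(value):
        # One float per known category, 1.0 at the matching position.
        vec = [0.0] * len(labels)
        vec[index[value]] = 1.0
        return vec

    def decode(vec):
        # Pick the category with the largest score.
        best = max(range(len(vec)), key=lambda i: vec[i])
        return labels[best]

    return encode, decode

cities = ["Phoenix AZ", "Las Vegas NV", "Charlotte NC", "Phoenix AZ"]
encode, decode = make_one_hot(cities)
print(encode("Phoenix AZ"))     # [0.0, 0.0, 1.0]
print(decode([0.3, 0.6, 0.1]))  # Las Vegas NV
```

Free text needs a richer encoding than this, but the principle is the same: map every field to numbers and keep a way to map back.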
Questions? tim@vergelabs.ai @TimGarnsey