Natural Analysts in Adaptive Data Analysis
Tijana Zrnic, joint with Moritz Hardt
Adaptivity in Machine Learning
[Diagram: a data analyst with training data submits a model, receives its accuracy on the test data set, tunes hyperparams..., and repeats.]
• After t tested models, how well does the final model generalize?
  ‣ Depends on how the accuracies are computed
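A minimal simulation sketch of why the answer depends on how the accuracies are computed, assuming the exact sample accuracy is reported every time. All names and parameters (`adaptive_holdout_demo`, `n`, `t`, `seed`) are hypothetical. The test labels are coin flips, so every model has population accuracy 0.5, yet the best of t holdout scores ends up above that.

```python
import random


def adaptive_holdout_demo(n=200, t=500, seed=0):
    """Return the best exact holdout accuracy among t signal-free models."""
    rng = random.Random(seed)
    # Test labels are fair coin flips, so no model can beat 0.5 on fresh data.
    labels = [rng.randint(0, 1) for _ in range(n)]
    best_acc = 0.0
    for _ in range(t):
        # A "model" here is just a random prediction vector on the test points.
        preds = [rng.randint(0, 1) for _ in range(n)]
        acc = sum(p == y for p, y in zip(preds, labels)) / n
        # Adaptive step: the analyst keeps whichever model the holdout scores highest.
        best_acc = max(best_acc, acc)
    return best_acc
```

Calling `adaptive_holdout_demo()` and comparing the result with 0.5 gives the size of the overfitting gap for this toy analyst; increasing `t` or decreasing `n` widens it.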
Classical Holdout vs Response Mechanism
[Diagram: data analyst with training data, model, test data set, accuracy; in the second variant a response mechanism for reporting accuracy sits in front of the test data set.]
• Reporting the exact sample accuracy allows for substantial overfitting
• Better bounds can be obtained by putting a non-trivial response mechanism in charge of reporting accuracy on the test data
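The contrast between the two reporting styles can be sketched as follows. Noise addition is used here only as one familiar example of a non-trivial response mechanism; it is an assumption for illustration, not necessarily the mechanism the talk has in mind, and the names `exact_holdout`, `noisy_response`, and `sigma` are hypothetical.

```python
import random


def exact_holdout(preds, labels):
    """Classical holdout: report the exact sample accuracy."""
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def noisy_response(preds, labels, sigma=0.05, rng=random):
    """Illustrative response mechanism: sample accuracy plus Gaussian noise.

    The result is clipped to [0, 1] so it still reads as an accuracy.
    """
    noisy = exact_holdout(preds, labels) + rng.gauss(0.0, sigma)
    return min(1.0, max(0.0, noisy))
```

Passing `rng=random.Random(seed)` makes the noisy responses reproducible; the analyst only ever sees the return value, never the test labels.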
Main Questions
[Diagram: data analyst with training data, model, response mechanism for reporting accuracy, test data set, accuracy.]
• How do we construct a mechanism such that its responses generalize to the population?
  ‣ Want: 95% reported accuracy on test data ≈ 95% accuracy on fresh data from the same population
• For such a good mechanism, how much does a possibly adversarial analyst overfit?
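One common way to make "responses generalize" precise, given here as an assumption in the style of the adaptive data analysis literature rather than as the talk's own definition. The symbols f_k (the k-th submitted model), a_k (the accuracy reported for it), t (the number of submitted models), ε, and δ are introduced only for this sketch.

```latex
\Pr\left[ \max_{1 \le k \le t} \left| a_k - \mathrm{acc}_P(f_k) \right| \le \epsilon \right] \ge 1 - \delta,
\qquad
\mathrm{acc}_P(f) = \Pr_{(x, y) \sim P}\left[ f(x) = y \right]
```

In the slide's example, a reported accuracy of 95% with a small ε means the accuracy on fresh data from the same population is within ε of 95%.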
Abstraction via Adaptive Data Analysis
Framework of Dwork et al. (2015)
[Diagram: analyst, mechanism, data set S; the analyst poses query q_1.]
S ∼ P^n
q_i : supp(P) → [0, 1]^d - queries posed by analyst
P - population distribution
n - sample size
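The interaction in this abstraction can be sketched as a loop in which each query may depend on all earlier answers. This is a minimal sketch under stated assumptions: data points are plain floats, a query returns a list of d values in [0, 1], and the names `empirical_mean`, `interact`, and `threshold_analyst` are hypothetical. The mechanism shown is the trivial one that answers with the sample mean.

```python
from typing import Callable, List

# A query maps one data point to a vector in [0, 1]^d (assumed here to be a list).
Query = Callable[[float], List[float]]


def empirical_mean(query: Query, sample: List[float]) -> List[float]:
    """Trivial mechanism: answer with the coordinate-wise sample mean of the query."""
    values = [query(x) for x in sample]
    d = len(values[0])
    return [sum(v[j] for v in values) / len(values) for j in range(d)]


def interact(analyst, mechanism, sample, rounds):
    """Run the adaptive loop; the analyst sees past answers but never the sample."""
    answers = []
    for _ in range(rounds):
        query = analyst(answers)  # q_i is chosen as a function of earlier answers
        answers.append(mechanism(query, sample))
    return answers


def threshold_analyst(answers):
    """Toy analyst: asks for the mean first, then thresholds at the last answer."""
    if not answers:
        return lambda x: [x]
    cutoff = answers[-1][0]
    return lambda x: [1.0 if x > cutoff else 0.0]
```

Swapping `empirical_mean` for a different callable with the same signature changes the mechanism without touching the analyst, which mirrors the separation drawn on the slide.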