AutoTVM & Device Fleet
Learning to Optimize Tensor Programs
Frameworks
High-level data flow graph and optimizations
Machine Learning based Program Optimizer: learning to generate optimized programs for new operator workloads and hardware
Hardware
Search over Possible Program Transformations
Compute Description: C = tvm.compute((m, n), lambda y, x: tvm.sum(A[k, y] * B[k, x], axis=k))
Transformations: Loop Transformations, Thread Bindings, Cache Locality, Thread Cooperation, Tensorization, Latency Hiding
Billions of possible optimization choices
Hardware
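To make the search space concrete, here is a minimal pure-Python sketch; it is not TVM code. It evaluates the compute description above directly and then again with tiled y and x loops. The names `matmul_reference`, `matmul_tiled`, `tiling_choices`, `tile_y` and `tile_x` are hypothetical, and tiling stands in for just one of the transformation families listed on the slide. Every tiling yields the same C, so an optimizer is free to choose among them on speed alone.

```python
def matmul_reference(A, B, m, n, k_extent):
    """C[y][x] = sum over k of A[k][y] * B[k][x], as in the compute description."""
    return [[sum(A[k][y] * B[k][x] for k in range(k_extent)) for x in range(n)]
            for y in range(m)]


def matmul_tiled(A, B, m, n, k_extent, tile_y, tile_x):
    """Same computation with the y and x loops split into tiles."""
    C = [[0] * n for _ in range(m)]
    for yo in range(0, m, tile_y):
        for xo in range(0, n, tile_x):
            for k in range(k_extent):
                for y in range(yo, min(yo + tile_y, m)):
                    for x in range(xo, min(xo + tile_x, n)):
                        C[y][x] += A[k][y] * B[k][x]
    return C


def tiling_choices(extent):
    """Tile sizes that evenly divide a loop extent."""
    return [t for t in range(1, extent + 1) if extent % t == 0]


m, n, k_extent = 4, 6, 3
A = [[k + y for y in range(m)] for k in range(k_extent)]      # shape (k, m)
B = [[k * x + 1 for x in range(n)] for k in range(k_extent)]  # shape (k, n)

expected = matmul_reference(A, B, m, n, k_extent)
configs = [(ty, tx) for ty in tiling_choices(m) for tx in tiling_choices(n)]
all_match = all(matmul_tiled(A, B, m, n, k_extent, ty, tx) == expected
                for ty, tx in configs)
```

Even this toy case has 12 tilings for two small loops. Combining tiling with loop order, thread binding and the other transformations multiplies the count quickly.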
Learning-based Program Optimizer
Program, Program Optimizer, Code Generator, Runtime Measurements
High experiment cost: each trial costs ~1 second
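A runtime measurement boils down to timing a candidate program. The sketch below assumes the candidate is available as a zero-argument Python callable; `measure` and `toy_kernel` are hypothetical names, and compiling the candidate or running it on a remote device is not modeled.

```python
import time


def measure(candidate, repeats=3):
    """Run a zero-argument callable several times; return the best wall-clock time in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        candidate()
        best = min(best, time.perf_counter() - start)
    return best


def toy_kernel():
    # Placeholder workload standing in for a generated candidate program.
    return sum(i * i for i in range(10_000))


elapsed = measure(toy_kernel)
```

Taking the minimum over a few repeats is one common way to reduce timing noise; it also means each trial costs several runs, which adds to the experiment cost noted above.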
Learning-based Program Optimizer
Program, Program Optimizer, Code Generator, Cost Model
Need a reliable cost model per hardware
Learning-based Program Optimizer
Program, Program Optimizer, Code Generator, Statistical Cost Model (learned from training data D)
Unique Problem Characteristics
• Relatively low experiment cost
• Domain-specific problem structure
• Large quantity of similar tasks
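The loop of measuring candidates, collecting training data D, and fitting a statistical cost model can be sketched as follows. This is an illustration under stated assumptions, not the optimizer from the talk: `measure` is a synthetic cost function rather than a hardware run, the cost model is a plain nearest-neighbor average swapped in for a learned model, and `tune`, `predict`, `rounds` and `batch` are hypothetical names.

```python
import random


def measure(config):
    """Synthetic stand-in for a hardware measurement (lower is better)."""
    tile_y, tile_x = config
    return abs(tile_y - 8) + abs(tile_x - 16) + 1.0


def predict(config, data, k=3):
    """Statistical cost model: mean cost of the k nearest measured configs."""
    def distance(item):
        (ty, tx), _cost = item
        return (ty - config[0]) ** 2 + (tx - config[1]) ** 2

    nearest = sorted(data, key=distance)[:k]
    return sum(cost for _, cost in nearest) / len(nearest)


def tune(space, rounds=5, batch=4, seed=0):
    rng = random.Random(seed)
    # Bootstrap the training data D with a few random measurements.
    data = [(cfg, measure(cfg)) for cfg in rng.sample(space, batch)]
    for _ in range(rounds):
        measured = {cfg for cfg, _ in data}
        candidates = [cfg for cfg in space if cfg not in measured]
        # Rank unmeasured configs by predicted cost; measure only the best few.
        candidates.sort(key=lambda cfg: predict(cfg, data))
        data += [(cfg, measure(cfg)) for cfg in candidates[:batch]]
    return min(data, key=lambda item: item[1]), data


space = [(ty, tx) for ty in (1, 2, 4, 8, 16, 32) for tx in (1, 2, 4, 8, 16, 32)]
(best_config, best_cost), D = tune(space)
```

The model decides which configurations are worth a real measurement, so only 24 of the 36 configurations are measured here. Each new measurement is appended to D, so the model is queried against more data every round.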
Program-aware Cost Modeling
High-Level Configuration
Low-level Abstract Syntax Tree (shared between tasks):
for y in range(8):
  for x in range(8):
    C[y][x]=0
    for k in range(8):
      C[y][x]+=A[k][y]*B[k][x]
Statistical features:
loop | outer loop length | touched memory (C, A, B)
y    | 1                 | 64, 64, 64
x    | 8                 | 8, 8, 64
k    | 64                | 1, 8, 8
Boosted Tree Ensembles
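The outer loop length and touched memory values on this slide can be reproduced from a description of the loop nest. The sketch below is one reading of those two features, inferred from the numbers shown: the outer loop length of a loop is the product of the extents of the loops enclosing it, and the touched memory of a buffer is the number of elements it accesses during one full execution of that loop. `loop_features`, `loops` and `accesses` are hypothetical names.

```python
from math import prod


def loop_features(loops, accesses):
    """Per-loop statistical features for a perfectly nested loop.

    loops:    list of (loop variable, extent), outermost first
    accesses: buffer name -> loop variables used to index it
    """
    features = {}
    for depth, (var, _extent) in enumerate(loops):
        # Iterations contributed by the loops that enclose this one.
        outer = prod(extent for _, extent in loops[:depth])
        # Loops that vary while this loop runs: itself and everything inside.
        inner = dict(loops[depth:])
        touched = {
            buf: prod(inner.get(v, 1) for v in index_vars)
            for buf, index_vars in accesses.items()
        }
        features[var] = {"outer_loop_length": outer, "touched_memory": touched}
    return features


loops = [("y", 8), ("x", 8), ("k", 8)]
accesses = {"C": ("y", "x"), "A": ("k", "y"), "B": ("k", "x")}
features = loop_features(loops, accesses)
```

These features are computed from the loop structure rather than from the high-level configuration, which fits the slide's note that the low-level representation is shared between tasks.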