Data Analytics as a Service for Data Scientists
Prof. Chen Li
A real story
● Sue: Public Health researcher
● Adam: Data Scientist
Challenges ...
● Infrastructure
● Data collection
● Large scale
● Machine learning
Not enough IT background!
Software solutions
● Cloudberry: Big data visualization
● Texera: Analytics using workflows
● AsterixDB: Parallel database
Users: researchers from UCI, UCLA
Cloudberry: Big Data Visualization
TwitterMap system
Takes too much time?
Fixed-length slicing?
Query slicing with a rhythm
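The idea of slicing a long query so that partial results arrive at a steady pace can be sketched as follows. This is a minimal illustration, not Cloudberry's actual implementation: the function `sliced_query`, the callback `run_query` (assumed to return a result together with its elapsed seconds), the 2-second target, and the simulated backend `fake_query` are all hypothetical names and values chosen for the example. The sketch also assumes that query cost grows roughly linearly with the time range.

```python
from datetime import datetime, timedelta


def sliced_query(start, end, run_query, target_seconds=2.0,
                 first_width=timedelta(hours=1),
                 min_width=timedelta(seconds=1)):
    """Yield (lo, hi, result) slices from newest to oldest.

    run_query(lo, hi) is a hypothetical callback returning
    (result, elapsed_seconds) for the half-open range [lo, hi).
    """
    width = first_width
    hi = end
    while hi > start:
        lo = max(start, hi - width)
        result, elapsed = run_query(lo, hi)
        yield lo, hi, result
        hi = lo
        if elapsed > 0:
            # Assume cost grows roughly linearly with the range, so scale
            # the next slice to land near target_seconds.
            width = max(width * (target_seconds / elapsed), min_width)


def fake_query(lo, hi):
    """Simulated backend: pretends each hour of data costs 0.5 seconds."""
    hours = (hi - lo).total_seconds() / 3600
    return f"{hours:g}h of records", hours * 0.5


if __name__ == "__main__":
    day_start = datetime(2020, 1, 1)
    day_end = datetime(2020, 1, 2)
    for lo, hi, result in sliced_query(day_start, day_end, fake_query):
        print(lo, hi, result)
```

With fixed-length slicing the width would never change, so slices over dense data would take much longer than slices over sparse data. Here the width is re-estimated after every slice, which keeps the gaps between partial results close to the target.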
Open challenges
● Modeling the database for approximate visualization
● Visualizing a large number of records
● Integrating computation between the middleware and the frontend
Texera: big data analytics using interactive workflows
Actor Model
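To make the actor model concrete for workflow operators, here is a minimal sketch in which each operator is an actor with its own mailbox and thread, and operators communicate only by sending messages. It is not Texera's engine: the `Actor` class, the `STOP` sentinel, and `run_keyword_pipeline` are hypothetical names invented for this illustration, built on Python's standard `queue` and `threading` modules.

```python
import queue
import threading

STOP = object()  # sentinel telling an actor to shut down


class Actor:
    """An operator that processes messages from its own mailbox."""

    def __init__(self, handler, downstream=None):
        self.handler = handler
        self.downstream = downstream
        self.mailbox = queue.Queue()
        self.thread = threading.Thread(target=self._run, daemon=True)
        self.thread.start()

    def send(self, message):
        self.mailbox.put(message)

    def _run(self):
        while True:
            message = self.mailbox.get()
            if message is STOP:
                # Propagate shutdown to the next operator, then exit.
                if self.downstream is not None:
                    self.downstream.send(STOP)
                return
            output = self.handler(message)
            if output is not None and self.downstream is not None:
                self.downstream.send(output)

    def join(self):
        self.thread.join()


def run_keyword_pipeline(records, keyword):
    """Two-operator workflow: keyword filter -> collecting sink."""
    results = []
    sink = Actor(results.append)  # append returns None, so nothing is forwarded
    keyword_filter = Actor(
        lambda text: text if keyword in text else None, downstream=sink)
    for record in records:
        keyword_filter.send(record)
    keyword_filter.send(STOP)
    keyword_filter.join()
    sink.join()
    return results


if __name__ == "__main__":
    print(run_keyword_pipeline(["flu season", "hello", "flu shot"], "flu"))
```

Because each actor owns its state and reacts only to messages, control messages such as a pause or stop request can travel through the same mailboxes as data, which is one reason the model suits interactive workflows.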
Integrate ML Models
● Included as operators
● Diagram labels: Data preparation, Training, Instances for training, UDF (online), UDF (feed)
Conclusion: data analytics as a service
● Cloudberry: Big data visualization
● Texera: Analytics using workflows
● AsterixDB: Parallel database
Diagram labels: Labeled Instances, Classifier Trainer