
Saiku - taking OLAP databases into 21st century

Tomasz Nurkiewicz, nurkiewicz.com, @tnurkiewicz | tnurkiewicz. Slides: bit.ly/33degree


  1. Saiku - taking OLAP databases into 21st century. Tomasz Nurkiewicz, nurkiewicz.com, @tnurkiewicz | tnurkiewicz. Slides: bit.ly/33degree

  2. What is Saiku? DEMO

  3. Core concepts: OLAP, Fact, Dimension, Hierarchy

  4. Example facts: sold product; tweet/forum post/shared photo; website hit; incoming text message; ...you name it

  5. Dimension: "Properties of facts". When? What? Where? Who? How?

  6. Example dimensions (access log): timestamp, IP, URL resource, HTTP response code
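To make the access-log example concrete, here is a minimal sketch that turns one log line into a fact described by the four dimensions listed on the slide. The Common Log Format layout, the sample line, and the names `LOG_PATTERN` and `parse_hit` are assumptions made for illustration; they are not taken from the talk.

```python
import re
from datetime import datetime

# Assumed input: Common Log Format, e.g.
# 203.0.113.7 - - [12/Jun/2014:10:15:32 +0200] "GET /talks/saiku HTTP/1.1" 200 5120
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\S+)'
)


def parse_hit(line):
    """Turn one access-log line into a fact keyed by its dimensions."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None  # malformed line: nothing to load
    return {
        "timestamp": datetime.strptime(match.group("ts"), "%d/%b/%Y:%H:%M:%S %z"),
        "ip": match.group("ip"),
        "url": match.group("url"),
        "status": int(match.group("status")),
    }


sample = '203.0.113.7 - - [12/Jun/2014:10:15:32 +0200] "GET /talks/saiku HTTP/1.1" 200 5120'
hit = parse_hit(sample)
```

Each key of the returned dictionary answers one of the "When? Where? What?" questions about the website-hit fact.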

  7. Hierarchy: multi-level aggregation. Example: location hierarchy: (All) > Continent > Country > State > City
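Multi-level aggregation can be sketched in a few lines: counting the same facts at different depths of the location hierarchy. The sample facts and the `roll_up` helper are hypothetical and only illustrate the idea; an OLAP engine does this for you.

```python
from collections import Counter

# Levels of the location hierarchy, from coarse to fine.
LEVELS = ("continent", "country", "state", "city")

# Hypothetical facts, each tagged with its full location path.
facts = [
    {"continent": "Europe", "country": "Poland", "state": "Malopolskie", "city": "Krakow"},
    {"continent": "Europe", "country": "Poland", "state": "Malopolskie", "city": "Krakow"},
    {"continent": "Europe", "country": "Poland", "state": "Mazowieckie", "city": "Warsaw"},
    {"continent": "North America", "country": "USA", "state": "California", "city": "San Francisco"},
]


def roll_up(facts, level):
    """Count facts grouped by the hierarchy path down to the given level."""
    depth = 0 if level == "(All)" else LEVELS.index(level) + 1
    counts = Counter()
    for fact in facts:
        # The group key is the path from the top of the hierarchy to `level`.
        key = tuple(fact[name] for name in LEVELS[:depth])
        counts[key] += 1
    return dict(counts)


by_all = roll_up(facts, "(All)")
by_continent = roll_up(facts, "continent")
by_city = roll_up(facts, "city")
```

Drilling down is then just a matter of calling `roll_up` with a deeper level; the totals at every level add up to the `(All)` count.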

  8. Measures: quantitative properties; aggregate matching facts over them; Count/Sum/Average/Min/Max

  9. Example measures: load time (page hit fact), total price (sale fact), age of customer
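As a small illustration of aggregating a measure over matching facts, the sketch below computes Count/Sum/Average/Min/Max of a load-time measure per URL. It uses an in-memory SQLite database; the `page_hit` table, its columns, and the sample values are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical fact table: one row per page hit, load time is the measure.
conn.execute("CREATE TABLE page_hit (url TEXT, load_time_ms INTEGER)")
conn.executemany(
    "INSERT INTO page_hit VALUES (?, ?)",
    [("/home", 120), ("/home", 80), ("/home", 100), ("/about", 300)],
)

# Aggregate the measure over all facts that share the same URL.
rows = conn.execute(
    """
    SELECT url,
           COUNT(*),
           SUM(load_time_ms),
           AVG(load_time_ms),
           MIN(load_time_ms),
           MAX(load_time_ms)
    FROM page_hit
    GROUP BY url
    ORDER BY url
    """
).fetchall()
```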

  10. Charting - DEMO

  11. Exporting - DEMO

  12. Drill down - DEMO

  13. Ignored concepts: Hypercube, Mondrian, MDX

  14. Your own cube

  15. Star schema
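A star schema places one fact table in the middle, with foreign keys pointing at the surrounding dimension tables. The sketch below builds a tiny one in SQLite. The `tweet` and `location` tables and the `time_id`/`location_id` keys follow the schema file shown later in the deck; the `time_dim` table name, its columns, and all sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Dimension tables: descriptive attributes of facts.
    CREATE TABLE location (id INTEGER PRIMARY KEY, continent TEXT, country TEXT, city TEXT);
    CREATE TABLE time_dim (id INTEGER PRIMARY KEY, year INTEGER, month INTEGER, day INTEGER);

    -- Fact table: one row per tweet, pointing at its dimensions.
    CREATE TABLE tweet (
        id INTEGER PRIMARY KEY,
        time_id INTEGER REFERENCES time_dim(id),
        location_id INTEGER REFERENCES location(id)
    );

    INSERT INTO location VALUES
        (1, 'Europe', 'Poland', 'Krakow'),
        (2, 'Europe', 'Germany', 'Berlin'),
        (3, 'Asia', 'Japan', 'Tokyo');
    INSERT INTO time_dim VALUES (1, 2014, 6, 9), (2, 2014, 6, 10);
    INSERT INTO tweet VALUES (1, 1, 1), (2, 1, 1), (3, 2, 2), (4, 2, 3);
    """
)

# A typical OLAP-style query: join the fact table to its dimensions and group.
per_continent = conn.execute(
    """
    SELECT l.continent, t.year, COUNT(*)
    FROM tweet f
    JOIN location l ON l.id = f.location_id
    JOIN time_dim t ON t.id = f.time_id
    GROUP BY l.continent, t.year
    ORDER BY l.continent
    """
).fetchall()
```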

  16. ETL

  17. ETL - challenges: missing or incomplete data; heuristics; incremental, periodic updates; various data sources

  18. Schema file

      <Schema name="Twitter">
        <Cube name="Tweets" defaultMeasure="Count">
          <Table name="tweet"/>
          <DimensionUsage name="Time" source="Time" foreignKey="time_id"/>
          <Dimension name="Location" foreignKey="location_id">
            <Hierarchy hasAll="true" allMemberName="All locations">
              <Table name="location"/>
              <Level name="Continent" column="continent"/>
              <Level name="Country" column="country"/>
              <Level name="City" column="city"/>
            </Hierarchy>
          </Dimension>
          <!-- ... -->
        </Cube>
      </Schema>

  19. Schema Workbench. Source: www.stratebi.com/cursos/olap-mdx

  20. Security - users: standard user/password; roles; Spring Security - customizable

  21. Security - data: by role; restrict what can be seen; top/bottom limit

  22. Performance: big data, before it was cool; indexes on foreign keys; aggregate tables

  23. Without aggregate table

      SELECT COUNT(id)
      FROM tweet NATURAL JOIN locations
      GROUP BY locations.continent

  24. With aggregate table

      INSERT INTO agg (cnt, city, country, continent)
      SELECT COUNT(t.id) AS cnt, l.city, l.country, l.continent
      FROM tweet t NATURAL JOIN locations l
      GROUP BY l.city, l.country, l.continent

      Usage:

      SELECT SUM(agg.cnt)
      FROM agg
      GROUP BY agg.continent
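The following sketch checks, on a toy data set, that summing the pre-computed counts in an aggregate table gives the same per-continent totals as counting the facts directly. It runs against an in-memory SQLite database. The `tweet`, `locations`, and `agg` tables follow the slides, but the exact column layout (a shared `location_id` column so that `NATURAL JOIN` works), the index name, and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE locations (location_id INTEGER PRIMARY KEY,
                            city TEXT, country TEXT, continent TEXT);
    CREATE TABLE tweet (id INTEGER PRIMARY KEY, location_id INTEGER);
    -- Index on the foreign key, as suggested on the performance slide.
    CREATE INDEX tweet_location_idx ON tweet (location_id);

    INSERT INTO locations VALUES
        (1, 'Krakow', 'Poland', 'Europe'),
        (2, 'Berlin', 'Germany', 'Europe'),
        (3, 'Tokyo', 'Japan', 'Asia');
    INSERT INTO tweet (location_id) VALUES (1), (1), (2), (3), (3), (3);

    -- Aggregate table: one pre-computed count per city.
    CREATE TABLE agg (cnt INTEGER, city TEXT, country TEXT, continent TEXT);
    INSERT INTO agg (cnt, city, country, continent)
        SELECT COUNT(t.id), l.city, l.country, l.continent
        FROM tweet t NATURAL JOIN locations l
        GROUP BY l.city, l.country, l.continent;
    """
)

# Counting facts directly: scans and joins the whole fact table.
from_facts = conn.execute(
    "SELECT continent, COUNT(id) FROM tweet NATURAL JOIN locations "
    "GROUP BY continent ORDER BY continent"
).fetchall()

# Summing pre-computed counts: touches only one row per city.
from_agg = conn.execute(
    "SELECT continent, SUM(cnt) FROM agg GROUP BY continent ORDER BY continent"
).fetchall()
```

The aggregate table has to be refreshed whenever new facts arrive, which is part of why the deck later lists aggregate tables among the hard parts.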

  25. Pentaho Aggregation Designer. Source: infocenter.pentaho.com/help/index.jsp

  26. Deployment: mondrian.jar - engine; saiku.war - RESTful web services; ui.war - JS front-end

  27. Disadvantages: horizontal scalability? Stuck with SQL databases; complex schema definition (XML); aggregate tables are hard

  28. Thank you! Slides: nurkiewicz.github.io/talks/2014/33degree
