OLAP & Data Mining OLAP & Data Mining Agenda Agenda SQL Server Features (in short) SQL Server Features (in short) OLAP OLAP Data Mining Data Mining Demos Demos
Agenda Agenda SQL Server Features (in short) SQL Server Features (in short) OLAP OLAP Data Mining Data Mining Demos Demos
SQL Server Features SQL Server Features .NET Inside the database .NET Inside the database Native support of XML & Web Services Native support of XML & Web Services Visual Studio Integration Visual Studio Integration SQL Server Mobile Enhancements SQL Server Mobile Enhancements 64- -bit support bit support 64 Notification Services Notification Services Integration Services Integration Services Scalability & Availability Improvements Scalability & Availability Improvements What .NET Inside the Database means? What .NET Inside the Database means? Same experience as developing normal Same experience as developing normal applications applications Place part of the business logic inside the database Place part of the business logic inside the database Extend data types with user defined types Extend data types with user defined types Better Performance Better Performance No connections required No connections required Managed Code is compiled – Managed Code is compiled – TSQL TSQL interpeted interpeted Write object oriented applications and avoid T- -SQL SQL Write object oriented applications and avoid T Use the rich API of .NET Framework Use the rich API of .NET Framework Expose DB functionality via web services Expose DB functionality via web services
SQL Server 2005 SQL Server 2005 A Complete Enterprise Data Management and BI Solution Reporting Services Reporting Services Development Tools Development Tools Management Tools Management Tools Analysis Services Analysis Services OLAP & Data Mining OLAP & Data Mining Data Transformation Data Transformation Services Services ETL ETL SQL Server SQL Server Relational Engine Relational Engine Agenda Agenda SQL Server Features (in short) SQL Server Features (in short) OLAP OLAP Data Mining Data Mining Demos Demos
Relational Versus OLAP Reports Relational Versus OLAP Reports Feature Relational OLAP Feature Relational OLAP � � � � Flexible schema Flexible schema � � � � Real time data access Real time data access � � � � Single data store Single data store � � � � Simple management Simple management � � � � Detail reporting Detail reporting � � � � High performance High performance � � � � End- -user oriented user oriented End � � � � Ease of navigation and exploration Ease of navigation and exploration � � � � Rich analytics Rich analytics Basic Principles Basic Principles OLAP databases are denormalized OLAP databases are denormalized databases databases Normalization is good for updates – – inserts but bad for inserts but bad for Normalization is good for updates performance performance Process data in advance to provide better response Process data in advance to provide better response times times data � � need space � � takes Process more data Process more need for extra disk for extra disk space takes time � � faster more time more faster response: We should find the balance response: We should find the balance Precalculate values that users may ask for Precalculate values that users may ask for No real- -time data time data No real Use Integration Services (DTS) to build and update the Use Integration Services (DTS) to build and update the datawarehouse datawarehouse
Concepts Concepts Fact tables Fact tables Dimension tables Dimension tables Measures Measures Star and Snowflake schemas Star and Snowflake schemas Cube Cube Storage options: ROLAP- -HOLAP HOLAP- -MOLAP MOLAP Storage options: ROLAP Multidimensional Data Sales volume as a function of product, Sales volume as a function of product, month, and region month, and region Dimensions: Product, Location, Time Hierarchical summarization paths n o i g e R Industry Region Year Category Country Quarter Product Product City Month Week Office Day Month
A Concept Hierarchy: A Concept Hierarchy: Dimension (location) Dimension (location) all all Europe ... North_America region Germany ... Spain Canada ... Mexico country Vancouver ... city Frankfurt ... Toronto L. Chan ... M. Wind office A Sample Data Cube Total annual sales Date Product of TV in U.S.A. 1Qtr 2Qtr sum 3Qtr 4Qtr TV U.S.A PC VCR Country sum Canada Mexico sum
Agenda Agenda SQL Server Features (in short) SQL Server Features (in short) OLAP OLAP Data Mining Data Mining Demos Demos Value of Data Mining Value of Data Mining Business Knowledge SQL Server 2005 SQL Server 2005 Relative Business Value Data Mining OLAP Reports (Adhoc) Reports (Static) Easy Difficult Usability 5 new algorithms 5 new algorithms Tightly integrated with AS, DTS, Reporting Tightly integrated with AS, DTS, Reporting Graphical tools/wizards Graphical tools/wizards Integration with Web/Office apps Integration with Web/Office apps 12 embeddable viewers 12 embeddable viewers
Algorithms Algorithms Data Mining algorithms developed in conjunction Data Mining algorithms developed in conjunction with Microsoft Research address the full range of with Microsoft Research address the full range of data mining applications data mining applications Classification, Regression, Segmentation, Classification, Regression, Segmentation, Association, Forecasting, Text Analysis, and Association, Forecasting, Text Analysis, and Advanced Data Exploration Advanced Data Exploration Algorithm complexity is hidden behind a Algorithm complexity is hidden behind a consistent, familiar API consistent, familiar API Data Mining logic accessed through SQL queries, Data Mining logic accessed through SQL queries, results provided through datasets, rowsets rowsets results provided through datasets, Applications can use SQL Server Data Mining to apply Applications can use SQL Server Data Mining to apply learned rules, or show patterns to the user learned rules, or show patterns to the user Data Mining Tasks Data Mining Tasks Classification Classification Regression Regression Segmentation Segmentation Association Association Forecasting Forecasting Text Analysis Text Analysis Advanced Data Advanced Data Exploration Exploration
Data Mining Tasks Data Mining Tasks Classification Classification Typical Business Questions Typical Business Questions Regression What type of membership card What type of membership card Regression should I offer? should I offer? Which customers will respond to Which customers will respond to Segmentation my mailing? my mailing? Segmentation Is this transaction fraudulent? Is this transaction fraudulent? Will I lose this customer? Will I lose this customer? Association Association Will this product be defective? Will this product be defective? Algorithms Algorithms Forecasting Forecasting Preferred Preferred Decision Trees Decision Trees Naï Na ïve Bayes ve Bayes Text Analysis Text Analysis Neural Nets Neural Nets Also Can Also Can Clustering Clustering Advanced Data Advanced Data Sequence Clustering Sequence Clustering Association Rules Association Rules Exploration Exploration Data Mining Tasks Data Mining Tasks Classification Classification Typical Business Questions Typical Business Questions Regression Regression How much revenue will I get from How much revenue will I get from this customer? this customer? How long will this asset be in How long will this asset be in service? Segmentation Segmentation service? Algorithms Algorithms Preferred Preferred Association Association Decision Trees Decision Trees Neural Nets Neural Nets Also Can Also Can Forecasting Forecasting Clustering Clustering Sequence Clustering Sequence Clustering Text Analysis Text Analysis Advanced Data Advanced Data Exploration Exploration
Recommend
More recommend