3/17/2009

THE INTEGRATION OF BUSINESS INTELLIGENCE AND KNOWLEDGE MANAGEMENT
Paper by W. F. Cody, J. T. Kreulen, V. Krishna, W. S. Spangler
Presentation by Dylan Chi
Discussion by Debojit Dhar

OUTLINE
Business Intelligence
Knowledge Management
BIKM
eClassifier
Integrated BIKM Tools

BUSINESS INTELLIGENCE
“Business intelligence (BI) refers to skills, technologies, applications and practices used to help a business acquire a better understanding of its commercial context.”
“Business intelligence may also refer to the collected information itself.”
--Wikipedia

BUSINESS INTELLIGENCE
Business intelligence technology has coalesced around the use of two technologies:
  data warehousing
  on-line analytical processing (OLAP)

DATA WAREHOUSING
Data warehousing is a systematic approach to collecting relevant business data into a single repository, where it is organized and validated so that it can be analyzed and presented in a form that is useful for business decision-making.

DATA WAREHOUSING
The various sources for the relevant business data are referred to as the operational data stores (ODS).
The data are extracted, transformed, and loaded (ETL) from the ODS systems into a data mart.
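The ETL step on the DATA WAREHOUSING slide can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the record layout, the sample values, and the use of a plain Python list as the "data mart". It is not how any particular warehouse product works.

```python
# Hypothetical ODS extract: field names and values are invented for illustration.
ORDERS_ODS = [
    {"order_id": 1, "product": " Widget ", "region": "east", "amount": "120.50"},
    {"order_id": 2, "product": "widget", "region": "WEST", "amount": "80.00"},
    {"order_id": 3, "product": "Gadget", "region": "East", "amount": "not available"},
]


def extract(source):
    """Extract: read raw records from an operational data store."""
    return list(source)


def transform(rows):
    """Transform: normalize text fields, parse amounts, drop rows that fail validation."""
    clean = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # validation: skip records whose amount cannot be parsed
        clean.append({
            "order_id": row["order_id"],
            "product": row["product"].strip().title(),
            "region": row["region"].strip().title(),
            "amount": amount,
        })
    return clean


def load(rows, data_mart):
    """Load: append validated rows to the data mart (here, a plain list)."""
    data_mart.extend(rows)
    return data_mart


data_mart = load(transform(extract(ORDERS_ODS)), [])
```

The third record is dropped during validation, so only organized and validated rows reach the data mart, which is the point the slide makes about the repository.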
OLAP
“Online analytical processing, or OLAP, is an approach to quickly answer multi-dimensional analytical queries.”
--Wikipedia

OLAP
In the data mart, the data are modeled as an OLAP cube (multidimensional model).
The multidimensional model supports flexible drill-down and roll-up analyses.

OLAP CUBE
(figure)

OUTLINE (current section: Knowledge Management)

KNOWLEDGE MANAGEMENT
“Knowledge Management (KM) comprises a range of practices used in an organization to identify, create, represent, distribute and enable adoption of insights and experiences. Such insights and experiences comprise knowledge, either embodied in individuals or embedded in organizational processes or practice.”
--Wikipedia

KNOWLEDGE MANAGEMENT
In this context, it is used for the management and analysis of unstructured information, particularly text documents.
Textual information sources:
  Business documents, e-mail, news and press articles, technical journals, patents, conference proceedings, business contracts, government reports, regulatory filings, discussion groups, problem report databases, sales and support notes, web.
DISCUSSION
Does the authors' definition of business intelligence agree with yours? Why or why not?
What business intelligence applications can you think of that aren't mentioned in the paper?

OUTLINE (current section: BIKM)

BIKM
The authors believe that over time techniques from both BI and KM will blend.
New techniques will seamlessly span the analysis of both data and text.

BIKM PROBLEMS
Understanding sales effectiveness
  Products, sales representatives, customers
  Sales techniques
Improving support and warranty analysis
  Customer complaints
Relating CRM to profitability
  'Hidden' cost
  Complete picture

ENVIRONMENTAL ISSUES
Text information sits inside the same database.
Textual information is in systems distinct from the ODS systems.
The sources of text to relate to a business data analysis are not known.

EXAMPLE
A business analyst explores a revenue cube and detects a downward movement in revenues for a software product in some part of the United States.
The data cube shows the phenomenon but does not provide any explanation for it.
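The situation in the EXAMPLE slide can be sketched with a toy revenue cube. The product name, regions, quarters, and revenue figures below are invented, and `roll_up` is a hypothetical helper rather than part of any OLAP product; the sketch only shows how rolling up exposes the decline and drilling down localizes it, while the cause stays invisible.

```python
from collections import defaultdict

# Hypothetical fact rows: (product, region, quarter, revenue). All values are invented.
FACTS = [
    ("EditorPro", "Northeast", "Q1", 100.0),
    ("EditorPro", "Northeast", "Q2", 60.0),
    ("EditorPro", "Southwest", "Q1", 90.0),
    ("EditorPro", "Southwest", "Q2", 95.0),
]
DIMENSIONS = ("product", "region", "quarter")


def roll_up(facts, keep):
    """Sum revenue over every dimension that is not named in `keep`."""
    idx = [DIMENSIONS.index(d) for d in keep]
    totals = defaultdict(float)
    for row in facts:
        key = tuple(row[i] for i in idx)
        totals[key] += row[3]
    return dict(totals)


# Rolled up to product and quarter: the overall decline is visible.
by_quarter = roll_up(FACTS, ("product", "quarter"))

# Drilled down by region: the decline is confined to one region.
by_region = roll_up(FACTS, ("product", "region", "quarter"))
```

Neither view says why Northeast revenue fell, which is the gap the text sources on the following slides are meant to fill.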
EXAMPLE
To understand the phenomenon, some text sources could be used to extract valuable information:
  Enterprise-specific information
  Service call logs about the product
  Competitive intelligence reports
  Purchased text information
  Public documents in Web forms
  Discussions about products

DISCUSSION
Do you think integrating BI and KM is a good idea?
Do you think the ideas in the paper made it into the mainstream BI tools, or not? Have you come across tools that use the BIKM concept?

OUTLINE (current section: eClassifier)

ECLASSIFIER
eClassifier is an application that can quickly analyze a large collection of documents and utilize multiple algorithms, visualizations, and metrics to create and to maintain a taxonomy.
It is very difficult to automatically produce a satisfactory taxonomy for a diverse set of users without allowing human intervention.

DOCUMENT REPRESENTATION
Feature space of terms and phrases
The feature space is obtained by counting the occurrence of terms and phrases in each document
  Stop-word lists
  Synonym list, stock phrase list, 'include word' list
Vector of weighted frequencies
Dictionary tool

TAXONOMY GENERATION
Automatically create an initial categorization or taxonomy
  k-means algorithm
Interactive, query-based clustering
  Seeds categories based on a set of keywords
  Tests out the queries
  Refines the clusters based on the observed results
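The document representation and k-means steps above can be sketched in a few lines. This is an illustration under stated assumptions, not eClassifier's implementation: the stop-word list and sample documents are invented, the weighting is plain term counts scaled to unit length (the slides do not give the weighting scheme), and the first two documents are used as initial centroids.

```python
import math
from collections import Counter

STOP_WORDS = {"on", "the", "a"}  # hypothetical stop-word list

DOCS = [  # invented sample documents
    "printer driver crash on install",
    "invoice billing error on the account",
    "driver install fails printer offline",
    "billing account charged twice invoice",
]


def vectorize(docs):
    """Count term occurrences per document and scale each vector to unit length."""
    counts = [Counter(w for w in d.lower().split() if w not in STOP_WORDS) for d in docs]
    vocab = sorted(set().union(*counts))
    vectors = []
    for c in counts:
        v = [float(c[t]) for t in vocab]
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        vectors.append([x / norm for x in v])
    return vocab, vectors


def kmeans(vectors, seeds, iterations=10):
    """Plain k-means: assign each vector to its nearest centroid, then recompute means."""
    centroids = [list(s) for s in seeds]  # copy so the caller's seeds are not modified
    assignment = []
    for _ in range(iterations):
        assignment = [
            min(range(len(centroids)), key=lambda k: math.dist(v, centroids[k]))
            for v in vectors
        ]
        for k in range(len(centroids)):
            members = [v for v, a in zip(vectors, assignment) if a == k]
            if members:  # an empty cluster keeps its previous centroid
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return assignment


vocab, vectors = vectorize(DOCS)
labels = kmeans(vectors, seeds=vectors[:2])
```

Passing the seeds explicitly keeps the run deterministic and loosely mirrors the query-based option on the slide, where an analyst seeds categories and then refines the clusters.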
TAXONOMY EVALUATION
Once we have an initial taxonomy of the documents, eClassifier provides the means to understand and to evaluate it.
Category label is generated using a term-coverage algorithm that identifies dominant terms in the feature space.
Metrics
  Size, cohesion, distinctness

TAXONOMY VISUALIZATION
(figure)

CLASSIFICATION
Assign additional documents to the taxonomy as they become available
eClassifier creates a batch classifier to process the additional documents
  Nearest centroid
  Naive Bayes multivariate
  Naive Bayes multinomial
  Decision tree

ANALYSIS AND REPORTING
FAQ analysis
Discovery of correlations
  Chi-squared test
  Continuous variables
Using a generated taxonomy to compare document collections
…

DISCUSSION
The main tasks of eClassifier can be represented as:
  Taxonomy generation
  Taxonomy and category evaluation
  Taxonomy visualization
  Classification
  Analysis and reporting
Which of these do you think is most important and why?

OUTLINE (current section: Integrated BIKM Tools)
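Of the batch classifiers listed on the CLASSIFICATION slide, nearest centroid is the simplest to sketch. The category names, centroid weights, and sample texts below are invented, and cosine similarity is an assumed choice of closeness measure; the slides do not say which one eClassifier uses.

```python
import math
from collections import Counter

# Hypothetical taxonomy: category name -> centroid as term weights. Values are invented.
CENTROIDS = {
    "printing": {"printer": 0.6, "driver": 0.5, "install": 0.4},
    "billing": {"invoice": 0.6, "billing": 0.5, "account": 0.4},
}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify(text, centroids=CENTROIDS):
    """Assign a document to the category whose centroid is most similar to it."""
    vector = Counter(text.lower().split())
    return max(centroids, key=lambda name: cosine(vector, centroids[name]))


def classify_batch(texts):
    """Process newly arrived documents in one pass."""
    return [classify(t) for t in texts]


assigned = classify_batch([
    "printer driver will not install",
    "wrong invoice on my account",
])
```

In this sketch a document that shares no terms with any centroid falls into the first category listed; a real tool would more plausibly route such documents to a miscellaneous class for human review.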
INTEGRATION PARADIGM
Text is ultimately associated with business data records to enhance the understanding of the data
We might strive to achieve a tighter integration of the text information with the associated data
Using an OLAP multidimensional data model as the integrating mechanism

INTEGRATING TEXT INFORMATION
Find attributes in the documents that can be used to link them to the data, or find attributes in the documents that can be used as additional dimensions to deepen the understanding of the data
Compute quantitative values from the documents

INTEGRATED BIKM TOOLS
Apply the OLAP data model to text documents, creating a document warehouse
Allow users to explore data cubes organized as a star schema, through an interface consisting of a report view and navigational controls

DOCUMENT WAREHOUSING
The fact table granularity is a document
The dimension tables hold the attributes of the document

SHARED DIMENSION DATA MODEL
We use star schemas to organize and analyze both data and document cubes
Providing a mechanism to link them will allow deeper analysis and thereby provide greater value
The key to achieving it is to directly link the data to the documents through shared dimensions

SHARED DIMENSIONS
(figure)
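The shared dimension idea can be sketched with two small star schemas in an in-memory SQLite database. All table names, column names, and rows are hypothetical, and the quarter is stored directly in the fact tables instead of a separate time dimension to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Shared dimension referenced by both fact tables (hypothetical names and values).
    CREATE TABLE product_dim (product_id INTEGER PRIMARY KEY, name TEXT);
    -- Data cube fact table: one row per sales record.
    CREATE TABLE sales_fact (product_id INTEGER, quarter TEXT, revenue REAL);
    -- Document cube fact table: the granularity is one document.
    CREATE TABLE document_fact (doc_id INTEGER PRIMARY KEY, product_id INTEGER,
                                quarter TEXT, category TEXT);
    INSERT INTO product_dim VALUES (1, 'EditorPro'), (2, 'SheetCalc');
    INSERT INTO sales_fact VALUES (1, 'Q1', 100.0), (1, 'Q2', 60.0), (2, 'Q2', 80.0);
    INSERT INTO document_fact VALUES
        (10, 1, 'Q2', 'install failure'),
        (11, 1, 'Q2', 'install failure'),
        (12, 2, 'Q2', 'billing question');
""")


def documents_for(product, quarter):
    """Follow the shared product dimension from a data-cube cell to its documents."""
    return conn.execute(
        """SELECT d.category, COUNT(*) FROM document_fact d
           JOIN product_dim p ON p.product_id = d.product_id
           WHERE p.name = ? AND d.quarter = ?
           GROUP BY d.category ORDER BY COUNT(*) DESC""",
        (product, quarter),
    ).fetchall()


result = documents_for("EditorPro", "Q2")
```

Because both fact tables key into the same `product_dim`, an analyst looking at a weak revenue cell can move to the documents for the same product and quarter and see which categories dominate.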
DYNAMIC DIMENSIONS
The new taxonomy can be made available to the document warehouse by creating a corresponding dimension table to represent the taxonomy and then populating an added column in the fact table, associating all known documents with the newly published dimension.

DISCUSSION
Do you think it is possible to create a consistent taxonomy for documents using the concepts detailed in the paper? What changes would you suggest to come up with a more useful classification?
Is it a good idea to group documents under a single hierarchy or class?

Thank you