
Failure Prediction for Decision Makers in Data Centers using Data Mining



  1. Failure Prediction for Decision Makers in Data Centers using Data Mining
     Group ID: 39WDIT

  2. Team Members
     - IT 11 6002 44  D.G.S.M. Wijayarathne
     - IT 11 6049 90  W.K.S.D Fernando
     - IT 11 6005 58  A.S.M.S Sharfaan
     - IT 11 6073 42  J.S.D Fernando
     - IT 11 6104 58  M.P.L Mendis
     Internal Supervisor: Mr. Dilhan Manawadu
     IT 11 6049 90

  3. Topics to be covered
     1. Introduction
     2. Overall Descriptions
     3. Specific Requirements
     4. References
     5. Appendices
     IT 11 6049 90

  4. Introduction

  5. Overview
     - The development team will implement a system called “WinSeer” to predict data center failures.
     - Why is predicting data center failures needed?
     - The system targets decision makers.
     IT 11 6049 90

  6. Objectives
     - Select the best data mining algorithm.
     - Develop a data mining model.
     - Predict data center failures.
     - Notify decision makers about failures.
     IT 11 6049 90
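
The deck does not say how the "best" algorithm is chosen. One common approach, shown here purely as an assumption, is to compare candidates by k-fold cross-validated accuracy. The two candidate models below are toy stand-ins (not Weka, KNIME, or RapidMiner models), and the data is synthetic.

```python
from statistics import mean


def k_fold_indices(n, k):
    """Yield (train, test) index lists for k contiguous folds over n rows."""
    start = 0
    for fold in range(k):
        size = n // k + (1 if fold < n % k else 0)
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, test
        start += size


def cross_val_accuracy(fit, rows, labels, k=5):
    """Mean held-out accuracy of a model-fitting function over k folds."""
    scores = []
    for train, test in k_fold_indices(len(rows), k):
        predict = fit([rows[i] for i in train], [labels[i] for i in train])
        hits = sum(predict(rows[i]) == labels[i] for i in test)
        scores.append(hits / len(test))
    return mean(scores)


def fit_majority(rows, labels):
    """Baseline: always predict the most common training label."""
    majority = max(set(labels), key=labels.count)
    return lambda row: majority


def fit_threshold(rows, labels):
    """Toy rule: predict failure (1) when feature 0 reaches a learned threshold."""
    best_t, best_hits = None, -1
    for t in sorted({row[0] for row in rows}):
        hits = sum(int(row[0] >= t) == y for row, y in zip(rows, labels))
        if hits > best_hits:
            best_t, best_hits = t, hits
    return lambda row: int(row[0] >= best_t)


def select_best(candidates, rows, labels, k=5):
    """Return the name of the highest-scoring candidate and all scores."""
    scores = {name: cross_val_accuracy(fit, rows, labels, k)
              for name, fit in candidates.items()}
    return max(scores, key=scores.get), scores


# Synthetic data: feature 0 is a made-up disk error count per server.
rows = [[i % 10] for i in range(40)]
labels = [1 if row[0] >= 7 else 0 for row in rows]
best, scores = select_best(
    {"majority": fit_majority, "threshold_rule": fit_threshold}, rows, labels)
print(best, scores)
```

Accuracy is only one possible criterion; for rare events such as failures, precision and recall on the failure class may be a better basis for the choice.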

  7. Software Architecture Diagram IT 11 6049 90

  8. Product perspective
     - Existing research
       • Research 1: Prediction of Hard Drive Failures via Rule Discovery from Auto Support Data, by V. Agrawal, C. Bhattacharyya, T. Niranjan, and S. Susarla, Sep 2009.
       • Research 2: Effective Failure Prediction in Hadoop Clusters, by R. Dudko, A. Sharma, and J. Tedesco.
       • Research 3: A Failure Detection and Prediction Mechanism for Enhancing Dependability of Data Centers, by Q. Guan, Z. Zhang, and S. Fu, October 2012.
       • Research 4: Host Load Prediction in a Google Compute Cloud with a Bayesian Model, by S. Di, D. Kondo, and W. Cirne, 2012.

  9. Product perspective

     | Features        | “WinSeer” Project                          | Research 1                                        | Research 2                                     | Research 3                                 | Research 4                                        |
     |-----------------|--------------------------------------------|---------------------------------------------------|------------------------------------------------|--------------------------------------------|---------------------------------------------------|
     | Prediction      | Data center server failures.               | Hard Drive Failures.                              | Failure prediction in Hadoop Clusters.         | Enhancing Dependability of Data Centers.   | Host Load factor Prediction.                      |
     | Target Audience | Decision makers in organizations.          | Data center administrators.                       | Operators and managers of the cluster.         | Data center administrators.                | Google users.                                     |
     | Business Goal   | To increase the data centers’ availability. | To avoid loss of data and performance degradation. | Data management and monitoring large clusters. | Provide high accuracy to the Data Centers. | To increase the availability of the search engine. |
     | Model Type      | Open source data mining models.            | Rule learning algorithms.                         | Novel approach.                                | Bayesian and decision trees models.        | Based on Bayes model.                             |
     | User Interface  | Web interface.                             | Net Application.                                  | Monitoring systems.                            | Monitoring systems.                        | Web interface.                                    |

     IT 11 6002 44

  10. Product functions IT 11 6002 44

  11. Product functions
      - Login
      - View Predictions
      IT 11 6002 44

  12. Product functions
      - Login
      - View Predictions
      - Update Profile
      - Add a new user
      IT 11 6002 44

  13. User characteristics
      - Two classes of users:
        • Organizational decision maker
        • System administrator
      - Ability to read and understand English.
      - Familiarity with operating the basic graphical user interface (GUI) of a web browser.
      - An e-mail account is required to receive e-mail alerts.
      IT 11 6002 44
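
The e-mail alerts mentioned above could be composed roughly as follows. This is only a sketch: the sender address, the message wording, and the `build_alert` helper are hypothetical, and the deck does not describe how alerts are actually generated or sent.

```python
from email.message import EmailMessage


def build_alert(recipient, server, probability, days_ahead):
    """Compose (but do not send) a failure-prediction alert e-mail."""
    msg = EmailMessage()
    msg["From"] = "winseer-alerts@example.com"  # hypothetical sender address
    msg["To"] = recipient
    msg["Subject"] = f"WinSeer: predicted failure on {server}"
    msg.set_content(
        f"Server {server} is predicted to fail within {days_ahead} days "
        f"(estimated probability {probability:.0%}).\n"
        "Please review the prediction in the WinSeer web interface."
    )
    return msg


alert = build_alert("decision.maker@example.com", "srv-01", 0.85, 14)
print(alert)
```

Sending the message would need an SMTP server, for example through the standard `smtplib` module; that part is omitted because the deck gives no delivery details.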

  14. Assumptions and dependencies
      - Server monitoring is handled by the currently available system; the proposed system only performs pattern recognition and generates failure predictions.
      IT 11 6002 44

  15. Distribution of requirements
      - Prototype: processing the data set and integrating the tools with ASP.net.
      - Mid Review: WinSeer data mining model.
      - Finals: complete project, with a focus on predicting failures at least two weeks in advance.
      - Processing the XML data set: Shamini, Premeshini
      - Integrating the open source data mining tools with ASP.net: Samith, Sameera
      - Researching the mining model algorithms: Saumy, Sameera, Samith
      - Generating reports: Saumy
      IT 11 6002 44
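
As an illustration of the XML processing task and the two-week prediction target, the sketch below parses monitoring records and marks each one as positive when the same server fails at least 14 days later. The XML schema, the attribute names, and the labeling window are all assumptions; the deck does not describe the actual data set.

```python
import xml.etree.ElementTree as ET
from datetime import date, timedelta

# Hypothetical monitoring export; the real WinSeer schema is not given.
SAMPLE_XML = """
<records>
  <record server="srv-01" date="2013-03-01" disk_errors="0" failed="false"/>
  <record server="srv-01" date="2013-03-10" disk_errors="3" failed="false"/>
  <record server="srv-01" date="2013-03-20" disk_errors="9" failed="true"/>
  <record server="srv-02" date="2013-03-01" disk_errors="0" failed="false"/>
</records>
"""


def parse_records(xml_text):
    """Turn each <record> element into a plain dictionary."""
    root = ET.fromstring(xml_text)
    return [
        {
            "server": el.get("server"),
            "date": date.fromisoformat(el.get("date")),
            "disk_errors": int(el.get("disk_errors")),
            "failed": el.get("failed") == "true",
        }
        for el in root.iter("record")
    ]


def label_with_lead_time(records, lead_days=14, window_days=14):
    """Mark a record positive if its server fails between lead_days and
    lead_days + window_days after the observation date."""
    failures = {}
    for r in records:
        if r["failed"]:
            failures.setdefault(r["server"], []).append(r["date"])
    labelled = []
    for r in records:
        earliest = r["date"] + timedelta(days=lead_days)
        latest = earliest + timedelta(days=window_days)
        upcoming = any(earliest <= f <= latest
                       for f in failures.get(r["server"], []))
        labelled.append({**r, "fails_after_lead": upcoming})
    return labelled


labelled = label_with_lead_time(parse_records(SAMPLE_XML))
for row in labelled:
    print(row["server"], row["date"], row["fails_after_lead"])
```

Labeling this way means a model trained on the result learns to flag a server while there is still at least two weeks of warning, which is the goal stated for the final milestone.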

  16. Specific Requirements

  17. External interface requirements
      - Detailed user interfaces
        • Login Interface
      IT 11 6005 58

  18. External interface requirements
      - Home Page
      IT 11 6005 58

  19. Detailed user interfaces
      - Register New User
      IT 11 6005 58

  20. Detailed user interfaces
      - Edit Profile
      IT 11 6005 58

  21. Detailed user interfaces
      - Update and view user details
      IT 11 6005 58

  22. Detailed user interfaces
      - Software interface integrations
        • Weka 3.6
        • Knime 2.6
        • RapidMiner 5.3
      - Communication interface integrations
        • An Internet connection is required to serve the web pages and for users to access the web interface.
      IT 11 6005 58

  23. Classes/Objects IT 11 6005 58

  24. Performance Requirements
      - Response time
        • How fast the system handles individual requests.
        • The system should not render the host computer unusable for other purposes.
      IT 11 6073 42

  25. Performance Requirements
      - Throughput
        • How many requests the system can handle.
        • “WinSeer” prediction handles data sets of up to 20 GB in size.
      IT 11 6073 42
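
A 20 GB data set will not fit in memory on a typical machine, so it would have to be read incrementally. Assuming the data is XML (slide 15 mentions an XML data set), a minimal streaming sketch with the standard library's `iterparse` looks like this. The `record` element and its attributes are hypothetical.

```python
import io
import xml.etree.ElementTree as ET
from collections import Counter


def failure_counts(source):
    """Count failures per server while streaming through the XML,
    discarding each record once it has been processed."""
    counts = Counter()
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # first event is the start of the root element
    for event, elem in context:
        if event == "end" and elem.tag == "record":
            if elem.get("failed") == "true":
                counts[elem.get("server")] += 1
            root.clear()  # drop processed children so memory stays bounded
    return counts


# Small in-memory stand-in for a large file on disk.
sample = io.BytesIO(b"""
<records>
  <record server="srv-01" failed="true"/>
  <record server="srv-02" failed="false"/>
  <record server="srv-01" failed="true"/>
  <record server="srv-02" failed="true"/>
</records>
""")
counts = failure_counts(sample)
print(counts)
```

In practice `source` would be a path or an open file handle; the function never holds more than a few elements at a time regardless of file size.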

  26. Design Constraints
      - The system should be easy to access.
      - The mining model is developed using open source tools.
      - Software interfaces used in “WinSeer”.
      IT 11 6073 42

  27. Software System Attributes
      - Reliability
      - Availability
      IT 11 6073 42

  28. Software System Attributes
      - Security
      - Maintainability
      IT 11 6073 42

  29. Supporting Information

  30. References
      [1] HowStuffWorks.com Contributors, "Are data mining and data warehousing related?", HowStuffWorks.com, 20 April 2011. [Online]. Available: http://www.howstuffworks.com/are-data-mining-and-data-warehousing-related.htm. [Accessed: Mar. 23, 2013].
      [2] "Database Fundamentals," 2008. [Online]. Available: http://www.personal.psu.edu/glh10/ist110/topic/topic07/topic07_09.html. [Accessed: Mar. 23, 2013].
      [3] B. Sudeshna, Georgia, "DATA MINING," 1997. [Online]. Available: http://www.siggraph.org/education/materials/HyperVis/applicat/data_mining/data_mining.html. [Accessed: Mar. 23, 2013].
      [4] M. Bruno, "4 open source data mining tools (with GUI)," April 21, 2009. [Online]. Available: http://www.analyticbridge.com/profiles/blogs/4-open-source-data-mining.
      [5] "Data Mining: What is Data Mining?," [Online]. Available: http://www.anderson.ucla.edu/faculty/jason.frand/teacher/technologies/palace/datamining.htm. [Accessed: Mar. 24, 2013].
      IT 11 6104 58

  31. References
      [6] C.G Carrier and O Povel, "Characterizing Data Mining software," Intelligent Data Analysis 7, pp. 181-185, August 2003.
      [7] V. Agrawal, C. Bhattacharyya, T. Niranjan, and S. Susarla, "Prediction of Hard Drive Failures via Rule Discovery from Auto Support Data," 2009 International Conference on Machine Learning and Applications, pp. 782-786, Dec. 2009.
      [8] R. Dudko, A. Sharma, and J. Tedesco, "Effective Failure Prediction in Hadoop Clusters." Available: https://wiki.engr.illinois.edu/download/attachments/195766887/JAR-2nd.pdf?version=3&modificationDate=1333424381000. [Accessed: Mar. 28, 2013].
      [9] Q. Guan, Z. Zhang, and S. Fu, "A Failure Detection and Prediction Mechanism for Enhancing Dependability of Data Centers," International Journal of Computer Theory and Engineering, Vol. 4, No. 5, 2012.
      [10] S. Di, D. Kondo, and W. Cirne, "Host Load Prediction in a Google Compute Cloud with a Bayesian Model," in Proceedings of IEEE International Conference for High Performance Computing, Networking, Storage and Analysis, Nov. 2012.
      IT 11 6104 58

  32. References
      [11] "Performance requirements documentation," [Online]. Available: http://pic.dhe.ibm.com/infocenter/aix/v7r1/index.jsp?topic=%2Fcom.ibm.aix.prftungd%2Fdoc%2Fprftungd%2Fdoc_perf_reqs.htm. [Accessed: Mar. 23, 2013].
      [12] "How to write Performance Requirements with Example," [Online]. Available: http://www.1202performance.com/atricles/how-to-write-performance-requirements-with-example/. [Accessed: Mar. 23, 2013].
      [13] M. Bruno, "4 open source data mining tools (with GUI)," April 21, 2009. [Online]. Available: http://www.analyticbridge.com/profiles/blogs/4-open-source-data-mining. [Accessed: Mar. 23, 2013].
      [14] Z. Li, "using data mining techniques to improve software reliable," 2006.
      [15] A. Alzghoul and M. Löfstrand, "Increasing availability of industrial systems through data stream mining," Computers & Industrial Engineering, 2010. [Accessed: Mar. 23, 2013].
      IT 11 6104 58

  33. References
      [16] "Non Functional Requirements," 2, Aug 26 2010. [Online]. Available: http://c2.com/cgi/wiki?NonFunctionalRequirements. [Accessed: Mar. 22, 2013].
      IT 11 6104 58

  34. Interview Questions
      - How do you define the scale of your company? (Scale of the servers.)
      - What kind of data do your servers handle? (How critical?)
      - Have you faced any server failures in your company?
      - How often do failures happen?
      - How do you find out when a failure has occurred in your company?
      - How do failures affect your company?
      - After a failure happens, what are your next steps?
      IT 11 6104 58

  35. Interview Questions (cont.)
      - How long does it take to recover from a failure?
      - Do you have any server failure prediction mechanism?
      - If yes:
        • What kind of mechanism do you have to predict failures in your data centers?
        • Is it cost effective?
        • How early can you learn about a failure?
        • Are you satisfied with your system?
      - If no:
        • If you had a failure prediction mechanism, would it be helpful to your decisions and your company?
        • What is your opinion of a failure prediction mechanism?
      IT 11 6104 58
