Artificial Immune Systems and Data Mining: Bridging the Gap with Scalability and Improved Learning
Olfa Nasraoui, Fabio González, Cesar Cardona, Dipankar Dasgupta
The University of Memphis
A Demo/Poster at the National Science Foundation Workshop on Next Generation Data Mining, Nov. 2002
Nasraoui, Gonzalez, Cardona, Dasgupta: Scalable Artificial Immune System Based Data Mining
NSF-NGDM, Nov. 1-3, 2002, Baltimore, MD

Inspired by Nature…
• Living organisms exhibit extremely sophisticated learning and processing abilities that allow them to survive and proliferate.
• Nature has always served as inspiration for several scientific and technological developments, e.g., Neural Networks, Evolutionary Computation.
• Immune system: a parallel and distributed adaptive system with tremendous potential in many intelligent computing applications.

What is the Immune System?
• Protects our bodies from foreign pathogens (viruses/bacteria)
• Innate Immune System (initial, limited, e.g., skin, tears, etc.)
• Acquired Immune System (learns how to respond to NEW threats adaptively)
• Primary immune response
  - First response to invading pathogens
• Secondary immune response
  - Encountering a similar pathogen a second time
  - Remembers past encounters
  - Faster and stronger response than the primary response

Points of Strength of the Immune System
• Recognition (anomaly detection, noise tolerance)
• Robustness (noise tolerance)
• Feature extraction
• Diversity (can face an entire repertoire of foreign invaders)
• Reinforcement learning
• Memory (remembers past encounters: basis for vaccine)
• Distributed detection (no single central system)
• Multi-layered (defense mechanisms at multiple levels)
• Adaptive (self-regulated)

Major Players: B-Cells
• Through a process of recognition and stimulation, B-cells clone and mutate to produce a diverse set of antibodies adapted to different antigens.
• B-cells secrete antibodies with paratopes that can bind to specific antigens (epitopes) and destroy their host invading agent through a KILL, SUICIDE, or INGEST signal.
• B-cell antibody paratopes can also bind to antibody idiotopes on other B-cells, hence sending a STIMULATE or SUPPRESS signal → hence the Network → Memory

Requirements for Clustering Data Streams (Barbara, 02)
• Compactness of representation
  - Network of B-cells: each cell can recognize several antigens
  - B-cells compressed into clusters/sub-networks
• Fast incremental processing of new data points
  - New antigen influences only the activated sub-network
  - Activated cells updated incrementally
  - Proposed approach learns in 1 pass
• Clear and fast identification of "outliers"
  - New antigen that does not activate any subnetwork is a potential outlier → create a new B-cell to recognize it
  - This new B-cell could grow into a subnetwork (if it is stimulated by a new trend) or die/move to disk (if outlier)
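
The outlier rule on this slide (an antigen that activates no subnetwork gets a new B-cell) can be illustrated with a minimal Python sketch. The slides do not give the activation function, so the Gaussian form, the `sigma` scale, the `threshold` value, and the function names below are all assumptions made for illustration, not the authors' formulation.

```python
import math

def activation(antigen, cell, sigma):
    """Assumed soft activation: Gaussian of the squared Euclidean distance."""
    d2 = sum((a - c) ** 2 for a, c in zip(antigen, cell))
    return math.exp(-d2 / (2.0 * sigma ** 2))

def present(antigen, subnetworks, sigma=1.0, threshold=0.1):
    """Return (subnetwork index, is_potential_outlier) for one antigen.

    subnetworks: list of non-empty lists of B-cell positions (tuples).
    If no B-cell reaches `threshold`, the antigen is treated as a potential
    outlier and a new single-cell subnetwork is created to recognize it.
    """
    best_idx, best_act = None, 0.0
    for idx, cells in enumerate(subnetworks):
        act = max(activation(antigen, c, sigma) for c in cells)
        if act > best_act:
            best_idx, best_act = idx, act
    if best_idx is None or best_act < threshold:
        subnetworks.append([tuple(antigen)])  # new B-cell for the outlier
        return len(subnetworks) - 1, True
    return best_idx, False

nets = [[(0.0, 0.0), (0.5, 0.0)], [(10.0, 10.0)]]
print(present((0.2, 0.1), nets))     # close to the first subnetwork
print(present((50.0, -50.0), nets))  # far from every B-cell
```

Only the B-cells of existing subnetworks are compared against the antigen here; whether the new cell later grows into a subnetwork or is moved to disk depends on later stimulation, which this sketch does not model.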

General Architecture
[Diagram. Recoverable labels: Evolving data; 1-Pass Adaptive Immune Learning; Evolving Immune Network (compressed into subnetworks); Immune network information system; Stimulation (competition & memory); Age (old vs. new); Outliers (based on activation)]

Internal and External Immune Interactions: Before & After
[Figure. Recoverable labels: Internal Stimulation; External Stimulation; Lifeline of B-cell]

Continuous Immune Learning
[Flowchart. Box labels, in the order they appear:]
1. Trap initial data
2. Initialize ImmuNet and MaxLimit
3. Compress ImmuNet into K subNets
4. Present NEW antigen data
5. Identify nearest subNet*
6. Compute soft activations in subNet*
7. Activates ImmuNet?
   - Yes: update subNet*'s ARB influence range/scale; update subNet*'s ARBs' stimulations
   - No: clone antigen (Outlier?)
8. Clone and mutate ARBs
9. Kill lethal ARBs
10. #ARBs > MaxLimit?
   - Yes: kill extra ARBs (based on age / stimulation strategy) OR increase acuteness of competition OR move oldest patterns to aux. storage
   - No: compress ImmuNet
11. ImmuNet stats & visualization
Other labels: Start/Reset; Memory Constraints; Domain Knowledge Constraints; Secondary storage
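
A simplified Python sketch of the core loop in this flowchart may help fix ideas. It covers only presenting an antigen, computing soft activations, updating stimulation and influence scale, cloning the antigen when nothing is activated, and killing extra ARBs beyond MaxLimit. Compression into subNets, ARB cloning/mutation, lethal-ARB removal, and secondary storage are left out. The Gaussian activation, the decay factor, the running-average scale update, and every name and default below are assumptions for illustration; the slides do not specify the actual update equations.

```python
import math
from dataclasses import dataclass

@dataclass
class ARB:
    center: tuple
    sigma2: float = 1.0       # influence range/scale (assumed default)
    stimulation: float = 1.0  # assumed initial stimulation
    age: int = 0

def soft_activation(x, arb):
    """Assumed Gaussian activation; also returns the squared distance."""
    d2 = sum((xi - ci) ** 2 for xi, ci in zip(x, arb.center))
    return math.exp(-d2 / (2.0 * arb.sigma2)), d2

def learn_one_pass(stream, max_limit=10, min_activation=0.05, decay=0.9):
    """Single pass over the stream; returns (ARB list, number of ARBs created)."""
    immunet, created = [], 0
    for x in stream:
        for arb in immunet:
            arb.age += 1
            arb.stimulation *= decay  # older evidence counts less
        activated = []
        for arb in immunet:
            w, d2 = soft_activation(x, arb)
            if w >= min_activation:
                activated.append((arb, w, d2))
        if activated:
            # Incrementally update scale and stimulation of activated ARBs only.
            for arb, w, d2 in activated:
                total = arb.stimulation + w
                arb.sigma2 = max((arb.stimulation * arb.sigma2 + w * d2) / total, 1e-6)
                arb.stimulation = total
        else:
            # Nothing activated: clone the antigen as a new ARB (potential outlier).
            immunet.append(ARB(center=tuple(x)))
            created += 1
        if len(immunet) > max_limit:
            # Kill extra ARBs, keeping the most stimulated ones.
            immunet.sort(key=lambda a: a.stimulation, reverse=True)
            del immunet[max_limit:]
    return immunet, created

stream = [(0, 0), (0.1, 0), (5, 5), (5.1, 4.9), (0, 0.1), (100, 100)]
net, created = learn_one_pass(stream)
print(len(net), created)
```

Pruning by lowest stimulation is just one of the strategies the flowchart lists; an age-based rule would sort on `age` instead.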