Maximal Component detection in graphs using swarm-based and genetic algorithm
Antonio Gonzalez-Pardo, David Camacho
David Camacho, david.camacho@uam.es
Applied Intelligence & Data Analysis, http://aida.ii.uam.es
Universidad Autónoma de Madrid
Outline
Introduction
Goals
Bio-inspired approaches applied to subgraph detection
The ACO approach
The GA approach
Experiments
Problems observed
Why have we done this work?
Introduction
Introduction
Recent decades have shown a growing interest in Collective Intelligence and Evolutionary Algorithms.
Collective Intelligence (CI) uses concepts drawn from the social behavior observed in self-organizing communities such as ants, bees, or bacteria.
Evolutionary Algorithms (EA) are based on evolution: new individuals are generated from the characteristics of their parents.
Introduction
Both families of algorithms share some similarities:
They are based on heuristic search.
A population explores the solution space of the modelled problem.
They are usually applied to NP-complete or NP-hard problems.
Some problems where EA and CI are applied:
Routing problems
Constraint Satisfaction Problems
Scheduling problems
Goals
Goals
1. In this work we make an experimental comparison between a classical genetic-based approach and a swarm-based strategy, both applied to detecting the maximal connected component of a graph.
2. The maximal connected component is the largest set of nodes of a graph such that, from any node in the set, there exists a path to every other node in the same set.
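For reference, the target defined above can be computed exactly with a breadth-first search. The sketch below is a deterministic baseline, not the ACO or GA method compared in this work; the adjacency-dictionary representation and the function names are assumptions made for illustration.

```python
from collections import deque


def connected_components(adjacency):
    """Return the connected components of an undirected graph as a list of sets.

    `adjacency` maps each node to the set of its neighbours (assumed symmetric).
    """
    seen = set()
    components = []
    for start in adjacency:
        if start in seen:
            continue
        # Breadth-first search from an unseen node collects one component.
        component = {start}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in component:
                    component.add(neighbour)
                    queue.append(neighbour)
        seen |= component
        components.append(component)
    return components


def largest_component(adjacency):
    """Return the component with the most nodes (graph must be non-empty)."""
    return max(connected_components(adjacency), key=len)


# Hypothetical toy graph with three components.
example = {
    "a": {"b"}, "b": {"a", "c"}, "c": {"b"},
    "d": {"e"}, "e": {"d"},
    "f": set(),
}
```

On the toy graph, `largest_component(example)` returns the set containing `a`, `b`, and `c`.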
Bio-inspired approaches applied to subgraph detection
The ACO approach
The ACO approach
In this approach, ants (agents) travel through the network trying to visit all the nodes.
Each ant starts at a random node.
Initially, all neighbours of the current node have the same probability of being selected, but this probability changes during execution according to the pheromone values deposited by the ants.
The ACO approach
Once an ant decides which node to visit next, it deposits pheromone on the graph.
Ants sense pheromones while travelling through the network and are attracted to trails with high pheromone concentrations (values).
There is also an evaporation process that reduces the pheromone concentrations in the graph. This process helps the colony forget bad decisions taken previously.
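The deposit and evaporation steps can be sketched as follows. This is a minimal illustration, assuming pheromone is stored per undirected edge in a dictionary; the deposit amount and the evaporation rate are hypothetical parameters that the slides do not specify.

```python
def deposit(pheromone, i, j, amount=1.0):
    """Add pheromone to the undirected edge between nodes i and j.

    `amount` is an assumed constant; the slides give no value for it.
    """
    edge = frozenset((i, j))
    pheromone[edge] = pheromone.get(edge, 0.0) + amount


def evaporate(pheromone, rate=0.1):
    """Reduce every pheromone concentration by the fraction `rate` (assumed)."""
    for edge in pheromone:
        pheromone[edge] *= 1.0 - rate


# Two ants cross the same edge, then one evaporation step is applied.
trail = {}
deposit(trail, "a", "b")
deposit(trail, "b", "a")
evaporate(trail, rate=0.5)
```

Keying the dictionary by `frozenset` makes the edge `a-b` and the edge `b-a` share one pheromone value, which matches an undirected graph.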
The ACO approach
Given an ant located in node i, the probability of travelling from node i to node j is defined in terms of:
the pheromone concentration between nodes i and j;
a factor that is 0 if node j has already been visited by the ant;
the neighborhood of node i.
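The formula itself is not legible here, so the sketch below uses one standard form that is consistent with the three terms listed: $p_{ij} = \tau_{ij}\,\eta_{ij} / \sum_{k \in N_i} \tau_{ik}\,\eta_{ik}$. The symbols $\tau$ (pheromone), $\eta$ (0 for visited nodes, 1 otherwise), and $N_i$ (neighborhood) are assumed names, and any exponents the original formula may have used are omitted.

```python
import random


def transition_probabilities(current, neighbours, pheromone, visited):
    """Probability of moving from `current` to each of its neighbours.

    Assumed rule: weight = tau * eta, where eta is 0 for visited nodes.
    Edges with no recorded pheromone default to 1.0, so all neighbours
    start out equally likely.
    """
    weights = {}
    for j in neighbours[current]:
        eta = 0.0 if j in visited else 1.0
        tau = pheromone.get(frozenset((current, j)), 1.0)
        weights[j] = tau * eta
    total = sum(weights.values())
    if total == 0.0:
        return {}  # every neighbour has already been visited
    return {j: w / total for j, w in weights.items()}


def choose_next(current, neighbours, pheromone, visited, rng=random):
    """Sample the next node, or return None when the ant is stuck."""
    probs = transition_probabilities(current, neighbours, pheromone, visited)
    if not probs:
        return None
    nodes = list(probs)
    return rng.choices(nodes, weights=[probs[j] for j in nodes], k=1)[0]


# Hypothetical example: edge a-b carries three times the default pheromone.
graph = {"a": ["b", "c", "d"]}
trail = {frozenset(("a", "b")): 3.0}
probs = transition_probabilities("a", graph, trail, visited={"d"})
```

Here `probs` assigns 0.75 to `b`, 0.25 to `c`, and 0.0 to the visited node `d`.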
The ACO approach
Bio-inspired approaches applied to subgraph detection
The GA approach
The GA approach
In this approach, individuals (agents) represent possible solutions to the problem.
Each individual contains a phenotype made up of a set of genes, each holding a node name.
Both the length of the phenotype and the gene values are randomly selected.
The phenotype represents a possible path that is evaluated against the graph. This evaluation yields a fitness value, which is used to generate the next population.
The GA approach
The fitness function is the same as the one used in ACO to determine the goodness of a path.
The validation process is performed by a greedy algorithm that starts by visiting the first node in the phenotype.
From the current node's neighbours, the algorithm discards those already visited and selects one at random from the rest, giving nodes that belong to the phenotype a higher probability of being selected.
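The greedy validation walk can be sketched as below. The `bias` factor that favours nodes from the phenotype is a hypothetical parameter, since the slides only say such nodes are more likely to be chosen; the fitness computation is left out because its formula is not given here.

```python
import random


def validate(phenotype, neighbours, bias=3.0, rng=random):
    """Walk the graph greedily, starting from the first gene of the phenotype.

    At each step, visited neighbours are discarded and one of the remaining
    neighbours is drawn at random. Nodes named in the phenotype get weight
    `bias` (assumed value), all other nodes get weight 1.
    Returns the list of visited nodes in order.
    """
    if not phenotype:
        return []
    preferred = set(phenotype)
    current = phenotype[0]
    path = [current]
    visited = {current}
    while True:
        candidates = [n for n in neighbours[current] if n not in visited]
        if not candidates:
            return path  # dead end: no unvisited neighbour left
        weights = [bias if n in preferred else 1.0 for n in candidates]
        current = rng.choices(candidates, weights=weights, k=1)[0]
        path.append(current)
        visited.add(current)


# Hypothetical line graph a - b - c: the walk has only one possible outcome.
line = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
walk = validate(["a", "c"], line)
```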
The GA approach
The generation process is composed of two operators:
Crossover. This operator generates new individuals from the parents' phenotypes. A random crossover point is selected for each parent, and the corresponding parts are interchanged. If a new individual would inherit the same gene from both parents, the child inherits it only once.
Mutation. This second operator is used to escape from local optima. Here the value of a gene is changed randomly.
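A minimal sketch of the two operators follows, assuming a phenotype is a list of node names. The mutation rate and the way duplicate genes are dropped (keeping the first occurrence) are assumptions, not details taken from the slides.

```python
import random


def dedupe(genes):
    """Keep the first occurrence of each gene, preserving order."""
    return list(dict.fromkeys(genes))


def crossover(parent_a, parent_b, rng=random):
    """Cut each parent at its own random point and swap the tails.

    A gene inherited from both parents is kept only once in the child.
    Children may differ in length from their parents.
    """
    cut_a = rng.randint(0, len(parent_a))
    cut_b = rng.randint(0, len(parent_b))
    child_1 = dedupe(parent_a[:cut_a] + parent_b[cut_b:])
    child_2 = dedupe(parent_b[:cut_b] + parent_a[cut_a:])
    return child_1, child_2


def mutate(individual, node_names, rate=0.1, rng=random):
    """Replace each gene by a random node name with probability `rate` (assumed).

    Note: this simple version does not re-check for duplicate genes.
    """
    return [
        rng.choice(node_names) if rng.random() < rate else gene
        for gene in individual
    ]


# Hypothetical parents that share the gene "c".
first, second = crossover(["a", "b", "c"], ["c", "d"], rng=random.Random(1))
```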
Experiments
Network used
ACO and GA have been executed on Random Graphs and Small World Graphs.
Random Graph: the creation of each edge depends on a probability (p).
Small World Graph: defined by a connectivity degree (k) and a redirection probability (p).
The Random Graph was selected because it is easy to implement and convenient for testing our software.
The Small World Graph is studied because it is the model most commonly used for communication networks, owing to its characteristics.
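The two graph models can be generated with the standard library alone. The sketch below assumes undirected graphs, an independent edge probability for the random graph, and a ring lattice with rewiring (in the style of Watts and Strogatz) for the small world graph, where k counts the neighbours on each side of a node. The exact generators used in the experiments are not described, so these are illustrative choices.

```python
import random


def random_graph(n, p, rng=random):
    """Undirected graph on n nodes; each possible edge exists with probability p."""
    adjacency = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                adjacency[i].add(j)
                adjacency[j].add(i)
    return adjacency


def small_world_graph(n, k, p, rng=random):
    """Ring lattice (k neighbours per side, assuming k < n / 2) with rewiring.

    Each lattice edge is redirected to a random new endpoint with probability p.
    """
    adjacency = {i: set() for i in range(n)}
    for i in range(n):
        for offset in range(1, k + 1):
            j = (i + offset) % n
            adjacency[i].add(j)
            adjacency[j].add(i)
    for i in range(n):
        for offset in range(1, k + 1):
            j = (i + offset) % n
            if j in adjacency[i] and rng.random() < p:
                # Only nodes that are not i and not already linked to i qualify.
                candidates = [c for c in range(n) if c != i and c not in adjacency[i]]
                if candidates:
                    new_j = rng.choice(candidates)
                    adjacency[i].discard(j)
                    adjacency[j].discard(i)
                    adjacency[i].add(new_j)
                    adjacency[new_j].add(i)
    return adjacency


# With k = 1 and no rewiring, every node has exactly 2 connections.
ring = small_world_graph(10, 1, 0.0)
```

Rewiring replaces one edge with another, so the total number of edges stays at n times k whatever the value of p.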
Discovering the Connected Components
Goal: to compare which algorithm discovers a greater number of different connected components.
Each experiment is executed on graphs of 10 nodes for 10 generations, and is repeated 10 times.
Small World networks have a connectivity degree of 1 (i.e. each node has 2 output connections).
Discovering the Connected Components
Graph        Pop. Size  p    # Comp (ACO)  # Comp (GA)
Small World  5          0.1  21            87
Small World  5          0.9  25            124
Small World  10         0.1  14            114
Small World  10         0.9  18            163
Random       5          0.1  10            25
Random       5          0.9  39            549
Random       10         0.1  10            18
Random       10         0.9  74            1093
Conclusions
GA finds more different connected components, but this is because GA does not take into account the order of appearance. That is, (N_a, N_b, N_c) is different to (N_b, N_a, N_c).
In the following experiments only ACO is used, because this algorithm does not show this behaviour.
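The order-of-appearance effect noted in the conclusions can be illustrated by counting the same results in two ways. This is a small sketch with hypothetical node names and function name; it is not the counting code used in the experiments.

```python
def count_distinct(paths, ordered):
    """Count distinct results among `paths`.

    With ordered=True, two paths over the same nodes in a different order
    count as different. With ordered=False, they count as the same component.
    """
    if ordered:
        return len({tuple(path) for path in paths})
    return len({frozenset(path) for path in paths})


# Hypothetical results: the first two paths cover the same three nodes.
found = [("Na", "Nb", "Nc"), ("Nb", "Na", "Nc"), ("Na", "Nb")]
as_sequences = count_distinct(found, ordered=True)
as_sets = count_distinct(found, ordered=False)
```

Counting sequences gives 3 results, while counting node sets gives 2, which shows how order-sensitive counting inflates the number of components.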
Graphical result
[Figure panels: Initial Network, Connected Components]
Influence of the number of edges
Experimental Set-up
Graph: Small World
# Nodes: 100
Redirec. Prob.: 0.15
# Ants: 50
# Iterations: 100
Results
[Figure]
Conclusion
The higher the number of edges, the more steps are needed to find the connected components.
Influence of the number of steps
Experimental Set-up
Graph: Small World
# Nodes: 1000
Connect. Degree: 50
Redirec. Prob.: 0.15
# Ants: 100
Results
[Figure]
Conclusion
As ants do not transmit information about their path, the number of steps must be equal to or greater than the number of nodes.
Problems observed
Problems observed
The Genetic Algorithm needs a grouping algorithm to discover partial solutions contained in bigger ones.
The Genetic Algorithm has no mechanism to ensure that good phenotype blocks are transmitted.
Ant Colony Optimization discovers connected components without branches, because ants do not communicate directly with each other.
[Figure: paths labelled P1 and P2]
Thanks for your attention…
But… is that all?
Why have we done this work?
Why have we done this work?
We have applied a classic Genetic Algorithm and a classic Ant Colony Optimization. But very little that is new has been contributed!
The application domain is sub-graph detection. Yes, really, but there are (maybe) millions of algorithms, models, techniques and tools to study this problem!
Why have we done this work?
This was the initial step in a larger research effort whose main goal is to apply Collective Intelligence algorithms to Constraint Satisfaction Problems.
CSP and ACO (Khan et al. 2009)
Khan et al. (2009) solve the n-queens problem with ACO.
For the n-queens problem, they model a graph composed of n layers, each with n² nodes.
That is, the layers represent the queens, and each layer represents the whole board.
CSP and ACO (Khan et al. 2009)
Ants travel from a node in layer X to a node in layer (X+1), indicating in which square each queen is located.
CSP and ACO (Solnon 2002)
In this case, the graph is fully connected and each node represents a tuple <variable, value>.
With this approach, there are n queens, each with 2 variables that define the queen's coordinates, and each variable can take n different values.
CSP and ACO (Solnon 2002)
[Figure panels: 3-Queens, 4-Queens, 5-Queens, 8-Queens]
Our approach
Our approach is based on a graph where each node represents one element of the problem. For the N-queens problem, the graph has only N nodes.
Each node holds all the variables that involve its element (i.e. in the N-queens problem, each node has only 2 variables, which represent the position of the queen).
The number of edges depends on the problem, because two nodes are connected if at least one restriction involves a variable from each of the two elements.
Our approach
With our approach, ants do not only navigate through the network indicating that a queen is located in a given square, as in Khan et al. and Solnon. Instead, our ants assign values to the variables contained in the nodes, taking into account the restrictions stored in the edges.
With this approach the network is drastically reduced and the system is scalable.
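The size reduction can be made concrete by counting nodes under each graph model. The counts below follow only the descriptions given on these slides (n layers of n² nodes, one node per <variable, value> tuple with 2 variables per queen, and one node per queen); they are an illustrative reading and are not taken from the cited papers. The function names are hypothetical.

```python
def khan_nodes(n):
    """n layers, each holding one node per board square (n * n squares)."""
    return n * n ** 2


def solnon_nodes(n):
    """One node per <variable, value> tuple: n queens, 2 variables, n values."""
    return n * 2 * n


def proposed_nodes(n):
    """One node per queen; the 2 coordinate variables live inside the node."""
    return n


# Node counts for the 8-queens problem under each description.
sizes = {
    "Khan et al. 2009": khan_nodes(8),
    "Solnon 2002": solnon_nodes(8),
    "Our approach": proposed_nodes(8),
}
```

For 8 queens this gives 512, 128, and 8 nodes respectively, so the node count grows cubically, quadratically, and linearly with n.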
Our approach (8-queens)
[Figure panels: Khan et al. 2009, Solnon 2002, Our approach]