Upload
others
View
7
Download
0
Embed Size (px)
Citation preview
LearningCombinatorialOptimizationAlgorithmsoverGraphsHanjun Dai*,EliasB.Khalil*, Yuyu Zhang, Bistra Dilkina,LeSong(*equalcontribution)
GeorgiaInstituteofTechnology Mailto:{hanjun.dai,elias.khalil,yuyu.zhang,bdilkina,lsong}@cc.gatech.edu
ProblemStatement
ImplementQ-functionwithstructure2vec
ReinforcementLearningFormulation Experiments
Given a graph optimization problem 𝑮 and a distribution 𝓓 of problem instances, can we learn better greedy
heuristics that generalize to unseen instances from 𝒟?
Minimum Vertex Cover
Tackling NPC problems Design rationale ExampleExact algorithms Tight formulations, IP solvers CPLEX
Approximation algorithms Worst-case guarantees Edge-pickingHeuristics Empirical performance Degree-greedy
Can classical algorithms exploit the common distribution of instances?
Minimum VertexCover (MVC)
Maximum Cut (MAXCUT)
Traveling Salesman Problem (TSP)
Synthetic data
Random gaphs Erdos-Renyi (ER) or Barabasi-Albert (BA) ER or BA
DIMACS generator; uniform grid or clustered
Statistics(# nodes)
training: 15 ~ 500test: 15 ~ 1200
training: 15 ~ 300test: 15 ~ 1200
training: 15 ~ 300test: 15 ~ 1200
Real-world dataDatasets MemeTracker Physics TSPLIB
Visualization
Statistics 1 graph, 960 nodes, 5000 edges
10 graphs, 125 nodes, 375 edges
38 graphs, 51 to 318 nodes
Opt obtained ILP with CPLEX IQP with CPLEX Concorde
MVC- BA MAXCUT- BA
TSP- ClusteredWe can see that S2V-DQN achieves a very low approximation ratio. Note that the “optimal" valueused in the computation of approximation ratios may not be truly optimal (due to the solver time cutoffat 1 hour), and so CPLEX’s and Concorde’s solution do typically get worse as problem size grows.This is why sometimes we can even get better approximation ratio on larger graphs.5.3 Scalability & Trade-off between running time and approximation ratioTo construct a solution on a test graph, our algorithm has polynomial complexity of O(k|E|) where kis number of greedy steps (at most the number of nodes |V |) and |E| is number of edges. For instance,on graphs with 1200 nodes, we can find the solution of MVC within 11 seconds using a single GPU,while getting an approximation ratio of 1.0062. For dense graphs, we can also sample the edges forthe graph embedding computation to save time, a measure we will investigate in the future.Figure 3 illustrates the approximation ratios of various approaches as a function of running time.All algorithms report a single solution at termination, whereas CPLEX reports multiple improvingsolutions, for which we recorded the corresponding running time and approximation ratio. Figure D.3(Appendix D.7) includes other graph sizes and types, where the results are consistent with Figure 3.
(a) MVC BA 200-300 (b) MAXCUT BA 200-300
Figure 3: Time-approximationtrade-off for MVC and MAX-CUT. In this figure, each dotrepresents a solution found fora single problem instance. ForCPLEX, we also record the timeand quality of each solution itfinds, e.g. CPLEX-1st means thefirst feasible solution found byCPLEX.
Figure 3 shows that, for MVC, we are slightly slower than the approximation algorithms but enjoya much better approximation ratio. Also note that although CPLEX found the first feasible solutionquickly, it also has much worse ratio; the second improved solution found by CPLEX takes similaror longer time than our S2V-DQN, but is still of worse quality. For MAXCUT, the observations arestill consistent. One should be aware that sometimes our algorithm can obtain better results than1-hour CPLEX,which gives ratios below 1.0. Furthermore, sometimes S2V-DQN is even faster thanthe MaxcutApprox, although this comparison is not exactly fair, since we use GPUs; however, wecan still see that our algorithm is efficient.5.4 Experiments on real-world datasetsIn addition to the experiments for synthetic data, we identified sets of publicly available benchmarkor real-world instances for each problem, and performed experiments on them. A summary of resultsis in Table 3, and details are given in Appendix C. S2V-DQN significantly outperforms all competingmethods for MVC, MAXCUT and TSP.
Table 3: Realistic data experiments, results summary. Values are average approximation ratios.
Problem Dataset S2V-DQN Best Competitor 2nd Best CompetitorMVC MemeTracker 1.0021 1.2220 (MVCApprox-Greedy) 1.4080 (MVCApprox)MAXCUT Physics 1.0223 1.2825 (MaxcutApprox) 1.8996 (SDP)TSP TSPLIB 1.0475 1.0947 (2-opt) 1.1771 (Cheapest)
5.5 Discovery of interesting new algorithmsWe further examined the algorithms learned by S2V-DQN, and tried to interpret what greedy heuristicshave been learned. We found that S2V-DQN is able to discover new and interesting algorithms whichintuitively make sense but have not been analyzed before. For instance, S2V-DQN discovers analgorithm for MVC where nodes are selected to balance between their degrees and the connectivity ofthe remaining graph (Appendix Figure D.4 and D.7). For MAXCUT, S2V-DQN discovers an algorithmwhere nodes are picked to avoid cancelling out existing edges in the cut set (Appendix Figure D.5).These results suggest that S2V-DQN may also be a good assistive tool for discovering new algorithms,especially in cases when the graph optimization problems are new and less well-studied.
6 ConclusionsWe presented an end-to-end machine learning framework for automatically designing greedy heuristicsfor hard combinatorial optimization problems on graphs. Central to our approach is the combinationof a deep graph embedding with reinforcement learning. Through extensive experimental evaluation,we demonstrate the effectiveness of the proposed framework in learning greedy heuristics as comparedto manually-designed greedy algorithms. The excellent performance of the learned heuristics isconsistent across multiple different problems, graph types, and graph sizes, suggesting that theframework is a promising new tool for designing algorithms for graph problems.
8
11.0011.0021.0031.0041.0051.0061.007
MVC- BA• Trainonsmallgraphswith50-100nodes,generalizetolargergraphs
S2V-DQN
CPLEX1st
CPLEX2nd
CPLEX3rd CPLEX
4th
2-approx
2-approx+
PNAC
• Generate 200 Barabasi-Albert networks with 300 nodes
• Let CPLEX produces 1st, 2nd, 3rd, 4th
feasible solutions
Minimum Vertex Cover: NP-Complete Problems on Graphs
Background
Definition:Find smallest vertex subset Ss.t., S covers all the edges
Application:Advertising optimizationin social networks
Real-world: Same problem is solved repeatedly with slightly different data
Apr.14 Apr.17 …… timeline
Scale Free Network
~
Not automatically! Typical approach: customize b.n.b./approx./heuristic
Graph Opt. Prob. Greedy Procedures Illustration
Minimum Vertex Cover Insert nodes into cover
Maximum Cut Insert nodes into subset
Traveling Salesman Prob. Insert nodes into sub-tour
ProposedFramework
min()∈ +,-
/𝑥1
�
1∈𝓥𝑠. 𝑡. 𝑥1 + 𝑥8 ≥ 1, ∀ 𝑖, 𝑗 ∈ 𝓔
Repeat until all edges are covered:
1. Compute score for each vertex
2. Select vertex with largest score
3. Add best vertex to cover
Reward: 𝑟@ = −1
State 𝑆: current selected nodes
Action value function: 𝑸E(𝑺, 𝒗)
Greedy policy: 𝑣∗ = 𝑎𝑟𝑔𝑚𝑎𝑥O𝑄Q(𝑆, 𝑣)
Update state 𝑆
degree distribution, triangle counts, distance to tagged nodes, etc. In order to represent such complexphenomena over combinatorial structures, we will leverage a deep learning architecture over graphs,in particular the structure2vec of [5], to parameterize bQ(h(S), v;⇥).
3.1 Structure2Vec
We first provide an introduction to structure2vec. This graph embedding network will computea p-dimensional feature embedding µv for each node v 2 V , given the current partial solution S. Morespecifically, structure2vec defines the network architecture recursively according to an inputgraph structure G, and the computation graph of structure2vec is inspired by graphical modelinference algorithms, where node-specific tags or features xv are aggregated recursively accordingto G’s graph topology. After a few step of recursion, the network will produce a new embedding foreach node, taking into account both graph characteristics and long-range interactions between thesenode features. One variant of the structure2vec architecture will initialize the embedding µ
(0)v
at each node as 0, and for all v 2 V update the embeddings synchronously at each iteration as
µ(t+1)v F
⇣xv, {µ(t)
u }u2N (v), {w(v, u)}u2N (v) ;⇥
⌘, (2)
where N (v) is the set of neighbors of node v in graph G, and F is a generic nonlinear mapping suchas a neural network or kernel function.Based on the update formula, one can see that the embedding update process is carried out based onthe graph topology. A new round of embedding sweeping across the nodes will start only after theembedding update for all nodes from the previous round has finished. It is easy to see that the updatealso defines a process where the node features xv are propagated to other nodes via the nonlinearpropagation function F . Furthermore, the more update iterations one carries out, the farther awaythe node features will propagate and get aggregated nonlinearly at distant nodes. In the end, if oneterminates after T iterations, each node embedding µ
(T )v will contain information about its T -hop
neighborhood as determined by graph topology, the involved node features and the propagationfunctino F . An illustration of 2 iterations of graph embedding can be found in Figure 1.
3.2 Parameterizing bQ(h(S), v;⇥)
We now discuss the parameterization of bQ(h(S), v;⇥) using the embeddings fromstructure2vec.In particular, we design F to update a p-dimensional embedding µv as
µ(t+1)v relu
�✓1xv + ✓2
Xu2N (v)
µ(t)u + ✓3
Xu2N (v)
relu(✓4 w(v, u))�, (3)
where ✓1 2 Rp, ✓2, ✓3 2 Rp⇥p and ✓4 2 Rp are the model parameters, and relu is the rectifiedlinear unit(relu(z) = z if z > 0 and 0 otherwise) applied elementwise to its input. The summationover neighbors is one way of aggregating neighborhood information invariant to the permutation ofneighbor ordering. For simplicity of exposition, xv here is a binary scalar as described earlier; it isstraightforward to extend xv to a vector representation by incorporating useful node information. Tomake the nonlinear transformations more powerful, we can add some more layers of relu before wepool over the neighboring embeddings µu.Once the embedding for each node is computed after T iterations, we will use these embeddings todefine the bQ(h(S), v;⇥) function. More specifically, we will use the embeddingµ(T )
v for node v and thepooled embedding over the entire graph,
Pu2V µ
(T )u , as the surrogates for v and h(S), respectively, i.e.
bQ(h(S), v;⇥) = ✓>5 relu([✓6X
u2Vµ(T )u , ✓7 µ
(T )v ]) (4)
where ✓5 2 R2p, ✓6, ✓7 2 Rp⇥p and [·, ·] is the concatenation operator. Since the embedding µ(T )u
is computed based on the parameters from the graph embedding network, bQ(h(S), v) will dependon a collection of 7 parameters ⇥ = {✓i}7i=1. The number of iterations T for the graph embeddingcomputation is usually small, such as T = 4.The parameters ⇥ will be learned. Previously, [5] required a ground truth label for every input graphG in order to train the structure2vec architecture. There, the output of the embedding is linkedwith a softmax-layer, so that the parameters can by trained end-to-end by minimizing the cross-entropyloss. This approach is not applicable to our case due to the lack of training labels. Instead, we trainthese parameters together end-to-end using reinforcement learning.
4
degree distribution, triangle counts, distance to tagged nodes, etc. In order to represent such complexphenomena over combinatorial structures, we will leverage a deep learning architecture over graphs,in particular the structure2vec of [5], to parameterize bQ(h(S), v;⇥).
3.1 Structure2Vec
We first provide an introduction to structure2vec. This graph embedding network will computea p-dimensional feature embedding µv for each node v 2 V , given the current partial solution S. Morespecifically, structure2vec defines the network architecture recursively according to an inputgraph structure G, and the computation graph of structure2vec is inspired by graphical modelinference algorithms, where node-specific tags or features xv are aggregated recursively accordingto G’s graph topology. After a few step of recursion, the network will produce a new embedding foreach node, taking into account both graph characteristics and long-range interactions between thesenode features. One variant of the structure2vec architecture will initialize the embedding µ
(0)v
at each node as 0, and for all v 2 V update the embeddings synchronously at each iteration as
µ(t+1)v F
⇣xv, {µ(t)
u }u2N (v), {w(v, u)}u2N (v) ;⇥
⌘, (2)
where N (v) is the set of neighbors of node v in graph G, and F is a generic nonlinear mapping suchas a neural network or kernel function.Based on the update formula, one can see that the embedding update process is carried out based onthe graph topology. A new round of embedding sweeping across the nodes will start only after theembedding update for all nodes from the previous round has finished. It is easy to see that the updatealso defines a process where the node features xv are propagated to other nodes via the nonlinearpropagation function F . Furthermore, the more update iterations one carries out, the farther awaythe node features will propagate and get aggregated nonlinearly at distant nodes. In the end, if oneterminates after T iterations, each node embedding µ
(T )v will contain information about its T -hop
neighborhood as determined by graph topology, the involved node features and the propagationfunctino F . An illustration of 2 iterations of graph embedding can be found in Figure 1.
3.2 Parameterizing bQ(h(S), v;⇥)
We now discuss the parameterization of bQ(h(S), v;⇥) using the embeddings fromstructure2vec.In particular, we design F to update a p-dimensional embedding µv as
µ(t+1)v relu
�✓1xv + ✓2
Xu2N (v)
µ(t)u + ✓3
Xu2N (v)
relu(✓4 w(v, u))�, (3)
where ✓1 2 Rp, ✓2, ✓3 2 Rp⇥p and ✓4 2 Rp are the model parameters, and relu is the rectifiedlinear unit(relu(z) = z if z > 0 and 0 otherwise) applied elementwise to its input. The summationover neighbors is one way of aggregating neighborhood information invariant to the permutation ofneighbor ordering. For simplicity of exposition, xv here is a binary scalar as described earlier; it isstraightforward to extend xv to a vector representation by incorporating useful node information. Tomake the nonlinear transformations more powerful, we can add some more layers of relu before wepool over the neighboring embeddings µu.Once the embedding for each node is computed after T iterations, we will use these embeddings todefine the bQ(h(S), v;⇥) function. More specifically, we will use the embeddingµ(T )
v for node v and thepooled embedding over the entire graph,
Pu2V µ
(T )u , as the surrogates for v and h(S), respectively, i.e.
bQ(h(S), v;⇥) = ✓>5 relu([✓6X
u2Vµ(T )u , ✓7 µ
(T )v ]) (4)
where ✓5 2 R2p, ✓6, ✓7 2 Rp⇥p and [·, ·] is the concatenation operator. Since the embedding µ(T )u
is computed based on the parameters from the graph embedding network, bQ(h(S), v) will dependon a collection of 7 parameters ⇥ = {✓i}7i=1. The number of iterations T for the graph embeddingcomputation is usually small, such as T = 4.The parameters ⇥ will be learned. Previously, [5] required a ground truth label for every input graphG in order to train the structure2vec architecture. There, the output of the embedding is linkedwith a softmax-layer, so that the parameters can by trained end-to-end by minimizing the cross-entropyloss. This approach is not applicable to our case due to the lack of training labels. Instead, we trainthese parameters together end-to-end using reinforcement learning.
4
degree distribution, triangle counts, distance to tagged nodes, etc. In order to represent such complexphenomena over combinatorial structures, we will leverage a deep learning architecture over graphs,in particular the structure2vec of [5], to parameterize bQ(h(S), v;⇥).
3.1 Structure2Vec
We first provide an introduction to structure2vec. This graph embedding network will computea p-dimensional feature embedding µv for each node v 2 V , given the current partial solution S. Morespecifically, structure2vec defines the network architecture recursively according to an inputgraph structure G, and the computation graph of structure2vec is inspired by graphical modelinference algorithms, where node-specific tags or features xv are aggregated recursively accordingto G’s graph topology. After a few step of recursion, the network will produce a new embedding foreach node, taking into account both graph characteristics and long-range interactions between thesenode features. One variant of the structure2vec architecture will initialize the embedding µ
(0)v
at each node as 0, and for all v 2 V update the embeddings synchronously at each iteration as
µ(t+1)v F
⇣xv, {µ(t)
u }u2N (v), {w(v, u)}u2N (v) ;⇥
⌘, (2)
where N (v) is the set of neighbors of node v in graph G, and F is a generic nonlinear mapping suchas a neural network or kernel function.Based on the update formula, one can see that the embedding update process is carried out based onthe graph topology. A new round of embedding sweeping across the nodes will start only after theembedding update for all nodes from the previous round has finished. It is easy to see that the updatealso defines a process where the node features xv are propagated to other nodes via the nonlinearpropagation function F . Furthermore, the more update iterations one carries out, the farther awaythe node features will propagate and get aggregated nonlinearly at distant nodes. In the end, if oneterminates after T iterations, each node embedding µ
(T )v will contain information about its T -hop
neighborhood as determined by graph topology, the involved node features and the propagationfunctino F . An illustration of 2 iterations of graph embedding can be found in Figure 1.
3.2 Parameterizing bQ(h(S), v;⇥)
We now discuss the parameterization of bQ(h(S), v;⇥) using the embeddings fromstructure2vec.In particular, we design F to update a p-dimensional embedding µv as
µ(t+1)v relu
�✓1xv + ✓2
Xu2N (v)
µ(t)u + ✓3
Xu2N (v)
relu(✓4 w(v, u))�, (3)
where ✓1 2 Rp, ✓2, ✓3 2 Rp⇥p and ✓4 2 Rp are the model parameters, and relu is the rectifiedlinear unit(relu(z) = z if z > 0 and 0 otherwise) applied elementwise to its input. The summationover neighbors is one way of aggregating neighborhood information invariant to the permutation ofneighbor ordering. For simplicity of exposition, xv here is a binary scalar as described earlier; it isstraightforward to extend xv to a vector representation by incorporating useful node information. Tomake the nonlinear transformations more powerful, we can add some more layers of relu before wepool over the neighboring embeddings µu.Once the embedding for each node is computed after T iterations, we will use these embeddings todefine the bQ(h(S), v;⇥) function. More specifically, we will use the embeddingµ(T )
v for node v and thepooled embedding over the entire graph,
Pu2V µ
(T )u , as the surrogates for v and h(S), respectively, i.e.
bQ(h(S), v;⇥) = ✓>5 relu([✓6X
u2Vµ(T )u , ✓7 µ
(T )v ]) (4)
where ✓5 2 R2p, ✓6, ✓7 2 Rp⇥p and [·, ·] is the concatenation operator. Since the embedding µ(T )u
is computed based on the parameters from the graph embedding network, bQ(h(S), v) will dependon a collection of 7 parameters ⇥ = {✓i}7i=1. The number of iterations T for the graph embeddingcomputation is usually small, such as T = 4.The parameters ⇥ will be learned. Previously, [5] required a ground truth label for every input graphG in order to train the structure2vec architecture. There, the output of the embedding is linkedwith a softmax-layer, so that the parameters can by trained end-to-end by minimizing the cross-entropyloss. This approach is not applicable to our case due to the lack of training labels. Instead, we trainthese parameters together end-to-end using reinforcement learning.
4
Neighbors’ edge weights
Updating feature vector
Repeat embedding 𝑇 times
𝚯 = 𝜽 𝐢V𝟏𝟕 :modelparameters
Node’s own tag 𝑥O
Neighbors’ features
𝑣 3
0
01
1
Input
NonlinearMappings
Prediction:
degree distribution, triangle counts, distance to tagged nodes, etc. In order to represent such complexphenomena over combinatorial structures, we will leverage a deep learning architecture over graphs,in particular the structure2vec of [5], to parameterize bQ(h(S), v;⇥).
3.1 Structure2Vec
We first provide an introduction to structure2vec. This graph embedding network will computea p-dimensional feature embedding µv for each node v 2 V , given the current partial solution S. Morespecifically, structure2vec defines the network architecture recursively according to an inputgraph structure G, and the computation graph of structure2vec is inspired by graphical modelinference algorithms, where node-specific tags or features xv are aggregated recursively accordingto G’s graph topology. After a few step of recursion, the network will produce a new embedding foreach node, taking into account both graph characteristics and long-range interactions between thesenode features. One variant of the structure2vec architecture will initialize the embedding µ
(0)v
at each node as 0, and for all v 2 V update the embeddings synchronously at each iteration as
µ(t+1)v F
⇣xv, {µ(t)
u }u2N (v), {w(v, u)}u2N (v) ;⇥
⌘, (2)
where N (v) is the set of neighbors of node v in graph G, and F is a generic nonlinear mapping suchas a neural network or kernel function.Based on the update formula, one can see that the embedding update process is carried out based onthe graph topology. A new round of embedding sweeping across the nodes will start only after theembedding update for all nodes from the previous round has finished. It is easy to see that the updatealso defines a process where the node features xv are propagated to other nodes via the nonlinearpropagation function F . Furthermore, the more update iterations one carries out, the farther awaythe node features will propagate and get aggregated nonlinearly at distant nodes. In the end, if oneterminates after T iterations, each node embedding µ
(T )v will contain information about its T -hop
neighborhood as determined by graph topology, the involved node features and the propagationfunctino F . An illustration of 2 iterations of graph embedding can be found in Figure 1.
3.2 Parameterizing bQ(h(S), v;⇥)
We now discuss the parameterization of bQ(h(S), v;⇥) using the embeddings fromstructure2vec.In particular, we design F to update a p-dimensional embedding µv as
µ(t+1)v relu
�✓1xv + ✓2
Xu2N (v)
µ(t)u + ✓3
Xu2N (v)
relu(✓4 w(v, u))�, (3)
where ✓1 2 Rp, ✓2, ✓3 2 Rp⇥p and ✓4 2 Rp are the model parameters, and relu is the rectifiedlinear unit(relu(z) = z if z > 0 and 0 otherwise) applied elementwise to its input. The summationover neighbors is one way of aggregating neighborhood information invariant to the permutation ofneighbor ordering. For simplicity of exposition, xv here is a binary scalar as described earlier; it isstraightforward to extend xv to a vector representation by incorporating useful node information. Tomake the nonlinear transformations more powerful, we can add some more layers of relu before wepool over the neighboring embeddings µu.Once the embedding for each node is computed after T iterations, we will use these embeddings todefine the bQ(h(S), v;⇥) function. More specifically, we will use the embeddingµ(T )
v for node v and thepooled embedding over the entire graph,
Pu2V µ
(T )u , as the surrogates for v and h(S), respectively, i.e.
bQ(h(S), v;⇥) = ✓>5 relu([✓6X
u2Vµ(T )u , ✓7 µ
(T )v ]) (4)
where ✓5 2 R2p, ✓6, ✓7 2 Rp⇥p and [·, ·] is the concatenation operator. Since the embedding µ(T )u
is computed based on the parameters from the graph embedding network, bQ(h(S), v) will dependon a collection of 7 parameters ⇥ = {✓i}7i=1. The number of iterations T for the graph embeddingcomputation is usually small, such as T = 4.The parameters ⇥ will be learned. Previously, [5] required a ground truth label for every input graphG in order to train the structure2vec architecture. There, the output of the embedding is linkedwith a softmax-layer, so that the parameters can by trained end-to-end by minimizing the cross-entropyloss. This approach is not applicable to our case due to the lack of training labels. Instead, we trainthese parameters together end-to-end using reinforcement learning.
4
Compute Q-value
State representation: sum-pooling over nodes Action representation
ExperimentSettings
Generalization on new random graphs with same distribution
approximationratio≈ 1
Methods:• S2V-DQN: our proposed algorithm• PN-AC: Bello et.al. 2016• Greedy heuristics; • Approximation algorithms;
Approximation ratio:
Lower the better
Generalization with different distribution
1
1.005
1.01
1.015
1.02
1.025
MAXCUT- BA
11.021.041.061.081.1
1.12
TSP- Clustered
Convergence
103 104 105
# minibatch training
1
1.05
1.1
1.15
1.2
1.25
1.3
1.35
appr
ox ra
tio
pre-trained
#node-15-20#node-40-50#node-50-100#node-100-200#node-200-300
103 104 105
# minibatch training
1
1.1
1.2
1.3
1.4
1.5
1.6
1.7
1.8
appr
ox ra
tio
pre-trained
#node-15-20#node-40-50#node-50-100#node-100-200#node-400-500
102 103 104
# minibatch training
1
1.02
1.04
1.06
1.08
1.1
1.12
1.14
appr
ox ra
tio
#node-15-20#node-40-50#node-50-100#node-100-200#node-200-300
MVC- BA
MAXCUT- BA
TSP- Clustered
Experiments on real-world graphs
Time-Solution Tradeoff Unique Heuristic Discovered(1) (2)
(3) (4)
(5)
(10)
(6)
(7) (8)
(9)
Our method keeps the connectivity!