\section{Methods} \raggedbottom
We represent a leaf as an undirected graph $G = (V,E)$. Each vertex $v \in V$ represents a leaf cell, and one vertex $v_{root}$ is predefined as the root. Leaf cells are connected to their neighboring cells via plasmodesmata; these connections are represented by the edges $E$. We then look for a minimum set of nodes such that the whole leaf can still be supplied with water and the nutrients can be evacuated. For this purpose the vein cells need to be connected and the root cell needs to be part of the solution. (Maybe already reference the DS at this point)\\
(Somehow mention that the non-vein cells do not have to be directly adjacent to the vein cells.) \\
We use a node-based ILP formulation to solve this special variant of the dominating set problem. We start by introducing a formulation for the general $k$-hop dominating set. As the objective function for our special variant remains the same, we then add constraints step by step until we can present an ILP formulation for the rooted connected $k$-hop dominating set.\\
As our implementation is node based, we omit decision variables for edges and instead only assign a variable $x_v \in \{0,1\}$ to every $v \in V$, where $x_v = 1 \Leftrightarrow v \in DS$.
\subsection{Minimum Dominating Set}
As we try to minimize the number of vertices in the dominating set, our ILP is given as:\\
\textit{objective target}:
\begin{equation} \label{obj} \min \sum_{v \in V}{x_v} \end{equation}
...
\begin{equation} \label{base} \sum_{w \in N(v)}{x_w} + x_v \geq 1, \quad \forall v \in V \end{equation}
The family of inequalities \eqref{base} says that each vertex itself or at least one of its neighbors has to be included in the dominating set.
\subsection{Minimum $k$-hop Dominating Set}
The objective target for this problem is the same as \eqref{obj}, but the family of inequalities \eqref{base} is not valid for this case. Instead, another family of inequalities is valid: \\
... \\
This family of inequalities models the requirement that each vertex itself or at least one member of its $k$-neighborhood has to be included in the dominating set. For the case $k = 1$ this family is the same as \eqref{base}.
\subsection{Connectivity}
There are different approaches to enforce connectivity in an ILP. As this is not trivial, there have been many publications \citep{bomersbach}, \citep{fischetti_steiner_t}, \citep{fault_tolerant}, \citep{forrest}, \citep{on_imposing_con}, \citep{mtz} concerning this issue in the past years.
\subsubsection{Vertex separators}
One approach is to use so-called vertex separators. In \citep{bomersbach} and \citep{fischetti_steiner_t} the authors used this approach to create ILP-based algorithms for other graph-theoretical optimization problems which require the solution to be connected.
\citep{bomersbach} presented an ILP formulation to solve the connected maximum coverage problem, and \citep{fischetti_steiner_t} proposed ILP formulations for different variants of the Steiner tree problem (that was solved in a branch and cut framework?).
...
\begin{equation} x_v + x_w \leq \sum_{u \in S_{v,w}}{x_u} + 1, \quad \forall v, w \in V, v \neq w, \dots \end{equation}
These inequalities require that for each pair of vertices $v$ and $w$, if both vertices are included in the dominating set, at least one vertex from a minimum $v$-$w$-separator has to be included as well. \\
In contrast to the problem from \citep{bomersbach} we have a predefined root node which must be part of the solution.
In \citep{forrest} the authors introduced ILP formulations for different problems motivated by forest planning. The objective of these problems is to find a profit-maximizing harvest schedule while old-growth forest patches have to be conserved. Input instances are given as undirected graphs, with areas of the forest as nodes and edges between adjacent areas. In one particular case a predefined area has to be preserved and, in addition, all preserved areas need to be connected. This is very similar to our problem, i.e., there is a predefined root vertex and all vertices which are included in the solution need to be connected. Minimum vertex separator constraints were used there as well to enforce connectivity, but if a root node was present, only those constraints were taken into account which separate the connected component that includes the root node from all the other components. The authors state that rooted inequalities are stronger, as was commonly noted in the literature on Steiner tree problems. For our case we also only use constraints
\begin{equation} \label{rootSep} x_v + x_w \leq \sum_{u \in S_{v,w}}{x_u} + 1, \quad \forall v, w \in V, v \neq w, \forall S_{v,w} \in S(v,r) \end{equation}
for minimum vertex separators that include the root node.
The number of all minimum vertex separator constraints is potentially exponential \citep{bomersbach}. Therefore \citep{bomersbach}, \citep{fischetti_steiner_t} and \citep{forrest} treat these constraints as lazy constraints, which means in particular that none of them is included in the initial model. Instead, integer solutions are computed iteratively \citep{bomersbach}, \citep{fischetti_steiner_t}. If such a solution is not connected, \citep{bomersbach} and \citep{fischetti_steiner_t} identify minimal vertex separators that separate single components via a linear-time algorithm, while in \citep{forrest} the classical max-flow min-cut theorem is used to identify violated constraints.\\
Our algorithm to identify and add violated constraints is analogous to the one from \citep{bomersbach}, with the exception that we only search for violated constraints that include the root node.
---------------------------------------------------------------------------------------------\\
Everything below this line hasn't been changed yet!
\begin{algorithm}[H]
\SetAlgoLined
$DS^* := \{v \mid x_v = 1\}$ \\
$G' := G[DS^*]$\\
\If{$G'$ is not connected} {
$C :=$ set of all connected components of $G'$\\
$c_{root} :=$ connected component that contains $v_{root}$\\
\For{all components $c$ in $C \setminus \{c_{root}\}$} {
$v :=$ any node from $c$\\
$s_1 :=$ findMinVertexSeparator($G$, $DS^*$, $v \in c$, $v_{root}$, $c$)\\
$s_2 :=$ findMinVertexSeparator($G$, $DS^*$, $v_{root}$, $v \in c$, $c_{root}$)\\
\For{all $w_1 \in c$} { add the following constraint to the model: $\sum_{s \in s_1}{x_s} \geq x_{w_1} + x_{v_{root}} - 1$\\ }
\For{all $w_2 \in c_{root}$} { add the following constraint to the model: $\sum_{s \in s_2}{x_s} \geq x_{w_2} + x_{v} - 1$\\ }
}
}
\caption{Add violated constraints}
\end{algorithm}
This algorithm is executed each time an integer solution is found within the branch and cut framework. Let $DS^*$ be an integer, not necessarily connected, solution. Let $C$ be the set of all connected components of the graph $G' = G[DS^*]$ and let $c_{root}$ be the component that contains the root node $v_{root}$. Then the algorithm detects for every single component $c \in C \setminus \{c_{root}\}$ one minimal vertex separator that separates $c$ and the component $c_{root}$. The constraints concerning these separators are then added to the model and the branch and cut procedure continues.
It is important to mention that in general there is more than one minimal vertex separator which separates two arbitrary components. Algorithm \ref{alg:minSep} detects exactly one, namely the separator that is closest to the first component. By executing Algorithm \ref{alg:minSep} with every component $c \in C \setminus \{c_{root}\}$ as first component and $c_{root}$ as second component and vice versa, we ensure that a minimal vertex separator closest to each of the two components is added.
\begin{algorithm}[H]
\SetAlgoLined
$N(c_v) :=$ neighbors of nodes of $c_v$ in $G$ (Maybe use the formal definition from methods?)\\
$G' := G$ with all edges between vertices in $c_v \cup N(c_v)$ removed \label{alg:remEdges}\\
$R_w :=$ vertices that can be reached from $w$ in $G'$\\
\Return $N(c_v) \cap R_w$
\caption{findMinVertexSeparator($G$, $DS^*$, $v \in c_v$, $w$, $c_v$)} \label{alg:minSep}
\end{algorithm}
The algorithm above detects a minimal vertex separator that separates the node $w$ and the connected component $c_v$. It is taken from \citep{bomersbach}, although Bomersbach et al. originally took it from \citep{fischetti_steiner_t}. With this method the minimal vertex separator that is closest to the component $c_v$ is found. Figure \ref{pic:min_sep} illustrates the process. Suppose the red marked nodes are an unconnected solution $DS^*$.
The set of blue marked nodes is the minimal separator that is closest to the connected component in the upper part of the graph, while the set of green marked nodes is the minimal separator that is closest to the component containing the root. The pictures in the middle and on the right show step \ref{alg:remEdges} of Algorithm \ref{alg:minSep}. As one can see, after removing all edges between a component and its neighborhood, the blue marked nodes in the middle picture and the green marked nodes in the right picture are still reachable from the other component. Therefore the algorithm returns these nodes as minimal vertex separator.
\begin{figure} \centering \includegraphics[width=10cm]{bilder/find_minimal_separator_illustration.eps} \caption{} \label{pic:min_sep} \end{figure}
We add an additional constraint to the model to tighten the feasible region and to prevent unnecessary iterations.
\begin{equation} \label{neigh} x_v \leq \sum_{w \in N(v)} x_w, \quad \forall v \in V \setminus \{v_{root}\} \end{equation}
This constraint demands that for each vertex which is part of the dominating set at least one of its neighbors is also included. In \citep{bomersbach} and \citep{fischetti_steiner_t} this constraint is also part of the model. As the neighborhood of a single vertex is always a minimal vertex separator that separates this node from any other vertex outside the neighborhood, this constraint is valid. We exclude the root node $v_{root}$ so that a valid solution consisting of a single vertex is not forced to contain an unnecessary second one.
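As a small illustration of Algorithm \ref{alg:minSep} and of the constraints it produces, consider the following constructed instance (an assumption made only for this example, not one of the test instances). Let $G$ be the path $v_{root} - u_1 - u_2 - u_3$ and let the current integer solution be $DS^* = \{v_{root}, u_3\}$, so that $G[DS^*]$ consists of the components $c_{root} = \{v_{root}\}$ and $c = \{u_3\}$. With $c$ as first component we get $N(c) = \{u_2\}$; removing the edge $\{u_2, u_3\}$ leaves $R_{v_{root}} = \{v_{root}, u_1, u_2\}$, so the returned separator is $N(c) \cap R_{v_{root}} = \{u_2\}$. With the roles exchanged we get $N(c_{root}) = \{u_1\}$, $R_{u_3} = \{u_1, u_2, u_3\}$ and the separator $\{u_1\}$. The two constraints that are added are therefore
\begin{equation*} x_{u_2} \geq x_{u_3} + x_{v_{root}} - 1 \quad \text{and} \quad x_{u_1} \geq x_{v_{root}} + x_{u_3} - 1 . \end{equation*}
Both are violated by $DS^*$, since $x_{u_1} = x_{u_2} = 0$ and $x_{v_{root}} = x_{u_3} = 1$, and both are satisfied by the connected solution $\{v_{root}, u_1, u_2, u_3\}$.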
\subsubsection{Miller-Tucker-Zemlin Constraints}
There are also formulations to enforce connectivity that only need a polynomial number of constraints. These constraints are not added lazily; instead all of them are added initially. Some of these approaches are based on the construction of a spanning tree. We have implemented one of these formulations in the scope of this thesis. This approach was used in \citep{mtz} to generate an ILP formulation for the Minimum Connected Dominating Set problem. In that publication $4$ different formulations, all based on creating a spanning tree, were compared experimentally. This particular formulation outperformed all $3$ others on all $6$ input graphs, and the difference in runtime became larger with increasing size. In the scope of this thesis we therefore only compare this one with the vertex separator version.
The Miller-Tucker-Zemlin constraints were initially introduced to present an ILP formulation for the Traveling Salesman Problem with only polynomially many constraints. Let $G =(V,E)$ be our undirected input graph and let $n = |V|$. We follow the description from \citep{mtz} by defining the directed graph $G_d = (V \cup \{n+1, n+2\}, A)$, where $A = \{(n+1, n+2)\} \cup \bigcup_{i=1}^{n}\{(n+1,i), (n+2,i)\} \cup E'$ and $E' = \{(i,j), (j,i) : \{i,j\} \in E \}$. Note that $E'$ is the bidirected version of $E$, that is, we add an arc in both directions for every edge in $E$. In other words, we create two additional nodes $n+1$ and $n+2$, add an arc from $n+1$ and from $n+2$ to every vertex $v \in V$, and add an arc from $n+1$ to $n+2$.
The idea behind the constraints is to create a directed spanning tree $T_d = (V \cup \{n+1,n+2\}, E_d)$ on $G_d$ such that vertex $n+1$ is the root and has an arc (in $T_d$) to $n+2$ and to every vertex which is not part of $D$, while $n+2$ has an arc to a node $v_r$ within $D$. All the other nodes form a tree with root $v_r$.
Let $y_{ij}$, for all $(i,j) \in A$, be decision variables that specify whether the arc $(i,j)$ is part of the spanning tree $T_d$. Let $u_i \in \mathbb{Z}_+$, for all $i \in V \cup \{n+1, n+2\}$, be auxiliary variables that specify in which step the arc to node $i$ is passed when starting from $n+1$. These auxiliary variables eliminate subtours, as they also do in the Traveling Salesman Problem. In the following we give a full ILP formulation to enforce connectivity via MTZ constraints.
\begin{equation} \label{mtz_eq_1} \sum_{i \in V}{y_{n+2,i}} = 1 \end{equation}
\begin{equation} \label{mtz_eq_2} \sum_{i : (i,j) \in A}{y_{ij}} = 1, \quad \forall j \in V \end{equation}
\begin{equation} \label{mtz_eq_3} y_{n+1,i} + y_{ij} \leq 1, \quad \forall (i,j) \in E' \end{equation}
\begin{equation} \label{mtz_eq_4} (n+1) y_{ij} + u_i - u_j + (n-1) y_{ji} \leq n, \quad \forall (i,j) \in E' \end{equation}
\begin{equation} \label{mtz_eq_5} (n+1) y_{ij} + u_i - u_j + (n-1) y_{ji} \leq n, \quad \forall (i,j) \in A \setminus E' \end{equation}
\begin{equation} \label{mtz_eq_6} y_{n+1,n+2} = 1 \end{equation}
\begin{equation} \label{mtz_eq_7} u_{n+1} = 0 \end{equation}
\begin{equation} \label{mtz_eq_8} 1 \leq u_{i} \leq n+1, \quad \forall i \in V \cup\{n+2\} \end{equation}
\begin{equation} \label{mtz_eq_9} x_i = 1-y_{n+1,i}, \quad \forall i \in V \end{equation}
\begin{figure} \centering \includegraphics[width=10cm]{bilder/mtz_illustration.eps} \caption{Illustration of the principle. The dashed circle outlines the dominating set. All vertices that are connected to $n+1$ are not part of the dominating set.} \label{mtz} \end{figure}
Constraint \eqref{mtz_eq_1} ensures that there is exactly one root for the dominating set. In our case we replace this equation by the following: $y_{n+2,v_{root}} = 1$ and $y_{n+2, i} = 0, \forall i \in V \setminus \{v_{root}\}$. Constraints \eqref{mtz_eq_2} enforce that each node of the spanning tree $T_d$ has exactly one incoming arc.
Constraints \eqref{mtz_eq_3} require that the nodes of $T_d$ are either connected to each other or have an incoming arc from node $n+1$, the node that marks vertices which are not part of $D$. With the exception of the term $(n-1)y_{ji}$, the constraints \eqref{mtz_eq_4} and \eqref{mtz_eq_5} are the original MTZ constraints to eliminate subtours from \citep{mtz_orig}. The mentioned term is an improvement from \citep{mtz_improv}. Constraint \eqref{mtz_eq_6} demands that the arc $(n+1,n+2)$ is included in $T_d$. Constraints \eqref{mtz_eq_8} define the ranges of values for the auxiliary variables $u_i$. As these variables specify in which step the arc to node $i$ is passed, only values from $1$ to $n+1$ (the number of incoming arcs) can be assigned to them. Finally, the last constraints \eqref{mtz_eq_9} ensure that if there is no incoming arc from node $n+1$ to a node $i$, then $i$ must be included in $D$, and vice versa (I think it is important to mention the backward direction as otherwise the impression could arise that only the MTZ constraints decide which vertices are included).
We combine the above-mentioned ILP formulation for MkCDS with this formulation to enforce connectivity. The solution of this formulation is then an optimal connected solution with $v \in D \Leftrightarrow x_v = 1$. As previously mentioned, this formulation only needs polynomially many constraints. More precisely, there are $(|V|+2) + (2|E|+2|V|+1) = O(|E|+|V|)$ decision variables and $1 + |V| + 2|E| + 2|E| + (2|V|+1) + 1 + 1 + |V| = O(|E|+|V|)$ constraints.
\subsection{Minimum connected $k$-hop Dominating Set} \label{khopmodel}
A connected $k$-hop dominating set is a $k$-hop dominating set $DS$ such that $G[DS]$ is connected. (Maybe refer to methods as this is redundant?) Its ILP formulation consists of the objective target \eqref{obj}, the constraints \eqref{khop} and a collection of constraints to induce connectivity (In the future different types of potential constraints should be added).
...
Let $v_{root} \in V$ be the predefined root. The ILP model of this problem is the ILP model of Section \ref{khopmodel} enriched with the following constraint.
\begin{equation} \label{root} x_{v_{root}} \geq 1 \end{equation}
\subsection{Additional methods to tighten up the space of feasible solutions}
In the scope of this thesis additional constraints were tested that should tighten the space of feasible solutions further. As generating unconnected solutions can cost much time, we want to prevent unnecessary iterations.
\subsubsection{Intermediate node constraint}
In the paper on the Steiner tree problem \citep{fischetti_steiner_t}, an inequality to reduce the number of unconnected feasible solutions is proposed. It demands that each node in the solution which is not a predefined terminal has two neighbors in the solution. A node that has two neighbors in the solution can be seen as an intermediate node. Let $T$ be the set of all terminals. The inequality can formally be described as $2 x_v \leq \sum_{w \in N(v)}{x_w}, \forall v \in V \setminus T$. Unfortunately this inequality cannot be applied to our problem without potentially excluding optimal solutions. It can lead to solutions which have additional nodes at the end of branches that are not necessary for the MkCDS but are necessary to fulfill the inequality. In our case we would need to require the inequality only for vertices which are not at the end of a branch. But we cannot decide in advance which nodes will be at the end of a branch.
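As a small illustration of this effect (a constructed example under our own assumptions, not one of the test instances), consider $k = 1$, $T = \{v_{root}\}$ and the graph with vertices $v_{root}, u_1, u_2, u_3, u_4$ and edges $\{v_{root},u_1\}$, $\{u_1,u_2\}$, $\{u_1,u_3\}$, $\{u_2,u_3\}$, $\{u_2,u_4\}$, $\{u_3,u_4\}$. The set $\{v_{root}, u_1, u_2\}$ is a rooted connected dominating set of size $3$, and no smaller one exists, since a connected set of size $2$ containing $v_{root}$ must be $\{v_{root}, u_1\}$, which does not dominate $u_4$. For the set $\{v_{root}, u_1, u_2\}$ the intermediate node inequality at $u_2$ reads
\begin{equation*} 2 x_{u_2} \leq x_{u_1} + x_{u_3} + x_{u_4}, \quad \text{i.e.} \quad 2 \leq 1, \end{equation*}
which is violated. By symmetry the same holds for $\{v_{root}, u_1, u_3\}$ at $u_3$. The smallest solution that satisfies the inequality is therefore $\{v_{root}, u_1, u_2, u_3\}$ with $4$ vertices: the triangle $u_1, u_2, u_3$ at the end of the branch contains one vertex more than necessary.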
\begin{figure} \centering \includegraphics[width=10cm]{bilder/intermediate_node_constraint_illustration.eps} \caption{The dashed circle outlines the necessity to have connected triplets at the end of a branch} \label{pic:inc} \end{figure}
Figure \ref{pic:inc} compares one optimal solution without this constraint on the left and one with this constraint on the right. On the right-hand side the end of a branch is circled to outline the additional node generated by this constraint. Even if the generated solutions are not necessarily optimal, they are close to an optimal solution (in terms of the number of nodes). At the same time this constraint drastically reduces the runtime in many instances. That is why it can be considered for generating approximate solutions. This constraint can also be used to generate a sufficient upper bound in the branch and cut process. But for most instances this is not necessary, as a sufficient upper bound is found quickly. It takes much more time to find a sufficient lower bound and to close the gap.
\subsubsection{Reduce path length}
To exclude solutions which contain single (unconnected) nodes close to the rim, we devised constraints that limit the length of each path between the nodes of a solution and the root node. The length of each path to an arbitrary node is naturally limited by the number of members of the dominating set. In the extreme case there is one single branch whose length is exactly the number of members of the dominating set. In the case of more than one branch the upper bound is still valid. On that account we started with the naive approach of limiting the path from the root node to each member of $D$ by the size of $D$. The formal description is
\begin{equation} \label{gaussian} \sum_{u \in V}{x_u} \geq \mathit{shortestpath}(v_{root}, v) \cdot x_v, \quad \forall v \in V \setminus \{v_{root}\}. \end{equation}
As this constraint did not reduce the runtime, we tried to refine it. There are too many possible (unconnected) solutions which satisfy the constraint. Figure \ref{pic:rpl} shows one of them.
\begin{figure} \centering \includegraphics[width=3cm]{bilder/reduce_path_length_illustration.eps} \caption{An unconnected solution where the path length constraint is satisfied.} \label{pic:rpl} \end{figure}
This circumstance led to the following constraint, which makes use of the Gaussian sum formula. The idea is still to limit the distance between the root node $v_{root}$ and all the members of $D$. In this advanced formulation we limit the sum of the distances to $\sum_{i=1}^{|D^*|}{i}$. This constraint cuts off unconnected solutions that are valid using only the previous constraint \eqref{gaussian}. But as our tests revealed, this constraint did not improve the performance and even increased the runtime (as it probably adds too much complexity to the model). (Maybe also mention that this constraint in isolation allows solutions which are forbidden using the previous one)
\subsubsection{Preventively adding separators}
We use the lazy approach to avoid adding too many constraints that are not required to generate sufficient solutions. Nevertheless, we evaluated whether adding some particular separator constraints in advance could reduce the runtime. This approach might have produced a more appropriate LP bound and prevented unnecessary iterations.