todo.md 15 KB
 msurl committed Jun 03, 2020 1 2 3 4 # Week 4 ## Python ### implementation * Refactor code.  Mario Surlemont committed Jun 07, 2020 5  - Fix bug with separators. ✔  Mario Surlemont committed Jun 07, 2020 6  - Add option to cli to select which type of inequalities to be included. ✔  msurl committed Jun 03, 2020 7 * Implement different type of constraints.  Mario Surlemont committed Jun 11, 2020 8  - _Solving the Maximum-Weight Connected Subgraph Problem to Optimality_ ❌ (moved to backlog)  Mario Surlemont committed Jun 07, 2020 9 10  - _Solving the Connected Dominating Set Problem and Power Dominating Set Problem by Integer Programming_ ✔ - Only one type of constraints which performed best in this paper. Performs bad on our graphs.  msurl committed Jun 03, 2020 11 12  ### runtime  Mario Surlemont committed Jun 10, 2020 13 * Do tests according to test concept and store data in some sort of strucured format(CSV/ XML/ JSON) which could be importet in python or R. ✔  msurl committed Jun 03, 2020 14 15  ## thesis  Mario Surlemont committed Jun 11, 2020 16 * Begin paragraph about results and add preliminary results and observations. ❌ (moved to week 5)  Mario Surlemont committed Jun 11, 2020 17  - Create a concept how to test and which test results you want to include. (✔) (a preliminary concept)  msurl committed Jun 03, 2020 18 19 20  ## literature * Read _Imposing Connectivity Constraints in Forest Planning Models_.  Mario Surlemont committed Jun 11, 2020 21  - Check for different constraints which could strengthen the formulation. ❌ (moved to week 5. But with focus on symmetry breaking)  Mario Surlemont committed Jun 11, 2020 22 * Read _Solving the Connected Dominating Set Problem and Power Dominating Set Problem by Integer Programming_ ✔  msurl committed Jun 03, 2020 23  - Different constraints to induce connectivity are compared. Try to implement some of them to compare them and see which one is best. You may see if this is related to a charateristic of your graphs. As the literature may states that for some type of graphs some approaches are better.  Mario Surlemont committed Jun 11, 2020 24 * Check _On imposing connectivity constraints in integer programs_ again to see if there is some other literature which you are missing. ❌ (moved to week 5)  msurl committed Jun 03, 2020 25 26   Mario Surlemont committed Jun 11, 2020 27 28 29 30 # Week 5 ## Python ### implementation * Try to implement a symmetry breaker to check if this affects especially the runtime which is needed to close the gap and exclude all unconnected solutions with at most as many nodes as an optimal connected.  Mario Surlemont committed Jun 16, 2020 31 32  - May implement My Ky's, even if it is not optimal/ 100% correct. Just to observe if there is a connection between symmetrical unconnected solutions and the runtime and how strong this connection is. (Possible indicators: runtime, number of lazily added constraints) ❌ - Unfortunately her symmetry breaker made use of edges and our model consists only of node variables.  Mario Surlemont committed Jun 16, 2020 33  - Until now there has no symmetry breaker come to my mind and I haven't recognised one in a paper yet.  Mario Surlemont committed Jun 16, 2020 34   Mario Surlemont committed Jun 11, 2020 35 36 ### runtime * Create random graphs and try to compare runtime between ILP and ASP.  Mario Surlemont committed Jun 16, 2020 37 38 39 40 41 42 43 44 45  - Measure some characteristics of the graphs such as density, |V|, |E|, maximum degree, minimum degree, average degree, (maybe standard derivation of degree or median degree?) ✔ - The graphs have been tested on random graphs with different size (|V| = 10, 20, 50, 100, 250, 500) with each size having 10 density(0.1, 0.2, ..., 0.9) levels. - The results of this implementation were comparable to the one from _An Integer Programming Approach for Fault-Tolerant Connected Dominating Sets*_. - On random graphs the implementation can handle much more vertices than on our graphs. - A fact that is *very* interesting is that on gridlike graphs the implementation performs way worse. Even when those graphs have similar density and much less vertices. - A major problem seems to be a low average degree and a low maximum degree of the nodes! - It seems to be as I assumed that when "high value" vertices are missing there are to many alternatives which have to be excluded in the iteration process. - I should create tables with the differences in characteristics of those gridlike graphs and random graphs. - On "thin" random graphs (width << length) the algorithm performs much better as there are not so many alterantive I assume.  Mario Surlemont committed Jun 16, 2020 46  - A comparison between ILP and ASP hasn't been done yet but is important to see, if this implemantation is at least competetive on random graphs! ❌  Mario Surlemont committed Jun 11, 2020 47  - Measure and note gap between |D| of a minimal connected and a minimal unconnected solution. Measure for each |D| number of unconnected solutions which were found(ILP-only) and how many constraints where added in sum.  Mario Surlemont committed Jun 16, 2020 48 49 50 51  - - Constraints ✔ - - Number of unconnected solutions is equal to lazily added constraints ✔ - - It would still be interesting to measure the number of solutions which exist for each size. Completely ommiting any connectivity. ❌ - Measure the time which was needed to find the first (nearly) optimal solution (strong upper bound) and the time needed to close the gap. ✔  Mario Surlemont committed Jun 11, 2020 52 53 54 55 * Make some more detailed tests and check for the following connections: - To what degree does the number of unconnected k-hop solutions which have at most as many nodes as an optimal connected solution affect the runtime? (Calculate how many of them exist and add this data as a column to a table of test results) - Is there a clear corellation between the number of constraints which were added and the runtime for (possible unconnected) solutions? - Check for graphs where the gap between an optimal connected solution and the smallest unconnected solution is rather big if this approach(our implementation) might be unapplyable. Such that there is too much time wasted to exclude all possible unconnected solutions and increase the lower bound.  Mario Surlemont committed Jun 16, 2020 56 * Implement a heuristic to start with a stronger lower bound. ✔  Mario Surlemont committed Jun 11, 2020 57 58  - Given a symmetrical leaf where the root is centered (at least on the x-axis). There are at least ![width-2*k+height-k-\frac{width}{2}-k](https://render.githubusercontent.com/render/math?math=width-2*k%2Bheight-k-%5Cfrac%7Bwidth%7D%7B2%7D-k) nodes needed to provide a connected dominating set. - To reach both rims in the width and the rim on the top.  Mario Surlemont committed Jun 16, 2020 59 60  - Unfortunately this did not improve the runtime at least for the bigger-leaf example with k=2. It drastically increased the runtime. - So at first I would not follow this idea and try to find another technique to generate a better lower bound.  Mario Surlemont committed Jun 11, 2020 61 62  ## thesis  Mario Surlemont committed Jun 16, 2020 63 64 65 66 67 * Begin paragraph about results and add preliminary results and observations. ❌ - Refine the concept how to test and which test results you want to include. (✔) - The concept was refined as additonal parameters were added which could underpin (or disprove) the assumption that our graphs and gridlike graphs in general are hard too solve with this approach. - The paragraph as a latex file wasn't started yet. ❌ - This has the HIGHEST priority for week 6 as writing is (for me) the least pleasurefull task.  Mario Surlemont committed Jun 11, 2020 68 69  ## literature  Mario Surlemont committed Jun 16, 2020 70 * Read _Imposing Connectivity Constraints in Forest Planning Models_. ✔  Mario Surlemont committed Jun 11, 2020 71  - Check for different constraints which could strengthen the formulation. (with focus on symmetry breaking)  Mario Surlemont committed Jun 16, 2020 72  - They did not mention a symmetry breaker *but* they mentioned some other types of inequalities which could strengthen the formulation according to the connectivity specification.  Mario Surlemont committed Jun 16, 2020 73 74 75  - The two most promising seem to be - Only adding separators involving the root node but not between connected components - The rooted ring inequalities  Mario Surlemont committed Jun 16, 2020 76 77  - A thing which was interesting that for their specific problems (which were not really close to MCDS and had only connectivity in common) they achived a much stronger LP bound. - They added cuts before an ILP solution was found and added cuts for LP solutions also.  Mario Surlemont committed Jun 16, 2020 78  - They used different subproblems which they solved iteratively and used the previous results as a heuristic for the next iteration step.  Mario Surlemont committed Jun 16, 2020 79 80 * Read _An Efficient Branch and Cut Algorithm to Find Frequently Mutated Subnetworks in Cancer_ again with focus on symmetry breaking. ✔ - No symmetry breakers were mentioned.  Mario Surlemont committed Jun 16, 2020 81 82 * Read through _An Integer Programming Approach for Fault-Tolerant Connected Dominating Sets*_ again and check for symmetry breaking or other constraints to tighten up the space of solutions. ✔ - I could not find anything about symmetry breaking or additional inequalities for the case k=d=1 (Which is standard MCDS). But the table of results was interesting because they also tested their implemenation for the case k=d=1 which then is equal to our ILP-formulation. Their results were not bad but unfortunately I could not find anything more detailed description of their test graphs. Only number of nodes and density is shown. But those two properties are not sufficient as my personal test on random graphs revealed.  Mario Surlemont committed Jun 16, 2020 83  - An interesting observation, according to their result tables is that their largest test graph(with lowest densitiy) which has 200 nodes has an optimal solution that consists of "only" 26 nodes whereas an optimal solution for our bigger-leaf test graph in the case k=1 consists of 24 nodes while the graph only has 70 nodes. Both graphs have approximately the same density. So their graph must have some nodes which have a higher value/degree such that adding those nodes generates a higher profit than adding others. So there is no equal alternative to adding this nodes. This reduces the number of iterations and (as one can see in the table of results) the number of lazily added constraints. As a consequence I assume that is in an important factor which reduces the runtime.  msurl committed Jun 03, 2020 84   Mario Surlemont committed Jun 17, 2020 85 86 87 # Week 6 ## Python ### Implementation  Mario Surlemont committed Jun 23, 2020 88 89 * Implement the mentioned constraints from _Imposing Connectivity Constraints in Forest Planning Models_ (Thursday) (✔) - From now on only separators between the root node and other components are added to the model. This improves the runtime. ✔  Mario Surlemont committed Jun 23, 2020 90  - The ring inequalites haven't been implemented yet. ❌  Mario Surlemont committed Jun 23, 2020 91  - I would prefer to focus on the writing and further testing prior to improve the formulation. If there is enough time left I will try to further improve the runtime.  Mario Surlemont committed Jun 23, 2020 92 * May implement that cuts are added even for fractional solutions to strengthen the LP bound. (thursday) ❌ (moved to bbacklog)  Mario Surlemont committed Jun 17, 2020 93   Mario Surlemont committed Jun 23, 2020 94 95 96 97 ### runtime * Test the ASP version on the graphs from week 5 to compare them. (saturday) ✔ - The ASP version performs quite bad on many random graphs where the ILP version performs quite good. One key factor that I could determine is that on those random graphs were the ILP version was good the difference in the size of an unconnected solution and a connected is rather low. So the ILP version seems to handle those instances were there is no big difference good. - The ASP version performed in general better on grid graphs. This is what I expected as the ASP version was better on the leaf graphs as well. My assumption still is that there are too many "alternative" unconnected solutions where it is much cheaper to replace nodes from an previous unconnected solution and to generate a new unconnected solution than to add those vertices which separate them. As for those instances where the ILP needs a lot of time to solve the number of lazily added constraints is very high and the gap decreases(my test results) very slow both of these facts seem to underline my assumption.  msurl committed Jun 30, 2020 98 * Test the (new) implementation on the usual graphs and them from week 5. (saturday) (❌)  Mario Surlemont committed Jun 23, 2020 99  - Only tested on a few graphs as the testing costs much time. But the decrease in runtime was significant.  msurl committed Jun 30, 2020 100 * Create clean tables and CSV files. (sunday) (❌)  Mario Surlemont committed Jun 23, 2020 101  - As I tested many graphs I have to further select which graphs I really want to include and I might have to do new tests if I change the implementation.  Mario Surlemont committed Jun 17, 2020 102 ## thesis  Mario Surlemont committed Jun 23, 2020 103 104 105 * Begin the paragraph of results. (wednesday) (✔) - I wrote those bullet points that I want to include in a text file. But I haven't created an tex file yet. * Refactor the paragraph for Implementation and methods. (wednesday) ✔  Mario Surlemont committed Jun 17, 2020 106   msurl committed Jun 30, 2020 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 # Week 7 ## thesis * According to Elines feedback. - Move definitions from methods to a preliminary section and move the intro into ILP's to this section. ✔ - Move the stuff from the implementation section to the methods section and mix it up a bit. (✔) - (The implementation section was removed for now and its content was moved to methods. Most of the definitions from the preliminaries section should be moved to method section and mixed up with the content) - Rework the part with the vertex separators and be more precise and specific. ✔ - Add missing stuff. (❌) - Different type of constraints to enforce connectivity. (✔) - I wrote a german draft to get an idea how to do it. - Different constraints that I tried to strengthen the ILP formulation and preventively adding separator constraints. ❌ ## python ### runtime * Check how large is the difference between the size of a minimum dominanting set and a rooted connected minimum dominating set for all test instances. - Done for leafs. ✔ # Week 8 ## thesis * Mix the stuff from the methods section and the definitions up a bit. * Add missing stuff. (Highest priority! Rather have awfull stuff that can be refinded than having the feeling of getting stucked!) - Different type of constraints to enforce connectivity. - Different constraints that I tried to strengthen the ILP formulation and preventively adding separator constraints. - Introduction - Table of results - Discussion - Implementation ## python ### runtime * Check how large is the difference between the size of a minimum dominanting set and a rooted connected minimum dominating set for all test instances. - GNM 250 graphs. - GNM 100 graphs. ## python ### runtime * For the leaf instances, the grid graphs and some of the random graphs: - measure the size of a minimal unconnected solution and how fast the added separators close the gap.  Mario Surlemont committed Jun 17, 2020 146   msurl committed Jun 03, 2020 147 148 149 # Backlog ## Python ### Implementation  Mario Surlemont committed Jun 16, 2020 150 * Remove lp_to_graph from package and instead use some other format to store graphs and use them as input. ✔  Mario Surlemont committed Jun 11, 2020 151 152 153 154 155 156 157 158 * Implement different type of constraints. - _Solving the Maximum-Weight Connected Subgraph Problem to Optimality_ * Carefully check the literature i.e. _An Integer Programming Approach for Fault-Tolerant Connected Dominating Sets*_, _An Efficient Branch and Cut Algorithm to Find Frequently Mutated Subnetworks in Cancer_, _Thinning out Steiner trees: a node-based model for uniform edge costs_ to check if they did anything differently according to the sepearating constraints. * Check if something from the algorithmic framework of _Thinning out Steiner trees: a node-based model for uniform edge costs_ can be applied to our case. (Local Branching, Heuristic, etc.) * Implement a heuristic to provide a better upper bound. - The intermediate root constraint could be useful to do so.  Mario Surlemont committed Jun 16, 2020 159 160 161 162 * Maybe *change the cost function for the nodes*. - At the moment every node has a cost of 1. - It could work to give each node the cost of shortest path between a node and the root. - With this technique we would encourage the solver to prefer nodes that are close to the root. With this we would avoid that far distant nodes would be added and connected subcomponents which are far away from the root are generated.  Mario Surlemont committed Jun 16, 2020 163 164  - At least this could lead to a heuristic.  Mario Surlemont committed Jun 23, 2020 165 ## thesis  msurl committed Jun 30, 2020 166 * Unify style of the entries of the bibliography.  Mario Surlemont committed Jun 11, 2020 167 168 169 ### packaging * Add unit tests to the package * Add dependencies to the recipe  msurl committed Jun 30, 2020 170 * Find a place to store the code if the package will be published(Maybe a standard github repo)