todo.md 15 KB
Newer Older
msurl's avatar
msurl committed
1
2
3
4
# Week 4
## Python
### implementation
* Refactor code. 
Mario Surlemont's avatar
Mario Surlemont committed
5
	- Fix bug with separators. ✔
Mario Surlemont's avatar
Mario Surlemont committed
6
	- Add option to cli to select which type of inequalities to be included. ✔
msurl's avatar
msurl committed
7
* Implement different type of constraints. 
Mario Surlemont's avatar
Mario Surlemont committed
8
	- _Solving the Maximum-Weight Connected Subgraph Problem to Optimality_ ❌ (moved to backlog)
Mario Surlemont's avatar
Mario Surlemont committed
9
10
	- _Solving the Connected Dominating Set Problem and Power Dominating Set Problem by Integer Programming_ ✔
	    - Only one type of constraints  which performed best in this paper. Performs bad on our graphs. 
msurl's avatar
msurl committed
11
12

### runtime
Mario Surlemont's avatar
Mario Surlemont committed
13
* Do tests according to test concept and store data in some sort of strucured format(CSV/ XML/ JSON) which could be importet in python or R.  ✔
msurl's avatar
msurl committed
14
15

## thesis
Mario Surlemont's avatar
Mario Surlemont committed
16
* Begin paragraph about results and add preliminary results and observations. ❌ (moved to week 5)
Mario Surlemont's avatar
Mario Surlemont committed
17
	- Create a concept how to test and which test results you want to include. (✔) (a preliminary concept) 
msurl's avatar
msurl committed
18
19
20

## literature
* Read _Imposing Connectivity Constraints in Forest Planning Models_.
Mario Surlemont's avatar
Mario Surlemont committed
21
	- Check for different constraints which could strengthen the formulation. ❌ (moved to week 5. But with focus on symmetry breaking)
Mario Surlemont's avatar
Mario Surlemont committed
22
* Read _Solving the Connected Dominating Set Problem and Power Dominating Set Problem by Integer Programming_ ✔
msurl's avatar
msurl committed
23
	- Different constraints to induce connectivity are compared. Try to implement some of them to compare them and see which one is best. You may see if this is related to a charateristic of your graphs. As the literature may states that for some type of graphs some approaches are better. 
Mario Surlemont's avatar
Mario Surlemont committed
24
* Check _On imposing connectivity constraints in integer programs_ again to see if there is some other literature which you are missing. ❌ (moved to week 5)
msurl's avatar
msurl committed
25
26


Mario Surlemont's avatar
Mario Surlemont committed
27
28
29
30
# Week 5
## Python
### implementation
* Try to implement a symmetry breaker to check if this affects especially the runtime which is needed to close the gap and exclude all unconnected solutions with at most as many nodes as an optimal connected. 
Mario Surlemont's avatar
Mario Surlemont committed
31
32
    - May implement My Ky's, even if it is not optimal/ 100% correct. Just to observe if there is a connection between symmetrical unconnected solutions and the runtime and how strong this connection is. (Possible indicators: runtime, number of lazily added constraints) ❌ 
        - Unfortunately her symmetry breaker made use of edges and our model consists only of node variables.
Mario Surlemont's avatar
Mario Surlemont committed
33
        - Until now there has no symmetry breaker come to my mind and I haven't recognised one in a paper yet. 
Mario Surlemont's avatar
Mario Surlemont committed
34

Mario Surlemont's avatar
Mario Surlemont committed
35
36
### runtime 
* Create random graphs and try to compare runtime between ILP and ASP. 
Mario Surlemont's avatar
Mario Surlemont committed
37
38
39
40
41
42
43
44
45
    - Measure some characteristics of the graphs such as density, |V|, |E|, maximum degree, minimum degree, average degree, (maybe standard derivation of degree or median degree?) ✔
        - The graphs have been tested on random graphs with different size (|V| = 10, 20, 50, 100, 250, 500) with each size having 10 density(0.1, 0.2, ..., 0.9) levels. 
        - The results of this implementation were comparable to the one from _An Integer Programming Approach for Fault-Tolerant Connected Dominating Sets*_. 
        - On random graphs the implementation can handle much more vertices than on our graphs. 
        - A fact that is *very* interesting is that on gridlike graphs the implementation performs way worse. Even when those graphs have similar density and much less vertices.
            - A major problem seems to be a low average degree and a low maximum degree of the nodes! 
            - It seems to be as I assumed that when "high value" vertices are missing there are to many alternatives which have to be excluded in the iteration process. 
                - I should create tables with the differences in characteristics of those gridlike graphs and random graphs. 
                - On "thin" random graphs (width << length) the algorithm performs much better as there are not so many alterantive I assume. 
Mario Surlemont's avatar
Mario Surlemont committed
46
        - A comparison between ILP and ASP hasn't been done yet but is important to see, if this implemantation is at least competetive on random graphs! ❌ 
Mario Surlemont's avatar
Mario Surlemont committed
47
    - Measure and note gap between |D| of a minimal connected and a minimal unconnected solution. Measure for each |D| number of unconnected solutions which were found(ILP-only) and how many constraints where added in sum. 
Mario Surlemont's avatar
Mario Surlemont committed
48
49
50
51
    -   - Constraints ✔
    -   - Number of unconnected solutions is equal to lazily added constraints ✔
    -   - It would still be interesting to measure the number of solutions which exist for each size. Completely ommiting any connectivity. ❌ 
    - Measure the time which was needed to find the first (nearly) optimal solution (strong upper bound) and the time needed to close the gap. ✔
Mario Surlemont's avatar
Mario Surlemont committed
52
53
54
55
* Make some more detailed tests and check for the following connections: 
    - To what degree does the number of unconnected k-hop solutions which have at most as many nodes as an optimal connected solution affect the runtime? (Calculate how many of them exist and add this data as a column to a table of test results)
    - Is there a clear corellation between the number of constraints which were added and the runtime for (possible unconnected) solutions? 
        - Check for graphs where the gap between an optimal connected solution and the smallest unconnected solution is rather big if this approach(our implementation) might be unapplyable. Such that there is too much time wasted to exclude all possible unconnected solutions and increase the lower bound. 
Mario Surlemont's avatar
Mario Surlemont committed
56
* Implement a heuristic to start with a stronger lower bound. ✔
Mario Surlemont's avatar
Mario Surlemont committed
57
58
    - Given a symmetrical leaf where the root is centered (at least on the x-axis). There are at least ![width-2*k+height-k-\frac{width}{2}-k](https://render.githubusercontent.com/render/math?math=width-2*k%2Bheight-k-%5Cfrac%7Bwidth%7D%7B2%7D-k) nodes needed to provide a connected dominating set.
        - To reach both rims in the width and the rim on the top. 
Mario Surlemont's avatar
Mario Surlemont committed
59
60
    - Unfortunately this did not improve the runtime at least for the bigger-leaf example with k=2. It drastically increased the runtime.
        - So at first I would not follow this idea and try to find another technique to generate a better lower bound. 
Mario Surlemont's avatar
Mario Surlemont committed
61
62

## thesis
Mario Surlemont's avatar
Mario Surlemont committed
63
64
65
66
67
* Begin paragraph about results and add preliminary results and observations. ❌ 
    - Refine the concept how to test and which test results you want to include. (✔)
        - The concept was refined as additonal parameters were added which could underpin (or disprove) the assumption that our graphs and gridlike graphs in general are hard too solve with this approach. 
    - The paragraph as a latex file wasn't started yet. ❌ 
        - This has the HIGHEST priority for week 6 as writing is (for me) the least pleasurefull task. 
Mario Surlemont's avatar
Mario Surlemont committed
68
69
    
## literature 
Mario Surlemont's avatar
Mario Surlemont committed
70
* Read _Imposing Connectivity Constraints in Forest Planning Models_. ✔
Mario Surlemont's avatar
Mario Surlemont committed
71
    - Check for different constraints which could strengthen the formulation. (with focus on symmetry breaking)
Mario Surlemont's avatar
Mario Surlemont committed
72
        - They did not mention a symmetry breaker *but* they mentioned some other types of inequalities which could strengthen the formulation according to the connectivity specification. 
Mario Surlemont's avatar
Mario Surlemont committed
73
74
75
            - The two most promising seem to be
                - Only adding separators involving the root node but not between connected components
                - The rooted ring inequalities
Mario Surlemont's avatar
Mario Surlemont committed
76
77
        - A thing which was interesting that for their specific problems (which were not really close to MCDS and had only connectivity in common) they achived a much stronger LP bound. 
        - They added cuts before an ILP solution was found and added cuts for LP solutions also. 
Mario Surlemont's avatar
Mario Surlemont committed
78
        - They used different subproblems which they solved iteratively and used the previous results as a heuristic for the next iteration step. 
Mario Surlemont's avatar
Mario Surlemont committed
79
80
* Read _An Efficient Branch and Cut Algorithm to Find Frequently Mutated Subnetworks in Cancer_ again with focus on symmetry breaking. ✔
    - No symmetry breakers were mentioned.
Mario Surlemont's avatar
Mario Surlemont committed
81
82
* Read through _An Integer Programming Approach for Fault-Tolerant Connected Dominating Sets*_ again and check for symmetry breaking or other constraints to tighten up the space of solutions. ✔
    - I could not find anything about symmetry breaking or additional inequalities for the case k=d=1 (Which is standard MCDS). But the table of results was interesting because they also tested their implemenation for the case k=d=1 which then is equal to our ILP-formulation. Their results were not bad but unfortunately I could not find anything more detailed description of their test graphs. Only number of nodes and density is shown. But those two properties are not sufficient as my personal test on random graphs revealed. 
Mario Surlemont's avatar
Mario Surlemont committed
83
    - An interesting observation, according to their result tables is that their largest test graph(with lowest densitiy) which has 200 nodes has an optimal solution that consists of "only" 26 nodes whereas an optimal solution for our bigger-leaf test graph in the case k=1 consists of 24 nodes while the graph only has 70 nodes. Both graphs have approximately the same density. So their graph must have some nodes which have a higher value/degree such that adding those nodes generates a higher profit than adding others. So there is no equal alternative to adding this nodes. This reduces the number of iterations and (as one can see in the table of results) the number of lazily added constraints. As a consequence I assume that is in an important factor which reduces the runtime. 
msurl's avatar
msurl committed
84

Mario Surlemont's avatar
Mario Surlemont committed
85
86
87
# Week 6
## Python
### Implementation
Mario Surlemont's avatar
Mario Surlemont committed
88
89
* Implement the mentioned constraints from _Imposing Connectivity Constraints in Forest Planning Models_ (Thursday) (✔)
    - From now on only separators between the root node and other components are added to the model. This improves the runtime. ✔ 
Mario Surlemont's avatar
Mario Surlemont committed
90
    - The ring inequalites haven't been implemented yet. ❌ 
Mario Surlemont's avatar
Mario Surlemont committed
91
        - I would prefer to focus on the writing and further testing prior to improve the formulation. If there is enough time left I will try to further improve the runtime. 
Mario Surlemont's avatar
Mario Surlemont committed
92
* May implement that cuts are added even for fractional solutions to strengthen the LP bound. (thursday) ❌  (moved  to bbacklog) 
Mario Surlemont's avatar
Mario Surlemont committed
93

Mario Surlemont's avatar
Mario Surlemont committed
94
95
96
97
### runtime
* Test the ASP version on the graphs from week 5 to compare them. (saturday) ✔ 
    - The ASP version performs quite bad on many random graphs where the ILP version performs quite good. One key factor that I could determine is that on those random graphs were the ILP version was good the difference in the size of an unconnected solution and a connected is rather low. So the ILP version seems to handle those instances were there is no big difference good.
    - The ASP version performed in general better on grid graphs. This is what I expected as the ASP version was better on the leaf graphs as well. My assumption still is that there are too many "alternative" unconnected solutions where it is much cheaper to replace nodes from an previous unconnected solution and to generate a new unconnected solution than to add those vertices which separate them. As for those instances where the ILP needs a lot of time to solve the number of lazily added constraints is very high and the gap decreases(my test results) very slow both of these facts seem to underline my assumption.
msurl's avatar
msurl committed
98
* Test the (new) implementation on the usual graphs and them from week 5. (saturday) (❌)
Mario Surlemont's avatar
Mario Surlemont committed
99
    - Only tested on a few graphs as the testing costs much time. But the decrease in runtime was significant.
msurl's avatar
msurl committed
100
* Create clean tables and CSV files. (sunday) (❌)
Mario Surlemont's avatar
Mario Surlemont committed
101
    - As I tested many graphs I have to further select which graphs I really want to include and I might have to do new tests if I change the implementation.
Mario Surlemont's avatar
Mario Surlemont committed
102
## thesis
Mario Surlemont's avatar
Mario Surlemont committed
103
104
105
* Begin the paragraph of results. (wednesday) (✔)
    - I wrote those bullet points that I want to include in a text file. But I haven't created an tex file yet. 
* Refactor the paragraph for Implementation and methods. (wednesday) ✔ 
Mario Surlemont's avatar
Mario Surlemont committed
106

msurl's avatar
msurl committed
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
# Week 7
## thesis
* According to Elines feedback.
	- Move definitions from methods to a preliminary section and move the intro into ILP's to this section. ✔ 
	- Move the stuff from the implementation section to the methods section and mix it up a bit. (✔)
		- (The implementation section was removed for now and its content was moved to methods. Most of the definitions from the preliminaries section should be moved to method section and mixed up with the content)
	- Rework the part with the vertex separators and be more precise and specific. ✔
	- Add missing stuff. (❌)
		- Different type of constraints to enforce connectivity. (✔)
			- I wrote a german draft to get an idea how to do it. 
		- Different constraints that I tried to strengthen the ILP formulation and preventively adding separator constraints. ❌

## python 
### runtime 
* Check how large is the difference between the size of a minimum dominanting set and a rooted connected minimum dominating set for all test instances.
	- Done for leafs. ✔

# Week 8
## thesis
* Mix the stuff from the methods section and the definitions up a bit. 
* Add missing stuff. (Highest priority! Rather have awfull stuff that can be refinded than having the feeling of getting stucked!)
	- Different type of constraints to enforce connectivity. 
	- Different constraints that I tried to strengthen the ILP formulation and preventively adding separator constraints. 
	- Introduction
	- Table of results
	- Discussion
	- Implementation

## python 
### runtime 
* Check how large is the difference between the size of a minimum dominanting set and a rooted connected minimum dominating set for all test instances.
	- GNM 250 graphs.
	- GNM 100 graphs. 


## python
### runtime 
* For the leaf instances, the grid graphs and some of the random graphs: 
	- measure the size of a minimal unconnected solution and how fast the added separators close the gap.
Mario Surlemont's avatar
Mario Surlemont committed
146

msurl's avatar
msurl committed
147
148
149
# Backlog
## Python 
### Implementation 
Mario Surlemont's avatar
Mario Surlemont committed
150
* Remove lp_to_graph from package and instead use some other format to store graphs and use them as input. ✔
Mario Surlemont's avatar
Mario Surlemont committed
151
152
153
154
155
156
157
158
* Implement different type of constraints. 
    - _Solving the Maximum-Weight Connected Subgraph Problem to Optimality_

* Carefully check the literature i.e. _An Integer Programming Approach for Fault-Tolerant Connected Dominating Sets*_, _An Efficient Branch and Cut Algorithm to Find Frequently Mutated Subnetworks in Cancer_, _Thinning out Steiner trees: a node-based model for uniform edge costs_ to check if they did anything differently according to the sepearating constraints. 
* Check if something from the algorithmic framework of _Thinning out Steiner trees: a node-based model for uniform edge costs_ can be applied to our case. (Local Branching, Heuristic, etc.)
* Implement a heuristic to provide a better upper bound.
    - The intermediate root constraint could be useful to do so. 

Mario Surlemont's avatar
Mario Surlemont committed
159
160
161
162
* Maybe *change the cost function for the nodes*. 
    - At the moment every node has a cost of 1. 
    - It could work to give each node the cost of shortest path between a node and the root.
        - With this technique we would encourage the solver to prefer nodes that are close to the root. With this we would avoid that far distant nodes would be added and connected subcomponents which are far away from the root are generated. 
Mario Surlemont's avatar
Mario Surlemont committed
163
164
        - At least this could lead to a heuristic. 

Mario Surlemont's avatar
Mario Surlemont committed
165
## thesis
msurl's avatar
msurl committed
166
* Unify style of the entries of the bibliography. 
Mario Surlemont's avatar
Mario Surlemont committed
167
168
169
### packaging
* Add unit tests to the package
* Add dependencies to the recipe
msurl's avatar
msurl committed
170
* Find a place to store the code if the package will be published(Maybe a standard github repo)