diff --git a/conclusion.tex b/conclusion.tex
index fc9da33b4cda80de8f9d1f61ad320843ef59a80e..48905c1e0be12df0c7bfe70cd420fe9e0dce6e37 100644
--- a/conclusion.tex
+++ b/conclusion.tex
@@ -1,9 +1,5 @@
-\newpage
 \section{Conclusion}
-Overall, \citet{Rendle19} concludes that the last \textit{five years} of \textit{research} for the \textit{MovieLens10M-dataset} have not really produced any new findings. Although in the presented experiment the \textit{best practice} of the \textit{community} was applied, the \textit{simplest matrix-factorization} methods could clearly beat the reported results. Thus, the authors support the thesis that \textit{finding} and \textit{evaluating valid} and \textit{sharp baselines} is \textit{not trivial}. \textit{Empirical data} are collected, since there is \textit{no formal evidence} in the field of \textit{recommender systems} to make the methods comparable. From the \textit{numerical evaluation} the authors identify the \textit{rating of a work} in a \textit{scientific context} as a \textit{major problem}. Here, a \textit{publication} is classified as \textit{not worth publishing} if it achieves \textit{better results with old methods}. Rather, most papers aim to \textit{distinguish themselves} from the others by using new methods that beat the old ones. In this way, \textit{baselines} are \textit{not questioned} and the \textit{community} is steered in the wrong direction, as their work competes against \textit{insufficient} \textit{baselines}. This problem was not only solved during the \textit{Netflix-Prize} by the \textit{horrendous prize money}. However, it turns out that the \textit{insights} gained there were more \textit{profound} and can be transferred to the \textit{MovieLens10M-dataset}. Thus \textit{new techniques} but \textit{no new elementary knowledge} could be achieved on the \textit{MovieLens10M-dataset}.
-With this paper \citet{Rendle19} addresses the highly experienced reader. The simple structure of the paper convinces by the clear and direct way in which the problem is identified. Additionally, the paper can be seen as an \textit{addendum} to the \textit{Netflix-Prize}. As the authors \citet{Rendle} and \citet{Koren} were significantly \textit{involved} in this competition, the points mentioned above are convincing by the experience they have gained. With their results they support the very simple but not trivial statement that finding good \textit{baselines} requires an \textit{immense effort} and this has to be \textit{promoted} much more in a \textit{scientific context}. This implies a change in the \textit{long-established thinking} about the evaluation of scientific work. At this point it is questionable whether it is possible to change existing thinking. This should be considered especially because the scientific sector, unlike the industrial sector, cannot provide financial motivation due to limited resources. On the other hand, it must be considered that the individual focus of a work must also be taken into account. Thus, it is \textit{questionable} whether the \textit{scientific sector} is able to create such a large unit with regard to a \textit{common goal} as \textit{Netflix} did during the competition.
-It should be clearly emphasized that it is immensely important to use sharp \textit{baselines} as guidelines. However, in a \textit{scientific context} the \textit{goal} is not as \textit{precisely defined} as it was in the \textit{Netflix-Prize}. Rather, a large part of the work is aimed at investigating whether new methods such as \textit{neural networks} etc. are applicable to the \textit{recommender problem}.
-Regarding the results, however, it has to be said that they clearly support a \textit{rethinking} even if this should only concern a \textit{small part} of the work. On the website \textit{Papers with Code}\footnote{\url{https://paperswithcode.com/sota/collaborative-filtering-on-movielens-10m}} the \textit{public leaderboard} regarding the results obtained on the \textit{MovieLens10M-dataset} can be viewed. The source analysis of \textit{Papers with Code} also identifies the results given by \citet{Rendle19} as leading.
-In addition, \textit{future work} should focus on a more \textit{in-depth source analysis} which, besides the importance of the \textit{MovieLens10M-dataset} for the \textit{scientific community}, also examines whether and to what extent \textit{other datasets} are affected by this phenomenon.
-Due to the recent publication in spring \textit{2019}, this paper has not yet been cited frequently. So time will tell what impact it will have on the \textit{community}. Nevertheless, \citet{Dacrema2019} has already observed similar problems for \textit{top-n-recommender} based on this paper. According to this, \citet{Rendle} seems to have recognized an elementary and unseen problem and made it public. This is strongly reminiscent of the so-called \textit{Artificial-Intelligence-Winter (AI-Winter)} in which \textit{stagnation} in the \textit{development} of \textit{artificial intelligence} occurred due to too high expectations and other favourable factors. Overall the paper has the potential to \textit{counteract} the \textit{general hype} whose only purpose is to develop the best and only true model and thus \textit{prevent} a \textit{winter for recommender systems}.
+Overall, \citet{Rendle19} concludes that the last \textit{five years} of \textit{research} on the \textit{MovieLens10M-dataset} have produced hardly any new findings. Although the presented experiment followed the \textit{best practice} of the \textit{community}, the \textit{simplest matrix-factorization} methods clearly beat the reported results. The authors thus support the thesis that \textit{finding} and \textit{evaluating valid} and \textit{sharp baselines} is \textit{not trivial}. \textit{Empirical data} are collected because there is \textit{no formal evidence} in the field of \textit{recommender systems} that would make the methods comparable. From the \textit{numerical evaluation}, the authors identify the way a \textit{work is rated} in a \textit{scientific context} as a \textit{major problem}: a \textit{publication} is classified as \textit{not worth publishing} if it merely achieves \textit{better results with old methods}. Instead, most papers aim to \textit{distinguish themselves} from others by introducing new methods that beat the old ones. In this way, \textit{baselines} are \textit{not questioned} and the \textit{community} is steered in the wrong direction, as new work competes against \textit{insufficient baselines}.
 
+
+During the \textit{Netflix-Prize}, this problem was solved not only by the \textit{enormous prize money}. It turns out that the \textit{insights} gained there are more \textit{profound} and can be transferred to the \textit{MovieLens10M-dataset}. Thus, \textit{new techniques}, but \textit{no fundamentally new knowledge}, could be obtained on the \textit{MovieLens10M-dataset}.
\ No newline at end of file
diff --git a/critical_assessment.tex b/critical_assessment.tex
new file mode 100644
index 0000000000000000000000000000000000000000..9a204c5a0b6e967f8d23a08833b8b55f4c5f1da0
--- /dev/null
+++ b/critical_assessment.tex
@@ -0,0 +1,21 @@
+\newpage
+\section{Critical Assessment}
+With this paper, \citet{Rendle19} addresses the highly experienced reader. The simple structure of the paper is convincing because of the clear and direct way in which the problem is identified. Additionally, the paper can be seen as an \textit{addendum} to the \textit{Netflix-Prize}.
+
+The problem addressed by \citet{Rendle19} is already known from related fields such as \textit{information-retrieval} and \textit{machine learning}. For example, \citet{Armstrong09} described the phenomenon observed by \citet{Rendle19} in the context of \textit{information-retrieval systems}, namely that \textit{too weak baselines} are used. He also observes that \textit{experiments} are \textit{misinterpreted} because \textit{misunderstood indicators} such as \textit{statistical significance} are reported. In addition, \citet{Armstrong09} notes that the \textit{information-retrieval community} lacks an adequate overview of results and proposes a central collection of results reminiscent of the \textit{Netflix-Leaderboard}. \citet{Lin19} likewise observed the problem of \textit{too weak baselines} for \textit{neural networks}. The observation that \textit{too weak baselines} arise from purely empirical evaluation is also not unknown in the field of \textit{recommender systems}: \citet{Ludewig18} already observed the same problem for \textit{session-based recommender systems}. Such systems work only with data generated during a \textit{session} and try to predict the next \textit{user} selection. They, too, managed to achieve better results using \textit{session-based matrix-factorization}, which was inspired by the work of \citet{Rendle09} and \citet{Rendle10}. The authors see the problem in the fact that there are \textit{too many datasets} and \textit{different evaluation measures} for \textit{scientific work}. In addition, \citet{Dacrema19} take up the problem addressed by \citet{Lin19} and show that \textit{neural approaches} to the \textit{recommender-problem} can also be beaten by the simplest methods. They see the main problem in the \textit{reproducibility} of publications and call for a \textit{rethinking} of how results are \textit{verified} in this field. Furthermore, they also take a closer look at \textit{matrix-factorization} in this context.
+The listed work shows that it is not unknown that in some subject areas \textit{baselines} are \textit{too weak} and lead to \textit{stagnating development}. Especially considering that \textit{information-retrieval} and \textit{machine learning} are the \textit{cornerstones} of \textit{recommender systems}, it is not surprising to observe similar phenomena there. Nevertheless, the work published by \citet{Rendle19} stands out from the others: using the insights gained during the \textit{Netflix-Prize}, it underlines the problem of the \textit{lack of standards} and \textit{unity} for \textit{scientific experiments} identified in the work mentioned above.
+
+In contrast to the above-mentioned work, \citet{Rendle19} does not only recognize the problem for the \textit{MovieLens10M-dataset} in combination with \textit{matrix-factorization}. Rather, the problem is lifted one level higher, which results in a global and reflective, yet distanced, view of the \textit{best practice} in the field of \textit{recommender systems}.
+Besides calling for \textit{uniform standards}, \citet{Rendle19} criticizes the way the \textit{scientific community} thinks and thereby touches on the \textit{publication-bias} addressed by \citet{Sterling59}. The so-called \textit{publication-bias} describes the \textit{statistical distortion} of the published evidence within a \textit{scientific topic area} that arises because only successful or novel papers are published. \citet{Rendle19} clearly abstracts this problem from the presented experiment: the authors see the problem in the fact that a scientific paper is subject to a \textit{pressure to perform} which is based on the \textit{novelty} of such a paper. This thought can also be related to the \textit{file-drawer-problem} described by \citet{Rosenthal79}, which refers to the fact that many \textit{scientists}, out of concern about not meeting \textit{publication standards} such as \textit{novelty} or the expected \textit{impact on the community}, do not submit their results at all and prefer to \textit{keep them in a drawer}. Although these problems are not directly addressed, they can be abstracted from the detailed presentation. In contrast to the other works, a deliberate or incidental abstraction and naming of concrete and comprehensible problems is thus achieved.
+Nevertheless, criticism must also be levelled at the work published by \citet{Rendle19}. Despite the high standard of the work, the problems mentioned above can be identified but are not directly addressed by the authors; the work even lacks an embedding in this context. Only the experienced reader who is familiar with the problems addressed by \citet{Armstrong09}, \citet{Sterling59} and \citet{Rosenthal79} becomes aware of the contextual and historical embedding and value of the work. In contrast, \citet{Lin19}, published in the same period, succeeds in embedding its contribution in the contextual problem and in the previous work. Moreover, it is questionable whether the problem addressed can actually lead to a change in \textit{long-established thinking}, especially if one takes into account that many scientists are also investigating the \textit{transferability} of new methods to the \textit{recommender problem}. Thus, the call for research into \textit{better baselines} must be viewed from two perspectives. On the one hand, \textit{too weak baselines} can lead to a false understanding of new methods. On the other hand, stronger baselines could merely turn the numerical evaluation into a competitive race to find the best method, as was the case with the \textit{Netflix-Prize}. However, in the spirit of \citet{Sculley18}, it should always be remembered that \textit{"the goal of science is not wins, but knowledge"}.
+
+As the authors \citet{Rendle} and \citet{Koren} were significantly \textit{involved} in the \textit{Netflix-Prize}, the points mentioned above are convincing due to the experience they gained there. With their results they support the simple but not trivial statement that finding good \textit{baselines} requires an \textit{immense effort}, and that this effort has to be \textit{promoted} much more strongly in a \textit{scientific context}. This implies a change in the \textit{long-established thinking} about the evaluation of scientific work. At this point it is questionable whether existing thinking can be changed at all, especially because the scientific sector, unlike industry, cannot provide financial motivation due to limited resources. On the other hand, the individual focus of each work must also be taken into account. Thus, it is \textit{questionable} whether the \textit{scientific sector} is able to unite behind a \textit{common goal} on the scale that \textit{Netflix} achieved during the competition.
+It should be clearly emphasized that it is immensely important to use sharp \textit{baselines} as guidelines. However, in a \textit{scientific context} the \textit{goal} is not as \textit{precisely defined} as it was in the \textit{Netflix-Prize}. Rather, a large part of the work aims to investigate whether new methods such as \textit{neural networks} are applicable to the \textit{recommender problem} at all.
+Regarding the results, however, it has to be said that they clearly support a \textit{rethinking}, even if this should only concern a \textit{small part} of the work.
+
+On the website \textit{Papers with Code}\footnote{\url{https://paperswithcode.com/sota/collaborative-filtering-on-movielens-10m}} the \textit{public leaderboard} for results obtained on the \textit{MovieLens10M-dataset} can be viewed. This source analysis also identifies the results reported by \citet{Rendle19} as leading.
+In addition, \textit{future work} should focus on a more \textit{in-depth source analysis} which, besides the importance of the \textit{MovieLens10M-dataset} for the \textit{scientific community}, also examines whether and to what extent \textit{other datasets} are affected by this phenomenon.
+Due to its recent publication in spring \textit{2019}, this paper has not yet been cited frequently, so time will tell what impact it will have on the \textit{community}. Nevertheless, \citet{Dacrema19} have already observed similar problems for \textit{top-n recommenders} based on this paper. Accordingly, \citet{Rendle} seems to have recognized an elementary and previously overlooked problem and made it public.
+
+This is strongly reminiscent of the so-called \textit{Artificial-Intelligence-Winter (AI-Winter)}, in which \textit{stagnation} in the \textit{development} of \textit{artificial intelligence} occurred due to excessively high expectations and other contributing factors. Overall, the paper has the potential to \textit{counteract} the \textit{stagnation} in development and thus \textit{prevent} a \textit{winter for recommender systems}.
+
diff --git a/recommender.tex b/recommender.tex
index 3d92371d36b987e2775e76a603f0dddb61b886ae..13028aa8a316887d9f6abf65013b00a9779efcd4 100644
--- a/recommender.tex
+++ b/recommender.tex
@@ -8,7 +8,7 @@ Each of the \textit{users} in $\mathcal{U}$ gives \textit{ratings} from a set $\
 In the following, the two main approaches of \textit{collaborative-filtering} and \textit{content-based} \textit{recommender systems} will be discussed. In addition, it is explained how \textit{matrix-factorization} can be integrated into the two ways of thinking.
 
 \subsection{Content-Based}
-\textit{Content-based} \textit{recommender systems (CB)} work directly with \textit{feature vectors}. Such a \textit{feature vector} can, for example, represent a \textit{user profile}. In this case, this \textit{profile} contains informations about the \textit{user's preferences}, such as \textit{genres}, \textit{authors}, \textit{etc}.  This is done by trying to create a \textit{model} of the \textit{user}, which best represents his preferences. The different \textit{learning algorithms} from the field of \textit{machine learning} are used to learn or create the \textit{models}. The most prominent \textit{algorithms} are: \textit{tf-idf}, \textit{bayesian learning}, \textit{Rocchio's algorithm} and \textit{neuronal networks} \citep{Lops11, Ferrari19, DeKa11}. Altogether the built and learned \textit{feature vectors} are compared with each other. Based on their closeness, similar \textit{features} can be used to generate \textit{missing ratings}. Figure \ref{fig:cb} shows a sketch of the general operation of \textit{content-based recommenders}.
+\textit{Content-based} \textit{recommender systems (CB)} work directly with \textit{feature vectors}. Such a \textit{feature vector} can, for example, represent a \textit{user profile}. In this case, the \textit{profile} contains information about the \textit{user's preferences}, such as \textit{genres}, \textit{authors}, \textit{etc}. The aim is to create a \textit{model} of the \textit{user} that best represents these preferences. Various \textit{learning algorithms} from the field of \textit{machine learning} are used to learn or create such \textit{models}; the most prominent are \textit{tf-idf}, \textit{bayesian learning}, \textit{Rocchio's algorithm} and \textit{neural networks} \citep{Lops11, Dacrema19, DeKa11}. The constructed and learned \textit{feature vectors} are then compared with each other, and based on their closeness, similar \textit{features} can be used to generate \textit{missing ratings}. Figure \ref{fig:cb} shows a sketch of the general operation of \textit{content-based recommenders}.
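+
+As a purely illustrative addition (not taken from \citet{Rendle19} or the cited works), the following minimal sketch shows how such a \textit{content-based} approach could look in practice: hypothetical item descriptions are turned into \textit{tf-idf feature vectors}, a \textit{user profile} is built as the rating-weighted average of the vectors of the rated items, and unrated items are scored by \textit{cosine similarity} to this profile. The item texts, the ratings and the use of \textit{scikit-learn} are assumptions made only for this example.
+\begin{verbatim}
+# Minimal content-based recommender sketch (illustrative only).
+import numpy as np
+from sklearn.feature_extraction.text import TfidfVectorizer
+from sklearn.metrics.pairwise import cosine_similarity
+
+items = {                                # hypothetical item content
+    "m1": "action sci-fi space adventure",
+    "m2": "romantic comedy love story",
+    "m3": "sci-fi thriller space station crew",
+}
+user_ratings = {"m1": 5.0, "m2": 1.0}    # hypothetical known ratings
+
+item_ids = list(items)
+# Feature vectors for all items (rows of X), here built via tf-idf.
+X = TfidfVectorizer().fit_transform([items[i] for i in item_ids]).toarray()
+
+# User profile: rating-weighted average of the rated items' vectors.
+w = np.array([user_ratings.get(i, 0.0) for i in item_ids])
+profile = (w @ X) / w.sum()
+
+# Score unrated items by their closeness (cosine similarity) to the profile.
+scores = cosine_similarity(profile.reshape(1, -1), X)[0]
+for item_id, score in zip(item_ids, scores):
+    if item_id not in user_ratings:
+        print(item_id, round(float(score), 3))
+\end{verbatim}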
 
 \subsection{Collaborative-Filtering}
 Unlike the \textit{content-based recommender (CB)}, the \textit{collaborative-filtering recommender (CF)} not only considers individual \textit{users} and \textit{feature vectors}, but rather a \textit{like-minded neighborhood} of each \textit{user}.
diff --git a/references.bib b/references.bib
index fd227a7f3791cc73ec87a11433133de9a561543b..5d03be60ba282ee8674adde31ee730145b78fdec 100644
--- a/references.bib
+++ b/references.bib
@@ -51,15 +51,6 @@ editor = {P.B. Kantor and F. Ricci and L. Rokach and B. Shapira},
 publisher={Springer},
 doi = {10.1007/978-0-387-85820-3_4}
 }
-@inproceedings{Ferrari19,
-author = {Maurizio Ferrari Dacrema and Paolo Cremonesi and Dietmar Jannach},
-year = {2019},
-month = {07},
-pages = {},
-title = {Are We Really Making Much Progress? A Worrying Analysis of Recent Neural Recommendation Approaches},
-isbn = {978-1-4503-6243-6},
-doi = {10.1145/3298689.3347058}
-}
 @article{Kor09,
 author = {Yehuda Koren and 
 			  Robert Bell and
@@ -188,4 +179,107 @@ journal = {Proceedings of KDD Cup and Workshop}
   journal={ArXiv},
   year={2019},
   volume={abs/1911.07698}
-}
\ No newline at end of file
+}
+@inproceedings{Armstrong09,
+author = {Armstrong, Timothy and Moffat, Alistair and Webber, William and Zobel, Justin},
+year = {2009},
+month = {11},
+pages = {601-610},
+title = {Improvements that don't add up: Ad-hoc retrieval results since 1998},
+booktitle = {Proceedings of the 18th ACM Conference on Information and Knowledge Management (CIKM '09)},
+doi = {10.1145/1645953.1646031}
+}
+@article{Lin19,
+author = {Lin, Jimmy},
+title = {The Neural Hype and Comparisons Against Weak Baselines},
+year = {2019},
+issue_date = {January 2019},
+publisher = {Association for Computing Machinery},
+address = {New York, NY, USA},
+volume = {52},
+number = {2},
+issn = {0163-5840},
+url = {https://doi.org/10.1145/3308774.3308781},
+doi = {10.1145/3308774.3308781},
+journal = {SIGIR Forum},
+month = jan,
+pages = {40–51},
+numpages = {12}
+}
+@article{Ludewig18,
+  author    = {Ludewig, Malte and Jannach, Dietmar},
+  title     = {Evaluation of Session-based Recommendation Algorithms},
+  journal   = {CoRR},
+  volume    = {abs/1803.09587},
+  year      = {2018},
+  url       = {http://arxiv.org/abs/1803.09587},
+  archivePrefix = {arXiv},
+  eprint    = {1803.09587},
+  timestamp = {Mon, 13 Aug 2018 16:46:25 +0200},
+  biburl    = {https://dblp.org/rec/bib/journals/corr/abs-1803-09587},
+  bibsource = {dblp computer science bibliography, https://dblp.org}
+}
+@inproceedings{Rendle09,
+author = {Rendle, Steffen and Freudenthaler, Christoph and Gantner, Zeno and Schmidt-Thieme, Lars},
+year = {2009},
+title = {BPR: Bayesian Personalized Ranking from Implicit Feedback},
+booktitle = {Proceedings of the 25th Conference on Uncertainty in Artificial Intelligence, UAI 2009}
+}
+@inproceedings{Rendle10,
+author = {Rendle, Steffen and Freudenthaler, Christoph and Schmidt-Thieme, Lars},
+title = {Factorizing Personalized Markov Chains for Next-Basket Recommendation},
+year = {2010},
+isbn = {9781605587998},
+publisher = {Association for Computing Machinery},
+address = {New York, NY, USA},
+url = {https://doi.org/10.1145/1772690.1772773},
+doi = {10.1145/1772690.1772773},
+booktitle = {Proceedings of the 19th International Conference on World Wide Web},
+pages = {811–820},
+numpages = {10},
+keywords = {basket recommendation, markov chain, matrix factorization},
+location = {Raleigh, North Carolina, USA},
+series = {WWW ’10}
+}
+@article{Dacrema19,
+  author    = {Dacrema, Maurizio Ferrari and
+               Cremonesi, Paolo and Jannach, Dietmar},
+  title     = {Are We Really Making Much Progress? {A} Worrying Analysis of Recent
+               Neural Recommendation Approaches},
+  journal   = {CoRR},
+  volume    = {abs/1907.06902},
+  year      = {2019},
+  url       = {http://arxiv.org/abs/1907.06902},
+  archivePrefix = {arXiv},
+  eprint    = {1907.06902},
+  timestamp = {Tue, 23 Jul 2019 10:54:22 +0200},
+  biburl    = {https://dblp.org/rec/bib/journals/corr/abs-1907-06902},
+  bibsource = {dblp computer science bibliography, https://dblp.org}
+}
+@article{Sterling59,
+ ISSN = {01621459},
+ URL = {http://www.jstor.org/stable/2282137},
+ abstract = {There is some evidence that in fields where statistical tests of significance are commonly used, research which yields nonsignificant results is not published. Such research being unknown to other investigators may be repeated independently until eventually by chance a significant result occurs-an "error of the first kind"-and is published. Significant results published in these fields are seldom verified by independent replication. The possibility thus arises that the literature of such a field consists in substantial part of false conclusions resulting from errors of the first kind in statistical tests of significance.},
+ author = {Theodore D. Sterling},
+ journal = {Journal of the American Statistical Association},
+ number = {285},
+ pages = {30--34},
+ publisher = {American Statistical Association, Taylor \& Francis, Ltd.},
+ title = {Publication Decisions and Their Possible Effects on Inferences Drawn from Tests of Significance--Or Vice Versa},
+ volume = {54},
+ year = {1959}
+}
+@article{Rosenthal79,
+  title={The file drawer problem and tolerance for null results},
+  author={Rosenthal, Robert},
+  journal={Psychological Bulletin},
+  volume={86},
+  number={3},
+  pages={638--641},
+  year={1979}
+}
+@inproceedings{Sculley18,
+  title={Winner's Curse? On Pace, Progress, and Empirical Rigor},
+  author={D. Sculley and Jasper Snoek and Alexander B. Wiltschko and Ali Rahimi},
+  booktitle={ICLR},
+  year={2018}
+}
diff --git a/submission.pdf b/submission.pdf
index 7b34ced0d695ccfc057dcc0fa90df76d4dcea3ef..10f411085079a1565e2fb50d410271315f6de346 100644
Binary files a/submission.pdf and b/submission.pdf differ
diff --git a/submission.tex b/submission.tex
index f5036da2a51513e489f16d45263d9cc16bbfa299..ecfc290f63891c65d45219b1f6345eb1280a46ac 100644
--- a/submission.tex
+++ b/submission.tex
@@ -76,6 +76,7 @@ A Study on Recommender Systems}
 \input{recommender}
 \input{baselines}
 \input{conclusion}
+\input{critical_assessment}
 \newpage
 \bibliography{references}
 \bibliographystyle{plainnat}