Add more text for experiments

Commit 67429e29 authored 5 years ago by Marc Feger
--- a/baselines.tex
+++ b/baselines.tex
@@ -3,6 +3,16 @@ This section reviews the \textit{main part} of the work represented by \citet{Re
 \subsection{Motivation and Background}
 As in many other fields of \textit{data-science}, a valid \textit{benchmark-dataset} is required for a proper execution of experiments. In the field of \textit{recommender systems}, the best known \textit{datasets} are the \textit{Netflix-} and \textit{MovieLens-dataset}. This section introduces both \textit{datasets} and shows the relationship of \citet{Koren}, one of the authors of this paper, to the \textit{Netflix-Prize}, in addition to the existing \textit{baselines}.
 \subsubsection{Netflix-Prize}
+The topic of \textit{recommender systems} first gained broad attention through the \textit{Netflix-Prize}. On \textit{October 2nd 2006}, the competition announced by \textit{Netflix} began with the \textit{goal} of beating the in-house \textit{recommender system Cinematch}, which achieved an \textit{RMSE} of \textit{0.9514}, by at least \textit{10\%}, i.e. of reaching an \textit{RMSE} of at most \textit{0.8563}.
+In total, the \textit{Netflix-dataset} was divided into three parts that can be grouped into two categories: \textit{training} and \textit{qualification}. In addition to a \textit{probe-dataset} for \textit{training} the algorithms, two further datasets were withheld to qualify the winners. The \textit{quiz-dataset} was used to calculate the \textit{score} of the \textit{submitted solutions} on the \textit{public leaderboard}. In contrast, the \textit{test-dataset} was used to determine the \textit{actual winners}. Each of the three parts contained around \textit{1,408,000 ratings} and had \textit{similar statistical properties}. Splitting the data in this way ensured that an improvement could not be achieved by \textit{simple hill-climbing-algorithms} on the \textit{public leaderboard}.
+It took a total of \textit{three years} and \textit{several hundred models} until the team \textit{``BellKor's Pragmatic Chaos''} was declared the \textit{winner} on \textit{21st September 2009}. They had managed to achieve an \textit{RMSE} of \textit{0.8554} and thus an \textit{improvement} of \textit{0.096}, which corresponds to roughly \textit{10.09\%} relative to \textit{Cinematch}. Such a result is extraordinarily good, considering that it took \textit{one year} of work and intensive research to reduce the \textit{RMSE} from \textit{0.8712 (progress award 2007)} to \textit{0.8616 (progress award 2008)}.
+The \textit{co-author} of the present paper, \citet{Koren}, was significantly involved in the work of this team. From the beginning of the competition, \textit{matrix-factorization methods} were regarded as promising approaches. Even with the simplest \textit{SVD} methods, \citet{Kurucz07} achieved \textit{RMSE values} of \textit{0.94}.
+The \textit{breakthrough} came with \citet{Funk06}, who achieved an \textit{RMSE} of \textit{0.93} with his \textit{FunkSVD}.
+Based on this, more and more work was invested in researching simple \textit{matrix-factorization methods}.
+Thus, \citet{Zh08} presented an \textit{ALS variant} with an \textit{RMSE} of \textit{0.8985} and \citet{Koren09} presented an \textit{SGD variant} with an \textit{RMSE} of \textit{0.8995}.
+\textit{Implicit data} were also used. For example, \citet{Koren09} achieved an \textit{RMSE} of \textit{0.8762} by extending \textit{SVD++} with a \textit{time variable}. This model was then called \textit{timeSVD++}.
+
+The \textit{Netflix-Prize} made it clear that even the \textit{simplest methods} are \textit{not trivial} and that a \textit{reasonable investigation} and \textit{evaluation} require an \textit{immense effort} from within the \textit{community}.
 \subsubsection{MovieLens}
 \subsection{Experiment Realization}
 \subsubsection{Experiment Preparation}

--- a/references.bib
+++ b/references.bib
@@ -123,3 +123,18 @@ doi = {10.1145/1390156.1390267}
  howpublished = {\url{https://ieeexplore.ieee.org/author/37414256700}},
  note = {Accessed: 2019-12-21},
 }
+@article{Kurucz07,
+author = {Miklós Kurucz and András Benczúr and Károly Csalogány},
+year = {2007},
+month = {01},
+pages = {},
+title = {Methods for large scale SVD with missing values},
+journal = {ACM KDDCup 2007}
+}
+@article{Koren09,
+author = {Yehuda Koren},
+year = {2009},
+month = {09},
+pages = {},
+title = {The BellKor solution to the Netflix Grand Prize}
+}
\ No newline at end of file
--- a/submission.pdf
+++ b/submission.pdf