diff --git a/baselines.tex b/baselines.tex
index f28562d38f9583338b04cae3e2c8c99e579c90bd..ad47585594e216cbd458d64b9b2edd49c15e7968 100644
--- a/baselines.tex
+++ b/baselines.tex
@@ -21,7 +21,7 @@ It can be clearly stated that the \textit{existing baselines} have been \textit
 \subsection{Experiment Realization}
 As the \textit{Netflix-Prize} has shown, \textit{research} and \textit{validation} are \textit{complex} even for very \textit{simple methods}. Intensive work on researching \textit{existing} and \textit{new reliable methods} was not limited to the \textit{Netflix-Prize}; the \textit{MovieLens10M-dataset} was used just as often. With their \textit{experiment}, the authors question whether the \textit{baselines} of \textit{MovieLens10M} are \textit{adequate} for the evaluation of new methods. To test their hypothesis, the authors transferred all the findings from the \textit{Netflix-Prize} to the existing baselines of \textit{MovieLens10M}.
-\subsubsection{Experiment Preparation}
+\subsubsection{Experiment Preparation}\label{sec:experiment_preparation}
 Before actually conducting the experiment, the authors took a closer look at the given baselines. In the process, they noticed some \textit{systematic overlaps}. These are shown in the \textit{table} below.
 \input{overlaps}
@@ -36,6 +36,12 @@ As a \textit{first intermediate result} of the preparation it can be stated that
 In addition, it can be stated that learning using the \textit{Bayesian approach} is better than learning using \textit{SGD}. Even if the results could differ under more efficient setups, it is still surprising that \textit{SGD} is worse than the \textit{Bayesian approach}, because the \textit{exact opposite} was reported for \textit{MovieLens10M}. For example, \textit{figure} \ref{fig:reported_results} shows that the \textit{Bayesian approach BPMF} achieved an \textit{RMSE} of \textit{0.8197} while the \textit{SGD approach Biased MF} performed better with \textit{0.803}.
 The fact that the \textit{Bayesian approach} outperforms \textit{SGD} has already been reported and validated by \citet{Rendle13} and \citet{Rus08} for the \textit{Netflix-Prize-dataset}. Looking more closely at \textit{figures} \ref{fig:reported_results} and \ref{fig:battle}, the \textit{Bayesian approach} scores better than the reported \textit{BPMF} and \textit{Biased MF} for each \textit{dimensional embedding}. Moreover, it even beats all reported baselines and new methods. Building on this, the authors went on to examine the methods and baselines in detail.
 \subsubsection{Experiment Implementation}
+For the actual execution of the experiment, the \textit{authors} used the knowledge they had gained from the \textit{preparations}. Even the two \textit{simple matrix-factorization models SGD-MF} and \textit{Bayesian MF}, which were trained with an \textit{embedding} of \textit{512 dimensions} over \textit{128 epochs}, performed extremely well. \textit{SGD-MF} achieved an \textit{RMSE} of \textit{0.7720}. This result alone was better than \textit{RSVD (0.8256)}, \textit{Biased MF (0.803)}, \textit{LLORMA (0.7815)}, \textit{Autorec (0.782)}, \textit{WEMAREC (0.7769)} and \textit{I-CFN++ (0.7754)}. In addition, \textit{Bayesian MF} with an \textit{RMSE} of \textit{0.7653} clearly beat the \textit{reported baseline BPMF (0.8197)} and came within \textit{0.0019} of the \textit{best reported algorithm MRMA (0.7634)}.
+As the \textit{Netflix-Prize} showed, the use of \textit{implicit data} such as \textit{time} or \textit{dependencies} between \textit{users} or \textit{items} could \textit{immensely improve existing models}. In addition to the two \textit{simple matrix factorizations}, \textit{table} \ref{table:models} shows the \textit{authors'} \textit{extensions} of the \textit{Bayesian approach}.
+
+\input{model_table}
+Because the \textit{Bayesian approach} gave more promising results, the given models were trained with it. For this purpose, the \textit{dimensional embedding} as well as the \textit{number of sampling steps} for the models were examined again. Again, the \textit{Gaussian distribution} was used for \textit{initialization}, as indicated in \textit{section} \ref{sec:experiment_preparation}. \textit{Figure} XY shows the corresponding results.
+
 \subsection{Observations}
 \subsubsection{Stronger Baselines}
 \subsubsection{Reproducibility}
diff --git a/model_table.tex b/model_table.tex
new file mode 100644
index 0000000000000000000000000000000000000000..fcd49928ed580ceae4db612eb39e3714573ee9cc
--- /dev/null
+++ b/model_table.tex
@@ -0,0 +1,18 @@
+% Please add the following required packages to your document preamble:
+% \usepackage{graphicx}
+\begin{table}[!ht]
+\centering
+\resizebox{\textwidth}{!}{%
+\begin{tabular}{|l|l|l|}
+\hline
+\textbf{Name} & \textbf{Features} & \textbf{Comment} \\ \hline
+\textit{Matrix-Factorization} & \textit{u}, \textit{i} & Simple \textit{matrix-factorization} similar to \textit{biased matrix-factorization} and \textit{RSVD}. \\ \hline
+\textit{timeSVD} & \textit{u}, \textit{i}, \textit{t} & Based on the \textit{matrix-factorization}, \textit{time dependencies} are taken into account. \\ \hline
+\textit{SVD++} & \textit{u}, \textit{i}, $\mathcal{I}_u$ & Based on the \textit{matrix-factorization}, the \textit{items} $\mathcal{I}_u$ that a \textit{user} has \textit{viewed} are included. \\ \hline
+\textit{timeSVD++} & \textit{u}, \textit{i}, \textit{t}, $\mathcal{I}_u$ & Combination of \textit{SVD++} and \textit{timeSVD}. \\ \hline
+\textit{timeSVD++ flipped} & \textit{u}, \textit{i}, \textit{t}, $\mathcal{I}_u$, $\mathcal{U}_i$ & Extension of \textit{timeSVD++} whereby all other \textit{users} $\mathcal{U}_i$ who have seen a certain \textit{item} are also taken into account.
\\ \hline
+\end{tabular}%
+}
+\caption{\textit{Models} and their \textit{features} created and used by the \textit{authors}.}
+\label{table:models}
+\end{table}
\ No newline at end of file
diff --git a/submission.pdf b/submission.pdf
index 2315d4853adef794b13235d3219823368f9e7a1a..f072dc6532dcb43d4107451fee2f931ef750412c 100644
Binary files a/submission.pdf and b/submission.pdf differ