To prepare the two learning procedures, both were initialized with a \textit{gaussian normal distribution}.
For both approaches the number of \textit{sampling steps} was then set to \textit{128}. Since \textit{SGD} has two additional \textit{hyperparameters} $\lambda, \gamma$, these were determined as well. Overall, the \textit{MovieLens10M-dataset} was evaluated by a \textit{10-fold cross-validation} over a \textit{random global} and \textit{non-overlapping 90:10 split}: in each split, \textit{90\%} of the data was used for \textit{training} and the remaining \textit{10\%} for \textit{evaluation}.
Within each split, \textit{95\%} of the \textit{training data} was used for \textit{training} and the remaining \textit{5\%} for \textit{validation} to determine the \textit{hyperparameters}. The \textit{hyperparameter search} was performed as described in \textit{section} \ref{sec:sgd} using the \textit{grid} $(\lambda \in \{0.02, 0.03, 0.04, 0.05\}, \gamma \in \{0.001, 0.003\})$, which was inspired by findings during the \textit{Netflix-Prize} \citep{Kor08, Paterek07}. This search yielded $\lambda=0.04$ and $\gamma=0.003$. Afterwards, both \textit{learning methods} and their settings were compared: the \textit{RMSE} was plotted against the used \textit{dimension} $f$ of $p_u, q_i \in \mathbb{R}^f$. \textit{Figure} \ref{fig:battle} shows the corresponding results.
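To make the roles of the two \textit{hyperparameters} explicit, it helps to sketch the standard \textit{biased matrix-factorization} formulation from the \textit{Netflix-Prize} literature \citep{Kor08}; the authors' exact model may differ in detail. With global mean $\mu$, biases $b_u, b_i$, and observed error $e_{ui}$, the prediction and the \textit{SGD} updates read
\begin{align}
    \hat{r}_{ui} &= \mu + b_u + b_i + p_u^{\top} q_i, & e_{ui} &= r_{ui} - \hat{r}_{ui},\\
    p_u &\leftarrow p_u + \gamma \left( e_{ui}\, q_i - \lambda\, p_u \right), & q_i &\leftarrow q_i + \gamma \left( e_{ui}\, p_u - \lambda\, q_i \right),
\end{align}
so that $\gamma$ acts as the \textit{learning rate} and $\lambda$ as the \textit{regularization weight}. The reported \textit{RMSE} over a test set $\mathcal{T}$ is $\sqrt{\tfrac{1}{|\mathcal{T}|}\sum_{(u,i) \in \mathcal{T}} \left( r_{ui} - \hat{r}_{ui} \right)^2}$.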
\input{battle}
\newpage
As a \textit{first intermediate result} of the preparation, it can be stated that both \textit{SGD} and the \textit{gibbs-sampler} achieve better \textit{RMSE values} with increasing \textit{dimensional embedding}.
In addition, learning using the \textit{bayesian approach} turns out to be better than learning using \textit{SGD}. Even if the results could differ under more carefully tuned setups, it is still surprising that \textit{SGD} is worse than the \textit{bayesian approach}, although the \textit{exact opposite} was reported for \textit{MovieLens10M}. For example, \textit{figure} \ref{fig:reported_results} shows that the \textit{bayesian approach BPMF} achieved an \textit{RMSE} of \textit{0.8187} while the \textit{SGD approach Biased MF} performed better with \textit{0.803}. That the \textit{bayesian approach} outperforms \textit{SGD} has already been reported and validated by \citet{Rendle13} and \citet{Rus08} for the \textit{Netflix-Prize-dataset}.
Looking more closely at \textit{figures} \ref{fig:reported_results} and \ref{fig:battle}, the \textit{bayesian approach} scores better than the reported \textit{BPMF} and \textit{Biased MF} for every \textit{dimensional embedding}. Moreover, it even beats all reported baselines and newer methods. Building on this, the authors examined the methods and baselines in detail.
For the actual execution of the experiment, the \textit{authors} used the knowledge gained from the \textit{Netflix-Prize}.
As the \textit{Netflix-Prize} showed, the use of \textit{implicit data} such as \textit{time} or \textit{dependencies} between \textit{users} or \textit{items} can \textit{immensely improve existing models}. In addition to the two \textit{simple matrix factorizations}, \textit{table} \ref{table:models} shows the \textit{authors'} \textit{extensions} of the \textit{bayesian approach}.
\input{model_table}
As the \textit{bayesian approach} gave more promising results, the given models were trained with it. For this purpose, the \textit{dimensional embedding} as well as the \textit{number of sampling steps} were examined again for each model. Again, the \textit{gaussian normal distribution} was used for \textit{initialization} as indicated in \textit{section} \ref{sec:experiment_preparation}. \textit{Figure} \ref{fig:bayes_evaluation} shows the corresponding results.
\input{bayes_evaluation}
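The role of the \textit{sampling steps} in the \textit{bayesian approach} can be illustrated in a few lines. This is only a toy sketch, not the authors' implementation: the "sampler" below draws noisy factor matrices around fixed ground-truth factors, the synthetic data sizes and noise scale are assumptions, and the point is merely that averaging predictions over successive Gibbs-style samples drives the \textit{RMSE} of the running average down, as in \textit{figure} \ref{fig:bayes_sampling_steps}.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy ground truth: a low-rank rating matrix from user/item factors
# (sizes and noise scale are illustrative assumptions).
n_users, n_items, f = 30, 40, 4
P_true = rng.normal(size=(n_users, f))
Q_true = rng.normal(size=(n_items, f))
R = P_true @ Q_true.T

def gibbs_like_sample():
    """Stand-in for one Gibbs draw: true factors plus posterior-style noise."""
    P = P_true + rng.normal(scale=0.5, size=P_true.shape)
    Q = Q_true + rng.normal(scale=0.5, size=Q_true.shape)
    return P @ Q.T

def rmse(pred):
    return float(np.sqrt(np.mean((R - pred) ** 2)))

# The bayesian prediction averages over sampling steps; the RMSE of the
# running average falls as more samples are accumulated.
running_sum = np.zeros_like(R)
errors = []
for step in range(1, 129):
    running_sum += gibbs_like_sample()
    errors.append(rmse(running_sum / step))

print(f"RMSE after   1 step : {errors[0]:.4f}")
print(f"RMSE after 128 steps: {errors[-1]:.4f}")
```

Averaging the 128 per-sample predictions reduces the noise variance roughly by the number of samples, which is why the curve flattens rather than improving indefinitely.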
\subsection{Observations}
The first observation that emerges from \textit{figure} \ref{fig:bayes_sampling_steps} is that an \textit{increase} in \textit{sampling steps} with a \textit{fixed dimensional embedding} results in an \textit{improvement} in \textit{RMSE} for all models. \textit{Figure} \ref{fig:bayes_dimensional_embeddings} further shows that an \textit{increase} in the \textit{dimensional embedding} at \textit{512 sampling steps} likewise leads to an \textit{improvement} in the \textit{RMSE} for all models. Thus, both the \textit{number of sampling steps} and the size of the \textit{dimensional embedding} influence the \textit{RMSE} of \textit{matrix-factorization models} when they are trained using the \textit{bayesian approach}.
As a second finding, the \textit{RMSE values} of the created models can be taken from \textit{figure} \ref{fig:bayes_dimensional_embeddings}. Several points stand out. Firstly, the \textit{individual inclusion} of \textit{implicit knowledge} such as \textit{time} or \textit{user behaviour} leads to a significant \textit{improvement} in the \textit{RMSE}. For example, models like \textit{bayesian timeSVD (0.7587)} and \textit{bayesian SVD++ (0.7563)}, each of which uses a single kind of implicit knowledge, beat the \textit{simple bayesian MF} with an \textit{RMSE} of \textit{0.7633}. In addition, the \textit{combination} of \textit{implicit data} further improves the \textit{RMSE}: \textit{bayesian timeSVD++} achieves an \textit{RMSE} of \textit{0.7523}. Finally, \textit{bayesian timeSVD++ flipped} reaches an \textit{RMSE} of \textit{0.7485} by adding \textit{more implicit data}.
This leads to the third and most significant observation of the experiment. Firstly, the \textit{simple bayesian MF} with an \textit{RMSE} of \textit{0.7633} already beats the best reported method \textit{MRMA} with an \textit{RMSE} of \textit{0.7634}. Furthermore, \textit{MRMA} could be surpassed with \textit{bayesian timeSVD++ flipped} by \textit{0.0149} with respect to the \textit{RMSE}. Such a result is astonishing, as it took \textit{one year} during the \textit{Netflix-Prize} to reduce the leading \textit{RMSE} from \textit{0.8712 (progress award 2007)} to \textit{0.8616 (progress award 2008)}. Additionally, this result is remarkable as it \textit{challenges} the \textit{last 5 years} of research on the \textit{MovieLens10M-dataset}. Based on these results, the \textit{authors} see the first problem with the \textit{results} reported on the \textit{MovieLens10M-dataset} in the fact that they were \textit{compared against} too \textit{weak baselines}.
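The gaps discussed here follow directly from the \textit{RMSE} values quoted in the surrounding text; a quick arithmetic check (all numbers are taken from the text, none are new):

```python
# RMSE values quoted in the surrounding text.
mrma = 0.7634                # best previously reported method (MRMA)
bayesian_mf = 0.7633         # simple bayesian MF
timesvdpp_flipped = 0.7485   # bayesian timeSVD++ flipped

# Gap between the strongest model here and the best reported method.
gap = round(mrma - timesvdpp_flipped, 4)

# For scale: one year of Netflix-Prize progress (2007 -> 2008 awards).
netflix_year_gap = round(0.8712 - 0.8616, 4)

print(gap, netflix_year_gap)  # -> 0.0149 0.0096
```

The improvement over \textit{MRMA} is thus larger than a full year of leading-edge progress during the \textit{Netflix-Prize}.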
\subsubsection{Stronger Baselines}
\subsubsection{Reproducibility}
\subsubsection{Inadequate validations}
\begin{figure}[!ht]
    \centering
    \includegraphics[scale=0.60]{Bilder/battle.png}
    \caption{Comparison of \textit{matrix-factorization} learned by \textit{gibbs-sampling (bayesian learning)} and \textit{stochastic gradient descent (SGD)} for an \textit{embedding dimension} from \textit{16} to \textit{512}.}
    \label{fig:battle}
\end{figure}
\begin{figure}[!ht]
\centering
\begin{subfigure}[b]{0.45\linewidth}
\includegraphics[width=\linewidth]{Bilder/bayes_sampling_steps.png}
\caption{\textit{RMSE} vs. \textit{sampling steps}}
\label{fig:bayes_sampling_steps}
\end{subfigure}
\begin{subfigure}[b]{0.45\linewidth}
\includegraphics[width=\linewidth]{Bilder/bayes_dimensional_embedding.png}
\caption{\textit{RMSE} vs. \textit{dimensional embedding}}
\label{fig:bayes_dimensional_embeddings}
\end{subfigure}
\caption{Final evaluation of the \textit{number of sampling steps} and \textit{dimensional embedding} for the designed models. \textit{Figure} \ref{fig:bayes_sampling_steps} shows the \textit{number of sampling steps} with a \textit{dimensional embedding} of \textit{128} against the corresponding \textit{RMSE}. \textit{Figure} \ref{fig:bayes_dimensional_embeddings} shows the \textit{RMSE} generated by \textit{512 sampling steps} with \textit{variable dimensional embedding}.}
\label{fig:bayes_evaluation}
\end{figure}
\begin{figure}[!ht]
    \centering
    \includegraphics[scale=0.60]{Bilder/reported_results.png}
    \caption{\textit{Results obtained} on the \textit{MovieLens10M-dataset} over the last \textit{5 years}. The \textit{y-axis} shows the corresponding \textit{RMSE} values and the \textit{x-axis} shows the \textit{year} in which the corresponding method was developed. \textit{Blue} marked points show \textit{newer methods} that have \textit{competed} against the points shown in \textit{black}. \citep{Rendle19}}
    \label{fig:reported_results}
\end{figure}