diff --git a/recommender.tex b/recommender.tex
index d07e556a8f717da7cb9ca8c9422e51f7e08ae93b..f70837f5e86e0bfa42e3d4108f3aea936b6242d8 100644
--- a/recommender.tex
+++ b/recommender.tex
@@ -28,3 +28,15 @@ The first and easiest way to solve \textit{matrix-factorization} is to connect t
 \subsubsection{Regularized Matrix-Factorization}
 This approach extends the \textit{basic matrix factorization} by a \textit{regularization factor} $\lambda$ in the corresponding \textit{minimization problem}. Since $\mathcal{R}$ is sparsely populated, \textit{overfitting} may occur when learning from the few known values. The problem with \textit{overfitting} is that the model clings too closely to the observed \textit{ratings} and generalizes poorly. To counteract this, the magnitudes of the two vectors are taken into account: large magnitudes are penalized by the term $\lambda(\lVert q_i \rVert^2 + \lVert p_u \rVert^2)$ in the \textit{minimization problem}. Overall, the \textit{minimization problem} $\min_{p_u, q_i}{\sum_{(u,i) \in \mathcal{B}} (r_{ui} - \hat{r}_{ui})^{2} + \lambda(\lVert q_i \rVert^2 + \lVert p_u \rVert^2)}$ is to be solved.
+
+\subsubsection{Weighted Regularized Matrix-Factorization}
+A \textit{regularization factor} $\lambda$ is introduced in analogy to \textit{regularized matrix-factorization}. Additional \textit{weights} $\alpha$ and $\beta$ are introduced to penalize the magnitudes of the two vectors individually. The \textit{minimization problem} then corresponds to $\min_{p_u, q_i}{\sum_{(u,i) \in \mathcal{B}} (r_{ui} - \hat{r}_{ui})^{2} + \lambda(\alpha\lVert q_i \rVert^2 + \beta\lVert p_u \rVert^2)}$.
+
+\subsubsection{Biased Matrix-Factorization}
+A major advantage of \textit{matrix-factorization} is the ability to model simple relationships suited to the application. However, an ideal data source cannot always be assumed. Through the \textit{natural interaction} of \textit{users} with \textit{items}, \textit{preferences} arise. Such \textit{preferences} lead to \textit{behaviour patterns}, which manifest themselves as a \textit{bias} in the data. A \textit{bias} is not inherently bad, but it must be taken into account when modeling the \textit{recommender system}.
+The most popular model that takes \textit{bias} into account is called \textit{biased matrix-factorization}.
+Here, a \textit{missing rating} is no longer determined solely by the \textit{inner product} of the two vectors $q_i$ and $p_u$; the \textit{bias} is considered as well. Accordingly, a \textit{missing rating} is calculated by $\hat{r}_{ui} = b_{ui} + q_i^Tp_u$, where $b_{ui}$ is the \textit{bias} of a \textit{user} $u$ and an \textit{item} $i$. The \textit{bias} is determined by $b_{ui}=\mu + b_u + b_i$. The parameter $\mu$ is the \textit{global average} of all \textit{ratings} $r_{ui} \in \mathcal{R}$.
+Furthermore, $b_u = \mu_u - \mu$ and $b_i = \mu_i - \mu$.
+Here $\mu_u$ denotes the \textit{average} of all \textit{ratings assigned} by the \textit{user} $u$. Similarly, $\mu_i$ denotes the \textit{average} of all \textit{ratings received} by an \textit{item} $i$.
+Thus $b_u$ indicates the \textit{deviation} of a \textit{user's average assigned rating} from the \textit{global average}. Similarly, $b_i$ indicates the \textit{deviation} of an \textit{item's average rating} from the \textit{global average}.
+Finally, the \textit{minimization problem} is extended by the \textit{bias} terms. Accordingly, the \textit{minimization problem} is then $\min_{p_u, q_i, b_u, b_i}{\sum_{(u,i) \in \mathcal{B}} (r_{ui} - \hat{r}_{ui})^{2} + \lambda(\lVert q_i \rVert^2 + \lVert p_u \rVert^2 + b_u^2 + b_i^2)}$.
diff --git a/submission.pdf b/submission.pdf
index ac8586d4e91fc3f5ba09587a889413239cff6601..40918daa327a14f5ad122b9dacdcf980edfa02d6 100644
Binary files a/submission.pdf and b/submission.pdf differ
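The biased minimization problem added in this diff can be solved with stochastic gradient descent. Below is a minimal sketch of that idea; the function name `biased_mf`, the toy rating set, and all hyperparameters (`k`, `lam`, `lr`, `epochs`) are illustrative assumptions and not part of the submission:

```python
import numpy as np

def biased_mf(ratings, n_users, n_items, k=2, lam=0.1, lr=0.01, epochs=200, seed=0):
    """Fit biased matrix factorization with SGD (illustrative sketch).

    ratings: observed triples (u, i, r_ui), i.e. the set B.
    Minimizes  sum_(u,i) (r_ui - r_hat_ui)^2
               + lam * (||q_i||^2 + ||p_u||^2 + b_u^2 + b_i^2),
    with  r_hat_ui = mu + b_u + b_i + q_i^T p_u.
    """
    rng = np.random.default_rng(seed)
    P = 0.1 * rng.standard_normal((n_users, k))  # user vectors p_u
    Q = 0.1 * rng.standard_normal((n_items, k))  # item vectors q_i
    bu = np.zeros(n_users)                       # user biases b_u
    bi = np.zeros(n_items)                       # item biases b_i
    mu = sum(r for _, _, r in ratings) / len(ratings)  # global average

    for _ in range(epochs):
        for u, i, r in ratings:
            e = r - (mu + bu[u] + bi[i] + Q[i] @ P[u])  # residual
            # gradient steps (the constant factor 2 is folded into lr)
            bu[u] += lr * (e - lam * bu[u])
            bi[i] += lr * (e - lam * bi[i])
            P[u], Q[i] = (P[u] + lr * (e * Q[i] - lam * P[u]),
                          Q[i] + lr * (e * P[u] - lam * Q[i]))

    def predict(u, i):
        return mu + bu[u] + bi[i] + Q[i] @ P[u]
    return predict

# Toy example: 3 users, 3 items, six observed ratings on a 1-5 scale.
B = [(0, 0, 5), (0, 1, 3), (1, 0, 4), (1, 2, 1), (2, 1, 2), (2, 2, 1)]
predict = biased_mf(B, n_users=3, n_items=3)
print(f"predicted rating for user 0, item 2: {predict(0, 2):.2f}")
```

Note that the pair `(P[u], Q[i])` is updated simultaneously from the old values, so the two gradient steps use a consistent residual.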