Introduction
I have to confess that I spend a lot of time on YouTube. Browsing the YouTube homepage, I often notice that many of the videos match my interests or needs. If I click on a video and watch it for a long time, YouTube assumes that I enjoy it and keeps recommending similar videos. If I search for something on the Internet (not just on YouTube), relevant videos pop up shortly afterwards. Some of the videos may not even match my current interests, but they are ones that YouTube thinks I would like. For instance, the video of “Plastic Love (プラスティック・ラブ)” by Mariya Takeuchi was taking over everyone’s YouTube page in 2018. Almost immediately, the song became a popular hit in the City Pop music category. My curiosity about the mysterious algorithm behind YouTube dates back to then. The “algorithm” has a formal name: Recommender Systems.
How does the YouTube Recommender System work?
The architecture of YouTube recommendations was presented by Covington et al. [1]. According to Covington et al., YouTube recommendations face three major challenges:
- scale: YouTube has a massive user base and content corpus, and older recommendation approaches fail to work at such a scale.
- freshness: Many hours of video are uploaded to YouTube every second. To keep users up to date with the latest videos, the system has to balance new content with “well-established videos”.
- noise: Past user behavior is hard to predict because of sparsity, and the ground truth of user satisfaction is rarely available. Thus, YouTube models “noisy implicit feedback signals”(*) to deal with the noise issue.
Instead of traditional matrix factorization, YouTube recommendations use deep neural networks.
*Covington et al. did not explain the model in the paper, but I’ll make a note to look it up later.
System Overview
The overall idea of YouTube recommendation works like this: YouTube takes a corpus of millions of videos as input, candidate generation then narrows it down to hundreds of videos that are fed to ranking, and finally only dozens of videos are recommended on the user’s phone.
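To make the funnel concrete, here is a minimal Python sketch of a two-stage pipeline: a cheap candidate generation step over the whole corpus, followed by a more detailed ranking step over the survivors. This is not the model from the paper [1], which uses deep neural networks for both stages. The tag-overlap score, the watch-time score, the function names, and the toy corpus are all assumptions made up for illustration.

```python
import heapq
import random

# Hypothetical tag vocabulary for the toy corpus.
TAGS = ["music", "city pop", "cooking", "gaming", "news", "travel", "science"]


def make_corpus(n, seed=0):
    """Build a toy corpus of n videos with random tags and watch times."""
    rng = random.Random(seed)
    return [
        {
            "id": i,
            "tags": set(rng.sample(TAGS, 2)),
            "avg_watch_minutes": rng.uniform(0.5, 20.0),
        }
        for i in range(n)
    ]


def generate_candidates(corpus, interests, k=200):
    """Stage 1: a cheap, coarse score over the entire corpus (tag overlap)."""
    return heapq.nlargest(k, corpus, key=lambda v: len(v["tags"] & interests))


def rank(candidates, interests, k=20):
    """Stage 2: a richer score, affordable because the input is small."""
    def score(video):
        return len(video["tags"] & interests) * video["avg_watch_minutes"]
    return heapq.nlargest(k, candidates, key=score)


def recommend(corpus, interests, n_candidates=200, n_final=20):
    """Run the full funnel: corpus -> candidates -> final recommendations."""
    candidates = generate_candidates(corpus, interests, n_candidates)
    return rank(candidates, interests, n_final)


corpus = make_corpus(10_000)
recommendations = recommend(corpus, {"music", "city pop"})
print(len(corpus), "videos ->", 200, "candidates ->", len(recommendations), "recommended")
```

The point of the split is cost: the first stage must touch every video, so it has to be cheap, while the second stage only sees a few hundred videos and can afford a more expensive score.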
I made an illustration below to help me visualize the overall structure.
Precision and Recall
Precision (or positive predictive value) is the fraction of relevant instances among the retrieved instances.
\[\textrm{Precision} = \frac{\textrm{true positive}}{\textrm{true positive} + \textrm{false positive}}\]
Recall (or sensitivity) is the fraction of all relevant instances that were retrieved.
\[\textrm{Recall} = \frac{\textrm{true positive}}{\textrm{true positive} + \textrm{false negative}}\]
Visualizations of Precision and Recall [4]
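As a quick check of the two formulas, here is a small Python sketch that computes precision and recall for one list of recommendations. The video IDs and the `precision_recall` helper are hypothetical and used only for illustration. A true positive is a recommended video that is relevant, a false positive is a recommended video that is not relevant, and a false negative is a relevant video that was not recommended.

```python
def precision_recall(recommended, relevant):
    """Return (precision, recall) for a set of recommended items."""
    recommended, relevant = set(recommended), set(relevant)
    true_positive = len(recommended & relevant)
    false_positive = len(recommended - relevant)
    false_negative = len(relevant - recommended)
    # Guard against division by zero when either set is empty.
    precision = true_positive / (true_positive + false_positive) if recommended else 0.0
    recall = true_positive / (true_positive + false_negative) if relevant else 0.0
    return precision, recall


recommended = ["v1", "v2", "v3", "v4"]      # what the system showed
relevant = ["v2", "v4", "v7", "v8", "v9"]   # what the user actually likes
precision, recall = precision_recall(recommended, relevant)
print(precision, recall)  # 0.5 0.4
```

Here two of the four recommended videos are relevant (precision 2/4), and two of the five relevant videos were recommended (recall 2/5).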
Deep Candidate Generation Model Architecture
Deep candidate generation model architecture (*Covington et al.* [1])

## Deep Ranking Network Architecture

# Typical Problems and Solutions

- watch time prediction: weighted logistic regression
- cold start: they also take users' information from other websites

---

# Reference

[1] Covington, P., Adams, J., and Sargin, E. "Deep Neural Networks for YouTube Recommendations". In *Proceedings of the 10th ACM Conference on Recommender Systems* (*RecSys '16*). Association for Computing Machinery, New York, NY, USA, 191–198. DOI: https://doi.org/10.1145/2959100.2959190

[2] Marchal, C. ["How Youtube’s Algorithm Turned an Obscure 1980s Japanese Song Into an Enormously Popular Hit: Discover Mariya Takeuchi’s “Plastic Love”"](https://www.openculture.com/2018/10/youtubes-algorithm-turned-obscure-1980s-japanese-song-enormously-popular-hit-discover-mariya-takeuchis-plastic-love.html). *Music*. 2018.

[3] St. Michel, P. ["Mariya Takeuchi: The pop genius behind 2018's surprise online smash hit from Japan"](https://www.japantimes.co.jp/culture/2018/11/17/music/mariya-takeuchi-pop-genius-behind-2018s-surprise-online-smash-hit-japan/). *The Japan Times*. 2018.

[4] [Wiki: Precision and Recall](https://en.wikipedia.org/wiki/Precision_and_recall)