Mind-reading Algorithms: An Introduction to Recommender Systems
Tonight’s game plan: A hot meal, a warm bed, and a couple of hours of Netflix. Best of all, you are about to make it happen.
You put the key in the front door, turn the lock, and push the door open…
…That last episode of Black Mirror was somewhat disturbing. Maybe I should give Better Call Saul a chance tonight…
You are halfway through the front door…
…Probably won’t be as good as Breaking Bad but… Wait, what’s that?
There is music playing nearby. You step back outside and listen. It’s coming from that new store next to your building. It is a song you’ve had in your mind for the last couple of days. There is a red carpet laid down in front of the store.¹
“E-Store,” you whisper to yourself, reading what’s on the glowing red neon sign on top of the store. You hadn’t paid much attention to this place the few times you passed by.
You will do a quick inspection of what is on display. After all, a few minutes of window shopping never harmed anyone.
You enter the store and notice that it is warmly lit. There are various shelves with items and some clerks moving and sorting stuff out. There is a small stage at the center of the store with a couch and a tiny shelf.
Upon closer examination, you notice that the tiny shelf on the stage holds only running shoes in your size and from your favorite sports brand. You had been thinking about taking up running again.
Is it time I do something about it?
You select a pair of shoes and head to the cashier. On your way there, you spot a shelf with running shirts and pants². A glance at them reveals that some would make for a great outfit when combined with your new shoes. You’ll take a quick look.
I cannot go out there looking like an amateur!
With your now complete running outfit, you resume your way to the cashier. While handing in the items to the clerk, you notice a poster of your favorite actors running a marathon³. They are wearing almost the same outfit you are about to buy. The only difference is that you are missing the smartwatch they are wearing.
What are the odds?
The clerk smiles and points toward a smartwatch on display next to the cashier. You happily oblige.
A few hours later, after leaving the store, you start to reflect on your spending spree. Besides the outfit and smartwatch, you bought a pair of sunglasses, 1 kg of protein powder, a gym subscription, and a health plan.
Maybe that bit of window shopping was not that harmless after all…
What seems like a far-fetched story for brick-and-mortar stores is the bread and butter of many internet services. The strategies used by the story’s E(vil)-store to capture your attention have existing digital analogues, which you can check in the references of this article. Most of these techniques are part of a field referred to as Recommender Systems.
Nowadays, Recommenders are ubiquitous. Chances are that you are reading this article because of a suggestion generated by one. They are responsible for 35% of what users buy on Amazon⁴, 75% of what people watch on Netflix, and 70% of watch time on YouTube⁵.
These algorithms are so ingrained in our society that people have even started to get wary of their risks⁶. Anti-vaxxers, flat-earth proponents, and other conspiracy theorists have learned to manipulate these systems to recommend their content at disproportionately high rates. Those who argued for a New Enlightenment Era driven by the internet most likely did not have present-day YouTube in mind.
So far, this article is not helping to rehabilitate the image of Recommendation Systems. But my goal is to focus on their positive side. Recommenders provide us with a valuable service: they enable decision-making by reducing uncertainty over choices. Furthermore, as we will see in the next sections, in a digital world of endless options, this is not an easy task.
In this article, I try to share an intuitive idea of the What, Why, and How of Recommenders. I am not aiming to cover implementation or technical details, so beware if that is what you are looking for.
In a nutshell, if you need to explain what a recommender is to your boss, this article might help. Conversely, if you need to build a recommender for your boss, this article might help to distract him while you search for other articles!
Recommendation Systems 101
A Recommendation System or a Recommender is a set of techniques used for suggesting the most suitable items to users based on their needs. This definition sounds simple, yet it conceals many details.
In the context of recommenders, an item is a very malleable idea. It could range from movies or songs in entertainment applications to possible love or mating partners in a dating app. Based on the items’ qualities, the recommender tries to guess which items are the most suitable to suggest to a given user. Thus, if you have a history of watching action movies, it’s fair to assume that you’ll prefer movies like Fast & Furious to the latest romantic drama on Netflix.
Suitability is also a subjective matter. As a user, you expect a recommender to provide you with the best possible option for your needs, as fast as possible, and at the lowest cost. A business, on the other hand, is trying to make a living, so the way the recommender provides suggestions will need to reflect that end. One can expect, then, that the needs of users and businesses will sometimes clash.
Charles Duhigg popularized a telling example of recommendations going too far. In his book, The Power of Habit, he describes the case of an angry father who found out his teenage daughter was pregnant through a targeted ad. The advertising company, using his daughter’s purchasing history, inferred that she could soon need baby clothes and sent coupons for them. The unsuspecting father received the mail and found the coupons. Shortly after complaining to a representative of the company, he learned from his daughter that she was indeed pregnant.
For this article, I deliberately chose a definition of Recommender Systems that was not limited to software or computer systems. That is because these systems are not purely technical artifacts that only big technology companies can build. Moreover, they are not limited to the digital world or even to human affairs.
Hunter-gatherer civilizations needed to recommend the best foraging places to others for their survival. Kings had panels of ministers to suggest courses of action in essential areas of government. Even in the animal kingdom, ants leave traces behind to suggest to others in the colony the best routes towards food⁷.
Only in recent times has the use of Recommender Systems extended to a wide range of digital services. They became necessary in applications where the number of choices was excessive. The research into such systems started at Duke University in the seventies. Yet, it took two decades for the arrival of the first known software-based Recommender System, Tapestry. It was developed at Xerox Palo Alto Research Center (PARC) and published in the journal Communications of the ACM in 1992⁸.
Xerox PARC researchers during an informal meeting⁹. Probably complaining about all the cat images filling their inboxes.
Using Tapestry, the Xerox PARC researchers tried to handle all the unnecessary documents they were receiving due to the increasing use of electronic mail. This system relied on people’s collaboration to tag documents based on their reactions. Then, those tags were used to create personal filters that reduced the number of incoming documents per user. For instance, Alice could create filters to only receive documents tagged as Funny by Bob and Joe, and to receive documents tagged as Important by Mary¹⁰.
But how did an algorithm that started as a filter for documents become so rooted in our present-day digital services? That is what we will go through in the next section.
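For the curious, Alice's filters can be sketched in a few lines of Python. This is only an illustration of the example above, not Tapestry's actual query language: the `Annotation` structure, the `alice_filter` function, and the sample documents are all assumptions made up for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """A tag that one person attached to one document (hypothetical structure)."""
    doc_id: str
    tagger: str
    tag: str


def alice_filter(annotations):
    """Return the ids of documents that pass Alice's personal filters.

    Alice's rules, taken from the example in the text:
      - documents tagged "Funny" by Bob or Joe
      - documents tagged "Important" by Mary
    """
    wanted = {("Bob", "Funny"), ("Joe", "Funny"), ("Mary", "Important")}
    return sorted({a.doc_id for a in annotations if (a.tagger, a.tag) in wanted})


# Made-up annotations, for illustration only.
annotations = [
    Annotation("memo-1", "Bob", "Funny"),
    Annotation("memo-2", "Mary", "Funny"),      # Mary's "Funny" tags are not in Alice's filter
    Annotation("memo-3", "Mary", "Important"),
    Annotation("memo-4", "Joe", "Boring"),
]

inbox = alice_filter(annotations)
print(inbox)  # ['memo-1', 'memo-3']
```

Out of four tagged documents, only two reach Alice. The filtering is driven entirely by other people's reactions, which is the "collaborative" part of the idea.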
For the sake of simplicity, from now on we will focus solely on software-based Recommenders and will refer to them using the broad terms Recommendation Systems, Recommender Systems, and Recommenders.
The Problems with Small Bookshelves and Infinite Bookshelves
Imagine you are about to open a bookstore in your town. It feels like a terrible idea now that Amazon dominates the market. Even so, nobody will stop your entrepreneurial drive. You’ve already signed a lease for a small but well-located place and are also planning on offering your signature espresso to customers.
A while ago, you received catalogs of books from a few publishing houses, and today you need to decide which books will fill the shelves. But, as you read through the first catalog, making a decision feels more and more daunting.
Should I order the latest book by Paulo Coelho?…
What about The Hunger Games series?…
And the recently-published Memories of a Ranch Dresser Expert from my friend Derek¹¹?
Shelf space limits how many books you can have at a time. As you probably want to survive over the long term, it is sensible to focus on the most popular books. Sorry, Derek…
In this case, caring for the individual desires of customers is impossible. You don’t have enough space for so many books. If you want to make money, you need to put on display what you know is in demand. Some clients will not find what they are looking for, but the majority will be happy just by buying the most popular offerings.
Now, imagine that a few years have gone by. Your strategy is working like a charm. Customers are very happy with your selection of books and signature espresso. So much so that a major bookstore chain recently made a huge offer for your store. They want to name you as CEO to drive their newly established digital strategy.
Finally, you don’t have to worry about limited shelf space anymore. The company’s homepage is an infinite and fully-customizable bookshelf. Also, you have access to an inventory as big as Amazon’s. You just need to figure out which books to show, out of the fifty million available, to each of the ten million customers you are expecting next month… Hmm…
You could stick to your previous strategy of showing each customer the most popular books on the homepage. Yet, millions of customers will have little interest in what you are showing to them. Besides, you will not exploit your vast inventory’s potential. The end result could be millions of angry customers and under-performing sales.
Another option could be to show all the available books on the homepage. Nonetheless, you are at risk of the Paradox of Choice. Humans, when faced with abundant choices, instead of feeling happier, get irritated and anxious¹². Thus, you might end up with angrier customers and fewer sales.
It has been a couple of weeks, and the Board of Directors is already having second thoughts about your appointment.
Desperate, you step out of the office and into the rainy night. Looking at the skies, you scream, asking for a way out…
Suddenly, your mobile phone vibrates. You spend a few seconds struggling to unlock your phone under the rain until you can read the notification.
“Wondering what to watch next? We suggest Black Mirror: Bandersnatch” reads a small Netflix banner on your dripping-wet screen.
Ugh… Thanks, but right now is not the best of times…
Or… Is it?
That is when recommendation systems step in. Usually, in a less dramatic manner.
There is a middle road between providing all possible choices to a user and generic choices to all users. It is possible to provide a few but well-thought-out suggestions to each user by using a recommender.
For that, you do not need to care about the popularity of books. You could match each customer’s interests with books’ attributes such as genre, length, and author. For instance, you might find that a few customers would react better to The Lightbringer series than to Game of Thrones (GoT). In an online bookstore, that is something you can and need to care for. In a physical store, those same customers will most likely need to settle for the GoT series.
The difference in how customers’ demands are met in the online and physical worlds is referred to as The Long Tail. The figure below is a visual aid to understanding this phenomenon. Each bar on the horizontal axis represents an item. These bars are ordered by decreasing popularity (represented on the vertical axis).
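To make the matching idea a bit more concrete, here is a minimal sketch in Python. It assumes each book is described by a set of attribute labels and scores books by their overlap with a customer's interests, using Jaccard similarity. The catalog, the labels, and the `recommend` function are invented for illustration; real systems use far richer features.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets: |a & b| / |a | b|."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical catalog: each book is described by a set of attribute labels.
catalog = {
    "Book A": {"fantasy", "long", "series"},
    "Book B": {"fantasy", "short", "standalone"},
    "Book C": {"romance", "short", "standalone"},
}


def recommend(interests, catalog, top_n=2):
    """Rank books by how well their attributes overlap with the customer's interests."""
    scored = [(jaccard(interests, attrs), title) for title, attrs in catalog.items()]
    # Highest score first; ties are broken alphabetically by title.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    # Books with no overlap at all are never suggested.
    return [title for score, title in scored[:top_n] if score > 0]


customer_interests = {"fantasy", "long", "series"}
suggestions = recommend(customer_interests, catalog)
print(suggestions)  # ['Book A', 'Book B']
```

Notice that popularity plays no role here: a book nobody else has bought can still come out on top if it fits this particular customer.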
The Long Tail. Physical stores define what they show to users by their shelf space limitations. Online stores use Recommenders to define what to show.
The bars to the left of the dotted vertical line are the items that a physical store may display given its space limitations. In contrast, an online store could display the entire range of items: the tail as well as the popular ones¹³. Recommenders are meant to solve the issue of displaying an excessive number of options in an online context.
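The same split can be sketched with a handful of numbers. The snippet below is a toy example under made-up assumptions: the sales figures, the item names, and the `split_head_and_tail` helper are hypothetical, and the shelf is assumed to hold only the two best-selling items.

```python
def split_head_and_tail(sales_by_item, shelf_capacity):
    """Split items into the popular 'head' that fits on a physical shelf and the 'tail'."""
    # Order items by decreasing popularity, like the bars in the Long Tail figure.
    ranked = sorted(sales_by_item.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:shelf_capacity], ranked[shelf_capacity:]


# Made-up yearly unit sales, for illustration only.
sales_by_item = {
    "bestseller": 500,
    "popular novel": 300,
    "cookbook": 100,
    "niche memoir": 40,
    "poetry zine": 35,
    "local history": 25,
}

head, tail = split_head_and_tail(sales_by_item, shelf_capacity=2)
total = sum(sales_by_item.values())
tail_share = sum(units for _, units in tail) / total

print([name for name, _ in head])                    # ['bestseller', 'popular novel']
print(f"Demand only an online store can serve: {tail_share:.0%}")  # 20%
```

In this toy catalog, a fifth of the demand sits in items that a physical store would never put on display.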
So far we have seen what recommenders are and the problems they solve. Now we will review the different ways Recommenders generate suggestions.
Fantastic Recommenders and Where To Find Them
Besides the What and Why of recommenders, it also makes sense to get an idea of how these systems are usually built. For that, we will review the standard six categories of recommenders¹⁴ and which technology companies have made use of them¹⁵:
- Content-based (CB): recommends items similar to the ones that the user has liked. To identify the similarity, the recommender uses characteristics or features of the items. For a book recommender, the algorithm could use genre, author, or book length as features to recommend similar books. Used by: Facebook and Amazon
- Collaborative filtering (CF): recommends items that other users with similar tastes liked in the past. The reasoning behind CF is that “two or more individuals sharing some similar interests in one area tend to get inclined towards similar items or products from some other areas too.” Used by: Amazon, Facebook, LinkedIn, and Twitter
- Demographic: recommends items based on the demographic profile of the user. These systems usually segment users following business-specific rules and generate recommendations based on those segments. Used by: eBay
- Knowledge-based: recommends items by matching the user’s explicit needs to items’ features. For instance, you specify the number of bedrooms and the floor space, and the website returns a list of the best-matching houses.
- Community-based: recommends items using the user’s friends’ preferences: Tell me who your friends are, and I’ll tell you what you like.
- Hybrid: suggests items by combining two or more of the previous techniques. A typical case is to combine a collaborative filtering approach with a content-based system. Used by: Amazon and Netflix
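As a taste of how one of these categories might look in code, here is a minimal sketch of collaborative filtering in plain Python. It is just one simple variant (user-based, with cosine similarity between users' ratings), not how any of the companies above actually do it. The ratings, the user names, and the `recommend_for` function are made up for illustration.

```python
from math import sqrt

# Hypothetical ratings on a 1-5 scale; a missing entry means "not rated".
ratings = {
    "Alice": {"Book A": 5, "Book B": 4},
    "Bob": {"Book A": 5, "Book B": 5, "Book C": 4},
    "Carol": {"Book A": 1, "Book D": 5},
}


def cosine(u, v):
    """Cosine similarity between two sparse rating dicts (unrated items count as 0)."""
    shared = set(u) & set(v)
    dot = sum(u[item] * v[item] for item in shared)
    norm_u = sqrt(sum(r * r for r in u.values()))
    norm_v = sqrt(sum(r * r for r in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def recommend_for(user, ratings, top_n=1):
    """Score items the user has not rated by the similarity-weighted ratings of other users."""
    scores = {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        similarity = cosine(ratings[user], other_ratings)
        for item, rating in other_ratings.items():
            if item not in ratings[user]:
                scores[item] = scores.get(item, 0.0) + similarity * rating
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


print(recommend_for("Alice", ratings))  # ['Book C']
```

Alice's ratings look much more like Bob's than Carol's, so Bob's favorite that she has not read yet wins. No book attributes were needed, only other users' tastes.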
Out of these six types of recommenders, the first two, Content-based and Collaborative Filtering, are the most popular. There is ample material on both available online. Start there, if you would like to dig deeper into recommenders or build one yourself.
Closing Words
This article started as a technical introduction to Recommender Systems. Yet, after a bit of research, I noticed there were already hundreds of articles with a similar goal.
As I was not very motivated to do the same, I made this Frankenstein article by mixing a bit of narrative and theory. I hope it was useful for understanding Recommenders and maybe gave you a pity laugh.
We are now in an era where these algorithms are shaping a significant part of our daily lives. We should care to understand what is behind what we see in our social media feeds, online shopping suggestions, and other digital services. This article tried to fill that gap in an accessible manner.
I hope you enjoyed the article. Feel free to drop me a note if you have questions or comments.
Next Steps
Finally, if you want to learn more about Recommenders, I have a couple of suggestions for starting points:
Theory
Applications
- Beginner’s Recommendation Systems with Python
- Recommendation System Based on PySpark
- Building A Collaborative Filtering Recommender System with TensorFlow
Datasets
References
[1] Mailchimp, What is Retargeting? (2019, date of access)
[2] R. Reshef, Understanding Collaborative Filtering Approach (2015)
[3] A. Chandrashekar, F. Amat, J. Basilico and T. Jebara, Netflix’s Artwork Personalization (2017)
[4] I. MacKenzie, C. Meyer, and S. Noble, How Retailers Can Keep Up With Consumers (2013)
[5] A. Rodriguez, YouTube’s recommendations drive 70% of what we watch (2018)
[6] G. Chalot, Twitter Thread on YouTube’s Recommendations (2019)
[7] R. Sharma, R. Singh, Evolution of Recommender Systems from Ancient Times to Modern Era: A Survey (2016)
[8] R. Sharma, R. Singh, Evolution of Recommender Systems from Ancient Times to Modern Era: A Survey (2016)
[9] Computer History, Xerox PARC (2019, date of access)
[10] J. Huttner, From Tapestry to SVD: A Survey of the Algorithms That Power Recommender Systems (2009)
[11] The_Curly_Council, This is a profession I can see myself getting into (2013)
[12] P. Hiebert, The Paradox Of Choice, 10 Years Later (2017)
[13] J. Leskovec, A. Rajaraman, J. Ullman, Mining Massive Datasets, Chapter 9 (2014)
[14] F. Ricci, L. Rokach, B. Shapira, Introduction to Recommender Systems Handbook, Chapter 1 (2011)
[15] R. Sharma, R. Singh, Evolution of Recommender Systems from Ancient Times to Modern Era: A Survey (2016)
Citation
@online{castillo2019,
author = {Castillo, Dylan},
title = {Mind-Reading {Algorithms:} {An} {Introduction} to
{Recommender} {Systems}},
date = {2019-12-20},
url = {https://dylancastillo.co/posts/mind-reading-algorithms.html},
langid = {en}
}