Recommender systems in Retail 2019: Overview and Use case
A Retail Case study about implementing recommender systems developed during the Summer School of Research Methods
In summer 2019, some of the ShopUp team took part in the Summer School of Research Methods. The 7-day event was organized by several members of Data Science Society and academic representatives from Sofia University, UNWE and Technical University Sofia. There were more than 10 lecturers and about 30 participants from different companies, organizations and universities, among them experts, researchers, PhD candidates and master’s students. Each day the program started with presentations or workshops, followed by time allocated to a Capstone project (a project that aims to capture what we’ve learnt). The workshops combined practice and theory in maths, statistics, neural networks, reinforcement learning and more. The Capstone project was mandatory for each participant, and there were three different cases to select from.
We at ShopUp decided to open-source this project work and share it with the community. The code implements several recommender techniques for building a retail recommender engine, using data provided by a Kaggle contest.
What Is a Recommender System?
A recommender system (or recommendation system) can be perceived as a black box that offers different items to end users depending on their past interests and behaviour, whether the business using it is a retailer, a store, or a shopping or entertainment center. The more relevant the offered items are, the more interest and revenue they generate. For marketing and sales purposes, therefore, a higher prediction rate means a higher ROI when promoting products. For example, if you buy horror books, the engine would offer you the book most relevant to your interest, using patterns learned from many consumers. Similar techniques are used by Amazon, Netflix or, in the music industry, Spotify.
Types of Recommendation Systems
There are several types of recommendation systems; the most popular are collaborative filtering and content-based filtering.
- Collaborative filtering systems offer items that are relevant to your soul mates, i.e. people whose interests and purchases resemble yours. An example is offering you the book your friends like most.
- Content-based filtering systems rely on item metadata such as genre, actor, color, etc. So if a bookstore user has been reading Sci-Fi books, the bookstore would sooner recommend another Sci-Fi book than a romantic comedy.
There are two types of models in collaborative filtering, memory-based methods and model-based methods, which you can read more about in this great article.
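To make the memory-based idea concrete, here is a minimal sketch of item-to-item collaborative filtering on made-up purchase data. The users, books and the helper names (item_similarity, recommend_for) are our own illustration and are not part of the project code.

```python
from math import sqrt

# Toy purchase history: user -> set of bought items (made-up data).
purchases = {
    "ann": {"dune", "neuromancer", "foundation"},
    "bob": {"dune", "foundation"},
    "cat": {"dune", "neuromancer", "emma"},
    "dan": {"emma", "persuasion"},
}


def item_similarity(a, b):
    """Cosine similarity between two items over binary 'who bought it' vectors."""
    buyers_a = {u for u, items in purchases.items() if a in items}
    buyers_b = {u for u, items in purchases.items() if b in items}
    if not buyers_a or not buyers_b:
        return 0.0
    return len(buyers_a & buyers_b) / sqrt(len(buyers_a) * len(buyers_b))


def recommend_for(user, top_n=2):
    """Score items the user does not own by summed similarity to owned items."""
    owned = purchases[user]
    all_items = set().union(*purchases.values())
    scores = {
        candidate: sum(item_similarity(candidate, item) for item in owned)
        for candidate in all_items - owned
    }
    # Highest score first; ties are broken alphabetically for a stable result.
    return sorted(scores, key=lambda item: (-scores[item], item))[:top_n]


print(recommend_for("bob"))
```

Nothing is trained here: the recommendation is computed directly from the stored interactions, which is what distinguishes memory-based methods from model-based ones.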
There are two bibles if you want to dig into the area: Recommender Systems Handbook and Recommender Systems: The Textbook.
The Case
The case comes from a Kaggle contest whose aim is to motivate research in the field of recommender systems. The online retailer’s data consists of three files: a file with behaviour data (events.csv), a file with item properties (item_properties.csv) and a file that describes the category tree (category_tree.csv). The data is raw, i.e. without any content transformations; however, all values are hashed for confidentiality reasons. The data set has more than 2 million records and over 70 K customers, which was sufficient for our experiments.
More details can be found here:
What ShopUp Team Chose
The main idea of this summer school was to challenge ourselves with a new type of project and to produce valuable outcomes and insights. That’s why we set aside traditional methods such as collaborative and content-based filtering and, after brief research, decided to try several other approaches.
We most liked the approach suggested by Alexandros Karatzoglou, Scientific Director at Telefonica. Here you can find his presentation at the WeAreDevelopers AI conference in 2018 and the video. In the end we selected 3 of the 4 methods suggested on slide No 20 (Learning item embeddings, Deep collaborative filtering, Session-based recommendations with RNN). We skipped the fourth one (Feature extraction directly from content like text or images) because the data for it was missing or insufficient.
Source: Alexandros Karatzoglou
For Deep collaborative filtering and Session-based recommendations with RNN we decided to use the fast.ai courses and materials, aiming for fast development and cutting-edge results.
Fast.AI believes that deep learning is transforming the world. They are making deep learning easier to use and getting more people from all backgrounds involved through their free courses for coders, software library, cutting-edge research and community.
We like their vision and cause: AI needs to be accessible and easy to implement. The solutions come together really fast, and many further optimizations can probably be made later on. Our main goal is to share what we did and to show how easy it is nowadays to create cutting-edge recommenders.
1. How to Apply Learning Item Embeddings?
We decided to use the Word2Vec word embedding approach to develop an item-to-item recommender (prod2vec model). This technique is one of several used by Yahoo for product recommendations. In their paper they describe a system that leverages user purchase history determined from email receipts to deliver highly personalized product ads to Yahoo Mail users.
For more details about the approach you can refer to Alex’s presentation from slide No 25 onwards.
We implemented the technique with the gensim library in Python. For a concise description of implementing Word2Vec with gensim, you can also check this article.
First, we grouped the events by “visitorid” and for each visitor kept only the ‘add-to-cart’ and ‘transaction’ actions. This way we keep only those cases where the customer is most likely to complete a transaction.
Next, we put the items of each visitor in a list and then put all those lists in another list. We use the latter as input for the model:
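from gensim.models import Word2Vec
# df_f is the list of per-visitor item lists described above
word2vec = Word2Vec(df_f, min_count=1)
model = word2vec  # the plotting code further down refers to the same model as "model"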
Setting min_count=1 keeps every item that appears at least once in the corpus of words (in our case the words are items). Raise this value to keep only items that appear at least that many times in the corpus.
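The input preparation described above (filter the events, group them by visitor, nest the lists) could be sketched on a toy frame like this. The column names and the event labels "addtocart" and "transaction" are assumptions about the data set, and the numbers are made up.

```python
import pandas as pd

# Toy stand-in for events.csv; column names and event labels are assumptions.
events = pd.DataFrame({
    "visitorid": [1, 1, 1, 2, 2, 3],
    "event": ["view", "addtocart", "transaction", "addtocart", "view", "view"],
    "itemid": [10, 10, 11, 12, 13, 14],
})

# Keep only the events that signal purchase intent.
intent = events[events["event"].isin(["addtocart", "transaction"])]

# One list of item ids per visitor, then a list of those lists.
# Item ids are cast to strings because Word2Vec treats tokens as words.
df_f = (
    intent.assign(itemid=intent["itemid"].astype(str))
    .groupby("visitorid")["itemid"]
    .apply(list)
    .tolist()
)
print(df_f)
```

Visitor 3 only viewed products, so no "sentence" is produced for that visitor.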
We then looked up the vector of one item from the vocabulary, which holds every unique item that appears at least once in the corpus (as we defined it in the previous line):
v1 = word2vec.wv['123441']
In order to see the most similar items (and their similarity index) to a given item we used the following:
sim_words = word2vec.wv.most_similar('123441')
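gensim ranks the neighbours by cosine similarity between the item vectors. The sketch below reproduces that ranking on hand-made three-dimensional vectors; the vectors and all item ids except '123441' are invented for illustration, and real embeddings are learned and much wider.

```python
import numpy as np

# Made-up 3-dimensional embeddings for four item ids.
vectors = {
    "123441": np.array([0.9, 0.1, 0.0]),
    "200001": np.array([0.8, 0.2, 0.1]),
    "200002": np.array([0.0, 1.0, 0.0]),
    "200003": np.array([-0.9, -0.1, 0.0]),
}


def most_similar(item, topn=2):
    """Rank all other items by cosine similarity to `item`."""
    query = vectors[item] / np.linalg.norm(vectors[item])
    scores = [
        (other, float(query @ (vec / np.linalg.norm(vec))))
        for other, vec in vectors.items()
        if other != item
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:topn]


for item_id, similarity in most_similar("123441"):
    print(item_id, round(similarity, 3))
```

The similarity index is therefore a number between -1 and 1, where values close to 1 mean the two items tend to appear in the same baskets.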
Since this is not a very convenient representation of the model results, we tried to represent them graphically using the dimensionality reduction technique t-SNE.
First, we take the unique items and their vector representations. We do this in the following way:
# Imports needed for the plot:
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE
# Put all unique items from the vocabulary in a list
# (gensim 3.x API; in gensim 4+ use list(model.wv.index_to_key) instead):
vocab = list(model.wv.vocab)
X = model.wv[vocab]
# Then, apply the dimensionality reduction technique:
tsne = TSNE(n_components=2)
X_tsne = tsne.fit_transform(X)
# Put the results in a dataframe:
ff = pd.DataFrame(X_tsne, index=vocab, columns=['x', 'y'])
# Show the result on a 2D scatter plot:
plt.figure(figsize=(15,10))
plt.plot(ff['x'], ff['y'], 'ro', alpha = 0.5)
for i in range(len(ff)):
    plt.text(ff.x.iloc[i], ff.y.iloc[i], str(ff.index[i]))
plt.show()
At the end, we presented our results in a straightforward way by using the following:
# A function for input request:
def get_input():
    answer1 = input("Customer ID? ")
    if answer1 in test1.index:
        print("Old customer")
    else:
        print("New customer")
    return answer1
# A function for recommendation:
def recommend(customerid):
    c = test1[test1.index.isin([customerid])]
    prod = c[0]
    for p in prod:
        sim_words = word2vec.wv.most_similar(p)
        print("\n")
        print("Bought product:", p, "\n")
        for j in range(len(sim_words)):
            print("Recommended Product ID:", sim_words[j][0])
            print("Similarity index:", sim_words[j][1])
# And finally, check the recommended products for a given customer by using:
answer = get_input()
recommend(customerid=answer)
You can find the full code here: github repo of Data Science Society (the file is recommender.ipynb)
2. How to Apply Deep Collaborative Filtering?
This method is based on matrix factorization, and its advantage is that it deals with sparsity. In short, these are models that predict users’ ratings of items they have not rated yet. Techniques such as dimensionality reduction are used in this approach to improve accuracy. An example is IMDb movie ratings, where such a model predicts the movies a user is most likely to enjoy and suggests them.
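As a rough illustration of the matrix factorization idea (not the fast.ai model we trained), the sketch below learns two small factor matrices with plain stochastic gradient descent on a handful of made-up ratings and then predicts a cell that was never observed. The data, the factor size and the hyperparameters are all assumptions chosen for the toy example.

```python
import numpy as np

# Toy ratings: (user index, item index, rating); every other cell is unrated.
ratings = [(0, 0, 5), (0, 1, 3), (1, 0, 5), (1, 2, 1), (2, 1, 3), (2, 2, 1)]
n_users, n_items, n_factors = 3, 3, 2

rng = np.random.default_rng(0)
user_factors = rng.normal(scale=0.1, size=(n_users, n_factors))
item_factors = rng.normal(scale=0.1, size=(n_items, n_factors))


def mse():
    """Mean squared error over the observed ratings only."""
    errors = [(r - user_factors[u] @ item_factors[i]) ** 2 for u, i, r in ratings]
    return float(np.mean(errors))


initial_error = mse()
learning_rate, reg = 0.05, 0.01
for _ in range(500):
    for u, i, r in ratings:
        err = r - user_factors[u] @ item_factors[i]
        u_old = user_factors[u].copy()  # keep the pre-update user vector
        user_factors[u] += learning_rate * (err * item_factors[i] - reg * u_old)
        item_factors[i] += learning_rate * (err * u_old - reg * item_factors[i])
final_error = mse()

# Predicted rating for a cell that was never observed: user 0, item 2.
predicted = float(user_factors[0] @ item_factors[2])
print(initial_error, final_error, predicted)
```

Only the observed cells enter the loss, which is why the approach copes with sparse data: the missing cells are simply read off the learned factors afterwards.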
For more details about the approach you can refer to Alex’s presentation after slide No 37, where you can also see other approaches such as Restricted Boltzmann Machines (RBM), Autoencoders, the Temporal Deep Semantic Structured Model (TDSSM), etc. We decided to use the model from the fast.ai course, Lesson 4: Deep Learning 2019 – NLP; Tabular data; Collaborative filtering; Embeddings (after 1:07:27), and the accompanying notebook. You can also download the code for the course from the github repo.
For our environment we used Google Colab, and the solution is uploaded to the project github repo of Data Science Society as the file ShopUp recommender system.ipynb.
We started with loading and preparing the data and computed some ratings:
import numpy as np
import pandas as pd
sorted_df = pd.read_csv(base_dir + '/events.csv')
sorted_df["date"] = pd.to_datetime(sorted_df["timestamp"], unit="ms").dt.date  # assumes Unix timestamps in milliseconds
We computed the rating to mirror the IMDb case, with values from 1 to 5. The logic is: 1 when someone browses a product; 3 when they add an item to the cart; 5 for a transaction (payment).
sorted_df["rating1"] = np.where(sorted_df.event == "view",1,3)
sorted_df["rating2"] = np.where((sorted_df.event == "transaction") & (sorted_df["rating1"] == 3),2,0)
sorted_df['rating'] = sorted_df["rating1"] + sorted_df["rating2"]
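The same 1/3/5 logic can also be written as an explicit lookup table, which some readers may find easier to audit. This is an equivalent sketch on made-up rows, and the event labels "view", "addtocart" and "transaction" are assumptions about how the data set names the events.

```python
import pandas as pd

# Made-up events; the event labels are assumptions about the data set.
events = pd.DataFrame({"event": ["view", "addtocart", "transaction", "view"]})

# Explicit lookup table: browse = 1, add to cart = 3, purchase = 5.
event_to_rating = {"view": 1, "addtocart": 3, "transaction": 5}
events["rating"] = events["event"].map(event_to_rating)
print(events["rating"].tolist())
```

One difference to keep in mind: with a lookup table an unexpected event label becomes a missing value, whereas the np.where version silently rates it as 3.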
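# Then we ran the fast.ai library, starting with the databunch, learner and learning rate.
# Assumption: the databunch is built with fastai v1's CollabDataBunch.from_df as in the
# fast.ai lesson notebook; the column names passed here are assumed as well.
from fastai.collab import *
data = CollabDataBunch.from_df(sorted_df, user_name='visitorid', item_name='itemid', rating_name='rating')
y_range = [0, 5.5]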
learn = collab_learner(data, n_factors=50, y_range=y_range)
learn.lr_find()
learn.recorder.plot(skip_end=15)
# train the model
learn.fit_one_cycle(3, 1e-2)
# save the model
learn.save(base_dir + '/dotcat')
And that’s it, you have the model! All you need to do afterwards is follow the deployment steps, which are not covered here; you can refer to the fast.ai documentation and watch Lesson 2.
3. How to Apply Session Based Recommendations with RNN?
Session-based recommendation with RNN is a very strong method that captures the sequence of products and recommends the right product at the right time (see the presentation after slide No 65). For example, for a parent with a 12-month-old kid it makes sense to recommend products for a slightly older kid, but not too much older, let’s say 14 months; anything much above or below that age is irrelevant.
We used a technique called transfer learning, where we combined an existing pre-trained model with our new recommender model.
Jeremy explains it very well in the same Lesson 4 (2:11). We took a model pre-trained on WikiText-103, which predicts the next word in a sentence, as in the example here: after “hot”, the next word would be “dog” in the first case and could be “day” in the second. The next step is to add your own dataset to build your language model, which in our case is made of sequences of products.
Source: Jeremy Howard
In our case each sentence is the sequence of products that a customer viewed, added to the cart or purchased, given as numeric tokens like these for two customers:
- “65273, 253615, 344723, 344723, 344723, 344723, 65273, 253615, 344723, 344723, 344723, 344723, 344723, 344723”
- “139394, 164941, 226353, 139394, 164941, 226353, 139394, 164941, 226353”
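A sketch of how such "sentences" could be assembled from the raw events is shown below. The column names are assumptions (the output column is called new to match the training code), and the rows are made up.

```python
import pandas as pd

# Toy events, deliberately out of time order; column names are assumptions.
events = pd.DataFrame({
    "visitorid": [7, 7, 8, 7, 8],
    "timestamp": [3, 1, 5, 2, 4],
    "itemid": [344723, 65273, 164941, 253615, 139394],
})

# Sort by time, then join each visitor's item ids into one "sentence".
final_text = (
    events.sort_values("timestamp")
    .groupby("visitorid")["itemid"]
    .apply(lambda items: ", ".join(str(i) for i in items))
    .reset_index(name="new")
)
print(final_text)
```

Sorting before grouping matters here, because the language model learns from the order in which the products appear.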
Then we ran the model on our data and built our new language model with an accuracy of nearly 70%, which was unbelievable for us. The code is below.
import pandas as pd
from fastai.text import *
# Load data set already prepared in the format mentioned above:
final_text = pd.read_csv(base_dir + '/final_text_no_bracket.csv')
data_lm = (TextList.from_df(final_text, cols=['visitorid', 'new'])
           .split_by_rand_pct()
           .label_for_lm()
           .databunch())
data_lm.vocab.itos[:11]  # check that the tokens look as expected and each product keeps its number
data_lm.show_batch()  # see what the data looks like
learn = language_model_learner(data_lm, AWD_LSTM, drop_mult=0.3)
learn.fit_one_cycle(2, 5e-2, moms=(0.8,0.7))
# Test the model
TEXT = "440866"
N_WORDS = 4
N_SENTENCES = 1
print("\n".join(learn.predict(TEXT, N_WORDS + 1, temperature=0.75) for _ in range(N_SENTENCES)))
And that’s it! It works and recommends products based on the customer history.
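The temperature argument controls how adventurous the sampled products are. The sketch below shows the general idea with a temperature-scaled softmax over made-up scores; it illustrates the concept and is not fast.ai's internal code.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; a lower temperature sharpens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up scores for three candidate next products.
logits = [2.0, 1.0, 0.5]
p_base = softmax_with_temperature(logits, 1.0)
p_sharp = softmax_with_temperature(logits, 0.75)
print([round(p, 3) for p in p_base])
print([round(p, 3) for p in p_sharp])
```

With a temperature below 1 the most likely product gets a larger share of the probability mass, so the recommendations become more conservative; values above 1 make them more varied.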
Summary:
To sum up, we covered the following:
- What recommender systems are, how they work, and some of their different types
- How to implement Learning item embeddings
- Deep collaborative filtering
- Session-based recommendations with RNN
- The Repo links
Our idea is to show you how each one of you can easily create such a model and experiment with it.
We at ShopUp believe that solving hard problems with the latest technologies is something worth spending time on. We hope this helps you and motivates you to play with data.
Good luck!