Yes, dear readers, today we’re not just pairing hearts, we’re pairing souls with movies using none other than the beauty of neural networks. So, grab your popcorn, because we’re about to dive into the deep end of the recommendation pool, where every user and movie gets their own digital dance in the multidimensional space of taste and preference.
In this article, we will use the well-known MovieLens 20M dataset, which contains 20 million ratings. It's going to be interesting. I'm excited to teach you this because I love movies, and I don't enjoy scrolling through a long list hunting for something good to watch. Recommendation systems are here to help.
Link to the data: https://grouplens.org/datasets/movielens/20m/
You can copy and run the code blocks as they are; they fetch the dataset for you, so there is no need to download it by hand. Let's get started with the imports.
Let's download our dataset.
Our zip file is now downloaded. Let's unzip it and read the data.
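Here is a minimal sketch of the download-and-unzip step using only the standard library. The direct download URL and the helper name fetch_and_extract are assumptions for illustration, so check the GroupLens page linked above if the URL has moved.

```python
import urllib.request
import zipfile
from pathlib import Path

# Assumed direct link to the archive; verify it on the GroupLens page.
ML20M_URL = "https://files.grouplens.org/datasets/movielens/ml-20m.zip"


def fetch_and_extract(url, workdir):
    """Download a zip archive into workdir, unpack it, and list the CSV files found."""
    root = Path(workdir)
    root.mkdir(parents=True, exist_ok=True)
    archive = root / "dataset.zip"
    if not archive.exists():  # reuse a cached copy instead of downloading again
        urllib.request.urlretrieve(url, archive)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(root)
    # Paths are returned relative to workdir, sorted for a stable order.
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*.csv"))
```

Calling fetch_and_extract(ML20M_URL, "data") should return the list of CSV files in the archive, which you can then load with pandas.read_csv. The archive is large, so expect the first call to take a while.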
As you can see, our dataset contains many CSV files, but for this article we will use only two of them. Yes, you heard that right.
Before we proceed, you may be wondering which technique we will use to make recommendations. I'm using collaborative filtering with embeddings in a neural network. We focus on which movies a user has watched and the ratings they gave, let the model learn which movies are similar to the ones the user has watched, and then predict the ratings the user is likely to give to those new movies. Finally, we recommend the movies that the model predicts the user would rate highest.
In other words, we will build a model that first predicts ratings for movies the user has already rated. If the model works well, we then predict ratings for movies the user has not watched and recommend the 10 new movies with the highest predicted ratings. Remember, we can only recommend new movies, not ones the user has already watched.
Let us view our datasets.
The only useful columns here, or at least the ones I'll be using, are userId, movieId, title, and rating. As you may have guessed, we will have to merge our two datasets.
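The merge can be sketched with pandas as below. The two small frames are stand-ins with a handful of illustrative rows, not the real files, and the variable names are assumptions. Only the column names come from the text above.

```python
import pandas as pd

# Tiny stand-in for the ratings table (illustrative rows only).
ratings = pd.DataFrame({
    "userId": [1, 1, 7, 7],
    "movieId": [2, 296, 2, 593],
    "rating": [3.5, 4.0, 5.0, 4.5],
})

# Tiny stand-in for the movies table (illustrative rows only).
movies = pd.DataFrame({
    "movieId": [2, 296, 593],
    "title": [
        "Jumanji (1995)",
        "Pulp Fiction (1994)",
        "Silence of the Lambs, The (1991)",
    ],
})

# Join on the shared movieId key so every rating row carries its title,
# then keep only the four columns we need.
df = ratings.merge(movies, on="movieId", how="inner")
df = df[["userId", "movieId", "title", "rating"]]
print(df.head())
```

An inner join drops ratings whose movieId has no match in the movies table. If you would rather keep those rows and inspect them, how="left" is the alternative.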
Our merged dataset is now very useful, but there is one more thing. An embedding layer looks up each ID as a row index, so we want our IDs to be sequential integers with no gaps. The dataset is too large to check by eye, so we will create new, well-organized ID columns instead.
Let us now print the head of our new data frame.
Next, we will define the number of users and the number of movies, which will come in handy during embedding.
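One way to build gap-free IDs and count users and movies is with pandas categorical codes, sketched below on toy data. The column names new_user_id and new_movie_id and the variables N and M are hypothetical names chosen for this example.

```python
import pandas as pd

# Toy data with gaps in both ID columns (illustrative values).
df = pd.DataFrame({
    "userId": [1, 1, 7, 7, 42],
    "movieId": [2, 296, 2, 593, 296],
})

# Categorical codes number the distinct values 0..n-1 in sorted order,
# so any gaps in the raw IDs disappear.
df["new_user_id"] = df["userId"].astype("category").cat.codes
df["new_movie_id"] = df["movieId"].astype("category").cat.codes

# Because the codes start at zero, the count is the largest code plus one.
N = int(df["new_user_id"].max()) + 1   # number of distinct users
M = int(df["new_movie_id"].max()) + 1  # number of distinct movies
print(df)
print(N, M)
```

N and M are the sizes an embedding table needs: one row for every user and one row for every movie.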
In the following code, I split our data into training and test sets and then normalize the ratings. The code is quite long but easy to understand. Remember, you can leave a question in the comment section if you need an explanation.
The code above simply splits the data into train and test sets. The rating lines deserve a brief explanation:
avg_rating = rating_train.mean(): computes the average rating for the training data.
rating_train = rating_train - avg_rating: centers the training ratings by subtracting the average rating.
rating_test = rating_test - avg_rating: centers the test ratings with the same training average to maintain consistency.
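The split-and-center step can be sketched in NumPy as follows. The 80/20 ratio, the random seed, and the sample ratings are assumptions for illustration; only the three rating lines mirror the explanation above.

```python
import numpy as np

rng = np.random.default_rng(0)
ratings = np.array([3.5, 4.0, 5.0, 4.5, 2.0, 3.0, 1.5, 4.0, 3.5, 5.0])  # made-up ratings

# Shuffle the row positions, then keep the first 80% for training (assumed ratio).
order = rng.permutation(len(ratings))
n_train = int(0.8 * len(ratings))
train_idx, test_idx = order[:n_train], order[n_train:]

rating_train = ratings[train_idx]
rating_test = ratings[test_idx]

# Center both splits with the TRAINING mean so no test information leaks in.
avg_rating = rating_train.mean()
rating_train = rating_train - avg_rating
rating_test = rating_test - avg_rating
```

After centering, the model learns deviations from the average rating. To get back to the original rating scale at prediction time, add avg_rating to the model output.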
Next, we will set the foundation for a neural network-based recommender system by defining the input layers, embedding users and movies into a meaningful representation, and combining these representations to feed into the rest of the network.
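To make the idea concrete, here is what the embedding and combining steps compute, written in plain NumPy rather than a deep learning framework. The table sizes and random values are toy assumptions; in a real model the tables are learned during training.

```python
import numpy as np

rng = np.random.default_rng(0)
N, M, K = 5, 4, 3  # users, movies, embedding size (toy values)

user_table = rng.normal(size=(N, K))   # one K-dimensional row per user
movie_table = rng.normal(size=(M, K))  # one K-dimensional row per movie

user_ids = np.array([0, 2, 4])   # a batch of three (user, movie) pairs
movie_ids = np.array([1, 1, 3])

# An embedding layer is a row lookup: integer ID in, vector out.
u = user_table[user_ids]     # shape (3, K)
m = movie_table[movie_ids]   # shape (3, K)

# Concatenate so the later dense layers see both vectors side by side.
x = np.concatenate([u, m], axis=1)  # shape (3, 2 * K)
print(x.shape)
```

Each row of x is the user's vector followed by the movie's vector, and that combined vector is what the rest of the network receives.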
At this point, our data is ready to be fed into an artificial neural network (ANN), so let us create our network layers.
You can add as many layers as you want, but note that the last dense layer has no activation, because this is a regression model: it outputs a predicted rating as a real number rather than a class.
Next, we will compile our model. Remember that this model takes two inputs.
Our next step is fitting (training) the model.
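The build, compile, and fit steps described above could look roughly like this with the Keras functional API. This is a sketch under assumptions: the layer sizes, the embedding dimension K, the MSE loss, the Adam optimizer, and the random stand-in data are all illustrative choices, not the settings used for the results in this article.

```python
import numpy as np
from tensorflow.keras.layers import Concatenate, Dense, Embedding, Flatten, Input
from tensorflow.keras.models import Model

N, M, K = 50, 30, 8  # users, movies, embedding size (toy values)

# Two inputs: one integer user index and one integer movie index per example.
u_in = Input(shape=(1,), name="user")
m_in = Input(shape=(1,), name="movie")

# Each embedding maps an index to a K-dimensional vector;
# Flatten removes the length-1 sequence axis.
u_vec = Flatten()(Embedding(N, K)(u_in))
m_vec = Flatten()(Embedding(M, K)(m_in))

x = Concatenate()([u_vec, m_vec])
x = Dense(32, activation="relu")(x)
out = Dense(1)(x)  # no activation: the output is a real-valued (centered) rating

model = Model(inputs=[u_in, m_in], outputs=out)
model.compile(loss="mse", optimizer="adam")  # assumed loss and optimizer

# Random stand-in data, only to show how fit() is called with two inputs.
rng = np.random.default_rng(0)
users = rng.integers(0, N, size=(256, 1))
movies = rng.integers(0, M, size=(256, 1))
ratings = rng.normal(size=(256, 1)).astype("float32")

history = model.fit([users, movies], ratings, epochs=2, batch_size=64, verbose=0)
print(history.history["loss"])
```

The order of the arrays passed to fit() must match the order of the inputs given to Model. On the real data you would pass the re-indexed user and movie IDs along with the centered ratings, and supply the test split through validation_data.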
For a better view of the model's loss and metrics, we can plot them. It's a good exercise, because it lets you check whether your loss is decreasing effectively.
Let's plot our metrics, which is also handy if a non-technical manager wants to see how your model is performing.
Now let's do the most important work, the thing this whole model is about. We will make predictions for the already-rated movies and see how our model performs.
The code above makes predictions on already-rated movies and then returns average ratings. We will now compare the predicted ratings with the original ones.
As you can see, at least 3 of our first 5 predictions are close to the actual ratings, so I'm confident in my model. You can fine-tune your code to increase the accuracy further, but I'll go with this for now. When you get a good model, please share.
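Eyeballing five rows is a quick sanity check, but a single error number is easier to track while tuning. Below is a small sketch that computes the mean absolute error (MAE) and root mean squared error (RMSE). All numbers are hypothetical and are not outputs of the model in this article.

```python
import math

avg_rating = 3.5                               # training mean (hypothetical value)
centered_preds = [0.4, -1.1, 0.9, 0.1, -0.3]   # raw model outputs (hypothetical)
actual = [4.0, 2.0, 5.0, 3.5, 3.0]             # ratings the user gave (hypothetical)

# Undo the centering so predictions are back on the original rating scale.
preds = [p + avg_rating for p in centered_preds]

errors = [p - a for p, a in zip(preds, actual)]
mae = sum(abs(e) for e in errors) / len(errors)
rmse = math.sqrt(sum(e * e for e in errors) / len(errors))

print(f"MAE:  {mae:.3f}")
print(f"RMSE: {rmse:.3f}")
```

RMSE punishes large misses more heavily than MAE, so comparing the two hints at whether the error comes from a few bad predictions or is spread evenly.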
Next, I'm going to create a function that recommends 10 movies to the user with a user ID of zero. Since it is a single function, I'm not going to split the code, but I have commented each line so that each step is easier to digest.
Now let's print out our 10 recommended movies.
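The selection logic at the heart of such a function can be sketched in a few lines of plain Python. The helper name recommend_top_n and the sample scores are hypothetical; in practice the predicted dictionary would hold the model's predicted rating for every movie.

```python
import heapq


def recommend_top_n(predicted, watched, n=10):
    """Return the n highest-scoring movie IDs that the user has not watched yet.

    predicted: dict mapping movie ID -> predicted rating for one user
    watched:   set of movie IDs that user has already rated
    """
    # Only unseen movies are eligible, since we recommend new movies only.
    unseen = [movie for movie in predicted if movie not in watched]
    # nlargest returns the candidates sorted from best to worst score.
    return heapq.nlargest(n, unseen, key=lambda movie: predicted[movie])


# Hypothetical predicted ratings for one user.
predicted = {0: 4.8, 1: 3.2, 2: 4.9, 3: 4.1, 4: 2.0}
watched = {2}
print(recommend_top_n(predicted, watched, n=3))
```

Movie 2 has the highest predicted score but is skipped because the user has already watched it.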
And there you have it, folks! With a sprinkle of code, a dash of math, and a whole lot of computational power, we’ve built an AI matchmaker that could potentially turn movie night into an epic saga of perfect picks. Remember, while this digital Cupid might not understand the joy of a plot twist or the comfort of a well-worn rom-com, it’s tirelessly working to ensure that when you hit play, the stars (or at least the reviews) align. So next time you’re lost in the endless sea of streaming options, just think, there’s a neural network out there, trying to make sure your next movie is nothing short of a blockbuster hit in the story of your life.
If that was exciting, leave a follow and numerous claps. Yes, that's how you can pay me, imagine :) Quite cheap, right?
Thank you for reading.
FAQs
What is a movie recommendation system? A movie recommendation system is a software tool that predicts user preferences for films based on various data inputs, helping users discover movies they might enjoy.
How do recommendation systems work? These systems use algorithms to analyze user behavior, preferences, and historical data to generate personalized movie suggestions.
What are the main types of recommendation systems? The main types are collaborative filtering, content-based filtering, and hybrid approaches that combine both methods.
What is collaborative filtering? Collaborative filtering recommends movies based on user interactions and preferences, suggesting films liked by similar users.
What is content-based filtering? Content-based filtering recommends movies based on the attributes of the films themselves, such as genre, director, and cast.
What are hybrid recommendation systems? Hybrid systems combine collaborative and content-based filtering to improve accuracy and address the limitations of each method.
What data is needed for a recommendation system? Typical data includes user ratings, movie genres, user demographics, and viewing history.
How can I evaluate the performance of a recommendation system? You can evaluate performance using metrics like Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), and precision/recall.
What is user-based collaborative filtering? User-based collaborative filtering recommends movies by finding users with similar tastes and suggesting films they liked.
What is item-based collaborative filtering? Item-based collaborative filtering recommends movies by analyzing the similarities between movies based on user ratings.
How does matrix factorization work? Matrix factorization techniques decompose the user-item rating matrix into lower-dimensional matrices, capturing latent features to make predictions.
What role does machine learning play in recommendation systems? Machine learning algorithms help refine predictions by learning from user interactions and continuously adapting to new data.
Can recommendation systems work for new users? New users pose a challenge (the cold start problem), but techniques like demographic-based recommendations can help.
What are some common challenges in building recommendation systems? Challenges include data sparsity, cold start problems, and maintaining diversity in recommendations.
How can user feedback improve recommendations? User feedback helps refine algorithms by providing additional data points that enhance accuracy and relevance.
What is the importance of diversity in recommendations? Diverse recommendations prevent echo chambers, exposing users to a wider range of films and enhancing user satisfaction.
Are there any ethical concerns with recommendation systems? Yes, ethical concerns include data privacy, algorithmic bias, and the potential for reinforcing negative stereotypes.
How do streaming platforms use recommendation systems? Streaming platforms utilize recommendation systems to personalize user experiences and improve content discovery.
What is the impact of recommendation systems on user engagement? Effective recommendation systems increase user engagement, retention, and overall satisfaction by providing relevant content.
Can I create my own movie recommendation system? Yes, you can create your own recommendation system using open-source libraries and datasets, learning from various algorithms and techniques.