Spotify Recommendation System using N8N
Uncover the magic behind Spotify's music recommendations! Learn how to build your own system using N8N, vector databases, and audio analysis. Explore track features like danceability & tempo to create personalized playlists. A fun, insightful learning exercise!

Preface
Welcome to this exploration of building a Spotify recommendation system using N8N! Before we dive in, I'd like to set some important context for this project.
This blog post represents a learning exercise—a journey into understanding how music recommendation systems function at a fundamental level. By working with vector databases, audio feature analysis, and similarity algorithms, we can gain valuable insights into the mechanics behind the personalized music suggestions we encounter daily.
It's important to note that this project faces a significant limitation: Spotify has recently deprecated portions of their API that previously provided access to track audio features. This change means that while the system described here works with our existing dataset of 32,833 tracks, expanding it with new music or implementing it as a production service would require alternative approaches.
Despite this limitation, the concepts and techniques explored here remain valuable. The principles of vector similarity, feature normalization, and multidimensional representation apply across recommendation domains. Whether you're interested in music technology, data science, or just curious about how your favorite streaming service seems to know your taste so well, I hope this walkthrough provides useful insights.
Consider this project a springboard for your own experimentation—perhaps using alternative music APIs, applying these techniques to different domains entirely, or finding creative workarounds for the API limitations. The learning journey is what matters most, and I invite you to approach this with the spirit of exploration rather than as a production-ready blueprint.
Introduction
In the vast ocean of music streaming, personalized recommendations have become the compass guiding listeners to new discoveries. Behind these seemingly magical suggestions lie sophisticated systems processing numerous data points to understand musical preferences and patterns.
This article explores how to build your own Spotify recommendation system using N8N - a flexible workflow automation tool that connects various services through a visual interface without complex coding. While Spotify's recent API changes limit access to track audio features, this learning exercise still demonstrates valuable principles of recommendation systems.
Our implementation uses a QDrant vector database containing 32,833 pre-extracted tracks, each represented by numerical characteristics like danceability, energy, and tempo. By transforming these features into multidimensional vectors, we can discover connections between songs based on deeper musical characteristics rather than just genre or artist.
In the following sections, we'll break down each component of this system, from data loading to recommendation generation, showing you how to harness vector similarity for music discovery. Though API limitations affect production viability, these core concepts provide insight into how modern recommendation systems function.
Architecture
At the heart of our Spotify recommendation system lies a thoughtfully designed architecture that bridges the gap between Spotify's rich music data and personalized recommendations. The system consists of five core components working in concert to deliver music suggestions that feel both familiar and fresh.
System Overview
Our N8N configuration is built around two main pillars: the Spotify Loader and the Spotify Recommendation Agent. The loader handles the crucial task of data ingestion and preparation, while the agent serves as the intelligence layer that processes user requests and delivers recommendations.
The Recommendation Agent is equipped with three specialized tools (implemented as N8N workflows):
- Get Tracks by Artist (workflow Get_Tracks_By_Name) - searches the database for tracks from specific artists
- Get Tracks by Name (workflow Get_Track_Details) - retrieves tracks matching particular titles
- Get Recommended Tracks (workflow Get_Track_Recommendations) - discovers similar music based on a seed track's characteristics
Vector Database as the Knowledge Foundation
The QDrant vector database serves as our system's knowledge repository, storing 32,833 tracks as multidimensional vectors. Unlike traditional databases that excel at exact matches, vector databases are designed for similarity searches - perfect for finding music with comparable characteristics.
Each track in our database is represented by a vector - essentially a list of numbers that captures the track's musical DNA. When searching for recommendations, the system identifies tracks whose vectors are closest to our reference point in this multidimensional space.
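To make "closest" concrete, here is a minimal sketch of two common ways to compare vectors: cosine similarity and Euclidean distance. Which metric the QDrant collection is configured with is not stated in this post, so treat this as an illustration of the idea rather than the exact calculation our system performs. The three-dimensional toy vectors are invented for readability.

```javascript
// Illustrative only: the toy vectors below are invented, not taken from the dataset.
function dot(a, b) {
  return a.reduce((sum, value, i) => sum + value * b[i], 0);
}

// 1 means "pointing the same way", -1 means "opposite".
function cosineSimilarity(a, b) {
  const normA = Math.sqrt(dot(a, a));
  const normB = Math.sqrt(dot(b, b));
  if (normA === 0 || normB === 0) return 0; // undefined for zero vectors; 0 by convention here
  return dot(a, b) / (normA * normB);
}

// Straight-line distance; smaller means more similar.
function euclideanDistance(a, b) {
  return Math.sqrt(a.reduce((sum, value, i) => sum + (value - b[i]) ** 2, 0));
}

const seed = [1.0, 0.5, -0.2];
const similarTrack = [0.9, 0.6, -0.1];
const differentTrack = [-1.0, -0.4, 0.8];

console.log("similar:", cosineSimilarity(seed, similarTrack));
console.log("different:", cosineSimilarity(seed, differentTrack));
```

The similar track scores close to 1 and the different track scores below zero, which is exactly the ordering a nearest-neighbour search relies on.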
Data Normalization and Vectorization
The power of our recommendation system stems from how we process raw track features. The Spotify Loader normalizes the following twelve track attributes so that each contributes proportionally to recommendations:
- track_popularity
- danceability
- energy
- key
- loudness
- mode
- speechiness
- acousticness
- instrumentalness
- liveness
- valence
- tempo
This normalization involves:
- Scaling values to comparable ranges
- Calculating standard deviations to understand the distribution of each feature
- Converting processed features into vectors that represent each track's unique musical signature
The resulting vectors encode rich information about each track's musical characteristics, creating a mathematical representation that allows the system to understand music in terms of acoustic properties rather than just metadata.
This architecture provides a flexible foundation for music discovery that can be easily extended with additional features or customized to emphasize particular musical aspects based on personal preferences.
The Spotify Loader Component
The Spotify Loader serves as the foundation of our recommendation system, handling the critical process of extracting, transforming, and loading music data into our vector database. This component bridges Spotify's vast music catalog and our recommendation engine, ensuring we have high-quality, structured data to power accurate suggestions. Because of the Spotify API changes, this data can no longer be fetched from the API, so a pre-existing extract of 32,833 tracks is used instead.
Data Extraction Process
Our system utilizes a publicly available Spotify dataset from the TidyTuesday project, accessible at:
https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-01-21/spotify_songs.csv
This dataset already contains a wealth of track information including both metadata and acoustic features, eliminating the need for direct API calls to Spotify. The N8N loader workflow begins by fetching this CSV file and parsing its contents for processing.
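In N8N the fetch and parse steps would normally be handled by built-in nodes, but it helps to see what the parsing involves. The sketch below is a small quote-aware CSV parser in plain JavaScript; it is an illustration, not the workflow's actual code. Quote handling matters because track names can contain commas. The sample row is made up, and the column names track_name and track_artist are my understanding of the dataset's headers.

```javascript
// Minimal quote-aware CSV parser: handles commas and doubled quotes inside quoted fields.
function parseCsv(text) {
  const rows = [];
  let row = [];
  let field = "";
  let inQuotes = false;
  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (inQuotes) {
      if (ch === '"' && text[i + 1] === '"') {
        field += '"'; // escaped quote inside a quoted field
        i++;
      } else if (ch === '"') {
        inQuotes = false;
      } else {
        field += ch;
      }
    } else if (ch === '"') {
      inQuotes = true;
    } else if (ch === ",") {
      row.push(field);
      field = "";
    } else if (ch === "\n") {
      row.push(field);
      rows.push(row);
      row = [];
      field = "";
    } else if (ch !== "\r") {
      field += ch;
    }
  }
  if (field !== "" || row.length > 0) {
    row.push(field); // last line without a trailing newline
    rows.push(row);
  }
  const [header, ...records] = rows;
  return records.map((values) =>
    Object.fromEntries(header.map((name, i) => [name, values[i]]))
  );
}

// Made-up sample row; every parsed value is a string until converted.
const sample = 'track_name,track_artist,tempo\n"Hello, World",Some Artist,120.5\n';
const tracks = parseCsv(sample);
console.log(tracks[0].track_name, Number(tracks[0].tempo));
```

Note that every value comes back as a string, so numeric features need an explicit conversion before any statistics are computed.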
Statistical Normalization
The normalization process in our N8N workflow is handled by two critical function nodes that prepare the data for vectorization:
Mean and Standard Deviation Calculation
The first function node analyzes the entire dataset to calculate statistical properties for each of the twelve features (eleven audio features plus track popularity):
const featuresToNormalize = [
"track_popularity", "danceability", "energy", "key", "loudness", "mode",
"speechiness", "acousticness", "instrumentalness", "liveness", "valence", "tempo"
];
For each feature, the node:
- Collects all values across the tracks
- Calculates the sum and count of valid numerical values
- Computes the mean (average) of each feature
- Determines the standard deviation to understand the feature's distribution
These statistical measures provide a baseline understanding of how each feature is distributed across our music library, which is essential for proper normalization.
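The steps above can be sketched as a single function. This is a minimal version under a few assumptions of mine: it uses the population standard deviation (dividing by the count), it treats blank CSV values as missing, and the function and property names are hypothetical rather than copied from the workflow.

```javascript
// Convert CSV-style values to numbers, treating blanks as missing rather than as 0.
function toNumber(value) {
  if (value === null || value === undefined || value === "") return NaN;
  return Number(value);
}

// Returns { [feature]: { mean, stdDev, count } } using the population standard deviation.
function computeFeatureStats(tracks, features) {
  const stats = {};
  for (const feature of features) {
    const values = tracks
      .map((track) => toNumber(track[feature]))
      .filter((value) => Number.isFinite(value));
    const count = values.length;
    const mean = count > 0 ? values.reduce((sum, v) => sum + v, 0) / count : 0;
    const variance =
      count > 0 ? values.reduce((sum, v) => sum + (v - mean) ** 2, 0) / count : 0;
    stats[feature] = { mean, stdDev: Math.sqrt(variance), count };
  }
  return stats;
}

// Made-up sample: three tracks, one with a missing energy value.
const sampleTracks = [
  { tempo: "100", energy: "0.2" },
  { tempo: "120", energy: "0.4" },
  { tempo: "140", energy: "" },
];
const stats = computeFeatureStats(sampleTracks, ["tempo", "energy"]);
console.log(stats);
```

Because the same means and standard deviations are needed again whenever a new seed track is normalized, it is worth storing this stats object alongside the vectors.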
Vector Creation through Z-Score Normalization
The second function node transforms each track into a normalized vector representation using the statistical measures from the previous step:
// Standardize the feature value
let normalizedValue = (track[feature] - featureMean) / featureStdDev;
This calculation applies z-score normalization (also known as standardization), which:
- Centers each feature around zero (by subtracting the mean)
- Scales the values according to the feature's standard deviation
The result is a 12-dimensional vector for each track where:
- Values near zero represent features close to the average across all tracks
- Positive values indicate features higher than average
- Negative values indicate features lower than average
The code also handles edge cases where standard deviations might be zero or normalization might produce invalid results, ensuring robust vector creation.
This normalization ensures that features with larger numerical ranges (like tempo) don't overshadow features with smaller ranges (like danceability). The result is a balanced 12-dimensional vector where each feature contributes proportionally to similarity calculations.
By combining these diverse audio characteristics into a single vector representation, our system can identify tracks that sound similar across multiple dimensions, creating recommendations that capture the essence of your musical preferences rather than just matching metadata like artist or genre.
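Here is a minimal sketch of the vector-creation step, including the edge cases mentioned above. The post only says that zero standard deviations and invalid results are handled; falling back to 0 (meaning "average") in those cases is my assumption, as are the function name and the placeholder statistics.

```javascript
// Build a z-score vector for one track.
// Assumption: missing values, zero standard deviations, and non-finite results
// all fall back to 0, i.e. "average for this feature".
function buildVector(track, stats, features) {
  return features.map((feature) => {
    const { mean, stdDev } = stats[feature];
    const raw = Number(track[feature]);
    if (!Number.isFinite(raw) || !(stdDev > 0)) return 0;
    const z = (raw - mean) / stdDev;
    return Number.isFinite(z) ? z : 0;
  });
}

// Placeholder statistics, not the real dataset values.
const exampleStats = {
  tempo: { mean: 120, stdDev: 20 },
  danceability: { mean: 0.6, stdDev: 0.1 },
  mode: { mean: 1, stdDev: 0 }, // degenerate feature: every track has the same value
};
const exampleFeatures = ["tempo", "danceability", "mode"];
const exampleTrack = { tempo: 160, danceability: 0.5, mode: 1 };

const vector = buildVector(exampleTrack, exampleStats, exampleFeatures);
console.log(vector);
```

In this example a raw tempo gap of 40 BPM and a raw danceability gap of 0.1 end up as 2 and roughly 1 standard deviation respectively, so tempo no longer dominates simply because its raw numbers are larger.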
Data Enrichment
The loader preserves all original track metadata while adding two new properties to each track:
- normalized_features: An object containing the normalized value of each acoustic feature
- vector_embedding: A 12-dimensional array representing the track's position in our feature space
These enriched track objects are then ready for insertion into the QDrant vector database, where they can be efficiently queried for similarity-based recommendations.
By transforming raw track features into standardized vectors, the Spotify Loader component creates a mathematical representation of music that allows our system to understand sonic similarity beyond simple genre classifications or artist relationships.
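To show what "insertion" looks like, the sketch below builds request bodies for QDrant's REST API: one to create a 12-dimensional collection and one to upsert points (sent with PUT to /collections/{collection_name}/points). Several details are assumptions on my part: the collection name spotify_tracks, the Cosine distance setting, and the use of sequential integer ids. QDrant point ids must be unsigned integers or UUIDs, so Spotify's string track ids are kept in the payload instead.

```javascript
// Hypothetical collection name and distance metric; adjust to your own setup.
const collectionName = "spotify_tracks";
const createCollectionBody = { vectors: { size: 12, distance: "Cosine" } };

// Turn enriched tracks into the body for PUT /collections/{collectionName}/points.
// The vector goes into `vector`; everything else travels as payload.
function buildUpsertBody(enrichedTracks) {
  return {
    points: enrichedTracks.map((track, index) => {
      const { vector_embedding, ...payload } = track;
      return { id: index + 1, vector: vector_embedding, payload };
    }),
  };
}

// Made-up tracks with shortened vectors for readability.
const enriched = [
  { track_id: "abc", track_name: "Song A", track_artist: "Artist A", vector_embedding: [0.1, -0.2] },
  { track_id: "def", track_name: "Song B", track_artist: "Artist B", vector_embedding: [1.3, 0.4] },
];

const upsertBody = buildUpsertBody(enriched);
console.log(JSON.stringify(upsertBody, null, 2));
```

With 32,833 tracks it is sensible to send the points in batches rather than in one request; the batch size is a tuning choice.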
Features Used for Recommendations
The heart of our Spotify recommendation system lies in the audio features we use to represent each track. These features, provided by Spotify's Audio Features API, capture both the technical aspects of music production and the perceptual qualities listeners experience. Let's explore the twelve features that form our track vectors and understand why they create such effective recommendations.
Audio Characteristics
Danceability
Danceability measures how suitable a track is for dancing based on elements like tempo, rhythm stability, beat strength, and regularity. A value of 0.0 indicates least danceable while 1.0 is most danceable. This feature helps our system identify tracks with similar rhythmic qualities regardless of genre.
Energy
Energy represents the perceived intensity and activity throughout a track. High-energy tracks (closer to 1.0) feel fast, loud, and noisy, while low-energy tracks (closer to 0.0) sound more mellow and relaxed. This feature helps match the intensity level of recommendations to seed tracks.
Loudness
Measured in decibels (dB), loudness reflects the overall amplitude of a track. Values typically range from -60 to 0 dB, with higher values indicating louder tracks. This production characteristic helps match tracks with similar mixing styles and perceived volume.
Tempo
Tempo estimates the track's speed in beats per minute (BPM). This fundamental musical attribute ranges from slow ballads around 60 BPM to fast-paced electronic music exceeding 200 BPM. Our system can recognize tempo similarities even across different genres.
Musical Attributes
Key
Key identifies the overall harmonic content of the track using standard Pitch Class notation (0-11), where 0 represents C, 1 represents C♯/D♭, and so on. This feature helps match tracks with compatible harmonic structures.
Mode
Mode indicates whether a track is in a major (1) or minor (0) scale. This binary feature captures an important aspect of a song's emotional quality, as major keys often sound brighter while minor keys tend to sound more somber.
Voice and Sound Qualities
Speechiness
Speechiness detects the presence of spoken words in a track. Values closer to 1.0 indicate speech-only recordings such as talk shows or podcasts, values between roughly 0.33 and 0.66 suggest a mix of music and speech (rap, for example), and values below 0.33 most likely represent music and other non-speech content. This helps distinguish spoken-word material from sung or instrumental music.
Acousticness
Acousticness measures the confidence that a track is acoustic rather than electronic. Values closer to 1.0 represent acoustic instruments, while values near 0.0 indicate electronic production. This feature helps match tracks with similar production styles.
Instrumentalness
Instrumentalness predicts whether a track contains no vocals. Values closer to 1.0 indicate instrumental tracks, while vocal content pushes this value toward 0.0. This helps our system distinguish between vocal and instrumental music when generating recommendations.
Liveness
Liveness detects the presence of an audience in the recording, with values above 0.8 strongly indicating a live performance. This feature helps identify tracks with similar recording environments.
Emotional Qualities
Valence
Valence measures the musical positiveness conveyed by a track. High valence (closer to 1.0) suggests more positive emotions (happy, cheerful), while low valence suggests negative emotions (sad, angry). This feature is particularly powerful for matching the emotional tone of recommendations.
Track Popularity
While not an audio feature, track popularity (0-100) indicates how frequently a track has been played relative to others. Including this in our vector helps balance between recommending unknown tracks with similar audio features and tracks that have broader appeal.
The Recommendation Agent
At the center of our Spotify recommendation system stands the Recommendation Agent - an intelligent component within N8N that processes user requests and leverages the vector database to deliver personalized music suggestions. This agent acts as the interface between users and the underlying recommendation engine, interpreting natural language queries and translating them into precise database operations.
Agent Architecture
The Recommendation Agent is implemented as an N8N Tools Agent with a chat interface, allowing users to interact with the system using natural language. When a user submits a query, the agent analyzes the request, determines which tool to use, and then formats the results into a coherent response.
The agent is configured to understand various types of music discovery requests and select the appropriate workflow to handle each query type. For example, if a user asks "Show me songs like 'Galway Girl'," the agent recognizes this as a request for similar tracks and invokes the appropriate tool.
Available Tools
The agent has access to three specialized N8N workflows that serve as its tools:
Get_Tracks_By_Name
- Despite the generic name, this tool queries the QDrant vector database by artist name to retrieve tracks from a specific artist
- Example query: "Show me tracks by Ed Sheeran"
- The tool returns matching tracks
Get_Track_Details
- This tool searches for specific tracks by their title
- Example query: "Find the song 'Billie Jean'"
Get_Track_Recommendations
- This tool performs vector similarity searches to find tracks with similar audio characteristics
- Example query: "What songs are similar to 'Clocks' by Coldplay?"
- The tool leverages the 12-dimensional vectors to find truly similar-sounding music
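The three tools map onto two kinds of QDrant request, sketched below as request-body builders. Artist and title lookups are payload filters (POST to /collections/{collection_name}/points/scroll), while recommendations are a vector search (POST to /collections/{collection_name}/points/search; newer QDrant versions also offer a /points/query endpoint). The payload keys track_artist and track_name, the function names, and the idea of excluding the seed track by id are assumptions for illustration, not details confirmed by the workflows.

```javascript
// Exact-match filter on one payload field, used by the artist and title tools.
function buildFieldScrollBody(key, value, limit = 10) {
  return {
    filter: { must: [{ key, match: { value } }] },
    limit,
    with_payload: true,
  };
}

const buildArtistScrollBody = (artist, limit) =>
  buildFieldScrollBody("track_artist", artist, limit);
const buildTitleScrollBody = (title, limit) =>
  buildFieldScrollBody("track_name", title, limit);

// Nearest-neighbour search around a seed vector.
// Optionally excludes the seed's own point so it is not returned as its own match.
function buildRecommendationBody(seedVector, limit = 10, excludeId = null) {
  const body = { vector: seedVector, limit, with_payload: true };
  if (excludeId !== null) {
    body.filter = { must_not: [{ has_id: [excludeId] }] };
  }
  return body;
}

console.log(JSON.stringify(buildArtistScrollBody("Ed Sheeran"), null, 2));
console.log(JSON.stringify(buildRecommendationBody([0.2, -1.1, 0.7], 5, 42), null, 2));
```

In practice the recommendation tool needs two steps: look up the seed track by title to obtain its stored vector and id, then run the search with that vector.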
Query Processing Flow
When a user sends a message to the agent, the following process occurs:
- The agent analyzes the query to determine the user's intent
- It selects the appropriate tool based on whether the user is looking for:
  - Tracks from a specific artist
  - A specific track by name
  - Recommendations based on a reference track
- The agent invokes the selected workflow, passing relevant parameters
- The workflow queries the vector database and returns results
- The agent formats these results into a user-friendly response
Technical Implementation
The tool workflows behind the agent use N8N's HTTP Request nodes to interact with the QDrant vector database. When the agent needs to execute a tool, it:
- Prepares the appropriate parameters (artist name, track name, or reference track)
- Invokes the corresponding N8N workflow
- Waits for the workflow to complete
- Processes the returned data
The strength of this approach is its flexibility - as natural language processing models improve, the agent can be updated to handle more complex queries without changing the underlying recommendation tools.
By combining the power of vector similarity search with an intuitive chat interface, the Recommendation Agent makes complex music discovery accessible to anyone, regardless of their technical knowledge about audio features or recommendation algorithms.
Future Customization Options
Adjusting Feature Weights for Personalized Recommendations
The default recommendation system treats all audio features equally, but personalization comes from emphasizing features that matter most to individual listeners. You can modify the vector similarity calculations by implementing feature weighting in your N8N workflow:
- Create a Custom Weighting Profile: Add a function node that applies multipliers to specific features before similarity calculations. For example:
const weights = {
  "valence": 1.5,       // Emphasize emotional tone
  "energy": 1.3,        // Prioritize energy level
  "danceability": 1.2,  // Slightly increase importance of danceability
  "acousticness": 0.7   // Reduce impact of acousticness
};
// Apply weights to the normalized features; unlisted features keep a weight of 1.0
item.weighted_vector = item.vector_embedding.map((value, index) => {
  const feature = featuresToNormalize[index];
  // Use ?? rather than || so that an explicit weight of 0 is respected
  return value * (weights[feature] ?? 1.0);
});
- User Preference Interface: Create a simple form where users can adjust sliders for each feature, generating a personalized weight profile.
- Genre-Specific Weighting: Develop preset weights optimized for different genres - for example, emphasizing rhythm features for dance music or acousticness for folk recommendations.
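To see how a weight profile changes results, here is a small sketch that ranks two made-up candidates with and without weights. It applies the same weights to both the seed and the candidates and compares them with Euclidean distance; that wiring is my assumption about one reasonable way to do it, and the three-feature vectors are shortened for readability.

```javascript
// Shortened feature list and invented vectors, for illustration only.
const features = ["valence", "energy", "acousticness"];
const weights = { valence: 1.5, energy: 1.3, acousticness: 0.7 };

function applyWeights(vector, featureNames, weightProfile) {
  return vector.map((value, i) => value * (weightProfile[featureNames[i]] ?? 1.0));
}

function euclideanDistance(a, b) {
  return Math.sqrt(a.reduce((sum, value, i) => sum + (value - b[i]) ** 2, 0));
}

// Rank candidates by distance to the seed, nearest first.
function rankByDistance(seed, candidates, weightProfile = {}) {
  const weightedSeed = applyWeights(seed, features, weightProfile);
  return candidates
    .map((candidate) => ({
      name: candidate.name,
      distance: euclideanDistance(
        weightedSeed,
        applyWeights(candidate.vector, features, weightProfile)
      ),
    }))
    .sort((a, b) => a.distance - b.distance);
}

const seed = [0, 0, 0];
const candidates = [
  { name: "A", vector: [0.5, 0, 0] }, // differs from the seed in valence
  { name: "B", vector: [0, 0, 0.6] }, // differs from the seed in acousticness
];

console.log(rankByDistance(seed, candidates).map((c) => c.name));
console.log(rankByDistance(seed, candidates, weights).map((c) => c.name));
```

Without weights, track A is nearest. With the profile applied, the valence difference is amplified and the acousticness difference is damped, so track B moves to the top.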
Adding Additional Data Sources
While audio features provide a solid foundation, incorporating additional data can significantly enhance recommendation quality:
- Lyrical Analysis: Integrate a service like Genius API to fetch and analyze song lyrics, adding semantic dimensions to your recommendations.
- Genre Tags: Supplement your database with genre classifications from sources like MusicBrainz or Last.fm, allowing users to filter recommendations by genre affinity.
- Playlist Co-occurrence: Track which songs frequently appear together in user-created playlists, adding a social dimension to recommendations that pure audio analysis might miss.
- Artist Relationships: Build a graph of artists who collaborate, tour together, or influence each other, enabling "six degrees of separation" music discovery.
Creating Custom Recommendation Algorithms
Beyond vector similarity, consider implementing these algorithmic approaches in N8N:
- Hybrid Filtering: Combine content-based (audio features) with collaborative filtering (user behavior patterns) for more robust recommendations.
- Sequential Recommendations: Build a Markov chain model that learns typical song transitions in playlists, generating recommendations that flow naturally from the currently playing track.
- Mood Progression Algorithms: Create specialized recommendations that gradually shift the mood - for example, building a workout playlist that starts slow, intensifies, then cools down.
- Diversity Injection: Modify the standard nearest-neighbor approach to deliberately include some tracks that are moderately different, preventing the "echo chamber" effect of too-similar recommendations.
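As one example from the list above, diversity injection can be sketched with a greedy re-ranking in the style of maximal marginal relevance (MMR). This is a technique I am swapping in for illustration, not something the current workflows implement. Each pick balances similarity to the seed against similarity to tracks already chosen; the lambda parameter and the two-dimensional toy vectors are invented.

```javascript
function cosineSimilarity(a, b) {
  const dot = a.reduce((sum, value, i) => sum + value * b[i], 0);
  const normA = Math.sqrt(a.reduce((sum, value) => sum + value * value, 0));
  const normB = Math.sqrt(b.reduce((sum, value) => sum + value * value, 0));
  return normA === 0 || normB === 0 ? 0 : dot / (normA * normB);
}

// Greedy MMR-style selection.
// lambda = 1 reproduces plain nearest-neighbour ranking; lower values favour variety.
function diversify(seedVector, candidates, k, lambda = 0.7) {
  const remaining = [...candidates];
  const selected = [];
  while (selected.length < k && remaining.length > 0) {
    let bestIndex = 0;
    let bestScore = -Infinity;
    remaining.forEach((candidate, index) => {
      const relevance = cosineSimilarity(seedVector, candidate.vector);
      const redundancy =
        selected.length === 0
          ? 0
          : Math.max(...selected.map((s) => cosineSimilarity(s.vector, candidate.vector)));
      const score = lambda * relevance - (1 - lambda) * redundancy;
      if (score > bestScore) {
        bestScore = score;
        bestIndex = index;
      }
    });
    selected.push(remaining.splice(bestIndex, 1)[0]);
  }
  return selected;
}

const seedVector = [1, 0];
const pool = [
  { name: "A", vector: [1, 0.05] },
  { name: "B", vector: [1, 0.06] }, // near-duplicate of A
  { name: "C", vector: [0.8, -0.6] }, // related to the seed, but different from A and B
];

console.log(diversify(seedVector, pool, 2, 1).map((t) => t.name));
console.log(diversify(seedVector, pool, 2, 0.5).map((t) => t.name));
```

With lambda set to 1 the two near-duplicates are returned together; with lambda at 0.5 the second slot goes to the moderately different track instead.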
Performance and Limitations
Limitations of the Current Implementation
The most significant limitation stems from Spotify's API changes that restrict access to audio feature data. This creates several challenges:
- Limited Dataset Currency: Our system relies on a static dataset of 32,833 tracks, meaning newer releases won't appear in recommendations until the dataset is manually updated.
- Feature Consistency: Without direct API access to current feature data, there's no guarantee that Spotify's internal calculations of features like "danceability" remain consistent with our historical data.
- Cold Start Problem: The system keeps no listening history, so recommendations cannot reflect a user's broader taste; each suggestion depends entirely on the seed track or artist the user names.
Ideal Methodology
The optimal approach would follow this workflow:
- Use the Spotify API to query for a song the user likes
- Extract the song's audio features directly from the Spotify API
- Normalize these features using the same statistical parameters as our database
- Use the normalized vector to search for similar matches in QDrant
- Return the top matches as recommendations
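The normalization and search steps of this workflow can be sketched as follows. Since the audio features endpoint is no longer available, the raw feature object here is a stand-in for what that call would return, and the stored statistics are placeholder numbers rather than real dataset values. The function names are hypothetical, and only three features are shown to keep the example short.

```javascript
// Hypothetical stats saved by the loader; the numbers are placeholders.
// The key order must match the feature order used when the database was loaded.
const storedStats = {
  tempo: { mean: 120, stdDev: 20 },
  energy: { mean: 0.75, stdDev: 0.25 },
  valence: { mean: 0.5, stdDev: 0.25 },
};

// Normalize a seed track with the SAME means and standard deviations as the database.
function normalizeSeed(rawFeatures, stats) {
  return Object.keys(stats).map((feature) => {
    const raw = Number(rawFeatures[feature]);
    if (!Number.isFinite(raw)) {
      throw new Error(`Seed track is missing feature: ${feature}`);
    }
    const { mean, stdDev } = stats[feature];
    return stdDev > 0 ? (raw - mean) / stdDev : 0;
  });
}

// Body for a QDrant vector search around the normalized seed.
function buildSearchRequest(rawFeatures, stats, limit = 5) {
  return { vector: normalizeSeed(rawFeatures, stats), limit, with_payload: true };
}

// Stand-in for the features an API call would have returned for the seed track.
const seedFeatures = { tempo: 140, energy: 0.25, valence: 0.5 };
const request = buildSearchRequest(seedFeatures, storedStats);
console.log(request);
```

The key design point is that the seed is never normalized against its own statistics: reusing the stored parameters is what places a brand-new track in the same feature space as the 32,833 tracks already in the database.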
Conclusion
Our N8N-powered Spotify recommendation system demonstrates the power of combining vector similarity search with audio feature analysis. By representing tracks as multidimensional vectors and leveraging QDrant's efficient similarity calculations, we've created a system that:
- Discovers music connections based on acoustic properties rather than just metadata
- Retrieves tracks by artist or title and recommends similar music from a specific seed track
- Delivers suggestions that capture the essential musical qualities that make listeners enjoy particular tracks
- Operates efficiently without requiring deep coding knowledge, thanks to N8N's visual workflow design
Workflow
Below is the workflow implementation for our Spotify recommendation system. Note that the individual tools—Get_Tracks_By_Name, Get_Track_Details, and Get_Track_Recommendations—need to be configured as their own separate N8N workflows. The main Agent workflow should then be updated to reference these tools by their respective workflow IDs. This modular design allows for easier maintenance and the ability to upgrade individual components without disrupting the entire system. Each workflow has been carefully structured to handle its specific task while maintaining compatibility with the overall recommendation architecture.