Spotify Recommendation System using N8N

Uncover the magic behind Spotify's music recommendations! Learn how to build your own system using N8N, vector databases, and audio analysis. Explore track features like danceability & tempo to create personalized playlists. A fun, insightful learning exercise!

Preface

Welcome to this exploration of building a Spotify recommendation system using N8N! Before we dive in, I'd like to set some important context for this project.

This blog post represents a learning exercise—a journey into understanding how music recommendation systems function at a fundamental level. By working with vector databases, audio feature analysis, and similarity algorithms, we can gain valuable insights into the mechanics behind the personalized music suggestions we encounter daily.

It's important to note that this project faces a significant limitation: Spotify has recently deprecated portions of their API that previously provided access to track audio features. This change means that while the system described here works with our existing dataset of 32,833 tracks, expanding it with new music or implementing it as a production service would require alternative approaches.

Despite this limitation, the concepts and techniques explored here remain valuable. The principles of vector similarity, feature normalization, and multidimensional representation apply across recommendation domains. Whether you're interested in music technology, data science, or just curious about how your favorite streaming service seems to know your taste so well, I hope this walkthrough provides useful insights.

Consider this project a springboard for your own experimentation—perhaps using alternative music APIs, applying these techniques to different domains entirely, or finding creative workarounds for the API limitations. The learning journey is what matters most, and I invite you to approach this with the spirit of exploration rather than as a production-ready blueprint.

Introduction

In the vast ocean of music streaming, personalized recommendations have become the compass guiding listeners to new discoveries. Behind these seemingly magical suggestions lie sophisticated systems processing numerous data points to understand musical preferences and patterns.

This article explores how to build your own Spotify recommendation system using N8N - a flexible workflow automation tool that connects various services through a visual interface without complex coding. While Spotify's recent API changes limit access to track audio features, this learning exercise still demonstrates valuable principles of recommendation systems.

Our implementation uses a QDrant vector database containing 32,833 pre-extracted tracks, each represented by numerical characteristics like danceability, energy, and tempo. By transforming these features into multidimensional vectors, we can discover connections between songs based on deeper musical characteristics rather than just genre or artist.

In the following sections, we'll break down each component of this system, from data loading to recommendation generation, showing you how to harness vector similarity for music discovery. Though API limitations affect production viability, these core concepts provide insight into how modern recommendation systems function.

Architecture

At the heart of our Spotify recommendation system lies a thoughtfully designed architecture that bridges the gap between Spotify's rich music data and personalized recommendations. The system consists of five core components working in concert to deliver music suggestions that feel both familiar and fresh.

System Overview

Our N8N configuration is built around two main pillars: the Spotify Loader and the Spotify Recommendation Agent. The loader handles the crucial task of data ingestion and preparation, while the agent serves as the intelligence layer that processes user requests and delivers recommendations.

The Recommendation Agent is equipped with three specialized tools (implemented as N8N workflows):

  • Get Tracks by Artist - searches the database for tracks from specific artists
  • Get Tracks by Name - retrieves tracks matching particular titles
  • Get Recommended Tracks - discovers similar music based on a seed track's characteristics

Vector Database as the Knowledge Foundation

The QDrant vector database serves as our system's knowledge repository, storing 32,833 tracks as multidimensional vectors. Unlike traditional databases that excel at exact matches, vector databases are designed for similarity searches - perfect for finding music with comparable characteristics.

Each track in our database is represented by a vector - essentially a list of numbers that captures the track's musical DNA. When searching for recommendations, the system identifies tracks whose vectors are closest to our reference point in this multidimensional space.
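
To make this concrete, a nearest-neighbour lookup in QDrant is a single search request. The snippet below sketches the shape of that request body as one of our HTTP Request nodes might send it; the collection name (spotify_tracks) and the example values are illustrative assumptions rather than values taken from the workflow.

// Illustrative QDrant search body - POST /collections/spotify_tracks/points/search
const searchBody = {
  vector: [0.4, -1.1, 0.1, 0.8, -0.3, 1.0, -0.6, 0.2, -1.4, 0.0, 0.9, -0.2], // the seed track's 12 normalized features
  limit: 10,           // number of nearest neighbours to return
  with_payload: true   // include the stored track metadata with each hit
};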

Data Normalization and Vectorization

The power of our recommendation system stems from how we process raw track features. The Spotify Loader normalizes each track's attributes to ensure they contribute proportionally to recommendations.

  • track_popularity
  • danceability
  • energy
  • key
  • loudness
  • mode
  • speechiness
  • acousticness
  • instrumentalness
  • liveness
  • valence
  • tempo

This normalization involves:

  • Scaling values to comparable ranges
  • Calculating standard deviations to understand the distribution of each feature
  • Converting processed features into vectors that represent each track's unique musical signature

The resulting vectors encode rich information about each track's musical characteristics, creating a mathematical representation that allows the system to understand music in terms of acoustic properties rather than just metadata.

This architecture provides a flexible foundation for music discovery that can be easily extended with additional features or customized to emphasize particular musical aspects based on personal preferences.

The Spotify Loader Component

The Spotify Loader serves as the foundation of our recommendation system, handling the critical process of extracting, transforming, and loading music data into our vector database. This component bridges Spotify's vast music catalog and our recommendation engine, ensuring we have high-quality, structured data to power accurate suggestions. Due to the Spotify API changes this data is no longer available directly, so a pre-existing extract of 32,833 tracks has been used instead.

Data Extraction Process

Our system utilizes a publicly available Spotify dataset from the TidyTuesday project, accessible at:

https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-01-21/spotify_songs.csv

This dataset already contains a wealth of track information including both metadata and acoustic features, eliminating the need for direct API calls to Spotify. The N8N loader workflow begins by fetching this CSV file and parsing its contents for processing.
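
Outside of N8N's visual nodes, the same fetch-and-parse step could be sketched in a few lines of JavaScript. This is only a rough illustration, not the workflow's actual code; in particular, the naive comma split below would trip over quoted fields, so a real implementation should use a proper CSV parser.

// Rough sketch of the data extraction step (illustrative only).
const url = "https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-01-21/spotify_songs.csv";
const csvText = await (await fetch(url)).text();
const [headerLine, ...rows] = csvText.trim().split("\n");
const headers = headerLine.split(",");
const tracks = rows.map(line => {
  const cells = line.split(",");                                // naive split; quoted commas need a real CSV parser
  return Object.fromEntries(headers.map((h, i) => [h, cells[i]]));
});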

Statistical Normalization

The normalization process in our N8N workflow is handled by two critical function nodes that prepare the data for vectorization:

Mean and Standard Deviation Calculation

The first function node analyzes the entire dataset to calculate statistical properties for each of the twelve acoustic features:

const featuresToNormalize = [
    "track_popularity", "danceability", "energy", "key", "loudness", "mode",
    "speechiness", "acousticness", "instrumentalness", "liveness", "valence", "tempo"
];

For each feature, the node:

  • Collects all values across the tracks
  • Calculates the sum and count of valid numerical values
  • Computes the mean (average) of each feature
  • Determines the standard deviation to understand the feature's distribution

These statistical measures provide a baseline understanding of how each feature is distributed across our music library, which is essential for proper normalization.
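
In JavaScript, that first pass looks roughly like the sketch below. It assumes the parsed rows are available in a tracks array and reuses the featuresToNormalize list shown above, so treat it as an outline of the function node rather than its verbatim contents.

// Sketch of the statistics pass: one mean and standard deviation per feature.
const stats = {};
for (const feature of featuresToNormalize) {
  const values = tracks
    .map(t => Number(t[feature]))
    .filter(v => Number.isFinite(v));                           // keep only valid numeric values
  const mean = values.reduce((sum, v) => sum + v, 0) / values.length;
  const variance = values.reduce((sum, v) => sum + (v - mean) ** 2, 0) / values.length;
  stats[feature] = { mean, stdDev: Math.sqrt(variance) };
}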

Vector Creation through Z-Score Normalization

The second function node transforms each track into a normalized vector representation using the statistical measures from the previous step:

// Standardize the feature value
let normalizedValue = (track[feature] - featureMean) / featureStdDev;

This calculation applies z-score normalization (also known as standardization), which:

  • Centers each feature around zero (by subtracting the mean)
  • Scales the values according to the feature's standard deviation

The result is a 12-dimensional vector for each track where:

  • Values near zero represent features close to the average across all tracks
  • Positive values indicate features higher than average
  • Negative values indicate features lower than average

The code also handles edge cases where standard deviations might be zero or normalization might produce invalid results, ensuring robust vector creation.
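
A guard along these lines covers those cases. This is an illustrative sketch that assumes the per-feature statistics live in a stats object like the one outlined earlier; it is not the workflow's exact code.

// Fall back to 0 (the feature mean) when a value is missing or the feature has no spread.
const vector = featuresToNormalize.map(feature => {
  const { mean, stdDev } = stats[feature];
  const raw = Number(track[feature]);
  if (!Number.isFinite(raw) || !stdDev) return 0;               // invalid value or zero standard deviation
  const z = (raw - mean) / stdDev;
  return Number.isFinite(z) ? z : 0;                            // final safety net against NaN/Infinity
});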

This normalization ensures that features with larger numerical ranges (like tempo) don't overshadow features with smaller ranges (like danceability). The result is a balanced 12-dimensional vector where each feature contributes proportionally to similarity calculations.

By combining these diverse audio characteristics into a single vector representation, our system can identify tracks that sound similar across multiple dimensions, creating recommendations that capture the essence of your musical preferences rather than just matching metadata like artist or genre.

Data Enrichment

The loader preserves all original track metadata while adding two new properties to each track:

  • normalized_features: An object containing the normalized value of each acoustic feature
  • vector_embedding: A 12-dimensional array representing the track's position in our feature space

These enriched track objects are then ready for insertion into the QDrant vector database, where they can be efficiently queried for similarity-based recommendations.
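
Insertion itself goes through QDrant's standard upsert endpoint. The sketch below shows one enriched track being written; the host, collection name, point id and payload fields are assumptions chosen for illustration.

// Upserting one enriched track into QDrant (illustrative values).
await fetch("http://localhost:6333/collections/spotify_tracks/points", {
  method: "PUT",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    points: [{
      id: 1,                                    // any unique id for the point
      vector: track.vector_embedding,           // the 12-dimensional normalized vector
      payload: {                                // original metadata stored alongside the vector
        track_name: track.track_name,
        track_artist: track.track_artist,
        normalized_features: track.normalized_features
      }
    }]
  })
});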

By transforming raw track features into standardized vectors, the Spotify Loader component creates a mathematical representation of music that allows our system to understand sonic similarity beyond simple genre classifications or artist relationships.

Features Used for Recommendations

The heart of our Spotify recommendation system lies in the audio features we use to represent each track. These features, originally provided by Spotify's Audio Features API (since deprecated), capture both the technical aspects of music production and the perceptual qualities listeners experience. Let's explore the twelve features that form our track vectors and understand why they create such effective recommendations.

Audio Characteristics

Danceability

Danceability measures how suitable a track is for dancing based on elements like tempo, rhythm stability, beat strength, and regularity. A value of 0.0 indicates least danceable while 1.0 is most danceable. This feature helps our system identify tracks with similar rhythmic qualities regardless of genre.

Energy

Energy represents the perceived intensity and activity throughout a track. High-energy tracks (closer to 1.0) feel fast, loud, and noisy, while low-energy tracks (closer to 0.0) sound more mellow and relaxed. This feature helps match the intensity level of recommendations to seed tracks.

Loudness

Measured in decibels (dB), loudness reflects the overall amplitude of a track. Values typically range from -60 to 0 dB, with higher values indicating louder tracks. This production characteristic helps match tracks with similar mixing styles and perceived volume.

Tempo

Tempo estimates the track's speed in beats per minute (BPM). This fundamental musical attribute ranges from slow ballads around 60 BPM to fast-paced electronic music exceeding 200 BPM. Our system can recognize tempo similarities even across different genres.

Musical Attributes

Key

Key identifies the overall harmonic content of the track using standard Pitch Class notation (0-11), where 0 represents C, 1 represents C♯/D♭, and so on. This feature helps match tracks with compatible harmonic structures.

Mode

Mode indicates whether a track is in a major (1) or minor (0) scale. This binary feature captures an important aspect of a song's emotional quality, as major keys often sound brighter while minor keys tend to sound more somber.

Voice and Sound Qualities

Speechiness

Speechiness detects the presence of spoken words in a track. Values closer to 1.0 represent recordings made mostly of speech, such as talk shows or podcasts, while values below 0.33 typically represent music and other non-speech-like tracks. This helps distinguish speech-heavy recordings such as rap or spoken word from predominantly musical tracks.

Acousticness

Acousticness measures the confidence that a track is acoustic rather than electronic. Values closer to 1.0 represent acoustic instruments, while values near 0.0 indicate electronic production. This feature helps match tracks with similar production styles.

Instrumentalness

Instrumentalness predicts whether a track contains no vocals. Values closer to 1.0 indicate instrumental tracks, while vocal content pushes this value toward 0.0. This helps our system distinguish between vocal and instrumental music when generating recommendations.

Liveness

Liveness detects the presence of an audience in the recording, with values above 0.8 strongly indicating a live performance. This feature helps identify tracks with similar recording environments.

Emotional Qualities

Valence

Valence measures the musical positiveness conveyed by a track. High valence (closer to 1.0) suggests more positive emotions (happy, cheerful), while low valence suggests negative emotions (sad, angry). This feature is particularly powerful for matching the emotional tone of recommendations.

Track Popularity

While not an audio feature, track popularity (0-100) indicates how frequently a track has been played relative to others. Including this in our vector helps balance between recommending unknown tracks with similar audio features and tracks that have broader appeal.

The Recommendation Agent

At the center of our Spotify recommendation system stands the Recommendation Agent - an intelligent component within N8N that processes user requests and leverages the vector database to deliver personalized music suggestions. This agent acts as the interface between users and the underlying recommendation engine, interpreting natural language queries and translating them into precise database operations.

Agent Architecture

The Recommendation Agent is implemented as an N8N Tools Agent with a chat interface, allowing users to interact with the system using natural language. When a user submits a query, the agent analyzes the request, determines which tool to use, and then formats the results into a coherent response.

The agent is configured to understand various types of music discovery requests and select the appropriate workflow to handle each query type. For example, if a user asks "Show me songs like 'Galway Girl'," the agent recognizes this as a request for similar tracks and invokes the appropriate tool.

Available Tools

The agent has access to three specialized N8N workflows that serve as its tools:

Get_Tracks_By_Name

  • This tool queries the QDrant vector database to retrieve tracks from a specific artist
  • Example query: "Show me tracks by Ed Sheeran"
  • The tool returns matching tracks

Get_Track_Details

  • This tool searches for specific tracks by their title
  • Example query: "Find the song 'Billie Jean'"

Get_Track_Recommendations

  • This tool performs vector similarity searches to find tracks with similar audio characteristics
  • Example query: "What songs are similar to 'Clocks' by Coldplay?"
  • The tool leverages the 12-dimensional vectors to find truly similar-sounding music
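
Conceptually, this tool is a two-step query: look up the seed track's stored vector, then ask QDrant for its nearest neighbours. The sketch below illustrates that flow; the host, collection name and payload field names are assumptions, not the workflow's exact configuration.

// Step 1: find the seed track and retrieve its stored vector.
const base = "http://localhost:6333/collections/spotify_tracks";
const seed = await (await fetch(`${base}/points/scroll`, {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    filter: { must: [{ key: "track_name", match: { value: "Clocks" } }] },
    limit: 1,
    with_payload: true,
    with_vector: true                            // we need the stored vector to use as the query
  })
})).json();

// Step 2: search for the tracks whose vectors are closest to the seed's.
const seedVector = seed.result.points[0].vector;
const similar = await (await fetch(`${base}/points/search`, {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ vector: seedVector, limit: 10, with_payload: true })
})).json();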

Query Processing Flow

When a user sends a message to the agent, the following process occurs:

  1. The agent analyzes the query to determine the user's intent
  2. It selects the appropriate tool based on whether the user is looking for:
    • Tracks from a specific artist
    • A specific track by name
    • Recommendations based on a reference track
  3. The agent invokes the selected workflow, passing relevant parameters
  4. The workflow queries the vector database and returns results
  5. The agent formats these results into a user-friendly response

Technical Implementation

The agent is implemented using N8N's HTTP Request nodes to interact with the QDrant vector database. When the agent needs to execute a tool, it:

  • Prepares the appropriate parameters (artist name, track name, or reference track)
  • Invokes the corresponding N8N workflow
  • Waits for the workflow to complete
  • Processes the returned data
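
The last of those steps usually amounts to flattening QDrant's response into a compact list the agent can read back to the user. Here is a sketch of that reshaping, assuming the similar search response and the payload field names used in the earlier sketches.

// Reshape QDrant hits into a compact list for the agent's reply (illustrative field names).
const recommendations = similar.result.map(hit => ({
  track: hit.payload.track_name,
  artist: hit.payload.track_artist,
  score: hit.score                               // similarity score; higher means closer
}));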

The strength of this approach is its flexibility - as natural language processing models improve, the agent can be updated to handle more complex queries without changing the underlying recommendation tools.

By combining the power of vector similarity search with an intuitive chat interface, the Recommendation Agent makes complex music discovery accessible to anyone, regardless of their technical knowledge about audio features or recommendation algorithms.

Future Customization Options

Adjusting Feature Weights for Personalized Recommendations

The default recommendation system treats all audio features equally, but personalization comes from emphasizing features that matter most to individual listeners. You can modify the vector similarity calculations by implementing feature weighting in your N8N workflow:

  1. Create a Custom Weighting Profile: Add a function node that applies multipliers to specific features before similarity calculations. For example:
const weights = {
  "valence": 1.5,       // Emphasize emotional tone
  "energy": 1.3,        // Prioritize energy level
  "danceability": 1.2,  // Slightly increase importance of danceability
  "acousticness": 0.7   // Reduce impact of acousticness
};

// Apply weights to normalized features
item.weighted_vector = item.vector_embedding.map((value, index) => {
  const feature = featuresToNormalize[index];
  return value * (weights[feature] || 1.0);
});
  2. User Preference Interface: Create a simple form where users can adjust sliders for each feature, generating a personalized weight profile.
  3. Genre-Specific Weighting: Develop preset weights optimized for different genres - for example, emphasizing rhythm features for dance music or acousticness for folk recommendations.

Adding Additional Data Sources

While audio features provide a solid foundation, incorporating additional data can significantly enhance recommendation quality:

  1. Lyrical Analysis: Integrate a service like Genius API to fetch and analyze song lyrics, adding semantic dimensions to your recommendations.
  2. Genre Tags: Supplement your database with genre classifications from sources like MusicBrainz or Last.fm, allowing users to filter recommendations by genre affinity.
  3. Playlist Co-occurrence: Track which songs frequently appear together in user-created playlists, adding a social dimension to recommendations that pure audio analysis might miss.
  4. Artist Relationships: Build a graph of artists who collaborate, tour together, or influence each other, enabling "six degrees of separation" music discovery.

Creating Custom Recommendation Algorithms

Beyond vector similarity, consider implementing these algorithmic approaches in N8N:

  1. Hybrid Filtering: Combine content-based (audio features) with collaborative filtering (user behavior patterns) for more robust recommendations.
  2. Sequential Recommendations: Build a Markov chain model that learns typical song transitions in playlists, generating recommendations that flow naturally from the currently playing track.
  3. Mood Progression Algorithms: Create specialized recommendations that gradually shift the mood - for example, building a workout playlist that starts slow, intensifies, then cools down.
  4. Diversity Injection: Modify the standard nearest-neighbor approach to deliberately include some tracks that are moderately different, preventing the "echo chamber" effect of too-similar recommendations.
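
To give one concrete example of that last idea, diversity injection can be as simple as re-assembling the result list from two pools of neighbours. The sketch below assumes a neighbours array of QDrant hits sorted from most to least similar; the 80/20 split and the candidate window are arbitrary illustrative choices.

// Mix mostly-close matches with a few picks from further down the similarity ranking.
function diversify(neighbours, count = 10) {
  const closeCount = Math.ceil(count * 0.8);
  const close = neighbours.slice(0, closeCount);               // very similar tracks
  const adventurous = neighbours
    .slice(20, 40)                                             // moderately different candidates
    .sort(() => Math.random() - 0.5)                           // cheap shuffle, fine for a sketch
    .slice(0, count - closeCount);
  return [...close, ...adventurous];
}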

Performance and Limitations

Limitations of the Current Implementation

The most significant limitation stems from Spotify's API changes that restrict access to audio feature data. This creates several challenges:

  1. Limited Dataset Currency: Our system relies on a static dataset of 32,833 tracks, meaning newer releases won't appear in recommendations until the dataset is manually updated.
  2. Feature Consistency: Without direct API access to current feature data, there's no guarantee that Spotify's internal calculations of features like "danceability" remain consistent with our historical data.
  3. Cold Start Problem: New users with no listening history receive generic recommendations until sufficient preference data accumulates.

Ideal Methodology

The optimal approach would follow this workflow:

  1. Use the Spotify API to query for a song the user likes
  2. Extract the song's audio features directly from the Spotify API
  3. Normalize these features using the same statistical parameters as our database
  4. Use the normalized vector to search for similar matches in QDrant
  5. Return the top matches as recommendations
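
Were the audio-features endpoint still available, that flow could be sketched roughly as follows. The Spotify search and (now deprecated) audio-features endpoints are the documented ones; the accessToken, the stats object of stored means and standard deviations, and the final QDrant search are assumptions carried over from the earlier sketches.

// Sketch of the ideal flow (relies on the deprecated audio-features endpoint).
const headers = { Authorization: `Bearer ${accessToken}` };    // a valid Spotify OAuth token is assumed

// 1-2. Find the track the user likes and fetch its audio features.
const search = await (await fetch(
  "https://api.spotify.com/v1/search?q=Clocks%20Coldplay&type=track&limit=1", { headers })).json();
const track = search.tracks.items[0];
const audio = await (await fetch(
  `https://api.spotify.com/v1/audio-features/${track.id}`, { headers })).json();

// 3. Normalize with the same statistics used when loading the dataset.
const raw = { ...audio, track_popularity: track.popularity };  // popularity lives on the track object, not audio-features
const queryVector = featuresToNormalize.map(
  f => (raw[f] - stats[f].mean) / stats[f].stdDev);

// 4-5. queryVector can now be sent to the QDrant search endpoint shown earlier,
// and the top hits returned as recommendations.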

Conclusion

Our N8N-powered Spotify recommendation system demonstrates the power of combining vector similarity search with audio feature analysis. By representing tracks as multidimensional vectors and leveraging QDrant's efficient similarity calculations, we've created a system that:

  • Discovers music connections based on acoustic properties rather than just metadata
  • Provides look-ups by artist or track title as well as similarity-based recommendations from a seed track
  • Delivers suggestions that capture the essential musical qualities that make listeners enjoy particular tracks
  • Operates efficiently without requiring deep coding knowledge, thanks to N8N's visual workflow design

Workflow

Below is the workflow implementation for our Spotify recommendation system. Note that the individual tools—Get_Tracks_By_Name, Get_Track_Details, and Get_Track_Recommendations—need to be configured as their own separate N8N workflows. The main Agent workflow should then be updated to reference these tools by their respective workflow IDs. This modular design allows for easier maintenance and the ability to upgrade individual components without disrupting the entire system. Each workflow has been carefully structured to handle its specific task while maintaining compatibility with the overall recommendation architecture.

Resources