Configuration Parameters

These are the parameters accepted for this script. From v1.0.0, only PostgreSQL, Redis, and TZ configuration must still be configured via environment variables. All other configuration values are managed through the browser setup wizard and persisted in the database. For compatibility with legacy installations, environment variables are imported into the database automatically on first startup. The Setup Wizard is shown on clean installation as lending page and is also available later from the menu under Administration > Setup Wizard.

How to find jellyfin userid: * Log into Jellyfin from your browser as an admin * Go to Dashboard > “admin panel” > Users. * Click on the user’s name that you are interested * The User ID is visible in the URL (is the part just after = ): * http://your-jellyfin-server/web/index.html#!/useredit.html?userId=xxxxx

How to create an the jellyfin's API token: * The API Token, still as admin you can go to Dashboard > “Admin panel” > API Key and create a new one.

The mandatory parameter that you need to change from the example are this:

Parameter Description Default Value
Mediaserver General
JELLYFIN_URL (Required) Your Jellyfin server's full URL http://YOUR_JELLYFIN_IP:8096
JELLYFIN_USER_ID (Required) Jellyfin User ID. (N/A - from Secret)
JELLYFIN_TOKEN (Required) Jellyfin API Token. (N/A - from Secret)
EMBY_URL (Required) Your Emby server's full URL http://YOUR_EMBY_IP:8096
EMBY_USER_ID (Required) Emby User ID. (N/A - from Secret)
EMBY_TOKEN (Required) Emby API Token. (N/A - from Secret)
NAVIDROME_URL (Required) Your Navidrome server's full URL http://YOUR_JELLYFIN_IP:4553
NAVIDROME_USER (Required) Navidrome User ID. (N/A - from Secret)
NAVIDROME_PASSWORD (Required) Navidrome user Password. (N/A - from Secret)
LYRION_URL (Required) Your Lyrion server's full URL http://YOUR_LYRION_IP:9000
POSTGRES_USER (Required) PostgreSQL username. (N/A - from Secret)
POSTGRES_PASSWORD (Required) PostgreSQL password. (N/A - from Secret)
POSTGRES_DB (Required) PostgreSQL database name. (N/A - from Secret)
POSTGRES_HOST (Required) PostgreSQL host. postgres-service.playlist
POSTGRES_PORT (Required) PostgreSQL port. 5432
REDIS_URL (Required) URL for Redis. redis://localhost:6379/0
GEMINI_API_KEY (Required if AI_MODEL_PROVIDER is GEMINI) Your Google Gemini API Key. (N/A - from Secret)
MISTRAL_API_KEY (Required if AI_MODEL_PROVIDER is MISTRAL) Your Mistral API Key. (N/A - from Secret)
OPENAI_API_KEY (Required if AI_MODEL_PROVIDER is OPENAI) Your OpenAI / OpenRouter API Key. (N/A - from Secret)
AudioMuse-AI Authentication
AUTH_ENABLED Enable the AudioMuse-AI authentication layer true
AUDIOMUSE_USER Username for web UI login (N/A - from Secret)
AUDIOMUSE_PASSWORD Password for web UI login (N/A - from Secret)
API_TOKEN Bearer token for API/worker requests (N/A - from Secret)
JWT_SECRET HMAC key used to sign session JWTs from Secret OR automatically created if blank

These parameters can be left as-is:

Parameter Description Default Value
CLEANING_SAFETY_LIMIT Max number of albums deleted during cleaning 100
MUSIC_LIBRARIES Comma-separated list of music libraries/folders for analysis. If empty, all libraries/folders are scanned. For Lyrion: Use folder paths like "/music/myfolder". For Jellyfin/Navidrome: Use library/folder names. "" (empty - scan all)
ENABLE_PROXY_FIX Enable Proxy Fix for Flask when behind a reverse proxy. Example Nginx configuration: config.py false
WORKER_URL This is the Url your worker instance runs on. The server instance uses this parameter to call the worker. Make sure to include /worker at the end of the url (e.g. http://worker.example.com:8029/worker) false
WORKER_POSTGRES_HOST This is the Url of your the postgres service on your server. The worker uses this to connect the postgres service the flask app uses too. Make sure to not include a protocol (like "http") (e.g. 100.000.00.00) false
WORKER_REDIS_URL This is the Url of your the redis service on your server. The worker uses this to connect to the redis service the flask app uses too. Make sure to include the protocol "redis://" and the dbindex "/0" (e.g. redis://100.000.00.00:6379/0) false
TZ Set the time zone of Flask and worker container UTC

These are the default parameters used when launching analysis or clustering tasks. You can change them directly in the front-end.

Parameter Description Default Value
CLAP - TEXT SEARCH AND MUSICNN MODEL
CLAP_ENABLED If false disable CLAP model during the analysis and the use of Text Search functionality. true
CLAP_PYTHON_MULTITHREADS CPU threading for CLAP analysis. False (default) = Use ONNX internal threading (recommended). True = Use Python ThreadPoolExecutor false
PER_SONG_MODEL_RELOAD Model reloading strategy. true (default) = Unload MusiCNN and CLAP after each song (stable VRAM, slower). false = MusiCNN reloads every 20 songs, CLAP at album end (faster but may accumulate VRAM) true
Analysis General
NUM_RECENT_ALBUMS Number of recent albums to scan (0 for all). 0
TOP_N_MOODS Number of top moods per track for feature vector. 5
CLAP_ENABLED Enable or disable CLAP model for text-to-audio search capabilities. true
CLAP_PYTHON_MULTITHREADS CPU threading for CLAP analysis. False (default) = Use ONNX internal threading (recommended). True = Use Python ThreadPoolExecutor false
Clustering General
ENABLE_CLUSTERING_EMBEDDINGS Whether to use audio embeddings (True) or score-based features (False) for clustering. true
CLUSTER_ALGORITHM Default clustering: kmeans, dbscan, gmm, spectral. kmeans
MAX_SONGS_PER_CLUSTER Max songs per generated playlist segment. 0
MAX_SONGS_PER_ARTIST Max songs from one artist per cluster. 3
MAX_DISTANCE Normalized distance threshold for tracks in a cluster. 0.5
CLUSTERING_RUNS Iterations for Monte Carlo evolutionary search. 1000
TOP_N_PLAYLISTS POST Clustering it keep only the top N diverse playlist. 8
USE_GPU_CLUSTERING When true enalbe the use of GPU on K-Means, DBSCAN and PCA false
Similarity General
VOYAGER_EF_CONSTRUCTION Number of elements analyzed when building the neighbor list. Higher = better graph quality + slower rebuilds. 200 is a tuned default; 1024 was the previous default and gives marginally better recall at ~3-4× the rebuild time. 200
VOYAGER_M Number of neighbor links per node in the HNSW graph. Higher = better recall + larger on-disk index + slower rebuilds. 32 is a tuned default; 64 was the previous default and roughly doubles the link payload. 32
VOYAGER_QUERY_EF Number neighbor analyzed during the query. 1024
VOYAGER_METRIC Different tipe of distance metrics: angular, euclidean,dot angular
SIMILARITY_ELIMINATE_DUPLICATES_DEFAULT It enable the possibility of use the MAX_SONGS_PER_ARTIST also in similar song true
SIMILARITY_RADIUS_DEFAULT Default behavior for radius similarity mode. When true, similarity results may be re-ordered using the radius (bucketed) algorithm for better listening paths. true
Sonic Fingerprint General
SONIC_FINGERPRINT_NEIGHBORS Default number of track for the sonic fingerprint 100
SONIC_FINGERPRINT_MAX_SONGS_PER_ALBUM Navidrome only. Max tracks a single album may contribute to the fingerprint seed pool, so one large album (e.g. a 100+ track DJ mix) cannot dominate. Other media servers fetch top songs directly and ignore this. 3
Song Alchemy General
ALCHEMY_DEFAULT_N_RESULTS Number of similar songs to return when creating the Alchemy result (default). 100
ALCHEMY_MAX_N_RESULTS Maximum number of similar songs to return for Alchemy results. 200
ALCHEMY_TEMPERATURE Temperature for probabilistic sampling in Song Alchemy (softmax temperature). Use 0.0 for deterministic selection. 1.0
Similar Song and Song Path Duplicate filtering General
DUPLICATE_DISTANCE_THRESHOLD_COSINE Less than this cosine distance the track is a duplicate. 0.01
DUPLICATE_DISTANCE_THRESHOLD_EUCLIDEAN Less than this euclidean distance the track is a duplicate. 0.15
DUPLICATE_DISTANCE_CHECK_LOOKBACK How many previous song need to be checked for duplicate. 1
MOOD_SIMILARITY_THRESHOLD Maximum normalized distance for mood similarity filtering. Lower value will give more importance to mood 0.15
Song Path General
PATH_DISTANCE_METRIC The distance metric to use for pathfinding. Options: 'angular', 'euclidean' angular
PATH_DEFAULT_LENGTH Default number of songs in the path if not specified in the API request 25
PATH_AVG_JUMP_SAMPLE_SIZE Number of random songs to sample for calculating the average jump distance 200
PATH_CANDIDATES_PER_STEP Number of candidate songs to retrieve from Voyager for each step in the path 25
PATH_LCORE_MULTIPLIER It multiply the number of centroid created based on the distance. Higher is better for distant song and worst for nearest. 3
PATH_FIX_SIZE When true, path generation will attempt to produce exactly the requested path length using centroid merging and backfilling. When false, the algorithm will perform a single best pick per centroid and may return a shorter path. Can be overridden per-request via the path_fix_size query parameter. false
Evolutionary Clustering & Scoring
ITERATIONS_PER_BATCH_JOB Number of clustering iterations processed per RQ batch job. 20
MAX_CONCURRENT_BATCH_JOBS Maximum number of clustering batch jobs to run simultaneously. 10
CLUSTERING_BATCH_TIMEOUT_MINUTES Max time a batch can run before being considered failed (prevents infinite hangs). 60
CLUSTERING_MAX_FAILED_BATCHES Max number of failed batches before stopping new launches and forcing completion. 10
CLUSTERING_BATCH_CHECK_INTERVAL_SECONDS How often to check batch status for timeout detection. 30
TOP_K_MOODS_FOR_PURITY_CALCULATION Number of centroid's top moods to consider when calculating playlist purity. 3
EXPLOITATION_START_FRACTION Fraction of runs before starting to use elites. 0.2
EXPLOITATION_PROBABILITY_CONFIG Probability of mutating an elite vs. random generation. 0.7
MUTATION_INT_ABS_DELTA Max absolute change for integer parameter mutation. 3
MUTATION_FLOAT_ABS_DELTA Max absolute change for float parameter mutation. 0.05
MUTATION_KMEANS_COORD_FRACTION Fractional change for KMeans centroid coordinates. 0.05
K-Means Ranges
NUM_CLUSTERS_MIN Min $K$ for K-Means. 40
NUM_CLUSTERS_MAX Max $K$ for K-Means. 100
DBSCAN Ranges
DBSCAN_EPS_MIN Min epsilon for DBSCAN. 0.1
DBSCAN_EPS_MAX Max epsilon for DBSCAN. 0.5
DBSCAN_MIN_SAMPLES_MIN Min min_samples for DBSCAN. 5
DBSCAN_MIN_SAMPLES_MAX Max min_samples for DBSCAN. 20
GMM Ranges
GMM_N_COMPONENTS_MIN Min components for GMM. 40
GMM_N_COMPONENTS_MAX Max components for GMM. 100
GMM_COVARIANCE_TYPE Covariance type for GMM (task uses full). full
Spectral Ranges
SPECTRAL_N_CLUSTERS_MIN Min components for Spectral clustering. 40
SPECTRAL_N_CLUSTERS_MAX Max components for Spectral clustering. 100
SPECTRAL_N_NEIGHBORS Number of Neighbors on which do clustering. Higher is better but slower 20
PCA Ranges
PCA_COMPONENTS_MIN Min PCA components (0 to disable). 0
PCA_COMPONENTS_MAX Max PCA components (e.g., 8 for feature vectors, 199 for embeddings). 199
AI Naming (*)
AI_MODEL_PROVIDER AI provider: OLLAMA, GEMINI, MISTRAL, OpenAI or NONE. NONE
AI_REQUEST_TIMEOUT_SECONDS Timeout (in seconds) for AI API requests. Increase for slower hardware or larger models. 300
TOP_N_ELITES Number of best solutions kept as elites. 10
SAMPLING_PERCENTAGE_CHANGE_PER_RUN Percentage of songs to swap out in the stratified sample between runs (0.0 to 1.0). 0.2
MIN_SONGS_PER_GENRE_FOR_STRATIFICATION Minimum number of songs to target per stratified genre during sampling. 100
STRATIFIED_SAMPLING_TARGET_PERCENTILE Percentile of genre song counts to use for target songs per stratified genre. 50
OLLAMA_SERVER_URL URL for your Ollama instance (if AI_MODEL_PROVIDER is OLLAMA). http://<your-ip>:11434/api/generate
OLLAMA_MODEL_NAME Ollama model to use (if AI_MODEL_PROVIDER is OLLAMA). mistral:7b
GEMINI_MODEL_NAME Gemini model to use (if AI_MODEL_PROVIDER is GEMINI). gemini-2.5-pro
MISTRAL_MODEL_NAME Mistral model to use (if AI_MODEL_PROVIDER is MISTRAL). ministral-3b-latest
OPENAI_MODEL_NAME OpenAI or OpenRouter model to use (if AI_MODEL_PROVIDER is OPENAI). openai/gpt-4
OPENAI_SERVER_URL URL for OpenAI / OpenRouter (if AI_MODEL_PROVIDER is OPENAI). https://openrouter.ai/api/v1/chat/completions
Scoring Weights
SCORE_WEIGHT_DIVERSITY Weight for inter-playlist mood diversity. 2.0
SCORE_WEIGHT_PURITY Weight for playlist purity (intra-playlist mood consistency). 1.0
SCORE_WEIGHT_OTHER_FEATURE_DIVERSITY Weight for inter-playlist 'other feature' diversity. 0.0
SCORE_WEIGHT_OTHER_FEATURE_PURITY Weight for intra-playlist 'other feature' consistency. 0.0
SCORE_WEIGHT_SILHOUETTE Weight for Silhouette Score (cluster separation). 0.0
SCORE_WEIGHT_DAVIES_BOULDIN Weight for Davies-Bouldin Index (cluster separation). 0.0
SCORE_WEIGHT_CALINSKI_HARABASZ Weight for Calinski-Harabasz Index (cluster separation). 0.0
Lyrics & SemGrove (Semantic + Groove) Search
MUSICSERVER_LYRICS_TIMEOUT Timeout (seconds) for fetching embedded lyrics from the media server (Jellyfin / Emby / Navidrome / Lyrion). Increase if your server fetches lyrics on-the-fly via plugins that may take several seconds to respond. 2.5
LYRICS_ENABLED When false, the lyrics transcription/embedding step is skipped entirely during analysis. true
LYRICS_API_ENABLE When true, fetches lyrics from external APIs (slots 1 & 2) before falling back to Whisper-small ASR transcription. true
LYRICS_ASR_ENABLE When false, skips the Whisper-small ASR transcription stage entirely. Tracks with no media-server lyrics and no external-API lyrics are marked as instrumental (sentinel embedding) instead of being transcribed. true
LYRICS_API_1_URL_TEMPLATE URL template for lyrics API slot 1. Use {artist_param}, {title_param} placeholders. e.g. https://example.com/api/get?{artist_param}={artist}&{title_param}={title} ""
LYRICS_API_1_ARTIST_PARAM Query parameter name for the artist in API slot 1. artist_name
LYRICS_API_1_TITLE_PARAM Query parameter name for the track title in API slot 1. track_name
LYRICS_API_1_LYRICS_FIELD JSON field name containing the lyrics text in the API slot 1 response. plainLyrics
LYRICS_API_1_APIKEY_PARAM Query parameter name for the API key in slot 1 (leave empty if no key needed). ""
LYRICS_API_1_APIKEY_VALUE API key value for slot 1. ""
LYRICS_API_1_TIMEOUT HTTP timeout in seconds for API slot 1. 5.0
LYRICS_API_2_URL_TEMPLATE URL template for lyrics API slot 2 (fallback after slot 1). ""
LYRICS_API_2_ARTIST_PARAM Query parameter name for the artist in API slot 2. artist
LYRICS_API_2_TITLE_PARAM Query parameter name for the track title in API slot 2. title
LYRICS_API_2_LYRICS_FIELD JSON field name containing the lyrics text in the API slot 2 response. lyrics
LYRICS_API_2_APIKEY_PARAM Query parameter name for the API key in slot 2. ""
LYRICS_API_2_APIKEY_VALUE API key value for slot 2. ""
LYRICS_API_2_TIMEOUT HTTP timeout in seconds for API slot 2. 5.0
VAD_VOICE_RECOGNITION Minimum seconds of voiced audio Silero VAD must detect before a track is sent to the Whisper-small ASR engine for lyric transcription. Tracks below this threshold are treated as instrumental/ambient and skip ASR entirely (the instrumental embedding sentinel is used instead). Use this knob to fine-tune instrumental/ambient song recognition in the lyrics analysis pipeline. Setting it very high (e.g. 1000) effectively disables ASR transcription for every track, since no song can reach that much voiced audio within the 4-minute analysis clip. 25
LYRICS_ASR_BEAM_SIZE Beam search width for the Whisper-small ASR decoder. 1 = pure greedy (fastest, most error-prone), 2 = sweet spot (catches stuck-loop attractors at ~2× greedy cost), 5 = Whisper-upstream default (max quality, ~5× cost). Each extra beam adds one extra decoder.run per generated token plus its own KV cache (~30-80 MB at a full 30 s chunk). 5
LYRICS_ASR_MIN_AVG_LOGPROB General avg_logprob floor for ASR output. Whisper-small's per-chunk avg_logprob is averaged over the track; if the result is below this threshold the transcript is dropped as likely hallucination and the track is treated as instrumental. Values are negative — closer to 0 is stricter (rejects more), more negative is looser (accepts more). -1.0 is a permissive global floor that catches only truly degenerate transcriptions. -1.0
LYRICS_ASR_NON_ENGLISH_MIN_LOGPROB Additional avg_logprob floor applied only when Whisper reports a non-English language. Whisper-small is English-biased, so legitimate non-English transcriptions (CJK, Cyrillic, Arabic, etc.) naturally score lower in the -0.5 to -0.8 range; set this looser than the English floor (more negative) to avoid dropping valid foreign-language lyrics. Raise toward -0.5 if you see garbage non-English transcriptions slipping through. -0.85
LYRICS_WHISPER_MODEL_DIR Path to the Whisper-small ONNX bundle directory. Must contain encoder_model.onnx, decoder_model_merged.onnx, tokenizer.json and the rest of the HuggingFace optimum export. Pre-bundled in the official Docker image from lyrics_model_whisper.tar.gz. /app/model/whisper-small-onnx
LYRICS_WHISPER_LANG_CONFIDENCE Confidence floor for Whisper's built-in language detection (softmax over the 99 language tokens at the first decoder step). Chunks whose top-language probability falls below this are dropped and the track is treated as instrumental — no external langdetect involved. Lower to 0.5 if you find legit songs being dropped. 0.7
LYRICS_WHISPER_MIN_FREE_RAM_GB Minimum free RAM (GB) before Whisper loads. Whisper-small peaks ~1.5 GB, so 2.5 GB leaves headroom. 2.5
LYRICS_TEXT_MAX_COMPRESSION_RATIO Compression ratio (zlib) used to filter out text that is not real lyrics. Highly repetitive content compresses far more than real lyrics, so text above this ratio is dropped before embedding. Set to 0 to disable the gate. 15.0
SEM_GROVE_WEIGHT_LYRICS Contribution of the lyrics embedding to the merged SemGrove cosine similarity (squared scale factor, [0.0–1.0]). Requires index rebuild after change. 0.75
SEM_GROVE_WEIGHT_AUDIO Contribution of the MusicNN audio embedding to the merged SemGrove cosine similarity (squared scale factor, [0.0–1.0]). Requires index rebuild after change. 0.25

⚠️ The only officially supported model is qwen3.5:9b or qwen3.5:4b for faster one. Compatibility testing is done exclusively against it. Other models below were tested and may work, but use them at your own risk — issues opened for untested or arbitrary models could be closed. Different models behave differently and outputs vary between runs.

ℹ️ The models listed below were tested in the past and will not be retested going forward. They are documented for reference only.

Self-hosted (Ollama): gemma3:4b, ministral-3:3b (fastest), plus: llama3.1:8b, llama3.2:1b/3b, gemma3:1b, qwen3:0.6b/1.7b, qwen2.5:1.5b, qwen3.5:0.8b/2b, deepseek-r1:1.5b, phi4-mini:3.8b, lfm2.5-thinking:1.2b.

Cloud, tested March 2026: claude-sonnet-4.6 (best), claude-haiku-4.5, gemini-3-flash-preview. Earlier: mistral:7b, llama3.1:8b, gemini-2.5-pro, gemini-1.5-flash-latest.

You can use either an external AI API or self-host with Ollama — deployment example here:

  • https://github.com/NeptuneHub/k3s-supreme-waffle/tree/main/ollama

OpenAI-compatible hosted providers

AudioMuse-AI can use hosted services that expose an OpenAI-compatible chat completions API through the existing OPENAI provider. Atlas Cloud is one example: point OPENAI_SERVER_URL at its OpenAI-compatible endpoint and keep using a model that has been validated for AudioMuse-AI unless you have tested another model with your library.

Example Atlas Cloud configuration:

AI_MODEL_PROVIDER=OPENAI
OPENAI_SERVER_URL=https://api.atlascloud.ai/v1/chat/completions
OPENAI_MODEL_NAME=qwen3.5:9b
OPENAI_API_KEY=<atlas-key>