Commit f9db8d1

feat(discogs): Discogs plugin + Bulk Enrichment + v0.5.5 hardening (#100)
Release v0.5.5 — code-only upgrade from v0.5.4 (no new migrations).

## New features

- **Discogs** music scraper plugin (CD/vinyl/cassette via UPC/EAN) + MusicBrainz + Deezer + GoodLib fallback chain
- **Bulk ISBN Enrichment** (/admin/libri/bulk-enrich) — manual batch (20 books/click, rate-limited 1 req / 2 min) + cron-driven background enrichment with atomic flock() + non-zero exit on failure

## Release robustness

- public/installer/assets symlink → real directory (fixes copy(file, dir) crash in manual upgrade on materialized installs)
- create-release(-local).sh ZIP verification now detects symlink entries via zipinfo metadata (prevents recurrence)
- Full end-to-end reinstall regression test (scripts/reinstall-test.sh + tests/manual-upgrade-real.spec.js) exercises the real admin UI upgrade flow without rsync shortcuts

## Code quality (CodeRabbit rounds)

- 16 Major fixes: BulkEnrichController raw-exception leak, FILTER_VALIDATE_BOOL parsing, UPDATE result check, NULLIF(TRIM(...)) on isbn identifiers, validated-ISBN vs barcode distinction in ScrapeController, rate-limit back-pressure, accessible switch (aria-label + aria-labelledby), flock atomic cron locking
- 11 additional Major fixes: PluginManager cleanup on prepare fail, tipo_media filter includes NULL legacy rows, installer.js null-safe getElementById + icon-preserving DOM rebuild + dead code removal, style.css header contrast + alert-warning palette, cron finally race fix, zipinfo awk extracts symlink path not target, test queries mirror getStats(), test 9 asserts targetId specifically, discogs activation explicit, music markers no longer key on generic 'Barcode'
- 168 new EN + DE translations

## i18n parity

- en_US.json: 4197 → 4365 entries
- de_DE.json: 4196 → 4365 entries

See updater.md Version History for the full v0.5.5 entry and README.md "What's New in v0.5.5" for the user-facing changelog.
1 parent d0cf881 commit f9db8d1
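
The cron locking mentioned in the commit message (an exclusive, non-blocking `flock()` plus a non-zero exit code on failure) can be sketched roughly as below. This is a Python illustration for Unix-like systems, not the project's PHP cron script; `try_lock`, `run_job`, and the decision to return 0 when another run already holds the lock are assumptions made for the example.

```python
import fcntl
import sys


def try_lock(path):
    """Open `path` and take an exclusive, non-blocking flock.

    Returns the open file object on success (keep it open to hold the lock),
    or None when another open file description already holds the lock.
    """
    handle = open(path, "a+")
    try:
        fcntl.flock(handle.fileno(), fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        handle.close()
        return None
    return handle


def run_job(lock_path, job):
    """Run job() under the lock and return a process exit code."""
    handle = try_lock(lock_path)
    if handle is None:
        # Assumption: an overlapping run is skipped quietly, not treated as an error.
        return 0
    try:
        job()
        return 0
    except Exception as exc:
        print(f"enrichment failed: {exc}", file=sys.stderr)
        return 1  # non-zero exit so the cron daemon can surface the failure
    finally:
        fcntl.flock(handle.fileno(), fcntl.LOCK_UN)
        handle.close()
```

Taking the lock and testing for contention happen in a single `flock` call, so two overlapping cron runs cannot both pass a separate "is it locked?" check before either acquires it.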

File tree

71 files changed (+11189, -3878 lines)


.coderabbit.yaml

Lines changed: 154 additions & 1 deletion
@@ -1,2 +1,155 @@
+# yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json
+# Pinakes — CodeRabbit Configuration
+# PHP/Slim 4 library management system with MySQL
+
+language: "it-IT"
+
+tone_instructions: |
+  Be concise and direct. Focus on real bugs, security vulnerabilities,
+  and violations of the project rules. Avoid minor stylistic suggestions.
+
+early_access: true
+
 reviews:
-  max_files: 200
+  profile: "assertive"
+  request_changes_workflow: false
+  high_level_summary: true
+  poem: false
+  review_status: true
+
+  # ── File Filters ──────────────────────────────────────────────────
+  path_filters:
+    - "!vendor/**"
+    - "!node_modules/**"
+    - "!public/assets/tinymce/**"
+    - "!public/assets/fontawesome/**"
+    - "!public/assets/choices/**"
+    - "!public/assets/flatpickr/**"
+    - "!public/assets/sweetalert2/**"
+    - "!*.min.js"
+    - "!*.min.css"
+    - "!*.map"
+    - "!pinakes-*.zip"
+    - "!pinakes-*.sha256"
+    - "!test-results/**"
+
+  # ── Path-Specific Review Instructions ──────────────────────────────
+  path_instructions:
+    # Controllers — input validation, auth, soft-delete
+    - path: "app/Controllers/**"
+      instructions: |
+        - CRITICAL: every query on the `libri` table MUST include `AND deleted_at IS NULL`
+        - Check that `getParsedBody()` is not used for JSON — use `json_decode((string)$request->getBody())` instead
+        - User input: validate and sanitize BEFORE use
+        - Session: `$_SESSION['user']['id']` (NOT `$_SESSION['user_id']`)
+        - Exceptions: catch `\Throwable`, not `\Exception` (strict_types TypeError extends \Error)
+        - Logging: `SecureLogger::error()`, not `error_log()`, in sensitive contexts
+        - Routes: never hardcode URL paths, use `route_path('key')` or `RouteTranslator::route('key')`
+        - CSV export: tipo_media must be included, use an empty string as the fallback (not 'libro')
+
+    # Models / Repository — query safety
+    - path: "app/Models/**"
+      instructions: |
+        - CRITICAL: every SELECT/UPDATE/DELETE on the `libri` table MUST include `AND deleted_at IS NULL`
+        - Soft-delete: null out isbn10, isbn13, ean when soft-deleting (prevents unique constraint violations)
+        - Transaction safety: never nest `begin_transaction()` in mysqli (causes an implicit commit)
+        - Pattern: check `@@autocommit` to detect a transaction in progress
+        - hasColumn() guard for columns added in recent migrations (backward compat)
+        - tipo_media: use an `array_key_exists` guard, do not overwrite the value unless it is explicitly provided
+
+    # Views — escaping, XSS prevention
+    - path: "app/Views/**"
+      instructions: |
+        - CRITICAL: `htmlspecialchars(url(...), ENT_QUOTES, 'UTF-8')` in ALL HTML attributes (href, action, src)
+        - `route_path()` requires the same escaping in HTML attributes
+        - PHP->JS: use `json_encode(..., JSON_HEX_TAG)` for any PHP data inserted into JavaScript
+        - TinyMCE: ALWAYS include `model: 'dom'` and `license_key: 'gpl'` in every `tinymce.init({})`
+        - Never use `HtmlHelper::e()` in views — use `htmlspecialchars(..., ENT_QUOTES, 'UTF-8')`
+        - Schema.org: each tipo_media must have its own branch with specific properties (do not mix Book with CreativeWork)
+        - DataTable: every value from the API must pass through `escapeHtml()` before rendering
+
+    # Support classes — helpers, utilities
+    - path: "app/Support/**"
+      instructions: |
+        - MediaLabels: `isMusic()` must be authoritative on tipo_media when it is set
+        - `inferTipoMedia()`: watch for false positives on short tokens ('cd' matches 'CD-ROM', 'lp' matches words containing 'lp')
+        - `formatTracklist()`: must detect pre-formatted HTML (`<ol>`) and return it as-is
+        - PluginManager: use `\Throwable` not `\Exception`, centralized `BundledPlugins::LIST`
+        - Route translation: never hardcode paths, use `RouteTranslator::route('key')`
+
+    # Plugins — API safety, rate limiting
+    - path: "storage/plugins/**"
+      instructions: |
+        - SECURITY: every curl call MUST set CURLOPT_PROTOCOLS (HTTP/HTTPS only), CURLOPT_MAXREDIRS, CURLOPT_CONNECTTIMEOUT, CURLOPT_SSL_VERIFYPEER
+        - SSRF: validate/cast external IDs (e.g. releaseId to int) before using them in URLs
+        - Rate limiting: must be elapsed-based (microtime) and static (persist across instances)
+        - Every `curl_exec()` must have a `curl_error()` check with logging
+        - Hook registration: transaction + rethrow on failure
+        - Do not enrich book data with music covers (gate on resolveTipoMedia)
+
+    # Migrations — versioning, idempotency
+    - path: "installer/database/migrations/**"
+      instructions: |
+        - CRITICAL: the migration file name MUST carry a version <= version.json (otherwise it is silently skipped)
+        - The updater uses `version_compare($migrationVersion, $toVersion, '<=')` — higher versions are IGNORED
+        - Every migration MUST be fully idempotent (IF NOT EXISTS, IF @col_exists = 0, etc.)
+        - LIKE patterns: avoid `%cd%` and `%lp%`, which match false positives ('CD-ROM', words containing 'lp') — use REGEXP word boundaries
+        - If a release needs several migrations: merge them into ONE file with the release version
+
+    # Translations — completeness
+    - path: "locale/**"
+      instructions: |
+        - Every key present in it_IT.json MUST also be present in en_US.json and de_DE.json
+        - Translation keys must match exactly (case-sensitive)
+        - Placeholders (%s, %d) must be preserved in all languages
+        - New keys added in PHP/JS code must be added in ALL languages
+
+    # Tests — E2E patterns
+    - path: "tests/**"
+      instructions: |
+        - E2E tests require `/tmp/run-e2e.sh` for DB/admin credentials
+        - `--workers=1` is mandatory for serial execution
+        - SweetAlert: after form submit, check for and click `.swal2-confirm`
+        - Choices.js: use `fill` + `waitForTimeout` + click suggestion
+        - Flatpickr: interact via JS evaluate, not direct click
+        - Test data cleanup: FK-safe order (child tables first, then parents)
+
+    # Release scripts
+    - path: "scripts/**"
+      instructions: |
+        - NEVER create ZIPs manually — ALWAYS use `create-release.sh`
+        - The script verifies 9 critical files in the ZIP before release
+        - `git archive` uses COMMITTED files, not the working directory
+        - Verify that `public/assets/tinymce/models/dom/model.min.js` is in the ZIP
+
+  # ── Auto Review Settings ───────────────────────────────────────────
+  auto_review:
+    enabled: true
+    drafts: false
+
+  # ── Tools ──────────────────────────────────────────────────────────
+  tools:
+    phpstan:
+      enabled: true
+    shellcheck:
+      enabled: true
+    semgrep:
+      enabled: true
+    gitleaks:
+      enabled: true
+    yamllint:
+      enabled: true
+
+# ── Chat ──────────────────────────────────────────────────────────────
+chat:
+  auto_reply: true
+
+# ── Knowledge Base ────────────────────────────────────────────────────
+knowledge_base:
+  opt_out: false
+  learnings:
+    scope: "local"
+  issues:
+    scope: "auto"
+  pull_requests:
+    scope: "auto"
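
The migration rule in the configuration above (a migration whose version exceeds the target version is silently skipped) can be illustrated with a small sketch. It is written in Python rather than the project's PHP, handles purely numeric dotted versions only (a simplification of PHP's `version_compare`), and the `migrate_<version>.sql` naming, the function names, and the lower bound on the installed version are assumptions for the example.

```python
import re


def parse_version(text):
    """Turn '0.5.4' into (0, 5, 4). Numeric dotted versions only."""
    return tuple(int(part) for part in text.split("."))


def migration_version(filename):
    """Extract the version from a name like 'migrate_0.5.4.sql', else None."""
    match = re.fullmatch(r"migrate_(\d+(?:\.\d+)*)\.sql", filename)
    return match.group(1) if match else None


def applicable_migrations(filenames, from_version, to_version):
    """Select migrations newer than from_version and not newer than to_version.

    The upper bound mirrors the '<=' comparison against the target version;
    the lower bound is an assumption about how already-applied files are skipped.
    """
    low, high = parse_version(from_version), parse_version(to_version)
    selected = []
    for name in filenames:
        version = migration_version(name)
        if version is None:
            continue  # not a migration file
        if low < parse_version(version) <= high:
            selected.append(name)
    return sorted(selected, key=lambda n: parse_version(migration_version(n)))
```

With a target of `0.5.5`, a file named `migrate_0.5.6.sql` is never selected, which is why the rule asks for the file name to carry a version no higher than the release.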

.gitattributes

Lines changed: 2 additions & 1 deletion
@@ -2,7 +2,8 @@
 * text=auto

 # Exclude from release archives (git archive)
-public/installer/assets export-ignore
+# NOTE: public/installer/assets/ MUST be in the ZIP — contains installer.js
+# and style.css. Excluding it causes step 2 (test connection) to silently fail.
 tests/ export-ignore
 test/ export-ignore
 .github/ export-ignore
.gitignore

Lines changed: 31 additions & 0 deletions
@@ -138,6 +138,27 @@ storage/plugins/goodlib/*
 !storage/plugins/goodlib/*.md
 !storage/plugins/goodlib/views/
 !storage/plugins/goodlib/views/*.php
+!storage/plugins/discogs/
+storage/plugins/discogs/*
+!storage/plugins/discogs/*.php
+!storage/plugins/discogs/*.json
+!storage/plugins/discogs/*.md
+!storage/plugins/discogs/views/
+!storage/plugins/discogs/views/*.php
+!storage/plugins/deezer/
+storage/plugins/deezer/*
+!storage/plugins/deezer/*.php
+!storage/plugins/deezer/*.json
+!storage/plugins/deezer/*.md
+!storage/plugins/deezer/views/
+!storage/plugins/deezer/views/*.php
+!storage/plugins/musicbrainz/
+storage/plugins/musicbrainz/*
+!storage/plugins/musicbrainz/*.php
+!storage/plugins/musicbrainz/*.json
+!storage/plugins/musicbrainz/*.md
+!storage/plugins/musicbrainz/views/
+!storage/plugins/musicbrainz/views/*.php

 # Premium plugin - never track (private/commercial)
 storage/plugins/scraping-pro/
@@ -204,6 +225,7 @@ desktop.ini
 # Test Artifacts
 # ========================================
 .playwright-mcp/
+test-results/

 # ========================================
 # Development Documentation (not for distribution)
@@ -402,6 +424,14 @@ hackernews.md
 docs/reference/bug-gemini.md
 docs/reference/analisi-sicurezza.md
 scripts/generate_dewey_json.py
+scripts/create-release-local.sh
+scripts/reinstall-test.sh
+pinakes-v*-local.zip
+pinakes-v*-local.zip.sha256
+
+# Reinstall / upgrade regression runbook (local-only, not committed)
+reinstall_test.md
+tests/manual-upgrade-real.spec.js
 docs/reference/start-server.md
 docs/reference/security-audit-report.md
 docs/reference/routes-to-add.md
@@ -423,6 +453,7 @@ internal/
 updater.md
 updater_new.md
 scraping-pro-*.zip
+scraping-pro-*.zip.sha256
 fix-autoloader.php
 test-upgrade/
 .playwright-cli/

README.md

Lines changed: 66 additions & 6 deletions
@@ -9,7 +9,7 @@

 Pinakes is a self-hosted, full-featured ILS for schools, municipalities, and private collections. It focuses on automation, extensibility, and a usable public catalog without requiring a web team.

-[![Version](https://img.shields.io/badge/version-0.5.3-0ea5e9?style=for-the-badge)](version.json)
+[![Version](https://img.shields.io/badge/version-0.5.5-0ea5e9?style=for-the-badge)](version.json)
 [![Installer Ready](https://img.shields.io/badge/one--click_install-ready-22c55e?style=for-the-badge&logo=azurepipelines&logoColor=white)](installer)
 [![License](https://img.shields.io/badge/License-GPL--3.0-orange?style=for-the-badge)](LICENSE)

@@ -24,19 +24,79 @@ Pinakes is a self-hosted, full-featured ILS for schools, municipalities, and pri

 ---

-## What's New in v0.5.3
+## What's New in v0.5.5

-### 🔍 Cross-Version Consistency Fixes (v0.4.9.9–v0.5.2)
+### 📥 Bulk ISBN Enrichment (#87 follow-up)
+
+- **New Admin page `/admin/libri/bulk-enrich`** — automatic enrichment of books with missing covers/descriptions using their ISBN/EAN
+- **Manual batch** — process 20 books per click through all active scraping plugins (Open Library, Google Books, Discogs, MusicBrainz, Deezer, scraping-pro if installed). Rate-limited to 1 request per 2 minutes to protect upstream APIs
+- **Cron-driven** — configurable background enrichment via `scripts/bulk-enrich-cron.php` with atomic `flock(LOCK_EX|LOCK_NB)` locking
+- **No-overwrite guarantee** — only fills NULL or empty fields, never touches populated data
+- **Empty-string safe** — `NULLIF(TRIM(col), '')` on `isbn13/isbn10/ean` so legacy rows with blank identifiers don't shadow populated ones
+
+### 🔌 New bundled scraping plugins
+
+- **Discogs** — music metadata (CD, vinyl, cassette) via UPC/EAN barcode or text search. Registers 4 hooks (`scrape.isbn.validate`, `scrape.sources`, `scrape.fetch.custom`, `scrape.data.modify`)
+- **MusicBrainz** — fallback music metadata source
+- **Deezer** — cover art + track listings for audio media
+- **GoodLib** — custom-domain book metadata scraper
+
+### 🎯 Upgrade/Install robustness
+
+- **Fixed** `public/installer/assets` symlink → real directory (manual upgrade used to crash with `copy(): The second argument cannot be a directory` on installs where the dir had been materialized)
+- **Release ZIP guard** — `create-release.sh` now scans ZIP metadata via `zipinfo` and aborts if any symlink entry would ship (prevents regressions like the one above)
+- **Reinstall regression test** — full end-to-end suite (`scripts/reinstall-test.sh` + `tests/manual-upgrade-real.spec.js`) that exercises the real admin UI upgrade flow (upload ZIP → click "Avvia" → `Updater::performUpdateFromFile`) instead of bypassing via rsync. Runs the full Playwright suite on both a fresh install and an upgraded install
+
+### 🧹 CodeRabbit Major fixes (16 items)
+
+- **`BulkEnrichController::start`** — no longer leaks raw exception messages to clients; logs via `SecureLogger` and returns a generic 500
+- **`BulkEnrichController::toggle`** — `filter_var(FILTER_VALIDATE_BOOL)` so `"false"/"0"/"off"` correctly disable the feature
+- **`BulkEnrichmentService::setEnabled`** — returns bool; controller propagates DB failures instead of swallowing them
+- **`BulkEnrichmentService::enrichBook`** — checks the `UPDATE` execute() result before marking the book as enriched (prevents false-positive success logs on DB failure)
+- **`ScrapeController::normalizeIsbnFields`** — distinguishes validated ISBN requests (via `IsbnFormatter::isValid`) from plugin-accepted barcode requests, so legitimate book lookups no longer skip ISBN backfill when the scraper omits `format`/`tipo_media`
+- **Accessible switch** — `aria-label` + `aria-labelledby` on `#toggle-enrichment`
+- Full list in `updater.md` Version History.
+
+### 🌐 i18n
+
+- **168 new translations** added to `en_US.json` + `de_DE.json` — all strings introduced in this branch are now fully localised. `it_IT.json` stays minimal (fallback-to-key)
+
+### Migrations
+
+No new migrations. All DB changes ship in existing `migrate_0.5.4.sql`. Running v0.5.5 on a v0.5.4 install is a code-only upgrade.
+
+---
+
+## Previous Releases
+
+<details>
+<summary><strong>v0.5.4</strong> - Discogs Plugin + Media Type + Plugin Manager Hardening</summary>
+
+### 🎵 Discogs music scraper plugin (#87)
+
+- **New `tipo_media` ENUM** (`libro/disco/audiolibro/dvd/altro`) on `libri` with composite index `(deleted_at, tipo_media)`
+- **Heuristic backfill** from `formato` using anchored LIKE patterns (avoids `%cd%` matching CD-ROM, `%lp%` matching "help")
+- **Discogs + MusicBrainz + CoverArtArchive + Deezer** chain with 4 hooks (incl. `scrape.isbn.validate` for UPC-12/13)
+- **Barcode → ISBN guard** in `ScrapeController::normalizeIsbnFields` — skips normalization when no format/tipo_media signal to avoid the EAN-in-`isbn13` regression
+- **PluginManager** migrated from `error_log` → `SecureLogger` (31 call sites)
+
+### Post-release hotfixes (rolled into v0.5.4)
+
+- `autoRegisterBundledPlugins` INSERT had 14 columns / 13 values after CodeRabbit round 11 — fresh installs crashed with "Column count doesn't match value count" (fixed in `c9bd82c`)
+- Same method's `bind_param('ssssssssissss')` had positions 8+9 swapped — `path='discogs'` was cast to int `0`, orphan-detection then deleted the rows (fixed in `fb1e881`)
+
+</details>
+
+<details>
+<summary><strong>v0.5.3</strong> - Cross-Version Consistency Fixes (v0.4.9.9–v0.5.2)</summary>

 - **`descrizione_plain` propagated** — Catalog FULLTEXT search and admin grid now use `COALESCE(NULLIF(descrizione_plain, ''), descrizione)` for LIKE conditions, completing the HTML-free search feature from v0.4.9.9
 - **ISSN in Schema.org & API** — `issn` property now emitted in Book JSON-LD and returned by the public API (`/api/books`)
 - **CollaneController atomicity** — `rename()` aborts on `prepare()` failure instead of committing partial state
 - **LibraryThing import aligned** — `descrizione_plain` (with `html_entity_decode` + spacing), ISSN normalization, `AuthorNormalizer` on traduttore, soft-delete guards on all UPDATE queries, and `descrizione_plain` column conditional (safe on pre-0.4.9.9 databases)
 - **Secondary Author Roles** — LT import now routes translators to `traduttore` field based on `Secondary Author Roles`

----
-
-## Previous Releases
+</details>

 <details>
 <summary><strong>v0.5.2</strong> - Name Normalization (#93)</summary>
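
The `BulkEnrichController::toggle` entry in the README changelog above relies on boolean parsing in which strings such as `"false"`, `"0"`, and `"off"` count as false. A rough Python analogue of that truth table is sketched below, following the documented behavior of PHP's `FILTER_VALIDATE_BOOL` with `FILTER_NULL_ON_FAILURE`; the function name `parse_bool` is hypothetical and this is not the project's code.

```python
TRUE_WORDS = {"1", "true", "on", "yes"}
FALSE_WORDS = {"0", "false", "off", "no", ""}


def parse_bool(value):
    """Map a form or JSON value to True, False, or None when unrecognized."""
    if isinstance(value, bool):
        return value
    text = str(value).strip().lower()
    if text in TRUE_WORDS:
        return True
    if text in FALSE_WORDS:
        return False
    return None  # caller decides how to reject unrecognized input
```

The pitfall being avoided is plain truthiness: in Python `bool("false")` is `True` because the string is non-empty, and a naive boolean cast in PHP treats `"false"` and `"off"` the same way.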
