Commit Graph

1311 Commits

Author SHA1 Message Date
Maurice 407bacf66e test(llm-parse): cover the extraction router, client factory and import jobs
The new LLM extraction router shipped with little branch coverage, dropping src/nest below the 80% gate. Add unit tests for routeExtraction (flights/single/union/error paths, deterministic booking-wide fill), the native Ollama format client, the provider factory, the local-router service path with its type-aware text cap, the flat->schema.org mapper's remaining reservation types, and the background import-jobs runner. Also remove the now-unused validate.ts (only its FlatLike type was still referenced; moved to flat-schemas).
2026-06-26 22:12:10 +02:00
Maurice 3372ed4ce8 refactor(planner): move the import-review bridge effect into the page hook
TripPlannerPage held a useEffect (the background-import → review bridge), which trips the page-pattern check (pages must stay wiring containers). Move the effect and its store/IndexedDB wiring into useTripPlanner where the rest of the import-review state already lives.
2026-06-26 22:11:56 +02:00
Maurice f2f598ada8 feat(settings): use the shared custom dropdown for the AI parsing provider
Swap the native select for CustomSelect so the provider picker matches the rest of the app's styling (dark mode, portal dropdown).
2026-06-26 19:34:50 +02:00
Maurice b76a69dfbd fix(settings): show the Integrations tab when only AI parsing is enabled
hasIntegrations gated the tab on memories/mcp/airtrail only, so a user with just the llm_parsing addon enabled saw no Integrations tab and could not reach the AI parsing config. Include llmEnabled in the gate.
2026-06-26 19:22:05 +02:00
Maurice eec6e0b53e feat(settings): let users set their own AI parsing model
Adds an "AI parsing" section under Settings -> Integrations where a user can choose the LLM provider, model, base URL, API key and multimodal option used for booking extraction. This per-user config applies when an admin has not configured an instance-wide model. Reuses the existing encrypted user settings: the API key is stored encrypted, never prefilled, and a blank field keeps the stored one. Adds settings.aiParsing.* across all 20 locales.
2026-06-26 19:12:54 +02:00
Maurice 9bb16ad307 chore(extract): recommend only Qwen3-8B (drop Qwen2.5 from the curated list)
Qwen3-8B is the identified default; the prior Qwen2.5 entries are no longer needed in the pull list.
2026-06-26 16:39:23 +02:00
Maurice 915bb0d0ca fix(import): persist source files in IndexedDB so attach survives a reload
The source document was only kept in memory on the background task, so a page reload during the (now always-LLM ~25s) parse lost it and the booking saved without its file. Store the uploaded files in IndexedDB keyed by job id; the review loads them from there when the in-memory copy is gone, and a 1h TTL prunes abandoned imports.
2026-06-26 16:27:06 +02:00
Maurice f3077ce4f0 fix(import): preview the parsed cost as linked in the review modal
During the per-item import review the booking isn't saved yet, so the Costs section showed an empty 'Create expense' even though a linked cost will be created on save. Show the parsed price (amount + category) as the pending linked expense so the user can verify it up front. Reuses existing i18n keys.
2026-06-26 16:27:06 +02:00
Maurice 3149c2960e perf(extract): cap single-booking text tighter; require rental company
A long single-booking PDF (e.g. an 11-page rental voucher) spent ~200s on CPU prompt-eval at the 16k cap, though its data sits in the first ~2k. Cap non-flight docs at 6k (flights keep 16k for all legs). Also make the rental operator a required field so the car gets a real title.
2026-06-26 16:08:32 +02:00
Maurice 1ab427000a fix(import): refresh costs immediately after an imported booking is saved
The saving client gets no budget:created echo (X-Socket-Id) and the create response omits the linked budget item, so the booking's Costs section and the Costs tab stayed stale until a manual reload. Reload the budget items right after a create that carried a budget entry.
2026-06-26 15:57:18 +02:00
Maurice 7ece89ac5c fix(extract): require the hotel address and ask for the rental company
After dropping the vendor templates, the model skipped the (often unlabeled) Expedia-style hotel address — making address a required schema field forces it to emit the street-address line, restoring the booking's location/place. Also hint the rental company so a car booking gets a real title instead of the generic fallback.
2026-06-26 15:57:17 +02:00
Maurice 13f342e446 refactor(extract): drop vendor templates, let the model drive with deterministic backfill
Now that a capable instruct model (Qwen3-8B, thinking off) reads name/address/dates/legs reliably across formats, the per-vendor template short-circuit distorted more than it fixed: brittle on layout variations and overriding the better model output. Remove the template layer; the model extracts the structure and Schicht 2 backfills the confirmation/total and takes the currency from the document's own symbol (correcting model misreads like ¥→$). Per-type prompts now also ask for address and price/currency.
2026-06-26 15:42:21 +02:00
Maurice 51e8524d5c feat(extract): recommend Qwen3-8B as the local extraction model
A/B against the prior default (qwen2.5:7b) on CPU showed Qwen3-8B is both faster and more accurate on tricky/multilingual booking docs (correct Airbnb year+price, correct DisneySea admission date), once thinking is disabled — which the router now does. Feature it as the recommended pull, keep qwen2.5:7b as the fallback.
2026-06-26 14:59:38 +02:00
Maurice b86bdce490 fix(extract): disable model thinking for grammar-constrained extraction
Hybrid/reasoning models (Qwen3 and similar) default to emitting reasoning tokens, which collide with Ollama's format-grammar constraint — on CPU this produced null/unparseable output and blew the latency budget (qwen3:8b: null or 300s timeouts vs ~20s with thinking off). Send think:false on the /api/chat call; Ollama ignores it for non-thinking models (verified on qwen2.5:7b), so it's safe and unlocks the stronger Qwen3 family.
2026-06-26 14:50:50 +02:00
Maurice 7f6920241c feat(import): attach the parsed source document to each booking
Keep the uploaded files on the background task and hand them to the review flow, so each reviewed booking pre-fills its Files with the document it was parsed from (uploaded with the booking on save). The two modals also adopt the shared resolveDayId helper.
2026-06-26 10:41:41 +02:00
Maurice 801bf0539f refactor(extract): dedupe currency/day helpers, drop redundant casts, support JPY vouchers
Code-audit clean-ups: share one normCurrency between the router and the templates, lift the duplicated nearest-day resolver into formatters.resolveDayId, drop two needless as-unknown-as casts at the fillBookingWideFields call sites, restore routeExtraction's doc comment, and give the broker template readable names. Plus recognise ¥/JPY and fall back to a standalone symbol amount, so a Klook-style voucher whose price sits far from any label still yields a cost.
2026-06-26 10:41:29 +02:00
Maurice 6f21eba216 fix(import): refresh costs after a booking review so imported expenses appear without a reload
Imported bookings auto-create their linked budget items server-side, but the saving client suppresses its own budget:created echo, so the Costs list stayed stale until a manual reload. Reload the budget items when the review session ends.
2026-06-26 09:56:22 +02:00
Maurice 50eb88511c fix(import): resolve an imported transport's day from its parsed dates
A reviewed transport (e.g. a rental car) arrived with only its parsed pick-up/return dates and no day_id, so the modal kept just the time and saved a bare "HH:MM" with no date. Resolve start/end day from the parsed dates (exact match, else nearest trip day) so the booking lands on the right days.
2026-06-26 09:46:36 +02:00
Maurice ca3ffea3ea fix(reservations): skip un-geocoded endpoints instead of failing the save
reservation_endpoints.lat/lng are NOT NULL, so saving a reviewed transport whose pick-up/return couldn't be geocoded threw a 500 and lost the whole booking (dates, linked cost). Skip those rows; the dates still persist on reservation_time/reservation_end_time.
2026-06-26 09:31:49 +02:00
Maurice e934fe43f1 fix(import): keep the parse-progress widget across a reload
Persist the background-import tasks (id/trip/status only) and re-fetch each job's status on mount, so a parse still running when the page reloads keeps its widget instead of vanishing; expired jobs (404) are dropped and a restored 'done' task re-fetches its items.
2026-06-26 09:08:44 +02:00
Maurice b175ef4626 fix(extract): backfill booking code/total and harden the reference match
Apply the deterministic confirmation-code and total fill to vendor-template results too (not just model output), and require the captured reference to contain a digit so a bare 'Confirmation'/'Reference' label no longer grabs the next prose word.
2026-06-26 09:08:36 +02:00
Maurice 9aaf313d59 feat(extract): add Expedia and rental-broker booking templates
Pull the hotel/rental fields these vendors print in a stable text layout (name, address, stay/pickup dates, price, reference) deterministically, so the import stops depending on the local model for them. Handles German long/abbreviated months and English dates incl. 12-hour and comma forms.
2026-06-26 09:08:25 +02:00
Maurice c5fb76da7b fix(import): create linked costs and accommodations from reviewed bookings
Reviewing an imported booking saves it through the normal reservation
form, which dropped the parsed price (so no linked cost was created) and
only created the accommodation when both nights matched a trip day.
Carry the parsed price into a linked cost on save, and create the
accommodation from whichever day the check-in/out dates resolve to.
2026-06-25 23:56:21 +02:00
Maurice 628830011d feat(import): parse bookings in the background with a progress widget
Parsing a booking can take a while on a CPU host, so don't hold the
upload modal open for it. The async import endpoint returns a job id
right away; the parse runs server-side (one at a time per user) and
pushes progress over the user's WebSocket, and a small widget in the
bottom corner tracks it while the user keeps navigating and editing.
A finished job opens the per-item review from the widget.
2026-06-25 23:56:21 +02:00
Maurice c92c6bc07c feat(extract): drive local parsing through a layered extraction router
The single-shot prompt was unreliable on multi-leg flights and longer
documents, and slow on a CPU host. For the local provider, run a small
router instead:

- deterministic vendor templates first, with no model call at all
- exactly one grammar-enforced call per document via Ollama's native
  `format` (flights as a flat array of legs, everything else as one flat
  reservation, the type picked from keywords or a union schema)
- booking-wide fields (booking reference, total price, the overnight
  arrival day) filled deterministically from the text afterwards, and
  dates coerced to ISO so a natural-language date can't slip through

Recommend qwen2.5 in the AI-parsing settings instead of NuExtract.
2026-06-25 23:56:20 +02:00
Maurice ccf0703f23 feat(import): review each parsed booking before it's saved
Instead of writing parsed items straight to the trip, the import opens the
normal edit modal pre-filled for each one, so you can check and fix it before
saving — useful when a model guesses a wrong date or address. Hotels gained an
editable address field; on save an existing place is matched by name, otherwise
the reviewed address is geocoded and a new place is created.
2026-06-25 10:27:19 +02:00
Maurice 7291d9c52f fix(admin): tidy the AI parsing settings and recommend the 2B model
The provider picker is the shared CustomSelect now and the form is split into
clear sections rather than a flat stack of inputs. NuExtract 2.0 2B is the
recommended default — fastest on a CPU-only host and MIT licensed; the 4B
carries a non-commercial licence, so it's no longer flagged as recommended.
2026-06-25 10:27:19 +02:00
Maurice 156b8da37e feat(extract): drive NuExtract with its native template
NuExtract isn't an instruct model — fed a plain chat prompt it just echoes the
schema back. Detect a NuExtract model by id and talk to it the way the model
cards document: the JSON template inlined in a single user message, no system
prompt, no json_schema, temperature 0. Its flat result is mapped back to the
same KiReservation shape the rest of the pipeline already uses, so nothing
downstream changes; every other model keeps the generic prompt.

Money is taken as a verbatim string and parsed locally (German "1.580,22 €"
otherwise comes back as 1.49772), a rental car's pickup/return ride the from/to
fields so a stray form label doesn't become the location, and a lodging with no
name falls back to its address instead of being dropped.
2026-06-25 10:27:01 +02:00
Maurice cee4b87cc9 fix(extract): refresh accommodations after a booking import
A freshly imported hotel links to an accommodation that lives outside the
trip store, so loadTrip alone left the reservation edit modal with blank
place/date fields. Reload the accommodations list once the import finishes.
2026-06-24 23:29:59 +02:00
Maurice 223f5ce9bc feat(extract): create a linked cost from the booking price on import
When a confirmation carries a total price, record it as a real expense
linked to the reservation (in the matching Costs category) instead of
leaving the amount in metadata only. Gated on the Costs addon.
2026-06-24 23:29:59 +02:00
Maurice 5fa79bba52 feat(extract): capture seat, class, platform, price + event venue contact
Request and map root-level seat/class/platform and a total price/currency into reservation metadata (shown on the card; price reuses the existing label). Read both the root and reservationFor and tolerate common field-name aliases (priceAmount, priceCurrencyISO4217Code, fareClass, ...) since models name these inconsistently. Also capture event/attraction venue telephone + url onto the auto-created place, matching lodging/restaurant.
2026-06-24 23:04:24 +02:00
Maurice 23d5a5bd9c perf(extract): cap LLM input at 4000 chars for CPU-only speed
On a GPU-less host the model's prompt-eval time scales with input length and dominates total latency. Booking details sit at the top of a confirmation, so capping the extracted text at 4000 chars (was 8000) roughly halves extraction time (~50s warm for a capable local 7B model) with no loss of fields on real hotel/rental confirmations. Tunable if a long multi-segment itinerary needs more.
2026-06-24 22:44:55 +02:00
Maurice a5d05cb92e feat(extract): fill transport/booking fields, geocode endpoints, assign days
- rental car: request+map dropoffLocation, emit pickup->return from/to endpoints, set a location string (G1/G2/G3). - geocode endpoints (stations/stops/terminals/rental desks) on confirm via Nominatim; mapper now emits coordless named endpoints and confirm persists only the geocoded ones (G6). - assign every dated booking to the nearest trip day so it still shows when slightly out of range, and keep hotel accommodation from vanishing when a check date misses (G5/G10). - fix bus mislabelled as train + add bus_number metadata (G7/G8), flag malformed boats (G9), accept root start/end time for events (G11). - raise the local-LLM timeout to 300s for CPU-only Ollama.
2026-06-24 22:23:13 +02:00
Maurice ac03b7ca13 fix(extract): make AI imports reliable and fast on local models
client: the import call inherited the global 8s axios timeout and aborted long LLM extractions even though the server finished it; remove the timeout. server: raise the OpenAI-compatible LLM timeout 60s->180s (a cold Ollama model can take ~45s to first token). server: cap extracted text to 8000 chars before the LLM - multi-page T&C tails (30k+ chars) overflowed the context window, truncating the relevant head and making CPU inference crawl; booking details sit at the top.
2026-06-24 21:20:20 +02:00
Maurice 22813f8d81 fix(extract): auto-run the AI fallback when the addon is enabled
Booking import only fell back to the LLM when each user flipped an 'always retry with AI' toggle, so by default files kitinerary returned nothing for just failed. Run the fallback automatically whenever the AI Parsing addon is on (fallback-on-empty); drop the now-redundant per-user toggle and its setting.
2026-06-24 21:20:19 +02:00
jubnl 186625591a feat(extract): extract data using LLM 2026-06-24 18:45:52 +02:00
jubnl 49fb2fded2 chore(wiki): make sure that all environement variables are properly documented 2026-06-24 14:03:39 +02:00
github-actions[bot] 4cd4c9c8d8 chore: bump version to 3.1.2 [skip ci] v3.1.2 2026-06-23 19:24:13 +00:00
jubnl 6cc8908f87 fix(tests): memory leak 2026-06-23 21:23:39 +02:00
Maurice 68f48bc070 ci: give client test workers 8 GB heap (no coverage) to fix worker OOM (#1258) 2026-06-23 21:23:39 +02:00
Maurice 76d8abb44d ci: run client tests without coverage to avoid the v8 report OOM (#1258) 2026-06-23 21:23:39 +02:00
Maurice 91c350c946 ci: raise client coverage heap to 12 GB for the v8 report phase (#1258) 2026-06-23 21:23:39 +02:00
Maurice 1e4a9a95c2 ci: raise Node heap for the client coverage run to fix OOM (#1258) 2026-06-23 21:23:39 +02:00
Maurice fe54f45d62 fix(map): draw the route line to and from the day's accommodation (#1275)
The map route ran first-activity to last-activity only, while the sidebar
already showed the hotel-to-first-stop and last-stop-to-hotel legs with
their drive times. Feed the day's accommodation bookends into the map
route too, reusing the same getDayBookendHotels lookup and the
"optimize from accommodation" gate, so the drawn line starts and ends at
the hotel, including single-activity and transfer days.
2026-06-23 21:23:39 +02:00
Maurice b36c9931b3 fix(costs): allow recording an expense with no split or payer (#1286)
Adding an expense required at least one participant, so a cost you only
want to record — e.g. a booking paid on-site later — could not be saved
without splitting it. Drop the participant requirement: with nobody
selected the expense saves as a recorded total, counted in the trip
total and shown as Unfinished, and kept out of settlements until
who-paid is filled in. The shared schema and server already supported
this case.
2026-06-23 21:23:39 +02:00
Maurice c1fe1d2d6a fix(packing): keep a custom category when its last item is removed (#1289)
Removing the only item of a user-created category deleted the whole
category. Turn that row back into the existing ... placeholder in
place instead, so the category keeps its position and colour; adding an
item reuses the placeholder slot. Deleting the placeholder (or the
category menu) still removes an empty category.
2026-06-23 21:23:39 +02:00
Maurice ebbbf91d60 fix(dashboard): show an error instead of a blank trip list when the server is unreachable (#1283)
When the backend or identity provider was unreachable, a returning user with a
persisted session landed on the dashboard with an empty trip grid and no error.
That looks identical to a logged-in user who simply has no trips, so people
assumed their data had been lost.

Three client-side layers were quietly swallowing the failure: the auth check
only cleared state on a 401, so a 5xx or a network error left the stale session
in place and kept rendering the protected route; the offline-first trip repo
turned a failed fetch into the empty cache without throwing; and the dashboard
had neither an error nor an empty state, so a blank grid meant both "outage" and
"no trips".

The auth check now tells genuine offline (keep serving the cache silently, the
PWA happy path) apart from a server outage while online (keep the session but
flag it). The dashboard shows a reassuring "couldn't reach the server, your
trips are safe" banner with a retry, and a real zero-trip account finally gets a
proper empty state so the two cases never look alike. New strings added across
all locales.
2026-06-23 21:23:39 +02:00
Maurice 328d1c9468 fix(auth): keep the last admin when OIDC claims would demote it (#1274)
On OIDC-only instances the bootstrap admin (first SSO user) rarely carries the configured admin claim, so a forced re-login — e.g. after a JWT-secret rotation — re-derived its role purely from claims and demoted it to user, locking the instance out with no recovery. The OIDC login role sync now skips a downgrade that would strip the last remaining admin, and the admin user-update endpoint guards the same case.
2026-06-23 21:23:39 +02:00
Maurice 48ebdff2d5 feat(planner): bring back the Google Maps route export button (#1255)
The day-plan route bar lost its Open in Google Maps action in the 3.1.0 redesign. A small button with the Google logo (monochrome, theme-aware) now sits next to the Route toggle and opens the day stops, in planned order, as a Google Maps directions link in a new tab.
2026-06-23 21:23:39 +02:00
Maurice 457a42b229 fix(admin): show non-Docker update steps when not running in Docker (#1269)
The "How to Update" modal always rendered Docker commands and claimed the instance runs in Docker, even on bare-metal / LXC installs like Proxmox Community Scripts. It now branches on the is_docker flag the backend already returns: non-Docker installs get a generic "re-run your install method" note plus a link to the update guide. Docker stays the default when the flag is absent, so existing installs are unaffected.
2026-06-23 21:23:39 +02:00