Files
TREK/server
Maurice 23d5a5bd9c perf(extract): cap LLM input at 4000 chars for CPU-only speed
On a GPU-less host the model's prompt-eval time scales with input length and dominates total latency. Booking details sit at the top of a confirmation, so capping the extracted text at 4000 chars (was 8000) roughly halves extraction time (~50s warm for a capable local 7B model) with no loss of fields on real hotel/rental confirmations. Tunable if a long multi-segment itinerary needs more.
2026-06-24 22:44:55 +02:00
..
2026-06-18 20:13:30 +02:00
2026-05-06 21:38:40 +02:00
2026-06-18 20:13:30 +02:00
2026-06-18 20:13:30 +02:00
2026-06-16 22:22:45 +02:00
2026-06-16 22:22:45 +02:00
2026-06-16 22:22:45 +02:00
2026-06-16 22:22:45 +02:00
2026-06-16 22:22:45 +02:00
2026-06-16 22:22:45 +02:00