I am trying to use llama.cpp. Prompt processing takes a long time, and the request eventually fails with a timeout. Here is the log:
1437.52.393.991 I slot print_timing: id 0 | task 79 | prompt processing, n_tokens = 2048, progress = 0.03, t = 4.73 s / 432.86 tokens per second
1437.57.687.158 I slot print_timing: id 0 | task 79 | prompt processing, n_tokens = 4096, progress = 0.06, t = 10.02 s / 408.60 tokens per second
1438.03.456.006 I slot print_timing: id 0 | task 79 | prompt processing, n_tokens = 6144, progress = 0.09, t = 15.79 s / 389.02 tokens per second
1438.09.683.416 I slot print_timing: id 0 | task 79 | prompt processing, n_tokens = 8192, progress = 0.13, t = 22.02 s / 372.01 tokens per second
1438.16.536.080 I slot print_timing: id 0 | task 79 | prompt processing, n_tokens = 10240, progress = 0.16, t = 28.87 s / 354.65 tokens per second
1438.23.899.520 I slot print_timing: id 0 | task 79 | prompt processing, n_tokens = 12288, progress = 0.19, t = 36.24 s / 339.10 tokens per second
1438.31.885.378 I slot print_timing: id 0 | task 79 | prompt processing, n_tokens = 14336, progress = 0.22, t = 44.22 s / 324.18 tokens per second
1438.40.567.796 I slot print_timing: id 0 | task 79 | prompt processing, n_tokens = 16384, progress = 0.25, t = 52.91 s / 309.69 tokens per second
1438.49.938.815 I slot print_timing: id 0 | task 79 | prompt processing, n_tokens = 18432, progress = 0.28, t = 62.28 s / 295.97 tokens per second
1438.59.931.296 I slot print_timing: id 0 | task 79 | prompt processing, n_tokens = 20480, progress = 0.31, t = 72.27 s / 283.39 tokens per second
1439.10.479.040 I slot print_timing: id 0 | task 79 | prompt processing, n_tokens = 22528, progress = 0.35, t = 82.82 s / 272.02 tokens per second
1439.21.866.563 I slot print_timing: id 0 | task 79 | prompt processing, n_tokens = 24576, progress = 0.38, t = 94.20 s / 260.88 tokens per second
1439.33.389.829 I slot print_timing: id 0 | task 79 | prompt processing, n_tokens = 26624, progress = 0.41, t = 105.73 s / 251.82 tokens per second
1439.44.817.553 I slot print_timing: id 0 | task 79 | prompt processing, n_tokens = 28672, progress = 0.44, t = 117.15 s / 244.74 tokens per second
1439.57.807.858 I slot print_timing: id 0 | task 79 | prompt processing, n_tokens = 30720, progress = 0.47, t = 130.15 s / 236.04 tokens per second
1440.10.719.133 I slot print_timing: id 0 | task 79 | prompt processing, n_tokens = 32768, progress = 0.50, t = 143.06 s / 229.06 tokens per second
1440.24.145.546 I slot print_timing: id 0 | task 79 | prompt processing, n_tokens = 34816, progress = 0.53, t = 156.48 s / 222.49 tokens per second
1440.39.555.780 I slot print_timing: id 0 | task 79 | prompt processing, n_tokens = 36864, progress = 0.57, t = 171.89 s / 214.46 tokens per second
1440.55.872.954 I slot print_timing: id 0 | task 79 | prompt processing, n_tokens = 38912, progress = 0.60, t = 188.21 s / 206.75 tokens per second
1441.12.928.627 I slot print_timing: id 0 | task 79 | prompt processing, n_tokens = 40960, progress = 0.63, t = 205.27 s / 199.55 tokens per second
1441.30.558.198 I slot print_timing: id 0 | task 79 | prompt processing, n_tokens = 43008, progress = 0.66, t = 222.90 s / 192.95 tokens per second
1441.48.856.858 I slot print_timing: id 0 | task 79 | prompt processing, n_tokens = 45056, progress = 0.69, t = 241.19 s / 186.80 tokens per second
1442.07.963.365 I slot print_timing: id 0 | task 79 | prompt processing, n_tokens = 47104, progress = 0.72, t = 260.30 s / 180.96 tokens per second
1442.27.662.433 I slot print_timing: id 0 | task 79 | prompt processing, n_tokens = 49152, progress = 0.75, t = 280.00 s / 175.54 tokens per second
1442.39.336.872 W srv next: request cancelled after 30s, potentially a client-side timeout; please check your client's code
Prompt processing consistently ends after less than 300 seconds. The log says "request cancelled after 30s", which seems strange.
All the watchdog timeout settings are set to very high values (900 seconds).
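
As a rough sanity check on the numbers, here is a small Python sketch that reads the last timing line and the cancellation warning from the log above. It estimates the total prompt size from `n_tokens / progress` and works out how long after the start of prompt processing the request was cancelled. The function names (`stamp_seconds`, `summarize`) are made up for this example, and the timestamp layout `HHMM.SS.mmm.uuu` is an assumption inferred from how the stamps advance. Because `progress` is printed with two decimals, the token estimate is only approximate.

```python
import re

# Two timing lines and the cancellation warning, copied from the log above.
LOG = """\
1437.52.393.991 I slot print_timing: id 0 | task 79 | prompt processing, n_tokens = 2048, progress = 0.03, t = 4.73 s / 432.86 tokens per second
1442.27.662.433 I slot print_timing: id 0 | task 79 | prompt processing, n_tokens = 49152, progress = 0.75, t = 280.00 s / 175.54 tokens per second
1442.39.336.872 W srv next: request cancelled after 30s, potentially a client-side timeout; please check your client's code
"""

TIMING = re.compile(r"n_tokens = (\d+), progress = ([\d.]+), t = ([\d.]+) s")


def stamp_seconds(line):
    """Convert the leading stamp to seconds of the day.

    ASSUMPTION: the stamp layout is HHMM.SS.mmm.uuu. Midnight wrap is ignored.
    """
    hhmm, ss, ms, us = line.split()[0].split(".")
    return (int(hhmm[:2]) * 3600 + int(hhmm[2:]) * 60 + int(ss)
            + int(ms) / 1e3 + int(us) / 1e6)


def summarize(log):
    lines = log.strip().splitlines()
    # Keep only the lines that carry prompt-processing timings.
    timed = [(line, TIMING.search(line)) for line in lines]
    timed = [(line, m) for line, m in timed if m]
    last_line, last = timed[-1]
    n_tokens, progress, t = int(last[1]), float(last[2]), float(last[3])

    # Wall-clock gap between the last timing line and the cancellation.
    cancel = next(line for line in lines if "request cancelled" in line)
    gap = stamp_seconds(cancel) - stamp_seconds(last_line)

    return {
        "estimated_prompt_tokens": round(n_tokens / progress),
        "cancelled_after_s": round(t + gap, 2),
    }


print(summarize(LOG))
```

For this log the sketch gives a prompt of roughly 65536 tokens and a cancellation about 292 seconds after prompt processing started. That matches the "ends < 300 seconds" observation and does not match the "30s" in the warning, so the 30s figure may not be the total elapsed time of the request; that last part is a guess, not something the log confirms.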