Hi,
When I use pymesos to run 10, 100, or 1000 tasks at the same time, everything runs perfectly.
However, with 10000 tasks at the same time, some tasks end up with status TASK_LOST.
I'm not sure whether the problem is in pymesos or in my own settings.
Mesos Version: 1.9.0
Pymesos: latest from git (cloned 2020/6/9)
Cluster total: CPU 412, MEM 5.2 TB, Disk 983.9
Each task needs 0.01 CPU and 1 MB of memory.
For the tasks that end up as TASK_LOST, the Mesos master log shows:
Sending status update TASK_LOST for task task-xx of framework xxx 'Task launched with invalid offers: Offer xxx is no longer valid'
My guess is that two or more tasks are being launched with the same offer ID. When one of those tasks finishes, the offer is released, and the other tasks that reference the same offer ID can no longer use it.
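
One way to test this guess would be to make sure every offer ID is used in exactly one launch call, since Mesos treats an offer as consumed once it has been used to launch tasks. Below is a minimal sketch of that batching logic. It is not pymesos code: the offer dicts with "id", "cpus" and "mem" keys and the function name `pack_tasks_per_offer` are hypothetical simplifications for illustration.

```python
from collections import deque


def pack_tasks_per_offer(offers, pending, cpus_per_task=0.01, mem_per_task=1):
    """Group pending task IDs into one batch per offer.

    offers:  list of dicts such as {"id": "o1", "cpus": 4.0, "mem": 1024}
             (hypothetical simplified shape, not the real pymesos offer object)
    pending: deque of task IDs still waiting to be launched
    Returns {offer_id: [task_id, ...]}, so each offer ID appears exactly once.
    """
    # Work in integer milli-CPUs to avoid float rounding such as 0.03 / 0.01 < 3.
    task_mcpus = round(cpus_per_task * 1000)
    batches = {}
    for offer in offers:
        by_cpu = round(offer["cpus"] * 1000) // task_mcpus
        by_mem = int(offer["mem"] // mem_per_task)
        # Never assign more tasks than the offer can hold or than remain pending.
        count = min(by_cpu, by_mem, len(pending))
        # An empty batch means the offer is unused and should be declined.
        batches[offer["id"]] = [pending.popleft() for _ in range(count)]
    return batches


offers = [
    {"id": "o1", "cpus": 0.03, "mem": 10},
    {"id": "o2", "cpus": 1.0, "mem": 2},
]
pending = deque(f"task-{i}" for i in range(10))
batches = pack_tasks_per_offer(offers, pending)
```

In a real scheduler, each non-empty batch would presumably be passed to a single `driver.launchTasks(offer_id, tasks)` call inside `resourceOffers`, and offers with an empty batch would be declined. Tasks left in `pending` wait for the next round of offers instead of reusing an old offer ID.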