The problem
We recently migrated to this library (thanks for it!) from a different Mongo transport. After a few days in production, queue processing broke down badly. I scaled the number of workers/consumers to 4x what we usually need, but even then messages were not being processed fast enough and the queue kept growing.
At this point there were 100k messages in the queue. I then noticed that Mongo was struggling and frequently reporting Slow query on the messages collection. It turned out that as the number of messages in the queue grew, the rate at which messages were picked up dropped rapidly. This seems to have been caused by the sort({available_at: 1}) part of the query: without an index on that field, Mongo has to sort the matching messages itself on every fetch, which overwhelmed the database once there were many messages to sort through.
The solution
tl;dr: after adding an {available_at: 1} index to the collection, message processing sped up enormously and the whole queue of 150k+ items was processed in just a few minutes.
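For anyone who wants to apply the same fix, here is a minimal sketch of creating that index. It assumes a collection object that exposes `createIndex(keys)`, as the mongosh collection object and the Node.js driver's `Collection` both do. The helper name `ensureAvailableAtIndex` is made up for illustration and is not part of this library.

```javascript
// Hypothetical helper (not part of the library): creates the ascending
// index on `available_at` that the sort({available_at: 1}) query can use.
// `collection` is assumed to expose createIndex(keys), like the mongosh
// collection object or the Node.js driver's Collection.
function ensureAvailableAtIndex(collection) {
  // Creating an index that already exists with the same definition is a no-op,
  // so this is safe to call more than once.
  return collection.createIndex({ available_at: 1 });
}
```

In mongosh the equivalent one-liner would be `db.messages.createIndex({ available_at: 1 })`, assuming the collection is actually named `messages`; substitute whatever collection name your transport is configured with.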
The index was used heavily right after its creation.
The proposal
As a quick fix, the documentation should at least suggest that users ensure this index exists.
As a proper fix, the bundle should ensure this index exists itself. However, there is currently no "collection setup" phase, so I don't know where that logic would best be placed.
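One option, offered only as a rough sketch and not as a description of how the bundle works, would be to ensure the index lazily, once per process, right before the first fetch. The snippet below illustrates the idea in JavaScript against a collection-like object with `createIndex(keys)`; the name `withLazyIndex` is hypothetical, and the pattern would need translating to the bundle's own language and transport classes.

```javascript
// Hypothetical sketch: ensure the index once, lazily, before the first fetch.
// `collection` is assumed to expose createIndex(keys); nothing here is part
// of the bundle's actual API.
function withLazyIndex(collection) {
  let ensured = null; // memoized promise for the one-time createIndex call

  return async function beforeFetch() {
    if (ensured === null) {
      ensured = Promise.resolve(collection.createIndex({ available_at: 1 }))
        .catch((err) => {
          // Allow a retry on the next fetch if index creation failed.
          ensured = null;
          throw err;
        });
    }
    await ensured;
    return collection;
  };
}
```

A consumer would call `await beforeFetch()` ahead of each fetch; only the first call triggers `createIndex`, and later calls reuse the memoized promise. The trade-off is that the first fetch after deployment waits for the index build, which on an already large collection could take a while.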