Sidekiq batches are great for managing groups of related jobs. But there's one booby trap to watch out for.
Say you need to process a file. So you set up a Sidekiq job that opens a batch, reads the file, and enqueues each row for processing:
require 'csv'

class CreateBatch
  include Sidekiq::Worker

  def perform
    batch = Sidekiq::Batch.new
    batch.on(:success, SuccessCallback)
    batch.jobs do
      CSV.foreach(my_file) do |row|
        args = my_row_args(row)
        RowWorker.perform_async(args)
      end
    end
  end
end
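(RowWorker and my_row_args are left undefined here. Purely for illustration, the worker might be shaped something like this, with the body depending entirely on your data:)

class RowWorker
  include Sidekiq::Worker

  def perform(args)
    # Process a single row's data; the real work is app-specific
  end
end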
The batch completes successfully.
But later you notice that some unrelated jobs have a problem. The sidekiq.log shows that these other jobs started but never finished. They aren't running, and they never went to the dead queue. Uh oh.
After some more digging, you check the kernel log and realize that the box ran critically low on memory and the Linux OOM Killer terminated one of your Sidekiq processes:
sudo dmesg -T | egrep -i 'killed process'
What happened?
The issue: enqueuing a large number of jobs within the batch.jobs block can bloat your Ruby process's memory, and the process can get killed at some later point when the Linux OOM Killer decides it needs to intervene.
batch.jobs holds onto all of the args in a Ruby array and waits until the end of the block to submit that info to Redis. The purpose of this is to make pushing jobs onto the batch an atomic action: if a network failure occurs, you don't end up with only some of the jobs in the batch. It also prevents a race condition where jobs process so quickly that the success callback fires before everything has been enqueued.

That array of args can get really big.
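To make that concrete, here's a rough conceptual sketch of the buffering behavior. This is not Sidekiq's actual implementation, and push_to_redis is a made-up placeholder:

# Conceptual sketch only (not Sidekiq's real code)
def jobs
  @buffered = []             # every perform_async in the block appends here
  yield                      # your CSV loop runs, adding one entry per row
  push_to_redis(@buffered)   # hypothetical helper: one atomic submission
ensure
  @buffered = nil
end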
One way to avoid eating up too much memory is to push a single job onto the batch and then let that job finish enqueuing all the other jobs. This is possible because batch can be called from within a Sidekiq worker to access the batch it's running in. The new code to create your batch might look something like:
class CreateBatch
  include Sidekiq::Worker

  def perform
    batch = Sidekiq::Batch.new
    batch.on(:success, SuccessCallback)
    batch.jobs { AddJobsToBatch.perform_async }
  end
end

class AddJobsToBatch
  include Sidekiq::Worker

  def perform
    raise 'Must be run as part of a Sidekiq batch' if batch.nil?

    CSV.foreach(my_file) do |row|
      # This example enqueues one row at a time, but rows could be
      # enqueued in small groups (see the sketch below)
      batch.jobs do
        args = my_row_args(row)
        RowWorker.perform_async(args)
      end
    end
  end
end
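And if one Redis round trip per row proves too slow, the same worker can enqueue rows in small groups instead. A sketch using each_slice (the group size of 1,000 is an arbitrary choice to tune for your data):

class AddJobsToBatch
  include Sidekiq::Worker

  def perform
    raise 'Must be run as part of a Sidekiq batch' if batch.nil?

    # Without a block, CSV.foreach returns an enumerator, so rows can
    # be consumed lazily in slices without loading the whole file
    CSV.foreach(my_file).each_slice(1_000) do |rows|
      # One batch.jobs call (and one Redis submission) per 1,000 rows
      batch.jobs do
        rows.each { |row| RowWorker.perform_async(my_row_args(row)) }
      end
    end
  end
end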
This will keep your Ruby process from bloating on very large datasets. (Note that you'll still need enough space in Redis to hold all of the enqueued jobs.)
This strategy is similar to the recommendation for huge batches, but it's good practice to structure all of your batches this way so they're ready to handle growing data.