A database is traditionally used to store data, but I have seen developers use database tables as a job queue. The purpose of the job queue is to track which data still needs to be processed after it has been inserted into the table. A dedicated process then has to poll (query) the table periodically for unprocessed data. Depending on how often it polls and how often unprocessed data is actually waiting, this technique can become very inefficient, because many of the queries return nothing. Database polling is easy to implement and quick to prototype. Unfortunately, as your system grows, it will soon show its limitations in scalability and efficiency. RabbitMQ is an ideal solution to replace this type of database polling. In this video, I go over the high-level design of how to replace database polling with RabbitMQ.
The main reason RabbitMQ is more efficient is that it lets you implement an event-driven architecture. Your data processing program is notified when the data becomes available, instead of repeatedly asking whether there is anything to do. In this architecture, you can still store the data in the database and have the event notification trigger the database query. Furthermore, you can even remove the need for the database table altogether if you publish the unprocessed data itself into the message queue.
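As a rough sketch of the event-driven version, the example below uses the third-party pika client for Python. The choice of pika, the queue name, the JSON message format, and the process function are all assumptions made for illustration. It shows the second variant described above, where the unprocessed data itself travels in the message.

```python
import json

QUEUE_NAME = "unprocessed_data"  # hypothetical queue name


def process(record):
    """Placeholder for the real data processing step."""
    print("processing", record)


def on_message(channel, method, properties, body):
    """Invoked by pika each time RabbitMQ delivers a message."""
    record = json.loads(body)
    process(record)
    # Acknowledge only after processing, so a crash does not lose the message.
    channel.basic_ack(delivery_tag=method.delivery_tag)


def publish(channel, record):
    """Publish one record to the queue through the default exchange."""
    channel.basic_publish(
        exchange="",
        routing_key=QUEUE_NAME,
        body=json.dumps(record).encode("utf-8"),
    )


def consume(host="localhost"):
    """Block and wait for messages; no polling loop is needed."""
    import pika  # third-party client, imported here so it is only needed when consuming

    connection = pika.BlockingConnection(pika.ConnectionParameters(host=host))
    channel = connection.channel()
    channel.queue_declare(queue=QUEUE_NAME, durable=True)
    channel.basic_consume(queue=QUEUE_NAME, on_message_callback=on_message)
    channel.start_consuming()
```

For the first variant, where the data stays in the database, the message could carry only a row ID and process would use it to run the query. Calling consume() requires a reachable RabbitMQ broker, which this sketch assumes is running on localhost.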
Database polling might not become an issue if the polling is infrequent, such as every few hours or once per day. This is typical for batch jobs that download data and upload it into another system for further processing. Be aware that as the volume of data increases, a single batch job's processing time could extend beyond the expected time window. In this case, you need to either increase the polling frequency, so that each run has less data to handle, or improve the efficiency of the batch job to decrease the total run time.
Don’t forget to check out my Reliable Messaging with RabbitMQ Video Course.