Recurring schedule after an hour did not work anymore #445

Open
lrjbrual opened this issue Dec 6, 2024 · 5 comments

Comments

@lrjbrual

lrjbrual commented Dec 6, 2024

Hi,

We currently have a setup on DigitalOcean that runs an API to send data to a third-party app. It processes the jobs successfully for 1 to 2 hours. We are using a DigitalOcean App Platform container.

Here is our recurring.yml:

production:
  salesforce_invoice_sync:
    class: SalesforceInvoiceSync::InvoiceSyncJob
    schedule: every 10 minutes

development:
  salesforce_invoice_sync:
    class: SalesforceInvoiceSync::InvoiceSyncJob
    schedule: every hour

However, after one to two hours the jobs stop running.
Here is the screenshot:
image
These are the finished jobs; there are none after this point:
image

Here is the peak memory usage:
image

Our logs:

Dec 06 13:41:48 [innovation-funding-7fd67954d9-qln2z](https://my.papertrailapp.com/systems/innovation-funding-7fd67954d9-qln2z/events?focus=1801806248485232640&selected=1801806248485232640) [InnovationFunding](https://my.papertrailapp.com/groups/39357550/events?q=program%3AInnovationFunding&focus=1801806248485232640&selected=1801806248485232640) I, [2024-12-06T21:41:48.038204 #43]  INFO -- :   Parameters: {"server_id"=>"solid_queue", "application_id"=>"innovationfunding", "status"=>"finished"}
Dec 06 13:41:48 [innovation-funding-7fd67954d9-qln2z](https://my.papertrailapp.com/systems/innovation-funding-7fd67954d9-qln2z/events?focus=1801806248694931456&selected=1801806248694931456) [InnovationFunding](https://my.papertrailapp.com/groups/39357550/events?q=program%3AInnovationFunding&focus=1801806248694931456&selected=1801806248694931456) I, [2024-12-06T21:41:48.093696 #43]  INFO -- : Completed 200 OK in 53ms (Views: 23.0ms | ActiveRecord: 19.1ms (67 queries, 52 cached) | GC: 10.0ms)
Dec 06 13:41:48 [innovation-funding-7fd67954d9-qln2z](https://my.papertrailapp.com/systems/innovation-funding-7fd67954d9-qln2z/events?focus=1801806248715911175&selected=1801806248715911175) [InnovationFunding](https://my.papertrailapp.com/groups/39357550/events?q=program%3AInnovationFunding&focus=1801806248715911175&selected=1801806248715911175) I, [2024-12-06T21:41:48.090483 #43]  INFO -- :   Rendered layout layouts/mission_control/jobs/application.html.erb (Duration: 28.0ms | GC: 10.0ms)
Dec 06 13:44:42 [innovation-funding-7fd67954d9-qln2z](https://my.papertrailapp.com/systems/innovation-funding-7fd67954d9-qln2z/events?focus=1801806978533203969&selected=1801806978533203969) [SyncHubWorker](https://my.papertrailapp.com/groups/39357550/events?q=program%3ASyncHubWorker&focus=1801806978533203969&selected=1801806978533203969) D, [2024-12-06T21:44:42.095384 #598] DEBUG -- : SolidQueue-1.0.2 Prune dead processes (5.6ms)  size: 0
Dec 06 13:49:42 [innovation-funding-7fd67954d9-qln2z](https://my.papertrailapp.com/systems/innovation-funding-7fd67954d9-qln2z/events?focus=1801808237189308477&selected=1801808237189308477) [SyncHubWorker](https://my.papertrailapp.com/groups/39357550/events?q=program%3ASyncHubWorker&focus=1801808237189308477&selected=1801808237189308477) D, [2024-12-06T21:49:42.182049 #598] DEBUG -- : SolidQueue-1.0.2 Prune dead processes (76.9ms)  size: 0
Dec 06 13:49:42 [innovation-funding-7fd67954d9-qln2z](https://my.papertrailapp.com/systems/innovation-funding-7fd67954d9-qln2z/events?focus=1801808239529730051&selected=1801808239529730051) [SyncHubWorker](https://my.papertrailapp.com/groups/39357550/events?q=program%3ASyncHubWorker&focus=1801808239529730051&selected=1801808239529730051) D, [2024-12-06T21:49:42.741373 #608] DEBUG -- : SolidQueue-1.0.2 Unblock jobs (1.3ms)  limit: 500, size: 0
Dec 06 13:54:42 [innovation-funding-7fd67954d9-qln2z](https://my.papertrailapp.com/systems/innovation-funding-7fd67954d9-qln2z/events?focus=1801809495778295809&selected=1801809495778295809) [SyncHubWorker](https://my.papertrailapp.com/groups/39357550/events?q=program%3ASyncHubWorker&focus=1801809495778295809&selected=1801809495778295809) D, [2024-12-06T21:54:42.253818 #598] DEBUG -- : SolidQueue-1.0.2 Prune dead processes (65.0ms)  size: 0
Dec 06 13:59:42 [innovation-funding-7fd67954d9-qln2z](https://my.papertrailapp.com/systems/innovation-funding-7fd67954d9-qln2z/events?focus=1801810754165972992&selected=1801810754165972992) [SyncHubWorker](https://my.papertrailapp.com/groups/39357550/events?q=program%3ASyncHubWorker&focus=1801810754165972992&selected=1801810754165972992) D, [2024-12-06T21:59:42.274625 #598] DEBUG -- : SolidQueue-1.0.2 Prune dead processes (8.9ms)  size: 0
Dec 06 13:59:42 [innovation-funding-7fd67954d9-qln2z](https://my.papertrailapp.com/systems/innovation-funding-7fd67954d9-qln2z/events?focus=1801810756204388357&selected=1801810756204388357) [SyncHubWorker](https://my.papertrailapp.com/groups/39357550/events?q=program%3ASyncHubWorker&focus=1801810756204388357&selected=1801810756204388357) D, [2024-12-06T21:59:42.767003 #608] DEBUG -- : SolidQueue-1.0.2 Unblock jobs (4.2ms)  limit: 500, size: 0
Dec 06 13:59:47 [innovation-funding-7fd67954d9-qln2z](https://my.papertrailapp.com/systems/innovation-funding-7fd67954d9-qln2z/events?focus=1801810776764866566&selected=1801810776764866566) [InnovationFunding](https://my.papertrailapp.com/groups/39357550/events?q=program%3AInnovationFunding&focus=1801810776764866566&selected=1801810776764866566) I, [2024-12-06T21:59:47.668392 #43]  INFO -- : Started GET "/jobs/applications/innovationfunding/finished/jobs?server_id=solid_queue" for [172.71.98.110](https://my.papertrailapp.com/groups/39357550/events?q=%22172.71.98.110%22&focus=1801810776764866566&selected=1801810776764866566) at 2024-12-06 21:59:47 +0000

In addition, this is the result when also using Puma and adding bin/jobs start:
image

I am not sure why the jobs stop. Do we need to add a worker? Our other app uses GoodJob with only a web resource on DigitalOcean and no worker, and it seems to work fine under a heavy load of API calls.

Thank you in advance for your help.

@rosa
Member

rosa commented Dec 9, 2024

Hey @lrjbrual, sorry for the delay! It's odd that the job stops being scheduled. When this happens, can you check which processes are running on the server where you're running solid_queue? Just run something like

ps axl | grep solid
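
If the check needs to be repeated, it could be scripted. Below is a minimal Ruby sketch, assuming the process titles keep the shape seen in this thread (for example "solid-queue-worker(1.1.0): ..."). The helper names are hypothetical and not part of Solid Queue.

```ruby
# Hypothetical helpers, not part of Solid Queue: given the text printed by
# `ps axl`, report which Solid Queue process kinds are still alive.
SOLID_QUEUE_KINDS = %w[supervisor dispatcher worker scheduler].freeze

def solid_queue_kinds(ps_output)
  ps_output.each_line.filter_map do |line|
    # Assumed title shape: "solid-queue-<kind>(<version>): <status>"
    match = line.match(/solid-queue-(\w+)\(/)
    match[1] if match && SOLID_QUEUE_KINDS.include?(match[1])
  end.uniq
end

# Kinds that were expected but not found in the process list.
def missing_solid_queue_kinds(ps_output)
  SOLID_QUEUE_KINDS - solid_queue_kinds(ps_output)
end
```

Calling missing_solid_queue_kinds with the output of ps axl returns an empty array while everything is up, and the names of the missing process kinds otherwise. The "grep solid" line itself is ignored because it does not match the assumed title shape.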

@lrjbrual
Author

lrjbrual commented Dec 9, 2024

ps axl | grep solid

Hi @rosa, no problem, I appreciate your reply. I am using DigitalOcean App Platform with a web resource and no worker.

Here is the process list:

0  1000    59    42  20   0 836564 228588 ?     Sl   ?          0:05 solid-queue-supervisor(1.1.0): supervising 71, 74, 78
0  1000    71    59  20   0 837140 228868 ?     Sl   ?          0:09 solid-queue-dispatcher(1.1.0): dispatching every 1 seconds
0  1000    74    59  20   0 839444 232840 ?     Sl   ?          0:11 solid-queue-worker(1.1.0): waiting for jobs in invoice_jobs
0  1000    78    59  20   0 839124 230888 ?     Sl   ?          0:04 solid-queue-scheduler(1.1.0): scheduling salesforce_invoice_sync
0  1000   155   146  20   0  11812  5080 ?      S    ?          0:00 grep solid

After I checked again:

rails@innovation-funding-5cff94d599-qsw5r:/rails$ ps axl | grep solid
0  1000   221   157  20   0  11812  4628 ?      S    ?          0:00 grep solid

My jobs stopped again:
image

It skipped the 19:50 run, and the same will surely happen again:
image

@rosa
Member

rosa commented Dec 9, 2024

Huh, what happened between this

0  1000    59    42  20   0 836564 228588 ?     Sl   ?          0:05 solid-queue-supervisor(1.1.0): supervising 71, 74, 78
0  1000    71    59  20   0 837140 228868 ?     Sl   ?          0:09 solid-queue-dispatcher(1.1.0): dispatching every 1 seconds
0  1000    74    59  20   0 839444 232840 ?     Sl   ?          0:11 solid-queue-worker(1.1.0): waiting for jobs in invoice_jobs
0  1000    78    59  20   0 839124 230888 ?     Sl   ?          0:04 solid-queue-scheduler(1.1.0): scheduling salesforce_invoice_sync
0  1000   155   146  20   0  11812  5080 ?      S    ?          0:00 grep solid

and this?

After I checked again:

rails@innovation-funding-5cff94d599-qsw5r:/rails$ ps axl | grep solid
0  1000   221   157  20   0  11812  4628 ?      S    ?          0:00 grep solid

In the second output, Solid Queue is no longer running, so that's why the job is not being enqueued. Any idea what happened in between?

@lrjbrual
Author

lrjbrual commented Dec 9, 2024

That is what I am trying to find out: why it is happening. I added Honeybadger to investigate. It seems that after 3 attempts the Solid Queue processes are cleared, but I cannot find the cause yet.

@rosa, do you also know how to clean up lingering database connections, for example by re-using an existing connection instead of opening a new one? I found that it connects multiple times, and I have limited the database connections to 22. I'm not sure how to clean up or re-use connections; I'm still exploring the Solid Queue documentation.

I will continue to monitor until tomorrow to see whether I still have an issue running Solid Queue. I reran bin/jobs start and it has been running for almost two and a half hours; I hope it continues without issues.

@rosa
Member

rosa commented Jan 3, 2025

Hey @lrjbrual, so sorry for the delay replying! Somehow I missed the notification for your last comment 🤦‍♀️

You don't need to manually clean up any DB connection; Solid Queue relies on Active Record for that.

I have limited the database connections to 22

Where did you limit this? In Rails's database.yml configuration or in the database itself?
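
For a rough sense of whether a limit of 22 is tight, here is a back-of-the-envelope sketch in Ruby. It assumes the guidance in the Solid Queue README that each worker thread uses one connection, with two more reserved for polling and heartbeat. The allowance for the dispatcher and scheduler, the Puma figures, and all method names and numbers are illustrative assumptions, not values from this thread.

```ruby
# Rough estimate only; every name and number here is hypothetical.

# Assumption: a Solid Queue worker process may use up to threads + 2
# connections (one per job thread, plus polling and heartbeat).
def worker_connections(threads:, processes: 1)
  (threads + 2) * processes
end

# Assumption: each Puma process may hold one connection per thread.
# Puma in single mode reports 0 workers, so count at least one process.
def puma_connections(threads:, workers: 1)
  threads * [workers, 1].max
end

# Assumption: the dispatcher and scheduler each need up to 2 connections.
def estimated_connections(worker_threads:, worker_processes:,
                          puma_threads:, puma_workers:, other_processes: 2)
  worker_connections(threads: worker_threads, processes: worker_processes) +
    puma_connections(threads: puma_threads, workers: puma_workers) +
    other_processes * 2
end

limit = 22 # the limit mentioned in this thread
estimate = estimated_connections(worker_threads: 3, worker_processes: 1,
                                 puma_threads: 3, puma_workers: 1)
puts "estimated #{estimate} of #{limit} connections"
```

If the estimate gets close to the limit, either the pool in database.yml or the limit on the database side is worth revisiting. This is why it matters where the limit of 22 was set.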
