Are there checkpoints?

Christian Krause
Project administrator

Joined: 9 May 22
Posts: 250
Credit: 449,267
RAC: 198
Message 319 - Posted: 28 May 2022, 14:30:41 UTC - in response to Message 315.  

If you think there is a problem or a missing feature in the wrapper app, please report the issue to the BOINC dev team.
ID: 319 · Rating: 0
frankhagen

Joined: 14 May 22
Posts: 6
Credit: 220
RAC: 0
Message 323 - Posted: 29 May 2022, 4:55:00 UTC
Last modified: 29 May 2022, 4:59:02 UTC

The feature is there.

So are you setting checkpoint_filename in your wrapper script and actually using it?
ID: 323 · Rating: 0
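
For readers unfamiliar with the setting mentioned above: in the BOINC wrapper, checkpoint_filename is an element of a task in the wrapper's job file. The following is only a minimal sketch of generating such a file. The application name "worker" and the file name "checkpoint.txt" are hypothetical, and nothing here reflects this project's actual configuration.

```python
import xml.etree.ElementTree as ET


def build_job_xml(app_name: str, checkpoint_file: str) -> str:
    """Return a wrapper job description that names a checkpoint file.

    Both arguments are illustrative placeholders, not values used by
    any real project.
    """
    job = ET.Element("job_desc")
    task = ET.SubElement(job, "task")
    ET.SubElement(task, "application").text = app_name
    # Naming the file is not enough on its own: the worker application
    # must actually write this file for the setting to have any effect.
    ET.SubElement(task, "checkpoint_filename").text = checkpoint_file
    return ET.tostring(job, encoding="unicode")


if __name__ == "__main__":
    print(build_job_xml("worker", "checkpoint.txt"))
```

This is the point of the question in the post: the entry has to be present in the job file, and the application has to write the named file.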
Christian Krause
Project administrator

Joined: 9 May 22
Posts: 250
Credit: 449,267
RAC: 198
Message 332 - Posted: 31 May 2022, 12:25:09 UTC - in response to Message 323.  

Thanks for pointing this out. We will add it to the config.
ID: 332 · Rating: 0
Christian Krause
Project administrator

Joined: 9 May 22
Posts: 250
Credit: 449,267
RAC: 198
Message 355 - Posted: 6 Jun 2022, 20:47:56 UTC - in response to Message 332.  

It is fixed in the latest app version: 220606.
ID: 355 · Rating: 0
DaveW

Joined: 3 Jun 22
Posts: 21
Credit: 100,051
RAC: 0
Message 357 - Posted: 7 Jun 2022, 7:08:30 UTC

Christian, you need checkpoints. Otherwise people are wasting their time and money.
ID: 357 · Rating: 0
boboviz

Joined: 13 May 22
Posts: 24
Credit: 202,288
RAC: 3
Message 362 - Posted: 9 Jun 2022, 6:59:20 UTC - in response to Message 357.  

Christian, you need checkpoints. Otherwise people are wasting their time and money.


??
My WUs are restarting correctly after a PC reboot...
ID: 362 · Rating: 0
Christian Krause
Project administrator

Joined: 9 May 22
Posts: 250
Credit: 449,267
RAC: 198
Message 370 - Posted: 10 Jun 2022, 21:21:05 UTC - in response to Message 362.  

DaveW: checkpoints are working. Maybe you still had an old app version?
ID: 370 · Rating: 0
Werinbert

Joined: 14 May 22
Posts: 7
Credit: 100,055
RAC: 0
Message 386 - Posted: 12 Jun 2022, 1:32:33 UTC

I still question the usefulness of the checkpointing. The checkpoints seem to happen near the start of the task and then stop for hours. Thus, when a checkpoint is used, many hours of crunching time are still lost. (Tasks from 2206.11)
ID: 386 · Rating: 0
Christian Krause
Project administrator

Joined: 9 May 22
Posts: 250
Credit: 449,267
RAC: 198
Message 391 - Posted: 12 Jun 2022, 17:07:04 UTC - in response to Message 386.  

Hi Werinbert, checkpoints are written every time the progress is updated. Similar to your observations, we noticed issues with long-running tasks. Therefore, we have reduced the task size to 50% in 220612. We hope that this will reduce such issues. Thanks for reporting it.
ID: 391 · Rating: 0
mikey
Joined: 13 May 22
Posts: 3
Credit: 436,766
RAC: 204
Message 395 - Posted: 13 Jun 2022, 2:29:45 UTC - in response to Message 391.  

Hi Werinbert, checkpoints are written every time the progress is updated. Similar to your observations, we noticed issues with long-running tasks. Therefore, we have reduced the task size to 50% in 220612. We hope that this will reduce such issues. Thanks for reporting it.


I just aborted a 2206.10 task that had run for almost 19 hours and was only about 50% complete; hopefully others do much better. I am still running one 2206.10 task, and all the rest are 2206.11 tasks, 39 of them on this machine alone.
ID: 395 · Rating: 0
Conan

Joined: 13 May 22
Posts: 37
Credit: 3,056,211
RAC: 2,938
Message 397 - Posted: 13 Jun 2022, 8:46:47 UTC - in response to Message 395.  

Hi Werinbert, checkpoints are written every time the progress is updated. Similar to your observations, we noticed issues with long-running tasks. Therefore, we have reduced the task size to 50% in 220612. We hope that this will reduce such issues. Thanks for reporting it.


I just aborted a 2206.10 task that had run for almost 19 hours and was only about 50% complete; hopefully others do much better. I am still running one 2206.10 task, and all the rest are 2206.11 tasks, 39 of them on this machine alone.


Dump them, mikey.
I dumped all my 2206.10 tasks because they exceeded the run time limit (I think I only managed 4 successful results), skipped version 2206.11, and now have 2206.12 tasks, which have a reduced run time and so don't hit the limit. I have not had any failures so far, with 20-odd already run.

Conan
ID: 397 · Rating: 0
PDW

Joined: 13 May 22
Posts: 24
Credit: 10,000,312
RAC: 0
Message 404 - Posted: 18 Jun 2022, 12:41:01 UTC - in response to Message 391.  

Hi Werinbert, checkpoints are written every time the progress is updated. Similar to your observations, we noticed issues with long-running tasks. Therefore, we have reduced the task size to 50% in 220612. We hope that this will reduce such issues. Thanks for reporting it.

I have recently seen a task that has been running for more than 5 hours without a checkpoint being written.

The idea of checkpoints is that they are written at regular intervals. The default BOINC setting is 60 seconds, which has always seemed quite low to me, but it can be configured by the user.
Could you write an entry in the logs to show when a checkpoint is written?
That assumes you can also fix the logs so that each one is included in its entirety, as per my other post on that matter.
ID: 404 · Rating: 0
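
The request in the post above (checkpoints at regular intervals, each one noted in the log) could be sketched roughly as follows. This is an illustration only, not how the LODA app or the BOINC wrapper is implemented. The Checkpointer class, its parameters, and the log message format are all hypothetical; the 60-second default simply mirrors the figure quoted in the post.

```python
import time
from typing import Callable, List


class Checkpointer:
    """Write a checkpoint once at least `interval` seconds have elapsed.

    Hypothetical helper for illustration; `clock` and `log` are
    injectable so the behaviour can be checked without waiting.
    """

    def __init__(
        self,
        interval: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
        log: Callable[[str], None] = print,
    ) -> None:
        self.interval = interval
        self.clock = clock
        self.log = log
        self.last = clock()  # time of the last checkpoint (or start)
        self.written: List[float] = []

    def maybe_checkpoint(self, save: Callable[[], None]) -> bool:
        """Call `save` and log it if the interval has passed."""
        now = self.clock()
        if now - self.last < self.interval:
            return False
        save()
        self.last = now
        self.written.append(now)
        # One log line per checkpoint, as asked for in the post.
        self.log(f"checkpoint written at t={now:.0f}s")
        return True
```

The work loop would call maybe_checkpoint() at every safe point, so the gap between checkpoints depends on elapsed time and on how often safe points occur, not on how often the progress value changes.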
Christian Krause
Project administrator

Joined: 9 May 22
Posts: 250
Credit: 449,267
RAC: 198
Message 415 - Posted: 19 Jun 2022, 19:43:42 UTC - in response to Message 404.  

Our stats confirm that the average compute time is around 2 CPU hours. There can be very rare, exceptional cases where it takes much longer. If you see this happening, you can send us the WU so that we can investigate it further.
ID: 415 · Rating: 0

©2024 LODA Language