Xtrack beam simulation v0.04
Author | Message |
---|---|
Joined: 24 Oct 19 Posts: 276 Credit: 800,403 RAC: 3,728 |
In reply to Garrulus glandarius's message of 21 Sep 2025: "Consider yourself lucky with your wingmen. Mine still hasn't finished a single task out of almost 2000 and over 1000 have timed out." The only way to validate our work is for the admins to stop creating WUs and wait for the queue to drain... |
Joined: 24 Oct 19 Posts: 276 Credit: 800,403 RAC: 3,728 |
In reply to Garrulus glandarius's message of 21 Sep 2025: "Consider yourself lucky with your wingmen. Mine still hasn't finished a single task out of almost 2000 and over 1000 have timed out." The only way to validate our work is for the admins to stop creating WUs and wait for the queue to drain... |
Joined: 22 Aug 22 Posts: 44 Credit: 82,456 RAC: 167 |
What was that? |
Joined: 17 Mar 15 Posts: 106 Credit: 1,038,379 RAC: 432 |
It was a double post to answer a double post :D |
Joined: 19 Sep 25 Posts: 24 Credit: 19,188 RAC: 506 |
In reply to boboviz's message of 21 Sep 2025: "The only way to validate our work is for the admins to stop creating WUs and wait for the queue to drain..." Yeah, probably. Now all my new tasks are 3rd and 4th replicates of various "B2_scan" WUs that were originally sent a week ago. It still takes quite a while to finish any of the running tasks (mostly the "collapse" and "levelling" series), and runtimes are very random on my end as well. Lots of patience is needed to finally figure out what's going on and how things work in the background. |
Joined: 25 Apr 25 Posts: 5 Credit: 421,998 RAC: 2,617 |
I have set my resource share to 0% to avoid having a queue, as the tasks still have a 1-minute estimated run time. Is there a way for the devs to have the error limit increased? I fear the current limit (5) will be breached soon given the number of tasks timing out and/or being aborted, and legit workunits will be marked as faulty. |
Joined: 19 Sep 25 Posts: 24 Credit: 19,188 RAC: 506 |
In reply to AnandBhat's message of 22 Sep 2025: "I fear the current limit (5) will be breached soon with the number of tasks being timed out and/or aborted, and legit workunits will be marked as faulty." Seconded. I've been tinkering with settings on very different computers, so sadly I also ended up aborting a bunch of tasks. Also, could the max # of tasks in preferences be increased from 8? It would help avoid having to set LHC to 0 priority and interfering with any backup projects. |
Joined: 22 Apr 16 Posts: 782 Credit: 4,057,880 RAC: 4,191 |
When you set it to unlimited, you get 50 tasks on Win11 Pro with BOINC 8.2.4. |
Joined: 19 Sep 25 Posts: 24 Credit: 19,188 RAC: 506 |
In reply to maeax's message of 22 Sep 2025: "When you set unlimited," That's still way too many for a computer with 16 threads in the present situation, when some tasks run for days. Some steps between 8 and 50 (16, 24, 32 maybe) would allow better fine-tuning. |
Joined: 22 Apr 16 Posts: 782 Credit: 4,057,880 RAC: 4,191 |
You can set 8 in the prefs. More tasks are only possible with unlimited. |
Joined: 18 Sep 16 Posts: 19 Credit: 1,034,732 RAC: 544 |
In reply to maeax's message of 17 Sep 2025: "Have many Xtrack Tasks with deadline 25/09/19." That could be down to the new BOINC 8.2.4 too, as it does have some issues. |
Joined: 19 Sep 25 Posts: 24 Credit: 19,188 RAC: 506 |
In reply to mikey's message of 22 Sep 2025: I'm running 7.24.1 on my dedicated computer (with Mint 22.1) and I see the same thing: apparent progress is fast at first, then slows down after 90%, and once it gets close to 100% BOINC estimates 0 time left and tasks keep running in this state for another 1 to 3 days. It's similar on Win11 running BOINC 8.2.4. |
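The "fast at first, then 0 time left for days" pattern is what a plain fraction-done extrapolation produces when an app reports progress too optimistically. A minimal sketch, assuming the remaining time is extrapolated only from elapsed time and the reported fraction done (the real BOINC client blends several estimates, and the function name here is hypothetical):

```python
def remaining_from_fraction_done(elapsed_s: float, fraction_done: float) -> float:
    """Extrapolate remaining seconds, assuming progress is linear in time.

    Hypothetical helper for illustration; not the BOINC client's actual code.
    """
    if not 0.0 < fraction_done <= 1.0:
        raise ValueError("fraction_done must be in (0, 1]")
    return elapsed_s * (1.0 - fraction_done) / fraction_done


# A task that already reports 100% after one day is extrapolated to be
# finished, no matter how long it actually keeps running.
print(remaining_from_fraction_done(86400.0, 1.0))  # 0.0
```

Under this assumption, a task that reports 100% too early will always show no time left, which matches what is described above.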
Joined: 28 Jul 16 Posts: 527 Credit: 400,710 RAC: 0 |
This is very old BOINC behaviour that can always be observed with new apps/app versions. At the very least, work fetch, credit calculation and estimated time left are based on a couple of input values which are not even constants. Overcommitted computers or very low (sometimes very high!) credits per task are typical results of this. It always takes a while until this stabilises. The more the runtimes vary, the longer it takes (or it gets even worse). There is not really anything the project can do, as the root cause must be solved in the BOINC code (server and client). |
Joined: 17 Mar 15 Posts: 106 Credit: 1,038,379 RAC: 432 |
It doesn't take dozens of tasks to get to a regular/correct estimate per machine. Here it seems to be unable to adjust anything, and we keep getting tasks that behave the same way: they quickly get to 100% and can then run for days like that, with the impacts discussed above. |
Joined: 17 Mar 15 Posts: 106 Credit: 1,038,379 RAC: 432 |
However, I see the behaviour has now changed: I don't have any pending tasks anymore because they were all cancelled by the server, so I have only 4 ongoing tasks. They all show an estimated % (no longer 100%) and an estimated remaining time, and remaining + elapsed is more or less 35 hours. I'll see if that is realistic, but now it does behave more like a regular BOINC app. |
Joined: 20 Jun 17 Posts: 38 Credit: 6,492,588 RAC: 3,385 |
Stupidly low flops estimates caused the initial download count to be too high, as the tasks had an ETA of less than a minute. Some projects get this right, others do not. A second issue is the tasks getting to 100%, so the client thinks they are about done when there can really be days more of run time. So even with a 0.01-day queue the client thinks you need a new task ready for every thread. I had several clients download work from my backup/0% project as if LHC-dev was NNW, even though I had 20-some LHC-dev tasks to run for 32 threads and all my running tasks were in High-Priority mode/near deadline. The high variability of the work doesn't help, and the client is very slow to adapt (even with consistent run times), but the issues mentioned in this thread are project related, down to the setup of the tasks. I'm not sure the error count can be corrected w/o a re-issue, so we could be stuck with no credit for perfectly returned work because of too many timeouts due to the task/app setup. I would much rather have a 100x longer initial ETA than 1/100th. GPUGrid is like this even though they limit tasks per processor. Even with over 300 successful tasks, the ETA on some of these tasks is still 2-4 min. |
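To make the over-fetch mechanism in this post concrete, here is a minimal sketch. It assumes the initial runtime estimate is simply the workunit's declared operation count divided by the host's speed, and that the client asks for enough tasks to keep every thread busy for the configured buffer. This is a deliberately simplified model, not the BOINC client's actual work-fetch code; the function names and the example numbers are illustrative assumptions.

```python
import math


def estimated_runtime_s(fpops_est: float, host_flops: float) -> float:
    """Initial ETA: declared floating-point operations / host speed (assumed model)."""
    return fpops_est / host_flops


def tasks_to_fill_buffer(buffer_s: float, n_threads: int, est_runtime_s: float) -> int:
    """Tasks needed to keep n_threads busy for buffer_s seconds (simplified)."""
    return math.ceil(buffer_s * n_threads / est_runtime_s)


buffer_s = 864.0  # a 0.01-day queue, as mentioned in the post
n_threads = 32

# With an ETA of one minute the simplified client wants hundreds of tasks...
print(tasks_to_fill_buffer(buffer_s, n_threads, 60.0))          # 461
# ...while a realistic ~35 h estimate would ask for a single one.
print(tasks_to_fill_buffer(buffer_s, n_threads, 35 * 3600.0))   # 1
```

Under these assumptions, an ETA that is too short by a factor of about 2000 inflates the request by roughly the same factor, which is why an overestimated initial ETA is the safer failure mode.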
Joined: 22 Apr 16 Posts: 782 Credit: 4,057,880 RAC: 4,191 |
Podman (Docker) is running inside. This makes it impossible for Squid to work. BOINC doesn't seem to have 100% control of the running task. At the moment Xtrack gets the message "no tasks available". Wingman work needs more power from the CERN IT cluster, which should also do wingman work as it does for ATLAS. |
Joined: 8 Apr 15 Posts: 834 Credit: 15,366,731 RAC: 10,011 |
It might help if we didn't have so many of these as wingmen: https://lhcathomedev.cern.ch/lhcathome-dev/results.php?hostid=4195 |
Joined: 19 Sep 25 Posts: 24 Credit: 19,188 RAC: 506 |
Yeah, I have a bunch of tasks with that guy as wingman. Snatched 1k tasks then disappeared (last connection was on the 20th). |
Joined: 24 Oct 19 Posts: 276 Credit: 800,403 RAC: 3,728 |
In reply to maeax's message of 23 Sep 2025: "Podman (Docker) is running inside." Are you sure? From what I know, XBoinc is native and does not use virtualization... |
©2025 CERN