Message boards : General Discussion : Xtrack beam simulation v0.04
boboviz

Joined: 24 Oct 19
Posts: 276
Credit: 800,403
RAC: 3,728
Message 9097 - Posted: 21 Sep 2025, 14:13:00 UTC - in response to Message 9093.  

In reply to Garrulus glandarius's message of 21 Sep 2025:
Consider yourself lucky with your wingmen. Mine still hasn't finished a single task out of almost 2000 and over 1000 have timed out.

The newer ones I'm getting were aborted by both initial users, but task #3 is unsent so there's no chance for my results to validate in the near future.


The only way to validate our work is for the admins to stop creating WUs and wait for the queue to drain...
boboviz

Joined: 24 Oct 19
Posts: 276
Credit: 800,403
RAC: 3,728
Message 9098 - Posted: 21 Sep 2025, 14:14:47 UTC - in response to Message 9093.  

In reply to Garrulus glandarius's message of 21 Sep 2025:
Consider yourself lucky with your wingmen. Mine still hasn't finished a single task out of almost 2000 and over 1000 have timed out.

The newer ones I'm getting were aborted by both initial users, but task #3 is unsent so there's no chance for my results to validate in the near future.


The only way to validate our work is for the admins to stop creating WUs and wait for the queue to drain...
kotenok2000
Joined: 22 Aug 22
Posts: 44
Credit: 82,456
RAC: 137
Message 9099 - Posted: 21 Sep 2025, 15:42:32 UTC

What was that?
[AF>Le_Pommier] Jerome_C2005

Joined: 17 Mar 15
Posts: 106
Credit: 1,038,379
RAC: 355
Message 9101 - Posted: 21 Sep 2025, 15:45:43 UTC

It was a double post to answer a double post :D
Garrulus glandarius

Joined: 19 Sep 25
Posts: 24
Credit: 19,188
RAC: 415
Message 9102 - Posted: 21 Sep 2025, 15:48:00 UTC - in response to Message 9098.  

In reply to boboviz's message of 21 Sep 2025:
The only way to validate our work is for the admins to stop creating WUs and wait for the queue to drain...


Yeah, probably. Now all my new tasks are 3rd and 4th replicates of various "B2_scan" WUs that were originally sent a week ago. It still takes quite a while to finish any of the running tasks (mostly "collapse" and "levelling" series) and runtimes are very random on my end as well. Lots of patience is needed to finally figure out what's going on and how things work in the background.
AnandBhat

Joined: 25 Apr 25
Posts: 5
Credit: 421,998
RAC: 2,617
Message 9103 - Posted: 22 Sep 2025, 0:13:16 UTC

I have set my resource share to 0% to avoid having a queue, as the tasks still have a 1m estimated run time. Is there a way for the devs to increase the error limit? I fear the current limit (5) will soon be breached given the number of tasks timing out and/or being aborted, and legit workunits will be marked as faulty.
Garrulus glandarius

Joined: 19 Sep 25
Posts: 24
Credit: 19,188
RAC: 415
Message 9107 - Posted: 22 Sep 2025, 7:23:14 UTC - in response to Message 9103.  

In reply to AnandBhat's message of 22 Sep 2025:
I fear the current limit (5) will soon be breached given the number of tasks timing out and/or being aborted, and legit workunits will be marked as faulty.


Seconded. I've been tinkering with settings on very different computers, so sadly I also ended up aborting a bunch of tasks. Also, could the max # of tasks in preferences be increased from 8? It would help avoid setting LHC to 0 priority and interfering with any backup projects.
maeax

Joined: 22 Apr 16
Posts: 782
Credit: 4,057,880
RAC: 3,438
Message 9108 - Posted: 22 Sep 2025, 7:45:42 UTC - in response to Message 9107.  

When you set it to unlimited,
you get 50 tasks on Win11 Pro with BOINC 8.2.4.
Garrulus glandarius

Joined: 19 Sep 25
Posts: 24
Credit: 19,188
RAC: 415
Message 9109 - Posted: 22 Sep 2025, 7:56:33 UTC - in response to Message 9108.  

In reply to maeax's message of 22 Sep 2025:
When you set it to unlimited,
you get 50 tasks on Win11 Pro with BOINC 8.2.4.


That's still way too many for a computer with 16 threads in the present situation, when some tasks run for days. Some steps between 8 and 50 (16, 24, 32 maybe) would allow better fine-tuning.
maeax

Joined: 22 Apr 16
Posts: 782
Credit: 4,057,880
RAC: 3,438
Message 9110 - Posted: 22 Sep 2025, 9:18:53 UTC - in response to Message 9109.  

You can set up to 8 in the prefs.
More tasks are only possible with unlimited.
mikey

Joined: 18 Sep 16
Posts: 19
Credit: 1,034,732
RAC: 446
Message 9111 - Posted: 22 Sep 2025, 13:29:31 UTC - in response to Message 9066.  

In reply to maeax's message of 17 Sep 2025:
I have many Xtrack tasks with a deadline of 25/09/19.
Xtrack runtimes vary from a few minutes to many days.

So BOINC has no basis for calculating a good runtime estimate.

After three hours of runtime, BOINC Manager no longer shows an end time.
The stderr.txt in the slot folder does not help either.


That could be the new Boinc 8.2.4 too as it does have some issues
Garrulus glandarius

Joined: 19 Sep 25
Posts: 24
Credit: 19,188
RAC: 415
Message 9113 - Posted: 22 Sep 2025, 15:49:39 UTC - in response to Message 9111.  

In reply to mikey's message of 22 Sep 2025:
In reply to maeax's message of 17 Sep 2025:
I have many Xtrack tasks with a deadline of 25/09/19.
Xtrack runtimes vary from a few minutes to many days.

So BOINC has no basis for calculating a good runtime estimate.

After three hours of runtime, BOINC Manager no longer shows an end time.
The stderr.txt in the slot folder does not help either.


That could be the new Boinc 8.2.4 too as it does have some issues


I'm running 7.24.1 on my dedicated computer (with Mint 22.1) and I have the same observations: apparent progress is fast at first, then slows down after 90%, and once it gets close to 100% BOINC estimates 0 time left and tasks keep running in this state for another 1 to 3 days. It's similar on Win11 running BOINC 8.2.4.
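The effect described above can be illustrated with a generic progress-based extrapolation. This is only a sketch of the idea, not BOINC's actual estimator: the function name and the numbers are hypothetical, and it assumes the remaining time is derived purely from the elapsed time and the fraction done that the app reports.

```python
def remaining_from_fraction_done(elapsed_s: float, fraction_done: float) -> float:
    """Extrapolate the remaining runtime from the reported progress.

    Hypothetical helper for illustration only. It assumes progress is
    linear in time, which is exactly what breaks down when an app
    reports ~100% long before it has actually finished.
    """
    if fraction_done <= 0:
        raise ValueError("no progress reported yet, cannot extrapolate")
    f = min(fraction_done, 1.0)
    # total = elapsed / f, so remaining = total - elapsed
    return elapsed_s * (1.0 - f) / f


# A task that reports 90% after two hours looks almost done...
print(remaining_from_fraction_done(2 * 3600, 0.9))
# ...and one that reports 100% looks finished, however long it keeps running.
print(remaining_from_fraction_done(3 * 3600, 1.0))
```

Under this assumption, a task sitting at 100% always shows zero time left, so the estimate cannot recover until the reported progress itself becomes realistic.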
computezrmle
Volunteer moderator
Project tester
Volunteer developer
Volunteer tester
Help desk expert
Joined: 28 Jul 16
Posts: 527
Credit: 400,710
RAC: 0
Message 9114 - Posted: 22 Sep 2025, 16:23:47 UTC - in response to Message 9113.  

This is a very old BOINC behaviour that can always be observed in connection with new apps/app versions.
At least work fetch, credit calculation and estimated time left are based on a couple of input values which are not even constants.

Overcommitted computers or very low (sometimes very high!) credits per task are typical results of this.

It always takes a while until this stabilises.
The more the runtimes vary, the longer it takes (or it gets even worse).


There is not really anything the project can do as the root cause must be solved in the BOINC code (server and client).
[AF>Le_Pommier] Jerome_C2005

Joined: 17 Mar 15
Posts: 106
Credit: 1,038,379
RAC: 355
Message 9115 - Posted: 22 Sep 2025, 16:32:25 UTC
Last modified: 22 Sep 2025, 16:32:49 UTC

It doesn't normally take dozens of tasks to get a regular/correct estimate per machine. Here it seems unable to adjust anything, and we keep getting tasks that behave the same way: they quickly get to 100% and can then run for days like that, with the impacts discussed above.
[AF>Le_Pommier] Jerome_C2005

Joined: 17 Mar 15
Posts: 106
Credit: 1,038,379
RAC: 355
Message 9116 - Posted: 22 Sep 2025, 20:20:41 UTC

However, I see the behaviour has now changed: I no longer have any pending tasks because they were all cancelled by the server, so I have only 4 ongoing tasks. They all show an estimated % (no longer 100%) and an estimated remaining time, and remaining + elapsed comes to more or less 35 hours.

So I'll see if it is realistic, but now it does behave more like a regular BOINC app.
mmonnin

Joined: 20 Jun 17
Posts: 38
Credit: 6,492,588
RAC: 2,777
Message 9117 - Posted: 22 Sep 2025, 20:38:52 UTC

Stupidly low flops estimates caused the initial download count to be too high, as the tasks had an ETA of less than a minute. Some projects get this right, others do not.
A 2nd issue is that tasks get to 100%, so the client thinks they are about done when there can really be days more of run time. So even with a 0.01-day queue the client thinks you need a new task ready for every thread. I had several clients download work from my backup/0% project as LHC-dev was NNW, even though I had 20-some LHC-dev tasks to run for 32 threads and all my running tasks were in High-Priority mode/near deadline.

The high variability of the work doesn't help, and the client is very slow to adapt (even with consistent run times), but the issues mentioned in this thread are project-related and stem from the setup of the tasks.
I'm not sure the error count can be corrected w/o a re-issue, so we could be stuck with no credit for perfectly returned work because of too many timeouts due to the task/app setup.

I would much rather have a 100x longer initial ETA than 1/100th. GPUGrid is like this even though they limit tasks per processor. Even with over 300 successful tasks, the ETA on some of these tasks is still 2-4 min.
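The first issue above can be sketched with a toy model. It assumes the estimated runtime is the task's declared floating-point operation count divided by the host's speed, and that the client tries to keep its work buffer full for every thread. The function names, the `cap` parameter and all numbers are hypothetical, and this is not BOINC's actual work-fetch code.

```python
import math
from typing import Optional


def estimated_runtime_s(fpops_est: float, host_flops: float) -> float:
    """Toy runtime estimate: declared operations divided by host speed."""
    return fpops_est / host_flops


def tasks_to_fill_buffer(buffer_days: float, threads: int,
                         est_runtime_s: float,
                         cap: Optional[int] = None) -> int:
    """How many tasks appear necessary to keep every thread busy for the buffer.

    `cap` stands in for a project-side limit on tasks in progress
    (hypothetical parameter, illustration only).
    """
    buffer_s = buffer_days * 86400
    wanted = math.ceil(buffer_s * threads / est_runtime_s)
    return wanted if cap is None else min(wanted, cap)


# Declared work is far too low: the task looks like a 60-second job.
eta = estimated_runtime_s(fpops_est=6e10, host_flops=1e9)
print(eta, tasks_to_fill_buffer(0.5, 16, eta))

# With a realistic one-day estimate, the same buffer needs far fewer tasks.
print(tasks_to_fill_buffer(0.5, 16, 86400.0))
```

In this model an ETA that is too short by a large factor inflates the apparent need by the same factor, so only a cap keeps the host from being flooded, while an ETA that is too long merely leaves the buffer under-filled.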
maeax

Joined: 22 Apr 16
Posts: 782
Credit: 4,057,880
RAC: 3,438
Message 9118 - Posted: 23 Sep 2025, 3:32:05 UTC - in response to Message 9117.  

Podman (Docker) is running inside.
This makes it impossible for Squid to work.
BOINC does not seem to have 100% control of the running task.
At the moment, Xtrack requests get the message "no tasks available".
Wingman work needs more power from the CERN IT cluster, so that it also does wingman work as with ATLAS.
Magic Quantum Mechanic
Joined: 8 Apr 15
Posts: 834
Credit: 15,366,858
RAC: 9,935
Message 9119 - Posted: 23 Sep 2025, 5:22:17 UTC

It might help if we didn't have so many of these as the wingman
https://lhcathomedev.cern.ch/lhcathome-dev/results.php?hostid=4195
Garrulus glandarius

Joined: 19 Sep 25
Posts: 24
Credit: 19,188
RAC: 415
Message 9120 - Posted: 23 Sep 2025, 6:20:36 UTC - in response to Message 9119.  

Yeah, I have a bunch of tasks with that guy as wingman. Snatched 1k tasks then disappeared (last connection was on the 20th).
boboviz

Joined: 24 Oct 19
Posts: 276
Credit: 800,403
RAC: 3,728
Message 9121 - Posted: 23 Sep 2025, 6:25:21 UTC - in response to Message 9118.  

In reply to maeax's message of 23 Sep 2025:
Podman (Docker) is running inside.


Are you sure? From what I know, XBoinc is native and does not use virtualization...


©2025 CERN