Message boards : ATLAS Application : ATLAS long simulation 1.01

zombie67 [MM]
Joined: 26 Feb 15
Posts: 26
Credit: 4,101,356
RAC: 0
Message 7146 - Posted: 22 Mar 2021, 21:54:58 UTC

Thanks for the info. Even running 8x 4-core tasks, they are finishing in about 7 hours. So no issue WRT time limit. I guess I will just pick the middle and run 4x 8-core tasks.
Reno, NV
Team: SETI.USA
zombie67 [MM]
Joined: 26 Feb 15
Posts: 26
Credit: 4,101,356
RAC: 0
Message 7147 - Posted: 23 Mar 2021, 4:36:10 UTC

FWIW, all my machines have dried up. Server status page says 300+ tasks available, but nothing will download.
maeax

Joined: 22 Apr 16
Posts: 659
Credit: 1,719,912
RAC: 3,195
Message 7148 - Posted: 23 Mar 2021, 9:00:37 UTC

https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=2958471
This task is in BigPanda, but the control messages show no HITFile message:
[2021-03-23 00:31:09] -rw-------. 1 boinc boinc 698704227 23. Mär 00:30 HITS.24586168._000132.pool.root.1
[2021-03-23 00:31:09] -rw-------. 1 boinc boinc 1034 23. Mär 00:30 memory_monitor_summary.json
[2021-03-23 00:31:09] -rw-------. 1 boinc boinc 1526165 23. Mär 00:30 log.24586168._000132.job.log.tgz.1
[2021-03-23 00:31:09] -rw-------. 1 boinc boinc 404198 23. Mär 00:30 heartbeat.json
[2021-03-23 00:31:09] -rw-r--r--. 1 boinc boinc 29 23. Mär 00:30 wrapper_checkpoint.txt
[2021-03-23 00:31:09] -rw-r--r--. 1 boinc boinc 8192 23. Mär 00:30 boinc_mmap_file
[2021-03-23 00:31:09] -rw-------. 1 boinc boinc 8823 23. Mär 00:31 pilotlog.txt
[2021-03-23 00:31:09] -rw-------. 1 boinc boinc 3763711 23. Mär 00:31 log.24586168._000132.job.log.1
[2021-03-23 00:31:09] -rw-------. 1 boinc boinc 417 23. Mär 00:31 output.list
[2021-03-23 00:31:09] -rw-r--r--. 1 boinc boinc 620 23. Mär 00:31 runtime_log
[2021-03-23 00:31:09] -rw-------. 1 boinc boinc 5703680 23. Mär 00:31 result.tar.gz
[2021-03-23 00:31:09] -rw-r--r--. 1 boinc boinc 9729 23. Mär 00:31 runtime_log.err
[2021-03-23 00:31:09] -rw-------. 1 boinc boinc 587 23. Mär 00:31 lllLDmVQUhynfZGDcpSWOuwoABFKDmABFKDm2IFNDm5DFKDmittgxm.diag
[2021-03-23 00:31:09] -rw-r--r--. 1 boinc boinc 433571 23.

CentOS 8 VM with squid; there are a lot of mirror messages from CentOS 8 in the squid log!
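One way to check a listing like the one above for a HITS file is to scan it with a short script. This is only a sketch: the function name `find_hits_files` and the `LISTING` sample are made up for illustration, and it assumes each complete line follows the `ls -l` layout seen in the log (permissions, links, owner, group, size, day, month, time, name). Truncated lines are skipped rather than guessed at.

```python
import re

# A few lines copied from the listing above; the last one is truncated in the post.
LISTING = """\
[2021-03-23 00:31:09] -rw-------. 1 boinc boinc 698704227 23. Mär 00:30 HITS.24586168._000132.pool.root.1
[2021-03-23 00:31:09] -rw-------. 1 boinc boinc 1034 23. Mär 00:30 memory_monitor_summary.json
[2021-03-23 00:31:09] -rw-------. 1 boinc boinc 5703680 23. Mär 00:31 result.tar.gz
[2021-03-23 00:31:09] -rw-r--r--. 1 boinc boinc 433571 23.
"""


def find_hits_files(listing):
    """Return (name, size_in_bytes) for each complete line that names a HITS file."""
    found = []
    for line in listing.splitlines():
        # Drop the leading "[timestamp]" prefix, if there is one.
        rest = re.sub(r"^\[[^\]]*\]\s*", "", line)
        fields = rest.split()
        # A complete line has 9 fields; anything shorter is truncated or unrelated.
        if len(fields) < 9:
            continue
        name = fields[-1]
        if name.startswith("HITS."):
            found.append((name, int(fields[4])))
    return found


for name, size in find_hits_files(LISTING):
    print(f"{name}: {size / 1024**2:.1f} MiB")
```

On the lines quoted here it reports the HITS file together with its size, so the file itself appears to have been written even if no separate message about it was shown.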
David Cameron
Project administrator
Project developer
Project tester
Project scientist

Joined: 20 Apr 16
Posts: 180
Credit: 1,355,327
RAC: 0
Message 7149 - Posted: 23 Mar 2021, 9:51:49 UTC - in response to Message 7145.  


David Cameron wrote:
ATLAS systems cancel any tasks which have been queued for more than two days

I would suggest to take this as a hint and use a setup that allows a task to finish within 1-1.5 days.
Beside that it's a matter of personal preference.


Sorry I wasn't clear here before. I meant that unsent tasks are cancelled after 2 days. Once a task is taken by a volunteer the usual 1 week deadline applies, and then if the deadline expires the task is available for another host.
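As a rough illustration of the two rules in this reply (not the actual server logic), the life cycle of a task could be sketched as below. The function `task_state` and its state names are hypothetical; only the 2-day limit for unsent tasks and the 1-week deadline come from the post.

```python
from datetime import datetime, timedelta

UNSENT_LIMIT = timedelta(days=2)  # unsent tasks are cancelled after 2 days
DEADLINE = timedelta(weeks=1)     # usual deadline once a volunteer has the task


def task_state(created, now, sent=None, returned=False):
    """Classify a task using only the two rules described above."""
    if sent is None:
        # Never picked up by a volunteer: the 2-day rule applies.
        return "cancelled" if now - created > UNSENT_LIMIT else "unsent"
    if returned:
        return "returned"
    # Picked up by a volunteer: only the 1-week deadline matters from here on.
    return "available for another host" if now - sent > DEADLINE else "in progress"


created = datetime(2021, 3, 20)
sent = created + timedelta(days=1)
print(task_state(created, created + timedelta(days=3)))             # never sent
print(task_state(created, created + timedelta(days=5), sent=sent))  # sent on day 1
```

The second call shows the point of the clarification: a task that has been running for more than two days is not cancelled, because the 2-day rule only applies while the task is still unsent.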
David Cameron
Project administrator
Project developer
Project tester
Project scientist

Joined: 20 Apr 16
Posts: 180
Credit: 1,355,327
RAC: 0
Message 7150 - Posted: 23 Mar 2021, 10:05:43 UTC - in response to Message 7144.  
Last modified: 23 Mar 2021, 10:09:12 UTC

Let's say you have a 32 core machine. Is it better to run a single task using 32 cores? Or 4 tasks using 8 cores each? Or 8 tasks using 4 cores each? I have tried all three configurations and the credits per hour work out to be roughly the same. As far as I can tell, there is no advantage to me regardless of configuration.

So my question is, does the project have a preference?


Thanks for trying this out! It seems your 32-core tasks worked fine. As computezrmle said, the choice of cores should be based on your individual disk and memory set up. A single-core task uses ~3GB of memory and your 32-core tasks used 7GB because a lot of memory is shared between the different processes. In addition each task requires 8-10GB of disk space. There is a slight efficiency gain in running fewer cores but it is probably only a few percent.
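To compare the three configurations from the question, the figures above can be turned into a rough estimate. This is a sketch under a loud assumption: the post only gives memory for 1 core (~3GB) and 32 cores (~7GB), so the straight-line interpolation used for 4 and 8 cores is a guess, not a measurement. The helper names `est_memory_gb` and `footprint` are made up, and disk use is taken at the upper end of the 8-10GB range.

```python
DISK_GB_PER_TASK = 10  # upper end of the 8-10GB range quoted above


def est_memory_gb(cores):
    """Memory per task in GB.

    ASSUMPTION: linear interpolation between the two quoted points,
    ~3GB at 1 core and ~7GB at 32 cores. The real curve may differ.
    """
    return 3.0 + (7.0 - 3.0) * (cores - 1) / (32 - 1)


def footprint(tasks, cores_per_task):
    """Total memory and disk needed to run `tasks` tasks at the same time."""
    return {
        "memory_gb": round(tasks * est_memory_gb(cores_per_task), 1),
        "disk_gb": tasks * DISK_GB_PER_TASK,
    }


for tasks, cores in [(1, 32), (4, 8), (8, 4)]:
    print(f"{tasks} x {cores}-core:", footprint(tasks, cores))
```

Under these assumptions, splitting the same 32 cores into more, smaller tasks raises both the total memory and the total disk needed, which is why the choice depends on the individual machine.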
zombie67 [MM]
Joined: 26 Feb 15
Posts: 26
Credit: 4,101,356
RAC: 0
Message 7151 - Posted: 23 Mar 2021, 15:24:30 UTC
Last modified: 23 Mar 2021, 15:37:50 UTC

Am I the only one not getting work since about 10 hours ago?

Edit: Of course, as soon as I posted that, I started getting work again.
Michael Goetz
Joined: 20 Mar 21
Posts: 3
Credit: 22,351
RAC: 0
Message 7152 - Posted: 23 Mar 2021, 16:17:59 UTC - in response to Message 7151.  

I pinged you on Discord. The Linux machine I was setting up with cvmfs and singularity yesterday finished its last SoB, so I started it on the very long ATLAS tasks. It had no trouble getting a task.
Want to find one of the largest known primes? Try PrimeGrid. Or help cure disease at WCG.

zombie67 [MM]
Joined: 26 Feb 15
Posts: 26
Credit: 4,101,356
RAC: 0
Message 7153 - Posted: 23 Mar 2021, 19:03:48 UTC

Wow. Lots of download errors again. So many, that I am now getting:

lhcathome-dev 3/23/2021 12:02:00 PM This computer has finished a daily quota of 1 tasks
Michael Goetz
Joined: 20 Mar 21
Posts: 3
Credit: 22,351
RAC: 0
Message 7154 - Posted: 24 Mar 2021, 1:27:44 UTC - in response to Message 7153.  

Wow. Lots of download errors again. So many, that I am now getting:

lhcathome-dev 3/23/2021 12:02:00 PM This computer has finished a daily quota of 1 tasks


The one task I got earlier completed in about 8-9 hours. Since then, it's been all download errors.
Want to find one of the largest known primes? Try PrimeGrid. Or help cure disease at WCG.

David Cameron
Project administrator
Project developer
Project tester
Project scientist

Joined: 20 Apr 16
Posts: 180
Credit: 1,355,327
RAC: 0
Message 7155 - Posted: 24 Mar 2021, 11:11:50 UTC

This should be better now. I have removed all the tasks with missing input files.
Michael Goetz
Joined: 20 Mar 21
Posts: 3
Credit: 22,351
RAC: 0
Message 7156 - Posted: 24 Mar 2021, 14:01:32 UTC

I've successfully run the very long ATLAS tasks on a Debian 10 (Buster) Intel Haswell i5-4670K system, on 4 cores, and am currently also running them on a Debian 9 (Stretch) AMD Zen2 Ryzen 7 3700X system on 8 cores.

What exactly are you looking for us to test? Is there anything specific I should be testing?

In about 10 days, I'll also have an old i3-330M laptop available, but I'm not sure if it can finish one of these tasks in 7 days.
Want to find one of the largest known primes? Try PrimeGrid. Or help cure disease at WCG.

David Cameron
Project administrator
Project developer
Project tester
Project scientist

Joined: 20 Apr 16
Posts: 180
Credit: 1,355,327
RAC: 0
Message 7157 - Posted: 24 Mar 2021, 14:36:58 UTC - in response to Message 7156.  

Thanks for testing this out. The computations in each task are the same as in the regular tasks on LHC@Home but process 5 times as much data. The idea of testing here was to see if anything strange happens and to see if anything needs to be tweaked in the BOINC server configuration to handle these tasks. It seems like it mostly runs smoothly so we can go to the next step of releasing the app officially on LHC@Home.

The long tasks are mainly for powerful dedicated machines so old laptops would be better put to work on the regular short tasks. People should only opt-in to the long task app if their machines can handle it.
maeax

Joined: 22 Apr 16
Posts: 659
Credit: 1,719,912
RAC: 3,195
Message 7158 - Posted: 24 Mar 2021, 15:34:12 UTC

12 CPUs in CentOS8-VM:
cpuconsumptiontime: 206826 s, BUT only 10% of the credit as before :-(
[2021-03-24 16:14:00] 2021-03-24 15:13:51,518 | INFO | retrieve | pilot.control.job | make_job_report | corecount: 12
https://lhcathomedev.cern.ch/lhcathome-dev/workunit.php?wuid=2064054
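For scale, the two numbers in the log lines above can be converted into hours. This is plain arithmetic, not anything the project computes: it assumes all 12 cores were busy for the whole run, so the wall-clock figure is an ideal lower bound, and the variable names are made up.

```python
cpu_seconds = 206826  # cpuconsumptiontime from the log line above
cores = 12            # corecount from the log line above

# ASSUMPTION: perfect use of all 12 cores, which real tasks will not quite reach.
wall_seconds = cpu_seconds / cores
cpu_hours = cpu_seconds / 3600
wall_hours = wall_seconds / 3600

print(f"CPU time: {cpu_hours:.1f} h, ideal wall-clock time: {wall_hours:.1f} h")
```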
Sergey Kovalchuk

Joined: 11 Mar 16
Posts: 23
Credit: 68,680
RAC: 0
Message 7159 - Posted: 24 Mar 2021, 15:34:45 UTC

finished the tasks of the first wave of March 18 and can no longer get new ones
are there any restrictions on host parameters?

since we are already talking about the official release - note new tasks do not count as ATLAS for badges
zombie67 [MM]
Joined: 26 Feb 15
Posts: 26
Credit: 4,101,356
RAC: 0
Message 7160 - Posted: 24 Mar 2021, 16:10:47 UTC

Uploads are taking 2-3 hours each.
maeax

Joined: 22 Apr 16
Posts: 659
Credit: 1,719,912
RAC: 3,195
Message 7161 - Posted: 24 Mar 2021, 16:40:49 UTC

AMD Ryzen 3950X with water cooling, download/upload 70/30 MBit.
These tasks are very special and need a lot of power.
I have only this PC, which I can let run for a quarter of the day.
David Cameron
Project administrator
Project developer
Project tester
Project scientist

Joined: 20 Apr 16
Posts: 180
Credit: 1,355,327
RAC: 0
Message 7162 - Posted: 24 Mar 2021, 17:27:48 UTC - in response to Message 7159.  

finished the tasks of the first wave of March 18 and can no longer get new ones
are there any restrictions on host parameters?

since we are already talking about the official release - note new tasks do not count as ATLAS for badges


Good point, I'll think about how to include the credits for ATLAS badges.

I'm going to stop submitting new tasks here now and set up the app on the production LHC@Home server. I'll announce on the message boards there when it's ready.

Thanks everyone for all your help here!
David Cameron
Project administrator
Project developer
Project tester
Project scientist

Joined: 20 Apr 16
Posts: 180
Credit: 1,355,327
RAC: 0
Message 7163 - Posted: 25 Mar 2021, 13:07:39 UTC
Last modified: 25 Mar 2021, 13:07:55 UTC

Long tasks are now available on LHC@Home (in a beta application): https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5625
Yeti
Joined: 29 May 15
Posts: 147
Credit: 2,842,484
RAC: 0
Message 7164 - Posted: 25 Mar 2021, 13:56:59 UTC - in response to Message 7163.  

Long tasks are now available on LHC@Home (in a beta application): https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5625

When can Windows-only users expect to run these?

Maybe we can test it here ...
rbpeake

Joined: 15 Apr 15
Posts: 38
Credit: 227,251
RAC: 0
Message 7166 - Posted: 26 Mar 2021, 18:00:40 UTC - in response to Message 7164.  

I will second the question: when will this be available for Windows users?

Thanks!