Message boards : ATLAS Application : Tasks testing new pilot version

David Cameron
Project administrator
Project developer
Project tester
Project scientist

Joined: 20 Apr 16
Posts: 180
Credit: 1,355,327
RAC: 0
Message 6369 - Posted: 16 May 2019, 12:27:50 UTC

Hi all,

We are putting some tasks here which use a brand new version of the ATLAS "pilot": this is the tool which controls the execution of the task from start to finish. Please let us know if you see any strange behaviour, in particular if there are connections to external services or ports which were not there before.
maeax

Joined: 22 Apr 16
Posts: 659
Credit: 1,719,716
RAC: 3,335
Message 6370 - Posted: 16 May 2019, 13:55:58 UTC
Last modified: 16 May 2019, 13:56:49 UTC

The message is:
No new tasks available for the moment.
https://lhcathomedev.cern.ch/lhcathome-dev/results.php?hostid=2244
computezrmle
Volunteer moderator
Project tester
Volunteer developer
Volunteer tester
Help desk expert

Joined: 28 Jul 16
Posts: 467
Credit: 389,411
RAC: 503
Message 6371 - Posted: 17 May 2019, 12:24:16 UTC

I already mentioned a firewall issue a while ago at LHC-prod regarding pandaserver.cern.ch, port 25085:
https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5008&postid=38684

The same happens here and should be solved.
So far no other firewall issues can be seen with v0.62.


The currently running 1-core task will need a while to complete.
The logfile shows this:
14:20:00 ISFG4SimSvc          INFO       Event nr. 9 took 84.16 s. New average 125.3 +- 25.92
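
Per-event timings like the one above can be pulled out of the log with a small script. This is a minimal sketch, assuming every timing line follows the format of the single line quoted here; the names `EVENT_RE` and `parse_event_line` are illustrative and not part of any ATLAS tooling.

```python
import re

# Pattern modeled on the single log line quoted above; other lines may differ.
EVENT_RE = re.compile(
    r"Event nr\. (?P<nr>\d+) took (?P<secs>[\d.]+) s\. "
    r"New average (?P<avg>[\d.]+) \+- (?P<err>[\d.]+)"
)


def parse_event_line(line):
    """Return (event_nr, seconds, average, spread), or None if the line does not match."""
    m = EVENT_RE.search(line)
    if m is None:
        return None
    return int(m["nr"]), float(m["secs"]), float(m["avg"]), float(m["err"])


line = (
    "14:20:00 ISFG4SimSvc          INFO       "
    "Event nr. 9 took 84.16 s. New average 125.3 +- 25.92"
)
print(parse_event_line(line))
```

Multiplying the reported average by the number of events still to process would give a rough estimate of the remaining runtime, but the total event count is not shown in this line, so it would have to come from elsewhere.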
computezrmle
Volunteer moderator
Project tester
Volunteer developer
Volunteer tester
Help desk expert

Joined: 28 Jul 16
Posts: 467
Credit: 389,411
RAC: 503
Message 6372 - Posted: 17 May 2019, 19:38:23 UTC

My proxy log shows that a HITS file has been produced and successfully uploaded:
[17/May/2019:21:08:35 +0200] "POST http://lhcathome-test.cern.ch/lhcathome-dev_cgi/file_upload_handler HTTP/1.1" 200 110797778 "-" "BOINC client (x86_64-suse-linux-gnu 7.14.2)" TCP_MISS:HIER_DIRECT
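
A check like this can be scripted against the proxy log. This is a minimal sketch, assuming the log layout matches the one line quoted above; the pattern and the function names are illustrative, and the exact meaning of the byte count depends on how the proxy's log format is configured.

```python
import re

# Pattern modeled on the one proxy log line quoted above; a proxy with a
# different log format will need a different pattern.
ACCESS_RE = re.compile(
    r'\[(?P<time>[^\]]+)\] "(?P<method>\S+) (?P<url>\S+) (?P<proto>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+) '
)


def parse_access_line(line):
    """Return the method, URL, status and byte count of a log line, or None."""
    m = ACCESS_RE.search(line)
    if m is None:
        return None
    return {
        "method": m["method"],
        "url": m["url"],
        "status": int(m["status"]),
        "size": int(m["size"]),
    }


def is_successful_upload(line):
    """True if the line records a POST to the upload handler answered with HTTP 200."""
    entry = parse_access_line(line)
    return (
        entry is not None
        and entry["method"] == "POST"
        and entry["url"].endswith("file_upload_handler")
        and entry["status"] == 200
    )


line = (
    '[17/May/2019:21:08:35 +0200] "POST http://lhcathome-test.cern.ch/'
    'lhcathome-dev_cgi/file_upload_handler HTTP/1.1" 200 110797778 "-" '
    '"BOINC client (x86_64-suse-linux-gnu 7.14.2)" TCP_MISS:HIER_DIRECT'
)
print(parse_access_line(line))
print(is_successful_upload(line))
```

An HTTP 200 only shows that the upload request was accepted; it says nothing about how the validator later judges the result.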


Nonetheless the task is marked as invalid.
https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=2777605
maeax

Joined: 22 Apr 16
Posts: 659
Credit: 1,719,716
RAC: 3,335
Message 6373 - Posted: 18 May 2019, 15:28:33 UTC - in response to Message 6372.  

gyllic

Joined: 10 Mar 17
Posts: 40
Credit: 108,345
RAC: 0
Message 6374 - Posted: 19 May 2019, 18:14:11 UTC - in response to Message 6373.  

David Cameron
Project administrator
Project developer
Project tester
Project scientist

Joined: 20 Apr 16
Posts: 180
Credit: 1,355,327
RAC: 0
Message 6375 - Posted: 20 May 2019, 8:50:12 UTC

Oops, there was a bug in the new validator used for these tasks; it should be fixed now. The tasks did complete correctly, so it seems like things work OK.
maeax

Joined: 22 Apr 16
Posts: 659
Credit: 1,719,716
RAC: 3,335
Message 6378 - Posted: 20 May 2019, 14:32:12 UTC

The new task from today has also not been validated.
https://lhcathomedev.cern.ch/lhcathome-dev/workunit.php?wuid=1898186
gyllic

Joined: 10 Mar 17
Posts: 40
Credit: 108,345
RAC: 0
Message 6380 - Posted: 20 May 2019, 17:23:34 UTC

This task ran and validated successfully:
https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=2778502

Looking good.
gyllic

Joined: 10 Mar 17
Posts: 40
Credit: 108,345
RAC: 0
Message 6381 - Posted: 20 May 2019, 18:06:23 UTC - in response to Message 6371.  
Last modified: 20 May 2019, 18:07:28 UTC

I already mentioned a firewall issue a while ago at LHC-prod regarding pandaserver.cern.ch, port 25085:
https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5008&postid=38684

The same happens here and should be solved.
I also see connections to this port, especially at task start up. The logs (only logged for one task) show connections to aipanda034.cern.ch on port 25085
maeax

Joined: 22 Apr 16
Posts: 659
Credit: 1,719,716
RAC: 3,335
Message 6382 - Posted: 21 May 2019, 5:51:00 UTC - in response to Message 6380.  
Last modified: 21 May 2019, 5:51:21 UTC

David Cameron
Project administrator
Project developer
Project tester
Project scientist

Joined: 20 Apr 16
Posts: 180
Credit: 1,355,327
RAC: 0
Message 6383 - Posted: 21 May 2019, 13:15:42 UTC - in response to Message 6381.  

I already mentioned a firewall issue a while ago at LHC-prod regarding pandaserver.cern.ch, port 25085:
https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5008&postid=38684

The same happens here and should be solved.
I also see connections to this port, especially at task start up. The logs (only logged for one task) show connections to aipanda034.cern.ch on port 25085


I've made a change to avoid these connections, which will apply to new WUs submitted from now on. If the problem is confirmed to be fixed, I'll apply the change to the production WUs too.
gyllic

Joined: 10 Mar 17
Posts: 40
Credit: 108,345
RAC: 0
Message 6384 - Posted: 21 May 2019, 19:52:49 UTC - in response to Message 6383.  

I've made a change to avoid these connections which will apply to new WU submitted from now. If the problem is confirmed to be fixed I'll apply the changes on the production WU too.
Tested one task (https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=2778990) and it is looking good. The logs show no connections to ports that are not mentioned in http://lhcathome.web.cern.ch/test4theory/my-firewall-complaining-which-ports-does-project-use
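
This kind of check can be automated by comparing the logged connections against the documented ports. This is a minimal sketch, assuming the connections have already been extracted from the firewall log as (host, port) pairs; `ALLOWED_PORTS` is a placeholder and not the project's actual list, and the sample entries are illustrative only.

```python
# Placeholder allowlist: NOT the project's actual list. Fill it in from the
# firewall page linked above.
ALLOWED_PORTS = {80, 443}


def unexpected_connections(connections, allowed_ports=ALLOWED_PORTS):
    """Return the sorted, de-duplicated (host, port) pairs whose port is not allowed."""
    return sorted({(host, port) for host, port in connections if port not in allowed_ports})


# Illustrative sample; the second host and port are the ones reported in this thread.
observed = [
    ("lhcathome-test.cern.ch", 80),
    ("aipanda034.cern.ch", 25085),
    ("aipanda034.cern.ch", 25085),
]
print(unexpected_connections(observed))
```

An empty result would correspond to the outcome described here: no connections to ports outside the documented list.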
rbpeake

Joined: 15 Apr 15
Posts: 38
Credit: 227,251
RAC: 0
Message 6392 - Posted: 29 May 2019, 18:13:01 UTC - in response to Message 6369.  

...We are putting some tasks here which use a brand new version of the ATLAS "pilot": this is the tool which controls the execution of the task from start to finish....

What practical effect will this have? Will it increase processing efficiency for the user?
Thanks!
David Cameron
Project administrator
Project developer
Project tester
Project scientist

Joined: 20 Apr 16
Posts: 180
Credit: 1,355,327
RAC: 0
Message 6393 - Posted: 31 May 2019, 9:57:07 UTC - in response to Message 6392.  

...We are putting some tasks here which use a brand new version of the ATLAS "pilot": this is the tool which controls the execution of the task from start to finish....

What practical effect will this have? Will it increase processing efficiency for the user?
Thanks!


It won't have any effect on the efficiency because the code doing the simulation itself is not changing. The change is to the tool which launches and monitors the simulation code, not only for ATLAS@Home but across the whole ATLAS computing grid. This tool is being refactored and rewritten after ten years of accumulating legacy features, and is being rolled out on all the ATLAS sites over the coming months. So the practical effect for volunteers should be zero, except maybe some minor changes in the logs you see on the stderr of your tasks.
maeax

Joined: 22 Apr 16
Posts: 659
Credit: 1,719,716
RAC: 3,335
Message 6401 - Posted: 25 Jun 2019, 11:49:07 UTC

I have a question about SL69:
After some days of searching, SL69 is now running with CVMFS2.
Before, it was only running with CVMFS.

Previously I ran it with a script from atlas.cern.ch.

Now it can be started after running this console command:
chcon -R -t cvmfs_cache_t /scratch/cvmfs

https://lhcathomedev.cern.ch/lhcathome-dev/workunit.php?wuid=1906567
maeax

Joined: 22 Apr 16
Posts: 659
Credit: 1,719,716
RAC: 3,335
Message 6402 - Posted: 28 Jun 2019, 7:54:03 UTC

I had a power outage at home.
After the restart, this ATLAS task finished successfully, but its runtime is not the same as for earlier tasks.
https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=2787228
Is this task OK?


