Message boards : ATLAS Application : ATLAS vbox v.1.17
David Cameron
Project administrator
Project developer
Project tester
Project scientist

Joined: 20 Apr 16
Posts: 180
Credit: 1,355,327
RAC: 0
Message 7826 - Posted: 19 Oct 2022, 12:43:37 UTC

Hi all,

v1.17 of vbox contains some fixes to the CVMFS configuration, provided by computezrmle, which should address some of the problems people see with CVMFS being stuck or not working at the start of tasks. We are testing it here on dev to make sure it works before releasing it on the prod server.
Crystal Pellet
Volunteer tester

Joined: 13 Feb 15
Posts: 1178
Credit: 810,985
RAC: 2,009
Message 7827 - Posted: 19 Oct 2022, 16:38:47 UTC - in response to Message 7826.  

I tested several tasks. Not sure what problem should be solved.
https://lhcathomedev.cern.ch/lhcathome-dev/results.php?hostid=4547&offset=0&show_names=0&state=0&appid=5
https://lhcathomedev.cern.ch/lhcathome-dev/results.php?hostid=4600&offset=0&show_names=0&state=0&appid=5
On host ID 4547 I had three errors. One was among the first 5 tasks, which all started at once. On that task I got a popup reporting a VBoxHeadless.exe application error.

With the 2 other tasks I tested the network connection problem by interrupting my internet connection when starting an ATLAS task.
I got "Testing CVMFS ....", of course without a response. The tasks kept running endlessly.
After reconnecting to the internet nothing positive happened either: the tasks kept running without doing anything.
So I suspended the tasks, removed the saved states, rebooted the VMs, and saved the state after more than 5 minutes of runtime.
Then I resumed the tasks in BOINC.
One task got the same VBoxHeadless application error as above, and the other was cancelled by the server although it was running well.
computezrmle
Volunteer moderator
Project tester
Volunteer developer
Volunteer tester
Help desk expert

Joined: 28 Jul 16
Posts: 467
Credit: 389,411
RAC: 503
Message 7828 - Posted: 19 Oct 2022, 18:48:24 UTC

Looks fine.
All changes seem to work as expected.
maeax

Joined: 22 Apr 16
Posts: 659
Credit: 1,719,912
RAC: 3,195
Message 7829 - Posted: 20 Oct 2022, 6:58:03 UTC - in response to Message 7826.  
Last modified: 20 Oct 2022, 7:34:13 UTC

For Win11 Pro, no tasks available.
OK, the native parameter was active.
Running concurrently with 6 tasks (10 CPUs) from production.
https://lhcathomedev.cern.ch/lhcathome-dev/show_host_detail.php?hostid=4639
2022-10-20 09:07:41 (27964): Guest Log: Checking CVMFS...
2022-10-20 09:07:44 (27964): Guest Log: CVMFS is ok
I am not using Squid on the Threadripper.
If this solution goes into production and runs well, I will test Squid again.
computezrmle
Volunteer moderator
Project tester
Volunteer developer
Volunteer tester
Help desk expert

Joined: 28 Jul 16
Posts: 467
Credit: 389,411
RAC: 503
Message 7830 - Posted: 20 Oct 2022, 7:43:46 UTC - in response to Message 7827.  

Crystal Pellet wrote: "Not sure what problem should be solved."

1. The start order of the systemd services was undefined, hence random.
It is now explicitly defined that an ATLAS job doesn't start before autofs (= CVMFS mounts) and the shared folder mount are available.

2. Some quoting issues that affected the CVMFS fail-over were fixed.
old: fail-over didn't work when the backend behind s1cern-cvmfs.openhtc.io was down
new: CVMFS now correctly switches to a fail-over server and automatically switches back when the closest server is available again
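
To picture the ordering constraint from point 1, here is a minimal Python sketch that treats service start-up as a dependency graph. It is not the actual systemd configuration inside the VM: the unit names (`atlas-job.service`, `autofs.service`, `shared-folder.mount`) are made up for illustration, and `graphlib` (Python 3.9+) only stands in for what systemd does with explicit ordering dependencies.

```python
from graphlib import TopologicalSorter

# Hypothetical unit names, used only for illustration.
ATLAS_JOB = "atlas-job.service"
AUTOFS = "autofs.service"
SHARED_MOUNT = "shared-folder.mount"


def start_order(prerequisites):
    """Return one valid start order in which every unit follows its prerequisites."""
    return list(TopologicalSorter(prerequisites).static_order())


# Each key maps to the set of units that must be available before it starts.
explicit = {
    ATLAS_JOB: {AUTOFS, SHARED_MOUNT},
    AUTOFS: set(),
    SHARED_MOUNT: set(),
}

order = start_order(explicit)
print(order)  # the ATLAS job is always last; the two prerequisites may come in either order
```

Without the entry for the ATLAS job, nothing ties the three units together and any permutation would be a valid start order, which is the "undefined, hence random" situation described above.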
Crystal Pellet
Volunteer tester

Joined: 13 Feb 15
Posts: 1178
Credit: 810,985
RAC: 2,009
Message 7832 - Posted: 21 Oct 2022, 8:54:37 UTC

Tested 1 ATLAS-task with the newest VirtualBox version 7.0.2:
https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=3126629
maeax

Joined: 22 Apr 16
Posts: 659
Credit: 1,719,912
RAC: 3,195
Message 7833 - Posted: 21 Oct 2022, 9:07:11 UTC - in response to Message 7832.  

2022-10-20 21:00:03 (18608): Detected: VirtualBox VboxManage Interface (Version: 6.1.40)
Tasks from yesterday also ran without problems.
Crystal Pellet
Volunteer tester

Joined: 13 Feb 15
Posts: 1178
Credit: 810,985
RAC: 2,009
Message 7867 - Posted: 8 Nov 2022, 12:28:42 UTC

@Laurence and/or David,

Are there any plans to update vboxwrapper to the officially released version 26206 here on dev and on production?
computezrmle
Volunteer moderator
Project tester
Volunteer developer
Volunteer tester
Help desk expert

Joined: 28 Jul 16
Posts: 467
Credit: 389,411
RAC: 503
Message 7868 - Posted: 8 Nov 2022, 12:59:23 UTC - in response to Message 7867.  

Theory and CMS are already updated.
ATLAS may follow next week I guess.
maeax

Joined: 22 Apr 16
Posts: 659
Credit: 1,719,912
RAC: 3,195
Message 7877 - Posted: 10 Nov 2022, 19:55:18 UTC - in response to Message 7867.  

2022-11-10 16:51:55 (27948): Guest Log: Checking CVMFS...
2022-11-10 16:52:05 (27948): Guest Log: 00:00:10.025474 timesync vgsvcTimeSyncWorker: Radical guest time change: -3 588 532 149 000ns (GuestNow=1 668 095 524 624 523 000 ns GuestLast=1 668 099 113 156 672 000 ns fSetTimeLastLoop=true )
2022-11-10 18:31:47 (27948): Status Report: Elapsed Time: '6000.000000'
2022-11-10 18:31:47 (27948): Status Report: CPU Time: '5983.421875'
2022-11-10 20:11:54 (27948): Status Report: Elapsed Time: '12000.000000'
2022-11-10 20:11:54 (27948): Status Report: CPU Time: '11972.406250'
2022-11-10 20:44:36 (27948): Powering off VM.
2022-11-10 20:44:37 (27948): Successfully stopped VM.
2022-11-10 20:44:37 (27948): Deregistering VM. (boinc_690a80b3b9f01bd0, slot#8)
Computer ID 4639
Run time 3 hours 53 min 7 sec
CPU time 3 hours 52 min 8 sec
Validate state Invalid
https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=3144569
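
For anyone reading the timesync line in the log above, here is a small Python check of the arithmetic. It only uses the two timestamps printed in that line (GuestNow and GuestLast, in nanoseconds) and does not say anything about why the jump happened or whether it is related to the invalid result.

```python
NS_PER_SECOND = 1_000_000_000

# Values copied from the "Radical guest time change" log line.
guest_now_ns = 1_668_095_524_624_523_000
guest_last_ns = 1_668_099_113_156_672_000

# A negative delta means the guest clock was set backwards.
delta_ns = guest_now_ns - guest_last_ns
delta_s = delta_ns / NS_PER_SECOND

print(delta_ns)           # -3588532149000, matching the value in the log
print(round(delta_s, 3))  # -3588.532
```

So the guest clock was moved back by roughly 3588.5 seconds, i.e. just under one hour.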



©2024 CERN