Message boards : ATLAS Application : Testing CentOS 7 vbox image
Author | Message |
---|---|
Joined: 20 Apr 16 Posts: 180 Credit: 1,355,327 RAC: 0 |
- Rarely not uploading a result, although HITS-file is produced (with 200 events tasks very annoying). This is the most serious problem in my opinion and a good reason to use the LHC wrapper. I just released 0.85 for windows which uses v26198ab7 so let's see if it helps with this problem. That's up to you. I'm testing with Windows 7 and VBox 6.0.12. No idea how Linux and Windows 10 will do. I'm running Linux with VBox 6.0.12 and everything works ok so I kept the old wrapper for now. Most Linux users would run the native version anyway so this is not as important as Windows. |
Joined: 13 Feb 15 Posts: 1188 Credit: 878,593 RAC: 20 |
"I just released 0.85 for windows which uses v26198ab7 so let's see if it helps with this problem." I got some tasks for the LHC wrapper. First one: https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=2822886 I noticed you distribute vboxwrapper_26198ab1_windows_x86_64.pdb, which was not created from the source of the ab7 version and is therefore probably useless. Even with the correct version, .pdb files are normally only needed when developing a new wrapper. By the way, several BOINC projects using VirtualBox, such as Cosmology@Home, do not distribute the .pdb files at all. |
Joined: 13 Feb 15 Posts: 1188 Credit: 878,593 RAC: 20 |
There was an issue with task https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=2822826 . After 30 minutes of uptime no athena processes were running. I suppose there was a temporary network problem reaching a server. |
Joined: 22 Apr 16 Posts: 677 Credit: 2,002,766 RAC: 0 |
This task has now finished under Win10 Pro. Graphics and RDP are now active (10 collisions instead of 200). I had 3 cores running via app_config. The wrapper is as Crystal described. VirtualBox 5.2.32. TOP under F3 scrolls every 5 seconds. https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=2822874 2019-09-18 19:25:46 (5692): Guest Log: HITS file was successfully produced 2019-09-18 19:25:46 (5692): Guest Log: -rw-------. 1 atlas atlas 9186657 Sep 18 17:24 /home/atlas/RunAtlas/HITS.000649-403691-15393._078090.pool.root.1 |
Joined: 22 Apr 16 Posts: 677 Credit: 2,002,766 RAC: 0 |
On all three Win10 Pro machines ATLAS has produced a HITS file: 2019-09-18 20:36:22 (13484): Guest Log: HITS file was successfully produced 2019-09-18 20:36:22 (13484): Guest Log: -rw-------. 1 atlas atlas 9192759 Sep 18 18:34 /home/atlas/RunAtlas/HITS.000649-404167-26542._078090.pool.root.1 2019-09-18 21:58:19 (7476): Guest Log: HITS file was successfully produced 2019-09-18 21:58:19 (7476): Guest Log: -rw-------. 1 atlas atlas 9044926 Sep 18 19:55 /home/atlas/RunAtlas/HITS.000649-403632-27311._078090.pool.root.1 The third is in the previous message. Thank you, David, for finding a wrapper that works with Win10 Pro. If there is a newer one, we can test it again. The .vdi was still 084 as before and was not downloaded again. |
Joined: 13 Feb 15 Posts: 1188 Credit: 878,593 RAC: 20 |
"On all three Win10 Pro machines ATLAS has produced a HITS file:" Good to see that switching to v0.85 with the VBoxManage interface solved your problem on Win10. The cause could also be related to the VirtualBox version, 5.2.32, which you are running on all three Win10 machines with vboxwrapper_26202, possibly in combination with the AMD processors. Is there a reason not to update to VBox 6.0.12? v5.2.32 is now only recommended for 32-bit systems, and Oracle support for it will stop in July 2020. |
Joined: 22 Apr 16 Posts: 677 Credit: 2,002,766 RAC: 0 |
Hi Crystal, no, there is no reason, except that it is work for five machines and the Linux VMs ;-) It will happen this year. One machine has a problem upgrading to 10.00.18362.00 (nine attempts without success, each time a 3 GB OS download!). Once that succeeds in October, I will upgrade VirtualBox to 6.0.x. I am happy to see that it works now. If there is a better vboxwrapper version, we can test it. |
Joined: 20 Apr 16 Posts: 180 Credit: 1,355,327 RAC: 0 |
"'I just released 0.85 for windows which uses v26198ab7 so let's see if it helps with this problem.' I got some tasks for the LHC wrapper. First one: https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=2822886" The ATLAS WUs have used the pdb from the beginning. I always copied the setup from the previous app version without questioning whether it was useful or not :) Good to see that other problems were fixed with this wrapper; I will try to pass this information upstream to the developers. I notice this version of the wrapper adds an extra newline to each guest log message, which is a bit annoying, so I will try to remove it. |
Joined: 20 Apr 16 Posts: 180 Credit: 1,355,327 RAC: 0 |
I just made v0.86 which doesn't use the pdb, let's see if it still works. |
Joined: 22 Apr 16 Posts: 677 Credit: 2,002,766 RAC: 0 |
No problems: https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=2823029 2019-09-19 14:17:15 (3260): Guest Log: HITS file was successfully produced 2019-09-19 14:17:15 (3260): Guest Log: -rw-------. 1 atlas atlas 9055739 Sep 19 12:14 /home/atlas/RunAtlas/HITS.000649-2620510-10830._078090.pool.root.1 Edit: The VM needs to be deleted manually in VirtualBox: 2019-09-19 14:17:15 (3260): VM Completion File Detected. 2019-09-19 14:17:15 (3260): Powering off VM. 2019-09-19 14:22:18 (3260): VM did not power off when requested. 2019-09-19 14:22:18 (3260): VM was successfully terminated. 2019-09-19 14:22:18 (3260): Deregistering VM. (boinc_619308a4cd5573bb, slot#2 |
Joined: 13 Feb 15 Posts: 1188 Credit: 878,593 RAC: 20 |
"Edit: The VM needs to be deleted manually in VirtualBox:" This message is not significant. It has always been that way in the LHC wrapper and is only cosmetic. If Laurence gets bored he could have a look at it. Although the VM is powered off immediately (you can see it in VirtualBox Manager), it is only cleaned up 5 minutes later. The wrapper reporting "VM did not power off when requested." must be a bug in the script, because by then the VM has already been off for 5 minutes. See my Theory results, which show the same message towards the end: https://lhcathome.cern.ch/lhcathome/results.php?hostid=10360630&offset=0&show_names=0&state=4&appid=13 The remnant you found in VirtualBox Manager was maybe from this older task of yours: https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=2822072 |
Joined: 22 Apr 16 Posts: 677 Credit: 2,002,766 RAC: 0 |
Yes Crystal, the old VM was from another task. I have started a new task and its VM was deleted correctly. 2019-09-19 18:04:03 (12396): Guest Log: HITS file was successfully produced 2019-09-19 18:04:03 (12396): Guest Log: -rw-------. 1 atlas atlas 9192759 Sep 19 16:02 /home/atlas/RunAtlas/HITS.000649-2895230-8215._078090.pool.root.1 |
Joined: 22 Apr 16 Posts: 677 Credit: 2,002,766 RAC: 0 |
Now we have no new tasks. |
Joined: 20 Apr 16 Posts: 180 Credit: 1,355,327 RAC: 0 |
I have started sending some real tasks here (the same tasks which are being sent to the production server). |
Joined: 13 Feb 15 Posts: 1188 Credit: 878,593 RAC: 20 |
"I have started sending some real tasks here (the same tasks which are being sent to the production server)." Surprise, surprise. I didn't expect new tasks this evening. Suddenly I got a few, and one started immediately with my setting of 2 cores. That will take a long time, because of the 200 events and the more complicated particle interactions. I suspended the task, discarded the saved state with VirtualBox Manager, and changed the settings from 2 to 4 cores and from 4800 to 6600 MB RAM. I started the VM myself without using BOINC and let it run for a few minutes. Then I saved the VM and resumed the task in BOINC. In Remote Display, 4 athena processes are running. |
Joined: 13 Feb 15 Posts: 1188 Credit: 878,593 RAC: 20 |
"That will take a long time, because of the 200 events and the more complicated particle interactions." Event processing is taking between 1599 seconds/event and 2947 seconds/event (58 done so far). That makes 31 hours and 40 minutes for the whole task, plus some time for the suspend tests. First suspend test: with "Leave applications in memory" (LAIM) off. To give the test a chance to succeed, I first suspended the 4 Theory tasks running alongside the ATLAS task, so no other threads were busy. OK, the first suspend test went well. Saving took 29.3 seconds, which is fast, but the rest of the system was almost idle. The save-set size on disk was 3,415,400,448 bytes and vm_image.vdi in the slot was almost 4 GB. Resuming the task went well. To speed up the task a bit, I will run fewer Theory tasks (or none) alongside the ATLAS task. |
Joined: 22 Apr 16 Posts: 677 Credit: 2,002,766 RAC: 0 |
Crystal, I have the same trouble ;-) One computer got an unexpected task, but for the test that is OK. Is it possible to run it natively? -dev has no test parameter in the preferences. Edit: At the moment I see a lot of lines in RDP containing only 222222, then a new line with the event number. |
Joined: 20 Apr 16 Posts: 180 Credit: 1,355,327 RAC: 0 |
I have disabled the native app here for now because the purpose is to test the new VBox image. I have a couple of successful tasks so far, but maybe I got lucky (or you were unlucky) since the time per event is around 400s. The RDP console looks ok for me. Edit: I suspended one task after 3 hours of running; the save took 35 s for a 3.5 GB vdi. On resume the task continued from where it was suspended. |
Joined: 13 Feb 15 Posts: 1188 Credit: 878,593 RAC: 20 |
"I have a couple of successful tasks so far, but maybe I got lucky (or you were unlucky) since the time per event is around 400s. The RDP console looks ok for me." You must be lucky. I am still busy with my first task, with 82 events to go. Although your machine is a 4790 and mine a 2600, yours will not be twice as fast. Since I'm running only this ATLAS task, the event times have decreased to between 607 and 1555 seconds. You are probably smashing different hadrons. |
Joined: 13 Feb 15 Posts: 1188 Credit: 878,593 RAC: 20 |
I suspended my first long runner a second time after 141 events had been processed. Saving took 28.3 seconds on an otherwise idle system. Save-set on disk: 3,313,393,664 bytes. VM image in slot: 4211 MB. |
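
Several posts in this thread confirm a successful task by quoting the `Guest Log` lines that list the HITS file. As a minimal sketch, the timestamp, wrapper PID, file size and path can be pulled out of such lines with a regular expression. The pattern and the names `HITS_LINE`, `parse_hits_lines`, `SAMPLE` and `RECORDS` are illustrative assumptions derived only from the log excerpts quoted in the thread; they are not part of vboxwrapper or the ATLAS application, and other log layouts may not match.

```python
import re
from datetime import datetime

# Layout inferred from the log excerpts quoted in this thread (an assumption):
# "<date> <time> (<pid>): Guest Log: <mode> <links> <user> <group> <size> <Mon> <day> <hh:mm> <path>"
HITS_LINE = re.compile(
    r"^(?P<stamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"\((?P<pid>\d+)\): Guest Log: "
    r"\S+ \d+ \S+ \S+ (?P<size>\d+) "
    r"\w{3} +\d+ \d{2}:\d{2} "
    r"(?P<path>\S*HITS\.\S+)$"
)


def parse_hits_lines(log_text):
    """Return one dict per Guest Log line that lists a HITS file."""
    records = []
    for line in log_text.splitlines():
        match = HITS_LINE.match(line.strip())
        if match:
            records.append({
                "time": datetime.strptime(match["stamp"], "%Y-%m-%d %H:%M:%S"),
                "pid": int(match["pid"]),
                "size_bytes": int(match["size"]),
                "path": match["path"],
            })
    return records


# Lines copied from the posts above; the "successfully produced" lines carry
# no file details and are skipped by the pattern.
SAMPLE = """
2019-09-18 20:36:22 (13484): Guest Log: HITS file was successfully produced
2019-09-18 20:36:22 (13484): Guest Log: -rw-------. 1 atlas atlas 9192759 Sep 18 18:34 /home/atlas/RunAtlas/HITS.000649-404167-26542._078090.pool.root.1
2019-09-18 21:58:19 (7476): Guest Log: HITS file was successfully produced
2019-09-18 21:58:19 (7476): Guest Log: -rw-------. 1 atlas atlas 9044926 Sep 18 19:55 /home/atlas/RunAtlas/HITS.000649-403632-27311._078090.pool.root.1
"""

RECORDS = parse_hits_lines(SAMPLE)

if __name__ == "__main__":
    for record in RECORDS:
        print(record["time"], record["pid"], record["size_bytes"], record["path"])
```

A check like this only shows that the HITS file was produced inside the VM; it says nothing about whether the result was uploaded, which is the separate problem discussed at the start of the thread.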
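
The manual resize described earlier in the thread (discard the saved state, change from 2 to 4 cores and from 4800 to 6600 MB RAM, start the VM, save it again) was done in the VirtualBox Manager GUI. A rough command-line counterpart can be sketched with the `VBoxManage` subcommands `discardstate`, `modifyvm`, `startvm` and `controlvm ... savestate`. The Python sketch only builds and prints the command lines rather than running them. The VM name `boinc_example_vm` is a hypothetical placeholder (the real name appears in the wrapper log and in VirtualBox Manager), and the function name `resize_vm_commands` is likewise an assumption for illustration.

```python
import shlex


def resize_vm_commands(vm_name, cpus, memory_mb):
    """Build VBoxManage command lines for the manual resize steps.

    The BOINC task must already be suspended; nothing is executed here.
    """
    if cpus < 1 or memory_mb < 1:
        raise ValueError("cpus and memory_mb must be positive")
    return [
        # Throw away the saved state so the VM settings can be changed.
        ["VBoxManage", "discardstate", vm_name],
        # Change the number of virtual CPUs and the RAM size (in MB).
        ["VBoxManage", "modifyvm", vm_name,
         "--cpus", str(cpus), "--memory", str(memory_mb)],
        # Boot the VM outside BOINC and let it run for a few minutes.
        ["VBoxManage", "startvm", vm_name],
        # Save the running VM so BOINC can resume it afterwards.
        ["VBoxManage", "controlvm", vm_name, "savestate"],
    ]


VM_NAME = "boinc_example_vm"  # hypothetical placeholder, not a real VM name
COMMANDS = resize_vm_commands(VM_NAME, cpus=4, memory_mb=6600)

if __name__ == "__main__":
    for argv in COMMANDS:
        print(shlex.join(argv))
```

Discarding the saved state restarts the task from the beginning, so this only makes sense early in a run. For a permanent core count, an app_config file, as mentioned earlier in the thread, avoids touching the VM by hand.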
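
The numbers reported in the suspend tests and the runtime estimate can be cross-checked with a little arithmetic. The sketch assumes that the 200 events are shared evenly across the 4 athena processes and that the mean time per event is the midpoint of the reported range of 1599 to 2947 seconds. Both are assumptions made for illustration; the post does not say how the 31 hours and 40 minutes figure was derived.

```python
def save_throughput_mb_s(size_bytes, seconds):
    """Average write rate of a VM save in decimal megabytes per second."""
    return size_bytes / seconds / 1e6


def estimate_task_hours(total_events, workers, seconds_per_event):
    """Wall-clock estimate when events are split evenly across workers."""
    return total_events / workers * seconds_per_event / 3600


# Save-set sizes and save times as reported in the two suspend tests.
FIRST_SAVE = save_throughput_mb_s(3_415_400_448, 29.3)
SECOND_SAVE = save_throughput_mb_s(3_313_393_664, 28.3)

# Midpoint of the reported per-event range (assumed to be the mean).
MIDPOINT_SECONDS = (1599 + 2947) / 2
ESTIMATE_HOURS = estimate_task_hours(200, 4, MIDPOINT_SECONDS)

if __name__ == "__main__":
    print(f"first save:  {FIRST_SAVE:.1f} MB/s")
    print(f"second save: {SECOND_SAVE:.1f} MB/s")
    print(f"estimated runtime: {ESTIMATE_HOURS:.1f} hours")
```

Under these assumptions both saves come out at roughly 117 MB/s, and the runtime estimate lands at about 31.6 hours, close to the figure quoted in the thread.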