Message boards :
CMS Application :
New Version 60.63
Author | Message |
---|---|
Send message Joined: 22 Apr 16 Posts: 677 Credit: 2,002,766 RAC: 819 |
Shutdown after 12 hours is now OK. The graphics option on the left side of the BOINC Manager shows the Apache test page instead of CMS data. |
Send message Joined: 28 Jul 16 Posts: 482 Credit: 394,720 RAC: 0 |
... It seems vboxwrapper_26205_x86_64-pc-linux-gnu was built on a system with glibc 2.29 but I'm using Rocky Linux 8.6 (derived from RHEL 8) which still uses 2.28. As an option you may want to upgrade to Rocky Linux 9. It installs glibc-2.34-28.el9_0.x86_64.rpm. For details see: https://download.rockylinux.org/pub/rocky/ https://download.rockylinux.org/pub/rocky/9/BaseOS/x86_64/os/Packages/g/ |
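The mismatch described above comes down to a version comparison: a binary built against glibc 2.29 will not load on a system that only provides 2.28. A minimal sketch of that check in Python follows. The helper names `parse_version` and `glibc_is_sufficient` are hypothetical, and `platform.libc_ver()` only reports the C library that the Python interpreter itself is linked against, so treat the result as a hint rather than a definitive test of what vboxwrapper needs.

```python
import platform


def parse_version(text):
    """Turn a dotted version string such as '2.28' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def glibc_is_sufficient(required, installed):
    """Return True if the installed glibc is at least the required version."""
    # Tuples compare element by element, so (2, 9) < (2, 28) as intended.
    return parse_version(installed) >= parse_version(required)


if __name__ == "__main__":
    # On non-glibc systems libc_ver() may return empty strings.
    lib, version = platform.libc_ver()
    if lib == "glibc" and version:
        print("installed glibc:", version)
        print("enough for a 2.29 build:", glibc_is_sufficient("2.29", version))
    else:
        print("glibc version could not be determined on this system")
```

Comparing tuples of integers rather than strings avoids the trap where "2.9" would sort after "2.28".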
Send message Joined: 20 Jan 15 Posts: 1139 Credit: 8,168,972 RAC: 1,763 |
... It seems vboxwrapper_26205_x86_64-pc-linux-gnu was built on a system with glibc 2.29 but I'm using Rocky Linux 8.6 (derived from RHEL 8) which still uses 2.28. Hmm, that's only just been released... I managed to get vboxwrapper to compile by using devtools-11; gcc 8.0 didn't have a 32-bit libstdc++.a. Now waiting for my task backoff to time-out so I can try with the "new" vboxwrapper -- at least it passes an ldd test without problem. |
Send message Joined: 28 Jul 16 Posts: 482 Credit: 394,720 RAC: 0 |
... I managed to get vboxwrapper to compile ... The most recent changes are not yet merged to BOINC master. You would have to download them from the links below, then recompile vboxwrapper. Otherwise you would get a version that may run into the "VirtualBox 4.0" error. https://github.com/BOINC/boinc/blob/5347c8068c5594cc008dacb80e97c4c85601a08c/samples/vboxwrapper/vbox_common.cpp https://github.com/BOINC/boinc/blob/5347c8068c5594cc008dacb80e97c4c85601a08c/samples/vboxwrapper/vbox_vboxmanage.cpp |
Send message Joined: 20 Jan 15 Posts: 1139 Credit: 8,168,972 RAC: 1,763 |
... I managed to get vboxwrapper to compile ... Hmm, OK, done that. BOINC did download the "official" wrapper when it first asked for work because the local copy was the wrong size. I copied the new version across after that; hopefully it won't overwrite again when it does eventually get a new task after my quota is increased. |
Send message Joined: 28 Jul 16 Posts: 482 Credit: 394,720 RAC: 0 |
In cc_config.xml add/set: <dont_check_file_sizes>1</dont_check_file_sizes>. Then reload the config files via BOINC Manager. Even then the client will occasionally download a fresh copy from the server, e.g. after a crash. |
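For anyone who prefers to script the change, here is a rough sketch that sets the flag in the XML text of cc_config.xml. It assumes the flag belongs inside the `<options>` section, which is how the BOINC client configuration documentation describes it. The function name `set_dont_check_file_sizes` is hypothetical, and where the file lives depends on the installation, so the sketch works on a string rather than a path.

```python
import xml.etree.ElementTree as ET


def set_dont_check_file_sizes(xml_text, enabled=True):
    """Return cc_config.xml content with <dont_check_file_sizes> set under <options>."""
    root = ET.fromstring(xml_text)
    if root.tag != "cc_config":
        raise ValueError("expected a <cc_config> root element")
    # Assumption: the flag lives in the <options> section; create it if missing.
    options = root.find("options")
    if options is None:
        options = ET.SubElement(root, "options")
    flag = options.find("dont_check_file_sizes")
    if flag is None:
        flag = ET.SubElement(options, "dont_check_file_sizes")
    flag.text = "1" if enabled else "0"
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    minimal = "<cc_config><options></options></cc_config>"
    print(set_dont_check_file_sizes(minimal))
```

The client still has to re-read its config files afterwards, and as noted above it may fetch a fresh wrapper from the server anyway, for example after a crash.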
Send message Joined: 13 Feb 15 Posts: 1188 Credit: 859,751 RAC: 36 |
This one https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=3102526 did not run the full 12 hours wallclock, but was shut down a bit earlier (I suppose not by the wrapper). It had done three cmsRun jobs and after the third it did not get a new CMS job. Maybe that is why it was closed earlier? |
Send message Joined: 22 Apr 16 Posts: 677 Credit: 2,002,766 RAC: 819 |
1.5 hours is the difference in your shown task between the next running job inside of the CMS task. Why is this so long? Because of missing CMS data? I have checked two in production on my side and am seeing the same. It seems to be only a log notice of the elapsed time. |
Send message Joined: 13 Feb 15 Posts: 1188 Credit: 859,751 RAC: 36 |
1.5 hour is the difference in your shown task between the next running job inside of the CMS task. I really don't know what you mean. |
Send message Joined: 28 Jul 16 Posts: 482 Credit: 394,720 RAC: 0 |
It is not caused by vboxwrapper. The CMS processes inside the VM became much smarter a long while ago. Based on the benchmarks running when a VM starts they try to estimate whether a new subtask can be finished within the given remaining time. It's a rather complex algorithm and nothing is reported back to the logs visible here. The runtime and CPU time from your log appear to be inside the usual limits, hence there's nothing to worry about. |
Send message Joined: 28 Jul 16 Posts: 482 Credit: 394,720 RAC: 0 |
@ivan If just a few libs are missing you may try to extract them from Rocky Linux 9 and make them available with the method described here: https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5015&postid=39487 |
Send message Joined: 13 Feb 15 Posts: 1188 Credit: 859,751 RAC: 36 |
The CMS processes inside the VM became much smarter a long while ago. I already suspected that some more intelligence must have been built in, but it was never communicated. In the past I had written that a lot of CPU time could be wasted on slow machines that need more than 6 hours for one CMS job. Example: a machine needs 11 hours for a single CMS job. In the past a second job started and would have been killed after 18 hours of VM lifetime, so 7 hours of CPU time were wasted. I'm glad that is avoided now. |
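The 11-hour example can be worked through with a toy model. This is not the actual CMS logic, which the thread describes as complex and benchmark-based. It is only a sketch that assumes every job takes the same fixed time and that the VM is killed at a fixed lifetime. The function name `wasted_hours` and its parameters are hypothetical.

```python
def wasted_hours(job_hours, vm_lifetime_hours, check_remaining_time):
    """Hours of CPU time spent on jobs that are killed before they finish.

    Toy model: all jobs take job_hours; the VM stops at vm_lifetime_hours.
    """
    if job_hours <= 0:
        raise ValueError("job_hours must be positive")
    elapsed = 0.0
    wasted = 0.0
    while elapsed < vm_lifetime_hours:
        remaining = vm_lifetime_hours - elapsed
        if job_hours <= remaining:
            elapsed += job_hours  # the job fits and runs to completion
            continue
        if check_remaining_time:
            break  # do not start a job that cannot finish in time
        wasted += remaining  # the job starts and is killed when the VM ends
        elapsed = vm_lifetime_hours
    return wasted


if __name__ == "__main__":
    print("old behaviour:", wasted_hours(11, 18, check_remaining_time=False))
    print("with the check:", wasted_hours(11, 18, check_remaining_time=True))
```

For the slow machine above, the first job ends at hour 11, which leaves 18 - 11 = 7 hours. Without the check those 7 hours go into a second job that never finishes. With the check no second job is started.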
Send message Joined: 20 Jan 15 Posts: 1139 Credit: 8,168,972 RAC: 1,763 |
In cc_config.xml add/set: OK, thanks for pointing that out. |
Send message Joined: 20 Jan 15 Posts: 1139 Credit: 8,168,972 RAC: 1,763 |
I managed to get vboxwrapper to compile by using devtools-11; gcc 8.0 didn't have a 32-bit libstdc++.a. OK, it seems to have worked. The first task ran to completion and a new task has started up. I see both my computers (Win10 and Rocky Linux) running in the HTCondor pool: [lxplus789:~] > condor_status -pool vocms0840.cern.ch|grep '@9-' glidein_4804_350226202@9-4416-18346 LINUX X86_64 Claimed Busy 3.360 2500 0+00:23:36 glidein_4821_834398040@9-4599-1833 LINUX X86_64 Claimed Busy 1.350 2500 0+00:20:05 The only current glitch is that the "Show VM Console" button doesn't appear (nor does the "Show graphics" option show the current logs) so I cannot check that it is actually running cmsRun jobs. |
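The two status lines above can be split into named fields for easier reading. A small sketch follows, assuming the columns match the default condor_status header (Name, OpSys, Arch, State, Activity, LoadAv, Mem, ActvtyTime) and that Mem is reported in MB. The key names and the helper `parse_status_line` are hypothetical.

```python
# Assumed column order, taken from the default condor_status header.
COLUMNS = (
    "name", "opsys", "arch", "state",
    "activity", "load_avg", "memory_mb", "activity_time",
)


def parse_status_line(line):
    """Split one whitespace-separated condor_status line into a dict."""
    fields = line.split()
    if len(fields) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} fields, got {len(fields)}")
    record = dict(zip(COLUMNS, fields))
    record["load_avg"] = float(record["load_avg"])
    record["memory_mb"] = int(record["memory_mb"])  # unit is an assumption
    return record


if __name__ == "__main__":
    sample = ("glidein_4804_350226202@9-4416-18346 LINUX X86_64 "
              "Claimed Busy 3.360 2500 0+00:23:36")
    slot = parse_status_line(sample)
    print(slot["name"], slot["state"], slot["activity"], slot["activity_time"])
```

A slot shown as Claimed and Busy only says that the glidein is occupied. It does not confirm that a cmsRun job is running inside the VM, which is why the console check still matters.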
Send message Joined: 20 Jan 15 Posts: 1139 Credit: 8,168,972 RAC: 1,763 |
The only current glitch is that the "Show VM Console" button doesn't appear (nor does the "Show graphics" option show the current logs) so I cannot check that it is actually running cmsRun jobs. OK, got around that by running VirtualBox and logging in to the VM. Output files are appearing and look OK. |
Send message Joined: 28 Jul 16 Posts: 482 Credit: 394,720 RAC: 0 |
The only current glitch is that the "Show VM Console" button doesn't appear (nor does the "Show graphics" option show the current logs) so I cannot check that it is actually running cmsRun jobs. The reason for this is reported here, but you already found the suggested workaround. 2022-07-28 23:52:52 (212243): Required extension pack not installed, remote desktop not enabled. |
Send message Joined: 20 Jan 15 Posts: 1139 Credit: 8,168,972 RAC: 1,763 |
The only current glitch is that the "Show VM Console" button doesn't appear (nor does the "Show graphics" option show the current logs) so I cannot check that it is actually running cmsRun jobs. Thanks, hadn't noticed that. I saw that VirtualBox was updated when I did a "yum update" yesterday, but didn't realise the extpak wasn't installed at the same time. That's bitten me once before on my "managed" Win10 box when central IT updated Vbox but not the extpak. Loaded pack, aborted task, new task shows VM console. I still have a problem with the OS intercepting Alt-Fn key sequences, tho'. The only reliable one is Alt-F3. |
Send message Joined: 13 Feb 15 Posts: 1188 Credit: 859,751 RAC: 36 |
I'm running a new CMS-task doing its first CMS-job inside the VM. After 25% into this first job, I noticed a differencing image in the snapshot folder with the size of 4444913664 bytes, 4.13 GB. Is this to be expected? |
Send message Joined: 13 Feb 15 Posts: 1188 Credit: 859,751 RAC: 36 |
In the end, the peak disk usage of that task was 4.36 GB. Normal seems to be between about 650 and 800 MB: https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=3102793 |
Send message Joined: 20 Jan 15 Posts: 1139 Credit: 8,168,972 RAC: 1,763 |
I'm running a new CMS-task doing its first CMS-job inside the VM. I can't comment, myself. Laurence is on holiday next week so he may not be able to reply. |
©2024 CERN