Message boards : CMS Application : New Version 60.63

maeax

Joined: 22 Apr 16
Posts: 601
Credit: 1,368,444
RAC: 2,194
Message 7652 - Posted: 28 Jul 2022, 5:20:37 UTC

Shutdown after 12 hours is now OK.
The graphics button on the left side of the BOINC Manager shows the Apache test page instead of CMS data.
ID: 7652 · Rating: 0
computezrmle
Joined: 28 Jul 16
Posts: 400
Credit: 374,764
RAC: 62
Message 7656 - Posted: 28 Jul 2022, 12:42:17 UTC - in response to Message 7650.  

... It seems vboxwrapper_26205_x86_64-pc-linux-gnu was built on a system with glibc 2.29 but I'm using Rocky Linux 8.6 (derived from RHEL 8) which still uses 2.28.

As an option you may want to upgrade to Rocky Linux 9.
It installs glibc-2.34-28.el9_0.x86_64.rpm

For details see:
https://download.rockylinux.org/pub/rocky/
https://download.rockylinux.org/pub/rocky/9/BaseOS/x86_64/os/Packages/g/
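
To see why a binary built against glibc 2.29 fails on a 2.28 system, one can compare the highest GLIBC_x.y symbol version the binary references with the version the host provides. The sketch below is only an illustration and is not part of vboxwrapper or BOINC: the helper names are hypothetical, and the sample text merely imitates the shape of `objdump -T` output. On a real system that text would come from running `objdump -T` on the wrapper binary.

```python
import re

GLIBC_TAG = re.compile(r"GLIBC_(\d+(?:\.\d+)+)")

def parse_version(text):
    """Turn '2.29' into (2, 29) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))

def required_glibc(symbol_dump):
    """Return the highest GLIBC_x.y version mentioned in a symbol dump, or None."""
    versions = [parse_version(m) for m in GLIBC_TAG.findall(symbol_dump)]
    return max(versions) if versions else None

def can_run(symbol_dump, host_glibc):
    """True if the host glibc is at least as new as the binary requires."""
    needed = required_glibc(symbol_dump)
    return needed is None or parse_version(host_glibc) >= needed

# Illustrative sample only; real input would be the output of `objdump -T <binary>`.
SAMPLE_DUMP = """
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 fopen
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.29  pow
"""

print(can_run(SAMPLE_DUMP, "2.28"))  # host with glibc 2.28, as on Rocky Linux 8.6
print(can_run(SAMPLE_DUMP, "2.34"))  # host with glibc 2.34, as on Rocky Linux 9
```

With the sample above, the first check fails and the second succeeds, which mirrors the situation described in the quoted message.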
ID: 7656 · Rating: 0
ivan
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 20 Jan 15
Posts: 1126
Credit: 7,409,829
RAC: 2,269
Message 7657 - Posted: 28 Jul 2022, 13:14:31 UTC - in response to Message 7656.  

... It seems vboxwrapper_26205_x86_64-pc-linux-gnu was built on a system with glibc 2.29 but I'm using Rocky Linux 8.6 (derived from RHEL 8) which still uses 2.28.

As an option you may want to upgrade to Rocky Linux 9.
It installs glibc-2.34-28.el9_0.x86_64.rpm

For details see:
https://download.rockylinux.org/pub/rocky/
https://download.rockylinux.org/pub/rocky/9/BaseOS/x86_64/os/Packages/g/

Hmm, that's only just been released... I managed to get vboxwrapper to compile by using devtools-11; gcc 8.0 didn't have a 32-bit libstdc++.a.
Now waiting for my task backoff to time-out so I can try with the "new" vboxwrapper -- at least it passes an ldd test without problem.
ID: 7657 · Rating: 0
computezrmle
Joined: 28 Jul 16
Posts: 400
Credit: 374,764
RAC: 62
Message 7658 - Posted: 28 Jul 2022, 13:45:14 UTC - in response to Message 7657.  

... I managed to get vboxwrapper to compile ...

The most recent changes are not yet merged to BOINC master.
You would have to download them from the links below, then recompile vboxwrapper.
Otherwise you would get a version that may run into the "VirtualBox 4.0" error.

https://github.com/BOINC/boinc/blob/5347c8068c5594cc008dacb80e97c4c85601a08c/samples/vboxwrapper/vbox_common.cpp

https://github.com/BOINC/boinc/blob/5347c8068c5594cc008dacb80e97c4c85601a08c/samples/vboxwrapper/vbox_vboxmanage.cpp
ID: 7658 · Rating: 0
ivan
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 20 Jan 15
Posts: 1126
Credit: 7,409,829
RAC: 2,269
Message 7660 - Posted: 28 Jul 2022, 14:53:33 UTC - in response to Message 7658.  

... I managed to get vboxwrapper to compile ...

The most recent changes are not yet merged to BOINC master.
You would have to download them from the links below, then recompile vboxwrapper.
Otherwise you would get a version that may run into the "VirtualBox 4.0" error.

https://github.com/BOINC/boinc/blob/5347c8068c5594cc008dacb80e97c4c85601a08c/samples/vboxwrapper/vbox_common.cpp

https://github.com/BOINC/boinc/blob/5347c8068c5594cc008dacb80e97c4c85601a08c/samples/vboxwrapper/vbox_vboxmanage.cpp

Hmm, OK, done that. BOINC did download the "official" wrapper when it first asked for work because the local copy was the wrong size. I copied the new version across after that; hopefully it won't overwrite again when it does eventually get a new task after my quota is increased.
ID: 7660 · Rating: 0
computezrmle
Joined: 28 Jul 16
Posts: 400
Credit: 374,764
RAC: 62
Message 7661 - Posted: 28 Jul 2022, 15:01:48 UTC - in response to Message 7660.  

In cc_config.xml add/set:
<dont_check_file_sizes>1</dont_check_file_sizes>
Then reload config files via boincmanager.

Even then the client will occasionally download a fresh copy from the server, e.g. after a crash.
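
A minimal sketch of setting this option programmatically, assuming the usual cc_config.xml layout in which client options live under `<options>` inside the `<cc_config>` root element. The function name and the temporary file path are illustrative; the real file sits in the BOINC data directory, and the client only picks up the change after the config files are re-read.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

def set_dont_check_file_sizes(path, enabled=True):
    """Create or update cc_config.xml so <dont_check_file_sizes> is 1 or 0."""
    if os.path.exists(path):
        tree = ET.parse(path)
        root = tree.getroot()
    else:
        root = ET.Element("cc_config")
        tree = ET.ElementTree(root)
    # Client options are assumed to live under <options>.
    options = root.find("options")
    if options is None:
        options = ET.SubElement(root, "options")
    flag = options.find("dont_check_file_sizes")
    if flag is None:
        flag = ET.SubElement(options, "dont_check_file_sizes")
    flag.text = "1" if enabled else "0"
    tree.write(path, encoding="utf-8", xml_declaration=True)

# Demonstrate on a throwaway file rather than a live BOINC installation.
demo_path = os.path.join(tempfile.mkdtemp(), "cc_config.xml")
set_dont_check_file_sizes(demo_path)
with open(demo_path, encoding="utf-8") as handle:
    print(handle.read())
```

As an alternative to the Manager menu, `boinccmd --read_cc_config` asks a running client to re-read the file.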
ID: 7661 · Rating: 0
Crystal Pellet
Volunteer tester
Joined: 13 Feb 15
Posts: 1146
Credit: 750,252
RAC: 1,445
Message 7662 - Posted: 28 Jul 2022, 16:55:35 UTC

This one https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=3102526 did not run the full 12 hours of wall-clock time, but was shut down a bit earlier (I suppose not by the wrapper).
It had done three cmsRun jobs, and after the third it did not get a new CMS job. Maybe that is why it was closed earlier?
ID: 7662 · Rating: 0
maeax
Joined: 22 Apr 16
Posts: 601
Credit: 1,368,444
RAC: 2,194
Message 7663 - Posted: 28 Jul 2022, 17:56:29 UTC - in response to Message 7662.  
Last modified: 28 Jul 2022, 18:09:33 UTC

1.5 hours is the difference in your shown task between the next running job inside of the CMS task.
Why is this so long? Because of missing CMS data?
I have checked two of my own tasks in production and see the same.
It seems to be only a log entry about the elapsed time.
ID: 7663 · Rating: 0
Crystal Pellet
Volunteer tester
Joined: 13 Feb 15
Posts: 1146
Credit: 750,252
RAC: 1,445
Message 7664 - Posted: 28 Jul 2022, 19:39:54 UTC - in response to Message 7663.  

1.5 hours is the difference in your shown task between the next running job inside of the CMS task.
I really don't know what you mean.
ID: 7664 · Rating: 0
computezrmle
Joined: 28 Jul 16
Posts: 400
Credit: 374,764
RAC: 62
Message 7665 - Posted: 28 Jul 2022, 19:56:36 UTC - in response to Message 7662.  

It is not caused by vboxwrapper.

The CMS processes inside the VM became much smarter a long while ago.
Based on the benchmarks that run when a VM starts, they try to estimate whether a new subtask can be finished within the remaining time.
It's a rather complex algorithm, and nothing is reported back to the logs visible here.

The runtime and CPU time from your log appear to be within the usual limits, so there's nothing to worry about.
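
The actual algorithm is not visible in the logs, so the following is only a guess at the general shape of such a check: scale a reference job duration by the host's benchmark result, then start a new subtask only if the estimate fits into the remaining VM lifetime. All names, the safety margin, and the numbers are assumptions made for illustration.

```python
def estimated_job_hours(reference_hours, reference_score, host_score):
    """Scale a reference job duration by relative benchmark speed.

    A host with half the reference score is assumed to take twice as long.
    """
    if host_score <= 0:
        raise ValueError("host_score must be positive")
    return reference_hours * reference_score / host_score

def should_start_next_job(elapsed_hours, lifetime_hours, job_hours, margin=0.1):
    """Start another job only if it is expected to finish before the VM is shut down."""
    remaining = lifetime_hours - elapsed_hours
    return job_hours * (1.0 + margin) <= remaining

# Hypothetical host: a 2-hour reference job on a host that benchmarks at half speed.
job = estimated_job_hours(reference_hours=2.0, reference_score=100.0, host_score=50.0)
print(job)                                    # 4.0
print(should_start_next_job(6.0, 12.0, job))  # True: about 4.4 h needed, 6 h left
print(should_start_next_job(9.0, 12.0, job))  # False: about 4.4 h needed, 3 h left
```

A check of this kind would explain a task ending somewhat before the 12-hour limit: once no further job is expected to fit, there is nothing left to run.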
ID: 7665 · Rating: 0
computezrmle
Joined: 28 Jul 16
Posts: 400
Credit: 374,764
RAC: 62
Message 7666 - Posted: 29 Jul 2022, 7:34:09 UTC - in response to Message 7657.  

@ivan
If just a few libs are missing you may try to extract them from Rocky Linux 9 and make them available with the method described here:
https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5015&postid=39487
ID: 7666 · Rating: 0
Crystal Pellet
Volunteer tester
Joined: 13 Feb 15
Posts: 1146
Credit: 750,252
RAC: 1,445
Message 7667 - Posted: 29 Jul 2022, 7:34:18 UTC - in response to Message 7665.  

The CMS processes inside the VM became much smarter a long while ago.
I already suspected that some more intelligence had been built in, but it was never communicated.
In the past I wrote that a lot of CPU time could be wasted on slow machines that need more than 6 hours for one CMS job.
Example:
A machine needs 11 hours for a single CMS job. In the past a second job would have started and then been killed at 18 hours of VM lifetime.
7 hours of CPU time were wasted. I'm glad that is avoided now.
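
The arithmetic in this example can be written out as a small sketch. It assumes the old behaviour was to start jobs back to back until a hard kill at the end of the VM lifetime (18 hours in the example), so any job still running at that point is lost. The function name is illustrative.

```python
def wasted_hours(job_hours, vm_lifetime_hours):
    """Hours lost when jobs run back to back and the VM is killed at its lifetime limit."""
    completed_jobs = int(vm_lifetime_hours // job_hours)
    return vm_lifetime_hours - completed_jobs * job_hours

print(wasted_hours(11, 18))  # 7: one job finishes after 11 h, the second is killed 7 h in
print(wasted_hours(6, 18))   # 0: three 6-hour jobs fit exactly
```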
ID: 7667 · Rating: 0
ivan
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 20 Jan 15
Posts: 1126
Credit: 7,409,829
RAC: 2,269
Message 7672 - Posted: 29 Jul 2022, 10:04:23 UTC - in response to Message 7661.  
Last modified: 29 Jul 2022, 10:10:50 UTC

In cc_config.xml add/set:
<dont_check_file_sizes>1</dont_check_file_sizes>
Then reload config files via boincmanager.

Even then the client will occasionally download a fresh copy from the server, e.g. after a crash.

OK, thanks for pointing that out.
ID: 7672 · Rating: 0
ivan
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 20 Jan 15
Posts: 1126
Credit: 7,409,829
RAC: 2,269
Message 7673 - Posted: 29 Jul 2022, 10:18:51 UTC - in response to Message 7657.  

I managed to get vboxwrapper to compile by using devtools-11; gcc 8.0 didn't have a 32-bit libstdc++.a.
Now waiting for my task backoff to time-out so I can try with the "new" vboxwrapper -- at least it passes an ldd test without problem.

OK, it seems to have worked. The first task ran to completion and a new task has started up. I see both my computers (Win10 and Rocky Linux) running in the HTCondor pool:
[lxplus789:~] > condor_status -pool vocms0840.cern.ch|grep '@9-'
glidein_4804_350226202@9-4416-18346           LINUX      X86_64 Claimed   Busy          3.360 2500  0+00:23:36
glidein_4821_834398040@9-4599-1833            LINUX      X86_64 Claimed   Busy          1.350 2500  0+00:20:05
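
For readers who want to post-process such a listing, here is a minimal sketch that filters the slots and extracts a few columns. It assumes the default `condor_status` column order (name, OS, architecture, state, activity, load average, memory, activity time) and uses the two lines above as sample input. The function name is made up for this example.

```python
def parse_slots(listing, name_filter="@9-"):
    """Return a dict per matching slot line from condor_status-style text."""
    slots = []
    for line in listing.splitlines():
        fields = line.split()
        # Skip headers, blank lines, and slots that do not match the filter.
        if len(fields) != 8 or name_filter not in fields[0]:
            continue
        name, opsys, arch, state, activity, load, memory, activity_time = fields
        slots.append({
            "name": name,
            "state": state,
            "activity": activity,
            "load": float(load),
            "memory": int(memory),
            "activity_time": activity_time,
        })
    return slots

SAMPLE = """\
glidein_4804_350226202@9-4416-18346           LINUX      X86_64 Claimed   Busy          3.360 2500  0+00:23:36
glidein_4821_834398040@9-4599-1833            LINUX      X86_64 Claimed   Busy          1.350 2500  0+00:20:05
"""

for slot in parse_slots(SAMPLE):
    print(slot["name"], slot["state"], slot["activity"], slot["load"])
```
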

The only current glitch is that the "Show VM Console" button doesn't appear (nor does the "Show graphics" option show the current logs) so I cannot check that it is actually running cmsRun jobs.
ID: 7673 · Rating: 0
ivan
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 20 Jan 15
Posts: 1126
Credit: 7,409,829
RAC: 2,269
Message 7674 - Posted: 29 Jul 2022, 10:42:55 UTC - in response to Message 7673.  

The only current glitch is that the "Show VM Console" button doesn't appear (nor does the "Show graphics" option show the current logs) so I cannot check that it is actually running cmsRun jobs.

OK, got around that by running VirtualBox and logging in to the VM. Output files are appearing and look OK.
ID: 7674 · Rating: 0
computezrmle
Joined: 28 Jul 16
Posts: 400
Credit: 374,764
RAC: 62
Message 7678 - Posted: 29 Jul 2022, 11:24:13 UTC - in response to Message 7673.  

The only current glitch is that the "Show VM Console" button doesn't appear (nor does the "Show graphics" option show the current logs) so I cannot check that it is actually running cmsRun jobs.

The reason for this is reported here, but you already found the suggested workaround.
2022-07-28 23:52:52 (212243): Required extension pack not installed, remote desktop not enabled.
ID: 7678 · Rating: 0
ivan
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 20 Jan 15
Posts: 1126
Credit: 7,409,829
RAC: 2,269
Message 7681 - Posted: 29 Jul 2022, 12:59:46 UTC - in response to Message 7678.  

The only current glitch is that the "Show VM Console" button doesn't appear (nor does the "Show graphics" option show the current logs) so I cannot check that it is actually running cmsRun jobs.

The reason for this is reported here, but you already found the suggested workaround.
2022-07-28 23:52:52 (212243): Required extension pack not installed, remote desktop not enabled.

Thanks, hadn't noticed that. I saw that VirtualBox was updated when I did a "yum update" yesterday, but didn't realise the extension pack wasn't installed at the same time. That's bitten me once before on my "managed" Win10 box when central IT updated VirtualBox but not the extension pack. I loaded the pack and aborted the task; the new task shows the VM console. I still have a problem with the OS intercepting Alt-Fn key sequences, though. The only reliable one is Alt-F3.
ID: 7681 · Rating: 0
Crystal Pellet
Volunteer tester
Joined: 13 Feb 15
Posts: 1146
Credit: 750,252
RAC: 1,445
Message 7682 - Posted: 29 Jul 2022, 19:05:00 UTC

I'm running a new CMS-task doing its first CMS-job inside the VM.
After 25% into this first job, I noticed a differencing image in the snapshot folder with the size of 4444913664 bytes, 4.13 GB.

Is this to be expected?
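
For reference, the conversion behind the size quoted above, as a small sketch. It assumes that "GB" here means binary gigabytes (1024^3 bytes), which matches the figure given. The helper name is illustrative.

```python
def bytes_to_gib(size_bytes):
    """Convert a byte count to binary gigabytes (GiB, 1024**3 bytes)."""
    return size_bytes / 1024**3

size = 4444913664
print(f"{bytes_to_gib(size):.3f} GiB")  # 4.140 GiB (4.13 when truncated to two decimals)
print(f"{size / 1000**3:.3f} GB")       # 4.445 GB in decimal units
```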
ID: 7682 · Rating: 0
Crystal Pellet
Volunteer tester
Joined: 13 Feb 15
Posts: 1146
Credit: 750,252
RAC: 1,445
Message 7685 - Posted: 30 Jul 2022, 7:09:59 UTC - in response to Message 7682.  

In the end, the peak disk usage of that task was 4.36 GB.

Normal seems to be between about 650 and 800 MB.

https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=3102793
ID: 7685 · Rating: 0
ivan
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 20 Jan 15
Posts: 1126
Credit: 7,409,829
RAC: 2,269
Message 7688 - Posted: 30 Jul 2022, 10:14:14 UTC - in response to Message 7682.  

I'm running a new CMS-task doing its first CMS-job inside the VM.
After 25% into this first job, I noticed a differencing image in the snapshot folder with the size of 4444913664 bytes, 4.13 GB.

Is this to be expected?

I can't comment, myself. Laurence is on holiday next week so he may not be able to reply.
ID: 7688 · Rating: 0


©2022 CERN