New Version 60.70
Author | Message |
---|---|
Joined: 12 Sep 14 Posts: 1052 Credit: 294,071 RAC: 0 |
Synchronizing the vboxwrapper with the latest official version. |
Joined: 28 Jul 16 Posts: 434 Credit: 376,220 RAC: 0 |
As for vboxwrapper 26206 this task is running fine: https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=3141537 I'm still missing the modifications of the vdi's boot partition that should make CVMFS fail-over and load balancing more robust. Are there plans to implement them before this vdi is used on prod? |
Joined: 12 Sep 14 Posts: 1052 Credit: 294,071 RAC: 0 |
No, I spoke to Jakob about this. He said that the values in the contextualization should overwrite the kernel values soon after the VM boots. They should get fixed in a new CernVM release. |
Joined: 22 Aug 22 Posts: 13 Credit: 6,118 RAC: 41 |
I get this when I try to start cms_mt: (screenshot not preserved) My ISP doesn't support IPv6. |
Joined: 28 Jul 16 Posts: 434 Credit: 376,220 RAC: 0 |
@Laurence The CMS vdi currently in use sets this link in /usr/sbin/bootstrap: /cvmfs/grid.cern.ch/vc/vm-qa/sbin/bootstrap-idtoken. bootstrap-idtoken in turn sets this variable: branch=qa. If the intention was to implement a dev/prod switch, this will not work. I recently implemented a switch for ATLAS that achieves the same objective: https://github.com/davidgcameron/boinc-scripts/blob/master/vbox/ATLASbootstrap.sh#L41-L45 It could easily be rewritten and tested for CMS. Just give me a "go". |
Joined: 28 Jul 16 Posts: 434 Credit: 376,220 RAC: 0 |
*.openhtc.io responds to both IPv4 and IPv6. From your screenshot -> 188.114.96.1. At least this one should have worked. It is clearly reported by the frontier client. Was it a transient error (1 task only) or do all tasks report it? Could you check whether Cloudflare is blocked in your firewall (maybe only for this box)? |
Joined: 22 Aug 22 Posts: 13 Credit: 6,118 RAC: 41 |
When I try to open cms4-frontier.openhtc.io in Chrome I get this: (screenshot not preserved) Probably problems on their side. |
Joined: 12 Sep 14 Posts: 1052 Credit: 294,071 RAC: 0 |
This was the plan. Ideally the switch would be done before the bootstrap script. Something like updating the /sbin to /cvmfs link. I was going to think about it later, so any ideas would be welcome. @Laurence |
Joined: 28 Jul 16 Posts: 434 Credit: 376,220 RAC: 0 |
Your CVMFS requests from Theory tasks are also sent to *.openhtc.io. Surely to the same Cloudflare datacenter. Could even be that CVMFS and frontier requests are processed by the very same Cloudflare Squid instance there. Those Squids get their data from backend systems at CERN, RAL, Fermilab ... and do an automatic fail-over. Very unlikely that all backend systems are down at the same moment. Especially since this would crash nearly all CMS tasks worldwide. Please upgrade VirtualBox to the recent v6.1. The new vboxwrapper 26206 does not have the .com interface any more which was responsible for problems in the past. |
Joined: 22 Aug 22 Posts: 13 Credit: 6,118 RAC: 41 |
What if I install VirtualBox 7.0.2? |
Joined: 28 Jul 16 Posts: 434 Credit: 376,220 RAC: 0 |
Some kind of a "bootstrap preloader" that does this: 1. mount the shared folder; 2. parse init_data.xml from there (it tells you whether you are in dev or prod); 3. modify the link to the main bootstrap script in /sbin according to (2.); 4. mount grid.cern.ch (the link points to this repo, unlike ATLAS, which gets its boot script from atlas.cern.ch); 5. execute the main bootstrap script on CVMFS. This is very close to the ATLAS script. I'll prepare a suggestion. |
Joined: 28 Jul 16 Posts: 434 Credit: 376,220 RAC: 0 |
7.x might work, but since it is rather new you may stumble over unexpected issues. I'm already aware of a modification that affects the media manager. So far it's not a show stopper here but it needs a closer look. |
Joined: 8 Apr 15 Posts: 674 Credit: 11,131,196 RAC: 1,903 |
Well, it looks like suspend time again, and it is Friday, so who knows when... Good thing I was up at 4am watching this happen. I must have forgotten to set the one laptop to "no more work", so I got 9 of those on that one, but the main 3 hosts only got one. I checked other members running CMS and they have the same thing. It happens fast too (4 min 46 sec). I see hundreds of these around here, but I think we have them all suspended now. https://lhcathomedev.cern.ch/lhcathome-dev/results.php?hostid=4252 goodnight |
Joined: 20 Jan 15 Posts: 1126 Credit: 7,849,048 RAC: 9 |
We've reverted the change that garbled our glidein script -- I'm running main and -dev jobs successfully now. |
Joined: 8 Apr 15 Posts: 674 Credit: 11,131,196 RAC: 1,903 |
Thanks Ivan |
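
Steps 2 and 3 of the "bootstrap preloader" proposed earlier in the thread (parse init_data.xml, then repoint the link in /sbin) could be sketched roughly as below. This is only an illustration, not the ATLAS script or the eventual CMS implementation. The QA path is the one quoted in the thread; the production path `PROD_BOOTSTRAP`, the marker string `DEV_MARKER`, and all function names are assumptions made for the example. Mounting the shared folder, mounting grid.cern.ch, and executing the main bootstrap script (steps 1, 4, and 5) are left out.

```python
import os
import xml.etree.ElementTree as ET

# Path quoted in the thread: the QA bootstrap script on CVMFS.
DEV_BOOTSTRAP = "/cvmfs/grid.cern.ch/vc/vm-qa/sbin/bootstrap-idtoken"
# HYPOTHETICAL: the production path is not given in the thread.
PROD_BOOTSTRAP = "/cvmfs/grid.cern.ch/vc/vm-prod/sbin/bootstrap-idtoken"
# ASSUMPTION: a dev task can be recognised by this string somewhere
# in init_data.xml (for example inside a project directory or URL).
DEV_MARKER = "lhcathome-dev"


def is_dev(init_data_path, marker=DEV_MARKER):
    """Step 2: return True if any element text contains the dev marker."""
    root = ET.parse(init_data_path).getroot()
    return any(marker in (el.text or "") for el in root.iter())


def select_bootstrap(init_data_path):
    """Pick the bootstrap target that matches the detected environment."""
    return DEV_BOOTSTRAP if is_dev(init_data_path) else PROD_BOOTSTRAP


def update_link(link_path, target):
    """Step 3: point link_path at target, replacing any existing link.

    The new symlink is created under a temporary name and then renamed
    over the old one, so the link never disappears in between.
    """
    tmp = link_path + ".tmp"
    if os.path.lexists(tmp):
        os.remove(tmp)
    os.symlink(target, tmp)
    os.replace(tmp, link_path)
```

Inside the VM this would be called as something like `update_link("/usr/sbin/bootstrap", select_bootstrap(path_to_init_data))`, where the location of init_data.xml in the shared folder is again an assumption. Keeping the detection in one small function makes it easy to swap the marker check for whatever field the project actually uses.
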
©2023 CERN