21) Message boards : CMS Application : Upcoming WMAgent/CouchDB update - jobs will drain Sunday night (Message 7381)
Posted 17 Jun 2022 by ivan
Post:
CERN IT also wants to recreate the VM that we run the agent on, to move it to a new hypervisor. So the downtime may be a bit longer than I anticipated, but hopefully just a few hours.
22) Message boards : CMS Application : Upcoming WMAgent/CouchDB update - jobs will drain Sunday night (Message 7376)
Posted 17 Jun 2022 by ivan
Post:
CMS wants to upgrade WMAgent to bring in a new version of CouchDB. I've just submitted a new workflow, and the queue should start draining around midnight Sunday (European time). Please be ready to set No New Tasks late Sunday to avoid any problems. Hopefully we'll be up again by Monday night.
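For anyone who prefers the command line to BOINC Manager, here is a minimal sketch of setting No New Tasks with boinccmd. The project URL below is a placeholder, not the real one (use the URL your client shows for this project), and the helper name is made up for illustration. Note that in BOINC this setting normally applies to the whole project rather than to a single application.

```python
import shlex

# Placeholder URL (assumption): replace with the project URL shown by your BOINC client.
PROJECT_URL = "https://example.org/project/"

# boinccmd project operations used here: stop or resume fetching new tasks.
ALLOWED_OPS = ("nomorework", "allowmorework")


def project_op_command(project_url, op):
    """Build the boinccmd argument list for a project operation (hypothetical helper)."""
    if op not in ALLOWED_OPS:
        raise ValueError(f"unsupported operation: {op}")
    return ["boinccmd", "--project", project_url, op]


if __name__ == "__main__":
    # Print the commands rather than running them, so nothing changes by accident.
    print(shlex.join(project_op_command(PROJECT_URL, "nomorework")))
    print(shlex.join(project_op_command(PROJECT_URL, "allowmorework")))
```

Run the first printed command before the queue drains and the second once jobs are announced as available again.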
23) Message boards : CMS Application : New Version 60.60 (Message 7346)
Posted 15 Jun 2022 by ivan
Post:
Hmm, yes, I got the new wrapper running on both my machines by making sure CMS@Home wasn't running (I had to manually remove the BOINC VMs on Windows as they hung around in VirtualBox after I did a pause and abort on the tasks).
24) Message boards : CMS Application : New Version 60.60 (Message 7338)
Posted 15 Jun 2022 by ivan
Post:
I suspect it is caused by a vdi registration error (on Windows as well as on Linux):
VBoxManage.exe: error: Cannot register the hard disk 'C:\ProgramData\BOINC\projects\lhcathomedev.cern.ch_lhcathome-dev\CMS_2021_07_07.vdi' {f888c51e-0503-4495-8794-fd67809dc4e8} because a hard disk 'C:\ProgramData\BOINC\projects\lhcathome.cern.ch_lhcathome\CMS_2021_07_07.vdi' with UUID {f888c51e-0503-4495-8794-fd67809dc4e8} already exists

The old app version as well as the new app version both refer to the same vdi file "CMS_2021_07_07.vdi".
To allow both app versions to coexist the new vdi file must have a different name and a different UUID.

Ah, thanks, I'd missed that difference because of the long line and my eyesight problems -- I thought it was the same file.
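Since the difference is easy to miss in such a long line, here is a small sketch that pulls the two paths and UUIDs out of the quoted error and compares them. The function name and the regular expression are my own illustration, written against this one message only; other VBoxManage errors may be worded differently.

```python
import re
from pathlib import PureWindowsPath

# Error text as quoted in the post above.
ERROR = (
    r"VBoxManage.exe: error: Cannot register the hard disk "
    r"'C:\ProgramData\BOINC\projects\lhcathomedev.cern.ch_lhcathome-dev\CMS_2021_07_07.vdi' "
    r"{f888c51e-0503-4495-8794-fd67809dc4e8} because a hard disk "
    r"'C:\ProgramData\BOINC\projects\lhcathome.cern.ch_lhcathome\CMS_2021_07_07.vdi' "
    r"with UUID {f888c51e-0503-4495-8794-fd67809dc4e8} already exists"
)

# Pattern written for this specific message (an assumption, not an official format).
PATTERN = re.compile(
    r"Cannot register the hard disk '(?P<new>[^']+)' \{(?P<new_uuid>[0-9a-f-]+)\} "
    r"because a hard disk '(?P<old>[^']+)' with UUID \{(?P<old_uuid>[0-9a-f-]+)\} "
    r"already exists"
)


def describe_conflict(message):
    """Return what the two disks share and where they differ, or None if no match."""
    match = PATTERN.search(message)
    if match is None:
        return None
    new = PureWindowsPath(match["new"])
    old = PureWindowsPath(match["old"])
    return {
        "same_file_name": new.name == old.name,
        "same_uuid": match["new_uuid"] == match["old_uuid"],
        "new_project_dir": new.parent.name,
        "old_project_dir": old.parent.name,
    }


if __name__ == "__main__":
    for key, value in describe_conflict(ERROR).items():
        print(f"{key}: {value}")
```

For this message the file name and UUID are identical and only the project directory differs, which is the clash described in the quoted explanation.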
25) Message boards : CMS Application : New Version 60.60 (Message 7333)
Posted 14 Jun 2022 by ivan
Post:
This update provides a new version of the VboxWrapper which supports the multiattach mode. Please let me know if there are any issues.

For what it's worth, this version dies immediately on both my Windows 10 machine and a Rocky Linux 8.6 box[1]. I didn't have much time to investigate this afternoon (too many meetings...). Please let us know whether or not your new tasks are running normally.
[1] Tasks link if anyone has the time and inclination to take a look overnight UK time.
26) Message boards : CMS Application : CMS task Credential problem (Message 7309)
Posted 26 Apr 2022 by ivan
Post:
Right, I'm getting the same message on a Linux box and a Win10 machine that both run "production" jobs OK. I'm in a dialogue with Laurence about this. Something may have got corrupted in upgrades to various virtual machines lately -- we haven't worked out what yet.
It turns out that we will need to test new things on -dev soon, as HTCondor is moving away from using x509 credentials to authenticate users. At the moment we fudge this by generating our own certificates -- which is at the heart of these errors. So, it's a priority for us (just at a time when I have other pressing matters on my interrupt stack...).
I'll try to keep you informed.
27) Message boards : CMS Application : Lack of CMS tasks due to a problem in WMAgent development (Message 7270)
Posted 9 Dec 2021 by ivan
Post:
Unfortunately, I have been unable to submit new workflows to the CMS project since yesterday, and the job queues have now drained.
The cause is a change introduced during development of the CMS workflow management system. Such changes are tested first on a development system before being moved to the production system. We currently use the development system to run CMS@Home, so the change is impacting us.
I'm trying to find out when a fix will be forthcoming, but until then please set No New Tasks for CMS or switch to another project.
I'm sorry about this. I will let you know when I am able to submit jobs again.
28) Message boards : CMS Application : CMS 60.55 (vbox64_mt_mcore_cms) (Message 7269)
Posted 9 Dec 2021 by ivan
Post:
LOL make up your mind server


Heh!
29) Message boards : News : CMS job queue to drain this weekend (21/08/2021) (Message 7250)
Posted 23 Aug 2021 by ivan
Post:
The update is done and jobs are available again. Feel free to continue.
30) Message boards : News : CMS job queue to drain this weekend (21/08/2021) (Message 7244)
Posted 21 Aug 2021 by ivan
Post:
Oh, b****r, I posted that twice here and not to the main app as intended. Sorry...
31) Message boards : News : CMS job queue to drain this weekend (21/08/2021) (Message 7243)
Posted 21 Aug 2021 by ivan
Post:
When will they move multi-core CMS over to the public?
The current version worked fine with 1, 2, 3, 4, and 8 cores when I ran hundreds of them (running 8s and 4s right now).

Are you sure it works now? We have a new VM on -dev that uses a different glide-in mechanism, to prepare for when we move to CMS production (like fusion energy, that's always just "this" far away...). Last I checked, you could specify multi-core still, but the task only ran one job. Unless Laurence tweaked it without telling me.
We hope to move the new VM to the public app when both Laurence and Federica are back from summer holidays -- I think that's next week.
32) Message boards : News : CMS job queue to drain this weekend (21/08/2021) (Message 7241)
Posted 20 Aug 2021 by ivan
Post:
CMS is about to release a new version of WMAgent based entirely on Python 3. They have asked to be able to update our agent by Monday evening (23/08), so I will not inject any new workflows before the upgrade. I expect the job queue to drain by late on Sunday.
Please set your CMS application to No New Tasks by then.
33) Message boards : News : CMS job queue to drain this weekend (21/08/2021) (Message 7240)
Posted 20 Aug 2021 by ivan
Post:
CMS is about to release a new version of WMAgent based entirely on Python 3. They have asked to be able to update our agent by Monday evening (23/08), so I will not inject any new workflows before the upgrade. I expect the job queue to drain by late on Sunday.
Please set your CMS application to No New Tasks by then.
34) Message boards : CMS Application : New Version 50.00 (Message 7184)
Posted 14 May 2021 by ivan
Post:
First investigations suggest that this is due to the condor jobs not matching VMs with more than one CPU, hence the familiar "no jobs" message. I'll need to get more specialists involved to find out when and why this changed -- it looks like it comes from the WMAgent side rather than CMS@home-dev itself.

Ah, it's the infamous Ascension Day long weekend at CERN (it always catches me out) so I don't expect much response from anyone until Monday at the earliest.
35) Message boards : CMS Application : New Version 50.00 (Message 7183)
Posted 12 May 2021 by ivan
Post:
First investigations suggest that this is due to the condor jobs not matching VMs with more than one CPU, hence the familiar "no jobs" message. I'll need to get more specialists involved to find out when and why this changed -- it looks like it comes from the WMAgent side rather than CMS@home-dev itself.
36) Message boards : CMS Application : New Version 50.00 (Message 7181)
Posted 11 May 2021 by ivan
Post:
The docker message is harmless -- we don't use docker.
Sorry I haven't been following this. I've had a rough year, and when I can get motivated I have to concentrate on getting the "mainstream" version into Production ...an endless round of meetings, reports, meetings...
I'll fire up an instance and see what I see.
37) Message boards : News : CMS@Home disruption this week (Message 6865)
Posted 29 Nov 2019 by ivan
Post:
OK, thanks to great efforts by the CMS & CERN IT teams, a workaround is in place and we are able to run jobs again! I've submitted a small batch and have jobs running on my boxen. I'll submit a larger batch later, and take the opportunity to increase the job size as the average run-time is less than I would prefer. This should increase our efficiency.
38) Message boards : News : CMS@Home disruption this week (Message 6853)
Posted 27 Nov 2019 by ivan
Post:
It appears that a database intervention at CERN went badly, leaving our data tables empty and us unable to submit new CMS@Home jobs. Advice is that it will take several days to recover -- and on top of that, some of the major players are in the USA, which has holidays for the rest of this week. I'll keep an eye on it, but I'm doubtful we'll be running again this week. Sorry 'bout that!
Happy Thanksgiving...
39) Message boards : News : CMS job shortage Wednesday 13th November (Message 6821)
Posted 14 Nov 2019 by ivan
Post:
OK, jobs are available now, so you can start running CMS tasks again.
40) Message boards : News : CMS job shortage Wednesday 13th November (Message 6812)
Posted 11 Nov 2019 by ivan
Post:
CMS IT will be installing a new version of WMAgent on Wednesday. This will impact job availability for the duration of the intervention. We might be able to eliminate the little gremlin that's been plaguing us for the last few weeks, too.
So, please set your CMS processors to No New Tasks sometime tomorrow, Tuesday 12th, so that current tasks will stop requesting new jobs before the queues get cut. I'll let you know when jobs are available again.
Thanks.

