Possible disruption in the next several hours
Author | Message |
---|---|
Send message Joined: 20 Jan 15 Posts: 1139 Credit: 8,310,612 RAC: 46 |
Mea culpa! I realised today that I'd accidentally typed one zero too many in the WMAgent request for the current batch, and launched ten times too many jobs! Alan tells me this could overload the agent, so I've submitted a "normal" batch and have set this one to "force-complete". This will clear out its queue, but I don't know exactly what effect it will have on currently running jobs. So some jobs may be reported as failed, or otherwise faulty, but once the tasks start picking up jobs from the new batch it should all clear up. My apologies, I hope it's not too traumatic. |
Send message Joined: 20 Jan 15 Posts: 1139 Credit: 8,310,612 RAC: 46 |
OK, the new batch has started queueing. There was a hiatus of about 35 minutes with no jobs in the queue. |
Send message Joined: 20 Jan 15 Posts: 1139 Credit: 8,310,612 RAC: 46 |
There will be an upgrade to the HTCondor schedd this afternoon. I'm told it should make no significant disturbance, but be warned... [Added] Done, with no problems seen. [/Added] |
Send message Joined: 20 Jan 15 Posts: 1139 Credit: 8,310,612 RAC: 46 |
Oops, something has gone wrong now. There is a failure in the WMAgent -- it still says there are jobs available, but other monitors and Dashboard say that they have run out. I suggest you set No New Tasks until I round up the CERN posse. |
Send message Joined: 20 Jan 15 Posts: 1139 Credit: 8,310,612 RAC: 46 |
Problem fixed & jobs in the queue. Time to restart. |
©2024 CERN