Message boards : News : New jobs available
Author | Message |
---|---|
Joined: 20 Jan 15 Posts: 1139 Credit: 8,310,612 RAC: 0 |
OK, the database fix appears to have been applied; I have been able to submit a new batch of jobs. |
Joined: 9 Apr 15 Posts: 57 Credit: 230,221 RAC: 0 |
Yep, done 2 already. |
Joined: 20 Jan 15 Posts: 1139 Credit: 8,310,612 RAC: 0 |
"Yep, done 2 already." I make it 3 now. :-) |
Joined: 20 Mar 15 Posts: 243 Credit: 886,442 RAC: 0 |
I've followed up two jobs from the 50-event task, 281 and 305. Both were started but abandoned when the hosts were shut down. Dashboard shows my IP but the (presumably) final host's start and stop times. I can't see any indication on Dashboard that the jobs weren't run to completion on the hosts which originally picked them up. No retries are shown. So, at least for successful jobs, the high retry rate isn't due to these "abandoned" jobs: they show up as normally successful jobs. |
Joined: 13 Feb 15 Posts: 1206 Credit: 886,953 RAC: 534 |
From the batch you submitted today, 151102_084842:ireid_crab_CMS_at_Home_TTbar_50ev_3, the first 3 returned jobs failed with 8001 / CMS exception (CMSSW). All 3 came from the same IP and ran for (too) short a time to be true. Edit: the 4th job finished with a success code. |
Joined: 20 Jan 15 Posts: 1139 Credit: 8,310,612 RAC: 0 |
"From your today submitted batch 151102_084842:ireid_crab_CMS_at_Home_TTbar_50ev_3 the first 3 returned jobs failed with 8001 / CMS exception (CMSSW)" I see three job logs with 8001 errors (7, 17, and 23), and they do all come from the same machine. It has 13 errors in all today. I'll see what I can find. |
Joined: 16 Aug 15 Posts: 966 Credit: 1,211,816 RAC: 0 |
Some of our computers have been reassigned to another backfill batch of 200 jobs. Looks like they are failing again. |
©2025 CERN