Message boards : CMS Application : Batch Progress
| Author | Message |
|---|---|
ivan Joined: 20 Jan 15 Posts: 1152 Credit: 8,310,612 RAC: 0 |
[cms005@lcggwms02:~] > cat stats.sh
#!/bin/bash
grep 'NodeStatus ' $1/node_state.txt|sort|uniq -c
Mon May 23 12:30:46
[cms005@lcggwms02:~] > ./stats.sh 160518_203523:ireid_crab_CMS_at_Home_TTbar_50ev_prodB
4173 NodeStatus = 1; /* "STATUS_READY" */
1075 NodeStatus = 3; /* "STATUS_SUBMITTED" */
4745 NodeStatus = 5; /* "STATUS_DONE" */
7 NodeStatus = 6; /* "STATUS_ERROR" */ |
|
Joined: 16 Aug 15 Posts: 967 Credit: 1,216,795 RAC: 51 |
Thanks for the info. How can there be fewer "submitted" than "Ready" or "Done"? |
ivan Joined: 20 Jan 15 Posts: 1152 Credit: 8,310,612 RAC: 0 |
When I send a batch (10,000 jobs in this case), all the jobs start as "ready". Then up to ~1,000 are moved into the queue and become "submitted". As queued jobs are taken up by processes, further jobs move into the queue to replace them, going from "ready" to "submitted"; I think that jobs which are re-queued for a retry are also "submitted". I'm not sure what state running jobs are in; probably also "submitted", since the sum of "idle" and "running" is about the number submitted. And of course jobs which succeed get to "done", while those which fail three tries (or hit other errors) go into "error". Note that all four categories add up to the total number in the batch. |
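A minimal sketch of the bookkeeping described above, assuming input in the `count NodeStatus = N; ...` form that `stats.sh` prints: summing the first column should give back the batch size. The function name `sum_states` is made up for illustration, and the figures are the ones quoted in the first post of this thread.

```shell
#!/bin/bash
# Hypothetical helper: add up the per-state counts produced by
# `sort | uniq -c` on the NodeStatus lines (the count is the first field).
sum_states() {
    awk '/NodeStatus/ { total += $1 } END { print total + 0 }'
}

# Counts quoted from the first stats post in this thread.
sum_states <<'EOF'
4173 NodeStatus = 1; /* "STATUS_READY" */
1075 NodeStatus = 3; /* "STATUS_SUBMITTED" */
4745 NodeStatus = 5; /* "STATUS_DONE" */
   7 NodeStatus = 6; /* "STATUS_ERROR" */
EOF
```

Piping the output of `./stats.sh` for a live batch into `sum_states` would run the same check there.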
|
Joined: 16 Aug 15 Posts: 967 Credit: 1,216,795 RAC: 51 |
Thanks for the explanation, Ivan. I thought "submitted" meant the same as in the dashboard. |
ivan Joined: 20 Jan 15 Posts: 1152 Credit: 8,310,612 RAC: 0 |
|
|
Joined: 16 Aug 15 Posts: 967 Credit: 1,216,795 RAC: 51 |
How is the proxy lease? Does it not need a refresh soon? |
ivan Joined: 20 Jan 15 Posts: 1152 Credit: 8,310,612 RAC: 0 |
"How is the proxy lease?"
160518_203523, so due on the 25th. Current stats:
3776 NodeStatus = 1; /* "STATUS_READY" */
1084 NodeStatus = 3; /* "STATUS_SUBMITTED" */
1 NodeStatus = 4; /* "STATUS_POSTRUN" */
5132 NodeStatus = 5; /* "STATUS_DONE" */
7 NodeStatus = 6; /* "STATUS_ERROR" */ |
ivan Joined: 20 Jan 15 Posts: 1152 Credit: 8,310,612 RAC: 0 |

ivan Joined: 20 Jan 15 Posts: 1152 Credit: 8,310,612 RAC: 0 |
|
|
Joined: 16 Aug 15 Posts: 967 Credit: 1,216,795 RAC: 51 |
Thanks, Ivan! |
ivan Joined: 20 Jan 15 Posts: 1152 Credit: 8,310,612 RAC: 0 |

ivan Joined: 20 Jan 15 Posts: 1152 Credit: 8,310,612 RAC: 0 |

ivan Joined: 20 Jan 15 Posts: 1152 Credit: 8,310,612 RAC: 0 |
|
|
Joined: 16 Aug 15 Posts: 967 Credit: 1,216,795 RAC: 51 |
Congrats! Best one yet. Only 0.32% ERROR. |
ivan Joined: 20 Jan 15 Posts: 1152 Credit: 8,310,612 RAC: 0 |
|
|
Joined: 16 Aug 15 Posts: 967 Credit: 1,216,795 RAC: 51 |
"Looks like we lost a few in the system, though -- I only count 9,816 result files on the data-bridge. Some may yet turn up, but it's doubtful."
Does that include the error jobs? |
Laurence CERN Joined: 12 Sep 14 Posts: 1150 Credit: 342,328 RAC: 0 |
I prefer 99.68% SUCCESS. We should try to understand what happened to the 32 jobs that failed. |
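The percentages in the posts above can be reproduced with a few lines of shell. This is only a sketch: `pct` is a hypothetical helper, the batch size of 10,000 is taken from earlier in the thread, and the last line assumes one result file per successful job, which the thread does not confirm.

```shell
#!/bin/bash
# Hypothetical helper: print N as a percentage of TOTAL, two decimal places.
pct() {
    awk -v n="$1" -v total="$2" 'BEGIN { printf "%.2f\n", 100 * n / total }'
}

total=10000   # batch size quoted earlier in the thread
failed=32     # failed jobs quoted above
files=9816    # result files counted on the data-bridge

pct "$failed" "$total"                  # error share
pct "$(( total - failed ))" "$total"    # success share

# Assumption: one result file per successful job.
echo "$(( total - failed - files ))"    # successful jobs with no result file
```

Under that one-file-per-job assumption, the last line gives the number of successful jobs whose output has not turned up on the data-bridge.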
Laurence CERN Joined: 12 Sep 14 Posts: 1150 Credit: 342,328 RAC: 0 |
If you can give me some examples of missing files I can investigate. |
ivan Joined: 20 Jan 15 Posts: 1152 Credit: 8,310,612 RAC: 0 |

ivan Joined: 20 Jan 15 Posts: 1152 Credit: 8,310,612 RAC: 0 |
|
©2025 CERN