Message boards : Number crunching : Job information in Task report
Author | Message |
---|---|
Joined: 20 Jan 15 Posts: 1139 Credit: 8,310,612 RAC: 0 |
From now on you should see some job information, including exit status, in the task reports on your account pages. E.g. in http://lhcathome2.cern.ch/vLHCathome/result.php?resultid=5428296 there is:

2016-02-12 14:07:07 (15624): Guest Log: ======== gWMS-CMSRunAnalysis.sh STARTING at Fri Feb 12 07:12:20 GMT 2016 on 32157-79553-26164 ========
2016-02-12 14:07:07 (15624): Guest Log: Local time : Fri Feb 12 07:12:20 GMT 2016
2016-02-12 14:07:07 (15624): Guest Log: Current system : Linux 32157-79553-26164 3.10.64-85.cernvm.x86_64 #1 SMP Fri Jan 9 09:53:29 CET 2015 x86_64 x86_64 x86_64 GNU/Linux
2016-02-12 14:07:07 (15624): Guest Log: .....
2016-02-12 14:07:07 (15624): Guest Log: ====== Fri Feb 12 08:29:34 2016: Finished remote stageout of user output files (status 0).
2016-02-12 14:07:07 (15624): Guest Log: Will not inject transfer requests to ASO for the user output files, because they were staged out directly to the permanent storage.
2016-02-12 14:07:07 (15624): Guest Log: ====== Fri Feb 12 08:29:34 2016: cmscp.py FINISHING (status 0).
2016-02-12 14:07:07 (15624): Guest Log: ======== Stageout at Fri Feb 12 08:29:38 GMT 2016 FINISHING (short status 0) ========
2016-02-12 14:07:07 (15624): Guest Log: ======== gWMS-CMSRunAnalysis.sh FINISHING at Fri Feb 12 08:29:38 GMT 2016 on 32157-79553-26164 with (short) status 0 ========
2016-02-12 14:07:07 (15624): Guest Log: Local time: Fri Feb 12 08:29:38 GMT 2016
2016-02-12 14:07:07 (15624): Guest Log: Short exit status: 0
2016-02-12 14:07:07 (15624): Guest Log: Job Running time in seconds: 4638

i.e. the first three and last eight lines from the _condor_stdout file. We hope this will go some way towards providing the information you have been asking for. |
Joined: 20 Jan 15 Posts: 1139 Credit: 8,310,612 RAC: 0 |
Feedback? |
Joined: 16 Aug 15 Posts: 966 Credit: 1,211,816 RAC: 0 |
Hi Ivan, Generally it is very good. It has all the information a volunteer might want. However, I would remove the following lines per job:

======== gWMS-CMSRunAnalysis.sh STARTING at Sat Feb 13 01:04:13 GMT 2016 on 277-617-13516 ========
Current system : Linux 277-617-13516 3.10.64-85.cernvm.x86_64 #1 SMP Fri Jan 9 09:53:29 CET 2015 x86_64 x86_64 x86_64 GNU/Linux |
Joined: 20 Jan 15 Posts: 1139 Credit: 8,310,612 RAC: 0 |
Thanks for the comment. We're trying to keep it simple, hence just using "head -3" and "tail -8". Now that we have it working, refinements are, as Laurence said to me, just a SMOP. :-) |
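A minimal sketch of the "head -3" plus "tail -8" extraction mentioned in this post, for anyone who wants to try it on a local copy of a _condor_stdout file. The function name `extract_job_summary` and the "....." separator line are assumptions for illustration; this is not the script the project actually runs.

```shell
#!/bin/sh
# Hypothetical helper (not the project's script): print the first three and
# the last eight lines of a job's _condor_stdout file, with a "....." line
# marking the part that is skipped in between.
extract_job_summary() {
    file=$1
    head -n 3 "$file"    # start-of-job banner, local time, system info
    echo "....."         # placeholder for the omitted middle of the log
    tail -n 8 "$file"    # stageout and exit-status lines
}
```

It would be called as `extract_job_summary path/to/_condor_stdout`. For files shorter than eleven lines the two parts overlap, so some lines appear twice.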
Joined: 13 Feb 15 Posts: 1217 Credit: 908,429 RAC: 1,389 |
Hello Ivan, Towards the end of a task (one running longer than 24 hours) the info is not always consistent. I saw on the Console that the run had ended and also the INFO "Time exceeded. Shutting down!", but that info and the extracted lines from the jobs of the last run were not in the stderr.

My last result:

2016-02-14 20:32:37 (5716): Status Report: Elapsed Time: '96666.741228'
2016-02-14 20:32:37 (5716): Status Report: CPU Time: '87505.734531'
2016-02-14 21:55:38 (5716): VM Completion File Detected.
2016-02-14 21:55:38 (5716): Powering off VM.

and a correct result from another cruncher:

2016-02-14 09:08:01 (70714): Guest Log: [INFO] CMS glidein Run 8 ended
2016-02-14 09:08:01 (70714): Guest Log: Log extracts for Run 8 jobs
. . Job extracts .
2016-02-14 09:09:02 (70714): Guest Log: [INFO] Time exceeded. Shutting down!
2016-02-14 09:09:02 (70714): VM Completion File Detected.
2016-02-14 09:09:02 (70714): Powering off VM.

In my result the line "Guest Log: [INFO] CMS glidein Run XX ended" was also missing, which is probably why the other info is missing too. XX should be 17. |
Joined: 16 Aug 15 Posts: 966 Credit: 1,211,816 RAC: 0 |
It would also be good to include the job number, as one could look it up in dashboard. |
Joined: 16 Aug 15 Posts: 966 Credit: 1,211,816 RAC: 0 |
I have noticed that the info for the last run is not listed. EDIT |
Joined: 12 Sep 14 Posts: 1128 Credit: 339,230 RAC: 19 |
Something doesn't make sense. In your task I see:

2016-02-15 10:44:09 (4340): Guest Log: [INFO] Starting CMS Application - Run 4

But then I see:

2016-02-15 13:41:04 (4340): Guest Log: [INFO] CMS glidein Run 1 ended

After that message it should immediately print "Log extracts for Run 1 jobs", but it doesn't.

Why is your version Anonymous platform (CPU)? http://boincai05.cern.ch/CMS-dev/result.php?resultid=114497 |
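The expectation in this post (an end-of-run message followed immediately by the matching log-extracts line) can be checked mechanically on a saved copy of the task's stderr. The sketch below is only an illustration: the function name `check_run_extracts` is hypothetical, and the two message formats are assumed from the log lines quoted in this thread.

```shell
#!/bin/sh
# Hypothetical checker: for every "CMS glidein Run N ended" line in a saved
# stderr log, verify that the very next line contains
# "Log extracts for Run N jobs". Returns non-zero if any run fails the check.
check_run_extracts() {
    awk '
        BEGIN { bad = 0; pending = "" }
        # The previous line announced the end of run "pending": this line
        # must be the matching log-extracts line.
        pending != "" {
            if ($0 !~ ("Log extracts for Run " pending " jobs")) {
                print "Run " pending ": no log extracts after end-of-run message"
                bad = 1
            }
            pending = ""
        }
        # Remember the run number from an end-of-run message.
        match($0, /CMS glidein Run [0-9]+ ended/) {
            split(substr($0, RSTART, RLENGTH), parts, " ")
            pending = parts[4]
        }
        END {
            # An end-of-run message on the last line has nothing after it.
            if (pending != "") {
                print "Run " pending ": no log extracts after end-of-run message"
                bad = 1
            }
            exit bad
        }
    ' "$1"
}
```

Run against the task discussed here, a check like this would report Run 1, since the end-of-run line is not followed by its extracts.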
Joined: 16 Aug 15 Posts: 966 Credit: 1,211,816 RAC: 0 |
Because I have an app_info.xml to test 2-core operation. This helps only a little, as it only reduces the Linux overhead a bit. In this case it is set to single core. |
Joined: 20 Jan 15 Posts: 1139 Credit: 8,310,612 RAC: 0 |
> It would also be good to include the job number, as one could look it up in dashboard.

We could do that if we included the next line from the head of _condor_stdout -- it's quite a long line though. |
Joined: 12 Sep 14 Posts: 1128 Credit: 339,230 RAC: 19 |
Or this?

grep ^jobNumber ./run-4/glide_iCLO8R/dir_11492/_condor_stdout
jobNumber: 5519 |
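Building on the grep above, a small sketch that prints only the number, which may be handier for looking a job up in the dashboard. The function name `job_number` is hypothetical, and the "jobNumber: N" line format is assumed from the output shown in this post.

```shell
#!/bin/sh
# Hypothetical helper: print just the number from the first line that starts
# with "jobNumber:" in a _condor_stdout file. Prints nothing if no such line
# exists.
job_number() {
    sed -n 's/^jobNumber:[[:space:]]*//p' "$1" | head -n 1
}
```

For the file in this post, `job_number ./run-4/glide_iCLO8R/dir_11492/_condor_stdout` should print 5519, given the grep output shown.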
Joined: 16 Aug 15 Posts: 966 Credit: 1,211,816 RAC: 0 |
or: Output files: step1.root=step1_9573.root |
Joined: 12 Sep 14 Posts: 1128 Credit: 339,230 RAC: 19 |
Update should be in CVMFS in a few hours :) |
©2025 CERN