Message boards : Number crunching : Job queue empty!!

Crystal Pellet
Volunteer tester

Joined: 13 Feb 15
Posts: 1188
Credit: 861,609
RAC: 15
Message 273 - Posted: 22 Apr 2015, 13:01:19 UTC

ERROR:root:No message received! Nothing to do!

For over an hour now . . .

Phil
Joined: 9 Apr 15
Posts: 57
Credit: 230,221
RAC: 0
Message 277 - Posted: 22 Apr 2015, 19:06:58 UTC - in response to Message 273.  
Last modified: 22 Apr 2015, 19:08:02 UTC

ERROR:root:No message received! Nothing to do!

For over an hour now . . .


Over 8 hours now. Shall we leave them running, pause them for a while, or what?

Ben Segal
Volunteer moderator
Volunteer developer
Volunteer tester
Joined: 12 Sep 14
Posts: 65
Credit: 544
RAC: 0
Message 284 - Posted: 23 Apr 2015, 7:54:05 UTC - in response to Message 280.  

ERROR:root:No message received! Nothing to do! . . .

I'm not sure I understand... Where is this error message coming from and what job queue are you saying is empty? I see plenty of tasks available on the server status page and my hosts haven't had any problem getting work, but maybe that's not what you are referring to.

CP is referring to the CMS job queue, which feeds jobs from CERN into the VM. The error message he sees is on the VM's 5th console (click the "Show VM Console" button, then press CTRL-ALT-F5 to select the 5th screen). Right now there is a problem at the CMS end, which the admins know about and will fix as soon as possible.

The BOINC tasks are what you see on the server status page, and there's no shortage of these: each BOINC task runs for 24 hours and simply starts a new VM each time.
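
For anyone wondering where a line like "ERROR:root:No message received! Nothing to do!" comes from: the "ERROR:root:" prefix is what Python's logging module prints by default for the root logger. Below is a minimal sketch of how a polling agent could produce it. The in-memory queue and the `poll_once` helper are hypothetical stand-ins, not the actual CMS agent code, which isn't shown in this thread.

```python
import io
import logging
import queue


def poll_once(job_queue):
    """Try to fetch one job; log an error and return None if the queue is empty."""
    try:
        return job_queue.get_nowait()
    except queue.Empty:
        # Logged on the root logger; the default format is LEVEL:name:message.
        logging.error("No message received! Nothing to do!")
        return None


# Capture the log output in a string so the resulting line can be inspected.
log_stream = io.StringIO()
logging.basicConfig(stream=log_stream, force=True)

jobs = queue.Queue()             # hypothetical stand-in for the CMS job queue
empty_result = poll_once(jobs)   # the queue is empty, so this logs the error

jobs.put("job-1")                # hypothetical job identifier
next_job = poll_once(jobs)       # now a job is available and nothing is logged

logged_line = log_stream.getvalue().strip()
print(logged_line)
```

The point of the sketch is only that an empty queue and a healthy BOINC task can coexist: the task keeps the VM alive while the agent inside keeps polling and logging.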

Patience...

Laurence
Project administrator
Project developer
Project tester
Joined: 12 Sep 14
Posts: 1069
Credit: 334,882
RAC: 0
Message 285 - Posted: 23 Apr 2015, 10:32:28 UTC - in response to Message 284.  

More jobs have been submitted.

Phil
Joined: 9 Apr 15
Posts: 57
Credit: 230,221
RAC: 0
Message 286 - Posted: 23 Apr 2015, 15:42:33 UTC - in response to Message 285.  

Cheers!
Let's un-suspend the tasks and see what happens.

Magic Quantum Mechanic
Joined: 8 Apr 15
Posts: 781
Credit: 12,363,955
RAC: 4,138
Message 287 - Posted: 23 Apr 2015, 20:07:05 UTC

All is well here.

I might try this on another OS version next to see how that goes.
Mad Scientist For Life

Phil
Joined: 9 Apr 15
Posts: 57
Credit: 230,221
RAC: 0
Message 288 - Posted: 23 Apr 2015, 22:12:10 UTC - in response to Message 286.  

Cheers!
Let's un-suspend the tasks and see what happens.


Ahh well, my VM sulked instead of restarting.
Aborted and running a new one.

Magic Quantum Mechanic
Joined: 8 Apr 15
Posts: 781
Credit: 12,363,955
RAC: 4,138
Message 289 - Posted: 23 Apr 2015, 23:50:44 UTC - in response to Message 288.  

Cheers!
Let's un-suspend the tasks and see what happens.


Ahh well, my VM sulked instead of restarting.
Aborted and running a new one.



I see you are running a Win8.1 OS, which happens to be the one I was thinking about testing next (Win7 works on a desktop and on my laptop, along with vLHC and Atlas at the same time).

You seem to get about every other task to complete.

So 8.1 must work if the other things are running properly.

Were you changing the *preferences* while it was running?

http://boincai05.cern.ch/CMS-dev/result.php?resultid=55937

And, as we do with Atlas and on rare occasions vLHC, check for any leftover *image.vdi that needs to be *removed*, since at times they just sit there instead of being removed automatically.
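
A quick way to look for such leftovers is sketched below. It only lists files matching *image.vdi and deletes nothing. The directory layout and file names in the demonstration are assumptions; point `find_leftover_images` (a hypothetical helper) at wherever your BOINC data actually lives.

```python
import tempfile
from pathlib import Path


def find_leftover_images(base_dir):
    """Return all files matching *image.vdi below base_dir, sorted by path."""
    return sorted(p for p in Path(base_dir).rglob("*image.vdi") if p.is_file())


# Demonstration on a throwaway directory tree (hypothetical layout and names).
with tempfile.TemporaryDirectory() as tmp:
    slot = Path(tmp) / "slots" / "0"
    slot.mkdir(parents=True)
    (slot / "vm_image.vdi").touch()   # a leftover disk image
    (slot / "stderr.txt").touch()     # an unrelated file that must not match
    leftovers = [p.name for p in find_leftover_images(Path(tmp) / "slots")]

print(leftovers)
```

Make sure no task is still using an image before removing it by hand.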


Keiken
Joined: 2 Jul 15
Posts: 15
Credit: 140,962
RAC: 0
Message 450 - Posted: 2 Jul 2015, 20:39:03 UTC

Hi, I just started my first task. It's been running for 4.5 hours now and the whole CMSJobAgent-stderr.log is full of "ERROR:root:No message received! Nothing to do!", with not a single line that is different from that. Is everything normal?

PDW
Joined: 20 May 15
Posts: 217
Credit: 6,170,322
RAC: 2,599
Message 451 - Posted: 2 Jul 2015, 21:09:15 UTC - in response to Message 450.  
Last modified: 2 Jul 2015, 21:16:29 UTC

Hi,

The tasks run for 24 hours. Having checked my CMSJobAgent-stderr.log file, I do have a lot of those error messages, but I also have some others indicating that work is happening.

What does your local stderr.txt file (in the Slots folder) show?

Mine has lines at the end like...


2015-07-02 18:27:00 (6984): Status Report: Job Duration: '86400.000000'
2015-07-02 18:27:00 (6984): Status Report: Elapsed Time: '24000.000000'
2015-07-02 18:27:00 (6984): Status Report: CPU Time: '3446.873295'
2015-07-02 20:06:57 (6984): Status Report: Job Duration: '86400.000000'
2015-07-02 20:06:57 (6984): Status Report: Elapsed Time: '30000.495445'
2015-07-02 20:06:57 (6984): Status Report: CPU Time: '4248.531234'
2015-07-02 21:46:55 (6984): Status Report: Job Duration: '86400.000000'
2015-07-02 21:46:55 (6984): Status Report: Elapsed Time: '36000.495445'
2015-07-02 21:46:55 (6984): Status Report: CPU Time: '5017.319762'

indicating that something is being done!
Also, the CPU time used (in BOINC manager) for the task increases.
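
If you'd rather not eyeball the numbers, the Status Report lines can be parsed and compared. This is a minimal sketch, assuming the line format matches the excerpt above exactly; `parse_status_reports` is a hypothetical helper, not part of BOINC.

```python
import re
from datetime import datetime

# Matches e.g.: 2015-07-02 18:27:00 (6984): Status Report: CPU Time: '3446.873295'
STATUS_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \(\d+\): "
    r"Status Report: (?P<key>[^:]+): '(?P<value>[\d.]+)'$"
)


def parse_status_reports(text):
    """Group Status Report lines by timestamp; return dicts in time order."""
    reports = {}
    for line in text.splitlines():
        match = STATUS_RE.match(line.strip())
        if match:
            ts = datetime.strptime(match["ts"], "%Y-%m-%d %H:%M:%S")
            reports.setdefault(ts, {})[match["key"]] = float(match["value"])
    return [reports[ts] for ts in sorted(reports)]


sample = """\
2015-07-02 18:27:00 (6984): Status Report: Job Duration: '86400.000000'
2015-07-02 18:27:00 (6984): Status Report: Elapsed Time: '24000.000000'
2015-07-02 18:27:00 (6984): Status Report: CPU Time: '3446.873295'
2015-07-02 20:06:57 (6984): Status Report: Job Duration: '86400.000000'
2015-07-02 20:06:57 (6984): Status Report: Elapsed Time: '30000.495445'
2015-07-02 20:06:57 (6984): Status Report: CPU Time: '4248.531234'
2015-07-02 21:46:55 (6984): Status Report: Job Duration: '86400.000000'
2015-07-02 21:46:55 (6984): Status Report: Elapsed Time: '36000.495445'
2015-07-02 21:46:55 (6984): Status Report: CPU Time: '5017.319762'
"""

reports = parse_status_reports(sample)
first, last = reports[0], reports[-1]

# Growth between the first and last report shows whether work is being done.
cpu_delta = last["CPU Time"] - first["CPU Time"]
elapsed_delta = last["Elapsed Time"] - first["Elapsed Time"]
print(f"CPU used: {cpu_delta:.0f} s over {elapsed_delta:.0f} s elapsed")
```

A CPU time that keeps growing between reports is the sign to look for; if it stays flat, the VM is probably idle.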

Keiken
Joined: 2 Jul 15
Posts: 15
Credit: 140,962
RAC: 0
Message 452 - Posted: 2 Jul 2015, 23:05:33 UTC - in response to Message 451.  

Ah okay thanks PDW! My 'CMSJobAgent-stderr.log' still only consists of the error messages (one every minute!), but in the stderr.txt file I have three completed jobs showing at the end (and of course the CPU time increases in the BOINC manager).

I'd say it is a bit confusing that CMSJobAgent-stderr.log shows only errors, given that it is the file you reach directly through the BOINC manager, unlike the stderr.txt file, for which you have to navigate through the folders.

Maybe write some lines for completed jobs in it too?


One other thing regarding the CMSJobAgent-stderr.log file:

There seem to be some characters in it that can't be shown correctly:

As you can see the first character and the one after the '!' are the ones I'm talking about.
Overall the beginning and the end of the line seem a little bit odd. Is that intended?

PDW
Joined: 20 May 15
Posts: 217
Credit: 6,170,322
RAC: 2,599
Message 453 - Posted: 3 Jul 2015, 8:32:26 UTC - in response to Message 452.  

I just looked again at the CMSJobAgent-stderr.log file; mine was a screenful of error messages, but I couldn't find how to scroll back to see earlier ones. To be honest, I looked once when I first started running the tasks and just found it easier to look at my local log files.

The weird character looks like the start of a formatting command that couldn't be handled (interpreted) by the output device so was output as best it could. You need to get an answer from a developer if you want specifics but I don't believe it is anything to worry about.

Keiken
Joined: 2 Jul 15
Posts: 15
Credit: 140,962
RAC: 0
Message 454 - Posted: 3 Jul 2015, 10:59:04 UTC - in response to Message 453.  

That's what I thought too. I just wanted to point it out to the developers.

Richard Haselgrove
Joined: 4 May 15
Posts: 64
Credit: 55,584
RAC: 0
Message 455 - Posted: 3 Jul 2015, 13:14:42 UTC
Last modified: 3 Jul 2015, 13:14:50 UTC

We should get some interesting data from the latest LHC design features



(scanned from tomorrow's edition of the UK magazine "New Scientist". Work that one out if you can!)

PDW
Joined: 20 May 15
Posts: 217
Credit: 6,170,322
RAC: 2,599
Message 456 - Posted: 3 Jul 2015, 13:29:02 UTC - in response to Message 455.  

Serious design flaw right there.

The picnic area should be much, much closer to the pub!

Richard Haselgrove
Joined: 4 May 15
Posts: 64
Credit: 55,584
RAC: 0
Message 457 - Posted: 3 Jul 2015, 13:49:40 UTC - in response to Message 456.  

That's why you should always consult with your users before finalising the design!

Richard Haselgrove
Joined: 4 May 15
Posts: 64
Credit: 55,584
RAC: 0
Message 458 - Posted: 3 Jul 2015, 14:58:55 UTC - in response to Message 452.  

There seem to be some characters in it that can't be shown correctly:

As you can see the first character and the one after the '!' are the ones I'm talking about.
Overall the beginning and the end of the line seem a little bit odd. Is that intended?

Those look like some old VT-series terminal escape characters, perhaps designed to set some screen attribute and turn it off again.

Magic Quantum Mechanic
Joined: 8 Apr 15
Posts: 781
Credit: 12,363,955
RAC: 4,138
Message 459 - Posted: 3 Jul 2015, 20:21:11 UTC
Last modified: 3 Jul 2015, 20:31:00 UTC

I think it would run better if the Pub and the Picnic Area used the Wormhole so you are ready for a voyage to another Galaxy and a planet that has another Pub and Picnic Area



ivan
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 20 Jan 15
Posts: 1139
Credit: 8,310,612
RAC: 444
Message 460 - Posted: 6 Jul 2015, 13:54:42 UTC - in response to Message 458.  

There seem to be some characters in it that can't be shown correctly:

As you can see the first character and the one after the '!' are the ones I'm talking about.
Overall the beginning and the end of the line seem a little bit odd. Is that intended?

Those look like some old VT-series terminal escape characters, perhaps designed to set some screen attribute and turn it off again.

Yes: (escape)[(number)m, which turns colour text attributes on and off. On the right terminal they show up as red or some such.
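
If the stray characters get in the way when reading the log in a plain viewer, they can be filtered out. A minimal sketch follows; the sample line is an assumption about what the raw bytes look like (colour code 31 at the start, reset code 0 after the "!"), since the original screenshot isn't reproduced here.

```python
import re

# ESC [ <numbers separated by ;> m is the colour/attribute escape sequence.
SGR_RE = re.compile(r"\x1b\[[0-9;]*m")


def strip_sgr(line):
    """Remove colour/attribute escape sequences, leaving the plain text."""
    return SGR_RE.sub("", line)


# Hypothetical raw log line: red on (31) before the message, reset (0) after.
raw = "\x1b[31mERROR:root:No message received! Nothing to do!\x1b[0m"
clean = strip_sgr(raw)
print(clean)
```

Only the (escape)[(number)m form is removed here; other terminal control sequences, if any were present, would be left untouched.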

Phil
Joined: 9 Apr 15
Posts: 57
Credit: 230,221
RAC: 0
Message 461 - Posted: 6 Jul 2015, 14:12:57 UTC

Okay, so back to the point...
It seems that the CERN job queue (databridge, copilot, whatever) has been unable to supply jobs for some weeks.
Do you guys want us to continue running BOINC jobs, or leave off until an announcement?


©2024 CERN