Message boards : Number crunching : 24 hours just isn't what it used to be
Author | Message |
---|---|
Joined: 20 May 15 Posts: 217 Credit: 6,193,119 RAC: 0 |
Have I missed something? Why are the 24-hour jobs no longer stopping after circa 24 hours? Three BOINC tasks that started about lunchtime today have all completed and validated in only 6 to 7 hours. Was this to be expected? |
Joined: 16 Aug 15 Posts: 966 Credit: 1,211,816 RAC: 0 |
It will finish the "run" that it is on when it goes past the 24h limit. A "run" takes about 6h, so in the worst case a task can run for 30h. This was done so as not to lose the job it was working on when hitting 24h. |
Joined: 20 Jan 15 Posts: 1139 Credit: 8,310,612 RAC: 0 |
"Have I missed something?" You may have fallen foul of the new rule that stops a task if all the jobs in one "run" terminate with non-zero exit codes. I'll try to investigate later (just got in from work, haven't watched the news yet...). |
Joined: 20 May 15 Posts: 217 Credit: 6,193,119 RAC: 0 |
The stderr output for the tasks doesn't have the Guest Log details in it that the previous version did. It does have the line: `Detected: Heatbeat check (file: '$s' every 0.000000 seconds)`. Is it really checking every 0.000000 seconds? PS. I wouldn't bother with the news; if you saw yesterday's and you'll see tomorrow's, you'll be fine :-) |
Joined: 12 Sep 14 Posts: 1128 Credit: 339,230 RAC: 11 |
I just see the "VM Completion File Detected" message but no reason given. Looks like the logging has broken. Will investigate. |
Joined: 20 Jan 15 Posts: 1139 Credit: 8,310,612 RAC: 0 |
My quick check only gives status 65 non-zero exit codes for you, from Wed & Thurs; those would be the site-local-config file errors we accidentally introduced. Nothing for today. Which host(s)? You don't have much RAC on any of them so I'm not sure which one(s) you are using -- peering behind the curtains with admin rights doesn't give the convenient "last time host contacted server" column. |
Joined: 20 May 15 Posts: 217 Credit: 6,193,119 RAC: 0 |
Another one just did the same, hosts 471, 472, 485 & 761 |
Joined: 16 Aug 15 Posts: 966 Credit: 1,211,816 RAC: 0 |
Just found this in cron-stderr. Maybe related. The first line is `` touch: cannot touch `/home/boinc/shared/heartbeat': No such file or directory ``, followed by `sudo: sorry, you must have a tty to run sudo` repeated many times. |
Joined: 12 Sep 14 Posts: 1128 Credit: 339,230 RAC: 11 |
I think I know why this might be failing. Are the consoles working for you? Will try to look at it today if I can. The tasks and jobs seem to be running fine though. |
Joined: 20 May 15 Posts: 217 Credit: 6,193,119 RAC: 0 |
They were yesterday, haven't checked today (will be another 30 minutes before I can). |
Joined: 20 May 15 Posts: 217 Credit: 6,193,119 RAC: 0 |
Consoles are fine, shows the events being processed etc. |
Joined: 12 Sep 14 Posts: 1128 Credit: 339,230 RAC: 11 |
I think that with the new image the /etc/sudoers file has changed and now contains `requiretty`. I have just pushed an update to the bootstrap script that removes it; it should take effect once CVMFS is updated and a new task is started. I will check the output of some tasks this evening to see if it worked. |
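
For readers trying to follow the two stopping rules mentioned in the thread (finish the current "run" even past the 24h limit, and stop early if every job in a run exits non-zero), here is a rough Python sketch. It is not the project's actual code: the function names, the idea that the limit is only checked between runs, and the 6h default run length are assumptions drawn from the posts above.

```python
from typing import Sequence

TASK_LIMIT_HOURS = 24.0  # limit quoted in the thread


def should_start_next_run(elapsed_hours: float,
                          last_run_exit_codes: Sequence[int]) -> bool:
    """Hypothetical helper: decide whether a task begins another run."""
    # Rule from the thread: stop the task if every job in the last run failed.
    if last_run_exit_codes and all(code != 0 for code in last_run_exit_codes):
        return False
    # Assumption: a run in progress is never interrupted, so the 24h limit
    # is only consulted between runs.
    return elapsed_hours < TASK_LIMIT_HOURS


def worst_case_hours(run_hours: float = 6.0,
                     limit_hours: float = TASK_LIMIT_HOURS) -> float:
    """Upper bound on task length if a run starts just before the limit."""
    return limit_hours + run_hours
```

Under these assumptions a task whose first run ends after 6 to 7 hours with only failed jobs would stop there, while a healthy task could run for up to roughly 30 hours. Whether that is what happened on the hosts listed above is not established in the thread.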
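
The bootstrap change itself is not shown in the thread. As an illustration only, removing a `requiretty` default from sudoers-style text might look like the sketch below. The function name `strip_requiretty` is hypothetical, and the code works on a string rather than touching `/etc/sudoers` directly.

```python
import re

# Matches an uncommented global "Defaults requiretty" line. Scoped variants
# such as "Defaults:alice requiretty" are deliberately left untouched.
_REQUIRETTY = re.compile(r"^\s*Defaults\s+requiretty\s*$")


def strip_requiretty(sudoers_text: str) -> str:
    """Return sudoers text with global 'Defaults requiretty' lines removed."""
    lines = sudoers_text.splitlines(keepends=True)
    return "".join(line for line in lines if not _REQUIRETTY.match(line))


sample = "Defaults    requiretty\nDefaults    !visiblepw\nroot ALL=(ALL) ALL\n"
cleaned = strip_requiretty(sample)
```

With `requiretty` gone, `sudo` calls made from cron (which has no terminal) should no longer fail with "you must have a tty to run sudo". Any real edit to the sudoers file is normally validated with `visudo -c` before it is relied on.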
©2025 CERN