New version 5.00
Author | Message |
---|---|
Send message Joined: 13 Feb 15 Posts: 1188 Credit: 862,257 RAC: 25 |
That means the VirtualBox COM server (VBoxSVC.exe) can't communicate fast enough with the wrapper, which is mostly caused by an overly busy system. The VM periodically 'touches' a heartbeat file to let vboxwrapper know that the VM is still alive. If the wrapper does not detect a change to the file's modification time within the specified interval, it assumes the worst and aborts the job :-( |
Send message Joined: 29 Sep 15 Posts: 5 Credit: 454,762 RAC: 0 |
2 tasks are using 0.632 CPUs. What is that? I have 5 running tasks for 4 threads now. Edit: I aborted and reported the tasks and reset the project. It looks fine now. |
Send message Joined: 8 Apr 15 Posts: 781 Credit: 12,422,653 RAC: 3,337 |
I got a couple of short Valids but am now on the 2nd batch of this. I aborted the 1st batch after 12+ hours and just noticed the next batch after 8+ hours (just got out of https://lhcathomedev.cern.ch/lhcathome-dev/show_host_detail.php?hostid=1816 |
Send message Joined: 8 Apr 15 Posts: 781 Credit: 12,422,653 RAC: 3,337 |
We need to do something to have the server stop these Cranky-failed tasks in the first 10 minutes instead of letting them run until we check and find out that they are running for no reason. The failure happens in the first 3 minutes of booting up, so there is no reason to keep them running. First thing this morning I had to check all of my running tasks to see if they were actually working, and I found that half of them had been running for 8 hours as Cranky-failed. I had to abort them and then watch the new ones start so I knew for sure whether to let them run or abort them. That is a waste of time, and I know having this over at LHC will not work since nobody over there likes wasting computer time. I got most of mine running again except 3 cores that refused to get Cranky to run... and well, it is time to watch some Sunday football. Since 2 of those cores are here watching the game with me, I will see if I can talk them into running before it gets Cranky and suspends for a few hours. THIS is what we want to see in the first 5 minutes (actually less) |
Send message Joined: 8 Apr 15 Posts: 781 Credit: 12,422,653 RAC: 3,337 |
These sure can be a problem if you don't have an internet speed of at least 1 Mbps, because they have to start running in less than 3 minutes or this happens. It gets even worse, since they will just keep running these Invalid tasks for many, many hours unless you happen to check a particular task and see this in the log or in the VM Console, where you can see whether it is running a Valid task or not (as I have already shown in snapshots). I don't know why these have to start running in less than 3 minutes to actually work, and it seems you must have a way to get around this. I saw it mentioned over at the LHC board, and I know some of them live a few miles from CERN with faster-than-lightning ISPs, so they only see it once in a while and wonder why they get these Invalids. Well, I watch this happen every day, and to make sure they all start I have to do it after 2am, when my ISP speed is back to the fastest I get and I can start up 24 of these. BUT if I try this at any other time (other than 2am-8am) it is pure luck to get any of them to start. Just now, for example, I did a speed test first and it was right on the line at around 1 Mbps. I got one to start up and tried a second task, and it seemed to still be running fast enough, BUT as I watched the VM Console I saw it took longer than the 1 min 45 seconds, so it jumped to the next pages and then told me it Failed. So since it is 1am now I will just wait until 2am and get them all running again. This type of thing has always been the main problem with VB tasks since day one 9 years ago. |
Send message Joined: 8 Apr 15 Posts: 781 Credit: 12,422,653 RAC: 3,337 |
Well, here I go again. After 2 weeks and 1200 Valid tasks, my satellite ISP throttles me down to dial-up speed for the next 2 weeks, so I can't run any of these Theory VB tasks. If they don't start *running* in less than 3 minutes, they will just run for hours and give you a Failed task. You can see that in the VM Console in the first 3 minutes, so there is no reason to let an Invalid run for hours and hours. It sure would be nice if it didn't demand high-speed internet just to start the VB tasks (since 2011); after they do start, the slow speed does not matter at all. So as UOTD this is pretty much all I will be doing. (And it would be nice if the CMS were fixed so they run on a Windows OS, even though since it is VB they do the same thing as the rest.) HAPPY NEW YEAR |
Send message Joined: 8 Apr 15 Posts: 781 Credit: 12,422,653 RAC: 3,337 |
Something is wrong with these tasks today. https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=2857294 I have about 15 in a row, and the strange part is that they start as if they are going to run as Valids but end up failing. Guest Log: 00:08:32 CET +01:00 2020-01-09: cranky: [ERROR] Container 'runc' terminated with status code 1 .[ERROR] Job Failed |
Send message Joined: 13 Feb 15 Posts: 1188 Credit: 862,257 RAC: 25 |
"Something is wrong with these tasks today." See https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5265 |
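
For anyone who wants to check their own tasks the way this thread describes, here is a minimal Python sketch of the two ideas discussed above: the heartbeat check from the first reply (has the file's modification time changed within the interval?) and a scan of the guest log for the failure line quoted earlier. This is not vboxwrapper's actual code. The function names, the file paths, and the interval value are all hypothetical, and the failure marker is simply copied from the guest log line posted in this thread.

```python
import os
import time

# Assumption: marker text copied from the guest log line quoted in this thread.
FAILURE_MARKER = "[ERROR] Job Failed"


def heartbeat_is_stale(path, interval_s, now=None):
    """Return True if the file at `path` was not modified in the last `interval_s` seconds."""
    now = time.time() if now is None else now
    try:
        mtime = os.path.getmtime(path)
    except OSError:
        # Assumption: a missing or unreadable heartbeat file counts as stale.
        return True
    return (now - mtime) > interval_s


def job_failed(log_text):
    """Return True if the guest log text contains the failure marker."""
    return FAILURE_MARKER in log_text
```

A watcher script could call `heartbeat_is_stale()` on the heartbeat file and `job_failed()` on the guest log text every minute or so, and flag the task for aborting as soon as either returns True, rather than finding out hours later.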
©2024 CERN