Message boards : General Discussion : The BOINC VM application

maeax

Joined: 22 Apr 16
Posts: 672
Credit: 1,901,223
RAC: 5,079
Message 6499 - Posted: 23 Jul 2019, 13:53:06 UTC

3 days 23 hours so far. This changed just now. Was that done by the system?
2019-07-23 13:52:08 (1356): Status Report: Elapsed Time: '324006.135441'
2019-07-23 13:52:08 (1356): Status Report: CPU Time: '5766.375000'
2019-07-23 15:26:17 (1356): Preference change detected
2019-07-23 15:26:17 (1356): Setting CPU throttle for VM. (95%)
2019-07-23 15:26:17 (1356): Setting checkpoint interval to 1200 seconds. (Higher value of (Preference: 1200 seconds) or (Vbox_job.xml: 600 seconds))
2019-07-23 15:34:06 (1356): Status Report: Elapsed Time: '330006.135441'
2019-07-23 15:34:06 (1356): Status Report: CPU Time: '5819.234375'
Laurence
Project administrator
Project developer
Project tester
Joined: 12 Sep 14
Posts: 1067
Credit: 329,589
RAC: 129
Message 6500 - Posted: 23 Jul 2019, 14:29:00 UTC - in response to Message 6499.  
Last modified: 23 Jul 2019, 14:29:37 UTC

3 days 23 hours so far. This changed just now. Was that done by the system?
2019-07-23 13:52:08 (1356): Status Report: Elapsed Time: '324006.135441'
2019-07-23 13:52:08 (1356): Status Report: CPU Time: '5766.375000'
2019-07-23 15:26:17 (1356): Preference change detected
2019-07-23 15:26:17 (1356): Setting CPU throttle for VM. (95%)
2019-07-23 15:26:17 (1356): Setting checkpoint interval to 1200 seconds. (Higher value of (Preference: 1200 seconds) or (Vbox_job.xml: 600 seconds))
2019-07-23 15:34:06 (1356): Status Report: Elapsed Time: '330006.135441'
2019-07-23 15:34:06 (1356): Status Report: CPU Time: '5819.234375'


Your VM is not getting any tasks.

https://lhcathomedev.cern.ch/lhcathome-dev/results.php?hostid=3858

Check that this host is able to get Theory tasks in your preferences. If that doesn't work I would suggest stopping until we produce a new version with more detailed debugging.
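
One rough way to check this from the log itself is to compare CPU time with elapsed time in the Status Report lines. The sketch below is only an illustration: the regular expression and the helper names (`parse_reports`, `cpu_share`) are made up for this example and are not part of vboxwrapper or BOINC. A very low share suggests the VM is mostly idle, which would be consistent with it not receiving tasks, although it does not prove that on its own.

```python
import re

# Status Report lines copied from the log quoted above.
LOG = """\
2019-07-23 13:52:08 (1356): Status Report: Elapsed Time: '324006.135441'
2019-07-23 13:52:08 (1356): Status Report: CPU Time: '5766.375000'
2019-07-23 15:34:06 (1356): Status Report: Elapsed Time: '330006.135441'
2019-07-23 15:34:06 (1356): Status Report: CPU Time: '5819.234375'
"""

# Hypothetical pattern for this example; it only matches the two line types above.
PATTERN = re.compile(r"Status Report: (Elapsed Time|CPU Time): '([0-9.]+)'")


def parse_reports(log):
    """Return a list of (elapsed_seconds, cpu_seconds) pairs, one per status report."""
    elapsed, cpu = [], []
    for kind, value in PATTERN.findall(log):
        (elapsed if kind == "Elapsed Time" else cpu).append(float(value))
    return list(zip(elapsed, cpu))


def cpu_share(reports):
    """CPU seconds used per elapsed second between the first and last report."""
    (e0, c0), (e1, c1) = reports[0], reports[-1]
    return (c1 - c0) / (e1 - e0)


reports = parse_reports(LOG)
overall = reports[-1][1] / reports[-1][0]  # share over the whole run so far
recent = cpu_share(reports)                # share between the two reports
print(f"overall CPU share: {overall:.1%}, between reports: {recent:.1%}")
```

Both figures come out at a small fraction of one core, far below the 95% CPU throttle shown in the same log.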
maeax

Joined: 22 Apr 16
Posts: 672
Credit: 1,901,223
RAC: 5,079
Message 6501 - Posted: 23 Jul 2019, 14:35:12 UTC - in response to Message 6500.  
Last modified: 23 Jul 2019, 14:41:21 UTC

OK, I have stopped the task and deleted the localhost host, and will wait for a new program version.
Will take a look tomorrow morning.

Edit: The application status page always shows 0 GFLOPS for version 0.03 (Windows, Linux and Intel Mac).
Ray Murray
Joined: 13 Apr 15
Posts: 138
Credit: 2,945,852
RAC: 0
Message 6504 - Posted: 25 Jul 2019, 21:18:07 UTC

Both Boinc tasks, 2791895 and 2791928, which created localhosts 3861 and 3869 respectively, are still happily returning tasks and McPlots after 6 and 5½ days.
Both look likely to overrun their deadlines (Friday evening and Saturday morning), but I won't be able to see what happens then as I'm away to Dunaverty Golf Club tomorrow to play in their Holiday Open on Saturday. (A short but testing links course with small but perfectly formed greens that are well worth the 3hr drive.) Also, sometime on Friday I'm going to attempt a Summits-On-The-Air activation on 2m FM from Beinn na Lice (Mull of Kintyre). I wonder what the overlap is between Boincers and Radio Amateurs? Might do a Café enquiry post.
If that deadline triggers the ending of the Boinc tasks, I'll miss it, but if they run to the Remaining Time Estimate, which is still ticking at half regular time, then they might not finish until Sunday, by which time I'll be back.
Magic Quantum Mechanic
Joined: 8 Apr 15
Posts: 754
Credit: 11,755,780
RAC: 8,768
Message 6505 - Posted: 25 Jul 2019, 22:41:25 UTC - in response to Message 6504.  

Catch some birdies, Ray, and swing that driver like the LHC.
Ray Murray
Joined: 13 Apr 15
Posts: 138
Credit: 2,945,852
RAC: 0
Message 6508 - Posted: 1 Aug 2019, 8:06:29 UTC

13 and 12 days in, and both localhosts are still running fine and returning jobs. The BoincVM tasks timed out on the website long ago, with no effect. Both are at 90+% but the remaining time has slowed to 1 second per 7 or 8 real-time seconds. From the original 4-day estimate, it's still claiming 18hrs left, so it might run forever without manual intervention. I haven't interfered with them at all, but my Antivirus has been requesting a restart since the weekend. I'll let the younger one do that this evening and we'll see if it checkpoints and recovers.
Ray Murray
Joined: 13 Apr 15
Posts: 138
Credit: 2,945,852
RAC: 0
Message 6514 - Posted: 3 Aug 2019, 16:19:07 UTC - in response to Message 6508.  

Very impressed by the robustness of these. Both came through the Antivirus reboots unfazed, so I did the Windows 1903 Feature update as well and both survived. I had difficulty with one host that had a series of partial installs, crashes and roll-backs until I eventually tracked down and updated a driver for a card reader that I have never used, with no help from the Windows troubleshooter. On that one the VM did Power Off, which would normally signal loss of the running "job", but even that recovered without losing the 2 waiting jobs. The EXECUTING job may have started over, but impressive survival nonetheless.
14 and 13 days in. Still going strong.
Ray Murray
Joined: 13 Apr 15
Posts: 138
Credit: 2,945,852
RAC: 0
Message 6526 - Posted: 9 Aug 2019, 21:14:39 UTC

Weekly update.
99.2% 20days 18hrs, 4hrs remaining and
98.4% 19days 6hrs, 7hrs remaining … although for both, the remaining time has slowed to around 30 real seconds per task second.
Both still healthily returning McPlots

As the rate of remaining time slows exponentially, there seems little likelihood of a natural finish.
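
To put rough numbers on that, here is a back-of-envelope sketch using the figures reported in this thread. It assumes the countdown rate stays fixed at the observed value, which the thread suggests it does not, so the results are at best lower bounds. The function name `real_hours_to_finish` is hypothetical, and 7.5 is simply the midpoint of "7 or 8".

```python
def real_hours_to_finish(remaining_hours, real_seconds_per_task_second):
    """Real hours needed to count down `remaining_hours` at a fixed countdown rate."""
    return remaining_hours * real_seconds_per_task_second


# (label, remaining hours shown, real seconds per counted-down second),
# all taken from the posts in this thread.
observations = [
    ("1 Aug, 18hrs left, 1 s per 7-8 s", 18, 7.5),
    ("9 Aug, 4hrs left, 1 s per ~30 s", 4, 30),
    ("9 Aug, 7hrs left, 1 s per ~30 s", 7, 30),
]

for label, remaining, slowdown in observations:
    hours = real_hours_to_finish(remaining, slowdown)
    print(f"{label}: about {hours / 24:.1f} real days at a constant rate")
```

Even under the constant-rate assumption each estimate is several real days, and since the rate keeps dropping the true figure would be longer still.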
The first thing needed is for the VM to shut down if it does not receive any tasks.
With 2 jobs always cached, ready to run, there would need to be a fairly long job-supply hiatus for this to occur. Would robbing it of network access achieve this, at the cost of not returning the last 3 or 4 jobs? Will there need to be a manual kill? Aborting the already overdue Boinc Task? … Not that I WANT to kill them. Just wondering what to do when v.0.04 comes along.

If anyone wants to try, you can log into the VM as root using your authenticator as the password. You can then attach to Cosmology@home and try it out.
What would the "Authenticator" be? I've tried my account ID number and even copied the whole "weak account key" but neither lets me in to test this. (I'm a GUI person rather than command-line so maybe better that I can't get in ;¬)
Ray Murray
Joined: 13 Apr 15
Posts: 138
Credit: 2,945,852
RAC: 0
Message 6536 - Posted: 11 Aug 2019, 22:20:29 UTC

Tonight, I have changed the "location" of one of the localhosts and allowed it to only request 2 Theory tasks, from the previous default of 4, which it accepted so I now have 1 Executing and 1 Uninitialized (waiting to run). Overnight, I have reduced it further to a single Task. Tomorrow evening I'll knock that down to zero and see whether the VM ends on completion of the job/task or reverts to the ==== Tasks === lines as at start-up.
Ray Murray
Joined: 13 Apr 15
Posts: 138
Credit: 2,945,852
RAC: 0
Message 6537 - Posted: 12 Aug 2019, 7:16:44 UTC - in response to Message 6536.  
Last modified: 12 Aug 2019, 7:55:03 UTC

Woke up early. Tried to set a zero request; not available. Tried to deselect all apps, which defaults to ACCEPT all apps, so I have unchecked all except BoincVM, of which there are none available, so it shouldn't get anything, unless Laurence releases a new version today, which might result in interesting nested VMs.
Should find out what happens before 9:00UTC.
Won't be there to observe or intervene as I'm at work, now.
Ray Murray
Joined: 13 Apr 15
Posts: 138
Credit: 2,945,852
RAC: 0
Message 6544 - Posted: 12 Aug 2019, 20:41:50 UTC - in response to Message 6537.  
Last modified: 12 Aug 2019, 21:05:12 UTC

Task didn't finish after running out of work; it has just been doing the screenful of ==== Tasks ==== lines since reporting its last job this morning. I have changed it back to accepting 2 jobs but it hasn't done that request yet. I don't know what the back-off would be or how to force a manual update so it'll be idle until it makes that request.

Didn't take long, <1/2hr, before I even had a chance to post this update. Back to 1 x Executing and 1 x waiting. Think I'll limit both of them to 2 jobs so as not to waste any when they do eventually expire or are superseded.

Take-aways so far (for whenever the next version or more of these become available):

I set my hosts to School where they would only accept BoincVM to get these.
I didn't have to do ANY fiddly setting up that would normally be required to get Native to work. (These are running on Windows hosts.)

The localhost VM will start with the preferences of your Default location (regardless of the Host location) so Theory MUST be ticked there or you only get ==== Tasks === on the console. I initially had 4 jobs, 1 CPU selected so the VM started with 1 CPU and got 4 jobs (1 running, 3 waiting). I don't know if 2 or more CPUs would work (yet) and jobs can be reduced to 1 or 2.
Q. Would they (could they?) run CMS or ATLAS if they are ticked or are they limited to Theory_Native for now?

Time Remaining estimate is currently irrelevant. These started at 4 days but are still going strong after 23 days. Remaining Time within the VM and therefore Fraction Done aren't accurate either. Elapsed time looks fine.

If suspended and allowed to save before exiting Boinc, they will recover from any number of Host reboots. I try to avoid harsh, instant shutdowns but suspect they would even survive such and at worst, might restart the Executing job.

The localhosts collect their own credits for each job returned so even if the Boinc task has an error (Timed-out, Aborted etc), therefore no credit, the localhost credit is unaffected.
Q. Will subsequent VMs use the same localhost ID or will a brand new one be created each time?

Hope these observations are useful.
it should be possible to manage the BOINC client in the Guest via a Web browser.
Might you glean some useful pointers from the Christmas Challenge stuff from a few years ago, if any of that is stored somewhere? I THINK it only worked as Firefox/Chrome plugins. Just found cernvm-webapi-1.2.8 and cvmwebapi-2.0.13, that I obviously never got round to deleting.
Ray Murray
Joined: 13 Apr 15
Posts: 138
Credit: 2,945,852
RAC: 0
Message 6585 - Posted: 31 Aug 2019, 9:28:41 UTC

No tasks for either VM overnight so seemed as good a time as any to end them both after 42 and 40 days. Both had got to 99.9xx% but estimated time remaining had slowed to about a minute reduction per day.
They have both left a powered-off VM in VBox and a yellow triangle image in Media Manager. Might see later if one of them will start without any Boinc involvement.