New version 5.00
Author | Message |
---|---|
Joined: 12 Sep 14 Posts: 1067 Credit: 334,882 RAC: 0 |
I am working on a new VM version that works similarly to the native version. It may take a few iterations until it works. |
Joined: 20 Jun 17 Posts: 25 Credit: 5,114,325 RAC: 14,408 |
Wow, 24.75GB of memory usage on a 32-thread system. Is that necessary? I have some v5.00 and v5.01 tasks. Are the v5.00 ones still needed? |
Joined: 12 Sep 14 Posts: 1067 Credit: 334,882 RAC: 0 |
With v5.02 the jobs start to run. |
Joined: 12 Sep 14 Posts: 1067 Credit: 334,882 RAC: 0 |
With v5.02 the jobs start to run, but they die due to a missing heartbeat. A new version is on its way. |
Joined: 12 Sep 14 Posts: 1067 Credit: 334,882 RAC: 0 |
With v5.02 the jobs start to run. v5.03 goes further. |
Joined: 13 Feb 15 Posts: 1188 Credit: 859,751 RAC: 12 |
I don't understand. So far you were testing the Linux Native and Windows BOINC VM apps here. Suddenly you are testing Linux VBox here, is that right? |
Joined: 20 Jun 17 Posts: 25 Credit: 5,114,325 RAC: 14,408 |
"With v5.02 the jobs start to run." Wait, so with v5.00 and v5.01 the tasks do nothing? It looks like none of mine have returned yet. |
Joined: 12 Sep 14 Posts: 1067 Credit: 334,882 RAC: 0 |
"I don't understand. So far you were testing the Linux Native and Windows BOINC VM apps here." That is correct. My first task returned OK. I will push out the Windows and Mac versions now. |
Joined: 13 Feb 15 Posts: 1188 Credit: 859,751 RAC: 12 |
My first Windows task has been running for 12 minutes without doing a job. During boot I saw this: On my host the shared folder is there with init_data.xml in it. Now my VM is showing this without using CPU: ALT-F2 is showing the 'top' output. In all other vbox applications F3 is used for that and F2 is used for job output. You switched the two. I stopped the task gracefully with shutdown in the shared folder. Result: https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=2825186 |
Joined: 13 Apr 15 Posts: 138 Credit: 2,969,210 RAC: 6 |
Single-core base memory has gone up from 730MB to 1500MB, which means that where my 2-core Linux host with 4GB of RAM used to be able to run 2 x Theory (VBox or Native), it can now only run 1, with another "waiting for memory". One just started on a Windows host: 2-core (accidentally, I usually prefer singles), memory 2250MB. From the BoincVM thread: "it should be possible to manage the BOINC client in the Guest via a Web browser." The "Show Graphics" button gets to an Apache landing page. Is this where that control will be? Same output as CP. Comparing with the stderr of Laurence's successful task, mine gets to the line corresponding to `2019-09-23 16:00:54 (11016): Guest Log: 00:00:00.008616 main 5.2.6 r120293 started. Verbose level = 0` but no further. The next line is the shutdown, so I'll leave it for now and see what happens in an hour or so. These might just be "blanks", not intended to do any actual work, as presumably there would be details of a job between those 2 lines. |
Joined: 13 Feb 15 Posts: 1188 Credit: 859,751 RAC: 12 |
I somehow managed, in a non-reproducible manner, to run a valid task on my Windows host. https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=2825190 I'm missing job info in both your result and mine. |
Joined: 13 Feb 15 Posts: 1188 Credit: 859,751 RAC: 12 |
This time I didn't see ERROR init_data.xml is missing, but a WARNING and an ERROR further on in the boot process: Edit: The next task is running a Herwig++ job, without any intervention of mine. That's a resend from a user who aborted most of the tasks, returned 6 valids and has 2 in progress on a Mac machine. |
Joined: 13 Feb 15 Posts: 1188 Credit: 859,751 RAC: 12 |
Not all Windows tasks are ending in an error. After my Herwig++ job finished successfully, a new job (pythia8) started fine out of the box: |
Joined: 12 Sep 14 Posts: 1067 Credit: 334,882 RAC: 0 |
I think there is a race condition where the job starts before the shared directory is mounted. It will need a new VM, but I probably can't do that today. |
Joined: 13 Feb 15 Posts: 1188 Credit: 859,751 RAC: 12 |
"I think there is a race condition where the job starts before the shared directory is mounted. It will need a new VM, but I probably can't do that today." OK, I'll stop testing then, once the tasks that are running now have returned. I pushed the brake and started 4 tasks one by one on an otherwise idle system. All 4 are running a job, plus the one I already had running. |
Joined: 13 Feb 15 Posts: 1188 Credit: 859,751 RAC: 12 |
"It will need a new VM, but I probably can't do that today." When you create a new image, could you also have a look at the memory requirements for Theory vbox tasks? We always (at least for a long period) had the requirement 630MB + (100MB * ncores). Now it is suddenly 750MB + (750MB * ncores), so 1500MB for a single-core Theory VM. This was (unnecessarily, IMO) changed 1 or 2 months ago in production too, causing BOINC to reserve too much memory and tasks to wait for memory that is in principle available. |
Joined: 13 Feb 15 Posts: 1188 Credit: 859,751 RAC: 12 |
In one of the VMs I have a job running with this description: `ee zhad 91.2 - - sherpa 2.2.5 default 2000` This is a known long runner, or even a never-ending job. I see that the job will be killed after 18 hours of elapsed time and will not get credit, in contrast to what happens with the current Theory VBox production tasks when gracefully stopped. |
Joined: 8 Apr 15 Posts: 780 Credit: 12,163,009 RAC: 2,189 |
"Not all Windows tasks are ending in an error. After my Herwig++ job finished successfully, a new job (pythia8) started fine out of the box:" I just ran 10 of those and they were valid with that same |
Joined: 13 Feb 15 Posts: 1188 Credit: 859,751 RAC: 12 |
"I see that the job will be killed after 18 hours of elapsed time and will not get credit, in contrast to what happens with the current Theory VBox production tasks when gracefully stopped." This time it was not the aforementioned sherpa, but a pythia8 job needing more than 18 hours of elapsed time: `boinc pp jets 7000 100 - pythia8 8.235 cr1 100000 123` The machine did its job properly and the task was killed gracefully, but no credit was granted because the result file to upload was missing. https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=2825825 |
Joined: 13 Feb 15 Posts: 1188 Credit: 859,751 RAC: 12 |
The sherpa ended in the error condition of running too long: https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=2825826 `ee zhad 91.2 - - sherpa 2.2.5 default 2000` "I see that the job will be killed after 18 hours of elapsed time and will not get credit, in contrast to what happens with the current Theory VBox production tasks when gracefully stopped." |
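
On the race condition mentioned in the thread, where the job starts before the shared directory is mounted: one common way to avoid this kind of race is to poll for the expected file before starting the job. The following is a minimal Python sketch of that idea, not the fix actually used in the VM image. The function name `wait_for_file`, the timeout values, and the path `/shared/init_data.xml` are hypothetical and chosen only for illustration.

```python
import time
from pathlib import Path


def wait_for_file(path: Path, timeout_s: float = 60.0, poll_s: float = 1.0) -> bool:
    """Poll until `path` exists as a regular file or the timeout expires.

    Returns True if the file appeared in time, False otherwise.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if path.is_file():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)


if __name__ == "__main__":
    # Hypothetical mount point of the shared folder inside the guest.
    init_data = Path("/shared/init_data.xml")
    if wait_for_file(init_data, timeout_s=120.0):
        print("shared folder is mounted, the job can start")
    else:
        print("init_data.xml did not appear, not starting the job")
```

Polling with a deadline keeps the boot sequence simple; a real image might instead order the job after the mount in its init system.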
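
The memory figures quoted in the thread can be checked with a few lines of arithmetic. The sketch below encodes the two formulas as posted, 630MB + (100MB * ncores) and 750MB + (750MB * ncores). The helper names and the assumption that BOINC may use only half of the host's RAM are illustrative and not taken from the project's configuration.

```python
def old_vm_memory_mb(ncores: int) -> int:
    """Earlier requirement quoted in the thread: 630MB + 100MB per core."""
    return 630 + 100 * ncores


def new_vm_memory_mb(ncores: int) -> int:
    """Newer requirement quoted in the thread: 750MB + 750MB per core."""
    return 750 + 750 * ncores


def tasks_that_fit(host_ram_mb: int, task_mb: int, usable_fraction: float) -> int:
    """Number of tasks that fit in the share of RAM BOINC may use.

    `usable_fraction` is an assumed memory preference, not a project value.
    """
    return int(host_ram_mb * usable_fraction) // task_mb


if __name__ == "__main__":
    for ncores in (1, 2):
        print(ncores, "core(s):", old_vm_memory_mb(ncores), "MB ->",
              new_vm_memory_mb(ncores), "MB")
    host_ram_mb = 4 * 1024  # the 4GB host from the thread
    print("old:", tasks_that_fit(host_ram_mb, old_vm_memory_mb(1), 0.5), "task(s)")
    print("new:", tasks_that_fit(host_ram_mb, new_vm_memory_mb(1), 0.5), "task(s)")
```

Under the assumed 50% memory limit, the 4GB host fits two single-core tasks with the old formula and one with the new formula, which is consistent with the "waiting for memory" report above.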
©2024 CERN