Message boards :
Number crunching :
Respect My Limits!
Author | Message |
---|---|
Send message Joined: 28 Jul 16 Posts: 7 Credit: 1,349 RAC: 0 |
OK, now I got work. I set CPUs to unlimited, as in the (dangerous) default setting, allowed 1 job, and checked all apps except for the classical LHC Sixtrack. On the i5 (quad-core, no hyperthreading), where I reserved one CPU core for feeding the Hawaii GPU, I got three tasks: 2x ALICE, 1x LHCb. So I guess my assumptions above were correct. If I leave these preferences as they are, the Linux quad-core with hyperthreading (and hence 8 possible tasks) would certainly crash, as it would retrieve 7 tasks to be run simultaneously (again, one CPU core is reserved for the GTX770 GPU) and would run out of memory. Of course, maybe the BOINC manager would intervene with its memory management - but I wouldn't trust it. So, two options for me: (1) Set up an app_config.xml to restrict the number of CPU cores (not CPUs!) to, say, FOUR. (2) Design another (additional) preference set and assign that to this Linux box. Problem: I do not know how much RAM each of the apps requires at most... Michael. P.S.: Suggestion: You could rename the preference sheets from "home, school, work, etc." to "8 GB RAM / 4 CPU cores, 8 GB RAM / 8 CPU cores, 16 GB RAM / 4 CPU cores, 16 GB RAM / 8 CPU cores, etc." and preset these preference sheets so that your apps never exceed the physical RAM. This would spare me an app_config.xml and allow even less experienced users to easily set things up in an optimized way. President of Rechenkraft.net |
Send message Joined: 28 Jul 16 Posts: 7 Credit: 1,349 RAC: 0 |
Yes. SHOULD have. But hasn't. The default is "unlimited". (3) What does # of CPUS mean? CPUs or CORES? Please specify precisely on the settings page. What is this setting good for? Mwahahaha, such as "results" for the tasks sent out by the server? Just teasing... (4) Behind the name of each app, please indicate in red writing the amount of RAM in GB which that particular app will require/reserve at maximum FOR A SINGLE task. I find it important to make this crystal clear to whoever thinks he/she can modify these settings. Ah, Laurence, please put it in. One more thing done... ;-) Michael. President of Rechenkraft.net |
Send message Joined: 16 Aug 15 Posts: 966 Credit: 1,211,816 RAC: 0 |
The default settings need to be fail-safe: one task, one core and a minimum of memory. |
Send message Joined: 28 Jul 16 Posts: 7 Credit: 1,349 RAC: 0 |
The default settings need to be fail-safe. Yes. Michael. President of Rechenkraft.net |
Send message Joined: 28 Jul 16 Posts: 7 Credit: 1,349 RAC: 0 |
Do ALICE tasks differ in length? My i5 machine estimates a 6 hr runtime, but then this happens (and credits are granted, so the task is OK): 02.08.2016 13:39:09 | vLHCathome-dev | Starting task ALICE_16562_1470076953.252092_0 One other ALICE task is already at 50 min. but shows barely any CPU activity. The ALT+F4 console shows "job finished with unknown exit code". ALT+F3 indicates 0.3% CPU load... Michael. President of Rechenkraft.net |
Send message Joined: 28 Jul 16 Posts: 7 Credit: 1,349 RAC: 0 |
Two more ALICE tasks uploaded. One ran for around 50 min., the other for around 15 min. Both showed numbers in the error console window. At least they terminate properly. I won't report more about this, because it is probably off-topic in this discussion thread. If you need screenshots in the future, please let me know... Michael. President of Rechenkraft.net |
Send message Joined: 16 Aug 15 Posts: 966 Credit: 1,211,816 RAC: 0 |
A lot of these tasks are idle. They do not do any real work. As this is a testing project, there are a number of things that do not work. I tried a few times to address this, with little success. You just have to get used to it, or not. |
Send message Joined: 28 Jul 16 Posts: 7 Credit: 1,349 RAC: 0 |
OK, incoming multicore tasks (Theory, 3 CPUs) cause other vLHC-dev tasks in progress to be suspended. Not good. Priority settings checked? Michael. President of Rechenkraft.net |
Send message Joined: 28 Jul 16 Posts: 482 Credit: 394,720 RAC: 0 |
This is my experience attaching a new host: 1. I set max_jobs to 3 and max_cpus to 2, and checked only the benchmark application. 2. I attached the host, and while it downloaded the .vdi it asked for 3 WUs. 3. After the .vdi download had finished, my host started all 3 WUs. 4. During this startup phase my host used several GB of swap space and became unresponsive due to very high I/O load. 5. It took more than 20 minutes to recover, and one of the WUs lost the connection to the BOINC client. 6. After the first WUs had finished (or been aborted), I changed max_jobs to 1 and checked the CMS application. 7. With this setting (2 CPUs) the CMS WUs failed after a few minutes with EXIT_NO_SUB_TASKS. 8. I changed max_cpus to 1 and the next CMS WU finished successfully. |
Send message Joined: 11 Jun 16 Posts: 1 Credit: 250,843 RAC: 0 |
For the past week my host has been unable to use more than one core. Before that, everything was at the defaults, just limited in BOINC Manager to 75% of the cores, and all MT applications ran normally. But suddenly only one core is in use, regardless of the settings I tried on my account page. Any hints? |
Send message Joined: 21 Sep 15 Posts: 89 Credit: 383,017 RAC: 0 |
My cycle brought me back to this project and I turned on the new applications. No problems (other than those I've expressed before, design or VBOX related) with CMS, THEORY, or BENCHMARK. Have not yet got an ALICE task to complete successfully, will do that next, have concentrated on LHCb. Problem: Estimated time on download shows approx 50 minutes. When it starts running, this quickly climbs to 1:10:30:00 or more (over one day, almost one and a half) which causes BOINC scheduler problems - I don't get work I need for other projects until the last minute. Then to aggravate the situation, the job completes successfully anywhere from 3.4 to 6.8 hours later (although estimate never drops accordingly). Problem: Disk space required is absurd. Very quickly climbs to at least 2GB, eventually will pass 4.5GB! I had to raise the BOINC allocation to even get these to run (found out quickly not to run more than one dev app at a time) and think some crashes may be due to running out of available disk space. MAJOR problem: Does not abide by BOINC memory utilization settings. Requires 2.07GB RAM. With 4GB present, set at "50% when in use", not only will LHCb not suspend "Waiting on Memory", but other projects will also not suspend. (Appears they don't know how much LHCb is taking - they do suspend if LHCb not present.) Trying to run two LHCb tasks (which should have one running and one waiting) instead runs both and crashes. |
Send message Joined: 12 Sep 14 Posts: 1067 Credit: 334,882 RAC: 0 |
My cycle brought me back to this project and I turned on the new applications. No problems (other than those I've expressed before, design or VBOX related) with CMS, THEORY, or BENCHMARK. Have not yet got an ALICE task to complete successfully, will do that next, have concentrated on LHCb. Welcome back! I hope you have seen that we are moving towards one production project LHC@home. Your last cycle made a contribution towards moving in this direction. :)
We can look into this. Will add it to the task tracker (see top left).
This is not unusual for VM based applications. The disk space is the disk size for the VM including OS, application code and data. 2-4GB is pretty compact for what it is.
This is a known issue for VM applications. VirtualBox does not accurately report its memory usage. This is causing many problems and needs to be resolved, but it is non-trivial. |
Send message Joined: 21 Sep 15 Posts: 89 Credit: 383,017 RAC: 0 |
Simple fix for memory and disk space issues. PUT IT ON THE PREFERENCES PAGE! If I'd KNOWN that each task required 2GB RAM, I would have known to only allow one at a time. If I'd KNOWN that each application I ran would require 5GB disk, I would have selected only one application at a time to run. COMMUNICATION!!!! :-) (Would be nice to put up there that 'does not abide by BOINC memory limitations', too. But then the home page for the project still doesn't even mention that VirtualBox is required - volunteers don't find that out until after joining...) |
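
Editor's sketch: several posts above come down to the same arithmetic: how many VM tasks fit into physical RAM, and how to cap the client at that number with an app_config.xml (Michael's option 1). The Python below is a minimal illustration, not project-supplied tooling. The 2.07 GB per-task figure is the one reported for LHCb in this thread; the 1 GB reserve for the operating system and the app short name "LHCb" are assumptions and should be checked against the actual host and client. Note that `max_concurrent` and `project_max_concurrent` limit the number of running tasks, not the number of CPU cores a single task uses.

```python
from xml.sax.saxutils import escape


def max_tasks_in_ram(total_ram_gb, ram_per_task_gb, reserve_gb=1.0):
    """Return how many tasks fit in physical RAM.

    reserve_gb is an assumed allowance for the OS and other programs.
    """
    usable = total_ram_gb - reserve_gb
    if usable <= 0 or ram_per_task_gb <= 0:
        return 0
    return int(usable // ram_per_task_gb)


def build_app_config(project_limit, per_app_limits):
    """Return the text of an app_config.xml that caps concurrent tasks.

    per_app_limits maps an app short name (assumed here, verify it in
    the client) to the maximum number of tasks of that app to run at once.
    """
    lines = ["<app_config>"]
    for name, limit in per_app_limits.items():
        lines += [
            "  <app>",
            f"    <name>{escape(name)}</name>",
            f"    <max_concurrent>{int(limit)}</max_concurrent>",
            "  </app>",
        ]
    lines.append(
        f"  <project_max_concurrent>{int(project_limit)}</project_max_concurrent>"
    )
    lines.append("</app_config>")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # 8 GB host, 2.07 GB per task as reported for LHCb in this thread.
    limit = max_tasks_in_ram(8, 2.07)
    print(f"Tasks that fit in RAM: {limit}")
    print(build_app_config(limit, {"LHCb": 1}))
```

The generated file belongs in the project's directory inside the BOINC data directory. With the assumed 1 GB reserve, a 4 GB host has room for only one such task, which is consistent with the crash reported above when two LHCb tasks ran at once.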
©2024 CERN