Thread 'exceeded disk limit'

Author	Message
Crystal Pellet Volunteer tester Joined: 13 Feb 15 Posts: 1239 Credit: 960,834 RAC: 598	Message 362 - Posted: 10 May 2015, 21:30:31 UTC - in response to Message 361. [quote]Well, it seems to be running as well as the others at the moment: [image] - we won't know for certain until the back end starts supplying jobs again, of course. CMS only supplies 64-bit apps, sure - but the app it supplies is BOINC's 64-bit VBox wrapper. I simply set up an app_info file and substituted the 32-bit wrapper files, and off it went. The whole point of a VM is that the guest OS doesn't have to match the host: my hardware on this host is fully 64-bit capable, and includes the virtualization hooks to enable VBox to run.[/quote] I was aware you used app_info.xml, vbox32 and BOINC32, but could not see whether your machine was 64-bit capable. I suppose you also placed your 4.79 GB vdi file in the project folder and linked to it in your app_info, so you don't have to wait to see whether the vdi grows beyond 4 GB.

Richard Haselgrove Joined: 4 May 15 Posts: 64 Credit: 55,584 RAC: 0	Message 363 - Posted: 10 May 2015, 21:37:05 UTC - in response to Message 362. [quote][quote]Well, it seems to be running as well as the others at the moment: [image] - we won't know for certain until the back end starts supplying jobs again, of course. CMS only supplies 64-bit apps, sure - but the app it supplies is BOINC's 64-bit VBox wrapper. I simply set up an app_info file and substituted the 32-bit wrapper files, and off it went. The whole point of a VM is that the guest OS doesn't have to match the host: my hardware on this host is fully 64-bit capable, and includes the virtualization hooks to enable VBox to run.[/quote] I was aware you used app_info.xml, vbox32 and BOINC32, but could not see whether your machine was 64-bit capable. I suppose you also placed your 4.79 GB vdi file in the project folder and linked to it in your app_info, so you don't have to wait to see whether the vdi grows beyond 4 GB.[/quote] Well, I did a quick cheat-test by copying the 4.79 GB file I saved a couple of days ago to the 32-bit machine, and putting it in another project's slot directory. 'Exceeded disk limit' came up with the right numbers, the task was aborted and the file deleted, and everything carried on working properly.

Ben Segal Volunteer moderator Volunteer developer Volunteer tester Joined: 12 Sep 14 Posts: 65 Credit: 544 RAC: 0	Message 364 - Posted: 11 May 2015, 4:45:11 UTC - in response to Message 363. Any ideas why this bug affected CMS and not ATLAS? Or did it? Ben

Crystal Pellet Volunteer tester Joined: 13 Feb 15 Posts: 1239 Credit: 960,834 RAC: 598	Message 365 - Posted: 11 May 2015, 7:23:25 UTC - in response to Message 364. [quote]Any ideas why this bug affected CMS and not ATLAS? Or did it? Ben[/quote] Hi Ben, Probably because the ATLAS vdi files in the slot never exceed 4 GB. Initially the vdi is 1.57 GB, and the VM runs a single job lasting about 2-3 hours. Maybe you can give some attention to the fact that the CMS vdi keeps growing when there are no jobs available in the queue. I've noted here before that 19% CPU is used even when there is nothing to do and several python processes are running in the VM. Maybe they are writing large logs.

Phil Joined: 9 Apr 15 Posts: 57 Credit: 230,221 RAC: 0	Message 367 - Posted: 11 May 2015, 16:40:19 UTC - in response to Message 365. [quote]the CMS vdi keeps growing when there are no jobs available in the queue. I've noted here before that 19% CPU is used even when there is nothing to do and several python processes are running in the VM. Maybe they are writing large logs.[/quote] That is one hell of a lot of writing!

ivan Volunteer moderator Project administrator Project developer Project tester Project scientist Joined: 20 Jan 15 Posts: 1145 Credit: 8,310,612 RAC: 0	Message 368 - Posted: 11 May 2015, 17:09:24 UTC - in response to Message 365. [quote]Maybe you can give some attention to the fact that the CMS vdi keeps growing when there are no jobs available in the queue. I've noted here before that 19% CPU is used even when there is nothing to do and several python processes are running in the VM. Maybe they are writing large logs.[/quote] I've been digging around on this. It doesn't appear to be logs, but rather something to do with cvmfs. The increase in "disk" usage in /var/lib/cvmfs/shared matches the usage increase in /, and almost matches the size increase of the .vdi image; e.g. in the last hour these increased by 165,700K, 165,780K and 170,918K respectively. (This is on my SLC6 machine.) I'll check overnight growth in the morning.
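
Figures like the three above can be collected by taking a size snapshot at two points in time and subtracting. The following is a minimal Python sketch of that idea, not the method actually used in this thread: the helper names are hypothetical, and it sums apparent file sizes, which can differ from the "disk" usage that tools such as du or df report. Note also that the cvmfs cache lives inside the VM while the .vdi file lives on the host, so the two snapshots would have to be taken in different places.

```python
import os
from pathlib import Path


def tree_size_kib(root):
    """Sum the apparent sizes of all regular files under root, in KiB."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if not os.path.islink(path):
                    total += os.path.getsize(path)
            except OSError:
                # The file vanished or became unreadable between listing and stat.
                continue
    return total // 1024


def file_size_kib(path):
    """Apparent size of a single file (for example a .vdi image), in KiB."""
    return Path(path).stat().st_size // 1024


def growth(before, after):
    """Per-label increase between two {label: size} snapshots."""
    return {label: after[label] - before[label] for label in before}
```

Taking one snapshot dictionary now and another an hour later, then passing both to growth(), gives per-location increases that can be compared in the same way as the numbers quoted above.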

Crystal Pellet Volunteer tester Joined: 13 Feb 15 Posts: 1239 Credit: 960,834 RAC: 598	Message 369 - Posted: 12 May 2015, 8:59:01 UTC - in response to Message 364. [quote]Any ideas why this bug affected CMS and not ATLAS? Or did it? Ben[/quote] I have searched through ATLAS tasks, and the highest slot usage I could find was 3.5 GB. That covers all files in the slot directory and its subdirectories together, including the snapshot files, which are still made by the wrapper ATLAS uses.

ivan Volunteer moderator Project administrator Project developer Project tester Project scientist Joined: 20 Jan 15 Posts: 1145 Credit: 8,310,612 RAC: 0	Message 370 - Posted: 12 May 2015, 11:07:22 UTC - in response to Message 368. Last modified: 12 May 2015, 11:16:48 UTC [quote][quote]Maybe you can give some attention to the fact that the CMS vdi keeps growing when there are no jobs available in the queue. I've noted here before that 19% CPU is used even when there is nothing to do and several python processes are running in the VM. Maybe they are writing large logs.[/quote] I've been digging around on this. It doesn't appear to be logs, but rather something to do with cvmfs. The increase in "disk" usage in /var/lib/cvmfs/shared matches the usage increase in /, and almost matches the size increase of the .vdi image; e.g. in the last hour these increased by 165,700K, 165,780K and 170,918K respectively. (This is on my SLC6 machine.) I'll check overnight growth in the morning.[/quote] I managed to get the overnight usage figures, moments before a campus-wide power failure took out all my computers (it may have been even wider, as one of my PCs at home stopped reporting at the same time...). Of course, the breaker to my office tripped, so it was without power for two hours. So, the usage in /var/lib/cvmfs/shared increased by 1,217,932K; the amount used in / increased by 1,220,940K; and the .vdi file increased by 2,213,544K. We probably need a cvmfs expert to tell us what that implies. [Edit] I looked at the figures after I restarted. Image "disk" usage seems to have gone back to normal startup values: /var/lib/cvmfs dropped 1,920,440K from the pre-cut figure and / is using 1,932,470K less. The .vdi file, however, is 174,064K larger. [/Edit]
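
To make the comparison between the hourly and overnight samples explicit, the quoted increases can be put side by side. This is only arithmetic on the figures posted above, written as a short Python sketch; the dictionary keys and function names are illustrative labels, not anything from cvmfs or VirtualBox.

```python
# Increases quoted in the posts above, in K as reported.
hourly = {"cvmfs": 165_700, "root": 165_780, "vdi": 170_918}
overnight = {"cvmfs": 1_217_932, "root": 1_220_940, "vdi": 2_213_544}


def vdi_excess(sample):
    """How much more the .vdi file grew than the cvmfs cache did."""
    return sample["vdi"] - sample["cvmfs"]


def vdi_ratio(sample):
    """Growth of the .vdi file relative to growth of the cvmfs cache."""
    return sample["vdi"] / sample["cvmfs"]
```

With these numbers the image grew 5,218K more than the cache over the hour (a ratio of about 1.03), but 995,612K more overnight (a ratio of about 1.82), so the close match seen in the hourly sample did not hold over the longer run. That the image did not shrink after the restart would be consistent with a dynamically allocated disk image that grows on demand and is not compacted automatically, though whether that explains this case is an assumption.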

Phil Joined: 9 Apr 15 Posts: 57 Credit: 230,221 RAC: 0	Message 371 - Posted: 14 May 2015, 18:01:09 UTC I hit BOINC's "reset" to fetch the wrapper and disk image again. The current task, although doing no work, has not grown its vdi above 1.6 GB.

Magic Quantum Mechanic Joined: 8 Apr 15 Posts: 800 Credit: 14,270,303 RAC: 7,424	Message 372 - Posted: 14 May 2015, 20:59:51 UTC CMS-dev has been running with no problems for me this month. http://boincai05.cern.ch/CMS-dev/results.php?userid=192 The only time I had any error was when I lost power for a few minutes or rebooted for a Windows update, but I got around that too. ATLAS is another story. Mad Scientist For Life

Richard Haselgrove Joined: 4 May 15 Posts: 64 Credit: 55,584 RAC: 0	Message 377 - Posted: 19 May 2015, 9:47:27 UTC I really think we need to get a grip on this problem. To recap: there's a bug in the BOINC client which means it fails to delete files larger than 4 GB when it should. This project is (still, today) producing files larger than 4 GB. When those files are left lying around, we cause errors for every other BOINC project that our computers may be attached to. That's not nice. Eric M has had to put up a front-page warning at LHC classic, so he can get on with his work. The cure is simple and permanent: apply the 080515 hotfix BOINC client. But I had a look through the top 200 hosts yesterday (pretty much the active user base here), and only 9 of the 154 Windows machines had the hotfix applied - take a bow, rbpeake, Crystal Pellet, Ray Murray, and m. (the other three were mine) Since the message clearly isn't getting through, even to people who have posted in this thread, I'm going to send a PM to the admins asking them to reinforce the message via a front-page news item and BOINC 'Notice': and if that doesn't work, to ask them to enforce a minimum BOINC version of 7.5.1 for Windows computers attached to this project.
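
For anyone wanting to check whether oversized files have been left behind on a host, a directory walk is enough. The sketch below is a minimal Python example under stated assumptions: the function name is hypothetical, and it treats the thread's "4 GB" as 4 GiB (2^32 bytes), which the posts above do not confirm. It only lists files; it does not delete anything.

```python
import os

# Assumption: "4 GB" in the thread is read here as 4 GiB = 4,294,967,296 bytes.
FOUR_GIB = 4 * 1024**3


def find_oversized(root, limit=FOUR_GIB):
    """Return sorted (path, size_in_bytes) pairs for files under root larger than limit."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                # Skip files that disappear or cannot be read while scanning.
                continue
            if size > limit:
                hits.append((path, size))
    return sorted(hits)
```

A call such as find_oversized(r"C:\ProgramData\BOINC\slots") would scan the slot directories, assuming the BOINC data directory is in that location on the machine in question; adjust the path to wherever the data directory actually is.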

Richard Haselgrove Joined: 4 May 15 Posts: 64 Credit: 55,584 RAC: 0	Message 378 - Posted: 19 May 2015, 10:18:02 UTC OK, messages sent to Ivan, Laurence and Hendrik. The files needed to apply the hotfix are: for 64-bit BOINC, boinc.080515.x64.zip; for 32-bit BOINC, boinc.080515.x86.zip. Simply extract the two files for your version from the .zip archive, and copy them to your BOINC program folder - you'll need to stop the BOINC client while you do this, and restart it again afterwards.
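
The extract-and-copy step can also be scripted. What follows is a minimal Python sketch, not an official installer: apply_hotfix is a hypothetical helper, it assumes the archive contains only the files meant to be copied, and it assumes the BOINC client has already been stopped by hand. It backs up any file it is about to overwrite so the change can be undone.

```python
import shutil
import zipfile
from pathlib import Path


def apply_hotfix(archive, program_dir, backup_dir):
    """Copy every file in the archive into program_dir, backing up what it replaces.

    Returns the sorted names of the files that were written.
    """
    program_dir = Path(program_dir)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    written = []
    with zipfile.ZipFile(archive) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue
            # Flatten any folder structure inside the archive (an assumption).
            name = Path(info.filename).name
            target = program_dir / name
            if target.exists():
                shutil.copy2(target, backup_dir / name)
            with zf.open(info) as src, open(target, "wb") as dst:
                shutil.copyfileobj(src, dst)
            written.append(name)
    return sorted(written)
```

The program folder and backup folder are passed in by the caller because their locations vary between installations. After the copy, the BOINC client is restarted manually, as described in the post above.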

m Volunteer tester Joined: 20 Mar 15 Posts: 243 Credit: 886,442 RAC: 0	Message 382 - Posted: 19 May 2015, 20:45:07 UTC A point to note. If you, like me, were running an older version of BOINC (7.2.42, it works well so why change?) and install the fix, you may have problems due to not having the required version of the Visual C runtimes. You need to change to BOINC 7.4.42 first, then add/replace the installed files with those from the 080515 archive. John.

Magic Quantum Mechanic Joined: 8 Apr 15 Posts: 800 Credit: 14,270,303 RAC: 7,424	Message 385 - Posted: 21 May 2015, 7:11:28 UTC Mad Scientist For Life

Development for LHC@home