issue of the day
Author | Message |
---|---|
Joined: 8 Apr 15 Posts: 781 Credit: 12,422,653 RAC: 6,510 |
For some reason, today one of my 8-core PCs decided to keep running tasks that claim to be 8-core multi-core tasks (I ran some of those when multi-core tasks first started), so it will only run one task at a time, but when a task finishes the stderr says it was just a 2-core task. I rebooted and tried to get it to start another task, but it still says it is 8-core. The other two matching 8-core computers next to it are running 2-core tasks, so they can run several tasks at the same time, as they are supposed to (and the server says there are no more Theory tasks available). I was hoping to use all the extra cores to run Theory tasks over at LHC too. Mad Scientist For Life |
Joined: 22 Apr 16 Posts: 677 Credit: 2,002,766 RAC: 1 |
Hi Magic, in your LHC-dev preferences, what are Max # jobs and Max # CPUs set to? For me, Max # jobs = 2 and Max # CPUs = 1. |
Joined: 8 Apr 15 Posts: 781 Credit: 12,422,653 RAC: 6,510 |
Hi Axel.....I'll send you a pm later.....busy day watching NFL games |
Joined: 8 Apr 15 Posts: 781 Credit: 12,422,653 RAC: 6,510 |
New Problems today https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=364887 Guest Log: [DEBUG] DC_NOP failed! And https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=364887 Long list of Credential problems Mad Scientist For Life |
Joined: 8 Apr 15 Posts: 781 Credit: 12,422,653 RAC: 6,510 |
DAMN IT. No matter how many times I check my 9 computers, keep them all running new tasks, and THEN set them to NOT allow new tasks (and I have a nice long list of Valids), when I go to check, THREE of the ones I run here are, for some reason, NOT set to stop getting new tasks until I allow it myself, which I only do when I know the internet speed is fast enough to start these VB tasks. As usual I see 12 "Error while computing" results, because they finished today's tasks and then kept starting new ones over and over, running each of them for 25 minutes until it gave up, and then doing it again. All of them did eventually get 6 more to start running, but at 2am I will have to just suspend them until 7am and hope they will start back up (and I have had plenty of bad luck trying that). I am not a morning person, so I will start them all up again, along with all the other ones I run here and at LHC, and go back to bed for a few hours. But if these don't start back up, that will just mean 6 more Errors to add to that 12, and then I will have to wait until 7am the next morning to actually start them up the best way possible. (I'm not asking for any suggestions, I just figured it would be better to type than to toss a couple of computers out the window.) BUT they are all set to not get any new tasks for sure right now, and they had better start back up in the morning, since I have to suspend them in about 2 hours 45 minutes. https://youtu.be/8CtjhWhw2I8 Mad Scientist For Life |
Joined: 8 Apr 15 Posts: 781 Credit: 12,422,653 RAC: 6,510 |
https://lhcathomedev.cern.ch/lhcathome-dev/results.php?userid=192 Well, it looks like those 6 restarted tasks are going to finish Valid, with somewhat shorter runs than I usually get. And for the first time since I started doing this the way I do (getting up at 7am to get the fastest speed), I didn't wake up like I usually do, so all I could do was restart the 6 tasks I had already started, but not start the other ones or the 30 tasks for LHC. So the computers get to sit and wait until 7am tomorrow.....hope the wife will wake me up if I don't wake up myself. I tend to stay awake until 3 or 4am, so that isn't easy, but I am used to doing that. BUT I guarantee EVERY core will be running tasks by 7:30am (as always, my Einstein GPUs run non-stop and only use the internet to return the finished tasks). Mad Scientist For Life |
Joined: 8 Apr 15 Posts: 781 Credit: 12,422,653 RAC: 6,510 |
https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=367529 annoying tasks of the day Mad Scientist For Life |
Joined: 8 Apr 15 Posts: 781 Credit: 12,422,653 RAC: 6,510 |
This is always the problem with VB tasks: getting the CERN server to allow me to just start running the tasks. It has nothing to do with d/ling data blocks (I always watch it happen). You would think CERN would realize by now that it is me every time and just hand over the credentials. (security) (yeah, that will never happen) Mad Scientist For Life |
Joined: 8 Apr 15 Posts: 781 Credit: 12,422,653 RAC: 6,510 |
Welcome back to life, LHC-dev. Just in time for me to see how many of my computers can't run VB tasks after installing Windows 10 updates KB4041994 and KB4051963, following hours of trying to get this Fall version 1709 to work. Three of my 8-cores worked the first time and one refuses to so far (the one with the most memory), and a couple of others I use here (quad and 3-core). The only problem is that when it doesn't work it gives me a 5-second computer error, so I have to abort them. Here is an example on the one I am on right now: https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=372085 And it is taking its time Initializing... Guess I'll go check that 8-core and hope for the best, and then the old 3-core..........funny how VB is always the program that refuses to work, but there is no problem if I fire up the GPU cards for Einstein. Mad Scientist For Life |
Joined: 8 Apr 15 Posts: 781 Credit: 12,422,653 RAC: 6,510 |
Well, the server is back to running my d/l's like a snail on a cold day climbing up a flag pole. I do speed tests on my end and can upload and download at full speed, unless it is to get tasks here. 50Kbps right now, just for a single task, taking an hour so far to get to 75%. Then the tasks (Atlas and Theory) run to Valid in less time than it takes to d/l them and are sent back in a couple of seconds. I decided to give up on Atlas since I have run plenty of them here and over there -----------> Right now I wanted to just switch back to Theory on this 8-core, but I also added CMS, and as usual I have to d/l the newest .vdi for that, and the d/l speed is about 45Kbps........THAT will take hours. I could d/l 200 Einstein GPU tasks or 200 Sixtracks in about 3 minutes, but not these here (just like the alphas). This is still the first 12 hours of my new monthly ISP contract, which means full speed until I use it all up, and only VB tasks and vdi's use it all up, after a week, if I run them on all 9 computers. Of course Sixtrack and GPU tasks don't even need to be connected to the internet once I have the tasks........I just let them run and turn them in when I have lots of completes. My last Atlas task has almost finished d/ling; it will be about 1 hour 15 minutes just to d/l that ONE task, and then it will run to Valid faster than that.. So I will finally get to start an Atlas task, then wait 5 hours to get this CMS vdi and then the tasks. I use a satellite dish for everything here, but I have 4 of my computers off just so I can get more done with the 32 cores I have running Sixtracks and GPUs at Einstein, which don't use the internet to run. Can't get much done here lately, and in the past I did quite a lot every day. Mad Scientist For Life |
Joined: 8 Apr 15 Posts: 781 Credit: 12,422,653 RAC: 6,510 |
https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=380436 Well, I just got 2 of these, and the good thing is they only ran 16 minutes before the server fell asleep again: DC_NOP failed! 2018-01-17 17:24:45 (7632): Guest Log: AUTHENTICATE:1006:exceeded 1516238684 deadline during authentication 2018-01-17 17:24:45 (7632): Guest Log: AUTHENTICATE:1004:Failed to authenticate using GSI 2018-01-17 17:24:45 (7632): Guest Log: GSI:5002:Failed to authenticate because the remote (server) side was not able to acquire its credentials. Mad Scientist For Life |
Joined: 8 Apr 15 Posts: 781 Credit: 12,422,653 RAC: 6,510 |
Mad Scientist For Life |
Joined: 29 Apr 19 Posts: 13 Credit: 109,352 RAC: 0 |
Did the 2nd one return after the due date but within a grace period? Also, are mine fine because they are using an older version of VBox? This result is fine, returning a completed task on the 3rd attempt at being sent out: -------------------------------------------------------------------------------------------------------------------- While this one failed with too many errors upon receiving a completed, good result: ------------------------------------------------------------------------------------------------------------------- |
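
For anyone who wants to sift through stderr output like the Guest Log lines quoted in the 2018-01-17 post above, here is a minimal Python sketch. It is not part of BOINC or vboxwrapper. The names `GuestLogError`, `parse_guest_log` and `deadline_utc` are hypothetical, the line layout is inferred only from the three lines in that post, and reading the number in parentheses as a process id and the number after "exceeded" as a Unix timestamp are assumptions.

```python
import re
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional

# Log text copied from the 2018-01-17 post, flattened onto one line as on the forum.
SAMPLE = (
    "2018-01-17 17:24:45 (7632): Guest Log: AUTHENTICATE:1006:exceeded 1516238684 "
    "deadline during authentication "
    "2018-01-17 17:24:45 (7632): Guest Log: AUTHENTICATE:1004:Failed to authenticate using GSI "
    "2018-01-17 17:24:45 (7632): Guest Log: GSI:5002:Failed to authenticate because the "
    "remote (server) side was not able to acquire its credentials."
)

_TS = r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}"

# One entry = timestamp, "(number)", "Guest Log:", SUBSYSTEM:code:message.
# The message runs lazily up to the next timestamp or the end of the text.
ENTRY = re.compile(
    r"(?P<ts>" + _TS + r") \((?P<proc>\d+)\): Guest Log: "
    r"(?P<subsystem>[A-Z_]+):(?P<code>\d+):"
    r"(?P<message>.*?)(?=\s+" + _TS + r" \(|\s*$)",
    re.DOTALL,
)


@dataclass
class GuestLogError:
    timestamp: datetime  # local time as written in the log (no time zone given)
    proc: int            # number in parentheses; assumed to be a process id
    subsystem: str       # e.g. AUTHENTICATE or GSI
    code: int            # numeric error code, e.g. 1006
    message: str


def parse_guest_log(text: str) -> List[GuestLogError]:
    """Extract every 'Guest Log: SUBSYSTEM:code:message' entry from text."""
    return [
        GuestLogError(
            timestamp=datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M:%S"),
            proc=int(m.group("proc")),
            subsystem=m.group("subsystem"),
            code=int(m.group("code")),
            message=m.group("message").strip(),
        )
        for m in ENTRY.finditer(text)
    ]


def deadline_utc(event: GuestLogError) -> Optional[datetime]:
    """Return the 'exceeded <n> deadline' value as UTC, assuming n is Unix time."""
    m = re.search(r"exceeded (\d+) deadline", event.message)
    if m is None:
        return None
    return datetime.fromtimestamp(int(m.group(1)), tz=timezone.utc)


if __name__ == "__main__":
    for event in parse_guest_log(SAMPLE):
        print(event.timestamp, event.subsystem, event.code, event.message)
        deadline = deadline_utc(event)
        if deadline is not None:
            print("  deadline (UTC, if it is Unix time):", deadline.isoformat())
```

Grouping the results by `subsystem` and `code` across many failed tasks would show whether it is always the same authentication failure or a mix of different ones.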
©2024 CERN