Message boards : News : Server upgrade

Nils Høimyr
Project administrator
Project developer
Project tester
Project scientist
Help desk expert
Joined: 12 Sep 14
Posts: 26
Credit: 98,143
RAC: 1
Message 5068 - Posted: 11 Aug 2017, 8:22:39 UTC

We will migrate the lhcathome-dev project to a CentOS 7 server and upgrade the BOINC server components.

The lhcathome-dev project server will be unreachable for a while later today during the upgrade.

maeax
Joined: 22 Apr 16
Posts: 137
Credit: 452,726
RAC: 1,339
Message 5069 - Posted: 11 Aug 2017, 8:44:23 UTC

Thank you for the info,
so we can stop work now.

Nils Høimyr
Project administrator
Project developer
Project tester
Project scientist
Help desk expert
Joined: 12 Sep 14
Posts: 26
Credit: 98,143
RAC: 1
Message 5070 - Posted: 11 Aug 2017, 9:16:44 UTC - in response to Message 5069.  

Actually, the upgraded server code gives a scheduler error: "No start tag in scheduler reply".

We will try to fix it; otherwise we'll revert to the previous setup.

MAGIC Quantum Mechanic
Joined: 8 Apr 15
Posts: 193
Credit: 3,695,763
RAC: 4,932
Message 5071 - Posted: 13 Aug 2017, 5:42:55 UTC
Last modified: 13 Aug 2017, 5:44:22 UTC

Well I sure hope this brings vLHC-dev back to normal again since most of the members have left us here.

I even switched several of mine over to LHC but still have 2 here running the CMS and Theory tasks I have loaded.

I finished the previous version of the LHCb tasks, but I'd rather not download the new .vdi for now, just to save this month's data transfer, so I can go longer than one week before I use up the 80GB monthly total on my satellite connection (45-50MBps).

I have a feeling those .vdi downloads, along with the VB tasks, really eat that data transfer up, so I am testing my other 32 cores running SixTrack without the ethernet cables plugged in... I know it works great with the Einstein GPU tasks.

But the main thing is getting this site back to normal since it is getting close to 10 weeks without keeping it all up to date.

Nils Høimyr
Project administrator
Project developer
Project tester
Project scientist
Help desk expert
Joined: 12 Sep 14
Posts: 26
Credit: 98,143
RAC: 1
Message 5072 - Posted: 15 Aug 2017, 7:48:42 UTC

Not sure I follow you, MAGIC. Are you using the URL

https://lhcathomedev.cern.ch/lhcathome-dev/ for this dev project?


Anyway, we are trying the server update again today, so the server will be unavailable for about 30 minutes.

There might be a couple of such interventions on the dev project today, Tuesday 15th of August.

MAGIC Quantum Mechanic
Joined: 8 Apr 15
Posts: 193
Credit: 3,695,763
RAC: 4,932
Message 5073 - Posted: 15 Aug 2017, 9:56:00 UTC - in response to Message 5072.  

Not sure I follow you, MAGIC. Are you using the URL

https://lhcathomedev.cern.ch/lhcathome-dev/ for this dev project?


Anyway, we are trying the server update again today, so the server will be unavailable for about 30 minutes.

There might be a couple of such interventions on the dev project today, Tuesday 15th of August.


Well Nils, we have been talking about this problem here since June 7th, on several threads, yet nothing ever gets done about it. All I get is one of you saying you don't understand, and then I type out a long step-by-step explanation and still nothing happens.

Once again... look at the stats pages... they are all wrong. And look at my account under My Computers: there you will see it says I have not used any computer here for over 30 days.

And even when you check my computers on the *Show: All computers* tab, it says the last contact here was in JUNE.

Now how do you think that is possible when I am the one who has the highest average (RAC) and Total here?

Once again I have to post here that I have been running 6 or 7 computers here 24/7 since the beginning on April 8th 2015 and the Total credit 3,068,798 did not happen by not running any computers here for months.

BUT if you also check the RAC stats page, it still has *Paul* at the top, since the stats pages have not changed for 10 WEEKS, and that former member has not done ONE task here since June 7th.

And once again let me remind you that this site has not even changed the user of the day for 10 WEEKS.

Most of the regulars here left because nothing ever gets done even when we write long posts here describing all the problems.

There are only about 5 members still here, and Ivan is one of them. I have always been the one doing most of the work here, yet, as I have said about 10 times now, this website has not been doing the simple job of keeping the stats up to date for 10 weeks.

We can't even check how our computers are doing unless we go and check each one, instead of just getting on ONE and going to our accounts to see what our computers are doing, so we know if they have tasks running or if there are any problems.

Now sure, if I was just running one computer for CERN it wouldn't be all that bad, but I am running NINE, and I have been doing this 24/7 since 2004 and am one of the very few that also does all the alpha-beta tests.

And of course I use the right URL for vLHC-dev, as do the other members here and the ones that left.

One thing for sure is that I won't type this out again and next time I will just give the link to this post and maybe all the other ones about this.

The only thing that gets updated here is the stats below our avatars, and that does not do us any good when it comes to checking all of our computers' work from our accounts.

Here you can find the many times I wasted my time explaining this (other members even asked me to), since it is like this for everyone who is left here.

https://lhcathomedev.cern.ch/lhcathome-dev/forum_user_posts.php?userid=192

maeax
Joined: 22 Apr 16
Posts: 137
Credit: 452,726
RAC: 1,339
Message 5077 - Posted: 15 Aug 2017, 16:26:51 UTC

Nils,

thank you for the server upgrade.
The statistics for -dev are on boincstats.com again.

Ray Murray
Joined: 13 Apr 15
Posts: 72
Credit: 1,483,993
RAC: 1,314
Message 5078 - Posted: 15 Aug 2017, 19:23:20 UTC - in response to Message 5077.  
Last modified: 15 Aug 2017, 19:24:45 UTC

... although, even though I let it run and return a Benchmark task just now (credited), this host is still below the 30-day cut-off and shows last contact as 15 June 8¬( , while 2 others above the cut-off updated successfully straight away.

maeax
Joined: 22 Apr 16
Posts: 137
Credit: 452,726
RAC: 1,339
Message 5079 - Posted: 16 Aug 2017, 5:03:51 UTC

Hi Ray,

after the last task finished, I reset the -dev project, removed it, and attached to it again.

Maybe this will help to get a new last-contact date.

Ray Murray
Joined: 13 Apr 15
Posts: 72
Credit: 1,483,993
RAC: 1,314
Message 5085 - Posted: 16 Aug 2017, 16:38:43 UTC - in response to Message 5079.  

Yes, Maeax,
That would work, but it would require downloading all the .vdi's again, so Magic would take about a week to do that on his 10 machines and clockwork network. And if a host is doing Sixtrack on the Production site for more than 30 days, it would again be seen as "inactive" here and the same "last contact" lock would occur. I don't know where Nils et al might look, but the fix must be server-side rather than host-side.
Now that the stats are being exported again, this might now be a purely cosmetic issue but I don't know if it might have other implications down the line.

MAGIC Quantum Mechanic
Joined: 8 Apr 15
Posts: 193
Credit: 3,695,763
RAC: 4,932
Message 5086 - Posted: 16 Aug 2017, 17:14:52 UTC
Last modified: 16 Aug 2017, 17:20:05 UTC

Ray is correct.

https://lhcathomedev.cern.ch/lhcathome-dev/hosts_user.php?userid=192

The most important part is still the same.

And last night and today I couldn't connect to the server:

8/16/2017 10:18:11 AM | lhcathome-dev | Server error: feeder not running


Nils Høimyr
Project administrator
Project developer
Project tester
Project scientist
Help desk expert
Joined: 12 Sep 14
Posts: 26
Credit: 98,143
RAC: 1
Message 5088 - Posted: 17 Aug 2017, 6:27:25 UTC - in response to Message 5086.  

Sorry about the feeder issue. We applied another update to the BOINC server code to test new job scheduling features, and missed a DB update in the process. This should now be ok.

ivan
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 20 Jan 15
Posts: 1026
Credit: 2,217,824
RAC: 3,819
Message 5090 - Posted: 17 Aug 2017, 6:56:12 UTC - in response to Message 5088.  

Sorry about the feeder issue. We applied another update to the BOINC server code to test new job scheduling features, and missed a DB update in the process. This should now be ok.

I just got two new jobs, and server status is (mostly) all green, so it looks OK again.

Ray Murray
Joined: 13 Apr 15
Posts: 72
Credit: 1,483,993
RAC: 1,314
Message 5092 - Posted: 17 Aug 2017, 21:01:23 UTC

... and the host that was being ignored has updated its "last contact" to the manual "phone home" I did just now 8¬). 5 of Magic's also have recent contacts again, so the tap with a hammer seems to have done the trick.

maeax
Joined: 22 Apr 16
Posts: 137
Credit: 452,726
RAC: 1,339
Message 5093 - Posted: 18 Aug 2017, 0:01:22 UTC

Yes Ray,

that's the real life with this hammer.

Is it possible for the Admins to activate the processor-page under statistics and sort them inside those headers?

This new -dev server is so fast at the moment, wow.

Thank you.

maeax
Joined: 22 Apr 16
Posts: 137
Credit: 452,726
RAC: 1,339
Message 5101 - Posted: 22 Aug 2017, 17:14:39 UTC

The -dev-Server is back.

Thank you Cern-IT.

ivan
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 20 Jan 15
Posts: 1026
Credit: 2,217,824
RAC: 3,819
Message 5102 - Posted: 22 Aug 2017, 18:42:27 UTC - in response to Message 5101.  

The -dev-Server is back.

Thank you Cern-IT.

Oh, goody! I'd better tickle my -dev machine to ask for tasks.

MAGIC Quantum Mechanic
Joined: 8 Apr 15
Posts: 193
Credit: 3,695,763
RAC: 4,932
Message 5103 - Posted: 23 Aug 2017, 9:21:57 UTC

I just tried to get a couple of tasks on one of my 8-core PCs and this is what I got:

Stderr output
<core_client_version>7.6.33</core_client_version>
<![CDATA[
<message>
app_version download error: couldn't get input files:
<file_xfer_error>
<file_name>vboxwrapper_26198ab7_windows_x86_64.exe</file_name>
<error_code>-224 (permanent HTTP error)</error_code>
<error_message>permanent HTTP error</error_message>
</file_xfer_error>
<file_xfer_error>
<file_name>Theory_2017_05_29.xml</file_name>
<error_code>-224 (permanent HTTP error)</error_code>
<error_message>permanent HTTP error</error_message>
</file_xfer_error>

</message>
]]>


I am about to see if it does the same thing on the other two 8-cores sitting next to that one.
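
For anyone who wants to pull the failing file names out of a stderr dump like the one above, here is a rough Python sketch. It is only an illustration, not a BOINC tool: the function name `failed_transfers` is made up for this example, and it uses regular expressions rather than an XML parser on the assumption that the dump (CDATA wrapper, no single root element) is not well-formed XML.

```python
import re

# Sample stderr text copied from the post above.
STDERR = """<core_client_version>7.6.33</core_client_version>
<![CDATA[
<message>
app_version download error: couldn't get input files:
<file_xfer_error>
<file_name>vboxwrapper_26198ab7_windows_x86_64.exe</file_name>
<error_code>-224 (permanent HTTP error)</error_code>
<error_message>permanent HTTP error</error_message>
</file_xfer_error>
<file_xfer_error>
<file_name>Theory_2017_05_29.xml</file_name>
<error_code>-224 (permanent HTTP error)</error_code>
<error_message>permanent HTTP error</error_message>
</file_xfer_error>

</message>
]]>
"""

# One match per <file_xfer_error> ... </file_xfer_error> block.
BLOCK_RE = re.compile(r"<file_xfer_error>(.*?)</file_xfer_error>", re.DOTALL)
# Inside a block, capture the three fields seen in the dump above.
FIELD_RE = re.compile(r"<(file_name|error_code|error_message)>(.*?)</\1>", re.DOTALL)


def failed_transfers(stderr_text):
    """Return one dict per file_xfer_error block found in the text (hypothetical helper)."""
    results = []
    for block in BLOCK_RE.findall(stderr_text):
        fields = {name: value.strip() for name, value in FIELD_RE.findall(block)}
        results.append(fields)
    return results


if __name__ == "__main__":
    for item in failed_transfers(STDERR):
        print(item.get("file_name"), "->", item.get("error_code"))
```

Run against several hosts' dumps, this makes it quick to see whether they all fail on the same files, which would point at the server side rather than at any one machine.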

Nils Høimyr
Project administrator
Project developer
Project tester
Project scientist
Help desk expert
Joined: 12 Sep 14
Posts: 26
Credit: 98,143
RAC: 1
Message 5104 - Posted: 23 Aug 2017, 14:44:31 UTC - in response to Message 5103.  

Thanks for pointing this out. We still have work units in the DB that point to the former vLHCathome-dev.

Download should be working again now.

©2017 CERN