CMS network tests are getting more strict
Author | Message |
---|---|
Joined: 28 Jul 16 Posts: 481 Credit: 394,720 RAC: 0 |
The new CMS bootstrap runs a few basic network connection tests against the following target systems:

cern.ch port 80
vccs.cern.ch port 443
vocms0840.cern.ch port 9618 (HTCondor)
vocms0267.cern.ch port 4080 (WMAgent)

I recently noticed a couple of volunteer computers that don't pass these tests, which causes an EXIT_INIT_FAILURE and shuts down the VM.

The tests are done using ncat and each test is repeated up to 3 times (= runs). The timeout is currently set to 15 s for each run, which should be enough to send packets around the world a couple of times. Setting a higher timeout would not make much sense, since the HTCondor server in particular will be contacted by CMS every minute.

This example shows a computer that successfully passes run 1 to cern.ch and VCCS and run 3 to HTCondor, but fails all 3 runs to WMAgent:

2021-07-06 08:20:53 (7517): Guest Log: [INFO] Testing connection to cern.ch
2021-07-06 08:20:53 (7517): Guest Log: [INFO] Testing connection to VCCS
2021-07-06 08:20:53 (7517): Guest Log: [INFO] Testing connection to HTCondor
2021-07-06 08:21:05 (7517): Guest Log: [DEBUG] Status run 1 of up to 3: 1
2021-07-06 08:21:26 (7517): Guest Log: [DEBUG] Status run 2 of up to 3: 1
2021-07-06 08:21:31 (7517): Guest Log: [INFO] Testing connection to WMAgent
2021-07-06 08:21:44 (7517): Guest Log: [DEBUG] Status run 1 of up to 3: 1
2021-07-06 08:22:05 (7517): Guest Log: [DEBUG] Status run 2 of up to 3: 1
2021-07-06 08:22:26 (7517): Guest Log: [DEBUG] Status run 3 of up to 3: 1
2021-07-06 08:22:26 (7517): Guest Log: [DEBUG] Ncat: Version 7.50 ( https://nmap.org/ncat )
2021-07-06 08:22:26 (7517): Guest Log: Ncat: Connection timed out.
2021-07-06 08:22:26 (7517): Guest Log: [ERROR] Could not connect to vocms0267.cern.ch on port 4080
2021-07-06 08:22:26 (7517): Guest Log: [INFO] Shutting Down.

I suspect the affected computers may be located behind a heavily loaded router. I'd like to ask affected testers to describe under which local conditions the ncat tests fail. |
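For testers who want to check their own connection from the host, the test logic described above can be approximated with a small Python sketch. This is not the bootstrap's code: the real tests use ncat inside the VM, while this sketch uses plain TCP connects from Python's standard library. The hosts, ports, run count and timeout are taken from the post; the function names `can_connect` and `first_failure` are hypothetical and exist only for illustration.

```python
import socket

# Targets, run count and timeout as listed in the post above.
TARGETS = [
    ("cern.ch", 80),
    ("vccs.cern.ch", 443),
    ("vocms0840.cern.ch", 9618),  # HTCondor
    ("vocms0267.cern.ch", 4080),  # WMAgent
]
MAX_RUNS = 3
TIMEOUT_S = 15.0


def can_connect(host, port, runs=MAX_RUNS, timeout=TIMEOUT_S):
    """Return True as soon as one TCP connect succeeds, False after `runs` failures.

    Hypothetical helper: it only approximates the ncat-based test.
    """
    for run in range(1, runs + 1):
        try:
            with socket.create_connection((host, port), timeout=timeout):
                return True
        except OSError as exc:  # covers timeouts, refused connections, DNS errors
            print(f"[DEBUG] run {run} of up to {runs} failed: {exc}")
    return False


def first_failure(targets=TARGETS, check=can_connect):
    """Return the first (host, port) that cannot be reached, or None if all pass."""
    for host, port in targets:
        if not check(host, port):
            return (host, port)
    return None
```

Calling `first_failure()` with no arguments walks the four targets in order and returns the first one that fails all runs, roughly where the bootstrap would log its `[ERROR] Could not connect to ...` line. A host-side pass does not guarantee the VM passes too, since the VM's traffic goes through an additional virtual network layer.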
Joined: 8 Apr 15 Posts: 777 Credit: 12,074,862 RAC: 5,243 |
Yes, you know how that goes here for some reason, so I always have to check by looking here: https://lhcathomedev.cern.ch/lhcathome-dev/top_hosts.php And that of course only tells which computer it is and not whose, and several owners never seem to check whether they get credit for those failed tasks and never seem to look here either. We shouldn't have to do that here since this is for testing, and hiding a computer is .......... |
Joined: 22 Apr 16 Posts: 675 Credit: 1,989,507 RAC: 423 |
"Yes you know how that goes here for some reason so I always have to check by looking here" For example: Volunteer: mmonnin (451) |
Joined: 20 Jun 17 Posts: 25 Credit: 4,777,813 RAC: 5,496 |
"Yes you know how that goes here for some reason so I always have to check by looking here" Most of those are not mine, so go bark at someone else. Why doesn't LHC just remove the stats export option required by GDPR as well then? |
©2024 CERN