1) Message boards : General Discussion : Peer certificate cannot be authenticated with given CA certificates (Message 7088)
Posted 14 Jan 2021 by PDW
Post:
Work uploaded about 20 minutes ago.
2) Message boards : General Discussion : Peer certificate cannot be authenticated with given CA certificates (Message 7087)
Posted 14 Jan 2021 by PDW
Post:
Ditto...

14/01/2021 09:24:35 | lhcathome-dev | [http] [ID#1] Info:  Connected to lhcathomedev.cern.ch (137.138.44.42) port 443 (#834)
14/01/2021 09:24:35 | lhcathome-dev | [http] [ID#1] Info:  ALPN, offering http/1.1
14/01/2021 09:24:35 | lhcathome-dev | [http] [ID#1] Info:  Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
14/01/2021 09:24:35 | lhcathome-dev | [http] [ID#1] Info:  successfully set certificate verify locations:
14/01/2021 09:24:35 | lhcathome-dev | [http] [ID#1] Info:    CAfile: C:\Program Files\BOINC\ca-bundle.crt
14/01/2021 09:24:35 | lhcathome-dev | [http] [ID#1] Info:    CApath: none
14/01/2021 09:24:35 | lhcathome-dev | [http] [ID#1] Info:  TLSv1.2 (OUT), TLS header, Certificate Status (22):
14/01/2021 09:24:35 | lhcathome-dev | [http] [ID#1] Info:  TLSv1.2 (OUT), TLS handshake, Client hello (1):
14/01/2021 09:24:35 | lhcathome-dev | [http] [ID#1] Info:  TLSv1.2 (IN), TLS handshake, Server hello (2):
14/01/2021 09:24:35 | lhcathome-dev | [http] [ID#1] Info:  TLSv1.2 (IN), TLS handshake, Certificate (11):
14/01/2021 09:24:35 | lhcathome-dev | [http] [ID#1] Info:  TLSv1.2 (OUT), TLS alert, Server hello (2):
14/01/2021 09:24:35 | lhcathome-dev | [http] [ID#1] Info:  SSL certificate problem: unable to get local issuer certificate
14/01/2021 09:24:35 | lhcathome-dev | [http] [ID#1] Info:  Closing connection 834
14/01/2021 09:24:35 | lhcathome-dev | [http] HTTP error: Peer certificate cannot be authenticated with given CA certificates
14/01/2021 09:24:36 | lhcathome-dev | Scheduler request failed: Peer certificate cannot be authenticated with given CA certificates
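
For anyone sifting through a longer event log, a minimal Python sketch along these lines could pull out the reason curl gives for each failed connection. It assumes the "timestamp | project | message" layout shown above with http debug output enabled; the function name ssl_failures is made up for illustration and is not part of BOINC.

```python
import re

# Matches BOINC event-log lines of the form "date time | project | message".
LINE_RE = re.compile(r"^(\S+ \S+) \| (\S*) \| (.*)$")
# The curl diagnostic that carries the underlying verification failure.
SSL_RE = re.compile(r"SSL certificate problem: (.+)$")


def ssl_failures(log_text):
    """Return (timestamp, project, reason) for each SSL certificate problem."""
    found = []
    for raw in log_text.splitlines():
        m = LINE_RE.match(raw.strip())
        if not m:
            continue  # skip lines that are not in the expected layout
        stamp, project, message = m.groups()
        reason = SSL_RE.search(message)
        if reason:
            found.append((stamp, project, reason.group(1).strip()))
    return found


# Three lines copied from the log above.
sample = """\
14/01/2021 09:24:35 | lhcathome-dev | [http] [ID#1] Info:  TLSv1.2 (IN), TLS handshake, Certificate (11):
14/01/2021 09:24:35 | lhcathome-dev | [http] [ID#1] Info:  SSL certificate problem: unable to get local issuer certificate
14/01/2021 09:24:35 | lhcathome-dev | [http] HTTP error: Peer certificate cannot be authenticated with given CA certificates
"""

for stamp, project, reason in ssl_failures(sample):
    print(f"{stamp} {project}: {reason}")
```

"unable to get local issuer certificate" generally means the chain could not be built up to a root in the CAfile, so the usual suspects are a missing intermediate on the server side or a root that is absent from ca-bundle.crt.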
3) Message boards : Number crunching : Vritual Box E_ACCESSDENIED Error (Message 3565)
Posted 12 Jun 2016 by PDW
Post:
The hosts involved in the tasks posted have either 4 or 6GB. Should be enough for one task.

But how many tasks are they running at once, either from here or from other projects (which can demand a lot of memory)?
4) Message boards : Number crunching : Vritual Box E_ACCESSDENIED Error (Message 3559)
Posted 12 Jun 2016 by PDW
Post:
Do any of those errors appear on machines with lots of memory?

All your examples could easily be close to the edge on memory and drop off when they try to start!
5) Message boards : Theory Application : x509 proxy error (Message 3306)
Posted 10 May 2016 by PDW
Post:
LHCb task is running okay now; it will be tomorrow before I can run a Theory task!
6) Message boards : Theory Application : x509 proxy error (Message 3301)
Posted 10 May 2016 by PDW
Post:
Gave LHCb a go and that also fails (as I expected), but I did see this in the output after the 'Could not get an X509 credential' message...

/usr/sbin/boinc-shutdown: line 31: [: too many arguments

Perhaps some sort of anger management is required ?
7) Message boards : Theory Application : x509 proxy error (Message 3300)
Posted 10 May 2016 by PDW
Post:
The host is now told that it has completed its quota for the day...

10/05/2016 09:36:49 | vLHCathome-dev | Sending scheduler request: Requested by user.
10/05/2016 09:36:49 | vLHCathome-dev | Requesting new tasks for CPU
10/05/2016 09:36:50 | vLHCathome-dev | Scheduler request completed: got 0 new tasks
10/05/2016 09:36:50 | vLHCathome-dev | No tasks sent
10/05/2016 09:36:50 | vLHCathome-dev | No tasks are available for Theory Simulation
10/05/2016 09:36:50 | vLHCathome-dev | This computer has finished a daily quota of 1 tasks

I assume this is a result of the invalid tasks being reported because the system failed to get an x509 credential.
Is this what you expect to happen regarding host backoff?

Does this not mean that if every user gets this system error, all machines will be backed off for a day?
8) Message boards : Theory Application : x509 proxy error (Message 3297)
Posted 10 May 2016 by PDW
Post:
After trying 6 times (per project site) to request a credential, tasks are failing...

2016-05-10 07:18:48 (12888): Guest Log: [ERROR] Could not get an x509 credential
2016-05-10 07:18:48 (12888): Guest Log: [ERROR] The x509 proxy creation failed.
2016-05-10 07:18:48 (12888): Guest Log: [INFO] Shutting Down.
2016-05-10 07:18:48 (12888): VM Completion File Detected.
2016-05-10 07:18:48 (12888): VM Completion Message: The x509 proxy creation failed.
9) Message boards : Number crunching : Credit Per App statistics (Message 3196)
Posted 3 May 2016 by PDW
Post:
Would this list help?

The user per-app stats are working correctly but the team per-app stats have been incorrect since they first appeared two months ago :-(
10) Message boards : ATLAS Application : New Experimental ATLAS Application (Message 3010)
Posted 25 Apr 2016 by PDW
Post:
I was looking at F5 with everything scrolling up fast!
F4 is still showing that the bandwidth was 510497 (23,866,170 bytes).

Not sure how many jobs it has done, but it has been running for over 5 hours now.
11) Message boards : ATLAS Application : New Experimental ATLAS Application (Message 3009)
Posted 25 Apr 2016 by PDW
Post:
At the end of the job, you should see a gfal-copy command in Console 5 (stderr.log). After that has run, it should show the bandwidth experienced.

Bandwidth: xxx

I just happened to look whilst it was doing a gfal-copy. It did put up the headers for various columns to describe the copying, but it then scrolled up so fast as it moved on to the next job/process that I didn't get to read it. I can't find it in any of the logs yet.

How many jobs are being run at a time?
12) Message boards : Theory Application : Task not starting and not shutting down ! (Message 2903)
Posted 21 Apr 2016 by PDW
Post:
This is StartLog from a task that hasn't done any real work and won't exit...

04/11/16 16:18:12 ******************************************************
04/11/16 16:18:12 ** condor_startd (CONDOR_STARTD) STARTING UP
04/11/16 16:18:12 ** /usr/sbin/condor_startd
04/11/16 16:18:12 ** SubsystemInfo: name=STARTD type=STARTD(7) class=DAEMON(1)
04/11/16 16:18:12 ** Configuration: subsystem:STARTD local:<NONE> class:DAEMON
04/11/16 16:18:12 ** $CondorVersion: 8.0.6 Feb 01 2014 BuildID: 225363 $
04/11/16 16:18:12 ** $CondorPlatform: x86_64_RedHat6 $
04/11/16 16:18:12 ** PID = 4464
04/11/16 16:18:12 ** Log last touched time unavailable (No such file or directory)
04/11/16 16:18:12 ******************************************************
04/11/16 16:18:12 Using config source: /etc/condor/condor_config
04/11/16 16:18:12 Using local config sources:
04/11/16 16:18:12 /etc/condor/config.d/10_security.config
04/11/16 16:18:12 /etc/condor/config.d/14_network.config
04/11/16 16:18:12 /etc/condor/config.d/20_workernode.config
04/11/16 16:18:12 /etc/condor/config.d/30_lease.config
04/11/16 16:18:12 /etc/condor/config.d/35_theory.config
04/11/16 16:18:12 /etc/condor/config.d/40_ccb.config
04/11/16 16:18:12 /etc/condor/condor_config.local
04/11/16 16:18:12 Daemon Log is logging: D_ALWAYS D_ERROR
04/11/16 16:18:12 DaemonCore: command socket at <10.0.2.15:33147?noUDP>
04/11/16 16:18:12 DaemonCore: private command socket at <10.0.2.15:33147>
04/11/16 16:18:12 ERROR: Could not open canonicalization file '/etc/condor/certificate_mapfile' (No such file or directory)
04/11/16 16:18:13 CCBListener: heartbeat disabled because interval is configured to be 0
04/11/16 16:18:13 CCBListener: registered with CCB server alicondor01.cern.ch as ccbid 188.184.129.127:9618?addrs=188.184.129.127-9618&noUDP&sock=collector#497
04/11/16 16:18:13 HibernationSupportedStates invalid '' in ad from hibernation plugin /usr/libexec/condor/condor_power_state
04/11/16 16:18:26 VM-gahp server reported an internal error
04/11/16 16:18:26 VM universe will be tested to check if it is available
04/11/16 16:18:26 History file rotation is enabled.
04/11/16 16:18:26 Maximum history file size is: 20971520 bytes
04/11/16 16:18:26 Number of rotated history files is: 2
slot type 0: Cpus: 1, Memory: auto, Swap: auto, Disk: auto
slot type 0: Cpus: 1, Memory: 4500, Swap: 100.00%, Disk: 100.00%
04/11/16 16:18:26 New machine resource allocated
04/11/16 16:18:26 CronJobList: Adding job 'mips'
04/11/16 16:18:26 CronJobList: Adding job 'kflops'
04/11/16 16:18:26 CronJob: Initializing job 'mips' (/usr/libexec/condor/condor_mips)
04/11/16 16:18:26 CronJob: Initializing job 'kflops' (/usr/libexec/condor/condor_kflops)
04/11/16 16:18:26 State change: IS_OWNER is false
04/11/16 16:18:26 Changing state: Owner -> Unclaimed
04/11/16 16:18:26 State change: RunBenchmarks is TRUE
04/11/16 16:18:26 Changing activity: Idle -> Benchmarking
04/11/16 16:18:26 BenchMgr:StartBenchmarks()
04/11/16 16:18:39 Request accepted.
04/11/16 16:18:39 Remote owner is test4theory@cern.ch
04/11/16 16:18:39 State change: claiming protocol successful
04/11/16 16:18:39 Changing state and activity: Unclaimed/Benchmarking -> Claimed/Idle
04/11/16 16:18:40 Got activate_claim request from shadow (188.184.187.167)
04/11/16 16:18:40 Remote job ID is 260339.0
04/11/16 16:18:40 Got universe "VANILLA" (5) from request classad
04/11/16 16:18:40 State change: claim-activation protocol successful
04/11/16 16:18:40 Changing activity: Idle -> Busy
04/11/16 16:18:41 PERMISSION DENIED to condor@localhost from host 10.0.2.15 for command 60008 (DC_CHILDALIVE), access level DAEMON: reason: DAEMON authorization policy contains no matching ALLOW entry for this request; identifiers used for this host: 10.0.2.15,10.0.2.15, hostname size = 1, original ip address = 10.0.2.15
04/11/16 16:18:53 State change: benchmarks completed
04/11/16 16:21:46 Called deactivate_claim_forcibly()
04/11/16 16:21:46 Starter pid 4557 exited with status 0
04/11/16 16:21:46 State change: starter exited
04/11/16 16:21:46 Changing activity: Busy -> Idle
04/11/16 16:21:47 Got activate_claim request from shadow (188.184.187.167)
04/11/16 16:21:47 Remote job ID is 260340.0
04/11/16 16:21:47 Got universe "VANILLA" (5) from request classad
04/11/16 16:21:47 State change: claim-activation protocol successful
04/11/16 16:21:47 Changing activity: Idle -> Busy
04/11/16 16:21:48 PERMISSION DENIED to condor@localhost from host 10.0.2.15 for command 60008 (DC_CHILDALIVE), access level DAEMON: reason: cached result for DAEMON; see first case for the full reason
04/11/16 16:24:15 Called deactivate_claim_forcibly()
04/11/16 16:24:15 Starter pid 5159 exited with status 0
04/11/16 16:24:15 State change: starter exited
04/11/16 16:24:15 Changing activity: Busy -> Idle
04/11/16 16:24:16 Got activate_claim request from shadow (188.184.187.167)
04/11/16 16:24:16 Remote job ID is 260341.0
04/11/16 16:24:16 Got universe "VANILLA" (5) from request classad
04/11/16 16:24:16 State change: claim-activation protocol successful
04/11/16 16:24:16 Changing activity: Idle -> Busy
04/11/16 16:24:17 PERMISSION DENIED to condor@localhost from host 10.0.2.15 for command 60008 (DC_CHILDALIVE), access level DAEMON: reason: cached result for DAEMON; see first case for the full reason
04/11/16 16:24:17 PERMISSION DENIED to condor@localhost from host 10.0.2.15 for command 60008 (DC_CHILDALIVE), access level DAEMON: reason: cached result for DAEMON; see first case for the full reason
04/11/16 16:24:17 PERMISSION DENIED to condor@localhost from host 10.0.2.15 for command 60008 (DC_CHILDALIVE), access level DAEMON: reason: cached result for DAEMON; see first case for the full reason
04/11/16 16:24:17 Starter pid 5731 exited with status 4
04/11/16 16:24:17 State change: starter exited
04/11/16 16:24:17 Changing activity: Busy -> Idle
04/11/16 16:24:18 Got activate_claim request from shadow (188.184.187.167)
04/11/16 16:24:18 Remote job ID is 260341.0
04/11/16 16:24:18 Got universe "VANILLA" (5) from request classad
04/11/16 16:24:18 State change: claim-activation protocol successful
04/11/16 16:24:18 Changing activity: Idle -> Busy
04/11/16 16:24:19 PERMISSION DENIED to condor@localhost from host 10.0.2.15 for command 60008 (DC_CHILDALIVE), access level DAEMON: reason: cached result for DAEMON; see first case for the full reason
04/11/16 16:26:44 Called deactivate_claim_forcibly()
04/11/16 16:26:44 Starter pid 5739 exited with status 0
04/11/16 16:26:44 State change: starter exited
04/11/16 16:26:44 Changing activity: Busy -> Idle
04/11/16 16:26:44 State change: received RELEASE_CLAIM command
04/11/16 16:26:44 Changing state and activity: Claimed/Idle -> Preempting/Vacating
04/11/16 16:26:44 State change: No preempting claim, returning to owner
04/11/16 16:26:44 Changing state and activity: Preempting/Vacating -> Owner/Idle
04/11/16 16:26:44 State change: IS_OWNER is false
04/11/16 16:26:44 Changing state: Owner -> Unclaimed
04/21/16 16:24:03 ******************************************************
04/21/16 16:24:03 ** condor_startd (CONDOR_STARTD) STARTING UP
04/21/16 16:24:03 ** /usr/sbin/condor_startd
04/21/16 16:24:03 ** SubsystemInfo: name=STARTD type=STARTD(7) class=DAEMON(1)
04/21/16 16:24:03 ** Configuration: subsystem:STARTD local:<NONE> class:DAEMON
04/21/16 16:24:03 ** $CondorVersion: 8.0.6 Feb 01 2014 BuildID: 225363 $
04/21/16 16:24:03 ** $CondorPlatform: x86_64_RedHat6 $
04/21/16 16:24:03 ** PID = 3550
04/21/16 16:24:03 ** Log last touched 4/21 16:23:51
04/21/16 16:24:03 ******************************************************
04/21/16 16:24:03 Using config source: /etc/condor/condor_config
04/21/16 16:24:03 Using local config sources:
04/21/16 16:24:03 /etc/condor/config.d/10_security.config
04/21/16 16:24:03 /etc/condor/config.d/14_network.config
04/21/16 16:24:03 /etc/condor/config.d/20_workernode.config
04/21/16 16:24:03 /etc/condor/config.d/30_lease.config
04/21/16 16:24:03 /etc/condor/config.d/35_theory.config
04/21/16 16:24:03 /etc/condor/config.d/40_ccb.config
04/21/16 16:24:03 /etc/condor/condor_config.local
04/21/16 16:24:03 Daemon Log is logging: D_ALWAYS D_ERROR
04/21/16 16:24:03 DaemonCore: command socket at <10.0.2.15:42749?noUDP>
04/21/16 16:24:03 DaemonCore: private command socket at <10.0.2.15:42749>
04/21/16 16:24:03 ERROR: Could not open canonicalization file '/etc/condor/certificate_mapfile' (No such file or directory)
04/21/16 16:24:16 CCBListener: heartbeat disabled because interval is configured to be 0
04/21/16 16:24:16 CCBListener: registered with CCB server alicondor01.cern.ch as ccbid 188.184.129.127:9618?addrs=188.184.129.127-9618&noUDP&sock=collector#13882
04/21/16 16:24:16 HibernationSupportedStates invalid '' in ad from hibernation plugin /usr/libexec/condor/condor_power_state
04/21/16 16:24:23 VM-gahp server reported an internal error
04/21/16 16:24:23 VM universe will be tested to check if it is available
04/21/16 16:24:23 History file rotation is enabled.
04/21/16 16:24:23 Maximum history file size is: 20971520 bytes
04/21/16 16:24:23 Number of rotated history files is: 2
slot type 0: Cpus: 1, Memory: auto, Swap: auto, Disk: auto
slot type 0: Cpus: 1, Memory: 4500, Swap: 100.00%, Disk: 100.00%
04/21/16 16:24:23 New machine resource allocated
04/21/16 16:24:23 CronJobList: Adding job 'mips'
04/21/16 16:24:23 CronJobList: Adding job 'kflops'
04/21/16 16:24:23 CronJob: Initializing job 'mips' (/usr/libexec/condor/condor_mips)
04/21/16 16:24:23 CronJob: Initializing job 'kflops' (/usr/libexec/condor/condor_kflops)
04/21/16 16:24:23 State change: IS_OWNER is false
04/21/16 16:24:23 Changing state: Owner -> Unclaimed
04/21/16 16:24:23 State change: RunBenchmarks is TRUE
04/21/16 16:24:23 Changing activity: Idle -> Benchmarking
04/21/16 16:24:23 BenchMgr:StartBenchmarks()
04/21/16 16:24:46 State change: benchmarks completed
04/21/16 16:24:46 Changing activity: Benchmarking -> Idle
04/21/16 16:25:18 Request accepted.
04/21/16 16:25:18 Remote owner is test4theory@cern.ch
04/21/16 16:25:18 State change: claiming protocol successful
04/21/16 16:25:18 Changing state: Unclaimed -> Claimed
04/21/16 16:25:20 Got activate_claim request from shadow (188.184.187.167)
04/21/16 16:25:20 Remote job ID is 271551.0
04/21/16 16:25:20 Got universe "VANILLA" (5) from request classad
04/21/16 16:25:20 State change: claim-activation protocol successful
04/21/16 16:25:20 Changing activity: Idle -> Busy
04/21/16 16:25:30 PERMISSION DENIED to condor@246-776-24187 from host 10.0.2.15 for command 60008 (DC_CHILDALIVE), access level DAEMON: reason: DAEMON authorization policy contains no matching ALLOW entry for this request; identifiers used for this host: 10.0.2.15,10.0.2.15, hostname size = 1, original ip address = 10.0.2.15
04/21/16 16:39:11 Called deactivate_claim_forcibly()
04/21/16 16:39:13 Got activate claim while starter is still alive.
04/21/16 16:39:13 Telling shadow to try again later.
04/21/16 16:39:14 Got activate claim while starter is still alive.
04/21/16 16:39:14 Telling shadow to try again later.
04/21/16 16:39:15 Got activate claim while starter is still alive.
04/21/16 16:39:15 Telling shadow to try again later.
04/21/16 16:39:16 Got activate claim while starter is still alive.
04/21/16 16:39:16 Telling shadow to try again later.
04/21/16 16:39:17 Got activate claim while starter is still alive.
04/21/16 16:39:17 Telling shadow to try again later.
04/21/16 16:39:19 Got activate claim while starter is still alive.
04/21/16 16:39:19 Telling shadow to try again later.
04/21/16 16:39:20 Got activate claim while starter is still alive.
04/21/16 16:39:20 Telling shadow to try again later.
04/21/16 16:39:21 Got activate claim while starter is still alive.
04/21/16 16:39:21 Telling shadow to try again later.
04/21/16 16:39:22 Got activate claim while starter is still alive.
04/21/16 16:39:22 Telling shadow to try again later.
04/21/16 16:39:23 Got activate claim while starter is still alive.
04/21/16 16:39:23 Telling shadow to try again later.
04/21/16 16:39:24 Got activate claim while starter is still alive.
04/21/16 16:39:24 Telling shadow to try again later.
04/21/16 16:39:25 Got activate claim while starter is still alive.
04/21/16 16:39:25 Telling shadow to try again later.
04/21/16 16:39:26 Got activate claim while starter is still alive.
04/21/16 16:39:26 Telling shadow to try again later.
04/21/16 16:39:29 Got activate claim while starter is still alive.
04/21/16 16:39:29 Telling shadow to try again later.
04/21/16 16:39:30 Got activate claim while starter is still alive.
04/21/16 16:39:30 Telling shadow to try again later.
04/21/16 16:39:31 Got activate claim while starter is still alive.
04/21/16 16:39:31 Telling shadow to try again later.
04/21/16 16:39:32 Got activate claim while starter is still alive.
04/21/16 16:39:32 Telling shadow to try again later.
04/21/16 16:39:33 Got activate claim while starter is still alive.
04/21/16 16:39:33 Telling shadow to try again later.
04/21/16 16:39:34 Got activate claim while starter is still alive.
04/21/16 16:39:34 Telling shadow to try again later.
04/21/16 16:39:36 Got activate claim while starter is still alive.
04/21/16 16:39:36 Telling shadow to try again later.
04/21/16 16:39:37 Got activate claim while starter is still alive.
04/21/16 16:39:37 Telling shadow to try again later.
04/21/16 16:39:37 Called deactivate_claim()
04/21/16 16:39:39 State change: received RELEASE_CLAIM command
04/21/16 16:39:39 Changing state and activity: Claimed/Busy -> Preempting/Vacating
04/21/16 16:39:41 starter (pid 3590) is not responding to the request to hardkill its job. The startd will now directly hard kill the starter and all its decendents.
04/21/16 16:39:41 Starter pid 3590 died on signal 9 (signal 9 (Killed))
04/21/16 16:39:41 State change: starter exited
04/21/16 16:39:41 State change: No preempting claim, returning to owner
04/21/16 16:39:41 Changing state and activity: Preempting/Vacating -> Owner/Idle
04/21/16 16:39:41 State change: IS_OWNER is false
04/21/16 16:39:41 Changing state: Owner -> Unclaimed
04/21/16 16:40:19 Request accepted.
04/21/16 16:40:19 Remote owner is test4theory@cern.ch
04/21/16 16:40:19 State change: claiming protocol successful
04/21/16 16:40:19 Changing state: Unclaimed -> Claimed
04/21/16 16:40:21 Got activate_claim request from shadow (188.184.187.167)
04/21/16 16:40:21 Remote job ID is 271557.0
04/21/16 16:40:21 Got universe "VANILLA" (5) from request classad
04/21/16 16:40:21 State change: claim-activation protocol successful
04/21/16 16:40:21 Changing activity: Idle -> Busy
04/21/16 16:40:28 PERMISSION DENIED to condor@246-776-24187 from host 10.0.2.15 for command 60008 (DC_CHILDALIVE), access level DAEMON: reason: cached result for DAEMON; see first case for the full reason
04/21/16 16:40:28 PERMISSION DENIED to condor@246-776-24187 from host 10.0.2.15 for command 60008 (DC_CHILDALIVE), access level DAEMON: reason: cached result for DAEMON; see first case for the full reason
04/21/16 16:40:47 Called deactivate_claim_forcibly()
04/21/16 16:40:48 Got activate claim while starter is still alive.
04/21/16 16:40:48 Telling shadow to try again later.
04/21/16 16:40:49 Got activate claim while starter is still alive.
04/21/16 16:40:49 Telling shadow to try again later.
04/21/16 16:40:50 Got activate claim while starter is still alive.
04/21/16 16:40:50 Telling shadow to try again later.
04/21/16 16:40:51 Got activate claim while starter is still alive.
04/21/16 16:40:51 Telling shadow to try again later.
04/21/16 16:40:52 Got activate claim while starter is still alive.
04/21/16 16:40:52 Telling shadow to try again later.
04/21/16 16:40:54 Got activate claim while starter is still alive.
04/21/16 16:40:54 Telling shadow to try again later.
04/21/16 16:40:55 Got activate claim while starter is still alive.
04/21/16 16:40:55 Telling shadow to try again later.
04/21/16 16:40:56 Got activate claim while starter is still alive.
04/21/16 16:40:56 Telling shadow to try again later.
04/21/16 16:40:57 Got activate claim while starter is still alive.
04/21/16 16:40:57 Telling shadow to try again later.
04/21/16 16:40:58 Got activate claim while starter is still alive.
04/21/16 16:40:58 Telling shadow to try again later.
04/21/16 16:40:59 Got activate claim while starter is still alive.
04/21/16 16:40:59 Telling shadow to try again later.
04/21/16 16:41:00 Got activate claim while starter is still alive.
04/21/16 16:41:00 Telling shadow to try again later.
04/21/16 16:41:01 Got activate claim while starter is still alive.
04/21/16 16:41:01 Telling shadow to try again later.
04/21/16 16:41:02 Got activate claim while starter is still alive.
04/21/16 16:41:02 Telling shadow to try again later.
04/21/16 16:41:04 Got activate claim while starter is still alive.
04/21/16 16:41:04 Telling shadow to try again later.
04/21/16 16:41:05 Got activate claim while starter is still alive.
04/21/16 16:41:05 Telling shadow to try again later.
04/21/16 16:41:06 Got activate claim while starter is still alive.
04/21/16 16:41:06 Telling shadow to try again later.
04/21/16 16:41:07 Got activate claim while starter is still alive.
04/21/16 16:41:07 Telling shadow to try again later.
04/21/16 16:41:08 Got activate claim while starter is still alive.
04/21/16 16:41:08 Telling shadow to try again later.
04/21/16 16:41:09 Got activate claim while starter is still alive.
04/21/16 16:41:09 Telling shadow to try again later.
04/21/16 16:41:10 Got activate claim while starter is still alive.
04/21/16 16:41:10 Telling shadow to try again later.
04/21/16 16:41:10 Called deactivate_claim()
04/21/16 16:41:11 State change: received RELEASE_CLAIM command
04/21/16 16:41:11 Changing state and activity: Claimed/Busy -> Preempting/Vacating
04/21/16 16:41:17 starter (pid 6986) is not responding to the request to hardkill its job. The startd will now directly hard kill the starter and all its decendents.
04/21/16 16:41:17 Starter pid 6986 died on signal 9 (signal 9 (Killed))
04/21/16 16:41:17 State change: starter exited
04/21/16 16:41:17 State change: No preempting claim, returning to owner
04/21/16 16:41:17 Changing state and activity: Preempting/Vacating -> Owner/Idle
04/21/16 16:41:17 State change: IS_OWNER is false
04/21/16 16:41:17 Changing state: Owner -> Unclaimed
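
A rough way to compare the two runs in the StartLog is to count a few key messages rather than read every line. The sketch below is only that, a sketch: it assumes the message wording shown in the log above (HTCondor 8.0.6), and the function name summarise_startlog and the counter labels are invented for illustration.

```python
import re
from collections import Counter

# Patterns for the condor_startd messages of interest in the StartLog above.
JOB_RE = re.compile(r"Remote job ID is (\S+)")
RETRY_RE = re.compile(r"Got activate claim while starter is still alive")
EXIT_RE = re.compile(r"Starter pid (\d+) exited with status (\d+)")
KILL_RE = re.compile(r"Starter pid (\d+) died on signal (\d+)")


def summarise_startlog(log_text):
    """Count job activations, refused activations, and starter outcomes."""
    counts = Counter()
    for line in log_text.splitlines():
        if JOB_RE.search(line):
            counts["jobs_activated"] += 1
        elif RETRY_RE.search(line):
            counts["activations_refused"] += 1
        elif EXIT_RE.search(line):
            counts["starter_exited"] += 1
        elif KILL_RE.search(line):
            counts["starter_killed"] += 1
    return counts


# A few lines copied from the log above.
sample = """\
04/11/16 16:21:46 Starter pid 4557 exited with status 0
04/21/16 16:25:20 Remote job ID is 271551.0
04/21/16 16:39:13 Got activate claim while starter is still alive.
04/21/16 16:39:14 Got activate claim while starter is still alive.
04/21/16 16:39:41 Starter pid 3590 died on signal 9 (signal 9 (Killed))
"""

for label, count in sorted(summarise_startlog(sample).items()):
    print(f"{label}: {count}")
```

Run over the full log, a summary like this should make the difference visible: on 04/11 the starters exit by themselves, whereas on 04/21 every starter ends up being killed after a burst of refused activations.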


I killed an earlier one with a shutdown file and in the completed log it shows...

2016-04-21 15:46:06 (16476): Guest Log: [INFO] VMID: db552956-770a-4b03-9ed9-316e25ec1573
2016-04-21 15:46:06 (16476): Guest Log: [INFO] Requesting an X509 credential from vLHC@home
2016-04-21 15:46:06 (16476): Guest Log: [INFO] Requesting an X509 credential from vLHC@home-dev
2016-04-21 15:46:06 (16476): Guest Log: [INFO] Theory application starting. Check log files.
2016-04-21 16:22:33 (16476): VM Completion File Detected.
2016-04-21 16:22:33 (16476): Powering off VM.
2016-04-21 16:22:35 (16476): Successfully stopped VM.
2016-04-21 16:22:40 (16476): Deregistering VM. (boinc_c915505983d72e43, slot#8)
13) Message boards : News : Project Configuration Update (Message 2850)
Posted 19 Apr 2016 by PDW
Post:
Have put the limit to 5 tasks in progress.

Yep, that's about the limit for a 16GB host.

Aww, but what about my 128 GB host (with 20 cores...)?

It seems to have a terrible performance:
State: All (66) · In progress (0) · Validation pending (0) · Validation inconclusive (0) · Valid (16) · Invalid (0) · Error (50)

That's probably due to the Fibre internet connection he has.
Shame the Fibre in question is a piece of damp string ;-)
14) Message boards : News : Project Configuration Update (Message 2843)
Posted 19 Apr 2016 by PDW
Post:
The flood gates have been opened.

No problems with any extra tasks running, but 1 machine had been running 2 at a time and getting through them in about 7 mins due to...

2016-04-19 09:00:39 (4576): Guest Log: [INFO] Theory application starting. Check log files.
2016-04-19 09:06:29 (4576): Guest Log: [ERROR] App is not supported. Shutting down!

It had been doing that for a few hours before the gates opened, so with estimated run times that low it downloaded over 800!
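
As a back-of-the-envelope check on why such short run times lead to a huge download, the sketch below estimates how many tasks fit into a work buffer. It is a deliberately simplified model and not BOINC's actual work-fetch algorithm: the 2-day buffer and the 2-hour comparison task are hypothetical values picked for illustration, and only the "2 at a time" and "about 7 mins" figures come from the post.

```python
def tasks_to_fill_buffer(buffer_days, concurrent_tasks, est_runtime_minutes):
    """Estimate how many tasks are needed to keep a host busy for buffer_days.

    Simplified model: each of concurrent_tasks slots runs tasks back to back,
    and every task is assumed to take est_runtime_minutes.
    """
    buffer_minutes = buffer_days * 24 * 60
    tasks_per_slot = buffer_minutes / est_runtime_minutes
    return int(tasks_per_slot * concurrent_tasks)


# Hypothetical host: 2-day buffer, 2 tasks at a time.
print(tasks_to_fill_buffer(2, 2, 7))    # failing tasks at about 7 minutes each
print(tasks_to_fill_buffer(2, 2, 120))  # hypothetical healthy 2-hour tasks
```

Under those assumptions the failing tasks inflate the request by a factor of about 17, which is the right order of magnitude for a download of over 800.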
15) Message boards : News : Project Restructuring (Message 2813)
Posted 18 Apr 2016 by PDW
Post:
I was hoping that the server update yesterday might result in per-project breakdowns on the user account page, like other projects have. Is this in the pipeline of things to do?

Also, the user stats have been corrected, thanks, but the team stats (Edit: Per Application ones) are still wrong. Can you have a fiddle with those when you have time, please?


Bump.

Bumpety-bump (as Atlas has appeared in the stats) !
16) Message boards : LHCb Application : Not validating (Message 2774)
Posted 15 Apr 2016 by PDW
Post:
What about now?

Whatever you did worked for mine, thanks :)
17) Message boards : LHCb Application : Not validating (Message 2771)
Posted 15 Apr 2016 by PDW
Post:
Is it working for you now?


Now I have 14 waiting for Validation

http://lhcathomedev.cern.ch/vLHCathome-dev/results.php?userid=192

Though 8 of those on your host ID 345 aborted quickly with an error like this...

2016-04-13 08:10:06 (7660): Guest Log: [ERROR] Cloud not get an x509 credential

The same machine had been aborting tasks quickly as it couldn't find the heartbeat file, though the current one looks like it might be working.
18) Message boards : Theory Application : The Theory Application (Message 2752)
Posted 14 Apr 2016 by PDW
Post:
Back to tasks failing to start work due to...

2016-04-14 12:17:09 (16392): Guest Log: [INFO] Theory application starting. Check log files.
2016-04-14 12:23:10 (16392): Guest Log: [ERROR] App is not supported. Shutting down!
19) Message boards : LHCb Application : Not validating (Message 2738)
Posted 13 Apr 2016 by PDW
Post:
Not for me.
20) Message boards : Theory Application : Credentials (Message 2737)
Posted 13 Apr 2016 by PDW
Post:
Whilst I've been out my credentials have gone again :(
Ignore that, it's my brain that has gone again !



