Task not starting and not shutting down!
Author | Message |
---|---|
Joined: 20 May 15 Posts: 217 Credit: 6,185,015 RAC: 3,010 |
This is the StartLog from a task that hasn't done any real work and won't exit...

04/11/16 16:18:12 ******************************************************
04/11/16 16:18:12 ** condor_startd (CONDOR_STARTD) STARTING UP
04/11/16 16:18:12 ** /usr/sbin/condor_startd
04/11/16 16:18:12 ** SubsystemInfo: name=STARTD type=STARTD(7) class=DAEMON(1)
04/11/16 16:18:12 ** Configuration: subsystem:STARTD local:<NONE> class:DAEMON
04/11/16 16:18:12 ** $CondorVersion: 8.0.6 Feb 01 2014 BuildID: 225363 $
04/11/16 16:18:12 ** $CondorPlatform: x86_64_RedHat6 $
04/11/16 16:18:12 ** PID = 4464
04/11/16 16:18:12 ** Log last touched time unavailable (No such file or directory)
04/11/16 16:18:12 ******************************************************
04/11/16 16:18:12 Using config source: /etc/condor/condor_config
04/11/16 16:18:12 Using local config sources:
04/11/16 16:18:12 /etc/condor/config.d/10_security.config
04/11/16 16:18:12 /etc/condor/config.d/14_network.config
04/11/16 16:18:12 /etc/condor/config.d/20_workernode.config
04/11/16 16:18:12 /etc/condor/config.d/30_lease.config
04/11/16 16:18:12 /etc/condor/config.d/35_theory.config
04/11/16 16:18:12 /etc/condor/config.d/40_ccb.config
04/11/16 16:18:12 /etc/condor/condor_config.local
04/11/16 16:18:12 Daemon Log is logging: D_ALWAYS D_ERROR
04/11/16 16:18:12 DaemonCore: command socket at <10.0.2.15:33147?noUDP>
04/11/16 16:18:12 DaemonCore: private command socket at <10.0.2.15:33147>
04/11/16 16:18:12 ERROR: Could not open canonicalization file '/etc/condor/certificate_mapfile' (No such file or directory)
04/11/16 16:18:13 CCBListener: heartbeat disabled because interval is configured to be 0
04/11/16 16:18:13 CCBListener: registered with CCB server alicondor01.cern.ch as ccbid 188.184.129.127:9618?addrs=188.184.129.127-9618&noUDP&sock=collector#497
04/11/16 16:18:13 HibernationSupportedStates invalid '' in ad from hibernation plugin /usr/libexec/condor/condor_power_state
04/11/16 16:18:26 VM-gahp server reported an internal error
04/11/16 16:18:26 VM universe will be tested to check if it is available
04/11/16 16:18:26 History file rotation is enabled.
04/11/16 16:18:26 Maximum history file size is: 20971520 bytes
04/11/16 16:18:26 Number of rotated history files is: 2
slot type 0: Cpus: 1, Memory: auto, Swap: auto, Disk: auto
slot type 0: Cpus: 1, Memory: 4500, Swap: 100.00%, Disk: 100.00%
04/11/16 16:18:26 New machine resource allocated
04/11/16 16:18:26 CronJobList: Adding job 'mips'
04/11/16 16:18:26 CronJobList: Adding job 'kflops'
04/11/16 16:18:26 CronJob: Initializing job 'mips' (/usr/libexec/condor/condor_mips)
04/11/16 16:18:26 CronJob: Initializing job 'kflops' (/usr/libexec/condor/condor_kflops)
04/11/16 16:18:26 State change: IS_OWNER is false
04/11/16 16:18:26 Changing state: Owner -> Unclaimed
04/11/16 16:18:26 State change: RunBenchmarks is TRUE
04/11/16 16:18:26 Changing activity: Idle -> Benchmarking
04/11/16 16:18:26 BenchMgr:StartBenchmarks()
04/11/16 16:18:39 Request accepted.
04/11/16 16:18:39 Remote owner is test4theory@cern.ch
04/11/16 16:18:39 State change: claiming protocol successful
04/11/16 16:18:39 Changing state and activity: Unclaimed/Benchmarking -> Claimed/Idle
04/11/16 16:18:40 Got activate_claim request from shadow (188.184.187.167)
04/11/16 16:18:40 Remote job ID is 260339.0
04/11/16 16:18:40 Got universe "VANILLA" (5) from request classad
04/11/16 16:18:40 State change: claim-activation protocol successful
04/11/16 16:18:40 Changing activity: Idle -> Busy
04/11/16 16:18:41 PERMISSION DENIED to condor@localhost from host 10.0.2.15 for command 60008 (DC_CHILDALIVE), access level DAEMON: reason: DAEMON authorization policy contains no matching ALLOW entry for this request; identifiers used for this host: 10.0.2.15,10.0.2.15, hostname size = 1, original ip address = 10.0.2.15
04/11/16 16:18:53 State change: benchmarks completed
04/11/16 16:21:46 Called deactivate_claim_forcibly()
04/11/16 16:21:46 Starter pid 4557 exited with status 0
04/11/16 16:21:46 State change: starter exited
04/11/16 16:21:46 Changing activity: Busy -> Idle
04/11/16 16:21:47 Got activate_claim request from shadow (188.184.187.167)
04/11/16 16:21:47 Remote job ID is 260340.0
04/11/16 16:21:47 Got universe "VANILLA" (5) from request classad
04/11/16 16:21:47 State change: claim-activation protocol successful
04/11/16 16:21:47 Changing activity: Idle -> Busy
04/11/16 16:21:48 PERMISSION DENIED to condor@localhost from host 10.0.2.15 for command 60008 (DC_CHILDALIVE), access level DAEMON: reason: cached result for DAEMON; see first case for the full reason
04/11/16 16:24:15 Called deactivate_claim_forcibly()
04/11/16 16:24:15 Starter pid 5159 exited with status 0
04/11/16 16:24:15 State change: starter exited
04/11/16 16:24:15 Changing activity: Busy -> Idle
04/11/16 16:24:16 Got activate_claim request from shadow (188.184.187.167)
04/11/16 16:24:16 Remote job ID is 260341.0
04/11/16 16:24:16 Got universe "VANILLA" (5) from request classad
04/11/16 16:24:16 State change: claim-activation protocol successful
04/11/16 16:24:16 Changing activity: Idle -> Busy
04/11/16 16:24:17 PERMISSION DENIED to condor@localhost from host 10.0.2.15 for command 60008 (DC_CHILDALIVE), access level DAEMON: reason: cached result for DAEMON; see first case for the full reason
04/11/16 16:24:17 PERMISSION DENIED to condor@localhost from host 10.0.2.15 for command 60008 (DC_CHILDALIVE), access level DAEMON: reason: cached result for DAEMON; see first case for the full reason
04/11/16 16:24:17 PERMISSION DENIED to condor@localhost from host 10.0.2.15 for command 60008 (DC_CHILDALIVE), access level DAEMON: reason: cached result for DAEMON; see first case for the full reason
04/11/16 16:24:17 Starter pid 5731 exited with status 4
04/11/16 16:24:17 State change: starter exited
04/11/16 16:24:17 Changing activity: Busy -> Idle
04/11/16 16:24:18 Got activate_claim request from shadow (188.184.187.167)
04/11/16 16:24:18 Remote job ID is 260341.0
04/11/16 16:24:18 Got universe "VANILLA" (5) from request classad
04/11/16 16:24:18 State change: claim-activation protocol successful
04/11/16 16:24:18 Changing activity: Idle -> Busy
04/11/16 16:24:19 PERMISSION DENIED to condor@localhost from host 10.0.2.15 for command 60008 (DC_CHILDALIVE), access level DAEMON: reason: cached result for DAEMON; see first case for the full reason
04/11/16 16:26:44 Called deactivate_claim_forcibly()
04/11/16 16:26:44 Starter pid 5739 exited with status 0
04/11/16 16:26:44 State change: starter exited
04/11/16 16:26:44 Changing activity: Busy -> Idle
04/11/16 16:26:44 State change: received RELEASE_CLAIM command
04/11/16 16:26:44 Changing state and activity: Claimed/Idle -> Preempting/Vacating
04/11/16 16:26:44 State change: No preempting claim, returning to owner
04/11/16 16:26:44 Changing state and activity: Preempting/Vacating -> Owner/Idle
04/11/16 16:26:44 State change: IS_OWNER is false
04/11/16 16:26:44 Changing state: Owner -> Unclaimed
04/21/16 16:24:03 ******************************************************
04/21/16 16:24:03 ** condor_startd (CONDOR_STARTD) STARTING UP
04/21/16 16:24:03 ** /usr/sbin/condor_startd
04/21/16 16:24:03 ** SubsystemInfo: name=STARTD type=STARTD(7) class=DAEMON(1)
04/21/16 16:24:03 ** Configuration: subsystem:STARTD local:<NONE> class:DAEMON
04/21/16 16:24:03 ** $CondorVersion: 8.0.6 Feb 01 2014 BuildID: 225363 $
04/21/16 16:24:03 ** $CondorPlatform: x86_64_RedHat6 $
04/21/16 16:24:03 ** PID = 3550
04/21/16 16:24:03 ** Log last touched 4/21 16:23:51
04/21/16 16:24:03 ******************************************************
04/21/16 16:24:03 Using config source: /etc/condor/condor_config
04/21/16 16:24:03 Using local config sources:
04/21/16 16:24:03 /etc/condor/config.d/10_security.config
04/21/16 16:24:03 /etc/condor/config.d/14_network.config
04/21/16 16:24:03 /etc/condor/config.d/20_workernode.config
04/21/16 16:24:03 /etc/condor/config.d/30_lease.config
04/21/16 16:24:03 /etc/condor/config.d/35_theory.config
04/21/16 16:24:03 /etc/condor/config.d/40_ccb.config
04/21/16 16:24:03 /etc/condor/condor_config.local
04/21/16 16:24:03 Daemon Log is logging: D_ALWAYS D_ERROR
04/21/16 16:24:03 DaemonCore: command socket at <10.0.2.15:42749?noUDP>
04/21/16 16:24:03 DaemonCore: private command socket at <10.0.2.15:42749>
04/21/16 16:24:03 ERROR: Could not open canonicalization file '/etc/condor/certificate_mapfile' (No such file or directory)
04/21/16 16:24:16 CCBListener: heartbeat disabled because interval is configured to be 0
04/21/16 16:24:16 CCBListener: registered with CCB server alicondor01.cern.ch as ccbid 188.184.129.127:9618?addrs=188.184.129.127-9618&noUDP&sock=collector#13882
04/21/16 16:24:16 HibernationSupportedStates invalid '' in ad from hibernation plugin /usr/libexec/condor/condor_power_state
04/21/16 16:24:23 VM-gahp server reported an internal error
04/21/16 16:24:23 VM universe will be tested to check if it is available
04/21/16 16:24:23 History file rotation is enabled.
04/21/16 16:24:23 Maximum history file size is: 20971520 bytes
04/21/16 16:24:23 Number of rotated history files is: 2
slot type 0: Cpus: 1, Memory: auto, Swap: auto, Disk: auto
slot type 0: Cpus: 1, Memory: 4500, Swap: 100.00%, Disk: 100.00%
04/21/16 16:24:23 New machine resource allocated
04/21/16 16:24:23 CronJobList: Adding job 'mips'
04/21/16 16:24:23 CronJobList: Adding job 'kflops'
04/21/16 16:24:23 CronJob: Initializing job 'mips' (/usr/libexec/condor/condor_mips)
04/21/16 16:24:23 CronJob: Initializing job 'kflops' (/usr/libexec/condor/condor_kflops)
04/21/16 16:24:23 State change: IS_OWNER is false
04/21/16 16:24:23 Changing state: Owner -> Unclaimed
04/21/16 16:24:23 State change: RunBenchmarks is TRUE
04/21/16 16:24:23 Changing activity: Idle -> Benchmarking
04/21/16 16:24:23 BenchMgr:StartBenchmarks()
04/21/16 16:24:46 State change: benchmarks completed
04/21/16 16:24:46 Changing activity: Benchmarking -> Idle
04/21/16 16:25:18 Request accepted.
04/21/16 16:25:18 Remote owner is test4theory@cern.ch
04/21/16 16:25:18 State change: claiming protocol successful
04/21/16 16:25:18 Changing state: Unclaimed -> Claimed
04/21/16 16:25:20 Got activate_claim request from shadow (188.184.187.167)
04/21/16 16:25:20 Remote job ID is 271551.0
04/21/16 16:25:20 Got universe "VANILLA" (5) from request classad
04/21/16 16:25:20 State change: claim-activation protocol successful
04/21/16 16:25:20 Changing activity: Idle -> Busy
04/21/16 16:25:30 PERMISSION DENIED to condor@246-776-24187 from host 10.0.2.15 for command 60008 (DC_CHILDALIVE), access level DAEMON: reason: DAEMON authorization policy contains no matching ALLOW entry for this request; identifiers used for this host: 10.0.2.15,10.0.2.15, hostname size = 1, original ip address = 10.0.2.15
04/21/16 16:39:11 Called deactivate_claim_forcibly()
04/21/16 16:39:13 Got activate claim while starter is still alive.
04/21/16 16:39:13 Telling shadow to try again later.
04/21/16 16:39:14 Got activate claim while starter is still alive.
04/21/16 16:39:14 Telling shadow to try again later.
04/21/16 16:39:15 Got activate claim while starter is still alive.
04/21/16 16:39:15 Telling shadow to try again later.
04/21/16 16:39:16 Got activate claim while starter is still alive.
04/21/16 16:39:16 Telling shadow to try again later.
04/21/16 16:39:17 Got activate claim while starter is still alive.
04/21/16 16:39:17 Telling shadow to try again later.
04/21/16 16:39:19 Got activate claim while starter is still alive.
04/21/16 16:39:19 Telling shadow to try again later.
04/21/16 16:39:20 Got activate claim while starter is still alive.
04/21/16 16:39:20 Telling shadow to try again later.
04/21/16 16:39:21 Got activate claim while starter is still alive.
04/21/16 16:39:21 Telling shadow to try again later.
04/21/16 16:39:22 Got activate claim while starter is still alive.
04/21/16 16:39:22 Telling shadow to try again later.
04/21/16 16:39:23 Got activate claim while starter is still alive.
04/21/16 16:39:23 Telling shadow to try again later.
04/21/16 16:39:24 Got activate claim while starter is still alive.
04/21/16 16:39:24 Telling shadow to try again later.
04/21/16 16:39:25 Got activate claim while starter is still alive.
04/21/16 16:39:25 Telling shadow to try again later.
04/21/16 16:39:26 Got activate claim while starter is still alive.
04/21/16 16:39:26 Telling shadow to try again later.
04/21/16 16:39:29 Got activate claim while starter is still alive.
04/21/16 16:39:29 Telling shadow to try again later.
04/21/16 16:39:30 Got activate claim while starter is still alive.
04/21/16 16:39:30 Telling shadow to try again later.
04/21/16 16:39:31 Got activate claim while starter is still alive.
04/21/16 16:39:31 Telling shadow to try again later.
04/21/16 16:39:32 Got activate claim while starter is still alive.
04/21/16 16:39:32 Telling shadow to try again later.
04/21/16 16:39:33 Got activate claim while starter is still alive.
04/21/16 16:39:33 Telling shadow to try again later.
04/21/16 16:39:34 Got activate claim while starter is still alive.
04/21/16 16:39:34 Telling shadow to try again later.
04/21/16 16:39:36 Got activate claim while starter is still alive.
04/21/16 16:39:36 Telling shadow to try again later.
04/21/16 16:39:37 Got activate claim while starter is still alive.
04/21/16 16:39:37 Telling shadow to try again later.
04/21/16 16:39:37 Called deactivate_claim()
04/21/16 16:39:39 State change: received RELEASE_CLAIM command
04/21/16 16:39:39 Changing state and activity: Claimed/Busy -> Preempting/Vacating
04/21/16 16:39:41 starter (pid 3590) is not responding to the request to hardkill its job. The startd will now directly hard kill the starter and all its decendents.
04/21/16 16:39:41 Starter pid 3590 died on signal 9 (signal 9 (Killed))
04/21/16 16:39:41 State change: starter exited
04/21/16 16:39:41 State change: No preempting claim, returning to owner
04/21/16 16:39:41 Changing state and activity: Preempting/Vacating -> Owner/Idle
04/21/16 16:39:41 State change: IS_OWNER is false
04/21/16 16:39:41 Changing state: Owner -> Unclaimed
04/21/16 16:40:19 Request accepted.
04/21/16 16:40:19 Remote owner is test4theory@cern.ch
04/21/16 16:40:19 State change: claiming protocol successful
04/21/16 16:40:19 Changing state: Unclaimed -> Claimed
04/21/16 16:40:21 Got activate_claim request from shadow (188.184.187.167)
04/21/16 16:40:21 Remote job ID is 271557.0
04/21/16 16:40:21 Got universe "VANILLA" (5) from request classad
04/21/16 16:40:21 State change: claim-activation protocol successful
04/21/16 16:40:21 Changing activity: Idle -> Busy
04/21/16 16:40:28 PERMISSION DENIED to condor@246-776-24187 from host 10.0.2.15 for command 60008 (DC_CHILDALIVE), access level DAEMON: reason: cached result for DAEMON; see first case for the full reason
04/21/16 16:40:28 PERMISSION DENIED to condor@246-776-24187 from host 10.0.2.15 for command 60008 (DC_CHILDALIVE), access level DAEMON: reason: cached result for DAEMON; see first case for the full reason
04/21/16 16:40:47 Called deactivate_claim_forcibly()
04/21/16 16:40:48 Got activate claim while starter is still alive.
04/21/16 16:40:48 Telling shadow to try again later.
04/21/16 16:40:49 Got activate claim while starter is still alive.
04/21/16 16:40:49 Telling shadow to try again later.
04/21/16 16:40:50 Got activate claim while starter is still alive.
04/21/16 16:40:50 Telling shadow to try again later.
04/21/16 16:40:51 Got activate claim while starter is still alive.
04/21/16 16:40:51 Telling shadow to try again later.
04/21/16 16:40:52 Got activate claim while starter is still alive.
04/21/16 16:40:52 Telling shadow to try again later.
04/21/16 16:40:54 Got activate claim while starter is still alive.
04/21/16 16:40:54 Telling shadow to try again later.
04/21/16 16:40:55 Got activate claim while starter is still alive.
04/21/16 16:40:55 Telling shadow to try again later.
04/21/16 16:40:56 Got activate claim while starter is still alive.
04/21/16 16:40:56 Telling shadow to try again later.
04/21/16 16:40:57 Got activate claim while starter is still alive.
04/21/16 16:40:57 Telling shadow to try again later.
04/21/16 16:40:58 Got activate claim while starter is still alive.
04/21/16 16:40:58 Telling shadow to try again later.
04/21/16 16:40:59 Got activate claim while starter is still alive.
04/21/16 16:40:59 Telling shadow to try again later.
04/21/16 16:41:00 Got activate claim while starter is still alive.
04/21/16 16:41:00 Telling shadow to try again later.
04/21/16 16:41:01 Got activate claim while starter is still alive.
04/21/16 16:41:01 Telling shadow to try again later.
04/21/16 16:41:02 Got activate claim while starter is still alive.
04/21/16 16:41:02 Telling shadow to try again later.
04/21/16 16:41:04 Got activate claim while starter is still alive.
04/21/16 16:41:04 Telling shadow to try again later.
04/21/16 16:41:05 Got activate claim while starter is still alive.
04/21/16 16:41:05 Telling shadow to try again later.
04/21/16 16:41:06 Got activate claim while starter is still alive.
04/21/16 16:41:06 Telling shadow to try again later.
04/21/16 16:41:07 Got activate claim while starter is still alive.
04/21/16 16:41:07 Telling shadow to try again later.
04/21/16 16:41:08 Got activate claim while starter is still alive.
04/21/16 16:41:08 Telling shadow to try again later.
04/21/16 16:41:09 Got activate claim while starter is still alive.
04/21/16 16:41:09 Telling shadow to try again later.
04/21/16 16:41:10 Got activate claim while starter is still alive.
04/21/16 16:41:10 Telling shadow to try again later.
04/21/16 16:41:10 Called deactivate_claim()
04/21/16 16:41:11 State change: received RELEASE_CLAIM command
04/21/16 16:41:11 Changing state and activity: Claimed/Busy -> Preempting/Vacating
04/21/16 16:41:17 starter (pid 6986) is not responding to the request to hardkill its job. The startd will now directly hard kill the starter and all its decendents.
04/21/16 16:41:17 Starter pid 6986 died on signal 9 (signal 9 (Killed))
04/21/16 16:41:17 State change: starter exited
04/21/16 16:41:17 State change: No preempting claim, returning to owner
04/21/16 16:41:17 Changing state and activity: Preempting/Vacating -> Owner/Idle
04/21/16 16:41:17 State change: IS_OWNER is false
04/21/16 16:41:17 Changing state: Owner -> Unclaimed

I killed an earlier one with a shutdown file and in the completed log it shows...

2016-04-21 15:46:06 (16476): Guest Log: [INFO] VMID: db552956-770a-4b03-9ed9-316e25ec1573
2016-04-21 15:46:06 (16476): Guest Log: [INFO] Requesting an X509 credential from vLHC@home
2016-04-21 15:46:06 (16476): Guest Log: [INFO] Requesting an X509 credential from vLHC@home-dev
2016-04-21 15:46:06 (16476): Guest Log: [INFO] Theory application starting. Check log files.
2016-04-21 16:22:33 (16476): VM Completion File Detected.
2016-04-21 16:22:33 (16476): Powering off VM.
2016-04-21 16:22:35 (16476): Successfully stopped VM.
2016-04-21 16:22:40 (16476): Deregistering VM. (boinc_c915505983d72e43, slot#8) |
©2024 CERN