Message boards : Theory Application : Tasks are finishing prematurely

Crystal Pellet
Volunteer tester

Joined: 13 Feb 15
Posts: 1188
Credit: 859,751
RAC: 36
Message 3113 - Posted: 30 Apr 2016, 6:34:29 UTC

After the second job of a new task finished, the task was killed prematurely.
http://lhcathomedev.cern.ch/vLHCathome-dev/result.php?resultid=167051

Last part of StarterLog:

04/30/16 07:24:28 Create_Process succeeded, pid=4196
04/30/16 07:39:38 condor_read() failed: recv(fd=8) returned -1, errno = 104 Connection reset by peer, reading 21 bytes from <188.184.187.167:9618>.
04/30/16 07:39:38 IO: Failed to read packet header
04/30/16 07:44:39 condor_write(): Socket closed when trying to write 410 bytes to <188.184.187.167:9618>, fd is 8
04/30/16 07:44:39 Buf::write(): condor_write() failed
04/30/16 07:49:39 condor_write(): Socket closed when trying to write 410 bytes to <188.184.187.167:9618>, fd is 8
04/30/16 07:49:39 Buf::write(): condor_write() failed
04/30/16 07:54:40 condor_write(): Socket closed when trying to write 410 bytes to <188.184.187.167:9618>, fd is 8
04/30/16 07:54:40 Buf::write(): condor_write() failed
04/30/16 07:59:41 condor_write(): Socket closed when trying to write 410 bytes to <188.184.187.167:9618>, fd is 8
04/30/16 07:59:41 Buf::write(): condor_write() failed
04/30/16 08:04:41 condor_write(): Socket closed when trying to write 410 bytes to <188.184.187.167:9618>, fd is 8
04/30/16 08:04:41 Buf::write(): condor_write() failed
04/30/16 08:09:06 Process exited, pid=4196, status=0
04/30/16 08:09:06 About to exec Post script: /var/lib/condor/execute/dir_4192/tarOutput.sh 2016-562440-218
04/30/16 08:09:06 Create_Process succeeded, pid=9698
04/30/16 08:09:06 Process exited, pid=9698, status=0
04/30/16 08:09:06 condor_write(): Socket closed when trying to write 581 bytes to <188.184.187.167:9618>, fd is 8
04/30/16 08:09:06 Buf::write(): condor_write() failed
04/30/16 08:09:06 condor_write(): Socket closed when trying to write 363 bytes to <188.184.187.167:9618>, fd is 8
04/30/16 08:09:06 Buf::write(): condor_write() failed
04/30/16 08:09:06 Failed to send job exit status to shadow
04/30/16 08:09:06 JobExit() failed, waiting for job lease to expire or for a reconnect attempt
04/30/16 08:17:47 Got SIGQUIT. Performing fast shutdown.
04/30/16 08:17:47 ShutdownFast all jobs.
04/30/16 08:17:47 condor_write(): Socket closed when trying to write 363 bytes to <188.184.187.167:9618>, fd is 8
04/30/16 08:17:47 Buf::write(): condor_write() failed
04/30/16 08:17:47 Failed to send job exit status to shadow
04/30/16 08:17:47 JobExit() failed, waiting for job lease to expire or for a reconnect attempt
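
For anyone who wants to pull the key numbers out of a StarterLog like this one, here is a minimal Python sketch. It is not part of HTCondor or the Theory application: the function name summarize_starter_log and the dictionary keys are made up for illustration, and the strings it matches are just the ones visible in the excerpt above.

```python
import re
from datetime import datetime

# Timestamp layout seen in the StarterLog excerpt: MM/DD/YY HH:MM:SS
LINE_RE = re.compile(r"^(\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) (.*)$")

# A few lines copied from the log above, used as sample input.
SAMPLE = """\
04/30/16 07:24:28 Create_Process succeeded, pid=4196
04/30/16 07:39:38 condor_read() failed: recv(fd=8) returned -1, errno = 104 Connection reset by peer, reading 21 bytes from <188.184.187.167:9618>.
04/30/16 07:44:39 condor_write(): Socket closed when trying to write 410 bytes to <188.184.187.167:9618>, fd is 8
04/30/16 08:09:06 Process exited, pid=4196, status=0
04/30/16 08:17:47 Got SIGQUIT. Performing fast shutdown.
"""


def summarize_starter_log(text):
    """Hypothetical helper: summarize connection trouble in a StarterLog."""
    first_read_failure = None
    write_failures = 0
    shutdown = None
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip lines without a leading timestamp
        stamp = datetime.strptime(match.group(1), "%m/%d/%y %H:%M:%S")
        message = match.group(2)
        if "condor_read() failed" in message and first_read_failure is None:
            first_read_failure = stamp
        if message.startswith("condor_write(): Socket closed"):
            write_failures += 1
        if "Got SIGQUIT" in message:
            shutdown = stamp
    gap = None
    if first_read_failure is not None and shutdown is not None:
        gap = int((shutdown - first_read_failure).total_seconds())
    return {
        "first_read_failure": first_read_failure,
        "write_failures": write_failures,
        "shutdown": shutdown,
        "seconds_from_failure_to_shutdown": gap,
    }


summary = summarize_starter_log(SAMPLE)
print(summary)
```

Run against the full log, this shows how long the starter kept going after the connection to the shadow was lost before the fast shutdown arrived.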
Crystal Pellet
Volunteer tester

Joined: 13 Feb 15
Posts: 1188
Credit: 859,751
RAC: 36
Message 3116 - Posted: 30 Apr 2016, 8:36:55 UTC

A new task ran for only 7 minutes: http://lhcathomedev.cern.ch/vLHCathome-dev/result.php?resultid=167108

2016-04-30 09:45:08 (5544): Guest Log: [INFO] Mounting the shared directory
2016-04-30 09:45:08 (5544): Guest Log: [INFO] Shared directory mounted, enabling vboxmonitor
2016-04-30 09:45:08 (5544): Guest Log: [DEBUG] Probing CVMFS ...
2016-04-30 09:45:08 (5544): Guest Log: Probing /cvmfs/grid.cern.ch... OK
2016-04-30 09:45:08 (5544): Guest Log: Probing /cvmfs/sft.cern.ch... OK
2016-04-30 09:45:18 (5544): Guest Log: 0
2016-04-30 09:45:18 (5544): Guest Log: cms.cern.ch not mounted
2016-04-30 09:45:18 (5544): Guest Log: 1
2016-04-30 09:45:18 (5544): Guest Log: [DEBUG] Finished probing.
2016-04-30 09:45:18 (5544): Guest Log: [INFO] Reading volunteer information
2016-04-30 09:45:18 (5544): Guest Log: [INFO] Volunteer: Crystal Pellet (38) Host: 37
2016-04-30 09:45:18 (5544): Guest Log: [INFO] VMID: 5a65677f-3929-47b8-97cd-9212275cc67f
2016-04-30 09:45:18 (5544): Guest Log: [INFO] Requesting an X509 credential from vLHC@home
2016-04-30 09:45:18 (5544): Guest Log: [INFO] Requesting an X509 credential from vLHC@home-dev
2016-04-30 09:45:18 (5544): Guest Log: [INFO] Theory application starting. Check log files.
2016-04-30 09:51:00 (5544): Guest Log: [INFO] Condor exited with 0
2016-04-30 09:51:00 (5544): Guest Log: [INFO] Shutting Down.
Rasputin42
Volunteer tester

Joined: 16 Aug 15
Posts: 966
Credit: 1,211,816
RAC: 0
Message 3117 - Posted: 30 Apr 2016, 8:39:48 UTC - in response to Message 3116.  
Last modified: 30 Apr 2016, 8:40:24 UTC

This is typical when no jobs are available.
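
That rule of thumb can be sketched in Python roughly as follows. The 10-minute threshold and the names guest_runtime_seconds and looks_like_no_jobs are assumptions for illustration only; the matched strings are the ones from the guest log quoted in the previous post.

```python
import re
from datetime import datetime

# Layout of the vboxwrapper guest log lines quoted in this thread.
GUEST_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \(\d+\): Guest Log: (.*)$"
)

# Lines copied from the quoted log, used as sample input.
SAMPLE = """\
2016-04-30 09:45:18 (5544): Guest Log: [INFO] Theory application starting. Check log files.
2016-04-30 09:51:00 (5544): Guest Log: [INFO] Condor exited with 0
2016-04-30 09:51:00 (5544): Guest Log: [INFO] Shutting Down.
"""


def guest_runtime_seconds(text):
    """Seconds between application start and a clean Condor exit, or None."""
    started = None
    exited = None
    for line in text.splitlines():
        match = GUEST_RE.match(line.strip())
        if not match:
            continue  # not a guest log line
        stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        message = match.group(2).strip()
        if "Theory application starting" in message:
            started = stamp
        elif message.endswith("Condor exited with 0"):
            exited = stamp
    if started is None or exited is None:
        return None
    return int((exited - started).total_seconds())


def looks_like_no_jobs(text, threshold_seconds=600):
    """Assumed heuristic: a clean exit after a very short run suggests an empty queue."""
    runtime = guest_runtime_seconds(text)
    return runtime is not None and runtime < threshold_seconds


runtime = guest_runtime_seconds(SAMPLE)
print(runtime, looks_like_no_jobs(SAMPLE))
```

A short run with a clean exit is only a hint, not proof; the task in the first post also ended early but after a lost connection rather than an empty queue.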
Leonardo Cristella

Joined: 4 Mar 16
Posts: 31
Credit: 44,320
RAC: 0
Message 3119 - Posted: 30 Apr 2016, 8:58:14 UTC - in response to Message 3117.  

Just submitted a new bunch. Sorry for the delay.
Rasputin42
Volunteer tester

Joined: 16 Aug 15
Posts: 966
Credit: 1,211,816
RAC: 0
Message 3136 - Posted: 1 May 2016, 7:30:46 UTC
Last modified: 1 May 2016, 7:31:15 UTC

I had tasks finishing after about 7 minutes between 01:46 and 03:11 UTC this morning (no jobs?).
After that, the tasks worked again and kept running.
Crystal Pellet
Volunteer tester

Joined: 13 Feb 15
Posts: 1188
Credit: 859,751
RAC: 36
Message 3148 - Posted: 1 May 2016, 11:04:38 UTC

This task was running normally and was then killed. It was busy with a Sherpa job, 3 minutes into event processing and
estimated to take 2-3 hours, when suddenly the VM was idling without any 'nobody' processes.
