Name RZOKDm2u8g3n7Olcko1bjSoqABFKDmABFKDm7AsVDmDkNKDmYNdCsm_0
Workunit 2319801
Created 21 Jul 2023, 17:16:26 UTC
Sent 21 Jul 2023, 17:17:43 UTC
Report deadline 28 Jul 2023, 17:17:43 UTC
Received 21 Jul 2023, 18:03:29 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 0 (0x00000000)
Computer ID 4721
Run time 9 min 45 sec
CPU time 1 min 2 sec
Validate state Invalid
Credit 0.00
Device peak FLOPS 5.90 GFLOPS
Application version ATLAS Simulation v3.01 (native_mt) x86_64-pc-linux-gnu
Peak working set size 1.19 GB
Peak swap size 1.71 GB
Peak disk usage 36.33 MB

Stderr output

<core_client_version>7.20.2</core_client_version>
<![CDATA[
<stderr_txt>
19:36:36 (2430897): wrapper (7.7.26015): starting
19:36:36 (2430897): wrapper: running run_atlas (--nthreads 1)
[2023-07-21 19:36:36] Arguments: --nthreads 1
[2023-07-21 19:36:36] Threads: 1
[2023-07-21 19:36:36] Checking for CVMFS
[2023-07-21 19:36:38] Probing /cvmfs/atlas.cern.ch... OK
[2023-07-21 19:36:38] Probing /cvmfs/atlas-condb.cern.ch... OK
[2023-07-21 19:36:38] Running cvmfs_config stat atlas.cern.ch
[2023-07-21 19:36:38] VERSION PID UPTIME(M) MEM(K) REVISION EXPIRES(M) NOCATALOGS CACHEUSE(K) CACHEMAX(K) NOFDUSE NOFDMAX NOIOERR NOOPEN HITRATE(%) RX(K) SPEED(K/S) HOST PROXY ONLINE
[2023-07-21 19:36:38] 2.10.1.0 2217423 180 50540 121547 0 73 2852265 4096001 0 130560 0 839612 99.996 53079 3118 http://s1cern-cvmfs.openhtc.io/cvmfs/atlas.cern.ch http://10.116.178.201:3128 1
[2023-07-21 19:36:38] CVMFS is ok
[2023-07-21 19:36:38] Using apptainer image /cvmfs/atlas.cern.ch/repo/containers/fs/singularity/x86_64-centos7
[2023-07-21 19:36:38] Checking for apptainer binary...
[2023-07-21 19:36:38] Using apptainer found in PATH at /usr/bin/apptainer
[2023-07-21 19:36:38] Running /usr/bin/apptainer --version
[2023-07-21 19:36:38] apptainer version 1.1.9-1.el9
[2023-07-21 19:36:38] Checking apptainer works with /usr/bin/apptainer exec -B /cvmfs /cvmfs/atlas.cern.ch/repo/containers/fs/singularity/x86_64-centos7 hostname
[2023-07-21 19:36:38] P620-CentOS9
[2023-07-21 19:36:38] apptainer works
[2023-07-21 19:36:38] Starting ATLAS job with PandaID=5910110243
[2023-07-21 19:36:38] Running command: /usr/bin/apptainer exec -B /cvmfs,/var/lib/boinc/slots/2 /cvmfs/atlas.cern.ch/repo/containers/fs/singularity/x86_64-centos7 sh start_atlas.sh
[2023-07-21 19:46:33]  *** The last 200 lines of the pilot log: ***
[2023-07-21 19:46:33] 2023-07-21 17:43:56,073 | INFO     | will abort job monitoring soon since job state=failed (job is still in queue)
[2023-07-21 19:46:33] 2023-07-21 17:43:58,577 | INFO     | monitor loop #20: job 0:5910110243 is in state 'failed'
[2023-07-21 19:46:33] 2023-07-21 17:43:58,577 | INFO     | will abort job monitoring soon since job state=failed (job is still in queue)
[2023-07-21 19:46:33] 2023-07-21 17:43:59,039 | INFO     | copied /var/lib/boinc/slots/2/PanDA_Pilot-5910110243/memory_monitor_summary.json to /var/lib/boinc/slots/2
[2023-07-21 19:46:33] 2023-07-21 17:43:59,039 | INFO     | CPU consumption time: 63.53 s (rounded to 64 s)
[2023-07-21 19:46:33] 2023-07-21 17:43:59,040 | WARNING  | main payload execution returned non-zero exit code: 1
[2023-07-21 19:46:33] 2023-07-21 17:43:59,040 | INFO     | scanning dmesg message for subprocess=2448290 for memory errors
[2023-07-21 19:46:33] 2023-07-21 17:43:59,040 | INFO     | executing command: dmesg|grep 2448290
[2023-07-21 19:46:33] 2023-07-21 17:43:59,158 | CRITICAL | execute payloads caught an exception (cannot recover): [Errno 107] Transport endpoint is not connected: '/bin/bash', Traceback (most recent call last):
[2023-07-21 19:46:33]   File "/var/lib/boinc/slots/2/pilot3/pilot/control/payload.py", line 257, in execute_payloads
[2023-07-21 19:46:33]     perform_initial_payload_error_analysis(job, exit_code)
[2023-07-21 19:46:33]   File "/var/lib/boinc/slots/2/pilot3/pilot/control/payload.py", line 577, in perform_initial_payload_error_analysis
[2023-07-21 19:46:33]     msg = scan_for_memory_errors(job.subprocesses)
[2023-07-21 19:46:33]   File "/var/lib/boinc/slots/2/pilot3/pilot/control/payload.py", line 651, in scan_for_memory_errors
[2023-07-21 19:46:33]     _, out, _ = execute(cmd)
[2023-07-21 19:46:33]   File "/var/lib/boinc/slots/2/pilot3/pilot/util/container.py", line 64, in execute
[2023-07-21 19:46:33]     process = subprocess.Popen(exe,
[2023-07-21 19:46:33]   File "/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/python/3.9.14-x86_64-centos7/lib/python3.9/subprocess.py", line 951, in __init__
[2023-07-21 19:46:33]     self._execute_child(args, executable, preexec_fn, close_fds,
[2023-07-21 19:46:33]   File "/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/python/3.9.14-x86_64-centos7/lib/python3.9/subprocess.py", line 1821, in _execute_child
[2023-07-21 19:46:33]     raise child_exception_type(errno_num, err_msg, err_filename)
[2023-07-21 19:46:33] OSError: [Errno 107] Transport endpoint is not connected: '/bin/bash'
[2023-07-21 19:46:33] 
[2023-07-21 19:46:33] 2023-07-21 17:44:01,081 | INFO     | monitor loop #21: job 0:5910110243 is in state 'failed'
[2023-07-21 19:46:33] 2023-07-21 17:44:01,082 | INFO     | will abort job monitoring soon since job state=failed (job is still in queue)
[2023-07-21 19:46:33] 2023-07-21 17:44:01,719 | WARNING  | job:queue_monitor:received graceful stop - abort after this iteration
[2023-07-21 19:46:33] 2023-07-21 17:44:01,720 | WARNING  | since job:queue_monitor is responsible for sending job updates, we sleep for 20 s
[2023-07-21 19:46:33] 2023-07-21 17:44:01,720 | WARNING  | job:job_monitor:received graceful stop - abort after this iteration
[2023-07-21 19:46:33] 2023-07-21 17:44:01,720 | INFO     | will abort loop
[2023-07-21 19:46:33] 2023-07-21 17:44:02,082 | INFO     | found 1 job(s) in 20 queues
[2023-07-21 19:46:33] 2023-07-21 17:44:02,082 | INFO     | aborting job 5910110243
[2023-07-21 19:46:33] 2023-07-21 17:44:02,122 | WARNING  | pilot monitor received instruction that args.graceful_stop has been set
[2023-07-21 19:46:33] 2023-07-21 17:44:02,123 | WARNING  | will wait for a maximum of 300 s for threads to finish
[2023-07-21 19:46:33] 2023-07-21 17:44:02,724 | INFO     | [job] job monitor thread has finished
[2023-07-21 19:46:33] 2023-07-21 17:44:02,745 | INFO     | [job] create_data_payload thread has finished
[2023-07-21 19:46:33] 2023-07-21 17:44:02,861 | INFO     | [payload] validate_pre thread has finished
[2023-07-21 19:46:33] 2023-07-21 17:44:03,046 | INFO     | [job] retrieve thread has finished
[2023-07-21 19:46:33] 2023-07-21 17:44:03,257 | INFO     | [data] control thread has finished
[2023-07-21 19:46:33] 2023-07-21 17:44:03,349 | INFO     | [payload] validate_post thread has finished
[2023-07-21 19:46:33] 2023-07-21 17:44:03,459 | INFO     | [payload] failed_post thread has finished
[2023-07-21 19:46:33] 2023-07-21 17:44:03,862 | INFO     | [payload] control thread has finished
[2023-07-21 19:46:33] 2023-07-21 17:44:04,034 | INFO     | [job] control thread has finished
[2023-07-21 19:46:33] 2023-07-21 17:44:04,052 | INFO     | [data] copytool_in thread has finished
[2023-07-21 19:46:33] 2023-07-21 17:44:04,191 | INFO     | [job] validate thread has finished
[2023-07-21 19:46:33] 2023-07-21 17:44:05,027 | WARNING  | data:queue_monitoring:received graceful stop - abort after this iteration
[2023-07-21 19:46:33] 2023-07-21 17:44:05,226 | INFO     | [payload] execute_payloads thread has finished
[2023-07-21 19:46:33] 2023-07-21 17:44:09,034 | INFO     | [data] queue_monitor thread has finished
[2023-07-21 19:46:33] 2023-07-21 17:44:12,738 | INFO     | job.realtimelogging is not enabled
[2023-07-21 19:46:33] 2023-07-21 17:44:13,741 | INFO     | [payload] run_realtimelog thread has finished
[2023-07-21 19:46:33] 2023-07-21 17:44:22,816 | INFO     | job 5910110243 has state=failed
[2023-07-21 19:46:33] 2023-07-21 17:44:22,816 | INFO     | preparing for final server update for job 5910110243 in state='failed'
[2023-07-21 19:46:33] 2023-07-21 17:44:23,215 | WARNING  | job_aborted has been set - aborting pilot monitoring
[2023-07-21 19:46:33] 2023-07-21 17:44:23,215 | INFO     | [monitor] control thread has ended
[2023-07-21 19:46:33] 2023-07-21 17:44:23,591 | INFO     | waiting for thread to finish: ['<_MainThread(MainThread, started 140694095030080)>', '<ExcThread(queue_monitor, started 140693548799744)>']
[2023-07-21 19:46:33] 2023-07-21 17:44:25,603 | INFO     | waiting for thread to finish: ['<_MainThread(MainThread, started 140694095030080)>', '<ExcThread(queue_monitor, started 140693548799744)>']
[2023-07-21 19:46:33] 2023-07-21 17:44:27,614 | INFO     | waiting for thread to finish: ['<_MainThread(MainThread, started 140694095030080)>', '<ExcThread(queue_monitor, started 140693548799744)>']
[2023-07-21 19:46:33] 2023-07-21 17:44:29,621 | INFO     | waiting for thread to finish: ['<_MainThread(MainThread, started 140694095030080)>', '<ExcThread(queue_monitor, started 140693548799744)>']
[2023-07-21 19:46:33] 2023-07-21 17:44:31,632 | INFO     | waiting for thread to finish: ['<_MainThread(MainThread, started 140694095030080)>', '<ExcThread(queue_monitor, started 140693548799744)>']
[2023-07-21 19:46:33] 2023-07-21 17:44:33,644 | INFO     | waiting for thread to finish: ['<_MainThread(MainThread, started 140694095030080)>', '<ExcThread(queue_monitor, started 140693548799744)>']
[2023-07-21 19:46:33] 2023-07-21 17:44:35,654 | INFO     | waiting for thread to finish: ['<_MainThread(MainThread, started 140694095030080)>', '<ExcThread(queue_monitor, started 140693548799744)>']
[2023-07-21 19:46:33] 2023-07-21 17:44:37,661 | INFO     | waiting for thread to finish: ['<_MainThread(MainThread, started 140694095030080)>', '<ExcThread(queue_monitor, started 140693548799744)>']
[2023-07-21 19:46:33] 2023-07-21 17:44:39,666 | INFO     | waiting for thread to finish: ['<_MainThread(MainThread, started 140694095030080)>', '<ExcThread(queue_monitor, started 140693548799744)>']
[2023-07-21 19:46:33] 2023-07-21 17:44:41,676 | INFO     | waiting for thread to finish: ['<_MainThread(MainThread, started 140694095030080)>', '<ExcThread(queue_monitor, started 140693548799744)>']
[2023-07-21 19:46:33] 2023-07-21 17:44:43,682 | INFO     | waiting for thread to finish: ['<_MainThread(MainThread, started 140694095030080)>', '<ExcThread(queue_monitor, started 140693548799744)>']
[2023-07-21 19:46:33] 2023-07-21 17:44:45,686 | INFO     | waiting for thread to finish: ['<_MainThread(MainThread, started 140694095030080)>', '<ExcThread(queue_monitor, started 140693548799744)>']
[2023-07-21 19:46:33] 2023-07-21 17:44:47,696 | INFO     | waiting for thread to finish: ['<_MainThread(MainThread, started 140694095030080)>', '<ExcThread(queue_monitor, started 140693548799744)>']
[2023-07-21 19:46:33] 2023-07-21 17:44:49,707 | INFO     | waiting for thread to finish: ['<_MainThread(MainThread, started 140694095030080)>', '<ExcThread(queue_monitor, started 140693548799744)>']
[2023-07-21 19:46:33] 2023-07-21 17:44:51,716 | INFO     | waiting for thread to finish: ['<_MainThread(MainThread, started 140694095030080)>', '<ExcThread(queue_monitor, started 140693548799744)>']
[2023-07-21 19:46:33] 2023-07-21 17:44:53,725 | INFO     | waiting for thread to finish: ['<_MainThread(MainThread, started 140694095030080)>', '<ExcThread(queue_monitor, started 140693548799744)>']
[2023-07-21 19:46:33] 2023-07-21 17:44:55,736 | INFO     | waiting for thread to finish: ['<_MainThread(MainThread, started 140694095030080)>', '<ExcThread(queue_monitor, started 140693548799744)>']
[2023-07-21 19:46:33] 2023-07-21 17:44:57,744 | INFO     | waiting for thread to finish: ['<_MainThread(MainThread, started 140694095030080)>', '<ExcThread(queue_monitor, started 140693548799744)>']
[2023-07-21 19:46:33] 2023-07-21 17:44:59,750 | INFO     | waiting for thread to finish: ['<_MainThread(MainThread, started 140694095030080)>', '<ExcThread(queue_monitor, started 140693548799744)>']
[2023-07-21 19:46:33] 2023-07-21 17:45:01,755 | INFO     | waiting for thread to finish: ['<_MainThread(MainThread, started 140694095030080)>', '<ExcThread(queue_monitor, started 140693548799744)>']
[2023-07-21 19:46:33] 2023-07-21 17:45:03,766 | INFO     | waiting for thread to finish: ['<_MainThread(MainThread, started 140694095030080)>', '<ExcThread(queue_monitor, started 140693548799744)>']
[2023-07-21 19:46:33] 2023-07-21 17:45:05,777 | INFO     | waiting for thread to finish: ['<_MainThread(MainThread, started 140694095030080)>', '<ExcThread(queue_monitor, started 140693548799744)>']
[2023-07-21 19:46:33] 2023-07-21 17:45:07,787 | INFO     | waiting for thread to finish: ['<_MainThread(MainThread, started 140694095030080)>', '<ExcThread(queue_monitor, started 140693548799744)>']
[2023-07-21 19:46:33] 2023-07-21 17:45:09,799 | INFO     | waiting for thread to finish: ['<_MainThread(MainThread, started 140694095030080)>', '<ExcThread(queue_monitor, started 140693548799744)>']
[2023-07-21 19:46:33] 2023-07-21 17:45:11,806 | INFO     | waiting for thread to finish: ['<_MainThread(MainThread, started 140694095030080)>', '<ExcThread(queue_monitor, started 140693548799744)>']
[2023-07-21 19:46:33] 2023-07-21 17:45:13,815 | INFO     | waiting for thread to finish: ['<_MainThread(MainThread, started 140694095030080)>', '<ExcThread(queue_monitor, started 140693548799744)>']
[2023-07-21 19:46:33] 2023-07-21 17:45:15,823 | INFO     | waiting for thread to finish: ['<_MainThread(MainThread, started 140694095030080)>', '<ExcThread(queue_monitor, started 140693548799744)>']
[2023-07-21 19:46:33] 2023-07-21 17:45:17,830 | INFO     | waiting for thread to finish: ['<_MainThread(MainThread, started 140694095030080)>', '<ExcThread(queue_monitor, started 140693548799744)>']
[2023-07-21 19:46:33] 2023-07-21 17:45:19,839 | INFO     | waiting for thread to finish: ['<_MainThread(MainThread, started 140694095030080)>', '<ExcThread(queue_monitor, started 140693548799744)>']
[2023-07-21 19:46:33] 2023-07-21 17:45:21,851 | INFO     | waiting for thread to finish: ['<_MainThread(MainThread, started 140694095030080)>', '<ExcThread(queue_monitor, started 140693548799744)>']
[2023-07-21 19:46:33] 2023-07-21 17:45:23,862 | INFO     | waiting for thread to finish: ['<_MainThread(MainThread, started 140694095030080)>', '<ExcThread(queue_monitor, started 140693548799744)>']
[2023-07-21 19:46:33] 2023-07-21 17:45:25,869 | INFO     | waiting for thread to finish: ['<_MainThread(MainThread, started 140694095030080)>', '<ExcThread(queue_monitor, started 140693548799744)>']
[2023-07-21 19:46:33] 2023-07-21 17:45:27,880 | INFO     | waiting for thread to finish: ['<_MainThread(MainThread, started 140694095030080)>', '<ExcThread(queue_monitor, started 140693548799744)>']
[2023-07-21 19:46:33] 2023-07-21 17:45:29,891 | INFO     | waiting for thread to finish: ['<_MainThread(MainThread, started 140694095030080)>', '<ExcThread(queue_monitor, started 140693548799744)>']
[2023-07-21 19:46:33] 2023-07-21 17:45:31,903 | INFO     | waiting for thread to finish: ['<_MainThread(MainThread, started 140694095030080)>', '<ExcThread(queue_monitor, started 140693548799744)>']
[2023-07-21 19:46:33] 2023-07-21 17:45:33,913 | INFO     | waiting for thread to finish: ['<_MainThread(MainThread, started 140694095030080)>', '<ExcThread(queue_monitor, started 140693548799744)>']
[2023-07-21 19:46:33] 2023-07-21 17:45:35,920 | INFO     | waiting for thread to finish: ['<_MainThread(MainThread, started 140694095030080)>', '<ExcThread(queue_monitor, started 140693548799744)>']
[2023-07-21 19:46:33] 2023-07-21 17:45:37,931 | INFO     | waiting for thread to finish: ['<_MainThread(MainThread, started 140694095030080)>', '<ExcThread(queue_monitor, started 140693548799744)>']
[2023-07-21 19:46:33] 2023-07-21 17:45:39,938 | INFO     | waiting for thread to finish: ['<_MainThread(MainThread, started 140694095030080)>', '<ExcThread(queue_monitor, started 140693548799744)>']
[2023-07-21 19:46:33] 2023-07-21 17:45:41,941 | INFO     | waiting for thread to finish: ['<_MainThread(MainThread, started 140694095030080)>', '<ExcThread(queue_monitor, started 140693548799744)>']
[2023-07-21 19:46:33] 2023-07-21 17:45:43,950 | INFO     | waiting for thread to finish: ['<_MainThread(MainThread, started 140694095030080)>', '<ExcThread(queue_monitor, started 140693548799744)>']
[2023-07-21 19:46:33] 2023-07-21 17:45:45,958 | INFO     | waiting for thread to finish: ['<_MainThread(MainThread, started 140694095030080)>', '<ExcThread(queue_monitor, started 140693548799744)>']
[2023-07-21 19:46:33] 2023-07-21 17:45:47,964 | INFO     | waiting for thread to finish: ['<_MainThread(MainThread, started 140694095030080)>', '<ExcThread(queue_monitor, started 140693548799744)>']
[2023-07-21 19:46:33] 2023-07-21 17:45:49,973 | INFO     | waiting for thread to finish: ['<_MainThread(MainThread, started 140694095030080)>', '<ExcThread(queue_monitor, started 140693548799744)>']
[2023-07-21 19:46:33] 2023-07-21 17:45:51,982 | INFO     | waiting for thread to finish: ['<_MainThread(MainThread, started 140694095030080)>', '<ExcThread(queue_monitor, started 140693548799744)>']
[2023-07-21 19:46:33] 2023-07-21 17:45:53,993 | INFO     | waiting for thread to finish: ['<_MainThread(MainThread, started 140694095030080)>', '<ExcThread(queue_monitor, started 140693548799744)>']
[2023-07-21 19:46:33] 2023-07-21 17:45:56,002 | INFO     | waiting for thread to finish: ['<_MainThread(MainThread, started 140694095030080)>', '<ExcThread(queue_monitor, started 140693548799744)>']
[2023-07-21 19:46:33] 2023-07-21 17:45:58,013 | INFO     | waiting for thread to finish: ['<_MainThread(MainThread, started 140694095030080)>', '<ExcThread(queue_monitor, started 140693548799744)>']
[2023-07-21 19:46:33] 2023-07-21 17:46:00,021 | INFO     | waiting for thread to finish: ['<_MainThread(MainThread, started 140694095030080)>', '<ExcThread(queue_monitor, started 140693548799744)>']
[2023-07-21 19:46:33] 2023-07-21 17:46:02,025 | INFO     | waiting for thread to finish: ['<_MainThread(MainThread, started 140694095030080)>', '<ExcThread(queue_monitor, started 140693548799744)>']
[2023-07-21 19:46:33] 2023-07-21 17:46:04,032 | INFO     | waiting for thread to finish: ['<_MainThread(MainThread, started 140694095030080)>', '<ExcThread(queue_monitor, started 140693548799744)>']
[2023-07-21 19:46:33] 2023-07-21 17:46:06,044 | INFO     | waiting for thread to finish: ['<_MainThread(MainThread, started 140694095030080)>', '<ExcThread(queue_monitor, started 140693548799744)>']
[2023-07-21 19:46:33] 2023-07-21 17:46:08,050 | INFO     | waiting for thread to finish: ['<_MainThread(MainThread, started 140694095030080)>', '<ExcThread(queue_monitor, started 140693548799744)>']
[2023-07-21 19:46:33] 2023-07-21 17:46:10,061 | INFO     | waiting for thread to finish: ['<_MainThread(MainThread, started 140694095030080)>', '<ExcThread(queue_monitor, started 140693548799744)>']
[2023-07-21 19:46:33] 2023-07-21 17:46:12,064 | INFO     | waiting for thread to finish: ['<_MainThread(MainThread, started 140694095030080)>', '<ExcThread(queue_monitor, started 140693548799744)>']
[2023-07-21 19:46:33] 2023-07-21 17:46:14,074 | INFO     | waiting for thread to finish: ['<_MainThread(MainThread, started 140694095030080)>', '<ExcThread(queue_monitor, started 140693548799744)>']
[2023-07-21 19:46:33] 2023-07-21 17:46:16,082 | INFO     | waiting for thread to finish: ['<_MainThread(MainThread, started 140694095030080)>', '<ExcThread(queue_monitor, started 140693548799744)>']
[2023-07-21 19:46:33] 2023-07-21 17:46:17,912 | INFO     | proceeding with final server update
[2023-07-21 19:46:33] 2023-07-21 17:46:17,913 | INFO     | this job has now completed (state=failed)
[2023-07-21 19:46:33] 2023-07-21 17:46:17,913 | INFO     | pilot will not update the server (heartbeat message will be written to file)
[2023-07-21 19:46:33] 2023-07-21 17:46:17,913 | INFO     | job 5910110243 has failed - writing final server update
[2023-07-21 19:46:33] 2023-07-21 17:46:17,913 | WARNING  | making sure that job.state is set to failed since a pilot error code is set
[2023-07-21 19:46:33] 2023-07-21 17:46:17,913 | WARNING  | wrong length of table data, x=[1689961185.0, 1689961246.0, 1689961307.0, 1689961368.0, 1689961431.0], y=[2169.0, 863456.0, 999160.0, 1321736.0, 1342433.0] (must be
[2023-07-21 19:46:33] 2023-07-21 17:46:17,914 | INFO     | payload/TRF did not report the number of read events
[2023-07-21 19:46:33] 2023-07-21 17:46:17,914 | WARNING  | command={cmd} does not exist - cannot check number of available cores
[2023-07-21 19:46:33] 2023-07-21 17:46:17,914 | INFO     | executing command: grep -o 'avx2[^ ]*\|AVX2[^ ]*' /proc/cpuinfo
[2023-07-21 19:46:33]   File "/var/lib/boinc/slots/2/pilot3/pilot/common/exception.py", line 424, in run
[2023-07-21 19:46:33]     self._target(**self._kwargs)
[2023-07-21 19:46:33]   File "/var/lib/boinc/slots/2/pilot3/pilot/control/job.py", line 2432, in queue_monitor
[2023-07-21 19:46:33]     update_server(job, args)
[2023-07-21 19:46:33]   File "/var/lib/boinc/slots/2/pilot3/pilot/control/job.py", line 2483, in update_server
[2023-07-21 19:46:33]     send_state(job, args, job.state, metadata=metadata)
[2023-07-21 19:46:33]   File "/var/lib/boinc/slots/2/pilot3/pilot/control/job.py", line 329, in send_state
[2023-07-21 19:46:33]     data = get_data_structure(job, state, args, xml=xml, metadata=metadata, final=final)
[2023-07-21 19:46:33]   File "/var/lib/boinc/slots/2/pilot3/pilot/control/job.py", line 653, in get_data_structure
[2023-07-21 19:46:33]     instruction_sets = has_instruction_sets(['AVX2'])
[2023-07-21 19:46:33]   File "/var/lib/boinc/slots/2/pilot3/pilot/util/auxiliary.py", line 492, in has_instruction_sets
[2023-07-21 19:46:33]     exit_code, stdout, stderr = execute(cmd)
[2023-07-21 19:46:33]   File "/var/lib/boinc/slots/2/pilot3/pilot/util/container.py", line 64, in execute
[2023-07-21 19:46:33]     process = subprocess.Popen(exe,
[2023-07-21 19:46:33]   File "/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/python/3.9.14-x86_64-centos7/lib/python3.9/subprocess.py", line 951, in __init__
[2023-07-21 19:46:33]     self._execute_child(args, executable, preexec_fn, close_fds,
[2023-07-21 19:46:33]   File "/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/python/3.9.14-x86_64-centos7/lib/python3.9/subprocess.py", line 1821, in _execute_child
[2023-07-21 19:46:33]     raise child_exception_type(errno_num, err_msg, err_filename)
[2023-07-21 19:46:33] exception caught by thread run() function: (<class 'OSError'>, OSError(107, 'Transport endpoint is not connected'), <traceback object at 0x7ff5e294ea40>)
[2023-07-21 19:46:33] Traceback (most recent call last):
[2023-07-21 19:46:33]   File "/var/lib/boinc/slots/2/pilot3/pilot/common/exception.py", line 424, in run
[2023-07-21 19:46:33]     self._target(**self._kwargs)
[2023-07-21 19:46:33]   File "/var/lib/boinc/slots/2/pilot3/pilot/control/job.py", line 2432, in queue_monitor
[2023-07-21 19:46:33]     update_server(job, args)
[2023-07-21 19:46:33]   File "/var/lib/boinc/slots/2/pilot3/pilot/control/job.py", line 2483, in update_server
[2023-07-21 19:46:33]     send_state(job, args, job.state, metadata=metadata)
[2023-07-21 19:46:33]   File "/var/lib/boinc/slots/2/pilot3/pilot/control/job.py", line 329, in send_state
[2023-07-21 19:46:33]     data = get_data_structure(job, state, args, xml=xml, metadata=metadata, final=final)
[2023-07-21 19:46:33]   File "/var/lib/boinc/slots/2/pilot3/pilot/control/job.py", line 653, in get_data_structure
[2023-07-21 19:46:33]     instruction_sets = has_instruction_sets(['AVX2'])
[2023-07-21 19:46:33]   File "/var/lib/boinc/slots/2/pilot3/pilot/util/auxiliary.py", line 492, in has_instruction_sets
[2023-07-21 19:46:33]     exit_code, stdout, stderr = execute(cmd)
[2023-07-21 19:46:33]   File "/var/lib/boinc/slots/2/pilot3/pilot/util/container.py", line 64, in execute
[2023-07-21 19:46:33]     process = subprocess.Popen(exe,
[2023-07-21 19:46:33]   File "/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/python/3.9.14-x86_64-centos7/lib/python3.9/subprocess.py", line 951, in __init__
[2023-07-21 19:46:33]     self._execute_child(args, executable, preexec_fn, close_fds,
[2023-07-21 19:46:33]   File "/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/python/3.9.14-x86_64-centos7/lib/python3.9/subprocess.py", line 1821, in _execute_child
[2023-07-21 19:46:33]     raise child_exception_type(errno_num, err_msg, err_filename)
[2023-07-21 19:46:33] OSError: [Errno 107] Transport endpoint is not connected: '/bin/bash'
[2023-07-21 19:46:33] 
[2023-07-21 19:46:33] None
[2023-07-21 19:46:33] exception has been put in bucket queue belonging to thread 'queue_monitor'
[2023-07-21 19:46:33] setting graceful stop in 10 s since there is no point in continuing
[2023-07-21 19:46:33] 2023-07-21 17:46:18,092 | INFO     | waiting for thread to finish: ['<_MainThread(MainThread, started 140694095030080)>', '<ExcThread(queue_monitor, started 140693548799744)>']
[2023-07-21 19:46:33] 2023-07-21 17:46:20,101 | INFO     | waiting for thread to finish: ['<_MainThread(MainThread, started 140694095030080)>', '<ExcThread(queue_monitor, started 140693548799744)>']
[2023-07-21 19:46:33] 2023-07-21 17:46:22,112 | INFO     | waiting for thread to finish: ['<_MainThread(MainThread, started 140694095030080)>', '<ExcThread(queue_monitor, started 140693548799744)>']
[2023-07-21 19:46:33] 2023-07-21 17:46:24,123 | INFO     | waiting for thread to finish: ['<_MainThread(MainThread, started 140694095030080)>', '<ExcThread(queue_monitor, started 140693548799744)>']
[2023-07-21 19:46:33] 2023-07-21 17:46:26,135 | INFO     | waiting for thread to finish: ['<_MainThread(MainThread, started 140694095030080)>', '<ExcThread(queue_monitor, started 140693548799744)>']
[2023-07-21 19:46:33] 2023-07-21 17:46:28,146 | INFO     | caller=run is remaining thread - safe to abort (names=['<_MainThread(MainThread, started 140694095030080)>'])
[2023-07-21 19:46:33] 2023-07-21 17:46:33,163 | INFO     | end of generic workflow (traces error code: 1354)
[2023-07-21 19:46:33] 2023-07-21 17:46:33,163 | INFO     | traces error code: 1354
[2023-07-21 19:46:33] 2023-07-21 17:46:33,164 | INFO     | an exit code was already set: 1354 (will be converted to a standard shell code)
[2023-07-21 19:46:33] no translation to shell exit code for error code 1354
[2023-07-21 19:46:33] 2023-07-21 17:46:33,164 | INFO     | pilot has finished (exit code=1354, shell exit code=1)
[2023-07-21 19:46:33] ./runpilot2-wrapper.sh: line 15: date: command not found
[2023-07-21 19:46:33]  ==== pilot stdout END ====
[2023-07-21 19:46:33] ./runpilot2-wrapper.sh: line 15: date: command not found
[2023-07-21 19:46:33]  ==== wrapper stdout RESUME ====
[2023-07-21 19:46:33] ./runpilot2-wrapper.sh: line 15: date: command not found
[2023-07-21 19:46:33]  pilotpid: 2434037
[2023-07-21 19:46:33] ./runpilot2-wrapper.sh: line 15: date: command not found
[2023-07-21 19:46:33]  Pilot exit status: 1
[2023-07-21 19:46:33] ./runpilot2-wrapper.sh: line 852: cut: command not found
[2023-07-21 19:46:33] ./runpilot2-wrapper.sh: line 852: xargs: command not found
[2023-07-21 19:46:33] ./runpilot2-wrapper.sh: line 852: /usr/bin/cat: Transport endpoint is not connected
[2023-07-21 19:46:33] ./runpilot2-wrapper.sh: line 15: date: command not found
[2023-07-21 19:46:33]  pandaids: 
[2023-07-21 19:46:33] ./runpilot2-wrapper.sh: line 860: date: command not found
[2023-07-21 19:46:33] ./runpilot2-wrapper.sh: line 15: date: command not found
[2023-07-21 19:46:33]  apfmon messages muted
[2023-07-21 19:46:33] ./runpilot2-wrapper.sh: line 15: date: command not found
[2023-07-21 19:46:33]  Test setup, not cleaning
[2023-07-21 19:46:33] ./runpilot2-wrapper.sh: line 15: date: command not found
[2023-07-21 19:46:33]  ==== wrapper stdout END ====
[2023-07-21 19:46:33] ./runpilot2-wrapper.sh: line 10: date: command not found
[2023-07-21 19:46:33]  ==== wrapper stderr END ====
[2023-07-21 19:46:33] ./runpilot2-wrapper.sh: line 474: date: command not found
[2023-07-21 19:46:33] ./runpilot2-wrapper.sh: line 15: date: command not found
[2023-07-21 19:46:33]  wrapperexiting ec=0, duration=-1689960998
[2023-07-21 19:46:33] ./runpilot2-wrapper.sh: line 15: date: command not found
[2023-07-21 19:46:33]  apfmon messages muted
[2023-07-21 19:46:33]  *** Error codes and diagnostics ***
[2023-07-21 19:46:33]  *** Listing of results directory ***
[2023-07-21 19:46:33] total 40224
[2023-07-21 19:46:33] -rw-r--r--. 1 boinc boinc   418016 21. Jul 18:33 pilot3.tar.gz
[2023-07-21 19:46:33] -rwx------. 1 boinc boinc    27277 21. Jul 19:16 runpilot2-wrapper.sh
[2023-07-21 19:46:33] -rw-r--r--. 1 boinc boinc     4388 21. Jul 19:16 queuedata.json
[2023-07-21 19:46:33] -rw-r--r--. 1 boinc boinc      107 21. Jul 19:36 wrapper_26015_x86_64-pc-linux-gnu
[2023-07-21 19:46:33] -rwxr-xr-x. 1 boinc boinc     7986 21. Jul 19:36 run_atlas
[2023-07-21 19:46:33] -rw-r--r--. 1 boinc boinc      112 21. Jul 19:36 job.xml
[2023-07-21 19:46:33] -rw-r--r--. 2 boinc boinc    17604 21. Jul 19:36 start_atlas.sh
[2023-07-21 19:46:33] drwxrwx--x. 2 boinc boinc       68 21. Jul 19:36 shared
[2023-07-21 19:46:33] -rw-r--r--. 2 boinc boinc   428855 21. Jul 19:36 input.tar.gz
[2023-07-21 19:46:33] -rw-r--r--. 1 boinc boinc     6174 21. Jul 19:36 init_data.xml
[2023-07-21 19:46:33] -rw-r--r--. 2 boinc boinc 37620382 21. Jul 19:36 EVNT.04972714._000038.pool.root.1
[2023-07-21 19:46:33] -rw-r--r--. 1 boinc boinc        0 21. Jul 19:36 boinc_lockfile
[2023-07-21 19:46:33] -rw-r--r--. 1 boinc boinc     2752 21. Jul 19:36 pandaJob.out
[2023-07-21 19:46:33] -rw-------. 1 boinc boinc      424 21. Jul 19:36 setup.sh.local
[2023-07-21 19:46:33] -rw-------. 1 boinc boinc  1370854 21. Jul 19:37 cric_ddmendpoints.json
[2023-07-21 19:46:33] -rw-------. 1 boinc boinc  1014924 21. Jul 19:37 agis_schedconf.cvmfs.json
[2023-07-21 19:46:33] drwx------. 4 boinc boinc     4096 21. Jul 19:38 pilot3
[2023-07-21 19:46:33] -rw-------. 1 boinc boinc      513 21. Jul 19:39 heartbeat.json
[2023-07-21 19:46:33] -rw-r--r--. 1 boinc boinc      531 21. Jul 19:43 boinc_task_state.xml
[2023-07-21 19:46:33] drwxrwx---. 2 boinc boinc     4096 21. Jul 19:43 PanDA_Pilot-5910110243
[2023-07-21 19:46:33] -rw-------. 1 boinc boinc      998 21. Jul 19:43 memory_monitor_summary.json
[2023-07-21 19:46:33] -rw-r--r--. 1 boinc boinc     8192 21. Jul 19:46 boinc_mmap_file
[2023-07-21 19:46:33] -rw-r--r--. 1 boinc boinc       23 21. Jul 19:46 wrapper_checkpoint.txt
[2023-07-21 19:46:33] -rw-------. 1 boinc boinc    66045 21. Jul 19:46 pilotlog.txt
[2023-07-21 19:46:33] -rw-------. 1 boinc boinc      602 21. Jul 19:46 RZOKDm2u8g3n7Olcko1bjSoqABFKDmABFKDm7AsVDmDkNKDmYNdCsm.diag
[2023-07-21 19:46:33] -rw-r--r--. 1 boinc boinc    11134 21. Jul 19:46 runtime_log.err
[2023-07-21 19:46:33] -rw-r--r--. 1 boinc boinc      428 21. Jul 19:46 runtime_log
[2023-07-21 19:46:33] -rw-------. 1 boinc boinc    85496 21. Jul 19:46 3104de1a-cf23-492c-9419-2a0cee7db128_91912.1.job.log
[2023-07-21 19:46:33] -rw-r--r--. 1 boinc boinc    27742 21. Jul 19:46 stderr.txt
[2023-07-21 19:46:33] No HITS result produced
[2023-07-21 19:46:33]  *** Contents of shared directory: ***
[2023-07-21 19:46:33] total 37180
[2023-07-21 19:46:33] -rw-r--r--. 2 boinc boinc    17604 21. Jul 19:36 start_atlas.sh
[2023-07-21 19:46:33] -rw-r--r--. 2 boinc boinc   428855 21. Jul 19:36 input.tar.gz
[2023-07-21 19:46:33] -rw-r--r--. 2 boinc boinc 37620382 21. Jul 19:36 ATLAS.root_0
19:46:35 (2430897): run_atlas exited; CPU time 62.875695
19:46:35 (2430897): called boinc_finish(0)

</stderr_txt>
<message>
upload failure: <file_xfer_error>
  <file_name>RZOKDm2u8g3n7Olcko1bjSoqABFKDmABFKDm7AsVDmDkNKDmYNdCsm_0_r1671298027_ATLAS_result</file_name>
  <error_code>-161 (not found)</error_code>
</file_xfer_error>
</message>
]]>


©2024 CERN