Name | D3NKDmeHnh3n7Olcko1bjSoqABFKDmABFKDm7AsVDmdrNKDmyTpxjn_0 |
Workunit | 2320288 |
Created | 23 Jul 2023, 12:22:30 UTC |
Sent | 23 Jul 2023, 12:25:07 UTC |
Report deadline | 30 Jul 2023, 12:25:07 UTC |
Received | 23 Jul 2023, 12:42:54 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 0 (0x00000000) |
Computer ID | 4721 |
Run time | 6 min 53 sec |
CPU time | 29 sec |
Validate state | Invalid |
Credit | 0.00 |
Device peak FLOPS | 5.90 GFLOPS |
Application version | ATLAS Simulation v3.01 (native_mt) x86_64-pc-linux-gnu |
Peak working set size | 630.28 MB |
Peak swap size | 1.12 GB |
Peak disk usage | 79.21 MB |
<core_client_version>7.20.2</core_client_version> <![CDATA[ <stderr_txt> 14:28:08 (3684140): wrapper (7.7.26015): starting 14:28:08 (3684140): wrapper: running run_atlas (--nthreads 1) [2023-07-23 14:28:08] Arguments: --nthreads 1 [2023-07-23 14:28:08] Threads: 1 [2023-07-23 14:28:08] Checking for CVMFS [2023-07-23 14:28:10] Probing /cvmfs/atlas.cern.ch... OK [2023-07-23 14:28:10] Probing /cvmfs/atlas-condb.cern.ch... OK [2023-07-23 14:28:10] Running cvmfs_config stat atlas.cern.ch [2023-07-23 14:28:10] VERSION PID UPTIME(M) MEM(K) REVISION EXPIRES(M) NOCATALOGS CACHEUSE(K) CACHEMAX(K) NOFDUSE NOFDMAX NOIOERR NOOPEN HITRATE(%) RX(K) SPEED(K/S) HOST PROXY ONLINE [2023-07-23 14:28:10] 2.10.1.0 3599635 70 39576 121620 1 84 3318546 4096000 0 130560 0 328790 99.994 1707 315 http://s1cern-cvmfs.openhtc.io/cvmfs/atlas.cern.ch http://10.116.178.201:3128 1 [2023-07-23 14:28:10] CVMFS is ok [2023-07-23 14:28:10] Using apptainer image /cvmfs/atlas.cern.ch/repo/containers/fs/singularity/x86_64-centos7 [2023-07-23 14:28:10] Checking for apptainer binary... 
[2023-07-23 14:28:10] Using apptainer found in PATH at /usr/bin/apptainer [2023-07-23 14:28:10] Running /usr/bin/apptainer --version [2023-07-23 14:28:10] apptainer version 1.1.9-1.el9 [2023-07-23 14:28:10] Checking apptainer works with /usr/bin/apptainer exec -B /cvmfs /cvmfs/atlas.cern.ch/repo/containers/fs/singularity/x86_64-centos7 hostname [2023-07-23 14:28:10] P620-CentOS9 [2023-07-23 14:28:10] apptainer works [2023-07-23 14:28:10] Starting ATLAS job with PandaID=5911727891 [2023-07-23 14:28:10] Running command: /usr/bin/apptainer exec -B /cvmfs,/var/lib/boinc/slots/5 /cvmfs/atlas.cern.ch/repo/containers/fs/singularity/x86_64-centos7 sh start_atlas.sh [2023-07-23 14:34:57] *** The last 200 lines of the pilot log: *** [2023-07-23 14:34:57] [2023-07-23 14:34:57] finished pid=3694443 exit_code=1 state=failed [2023-07-23 14:34:57] [2023-07-23 14:34:57] 2023-07-23 12:32:37,216 | INFO | stopping process 'MemoryMonitor' with signal 10 [2023-07-23 14:34:57] 2023-07-23 12:32:37,216 | INFO | taking a short nap (3 s) to allow the memory monitor to finish writing to the summary file (#0/#20) [2023-07-23 14:34:57] 2023-07-23 12:32:38,497 | INFO | monitor loop #11: job 0:5911727891 is in state 'failed' [2023-07-23 14:34:57] 2023-07-23 12:32:38,497 | INFO | will abort job monitoring soon since job state=failed (job is still in queue) [2023-07-23 14:34:57] 2023-07-23 12:32:40,230 | INFO | copied /var/lib/boinc/slots/5/PanDA_Pilot-5911727891/memory_monitor_summary.json to /var/lib/boinc/slots/5 [2023-07-23 14:34:57] 2023-07-23 12:32:40,230 | INFO | CPU consumption time: 28.64 s (rounded to 29 s) [2023-07-23 14:34:57] 2023-07-23 12:32:40,230 | WARNING | main payload execution returned non-zero exit code: 1 [2023-07-23 14:34:57] 2023-07-23 12:32:40,230 | INFO | scanning dmesg message for subprocess=3698140 for memory errors [2023-07-23 14:34:57] 2023-07-23 12:32:40,230 | INFO | executing command: dmesg|grep 3698140 [2023-07-23 14:34:57] 2023-07-23 12:32:41,015 | INFO | monitor 
loop #12: job 0:5911727891 is in state 'failed' [2023-07-23 14:34:57] 2023-07-23 12:32:41,015 | INFO | will abort job monitoring soon since job state=failed (job is still in queue) [2023-07-23 14:34:57] 2023-07-23 12:32:43,520 | INFO | monitor loop #13: job 0:5911727891 is in state 'failed' [2023-07-23 14:34:57] 2023-07-23 12:32:43,520 | INFO | will abort job monitoring soon since job state=failed (job is still in queue) [2023-07-23 14:34:57] 2023-07-23 12:32:44,117 | CRITICAL | execute payloads caught an exception (cannot recover): [Errno 107] Transport endpoint is not connected: '/bin/bash', Traceback (most recent call last): [2023-07-23 14:34:57] File "/var/lib/boinc/slots/5/pilot3/pilot/control/payload.py", line 257, in execute_payloads [2023-07-23 14:34:57] perform_initial_payload_error_analysis(job, exit_code) [2023-07-23 14:34:57] File "/var/lib/boinc/slots/5/pilot3/pilot/control/payload.py", line 577, in perform_initial_payload_error_analysis [2023-07-23 14:34:57] msg = scan_for_memory_errors(job.subprocesses) [2023-07-23 14:34:57] File "/var/lib/boinc/slots/5/pilot3/pilot/control/payload.py", line 651, in scan_for_memory_errors [2023-07-23 14:34:57] _, out, _ = execute(cmd) [2023-07-23 14:34:57] File "/var/lib/boinc/slots/5/pilot3/pilot/util/container.py", line 64, in execute [2023-07-23 14:34:57] process = subprocess.Popen(exe, [2023-07-23 14:34:57] File "/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/python/3.9.14-x86_64-centos7/lib/python3.9/subprocess.py", line 951, in __init__ [2023-07-23 14:34:57] self._execute_child(args, executable, preexec_fn, close_fds, [2023-07-23 14:34:57] File "/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/python/3.9.14-x86_64-centos7/lib/python3.9/subprocess.py", line 1821, in _execute_child [2023-07-23 14:34:57] raise child_exception_type(errno_num, err_msg, err_filename) [2023-07-23 14:34:57] OSError: [Errno 107] Transport endpoint is not connected: '/bin/bash' [2023-07-23 14:34:57] [2023-07-23 14:34:57] 
2023-07-23 12:32:46,025 | INFO | monitor loop #14: job 0:5911727891 is in state 'failed' [2023-07-23 14:34:57] 2023-07-23 12:32:46,025 | INFO | will abort job monitoring soon since job state=failed (job is still in queue) [2023-07-23 14:34:57] 2023-07-23 12:32:46,553 | WARNING | job:job_monitor:received graceful stop - abort after this iteration [2023-07-23 14:34:57] 2023-07-23 12:32:46,553 | INFO | will abort loop [2023-07-23 14:34:57] 2023-07-23 12:32:46,610 | INFO | found 1 job(s) in 20 queues [2023-07-23 14:34:57] 2023-07-23 12:32:46,610 | INFO | aborting job 5911727891 [2023-07-23 14:34:57] 2023-07-23 12:32:46,653 | WARNING | pilot monitor received instruction that args.graceful_stop has been set [2023-07-23 14:34:57] 2023-07-23 12:32:46,653 | WARNING | will wait for a maximum of 300 s for threads to finish [2023-07-23 14:34:57] 2023-07-23 12:32:47,383 | INFO | job 5911727891 has state=failed [2023-07-23 14:34:57] 2023-07-23 12:32:47,383 | INFO | preparing for final server update for job 5911727891 in state='failed' [2023-07-23 14:34:57] 2023-07-23 12:32:47,559 | INFO | [job] job monitor thread has finished [2023-07-23 14:34:57] 2023-07-23 12:32:47,654 | WARNING | job_aborted has been set - aborting pilot monitoring [2023-07-23 14:34:57] 2023-07-23 12:32:47,654 | INFO | [monitor] control thread has ended [2023-07-23 14:34:57] 2023-07-23 12:32:47,714 | INFO | [job] create_data_payload thread has finished [2023-07-23 14:34:57] 2023-07-23 12:32:47,829 | INFO | [job] control thread has finished [2023-07-23 14:34:57] 2023-07-23 12:32:47,877 | WARNING | data:queue_monitoring:received graceful stop - abort after this iteration [2023-07-23 14:34:57] 2023-07-23 12:32:47,881 | INFO | [payload] validate_post thread has finished [2023-07-23 14:34:57] 2023-07-23 12:32:47,886 | INFO | [payload] validate_pre thread has finished [2023-07-23 14:34:57] 2023-07-23 12:32:48,374 | INFO | [data] control thread has finished [2023-07-23 14:34:57] 2023-07-23 12:32:48,558 | INFO | 
[payload] control thread has finished [2023-07-23 14:34:57] 2023-07-23 12:32:48,592 | INFO | [data] copytool_in thread has finished [2023-07-23 14:34:57] 2023-07-23 12:32:48,684 | INFO | [job] validate thread has finished [2023-07-23 14:34:57] 2023-07-23 12:32:48,735 | INFO | [job] retrieve thread has finished [2023-07-23 14:34:57] 2023-07-23 12:32:48,935 | INFO | [payload] failed_post thread has finished [2023-07-23 14:34:57] 2023-07-23 12:32:50,173 | INFO | [payload] execute_payloads thread has finished [2023-07-23 14:34:57] 2023-07-23 12:32:51,883 | INFO | [data] queue_monitor thread has finished [2023-07-23 14:34:57] 2023-07-23 12:32:57,439 | INFO | job.realtimelogging is not enabled [2023-07-23 14:34:57] 2023-07-23 12:32:58,442 | INFO | [payload] run_realtimelog thread has finished [2023-07-23 14:34:57] 2023-07-23 12:33:00,429 | INFO | waiting for thread to finish: ['<_MainThread(MainThread, started 139845306713920)>', '<ExcThread(queue_monitor, started 139844755887872)>'] [2023-07-23 14:34:57] 2023-07-23 12:33:02,438 | INFO | waiting for thread to finish: ['<_MainThread(MainThread, started 139845306713920)>', '<ExcThread(queue_monitor, started 139844755887872)>'] [2023-07-23 14:34:57] 2023-07-23 12:33:04,443 | INFO | waiting for thread to finish: ['<_MainThread(MainThread, started 139845306713920)>', '<ExcThread(queue_monitor, started 139844755887872)>'] [2023-07-23 14:34:57] 2023-07-23 12:33:06,446 | INFO | waiting for thread to finish: ['<_MainThread(MainThread, started 139845306713920)>', '<ExcThread(queue_monitor, started 139844755887872)>'] [2023-07-23 14:34:57] 2023-07-23 12:33:08,454 | INFO | waiting for thread to finish: ['<_MainThread(MainThread, started 139845306713920)>', '<ExcThread(queue_monitor, started 139844755887872)>'] [2023-07-23 14:34:57] 2023-07-23 12:33:10,462 | INFO | waiting for thread to finish: ['<_MainThread(MainThread, started 139845306713920)>', '<ExcThread(queue_monitor, started 139844755887872)>'] [2023-07-23 14:34:57] 
2023-07-23 12:33:12,469 | INFO | waiting for thread to finish: ['<_MainThread(MainThread, started 139845306713920)>', '<ExcThread(queue_monitor, started 139844755887872)>'] [2023-07-23 14:34:57] 2023-07-23 12:33:14,478 | INFO | waiting for thread to finish: ['<_MainThread(MainThread, started 139845306713920)>', '<ExcThread(queue_monitor, started 139844755887872)>'] [2023-07-23 14:34:57] 2023-07-23 12:33:16,486 | INFO | waiting for thread to finish: ['<_MainThread(MainThread, started 139845306713920)>', '<ExcThread(queue_monitor, started 139844755887872)>'] [2023-07-23 14:34:57] 2023-07-23 12:33:18,497 | INFO | waiting for thread to finish: ['<_MainThread(MainThread, started 139845306713920)>', '<ExcThread(queue_monitor, started 139844755887872)>'] [2023-07-23 14:34:57] 2023-07-23 12:33:20,502 | INFO | waiting for thread to finish: ['<_MainThread(MainThread, started 139845306713920)>', '<ExcThread(queue_monitor, started 139844755887872)>'] [2023-07-23 14:34:57] 2023-07-23 12:33:22,512 | INFO | waiting for thread to finish: ['<_MainThread(MainThread, started 139845306713920)>', '<ExcThread(queue_monitor, started 139844755887872)>'] [2023-07-23 14:34:57] 2023-07-23 12:33:24,519 | INFO | waiting for thread to finish: ['<_MainThread(MainThread, started 139845306713920)>', '<ExcThread(queue_monitor, started 139844755887872)>'] [2023-07-23 14:34:57] 2023-07-23 12:33:26,525 | INFO | waiting for thread to finish: ['<_MainThread(MainThread, started 139845306713920)>', '<ExcThread(queue_monitor, started 139844755887872)>'] [2023-07-23 14:34:57] 2023-07-23 12:33:28,535 | INFO | waiting for thread to finish: ['<_MainThread(MainThread, started 139845306713920)>', '<ExcThread(queue_monitor, started 139844755887872)>'] [2023-07-23 14:34:57] 2023-07-23 12:33:30,543 | INFO | waiting for thread to finish: ['<_MainThread(MainThread, started 139845306713920)>', '<ExcThread(queue_monitor, started 139844755887872)>'] [2023-07-23 14:34:57] 2023-07-23 12:33:32,546 | INFO | waiting for 
thread to finish: ['<_MainThread(MainThread, started 139845306713920)>', '<ExcThread(queue_monitor, started 139844755887872)>'] [2023-07-23 14:34:57] 2023-07-23 12:33:34,558 | INFO | waiting for thread to finish: ['<_MainThread(MainThread, started 139845306713920)>', '<ExcThread(queue_monitor, started 139844755887872)>'] [2023-07-23 14:34:57] 2023-07-23 12:33:36,565 | INFO | waiting for thread to finish: ['<_MainThread(MainThread, started 139845306713920)>', '<ExcThread(queue_monitor, started 139844755887872)>'] [2023-07-23 14:34:57] 2023-07-23 12:33:38,568 | INFO | waiting for thread to finish: ['<_MainThread(MainThread, started 139845306713920)>', '<ExcThread(queue_monitor, started 139844755887872)>'] [2023-07-23 14:34:57] 2023-07-23 12:33:40,579 | INFO | waiting for thread to finish: ['<_MainThread(MainThread, started 139845306713920)>', '<ExcThread(queue_monitor, started 139844755887872)>'] [2023-07-23 14:34:57] 2023-07-23 12:33:42,590 | INFO | waiting for thread to finish: ['<_MainThread(MainThread, started 139845306713920)>', '<ExcThread(queue_monitor, started 139844755887872)>'] [2023-07-23 14:34:57] 2023-07-23 12:33:44,595 | INFO | waiting for thread to finish: ['<_MainThread(MainThread, started 139845306713920)>', '<ExcThread(queue_monitor, started 139844755887872)>'] [2023-07-23 14:34:57] 2023-07-23 12:33:46,605 | INFO | waiting for thread to finish: ['<_MainThread(MainThread, started 139845306713920)>', '<ExcThread(queue_monitor, started 139844755887872)>'] [2023-07-23 14:34:57] 2023-07-23 12:33:48,616 | INFO | waiting for thread to finish: ['<_MainThread(MainThread, started 139845306713920)>', '<ExcThread(queue_monitor, started 139844755887872)>'] [2023-07-23 14:34:57] 2023-07-23 12:33:50,627 | INFO | waiting for thread to finish: ['<_MainThread(MainThread, started 139845306713920)>', '<ExcThread(queue_monitor, started 139844755887872)>'] [2023-07-23 14:34:57] 2023-07-23 12:33:52,631 | INFO | waiting for thread to finish: ['<_MainThread(MainThread, 
started 139845306713920)>', '<ExcThread(queue_monitor, started 139844755887872)>'] [2023-07-23 14:34:57] 2023-07-23 12:33:54,640 | INFO | waiting for thread to finish: ['<_MainThread(MainThread, started 139845306713920)>', '<ExcThread(queue_monitor, started 139844755887872)>'] [2023-07-23 14:34:57] 2023-07-23 12:33:56,647 | INFO | waiting for thread to finish: ['<_MainThread(MainThread, started 139845306713920)>', '<ExcThread(queue_monitor, started 139844755887872)>'] [2023-07-23 14:34:57] 2023-07-23 12:33:58,654 | INFO | waiting for thread to finish: ['<_MainThread(MainThread, started 139845306713920)>', '<ExcThread(queue_monitor, started 139844755887872)>'] [2023-07-23 14:34:57] 2023-07-23 12:34:00,662 | INFO | waiting for thread to finish: ['<_MainThread(MainThread, started 139845306713920)>', '<ExcThread(queue_monitor, started 139844755887872)>'] [2023-07-23 14:34:57] 2023-07-23 12:34:02,671 | INFO | waiting for thread to finish: ['<_MainThread(MainThread, started 139845306713920)>', '<ExcThread(queue_monitor, started 139844755887872)>'] [2023-07-23 14:34:57] 2023-07-23 12:34:04,678 | INFO | waiting for thread to finish: ['<_MainThread(MainThread, started 139845306713920)>', '<ExcThread(queue_monitor, started 139844755887872)>'] [2023-07-23 14:34:57] 2023-07-23 12:34:06,685 | INFO | waiting for thread to finish: ['<_MainThread(MainThread, started 139845306713920)>', '<ExcThread(queue_monitor, started 139844755887872)>'] [2023-07-23 14:34:57] 2023-07-23 12:34:08,693 | INFO | waiting for thread to finish: ['<_MainThread(MainThread, started 139845306713920)>', '<ExcThread(queue_monitor, started 139844755887872)>'] [2023-07-23 14:34:57] 2023-07-23 12:34:10,703 | INFO | waiting for thread to finish: ['<_MainThread(MainThread, started 139845306713920)>', '<ExcThread(queue_monitor, started 139844755887872)>'] [2023-07-23 14:34:57] 2023-07-23 12:34:12,715 | INFO | waiting for thread to finish: ['<_MainThread(MainThread, started 139845306713920)>', 
'<ExcThread(queue_monitor, started 139844755887872)>'] [2023-07-23 14:34:57] 2023-07-23 12:34:14,725 | INFO | waiting for thread to finish: ['<_MainThread(MainThread, started 139845306713920)>', '<ExcThread(queue_monitor, started 139844755887872)>'] [2023-07-23 14:34:57] 2023-07-23 12:34:16,736 | INFO | waiting for thread to finish: ['<_MainThread(MainThread, started 139845306713920)>', '<ExcThread(queue_monitor, started 139844755887872)>'] [2023-07-23 14:34:57] 2023-07-23 12:34:18,747 | INFO | waiting for thread to finish: ['<_MainThread(MainThread, started 139845306713920)>', '<ExcThread(queue_monitor, started 139844755887872)>'] [2023-07-23 14:34:57] 2023-07-23 12:34:20,758 | INFO | waiting for thread to finish: ['<_MainThread(MainThread, started 139845306713920)>', '<ExcThread(queue_monitor, started 139844755887872)>'] [2023-07-23 14:34:57] 2023-07-23 12:34:22,768 | INFO | waiting for thread to finish: ['<_MainThread(MainThread, started 139845306713920)>', '<ExcThread(queue_monitor, started 139844755887872)>'] [2023-07-23 14:34:57] 2023-07-23 12:34:24,774 | INFO | waiting for thread to finish: ['<_MainThread(MainThread, started 139845306713920)>', '<ExcThread(queue_monitor, started 139844755887872)>'] [2023-07-23 14:34:57] 2023-07-23 12:34:26,780 | INFO | waiting for thread to finish: ['<_MainThread(MainThread, started 139845306713920)>', '<ExcThread(queue_monitor, started 139844755887872)>'] [2023-07-23 14:34:57] 2023-07-23 12:34:28,788 | INFO | waiting for thread to finish: ['<_MainThread(MainThread, started 139845306713920)>', '<ExcThread(queue_monitor, started 139844755887872)>'] [2023-07-23 14:34:57] 2023-07-23 12:34:30,797 | INFO | waiting for thread to finish: ['<_MainThread(MainThread, started 139845306713920)>', '<ExcThread(queue_monitor, started 139844755887872)>'] [2023-07-23 14:34:57] 2023-07-23 12:34:32,810 | INFO | waiting for thread to finish: ['<_MainThread(MainThread, started 139845306713920)>', '<ExcThread(queue_monitor, started 
139844755887872)>'] [2023-07-23 14:34:57] 2023-07-23 12:34:34,812 | INFO | waiting for thread to finish: ['<_MainThread(MainThread, started 139845306713920)>', '<ExcThread(queue_monitor, started 139844755887872)>'] [2023-07-23 14:34:57] 2023-07-23 12:34:36,815 | INFO | waiting for thread to finish: ['<_MainThread(MainThread, started 139845306713920)>', '<ExcThread(queue_monitor, started 139844755887872)>'] [2023-07-23 14:34:57] 2023-07-23 12:34:38,816 | INFO | waiting for thread to finish: ['<_MainThread(MainThread, started 139845306713920)>', '<ExcThread(queue_monitor, started 139844755887872)>'] [2023-07-23 14:34:57] 2023-07-23 12:34:40,822 | INFO | waiting for thread to finish: ['<_MainThread(MainThread, started 139845306713920)>', '<ExcThread(queue_monitor, started 139844755887872)>'] [2023-07-23 14:34:57] 2023-07-23 12:34:42,506 | INFO | proceeding with final server update [2023-07-23 14:34:57] 2023-07-23 12:34:42,506 | INFO | this job has now completed (state=failed) [2023-07-23 14:34:57] 2023-07-23 12:34:42,506 | INFO | pilot will not update the server (heartbeat message will be written to file) [2023-07-23 14:34:57] 2023-07-23 12:34:42,506 | INFO | job 5911727891 has failed - writing final server update [2023-07-23 14:34:57] 2023-07-23 12:34:42,506 | WARNING | making sure that job.state is set to failed since a pilot error code is set [2023-07-23 14:34:57] 2023-07-23 12:34:42,506 | WARNING | wrong length of table data, x=[1690115481.0, 1690115542.0], y=[2446.0, 745768.0] (must be same and length>=4) [2023-07-23 14:34:57] 2023-07-23 12:34:42,507 | INFO | payload/TRF did not report the number of read events [2023-07-23 14:34:57] 2023-07-23 12:34:42,508 | WARNING | command={cmd} does not exist - cannot check number of available cores [2023-07-23 14:34:57] 2023-07-23 12:34:42,508 | INFO | executing command: grep -o 'avx2[^ ]*\|AVX2[^ ]*' /proc/cpuinfo [2023-07-23 14:34:57] File "/var/lib/boinc/slots/5/pilot3/pilot/common/exception.py", line 424, in run 
[2023-07-23 14:34:57] self._target(**self._kwargs) [2023-07-23 14:34:57] File "/var/lib/boinc/slots/5/pilot3/pilot/control/job.py", line 2432, in queue_monitor [2023-07-23 14:34:57] update_server(job, args) [2023-07-23 14:34:57] File "/var/lib/boinc/slots/5/pilot3/pilot/control/job.py", line 2483, in update_server [2023-07-23 14:34:57] send_state(job, args, job.state, metadata=metadata) [2023-07-23 14:34:57] File "/var/lib/boinc/slots/5/pilot3/pilot/control/job.py", line 329, in send_state [2023-07-23 14:34:57] data = get_data_structure(job, state, args, xml=xml, metadata=metadata, final=final) [2023-07-23 14:34:57] File "/var/lib/boinc/slots/5/pilot3/pilot/control/job.py", line 653, in get_data_structure [2023-07-23 14:34:57] instruction_sets = has_instruction_sets(['AVX2']) [2023-07-23 14:34:57] File "/var/lib/boinc/slots/5/pilot3/pilot/util/auxiliary.py", line 492, in has_instruction_sets [2023-07-23 14:34:57] exit_code, stdout, stderr = execute(cmd) [2023-07-23 14:34:57] File "/var/lib/boinc/slots/5/pilot3/pilot/util/container.py", line 64, in execute [2023-07-23 14:34:57] process = subprocess.Popen(exe, [2023-07-23 14:34:57] File "/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/python/3.9.14-x86_64-centos7/lib/python3.9/subprocess.py", line 951, in __init__ [2023-07-23 14:34:57] self._execute_child(args, executable, preexec_fn, close_fds, [2023-07-23 14:34:57] File "/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/python/3.9.14-x86_64-centos7/lib/python3.9/subprocess.py", line 1821, in _execute_child [2023-07-23 14:34:57] raise child_exception_type(errno_num, err_msg, err_filename) [2023-07-23 14:34:57] exception caught by thread run() function: (<class 'OSError'>, OSError(107, 'Transport endpoint is not connected'), <traceback object at 0x7f30430d1bc0>) [2023-07-23 14:34:57] Traceback (most recent call last): [2023-07-23 14:34:57] File "/var/lib/boinc/slots/5/pilot3/pilot/common/exception.py", line 424, in run [2023-07-23 14:34:57] 
self._target(**self._kwargs) [2023-07-23 14:34:57] File "/var/lib/boinc/slots/5/pilot3/pilot/control/job.py", line 2432, in queue_monitor [2023-07-23 14:34:57] update_server(job, args) [2023-07-23 14:34:57] File "/var/lib/boinc/slots/5/pilot3/pilot/control/job.py", line 2483, in update_server [2023-07-23 14:34:57] send_state(job, args, job.state, metadata=metadata) [2023-07-23 14:34:57] File "/var/lib/boinc/slots/5/pilot3/pilot/control/job.py", line 329, in send_state [2023-07-23 14:34:57] data = get_data_structure(job, state, args, xml=xml, metadata=metadata, final=final) [2023-07-23 14:34:57] File "/var/lib/boinc/slots/5/pilot3/pilot/control/job.py", line 653, in get_data_structure [2023-07-23 14:34:57] instruction_sets = has_instruction_sets(['AVX2']) [2023-07-23 14:34:57] File "/var/lib/boinc/slots/5/pilot3/pilot/util/auxiliary.py", line 492, in has_instruction_sets [2023-07-23 14:34:57] exit_code, stdout, stderr = execute(cmd) [2023-07-23 14:34:57] File "/var/lib/boinc/slots/5/pilot3/pilot/util/container.py", line 64, in execute [2023-07-23 14:34:57] process = subprocess.Popen(exe, [2023-07-23 14:34:57] File "/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/python/3.9.14-x86_64-centos7/lib/python3.9/subprocess.py", line 951, in __init__ [2023-07-23 14:34:57] self._execute_child(args, executable, preexec_fn, close_fds, [2023-07-23 14:34:57] File "/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/python/3.9.14-x86_64-centos7/lib/python3.9/subprocess.py", line 1821, in _execute_child [2023-07-23 14:34:57] raise child_exception_type(errno_num, err_msg, err_filename) [2023-07-23 14:34:57] OSError: [Errno 107] Transport endpoint is not connected: '/bin/bash' [2023-07-23 14:34:57] [2023-07-23 14:34:57] None [2023-07-23 14:34:57] exception has been put in bucket queue belonging to thread 'queue_monitor' [2023-07-23 14:34:57] setting graceful stop in 10 s since there is no point in continuing [2023-07-23 14:34:57] 2023-07-23 12:34:42,834 | INFO | waiting for 
thread to finish: ['<_MainThread(MainThread, started 139845306713920)>', '<ExcThread(queue_monitor, started 139844755887872)>'] [2023-07-23 14:34:57] 2023-07-23 12:34:44,841 | INFO | waiting for thread to finish: ['<_MainThread(MainThread, started 139845306713920)>', '<ExcThread(queue_monitor, started 139844755887872)>'] [2023-07-23 14:34:57] 2023-07-23 12:34:46,852 | INFO | waiting for thread to finish: ['<_MainThread(MainThread, started 139845306713920)>', '<ExcThread(queue_monitor, started 139844755887872)>'] [2023-07-23 14:34:57] 2023-07-23 12:34:48,863 | INFO | waiting for thread to finish: ['<_MainThread(MainThread, started 139845306713920)>', '<ExcThread(queue_monitor, started 139844755887872)>'] [2023-07-23 14:34:57] 2023-07-23 12:34:50,868 | INFO | waiting for thread to finish: ['<_MainThread(MainThread, started 139845306713920)>', '<ExcThread(queue_monitor, started 139844755887872)>'] [2023-07-23 14:34:57] 2023-07-23 12:34:52,879 | INFO | caller=run is remaining thread - safe to abort (names=['<_MainThread(MainThread, started 139845306713920)>']) [2023-07-23 14:34:57] 2023-07-23 12:34:57,892 | INFO | end of generic workflow (traces error code: 1354) [2023-07-23 14:34:57] 2023-07-23 12:34:57,893 | INFO | traces error code: 1354 [2023-07-23 14:34:57] 2023-07-23 12:34:57,893 | INFO | an exit code was already set: 1354 (will be converted to a standard shell code) [2023-07-23 14:34:57] no translation to shell exit code for error code 1354 [2023-07-23 14:34:57] 2023-07-23 12:34:57,893 | INFO | pilot has finished (exit code=1354, shell exit code=1) [2023-07-23 14:34:57] ./runpilot2-wrapper.sh: line 15: date: command not found [2023-07-23 14:34:57] ==== pilot stdout END ==== [2023-07-23 14:34:57] ./runpilot2-wrapper.sh: line 15: date: command not found [2023-07-23 14:34:57] ==== wrapper stdout RESUME ==== [2023-07-23 14:34:57] ./runpilot2-wrapper.sh: line 15: date: command not found [2023-07-23 14:34:57] pilotpid: 3687291 [2023-07-23 14:34:57] 
./runpilot2-wrapper.sh: line 15: date: command not found [2023-07-23 14:34:57] Pilot exit status: 1 [2023-07-23 14:34:57] ./runpilot2-wrapper.sh: line 852: cut: command not found [2023-07-23 14:34:57] ./runpilot2-wrapper.sh: line 852: xargs: command not found [2023-07-23 14:34:57] ./runpilot2-wrapper.sh: line 852: /usr/bin/cat: Transport endpoint is not connected [2023-07-23 14:34:57] ./runpilot2-wrapper.sh: line 15: date: command not found [2023-07-23 14:34:57] pandaids: [2023-07-23 14:34:57] ./runpilot2-wrapper.sh: line 860: date: command not found [2023-07-23 14:34:57] ./runpilot2-wrapper.sh: line 15: date: command not found [2023-07-23 14:34:57] apfmon messages muted [2023-07-23 14:34:57] ./runpilot2-wrapper.sh: line 15: date: command not found [2023-07-23 14:34:57] Test setup, not cleaning [2023-07-23 14:34:57] ./runpilot2-wrapper.sh: line 15: date: command not found [2023-07-23 14:34:57] ==== wrapper stdout END ==== [2023-07-23 14:34:57] ./runpilot2-wrapper.sh: line 10: date: command not found [2023-07-23 14:34:57] ==== wrapper stderr END ==== [2023-07-23 14:34:57] ./runpilot2-wrapper.sh: line 474: date: command not found [2023-07-23 14:34:57] ./runpilot2-wrapper.sh: line 15: date: command not found [2023-07-23 14:34:57] wrapperexiting ec=0, duration=-1690115290 [2023-07-23 14:34:57] ./runpilot2-wrapper.sh: line 15: date: command not found [2023-07-23 14:34:57] apfmon messages muted [2023-07-23 14:34:57] *** Error codes and diagnostics *** [2023-07-23 14:34:57] *** Listing of results directory *** [2023-07-23 14:34:57] insgesamt 40856 [2023-07-23 14:34:57] -rw-r--r--. 1 boinc boinc 418016 23. Jul 13:38 pilot3.tar.gz [2023-07-23 14:34:57] -rwx------. 1 boinc boinc 27277 23. Jul 14:22 runpilot2-wrapper.sh [2023-07-23 14:34:57] -rw-r--r--. 1 boinc boinc 4388 23. Jul 14:22 queuedata.json [2023-07-23 14:34:57] -rw-r--r--. 1 boinc boinc 107 23. Jul 14:28 wrapper_26015_x86_64-pc-linux-gnu [2023-07-23 14:34:57] -rwxr-xr-x. 1 boinc boinc 7986 23. 
Jul 14:28 run_atlas [2023-07-23 14:34:57] -rw-r--r--. 1 boinc boinc 112 23. Jul 14:28 job.xml [2023-07-23 14:34:57] -rw-r--r--. 2 boinc boinc 17604 23. Jul 14:28 start_atlas.sh [2023-07-23 14:34:57] drwxrwx--x. 2 boinc boinc 68 23. Jul 14:28 shared [2023-07-23 14:34:57] -rw-r--r--. 2 boinc boinc 428859 23. Jul 14:28 input.tar.gz [2023-07-23 14:34:57] -rw-r--r--. 1 boinc boinc 6174 23. Jul 14:28 init_data.xml [2023-07-23 14:34:57] -rw-r--r--. 2 boinc boinc 38288972 23. Jul 14:28 EVNT.04972714._000032.pool.root.1 [2023-07-23 14:34:57] -rw-r--r--. 1 boinc boinc 0 23. Jul 14:28 boinc_lockfile [2023-07-23 14:34:57] -rw-r--r--. 1 boinc boinc 2751 23. Jul 14:28 pandaJob.out [2023-07-23 14:34:57] -rw-------. 1 boinc boinc 424 23. Jul 14:28 setup.sh.local [2023-07-23 14:34:57] -rw-------. 1 boinc boinc 1015272 23. Jul 14:28 agis_schedconf.cvmfs.json [2023-07-23 14:34:57] -rw-------. 1 boinc boinc 1370854 23. Jul 14:29 cric_ddmendpoints.json [2023-07-23 14:34:57] drwx------. 4 boinc boinc 4096 23. Jul 14:30 pilot3 [2023-07-23 14:34:57] -rw-------. 1 boinc boinc 514 23. Jul 14:31 heartbeat.json [2023-07-23 14:34:57] -rw-r--r--. 1 boinc boinc 530 23. Jul 14:31 boinc_task_state.xml [2023-07-23 14:34:57] drwxrwx---. 2 boinc boinc 4096 23. Jul 14:32 PanDA_Pilot-5911727891 [2023-07-23 14:34:57] -rw-------. 1 boinc boinc 986 23. Jul 14:32 memory_monitor_summary.json [2023-07-23 14:34:57] -rw-r--r--. 1 boinc boinc 23 23. Jul 14:34 wrapper_checkpoint.txt [2023-07-23 14:34:57] -rw-r--r--. 1 boinc boinc 8192 23. Jul 14:34 boinc_mmap_file [2023-07-23 14:34:57] -rw-------. 1 boinc boinc 55930 23. Jul 14:34 pilotlog.txt [2023-07-23 14:34:57] -rw-r--r--. 1 boinc boinc 11134 23. Jul 14:34 runtime_log.err [2023-07-23 14:34:57] -rw-r--r--. 1 boinc boinc 428 23. Jul 14:34 runtime_log [2023-07-23 14:34:57] -rw-------. 1 boinc boinc 603 23. Jul 14:34 D3NKDmeHnh3n7Olcko1bjSoqABFKDmABFKDm7AsVDmdrNKDmyTpxjn.diag [2023-07-23 14:34:57] -rw-------. 1 boinc boinc 75381 23. 
Jul 14:34 8e65cf5d-16a8-4ba2-b790-9de836da5bc7_79060.1.job.log [2023-07-23 14:34:57] -rw-r--r--. 1 boinc boinc 26968 23. Jul 14:34 stderr.txt [2023-07-23 14:34:57] No HITS result produced [2023-07-23 14:34:57] *** Contents of shared directory: *** [2023-07-23 14:34:57] insgesamt 37832 [2023-07-23 14:34:57] -rw-r--r--. 2 boinc boinc 17604 23. Jul 14:28 start_atlas.sh [2023-07-23 14:34:57] -rw-r--r--. 2 boinc boinc 428859 23. Jul 14:28 input.tar.gz [2023-07-23 14:34:57] -rw-r--r--. 2 boinc boinc 38288972 23. Jul 14:28 ATLAS.root_0 14:34:59 (3684140): run_atlas exited; CPU time 29.746207 14:34:59 (3684140): called boinc_finish(0) </stderr_txt> <message> upload failure: <file_xfer_error> <file_name>D3NKDmeHnh3n7Olcko1bjSoqABFKDmABFKDm7AsVDmdrNKDmyTpxjn_0_r382370072_ATLAS_result</file_name> <error_code>-161 (not found)</error_code> </file_xfer_error> </message> ]]>