Name | 2o2KDmQdci4n7Olcko1bjSoqABFKDmABFKDmyeZQDmOMJKDmbtRpfn_0 |
Workunit | 2376906 |
Created | 12 Jan 2024, 19:44:42 UTC |
Sent | 12 Jan 2024, 19:49:09 UTC |
Report deadline | 19 Jan 2024, 19:49:09 UTC |
Received | 12 Jan 2024, 22:08:14 UTC |
Server state | Over |
Outcome | Success |
Client state | Done |
Exit status | 0 (0x00000000) |
Computer ID | 4008 |
Run time | 36 min 49 sec |
CPU time | 28 min 23 sec |
Validate state | Valid |
Credit | 26.57 |
Device peak FLOPS | 3.12 GFLOPS |
Application version | ATLAS Simulation v3.01 (native_mt) x86_64-pc-linux-gnu |
Peak working set size | 1.51 GB |
Peak swap size | 2.14 GB |
Peak disk usage | 85.17 MB |
<core_client_version>7.16.1</core_client_version>
<![CDATA[
<stderr_txt>
22:20:07 (24187): wrapper (7.7.26015): starting
22:20:07 (24187): wrapper: running run_atlas (--nthreads 1)
[2024-01-12 22:20:07] Arguments: --nthreads 1
[2024-01-12 22:20:07] Threads: 1
[2024-01-12 22:20:07] Checking for CVMFS
[2024-01-12 22:20:11] Probing /cvmfs/atlas.cern.ch... OK
[2024-01-12 22:20:18] Probing /cvmfs/atlas-condb.cern.ch... OK
[2024-01-12 22:20:18] Running cvmfs_config stat atlas.cern.ch
[2024-01-12 22:20:26] VERSION PID UPTIME(M) MEM(K) REVISION EXPIRES(M) NOCATALOGS CACHEUSE(K) CACHEMAX(K) NOFDUSE NOFDMAX NOIOERR NOOPEN HITRATE(%) RX(K) SPEED(K/S) HOST PROXY ONLINE
[2024-01-12 22:20:26] 2.11.2.0 24417 0 28624 128108 3 1 2580829 4096001 0 130560 0 0 0.000 962 439 http://s1cern-cvmfs.openhtc.io/cvmfs/atlas.cern.ch DIRECT 1
[2024-01-12 22:20:26] CVMFS is ok
[2024-01-12 22:20:26] Efficiency of ATLAS tasks can be improved by the following measure(s):
[2024-01-12 22:20:26] Small home clusters do not require a local http proxy but it is suggested if
[2024-01-12 22:20:26] more than 10 cores throughout the same LAN segment are regularly running ATLAS like tasks.
[2024-01-12 22:20:26] Further information can be found at the LHC@home message board.
[2024-01-12 22:20:26] Using apptainer image /cvmfs/atlas.cern.ch/repo/containers/fs/singularity/x86_64-centos7
[2024-01-12 22:20:26] Checking for apptainer binary...
[2024-01-12 22:20:26] which: no apptainer in (/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin)
[2024-01-12 22:20:26] apptainer is not installed, using version from CVMFS
[2024-01-12 22:20:26] Checking apptainer works with /cvmfs/atlas.cern.ch/repo/containers/sw/apptainer/x86_64-el7/current/bin/apptainer exec -B /cvmfs /cvmfs/atlas.cern.ch/repo/containers/fs/singularity/x86_64-centos7 hostname
[2024-01-12 22:20:27] HPi7COS7
[2024-01-12 22:20:27] apptainer works
[2024-01-12 22:20:27] Starting ATLAS job with PandaID=6075879741
[2024-01-12 22:20:27] Running command: /cvmfs/atlas.cern.ch/repo/containers/sw/apptainer/x86_64-el7/current/bin/apptainer exec -B /cvmfs,/var/lib/boinc/slots/1 /cvmfs/atlas.cern.ch/repo/containers/fs/singularity/x86_64-centos7 sh start_atlas.sh
[2024-01-12 22:56:54] *** The last 200 lines of the pilot log: ***
[2024-01-12 22:56:54] 2024-01-12 21:56:32,270 | DEBUG | pilot.control.job | send_state | data={'jobId': '6075879741', 'state': 'finished', 'timestamp': '2024-01-12T22:56:24+01:00', 'siteNam
[2024-01-12 22:56:54] 2024-01-12 21:56:32,272 | DEBUG | pilot.control.job | write_heartbeat_to_file | heartbeat dictionary: {'jobId': '6075879741', 'state': 'finished', 'timestamp': '2024-01-12T22:56:24
[2024-01-12 22:56:54] 2024-01-12 21:56:32,272 | DEBUG | pilot.control.job | write_heartbeat_to_file | wrote heartbeat to file: /var/lib/boinc/slots/1/heartbeat.json
[2024-01-12 22:56:54] 2024-01-12 21:56:32,272 | DEBUG | pilot.control.job | queue_monitor | job.completed=False
[2024-01-12 22:56:54] 2024-01-12 21:56:32,937 | INFO | pilot.control.job | make_job_report |
[2024-01-12 22:56:54] 2024-01-12 21:56:32,938 | INFO | pilot.control.job | make_job_report | job summary report
[2024-01-12 22:56:54] 2024-01-12 21:56:32,938 | INFO | pilot.control.job | make_job_report | --------------------------------------------------
[2024-01-12 22:56:54] 2024-01-12 21:56:32,938 | INFO | pilot.control.job | make_job_report | PanDA job id: 6075879741
[2024-01-12 22:56:54] 2024-01-12 21:56:32,939 | INFO | pilot.control.job | make_job_report | task id: NULL
[2024-01-12 22:56:54] 2024-01-12 21:56:32,939 | INFO | pilot.control.job | make_job_report | errors: (none)
[2024-01-12 22:56:54] 2024-01-12 21:56:32,939 | INFO | pilot.control.job | make_job_report | status: LOG_TRANSFER = DONE
[2024-01-12 22:56:54] 2024-01-12 21:56:32,939 | INFO | pilot.control.job | make_job_report | pilot state: finished
[2024-01-12 22:56:54] 2024-01-12 21:56:32,939 | INFO | pilot.control.job | make_job_report | transexitcode: 0
[2024-01-12 22:56:54] 2024-01-12 21:56:32,939 | INFO | pilot.control.job | make_job_report | exeerrorcode: 0
[2024-01-12 22:56:54] 2024-01-12 21:56:32,940 | INFO | pilot.control.job | make_job_report | exeerrordiag:
[2024-01-12 22:56:54] 2024-01-12 21:56:32,940 | INFO | pilot.control.job | make_job_report | exitcode: 0
[2024-01-12 22:56:54] 2024-01-12 21:56:32,940 | INFO | pilot.control.job | make_job_report | exitmsg: OK
[2024-01-12 22:56:54] 2024-01-12 21:56:32,940 | INFO | pilot.control.job | make_job_report | cpuconsumptiontime: 1778 s
[2024-01-12 22:56:54] 2024-01-12 21:56:32,940 | INFO | pilot.control.job | make_job_report | nevents: 2
[2024-01-12 22:56:54] 2024-01-12 21:56:32,943 | INFO | pilot.control.job | make_job_report | neventsw: 0
[2024-01-12 22:56:54] 2024-01-12 21:56:32,944 | INFO | pilot.control.job | make_job_report | pid: 3106
[2024-01-12 22:56:54] 2024-01-12 21:56:32,945 | INFO | pilot.control.job | make_job_report | pgrp: 3106
[2024-01-12 22:56:54] 2024-01-12 21:56:32,946 | INFO | pilot.control.job | make_job_report | corecount: 1
[2024-01-12 22:56:54] 2024-01-12 21:56:32,946 | INFO | pilot.control.job | make_job_report | event service: False
[2024-01-12 22:56:54] 2024-01-12 21:56:32,947 | INFO | pilot.control.job | make_job_report | sizes: {0: 2398297, 1: 2399061, 2: 2399232, 12: 2399466, 13: 2399494, 2035: 2421093, 2037: 2430143,
[2024-01-12 22:56:54] 2024-01-12 21:56:32,947 | INFO | pilot.control.job | make_job_report | --------------------------------------------------
[2024-01-12 22:56:54] 2024-01-12 21:56:32,947 | INFO | pilot.control.job | make_job_report |
[2024-01-12 22:56:54] 2024-01-12 21:56:32,947 | DEBUG | pilot.control.job | has_job_completed | ls -lF /var/lib/boinc/slots/1:
[2024-01-12 22:56:54]
[2024-01-12 22:56:54] 2024-01-12 21:56:32,948 | INFO | pilot.util.container | print_executable | executing command: ls -lF /var/lib/boinc/slots/1
[2024-01-12 22:56:54] 2024-01-12 21:56:32,978 | DEBUG | pilot.util.container | execute | subprocess.communicate() will use timeout 864000 s
[2024-01-12 22:56:54] 2024-01-12 21:56:32,994 | DEBUG | pilot.control.job | has_job_completed | total 45864
[2024-01-12 22:56:54] -rw-------. 1 boinc boinc 123 Jan 12 22:20 2o2KDmQdci4n7Olcko1bjSoqABFKDmABFKDmyeZQDmOMJKDmbtRpfn.diag
[2024-01-12 22:56:54] -rw-------. 1 boinc boinc 876671 Jan 12 22:56 3b2b2f98-fcf6-46a1-ba68-ee0ac4740f31_28832.1.job.log
[2024-01-12 22:56:54] -rw-------. 1 boinc boinc 208684 Jan 12 22:56 3b2b2f98-fcf6-46a1-ba68-ee0ac4740f31_28832.1.job.log.tgz
[2024-01-12 22:56:54] -rw-r--r--. 2 boinc boinc 39611081 Jan 12 22:20 EVNT.04972714._000022.pool.root.1
[2024-01-12 22:56:54] drwxrwx---. 2 boinc boinc 4096 Jan 12 22:56 PanDA_Pilot-6075879741/
[2024-01-12 22:56:54] -rw-------. 1 boinc boinc 0 Jan 12 22:21 agis_ddmendpoints.agis.ALL.json
[2024-01-12 22:56:54] -rw-------. 1 boinc boinc 961222 Jan 12 22:22 agis_schedconf.cvmfs.json
[2024-01-12 22:56:54] -rw-r--r--. 1 boinc boinc 0 Jan 12 22:20 boinc_lockfile
[2024-01-12 22:56:54] -rw-r--r--. 1 boinc boinc 8192 Jan 12 22:56 boinc_mmap_file
[2024-01-12 22:56:54] -rw-r--r--. 1 boinc boinc 534 Jan 12 22:56 boinc_task_state.xml
[2024-01-12 22:56:54] -rw-------. 1 boinc boinc 1338657 Jan 12 22:21 cric_ddmendpoints.json
[2024-01-12 22:56:54] -rw-------. 1 boinc boinc 7427 Jan 12 22:56 heartbeat.json
[2024-01-12 22:56:54] -rw-r--r--. 1 boinc boinc 6046 Jan 12 22:20 init_data.xml
[2024-01-12 22:56:54] -rw-r--r--. 2 boinc boinc 488887 Jan 12 22:20 input.tar.gz
[2024-01-12 22:56:54] -rw-r--r--. 1 boinc boinc 112 Jan 12 22:20 job.xml
[2024-01-12 22:56:54] -rw-------. 1 boinc boinc 1017 Jan 12 22:56 memory_monitor_summary.json
[2024-01-12 22:56:54] -rw-------. 1 boinc boinc 1549091 Jan 12 22:56 output.1.3b2b2f98-fcf6-46a1-ba68-ee0ac4740f31_28832.pool.root
[2024-01-12 22:56:54] -rw-------. 1 boinc boinc 464 Jan 12 22:56 output.list
[2024-01-12 22:56:54] -rw-r--r--. 1 boinc boinc 2668 Jan 12 22:20 pandaJob.out
[2024-01-12 22:56:54] drwx------. 5 boinc boinc 4096 Jan 12 22:22 pilot3/
[2024-01-12 22:56:54] -rw-------. 1 boinc boinc 477391 Jan 12 20:44 pilot3.tar.gz
[2024-01-12 22:56:54] -rw-------. 1 boinc boinc 47 Jan 12 22:22 pilot_heartbeat.json
[2024-01-12 22:56:54] -rw-------. 1 boinc boinc 847067 Jan 12 22:56 pilotlog.txt
[2024-01-12 22:56:54] -rw-r--r--. 1 boinc boinc 4388 Jan 12 20:42 queuedata.json
[2024-01-12 22:56:54] -rwxr-xr-x. 1 boinc boinc 7986 Jan 12 22:20 run_atlas*
[2024-01-12 22:56:54] -rwx------. 1 boinc boinc 27437 Jan 12 20:44 runpilot2-wrapper.sh*
[2024-01-12 22:56:54] -rw-r--r--. 1 boinc boinc 407 Jan 12 22:20 runtime_log
[2024-01-12 22:56:54] -rw-r--r--. 1 boinc boinc 7753 Jan 12 22:20 runtime_log.err
[2024-01-12 22:56:54] -rw-------. 1 boinc boinc 424 Jan 12 22:20 setup.sh.local
[2024-01-12 22:56:54] drwxrwx--x. 2 boinc boinc 68 Jan 12 22:20 shared/
[2024-01-12 22:56:54] -rw-r--r--. 2 boinc boinc 17646 Jan 12 22:20 start_atlas.sh
[2024-01-12 22:56:54] -rw-r--r--. 1 boinc boinc 2120 Jan 12 22:20 stderr.txt
[2024-01-12 22:56:54] -rw-r--r--. 1 boinc boinc 107 Jan 12 22:20 wrapper_26015_x86_64-pc-linux-gnu
[2024-01-12 22:56:54] -rw-r--r--. 1 boinc boinc 26 Jan 12 22:56 wrapper_checkpoint.txt
[2024-01-12 22:56:54] 2024-01-12 21:56:32,994 | INFO | pilot.util.queuehandling | queue_report | queue jobs had 0 job(s) [purged]
[2024-01-12 22:56:54] 2024-01-12 21:56:32,995 | INFO | pilot.util.queuehandling | queue_report | queue payloads had 0 job(s) [purged]
[2024-01-12 22:56:54] 2024-01-12 21:56:32,995 | INFO | pilot.util.queuehandling | queue_report | queue data_in had 0 job(s) [purged]
[2024-01-12 22:56:54] 2024-01-12 21:56:32,995 | INFO | pilot.util.queuehandling | queue_report | queue data_out had 0 job(s) [purged]
[2024-01-12 22:56:54] 2024-01-12 21:56:32,995 | INFO | pilot.util.queuehandling | queue_report | queue current_data_in had 0 job(s) [purged]
[2024-01-12 22:56:54] 2024-01-12 21:56:32,995 | INFO | pilot.util.queuehandling | queue_report | queue validated_jobs had 0 job(s) [purged]
[2024-01-12 22:56:54] 2024-01-12 21:56:32,995 | INFO | pilot.util.queuehandling | queue_report | queue validated_payloads had 0 job(s) [purged]
[2024-01-12 22:56:54] 2024-01-12 21:56:32,995 | INFO | pilot.util.queuehandling | queue_report | queue monitored_payloads had 0 job(s) [purged]
[2024-01-12 22:56:54] 2024-01-12 21:56:32,995 | INFO | pilot.util.queuehandling | queue_report | queue finished_jobs had 0 job(s) [purged]
[2024-01-12 22:56:54] 2024-01-12 21:56:32,995 | INFO | pilot.util.queuehandling | queue_report | queue finished_payloads had 0 job(s) [purged]
[2024-01-12 22:56:54] 2024-01-12 21:56:32,995 | INFO | pilot.util.queuehandling | queue_report | queue finished_data_in had 1 job(s) [purged]
[2024-01-12 22:56:54] 2024-01-12 21:56:32,996 | INFO | pilot.util.queuehandling | queue_report | queue finished_data_out had 0 job(s) [purged]
[2024-01-12 22:56:54] 2024-01-12 21:56:32,996 | INFO | pilot.util.queuehandling | queue_report | queue failed_jobs had 0 job(s) [purged]
[2024-01-12 22:56:54] 2024-01-12 21:56:32,996 | INFO | pilot.util.queuehandling | queue_report | queue failed_payloads had 0 job(s) [purged]
[2024-01-12 22:56:54] 2024-01-12 21:56:32,996 | INFO | pilot.util.queuehandling | queue_report | queue failed_data_in had 0 job(s) [purged]
[2024-01-12 22:56:54] 2024-01-12 21:56:32,996 | INFO | pilot.util.queuehandling | queue_report | queue failed_data_out had 0 job(s) [purged]
[2024-01-12 22:56:54] 2024-01-12 21:56:32,996 | INFO | pilot.util.queuehandling | queue_report | queue completed_jobs had 0 job(s) [purged]
[2024-01-12 22:56:54] 2024-01-12 21:56:32,996 | INFO | pilot.util.queuehandling | queue_report | queue completed_jobids has 1 job(s)
[2024-01-12 22:56:54] 2024-01-12 21:56:32,996 | INFO | pilot.util.queuehandling | queue_report | queue realtimelog_payloads had 0 job(s) [purged]
[2024-01-12 22:56:54] 2024-01-12 21:56:32,996 | INFO | pilot.util.queuehandling | queue_report | queue messages had 0 job(s) [purged]
[2024-01-12 22:56:54] 2024-01-12 21:56:32,997 | INFO | pilot.control.job | has_job_completed | job 6075879741 has completed (purged errors)
[2024-01-12 22:56:54] 2024-01-12 21:56:32,997 | DEBUG | pilot.util.realtimelogger | cleanup | attempting real-time logger cleanup
[2024-01-12 22:56:54] 2024-01-12 21:56:32,997 | INFO | pilot.util.processes | cleanup | overall cleanup function is called
[2024-01-12 22:56:54] 2024-01-12 21:56:32,999 | DEBUG | pilot.util.processes | cleanup | work directory was removed: /var/lib/boinc/slots/1/PanDA_Pilot-6075879741
[2024-01-12 22:56:54] 2024-01-12 21:56:33,043 | WARNING | pilot.control.monitor | run_checks | too much time has passed since last successful pilot heartbeat (2134.6773953437805 s) - must update
[2024-01-12 22:56:54] 2024-01-12 21:56:34,010 | INFO | pilot.info.jobdata | collect_zombies | --- collectZombieJob: --- 10, [3106]
[2024-01-12 22:56:54] 2024-01-12 21:56:34,010 | INFO | pilot.info.jobdata | collect_zombies | zombie collector waiting for pid 3106
[2024-01-12 22:56:54] 2024-01-12 21:56:34,011 | INFO | pilot.info.jobdata | collect_zombies | harmless exception when collecting zombies: [Errno 10] No child processes
[2024-01-12 22:56:54] 2024-01-12 21:56:35,027 | INFO | pilot.util.processes | cleanup | collected zombie processes
[2024-01-12 22:56:54] 2024-01-12 21:56:35,027 | INFO | pilot.util.processes | cleanup | will now attempt to kill all subprocesses of pid=3106
[2024-01-12 22:56:54] 2024-01-12 21:56:35,038 | DEBUG | pilot.util.container | execute | subprocess.communicate() will use timeout 864000 s
[2024-01-12 22:56:54] 2024-01-12 21:56:35,050 | WARNING | pilot.control.monitor | run_checks | too much time has passed since last successful pilot heartbeat (2136.6838006973267 s) - must update
[2024-01-12 22:56:54] 2024-01-12 21:56:35,100 | INFO | pilot.util.processes | kill_processes | process IDs to be killed: [3106] (in reverse order)
[2024-01-12 22:56:54] 2024-01-12 21:56:35,123 | DEBUG | pilot.util.container | execute | subprocess.communicate() will use timeout 864000 s
[2024-01-12 22:56:54] 2024-01-12 21:56:35,192 | WARNING | pilot.util.processes | kill_processes | found no corresponding commands to process id(s)
[2024-01-12 22:56:54] 2024-01-12 21:56:35,193 | INFO | pilot.util.processes | kill_orphans | Do not look for orphan processes in BOINC jobs
[2024-01-12 22:56:54] 2024-01-12 21:56:35,194 | INFO | pilot.util.processes | kill_defunct_children | did not find any defunct processes belonging to 3106
[2024-01-12 22:56:54] 2024-01-12 21:56:35,196 | INFO | pilot.util.processes | kill_defunct_children | did not find any defunct processes belonging to 3106
[2024-01-12 22:56:54] 2024-01-12 21:56:35,196 | DEBUG | pilot.util.queuehandling | purge_queue | queue purged
[2024-01-12 22:56:54] 2024-01-12 21:56:35,197 | INFO | pilot.control.job | retrieve | ready for new job
[2024-01-12 22:56:54] 2024-01-12 21:56:35,197 | INFO | root | retrieve | pilot has finished with previous job - re-establishing logging
[2024-01-12 22:56:54] 2024-01-12 21:56:35,198 | INFO | pilot.util.auxiliary | pilot_version_banner | **************************************
[2024-01-12 22:56:54] 2024-01-12 21:56:35,198 | INFO | pilot.util.auxiliary | pilot_version_banner | *** PanDA Pilot version 3.7.1.22 ***
[2024-01-12 22:56:54] 2024-01-12 21:56:35,198 | INFO | pilot.util.auxiliary | pilot_version_banner | **************************************
[2024-01-12 22:56:54] 2024-01-12 21:56:35,199 | INFO | pilot.util.auxiliary | pilot_version_banner |
[2024-01-12 22:56:54] 2024-01-12 21:56:35,199 | INFO | pilot.util.auxiliary | pilot_version_banner | pilot is running in a VM
[2024-01-12 22:56:54] 2024-01-12 21:56:35,199 | INFO | pilot.util.auxiliary | display_architecture_info | architecture information:
[2024-01-12 22:56:54] 2024-01-12 21:56:35,199 | INFO | pilot.util.container | print_executable | executing command: cat /etc/os-release
[2024-01-12 22:56:54] 2024-01-12 21:56:35,206 | DEBUG | pilot.util.container | execute | subprocess.communicate() will use timeout 864000 s
[2024-01-12 22:56:54] 2024-01-12 21:56:35,214 | INFO | pilot.util.filehandling | dump | cat /etc/os-release:
[2024-01-12 22:56:54] NAME="CentOS Linux"
[2024-01-12 22:56:54] VERSION="7 (Core)"
[2024-01-12 22:56:54] ID="centos"
[2024-01-12 22:56:54] ID_LIKE="rhel fedora"
[2024-01-12 22:56:54] VERSION_ID="7"
[2024-01-12 22:56:54] PRETTY_NAME="CentOS Linux 7 (Core)"
[2024-01-12 22:56:54] ANSI_COLOR="0;31"
[2024-01-12 22:56:54] CPE_NAME="cpe:/o:centos:centos:7"
[2024-01-12 22:56:54] HOME_URL="https://www.centos.org/"
[2024-01-12 22:56:54] BUG_REPORT_URL="https://bugs.centos.org/"
[2024-01-12 22:56:54]
[2024-01-12 22:56:54] CENTOS_MANTISBT_PROJECT="CentOS-7"
[2024-01-12 22:56:54] CENTOS_MANTISBT_PROJECT_VERSION="7"
[2024-01-12 22:56:54] REDHAT_SUPPORT_PRODUCT="centos"
[2024-01-12 22:56:54] REDHAT_SUPPORT_PRODUCT_VERSION="7"
[2024-01-12 22:56:54]
[2024-01-12 22:56:54] 2024-01-12 21:56:35,214 | INFO | pilot.util.auxiliary | pilot_version_banner | **************************************
[2024-01-12 22:56:54] 2024-01-12 21:56:35,716 | DEBUG | pilot.util.monitoring | check_local_space | checking local space on /var/lib/boinc/slots/1
[2024-01-12 22:56:54] 2024-01-12 21:56:35,716 | INFO | pilot.util.container | print_executable | executing command: df -mP /var/lib/boinc/slots/1
[2024-01-12 22:56:54] 2024-01-12 21:56:35,736 | DEBUG | pilot.util.container | execute | subprocess.communicate() will use timeout 864000 s
[2024-01-12 22:56:54] 2024-01-12 21:56:35,745 | DEBUG | pilot.util.workernode | get_local_disk_space | stdout=Filesystem 1048576-blocks Used Available Capacity Mounted on
[2024-01-12 22:56:54] /dev/mapper/centos-root 22002 11733 10269 54% /var/lib/boinc/slots/1
[2024-01-12 22:56:54] 2024-01-12 21:56:35,745 | DEBUG | pilot.util.workernode | get_local_disk_space | stderr=
[2024-01-12 22:56:54] 2024-01-12 21:56:35,745 | INFO | pilot.util.monitoring | check_local_space | sufficient remaining disk space (10767826944 B)
[2024-01-12 22:56:54] 2024-01-12 21:56:35,745 | WARNING | pilot.control.job | proceed_with_getjob | since timefloor is set to 0, pilot was only allowed to run one job
[2024-01-12 22:56:54] 2024-01-12 21:56:35,745 | WARNING | pilot.control.job | retrieve | setting graceful_stop since proceed_with_getjob() returned False (pilot will end)
[2024-01-12 22:56:54] 2024-01-12 21:56:35,746 | WARNING | pilot.util.common | should_abort | job:job_monitor:received graceful stop - abort after this iteration
[2024-01-12 22:56:54] 2024-01-12 21:56:35,746 | INFO | pilot.control.job | job_monitor | aborting loop
[2024-01-12 22:56:54] 2024-01-12 21:56:35,746 | WARNING | pilot.control.monitor | control | aborting monitor loop since graceful_stop has been set (timing out remaining threads)
[2024-01-12 22:56:54] 2024-01-12 21:56:35,747 | WARNING | pilot.control.monitor | run_checks | too much time has passed since last successful pilot heartbeat (2137.380723953247 s) - must update ?
[2024-01-12 22:56:54] 2024-01-12 21:56:35,747 | INFO | pilot.util.queuehandling | abort_jobs_in_queues | found 0 job(s) in 20 queues
[2024-01-12 22:56:54] 2024-01-12 21:56:35,747 | WARNING | pilot.control.monitor | run_checks | pilot monitor received instruction that args.graceful_stop has been set
[2024-01-12 22:56:54] 2024-01-12 21:56:35,747 | WARNING | pilot.control.monitor | run_checks | will wait for a maximum of 300 s for threads to finish
[2024-01-12 22:56:54] 2024-01-12 21:56:36,691 | DEBUG | pilot.control.data | control | data control ending since graceful_stop has been set
[2024-01-12 22:56:54] 2024-01-12 21:56:36,764 | INFO | pilot.control.job | job_monitor | [job] job monitor thread has finished
[2024-01-12 22:56:54] 2024-01-12 21:56:36,765 | INFO | pilot.control.job | retrieve | [job] retrieve thread has finished
[2024-01-12 22:56:54] 2024-01-12 21:56:36,809 | WARNING | pilot.util.common | should_abort | data:copytool_out:received graceful stop - abort after this iteration
[2024-01-12 22:56:54] 2024-01-12 21:56:36,876 | DEBUG | pilot.control.job | control | job control ending since graceful_stop has been set
[2024-01-12 22:56:54] 2024-01-12 21:56:36,882 | DEBUG | pilot.control.payload | control | payload control ending since graceful_stop has been set
[2024-01-12 22:56:54] 2024-01-12 21:56:37,130 | INFO | pilot.control.payload | failed_post | [payload] failed_post thread has finished
[2024-01-12 22:56:54] 2024-01-12 21:56:37,434 | INFO | pilot.control.job | validate | [job] validate thread has finished
[2024-01-12 22:56:54] 2024-01-12 21:56:37,508 | WARNING | pilot.util.common | should_abort | job:queue_monitor:received graceful stop - abort after this iteration
[2024-01-12 22:56:54] 2024-01-12 21:56:37,655 | INFO | pilot.control.payload | validate_pre | [payload] validate_pre thread has finished
[2024-01-12 22:56:54] 2024-01-12 21:56:37,691 | INFO | pilot.control.payload | validate_post | [payload] validate_post thread has finished
[2024-01-12 22:56:54] 2024-01-12 21:56:37,697 | INFO | pilot.control.data | control | [data] control thread has finished
[2024-01-12 22:56:54] 2024-01-12 21:56:37,778 | WARNING | pilot.util.common | should_abort | data:queue_monitoring:received graceful stop - abort after this iteration
[2024-01-12 22:56:54] 2024-01-12 21:56:37,832 | INFO | pilot.control.payload | execute_payloads | [payload] execute_payloads thread has finished
[2024-01-12 22:56:54] 2024-01-12 21:56:37,882 | INFO | pilot.control.job | control | [job] control thread has finished
[2024-01-12 22:56:54] 2024-01-12 21:56:37,890 | INFO | pilot.control.payload | control | [payload] control thread has finished
[2024-01-12 22:56:54] 2024-01-12 21:56:37,960 | INFO | pilot.control.data | copytool_in | [data] copytool_in thread has finished
[2024-01-12 22:56:54] 2024-01-12 21:56:37,986 | INFO | pilot.control.job | create_data_payload | [job] create_data_payload thread has finished
[2024-01-12 22:56:54] 2024-01-12 21:56:38,519 | INFO | pilot.control.job | queue_monitor | [job] queue monitor thread has finished
[2024-01-12 22:56:54] 2024-01-12 21:56:38,837 | INFO | pilot.control.data | copytool_out | [data] copytool_out thread has finished
[2024-01-12 22:56:54] 2024-01-12 21:56:41,813 | INFO | pilot.control.data | queue_monitoring | [data] queue_monitor thread has finished
[2024-01-12 22:56:54] 2024-01-12 21:56:46,019 | INFO | pilot.control.payload | get_logging_info | job.realtimelogging is not enabled
[2024-01-12 22:56:54] 2024-01-12 21:56:46,020 | DEBUG | pilot.control.payload | run_realtimelog | real-time logging not needed at this point
[2024-01-12 22:56:54] 2024-01-12 21:56:46,020 | DEBUG | pilot.control.payload | run_realtimelog | realtime logger was not found, waiting ..
[2024-01-12 22:56:54] 2024-01-12 21:56:47,025 | INFO | pilot.control.payload | run_realtimelog | [payload] run_realtimelog thread has finished
[2024-01-12 22:56:54] 2024-01-12 21:56:48,101 | INFO | pilot.util.processes | threads_aborted | only monitor.control thread still running - safe to abort: ['<_MainThread(MainThread, started 140454
[2024-01-12 22:56:54] 2024-01-12 21:56:48,101 | DEBUG | pilot.workflow.generic | run | will proceed to set job_aborted
[2024-01-12 22:56:54] 2024-01-12 21:56:48,900 | WARNING | pilot.control.monitor | run_checks | job_aborted has been set - aborting pilot monitoring
[2024-01-12 22:56:54] 2024-01-12 21:56:48,901 | INFO | pilot.control.monitor | control | [monitor] control thread has ended
[2024-01-12 22:56:54] 2024-01-12 21:56:53,131 | DEBUG | pilot.workflow.generic | run | all relevant threads have aborted (thread count=1)
[2024-01-12 22:56:54] 2024-01-12 21:56:53,131 | INFO | pilot.workflow.generic | run | end of generic workflow (traces error code: 0)
[2024-01-12 22:56:54] 2024-01-12 21:56:53,172 | DEBUG | pilot.util.processgroups | get_all_child_pids | python3(28383)---pstree(14971)
[2024-01-12 22:56:54]
[2024-01-12 22:56:54] 2024-01-12 21:56:53,172 | INFO | pilot.util.processgroups | find_defunct_subprocesses | child pids=[28383, 14971]
[2024-01-12 22:56:54] 2024-01-12 21:56:53,240 | DEBUG | pilot.util.processgroups | is_defunct | 28383: return code=0, stdout=SN
[2024-01-12 22:56:54] , stderr=
[2024-01-12 22:56:54] 2024-01-12 21:56:53,283 | DEBUG | pilot.util.processgroups | is_defunct | 14971: return code=1, stdout=, stderr=
[2024-01-12 22:56:54] 2024-01-12 21:56:53,283 | INFO | root | list_zombies | no defunct processes were found
[2024-01-12 22:56:54] 2024-01-12 21:56:53,283 | INFO | root | wrap_up | traces error code: 0
[2024-01-12 22:56:54] 2024-01-12 21:56:53,284 | INFO | root | wrap_up | pilot has finished (exit code=0, shell exit code=0)
[2024-01-12 22:56:54] 2024-01-12 21:56:53,403 [wrapper] ==== pilot stdout END ====
[2024-01-12 22:56:54] 2024-01-12 21:56:53,407 [wrapper] ==== wrapper stdout RESUME ====
[2024-01-12 22:56:54] 2024-01-12 21:56:53,411 [wrapper] pilotpid: 28383
[2024-01-12 22:56:54] 2024-01-12 21:56:53,416 [wrapper] Pilot exit status: 0
[2024-01-12 22:56:54] 2024-01-12 21:56:53,430 [wrapper] pandaids: 6075879741
[2024-01-12 22:56:54] 2024-01-12 21:56:53,439 [wrapper] apfmon messages muted
[2024-01-12 22:56:54] 2024-01-12 21:56:53,444 [wrapper] Test setup, not cleaning
[2024-01-12 22:56:54] 2024-01-12 21:56:53,449 [wrapper] ==== wrapper stdout END ====
[2024-01-12 22:56:54] 2024-01-12 21:56:53,453 [wrapper] ==== wrapper stderr END ====
[2024-01-12 22:56:54] 2024-01-12 21:56:53,461 [wrapper] wrapperexiting ec=0, duration=2185
[2024-01-12 22:56:54] 2024-01-12 21:56:53,465 [wrapper] apfmon messages muted
[2024-01-12 22:56:54] *** Error codes and diagnostics ***
[2024-01-12 22:56:54] "exeErrorCode": 0,
[2024-01-12 22:56:54] "exeErrorDiag": "",
[2024-01-12 22:56:54] "pilotErrorCode": 0,
[2024-01-12 22:56:54] "pilotErrorDiag": "",
[2024-01-12 22:56:54] *** Listing of results directory ***
[2024-01-12 22:56:54] insgesamt 47280
[2024-01-12 22:56:54] -rw-r--r--. 1 boinc boinc 4388 12. Jan 20:42 queuedata.json
[2024-01-12 22:56:54] -rwx------. 1 boinc boinc 27437 12. Jan 20:44 runpilot2-wrapper.sh
[2024-01-12 22:56:54] -rw-------. 1 boinc boinc 477391 12. Jan 20:44 pilot3.tar.gz
[2024-01-12 22:56:54] -rw-r--r--. 1 boinc boinc 107 12. Jan 22:20 wrapper_26015_x86_64-pc-linux-gnu
[2024-01-12 22:56:54] -rwxr-xr-x. 1 boinc boinc 7986 12. Jan 22:20 run_atlas
[2024-01-12 22:56:54] -rw-r--r--. 1 boinc boinc 112 12. Jan 22:20 job.xml
[2024-01-12 22:56:54] -rw-r--r--. 2 boinc boinc 17646 12. Jan 22:20 start_atlas.sh
[2024-01-12 22:56:54] drwxrwx--x. 2 boinc boinc 68 12. Jan 22:20 shared
[2024-01-12 22:56:54] -rw-r--r--. 2 boinc boinc 488887 12. Jan 22:20 input.tar.gz
[2024-01-12 22:56:54] -rw-r--r--. 1 boinc boinc 6046 12. Jan 22:20 init_data.xml
[2024-01-12 22:56:54] -rw-r--r--. 2 boinc boinc 39611081 12. Jan 22:20 EVNT.04972714._000022.pool.root.1
[2024-01-12 22:56:54] -rw-r--r--. 1 boinc boinc 0 12. Jan 22:20 boinc_lockfile
[2024-01-12 22:56:54] -rw-r--r--. 1 boinc boinc 2668 12. Jan 22:20 pandaJob.out
[2024-01-12 22:56:54] -rw-------. 1 boinc boinc 424 12. Jan 22:20 setup.sh.local
[2024-01-12 22:56:54] -rw-------. 1 boinc boinc 0 12. Jan 22:21 agis_ddmendpoints.agis.ALL.json
[2024-01-12 22:56:54] -rw-------. 1 boinc boinc 1338657 12. Jan 22:21 cric_ddmendpoints.json
[2024-01-12 22:56:54] -rw-------. 1 boinc boinc 47 12. Jan 22:22 pilot_heartbeat.json
[2024-01-12 22:56:54] -rw-------. 1 boinc boinc 961222 12. Jan 22:22 agis_schedconf.cvmfs.json
[2024-01-12 22:56:54] drwx------. 5 boinc boinc 4096 12. Jan 22:22 pilot3
[2024-01-12 22:56:54] -rw-------. 1 boinc boinc 1549091 12. Jan 22:56 output.1.3b2b2f98-fcf6-46a1-ba68-ee0ac4740f31_28832.pool.root
[2024-01-12 22:56:54] -rw-r--r--. 1 boinc boinc 534 12. Jan 22:56 boinc_task_state.xml
[2024-01-12 22:56:54] -rw-------. 1 boinc boinc 1017 12. Jan 22:56 memory_monitor_summary.json
[2024-01-12 22:56:54] -rw-------. 1 boinc boinc 208684 12. Jan 22:56 3b2b2f98-fcf6-46a1-ba68-ee0ac4740f31_28832.1.job.log.tgz
[2024-01-12 22:56:54] -rw-------. 1 boinc boinc 7427 12. Jan 22:56 heartbeat.json
[2024-01-12 22:56:54] -rw-r--r--. 1 boinc boinc 8192 12. Jan 22:56 boinc_mmap_file
[2024-01-12 22:56:54] -rw-r--r--. 1 boinc boinc 26 12. Jan 22:56 wrapper_checkpoint.txt
[2024-01-12 22:56:54] -rw-------. 1 boinc boinc 9722 12. Jan 22:56 pilotlog.txt
[2024-01-12 22:56:54] -rw-------. 1 boinc boinc 895571 12. Jan 22:56 3b2b2f98-fcf6-46a1-ba68-ee0ac4740f31_28832.1.job.log
[2024-01-12 22:56:54] -rw-------. 1 boinc boinc 464 12. Jan 22:56 output.list
[2024-01-12 22:56:54] -rw-r--r--. 1 boinc boinc 748 12. Jan 22:56 runtime_log
[2024-01-12 22:56:54] -rw-------. 1 boinc boinc 2672640 12. Jan 22:56 result.tar.gz
[2024-01-12 22:56:54] -rw-r--r--. 1 boinc boinc 11864 12. Jan 22:56 runtime_log.err
[2024-01-12 22:56:54] -rw-------. 1 boinc boinc 632 12. Jan 22:56 2o2KDmQdci4n7Olcko1bjSoqABFKDmABFKDmyeZQDmOMJKDmbtRpfn.diag
[2024-01-12 22:56:54] -rw-r--r--. 1 boinc boinc 29647 12. Jan 22:56 stderr.txt
[2024-01-12 22:56:54] HITS file was successfully produced:
[2024-01-12 22:56:54] -rw-------. 1 boinc boinc 1549091 12. Jan 22:56 shared/HITS.pool.root.1
[2024-01-12 22:56:54] *** Contents of shared directory: ***
[2024-01-12 22:56:54] insgesamt 43312
[2024-01-12 22:56:54] -rw-r--r--. 2 boinc boinc 17646 12. Jan 22:20 start_atlas.sh
[2024-01-12 22:56:54] -rw-r--r--. 2 boinc boinc 488887 12. Jan 22:20 input.tar.gz
[2024-01-12 22:56:54] -rw-r--r--. 2 boinc boinc 39611081 12. Jan 22:20 ATLAS.root_0
[2024-01-12 22:56:54] -rw-------. 1 boinc boinc 1549091 12. Jan 22:56 HITS.pool.root.1
[2024-01-12 22:56:54] -rw-------. 1 boinc boinc 2672640 12. Jan 22:56 result.tar.gz
22:56:55 (24187): run_atlas exited; CPU time 1703.204562
22:56:55 (24187): called boinc_finish(0)
</stderr_txt>
]]>
©2024 CERN