Name | 3WSLDmgRKcyn7Olcko1bjSoqABFKDmABFKDmJZ5TDmHGFKDmEENrRo_0 |
Workunit | 2062503 |
Created | 5 Mar 2021, 7:52:58 UTC |
Sent | 5 Mar 2021, 8:01:30 UTC |
Report deadline | 12 Mar 2021, 8:01:30 UTC |
Received | 5 Mar 2021, 9:28:54 UTC |
Server state | Over |
Outcome | Validate error |
Client state | Done |
Exit status | 0 (0x00000000) |
Computer ID | 4228 |
Run time | 14 min 10 sec |
CPU time | 14 min 44 sec |
Validate state | Invalid |
Credit | 0.00 |
Device peak FLOPS | 13.11 GFLOPS |
Application version | ATLAS Simulation v1.03 (native_mt) x86_64-pc-linux-gnu |
Peak working set size | 1.57 GB |
Peak swap size | 2.26 GB |
Peak disk usage | 90.77 MB |
<core_client_version>7.16.11</core_client_version>
<![CDATA[
<stderr_txt>
10:04:48 (2287): wrapper (7.7.26015): starting
10:04:48 (2287): wrapper: running run_atlas (--nthreads 4)
[2021-03-05 10:04:48] Arguments: --nthreads 4
[2021-03-05 10:04:48] Threads: 4
[2021-03-05 10:04:48] Checking for CVMFS
[2021-03-05 10:04:49] Probing /cvmfs/atlas-condb.cern.ch... OK
[2021-03-05 10:04:49] Probing /cvmfs/atlas.cern.ch... OK
[2021-03-05 10:04:49] VERSION PID UPTIME(M) MEM(K) REVISION EXPIRES(M) NOCATALOGS CACHEUSE(K) CACHEMAX(K) NOFDUSE NOFDMAX NOIOERR NOOPEN HITRATE(%) RX(K) SPEED(K/S) HOST PROXY ONLINE
[2021-03-05 10:04:49] 2.5.1.0 30731 25145 66676 80199 1 61 18642335 20480001 0 65024 0 3756449 99.7605 1312089 5142 http://cvmfs-stratum-one.cern.ch/cvmfs/atlas.cern.ch http://137.138.150.196:3128 1
[2021-03-05 10:04:49] CVMFS is ok
[2021-03-05 10:04:49] Using singularity image /cvmfs/atlas.cern.ch/repo/containers/images/singularity/x86_64-centos7.img
[2021-03-05 10:04:49] Checking for singularity binary...
[2021-03-05 10:04:49] Using singularity found in PATH at /usr/bin/singularity
[2021-03-05 10:04:49] Running /usr/bin/singularity --version
[2021-03-05 10:04:49] singularity version 3.7.1-1.el7
[2021-03-05 10:04:49] Checking singularity works with /usr/bin/singularity exec -B /cvmfs /cvmfs/atlas.cern.ch/repo/containers/images/singularity/x86_64-centos7.img hostname
[2021-03-05 10:04:49] dcameron05.cern.ch
[2021-03-05 10:04:49] Singularity works
[2021-03-05 10:04:50] Set ATHENA_PROC_NUMBER=4
[2021-03-05 10:04:50] Starting ATLAS job with PandaID=4991368413
[2021-03-05 10:04:50] Running command: /usr/bin/singularity exec --pwd /var/lib/boinc/slots/136 -B /cvmfs,/var /cvmfs/atlas.cern.ch/repo/containers/images/singularity/x86_64-centos7.img sh start_atlas.sh
[2021-03-05 10:18:55] *** The last 200 lines of the pilot log: ***
[2021-03-05 10:18:55] 2021-03-05 09:18:15,899 | INFO | queue_monitor | pilot.util.timing | timing_report | .
total setup = 10 s [2021-03-05 10:18:55] 2021-03-05 09:18:15,899 | INFO | queue_monitor | pilot.util.timing | timing_report | . stage-in = 0 s [2021-03-05 10:18:55] 2021-03-05 09:18:15,899 | INFO | queue_monitor | pilot.util.timing | timing_report | . payload execution = 663 s [2021-03-05 10:18:55] 2021-03-05 09:18:15,899 | INFO | queue_monitor | pilot.util.timing | timing_report | . stage-out = 0 s [2021-03-05 10:18:55] 2021-03-05 09:18:15,899 | INFO | queue_monitor | pilot.util.timing | timing_report | .............................. [2021-03-05 10:18:55] 2021-03-05 09:18:15,899 | INFO | queue_monitor | pilot.user.atlas.diagnose | get_log_extracts | building log extracts (sent to the server as 'pilotLog') [2021-03-05 10:18:55] 2021-03-05 09:18:15,900 | DEBUG | queue_monitor | pilot.user.atlas.diagnose | get_panda_tracer_log | PanDA tracer log does not exist: /var/lib/boinc/slots/136/PanDA_Pilot-4991368413/pandatracerlog.txt (ignoring) [2021-03-05 10:18:55] 2021-03-05 09:18:15,900 | INFO | queue_monitor | pilot.util.container | execute | executing command: tail -n 20 /var/lib/boinc/slots/136/PanDA_Pilot-4991368413/pilotlog.txt [2021-03-05 10:18:55] 2021-03-05 09:18:15,928 | WARNING | queue_monitor | pilot.control.job | add_timing_and_extracts | [2021-03-05 10:18:55] XXXXXXXXXXXXXXXXXXXXX[begin log extracts] [2021-03-05 10:18:55] - Log from pilotlog.txt - [2021-03-05 10:18:55] 2021-03-05 09:18:15,895 | DEBUG | queue_monitor | pilot.user.atlas.jobmetrics | get_job_metrics | job metrics="actualCoreCount=2" [2021-03-05 10:18:55] 2021-03-05 09:18:15,895 | INFO | queue_monitor | pilot.control.job | get_data_structure | mean actualcorecount: 1.888889 [2021-03-05 10:18:55] 2021-03-05 09:18:15,895 | INFO | queue_monitor | pilot.control.job | get_data_structure | payload/TRF did not report the number of read events [2021-03-05 10:18:55] 2021-03-05 09:18:15,896 | INFO | queue_monitor | pilot.user.atlas.utilities | get_memory_values | using path: 
/var/lib/boinc/slots/136/PanDA_Pilot-4991368413/memory_monitor_summary.json (trf name=prmon) [2021-03-05 10:18:55] 2021-03-05 09:18:15,898 | DEBUG | queue_monitor | pilot.user.atlas.utilities | get_memory_monitor_info | summary_dictionary={'Max': {'rx_packets': 80747, 'nprocs': 9, 'nthreads': 10, 'rx_bytes': 17005525, 'wtime': 615, 'rss': 6715500, 'write_bytes': 7278592, 'vmem': 11242440, 'read_bytes': 1217948672, 'stime': 21, 'tx_bytes': 1587308, 'pss': 1848616, 'wchar': 7865468, 'rchar': 320055361, 'tx_packets': 4962, 'swap': 0, 'utime': 852}, 'Avg': {'write_bytes': 11834.0, 'nprocs': 5.454, 'nthreads': 6.3629999999999995, 'rx_bytes': 27649.0, 'rx_packets': 131.287, 'vmem': 4909552.0, 'read_bytes': 1980275.0, 'swap': 0.0, 'tx_bytes': 2580.0, 'pss': 1402519.0, 'wchar': 12788.0, 'rchar': 520381.0, 'tx_packets': 8.067, 'rss': 3017996.0}, 'HW': {'mem': {'MemTotal': 7310268}, 'cpu': {'CoresPerSocket': 1, 'ModelName': 'Intel Core Processor (Broadwell, IBRS)', 'ThreadsPerCore': 1, 'CPUs': 4, 'Sockets': 4}}, 'prmon': {'Version': '2.2.0'}} [2021-03-05 10:18:55] 2021-03-05 09:18:15,898 | INFO | queue_monitor | pilot.user.atlas.utilities | get_memory_monitor_info | extracted standard info from prmon json [2021-03-05 10:18:55] 2021-03-05 09:18:15,899 | INFO | queue_monitor | pilot.user.atlas.utilities | get_memory_monitor_info | extracted standard memory fields from prmon json [2021-03-05 10:18:55] 2021-03-05 09:18:15,899 | INFO | queue_monitor | pilot.util.timing | timing_report | .............................. [2021-03-05 10:18:55] 2021-03-05 09:18:15,899 | INFO | queue_monitor | pilot.util.timing | timing_report | . Timing measurements: [2021-03-05 10:18:55] 2021-03-05 09:18:15,899 | INFO | queue_monitor | pilot.util.timing | timing_report | . get job = 0 s [2021-03-05 10:18:55] 2021-03-05 09:18:15,899 | INFO | queue_monitor | pilot.util.timing | timing_report | . 
initial setup = 0 s [2021-03-05 10:18:55] 2021-03-05 09:18:15,899 | INFO | queue_monitor | pilot.util.timing | timing_report | . payload setup = 10 s [2021-03-05 10:18:55] 2021-03-05 09:18:15,899 | INFO | queue_monitor | pilot.util.timing | timing_report | . total setup = 10 s [2021-03-05 10:18:55] 2021-03-05 09:18:15,899 | INFO | queue_monitor | pilot.util.timing | timing_report | . stage-in = 0 s [2021-03-05 10:18:55] 2021-03-05 09:18:15,899 | INFO | queue_monitor | pilot.util.timing | timing_report | . payload execution = 663 s [2021-03-05 10:18:55] 2021-03-05 09:18:15,899 | INFO | queue_monitor | pilot.util.timing | timing_report | . stage-out = 0 s [2021-03-05 10:18:55] 2021-03-05 09:18:15,899 | INFO | queue_monitor | pilot.util.timing | timing_report | .............................. [2021-03-05 10:18:55] 2021-03-05 09:18:15,899 | INFO | queue_monitor | pilot.user.atlas.diagnose | get_log_extracts | building log extracts (sent to the server as 'pilotLog') [2021-03-05 10:18:55] 2021-03-05 09:18:15,900 | DEBUG | queue_monitor | pilot.user.atlas.diagnose | get_panda_tracer_log | PanDA tracer log does not exist: /var/lib/boinc/slots/136/PanDA_Pilot-4991368413/pandatracerlog.txt (ignoring) [2021-03-05 10:18:55] 2021-03-05 09:18:15,900 | INFO | queue_monitor | pilot.util.container | execute | executing command: tail -n 20 /var/lib/boinc/slots/136/PanDA_Pilot-4991368413/pilotlog.txt [2021-03-05 10:18:55] XXXXXXXXXXXXXXXXXXXXX[end log extracts] [2021-03-05 10:18:55] 2021-03-05 09:18:15,928 | WARNING | queue_monitor | pilot.control.job | add_error_codes | pilotErrorCodes = [1187] (will report primary/first error code) [2021-03-05 10:18:55] 2021-03-05 09:18:15,928 | WARNING | queue_monitor | pilot.control.job | add_error_codes | pilotErrorDiags = ['Payload metadata does not exist'] (will report primary/first error diag) [2021-03-05 10:18:55] 2021-03-05 09:18:15,929 | DEBUG | queue_monitor | pilot.control.job | send_state | is_harvester_mode(args) : False [2021-03-05 
10:18:55] 2021-03-05 09:18:15,929 | DEBUG | queue_monitor | pilot.control.job | write_heartbeat_to_file | heartbeat dictionary: {'pilotErrorCode': 1187, 'rateWBYTES': 11834.0, 'pilotID': 'http://aipanda403.cern.ch/data/jobs/2021-03-05/BOINC-TEST/4991368413.out|PR|2.9.6 (20)', 'meanCoreCount': 1.8888888888888888, 'totRBYTES': 1217948672, 'siteName': 'BOINC-TEST', 'avgVMEM': 4909552.0, 'coreCount': 4, 'totWCHAR': 7865468, 'rateRCHAR': 520381.0, 'jobId': '4991368413', 'exeErrorCode': 0, 'rateWCHAR': 12788.0, 'totRCHAR': 320055361, 'xml': '{"17bf99fe-04d3-43ea-9836-17f98657d6c0_94740.1.job.log.tgz": {"adler32": "0041a0b8", "surl": "srm://srm.ndgf.org:8443/srm/managerv2?SFN=/atlas/disk/atlasdatadisk/rucio/hc_test/f1/b8/17bf99fe-04d3-43ea-9836-17f98657d6c0_94740.1.job.log.tgz", "guid": "5851f46f-280b-4e2a-bd7a-31ebed4d22b3", "fsize": 461105}}', 'maxVMEM': 11242440, 'cpuConversionFactor': 1.0, 'avgSWAP': 0.0, 'state': 'failed', 'transExitCode': 252, 'pilotErrorDiag': 'Payload metadata does not exist', 'node': 'dcameron05.cern.ch', 'avgRSS': 3017996.0, 'avgPSS': 1402519.0, 'timestamp': '2021-03-05T10:18:15+01:00', 'pilotTiming': '0|0|663|0|10', 'attemptNr': 0, 'totWBYTES': 7278592, 'rateRBYTES': 1980275.0, 'pilotLog': '- Log from pilotlog.txt -\n2021-03-05 09:18:15,895 | DEBUG | queue_monitor | pilot.user.atlas.jobmetrics | get_job_metrics | job metrics="actualCoreCount=2"\n2021-03-05 09:18:15,895 | INFO | queue_monitor | pilot.control.job | get_data_structure | mean actualcorecount: 1.888889\n2021-03-05 09:18:15,895 | INFO | queue_monitor | pilot.control.job | get_data_structure | payload/TRF did not report the number of read events\n2021-03-05 09:18:15,896 | INFO | queue_monitor | pilot.user.atlas.utilities | get_memory_values | using path: /var/lib/boinc/slots/136/PanDA_Pilot-4991368413/memory_monitor_summary.json (trf name=prmon)\n2021-03-05 09:18:15,898 | DEBUG | queue_monitor | pilot.user.atlas.utilities | get_memory_monitor_info | summary_dictionary={\'Max\': 
{\'rx_packets\': 80747, \'nprocs\': 9, \'nthreads\': 10, \'rx_bytes\': 17005525, \'wtime\': 615, \'rss\': 6715500, \'write_bytes\': 7278592, \'vmem\': 11242', 'cpuConsumptionTime': 904, 'startTime': 1614935100.419945, 'cpuConsumptionUnit': 's+Intel Core Processor (Broadwell, IBRS) 16384 KB', 'exeErrorDiag': '', 'maxSWAP': 0, 'jobMetrics': 'actualCoreCount=2', 'maxRSS': 6715500, 'schedulerID': 'harvester-CERN_central_ACTA', 'endTime': 1614935895.928875, 'maxPSS': 1848616} [2021-03-05 10:18:55] 2021-03-05 09:18:15,929 | DEBUG | queue_monitor | pilot.control.job | write_heartbeat_to_file | wrote heartbeat to file /var/lib/boinc/slots/136/heartbeat.json [2021-03-05 10:18:55] 2021-03-05 09:18:15,930 | DEBUG | queue_monitor | pilot.control.job | queue_monitor | job 4991368413 was dequeued from the monitored payloads queue [2021-03-05 10:18:55] 2021-03-05 09:18:16,085 | DEBUG | queue_monitor | pilot.control.job | queue_monitor | tmp job object deleted [2021-03-05 10:18:55] 2021-03-05 09:18:16,116 | INFO | retrieve | pilot.control.job | make_job_report | [2021-03-05 10:18:55] 2021-03-05 09:18:16,116 | INFO | retrieve | pilot.control.job | make_job_report | job summary report [2021-03-05 10:18:55] 2021-03-05 09:18:16,116 | INFO | retrieve | pilot.control.job | make_job_report | -------------------------------------------------- [2021-03-05 10:18:55] 2021-03-05 09:18:16,116 | INFO | retrieve | pilot.control.job | make_job_report | PanDA job id: 4991368413 [2021-03-05 10:18:55] 2021-03-05 09:18:16,116 | INFO | retrieve | pilot.control.job | make_job_report | task id: NULL [2021-03-05 10:18:55] 2021-03-05 09:18:16,116 | INFO | retrieve | pilot.control.job | make_job_report | error 1/1: 1187: Payload metadata does not exist [2021-03-05 10:18:55] 2021-03-05 09:18:16,116 | INFO | retrieve | pilot.control.job | make_job_report | pilot error code: 1187 [2021-03-05 10:18:55] 2021-03-05 09:18:16,116 | INFO | retrieve | pilot.control.job | make_job_report | pilot error diag: metadata 
does not exist: /var/lib/boinc/slots/136/PanDA_Pilot-4991368413/metadata.xml [2021-03-05 10:18:55] 2021-03-05 09:18:16,117 | INFO | retrieve | pilot.control.job | make_job_report | status: LOG_TRANSFER = DONE [2021-03-05 10:18:55] 2021-03-05 09:18:16,117 | INFO | retrieve | pilot.control.job | make_job_report | pilot state: failed [2021-03-05 10:18:55] 2021-03-05 09:18:16,117 | INFO | retrieve | pilot.control.job | make_job_report | transexitcode: 252 [2021-03-05 10:18:55] 2021-03-05 09:18:16,117 | INFO | retrieve | pilot.control.job | make_job_report | exeerrorcode: 0 [2021-03-05 10:18:55] 2021-03-05 09:18:16,117 | INFO | retrieve | pilot.control.job | make_job_report | exeerrordiag: [2021-03-05 10:18:55] 2021-03-05 09:18:16,117 | INFO | retrieve | pilot.control.job | make_job_report | exitcode: 0 [2021-03-05 10:18:55] 2021-03-05 09:18:16,117 | INFO | retrieve | pilot.control.job | make_job_report | exitmsg: [2021-03-05 10:18:55] 2021-03-05 09:18:16,117 | INFO | retrieve | pilot.control.job | make_job_report | cpuconsumptiontime: 904 s [2021-03-05 10:18:55] 2021-03-05 09:18:16,117 | INFO | retrieve | pilot.control.job | make_job_report | nevents: 0 [2021-03-05 10:18:55] 2021-03-05 09:18:16,117 | INFO | retrieve | pilot.control.job | make_job_report | neventsw: 0 [2021-03-05 10:18:55] 2021-03-05 09:18:16,117 | INFO | retrieve | pilot.control.job | make_job_report | pid: 9315 [2021-03-05 10:18:55] 2021-03-05 09:18:16,117 | INFO | retrieve | pilot.control.job | make_job_report | pgrp: 9315 [2021-03-05 10:18:55] 2021-03-05 09:18:16,117 | INFO | retrieve | pilot.control.job | make_job_report | corecount: 4 [2021-03-05 10:18:55] 2021-03-05 09:18:16,117 | INFO | retrieve | pilot.control.job | make_job_report | event service: False [2021-03-05 10:18:55] 2021-03-05 09:18:16,117 | INFO | retrieve | pilot.control.job | make_job_report | sizes: {0: 8545448, 1: 8545671, 674: 8549362, 676: 8553213, 794: 8553169} [2021-03-05 10:18:55] 2021-03-05 09:18:16,118 | INFO | retrieve | 
pilot.control.job | make_job_report | -------------------------------------------------- [2021-03-05 10:18:55] 2021-03-05 09:18:16,118 | INFO | retrieve | pilot.control.job | make_job_report | [2021-03-05 10:18:55] 2021-03-05 09:18:16,118 | DEBUG | retrieve | pilot.control.job | has_job_completed | ls -lF /var/lib/boinc/slots/136: [2021-03-05 10:18:55] [2021-03-05 10:18:55] 2021-03-05 09:18:16,118 | INFO | retrieve | pilot.util.container | execute | executing command: ls -lF /var/lib/boinc/slots/136 [2021-03-05 10:18:55] 2021-03-05 09:18:16,144 | DEBUG | retrieve | pilot.control.job | has_job_completed | total 42880 [2021-03-05 10:18:55] -rw-------. 1 boinc boinc 161629 Mar 5 10:18 17bf99fe-04d3-43ea-9836-17f98657d6c0_94740.1.job.log [2021-03-05 10:18:55] -rw-------. 1 boinc boinc 461105 Mar 5 10:16 17bf99fe-04d3-43ea-9836-17f98657d6c0_94740.1.job.log.tgz [2021-03-05 10:18:55] -rw-------. 1 boinc boinc 141 Mar 5 10:04 3WSLDmgRKcyn7Olcko1bjSoqABFKDmABFKDmJZ5TDmHGFKDmEENrRo.diag [2021-03-05 10:18:55] -rw-------. 1 boinc boinc 1181523 Mar 5 10:04 agis_schedconf.cvmfs.json [2021-03-05 10:18:55] -rw-r--r--. 1 boinc boinc 0 Mar 5 10:04 boinc_lockfile [2021-03-05 10:18:55] -rw-r--r--. 1 boinc boinc 8192 Mar 5 10:17 boinc_mmap_file [2021-03-05 10:18:55] -rw-r--r--. 1 boinc boinc 532 Mar 5 10:11 boinc_task_state.xml [2021-03-05 10:18:55] -rw-------. 1 boinc boinc 1948208 Mar 5 10:04 cric_ddmendpoints.json [2021-03-05 10:18:55] -rw-r--r--. 1 boinc boinc 37620382 Mar 5 10:04 EVNT.04972714._000038.pool.root.1 [2021-03-05 10:18:55] -rw-------. 1 boinc boinc 2631 Mar 5 10:18 heartbeat.json [2021-03-05 10:18:55] -rw-r--r--. 1 boinc boinc 6223 Mar 5 10:04 init_data.xml [2021-03-05 10:18:55] -rw-r--r--. 1 boinc boinc 1048401 Mar 5 10:04 input.tar.gz [2021-03-05 10:18:55] -rw-r--r--. 1 boinc boinc 112 Mar 5 10:04 job.xml [2021-03-05 10:18:55] -rw-------. 1 boinc boinc 1020 Mar 5 10:16 memory_monitor_summary.json [2021-03-05 10:18:55] -rw-------. 
1 boinc boinc 282 Mar 5 10:16 output.list [2021-03-05 10:18:55] -rw-r--r--. 1 boinc boinc 2654 Mar 5 10:04 pandaJob.out [2021-03-05 10:18:55] drwxrwx---. 3 boinc boinc 4096 Mar 5 10:16 PanDA_Pilot-4991368413/ [2021-03-05 10:18:55] drwx------. 5 boinc boinc 4096 Mar 5 10:05 pilot2/ [2021-03-05 10:18:55] -rw-r--r--. 1 boinc boinc 1042975 Mar 5 08:23 pilot2.tar.gz [2021-03-05 10:18:55] -rw-------. 1 boinc boinc 143767 Mar 5 10:18 pilotlog.txt [2021-03-05 10:18:55] -rw-r--r--. 1 boinc boinc 4974 Mar 5 08:52 queuedata.json [2021-03-05 10:18:55] -rwxr-xr-x. 1 boinc boinc 5573 Mar 5 10:04 run_atlas* [2021-03-05 10:18:55] -rwx------. 1 boinc boinc 19623 Mar 5 08:52 runpilot2-wrapper.sh* [2021-03-05 10:18:55] -rw-r--r--. 1 boinc boinc 407 Mar 5 10:04 runtime_log [2021-03-05 10:18:55] -rw-r--r--. 1 boinc boinc 6441 Mar 5 10:04 runtime_log.err [2021-03-05 10:18:55] -rw-------. 1 boinc boinc 408 Mar 5 10:04 setup.sh.local [2021-03-05 10:18:55] drwxrwx--x. 2 boinc boinc 68 Mar 5 10:04 shared/ [2021-03-05 10:18:55] -rw-r--r--. 1 boinc boinc 16501 Mar 5 10:04 start_atlas.sh [2021-03-05 10:18:55] -rw-r--r--. 1 boinc boinc 1704 Mar 5 10:04 stderr.txt [2021-03-05 10:18:55] -rw-r--r--. 1 boinc boinc 107 Mar 5 10:04 wrapper_26015_x86_64-pc-linux-gnu [2021-03-05 10:18:55] -rw-r--r--. 
1 boinc boinc 24 Mar 5 10:17 wrapper_checkpoint.txt [2021-03-05 10:18:55] 2021-03-05 09:18:16,145 | INFO | retrieve | pilot.util.queuehandling | queue_report | queue jobs has 0 job(s) [2021-03-05 10:18:55] 2021-03-05 09:18:16,145 | INFO | retrieve | pilot.util.queuehandling | queue_report | queue payloads has 0 job(s) [2021-03-05 10:18:55] 2021-03-05 09:18:16,145 | INFO | retrieve | pilot.util.queuehandling | queue_report | queue data_in has 0 job(s) [2021-03-05 10:18:55] 2021-03-05 09:18:16,145 | INFO | retrieve | pilot.util.queuehandling | queue_report | queue data_out has 0 job(s) [2021-03-05 10:18:55] 2021-03-05 09:18:16,145 | INFO | retrieve | pilot.util.queuehandling | queue_report | queue current_data_in has 0 job(s) [2021-03-05 10:18:55] 2021-03-05 09:18:16,145 | INFO | retrieve | pilot.util.queuehandling | queue_report | queue validated_jobs has 0 job(s) [2021-03-05 10:18:55] 2021-03-05 09:18:16,145 | INFO | retrieve | pilot.util.queuehandling | queue_report | queue validated_payloads has 0 job(s) [2021-03-05 10:18:55] 2021-03-05 09:18:16,145 | INFO | retrieve | pilot.util.queuehandling | queue_report | queue monitored_payloads has 0 job(s) [2021-03-05 10:18:55] 2021-03-05 09:18:16,145 | INFO | retrieve | pilot.util.queuehandling | queue_report | queue finished_jobs has 0 job(s) [2021-03-05 10:18:55] 2021-03-05 09:18:16,145 | INFO | retrieve | pilot.util.queuehandling | queue_report | queue finished_payloads has 0 job(s) [2021-03-05 10:18:55] 2021-03-05 09:18:16,145 | INFO | retrieve | pilot.util.queuehandling | queue_report | queue finished_data_in has 1 job(s) [2021-03-05 10:18:55] 2021-03-05 09:18:16,146 | INFO | retrieve | pilot.util.queuehandling | queue_report | queue finished_data_out has 0 job(s) [2021-03-05 10:18:55] 2021-03-05 09:18:16,146 | INFO | retrieve | pilot.util.queuehandling | queue_report | queue failed_jobs has 0 job(s) [2021-03-05 10:18:55] 2021-03-05 09:18:16,146 | INFO | retrieve | pilot.util.queuehandling | queue_report | queue 
failed_payloads has 0 job(s) [2021-03-05 10:18:55] 2021-03-05 09:18:16,146 | INFO | retrieve | pilot.util.queuehandling | queue_report | queue failed_data_in has 0 job(s) [2021-03-05 10:18:55] 2021-03-05 09:18:16,146 | INFO | retrieve | pilot.util.queuehandling | queue_report | queue failed_data_out has 0 job(s) [2021-03-05 10:18:55] 2021-03-05 09:18:16,146 | INFO | retrieve | pilot.util.queuehandling | queue_report | queue completed_jobs has 0 job(s) [2021-03-05 10:18:55] 2021-03-05 09:18:16,146 | INFO | retrieve | pilot.util.queuehandling | queue_report | queue completed_jobids has 1 job(s) [2021-03-05 10:18:55] 2021-03-05 09:18:16,146 | INFO | retrieve | pilot.control.job | has_job_completed | job 4991368413 has completed (purged errors) [2021-03-05 10:18:55] 2021-03-05 09:18:16,146 | INFO | retrieve | pilot.util.processes | cleanup | overall cleanup function is called [2021-03-05 10:18:55] 2021-03-05 09:18:16,151 | DEBUG | retrieve | pilot.util.processes | cleanup | work directory was removed: /var/lib/boinc/slots/136/PanDA_Pilot-4991368413 [2021-03-05 10:18:55] 2021-03-05 09:18:17,156 | INFO | retrieve | pilot.info.jobdata | collect_zombies | --- collectZombieJob: --- 10, [9315] [2021-03-05 10:18:55] 2021-03-05 09:18:17,157 | INFO | retrieve | pilot.info.jobdata | collect_zombies | zombie collector trying to kill pid 9315 [2021-03-05 10:18:55] 2021-03-05 09:18:17,157 | INFO | retrieve | pilot.info.jobdata | collect_zombies | harmless exception when collecting zombies: [Errno 10] No child processes [2021-03-05 10:18:55] 2021-03-05 09:18:18,159 | INFO | retrieve | pilot.util.processes | cleanup | collected zombie processes [2021-03-05 10:18:55] 2021-03-05 09:18:18,160 | INFO | retrieve | pilot.util.processes | cleanup | will now attempt to kill all subprocesses of pid=9315 [2021-03-05 10:18:55] 2021-03-05 09:18:18,230 | INFO | retrieve | pilot.util.processes | kill_processes | process IDs to be killed: [9315] (in reverse order) [2021-03-05 10:18:55] 2021-03-05 
09:18:18,262 | WARNING | retrieve | pilot.util.processes | kill_processes | found no corresponding commands to process id(s) [2021-03-05 10:18:55] 2021-03-05 09:18:18,262 | INFO | retrieve | pilot.util.processes | kill_orphans | Do not look for orphan processes in BOINC jobs [2021-03-05 10:18:55] 2021-03-05 09:18:18,262 | DEBUG | retrieve | pilot.util.queuehandling | purge_queue | queue purged [2021-03-05 10:18:55] 2021-03-05 09:18:18,263 | INFO | retrieve | pilot.control.job | retrieve | ready for new job [2021-03-05 10:18:55] 2021-03-05 09:18:18,263 | INFO | retrieve | root | retrieve | pilot has finished for previous job - re-establishing logging [2021-03-05 10:18:55] 2021-03-05 09:18:18,263 | INFO | retrieve | pilot.util.auxiliary | pilot_version_banner | **************************************** [2021-03-05 10:18:55] 2021-03-05 09:18:18,264 | INFO | retrieve | pilot.util.auxiliary | pilot_version_banner | *** PanDA Pilot version 2.9.6 (20) *** [2021-03-05 10:18:55] 2021-03-05 09:18:18,264 | INFO | retrieve | pilot.util.auxiliary | pilot_version_banner | **************************************** [2021-03-05 10:18:55] 2021-03-05 09:18:18,264 | INFO | retrieve | pilot.util.auxiliary | pilot_version_banner | [2021-03-05 10:18:55] 2021-03-05 09:18:18,264 | INFO | retrieve | pilot.util.auxiliary | pilot_version_banner | pilot is running in a VM [2021-03-05 10:18:55] 2021-03-05 09:18:18,264 | INFO | retrieve | pilot.util.auxiliary | display_architecture_info | architecture information: [2021-03-05 10:18:55] 2021-03-05 09:18:18,332 | INFO | retrieve | pilot.util.auxiliary | display_architecture_info | [2021-03-05 10:18:55] LSB Version: :core-4.1-amd64:core-4.1-noarch [2021-03-05 10:18:55] Distributor ID: CentOS [2021-03-05 10:18:55] Description: CentOS Linux release 7.8.2003 (Core) [2021-03-05 10:18:55] Release: 7.8.2003 [2021-03-05 10:18:55] Codename: Core [2021-03-05 10:18:55] 2021-03-05 09:18:18,332 | INFO | retrieve | pilot.util.auxiliary | pilot_version_banner | 
**************************************** [2021-03-05 10:18:55] 2021-03-05 09:18:18,835 | DEBUG | retrieve | pilot.util.monitoring | check_local_space | checking local space on /var/lib/boinc/slots/136 [2021-03-05 10:18:55] 2021-03-05 09:18:18,850 | INFO | retrieve | pilot.util.monitoring | check_local_space | sufficient remaining disk space (16791896064 B) [2021-03-05 10:18:55] 2021-03-05 09:18:18,851 | WARNING | retrieve | pilot.control.job | proceed_with_getjob | since timefloor is set to 0, pilot was only allowed to run one job [2021-03-05 10:18:55] 2021-03-05 09:18:18,851 | DEBUG | retrieve | pilot.control.job | retrieve | will not set job_aborted yet [2021-03-05 10:18:55] 2021-03-05 09:18:18,851 | DEBUG | retrieve | pilot.control.job | retrieve | [job] retrieve thread has finished [2021-03-05 10:18:55] 2021-03-05 09:18:18,878 | WARNING | monitor | pilot.control.monitor | control | aborting monitor loop since graceful_stop has been set [2021-03-05 10:18:55] 2021-03-05 09:18:18,879 | INFO | monitor | pilot.control.monitor | control | [monitor] control thread has ended [2021-03-05 10:18:55] 2021-03-05 09:18:18,890 | WARNING | copytool_out | pilot.util.common | should_abort | data:copytool_out:received graceful stop - abort after this iteration [2021-03-05 10:18:55] 2021-03-05 09:18:19,046 | DEBUG | create_data_payload | pilot.control.job | create_data_payload | will not set job_aborted yet [2021-03-05 10:18:55] 2021-03-05 09:18:19,046 | DEBUG | create_data_payload | pilot.control.job | create_data_payload | [job] create_data_payload thread has finished [2021-03-05 10:18:55] 2021-03-05 09:18:19,091 | DEBUG | execute_payloads | pilot.control.payload | execute_payloads | will not set job_aborted yet [2021-03-05 10:18:55] 2021-03-05 09:18:19,091 | INFO | execute_payloads | pilot.control.payload | execute_payloads | [payload] execute_payloads thread has finished [2021-03-05 10:18:55] 2021-03-05 09:18:19,505 | DEBUG | copytool_in | pilot.control.data | copytool_in | 
will not set job_aborted yet [2021-03-05 10:18:55] 2021-03-05 09:18:19,505 | DEBUG | copytool_in | pilot.control.data | copytool_in | [data] copytool_in thread has finished [2021-03-05 10:18:55] 2021-03-05 09:18:19,523 | DEBUG | failed_post | pilot.control.payload | failed_post | will not set job_aborted yet [2021-03-05 10:18:55] 2021-03-05 09:18:19,523 | INFO | failed_post | pilot.control.payload | failed_post | [payload] failed_post thread has finished [2021-03-05 10:18:55] 2021-03-05 09:18:19,573 | DEBUG | validate_post | pilot.control.payload | validate_post | will not set job_aborted yet [2021-03-05 10:18:55] 2021-03-05 09:18:19,573 | INFO | validate_post | pilot.control.payload | validate_post | [payload] validate_post thread has finished [2021-03-05 10:18:55] 2021-03-05 09:18:19,594 | DEBUG | validate_pre | pilot.control.payload | validate_pre | will not set job_aborted yet [2021-03-05 10:18:55] 2021-03-05 09:18:19,594 | INFO | validate_pre | pilot.control.payload | validate_pre | [payload] validate_pre thread has finished [2021-03-05 10:18:55] 2021-03-05 09:18:19,760 | DEBUG | data | pilot.control.data | control | data control ending since graceful_stop has been set [2021-03-05 10:18:55] 2021-03-05 09:18:19,760 | DEBUG | data | pilot.control.data | control | will not set job_aborted yet [2021-03-05 10:18:55] 2021-03-05 09:18:19,760 | DEBUG | data | pilot.control.data | control | [data] control thread has finished [2021-03-05 10:18:55] 2021-03-05 09:18:19,819 | DEBUG | job | pilot.control.job | control | job control ending since graceful_stop has been set [2021-03-05 10:18:55] 2021-03-05 09:18:19,819 | DEBUG | job | pilot.control.job | control | will not set job_aborted yet [2021-03-05 10:18:55] 2021-03-05 09:18:19,819 | DEBUG | job | pilot.control.job | control | [job] control thread has finished [2021-03-05 10:18:55] 2021-03-05 09:18:19,891 | DEBUG | copytool_out | pilot.control.data | copytool_out | will not set job_aborted yet [2021-03-05 10:18:55] 
2021-03-05 09:18:19,891 | DEBUG | copytool_out | pilot.control.data | copytool_out | [data] copytool_out thread has finished [2021-03-05 10:18:55] 2021-03-05 09:18:19,990 | DEBUG | validate | pilot.control.job | validate | will not set job_aborted yet [2021-03-05 10:18:55] 2021-03-05 09:18:19,990 | DEBUG | validate | pilot.control.job | validate | [job] validate thread has finished [2021-03-05 10:18:55] 2021-03-05 09:18:20,054 | DEBUG | payload | pilot.control.payload | control | payload control ending since graceful_stop has been set [2021-03-05 10:18:55] 2021-03-05 09:18:20,054 | DEBUG | payload | pilot.control.payload | control | will not set job_aborted yet [2021-03-05 10:18:55] 2021-03-05 09:18:20,054 | DEBUG | payload | pilot.control.payload | control | [payload] control thread has finished [2021-03-05 10:18:55] 2021-03-05 09:18:20,920 | WARNING | queue_monitoring | pilot.util.common | should_abort | data:queue_monitoring:received graceful stop - abort after this iteration [2021-03-05 10:18:55] 2021-03-05 09:18:21,096 | WARNING | queue_monitor | pilot.util.common | should_abort | job:queue_monitor:received graceful stop - abort after this iteration [2021-03-05 10:18:55] 2021-03-05 09:18:21,097 | DEBUG | queue_monitor | pilot.control.job | queue_monitor | will not set job_aborted yet [2021-03-05 10:18:55] 2021-03-05 09:18:21,097 | DEBUG | queue_monitor | pilot.control.job | queue_monitor | [job] queue monitor thread has finished [2021-03-05 10:18:55] 2021-03-05 09:18:23,921 | DEBUG | queue_monitoring | pilot.control.data | queue_monitoring | will not set job_aborted yet [2021-03-05 10:18:55] 2021-03-05 09:18:23,921 | DEBUG | queue_monitoring | pilot.control.data | queue_monitoring | [data] queue_monitor thread has finished [2021-03-05 10:18:55] 2021-03-05 09:18:55,692 | WARNING | job_monitor | pilot.control.job | check_job_monitor_waiting_time | no jobs in monitored_payloads queue (waited for 61 s) [2021-03-05 10:18:55] 2021-03-05 09:18:55,692 | DEBUG | 
job_monitor | pilot.util.processes | threads_aborted | aborting since the last relevant thread is about to finish
[2021-03-05 10:18:55] 2021-03-05 09:18:55,692 | DEBUG | job_monitor | pilot.control.job | job_monitor | will proceed to set job_aborted
[2021-03-05 10:18:55] 2021-03-05 09:18:55,692 | DEBUG | job_monitor | pilot.control.job | job_monitor | [job] job monitor thread has finished
[2021-03-05 10:18:55] 2021-03-05 09:18:55,765 | INFO | MainThread | pilot.workflow.generic | run | end of generic workflow (traces error code: 0)
[2021-03-05 10:18:55] 2021-03-05 09:18:55,767 | INFO | MainThread | root | wrap_up | traces error code: 0
[2021-03-05 10:18:55] 2021-03-05 09:18:55,767 | INFO | MainThread | root | wrap_up | pilot has finished
[2021-03-05 10:18:55] 2021-03-05 09:18:55,828 [wrapper] ==== pilot stdout END ====
[2021-03-05 10:18:55] 2021-03-05 09:18:55,832 [wrapper] ==== wrapper stdout RESUME ====
[2021-03-05 10:18:55] 2021-03-05 09:18:55,835 [wrapper] Pilot exit status: 0
[2021-03-05 10:18:55] 2021-03-05 09:18:55,850 [wrapper] pandaids: 4991368413
[2021-03-05 10:18:55] 2021-03-05 09:18:55,856 [wrapper] apfmon messages muted
[2021-03-05 10:18:55] 2021-03-05 09:18:55,859 [wrapper] Test setup, not cleaning
[2021-03-05 10:18:55] 2021-03-05 09:18:55,862 [wrapper] ==== wrapper stdout END ====
[2021-03-05 10:18:55] 2021-03-05 09:18:55,866 [wrapper] ==== wrapper stderr END ====
[2021-03-05 10:18:55] 2021-03-05 09:18:55,872 [wrapper] wrapperexiting ec=0, duration=845
[2021-03-05 10:18:55] 2021-03-05 09:18:55,875 [wrapper] apfmon messages muted
[2021-03-05 10:18:55] *** Error codes and diagnostics ***
[2021-03-05 10:18:55] "exeErrorCode": 0,
[2021-03-05 10:18:55] "exeErrorDiag": "",
[2021-03-05 10:18:55] "pilotErrorCode": 1187,
[2021-03-05 10:18:55] "pilotErrorDiag": "Payload metadata does not exist",
[2021-03-05 10:18:55] *** Listing of results directory ***
[2021-03-05 10:18:55] total 43292
[2021-03-05 10:18:55] -rw-r--r--. 1 boinc boinc 1042975 Mar 5 08:23 pilot2.tar.gz
[2021-03-05 10:18:55] -rwx------. 1 boinc boinc 19623 Mar 5 08:52 runpilot2-wrapper.sh
[2021-03-05 10:18:55] -rw-r--r--. 1 boinc boinc 4974 Mar 5 08:52 queuedata.json
[2021-03-05 10:18:55] -rw-r--r--. 1 boinc boinc 107 Mar 5 10:04 wrapper_26015_x86_64-pc-linux-gnu
[2021-03-05 10:18:55] -rwxr-xr-x. 1 boinc boinc 5573 Mar 5 10:04 run_atlas
[2021-03-05 10:18:55] -rw-r--r--. 1 boinc boinc 112 Mar 5 10:04 job.xml
[2021-03-05 10:18:55] -rw-r--r--. 1 boinc boinc 6223 Mar 5 10:04 init_data.xml
[2021-03-05 10:18:55] drwxrwx--x. 2 boinc boinc 68 Mar 5 10:04 shared
[2021-03-05 10:18:55] -rw-r--r--. 1 boinc boinc 0 Mar 5 10:04 boinc_lockfile
[2021-03-05 10:18:55] -rw-r--r--. 1 boinc boinc 37620382 Mar 5 10:04 EVNT.04972714._000038.pool.root.1
[2021-03-05 10:18:55] -rw-r--r--. 1 boinc boinc 16501 Mar 5 10:04 start_atlas.sh
[2021-03-05 10:18:55] -rw-r--r--. 1 boinc boinc 1048401 Mar 5 10:04 input.tar.gz
[2021-03-05 10:18:55] -rw-r--r--. 1 boinc boinc 2654 Mar 5 10:04 pandaJob.out
[2021-03-05 10:18:55] -rw-------. 1 boinc boinc 408 Mar 5 10:04 setup.sh.local
[2021-03-05 10:18:55] -rw-------. 1 boinc boinc 1181523 Mar 5 10:04 agis_schedconf.cvmfs.json
[2021-03-05 10:18:55] -rw-------. 1 boinc boinc 1948208 Mar 5 10:04 cric_ddmendpoints.json
[2021-03-05 10:18:55] drwx------. 5 boinc boinc 4096 Mar 5 10:05 pilot2
[2021-03-05 10:18:55] -rw-r--r--. 1 boinc boinc 532 Mar 5 10:11 boinc_task_state.xml
[2021-03-05 10:18:55] -rw-------. 1 boinc boinc 1020 Mar 5 10:16 memory_monitor_summary.json
[2021-03-05 10:18:55] -rw-------. 1 boinc boinc 461105 Mar 5 10:16 17bf99fe-04d3-43ea-9836-17f98657d6c0_94740.1.job.log.tgz
[2021-03-05 10:18:55] -rw-------. 1 boinc boinc 2631 Mar 5 10:18 heartbeat.json
[2021-03-05 10:18:55] -rw-r--r--. 1 boinc boinc 8192 Mar 5 10:18 boinc_mmap_file
[2021-03-05 10:18:55] -rw-r--r--. 1 boinc boinc 24 Mar 5 10:18 wrapper_checkpoint.txt
[2021-03-05 10:18:55] -rw-------. 1 boinc boinc 8825 Mar 5 10:18 pilotlog.txt
[2021-03-05 10:18:55] -rw-------. 1 boinc boinc 178461 Mar 5 10:18 17bf99fe-04d3-43ea-9836-17f98657d6c0_94740.1.job.log
[2021-03-05 10:18:55] -rw-------. 1 boinc boinc 282 Mar 5 10:18 output.list
[2021-03-05 10:18:55] -rw-r--r--. 1 boinc boinc 686 Mar 5 10:18 runtime_log
[2021-03-05 10:18:55] -rw-------. 1 boinc boinc 655360 Mar 5 10:18 result.tar.gz
[2021-03-05 10:18:55] -rw-r--r--. 1 boinc boinc 9885 Mar 5 10:18 runtime_log.err
[2021-03-05 10:18:55] -rw-------. 1 boinc boinc 575 Mar 5 10:18 3WSLDmgRKcyn7Olcko1bjSoqABFKDmABFKDmJZ5TDmHGFKDmEENrRo.diag
[2021-03-05 10:18:55] -rw-r--r--. 1 boinc boinc 36693 Mar 5 10:18 stderr.txt
[2021-03-05 10:18:55] No HITS result produced
[2021-03-05 10:18:55] *** Contents of shared directory: ***
[2021-03-05 10:18:55] total 38424
[2021-03-05 10:18:55] -rw-r--r--. 1 boinc boinc 37620382 Mar 5 10:04 ATLAS.root_0
[2021-03-05 10:18:55] -rw-r--r--. 1 boinc boinc 16501 Mar 5 10:04 start_atlas.sh
[2021-03-05 10:18:55] -rw-r--r--. 1 boinc boinc 1048401 Mar 5 10:04 input.tar.gz
[2021-03-05 10:18:55] -rw-------. 1 boinc boinc 655360 Mar 5 10:18 result.tar.gz
10:18:57 (2287): run_atlas exited; CPU time 884.728582
10:18:57 (2287): called boinc_finish(0)
</stderr_txt>
]]>
©2024 CERN