Name | 7XrNDmEhugynfZGDcpSWOuwoABFKDmABFKDm2IFNDm3BFKDmVGxyOo_0 |
Workunit | 2063625 |
Created | 17 Mar 2021, 23:23:59 UTC |
Sent | 17 Mar 2021, 23:26:56 UTC |
Report deadline | 24 Mar 2021, 23:26:56 UTC |
Received | 17 Mar 2021, 23:41:26 UTC |
Server state | Over |
Outcome | Validate error |
Client state | Done |
Exit status | 0 (0x00000000) |
Computer ID | 4228 |
Run time | 14 min 11 sec |
CPU time | 15 min 24 sec |
Validate state | Invalid |
Credit | 0.00 |
Device peak FLOPS | 13.11 GFLOPS |
Application version | ATLAS long simulation v1.00 (native_mt) x86_64-pc-linux-gnu |
Peak working set size | 1.69 GB |
Peak swap size | 2.38 GB |
Peak disk usage | 90.49 MB |
<core_client_version>7.16.11</core_client_version> <![CDATA[ <stderr_txt> 00:27:03 (30687): wrapper (7.7.26015): starting 00:27:03 (30687): wrapper: running run_atlas (--nthreads 4) [2021-03-18 00:27:03] Arguments: --nthreads 4 [2021-03-18 00:27:03] Threads: 4 [2021-03-18 00:27:03] Checking for CVMFS [2021-03-18 00:27:04] Probing /cvmfs/atlas-condb.cern.ch... OK [2021-03-18 00:27:04] Probing /cvmfs/atlas.cern.ch... OK [2021-03-18 00:27:05] VERSION PID UPTIME(M) MEM(K) REVISION EXPIRES(M) NOCATALOGS CACHEUSE(K) CACHEMAX(K) NOFDUSE NOFDMAX NOIOERR NOOPEN HITRATE(%) RX(K) SPEED(K/S) HOST PROXY ONLINE [2021-03-18 00:27:05] 2.5.1.0 30996 0 22340 80983 3 1 12087958 20480000 0 65024 0 0 n/a 0 0 http://cvmfs-stratum-one.cern.ch/cvmfs/atlas.cern.ch http://188.184.31.232:3128 1 [2021-03-18 00:27:05] CVMFS is ok [2021-03-18 00:27:05] Using singularity image /cvmfs/atlas.cern.ch/repo/containers/images/singularity/x86_64-centos7.img [2021-03-18 00:27:05] Checking for singularity binary... [2021-03-18 00:27:05] Using singularity found in PATH at /usr/bin/singularity [2021-03-18 00:27:05] Running /usr/bin/singularity --version [2021-03-18 00:27:05] singularity version 3.7.2-1.el7 [2021-03-18 00:27:05] Checking singularity works with /usr/bin/singularity exec -B /cvmfs /cvmfs/atlas.cern.ch/repo/containers/images/singularity/x86_64-centos7.img hostname [2021-03-18 00:27:05] dcameron05.cern.ch [2021-03-18 00:27:05] Singularity works [2021-03-18 00:27:05] Set ATHENA_PROC_NUMBER=4 [2021-03-18 00:27:05] Starting ATLAS job with PandaID=5002450810 [2021-03-18 00:27:05] Running command: /usr/bin/singularity exec --pwd /var/lib/boinc/slots/136 -B /cvmfs,/var /cvmfs/atlas.cern.ch/repo/containers/images/singularity/x86_64-centos7.img sh start_atlas.sh [2021-03-18 00:41:12] *** The last 200 lines of the pilot log: *** [2021-03-18 00:41:12] 2021-03-17 23:40:36,424 | INFO | queue_monitor | pilot.util.timing | timing_report | . 
total setup = 10 s [2021-03-18 00:41:12] 2021-03-17 23:40:36,424 | INFO | queue_monitor | pilot.util.timing | timing_report | . stage-in = 0 s [2021-03-18 00:41:12] 2021-03-17 23:40:36,424 | INFO | queue_monitor | pilot.util.timing | timing_report | . payload execution = 663 s [2021-03-18 00:41:12] 2021-03-17 23:40:36,424 | INFO | queue_monitor | pilot.util.timing | timing_report | . stage-out = 0 s [2021-03-18 00:41:12] 2021-03-17 23:40:36,424 | INFO | queue_monitor | pilot.util.timing | timing_report | .............................. [2021-03-18 00:41:12] 2021-03-17 23:40:36,424 | INFO | queue_monitor | pilot.user.atlas.diagnose | get_log_extracts | building log extracts (sent to the server as 'pilotLog') [2021-03-18 00:41:12] 2021-03-17 23:40:36,424 | DEBUG | queue_monitor | pilot.user.atlas.diagnose | get_panda_tracer_log | PanDA tracer log does not exist: /var/lib/boinc/slots/136/PanDA_Pilot-5002450810/pandatracerlog.txt (ignoring) [2021-03-18 00:41:12] 2021-03-17 23:40:36,424 | INFO | queue_monitor | pilot.util.container | execute | executing command: tail -n 20 /var/lib/boinc/slots/136/PanDA_Pilot-5002450810/pilotlog.txt [2021-03-18 00:41:12] 2021-03-17 23:40:36,495 | WARNING | queue_monitor | pilot.control.job | add_timing_and_extracts | [2021-03-18 00:41:12] XXXXXXXXXXXXXXXXXXXXX[begin log extracts] [2021-03-18 00:41:12] - Log from pilotlog.txt - [2021-03-18 00:41:12] 2021-03-17 23:40:36,418 | DEBUG | queue_monitor | pilot.user.atlas.jobmetrics | get_job_metrics | job metrics="actualCoreCount=2" [2021-03-18 00:41:12] 2021-03-17 23:40:36,418 | INFO | queue_monitor | pilot.control.job | get_data_structure | mean actualcorecount: 2.666667 [2021-03-18 00:41:12] 2021-03-17 23:40:36,419 | INFO | queue_monitor | pilot.control.job | get_data_structure | payload/TRF did not report the number of read events [2021-03-18 00:41:12] 2021-03-17 23:40:36,420 | INFO | queue_monitor | pilot.user.atlas.utilities | get_memory_values | using path: 
/var/lib/boinc/slots/136/PanDA_Pilot-5002450810/memory_monitor_summary.json (trf name=prmon) [2021-03-18 00:41:12] 2021-03-17 23:40:36,423 | DEBUG | queue_monitor | pilot.user.atlas.utilities | get_memory_monitor_info | summary_dictionary={'Max': {'rx_packets': 73313, 'nprocs': 6, 'nthreads': 7, 'rx_bytes': 15590948, 'wtime': 615, 'rss': 4267456, 'write_bytes': 5853184, 'vmem': 6476348, 'read_bytes': 992386048, 'stime': 22, 'tx_bytes': 1629713, 'pss': 2174234, 'wchar': 6443710, 'rchar': 320347931, 'tx_packets': 4819, 'swap': 0, 'utime': 899}, 'Avg': {'write_bytes': 9516.0, 'nprocs': 5.09, 'nthreads': 6.0, 'rx_bytes': 25349.0, 'rx_packets': 119.2, 'vmem': 4279275.0, 'read_bytes': 1613530.0, 'swap': 0.0, 'tx_bytes': 2649.0, 'pss': 1488620.0, 'wchar': 10476.0, 'rchar': 520857.0, 'tx_packets': 7.835, 'rss': 2727421.0}, 'HW': {'mem': {'MemTotal': 7310268}, 'cpu': {'CoresPerSocket': 1, 'ModelName': 'Intel Core Processor (Broadwell, IBRS)', 'ThreadsPerCore': 1, 'CPUs': 4, 'Sockets': 4}}, 'prmon': {'Version': '2.2.0'}} [2021-03-18 00:41:12] 2021-03-17 23:40:36,423 | INFO | queue_monitor | pilot.user.atlas.utilities | get_memory_monitor_info | extracted standard info from prmon json [2021-03-18 00:41:12] 2021-03-17 23:40:36,423 | INFO | queue_monitor | pilot.user.atlas.utilities | get_memory_monitor_info | extracted standard memory fields from prmon json [2021-03-18 00:41:12] 2021-03-17 23:40:36,423 | INFO | queue_monitor | pilot.util.timing | timing_report | .............................. [2021-03-18 00:41:12] 2021-03-17 23:40:36,423 | INFO | queue_monitor | pilot.util.timing | timing_report | . Timing measurements: [2021-03-18 00:41:12] 2021-03-17 23:40:36,423 | INFO | queue_monitor | pilot.util.timing | timing_report | . get job = 0 s [2021-03-18 00:41:12] 2021-03-17 23:40:36,424 | INFO | queue_monitor | pilot.util.timing | timing_report | . initial setup = 0 s [2021-03-18 00:41:12] 2021-03-17 23:40:36,424 | INFO | queue_monitor | pilot.util.timing | timing_report | . 
payload setup = 10 s [2021-03-18 00:41:12] 2021-03-17 23:40:36,424 | INFO | queue_monitor | pilot.util.timing | timing_report | . total setup = 10 s [2021-03-18 00:41:12] 2021-03-17 23:40:36,424 | INFO | queue_monitor | pilot.util.timing | timing_report | . stage-in = 0 s [2021-03-18 00:41:12] 2021-03-17 23:40:36,424 | INFO | queue_monitor | pilot.util.timing | timing_report | . payload execution = 663 s [2021-03-18 00:41:12] 2021-03-17 23:40:36,424 | INFO | queue_monitor | pilot.util.timing | timing_report | . stage-out = 0 s [2021-03-18 00:41:12] 2021-03-17 23:40:36,424 | INFO | queue_monitor | pilot.util.timing | timing_report | .............................. [2021-03-18 00:41:12] 2021-03-17 23:40:36,424 | INFO | queue_monitor | pilot.user.atlas.diagnose | get_log_extracts | building log extracts (sent to the server as 'pilotLog') [2021-03-18 00:41:12] 2021-03-17 23:40:36,424 | DEBUG | queue_monitor | pilot.user.atlas.diagnose | get_panda_tracer_log | PanDA tracer log does not exist: /var/lib/boinc/slots/136/PanDA_Pilot-5002450810/pandatracerlog.txt (ignoring) [2021-03-18 00:41:12] 2021-03-17 23:40:36,424 | INFO | queue_monitor | pilot.util.container | execute | executing command: tail -n 20 /var/lib/boinc/slots/136/PanDA_Pilot-5002450810/pilotlog.txt [2021-03-18 00:41:12] XXXXXXXXXXXXXXXXXXXXX[end log extracts] [2021-03-18 00:41:12] 2021-03-17 23:40:36,495 | WARNING | queue_monitor | pilot.control.job | add_error_codes | pilotErrorCodes = [1187] (will report primary/first error code) [2021-03-18 00:41:12] 2021-03-17 23:40:36,496 | WARNING | queue_monitor | pilot.control.job | add_error_codes | pilotErrorDiags = ['Payload metadata does not exist'] (will report primary/first error diag) [2021-03-18 00:41:12] 2021-03-17 23:40:36,496 | DEBUG | queue_monitor | pilot.control.job | send_state | is_harvester_mode(args) : False [2021-03-18 00:41:12] 2021-03-17 23:40:36,496 | DEBUG | queue_monitor | pilot.control.job | write_heartbeat_to_file | heartbeat dictionary: 
{'pilotErrorCode': 1187, 'rateWBYTES': 9516.0, 'pilotID': 'http://aipanda403.cern.ch/data/jobs/2021-03-17/BOINC-TEST/5002450810.out|PR|2.9.6 (20)', 'meanCoreCount': 2.6666666666666665, 'totRBYTES': 992386048, 'siteName': 'BOINC-TEST', 'avgVMEM': 4279275.0, 'coreCount': 4, 'totWCHAR': 6443710, 'rateRCHAR': 520857.0, 'jobId': '5002450810', 'exeErrorCode': 0, 'rateWCHAR': 10476.0, 'totRCHAR': 320347931, 'xml': '{"f35d712b-e4fe-4ebf-8c78-e324895d6f56_39406.1.job.log.tgz": {"adler32": "a131b074", "surl": "root://eosatlas.cern.ch:1094//eos/atlas/atlasdatadisk/rucio/hc_test/7a/9e/f35d712b-e4fe-4ebf-8c78-e324895d6f56_39406.1.job.log.tgz", "guid": "c28f3fff-6bc8-4cdc-ab14-f67d2607a7d8", "fsize": 458878}}', 'maxVMEM': 6476348, 'cpuConversionFactor': 1.0, 'avgSWAP': 0.0, 'state': 'failed', 'transExitCode': 252, 'pilotErrorDiag': 'Payload metadata does not exist', 'node': 'dcameron05.cern.ch', 'avgRSS': 2727421.0, 'avgPSS': 1488620.0, 'timestamp': '2021-03-18T00:40:36+01:00', 'pilotTiming': '0|0|663|0|10', 'attemptNr': 0, 'totWBYTES': 5853184, 'rateRBYTES': 1613530.0, 'pilotLog': '- Log from pilotlog.txt -\n2021-03-17 23:40:36,418 | DEBUG | queue_monitor | pilot.user.atlas.jobmetrics | get_job_metrics | job metrics="actualCoreCount=2"\n2021-03-17 23:40:36,418 | INFO | queue_monitor | pilot.control.job | get_data_structure | mean actualcorecount: 2.666667\n2021-03-17 23:40:36,419 | INFO | queue_monitor | pilot.control.job | get_data_structure | payload/TRF did not report the number of read events\n2021-03-17 23:40:36,420 | INFO | queue_monitor | pilot.user.atlas.utilities | get_memory_values | using path: /var/lib/boinc/slots/136/PanDA_Pilot-5002450810/memory_monitor_summary.json (trf name=prmon)\n2021-03-17 23:40:36,423 | DEBUG | queue_monitor | pilot.user.atlas.utilities | get_memory_monitor_info | summary_dictionary={\'Max\': {\'rx_packets\': 73313, \'nprocs\': 6, \'nthreads\': 7, \'rx_bytes\': 15590948, \'wtime\': 615, \'rss\': 4267456, \'write_bytes\': 5853184, \'vmem\': 
647634', 'cpuConsumptionTime': 945, 'startTime': 1616023636.125714, 'cpuConsumptionUnit': 's+Intel Core Processor (Broadwell, IBRS) 16384 KB', 'exeErrorDiag': '', 'maxSWAP': 0, 'jobMetrics': 'actualCoreCount=2', 'maxRSS': 4267456, 'schedulerID': 'harvester-CERN_central_ACTA', 'endTime': 1616024436.495934, 'maxPSS': 2174234} [2021-03-18 00:41:12] 2021-03-17 23:40:36,497 | DEBUG | queue_monitor | pilot.control.job | write_heartbeat_to_file | wrote heartbeat to file /var/lib/boinc/slots/136/heartbeat.json [2021-03-18 00:41:12] 2021-03-17 23:40:36,497 | DEBUG | queue_monitor | pilot.control.job | queue_monitor | job 5002450810 was dequeued from the monitored payloads queue [2021-03-18 00:41:12] 2021-03-17 23:40:36,672 | DEBUG | queue_monitor | pilot.control.job | queue_monitor | tmp job object deleted [2021-03-18 00:41:12] 2021-03-17 23:40:37,157 | INFO | retrieve | pilot.control.job | make_job_report | [2021-03-18 00:41:12] 2021-03-17 23:40:37,157 | INFO | retrieve | pilot.control.job | make_job_report | job summary report [2021-03-18 00:41:12] 2021-03-17 23:40:37,157 | INFO | retrieve | pilot.control.job | make_job_report | -------------------------------------------------- [2021-03-18 00:41:12] 2021-03-17 23:40:37,157 | INFO | retrieve | pilot.control.job | make_job_report | PanDA job id: 5002450810 [2021-03-18 00:41:12] 2021-03-17 23:40:37,157 | INFO | retrieve | pilot.control.job | make_job_report | task id: NULL [2021-03-18 00:41:12] 2021-03-17 23:40:37,158 | INFO | retrieve | pilot.control.job | make_job_report | error 1/1: 1187: Payload metadata does not exist [2021-03-18 00:41:12] 2021-03-17 23:40:37,158 | INFO | retrieve | pilot.control.job | make_job_report | pilot error code: 1187 [2021-03-18 00:41:12] 2021-03-17 23:40:37,158 | INFO | retrieve | pilot.control.job | make_job_report | pilot error diag: metadata does not exist: /var/lib/boinc/slots/136/PanDA_Pilot-5002450810/metadata.xml [2021-03-18 00:41:12] 2021-03-17 23:40:37,158 | INFO | retrieve | 
pilot.control.job | make_job_report | status: LOG_TRANSFER = DONE [2021-03-18 00:41:12] 2021-03-17 23:40:37,158 | INFO | retrieve | pilot.control.job | make_job_report | pilot state: failed [2021-03-18 00:41:12] 2021-03-17 23:40:37,158 | INFO | retrieve | pilot.control.job | make_job_report | transexitcode: 252 [2021-03-18 00:41:12] 2021-03-17 23:40:37,158 | INFO | retrieve | pilot.control.job | make_job_report | exeerrorcode: 0 [2021-03-18 00:41:12] 2021-03-17 23:40:37,158 | INFO | retrieve | pilot.control.job | make_job_report | exeerrordiag: [2021-03-18 00:41:12] 2021-03-17 23:40:37,158 | INFO | retrieve | pilot.control.job | make_job_report | exitcode: 0 [2021-03-18 00:41:12] 2021-03-17 23:40:37,158 | INFO | retrieve | pilot.control.job | make_job_report | exitmsg: [2021-03-18 00:41:12] 2021-03-17 23:40:37,158 | INFO | retrieve | pilot.control.job | make_job_report | cpuconsumptiontime: 945 s [2021-03-18 00:41:12] 2021-03-17 23:40:37,159 | INFO | retrieve | pilot.control.job | make_job_report | nevents: 0 [2021-03-18 00:41:12] 2021-03-17 23:40:37,159 | INFO | retrieve | pilot.control.job | make_job_report | neventsw: 0 [2021-03-18 00:41:12] 2021-03-17 23:40:37,159 | INFO | retrieve | pilot.control.job | make_job_report | pid: 5426 [2021-03-18 00:41:12] 2021-03-17 23:40:37,159 | INFO | retrieve | pilot.control.job | make_job_report | pgrp: 5426 [2021-03-18 00:41:12] 2021-03-17 23:40:37,159 | INFO | retrieve | pilot.control.job | make_job_report | corecount: 4 [2021-03-18 00:41:12] 2021-03-17 23:40:37,159 | INFO | retrieve | pilot.control.job | make_job_report | event service: False [2021-03-18 00:41:12] 2021-03-17 23:40:37,159 | INFO | retrieve | pilot.control.job | make_job_report | sizes: {0: 8515035, 1: 8515103, 2: 8515326, 677: 8519323, 678: 8519017, 679: 8523772, 680: 8524588, 799: 8524592} [2021-03-18 00:41:12] 2021-03-17 23:40:37,159 | INFO | retrieve | pilot.control.job | make_job_report | -------------------------------------------------- [2021-03-18 
00:41:12] 2021-03-17 23:40:37,159 | INFO | retrieve | pilot.control.job | make_job_report | [2021-03-18 00:41:12] 2021-03-17 23:40:37,159 | DEBUG | retrieve | pilot.control.job | has_job_completed | ls -lF /var/lib/boinc/slots/136: [2021-03-18 00:41:12] [2021-03-18 00:41:12] 2021-03-17 23:40:37,160 | INFO | retrieve | pilot.util.container | execute | executing command: ls -lF /var/lib/boinc/slots/136 [2021-03-18 00:41:12] 2021-03-17 23:40:37,184 | DEBUG | retrieve | pilot.control.job | has_job_completed | total 43424 [2021-03-18 00:41:12] -rw-------. 1 boinc boinc 141 Mar 18 00:27 7XrNDmEhugynfZGDcpSWOuwoABFKDmABFKDm2IFNDm3BFKDmVGxyOo.diag [2021-03-18 00:41:12] -rw-------. 1 boinc boinc 1174578 Mar 18 00:27 agis_schedconf.cvmfs.json [2021-03-18 00:41:12] -rw-r--r--. 1 boinc boinc 0 Mar 18 00:27 boinc_lockfile [2021-03-18 00:41:12] -rw-r--r--. 1 boinc boinc 8192 Mar 18 00:40 boinc_mmap_file [2021-03-18 00:41:12] -rw-r--r--. 1 boinc boinc 532 Mar 18 00:34 boinc_task_state.xml [2021-03-18 00:41:12] -rw-------. 1 boinc boinc 1948795 Mar 18 00:27 cric_ddmendpoints.json [2021-03-18 00:41:12] -rw-r--r--. 1 boinc boinc 38184411 Mar 18 00:27 EVNT.04972714._000028.pool.root.1 [2021-03-18 00:41:12] -rw-------. 1 boinc boinc 162007 Mar 18 00:40 f35d712b-e4fe-4ebf-8c78-e324895d6f56_39406.1.job.log [2021-03-18 00:41:12] -rw-------. 1 boinc boinc 458878 Mar 18 00:38 f35d712b-e4fe-4ebf-8c78-e324895d6f56_39406.1.job.log.tgz [2021-03-18 00:41:12] -rw-------. 1 boinc boinc 2614 Mar 18 00:40 heartbeat.json [2021-03-18 00:41:12] -rw-r--r--. 1 boinc boinc 6250 Mar 18 00:27 init_data.xml [2021-03-18 00:41:12] -rw-r--r--. 1 boinc boinc 1048520 Mar 18 00:27 input.tar.gz [2021-03-18 00:41:12] -rw-r--r--. 1 boinc boinc 112 Mar 18 00:27 job.xml [2021-03-18 00:41:12] -rw-------. 1 boinc boinc 998 Mar 18 00:38 memory_monitor_summary.json [2021-03-18 00:41:12] -rw-------. 1 boinc boinc 268 Mar 18 00:38 output.list [2021-03-18 00:41:12] -rw-r--r--. 
1 boinc boinc 2652 Mar 18 00:27 pandaJob.out [2021-03-18 00:41:12] drwxrwx---. 3 boinc boinc 4096 Mar 18 00:38 PanDA_Pilot-5002450810/ [2021-03-18 00:41:12] drwx------. 5 boinc boinc 4096 Mar 18 00:27 pilot2/ [2021-03-18 00:41:12] -rw-r--r--. 1 boinc boinc 1042975 Mar 17 23:28 pilot2.tar.gz [2021-03-18 00:41:12] -rw-------. 1 boinc boinc 144237 Mar 18 00:40 pilotlog.txt [2021-03-18 00:41:12] -rw-r--r--. 1 boinc boinc 4974 Mar 18 00:23 queuedata.json [2021-03-18 00:41:12] -rwxr-xr-x. 1 boinc boinc 5573 Mar 18 00:27 run_atlas* [2021-03-18 00:41:12] -rwx------. 1 boinc boinc 20043 Mar 18 00:23 runpilot2-wrapper.sh* [2021-03-18 00:41:12] -rw-r--r--. 1 boinc boinc 407 Mar 18 00:27 runtime_log [2021-03-18 00:41:12] -rw-r--r--. 1 boinc boinc 6449 Mar 18 00:27 runtime_log.err [2021-03-18 00:41:12] -rw-------. 1 boinc boinc 408 Mar 18 00:27 setup.sh.local [2021-03-18 00:41:12] drwxrwx--x. 2 boinc boinc 68 Mar 18 00:27 shared/ [2021-03-18 00:41:12] -rw-r--r--. 1 boinc boinc 16507 Mar 18 00:27 start_atlas.sh [2021-03-18 00:41:12] -rw-r--r--. 1 boinc boinc 1681 Mar 18 00:27 stderr.txt [2021-03-18 00:41:12] -rw-r--r--. 1 boinc boinc 107 Mar 18 00:27 wrapper_26015_x86_64-pc-linux-gnu [2021-03-18 00:41:12] -rw-r--r--. 
1 boinc boinc 24 Mar 18 00:40 wrapper_checkpoint.txt [2021-03-18 00:41:12] 2021-03-17 23:40:37,185 | INFO | retrieve | pilot.util.queuehandling | queue_report | queue jobs has 0 job(s) [2021-03-18 00:41:12] 2021-03-17 23:40:37,185 | INFO | retrieve | pilot.util.queuehandling | queue_report | queue payloads has 0 job(s) [2021-03-18 00:41:12] 2021-03-17 23:40:37,185 | INFO | retrieve | pilot.util.queuehandling | queue_report | queue data_in has 0 job(s) [2021-03-18 00:41:12] 2021-03-17 23:40:37,185 | INFO | retrieve | pilot.util.queuehandling | queue_report | queue data_out has 0 job(s) [2021-03-18 00:41:12] 2021-03-17 23:40:37,185 | INFO | retrieve | pilot.util.queuehandling | queue_report | queue current_data_in has 0 job(s) [2021-03-18 00:41:12] 2021-03-17 23:40:37,185 | INFO | retrieve | pilot.util.queuehandling | queue_report | queue validated_jobs has 0 job(s) [2021-03-18 00:41:12] 2021-03-17 23:40:37,185 | INFO | retrieve | pilot.util.queuehandling | queue_report | queue validated_payloads has 0 job(s) [2021-03-18 00:41:12] 2021-03-17 23:40:37,186 | INFO | retrieve | pilot.util.queuehandling | queue_report | queue monitored_payloads has 0 job(s) [2021-03-18 00:41:12] 2021-03-17 23:40:37,186 | INFO | retrieve | pilot.util.queuehandling | queue_report | queue finished_jobs has 0 job(s) [2021-03-18 00:41:12] 2021-03-17 23:40:37,186 | INFO | retrieve | pilot.util.queuehandling | queue_report | queue finished_payloads has 0 job(s) [2021-03-18 00:41:12] 2021-03-17 23:40:37,186 | INFO | retrieve | pilot.util.queuehandling | queue_report | queue finished_data_in has 1 job(s) [2021-03-18 00:41:12] 2021-03-17 23:40:37,186 | INFO | retrieve | pilot.util.queuehandling | queue_report | queue finished_data_out has 0 job(s) [2021-03-18 00:41:12] 2021-03-17 23:40:37,186 | INFO | retrieve | pilot.util.queuehandling | queue_report | queue failed_jobs has 0 job(s) [2021-03-18 00:41:12] 2021-03-17 23:40:37,186 | INFO | retrieve | pilot.util.queuehandling | queue_report | queue 
failed_payloads has 0 job(s) [2021-03-18 00:41:12] 2021-03-17 23:40:37,186 | INFO | retrieve | pilot.util.queuehandling | queue_report | queue failed_data_in has 0 job(s) [2021-03-18 00:41:12] 2021-03-17 23:40:37,186 | INFO | retrieve | pilot.util.queuehandling | queue_report | queue failed_data_out has 0 job(s) [2021-03-18 00:41:12] 2021-03-17 23:40:37,186 | INFO | retrieve | pilot.util.queuehandling | queue_report | queue completed_jobs has 0 job(s) [2021-03-18 00:41:12] 2021-03-17 23:40:37,186 | INFO | retrieve | pilot.util.queuehandling | queue_report | queue completed_jobids has 1 job(s) [2021-03-18 00:41:12] 2021-03-17 23:40:37,186 | INFO | retrieve | pilot.control.job | has_job_completed | job 5002450810 has completed (purged errors) [2021-03-18 00:41:12] 2021-03-17 23:40:37,186 | INFO | retrieve | pilot.util.processes | cleanup | overall cleanup function is called [2021-03-18 00:41:12] 2021-03-17 23:40:37,194 | DEBUG | retrieve | pilot.util.processes | cleanup | work directory was removed: /var/lib/boinc/slots/136/PanDA_Pilot-5002450810 [2021-03-18 00:41:12] 2021-03-17 23:40:38,199 | INFO | retrieve | pilot.info.jobdata | collect_zombies | --- collectZombieJob: --- 10, [5426] [2021-03-18 00:41:12] 2021-03-17 23:40:38,200 | INFO | retrieve | pilot.info.jobdata | collect_zombies | zombie collector trying to kill pid 5426 [2021-03-18 00:41:12] 2021-03-17 23:40:38,200 | INFO | retrieve | pilot.info.jobdata | collect_zombies | harmless exception when collecting zombies: [Errno 10] No child processes [2021-03-18 00:41:12] 2021-03-17 23:40:39,204 | INFO | retrieve | pilot.util.processes | cleanup | collected zombie processes [2021-03-18 00:41:12] 2021-03-17 23:40:39,205 | INFO | retrieve | pilot.util.processes | cleanup | will now attempt to kill all subprocesses of pid=5426 [2021-03-18 00:41:12] 2021-03-17 23:40:39,265 | INFO | retrieve | pilot.util.processes | kill_processes | process IDs to be killed: [5426] (in reverse order) [2021-03-18 00:41:12] 2021-03-17 
23:40:39,301 | WARNING | retrieve | pilot.util.processes | kill_processes | found no corresponding commands to process id(s) [2021-03-18 00:41:12] 2021-03-17 23:40:39,302 | INFO | retrieve | pilot.util.processes | kill_orphans | Do not look for orphan processes in BOINC jobs [2021-03-18 00:41:12] 2021-03-17 23:40:39,302 | DEBUG | retrieve | pilot.util.queuehandling | purge_queue | queue purged [2021-03-18 00:41:12] 2021-03-17 23:40:39,303 | INFO | retrieve | pilot.control.job | retrieve | ready for new job [2021-03-18 00:41:12] 2021-03-17 23:40:39,303 | INFO | retrieve | root | retrieve | pilot has finished for previous job - re-establishing logging [2021-03-18 00:41:12] 2021-03-17 23:40:39,304 | INFO | retrieve | pilot.util.auxiliary | pilot_version_banner | **************************************** [2021-03-18 00:41:12] 2021-03-17 23:40:39,304 | INFO | retrieve | pilot.util.auxiliary | pilot_version_banner | *** PanDA Pilot version 2.9.6 (20) *** [2021-03-18 00:41:12] 2021-03-17 23:40:39,304 | INFO | retrieve | pilot.util.auxiliary | pilot_version_banner | **************************************** [2021-03-18 00:41:12] 2021-03-17 23:40:39,304 | INFO | retrieve | pilot.util.auxiliary | pilot_version_banner | [2021-03-18 00:41:12] 2021-03-17 23:40:39,304 | INFO | retrieve | pilot.util.auxiliary | pilot_version_banner | pilot is running in a VM [2021-03-18 00:41:12] 2021-03-17 23:40:39,304 | INFO | retrieve | pilot.util.auxiliary | display_architecture_info | architecture information: [2021-03-18 00:41:12] 2021-03-17 23:40:39,374 | INFO | retrieve | pilot.util.auxiliary | display_architecture_info | [2021-03-18 00:41:12] LSB Version: :core-4.1-amd64:core-4.1-noarch [2021-03-18 00:41:12] Distributor ID: CentOS [2021-03-18 00:41:12] Description: CentOS Linux release 7.8.2003 (Core) [2021-03-18 00:41:12] Release: 7.8.2003 [2021-03-18 00:41:12] Codename: Core [2021-03-18 00:41:12] 2021-03-17 23:40:39,375 | INFO | retrieve | pilot.util.auxiliary | pilot_version_banner | 
**************************************** [2021-03-18 00:41:12] 2021-03-17 23:40:39,877 | DEBUG | retrieve | pilot.util.monitoring | check_local_space | checking local space on /var/lib/boinc/slots/136 [2021-03-18 00:41:12] 2021-03-17 23:40:39,894 | INFO | retrieve | pilot.util.monitoring | check_local_space | sufficient remaining disk space (24699207680 B) [2021-03-18 00:41:12] 2021-03-17 23:40:39,894 | WARNING | retrieve | pilot.control.job | proceed_with_getjob | since timefloor is set to 0, pilot was only allowed to run one job [2021-03-18 00:41:12] 2021-03-17 23:40:39,895 | DEBUG | retrieve | pilot.control.job | retrieve | will not set job_aborted yet [2021-03-18 00:41:12] 2021-03-17 23:40:39,895 | DEBUG | retrieve | pilot.control.job | retrieve | [job] retrieve thread has finished [2021-03-18 00:41:12] 2021-03-17 23:40:39,901 | WARNING | monitor | pilot.control.monitor | control | aborting monitor loop since graceful_stop has been set [2021-03-18 00:41:12] 2021-03-17 23:40:39,901 | INFO | monitor | pilot.control.monitor | control | [monitor] control thread has ended [2021-03-18 00:41:12] 2021-03-17 23:40:39,917 | WARNING | copytool_out | pilot.util.common | should_abort | data:copytool_out:received graceful stop - abort after this iteration [2021-03-18 00:41:12] 2021-03-17 23:40:39,962 | DEBUG | copytool_in | pilot.control.data | copytool_in | will not set job_aborted yet [2021-03-18 00:41:12] 2021-03-17 23:40:39,962 | DEBUG | copytool_in | pilot.control.data | copytool_in | [data] copytool_in thread has finished [2021-03-18 00:41:12] 2021-03-17 23:40:39,976 | DEBUG | validate_pre | pilot.control.payload | validate_pre | will not set job_aborted yet [2021-03-18 00:41:12] 2021-03-17 23:40:39,976 | INFO | validate_pre | pilot.control.payload | validate_pre | [payload] validate_pre thread has finished [2021-03-18 00:41:12] 2021-03-17 23:40:40,088 | DEBUG | payload | pilot.control.payload | control | payload control ending since graceful_stop has been set 
[2021-03-18 00:41:12] 2021-03-17 23:40:40,088 | DEBUG | payload | pilot.control.payload | control | will not set job_aborted yet [2021-03-18 00:41:12] 2021-03-17 23:40:40,088 | DEBUG | payload | pilot.control.payload | control | [payload] control thread has finished [2021-03-18 00:41:12] 2021-03-17 23:40:40,088 | DEBUG | validate_post | pilot.control.payload | validate_post | will not set job_aborted yet [2021-03-18 00:41:12] 2021-03-17 23:40:40,088 | INFO | validate_post | pilot.control.payload | validate_post | [payload] validate_post thread has finished [2021-03-18 00:41:12] 2021-03-17 23:40:40,188 | DEBUG | job | pilot.control.job | control | job control ending since graceful_stop has been set [2021-03-18 00:41:12] 2021-03-17 23:40:40,188 | DEBUG | job | pilot.control.job | control | will not set job_aborted yet [2021-03-18 00:41:12] 2021-03-17 23:40:40,189 | DEBUG | job | pilot.control.job | control | [job] control thread has finished [2021-03-18 00:41:12] 2021-03-17 23:40:40,308 | DEBUG | validate | pilot.control.job | validate | will not set job_aborted yet [2021-03-18 00:41:12] 2021-03-17 23:40:40,308 | DEBUG | validate | pilot.control.job | validate | [job] validate thread has finished [2021-03-18 00:41:12] 2021-03-17 23:40:40,777 | DEBUG | failed_post | pilot.control.payload | failed_post | will not set job_aborted yet [2021-03-18 00:41:12] 2021-03-17 23:40:40,777 | INFO | failed_post | pilot.control.payload | failed_post | [payload] failed_post thread has finished [2021-03-18 00:41:12] 2021-03-17 23:40:40,832 | DEBUG | create_data_payload | pilot.control.job | create_data_payload | will not set job_aborted yet [2021-03-18 00:41:12] 2021-03-17 23:40:40,833 | DEBUG | create_data_payload | pilot.control.job | create_data_payload | [job] create_data_payload thread has finished [2021-03-18 00:41:12] 2021-03-17 23:40:40,899 | DEBUG | data | pilot.control.data | control | data control ending since graceful_stop has been set [2021-03-18 00:41:12] 2021-03-17 
23:40:40,900 | DEBUG | data | pilot.control.data | control | will not set job_aborted yet [2021-03-18 00:41:12] 2021-03-17 23:40:40,900 | DEBUG | data | pilot.control.data | control | [data] control thread has finished [2021-03-18 00:41:12] 2021-03-17 23:40:40,918 | DEBUG | copytool_out | pilot.control.data | copytool_out | will not set job_aborted yet [2021-03-18 00:41:12] 2021-03-17 23:40:40,918 | DEBUG | copytool_out | pilot.control.data | copytool_out | [data] copytool_out thread has finished [2021-03-18 00:41:12] 2021-03-17 23:40:41,120 | WARNING | queue_monitoring | pilot.util.common | should_abort | data:queue_monitoring:received graceful stop - abort after this iteration [2021-03-18 00:41:12] 2021-03-17 23:40:41,293 | DEBUG | execute_payloads | pilot.control.payload | execute_payloads | will not set job_aborted yet [2021-03-18 00:41:12] 2021-03-17 23:40:41,293 | INFO | execute_payloads | pilot.control.payload | execute_payloads | [payload] execute_payloads thread has finished [2021-03-18 00:41:12] 2021-03-17 23:40:41,679 | WARNING | queue_monitor | pilot.util.common | should_abort | job:queue_monitor:received graceful stop - abort after this iteration [2021-03-18 00:41:12] 2021-03-17 23:40:41,680 | DEBUG | queue_monitor | pilot.control.job | queue_monitor | will not set job_aborted yet [2021-03-18 00:41:12] 2021-03-17 23:40:41,680 | DEBUG | queue_monitor | pilot.control.job | queue_monitor | [job] queue monitor thread has finished [2021-03-18 00:41:12] 2021-03-17 23:40:44,122 | DEBUG | queue_monitoring | pilot.control.data | queue_monitoring | will not set job_aborted yet [2021-03-18 00:41:12] 2021-03-17 23:40:44,122 | DEBUG | queue_monitoring | pilot.control.data | queue_monitoring | [data] queue_monitor thread has finished [2021-03-18 00:41:12] 2021-03-17 23:41:11,532 | WARNING | job_monitor | pilot.control.job | check_job_monitor_waiting_time | no jobs in monitored_payloads queue (waited for 62 s) [2021-03-18 00:41:12] 2021-03-17 23:41:11,532 | DEBUG | 
job_monitor | pilot.util.processes | threads_aborted | aborting since the last relevant thread is about to finish [2021-03-18 00:41:12] 2021-03-17 23:41:11,532 | DEBUG | job_monitor | pilot.control.job | job_monitor | will proceed to set job_aborted [2021-03-18 00:41:12] 2021-03-17 23:41:11,532 | DEBUG | job_monitor | pilot.control.job | job_monitor | [job] job monitor thread has finished [2021-03-18 00:41:12] 2021-03-17 23:41:12,024 | INFO | MainThread | pilot.workflow.generic | run | end of generic workflow (traces error code: 0) [2021-03-18 00:41:12] 2021-03-17 23:41:12,027 | INFO | MainThread | root | wrap_up | traces error code: 0 [2021-03-18 00:41:12] 2021-03-17 23:41:12,027 | INFO | MainThread | root | wrap_up | pilot has finished [2021-03-18 00:41:12] 2021-03-17 23:41:12,103 [wrapper] ==== pilot stdout END ==== [2021-03-18 00:41:12] 2021-03-17 23:41:12,108 [wrapper] ==== wrapper stdout RESUME ==== [2021-03-18 00:41:12] 2021-03-17 23:41:12,111 [wrapper] Pilot exit status: 0 [2021-03-18 00:41:12] 2021-03-17 23:41:12,129 [wrapper] pandaids: 5002450810 [2021-03-18 00:41:12] 2021-03-17 23:41:12,137 [wrapper] apfmon messages muted [2021-03-18 00:41:12] 2021-03-17 23:41:12,141 [wrapper] Test setup, not cleaning [2021-03-18 00:41:12] 2021-03-17 23:41:12,144 [wrapper] ==== wrapper stdout END ==== [2021-03-18 00:41:12] 2021-03-17 23:41:12,148 [wrapper] ==== wrapper stderr END ==== [2021-03-18 00:41:12] 2021-03-17 23:41:12,155 [wrapper] wrapperexiting ec=0, duration=847 [2021-03-18 00:41:12] 2021-03-17 23:41:12,159 [wrapper] apfmon messages muted [2021-03-18 00:41:12] *** Error codes and diagnostics *** [2021-03-18 00:41:12] "exeErrorCode": 0, [2021-03-18 00:41:12] "exeErrorDiag": "", [2021-03-18 00:41:12] "pilotErrorCode": 1187, [2021-03-18 00:41:12] "pilotErrorDiag": "Payload metadata does not exist", [2021-03-18 00:41:12] *** Listing of results directory *** [2021-03-18 00:41:12] total 43836 [2021-03-18 00:41:12] -rw-r--r--. 
1 boinc boinc 1042975 Mar 17 23:28 pilot2.tar.gz [2021-03-18 00:41:12] -rwx------. 1 boinc boinc 20043 Mar 18 00:23 runpilot2-wrapper.sh [2021-03-18 00:41:12] -rw-r--r--. 1 boinc boinc 4974 Mar 18 00:23 queuedata.json [2021-03-18 00:41:12] -rw-r--r--. 1 boinc boinc 107 Mar 18 00:27 wrapper_26015_x86_64-pc-linux-gnu [2021-03-18 00:41:12] -rwxr-xr-x. 1 boinc boinc 5573 Mar 18 00:27 run_atlas [2021-03-18 00:41:12] -rw-r--r--. 1 boinc boinc 112 Mar 18 00:27 job.xml [2021-03-18 00:41:12] -rw-r--r--. 1 boinc boinc 6250 Mar 18 00:27 init_data.xml [2021-03-18 00:41:12] drwxrwx--x. 2 boinc boinc 68 Mar 18 00:27 shared [2021-03-18 00:41:12] -rw-r--r--. 1 boinc boinc 0 Mar 18 00:27 boinc_lockfile [2021-03-18 00:41:12] -rw-r--r--. 1 boinc boinc 38184411 Mar 18 00:27 EVNT.04972714._000028.pool.root.1 [2021-03-18 00:41:12] -rw-r--r--. 1 boinc boinc 16507 Mar 18 00:27 start_atlas.sh [2021-03-18 00:41:12] -rw-r--r--. 1 boinc boinc 1048520 Mar 18 00:27 input.tar.gz [2021-03-18 00:41:12] -rw-r--r--. 1 boinc boinc 2652 Mar 18 00:27 pandaJob.out [2021-03-18 00:41:12] -rw-------. 1 boinc boinc 408 Mar 18 00:27 setup.sh.local [2021-03-18 00:41:12] -rw-------. 1 boinc boinc 1174578 Mar 18 00:27 agis_schedconf.cvmfs.json [2021-03-18 00:41:12] -rw-------. 1 boinc boinc 1948795 Mar 18 00:27 cric_ddmendpoints.json [2021-03-18 00:41:12] drwx------. 5 boinc boinc 4096 Mar 18 00:27 pilot2 [2021-03-18 00:41:12] -rw-r--r--. 1 boinc boinc 532 Mar 18 00:34 boinc_task_state.xml [2021-03-18 00:41:12] -rw-------. 1 boinc boinc 998 Mar 18 00:38 memory_monitor_summary.json [2021-03-18 00:41:12] -rw-------. 1 boinc boinc 458878 Mar 18 00:38 f35d712b-e4fe-4ebf-8c78-e324895d6f56_39406.1.job.log.tgz [2021-03-18 00:41:12] -rw-------. 1 boinc boinc 2614 Mar 18 00:40 heartbeat.json [2021-03-18 00:41:12] -rw-r--r--. 1 boinc boinc 8192 Mar 18 00:41 boinc_mmap_file [2021-03-18 00:41:12] -rw-r--r--. 1 boinc boinc 24 Mar 18 00:41 wrapper_checkpoint.txt [2021-03-18 00:41:12] -rw-------. 
1 boinc boinc 8825 Mar 18 00:41 pilotlog.txt [2021-03-18 00:41:12] -rw-------. 1 boinc boinc 178839 Mar 18 00:41 f35d712b-e4fe-4ebf-8c78-e324895d6f56_39406.1.job.log [2021-03-18 00:41:12] -rw-------. 1 boinc boinc 268 Mar 18 00:41 output.list [2021-03-18 00:41:12] -rw-r--r--. 1 boinc boinc 686 Mar 18 00:41 runtime_log [2021-03-18 00:41:12] -rw-r--r--. 1 boinc boinc 9903 Mar 18 00:41 runtime_log.err [2021-03-18 00:41:12] -rw-------. 1 boinc boinc 655360 Mar 18 00:41 result.tar.gz [2021-03-18 00:41:12] -rw-------. 1 boinc boinc 574 Mar 18 00:41 7XrNDmEhugynfZGDcpSWOuwoABFKDmABFKDm2IFNDm3BFKDmVGxyOo.diag [2021-03-18 00:41:12] -rw-r--r--. 1 boinc boinc 36671 Mar 18 00:41 stderr.txt [2021-03-18 00:41:12] No HITS result produced [2021-03-18 00:41:12] *** Contents of shared directory: *** [2021-03-18 00:41:12] total 38976 [2021-03-18 00:41:12] -rw-r--r--. 1 boinc boinc 38184411 Mar 18 00:27 ATLAS.root_0 [2021-03-18 00:41:12] -rw-r--r--. 1 boinc boinc 16507 Mar 18 00:27 start_atlas.sh [2021-03-18 00:41:12] -rw-r--r--. 1 boinc boinc 1048520 Mar 18 00:27 input.tar.gz [2021-03-18 00:41:12] -rw-------. 1 boinc boinc 655360 Mar 18 00:41 result.tar.gz 00:41:13 (30687): run_atlas exited; CPU time 924.766591 00:41:13 (30687): called boinc_finish(0) </stderr_txt> ]]>
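The "Error codes and diagnostics" block near the end of the stderr output is the quickest place to see why this task ended as a validate error (pilot error 1187, "Payload metadata does not exist", with exit status 0). The following is a minimal sketch, assuming the stderr text has been saved as a plain string. The helper name `extract_error_fields` is hypothetical and is not part of BOINC or the PanDA pilot. It only matches the four double-quoted keys that appear in that block.

```python
import re

# Hypothetical helper (an assumption for illustration, not a BOINC or
# PanDA pilot API): scan stderr text for the double-quoted key/value
# lines printed under "*** Error codes and diagnostics ***".
_FIELD = re.compile(
    r'"(exeErrorCode|exeErrorDiag|pilotErrorCode|pilotErrorDiag)":\s*'
    r'(?:(\d+)|"([^"]*)")'
)


def extract_error_fields(stderr_text):
    """Return a dict of the error code/diag fields found in stderr_text."""
    fields = {}
    for key, number, text in _FIELD.findall(stderr_text):
        # Numeric values are captured in `number`; quoted strings in `text`.
        fields[key] = int(number) if number else text
    return fields


# Lines copied from the log above.
sample = (
    '[2021-03-18 00:41:12] "exeErrorCode": 0,\n'
    '[2021-03-18 00:41:12] "exeErrorDiag": "",\n'
    '[2021-03-18 00:41:12] "pilotErrorCode": 1187,\n'
    '[2021-03-18 00:41:12] "pilotErrorDiag": "Payload metadata does not exist",\n'
)

print(extract_error_fields(sample))
```

The heartbeat dictionary earlier in the log uses single-quoted keys, so this pattern deliberately ignores it and reads only the final summary block.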
©2024 CERN