Name zLrLDmNTjkunShfckohDCDFpABFKDmABFKDmvc4MDmABFKDmdMzjXn_3
Workunit 1896672
Created 18 May 2019, 2:17:05 UTC
Sent 18 May 2019, 4:25:11 UTC
Report deadline 25 May 2019, 4:25:11 UTC
Received 18 May 2019, 14:52:37 UTC
Server state Over
Outcome Validate error
Client state Done
Exit status 0 (0x00000000)
Computer ID 2244
Run time 10 hours 27 min 26 sec
CPU time 10 hours 44 min 59 sec
Validate state Invalid
Credit 0.00
Device peak FLOPS 2.00 GFLOPS
Application version ATLAS Simulation v0.62 (native_mt) x86_64-pc-linux-gnu
Peak working set size 1.91 GB
Peak swap size 2.45 GB
Peak disk usage 843.90 MB

Stderr output

<core_client_version>7.5.1</core_client_version>
<![CDATA[
<stderr_txt>
09:19:00 (7629): wrapper (7.7.26015): starting
09:19:00 (7629): wrapper: running run_atlas (--nthreads 1)
singularity image is /cvmfs/atlas.cern.ch/repo/containers/images/singularity/x86_64-slc6.img
sys.argv = ['run_atlas', '--nthreads', '1']
THREADS=1
Checking for CVMFS
CVMFS is installed
OS:Scientific Linux release 6.10 (Carbon)

This is SLC or CentOS release 6, run the atlas job without Singularity
copy /root/Downloads/BOINC/slots/0/shared/input.tar.gz
copy /root/Downloads/BOINC/slots/0/shared/start_atlas.sh
copy /root/Downloads/BOINC/slots/0/shared/ATLAS.root_0
copy /root/Downloads/BOINC/slots/0/shared/RTE.tar.gz
start atlas job with 
cmd = sh start_atlas.sh > runtime_log 2> runtime_log.err
running cmd return value is 0
Moving ./HITS.17905780._051444.pool.root.1 to shared/HITS.pool.root.1
*****************The last 100 lines of the pilot log******************
2019-05-18 18:01:34,306 | INFO     | queue_monitor       | pilot.user.atlas.utilities       | get_memory_values         | using path: /root/Downloads/BOINC/slots/0/PanDA_Pilot-4343820581/memory_monitor_summary.json
2019-05-18 18:01:34,307 | DEBUG    | queue_monitor       | pilot.user.atlas.utilities       | get_memory_monitor_info   | summary_dictionary={'Max': {'maxRSS': 2114476, 'maxSwap': 0, 'totRCHAR': 1652224760, 'totRBYTES': 1645117440, 'totWCHAR': 119604240, 'totWBYTES': 129916928, 'maxVMEM': 3232784, 'maxPSS': 2105976}, 'Avg': {'avgRSS': 2066052, 'rateWCHAR': 3117, 'avgVMEM': 3167319, 'avgPSS': 2057266, 'rateRCHAR': 43068, 'avgSwap': 0, 'rateRBYTES': 42882, 'rateWBYTES': 3386}}
2019-05-18 18:01:34,307 | INFO     | queue_monitor       | pilot.user.atlas.utilities       | get_memory_monitor_info   | extracted standard info from memory monitor json
2019-05-18 18:01:34,307 | INFO     | queue_monitor       | pilot.user.atlas.utilities       | get_memory_monitor_info   | extracted standard memory fields from memory monitor json
2019-05-18 18:01:34,307 | INFO     | queue_monitor       | pilot.util.auxiliary             | timing_report             | ..............................
2019-05-18 18:01:34,308 | INFO     | queue_monitor       | pilot.util.auxiliary             | timing_report             | . Timing measurements:
2019-05-18 18:01:34,308 | INFO     | queue_monitor       | pilot.util.auxiliary             | timing_report             | . get job = 8 s
2019-05-18 18:01:34,308 | INFO     | queue_monitor       | pilot.util.auxiliary             | timing_report             | . initial setup = 17 s
2019-05-18 18:01:34,308 | INFO     | queue_monitor       | pilot.util.auxiliary             | timing_report             | . payload setup = 0 s
2019-05-18 18:01:34,308 | INFO     | queue_monitor       | pilot.util.auxiliary             | timing_report             | . total setup = 17 s
2019-05-18 18:01:34,308 | INFO     | queue_monitor       | pilot.util.auxiliary             | timing_report             | . stage-in = 9 s
2019-05-18 18:01:34,308 | INFO     | queue_monitor       | pilot.util.auxiliary             | timing_report             | . payload execution = 38436 s
2019-05-18 18:01:34,308 | INFO     | queue_monitor       | pilot.util.auxiliary             | timing_report             | . stage-out = 28 s
2019-05-18 18:01:34,308 | INFO     | queue_monitor       | pilot.util.auxiliary             | timing_report             | ..............................
2019-05-18 18:01:34,309 | INFO     | queue_monitor       | pilot.util.auxiliary             | get_log_extracts          | building log extracts (sent to the server as 'pilotLog')
2019-05-18 18:01:34,309 | DEBUG    | queue_monitor       | pilot.util.auxiliary             | get_panda_tracer_log      | PanDA tracer log does not exist: /root/Downloads/BOINC/slots/0/PanDA_Pilot-4343820581/pandatracerlog.txt (ignoring)
2019-05-18 18:01:34,310 | INFO     | queue_monitor       | pilot.util.container             | execute                   | executing command: tail -n 20 /root/Downloads/BOINC/slots/0/PanDA_Pilot-4343820581/pilotlog.txt
2019-05-18 18:01:34,433 | DEBUG    | queue_monitor       | pilot.util.auxiliary             | get_pilot_log_extracts    | dumping warning messages from pilotlog.txt:

2019-05-18 18:01:34,503 | WARNING  | queue_monitor       | pilot.util.auxiliary             | get_log_extracts          | detected the following tail of warning/fatal messages in the pilot log:
- Log from pilotlog.txt -
2019-05-18 18:01:34,303 | INFO     | queue_monitor       | pilot.api.analytics              | get_fitted_data           | current memory leak: 4.56 B/s (using 619 data points, chi2=22933210671.944942)
2019-05-18 18:01:34,304 | DEBUG    | queue_monitor       | pilot.util.auxiliary             | get_job_metrics           | job metrics="coreCount=1 leak=4.56"
2019-05-18 18:01:34,305 | INFO     | queue_monitor       | pilot.control.job                | get_data_structure        | payload/TRF did not report the number of read events
2019-05-18 18:01:34,306 | INFO     | queue_monitor       | pilot.user.atlas.utilities       | get_memory_values         | using path: /root/Downloads/BOINC/slots/0/PanDA_Pilot-4343820581/memory_monitor_summary.json
2019-05-18 18:01:34,307 | DEBUG    | queue_monitor       | pilot.user.atlas.utilities       | get_memory_monitor_info   | summary_dictionary={'Max': {'maxRSS': 2114476, 'maxSwap': 0, 'totRCHAR': 1652224760, 'totRBYTES': 1645117440, 'totWCHAR': 119604240, 'totWBYTES': 129916928, 'maxVMEM': 3232784, 'maxPSS': 2105976}, 'Avg': {'avgRSS': 2066052, 'rateWCHAR': 3117, 'avgVMEM': 3167319, 'avgPSS': 2057266, 'rateRCHAR': 43068, 'avgSwap': 0, 'rateRBYTES': 42882, 'rateWBYTES': 3386}}
2019-05-18 18:01:34,307 | INFO     | queue_monitor       | pilot.user.atlas.utilities       | get_memory_monitor_info   | extracted standard info from memory monitor json
2019-05-18 18:01:34,307 | INFO     | queue_monitor       | pilot.user.atlas.utilities       | get_memory_monitor_info   | extracted standard memory fields from memory monitor json
2019-05-18 18:01:34,307 | INFO     | queue_monitor       | pilot.util.auxiliary             | timing_report             | ..............................
2019-05-18 18:01:34,308 | INFO     | queue_monitor       | pilot.util.auxiliary             | timing_report             | . Timing measurements:
2019-05-18 18:01:34,308 | INFO     | queue_monitor       | pilot.util.auxiliary             | timing_report             | . get job = 8 s
2019-05-18 18:01:34,308 | INFO     | queue_monitor       | pilot.util.auxiliary             | timing_report             | . initial setup = 17 s
2019-05-18 18:01:34,308 | INFO     | queue_monitor       | pilot.util.auxiliary             | timing_report             | . payload setup = 0 s
2019-05-18 18:01:34,308 | INFO     | queue_monitor       | pilot.util.auxiliary             | timing_report             | . total setup = 17 s
2019-05-18 18:01:34,308 | INFO     | queue_monitor       | pilot.util.auxiliary             | timing_report             | . stage-in = 9 s
2019-05-18 18:01:34,308 | INFO     | queue_monitor       | pilot.util.auxiliary             | timing_report             | . payload execution = 38436 s
2019-05-18 18:01:34,308 | INFO     | queue_monitor       | pilot.util.auxiliary             | timing_report             | . stage-out = 28 s
2019-05-18 18:01:34,308 | INFO     | queue_monitor       | pilot.util.auxiliary             | timing_report             | ..............................
2019-05-18 18:01:34,309 | INFO     | queue_monitor       | pilot.util.auxiliary             | get_log_extracts          | building log extracts (sent to the server as 'pilotLog')
2019-05-18 18:01:34,309 | DEBUG    | queue_monitor       | pilot.util.auxiliary             | get_panda_tracer_log      | PanDA tracer log does not exist: /root/Downloads/BOINC/slots/0/PanDA_Pilot-4343820581/pandatracerlog.txt (ignoring)
2019-05-18 18:01:34,310 | INFO     | queue_monitor       | pilot.util.container             | execute                   | executing command: tail -n 20 /root/Downloads/BOINC/slots/0/PanDA_Pilot-4343820581/pilotlog.txt
- Error messages from pilotlog.txt -
          "CRITICAL": 0, 

          "ERROR": 0, 

          "FATAL": 0, 


2019-05-18 18:01:34,507 | DEBUG    | queue_monitor       | pilot.control.job                | send_state                | wrote heartbeat to file /root/Downloads/BOINC/slots/0/heartbeat.json
2019-05-18 18:01:34,508 | INFO     | queue_monitor       | pilot.control.job                | queue_monitor             | job 4343820581 was dequeued from the monitored payloads queue
2019-05-18 18:01:35,340 | INFO     | retrieve            | pilot.util.auxiliary             | make_job_report           | 
2019-05-18 18:01:35,341 | INFO     | retrieve            | pilot.util.auxiliary             | make_job_report           | job summary report
2019-05-18 18:01:35,341 | INFO     | retrieve            | pilot.util.auxiliary             | make_job_report           | --------------------------------------------------
2019-05-18 18:01:35,341 | INFO     | retrieve            | pilot.util.auxiliary             | make_job_report           | PanDA job id: 4343820581
2019-05-18 18:01:35,342 | INFO     | retrieve            | pilot.util.auxiliary             | make_job_report           | task id: 17905780
2019-05-18 18:01:35,342 | INFO     | retrieve            | pilot.util.auxiliary             | make_job_report           | errors: (none)
2019-05-18 18:01:35,342 | INFO     | retrieve            | pilot.util.auxiliary             | make_job_report           | status: LOG_TRANSFER = DONE 
2019-05-18 18:01:35,342 | INFO     | retrieve            | pilot.util.auxiliary             | make_job_report           | pilot state: finished 
2019-05-18 18:01:35,342 | INFO     | retrieve            | pilot.util.auxiliary             | make_job_report           | transexitcode: 0
2019-05-18 18:01:35,342 | INFO     | retrieve            | pilot.util.auxiliary             | make_job_report           | exeerrorcode: 0
2019-05-18 18:01:35,342 | INFO     | retrieve            | pilot.util.auxiliary             | make_job_report           | exeerrordiag: 
2019-05-18 18:01:35,342 | INFO     | retrieve            | pilot.util.auxiliary             | make_job_report           | exitcode: 0
2019-05-18 18:01:35,343 | INFO     | retrieve            | pilot.util.auxiliary             | make_job_report           | exitmsg: OK
2019-05-18 18:01:35,343 | INFO     | retrieve            | pilot.util.auxiliary             | make_job_report           | cpuconsumptiontime: 38432 s
2019-05-18 18:01:35,343 | INFO     | retrieve            | pilot.util.auxiliary             | make_job_report           | nevents: 0
2019-05-18 18:01:35,343 | INFO     | retrieve            | pilot.util.auxiliary             | make_job_report           | neventsw: 0
2019-05-18 18:01:35,343 | INFO     | retrieve            | pilot.util.auxiliary             | make_job_report           | pid: 13899
2019-05-18 18:01:35,343 | INFO     | retrieve            | pilot.util.auxiliary             | make_job_report           | pgrp: None
2019-05-18 18:01:35,343 | INFO     | retrieve            | pilot.util.auxiliary             | make_job_report           | corecount: 1
2019-05-18 18:01:35,344 | INFO     | retrieve            | pilot.util.auxiliary             | make_job_report           | event service: False
2019-05-18 18:01:35,344 | INFO     | retrieve            | pilot.util.auxiliary             | make_job_report           | sizes: {0: 15627241, 1: 15627617, 38434: 15869435, 38435: 15869483, 5: 15627617, 38465: 15876514, 10: 15627641, 13: 15627740, 15: 15627740, 38467: 15876562, 38464: 15876466}
2019-05-18 18:01:35,344 | INFO     | retrieve            | pilot.util.auxiliary             | make_job_report           | --------------------------------------------------
2019-05-18 18:01:35,344 | INFO     | retrieve            | pilot.util.auxiliary             | make_job_report           | 
2019-05-18 18:01:35,344 | INFO     | retrieve            | pilot.control.job                | has_job_completed         | job 4343820581 has completed
2019-05-18 18:01:35,344 | INFO     | retrieve            | pilot.control.job                | retrieve                  | ready for new job
2019-05-18 18:01:35,344 | DEBUG    | retrieve            | pilot.control.job                | retrieve                  | getjob_requests=1
2019-05-18 18:01:41,139 | WARNING  | job_monitor         | pilot.control.job                | check_job_monitor_waiting_time | no jobs in monitored_payloads queue (waited for 62 s)
2019-05-18 18:01:45,373 | DEBUG    | retrieve            | pilot.control.job                | proceed_with_getjob       | proceed_with_getjob called with getjob_requests=1
2019-05-18 18:01:45,373 | DEBUG    | retrieve            | pilot.util.monitoring            | check_local_space         | checking local space on /root/Downloads/BOINC/slots/0
2019-05-18 18:01:45,403 | INFO     | retrieve            | pilot.util.monitoring            | check_local_space         | sufficient remaining disk space (14305722368 B)
2019-05-18 18:01:45,403 | WARNING  | retrieve            | pilot.control.job                | proceed_with_getjob       | since timefloor is set to 0, pilot was only allowed to run one job
2019-05-18 18:01:45,431 | INFO     | monitor             | pilot.control.monitor            | control                   | monitor control has ended
2019-05-18 18:01:45,458 | DEBUG    | data                | pilot.control.data               | control                   | data control ending since graceful_stop has been set
2019-05-18 18:01:45,624 | DEBUG    | payload             | pilot.control.payload            | control                   | payload control ending since graceful_stop has been set
2019-05-18 18:01:45,932 | DEBUG    | job                 | pilot.control.job                | control                   | job control ending since graceful_stop has been set
2019-05-18 18:01:45,968 | WARNING  | copytool_out        | pilot.util.common                | should_abort              | data:copytool_out:received graceful stop - abort after this iteration
2019-05-18 18:01:45,968 | DEBUG    | copytool_out        | pilot.control.data               | copytool_out              | will abort 
2019-05-18 18:01:46,073 | WARNING  | queue_monitoring    | pilot.util.common                | should_abort              | data:queue_monitoring:received graceful stop - abort after this iteration
2019-05-18 18:01:46,322 | DEBUG    | copytool_in         | pilot.control.data               | copytool_in               | copytool_in ended since graceful_stop has been set
2019-05-18 18:01:46,424 | INFO     | validate_post       | pilot.control.payload            | validate_post             | validate_post has finished
2019-05-18 18:01:47,103 | DEBUG    | copytool_out        | pilot.control.data               | copytool_out              | aborting
2019-05-18 18:01:47,107 | DEBUG    | copytool_out        | pilot.control.data               | copytool_out              | copytool_out has finished
2019-05-18 18:01:47,458 | WARNING  | queue_monitor       | pilot.util.common                | should_abort              | job:queue_monitor:received graceful stop - abort after this iteration
2019-05-18 18:01:47,458 | INFO     | queue_monitor       | pilot.control.job                | queue_monitor             | [job] queue monitor has finished
2019-05-18 18:01:49,209 | INFO     | queue_monitoring    | pilot.control.data               | queue_monitoring          | [data] queue monitor has finished
2019-05-18 18:02:42,427 | WARNING  | job_monitor         | pilot.control.job                | check_job_monitor_waiting_time | no jobs in monitored_payloads queue (waited for 123 s)
2019-05-18 18:02:42,428 | INFO     | job_monitor         | pilot.control.job                | job_monitor               | job monitor has finished
2019-05-18 18:02:42,431 | INFO     | MainThread          | pilot.workflow.generic           | run                       | end of generic workflow (traces error code: 0)
2019-05-18 18:02:42,432 | INFO     | MainThread          | root                             | wrap_up                   | traces error code: 0
2019-05-18 18:02:42,432 | INFO     | MainThread          | root                             | wrap_up                   | pilot has finished
***************diag file************
runtimeenvironments=APPS/HEP/ATLAS-SITE;
Processors=1
WallTime=38612.55s
KernelTime=857.47s
UserTime=38431.72s
CPUUsage=101%
MaxResidentMemory=2012872kB
AverageResidentMemory=0kB
AverageTotalMemory=0kB
AverageUnsharedMemory=0kB
AverageUnsharedStack=0kB
AverageSharedMemory=0kB
PageSize=4096B
MajorPageFaults=9701
MinorPageFaults=15811105
Swaps=0
ForcedSwitches=396105
WaitSwitches=19077797
Inputs=3372536
Outputs=308336
SocketReceived=0
SocketSent=0
Signals=0

nodename=maeax@APU8S
exitcode=0
******************************WorkDir***********************
total 384140
drwxrwx--x. 7 root root       4096 May 18 20:02 .
drwxr-x--x. 3 root root       4096 May 18 09:18 ..
-rw-------. 1 root root    7245969 May 18 09:19 agis_ddmendpoints.json
-rw-------. 1 root root    4884028 May 18 09:19 agis_schedconf.cvmfs.json
drwx------. 2 root root       4096 May 18 09:19 .alrb
drwxr-xr-x. 3 root root       4096 May 18 09:19 APPS
-rw-------. 1 root root        549 May 18 09:19 .asetup
-rw-------. 1 root root       2985 May 18 09:20 .asetup.save
-rw-r--r--. 1 root root          0 May 18 09:19 boinc_lockfile
-rw-r--r--. 1 root root       8192 May 18 20:02 boinc_mmap_file
-rw-r--r--. 1 root root        537 May 18 20:01 boinc_task_state.xml
-rw-r--r--. 1 root root  376631601 May 18 09:19 EVNT.17323857._000893.pool.root.1
-rw-------. 1 root root      57536 May 18 20:01 heartbeat.json
-rw-r--r--. 1 root root       5474 May 18 19:44 init_data.xml
-rw-r--r--. 1 root root     245120 May 18 09:19 input.tar.gz
-rw-r--r--. 1 root root        112 May 18 09:18 job.xml
-rw-------. 1 root root    1771368 May 18 20:02 log.17905780._051444.job.log.1
-rw-------. 1 root root     323595 May 18 20:01 log.17905780._051444.job.log.tgz.1
-rw-------. 1 root root        306 May 18 20:00 memory_monitor_summary.json
-rw-------. 1 root root        391 May 18 20:02 output.list
-rw-------. 1 4871  1028      2887 May 14 12:04 pandaJobData.out
drwxrwx---. 2 root root       4096 May 18 20:01 PanDA_Pilot-4343820581
drwxr-xr-x. 3  501 games      4096 May  6 13:40 pilot2
-rw-r--r--. 1 root root     236270 May 13 16:43 pilot2.tar.gz
-rw-------. 1 root root    1756291 May 18 20:02 pilotlog.txt
-rw-r--r--. 1 root root       4444 May 14 12:04 queuedata.json
-rw-r--r--. 1 root root        786 May 18 09:19 RTE.tar.gz
-rwxr-xr-x. 1 root root       8512 May 18 09:18 run_atlas
-rwx------. 1 4871  1028     14490 May 14 12:04 runpilot2-wrapper.sh
-rw-r--r--. 1 root root        692 May 18 20:02 runtime_log
-rw-r--r--. 1 root root       7327 May 18 20:02 runtime_log.err
drwxrwx--x. 2 root root       4096 May 18 20:02 shared
-rw-r--r--. 1 root root       8714 May 18 09:19 start_atlas.sh
-rw-r--r--. 1 root root      16874 May 18 20:02 stderr.txt
-rw-r--r--. 1 root root        107 May 18 09:18 wrapper_26015_x86_64-pc-linux-gnu
-rw-r--r--. 1 root root         28 May 18 20:02 wrapper_checkpoint.txt
-rw-------. 1 root root        493 May 18 20:02 zLrLDmNTjkunShfckohDCDFpABFKDmABFKDmvc4MDmABFKDmdMzjXn.diag
running start_atlas return value is 0
Parent exit 0
child process exit 0
20:02:43 (7629): run_atlas exited; CPU time 38432.793327
20:02:43 (7629): called boinc_finish(0)

</stderr_txt>
]]>
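
The diag file section in the stderr output above reports WallTime, KernelTime, UserTime and CPUUsage as KEY=VALUE lines. As a rough cross-check, the sketch below parses those lines and recomputes the CPU usage, assuming (this is not stated in the log) that CPUUsage is (UserTime + KernelTime) / WallTime. The function names `parse_diag`, `seconds` and `cpu_usage_percent` are hypothetical helpers written for this illustration, not part of BOINC or the pilot.

```python
def parse_diag(text):
    """Parse KEY=VALUE lines from a diag block into a dict of strings."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition("=")
        if sep and key:
            # Some values end with ';' (e.g. runtimeenvironments); drop it.
            fields[key] = value.rstrip(";")
    return fields


def seconds(value):
    """Convert a value such as '38612.55s' to a float number of seconds."""
    return float(value.rstrip("s"))


def cpu_usage_percent(fields):
    """Assumed definition: (UserTime + KernelTime) / WallTime, in percent."""
    busy = seconds(fields["UserTime"]) + seconds(fields["KernelTime"])
    return 100.0 * busy / seconds(fields["WallTime"])


# Values copied from the diag block of this task.
DIAG = """\
WallTime=38612.55s
KernelTime=857.47s
UserTime=38431.72s
CPUUsage=101%
"""

if __name__ == "__main__":
    fields = parse_diag(DIAG)
    print(f"reported: {fields['CPUUsage']}")
    print(f"recomputed: {cpu_usage_percent(fields):.2f}%")
```

Under that assumed definition the recomputed value is just under 101.8%, which would agree with the reported 101% only if the reporting tool truncates rather than rounds; the log itself does not say which.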


©2024 CERN