Name h1JLDmgCz0unShfckohDCDFpABFKDmABFKDm6cOKDmABFKDm0WDbUm_1
Workunit 1907091
Created 28 Jun 2019, 8:53:37 UTC
Sent 28 Jun 2019, 15:29:18 UTC
Report deadline 5 Jul 2019, 15:29:18 UTC
Received 29 Jun 2019, 2:28:47 UTC
Server state Over
Outcome Success
Client state Done
Exit status 0 (0x00000000)
Computer ID 2244
Run time 10 hours 59 min 29 sec
CPU time 11 hours 17 min 56 sec
Validate state Valid
Credit 191.79
Device peak FLOPS 2.09 GFLOPS
Application version ATLAS Simulation v0.62 (native_mt) x86_64-pc-linux-gnu
Peak working set size 1.91 GB
Peak swap size 2.57 GB
Peak disk usage 843.92 MB
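
The run time and CPU time listed above are easier to compare once both are in seconds. The following is a minimal Python sketch of that conversion, assuming durations always use the "N hours N min N sec" wording shown on this page; the helper name `to_seconds` is hypothetical and not part of BOINC.

```python
import re

# Seconds per unit, keyed by the unit prefixes used on this page.
UNIT_SECONDS = {"hour": 3600, "min": 60, "sec": 1}

def to_seconds(text):
    """Convert a duration such as '10 hours 59 min 29 sec' to seconds.

    Hypothetical helper: units that do not appear count as zero.
    """
    total = 0
    for value, unit in re.findall(r"(\d+)\s*(hour|min|sec)", text):
        total += int(value) * UNIT_SECONDS[unit]
    return total

run_time = to_seconds("10 hours 59 min 29 sec")  # "Run time" field
cpu_time = to_seconds("11 hours 17 min 56 sec")  # "CPU time" field
ratio = cpu_time / run_time
print(run_time, cpu_time, round(ratio, 3))
```

For this task the values come to 39569 s of run time and 40676 s of CPU time, a ratio of about 1.03, which is plausible for a single-threaded job (`--nthreads 1`) that kept one core busy for the whole run.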

Stderr output

<core_client_version>7.5.1</core_client_version>
<![CDATA[
<stderr_txt>
17:47:42 (10816): wrapper (7.7.26015): starting
17:47:42 (10816): wrapper: running run_atlas (--nthreads 1)
singularity image is /cvmfs/atlas.cern.ch/repo/containers/images/singularity/x86_64-slc6.img
sys.argv = ['run_atlas', '--nthreads', '1']
THREADS=1
Checking for CVMFS
CVMFS is installed
OS:Scientific Linux release 6.10 (Carbon)

This is SLC or CentOS release 6, run the atlas job without Singularity
copy /root/Downloads/BOINC/slots/1/shared/input.tar.gz
copy /root/Downloads/BOINC/slots/1/shared/start_atlas.sh
copy /root/Downloads/BOINC/slots/1/shared/ATLAS.root_0
copy /root/Downloads/BOINC/slots/1/shared/RTE.tar.gz
start atlas job with 
cmd = sh start_atlas.sh > runtime_log 2> runtime_log.err
running cmd return value is 0
Moving ./HITS.18308834._045315.pool.root.1 to shared/HITS.pool.root.1
*****************The last 100 lines of the pilot log******************
2019-06-29 03:03:17,734 | INFO     | queue_monitor       | pilot.control.job                | get_data_structure        | total number of processed events: 200 (read)
2019-06-29 03:03:17,737 | INFO     | queue_monitor       | pilot.user.atlas.utilities       | get_memory_values         | using path: /root/Downloads/BOINC/slots/1/PanDA_Pilot-4397089446/memory_monitor_summary.json
2019-06-29 03:03:17,740 | DEBUG    | queue_monitor       | pilot.user.atlas.utilities       | get_memory_monitor_info   | summary_dictionary={'Max': {'maxRSS': 2108648, 'maxSwap': 0, 'totRCHAR': 1641604761, 'totRBYTES': 1680195584, 'totWCHAR': 121144801, 'totWBYTES': 131731456, 'maxVMEM': 3471180, 'maxPSS': 2101196}, 'Avg': {'avgRSS': 2053022, 'rateWCHAR': 3003, 'avgVMEM': 3385590, 'avgPSS': 2045355, 'rateRCHAR': 40698, 'avgSwap': 0, 'rateRBYTES': 41654, 'rateWBYTES': 3265}}
2019-06-29 03:03:17,741 | INFO     | queue_monitor       | pilot.user.atlas.utilities       | get_memory_monitor_info   | extracted standard info from memory monitor json
2019-06-29 03:03:17,741 | INFO     | queue_monitor       | pilot.user.atlas.utilities       | get_memory_monitor_info   | extracted standard memory fields from memory monitor json
2019-06-29 03:03:17,743 | INFO     | queue_monitor       | pilot.util.auxiliary             | timing_report             | ..............................
2019-06-29 03:03:17,743 | INFO     | queue_monitor       | pilot.util.auxiliary             | timing_report             | . Timing measurements:
2019-06-29 03:03:17,743 | INFO     | queue_monitor       | pilot.util.auxiliary             | timing_report             | . get job = 9 s
2019-06-29 03:03:17,744 | INFO     | queue_monitor       | pilot.util.auxiliary             | timing_report             | . initial setup = 17 s
2019-06-29 03:03:17,744 | INFO     | queue_monitor       | pilot.util.auxiliary             | timing_report             | . payload setup = 0 s
2019-06-29 03:03:17,744 | INFO     | queue_monitor       | pilot.util.auxiliary             | timing_report             | . total setup = 17 s
2019-06-29 03:03:17,744 | INFO     | queue_monitor       | pilot.util.auxiliary             | timing_report             | . stage-in = 8 s
2019-06-29 03:03:17,745 | INFO     | queue_monitor       | pilot.util.auxiliary             | timing_report             | . payload execution = 40407 s
2019-06-29 03:03:17,745 | INFO     | queue_monitor       | pilot.util.auxiliary             | timing_report             | . stage-out = 32 s
2019-06-29 03:03:17,745 | INFO     | queue_monitor       | pilot.util.auxiliary             | timing_report             | ..............................
2019-06-29 03:03:17,745 | INFO     | queue_monitor       | pilot.util.auxiliary             | get_log_extracts          | building log extracts (sent to the server as 'pilotLog')
2019-06-29 03:03:17,745 | DEBUG    | queue_monitor       | pilot.util.auxiliary             | get_panda_tracer_log      | PanDA tracer log does not exist: /root/Downloads/BOINC/slots/1/PanDA_Pilot-4397089446/pandatracerlog.txt (ignoring)
2019-06-29 03:03:17,746 | INFO     | queue_monitor       | pilot.util.container             | execute                   | executing command: tail -n 20 /root/Downloads/BOINC/slots/1/PanDA_Pilot-4397089446/pilotlog.txt
2019-06-29 03:03:17,945 | DEBUG    | queue_monitor       | pilot.util.auxiliary             | get_pilot_log_extracts    | dumping warning messages from pilotlog.txt:

2019-06-29 03:03:17,947 | WARNING  | queue_monitor       | pilot.util.auxiliary             | get_log_extracts          | detected the following tail of warning/fatal messages in the pilot log:
- Log from pilotlog.txt -
2019-06-29 03:03:17,732 | INFO     | queue_monitor       | pilot.api.analytics              | get_fitted_data           | current memory leak: 4.43 B/s (using 647 data points, chi2=-3363576748.944323)
2019-06-29 03:03:17,733 | DEBUG    | queue_monitor       | pilot.util.auxiliary             | get_job_metrics           | job metrics="coreCount=1 nEvents=200 leak=4.43"
2019-06-29 03:03:17,734 | INFO     | queue_monitor       | pilot.control.job                | get_data_structure        | total number of processed events: 200 (read)
2019-06-29 03:03:17,737 | INFO     | queue_monitor       | pilot.user.atlas.utilities       | get_memory_values         | using path: /root/Downloads/BOINC/slots/1/PanDA_Pilot-4397089446/memory_monitor_summary.json
2019-06-29 03:03:17,740 | DEBUG    | queue_monitor       | pilot.user.atlas.utilities       | get_memory_monitor_info   | summary_dictionary={'Max': {'maxRSS': 2108648, 'maxSwap': 0, 'totRCHAR': 1641604761, 'totRBYTES': 1680195584, 'totWCHAR': 121144801, 'totWBYTES': 131731456, 'maxVMEM': 3471180, 'maxPSS': 2101196}, 'Avg': {'avgRSS': 2053022, 'rateWCHAR': 3003, 'avgVMEM': 3385590, 'avgPSS': 2045355, 'rateRCHAR': 40698, 'avgSwap': 0, 'rateRBYTES': 41654, 'rateWBYTES': 3265}}
2019-06-29 03:03:17,741 | INFO     | queue_monitor       | pilot.user.atlas.utilities       | get_memory_monitor_info   | extracted standard info from memory monitor json
2019-06-29 03:03:17,741 | INFO     | queue_monitor       | pilot.user.atlas.utilities       | get_memory_monitor_info   | extracted standard memory fields from memory monitor json
2019-06-29 03:03:17,743 | INFO     | queue_monitor       | pilot.util.auxiliary             | timing_report             | ..............................
2019-06-29 03:03:17,743 | INFO     | queue_monitor       | pilot.util.auxiliary             | timing_report             | . Timing measurements:
2019-06-29 03:03:17,743 | INFO     | queue_monitor       | pilot.util.auxiliary             | timing_report             | . get job = 9 s
2019-06-29 03:03:17,744 | INFO     | queue_monitor       | pilot.util.auxiliary             | timing_report             | . initial setup = 17 s
2019-06-29 03:03:17,744 | INFO     | queue_monitor       | pilot.util.auxiliary             | timing_report             | . payload setup = 0 s
2019-06-29 03:03:17,744 | INFO     | queue_monitor       | pilot.util.auxiliary             | timing_report             | . total setup = 17 s
2019-06-29 03:03:17,744 | INFO     | queue_monitor       | pilot.util.auxiliary             | timing_report             | . stage-in = 8 s
2019-06-29 03:03:17,745 | INFO     | queue_monitor       | pilot.util.auxiliary             | timing_report             | . payload execution = 40407 s
2019-06-29 03:03:17,745 | INFO     | queue_monitor       | pilot.util.auxiliary             | timing_report             | . stage-out = 32 s
2019-06-29 03:03:17,745 | INFO     | queue_monitor       | pilot.util.auxiliary             | timing_report             | ..............................
2019-06-29 03:03:17,745 | INFO     | queue_monitor       | pilot.util.auxiliary             | get_log_extracts          | building log extracts (sent to the server as 'pilotLog')
2019-06-29 03:03:17,745 | DEBUG    | queue_monitor       | pilot.util.auxiliary             | get_panda_tracer_log      | PanDA tracer log does not exist: /root/Downloads/BOINC/slots/1/PanDA_Pilot-4397089446/pandatracerlog.txt (ignoring)
2019-06-29 03:03:17,746 | INFO     | queue_monitor       | pilot.util.container             | execute                   | executing command: tail -n 20 /root/Downloads/BOINC/slots/1/PanDA_Pilot-4397089446/pilotlog.txt
- Error messages from pilotlog.txt -
          "CRITICAL": 0,
          "ERROR": 0,
          "FATAL": 0,

2019-06-29 03:03:17,948 | DEBUG    | queue_monitor       | pilot.control.job                | send_state                | wrote heartbeat to file /root/Downloads/BOINC/slots/1/heartbeat.json
2019-06-29 03:03:17,948 | INFO     | queue_monitor       | pilot.control.job                | queue_monitor             | job 4397089446 was dequeued from the monitored payloads queue
2019-06-29 03:03:19,119 | INFO     | retrieve            | pilot.util.auxiliary             | make_job_report           | 
2019-06-29 03:03:19,119 | INFO     | retrieve            | pilot.util.auxiliary             | make_job_report           | job summary report
2019-06-29 03:03:19,119 | INFO     | retrieve            | pilot.util.auxiliary             | make_job_report           | --------------------------------------------------
2019-06-29 03:03:19,120 | INFO     | retrieve            | pilot.util.auxiliary             | make_job_report           | PanDA job id: 4397089446
2019-06-29 03:03:19,120 | INFO     | retrieve            | pilot.util.auxiliary             | make_job_report           | task id: 18308834
2019-06-29 03:03:19,120 | INFO     | retrieve            | pilot.util.auxiliary             | make_job_report           | errors: (none)
2019-06-29 03:03:19,120 | INFO     | retrieve            | pilot.util.auxiliary             | make_job_report           | status: LOG_TRANSFER = DONE 
2019-06-29 03:03:19,120 | INFO     | retrieve            | pilot.util.auxiliary             | make_job_report           | pilot state: finished 
2019-06-29 03:03:19,120 | INFO     | retrieve            | pilot.util.auxiliary             | make_job_report           | transexitcode: 0
2019-06-29 03:03:19,120 | INFO     | retrieve            | pilot.util.auxiliary             | make_job_report           | exeerrorcode: 0
2019-06-29 03:03:19,120 | INFO     | retrieve            | pilot.util.auxiliary             | make_job_report           | exeerrordiag: 
2019-06-29 03:03:19,120 | INFO     | retrieve            | pilot.util.auxiliary             | make_job_report           | exitcode: 0
2019-06-29 03:03:19,120 | INFO     | retrieve            | pilot.util.auxiliary             | make_job_report           | exitmsg: OK
2019-06-29 03:03:19,121 | INFO     | retrieve            | pilot.util.auxiliary             | make_job_report           | cpuconsumptiontime: 40533 s
2019-06-29 03:03:19,121 | INFO     | retrieve            | pilot.util.auxiliary             | make_job_report           | nevents: 200
2019-06-29 03:03:19,121 | INFO     | retrieve            | pilot.util.auxiliary             | make_job_report           | neventsw: 0
2019-06-29 03:03:19,121 | INFO     | retrieve            | pilot.util.auxiliary             | make_job_report           | pid: 17456
2019-06-29 03:03:19,121 | INFO     | retrieve            | pilot.util.auxiliary             | make_job_report           | pgrp: None
2019-06-29 03:03:19,121 | INFO     | retrieve            | pilot.util.auxiliary             | make_job_report           | corecount: 1
2019-06-29 03:03:19,121 | INFO     | retrieve            | pilot.util.auxiliary             | make_job_report           | event service: False
2019-06-29 03:03:19,121 | INFO     | retrieve            | pilot.util.auxiliary             | make_job_report           | --------------------------------------------------
2019-06-29 03:03:19,121 | INFO     | retrieve            | pilot.util.auxiliary             | make_job_report           | 
2019-06-29 03:03:19,122 | INFO     | retrieve            | pilot.control.job                | has_job_completed         | job 4397089446 has completed
2019-06-29 03:03:19,122 | INFO     | retrieve            | pilot.control.job                | retrieve                  | ready for new job
2019-06-29 03:03:19,122 | DEBUG    | retrieve            | pilot.control.job                | retrieve                  | getjob_requests=1
2019-06-29 03:03:24,889 | WARNING  | job_monitor         | pilot.control.job                | check_job_monitor_waiting_time | no jobs in monitored_payloads queue (waited for 61 s)
2019-06-29 03:03:29,157 | DEBUG    | retrieve            | pilot.control.job                | proceed_with_getjob       | proceed_with_getjob called with getjob_requests=1
2019-06-29 03:03:29,157 | DEBUG    | retrieve            | pilot.util.monitoring            | check_local_space         | checking local space on /root/Downloads/BOINC/slots/1
2019-06-29 03:03:29,336 | INFO     | retrieve            | pilot.util.monitoring            | check_local_space         | sufficient remaining disk space (14307819520 B)
2019-06-29 03:03:29,336 | WARNING  | retrieve            | pilot.control.job                | proceed_with_getjob       | since timefloor is set to 0, pilot was only allowed to run one job
2019-06-29 03:03:29,337 | INFO     | monitor             | pilot.control.monitor            | control                   | monitor control has ended
2019-06-29 03:03:29,369 | WARNING  | copytool_out        | pilot.util.common                | should_abort              | data:copytool_out:received graceful stop - abort after this iteration
2019-06-29 03:03:29,369 | DEBUG    | copytool_out        | pilot.control.data               | copytool_out              | will abort 
2019-06-29 03:03:29,444 | DEBUG    | data                | pilot.control.data               | control                   | data control ending since graceful_stop has been set
2019-06-29 03:03:29,570 | DEBUG    | job                 | pilot.control.job                | control                   | job control ending since graceful_stop has been set
2019-06-29 03:03:29,727 | INFO     | validate_post       | pilot.control.payload            | validate_post             | validate_post has finished
2019-06-29 03:03:29,745 | DEBUG    | payload             | pilot.control.payload            | control                   | payload control ending since graceful_stop has been set
2019-06-29 03:03:30,396 | DEBUG    | copytool_out        | pilot.control.data               | copytool_out              | aborting
2019-06-29 03:03:30,399 | DEBUG    | copytool_out        | pilot.control.data               | copytool_out              | copytool_out has finished
2019-06-29 03:03:30,398 | DEBUG    | copytool_in         | pilot.control.data               | copytool_in               | copytool_in ended since graceful_stop has been set
2019-06-29 03:03:30,820 | WARNING  | queue_monitoring    | pilot.util.common                | should_abort              | data:queue_monitoring:received graceful stop - abort after this iteration
2019-06-29 03:03:31,217 | WARNING  | queue_monitor       | pilot.util.common                | should_abort              | job:queue_monitor:received graceful stop - abort after this iteration
2019-06-29 03:03:31,218 | INFO     | queue_monitor       | pilot.control.job                | queue_monitor             | [job] queue monitor has finished
2019-06-29 03:03:34,057 | INFO     | queue_monitoring    | pilot.control.data               | queue_monitoring          | [data] queue monitor has finished
2019-06-29 03:04:26,062 | WARNING  | job_monitor         | pilot.control.job                | check_job_monitor_waiting_time | no jobs in monitored_payloads queue (waited for 123 s)
2019-06-29 03:04:26,062 | INFO     | job_monitor         | pilot.control.job                | job_monitor               | job monitor has finished
2019-06-29 03:04:26,063 | INFO     | MainThread          | pilot.workflow.generic           | run                       | end of generic workflow (traces error code: 0)
2019-06-29 03:04:26,063 | INFO     | MainThread          | root                             | wrap_up                   | traces error code: 0
2019-06-29 03:04:26,064 | INFO     | MainThread          | root                             | wrap_up                   | pilot has finished
***************diag file************
runtimeenvironments=APPS/HEP/ATLAS-SITE;
Processors=1
WallTime=40597.70s
KernelTime=1131.66s
UserTime=40367.05s
CPUUsage=102%
MaxResidentMemory=2023676kB
AverageResidentMemory=0kB
AverageTotalMemory=0kB
AverageUnsharedMemory=0kB
AverageUnsharedStack=0kB
AverageSharedMemory=0kB
PageSize=4096B
MajorPageFaults=9812
MinorPageFaults=16857541
Swaps=0
ForcedSwitches=525387
WaitSwitches=21282772
Inputs=3375976
Outputs=314256
SocketReceived=0
SocketSent=0
Signals=0

nodename=maeax@APU8S
exitcode=0
******************************WorkDir***********************
total 383732
drwxrwx--x. 8 root root       4096 29. Jun 05:04 .
drwxr-x--x. 5 root root       4096 28. Jun 17:47 ..
-rw-------. 1 root root    7342324 28. Jun 17:48 agis_ddmendpoints.json
-rw-------. 1 root root    4708901 28. Jun 17:48 agis_schedconf.cvmfs.json
drwx------. 2 root root       4096 28. Jun 17:47 .alrb
drwxr-xr-x. 3 root root       4096 28. Jun 17:47 APPS
drwxr-xr-x. 2 root root       4096 28. Jun 17:48 .arc
-rw-------. 1 root root        549 28. Jun 17:47 .asetup
-rw-------. 1 root root       4198 28. Jun 17:49 .asetup.save
-rw-r--r--. 1 root root          0 28. Jun 17:47 boinc_lockfile
-rw-r--r--. 1 root root       8192 29. Jun 05:04 boinc_mmap_file
-rw-r--r--. 1 root root        537 29. Jun 05:02 boinc_task_state.xml
-rw-r--r--. 1 root root  376157053 28. Jun 17:47 EVNT.17324023._000864.pool.root.1
-rw-------. 1 root root        494 29. Jun 05:04 h1JLDmgCz0unShfckohDCDFpABFKDmABFKDm6cOKDmABFKDm0WDbUm.diag
-rw-------. 1 root root      51577 29. Jun 05:03 heartbeat.json
-rw-r--r--. 1 root root       5467 28. Jun 17:47 init_data.xml
-rw-r--r--. 1 root root     246323 28. Jun 17:47 input.tar.gz
-rw-r--r--. 1 root root        112 28. Jun 17:47 job.xml
-rw-------. 1 root root    1841564 29. Jun 05:04 log.18308834._045315.job.log.1
-rw-------. 1 root root     327988 29. Jun 05:02 log.18308834._045315.job.log.tgz.1
-rw-------. 1 root root        306 29. Jun 05:02 memory_monitor_summary.json
-rw-------. 1 root root        391 29. Jun 05:04 output.list
-rw-------. 1 4871  1028      2884 28. Jun 08:06 pandaJobData.out
drwxrwx---. 2 root root       4096 29. Jun 05:03 PanDA_Pilot-4397089446
drwxr-xr-x. 3  501 games      4096 31. May 14:36 pilot2
-rw-r--r--. 1 root root     237372 24. Jun 09:16 pilot2.tar.gz
-rw-------. 1 root root    1822340 29. Jun 05:04 pilotlog.txt
-rw-r--r--. 1 root root       4468 28. Jun 08:05 queuedata.json
-rw-r--r--. 1 root root        815 28. Jun 17:47 RTE.tar.gz
-rwxr-xr-x. 1 root root       8512 28. Jun 17:47 run_atlas
-rwx------. 1 4871  1028     15232 28. Jun 08:06 runpilot2-wrapper.sh
-rw-r--r--. 1 root root        692 29. Jun 05:04 runtime_log
-rw-r--r--. 1 root root       7087 29. Jun 05:04 runtime_log.err
drwxrwx--x. 2 root root       4096 29. Jun 05:04 shared
-rw-r--r--. 1 root root       8714 28. Jun 17:47 start_atlas.sh
-rw-r--r--. 1 root root      16754 29. Jun 05:04 stderr.txt
-rw-r--r--. 1 root root        107 28. Jun 17:47 wrapper_26015_x86_64-pc-linux-gnu
-rw-r--r--. 1 root root         28 29. Jun 05:04 wrapper_checkpoint.txt
running start_atlas return value is 0
Parent exit 0
child process exit 0
05:04:27 (10816): run_atlas exited; CPU time 40367.460213
05:04:27 (10816): called boinc_finish(0)

</stderr_txt>
]]>
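
The diag file section of the log reports `CPUUsage=102%` alongside separate wall, kernel, and user times. The sketch below parses a few of those `key=value` lines and recomputes the figure as (UserTime + KernelTime) / WallTime. That formula is an assumption that happens to reproduce the logged value, not a documented definition, and the helper names `parse_diag` and `seconds` are hypothetical.

```python
def parse_diag(text):
    """Parse 'key=value' lines from a diag dump into a dict of strings."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition("=")
        if sep:  # skip lines that are not key=value pairs
            fields[key] = value.rstrip(";")
    return fields

def seconds(value):
    """Turn a value such as '40597.70s' into a float number of seconds."""
    return float(value.rstrip("s"))

# Values copied from the diag file section of this log.
sample = """\
WallTime=40597.70s
KernelTime=1131.66s
UserTime=40367.05s
CPUUsage=102%
"""

diag = parse_diag(sample)
cpu = seconds(diag["UserTime"]) + seconds(diag["KernelTime"])
# Assumed definition: CPU usage = (user + kernel) / wall clock time.
usage = 100 * cpu / seconds(diag["WallTime"])
print(f"{usage:.1f}% computed, {diag['CPUUsage']} logged")
```

Under that assumption the computed value is about 102.2%, which rounds to the logged 102%.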


©2024 CERN