Message boards : Theory Application : Task not starting

Rasputin42
Volunteer tester

Joined: 16 Aug 15
Posts: 966
Credit: 1,211,816
RAC: 0
Message 2932 - Posted: 22 Apr 2016, 11:40:16 UTC
Last modified: 22 Apr 2016, 11:42:00 UTC

I am getting the message:

Could not source logging functions from /cvmfs/grid.cern..../bin/logging_functions

The task is not doing anything (no CPU usage), no console windows are working, and the "show graphics" page is not found.

Laurence
Project administrator
Project developer
Project tester
Joined: 12 Sep 14
Posts: 1069
Credit: 334,882
RAC: 0
Message 2933 - Posted: 22 Apr 2016, 11:53:21 UTC - in response to Message 2932.  

Thanks for the report. It has already been fixed; we are just waiting for the cache to update.

Crystal Pellet
Volunteer tester
Joined: 13 Feb 15
Posts: 1188
Credit: 861,475
RAC: 2
Message 2942 - Posted: 22 Apr 2016, 14:51:30 UTC

I got 5 Theory tasks; the first one got a job and is still running.
The other 4 were gone with the wind, each running only 1 to 2 minutes.

Extracts from the result logs:

2016-04-22 16:32:03 (9776): Guest Log: [INFO] Mounting the shared directory
2016-04-22 16:32:03 (9776): Guest Log: [INFO] Shared directory mounted, enabling vboxmonitor
2016-04-22 16:32:03 (9776): VM Completion File Detected.
2016-04-22 16:32:03 (9776): VM Completion Message: 1

2016-04-22 16:33:37 (8472): Guest Log: [INFO] Mounting the shared directory
2016-04-22 16:33:37 (8472): Guest Log: [INFO] Shared directory mounted, enabling vboxmonitor
2016-04-22 16:33:37 (8472): Guest Log: [INFO] Reading volunteer information
2016-04-22 16:33:37 (8472): Guest Log: [INFO] Volunteer: () Host:
2016-04-22 16:33:37 (8472): Guest Log: [ERROR] BOINC_USERID is not an integer. Shuting down!
2016-04-22 16:33:37 (8472): Guest Log: [INFO] VMID: c7b1ab21-6a52-4050-9335-489372ba8b3d
2016-04-22 16:33:37 (8472): Guest Log: [ERROR] BOINC_USERID is not set.
2016-04-22 16:33:37 (8472): Guest Log: [ERROR] The x509 proxy creation failed.
2016-04-22 16:33:37 (8472): Guest Log: [INFO] application starting. Check log files.
2016-04-22 16:33:37 (8472): Guest Log: [ERROR] App is not supported. Shutting down!
2016-04-22 16:33:37 (8472): VM Completion File Detected.
2016-04-22 16:33:37 (8472): VM Completion Message: 1

2016-04-22 16:35:00 (9444): Guest Log: [INFO] Mounting the shared directory
2016-04-22 16:35:00 (9444): Guest Log: [INFO] Shared directory mounted, enabling vboxmonitor
2016-04-22 16:35:00 (9444): VM Completion File Detected.
2016-04-22 16:35:00 (9444): VM Completion Message: 1

2016-04-22 16:36:43 (9588): Guest Log: [INFO] Mounting the shared directory
2016-04-22 16:36:43 (9588): Guest Log: [INFO] Shared directory mounted, enabling vboxmonitor
2016-04-22 16:36:43 (9588): Guest Log: [INFO] Reading volunteer information
2016-04-22 16:36:43 (9588): Guest Log: [INFO] Volunteer: () Host:
2016-04-22 16:36:43 (9588): Guest Log: [ERROR] BOINC_USERID is not an integer. Shuting down!
2016-04-22 16:36:43 (9588): Guest Log: [INFO] VMID: c7b1ab21-6a52-4050-9335-489372ba8b3d
2016-04-22 16:36:43 (9588): Guest Log: [ERROR] BOINC_USERID is not set.
2016-04-22 16:36:43 (9588): Guest Log: [ERROR] The x509 proxy creation failed.
2016-04-22 16:36:43 (9588): Guest Log: [INFO] application starting. Check log files.
2016-04-22 16:36:43 (9588): Guest Log: [ERROR] App is not supported. Shutting down!
2016-04-22 16:36:43 (9588): VM Completion File Detected.
2016-04-22 16:36:43 (9588): VM Completion Message: 1
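
To pull just the failures out of result logs like the ones above, a short script can group the [ERROR] guest-log lines by the number in parentheses. This is only a sketch: the line format is inferred from the extracts in this post, the number in parentheses is assumed to identify the wrapper process, and `errors_by_wrapper` is a made-up helper name.

```python
import re
from collections import defaultdict

# Matches lines such as:
# 2016-04-22 16:33:37 (8472): Guest Log: [ERROR] BOINC_USERID is not set.
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"\((?P<wrapper>\d+)\): Guest Log: \[(?P<level>[A-Z]+)\] (?P<msg>.*)$"
)


def errors_by_wrapper(lines):
    """Group [ERROR] guest-log messages by the number shown in parentheses."""
    grouped = defaultdict(list)
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and match.group("level") == "ERROR":
            grouped[match.group("wrapper")].append(match.group("msg"))
    return dict(grouped)


# A few lines copied from the second extract above.
SAMPLE_LINES = [
    "2016-04-22 16:33:37 (8472): Guest Log: [INFO] Volunteer: () Host:",
    "2016-04-22 16:33:37 (8472): Guest Log: [ERROR] BOINC_USERID is not an integer. Shuting down!",
    "2016-04-22 16:33:37 (8472): Guest Log: [ERROR] BOINC_USERID is not set.",
    "2016-04-22 16:33:37 (8472): VM Completion File Detected.",
]

for wrapper, messages in errors_by_wrapper(SAMPLE_LINES).items():
    print(wrapper, messages)
```

Lines that are not guest-log entries (for example "VM Completion File Detected.") are simply skipped.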

Laurence
Project administrator
Project developer
Project tester
Joined: 12 Sep 14
Posts: 1069
Credit: 334,882
RAC: 0
Message 2943 - Posted: 22 Apr 2016, 15:19:57 UTC - in response to Message 2942.  
Last modified: 22 Apr 2016, 15:20:23 UTC

It is failing to find the BOINC info, but I don't know why. It is almost as if the init_data.xml file in the shared directory is empty. Can you check in the slot directories on your host?

2016-04-22 16:33:37 (8472): Guest Log: [INFO] Volunteer: () Host:

2016-04-22 16:33:37 (8472): Guest Log: [ERROR] BOINC_USERID is not an integer. Shuting down!
It should exit here. Will clean that up.
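
One way to do the check asked for above is to list every init_data.xml found in a slot's shared directory together with its size. This is a rough sketch, not project tooling: it assumes the layout `slots/<n>/shared/init_data.xml` under the BOINC data directory, the data directory itself must be supplied because its location varies by installation, and `report_init_data` is a hypothetical helper name.

```python
import sys
from pathlib import Path


def report_init_data(boinc_data_dir):
    """Return (path, size_in_bytes) for each slots/*/shared/init_data.xml found."""
    slots = Path(boinc_data_dir) / "slots"
    found = []
    for candidate in sorted(slots.glob("*/shared/init_data.xml")):
        found.append((candidate, candidate.stat().st_size))
    return found


if __name__ == "__main__":
    # The BOINC data directory differs per installation; pass it as an argument.
    data_dir = sys.argv[1] if len(sys.argv) > 1 else "."
    for path, size in report_init_data(data_dir):
        status = "EMPTY" if size == 0 else "ok"
        print(f"{path}: {size} bytes ({status})")
```

A size of 0 would support the empty-file theory; a non-zero size points to a problem with how the file is read inside the VM instead.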

Crystal Pellet
Volunteer tester
Joined: 13 Feb 15
Posts: 1188
Credit: 861,475
RAC: 2
Message 2948 - Posted: 22 Apr 2016, 17:22:31 UTC
Last modified: 22 Apr 2016, 17:42:57 UTC

I was 'lucky' to get the error again -> http://lhcathomedev.cern.ch/vLHCathome-dev/result.php?resultid=155251
All 4 new tasks ended that way.

An 8450-byte init_data.xml was in the shared directory.
I saved the xml-file.

Maybe you're not pointing to the right file because of a typo in the file name, like the one in "Shuting down!" ;)

Laurence
Project administrator
Project developer
Project tester
Joined: 12 Sep 14
Posts: 1069
Credit: 334,882
RAC: 0
Message 2950 - Posted: 22 Apr 2016, 18:09:31 UTC - in response to Message 2948.  

I have just added some debug messages. If you see a similar issue in about an hour, please post a link to the task. I think that this is working for others, so I am a little confused. I have fixed the typo, but it will be pushed together with other fixes.

Ray Murray
Joined: 13 Apr 15
Posts: 138
Credit: 2,969,210
RAC: 0
Message 2954 - Posted: 22 Apr 2016, 19:54:25 UTC - in response to Message 2943.  
Last modified: 22 Apr 2016, 19:54:56 UTC

ALL my recent tasks (CMS, ATLAS and Theory) are ending this way after c. 2 mins.

E.g. init_data.xml from task 155406 shows:

<userid>196</userid>
<teamid>20</teamid>
<hostid>508</hostid>
<app_name>Theory</app_name>

yet the console window shows:

Guest Log: [INFO] Volunteer: () Host:
Guest Log: [ERROR] BOINC_USERID is not an integer. Shuting down!
just before the task exits.
So at least they're exiting but still aren't seeing this info.
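
For comparison, reading those fields and checking that the user ID is an integer can be sketched as below. This is not the project's actual script, only an illustration of the check named in the error message: the fragment reuses the values quoted above, the `<app_init_data>` root element is an assumption, and `read_volunteer_info` and `validate_userid` are hypothetical names.

```python
import xml.etree.ElementTree as ET

# Fragment built from the values quoted above; a real init_data.xml holds
# many more fields. The <app_init_data> root element is an assumption.
SAMPLE = """<app_init_data>
<userid>196</userid>
<teamid>20</teamid>
<hostid>508</hostid>
<app_name>Theory</app_name>
</app_init_data>"""


def read_volunteer_info(xml_text):
    """Extract the fields of interest as stripped strings ('' if absent)."""
    root = ET.fromstring(xml_text)
    info = {}
    for tag in ("userid", "hostid", "app_name"):
        info[tag] = (root.findtext(tag) or "").strip()
    return info


def validate_userid(info):
    """Return the user ID as an int, or raise ValueError if it is not one."""
    userid = info.get("userid", "")
    if not (userid.isascii() and userid.isdigit()):
        raise ValueError(f"userid {userid!r} is not an integer")
    return int(userid)


info = read_volunteer_info(SAMPLE)
print(info, validate_userid(info))
```

With the fragment above the check passes, so an empty "Volunteer: () Host:" line would suggest the values never reached the script, not that the file content is wrong.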

Crystal Pellet
Volunteer tester
Joined: 13 Feb 15
Posts: 1188
Credit: 861,475
RAC: 2
Message 2957 - Posted: 22 Apr 2016, 20:14:31 UTC
Last modified: 22 Apr 2016, 20:25:12 UTC


Rasputin42
Volunteer tester
Joined: 16 Aug 15
Posts: 966
Credit: 1,211,816
RAC: 0
Message 2958 - Posted: 22 Apr 2016, 20:19:28 UTC


Laurence
Project administrator
Project developer
Project tester
Joined: 12 Sep 14
Posts: 1069
Credit: 334,882
RAC: 0
Message 2959 - Posted: 22 Apr 2016, 20:32:44 UTC - in response to Message 2954.  

Thanks, I have added some more debugging. It will take about 1 hour to propagate.

Crystal Pellet
Volunteer tester
Joined: 13 Feb 15
Posts: 1188
Credit: 861,475
RAC: 2
Message 2960 - Posted: 22 Apr 2016, 21:00:23 UTC - in response to Message 2959.  

Thanks added some more debugging. Will take about 1 hour to propagate.

Something has changed, but not solved yet. Maybe this helps:

Laurence
Project administrator
Project developer
Project tester
Joined: 12 Sep 14
Posts: 1069
Credit: 334,882
RAC: 0
Message 2961 - Posted: 22 Apr 2016, 21:22:52 UTC - in response to Message 2960.  

I think I have fixed it. We will see in 1 hour.

Ray Murray
Joined: 13 Apr 15
Posts: 138
Credit: 2,969,210
RAC: 0
Message 2962 - Posted: 22 Apr 2016, 22:10:21 UTC


Only this output from Alt-F1. Other consoles not accessible.
BOINC reports the task as running but, 20 mins in, there is no CPU usage.

I'll leave those that are running to do so overnight but will set No New Tasks in case it's still broken.

Laurence
Project administrator
Project developer
Project tester
Joined: 12 Sep 14
Posts: 1069
Credit: 334,882
RAC: 0
Message 2963 - Posted: 22 Apr 2016, 22:22:01 UTC - in response to Message 2962.  

Thanks, just pushed another fix.

Rasputin42
Volunteer tester
Joined: 16 Aug 15
Posts: 966
Credit: 1,211,816
RAC: 0
Message 2965 - Posted: 22 Apr 2016, 23:32:54 UTC

Same error messages as in message 2962, but in /usr/bin/boinc-proxy: line 21

Console F1 and F3 working, app running.
Logs working.

Laurence
Project administrator
Project developer
Project tester
Joined: 12 Sep 14
Posts: 1069
Credit: 334,882
RAC: 0
Message 2967 - Posted: 23 Apr 2016, 6:52:01 UTC - in response to Message 2965.  

Fix pushed.

Crystal Pellet
Volunteer tester
Joined: 13 Feb 15
Posts: 1188
Credit: 861,475
RAC: 2
Message 2968 - Posted: 23 Apr 2016, 7:01:59 UTC
Last modified: 23 Apr 2016, 7:14:09 UTC



Job is running. Only Consoles 1 and 3 (top) have output.
In CMS however, Console 2 shows the records processing (running.log) and Console 4 the stdout.log

Dunno which fix, but my Theory VM was already running before you pushed.

Laurence
Project administrator
Project developer
Project tester
Joined: 12 Sep 14
Posts: 1069
Credit: 334,882
RAC: 0
Message 2970 - Posted: 23 Apr 2016, 10:48:47 UTC - in response to Message 2968.  
Last modified: 23 Apr 2016, 10:57:42 UTC

The error should have disappeared by the time the next task starts. In general for all the applications:

Console 1 => boot and initialization
Console 2 => running.log (job log)
Console 3 => top
Console 4 => stdout.log (of the job wrapper)
Console 5 => stderr.log (of the job wrapper)
Console 6 => login prompt

Consoles 1, 2, 3 and 6 should always have output, and the files are the same ones that can be seen in the web logs (show graphics).
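
The console layout listed above can also be written down as a small lookup table, which makes it easy to spot which of the always-on consoles is silent. This is purely illustrative: the names `CONSOLES`, `ALWAYS_ACTIVE` and `missing_output` are invented here, and the example input is a hypothetical observation.

```python
# Console number -> what it is expected to show, as listed above.
CONSOLES = {
    1: "boot and initialization",
    2: "running.log (job log)",
    3: "top",
    4: "stdout.log (of the job wrapper)",
    5: "stderr.log (of the job wrapper)",
    6: "login prompt",
}

# Consoles that should always have output.
ALWAYS_ACTIVE = {1, 2, 3, 6}


def missing_output(consoles_with_output):
    """Return the always-on consoles that currently show nothing."""
    return sorted(ALWAYS_ACTIVE - set(consoles_with_output))


# Hypothetical observation: only consoles 1, 3 and 6 show anything.
for number in missing_output([1, 3, 6]):
    print(f"Console {number} ({CONSOLES[number]}) has no output")
```

Consoles 4 and 5 are left out of the check because they only fill up once the job wrapper writes to them.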

I can confirm that running.log is missing and will look into it asap.

EDIT: Fix pushed

Rasputin42
Volunteer tester
Joined: 16 Aug 15
Posts: 966
Credit: 1,211,816
RAC: 0
Message 2971 - Posted: 23 Apr 2016, 16:32:56 UTC - in response to Message 2970.  

Running.log is back--thank you.
Console F2 has it as well.
F4 and F5---don't know, have not started a new task.